Stuff I do: linux

Wednesday, 14 January 2015

Checking symbols/functions in *nix binary library so/a files

What the man page says - "Nm displays the name list (symbol table) of each object file in the argument list."

To list the symbols (including function names) defined in a binary library, run nm on the file.

Example:
nm opt/hadoop-2.4.0/lib/native/libhadoop.so

If you have to check whether there is support for snappy in your hadoop binary do this:
nm opt/hadoop-2.4.0/lib/native/libhadoop.so | grep -i snappy

And you should see something like this:
Java_org_apache_hadoop_io_compress_snappy_SnappyCompressor_compressBytesDirect
0000000000003960 T Java_org_apache_hadoop_io_compress_snappy_SnappyCompressor_initIDs
0000000000003bb0 T Java_org_apache_hadoop_io_compress_snappy_SnappyDecompressor_decompressBytesDirect
0000000000003f60 T Java_org_apache_hadoop_io_compress_snappy_SnappyDecompressor_initIDs
0000000000206cf0 b dlsym_snappy_compress
0000000000206d20 b dlsym_snappy_uncompress

If these Java native methods are not compiled into libhadoop.so, the MapReduce runtime will complain that "native snappy library not available".
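
The grep check above can be wrapped in a small helper for use in scripts. This is only a sketch: has_snappy_symbols is a made-up name (not part of Hadoop), and it assumes nm-style "address type name" lines on stdin.

```shell
# Hypothetical helper: reads `nm` output on stdin and succeeds only if at
# least one Snappy JNI entry point is defined in the text section (type T).
has_snappy_symbols() {
    grep -Eq ' T Java_org_apache_hadoop_io_compress_snappy_'
}

# Typical use (substitute the path of your own libhadoop.so):
#   nm /path/to/libhadoop.so | has_snappy_symbols && echo "snappy JNI present"
```

The dlsym_snappy_* entries are deliberately not matched, since they are type b (uninitialized data) rather than compiled functions.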

Wednesday, 25 September 2013

Setting up Django Web-app on amazon linux - AWS micro instance

SSH into your linux AWS system using a command like this:

chmod 400 ~/pvtkey.pem
ssh -i ~/pvtkey.pem ec2-user@<AWS-instance-public-IP>

Install Apache httpd server:

sudo yum install httpd
sudo /etc/init.d/httpd start    # or: sudo service httpd start

sudo chkconfig httpd on

You can check what is installed with RPM:

rpm -qa


Install Django:

wget https://www.djangoproject.com/m/releases/1.5/Django-1.5.4.tar.gz
tar xzvf Django-1.5.4.tar.gz

cd Django-1.5.4

sudo python setup.py  install

Install mod_wsgi:

sudo yum install mod_wsgi

Add a new user for django:


sudo useradd djangouser
su - djangouser

Edit the httpd.conf file:

sudo vi /etc/httpd/conf/httpd.conf

NameVirtualHost *:80

<VirtualHost *:80>

WSGIDaemonProcess ec2-54-200-XXX-XXX.us-west-2.compute.amazonaws.com user=djangouser group=djangouser processes=5 threads=1

WSGIProcessGroup ec2-54-200-XXX-XXX.us-west-2.compute.amazonaws.com

DocumentRoot /home/djangouser/web-app

ServerName ec2-54-200-XXX-XXX.us-west-2.compute.amazonaws.com

ErrorLog /home/djangouser/web-app/apache/logs/error.log

CustomLog /home/djangouser/web-app/apache/logs/access.log combined

WSGIScriptAlias / /home/djangouser/web-app/apache/django.wsgi

Order deny,allow

Allow from all

</Directory>

Order deny,allow

Allow from all

</Directory>

Order deny,allow

Allow from all

</Directory>

<Directory /usr/lib/python2.6/site-packages/django/contrib/admin/static/admin/>
Order deny,allow
Allow from all
</Directory>

LogLevel warn

Alias /static/admin/ /usr/lib/python2.6/site-packages/django/contrib/admin/static/admin/

Alias /static/ /home/djangouser/web-app/bmdata/static/

</VirtualHost>

WSGISocketPrefix /home/djangouser/web-app/apache/run/

Add django.wsgi

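import os
import sys

# Make the project importable and tell Django which settings module to load.
sys.path.append('/home/djangouser/web-app')
os.environ['DJANGO_SETTINGS_MODULE'] = 'BMonitor.settings'

import django.core.handlers.wsgi

# mod_wsgi looks for a callable named "application" in this file.
application = django.core.handlers.wsgi.WSGIHandler()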

Installing python libs for matplotlib and numpy on AWS

sudo yum install gcc-c++

sudo yum install gcc-gfortran

sudo yum install python-devel

sudo yum install atlas-sse3-devel

sudo yum install lapack-devel

sudo yum install libpng-devel

sudo yum install freetype-devel

sudo yum install zlib-devel

tar xzvf matplotlib-1.3.1.tar.gz

cd matplotlib-1.3.1

sudo python setup.py build

sudo python setup.py install

References: http://pragmaticstartup.wordpress.com/2011/04/02/non-techie-guide-to-setting-up-django-apache-mysql-on-amazon-ec2/
https://code.google.com/p/modwsgi/wiki/IntegrationWithDjango

Saturday, 20 July 2013

vimdiff tips and tricks - how to copy between two screens

do (diff obtain) and dp (diff put) are what you need. Here is a small list of other helpful commands in this context.

]c               - advance to the next block with differences
[c               - reverse search for the previous block with differences
do (diff obtain) - bring changes from the other file to the current file
dp (diff put)    - send changes from the current file to the other file
zo               - unfold/unhide text
zc               - refold/rehide text
zR               - open all folds in the current window (zr opens one more fold level)

Note: Both do and dp work if you are on a block or just one line under a block.

:diffupdate will re-scan the files for changes

Original source: http://unix.stackexchange.com/questions/52754/whats-the-recommended-way-of-copying-changes-with-vimdiff

Wednesday, 3 July 2013

Debugging / Tracing bash shell scripts

Source: http://www.linuxcommand.org/wss0100.php

Watching your script run

It is possible to have bash show you what it is doing when you run your script. To do this, add a "-x" to the first line of your script, like this:

#!/bin/bash -x

Now, when you run your script, bash will display each line (with substitutions performed) as it executes it. This technique is called tracing. Here is what it looks like:

[me@linuxbox me]$ ./trouble.bash
+ number=1
+ '[' 1 = 1 ']'
+ echo 'Number equals 1'
Number equals 1

Alternatively, you can use the set command within your script to turn tracing on and off. Use set -x to turn tracing on and set +x to turn tracing off. For example:

#!/bin/bash

number=1

set -x
if [ $number = "1" ]; then
    echo "Number equals 1"
else
    echo "Number does not equal 1"
fi
set +x
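
A variation on the same idea, as a rough sketch: run the traced code in a subshell so that tracing switches itself off when the subshell exits. The function name check_number is made up for this example.

```shell
# Sketch: enable tracing only around the code being investigated.
# Because the body runs in a subshell ( ... ), the effect of `set -x`
# ends when the subshell exits; no matching `set +x` is needed.
check_number() {
    (
        set -x
        number=1
        if [ "$number" = "1" ]; then
            echo "Number equals 1"
        else
            echo "Number does not equal 1"
        fi
    )
}

# The trace is written to stderr, so it can be kept apart from normal output:
#   check_number 2>trace.log
```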

Wednesday, 19 June 2013

Vim tips and tricks

Cut and paste:

1. Position the cursor where you want to begin cutting.
2. Press v (or upper case V if you want to cut whole lines).
3. Move the cursor to the end of what you want to cut.
4. Press d.
5. Move to where you would like to paste.
6. Press P to paste before the cursor, or p to paste after.

Copy and paste can be performed with the same steps, only pressing y instead of d in step 4.

Saturday, 25 May 2013

Cloning, Partitioning and formatting a new HDD

My HP laptop's HDD had been giving errors in the disk test for quite some time, so I got a new HDD with the idea of cloning the old disk onto it. To connect the new HDD to the computer I bought a USB-based 2.5-inch external HDD interface, which lets you connect a laptop disk to your system through a USB cable.

My primary reason for cloning the disk was that I wanted to avoid starting with a fresh image of Windows and installing dozens of programs all over again. Plus I had an Ubuntu installation in one of the partitions. This seemed like a perfect use case for disk cloning.

So, a bit of Google searching on disk cloning landed me on this page: https://en.wikipedia.org/wiki/Comparison_of_disk_cloning_software

I tried a couple of programs from the above list. To keep it easy I started off with the Windows-based ones:

Acronis True Image: This one is paid software, but luckily the new HDD I got was a WD, and it turns out that if you have a WD HDD on your system, the software doesn't ask for a license and you get to use it for free. But it was a disappointment, as it failed to recognize my HDD as a WD drive (probably because I was interfacing it through USB).
Macrium Reflect: This was my next bet, as it's freeware with a graphical interface, so going for the easy one again :P. It was easy to install, I followed the GUI wizard to clone the disk, and bingo, the cloning began. However, as luck would have it, a couple of hours later a dialog box saying "Disk read error" popped up out of nowhere and I was back to square one!

I realized that Windows was not up to the task; the easy route was turning out to be rather difficult.

So, I rebooted into Ubuntu this time to try the REAL low level stuff.

dd is what I started with in Linux. You can create disk backups and clone disks with a single line of this command (see, that's the power of Linux ;). To clone 'sda' to 'sdb' you can use the command (use at your own risk):

dd if=/dev/sda of=/dev/sdb bs=4096 conv=notrunc,noerror

ps -ef | grep 'dd' # get the PID of dd command

kill -USR1 8789 # send a signal to the 'dd' process (PID 8789 here) to make it print its progress

WARNING: If you are thinking of trying this, please read this link first and understand exactly what you are doing, because a small mistake can make you lose all your data: https://wiki.archlinux.org/index.php/Disk_Cloning

It turned out 'dd' is just a simple copying command with no error recovery built into it, so a couple of hours into copying this one too failed with "input/output error" on the console :-/

Then I tried to manually create partitions on the new disk using fdisk.

sudo fdisk -l  #shows the disks and partition details for all disks

sudo fdisk /dev/sdc # if you need to partition '/dev/sdc'

The above command takes you to the fdisk command prompt, where you can easily create a new partition, delete a partition, write your changes to the partition table, etc. However, this was turning out to be a cumbersome process, because I would have to clone all the partitions one by one and format them using mkfs and what not :P
http://www.idevelopment.info/data/Unix/Linux/LINUX_PartitioningandFormattingSecondHardDrive_ext3.shtml

Now comes the really awesome piece of software to my rescue: 'ddrescue'. Yeah, as the name says, so it does :). It is similar to 'dd' but has a disk error handling algorithm built into it. It first recovers as much data as possible from the good sectors, and then uses slow reading to recover the bad sectors. You can find all the details here: http://www.gnu.org/software/ddrescue/manual/ddrescue_manual.html

sudo ddrescue -f /dev/sda /dev/sdb logfile

'ddrescue' has an awesome log feature which records all the recovery work it is doing in a file. This log file allows it to resume the recovery process from where it left off, in case your system crashes in between or anything !!
You can also get 'ddrescueview', which shows a graphical view of the recovery process (good/bad sectors etc.), which is really helpful. You can download it from here: http://sourceforge.net/projects/ddrescueview/

To summarize, yeah 'ddrescue' was able to clone my old disk to the new one. And the log viewer GUI showed that my old disk had one (just one :P) bad sector in my Windows partition. But yeah, overall this turned out to be a great learning experience about disks ;)

Sunday, 17 February 2013

Best command line tools for Linux, MAC

http://www.cyberciti.biz/open-source/best-terminal-applications-for-linux-unix-macosx/

Monday, 4 February 2013

Using multi-dimensional associative arrays in AWK

Examples:

cat /tmp/33 | awk -F'[\\^\\[\\]]' '{c[$5,$6]+=1}END{for( i in c) {split(i,sep,SUBSEP); print sep[1],sep[2], c[i] ;}}'
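
The same pattern on simpler input, as a sketch: the sample data and the function name count_pairs are made up for illustration, and the fields are split on plain whitespace instead of the custom separator used above.

```shell
# Count how often each (column 1, column 2) pair occurs.
# awk joins the two subscripts with SUBSEP to form a single array key,
# and split() on SUBSEP recovers the two parts in the END block.
count_pairs() {
    awk '{ c[$1, $2] += 1 }
         END {
             for (k in c) {
                 split(k, part, SUBSEP)
                 print part[1], part[2], c[k]
             }
         }' | LC_ALL=C sort    # "for (k in c)" visits keys in no fixed order
}

printf 'a x\na y\na x\nb x\n' | count_pairs
```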

Reading:

http://www.chemie.fu-berlin.de/chemnet/use/info/gawk/gawk_12.html

Thursday, 31 January 2013

How to find the process which has the file handle open ?

A lot of times you need to find which process (PID) has a handle to a particular file.
On Linux this is easy with the following command:

$ ls -ld1 /proc/*/fd/* | grep <file-name-filter>
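
If you only want the PIDs, the same /proc lookup can be scripted. A rough sketch, assuming Linux with /proc mounted; pids_with_open_file is a made-up helper name. Where they are installed, lsof and fuser answer the same question.

```shell
# Hypothetical helper: print the PID of every process that has FILE open,
# by resolving each /proc/<pid>/fd/<n> symlink (Linux only).
# Descriptors of other users' processes are only readable as root.
pids_with_open_file() {
    target=$(readlink -f "$1") || return 1
    for fd in /proc/[0-9]*/fd/*; do
        if [ "$(readlink "$fd" 2>/dev/null)" = "$target" ]; then
            pid=${fd#/proc/}      # strip the leading "/proc/"
            echo "${pid%%/*}"     # keep only the PID component
        fi
    done | sort -un
}

# Usage: pids_with_open_file /var/log/messages
```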

Enjoy !!

Tuesday, 18 December 2012

Using awk to find the sum of a CSV file after group by on a particular column

http://www.theunixschool.com/2012/06/awk-10-examples-to-group-data-in-csv-or.html

I have a CSV file like :

65523 , 100
65522 , 900
65522 , 1800
65522 , 100
65522 , 100
65521 , 500
65521 , 200
65521 , 200

I need to find the sum of the 2nd column based on the grouping by the 1st column, so that the output looks something like:

65523 , 100
65522 , 2900
65521 , 900

SOLUTION:

This can be easily achieved using a single line awk script:

awk -F' *, *' '{a[$1]+=$2} END{for (i in a) print i " , " a[i]}' file

Awesome isn't it !! :)
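
A worked run on the sample data above, as a sketch. The function name sum_by_key is made up; the separator regex ' *, *' is my assumption that fields are padded with spaces around the comma, as in the sample.

```shell
# Sum column 2 per distinct value of column 1.
# The field separator ' *, *' swallows the spaces around each comma, so the
# keys are "65522" rather than "65522 ".
sum_by_key() {
    awk -F' *, *' '{ sum[$1] += $2 }
                   END { for (k in sum) print k " , " sum[k] }' | LC_ALL=C sort -rn
}

printf '%s\n' '65523 , 100' '65522 , 900' '65522 , 1800' '65522 , 100' \
              '65522 , 100' '65521 , 500' '65521 , 200' '65521 , 200' | sum_by_key
```

The trailing sort is there because awk's "for (k in sum)" loop returns keys in no particular order.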

Sorting a CSV file based on a particular column list

http://stackoverflow.com/questions/9471101/sort-csv-file-by-column-priority-using-the-sort-command-unix

sort --field-separator=';' --key=2,1

sort -nr -t',' -k3

To sort based on multiple columns use the syntax:

sort --key=1,1 --key=2,2r --key=3,3 --key=4,4r
sort -k1,1 -k2,2r -k3,3 -k4,4r

as in the following transcript:

pax$ echo '5 3 2 9
3 4 1 7
5 2 3 1
6 1 3 6
1 2 4 5
3 1 2 3
5 2 2 3' | sort --key=1,1 --key=2,2r --key=3,3 --key=4,4r

1 2 4 5
3 4 1 7
3 1 2 3
5 3 2 9
5 2 2 3
5 2 3 1
6 1 3 6

Remember to provide the -n option if you want them treated as proper numbers (variable length), such as:

sort -n -k1,1 -k2,2r -k3,3 -k4,4r
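
To see why -n matters, here is a small sketch comparing the two modes on made-up data. The function names are only for illustration, and LC_ALL=C is set so that the text comparison is byte-wise on every system.

```shell
# Without -n the key is compared as text, so "10" and "100" sort before "9".
sort_text()    { LC_ALL=C sort -k1,1; }
# With -n the key is compared by numeric value.
sort_numeric() { LC_ALL=C sort -n -k1,1; }

printf '9 a\n10 b\n100 c\n' | sort_text
printf '9 a\n10 b\n100 c\n' | sort_numeric
```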

Learning to use Screen command in Unix

http://kb.iu.edu/data/acuy.html

If your local computer crashes, or you are connected via a modem and lose the connection, the processes or login sessions you establish through screen don't go away. You can resume your screen sessions with the following command: screen -r

screen ==> Start a new screen session
(Ctrl+A) & c ==> Start a new sub-window
(Ctrl+A) & k ==> Kill the current sub-window

(Ctrl+A) & " ==> Show the list of sub-windows in the current session

screen -r ==> reattach to a detached screen session
screen -ls ==> list the running screen sessions
(Ctrl+A) & (Shift + A) ==> rename the current sub-window

Saturday, 9 June 2012

Using GDB to debug in linux

To debug a process: "./a.out arg1 arg2", run:

$ gdb ./a.out
(gdb) run arg1 arg2

Reading: http://cs.baylor.edu/~donahoo/tools/gdb/tutorial.html

Understanding the I/O redirection in linux

STDIN is FD 0, while STDOUT is FD 1, and STDERR is FD 2.

To redirect the errors to the output stream use this: 2>&1

Here '&' indicates that 1 is a file descriptor (FD) and not an ordinary file named '1'.

There are lots of redirection symbols that you can use, and here are some of them:

< file	means open a file for reading and associate it with STDIN.
<< token	means use the current input stream as STDIN for the program until token is seen (a "here document").
> file	means open a file for writing, truncate it, and associate it with STDOUT.
>> file	means open a file for writing, seek to the end, and associate it with STDOUT. This is how you append to a file using a redirect.
n>&m	means redirect FD n to the same place as FD m. E.g., `2>&1` means send STDERR to the same place that STDOUT is going to.
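
A small sketch of how these combine. The function emit is made up for the example; the point is that redirections are processed left to right, so their order changes the result.

```shell
# Writes one line to stdout (FD 1) and one to stderr (FD 2).
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

# $( ... ) captures whatever ends up on stdout.
both=$(emit 2>&1)                 # stderr joins stdout: both lines captured
only_err=$(emit 2>&1 >/dev/null)  # stderr copies the old stdout first, then stdout is discarded
only_out=$(emit 2>/dev/null)      # stderr discarded, stdout captured
```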

Reading: http://www.linuxsa.org.au/tips/io-redirection.html

Monday, 16 April 2012

Change grub boot order

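Edit the file "/boot/grub/menu.lst" from Linux and set the "default 0" entry to the number of the menu entry you want to boot by default (entries are counted from 0).

If you can't find that file, your system uses GRUB 2, whose menu lives in "/boot/grub/grub.cfg". That file is regenerated automatically, so rather than editing it directly, set GRUB_DEFAULT in "/etc/default/grub" and then run "sudo update-grub" (on Ubuntu).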

Sunday, 15 April 2012

Using Strace in linux

The following article explains the usage of strace command for troubleshooting in linux:
http://www.hokstad.com/5-simple-ways-to-troubleshoot-using-strace.html

strace is a tool for tracing system calls and signals. It intercepts and records the system calls made by a running process. strace can print a record of each system call, its arguments, and its return value. You can use strace on programs for which you do not have the source since using strace does not require recompilation. It is often useful in instances where a program freezes or otherwise fails to work and offers few clues as to the problem.

Strace Output: Each line starts with a system call name, followed by its arguments in parentheses, with the return value at the end of the line. Errors (which typically have a return value of -1) have the symbolic error name (such as ENOENT) as well as a more informative error string appended.

Thursday, 12 April 2012

SCP and SSH without password

To set up password-less login from one box ('Source') to another box ('Dest'), you need to generate the 'Source' user's public/private key pair and then append that user's public key to the 'Dest' user's ".ssh/authorized_keys" file.
An important thing to note is that the destination and source users can be different. However, the ".ssh" folder inside the 'Dest' user's home directory needs to have Unix permissions '700' for this to work (because otherwise anyone could write their public key into this folder and gain access to the destination user's account).
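
The 'Dest' side of this can be sketched as a small function. install_pubkey is a made-up helper name, and the 600 mode on authorized_keys is the conventional choice rather than something stated above; ssh-copy-id, where available, does the same job over the network.

```shell
# Hypothetical helper: append a public key to authorized_keys under the
# given home directory and apply restrictive permissions.
# On a real setup this runs on 'Dest' as the destination user.
install_pubkey() {
    pubkey_file=$1
    dest_home=$2
    mkdir -p "$dest_home/.ssh" &&
    chmod 700 "$dest_home/.ssh" &&
    cat "$pubkey_file" >> "$dest_home/.ssh/authorized_keys" &&
    chmod 600 "$dest_home/.ssh/authorized_keys"
}

# On 'Source', generate the key pair first (it prompts for a passphrase):
#   ssh-keygen -t rsa
# then copy ~/.ssh/id_rsa.pub over to 'Dest' and run there:
#   install_pubkey id_rsa.pub "$HOME"
```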

The following article explains how to set up password-less ssh or scp between two systems.
It also covers troubleshooting for common problems.

http://homepage.mac.com/kelleherk/iblog/C1901548470/E20061128145420/index.html

Monday, 9 April 2012

Find all files recursively in the current directory and process them one by one

To find files under the current directory and execute a command on each one (here, deleting the files that -mtime +1 matches, i.e. those last modified more than a day ago):

find ./ -mtime +1 -exec rm -vf {} \;

The command executed on each file is "rm -vf {}"; the '{}' is replaced by the file name when the command runs.
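
A harmless variant to experiment with before trying rm, as a sketch (list_basenames is a made-up name): it runs basename once per regular file instead of deleting anything.

```shell
# Run a command once per regular file found under a directory.
# find substitutes each path for '{}'; the escaped ';' ends the command.
list_basenames() {
    find "$1" -type f -exec basename {} \; | LC_ALL=C sort
}

# Usage: list_basenames ./
```

Ending the command with '+' instead of '\;' makes find pass many file names to a single invocation, which is faster for commands like rm that accept several arguments.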

Check the memory used by a unix process

To see the memory used by a process, use ps with the flags -AH v:
ps -AH v | grep 3408
where 3408 is the PID of process.

Check the column RSS (resident set size). It is the non-swapped physical memory that a task has used, in kilobytes (aliases: rssize, rsz).
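
To get just that number for one PID, ps can be asked for the RSS column alone. A minimal sketch, with rss_kb as a made-up function name:

```shell
# Print the resident set size of one process, in kilobytes.
# "-o rss=" selects only the RSS column and suppresses its header line.
rss_kb() {
    ps -o rss= -p "$1" | tr -d ' '
}

rss_kb $$    # RSS of the current shell
```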

Saturday, 31 March 2012

Accessing windows shared disks from the linux systems

If you need to access windows shared directories on your local network from linux systems, you can use the following open source Java SMB client:

http://jcifs.samba.org/src/

This client lets you access windows shared folders programmatically.

The following is an example client which you can use after downloading jar from the above location.

examples/AuthListFiles.java

SmbFile from = new SmbFile("smb://192.168.1.4/7-Zip/7zCon.sfx");
SmbFile to = new SmbFile("smb://192.168.1.2/temp/7zCon.sfx");

// Report when the destination file does not exist yet
if (!to.exists())
    System.out.println(" Not found !!");

// Copy the file from one share to the other
from.copyTo(to);

The full format for SMB url/links is as follows:

smb://domain\;user:pass@server/pub/