Friday, 27 December 2013

Gunzip the *.gz files on hadoop HDFS

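If you have gzipped files (*.gz) on HDFS and don't want to copy them to local disk to unzip them, you can do it as follows. The pipeline only prints one command per file; review the output, then append "| sh" to actually run the commands: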

hadoop dfs -ls /data/7days/netflow/2013/11/15/*/* | grep -i '\.gz$' | awk '{print "hadoop dfs -cat "$8" | gunzip | hadoop dfs -put - "substr($8,1,length($8)-3)}'
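
As a quick sanity check before touching real data, the awk step can be tried on a single made-up listing line. This is only a minimal sketch: the sample line and its path are hypothetical, and it assumes the path is the 8th field, as in the listing format the command above relies on. awk strings are indexed from 1, so the sketch starts substr at 1; a start of 0 is handled differently by different awk implementations.

```shell
# Hypothetical line in the format printed by "hadoop dfs -ls" (path is field 8).
sample='-rw-r--r--   3 hdfs hdfs  1024 2013-11-15 10:00 /data/sample/part-0001.gz'

# Build the command without running it; the target name is the path minus ".gz".
generated=$(printf '%s\n' "$sample" | awk '{print "hadoop dfs -cat "$8" | gunzip | hadoop dfs -put - "substr($8,1,length($8)-3)}')

# Print the generated command so it can be reviewed before piping it to sh.
echo "$generated"
```

If the printed command reads the .gz path and writes to the same path without the extension, the pipeline is safe to run on the real listing.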

Sunday, 29 September 2013

CSS Positioning

  • The display property can be set to:
    • block - takes up the full width of its container and does NOT let any other elements sit next to it.
    • inline-block - allows other elements to sit next to itself on the same line while keeping its width and height.
    • inline - sits on the same line as neighbouring elements, but ignores width and height, so it is only as big as its content. Mainly useful for block elements like <p> that you want to flow inline.
    • none - the element is not displayed.
  • To center a block element horizontally, give it a width and use "margin: auto" as the style.
  • We can use negative margins to move an element off the page as well (padding cannot be negative).
  • When you float an element on the page, you're telling the webpage: "I'm about to tell you where to put this element, but you have to put it into the flow of other elements." This means that if you have several elements all floating, they all know the others are there and don't land on top of each other.
  • If you tell an element to clear: left, it will immediately move below any floating elements on the left side of the page; it can also clear elements on the right. If you tell it to clear: both, it will get out of the way of elements floating on the left and right!
  • If you don't specify an element's positioning type, it defaults to static. This just means "where the element would normally go." If you don't tell an element how to position itself, it just plunks itself down in the document.
  • The first type of positioning is absolute positioning. When an element is set to position: absolute, it is positioned relative to its nearest ancestor that doesn't have position: static. If there's no such element, it gets positioned relative to <html>.
  • Relative positioning is more straightforward: it tells the element to move relative to where it would have landed if it just had the default static positioning.
  • Finally, fixed positioning anchors an element to the browser window—you can think of it as gluing the element to the screen. If you scroll up and down, the fixed element stays put even as other elements scroll past.

Wednesday, 25 September 2013

Setting up Django Web-app on amazon linux - AWS micro instance

  • SSH into your linux AWS system using a command like this: 

chmod 400 ~/pvtkey.pem
ssh -i ~/pvtkey.pem ec2-user@<AWS-instance-public-IP>
  • Install Apache httpd server:
sudo yum install httpd
sudo service httpd start    # or: sudo /etc/init.d/httpd start
sudo chkconfig httpd on
  • You can check which packages are installed with rpm:

rpm -qa

  • Install Django:

tar xzvf Django-1.5.4.tar.gz

cd Django-1.5.4

sudo python setup.py install
  • Install mod_wsgi:
sudo yum install mod_wsgi
  • Add a new user for django:
sudo useradd djangouser
su - djangouser
  • Edit the httpd.conf file:
sudo vi /etc/httpd/conf/httpd.conf

NameVirtualHost *:80

<VirtualHost *:80>
WSGIDaemonProcess user=djangouser group=djangouser processes=5 threads=1

    DocumentRoot /home/djangouser/web-app
    ErrorLog /home/djangouser/web-app/apache/logs/error.log
    CustomLog /home/djangouser/web-app/apache/logs/access.log combined
    WSGIScriptAlias / /home/djangouser/web-app/apache/django.wsgi

    <Directory /home/djangouser/web-app/apache>
        Order deny,allow
        Allow from all
    </Directory>

    <Directory /home/djangouser/web-app/templates>
        Order deny,allow
        Allow from all
    </Directory>

    <Directory /home/djangouser/web-app/bmdata/static>
        Order deny,allow
        Allow from all
    </Directory>

    <Directory /usr/lib/python2.6/site-packages/django/contrib/admin/static/admin/>
        Order deny,allow
        Allow from all
    </Directory>

    LogLevel warn

    Alias /static/admin/ /usr/lib/python2.6/site-packages/django/contrib/admin/static/admin/
    Alias /static/ /home/djangouser/web-app/bmdata/static/
</VirtualHost>

WSGISocketPrefix /home/djangouser/web-app/apache/run/

  • Create the WSGI script referenced by WSGIScriptAlias above (/home/djangouser/web-app/apache/django.wsgi):
import os, sys
os.environ['DJANGO_SETTINGS_MODULE'] = 'BMonitor.settings'
import django.core.handlers.wsgi

application = django.core.handlers.wsgi.WSGIHandler()

  • Installing python libs for matplotlib and numpy on AWS
sudo yum install gcc-c++
sudo yum install gcc-gfortran
sudo yum install python-devel
sudo yum install atlas-sse3-devel
sudo yum install lapack-devel
sudo yum install libpng-devel
sudo yum install freetype-devel
sudo yum install zlib-devel

tar xzvf matplotlib-1.3.1.tar.gz
cd matplotlib-1.3.1

sudo python setup.py build
sudo python setup.py install

Saturday, 7 September 2013

Using wget to download an ASP site

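You can download an ASPX site that requires a username/password login as follows:

First post the username/password to the login page and save the cookie file. The __VIEWSTATE and __EVENTVALIDATION values come from hidden fields in the login form, so copy them from your own site's login page.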

wget --mirror -r \
--user-agent="" \
--keep-session-cookies --save-cookies cookies.txt \
--post-data '__VIEWSTATE=%2FwEPDwULLTE3MDc5MjQzOTdkZIP%2Fxc105yfz2jGFj4Nd3tPvrEeNara43fIRI5oAW%2Bwv&__EVENTVALIDATION=%2FwEWBAKisoyCAwLB2tiHDgK1qbSRCwL2k8O9DUQa5owMFDWzFnBoIDusNkznjB65a6zRyNETOEZfBM1o&txtUser=admin&txtPassword=admin&login_btn=Sign+In' \
-E -k -p <login-page-URL>

Then fetch the other pages, reusing the cookie file saved in the first step.

wget --mirror -r \
--user-agent="" \
--keep-session-cookies --load-cookies cookies.txt \
-E -k -p <site-URL>

For details of the options, refer to the wget manual ("man wget") :)

Saturday, 20 July 2013

vimdiff tips and tricks - how to copy between two screens

do (diff obtain) and dp (diff put) are what you need. Here is a small list of other helpful commands in this context.
]c               - advance to the next block with differences
[c               - reverse search for the previous block with differences
do (diff obtain) - bring changes from the other file to the current file
dp (diff put)    - send changes from the current file to the other file
zo               - unfold/unhide text
zc               - refold/rehide text
zR               - open all folds in the current window (zr opens one more fold level)
Note: Both do and dp work if you are on a block or just one line under a block.
:diffupdate will re-scan the files for changes

Wednesday, 3 July 2013

Debugging / Tracing bash shell scripts

Watching your script run

It is possible to have bash show you what it is doing when you run your script. To do this, add a "-x" to the first line of your script, like this:
#!/bin/bash -x
Now, when you run your script, bash will display each line (with substitutions performed) as it executes it. This technique is called tracing. Here is what it looks like:
[me@linuxbox me]$ ./trouble.bash
+ number=1
+ '[' 1 = 1 ']'
+ echo 'Number equals 1'
Number equals 1
Alternatively, you can use the set command within your script to turn tracing on and off. Use set -x to turn tracing on and set +x to turn tracing off. For example:

set -x
if [ "$number" = "1" ]; then
    echo "Number equals 1"
else
    echo "Number does not equal 1"
fi
set +x
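
Tracing can also be switched on from the command line without editing the script at all, by running it with bash -x. The sketch below assumes a throwaway script written to a temporary file (its name and contents are made up for illustration) and captures only the trace, which bash writes to stderr.

```shell
# Write a throwaway script (hypothetical contents) to a temp file.
script=$(mktemp)
cat > "$script" <<'EOF'
number=1
if [ "$number" = "1" ]; then
    echo "Number equals 1"
else
    echo "Number does not equal 1"
fi
EOF

# -x traces the whole run; redirect stderr into the capture and discard stdout,
# so only the "+ ..." trace lines are kept.
trace=$(bash -x "$script" 2>&1 >/dev/null)
echo "$trace"

# Clean up the temp file.
rm -f "$script"
```

Keeping the trace separate from normal output like this makes it easy to save to a log file or to grep for the one command you are chasing.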

Wednesday, 19 June 2013

Vim tips and tricks

Cut and paste:
  1. Position the cursor where you want to begin cutting.
  2. Press v (or upper case V if you want to cut whole lines).
  3. Move the cursor to the end of what you want to cut.
  4. Press d.
  5. Move to where you would like to paste.
  6. Press P to paste before the cursor, or p to paste after.
Copy and paste can be performed with the same steps, only pressing y instead of d in step 4.

Saturday, 15 June 2013

Using nettop on Mac

Use the nettop command to check per-application network activity in OS X.

Sunday, 9 June 2013

Learning CSS

A style sheet is a file that describes how an HTML file should look.

p {
    color: red;
}
span {
    color: #FFC125;
}

There are two main reasons for separating your form/formatting (CSS) from your functional content/structure (HTML):
  1. You can apply the same formatting to several HTML elements without rewriting code (e.g. style="color:red") over and over
  2. You can apply similar appearance and formatting to several HTML pages from a single CSS file

To add the CSS file to your html:

<link type="text/css" rel="stylesheet" href='stylesheet.css'/>

The general format looks like this:
selector {
    property: value;
}
selector can be any HTML element, such as <p>, <img>, or <table>.
property is an aspect of a selector. For instance, you can change the font-family, color, and font-size of the text on your web pages (in addition to many more).
value is a possible setting for a property. color can be red, blue, black, or almost any color; font-family can be a whole bunch of different fonts; and so on.
HTML comments look like this:
<!--I'm a comment!-->
CSS comments, on the other hand, look like this:
/*I'm a comment!*/
The font-size unit em is a relative measure: one em is equal to the default font size on whatever screen the user is using. That makes it great for smartphone screens, since it doesn't try to tell the smartphone exactly how big to make a font: it just says, "Hey, 1em is the font size that you normally use, so 2em is twice as big and 0.5em is half that size!"

You can tell CSS to try several fonts, going from one to the next if the one you want isn't available.
For example, if you write:
p {
    font-family: Tahoma, Verdana, sans-serif;
}
CSS will first try to apply Tahoma to your paragraphs. If the user's computer doesn't have that font, it will try Verdana next, and if that doesn't work, it will show a default sans-serif font.
Making a colored block using <div>:

div {
    background-color: #cc0000;
    width: 100px;
    height: 100px;
}

Many HTML elements support the border property. This can be especially useful with tables.

selector {
    border: 2px solid red;
}

Links have a lot of the same properties as regular text: you can change their font, color, size, and so on. But links also have a property, text-decoration, that you can change to give your links a little more custom flair. 
Nested Selectors

If you have a paragraph inside a div that's inside another div, you can get to it like this:

div div p {
    /*Some CSS*/
}
This will style all paragraphs nested inside two divs and will leave all paragraphs that don't meet these criteria alone.

Remember, you can reach an element that is a child of another element like this:
div div p { /* Some CSS */ }
where in this case, we'd be grabbing any <p> that is nested somewhere inside a <div> that is nested somewhere inside another <div>. If you want to grab direct children (that is, an element that is directly nested inside another element, with no elements in between), you can use the > symbol, like so:
div > p { /* Some CSS */ }
This only grabs <p>s that are nested directly inside of <div>s; it won't grab any paragraphs that are, say, nested inside lists that are in turn nested inside <div>s.

Class selectors

Classes are assigned to HTML elements with the word class and an equals sign, like so:
<div class="square"></div>
<img class="square"/>
<td class="square"></td>
Classes are identified in CSS with a dot (.), like so:
.square {
    height: 100px;
    width: 100px;
}

ID selectors
IDs are assigned to HTML elements with the word id and an equals sign:
<div id="first"></div>
<div id="second"></div>
<p id="intro"></p>
IDs are identified in CSS with a pound sign (#):
#first {
    height: 50px;
}

#second {
    height: 100px;
}

#intro {
    color: #FF0000;
}
This allows you to apply style to a single instance of a selector, rather than all instances.

Pseudo selectors

The CSS syntax for pseudo selectors is
selector:pseudo-class_selector {
    property: value;
}
It's just that little extra colon (:).
There are a number of useful pseudo-class selectors for links, including:
a:link: An unvisited link.
a:visited: A visited link.
a:hover: A link you're hovering your mouse over.
You can actually select any child of an element after the first child with the pseudo-class selector nth-child; you just add the child's number in parentheses after the pseudo-class selector. For example,
p:nth-child(2) {
    color: red;
}
would turn every paragraph that is the second child of its parent element red.

There's also a very special selector you can use to apply CSS styling to every element on the page: the * selector. For example, if you type
* {
    border: 2px solid black;
}
you'll put a 2-pixel solid black border around every element on the page.

Certain selectors will "override" others if they have a greater specificity value. ul li p { is more specific CSS than just p {, so when CSS sees tags that are both <p> tags and happen to be inside unordered lists, it will apply the more specific styling (ul li p {) to the text inside the lists.

CSS Positioning

Elements populate the page in what's known as the CSS box model. Each HTML element is like a tiny box or container that holds the pictures and text you specify.

You'll see abbreviations like TM, TB, and TP in the diagram. These stand for "top margin," "top border," and "top padding." As we'll see, we can adjust the top, right, left, and bottom padding, border, and margin individually.

The display property. We'll learn about four possible values:
  • block: This makes the element a block box. It won't let anything sit next to it on the page! It takes up the full width.
  • inline-block: This makes the element a block box, but will allow other elements to sit next to it on the same line.
  • inline: This makes the element sit on the same line as another element, but without formatting it like a block. It only takes up as much width as it needs (not the whole line). The inline display value is better suited for HTML elements that are blocks by default, such as headers and paragraphs.
  • none: This makes the element and its content disappear from the page entirely!


If you want to specify a particular margin, you can do it like this:
margin-top: /*some value*/
margin-right: /*some value*/
margin-bottom: /*some value*/
margin-left: /*some value*/
You can also set an element's margins all at once: you just start from the top margin and go around clockwise (going from top to right to bottom to left). For instance,
margin: 1px 2px 3px 4px;
will set a top margin of 1 pixel, a right margin of 2, a bottom of 3, and a left of 4.