Wednesday, 21 November 2012

Getting started with Git

A very good tutorial on Git: http://www.sbf5.com/~cduan/technical/git/

Git commands:

  • Git pull

    From what I understand, git pull downloads whatever you ask for from a remote (that is, whichever branch you name) and immediately merges it into the branch you are on when you run it. Pull is a high-level command that runs ‘fetch’ and then ‘merge’ by default, or a rebase with ‘--rebase’. You could do without it; it’s just a convenience.
  • Git fetch

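    Fetch downloads the same commits from the remote as pull, but it won’t do any merging: it only updates your remote-tracking branches and leaves your local branches and working files untouched.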
  • Git clone

    Git clone will clone a repo into a newly created directory. It’s useful when you’re setting up your local copy for the first time.
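
The relationship between the three commands can be tried out in a scratch directory. This is only a sketch under some assumptions: git is installed, and the directory names (upstream, clone_a, clone_b), the file notes.txt and the helper g are all made up for the example. It clones one repo twice, adds a commit to the original, then updates one clone with fetch followed by merge and the other with pull.

```shell
#!/bin/sh
# Sketch: fetch + merge should leave a clone in the same state as pull.
# All names below (upstream, clone_a, clone_b, notes.txt, g) are hypothetical.
set -e

work=$(mktemp -d)
cd "$work"

# Pass an identity per command so the sketch does not rely on global git config.
g() { git -c user.name="Demo" -c user.email="demo@example.com" "$@"; }

# 1. A repository standing in for the remote, with one commit.
g init -q upstream
( cd upstream && echo one > notes.txt && g add notes.txt && g commit -q -m "first" )

# 2. git clone: copy the repo into two newly created directories.
g clone -q upstream clone_a
g clone -q upstream clone_b

# 3. A new commit appears in the remote.
( cd upstream && echo two >> notes.txt && g commit -q -am "second" )

# 4a. clone_a: fetch only downloads; the current branch has not moved yet.
( cd clone_a && g fetch -q origin )
head_a_after_fetch=$(cd clone_a && git rev-parse HEAD)

# ...then merge the upstream tracking branch ('@{u}') into the current branch.
( cd clone_a && g merge -q '@{u}' )

# 4b. clone_b: pull does both steps in one go.
( cd clone_b && g pull -q --no-rebase )

head_upstream=$(cd upstream && git rev-parse HEAD)
head_a=$(cd clone_a && git rev-parse HEAD)
head_b=$(cd clone_b && git rev-parse HEAD)

echo "upstream:           $head_upstream"
echo "clone_a after fetch: $head_a_after_fetch"
echo "clone_a after merge: $head_a"
echo "clone_b after pull:  $head_b"
# The scratch directory is left in place for inspection; remove "$work" when done.
```

If it runs as intended, the last two hashes match the upstream one, while the hash recorded right after the fetch is still the old commit.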


If you get stuck, run ‘git branch -a’ and it will list all of your branches, so you can see which are remote-tracking and which are local.
http://betterexplained.com/articles/aha-moments-when-learning-git/
http://jonas.nitro.dk/git/quick-reference.html

Reading: http://blog.mikepearce.net/2010/05/18/the-difference-between-git-pull-git-fetch-and-git-clone-and-git-rebase/

Friday, 9 November 2012

Finding your gateway IP address in Linux, Windows and Mac

Original Source: http://wiki.amahi.org/index.php/Find_Your_Gateway_IP


  • Windows:

    • Click Start > All Programs > Accessories > Command Prompt.
    • When Command Prompt is open, type the following command: ipconfig | findstr /i "Gateway" (You can copy & paste it in the command prompt; just right-click anywhere in the command prompt window and select Paste.)
    • You should see something like this:
      C:\Documents and Settings\administrator>ipconfig | findstr /i "Gateway"
      Default Gateway . . . . . . . . . : 192.168.1.1

    • In this example, your default gateway (router) IP address is 192.168.1.1.

  • Linux:

    • You'll need to open a Terminal. Depending on your Linux distribution, it can be located in the menu items at the top, or at the bottom of your screen. In this example, we will use Fedora. Click Applications > System Tools > Terminal.
    • When the terminal is open, type the following command: ip route | grep default
    • The output of this should look something like the following:
      joe$ ip route | grep default
      default via 192.168.1.1 dev eth0 proto static

    • In this example, again, 192.168.1.1 is your default gateway (router) IP address.

  • Mac OS X:

    • Open the Terminal application. To do this, click Finder > Applications > Utilities > Terminal.app.
    • When Terminal.app is open, type the following command: netstat -nr | grep default
    • The output should look something like the following:
      joe$ netstat -nr | grep default
      default 192.168.1.1 UGSc 50 46 en1

    • In this example, again, 192.168.1.1 is your default gateway (router) IP address.
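
If you only want the address itself (for a script, say), the gateway field can be cut out of the Linux and Mac outputs shown above. A minimal sketch, assuming awk is available and the output has the same column layout as in the examples; the function names gateway_from_ip_route and gateway_from_netstat are made up here. The Windows command is not covered.

```shell
#!/bin/sh
# Linux: `ip route` prints "default via <gateway> dev <interface> ...".
# Reads that output on stdin and prints the third field of the default line.
gateway_from_ip_route() {
  awk '$1 == "default" && $2 == "via" { print $3; exit }'
}

# Mac OS X: `netstat -nr` prints "default <gateway> <flags> ...".
# Reads that output on stdin and prints the second field of the first default line.
gateway_from_netstat() {
  awk '$1 == "default" { print $2; exit }'
}

# Demo using the sample lines from this post instead of the live commands.
printf '%s\n' 'default via 192.168.1.1 dev eth0 proto static' | gateway_from_ip_route  # prints 192.168.1.1
printf '%s\n' 'default 192.168.1.1 UGSc 50 46 en1' | gateway_from_netstat              # prints 192.168.1.1
```

On a real machine you would pipe the command in directly, for example: ip route | gateway_from_ip_route. If netstat lists more than one default route, this sketch only returns the first one.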

Saturday, 3 November 2012

Debugging Hadoop MapReduce jobs using Eclipse in a local setup

Add the following line to 'conf/hadoop-env.sh':

export HADOOP_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5009"


Then set up Eclipse to connect to the above port (5009) using a remote debugging configuration.
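
One thing to keep in mind: with suspend=y, a JVM started with these options waits until a debugger attaches, so as far as I know every command that picks up HADOOP_OPTS will pause. A possible guard for 'conf/hadoop-env.sh' is sketched below. HADOOP_DEBUG and jdwp_opts are names invented for this example, not Hadoop settings.

```shell
# Sketch for conf/hadoop-env.sh: turn the debug agent on only when asked.
# HADOOP_DEBUG and jdwp_opts are hypothetical names, not part of Hadoop.

# Prints the JDWP agent options used in this post; the port is the first
# argument and defaults to 5009.
jdwp_opts() {
  printf '%s' "-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=${1:-5009}"
}

# Only append the agent options when HADOOP_DEBUG=1 is set in the environment.
if [ "${HADOOP_DEBUG:-0}" = "1" ]; then
  export HADOOP_OPTS="${HADOOP_OPTS:-} $(jdwp_opts 5009)"
fi
```

With this in place, a job would be started as HADOOP_DEBUG=1 followed by the usual hadoop command when you want to attach Eclipse, and normally otherwise.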

Detailed steps: http://code.google.com/p/hadoop-clusternet/wiki/DebuggingJobsUsingEclipse

I tried this in the standalone configuration of Hadoop.
I still have to try it in pseudo-distributed mode as well; does it work there?

Here is an article that describes debugging the daemon processes:
http://srinathsview.blogspot.in/2012/05/debugging-hadoop-task-tracker-job.html


Routing protocols

Learning Hadoop

Read the first 4 chapters of "Hadoop in Action" by Chuck Lam. These chapters introduce the basics of MapReduce and the Hadoop framework in plain language.

Run the word count example that comes with the Hadoop installation.

http://hadoop.apache.org/docs/r0.20.1/quickstart.html
http://hadoop.apache.org/docs/r0.20.1/mapred_tutorial.html
https://confluence.guavus.com/display/GTR/Install+and+Setup+Hadoop+on+Mac


Exercise: Implement the median function efficiently using Hadoop.