Thursday, 31 May 2012

Find the smallest positive number missing from an unsorted array

I.) Order statistics, to find the min +ve number missing from the array. Divide the array into positive and negative numbers.
Take a random no.(r) from positive array and partition the array like we do in quick sort. After partitioning the random no. is placed at the correct sorted position(k) in the array. If the left side of position 'k' contains less than k number then the missing number lies in the left half. Or else the missing number lies in the right half. Partition and continue searching.
This can search in average O(n) time, and worst case O(nlogn) time.

II.) This solution uses the indexes of the array(i) to mark the element 'i' seen by making it negative. The first index with the non negative integer is the answer.


Find continuous subsequence of an array of integers such that its sum is maximum positive sum.
function maximumSumSubarray(arr){
    var sum=0,max=0,i;
        sum = sum + arr[i];
           max = sum;
   return max;

Tuesday, 29 May 2012

Without using a calculator, how many zeros are at the end of "100!"? (that's 100*99*98*...*3*2*1)

Answer: What you don't want to do is start multiplying it all out! The trick is
remembering that the number of zeros at the end of a number is equal to the
number of times "10" (or "2*5") appears when you factor the number. Therefore
think about the prime factorization of 100! and how many 2s and 5s there are.
There are a bunch more 2s than 5s, so the number of 5s is also the number of 10s in
the factorization. There is one 5 for every factor of 5 in our factorial multiplication
(1*2*...*5*...*10*...*15*...) and an extra 5 for 25, 50, 75, and 100. Therefore we have
20+4 = 24 zeros at the end of 100!.

You have some nuts and some bolts in random order. You cannot compare nuts with nuts and bolts with bolts. It is guaranteed that a nut will fit in one or more bolts. Now you have to sort both nuts and bolts separately in minimum complexity.

You have some nuts and some bolts in random order. You cannot compare nuts with nuts and bolts with bolts. It is guaranteed that a nut will fit in one or more bolts. Now you have to sort both nuts and bolts separately in minimum complexity.

I used quick sort approach and sorted both in n * log (n). (The exact solution I am not posting it, quick sort is a hint).
Take a bolt and use it as pivot to find the smaller, matching and the bigger nuts. Now using this nut as pivot, partially sort the bolts. Now continue with the left and right halves, recursively.

Ternary search

A ternary search algorithm is a technique in computer science for finding the minimum or maximum of a unimodal function (function that is either strictly increasing and then strictly decreasing or vice versa). A ternary search determines either that the minimum or maximum cannot be in the first third of the domain or that it cannot be in the last third of the domain, then repeats on the remaining two-thirds.

The function

Assume we are looking for a maximum of f(x) and that we know the maximum lies somewhere between A and B. For the algorithm to be applicable, there must be some value x such that
  • for all a,b with A ≤ a < bx, we have f(a) < f(b), and
  • for all a,b with xa < b ≤ B, we have f(a) > f(b).

Run Time Order

T(n) = T(n/3) + c
   Θ(log n)

Recursive algorithm
def ternarySearch(f, left, right, absolutePrecision):
#left and right are the current bounds; the maximum is between them
if (right - left) < absolutePrecision:
return (left + right)/2

   leftThird = (2
*left + right)/3
   rightThird = (left + 2

if f(leftThird) < f(rightThird):
return ternarySearch(f, leftThird, right, absolutePrecision)

return ternarySearch(f, left, rightThird, absolutePrecision)

Negative base representation (Base -2)

The base -r expansion of a number can be found by repeated division by (-r),recording the non-negative remainders of  and concatenating those remainders, starting with the last. Note that if  remainder d, then  For example, in negaternary:

NOTE: remember to subtract the smaller number, e.g. -3*2 = -6 from -5 and not -3*1 = -3  from -5 as -3 > -5.

Find the duplicate in an array of numbers from 1 to 1000

I. Given an array of 1001 integers with number from 1..1000, and one of the numbers duplicated. How would you find the duplicate in the array.

-> (Sum of Array)-(Sum of number from 1..1000) = duplicate number
-> XOR of all numbers from 1 to 1024 is zero. So, XOR all numbers in the array, plus the numbers from 1001..1024. The duplicate number is produced as the output.

II. What if the array has 1002 integers with two different numbers as duplicates. How would you find both the duplicates ?

-> This one is based on the lines of the XOR solution above. Find the XOR.
In this XOR value, each '1' bit represents the position where the two duplicates differ.
So make two sets of numbers, one in which this bit position is set and another with this bit unset.
Take the XORs separately for one set of elements in which a particular bit is set, this will give you one element. XORing the elements with the bit unset will give you the other element.

code: qor:{((x&y)~:)&x|y}

Median of medians algorithm

  1. If n is small, for example n<6, just sort and return the k'th smallest number.( Bound time- 7)
  2. If n>5, then partition the numbers into groups of 5.(Bound time n/5). Sort the numbers within each group. Select the middle elements (the medians). (Bound time- 7n/5)
  3. Call your "Selection" routine recursively to find the median of n/5 medians and call it m. (Bound time-Tn/5)
  4. Compare all n-1 elements with the median of medians m and determine the sets L and R, where L contains all elements <m, and R contains all elements >m. Clearly, the rank of m is r=|L|+1 (|L| is the size or cardinality of L). (Bound time- n)
  5. If k=r, then return m
  6. If k<r, then return k th smallest of the set L .(Bound time T 7n/10)
  7. If k>r, then return k-r th smallest of the set R.

This algorithm (O(n)) is used to improve the worst case performance of basic (partitioning) order selection algo used in quicksort (O(n^2)) .

Monday, 28 May 2012

Finding second largest element in the array with min. no. of comparisions


To find the MAX element of the array, we need to rule out (n-1) no.s, which means that we require atleast (n-1) comparisons. If we try to find the 2nd largest element in a similar fashion, we would need another (n-2) comparisons.

If however, we use the tournament model, we can reduce the no. of comparisons for second largest to (n-1)+logn. This involves finding the MAX as above and then finding the second MAX from amongst all those who lost to the MAX. The elements which lost to MAX at each level can be found out by creating a heap like structure of the tournament.

An extension to this is to find the kth largest element in the array.
->This can be found out by conducting k/2 tournaments to find MAX and 2nd MAX as above.
->We can use order statistics algorithm to find the kth largest element in O(n).
->Create heap in O(n). Remove the MAX n times from the heap, in O(k logn) time.

Sunday, 27 May 2012

Boyer-Moore Majority Vote Algorithm

Given an array A[N] of integers, find the major element in the array or return -1 if no major element exists in the array. An element 'e' of an array is called major if it exists more than N/2 times in the array.
PS: Use O(1) space and time complexity should be O(N).

->Initial thought is to find the median of the array in O(N) and check if the median exists more than N/2 times in the array. Important thing to understand is that, if the element is the major element of the array then it has to occur at the median position.

Another approach i found after searching is based on majority vote algorithm, called Boyer-Moore algorithm. This algorithm is faster than the above one.
->In the first pass of the array you start with the first element as the major element and its count set to 1. There on, major element is voted for (when same element occurs next time) and against (when any other element occurs at the next position). After the count reaches 'zero', major element is changed to the next element and count set to 1. If there is a majority element in the array, then it would remain set as the major element. If however, there is no major element i.e. no element occurs more than N/2 times in the array, then the value in the major element after first pass would be a false value.
-> So we need a second pass, in which we get the count of the major element found in the firs pass. This count tells if the element is a true majority or a false one.

This is easy to PROVE: If there is a majority element 'e', then e's count would be atleast N/2+1 and other elements can not bring this count to zero by voting against it. hence this would be set as the major element after the first pass.

This algorithm is explained very well using a simulation on the following link :, and the original paper here:

Friday, 25 May 2012

Getting an interview call from any company

Everyone knows it is difficult to get an interview call from a company unless you are from a top notch college or you know an insider. True, isn't it ?
Well there is a company called RoundOne which links you to an insider of almost all companies in India. It allows you to be that insider as well, so get going people.
I see almost all top notch comapnies like Google, Microsoft, Amazon ... up there. Is your company there, if not why not add it.

Go ahead and give it a try !!

Thursday, 10 May 2012

Longest palindrome

Write algo for longest palindrome?

Probability and the chessboard horse

A horse is in chessboard. given its x,y find the probability through program that it will remain in board after n moves.

Awk - with loops and regex

Awk built-in variables:

FNR         The input record number in the current input file.
NF          The number of fields in the current input record.
NR          The total number of input records seen so far.

// an awk command to match the regex "UST" in the line
cat matrace_20120430234915.dat | awk -F'[,]' '{print FNR; i=1;for(i=1;i<(NF);i++) { print i,$i, /UST/;} }'

// an awk command which finds the column matching the regular expression of date in format "yyyy-mm-dd" and splits the date and adds 1 to the date and then prints the same.
awk -F'[,]' '{print FNR; i=1;for(i=1;i<(NF);i++) { print i,$i; if(match($i,/(....)\-(..)\-(..)/)) { split($i,a,"-"); printf("%d-%d-%d",a[1],a[2],(a[3]+1));};} }'

// awk using sprintf
awk -F'[,]' '{x=sprintf("%s",$1); i=1; for(i=2;i<(NF);i++) {  if(match($i,/(....)\-(..)\-(..)/)) { split($i,a,"-"); x=sprintf("%s,%s-%s-%d",x,a[1],a[2],(a[3]+1));} else { x=sprintf("%s,%s",x,$i); } } x=sprintf("%s,%s",x,$(NF)); print x;}'

Regex - regular expressions

A regular expression is a special string that describes a search pattern.

A regular expression(regex) is one or more non-empty branches, separated by '|'.
'|' - It matches one of the branches

A branch is one or more atoms, concatenated.

An atom is possibly followed by a '*', '+', '?', or bound.
* - 0 or more atom
+ - 1 or more atom
? - 0 or 1 atom
a{1,3} - atom a occurs between 1 to 3 times

An atom is a regular expression enclosed in '()' (matching a match for the regular expression), a bracket expression (see below),
'.' (matching any single character)
'^' (matching the null string at the beginning of a line)
'$' (matching the null string at the end of a line)
a `\' followed by one of the characters `^.[$()|*+?{\' (matching that character taken as an ordinary character) or a single character with no other significance (matching that character).
There is one more type of atom, the back reference: `\' followed by a non-zero decimal digit d matches the same sequence of characters matched by the d-th parenthesized subexpression (numbering subexpressions by the positions of their opening parentheses, left to right), so that (e.g.) `([bc])\1' matches `bb' or `cc' but not `bc'.

[0-9] or [a-z] for specify the character range
A bracket expression is a list of characters enclosed in '[]'. It normally matches any single character from the list.
If the list begins with '^', it matches any single character not from the rest of the list.
If two characters in the list are separated by `-', this is shorthand for the full range of characters between those two inclusive (e.g. '[0-9]' matches any decimal digit).
With the exception of ']','^','-' all other special characters, including `\', lose their special significance within a bracket expression.

Wednesday, 2 May 2012

MySQL commands

List all databases on the sql server.

mysql> show databases;

Creating a database:

mysql> CREATE DATABASE magento;

Deleting a database:

mysql> drop database magento;

To see all the tables in the db.

mysql> show tables;