Saturday 27 October 2012

String searching problem

Given a TEXT string say "abcmanmfaxyzabcn" and a search string "man". Find the number of occurrences of the SEARCH string "man" in the given text string. The alphabets of the search string can be separated by other characters in between them in the text string e.g. "man" occurs twice in "mabcayzn" with 'a' occurring at two different positions.

Solution:
  1. First build a boolean array of 26 chars and store the chars which occur in the SEARCH string.
  2. Next remove all characters from the TEXT string which don't occur in the SEARCH string using the above array. This allows us to reduce the problem size by removing unwanted characters from TEXT string.
  3. The first approach i gave to the interviewer was to use a recursive function, which passes 2 arguments to the function findSubstring(String text, String search) and this searches search[0] in all the places in text and then calls the function on the remaining subproblem. However the time complexity of this solution is exponential.
  4. Next i came up with a dynamic programming solution. In this we create a matrix res[m][n] with TEXT string chars as the columns and the SEARCH string chars as the rows. Now we calculate  res[i][j] as:
res[i][j] =   0  if SEARCH[i] != TEXT[j]
                          Sum of all res[i-1][k] for all  0<k<j

The time complexity of this solution is O(mn) and is a lot faster than the recursive solution.

Design a timer class

Design a timer class. The users of your class should be able to register to your class with the time in secs and their callbacks. These callbacks are called as the time expires.


  • The timer runs in its own thread, and it sleeps for 1 sec (since, the interviewer had asked for 1 second as the granularity)
  • It maintains a list of  'Task/Runnable' along with the remaining time to expire.
  • Every time the timer wakes up after a sec, it reduces the expiry time for each task by 1 sec.
  • And for all those tasks for which the expiry reaches zero, their callbacks are called.
  • I suggested a faster way (where one doesn't need to iterate the list of tasks) if the timer was limited to a fixed number of seconds using an circular array of seconds.

Linked list Duplication

Given a linked list which has the node structure as shown below:

class Node{
  Node val; // note that the value also randomly points to some other node in the LL
  Node next;
}

How would you duplicate this linked list ?

Hint: I used an external hashmap DS in my solution.

Better solution:
http://www.geeksforgeeks.org/a-linked-list-with-next-and-arbit-pointer/

Microsoft Interview Question

Given two numbers as a linked list of digits, write a program to add the numbers and return the result in the new linked list.

Microsoft Interview Questions


  1. Make the mirror image of the given binary tree.
  2. Design the data structure and algorithm to help in spell correction of words. The correct words are given in alphabetical order in the dictionary file.
  3. Write a function which tests whether the given binary tree is a binary search tree or not?
  4. Merge two given BSTs in linear time.
  5. Using getchar() take user input until user presses Enter key. And then print the user input in the reverse order.

Adobe Interview Questions


  1. A rectangle is cut from another rectangle (any orientation). Find the line which divides both rectangles into 2 equal halves.
  2. Do the tree traversal level by level. After each level print the new line i.e. ‘\n’.
  3. Solve the 12 coin problem to find the different coin in 3 weightings.
  4. A king has 1000 barrels of wine, out of which one is poisoned. Use min. no of prisoners to identify the poisoned bottle.
  5. Modified tower of Hanoi. Every disk move is through an intermediate bar.
  6. Find the no. of ways in which the sum of coins can be obtained using the coins x,y,z.
  7. Sorting of very large file, which does not fit in memory.

Google Interview Questions


  1. You are given a sorted array (a[]) and a value sum. Find the pairs a[i] and a[j] in the sorted array s.t. a[i] + a[j] = sum. Your algorithm should be linear in time and space.
  2. In the tennis tournament find the second best player.

Monday 8 October 2012

Find all contiguous sub-arrays whose sum is zero.

For example:

i/p: 1,2,3,-3,-2,-1
o/p: [2,3], [1,4], [0,5]

You can do a running sum of all the array elements to get the tmp[]:
=> 1 3 6 3 1 0

Now in this array you need to find all duplicates,triplets etc.. All such pairs represent indices of numbers between which the sum of sub-array is zero.

Now to find the duplicates, we can store the tmp[] values as the key in the hashmap with the array index as the values. So, all indices which fall to same key gives us the solution.

Design data structure for merging meeting schedules

Add two numbers without using arithmetic operators

Store an N-ary tree structure in Relational database table

I was asked this question during my tower research interview for the position of Senior script developer.
I came up with the solution to store it in a self referring table which has two columns (parent,child). So, in this kind of table he said that finding the depth of any particular node, i would have to write/call queries equaling the depth of the node under consideration. He asked how can you improve the data structure to store this kind of structure in tables ??

http://iamcam.wordpress.com/2006/03/24/storing-hierarchical-data-in-a-database-part-2a-modified-preorder-tree-traversal/
http://www.sitepoint.com/hierarchical-data-database-2/

Other questions:
Linux:
Find all files open under a process -> ls -l /proc/{PID}/fd
How to setup a password-less SSH
Remove the soft-links in a directory and replace them with the copy of the original file.
List all files in a directory recursively which are not owned by a particular user.

Array Based:
Given 2K+1 numbers out of which K numbers occur twice each and one number occurs just once, find this one time occurring number. -> used XOR
Given numbers from 1 to K with 2 numbers missing, find the 2 missing numbers. -> used XOR based partitions.

Java:
When is finally called. What if you return from inside try and have a return inside finally as well ? What if finally throws an exception ?
About the synchronized() keyword.

Database:
How would you store a tree in the relational tables?

Interviewer: Manikant K.

How to design a web forum

This was asked to me as a design question in amazon telephonic round. The interviewer wanted to know the DB design and the web-app design for a forum which supports around 100K users with the following use-cases:


 -Sign up
- Posting
- Replying
- Searching [keywords + dates based]

I told him we could have following DB schemas:


Thread [tid, subject, date]
Post [pid, tid, uid, msg, datetime ]
User [uid, role, emailid, password_hash ...]
Role [rid, list flags(for permisssion etc.)...]

The plain DB solves the use cases for Sign-up, posting and replying but could become a bottle neck for search if we run a search query based on query in the Post table (with a where clause as 'like "%keyword%"). So, for improving the performance of this, i suggested running an independent background process which parses the msgs in the post table and stores the important keywords from this messsage in the  independent datastore/B-Tree structure to be able to index the keywords for faster searches. The corresponding value for the keywords is the 'pid' (post id) in the table. This should improve the performance of the query. the interviewer seemed to have liked the proposal. Let me know if you guys can think of a better strategy for faster searches.

After this he wanted me to move to the Java side and explain the design there. I told that we would make similar classes/objects like the tables above, define relationship between those classes and then define the operations on these classes.
For components, I told him i would use the MVC architecture, where the views would be JSPs and Controller would be the interacting servlets which would extract data from the views, call the business logic from the Model and then fill the results in a new view which is returned to the user. He asked me to delve a little deeper into the Model components. I told this would basically have two main layers one is DAO (which interacts with the DB) and other is the service layer (which is called by the controller and internally calls the DAO classes for DB interaction). The service classes would be using the above set of objects.
PS: I wasn't very comfortable with the component design part, so any comments are welcome.

Sunday 7 October 2012

Chat server based on Apache Mina

Initially thought of making a chat server based on tomcat, but came across apache mina and apache async-web products which seem to be exciting and more scalable as well.

http://mina.apache.org/

Configuring the maximum no. of request threads for the tomcat server

Reference: http://tomcat-configure.blogspot.in/2009/01/tomcat-maxthreads-configuration.html
Reading: http://tomcat.apache.org/tomcat-5.5-doc/config/http.html
http://tomcat.apache.org/tomcat-6.0-doc/aio.html

Tomcat maxThreads configuration

Tomcat maxThreads represents the maximum number of request
processing threads to be created by the HTTPConnector.



<Connector port="8080" address="localhost"     
     maxThreads="250" maxHttpHeaderSize="8192"
     emptySessionPath="true" protocol="HTTP/1.1"
     enableLookups="false" redirectPort="8443" acceptCount="100"
     connectionTimeout="20000" disableUploadTimeout="true" />





This determines the maximum number of simultaneous requests that can be handled. If not specified, this attribute is set to the default value of 200.

How the process works:



  • At server startup, the HTTP Connector will create a number of processing threads based on the value configured for theminSpareThreads attribute.
  • Each incoming request requires a thread for the duration of that request.
  • If the number of simultaneous requests cannot be handled by the currently available request processing threads, additional threads will be created up to the configured maximum (the value of the maxThreads attribute).
  • If still more simultaneous requests are received, they are stacked up up to the configured maximum (the value of the acceptCount attribute).
  • Any further simultaneous requests will receive "connection refused" errors, until resources are available to process them.
Guidelines for maxThreads:
maxThreads is an important tuning parameter, however if you are reaching an error like:
org.apache.tomcat.util.threads.ThreadPool logFull SEVERE: All threads (150) are
currently busy, waiting. Increase maxThreads (150) or check the servlet status
you should at first investigate if it's rather a problem of individual requests taking too long: are your threads returning to the pool? if, for example, database connections are not released, threads pile up waiting to obtain a database connection thereby making it impossible to process additional requests. This is a problem in your webapp.
Take a thread dump to find out where they're stuck. Increasing too much maxThreads will lead to :
  • Consume a good chunk of memory.
  • Your system will spend too much time context switching
So once you have already optimized your application try raising you maxThread attribute up to 500-750. I wouldn't advice to create larger Connectors, rather if 750 Connections are not enough create a Cluster configuration with several Tomcat instances. For example 2 instances of tomcat each one with maxThreads=500 instead of a single Tomcat with maxThreads=1000
If you want to learn more about JDK tuning read this tutorial which contains many tips valid also for tomcat:
http://www.mastertheboss.com/en/jboss-application-server/113-jboss-performance-tuning-1.html