Google

Sep 21, 2011

Java Collection Interview Questions and Answers

Q. Can you name 2 Core Java APIs widely used in Java?
A. (1) Java Collection API
     (2) String class

You need data structures to store data retrieved from the database, manipulate by filtering, sorting, transforming, etc and then present it to the user via Web GUI, reports (e.g. PDF, online), etc.


Other relevant Java Interview Questions and Answers on data structures



Java Collections Framework (i.e. Data structures)  contains most commonly asked Java interview questions. A good understanding of Collections framework is required to understand and leverage many powerful features of Java technology. Here are a few important practical questions which can be asked in a Core Java interview.


Q. What are the common data structures, and where would you use them?
A.  Many leave out Trees and Graphs.  Trees and Graphs are very useful data structures as well. As you provide your answer,  the interviewer may further quiz you on.

Q. How you would go about implementing your own List, Set, and Map?

Whilst it is not recommended to write your own implementation when Java provides proven and tested implementations, the interviewer is testing your understanding on data structures. My book entitled "Core Java Career Essentials" covers this in more detail with diagrams, examples, and code.


Here are the common data structures.

Arrays are the most commonly used data structure. Arrays are of fixed size, indexed, and all containing elements are of the same type (i.e. a homogenous collection). For example, storing employee details just read from the database as EmployeeDetail[ ], converting and storing a string as a byte array for further manipulation or processing, etc. Wrap an array in a class to protect it from being inadvertently altered. This would be true for other data structures as well.


Lists are known as arrays that can grow. These data structures are generally backed by a fixed sized array and they re-size themselves as necessary. A list can have duplicate elements. For example,  adding new line items to an order that stores its line items as a list, removing all expired products from a list of products, etc. Initialize them with an appropriate initial size to minimize the number of re-sizes

Sets are like lists but they do not hold duplicate elements. Sets can be used when you have a requirement to store unique elements.

Stacks allow access to only one data item, which is the last item inserted (i.e. Last In First Out - LIFO). If you remove this item, you can access the next-to-last item inserted, and so on. The LIFO is achieved through restricting access only via methods like peek( ), push( ), and pop( ). This is useful in many programing situations like parsing mathematical expressions like (4+2) * 3, storing methods and exceptions in the order they occur, checking your source code to see if the brackets and braces are balanced properly, etc.

The LIFO access mechanism used by a stack has many practical uses.  For example, Evaluation of expressions / syntax Parsing, validating and parsing XML, undo sequence in a text editor, pages visited history in a web browser, etc. Java interview questions and answers on stack data structure


Queues are somewhat like a stack, except that in a queue the first item inserted is the first to be removed (i.e. First In First Out – FIFO). The FIFO is achieved through restricting access only via methods like peek( ), offer( ), and poll( ). For example,  waiting in a line for a bus, a queue at the bank or super market teller, etc.

Here is a code example of a blocking queue with multiple threads.


LinkedLists are data structures made of nodes, where each node contains data and a reference to the next node, and possibly  to the previous node as well for a doubly linked list. For example, a stack or queue can be implemented with a linked list or a doubly linked list because you can insert and delete at both ends. There would also be other situations where data will be frequently inserted and deleted from the middle. The Apache library provides a TreeList implementation, which is a good replacement for a LinkedList as it performs much better than a LinkedList at the expense of using a little more memory.  This means a LinkedList is rarely a good choice of implementation.

ArrayList is a good general purpose list implementation. An ArrayList is faster than a TreeList for most operations except inserting and removing in the middle of the list. A TreeList implementation utilizes a tree structure internally to ensure that all insertions and removals are O(log n). This provides much faster performance than both an ArrayList and a LinkedList where elements are inserted and removed repeatedly from anywhere in the list. 


class Link {
   private int id;                    // data
   private String name;               // data
   private Link next;                 // reference to next link
}


HashMaps are amortized constant-time access data structures that map keys to values. This data structure is backed by an array. It uses a hashing functionality to identify a location, and some type of collision detection algorithm is used to handle values that hash to the same location. For example, storing employee records with employee number as the key, storing name/value pairs read from a properties file, etc. Initialize them with an appropriate initial size to minimize the number of re-sizes.

Trees are the data structures that contain nodes with optional data elements and one or more child elements, and possibly each child element references the parent element to represent a hierarchical or ordered set of data elements.  For example, a hierarchy of employees in an organization, storing the XML data as a hierarchy, etc.  If every node in a tree can have utmost 2 children, the tree is called a binary tree. The binary trees are very common because the shape of a binary tree makes it easy to search and insert data. The edges in a tree represent quick ways to navigate from node to node.


Java does not provide an implementation for this but it can be easily implemented as shown below. Just make a class Node with an ArrayList holding links to other Nodes.

package bigo;

import java.util.ArrayList;
import java.util.List;

public class Node {
    private String name;
    private List<node> children = new ArrayList<node>( );
    private Node parent;
    
    public Node getParent( ) {
        return parent;
    }

    public void setParent(Node parent) {
        this.parent = parent;
    }

    public Node(String name) {
        this.name = name;
    }
    
    public void addChild(Node child) {
        children.add(child);
    }
    
    public void removeChild(Node child) {
        children.remove(child);
    }
   
    public String toString( ) {
        return name;
    }
 }


Graphs are data structures that represent arbitrary relationships between members of any data sets that can be represented as networks of nodes and edges. A tree structure is essentially a more organized graph where each node can only have one parent node. Unlike a tree, a graph's shape is dictated by a physical or abstract problem. For example, nodes (or vertices)  in a graph may represent cities, while edges may represent airline flight routes between the cities.




To make a Java graph class, you will have to work out the way in which the information can be stored and accessed. A graph will be using some of the data structures mentioned above. The Java API does not provide an implementation for this. But there are a number of third party libraries like  JUNG, JGraphT, and JDSL (does not seem to support generics).The book "Core Java Career Essentials" covers working examples using Java.

Q. What do you know about the big-O notation and can you give some examples with respect to different data structures?
A. The Big-O notation simply describes how well an algorithm scales or performs in the worst case scenario as the number of elements in a data structure increases.  The Big-O notation can also be used to describe other behavior such as memory consumption. At times you may need to choose a slower algorithm because it also consumes less memory. Big-o notation can give a good indication about performance for large amounts of data, but the only real way to know for sure is to have a performance benchmark with large data sets to take into account things that are not considered in Big-O notation like paging as virtual memory usage grows, etc. Although benchmarks are better, they aren't feasible during the design process, so Big-O complexity analysis is the choice.

The algorithms used by various data structures for different operations like search, insert and delete fall into the following performance groups like constant-time O(1), linear O(n), logarithmic O (log n), exponential O (c to the power n), polynomial O(n to the power c), quadratic O (n to the power 2) and factorial O (n!) where n is the number of elements in the data structure. It is generally a tradeoff between performance and memory usage. Here are some examples.

Example 1: Finding an element in a HashMap is usually a constant-time, which is  O(1) . This is a constant time because a hashing function is used to find an element, and computing a hash value does not depend on the number of elements in the HashMap.

Example 2:  Linear search of an array, list, and LinkedList is linear, which is O(n). This is linear because you will have to search the entire list. This means that if a list is twice as big, searching it will take twice as long.

Example 3: An algorithm that needs to compare every element in an array to sort the array has polynomial complexity, which is O (n2). A nested for loop is O (n2).  An example is shown under sorting algorithms.

Example 4: Binary search of a sorted array or ArrayList is logarithmic, which is O(log n). Searching an element in a LinkedList normally requires O(n). This is one of the disadvantages of LinkedList over the other data structures like an ArrayList or array offering a O (log n) performance, which offers better performance than O(n)  as the number of elements increases. A logarithmic running times mean, if 10 items take at most x amount of time, 100 items will take say at most 2x amount of time, and 10,000 items will take at most  4x. If you plot this on a graph, the time decreases as  n (i.e. number of items) increases.


Q. What can you tell about the performance of a HashMap compared to a TreeMap? Which one would you prefer?
A.  A balanced tree does have O (log n) performance. The TreeMap class in Java maintains key/value objects in a sorted order by using a red-black tree. A red-black tree is a balanced binary tree. Keeping the binary tree balanced ensures the fast insertion, removal, and look-up time of O (log n). This is not as fast as a HashMap, which is O(1) ,  but the TreeMap has the advantage of that the keys are in sorted order which opens up a lot of other capabilities.  



Q. Which one to choose? 
A.The decision as to using an unordered collection like a HashSet  or HasMap versus   using a sorted data structure like a TreeSet  or TreeMap depends mainly on the usage pattern, and to some extent on the data size and the environment you run it on.   The practical reason for keeping the elements in sorted order is for frequent and faster retrieval of sorted data if the inserts and updates are frequent. If the need for a sorted result is infrequent like prior to producing a report or running a batch process, then maintaining an unordered collection and sorting them only when it is really required with Collections.sort(...) could sometimes be more efficient than maintaining the ordered elements. This is only an opinion, and no one can offer you a correct answer. Even the complexity theories like Big-O notation like O(n) assume possibly large values of n.  In practice, a O(n) algorithm can be much faster than a O(log n) algorithm, provided the data set that is handled is sufficiently small. One algorithm might perform better on an AMD processor than on an Intel. If your system is set up to swap, disk performance need to be considered. The only way to confirm the efficient usage is to test and measure both performance and memory usage with the right data size. Measure both the approaches on your chosen hardware to determine, which is more appropriate.


Q. What is the tradeoff between using an unordered array versus an ordered array?
A. The major advantage of an ordered array is that the search times are much faster with O (log n) than an unordered array, which is O (n) for larger values of n. The disadvantage of an ordered array is that the insertion takes longer (i.e. O (n) ) because all the data with higher values need to be moved to make room. The insertion for an unordered array takes constant time of O(1). This means, it does not depend on the number of elements. The following code snippet demonstrates adding an element to both ordered and unordered array.

Inserting an element into an unsorted array


package bigo;

import java.util.Arrays;

public class InsertingElementsToArray {

    public static void insertUnsortedArray(String toInsert) {

        String[ ] unsortedArray = { "A", "D", "C" };

        String[ ] newUnsortedArray = new String[unsortedArray.length + 1];
        System.arraycopy(unsortedArray, 0, newUnsortedArray, 0, 3);
        newUnsortedArray[newUnsortedArray.length - 1] = toInsert;
        System.out.println(Arrays.toString(newUnsortedArray));
    }

    public static void main(String[ ] args) {
        insertUnsortedArray("B");
    }
}


Inserting an element into a sorted array


package bigo;

import java.util.Arrays;

public class InsertingElementsToArray {

    public static void insertSortedArray(String toInsert) {

        String[ ] sortedArray = { "A", "C", "D" };

        /*
         * Binary search returns the index of the search item
         * if found, otherwise returns the minus insertion point. This example
         * returns index = -2, which means the elemnt is not found and needs to
         * be inserted as a second element.
         */
        int index = Arrays.binarySearch(sortedArray, toInsert);

        if (index < 0) {                                   // not found.

            // array indices are zero based. -2 index means insertion point of
            // -(-2)-1 = 1,  so insertIndex = 1
            int insertIndex = -index - 1;

            String[ ] newSortedArray = new String[sortedArray.length + 1];
            System.arraycopy(sortedArray, 0, newSortedArray, 0,  insertIndex); 
            System.arraycopy(sortedArray, insertIndex,
                    newSortedArray, insertIndex + 1,  sortedArray.length - insertIndex);
            newSortedArray[insertIndex] = toInsert;
            System.out.println(Arrays.toString(newSortedArray));
        }
    }

    public static void main(String[ ] args) {
        insertSortedArray("B");
    }
}

So the decision depends on the usage pattern. Ask yourself the following questions. Do I have more inserts/deletes or search? What is the maximum number of elements likely to be stored in this array? How often do I need the sorted data? And finally what does my performance benchmark show?

Q. How do you get an immutable collection?
A. This functionality is provided by the Collections class, which is a wrapper implementation using the decorator design pattern.


public class ReadOnlyExample {
    public static void main(String args[ ]) {
        Set<string> set = new HashSet<string>( );
        set.add("Java");
        set.add("JEE");
        set.add("Spring");
        set.add("Hibernate");
        set = Collections.unmodifiableSet(set);
        set.add("Ajax");                                           // not allowed.
  }
}

Q. What does the following code do? Can the LinkedHashSet be replaced with a HashSet?

import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

public class CollectionFunction {
    public <e> List<e> function (List <e> list) {
          return new ArrayList<e>(new LinkedHashSet<e>(list));
    }
}

A. The above code removes duplicates from a supplied list by passing it through an implementation of a Set interface. In this case, a LinkedHashSet is used to honor the ordering. If ordering is not required, the LinkedHashSet can be replaced with a HashSet.

Q. What are some of the best practices relating to the Java Collection framework?
A.

  • Choose the right type of data structure based on usage patterns like fixed size or required to grow, duplicates allowed or not, ordering is required to be maintained or not, traversal is forward only or bi-directional, inserts at the end only or any arbitrary position, more inserts or more reads, concurrently accessed or not, modification is allowed or not, homogeneous or heterogeneous collection, etc. Also, keep multi-threading, atomicity, memory usage and performance considerations discussed earlier in mind.

  • Don't assume that your collection is always going to be small as it can potentially grow bigger with time. So your collection should scale well.

  • Program in terms of interface not implementation: For example, you might decide a LinkedList is the best choice for some application, but then later decide ArrayList might be a better choice for performance reason.

         Bad:
                  ArrayList list = new ArrayList(100);

        Good:
                 // program to interface so that the implementation can change
                List list = new ArrayList(100);
                List list2 = new LinkedList(100);


  • Return zero length collections or arrays as opposed to returning a null in the context of the fetched list is actually empty. Returning a null instead of a zero length collection is more error prone, since the programmer writing the calling method might forget to handle a return value of null.

          List emptyList = Collections.emptyList( );
          Set emptySet = Collections.emptySet( );

  • Use generics for type safety, readability, and robustness.

  • Encapsulate collections: In general, collections are not immutable objects. So care should be taken not to unintentionally expose the collection fields to the caller. The caller may not perform any necessary validation.

Note: These Java interview questions and answers are extracted from my book "Core Java Career Essentials"


Other relevant Java Interview Questions and Answers on data structures

Labels:

19 Comments:

Blogger javin paul said...

Good questions, worth including in my list of Top collection interview questions, but you got to include some from Generics and new collections classes introduced in Java 6.

4:23 PM, June 10, 2012  
Blogger Unknown said...

Thanks Javin. Main purpose of this blog is to compliment my books. My book entitled "Core Java Career Essentials" have chapters dedicated to "Generics" and "Data structures" with lots of examples and code snippets.

1:31 AM, June 11, 2012  
Anonymous Anonymous said...

Good examples for collection.
I am interested in buying the book "Core Java Career Essentials" but i did not find in India.

Can you please provide the vendor for buying in india

10:07 PM, June 30, 2012  
Blogger Unknown said...

It is only available via amazon.com.

9:59 AM, July 01, 2012  
Anonymous Anonymous said...

http://www.lulu.com/items/volume_71/10993000/10993464/2/print/CoreJavaCareerEssentialsInteriorCSFinal-preview.pdf

Free download.............. :)

5:35 PM, July 17, 2012  
Blogger enoughTheory said...

http://www.lulu.com/items/volume_71/10993000/10993464/2/print/CoreJavaCareerEssentialsInteriorCSFinal-preview.pdf

5:35 PM, July 17, 2012  
Blogger Unknown said...

This is a sample preview only. The complete books needs to be purchased via amazon.com.

5:40 PM, July 17, 2012  
Blogger Unknown said...

Hi sir could you please provide the how to explain the project architectural and project flow. and what the difference between the project architecture and flow.

3:57 PM, September 04, 2012  
Blogger Unknown said...

For example 3 tier Vs n-tier architecture, layers


1. A presentation layer (HTML + CSS + JS)
2. A presentation logic layer (Servlets, JSP)
3. Data access layer (business class)
4. Data layer (database, file system, etc)


Refer to following blog posts

http://java-success.blogspot.com.au/2011/10/java-interview-questions-and-answers-on.html


http://java-success.blogspot.com.au/2011/09/enterprise-java-interview-questions-and.html

4:51 PM, September 04, 2012  
Blogger Unknown said...

some correction for the above phrase "In this case, a LinkedHashSet is used to honor the ordering by implementing a SortedSet interface." LinkedHashSet does not implement SortedSet (TreeSet does) otherwise the phrase is true and explaining since it is about the "ordering" and not "sorting" in that piece of code.

5:48 AM, October 21, 2012  
Blogger Unknown said...

Thanks Catalin for pointing that out. I have corrected it.

9:40 AM, October 22, 2012  
Blogger suresh said...

Hi sir i need a free pdf can you please help me out ?

4:13 PM, March 29, 2013  
Anonymous Anonymous said...

Hi Arulkumaran,

I have 2.5+ years of exp and working on JSF-Spring Integration project from last two years.Recently I went through an interview and faced some easy but confusing questions.I have mentioned spring in my resume so the interviewer ask 90% questions from the spring.I have used only IOC and JDBC Template.

Can you please give me the answers for below queries:
1)Should I change the Spring word with JSF-Spring Integration in my resume..?
2)
What to mention in project architectural and project flow and what is the difference between the project architecture and flow.
3)Spring Controller ?Do we need to tell the interviewer about Spring Controller interface.
Can you please specify any book for spring basics.


Regards,
Rahul

7:03 PM, June 10, 2013  
Blogger Unknown said...

Sir I have one question, which was asked by interviewer?

Q. Suppose i inserted no of employee phone number in HashMap(which is started with 97, 98, 99, 88, 87....etc just like that) with employee id as a key. And i want to get those number, which was start with only 98. Then how would i get those number using HashMap? Please sir answer this question.

12:54 AM, September 30, 2013  
Anonymous james said...

Thanks a lot for nice post about collection

11:17 PM, October 06, 2013  
Blogger Unknown said...

What is price of your book

12:37 AM, October 30, 2013  
Blogger Unknown said...

check the link: http://java-success.blogspot.com.au/2012/04/javaj2ee-career-companion-and.html.

9:57 AM, October 30, 2013  
Blogger Abhishek Susarla said...

The effort that you had put to add content to the blog is commendable. Keep it up.!!

11:21 PM, November 20, 2013  
Anonymous Divya said...

Nice work, awesome book
Check More about Java

10:37 PM, January 06, 2014  

Post a Comment

Subscribe to Post Comments [Atom]

<< Home