
Jun 2, 2014

Event Driven Programming: Java example tutorial style - part 2: JMX enabling & jconsole

This extends the basic tutorial on event driven programming in Java. This tutorial covers JMX-enabling the EventHub class via an MBean and monitoring it through a JMX-compliant tool like jconsole.

Step 1: Define the interface with the attributes and operations that can be accessed via a JMX-compliant client like jconsole.


package com.writtentest14;

public interface EventHubMBean {

    /**
     * Attributes
     */
    long getEventCount();

    /**
     * Operations
     */
    void fireEventById(String eventId);
}


Step 2: Define the implementation of the above interface. Note that it extends StandardMBean from the Java Management API (javax.management).

package com.writtentest14;

import javax.management.StandardMBean;

public class StandardEventHubMBean extends StandardMBean implements EventHubMBean {

    private EventHub eventHub;
    
    public StandardEventHubMBean(EventHub eventHub) {
        super(EventHubMBean.class, false);
        this.eventHub = eventHub;
    }

    @Override
    public long getEventCount() {
        return this.eventHub.getEventCount();
    }

    @Override
    public void fireEventById(String eventId) {
        Event event = new Event(eventId);
        this.eventHub.fire(event);
    }

}


Step 3: Now, the revised EventHub with JMX enabled.

package com.writtentest14;

import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicLong;

import javax.management.MBeanServer;
import javax.management.MBeanServerFactory;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

/**
 * register and unregister event listeners
 */

public class EventHub {

 private static final EventHub INSTANCE = createInstance();
 
 private ConcurrentMap<String, List<EventListener<Event>>> registeredListeners = new ConcurrentHashMap<String, List<EventListener<Event>>>();

 private EventDispatcher synchronousDispatcher;

 private AtomicLong eventCount = new AtomicLong();

 public EventHub() {
  // note: set a dispatcher via setSynchronousDispatcher() before firing events
  registerMBean();
 }

 public static EventHub instance() {
  return INSTANCE;
 }

 public EventHub(EventDispatcher synchronousDispatcher) {
  this.synchronousDispatcher = synchronousDispatcher;
  registerMBean();
 }

 public long getEventCount() {
  return this.eventCount.get();
 }

 

 private long getNextEventNumber() {
  return this.eventCount.incrementAndGet();
 }

 protected EventDispatcher getSynchronousDispatcher() {
  return this.synchronousDispatcher;
 }

 public void setSynchronousDispatcher(EventDispatcher dispatcher) {
  this.synchronousDispatcher = dispatcher;
 }

 public void fire(Event event) {
  dispatch(event, getSynchronousDispatcher());
 }

 
 public synchronized void addListener(String eventId, EventListener<Event> listener) {
  List<EventListener<Event>> listeners = this.registeredListeners.get(eventId);
  if (listeners != null) {
   listeners.add(listener);
  } else {
   listeners = new CopyOnWriteArrayList<EventListener<Event>>();
   listeners.add(listener);
   this.registeredListeners.put(eventId, listeners);
  }

 }


 public void removeListener(String eventId, EventListener<Event> listener) {
  List<EventListener<Event>> listeners = this.registeredListeners.get(eventId);
  if (listeners != null) {
   listeners.remove(listener);

  }
 }

 private void registerMBean() {
  MBeanServer server = getMBeanServer();
  StandardEventHubMBean mbean = new StandardEventHubMBean(this);
  try {
   server.registerMBean(mbean, getMBeanObjectName());
  } catch (Exception e) {
   e.printStackTrace();
  }

 }

 protected void dispatch(Event event, EventDispatcher dispatcher) {
  getNextEventNumber();
  List<EventListener<Event>> listeners = getListeners(event);
  if (!listeners.isEmpty()) {

   dispatcher.dispatch(event, listeners);

  }
 }
 
  private static EventHub createInstance() {
         EventHub instance = new EventHub(new SimpleSynchronousDispatcher());
         return instance;
     }

 private List<EventListener<Event>> getListeners(Event event) {
  List<EventListener<Event>> listeners = this.registeredListeners.get(event.getId());
  return (listeners != null) ? listeners : Collections.<EventListener<Event>>emptyList();
 }

 private MBeanServer getMBeanServer() {
  ArrayList<MBeanServer> servers = MBeanServerFactory.findMBeanServer(null);
  if (servers != null && !servers.isEmpty()) {
   return (MBeanServer) servers.get(0);
  } else {
   // If it all fails, get the platform default...
   return ManagementFactory.getPlatformMBeanServer();
  }
 }

 private ObjectName getMBeanObjectName() {
  try {
   String name = "com.writtentest14:type=EventHub,instance=" + hashCode();
   return new ObjectName(name);
  } catch (MalformedObjectNameException e) {
   throw new RuntimeException(e);
  }
 }

}


Step 4: Run the EventHubMain defined in part 1. Then run jconsole from a command prompt.



Step 5: You can now view the attributes and operations defined in the EventHubMBean interface. The event count attribute displays the number of events processed.



Step 6: You can also fire an event by entering an event name and clicking fireEventById().
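Under the hood, jconsole drives the MBean through the same attribute and operation metadata that any JMX client sees. As a minimal, self-contained sketch of those mechanics (the Counter types below are made up for illustration and are not part of the tutorial), you can register a StandardMBean on the platform MBean server and drive it through a typed proxy in-process:

```java
import java.lang.management.ManagementFactory;

import javax.management.JMX;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.StandardMBean;

public class JmxProxyDemo {

    // Management interface: getCount() surfaces as the "Count" attribute,
    // increment() as an operation -- the same shape jconsole displays.
    public interface CounterMBean {
        long getCount();
        void increment();
    }

    // Same pattern as StandardEventHubMBean above: subclass StandardMBean
    // and pass the management interface explicitly.
    public static class Counter extends StandardMBean implements CounterMBean {
        private long count;

        public Counter() {
            super(CounterMBean.class, false);
        }

        @Override
        public long getCount() { return count; }

        @Override
        public void increment() { count++; }
    }

    public static long demo() {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName("com.writtentest14:type=Counter");
            server.registerMBean(new Counter(), name);

            // A typed proxy goes through the same attribute/operation
            // metadata that jconsole uses remotely.
            CounterMBean proxy = JMX.newMBeanProxy(server, name, CounterMBean.class);
            proxy.increment();
            proxy.increment();
            long count = proxy.getCount();

            server.unregisterMBean(name);
            return count;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

jconsole does remotely what JMX.newMBeanProxy does here in-process, which is why the attributes and operations it shows mirror the MBean interface exactly.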





May 30, 2014

How to create a well-designed Java application? SOLID principles and GoF design patterns strive for tight encapsulation, loose (or low) coupling, and high cohesion

A software application is built by coupling various classes, modules, and components. Without coupling, you can't build a software system. But software applications are always subject to changes and enhancements, so you need to build your applications in such a way that they can adapt to growing requirements and remain easy to maintain and understand. The 3 key aspects to achieving this are

1. Tight encapsulation.
2. Loose (or low) coupling.
3. High cohesion.

The purpose of SOLID design principles and GoF design patterns is to achieve the above goal.


What is encapsulation?

Encapsulation is about hiding the implementation details. This means

1) Hiding data by declaring member variables private and providing public methods, with pre- and post-condition checks, to access them.
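For instance, a hypothetical Account class (made up here for illustration) keeps its balance private and guards it with pre-condition checks, so the object can never be put into an invalid state from the outside:

```java
import java.math.BigDecimal;

public class Account {

    // hidden state: only reachable through the methods below
    private BigDecimal balance = BigDecimal.ZERO;

    public BigDecimal getBalance() {
        return balance;
    }

    public void deposit(BigDecimal amount) {
        // pre-condition: deposits must be positive
        if (amount == null || amount.signum() <= 0) {
            throw new IllegalArgumentException("Deposit must be positive");
        }
        balance = balance.add(amount);
    }

    public void withdraw(BigDecimal amount) {
        // pre-conditions: positive amount and sufficient funds
        if (amount == null || amount.signum() <= 0) {
            throw new IllegalArgumentException("Withdrawal must be positive");
        }
        if (balance.compareTo(amount) < 0) {
            throw new IllegalStateException("Insufficient funds");
        }
        balance = balance.subtract(amount);
    }
}
```

Because the field is private, callers cannot bypass the checks, and the invariant "balance is never negative" holds by construction.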




2) Coding to interface and not implementation: Using a given class or module by its interface, and not needing to know any implementation details.

Badly Encapsulated: Directly dependent on ProductServiceImpl




Tightly Encapsulated: Dependent on the interface ProductService. The implementation can change from ProductServiceImpl to InternationalizedProductServiceImpl  without  the callers or clients OrderServiceImpl, SalesEnquiryServiceImpl, and CustomerServiceImpl  needing to know about the implementation details.





SOLID principles like LSP (Liskov Substitution Principle), DIP (Dependency Inversion Principle), and SRP (Single Responsibility Principle) strive to create tight encapsulation.


Q. Does tight encapsulation promote loose coupling?
A. Yes. Poor encapsulation invites tight coupling, because non-encapsulated variables can be accessed directly from another class, making that class dependent on the implementation of the class that contains those variables. It is also possible to have poor encapsulation with loose coupling. So, encapsulation and coupling are different concepts, but poor encapsulation makes it easy to create tight coupling.



What is coupling?

Coupling in software development is defined as the degree to which a module, class, or other construct, is tied directly to others. You can reduce coupling by defining intermediate components (e.g. a factory class or an IoC container like Spring) and interfaces between two pieces of a system. Favoring composition over inheritance can also reduce coupling.

Let's write some code for the above UML diagram to demonstrate loose coupling with the GoF factory design pattern.

Step 1: Define the interfaces for the above diagram

public interface OrderService {
    void process(Product p);
}


import java.math.BigDecimal;

public interface ProductService {
    BigDecimal getDiscount(Product p);
}


Step 2: Define the implementation class for ProductService

import java.math.BigDecimal;

public class ProductServiceImpl implements ProductService {

 @Override
 public BigDecimal getDiscount(Product p) {
  //discount calculation logic goes here
  return null;
 }

}


Step 3: The factory eliminates the need to bind application-specific classes into the code. The code deals only with the ProductService interface, and can therefore work with any user-defined concrete implementation like ProductServiceImpl or InternationalizedProductServiceImpl.

public final class ProductServiceFactory {
   
 private ProductServiceFactory(){}
 
 public static ProductService getProductService() {
  ProductService ps = new ProductServiceImpl(); // change impl here
  return ps;
 }
}


Step 4: The caller or client class, which is loosely coupled to ProductService via the ProductServiceFactory.

import java.math.BigDecimal;

public class OrderServiceImpl implements OrderService {
 
 ProductService ps;
 
 @Override
 public void process(Product p) {
  ps = ProductServiceFactory.getProductService();
  BigDecimal discount = ps.getDiscount(p);
  //logic
 }

}

So, if you want to change over from ProductServiceImpl to InternationalizedProductServiceImpl, all you have to do is change the ProductServiceFactory class. An alternative to a factory class is to use an IoC container like Spring to loosely wire the dependencies via dependency injection at run time.
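As a variation on the factory above, the implementation could even be chosen from configuration, so switching needs no code change at all. This is only a sketch: the "product.service.impl" system property name and the stub implementations below are made up for illustration.

```java
import java.math.BigDecimal;

// Minimal stand-ins so this sketch compiles on its own.
class Product {}

interface ProductService {
    BigDecimal getDiscount(Product p);
}

class ProductServiceImpl implements ProductService {
    public BigDecimal getDiscount(Product p) { return BigDecimal.ZERO; }
}

class InternationalizedProductServiceImpl implements ProductService {
    public BigDecimal getDiscount(Product p) { return BigDecimal.TEN; }
}

public final class ConfigurableProductServiceFactory {

    private ConfigurableProductServiceFactory() {}

    // The implementation is selected by a (hypothetical) system property,
    // so callers stay coupled only to the ProductService interface.
    public static ProductService getProductService() {
        String impl = System.getProperty("product.service.impl", "default");
        if ("i18n".equals(impl)) {
            return new InternationalizedProductServiceImpl();
        }
        return new ProductServiceImpl();
    }
}
```

This is essentially a hand-rolled version of what an IoC container does: the wiring decision lives in configuration, not in the client code.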


Q. How do you create loose coupling?
A. By abstracting many of the implementation needs into various interfaces and applying OCP (Open/Closed Principle) and DIP (Dependency Inversion Principle), you can create a system that has very low coupling.



What is cohesion? Cohesion is the extent to which two or more parts of a system are related and how they work together to create something more valuable than the individual parts. You don't want a single class to perform all the functions (or concerns) of being a domain object, data access object, validator, and service class with business logic. To create a more cohesive system from both the higher- and lower-level perspectives, you need to break the various concerns out into separate classes.

Coupling happens in between classes or modules, whereas cohesion happens within a class.



Q. How do you create high cohesion?
A. You can get higher cohesion with a combination of low coupling and SRP (Single Responsibility Principle from SOLID), which lets you stack a lot of small pieces (e.g. Service, DAO, Validator, etc.) together like puzzle pieces to create something larger and more complex.
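As a sketch of that idea (all class names here are illustrative, not from the earlier example), each small piece owns exactly one concern and a thin service composes them:

```java
import java.util.HashMap;
import java.util.Map;

// Domain concern only.
class Order {
    final String id;
    final double amount;
    Order(String id, double amount) { this.id = id; this.amount = amount; }
}

// Validation concern only.
class OrderValidator {
    boolean isValid(Order o) {
        return o != null && o.id != null && o.amount > 0;
    }
}

// Persistence concern only (an in-memory stand-in for a real DAO).
class OrderDao {
    private final Map<String, Order> store = new HashMap<String, Order>();
    void save(Order o) { store.put(o.id, o); }
    Order find(String id) { return store.get(id); }
}

// Business-logic concern: composes the cohesive pieces like puzzle pieces.
public class OrderProcessingService {

    private final OrderValidator validator = new OrderValidator();
    private final OrderDao dao = new OrderDao();

    public boolean place(Order order) {
        if (!validator.isValid(order)) {
            return false;   // reject invalid orders
        }
        dao.save(order);
        return true;
    }

    public Order lookup(String id) {
        return dao.find(id);
    }
}
```

Each class has one reason to change, and the service depends on the pieces rather than doing everything itself.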



So, think, tight encapsulation, loose (low) coupling, and high cohesion.



May 29, 2014

Event Driven Programming: Java example tutorial style - part 1

Event Driven Architecture, aka EDA, loosely couples event producers and event consumers.

An event can be defined as "a change in state". For example, when an event producer fires an event to notify all its registered listeners that either "securities" or "security prices" have been loaded, the listeners are notified to update their data via a synchronous or asynchronous dispatcher. The event producers and listeners are loosely coupled via the "EventHub" and "Event" classes. An "EventHub" is used to register and unregister listeners.

The "EventHub" can be registered as a JMX MBean to control behavior at runtime via jconsole, such as firing an event, counting the number of events, etc.

This series of tutorials will take you through writing event driven code in Java and registering an MBean so you can interact with it via a JMX-compliant tool like jconsole.



Let's define the interfaces and implementation classes.


Step 1: Define the "Event" class from which all other events can be derived.

package com.writtentest14;

import java.util.Date;

public class Event {

 private String id;
 private Date timeStamp;
 
 public Event(String id) {
  super();
  this.id = id;
  this.timeStamp = new Date();
 }

 public String getId() {
  return id;
 }

 public void setId(String id) {
  this.id = id;
 }

 public Date getTimeStamp() {
  return timeStamp;
 }

 public void setTimeStamp(Date timeStamp) {
  this.timeStamp = timeStamp;
 }

}


Step 2: Define the interface for the listeners

package com.writtentest14;

public interface EventListener<T extends Event> {
 void onEvent(T event);
}


Step 3:  Define the dispatcher interface.

package com.writtentest14;

import java.util.List;


public interface EventDispatcher {

    void dispatch(Event event, List<EventListener<Event>> listeners);    
}


Step 4: The dispatcher implementation. It could be a synchronous or an asynchronous dispatcher. Let's keep it simple by defining a synchronous dispatcher.

package com.writtentest14;

import java.util.List;

public class SimpleSynchronousDispatcher implements EventDispatcher {

    @Override
    public void dispatch(Event event, List<EventListener<Event>> listeners) {
        for (EventListener<Event> listener : listeners) {
            listener.onEvent(event);
        }
    }
}
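For contrast, here is a sketch of what an asynchronous variant might look like. It is not part of the tutorial's code: minimal local copies of the Event and EventListener types are included so the snippet stands alone, and each listener is handed off to a thread pool so a slow listener no longer blocks the producer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal local copies of the tutorial's types so this sketch compiles alone.
class Event {
    private final String id;
    Event(String id) { this.id = id; }
    String getId() { return id; }
}

interface EventListener<T extends Event> {
    void onEvent(T event);
}

public class SimpleAsynchronousDispatcher {

    private final ExecutorService executor = Executors.newFixedThreadPool(4);

    // Hand each listener off to the pool instead of invoking it inline.
    public void dispatch(final Event event, List<EventListener<Event>> listeners) {
        for (final EventListener<Event> listener : listeners) {
            executor.submit(new Runnable() {
                @Override
                public void run() {
                    listener.onEvent(event);
                }
            });
        }
    }

    // Drain the pool -- useful for an orderly shutdown.
    public void shutdown() throws InterruptedException {
        executor.shutdown();
        executor.awaitTermination(5, TimeUnit.SECONDS);
    }

    // Small demo: one event fanned out to three counting listeners.
    public static int demo() {
        final AtomicInteger received = new AtomicInteger();
        List<EventListener<Event>> listeners = new ArrayList<EventListener<Event>>();
        for (int i = 0; i < 3; i++) {
            listeners.add(new EventListener<Event>() {
                @Override
                public void onEvent(Event event) {
                    received.incrementAndGet();
                }
            });
        }
        SimpleAsynchronousDispatcher dispatcher = new SimpleAsynchronousDispatcher();
        dispatcher.dispatch(new Event("PL_EVENT"), listeners);
        try {
            dispatcher.shutdown();
        } catch (InterruptedException e) {
            return -1;
        }
        return received.get();
    }
}
```

The trade-off is that listeners now run concurrently, so they must be thread-safe, and "fire" returns before all listeners have finished.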

Step 5: Define the EventHub, which binds and unbinds listeners and invokes the dispatcher to dispatch events.

package com.writtentest14;

import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicLong;

/**
 * register and unregister event listeners
 */

public class EventHub {

 private static final EventHub INSTANCE = createInstance();
 
 private ConcurrentMap<String, List<EventListener<Event>>> registeredListeners = 
                                   new ConcurrentHashMap<String, List<EventListener<Event>>>();

 private EventDispatcher synchronousDispatcher;

 private AtomicLong eventCount = new AtomicLong();

 public EventHub() {
 }

 public static EventHub instance() {
  return INSTANCE;
 }

 public EventHub(EventDispatcher synchronousDispatcher) {
  this.synchronousDispatcher = synchronousDispatcher;
 }

 public long getEventCount() {
  return this.eventCount.get();
 }

 private long getNextEventNumber() {
  return this.eventCount.incrementAndGet();
 }

 protected EventDispatcher getSynchronousDispatcher() {
  return this.synchronousDispatcher;
 }

 public void setSynchronousDispatcher(EventDispatcher dispatcher) {
  this.synchronousDispatcher = dispatcher;
 }

 public void fire(Event event) {
  dispatch(event, getSynchronousDispatcher());
 }

 public synchronized void addListener(String eventId, EventListener<Event> listener) {
  List<EventListener<Event>> listeners = this.registeredListeners.get(eventId);
  if (listeners != null) {
   listeners.add(listener);
  } else {
   listeners = new CopyOnWriteArrayList<EventListener<Event>>();
   listeners.add(listener);
   this.registeredListeners.put(eventId, listeners);
  }

 }

 public void removeListener(String eventId, EventListener<Event> listener) {
  List<EventListener<Event>> listeners = this.registeredListeners.get(eventId);
  if (listeners != null) {
   listeners.remove(listener);

  }
 }

 protected void dispatch(Event event, EventDispatcher dispatcher) {
  getNextEventNumber();
  List<EventListener<Event>> listeners = getListeners(event);
  if (!listeners.isEmpty()) {

   dispatcher.dispatch(event, listeners);

  }
 }
 
  private static EventHub createInstance() {
         EventHub instance = new EventHub(new SimpleSynchronousDispatcher());
         return instance;
     }

 private List<EventListener<Event>> getListeners(Event event) {
  List<EventListener<Event>> listeners = this.registeredListeners.get(event.getId());
  return (listeners != null) ? listeners : Collections.<EventListener<Event>>emptyList();
 }

}


Step 6: Finally, EventHubMain has the main method to run; it creates 3 listeners as anonymous inner classes and also acts as a producer that fires events. The producer and listeners are decoupled: they don't interact with each other directly, but via the EventHub and Event classes.

package com.writtentest14;

import java.util.concurrent.TimeUnit;

public class EventHubMain {

 private static final String PRICE_LOAD_EVENT = "PL_EVENT";
 private static final String SECURITY_LOAD_EVENT = "SL_EVENT";

 public static void main(String[] args) {

  // Anonymous listener1
  EventHub.instance().addListener(PRICE_LOAD_EVENT, new EventListener<Event>() {

   @Override
   public void onEvent(Event event) {
    System.out.println(PRICE_LOAD_EVENT + " received by listener " + this.getClass());
    try {
     TimeUnit.SECONDS.sleep(10);
    } catch (InterruptedException e) {
     e.printStackTrace();
    }
   }

  });

  // Anonymous listener2
  EventHub.instance().addListener(SECURITY_LOAD_EVENT, new EventListener<Event>() {

   @Override
   public void onEvent(Event event) {
    System.out.println(SECURITY_LOAD_EVENT + " received by listener " + this.getClass());
    try {
     TimeUnit.SECONDS.sleep(10);
    } catch (InterruptedException e) {
     e.printStackTrace();
    }
   }

  });

  // Anonymous listener3
  EventHub.instance().addListener(PRICE_LOAD_EVENT, new EventListener<Event>() {

   @Override
   public void onEvent(Event event) {
    System.out.println(PRICE_LOAD_EVENT + " received by listener " + this.getClass());
    try {
     TimeUnit.SECONDS.sleep(10);
    } catch (InterruptedException e) {
     e.printStackTrace();
    }
   }

  });

  // Event dispatcher
  while (true) {
   System.out.println("Event fired " + PRICE_LOAD_EVENT + ".............");
   EventHub.instance().fire(new Event(PRICE_LOAD_EVENT));

   try {
    TimeUnit.SECONDS.sleep(5);
   } catch (InterruptedException e) {
    e.printStackTrace();
   }

   System.out.println("Event fired " + SECURITY_LOAD_EVENT + ".............");
   EventHub.instance().fire(new Event(SECURITY_LOAD_EVENT));

  }
 }

}


Finally, here is the output if you run the above class, which runs forever in a while loop.

Event fired PL_EVENT.............
PL_EVENT received by listener class com.writtentest14.EventHubMain$1
PL_EVENT received by listener class com.writtentest14.EventHubMain$3
Event fired SL_EVENT.............
SL_EVENT received by listener class com.writtentest14.EventHubMain$2
Event fired PL_EVENT.............
PL_EVENT received by listener class com.writtentest14.EventHubMain$1


In the next post, I will integrate MBean components with MBean server to manage the EventHub via a JMX client like jconsole.


Dec 11, 2013

Scalable Straight Through Processing System (OLTP) vs OLAP in Java

Large mission-critical applications use Straight Through Processing, and these systems need to be highly scalable. So, when you apply for these high-paying jobs, it really pays to prepare for your job interviews with the following questions and answers.

Q. What is Straight Through Processing (STP)?
A. This is the definition from INVESTOPEDIA.

"An initiative used by companies in the financial world to optimize the speed at which transactions are processed. This is performed by allowing information that has been electronically entered to be transferred from one party to another in the settlement process without manually re-entering the same pieces of information repeatedly over the entire sequence of events."

Q. How will you go about designing an STP system?
A. Most conceptual architectures use a hybrid approach, combining different architectures based on the benefits of each approach and its pertinence to your situation. Here is a sample hybrid approach depicting an online trading system, which is an STP system.


The above system is designed for:
  • Placing Buy/Sell trades online in real time. The trades are validated first and then sent all the way to the stock exchange using the FIX (Financial Information eXchange) protocol.
  • Once a trade is matched, the contract notes are asynchronously issued via the SETTLEMENT queue, and processed by an ESB (Enterprise Service Bus) system like webMethods, TIBCO, or WebSphere MQ.

The above system is an operational OLTP (i.e. On-Line Transaction Processing) system, also known as an STP (Straight Through Processing) system. This leads to another question.


Q. What is the difference between OLTP and OLAP?
A. OLTP stands for On-Line Transaction Processing and OLAP stands for On-Line Analytical Processing. OLAP contains a multidimensional or relational data store designed to provide quick access to pre-summarized data & multidimensional analysis. 

 
MOLAP: Multidimensional OLAP – enabling OLAP by providing cubes.
ROLAP: Relational OLAP – enabling OLAP using a relational database management system




OLTP vs OLAP:

OLTP: Creates operational source data from transactional systems as shown in the above diagram. This data is the source of truth for many other systems.
OLAP: Data comes from various OLTP data sources as shown in the above diagram.

OLTP: Transactional and normalized data is used for daily operational business activities.
OLAP: Historical, de-normalized, and aggregated multidimensional data is used for analysis and decision making, also known as BI (i.e. Business Intelligence).

OLTP: Data is inserted via short inserts and updates, normally captured via user actions.
OLAP: Periodic (i.e. scheduled) and long-running (i.e. during off-peak) batch jobs refresh the data, also known as the ETL process, as shown in the below diagram.

OLTP: The database design involves highly normalized tables.
OLAP: The database design involves de-normalized tables for speed, and requires more indexes for the aggregated data.

OLTP: Regular backup of data is required to prevent loss of data, monetary loss, and legal liability.
OLAP: Data can be reloaded from the OLTP systems if required, hence stringent backups are not required.

OLTP: Transactional data older than a certain period can be archived and purged based on the compliance requirements.
OLAP: Contains historical data, and the data volume will be higher due to the requirement to maintain history.

OLTP: The typical users are operational staff.
OLAP: The typical users are management and executives making business decisions.

OLTP: The space requirement is relatively small if the historical data is regularly archived.
OLAP: The space requirement is larger due to the aggregation structures and historical data, plus more space for the indexes.


Oct 3, 2013

Scaling your application -- Vertical vs Horizontal scaling

Many organizations face scalability and performance issues, so it is really worth knowing your way around these topics.

Q. What is the difference between performance and scalability?
A. Performance and scalability are two different things.

For example, if you are in the business of transporting people in horse carriages, performance is about using more powerful horses to get your passengers to their destination more quickly. Scalability is about catering for increased demand for such transportation as your business grows, by either increasing the capacity of individual actors (e.g. carriage capacity) or adding more actors (e.g. more horses and carriages).


Q. What are the different types of scalability?
A. Vertical and Horizontal scaling.

Vertical Scaling: You can increase the capacity of a horse carriage or use more powerful horses to reduce the time it takes to reach the destination. In computing terms, increase CPU, memory, etc. to increase capacity, or tune the code/database to reduce processing time. This increases the capacity of each actor -- horse and/or carriage. You can also vertically scale an application via multi-threading or non-blocking I/O.



Horizontal Scaling: In a horizontal scaling model, instead of increasing the capacity of each individual actor in the system, we simply add more actors to the system. This means more horses and carriages; in terms of computers, adding more nodes and servers.


Q. How will you scale your data store?
A. The scalability of the database is critical because data is often a shared resource, and the database becomes the main contact point for nearly every web request. The most important question to ask when considering the scalability of your database is: what kind of system am I working with? Is it a read-heavy or a write-heavy system?

Scaling Reads: If your website is primarily a read-centric system, scale your data store with a caching strategy that uses a memory cache (e.g. Ehcache) or a CDN (Content Delivery Network). You can also add more CPU/RAM/disk to scale vertically.
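As a toy illustration of the in-memory caching idea (not a substitute for Ehcache or a CDN), a small LRU read cache can be built on LinkedHashMap's access-order mode:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Bounded in-process cache: the least-recently-used entry is evicted once
// the cache exceeds maxEntries, keeping hot reads in memory.
public class LruCache<K, V> extends LinkedHashMap<K, V> {

    private final int maxEntries;

    public LruCache(int maxEntries) {
        super(16, 0.75f, true); // access-order: get() refreshes an entry's recency
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

Note this sketch is not thread-safe; a real read cache would be wrapped in synchronization or replaced by a purpose-built library.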

Scaling Writes: If your website is primarily a write-heavy system, consider a horizontally scalable datastore such as MongoDB (a NoSQL database), Riak, Cassandra, or HBase. MongoDB has features like replication and sharding built in, which lets you scale your database to as many servers as you like by distributing content among them. A database shard ("sharding") is a horizontal partition in a database or search engine. The idea behind sharding is to split data among multiple machines while ensuring that the data is always accessed from the correct place: since sharding spreads the database across multiple machines, the database programmer specifies explicit sharding rules that determine which machine any piece of data is stored on. Sharding is also referred to as horizontal scaling or horizontal partitioning. On the relational side, Oracle uses RAC (Real Application Clusters), where small server blades are genned into an Oracle RAC cluster over a high-speed interconnect.
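A toy illustration of a sharding rule (this is not MongoDB's actual routing, which uses shard keys and config servers): a stable hash of the key decides which shard a record lives on, so the same key always reads from and writes to the same place.

```java
public class ShardRouter {

    private final int shardCount;

    public ShardRouter(int shardCount) {
        this.shardCount = shardCount;
    }

    // Math.floorMod keeps the result non-negative even when hashCode() is negative,
    // so the shard index is always in [0, shardCount).
    public int shardFor(String key) {
        return Math.floorMod(key.hashCode(), shardCount);
    }
}
```

One design note: a plain modulo rule reshuffles most keys when shardCount changes, which is why production systems often use consistent hashing or range-based shard maps instead.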

Q. What is BigData?
A. Big data is the term for a collection of data sets so large and complex that it becomes very difficult to work with using most relational database management systems and desktop statistics and visualization packages, requiring instead "massively parallel software running on tens, hundreds, or even thousands of servers". Apache™ Hadoop® is an open source software project that enables the distributed processing of large data sets across clusters of commodity servers. It is designed to scale up from a single server to thousands of machines, with a very high degree of fault tolerance.

Hadoop uses MapReduce to divide and assign work to nodes in the cluster, and HDFS (Hadoop Distributed File System), a file system that spans all the nodes in a Hadoop cluster, for data storage.


Q. What are the general scaling practices for a medium size system in Java?
A.
  • Using non-blocking I/O and favoring multi-threading.
  • Vertical scaling -- more CPU, RAM, etc.
  • Caching data.
  • Favoring stateless, idempotent methods.
  • Using big JVM heaps.
  • Using JMS -- publish/subscribe model.
  • Using resource pooling -- e.g. database connection pooling, JMS connection factory pooling, thread pooling, etc.


Q. What are the general scaling practices for a large size system in Java?
A.

Use RTSJ (Real Time Specification for Java): standard Java has the following real-time difficulties:

  • During stop-the-world garbage collection all application threads are blocked, and collection pauses can grow very long on large heaps. These latencies effectively limit usable memory, which limits scalability.
  • Garbage collection latencies make Java less useful for applications that rely on heart beats, make real-time trades, etc.
  • Standard Java does not guarantee strict priority-based thread scheduling.


To overcome this, the Java community introduced a specification for real-time Java: JSR 1 (RTSJ -- Real Time Specification for Java).

RTSJ addressed these critical issues by mandating a minimum specification for the threading model (and allowing other models to be plugged into the VM) and by providing for areas of memory that are not subject to garbage collection, along with threads that are not preemptable by the garbage collector. These areas are instead managed using region-based memory management.


  • Use Big Data stores like MongoDB.
  • Use a distributed cache.
  • Use server clusters / JVM clustering (e.g. Terracotta).
  • Use a SEDA (Staged Event-Driven Architecture) based design.
  • Use cloud computing.






Jul 16, 2013

Java Security Interview Questions and Answers: Single Sign-On (i.e. SSO) with Spring 3 Security

Q. Can you provide a high level overview of the "access control security" in a recent application you had worked?
A. As shown below, SiteMinder is configured to intercept the calls and authenticate the user. Once the user is authenticated, an HTTP header "SM_USER" is added with the authenticated user name, for example "123". This header is passed on to Spring 3 Security. The "Security.jar" is a custom component that knows how to retrieve the roles for a given user like 123 from a database or LDAP server. This custom component is responsible for creating a Spring UserDetails object that contains the roles as authorities. Once you have the authorities or roles for a given user, you can restrict your application URLs and functions to provide proper access control.
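A hypothetical sketch of what such a custom component might do (the class name and data below are made up; the real component would query a database or LDAP and build a Spring Security UserDetails object):

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Takes the SM_USER value that SiteMinder injected and resolves the user's
// roles. The in-memory map stands in for the real database or LDAP lookup.
public class SmUserRoleResolver {

    private final Map<String, List<String>> roleStore = new HashMap<String, List<String>>();

    public SmUserRoleResolver() {
        // stand-in data; in production this comes from a database or LDAP
        roleStore.put("123", Arrays.asList("ROLE_USER", "ROLE_ADMIN"));
    }

    // These roles become the "authorities" handed to Spring Security.
    public List<String> rolesFor(String smUser) {
        List<String> roles = roleStore.get(smUser);
        if (roles == null) {
            throw new IllegalStateException("No roles for pre-authenticated user: " + smUser);
        }
        return roles;
    }
}
```

The key point of the design is that authentication has already happened upstream (SiteMinder), so this component only maps an identity to authorities.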







Q. What is SSO (i.e. Single Sign-ON)?
A. Single sign-on (SSO) is a session/user authentication process that permits a user to enter one name and password in order to access multiple applications. The process authenticates the user for all the applications they have been given rights to and eliminates further prompts when they switch applications during a particular session. For example, SiteMinder, Tivoli Access Manager (i.e. TAM), etc. provide SSO. As shown in the diagram above, SiteMinder authenticates the user and adds the SM_USER HTTP header for the application. It removes all the "SM" headers from the incoming request and adds them back after authenticating the user. This prevents any malicious headers being injected via the browser with plugins like Firefox's "Modify Headers".


Q. How will you go about implementing authentication and authorization in a web application?
A. Use an SSO application like SiteMinder or Tivoli Access Manager to authenticate users, and Spring Security 3 for authorization, as described in the following Spring 3 security tutorials. Spring Security's pre-authentication scenario assumes that a valid authenticated user is available via either a Single Sign-On (SSO) application like SiteMinder or Tivoli, or X.509 certificate based authentication. Spring Security in this scenario is used only for authorization. The tutorials linked below demonstrate this with code.


Q. Can you describe your understanding of SSL, key stores, and trust stores?
A. SSL, key stores and trust stores


Q. What tools do you use to test your application for security holes?
A. These tests are known as PEN (i.e. penetration) testing or security vulnerability testing. There are tools like
  • SkipFish (web application security scanner) from Google.
  • Tamper Data, a Firefox add-on for viewing and modifying HTTP headers.

Q. What is a two factor authentication?
A. Two-factor authentication is a security process in which the user provides two means of identification. This includes
  • something you have and something you know. For example, a bank card is something you have and a PIN (i.e. Personal Identification Number) is something you know.
  • two forms of identification like a password and biometric data such as a fingerprint or voice print. Some security procedures now require three-factor authentication, which involves possession of a physical token and a password, used in conjunction with biometric data.

Q. What are the different layers of security?
A.

Application-Layer Security: For example, Spring 3 Security, JAAS (Java Authentication and Authorization Service), etc. JAAS provides a set of APIs for authentication and authorization (aka access control). It is a pluggable and extendable framework for programmatic user authentication and authorization at the JSE level (NOT the JEE level), securing resources at the JVM level (e.g. classes, resources), and it is the core underlying technology for JEE security. Spring Security tackles security at the JEE level (e.g. URLs, controller methods, service methods, etc.).

Transport-Layer Security: Java Secure Sockets Extension (JSSE) provides a framework and an implementation for a Java version of the Secure Sockets Layer (SSL) and Transport Layer Security (TLS) protocols, and includes functionality for data encryption, server authentication, message integrity, and optional client authentication to enable secure Internet communications. TLS 1.0 / SSL 3.0 is the mechanism that provides private, secure, and reliable communication over the internet between the client and the server. It is the most widely used protocol that provides HTTPS for internet communications between clients (web browsers) and web servers.

Message-Layer Security: In message-layer security, security information is contained within the SOAP message and/or SOAP message attachment, which allows security information to travel along with the message or attachment. For example, the credit card number is signed by a sender and encrypted for a particular receiver to decrypt. Java Generic Security Services (Java GSS-API) is a token-based API used to securely exchange messages between communicating applications. The GSS-API offers application programmers uniform access to security services on top of a variety of underlying security mechanisms, including Kerberos. The advantage of this over point-to-point transport layer security is that the security stays with the message over all hops and after the message arrives at its destination. So, it can be used with intermediaries over multiple hops and protocols (e.g. HTTP, JMS, etc.). The major disadvantage is that it is more complex to implement and requires more processing.

Note: Simple Authentication and Security Layer (SASL) is a framework for authentication and data security in Internet protocols. SASL is application-layer security, and it can be used together with TLS to complement the services SASL offers.

Note: The Java security API is complicated and Spring security as demonstrated via the above tutorials might be a better alternative.


Jun 24, 2013

OLAP (i.e. Data Warehousing) Vs. OLTP

Q. What is the difference between OLTP and OLAP?
A. OLTP stands for On-Line Transaction Processing and OLAP stands for On-Line Analytical Processing. OLAP contains a multidimensional or relational data store designed to provide quick access to pre-summarized data & multidimensional analysis.

  • MOLAP: Multidimensional OLAP – enables OLAP by providing cubes.
  • ROLAP: Relational OLAP – enables OLAP using a relational database management system.



OLTP | OLAP
---- | ----
Source data is operational data; this data is the source of truth. | Data comes from various OLTP data sources, as shown in the above diagram.
Transactional and normalized data is used for daily operational business activities. | Historical, de-normalized, and aggregated multidimensional data is used for analysis and decision making (i.e. for business intelligence).
Data is inserted via short inserts and updates, normally captured via user actions in web based applications. | Periodic (i.e. scheduled) and long running (i.e. during off-peak) batch jobs refresh the data; also known as the ETL process, as shown in the diagram.
The database design involves highly normalized tables. | The database design involves de-normalized tables for speed, and requires more indexes for the aggregated data.
Regular backup of data is required to prevent any loss of data, monetary loss, and legal liability. | Data can be reloaded from the OLTP systems if required, hence stringent backup is not required.
Transactional data older than a certain period can be archived and purged based on the compliance requirements. | The volume of data is higher due to the requirement to maintain historical data.
The typical users are operational staff. | The typical users are management and executives making business decisions.
The space requirement is relatively small if the historical data is archived. | The space requirement is larger due to the existence of aggregation structures and historical data; also requires more indexes than OLTP.

There are a number of commercial and open-source OLAP (aka Business Intelligence) tools like:
  • Oracle Enterprise BI Server, Oracle Hyperion System
  • Microsoft BI & OLAP tools
  • IBM Cognos Series 10
  • SAS Enterprise BI Server
  • JasperSoft (open source)

The OLAP tools are well known for their drill-down and slice-and-dice functionality. They also enable users to analyze data very quickly by nesting the information in tabular or graphical formats. They generally provide good performance due to their highly indexed file structures (i.e. cubes) or in-memory technology.


Q. What is an OLAP cube?
A. An OLAP cube will connect to a data source to read and process the raw data to perform aggregations and calculations for its associated measures. Cubes are the core components of OLAP systems. They aggregate facts from every level in a dimension provided in a schema. For example, they could take data about products, units sold and sales value, then add them up by month, by store, by month and store and all other possible combinations. They’re called cubes because the end data structure resembles a cube.
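The "add them up by month, by store" aggregation in the example can be sketched with Java streams (the Sale record and the sample figures below are made up for illustration; real OLAP engines pre-compute and index these roll-ups):

```java
import java.util.List;
import java.util.Map;
import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.summingDouble;

public class CubeSketch {

    // A fact row: one sale, with two dimensions (store, month) and one measure (amount).
    record Sale(String store, String month, double amount) {}

    // Roll the facts up along one dimension (month).
    static Map<String, Double> salesByMonth(List<Sale> sales) {
        return sales.stream().collect(groupingBy(Sale::month, summingDouble(Sale::amount)));
    }

    // Roll up along two dimensions at once (store, then month) -- one "face" of the cube.
    static Map<String, Map<String, Double>> salesByStoreAndMonth(List<Sale> sales) {
        return sales.stream().collect(
                groupingBy(Sale::store, groupingBy(Sale::month, summingDouble(Sale::amount))));
    }

    public static void main(String[] args) {
        List<Sale> facts = List.of(
                new Sale("StoreA", "Jan", 10.0),
                new Sale("StoreA", "Jan", 5.0),
                new Sale("StoreB", "Feb", 7.0));
        System.out.println(salesByMonth(facts));         // Jan=15.0, Feb=7.0 (map order may vary)
        System.out.println(salesByStoreAndMonth(facts));
    }
}
```

A cube materializes all such combinations up front so that any slice (by month, by store, by both) is a fast lookup rather than a scan.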



Jun 12, 2013

Scenario based question -- Designing a report or feed generation in Java



Scenario based and open-ended questions like this can reveal a lot about your ability to design systems.


Q. If you have a requirement to generate a report or a feed file with millions of records pulled from the database, how will you go about designing it, and what questions will you ask?
A. The questions to ask are:

  • How should the report be displayed or delivered? For example, online -- synchronously, where the user expects to see the report on the GUI -- or offline -- asynchronously, where the feed/report is sent via email or another notification mechanism like SFTP after being generated in a separate thread.
  • Should we restrict online reports to the last 12 months of data to minimize the report size and get better performance, and provide reports/feeds for data older than 12 months via offline processing?
  • Should we generate both online and offline reports asynchronously, and for online reports have the browser or GUI client poll for report completion before displaying the results? Alternatively, the report can be emailed or downloaded via the web at a later time.
  • Which report generation framework should we use -- Jasper Reports, Open CSV, XSL-FO with Apache FOP, etc. -- depending on the required output formats?
  • What is the source of truth for the report data -- database, RESTful web service call, XML, etc.?
  • How should exceptional scenarios be handled -- send an error email, use a monitoring system like Tivoli or Nagios to raise production support tickets, etc.?
  • What are the security requirements? Are we sending a feed/report with sensitive data via email? Do we need proper access control to restrict who can generate what for online reports?
  • Should we schedule the offline reports to run during off-peak hours?
  • What about archival and purging of older reports? What is the report retention period for the requirements relating to auditing and compliance? How big are the feed files, and should they be gzipped?

The above scenario can be implemented in a number of different ways.

Firstly, using a simple custom solution.

In this solution, a blocking queue and Java multi-threading (i.e. the Executor framework) can be used to produce a report asynchronously. Alternatively, you can use asynchronous processing with Spring.
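A minimal sketch of the Executor-based approach (class names and the CSV row format are invented for illustration; a real report writer would stream rows to a file rather than build one String):

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;

public class AsyncReportGenerator {

    // A small pool so report generation never blocks the request thread.
    private final ExecutorService pool = Executors.newFixedThreadPool(2);

    // Submit report generation and return a Future the GUI can poll for completion.
    public Future<String> generate(List<String> records) {
        return pool.submit(() -> records.stream()
                .map(r -> "ROW," + r)                 // hypothetical CSV row format
                .collect(Collectors.joining("\n")));
    }

    public void shutdown() { pool.shutdown(); }

    public static void main(String[] args) throws Exception {
        AsyncReportGenerator gen = new AsyncReportGenerator();
        Future<String> report = gen.generate(List.of("a", "b"));
        System.out.println(report.get());             // blocks (or poll with isDone())
        gen.shutdown();
    }
}
```

The GUI client can poll Future.isDone() (or an equivalent status endpoint) and fetch the result when ready, matching the online/offline split discussed above.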


Secondly, an Enterprise Integration Framework like Apache Camel can be used to create an asynchronous route. A high-level diagram of a possible solution using Apache Camel is shown below. This framework was written to address the Enterprise Integration Patterns (i.e. EIP).




Apache Camel is awesome if you want to integrate several applications that use different protocols and technologies. The Spring Integration framework is another alternative. There are a number of tutorials on Apache Camel in this blog to get you started, as it is a very handy skill to have for solving business problems and convincing your potential employers.


Finally, an Enterprise Service Bus (ESB) like webMethods, TIBCO, Oracle Service Bus, or Mule can be used. Mule is an open source ESB. There are pros and cons to each approach.


Scenario based questions are very popular with good interviewers, and it really pays to brush up on them. There are a number of different scenario based questions and answers.


Apr 4, 2012

How to become a software architect?

In industry specific forums, I often see questions like “what certification do I need to become an architect?”. The simple answer is that you don't need a certification to become an architect. It may help, but there is a lot more to becoming an architect.

"If you need to be successful in anything, you need to emulate those who are already successful"

You can't become an architect just by studying. The best way to become an architect is to start thinking and acting like one. If you start thinking like one, you will start to observe architects and learn from them. The same is true for other roles like manager, lead, etc. These roles require technical skills complemented with good soft skills, attitude, and work ethic.

Any self-help book will tell you -- what you think and do is what you become. This is why many interviewers ask open-ended questions like who are your role models? what books did you read recently? how do you keep your knowledge up to date? tell me about yourself? what are your short term and long term goals?, etc. These questions can reveal a lot about your passion, enthusiasm, attitude, communication skills, and technical strength.


Here is my take on the road map to becoming a software (e.g. JEE) architect.

  • Learn to ask the right questions -- what if ...? how about ...? -- and to weigh design alternatives, pros vs. cons, tactical vs. strategic choices, risks against benefits, build vs. buy, etc. Ask questions pertaining to the 16 key areas. Think in terms of scalability, transactional boundaries, best practices, exception handling, development process improvement, etc.
  • Get a good handle on the 16 key areas and proactively apply these key areas.

  1. Language Fundamentals (LF)
  2. Specification Fundamentals (SF)
  3. Platform Fundamentals (PF)
  4. Design Considerations  (DC)
  5. Design Patterns (DP)
  6. Concurrency Management (CM)
  7. Performance Considerations  (PC)
  8. Memory/Resource Considerations  (MC)
  9. Transaction Management  (TM)
  10. Security (SE)
  11. Scalability  (SC)
  12. Best Practices (BP)
  13. Coding (CO)
  14. Exception Handling (EH)
  15. Software Development Processes (SDP)
  16. Quality of Service  (QoS)


  • Look at things from both business and technical perspectives: architects form a bridge between many cross-functional teams like business analysts, stakeholders, project managers, developers, testers, infrastructure teams, and operational and support staff. Know your target audience, and learn to convey technology to the business non-technically and the business requirements to the techies technically.
  • Learn to look at the big picture while also paying attention to details where required.
  • Get well-rounded, hands-on experience: for example, client side, server side, application integration, the full SDLC, etc. Nothing beats experience, and you can proactively fast-track your career by learning from others' experience via good books, blogs, industry specific web sites, and helping others on the forums.
  • You should also have good domain knowledge.
  • You don't have to be a "jack of all trades", but as a technical leader and a bridge between various stakeholders and the development teams, you need good soft skills to make things happen by engaging the right teams and expertise. The key soft skills are communication, interpersonal, leadership, analytical, negotiation, and problem-solving skills. Soft skills and domain knowledge are the most important items in this list.

So, the combination of all the above can transform you into an architect. Stay visible at your current work and behave like an architect to get a foot in the door. Ask the right questions and contribute in team meetings and crisis sessions to prove your capabilities.



Oct 14, 2011

Java Interview Questions and Answers on Software Architecture

Good caliber candidates have the ability to look at the big picture and drill down into details. The line between software development and software architecture is a tricky one. Regardless of whether you are an architect, a developer, or both, you need to have a good understanding of the overall software architecture. The following Java interview questions are very popular with interviewers, and they can significantly influence the Hire/No Hire decision. So, it really pays to have a good overview of the various possible architectures. The questions shown below also make a good platform for further questions, depending on your answers.


Be prepared for a white board session on architectures, especially a bird's eye view of the last application you worked on. There will be lots of follow-up questions like why was a particular approach used?, what are the benefits and drawbacks of a particular approach?, etc.

Q. Can you draw me a 1000 foot view of the architecture of the system you were/are involved in, in your current/last position?
Q. Can you describe the architecture of a medium-to-large scale system that you actually designed or implemented?
Q. Can you white board the components of the system you recently worked on?
Q. How would you go about designing a JEE shopping cart application?
Q. Can you discuss some of the high level architectures you are experienced with?


A. There are a number of high level conceptual architectures as discussed below. These individual architectures can be mixed and matched to produce hybrid architectures.

Model-View-Controller  Architecture

Most web and stand-alone GUI applications follow this pattern. For example, Struts and Spring MVC frameworks and Swing GUI.



The model represents the core business logic and state. The view renders the content of the model state by adding display logic. The controller translates the interaction with the view into action to be performed by the model. The actions performed by a model include executing the business logic  and changing the state of the model. Based on the user interactions, the controller selects an appropriate view to render. The controller decouples the model from the view.
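The three roles above can be sketched in a few lines of plain Java (a minimal sketch with invented class names; in Swing or Spring MVC the framework supplies much of this wiring):

```java
import java.util.ArrayList;
import java.util.List;

public class MvcSketch {

    // Model: holds state and core logic, and notifies listeners when the state changes.
    static class CounterModel {
        private int count;
        private final List<Runnable> listeners = new ArrayList<>();
        void addListener(Runnable l) { listeners.add(l); }
        void increment() { count++; listeners.forEach(Runnable::run); }
        int getCount() { return count; }
    }

    // View: renders the model's state; contains display logic only.
    static class CounterView {
        String render(CounterModel m) { return "Count: " + m.getCount(); }
    }

    // Controller: translates a user interaction into an action on the model.
    static class CounterController {
        private final CounterModel model;
        CounterController(CounterModel model) { this.model = model; }
        void onIncrementClicked() { model.increment(); }
    }

    public static void main(String[] args) {
        CounterModel model = new CounterModel();
        CounterView view = new CounterView();
        model.addListener(() -> System.out.println(view.render(model)));
        new CounterController(model).onIncrementClicked(); // prints "Count: 1"
    }
}
```

Note that the view never mutates the model and the model knows nothing about the view: the controller and the listener callback are what keep them decoupled.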

Service Oriented Architecture (SOA)

The business logic and application state are exposed as reusable services. An Enterprise Service Bus (ESB) is used as an orchestration and mediation layer to decouple the applications from the services. 




The above architecture has 5 tiers. The application tier could use a typical MVC architecture. The service orchestration tier could use ESB products like Oracle Service Bus, TIBCO, etc. and BPM products like Lombardi BPM, Pega BPM, etc. In the above diagram, the ESB integrates with the BPM via messaging queues. The service tier consists of individual services that can be accessed through SOAP or RESTful web services. An SOA implementation requires change agents to drive adoption of the new approaches. The BPM, application integration, and real-time information all contribute to dynamically changing how business users do their jobs, so SOA needs full support from the business, may require restructuring, and can take some time before its benefits are realized. Cloud computing is at the leading edge of its hype, and as a concept it complements SOA as an architectural style. Cloud computing is expected to provide computing capability that can scale up (to massive proportions) or scale down dynamically based on demand. This implies a very large pool of computing resources, either within the enterprise intranet or on the Internet (i.e. on the cloud).





User Interface (UI) Component Architecture

This architecture is driven by a user interface that is made up of a number of discrete components. Each component calls a service that encapsulates business logic and hides lower level details. Components can be combined to form new composite components allowing richer functionality. These components can also be shared across a number of applications. For example, JavaScript widgets, Java Server Faces (JSF) components, etc.




RESTful data composition Architecture



The user interface can be built by calling a number of underlying services that are each responsible for building part of a page. The user interface translates and combines the data from different formats like XML (translated to HTML using XSLT), JSON (JavaScript Object Notation), ATOM (feeds for mail messages and calendar applications), RSS (for generating RSS feeds), etc.


HTML composition Architecture



In this architecture, multiple applications output fragments of HTML that are combined to generate the final user interface. For example, Java portlets run inside a portal application server to aggregate individual content.


Plug-in Architecture


In this architecture, a core application defines an interface, and the functionality is implemented as a set of plug-ins that conform to that interface. For example, the Eclipse RCP framework, the Maven build tool, etc. use this architecture.
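A minimal sketch of the pattern (the names are invented for illustration; Eclipse and Maven use richer mechanisms, and plain Java offers java.util.ServiceLoader for discovering plug-in implementations on the classpath):

```java
import java.util.ArrayList;
import java.util.List;

public class PluginHost {

    // The core application defines the extension point that all plug-ins implement.
    interface Plugin {
        String name();
        String execute(String input);
    }

    private final List<Plugin> plugins = new ArrayList<>();

    // Explicit registration here; in real systems, discovery is typically
    // automatic (e.g. ServiceLoader scanning META-INF/services entries).
    public void register(Plugin p) { plugins.add(p); }

    // The host drives the plug-ins without knowing their concrete types.
    public List<String> runAll(String input) {
        List<String> out = new ArrayList<>();
        for (Plugin p : plugins) out.add(p.name() + ":" + p.execute(input));
        return out;
    }

    public static void main(String[] args) {
        PluginHost host = new PluginHost();
        host.register(new Plugin() {
            public String name() { return "upper"; }
            public String execute(String in) { return in.toUpperCase(); }
        });
        System.out.println(host.runAll("hello")); // [upper:HELLO]
    }
}
```

New behavior can be shipped as additional Plugin implementations without modifying the core application, which is what makes this style attractive for tools like IDEs and build systems.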



Event Driven Architecture (EDA)




The EDA pattern decouples the interactions between the event publishers and the event consumers. Many-to-many communication is achieved via a topic, where one specific event can be consumed by many subscribers. The EDA also supports asynchronous operations and acknowledgments through event messaging. This architecture requires effective monitoring in place to track queue depth, exceptions, and other possible problems. The traceability, isolation, and debugging of an event can be difficult in some cases. This architecture is useful in scenarios where the business process is inherently asynchronous, multiple consumers are interested in an event (e.g. order status has changed to partially-filled), no immediate acknowledgment is required (e.g. an email is sent with the booking details and itinerary), and real-time request/response is not required (e.g. a long running report can be generated asynchronously and made available later online or via email).
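The decoupling described above can be sketched as a tiny in-memory topic (illustrative only; a production system would use a JMS broker or similar, plus the monitoring noted above):

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

public class TopicBus {

    private final Map<String, List<Consumer<String>>> subscribers = new ConcurrentHashMap<>();

    // Many consumers may register interest in the same topic (many-to-many).
    public void subscribe(String topic, Consumer<String> handler) {
        subscribers.computeIfAbsent(topic, t -> new CopyOnWriteArrayList<>()).add(handler);
    }

    // The publisher knows only the topic name, never who is listening.
    public void publish(String topic, String event) {
        subscribers.getOrDefault(topic, List.of()).forEach(h -> h.accept(event));
    }

    public static void main(String[] args) {
        TopicBus bus = new TopicBus();
        bus.subscribe("order.status", e -> System.out.println("Warehouse saw: " + e));
        bus.subscribe("order.status", e -> System.out.println("Email service saw: " + e));
        bus.publish("order.status", "partially-filled"); // both subscribers receive it
    }
}
```

Both consumers react to the same "partially-filled" event without the publisher changing, which is exactly the property that makes adding new consumers cheap in an EDA.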



Most conceptual architectures use a hybrid approach using a combination of different architectures based on the benefits of each approach and its pertinence to your situation. Here is a sample hybrid approach depicting an online trading system.


FIX is the Financial Information eXchange protocol. You can also notice a number of synchronous calls using XML/HTTP or SOAP/HTTP and asynchronous calls using JMS. The above diagram also depicts that an enterprise architecture can be complex, with a number of moving parts, so it is imperative that all these moving parts are properly monitored and tested for potential performance issues. Most of these services will run as a cluster or a load balanced service with either an active/active or active/passive configuration for high availability and scalability.


