In this article we will explore Server Group membership/Group communication/Service Registry and Discovery capabilities which are required to build Application-Level Server-cluster-aware applications.
Group membership and Group communication
In distributed systems, application must be able to establish a dynamic process/server
group and track the status of all of the servers/processes in that group. Server group members should be able to communicate each other in the group. They should be able to exchange of data and commands across the server group. Fig. A shows a example sample server group called “Server Cluster” with three members.
Leader process/server And Task coordination
In distributed computing, leader election is the process of designating a single process as the organizer of some task distributed among several processes. This single process/server can be called as leader process. After a leader election algorithm has been run, each server throughout the server group recognizes a particular, unique server as the task leader. This election process should be dynamic so that; if a leader server crashes, a new leader can be elected to continue processing application tasks.
This leader process can be used for controlling a task (or) distributed tasks.
Fig B. shows a sample cluster setup, in which component 2 selected as leader process and controls task distribution to other servers.
Distributed Service Registry and Discovery
In SOA/distributed systems, services need to find each other. i.e. a web service might need to find a caching service, etc.. Clients may have to locate service which may be running at multiple servers. Service Registry and Discovery is the mechanism by which severs register their services and clients find the required services.
Service Registry provides a mechanism for Services to register their availability
Service Discovery system provides a mechanism for locating a single instance of a particular service. It also notifies when the instances of a service change (service deletion/addition/update).
Fig. C,D shows Service Discovery and Registry implementation using Apache ZooKeeper. In this all services register their availability with ZooKeeper and Clients (Web Server, API Server) can locate services using ZooKeeper.
Some of the open-source java based tools which can be used to implement above capabilities are
In this article we will explore the distributed caching capability which is required to build Application-Level Server-cluster-aware applications.
Our sample clustered application above [Fig. A] likely to encounter major scalability issues as we try to scale and put more load on application. Scalability bottlenecks mostly occurs in database which are used as data store. Normally databases do not scale very well.
Caching is a well-known concept used software worlds to eliminate the database scalability
bottlenecks. Traditionally, caching was a stand-alone mechanism, but that has been evolved
to distributed caching for scalabilty and availability reasons. Distributed caching is a form of caching that allows the cache to span multiple servers so that it can grow in size. Distributed caching comes in various flavors like Replicated Caching , Cache Grid etc.. [Fig. B, C]
More details about Distributed Caching can be found at
Fig. D shows our sample clustered application with distributed cache. This caching can be used for variety of application specific caches like data, state, configuration caches etc.. These caches are available to all the servers available in the application cluster. A good distributed caching solution will help us to build a Application-Level Server-cluster-aware application.
Some of the available caching tools are
Ehcache is one of the leading java based open-source cache solution.
Ehcache supports basic replication caching.
Terracotta provides leading in-memory data management and Big Data solutions for the enterprise, including BigMemory, Universal Messaging, and more.Terracotta offers commercial distributed cache solutions under the brand name of BigMemoryGO and BigMemoryMax.
Infinipan/JBOSS Data Grid:
Infinispan is an extremely scalable, highly available key/value data store
and data grid platform. It is 100% open source, and written in Java.
It is often used as a distributed cache, but also as a NoSQL key/value store or object database. JBOSS Data Grid is a licenced version with suppoert from Redhat.
Hazelcast is an in-memory Open Source data grid based on Java.
Many NoSQL database technologies have excellent integrated caching capabilities,
keeping frequently-used data in system memory as much as possible and removing the need for a separate caching layer that must be maintained.
Next article we will explore the remaining capabilities/supports required to build Application-Level Server-cluster-aware applications.
This is the third article in the series of articles exploring distributed java
application development. In this article we will explore the capabilities/support
required to build Aplication-Level Server-cluster-aware applications.
We will explore these capabilities with the help of a simple example. In this
example we will take sample usecase of converting standalone application to
Fig A. shows a sample stand-alone application. This sample application process
the real-time data/events/messages supplied by a external system.
Now for scalability and availability reasons we want to convert this application
into clustered application (Fig B).
While converting stand-alone application to clustered application, we must
design applications so that they can run as multiple instances on different/same
physical servers. This can be done by splitting the application into smaller
components that can be deployed independently.
Stateless Application Components:
This conversion is easy, if the application components are stateless. Stateless
application components are independent of others and can run independently.
These stateless application components can handle/process the requests/data
equally. We just need a load-balancing mechaninsm to redirect the requests/data
to avilable application servers. If we need to handle more data, then we need
to add more servers and install the application components.
Stateful Application Components:
But most of the practical clustered applications components are stateful.
For instance, the application components may share a common data/configuaration
based on which processing happens. The application components may share a common
cache for fatser processing.
The following the capabilities/supports required to build stateful application
* Distributed State Sharing / Caching
* Distributed Service Registry and Discovery
* Group communication and membership with state maintenance and querying capability:
* Dynamic leader server election and Task co-ordination
* Distributed Locks/Synchronization
* Distributed Data Structures , ID Generator
* Distributed Execution
* Distributed Messaging System
* Distributed Scheduling
Next article we will explore the above capabilities/supports required to build Application-Level Server-cluster-aware applications.
This is the second article in the series of articles exploring distributed java
application development. In this article we will continue to discuss about
As discuses in the previous article we need a distributed/clustered system for
the following non-functionality reasons.
Application Server Clustering vs Application(-Level) Clustering
Java 2, Enterprise Edition (J2EE) uses application-server clustering to deliver
mission-critical applications over the web. Within the J2EE framework, clusters
provide mission-critical services to ensure minimal downtime and maximum scalability.
A application server cluster is a group of application servers that transparently
run your J2EE application as if it were a single entity. The clustering support
is available for services/requests like JNDI, EJB, JSP, HttpSession replication
and component fail-over, load-balancing etc..
Most of the Java enterprise servers (JBoss, Resin, WebLogic etc) have built-in support for clustering.
More details about Aplication-Server Clustering available at
Application-Server Clustering alone is not sufficient to build full fledged distributed
application. To build a full fledged distributed application, application should be
aware of other available servers in the cluster. This Application-Level Cluster awareness is required to handle various custom use cases like state sharing , group communication and task co-ordination and distribution, etc..
Next article we will explore capabilities/support required to build Application-Level Server-cluster-aware applications.
This is the first article in the series of articles exploring distributed java application development.
In this article we will discuss some basic concepts.
Software requirements are divided in to two types, Functional Requirements and
Functional Requirements (FRs) defines the specific behavior (or) functionality
of the system. They tell exact usage of the system being designed. We will capture these
requirements in Software Requirement Specification (SRS). We define them as
“System shall do *****” etc.. The plan for implementing FRs is detailed in
the “System Design/Design Document”.
Non-Functional Requirements (NFRs) is a requirement that specifies criteria
that can be used to judge the operation of the system. These are also called
“Qualities” of the system. We define them as “System shall be 99.9% Available”
etc.. The plan for implementing non-functional requirements is detailed in the
The important NFRs are
4. Fault Tolerant/Fail-Over
Scalability : Ability to handle additional load by adding more computational resources (CPU , RAM, Disk, Network etc..). Scalability can be Vertical scalability and Horizontal scalability.
Vertical scalability (Scaling Up): is handling additional load by adding more power to a single machine. i.e By adding a faster CPU, add more RAM or using a faster Solid-State Disk (SSD).
Horizontal scalability (Scaling Out): is handling additional load by adding more servers.
Horizontal scalability is much harder to achieve as adding servers requires
ensuring data consistency and process synchronization.
The distribution of processing load among the group of servers is known as server load balancing.
A server failure can be because of many reasons; system failures, planned outage, hardware or network problems etc.
High availability: is redundancy in the system. If one server fails, the others take
over the failed server’s load transparently. The failure of an individual server is invisible to the client. High availability can ensured by not having any single points of failure. Our traditional single-server solution can be good for scalability (add more memory and CPU), but not for high availability as it has single point of failure.
Fault tolerant service: always guarantees strictly correct behavior despite a certain
number of faults. Failover is another key technology behind clustering to achieve fault
tolerance. By choosing another node in the cluster, the process will continue when the
original node fails.
We traditionally start with single server architecture for our applications.
But sooner or later application will have to process more requests/data than a
single server can handle. Scaling up (Vertical Scalability) (Fig A) can be used.
Scaling up can be a short-term solution. And it’s a limited approach because the cost of
upgrading is disproportionately high relative to the gains in server capability.
For these reasons most successful Internet companies/enterprise applications
follow a scale out (Horizontal Scalability) (Fig C) approach.
We need to horizontally scale our application to multiple servers based on
low-cost hardware and operating systems (cluster of servers).
A cluster is a group of application servers that transparently run your application as if it were a single entity. In order to implement server clustering we need distributed applications/technologies.
There are different types clusterings at different tiers like DB Clustering, Hardware Clustering, Application Server (AS) Clustering and Application-level Clustering.
Next article we will discuss more about Application-level Clustering