Introduction to Network Fault Management

Introduction

Fault management (FM) is usually mentioned as the first concern in network management. Its main role is to ensure high availability of a network. Hence, it involves procedures to automatically detect faults, notify operators of their occurrence, and isolate the root cause of each fault (root cause analysis, RCA).

The diagram below depicts network operations with and without integrated FM. With an automated FM system, we can integrate and monitor multiple technologies from multiple vendors with limited human resources.

FMIntro

Fault Management

FM is the process of locating problems or faults on the network. 

It involves the following steps:

  1.  Discover/Detect the faults
  2.  Isolate the faults
  3.  Fix/Notify/Report the faults

FMSteps

The diagram below shows FM functionality as part of the FCAPS model.

FMFunctions

The main functions of FM are:

  • Event/Alarm Discovery
  • Event/Alarm Filtering
  • Event/Alarm Correlation (RCA/SIA)
  • Alarm Forwarding/Notification
  • Alarm Reporting/Analysis
  • Third Party Integration

Concepts

The functions of FM can be broadly divided into three parts:

  • Event Collection
  • Event/Alarm Processing
  • Generating Info/Reports

FMProcess

Event Collection: Connecting to the various network elements and collecting events/alarms from them, suppressing unnecessary events/alarms, and managing the retention of events/alarms.

Event/Alarm Processing: Event/alarm filtering, event/alarm thresholding, enrichment, event/alarm correlation, event/alarm forwarding, and Root Cause Analysis (RCA)/Service Impact Analysis (SIA).

Generating Info/Reports: Event/alarm reporting, event/alarm analysis, integration with other OSS systems to generate further information, and information forwarding.

Event/Alarm Management

What is a Fault?

A fault is a software or hardware defect in a system that disrupts communication or degrades performance.

What is an Event?

An event is a distinct incident that occurs at a specific point in time. Any happening that has an impact on network performance can be called an event. It can be informational in nature, a cleared event, a warning message, a trouble sign, or even a critical fault.

All the faults in the system/network are notified as events. Events are the source of information for all management activity that takes place within the FM system.

Typically, an event is associated with the managed object (e.g., ME, PTP, router, switch) in which it occurs, with a specific event type at a specific layer rate, and so on. This combination can be called an alarmKey; i.e., all the events associated with the same fault have the same alarmKey.

Events also have an associated severity. The common severities are Critical, Major, Minor, Warning, and Clear.

 BasicEvent

Examples of faults/events include:

  • Port status change
  • Connectivity loss/Fiber Cut
  • Device reset/Equipment failures
  • Device becoming unreachable by the EMS

What is an Alarm?

The life cycle of a fault scenario is called an alarm. An alarm is characterized by a sequence of related events (having the same alarmKey), such as port-down and port-up. The last event in the sequence determines the severity and state of the alarm. An alarm that ends with an event whose severity is Clear is called a cleared alarm.

One ManagedObject can have many different alarms with different alarmKeys.

Example:

A port-down event with Critical severity results in a Critical alarm. A subsequent port-up event arrives with Clear severity and moves the alarm from Critical to Clear.
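The alarm life cycle described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class name, the `alarmKey` format (managed object, event type, and layer rate joined with `|`), and the example values are hypothetical, not taken from any particular FM product.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class AlarmTableSketch {
    enum Severity { CRITICAL, MAJOR, MINOR, WARNING, CLEAR }

    // Current severity of each alarm, indexed by alarmKey.
    private final Map<String, Severity> alarms = new LinkedHashMap<>();

    // Hypothetical key format: managed object, event type and layer rate joined with '|'.
    static String alarmKey(String managedObject, String eventType, String layerRate) {
        return managedObject + "|" + eventType + "|" + layerRate;
    }

    // The last event received for an alarmKey determines the alarm's severity.
    void onEvent(String alarmKey, Severity severity) {
        alarms.put(alarmKey, severity);
    }

    Severity severityOf(String alarmKey) {
        return alarms.get(alarmKey);
    }

    boolean isCleared(String alarmKey) {
        return alarms.get(alarmKey) == Severity.CLEAR;
    }

    public static void main(String[] args) {
        AlarmTableSketch table = new AlarmTableSketch();
        String key = alarmKey("Router1/Port3", "LinkDown", "Ethernet");
        table.onEvent(key, Severity.CRITICAL); // port-down event raises the alarm
        System.out.println(table.severityOf(key));
        table.onEvent(key, Severity.CLEAR);    // port-up event clears the same alarm
        System.out.println(table.isCleared(key));
    }
}
```

Because both events carry the same alarmKey, they update one alarm instead of creating two.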

BasicAlarm

Flapping Events

Flapping is a flood of event notifications with toggling severity that relate to the same alarm (same alarmKey). Flapping can occur when a fault is unstable and causes repeated event notifications. It can indicate configuration problems or real network problems.

A flapping example is illustrated in the diagram below:

BasicEventFlap

A sequence of events is identified as flapping if:

  • All events share the same alarmKey
  • The time interval between consecutive events is less than a configured value.
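The two conditions above could be checked roughly as follows. This is a minimal sketch, not a definitive implementation: the class name is hypothetical, and the `minEvents` parameter (how many closely spaced events count as a "flood") is an added assumption that the text does not specify.

```java
import java.util.HashMap;
import java.util.Map;

class FlapDetectorSketch {
    private final long maxGapMillis; // configured interval between consecutive events
    private final int minEvents;     // assumption: run length that counts as flapping
    private final Map<String, Long> lastSeen = new HashMap<>();
    private final Map<String, Integer> runLength = new HashMap<>();

    FlapDetectorSketch(long maxGapMillis, int minEvents) {
        this.maxGapMillis = maxGapMillis;
        this.minEvents = minEvents;
    }

    // Returns true when this event extends a run of closely spaced events
    // for the same alarmKey to at least minEvents.
    boolean onEvent(String alarmKey, long timestampMillis) {
        Long previous = lastSeen.put(alarmKey, timestampMillis);
        boolean closeToPrevious =
                previous != null && timestampMillis - previous < maxGapMillis;
        int run = closeToPrevious ? runLength.getOrDefault(alarmKey, 1) + 1 : 1;
        runLength.put(alarmKey, run);
        return run >= minEvents;
    }

    public static void main(String[] args) {
        FlapDetectorSketch detector = new FlapDetectorSketch(1000, 3);
        long[] timestamps = {0, 200, 400, 5000};
        for (long ts : timestamps) {
            System.out.println(ts + " ms flapping=" + detector.onEvent("Router1|LinkDown", ts));
        }
    }
}
```

A gap longer than the configured interval resets the run, so an alarm that stabilizes stops being reported as flapping.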

Event Discovery/Identification

Normally, the management systems (EMS/NMS) notify interested parties of events through SNMP/CORBA/TCP mechanisms. Events can also be generated by external systems, for example threshold events.

 800px-Layerednms

The event processor listens for and parses event notification messages to get more information about each event, and maintains the event information for further processing.

Some of the event properties are:

  • Event Source – Associated ManagedObject name
  • Event Functionality Type – Alarm (Fault Event), TCA (Performance Event)
  • Event Type – ITU-T X.733 Alarm Type (e.g., Communication Alarm, QoS Alarm)
  • Event Description – Indicates the event message
  • Event Severity – Severity of the event

Event enrichment

Event enrichment is the process of populating additional information about a generated event. This process may need to contact third-party systems to get the information. The enriched information can be useful during fault resolution.

Event Correlation and Alarms

Event correlation is the process of establishing relationships between network events.

Main functionality:

  1.  Filter out redundant and spurious events.
  2.  Identify the root cause of faults in a network (RCA).

 Event Filtering

One important aspect of FM is filtering and prioritizing incoming events to identify the serious ones. Based on the event information, the FM can determine whether an event continues to be processed or is dropped. All unwanted/duplicate events can be dropped at the event collection or event processing stage.

FM_EventFiltering

Example:

When an NE on the network is faulty, the management system (EMS/NMS) reports the network events to the FM. Each fault may trigger multiple events/alarms. Events triggered by the same fault are associated with each other. The alarm correlation function can analyze these events and generate a single alarm for multiple events.

RCA/SIA

A failure situation on the network usually generates multiple events, because a failure condition on one device may render other devices inaccessible. The events generated indicate that all of the devices are inaccessible.

Network operators use root cause analysis (RCA) to investigate the root cause of events. They can determine which events are root causes and which are results of a root cause (symptom events), which enables them to quickly focus on the events that are causing network problems.

Normally, the RCA process uses knowledge of the network topology to establish a point of failure and identify symptom events.
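As a rough illustration of topology-based RCA, the sketch below assumes a simple tree topology in which each device is reached through exactly one upstream device. A device reported as unreachable is treated as a root cause only if its upstream device is not also unreachable; everything else is a symptom. The class name, the map layout, and the device names are hypothetical, and real RCA engines use far richer models.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class TopologyRcaSketch {
    // upstreamOf maps a device to the device it is reached through (assumed tree topology).
    static Set<String> rootCauses(Map<String, String> upstreamOf, Set<String> unreachable) {
        Set<String> roots = new TreeSet<>();
        for (String device : unreachable) {
            String upstream = upstreamOf.get(device);
            // If the upstream device is also unreachable, this event is only a symptom.
            if (upstream == null || !unreachable.contains(upstream)) {
                roots.add(device);
            }
        }
        return roots;
    }

    public static void main(String[] args) {
        Map<String, String> upstreamOf = new HashMap<>();
        upstreamOf.put("switch1", "router1");
        upstreamOf.put("switch2", "router1");
        upstreamOf.put("host1", "switch1");

        Set<String> unreachable =
                new HashSet<>(Arrays.asList("router1", "switch1", "switch2", "host1"));
        System.out.println("Root causes: " + rootCauses(upstreamOf, unreachable));
    }
}
```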

RCA algorithms can be rule based, predictive or model-based.

If a device fails, the immediate questions that need to be answered are “what business service did it impact?” and “what is the cost to my business?” This kind of analysis is called Service Impact Analysis (SIA). SIA uses RCA information to find the impacted services/customers.

Third-Party Integration

The main aim of a network operator is to shorten the fault resolution time. Basic fault detection and reporting features may not be sufficient for efficient fault monitoring, so FM systems should integrate with third-party systems such as trouble ticketing systems and performance management systems. These third-party integrations help operators speed up the fault resolution process.

Posted in EMS/NMS/OSS

The Art of Java Application Performance Analysis and Tuning – Part 7

This is the last article in the series of articles exploring the tools and techniques for analyzing, monitoring, and improving the performance of Java applications.

The Art of Java Application Performance Analysis and Tuning-Part 1
The Art of Java Application Performance Analysis and Tuning-Part 2
The Art of Java Application Performance Analysis and Tuning-Part 3
The Art of Java Application Performance Analysis and Tuning-Part 4
The Art of Java Application Performance Analysis and Tuning-Part 5
The Art of Java Application Performance Analysis and Tuning-Part 6

In this post we are going to discuss performance analysis during development/QA and common Java performance problems.

Performance Analysis During Development/QA

Many of us developers/QA engineers only test functionality at low load and miss the performance issues that appear at higher load. Under higher load, the application may experience performance issues like slowness, OutOfMemory errors, and memory/thread leaks.

Development

Developers should be aware of all the tools discussed in this series. Thread dump analysis, JConsole, and jmap are a must. Running profilers like YourKit or IBM Health Center gives more insight into your program.

All the existing major features and new features must be profiled to identify the performance problems early in the development.

QA

QA members should be aware of tools like JConsole, jmap, profilers/IBM Health Center, and Linux commands like top, ps, iostat, and vmstat. All existing/new features must be profiled to identify performance problems early in testing.

Common Java Performance Problems

OutOfMemory

Symptoms:

  1. OutOfMemoryError in logs
  2. Program stops responding
  3. JConsole/jmap shows full memory usage

Analysis:

  1. Use Thread dump analysis to identify the cause of memory leak.
  2. Use jmap (Oracle) or the IBM Health Center memory views (IBM) to identify the leaked objects.
  3. Use Thread dump/JConsole for possible thread leak.
  4. Use Yourkit profiler to identify the leaked objects/caused classes/methods.

Thread Leaks

Symptoms:

  1. OutOfMemoryError in logs
  2. OS hangs
  3. OS related errors like
   "Unable to fork new process"
   "Cannot allocate memory: fork: Unable to fork new process"

Analysis:

  1. Use Thread dump/JConsole to confirm possible Thread leak.
  2. Use thread dump analysis/method profiling to identify the method that causes the thread leak.

Command to find the number of threads on Linux:

   # ps -elfT | wc -l

If a Java process is leaking threads, this number keeps increasing. This can cause your system to reach the maximum allowed number of threads/processes.

To see the threads that one specific process is using, you can do the following:

   # ps -p PID -lfT
   # ps -p 2089 -lfT

If the number of threads listed above keeps increasing, we can suspect a thread leak.

The common causes of thread leaks are:

  1. Blocked threads
  2. Infinite loops in Thread.run()
  3. Not closing ExecutorService/ThreadPoolExecutor
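To illustrate the third cause, here is a minimal sketch of shutting down an ExecutorService once its work is done, so that its worker threads do not outlive the tasks. The class name, pool size, and timeout are illustrative assumptions.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ExecutorShutdownSketch {
    // Runs taskCount trivial tasks and returns how many completed.
    static int runTasks(int taskCount) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        AtomicInteger completed = new AtomicInteger();
        try {
            for (int i = 0; i < taskCount; i++) {
                pool.execute(() -> completed.incrementAndGet());
            }
        } finally {
            // Without shutdown(), the pool's worker threads stay alive indefinitely.
            pool.shutdown();
            try {
                if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                    pool.shutdownNow(); // interrupt tasks that did not finish in time
                }
            } catch (InterruptedException e) {
                pool.shutdownNow();
                Thread.currentThread().interrupt();
            }
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("Completed tasks: " + runTasks(100));
    }
}
```

Creating a new pool per request without this shutdown step is a typical way for the thread count seen with ps to grow steadily.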

Deadlocks

Symptoms:

  1. Program stops responding
  2. Application hangs

Analysis:

  1. Use thread dumps/JConsole to confirm a possible deadlock.
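Besides thread dumps and JConsole, a deadlock check can also be done from inside the application through the standard java.lang.management API. The following is a small sketch; the class and method names are hypothetical, while ThreadMXBean.findDeadlockedThreads() is the standard JDK call.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

class DeadlockCheckSketch {
    // Returns the ids of deadlocked threads, or an empty array if there are none.
    static long[] deadlockedThreadIds() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads(); // null when no deadlock exists
        return ids == null ? new long[0] : ids;
    }

    public static void main(String[] args) {
        long[] ids = deadlockedThreadIds();
        if (ids.length == 0) {
            System.out.println("No deadlock detected");
            return;
        }
        ThreadInfo[] infos = ManagementFactory.getThreadMXBean().getThreadInfo(ids);
        for (ThreadInfo info : infos) {
            if (info != null) {
                System.out.println(info.getThreadName() + " waits for "
                        + info.getLockName() + " held by " + info.getLockOwnerName());
            }
        }
    }
}
```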

Reference :

http://publib.boulder.ibm.com/infocenter/javasdk/tools/index.jsp?topic=%2Fcom.ibm.java.doc.igaa%2F_1vg000156f385c9-11b26a8be3f-7fff_1001.html

Summary

Performance analysis and tuning is an art and an iterative process. Performance tuning exercises should be accompanied by concrete performance requirements and measurement programs. Tools like Java thread dumps, JConsole, and IBM Health Center can help us analyze applications. Analyzing Java thread dumps is critical for understanding what is really going on inside a Java application server. Caching is important to reduce the load on the database and to improve performance.

Must Read Book

[1] “Java Concurrency in Practice” by Brian Goetz, David Holmes, Doug Lea, Tim Peierls, Joshua Bloch

Posted in java

The Art of Java Application Performance Analysis and Tuning – Part 6

This is the sixth article in the series of articles exploring the tools and techniques for analyzing, monitoring, and improving the performance of Java applications.

The Art of Java Application Performance Analysis and Tuning-Part 1
The Art of Java Application Performance Analysis and Tuning-Part 2
The Art of Java Application Performance Analysis and Tuning-Part 3
The Art of Java Application Performance Analysis and Tuning-Part 4
The Art of Java Application Performance Analysis and Tuning-Part 5
The Art of Java Application Performance Analysis and Tuning-Part 7

In this blog we are going to discuss Java application tuning techniques.

Java Application performance tuning

Java application performance tuning involves tuning the following components:

  1. JVM Tuning
  2. OS/Hardware Tuning
  3. Database (MySQL/Oracle) Tuning
  4. Application (Code) Tuning

In this article we will discuss application (code) tuning.

Application (Code) Tuning

In my experience, the following techniques help improve performance:

  1. Identify the bottlenecks and solve them
  2. Using Better Logic/Algorithms
  3. Using Less and Efficient DB Queries
  4. Caching
  5. Using Java Concurrency API for Improving Performance

How to identify Bottlenecks?

Using thread dump analysis: Take multiple thread dumps during busy/peak hours. Analyze the thread dumps to identify the hot/frequently called methods and start optimizing the code/logic.

Using method profiling: Use a profiler or the IBM Health Center Profiling option to find the hot/frequently called methods and start optimizing the code/logic.

Using application logs: Use the application logs/timestamps to measure the performance/response time and start optimizing the code/logic.

Using memory analysis: Use jmap or the IBM Health Center Classes option to find the causes of high memory usage/OutOfMemory errors.

Repeat the above techniques until you meet your performance requirements.

Using Better Logic/Algorithms/Implementation

After identifying the bottlenecks, we may need to optimize/change the code/logic.

Single-threaded bottlenecks: A common reason for slowness/low throughput is a single-threaded bottleneck, a portion of code where only a single thread is available for processing. These bottlenecks cause other threads/data to wait every time they execute, and the resulting backlog is often the reason for high memory usage and OutOfMemory problems.

Notes:

  1. Design concurrent applications around execution of independent tasks (Java Concurrency Executor API).
  2. We can write faster algorithms by using Java Concurrency API/Parallel algorithms.

Using Less and Efficient DB Queries

Most Java applications are database-bound. Even if you can scale out your application, you cannot easily scale an RDBMS. Efficient database modeling and query mechanisms are key to database performance.

Notes:

1. It is difficult to change the database model after project deployment, so we need to get the DB model right the first time.

2. Use batch queries (PreparedStatement.addBatch, Hibernate bulk addition, etc.) wherever possible. In some cases batch queries can be as much as 100 times faster than serial queries.

3. Always try to query on indexed columns.

4. Take care when doing JOIN queries on huge tables.

5. Sometimes queries need to be tuned for a specific DB (Oracle, MySQL). Use the EXPLAIN PLAN command (EXPLAIN in MySQL) to find out the cost of a query.

6. Take care when using ORMs like Hibernate, JPA, etc. Understanding how the ORM generates queries helps with query tuning.

7. Poor database deployment/sizing can severely hurt application performance. So monitor the DB (Oracle, MySQL) and tune the DB server accordingly.

Excessive load on the database affects DB performance. Caching can be used to decrease the load on the database.

Caching

The current industry mantra for high-performance, scalable applications is:

  1. Scale horizontally (multiple/distributed servers).
  2. Cache as much as you can. The more useful data is cached, the better the responsiveness.

Notes:

1. Cache frequently used data and static data.

2. Cache precomputed results for repeated use.

3. Use thread dump analysis/method profiling to identify the hot methods and try to cache the data.

4. Make sure your caches are configurable, so that they can be tuned for project requirements.

5. Make sure your cache configurations are aligned with the available RAM.
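As a small illustration of a configurable, size-bounded cache (notes 4 and 5), the sketch below builds an LRU cache on top of java.util.LinkedHashMap. The class name and the maxEntries parameter are illustrative; this version is not thread-safe, and production systems often use a dedicated library such as Ehcache instead.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A minimal LRU cache: access-ordered LinkedHashMap with a configurable size limit.
class LruCacheSketch<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCacheSketch(int maxEntries) {
        super(16, 0.75f, true); // true = iterate in access order (least recently used first)
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Evict the least recently used entry once the configured limit is exceeded.
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCacheSketch<String, String> cache = new LruCacheSketch<>(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");      // "a" becomes the most recently used entry
        cache.put("c", "3"); // exceeds the limit, so "b" is evicted
        System.out.println(cache.keySet());
    }
}
```

Choosing maxEntries from a configuration value lets the cache size be matched to the RAM available in each deployment.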

Using Java Concurrency API for Performance

The number of cores in multi-core processors keeps increasing. Based on project requirements/network size, we can add more and more processor cores (scale up) to the hardware.

But the question is: are we using the multi-core processors efficiently?

To keep all processor cores busy, we need to code finer-grained and more scalable parallelism. Applications that have not been tuned for multi-core systems may suffer from performance problems.

By using the Java Concurrency API, we are trying to utilize the available multi-core processing resources more effectively.

  1. Design concurrent applications around execution of independent tasks (Java Concurrency Executor API).
  2. We can write faster algorithms by using Java Concurrency API/Parallel algorithms.
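A minimal sketch of the first point: the work is split into independent tasks that are submitted to an ExecutorService, and the partial results are combined at the end. The class name, the summing workload, and the chunking scheme are illustrative assumptions only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelSumSketch {
    // Sums the array by splitting it into independent chunks, one task per chunk.
    static long parallelSum(long[] data, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            int chunk = (data.length + workers - 1) / workers;
            for (int start = 0; start < data.length; start += chunk) {
                final int from = start;
                final int to = Math.min(data.length, start + chunk);
                tasks.add(() -> {
                    long sum = 0;
                    for (int i = from; i < to; i++) {
                        sum += data[i];
                    }
                    return sum;
                });
            }
            long total = 0;
            for (Future<Long> result : pool.invokeAll(tasks)) {
                total += result.get(); // invokeAll returns only after all tasks finish
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long[] data = new long[1000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        int workers = Runtime.getRuntime().availableProcessors();
        System.out.println(parallelSum(data, workers));
    }
}
```

Because the chunks share no mutable state, the tasks need no locking and can keep several cores busy at once.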

Must-read book for all Java developers: “Java Concurrency in Practice” by Brian Goetz

Reference:

http://www.cs.hut.fi/u/tlilja/multicore/slides/java_multicore.pdf

http://highscalability.com/learn-how-exploit-multiple-cores-better-performance-and-scalability

http://www.slideshare.net/leefs/effective-java-concurrency?utm_source=slideshow&utm_medium=ssemail&utm_campaign=download_notification

More details on this topic will follow in a future article.

Posted in java

The Art of Java Application Performance Analysis and Tuning – Part 5

This is the fifth article in the series of articles exploring the tools and techniques for analyzing, monitoring, and improving the performance of Java applications.

The Art of Java Application Performance Analysis and Tuning-Part 1
The Art of Java Application Performance Analysis and Tuning-Part 2
The Art of Java Application Performance Analysis and Tuning-Part 3
The Art of Java Application Performance Analysis and Tuning-Part 4
The Art of Java Application Performance Analysis and Tuning-Part 6
The Art of Java Application Performance Analysis and Tuning-Part 7

In this blog we will continue to look at some of the useful tools for analyzing the Java application programs.

IBM Health Center

Some of the tools discussed earlier (jmap) work with the Oracle HotSpot VM only.

The IBM Monitoring and Diagnostic Tools for Java – Health Center is a tool from IBM that lets you assess the current status of a running Java application on an IBM Java VM. It is a very good low-overhead diagnostic tool and API for monitoring an application running on an IBM Java Virtual Machine. It works very well even for applications with high heap memory usage.

Using IBM Health Center we can

  1. Identify if native or heap memory is leaking
  2. Discover which methods are taking most time to run
  3. Visualize and tune garbage collection
  4. View any lock contentions
  5. Monitor your application's thread activity
  6. Detect deadlock conditions in your application
  7. Gather class histogram data (Equivalent to jmap -histo:live)

The Health Center tool is provided in two parts:

The Health Center client: is a GUI-based diagnostics tool for monitoring the status of a running Java Virtual Machine (JVM), installed within the IBM Support Assistant (ISA) Workbench.

The Health Center agent: provides the mechanism by which the Health Center client obtains information about your Java application. The agent uses a small amount of processor time and memory and must be manually installed in an IBM JVM.

Fig G. shows an overview of where the Health Center client and agent are located when installed in ISA and the JVM.

Fig G. Location of Health Center client and agent

Steps to install/use Health Center:

  1. Install the Health Center client.
  2. Launch the Health Center client.
  3. Enable your application for monitoring.
  4. Connect Health Center to the enabled Java application.

These steps are explained at:

https://www.ibm.com/developerworks/java/jdk/tools/healthcenter/getting_started.html

Configuring and Connecting Java Application

Installing the Health Center agent: Most IBM Java Runtime Environments (JREs) already have a Health Center agent installed. If it is installed, the “jre/lib” folder contains a healthcenter.properties file.

If the agent is not installed, install it by following the instructions at the link below.

http://publib.boulder.ibm.com/infocenter/hctool/v1r0/index.jsp?topic=%2Fcom.ibm.java.diagnostics.healthcenter.doc%2Ftopics%2Finstallingagent.html

Configure and start the Health Center agent: You can start the Java application with the Health Center agent enabled by adding the -Xhealthcenter:port=1965,level=headless option to the java command line.

More detailed options at:

http://publib.boulder.ibm.com/infocenter/hctool/v1r0/topic/com.ibm.java.diagnostics.healthcenter.doc/topics/configuringagent.html

http://publib.boulder.ibm.com/infocenter/hctool/v1r0/topic/com.ibm.java.diagnostics.healthcenter.doc/topics/enablingagent.html

Install the Health Center client: Follow the installation instructions at the link below.

https://www.ibm.com/developerworks/java/jdk/tools/healthcenter/getting_started.html

Launch the Health Center client:

  1. Start the ISA Workbench (From Windows Programs menu)
  2. From the Welcome page, click Analyze Problem.
  3. On the “Analyze Problem” tab, select Tools option
  4. You will be presented with a list of all the installed tools. Select “IBM Monitoring and Diagnostic Tools for Java – Health Center” and then click Launch (Fig H.).
  5. The Health Center application will now launch and the Health Center: Connection wizard will be displayed.

Important note: Your application must be enabled for monitoring before Health Center can be connected.

Fig H. ISA Main page

Connecting to a Java application using the Health Center client: Select New Connection from the File menu of the Health Center client. A connection wizard is displayed (Fig I.). Ensure that you have enabled your application for monitoring, then click Next. Specify the host name and port number. The Health Center client makes a connection using these details.

Fig I. ISA Connect Page

More Details at:

http://publib.boulder.ibm.com/infocenter/hctool/v1r0/topic/com.ibm.java.diagnostics.healthcenter.doc/topics/connectingtoJVM.htm

IBM Health Center Options

Once you have connected Health Center to an application, the Health Center client offers the following options (Fig J.):

  1. Classes: Displays information about loaded classes and the class histogram.
  2. Garbage Collection: Displays information about garbage collection and memory use.
  3. I/O: Displays information about application I/O activity.
  4. Profiling: Displays information about application method profiling.
  5. Threads: Displays information about the Java application threads.
  6. Native Memory: Displays information about the native memory of the Java application.
  7. Others: There are other useful options like Locking, Environment, etc.

Fig J. ISA Health Center Options Page

Monitoring Method Profiling

Method profiling (Fig K.) can be used to diagnose applications showing high CPU usage. It shows where the application is spending its time by giving full call stack information for all sampled methods.

By identifying the hot methods/mostly called methods, we can optimize the methods/application logic to improve the performance.

Fig K. Method Profiling

Monitoring Garbage Collection and Memory

The Memory tab (Fig L.) provides information about memory consumption and garbage collection. This can be used to analyze OutOfMemory or maximum resource usage scenarios.

Fig L. Monitoring Garbage Collection

Monitoring Class Loading

The Classes tab (Fig M.) displays information about class loading, including the number of classes loaded in the system.

The “Class histogram data” button is a simple way to get the live object instances in heap memory. This is equivalent to the jmap -histo:live option.

It gives a histogram of class names with the number of instances and the number of bytes, which is useful in OutOfMemory and high memory usage situations. Using this histogram we can identify which classes have the most instances and the largest size.

We can capture this output at regular intervals and compare the snapshots to identify objects that are growing.

Fig M. Monitoring Class Loading

Monitoring Thread Information

The Threads option (Fig N.) provides information about thread use. It gives the number of threads running in the application. We can get the stack trace of a particular thread by clicking the thread name in the left-side panel.

If the number of threads increases continuously, we can suspect a thread leak.

Fig N. Monitoring Threads

Monitoring Native Memory Usage

We can monitor native memory by using the “Native Memory” option (Fig O.). Native memory is the memory used by the Java application other than heap memory. It includes thread memory/stack size, direct memory allocations (-XX:MaxDirectMemorySize=1G), etc. This is useful for analyzing thread leaks and native memory leaks.

Fig O. Monitoring Native Memory

Monitoring Application I/O Activity

The I/O option (Fig P.) shows the I/O activity of the application.

Fig P. Monitoring I/O Activity

More Details at:

https://www.ibm.com/developerworks/java/jdk/tools/healthcenter/

http://publib.boulder.ibm.com/infocenter/hctool/v1r0/index.jsp

http://publib.boulder.ibm.com/infocenter/hctool/v1r0/index.jsp?topic=%2Fcom.ibm.java.diagnostics.healthcenter.doc%2Fhomepage%2Fplugin-homepage-hc.html

Posted in java

The Art of Java Application Performance Analysis and Tuning – Part 4

This is the fourth article in the series of articles exploring the tools and techniques for analyzing, monitoring, and improving the performance of Java applications.

The Art of Java Application Performance Analysis and Tuning-Part 1
The Art of Java Application Performance Analysis and Tuning-Part 2
The Art of Java Application Performance Analysis and Tuning-Part 3
The Art of Java Application Performance Analysis and Tuning-Part 5
The Art of Java Application Performance Analysis and Tuning-Part 6
The Art of Java Application Performance Analysis and Tuning-Part 7

In this blog we will continue to look at some of the useful tools for analyzing the Java application programs.

jmap (Oracle HotSpot Java VM) - Memory Map

jmap prints heap memory details of a given process or core file of a Java application. This tool is available in the Oracle HotSpot Java VM. Its advanced uses include printing shared object memory maps and heap memory details of a core file.

jmap’s important capabilities for memory profiling are:

  1. To print java heap summary
  2. To print histogram of java live object heap

To print java heap summary

This is the simple way to get the heap usage summary. This can be used to analyze the memory usage of different parts of the Java Heap memory.

  jmap -heap <pid>

# /usr/local/jdk1.6.0_38/bin/jmap -heap 76693
Attaching to process ID 76693, please wait...
Debugger attached successfully.
Server compiler detected.

JVM version is 20.13-b02
using parallel threads in the new generation.
using thread-local object allocation.
Concurrent Mark-Sweep GC

Heap Configuration:
   MinHeapFreeRatio = 40
   MaxHeapFreeRatio = 70
   MaxHeapSize      = 53687091200 (51200.0MB)
   NewSize          = 13421772800 (12800.0MB)
   MaxNewSize       = 13421772800 (12800.0MB)
   OldSize          = 5439488 (5.1875MB)
   NewRatio         = 2
   SurvivorRatio    = 16
   PermSize         = 21757952 (20.75MB)
   MaxPermSize      = 536870912 (512.0MB)

Heap Usage:
New Generation (Eden + 1 Survivor Space):
   capacity = 12676169728 (12088.9375MB)
   used     = 1686018912 (1607.9129333496094MB)
   free     = 10990150816 (10481.02456665039MB)
   13.30069688382134% used
Eden Space:
   capacity = 11930566656 (11377.875MB)
   used     = 1532793448 (1461.785743713379MB)
   free     = 10397773208 (9916.089256286621MB)
   12.847616481226757% used
From Space:
   capacity = 745603072 (711.0625MB)
   used     = 153225464 (146.12718963623047MB)
   free     = 592377608 (564.9353103637695MB)
   20.550540864724333% used
To Space:
   capacity = 745603072 (711.0625MB)
   used     = 0 (0.0MB)
   free     = 745603072 (711.0625MB)
   0.0% used
concurrent mark-sweep generation:
   capacity = 40265318400 (38400.0MB)
   used     = 4247009064 (4050.2634658813477MB)
   free     = 36018309336 (34349.73653411865MB)
   10.54756110906601% used
Perm Generation:
   capacity = 146186240 (139.4140625MB)
   used     = 87710416 (83.64717102050781MB)
   free     = 58475824 (55.76689147949219MB)
   59.999091569767444% used

To print histogram of Java live object heap

This is a simple way to get the live object instances in heap memory. It gives a histogram of class names with the number of instances and the number of bytes, which is useful in OutOfMemory and high memory usage situations. Using this histogram we can identify which classes have the most instances and the largest size.

We can log this output at regular intervals and compare the snapshots to identify objects that are growing.

# /usr/local/jdk1.6.0_38/bin/jmap -histo:live <pid>

# /usr/local/jdk1.6.0_38/bin/jmap -histo:live 76693
 num     #instances         #bytes  class name
----------------------------------------------
  1:      10362767     1175268016  [C
  2:       3779877      651446840  [Ljava.lang.Object;
  3:      11373030      454921200  java.lang.String
  4:         15872      447443072  [Lnet.sf.ehcache.store.chm.SelectableConcurrentHashMap$HashEntry;
  5:       2585762      206860960  org.hibernate.collection.PersistentList
  6:       1030813      190528920  [Ljava.util.HashMap$Entry;
  7:       3423143      136925720  java.util.ArrayList
  8:       1741450       83589600  java.util.HashMap$Entry
  9:         46832       76055168  [Lorg.hibernate.jdbc.Expectation;
 10:       1147644       73449216  com.test.product.TestJob3
 11:       1056207       67597248  com.test.product.TestJob4
 12:        685812       65837952  com.test.product.TestJob5
 13:        648750       57090000  net.sf.ehcache.Element
 14:        842762       53936768  java.util.HashMap
 15:        529534       38126448  com.test.product.TestJob2
 16:        648749       36329944  net.sf.ehcache.store.chm.SelectableConcurrentHashMap$HashEntry
 17:        883157       28261024  java.util.Arrays$ArrayList
 18:         96224       23093760  com.test.product.Test1
 19:        868136       20835264  java.lang.Long
 20:        648750       20760000  net.sf.ehcache.DefaultElementEvictionData
 21:         80046       17930304  in.co.test.product.jobs.common.TestJob
 22:        114409       16796552  <constMethodKlass>
 23:        114409       15570872  <methodKlass>
 24:         59639       15299664  [B
 25:        188040       15043200  java.util.LinkedHashMap
 26:         10634       12314264  <constantPoolKlass>
 27:        191278       12241792  java.util.LinkedHashMap$Entry

In above output, top three objects are Character array, Object array and String objects. Normally Character array, String objects tops the list.

The third row shows 11373030 String objects occupying 454921200 bytes. Looking at the top 50 rows shows which object types dominate the heap during high memory usage.
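The histogram rows can also be processed in code, for example to compare two histograms taken a few minutes apart. The sketch below is only an illustration: HistoRow is a hypothetical helper (not part of the JDK), and it assumes the four-column layout shown in the listing above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** One data row of jmap -histo output (hypothetical helper, not part of the JDK). */
final class HistoRow {
    // Layout assumed from the listing above: "<rank>: <instances> <bytes> <class name>"
    private static final Pattern ROW =
            Pattern.compile("^\\s*(\\d+):\\s+(\\d+)\\s+(\\d+)\\s+(\\S.*?)\\s*$");

    final int rank;
    final long instances;
    final long bytes;
    final String className;

    private HistoRow(int rank, long instances, long bytes, String className) {
        this.rank = rank;
        this.instances = instances;
        this.bytes = bytes;
        this.className = className;
    }

    /** Returns the parsed row, or null for header, separator and blank lines. */
    static HistoRow parse(String line) {
        Matcher m = ROW.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new HistoRow(Integer.parseInt(m.group(1)), Long.parseLong(m.group(2)),
                Long.parseLong(m.group(3)), m.group(4));
    }

    /** Average shallow size of one instance, in bytes. */
    double averageBytes() {
        return instances == 0 ? 0.0 : (double) bytes / instances;
    }

    public static void main(String[] args) {
        String[] sample = {
            " num     #instances         #bytes  class name",
            "----------------------------------------------",
            "  1:      10362767     1175268016  [C",
            "  3:      11373030      454921200  java.lang.String",
        };
        for (String line : sample) {
            HistoRow row = parse(line);
            if (row != null) {
                System.out.printf("%s: %d instances, %d bytes, %.1f bytes each%n",
                        row.className, row.instances, row.bytes, row.averageBytes());
            }
        }
    }
}
```

Dividing the byte count by the instance count gives the average shallow size per object, which helps to tell "many small objects" apart from "a few very large arrays".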

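class name column: the ‘class name’ column contains the class name of each object type. Array types appear in the JVM's encoded form, where the element type is encoded as follows:

   Element type          Encoding
   boolean               Z
   byte                  B
   char                  C
   class or interface    Lclassname;
   double                D
   float                 F
   int                   I
   long                  J
   short                 S

Each array dimension adds a leading [ to the encoding, so an object array appears as [Lclassname;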

Example:

   [C is a char[]
   [S is a short[]
   [I is an int[]
   [B is a byte[]
   [[I is an int[][]
   [Ljava.lang.String; is a java.lang.String[]
   [Ljava.lang.Object; is a java.lang.Object[]
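These encoded names are the same strings that Class.getName() returns for array types, so they are easy to verify. A minimal sketch (the class name ArrayClassNames is just for this illustration):

```java
/** Prints the encoded class names that also appear in the jmap histogram. */
final class ArrayClassNames {
    /** Returns the JVM's encoded name for the runtime class of the given object. */
    static String encodedName(Object value) {
        return value.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(encodedName(new char[0]));    // [C
        System.out.println(encodedName(new short[0]));   // [S
        System.out.println(encodedName(new int[0]));     // [I
        System.out.println(encodedName(new byte[0]));    // [B
        System.out.println(encodedName(new int[0][0]));  // [[I
        System.out.println(encodedName(new String[0]));  // [Ljava.lang.String;
        System.out.println(encodedName(new Object[0]));  // [Ljava.lang.Object;
    }
}
```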

More info at:

http://docs.oracle.com/javase/7/docs/api/java/lang/Class.html#getName%28%29

We can also use jmap to take a heap dump (from a live process or from a core file) and analyze it using jhat or a profiler.

jhat: The jhat command parses a Java heap dump file and launches a web server, so you can browse the heap dump in your favorite web browser. This can be used for more advanced heap dump analysis.
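A heap dump can also be triggered from inside the application, which is handy when you cannot run jmap on the machine. The sketch below is one possible way to do it and rests on a few assumptions: a HotSpot-based JVM (the com.sun.management.HotSpotDiagnosticMXBean interface is HotSpot specific), Java 8 or later, and a made-up file name app.hprof in a temporary directory.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

/** Writes a heap dump of the current JVM (HotSpot only; hypothetical helper). */
final class HeapDumper {
    /**
     * Dumps the heap into a new temporary directory and returns the file.
     * With liveOnly=true only reachable objects are written.
     */
    static Path dumpToTempFile(boolean liveOnly) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            // The target file must not exist yet; the name is an assumption.
            Path file = Files.createTempDirectory("heapdump").resolve("app.hprof");
            bean.dumpHeap(file.toString(), liveOnly);
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path file = dumpToTempFile(true);
        System.out.println("Wrote " + file.toFile().length() + " bytes to " + file);
    }
}
```

The resulting file is a regular heap dump, so it can be opened with the same tools as a dump taken with jmap.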

Reference:

  1. http://docs.oracle.com/javase/6/docs/technotes/tools/share/jmap.html
  2. http://docs.oracle.com/javase/6/docs/technotes/tools/share/jhat.html
  3. http://www.lshift.net/blog/2006/03/08/java-memory-profiling-with-jmap-and-jhat
  4. http://www.herongyang.com/Java-Tools/jstack-jmap-JVM-Heap-Dump-Tool.html

Java Profilers

There are many proprietary profiling tools on the market for measuring performance and tracking down performance bottlenecks. These profilers are used for deep analysis and profiling of Java applications.

YourKit Java Profiler (Need License)

YourKit Profiler can be downloaded from http://www.yourkit.com/download/ and documentation is available at http://www.yourkit.com/docs/index.jsp.

You need to get an evaluation license from the download site. The license key is required to connect the client.

To start the profiler with a Java application, add the profiler agent path to the java command. Use the agent library that matches your platform and JVM bitness:

 #Linux x86, 32-bit Java
 -agentpath:yjp-11.0.8/bin/linux-x86-32/libyjpagent.so=port=9998

 #Linux x86, 64-bit Java
 -agentpath:yjp-11.0.8/bin/linux-x86-64/libyjpagent.so=port=9998

 #Solaris SPARC, 64-bit Java
 -agentpath:yjp-11.0.8/bin/solaris-sparc-64/libyjpagent.so=port=9998

 #IBM AIX PPC, 64-bit Java
 -agentpath:yjp-11.0.8/bin/aix-ppc-64/libyjpagent.so=port=9998

From Remote Windows Machine

 1. start the YourKit Profiler
 2. Enter license key details
 3. Select 'connect remote server' option
 4. Enter 'ServerIP:PortNo' ex: 192.168.9.35:9998 and connect to the server.

Most of the profiler screens are self-explanatory and similar to JConsole. For more information and usage, please refer to the YourKit documentation.

JProfiler Java Profiler (Need License)

http://www.ej-technologies.com/products/jprofiler/overview.html

There are other monitoring tools as well, such as JVisualVM.


The Art of Java Application Performance Analysis and Tuning – Part 3

This is the third article in the series of articles exploring the tools and techniques for analyzing, monitoring, and improving the performance of Java applications.


In this blog we will continue to look at some of the useful tools for analyzing the Java application programs.

JConsole

JConsole is a graphical monitoring tool for applications running on the Java platform. It is used to monitor memory consumption, CPU consumption, threads, class information and so on. The JConsole graph views can be used to monitor the resource usage of a Java application over time.

Connecting JConsole to a Java Application

On Linux/Unix machines:

  /usr/local/jdk1.6.0_17/bin/jconsole   //To connect Local Application. Need to select Java Process in the GUI
  /usr/local/jdk1.6.0_17/bin/jconsole  <PID> //Local Application, with PID
  /usr/local/jdk1.6.0_17/bin/jconsole 10.209.96.221:9999 // Remote Application

On Windows machines:

  Double click the jconsole.exe program in c:\Program Files\Java\bin\jconsole.exe

After opening the JConsole client window (Fig B.), connect to a local Java process by selecting it in the Local Processes list, or connect to a remote Java process by giving its host name and port number.

JConsole remote server configuration: add the lines below to the server's java command to allow remote monitoring. These settings disable authentication and SSL, so use them only on a trusted network.

 -Dcom.sun.management.jmxremote.authenticate=false
 -Dcom.sun.management.jmxremote.ssl=false
 -Dcom.sun.management.jmxremote.port=9999

Fig B. Connecting JConsole to a Java Application
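The same remote port can be used from code, since JConsole is just a JMX client. The sketch below shows one way to connect, assuming the server was started with the three -D options above and no authentication. The class name RemoteJmxClient is made up for this illustration; the host and port are taken from the command line.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.net.MalformedURLException;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

/** Connects to a remote JVM over JMX and prints a few values (illustrative only). */
final class RemoteJmxClient {
    /** Builds the RMI service URL for a JVM that exposes JMX on the given port. */
    static JMXServiceURL serviceUrl(String host, int port) {
        try {
            return new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi");
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Bad host or port", e);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java RemoteJmxClient <host> <port>");
            return;
        }
        JMXServiceURL url = serviceUrl(args[0], Integer.parseInt(args[1]));
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection connection = connector.getMBeanServerConnection();
            // Proxy for the remote JVM's memory MXBean (what the Memory tab reads).
            MemoryMXBean memory = ManagementFactory.newPlatformMXBeanProxy(
                    connection, ManagementFactory.MEMORY_MXBEAN_NAME, MemoryMXBean.class);
            System.out.println("Heap used: " + memory.getHeapMemoryUsage().getUsed() + " bytes");
            System.out.println("Registered MBeans: " + connection.getMBeanCount());
        }
    }
}
```

For example, "java RemoteJmxClient 10.209.96.221 9999" would query the same remote application as the jconsole command shown earlier.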

JConsole Tabs

Once connected to an application, JConsole presents six tabs.

  1. Overview: Displays overview information about the Java VM and monitored values.
  2. Memory: Displays information about memory use.
  3. Threads: Displays information about thread use.
  4. Classes: Displays information about class loading.
  5. VM Summary: Displays information about the Java VM.
  6. MBeans: Displays information about MBeans.

Overview Information

The Overview tab (Fig C.) displays graphical monitoring information about CPU usage, memory usage, thread counts, and the classes loaded in the Java VM, all on a single screen. It can be used to monitor the application over a period of time.

Fig C. JConsole Overview tab

Monitoring Memory Consumption

The Memory tab (Fig D.) provides information about memory consumption and memory pools. If the memory curve reaches the maximum memory allotted, we can suspect an OutOfMemoryError or high memory usage.

A well-behaved application will have a rhythmic graph curve (Fig D.) over time: usage climbs as objects are allocated and drops after each garbage collection.

We can use the “Perform GC” button to start garbage collection manually. This helps to confirm an OutOfMemoryError or high memory usage: if memory does not come down even after “Perform GC”, the heap is filled with objects that are still referenced.
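The same check can be scripted with the platform MemoryMXBean, which exposes the heap numbers shown in the Memory tab. This is a minimal sketch for the local JVM; the class name HeapCheck is an assumption of this example, and gc() only requests a collection in the same way System.gc() does.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

/** Compares heap usage before and after a requested garbage collection. */
final class HeapCheck {
    /** Fraction of the maximum heap in use, or -1 if the maximum is undefined. */
    static double usedFraction(MemoryUsage usage) {
        long max = usage.getMax();
        return max <= 0 ? -1.0 : (double) usage.getUsed() / max;
    }

    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        System.out.printf("Before GC: %.1f%% of max heap%n",
                100 * usedFraction(memory.getHeapMemoryUsage()));
        memory.gc(); // request a collection, as System.gc() would
        System.out.printf("After GC:  %.1f%% of max heap%n",
                100 * usedFraction(memory.getHeapMemoryUsage()));
    }
}
```

If the "after" value stays close to 100% across repeated runs, the heap is full of live objects and a histogram or heap dump is the next step.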

Fig D. Monitoring Memory Consumption

Monitoring Thread use

The Threads tab (Fig E.) provides information about thread use. It gives the number of threads running in the application. We can get the stack trace of a particular thread by clicking the thread name in the left-hand panel.

If the number of threads increases continuously, we can suspect a thread leak.
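The thread counts shown in this tab come from the ThreadMXBean, so a simple watcher can be written in a few lines. The sketch below samples the local JVM five times at one-second intervals; both numbers and the class name ThreadCountWatcher are arbitrary choices for this illustration, and a real check would sample over a much longer period.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

/** Samples the live thread count and flags a steadily growing trend. */
final class ThreadCountWatcher {
    /** True when there are at least two samples and each is greater than the previous one. */
    static boolean steadilyIncreasing(int[] samples) {
        for (int i = 1; i < samples.length; i++) {
            if (samples[i] <= samples[i - 1]) {
                return false;
            }
        }
        return samples.length > 1;
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        int[] samples = new int[5];
        for (int i = 0; i < samples.length; i++) {
            samples[i] = threads.getThreadCount();
            System.out.println("live=" + samples[i]
                    + " peak=" + threads.getPeakThreadCount()
                    + " started=" + threads.getTotalStartedThreadCount());
            Thread.sleep(1000);
        }
        if (steadilyIncreasing(samples)) {
            System.out.println("Thread count keeps growing: possible thread leak");
        }
    }
}
```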

Fig E. Monitoring Thread Use

Detecting Deadlocked Threads

If your application seems to be hanging, it may have run into a deadlock. Deadlocked threads can be detected by clicking the “Detect Deadlock” button. If any deadlocked threads are detected, they are displayed in a new tab that appears next to the “Threads” tab.
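Deadlock detection is also available programmatically through ThreadMXBean.findDeadlockedThreads(). The sketch below is a self-contained demonstration under a few assumptions: Java 8 or later, and two made-up daemon threads (deadlock-demo-1 and deadlock-demo-2) that take two locks in opposite order, so the deadlock is created on purpose.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

/** Creates a deliberate deadlock and detects it through the ThreadMXBean. */
final class DeadlockDemo {
    /** Starts two daemon threads that acquire two locks in opposite order. */
    static void createDeadlock() {
        Object lockA = new Object();
        Object lockB = new Object();
        CountDownLatch bothHoldFirstLock = new CountDownLatch(2);
        startDaemon("deadlock-demo-1", lockA, lockB, bothHoldFirstLock);
        startDaemon("deadlock-demo-2", lockB, lockA, bothHoldFirstLock);
    }

    private static void startDaemon(String name, Object first, Object second,
                                    CountDownLatch latch) {
        Thread thread = new Thread(() -> {
            synchronized (first) {
                latch.countDown();
                try {
                    latch.await(); // wait until the other thread holds its first lock
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {
                    // never reached: the other thread holds this lock
                }
            }
        }, name);
        thread.setDaemon(true); // do not keep the JVM alive
        thread.start();
    }

    /** Polls until a deadlock is reported or the timeout elapses; may return an empty array. */
    static long[] waitForDeadlock(long timeoutMillis) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (true) {
            long[] ids = bean.findDeadlockedThreads(); // null when no deadlock exists
            if (ids != null) {
                return ids;
            }
            if (System.currentTimeMillis() >= deadline) {
                return new long[0];
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return new long[0];
            }
        }
    }

    public static void main(String[] args) {
        createDeadlock();
        long[] ids = waitForDeadlock(5000);
        for (ThreadInfo info : ManagementFactory.getThreadMXBean().getThreadInfo(ids)) {
            System.out.println(info.getThreadName() + " waits for " + info.getLockName()
                    + " held by " + info.getLockOwnerName());
        }
    }
}
```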

Monitoring Class Loading

The Classes tab (Fig F.) displays information about class loading. It gives the number of classes loaded in the system.

In some rare cases classes can leak (the loaded class count keeps growing), which can lead to an OutOfMemoryError.

Fig F. Monitoring Class Loading

The VM Summary tab provides information about the Java VM.

The MBeans tab displays information about all the MBeans registered with the platform MBean server in a generic way. If you are using libraries like Hibernate that expose statistics through JMX, you can view them in the MBeans tab.

JConsole can also be used to connect to an IBM Java VM.

More Details at:

http://docs.oracle.com/javase/6/docs/technotes/guides/management/jconsole.html


The Art of Java Application Performance Analysis and Tuning – Part 2

This is the second article in the series of articles exploring the tools and techniques for analyzing, monitoring, and improving the performance of Java applications.


In this blog we are going to look at some of the useful tools for analyzing the Java application programs.

Java Thread Dump

A thread dump is a list of stack traces of all the Java threads that are currently active in a Java Virtual Machine (JVM). A thread dump shows the status of all running threads/locks at a given moment.

A Java thread dump can be used to analyze many application-specific issues, such as

  1. Resource usage (CPU, DB, I/O, etc.)
  2. Identifying the hot methods/queries
  3. Identifying synchronization/lock issues

Since a thread dump is a snapshot at a given moment, a single dump will not give a clear picture of the application. It is therefore recommended to take more than one thread dump while analyzing.

Five thread dumps at five-second intervals is a good starting point. Always take thread dumps when an application server hangs, misbehaves or has performance problems.
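The "several dumps at a fixed interval" idea can also be applied from inside the JVM with ThreadMXBean.dumpAllThreads(). The sketch below is an in-process illustration only (the class name ThreadDumpSampler is made up, and Java 7 or later is assumed); for a running server you would normally use the external tools described in the following sections.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

/** Takes several in-process thread dumps at a fixed interval. */
final class ThreadDumpSampler {
    /** Returns up to count dumps, sleeping intervalMillis between two dumps. */
    static List<ThreadInfo[]> sample(int count, long intervalMillis) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        // Only ask for lock details when the JVM supports them.
        boolean monitors = bean.isObjectMonitorUsageSupported();
        boolean synchronizers = bean.isSynchronizerUsageSupported();
        List<ThreadInfo[]> dumps = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            dumps.add(bean.dumpAllThreads(monitors, synchronizers));
            if (i < count - 1) {
                try {
                    Thread.sleep(intervalMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break; // stop sampling early if interrupted
                }
            }
        }
        return dumps;
    }

    public static void main(String[] args) {
        List<ThreadInfo[]> dumps = sample(5, 5000); // five dumps, five seconds apart
        for (int i = 0; i < dumps.size(); i++) {
            System.out.println("=== dump " + (i + 1) + " ===");
            for (ThreadInfo info : dumps.get(i)) {
                System.out.println("\"" + info.getThreadName() + "\" " + info.getThreadState());
                for (StackTraceElement frame : info.getStackTrace()) {
                    System.out.println("        at " + frame);
                }
            }
        }
    }
}
```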

How to take Thread Dump

There are several ways to take thread dumps from a JVM.

JStack

We can use the jstack command to generate a thread dump for a given Java process. This command is available in the bin folder of the Oracle HotSpot JDK.

 /usr/local/jdk1.6.0_17/bin/jstack <PID>
 /usr/local/jdk1.6.0_17/bin/jstack -l <PID> // Prints additional information about locks

Linux/Unix(AIX, Solaris etc..)

If the JVM is running in the background then send the QUIT signal:

  kill -3 PID

If the JVM is running in a console then simply press Ctrl-\.

PID is the process ID of the running Java process. The thread dump is written to the standard output of the JVM process.

On an IBM Java VM, a separate thread dump file is generated.

example: javacore.20130422.180453.23902.0001.txt.

You can get the process IDs of all running Java processes with the following commands:

  ps -ef | grep java
  jps (Available in Oracle JDK/bin)

Windows

The Java application that you want to produce a thread dump for must be running in (started from) a command console. To produce a thread dump, press Ctrl-Break.

For more details :

  1. http://helpx.adobe.com/cq/kb/TakeThreadDump.html
  2. http://wiki.eclipse.org/How_to_report_a_deadlock#Getting_a_stack_trace_on_other_platforms

Thread Dump Analysis

Thread dumps can be used to analyze performance bottlenecks, memory leaks/OutOfMemory errors, thread leaks, deadlocks, synchronization/lock issues, etc. Without thread dumps, it is very difficult to get to the root cause of an application server “hang” condition.

Thread dump format and content may vary between Java vendors (Oracle HotSpot JDK, IBM Java), but at the core they provide a list of the stack traces for all Java threads in the Java Virtual Machine.

A stack trace is a dump of the current execution stack that shows the method calls running on that thread from the bottom up. If there are N threads running in the JVM, the thread dump contains information for N threads.

For each thread it gives

  1. The name of the thread, priority
  2. Thread group
  3. State (running/blocked/waiting/parking)
  4. Execution stack in form of thread stack trace
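The same four pieces of information are available on every java.lang.Thread object, which makes the dump format easier to relate to code. A minimal sketch, assuming Java 8 or later; the class name ThreadSummary and the output format are inventions of this example and only loosely imitate a real dump.

```java
import java.util.Map;

/** Prints name, priority, thread group, state and stack for each live thread. */
final class ThreadSummary {
    /** One-line summary of a thread, loosely modelled on a thread dump header. */
    static String describe(Thread thread) {
        ThreadGroup group = thread.getThreadGroup(); // null once the thread has terminated
        return String.format("\"%s\" prio=%d group=%s state=%s",
                thread.getName(), thread.getPriority(),
                group == null ? "none" : group.getName(), thread.getState());
    }

    public static void main(String[] args) {
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            System.out.println(describe(entry.getKey()));
            for (StackTraceElement frame : entry.getValue()) {
                System.out.println("        at " + frame);
            }
            System.out.println();
        }
    }
}
```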

Here is a sample stack trace from a Java application.

"pool-3-thread-18" prio=10 tid=0x2a138c00 nid=0x1fbc waiting on condition [0x2759d000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x3102bc00> (a java.util.concurrent.FutureTask$Sync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:747)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:905)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1217)
        at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:218)
        at java.util.concurrent.FutureTask.get(FutureTask.java:83)
        at java.util.concurrent.AbstractExecutorService.invokeAll(AbstractExecutorService.java:205)
        at com.test.product.reports.server.ReportProvidor.getData(ReportProvider.java:196)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)

The stack trace reads bottom up. It started with java.lang.Thread.run(Thread.java:619); that method called “ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)” above it, which called the one above it, and so on. The key here is knowing that what is currently running (or waiting) is always at the top of the stack trace.

This will give you insight as to what the threads are stuck/waiting on.
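The ordering is easy to see in a tiny program: in the array returned by getStackTrace(), element 0 is the top of the stack (the most recent call), and the callers follow. The method names outer and inner below are made up for the illustration.

```java
/** Shows that index 0 of a stack trace is the most recently called method. */
final class StackOrderDemo {
    static StackTraceElement[] outer() {
        return inner();
    }

    static StackTraceElement[] inner() {
        // The trace is captured here, so "inner" becomes the top frame.
        return new Throwable().getStackTrace();
    }

    public static void main(String[] args) {
        StackTraceElement[] frames = outer();
        for (int i = 0; i < frames.length; i++) {
            System.out.println(i + ": at " + frames[i]);
        }
    }
}
```

Printed this way, the list matches the dump layout: inner first, then outer, then main at the bottom.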

During analysis we can glance through the whole stack trace and try to understand what is actually going on. Method names/Class names/Thread names will help us identify the running code.

During coding, it is good practice to name threads/Executor threads. These names will be visible in the thread dump and  help us to identify the running code. In the above stack trace we can see that some report generation code is executing.
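One common way to name executor threads is to pass a custom ThreadFactory to the pool. The sketch below assumes Java 8 or later; NamedThreadFactory and the prefix "report-generator" are hypothetical names chosen for this example, not something from the application in the dump above.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Thread factory that gives pool threads a meaningful, numbered name. */
final class NamedThreadFactory implements ThreadFactory {
    private final String prefix;
    private final AtomicInteger counter = new AtomicInteger(1);

    NamedThreadFactory(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public Thread newThread(Runnable task) {
        // Produces names such as "report-generator-1", "report-generator-2", ...
        return new Thread(task, prefix + "-" + counter.getAndIncrement());
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool =
                Executors.newFixedThreadPool(2, new NamedThreadFactory("report-generator"));
        for (int i = 0; i < 4; i++) {
            pool.execute(() ->
                    System.out.println("running on " + Thread.currentThread().getName()));
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

With this in place, a thread dump shows "report-generator-1" instead of a generic name like "pool-3-thread-18", so the owning feature is obvious at a glance.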

Scenario 1: Analyzing for performance/When the processing is slow/When the CPU usage is high

Take multiple thread dumps during peak time/busy hour. Then analyze the thread dumps to find out the hot methods/frequently  called methods/thread waits/I/O waits/DB waits. After identifying the bottleneck we can optimize the code/logic.
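A quick way to spot hot methods is to count how often each method appears as the top frame across the collected dumps. The sketch below is a rough helper for that, assuming Java 8 or later and the HotSpot text format shown earlier (a quoted thread name line followed by "at ..." lines); HotFrameCounter is a made-up name and the parsing is intentionally simple.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Counts how many threads have each method as their top stack frame. */
final class HotFrameCounter {
    static Map<String, Integer> countTopFrames(String dumpText) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        boolean awaitingTopFrame = false;
        for (String raw : dumpText.split("\\r?\\n")) {
            String line = raw.trim();
            if (line.startsWith("\"")) {
                awaitingTopFrame = true;            // header line of a new thread
            } else if (awaitingTopFrame && line.startsWith("at ")) {
                counts.merge(line.substring(3), 1, Integer::sum);
                awaitingTopFrame = false;           // ignore the deeper frames
            } else if (line.isEmpty()) {
                awaitingTopFrame = false;           // thread without a Java stack
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        String dump = "\"worker-1\" prio=10\n"
                + "   java.lang.Thread.State: WAITING (parking)\n"
                + "        at sun.misc.Unsafe.park(Native Method)\n"
                + "        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)\n"
                + "\n"
                + "\"worker-2\" prio=10\n"
                + "        at sun.misc.Unsafe.park(Native Method)\n";
        for (Map.Entry<String, Integer> entry : countTopFrames(dump).entrySet()) {
            System.out.println(entry.getValue() + " x " + entry.getKey());
        }
    }
}
```

Running it over the concatenated text of several dumps gives a crude profile: frames that show up again and again are where threads spend their time, whether running or waiting.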

Scenario 2: Thread-leak Analysis

If the number of threads in successive thread dumps keeps increasing, we can suspect a thread leak. Using the thread names, we can identify the part of the code that is leaking threads. The number of threads can also be monitored with the JConsole tool.

Scenario 3: Deadlock Analysis

If the program is not responding, we can suspect a thread deadlock. By analyzing the thread dump we can identify the threads that are waiting for each other's locks.

In an IBM VM thread dump file, the ‘LOCKS’ section shows the threads and locks involved in the deadlock.

Reference :

  1. http://www.javacodegeeks.com/2012/03/jvm-how-to-analyze-thread-dump.html
  2. http://architects.dzone.com/articles/how-analyze-java-thread-dumps
  3. http://allthingsmdw.blogspot.in/2012/02/analyzing-thread-dumps-in-middleware.html

IBM VM Thread Dump:

The IBM VM thread dump format is different from Oracle Java's but provides more details for troubleshooting. An IBM VM thread dump contains complete run-time information about the application.

At the start of a Java dump, the first three sections are the TITLE, GPINFO, and ENVINFO sections. They provide useful information about the cause of the dump.

The MEMINFO section provides information about the Memory Manager.

The LOCKS section of a Java dump gives lock information, which is useful for deadlock analysis.

THREADS section (this is what you get in an Oracle thread dump):

This section shows a complete list of stack traces of the Java threads that are alive in the IBM VM.

Here is an example stack trace for a thread running in IBM Java

3XMTHREADINFO      "pool-3-thread-23" J9VMThread:0x32EF0500, j9thread_t:0x320ECE6C, java/lang/Thread:0x3649AA58, state:P, prio=5
3XMJAVALTHREAD            (java/lang/Thread getId:0x37, isDaemon:false)
3XMTHREADINFO1            (native thread ID:0x5DA3, native priority:0x5, native policy:UNKNOWN)
3XMTHREADINFO2            (native stack address range from:0x2FE6A000, to:0x2FEAB000, size:0x41000)
3XMTHREADINFO3           Java callstack:
4XESTACKTRACE                at sun/misc/Unsafe.park(Native Method)
4XESTACKTRACE                at java/util/concurrent/locks/LockSupport.parkNanos(LockSupport.java:222)
4XESTACKTRACE                at java/util/concurrent/SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:435)
4XESTACKTRACE                at java/util/concurrent/SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:334)
4XESTACKTRACE                at java/util/concurrent/SynchronousQueue.poll(SynchronousQueue.java:885)
4XESTACKTRACE                at java/util/concurrent/ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:956)
4XESTACKTRACE                at java/util/concurrent/ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
4XESTACKTRACE                at java/lang/Thread.run(Thread.java:738)
3XMTHREADINFO3           Native callstack:
4XENATIVESTACK               (0xB7468B56 [libj9prt24.so+0xbb56])
4XENATIVESTACK               (0xB747464C [libj9prt24.so+0x1764c])
4XENATIVESTACK               (0xB7468BE9 [libj9prt24.so+0xbbe9])
4XENATIVESTACK               (0xB7468D0C [libj9prt24.so+0xbd0c])
4XENATIVESTACK               (0xB7468988 [libj9prt24.so+0xb988])
4XENATIVESTACK               (0xB747464C [libj9prt24.so+0x1764c])
4XENATIVESTACK               (0xB74689FC [libj9prt24.so+0xb9fc])
4XENATIVESTACK               (0xB746483D [libj9prt24.so+0x783d])
4XENATIVESTACK               (0xB773D40C)
4XENATIVESTACK               (0x0D49C778 [<unknown>+0x0])
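Because every javacore line starts with a tag, the file is easy to process with a small script. The sketch below extracts the thread name, state and priority from 3XMTHREADINFO lines. It assumes the line layout shown in the example above; JavacoreThreadLine is a hypothetical helper, not an IBM tool.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Parses the 3XMTHREADINFO header line of a thread in an IBM javacore file. */
final class JavacoreThreadLine {
    // Layout assumed from the sample above: tag, quoted name, ..., state:X, prio=N
    private static final Pattern HEADER = Pattern.compile(
            "^3XMTHREADINFO\\s+\"([^\"]*)\".*\\bstate:(\\w+),\\s*prio=(\\d+)");

    final String name;
    final String state;
    final int priority;

    private JavacoreThreadLine(String name, String state, int priority) {
        this.name = name;
        this.state = state;
        this.priority = priority;
    }

    /** Returns the parsed header, or null when the line is not a thread header. */
    static JavacoreThreadLine parse(String line) {
        Matcher m = HEADER.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new JavacoreThreadLine(m.group(1), m.group(2), Integer.parseInt(m.group(3)));
    }

    public static void main(String[] args) {
        String line = "3XMTHREADINFO      \"pool-3-thread-23\" J9VMThread:0x32EF0500, "
                + "j9thread_t:0x320ECE6C, java/lang/Thread:0x3649AA58, state:P, prio=5";
        JavacoreThreadLine info = parse(line);
        if (info != null) {
            System.out.println(info.name + " state=" + info.state + " prio=" + info.priority);
        }
    }
}
```

Lines with other tags, such as 3XMTHREADINFO1 or 4XESTACKTRACE, do not match the pattern and are skipped, so the parser can be fed the whole file line by line.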

More Details available at:

  1. https://www.ibm.com/developerworks/community/groups/service/html/communityview?communityUuid=2245aa39-fa5c-4475-b891-14c205f7333c
  2. http://publib.boulder.ibm.com/infocenter/realtime/v1r0/index.jsp?topic=%2Fcom.ibm.rt.doc.10%2Fdiag%2Ftools%2Fjavadump_tags_threads.html
  3. http://java.dzone.com/articles/how-analyze-thread-dumps-ibm