Performance problems are one of the biggest challenges to expect when designing and implementing Java EE related technologies. Some of these common problems can be faced when implementing either lightweight or large IT environments, which typically include several distributed systems, from Web portals and ordering applications to enterprise service bus (ESB), data warehouse, and legacy Mainframe storage systems.
It is very important for IT architects and Java EE developers to understand their client environments and ensure that the proposed solutions will not only meet growing business needs but also provide a scalable and reliable production IT environment over the long term, at the lowest possible cost. Performance problems can disrupt your client's business, which can result in short- and long-term loss of revenue.
This article consolidates and shares the top 10 causes of Java EE performance problems I have encountered working with IT & Telecom clients over the last 10 years, along with high-level recommendations.
Please note that this article is in-depth, but I'm confident that this substantial read will be worth your time.
I'm confident that many of you can identify episodes of performance problems following Java EE project deployments. Some of these performance problems could have a very specific and technical explanation, but they are often symptoms of gaps in the current capacity planning of the production environment.
Capacity planning can be defined as a comprehensive and evolving process of measuring and predicting current and future required IT environment capacity. A properly implemented capacity planning process will not only keep track of current IT production capacity and stability but also ensure that new projects can be deployed with minimal risk in the existing production environment. Such an exercise can also conclude that extra capacity (hardware, middleware, JVM, tuning, etc.) is required prior to project deployment.
In my experience, a lack of proper capacity planning is the most common "process" problem that can lead to short- and long-term performance problems. The following are some examples.
Problems observed | Possible capacity planning gaps |
A newly deployed application triggers an overload of the current Java Heap or Native Heap space (e.g., java.lang.OutOfMemoryError is observed). | - Lack of understanding of the current JVM Java Heap (YoungGen and OldGen spaces) utilization - Lack of static and / or dynamic memory footprint calculation for the newly deployed application - Lack of performance and load testing, preventing detection of problems such as a Java Heap memory leak |
A newly deployed application triggers a significant increase in CPU utilization and performance degradation of the Java EE middleware JVM processes. | - Lack of understanding of the current CPU utilization (e.g., no established baseline) - Lack of understanding of the current JVM garbage collection health (a new application / extra load can trigger increased GC and CPU) - Lack of load and performance testing, failing to predict the impact on existing CPU utilization |
A new Java EE middleware system is deployed to production but is unable to handle the anticipated volume. | - Missing or inadequate performance and load testing - Data and test cases used in performance and load testing not reflecting real-world traffic and business processes - Not enough bandwidth (or pages are much bigger than capacity planning anticipated) |
One key aspect of capacity planning is load and performance testing, which everybody should be familiar with. This involves generating load against a production-like environment or the production environment itself in order to:
There are several technologies that allow you to achieve these goals. Some load-testing products allow you to generate load from inside your network, from a test lab, while other emerging technologies allow you to generate load from the "Cloud".
I'm currently exploring the free version of LoadTester, a new load testing tool I found that allows you to record test cases and generate load from inside your network or from the Cloud.
Regardless of the load and performance testing tool that you decide to use, this exercise should be done on a regular basis for any dynamic Java EE environment, as part of a comprehensive and adaptive capacity planning process. When done properly, capacity planning will help increase the service availability of your client's IT environment.
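To make the idea of generating load and measuring response time concrete, here is a minimal sketch in plain Java. It is not a replacement for a real load-testing product: the class name `MiniLoadTest`, the thread and request counts, and the stand-in workload are all assumptions made for illustration, and a real test would replace the workload with an actual request against a production-like environment.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper, for illustration only.
final class MiniLoadTest {

    /** Runs the task `requests` times on `threads` workers; returns sorted latencies in ms. */
    static List<Long> run(int threads, int requests, Runnable task) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                Callable<Long> timed = () -> {
                    long start = System.nanoTime();
                    task.run();
                    return (System.nanoTime() - start) / 1_000_000;
                };
                futures.add(pool.submit(timed));
            }
            List<Long> latencies = new ArrayList<>();
            for (Future<Long> f : futures) {
                latencies.add(f.get()); // rethrows any failure from the task
            }
            Collections.sort(latencies);
            return latencies;
        } finally {
            pool.shutdown();
        }
    }

    /** Nearest-rank percentile over an already sorted, non-empty list. */
    static long percentile(List<Long> sorted, double pct) {
        int rank = (int) Math.ceil(pct / 100.0 * sorted.size());
        return sorted.get(Math.max(0, rank - 1));
    }

    public static void main(String[] args) throws Exception {
        // Stand-in workload (assumption): sleeps instead of calling a real system.
        Runnable fakeRequest = () -> {
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        List<Long> latencies = run(8, 200, fakeRequest);
        System.out.println("samples=" + latencies.size()
                + " p50=" + percentile(latencies, 50) + " ms"
                + " p95=" + percentile(latencies, 95) + " ms");
    }
}
```

Tracking percentiles rather than only the average is a deliberate choice here: a small share of very slow requests is often the first visible sign that an environment is approaching its capacity limit.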
The second most common cause of performance problems I have observed for Java EE enterprise systems is an inadequate Java EE middleware environment and / or infrastructure. Not making proper decisions at the beginning of a new platform can result in major stability problems and increased costs for your client in the long term. For that reason, it is important to spend enough time brainstorming on the required Java EE middleware specifications. This exercise should be combined with an initial capacity planning iteration, since the business processes, expected traffic, and application(s) footprint will ultimately dictate the initial IT environment capacity requirements.
Below are typical examples of problems I have observed:
Trying to leverage a single middleware and / or JVM for many large Java EE applications can be quite attractive from a cost perspective. However, this can result in an operational nightmare and severe performance problems, such as excessive JVM garbage collection and many domino-effect scenarios (e.g., Stuck Threads) causing high business impact (e.g., App A causing App B, App C, and App D to go down because a full JVM restart is often required to resolve problems).
Now let's jump to purely technical problems, starting with excessive JVM garbage collection. Most of you are familiar with this famous (or infamous) Java error: java.lang.OutOfMemoryError. This is the result of JVM memory space depletion (Java Heap, Native Heap, etc.).
I'm sure middleware vendors such as Oracle and IBM could provide you with dozens and dozens of support cases involving JVM OutOfMemoryError problems on a regular basis, so it is no surprise that it made the #3 spot in our list.
Keep in mind that a garbage collection problem will not necessarily manifest itself as an OOM condition. Excessive garbage collection can be defined as an excessive number of minor and / or major collections performed by the JVM GC Threads (collectors) in a short amount of time, leading to high JVM pause time and performance degradation. There are many possible causes:
Before pointing a finger at the JVM, keep in mind that the actual "root" cause can be related to our #1 & #2 causes. An overloaded middleware environment will generate many symptoms, including excessive JVM garbage collection.
Proper analysis of your JVM-related data (memory spaces, GC frequency, CPU correlation, etc.) will allow you to determine whether you are facing a problem or not. A deeper level of analysis to understand your application memory footprint will require you to analyze JVM Heap Dumps and / or profile your application using a profiler tool of your choice (such as JProfiler).
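As a starting point for collecting this kind of JVM data, the sketch below reads heap pool usage and cumulative GC activity through the standard `java.lang.management` API. The class name `JvmSnapshot` and the "GC time share" figure are assumptions made for illustration: the reported collection time is approximate and collector-dependent, so treat the percentage as a rough indicator to trend over time, not as a precise pause-time measurement.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Hypothetical helper, for illustration only.
final class JvmSnapshot {

    /** Approximate share of JVM uptime spent in GC, summed across all collectors. */
    static double gcTimePercent() {
        long gcMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                gcMillis += t;
            }
        }
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptimeMillis <= 0 ? 0.0 : 100.0 * gcMillis / uptimeMillis;
    }

    /** Human-readable summary of heap pools and collectors. */
    static String describe() {
        StringBuilder sb = new StringBuilder();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // only Java Heap pools (young / old generation spaces)
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // pool is no longer valid
            }
            long maxKb = usage.getMax() < 0 ? -1 : usage.getMax() / 1024; // -1 = undefined
            sb.append(String.format("%-28s used=%d KB max=%d KB%n",
                    pool.getName(), usage.getUsed() / 1024, maxKb));
        }
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(String.format("%-28s collections=%d time=%d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        sb.append(String.format("GC time share: %.2f%%%n", gcTimePercent()));
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```

Sampling these values at a regular interval and correlating them with CPU utilization gives a simple baseline; a steadily rising old generation usage or GC time share under constant load is a cue to move on to Heap Dump analysis or profiling.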
The next common cause of bad Java EE performance mainly applies to highly distributed systems, which are typical of Telecom IT environments. In such environments, a middleware domain (e.g., Service Bus) will rarely do all the work but will rather "delegate" some of the business processes, such as product qualification, customer profile, and order management, to other Java EE middleware platforms or legacy systems such as Mainframe, via various payload types and communication protocols.
Such external system calls mean that the client Java EE application will trigger the creation or reuse of Socket Connections to write and read data to / from external systems across a private network. Some of these calls can be configured as synchronous or asynchronous, depending on the implementation and the nature of the business process. It is important to note that the response time can change over time depending on the health of the external systems, so it is very important to shield your Java EE application and middleware via proper use of timeouts.
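The sketch below shows what shielding a call with timeouts can look like at the plain socket level, using the standard `Socket.connect(SocketAddress, int)` and `Socket.setSoTimeout(int)` calls. The class name `TimeoutDemo`, the `TIMED_OUT` marker, and the timeout values are assumptions made for illustration; in a real Java EE application the equivalent settings usually live in the Web Service, HTTP client, or middleware configuration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical helper, for illustration only.
final class TimeoutDemo {

    /** Marker returned when the remote system did not answer within the read timeout. */
    static final int TIMED_OUT = -2;

    /** Connects and reads one byte, bounding both the connect and the read phases. */
    static int readWithTimeouts(String host, int port, int connectMs, int readMs)
            throws IOException {
        try (Socket socket = new Socket()) {
            // Connect timeout: protects against an unreachable or overloaded endpoint.
            socket.connect(new InetSocketAddress(host, port), connectMs);
            // Read timeout: protects against a system that accepts but never responds.
            socket.setSoTimeout(readMs);
            InputStream in = socket.getInputStream();
            try {
                return in.read();
            } catch (SocketTimeoutException e) {
                return TIMED_OUT; // caller thread is released instead of hanging
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulated unhealthy external system: listens but never sends a response.
        try (ServerSocket stalled = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            long start = System.nanoTime();
            int result = readWithTimeouts(
                    stalled.getInetAddress().getHostAddress(), stalled.getLocalPort(), 1000, 500);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println((result == TIMED_OUT ? "timed out" : "got data")
                    + " after about " + elapsedMs + " ms");
        }
    }
}
```

Without the read timeout, the calling thread would block for as long as the external system stays silent, which is how a single slow dependency ends up exhausting a container's thread pool.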
Finally, I also recommend that you spend adequate time performing negative testing. This means that problem conditions should be "artificially" introduced to the external systems in order to test how your application and middleware environment handle failures of those external systems. This exercise should also be performed under a high-volume situation, allowing you to fine-tune the different timeout values between your applications and external systems.
The next common performance problem should not be a surprise to anybody: database issues. Most Java EE enterprise systems rely on relational databases for various business processes, from portal content management to order provisioning systems. A solid database environment and foundation will ensure that your IT environment scales properly to support your client's growing business.
In my production support experience, database-related performance problems are very common. Since most database transactions are typically executed via JDBC Datasources (including for relational persistence APIs such as Hibernate), performance problems will initially manifest as Stuck Threads from your Java EE container Thread manager. The following are common database-related problems I have seen over the last 10 years:
* Note that Oracle database is used as an example since it is a common product used by my IT clients.*