jvm.config Tuning Tip For All Server Side Java Solutions

I have been wanting to blog about an experience I had not too long ago on a project where the JVM was consistently throwing OOM (out of memory) errors. It had me banging my head against my desk for a few days while I tried to trace the culprit.

Was it code related? Yes. Was it bad code? Sort of. Was it very, very, very intense code (looping over many SQL calls and instantiating many objects)? YES! Because of time constraints, the challenge was to apply a quick band-aid rather than redesign the rules engine that had these characteristics. It is important to note I was not responsible for the poorly written code 8-).

I first had to identify whether this code was leaking memory. To do this I used YourKit Java Profiler. YourKit offers .NET and Java profilers that let you monitor their respective runtimes. YourKit Profiler is very simple to configure within the jvm.config file (I'm not going to get into that in this blog). My point here is that I watched memory climb steadily and at times spike, and it was only reclaimed when I executed a manual GC. UGGG!!! So, no memory leak, but the runtime was hanging on to what it had... Why was the GC not reclaiming memory quickly on its own? I had all the usual best-practice (BP) JVM args of old in place, ParNewGC, RMI, etc., to no avail.
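If you don't have a profiler handy, the same "memory only drops after a manual GC" observation can be roughly reproduced with the standard java.lang.management API. This is only a minimal sketch: the class name HeapProbe and its methods are hypothetical, not part of the project described here, and System.gc() is merely a request that the JVM is free to ignore.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper (not from the original project): reports heap usage
// before and after an explicitly requested collection.
final class HeapProbe {

    // Heap bytes currently in use, as reported by the JVM's memory MXBean.
    static long usedHeapBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    // Requests a GC and returns the change in used heap (before minus after).
    // System.gc() is only a request, so the result can be zero or negative.
    static long bytesFreedByManualGc() {
        long before = usedHeapBytes();
        System.gc();
        return before - usedHeapBytes();
    }

    public static void main(String[] args) {
        // Create short-lived garbage so there is something to reclaim.
        for (int i = 0; i < 2000; i++) {
            byte[] junk = new byte[64 * 1024];
            junk[0] = 1;
        }
        System.out.println("Used before GC: " + usedHeapBytes() / 1024 + " KB");
        System.out.println("Freed by manual GC: " + bytesFreedByManualGc() / 1024 + " KB");
        System.out.println("Used after GC: " + usedHeapBytes() / 1024 + " KB");
    }
}
```

If used memory keeps climbing between requests but falls sharply after the explicit collection, that points at the collector not keeping up rather than at a true leak.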

And to get back to the initial issue: the blasted OOM error. I searched high and low and found a thread on a Sun forum describing an issue in JRE 1.5 where the runtime ultimately threw an OOM error if it was unable to reclaim memory during a GC within a given time frame. The workaround was to set a time constraint in the JVM (which didn't work) or install 1.6_10 or later. Installing the newer JRE was my first step. I installed this JRE version and pointed my jvm.config to it. The application ran fine under this JRE, except that I was still seeing memory creep to the ceiling with no reclaim.
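After repointing jvm.config at a different JRE, it is worth confirming which runtime the server actually picked up. A minimal sketch using standard system properties follows; the class name RuntimeCheck is hypothetical, and in a ColdFusion setup you would more likely print the same properties from within the application than run a standalone class.

```java
// Hypothetical helper: reports which Java runtime the current process is using.
final class RuntimeCheck {

    // Builds a one-line summary from standard JVM system properties.
    static String describeRuntime() {
        return "java.version=" + System.getProperty("java.version")
             + ", java.vm.name=" + System.getProperty("java.vm.name")
             + ", java.home=" + System.getProperty("java.home");
    }

    public static void main(String[] args) {
        System.out.println(describeRuntime());
    }
}
```

If java.home still shows the old install path, the config change did not take effect.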

I then read the following paragraph in one of Sun's GC tuning white papers:

The -XX:+AggressiveHeap option inspects the machine resources (size of memory and number of processors) and attempts to set various parameters to be optimal for long-running, memory allocation-intensive jobs. It was originally intended for machines with large amounts of memory and a large number of CPUs, but in the J2SE platform, version 1.4.1 and later it has shown itself to be useful even on four processor machines. With this option the throughput collector (-XX:+UseParallelGC) is used along with adaptive sizing (-XX:+UseAdaptiveSizePolicy). The physical memory on the machines must be at least 256MB before AggressiveHeap can be used. The size of the initial heap is calculated based on the size of the physical memory and attempts to make maximal use of the physical memory for the heap (i.e., the algorithms attempt to use heaps nearly as large as the total physical memory).

Note: -XX:+UseAdaptiveSizePolicy is on by default so I don't explicitly define it in my args.

Amazingly, once I added this to the args and swapped ParNewGC for UseParallelGC, the server ran flawlessly for days and days without a restart. I was serving requests into the millions without a restart!!! A partial arg list specific to these settings is below. Please let me know your thoughts and concerns, as I always enjoy constructive feedback.

java.args=-server -Xmx1024m -Xms1024m -XX:+AggressiveHeap -XX:+UseParallelGC -Dsun.io.useCanonCaches=false -XX:MaxPermSize=512m

Note: This was for a ColdFusion 8 instance.
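To double-check that the args above were really applied, the JVM can report its own startup arguments, active collectors, and maximum heap. This is a minimal sketch using the standard java.lang.management API; the class name GcReport is hypothetical. With -XX:+UseParallelGC, HotSpot typically reports collectors named "PS Scavenge" and "PS MarkSweep", but treat the exact names as an assumption that can vary by JVM vendor and version.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: prints the JVM args and collectors actually in effect.
final class GcReport {

    // Arguments passed to the JVM at startup (e.g. -Xmx..., -XX:+UseParallelGC).
    static List<String> jvmArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    // Names of the garbage collectors the running JVM has registered.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<String>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    // Maximum heap the JVM will attempt to use, in bytes.
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        System.out.println("JVM args:   " + jvmArguments());
        System.out.println("Collectors: " + collectorNames());
        System.out.println("Max heap:   " + maxHeapBytes() / (1024 * 1024) + " MB");
    }
}
```

Comparing the reported max heap against the -Xmx value is also a quick way to see whether AggressiveHeap or the explicit heap settings won out on a given machine.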

Comments
Hmmmm, this project sounds vaguely familiar....
# Posted By Andrew Gscheidle | 7/15/09 10:55 AM
Running these arguments causes my CF server's jrun.exe to flatline the CPUs after approx. 1 hour of running.
# Posted By Joshua | 7/27/09 12:18 AM
@Joshua,

Sorry to hear. Could you give me some details on your server config? I do not recommend implementing the settings I have listed if you are not running CF 8 with JRE 1.6_10 or greater.
# Posted By Strikefish | 7/27/09 11:25 AM
We are running CF 8.1 with JRE 1.6_14 on Wintel platforms with 2 CPUs and 3GB RAM. We have been experimenting with various arguments, as our CF servers seem to be unstable anyway (random restarts, etc.). However, with this jvm config the services never restarted; they just consumed all CPU while requests drip-fed through. There was no indication of a particular request causing this. We are running FusionReactor but suspect we may need YourKit Profiler.
# Posted By Joshua | 7/27/09 5:48 PM
@Joshua, something is off. I'd like to see your config file. Send me your contact info via the Strikefish contact us form. I've seen the behavior you describe, but it is usually due to the MaxPermSize setting.
# Posted By Strikefish | 7/30/09 5:07 PM
Have you found an issue with multi-server instances? I find that if I change the jvm.config to these settings, the instances start fine, but the root services won't start and I get a message about '...refer to service-specific error code 2'.
# Posted By Daria | 2/12/10 3:06 PM
Hi Daria, no, I have not. That usually means there is a config issue with the params. Are you sure they are mirror images?

Any chance you can send me the files? Hit me up on the contact us form on strikefish.com if you like.
# Posted By Strikefish | 2/12/10 4:17 PM

Copyright Strikefish, Inc., 2005. All rights reserved.