jvm.config Tuning Tip For All Server Side Java Solutions

I have been wanting to blog about an experience I had not too long ago on a project where the JVM was consistently throwing OOM errors. It had me banging my head against my desk for a few days trying to track down the culprit.

Was it code related?... yes, was it bad code?... sort of, was it very very very intense code (looping over many SQL calls and instantiating many objects)?... YES! The challenge, given the time constraints, was to apply a quick band-aid rather than redesign the rules engine that had these characteristics. It is important to note I was not responsible for the poorly written code 8-).

I first had to identify whether there were memory leaks due to this code. To do this I used the YourKit Java Profiler. They have .NET and Java profilers that allow you to monitor their respective runtimes. YourKit Profiler is very simple to configure within the jvm.config file (I'm not going to get into that in this blog). My point here is that I witnessed memory steadily climb and at times spike, but it was only reclaimed when I executed a manual GC. UGGG!!! So, no memory leak, but the runtime was hanging on to what it had... Why was the GC not reclaiming memory quickly on its own? I had all the best-practice (BP) JVM args of old in place, ParNewGC and RMI included, to no avail.

And to get back to the initial issue: the blasted OOM error. I searched high and low and found a thread on a Sun forum describing an issue with JRE 1.5, which ultimately threw an OOM error if the runtime was unable to reclaim memory during a GC within a given time frame. The workaround was to set a time constraint in the JVM (which didn't work) or install 1.6_10 or later. Upgrading was my first step. I installed this JRE version and pointed my jvm.config to it. The application ran fine under this JRE, except that I was still seeing the memory creep to the ceiling with no reclaim.

I then read the following paragraph in one of Sun's GC tuning white papers:

The -XX:+AggressiveHeap option inspects the machine resources (size of memory and number of processors) and attempts to set various parameters to be optimal for long-running, memory allocation-intensive jobs. It was originally intended for machines with large amounts of memory and a large number of CPUs, but in the J2SE platform, version 1.4.1 and later it has shown itself to be useful even on four processor machines. With this option the throughput collector (-XX:+UseParallelGC) is used along with adaptive sizing (-XX:+UseAdaptiveSizePolicy). The physical memory on the machines must be at least 256MB before AggressiveHeap can be used. The size of the initial heap is calculated based on the size of the physical memory and attempts to make maximal use of the physical memory for the heap (i.e., the algorithms attempt to use heaps nearly as large as the total physical memory).

Note: -XX:+UseAdaptiveSizePolicy is on by default so I don't explicitly define it in my args.

Amazingly, once I added this to the args and swapped ParNewGC for UseParallelGC, the server ran flawlessly for days and days without a restart. I was serving requests into the millions without a restart!!! A partial arg list specific to these settings is below. Please let me know your thoughts and concerns, as I always enjoy constructive feedback.

java.args=-server -Xmx1024m -Xms1024m -XX:+AggressiveHeap -XX:+UseParallelGC -Dsun.io.useCanonCaches=false -XX:MaxPermSize=512m

Note: This was for a ColdFusion 8 instance.
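If you want to confirm which collectors an arg list actually selected, and watch heap usage without forcing a manual GC, the standard java.lang.management beans can be queried from inside the runtime. The sketch below is only an illustration: the class name HeapReport and its output format are my own invention, and the collector names it prints depend on the JVM version and flags.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of the original setup): reports heap usage
// and the garbage collectors the running JVM actually selected.
final class HeapReport {

    // Returns one line per collector: "<name>: <count> collections, <ms> ms".
    static List<String> collectorLines() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
        return lines;
    }

    // Returns used heap as a percentage of max heap, or -1 if max is undefined.
    static long heapUsedPercent() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long max = heap.getMax();
        return max <= 0 ? -1 : (heap.getUsed() * 100) / max;
    }

    public static void main(String[] args) {
        System.out.println("Heap used: " + heapUsedPercent() + "%");
        for (String line : collectorLines()) {
            System.out.println(line);
        }
    }
}
```

With -XX:+UseParallelGC, HotSpot typically reports collector names such as "PS Scavenge" and "PS MarkSweep", though that varies by JVM version. A heap percentage that only ever climbs between runs of this report is the same symptom I was seeing in the profiler.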

Why VOs (transfer objects) are good...but they can be abused like...

any other design pattern....

Sorry for the confusing title, but long titles are rather lame. So you're a Flex, CF, Java, or PHP developer and you are leveraging all the beautiful one-to-one mapping associated with server and client object creation.

"YES!", you said. No more guesswork; the values returned from my server can be readily passed around within my AS code with the ease of code insight! Ctrl-space... wow, there's my property! OK, getting tacky, I know.

So we embark on the design of a system, always using VOs no matter the cost. Eee gaadd, stop now. VOs, depending on the design approach, may have multiple layers of nested VOs. YIKES!

Everyone knows that Rambo's weapons of choice were the bone cutting hunting knife and explosive bow and arrows. But there were times when he had to pull in the heavy artillery or perform a sneak attack with a much more lightweight approach like a choke hold (ahh, the violence of my youth...).

This is why VOs can be a problem if implemented without understanding the performance ramifications that can be incurred if they are always used.

Here's a real world scenario: requesting an array of 100+ VOs from your middle tier, each of which has nested arrays of child VOs. Imagine just having two child VOs and the impact that could have on performance with this approach.

You call in to pull back the parent VOs that contain an array of child VOs (say 10) that all need to get created for each item in the array. So in this process we are creating 100 parent objects, and internal to each we are creating 10 child VOs. This yields 1,000 child objects on top of the 100 parents, each of which needs to get created, and the memory and processing grow each time you do so on your middle tier (now add just a few users doing this incrementally over the first couple of hours).

I'm being facetious here, of course, as this isn't a very high number. But why return such a dense object to the client unless you are going to use it? There is a lot of wasted horsepower with this approach. Think of the scene with Rambo emptying that M60 E4 machine gun and never hitting his target...

A better approach is to pass back a snapshot of the data directly from your middle tier and pull back its VO representation when an edit needs to be performed or the VO truly is required to facilitate a process in the application.
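To put rough numbers on the difference, here is a small Java sketch. Every class and method name in it is hypothetical and made up for illustration; it compares building every nested VO up front against returning flat snapshot rows and building a single full VO only when one item is opened for editing.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; this illustrates the idea, not a real API.
final class SnapshotVsVo {

    // A child transfer object nested inside a parent VO.
    static final class ChildVO {
        final int id;
        ChildVO(int id) { this.id = id; }
    }

    // The full parent transfer object, with its nested children.
    static final class ParentVO {
        final int id;
        final String label;
        final List<ChildVO> children = new ArrayList<>();
        ParentVO(int id, String label) { this.id = id; this.label = label; }
    }

    // A flat snapshot row holding only what a grid needs to display.
    static final class Row {
        final int id;
        final String label;
        Row(int id, String label) { this.id = id; this.label = label; }
    }

    // Builds a single full VO, e.g. when the user opens one row for editing.
    static ParentVO loadVO(int id, int childrenPerParent) {
        ParentVO vo = new ParentVO(id, "item " + id);
        for (int c = 0; c < childrenPerParent; c++) {
            vo.children.add(new ChildVO(c));
        }
        return vo;
    }

    // Eager approach: build every parent and every nested child up front.
    static List<ParentVO> loadAllVOs(int parents, int childrenPerParent) {
        List<ParentVO> result = new ArrayList<>();
        for (int p = 0; p < parents; p++) {
            result.add(loadVO(p, childrenPerParent));
        }
        return result;
    }

    // Snapshot approach: one flat row per parent, no children.
    static List<Row> loadRows(int parents) {
        List<Row> rows = new ArrayList<>();
        for (int p = 0; p < parents; p++) {
            rows.add(new Row(p, "item " + p));
        }
        return rows;
    }

    // Counts the parent and child objects in a list of VOs.
    static int countObjects(List<ParentVO> vos) {
        int total = 0;
        for (ParentVO vo : vos) {
            total += 1 + vo.children.size();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("Eager VO objects: " + countObjects(loadAllVOs(100, 10)));
        List<Row> rows = loadRows(100);
        ParentVO edited = loadVO(rows.get(42).id, 10);
        System.out.println("Snapshot rows plus one VO: "
                + (rows.size() + 1 + edited.children.size()));
    }
}
```

With 100 parents and 10 children each, the eager path creates 1,100 objects per request, while the snapshot path creates 100 rows plus 11 objects for the one item being edited.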

So if you are going to populate a grid, I don't recommend doing it with VOs. If you absolutely have to create an array of VOs, understand the possible performance impact (and data stagnation) that can ensue if the VO is complex, and how it can affect the health of your server and ultimately the user experience.

The J2EE Core Pattern docs on Transfer Object speak to this. Check out the "Consequences" section on caching large sets of VOs. Line from the article: "There is a trade-off associated with this strategy. Its power and flexibility must be weighed against the performance overhead..." THE LINK

CF 8 WACK ADVANCED GO PREORDER!!! 8-)

The title says it all: this book is a must-have for anyone who works with ColdFusion or is interested in some of the advanced topics associated with the language and the newly released version 8. Having written for this book, I am truly amazed at all the effort and new features Adobe has injected into the language.

It is truly... per Tim Buntel, "The Swiss Army Knife" for any enterprise utilizing web based applications/technologies.

Scorpio.NET

How stoked am I?!? I'm currently installing Scorpio, having been assigned the .NET integration chapter in Ben Forta's new CF 8 Web Application Construction Kit book. I have been working with Visual Studio 2005 for quite a while now and enjoy coding in C#.

There are many benefits to calling into .NET assemblies on the Windows platform. I believe the most important is performance, specifically database integration utilizing ADO.NET (arguable, I know) to connect to, retrieve, and edit data in SQL Server 2005.

Let's face it, when it comes to feature set, comparing Java and .NET is like comparing Macintosh to Red Delicious, but it is just plain common sense that Microsoft plays better with Microsoft, specifically when integrating with the Windows API. This is a great reason for businesses running ColdFusion on Windows to look into the new native .NET integration support in Scorpio. I understand some of you may be saying, "Yeah, we got sold that bag of goods during the COM days," but .NET is a much more stable and robust platform.

There is a real talent consideration here for offices that have ColdFusion, .NET, and Java talent. ColdFusion, still the most RAD app server on the market (MHO), now integrates nicely with all of them, and this will allow companies to build applications at a much faster pace using all of their in-house skill sets. We all know individuals who swear by their chosen technology (i.e. .NET, Java, ColdFusion). If you're working in an office that has all three, you will be able to leverage them when ColdFusion 8 is released.

You've Been Warned (AJAX vulnerable to attack)

A friend just passed me this article about an AJAX vulnerability. It's a must read, as it impacts the following frameworks:

"Vulnerable frameworks include: Microsoft ASP.NET AJAX (aka. Atlas), XAJAX and Google Web Toolkit, Prototype, Script.aculo.us, Dojo, Moo.fx, jQuery, Yahoo! UI, Rico, and MochiKit."

I don't see Spry on the list.

Check it out here.

Session Damage

I have written about the architectural flaws I see in various applications, many focusing on poor database design and poorly written SQL. I felt it necessary to write about an issue I have seen in various web applications that, unbeknownst to the developer/architect, can hinder and ultimately spell disaster for the application and the customer using it.

The title "Session Damage" came about from this very issue. When does storing information in session scope in CF, ASP.NET, JSP, etc. become a problem? I've seen a couple of scenarios that were poor approaches to using session. My first experience with this (not yet having seen the actual code) came when monitoring the JRun service on a UNIX (Solaris) machine: I witnessed 4 megabytes of memory peeling off the server upon every new login. Initially I suspected a memory leak somewhere, given the memory bloat. On the UNIX platform this caused the operating system to dump core and restart the service when memory exceeded set thresholds. This led to customers losing shopping cart/session data left and right. YIKES!!!!

There was one scenario where a customer had picked out $4000 worth of goods and actually took the time to call the support team and have them purchase the items, because he did not have the time or patience to spend another 30 minutes selecting the items all over again... The culprit, once I got a chance to look at the code, was that the application was written in such a way that upon a successful login the system cached much of the database for each user (much of it never actually used). This resulted in the memory bloat. Of course this was all done in an effort to speed performance, but regardless it was a poor approach. I had difficulty explaining what was going on to the CIO because he was unable to grasp the concept and kept saying, "Memory is cheap, just buy more memory". Ouch. I had to explain that a server has a maximum capacity for memory and that this would not fix the problem, just mask it for a while.

So, the system had to be reengineered due to the memory issue and the queries streamlined to speed querying of the session data. The lesson learned here is that it is a poor approach to cache data at session to save .01 seconds of round trip time to get it from the database. An architect or developer must weigh the cost/benefit to the system when looking at this challenge. I always recommend tweaking the db so that the query search yields a timely response.

Another scenario I witnessed recently was the use of session caching for search results. These were very large datasets getting cached at the user level. It caused the JRun service to bloat to 600 megabytes in no time at all, even with only a dozen or so active sessions on the server. There are times when caching search info is pertinent, but rather than caching the entire result set it might be a better approach to cache only the search results' unique identifiers (array or comma-delimited list) and go back to the database to pull the details when needed. The reason the developers built the system this way was to facilitate pagination. The solution was to stop caching and go "round trip" to the db for this process. The performance impact was slight (a .02 millisecond difference), but it is my opinion that even if there had been a 1 to 2 second difference, the user would not have found issue with the search, considering the big picture: a stabilized server and an application that no longer required a restart during peak usage due to unresponsiveness. Isn't that what we all look for in an application? One that is written once and never requires intervention? 8-)
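The "cache only the identifiers" idea can be sketched in a few lines of Java. This is a rough illustration under my own assumptions, not the code from that system: the class name SearchResultPager and its methods are hypothetical, and the only thing held per session is a list of integer IDs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: class and method names are mine, not from any framework.
final class SearchResultPager {

    private final List<Integer> ids;   // the only search data kept in session
    private final int pageSize;

    SearchResultPager(List<Integer> ids, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        this.ids = new ArrayList<>(ids);
        this.pageSize = pageSize;
    }

    // Number of pages needed to show every ID.
    int pageCount() {
        return (ids.size() + pageSize - 1) / pageSize;
    }

    // Returns the IDs for a 1-based page number, or an empty list if out of range.
    List<Integer> idsForPage(int page) {
        if (page < 1 || page > pageCount()) {
            return Collections.emptyList();
        }
        int from = (page - 1) * pageSize;
        int to = Math.min(from + pageSize, ids.size());
        return new ArrayList<>(ids.subList(from, to));
    }

    public static void main(String[] args) {
        List<Integer> found = new ArrayList<>();
        for (int i = 1; i <= 45; i++) {
            found.add(i * 10);
        }
        SearchResultPager pager = new SearchResultPager(found, 20);
        System.out.println(pager.pageCount() + " pages; page 3 = " + pager.idsForPage(3));
    }
}
```

The IDs returned for a page would then be handed to a query that pulls just those rows' details from the database, so the session holds a handful of integers per result instead of the full result set.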

ColdFusion Timeout

I was recently consulting with a client who was experiencing timeout issues due to long running requests. This issue had plagued their site for quite a while. My approach to solving their long running requests was to jump into the CF logs to find out what might be the problem.

The challenge was that it was a legacy Fusebox site, and because of the architecture of this version of Fusebox every error pointed to index.cfm. The logs were somewhat helpful with regard to the timeouts, showing at a high level which application on this multi-application server the errors were coming from. Noticing that it was a particular application, I challenged the development team to wrap all cfquery/cfstoredproc requests in a cftry/cfcatch. As is the case with many CF developers (including myself when I was starting out), it is easy to take for granted that the database is always going to be stable and that ColdFusion's very straightforward database queries will execute with no problem.

This is a huge mistake in application development no matter what technology you are working with. In any application code (CF, Java, C#, VB, etc.), anytime the application has to go outside of itself to call a database, web service, shared API, etc., that code must be wrapped in a try/catch because it can fail for multiple reasons.

Back to the CF issue... I recommended that the client update the queries with a timeout setting (no timeout for cfstoredproc, wonder why?), forcing any long running query to fail and thereby identify itself. Because CF throws a timeout error, the cfcatch could log a very detailed description from the cfcatch scope, including the actual page and line of code where the failure happened. Once this code was in place, they could then put a cfthrow tag at the end of the catch to bubble the error up to a global error handler in Application.cfc, so that it would be handled gracefully by displaying a friendly error page.
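The same pattern (time limit, catch, log the location, rethrow to a global handler) applies outside of CF too. Here is a rough Java analogue, not the client's CFML: the GuardedCall class, the ExternalCallException type, and the page/line labels are all made up for illustration, and a sleep stands in for the long running query.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch: GuardedCall and its members are illustrative only.
final class GuardedCall {

    // Thrown upward so a global handler can show the friendly error page.
    static final class ExternalCallException extends RuntimeException {
        ExternalCallException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Runs an external call with a time limit. On timeout or failure it logs
    // where the call was made (the "location" label) and rethrows.
    static <T> T run(String location, long timeoutMillis, Callable<T> call) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<T> future = executor.submit(call);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true);  // interrupt the long running call
            throw logged(location + " timed out after " + timeoutMillis + " ms", e);
        } catch (ExecutionException e) {
            throw logged(location + " failed: " + e.getCause(), e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw logged(location + " was interrupted", e);
        } finally {
            executor.shutdownNow();
        }
    }

    // Writes the detail to the error log, then returns the exception to throw.
    private static ExternalCallException logged(String detail, Throwable cause) {
        System.err.println("[external-call] " + detail);
        return new ExternalCallException(detail, cause);
    }

    public static void main(String[] args) {
        // The page names and line numbers below are invented labels.
        System.out.println(run("orders/list.cfm:42 getOrders", 1000, () -> "3 rows"));
        try {
            run("orders/list.cfm:57 getHistory", 100, () -> {
                Thread.sleep(5000);  // stand-in for a long running query
                return "never returned";
            });
        } catch (ExternalCallException e) {
            System.out.println("Handled globally: " + e.getMessage());
        }
    }
}
```

The point of the location label is the same as writing the page and line from the cfcatch scope: when everything otherwise points at index.cfm, the log line tells you exactly which call was the slow one.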

Sorry if I'm getting wordy... Thoughts tend to flow when blogging and sentences run on... So you may be thinking, "I have a huge application with queries all over the place, where the heck do I start?" If you are a dreamweaverer or cfeclipser or homesiter, you have the tools at your disposal to do a sitewide search. I like Dreamweaver's and HomeSite's search capabilities because you can export the file list in various formats and organize a plan of attack with your team to patch up the application. You'll have your site ready to diagnose itself and in turn be able to stabilize or reengineer the failed module.

Javascript CFDUMP

A JavaScript DUMP! I have had time to catch up on reading some of my favorite blogs. One is by Lucas Sherwood, an ol' Aussie mate of mine and former co-worker (from the Allaire, Macromedia days). Good day Lucas!

Anyhow, his blog entry detailed a JavaScript dump built by NetGrow.au that I think many of you who don't read his blog can benefit from. It mirrors the <cfdump> tag. If you're doing a lot of JavaScript and are sick of using alert, then give it a go. I won't use anything else from here on out.

Lucas' Blog The Bit Bucket

Get the code here: js dump

Why Is My Server So Unstable?

Ever ask that question? It's not necessarily the easiest question to answer if you're a head down coder type. Not that there's anything wrong with head down coder types.

Recently a customer was experiencing server unresponsiveness, yet when they opened Task Manager on the web server they would see the CPU at next to 0 utilization and memory somewhat stable. When this happened, I asked for a bit more information about the web server in question: what was running on it and what database servers it hooked into.

I was told it was running ColdFusion apps, ASP.NET apps, and Reporting Services... UGGGGG!!!!

To be able to see what is happening on the server there are counters that need to be implemented on the various services in question.

ColdFusion version 7 can be monitored with the JRun Metrics and you can find information on it here http://www.bpurcell.org/blog/index.cfm?mode=entry&entry=991 on Brandon Purcell's blog.

ASP.NET can be monitored with a custom counter, an article on this topic is located here http://msdn2.microsoft.com/en-us/library/ms979194.aspx.

These approaches will identify which process is blocking up the web server. The real caveat is that when CPU utilization is low and server memory is stable, everything points to the db or possibly network issues. In most cases a db will be pegged due to long running SQL statements, and it is important to monitor these machines while troubleshooting the cause of the bottleneck.

If the database is the culprit clogging the server, then measures can be taken to find out which currently running SQL statements are causing the problems. A tool I have used in the past that does an excellent job of monitoring sessions and currently executing SQL statements (Oracle) is TOAD with its DBA module. If the user account you are using gives you the ability to run system queries, then this module will show you why the server is doing some heavy lifting. There are similar features in SQL Server; I will post these shortly.

In closing it is important to understand the demands an application may place on server hardware and while keeping that in mind look to incorporate or segregate the application in an existing environment.

C# Spider with HttpWebRequest

I was recently tasked with writing code to send HTTP POST requests to a website that hits a CGI application, which in turn sets a configuration file that sends text streams to be displayed on TVs at the Kennedy Space Center. This component was to be built in support of the Weather Warning Application.

Weather warnings are a big deal at KSC due to the vastness of the center and the many people who work outdoors for a good part of the day. Long story short, the code below details a simple web request call. I actually make an initial call to a login page and retain the cookie to maintain state on all subsequent requests. After this is done I continue hitting the required config HTML form pages, passing in the required form field values via the strPost variable. I hope this code helps anyone struggling with this.

FYI: All params are brought in from web.config.
(Notice the ConfigurationSettings.AppSettings["Key"]; calls)

The values in relation to each key look something like this:
loginId=foo&password=bar

//Requires: using System; using System.Configuration; using System.IO; using System.Net;
public void setTvCrawlDisplay(String tvText, DateTime tvExpiration)
{
   //JFB - Calculate minutes for warning display.    
   DateTime currentDateTime = DateTime.Now;

   //JFB - Difference in days, hours, and minutes.    
   TimeSpan tvTimeSpan = tvExpiration - currentDateTime;
      
   //JFB - Difference in minutes.    
   int differenceInMinutes = (int) tvTimeSpan.TotalMinutes;   
   
   String url = ConfigurationSettings.AppSettings["URL"];
   String strPost = ConfigurationSettings.AppSettings["LoginPassword"];
   StreamWriter myWriter = null;
   CookieContainer myContainer = new CookieContainer();
   
   //Request #1 (the login)    
   HttpWebRequest objRequest = (HttpWebRequest)WebRequest.Create(url);
   objRequest.Method = "POST";
   objRequest.ContentLength = strPost.Length;
   objRequest.ContentType = "application/x-www-form-urlencoded";         
   objRequest.CookieContainer = new CookieContainer();

   try
   {
      myWriter = new StreamWriter(objRequest.GetRequestStream());
      myWriter.Write(strPost);
   }
   catch (Exception e)
   {
      Console.WriteLine(e.Message);
   }
   finally
   {
      //Guard against a null writer if GetRequestStream() threw.
      if (myWriter != null) myWriter.Close();
   }
      
   HttpWebResponse objResponse = (HttpWebResponse)objRequest.GetResponse();
   //retain the cookies    
   foreach (Cookie cook in objResponse.Cookies)
   {
      myContainer.Add(cook);
   }
   
   //Check out the html.    
   using (StreamReader sr =
          new StreamReader(objResponse.GetResponseStream()) )
   {
      String test = sr.ReadToEnd();

      // Close and clean up the StreamReader       
      sr.Close();
   }
   
   //Request #2 (select the proper submenu)    
   strPost = ConfigurationSettings.AppSettings["EditMenu"];
   objRequest = (HttpWebRequest)WebRequest.Create(url);
   objRequest.Method = "POST";
   objRequest.ContentLength = strPost.Length;
   objRequest.ContentType = "application/x-www-form-urlencoded";         
   objRequest.CookieContainer = myContainer;

   try
   {
      myWriter = new StreamWriter(objRequest.GetRequestStream());
      myWriter.Write(strPost);
   }
   catch (Exception e)
   {
      Console.WriteLine(e.Message);
   }
   finally
   {
      //Guard against a null writer if GetRequestStream() threw.
      if (myWriter != null) myWriter.Close();
   }
      
   objResponse = (HttpWebResponse)objRequest.GetResponse();
   
   //Check out the html.
   
   using (StreamReader sr =
          new StreamReader(objResponse.GetResponseStream()) )
   {
      String test = sr.ReadToEnd();

      // Close and clean up the StreamReader       
      sr.Close();
   }
}


Copyright Strikefish, Inc., 2005. All rights reserved.