DB Design id/name Debate

I have wanted to blog about a simple DB design decision that can make or break coding, readability, maintenance, and ultimately your data warehousing efforts.

It is the "id"/"name" debate. For instance you have a "Status" table. In this table you have a list of status names and their respective surrogate keys in the id field driven by a sequence or identity field (Oracle/Sql Server).

Best practice in this situation is to label your id field status_id or statusID (Oracle/SQL Server). Because Oracle upper-cases identifiers, an underscore is used for readability and has become the standard for Oracle DB naming. SQL Server allows mixed case, which parlays nicely into your object model, since the C-based-syntax world has standardized on camel case for class properties, and it doesn't require aliasing in SQL code.

This very issue became a huge bone of contention while I was on a project with a Microsoft-centric dev working with an Oracle DB for the first time. He had to alias all his SQL because of the underscores, as that made it easier for us when copying and pasting SQL return assignments. It made good sense, and I felt for him as an Oracle newb. This is one thing SQL Server has over Oracle when it comes to readability.

But back to the topic... The reason for prefixing the id field with the table name is readability when joining. Imagine if all tables had generic id/name fields. That becomes confusing rather quickly: now you have to alias at the very least in the table where the foreign key is assigned, and I've seen worse, such as a child (foreign key) table with a column label that doesn't obviously relate to the parent, for instance stat or statpkey... Acckkk! If the parent has statusID and the child table in turn also has statusID, the developer, system architect, and DBA will readily understand the relationship.
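
As a minimal illustration, using hypothetical orders and status tables (not from any particular project), here is how the two conventions read at join time:

    -- Prefixed keys: the join condition documents the relationship on its own.
    SELECT o.order_id, s.status_name
    FROM   orders o
           JOIN status s ON s.status_id = o.status_id;

    -- Generic id/name columns: every field needs aliasing to stay unambiguous,
    -- and the foreign key column tells you nothing about its parent.
    SELECT o.id AS order_id, s.name AS status_name
    FROM   orders o
           JOIN status s ON s.id = o.stat;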

When it comes to the name field, it is my opinion that this field should be prefixed too. Otherwise you run into name being everywhere, and depending on whether you had a good SQL coder or not, you are going to have to alias that field every time you use it, which is a pain when many IDEs out there generate SQL code for you.

Lastly, when it comes to the name field you have to think about whether data mining will be run against this data in the future, hopefully in a data warehouse. When it comes time to build the warehouse from the transactional DB, generating the SQL and DDL to do so is much more straightforward, and flattening the tables is easier, because the originating transactional table remains evident in both the id and name fields. I look forward to your comments.

Flex ColdFusion Extension Wizard Gotcha

This entry will hopefully be brief and to the point in regard to the title. I want to start off by stating how much I like the ColdFusion extension support in Flex 2, and I hope to see Adobe add extensions that support other languages/middle-tier solutions as well (hint: Java, PHP, C#).

There is nothing like beginning your application architecture and development with an out-of-the-box MVC approach that leans on well-known server-side patterns: delegates, services, DAOs, gateways, and value objects.

Where I see a major problem with the extension is in how it is implemented. I have been involved with several projects that have used Flex and ColdFusion, and upon beginning my trek into Flex 2 I saw this issue right away, only because I have been working with ColdFusion for 10 years, have consulted on its benefits, scaled servers, architected highly successful and efficient enterprise solutions with it, and know the do's and don'ts...

The problem I am speaking of with the CF extension implementation is the practice of iterating over calls to the read functions in the data access object CFCs.

To be clearer: in the templated or skeleton DAO CFCs there are DB calls made, mostly CRUD calls. The read function I am discussing here is the "R" in CRUD. The problem is that in many cases I've seen engineers design a composite or aggregate VO, that is, a VO CFC that contains other VO CFCs.

This is a perfectly efficient, best-practice OOP approach that ColdFusion handles very well. The problem arises when a child result set of ids (usually a query or array) is iterated over, calling the read function in the DAO for each id to get the VO properties from the DB and return them via remoting to the Flex application.

The main issue here is performance, specifically the possible hundreds of calls that can be made to the database in one ColdFusion request from the application. Imagine a dump truck going to the gravel pit 200 times, taking one pebble each trip; with today's gas prices I don't think you'd last long on the job with that approach 8-)... This is what you are doing when you loop over queries. Not only can these calls prove slow and taxing on the database, but they can also lock up threads in ColdFusion and create CPU and memory spikes right out of the chute. You also have to deal with any network latency that may exist between your CF server and the database. AAAACKKKKK!

But there is a solution. First, a band-aid needs to be applied. For these complex objects, an investigation must be made to locate where these iterating processes are going on. Second, there is a function in ColdFusion that automagically converts a query column to a comma-delimited value list (or a delimiter of your choosing): valueList(query.column, ','). There is also a quotedValueList() for strings. With this list you can then use a SQL IN clause on the (hopefully indexed) id field to pull back the child VO data. Add another function to the DAO, or better yet the data gateway, named something like readMultiple or getMultipleObjectName...
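
A minimal sketch of that band-aid, with hypothetical table, column, and datasource names; the single IN-clause query replaces the per-id loop:

    <cffunction name="readMultiple" access="public" returntype="query" output="false">
        <cfargument name="idList" type="string" required="true">
        <cfset var qChildren = "">
        <!--- One trip to the database for all child rows instead of one trip per id. --->
        <cfquery name="qChildren" datasource="#variables.dsn#">
            SELECT child_id, child_name, status_id
            FROM   child_table
            WHERE  child_id IN (<cfqueryparam value="#arguments.idList#"
                                              cfsqltype="cf_sql_integer" list="true">)
        </cfquery>
        <cfreturn qChildren>
    </cffunction>

The calling code then becomes something like dao.readMultiple(valueList(qParent.child_id)) instead of a loop of individual read() calls.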

Once this band-aid is in place, the code should truly be refactored to call the database only once for all of its data (think one CF call, one DB call). If you did not know, ColdFusion is one of the most robust languages when it comes to integrating with the more popular DBMSs on the market (Oracle, SQL Server, MySQL, DB2, Informix). The cfstoredproc tag allows you to call a stored procedure and retrieve its result sets.

cfstoredproc also allows a developer to return multiple result sets from a stored procedure to ColdFusion. So, one call to a stored procedure in Oracle can return 2, 3, 4, or more ref cursors to ColdFusion. These could hold all of your objects' data, which only requires you to loop once over each of them (maybe with some nesting and brief logic to create the VOs, but still dramatically faster than a separate DB call on every loop iteration).
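
Here is a sketch of what that single round trip might look like; the procedure, parameter, and result names are hypothetical, and it assumes an Oracle procedure whose OUT parameters are ref cursors:

    <!--- One database call returns all the data needed to build the composite VO. --->
    <cfstoredproc procedure="award_pkg.get_award_detail" datasource="#variables.dsn#">
        <cfprocparam type="in" cfsqltype="cf_sql_integer" value="#arguments.awardId#">
        <cfprocresult name="qAward"      resultset="1">
        <cfprocresult name="qRecipients" resultset="2">
        <cfprocresult name="qCategories" resultset="3">
    </cfstoredproc>

From there, a single pass over each query (with a little nesting logic) builds the parent VO and its child VOs.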

This may be one of the drier entries I've had (no pictures or jokes). But I hope it helps you when using, and not abusing, the ColdFusion extension wizards in Flex 8-).

Universal Mind is Blogging

Check it out here: UM Blog.

UM is made up of a bunch of stellar Adobe and related software IT professionals. They have some incredible information to share with regard to Flex, Flash, AIR, CF, Java, etc., and we can all learn from them. Enjoy!

Session Damage

I have written about the architectural flaws I see in various applications, many focusing on poor database design and poorly written SQL. I felt it necessary to write about an issue I have seen in various web applications that, unbeknownst to the developer or architect, can hinder performance or ultimately spell disaster for the application and the customers using it.

The title "Session Damage" came about from this very issue. When does storing information in session scope in CF, ASP.NET, JSP, etc. become a problem? I've seen a couple of scenarios that were poor approaches to utilizing session. My first experience with this was (not having seen the actual code) when monitoring the JRUN service on a UNIX (Solaris) machine I witnessed 4 megabyte of memory peeling off the server upon every new login. Initially my mind went to the idea that there was a memory leak somewhere due to the memory bloat. On the UNIX platform this caused the operating system to dump core and restart the service when memory exceeded set thresholds. This lead to customers losing shopping cart/session data left and right. YIKES!!!!

There was one scenario where a customer had purchased $4000 worth of goods and actually took the time to call the support team and have them purchase the items, because he did not have the time or patience to spend another 30 minutes selecting the items all over again... The culprit, once I got a chance to look at the code, was that the application was written in such a way that upon a successful login the system cached much of the database for each user (much of it never actually used). This caused the memory bloat. Of course this was all done in an effort to speed performance, but regardless it was a poor approach. I had difficulty explaining what was going on to the CIO, who was unable to grasp the concept and kept saying "Memory is cheap, just buy more memory." Ouch. I had to explain that a server has a maximum memory capacity and that more memory would not fix the problem, just mask it for a while.

So, the system had to be re-engineered because of the memory issue, and the queries streamlined so that fetching the data stayed fast. The lesson learned here is that it is a poor approach to cache data in session just to save .01 seconds of round-trip time to the database. An architect or developer must weigh the cost/benefit to the system when looking at this trade-off. I always recommend tuning the DB so that the query yields a timely response.

Another scenario I witnessed recently was session caching of search results. These were very large datasets being cached per user. It caused the JRun service to bloat to 600 megabytes in no time at all with only a dozen or so active sessions on the server. There are times when caching search info is pertinent, but rather than caching the entire result set, a better approach is to cache only the unique identifiers of the search results (an array or comma-delimited list) and go back to the database to pull the details when needed. The reason the developers built the system this way was to facilitate pagination. The solution was to stop caching and go round trip to the DB for this process. The performance impact was slight (a .02 millisecond difference), but in my opinion even a 1 to 2 second difference would not bother the user, considering the big picture: a stabilized server and an application that no longer required a restart during peak usage due to unresponsiveness. Isn't that what we all look for in an application? One that is written once and never requires intervention? 8-)
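
As a sketch of that id-only approach (the table, column, and variable names here are hypothetical), the session holds just the list of matching ids and each page view pulls only the rows it needs:

    <!--- After running the search, cache only the ids of the matching rows. --->
    <cfset session.searchResultIds = valueList(qSearch.product_id)>

    <!--- On each page view (url.page is assumed), fetch details for just the current page. --->
    <cfset pageSize = 25>
    <cfset startRow = (url.page - 1) * pageSize + 1>
    <cfset endRow   = min(startRow + pageSize - 1, listLen(session.searchResultIds))>
    <cfset pageIds  = "">
    <cfloop from="#startRow#" to="#endRow#" index="i">
        <cfset pageIds = listAppend(pageIds, listGetAt(session.searchResultIds, i))>
    </cfloop>

    <cfquery name="qPage" datasource="#variables.dsn#">
        SELECT product_id, product_name, price
        FROM   products
        WHERE  product_id IN (<cfqueryparam value="#pageIds#"
                                            cfsqltype="cf_sql_integer" list="true">)
    </cfquery>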

Why Is My Server So Unstable?

Ever ask that question? It's not necessarily the easiest question to answer if you're a head-down coder type. Not that there's anything wrong with head-down coder types.

Recently a customer was experiencing server unresponsiveness, yet opening Task Manager on the web server showed CPU utilization next to zero and memory fairly stable. So I asked for a bit more information about the web server in question: what was running on it and what database servers it hooked into.

I was told it was running ColdFusion apps, ASP.NET apps, and Reporting Services... UGGGGG!!!!

To see what is happening on a server like this, counters need to be enabled for the various services in question.

ColdFusion 7 can be monitored with the JRun metrics; you can find information about them on Brandon Purcell's blog here: http://www.bpurcell.org/blog/index.cfm?mode=entry&entry=991.

ASP.NET can be monitored with custom performance counters; an article on this topic is located here: http://msdn2.microsoft.com/en-us/library/ms979194.aspx.

These approaches will identify which process is clogging the web server. The telltale sign in all instances: when CPU utilization is low and server memory is stable, the problem usually points to the database or possibly network issues. In most cases a DB will be pegged by long-running SQL statements, so it is important to monitor those machines while troubleshooting the cause of the bottleneck.

If the database is the culprit clogging the server, then measures can be taken to find out which currently running SQL statements are causing the problems. A tool I have used in the past that does an excellent job of monitoring sessions and currently executing SQL statements (Oracle) is TOAD with its DBA module. If the user account you are using has the ability to run system queries, then this module will show you why the server is doing some heavy lifting. There are similar features in SQL Server; I will post on those shortly.
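
If you would rather query it yourself, a sketch along these lines against Oracle's V$ views (assuming your account can select from them; on 10g and later you can join on sql_id instead) shows which sessions are active and what they are running:

    -- Active sessions and the SQL they are currently executing.
    SELECT s.sid,
           s.username,
           s.status,
           q.sql_text
    FROM   v$session s
           JOIN v$sql q
             ON  q.address    = s.sql_address
             AND q.hash_value = s.sql_hash_value
    WHERE  s.status = 'ACTIVE'
    AND    s.username IS NOT NULL;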

In closing, it is important to understand the demands an application may place on server hardware and, with that in mind, decide whether to incorporate the application into an existing environment or segregate it onto its own.

MySQL ODBC Flex CF Extension Bug

I was working with Peter Ent's demo today (Example of Flex 2 ColdFusion Connectivity).

Peter's demo is excellent, but it falls a bit short on the DB integration. His SQL file works perfectly with MySQL, so I fired up my 4.x version of the service and set up the DB. I had not installed the new jar for ColdFusion JDBC driver support documented here: knowledge base article, and didn't think I had to, as I used ODBC due to the failings of the 3.x drivers installed by default with ColdFusion.

Once the datasource was set up I was able to see it through the RDS Dataview CF extension in Flex Builder. Beauty, right? Wrong. I attempted to go through his walkthrough, and when I got to the part where you right-click on the table to get to the CFC wizard, nothing happened. What's weird is that the first table displayed its fields and the wizard worked. It was only on the subsequent tables in the list that the fields did not show and the CFC wizard did not work.

I figured it must be driver related, because all the other datasources seemed to work fine and they were not ODBC. So I found the MySQL upgrade knowledge base article, and that did the trick. Once I set the datasource up to use the new driver via the "other" option in the driver dropdown, the RDS Dataview showed the fields for all the tables and allowed me to use the CFC wizard.

If you haven't checked out Peter's blog, I highly recommend that you do.

Be A Student of Your Craft

A lessons-learned session is right around the corner as my team cleans up our Flash Forms application prior to formal test.

My reasoning behind the title of this entry is to discuss how to be confident in your skill set as an IT professional and avoid that feeling of "Where am I going, and am I doing the right thing with this solution?" Recently I was questioned by a team member about the database design I architected for my current application, to be released to NASA for administration of their award processes. One might think an application geared toward awards administration would be a straightforward project with minimal requirements. This proved not to be the case. The requirements document for Phase I alone, if printed, could have contributed significantly to the planet's deforestation.

Digging deeper into proper database design and the questions surrounding why I designed the database the way I did, it became readily apparent in the conversation with this team member and fellow coder that he didn't have a full understanding of proper hierarchical approaches to DB design, or of the pitfalls of an all-encompassing table with a type column to differentiate each entry. Not to mention the issues that arise with deletion of data (multiple delete statements instead of cascading) and self-joins with convoluted where clauses and multiple aliases. When designing databases it is important to understand that if entities that seem similar but are of different types do not share the same attributes, so that some columns would sit null where they don't apply, those entities belong in separate tables.

An example of this is organizations and divisions. Proper DB design enforces integrity by creating a table for each of the entities in question. They each have different characteristics beyond simply being a different type, and one belongs to the other (the parent/child relationship is immediately apparent here). For coders who are not comfortable joining tables, I can see the appeal of the type approach; to make retrieving data easier for them, views should be created instead. This all falls in line with the title of this entry: be a student of your craft. In discussing the issue I knew he needed to do some reading on the subject, and that my responses were not going to change his thinking, because he didn't understand all the benefits a properly architected database can provide. At the root of his confusion over the multiple joins needed to retrieve data for a lookup, he didn't see that the type approach would require joins as well to present the hierarchical data, but with a table carrying multiple keys or unique constraints that would hinder editing and deleting of data.
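
A minimal sketch of the separate-entity approach (the names here are hypothetical, not the actual schema): each entity gets its own table, the parent/child relationship and cascading delete are explicit, and a view hides the join for coders who would rather not write it.

    CREATE TABLE organization (
        organization_id   INTEGER      PRIMARY KEY,
        organization_name VARCHAR(100) NOT NULL
    );

    CREATE TABLE division (
        division_id     INTEGER      PRIMARY KEY,
        division_name   VARCHAR(100) NOT NULL,
        organization_id INTEGER      NOT NULL
            REFERENCES organization (organization_id) ON DELETE CASCADE
    );

    -- A view for simple lookups, so less join-savvy coders query one object.
    CREATE VIEW organization_division AS
    SELECT o.organization_id, o.organization_name,
           d.division_id, d.division_name
    FROM   organization o
           JOIN division d ON d.organization_id = o.organization_id;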

There is something to be said for experience here, but experience can be compensated for by becoming a student of your craft. We've all been there when it comes to the "Is there a better way?" question when venturing into undiscovered territory. The only way to compensate for this uncertainty is to research and educate oneself on the issue in question.

Practical Web Database Design

Throughout my travels as a consultant and working with various firms I continually come across architects and developers who do an excellent job in their approach to front end design as well as the business objects that hook their front end submissions and retrieval of data to and from a database. However, I see improper database design often, which can give any front end or middle tier technology a bad name.

On a recent project one of my team members, who had worked with databases in the past but was a bit rusty, expressed an interest in designing them and growing out of his business analyst role. I forwarded him many notes and best practices I had archived over the years, but felt that he needed a resource that gave him a better picture of how to design a database for performance and flexibility.

I purchased a book for him titled Practical Web Database Design. The book can be purchased on Amazon here. It backs up every approach I take to designing a database and is a very easy read. It discusses and resolves almost every issue you could come across in your attempts to support a business requirement through your database. It is a must-have for your library.

Time for MySQL

If you have not given MySQL a look, it is time. It has some excellent features and now supports stored procedures. The administrator provided here and the query analyzer provided here are excellent GUI utilities for getting familiar with designing and coding against a MySQL database. What's so great about MySQL? To start, all of this is free to download (ver. 4.1). Check it out; you'll have the power to develop a database in minutes. I have worked with the big boys... Oracle, SQL Server, Sybase... and MySQL has now arrived and is an excellent solution as well. If you have comments or concerns about how MySQL stacks up against some of these DBs, I'd like to hear them. Thanks.

Oracle 9i (New Features)

For those of you who may not be aware of the benefits of upgrading to 9i, here are a couple of features passed on to me by a co-worker (thanks, Ryan).

New features in Oracle 9i

This is a technical article that Ryan came across.



