JMX and Java Application Server Instrumentation:
A few years ago I was watching a History Channel program on the
Three Mile Island reactor disaster, and I was struck by how similar the experience
in that control room was to what happens in the average server room when applications go south.
A nuclear meltdown and Java server technology wouldn't normally appear to have all that much in common
(other than that we're warned by Sun not to run a nuclear reactor with Java software).
I think there's at least one key similarity: poor instrumentation. Once in production, most
Java applications (at least the ones I've seen) are about as opaque as the Three Mile Island reactor board was.
Though there were equipment failures at Three Mile Island, they were dramatically compounded by poor instrumentation
that couldn't tell the operators what was really wrong. The same routinely happens with Java server applications. There will
be bugs, that goes without saying, but the problems that ensue will be compounded by poor instrumentation and
an overall lack of visibility.
It's difficult to truly know what's going on inside your J2EE application once it's up and running.
Logging only does so much, and it generates large files that someone has to wade through, a thankless task even
with good tools like Chainsaw.
There are certainly instrumentation standards in the Java world, JMX
in particular. The problem I see is implementation: open source products like JBoss
and Tomcat (as well as the expensive proprietary app servers) generally expose a massive amount of detail.
There are MBeans everywhere. Some of them are quite useful, but most probably won't help you when your app is down or, far worse, running slowly.
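To get a feel for how much detail is exposed, you can ask an MBean server what it has registered. The following is only a rough sketch: it uses the standard `java.lang.management` and `javax.management` APIs against the JVM's own platform MBean server, and the class name `MBeanCensus` and the method `countByDomain` are made up for illustration. Inside JBoss or Tomcat you would expect many more domains and far higher counts than a bare JVM shows.

```java
import java.lang.management.ManagementFactory;
import java.util.Map;
import java.util.TreeMap;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical helper class, not part of any server or library.
public class MBeanCensus {

    /** Counts the MBeans registered in the given server, grouped by JMX domain. */
    static Map<String, Integer> countByDomain(MBeanServer server) {
        Map<String, Integer> counts = new TreeMap<>();
        // queryNames(null, null) returns the ObjectName of every registered MBean.
        for (ObjectName name : server.queryNames(null, null)) {
            counts.merge(name.getDomain(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // The platform MBean server is the one the JVM itself populates
        // (memory, threads, class loading, and so on).
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        countByDomain(server).forEach(
                (domain, count) -> System.out.println(domain + ": " + count));
    }
}
```

A list like this shows how much is there, not which handful of values matter when the application is misbehaving.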
Producing meaningful runtime instrumentation is challenging. You need to:
- Understand the available instrumentation technology
- Understand your application's problem domain and its particular run-time profile
- Understand the J2EE server you're deploying to and what its built-in facilities are
- Perhaps most importantly, know which run-time metrics you actually need to see
Complex requirements, certainly. In the past I've sought to divide and conquer this problem in two ways:
- Short term monitoring
- Long term monitoring
The primary goal of short term monitoring is triage: diagnose what's gone wrong and fix it quickly. In this sort of environment
you want a quick monitoring and dashboard style tool like MC4J, along
with logging tools and other assorted analysis tools for reviewing the application and server environment.
Long term monitoring is a different animal altogether, and I think the news here is good: there are some excellent commercial
applications that have become stronger over the last several years as the standards have matured
(like the ManageEngine products).
However, with a little extra effort you can make your application much less opaque, and your life easier after
deployment, by adding your own application instrumentation with JMX.
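As a taste of what that extra effort looks like, here is a minimal sketch of a standard MBean that exposes a couple of application-level counters. Everything specific in it is an assumption made for illustration: the `OrderStats` class, its counters, the `register` helper, and the `com.example.app:type=OrderStats` object name are all hypothetical, and a real application would pick metrics that match its own run-time profile. The JMX calls themselves (`ManagementFactory.getPlatformMBeanServer`, `registerMBean`, `getAttribute`) are the standard ones.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical example; none of these names come from a real application.
public class OrderStatsDemo {

    // Standard MBean convention: the management interface is named <ImplClass>MBean.
    public interface OrderStatsMBean {
        long getOrdersProcessed();   // exposed as read-only attribute "OrdersProcessed"
        long getOrdersFailed();      // exposed as read-only attribute "OrdersFailed"
        void reset();                // exposed as an operation
    }

    public static class OrderStats implements OrderStatsMBean {
        private final AtomicLong processed = new AtomicLong();
        private final AtomicLong failed = new AtomicLong();

        // Called by application code; not part of the management interface.
        public void recordSuccess() { processed.incrementAndGet(); }
        public void recordFailure() { failed.incrementAndGet(); }

        @Override public long getOrdersProcessed() { return processed.get(); }
        @Override public long getOrdersFailed() { return failed.get(); }
        @Override public void reset() { processed.set(0); failed.set(0); }
    }

    /** Registers the bean with the platform MBean server under the given name. */
    static ObjectName register(OrderStats stats, String objectName) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName(objectName);
        server.registerMBean(stats, name);
        return name;
    }

    public static void main(String[] args) throws Exception {
        OrderStats stats = new OrderStats();
        ObjectName name = register(stats, "com.example.app:type=OrderStats");

        // Simulate some application activity.
        stats.recordSuccess();
        stats.recordSuccess();
        stats.recordFailure();

        // Read the values back the way a management console would.
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        System.out.println("OrdersProcessed = " + server.getAttribute(name, "OrdersProcessed"));
        System.out.println("OrdersFailed = " + server.getAttribute(name, "OrdersFailed"));
    }
}
```

Once a bean like this is registered, any JMX console connected to the server can read the counters and invoke `reset` without touching the logs.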