As an experienced WebSphere administrator, I know the pain of debugging cryptic application issues in complex environments. But through years of troubleshooting, I've learned to become best friends with heap dumps, java cores and system dumps. These three artifacts are like x-rays that let you peer into production systems to understand what's happening under the hood.
In this post, let's dive into the proper use and analysis of heap dumps, java cores, and system dumps in WebSphere. I'll share the top troubleshooting tips I've picked up over the years. By the end, you'll be equipped to use dumps to speed up diagnosing even the trickiest issues.
A Quick Refresher on the Three Amigos
First, let's quickly recap what each dump type provides:
Heap dumps – A snapshot of the heap contents and all application objects. Critical for finding memory leaks, high utilization, and garbage collection issues.
Java cores – The stack traces and thread details at a point in time. Your tool for optimizing performance and analyzing hangs or deadlocks.
System dumps – A full image of the JVM process memory (an operating system core file). Tools can extract the Java heap, threads, and class loaders from it, along with native memory that the other two dumps do not capture. Great for a big picture view.
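To make the three dump types concrete, here is a minimal sketch of requesting each one from a running server through wsadmin (started with `-lang jython`). It assumes the server's JVM MBean exposes the operations `generateHeapDump`, `dumpThreads` (which writes a java core) and `generateSystemDump`; check those names against your WebSphere version. The helper `trigger_dumps` and the server name `server1` are hypothetical.

```python
def trigger_dumps(admin_control, server_name, kinds=("heap", "javacore", "system")):
    """Ask a server's JVM MBean to produce the requested dump types.

    admin_control is wsadmin's AdminControl object (passed in so the
    function can also be exercised with a stub outside wsadmin).
    """
    # Assumed JVM MBean operation names; verify them for your version.
    operations = {
        "heap": "generateHeapDump",
        "javacore": "dumpThreads",
        "system": "generateSystemDump",
    }
    jvm = admin_control.completeObjectName("type=JVM,process=%s,*" % server_name)
    if not jvm:
        raise ValueError("no running JVM MBean found for " + server_name)
    results = {}
    for kind in kinds:
        results[kind] = admin_control.invoke(jvm, operations[kind])
    return results
```

Inside a wsadmin session this would be called as `trigger_dumps(AdminControl, "server1")`. Passing `kinds=("javacore",)` limits the request to a java core, which is usually the cheapest of the three to take on a busy server.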
Now let's explore when and how to use each one for next-level troubleshooting…
Advanced Heap Dump Analysis
The heap is like the beating heart of a Java application, so heap dumps often provide the most valuable clues to resolve memory problems.
For example, I was recently helping a client diagnose periodic OutOfMemoryErrors in their WebSphere cluster. First I captured heap dumps from all the affected JVMs during an OOM using the techniques in this post.
Loading the dumps into Eclipse MAT revealed that objects from a third-party library were accumulating over time until the available heap was exhausted. The smoking gun was thousands of instances of the suspect classes still held in memory.
Without the heap snapshots, it could have taken days or weeks to pinpoint the root cause via code reviews or debugging.
3 Pro Tips for Heap Dump Analysis:
1. Compare heap dumps over time – This can clearly show increasing instances of objects indicating a leak. Always capture a baseline dump from a freshly started JVM for comparison.
2. Look at object referrers and GC roots – This shows what is holding references to objects, preventing garbage collection.
3. Filter by class name or package – Quickly isolate usage of a specific library or portion of code.
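Tips 1 and 3 can be sketched together. The example below assumes you have already exported per-class instance counts from two dumps (for instance from a histogram view in your analysis tool) into plain dictionaries. The function `histogram_growth` and the class names are hypothetical, chosen only for illustration.

```python
def histogram_growth(baseline, later, package_prefix="", top=10):
    """Compare two class histograms (class name -> instance count).

    Returns (class_name, baseline_count, later_count, delta) rows for
    classes that grew, largest growth first, optionally limited to one
    package prefix.
    """
    rows = []
    for class_name, later_count in later.items():
        if not class_name.startswith(package_prefix):
            continue
        base_count = baseline.get(class_name, 0)
        delta = later_count - base_count
        if delta > 0:
            rows.append((class_name, base_count, later_count, delta))
    # Sort by growth (descending), then by name for a stable order.
    rows.sort(key=lambda row: (-row[3], row[0]))
    return rows[:top]


# Illustrative counts only, not taken from a real dump.
baseline = {"com.vendor.cache.Entry": 1200, "com.vendor.cache.Key": 1200,
            "java.lang.String": 50000}
later = {"com.vendor.cache.Entry": 48000, "com.vendor.cache.Key": 47500,
         "java.lang.String": 51000, "com.app.Order": 300}

for row in histogram_growth(baseline, later, package_prefix="com.vendor."):
    print("%-28s %8d -> %8d (+%d)" % row)
```

A class whose count keeps climbing across several dumps taken hours apart is a much stronger leak signal than a large count in a single dump.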
With these methods, you can track down even elusive memory issues from heap dumps.
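The idea behind tip 2, following referrers back to a GC root, is easier to see on a toy object graph. This is only a sketch of the concept, not how any particular analysis tool implements it; the object names and the helper `path_from_gc_root` are made up for illustration.

```python
from collections import deque


def path_from_gc_root(references, gc_roots, target):
    """Breadth-first search from the GC roots.

    references maps each object to the objects it refers to. Returns the
    shortest reference chain from a root to target, or None when target
    is unreachable (and therefore collectable).
    """
    queue = deque((root, [root]) for root in gc_roots)
    seen = set(gc_roots)
    while queue:
        obj, path = queue.popleft()
        if obj == target:
            return path
        for child in references.get(obj, ()):
            if child not in seen:
                seen.add(child)
                queue.append((child, path + [child]))
    return None


# Toy graph: a thread keeps an application context alive, which keeps a
# vendor cache alive, which holds the suspect entry.
references = {
    "Thread:main": ["AppContext"],
    "AppContext": ["VendorCache", "SessionManager"],
    "VendorCache": ["HashMap#1"],
    "HashMap#1": ["CacheEntry#42"],
}

print(" -> ".join(path_from_gc_root(references, ["Thread:main"], "CacheEntry#42")))
```

The chain it prints is the list of holders to question in turn: the first one that should have let go of its reference is usually where the fix belongs.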
The Power of Java Cores for Thread Debugging
Java core files contain the thread, lock, and stack trace information you need for investigating thread starvation, deadlocks, and performance issues.
Here's an example from a recent troubleshooting war story…
One of my clients was experiencing intermittent hangs in their WebSphere cluster under load. First I worked with the app owners to identify that the hangs seemed tied to database queries.
I configured an automatic java core generation policy on timeout events. Soon WebSphere wrote a java core during one of the hangs. Opening it in the IBM Thread and Monitor Dump Analyzer revealed a smoking gun – a thread pool for database connections was completely blocked waiting on a monitor lock!
From this vital clue, we were able to identify a synchronization bottleneck in the application around handling DB connections. Bingo!
2 Pro Java Core Analysis Tips:
1. Inspect thread states and stack traces – Quickly spot threads blocked on monitors, waiting on I/O, or sitting idle. Look for many threads sharing the same state or the same top stack frames.
2. Filter and search threads – Isolate threads from a specific component or library to find culprits.
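Both tips can be roughed out with a few lines of scripting before you even open a GUI tool. The sketch below assumes the java core uses the usual `3XMTHREADINFO` and `3XMTHREADBLOCK` line shapes; the layout differs between JVM releases, so treat the regular expressions as a starting point. The sample lines, thread names and the helper `summarize_javacore` are illustrative, not taken from a real dump.

```python
import re
from collections import Counter, defaultdict

# Assumed line shapes; adjust if your JVM level formats them differently.
THREAD_RE = re.compile(r'^3XMTHREADINFO\s+"(?P<name>[^"]*)".*\bstate:(?P<state>\w+)')
BLOCK_RE = re.compile(
    r'^3XMTHREADBLOCK\s+Blocked on:\s+(?P<monitor>\S+)\s+Owned by:\s+"(?P<owner>[^"]*)"'
)


def summarize_javacore(lines):
    """Count thread states and group blocked threads by (monitor, owner)."""
    states = Counter()
    blocked_by = defaultdict(list)
    current = None
    for line in lines:
        match = THREAD_RE.match(line)
        if match:
            current = match.group("name")
            states[match.group("state")] += 1
            continue
        match = BLOCK_RE.match(line)
        if match and current is not None:
            key = (match.group("monitor"), match.group("owner"))
            blocked_by[key].append(current)
    return states, dict(blocked_by)


SAMPLE = [
    '3XMTHREADINFO      "WebContainer : 0" J9VMThread:0x0000000001A2B300, j9thread_t:0x00007F0000001000, java/lang/Thread:0x00000000E0000100, state:R, prio=5',
    '4XESTACKTRACE                at com/example/db/ConnectionPool.borrow(ConnectionPool.java:88)',
    '3XMTHREADINFO      "WebContainer : 1" J9VMThread:0x0000000001A2B400, j9thread_t:0x00007F0000002000, java/lang/Thread:0x00000000E0000200, state:B, prio=5',
    '3XMTHREADBLOCK     Blocked on: com/example/db/ConnectionPool@0x00000000E0001234 Owned by: "WebContainer : 0" (J9VMThread:0x0000000001A2B300, java/lang/Thread:0x00000000E0000100)',
    '3XMTHREADINFO      "WebContainer : 2" J9VMThread:0x0000000001A2B500, j9thread_t:0x00007F0000003000, java/lang/Thread:0x00000000E0000300, state:B, prio=5',
    '3XMTHREADBLOCK     Blocked on: com/example/db/ConnectionPool@0x00000000E0001234 Owned by: "WebContainer : 0" (J9VMThread:0x0000000001A2B300, java/lang/Thread:0x00000000E0000100)',
    '3XMTHREADINFO      "Deferrable Alarm : 0" J9VMThread:0x0000000001A2B600, j9thread_t:0x00007F0000004000, java/lang/Thread:0x00000000E0000400, state:CW, prio=5',
]

states, blocked_by = summarize_javacore(SAMPLE)
print(dict(states))
for (monitor, owner), waiters in blocked_by.items():
    print("%d thread(s) blocked on %s held by %s" % (len(waiters), monitor, owner))
```

When most of a pool shows up under a single monitor and owner, the owner's stack trace is the first thing to read.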
Follow these threads 😉 and java cores will unravel all kinds of performance mysteries.
Get the Full Picture with System Dumps
Heap dumps and java cores are great for drilling into specific subsystems like memory and threads. But sometimes you need the 30,000-foot view of everything happening under the hood.
System dumps capture the entire JVM process image at one instant, so a single artifact holds the Java heap, all threads, class loaders, and native memory, plus whatever environment and configuration state was loaded in the process at the time.
Just last week, I utilized a system dump to resolve a server crash during deployment. The comprehensive dump had all the details I needed to correlate the sequence of events across threads, logs, apps, and configurations leading up to the crash.
Turns out a bad config change had violated an app dependency, causing things to detonate on deployment. Without the system dump, this could have taken days to uncover through log mining.
My Top 2 System Dump Analysis Tips:
1. Use search to find specific logs, configs or keywords – Scan across all persisted contents quickly.
2. Correlate findings with other data like heap dumps – Line up threads, memory, and configs to build a full timeline.
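A first step toward that kind of correlation is simply ordering the artifacts you have. The sketch below assumes the dump files kept the JVM's default naming pattern of type, date, time, process ID and sequence number (for example `javacore.20240115.103045.4242.0001.txt`); if your `-Xdump` settings rename files, the pattern needs adjusting. The file names and the helper `dump_timeline` are hypothetical.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumed default naming: <kind>.<YYYYMMDD>.<HHMMSS>.<pid>.<seq>.<ext>
DUMP_NAME_RE = re.compile(
    r"^(?P<kind>javacore|heapdump|core)\.(?P<date>\d{8})\.(?P<time>\d{6})"
    r"\.(?P<pid>\d+)\.(?P<seq>\d{4})\.(?:txt|phd|dmp)$"
)


def dump_timeline(filenames):
    """Group dump files by process ID and sort each group chronologically.

    Returns {pid: [(taken_at, seq, kind, filename), ...]}; names that do
    not match the assumed pattern are ignored.
    """
    by_pid = defaultdict(list)
    for name in filenames:
        match = DUMP_NAME_RE.match(name)
        if not match:
            continue
        taken_at = datetime.strptime(match.group("date") + match.group("time"),
                                     "%Y%m%d%H%M%S")
        by_pid[int(match.group("pid"))].append(
            (taken_at, int(match.group("seq")), match.group("kind"), name)
        )
    return {pid: sorted(entries) for pid, entries in by_pid.items()}


files = [
    "heapdump.20240115.103102.4242.0002.phd",
    "javacore.20240115.103045.4242.0001.txt",
    "core.20240115.103110.4242.0003.dmp",
    "javacore.20240115.103050.5150.0001.txt",
    "SystemOut.log",
]

for pid, entries in sorted(dump_timeline(files).items()):
    for taken_at, seq, kind, name in entries:
        print(pid, taken_at.isoformat(), kind, name)
```

Grouping by process ID matters in a cluster, where dumps from several JVMs often end up in the same collection directory.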
System dumps really shine for finding those baffling system-wide glitches.
Closing Thoughts
I hope these real-world examples and pro tips have shown how invaluable heap dumps, java cores, and system dumps can be for troubleshooting. Few other techniques show you so quickly what is happening at a given moment inside complex, distributed systems like WebSphere environments.
Next time you encounter a nasty application or performance issue, follow the guidance here to capture the right diagnostic data. Then leverage the power of tools like Eclipse MAT, IBM TMDA, and PSA to dissect the dumps and unravel the root causes! Let me know if you have any other questions!