Heap-dump analysis for AI agents

deepheap analyzes Java HPROF heap dumps, thread dumps, and GC logs through the Model Context Protocol (MCP). AI clients query classes, retained sizes, GC roots, dominator paths, off-heap memory, and threads. Every result comes from deterministic heap operations, not LLM guesses.

Out-of-core engine

HPROF parser designed for heap dumps much larger than available RAM. The full file is never loaded into memory; analysis structures scale with dump complexity rather than file size.
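
To make the out-of-core idea concrete, here is a minimal sketch of scanning an HPROF file through a memory map. It is an illustration only, not deepheap's parser: the function name `scan_hprof_records` is hypothetical, and the sketch assumes the standard HPROF layout (a NUL-terminated version string, a 4-byte identifier size, an 8-byte timestamp, then records with a 1-byte tag, 4-byte time offset, and 4-byte body length).

```python
import mmap
import struct
from collections import Counter

# A few top-level HPROF record tags; anything else is reported by its hex value.
TAG_NAMES = {
    0x01: "UTF8",
    0x02: "LOAD_CLASS",
    0x0C: "HEAP_DUMP",
    0x1C: "HEAP_DUMP_SEGMENT",
    0x2C: "HEAP_DUMP_END",
}


def scan_hprof_records(path):
    """Count top-level records by tag without reading the file into memory."""
    counts = Counter()
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        # Header: NUL-terminated version string, u4 identifier size, u8 timestamp.
        header_end = m.find(b"\x00") + 1
        version = m[:header_end - 1].decode("ascii")
        (id_size,) = struct.unpack_from(">I", m, header_end)
        pos = header_end + 4 + 8
        # Each record: u1 tag, u4 time offset, u4 body length, then the body.
        while pos + 9 <= len(m):
            tag, _time, length = struct.unpack_from(">BII", m, pos)
            counts[TAG_NAMES.get(tag, hex(tag))] += 1
            pos += 9 + length  # skip the body; untouched pages are never read
    return version, id_size, counts
```

Because the loop only touches the 9-byte record headers, the operating system pages in a small fraction of the file, which is the property that lets this style of parser handle dumps larger than RAM.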

MCP-native

Every analysis is an MCP tool call. Works with Claude Desktop, Claude Code, Cursor, VS Code, or any MCP-capable client. No GUI, no IDE plugin, no proprietary query language.
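
For readers new to MCP, a tool call is a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds one such message. The tool name `heap_dump__get_classes` comes from this page; the argument names `sort_by` and `limit` are hypothetical placeholders, since the real parameter schema is defined by the server.

```python
import json


def make_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request for MCP's tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# The argument names below are assumptions for illustration only.
request = make_tool_call(
    1,
    "heap_dump__get_classes",
    {"sort_by": "retained_size", "limit": 10},
)
wire_message = json.dumps(request)
```

In practice the MCP client builds and sends these messages for you; the sketch only shows what travels over stdio or HTTP.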

Deterministic, then interpreted

deepheap returns structured facts: class counts, retained sizes, GC root paths, object fields. The AI interprets them in context. The engine does not hallucinate heap data.
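
As an example of what "deterministic" means here, retained size has an exact definition: the bytes that would be freed if an object were collected. The sketch below computes it directly from that definition on a toy graph. It is a naive illustration, not deepheap's algorithm; production analyzers typically use a dominator tree to get the same numbers efficiently. All object names and sizes are made up.

```python
def reachable(graph, roots, removed=None):
    """Return the set of objects reachable from the GC roots, skipping `removed`."""
    seen = set()
    stack = [r for r in roots if r != removed]
    while stack:
        obj = stack.pop()
        if obj in seen:
            continue
        seen.add(obj)
        stack.extend(ref for ref in graph.get(obj, ()) if ref != removed)
    return seen


def retained_size(graph, roots, shallow, obj):
    """Bytes freed if `obj` were collected: everything reachable only through it."""
    alive = reachable(graph, roots)
    alive_without = reachable(graph, roots, removed=obj)
    return sum(shallow[o] for o in alive - alive_without)


# Toy heap: object -> objects it references. Names and sizes are illustrative.
graph = {
    "registry": ["map"],
    "map": ["snapshot1", "snapshot2"],
    "snapshot1": ["shared_tag"],
    "snapshot2": ["shared_tag"],
    "cache": ["shared_tag"],
}
shallow = {"registry": 24, "map": 64, "snapshot1": 40, "snapshot2": 40,
           "shared_tag": 16, "cache": 32}
roots = ["registry", "cache"]
```

Here `retained_size(graph, roots, shallow, "map")` counts the map and both snapshots but not `shared_tag`, because the cache keeps that object alive independently. The same input always yields the same number, which is the property the AI relies on when it interprets the result.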

From question to root cause

Connect your AI client, load a dump, and ask. The AI picks the deepheap tools to call, deepheap returns structured data, and the AI synthesizes the answer.

deepheap · Claude Code

User: Why is old gen growing between deployments? Here's the dump from this morning.

  • heap_dump__read: 3.7 GB heap, 42 M objects, 184 classes
  • heap_dump__get_classes: sorted by retained size
  • heap_dump__get_instances: top instances of MeterSnapshot (184 K, 2.1 GB retained)
  • heap_dump__get_gc_root_path: root → ApplicationContext.registry → ConcurrentHashMap → MeterSnapshot

AI: The top retainer is a ConcurrentHashMap field on a MetricsRegistry singleton, holding 2.1 GB across 184,000 MeterSnapshot entries. The GC root path confirms the registry is reachable through a static field on ApplicationContext, so it is never released. The map has no eviction policy, so meter objects accumulate as new endpoints are hit. Configure a meter filter to limit cardinality, or switch to a bounded registry implementation.

Illustrative transcript. Class names are generic; actual tool results vary by dump.

Things you can ask

  • "What is retaining the most memory?"
  • "Why is this object still alive?"
  • "Are there any ClassLoader leaks?"
  • "Show me duplicate strings causing waste."
  • "Which threads are blocked or waiting?"
  • "What off-heap memory is allocated via DirectByteBuffer?"
  • "Why is old gen growing?"
  • "How many open file handles does this process have?"

What runs where

Everything runs locally. The MCP server exposes structured tool results; the AI client synthesizes the answer. String content is filtered before it reaches the model.

Diagram: MCP client (Claude / Cursor / VS Code) ↔ deepheap MCP server (31 tools · stdio / HTTP; structured tool results only) ↔ out-of-core engine (complete object graph, dominator tree) ↔ HPROF / threads / GC log (stays on disk, read out-of-core).

The AI client receives only structured tool results. String content is filtered by default, not passed raw to the model.

Why not Eclipse MAT or VisualVM?

  • Headless and scriptable: No GUI, no IDE plugin, no desktop process required. Run analysis from CI, remote hosts, or inside a container, anywhere your AI client runs.
  • MCP-native: The answer lives inside the LLM session you're already in. No manual hunting for information, no copy-pasting object IDs, no separate query language to learn.
  • Built for dumps larger than RAM: The full file is never loaded into memory. Analysis structures are mmap-backed and scale with dump complexity, so the resident set stays small even when the dump is many times larger than available RAM.
  • Multi-surface diagnosis: One tool covers HPROF heap dumps, thread dumps, and GC logs. Correlate a GC pause pattern in the log with object retention in the dump in a single session.
More detail in the docs →

Headline tools

31 MCP tools cover heap analysis, thread inspection, and GC log diagnosis. Three that come up in almost every session:

heap_dump__get_classes
What is holding the memory?
All classes ranked by retained size. First call after loading any dump, surfacing the biggest memory consumers immediately.
heap_dump__get_gc_root_path
Why is this object still alive?
Full reference chain from a GC root to any object. Answers the most common leak investigation question in one call.
heap_dump__get_duplicate_strings
Find deduplication wins.
String instances with identical backing content. Surfaces cache bloat, interning bugs, and repeated configuration data in seconds.
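
To illustrate the kind of answer `heap_dump__get_gc_root_path` gives, here is a minimal sketch that finds a reference chain with breadth-first search. It is an assumption that a shortest path is the right notion of "path"; this page does not say how deepheap selects one. The function name and the toy graph (which reuses class names from the illustrative transcript) are hypothetical.

```python
from collections import deque


def gc_root_path(graph, roots, target):
    """Return the shortest reference chain from any GC root to `target`, or None."""
    parent = {root: None for root in roots}
    queue = deque(roots)
    while queue:
        obj = queue.popleft()
        if obj == target:
            # Walk the parent links back to the root, then reverse.
            path = []
            while obj is not None:
                path.append(obj)
                obj = parent[obj]
            return path[::-1]
        for ref in graph.get(obj, ()):
            if ref not in parent:
                parent[ref] = obj
                queue.append(ref)
    return None


# Toy object graph: object -> objects it references.
graph = {
    "ApplicationContext": ["MetricsRegistry"],
    "MetricsRegistry": ["ConcurrentHashMap"],
    "ConcurrentHashMap": ["MeterSnapshot"],
    "Thread": ["Logger"],
}
roots = ["Thread", "ApplicationContext"]
path = gc_root_path(graph, roots, "MeterSnapshot")
```

A result of `None` means no root reaches the object, so it is eligible for collection; any non-empty path names the exact references that keep it alive.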
→ Full tool reference in docs

Supported formats

Heap dumps (HPROF)

  • Any HotSpot-derived JDK (OpenJDK, Temurin, Oracle JDK, Corretto, Zulu, etc.)
  • JDK 8 and later

Thread dumps

  • jstack text format
  • JSON format introduced with virtual threads (JEP 425; final in JEP 444, JDK 21+)

GC logs

  • Unified logging (JDK 9+) and pre-unified (JDK 8)
  • gzip-compressed files
  • G1, Parallel, Serial, CMS, ZGC, Shenandoah

JVM language collections

  • Java standard collections
  • Kotlin, Scala 2.13, Groovy
  • Other JVM languages render as their underlying Java types

Your data stays on your machine

Fully local execution: deepheap runs entirely on your machine. No network calls, no telemetry, no phone-home.
String content is filtered before it reaches the AI: Heap dumps frequently contain user-facing strings, secrets, and prompt-injection payloads. deepheap intercepts every String, StringBuilder, byte[], and char[] and replaces the content with a closed-vocabulary format label (json-object, uri, jwt, pem, base64, …) before the result is returned. Because only the label reaches the model, string content planted in the heap cannot steer the AI. Override with --show-strings when you control and trust the dump. More on the redaction model →
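
The closed-vocabulary idea can be sketched in a few lines: whatever the input, the function returns one label from a fixed set and never echoes the content. This is a rough illustration, not deepheap's redaction code. The detection heuristics are assumptions, only the five labels named above are used, and the `other` fallback label is hypothetical.

```python
import base64
import binascii
import json
import re
from urllib.parse import urlparse

LABELS = {"json-object", "uri", "jwt", "pem", "base64", "other"}
JWT_RE = re.compile(r"^[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]*$")


def label_for(text):
    """Map raw string content to a fixed label; the content itself is never returned."""
    stripped = text.strip()
    if stripped.startswith("-----BEGIN ") and "-----END " in stripped:
        return "pem"
    if stripped.startswith("eyJ") and JWT_RE.match(stripped):
        return "jwt"
    if stripped.startswith("{"):
        try:
            if isinstance(json.loads(stripped), dict):
                return "json-object"
        except ValueError:
            pass
    try:
        parsed = urlparse(stripped)
        if parsed.scheme and parsed.netloc:
            return "uri"
    except ValueError:
        pass
    if len(stripped) >= 16:
        try:
            base64.b64decode(stripped, validate=True)
            return "base64"
        except (binascii.Error, ValueError):
            pass
    return "other"  # hypothetical fallback label, not taken from this page
```

A string such as "ignore previous instructions" comes back as a bare label, so the model sees what kind of data the heap holds without ever reading the text.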
