tools · article · 5 min
Claude Code Memory System Architecture
orchestrator.dev · May 31, 2026
Claude Code & Agent Memory: Best Practices for 2026

April 6, 2026 • 18 min read

#claude-code #agent-memory #context-engineering #agentic-ai #2026

Introduction

Every Claude Code session starts with a clean slate. No memory of the codebase you spent last week mapping. No record of the architecture decision you made on Tuesday. No recollection of the cryptic build flag that took three hours to debug. Just… nothing.

If you’ve used Claude Code seriously, you’ve felt this friction. You correct the same mistakes session after session: “we use pnpm, not npm”, “the tests live in /test/integration/, not /tests/”. Each correction costs tokens and focus. It turns an autonomous agent into a patient who keeps waking up with amnesia.

The good news: this problem is almost entirely solved, if you know how to solve it. Claude Code has a sophisticated, layered memory system that most developers use at perhaps 10% of its capability. This article covers the full architecture: what each layer does, how they interact, and the production patterns that separate teams shipping faster from teams debugging the same context failures week after week.

By the end, you’ll understand how to configure CLAUDE.md so it actually works, how auto memory and the Memory Tool complement each other, how to survive context compaction without losing critical state, and how subagents get their own persistent knowledge stores. All of this is current as of Claude Code v2.1.92 (April 2026).

ℹ️ Prerequisites

This article assumes familiarity with Claude Code's basic setup and CLI usage. You should be comfortable running claude from the terminal and have a working Anthropic subscription (Pro, Max, or Team). Code examples target Claude Code v2.1.x and the Messages API with the memory_20250818 tool for API-based agents. Subagent memory (memory: frontmatter) requires v2.1.33 or later.
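
The version floor in the prerequisites can be checked mechanically. The sketch below is illustrative only: the helper names (parse_version, supports_subagent_memory) are hypothetical, and it assumes the version is available as a string containing a dotted MAJOR.MINOR.PATCH triple such as 2.1.92. The exact text your installation reports may differ.

```python
import re

# Minimum Claude Code version for subagent memory, as stated in the
# prerequisites above (v2.1.33).
SUBAGENT_MEMORY_MIN = (2, 1, 33)


def parse_version(text: str) -> tuple[int, int, int]:
    """Extract the first MAJOR.MINOR.PATCH triple found in text.

    Assumption: the version string contains a dotted numeric triple.
    """
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def supports_subagent_memory(version_text: str) -> bool:
    """Return True if the given version meets the v2.1.33 floor."""
    # Tuples compare element by element, so (2, 1, 92) >= (2, 1, 33).
    return parse_version(version_text) >= SUBAGENT_MEMORY_MIN
```

With this sketch, supports_subagent_memory("2.1.92") is true and supports_subagent_memory("2.1.32") is false. Comparing integer tuples rather than raw strings avoids the trap where "2.1.9" sorts after "2.1.33" lexicographically.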

The Four-Layer Memory Architecture

Before getting into tactics, it helps to have a clear mental model of what Claude Code actually stores, and where. There are four distinct layers, each with different persistence characteristics, audience, and purpose.

[Diagram: the Claude Code session and its memory layers. Nodes: Context Window (ephemeral, finite), containing conversation history, file reads and tool outputs, and auto-loaded memory files; Layer 1: CLAUDE.md (static, explicit); Layer 2: Auto Memory (MEMORY.md, dynamic); Layer 3: Memory Tool (/memories dir, API agents); Layer 4: Subagent Memory (~/.claude/agent-memory/); /compact (summarize). Edge labels: loaded every session; first 200 lines loaded; on-demand retrieval; injected at subagent start; context rot at ~70%; auto-compact at ~83.5%; critical rules survive via CLAUDE.md; auto-updated during session.]

Layer 1, CLAUDE.md, is the explicit, human-authored layer. You write it. It loads at the start of every session. It’s the source of truth for things you always want Claude to know: build commands, coding conventions, architecture decisions, project-specific rules.

Layer 2, Auto Memory (MEMORY.md), is the implicit, learned layer. Claude Code discovers project-specific patterns during your sessions and writes them back autonomously. The first 200 lines or 25 KB of MEMORY.md, whichever comes first, load at the start of each session.

Layer 3, the Memory Tool, is the API layer, designed for long-running programmatic agents.
Rather than loading everything upfront, agents store what they learn and pull it back on demand, keeping the active context focused on what’s currently relevant.

Layer 4, Subagent Memory, gives each named subagent its own persistent, markdown-based knowledge store, scoped to either the user or the project. It is configured through the memory: frontmatter field, introduced in Claude Code v2.1.33 (February 2026). Before this, every agent invocation started from scratch.

Layer 1: Engineering Your CLAUDE.md

🎯 Key Takeaways for CLAUDE.md

- Keep it under 300 lines: every line competes with actual work for context budget
- The golden rule: would removing this line cause Claude to make a mistake? If not, cut it
- Reference separate files for domain-specific docs; don't inline large content
- Check CLAUDE.md into git so your team can contribute and refine it over time

CLAUDE.md is loaded before every conversation, which sounds simple until you internalize what that means for the context window. A fresh session consumes roughly 20,000 tokens loading the system prompt, tool definitions, and CLAUDE.md before you type anything.

The most common mistake is treating CLAUDE.md like a wiki dump. The /init command generates a starter file based on your project structure; the counterintuitive step is to delete most of what it generates. The default file states the obvious: yes, Claude, this is a TypeScript project, and that’s already visible from package.json. Every line in CLAUDE.md competes for attention with the actual work. Target: under 300 lines.

What Actually Belongs in CLAUDE.md

The right content falls into four categories: project identity, commands, style, and guardrails.

    # My Project — CLAUDE.md

    ## Project Context
    Next.js 14 e-commerce platform with Stripe, Postgres, and Redis.
    Monorepo: apps/web (Next), apps/api (Express), packages/shared.

    ## Commands
    - Test: pnpm test:integration (NOT pytest or npm test)
    - Build: make build-docker
    - Lint: npm run lint:fix
    - Database migrations: pnpm db:migrate

    ## Code Style
    - ES modules only, named exports preferred
    - 2-space indentation, TypeScript strict mode
    - Error handling: always use AppError class in packages/shared/errors

    ## Guardrails
    - Never force push (--force-with-lease only)
    - Never commit to m