Official reference — For CLAUDE.md mechanics and memory configuration, see the official memory docs. For all configuration options, see settings.

The Context Window

Claude Code has a large context window — but it’s shared across everything:
| Source | Cost |
| --- | --- |
| System prompt | Fixed (~50 instructions) |
| CLAUDE.md files | Fixed (loaded at launch) |
| Active skill content | Pay-per-use (loads on trigger) |
| Conversation history | Growing (accumulates over time) |
| Tool results | Variable (file reads, command output, search) |
As the window fills, older messages get compressed automatically. But you can manage this proactively.

/compact

When the conversation gets long or unfocused, run /compact. It:
  • Summarizes the conversation so far
  • Frees context for new work
  • Re-injects CLAUDE.md from disk (so your instructions survive)
Use it between tasks. Finished a bug fix and moving to a new feature? Compact first.

@imports in CLAUDE.md

The @path/to/file syntax in CLAUDE.md expands the referenced file inline. Use it for detailed docs that shouldn't live in the main CLAUDE.md body:

```markdown
## Reference Documents

### API Architecture — @docs/api-architecture.md
**Read when:** Adding or modifying API endpoints
```

Keep the @path outside backticks: imports are not evaluated inside markdown code spans or code blocks. This keeps CLAUDE.md lean while making deep docs accessible on demand. Imports can recurse up to 5 hops deep.

Large Codebases

  • Break work into focused sessions. Don’t try to refactor 20 files in one conversation.
  • Point Claude to specific files. “Look at src/auth/middleware.ts” beats “find the auth code.”
  • Use agents for exploration. The Explore agent reads many files without polluting your main context.
  • Commit between tasks. Gives you a clean rollback point and lets you /compact without losing work.

Signs You’re Running Low

  • Claude starts forgetting earlier instructions or decisions
  • Responses become less specific or repeat generic advice
  • Claude asks questions you already answered
  • Tool results are getting truncated
When you notice these, /compact and refocus.

Context as Infrastructure

Official reference — For compaction mechanics, the 1M context window, and CLAUDE_AUTOCOMPACT_PCT_OVERRIDE, see the official context docs. What follows is the operational mindset.
Context isn’t just “how much Claude remembers” — it’s infrastructure you architect around. Auto-compaction triggers at ~83.5% capacity with a ~33K token buffer. For long-running agentic sessions, this means:
  • Front-load high-leverage context (CLAUDE.md, skills) — they survive compaction and get re-injected
  • Keep tool output concise — verbose npm test output burns context that could hold conversation state
  • Compact between tasks, not mid-task — compaction mid-flow loses decisions and rationale
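If the default threshold doesn't fit your workflow, the CLAUDE_AUTOCOMPACT_PCT_OVERRIDE variable mentioned above can shift it. A minimal sketch, assuming the variable takes a percentage value; confirm the exact semantics in the official context docs:

```shell
# Assumption: the override is a percentage of the context window.
# Check the official context docs before relying on this.
export CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=70   # compact at ~70% instead of ~83.5%
```

Set it in the shell that launches Claude Code so the session inherits it.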
For multi-agent work, each agent gets its own context window. Three agents reading the same 50-file codebase means paying for that context three times. Use the Explore agent (Haiku, read-only) for broad searches instead of having your main Opus session do it.

The Token Budget Mental Model

Think of context as a budget:
  • Cheapest: CLAUDE.md pointers and skills (fixed cost, high leverage)
  • Mid: Targeted file reads (pay once per read)
  • Expensive: Pasting large files into the chat (burns fast, not reusable)
The goal is to front-load context efficiently at launch and let skills handle the rest on demand.
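To make the budget concrete, here is the arithmetic behind the ~83.5% threshold and ~33K buffer cited earlier, sketched with a hypothetical 200K-token window (actual sizes are model-dependent):

```shell
# Hypothetical 200K window; real context size depends on the model.
WINDOW=200000
THRESHOLD_PCT=835                            # 83.5%, in tenths of a percent
USABLE=$(( WINDOW * THRESHOLD_PCT / 1000 ))  # tokens before auto-compaction
BUFFER=$(( WINDOW - USABLE ))                # reserved headroom
echo "usable: $USABLE tokens, buffer: $BUFFER tokens"
# → usable: 167000 tokens, buffer: 33000 tokens
```

Everything you front-load (CLAUDE.md, skills) spends from the usable portion, so the leaner those fixed costs, the more room remains for conversation and tool results.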
Claude Code has three memory layers — CLAUDE.md (persistent conventions), auto memory (learned cross-session), and session memory (current conversation). CLAUDE.md is the only one you control directly; the others accumulate automatically. Keep CLAUDE.md lean and let the other layers handle session-specific knowledge.
