Official reference — For CLAUDE.md mechanics and memory configuration, see the official memory docs. For all configuration options, see settings.

The Context Window

Claude Code has a large context window — but it’s shared across everything:
| Source | Cost |
| --- | --- |
| System prompt | Fixed (~50 instructions) |
| CLAUDE.md files | Fixed (loaded at launch) |
| Active skill content | Pay-per-use (loads on trigger) |
| Conversation history | Growing (accumulates over time) |
| Tool results | Variable (file reads, command output, search) |
As the window fills, older messages get compressed automatically. But you can manage this proactively.

/compact

When the conversation gets long or unfocused, run /compact. It:
  • Summarizes the conversation so far
  • Frees context for new work
  • Re-injects CLAUDE.md from disk (so your instructions survive)
Use it between tasks. Finished a bug fix and moving to a new feature? Compact first.

@imports in CLAUDE.md

The @path/to/file syntax in CLAUDE.md expands the referenced file inline. Use it for detailed docs that shouldn't live in the main CLAUDE.md body:

```markdown
## Reference Documents

### API Architecture — @docs/api-architecture.md
**Read when:** Adding or modifying API endpoints
```

Keep the @path outside backticks: imports are not evaluated inside markdown code spans or code blocks. This keeps CLAUDE.md lean while making deep docs accessible on demand. Imports can recurse up to 5 hops deep.

Large Codebases

  • Break work into focused sessions. Don’t try to refactor 20 files in one conversation.
  • Point Claude to specific files. “Look at src/auth/middleware.ts” beats “find the auth code.”
  • Use agents for exploration. The Explore agent reads many files without polluting your main context.
  • Commit between tasks. Gives you a clean rollback point and lets you /compact without losing work.

Signs You’re Running Low

  • Claude starts forgetting earlier instructions or decisions
  • Responses become less specific or repeat generic advice
  • Claude asks questions you already answered
  • Tool results are getting truncated
When you notice these, /compact and refocus.

Context as Infrastructure

Official reference — For compaction mechanics, the 1M context window, and CLAUDE_AUTOCOMPACT_PCT_OVERRIDE, see the official context docs. What follows is the operational mindset.
Context isn’t just “how much Claude remembers” — it’s infrastructure you architect around. Auto-compaction triggers at ~83.5% capacity with a ~33K token buffer. For long-running agentic sessions, this means:
  • Front-load high-leverage context (CLAUDE.md, skills) — they survive compaction and get re-injected
  • Keep tool output concise — verbose npm test output burns context that could hold conversation state
  • Compact between tasks, not mid-task — compaction mid-flow loses decisions and rationale
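If the default threshold doesn't fit your workflow, the CLAUDE_AUTOCOMPACT_PCT_OVERRIDE variable mentioned above can shift it. A minimal sketch, assuming the variable takes a percentage value; confirm the exact semantics in the official context docs:

```shell
# Assumption: the override is a percentage of the context window.
# Check the official context docs before relying on this.
export CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=70   # compact at ~70% instead of ~83.5%
```

Set it in the shell that launches Claude Code so the session inherits it.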
For multi-agent work, each agent gets its own context window. Three agents reading the same 50-file codebase means paying for that context three times. Use the Explore agent (Haiku, read-only) for broad searches instead of having your main Opus session do it.

The Token Budget Mental Model

Think of context as a budget:
  • Cheapest: CLAUDE.md pointers and skills (fixed cost, high leverage)
  • Mid: Targeted file reads (pay once per read)
  • Expensive: Pasting large files into the chat (burns fast, not reusable)
The goal is to front-load context efficiently at launch and let skills handle the rest on demand.
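To make the budget concrete, here is the arithmetic behind the ~83.5% threshold and ~33K buffer cited earlier, sketched with a hypothetical 200K-token window (actual sizes are model-dependent):

```shell
# Hypothetical 200K window; real context size depends on the model.
WINDOW=200000
THRESHOLD_PCT=835                            # 83.5%, in tenths of a percent
USABLE=$(( WINDOW * THRESHOLD_PCT / 1000 ))  # tokens before auto-compaction
BUFFER=$(( WINDOW - USABLE ))                # reserved headroom
echo "usable: $USABLE tokens, buffer: $BUFFER tokens"
# → usable: 167000 tokens, buffer: 33000 tokens
```

Everything you front-load (CLAUDE.md, skills) spends from the usable portion, so the leaner those fixed costs, the more room remains for conversation and tool results.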
Claude Code has three memory layers — CLAUDE.md (persistent conventions), auto memory (learned cross-session), and session memory (current conversation). CLAUDE.md is the only one you control directly; the others accumulate automatically. Keep CLAUDE.md lean and let the other layers handle session-specific knowledge.
