You’ll learn: how Claude processes instructions from multiple layers, why each layer exists, and how to route knowledge to the right one.

The Core Problem

When you tell Claude something in conversation, it works — for that session. But knowledge that lives only in conversation dies when the session ends. The question isn’t how to persist knowledge. CLAUDE.md handles that. The real question is: where should each piece of knowledge live so it reaches the agent at the right moment, without wasting context on things that aren’t relevant? This is the distribution problem. Every tool in the agentic engineering stack — CLAUDE.md, skills, agents, hooks — exists to solve a specific piece of it.

The Instruction Hierarchy

When Claude processes a request, it sees instructions from multiple sources. Each source has different priority, reach, and cost.
For the mechanics of how CLAUDE.md loads and how skills trigger, see the official docs on memory and skills.
| Priority | Layer | Reach | Cost |
|---|---|---|---|
| 1 (highest) | System prompt | Every turn, cannot be overridden | Fixed, always loaded |
| 2 | Tool descriptions | Every turn tools are available | Fixed per tool |
| 3 | CLAUDE.md | Every conversation in the project | Fixed, loaded at launch |
| 3 | Skills | On trigger (when task matches description) | Pay-per-use |
| 4 (lowest) | User messages | Single turn | Ephemeral |
The attention budget is finite. Every instruction you add dilutes attention on the others. This is why you can’t just dump everything into CLAUDE.md. A 500-line CLAUDE.md doesn’t mean Claude knows 500 things well. It means Claude knows nothing reliably.
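To make the fixed versus pay-per-use distinction concrete, here is a toy cost model. It is only a sketch: the `Source` class, the topic labels, and every line count are made up for illustration, and counting lines is a crude stand-in for how much attention an instruction actually consumes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    name: str
    lines: int            # rough size of the instructions, in lines
    always_loaded: bool   # True for fixed-cost layers such as CLAUDE.md
    topics: frozenset = frozenset()  # only used for on-trigger sources


def lines_in_context(sources, task_topics):
    """Count instruction lines present for a task with the given topics."""
    total = 0
    for src in sources:
        # Fixed-cost sources always count; on-trigger sources count only
        # when one of their topics matches the task.
        if src.always_loaded or src.topics & task_topics:
            total += src.lines
    return total


# Hypothetical project: the same knowledge in two layouts.
monolith = [Source("CLAUDE.md", 500, always_loaded=True)]
distributed = [
    Source("CLAUDE.md", 200, always_loaded=True),
    Source("testing skill", 200, always_loaded=False, topics=frozenset({"tests"})),
    Source("css skill", 100, always_loaded=False, topics=frozenset({"css"})),
]

css_task = frozenset({"css"})
print(lines_in_context(monolith, css_task))     # 500
print(lines_in_context(distributed, css_task))  # 300
```

In this made-up layout, a CSS task carries 300 lines instead of 500 because the testing knowledge stays out of context until a testing task triggers it.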

What Belongs Where

Each layer has a job. Putting knowledge in the wrong layer either wastes context budget or fails to reach the agent when needed.
CLAUDE.md: project-specific context that every session needs.
  • Tech stack, build commands, architecture
  • Conventions Claude can’t infer from existing code
  • Critical warnings and gotchas
  • Pointers to deeper knowledge (skills, reference docs)
The budget is ~200 lines. The test: if removing a line wouldn’t cause Claude to make mistakes, cut it. CLAUDE.md points; it doesn’t explain.
Writing CLAUDE.md →
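One way to keep the ~200-line budget honest is to check it mechanically, for example in CI or a pre-commit step. The sketch below assumes CLAUDE.md sits at the project root and that only non-blank lines count toward the budget; both are illustrative choices, not rules stated in this guide.

```python
from pathlib import Path

LINE_BUDGET = 200  # the ~200-line budget from the text; adjust to taste


def count_instruction_lines(text):
    """Count non-blank lines (assumption: blank lines carry no instructions)."""
    return sum(1 for line in text.splitlines() if line.strip())


def check_budget(text, budget=LINE_BUDGET):
    """Return (ok, used); ok is False once the budget is exceeded."""
    used = count_instruction_lines(text)
    return used <= budget, used


if __name__ == "__main__":
    path = Path("CLAUDE.md")  # assumed location: project root
    if path.exists():
        ok, used = check_budget(path.read_text(encoding="utf-8"))
        status = "OK" if ok else "OVER BUDGET"
        print(f"{path}: {used}/{LINE_BUDGET} lines, {status}")
```

A line count cannot tell whether a given line is worth keeping. It only flags when the file has grown enough to deserve a pruning pass.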

The Routing Decision

When you learn something that should persist, pick the right vehicle:
| Signal | Route to | Why |
|---|---|---|
| Every session needs to know | CLAUDE.md (one line + pointer) | Always in context |
| Only relevant to one domain | Skill or agent definition | Loaded only when relevant |
| Deep framework with examples | Skill with references/ subdirectory | Body loads on trigger, references load on demand |
| Deterministic enforcement | Hook | Code execution, not LLM compliance |
The wrong choice isn’t “no knowledge.” It’s knowledge in the wrong place. A React testing framework in CLAUDE.md wastes 40 lines of budget on every non-React conversation. The same framework as a skill costs zero when irrelevant and loads fully when needed.
Ready to set this up? See Writing CLAUDE.md and Composing a Skills Stack.
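The routing table can be read as a small decision function. The sketch below is one hypothetical encoding: the flag names and the order in which the signals are checked (enforcement first, then reach, then depth) are assumptions made for illustration, not something the table prescribes.

```python
from enum import Enum


class Vehicle(Enum):
    CLAUDE_MD = "CLAUDE.md (one line + pointer)"
    SKILL = "Skill or agent definition"
    SKILL_WITH_REFERENCES = "Skill with references/ subdirectory"
    HOOK = "Hook"


def route(*, must_be_enforced=False, every_session=False, deep_with_examples=False):
    """Pick a vehicle for one piece of knowledge, following the routing table."""
    if must_be_enforced:
        # Needs code execution rather than LLM compliance.
        return Vehicle.HOOK
    if every_session:
        # Always in context, so keep it to one line plus a pointer.
        return Vehicle.CLAUDE_MD
    if deep_with_examples:
        # Body loads on trigger, references load on demand.
        return Vehicle.SKILL_WITH_REFERENCES
    # Default: domain-specific knowledge, loaded only when relevant.
    return Vehicle.SKILL


# A React testing framework: one domain, deep, with examples.
print(route(deep_with_examples=True).value)
```

Checking enforcement first reflects the idea that a rule which must never be broken should not depend on the model choosing to follow it.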

The Duplication Trap

If a principle lives in a skill, CLAUDE.md should point to it — not restate it. The failure mode: you write the same rule in CLAUDE.md and in a skill. Six months later, you update the skill but forget CLAUDE.md. Now the agent sees two conflicting instructions and has to guess which one is current. Each instruction lives in exactly one layer. Pointers connect the layers; copies break them. And each layer has a different trust profile — see The Supply Chain for who controls what.
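Duplication of this kind can be caught with a simple text comparison. The following sketch flags rules that appear in both a CLAUDE.md and a skill body. The normalization, the `min_words` cutoff, and the sample texts are all hypothetical; it only catches near-verbatim copies, not paraphrases.

```python
def normalize(line):
    """Lowercase and strip list markers so near-identical rules compare equal."""
    return " ".join(line.strip().lstrip("-*• ").lower().split())


def duplicated_rules(claude_md, skill_text, min_words=4):
    """Return rules that appear in both texts, in CLAUDE.md order."""
    skill_rules = {normalize(line) for line in skill_text.splitlines()}
    seen, dupes = set(), []
    for line in claude_md.splitlines():
        rule = normalize(line)
        # Skip short lines (headings, blanks) that could match by coincidence.
        if len(rule.split()) < min_words or rule in seen:
            continue
        if rule in skill_rules:
            dupes.append(rule)
            seen.add(rule)
    return dupes


# Made-up sample content for illustration.
claude_md = """\
# Project
- Run tests with the project test script
- For testing conventions, see the testing skill
- Never mock the database in integration tests
"""
skill = """\
# Testing
* Never mock the database in integration tests
* Prefer table-driven tests
"""
print(duplicated_rules(claude_md, skill))
# ['never mock the database in integration tests']
```

Here the pointer line ("see the testing skill") is fine. The restated rule is the one to delete from CLAUDE.md.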

Why Skills Exist

Engineers were stuffing deep domain knowledge into CLAUDE.md. Their CLAUDE.md files grew to 400+ lines. Claude’s instruction-following degraded. They’d add more emphasis (“ALWAYS do X”, “NEVER do Y”) but the problem wasn’t emphasis — it was attention budget. Skills solve this by loading knowledge only when the task triggers it. A 200-line testing framework loads when you’re writing tests. It doesn’t exist when you’re refactoring CSS.

Why Skill Design Matters

Engineers started writing skills — but agents never triggered them. The skill existed, the knowledge was good, but:
  • The description didn’t match how people ask for help
  • The knowledge was in the wrong layer
  • Instructions were rigid rules instead of thinking prompts the agent could adapt
Every one of these failures traces back to the same root:
The description field is a routing rule, not documentation. Claude pattern-matches incoming requests against skill descriptions to decide what to load. If the description doesn’t match how people actually ask for help, the skill never fires — no matter how good the content is.
“Testing things” never triggers. “Use when writing tests, debugging failing test output, or setting up test infrastructure” fires exactly when needed — and costs nothing when you’re writing a migration. A 200-line testing framework with a vague description sits in your filesystem doing nothing. The same framework with sharp “use when” language loads precisely at decision time. Description quality matters more than skill quality. See Writing Effective Descriptions for how to get this right.
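The effect of description wording can be illustrated with a deliberately crude word-overlap matcher. This is not how Claude selects skills; the stopword list, the threshold, and the scoring are invented here purely to show why a description written in the user's own vocabulary has more to match against.

```python
import re

# Filler words ignored by this toy matcher (an arbitrary, hypothetical list).
STOPWORDS = {"a", "an", "the", "to", "or", "and", "when", "use",
             "for", "my", "is", "this", "why", "up"}

VAGUE = "Testing things"
SHARP = ("Use when writing tests, debugging failing test output, "
         "or setting up test infrastructure")


def keywords(text):
    """Lowercased words minus filler, as a crude stand-in for intent."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def triggers(description, request, threshold=1):
    """True if the request shares at least `threshold` keywords with the description."""
    return len(keywords(description) & keywords(request)) >= threshold


request = "why is this test failing"
print(triggers(VAGUE, request))   # False: no shared keywords
print(triggers(SHARP, request))   # True: shares "test" and "failing"
print(triggers(SHARP, "write a database migration"))  # False: irrelevant task
```

Even in this simplified form, the vague description shares no words with a realistic request, while the sharp one matches the testing request and stays silent for the migration.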

Why Agents Exist

Skills are passive — they activate when conversation matches their trigger. Agents are active — they wire skills into execution paths with specific tools, MCP connections, and system prompts. Without agents, you have knowledge. With agents, you have knowledge that acts.

Distribution Beats Accumulation

Context is the leverage point. The agent’s effectiveness is bounded by what it knows at decision time. You control that through distribution — not by writing more documentation, but by routing knowledge to the layer where it’ll be present when needed. A 200-line CLAUDE.md with sharp pointers, a handful of well-triggered skills, and agents that wire them together will outperform a 1000-line CLAUDE.md every time. Not because less is more — but because distribution beats accumulation.