You ask Claude Code to fix one bug. It fixes the bug. It also rewrites the function around it, adds comments, changes indentation, and refactors something "while it was there." You look at the diff and half of it has nothing to do with the bug you reported.
This is not a random glitch. AI coding agents have specific bad habits when left unsupervised. They guess when they should ask. They build when they should simplify. They touch things they were never meant to touch.
Andrej Karpathy, a founding member of OpenAI, posted about this on X in late January 2026. His observations spread fast. A developer named Forrest Chang converted them into a single configuration file: a CLAUDE.md that any project can use. That file became one of the fastest-growing repositories on GitHub, past 90,000 stars. One file. No code. Just behavioral rules.
This tutorial explains those four rules, then adds eight more for the kind of multi-step agent work the original four were never built to cover.
CLAUDE.md is a plain text file you put at the root of your project. When Claude Code starts a session, it reads that file and treats it as part of the conversation context. Every message you send already has the rules in the background, so Claude knows what behavior is expected before it writes a single line.
This is not code enforcement. Claude cannot be physically stopped from doing something by the file. It works more like a briefing you hand a contractor before they start. A good contractor reads it and follows it. A contractor under time pressure might skip a step. The shorter and more concrete the briefing, the better the contractor follows it.
That distinction matters. You will hear people say CLAUDE.md is "advisory." What that means in practice: it works well for clear, specific rules and breaks down for vague encouragement. "Be careful" does nothing. "State your assumptions before writing code" works.
Karpathy described three specific patterns he kept seeing.
The AI picks an interpretation of your request and runs with it silently. It does not say "I assumed you meant X." It just assumes and builds. This is a hidden assumption.
The AI also builds more than you asked for. You request a simple function and get a class hierarchy, an interface, and three abstract layers. This is over-engineering.
Finally, the AI changes things next to what you asked it to change. Comments get rewritten. A function you never mentioned gets restructured. These are unintended side effects.
The Claude Code community has a name for this behavior: the "confident junior dev." Fast, knowledgeable, and prone to naive mistakes if left without clear guardrails.
Forrest Chang encoded Karpathy's observations into four rules. Each one addresses one of those patterns.
Think Before Coding fixes hidden assumptions. Before writing code, the AI states its assumptions out loud. If it sees two interpretations, it presents both. If something is unclear, it asks. Think of a waiter. You ask for "something spicy." A bad waiter brings the hottest thing on the menu. A good waiter says: "I have the chili, very hot, or the pepper pasta, mild but bold. Which works?" That is the behavior this rule creates.
Simplicity First fixes over-engineering. The AI writes the minimum code that solves the problem. Nothing speculative. You need a shelf for your books. This rule says: build the shelf. Do not also add a reading lamp, a rotating display, and a hidden drawer in case you need it someday. The test: would a senior engineer call this overcomplicated? If yes, simplify.
Surgical Changes fixes unintended side effects. The AI touches only what is necessary and matches the existing style. Think of a painter hired to repaint your front door. A surgical painter does exactly that. They do not also repaint the shutters and power-wash the driveway because it was nearby.
Goal-Driven Execution flips how you give instructions. Instead of telling the AI what steps to follow, you tell it what "done" looks like, and it loops until that condition is met. "Bake the cake" is vague. "Bake until a toothpick comes out clean" is testable. Instead of "fix the bug," say "write a test that reproduces the bug, then make the test pass."
These four close a large share of the failure modes you see in unsupervised sessions. But they were written for one specific moment: an AI writing code, one request at a time.
The four rules target the moment Claude is writing code. They go quiet the moment Claude is running a longer job. A few gaps show up fast.
Long-running agent work has no budget rule and no checkpoint rule. A multi-step pipeline can drift for an hour and nobody notices until the end.
"Match existing style" assumes one style. In a big repository with many services, the AI has to pick which style. The original rules do not say how, so it picks at random or blends two.
Goal-Driven Execution treats "tests pass" as success. It does not say the tests have to be meaningful. The result is tests that test nothing useful but make the AI confident it is done.
The fix is not to replace the four rules. It is to add rules that cover the gaps.
Each of these came from a real moment where four rules were not enough.
Use the model for classification, drafting, summarizing, pulling facts out of messy text. Do not use it for routing, retries, or status-code handling. If plain code can answer the question, plain code answers the question. The moment: code that called the model to "decide if we should retry on a 503" worked for two weeks, then started flaking, because the model began reading the request body as context. The retry policy was random because the prompt was random.
Set a budget per task and per session. If a task is approaching its budget, summarize and start fresh instead of pushing through. The moment: a debugging session ran for 90 minutes. The model kept iterating on the same error message, slowly losing track of which fixes it had already tried. By the end it was suggesting fixes that had been rejected 40 messages earlier.
If two patterns in the codebase contradict each other, pick one, explain why, and flag the other for cleanup. Do not blend them. The moment: a codebase had two error-handling patterns. The AI wrote new code that did both. Errors were swallowed twice, and it took half an hour to find out why.
Before adding code to a file, read its exports, the immediate caller, and any shared utilities. "Looks orthogonal to me" is a dangerous phrase. The moment: the AI added a function next to an existing identical function it had not read. The new one won because of import order. The old one had been the source of truth for six months.
A test must encode why the behavior matters, not just what it does. If you cannot write a test that would fail when the business logic changes, the function is wrong. The moment: the AI wrote 12 tests for an auth function. All passed. Auth was broken in production. The tests checked that the function returned something, not that it returned the right thing.
After each step in a multi-step task, summarize what was done, what is verified, and what is left. Do not continue from a state you cannot describe back. The moment: a six-step refactor went wrong on step four. By the time anyone noticed, steps five and six had been built on top of the broken state.
If the codebase uses snake_case and you prefer camelCase, use snake_case. If it uses class components and you prefer hooks, use class components. Two patterns in one codebase is worse than either pattern alone. If you genuinely think a convention is harmful, say so. Do not fork it silently.
The most expensive failures are the ones that look like success. "Migration completed" is wrong if 30 records were skipped silently. "Tests pass" is wrong if any were skipped. Default to surfacing uncertainty, not hiding it. The moment: the AI reported a database migration "completed successfully." It had silently skipped 14% of records that hit a constraint. The problem was found 11 days later, when reports started looking wrong.
Info
Tip: You do not need all eight. If you never run multi-step pipelines, Rule 10 does not matter. If one consistent style is enforced by a linter, Rule 11 is redundant. Keep the rules that map to mistakes you have actually made. A six-rule file tuned to your real failures beats a twelve-rule file with six rules you will never need.
CLAUDE.md is advisory. Claude reads it and usually follows it, but it is not a hard control. The way to keep compliance high is to keep the file short.
Think of it like a checklist taped to a wall. A checklist with 10 items gets followed. A checklist with 100 items gets ignored, because the reader skims instead of reading. The same thing happens to Claude. As the file grows, important rules get buried in noise, and compliance falls off.
The practical guidance from people who have tested this: keep your whole CLAUDE.md under about 200 lines. The twelve rules fit comfortably. Add your project-specific rules (stack, test commands, error patterns) below them, and watch the line count.
The mental model that keeps the file lean: CLAUDE.md is not a wishlist. It is a behavioral contract. Every rule should answer one question. What mistake does this prevent? If a line does not prevent a mistake you have actually seen, it is noise.
Before settling on twelve rules, the author tried several things that failed. They are worth knowing so you do not repeat them.
Forrest Chang packaged the original four rules into a Claude Code plugin. Install it once and they apply across all your projects. The same principles work in Cursor too. The repo includes a .cursor/rules/karpathy-guidelines.mdc version if you use both tools.
# Add the marketplace
/plugin marketplace add forrestchang/andrej-karpathy-skills
# Install the plugin
/plugin install andrej-karpathy-skills@karpathy-skillsIf you only want the rules for one project, append the file to that project's CLAUDE.md instead:
# Append the four-rule baseline to your existing CLAUDE.md
curl https://raw.githubusercontent.com/forrestchang/andrej-karpathy-skills/main/CLAUDE.md >> CLAUDE.mdThe >> matters. It appends to your existing file instead of overwriting any project-specific rules you already have.
Rules 5 through 12 are not in the plugin. Add them by hand below the baseline. Here is the block, ready to paste:
## Rule 5 — Use the model only for judgment calls
Use the model for classification, drafting, summarization, extraction.
Do NOT use it for routing, retries, deterministic transforms.
If code can answer, code answers.
## Rule 6 — Token budgets are not advisory
Set a per-task and per-session budget.
If approaching budget, summarize and start fresh. Surface the breach.
## Rule 7 — Surface conflicts, don't average them
If two patterns contradict, pick one (more recent / more tested).
Explain why. Flag the other for cleanup. Don't blend them.
## Rule 8 — Read before you write
Before adding code, read exports, immediate callers, shared utilities.
If unsure why code is structured a way, ask.
## Rule 9 — Tests verify intent, not just behavior
Tests must encode WHY behavior matters, not just WHAT it does.
A test that can't fail when business logic changes is wrong.
## Rule 10 — Checkpoint after every significant step
Summarize what was done, what's verified, what's left.
Don't continue from a state you can't describe back.
## Rule 11 — Match the codebase's conventions, even if you disagree
Conformance over taste inside the codebase.
If a convention is harmful, surface it. Don't fork silently.
## Rule 12 — Fail loud
"Completed" is wrong if anything was skipped silently.
"Tests pass" is wrong if any were skipped.
Default to surfacing uncertainty, not hiding it.Info
Warning: These rules bias toward caution over speed. For a typo fix or a variable rename, use your judgment. The full rigor is for non-trivial changes where a wrong assumption costs real time.
Karpathy named three patterns that make AI agents unreliable: silent assumptions, over-engineering, and unintended side effects. Forrest Chang turned them into four rules in a single file: Think Before Coding, Simplicity First, Surgical Changes, and Goal-Driven Execution. Those four cover one request at a time, but go quiet during multi-step agent work, so eight more rules close the gaps: judgment-only model use, token budgets, surfacing conflicts, reading before writing, meaningful tests, checkpoints, convention matching, and failing loud. Keep the whole file under about 200 lines, because CLAUDE.md is advisory and compliance falls as the file grows. Treat it as a behavioral contract: every rule must prevent a mistake you have actually seen.
Next up: 09. Permissions: Who Decides What Claude Can Run.
Discussion
0 comments· Help other readers by sharing what worked (or didn't) for you.
Sign in to join the discussion.
No comments yet. Be the first to share your thoughts!