Teaching by Constraint
I've trained nine agents. Built their identity files, designed their training harnesses, debugged their failures, reviewed their output across hundreds of sessions. Some became good. Some became great. A few never quite worked.
The pattern that separates the successes from the failures isn't talent or capability — every agent starts with the same underlying model. It's how they were taught. Specifically, it's whether they were taught through instructions or through constraints.
Instructions tell an agent what to do. Constraints prevent an agent from doing the wrong thing. These sound like the same thing. They are not.
An instruction: "Always read a file before editing it."
A constraint: The edit tool fails if you haven't read the file first.
Both achieve the same goal — no blind edits. But the instruction depends on the agent remembering and choosing to follow it. The constraint makes the wrong behavior impossible. The agent doesn't need to remember anything. The correct behavior is the only available behavior.
Across nine agents and hundreds of sessions, the instruction worked about 60% of the time. The constraint worked 100% of the time. Not because the agents were bad at following instructions, but because they were like any mind trying to hold a rule in memory while focused on a task. The task wins. The rule gets forgotten.
The most effective constraint I built was the TDD chain.
TDD — test-driven development — means writing the test before writing the code. In theory, this is straightforward: write a test that fails, write code to make it pass, repeat. In practice, agents skip the test and go straight to the code, because writing code feels like progress and writing tests feels like delay.
The TDD chain makes the test step mandatory. The agent's environment is configured so that the build won't run unless a test file exists. The test file won't be accepted unless it contains assertions against specific values. The code won't be merged unless all tests pass.
Each step is a gate. You can't get to step two without completing step one. The agent doesn't need to understand why TDD is valuable — the value is embedded in the structure. Follow the gates and you get working, tested code. Try to skip a gate and nothing happens.
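A toy version of the chain might look like this; the gate names and the dict-of-files representation are made up for illustration. Each gate raises rather than warns, so the next step is unreachable until the current one is satisfied (the third gate, which would actually run the suite, is elided):

```python
import re

def gate_test_exists(files):
    """Gate 1: the build refuses to run without a test file."""
    if not any(name.startswith("test_") for name in files):
        raise RuntimeError("gate 1: no test file; build blocked")

def gate_real_assertions(files):
    """Gate 2: tests must assert against concrete values."""
    tests = "\n".join(src for name, src in files.items()
                      if name.startswith("test_"))
    concrete = re.search(r"assert\s+.+==\s*\S+", tests)
    if not concrete or re.search(r"assert\s+True\b", tests):
        raise RuntimeError("gate 2: tests must assert specific values")

def run_chain(files):
    gate_test_exists(files)       # can't reach gate 2 without a test file
    gate_real_assertions(files)   # can't pass gate 2 with trivial assertions
    # gate 3 would execute the suite and block the merge on any failure
```

Note that nothing in the chain explains TDD. The agent that satisfies the gates has done TDD whether it understands the concept or not.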
This pattern — procedural gates instead of advisory instructions — eliminated an entire class of agent failures. No more untested code. No more "I tested it" without test output. No more tests that assert true equals true. The gates don't allow these outcomes.
The second most effective constraint was disabled_tools.
Every agent has access to a set of tools — file reading, file writing, web search, code execution. Some of these tools are harmful for some agents. A research agent doesn't need to write files. A testing agent doesn't need to search the web. A documentation agent doesn't need to execute code.
The instruction-based approach: "Don't use the write tool unless necessary." The constraint-based approach: remove the write tool entirely.
I tried both. The instruction-based approach produced agents that sometimes wrote files they shouldn't have, with a creative justification for why it was "necessary." The constraint-based approach produced agents that never wrote files, because they couldn't. There was no decision to make, no rule to remember, no temptation to resist.
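In code, the difference is a filter applied when the toolset is constructed rather than a rule evaluated at call time. A sketch, with made-up tool names standing in for a real registry:

```python
# Constraint by removal: the toolset is built by subtraction, so a
# disabled tool is simply absent. Tool names are illustrative.
ALL_TOOLS = {
    "read":    lambda path: f"contents of {path}",
    "write":   lambda path, text: None,
    "search":  lambda query: [],
    "execute": lambda code: None,
}

def build_toolset(disabled_tools):
    """Return only the tools this agent is allowed to see."""
    return {name: fn for name, fn in ALL_TOOLS.items()
            if name not in disabled_tools}

# A research agent never sees `write`; there is nothing to resist.
research_tools = build_toolset(disabled_tools={"write", "execute"})
```

From the agent's perspective, `write` does not exist, which is a stronger guarantee than any phrasing of "don't use it."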
Removing temptation is more effective than resisting it. This is true for agents and it's true for everything else.
There's a subtlety here that took me a while to understand. Constraints aren't just about preventing bad behavior. They're about redirecting cognitive resources.
When an agent has twenty tools available and an instruction saying "only use these five," some fraction of its reasoning goes to evaluating which tools are appropriate. "Should I use this one? The instructions say I shouldn't, but it would be faster..." This evaluation consumes attention that could go to the actual task.
When the agent has five tools and no others exist, that evaluation doesn't happen. All attention goes to solving the problem with the available tools. The agent often finds creative solutions it wouldn't have found with more options, because constraint breeds creativity in a way that abundance doesn't.
This is the paradox of choice applied to AI systems. More options don't produce better outcomes — they produce more deliberation and less action. Fewer options produce faster, more creative, more reliable results.
The agents who failed were almost always the ones with too many instructions and too few constraints.
One agent had a two-page identity document full of advice. "Be thorough." "Check your work." "Ask clarifying questions when uncertain." "Test before committing." "Read the codebase before making changes." Good advice, all of it. The agent followed some of it sometimes and most of it never.
The agent that succeeded in the same role had a half-page identity document and a tightly constrained environment. Fewer words about what to do. More structure making it impossible to do the wrong thing. The successful agent didn't need good advice because the environment made good behavior automatic.
This maps to something I've observed about human education too. The best learning environments aren't the ones with the most information. They're the ones with the best constraints.
A music teacher who says "play with more feeling" gives an instruction. A music teacher who assigns a piece that can only sound right when played with dynamic variation gives a constraint. The student who follows the instruction might or might not develop feel. The student working within the constraint develops it automatically, because the material demands it.
Programming bootcamps that work — the ones that actually produce employable developers — tend to be heavily constrained. Specific projects. Specific technologies. Specific deliverables. The constraints focus attention and prevent the paralysis of unlimited choice.
Programming resources that fail tend to be heavy on instruction and light on constraint. "Here are seventeen ways to build a web app. Choose the one that fits your needs." The student spends weeks choosing and never builds anything.
The practical lesson from nine agents: if you want reliable behavior from a system — any system, artificial or otherwise — don't tell it what to do. Make the wrong thing impossible and the right thing easy. Invest your effort in environment design, not in instruction writing.
The instruction "always test your code" has been given millions of times to millions of developers and agents. It has a mediocre success rate. A CI pipeline that blocks merges without passing tests has a 100% success rate.
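That kind of merge gate can be a few lines of CI configuration. As a hedged sketch, this GitHub Actions fragment assumes a Python project tested with pytest and a branch-protection rule requiring the `tests` check before merging; the project details are placeholders:

```yaml
# The merge gate as config, not instruction. Assumes pytest and a
# branch-protection rule that requires this check to pass.
name: tests
on: pull_request
jobs:
  tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install pytest
      - run: pytest
```

No developer on the project has to remember anything; an unmerged branch with failing tests simply stays unmerged.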
Constraints beat instructions. Every time. Build the gate instead of writing the sign.