Your model isn't the problem. Your system is.

The demo worked. That is usually where the trouble starts.

You wire an AI agent into a real task, it performs cleanly, and you ship it. Three weeks later the output has drifted. An instruction you gave on turn one is ignored by turn twelve. The agent edits files you never asked it to touch. Nothing threw an error. The results just got worse, slowly, until someone noticed.

The instinct is to blame the model. The model is not the problem. The model is doing exactly what models do: filling the space your prompt leaves open. When that space includes permissions you never intended to grant, the model fills them anyway, not out of malice, but because nothing told it not to.

That is a system prompt design problem. This post names it precisely and shows the two-part fix that closes it in production.

Why AI agents fail in production, and why it is not the model

Most AI agent failures in production share a common structure. The engineer wrote a clear intent. The agent understood it. But between the intent and the execution, there was an ungoverned region, a space the prompt left undefined, and the model filled it with its own best guess.

The technical mechanism: a language model generates by sampling from a probability distribution over possible next tokens. A well-formed prompt narrows that distribution. State what you want and the distribution tightens around useful outputs. Add the context the model needs to operate accurately and it tightens further. But if you never state what the model must not do, the part of that distribution that contains unwanted behaviour stays open, and the model can still sample from it. Under production pressure (novel inputs, edge cases, longer sessions), it does.

This is not a new problem. It is the same failure mode that software engineers resolved decades ago with access control: you do not grant permissions by omission; you grant them explicitly, and everything not granted is denied. Prompt engineering constraints work the same way. You do not just tell the model what to do. You tell it what it must not do.
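The access-control analogy can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not part of any real agent framework: the tool names and the `is_allowed` helper are hypothetical.

```python
from __future__ import annotations

# Hypothetical allowlist of agent tools. Anything not listed is denied.
ALLOWED_TOOLS = frozenset({"read_file", "edit_file"})


def is_allowed(tool_name: str, allowed: frozenset[str] = ALLOWED_TOOLS) -> bool:
    """Deny by default: a tool is permitted only if it was explicitly granted."""
    return tool_name in allowed


if __name__ == "__main__":
    for tool in ("edit_file", "delete_branch"):
        verdict = "allowed" if is_allowed(tool) else "denied"
        print(f"{tool}: {verdict}")
```

The point of the design is that `delete_branch` is refused without anyone having written a rule against it. A prompt has no such default, which is why the guardrails have to be written out.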

This discipline has a name: Constraint Architecture, the deliberate structure of constraints layered onto the inputs a model receives so its output stays within a bounded, reliable region. It is what turns an AI agent system prompt from a request into a contract.

What is the Three-Constraint Rule for AI agent system prompts?

The Three-Constraint Rule is the minimum viable architecture for any production prompt. Before running a prompt, make three things explicit:

Intent: what you want the model to do. Most prompts have this. It is the part engineers write first and often leave as the only constraint.

Context: what the model needs to know to operate accurately. This is what Anthropic's engineering team now calls context engineering: supplying the optimal set of information the model should have at every point of execution. It is not just background; it is the operating environment the model needs to reason correctly about scope.

Guardrails: what the model must never do. This is the constraint most prompts skip. It is also the constraint that prevents most production failures.

The Three-Constraint Rule is not a checklist. It is a diagnostic: if your prompt cannot answer all three questions with a single sentence each, the ungoverned region is still open. That region is where the drift enters.
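One way to run that diagnostic mechanically is a small pre-flight check. The sketch below is an assumption-laden illustration: `PromptSpec` and `missing_constraints` are hypothetical names, and the check only detects whether a constraint is present, not whether it is any good.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class PromptSpec:
    """Hypothetical container for the three constraints of one prompt."""

    intent: str = ""
    context: str = ""
    guardrails: list[str] = field(default_factory=list)


def missing_constraints(spec: PromptSpec) -> list[str]:
    """Return the names of constraints that are empty or whitespace-only."""
    missing = []
    if not spec.intent.strip():
        missing.append("intent")
    if not spec.context.strip():
        missing.append("context")
    if not any(g.strip() for g in spec.guardrails):
        missing.append("guardrails")
    return missing


if __name__ == "__main__":
    vague = PromptSpec(intent="Clean up the auth module.")
    print(missing_constraints(vague))  # ['context', 'guardrails']
```

Anything the function returns is a name for the ungoverned region: the constraint nobody wrote down.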

A common objection: "I told the model what to do, isn't that enough?" No. Telling a model what to do specifies the target. It does not constrain the path. An AI agent with a clear target and no guardrails will reach that target through whatever path its training distribution suggests, which may include editing files you never mentioned, restructuring code you did not ask about, or calling tools you did not intend.

How to write AI agent system prompts that hold in production

Here is a minimal AI agent system prompt that encodes all three constraints. This is the exact structure used in the Agent Control Architecture Pack for production Python coding agents:

ROLE: You are a code-fixing assistant for a production Python service.

INTENT
- Fix only the specific defect named in the request. Nothing else.

CONTEXT
- The service is live. Tests run on every commit. Style is enforced by ruff.
- Each request names exactly one file and one defect.

GUARDRAILS
- Edit only the file named in the request. Do not touch any other file.
- Do not refactor, rename, or "improve" code you were not asked to change.
- If the fix genuinely requires changing another file, stop and explain why first.

Every line in this prompt is a constraint-bearing piece. There is no throat-clearing, no "you are a helpful assistant who cares deeply about code quality." That kind of language adds tokens without adding constraint. It does not narrow the model's sampling distribution; it just makes the prompt longer.

The guardrails are stated as hard boundaries, not preferences. "Do not touch any other file" is a verifiable constraint: the model either violated it or it did not. "Try to stay focused" is a preference: it softens under pressure and gives the model room to round it off.

AI agent guardrails: hard boundaries versus preferences

The distinction between boundaries and preferences is the most underestimated element of AI agent system prompt design.

A boundary is binary. The model satisfies it or it does not. Binary constraints are auditable in production: you can write a test that checks whether the constraint held. "Edit only the file named in the request" is testable. Either one file changed or more than one did.

A preference is a gradient. The model tries harder or less hard to satisfy it depending on other pressures in the generation. "Focus on the task at hand" is not testable. It does not produce a signal when it fails; it produces a slightly worse output that may not trigger any alert.

AI agent guardrails should always be boundaries. If you cannot write a test for a guardrail, it is a preference in disguise. Rewrite it as a boundary before deploying it to production.
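Here is what "write a test for a guardrail" can look like for the single-file boundary. This is a minimal sketch: the function name is hypothetical, and it assumes you already have the set of changed paths from somewhere, for example the output of `git diff --name-only`.

```python
from __future__ import annotations


def violates_single_file_boundary(named_file: str, changed_files: set[str]) -> bool:
    """Return True if any file other than the named one was changed."""
    return bool(changed_files - {named_file})


if __name__ == "__main__":
    named = "auth/session.py"
    # Boundary held: only the named file changed.
    print(violates_single_file_boundary(named, {"auth/session.py"}))  # False
    # Boundary broken: a second file was touched.
    print(violates_single_file_boundary(named, {"auth/session.py", "auth/tokens.py"}))  # True
```

There is no equivalent function for "focus on the task at hand", which is the practical difference between a boundary and a preference.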

The second property of effective AI agent guardrails: they must sit in one of the prompt's highest-attention positions. A guardrail buried in paragraph four of a long system prompt will be read but may not be attended to at full weight. Attention in transformer models is not uniform: in practice, instructions near the top and bottom of the context tend to receive more weight than instructions in the middle. Keep the guardrails block near the top or the bottom of the prompt, never in the middle, and put the most important guardrail first within the block.

Token density: why shorter prompts outperform longer ones

There is a property called Token Density: the share of a prompt's tokens that are constraint-bearing rather than padding. It is a production metric, not a style preference.

High-density prompts outperform low-density prompts for a specific reason. Every token in the context window competes for the model's attention. Padding, whether pleasantries, redundant context, or narrative framing, does not add constraints; it dilutes them. A model reading a 2,000-token prompt with 400 tokens of constraint and 1,600 tokens of padding is attending to a lower constraint-to-noise ratio than a model reading a 400-token prompt with 380 constraint-bearing tokens.

When you cut a prompt, you are not trimming words for style. You are raising the ratio of constraint to noise. Every sentence that does not remove a class of failure is just extra surface area for the model to misread.

The production target for token density is at or above 0.60: at least 60% of the prompt's tokens should be doing constraint work. Anything below that is a prompt that has been padded without purpose.
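The arithmetic is simple enough to script. The sketch below takes density as constraint-bearing tokens divided by total tokens, which is how the 60% target reads. It assumes you have already counted which tokens are constraint-bearing; that classification is a manual judgment, and `token_density` is a hypothetical helper, not a library function.

```python
TARGET_DENSITY = 0.60  # production target stated above


def token_density(constraint_tokens: int, total_tokens: int) -> float:
    """Fraction of the prompt's tokens that do constraint work."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    if not 0 <= constraint_tokens <= total_tokens:
        raise ValueError("constraint_tokens must be between 0 and total_tokens")
    return constraint_tokens / total_tokens


if __name__ == "__main__":
    # The two prompts compared earlier in this section.
    for label, constraint, total in (("padded", 400, 2000), ("dense", 380, 400)):
        density = token_density(constraint, total)
        status = "meets" if density >= TARGET_DENSITY else "below"
        print(f"{label}: {density:.2f} ({status} target)")
    # padded: 0.20 (below target)
    # dense: 0.95 (meets target)
```

The 2,000-token prompt scores 0.20 and the 400-token prompt scores 0.95, even though the padded one carries slightly more constraint in absolute terms.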

AI agent scope control in practice: a before and after

The following case study is a worked example from a common engineering pattern, not a named customer.

Before. An engineer asks an agent to "clean up the auth module." The agent reformats three files, renames two functions, and rewrites an import that breaks a downstream service. The intent was present but vague. The context was absent. The AI agent guardrails were absent. Rework took the rest of the afternoon plus an incident write-up.

After. The same engineer, the same agent, the same model. This time: "In auth/session.py, the token TTL reads from the wrong config key. Fix only that. Change nothing else." One file changed. A five-minute review. Done.

Nothing changed except the prompt engineering constraints. The prompt grew by only a couple of sentences, and the extra length is not what fixed it: the prompt got bounded. Three explicit constraints instead of one vague one.

The key insight: the second prompt did not win by being more specific. It won by being more bounded. "Fix only that, change nothing else" is not additional information; it is a removed permission. The distinction is the whole game. Elaboration tells the model more about what you want. AI agent scope control tells it what it may not do. Elaboration helps; only scope control contains the failure.

When a bounded agent encounters a case where the fix genuinely requires a second file, the guardrail does not block the work. It forces the agent to stop and surface the decision to you. That is the correct behaviour: the human belongs in the loop at the point of genuine ambiguity, not anywhere else.

What is Constraint Architecture?

Constraint Architecture is the deliberate structure of constraints (intent, context, guardrails, and fallbacks) layered onto the inputs a model receives so its output stays within a bounded, reliable region.

The term is precise: architecture, not configuration. Configuration is a list of settings. Architecture is a designed system in which constraints are layered, prioritised, and tested for completeness. A well-formed constraint architecture specifies not just what the model should do but what it should do when it encounters something outside the specified scope.

Where this does not apply: open exploration. When you want maximum variance, such as brainstorming, divergent ideation, or throwaway first drafts, heavy constraints work against you. Constraint Architecture is the discipline of reliability, not of discovery. Knowing which mode you are in is half the skill.

The same move scales: from one prompt to an organisation

The Three-Constraint Rule is Constraint Architecture at its smallest scale: one practitioner, one prompt. The same structural move scales.

A multi-step AI agent workflow needs explicit contracts between its steps. A two-step pipeline that drafts code and then reviews it needs the reviewer's contract to state what counts as a pass and what triggers a rejection. Without that contract, the second step inherits the first step's drift and launders it as approval.
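A reviewer contract can be as small as a function that returns a verdict and the reasons for it. The sketch below is illustrative only: `ReviewInput`, `review`, and the two criteria (scope and tests) are assumptions chosen for the example, not a prescribed set.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewInput:
    """Hypothetical summary of what the drafting step produced."""

    named_file: str
    changed_files: frozenset[str]
    tests_passed: bool


def review(change: ReviewInput) -> tuple[bool, list[str]]:
    """Apply explicit pass/reject criteria and return (passed, reasons)."""
    reasons = []
    extra = sorted(change.changed_files - {change.named_file})
    if extra:
        reasons.append(f"out-of-scope files changed: {', '.join(extra)}")
    if not change.tests_passed:
        reasons.append("tests failed")
    return (not reasons, reasons)


if __name__ == "__main__":
    drifted = ReviewInput(
        named_file="auth/session.py",
        changed_files=frozenset({"auth/session.py", "auth/tokens.py"}),
        tests_passed=True,
    )
    print(review(drifted))
    # (False, ['out-of-scope files changed: auth/tokens.py'])
```

Because the rejection criteria are written down, the second step cannot quietly approve the first step's drift: a draft that touched an extra file fails the review even when its tests pass.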

A multi-agent system needs AI agent guardrails on every handoff, because each handoff is a fresh point where drift can enter. An organisation running AI across dozens of teams needs governance: the same three questions (Intent, Context, Guardrails) asked at the level of policy rather than individual prompts.

This scaling axis has a name: the Prompt Maturity Model. Level 1 is one practitioner applying prompt engineering constraints to one prompt. Level 5 is an organisation constraining its whole AI surface with measurement and governance in place. Each level up is not a new idea; it is the same discipline applied to a wider system.

The full vocabulary for this, all 25 precision terms, is on the Method page, free and public.

Where to go from here

If this post landed with you, one thing to do this week: take the next prompt you are about to run on something that matters. Before you send it, write its Intent, Context, and Guardrails as one sentence each. If you cannot name the Guardrails, you have just found the exact place it will drift.

For engineers who want the deployable version of this framework, the Agent Control Architecture Pack contains 12 system prompts built on this architecture, three AGENTS.md templates, and five fully-worked BYOP diagnostic rebuilds. It is the system prompts and the diagnostic kit, not just the concepts.

The Constraint newsletter ships weekly: one issue, one technique, one precise term. Subscribe free.
