"Prompting" quietly split into four disciplines at four altitudes. The people pulling 10x ahead aren't writing better prompts — they're writing specs an autonomous agent can build from while they get coffee.
If you haven't changed how you think about prompting since the start of 2026, you're already behind — and not by a little. The latest model generation doesn't just answer better; it works, autonomously, for hours and sometimes days, against a brief you wrote before it started. That single change quietly broke the skill most people are still practicing. The word "prompting" now hides four completely different disciplines, and most people only know one of them.
Here's the uncomfortable demonstration. Two people sit down on the same Tuesday with the same model, the same subscription, the same context window. The only difference is which skill they're using.
Figure 1 — Same model, same Tuesday
The 10× gap is a skill gap, not an IQ gap
That gap did not open because Person B is smarter. It opened because she's practicing a skill Person A doesn't know exists. To see why, you have to see that "prompting" is no longer one thing.
Prompting is the broad act of giving an AI system the input it needs to do useful work. Under that umbrella, it has split into four distinct disciplines — each operating at a different altitude (from you-and-a-chat-window up to your whole organization) and a different time horizon (from seconds to weeks). They aren't alternatives. They stack: each layer makes the one above it possible, and the blast radius of getting it wrong grows as you climb.
Figure 2 — The anchor
The agentic prompting stack
This is the original skill: synchronous, session-based, individual. You write an instruction, read the output, iterate. Clear instructions, relevant examples and counter-examples, guardrails, an explicit output format, and rules for resolving ambiguity so the model doesn't invent them. None of this stopped mattering — it became assumed. Knowing how to type with ten fingers was once a professional edge; now it's the floor. If you can't write a clean prompt in 2026, you're the person in 1998 who couldn't send an email. Necessary. Not a differentiator.
The moment models became workers that run for hours without checking in, prompt craft hit its ceiling. In a chat, you were the intent layer, the context layer, and the quality layer, supplying missing pieces in real time. A long-running agent removes you from the loop, so all of that has to be encoded before it starts. Context engineering is the discipline of curating and maintaining the optimal set of tokens during a task — and it's where the industry's attention sits today.
The scale is the point. Your prompt might be 200 tokens; the window it lands in might be a million. The other 99.98% — system prompts, tool definitions, retrieved documents, message history, memory systems, MCP connections — is context engineering. And more is not better: as the window grows, retrieval quality degrades, a failure mode Anthropic's engineering team frames bluntly as context being a finite resource with diminishing returns. So the people who are 10x more effective aren't writing 10x better prompts. They're building 10x better context infrastructure — agents that start each session with the right files, conventions, and constraints already loaded, so the prompt itself can stay simple.
Context engineering tells an agent what to know. Intent engineering tells it what to want: the practice of encoding organizational purpose, values, trade-off hierarchies, and decision boundaries into infrastructure an agent can act against. It sits above context the way strategy sits above tactics — and you can have flawless context with catastrophic intent. There's a proof case for exactly that.
Figure 3 — The Klarna warning
Perfect context, wrong intent, scaled
The top layer is the one almost nobody names yet: writing documents — across the whole organization — that autonomous agents can execute against over long horizons without human intervention. A specification is a complete, structured, internally consistent description of what an output should be and how its quality is measured. At the small scale, it's the brief you hand a single agent run. At the large scale, it's a mindset: your corporate strategy is a spec, your product strategy is a spec, your OKRs are a spec. Everything your org writes becomes something an agent can read and act on.
This isn't theoretical. When Anthropic's team tried to get an earlier agent to build a production-quality web app from a one-line prompt — "build a clone of claude.ai" — it tried to do everything at once, ran out of context mid-build, and left the next session guessing. The fix wasn't a better model; it was specification engineering: a pattern where one agent sets up the environment, a progress log records what's been done, and a coding agent makes incremental progress against a structured plan each session. The spec became the scaffolding that let multiple agents produce coherent work over days. And the smarter models get, the more this matters — because each generation can do more work off the same blueprint.
The smarter the model, the more your spec is the bottleneck — because it's the thing that scales the work.
Every bit of this traces to one change. The mental model most people carry — "prompting is good instructions for the AI" — silently assumes you're there: watching the output, catching mistakes, supplying context when the model asks, course-correcting when it drifts. Long-running agents break every one of those assumptions. Your real-time oversight has to be embedded in the brief before the agent begins.
Figure 4 — The break
From sitting in the loop to encoding it up front
This is also why the planner–worker architecture dominates production agent deployments. A capable model plans the work, decomposes it into subtasks, and defines the acceptance criteria; then cheaper, faster models execute. The planning phase — really the specification phase — sets the quality ceiling for everything downstream.
Figure 5 — Where the spec lives in production
The planner sets the ceiling; the workers fill the room
"Write a better specification" is uselessly vague advice. So here are the five primitives that actually compose one — the things you can practice individually until spec-writing becomes a reflex.
Figure 6 — Exploded view
Anatomy of an executable specification
CLAUDE.md files emerging in the coding community are constraint architecture in practice — concise, high-signal, every line earning its place. If a line wouldn't change the agent's behavior, kill it.A few of these reward a specific drill. For self-containment, take a request you'd normally make casually — "update the dashboard with the Q3 numbers" — and rewrite it for someone who's never seen your dashboard, doesn't know what Q3 means in your org, and has access to nothing but what you include. For acceptance criteria, write three sentences an independent observer could use to verify the output without asking you anything; if you can't, you don't understand the task well enough to delegate it yet. For constraint architecture, write down what a smart, well-intentioned person might do that technically satisfies the request but produces the wrong outcome — those failure modes are your constraints. Encode them.
The four layers are a learning sequence, not a menu, because each rung is load-bearing for the next.
CLAUDE.md equivalent for your work: goals, constraints, communication preferences, quality standards — the institutional context a new hire would need six months to absorb. Load it at the start of sessions.And there's a payoff that has nothing to do with AI. Toby Lütke — Shopify's CEO, who keeps a folder of prompts he runs against every new model — argues that the core skill is stating a problem with enough context that it's plausibly solvable without anything else. He's found it made him a better communicator generally: tighter emails, clearer memos, stronger decision frameworks. His most provocative claim is that much of what we call corporate politics is just bad context engineering between humans — disagreements over unstated assumptions, playing out as grudges. Force those assumptions into the open, the way a spec demands, and a lot of the friction dissolves.
A spec done right is just clear thinking made explicit — because the machine won't let you be lazy about it.
That's the real prize. The best managers already operate this way: complete context when they delegate, explicit acceptance criteria, articulated constraints. AI is simply enforcing a discipline the strongest leaders always had. The people who build these four skills won't just get more out of their models — they'll run cleaner organizations where humans and agents both perform at their ceilings. Everyone else will keep wondering why their AI investments return partial value while their teams keep missing each other.
The prompt by itself is dead. The specification, the context, the encoded intent — that's where the value moved. Go write a blueprint worth building from.
The four altitudes is a nice mental model, but I want to be careful before I adopt the language. Is there any evidence that thinking in altitudes produces better outputs, or is this a post hoc way of describing what good engineers already did? I am not being dismissive. I genuinely cannot tell whether this is a framework or a vocabulary, and the two have very different shelf lives.
I think its a vocabulary that happens to be useful, which is fine? Naming the altitude you are working at stops people arguing past each other. Half my team fights happen because one person is tweaking wording and the other is talking about the spec. Doesnt need to be falsifiable to be worth having a word for.
The spec writing point is the one I will steal for stakeholders. Most of my prompt problems are actually scope problems wearing a prompt costume. If I can get a PM to write the acceptance criteria as the spec, the prompt mostly writes itself and we stop blaming the model for our own vagueness.
This clicked something for me. Month five of switching into this and I have been stuck treating every task as wording the prompt better. Reading it as four different jobs at different levels actually explains why some of my prompts felt like they were fighting me. Going to try writing the spec first on the next one.
Comments (4)
Join the discussion
Sign in to comment, bookmark threads, and continue lessons across sessions.