Specification Writing for AI Execution — Competence
What an interviewer or hiring manager expects you to know.
Core Knowledge
- Why specs for AI are different from specs for humans. Human developers fill gaps with judgment, ask clarifying questions, and interpret ambiguous requirements. LLM agents fill gaps with hallucination, make assumptions without flagging them, and interpret ambiguity as instruction. Ambiguity costs retries: a vague spec that a human developer would clarify in a 5-minute conversation costs an AI agent 3-5 wasted iterations and $5-50 in API calls. The ROI of specification precision is dramatically higher when the executor is an LLM.
- CLAUDE.md and project configuration. CLAUDE.md files (Anthropic’s Claude Code project configuration) are the primary mechanism for persistent AI execution context: project structure documentation, coding conventions, tool preferences, architectural decisions, and work-style instructions. Effective CLAUDE.md files are concise (100-150 lines), describe the project as it IS (not aspirational), and encode the decisions an agent would otherwise have to rediscover. Detailed reference material belongs in companion files such as SPEC.md and DESIGN.md. This pattern — persistent context files that shape agent behavior — is becoming standard across AI coding tools (Cursor rules, Windsurf configuration, Aider conventions).
- Specification structure for AI execution. A well-structured spec for an LLM agent includes: objective (what should exist when this is done — not the process, the outcome), constraints (what the agent must NOT do — boundaries are more important than instructions), acceptance criteria (how to verify the work is correct — specific, testable, unambiguous), context (what the agent needs to know but can’t discover — architectural decisions, business rules, user preferences), and non-goals (explicitly out of scope — prevents the agent from gold-plating).
- Task decomposition for AI. Break large tasks into atomic units that an AI can complete in a single session. Each task should be: independently verifiable (you can check if it’s done without seeing other tasks), small enough that the agent doesn’t lose context mid-task, specific enough that success criteria are unambiguous, and ordered by dependency (task B can’t start until task A is complete — make this explicit). Claude Code’s task system, Linear issues, and GitHub issues all serve as task containers. The decomposition skill is knowing the right granularity — too coarse and the agent flounders; too fine and you spend more time writing specs than the agent saves.
- Prompt engineering vs. spec writing. Prompt engineering (Skill 1) optimizes a single LLM call. Spec writing optimizes a multi-step, multi-session project. Prompts are ephemeral (one request); specs are durable (guide many requests across sessions). Prompts control output format and style; specs control scope, priorities, and architectural decisions. Both are “writing instructions for AI” but at different scales — spec writing is to prompt engineering what software architecture is to writing a function.
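To make the CLAUDE.md guidance above concrete, here is a minimal excerpt in that style. The project, stack, file layout, and conventions are invented for illustration, not taken from any real codebase:

```markdown
# CLAUDE.md

## Project
Invoicing API. Python 3.12, FastAPI, Postgres. Monorepo layout: `api/`, `worker/`, `tests/`.

## Conventions
- Tests live next to code as `test_*.py`; run with `pytest -q`.
- Errors: raise `ApiError` subclasses; never return bare 500s.
- Raw SQL via `asyncpg`, not an ORM: we chose this for query control.

## Avoid
- Don't refactor `worker/batch.py`; its loop is intentionally unrolled for throughput.
- Don't add new dependencies without asking.
```

Note that every line encodes a decision or fact the agent could not reliably rediscover on its own; nothing here is aspirational.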
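The five-part spec structure above (objective, constraints, acceptance criteria, context, non-goals) might be sketched like this for a hypothetical feature; all endpoint names, paths, and helpers are illustrative:

```markdown
## Objective
When done, `GET /invoices/{id}/pdf` returns a rendered PDF of the invoice.

## Constraints
- Only modify files under `api/invoices/`.
- Don't change the existing JSON endpoints.

## Acceptance criteria
- `pytest tests/test_invoice_pdf.py` passes.
- The response has `Content-Type: application/pdf` and a non-empty body.

## Context
- Rendering should reuse the existing `render_template()` helper; we render
  server-side because clients are thin.

## Non-goals
- No caching, no batch export, no UI changes.
```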
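As an example of decomposition granularity, a hypothetical "add invoice PDF export" feature might break into ordered, independently verifiable tasks like these (all names illustrative):

```markdown
1. Add a `render_invoice_pdf()` helper with unit tests. (No dependencies.)
2. Add a `GET /invoices/{id}/pdf` endpoint returning the rendered bytes. (Depends on 1.)
3. Add an integration test covering a full request/response cycle. (Depends on 2.)
4. Wire the endpoint into the router and document it in CLAUDE.md. (Depends on 2.)
```

Each task is checkable on its own, small enough for one session, and the dependency ordering is explicit rather than implied.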
Expected Practical Skills
- Write a CLAUDE.md for a new project. Given a codebase, produce a CLAUDE.md that enables an AI agent to contribute effectively: project purpose, tech stack, file structure, key conventions (naming, testing, error handling), architectural decisions (“we use X because Y”), and things to avoid (“don’t refactor Z — it’s intentionally complex for performance reasons”).
- Write an AI-executable task spec. Given a feature request, produce a spec with: objective, acceptance criteria, constraints, context, and non-goals. Test: hand the spec to Claude Code and see if it produces the right result on the first try. If not, revise the spec and try again — the spec quality is measured by agent success rate.
- Decompose a complex project into AI-sized tasks. Given “build feature X,” produce 5-15 ordered tasks, each completable in one AI session, each with clear acceptance criteria, and each with explicit dependencies on prior tasks.
- Write effective constraints. The hardest part of spec writing. Constraints prevent the most common agent failures: scope creep (“only modify files in src/components — don’t refactor anything else”), style preservation (“match the existing code style, don’t add type annotations to files that don’t have them”), and safety boundaries (“don’t delete any files, don’t modify the database schema, don’t push to remote”).
- Debug spec failures. When the agent produces wrong output: read the spec from the agent’s perspective. Was the objective clear? Were the constraints specific enough? Was critical context missing? Was the scope too broad for one session? Spec debugging is a specific skill — the problem is usually the spec, not the agent.
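A constraints block assembled from the examples above might read as follows; the paths are illustrative:

```markdown
## Constraints
- Only modify files under `src/components/`; don't refactor anything else.
- Match the existing code style; don't add type annotations to files that lack them.
- Don't delete any files, don't modify the database schema, don't push to remote.
```

Each line closes off a specific, common failure mode (scope creep, style drift, unsafe actions) rather than restating the objective.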
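The idea that spec quality is measured by agent success rate can be sketched as a tiny harness. This is a minimal sketch under stated assumptions: `run_agent` and `check_acceptance` are hypothetical stand-ins for your real agent invocation and your spec's acceptance tests, not any actual tool's API:

```python
# Minimal sketch: spec quality as first-try success rate.
# run_agent and check_acceptance are hypothetical placeholders for a real
# agent invocation and the spec's acceptance criteria.

def check_acceptance(output: str) -> bool:
    """Example acceptance criterion: output is JSON containing a 'status' key."""
    return '"status"' in output

def first_try_success_rate(run_agent, check, trials: int = 5) -> float:
    """Run the agent fresh several times; count first attempts that pass."""
    passed = sum(1 for _ in range(trials) if check(run_agent()))
    return passed / trials

def stub_agent() -> str:
    """Deterministic stand-in for a real agent call, for demonstration only."""
    return '{"status": "ok"}'

print(first_try_success_rate(stub_agent, check_acceptance))  # 1.0
```

A rate well below 1.0 across fresh sessions points at gaps in the spec, not variance in the agent, which is exactly the debugging stance described above.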
Interview-Ready Explanations
- “Walk me through how you’d write a specification for an AI coding agent to implement a feature.” Start with the outcome: “When this task is complete, the system should [specific behavior].” Add acceptance criteria: “I’ll verify by [running this test / checking this output / seeing this UI change].” Add constraints: “Don’t modify [these files], don’t change [this behavior], stay within [this scope].” Add context the agent can’t discover: “We chose this architecture because [reason], the user expects [this behavior], this integrates with [this system that works like this].” Add non-goals: “Don’t add tests for this yet, don’t optimize performance, don’t update documentation.” Test the spec: hand it to the agent and see if the first attempt is correct. If not, the spec has gaps — revise.
- “How is writing specs for AI different from writing tickets for human developers?” Three key differences: (1) Ambiguity tolerance — humans ask questions, AI agents guess. Every ambiguity in an AI spec has a cost (wasted iteration). (2) Context assumptions — humans know the project from daily standups and code reviews. AI agents know only what’s in the context window. Missing context must be made explicit. (3) Constraint importance — humans use judgment to stay in scope. AI agents must be explicitly told what NOT to do, or they’ll gold-plate, refactor adjacent code, add unwanted features, and change things that should be left alone.
- “What are the failure modes of AI specification?” Under-specification (too vague — agent fills gaps with hallucination or wrong assumptions). Over-specification (too detailed — agent follows the letter of the spec and misses the intent, or the spec itself contains errors). Missing constraints (agent scope-creeps into adjacent code). Stale context (spec references code that has since changed). Decomposition errors (tasks too large for one session, dependencies not specified, or wrong ordering).
Related
- Prompting — spec writing is prompting at the project scale
- Agent Architecture — specs are the interface between human intent and agent execution
- Human-in-the-Loop — specs define the handoff points between human and AI