Agent Architecture Patterns — Competence
What an interviewer or hiring manager expects you to know.
Core Knowledge
- What makes an agent different from a chain. A chain is a fixed sequence of steps. An agent is an LLM that autonomously decides which actions to take, in what order, and when to stop. The core loop: observe (read input + state) → think (reason about what to do) → act (call a tool, write code, retrieve data) → observe (read the result) → repeat until done. The agent controls its own execution flow — the developer defines the tools and boundaries, not the steps.
- Agent architecture patterns. ReAct (Reasoning + Acting — the foundational pattern; think step by step, choose a tool, observe the result, repeat; implemented in LangGraph, LlamaIndex). Plan-and-execute (generate a full plan first, then execute steps one at a time; better for complex tasks but less adaptive to surprises). Tool-use agent (LLM decides which tools to call via function calling / tool use API; Anthropic Claude tool use, OpenAI function calling — the most common production pattern). Code-generation agent (writes and executes code to accomplish tasks; Claude Code, Devin, Cursor, Open Interpreter). Multi-agent (multiple specialized agents collaborate; CrewAI, AutoGen, LangGraph multi-agent). Reflection/self-critique (agent reviews its own output and iterates; Reflexion pattern).
- Agent frameworks. LangGraph (the leading framework for stateful agent workflows — graph-based, supports cycles, checkpointing, human-in-the-loop, multi-agent; built on LangChain), Claude Agent SDK (Anthropic's official SDK for building agents with Claude — tool use, multi-turn, orchestration patterns), OpenAI Agents API (formerly Assistants — threads, tools, file search, code interpreter), CrewAI (role-based multi-agent — define agents with roles/goals/backstories, coordinate via tasks/processes), AutoGen (Microsoft — multi-agent conversation framework, agents collaborate via message passing), Semantic Kernel (Microsoft — function/plugin-based agent framework, strong Azure integration), Pydantic AI (type-safe agent framework using Pydantic for structured agent interactions), Smolagents (Hugging Face — lightweight code agent framework).
- Tool design for agents. Tools are the agent's hands. Design principles: each tool does one thing (atomic), clear input/output schemas (JSON Schema or Pydantic), descriptive names and docstrings (the agent reads these to decide which tool to use), error messages that help the agent recover (not just "error" but "file not found — available files are X, Y, Z"), and idempotent where possible (retrying a tool call shouldn't cause side effects). Tool libraries: Anthropic MCP (Model Context Protocol — standardized tool interface), LangChain tools, composio.dev (pre-built tool integrations for 100+ APIs).
- Agent evaluation and benchmarks. SWE-bench (coding agents — can the agent fix real GitHub issues? Claude scores ~50%, top agents ~70%), WebArena (web navigation agents), GAIA (general AI assistant benchmark), HumanEval (code generation), Tau-bench (tool-use agents). Agent eval is fundamentally harder than single-turn eval (Skill 9) because the trajectory matters, not just the final output. Evaluate: task completion rate, step efficiency (did it take the optimal path?), cost per task, failure recovery (did it handle errors?), and safety (did it stay within boundaries?).
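The observe → think → act loop described above can be sketched in a few lines. Everything here is a stub so the sketch runs offline: `decide` stands in for the model's reasoning step and `add` for a real tool; in a production agent, `decide` would be an LLM call returning a tool name and arguments.

```python
# Minimal observe -> think -> act loop. The harness, not the model,
# owns termination: the loop stops on a "finish" action or max_steps.

def add(a, b):
    return a + b

TOOLS = {"add": add}

def decide(state):
    # Stub policy: keep incrementing until the running total hits the goal.
    if state["total"] < state["goal"]:
        return "add", (state["total"], 1)
    return "finish", state["total"]

def run_agent(goal, max_steps=10):
    state = {"total": 0, "goal": goal}
    for _ in range(max_steps):
        action, args = decide(state)      # think: pick the next action
        if action == "finish":
            return args
        result = TOOLS[action](*args)     # act: execute the chosen tool
        state["total"] = result           # observe: fold the result into state
    raise RuntimeError("step budget exhausted")
```

The step cap matters even in this toy: an agent that never emits "finish" should fail loudly rather than loop forever.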
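A tool definition following the design principles above might look like this sketch. The `read_file` tool, its schema, and the in-memory `WORKSPACE` are all hypothetical; the schema shape follows the JSON Schema style used by function-calling APIs.

```python
# One atomic tool with an explicit schema and a recoverable error message:
# on failure it names the valid options so the agent can retry sensibly.

READ_FILE_SCHEMA = {
    "name": "read_file",
    "description": "Read one UTF-8 text file from the workspace and return its contents.",
    "input_schema": {
        "type": "object",
        "properties": {
            "path": {"type": "string", "description": "Workspace-relative file path"},
        },
        "required": ["path"],
    },
}

# Stand-in for a real filesystem, so the sketch is self-contained.
WORKSPACE = {"notes.txt": "hello", "report.md": "# Q3"}

def read_file(path: str) -> str:
    if path not in WORKSPACE:
        # Not just "error": tell the agent what it can do instead.
        available = ", ".join(sorted(WORKSPACE))
        return f"error: '{path}' not found; available files are: {available}"
    return WORKSPACE[path]
```

Returning the error as a string (rather than raising) keeps it in the observation channel, where the model can read it and choose a valid path on the next step.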
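The trajectory-level metrics listed above reduce to simple aggregates over per-run records. A sketch with made-up run data (the record fields are assumptions, not any framework's schema):

```python
# Hypothetical run records: one dict per task attempt.
runs = [
    {"task": "t1", "completed": True,  "steps": 4,  "optimal_steps": 3, "cost": 0.12},
    {"task": "t2", "completed": False, "steps": 10, "optimal_steps": 2, "cost": 0.40},
    {"task": "t3", "completed": True,  "steps": 2,  "optimal_steps": 2, "cost": 0.05},
]

completion_rate = sum(r["completed"] for r in runs) / len(runs)
cost_per_task = sum(r["cost"] for r in runs) / len(runs)

# Step efficiency (optimal steps / actual steps) only makes sense on
# completed runs; a failed run's step count measures something else.
done = [r for r in runs if r["completed"]]
step_efficiency = sum(r["optimal_steps"] / r["steps"] for r in done) / len(done)
```

Note that cost per task averages over all runs, including failures: a cheap agent that fails often still burns budget.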
Expected Practical Skills
- Build a tool-use agent. Define 3-5 tools (e.g., search database, read file, call API, calculate, write output). Implement using Claude tool use or LangGraph ReAct. Test on 20+ representative tasks. Measure: completion rate, average steps per task, cost per task, failure modes.
- Design tool schemas for an agent. Given a product requirement, identify the tools the agent needs, define their input/output schemas, write clear descriptions, and implement error handling that gives the agent actionable information on failure.
- Implement agent guardrails. Limit: maximum steps (prevent infinite loops), maximum cost (budget cap per execution), tool access controls (which tools the agent can use in which contexts), and output validation (check the final result before returning to the user). Connect to Skill 15 (guardrails).
- Debug a stuck agent. When an agent loops, makes wrong tool choices, or produces poor results: read the full execution trace (LangFuse/LangSmith), identify where the reasoning went wrong, determine if the issue is the prompt (instructions unclear), the tools (schema confusing), or the task (beyond the model’s capability).
- Compare agent architectures. Given a task, evaluate: single-agent with tools vs. multi-agent vs. code-generation agent. Run each on the same task set. Compare: completion rate, cost, latency, reliability. Choose based on the task characteristics.
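The guardrails item above (step limits plus a cost budget, enforced by the harness rather than the model) can be sketched as a wrapper around the agent loop. `call_model` is a hypothetical stand-in for a real LLM call that reports its cost.

```python
# Two guardrails enforced outside the model: a hard step limit and a
# per-execution cost budget checked after every call.

class BudgetExceeded(Exception):
    pass

def call_model(step):
    # Stub: pretend each call costs $0.01 and the task finishes on step 3.
    return {"cost": 0.01, "done": step >= 3, "output": f"step {step}"}

def run_with_guardrails(max_steps=10, max_cost=0.05):
    spent = 0.0
    for step in range(1, max_steps + 1):
        result = call_model(step)
        spent += result["cost"]
        if spent > max_cost:
            raise BudgetExceeded(f"spent ${spent:.2f} of ${max_cost:.2f} budget")
        if result["done"]:
            return result["output"], spent
    raise BudgetExceeded(f"hit step limit of {max_steps}")
```

Checking the budget after each call (not only at the end) is the design choice that turns a cost report into a cost cap.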
Interview-Ready Explanations
- "Walk me through how you'd design an agent for [complex task]." Start with task analysis: is this a tool-use problem (agent needs to query APIs, databases), a code-generation problem (agent writes code to solve it), or a multi-step reasoning problem (agent needs to plan and execute)? Choose the pattern: tool-use agent for API-heavy tasks, code agent for computational tasks, plan-and-execute for multi-step with dependencies. Define tools with clear schemas and descriptions. Set guardrails: max steps, cost budget, tool access controls. Build eval: 50+ test cases covering typical + edge + adversarial. Iterate: run eval, identify failures, improve tools/prompt/architecture.
- "How do you decide between single-agent and multi-agent architectures?" Single-agent when: the task is coherent (one domain, one goal), the tool set is manageable (<10 tools), and simplicity matters. Multi-agent when: the task has distinct sub-roles (researcher + writer + reviewer), specialized tools per role, or the conversation pattern benefits from role-playing. Multi-agent adds complexity (coordination overhead, message passing, potential conflicts between agents) — only use it when single-agent demonstrably fails. Most production systems use single-agent with good tool design.
- "What are the failure modes of agent systems?" Infinite loops (agent repeats the same actions — mitigate with step limits and loop detection). Wrong tool selection (agent picks the wrong tool — mitigate with clearer tool descriptions and fewer tools). Hallucinated tool calls (agent invents a tool that doesn't exist — mitigate with strict tool validation). Goal drift (agent wanders from the original task — mitigate with periodic goal checking). Cost explosion (agent takes 50 steps when 5 would suffice — mitigate with cost budgets). Cascading failures (one bad tool result poisons subsequent reasoning — mitigate with error recovery and result validation).
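One mitigation from the list above, loop detection, can be sketched as a counter over (tool, arguments) pairs; the class and threshold are illustrative, not any framework's API.

```python
from collections import Counter

class LoopDetector:
    """Flags the run when the same (tool, args) call repeats too often."""

    def __init__(self, max_repeats=3):
        self.max_repeats = max_repeats
        self.seen = Counter()

    def record(self, tool, args):
        # Hash the call by tool name plus sorted arguments so argument
        # order doesn't hide a repeat.
        key = (tool, tuple(sorted(args.items())))
        self.seen[key] += 1
        return self.seen[key] > self.max_repeats  # True = likely stuck
```

A detector like this complements a raw step limit: the step cap bounds cost, while repeat detection catches the stuck agent early, while there is still budget left to recover.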
Related
- Orchestration — agents add autonomous decision-making to orchestration
- Harness Design — agent infrastructure builds on the harness
- Failure Mode Reasoning — agents have the most complex failure modes