CapabilityAtlas
Architecture & Systems Fundamentals

Orchestration

Coordinating multi-step LLM workflows: sequencing, parallelization, dependency and state management.

Orchestration — Competence

What an interviewer or hiring manager expects you to know.

Core Knowledge

  • What orchestration means for LLMs. Coordinating multi-step LLM workflows: sequencing (step B depends on step A’s output), parallelization (steps B and C can run simultaneously), branching (different steps based on classification results), error recovery (what happens when step 3 of 7 fails), and state management (carrying context across steps). This is the LLM equivalent of workflow orchestration in distributed systems — same patterns (DAGs, state machines, queues), new constraints (non-deterministic outputs, token limits, cost accumulation).
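The DAG framing above can be made concrete with Python's stdlib `graphlib`; the step names and dependencies here are hypothetical, but the sketch shows how a topological sort both orders steps and surfaces which ones can run in parallel.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: B and C each depend only on A, D combines both.
# The mapping is node -> set of predecessors.
deps = {
    "B": {"A"},        # step B needs step A's output
    "C": {"A"},        # step C also needs only A, so B and C can run concurrently
    "D": {"B", "C"},   # step D combines the outputs of B and C
}

ts = TopologicalSorter(deps)
ts.prepare()
order = []
while ts.is_active():
    ready = list(ts.get_ready())   # steps whose dependencies are all satisfied
    order.append(sorted(ready))    # everything in `ready` could run in parallel
    ts.done(*ready)

print(order)  # [['A'], ['B', 'C'], ['D']]
```

Each inner list is one "wave" of execution: the second wave is exactly the parallelization opportunity the bullet describes.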

  • Orchestration frameworks. LangChain LCEL (LangChain Expression Language — composable chains with pipe operator, supports streaming, batch, async, and fallback; good for linear/branching workflows), LangGraph (graph-based orchestration for stateful, cyclic workflows — the agent framework built on LangChain; supports checkpointing, human-in-the-loop nodes, and branching), LlamaIndex Workflows (event-driven orchestration with @step decorators, async-first, streaming), CrewAI (role-based multi-agent orchestration — define agents with roles/goals/tools, coordinate via tasks), AutoGen (Microsoft’s multi-agent conversation framework — agents communicate via message passing), Temporal.io (durable workflow engine — not LLM-specific but excellent for long-running AI workflows with human checkpoints and retry guarantees), Prefect/Airflow (traditional pipeline orchestrators adapted for LLM workflows — better for batch than real-time).

  • Common orchestration patterns. Sequential chain (A → B → C — simplest, e.g., extract → validate → format). Map-reduce (split input into chunks, process in parallel, combine results — for large document processing). Router (classify input, route to specialized handler — e.g., question type → appropriate expert prompt). Evaluator-optimizer loop (generate → evaluate → regenerate if below threshold — for quality-critical outputs). Plan-then-execute (LLM generates a plan, then executes steps one at a time — the basis of most agent patterns). Human-in-the-loop (automated steps interspersed with human review checkpoints — Skill 17).
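One of the patterns above, the evaluator-optimizer loop, can be sketched without any framework. `generate` and `evaluate` are stand-ins for real LLM calls (the toy implementations below are purely illustrative); the loop structure is what matters.

```python
from typing import Callable

def evaluator_optimizer(
    generate: Callable[[str, str], str],   # (task, feedback) -> draft; stands in for an LLM call
    evaluate: Callable[[str], float],      # draft -> quality score in [0, 1]
    task: str,
    threshold: float = 0.8,
    max_rounds: int = 3,
) -> str:
    """Generate -> evaluate -> regenerate until the score clears the threshold."""
    feedback = ""
    draft = ""
    for _ in range(max_rounds):
        draft = generate(task, feedback)
        score = evaluate(draft)
        if score >= threshold:
            return draft
        feedback = f"previous draft scored {score:.2f}; improve it"
    return draft  # best effort after max_rounds

# Toy stand-ins: the "model" produces a better draft each round.
drafts = iter(["rough draft", "better draft", "polished draft"])
result = evaluator_optimizer(
    generate=lambda task, fb: next(drafts),
    evaluate=lambda d: {"rough draft": 0.4, "better draft": 0.6, "polished draft": 0.9}[d],
    task="summarize the report",
)
print(result)  # polished draft
```

The `max_rounds` cap is essential in practice: without it, a miscalibrated evaluator turns the loop into unbounded token spend.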

  • State management across steps. Each step in an orchestration receives context from previous steps and passes context to the next. Challenges: token budget grows with each step (accumulated context), inconsistent state when parallel steps modify shared state, and lost context when a step fails and the workflow retries. Solutions: explicit state objects (Pydantic models carrying only the fields each step needs), conversation memory (LangChain memory modules, LlamaIndex ChatMemoryBuffer), checkpointing (LangGraph’s built-in checkpoint system, Temporal’s durable state), and context summarization (periodically summarize accumulated context to stay within token limits).
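A minimal explicit-state sketch, using stdlib dataclasses rather than the Pydantic models the text suggests (the shape is the same). Each step reads only the fields it needs and returns a new immutable state, which is what prevents parallel branches from corrupting shared data; the step functions are hypothetical stand-ins.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class WorkflowState:
    query: str
    context: str = ""
    answer: str = ""

def retrieve(state: WorkflowState) -> WorkflowState:
    # Stand-in for a retrieval step; a real one would query a vector store.
    return replace(state, context=f"docs relevant to: {state.query}")

def generate(state: WorkflowState) -> WorkflowState:
    # Stand-in for an LLM call that consumes only query + context.
    return replace(state, answer=f"answer({state.query!r}, using {state.context!r})")

state = WorkflowState(query="refund policy")
for step in (retrieve, generate):
    state = step(state)

print(state.answer)
```

Because each step returns a fresh state rather than mutating one, a failed step can simply be re-run from its input state, which is also what makes checkpointing straightforward.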

  • Error handling in multi-step workflows. When step 3 of 7 fails: retry the step (with the same patterns from Skill 2 — exponential backoff, circuit breaker), skip the step and continue with degraded output (if the step is optional), fall back to a different implementation of the step (cheaper model, simpler prompt), or abort the entire workflow and return a partial result with an explanation. The orchestrator must distinguish retryable errors (API timeout) from non-retryable errors (input is invalid for this step). LangGraph and Temporal both support error handling at the step level.
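The retryable/non-retryable distinction can be sketched as a small step runner. The exception classes, step, and fallback callables are hypothetical; the point is that bad input aborts immediately while transient errors back off exponentially.

```python
import time

class RetryableError(Exception): ...      # e.g. API timeout, rate limit
class NonRetryableError(Exception): ...   # e.g. input invalid for this step

def run_step(step, *, retries=3, base_delay=0.01, fallback=None):
    """Retry a step with exponential backoff; fall back or abort on exhaustion."""
    for attempt in range(retries):
        try:
            return step()
        except NonRetryableError:
            raise                                  # retrying bad input just burns tokens
        except RetryableError:
            time.sleep(base_delay * 2 ** attempt)  # 10ms, 20ms, 40ms, ...
    if fallback is not None:
        return fallback()                          # e.g. cheaper model, simpler prompt
    raise RuntimeError("step failed after retries and no fallback was given")

# Toy demo: the step times out twice, then succeeds on the third attempt.
attempts = []
def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise RetryableError("timeout")
    return "ok"

result = run_step(flaky)
print(result)  # ok
```

In a real pipeline each step would declare its own policy (retry count, fallback, or skip), and the orchestrator would apply `run_step` per the policy rather than one global setting.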

Expected Practical Skills

  • Build a multi-step LLM pipeline. Implement a 3-5 step workflow: e.g., classify input → retrieve context → generate response → validate output → format for delivery. Use LCEL or LangGraph. Handle errors at each step. Pass state between steps. Add logging per step for debugging.
  • Implement parallel execution. Run multiple LLM calls simultaneously (e.g., extract information from 5 documents in parallel, then combine). Use async/await with the Anthropic/OpenAI SDK or LangChain’s batch operations. Manage rate limits across parallel calls.
  • Design a router workflow. Build a classifier that routes inputs to specialized handlers. Implement: classification prompt → routing logic → handler-specific prompts → unified output format. Measure: routing accuracy, per-handler quality, overall latency.
  • Add checkpointing to a long-running workflow. Implement state persistence so that if a 10-step workflow fails at step 7, it can resume from step 7 instead of restarting. Use LangGraph checkpointing or Temporal durable execution.
  • Debug a multi-step failure. Given a bad output from a 5-step pipeline, trace through the execution log to identify which step produced the error. Use LangFuse trace visualization or LangSmith’s step-by-step trace view.
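The parallel-execution skill above can be sketched with stdlib `asyncio`: a semaphore caps in-flight calls to respect rate limits, and `gather` runs the fan-out concurrently while preserving input order. `summarize` is a hypothetical stand-in for an SDK call.

```python
import asyncio

async def summarize(doc: str, sem: asyncio.Semaphore) -> str:
    async with sem:                 # cap concurrent calls to respect rate limits
        await asyncio.sleep(0.01)   # stand-in for the actual LLM request
        return f"summary of {doc}"

async def map_reduce(docs: list[str], max_concurrent: int = 2) -> str:
    sem = asyncio.Semaphore(max_concurrent)
    # Map: fan out over all documents; gather preserves input order.
    summaries = await asyncio.gather(*(summarize(d, sem) for d in docs))
    # Reduce: here a simple join; a real pipeline might make a final combine call.
    return " | ".join(summaries)

result = asyncio.run(map_reduce([f"doc{i}" for i in range(5)]))
print(result)
```

The semaphore is the piece people forget: `gather` alone will happily fire all five requests at once and trip provider rate limits.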

Interview-Ready Explanations

  • “Walk me through how you’d orchestrate a complex multi-step LLM workflow.” Start with decomposition: break the task into atomic steps, each with a clear input/output contract. Identify dependencies (which steps depend on which). Determine parallelization opportunities (independent steps run concurrently). Choose orchestration framework: LCEL for linear/branching, LangGraph for stateful/cyclic, Temporal for long-running with human checkpoints. Implement state management (Pydantic models for step inputs/outputs). Add per-step error handling (retry, fallback, skip). Instrument with LangFuse for end-to-end tracing. Test: unit test each step with mocked inputs, integration test the full pipeline with golden datasets.

  • “How do you handle failures in a multi-step LLM pipeline?” Per-step error policy: each step defines what happens on failure (retry 3x → fallback to simpler approach → skip → abort). Distinguish retryable (timeout, rate limit) from non-retryable (invalid input, content policy block). Checkpoint state so failures don’t restart from scratch. Partial results: if step 5 of 7 fails, return steps 1-4 results with a clear indication of what’s missing. Alert: trigger monitoring when failure rate exceeds baseline. Post-mortem: LangFuse trace shows exactly where and why the failure occurred.

  • “When would you use LangGraph vs. LCEL vs. Temporal vs. building custom?” LCEL: linear or simple branching workflows where each step is a single LLM call. Fast to build, limited control. LangGraph: stateful workflows with cycles (agent loops), human-in-the-loop nodes, or complex branching. Best for agent-like behavior. Temporal: long-running workflows (hours/days), workflows requiring durable execution guarantees, workflows with external system integrations and human checkpoints. Custom: when framework abstractions add more complexity than they remove, or when you need fine-grained control over execution (typically at very high scale or with unusual requirements).
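To illustrate the "building custom" option and the checkpointing skill in one sketch: persist the index of the next step and the current state after every completed step, so a crash at step k resumes at step k rather than step 1. The JSON-file store and step functions are hypothetical; LangGraph checkpoints or Temporal give you this durably in production.

```python
import json
import os
import tempfile

def run_with_checkpoints(steps, state, path):
    """Run steps in order, persisting (next_step, state) after each one."""
    start = 0
    if os.path.exists(path):                     # resume from a prior partial run
        with open(path) as f:
            saved = json.load(f)
        start, state = saved["next_step"], saved["state"]
    for i in range(start, len(steps)):
        state = steps[i](state)                  # run one step
        with open(path, "w") as f:               # checkpoint after it completes
            json.dump({"next_step": i + 1, "state": state}, f)
    return state

path = os.path.join(tempfile.mkdtemp(), "ckpt.json")
steps = [
    lambda s: s + ["classified"],
    lambda s: s + ["retrieved"],
    lambda s: s + ["generated"],
]
out = run_with_checkpoints(steps, [], path)
print(out)  # ['classified', 'retrieved', 'generated']
```

Note the constraint this imposes: state must be serializable, which is another argument for the explicit state objects described under state management.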

Related Skills

  • Harness Design — the harness is the foundation orchestration builds on
  • Agent Architecture — agents are orchestration with autonomous decision-making
  • Prompting — each orchestration step needs a well-designed prompt