LLM-as-Judge | CapabilityAtlas

arrow_back LLM-as-Judge

Core Knowledge

What LLM-as-judge is and why it matters. Using one LLM to evaluate the output of another LLM (or the same LLM). It's the scalable alternative to human evaluation — you can score 10,000 outputs in...

build

Expected Practical Skills

open_in_new

Build an LLM-as-judge scoring pipeline. Define a rubric for a specific use case, implement the judging prompt (system prompt with rubric + examples + output to judge), parse the judge's response...

quiz

Interview-Ready Explanations

open_in_new

"Walk me through how you'd design an LLM-as-judge system." Start with the evaluation dimensions (what aspects of quality matter for this use case). For each dimension, write a rubric with 3-5 score...

open_in_new Read full fundamentals