Inference · intermediate
Structured Output (JSON mode, structured generation)
Structured output constrains an LLM to emit text that matches a schema, usually JSON. When the constraint is enforced at decode time, every response that finishes normally is guaranteed to parse against the schema, so your code does not need parse-and-retry loops.
Explanation
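Asking a model to "respond in JSON" works most of the time, but it occasionally fails: the model wraps the JSON in prose or code fences, drops a required field, or stops mid-object. Structured-output tools (OpenAI Structured Outputs, llama.cpp grammars, Outlines) constrain sampling at decode time: tokens that would make the output invalid are masked out before they can be sampled. Tool-use APIs such as Anthropic's return arguments shaped by a declared schema; whether that shape is enforced at decode time depends on the provider and mode.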
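This trades a small amount of quality (some token choices are ruled out) for schema compliance on every response that finishes normally, which eliminates an entire class of production retries and parsing bugs. Responses cut off by a token limit still need handling.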
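The masking step can be illustrated with a toy decoder. This is a minimal sketch, not how any particular library implements it: the vocabulary, the `fake_logits` scoring function, and the prefix-based validity check are all hypothetical stand-ins for a real tokenizer, model, and grammar engine.

```python
import json
import math

# Hypothetical toy setup: none of these names come from a real library.
VALID_OUTPUTS = ['{"label":"positive"}', '{"label":"negative"}']
VOCAB = ['{"label":"', 'positive', 'negative', 'maybe', '"}', 'Sure! ']


def fake_logits(prefix):
    """Stand-in for a model. It ignores the prefix and prefers chatty tokens."""
    scores = {'Sure! ': 3.0, 'maybe': 2.5, '{"label":"': 2.0,
              'positive': 1.5, 'negative': 1.0, '"}': 0.5}
    return [scores[token] for token in VOCAB]


def is_allowed(prefix, token):
    """A token is allowed if prefix + token can still grow into a valid output."""
    candidate = prefix + token
    return any(valid.startswith(candidate) for valid in VALID_OUTPUTS)


def greedy_decode(constrain=True, max_steps=10):
    text = ""
    for _ in range(max_steps):
        if text in VALID_OUTPUTS:
            break  # a complete, valid output has been produced
        logits = fake_logits(text)
        if constrain:
            # Mask invalid tokens by setting their score to -inf.
            logits = [score if is_allowed(text, token) else -math.inf
                      for score, token in zip(logits, VOCAB)]
        best = max(range(len(VOCAB)), key=lambda i: logits[i])
        if logits[best] == -math.inf:
            raise RuntimeError("no valid token available")
        text += VOCAB[best]
    return text


constrained = greedy_decode(constrain=True)
unconstrained = greedy_decode(constrain=False, max_steps=3)
print(constrained)
print(json.loads(constrained))
print(repr(unconstrained))
```

Without the mask, the toy model keeps choosing its highest-scoring token and never produces JSON. With the mask, the same scores yield a parseable object, because off-schema tokens are never eligible.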
Use it whenever the downstream consumer is code rather than a human.
Examples
- OpenAI Structured Outputs with a Pydantic model or a JSON Schema.
- Anthropic tool_use blocks returning typed parameters.
- llama.cpp GBNF grammars enforcing valid JSON.
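As a rough illustration of the first item in the list above, the sketch below builds a JSON Schema and wraps it in a `response_format` payload. The payload shape follows what OpenAI documents for Structured Outputs as far as I know, so treat it as an assumption and check the current API reference before relying on it. The `ticket` schema and the `build_response_format` helper are hypothetical, and no API call is made.

```python
import json

# JSON Schema for a hypothetical ticket-extraction task (field names are illustrative).
TICKET_SCHEMA = {
    "type": "object",
    "properties": {
        "category": {"type": "string", "enum": ["billing", "bug", "other"]},
        "urgent": {"type": "boolean"},
        "summary": {"type": "string"},
    },
    # Strict mode is assumed to require every property to be listed here
    # and additionalProperties to be false.
    "required": ["category", "urgent", "summary"],
    "additionalProperties": False,
}


def build_response_format(name, schema):
    """Wrap a schema in the assumed `response_format` shape for a chat request."""
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "strict": True, "schema": schema},
    }


response_format = build_response_format("ticket", TICKET_SCHEMA)
print(json.dumps(response_format, indent=2))
```

The resulting dictionary would be passed as the `response_format` argument of a chat completion request. A Pydantic model can generate an equivalent schema, which avoids maintaining the dictionary by hand.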
When to use structured output
Any time you need to programmatically parse model output — extraction, function arguments, classification, multi-step pipelines.
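On the consuming side, a classification pipeline might parse the model's text into a typed object, as in the sketch below. The label set, the `Classification` dataclass, and the `parse_classification` helper are hypothetical. The explicit checks are kept as a safety net for providers or modes that do not enforce the schema at decode time.

```python
import json
from dataclasses import dataclass

ALLOWED_LABELS = {"positive", "negative", "neutral"}  # hypothetical label set


@dataclass(frozen=True)
class Classification:
    label: str
    confidence: float


def parse_classification(raw: str) -> Classification:
    """Parse model output into a typed object, raising ValueError on any mismatch."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    label = data.get("label")
    confidence = data.get("confidence")
    if not isinstance(label, str) or label not in ALLOWED_LABELS:
        raise ValueError(f"unexpected label: {label!r}")
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise ValueError("confidence must be a number")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return Classification(label=label, confidence=float(confidence))


result = parse_classification('{"label": "positive", "confidence": 0.92}')
print(result)
```

Downstream steps can then depend on `result.label` and `result.confidence` rather than on raw strings, and every failure surfaces as one exception type.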
Frequently asked
What is Structured Output?
Structured output constrains an LLM to emit text that matches a schema, usually JSON. When the constraint is enforced at decode time, every response that finishes normally is guaranteed to parse against the schema, so your code does not need parse-and-retry loops.
What is an example of structured output?
One example is OpenAI Structured Outputs used with a Pydantic model or a JSON Schema.
How is Structured Output related to Function Calling?
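Structured Output and Function Calling are both inference concepts. Function calling is the API mechanism by which an LLM emits a structured request to invoke a named function with typed arguments; it is the OpenAI-popularized way to do tool use. The function arguments are themselves a form of structured output, since they must match the function's declared parameter schema.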
When should I use structured output?
Any time you need to programmatically parse model output — extraction, function arguments, classification, multi-step pipelines.
Is Structured Output considered intermediate?
Structured Output is generally considered intermediate-level material in the AI and LLM space.