Infrastructure · intermediate
LLM Gateway (model gateway, AI proxy)
An LLM gateway is a proxy layer that sits between application code and one or more LLM providers — handling auth, rate-limit retries, cost tracking, observability, prompt caching, model routing, and PII redaction.
Explanation
Production LLM apps quickly accumulate cross-cutting concerns: which provider has spare quota, how to retry on 429s, how to track per-tenant cost, how to redact PII before sending, how to log every call, how to A/B test models. An LLM gateway centralizes them.
Popular implementations: LiteLLM (open-source proxy + SDK), Portkey, Helicone, Cloudflare AI Gateway, Kong AI Gateway, AWS Bedrock's built-in gateway features.
Application code points at the gateway with one base URL; the gateway handles fallbacks, routing, observability, and policies behind it. Trade-off: another hop in the latency budget (~10-50ms added).
Examples
- LiteLLM proxy fronting OpenAI, Anthropic, and Bedrock with unified billing dashboards and automatic retry-on-fallback.
- A multi-tenant SaaS routing all LLM calls through Cloudflare AI Gateway for caching + analytics.
When to use llm gateway
When you use multiple providers, need per-tenant cost attribution, or want centralized observability/PII policies.
Frequently asked
What is LLM Gateway?
An LLM gateway is a proxy layer that sits between application code and one or more LLM providers — handling auth, rate-limit retries, cost tracking, observability, prompt caching, model routing, and PII redaction.
What is an example of llm gateway?
LiteLLM proxy fronting OpenAI, Anthropic, and Bedrock with unified billing dashboards and automatic retry-on-fallback.
How is LLM Gateway related to LLM Observability?
LLM Gateway and LLM Observability are both infrastructure concepts. LLM observability is the practice of capturing, analyzing, and acting on every LLM call in a production system — inputs, outputs, latencies, costs, errors, and quality scores — so you can debug regressions and improve quality over time.
When should I use llm gateway?
When you use multiple providers, need per-tenant cost attribution, or want centralized observability/PII policies.
Is LLM Gateway considered intermediate?
LLM Gateway is generally considered intermediate-level material in the AI and LLM space.