Comparison

Hallucination vs Online Evaluation

Hallucination and Online Evaluation are both common AI/LLM terms but cover different ideas. Here is a quick side-by-side.

When you would reach for Hallucination

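Hallucination comes up when the question is whether a model's output is factually correct. The output is stated confidently and sounds plausible, but it is wrong, which makes it a failure mode that evaluation aims to catch.

Example: an LLM citing a paper that does not exist.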

When you would reach for Online Evaluation

Reach for Online Evaluation after offline eval is solid and you have meaningful production volume. It extends your eval coverage from a fixed set to a live one.

Example: Phoenix runs a faithfulness eval on 5% of production RAG traces, and a dashboard charts the rolling 7-day mean.
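The sample-score-aggregate pattern in this example can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Phoenix's API: the function names, the trace dictionary keys (trace_id, timestamp, answer, context), and the token-overlap scorer are all hypothetical stand-ins. A real setup would likely call an LLM-based or library-provided faithfulness evaluator instead.

```python
import hashlib
from datetime import datetime, timedelta

SAMPLE_RATE = 0.05          # 5% of traces, as in the example above
WINDOW = timedelta(days=7)  # rolling 7-day window


def is_sampled(trace_id: str, rate: float = SAMPLE_RATE) -> bool:
    """Deterministically select roughly `rate` of traces by hashing the trace id."""
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")  # integer in [0, 2**64)
    return bucket < rate * 2**64


def faithfulness_score(answer: str, context: str) -> float:
    """Hypothetical stand-in scorer: fraction of answer tokens present in the context."""
    answer_tokens = answer.lower().split()
    if not answer_tokens:
        return 0.0
    context_tokens = set(context.lower().split())
    return sum(t in context_tokens for t in answer_tokens) / len(answer_tokens)


def evaluate_traces(traces, rate: float = SAMPLE_RATE):
    """Score only the sampled traces; returns (timestamp, score) pairs."""
    return [
        (t["timestamp"], faithfulness_score(t["answer"], t["context"]))
        for t in traces
        if is_sampled(t["trace_id"], rate)
    ]


def rolling_mean(scores, now: datetime, window: timedelta = WINDOW):
    """Mean of scores whose timestamp lies within `window` before `now`, or None."""
    recent = [s for ts, s in scores if now - window <= ts <= now]
    return sum(recent) / len(recent) if recent else None
```

Hashing the trace id, rather than drawing a random number per trace, keeps the sampling decision stable if the same trace is seen twice. That is a design choice of this sketch, not something the example above prescribes.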

Frequently asked

What is the difference between Hallucination and Online Evaluation?

Hallucination: A hallucination is a confidently stated, plausible-sounding LLM output that is factually wrong. It is the failure mode that most often surprises non-expert users.

Online Evaluation: Online evaluation runs scoring functions over live production traffic, usually a sample of recent traces, to monitor quality continuously instead of relying solely on a fixed offline dataset.

When should I use Hallucination vs Online Evaluation?

Hallucination is the right concept when you are asking whether a model's outputs are factually wrong. Online Evaluation is the right concept after offline eval is solid and you have meaningful production volume, when you want to extend your eval coverage from a fixed set to a live one.

Are Hallucination and Online Evaluation the same thing?

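No. Both fall under evaluation, but Hallucination is a failure mode (a factually wrong output stated with confidence), while Online Evaluation is a practice for monitoring quality on live production traffic. They are related but address different parts of the AI stack: one names what can go wrong, the other is a way to keep watching for it.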