ModelTerms

Comparison

Data Contamination vs MMLU

Data Contamination and MMLU are both common AI/LLM evaluation terms, but they name different things: one is a problem that distorts benchmark scores, the other is a benchmark. Here is a quick side-by-side.

When you would reach for Data Contamination

Data Contamination comes up when the question is whether an evaluation score can be trusted, that is, whether benchmark questions or answers leaked into the model's training data.

Example: MMLU questions appearing verbatim in pretraining data crawls.
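A verbatim leak of this kind can be checked mechanically. The sketch below is one simple way to do it, not a standard tool: it flags a benchmark question as contaminated when any run of `n` consecutive normalized tokens from the question also appears in a crawled document. The function names and the window size `n = 8` are assumptions chosen for illustration, and a real check would need to scale to a full corpus and handle paraphrased leaks, which this does not.

```python
import re


def normalize(text: str) -> list[str]:
    """Lowercase and keep only alphanumeric tokens, so that casing and
    punctuation differences do not hide an otherwise verbatim match."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(tokens: list[str], n: int) -> set[tuple[str, ...]]:
    """All windows of n consecutive tokens."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def is_contaminated(question: str, document: str, n: int = 8) -> bool:
    """Return True if the question overlaps verbatim with the document.

    n is a hypothetical threshold: a shared run of n tokens counts as a leak.
    """
    q_tokens = normalize(question)
    d_tokens = normalize(document)
    if not q_tokens:
        return False
    if len(q_tokens) < n:
        # Question shorter than one window: require the whole question to
        # appear as a contiguous token sequence in the document.
        padded_q = " " + " ".join(q_tokens) + " "
        padded_d = " " + " ".join(d_tokens) + " "
        return padded_q in padded_d
    return bool(ngrams(q_tokens, n) & ngrams(d_tokens, n))


# Usage with made-up strings (not real MMLU items):
question = "Which of the following is the capital of France and its largest city?"
crawled_page = "Quiz dump: which of the following is the capital of France and its largest city? A. Paris"
flagged = is_contaminated(question, crawled_page)
```

Set intersection keeps the check easy to read; for a large corpus the document n-grams would typically be hashed or indexed once rather than rebuilt per question.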

When you would reach for MMLU

MMLU comes up when the question is how well a model performs across a broad range of subjects, or how two models compare on a widely cited multiple-choice benchmark.

Example: GPT-4 scored 86.4% on MMLU (5-shot, original release).
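To make "5-shot" and the percentage concrete: the model sees five solved multiple-choice examples before the test question, and the score is the fraction of test questions answered correctly. The sketch below illustrates both steps. The prompt layout and the names `format_item`, `build_few_shot_prompt`, and `accuracy` are assumptions for illustration, not the format of any official evaluation harness.

```python
from typing import Optional, Sequence

CHOICE_LETTERS = "ABCD"


def format_item(question: str, options: Sequence[str],
                answer: Optional[str] = None) -> str:
    """Render one multiple-choice item; leave the answer blank for the
    test question so the model has to fill it in."""
    lines = [question]
    lines += [f"{letter}. {opt}" for letter, opt in zip(CHOICE_LETTERS, options)]
    lines.append(f"Answer: {answer}" if answer else "Answer:")
    return "\n".join(lines)


def build_few_shot_prompt(examples: Sequence[tuple], question: str,
                          options: Sequence[str], k: int = 5) -> str:
    """Prepend k solved examples (question, options, answer letter)
    to the unsolved test question."""
    shots = [format_item(q, opts, ans) for q, opts, ans in examples[:k]]
    return "\n\n".join(shots + [format_item(question, options)])


def accuracy(predictions: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of predicted answer letters that match the gold letters."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("cannot score an empty set of questions")
    return sum(p == g for p, g in zip(predictions, gold)) / len(gold)


# Usage with placeholder data: three of four answers correct.
score = accuracy(["A", "B", "C", "D"], ["A", "B", "C", "A"])  # 0.75
```

If the solved examples or the test questions were in the model's training data, this accuracy no longer measures what it appears to, which is where data contamination enters.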

Frequently asked

What is the difference between Data Contamination and MMLU?

Data Contamination: Data contamination is when benchmark questions or answers leak into a model's pretraining corpus, inflating its score because the model has memorized the answers rather than reasoned its way to them.

MMLU: MMLU is a benchmark of ~16K multiple-choice questions across 57 subjects, ranging from elementary to professional level. It is one of the most widely cited LLM benchmarks.

When should I use Data Contamination vs MMLU?

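Data Contamination is the right concept when you are asking whether an evaluation result is trustworthy, for example whether test questions leaked into the training corpus. MMLU applies when you are asking how a model performs across many subjects or comparing models on a common benchmark.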

Are Data Contamination and MMLU the same thing?

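No. Both belong to evaluation, but they are different kinds of thing: Data Contamination is a problem that can inflate benchmark scores, while MMLU is a specific benchmark. They are related because MMLU is one of the benchmarks that contamination can affect.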