Comparison

Preference Data vs Synthetic Data

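Preference Data and Synthetic Data are both kinds of LLM training data, but they describe different things. Preference Data is defined by what a record contains: a chosen and a rejected response to the same prompt. Synthetic Data is defined by where a record comes from: a model rather than a human author. Here is a quick side-by-side.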

When you would reach for Preference Data

Preference Data comes up when you are training a model to prefer one response over another, as in DPO or reward-model training.

Example: Anthropic's HH-RLHF dataset (~170K preference pairs).
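A single preference record pairs one prompt with a preferred and a dispreferred response. The sketch below shows one way to hold such a record in Python and write it out as a JSON line. The class name `PreferencePair`, the field names, and the example text are illustrative assumptions; real datasets may lay out their records differently.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class PreferencePair:
    """One preference record: two responses to the same prompt, one preferred."""
    prompt: str
    chosen: str    # the response the annotator preferred
    rejected: str  # the response the annotator ranked lower


def to_jsonl_line(pair: PreferencePair) -> str:
    """Serialize one pair as a single JSON line (one record per line on disk)."""
    return json.dumps(asdict(pair), ensure_ascii=False)


# Hypothetical record, invented for illustration only.
example = PreferencePair(
    prompt="Explain what a hash table is.",
    chosen="A hash table maps keys to values by hashing each key to a bucket index.",
    rejected="It is a kind of table.",
)
line = to_jsonl_line(example)
```

Keeping the prompt shared between both responses is the point of the structure: the training signal is the comparison, not either response alone.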

When you would reach for Synthetic Data

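Synthetic Data comes up when you need training data that is generated by a model rather than written by hand, for example distilled instructions, test-filtered code, or reasoning traces from a stronger model.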

Example: Phi-3, which was trained heavily on textbook-quality synthetic data.
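One common recipe for synthetic code data is to generate many candidate solutions and keep only those that pass tests. The sketch below illustrates that filtering step under simplifying assumptions: the candidate strings are hard-coded stand-ins for model output, and the function name `add` and its checks are hypothetical. It is not a description of how any particular model's data was built.

```python
from typing import Dict, List

# Hypothetical candidates; in practice these strings would be sampled from a model.
CANDIDATES: List[str] = [
    "def add(a, b):\n    return a + b\n",
    "def add(a, b):\n    return a - b\n",  # wrong: subtracts instead of adding
    "def add(a, b)\n    return a + b\n",   # wrong: missing colon, will not compile
]


def passes_tests(source: str) -> bool:
    """Return True if the candidate defines `add` and passes the unit checks."""
    namespace: Dict[str, object] = {}
    try:
        # NOTE: exec is only acceptable here because the input is trusted;
        # real model output should run in a sandbox.
        exec(source, namespace)
        add = namespace["add"]
        return add(2, 3) == 5 and add(-1, 1) == 0
    except Exception:
        # Syntax errors, missing names, and runtime errors all count as failures.
        return False


# Only candidates that pass every check are kept as synthetic training examples.
kept = [src for src in CANDIDATES if passes_tests(src)]
```

The tests act as an automatic quality filter, which is what lets this kind of data be produced without a human reviewing each example.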

Frequently asked

What is the difference between Preference Data and Synthetic Data?

Preference Data is a collection of (chosen, rejected) response pairs for the same prompt. It is the fuel for DPO and reward-model training.

Synthetic Data is training data produced by a model rather than handwritten by humans: instructions distilled from GPT-4, code generated and filtered by tests, or reasoning traces sampled from a stronger model.
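To make the link between preference pairs and DPO concrete, here is a minimal sketch of the DPO loss for a single (chosen, rejected) pair. It assumes you already have the summed log-probability of each response under the policy being trained and under a frozen reference model; the function name, argument names, and the default `beta=0.1` are illustrative choices, not taken from any specific library.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,
) -> float:
    """DPO loss for one preference pair, given sequence log-probabilities."""
    # How much more (or less) likely each response is under the policy
    # than under the reference model.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # Positive margin means the policy favors the chosen response
    # more strongly than the reference does.
    margin = beta * (chosen_logratio - rejected_logratio)

    # -log(sigmoid(margin)) == log(1 + exp(-margin)), in a numerically stable form.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
```

The loss falls as the policy shifts probability toward the chosen response and away from the rejected one, relative to the reference model. When the policy and reference agree exactly, the margin is zero and the loss equals log 2.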

When should I use Preference Data vs Synthetic Data?

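Use Preference Data when the goal is to teach a model which of two responses is better, as in DPO or reward-model training. Use Synthetic Data when the goal is to obtain training examples from a model instead of from human writers.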

Are Preference Data and Synthetic Data the same thing?

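No. Preference Data describes what a training example contains: a chosen and a rejected response to the same prompt. Synthetic Data describes how a training example was produced: by a model rather than a human. They are related, since both are training data, but they answer different questions.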