Comparison

Alignment vs Preference Data

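Alignment and Preference Data are both common AI/LLM terms, but they sit at different levels: alignment is a goal for how a model behaves, while preference data is one kind of training input used to pursue that goal. Here is a quick side-by-side.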

When you would reach for Alignment

Alignment comes up when the question is about safety: whether a model's behavior matches what people actually want from it.

Example: tuning a model to refuse to help with bioweapon synthesis.

When you would reach for Preference Data

Preference Data comes up when the question is about training: which examples a model or reward model learns from.

Example: Anthropic HH-RLHF, a dataset of roughly 170K preference pairs.

Frequently asked

What is the difference between Alignment and Preference Data?

Alignment is the problem of making an AI system pursue what humans actually want rather than the literal letter of its training objective. RLHF and Constitutional AI are alignment techniques.

Preference data is a collection of (chosen, rejected) response pairs, where both responses in a pair answer the same prompt. It is the fuel for DPO and reward-model training.
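To make the (chosen, rejected) structure concrete, here is a minimal Python sketch of one preference pair and the pairwise (Bradley-Terry style) loss often used when training a reward model on such pairs. The names PreferencePair and pairwise_loss, the example strings, and the scores are illustrative assumptions, not taken from any particular library or dataset.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PreferencePair:
    """One preference record: two responses to the same prompt (hypothetical schema)."""
    prompt: str
    chosen: str    # the response the annotator preferred
    rejected: str  # the response the annotator did not prefer


def pairwise_loss(score_chosen: float, score_rejected: float) -> float:
    """Return -log(sigmoid(score_chosen - score_rejected)).

    The loss is small when the chosen response scores higher than the
    rejected one, and large when the ordering is reversed.
    """
    margin = score_chosen - score_rejected
    # softplus(-margin) equals -log(sigmoid(margin)); this form avoids overflow.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Illustrative record; the text is made up for this sketch.
pair = PreferencePair(
    prompt="How do I reset my router?",
    chosen="Hold the reset button for about ten seconds, then wait for it to restart.",
    rejected="Routers are complicated.",
)

# Assumed scores, standing in for the output of a reward model.
loss_good = pairwise_loss(score_chosen=2.0, score_rejected=-1.0)  # ordering agrees with the label
loss_bad = pairwise_loss(score_chosen=-1.0, score_rejected=2.0)   # ordering contradicts the label
print(loss_good < loss_bad)
```

In this sketch the scores are passed in by hand. In real training they would come from the reward model being fit, and the loss would be averaged over many pairs.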

When should I use Alignment vs Preference Data?

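Alignment is the right concept when you are asking whether a model behaves safely and as intended. Preference Data is the right concept when you are deciding what to train on, for example when assembling pairs for DPO or a reward model.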

Are Alignment and Preference Data the same thing?

No. Alignment is a safety goal; Preference Data is a training input. They are related, since preference data feeds alignment techniques such as RLHF, but they address different parts of the AI stack.