Comparison

Instruction Tuning vs Pretraining

Instruction Tuning and Pretraining are both stages of training a large language model, but they happen at different points and serve different goals. Here is a quick side-by-side.

When you would reach for Instruction Tuning

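Instruction Tuning comes up when the question is about making an already pretrained base model follow natural-language directions, rather than about teaching it language in the first place.

Example: Flan-T5, a version of Google's T5 fine-tuned to follow instructions.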

When you would reach for Pretraining

Pretraining comes up when the question is about how a model acquires its general language ability in the first place, before any later adaptation.

Example: GPT-3 was pretrained on roughly 300B tokens.

Frequently asked

What is the difference between Instruction Tuning and Pretraining?

Instruction Tuning: Instruction tuning is fine-tuning on (instruction, desired response) pairs so that a base model learns to follow natural-language directions. Pretraining: Pretraining is the initial training phase in which an LLM learns to predict the next token over a very large corpus of general text, from hundreds of billions to trillions of tokens. It produces a base model that can be adapted later.
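The difference can be made concrete by looking at which tokens each stage is trained to predict. The following is a minimal sketch, not any particular library's implementation. The function names and the integer token IDs are hypothetical, and masking the instruction tokens out of the loss is one common choice rather than a universal rule. The value -100 mirrors the default ignore_index of PyTorch's cross-entropy loss.

```python
from typing import List, Tuple

# Hypothetical "skip this position" label; -100 mirrors PyTorch's default ignore_index.
IGNORE = -100


def pretraining_labels(tokens: List[int]) -> Tuple[List[int], List[int]]:
    """Next-token prediction: every position predicts the token that follows it."""
    return tokens[:-1], tokens[1:]


def instruction_tuning_labels(
    prompt: List[int], response: List[int]
) -> Tuple[List[int], List[int]]:
    """Same next-token objective, but only response tokens count toward the loss.

    Assumption: the instruction (prompt) tokens are masked out. Some setups
    train on the prompt tokens as well.
    """
    tokens = prompt + response
    inputs, labels = tokens[:-1], tokens[1:]
    # labels[i] is tokens[i + 1]; it is a prompt token when i + 1 < len(prompt).
    masked = [
        IGNORE if i + 1 < len(prompt) else label
        for i, label in enumerate(labels)
    ]
    return inputs, masked


if __name__ == "__main__":
    # Made-up token IDs, for illustration only.
    print(pretraining_labels([5, 6, 7, 8]))
    print(instruction_tuning_labels([1, 2, 3], [4, 5]))
```

In this sketch both stages use the same next-token objective. What changes is the data (raw text versus instruction and response pairs) and which positions contribute to the loss.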

When should I use Instruction Tuning vs Pretraining?

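Instruction Tuning is the right concept when you already have a base model and want it to follow natural-language directions. Pretraining applies when you are focused on how that base model is built in the first place, by learning to predict the next token on general text.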

Are Instruction Tuning and Pretraining the same thing?

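No. Both are training stages, but Pretraining comes first and produces a general-purpose base model, while Instruction Tuning is a later fine-tuning step that teaches that base model to follow directions. They are related but address different parts of the training pipeline.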