ModelTerms

Training · intermediate

Instruction Tuning

Instruction tuning is fine-tuning on examples of (instruction, desired response) pairs so a base model learns to follow natural-language directions.

Explanation

A base LLM trained only on next-token prediction tends to complete text rather than follow instructions. If you prompt "Write a poem about the sea," a base model might continue "...and other writing tips for beginners." Instruction tuning explicitly teaches the model that "Write a poem about the sea" is a request to produce a poem.
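To make the (instruction, desired response) pairing concrete, here is a minimal sketch of how one pair might be turned into a single training sequence. The prompt template, the `<eos>` marker, the whitespace "tokenizer", and the function name `build_example` are all illustrative assumptions, not the format of any particular model. Counting loss only on the response tokens is a common choice, but not a universal one.

```python
# Hypothetical prompt template; each real project defines its own format.
TEMPLATE = "### Instruction:\n{instruction}\n\n### Response:\n"
EOS = "<eos>"  # assumed end-of-sequence marker


def build_example(instruction, response):
    """Turn one (instruction, response) pair into tokens plus a loss mask.

    A whitespace split stands in for a real tokenizer (assumption).
    loss_mask[i] == 1 means token i counts toward the training loss.
    """
    prompt_tokens = TEMPLATE.format(instruction=instruction).split()
    response_tokens = response.split() + [EOS]

    tokens = prompt_tokens + response_tokens
    # Prompt positions are masked out, so the model is trained to produce
    # the response rather than to reproduce the instruction.
    loss_mask = [0] * len(prompt_tokens) + [1] * len(response_tokens)
    return tokens, loss_mask


tokens, loss_mask = build_example(
    "Write a poem about the sea.",
    "Waves fold into the shore.",
)

# Show which tokens would contribute to the loss.
for token, keep in zip(tokens, loss_mask):
    print(f"{keep}  {token}")
```

The training objective is still next-token prediction. What changes is the data: every sequence pairs a request with the behavior expected in reply.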

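The training data is usually a mix of human-authored and synthetic examples. Human-authored sources include FLAN, which recasts existing NLP datasets using instruction templates, and OpenAssistant, a crowd-sourced corpus of assistant-style conversations; synthetic examples are typically distilled from a larger model. Most public chat models, such as Llama-3-Instruct and Mistral-Instruct, are instruction-tuned versions of their corresponding base models.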

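Instruction tuning typically comes before RLHF in the post-training pipeline: the model first learns to follow instructions from demonstrations, and RLHF then refines its responses using human preference feedback.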

Examples

  • Flan-T5: Google's T5 fine-tuned on the FLAN instruction collection.
  • Llama-3-8B-Instruct: the instruction-tuned variant of Llama-3-8B.

Frequently asked

How is Instruction Tuning related to Fine-tuning?

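Instruction tuning is a specific kind of fine-tuning. Fine-tuning in general continues training a pretrained model on a smaller, task-specific dataset, adjusting its weights to specialize behavior or knowledge. Instruction tuning applies that process to a dataset of (instruction, desired response) pairs, so the behavior being specialized is instruction following itself.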

Is Instruction Tuning considered intermediate?

Instruction Tuning is generally considered intermediate-level material in the AI and LLM space.
