Training · intermediate
Instruction Tuning
Instruction tuning is fine-tuning on examples of (instruction, desired response) pairs so a base model learns to follow natural-language directions.
Explanation
A base LLM trained only on next-token prediction tends to complete text rather than follow instructions. If you prompt "Write a poem about the sea," a base model might continue "...and other writing tips for beginners." Instruction tuning explicitly teaches the model that "Write a poem about the sea" is a request to produce a poem.
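To make the (instruction, desired response) pairing concrete, here is a minimal sketch of how one training example might be laid out as text. The template string and the `format_example` helper are assumptions for illustration only. Real datasets and models each define their own format.

```python
# Hypothetical prompt template; real datasets and models each define their own.
TEMPLATE = "### Instruction:\n{instruction}\n\n### Response:\n"


def format_example(instruction: str, response: str) -> dict:
    """Split one (instruction, response) pair into the prompt the model
    conditions on and the completion it is trained to produce."""
    prompt = TEMPLATE.format(instruction=instruction.strip())
    return {"prompt": prompt, "completion": response.strip()}


example = format_example(
    "Write a poem about the sea.",
    "Grey waves fold over the harbour wall...",
)
# The full training text is the prompt followed by the desired response.
print(example["prompt"] + example["completion"])
```

Seeing many examples framed this way is what teaches the model to treat the instruction as a request to answer rather than as text to continue.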
The training data is usually a mix of existing NLP datasets recast with instruction templates (FLAN), human-written conversations (OpenAssistant), and instructions distilled from a larger model. Most public chat models, such as Llama-3-Instruct and Mistral-Instruct, are instruction-tuned versions of their corresponding base models.
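In the post-training pipeline, instruction tuning typically comes before RLHF: the model first learns to follow instructions from supervised examples, and preference-based training then refines its responses.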
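Instruction tuning is supervised training on token sequences. One common convention, not stated in this entry and shown here only as an assumption, is to compute the loss on the response tokens alone by masking the prompt positions in the labels. The value -100 matches the default `ignore_index` of PyTorch's cross-entropy loss. The `build_labels` helper and the token ids are hypothetical.

```python
IGNORE_INDEX = -100  # default ignore_index of PyTorch's CrossEntropyLoss


def build_labels(prompt_ids: list[int], response_ids: list[int]) -> tuple[list[int], list[int]]:
    """Concatenate prompt and response token ids, masking the prompt
    positions so only response tokens contribute to the loss."""
    input_ids = prompt_ids + response_ids
    labels = [IGNORE_INDEX] * len(prompt_ids) + response_ids
    return input_ids, labels


# Made-up token ids, for illustration only.
input_ids, labels = build_labels([11, 12, 13], [21, 22])
print(input_ids)  # [11, 12, 13, 21, 22]
print(labels)     # [-100, -100, -100, 21, 22]
```

With this masking, the model is not penalized for failing to predict the instruction itself, only for how well it produces the desired response given that instruction.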
Examples
- Flan-T5: Google's T5 fine-tuned on the FLAN instruction collection so that it follows instructions.
- Llama-3-8B-Instruct: the instruction-tuned variant of Llama-3-8B.
Frequently asked
What is Instruction Tuning?
Instruction tuning is fine-tuning on examples of (instruction, desired response) pairs so a base model learns to follow natural-language directions.
What is an example of instruction tuning?
Flan-T5 is one example: Google's T5 fine-tuned on the FLAN instruction collection so that it follows instructions.
How is Instruction Tuning related to Fine-tuning?
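Instruction tuning is a specific kind of fine-tuning. Fine-tuning continues training a pretrained model on a smaller, task-specific dataset, adjusting its weights to specialize behavior or knowledge. Instruction tuning does this with (instruction, desired response) pairs, so the model learns to follow directions in general rather than to perform one narrow task.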
Is Instruction Tuning considered intermediate?
Instruction Tuning is generally considered intermediate-level material in the AI and LLM space.