ModelTerms

Training · intermediate

Supervised Fine-Tuning (SFT)

SFT is fine-tuning where each training example has an explicit input and a desired output, supervised by a loss that penalizes deviation from that output.

Explanation

SFT is the workhorse of post-training. The data is typically human-written or human-curated (instruction, response) pairs, and the model is trained with the same next-token-prediction loss as pretraining, but on this narrower distribution. In practice the loss is often computed only on the response tokens, with the instruction tokens masked out.
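The loss described above can be sketched in a few lines. This is a minimal, framework-free illustration, not how any particular training library implements it: the function names `log_softmax` and `sft_loss`, the toy logits, and the `loss_mask` convention (1 for response positions, 0 for prompt positions) are all assumptions made for the example.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one position's logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def sft_loss(logits_per_position, target_ids, loss_mask):
    """Mean next-token cross-entropy over positions where loss_mask is 1.

    With a mask of all ones this reduces to the ordinary
    pretraining-style loss over every position.
    """
    total, count = 0.0, 0
    for logits, target, keep in zip(logits_per_position, target_ids, loss_mask):
        if keep:
            total += -log_softmax(logits)[target]
            count += 1
    if count == 0:
        raise ValueError("loss_mask selects no positions")
    return total / count


# Toy setup (assumed): vocabulary of 3 tokens, 4 target positions.
logits = [
    [2.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
    [0.0, 0.0, 2.0],
    [1.0, 1.0, 1.0],
]
targets = [0, 1, 2, 0]
mask = [0, 0, 1, 1]  # first two positions are prompt tokens, last two are response tokens

loss = sft_loss(logits, targets, mask)
print(f"masked SFT loss: {loss:.4f}")
```

Only the last two positions contribute to the result here, so the model is penalized for deviating from the desired response but not for failing to predict the instruction itself.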

SFT alone can produce a useful chat model. When RLHF is applied afterward, it typically improves the model further by optimizing against a learned reward model rather than imitating specific responses.

Examples

  • Training Llama-3-Base on Anthropic's HH-RLHF "chosen" responses as a first pass.
  • Custom SFT on a company's historical support tickets.
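As a rough sketch of what preparing the support-ticket example might involve, the snippet below turns one ticket into an (input, desired output) training example with a response-only loss mask. The field names `question` and `agent_reply`, the "Customer: / Agent:" template, and whitespace splitting as a stand-in for a real tokenizer are all hypothetical choices made for illustration.

```python
def build_sft_example(ticket, tokenize=str.split):
    """Turn one support ticket into tokens plus a response-only loss mask.

    `tokenize` defaults to whitespace splitting, which is only a
    placeholder for a real subword tokenizer.
    """
    prompt = f"Customer: {ticket['question']}\nAgent:"
    response = ticket["agent_reply"]
    prompt_tokens = tokenize(prompt)
    response_tokens = tokenize(response)
    return {
        "tokens": prompt_tokens + response_tokens,
        # 0 = prompt token (ignored by the loss), 1 = response token (trained on)
        "loss_mask": [0] * len(prompt_tokens) + [1] * len(response_tokens),
    }


# Hypothetical ticket record.
ticket = {
    "question": "How do I reset my password?",
    "agent_reply": "Use the reset link on the login page.",
}

example = build_sft_example(ticket)
print(example["tokens"])
print(example["loss_mask"])
```

Keeping the mask alongside the tokens makes it explicit which part of each ticket the model is being asked to reproduce.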

Frequently asked

What is Supervised Fine-Tuning?

SFT is fine-tuning where each training example has an explicit input and a desired output, supervised by a loss that penalizes deviation from that output.

What is an example of supervised fine-tuning?

Training Llama-3-Base on Anthropic's HH-RLHF "chosen" responses as a first pass.

How is Supervised Fine-Tuning related to Fine-tuning?

Supervised Fine-Tuning is a specific form of Fine-tuning. Fine-tuning continues training a pretrained model on a smaller, task-specific dataset, adjusting its weights to specialize behavior or knowledge; SFT is the case where that dataset consists of explicit input and desired-output pairs.

Is Supervised Fine-Tuning considered intermediate?

Supervised Fine-Tuning is generally considered intermediate-level material in the AI and LLM space.
