Comparison

BFloat16 vs Mixed Precision

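BFloat16 and Mixed Precision are both common AI/LLM terms, but they name different things: BFloat16 is a number format, while Mixed Precision is a training technique that can use that format. Here is a quick side-by-side.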

When you would reach for BFloat16

BFloat16 comes up when the question is which number format to store and compute tensors in: how many bits per value, how much dynamic range, and how much precision.

Example: Llama 3 was trained end-to-end in BF16.

When you would reach for Mixed Precision

Mixed Precision comes up when the question is how a training run combines precisions: which operations run in 16-bit and which values stay in 32-bit.

Example: pretraining a 7B model with BF16 compute and FP32 master weights instead of running everything in FP32.

Frequently asked

What is the difference between BFloat16 and Mixed Precision?

BFloat16: a 16-bit floating-point format with the same 8-bit exponent as FP32, so it covers FP32's range, but with only 7 explicit mantissa bits (8 bits of significand precision counting the implicit leading bit). It is a common default precision for LLM training and inference.

Mixed Precision: a training approach that does the bulk of forward and backward computation in 16-bit floats (BF16 or FP16) while keeping master weights and certain accumulations in 32-bit. It is typically faster and uses less memory than full FP32 training, with accuracy that is usually close to it.
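To make the two definitions concrete, here is a minimal sketch in plain Python (standard library only). It emulates BF16 by keeping the top 16 bits of the FP32 bit pattern (1 sign bit, 8 exponent bits, 7 explicit mantissa bits), then shows why mixed-precision training keeps a 32-bit master copy of the weights. The helper names to_fp32, to_bf16, and accumulate are hypothetical and exist only for this illustration; real frameworks do this in hardware, and the toy update loop stands in for an optimizer step.

```python
import math
import struct


def to_fp32(x: float) -> float:
    """Round a Python float (64-bit) to the nearest FP32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_bf16(x: float) -> float:
    """Round a finite float to the nearest BF16 value (ties to even).

    BF16 is the top 16 bits of the FP32 pattern: 1 sign bit,
    8 exponent bits, 7 explicit mantissa bits.
    """
    if not math.isfinite(x):
        return x  # pass inf/NaN through unchanged in this sketch
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits += 0x7FFF + ((bits >> 16) & 1)  # round to nearest, ties to even
    bits &= 0xFFFF0000                   # drop the low 16 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def accumulate(steps: int, update: float, master_fp32: bool) -> float:
    """Apply the same small update `steps` times to a weight starting at 1.0."""
    round_weight = to_fp32 if master_fp32 else to_bf16
    w = 1.0
    for _ in range(steps):
        w = round_weight(w + update)
    return w


# Range: 3e38 overflows FP16 (max 65504) but stays finite in BF16.
big = to_bf16(3e38)

# Precision: near 1.0 the BF16 spacing is 2**-7, so 1.001 rounds back to 1.0.
small = to_bf16(1.001)

# Master weights: 1000 updates of 0.001 vanish in BF16 but add up in FP32.
bf16_only = accumulate(1000, 0.001, master_fp32=False)
fp32_master = accumulate(1000, 0.001, master_fp32=True)
compute_copy = to_bf16(fp32_master)  # BF16 copy used for forward/backward
```

In this toy setup the BF16-only weight never moves from 1.0, because each update is smaller than half the spacing between adjacent BF16 values, while the FP32 master weight ends up close to 2.0. That is the motivation for keeping master weights in 32-bit and casting a 16-bit copy for compute.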

When should I use BFloat16 vs Mixed Precision?

Reach for BFloat16 when you are choosing a data type for weights, activations, or compute. Reach for Mixed Precision when you are deciding how a training run combines 16-bit compute with 32-bit master weights and accumulations.

Are BFloat16 and Mixed Precision the same thing?

No. BFloat16 is a number format; Mixed Precision is a training technique that often uses BF16 as its 16-bit format. They are related but address different parts of the AI stack.