ModelTerms

Comparison

Gradient Descent vs Loss Function

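Gradient Descent and Loss Function are both common AI/LLM training terms, but they name different things: the loss function is the quantity being minimized, and gradient descent is the procedure that minimizes it. Here is a quick side-by-side.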

When you would reach for Gradient Descent

Gradient Descent comes up when the question is how a model's weights get updated during training: which direction to move them and how large a step to take.

A linear regression model learning the slope and intercept.
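
To make the linear regression example concrete, here is a minimal sketch of gradient descent fitting a slope and intercept. The function name `fit_line`, the toy data, the mean squared error loss, and the default `learning_rate` and `steps` values are all illustrative assumptions, not taken from any particular library.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = slope * x + intercept by gradient descent on mean squared error."""
    slope, intercept = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction errors under the current parameters.
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to each parameter.
        grad_slope = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_intercept = (2.0 / n) * sum(errors)
        # Step against the gradient, scaled by the learning rate.
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


# Toy data generated from y = 2x + 1 (an illustrative choice).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
```

On this noise-free data the fitted values should settle very close to the slope and intercept that generated it. A learning rate that is too large for the data would make the updates diverge instead, which is why the step size matters.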

When you would reach for Loss Function

Loss Function comes up when the question is what training is trying to minimize, that is, how the model's error on its predictions is measured.

Cross-entropy loss in next-token prediction.
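
As a rough illustration of cross-entropy for a single next-token prediction, the sketch below turns raw scores into probabilities with softmax and takes the negative log-probability of the actual next token. The three-word vocabulary, the logit values, and the function names are hypothetical, chosen only to show the calculation.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def next_token_cross_entropy(logits, target_index):
    """Negative log-probability assigned to the token that actually came next."""
    probs = softmax(logits)
    return -math.log(probs[target_index])


# Hypothetical vocabulary and model scores for the next token.
vocab = ["cat", "dog", "mat"]
logits = [1.0, 0.5, 3.0]

# The loss is lower when the actual next token is the one the model favored.
loss_if_mat = next_token_cross_entropy(logits, vocab.index("mat"))
loss_if_dog = next_token_cross_entropy(logits, vocab.index("dog"))
```

Here the model puts most of its probability on "mat", so the loss is small if "mat" is the actual next token and larger if it is "dog". In training, this per-token value is typically averaged over many positions.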

Frequently asked

What is the difference between Gradient Descent and Loss Function?

Gradient Descent: Gradient descent is the optimization algorithm at the heart of training: nudge each weight in the direction that reduces the loss, with a small step size set by the learning rate.

Loss Function: A loss function measures how wrong a model's predictions are. Training minimizes it. For LLMs the loss is the cross-entropy of predicted vs. actual next tokens.
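
The two definitions can be written compactly. The notation is an assumption made for illustration: \theta stands for the model's weights, \eta for the learning rate, L for the loss, and x_1, ..., x_N for the tokens of a training sequence. Averaging over the N tokens is a common convention rather than the only possible one.

```latex
% Gradient descent: step the weights against the gradient of the loss.
\theta_{t+1} = \theta_t - \eta \, \nabla_\theta L(\theta_t)

% Loss function for next-token prediction: average cross-entropy,
% i.e. the negative log-probability the model assigns to each actual token.
L(\theta) = -\frac{1}{N} \sum_{i=1}^{N} \log p_\theta(x_i \mid x_{<i})
```

The second formula defines what is measured; the first describes how the weights move to reduce it.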

When should I use Gradient Descent vs Loss Function?

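Gradient Descent is the right concept when you are focused on how the weights are updated: the direction of each step and the learning rate that sets its size. Loss Function applies when you are focused on what is being measured and minimized, such as cross-entropy for next-token prediction.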

Are Gradient Descent and Loss Function the same thing?

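No. The loss function is the measure of how wrong the model's predictions are; gradient descent is the algorithm that adjusts the weights to reduce that measure. They are related, since both are part of training, but they play different roles in it.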