ModelTerms

Foundations · beginner

Neural Network (artificial neural network, ANN)

A neural network is a stack of simple mathematical units ("neurons") that learn to transform inputs into outputs by adjusting numeric weights during training.

Explanation

Each "neuron" computes a weighted sum of its inputs, passes it through a nonlinear function, and forwards the result. Stack thousands of these into layers and you can approximate astonishingly complex input-to-output mappings: pixels to "cat or dog", text to translated text, or audio to a transcript.
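To make the weighted-sum-plus-nonlinearity idea concrete, here is a minimal sketch in plain Python. The sigmoid activation, the helper names `neuron` and `layer`, and all weight values are illustrative assumptions, not something this entry prescribes.

```python
import math

def sigmoid(z):
    # One common nonlinear function; squashes any number into (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    # Weighted sum of the inputs, then the nonlinearity.
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)

def layer(inputs, weight_rows, biases):
    # A layer is several neurons reading the same inputs.
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

# Hypothetical weights: two inputs -> two hidden neurons -> one output neuron.
hidden = layer([0.5, -1.0], [[1.0, 2.0], [-1.0, 0.5]], [0.0, 0.1])
output = neuron(hidden, [1.5, -2.0], 0.2)
print(hidden, output)
```

Real networks do the same arithmetic with matrix operations over thousands of neurons at once; the loop form above is only meant to show what each unit computes.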

The weights are the model's memory. Training adjusts them via gradient descent and backpropagation. A modern large language model is a neural network with tens to hundreds of billions of these weights.
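As a rough illustration of training, the sketch below fits a single sigmoid neuron to the logical OR function using gradient descent. The toy dataset, the cross-entropy loss, the learning rate of 1.0, and the 2000-step budget are all assumptions chosen for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Logical OR as a toy dataset: (inputs, target).
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
        ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]

def predict(weights, bias, x):
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def mean_loss(weights, bias):
    # Cross-entropy loss averaged over the dataset.
    total = 0.0
    for x, y in data:
        p = predict(weights, bias, x)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(data)

weights, bias = [0.0, 0.0], 0.0
learning_rate = 1.0  # assumed step size
initial_loss = mean_loss(weights, bias)

for _ in range(2000):
    grad_w, grad_b = [0.0, 0.0], 0.0
    for x, y in data:
        # For sigmoid + cross-entropy, the gradient w.r.t. the weighted sum
        # simplifies to (prediction - target).
        error = predict(weights, bias, x) - y
        for i, xi in enumerate(x):
            grad_w[i] += error * xi / len(data)
        grad_b += error / len(data)
    # Move every weight a small step against its gradient.
    weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
    bias -= learning_rate * grad_b

final_loss = mean_loss(weights, bias)
predictions = [round(predict(weights, bias, x)) for x, _ in data]
print(initial_loss, final_loss, predictions)
```

After training, the learned weights are the only thing that changed, which is the sense in which they act as the model's memory.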

The term "deep" in deep learning just means "many layers" — usually dozens or hundreds.

Examples

  • A 3-layer network classifying handwritten digits.
  • A 96-layer transformer like GPT-3.
  • A convolutional network reading X-rays.

Frequently asked

What is a neural network?

A neural network is a stack of simple mathematical units ("neurons") that learn to transform inputs into outputs by adjusting numeric weights during training.

What is an example of a neural network?

A 3-layer network classifying handwritten digits.

How is a neural network related to deep learning?

Neural networks and deep learning are both Foundations concepts. Deep learning is machine learning using neural networks with many layers ("deep" = many layers). It powers nearly every recent breakthrough in AI, including LLMs and image generators.

Is the neural network concept considered beginner-level?

Neural networks are generally considered beginner-level material in the AI and LLM space.

Deep Learning · Foundations

Deep learning is machine learning using neural networks with many layers ("deep" = many layers). It powers nearly every recent breakthrough in AI, including LLMs and image generators.

Backpropagation · Training

Backpropagation is the algorithm used to compute how each weight in a neural network should change to reduce error, by propagating gradients backward through the network.
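A minimal sketch of the backward pass, assuming a hypothetical two-weight network (one tanh hidden unit feeding one linear output) and a squared-error loss. The chain-rule gradients are compared against finite-difference estimates as a sanity check.

```python
import math

def forward(w1, w2, x):
    h = math.tanh(w1 * x)  # hidden unit
    y = w2 * h             # linear output
    return h, y

def loss(w1, w2, x, target):
    _, y = forward(w1, w2, x)
    return 0.5 * (y - target) ** 2

def backprop(w1, w2, x, target):
    h, y = forward(w1, w2, x)
    dy = y - target              # dLoss/dy
    dw2 = dy * h                 # output weight: uses the hidden activation
    dh = dy * w2                 # gradient passed backward to the hidden unit
    dw1 = dh * (1 - h ** 2) * x  # through tanh (derivative is 1 - tanh^2)
    return dw1, dw2

# Hypothetical values chosen only for illustration.
w1, w2, x, target = 0.5, -1.5, 2.0, 1.0
dw1, dw2 = backprop(w1, w2, x, target)

# Numerical check with central differences.
eps = 1e-6
num_dw1 = (loss(w1 + eps, w2, x, target) - loss(w1 - eps, w2, x, target)) / (2 * eps)
num_dw2 = (loss(w1, w2 + eps, x, target) - loss(w1, w2 - eps, x, target)) / (2 * eps)
print(dw1, num_dw1, dw2, num_dw2)
```

The point of the backward ordering is reuse: `dy` is computed once and feeds both `dw2` and, via `dh`, `dw1`.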

Gradient Descent · Training

Gradient descent is the optimization algorithm at the heart of training: nudge each weight in the direction that reduces the loss, with a small step size set by the learning rate.
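The effect of the learning rate can be sketched on a one-weight toy problem. The loss function (w - 3)^2 and the two step sizes below are assumptions picked to show one run that settles and one that overshoots.

```python
def gradient_descent(learning_rate, steps=100, start=0.0):
    # Minimize the toy loss (w - 3)^2, whose minimum is at w = 3.
    w = start
    for _ in range(steps):
        gradient = 2.0 * (w - 3.0)     # derivative of the loss at w
        w -= learning_rate * gradient  # step against the gradient
    return w

converged = gradient_descent(0.1)  # small steps: settles near 3
diverged = gradient_descent(1.1)   # steps too large: overshoots further each time
print(converged, diverged)
```

With a step size of 0.1 the distance to the minimum shrinks by a constant factor each step; with 1.1 every update jumps past the minimum and lands farther away than it started.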

Transformer · Architecture

The transformer is the neural network architecture behind virtually every modern large language model. It uses self-attention to model relationships between all positions in a sequence in parallel.
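A stripped-down sketch of self-attention in plain Python, assuming scaled dot-product scoring and omitting the learned query, key, and value projections that real transformers apply. The input vectors are made up for illustration.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    # Simplification: each vector serves as its own query, key, and value.
    d = len(vectors[0])
    outputs, all_weights = [], []
    for q in vectors:
        # Score this position against every position in the sequence.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in vectors]
        weights = softmax(scores)
        # The output is a weighted average of all positions' vectors.
        out = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(d)]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # hypothetical 3-position input
outputs, attention = self_attention(sequence)
print(attention)
```

The loop handles one position at a time for readability; practical implementations compute the same scores for all positions at once with matrix multiplication, which is where the parallelism comes from.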

Embedding · Architecture

An embedding is a list of numbers (a vector) that represents a piece of input — a word, a sentence, an image — in a space where similar things end up close together.
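"Close together" is often measured with cosine similarity. The sketch below uses hand-written three-number vectors as stand-ins for embeddings; real embeddings come from a trained model and typically have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    # Dot product divided by the product of the vector lengths.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings, invented for this example.
embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

cat_dog = cosine_similarity(embeddings["cat"], embeddings["dog"])
cat_car = cosine_similarity(embeddings["cat"], embeddings["car"])
print(cat_dog, cat_car)
```

With these made-up values, "cat" scores much closer to "dog" than to "car", which is the behavior a learned embedding space is meant to produce.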
