You've seen scalars (single numbers), vectors (lists of numbers), and matrices (grids of numbers). These are all special cases of one unifying concept: a tensor. Understanding tensors is essential because every piece of data in ML — images, text, audio, batches of examples — is stored and processed as a tensor.
The hierarchy
Each step up adds a dimension:
| Name | Dimensions | Shape example | What it holds |
|---|---|---|---|
| Scalar | 0D | () | A single number: 3.14 |
| Vector | 1D | (n,) | A list: [1, 2, 3] |
| Matrix | 2D | (m, n) | A grid: a spreadsheet |
| 3D tensor | 3D | (d₁, d₂, d₃) | A stack of matrices |
| 4D tensor | 4D | (d₁, d₂, d₃, d₄) | A batch of 3D tensors |
| N-D tensor | ND | (d₁, …, dₙ) | Anything |
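A quick NumPy sketch of the hierarchy (PyTorch tensors expose the same `ndim` and `shape` attributes):

```python
import numpy as np

scalar = np.array(3.14)            # 0D: shape ()
vector = np.array([1, 2, 3])       # 1D: shape (3,)
matrix = np.zeros((2, 3))          # 2D: shape (2, 3)
stack  = np.zeros((4, 2, 3))       # 3D: shape (4, 2, 3), a stack of matrices

print(scalar.ndim, vector.ndim, matrix.ndim, stack.ndim)  # 0 1 2 3
```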
Rank: how many dimensions?
The rank of a tensor is the number of axes it has (also called order or ndim).
- A scalar has rank 0.
- A vector has rank 1.
- A matrix has rank 2.
- A batch of RGB images has rank 4.
Shape: the size of each dimension
The shape of a tensor lists the size along each axis. A matrix with 5 rows and 3 columns has shape (5, 3). A 3D tensor with shape (4, 5, 3) is like a stack of four (5, 3) matrices.
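You can build that stack-of-matrices picture directly, here with NumPy's `np.stack`:

```python
import numpy as np

m = np.arange(15).reshape(5, 3)   # one (5, 3) matrix
t = np.stack([m, m, m, m])        # stack four of them along a new first axis

print(t.shape)     # (4, 5, 3)
print(t[0].shape)  # (5, 3): each slice is one of the original matrices
```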
Real data shapes you'll see constantly
A single grayscale image: shape (H, W) — height × width.
A single color image: shape (H, W, C) or (C, H, W) depending on convention (PyTorch prefers channels-first).
A batch of 32 color images at 224×224: shape (32, 3, 224, 224).
A sequence of 128 tokens, each embedded as a 512-dim vector: shape (128, 512).
A batch of 64 sequences: shape (64, 128, 512).
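As a sketch, each of these shapes can be created directly (the dimensions here match the examples above; the contents are just zeros as placeholders):

```python
import numpy as np

gray   = np.zeros((28, 28))           # single grayscale image (H, W)
rgb    = np.zeros((3, 224, 224))      # one color image, channels-first (C, H, W)
images = np.zeros((32, 3, 224, 224))  # batch of 32 color images
tokens = np.zeros((128, 512))         # 128 tokens, each a 512-dim embedding
batch  = np.zeros((64, 128, 512))     # batch of 64 such sequences
```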
Why batching uses a tensor
When you train a neural network, you rarely process one example at a time — it's slow and the gradients are noisy. Instead, you process a batch of examples simultaneously.
Batching just means stacking individual examples along a new first axis.
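For example, stacking 32 individual (5, 3) examples produces one (32, 5, 3) batch, with the batch size as the new first dimension:

```python
import numpy as np

examples = [np.random.randn(5, 3) for _ in range(32)]  # 32 separate examples
batch = np.stack(examples)                             # stack along a new first axis

print(batch.shape)  # (32, 5, 3)
```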
This is efficient because modern hardware (GPUs) can apply the same operation to all N examples in parallel. The math stays the same — operations just apply to each slice along the batch axis.
Indexing into tensors
Just like a matrix entry needs two indices (row, column), a tensor entry needs one index per axis.
For a tensor T of shape (4, 5, 3):
- T[i] selects one slice of shape (5, 3).
- T[i, j] selects one row of shape (3,).
- T[i, j, k] selects a single number.
In NumPy/PyTorch, this looks like: T[2, 1, 0] → the number at position (2, 1, 0).
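A minimal NumPy sketch, using a (4, 5, 3) tensor filled with the numbers 0 to 59:

```python
import numpy as np

T = np.arange(4 * 5 * 3).reshape(4, 5, 3)

print(T[2].shape)     # (5, 3): one slice
print(T[2, 1].shape)  # (3,):   one row
print(T[2, 1, 0])     # 33:     a single number
```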
Reshaping: reorganizing without changing values
You can reorganize a tensor into a different shape as long as the total number of elements stays the same. This is called reshaping (or view in PyTorch).
A 28×28 image has 784 pixels. Flattening turns the 2D grid into a 1D vector — same 784 numbers, different arrangement. This is how fully-connected layers accept image input.
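The flattening step looks like this in NumPy (PyTorch's `reshape` and `view` work the same way; the image contents here are just placeholder values):

```python
import numpy as np

img = np.arange(28 * 28).reshape(28, 28)  # a fake 28x28 "image"
flat = img.reshape(-1)                    # -1 means "infer this dimension"

print(flat.shape)               # (784,)
print(flat.sum() == img.sum())  # True: same numbers, different arrangement
```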
Broadcasting: operating on different shapes
When you add a scalar to a vector, Python/NumPy "broadcasts" the scalar to match the vector's shape. The same idea extends to tensors:
This means you can add a bias vector of shape (512,) to a batch of activations of shape (64, 128, 512) — the bias is added to every position automatically. No explicit looping required.
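A sketch of exactly that case, with a bias of ones so the effect is visible (real biases would be learned values):

```python
import numpy as np

activations = np.zeros((64, 128, 512))  # batch of activations
bias = np.ones(512)                     # shape (512,)

out = activations + bias  # bias broadcast across the first two axes

print(out.shape)  # (64, 128, 512): every position got the bias added
```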
Tensors in code
In PyTorch, everything is a Tensor:
```python
import torch

x = torch.tensor([1.0, 2.0, 3.0])  # shape (3,) — a vector
W = torch.randn(4, 3)              # shape (4, 3) — a matrix
y = W @ x                          # shape (4,) — matrix-vector multiply

batch = torch.randn(32, 3)         # 32 vectors, shape (32, 3)
out = batch @ W.T                  # (32, 3) @ (3, 4) → (32, 4)
```
Every operation — addition, multiplication, matrix products — just works element-wise or along specified axes, automatically handling the batch dimension.
Summary
- A tensor is an N-dimensional array. Scalars, vectors, and matrices are all special cases.
- Rank = number of dimensions. Shape = size along each axis.
- Batching adds a first axis of size N, letting GPUs process N examples at once.
- Reshaping reorganizes elements without changing their values (and, when the result is a view, without copying data).
- In ML, "tensor" always means N-D array — not the physics definition.
In practice, most of your code will work on rank-2 and rank-3 tensors (matrices and batched sequences), with rank-4 for images. Understanding shapes is half the job of debugging neural network code.