Math Foundation · Vectors & Matrices · Lesson 6 ⏱ 12 min

Tensors: From Vectors to N-Dimensional Arrays

Scalars, vectors, matrices, and higher-dimensional arrays — all as special cases of the same idea. How ML frameworks think about data, and why batching works.

⏱ ~7 min


Quick refresher

Vectors and matrices

A vector is a 1D array of numbers. A matrix is a 2D grid of numbers with rows and columns. Their shapes control what operations are valid.

Example

A vector has shape (n,).

A matrix has shape (m, n).

A batch of images might have shape (32, 28, 28, 3).

You've seen scalars (single numbers), vectors (lists of numbers), and matrices (grids of numbers). These are all special cases of one unifying concept: a tensor. Understanding tensors is essential because every piece of data in ML — images, text, audio, batches of examples — is stored and processed as a tensor.

The hierarchy

Each step up adds a dimension:

| Name | Dimensions | Shape example | What it holds |
| --- | --- | --- | --- |
| Scalar | 0D | () | A single number: 3.14 |
| Vector | 1D | (n,) | A list: [1, 2, 3] |
| Matrix | 2D | (m, n) | A grid: a spreadsheet |
| 3D tensor | 3D | (d₁, d₂, d₃) | A stack of matrices |
| 4D tensor | 4D | (d₁, d₂, d₃, d₄) | A batch of 3D tensors |
| N-D tensor | ND | (d₁, …, dₙ) | Anything |
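The hierarchy maps directly onto array objects. A quick sketch in NumPy (PyTorch tensors behave the same way; the example values here are arbitrary):

```python
import numpy as np

scalar = np.array(3.14)                # 0D: shape ()
vector = np.array([1, 2, 3])           # 1D: shape (3,)
matrix = np.array([[1, 2], [3, 4]])    # 2D: shape (2, 2)
stack  = np.zeros((4, 5, 3))           # 3D: a stack of four (5, 3) matrices
batch  = np.zeros((2, 4, 5, 3))        # 4D: a batch of two 3D tensors

for t in (scalar, vector, matrix, stack, batch):
    print(t.ndim, t.shape)             # dimensions climb 0, 1, 2, 3, 4
```

Each step up the table is just one more axis on the same kind of object.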

Rank: how many dimensions?

The rank of a tensor is the number of axes it has (also called order or ndim).

  • A scalar has rank 0.
  • A vector has rank 1.
  • A matrix has rank 2.
  • A batch of RGB images has rank 4.

Shape: the size of each dimension

The shape of a tensor lists the size along each axis. A matrix with 5 rows and 3 columns has shape (5, 3). A 3D tensor with shape (4, 5, 3) is like a stack of four (5, 3) matrices.

Real data shapes you'll see constantly

A single grayscale image: shape (H, W) — height × width.

A single color image: shape (H, W, C) or (C, H, W) depending on convention (PyTorch prefers channels-first).

A batch of 32 color images at 224×224: shape (32, 3, 224, 224).

A sequence of 128 tokens, each embedded as a 512-dim vector: shape (128, 512).

A batch of 64 sequences: shape (64, 128, 512).
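These shapes can be sketched directly with placeholder tensors (NumPy here; the sizes match the examples above):

```python
import numpy as np

gray      = np.zeros((28, 28))           # grayscale image (H, W)
rgb       = np.zeros((3, 224, 224))      # color image, channels-first (C, H, W)
images    = np.zeros((32, 3, 224, 224))  # batch of 32 color images
tokens    = np.zeros((128, 512))         # one sequence of 128 embedded tokens
sequences = np.zeros((64, 128, 512))     # batch of 64 such sequences

print(images.shape)     # (32, 3, 224, 224)
print(sequences.shape)  # (64, 128, 512)
```

Reading a shape left to right tells you how the data is organized: batch first, then the per-example axes.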


Why batching uses a tensor

When you train a neural network, you rarely process one example at a time — it's slow and the gradients are noisy. Instead, you process a batch of examples simultaneously.

Batching just means stacking individual examples along a new first axis:

\text{single image: } (H, W, C) \xrightarrow{\text{batch of } N} (N, H, W, C)

This is efficient because modern hardware (GPUs) can apply the same operation to all N examples in parallel. The math stays the same — operations just apply to each slice along the batch axis.
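Stacking along a new first axis is exactly what `np.stack` (or `torch.stack`) does. A sketch with three fake (H, W, C) images:

```python
import numpy as np

# Three separate (28, 28, 3) images
images = [np.random.rand(28, 28, 3) for _ in range(3)]

batch = np.stack(images)    # new axis 0 of size N=3
print(batch.shape)          # (3, 28, 28, 3)

# One operation applies to every slice along the batch axis:
brightened = batch + 0.1    # brightens all 3 images at once
```

No loop over examples is needed: the batch axis is handled by the array library, which is what lets a GPU run all N slices in parallel.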

Indexing into tensors

Just like a matrix entry needs two indices (row, column), a tensor entry needs one index per axis.

For a tensor T of shape (4, 5, 3):

  • T[i] selects one slice of shape (5, 3).
  • T[i, j] selects one row of shape (3,).
  • T[i, j, k] selects a single number.

In NumPy/PyTorch, this looks like: T[2, 1, 0] → the number at position (2, 1, 0).
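The indexing rules above, checked in code for a (4, 5, 3) tensor (NumPy; PyTorch indexing is identical):

```python
import numpy as np

# Fill a (4, 5, 3) tensor with the numbers 0..59
T = np.arange(4 * 5 * 3).reshape(4, 5, 3)

print(T[2].shape)       # (5, 3): one index used, two axes remain
print(T[2, 1].shape)    # (3,): two indices used, one axis remains
print(T[2, 1, 0])       # 33: three indices pin down a single number
```

Each index you supply removes one axis from the result; supply them all and you are left with a scalar.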

Reshaping: reorganizing without changing values

You can reorganize a tensor into a different shape as long as the total number of elements stays the same. This is called reshaping (or view in PyTorch).

\text{shape } (28, 28) \xrightarrow{\text{flatten}} \text{shape } (784,)

A 28×28 image has 784 pixels. Flattening turns the 2D grid into a 1D vector — same 784 numbers, different arrangement. This is how fully-connected layers accept image input.
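The flattening step as a sketch (NumPy's `reshape`; PyTorch's `view`/`reshape` work the same way):

```python
import numpy as np

img = np.arange(784).reshape(28, 28)   # a fake 28x28 "image"
flat = img.reshape(784)                # same 784 values, now 1D
# Equivalently, let the library infer the size: img.reshape(-1)

print(flat.shape)                            # (784,)
print(np.array_equal(flat, np.arange(784)))  # True: values unchanged
```

Reshaping is only valid when the element counts match: 28 × 28 = 784, so (28, 28) → (784,) works, but (28, 28) → (800,) would raise an error.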

Broadcasting: operating on different shapes

When you add a scalar to a vector, Python/NumPy "broadcasts" the scalar to match the vector's shape. The same idea extends to tensors:

(64, 128, 512) + (512,) \rightarrow (512,) \text{ is broadcast to } (64, 128, 512)

This means you can add a bias vector of shape (512,) to a batch of activations of shape (64, 128, 512) — the bias is added to every position automatically. No explicit looping required.
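The bias example as a sketch (NumPy; PyTorch broadcasts by the same rules):

```python
import numpy as np

activations = np.random.rand(64, 128, 512)  # batch of 64 sequences
bias = np.random.rand(512)                  # one bias per feature

out = activations + bias    # bias broadcast across the first two axes
print(out.shape)            # (64, 128, 512)

# Equivalent explicit loop (much slower):
# for i in range(64):
#     for j in range(128):
#         out[i, j] = activations[i, j] + bias
```

Broadcasting aligns shapes from the trailing axis backwards: (512,) lines up with the last axis of (64, 128, 512), and the missing axes are treated as size 1 and repeated.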

Tensors in code

In PyTorch, everything is a Tensor:

import torch

x = torch.tensor([1.0, 2.0, 3.0])  # shape (3,) — a vector
W = torch.randn(4, 3)               # shape (4, 3) — a matrix
y = W @ x                           # shape (4,)  — matrix-vector multiply

batch = torch.randn(32, 3)          # 32 vectors, shape (32, 3)
out = batch @ W.T                   # (32, 3) @ (3, 4) → (32, 4)

Every operation — addition, multiplication, matrix products — just works element-wise or along specified axes, automatically handling the batch dimension.

Summary

  • A tensor is an N-dimensional array. Scalars, vectors, and matrices are all special cases.
  • Rank = number of dimensions. Shape = size along each axis.
  • Batching adds a first axis of size N, letting GPUs process N examples at once.
  • Reshaping reorganizes elements without changing their values (PyTorch's view additionally avoids copying when the memory layout allows).
  • In ML, "tensor" always means N-D array — not the physics definition.

In practice, most of your code will work on rank-2 and rank-3 tensors (matrices and batched sequences), with rank-4 for images. Understanding shapes is half the job of debugging neural network code.

Quiz

1 / 3

A grayscale image is 28×28 pixels. A batch of 64 such images stored together has tensor shape...