What Is a Vector?
A vector is an ordered list of numbers. That is it. Nothing mysterious.
Vectors are the universal data format of machine learning: a training example is a vector of features, a model's weights are a vector of parameters, and a prediction is a vector of scores. Every operation you will learn — dot products, matrix multiplication, gradients — is built on vectors. Comfort here pays dividends in every subsequent lesson.
Think of GPS coordinates: your location is two numbers - latitude and longitude. That is a 2D vector. Or think of a house described by [1200, 3, 2, 15] - square feet, bedrooms, bathrooms, age. Every feature is one component.
Order matters. [3, 4] and [4, 3] are different vectors. Same numbers, completely different meaning - like swapping latitude and longitude.
Writing Vectors
A vector with n components is written x = [x₁, x₂, …, xₙ] (row vector) or as a vertical column (column vector):
- xᵢ - the i-th component of the vector
- n - dimension - number of components
By convention in ML, vectors are column vectors. The number of components is the dimension of the vector. The house vector above is 4-dimensional.
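As a quick sketch of the row/column distinction in NumPy (the nested brackets and `reshape` call are just one way to give the same four numbers a row or column shape):

```python
import numpy as np

row = np.array([[1200, 3, 2, 15]])  # shape (1, 4): a row vector
col = row.reshape(-1, 1)            # shape (4, 1): same data as a column

print(row.shape)  # → (1, 4)
print(col.shape)  # → (4, 1)
```

In practice NumPy usually works with flat 1D arrays of shape (4,); the explicit row/column shapes matter once matrices enter the picture in the next lesson.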
Geometric Picture
In 2D, the vector [3, 4] is an arrow starting at the origin and ending at 3 units right and 4 units up. Every vector is an arrow. The direction says "which way," and the length says "how far."
In 3D, [1, 2, 3] points 1 unit in x, 2 in y, 3 in z. ML routinely works in hundreds or thousands of dimensions - you cannot visualize that - but every rule you learn for 2D and 3D extends unchanged.
Vector Addition
Add two vectors by adding corresponding components:
- a = [3, 4] - first vector
- b = [1, 2] - second vector
- a + b = [3+1, 4+2] = [4, 6] - the component-wise sum
Rule: both vectors must have the same dimension. You cannot add a 3D vector to a 4D vector.
Geometrically, adding [1, 2] to [3, 4] is like walking 3 right and 4 up, then 1 more right and 2 more up. You end at [4, 6]. Vector addition chains arrows end-to-end.
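Both rules above - component-wise addition, and the same-dimension requirement - can be sketched in NumPy, which raises an error when the shapes do not match:

```python
import numpy as np

a = np.array([3, 4])
b = np.array([1, 2])
print(a + b)  # → [4 6]: walk 3 right and 4 up, then 1 more right and 2 more up

# Adding a 3D vector to a 4D vector is undefined
try:
    np.array([1, 2, 3]) + np.array([1, 2, 3, 4])
except ValueError as e:
    print("mismatched dimensions:", e)
```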
Scalar Multiplication
A scalar is a plain number. Multiplying a vector by a scalar scales every component: c·x = [c·x₁, c·x₂, …, c·xₙ]
- c - scalar - a plain real number
- x - vector to be scaled
Special cases:
- Multiplying by -1 flips the direction: -1 · [3, 4] = [-3, -4]
- Multiplying by 0 gives the zero vector: 0 · [3, 4] = [0, 0]
- Multiplying by a fraction shrinks: 0.5 · [3, 4] = [1.5, 2]
Everyday intuition: if a map direction is [3, 4], then doubling it to [6, 8] means "go the same way, but twice as far." Scalar multiplication changes size while keeping the basic direction.
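The special cases above, sketched in NumPy:

```python
import numpy as np

x = np.array([3.0, 4.0])
print(-1 * x)   # → [-3. -4.]: flipped direction
print(0 * x)    # → [0. 0.]: the zero vector
print(0.5 * x)  # → [1.5 2. ]: same direction, half as far
print(2 * x)    # → [6. 8.]: same direction, twice as far
```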
Magnitude (Norm)
The magnitude (also called norm or length) of a vector uses the Pythagorean theorem extended to multiple dimensions: ‖x‖ = √(x₁² + x₂² + … + xₙ²)
- ‖x‖ - Euclidean norm - the length of vector x
- n - dimension of the vector
Examples:
- Example: ‖[3, 4]‖ = √(9 + 16) = √25 = 5 (classic 3-4-5 triangle)
- Example: ‖[1, 2, 2]‖ = √(1 + 4 + 4) = √9 = 3 - the same rule in 3D
- Example: ‖[0, 0]‖ = 0 - only the zero vector has length zero
A unit vector has magnitude exactly 1. To normalize a vector - convert it to unit length - divide every component by its magnitude: x̂ = x / ‖x‖
- x̂ - unit vector - same direction as x but length 1
Check: if x = [3, 4], then x̂ = [3/5, 4/5] = [0.6, 0.8] and ‖x̂‖ = √(0.36 + 0.64) = √1 = 1 ✓
Why Vectors Matter in ML
Almost everything in machine learning is a vector operation:
- Each training example is a vector. A 28×28 pixel image flattens to a 784-dimensional vector. A sentence encoded over a 10,000-word vocabulary is a 10,000-dimensional vector.
- Model weights are a vector. A linear model with 10 inputs has weights w = [w₁, w₂, …, w₁₀] - a 10D vector of learned numbers, one per input.
- Predictions are vectors. A 10-class classifier outputs a 10D probability vector - one entry per class.
- The training dataset is a collection of vectors - stack them as rows and you get a matrix (next lesson).
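That stacking step can be sketched directly (the three houses here are hypothetical toy data; `np.stack` puts one example per row, producing the data matrix the next lesson starts from):

```python
import numpy as np

# Three hypothetical training examples: [sqft, bedrooms, bathrooms, age]
h1 = np.array([1200, 3, 2, 15])
h2 = np.array([2000, 4, 3, 5])
h3 = np.array([800, 2, 1, 40])

X = np.stack([h1, h2, h3])  # stack vectors as rows
print(X.shape)  # → (3, 4): 3 examples, 4 features each
```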
Virtually every computation in a neural network is a combination of vector addition, scalar multiplication, and the operation we cover next: the dot product.
```python
import numpy as np

# Vectors in NumPy are just arrays
house = np.array([1200, 3, 2, 15])  # [sqft, bedrooms, bathrooms, age]
x = np.array([3.0, 4.0])

# Vector addition
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print(a + b)  # → [5 7 9] (component-wise)

# Scalar multiplication
print(2.0 * a)  # → [2. 4. 6.]

# Magnitude (L2 norm)
mag = np.linalg.norm(x)
print(mag)  # → 5.0 (√(9+16))

# Unit vector (normalize)
x_hat = x / np.linalg.norm(x)
print(x_hat)  # → [0.6 0.8]
print(np.linalg.norm(x_hat))  # → 1.0 ✓
```
The dot product measures alignment. In ML, it's how a neuron "scores" its input — high positive = strong match, near zero = no signal, negative = opposite.