Matrices: Arrays That Transform Vectors

Most math textbooks will tell you that a matrix is a rectangular array of numbers arranged in rows and columns.

This is technically correct and pedagogically useless.

Here's what a matrix actually is: a machine that eats vectors and spits out different vectors.

Feed it a vector. Get a transformed vector back. The matrix has rotated it, stretched it, sheared it, or projected it to a lower dimension.

Matrices are how you encode transformations. And once you see them that way, everything about them makes sense.


The Geometric View

Start with this: a 2×2 matrix transforms two-dimensional vectors.

| a  b |
| c  d |

This matrix takes a vector (x, y) and outputs a new vector (x', y').

How? Matrix-vector multiplication:

| a  b | | x |   | ax + by |
| c  d | | y | = | cx + dy |

The output is a linear combination of the columns, weighted by the input components.

But forget the arithmetic for a moment. Think geometrically.

The matrix is transforming space itself. Every point gets mapped to a new point. Lines stay lines (if they don't collapse to points). The origin stays fixed. Parallel lines stay parallel (unless collapsed).

That's what "linear" means. The structure of space is preserved, even as space is stretched, rotated, or sheared.


Where Matrices Come From

A matrix is defined by what it does to basis vectors.

In two dimensions, the standard basis vectors are:

  • ê₁ = (1, 0) — points right
  • ê₂ = (0, 1) — points up

Every vector can be written as a combination: (x, y) = xê₁ + yê₂.

Now suppose you have a transformation T. Where does it send the basis vectors?

Say T(ê₁) = (a, c) and T(ê₂) = (b, d).

Then by linearity, T(xê₁ + yê₂) = xT(ê₁) + yT(ê₂) = x(a,c) + y(b,d) = (ax+by, cx+dy).

The transformation is completely determined by where it sends the basis vectors. And those destinations become the columns of the matrix:

| a  b |
| c  d |

First column: where ê₁ goes. Second column: where ê₂ goes.

This is the key insight: the columns of a matrix tell you where the basis vectors land.

Want to understand a matrix? Look at its columns. They show you what the transformation does to your coordinate axes.
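
Here's a quick numerical check, sketched in NumPy (the matrix entries are an arbitrary example). Multiply by the basis vectors and you get the columns back:

import numpy as np

# An arbitrary 2x2 matrix: columns are (2, 3) and (-1, 4)
M = np.array([[2.0, -1.0],
              [3.0,  4.0]])

e1 = np.array([1.0, 0.0])   # ê₁
e2 = np.array([0.0, 1.0])   # ê₂

print(M @ e1)   # [2. 3.]  -- the first column
print(M @ e2)   # [-1. 4.] -- the second column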


Simple Transformations

Let's build intuition with simple matrices.

Identity Matrix

| 1  0 |
| 0  1 |

Sends ê₁ to (1, 0) and ê₂ to (0, 1). Everything stays where it is. This is the "do nothing" transformation.

Scaling Matrix

| 2  0 |
| 0  3 |

Sends ê₁ to (2, 0) and ê₂ to (0, 3). Space stretches by factor 2 horizontally, factor 3 vertically.

General scaling: diagonal matrix with scale factors on the diagonal.

Rotation Matrix

| cos(θ)  -sin(θ) |
| sin(θ)   cos(θ) |

Rotates every vector by angle θ counterclockwise around the origin.

For θ = 90°: cos(90°) = 0, sin(90°) = 1:

| 0  -1 |
| 1   0 |

Sends ê₁ = (1,0) to (0,1) and ê₂ = (0,1) to (-1,0). Rotates everything 90° left.
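
Here's that rotation as a NumPy sketch, built straight from the formula and checked at θ = 90°:

import numpy as np

def rotation(theta):
    """2x2 counterclockwise rotation by angle theta (in radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s],
                     [s,  c]])

R90 = rotation(np.pi / 2)
print(np.round(R90 @ np.array([1.0, 0.0]), 10))   # [0. 1.]  -- ê₁ lands at (0, 1)
print(np.round(R90 @ np.array([0.0, 1.0]), 10))   # [-1. 0.] -- ê₂ lands at (-1, 0)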

Shear Matrix

| 1  1 |
| 0  1 |

Sends ê₁ to (1, 0) and ê₂ to (1, 1). Vertical lines tilt right. It's like pushing the top of a deck of cards while holding the bottom fixed.

Reflection Matrix

| -1  0 |
|  0  1 |

Flips across the y-axis. Negative determinant—more on this later.

Projection Matrix

| 1  0 |
| 0  0 |

Sends ê₁ to (1, 0) and ê₂ to (0, 0). Collapses all vectors onto the x-axis. Two-dimensional space flattened to a line. Information is lost—this transformation isn't reversible.
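
To see these side by side, here's a sketch that applies each matrix above to one test vector. The vector (2, 1) is an arbitrary choice:

import numpy as np

v = np.array([2.0, 1.0])   # arbitrary test vector

transforms = {
    "identity":    np.array([[1, 0], [0, 1]]),
    "scaling":     np.array([[2, 0], [0, 3]]),
    "rotation 90": np.array([[0, -1], [1, 0]]),
    "shear":       np.array([[1, 1], [0, 1]]),
    "reflection":  np.array([[-1, 0], [0, 1]]),
    "projection":  np.array([[1, 0], [0, 0]]),
}

for name, M in transforms.items():
    print(f"{name:12s} -> {M @ v}")

# identity     -> [2. 1.]
# scaling      -> [4. 3.]
# rotation 90  -> [-1.  2.]
# shear        -> [3. 1.]
# reflection   -> [-2.  1.]
# projection   -> [2. 0.]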


Matrix-Vector Multiplication

Now the arithmetic makes sense.

Multiplying matrix M by vector v means: transform v by M's transformation.

Mv = (result vector)

The rule: linear combination of M's columns, weighted by v's components.

| a  b | | x |     | a |     | b |   | ax + by |
| c  d | | y | = x | c | + y | d | = | cx + dy |

You're scaling the first column by x, scaling the second column by y, and adding them.

This is why the rule works: because transformations must respect linear combinations. If v = xê₁ + yê₂, then T(v) = xT(ê₁) + yT(ê₂).

The columns of M are T(ê₁) and T(ê₂). Multiplying gives you the linear combination.

The arithmetic follows from the geometry. It's not arbitrary.
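
A minimal sketch of that identity: build Mv as x times the first column plus y times the second, then compare with the built-in product (the numbers are arbitrary):

import numpy as np

M = np.array([[1.0, 2.0],
              [3.0, 4.0]])
v = np.array([5.0, 6.0])    # v = (x, y)

x, y = v
by_columns = x * M[:, 0] + y * M[:, 1]   # x·T(ê₁) + y·T(ê₂)

print(by_columns)                        # [17. 39.]
print(M @ v)                             # [17. 39.] -- same answer
print(np.allclose(by_columns, M @ v))    # True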


Higher Dimensions

Everything generalizes.

A 3×3 matrix transforms three-dimensional vectors:

| a  b  c |
| d  e  f |
| g  h  i |

The columns tell you where the three basis vectors go:

  • ê₁ = (1,0,0) → (a, d, g)
  • ê₂ = (0,1,0) → (b, e, h)
  • ê₃ = (0,0,1) → (c, f, i)

An n×n matrix transforms n-dimensional space.

But matrices don't have to be square.

A 3×2 matrix transforms two-dimensional vectors into three-dimensional vectors:

| a  b |
| c  d |
| e  f |

Takes a 2D vector (x, y) and outputs a 3D vector (ax+by, cx+dy, ex+fy).

This is an embedding—mapping a lower-dimensional space into a higher-dimensional space. The 2D plane gets embedded as a surface in 3D space.

An m×n matrix transforms n-dimensional vectors into m-dimensional vectors. It can raise or lower dimension.
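
Here's a small sketch of that, with an arbitrary 3×2 matrix:

import numpy as np

# A 3x2 matrix: eats 2D vectors, outputs 3D vectors
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])   # arbitrary example: (x, y) -> (x, y, x + y)

v = np.array([2.0, 3.0])
print(A @ v)       # [2. 3. 5.] -- a 3D vector
print(A.shape)     # (3, 2): 3 rows (output dimension), 2 columns (input dimension)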


What Matrices Represent

Matrices encode transformations, but they also encode lots of other structured information.

Linear transformations: This is the geometric meaning. Rotate, scale, shear, project.

Systems of equations: A system like:

2x + 3y = 5
4x - y = 7

can be written as matrix equation Av = b:

| 2   3 | | x |   | 5 |
| 4  -1 | | y | = | 7 |

Solving the system means finding the vector v that gets transformed to b.
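
In code this is a one-line call. Here's a sketch with the system above:

import numpy as np

A = np.array([[2.0,  3.0],
              [4.0, -1.0]])
b = np.array([5.0, 7.0])

v = np.linalg.solve(A, b)      # the vector that A transforms into b
print(v)                       # [1.85714286 0.42857143], i.e. x = 13/7, y = 3/7
print(np.allclose(A @ v, b))   # True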

Graphs and networks: The adjacency matrix of a graph has entry (i,j) = 1 if nodes i and j are connected, 0 otherwise. Matrix powers count paths. Eigenvectors find important nodes.
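
Here's a sketch with a tiny example, a three-node path graph. Squaring the adjacency matrix counts walks of length two:

import numpy as np

# Adjacency matrix of the path graph 0 -- 1 -- 2
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])

A2 = np.linalg.matrix_power(A, 2)   # (A²)[i, j] = number of length-2 walks from i to j
print(A2)
# [[1 0 1]
#  [0 2 0]
#  [1 0 1]]
# e.g. exactly one length-2 walk from node 0 to node 2 (through node 1)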

Data: Each row is a data point, each column is a feature. A dataset is a matrix. Matrix operations (multiplication, decomposition) are data operations (transformation, dimensionality reduction).

Operators: In quantum mechanics, observables are matrices (technically: Hermitian operators). The matrix acts on the quantum state vector. Eigenvalues are measurement outcomes.

Changes of coordinates: A matrix can transform between different coordinate systems. Same vector, different representation.

Matrices are the universal language of linear structure.


Properties of Matrices

Some matrices have special properties that make them especially useful or well-behaved.

Square Matrices

Square matrices (n×n) transform a space to itself. These can potentially be inverted—you can undo the transformation.

Invertible Matrices

If M is invertible, there exists M⁻¹ such that MM⁻¹ = M⁻¹M = I (identity).

Geometrically: the transformation can be reversed. No information is lost.

Algebraically: you can solve Mv = b by multiplying by M⁻¹: v = M⁻¹b.

Not all matrices are invertible. Projection matrices aren't—you can't recover the lost dimension.

Determinant: The Invertibility Test

The determinant det(M) is a single number associated with a square matrix.

If det(M) ≠ 0, the matrix is invertible.

If det(M) = 0, the matrix is not invertible. It collapses some dimension—information is lost.

The determinant also measures volume scaling. If you transform a unit cube, det(M) is the signed volume of the resulting shape (negative when the transformation flips orientation).
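
A sketch of the invertibility test, using the scaling and projection matrices from earlier:

import numpy as np

scaling = np.array([[2.0, 0.0],
                    [0.0, 3.0]])
projection = np.array([[1.0, 0.0],
                       [0.0, 0.0]])

print(np.linalg.det(scaling))     # 6.0 -- areas scale by 6; invertible
print(np.linalg.inv(scaling))     # [[0.5, 0], [0, 0.333...]] -- undoes the stretch

print(np.linalg.det(projection))  # 0.0 -- a dimension collapses; not invertible
# np.linalg.inv(projection) raises LinAlgError: Singular matrix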

Symmetric Matrices

A matrix is symmetric if Mᵀ = M (equals its transpose—flip across the diagonal).

Symmetric matrices have special properties: real eigenvalues, orthogonal eigenvectors. They represent particularly nice transformations.

Orthogonal Matrices

A matrix is orthogonal if QᵀQ = I (transpose equals inverse).

Orthogonal matrices represent rotations and reflections—transformations that preserve lengths and angles. They don't stretch or shear, only rotate/reflect.
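
Here's a sketch checking both claims for a rotation matrix (30° is an arbitrary angle):

import numpy as np

theta = np.pi / 6   # 30 degrees
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

print(np.allclose(Q.T @ Q, np.eye(2)))   # True: QᵀQ = I

v = np.array([3.0, 4.0])
print(np.linalg.norm(v))                                      # 5.0
print(np.isclose(np.linalg.norm(Q @ v), np.linalg.norm(v)))   # True: length preserved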


Why Matrices Dominate

Matrices are the data structure of linear algebra.

They represent transformations: functions from vectors to vectors.

They compose: if A and B are transformations, AB is "do B, then do A" (more on this next).

They're computable: multiplying matrices and vectors is fast, even in high dimensions. Computers are built to do this.

They're analyzable: you can decompose matrices (eigenvalues, singular values, LU decomposition). These decompositions reveal structure.

They're everywhere:

  • Computer graphics: every transformation (rotate, scale, project) is a matrix multiply.
  • Machine learning: neural network layers are matrix multiplications.
  • Physics: quantum time evolution is a matrix exponential.
  • Signal processing: the discrete Fourier transform is a matrix multiplication.
  • Economics: input-output models are matrix equations.

Once you cast a problem in terms of matrices, the entire toolkit of linear algebra applies.


The Deep Idea

Here's why matrices matter: they separate the transformation from the vector being transformed.

In calculus, you study functions: f(x) = x². The function and its input are tangled together.

In linear algebra, you study transformations represented by matrices. The matrix M is the transformation. The vector v is what gets transformed. They're separate.

This separation is powerful.

You can study properties of M independent of which v you apply it to. You can compose transformations (multiply matrices) without choosing inputs. You can invert transformations, diagonalize them, decompose them—all without mentioning specific vectors.

The matrix is the structure. Vectors are the content.

Separating structure from content is how you build abstraction. And abstraction is how you handle complexity.


What's Next

We've seen what matrices are: machines that transform vectors, defined by where they send the basis vectors.

But we haven't talked about the most important matrix operation: multiplication.

Why do matrices multiply the way they do—row times column, that weird rule you memorized in school?

Because matrix multiplication is composition of transformations. Do one transformation, then another.

The arithmetic rule falls out of requiring that composition work correctly.

That's next.


Part 3 of the Linear Algebra series.

Previous: Vectors: Quantities with Direction and Magnitude
Next: Matrix Multiplication: Why the Rows and Columns Dance