Matrix Introduction and Operations

1 Matrix operations

1.1 Introduction

Matrix notation is a way of representing data and equations. An example from Bronson (1995):
T-shirts
Nine teal small and five teal medium; eight plum small and six plum medium; large sizes: three
sand, one rose, two peach; also three medium rose, five medium sand, one peach medium, and seven
peach small.

In a matrix the same information looks like this:

             teal  plum  sand  rose  peach
    small      9     8     0     0     7
S = medium     5     6     5     3     1
    large      0     0     3     1     2

In this format, it is easy to understand and work with the information. By summing each column
we can tell how many of each color there are, and by summing each row we can tell how many of
each size there are.
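
The same bookkeeping is easy to do in code. Here is a minimal Python sketch using plain
nested lists (the numbers are taken from the t-shirt matrix above):

# Rows: small, medium, large; columns: teal, plum, sand, rose, peach.
S = [[9, 8, 0, 0, 7],   # small
     [5, 6, 5, 3, 1],   # medium
     [0, 0, 3, 1, 2]]   # large

# Row sums: total shirts of each size.
row_sums = [sum(row) for row in S]        # [24, 20, 6]
# Column sums: total shirts of each color.
col_sums = [sum(col) for col in zip(*S)]  # [14, 14, 8, 4, 10]
print(row_sums, col_sums)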

Matrices are made up of elements arranged in horizontal rows and vertical columns. The size of a
matrix is given as rows × columns; for example, the t-shirt matrix is 3 × 5. A vector is a matrix
with 1 row × any number of columns or 1 column × any number of rows.

The matrix itself is usually capitalized, and the elements of the matrix are referred to in lower-case
letters with the row and then the column in subscript. In the above matrix S each element sij corresponds
to the element in the ith row and jth column, with s11 representing the number of small teal shirts.
Matrices are often expressed as bold capital Latin letters, whereas vectors are most often
expressed as bold lower-case Latin letters.

1.2 Matrix Addition

Matrices can be added together or subtracted from one another if the dimensions of the two matrices
are the same. If this is true, then each element of the matrix is added to its corresponding element
in the other matrix. Subtraction works similarly. For example, if matrix A is

A = ( a11  a12 )
    ( a21  a22 )

and matrix B is

B = ( b11  b12 )
    ( b21  b22 )

then matrix C = A + B is defined as

C = ( a11 + b11   a12 + b12 )
    ( a21 + b21   a22 + b22 )

A natural extension of matrices to programming is through arrays. Two-dimensional arrays are analogous
to the matrices shown here, but because arrays are not actually matrices, matrix operations
have to be specified explicitly in most general-purpose programming languages.

Algorithm 1 Matrix Addition

if A and B are both n × p matrices then

for i = 1 to n do
for j = 1 to p do
Cij = Aij + Bij
end for
end for
end if
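
A direct Python translation of Algorithm 1 might look like the following sketch (note that
Python indexes from 0 rather than 1; the function name mat_add is arbitrary):

def mat_add(A, B):
    """Element-wise sum of two equally sized matrices (Algorithm 1)."""
    n, p = len(A), len(A[0])
    if n != len(B) or p != len(B[0]):
        raise ValueError("matrices must have the same dimensions")
    return [[A[i][j] + B[i][j] for j in range(p)] for i in range(n)]

# Example: adding two 2 x 2 matrices.
print(mat_add([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[6, 8], [10, 12]]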

1.3 Matrix Multiplication

Matrix multiplication is less intuitive than matrix addition. First, the matrices have to be of
compatible dimensions. Matrices of different sizes can be multiplied as long as the number of columns in the
first matrix equals the number of rows in the second matrix. The formal definition of this is that
any matrix of n rows × p columns can be multiplied by any matrix of p rows × r columns, where
the resulting matrix is n rows × r columns. Second, the actual operations are defined differently. If matrix D is

D = ( d11  d12 )
    ( d21  d22 )

and matrix E is

E = ( e11  e12 )
    ( e21  e22 )

then matrix F = DE is defined as

F = ( d11e11 + d12e21   d11e12 + d12e22 )
    ( d21e11 + d22e21   d21e12 + d22e22 )

In general, each element fij is the sum over k of dik ekj.

This becomes clearer when we are working with linear equations. Linear equations can be
written as:

a11x1 + a12x2 = b1
a21x1 + a22x2 = b2

Or, in matrix-vector notation:

( a11  a12 ) ( x1 )   ( b1 )
( a21  a22 ) ( x2 ) = ( b2 )

that is, Ax = b. Now it is a little easier to see how to multiply the matrices and vectors, and
what the result is. Algorithm 2 gives the general procedure.

Algorithm 2 Matrix Multiplication C = AB

if A is an n × p matrix and B is a p × r matrix then
for i = 1 to n do
for j = 1 to r do
cij = 0
for k = 1 to p do
cij = cij + aik bkj
end for
end for
end for
end if
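
A Python translation of Algorithm 2 could look like this sketch (again 0-indexed; the name
mat_mult is arbitrary):

def mat_mult(A, B):
    """Triple-loop product C = AB (Algorithm 2)."""
    n, p, r = len(A), len(B), len(B[0])
    if len(A[0]) != p:
        raise ValueError("columns of A must equal rows of B")
    C = [[0] * r for _ in range(n)]
    for i in range(n):
        for j in range(r):
            for k in range(p):
                C[i][j] += A[i][k] * B[k][j]
    return C

# A 2 x 3 matrix times a 3 x 2 matrix gives a 2 x 2 matrix.
print(mat_mult([[1, 2, 3], [4, 5, 6]], [[1, 0], [0, 1], [1, 1]]))  # [[4, 5], [10, 11]]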

1.4 The Lande Equation

An example of where this is used in evolution is the Lande equation and G-matrices. Lande (1979)
defined the phenotypic response to selection as

Δz̄ = Gβ

This means that the per-generation change (Δz̄) in a phenotypic trait (z) is equal to the additive
genetic variance (G) multiplied by the selection on that trait (β), or the partial derivative of mean
fitness with respect to the trait (Lande and Arnold 1983). If we expand this to multiple phenotypic
traits we can write the response to selection as a vector of changes in phenotypic traits. The matrix
G contains the additive genetic variances for each trait on the diagonal, and the genetic covariances
between traits are the off-diagonal elements. To find the response to selection, we multiply this G
matrix by the selection vector.

The system can be represented in matrix-vector notation,

( Δz̄1 )   ( G11  G12  G13 ) ( β1 )
( Δz̄2 ) = ( G12  G22  G23 ) ( β2 )
( Δz̄3 )   ( G13  G23  G33 ) ( β3 )

but can also be split into three separate equations:

Δz̄1 = G11β1 + G12β2 + G13β3
Δz̄2 = G12β1 + G22β2 + G23β3
Δz̄3 = G13β1 + G23β2 + G33β3

Looking at it this way, we can see that the change in each trait is found by summing the selective
forces caused by selection on the trait itself and the correlated effects from selection on other traits.
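
As an illustration, the response to selection can be computed with NumPy. The G matrix and
selection gradient below are made-up numbers, chosen only to show the mechanics:

import numpy as np

# Hypothetical 3-trait example: additive genetic variances on the
# diagonal, genetic covariances off the diagonal.
G = np.array([[1.0, 0.3, 0.1],
              [0.3, 0.8, 0.2],
              [0.1, 0.2, 0.5]])
beta = np.array([0.2, -0.1, 0.4])  # selection gradient (made up)

delta_z = G @ beta  # per-generation response to selection
print(delta_z)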

1.5 Matrix Transposition

A matrix can be transposed from A to AT by converting all the columns of matrix A to the rows of
matrix AT and the rows of matrix A to the columns of matrix AT. The first row of A becomes the
first column of AT, and so on. Formally, if A is an n × p matrix, then the transpose
of A is denoted by AT and is defined as AT(j, i) = A(i, j).

Algorithm 3 Matrix Transposition A -> AT

for i = 1 to n do
for j = 1 to p do
AT(j, i) = A(i, j)
end for
end for
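
A Python sketch of Algorithm 3 (0-indexed, building the transpose column by column):

def transpose(A):
    """Swap rows and columns (Algorithm 3): AT[j][i] = A[i][j]."""
    n, p = len(A), len(A[0])
    return [[A[i][j] for i in range(n)] for j in range(p)]

# A 2 x 3 matrix becomes a 3 x 2 matrix.
print(transpose([[1, 2, 3], [4, 5, 6]]))  # [[1, 4], [2, 5], [3, 6]]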

1.6 Matrix Inversion

If we have a system of linear equations Ax = b, we might like to solve for x. In an algebraic equation
this would happen by dividing both sides by A, but in matrix algebra division is undefined. Instead,
we use matrix inversion. A matrix A-1 is defined as the inverse of matrix A if

A A-1 = I

where I is the identity matrix. I is defined as a square matrix with all diagonal elements equal to
one and all off-diagonal elements equal to 0. In order to satisfy this, both matrices must be square
and of the same order. If there is no matrix A-1 that satisfies this condition, the matrix is singular.
Multiplication of a square matrix with its inverse is commutative,

A A-1 = A-1 A = I

but multiplication of two different (square) matrices A and B is not:

AB ≠ BA   (in general)

Here is an example of an inversion:

A = ( 2  1 )       A-1 = (  1  -1 )
    ( 1  1 )              ( -1   2 )

since A A-1 = I.

Matrix inversion can be used in the Lande example to solve for β, the selection vector. The inverse
of a 2 × 2 matrix is found with the determinant, defined as a11 × a22 - a12 × a21. The inverse is:

A-1 = 1/det(A) × (  a22  -a12 )
                 ( -a21   a11 )

The Lande equation specifies that at equilibrium the change in the trait will be zero, or Δz̄ = 0. If
we have biased mutation (w) or some other force acting on the system, we can express it as:

Δz̄ = Gβ + w = 0

To solve for β at equilibrium, we multiply both sides by the inverse matrix:

β = -G-1 w

And we can now solve for each βi.
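
A NumPy sketch of this equilibrium calculation, with a made-up 2 × 2 G matrix and
mutation-bias vector w:

import numpy as np

G = np.array([[1.0, 0.4],
              [0.4, 0.6]])   # hypothetical G matrix
w = np.array([0.1, -0.2])    # hypothetical mutation bias

beta = -np.linalg.inv(G) @ w  # beta = -G^{-1} w, as above
# In practice np.linalg.solve(G, -w) is preferred over forming the inverse.
print(beta)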

1.7 Vector and matrix norms

A norm is a measure of distance. We can apply norms to both vectors and matrices.

1.7.1 Vector norms

Requirements for vector norms are:

1. f(x) ≥ 0 for all x, and f(x) = 0 only if x = 0
2. f(x + y) ≤ f(x) + f(y) (the triangle inequality)
3. f(αx) = |α| f(x) for any scalar α

f(x) is the vector norm and is typically expressed as ||x||. Several vector norms are often used.
The general expression is the p-norm. It is expressed as

||x||p = (Σi |xi|^p)^(1/p),   p ≥ 1

Of these the 1-, 2-, and ∞-norms are the most commonly used ones:

||x||1 = Σi |xi|
||x||2 = (Σi |xi|²)^(1/2)
||x||∞ = maxi |xi|

The 1-norm is also called the Manhattan distance or city block distance, and the 2-norm is the
Euclidean distance. Vector norms have some useful properties, for example the Hölder inequality,

|xT y| ≤ ||x||p ||y||q,   with 1/p + 1/q = 1

A special case is the Cauchy-Schwarz inequality:

|xT y| ≤ ||x||2 ||y||2

Several more inequalities can be used to approximate or bound norms, but you might want to look
into the book by Golub and van Loan (1996).
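
The common vector norms are available in NumPy; a quick sketch:

import numpy as np

x = np.array([3.0, -4.0])
print(np.linalg.norm(x, 1))       # 1-norm: |3| + |-4| = 7
print(np.linalg.norm(x))          # 2-norm: sqrt(9 + 16) = 5
print(np.linalg.norm(x, np.inf))  # infinity-norm: max(|3|, |4|) = 4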

1.7.2 Matrix norms

Matrix norms are an important measure for assessing whether matrices are fit for some operations;
for example, a matrix norm can measure whether a matrix is near singularity. Matrix norms must
satisfy the same requirements as the vector norms. Examples of matrix norms are the Frobenius norm,

||A||F = (Σi Σj |aij|²)^(1/2)

and the p-norm,

||A||p = sup (x ≠ 0) ||Ax||p / ||x||p

Think of sup as the maximum, at least for real numbers. In this way the matrix p-norms can be
expressed in terms of vector norms.
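
And similarly for matrix norms; a quick NumPy sketch (the 2-norm is computed as the largest
singular value):

import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
print(np.linalg.norm(A, 'fro'))  # Frobenius norm: sqrt(1 + 4 + 9 + 16)
print(np.linalg.norm(A, 1))      # 1-norm: maximum absolute column sum (6)
print(np.linalg.norm(A, 2))      # 2-norm: largest singular value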

1.8 Summary of matrix operations not explicitly discussed

1.9 Sources and Additional Reading

Bronson, R. 1995. Linear Algebra: An Introduction. Academic Press, San Diego, CA.
Golub, G. H., and C. F. Van Loan. 1996. Matrix Computations. 3rd edition. Johns Hopkins University
Press, Baltimore and London.
Lande, R. 1979. Quantitative-genetic analysis of multivariate evolution, applied to brain-body size
allometry. Evolution 33: 402-416.
Lande, R., and S. J. Arnold. 1983. The measurement of selection on correlated characters. Evolution
37: 1210-1226.
Trefethen, L. N., and D. Bau, III. 1997. Numerical Linear Algebra. Society for Industrial and
Applied Mathematics, Philadelphia, PA.
