# Matrix Introduction and Operations

## 1 Matrix operations

**1.1 Introduction**

Matrix notation is a way of representing data and equations. An example from
Bronson (1995):

> T-shirts: Nine teal small and five teal medium; eight plum small and six plum
> medium; large sizes: three sand, one rose, two peach; also three medium rose,
> five medium sand, one peach medium, and seven peach small.

In a matrix the same information looks like this (rows are sizes, columns are colors):

|        | teal | plum | sand | rose | peach |
|--------|------|------|------|------|-------|
| small  | 9    | 8    | 0    | 0    | 7     |
| medium | 5    | 6    | 5    | 3    | 1     |
| large  | 0    | 0    | 3    | 1    | 2     |

In this format, it is easy to understand and work with the information. By
summing each column we can tell how many of each color there are, and by
summing each row we can tell how many of each size there are.

Matrices are made up of elements arranged in horizontal rows and vertical
columns. The size of a matrix is given as rows × columns; for example, the
t-shirt matrix is 3 × 5. A vector is a matrix with 1 row × any number of
columns, or 1 column × any number of rows.

The matrix itself is usually capitalized, and the elements of the matrix are
referred to with lower-case letters, with row and then column in subscript. In
the above matrix S, each element s_{ij} corresponds to the element in the ith
row and jth column, with s_{11} representing the number of small teal shirts.

Matrices are often expressed as capital, bold-typed Latin letters, whereas
vectors are most often expressed as bold lower-case Latin letters.

**1.2 Matrix Addition**

Matrices can be added together or subtracted from one another if the dimensions
of the two matrices are the same. If this is true, then each element of the
matrix is added to its corresponding element in the other matrix; subtraction
works similarly. For example, if matrix **A** is

$$\mathbf{A} = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$$

and matrix **B** is

$$\mathbf{B} = \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix}$$

then matrix **C** = **A** + **B** is defined as

$$\mathbf{C} = \begin{pmatrix} a_{11}+b_{11} & a_{12}+b_{12} \\ a_{21}+b_{21} & a_{22}+b_{22} \end{pmatrix}.$$

A natural extension of matrices to programming is through arrays.
Two-dimensional arrays are analogous to the matrices shown here, but because
arrays are not actually matrices, matrix operations have to be specified in
most general programming languages.

**Algorithm 1 Matrix Addition**

    if A and B are both n × p matrices then
        for i = 1 to n do
            for j = 1 to p do
                C_{ij} = A_{ij} + B_{ij}
            end for
        end for
    end if
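Algorithm 1 maps directly onto nested arrays in code. A minimal Python sketch,
using lists of lists as 2-dimensional arrays (the function name is illustrative,
not from the text):

```python
def matrix_add(A, B):
    """Element-wise sum of two matrices stored as lists of lists."""
    n, p = len(A), len(A[0])
    if n != len(B) or p != len(B[0]):
        raise ValueError("matrices must have the same dimensions")
    # C_ij = A_ij + B_ij, exactly as in Algorithm 1
    return [[A[i][j] + B[i][j] for j in range(p)] for i in range(n)]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = matrix_add(A, B)  # [[6, 8], [10, 12]]
```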

**1.3 Matrix Multiplication**

Matrix multiplication is less intuitive than matrix addition. First, the
matrices have to be of compatible dimensions. Matrices of different sizes can
be multiplied as long as the number of columns in the first matrix equals the
number of rows in the second matrix. The formal definition of this is that any
matrix of n rows × p columns can be multiplied by any matrix of p rows × r
columns, where the resulting matrix is n rows × r columns. Second, the actual
operations are defined differently. If matrix **D** is an n × p matrix and
matrix **E** is a p × r matrix, then matrix **F** = **DE** is defined
element-wise as

$$f_{ij} = \sum_{z=1}^{p} d_{iz} e_{zj}.$$

This becomes more clear when we are working with linear equations. Linear
equations can be written as:

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 &= b_1 \\ a_{21}x_1 + a_{22}x_2 &= b_2 \end{aligned}$$

Or, in matrix-vector notation:

$$\mathbf{A}\mathbf{x} = \mathbf{b}, \qquad \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}.$$

Now it is a little easier to see how to multiply the matrices and vectors, and what the result is.
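The row-times-column rule can be sketched in Python with plain nested lists
(the function name is illustrative):

```python
def matrix_mul(A, B):
    """Product of an n x p matrix A and a p x r matrix B (lists of lists)."""
    n, p, r = len(A), len(B), len(B[0])
    if len(A[0]) != p:
        raise ValueError("columns of A must equal rows of B")
    C = [[0] * r for _ in range(n)]
    for i in range(n):
        for j in range(r):
            # each entry is a sum over the shared index
            for z in range(p):
                C[i][j] += A[i][z] * B[z][j]
    return C

A = [[1, 2, 3], [4, 5, 6]]        # 2 x 3
B = [[7, 8], [9, 10], [11, 12]]   # 3 x 2
C = matrix_mul(A, B)              # [[58, 64], [139, 154]]
```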

**Algorithm 2 Matrix Multiplication C = AB**

    if A is an n × p matrix and B is a p × r matrix then
        for i = 1 to n do
            for j = 1 to r do
                c_{ij} = 0
                for z = 1 to p do
                    c_{ij} = c_{ij} + a_{iz}b_{zj}
                end for
            end for
        end for
    end if

**1.4 The Lande Equation**

An example of where this is used in evolution is the Lande equation and
G-matrices. Lande (1979) defined the phenotypic response to selection as
$\Delta\bar{z} = G\beta$. This means that the per-generation change
($\Delta\bar{z}$) in a phenotypic trait (z) is equal to the additive genetic
variance (G) multiplied by

the selection on that trait (β), or the partial derivative of mean fitness with
respect to the trait (Lande and Arnold 1983). If we expand this to multiple
phenotypic traits, we can write the response to selection as a vector of
changes in phenotypic traits. The matrix G contains the additive genetic
variances for each trait on the diagonal, and the genetic covariances between
traits are the off-diagonal elements. To find the response to selection, we
multiply this G matrix by the selection vector:

$$\begin{pmatrix} \Delta\bar{z}_1 \\ \Delta\bar{z}_2 \\ \Delta\bar{z}_3 \end{pmatrix} = \begin{pmatrix} G_{11} & G_{12} & G_{13} \\ G_{21} & G_{22} & G_{23} \\ G_{31} & G_{32} & G_{33} \end{pmatrix} \begin{pmatrix} \beta_1 \\ \beta_2 \\ \beta_3 \end{pmatrix}$$

The system can be represented in matrix-vector notation, but can also be split
into three separate equations:

$$\begin{aligned} \Delta\bar{z}_1 &= G_{11}\beta_1 + G_{12}\beta_2 + G_{13}\beta_3 \\ \Delta\bar{z}_2 &= G_{21}\beta_1 + G_{22}\beta_2 + G_{23}\beta_3 \\ \Delta\bar{z}_3 &= G_{31}\beta_1 + G_{32}\beta_2 + G_{33}\beta_3 \end{aligned}$$

Looking at it this way, we can see that the change in each trait is found by
summing the selective forces caused by selection on the trait itself and
correlated effects from selection on other traits.
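The multivariate response Δz̄ = Gβ is just a matrix-vector product, so it can be
computed in a few lines of Python (the G and β values below are made up for
illustration, not taken from any real data):

```python
def mat_vec(G, beta):
    """Multiply a square matrix G by a vector beta."""
    return [sum(G[i][j] * beta[j] for j in range(len(beta)))
            for i in range(len(G))]

# Hypothetical 2-trait example: additive genetic variances on the
# diagonal, a positive covariance off the diagonal.
G = [[2.0, 1.0],
     [1.0, 4.0]]
beta = [0.5, 0.25]
dz = mat_vec(G, beta)  # [1.25, 1.5]
```

The off-diagonal covariance term contributes the correlated response described
in the text: each trait changes partly through direct selection on itself and
partly through selection on the other trait.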

**1.5 Matrix Transposition**

A matrix can be transposed from A to A^{T} by converting all the columns of
matrix A to the rows of matrix A^{T} and the rows of matrix A to the columns of
matrix A^{T}. The first row of A becomes the first column of A^{T}, and so on.
The formal definition is: if A is an n × p matrix, then the transpose of A is
denoted by A^{T} and is defined as A^{T}(j, i) = A(i, j).

**Algorithm 3 Matrix Transposition A → A**^{T}

    for i = 1 to n do
        for j = 1 to p do
            A^{T}(j, i) = A(i, j)
        end for
    end for
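The transposition loop translates into a short Python sketch (the function name
is illustrative):

```python
def transpose(A):
    """Return A^T: entry (j, i) of the result is entry (i, j) of A."""
    n, p = len(A), len(A[0])
    return [[A[i][j] for i in range(n)] for j in range(p)]

A = [[1, 2, 3],
     [4, 5, 6]]
AT = transpose(A)  # [[1, 4], [2, 5], [3, 6]]
```

Transposing twice returns the original matrix, which is a quick sanity check
for an implementation.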

**1.6 Matrix Inversion**

If we have a system of linear equations Ax = b, we might like to solve for x.
In an algebraic equation this would happen by dividing both sides by A, but in
matrix algebra division is undefined. Instead, we use matrix inversion. A
matrix A^{-1} is defined as the inverse of matrix A if

$$A A^{-1} = I,$$

where I is the identity matrix. I is defined as a square matrix with all
diagonal elements equal to one and all off-diagonal elements equal to 0. In
order to satisfy this, both matrices must be square and of the same order. If
there is no matrix A^{-1} that satisfies this condition, the matrix is
singular.

Multiplication of a square matrix with its inverse is commutative,

$$A A^{-1} = A^{-1} A = I,$$

but multiplication of two different (square) matrices A and B is not: in
general, AB ≠ BA. Here is an example of an inversion:

$$\begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix}, \qquad \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$

Matrix inversion can be used in the Lande example to solve for β, the selection
vector. The inverse of a 2 × 2 matrix is found with the determinant, defined as
$a_{11}a_{22} - a_{12}a_{21}$. The inverse is:

$$A^{-1} = \frac{1}{a_{11}a_{22} - a_{12}a_{21}} \begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix}$$

The Lande equation specifies that at equilibrium the change in the trait will
be zero, or $\Delta\bar{z} = 0$. If we have biased mutation (w) or some other
force acting on the system, we can express it as:

$$\Delta\bar{z} = G\beta + w = 0$$

To solve for β at equilibrium, we multiply both sides by the inverse matrix:

$$\beta = -G^{-1}w$$

And we can now solve for each β_{i}.
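The determinant formula for a 2 × 2 inverse, and an equilibrium solution of the
form β = −G⁻¹w, can be sketched in Python (the G and w values are hypothetical):

```python
def inv2(A):
    """Inverse of a 2 x 2 matrix via the determinant."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det],
            [-c / det, a / det]]

# Hypothetical G matrix and mutation-bias vector w
G = [[2.0, 1.0],
     [1.0, 2.0]]
w = [0.3, 0.0]

Gi = inv2(G)  # determinant is 2*2 - 1*1 = 3
# beta = -G^{-1} w, computed row by row
beta = [-(Gi[i][0] * w[0] + Gi[i][1] * w[1]) for i in range(2)]
```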

**1.7 Vector and matrix norms**

A norm is a measure of distance. We can apply norms to both vectors and matrices.

**1.7.1 Vector norms**

**Requirements for vector norms are:**

1. $f(x) \ge 0$, with $f(x) = 0$ if and only if $x = 0$ (positivity)
2. $f(\alpha x) = |\alpha| f(x)$ for any scalar $\alpha$ (homogeneity)
3. $f(x + y) \le f(x) + f(y)$ (triangle inequality)

f(x) is the vector norm and is typically expressed as ||x||. Several vector
norms are often used. The general expression is the p-norm:

$$\|x\|_p = \left( \sum_{i=1}^{n} |x_i|^p \right)^{1/p}$$

Of these, the 1-, 2-, and ∞-norms are the most commonly used ones:

$$\|x\|_1 = \sum_{i=1}^{n} |x_i|, \qquad \|x\|_2 = \left( \sum_{i=1}^{n} x_i^2 \right)^{1/2}, \qquad \|x\|_\infty = \max_i |x_i|.$$

The 1-norm is also called Manhattan distance or city block distance, and the
2-norm is the Euclidean distance. Vector norms have some cool properties, for
example the Hölder inequality

$$|x^T y| \le \|x\|_p \|y\|_q, \qquad \frac{1}{p} + \frac{1}{q} = 1.$$

A special case is the Cauchy–Schwarz inequality

$$|x^T y| \le \|x\|_2 \|y\|_2.$$

Several more inequalities can be used to approximate or bound norms; for these
you might want to look into the book by Golub and van Loan (1996).
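The common vector norms are one-liners in Python:

```python
def p_norm(x, p):
    """The p-norm: (sum |x_i|^p)^(1/p)."""
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

def inf_norm(x):
    """The infinity-norm: the largest absolute entry."""
    return max(abs(xi) for xi in x)

x = [3.0, -4.0]
n1 = p_norm(x, 1)   # 7.0  (Manhattan distance)
n2 = p_norm(x, 2)   # 5.0  (Euclidean distance)
ni = inf_norm(x)    # 4.0
```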

**1.7.2 Matrix norms**

Matrix norms are an important measure to assess whether matrices are fit for
some operations; for example, a matrix norm can measure whether a matrix is
near singularity. Matrix norms must satisfy the same requirements as the vector
norms. Examples of matrix norms are the Frobenius norm

$$\|A\|_F = \left( \sum_{i=1}^{n} \sum_{j=1}^{p} |a_{ij}|^2 \right)^{1/2}$$

and the p-norm

$$\|A\|_p = \sup_{x \neq 0} \frac{\|Ax\|_p}{\|x\|_p}.$$

Think of sup as the maximum, at least for real numbers. The matrix norms can
often be broken down into vector norms.
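The Frobenius norm is straightforward to compute; a minimal sketch:

```python
import math

def frobenius(A):
    """Frobenius norm: square root of the sum of squared entries."""
    return math.sqrt(sum(a * a for row in A for a in row))

A = [[1.0, 2.0],
     [2.0, 4.0]]
nf = frobenius(A)  # 5.0
```

The matrix p-norm, by contrast, involves a supremum over all vectors x, which
is one reason the Frobenius norm is a popular easy-to-compute alternative.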

**1.8 Summary of matrix operations not explicitly discussed**

**1.9 Sources and Additional Reading**

Bronson, R. 1995. Linear Algebra: An Introduction. San Diego, CA, Academic
Press.

Golub, G. H., and C. F. van Loan. 1996. Matrix Computations. 3rd edition. Johns
Hopkins University Press, Baltimore and London.

Lande, R. 1979. Quantitative-genetic analysis of multivariate evolution,
applied to brain-body size allometry. Evolution 33:402–416.

Lande, R., and S. J. Arnold. 1983. The measurement of selection on correlated
characters. Evolution 37:1210–1226.

Trefethen, L. N., and D. Bau, III. 1997. Numerical Linear Algebra.
Philadelphia, PA, Society for Industrial and Applied Mathematics.
