Singular Value Decomposition - Eigenvectors of A Symmetric Matrix



In this fourth post of the series, let’s first review the two concepts introduced in the last post — transpose and dot product — but in a new context.

Partitioned matrix

When calculating the transpose of a matrix, it is usually useful to represent it as a partitioned matrix. For example, the matrix

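$$C = \begin{bmatrix} 5 & 4 & 2 \\ 7 & 1 & 9 \end{bmatrix}$$

can also be written as:

$$C = \begin{bmatrix} \mathbf{u}_1 & \mathbf{u}_2 & \mathbf{u}_3 \end{bmatrix}$$

where

$$\mathbf{u}_1 = \begin{bmatrix} 5 \\ 7 \end{bmatrix}, \quad \mathbf{u}_2 = \begin{bmatrix} 4 \\ 1 \end{bmatrix}, \quad \mathbf{u}_3 = \begin{bmatrix} 2 \\ 9 \end{bmatrix}$$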

We can think of each column of C as a column vector, and C can be thought of as a matrix with just one row. To write the transpose of C, we can simply turn this row into a column, similar to what we do for a row vector. The only difference is that each element in C is now a vector itself and should be transposed too.

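$$\mathbf{u}_1^T = \begin{bmatrix} 5 & 7 \end{bmatrix}, \quad \mathbf{u}_2^T = \begin{bmatrix} 4 & 1 \end{bmatrix}, \quad \mathbf{u}_3^T = \begin{bmatrix} 2 & 9 \end{bmatrix}$$

Therefore,

$$C^T = \begin{bmatrix} \mathbf{u}_1^T \\ \mathbf{u}_2^T \\ \mathbf{u}_3^T \end{bmatrix} = \begin{bmatrix} 5 & 7 \\ 4 & 1 \\ 2 & 9 \end{bmatrix}$$

Each row of $C^T$ is the transpose of the corresponding column of the original matrix C.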
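As a quick check in R, the sketch below builds C from its three columns and confirms that each row of the transpose holds the same numbers as the corresponding column of C. The variable names (mat_c, mat_c_t, rows_match) are chosen here for illustration only.

```r
# Build C from its columns u1, u2, u3 (cbind places each vector in a column)
u1 <- c(5, 7)
u2 <- c(4, 1)
u3 <- c(2, 9)
mat_c <- cbind(u1, u2, u3)

# t() returns the transpose: the columns of C become the rows of t(C)
mat_c_t <- t(mat_c)

# Row i of the transpose holds the same numbers as column i of C
rows_match <- all(mat_c_t[1, ] == u1, mat_c_t[2, ] == u2, mat_c_t[3, ] == u3)

print(mat_c_t)
print(rows_match)
```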

Let matrix A be a partitioned column matrix and matrix B be a partitioned row matrix:

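$$A = \begin{bmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \cdots & \mathbf{a}_p \end{bmatrix}, \quad B = \begin{bmatrix} \mathbf{b}_1^T \\ \mathbf{b}_2^T \\ \vdots \\ \mathbf{b}_p^T \end{bmatrix}$$

where each column vector $\mathbf{a}_i$ is defined as the i-th column of A:

$$\mathbf{a}_i = \begin{bmatrix} a_{1,i} \\ a_{2,i} \\ \vdots \\ a_{m,i} \end{bmatrix}$$

For each element, the first subscript refers to the row number and the second subscript to the column number. Therefore, A is an m×p matrix. In addition, B is a p×n matrix where each row vector $\mathbf{b}_i^T$ is the i-th row of B:

$$\mathbf{b}_i^T = \begin{bmatrix} b_{i,1} & b_{i,2} & \cdots & b_{i,n} \end{bmatrix}$$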

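Note that by convention, a vector is written as a column vector. To write a row vector, we write it as the transpose of a column vector. Here $\mathbf{b}_i$ is a column vector, and its transpose $\mathbf{b}_i^T$ is a row vector that captures the i-th row of B. To calculate AB:

$$C = AB = \begin{bmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \cdots & \mathbf{a}_p \end{bmatrix} \begin{bmatrix} \mathbf{b}_1^T \\ \mathbf{b}_2^T \\ \vdots \\ \mathbf{b}_p^T \end{bmatrix} = \mathbf{a}_1 \mathbf{b}_1^T + \mathbf{a}_2 \mathbf{b}_2^T + \cdots + \mathbf{a}_p \mathbf{b}_p^T$$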

The product of the i-th column of A and the i-th row of B gives an m×n matrix, and all these matrices are added together to give AB which is also an m×n matrix. As a special case, suppose that x is a column vector. We can calculate Ax such that:

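$$A\mathbf{x} = \begin{bmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \cdots & \mathbf{a}_p \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_p \end{bmatrix} = x_1 \mathbf{a}_1 + x_2 \mathbf{a}_2 + \cdots + x_p \mathbf{a}_p$$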

Ax is simply a linear combination of the columns of A.
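Both identities can be checked numerically. The following is a minimal sketch with made-up matrices: mat_p stands in for A (2×3), mat_q stands in for B (3×2), and x is an arbitrary vector. None of these values come from the post.

```r
# Hypothetical example: mat_p plays the role of A (m = 2, p = 3),
# mat_q plays the role of B (p = 3, n = 2)
mat_p <- matrix(1:6, nrow = 2)                   # columns are (1,2), (3,4), (5,6)
mat_q <- matrix(c(1, 0, 2, 1, 3, 1), nrow = 3)   # rows are (1,1), (0,3), (2,1)

# Add up (i-th column of P) times (i-th row of Q); %o% is the outer product,
# so every term is an m x n matrix
outer_sum <- matrix(0, nrow = nrow(mat_p), ncol = ncol(mat_q))
for (i in seq_len(ncol(mat_p))) {
  outer_sum <- outer_sum + mat_p[, i] %o% mat_q[i, ]
}

# The sum of outer products agrees with ordinary matrix multiplication
products_match <- all.equal(outer_sum, mat_p %*% mat_q)

# Special case: P x is a linear combination of the columns of P
x <- c(2, -1, 3)
lin_comb <- x[1] * mat_p[, 1] + x[2] * mat_p[, 2] + x[3] * mat_p[, 3]
combination_matches <- all.equal(as.vector(mat_p %*% x), lin_comb)

print(products_match)
print(combination_matches)
```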

To calculate the dot product of two vectors a and b in R, we can use a %*% b, as matrix multiplication.
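For instance, with two arbitrary example vectors (the values are not from the post), `%*%` returns the dot product as a 1×1 matrix. If a plain number is preferred, `sum(a * b)` or `drop()` gives the same value.

```r
a <- c(1, 2, 3)
b <- c(4, 5, 6)

# %*% on two vectors of equal length returns the inner product as a 1 x 1 matrix
dot_matrix <- a %*% b

# Equivalent scalar forms: sum of elementwise products, or drop() on the 1 x 1 matrix
dot_scalar <- sum(a * b)        # 1*4 + 2*5 + 3*6 = 32
dot_dropped <- drop(a %*% b)    # 32

print(dim(dot_matrix))
print(dot_scalar)
```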

Length of a Vector

Now that we are familiar with the transpose and dot product, we can define the length (also called the 2-norm) of vector u as:

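$$\|\mathbf{u}\| = \sqrt{\mathbf{u} \cdot \mathbf{u}} = \sqrt{\mathbf{u}^T \mathbf{u}} = \sqrt{u_1^2 + u_2^2 + \cdots + u_n^2}$$

To normalize a vector u, we simply divide it by its length to get the normalized vector n:

$$\mathbf{n} = \frac{\mathbf{u}}{\|\mathbf{u}\|}$$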

The normalized vector n still points in the same direction as u, but its length is 1. Now we can normalize the eigenvector of λ=1 that we saw before in the second post of this series:

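$$\mathbf{u}_2 = \begin{bmatrix} -1 \\ 1 \end{bmatrix}, \quad \|\mathbf{u}_2\| = \sqrt{(-1)^2 + 1^2} = \sqrt{2}, \quad \mathbf{n} = \frac{\mathbf{u}_2}{\|\mathbf{u}_2\|} = \begin{bmatrix} -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{bmatrix} \approx \begin{bmatrix} -0.7071 \\ 0.7071 \end{bmatrix}$$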

which is the same as the output of eigen(mat_b). As shown before, if we multiply (or divide) an eigenvector by a constant, the new vector is still an eigenvector for the same eigenvalue. Therefore, by normalizing an eigenvector corresponding to an eigenvalue, we’d still have an eigenvector for that eigenvalue.
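The length and normalization formulas translate directly into R. The helper names vec_length and normalize below are hypothetical (they are not base R functions and do not appear in the series), and the vector (3, 4) is just a convenient example.

```r
# Hypothetical helpers: length (2-norm) of a vector and normalization
vec_length <- function(u) sqrt(sum(u^2))
normalize <- function(u) u / vec_length(u)

u <- c(3, 4)
len_u <- vec_length(u)     # sqrt(3^2 + 4^2) = 5
n <- normalize(u)          # (0.6, 0.8), same direction as u
len_n <- vec_length(n)     # 1, up to floating-point rounding

# The same length via the transpose form: crossprod(u) computes t(u) %*% u
len_crossprod <- sqrt(drop(crossprod(u)))

print(len_u)
print(n)
print(len_n)
```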

Revisit Eigenvectors

Why are eigenvectors important to us? As mentioned before, an eigenvector simplifies the matrix multiplication into a scalar multiplication. In addition, they have some more interesting properties. Let me go back to matrix A which was used in a previous post and calculate its eigenvectors:

$$A = \begin{bmatrix} 3 & 2 \\ 0 & 2 \end{bmatrix}$$

In the previous post, this matrix transformed a set of vectors X forming a circle into a new set T forming an ellipse. Let’s use eigen() to calculate its eigenvectors.

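mat_a <- matrix(c(3, 0, 2, 2), 2)  # filled column by column, so the rows are (3, 2) and (0, 2)
eigen_a <- eigen(mat_a)            # list with $values (eigenvalues) and $vectors (one eigenvector per column)
eigen_a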
eigen() decomposition
$values
[1] 3 2

$vectors
     [,1]       [,2]
[1,]    1 -0.8944272
[2,]    0  0.4472136

We got two eigenvectors:

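$$\mathbf{u}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad \mathbf{u}_2 = \begin{bmatrix} -0.8944 \\ 0.4472 \end{bmatrix}$$

and the corresponding eigenvalues are:

$$\lambda_1 = 3, \quad \lambda_2 = 2$$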

Now we plot X and the eigenvectors of A, u1 and u2, as well as the vectors in X transformed by A.

Figure 1. Eigenvectors (left) and transformed eigenvectors (right).

Every vector in X (Figure 1, left), once transformed by A, is both stretched and rotated (Figure 1, right), except for two vectors, u1 and u2, which are only stretched, as though they were multiplied by a scalar. This is because they are the eigenvectors of A, and multiplying a matrix by one of its eigenvectors is equivalent to multiplying that eigenvector by its corresponding eigenvalue:

$$A\mathbf{u} = \lambda \mathbf{u}$$
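This relation can be verified numerically for mat_a. The sketch below also applies A to a vector that is not an eigenvector, (0, 1), picked here purely as an example, to show that its direction changes.

```r
mat_a <- matrix(c(3, 0, 2, 2), 2)
eigen_a <- eigen(mat_a)

u1 <- eigen_a$vectors[, 1]
u2 <- eigen_a$vectors[, 2]
lambda <- eigen_a$values

# A u equals lambda u for each eigenvector (compared up to floating-point rounding)
check_u1 <- all.equal(as.vector(mat_a %*% u1), lambda[1] * u1)
check_u2 <- all.equal(as.vector(mat_a %*% u2), lambda[2] * u2)

# A vector that is not an eigenvector changes direction as well as length
x <- c(0, 1)
ax <- as.vector(mat_a %*% x)   # (2, 2)
cos_angle <- sum(x * ax) / (sqrt(sum(x^2)) * sqrt(sum(ax^2)))
# cos_angle is below 1, so x and Ax do not point in the same direction

print(check_u1)
print(check_u2)
print(cos_angle)
```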

Let’s try another matrix:

$$B = \begin{bmatrix} 3 & 1 \\ 1 & 2 \end{bmatrix}$$

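Its two eigenvectors are:

$$\mathbf{u}_1 = \begin{bmatrix} 0.8507 \\ 0.5257 \end{bmatrix}, \quad \mathbf{u}_2 = \begin{bmatrix} -0.5257 \\ 0.8507 \end{bmatrix}$$

and the corresponding eigenvalues are:

$$\lambda_1 = 3.618, \quad \lambda_2 = 1.382$$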

Figure 2 shows X and the eigenvectors of B, u1 and u2, as well as the vectors in X transformed by B.

Figure 2. Eigenvectors (left) and transformed eigenvectors (right) for matrix B.

This time, the eigenvectors have an interesting property. They are along the major and minor axes of the ellipse (principal axes), and are perpendicular to each other. An ellipse can be thought of as a circle stretched or shrunk along its principal axes as shown in Figure 2, and matrix B transforms the initial circle X by stretching it along u1 and u2, the eigenvectors of B.

How come the eigenvectors of A did not have this property? This is because B is a symmetric matrix. A symmetric matrix is a matrix that is equal to its transpose. Here is an example of a symmetric matrix:

$$\begin{bmatrix} 5 & 0 & 4 & 3 \\ 0 & 7 & 9 & 2 \\ 4 & 9 & 6 & 1 \\ 3 & 2 & 1 & 3 \end{bmatrix}$$

The elements on the main diagonal of a symmetric matrix are arbitrary, but every other element in row i and column j is equal to the element in row j and column i ($a_{ij} = a_{ji}$). A symmetric matrix is therefore always a square matrix (n×n). Clearly, A was not symmetric. A symmetric matrix transforms a vector by stretching or shrinking it along the eigenvectors of the matrix. In particular, a matrix transforms each of its eigenvectors by multiplying its length (or magnitude) by the absolute value of the corresponding eigenvalue.
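A short sketch makes the difference between the two matrices concrete. It uses base R's isSymmetric() and compares the dot product of the two eigenvectors of each matrix. The name mat_sym is chosen here for the symmetric matrix B and is not taken from the series.

```r
mat_a <- matrix(c(3, 0, 2, 2), 2)     # A: not symmetric
mat_sym <- matrix(c(3, 1, 1, 2), 2)   # B: symmetric

a_is_symmetric <- isSymmetric(mat_a)     # FALSE
b_is_symmetric <- isSymmetric(mat_sym)   # TRUE

# Dot product between the two eigenvectors of each matrix
vec_a <- eigen(mat_a)$vectors
vec_sym <- eigen(mat_sym)$vectors
dot_a <- sum(vec_a[, 1] * vec_a[, 2])         # clearly non-zero: not perpendicular
dot_sym <- sum(vec_sym[, 1] * vec_sym[, 2])   # zero up to rounding: perpendicular

print(c(a_is_symmetric, b_is_symmetric))
print(c(dot_a, dot_sym))
```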

Given that the initial vectors in X all have a length of 1 and that both u1 and u2 are normalized, they are members of X. Their transformed vectors are:

$$B\mathbf{u}_1 = \lambda_1 \mathbf{u}_1, \quad B\mathbf{u}_2 = \lambda_2 \mathbf{u}_2$$

Therefore, the amount of stretching or shrinking along each eigenvector is proportional to the corresponding eigenvalue as shown in Figure 2.

When there is more stretching in the direction of an eigenvector, the eigenvalue corresponding to that eigenvector is greater. In fact, if the absolute value of an eigenvalue is greater than 1, the circle X is stretched along the corresponding eigenvector; if it is less than 1, the circle is shrunk along it.
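One way to see this numerically is to sample points on the unit circle, transform them with B, and compare the longest and shortest transformed vectors with the eigenvalues. This is only a sketch: the sampling grid theta and the names mat_sym, circle, and lengths are assumptions made here, not code from the series.

```r
mat_sym <- matrix(c(3, 1, 1, 2), 2)   # matrix B
eigen_sym <- eigen(mat_sym)

# Points on the unit circle, one per column (half-degree steps)
theta <- seq(0, 2 * pi, length.out = 721)
circle <- rbind(cos(theta), sin(theta))

# Transform every point and measure the length of each transformed vector
transformed <- mat_sym %*% circle
lengths <- sqrt(colSums(transformed^2))

# The longest and shortest transformed vectors have lengths close to the
# two eigenvalues, i.e. the amount of stretching along each eigenvector
print(range(lengths))
print(eigen_sym$values)
```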

Let’s try another matrix:

$$C = \begin{bmatrix} 3 & 1 \\ 1 & 0.8 \end{bmatrix}$$

The eigenvectors and corresponding eigenvalues are:

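$$\mathbf{u}_1 = \begin{bmatrix} 0.9327 \\ 0.3606 \end{bmatrix}, \quad \mathbf{u}_2 = \begin{bmatrix} -0.3606 \\ 0.9327 \end{bmatrix}$$

$$\lambda_1 = 3.3866, \quad \lambda_2 = 0.4134$$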

Now if we plot the transformed vectors, we get:

Figure 3. Eigenvectors (left) and transformed eigenvectors (right) for matrix C.

Now we have stretched X along u1, since λ1 is greater than 1, and shrunk it along u2, since λ2 is less than 1.
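The same can be confirmed in R by measuring how long each unit eigenvector becomes after the transformation. The names mat_c_sym, stretch_u1, and shrink_u2 are chosen here for illustration.

```r
mat_c_sym <- matrix(c(3, 1, 1, 0.8), 2)   # matrix C
eigen_c <- eigen(mat_c_sym)

# Eigenvalues rounded to four decimals: about 3.3866 and 0.4134
lambda_c <- round(eigen_c$values, 4)

# eigen() returns unit-length eigenvectors, so the length after the
# transformation is the stretch factor along that direction
stretch_u1 <- sqrt(sum((mat_c_sym %*% eigen_c$vectors[, 1])^2))   # greater than 1
shrink_u2 <- sqrt(sum((mat_c_sym %*% eigen_c$vectors[, 2])^2))    # less than 1

print(lambda_c)
print(c(stretch_u1, shrink_u2))
```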

In the next post, we will continue discussing eigenvectors, but in the context of a basis for a vector space.