Singular Value Decomposition - Properties of Symmetric Matrices 2



Geometrical interpretation of eigendecomposition

In the previous post, we showed how to express a symmetric matrix as the product of three matrices, a process known as eigendecomposition. In this post, we revisit this procedure, but from a geometrical perspective.

To begin, let’s assume that a symmetric matrix $A$ has $n$ eigenvectors, and that each eigenvector $\mathbf{u}_i$ is an $n \times 1$ column vector

$$\mathbf{u}_i = \begin{bmatrix} u_{i1} \\ u_{i2} \\ \vdots \\ u_{in} \end{bmatrix}$$

then the transpose of $\mathbf{u}_i$ is a $1 \times n$ row vector

$$\mathbf{u}_i^T = \begin{bmatrix} u_{i1} & u_{i2} & \cdots & u_{in} \end{bmatrix}$$

and their multiplication

$$\mathbf{u}_i \mathbf{u}_i^T = \begin{bmatrix} u_{i1} \\ u_{i2} \\ \vdots \\ u_{in} \end{bmatrix} \begin{bmatrix} u_{i1} & u_{i2} & \cdots & u_{in} \end{bmatrix} = \begin{bmatrix} u_{i1}u_{i1} & u_{i1}u_{i2} & \cdots & u_{i1}u_{in} \\ u_{i2}u_{i1} & u_{i2}u_{i2} & \cdots & u_{i2}u_{in} \\ \vdots & \vdots & \ddots & \vdots \\ u_{in}u_{i1} & u_{in}u_{i2} & \cdots & u_{in}u_{in} \end{bmatrix}$$

becomes an $n \times n$ matrix.
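As a quick sketch of this in R (with a made-up 3-dimensional unit vector, purely for illustration), the %*% operator computes exactly this outer product once one side is transposed, and the base function tcrossprod() is a built-in shortcut for the same thing:

u <- c(1, 2, 3) / sqrt(14) # a made-up unit vector in R^3
u %*% t(u)                 # column vector times row vector: a 3x3 matrix
tcrossprod(u)              # built-in equivalent of u %*% t(u)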

Let’s return to $A = PDP^T$. First, we calculate $DP^T$ to simplify the eigendecomposition equation:

$$DP^T = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} \begin{bmatrix} \mathbf{u}_1^T \\ \mathbf{u}_2^T \\ \vdots \\ \mathbf{u}_n^T \end{bmatrix} = \begin{bmatrix} \lambda_1 u_{11} & \lambda_1 u_{12} & \cdots & \lambda_1 u_{1n} \\ \lambda_2 u_{21} & \lambda_2 u_{22} & \cdots & \lambda_2 u_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \lambda_n u_{n1} & \lambda_n u_{n2} & \cdots & \lambda_n u_{nn} \end{bmatrix} = \begin{bmatrix} \lambda_1 \mathbf{u}_1^T \\ \lambda_2 \mathbf{u}_2^T \\ \vdots \\ \lambda_n \mathbf{u}_n^T \end{bmatrix}$$

Now the eigendecomposition becomes:

$$A = PDP^T = \begin{bmatrix} \mathbf{u}_1 & \mathbf{u}_2 & \cdots & \mathbf{u}_n \end{bmatrix} \begin{bmatrix} \lambda_1 \mathbf{u}_1^T \\ \lambda_2 \mathbf{u}_2^T \\ \vdots \\ \lambda_n \mathbf{u}_n^T \end{bmatrix} = \lambda_1 \mathbf{u}_1 \mathbf{u}_1^T + \lambda_2 \mathbf{u}_2 \mathbf{u}_2^T + \cdots + \lambda_n \mathbf{u}_n \mathbf{u}_n^T$$

Therefore, the $n \times n$ matrix $A$ can be broken into $n$ matrices of the same shape ($n \times n$), each with a multiplier equal to the corresponding eigenvalue $\lambda_i$. Each of the matrices $\mathbf{u}_i \mathbf{u}_i^T$ is called a projection matrix.
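Here is a minimal sketch in R that checks this sum-of-projections identity numerically; the 3×3 symmetric matrix below is made up for illustration and is not the matrix $A$ used later in this post:

m <- matrix(c(2, 1, 0,
              1, 3, 1,
              0, 1, 2), 3)  # a made-up 3x3 symmetric matrix
e <- eigen(m)
# add up lambda_i * u_i u_i^T over all three eigenpairs
m_rebuilt <- Reduce(`+`, lapply(1:3, function(i) e$values[i] * tcrossprod(e$vectors[, i])))
all.equal(m, m_rebuilt)     # TRUE, up to floating-point error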

Imagine that we have a vector $\mathbf{x}$ and a unit vector $\mathbf{v}$. The inner product of $\mathbf{v}$ and $\mathbf{x}$, which is equal to $\mathbf{v} \cdot \mathbf{x} = \mathbf{v}^T \mathbf{x}$, gives the scalar projection of $\mathbf{x}$ onto $\mathbf{v}$, which is the length of the vector projection of $\mathbf{x}$ onto $\mathbf{v}$. If we multiply $\mathbf{v}^T \mathbf{x}$ by $\mathbf{v}$ again, $\mathbf{v}\mathbf{v}^T \mathbf{x}$ gives a vector which is called the orthogonal projection of $\mathbf{x}$ onto $\mathbf{v}$. This is shown in Figure 1.

Figure 1: Multiplying a vector by a projection matrix: projecting and scaling (left), then vectorizing (right).

Therefore, when $\mathbf{v}$ is a unit vector, multiplying $\mathbf{v}\mathbf{v}^T$ by $\mathbf{x}$ will give the orthogonal projection of $\mathbf{x}$ onto $\mathbf{v}$, and that is why $\mathbf{v}\mathbf{v}^T$ is called the projection matrix. Multiplying $\mathbf{u}_i \mathbf{u}_i^T$ by $\mathbf{x}$, we get the orthogonal projection of $\mathbf{x}$ onto $\mathbf{u}_i$.
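To make this concrete, here is a small sketch in R, with $\mathbf{x}$ and $\mathbf{v}$ made up for illustration:

v <- c(1, 1) / sqrt(2)    # a made-up unit vector
x <- c(3, 1)              # an arbitrary vector
sum(v * x)                # scalar projection: v^T x
drop(tcrossprod(v) %*% x) # orthogonal projection: (v v^T) x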

Now let’s use R to calculate the projection matrices of the matrix $A$ mentioned before:

$$A = \begin{bmatrix} 3 & 1 \\ 1 & 2 \end{bmatrix}$$

We have already calculated the eigenvalues and eigenvectors of $A$:

mat_a <- matrix(c(3, 1, 1, 2), 2)
eigen_a <- eigen(mat_a)
eigen_a
eigen() decomposition
$values
[1] 3.618034 1.381966

$vectors
           [,1]       [,2]
[1,] -0.8506508  0.5257311
[2,] -0.5257311 -0.8506508

The next chunk will apply the eigendecomposition to $A$ and print the first term, namely $A_1$.

u_a <- eigen_a$vectors # an orthogonal matrix made of A's eigenvectors
lambda_a <- eigen_a$values # a vector of A's eigenvalues
mat_a1 <- lambda_a[1] * u_a[,1] %*% t(u_a[,1]) # first term: lambda_1 * u_1 u_1^T
mat_a1 |> round(3)
      [,1]  [,2]
[1,] 2.618 1.618
[2,] 1.618 1.000

$$A_1 = \lambda_1 \mathbf{u}_1 \mathbf{u}_1^T = 3.618 \begin{bmatrix} -0.851 \\ -0.526 \end{bmatrix} \begin{bmatrix} -0.851 & -0.526 \end{bmatrix} = \begin{bmatrix} 2.618 & 1.618 \\ 1.618 & 1.000 \end{bmatrix}$$

As you can see, $A_1$ is also a symmetric matrix. In fact, it can be shown that all projection matrices $\lambda_i \mathbf{u}_i \mathbf{u}_i^T$ in the eigendecomposition equation are symmetric.
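As a quick check (reusing the objects from the chunks above), we can build the second term the same way and confirm that the two projection matrices add back up to $A$:

mat_a2 <- lambda_a[,drop = FALSE][2] * u_a[, 2] %*% t(u_a[, 2]) # second term: lambda_2 * u_2 u_2^T
all.equal(mat_a, mat_a1 + mat_a2)                               # TRUE: A = A1 + A2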

Other than being symmetric, projection matrices have some interesting properties. Let’s continue with $A_1$ as an example. We can calculate its eigenvalues and eigenvectors:

eigen_a1 <- eigen(mat_a1)
cat("eigenvalues: ", "\n")
# display tiny floating-point values as exactly zero
ifelse(round(eigen_a1$values, 3) < 0.001, 0, round(eigen_a1$values, 3))
cat("eigenvectors: ", "\n")
eigen_a1$vectors |> round(3)
eigenvalues:  
[1] 3.618 0.000
eigenvectors:  
       [,1]   [,2]
[1,] -0.851  0.526
[2,] -0.526 -0.851

$A_1$ has two eigenvalues. One is 0; the other is equal to $\lambda_1$ of the original matrix $A$. In addition, its eigenvectors are identical to those of $A$. This is not a coincidence. To see why, suppose we multiply $A_1$ by $\mathbf{u}_1$:

$$A_1 \mathbf{u}_1 = (\lambda_1 \mathbf{u}_1 \mathbf{u}_1^T)\mathbf{u}_1 = \lambda_1 \mathbf{u}_1 (\mathbf{u}_1^T \mathbf{u}_1)$$

We know that $\mathbf{u}_1$ is an eigenvector and that it is normalized. Therefore, its length is equal to 1, and so is its inner product with itself: $\mathbf{u}_1^T \mathbf{u}_1 = 1$. Thus we have:

$$A_1 \mathbf{u}_1 = (\lambda_1 \mathbf{u}_1 \mathbf{u}_1^T)\mathbf{u}_1 = \lambda_1 \mathbf{u}_1$$

Thus, $\mathbf{u}_1$ is an eigenvector of $A_1$, and the corresponding eigenvalue is $\lambda_1$.

Furthermore,

$$A_1 \mathbf{u}_2 = (\lambda_1 \mathbf{u}_1 \mathbf{u}_1^T)\mathbf{u}_2 = \lambda_1 \mathbf{u}_1 (\mathbf{u}_1^T \mathbf{u}_2)$$

Because $A$ is symmetric, its eigenvectors $\mathbf{u}_1$ and $\mathbf{u}_2$ are orthogonal, or perpendicular. Given that the inner product of two perpendicular vectors is zero, the inner product of $\mathbf{u}_1$ and $\mathbf{u}_2$ is zero. Thus we have

$$A_1 \mathbf{u}_2 = (\lambda_1 \mathbf{u}_1 \mathbf{u}_1^T)\mathbf{u}_2 = \lambda_1 \mathbf{u}_1 (\mathbf{u}_1^T \mathbf{u}_2) = \mathbf{0} = 0 \times \mathbf{u}_2$$

which means that $\mathbf{u}_2$ is also an eigenvector of $A_1$, and its corresponding eigenvalue is 0, matching the second value of eigen_a1$values printed above (zero up to rounding).
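Both facts are easy to confirm numerically, reusing the objects defined above:

drop(mat_a1 %*% u_a[, 1]) - lambda_a[1] * u_a[, 1] # ~ c(0, 0): A1 u1 = lambda_1 u1
drop(mat_a1 %*% u_a[, 2])                          # ~ c(0, 0): A1 u2 = 0 * u2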

In general, eigendecomposition breaks a symmetric matrix into a sum of $n$ projection matrices of shape $n \times n$, $\lambda_i \mathbf{u}_i \mathbf{u}_i^T$.

Each projection matrix is also symmetric and shares the same eigenvectors as the original matrix. For a particular projection matrix $\lambda_k \mathbf{u}_k \mathbf{u}_k^T$, the eigenvalue corresponding to the eigenvector $\mathbf{u}_k$ is the $k$-th eigenvalue of $A$, $\lambda_k$, whereas all the remaining eigenvalues are zero.

Recall that a symmetric matrix scales a vector along its eigenvectors, proportionally to the corresponding eigenvalues. Therefore, a projection matrix $\lambda_i \mathbf{u}_i \mathbf{u}_i^T$ stretches or shrinks a vector along $\mathbf{u}_i$ by a factor of $\lambda_i$, but shrinks the vector to zero in all other directions. Let’s illustrate this with $A$ and one of its projection matrices, $A_1$, in Figure 2.

Figure 2: Original vectors (left) and the vectors transformed by a projection matrix (right).

All vectors in $X$ are transformed by $A_1$: stretched along $\mathbf{u}_1$ and shrunk to zero along $\mathbf{u}_2$. As a result, the initial circle becomes a straight line.
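Here is a minimal sketch of that collapse, taking $X$ to be points on the unit circle (an assumption matching the figure): every transformed point has a zero component along $\mathbf{u}_2$, so they all fall on the line spanned by $\mathbf{u}_1$.

theta <- seq(0, 2 * pi, length.out = 100)
x_circle <- rbind(cos(theta), sin(theta)) # 2 x 100: points on the unit circle
y <- mat_a1 %*% x_circle                  # transform every point by A1
max(abs(t(u_a[, 2]) %*% y))               # ~ 0: no component along u_2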

Previously, matrix $A$ transformed the vectors in $X$ into an ellipse, another 2-D shape. And yet, matrix $A_1$ transformed the vectors in $X$ into a line, a 1-D shape. Both $A$ and $A_1$ are symmetric. How come one preserves the dimension whereas the other reduces it? In the next post, we will explain why by introducing the concept of rank.