Recall that in a previous post
on eigendecomposition,
an $n \times n$ symmetric matrix $\mathbf{A}$ can be decomposed into $n$ matrices
with the same shape ($n \times n$):

$$\mathbf{A} = \lambda_1 \mathbf{u}_1 \mathbf{u}_1^\top + \lambda_2 \mathbf{u}_2 \mathbf{u}_2^\top + \cdots + \lambda_n \mathbf{u}_n \mathbf{u}_n^\top,$$

where $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n$ are eigenvectors of $\mathbf{A}$
and $\lambda_1, \lambda_2, \ldots, \lambda_n$ the corresponding eigenvalues.
Written compactly,

$$\mathbf{A} = \mathbf{P} \mathbf{D} \mathbf{P}^\top,$$

where $\mathbf{P}$ consists of $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n$ as its column vectors
and $\mathbf{D}$ is a diagonal matrix with $\mathbf{A}$’s eigenvalues
on its diagonal.
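As a quick check of this recap, here is a minimal sketch in base R; the symmetric matrix `S` below is an arbitrary example, not one from the earlier post:

```r
# Sketch: eigendecomposition of an arbitrary 2 x 2 symmetric matrix
S <- matrix(c(4, 1,
              1, 3), nrow = 2)        # symmetric, so S = P D P^T
e <- eigen(S)
P <- e$vectors                        # eigenvectors u_i as columns
D <- diag(e$values)                   # eigenvalues on the diagonal
all.equal(S, P %*% D %*% t(P))        # TRUE, up to floating-point error

# Equivalently, as a sum of n same-shape (2 x 2) matrices:
all.equal(S, e$values[1] * P[, 1] %*% t(P[, 1]) +
             e$values[2] * P[, 2] %*% t(P[, 2]))  # TRUE
```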
More generally, any $m \times n$ matrix $\mathbf{A}$ can be decomposed
into $r$ matrices of the same shape ($m \times n$),
where $r$ is the rank of $\mathbf{A}$.
Why should we want to decompose a matrix?
Similar to what decomposing a symmetric matrix does,
we can approximate
a matrix $\mathbf{A}$ by the sum of its first few components.
And why do we want to approximate a matrix?
I will answer this question in the next post.
In this post, let’s focus on how to decompose a matrix,
symmetric or otherwise.
Let $\mathbf{A}$ be an $m \times n$ matrix and $\operatorname{rank}(\mathbf{A}) = r$.
It can be shown that the number of non-zero singular values
of $\mathbf{A}$ is also its rank, $r$.
Since all of those singular values are positive,
we can label them in descending order as $\sigma_1, \sigma_2, \ldots, \sigma_r$,
where $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r > 0$.
We know that each singular value $\sigma_i$ is the square root of $\lambda_i$,

$$\sigma_i = \sqrt{\lambda_i},$$

where $\lambda_1, \lambda_2, \ldots, \lambda_r$ are eigenvalues of $\mathbf{A}^\top \mathbf{A}$
and correspond to eigenvectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r$ of $\mathbf{A}^\top \mathbf{A}$
in the same order.
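This relationship is easy to verify numerically with base R’s `svd()` and `eigen()`; the matrix `A` in this sketch is an arbitrary example:

```r
# Sketch: singular values are square roots of eigenvalues of A^T A
A <- matrix(c(1, 2, 3,
              4, 5, 6), nrow = 2, byrow = TRUE)   # arbitrary 2 x 3, rank 2
svd(A)$d                               # singular values, in descending order
sqrt(eigen(crossprod(A))$values[1:2])  # sqrt of the non-zero eigenvalues of A^T A
```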
Now we can write the singular value decomposition of $\mathbf{A}$ as:

$$\mathbf{A} = \mathbf{U} \mathbf{\Sigma} \mathbf{V}^\top.$$

Next, let’s unpack this equation, starting with the item in the middle,
$\mathbf{\Sigma}$.
$\mathbf{\Sigma}$ (pronounced “sigma”) is an $m \times n$ diagonal matrix,
with $\sigma_1, \sigma_2, \ldots, \sigma_r$ on its diagonal:

$$\mathbf{\Sigma} =
\begin{bmatrix}
\sigma_1 & & & \\
& \ddots & & \\
& & \sigma_r & \\
& & & \mathbf{0}
\end{bmatrix}.$$
In practice, to construct $\mathbf{\Sigma}$,
we can fill an $r \times r$ diagonal matrix
with all the non-zero singular values of $\mathbf{A}$,
$\sigma_1, \sigma_2, \ldots, \sigma_r$.
Then pad the rest with $0$s to make it an $m \times n$ matrix.
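In base R, this construction might look like the following sketch, again with an arbitrary example matrix:

```r
# Sketch: build the m x n matrix Sigma from the non-zero singular values
A <- matrix(c(1, 2, 3,
              4, 5, 6), nrow = 2, byrow = TRUE)
m <- nrow(A); n <- ncol(A)
sigma <- svd(A)$d                  # here all min(m, n) values are non-zero (r = 2)
r <- length(sigma)
Sigma <- matrix(0, m, n)           # an m x n matrix of 0s ...
Sigma[1:r, 1:r] <- diag(sigma)     # ... with sigma_1, ..., sigma_r on the diagonal
Sigma
```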
Next to $\mathbf{\Sigma}$ is $\mathbf{V}^\top$, the transpose of $\mathbf{V}$,
an $n \times n$ matrix consisting of column vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$,
eigenvectors of the symmetric matrix $\mathbf{A}^\top \mathbf{A}$.
$\mathbf{V}$ is an orthogonal matrix,
or orthonormal matrix,
because its columns, $\mathbf{v}_i$, are orthogonal and normalized.
In addition,
a set of orthogonal and normalized vectors, such as the set
$\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$,
is called an orthonormal set.
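A quick sketch of this in R (note that eigenvectors are only determined up to sign, so the columns of `V` may differ from other software’s output by a factor of $-1$):

```r
# Sketch: V's columns are eigenvectors of the symmetric matrix A^T A
A <- matrix(c(1, 2, 3,
              4, 5, 6), nrow = 2, byrow = TRUE)
V <- eigen(crossprod(A))$vectors   # crossprod(A) is t(A) %*% A
zapsmall(crossprod(V))             # V^T V = I: the columns form an orthonormal set
```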
Finally, $\mathbf{U}$ is an $m \times m$ orthogonal matrix.
To understand how $\mathbf{U}$ is constructed,
consider the following statement:

$\{\mathbf{A}\mathbf{v}_1, \mathbf{A}\mathbf{v}_2, \ldots, \mathbf{A}\mathbf{v}_r\}$ is an orthogonal basis that spans $\operatorname{Col}(\mathbf{A})$, the column space of $\mathbf{A}$.

Proof. Because $\mathbf{v}_i$ and $\mathbf{v}_j$ are orthogonal
for $i \ne j$,

$$(\mathbf{A}\mathbf{v}_i)^\top (\mathbf{A}\mathbf{v}_j) = \mathbf{v}_i^\top \mathbf{A}^\top \mathbf{A} \mathbf{v}_j = \mathbf{v}_i^\top (\lambda_j \mathbf{v}_j) = \lambda_j \, \mathbf{v}_i^\top \mathbf{v}_j = 0.$$

Therefore, $\mathbf{A}\mathbf{v}_1, \mathbf{A}\mathbf{v}_2, \ldots, \mathbf{A}\mathbf{v}_r$ are orthogonal to each other.
In addition, $\|\mathbf{A}\mathbf{v}_i\| = \sigma_i$,
where $\sigma_i$ are singular values of $\mathbf{A}$, because

$$\|\mathbf{A}\mathbf{v}_i\|^2 = (\mathbf{A}\mathbf{v}_i)^\top (\mathbf{A}\mathbf{v}_i) = \mathbf{v}_i^\top \mathbf{A}^\top \mathbf{A} \mathbf{v}_i = \lambda_i \, \mathbf{v}_i^\top \mathbf{v}_i = \lambda_i = \sigma_i^2.$$

Recall that $\sigma_i > 0$ when $i \le r$
and $\sigma_i = 0$ for $i > r$.
So $\mathbf{A}\mathbf{v}_1, \mathbf{A}\mathbf{v}_2, \ldots, \mathbf{A}\mathbf{v}_r$
are orthogonal and all non-zero, and thus form a linearly independent set,
all of whose members are in $\operatorname{Col}(\mathbf{A})$.
$\{\mathbf{A}\mathbf{v}_1, \mathbf{A}\mathbf{v}_2, \ldots, \mathbf{A}\mathbf{v}_r\}$ also spans $\operatorname{Col}(\mathbf{A})$.
To see why this is true, take any vector in $\operatorname{Col}(\mathbf{A})$,
$\mathbf{y} = \mathbf{A}\mathbf{x}$,
where $\mathbf{x}$ is an $n \times 1$ vector.
Given that $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is a basis for
$\mathbb{R}^n$, we can write

$$\mathbf{x} = c_1 \mathbf{v}_1 + c_2 \mathbf{v}_2 + \cdots + c_n \mathbf{v}_n,$$

so

$$\mathbf{y} = \mathbf{A}\mathbf{x} = c_1 \mathbf{A}\mathbf{v}_1 + \cdots + c_r \mathbf{A}\mathbf{v}_r + \cdots + c_n \mathbf{A}\mathbf{v}_n = c_1 \mathbf{A}\mathbf{v}_1 + \cdots + c_r \mathbf{A}\mathbf{v}_r,$$

where the terms beyond the $r$-th vanish because $\mathbf{A}\mathbf{v}_i = \mathbf{0}$ for $i > r$.
In other words, any vector in $\operatorname{Col}(\mathbf{A})$
can be written in terms of
$\mathbf{A}\mathbf{v}_1, \mathbf{A}\mathbf{v}_2, \ldots, \mathbf{A}\mathbf{v}_r$.
Therefore, $\{\mathbf{A}\mathbf{v}_1, \mathbf{A}\mathbf{v}_2, \ldots, \mathbf{A}\mathbf{v}_r\}$ is an orthogonal basis for
$\operatorname{Col}(\mathbf{A})$.
We can normalize the vectors $\mathbf{A}\mathbf{v}_i$ ($i = 1, 2, \ldots, r$) to obtain an orthonormal basis
by dividing them by their lengths:

$$\mathbf{u}_i = \frac{\mathbf{A}\mathbf{v}_i}{\|\mathbf{A}\mathbf{v}_i\|} = \frac{\mathbf{A}\mathbf{v}_i}{\sigma_i}.$$

We now have an orthonormal basis
$\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_r\}$,
where $r$ is the rank of $\mathbf{A}$.
In the singular value decomposition equation,
$\mathbf{A} = \mathbf{U} \mathbf{\Sigma} \mathbf{V}^\top$,
$\mathbf{\Sigma}$ is an $m \times n$ matrix.
Therefore, $\mathbf{U}$ needs to be an $m \times m$ matrix.
In case $r < m$, we need to add $m - r$ additional orthonormal vectors,
$\mathbf{u}_{r+1}, \ldots, \mathbf{u}_m$,
to the set $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_r\}$
so that together they span $\mathbb{R}^m$.
One method to find these vectors is
the Gram-Schmidt process.
We will introduce it in another post.
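Here is the construction of $\mathbf{U}$ in code, sketched for the case $r = m$ where no extra vectors are needed, using the same arbitrary example matrix:

```r
# Sketch: u_i = A v_i / sigma_i, for the full-row-rank case r = m
A     <- matrix(c(1, 2, 3,
                  4, 5, 6), nrow = 2, byrow = TRUE)
eig   <- eigen(crossprod(A))
V     <- eig$vectors[, 1:2]              # v_1, v_2: non-zero eigenvalues only
sigma <- sqrt(eig$values[1:2])           # sigma_i = sqrt(lambda_i)
U     <- (A %*% V) %*% diag(1 / sigma)   # normalize each A v_i by its length sigma_i
zapsmall(crossprod(U))                   # U^T U = I: u_1, u_2 are orthonormal
```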
Now we have successfully constructed all three components:
$\mathbf{U}$, $\mathbf{\Sigma}$, and $\mathbf{V}$.
And we can decompose $\mathbf{A}$ as:

$$\mathbf{A} = \mathbf{U} \mathbf{\Sigma} \mathbf{V}^\top = \sigma_1 \mathbf{u}_1 \mathbf{v}_1^\top + \sigma_2 \mathbf{u}_2 \mathbf{v}_2^\top + \cdots + \sigma_r \mathbf{u}_r \mathbf{v}_r^\top.$$

Given that $\mathbf{u}_i$ is an $m$-dimensional column vector,
and $\mathbf{v}_i$ is an $n$-dimensional column vector,
and that $\sigma_i$ is a scalar,
each $\sigma_i \mathbf{u}_i \mathbf{v}_i^\top$ is an $m \times n$ matrix.
Therefore, the singular value decomposition equation
decomposes matrix $\mathbf{A}$ into $r$ matrices
of the same shape ($m \times n$).
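Numerically, this is a one-liner to confirm; the sketch below uses base R’s `svd()`, which returns the components as `u`, `d`, and `v`:

```r
# Sketch: A equals the sum of r rank-one matrices sigma_i u_i v_i^T
A  <- matrix(c(1, 2, 3,
               4, 5, 6), nrow = 2, byrow = TRUE)
s  <- svd(A)
A1 <- s$d[1] * s$u[, 1] %*% t(s$v[, 1])  # first component, a 2 x 3 matrix
A2 <- s$d[2] * s$u[, 2] %*% t(s$v[, 2])  # second component, also 2 x 3
all.equal(A, A1 + A2)                    # TRUE: the two components recover A
```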
Closely inspecting the equation above,
we can gain further insights.
Multiply both sides of the equation with an $n \times 1$ vector $\mathbf{x}$
and we get:

$$\mathbf{A}\mathbf{x} = \sigma_1 \mathbf{u}_1 \mathbf{v}_1^\top \mathbf{x} + \sigma_2 \mathbf{u}_2 \mathbf{v}_2^\top \mathbf{x} + \cdots + \sigma_r \mathbf{u}_r \mathbf{v}_r^\top \mathbf{x}.$$

Earlier, we have also shown that
$\{\mathbf{A}\mathbf{v}_1, \mathbf{A}\mathbf{v}_2, \ldots, \mathbf{A}\mathbf{v}_r\}$
is an orthogonal basis that spans $\operatorname{Col}(\mathbf{A})$
(and each $\mathbf{u}_i$ is just $\mathbf{A}\mathbf{v}_i$ normalized).
Therefore, $\mathbf{A}\mathbf{x}$ can be written as

$$\mathbf{A}\mathbf{x} = a_1 \mathbf{u}_1 + a_2 \mathbf{u}_2 + \cdots + a_r \mathbf{u}_r.$$

Taken together,
we will have $a_i = \sigma_i \mathbf{v}_i^\top \mathbf{x}$:
$\mathbf{v}_i^\top \mathbf{x}$ gives the scalar projection of $\mathbf{x}$
onto $\mathbf{v}_i$, which is then multiplied by the $i$-th singular value
$\sigma_i$.
Thus, each component in the decomposition of $\mathbf{A}\mathbf{x}$
is the result of:

1. projecting $\mathbf{x}$ onto $\mathbf{v}_i$,
2. multiplying the scalar projection by the singular value $\sigma_i$,
3. and finally scaling $\mathbf{u}_i$ with the compound scalar
   resulting from the previous step.
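The same interpretation, written as a quick numerical check (a sketch; `x` is an arbitrary vector):

```r
# Sketch: A x = sum_i sigma_i * (v_i^T x) * u_i
A    <- matrix(c(1, 2, 3,
                 4, 5, 6), nrow = 2, byrow = TRUE)
s    <- svd(A)
x    <- c(1, -1, 2)                      # an arbitrary n x 1 vector
proj <- drop(t(s$v) %*% x)               # scalar projections v_i^T x
all.equal(drop(A %*% x),
          s$d[1] * proj[1] * s$u[, 1] +  # project, scale by sigma_i, then scale u_i
          s$d[2] * proj[2] * s$u[, 2])   # TRUE
```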
Recall that in the eigendecomposition equation

$$\mathbf{A}\mathbf{x} = \lambda_1 \mathbf{u}_1 \mathbf{u}_1^\top \mathbf{x} + \lambda_2 \mathbf{u}_2 \mathbf{u}_2^\top \mathbf{x} + \cdots + \lambda_n \mathbf{u}_n \mathbf{u}_n^\top \mathbf{x},$$

where $\mathbf{A}$ is a symmetric matrix,
$\mathbf{x}$ is projected onto each eigenvector $\mathbf{u}_i$ of $\mathbf{A}$,
then scaled by the corresponding eigenvalue $\lambda_i$.
In the singular value decomposition,
$\mathbf{x}$ is projected onto $\mathbf{v}_i$,
and $\mathbf{u}_i$ is then scaled by the product of the singular value $\sigma_i$ and the
scalar projection of $\mathbf{x}$ onto $\mathbf{v}_i$,
where $\mathbf{v}_i$ are eigenvectors of $\mathbf{A}^\top \mathbf{A}$,
and $\mathbf{u}_i$ are unit vectors
along the directions of $\mathbf{A}\mathbf{v}_i$.
The R package matlib has a function SVD()
that can calculate the singular value decomposition of a matrix $\mathbf{A}$.
Let’s see an example:
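As a sketch, take an arbitrary $2 \times 3$ matrix of rank $2$; any such matrix will do, and I am assuming here that matlib is installed and that `SVD()` returns a list with components `d`, `U`, and `V`:

```r
library(matlib)

A <- matrix(c(1, 2, 3,
              4, 5, 6), nrow = 2, byrow = TRUE)  # 2 x 3, rank 2
s <- SVD(A)
s$d   # the r = 2 non-zero singular values
s$U   # 2 x 2: u_1 and u_2 as columns
s$V   # printed as 3 x 2: only v_1 and v_2
```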
Note that the last component, $\mathbf{V}$, is a $3 \times 2$ matrix,
whereas in the original singular value decomposition equation,
$\mathbf{V}$ is expected to be a $3 \times 3$ matrix,
given that $\mathbf{A}$ is a $2 \times 3$ matrix.
Did the function matlib::SVD() miss something?
Consider the following:

$$\mathbf{A} = \mathbf{U}\,\mathbf{\Sigma}\,\mathbf{V}^\top =
\begin{bmatrix} \mathbf{u}_1 & \mathbf{u}_2 \end{bmatrix}
\begin{bmatrix} \sigma_1 & 0 & 0 \\ 0 & \sigma_2 & 0 \end{bmatrix}
\begin{bmatrix} \mathbf{v}_1^\top \\ \mathbf{v}_2^\top \\ ? \quad ? \quad ? \end{bmatrix}$$

I have used ? as a placeholder for the elements in $\mathbf{V}$
that were missing from the previous output.
It is easy to observe that whatever replaces ?
would be multiplied by the $0$s in the last column of $\mathbf{\Sigma}$
and would therefore contribute nothing to the final result.
This is why matlib::SVD() parsimoniously printed only the first two columns
of $\mathbf{V}$.
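We can convince ourselves by completing $\mathbf{V}$ with an arbitrary third orthonormal column and checking that the product is unchanged (a base-R sketch continuing the example above):

```r
A     <- matrix(c(1, 2, 3,
                  4, 5, 6), nrow = 2, byrow = TRUE)
s     <- svd(A)                              # s$v is 3 x 2, like SVD()'s output
v3    <- qr.Q(qr(s$v), complete = TRUE)[, 3] # a unit vector orthogonal to v_1, v_2
Vfull <- cbind(s$v, v3)                      # the full 3 x 3 V
Sigma <- cbind(diag(s$d), 0)                 # 2 x 3 Sigma; last column is all 0s
all.equal(A, s$u %*% Sigma %*% t(Vfull))     # TRUE: v3 contributes nothing
```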
In the next and final post of this series,
we will walk through an example where singular value decomposition
solves a real-life problem.