In a previous post,
we saw the effect of multiplying a matrix by its eigenvectors.
The vector does not change in direction;
it is merely stretched or shrunk by a factor equal to the corresponding
eigenvalue.
I reproduce the before and after plots below for three matrices
$A$, $B$, and $C$.
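The original figures are not reproduced here, but a minimal matplotlib sketch along the following lines can regenerate plots of this kind. The matrices $A$, $B$, and $C$ below are illustrative stand-ins ($A$ and $B$ symmetric, $C$ not), not necessarily the post's originals:

```python
import numpy as np
import matplotlib.pyplot as plt

# Illustrative stand-ins: A and B are symmetric, C is not.
# These are NOT necessarily the matrices from the original post.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
B = np.array([[2.0, 0.0], [0.0, -1.0]])
C = np.array([[3.0, 1.0], [0.0, 2.0]])

theta = np.linspace(0, 2 * np.pi, 200)
circle = np.vstack([np.cos(theta), np.sin(theta)])  # all unit vectors x

fig, axes = plt.subplots(2, 3, figsize=(10, 6))
for col, (name, M) in enumerate([("A", A), ("B", B), ("C", C)]):
    vals, vecs = np.linalg.eig(M)
    image = M @ circle
    axes[0, col].plot(circle[0], circle[1])   # before: the unit circle
    axes[1, col].plot(image[0], image[1])     # after: its image under M
    for i in range(2):
        v = vecs[:, i].real
        axes[0, col].arrow(0, 0, v[0], v[1], head_width=0.05)
        axes[1, col].arrow(0, 0, *(M @ v), head_width=0.05)
    axes[0, col].set_title(f"eigenvectors of {name}")
    axes[1, col].set_title(f"after multiplying by {name}")
    for ax in (axes[0, col], axes[1, col]):
        ax.set_aspect("equal")
plt.tight_layout()
plt.show()
```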
There is one subtle difference among
$A$, $B$, and $C$.
Take $A$, with eigenvectors $u_1$ and $u_2$, for example:
the length of $Au_1$ is the maximum of $\|Ax\|$
over all unit vectors $x$,
and the length of $Au_2$ is the maximum of $\|Ax\|$
over all unit vectors $x$
that are perpendicular to $u_1$.
The same pattern applies to $B$ as well.
However, for $C$ the pattern breaks:
the length of $Cu_1$, with $u_1$ now an eigenvector of $C$,
is certainly NOT the maximum of $\|Cx\|$
over all unit vectors $x$.
As always, there are no coincidences in mathematics,
and this is no exception.
For a symmetric matrix $A$
with eigenvectors $u_1, u_2, \ldots, u_n$
(ordered by the magnitude of their eigenvalues),
$\|Au_i\|$ is the maximum of $\|Ax\|$
over all unit vectors $x$
that are perpendicular to the first $i-1$ eigenvectors $u_1, \ldots, u_{i-1}$ of $A$.
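A quick numeric check of this claim, reusing the illustrative symmetric $A$ from the sketch above (a brute-force sweep, not a proof):

```python
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 2.0]])    # illustrative symmetric matrix from above
vals, vecs = np.linalg.eigh(A)             # eigh: real eigenvalues in ascending order
order = np.argsort(-np.abs(vals))          # reorder by |eigenvalue|, descending
u1, u2 = vecs[:, order[0]], vecs[:, order[1]]

theta = np.linspace(0, 2 * np.pi, 100_000)
xs = np.vstack([np.cos(theta), np.sin(theta)])    # a dense sample of unit vectors x
norms = np.linalg.norm(A @ xs, axis=0)            # ||Ax|| for each sample

print(norms.max(), np.linalg.norm(A @ u1))        # both ~ the global maximum
perp = np.abs(u1 @ xs) < 1e-3                     # keep x (numerically) perpendicular to u1
print(norms[perp].max(), np.linalg.norm(A @ u2))  # both ~ the constrained maximum
```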
The question remains, then:
among all unit vectors $x$,
which one maximizes $\|Ax\|$
when $A$ is not necessarily symmetric?
Let’s digress here for a moment and consider, not $A$,
but $A^TA$.
Given that the transpose of a product is the product of the transposes
in the reverse order, we have:

$$(A^TA)^T = A^T(A^T)^T = A^TA$$

In other words, $A^TA$ is equal to its transpose,
and therefore it is a symmetric matrix.
From previous posts,
we know that a symmetric matrix such as $A^TA$
has real eigenvalues
and linearly independent, mutually orthogonal eigenvectors.
Next, let’s calculate the eigenvalues and eigenvectors of $C^TC$.
Let’s label these eigenvectors as $v_1$
and $v_2$,
and we can assume that they are normalized.
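With numpy this is a one-liner (again using the illustrative $C$ from above; `np.linalg.eigh` returns normalized, orthogonal eigenvectors for a symmetric input):

```python
import numpy as np

C = np.array([[3.0, 1.0], [0.0, 2.0]])    # illustrative non-symmetric matrix from above
CtC = C.T @ C
assert np.allclose(CtC, CtC.T)             # C^T C is symmetric

lam, V = np.linalg.eigh(CtC)               # real eigenvalues, orthonormal eigenvectors
lam, V = lam[::-1], V[:, ::-1]             # reorder so that lambda_1 >= lambda_2
v1, v2 = V[:, 0], V[:, 1]

print(lam)                                 # both eigenvalues are non-negative
print(np.linalg.norm(v1), v1 @ v2)         # v1 is normalized and orthogonal to v2
```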
Before we proceed, take a guess at
what you would see if we plot $Cx$ for all unit vectors $x$, $Cv_1$,
and $Cv_2$.
Recall the question we asked earlier:
among all unit vectors $x$,
which one maximizes $\|Cx\|$?
It seems that we have found the answer.
It is the eigenvectors of $C^TC$.
We have shown that this is true in the example of matrix $C$.
In general, for an $m \times n$ matrix $A$,
it can be shown that $Av_i$ has the greatest length
among all $Ax$ with $x$ a unit vector
perpendicular to the previous eigenvectors $v_1, \ldots, v_{i-1}$,
where $v_1, v_2, \ldots, v_n$
are eigenvectors of $A^TA$.
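A brute-force sweep over unit vectors confirms this for the illustrative $C$ (a numeric sanity check under the same assumed matrix as above, not a proof):

```python
import numpy as np

C = np.array([[3.0, 1.0], [0.0, 2.0]])            # illustrative matrix from above
lam, V = np.linalg.eigh(C.T @ C)
v1, v2 = V[:, 1], V[:, 0]                          # eigh is ascending: flip to descending

theta = np.linspace(0, 2 * np.pi, 100_000)
xs = np.vstack([np.cos(theta), np.sin(theta)])     # a dense sample of unit vectors x
norms = np.linalg.norm(C @ xs, axis=0)

print(norms.max(), np.linalg.norm(C @ v1))         # ||C v1|| is the global maximum
perp = np.abs(v1 @ xs) < 1e-3                      # x (numerically) perpendicular to v1
print(norms[perp].max(), np.linalg.norm(C @ v2))   # ||C v2|| is the constrained maximum
```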
For each of these eigenvectors,
we can use the definition of length
and the rule for the product of transposed matrices to get:

$$\|Av_i\|^2 = (Av_i)^T(Av_i) = v_i^TA^TAv_i$$

Let’s assume that the corresponding eigenvalue of $v_i$ is $\lambda_i$,
so that $A^TAv_i = \lambda_i v_i$.
And because $v_i$ is normalized, $v_i^Tv_i = 1$,
and therefore

$$\|Av_i\|^2 = v_i^T(\lambda_i v_i) = \lambda_i v_i^Tv_i = \lambda_i$$
This result shows that all the eigenvalues of $A^TA$
are non-negative.
If we label them in descending order, we have:

$$\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n \ge 0$$
The $i$-th singular value of $A$ is defined as the square root of $\lambda_i$,
denoted $\sigma_i = \sqrt{\lambda_i}$.
Therefore, the singular values of $A$ are the lengths of the vectors $Av_1, Av_2, \ldots, Av_n$.
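For the illustrative $C$, the square roots of the eigenvalues of $C^TC$ do match both the lengths $\|Cv_i\|$ and the singular values that numpy's own SVD routine reports:

```python
import numpy as np

C = np.array([[3.0, 1.0], [0.0, 2.0]])     # illustrative matrix from above
lam, V = np.linalg.eigh(C.T @ C)
lam, V = lam[::-1], V[:, ::-1]              # descending order

sigma = np.sqrt(lam)                        # singular values as defined above
print(sigma)
print(np.linalg.norm(C @ V, axis=0))        # the lengths ||C v_i|| agree
print(np.linalg.svd(C, compute_uv=False))   # and so does numpy's own SVD
```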
An important theorem that forms the backbone of the SVD method:
the maximum value of $\|Ax\|$, subject to the constraints
$\|x\| = 1$ and $x \perp v_1, \ldots, v_{i-1}$,
is $\sigma_i$, and this maximum value is attained at $x = v_i$,
the $i$-th eigenvector of $A^TA$.
In an earlier post,
we mentioned that
a symmetric matrix transforms a vector
by stretching or shrinking the vector along the eigenvectors of this matrix.
A non-symmetric matrix $A$, by contrast,
transforms a vector by stretching or shrinking the vector
along the direction of $Av_i$,
where each $v_i$ is an eigenvector of $A^TA$,
ordered based on its corresponding eigenvalue,
$\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$.
The corresponding singular value is the scalar
that determines the length of the stretching,
$\sigma_i = \|Av_i\| = \sqrt{\lambda_i}$,
where $\lambda_i$ is the corresponding eigenvalue of
$A^TA$.
How can we reconcile these two seemingly different rules?
Let’s take a symmetric matrix, $A$.
Suppose that its $i$-th eigenvector is $u_i$
and the corresponding eigenvalue is $\lambda_i$.
If we multiply $A^TA$ by $u_i$ we get:

$$A^TAu_i = A^T(Au_i) = A^T(\lambda_i u_i) = \lambda_i Au_i = \lambda_i^2 u_i$$

using $A^T = A$,
which means that $u_i$ is also an eigenvector of
$A^TA$,
but its corresponding eigenvalue is $\lambda_i^2$!
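This is easy to confirm numerically with the illustrative symmetric $A$ from earlier:

```python
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 2.0]])     # illustrative symmetric matrix from above
lam, U = np.linalg.eigh(A)                  # eigenvalues/eigenvectors of A itself

for i in range(2):
    u = U[:, i]
    # u_i is also an eigenvector of A^T A, with eigenvalue lambda_i^2:
    print(np.allclose(A.T @ A @ u, lam[i] ** 2 * u))   # True
```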
Now we can see that the previous rule about a symmetric matrix
is nothing but a special case of the more general rule:
a matrix $A$ transforms a vector by stretching or shrinking the vector
along the direction of $Av_i$,
where each $v_i$ is an eigenvector of $A^TA$,
ordered based on its corresponding singular value.
The corresponding singular value is the scalar
that determines the length of the stretching or shrinking,
$\sigma_i = \sqrt{\lambda_i}$,
where $\lambda_i$ is the corresponding eigenvalue of
$A^TA$.
When $A$ is symmetric,
the direction of $Av_i$ will be identical
to that of $Au_i$,
because $A^TA$ has the same eigenvectors as $A$,
so we can take $v_i = u_i$.
Moreover, $Au_i = \lambda_i u_i$.
Therefore, the direction of $Av_i$
is the direction of $Au_i$,
which is the direction of $u_i$ itself
(up to a sign flip when $\lambda_i$ is negative).
That is, a symmetric matrix transforms a vector by stretching or shrinking
the vector along the direction of $u_i$,
its eigenvector!
What about the length of the stretching or shrinking?
We know that $\sigma_i = \sqrt{\lambda_i^2} = |\lambda_i|$,
where $\lambda_i^2$ is the corresponding eigenvalue of
$A^TA$,
and $\lambda_i$ is the corresponding eigenvalue of $A$.
Therefore, a symmetric matrix transforms a vector
along its eigenvectors $u_i$,
scaled by the corresponding eigenvalues $\lambda_i$.
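Closing the loop numerically with the same illustrative symmetric $A$: the singular values are exactly the absolute eigenvalues, and each $Au_i$ is just $u_i$ scaled by $\lambda_i$:

```python
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 2.0]])     # illustrative symmetric matrix from above
lam, U = np.linalg.eigh(A)

print(np.sort(np.abs(lam))[::-1])           # |lambda_i|, descending
print(np.linalg.svd(A, compute_uv=False))   # the singular values match
for i in range(2):
    u = U[:, i]
    print(np.allclose(A @ u, lam[i] * u))   # A stretches u_i by exactly lambda_i
```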
We have come full circle!
In the next post,
we are finally ready to present the singular value decomposition equation!