Singular Value Decomposition - Intro
matrix, updated 2021-11-18
Among all the math subjects I studied as an undergraduate (calculus, real analysis, differential equations, probability theory, numerical analysis), matrix algebra was perhaps the least intuitive to me. Many years have gone by, and I have made several attempts since then to re-learn it. Until this summer, each attempt had ended quietly, either because I lost motivation or because the book I was following introduced some external dependency and side-tracked me into rabbit holes with no return ticket. The pattern repeated itself, and yet I kept trying. My stubbornness is partly because I use statistics heavily in my daily research and work, and matrix algebra is around almost every corner, constantly provoking my curiosity.
Fast forward to this summer. I finally had some free time to indulge in some long-overdue self-development. Once more, I picked up a book on matrix algebra. Halfway through, the book mentioned “singular value decomposition” in passing. I had never seen this concept before and decided to look it up. By some stroke of luck, I came across a blog post that made matrix algebra “stick” for me. It unpacked the concept so well that many fragmented pieces of matrix algebra I had learned in the past started to form a coherent picture in my head. Most important of all, singular value decomposition is not only a beautiful theory but also a useful technique: it has been applied to compressing images and predicting user preferences, among other things.
The original blog post walked readers through a series of illustrations rendered in Python. Here, I will use R to re-create all the illustrations, with some editorial changes to the original text. You can find the original article by Reza Bagheri here.
Introduction
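To understand singular value decomposition, we first need to understand the eigenvalue decomposition of a matrix.
We can think of a matrix $\mathbf{A}$ as a transformation that acts on a vector $\mathbf{x}$ by multiplication to produce a new vector $\mathbf{A}\mathbf{x}$.
For example, the rotation matrix in two dimensions is
$$\mathbf{A} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$
This matrix would rotate a vector about the origin by the angle $\theta$ (counterclockwise for positive $\theta$).
Another example is the stretching matrix
$$\mathbf{B} = \begin{bmatrix} k & 0 \\ 0 & 1 \end{bmatrix}$$
which stretches a vector along the x-axis by a constant factor $k$ and leaves its y-component unchanged. If $\mathbf{x}$ is a vector,
then $\mathbf{A}\mathbf{x}$ is $\mathbf{x}$ rotated by $\theta$, and $\mathbf{B}\mathbf{x}$ is $\mathbf{x}$ stretched by a factor of $k$ in the x-direction.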
vec <- c(1, 0) # original vector
theta <- 30 * pi / 180 # 30 degrees in radians
# rotation matrix for theta
mat_rotate <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), 2)
# stretching matrix for k = 2
mat_stretch <- matrix(c(2, 0, 0, 1), 2)
# vec_rotate is the rotated vector
vec_rotate <- as.vector(mat_rotate %*% vec)
# vec_stretch is the stretched vector
vec_stretch <- as.vector(mat_stretch %*% vec)
# prepare the drawing canvas: fix the plot region size (in inches)
oldpar <- par(pin = c(1.5, 1))
# split the canvas into left and right, 1 row by 2 columns,
# with different widths
layout(
  mat = matrix(c(1, 2),
    nrow = 1,
    ncol = 2
  ),
  heights = 1,
  widths = c(1, 1.5)
)
# define the dimensions of canvas
xlim <- c(-0.5, 1.5)
ylim <- c(-0.5, 1)
# plot original and rotated vectors
# empty plot with blank axis labels and a title
plot(
  xlim, ylim, type = "n", xlab = "", ylab = "", main = "Rotation transform", asp = 1
)
# add reference axes through the origin and a grid
abline(v = 0, h = 0, col = "gray")
grid()
# add vectors to plot
matlib::vectors(
  rbind(vec, vec_rotate),
  col = c("blue", "darkgreen"),
  lwd = c(2, 2),
  angle = 15,
  labels = c(expression(bold(x)), expression(paste(bold(A), bold(x))))
)
# plot original and stretched vectors
# redefine the x-dimension of canvas
xlim <- c(-0.5, 3)
plot(xlim, ylim, type = "n", xlab = "", ylab = "", main = "Stretching transform", asp = 1)
abline(v = 0, h = 0, col = "gray")
grid()
matlib::vectors(
  rbind(vec, vec_stretch),
  col = c("blue", "darkgreen"),
  lwd = c(2, 2),
  angle = 15,
  labels = c(expression(bold(x)), expression(paste(bold(B), bold(x))))
)
par(oldpar)

In Figure 1, the rotation matrix is calculated for $\theta = 30^\circ$ and the stretching matrix for $k = 2$.
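As a quick numeric check of the two transforms, the sketch below rebuilds the same matrices as the plotting code and inspects the results without drawing anything. The helper `vec_length` is a name I made up for this illustration; it is not part of any package.

```r
# same inputs as in the plotting code
vec <- c(1, 0)
theta <- 30 * pi / 180
mat_rotate <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), 2)
mat_stretch <- matrix(c(2, 0, 0, 1), 2)

vec_rotate <- as.vector(mat_rotate %*% vec)
vec_stretch <- as.vector(mat_stretch %*% vec)

# hypothetical helper: Euclidean length of a vector
vec_length <- function(v) sqrt(sum(v^2))

# a rotation should leave the length unchanged
len_before <- vec_length(vec)
len_after <- vec_length(vec_rotate)

# angle of the rotated vector from the x-axis, in degrees
angle_deg <- atan2(vec_rotate[2], vec_rotate[1]) * 180 / pi

print(c(len_before = len_before, len_after = len_after, angle_deg = angle_deg))
# stretching should scale only the x-component
print(vec_stretch)
```

The rotated vector keeps its length and lands at the angle used to build the matrix, while the stretched vector has its x-component multiplied by the stretch factor and its y-component left alone.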
Now we are going to try a different transformation matrix. Suppose that
Instead of applying this matrix to a single vector,
we apply it to a set of vectors $\mathbf{x} = (x_i, y_i)$ with $x_i^2 + y_i^2 = 1$.
It is easy to see that these vectors are one unit away from the origin, so together they trace out the unit circle.
Figure 2 shows the set of
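To make the idea of transforming a whole set of vectors concrete, here is a minimal sketch. It assumes 100 evenly spaced angles, and `mat_demo` is an arbitrary matrix picked for illustration only; it should not be read as the matrix used for the figure. The vectors are stored as the columns of a 2-row matrix, so a single matrix product transforms all of them at once.

```r
# 100 angles around the circle (the count is an arbitrary choice)
angles <- seq(0, 2 * pi, length.out = 100)

# one unit vector per column: row 1 holds x, row 2 holds y
circle <- rbind(cos(angles), sin(angles))

# every column should be one unit away from the origin
col_lengths <- sqrt(colSums(circle^2))
print(range(col_lengths))

# illustrative transformation matrix (an assumption, not from the text)
mat_demo <- matrix(c(3, 0, 2, 2), 2)

# transform all vectors in one matrix product
transformed <- mat_demo %*% circle

# equivalent, but slower: transform one column at a time
by_column <- apply(circle, 2, function(v) mat_demo %*% v)

# largest difference between the two approaches
max_diff <- max(abs(transformed - by_column))
print(max_diff)
```

Both approaches give the same 2-by-100 result; the single matrix product is usually the more idiomatic choice in R because it avoids an explicit loop over columns.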

A word about line 43 in the previous chunk where each vector on the circle was transformed:
Ideally, I should translate
The initial vectors in
The sample vectors
In the next post, I will continue this series with eigenvalues and eigenvectors.