Singular Value Decomposition and Image Processing

23 Jun 2015

The singular value decomposition (SVD) is a factorization of a real or complex matrix. It has many useful applications in signal processing and statistics.

Singular Value Decomposition

SVD is the factorization of a $ m \times n $ matrix $ Y $ into three matrices as:

$\mathbf{Y = UDV^\top}$

With:

$ U $ is an $ m\times n $ orthogonal matrix
$ V $ is an $ n\times n $ orthogonal matrix
$ D $ is an $ n\times n $ diagonal matrix

In R The result of svd(X) is actually a list of three components named d, u and v, such that Y = U %*% D %*% t(V).

Video about SVD

</embed>

Example

# cleanup
rm(list=ls())

dat <- seq(1,36,2)

Y <- matrix(dat,ncol=6)
Y

##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]    1    7   13   19   25   31
## [2,]    3    9   15   21   27   33
## [3,]    5   11   17   23   29   35

# Apply SVD to get U, V, and D
s <- svd(Y)
U <- s$u
V <- s$v
D <- diag(s$d) ##turn it into a matrix

we can reconstruct Y

Yhat <- U %*% D %*% t(V)
Yhat

##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]    1    7   13   19   25   31
## [2,]    3    9   15   21   27   33
## [3,]    5   11   17   23   29   35

resid <- Y - Yhat
max(abs(resid))

## [1] 2.309264e-14

Image processing

Load the image and convert it to a greyscale:

# source("http://bioconductor.org/biocLite.R")
# biocLite("ripa", dependencies=TRUE)
# biocLite("rARPACK", dependencies=TRUE)
# install.packages('devtools')
# library(devtools)
# install_github('ririzarr/rafalib')

# Load libraries
library(rARPACK)
library(ripa)    #  function "imagematrix"
library(EBImage)
library(jpeg)
library(png)
#
library(rafalib)
mypar2()
#
# Read the image
img <- readImage("images/pansy.jpg") 
dim(img)

## [1] 465 600   3

display(img, method = "raster")

plot of chunk unnamed-chunk-3

# Convert image to greyscale
r <- imagematrix(img, type = "grey")

# display(r, method = "raster", all=TRUE)
display(r, method = "raster")

plot of chunk unnamed-chunk-3

dim(r)

## [1] 465 600

str(r)

##  imagematrix [1:465, 1:600] 0.345 0.282 0.29 0.306 0.29 ...
##  - attr(*, "type")= chr "grey"
##  - attr(*, "class")= chr [1:2] "imagematrix" "matrix"

Apply SVD to get U, V, and D

# Apply SVD to get u, v, and d
r.svd <- svd(r)

u <- r.svd$u
v <- r.svd$v
d <- diag(r.svd$d)
dim(d)

## [1] 465 465

## [1] 465 465

Plot the magnitude of the singular values

# check svd$d values 
# Plot the magnitude of the singular values
sigmas = r.svd$d # diagonal matrix (the entries of which are known as singular values)
plot(1:length(r.svd$d), r.svd$d, xlab="i-th r.svd$d", ylab="r.svd$d",  main="Singular Values");

plot of chunk unnamed-chunk-5

plot(1:length(r.svd$d), cumsum(r.svd$d) / sum(r.svd$d), main="Cumulative Percent of Total Sigmas");

plot of chunk unnamed-chunk-5 Not that, the total of the first n singular values divided by the sum of all the singular values is the percentage of “information” that those singular values contain. If we want to keep 90% of the information, we just need to compute sums of singular values until we reach 90% of the sum, and discard the rest of the singular values.

# first approximation
u1 <- as.matrix(u[-1, 1])
v1 <- as.matrix(v[-1, 1])
d1 <- as.matrix(d[1, 1])
l1 <- u1 %*% d1 %*% t(v1)
l1g <- imagematrix(l1, type = "grey")
#plot(l1g, useRaster = TRUE)
display(l1g, method = "raster", all=TRUE)

plot of chunk unnamed-chunk-6

# more approximation
depth <- 5
us <- as.matrix(u[, 1:depth])
vs <- as.matrix(v[, 1:depth])
ds <- as.matrix(d[1:depth, 1:depth])
ls <- us %*% ds %*% t(vs)
lsg <- imagematrix(ls, type = "grey")

## Warning: Pixel values were automatically clipped because of range over.

#plot(lsg, useRaster = TRUE)
display(lsg, method = "raster")

plot of chunk unnamed-chunk-6

# more approximation
depth <- 20
us <- as.matrix(u[, 1:depth])
vs <- as.matrix(v[, 1:depth])
ds <- as.matrix(d[1:depth, 1:depth])
ls <- us %*% ds %*% t(vs)
lsg <- imagematrix(ls, type = "grey")

## Warning: Pixel values were automatically clipped because of range over.

#plot(lsg, useRaster = TRUE)
display(lsg, method = "raster")

plot of chunk unnamed-chunk-6

Image Compression with the SVD

Here we continue to show how the SVD can be used for image compression (as we have seen above).

factorize = function(m, k){
  r = svds(m[, , 1], k);
  g = svds(m[, , 2], k);
  b = svds(m[, , 3], k);
  return(list(r = r, g = g, b = b));
}

recoverimg = function(lst, k){
  recover0 = function(fac, k){
    dmat = diag(k);
    diag(dmat) = fac$d[1:k];
    m = fac$u[, 1:k] %*% dmat %*% t(fac$v[, 1:k]);
    m[m < 0] = 0;
    m[m > 1] = 1;
    return(m);
  }

  r = recover0(lst$r, k);
  g = recover0(lst$g, k);
  b = recover0(lst$b, k);
  m = array(0, c(nrow(r), ncol(r), 3));
  m[, , 1] = r;
  m[, , 2] = g;
  m[, , 3] = b;
  return(m);
}

rawimg <- readJPEG("images/pansy.jpg");

lst = factorize(rawimg, 100);

neig = c(1, 5, 20, 50, 100);
for(i in neig){
  m = recoverimg(lst, i);
  writeJPEG(m, sprintf("images/svd_%d.jpg", i), 0.95);
  #display(m, method = "raster")
  fname <- sprintf("images/svd_%d.jpg", i)
  display(readImage(fname), title="svd_%d", method = "raster")
}

Original image
Singluar Value k = 1
Singluar Value k = 5
Singluar Value k = 20
Singluar Value k = 50
Singluar Value k = 100
Analysis

With only 10% of the real data we are able to create a very good approximation of the real data.

References

If you liked this post, you can share it with your followers or follow me on Twitter!

Data Science and R

Home About Archive Links

Singular Value Decomposition and Image Processing