Dr. Matthieu R Bloch
Wednesday, November 3, 2021
General announcements
Office hours on Friday November 05, 2021
Midterm 2:
Last time:
Today: singular value decomposition
Reading: lecture notes 12/13
Every complex matrix \(\matA\) has at least one complex eigenvector, and every real symmetric matrix has real eigenvalues and at least one real eigenvector.
Note that if \(\matA = \matV\matD\matV^\dagger\) then \[ \matA = \sum_{i=1}^n\lambda_i \vecv_i\vecv_i^\dagger \]
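As a quick numerical check (a minimal NumPy sketch, not from the notes; the matrix here is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
A = (B + B.T) / 2                        # random symmetric test matrix

lam, V = np.linalg.eigh(A)               # eigenvalues (ascending) and orthonormal eigenvectors
A_rebuilt = sum(lam[i] * np.outer(V[:, i], V[:, i]) for i in range(4))

print(np.allclose(A, A_rebuilt))         # True: A = sum_i lambda_i v_i v_i^T
```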
How about real-valued matrices \(\matA\in\bbR^{n\times n}\)?
A symmetric matrix \(\matA\) is positive definite if all its eigenvalues are positive, i.e., \(\forall i\in\set{1,\cdots,n},\ \lambda_i>0\).
A symmetric matrix \(\matA\) is positive semidefinite if all its eigenvalues are nonnegative, i.e., \(\forall i\in\set{1,\cdots,n},\ \lambda_i\geq 0\).
Convention: \(\lambda_1\geq \lambda_2\geq \cdots \geq \lambda_n\)
Variational form of extreme eigenvalues for a symmetric positive definite matrix \(\matA\): \[ \begin{align} \lambda_1 &= \max_{\vecx\in\bbR^n:\norm[2]{\vecx}=1}\vecx^\intercal \matA\vecx = \max_{\vecx\in\bbR^n\setminus\set{\boldsymbol{0}}}\frac{\vecx^\intercal \matA\vecx}{\norm[2]{\vecx}^2}\\ \lambda_n &= \min_{\vecx\in\bbR^n:\norm[2]{\vecx}=1}\vecx^\intercal \matA\vecx = \min_{\vecx\in\bbR^n\setminus\set{\boldsymbol{0}}}\frac{\vecx^\intercal \matA\vecx}{\norm[2]{\vecx}^2} \end{align} \]
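The variational characterization can be checked numerically by sampling random Rayleigh quotients; a minimal sketch with an arbitrary positive definite matrix (all names hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((5, 5))
A = B @ B.T + np.eye(5)                  # symmetric positive definite

lam = np.linalg.eigvalsh(A)              # ascending, so lam[-1] = lambda_1, lam[0] = lambda_n
X = rng.standard_normal((10000, 5))      # random nonzero directions
rayleigh = np.einsum('ij,jk,ik->i', X, A, X) / np.einsum('ij,ij->i', X, X)

print(lam[-1], rayleigh.max())           # the quotient never exceeds lambda_1
print(lam[0], rayleigh.min())            # ... and never drops below lambda_n
```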
For any analytic function \(f\) and symmetric \(\matA=\sum_{i=1}^n\lambda_i\vecv_i\vecv_i^\intercal\), we have \[ f(\matA) = \sum_{i=1}^n f(\lambda_i)\vecv_i\vecv_i^\intercal \]
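A sanity check for \(f(t)=t^3\), where the formula must reproduce \(\matA^3\) (an illustrative sketch, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((4, 4))
A = (B + B.T) / 2                        # symmetric test matrix

lam, V = np.linalg.eigh(A)
f_A = (V * lam**3) @ V.T                 # sum_i f(lambda_i) v_i v_i^T with f(t) = t^3

print(np.allclose(f_A, A @ A @ A))       # True
```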
Consider the system \(\vecy=\matA\vecx\) with \(\matA\) symmetric positive definite
Let \(\set{\vecv_i}\) be the eigenvectors of \(\matA\). \[ \vecx = \sum_{i=1}^n\frac{1}{\lambda_i}\dotp{\vecy}{\vecv_i}\vecv_i \]
Assume some observation error, \(\vecy=\matA\vecx+\vece\) with \(\vece\) unknown, and reconstruct \(\vecx\) as \(\widetilde{\vecx}=\matA^{-1}\vecy\). Then \[ \widetilde{\vecx} = \vecx + \sum_{i=1}^n\frac{1}{\lambda_i}\dotp{\vece}{\vecv_i}\vecv_i, \] so the component of the error along \(\vecv_n\) is amplified by \(1/\lambda_n\): a small \(\lambda_n\) makes the reconstruction unstable, as the sketch below illustrates.
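A small illustrative sketch (matrix and noise chosen arbitrarily): when \(\lambda_n\) is tiny, a small error \(\vece\) produces a much larger reconstruction error.

```python
import numpy as np

rng = np.random.default_rng(3)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # random orthobasis
lam = np.array([10.0, 1.0, 1e-4])                  # lambda_1 >> lambda_n > 0
A = (Q * lam) @ Q.T                                # SPD matrix with this spectrum

x = np.array([1.0, 2.0, 3.0])
e = 1e-3 * rng.standard_normal(3)                  # small observation error
x_tilde = np.linalg.solve(A, A @ x + e)            # apply A^{-1} to y = Ax + e

print(np.linalg.norm(e))                           # ~1e-3
print(np.linalg.norm(x_tilde - x))                 # up to ~|<e, v_n>| / lambda_n, much larger
```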
What happens for non-square matrices?
Let \(\matA\in\bbR^{m\times n}\) with \(\text{rank}(\matA)=r\). Then \(\matA=\matU\boldsymbol{\Sigma}\matV^T\), where \(\matU\in\bbR^{m\times r}\) has orthonormal columns, \(\boldsymbol{\Sigma}\in\bbR^{r\times r}\) is diagonal with entries \(\sigma_1\geq\cdots\geq\sigma_r>0\), and \(\matV\in\bbR^{n\times r}\) has orthonormal columns.
We say that \(\matA\) is full rank if \(r=\min(m,n)\)
We can write \(\matA=\sum_{i=1}^r\sigma_i\vecu_i\vecv_i^\intercal\)
The columns \(\set{\vecv_i}_{i=1}^r\) of \(\matV\) are eigenvectors of the psd matrix \(\matA^\intercal\matA\); the singular values \(\set{\sigma_i}_{i=1}^r\) are the square roots of its non-zero eigenvalues.
The columns \(\set{\vecu_i}_{i=1}^r\) of \(\matU\) are eigenvectors of the psd matrix \(\matA\matA^\intercal\); the same \(\set{\sigma_i}_{i=1}^r\) are the square roots of its non-zero eigenvalues (both facts are checked in the sketch after this list).
The columns of \(\matV\) form an orthobasis for \(\text{row}(\matA)\)
The columns of \(\matU\) form an orthobasis for \(\text{col}(\matA)\)
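These properties can be verified numerically; a minimal sketch assuming NumPy, with an arbitrary rank-3 matrix:

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((6, 3)) @ rng.standard_normal((3, 4))  # 6x4 with rank 3

U, s, Vt = np.linalg.svd(A, full_matrices=False)
r = int(np.sum(s > 1e-10))                                     # numerical rank

Ur, sr, Vr = U[:, :r], s[:r], Vt[:r, :].T                      # compact SVD factors
print(np.allclose(A, (Ur * sr) @ Vr.T))                        # A = sum_i sigma_i u_i v_i^T
print(np.allclose(sr**2,                                       # sigma_i^2 = nonzero eigenvalues of A^T A
                  np.linalg.eigvalsh(A.T @ A)[::-1][:r]))
```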
Equivalent form of the SVD (the full SVD): \(\matA=\widetilde{\matU}\widetilde{\boldsymbol{\Sigma}}\widetilde{\matV}^T\), where \(\widetilde{\matU}\in\bbR^{m\times m}\) and \(\widetilde{\matV}\in\bbR^{n\times n}\) are orthogonal, and \(\widetilde{\boldsymbol{\Sigma}}\in\bbR^{m\times n}\) contains \(\boldsymbol{\Sigma}\) in its upper-left \(r\times r\) block and zeros elsewhere.
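In NumPy the two forms correspond to the `full_matrices` flag of `np.linalg.svd`; a small illustrative sketch:

```python
import numpy as np

A = np.random.default_rng(6).standard_normal((5, 3))

U_full, s, Vt_full = np.linalg.svd(A, full_matrices=True)    # U: 5x5, Vt: 3x3
Sigma_full = np.zeros((5, 3))                                # Sigma padded with zeros
np.fill_diagonal(Sigma_full, s)

print(np.allclose(A, U_full @ Sigma_full @ Vt_full))         # True: full SVD
print(np.allclose(U_full.T @ U_full, np.eye(5)))             # True: U is orthogonal
```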
When we cannot solve \(\vecy=\matA\vecx\) exactly, we solve instead \[ \min_{\vecx\in\bbR^n}\norm[2]{\vecx}^2\text{ such that } \matA^\intercal\matA\vecx = \matA^\intercal\vecy, \] i.e., we pick the minimum-norm solution of the normal equations.
Recall: when \(\matA\in\bbR^{m\times n}\) has rank \(m\) (so \(\matA\matA^\intercal\) is invertible), the minimum-norm solution is \(\vecx=\matA^\intercal(\matA\matA^\intercal)^{-1}\vecy\)
\(\matA^+ = \matV\boldsymbol{\Sigma}^{-1}\matU^\intercal\) is called the pseudo-inverse, Lanczos inverse, or Moore-Penrose inverse of \(\matA=\matU\boldsymbol{\Sigma}\matV^T\).
If \(\matA\) is square and invertible, then \(\matA^+=\matA^{-1}\)
If \(m\geq n\) (tall and skinny matrix) and \(\matA\) has rank \(n\), then \(\matA^+ = (\matA^\intercal\matA)^{-1}\matA^\intercal\)
If \(m\leq n\) (short and fat matrix) and \(\matA\) has rank \(m\), then \(\matA^+ = \matA^\intercal(\matA\matA^\intercal)^{-1}\)
Note \(\matA^+\) is as “close” to an inverse of \(\matA\) as possible
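A closing sanity check (illustrative, with arbitrary matrices): building \(\matA^+\) from the SVD and comparing it against the closed forms above and `np.linalg.pinv`.

```python
import numpy as np

rng = np.random.default_rng(7)
A_tall = rng.standard_normal((5, 3))        # m > n, rank 3
A_fat = rng.standard_normal((3, 5))         # m < n, rank 3

U, s, Vt = np.linalg.svd(A_tall, full_matrices=False)
A_plus = (Vt.T / s) @ U.T                   # V Sigma^{-1} U^T

print(np.allclose(A_plus, np.linalg.pinv(A_tall)))                        # matches pinv
print(np.allclose(A_plus, np.linalg.inv(A_tall.T @ A_tall) @ A_tall.T))   # tall, rank n
print(np.allclose(np.linalg.pinv(A_fat),
                  A_fat.T @ np.linalg.inv(A_fat @ A_fat.T)))              # fat, rank m
```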