The Mathematics of ECE

Linear Algebra - Special Matrices, SVD, Least Squares

Wednesday, September 7, 2022

Logistics

  • Housekeeping
  • Today’s class: Linear Algebra
    • Key concepts: special/structured matrices, SVD, least squares
    • Key results:
      • structured matrices often have desirable properties
      • the SVD is a powerful tool for analysis of matrices
      • the ubiquitous least squares problem has an intimate relationship to the SVD.
  • Next workshop: Friday, September 9, 2022

Special/Structured Matrices

  • A square matrix \(Q\) is called orthogonal if its columns are an orthonormal set, i.e. \(Q^T Q = I\).

  • Drill: Prove that every orthogonal matrix also has orthonormal rows.
    • Hint: what is one way to interpret the statement \(Q^T Q = I\)?
  • Drill: Prove that every orthogonal matrix has determinant \(\pm 1\).
    • Hint: Recall two properties of the determinant are \(\text{det}(A B) = \text{det}(A) \text{det}(B)\) and \(\text{det}(A^T) = \text{det}(A)\).
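
  • A quick numerical sanity check of these properties (a minimal sketch; the orthogonal matrix here is built as the Q factor of a QR factorization of an arbitrary random matrix):

```python
import numpy as np

# Build a random orthogonal matrix as the Q factor of a QR factorization
# (one common construction; the test matrix is arbitrary).
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))

assert np.allclose(Q.T @ Q, np.eye(4))          # columns are orthonormal
assert np.allclose(Q @ Q.T, np.eye(4))          # rows are orthonormal too
assert np.isclose(abs(np.linalg.det(Q)), 1.0)   # determinant is +/- 1
```
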
  • A square matrix \(S\) is called symmetric if it is equal to its transpose, i.e. \(S^T = S\).

  • Properties:
    • The set of symmetric matrices is a vector subspace of the set of all real square matrices.
    • For invertible \(A\), the inverse \(A^{-1}\) is symmetric if and only if \(A\) is symmetric.
    • All symmetric matrices are diagonalizable. Further, for any symmetric matrix \(S\) there exists an orthonormal set of eigenvectors such that \(S\) has eigendecomposition \(S = V \Lambda V^T\), where \(V\) is an orthogonal matrix; all eigenvalues are guaranteed to be real, hence \(\Lambda\) is a real diagonal matrix.
      • (These are results of the spectral theorem, which really deserves its own lecture…)
  • Drill: Prove that for a symmetric matrix \(S\) and any positive integer \(n\), the matrix \(S^n\) is symmetric.
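
  • The spectral decomposition can be verified numerically; a minimal sketch using NumPy's `eigh`, with an arbitrary symmetrized random matrix as the test case:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
S = (A + A.T) / 2                        # symmetrize an arbitrary matrix

# eigh is specialized to symmetric matrices: real eigenvalues,
# orthonormal eigenvectors
eigvals, V = np.linalg.eigh(S)
Lam = np.diag(eigvals)

assert np.allclose(V.T @ V, np.eye(4))   # V is orthogonal
assert np.allclose(V @ Lam @ V.T, S)     # S = V Lambda V^T
S3 = np.linalg.matrix_power(S, 3)
assert np.allclose(S3, S3.T)             # powers of S stay symmetric
```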

Singular Value Decomposition (SVD)

  • Any matrix \(A \in \bbR^{m \times n}\) may be described by two orthogonal matrices \(U \in \bbR^{m \times m}, V \in \bbR^{n \times n}\) and a rectangular diagonal matrix \(\Sigma \in \bbR^{m \times n}\) with non-negative entries as \[ A = U \Sigma V^T \]

    [Figure: visualization of the SVD for a 2×2 matrix (reproduced with permission from Wikipedia)]
  • Intuition: The SVD is a powerful tool to describe the action of any matrix in terms of an orthonormal change of basis (\(V^T\)), non-negative component-wise scaling (\(\Sigma\)), and an orthonormal basis reconstruction (\(U\)).

  • Drill: Let \(A\) have SVD \(A = U \Sigma V^T\). Prove that \(A^T A\) and \(A A^T\) share all of their nonzero eigenvalues, and further show that these eigenvalues are exactly the squares of the nonzero diagonal elements of \(\Sigma\).
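
  • A minimal numerical illustration of the decomposition and the drill above (the test matrix is an arbitrary random one; NumPy's `svd` returns \(V^T\) directly):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 3))

# Full SVD: U is 5x5, Vt is V^T (3x3), s holds the singular values
U, s, Vt = np.linalg.svd(A)
Sigma = np.zeros((5, 3))
Sigma[:3, :3] = np.diag(s)
assert np.allclose(U @ Sigma @ Vt, A)

# Eigenvalues of A^T A (sorted descending) match the squared singular values
eig = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]
assert np.allclose(eig, s**2)
```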

Least Squares

  • Given a matrix \(A \in \bbR^{m \times n}\) and a vector \(b \in \bbR^m\), \[ \underset{x \in \bbR^n}{\text{minimize}} || A x - b ||_2^2 \]

  • Practical Examples:
    • To find the line of best fit over a series of points \(\{(t_i, y_i)\}_{i=1}^m\), \[ A = \begin{bmatrix} 1 & t_1 \\ 1 & t_2 \\ \vdots & \vdots \\ 1 & t_m \end{bmatrix}, \quad b = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{bmatrix} \]
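
  • As a concrete sketch with hypothetical data (`np.linalg.lstsq` minimizes \(|| A x - b ||_2^2\) directly):

```python
import numpy as np

# Hypothetical noisy samples from the line y = 2t + 1
rng = np.random.default_rng(3)
t = np.linspace(0, 1, 20)
y = 2 * t + 1 + 0.05 * rng.standard_normal(20)

# A has a column of ones (intercept) and a column of t values (slope)
A = np.column_stack([np.ones_like(t), t])
x_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
print(x_hat)  # approximately [1, 2]
```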

Least Squares

  • Practical Examples (cont.):
    • To find the \((n-1)\)th order polynomial of best fit over a series of points \(\{(t_i, y_i)\}_{i=1}^m\), \[ A = \begin{bmatrix} 1 & t_1 & t_1^2 & \dots & t_1^{n-1} \\ 1 & t_2 & t_2^2 & \dots & t_2^{n-1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & t_m & t_m^2 & \dots & t_m^{n-1} \end{bmatrix}, \quad b = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{bmatrix} \]
    • To find the best-fit Fourier series over a series of points \(\{(t_i, y_i)\}_{i=1}^m\), \[ A = \begin{bmatrix} 1 & \omega^{t_1} & \omega^{2 t_1} & \dots & \omega^{(n-1) t_1} \\ 1 & \omega^{t_2} & \omega^{2 t_2} & \dots & \omega^{(n-1) t_2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & \omega^{t_m} & \omega^{2 t_m} & \dots & \omega^{(n-1) t_m} \end{bmatrix}, \quad b = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{bmatrix} \] where \(\omega = e^{j \frac{2 \pi}{n}}\)
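
  • Both design matrices above are Vandermonde-type and take only a few lines to build; a sketch with hypothetical data (note the Fourier matrix is complex, so the least squares problem is solved over the complex numbers):

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 50, 4
t = np.linspace(0, n - 1, m)
y = rng.standard_normal(m)     # hypothetical data

# Polynomial fit: columns 1, t, t^2, ..., t^(n-1)
A_poly = np.vander(t, N=n, increasing=True)
coef_poly, *_ = np.linalg.lstsq(A_poly, y, rcond=None)

# Fourier fit: A[i, k] = omega^(k * t_i), omega = exp(j 2 pi / n)
omega = np.exp(1j * 2 * np.pi / n)
A_four = omega ** (t[:, None] * np.arange(n)[None, :])
coef_four, *_ = np.linalg.lstsq(A_four, y.astype(complex), rcond=None)
```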

Least Squares

  • Given a matrix \(A \in \bbR^{m \times n}\) and a vector \(b \in \bbR^m\), \[ \underset{x \in \bbR^n}{\text{minimize}} || A x - b ||_2^2 \]

  • Drill: What is the minimum value \(|| A x - b ||_2^2\) can take over all possible \(x\)?
  • For the following drills, assume \(\text{rank}(A) = \text{min}(m, n)\)
    • Drill: If \(m = n\), how many solutions to the least squares problem are there? What is a closed form expression for these solutions?
    • Drill: If \(m < n\), how many solutions to the least squares problem are there? What is a closed form expression for these solutions?
      • Hint: Recall that every set of vectors has a linearly independent subset which have the same span.
      • Hint: Recall that every linearly independent set of vectors in \(\bbR^m\) which span the entirety of \(\bbR^m\) contains exactly \(m\) vectors.
    • Drill: If \(m > n\), how many solutions to the least squares problem are there? What is a closed form expression for these solutions?
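
  • A numerical companion to these drills (a sketch; `np.linalg.pinv` returns the minimum-norm solution in the underdetermined case, and adding any null-space vector of \(A\) gives another exact solution):

```python
import numpy as np

rng = np.random.default_rng(5)

# m > n with full column rank: a single unique minimizer
A = rng.standard_normal((6, 3))
b = rng.standard_normal(6)
x_hat, *_ = np.linalg.lstsq(A, b, rcond=None)

# m < n with full row rank: infinitely many exact solutions
A = rng.standard_normal((3, 6))
b = rng.standard_normal(3)
x0 = np.linalg.pinv(A) @ b            # the minimum-norm solution
_, _, Vt = np.linalg.svd(A)
z = Vt[-1]                            # a null-space direction: A z = 0
assert np.allclose(A @ x0, b)
assert np.allclose(A @ (x0 + z), b)   # another exact solution
```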

Least Squares

  • Luckily, the least squares problem is convex, so we can solve it by setting the gradient to zero: \(\nabla_x || A x - b ||_2^2 = 2 A^T (A x - b) = 0\) gives the normal equations \(A^T A \hat{x} = A^T b\), i.e. \(\hat{x} = (A^T A)^{-1} A^T b\) when \(A^T A\) is invertible
    • Issue: when \(m < n\), \(A^T A\) is rank-deficient and cannot be inverted…
  • The pseudoinverse of \(A\) is defined via the compact SVD \(A = U \Sigma V^T\), in which \(\Sigma\) is square and invertible (it holds only the nonzero singular values), as \(A^+ = V \Sigma^{-1} U^T\).

  • Drill: Prove directly by substitution that \(A A^+\) is a projection matrix onto the column space of \(A\).
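
  • The drill can be checked numerically; the sketch below forms \(A^+\) from the compact SVD of an arbitrary random full-rank matrix and verifies the defining properties of an orthogonal projection:

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((5, 3))

# Compact SVD: U is 5x3, Sigma is 3x3 and invertible (full-rank A)
U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_pinv = Vt.T @ np.diag(1 / s) @ U.T
assert np.allclose(A_pinv, np.linalg.pinv(A))

P = A @ A_pinv
assert np.allclose(P @ P, P)    # idempotent
assert np.allclose(P @ A, A)    # identity on the column space of A
assert np.allclose(P.T, P)      # symmetric: an orthogonal projection
```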

Problems

  • Work together to solve these problems!
  1. Prove \(|| Q x ||_2^2 = || x ||_2^2\) for any orthogonal matrix \(Q \in \bbR^{n \times n}\) and arbitrary vector \(x \in \bbR^n\). In other words, multiplying by an orthogonal matrix does not change the norm of a vector.
  2. Prove that the transpose of an orthogonal matrix is still an orthogonal matrix.
  3. Let \(\calB = \{x \in \bbR^n \text{ such that } ||x||_2^2 = 1\}\). Prove that any orthogonal matrix \(Q \in \bbR^{n \times n}\) maps \(\calB\) bijectively onto itself, i.e. \(\{Q x : x \in \calB\} = \calB\).
  4. Let \(\Sigma \in \bbR^{n \times n}\) be a non-negative diagonal matrix with entries in descending order, i.e. \(\sigma_1 \geq \sigma_2 \geq \dots \geq \sigma_n\). Prove that \(\sigma_1^2 = \underset{y \in \calB}{\text{max}} || \Sigma y ||_2^2\).
    • Hint: assume that \(y \neq \pm e_1\). In this case, there exists some index \(i^\star > 1\) such that \(y_{i^\star}^2 > 0\). Another feasible point in \(\calB\) is \(y'\), whose entries match those of \(y\) except that \(y'_1 = \sqrt{y_1^2 + y_{i^\star}^2}\) and \(y'_{i^\star} = 0\). Which point has the larger objective value?
  5. The operator norm of a matrix, \(|| A ||\), is defined as \[ || A ||^2 = \underset{x \in \calB}{\text{max}} || A x ||_2^2. \] Using the previous results, express the operator norm in terms of the singular values of \(A\).
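
  • A numerical check of the answer to Problem 5 (a sketch with an arbitrary random matrix; `np.linalg.norm(A, 2)` computes the operator norm directly):

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((4, 6))

s = np.linalg.svd(A, compute_uv=False)         # singular values, descending
assert np.isclose(np.linalg.norm(A, 2), s[0])  # operator norm = sigma_1
```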