The Mathematics of ECE

Linear Algebra - Special Matrices, SVD, Least Squares

Wednesday, September 7, 2022

Logistics

  • Housekeeping
  • Today’s class: Linear Algebra
    • Key concepts: special/structured matrices, SVD, least squares
    • Key results:
      • structured matrices often have desirable properties
      • the SVD is a powerful tool for analysis of matrices
      • the ubiquitous least squares problem has an intimate relationship to the SVD.
  • Next workshop: Friday, September 9, 2022

Special/Structured Matrices

  • A square matrix \(Q\) is called orthogonal if its columns are an orthonormal set, i.e. \(Q^T Q = I\).

  • Drill: Prove that every orthogonal matrix also has orthonormal rows.
    • Hint: what is one way to interpret the statement \(Q^T Q = I\)?
  • Drill: Prove that every orthogonal matrix has determinant \(\pm 1\).
    • Hint: Recall two properties of the determinant are \(\text{det}(A B) = \text{det}(A) \text{det}(B)\) and \(\text{det}(A^T) = \text{det}(A)\).
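
  • A quick numerical sanity check of these properties (a minimal sketch; the orthogonal matrix here is built as the Q factor of a QR factorization of an arbitrary random matrix):

```python
import numpy as np

# Build a random orthogonal matrix as the Q factor of a QR factorization
# (one common construction; the test matrix is arbitrary).
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))

assert np.allclose(Q.T @ Q, np.eye(4))          # columns are orthonormal
assert np.allclose(Q @ Q.T, np.eye(4))          # rows are orthonormal too
assert np.isclose(abs(np.linalg.det(Q)), 1.0)   # determinant is +/- 1
```
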
  • A square matrix \(S\) is called symmetric if it is equal to its transpose, i.e. \(S^T = S\).

  • Properties:
    • The set of symmetric matrices is a vector subspace of the set of all real square matrices.
    • For invertible \(A\), the inverse \(A^{-1}\) is symmetric if and only if \(A\) is symmetric.
    • All symmetric matrices are diagonalizable. Further, for any symmetric matrix \(S\) there exists an orthonormal set of eigenvectors such that \(S\) has eigendecomposition \(S = V \Lambda V^T\), where \(V\) is an orthogonal matrix; all eigenvalues are guaranteed to be real, hence \(\Lambda\) is a real diagonal matrix.
      • (These are results of the spectral theorem, which really deserves its own lecture…)
  • Drill: Prove that for a symmetric matrix \(S\) and any positive integer \(n\), the matrix \(S^n\) is symmetric.
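
  • The spectral decomposition can be verified numerically; a minimal sketch using NumPy's `eigh`, with an arbitrary symmetrized random matrix as the test case:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
S = (A + A.T) / 2                        # symmetrize an arbitrary matrix

# eigh is specialized to symmetric matrices: real eigenvalues,
# orthonormal eigenvectors
eigvals, V = np.linalg.eigh(S)
Lam = np.diag(eigvals)

assert np.allclose(V.T @ V, np.eye(4))   # V is orthogonal
assert np.allclose(V @ Lam @ V.T, S)     # S = V Lambda V^T
S3 = np.linalg.matrix_power(S, 3)
assert np.allclose(S3, S3.T)             # powers of S stay symmetric
```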

Singular Value Decomposition (SVD)

  • Any matrix \(A \in \bbR^{m \times n}\) may be described by two orthogonal matrices \(U \in \bbR^{m \times m}, V \in \bbR^{n \times n}\) and a rectangular diagonal matrix \(\Sigma \in \bbR^{m \times n}\) with non-negative entries as \[ A = U \Sigma V^T \]

    [Figure: visualization of the SVD for a 2×2 matrix (reproduced with permission from Wikipedia)]
  • Intuition: The SVD is a powerful tool to describe the action of any matrix in terms of an orthonormal change of basis (\(V^T\)), non-negative component-wise scaling (\(\Sigma\)), and an orthonormal basis reconstruction (\(U\)).

  • Drill: Let \(A\) have SVD \(A = U \Sigma V^T\). Prove that \(A^T A\) and \(A A^T\) share all of their nonzero eigenvalues, and further show that these eigenvalues are exactly the squares of the nonzero diagonal elements of \(\Sigma\).
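
  • A minimal numerical illustration of the decomposition and the drill above (the test matrix is an arbitrary random one; NumPy's `svd` returns \(V^T\) directly):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 3))

# Full SVD: U is 5x5, Vt is V^T (3x3), s holds the singular values
U, s, Vt = np.linalg.svd(A)
Sigma = np.zeros((5, 3))
Sigma[:3, :3] = np.diag(s)
assert np.allclose(U @ Sigma @ Vt, A)

# Eigenvalues of A^T A (sorted descending) match the squared singular values
eig = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]
assert np.allclose(eig, s**2)
```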

Least Squares

  • Given a matrix \(A \in \bbR^{m \times n}\) and a vector \(b \in \bbR^m\), \[ \underset{x \in \bbR^n}{\text{minimize}} || A x - b ||_2^2 \]

  • Practical Examples:
    • To find the line of best fit over a series of points \(\{(t_i, y_i)\}_{i=1}^m\), \[ A = \begin{bmatrix} 1 & t_1 \\ 1 & t_2 \\ \vdots & \vdots \\ 1 & t_m \end{bmatrix}, \quad b = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{bmatrix} \]
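
  • As a concrete sketch with hypothetical data (`np.linalg.lstsq` minimizes \(|| A x - b ||_2^2\) directly):

```python
import numpy as np

# Hypothetical noisy samples from the line y = 2t + 1
rng = np.random.default_rng(3)
t = np.linspace(0, 1, 20)
y = 2 * t + 1 + 0.05 * rng.standard_normal(20)

# A has a column of ones (intercept) and a column of t values (slope)
A = np.column_stack([np.ones_like(t), t])
x_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
print(x_hat)  # approximately [1, 2]
```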

Least Squares

  • Practical Examples (cont.):
    • To find the \((n-1)\)th order polynomial of best fit over a series of points \(\{(t_i, y_i)\}_{i=1}^m\), \[ A = \begin{bmatrix} 1 & t_1 & t_1^2 & \dots & t_1^{n-1} \\ 1 & t_2 & t_2^2 & \dots & t_2^{n-1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & t_m & t_m^2 & \dots & t_m^{n-1} \end{bmatrix}, \quad b = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{bmatrix} \]
    • To find the best-fit Fourier series over a series of points \(\{(t_i, y_i)\}_{i=1}^m\), \[ A = \begin{bmatrix} 1 & \omega^{t_1} & \omega^{2 t_1} & \dots & \omega^{(n-1) t_1} \\ 1 & \omega^{t_2} & \omega^{2 t_2} & \dots & \omega^{(n-1) t_2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & \omega^{t_m} & \omega^{2 t_m} & \dots & \omega^{(n-1) t_m} \end{bmatrix}, \quad b = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{bmatrix} \] where \(\omega = e^{j \frac{2 \pi}{n}}\)
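
  • Both design matrices above are Vandermonde-type and take only a few lines to build; a sketch with hypothetical data (note the Fourier matrix is complex, so the least squares problem is solved over the complex numbers):

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 50, 4
t = np.linspace(0, n - 1, m)
y = rng.standard_normal(m)     # hypothetical data

# Polynomial fit: columns 1, t, t^2, ..., t^(n-1)
A_poly = np.vander(t, N=n, increasing=True)
coef_poly, *_ = np.linalg.lstsq(A_poly, y, rcond=None)

# Fourier fit: A[i, k] = omega^(k * t_i), omega = exp(j 2 pi / n)
omega = np.exp(1j * 2 * np.pi / n)
A_four = omega ** (t[:, None] * np.arange(n)[None, :])
coef_four, *_ = np.linalg.lstsq(A_four, y.astype(complex), rcond=None)
```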

Least Squares

  • Given a matrix \(A \in \bbR^{m \times n}\) and a vector \(b \in \bbR^m\), \[ \underset{x \in \bbR^n}{\text{minimize}} || A x - b ||_2^2 \]

  • Drill: What is the minimum value \(|| A x - b ||_2^2\) can take over all possible \(x\)?
  • For the following drills, assume \(\text{rank}(A) = \text{min}(m, n)\)
    • Drill: If \(m = n\), how many solutions to the least squares problem are there? What is a closed form expression for these solutions?
    • Drill: If \(m < n\), how many solutions to the least squares problem are there? What is a closed form expression for these solutions?
      • Hint: Recall that every set of vectors has a linearly independent subset which have the same span.
      • Hint: Recall that every linearly independent set of vectors in \(\bbR^m\) which span the entirety of \(\bbR^m\) contains exactly \(m\) vectors.
    • Drill: If \(m > n\), how many solutions to the least squares problem are there? What is a closed form expression for these solutions?
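
  • A numerical companion to these drills (a sketch; `np.linalg.pinv` returns the minimum-norm solution in the underdetermined case, and adding any null-space vector of \(A\) gives another exact solution):

```python
import numpy as np

rng = np.random.default_rng(5)

# m > n with full column rank: a single unique minimizer
A = rng.standard_normal((6, 3))
b = rng.standard_normal(6)
x_hat, *_ = np.linalg.lstsq(A, b, rcond=None)

# m < n with full row rank: infinitely many exact solutions
A = rng.standard_normal((3, 6))
b = rng.standard_normal(3)
x0 = np.linalg.pinv(A) @ b            # the minimum-norm solution
_, _, Vt = np.linalg.svd(A)
z = Vt[-1]                            # a null-space direction: A z = 0
assert np.allclose(A @ x0, b)
assert np.allclose(A @ (x0 + z), b)   # another exact solution
```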

Least Squares

  • Luckily, the least squares problem is convex, so we can solve it by setting the gradient to zero: \(\nabla_x || A x - b ||_2^2 = 2 A^T (A x - b) = 0\) gives the normal equations \(A^T A \hat{x} = A^T b\), i.e. \(\hat{x} = (A^T A)^{-1} A^T b\) when \(A^T A\) is invertible
    • Issue: when \(m < n\), \(A^T A\) is rank-deficient and cannot be inverted…
  • The pseudoinverse of \(A\) is defined via the compact SVD \(A = U \Sigma V^T\), in which \(\Sigma\) is square and invertible (it holds only the nonzero singular values), as \(A^+ = V \Sigma^{-1} U^T\).

  • Drill: Prove directly by substitution that \(A A^+\) is a projection matrix onto the column space of \(A\).
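
  • The drill can be checked numerically; the sketch below forms \(A^+\) from the compact SVD of an arbitrary random full-rank matrix and verifies the defining properties of an orthogonal projection:

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((5, 3))

# Compact SVD: U is 5x3, Sigma is 3x3 and invertible (full-rank A)
U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_pinv = Vt.T @ np.diag(1 / s) @ U.T
assert np.allclose(A_pinv, np.linalg.pinv(A))

P = A @ A_pinv
assert np.allclose(P @ P, P)    # idempotent
assert np.allclose(P @ A, A)    # identity on the column space of A
assert np.allclose(P.T, P)      # symmetric: an orthogonal projection
```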

Problems

  • Work together to solve these problems!
  1. Prove \(|| Q x ||_2^2 = || x ||_2^2\) for any orthogonal matrix \(Q \in \bbR^{n \times n}\) and arbitrary vector \(x \in \bbR^n\). In other words, multiplying by an orthogonal matrix does not change the norm of a vector.
  2. Prove that the transpose of an orthogonal matrix is still an orthogonal matrix.
  3. Let \(\calB = \{x \in \bbR^n \text{ such that } ||x||_2^2 = 1\}\). Prove that any orthogonal matrix \(Q \in \bbR^{n \times n}\) maps \(\calB\) bijectively onto itself, i.e. \(\{Q x : x \in \calB\} = \calB\).
  4. Let \(\Sigma \in \bbR^{n \times n}\) be a non-negative diagonal matrix with entries in descending order, i.e. \(\sigma_1 \geq \sigma_2 \geq \dots \geq \sigma_n\). Prove that \(\sigma_1^2 = \underset{y \in \calB}{\text{max}} || \Sigma y ||_2^2\).
    • Hint: assume that \(y \neq \pm e_1\). In this case, there exists some index \(i^\star > 1\) such that \(y_{i^\star}^2 > 0\). Another feasible point in \(\calB\) is \(y'\), whose entries match those of \(y\) except that \(y'_1 = \sqrt{y_1^2 + y_{i^\star}^2}\) and \(y'_{i^\star} = 0\). Which point has the larger objective value?
  5. The operator norm of a matrix, \(|| A ||\), is defined as \[ || A ||^2 = \underset{x \in \calB}{\text{max}} || A x ||_2^2. \] Using the previous results, express the operator norm in terms of the singular values of \(A\).
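
  • A numerical check of the answer to Problem 5 (a sketch with an arbitrary random matrix; `np.linalg.norm(A, 2)` computes the operator norm directly):

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((4, 6))

s = np.linalg.svd(A, compute_uv=False)         # singular values, descending
assert np.isclose(np.linalg.norm(A, 2), s[0])  # operator norm = sigma_1
```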