Mathematical Foundations of Machine Learning

Prof. Matthieu Bloch

Monday, October 21, 2024 (v1.0)

Last time

  • Last class: Wednesday October 16, 2024
    • We talked about the existence of eigenvalues and eigenvectors
    • We are making our way towards the SVD
  • Today: We will talk about the spectral theorem
  • To be effectively prepared for today's class, you should have:
    1. Gone over slides and read associated lecture notes here
    2. Submitted Homework 4
  • Logistics: use office hours to prepare for Homework 5
    • Jack Hill office hours: Wednesday 11:30am-12:30pm in TSRB and hybrid
    • Anuvab Sen office hours: Thursday 12pm-1pm in TSRB and hybrid
    • Dr. Bloch: Friday October 25, 2024 6pm
  • Homework 5: due Tuesday October 22, 2024
    • Don't start accumulating delays; Homework 6 is coming

Spectral theorem

Every complex matrix \(\matA\) has at least one complex eigenvector, and every real symmetric matrix has real eigenvalues and at least one real eigenvector.

Every matrix \(\matA\in\bbC^{n\times n}\) is unitarily similar to an upper triangular matrix (Schur decomposition), i.e., \[ \matA = \matV\boldsymbol{\Delta}\matV^\dagger \] with \(\boldsymbol{\Delta}\) upper triangular and \(\matV^\dagger=\matV^{-1}\).
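A minimal numerical sketch of the Schur decomposition, assuming NumPy and SciPy are available (the test matrix and seed are arbitrary):

```python
# Verify the Schur decomposition A = V Δ V† with Δ upper triangular
# and V unitary (scipy.linalg.schur computes it directly).
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

# output='complex' guarantees a truly upper-triangular Δ; the real
# Schur form is only quasi-triangular.
Delta, V = schur(A, output='complex')

assert np.allclose(V @ Delta @ V.conj().T, A)   # A = V Δ V†
assert np.allclose(V.conj().T @ V, np.eye(4))   # V† = V^{-1}
assert np.allclose(Delta, np.triu(Delta))       # Δ upper triangular
```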

Every Hermitian matrix is unitarily similar to a real-valued diagonal matrix, i.e., \(\matA = \matV\matD\matV^\dagger\) with \(\matD\) real diagonal and \(\matV\) unitary.

  • Note that if \(\matA = \matV\matD\matV^\dagger\) then \(\matA = \sum_{i=1}^n\lambda_i \vecv_i\vecv_i^\dagger\), where the \(\lambda_i\) are the diagonal entries of \(\matD\) and the \(\vecv_i\) are the columns of \(\matV\) (see the sketch after this list)
  • What about real-valued matrices \(\matA\in\bbR^{n\times n}\)? If \(\matA\) is symmetric, \(\matV\) can be taken real orthogonal, so \(\matA=\matV\matD\matV^\intercal\)
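A small sketch of the rank-one expansion above, assuming NumPy (the Hermitian test matrix is generated arbitrarily):

```python
# Spectral theorem check: a Hermitian A has real eigenvalues and
# satisfies A = Σ_i λ_i v_i v_i† with orthonormal eigenvectors v_i.
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = (B + B.conj().T) / 2                    # force A to be Hermitian

lam, V = np.linalg.eigh(A)                  # lam real, V unitary

# Rebuild A from its rank-one spectral components λ_i v_i v_i†.
A_rec = sum(lam[i] * np.outer(V[:, i], V[:, i].conj()) for i in range(4))
assert np.allclose(A_rec, A)
```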

Symmetric positive definite matrices

A symmetric matrix \(\matA\) is positive definite if it has positive eigenvalues, i.e., \(\forall i\in\set{1,\cdots,n}\quad\lambda_i>0\).

A symmetric matrix \(\matA\) is positive semidefinite if it has nonnegative eigenvalues, i.e., \(\forall i\in\set{1,\cdots,n}\quad\lambda_i\geq 0\).
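These definitions can be tested directly from the spectrum; a quick sketch assuming NumPy, where \(\matA=\matM\matM^\intercal\) is positive semidefinite by construction:

```python
# Check (semi)definiteness of a symmetric matrix via its eigenvalues.
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((4, 3))
A = M @ M.T                        # symmetric PSD by construction, rank <= 3

lam = np.linalg.eigvalsh(A)        # real eigenvalues, ascending order
print(np.all(lam >= -1e-12))       # True: positive semidefinite
print(np.all(lam > 1e-12))         # False: rank deficient, so not definite
```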

  • Convention: \(\lambda_1\geq \lambda_2\geq \cdots \geq \lambda_n\)
  • Variational form of the extreme eigenvalues of a symmetric positive definite matrix \(\matA\) (a numerical check follows this list) \[ \lambda_1 = \max_{\vecx\in\bbR^n:\norm[2]{\vecx}=1}\vecx^\intercal \matA\vecx = \max_{\vecx\in\bbR^n\setminus\set{\boldsymbol{0}}}\frac{\vecx^\intercal \matA\vecx}{\norm[2]{\vecx}^2}\qquad \lambda_n = \min_{\vecx\in\bbR^n:\norm[2]{\vecx}=1}\vecx^\intercal \matA\vecx = \min_{\vecx\in\bbR^n\setminus\set{\boldsymbol{0}}}\frac{\vecx^\intercal \matA\vecx}{\norm[2]{\vecx}^2} \]
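A quick numerical check of the variational form, assuming NumPy (the SPD matrix is arbitrary): random unit vectors never push the Rayleigh quotient past the extreme eigenvalues.

```python
# The Rayleigh quotient x^T A x / ||x||^2 stays within [λ_n, λ_1].
import numpy as np

rng = np.random.default_rng(3)
M = rng.standard_normal((5, 5))
A = M @ M.T + 5 * np.eye(5)                 # symmetric positive definite

lam = np.linalg.eigvalsh(A)                 # ascending: lam[0]=λ_n, lam[-1]=λ_1
rayleigh = lambda x: (x @ A @ x) / (x @ x)

samples = [rayleigh(rng.standard_normal(5)) for _ in range(10_000)]
assert lam[0] <= min(samples) <= max(samples) <= lam[-1]
```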

For any analytic function \(f\) and any symmetric matrix \(\matA=\sum_{i=1}^n\lambda_i\vecv_i\vecv_i^\intercal\), we have \[ f(\matA) = \sum_{i=1}^n f(\lambda_i)\vecv_i\vecv_i^\intercal \]
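For instance, with \(f=\exp\) the formula can be checked against a library matrix exponential; a sketch assuming NumPy and SciPy, with an arbitrary symmetric test matrix:

```python
# f(A) = Σ_i f(λ_i) v_i v_i^T for f = exp, checked against expm.
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(4)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2                          # symmetric

lam, V = np.linalg.eigh(A)
expA = V @ np.diag(np.exp(lam)) @ V.T      # apply exp to the eigenvalues

assert np.allclose(expA, expm(A))
```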

Systems of symmetric positive definite equations

  • Consider the system \(\vecy=\matA\vecx\) with \(\matA\) symmetric positive definite

Let \(\set{\vecv_i}\) be the orthonormal eigenvectors of \(\matA\) with eigenvalues \(\set{\lambda_i}\). Then \[ \vecx = \sum_{i=1}^n\frac{1}{\lambda_i}\dotp{\vecy}{\vecv_i}\vecv_i \]
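A small numerical sketch of this formula, assuming NumPy (the SPD matrix and right-hand side are arbitrary), checked against a direct solver:

```python
# Solve y = A x via the eigendecomposition of an SPD matrix A:
# x = Σ_i (1/λ_i) <y, v_i> v_i.
import numpy as np

rng = np.random.default_rng(5)
M = rng.standard_normal((4, 4))
A = M @ M.T + np.eye(4)                    # symmetric positive definite
y = rng.standard_normal(4)

lam, V = np.linalg.eigh(A)
x = sum((y @ V[:, i]) / lam[i] * V[:, i] for i in range(4))

assert np.allclose(A @ x, y)
assert np.allclose(x, np.linalg.solve(A, y))
```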

  • Assume that there exists some observation error, so that \(\vecy=\matA\vecx+\vece\)
    • \(\vece\) is unknown
    • we try to reconstruct \(\vecx\) as \(\widetilde{\vecx}\) by applying \(\matA^{-1}\) to \(\vecy\), i.e., \(\widetilde{\vecx}=\matA^{-1}\vecy\)

\[ \frac{1}{\lambda_1^2}\norm[2]{\vece}^2\leq \norm[2]{\vecx-\widetilde{\vecx}}^2\leq \frac{1}{\lambda_n^2}\norm[2]{\vece}^2. \]
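The bound can be probed numerically; a sketch assuming NumPy, with an arbitrary SPD matrix and noise vector:

```python
# Check ||e||^2/λ_1^2 <= ||x - x~||^2 <= ||e||^2/λ_n^2 for x~ = A^{-1} y,
# where y = A x + e.
import numpy as np

rng = np.random.default_rng(6)
M = rng.standard_normal((4, 4))
A = M @ M.T + np.eye(4)                    # symmetric positive definite
x = rng.standard_normal(4)
e = 0.01 * rng.standard_normal(4)          # unknown observation error

x_tilde = np.linalg.solve(A, A @ x + e)    # reconstruct from noisy y
err2 = np.sum((x - x_tilde) ** 2)

lam = np.linalg.eigvalsh(A)                # ascending: lam[0]=λ_n, lam[-1]=λ_1
e2 = np.sum(e ** 2)
assert e2 / lam[-1] ** 2 <= err2 <= e2 / lam[0] ** 2
```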

  • What about the average case?

Next time

  • Next class: Wednesday October 23, 2024
  • To be effectively prepared for next class, you should:
    1. Go over today's slides and read associated lecture notes here
    2. Work on Homework 5
  • Optional
    • Export slides for next lecture as PDF (be on the lookout for an announcement when they're ready)