Mathematical Foundations of Machine Learning
Prof. Matthieu Bloch
Monday, November 11, 2024
Last time
- Last class: Wednesday November 06, 2024
- We talked about using the SVD to solve \(\vecy=\matA\vecx\)
- We analyzed the reconstruction error
- Today: We will talk about mitigating the
reconstruction error
- To be effectively prepared for today's class, you should
have:
- Gone over slides
and read associated lecture notes here
and there
- Planned to submit Homework 6 (due Thursday November 07, 2024)
- Logistics:
- Jack Hill office hours: Wednesday 11:30am-12:30pm in TSRB
and hybrid
- Anuvab Sen office hours: Thursday 12pm-1pm in TSRB and
hybrid
- Dr. Bloch office hours: Friday November 08, 2024 6pm-7pm
online
- Homework 7: due Monday November 18, 2024
What's next for this semester
- Lecture 21 - Monday November 4, 2024: SVD and least squares
- Lecture 22 - Wednesday November 6, 2024: Gradient descent
- Homework 6 due on Thursday November 7, 2024
- Lecture 23 - Monday November 11, 2024: Estimation
- Lecture 24 - Wednesday November 13, 2024: Estimation
- Lecture 25 - Monday November 18, 2024: Classification and Regression
- Homework 7 due on Friday November 15, 2024
- Lecture 26 - Wednesday November 20, 2024: Classification and
Regression
- Lecture 27 - Monday November 25, 2024: Principal Component Analysis
- Lecture 28 - Monday December 2, 2024: Principal Component
Analysis
Stability of least squares
- What if we observe \(\vecy = \matA\vecx_0+\vece\) and we apply the pseudoinverse? \(\hat{\vecx} = \matA^+\vecy\)
- We can separate the error analysis into two components by substituting \(\vecy=\matA\vecx_0+\vece\): \[
\hat{\vecx}-\vecx_0 = \matA^+(\matA\vecx_0+\vece)-\vecx_0 =
\underbrace{\matA^+\matA\vecx_0-\vecx_0}_{\text{null space error}} +
\underbrace{\matA^+\vece}_{\text{noise error}}
\]
- We will express the error in terms of the SVD \(\matA=\matU\boldsymbol{\Sigma}\matV^\intercal\), with
- \(\set{\vecv_i}_{i=1}^r\)
orthobasis of \(\text{row}(\matA)\),
augmented by \(\set{\vecv_i}_{i=r+1}^{n}\in\ker{\matA}\)
to form an orthobasis of \(\bbR^n\)
- \(\set{\vecu_i}_{i=1}^r\)
orthobasis of \(\text{col}(\matA)\),
augmented by \(\set{\vecu_i}_{i=r+1}^{m}\in\ker{\matA^\intercal}\)
to form an orthobasis of \(\bbR^m\)
- The null space error is given by \[
\norm[2]{\matA^+\matA\vecx_0-\vecx_0}^2=\sum_{i=r+1}^n\abs{\dotp{\vecv_i}{\vecx_0}}^2
\]
- The noise error is given by \[
\norm[2]{\matA^+\vece}^2=\sum_{i=1}^r
\frac{1}{\sigma_i^2}\abs{\dotp{\vece}{\vecu_i}}^2
\]
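- As a quick sanity check of this decomposition, here is a minimal NumPy sketch (sizes, rank, noise level, and variable names are illustrative assumptions, not from the lecture) that compares \(\norm[2]{\hat{\vecx}-\vecx_0}^2\) against the sum of the two error terms:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 8, 6, 4                              # illustrative sizes, rank r < n
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))  # rank-r matrix
x0 = rng.standard_normal(n)                    # true signal
e = 0.01 * rng.standard_normal(m)              # additive noise
y = A @ x0 + e

x_hat = np.linalg.pinv(A, rcond=1e-10) @ y     # pseudoinverse reconstruction

U, s, Vt = np.linalg.svd(A)                    # rows of Vt are v_1, ..., v_n
null_err = np.sum((Vt[r:, :] @ x0) ** 2)       # sum_{i>r} |<v_i, x0>|^2
noise_err = np.sum((U[:, :r].T @ e) ** 2 / s[:r] ** 2)  # sum_{i<=r} |<e, u_i>|^2 / sigma_i^2

# The two components are orthogonal (one lives in ker(A), one in row(A)),
# so their squared norms add; both printed values should agree
print(np.linalg.norm(x_hat - x0) ** 2, null_err + noise_err)
```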
Stable reconstruction by truncation
- How do we mitigate the effect of small singular values in reconstruction? \[
\hat{\vecx} = \matV\boldsymbol{\Sigma}^{-1}\matU^\intercal\vecy =
\sum_{i=1}^r\frac{1}{\sigma_i}\dotp{\vecy}{\vecu_i}\vecv_i
\]
- Truncate the SVD to \(r'<r\) \[
\matA_t\eqdef
\sum_{i=1}^{r'}\sigma_i\vecu_i\vecv_i^\intercal\qquad\matA_t^+ =
\sum_{i=1}^{r'}\frac{1}{\sigma_i}\vecv_i\vecu_i^\intercal
\]
- Reconstruct \(\hat{\vecx}_t =
\sum_{i=1}^{r'}\frac{1}{\sigma_i}\dotp{\vecy}{\vecu_i}\vecv_i=\matA_t^+\vecy\)
- Error analysis: \[
\norm[2]{\hat{\vecx}_t-\vecx_0}^2 =
\underbrace{\sum_{i=r+1}^n\abs{\dotp{\vecx_0}{\vecv_i}}^2}_{\text{null space error}}+\underbrace{\sum_{i=r'+1}^r\abs{\dotp{\vecx_0}{\vecv_i}}^2}_{\text{truncation error}}+\underbrace{\sum_{i=1}^{r'}\frac{1}{\sigma_i^2}\abs{\dotp{\vece}{\vecu_i}}^2}_{\text{noise error}}
\]
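- In code, truncation amounts to dropping the trailing singular triplets. A minimal NumPy sketch (the truncation level `r_trunc` and all sizes are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r, r_trunc = 8, 6, 4, 2                  # r' = r_trunc < r, all illustrative
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))  # rank-r matrix
x0 = rng.standard_normal(n)
y = A @ x0 + 0.01 * rng.standard_normal(m)

U, s, Vt = np.linalg.svd(A)
# Keep only the r' largest singular values:
# x_hat_t = sum_{i<=r'} (1/sigma_i) <y, u_i> v_i
x_hat_t = Vt[:r_trunc, :].T @ ((U[:, :r_trunc].T @ y) / s[:r_trunc])
```

- Sweeping `r_trunc` from \(1\) to \(r\) trades the truncation error against the \(1/\sigma_i^2\) noise amplification of the smallest retained singular values.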
Stable reconstruction by regularization
- Regularization means changing the problem to solve \[
\min_{\vecx\in\bbR^n}\norm[2]{\vecy-\matA\vecx}^2+\lambda\norm[2]{\vecx}^2\qquad
\lambda>0
\]
- The solution is \[
\hat{\vecx} =
(\matA^\intercal\matA+\lambda\matI)^{-1}\matA^\intercal\vecy =
\matV(\boldsymbol{\Sigma}^2+\lambda\matI)^{-1}\boldsymbol{\Sigma}\matU^\intercal\vecy
\]
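- A minimal NumPy sketch of the regularized solution (`lam` stands in for \(\lambda\); sizes and values are illustrative assumptions), checking that the normal-equations form and the SVD form above agree:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, lam = 8, 6, 1e-2                          # illustrative sizes and lambda
A = rng.standard_normal((m, n))
y = A @ rng.standard_normal(n) + 0.01 * rng.standard_normal(m)

# Normal-equations form: (A^T A + lambda I)^{-1} A^T y
x_ridge = np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ y)

# SVD form: V (Sigma^2 + lambda I)^{-1} Sigma U^T y
U, s, Vt = np.linalg.svd(A, full_matrices=False)
x_ridge_svd = Vt.T @ ((s / (s ** 2 + lam)) * (U.T @ y))
print(np.allclose(x_ridge, x_ridge_svd))        # True
```

- Unlike truncation's hard cutoff, regularization shrinks every mode smoothly: the weight \(\sigma_i/(\sigma_i^2+\lambda)\) behaves like \(1/\sigma_i\) when \(\sigma_i\gg\sqrt{\lambda}\) and is damped toward \(0\) when \(\sigma_i\ll\sqrt{\lambda}\).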
Probability and statistics
- Probabilities play a huge role in machine learning
- You have to be comfortable with standard concepts
- I won't have time to review everything
- Resources
- Review
of Probability by Dr. Romberg
- "Probabilities" notes on Canvas with supporting videos here, here and there
- Use these to check your comfort with, and operational knowledge of,
probability
- Homework 7
- Office hours
Next time
- Next class: Wednesday November 13, 2024
- To be effectively prepared for next class, you
should:
- Go over today's slides
and read associated lecture notes here
and there
- Work on Homework 7
- Optional
- Export slides for next lecture as PDF (be on the lookout for an
announcement when they're ready)