Mathematical Foundations of Machine Learning

Prof. Matthieu Bloch

Wednesday, November 13, 2024

Last time

  • Last class: Monday November 11, 2024
    • We finished talking about the SVD! (for now…)
  • Today: We will talk about estimation
  • To be effectively prepared for today's class, you should have:
    1. Gone over slides and read associated lecture notes here and there
    2. Planned to submit Homework 7 (due Monday November 18, 2024)
  • Logistics:
    • Jack Hill office hours: Wednesday 11:30am-12:30pm in TSRB and hybrid
    • Anuvab Sen office hours: Thursday 12pm-1pm in TSRB and hybrid
    • Dr. Bloch office hours: Friday November 15, 2024 4:30pm-5:15pm online
  • Homework 7: due Monday November 18, 2024

What's next for this semester

  • Lecture 21 - Monday November 4, 2024: SVD and least squares
  • Lecture 22 - Wednesday November 6, 2024: Gradient descent
    • Homework 6 due on Thursday November 7, 2024
  • Lecture 23 - Monday November 11, 2024: Estimation
  • Lecture 24 - Wednesday November 13, 2024: Estimation
  • Lecture 25 - Monday November 18, 2024: Classification and Regression
    • Homework 7 due on Monday November 18, 2024
  • Lecture 26 - Wednesday November 20, 2024: Classification and Regression
  • Lecture 27 - Monday November 25, 2024: Principal Component Analysis
    • Homework 8 due
  • Lecture 28 - Monday December 2, 2024: Principal Component Analysis

Probability and statistics

  • Probabilities play a huge role in machine learning
    • You have to be comfortable with standard concepts
    • I won't have time to review everything
  • Resources
    • Review of Probability by Dr. Romberg
    • "Probabilities" notes on Canvas with supporting videos here, here and there
    • Use these to assess and refresh your working knowledge of probabilities
    • Homework 7
    • Office hours + recitation problems

Minimum mean square estimation

  • Consider a random variable \(Y\) with finite first and second moments
    • distribution \(p_{Y}\) with \(\E{Y}<\infty\), \(\E{Y^2}<\infty\)
    • what is the guess \(g\) that minimizes the mean square error? \[ g^* \eqdef \argmin_g \E{(Y-g)^2} \]
  • Consider two jointly distributed random variables \(X\) and \(Y\) with finite first and second moments
    • joint distribution \(p_{XY}\) with \(\E{Y}<\infty\), \(\E{Y^2}<\infty\)
    • what is the guess \(g\) that minimizes the mean square error of \(Y\) after observing \(X=x\)? \[ g^*(x) \eqdef \argmin_g \E{(Y-g)^2|X=x} \] (see the derivation sketched after this list)
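
Both answers follow from one short completion-of-the-square argument worth spelling out. Expanding around \(\E{Y}\),
\[ \E{(Y-g)^2} = \E{(Y-\E{Y})^2} + (\E{Y}-g)^2, \]
since the cross term \(2(\E{Y}-g)\E{Y-\E{Y}}\) vanishes. The first term does not depend on \(g\), so the minimizer is \(g^* = \E{Y}\); applying the same argument to the conditional distribution of \(Y\) given \(X=x\) yields \(g^*(x) = \E{Y|X=x}\), the conditional expectation.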

Gaussian estimation

  • Consider a Gaussian random vector \(\bfX\sim\calN(\mathbf{0},\matR)\), i.e., \[ p(\vecx) = \frac{1}{(2\pi)^{n/2}\sqrt{\det{\matR}}}\exp\left(-\frac{1}{2}\vecx^T\matR^{-1}\vecx\right) \]

  • Assume that we write \[ \bfX = \left[\begin{array}{c}\bfX_o\\\bfX_h\end{array}\right]\qquad\matR = \left[\begin{array}{cc}\matR_o&\matR_{oh}\\ \matR_{oh}^T&\matR_{h}\end{array}\right] \]

    • We observe \(\bfX_o=\vecx_o\)
    • what is the conditional density of \(\bfX_h|\bfX_o=\vecx_o\)?

The conditional density of \(\bfX_h|\bfX_o=\vecx_o\) is a Normal distribution with mean and covariance matrix \[ \bfmu = \matR_{oh}^T\matR_o^{-1}\vecx_o \] \[ \mathbf{\Sigma} = \matR_h - \matR_{oh}^T\matR_o^{-1}\matR_{oh} \]
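
These formulas are easy to misremember, so here is a minimal numerical sanity check, sketched under the assumption that NumPy is available; the covariance matrix R and the observation x_o below are made-up values for illustration, not from the lecture.

  import numpy as np

  rng = np.random.default_rng(0)

  # Hypothetical 3-dimensional example: X = [X_o (2-dim); X_h (1-dim)].
  # Any positive definite R with this block structure works.
  R = np.array([[2.0, 0.5, 0.8],
                [0.5, 1.0, 0.3],
                [0.8, 0.3, 1.5]])
  R_o, R_oh, R_h = R[:2, :2], R[:2, 2:], R[2:, 2:]

  x_o = np.array([1.0, -0.5])  # a particular observation X_o = x_o

  # Closed-form conditional mean and covariance from the result above.
  mu = R_oh.T @ np.linalg.solve(R_o, x_o)
  Sigma = R_h - R_oh.T @ np.linalg.solve(R_o, R_oh)

  # Monte Carlo check: draw X ~ N(0, R), keep samples whose observed
  # block falls near x_o (approximate conditioning), and compare the
  # empirical mean/covariance of the hidden block to mu and Sigma.
  X = rng.multivariate_normal(np.zeros(3), R, size=2_000_000)
  keep = np.linalg.norm(X[:, :2] - x_o, axis=1) < 0.1
  X_h = X[keep, 2:]

  print("closed form :", mu, Sigma)
  print("Monte Carlo :", X_h.mean(axis=0), np.cov(X_h.T))

Keeping samples in a small ball around \(\vecx_o\) only approximates conditioning on \(\bfX_o=\vecx_o\) exactly, so expect the Monte Carlo numbers to agree with the closed form to roughly two decimal places.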

Next time

  • Next class: Monday November 18, 2024
  • To be effectively prepared for next class, you should:
    1. Go over today's slides and read associated lecture notes here and there
    2. Work on Homework 7
  • Optional
    • Export slides for next lecture as PDF (be on the lookout for an announcement when they're ready)