Regression

Dr. Matthieu R Bloch

Wednesday October 06, 2021

Logistics

  • Assignment 4 assigned Tuesday, October 5, 2021

    • Includes a (small) programming component

    • Due October 14, 2021 (soft deadline, hard deadline on October 16)

What’s on the agenda for today?

  • Last time: Least-squares regression

  • Today

    • Solving linear least-squares regression

    • Extension to infinite dimensions

  • Reading: Romberg, lecture notes 8

Solving the least-squares problem

  • Any solution \(\bftheta^*\) to the problem \(\min_{\bftheta\in\bbR^d} \norm[2]{\bfy-\bfX\bftheta}^2\) must satisfy \[ \bfX^\intercal\bfX\bftheta^* = \bfX^\intercal\bfy \] (set the gradient \(2\bfX^\intercal(\bfX\bftheta-\bfy)\) to zero). This system is called the normal equations
  • Facts: for any matrix \(\bfA\in\bbR^{m\times n}\)

    • \(\ker{\bfA^\intercal\bfA}=\ker{\bfA}\)

    • \(\text{col}(\bfA^\intercal\bfA)=\text{row}(\bfA)\)

    • \(\text{row}(\bfA)\) and \(\ker{\bfA}\) are orthogonal complements

  • We can say a lot more about the normal equations

    1. There is always a solution
    2. If \(\textsf{rank}(\bfX)=d\), there is a unique solution: \(\bftheta^*=(\bfX^\intercal\bfX)^{-1}\bfX^\intercal\bfy\)
    3. If \(\textsf{rank}(\bfX)<d\), there are infinitely many solutions, differing by elements of \(\ker{\bfX}\)
    4. If \(\textsf{rank}(\bfX)=n\), there exists a solution \(\bftheta^*\) for which \(\bfy=\bfX\bftheta^*\)
  • In machine learning, there are often infinitely many solutions (e.g., whenever \(d>n\)); see the numerical sketch below
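
As a quick numerical check of these facts (a minimal NumPy sketch, not part of the notes; the array names are illustrative), the snippet below solves the normal equations directly in the full-column-rank case and confirms the result against np.linalg.lstsq, which solves the same least-squares problem via the SVD.

    import numpy as np

    rng = np.random.default_rng(0)

    # Overdetermined case: n = 20 samples, d = 3 features, so rank(X) = d
    # almost surely and the normal equations have a unique solution.
    n, d = 20, 3
    X = rng.standard_normal((n, d))
    y = rng.standard_normal(n)

    # Solve X^T X theta = X^T y directly.
    theta_normal = np.linalg.solve(X.T @ X, X.T @ y)

    # np.linalg.lstsq minimizes ||y - X theta||_2^2 (via the SVD).
    theta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

    print(np.allclose(theta_normal, theta_lstsq))  # True: unique solution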

Minimum 2-norm solutions

  • One reasonable way to choose a solution among infinitely many is the minimum-energy principle \[ \min_{\bftheta\in\bbR^d}\norm[2]{\bftheta}^2\text{ such that } \bfX^\intercal\bfX\bftheta = \bfX^\intercal\bfy \]

    • We will see the solution is always unique using the SVD
  • For now, assume that \(\textsf{rank}(\bfX)=n\), so that \(\bfX\bftheta=\bfy\) has a solution and the problem becomes \[ \min_{\bftheta\in\bbR^d}\norm[2]{\bftheta}^2\text{ such that } \bfX\bftheta = \bfy \]

  • The solution is \(\bftheta^*=\bfX^\intercal(\bfX\bfX^\intercal)^{-1}\bfy\); see the sketch below
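
A minimal NumPy sketch (not from the notes) of this closed form: for a random underdetermined system with \(\textsf{rank}(\bfX)=n\), the formula above interpolates the data and matches the pseudoinverse solution, which is known to be the minimum 2-norm solution.

    import numpy as np

    rng = np.random.default_rng(1)

    # Underdetermined case: n = 3 equations, d = 10 unknowns, rank(X) = n
    # almost surely, so X theta = y has infinitely many solutions.
    n, d = 3, 10
    X = rng.standard_normal((n, d))
    y = rng.standard_normal(n)

    # Closed-form minimum 2-norm solution: theta* = X^T (X X^T)^{-1} y.
    theta_min = X.T @ np.linalg.solve(X @ X.T, y)

    # The Moore-Penrose pseudoinverse picks the same minimum-norm solution.
    theta_pinv = np.linalg.pinv(X) @ y

    print(np.allclose(X @ theta_min, y))       # True: fits the data exactly
    print(np.allclose(theta_min, theta_pinv))  # True: same minimum-norm solution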

Regularization

  • Recall the problem \[ \min_{\bftheta\in\bbR^d}\norm[2]{\bftheta}^2\text{ such that } \bfX^\intercal\bfX\bftheta = \bfX^\intercal\bfy \]
    • There are infinitely many solutions if \(\ker{\bfX}\) is non-trivial
    • The set of solutions is unbounded!
    • Even if \(\ker{\bfX}=\set{0}\), the system can be poorly conditioned
  • Regularization with \(\lambda>0\) consists in solving \[ \min_{\bftheta\in\bbR^d}\norm[2]{\bfy-\bfX\bftheta}^2 + \lambda\norm[2]{\bftheta}^2 \]
    • This problem always has a unique solution
  • The solution is \(\bftheta^*=(\bfX^\intercal\bfX+\lambda\bfI)^{-1}\bfX^\intercal\bfy = \bfX^\intercal(\bfX\bfX^\intercal+\lambda\bfI)^{-1}\bfy\)
  • Note that \(\bftheta^*\) lies in the row space of \(\bfX\): \[ \bftheta^* = \bfX^\intercal\bfalpha\textsf{ with } \bfalpha =(\bfX\bfX^\intercal+\lambda\bfI)^{-1}\bfy \] (see the sketch below)
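
A minimal NumPy sketch (not from the notes) verifying the two equivalent expressions: the \(d\times d\) system and the \(n\times n\) system yield the same ridge solution, which is why the second form is attractive when \(d\gg n\).

    import numpy as np

    rng = np.random.default_rng(2)

    n, d, lam = 5, 8, 0.1
    X = rng.standard_normal((n, d))
    y = rng.standard_normal(n)

    # d x d system: theta* = (X^T X + lambda I_d)^{-1} X^T y.
    theta_primal = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

    # n x n system: theta* = X^T alpha with alpha = (X X^T + lambda I_n)^{-1} y,
    # which shows theta* lies in the row space of X.
    alpha = np.linalg.solve(X @ X.T + lam * np.eye(n), y)
    theta_dual = X.T @ alpha

    print(np.allclose(theta_primal, theta_dual))  # True: the two forms agree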