# Representer theorem

Wednesday October 13, 2021

## Logistics

• Assignment 4 due October 14, 2021
• Hard deadline on October 16

## What’s on the agenda for today?

• Last time
  • Solving least squares
  • Minimum $\norm[2]{\cdot}$ solution
  • Regularized least squares
• Today
  • Extension to infinite dimensions
  • Representer theorem
• Reading: Romberg, lecture notes 8/9

## Ridge regression

• We can adapt the regularization approach to a finite-dimensional Hilbert space $\calF$:
$$\min_{f\in\calF}\sum_{i=1}^n(y_i-f(\bfx_i))^2 + \lambda\norm[\calF]{f}^2$$

• We are penalizing the norm of the entire function $f$
• Using a basis $\set{\psi_i}_{i=1}^d$ for the space, and constructing $\boldsymbol{\Psi}$ as earlier, we obtain $$\min_{\bftheta\in\bbR^d}\norm[2]{\bfy-\boldsymbol{\Psi}\bftheta}^2 + \lambda \bftheta^\intercal\matG\bftheta$$ with $\matG$ the Gram matrix of the basis.

• If $\boldsymbol{\Psi}^\intercal \boldsymbol{\Psi}+\lambda\matG$ is invertible, the solution is $$\bftheta^* = (\boldsymbol{\Psi}^\intercal \boldsymbol{\Psi}+\lambda\matG)^{-1}\boldsymbol{\Psi}^\intercal \bfy,$$ and we can reconstruct the function as $f(\bfx) = \sum_{i=1}^d\theta_i^*\psi_{i}(\bfx)$.

• If $\matG$ is well-conditioned, the resulting function is not too sensitive to the choice of the basis
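The solve above can be sketched numerically. Here is a minimal NumPy example; the monomial basis on $[0,1]$, the sample points, and the noisy sine data are illustrative assumptions, not part of the notes:

```python
import numpy as np

# Ridge regression in a finite-dimensional function space (a sketch;
# the monomial basis on [0, 1] and the data below are assumptions).
rng = np.random.default_rng(0)
n, d, lam = 20, 4, 0.1

x = rng.uniform(0, 1, n)                      # sample points x_i
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(n)

# Basis psi_{j+1}(x) = x**j for j = 0..d-1, so Psi[i, j] = x_i**j
Psi = np.vander(x, d, increasing=True)

# Gram matrix of monomials on [0, 1]: G[j, k] = int_0^1 x**j x**k dx
j, k = np.meshgrid(np.arange(d), np.arange(d), indexing="ij")
G = 1.0 / (j + k + 1)

# theta* = (Psi^T Psi + lam G)^{-1} Psi^T y
theta = np.linalg.solve(Psi.T @ Psi + lam * G, Psi.T @ y)

def f_hat(t):
    """Reconstructed function f(t) = sum_j theta_j psi_j(t)."""
    return np.vander(np.atleast_1d(t), d, increasing=True) @ theta
```

Solving the linear system directly (rather than forming the inverse) is the standard numerically stable choice.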

## Least-Squares in infinite dimension Hilbert spaces

• In $\bbR^d$, the problem $\min_{\bftheta\in\bbR^d}\norm[2]{\bfy-\matX\bftheta}^2 + \lambda\norm[2]{\bftheta}^2$ has the solution $$\bftheta^* = \matX^\intercal\bfalpha \quad\textsf{with}\quad \bfalpha =(\matX\matX^\intercal+\lambda\matI)^{-1}\bfy.$$ Since $\matX\matX^\intercal\in\bbR^{n\times n}$, this formulation is dimension independent; we will be able to extend it to infinite-dimensional Hilbert spaces!
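The primal and dual forms of the ridge solution can be checked numerically. A small sketch with random data (the dimensions and regularization weight are arbitrary choices):

```python
import numpy as np

# Check that theta* = (X^T X + lam I)^{-1} X^T y  (d x d solve)
# equals      X^T alpha, alpha = (X X^T + lam I)^{-1} y  (n x n solve).
# The second form only touches an n x n system -- dimension independent.
rng = np.random.default_rng(1)
n, d, lam = 10, 50, 0.5          # d >> n: the dual solve is much smaller
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

theta_primal = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
alpha = np.linalg.solve(X @ X.T + lam * np.eye(n), y)
theta_dual = X.T @ alpha

assert np.allclose(theta_primal, theta_dual)
```

The identity behind this is $(\matX^\intercal\matX+\lambda\matI)^{-1}\matX^\intercal = \matX^\intercal(\matX\matX^\intercal+\lambda\matI)^{-1}$, which follows from $\matX^\intercal(\matX\matX^\intercal+\lambda\matI) = (\matX^\intercal\matX+\lambda\matI)\matX^\intercal$.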

• Let $\calF$ be a Hilbert space and let $f\in\calF$ be the function we are trying to estimate

• We will estimate $f\in\calF$ from noisy observations $y_i = \dotp{f}{x_i} + \text{noise}$, with $\set{x_i}_{i=1}^n$ elements of $\calF$

• This is the equivalent of saying $\bfy = \bfA\bfx+\bfn$ in finite dimension

• The solution to $\min_{f\in\calF}\sum_{i=1}^n\abs{y_i-\dotp{f}{x_i}_{\calF}}^2+\lambda\norm[\calF]{f}^2$ is $$f = \sum_{i=1}^n\alpha_i x_i \quad\textsf{with}\quad \bfalpha = (\matK+\lambda\matI)^{-1}\bfy,\qquad \matK=\mat{c}{\dotp{x_i}{x_j}}_{1\leq i,j\leq n}.$$

• We will see that the setting of the representer theorem arises in Reproducing Kernel Hilbert Spaces (RKHS)
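The key point of the representer-theorem solution is that both the coefficients $\bfalpha$ and any prediction $\dotp{f}{z}$ require only inner products between elements of $\calF$. A sketch where the Hilbert space is simply $\bbR^d$ with the dot product (the data are illustrative):

```python
import numpy as np

# Representer-theorem solution using only inner products.
# Hilbert space here: R^d with the usual dot product, so
# K[i, j] = <x_i, x_j> = (X X^T)[i, j]; the same code pattern works
# whenever inner products between the x_i are computable.
rng = np.random.default_rng(2)
n, d, lam = 15, 100, 1.0
X = rng.standard_normal((n, d))   # rows are the x_i
y = rng.standard_normal(n)

K = X @ X.T                                   # Gram matrix of the x_i
alpha = np.linalg.solve(K + lam * np.eye(n), y)

# Predicting <f, z> needs only the inner products <x_i, z>, never f itself:
z = rng.standard_normal(d)
pred = alpha @ (X @ z)                        # sum_i alpha_i <x_i, z>

# Consistency check against the explicit expansion f = sum_i alpha_i x_i
f = X.T @ alpha
assert np.isclose(pred, f @ z)
```

In an RKHS the entries of $\matK$ come from a kernel function instead of explicit vectors, but the $n \times n$ solve is unchanged.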