# Representer theorem

Wednesday October 13, 2021

## Logistics

• Assignment 4 due October 14, 2021
• Hard deadline on October 16

## What’s on the agenda for today?

• Last time
  • Solving least squares
  • Minimum $\norm[2]{\cdot}$ solution
  • Regularized least squares
• Today
  • Extension to infinite dimensions
  • Representer theorem
• Reading: Romberg, lecture notes 8/9

## Ridge regression

• We can adapt the regularization approach to a finite-dimensional Hilbert space $\calF$:
$$\min_{f\in\calF}\sum_{i=1}^n(y_i-f(\bfx_i))^2 + \lambda\norm[\calF]{f}^2$$

• We are penalizing the norm of the entire function $f$
• Using a basis $\set{\psi_i}_{i=1}^d$ for the space, and constructing $\boldsymbol{\Psi}$ as earlier, we obtain $$\min_{\bftheta\in\bbR^d}\norm[2]{\bfy-\boldsymbol{\Psi}\bftheta}^2 + \lambda \bftheta^\intercal\matG\bftheta$$ with $\matG$ the Gram matrix of the basis.

• If $\boldsymbol{\Psi}^\intercal \boldsymbol{\Psi}+\lambda\matG$ is invertible, the solution is $$\bftheta^* = (\boldsymbol{\Psi}^\intercal \boldsymbol{\Psi}+\lambda\matG)^{-1}\boldsymbol{\Psi}^\intercal \bfy,$$ and we can reconstruct the function as $f(\bfx) = \sum_{i=1}^d\theta_i^*\psi_{i}(\bfx)$.

• If $\matG$ is well-conditioned, the resulting function is not too sensitive to the choice of the basis
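The solve above can be sketched numerically. Here is a minimal NumPy example; the monomial basis on $[0,1]$, the sample points, and the noisy sine data are illustrative assumptions, not part of the notes:

```python
import numpy as np

# Ridge regression in a finite-dimensional function space (a sketch;
# the monomial basis on [0, 1] and the data below are assumptions).
rng = np.random.default_rng(0)
n, d, lam = 20, 4, 0.1

x = rng.uniform(0, 1, n)                      # sample points x_i
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(n)

# Basis psi_{j+1}(x) = x**j for j = 0..d-1, so Psi[i, j] = x_i**j
Psi = np.vander(x, d, increasing=True)

# Gram matrix of monomials on [0, 1]: G[j, k] = int_0^1 x**j x**k dx
j, k = np.meshgrid(np.arange(d), np.arange(d), indexing="ij")
G = 1.0 / (j + k + 1)

# theta* = (Psi^T Psi + lam G)^{-1} Psi^T y
theta = np.linalg.solve(Psi.T @ Psi + lam * G, Psi.T @ y)

def f_hat(t):
    """Reconstructed function f(t) = sum_j theta_j psi_j(t)."""
    return np.vander(np.atleast_1d(t), d, increasing=True) @ theta
```

Solving the linear system directly (rather than forming the inverse) is the standard numerically stable choice.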

## Least-Squares in infinite dimension Hilbert spaces

• In $\bbR^d$, the problem $\min_{\bftheta\in\bbR^d}\norm[2]{\bfy-\matX\bftheta}^2 + \lambda\norm[2]{\bftheta}^2$ has the solution $$\bftheta^* = \matX^\intercal\bfalpha \quad\textsf{with}\quad \bfalpha =(\matX\matX^\intercal+\lambda\matI)^{-1}\bfy.$$ Since $\matX\matX^\intercal\in\bbR^{n\times n}$, this formulation is dimension independent; we will be able to extend it to infinite-dimensional Hilbert spaces!
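The primal and dual forms of the ridge solution can be checked numerically. A small sketch with random data (the dimensions and regularization weight are arbitrary choices):

```python
import numpy as np

# Check that theta* = (X^T X + lam I)^{-1} X^T y  (d x d solve)
# equals      X^T alpha, alpha = (X X^T + lam I)^{-1} y  (n x n solve).
# The second form only touches an n x n system -- dimension independent.
rng = np.random.default_rng(1)
n, d, lam = 10, 50, 0.5          # d >> n: the dual solve is much smaller
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

theta_primal = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
alpha = np.linalg.solve(X @ X.T + lam * np.eye(n), y)
theta_dual = X.T @ alpha

assert np.allclose(theta_primal, theta_dual)
```

The identity behind this is $(\matX^\intercal\matX+\lambda\matI)^{-1}\matX^\intercal = \matX^\intercal(\matX\matX^\intercal+\lambda\matI)^{-1}$, which follows from $\matX^\intercal(\matX\matX^\intercal+\lambda\matI) = (\matX^\intercal\matX+\lambda\matI)\matX^\intercal$.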

• Let $\calF$ be a Hilbert space and let $f\in\calF$ be the function we are trying to estimate

• We will estimate $f\in\calF$ from noisy observations $y_i = \dotp{f}{x_i} + \text{noise}$, with $\set{x_i}_{i=1}^n$ elements of $\calF$

• This is the equivalent of saying $\bfy = \bfA\bfx+\bfn$ in finite dimension

• The solution to $\min_{f\in\calF}\sum_{i=1}^n\abs{y_i-\dotp{f}{x_i}_{\calF}}^2+\lambda\norm[\calF]{f}^2$ is $$f = \sum_{i=1}^n\alpha_i x_i \quad\textsf{with}\quad \bfalpha = (\matK+\lambda\matI)^{-1}\bfy,\qquad \matK=\mat{c}{\dotp{x_i}{x_j}}_{1\leq i,j\leq n}.$$

• We will see that the setting of the representer theorem arises in Reproducing Kernel Hilbert Spaces (RKHS)
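The key point of the representer-theorem solution is that both the coefficients $\bfalpha$ and any prediction $\dotp{f}{z}$ require only inner products between elements of $\calF$. A sketch where the Hilbert space is simply $\bbR^d$ with the dot product (the data are illustrative):

```python
import numpy as np

# Representer-theorem solution using only inner products.
# Hilbert space here: R^d with the usual dot product, so
# K[i, j] = <x_i, x_j> = (X X^T)[i, j]; the same code pattern works
# whenever inner products between the x_i are computable.
rng = np.random.default_rng(2)
n, d, lam = 15, 100, 1.0
X = rng.standard_normal((n, d))   # rows are the x_i
y = rng.standard_normal(n)

K = X @ X.T                                   # Gram matrix of the x_i
alpha = np.linalg.solve(K + lam * np.eye(n), y)

# Predicting <f, z> needs only the inner products <x_i, z>, never f itself:
z = rng.standard_normal(d)
pred = alpha @ (X @ z)                        # sum_i alpha_i <x_i, z>

# Consistency check against the explicit expansion f = sum_i alpha_i x_i
f = X.T @ alpha
assert np.isclose(pred, f @ z)
```

In an RKHS the entries of $\matK$ come from a kernel function instead of explicit vectors, but the $n \times n$ solve is unchanged.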