Reproducing Kernel Hilbert Spaces

Dr. Matthieu R Bloch

Monday, October 18, 2021


  • Grades upcoming

    • Midterm 1 99% graded - Grades announced after curving (don’t panic)
    • Assignment 2 solution underway
    • Assignment 3 graded
    • Assignment 4 grading started
    • Drop date: check!
  • More office hours

    • Tuesday October 19, 2021 8am-9am on BlueJeans
    • Come prepared!
  • Midterm 2: scheduled for Wednesday November 3, 2021

    • Moved to Monday November 8, 2021 (gives you weekend to prepare)
    • Coverage: everything since Midterm 1 (don’t forget the fundamentals though), emphasis on regression

What’s on the agenda for today?

  • Last time: Representer theorem
    • Some infinite dimensional regression problems have surprising solutions!
    • We can compute the solution as a (finite) linear combination of feature vectors
    • (We have to wrap up the proof)
  • Today:
    • Reproducing Kernel Hilbert Spaces
    • Justifies the kind of Hilbert spaces where regularized regression can be solved
  • Reading: Romberg, lecture notes 10

Least-squares in infinite-dimensional Hilbert spaces

  • In \(\bbR^d\), the problem \(\min_{\bftheta\in\bbR^d}\norm[2]{\bfy-\bfX\bftheta}^2 + \lambda\norm[2]{\bftheta}^2\) has solution \[ \bftheta^* = \bfX^\intercal\bfalpha\textsf{ with } \bfalpha =(\bfX\bfX^\intercal+\lambda\bfI)^{-1}\bfy \] The size of \(\bfX\bfX^\intercal\in\bbR^{n\times n}\) does not depend on the dimension \(d\), so we will be able to extend this to infinite-dimensional Hilbert spaces!

  • Let \(\calF\) be a Hilbert space and let \(f\in\calF\) be the function we are trying to estimate

    • We will estimate \(f\in\calF\) using noisy observations \(y_i\) of \(\dotp{f}{x_i}\), where the \(\set{x_i}_{i=1}^n\) are elements of \(\calF\)

    • This is the equivalent of saying \(\bfy = \bfA\bfx+\bfn\) in finite dimension

  • The problem \[ \min_{f\in\calF}\sum_{i=1}^n\abs{y_i-{\dotp{f}{x_i}}_{\calF}}^2+\lambda\norm[\calF]{f}^2 \] has solution \[ f = \sum_{i=1}^n\alpha_i x_i\textsf{ with } \bfalpha = (\matK+\lambda\matI)^{-1}\vecy\qquad \matK=\mat{c}{{\dotp{x_i}{x_j}}_{\calF}}_{1\leq i,j\leq n} \]
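
The finite-dimensional claim \(\bftheta^* = \bfX^\intercal\bfalpha\) can be checked numerically. The following is a minimal sketch, assuming NumPy and randomly generated data with \(d\gg n\); the helper names `ridge_primal` and `ridge_dual` are illustrative and not part of the lecture.

```python
import numpy as np

def ridge_primal(X, y, lam):
    """Solve the d x d system (X^T X + lam I_d) theta = X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def ridge_dual(X, y, lam):
    """Solve the n x n system (X X^T + lam I_n) alpha = y, then set theta = X^T alpha."""
    n = X.shape[0]
    alpha = np.linalg.solve(X @ X.T + lam * np.eye(n), y)
    return X.T @ alpha, alpha

rng = np.random.default_rng(0)
n, d, lam = 5, 50, 0.1          # few observations, many dimensions
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

theta_primal = ridge_primal(X, y, lam)
theta_dual, alpha = ridge_dual(X, y, lam)

# Both routes should give the same minimizer up to numerical error.
print(np.allclose(theta_primal, theta_dual))
```

The dual route only ever manipulates an \(n\times n\) matrix of inner products between observations, which is what makes the extension to infinite dimension plausible.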

The big picture

  • For a Hilbert space \(\calF\) and \(n\) pairs \((x_i,y_i)\in\calF\times \bbR\), we know how to solve the following problem with linear algebra \[ \min_{f\in\calF}\sum_{i=1}^n\abs{y_i-{\dotp{f}{x_i}}_{\calF}}^2+\lambda\norm[\calF]{f}^2 \]

  • We would really like to solve the following problem for \(n\) pairs \((\bfx_i,y_i)\in\bbR^d\times\bbR\) \[ \min_{f\in\calF}\sum_{i=1}^n\abs{y_i-f(\bfx_i)}^2+\lambda\norm[\calF]{f}^2 \]

  • The question is whether \(f(\bfx_i) = {\dotp{f}{x_i}}_{\calF}\) for some \(x_i\in\calF\) that is a function of \(\bfx_i\)

    • Can this be done?
    • We can choose what \(\calF\) is!
  • Reproducing Kernel Hilbert Spaces (RKHSs) are specific Hilbert spaces where this happens to be true

    • Specifically, an RKHS is a Hilbert space of functions in which the sampling operation is a continuous linear functional
  • As usual, we’re throwing definitions at our problem to make progress
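
A finite-dimensional toy case illustrates what it means for \(f(\bfx_i)\) to be an inner product with an element \(x_i\in\calF\) that depends on \(\bfx_i\). The sketch below assumes \(\calF=\bbR^4\) and a hypothetical cubic feature map `feature_map` (an illustration, not the lecture's construction); `fit_alpha` and `predict` are likewise illustrative names.

```python
import numpy as np

def feature_map(x):
    """Hypothetical feature map: sends a scalar x to (1, x, x^2, x^3) in F = R^4."""
    return np.array([1.0, x, x**2, x**3])

def fit_alpha(xs, ys, lam):
    """alpha = (K + lam I)^{-1} y with K_ij = <feature_map(x_i), feature_map(x_j)>."""
    Phi = np.stack([feature_map(x) for x in xs])   # n x 4 matrix of features
    K = Phi @ Phi.T
    return np.linalg.solve(K + lam * np.eye(len(xs)), ys)

def predict(x_new, xs, alpha):
    """Evaluate f(x_new) = sum_i alpha_i <feature_map(x_i), feature_map(x_new)>."""
    return sum(a * (feature_map(x) @ feature_map(x_new)) for a, x in zip(alpha, xs))

xs = np.linspace(-1.0, 1.0, 8)
ys = xs**3 - xs                      # synthetic, noise-free targets
alpha = fit_alpha(xs, ys, lam=1e-3)

# f as an explicit element of F: f = sum_i alpha_i feature_map(x_i)
f_vec = sum(a * feature_map(x) for a, x in zip(alpha, xs))

# Evaluating f at a point is the same as taking an inner product in F.
print(predict(0.5, xs, alpha), f_vec @ feature_map(0.5))
```

Here the element of \(\calF\) attached to a point is available explicitly; the point of an RKHS is that the same inner-product structure exists even when no finite feature vector can be written down.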

Linear functionals on Hilbert spaces

  • In what follows, \(\calF\) is a Hilbert space with scalar field \(\bbR\)

  • A functional \(F:\calF\to\bbR\) associates a real number to each element of a Hilbert space \(\calF\)

  • Notation can be tricky when the Hilbert space is a space of functions: \(F\) can act on a function \(f\in\calF\)

  • Examples

  • A functional \(F:\calF\to\bbR\) is continuous if \[ \forall \epsilon>0\exists\delta>0\textsf{ such that } \norm[\calF]{x-y}\leq \delta\Rightarrow \abs{F(x)-F(y)}\leq\epsilon. \]

  • All norms are continuous functionals
  • The functional \(F:\calF\to\bbR:x\mapsto\dotp{x}{c}\) for some \(c\in\calF\) is continuous
  • A functional \(F\) is linear if \(\forall a,b\in\bbR\) \(\forall x,y\in\calF\) \(F(ax+by) = aF(x)+bF(y)\).
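
  • A short sketch of why \(x\mapsto\dotp{x}{c}\) is continuous (an added step, relying on the Cauchy-Schwarz inequality): for all \(x,y\in\calF\), \[ \abs{F(x)-F(y)} = \abs{\dotp{x-y}{c}} \leq \norm[\calF]{x-y}\,\norm[\calF]{c} \] so for \(c\neq 0\) the choice \(\delta = \epsilon/\norm[\calF]{c}\) satisfies the definition of continuity; for \(c=0\) the functional is identically zero and any \(\delta\) works.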

Representation of (continuous) linear functionals

  • Let \(F:\calF\to\bbR\) be a linear functional on an \(n\)-dimensional Hilbert space \(\calF\).

    Then there exists \(c\in\calF\) such that \(F(x)=\dotp{x}{c}\) for every \(x\in\calF\)
  • Linear functionals over finite-dimensional Hilbert spaces are continuous!

  • This is not true in infinite dimension

  • Let \(F:\calF\to\bbR\) be a continuous linear functional on a (possibly infinite-dimensional) separable Hilbert space \(\calF\).

    Then there exists \(c\in\calF\) such that \(F(x)=\dotp{x}{c}\) for every \(x\in\calF\)

  • If \(\set{\psi_n}_{n\geq 1}\) is an orthobasis for \(\calF\), then we can construct \(c\) above as \[ c\eqdef \sum_{n=1}^\infty F(\psi_n)\psi_n \]
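
In finite dimension, the construction \(c = \sum_n F(\psi_n)\psi_n\) can be checked directly. The following is a minimal sketch, assuming NumPy, \(\calF=\bbR^6\) with the standard inner product, an orthobasis taken from a QR factorization, and an arbitrary linear functional `F` chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6

# Columns of Q form an orthobasis {psi_1, ..., psi_n} of R^n.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
w = rng.standard_normal(n)

def F(x):
    """An arbitrary linear functional on R^n (illustrative choice)."""
    return 2.0 * x[0] - x[3] + w @ x

# Build the representer c = sum_n F(psi_n) psi_n.
c = sum(F(Q[:, k]) * Q[:, k] for k in range(n))

# F(x) should coincide with the inner product <x, c> for any x.
x = rng.standard_normal(n)
print(np.isclose(F(x), x @ c))
```

The functional is only ever queried on the basis vectors, yet the resulting \(c\) reproduces it on every \(x\); in infinite dimension, continuity of \(F\) is what allows the same argument to pass through the infinite sum.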

Reproducing Kernel Hilbert Spaces

  • An RKHS is a Hilbert space \(\calH\) of real-valued functions \(f:\bbR^d\to\bbR\) in which the sampling operation \(\calS_\bftau:\calH\to\bbR:f\mapsto f(\bftau)\) is continuous for every \(\bftau\in\bbR^d\).

    In other words, for each \(\bftau\in\bbR^d\), there exists \(k_\bftau\in\calH\) s.t. \[ f(\bftau) = {\dotp{f}{k_\bftau}}_\calH\text{ for all } f\in\calH \]
  • The kernel of an RKHS is \[ k:\bbR^d\times\bbR^d\to\bbR:(\bft,\bftau)\mapsto k_{\bftau}(\bft) \] where \(k_\bftau\) is the element of \(\calH\) that defines the sampling at \(\bftau\).
  • A (separable) Hilbert space \(\calH\) of functions with orthobasis \(\set{\psi_n}_{n\geq 1}\) is an RKHS iff \(\forall \bftau\in\bbR^d\) \(\sum_{n=1}^\infty\abs{\psi_{n}(\bftau)}^2<\infty\)
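
  • A sketch of where this condition comes from (an added step, applying the construction of \(c\) for continuous linear functionals to the sampling operation): if \(\calS_\bftau\) is continuous, then \[ k_\bftau = \sum_{n=1}^\infty \calS_\bftau(\psi_n)\psi_n = \sum_{n=1}^\infty \psi_n(\bftau)\psi_n \qquad\textsf{so that}\qquad \norm[\calH]{k_\bftau}^2 = \sum_{n=1}^\infty\abs{\psi_n(\bftau)}^2<\infty \] and the kernel takes the form \[ k(\bft,\bftau) = k_\bftau(\bft) = \sum_{n=1}^\infty \psi_n(\bftau)\psi_n(\bft) \]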

RKHS and non-orthogonal bases

  • If \(\set{\phi_n}_{n\geq 1}\) is a Riesz basis for \(\calH\), we know that every \(x\in\calH\) can be written \[ x = \sum_{n\geq 1}\alpha_n\phi_n\textsf{ with } \alpha_n\eqdef\dotp{x}{\smash{\widetilde{\phi}_n}} \] where \(\set{\widetilde{\phi}_n}_{n\geq 1}\) is the dual basis.

  • A (separable) Hilbert space with Riesz basis \(\set{\phi_n}_{n\geq 1}\) is an RKHS with kernel \[ k(\bft,\bftau) = \sum_{n=1}^\infty \phi_n(\bftau)\widetilde{\phi}_n(\bft) \] iff \(\forall \bftau\in\bbR^d\) \(\sum_{n=1}^\infty\abs{\phi_{n}(\bftau)}^2<\infty\)
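
The kernel formula with a non-orthogonal basis can be tested on a small example. The sketch below makes several assumptions that are not in the lecture: \(d=1\), the space is the polynomials of degree less than 3 on \([0,1]\) with the \(L^2([0,1])\) inner product, and the basis is the monomials \(\phi_n(t)=t^{n-1}\) (any basis of a finite-dimensional space is a Riesz basis). The dual basis is obtained from the inverse Gram matrix; `inner` and `k_tau` are illustrative names.

```python
import numpy as np
from numpy.polynomial import polynomial as P

N = 3  # basis phi_n(t) = t^(n-1), n = 1..N

def inner(p, q):
    """L2([0,1]) inner product of two polynomials given by coefficient vectors."""
    antiderivative = P.polyint(P.polymul(p, q))
    return P.polyval(1.0, antiderivative) - P.polyval(0.0, antiderivative)

# Gram matrix G_mn = <phi_m, phi_n>; phi_n has coefficient vector e_n.
E = np.eye(N)
G = np.array([[inner(E[m], E[n]) for n in range(N)] for m in range(N)])
G_inv = np.linalg.inv(G)

def k_tau(tau):
    """Coefficients of k_tau = sum_n phi_n(tau) * dual_phi_n,
    where dual_phi_n = sum_m (G^-1)_nm phi_m."""
    phi_at_tau = tau ** np.arange(N)
    return G_inv @ phi_at_tau

f = np.array([1.0, -2.0, 0.5])   # f(t) = 1 - 2 t + 0.5 t^2
tau = 0.3

# Reproducing property: <f, k_tau> should equal f(tau).
print(inner(f, k_tau(tau)), P.polyval(tau, f))
```

The monomials are far from orthogonal (their Gram matrix is a Hilbert matrix), which is why the dual basis, rather than the basis itself, appears in the kernel.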