Reproducing Kernel Hilbert Spaces

Dr. Matthieu R Bloch

Monday, October 18, 2021

Logistics

Grades upcoming
- Midterm 1 99% graded - Grades announced after curving (don’t panic)
- Assignment 2 solution underway
- Assignment 3 graded
- Assignment 4 grading started
- Drop date: check!
More office hours
- Tuesday October 19, 2021 8am-9am on BlueJeans (https://bluejeans.com/205357142)
- Come prepared!
Midterm 2: scheduled for Wednesday November 3, 2021
- Moved to Monday November 8, 2021 (gives you weekend to prepare)
- Coverage: everything since Midterm 1 (don’t forget the fundamentals though), emphasis on regression

Last time: Representer theorem
- Some infinite dimensional regression problems have surprising solutions!
- We can compute the solution as a (finite) linear combination of feature vectors
- (We have to wrap up the proof)
Today:
- Reproducing Kernel Hilbert Spaces
- Characterizes the kind of Hilbert spaces in which regularized regression can be solved
Reading: Romberg, lecture notes 10

In \(\bbR^d\), the problem \(\min_{\bftheta\in\bbR^d}\norm[2]{\bfy-\bfX\bftheta}^2 + \lambda\norm[2]{\bftheta}^2\) has solution \[ \bftheta^* = \bfX^\intercal\bfalpha\textsf{ with } \bfalpha =(\bfX\bfX^\intercal+\lambda\bfI)^{-1}\bfy \] The size of \(\bfX\bfX^\intercal\in\bbR^{n\times n}\) does not depend on the dimension \(d\)! We will be able to extend this to infinite-dimensional Hilbert spaces!
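As a quick numerical sanity check of the formula above, the following sketch compares the usual \(d\times d\) ridge solution with the \(n\times n\) form \(\bfX^\intercal(\bfX\bfX^\intercal+\lambda\bfI)^{-1}\bfy\). The sizes, the random data, and the value of \(\lambda\) are arbitrary assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, lam = 5, 50, 0.1            # few samples, many features (illustrative sizes)
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

# Primal form: solve a d x d linear system.
theta_primal = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Form from the text: solve an n x n system, then map back with X^T.
alpha = np.linalg.solve(X @ X.T + lam * np.eye(n), y)
theta_dual = X.T @ alpha

print(np.allclose(theta_primal, theta_dual))
```

Only the \(n\times n\) matrix \(\bfX\bfX^\intercal\) of inner products between data points is needed in the second form, which is what makes the extension to infinite dimension plausible.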
Let \(\calF\) be a Hilbert space and let \(f\in\calF\) be the function we are trying to estimate
- We will estimate \(f\in\calF\) using noisy observations \(y_i\) of \(\dotp{f}{x_i}\), with \(\set{x_i}_{i=1}^n\) elements of \(\calF\)
- This is the equivalent of saying \(\bfy = \bfA\bfx+\bfn\) in finite dimension
\[ \min_{f\in\calF}\sum_{i=1}^n\abs{y_i-{\dotp{f}{x_i}}_{\calF}}^2+\lambda\norm[\calF]{f}^2 \] has solution \[ f = \sum_{i=1}^n\alpha_i x_i\textsf{ with } \bfalpha = (\matK+\lambda\matI)^{-1}\vecy\qquad \matK=\mat{c}{{\dotp{x_i}{x_j}}_{\calF}}_{1\leq i,j\leq n} \]
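The closed form can be checked on a small Hilbert space whose inner product is not the Euclidean one. The sketch below is only an illustration under stated assumptions: \(\calF\) is taken to be the polynomials of degree less than \(p\) on \([0,1]\) with \(\dotp{f}{g}=\int_0^1 f(t)g(t)\,dt\), the regularizer is the squared norm \(\lambda\norm[\calF]{f}^2\) (which is what the closed form corresponds to), and the elements \(x_i\) and observations \(y_i\) are random. All names (`H`, `C`, `a_direct`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
p, n, lam = 4, 6, 0.5

# Gram matrix of the monomials 1, t, ..., t^(p-1) under <f, g> = int_0^1 f g dt.
idx = np.arange(p)
H = 1.0 / (idx[:, None] + idx[None, :] + 1)

C = rng.standard_normal((n, p))   # row i holds the monomial coefficients of x_i
y = rng.standard_normal(n)

# Kernel (Gram) matrix K[i, j] = <x_i, x_j>.
K = C @ H @ C.T
alpha = np.linalg.solve(K + lam * np.eye(n), y)
f_coef = C.T @ alpha              # coefficients of f = sum_i alpha_i x_i

# Cross-check: minimize directly over the p coefficients a of f.
# Objective: sum_i (y_i - a^T H c_i)^2 + lam * a^T H a.
# Setting the gradient to zero gives (C^T C H + lam I) a = C^T y.
a_direct = np.linalg.solve(C.T @ C @ H + lam * np.eye(p), C.T @ y)

print(np.allclose(f_coef, a_direct))
```

The solve with \(\matK\) never uses the coefficient representation of \(f\); it only needs the pairwise inner products, so the same computation goes through when \(\calF\) has no finite basis.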

For a Hilbert space \(\calF\) and \(n\) pairs \((x_i,y_i)\in\calF\times \bbR\), we know how to solve the following problem with linear algebra \[ \min_{f\in\calF}\sum_{i=1}^n\abs{y_i-{\dotp{f}{x_i}}_{\calF}}^2+\lambda\norm[\calF]{f}^2 \]
We would really like to solve the following problem for \(n\) pairs \((\bfx_i,y_i)\in\bbR^d\times\bbR\) \[ \min_{f\in\calF}\sum_{i=1}^n\abs{y_i-f(\bfx_i)}^2+\lambda\norm[\calF]{f}^2 \]
The question is whether \(f(\bfx_i) = {\dotp{f}{x_i}}_{\calF}\) for some \(x_i\in\calF\) that is a function of \(\bfx_i\)
- Can this be done?
- We can choose what \(\calF\) is!
Reproducing Kernel Hilbert Spaces (RKHSs) are specific Hilbert spaces where this happens to be true
- Specifically, an RKHS is a Hilbert space of functions in which the sampling operation is a continuous linear functional
As usual, we’re throwing definitions at our problem to make progress

In what follows, \(\calF\) is a Hilbert space with scalar field \(\bbR\)
A functional \(F:\calF\to\bbR\) associates a real number to each element of a Hilbert space \(\calF\)
Notation can be tricky when the Hilbert space is a space of functions: \(F\) can act on a function \(f\in\calF\)
Examples
A functional \(F:\calF\to\bbR\) is continuous if \[ \forall \epsilon>0\exists\delta>0\textsf{ such that } \norm[\calF]{x-y}\leq \delta\Rightarrow \abs{F(x)-F(y)}\leq\epsilon. \]
All norms are continuous functionals
The functional \(F:\calF\to\bbR:x\mapsto\dotp{x}{c}\) for some \(c\in\calF\) is continuous
A functional \(F\) is linear if \(\forall a,b\in\bbR\) \(\forall x,y\in\calF\) \(F(ax+by) = aF(x)+bF(y)\).

Let \(F:\calF\to\bbR\) be a linear functional on an \(n\)-dimensional Hilbert space \(\calF\).
Then there exists \(c\in\calF\) such that \(F(x)=\dotp{x}{c}\) for every \(x\in\calF\)
Linear functionals over finite-dimensional Hilbert spaces are continuous!
This is not true in infinite dimension
Let \(F:\calF\to\bbR\) be a continuous linear functional on a (possibly infinite-dimensional) separable Hilbert space \(\calF\).
Then there exists \(c\in\calF\) such that \(F(x)=\dotp{x}{c}\) for every \(x\in\calF\)
If \(\set{\psi_n}_{n\geq 1}\) is an orthobasis for \(\calF\), then we can construct \(c\) above as \[ c\eqdef \sum_{n=1}^\infty F(\psi_n)\psi_n \]
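The construction of \(c\) can be tried out in finite dimension, where the sum has finitely many terms. In this sketch the space is \(\bbR^6\) with the usual dot product, the orthobasis comes from a QR factorization of a random matrix, and the linear functional `F` is a hypothetical example that the code only ever calls as a black box.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 6

# Columns of Q form an orthobasis psi_1, ..., psi_d of R^d (obtained via QR).
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))

# A hypothetical linear functional; below it is only evaluated, never inspected.
w = rng.standard_normal(d)

def F(x):
    return float(w @ x)

# c = sum_n F(psi_n) psi_n
c = sum(F(Q[:, k]) * Q[:, k] for k in range(d))

# Check the representation F(x) = <x, c> on a random element x.
x = rng.standard_normal(d)
print(np.isclose(F(x), x @ c))
```

The vector `c` is built using only the values of `F` on the basis elements, which is exactly the information the formula uses.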

An RKHS is a Hilbert space \(\calH\) of real-valued functions \(f:\bbR^d\to\bbR\) in which the sampling operation \(\calS_\bftau:\calH\to\bbR:f\mapsto f(\bftau)\) is continuous for every \(\bftau\in\bbR^d\).
In other words, for each \(\bftau\in\bbR^d\), there exists \(k_\bftau\in\calH\) s.t. \[ f(\bftau) = {\dotp{f}{k_\bftau}}_\calH\text{ for all } f\in\calH \]
The kernel of an RKHS is \[ k:\bbR^d\times\bbR^d\to\bbR:(\bft,\bftau)\mapsto k_{\bftau}(\bft) \] where \(k_\bftau\) is the element of \(\calH\) that defines the sampling at \(\bftau\).
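To see how the kernel enters regression, here is a minimal sketch on one concrete RKHS, chosen purely for illustration: the affine functions \(f(\bft)=w_0+\bfw^\intercal\bft\) on \(\bbR^d\) with inner product equal to the dot product of the coefficients \((w_0,\bfw)\). In that space \(k(\bft,\bftau)=1+\bft^\intercal\bftau\), since the element with coefficients \((1,\bftau)\) reproduces \(f(\bftau)\). The sketch assumes the squared-norm regularizer \(\lambda\norm[\calH]{f}^2\), takes the elements \(x_i\) to be \(k_{\bfx_i}\), and uses random data; the function name `k` and all sizes are hypothetical.

```python
import numpy as np

def k(s, t):
    """k(s, t) = 1 + s . t for the rows of s and t; shape (len(s), len(t))."""
    return 1.0 + s @ t.T

rng = np.random.default_rng(5)
n, d, lam = 20, 3, 0.1
X = rng.standard_normal((n, d))         # sample locations x_1, ..., x_n in R^d
y = rng.standard_normal(n)

# Kernel form: f = sum_i alpha_i k_{x_i} with alpha = (K + lam I)^{-1} y.
K = k(X, X)
alpha = np.linalg.solve(K + lam * np.eye(n), y)

x_new = rng.standard_normal((4, d))
f_kernel = k(x_new, X) @ alpha          # f(x) = sum_i alpha_i k(x, x_i)

# Same estimator written with explicit coefficients (w_0, w).
Phi = np.hstack([np.ones((n, 1)), X])
w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(d + 1), Phi.T @ y)
f_explicit = np.hstack([np.ones((4, 1)), x_new]) @ w

print(np.allclose(f_kernel, f_explicit))
```

Evaluating the estimate at a new point only requires kernel evaluations against the training points, never the coefficients of \(f\).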
A (separable) Hilbert space with orthobasis \(\set{\psi_n}_{n\geq 1}\) is an RKHS iff \(\forall \bftau\in\bbR^d\) \(\sum_{n=1}^\infty\abs{\psi_{n}(\bftau)}^2<\infty\)
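Applying the construction \(c=\sum_n F(\psi_n)\psi_n\) to the sampling functional gives \(k_\bftau=\sum_n\psi_n(\bftau)\psi_n\), and the summability condition says exactly that \(k_\bftau\) has finite norm. The sketch below checks the reproducing property numerically under simplifying assumptions that are not from the lecture: the domain is \([0,1]\) rather than \(\bbR^d\), the space is the span of the first \(N\) functions \(\psi_n(t)=\sqrt{2}\sin(n\pi t)\) (orthonormal in \(L_2([0,1])\)), and the inner product is evaluated on a grid.

```python
import numpy as np

N = 8                                   # number of basis functions (illustrative)
n = np.arange(1, N + 1)

def psi(t):
    """Evaluate psi_1..psi_N at the points t; returns shape (N, len(t))."""
    t = np.atleast_1d(t)
    return np.sqrt(2.0) * np.sin(np.pi * n[:, None] * t[None, :])

def kernel(t, tau):
    """k(t, tau) = sum_n psi_n(t) psi_n(tau); shape (len(t), len(tau))."""
    return psi(t).T @ psi(tau)

rng = np.random.default_rng(3)
a = rng.standard_normal(N)              # coefficients of f = sum_n a_n psi_n

def f(t):
    return a @ psi(t)

# <f, k_tau> in L2([0, 1]) by the trapezoid rule; the endpoint terms vanish
# because every psi_n is zero at t = 0 and t = 1.
grid = np.linspace(0.0, 1.0, 2001)
h = grid[1] - grid[0]
tau = 0.37
inner = h * np.sum(f(grid) * kernel(grid, tau)[:, 0])

print(np.isclose(inner, f(tau)[0]))
```

With finitely many basis functions the sum \(\sum_n\abs{\psi_n(\bftau)}^2\) is trivially finite; the condition only has teeth when the basis is infinite.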

If \(\set{\phi_n}_{n\geq 1}\) is a Riesz basis for \(\calH\), we know that every \(x\in\calH\) can be written \[ x = \sum_{n\geq 1}\alpha_n\phi_n\textsf{ with } \alpha_n\eqdef\dotp{x}{\smash{\widetilde{\phi}_n}} \] where \(\set{\widetilde{\phi}_n}_{n\geq 1}\) is the dual basis.
A (separable) Hilbert space with Riesz basis \(\set{\phi_n}_{n\geq 1}\) is an RKHS with kernel \[ k(\bft,\bftau) = \sum_{n=1}^\infty \phi_n(\bftau)\widetilde{\phi}_n(\bft) \] iff \(\forall \bftau\in\bbR^d\) \(\sum_{n=1}^\infty\abs{\phi_{n}(\bftau)}^2<\infty\)
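The kernel formula \(k(\bft,\bftau)=\sum_n\phi_n(\bftau)\widetilde{\phi}_n(\bft)\) can be illustrated on a toy space, with the caveat that everything specific here is an assumption made for the example: functions are defined on the finite set \(\{0,\dots,m-1\}\) (so they are vectors in \(\bbR^m\)), the inner product is \(\dotp{x}{y}=x^\intercal W y\) for a random positive definite `W`, and the basis is a random non-orthogonal one. In finite dimension the summability condition holds automatically.

```python
import numpy as np

rng = np.random.default_rng(4)
m = 5

# Inner product <x, y> = x^T W y with W symmetric positive definite.
B = rng.standard_normal((m, m))
W = B @ B.T + m * np.eye(m)

# Columns of Phi: a (non-orthogonal) basis phi_1, ..., phi_m.
Phi = rng.standard_normal((m, m)) + 3.0 * np.eye(m)

# Dual basis: <phi_n, phi~_k> = delta_{nk}  <=>  Phi^T W Phi_dual = I.
Phi_dual = np.linalg.solve(Phi.T @ W, np.eye(m))

# Kernel matrix K[t, tau] = sum_n phi_n(tau) * phi~_n(t).
K = Phi_dual @ Phi.T

f = rng.standard_normal(m)
tau = 2
# Reproducing property: <f, k_tau> = f(tau), with k_tau the column K[:, tau].
print(np.isclose(f @ W @ K[:, tau], f[tau]))
```

In this toy setting the kernel matrix works out to \(W^{-1}\) whatever basis is used, which reflects the fact that the kernel is a property of the space and its inner product, not of the basis chosen to compute it.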