# Reproducing Kernel Hilbert Spaces

Monday, October 18, 2021

## Logistics

• Midterm 1 99% graded - Grades announced after curving (don’t panic)
• Assignment 2 solution underway
• Drop date: check!
• More office hours

• Tuesday October 19, 2021 8am-9am on BlueJeans (https://bluejeans.com/205357142)
• Come prepared!
• Midterm 2: scheduled for Wednesday November 3, 2021

• Moved to Monday November 8, 2021 (gives you weekend to prepare)
• Coverage: everything since Midterm 1 (don’t forget the fundamentals though), emphasis on regression

## What’s on the agenda for today?

• Last time: Representer theorem
• Some infinite dimensional regression problems have surprising solutions!
• We can compute the solution as (finite) linear combination of feature vectors
• (We have to wrap up the proof)
• Today:
• Reproducing Kernel Hilbert Spaces
• Justifies the kind of Hilbert spaces where regularized regression can be
• Reading: Romberg, lecture notes 10

## Least-squares in infinite-dimensional Hilbert spaces

• In $\bbR^d$, the problem $\min_{\bftheta\in\bbR^d}\norm{\bfy-\bfX\bftheta}^2 + \lambda\norm{\bftheta}^2$ has solution $\bftheta^* = \bfX^\intercal\bfalpha\textsf{ with } \bfalpha =(\bfX\bfX^\intercal+\lambda\bfI)^{-1}\bfy$. The size of $\bfX\bfX^\intercal\in\bbR^{n\times n}$ does not depend on the dimension $d$! We will be able to extend this to infinite-dimensional Hilbert spaces!

• Let $\calF$ be a Hilbert space and let $f\in\calF$ be the function we are trying to estimate

• We will estimate $f\in\calF$ using noisy observations of $\dotp{f}{x_i}$, where $\set{x_i}_{i=1}^n$ are elements of $\calF$

• This is the equivalent of saying $\bfy = \bfA\bfx+\bfn$ in finite dimension

• $\min_{f\in\calF}\sum_{i=1}^n\abs{y_i-{\dotp{f}{x_i}}_{\calF}}^2+\lambda\norm[\calF]{f}^2$ has solution $f = \sum_{i=1}^n\alpha_i x_i\textsf{ with } \bfalpha = (\matK+\lambda\matI)^{-1}\vecy$, where $\matK=\mat{c}{\dotp{x_i}{x_j}}_{1\leq i,j\leq n}$
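The claim that only the $n\times n$ matrix of inner products matters can be checked numerically. The sketch below is an illustration, not part of the lecture: it assumes NumPy, random data, a squared-norm regularizer as in the $\bbR^d$ problem, and a hypothetical weighted inner product $\dotp{u}{v}_W = u^\intercal W v$ standing in for a general Hilbert space. It compares the primal and dual ridge solutions, then verifies that $f=\sum_i\alpha_i x_i$ zeroes the gradient of the regularized objective.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, lam = 5, 50, 0.1            # fewer observations than dimensions

X = rng.standard_normal((n, d))   # rows are the x_i
y = rng.standard_normal(n)

# Primal form: solve a d x d system.
theta_primal = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Dual form: solve an n x n system, then combine the x_i.
alpha = np.linalg.solve(X @ X.T + lam * np.eye(n), y)
theta_dual = X.T @ alpha

# Same idea with a non-standard inner product <u, v>_W = u^T W v
# (W is an arbitrary symmetric positive definite matrix, chosen for illustration).
B = rng.standard_normal((d, d))
W = B @ B.T + np.eye(d)
K = X @ W @ X.T                   # Gram matrix K_ij = <x_i, x_j>_W
alpha_W = np.linalg.solve(K + lam * np.eye(n), y)
f = X.T @ alpha_W                 # f = sum_i alpha_i x_i

# Gradient of sum_i (y_i - <f, x_i>_W)^2 + lam * ||f||_W^2, evaluated at f.
residual = y - X @ W @ f
grad = -2 * W @ X.T @ residual + 2 * lam * W @ f

print(np.max(np.abs(theta_primal - theta_dual)))
print(np.max(np.abs(grad)))
```

Both printed values should be at the level of floating-point round-off. Note that the second half never uses anything about the $x_i$ beyond their inner products, which is the property that carries over to infinite dimension.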

## The big picture

• For a Hilbert space $\calF$ and $n$ pairs $(x_i,y_i)\in\calF\times \bbR$, we know how to solve the following problem with linear algebra $\min_{f\in\calF}\sum_{i=1}^n\abs{y_i-{\dotp{f}{x_i}}_{\calF}}^2+\lambda\norm[\calF]{f}^2$

• We would really like to solve the following problem for $n$ pairs $(\bfx_i,y_i)\in\bbR^d\times\bbR$ $\min_{f\in\calF}\sum_{i=1}^n\abs{y_i-f(\bfx_i)}^2+\lambda\norm[\calF]{f}^2$

• The question is whether $f(\bfx_i) = {\dotp{f}{x_i}}_{\calF}$ for some $x_i\in\calF$ that is a function of $\bfx_i$

• Can this be done?
• We can choose what $\calF$ is!
• Reproducing Kernel Hilbert Spaces (RKHSs) are specific Hilbert spaces where this happens to be true

• Specifically, an RKHS is a Hilbert space of functions in which the sampling operation is a continuous linear functional
• As usual, we’re throwing definitions at our problem to make progress

## Linear functionals on Hilbert spaces

• In what follows, $\calF$ is a Hilbert space with scalar field $\bbR$

• A functional $F:\calF\to\bbR$ associates a real number to each element of a Hilbert space $\calF$

• Notation can be tricky when the Hilbert space is a space of functions: $F$ can act on a function $f\in\calF$

• Examples

• A functional $F:\calF\to\bbR$ is continuous if $\forall \epsilon>0\exists\delta>0\textsf{ such that } \norm[\calF]{x-y}\leq \delta\Rightarrow \abs{F(x)-F(y)}\leq\epsilon.$

• All norms are continuous functionals
• $F:\calF\to\bbR:x\mapsto\dotp{x}{c}$ for some $c\in\calF$ is continuous
• A functional $F$ is linear if $\forall a,b\in\bbR$ $\forall x,y\in\calF$ $F(ax+by) = aF(x)+bF(y)$.

## Representation of (continuous) linear functionals

• Let $F:\calF\to\bbR$ be a linear functional on an $n$-dimensional Hilbert space $\calF$.

Then there exists $c\in\calF$ such that $F(x)=\dotp{x}{c}$ for every $x\in\calF$
• Linear functionals over finite-dimensional Hilbert spaces are continuous!

• This is not true in infinite dimension

• Let $F:\calF\to\bbR$ be a continuous linear functional on a (possibly infinite-dimensional) separable Hilbert space $\calF$.

Then there exists $c\in\calF$ such that $F(x)=\dotp{x}{c}$ for every $x\in\calF$

• If $\set{\psi_n}_{n\geq 1}$ is an orthobasis for $\calF$, then we can construct $c$ above as $c\eqdef \sum_{n=1}^\infty F(\psi_n)\psi_n$
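The construction of $c$ only requires evaluating $F$ on the basis elements. A minimal finite-dimensional sketch, assuming NumPy, the space $\bbR^6$ with the usual inner product, and a made-up linear functional `F` chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6

def F(x):
    """A linear functional on R^6 (hypothetical example): x[5] - x[0] + 3*x[2]."""
    return float(np.sum(np.diff(x)) + 3 * x[2])

# An arbitrary orthobasis: the columns of Q from a QR factorization.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))

# c = sum_k F(psi_k) psi_k, using only the values of F on the basis.
c = sum(F(Q[:, k]) * Q[:, k] for k in range(n))

# F(x) and <x, c> should agree for any x.
x = rng.standard_normal(n)
print(F(x), x @ c)
```

The vector `c` recovered this way does not depend on which orthobasis was used; here it should match the coefficients $(-1, 0, 3, 0, 0, 1)$ that define `F`.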

## Reproducing Kernel Hilbert Spaces

• An RKHS is a Hilbert space $\calH$ of real-valued functions $f:\bbR^d\to\bbR$ in which the sampling operation $\calS_\bftau:\calH\to\bbR:f\mapsto f(\bftau)$ is continuous for every $\bftau\in\bbR^d$.

In other words, for each $\bftau\in\bbR^d$, there exists $k_\bftau\in\calH$ s.t. $f(\bftau) = {\dotp{f}{k_\bftau}}_\calH\text{ for all } f\in\calH$
• The kernel of an RKHS is $k:\bbR^d\times\bbR^d\to\bbR:(\bft,\bftau)\mapsto k_{\bftau}(\bft)$ where $k_\bftau$ is the element of $\calH$ that defines the sampling at $\bftau$.
• A (separable) Hilbert space of functions with orthobasis $\set{\psi_n}_{n\geq 1}$ is an RKHS iff $\forall \bftau\in\bbR^d$ $\sum_{n=1}^\infty\abs{\psi_{n}(\bftau)}^2<\infty$
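To make the reproducing property concrete, here is a small sketch under assumptions that are not from the lecture: the space is the polynomials of degree less than $N$ on $[-1,1]$ with inner product $\int_{-1}^{1} f(t)g(t)\,dt$, the orthobasis is the normalized Legendre polynomials, and the kernel is built as $k(t,\tau)=\sum_n\psi_n(\tau)\psi_n(t)$ (what the construction $c=\sum_n F(\psi_n)\psi_n$ gives when $F$ is sampling at $\tau$). Integrals are computed with Gauss-Legendre quadrature, which is exact for the polynomial degrees involved.

```python
import numpy as np
from numpy.polynomial import legendre

N = 6  # dimension of the space: polynomials of degree < N on [-1, 1]

def psi(t):
    """Orthonormal Legendre basis evaluated at t; shape (N, len(t))."""
    t = np.atleast_1d(t)
    P = legendre.legvander(t, N - 1)             # P_0..P_{N-1}, shape (len(t), N)
    scale = np.sqrt((2 * np.arange(N) + 1) / 2)  # since integral of P_n^2 is 2/(2n+1)
    return (P * scale).T

def kernel(t, tau):
    """k(t, tau) = sum_n psi_n(tau) psi_n(t); shape (len(tau), len(t))."""
    return psi(tau).T @ psi(t)

# Quadrature rule: exact for polynomial integrands of degree <= 39.
nodes, weights = legendre.leggauss(20)

def inner(f_vals, g_vals):
    """Inner product of two functions given by their values at the nodes."""
    return float(np.sum(weights * f_vals * g_vals))

def f(t):
    return 1 - 2 * t + 0.5 * t**4   # a test element of the space (degree 4 < N)

tau = 0.3
k_tau = kernel(nodes, tau)[0]       # k_tau sampled at the quadrature nodes

print(inner(f(nodes), k_tau), f(tau))               # <f, k_tau> versus f(tau)
print(inner(k_tau, k_tau), kernel(tau, tau)[0, 0])  # ||k_tau||^2 versus sum_n psi_n(tau)^2
```

The second line of output illustrates why the condition $\sum_n\abs{\psi_n(\bftau)}^2<\infty$ appears: that sum is exactly the squared norm of $k_\bftau$. In this finite-dimensional example it is always finite.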

## RKHS and non-orthogonal bases

• If $\set{\phi_n}_{n\geq 1}$ is a Riesz basis for $\calH$, we know that every $x\in\calH$ can be written $x = \sum_{n\geq 1}\alpha_n\phi_n\textsf{ with } \alpha_n\eqdef\dotp{x}{\smash{\widetilde{\phi}_n}}$ where $\set{\widetilde{\phi}_n}_{n\geq 1}$ is the dual basis.

• A (separable) Hilbert space with Riesz basis $\set{\phi_n}_{n\geq 1}$ is an RKHS with kernel $k(\bft,\bftau) = \sum_{n=1}^\infty \phi_n(\bftau)\widetilde{\phi}_n(\bft)$ iff $\forall \bftau\in\bbR^d$ $\sum_{n=1}^\infty\abs{\phi_{n}(\bftau)}^2<\infty$
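The same check can be run with a non-orthogonal basis. This sketch again assumes (for illustration only) the polynomials of degree less than $N$ on $[-1,1]$ with the $L^2$ inner product, now with the monomials $1,t,\dots,t^{N-1}$ as basis; in finite dimension any basis is a Riesz basis. The dual basis is obtained from the inverse Gram matrix, and the kernel is assembled as $\sum_n\phi_n(\tau)\widetilde{\phi}_n(t)$.

```python
import numpy as np
from numpy.polynomial import legendre

N = 4  # polynomials of degree < N on [-1, 1], inner product = integral of f*g

# Gauss-Legendre quadrature, exact for the polynomial degrees used here.
nodes, weights = legendre.leggauss(20)

def phi(t):
    """Monomial basis 1, t, ..., t^(N-1); shape (N, len(t))."""
    t = np.atleast_1d(t)
    return np.vander(t, N, increasing=True).T

# Gram matrix G_mn = <phi_m, phi_n>.
Phi = phi(nodes)
G = (Phi * weights) @ Phi.T

def phi_dual(t):
    """Dual basis: phi~_n = sum_m (G^{-1})_nm phi_m; shape (N, len(t))."""
    return np.linalg.solve(G, phi(t))

def kernel(t, tau):
    """k(t, tau) = sum_n phi_n(tau) phi~_n(t); shape (len(tau), len(t))."""
    return phi(tau).T @ phi_dual(t)

# Biorthogonality: <phi_m, phi~_n> should be 1 if m == n and 0 otherwise.
biorth = (Phi * weights) @ phi_dual(nodes).T

def f(t):
    return 2 - t + 3 * t**3       # a test element of the space

tau = -0.4
k_tau = kernel(nodes, tau)[0]
sampled = float(np.sum(weights * f(nodes) * k_tau))  # <f, k_tau>
print(sampled, f(tau))
```

The two printed numbers should agree up to round-off, showing that the kernel built from a basis and its dual still reproduces point evaluation.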