Stochastic Least Squares

Matthieu Bloch

Tuesday September 06, 2022

Today in ECE 6555

Don't forget
Problem set 1 due Thursday September 8, 2022 on Gradescope

Announcements
- Mathematics of ECE workshops (third session on linear algebra on Wednesday September 07, 2022)
- Office hours today at 1pm (instead of 12pm)
Last time
- Normal equations and geometric approach
Today's plan
- Stochastic least squares
Questions?

Stochastic least squares

We now model \(\vecx\) and \(\vecy\) as dependent random vectors with known statistics

The objective is to form an estimate of \(\vecx\in\bbR^n\) from \(\vecy\in\bbR^m\) as \[ \hat{\vecx} = h(\vecy) \] for some function \(h:\bbR^m\to\bbR^n\)

The least mean square estimator \(h^*\) is \[ h^* = \argmin_{h}\E{(\vecx-h(\vecy))(\vecx-h(\vecy))^T} \] in the sense that for all \(h\) the matrix \(\matP(h)\eqdef\E{(\vecx-h(\vecy))(\vecx-h(\vecy))^T}\) satisfies \[ \forall \veca\in \bbR^n\quad \veca^T \matP(h)\veca\geq\veca^T \matP(h^*)\veca \]

The least mean square estimate of \(\vecx\) given \(\vecy\) is \(\hat{\vecx}=\E{\vecx|\vecy}\)

This requires full knowledge of the joint statistics of \(\vecx\) and \(\vecy\), and the conditional expectation can be hard to compute
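
A short sketch of why the conditional mean is optimal, assuming all second moments are finite: for any \(h\), adding and subtracting \(\E{\vecx|\vecy}\) inside the error gives \[ \E{(\vecx-h(\vecy))(\vecx-h(\vecy))^T} = \E{(\vecx-\E{\vecx|\vecy})(\vecx-\E{\vecx|\vecy})^T} + \E{(\E{\vecx|\vecy}-h(\vecy))(\E{\vecx|\vecy}-h(\vecy))^T} \] The cross terms vanish because, conditioned on \(\vecy\), the vector \(\E{\vecx|\vecy}-h(\vecy)\) is fixed while \(\vecx-\E{\vecx|\vecy}\) has zero conditional mean. The second term on the right is positive semidefinite and equals zero for \(h(\vecy)=\E{\vecx|\vecy}\), which yields the inequality \(\veca^T\matP(h)\veca\geq\veca^T\matP(h^*)\veca\) for every \(\veca\in\bbR^n\).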

Multivariate Gaussian distribution

Multidimensional Gaussian random variables are essential to many engineering problems

For \(\vecx\in\bbR^n\) and \(\vecy\in\bbR^m\), a real multivariate Gaussian distribution is defined by a probability density function of the form \[ p(\vecx,\vecy)\eqdef \frac{1}{\sqrt{(2\pi)^{n+m}\det \matR}}\exp\left(-\frac{1}{2}\left[\begin{array}{cc}(\vecx-\vecmu_\vecx)^T&(\vecy-\vecmu_\vecy)^T\end{array}\right]\matR^{-1}\left[\begin{array}{c}(\vecx-\vecmu_\vecx)\\(\vecy-\vecmu_\vecy)\end{array}\right]\right) \] where \(\vecmu_\vecx\eqdef \E{\vecx}\), \(\vecmu_\vecy\eqdef \E{\vecy}\) and \[ \matR \eqdef \left[\begin{array}{cc}\matR_\vecx&\matR_{\vecx\vecy}\\\matR_{\vecy\vecx}&\matR_{\vecy}\end{array}\right]\quad \textsf{ with }\quad \begin{array}{l}\matR_\vecx=\E{(\vecx-\vecmu_\vecx)(\vecx-\vecmu_\vecx)^T}\\ \matR_{\vecx\vecy}=\E{(\vecx-\vecmu_\vecx)(\vecy-\vecmu_\vecy)^T}=\matR_{\vecy\vecx}^T\\ \matR_\vecy=\E{(\vecy-\vecmu_\vecy)(\vecy-\vecmu_\vecy)^T}\end{array} \]
Only two parameters are needed to specify the distribution: a mean vector and a covariance matrix
- Things are slightly more subtle with complex Gaussian random vectors (Homework)
- Need to be a bit more careful to deal with singular covariance matrices (Homework)
Note: Make your life easier and center the random vectors
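
As a rough numerical companion to the definition, the sketch below evaluates the density of a real Gaussian vector with a nonsingular covariance matrix. The function name `gaussian_pdf` and the covariance values are illustrative assumptions, not part of the lecture; the stacked vector `z` plays the role of \((\vecx,\vecy)\).

```python
import numpy as np

def gaussian_pdf(z, mu, R):
    """Density of a real Gaussian vector with mean mu and nonsingular covariance R."""
    z = np.asarray(z, dtype=float)
    mu = np.asarray(mu, dtype=float)
    R = np.asarray(R, dtype=float)
    d = z - mu
    k = d.size
    # Quadratic form d^T R^{-1} d, computed by solving R w = d
    # rather than forming the inverse explicitly.
    quad = d @ np.linalg.solve(R, d)
    # slogdet returns (sign, log|det R|) and avoids overflow in det.
    sign, logdet = np.linalg.slogdet(R)
    if sign <= 0:
        raise ValueError("covariance matrix must have a positive determinant")
    log_p = -0.5 * (k * np.log(2.0 * np.pi) + logdet + quad)
    return float(np.exp(log_p))

# Illustrative joint covariance of (x, y) with n = m = 1 (assumed values).
R = np.array([[2.0, 0.8],
              [0.8, 1.0]])
mu = np.zeros(2)
p_origin = gaussian_pdf([0.0, 0.0], mu, R)
```

Working in the log domain is a design choice for numerical stability; the singular-covariance case mentioned above is deliberately not handled here.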

LMSE for Multivariate Gaussian

Assume \(\vecx\in\bbR^n\) and \(\vecy\in\bbR^m\) follow a multivariate Gaussian distribution with nonsingular covariance matrix

For centered (zero-mean) \(\vecx\) and \(\vecy\), the least mean square estimate of \(\vecx\) given \(\vecy\) is \[ \E{\vecx|\vecy} = \matR_{\vecx\vecy}\matR_{\vecy}^{-1}\vecy \]

The LMSE is linear!
The LMSE for multivariate Gaussian only requires knowledge of mean vectors and covariance matrix
Note: what happens when the covariance matrix is singular? (Homework)
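
The linear form of the Gaussian estimator can be checked empirically. The following is a minimal Monte Carlo sketch, assuming zero-mean vectors and a randomly generated positive definite joint covariance (the dimensions, seed, and sample size are arbitrary choices). It compares the sample error covariance of \(\hat{\vecx}=\matR_{\vecx\vecy}\matR_\vecy^{-1}\vecy\) with \(\matR_\vecx-\matR_{\vecx\vecy}\matR_\vecy^{-1}\matR_{\vecy\vecx}\).

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative zero-mean joint covariance for x in R^2 and y in R^3 (assumed values).
n, m = 2, 3
A = rng.standard_normal((n + m, n + m))
R = A @ A.T + 0.5 * np.eye(n + m)   # symmetric positive definite by construction
R_x, R_xy = R[:n, :n], R[:n, n:]
R_yx, R_y = R[n:, :n], R[n:, n:]

# K = R_xy R_y^{-1}, obtained by solving R_y K^T = R_yx (R_y is symmetric).
K = np.linalg.solve(R_y, R_yx).T

# Draw joint samples (one per row) and apply the estimator x_hat = K y.
z = rng.multivariate_normal(np.zeros(n + m), R, size=200_000)
x, y = z[:, :n], z[:, n:]
err = x - y @ K.T

# Sample error covariance versus the closed-form expression.
P_empirical = err.T @ err / len(err)
P_theory = R_x - K @ R_yx
```

With this many samples, `P_empirical` and `P_theory` should agree up to Monte Carlo fluctuations.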

Linear LMSE

We now restrict ourselves to linear estimates of the form \(\hat{\vecx}=\matK_0\vecy\) (how restrictive is this?)
Unless otherwise specified, assume that all random vectors are zero mean, i.e., \(\E{\vecx}=\boldsymbol{0}\), etc.

Assume \(\vecx\in\bbR^n\) is to be estimated from \(p\) observations \(\set{\vecy_i}_{i=1}^p\) with \(\vecy_i\in\bbR^m\). Define the stacked observation vector \(\vecy\in\bbR^{mp}\) by \[ \vecy^T = \left[\begin{array}{ccc}\vecy_1^T&\cdots&\vecy_p^T\end{array}\right] \]

The linear least mean square estimate (LLMSE) of \(\vecx\) given \(\set{\vecy_i}_{i=1}^p\) is given by any solution of the normal equation \[ \matK_0\matR_\vecy = \matR_{\vecx\vecy} \]

The corresponding error covariance matrix is \(\matP(\matK_0)=\matR_\vecx-\matK_0\matR_{\vecy\vecx}\)
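
A short sketch of why any solution of the normal equation is optimal among linear estimates, assuming zero-mean vectors: for an arbitrary \(\matK\in\bbR^{n\times mp}\), expanding the error covariance gives \[ \E{(\vecx-\matK\vecy)(\vecx-\matK\vecy)^T} = \matR_\vecx-\matK\matR_{\vecy\vecx}-\matR_{\vecx\vecy}\matK^T+\matK\matR_\vecy\matK^T \] Writing \(\matK=\matK_0+(\matK-\matK_0)\) and using \(\matK_0\matR_\vecy=\matR_{\vecx\vecy}\) cancels the cross terms and leaves \[ \E{(\vecx-\matK\vecy)(\vecx-\matK\vecy)^T} = \matR_\vecx-\matK_0\matR_{\vecy\vecx}+(\matK-\matK_0)\matR_\vecy(\matK-\matK_0)^T \] Since \(\matR_\vecy\) is positive semidefinite, so is the last term, and no linear estimate achieves a smaller error covariance than \(\matK_0\vecy\).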

We have performed sensor fusion: we have optimally combined observations from multiple sensors
What happens if random vectors are not centered?
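
To make the sensor fusion remark concrete, here is a minimal sketch under a hypothetical model that is not in the lecture: a scalar zero-mean \(\vecx\) observed by \(p=3\) sensors as \(\vecy_i=\vecx+v_i\), with independent zero-mean noises \(v_i\). The variances are arbitrary illustrative values, and `np.linalg.solve` assumes \(\matR_\vecy\) is nonsingular.

```python
import numpy as np

# Hypothetical setup: scalar x with variance var_x, observed by p sensors
# y_i = x + v_i with independent zero-mean noises of variance var_v[i].
var_x = 4.0
var_v = np.array([1.0, 0.25, 2.0])
p = var_v.size

# Second-order statistics of the stacked observation y = [y_1, ..., y_p]^T.
R_x = np.array([[var_x]])
R_xy = var_x * np.ones((1, p))                   # E[x y^T]
R_y = var_x * np.ones((p, p)) + np.diag(var_v)   # E[y y^T]

# Solve the normal equation K0 R_y = R_xy, i.e. R_y K0^T = R_yx.
K0 = np.linalg.solve(R_y, R_xy.T).T

# Error covariance P(K0) = R_x - K0 R_yx.
P = R_x - K0 @ R_xy.T
```

In this model the weights in `K0` are proportional to `1 / var_v`, so less noisy sensors receive more weight, and the fused error variance `P` is smaller than the error variance obtained from any single sensor.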