Stochastic Least Squares

Matthieu Bloch

Tuesday September 06, 2022

Today in ECE 6555

  • Don't forget

    • Problem set 1 due Thursday September 8, 2022 on Gradescope
  • Announcements

    • Mathematics of ECE workshops (third session on linear algebra on Wednesday September 07, 2022)
    • Office hours today at 1pm (instead of 12pm)
  • Last time

    • Normal equations and geometric approach
  • Today's plan

    • Stochastic least squares

  • Questions?

Stochastic least squares

  • We now model \(\vecx\) and \(\vecy\) as dependent random vectors with known statistics
  • The objective is to form an estimate of \(\vecx\in\bbR^n\) from \(\vecy\in\bbR^m\) as \[ \hat{\vecx} = h(\vecy) \] for some function \(h:\bbR^m\to\bbR^n\)

The least mean square estimator \(h^*\) is \[ h^* = \argmin_{h}\E{(\vecx-h(\vecy))(\vecx-h(\vecy))^T} \] in the sense that, for all \(h\), the error covariance matrix \(\matP(h)\eqdef\E{(\vecx-h(\vecy))(\vecx-h(\vecy))^T}\) satisfies \[ \forall \veca\in \bbR^n\quad \veca^T \matP(h)\veca\geq\veca^T \matP(h^*)\veca \]

The least mean square estimate of \(\vecx\) given \(\vecy\) is \(\hat{\vecx}=\E{\vecx|\vecy}\)

  • This requires full knowledge of the joint statistics of \(\vecx\) and \(\vecy\), and the conditional expectation can be hard to compute
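
A short justification of the optimality of the conditional mean, sketched here under the assumption that all second moments are finite. For any estimator \(h\), split the error as \(\vecx-h(\vecy)=(\vecx-\E{\vecx|\vecy})+(\E{\vecx|\vecy}-h(\vecy))\). Conditioned on \(\vecy\), the second term is deterministic and the first has zero mean, so the cross terms vanish and \[ \E{(\vecx-h(\vecy))(\vecx-h(\vecy))^T} = \E{(\vecx-\E{\vecx|\vecy})(\vecx-\E{\vecx|\vecy})^T} + \E{(\E{\vecx|\vecy}-h(\vecy))(\E{\vecx|\vecy}-h(\vecy))^T} \] The last matrix is positive semidefinite and is zero when \(h(\vecy)=\E{\vecx|\vecy}\), which gives the inequality \(\veca^T\matP(h)\veca\geq\veca^T\matP(h^*)\veca\) for every \(\veca\in\bbR^n\).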

Multivariate Gaussian distribution

  • Multidimensional Gaussian random variables are essential to many engineering problems

    For \(\vecx\in\bbR^n\) and \(\vecy\in\bbR^m\), a real multivariate Gaussian distribution is defined by a probability density function of the form \[ p(\vecx,\vecy)\eqdef \frac{1}{(2\pi)^{(n+m)/2}\sqrt{\det \matR}}\exp\left(-\frac{1}{2}\left[\begin{array}{cc}(\vecx-\vecmu_\vecx)^T&(\vecy-\vecmu_\vecy)^T\end{array}\right]\matR^{-1}\left[\begin{array}{c}(\vecx-\vecmu_\vecx)\\(\vecy-\vecmu_\vecy)\end{array}\right]\right) \] where \(\vecmu_\vecx\eqdef \E{\vecx}\), \(\vecmu_\vecy\eqdef \E{\vecy}\) and \[ \matR \eqdef \left[\begin{array}{cc}\matR_\vecx&\matR_{\vecx\vecy}\\\matR_{\vecy\vecx}&\matR_{\vecy}\end{array}\right]\quad \textsf{ with }\quad \begin{array}{l}\matR_\vecx=\E{(\vecx-\vecmu_\vecx)(\vecx-\vecmu_\vecx)^T}\\ \matR_{\vecx\vecy}=\E{(\vecx-\vecmu_\vecx)(\vecy-\vecmu_\vecy)^T}=\matR_{\vecy\vecx}^T\\ \matR_\vecy=\E{(\vecy-\vecmu_\vecy)(\vecy-\vecmu_\vecy)^T}\end{array} \]

  • Only two parameters are needed to specify the distribution: a mean vector and a covariance matrix

    • Things are slightly more subtle with complex Gaussian random vectors (Homework)
    • Need to be a bit more careful to deal with singular covariance matrices (Homework)
  • Note: Make your life easier and center the random vectors

LMSE for Multivariate Gaussian

Assume \(\vecx\in\bbR^n\) and \(\vecy\in\bbR^m\) are zero mean and follow a multivariate Gaussian distribution with nonsingular covariance matrix

The least mean square estimate of \(\vecx\) given \(\vecy\) is \[ \E{\vecx|\vecy} = \matR_{\vecx\vecy}\matR_{\vecy}^{-1}\vecy \]

  • The LMSE is linear!

  • The LMSE for multivariate Gaussian only requires knowledge of mean vectors and covariance matrix

  • Note: what happens when the covariance matrix is singular? (Homework)
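
The linear form of the Gaussian LMSE can be checked numerically. The following is a minimal sketch, not part of the original slides: the dimensions, the randomly generated covariance matrix, the sample size, and all variable names (`K`, `P_empirical`, `P_theory`) are illustrative assumptions. It draws zero-mean jointly Gaussian samples, applies \(\matR_{\vecx\vecy}\matR_{\vecy}^{-1}\vecy\), and compares the empirical error covariance with \(\matR_\vecx-\matR_{\vecx\vecy}\matR_{\vecy}^{-1}\matR_{\vecy\vecx}\).

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 2, 3  # dimensions of x and y (arbitrary illustrative choices)

# Build a random symmetric positive definite joint covariance R (assumption).
A = rng.standard_normal((n + m, n + m))
R = A @ A.T + 0.5 * np.eye(n + m)
R_x, R_xy = R[:n, :n], R[:n, n:]
R_yx, R_y = R[n:, :n], R[n:, n:]

# Gain K = R_xy R_y^{-1}, obtained by solving R_y K^T = R_yx (R_y is symmetric).
K = np.linalg.solve(R_y, R_yx).T

# Zero-mean jointly Gaussian samples; each row is one realization of [x; y].
samples = rng.multivariate_normal(np.zeros(n + m), R, size=200_000)
x, y = samples[:, :n], samples[:, n:]

# Apply the linear estimator to every sample and measure the error covariance.
x_hat = y @ K.T
err = x - x_hat
P_empirical = err.T @ err / len(err)
P_theory = R_x - K @ R_yx

print(np.round(P_empirical, 3))
print(np.round(P_theory, 3))
```

The two printed matrices should agree up to Monte Carlo error. Using `np.linalg.solve` instead of forming an explicit inverse is a common numerical choice, not something prescribed by the lecture.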

Linear LMSE

  • We now restrict ourselves to linear estimates of the form \(\hat{\vecx}=\matK_0\vecy\) (how restrictive is this?)
  • Unless otherwise specified, assume that all random vectors are zero mean, i.e., \(\E{\vecx}=\boldsymbol{0}\), etc.

Assume \(\vecx\in\bbR^n\) is to be estimated from \(p\) observations \(\set{\vecy_i}_{i=1}^p\) with \(\vecy_i\in\bbR^m\). Define the stacked vector \(\vecy\in\bbR^{mp}\) by \[ \vecy^T = \left[\begin{array}{ccc}\vecy_1^T&\cdots&\vecy_p^T\end{array}\right] \]

The linear least mean square estimate (LLMSE) of \(\vecx\) given \(\set{\vecy_i}_{i=1}^p\) is given by any solution of the normal equation \[ \matK_0\matR_\vecy = \matR_{\vecx\vecy} \]

The corresponding error covariance matrix is \(\matP(\matK_0)=\matR_\vecx-\matK_0\matR_{\vecy\vecx}\)
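
One way to see why the normal equation characterizes the optimum, sketched here for zero-mean vectors: for any linear estimate \(\matK\vecy\), the error covariance is \[ \E{(\vecx-\matK\vecy)(\vecx-\matK\vecy)^T} = \matR_\vecx-\matK\matR_{\vecy\vecx}-\matR_{\vecx\vecy}\matK^T+\matK\matR_\vecy\matK^T \] If \(\matK_0\) satisfies \(\matK_0\matR_\vecy=\matR_{\vecx\vecy}\), completing the square gives \[ \E{(\vecx-\matK\vecy)(\vecx-\matK\vecy)^T} = \left(\matR_\vecx-\matK_0\matR_{\vecy\vecx}\right) + (\matK-\matK_0)\matR_\vecy(\matK-\matK_0)^T \] The second term is positive semidefinite and vanishes at \(\matK=\matK_0\), so no linear estimate has a smaller error covariance than \(\matR_\vecx-\matK_0\matR_{\vecy\vecx}\).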

  • We have performed sensor fusion: we have optimally combined observations from multiple sensors

  • What happens if random vectors are not centered?
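
The sensor fusion interpretation can be illustrated with a small numerical sketch. The measurement model below is an assumption introduced for illustration and does not appear in the slides: a zero-mean scalar \(x\) with variance `sigma_x2` is observed by \(p\) sensors as \(y_i = x + v_i\), with independent zero-mean noises of variance `noise_var[i]`. Under this model the solution of the normal equation reduces to inverse-variance weighting, which the code checks.

```python
import numpy as np

sigma_x2 = 4.0                         # prior variance of x (assumed value)
noise_var = np.array([1.0, 0.5, 2.0])  # noise variance of each sensor (assumed)
p = len(noise_var)

ones = np.ones((p, 1))
# Second-order statistics implied by the assumed model y_i = x + v_i.
R_y = sigma_x2 * (ones @ ones.T) + np.diag(noise_var)  # covariance of stacked y
R_xy = sigma_x2 * ones.T                                # cross-covariance, 1 x p

# Solve the normal equation K0 R_y = R_xy, i.e. R_y K0^T = R_xy^T (R_y symmetric).
K0 = np.linalg.solve(R_y, R_xy.T).T
P = sigma_x2 - (K0 @ R_xy.T).item()  # error variance R_x - K0 R_yx

# Closed form for this model: weights proportional to 1 / noise variance.
precision = 1.0 / sigma_x2 + np.sum(1.0 / noise_var)
K0_closed = (1.0 / noise_var) / precision
P_closed = 1.0 / precision

print("fusion weights:", K0.ravel())
print("error variance:", P)
```

In this sketch the less noisy sensors receive larger weights, and the fused error variance is smaller than the noise variance of the best individual sensor.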