Linear Models

Matthieu Bloch

Thursday September 08, 2022

Today in ECE 6555

  • Don't forget
  • Problem set 1 due today Thursday September 8, 2022 on Gradescope
  • Announcements
    • Mathematics of ECE workshops (fourth session on linear algebra on Friday September 09, 2022)
    • Recordings available on mediaspace
  • Last time
    • Stochastic least squares
    • Tower property of expectation
  • Today's plan
    • Stochastic least squares with linear estimators
  • Questions?

Linear LMSE

  • We now restrict ourselves to linear estimates of the form \(\hat{\vecx}=\matK_0\vecy\) (how restrictive is this?)
  • Unless otherwise specified, assume that all random vectors are zero mean, i.e., \(\E{\vecx}=\boldsymbol{0}\), etc.

Assume \(\vecx\in\bbR^n\) is to be estimated from \(p\) observations \(\set{\vecy_i}_{i=1}^p\) with \(\vecy_i\in\bbR^m\). Define the stacked observation vector \(\vecy\in\bbR^{mp}\) by \[ \vecy^T = \left[\begin{array}{ccc}\vecy_1^T&\cdots&\vecy_p^T\end{array}\right] \]

The linear least mean square estimate (LLMSE) of \(\vecx\) given \(\set{\vecy_i}_{i=1}^p\) is given by any solution of the normal equation \[ \matK_0\matR_\vecy = \matR_{\vecx\vecy} \]

The corresponding error covariance matrix is \(P(\matK_0)=\matR_\vecx-\matK_0\matR_{\vecy\vecx}\)

  • We have performed sensor fusion: we have optimally combined observations from multiple sensors

  • What happens if random vectors are not centered?
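The normal equation above can be checked numerically. The following is a minimal sketch (not from the notes): it draws samples from a synthetic model with two stacked sensors, estimates the covariances \(\matR_\vecy\) and \(\matR_{\vecx\vecy}\) empirically, solves \(\matK_0\matR_\vecy = \matR_{\vecx\vecy}\), and verifies that the resulting error is (nearly) uncorrelated with the observations. All dimensions and the mixing matrix are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, p, N = 2, 3, 2, 200_000            # dims and number of Monte Carlo samples

x = rng.standard_normal((N, n))          # zero-mean x
H = rng.standard_normal((p * m, n))      # arbitrary mixing into the p stacked sensors
y = x @ H.T + 0.5 * rng.standard_normal((N, p * m))   # stacked y = [y_1; y_2]

R_y = y.T @ y / N                        # sample covariance E[y y^T]
R_xy = x.T @ y / N                       # cross-covariance E[x y^T]

# Solve the normal equation K0 R_y = R_xy (i.e., K0 = R_xy R_y^{-1})
K0 = np.linalg.solve(R_y.T, R_xy.T).T
x_hat = y @ K0.T

# Orthogonality condition: E[(x - K0 y) y^T] should be (close to) zero
err_corr = (x - x_hat).T @ y / N
print(np.abs(err_corr).max())            # small, up to sampling error
```

The `solve` call avoids forming \(\matR_\vecy^{-1}\) explicitly, which is the numerically preferable way to apply the normal equation.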

Geometric view

  • The solution of the LLMSE is \(\matK_0\) solution of \(\matK_0\matR_\vecy=\matR_{\vecx\vecy}\), i.e. \[ \E{(\vecx-\matK_0\vecy)\vecy^T} = \boldsymbol{0} \]

  • Can this be interpreted again as an inner product?

    The LLMSE of \(\vecx\) given \(\set{\vecy_i}_{i=1}^p\) is the projection of \(\vecx\) onto the linear space spanned by \(\set{\vecy_i}_{i=1}^p\)
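One way to make the projection statement precise (a sketch, using the inner product on zero-mean scalar random variables): \[ \langle u, v\rangle \triangleq \E{uv}, \qquad \E{(\vecx-\matK_0\vecy)\vecy^T} = \boldsymbol{0} \;\Longleftrightarrow\; \langle x_i - (\matK_0\vecy)_i,\, y_j\rangle = 0 \quad \forall i,j. \] Each entry of \(\hat{\vecx}=\matK_0\vecy\) lies in the span of the entries of \(\vecy\), and the normal equation says the error is orthogonal to that span, so \(\hat{\vecx}\) is the orthogonal projection of \(\vecx\) onto it.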

Linear models

  • We cannot say much more about the LLMSE without additional assumptions

  • Fortunately, many engineering problems impose more structure between \(\vecy\) and \(\vecx\)

  • A linear model is one in which \(\vecy=\matH\vecx+\vecv\) where:
    • \(\matH\in\bbR^{m\times n}\)
    • \(\vecv\) is zero mean and uncorrelated with \(\vecx\)
  • The LLMSE of \(\vecx\) given \(\vecy\) in a linear model (assuming \(\matR_\vecx\) and \(\matR_\vecv\) non singular) is \(\hat{\vecx}=\matK_0\vecy\) with \[ \matK_0=\matR_\vecx\matH^T(\matH\matR_\vecx\matH^T + \matR_\vecv)^{-1} = (\matR_\vecx^{-1}+\matH^T\matR_\vecv^{-1}\matH)^{-1}\matH^T\matR_\vecv^{-1} \] \[ \matP_\vecx = \matR_\vecx -\matR_\vecx\matH^T(\matH\matR_\vecx\matH^T+\matR_\vecv)^{-1}\matH\matR_\vecx = (\matR_\vecx^{-1}+\matH^T\matR_\vecv^{-1}\matH)^{-1} \]

  • In particular note that \(\matP_{\vecx}^{-1}\hat{\vecx} = \matH^T\matR_\vecv^{-1}\vecy\) (independent of \(\matR_\vecx\))
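The two expressions for \(\matK_0\) and for \(\matP_\vecx\) are related by the matrix inversion lemma, and the final identity follows from \(\matK_0 = \matP_\vecx\matH^T\matR_\vecv^{-1}\). A minimal sketch (with arbitrary SPD covariances, not from the notes) checking all three facts numerically:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 3, 5
H = rng.standard_normal((m, n))
A = rng.standard_normal((n, n)); R_x = A @ A.T + n * np.eye(n)   # SPD covariance of x
B = rng.standard_normal((m, m)); R_v = B @ B.T + m * np.eye(m)   # SPD covariance of v

inv = np.linalg.inv
K0_cov = R_x @ H.T @ inv(H @ R_x @ H.T + R_v)                    # "covariance" form
K0_inf = inv(inv(R_x) + H.T @ inv(R_v) @ H) @ H.T @ inv(R_v)     # "information" form
assert np.allclose(K0_cov, K0_inf)

P_cov = R_x - R_x @ H.T @ inv(H @ R_x @ H.T + R_v) @ H @ R_x
P_inf = inv(inv(R_x) + H.T @ inv(R_v) @ H)
assert np.allclose(P_cov, P_inf)

# Information form of the estimate: P_x^{-1} xhat = H^T R_v^{-1} y
y = rng.standard_normal(m)
x_hat = K0_cov @ y
assert np.allclose(inv(P_cov) @ x_hat, H.T @ inv(R_v) @ y)
```

The "information" form is useful when \(m \gg n\): it inverts an \(n\times n\) matrix instead of an \(m\times m\) one.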