The Mathematics of ECE

Probabilities - Estimation

Wednesday, September 08, 2021

Probabilities: roadmap

  • Last time: review of foundations and concentration of measure
    • Key concepts: Conditional distributions, conditional expectation
    • Key results: Chebyshev, Hoeffding
    • Useful in signal processing, information theory, machine learning
  • Today: estimation - given \(y\), what is \(x\)?
    • Key concepts: conditional expectation, minimum mean square estimation
    • Key result: orthogonality principle
    • Useful in signal processing, robotics, machine learning

Estimation

  • Assume that \(x\in\bbR^p\) and \(y\in\bbR^q\) are dependent random vectors with known \(p_{xy}\)
    • cardinal sin: we are using lowercase for random vectors
    • This is very typical in controls and robotics
  • Objective: estimate \(x\) from \(y\)
    • Think \(y=x+\text{noise}\), a sensor measurement
    • We want to form \(\hat{x}=h(y)\) with some estimator \(h:\bbR^q\to\bbR^p\)
    • We need a measure of performance
  • The least mean square (LMS) estimator \(h^*\) is \[ h^*=\argmin_h\underbrace{\E{(x-h(y))(x-h(y))^\intercal}}_{\eqdef P(h)\text{, a matrix}} \] in the sense that \(\forall h\), \(\forall a\in\bbR^p\), \(a^\intercal P(h) a \geq a^\intercal P(h^*) a\). \(\hat{x}\eqdef h^*(y)\) is called the minimum mean square estimate (MMSE) of \(x\) given \(y\).
  • Make sure you understand where the matrices and vectors are!
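  • To see where the matrices and vectors live (a short unpacking, not on the original slide): write \(P(h)\eqdef\E{(x-h(y))(x-h(y))^\intercal}\in\bbR^{p\times p}\) for the underbraced error covariance matrix. Choosing \(a=e_i\), the \(i\)-th standard basis vector, shows that \(h^*\) minimizes every component's mean square error simultaneously, \[ e_i^\intercal P(h)e_i=\E{(x_i-h_i(y))^2}\geq\E{(x_i-h_i^*(y))^2}, \] and summing over \(i\) (taking the trace) recovers the scalar criterion \[ \operatorname{tr}P(h)=\E{\|x-h(y)\|^2}\geq\E{\|x-h^*(y)\|^2}. \]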

LMS solution

  • The LMS estimate of \(x\) given \(y\) is \(\hat{x}=\E{x|y}\).
  • This is nice; it looks simple… but we need to know \(p_{xy}\)

  • That could be hard in practice

  • Consider jointly distributed, real-valued, zero-mean Gaussian random vectors \(x\) and \(y\) with nonsingular covariance matrix \[ R\eqdef\mat{cc}{R_x&R_{xy}\\R_{yx}&R_y}\text{ where } R_x\eqdef\E{xx^\intercal}, R_y\eqdef\E{yy^\intercal}, R_{xy}=R_{yx}^\intercal\eqdef\E{xy^\intercal} \] The MMSE estimate is \(\hat{x} = R_{xy}R_y^{-1}y\) and is linear in \(y\)
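  • A minimal numerical sketch (not from the lecture; the dimensions, covariance blocks, and variable names below are illustrative choices) to check \(\hat{x}=R_{xy}R_y^{-1}y\) against Monte Carlo samples, comparing the empirical error covariance with \(R_x-R_{xy}R_y^{-1}R_{yx}\) (the \(P(K_0)\) formula of the next section):

```python
import numpy as np

rng = np.random.default_rng(0)
p, q, N = 2, 3, 200_000                 # dim(x), dim(y), number of Monte Carlo samples

# Build a valid joint covariance R with blocks R_x, R_xy, R_y (illustrative choice).
A = rng.standard_normal((p + q, p + q))
R = A @ A.T + 0.1 * np.eye(p + q)
R_x, R_xy, R_y = R[:p, :p], R[:p, p:], R[p:, p:]

# Draw jointly Gaussian zero-mean samples and split them into x and y.
z = rng.multivariate_normal(np.zeros(p + q), R, size=N)
x, y = z[:, :p], z[:, p:]

# MMSE estimate in the Gaussian case: x_hat = R_xy R_y^{-1} y (applied row-wise).
K = R_xy @ np.linalg.inv(R_y)
x_hat = y @ K.T

# Empirical error covariance vs. the closed form R_x - R_xy R_y^{-1} R_yx.
err = x - x_hat
print(np.round(err.T @ err / N, 3))                             # empirical
print(np.round(R_x - R_xy @ np.linalg.inv(R_y) @ R_xy.T, 3))    # theoretical
```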

Linear least mean square estimate

  • For simplicity, we might want to restrict ourselves to linear estimates of the form \(\hat{x}\eqdef K_0 y\).

  • Assume \(x\in\bbR^p\), \(y\in\bbR^q\) are zero-mean random vectors. The LLMS estimate of \(x\) given \(y\) is of the form \(\hat{x}=K_0y\) with \(K_0\) a solution of the normal equation \[ K_0R_y = R_{xy}\text{ where }R_y = \E{yy^\intercal}, R_{xy}=\E{xy^\intercal} \] The corresponding error covariance matrix is \(P(K_0)=R_x-K_0R_{yx}\)
  • If \(R_y>0\), we have \(K_0=R_{xy}R_y^{-1}\)

  • We only need to know the second-order statistics of \(x\) and \(y\) (see the sketch after this list)

  • Question: how do we deal with non-zero mean?
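  • A minimal sketch of both points above (the data-generating model, variable names, and the centering step are my own illustrative assumptions, not from the slides): \(K_0\) is computed from sample second-order statistics only by solving the normal equation \(K_0R_y=R_{xy}\), and one common way to handle non-zero means is to center the data and add the mean of \(x\) back, giving an affine estimate:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 100_000

# Hypothetical non-zero-mean model: y = H x + noise (illustrative only).
mu_x = np.array([1.0, -2.0])
x = mu_x + rng.standard_normal((N, 2))
H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = x @ H.T + 0.5 * rng.standard_normal((N, 3))

# Center the data: the LLMS result above assumes zero-mean vectors.
xc, yc = x - x.mean(axis=0), y - y.mean(axis=0)

# Sample second-order statistics R_xy = E[x y^T] and R_y = E[y y^T].
R_xy = xc.T @ yc / N
R_y = yc.T @ yc / N

# Normal equation K_0 R_y = R_xy; with R_y symmetric this reads R_y K_0^T = R_xy^T.
K0 = np.linalg.solve(R_y, R_xy.T).T

# Affine estimate: estimate the centered part, then add the mean of x back.
x_hat = x.mean(axis=0) + yc @ K0.T
print(np.mean(np.sum((x - x_hat) ** 2, axis=1)))   # empirical mean square error
```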

Geometric view

  • The LLMS solution is such that \(K_0R_y = R_{xy}\), equivalently \[ \E{(x-K_0y)y^\intercal} = 0 \]

  • This could be viewed as an orthogonality condition (linear algebra!!!!)

  • For centered (zero-mean) random vectors, define \(\dotp{x}{y}\eqdef\E{xy^\intercal}\)

    • This is linear
    • This is symmetric
    • This is positive
  • The LLMS estimate of \(x\) given \(y\) is characterized by the fact that the error \(\tilde{x}\eqdef x-\hat{x}\) is orthogonal (uncorrelated) to the observation \(y\). Equivalently, the LLMS estimate is the projection of \(x\) onto the linear space spanned by \(y\).
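  • As a consequence (a step not spelled out on the slides, but immediate from the orthogonality above): since \(\E{\tilde{x}y^\intercal}=\E{(x-K_0y)y^\intercal}=0\), \[ P(K_0)=\E{\tilde{x}\tilde{x}^\intercal}=\E{\tilde{x}(x-K_0y)^\intercal}=\E{\tilde{x}x^\intercal}-\underbrace{\E{\tilde{x}y^\intercal}}_{=0}K_0^\intercal=\E{(x-K_0y)x^\intercal}=R_x-K_0R_{yx}, \] which is exactly the error covariance formula quoted in the previous section.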