Matthieu Bloch
Tuesday, September 20, 2022
Smoothing: estimate \(x_i\) from \(\set{y_j}_{j=0}^m\), \(m>i\) (using past, present and future observations) as \[ \hat{x}_{i|m} \eqdef \sum_{j=0}^{m} k_{i,j}y_j \]
Causal filtering: estimate \(x_i\) from \(\set{y_j}_{j=0}^{i}\) (using past and present observations) as \[ \hat{x}_{i|i} \eqdef \sum_{j=0}^{i} k_{i,j}y_j \]
Prediction: estimate \(x_{i+\ell}\) from \(\set{y_j}_{j=0}^{i}\), \(\ell\geq 1\) (using past observations) as \[ \hat{x}_{i+\ell|i} \eqdef \sum_{j=0}^{i} k_{i,j}y_j \]
In all cases we want the estimate to be optimal, i.e., to minimize the error covariance matrix
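To make the criterion explicit (a standard restatement, with \(\mathbb{E}\) denoting expectation and \(\tilde{x}\eqdef x-\hat{x}\) the estimation error in each of the three cases): optimality means choosing the coefficients \(k_{i,j}\) so that the error covariance is smallest in the positive semidefinite order, \[ \mathbb{E}\big[\tilde{x}\tilde{x}^T\big] \preceq \mathbb{E}\Big[\big(x-\textstyle\sum_j c_j y_j\big)\big(x-\textstyle\sum_j c_j y_j\big)^T\Big] \quad\text{for every choice of coefficients } c_j, \] where the sum ranges over the same observation indices; in particular this minimizes the mean-square error \(\mathbb{E}\big[\norm{\tilde{x}}^2\big]\).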
Let's put what we've learned to work: geometry!
Smoothing reduces to solving the normal equations: for \(\matR_\vecy\succ 0\), \[ \hat{\vecx}_{s} = \matR_{\vecx\vecy}\matR_{\vecy}^{-1}\vecy \] where \[ \hat{\vecx}_{s}\eqdef\left[\begin{array}{c}\hat{x}_{0|m}\\\vdots\\\hat{x}_{m|m}\end{array}\right]\quad \matR_{\vecy}\eqdef\left[\matR_y(i,j)\right]\quad \matR_{\vecx\vecy}\eqdef\left[\matR_{xy}(i,j)\right] \]
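As a numerical sanity check, a minimal sketch of the smoothing formula above, assuming a toy model (an AR(1)-like signal covariance with \(y_i=x_i+v_i\) and noise variance \(0.1\); all names and numbers are illustrative, not from the notes), and using a linear solve rather than forming \(\matR_\vecy^{-1}\) explicitly.

```python
import numpy as np

rng = np.random.default_rng(0)
m = 5                                                # estimate x_0, ..., x_m from y_0, ..., y_m

# Toy second-order statistics (illustrative): x has an AR(1)-like covariance,
# and y_i = x_i + v_i with white noise v of variance 0.1.
idx = np.arange(m + 1)
R_x  = 0.9 ** np.abs(idx[:, None] - idx[None, :])    # R_x(i, j)
R_y  = R_x + 0.1 * np.eye(m + 1)                     # R_y(i, j)
R_xy = R_x                                           # R_xy(i, j) = E[x_i y_j]

x = rng.multivariate_normal(np.zeros(m + 1), R_x)    # one realization of the signal
y = x + np.sqrt(0.1) * rng.standard_normal(m + 1)    # corresponding observations

# Smoothed estimates x_hat_s = R_xy R_y^{-1} y, via a linear solve (R_y positive definite).
x_hat_s = R_xy @ np.linalg.solve(R_y, y)
print(np.round(x, 3))
print(np.round(x_hat_s, 3))                          # x_hat_{i|m}, i = 0, ..., m
```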
For \(\matR_\vecy\succ 0\) decomposed as \(\matR_\vecy=\matL\matD\matL^T\) (\(\matL\) lower triangular, \(\matD\) diagonal), causal filtering reduces to \[ \hat{\vecx}_{f} = \mathcal{L}\left[\matR_{\vecx\vecy}\matL^{-T}\matD^{-1}\right]\matL^{-1}\vecy \] where \[ \hat{\vecx}_{f}\eqdef\left[\begin{array}{c}\hat{x}_{0|0}\\\hat{x}_{1|1}\\\vdots\\\hat{x}_{m|m}\end{array}\right]\quad \matR_{\vecy}\eqdef\left[\matR_y(i,j)\right]\quad \matR_{\vecx\vecy}\eqdef\left[\matR_{xy}(i,j)\right] \] and \(\mathcal{L}[\cdot]\) is the operator that extracts the lower-triangular part of a matrix (entries above the diagonal are set to zero).
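A minimal numerical check of the filtering formula above (same toy covariances as before; all names are illustrative assumptions): it builds \(\matR_\vecy=\matL\matD\matL^T\) from the Cholesky factor, applies \(\mathcal{L}[\matR_{\vecx\vecy}\matL^{-T}\matD^{-1}]\matL^{-1}\vecy\), and verifies each \(\hat{x}_{i|i}\) against a direct projection of \(x_i\) onto \(\set{y_j}_{j=0}^{i}\).

```python
import numpy as np

rng = np.random.default_rng(0)
m = 5
idx = np.arange(m + 1)
R_x  = 0.9 ** np.abs(idx[:, None] - idx[None, :])   # same toy covariances as above (illustrative)
R_y  = R_x + 0.1 * np.eye(m + 1)
R_xy = R_x
y = rng.multivariate_normal(np.zeros(m + 1), R_y)   # one realization of the observations

# LDL^T factorization of R_y, built from its Cholesky factor (valid since R_y is positive definite).
C  = np.linalg.cholesky(R_y)                        # R_y = C C^T
dc = np.diag(C)
L  = C / dc                                         # unit lower triangular
d  = dc ** 2                                        # diagonal of D, so R_y = L D L^T

e = np.linalg.solve(L, y)                           # innovations e = L^{-1} y
K = np.tril(np.linalg.solve(L, R_xy.T).T / d)       # lower-triangular part of R_xy L^{-T} D^{-1}
x_hat_f = K @ e                                     # filtered estimates x_hat_{i|i}

# Check: each x_hat_{i|i} matches the direct projection of x_i onto span{y_0, ..., y_i}.
for i in range(m + 1):
    k_i = np.linalg.solve(R_y[: i + 1, : i + 1], R_xy[i, : i + 1])
    assert np.isclose(x_hat_f[i], k_i @ y[: i + 1])
print(np.round(x_hat_f, 3))
```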
A key difficulty we create is that \(\matR_\vecy\) needs to be inverted
The normal equations are obtained by projecting onto a subspace
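For reference, the one-line derivation behind this statement (a standard argument, with \(\mathbb{E}\) denoting expectation): writing the linear estimator as \(\hat{\vecx}=\mathbf{K}\vecy\), the projection (orthogonality) principle requires the error to be uncorrelated with every observation, \[ \mathbb{E}\big[(\vecx-\mathbf{K}\vecy)\vecy^T\big]=\mathbf{0} \quad\Longleftrightarrow\quad \mathbf{K}\matR_{\vecy}=\matR_{\vecx\vecy}, \] which for \(\matR_\vecy\succ 0\) gives \(\mathbf{K}=\matR_{\vecx\vecy}\matR_{\vecy}^{-1}\), i.e., the smoothing solution above.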
Gram-Schmidt orthogonalization for random variables \[ \vece_0 = \vecy_0\qquad\qquad \forall i \geq 1\quad \vece_i = \vecy_i-\underbrace{\sum_{j=0}^{i-1}\dotp{\vecy_i}{\vece_j}\norm{\vece_j}^{-2}\vece_j}_{\hat{\vecy}_i} \]
The random variable \(\vece_i\eqdef \vecy_i-\hat{\vecy}_i\) is called the innovation
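A small sketch of the Gram-Schmidt recursion for random variables, using the inner product \(\dotp{a}{b}=\mathbb{E}[ab]\) evaluated through a toy covariance \(\matR_\vecy\) (illustrative assumption): each random variable is represented by its coefficient vector over \((y_0,\dots,y_m)\), and the resulting innovations are checked to be uncorrelated and to coincide with \(\matL^{-1}\vecy\) from the \(\matL\matD\matL^T\) factorization.

```python
import numpy as np

m = 5
idx = np.arange(m + 1)
R_y = 0.9 ** np.abs(idx[:, None] - idx[None, :]) + 0.1 * np.eye(m + 1)   # toy R_y (illustrative)

# Represent a random variable a.y by its coefficient vector a over (y_0, ..., y_m);
# the inner product is then <a.y, b.y> = E[(a.y)(b.y)] = a^T R_y b.
def inner(a, b):
    return a @ R_y @ b

# Gram-Schmidt: e_0 = y_0,  e_i = y_i - sum_{j<i} <y_i, e_j> ||e_j||^{-2} e_j.
E = np.zeros((m + 1, m + 1))              # row i holds the coefficients of the innovation e_i
for i in range(m + 1):
    yi = np.eye(m + 1)[i].copy()          # coefficients of y_i
    ei = yi
    for j in range(i):
        ei = ei - inner(yi, E[j]) / inner(E[j], E[j]) * E[j]
    E[i] = ei

# The innovations are uncorrelated ...
Re = E @ R_y @ E.T                        # Gram matrix of the innovations
assert np.allclose(Re, np.diag(np.diag(Re)))

# ... and the coefficient matrix coincides with L^{-1} from R_y = L D L^T (so e = L^{-1} y).
C = np.linalg.cholesky(R_y)
L = C / np.diag(C)                        # unit lower triangular factor
assert np.allclose(E, np.linalg.inv(L))
print(np.round(np.diag(Re), 3))           # innovation variances = diag(D)
```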