Estimation of Stochastic Processes

Matthieu Bloch

Tuesday, September 20, 2022

Today in ECE 6555

  • Don't forget
    • Problem set 2 due Thursday September 22, 2022 on Gradescope
    • Mathematics of ECE workshops website
    • Recordings available on mediaspace
  • Last time
    • Sensor fusion
    • Stochastic processes: smoothing, causal filtering, prediction
  • Today's plan
    • Causal filtering
    • Innovation process
  • Questions?
  • Office hours at 12:30pm (instead of 12pm)

Estimation of Stochastic Processes

  • Estimation model:
    • Signal process \(\set{x_i}_{i\geq 0}\) not observed, zero mean
    • Measurement process \(\set{y_i}_{i\geq 0}\) observed, zero mean
    • Covariance and correlation matrices known: \(\matR_{xy}(i,\ell)\eqdef \E{x_i y_\ell^T}\), \(\matR_{y}(i,\ell)\eqdef \E{y_i y_\ell^T}\), \(\forall i,\ell\)
  • Estimation goal: we must specify the form of the estimator and which observations to use

Smoothing, causal filtering, prediction

  • Smoothing: estimate \(x_i\) from \(\set{y_j}_{j=0}^m\), \(m>i\) (using past, present and future observations) as \[ \hat{x}_{i|m} \eqdef \sum_{j=0}^{m} k_{i,j}y_j \]

  • Causal filtering: estimate \(x_i\) from \(\set{y_j}_{j=0}^{i}\) (using past and present observations) as \[ \hat{x}_{i|i} \eqdef \sum_{j=0}^{i} k_{i,j}y_j \]

  • Prediction: estimate \(x_{i+\ell}\) from \(\set{y_j}_{j=0}^{i}\), \(\ell>1\) (using past observations) as \[ \hat{x}_{i+\ell|i} \eqdef \sum_{j=0}^{i} k_{i,j}y_j \]

  • In all cases we want the estimate to be optimal, i.e., to minimize the error covariance matrix

Smoothing

  • Let's put what we've learned to work: geometry!

  • Smoothing reduces to solving the normal equations: for \(\matR_{\vecy}\succ 0\), \[ \hat{\vecx}_{s} = \matR_{\vecx\vecy}\matR_{\vecy}^{-1}\vecy \] where \[ \hat{\vecx}_{s}\eqdef\left[\begin{array}{c}\hat{x}_{0|m}\\\vdots\\\hat{x}_{m|m}\end{array}\right]\quad \matR_{\vecy}\eqdef\left[\matR_y(i,j)\right]\quad \matR_{\vecx\vecy}\eqdef\left[\matR_{xy}(i,j)\right] \]
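
A minimal NumPy sketch of the formula above, using a hypothetical toy model (not from the lecture): a scalar signal with covariance \(R_x(i,j)=0.9^{|i-j|}\) observed in additive white noise of variance 0.5, so that \(\matR_{\vecy}=\matR_x+\matR_v\) and \(\matR_{\vecx\vecy}=\matR_x\).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy model (not from the lecture): scalar signal with
# R_x(i,j) = 0.9**|i-j|, observed as y_i = x_i + v_i with white noise of
# variance 0.5, so that R_y = R_x + R_v and R_xy = R_x.
m = 5
idx = np.arange(m + 1)
R_x = 0.9 ** np.abs(idx[:, None] - idx[None, :])
R_v = 0.5 * np.eye(m + 1)
R_y, R_xy = R_x + R_v, R_x

x = rng.multivariate_normal(np.zeros(m + 1), R_x)
v = rng.multivariate_normal(np.zeros(m + 1), R_v)
y = x + v

# Smoothed estimate x_hat_s = R_xy R_y^{-1} y (solve rather than invert)
x_hat_s = R_xy @ np.linalg.solve(R_y, y)
print(np.round(x_hat_s - x, 3))   # smoothing errors
```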

Causal filtering

  • Geometry strikes back…
  • For \(\matR_\vecy\succ 0\) decomposed as \(\matR_\vecy=\matL\matD\matL^T\) (\(\matL\) unit lower triangular) \[ \hat{\vecx}_{f} = \mathcal{L}\left[\matR_{\vecx\vecy}\matL^{-T}\matD^{-1}\right]\matL^{-1}\vecy \] where \[ \hat{\vecx}_{f}\eqdef\left[\begin{array}{c}\hat{x}_{0|0}\\\hat{x}_{1|1}\\\vdots\\\hat{x}_{m|m}\end{array}\right]\quad \matR_{\vecy}\eqdef\left[\matR_y(i,j)\right]\quad \matR_{\vecx\vecy}\eqdef\left[\matR_{xy}(i,j)\right] \] and \(\mathcal{L}[\cdot]\) is the operator that extracts the lower-triangular part of a matrix (entries above the diagonal are set to zero); see the numerical sketch after the example below.

  • Example: Linear model \(\vecy = \vecx+\vecv\) with \(\E{\vecx\vecx^T}\eqdef \matR_x\), \(\E{\vecv\vecv^T}\eqdef \matR_v\), \(\E{\vecx\vecv^T}\eqdef 0\)
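
For this linear model, \(\matR_{\vecx\vecy}=\matR_x\) and \(\matR_{\vecy}=\matR_x+\matR_v\). A minimal NumPy sketch of the causal filter above, on the same hypothetical toy covariances as in the smoothing sketch; the \(\matL\matD\matL^T\) factors of \(\matR_\vecy\) are obtained from its Cholesky factor.

```python
import numpy as np

rng = np.random.default_rng(0)

# Same hypothetical toy linear model y = x + v as above, so R_y = R_x + R_v
# and R_xy = R_x (x and v zero mean and uncorrelated).
m = 5
idx = np.arange(m + 1)
R_x = 0.9 ** np.abs(idx[:, None] - idx[None, :])
R_v = 0.5 * np.eye(m + 1)
R_y, R_xy = R_x + R_v, R_x

# LDL^T factorization of R_y from its Cholesky factor
C = np.linalg.cholesky(R_y)        # R_y = C C^T, C lower triangular
d = np.diag(C)
L = C / d                          # unit lower triangular
D = np.diag(d ** 2)                # so that R_y = L D L^T

# Causal gain: keep only the lower-triangular part so that x_hat_{i|i}
# depends only on the innovations e_0, ..., e_i
K = np.tril(R_xy @ np.linalg.inv(L).T @ np.linalg.inv(D))

x = rng.multivariate_normal(np.zeros(m + 1), R_x)
y = x + rng.multivariate_normal(np.zeros(m + 1), R_v)
e = np.linalg.solve(L, y)          # innovations e = L^{-1} y
x_hat_f = K @ e
print(np.round(x_hat_f - x, 3))    # causal filtering errors
```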

Innovation processes

  • A key difficulty with this approach is that \(\matR_y\) needs to be inverted

    • This would be easier if \(\matR_y\) were diagonal (which, in general, it has no reason to be)
  • The normal equations are obtained by projecting onto a subspace

    • We are not bound to use \(\set{\vecy_i}_{i=0}^m\): we can orthogonalize!
  • Gram-Schmidt orthogonalization for random variables \[ \vece_0 = \vecy_0\qquad\qquad \forall i \geq 1\quad \vece_i = \vecy_i-\underbrace{\sum_{j=0}^{i-1}\dotp{\vecy_i}{\vece_j}\norm{\vece_j}^{-2}\vece_j}_{\hat{\vecy}_i} \]

  • The random variable \(\vece_i\eqdef \vecy_i-\hat{\vecy}_i\) is called the innovation
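
A minimal NumPy sketch of this orthogonalization, working directly with \(\matR_\vecy\) (the same hypothetical toy covariance as in the earlier sketches): each innovation \(\vece_i\) is stored as a linear combination of \(\vecy_0,\dots,\vecy_i\), so the inner products in the recursion can be evaluated from \(\matR_\vecy\) alone, and the final check confirms that the innovations are uncorrelated.

```python
import numpy as np

# Gram-Schmidt on the observation covariance alone: row i of A holds the
# coefficients of e_i as a linear combination of y_0, ..., y_i, so that
# <y_i, e_j> = E[y_i e_j] can be computed from R_y without any samples.
# R_y is the same hypothetical toy covariance used in the earlier sketches.
m = 5
idx = np.arange(m + 1)
R_y = 0.9 ** np.abs(idx[:, None] - idx[None, :]) + 0.5 * np.eye(m + 1)

n = m + 1
A = np.zeros((n, n))
for i in range(n):
    a = np.zeros(n)
    a[i] = 1.0                        # start from y_i
    for j in range(i):
        num = R_y[i] @ A[j]           # <y_i, e_j>
        den = A[j] @ R_y @ A[j]       # ||e_j||^2
        a -= (num / den) * A[j]       # subtract the projection term hat{y}_i
    A[i] = a

# Innovations are uncorrelated: E[e e^T] = A R_y A^T is diagonal, and A
# coincides with L^{-1} from the LDL^T factorization of R_y.
print(np.round(A @ R_y @ A.T, 6))
```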