Matthieu Bloch
Tuesday October 25, 2022
Many interesting practical systems are not linear! Consider the non-linear discrete-time model \[ \vecx_{i+1} = f_i(\vecx_i) + g_i(\vecx_i) \vecu_i\qquad \vecy_i = h_i(\vecx_i)+\vecv_i \] where \(f_i(\cdot)\), \(g_i(\cdot)\), and \(h_i(\cdot)\) are non-linear and time-varying.
\(\vecx_0\) is assumed to be random with mean \(\bar{\vecx}_0\), and the second-order statistics satisfy \[ \begin{aligned} \dotp{\mat{c}{ \vecx_0-\bar{\vecx}_0\\ \vecu_i\\ \vecv_i}}{\mat{c}{ \vecx_0-\bar{\vecx}_0\\ \vecu_j\\ \vecv_j\\1}} = \mat{cccc}{\Pi_0&0&0&0\\0& \matQ_i\delta_{ij}&0&0\\0&0&\matR_i\delta_{ij}&0}. \end{aligned} \]
Question: can we leverage what we obtained for linear filtering?
Idea: open-loop linearization of the state-space equations around a known nominal trajectory \(\vecx_i^\text{nom}\). \[ \vecx_0^{\text{nom}} = \bar{\vecx}_0\qquad \vecx^{\text{nom}}_{i+1} = f_i(\vecx^{\text{nom}}_{i}) \]
Assume \(\set{f_i,g_i,h_i}_{i\geq 0}\) are smooth enough to allow a first-order Taylor expansion. Writing \(\Delta\vecx_i\eqdef \vecx_i-\vecx^{\text{nom}}_{i}\), \[ f_i(\vecx_i) \approx f_i(\vecx^{\text{nom}}_{i}) + \matF_i \Delta \vecx_i\qquad h_i(\vecx_i)\approx h_i(\vecx_i^{\text{nom}}) + \matH_i \Delta \vecx_i \] with \(\matF_i\) and \(\matH_i\) defined as \[ \matF_i\eqdef \left.\frac{\partial f_i(x)}{\partial x}\right\vert_{x=\vecx^{\text{nom}}_{i}}\qquad \matH_i\eqdef \left.\frac{\partial h_i(x)}{\partial x}\right\vert_{x=\vecx^{\text{nom}}_{i}} \]
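When closed-form Jacobians are tedious to derive, \(\matF_i\) and \(\matH_i\) can be approximated numerically. A minimal sketch (the function name `numerical_jacobian` is illustrative, assuming NumPy):

```python
import numpy as np

def numerical_jacobian(f, x, eps=1e-6):
    """Approximate the Jacobian of f at x by central differences."""
    x = np.asarray(x, dtype=float)
    m = np.asarray(f(x)).size
    J = np.zeros((m, x.size))
    for k in range(x.size):
        dx = np.zeros_like(x)
        dx[k] = eps
        # Central difference in coordinate k
        J[:, k] = (np.asarray(f(x + dx)) - np.asarray(f(x - dx))) / (2 * eps)
    return J

# Example: f(x) = (x0^2, x0*x1); exact Jacobian at (1, 2) is [[2, 0], [2, 1]]
J = numerical_jacobian(lambda x: np.array([x[0] ** 2, x[0] * x[1]]),
                       np.array([1.0, 2.0]))
```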
We also make the zeroth-order approximation \(g_i(\vecx_i)\approx g_i(\vecx^{\text{nom}}_{i})\eqdef \matG_i\): any first-order correction multiplies the zero-mean noise \(\vecu_i\) and is therefore of second order.
The approximate model is then linear in the deviation \(\Delta \vecx_i\): \[ \Delta \vecx_{i+1} = \matF_i \Delta \vecx_i + \matG_i \vecu_i\qquad \vecy_i-h_i(\vecx^{\text{nom}}_{i}) = \matH_i \Delta \vecx_i + \vecv_i \]
\[ \begin{aligned} \hat{\vecx}_{i+1|i} &= \matF_i (\hat{\vecx}_{i|i} - \vecx_i^{\text{nom}}) + f_i(\vecx_i^{\text{nom}})\\ \hat{\vecx}_{i|i} &= \hat{\vecx}_{i|i-1}+\matK_{f,i} \left(\vecy_i-h_i(\vecx_i^{\text{nom}})-\matH_i (\hat{\vecx}_{i|i-1}-\vecx_i^{\text{nom}})\right)\\ \matK_{f,i}&= \matP_{i|i-1}\matH_i^T (\matH_i\matP_{i|i-1}\matH_i^T+\matR_i)^{-1}\\ \matP_{i|i}&=(\matI-\matK_{f,i}\matH_i)\matP_{i|i-1}\\ \matP_{i+1|i}&=\matF_i\matP_{i|i}\matF_i^T + \matG_i \matQ_i \matG_i^T. \end{aligned} \]
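One update/predict cycle of this recursion can be sketched as follows (function and argument names are illustrative, assuming NumPy; `x_nom` is the nominal state \(\vecx_i^{\text{nom}}\)):

```python
import numpy as np

def lkf_step(x_pred, P_pred, y, x_nom, f, h, F, H, G, Q, R):
    """One cycle of the linearized Kalman filter: measurement update at
    time i, then time update, all linearized at the nominal state x_nom."""
    # Measurement update: innovation is measured against the nominal output
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_filt = x_pred + K @ (y - h(x_nom) - H @ (x_pred - x_nom))
    P_filt = (np.eye(len(x_pred)) - K @ H) @ P_pred
    # Time update: propagate the deviation from the nominal trajectory
    x_next = F @ (x_filt - x_nom) + f(x_nom)
    P_next = F @ P_filt @ F.T + G @ Q @ G.T
    return x_filt, P_filt, x_next, P_next

# Scalar sanity check with f(x) = 0.9 x, h(x) = x, nominal at the origin
x_filt, P_filt, x_next, P_next = lkf_step(
    np.array([0.0]), np.array([[1.0]]), np.array([1.0]), np.array([0.0]),
    f=lambda x: 0.9 * x, h=lambda x: x,
    F=np.array([[0.9]]), H=np.array([[1.0]]), G=np.array([[1.0]]),
    Q=np.array([[0.1]]), R=np.array([[1.0]]))
```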
Idea: relinearization at every step around the current estimate; this yields the extended Kalman filter (EKF).
\[ f_i(\vecx_i) \approx f_i(\hat{\vecx}_{i|i}) + \matF_i (\vecx_i-\hat{\vecx}_{i|i}),\quad h_i(\vecx_i) \approx h_i(\hat{\vecx}_{i|i-1}) + \matH_i (\vecx_i-\hat{\vecx}_{i|i-1}) \]
\[ g_i(\vecx_i)\approx g_i(\hat{\vecx}_{i|i})\eqdef \matG_i \]
where \(\matF_i\) and \(\matH_i\) now denote the Jacobians of \(f_i\) and \(h_i\) evaluated at \(\hat{\vecx}_{i|i}\) and \(\hat{\vecx}_{i|i-1}\), respectively.
\[ \vecx_{i+1}=\matF_i \vecx_i + (f_i(\hat{\vecx}_{i|i})-\matF_i\hat{\vecx}_{i|i}) + \matG_i \vecu_i\qquad \vecy_i - (h_i(\hat{\vecx}_{i|i-1})-\matH_i\hat{\vecx}_{i|i-1})=\matH_i \vecx_i + \vecv_i \]
\[ \begin{aligned} \hat{\vecx}_{i+1|i} &= f_i(\hat{\vecx}_{i|i})\\ \hat{\vecx}_{i|i} &= \hat{\vecx}_{i|i-1}+\matK_{f,i} \left(\vecy_i-h_i(\hat{\vecx}_{i|i-1})\right)\\ \matK_{f,i}&= \matP_{i|i-1}\matH_i^T (\matH_i\matP_{i|i-1}\matH_i^T+\matR_i)^{-1}\\ \matP_{i|i}&=(\matI-\matK_{f,i}\matH_i)\matP_{i|i-1}\\ \matP_{i+1|i}&=\matF_i\matP_{i|i}\matF_i^T + \matG_i \matQ_i \matG_i^T. \end{aligned} \]
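A sketch of one EKF cycle follows the same pattern, except that the Jacobians are re-evaluated at the current estimates rather than at a fixed nominal trajectory (names are illustrative, assuming NumPy):

```python
import numpy as np

def ekf_step(x_pred, P_pred, y, f, h, F_jac, H_jac, g, Q, R):
    """One EKF cycle: measurement update linearized at the prediction
    x_{i|i-1}, then time update linearized at the filtered estimate x_{i|i}."""
    H = H_jac(x_pred)                       # H_i evaluated at x_{i|i-1}
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_filt = x_pred + K @ (y - h(x_pred))   # innovation uses h at the estimate
    P_filt = (np.eye(len(x_pred)) - K @ H) @ P_pred
    F = F_jac(x_filt)                       # F_i evaluated at x_{i|i}
    G = g(x_filt)
    x_next = f(x_filt)                      # nonlinear propagation of the mean
    P_next = F @ P_filt @ F.T + G @ Q @ G.T
    return x_filt, P_filt, x_next, P_next

# For a linear model the EKF reduces to the ordinary Kalman filter
x_filt, P_filt, x_next, P_next = ekf_step(
    np.array([0.0]), np.array([[1.0]]), np.array([1.0]),
    f=lambda x: 0.9 * x, h=lambda x: x,
    F_jac=lambda x: np.array([[0.9]]), H_jac=lambda x: np.array([[1.0]]),
    g=lambda x: np.array([[1.0]]),
    Q=np.array([[0.1]]), R=np.array([[1.0]]))
```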
A probabilistic state space model consists of a state evolution and a measurement model \[ x_{i+1} \sim p(x_{i+1}|x_{0:i}y_{0:i})\qquad y_i \sim p(y_i|x_{0:i}y_{0:i-1}) \] where \(x_i^\intercal = \mat{ccc}{x_{i,1}&\cdots&x_{i,n}}\) is the state and \(y_i^\intercal = \mat{ccc}{y_{i,1}&\cdots&y_{i,m}}\) is the measurement.
We define \(x_{0:i}\eqdef \mat{ccc}{x_{0}&\cdots&x_{i}}\).
The dynamic model is Markovian if \(p(x_{i+1}|x_{0:i}y_{0:i}) = p(x_{i+1}|x_i)\). The measurement model satisfies conditional independence if \(p(y_i|x_{0:i}y_{0:i-1}) = p(y_i|x_i)\).
Consider \(m\) independent random variables and \(n\) functions of these variables. A functional dependence graph is a directed graph having \(m + n\) vertices, and in which edges are drawn from one vertex to another if the random variable of the former vertex is an argument in the function defining the latter.
Let \(x,y\) be jointly distributed random variables with a well-defined PMF or PDF. There exist a random variable \(n\) independent of \(x\) and a function \(f\) such that \(y=f(x,n)\).
Let \(\calX\), \(\calY\), and \(\calZ\) be disjoint subsets of vertices in a functional dependence graph. If \(\calZ\) d-separates \(\calX\) from \(\calY\), and if we collect the random variables in \(\calX\), \(\calY\), and \(\calZ\) in the random vectors \(x\), \(y\), and \(z\), respectively, then \(x\rightarrow z\rightarrow y\) forms a Markov chain (\(x\) and \(y\) are conditionally independent given \(z\)).
In the probabilistic state space model, past states \(x_{0:i-1}\) are independent of the future states \(x_{i+1:T}\) and measurements \(y_{i:T}\) given the present state \(x_i\).
A Gauss-Markov model is a Gaussian-driven linear model \[ \vecx_{i+1} = \matF_i \vecx_i + \vecu_i\qquad \vecy_i = \matH_i \vecx_i +\vecv_i \] where \(\vecu_i\sim\calN(0,\matQ_i)\) and \(\vecv_i\sim\calN(0,\matR_i)\) are Gaussian white processes (assumed independent).
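Sampling a trajectory from such a model is direct; a scalar sketch with illustrative parameter values (assuming NumPy):

```python
import numpy as np

# Scalar Gauss-Markov model: x_{i+1} = F x_i + u_i, y_i = H x_i + v_i
rng = np.random.default_rng(0)
F = np.array([[0.95]]); H = np.array([[1.0]])
Q = np.array([[0.1]]);  R = np.array([[0.5]])

T = 200
x = np.zeros(1)            # start from the mean of x_0
xs, ys = [], []
for _ in range(T):
    ys.append(H @ x + rng.multivariate_normal(np.zeros(1), R))  # y_i = H x_i + v_i
    xs.append(x)
    x = F @ x + rng.multivariate_normal(np.zeros(1), Q)         # x_{i+1} = F x_i + u_i
```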
We assume that all variables are real-valued for simplicity.
Let \(x\) and \(y\) be jointly distributed random variables \[ \mat{c}{x\\y}\sim\calN\left(\mat{c}{\mu_x\\\mu_y},\mat{cc}{\matR_x&\matR_{xy}\\\matR_{yx}&\matR_y}\right) \] Then \(x\sim\calN(\mu_x,\matR_x)\), \(y\sim\calN(\mu_y,\matR_y)\) and \[ \begin{aligned} x|y&\sim\calN(\mu_x+\matR_{xy}\matR_y^{-1}(y-\mu_y),\matR_x-\matR_{xy}\matR_y^{-1}\matR_{yx})\\ y|x&\sim\calN(\mu_y+\matR_{yx}\matR_x^{-1}(x-\mu_x),\matR_y-\matR_{yx}\matR_x^{-1}\matR_{xy}) \end{aligned} \]
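These conditioning formulas can be checked empirically by keeping only joint samples whose \(y\)-coordinate falls in a thin slice; a Monte Carlo sketch with illustrative numbers (assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(1)
mu = np.array([1.0, 2.0])
Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])      # blocks [[Rx, Rxy], [Ryx, Ry]]
Rx, Rxy, Ry = Sigma[0, 0], Sigma[0, 1], Sigma[1, 1]

y0 = 2.5                            # value of y we condition on
cond_mean = mu[0] + Rxy / Ry * (y0 - mu[1])   # 1 + 0.8 * 0.5 = 1.4
cond_var = Rx - Rxy / Ry * Rxy                # 2 - 0.64 = 1.36

# Keep joint samples whose y-coordinate is close to y0
samples = rng.multivariate_normal(mu, Sigma, size=400_000)
sel = samples[np.abs(samples[:, 1] - y0) < 0.02, 0]
```

The empirical mean and variance of `sel` should approach `cond_mean` and `cond_var` as the sample size grows and the slice narrows.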
All distributions are jointly Gaussian in a Gauss-Markov model.
\[ \begin{aligned} \hat{\vecx}_{i|i} &= \hat{\vecx}_{i|i-1}+\matK_{f,i}(\vecy_i-\matH_i\hat{\vecx}_{i|i-1})\\ \matK_{f,i} &= \matP_{i|i-1}\matH_i^\intercal(\matH_i\matP_{i|i-1}\matH_i^\intercal+\matR_i)^{-1}\\ \matP_{i|i}&= \matP_{i|i-1}-\matK_{f,i}\matH_i \matP_{i|i-1} \end{aligned} \]
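This update is exactly the Gaussian conditioning result above applied at each step: conditioned on \(\vecy_{0:i-1}\), the pair \((\vecx_i,\vecy_i)\) is jointly Gaussian with cross-covariance and innovation covariance \[ \operatorname{cov}(\vecx_i,\vecy_i\mid \vecy_{0:i-1}) = \matP_{i|i-1}\matH_i^\intercal \qquad \operatorname{cov}(\vecy_i\mid \vecy_{0:i-1}) = \matH_i\matP_{i|i-1}\matH_i^\intercal+\matR_i, \] so that the conditional mean and covariance formulas reproduce \(\matK_{f,i}\), \(\hat{\vecx}_{i|i}\), and \(\matP_{i|i}\).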