Matthieu Bloch
Thursday, October 20, 2022
Many interesting practical systems are not linear! Consider the non-linear discrete-time model \[ x_{i+1} = f_i(x_i) + g_i(x_i) u_i\qquad y_i = h_i(x_i)+v_i \] where \(f_i(\cdot)\), \(g_i(\cdot)\), and \(h_i(\cdot)\) are non-linear and time-varying.
\(x_0\) is assumed to be random with mean \(\bar{x}_0\) and \[ \begin{aligned} \dotp{\mat{c}{ x_0-\bar{x}_0\\ u_i\\ v_i}}{\mat{c}{ x_0-\bar{x}_0\\ u_j\\ v_j\\1}} = \mat{cccc}{\Pi_0&0&0&0\\0& Q_i\delta_{ij}&0&0\\0&0&R_i\delta_{ij}&0}. \end{aligned} \]
Question: can we leverage what we obtained for linear filtering?
Idea: open-loop linearization of the state-space equations around a known nominal trajectory \(x_i^\text{nom}\), started at the known mean of the initial state. \[ x_0^{\text{nom}} = \bar{x}_0\qquad x^{\text{nom}}_{i+1} = f_i(x^{\text{nom}}_{i}) \]
Assume \(\set{f_i,g_i,h_i}_{i\geq 0}\) are smooth enough to allow a first-order Taylor expansion around the nominal trajectory. With \(\Delta x_i\eqdef x_i-x^{\text{nom}}_{i}\), \[ f_i(x_i) \approx f_i(x^{\text{nom}}_{i}) + F_i \Delta x_i\qquad h_i(x_i)\approx h_i(x_i^{\text{nom}}) + H_i \Delta x_i \] with \(F_i\) and \(H_i\) defined as \[ F_i\eqdef \left.\frac{\partial f_i(x)}{\partial x}\right\vert_{x=x^{\text{nom}}_{i}}\qquad H_i\eqdef \left.\frac{\partial h_i(x)}{\partial x}\right\vert_{x=x^{\text{nom}}_{i}} \]
Make the zeroth-order approximation \(g_i(x_i)\approx g_i(x^{\text{nom}}_{i})\eqdef G_i\)
\[ \Delta x_{i+1} = F_i \Delta x_i + G_i u_i\qquad y_i-h_i(x^{\text{nom}}_{i}) = H_i \Delta x_i + v_i \]
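As a rough numerical sketch of this linearization, the Jacobians \(F_i\) and \(H_i\) can be approximated by central differences when closed-form derivatives are inconvenient. The helper `numerical_jacobian` and the toy functions `f` and `h` below are illustrative assumptions (a time-invariant two-state model), not part of the model in these notes.

```python
import numpy as np

def numerical_jacobian(func, x, eps=1e-6):
    """Central-difference approximation of the Jacobian of func at x."""
    x = np.asarray(x, dtype=float)
    fx = np.asarray(func(x), dtype=float)
    J = np.zeros((fx.size, x.size))
    for k in range(x.size):
        dx = np.zeros_like(x)
        dx[k] = eps
        J[:, k] = (np.asarray(func(x + dx)) - np.asarray(func(x - dx))) / (2 * eps)
    return J

# Hypothetical toy model, chosen only so that exact Jacobians are easy to write down
def f(x):
    return np.array([x[0] + 0.1 * x[1], x[1] - 0.1 * np.sin(x[0])])

def h(x):
    return np.array([np.sin(x[0])])

x_nom = np.array([0.3, -0.2])        # one point of the nominal trajectory
F = numerical_jacobian(f, x_nom)     # plays the role of F_i
H = numerical_jacobian(h, x_nom)     # plays the role of H_i

# Closed-form Jacobians of the toy model, for comparison
F_exact = np.array([[1.0, 0.1], [-0.1 * np.cos(x_nom[0]), 1.0]])
H_exact = np.array([[np.cos(x_nom[0])]])

# Size of the first-order Taylor remainder for a small deviation Delta x
delta = 1e-3 * np.array([1.0, -2.0])
taylor_error = np.linalg.norm(f(x_nom + delta) - (f(x_nom) + F @ delta))
```

The remainder `taylor_error` shrinks quadratically with the size of `delta`, which is why the linearized model is only trustworthy while the true state stays close to the nominal trajectory.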
\[ \begin{aligned} \hat{x}_{i+1|i} &= F_i (\hat{x}_{i|i} - x_i^{\text{nom}}) + f_i(x_i^{\text{nom}})\\ \hat{x}_{i|i} &= \hat{x}_{i|i-1}+K_{f,i} \left(y_i-h_i(x_i^{\text{nom}})-H_i (\hat{x}_{i|i-1}-x_i^{\text{nom}})\right)\\ K_{f,i}&= P_{i|i-1}H_i^T (H_iP_{i|i-1}H_i^T+R_i)^{-1}\\ P_{i|i}&=(I-K_{f,i}H_i)P_{i|i-1}\\ P_{i+1|i}&=F_iP_{i|i}F_i^T + G_i Q_i G_i^T. \end{aligned} \]
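The recursion above can be sketched in NumPy as follows. This is a minimal illustration under the assumption that the Jacobians `F`, `H` and the matrix `G` have already been evaluated at the nominal point `x_nom`; the function name `linearized_kf_step` and its argument order are hypothetical choices made here.

```python
import numpy as np

def linearized_kf_step(x_pred, P_pred, y, x_nom, f, h, F, H, G, Q, R):
    """One measurement update and time update of the linearized Kalman filter.

    x_pred, P_pred : predicted estimate and error covariance (i | i-1)
    x_nom          : nominal state at time i
    F, H, G        : Jacobians / input matrix evaluated at x_nom
    """
    # Innovation relative to the linearized measurement model
    innovation = y - h(x_nom) - H @ (x_pred - x_nom)
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)
    x_filt = x_pred + K @ innovation
    P_filt = (np.eye(x_pred.size) - K @ H) @ P_pred
    # Time update: propagate the deviation from the nominal trajectory
    x_next = F @ (x_filt - x_nom) + f(x_nom)
    P_next = F @ P_filt @ F.T + G @ Q @ G.T
    return x_filt, P_filt, x_next, P_next

# Sanity check on a linear model, where linearization is exact and the
# choice of nominal point should not matter
A = np.array([[1.0, 0.1], [0.0, 1.0]])
C = np.array([[1.0, 0.0]])
x_filt, P_filt, x_next, P_next = linearized_kf_step(
    x_pred=np.array([0.5, -0.1]), P_pred=np.eye(2), y=np.array([0.7]),
    x_nom=np.array([1.0, 2.0]), f=lambda x: A @ x, h=lambda x: C @ x,
    F=A, H=C, G=np.eye(2), Q=0.01 * np.eye(2), R=np.array([[0.5]]),
)
```

For a linear model the result coincides with the standard Kalman filter, which gives a quick way to test an implementation before applying it to a non-linear system.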
Idea: relinearization at every step around the current estimate
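\[ f_i(x_i) \approx f_i(\hat{x}_{i|i}) + F_i (x_i-\hat{x}_{i|i}),\quad h_i(x_i) \approx h_i(\hat{x}_{i|i-1}) + H_i (x_i-\hat{x}_{i|i-1}),\quad g_i(x_i)\approx g_i(\hat{x}_{i|i})\eqdef G_i \] where \(F_i\) is now the Jacobian of \(f_i\) evaluated at \(\hat{x}_{i|i}\) and \(H_i\) is the Jacobian of \(h_i\) evaluated at \(\hat{x}_{i|i-1}\).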
\[ x_{i+1}=F_i x_i + (f_i(\hat{x}_{i|i})-F_i\hat{x}_{i|i}) + G_i u_i\qquad y_i - (h_i(\hat{x}_{i|i-1})-H_i\hat{x}_{i|i-1})=H_i x_i + v_i \]
\[ \begin{aligned} \hat{x}_{i+1|i} &= f_i(\hat{x}_{i|i})\\ \hat{x}_{i|i} &= \hat{x}_{i|i-1}+K_{f,i} \left(y_i-h_i(\hat{x}_{i|i-1})\right)\\ K_{f,i}&= P_{i|i-1}H_i^T (H_iP_{i|i-1}H_i^T+R_i)^{-1}\\ P_{i|i}&=(I-K_{f,i}H_i)P_{i|i-1}\\ P_{i+1|i}&=F_iP_{i|i}F_i^T + G_i Q_i G_i^T. \end{aligned} \]
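A minimal sketch of one step of this extended Kalman filter recursion is given below. The function `ekf_step`, the callables `jac_f` and `jac_h` returning Jacobians, and the scalar toy model are assumptions made for illustration; note that \(H_i\) is evaluated at \(\hat{x}_{i|i-1}\) while \(F_i\) and \(G_i\) are evaluated at \(\hat{x}_{i|i}\).

```python
import numpy as np

def ekf_step(x_pred, P_pred, y, f, h, g, jac_f, jac_h, Q, R):
    """One measurement update and time update of the extended Kalman filter."""
    # Measurement update, linearized around the predicted estimate
    H = jac_h(x_pred)
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)
    x_filt = x_pred + K @ (y - h(x_pred))
    P_filt = (np.eye(x_pred.size) - K @ H) @ P_pred
    # Time update, relinearized around the filtered estimate
    F = jac_f(x_filt)
    G = g(x_filt)
    x_next = f(x_filt)
    P_next = F @ P_filt @ F.T + G @ Q @ G.T
    return x_filt, P_filt, x_next, P_next

# Hypothetical scalar model: x_{i+1} = 0.9 x_i + u_i, y_i = x_i^2 + v_i
f = lambda x: 0.9 * x
h = lambda x: x ** 2
g = lambda x: np.eye(1)
jac_f = lambda x: np.array([[0.9]])
jac_h = lambda x: np.array([[2.0 * x[0]]])

x_filt, P_filt, x_next, P_next = ekf_step(
    x_pred=np.array([1.0]), P_pred=np.array([[1.0]]), y=np.array([1.5]),
    f=f, h=h, g=g, jac_f=jac_f, jac_h=jac_h,
    Q=np.array([[0.1]]), R=np.array([[1.0]]),
)
```

In this toy case \(H_i = 2\), so the gain is \(2/(4+1) = 0.4\) and the filtered estimate moves from \(1\) to \(1 + 0.4\,(1.5 - 1) = 1.2\).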
A probabilistic state space model consists of a state evolution model and a measurement model \[ x_{i+1} \sim p(x_{i+1}|x_{0:i}y_{0:i})\qquad y_i \sim p(y_i|x_{0:i}y_{0:i-1}) \] where \(x_i^\intercal = \mat{ccc}{x_{i,1}&\cdots&x_{i,n}}\) is the state and \(y_i^\intercal= \mat{ccc}{y_{i,1}&\cdots&y_{i,m}}\) is the measurement.
We define \(x_{0:i}\eqdef \mat{ccc}{x_{0}&\cdots&x_{i}}\).
The dynamic model is Markovian if \(p(x_{i+1}|x_{0:i}y_{0:i}) = p(x_{i+1}|x_i)\). The measurement model satisfies conditional independence if \(p(y_i|x_{0:i}y_{0:i-1}) = p(y_i|x_i)\)
From now on, unless otherwise specified, we assume Markovianity and conditional independence hold
Illustration: functional dependence graphs, hidden Markov model
Consider \(m\) independent random variables and \(n\) functions of these variables. A functional dependence graph is a directed graph having \(m + n\) vertices, and in which edges are drawn from one vertex to another if the random variable of the former vertex is an argument in the function defining the latter.
Let \(x,y\) be jointly distributed random variables with a well-defined PMF or PDF. There exists a random variable \(n\) independent of \(x\) and a function \(f\) such that \(y=f(x,n)\).
Let \(\calX\), \(\calY\), and \(\calZ\) be disjoint subsets of vertices in a functional dependence graph. If \(\calZ\) d-separates \(\calX\) from \(\calY\), and if we collect the random variables in \(\calX\), \(\calY\), and \(\calZ\) in the random vectors \(x\), \(y\), and \(z\), respectively, then \(x\rightarrow z\rightarrow y\) forms a Markov chain (\(x\) and \(y\) are conditionally independent given \(z\)).
In the probabilistic state space model, the past states \(x_{0:i-1}\) are conditionally independent of the future states \(x_{i+1:T}\) and of the measurements \(y_{i:T}\) given the present state \(x_i\).
A Gauss-Markov model is a Gaussian-driven linear model \[ x_{i+1} = F_i x_i + u_i\qquad y_i = H_i x_i + v_i \] where \(u_i\sim\calN(0,Q_i)\) and \(v_i\sim\calN(0,R_i)\) are Gaussian white processes (assumed independent).
We assume that all variables are real-valued for simplicity
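To make the model concrete, the following sketch draws one trajectory from a Gauss-Markov model. It assumes time-invariant matrices `F`, `H`, `Q`, `R` for brevity (the notes allow them to vary with \(i\)); the function name `simulate_gauss_markov` and the numerical values are hypothetical.

```python
import numpy as np

def simulate_gauss_markov(F, H, Q, R, x0, T, rng):
    """Draw states x_0..x_T and measurements y_0..y_T from the model."""
    n, m = F.shape[0], H.shape[0]
    xs = np.zeros((T + 1, n))
    ys = np.zeros((T + 1, m))
    xs[0] = x0
    for i in range(T + 1):
        # Measurement: y_i = H x_i + v_i with v_i ~ N(0, R)
        ys[i] = H @ xs[i] + rng.multivariate_normal(np.zeros(m), R)
        if i < T:
            # State evolution: x_{i+1} = F x_i + u_i with u_i ~ N(0, Q)
            xs[i + 1] = F @ xs[i] + rng.multivariate_normal(np.zeros(n), Q)
    return xs, ys

# Hypothetical constant-velocity example with a position-only measurement
F = np.array([[1.0, 0.1], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
Q = 0.01 * np.eye(2)
R = np.array([[0.25]])
xs, ys = simulate_gauss_markov(F, H, Q, R, x0=np.zeros(2), T=50,
                               rng=np.random.default_rng(0))
```

Passing the random generator explicitly keeps the simulation reproducible, which is convenient when comparing filters on the same trajectory.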
Let \(x\) and \(y\) be jointly distributed random variables \[ \mat{c}{x\\y}\sim\calN\left(\mat{c}{\mu_x\\\mu_y},\mat{cc}{R_x&R_{xy}\\R_{yx}&R_y}\right) \] Then \(x\sim\calN(\mu_x,R_x)\), \(y\sim\calN(\mu_y,R_y)\) and \[ \begin{aligned} x|y&\sim\calN(\mu_x+R_{xy}R_y^{-1}(y-\mu_y),R_x-R_{xy}R_y^{-1}R_{yx})\\ y|x&\sim\calN(\mu_y+R_{yx}R_x^{-1}(x-\mu_x),R_y-R_{yx}R_x^{-1}R_{xy}) \end{aligned} \]
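The conditioning formula for \(x|y\) translates directly into a few lines of NumPy. The sketch below assumes real-valued variables, so that \(R_{yx}=R_{xy}^\intercal\); the function name `condition_gaussian` and the numbers in the example are illustrative.

```python
import numpy as np

def condition_gaussian(mu_x, mu_y, R_x, R_xy, R_y, y):
    """Mean and covariance of x given y for jointly Gaussian (x, y)."""
    gain = R_xy @ np.linalg.inv(R_y)      # R_xy R_y^{-1}
    mean = mu_x + gain @ (y - mu_y)
    cov = R_x - gain @ R_xy.T             # R_x - R_xy R_y^{-1} R_yx
    return mean, cov

# Scalar example: var(x) = 2, var(y) = 1, cov(x, y) = 0.5, observed y = 3
mean, cov = condition_gaussian(
    mu_x=np.array([1.0]), mu_y=np.array([2.0]),
    R_x=np.array([[2.0]]), R_xy=np.array([[0.5]]), R_y=np.array([[1.0]]),
    y=np.array([3.0]),
)
```

Here the conditional mean is \(1 + 0.5\,(3-2) = 1.5\) and the conditional variance is \(2 - 0.5^2 = 1.75\); the variance does not depend on the observed value of \(y\).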
All distributions are jointly Gaussian in a Gauss-Markov model.
\[ \begin{aligned} \hat{x}_{i|i} &= \hat{x}_{i|i-1}+K_{f,i}(y_i-H_i\hat{x}_{i|i-1})\\ K_{f,i} &= P_{i|i-1}H_i^\intercal(H_iP_{i|i-1}H_i^\intercal+R_i)^{-1}\\ P_{i|i}&= P_{i|i-1}-K_{f,i}H_i P_{i|i-1} \end{aligned} \]
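One way to see where the measurement update comes from, sketched under the Gauss-Markov assumptions: conditioned on \(y_{0:i-1}\), the state \(x_i\) is Gaussian with mean \(\hat{x}_{i|i-1}\) and covariance \(P_{i|i-1}\), and \(v_i\) is independent of \(x_i\) and \(y_{0:i-1}\). Since \(y_i=H_ix_i+v_i\), the pair \((x_i,y_i)\) is then jointly Gaussian given \(y_{0:i-1}\), \[ \mat{c}{x_i\\y_i}\sim\calN\left(\mat{c}{\hat{x}_{i|i-1}\\H_i\hat{x}_{i|i-1}},\mat{cc}{P_{i|i-1}&P_{i|i-1}H_i^\intercal\\H_iP_{i|i-1}&H_iP_{i|i-1}H_i^\intercal+R_i}\right) \] and applying the Gaussian conditioning formula for \(x|y\) with \(R_{xy}=P_{i|i-1}H_i^\intercal\) and \(R_y=H_iP_{i|i-1}H_i^\intercal+R_i\) gives \[ x_i|y_{0:i}\sim\calN\left(\hat{x}_{i|i-1}+K_{f,i}(y_i-H_i\hat{x}_{i|i-1}),\;P_{i|i-1}-K_{f,i}H_iP_{i|i-1}\right),\qquad K_{f,i}=R_{xy}R_y^{-1}. \]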