10  Nonlinear Kalman filtering

Main reference: Simon (2006)

In this chapter, the following are discussed

10.1 Linearised Kalman filter

In this section, the linearized Kalman filter is derived from continuous-time system and the derivation is analogous for discrete-time and hybrid systems.

Considering the following general nonlinear system model \[ \begin{align} \dot{\mathbf{x}} &= \mathbf{f}\left(\mathbf{x},\mathbf{u},\mathbf{w},t\right) \\ \mathbf{y} &= \mathbf{h}\left(\mathbf{x},\mathbf{v},t\right) \\ \mathbf{w} &\sim \left(0,\mathbf{Q}\right) \\ \mathbf{v} &\sim \left(0,\mathbf{R}\right) \end{align} \tag{10.1}\]

Here, \(\mathbf{f}(.),\mathbf{h}(.)\) are nonlinear functions.

Taylor series was used to linearize the equations around a nominal point \((\mathbf{x}_0,\mathbf{y}_0,\mathbf{w}_0,\mathbf{v}_0)\).

\[ \begin{align} \dot{\mathbf{x}} &\approx \mathbf{f}(\mathbf{x}_0,\mathbf{y}_0,\mathbf{w}_0,t) + \frac{\partial \mathbf{f}}{\partial \mathbf{x}}\rvert_0 (\mathbf{x}-\mathbf{x}_0) + \frac{\partial \mathbf{f}}{\partial \mathbf{u}}\rvert_0 (\mathbf{u}-\mathbf{u}_0) + \frac{\partial \mathbf{f}}{\partial \mathbf{w}}\rvert_0 (\mathbf{w}-\mathbf{w}_0) \\ &= \mathbf{f}(\mathbf{x}_0,\mathbf{y}_0,\mathbf{w}_0,t) + \mathbf{A}\Delta \mathbf{x} + \mathbf{B}\Delta\mathbf{u}+\mathbf{L}\Delta\mathbf{w} \\ \mathbf{y} &\approx \mathbf{h}(\mathbf{x}_0,\mathbf{v}_0,t) + \frac{\partial \mathbf{h}}{\partial \mathbf{x}}\rvert_0(\mathbf{x}-\mathbf{x}_0) + \frac{\partial \mathbf{h}}{\partial \mathbf{v}}\rvert_0(\mathbf{v}-\mathbf{v}_0) \\ &= \mathbf{h}(\mathbf{x}_0,\mathbf{v}_0,t) + \mathbf{C}\Delta\mathbf{x} + \mathbf{M}\Delta\mathbf{v} \end{align} \tag{10.2}\]

Let, the nominal values of noises be zero \(\mathbf{w}_0(t) = \mathbf{v}_0(t) = 0\) for all the time. That implies \(\Delta\mathbf{w}(t) = \mathbf{w}(t), \Delta\mathbf{v}(t) = \mathbf{v}(t)\).

Assuming that the control \(\mathbf{u}\) is perfectly known, then \(\Delta\mathbf{u}(t)=0\).

Now, the nominal system trajectory is defined as \[ \begin{align} \dot{\mathbf{x}}_0 &= \mathbf{f}(\mathbf{x}_0,\mathbf{u}_0,\mathbf{w}_0,t) \\ \mathbf{y}_0&= \mathbf{h}(\mathbf{x}_0,\mathbf{v}_0,t) . \end{align} \tag{10.3}\]

The deviation of true state from nominal state and of the true measurement from the nominal measurement is defined as follows.

\[ \begin{align} \Delta \dot{\mathbf{x}} &= \dot{\mathbf{x}} - \dot{\mathbf{x}}_0 \\ \Delta \mathbf{y} &= \mathbf{y} - \mathbf{y}_0 \end{align} \tag{10.4}\]

With above definitions, the Eq. 10.2 becomes

\[ \begin{align} \Delta\dot{\mathbf{x}} &= \mathbf{A}\Delta\mathbf{x} + \mathbf{L}\mathbf{w} \\ &= \mathbf{A}\Delta \mathbf{x} + \tilde{\mathbf{w}} \\ \tilde{\mathbf{w}} &\sim (0,\tilde{\mathbf{Q}}), \ \ \ \tilde{\mathbf{Q}} = \mathbf{L}\mathbf{Q}\mathbf{L}^T \\ \Delta \mathbf{y} &= \mathbf{C}\Delta \mathbf{x} + \mathbf{M}\mathbf{v} \\ &= \mathbf{C}\Delta\mathbf{x} + \tilde{\mathbf{v}} \\ \tilde{\mathbf{v}} &\sim (0,\tilde{\mathbf{R}}), \ \ \ \tilde{\mathbf{R}} = \mathbf{M}\mathbf{R}\mathbf{M}^T \end{align} \tag{10.5}\]

Above equation is a linear system with state \(\Delta \mathbf{x}\) and measurement \(\Delta \mathbf{y}\), so, Kalman filter can be used to estimate \(\Delta \mathbf{x}\).

Thus, the Kalman filter equations for the linearized Kalman filter are

\[ \begin{align} \Delta \hat{\mathbf{x}}(0) &= 0 \\ \mathbf{P}(0) &= E\left[\left(\Delta \mathbf{x}(0) - \Delta \hat{\mathbf{x}}(0)\right)\left(\Delta \mathbf{x}(0) - \Delta \hat{\mathbf{x}}(0)\right)^T\right] \\ \Delta\dot{\hat{\mathbf{x}}} &= \mathbf{A}\Delta\hat{\mathbf{x}} + \mathbf{K}\left(\Delta \mathbf{y} - \mathbf{C}\Delta \hat{\mathbf{x}}\right) \\ \mathbf{K} &= \mathbf{P}\mathbf{C}^T\tilde{\mathbf{R}}^{-1} \\ \dot{\mathbf{P}} &= \mathbf{A}\mathbf{P}+\mathbf{P}\mathbf{A}^T + \tilde{\mathbf{Q}} - \mathbf{P}\mathbf{C}^T\tilde{\mathbf{R}}^{-1}\mathbf{C}\mathbf{P} \\ \hat{\mathbf{x}} &= \mathbf{x}_0 + \Delta\hat{\mathbf{x}} \end{align} \tag{10.6}\]

10.1.1 Algorithm: continuous-time linearized Kalman filter

  1. For the given system equations (Eq. 10.1) and nominal trajectory known ahead of time (Eq. 10.3), compute the following partial derivative matrices evaluated at the nominal trajectory values \[ \begin{align} \mathbf{A} &= \frac{\partial \mathbf{f}}{\partial \mathbf{x}} \rvert_0 \\ \mathbf{L} &= \frac{\partial \mathbf{f}}{\partial \mathbf{w}} \rvert_0 \\ \mathbf{C} &= \frac{\partial \mathbf{h}}{\partial \mathbf{x}} \rvert_0 \\ \mathbf{M} &= \frac{\partial \mathbf{h}}{\partial \mathbf{v}} \rvert_0 \end{align} \]

  2. Compute the following matrices \[ \begin{align} \tilde{\mathbf{Q}} &= \mathbf{L}\mathbf{Q}\mathbf{L}^T \\ \tilde{\mathbf{R}} &= \mathbf{M} \mathbf{R} \mathbf{M}^T \end{align} \]

  3. Define \(\Delta \mathbf{y}\) as \[ \Delta \mathbf{y} = \mathbf{y} - \mathbf{y}_0 \]

  4. Execute the Kalman filter equations (Eq. 10.6)

  5. Estimate the state as \(\hat{\mathbf{x}} = \mathbf{x}_0 + \Delta \hat{\mathbf{x}}\)

10.2 Extended Kalman Filter (EKF)

In the previous linearized Kalman filter section, the derivation was based on a nominal state trajectory. But, it is no straightforward to find the nominal trajectory in most cases.

Since the Kalman filer estimates the state of the system, the Kalman filter estimate is itself used as the nominal state trajectory. This is sort of a bootstrap method.

First, the nonlinear system is linearized around the Kalman filter estimate, and the Kalman filter estimate is based on the linearized system. This is the idea of the extended Kalman filter.

10.2.1 Continuous-time extended Kalman filter

The main concept here is that, the nominal state is chosen as the Kalman filter estimate \(\hat{\mathbf{x}} = \mathbf{x}_0\).

Combining \(\dot{\mathbf{x}}_0\) expression in Eq. 10.3 with the \(\Delta\dot{\hat{\mathbf{x}}}\) expression in Eq. 10.6 to obtain

\[ \dot{\mathbf{x}}_0 + \Delta\dot{\hat{\mathbf{x}}} = \mathbf{f}(\mathbf{x}_0,\mathbf{u}_0,\mathbf{w}_0,t) + \mathbf{A}\Delta\hat{\mathbf{x}}+\mathbf{K}\left[\mathbf{y}-\mathbf{y}_0-\mathbf{C}\left(\hat{\mathbf{x}}-\mathbf{x}_0\right)\right] \tag{10.7}\]

Now, choosing \(\mathbf{x}_0(t) = \hat{\mathbf{x}}(t)\) so, \(\Delta\hat{\mathbf{x}}(t) = \Delta\dot{\hat{\mathbf{x}}}(t) = 0\). Then the nominal measurement expression in Eq. 10.3 becomes

\[ \begin{align} \mathbf{y}_0 &= \mathbf{h}(\mathbf{x}_0,\mathbf{v}_0,t) \\ &= \mathbf{h}\left(\hat{\mathbf{x}},\mathbf{v}_0,t\right) . \end{align} \]

Then, the Eq. 10.7 becomes

\[ \dot{\hat{\mathbf{x}}} = \mathbf{f}\left(\hat{\mathbf{x}},\mathbf{u},\mathbf{w}_0,t\right) + \mathbf{K}\left[\mathbf{y}-\mathbf{h}\left(\hat{\mathbf{x}},\mathbf{v}_0,t\right)\right] \]

This is equivalent to linearized Kalman filter, except that \(\mathbf{x}_0 = \hat{\mathbf{x}}\) and the equations are rearranged to obtain \(\hat{\mathbf{x}}\) directly. This formulation inputs \(\mathbf{y}\) directly and outputs \(\hat{\mathbf{x}}\) directly.

10.2.2 Algorithm: continuous-time EKF

  1. The system equations are given as \[ \begin{align} \hat{\mathbf{x}} &= \mathbf{f}\left(\mathbf{x},\mathbf{u},\mathbf{w},t\right) \\ \mathbf{y} &= \mathbf{h}\left(\mathbf{x},\mathbf{v},t\right) \\ \mathbf{w} &\sim (0,\mathbf{Q}) \\ \mathbf{v} &\sim (0,\mathbf{R}) \end{align} \]

  2. Compute the following partial derivative matrices evaluated at the current state estimate \[ \begin{align} \mathbf{A} &= \frac{\partial \mathbf{f}}{\partial \mathbf{x}} \rvert_{\hat{\mathbf{x}}} \\ \mathbf{L} &= \frac{\partial \mathbf{f}}{\partial \mathbf{w}} \rvert_{\hat{\mathbf{x}}} \\ \mathbf{C} &= \frac{\partial \mathbf{h}}{\partial \mathbf{x}} \rvert_{\hat{\mathbf{x}}} \\ \mathbf{M} &= \frac{\partial \mathbf{h}}{\partial \mathbf{v}} \rvert_{\hat{\mathbf{x}}} \\ \end{align} \]

  3. Compute the following matrices \[ \begin{align} \tilde{\mathbf{Q}} &= \mathbf{L}\mathbf{Q}\mathbf{L}^T \\ \tilde{\mathbf{R}} &= \mathbf{M}\mathbf{R}\mathbf{M}^T \end{align} \]

  4. Execute the following Kalman filter equations \[ \begin{align} \hat{\mathbf{x}}(0) &= E[\mathbf{x}(0)] \\ \mathbf{P}(0) &= E\left[\left(\mathbf{x}(0)-\hat{\mathbf{x}}(0)\right)\left(\mathbf{x}(0)-\hat{\mathbf{x}}(0)\right)^T\right] \\ \dot{\hat{\mathbf{x}}} &= \mathbf{f}\left(\hat{\mathbf{x}},\mathbf{u},\mathbf{w}_0,t\right) + \mathbf{K}\left[\mathbf{y}-\mathbf{h}\left(\hat{\mathbf{x}},\mathbf{v}_0,t\right)\right] \\ \mathbf{K} &= \mathbf{P}\mathbf{C}^T\tilde{\mathbf{R}}^{-1} \\ \dot{\mathbf{P}} &= \mathbf{A}\mathbf{P}+\mathbf{P}\mathbf{A}^T+\tilde{\mathbf{Q}}-\mathbf{P}\mathbf{C}^T\tilde{\mathbf{R}}^{-1}\mathbf{C}\mathbf{P} \end{align} \]

    where, the nominal noise values are given as \(\mathbf{v}_0 = \mathbf{w}_0 = 0\).

10.2.3 hybrid extended Kalman filter

This is a hybrid case, i.e. the system dynamics are continuous and the measurements received are discrete in time.

The continuous time system with discrete time measurements is defined as \[ \begin{align} \dot{\mathbf{x}} &= \mathbf{f}\left(\mathbf{x},\mathbf{u},\mathbf{w},t\right) \\ \mathbf{y}_k &= \mathbf{h}_k\left(\mathbf{x}_k,\mathbf{v}_k\right) \\ \mathbf{w}(t) &\sim (0,\mathbf{Q}) \\ \mathbf{v}_k &\sim (0,\mathbf{R}_k) \end{align} \tag{10.8}\]

  • In between the measurement time \(t_k < t < t_{k+1}\), the state estimate is propagated according to the known nonlinear dynamics, and the covariance \(\mathbf{P}\) is propagated according to the equation in continous-time EKF given below. \[ \dot{\mathbf{P}} = \mathbf{A}\mathbf{P} + \mathbf{P}\mathbf{A}^T + \mathbf{L}\mathbf{Q}\mathbf{L}^T - \mathbf{P}\mathbf{C}^T\left(\mathbf{M}\mathbf{R}\mathbf{M}^T\right)^{-1}\mathbf{C}\mathbf{P} \] Except, that the matrix \(\mathbf{R}\) cannot be used, because the integration is performed between measurement times during which measurements are unavailable. This can be thought like having infinite covariance (=), so the above equation becomes \[ \begin{align} \dot{\hat{\mathbf{x}}} &= \mathbf{f}\left(\hat{\mathbf{x}},\mathbf{u},\mathbf{w}_0,t\right) \\ \dot{\mathbf{P}} &= \mathbf{A}\mathbf{P}+\mathbf{P}\mathbf{A}^T+\mathbf{L}\mathbf{Q}\mathbf{L}^T \end{align} \tag{10.9}\] Note that the term \(\mathbf{w}_0\) is nominal process noise, it is equal to 0.

  • At each measurement time, the update equations given in Eq. 3.5 redefined below as \[ \begin{align} \mathbf{K}_k &= \mathbf{P}_k^-\mathbf{H}_k^T\left(\mathbf{H}_k\mathbf{P}_k^-\mathbf{H}_k^T+\mathbf{M}_k\mathbf{R}_k\mathbf{M}_k^T\right)^{-1}\\ \hat{\mathbf{x}}_k^+ &= \hat{\mathbf{x}}_k^-+\mathbf{K}_k\left[\mathbf{y}_k-\mathbf{h}_k(\hat{\mathbf{x}}_k^-,\mathbf{v}_0,t_k)\right] \\ \mathbf{P}_k^+ &= \left(\mathbf{I}-\mathbf{K}_k\mathbf{H}_k\right)\mathbf{P}_k^-\left(\mathbf{I}-\mathbf{K}_k\mathbf{H}_k\right)^T+\mathbf{K}_k\mathbf{M}_k\mathbf{R}_k\mathbf{M}_k^T\mathbf{K}_k^T \end{align} \tag{10.10}\]

10.2.4 Algorithm: hybrid EKF

  1. Consider the system equations defined in Eq. 10.8 with continuous dynamics and discrete measurements.

  2. Initialize the filter as follows \[ \begin{align} \hat{\mathbf{x}}_0^+ &= E[\mathbf{x}_0] \\ \mathbf{P}_0^+ &= E\left[\left(\mathbf{x}_0 - \hat{\mathbf{x}}_0^+\right)\left(\mathbf{x}_0 - \hat{\mathbf{x}}_0^+\right)^T\right] \end{align} \]

  3. For \(k=1,2,\dots\), perform the following

    1. Integrate the state and its covariance from time \((k-1)^+\) to time \(k^-\) using Eq. 10.9 \[ \begin{align} \dot{\hat{\mathbf{x}}} &= \mathbf{f}\left(\hat{\mathbf{x}},\mathbf{u},0,t\right) \\ \dot{\mathbf{P}} &= \mathbf{A}\mathbf{P}+\mathbf{P}\mathbf{A}^T+\mathbf{L}\mathbf{Q}\mathbf{L}^T \end{align} \]

      This integration process begins wth \(\hat{\mathbf{x}}=\hat{\mathbf{x}}_{k-1}^+\) and \(\mathbf{P}=\mathbf{P}_{k-1}^+\). And in the end of integration we get \(\hat{\mathbf{x}}=\hat{\mathbf{x}}_k^-\) and \(\mathbf{P}=\mathbf{P}_k^-\).

    2. At time \(k\), incorporate the measurement \(\mathbf{y}_k\) into the state estimate and estimation covariance following Eq. 10.10 given below. \[ \begin{align} \mathbf{K}_k &= \mathbf{P}_k^-\mathbf{H}_k^T\left(\mathbf{H}_k\mathbf{P}_k^-\mathbf{H}_k^T+\mathbf{M}_k\mathbf{R}_k\mathbf{M}_k^T\right)^{-1}\\ \hat{\mathbf{x}}_k^+ &= \hat{\mathbf{x}}_k^-+\mathbf{K}_k\left[\mathbf{y}_k-\mathbf{h}_k(\hat{\mathbf{x}}_k^-,\mathbf{v}_0,t_k)\right] \\ \mathbf{P}_k^+ &= \left(\mathbf{I}-\mathbf{K}_k\mathbf{H}_k\right)\mathbf{P}_k^-\left(\mathbf{I}-\mathbf{K}_k\mathbf{H}_k\right)^T+\mathbf{K}_k\mathbf{M}_k\mathbf{R}_k\mathbf{M}_k^T\mathbf{K}_k^T \end{align} \]

      \(\mathbf{H}_k\) and \(\mathbf{M}_k\) are the partial derivatives of \(\mathbf{h}_k(\mathbf{x}_k,\mathbf{v}_k)\) with respect to \(\mathbf{x}_k\) and \(\mathbf{v}_k\), and are both evaluated at \(\hat{\mathbf{x}}_k^-\).

10.2.5 Discrete-time extended Kalman filter

In this case, both the system equations and measurements are considered to be discrete in time.

This is often encountered in practice. Even if the underlying system dynamics are continuous in time, the EKF usually needs to be implemented in a digital computer. There might not be enough computational power to integrate the system dynamics onboard, as required in continuous-time or hybrid EKF.

Let the system model be

\[ \begin{align} \mathbf{x}_k &= \mathbf{f}_{k-1}\left(\mathbf{x}_{k-1},\mathbf{u}_{k-1},\mathbf{w}_{k-1}\right) \\ \mathbf{y}_k &= \mathbf{h}_k\left(\mathbf{x}_k,\mathbf{v}_k\right) \\ \mathbf{w}_k &\sim (0,\mathbf{Q}_k) \\ \mathbf{v}_k &\sim (0,\mathbf{R}_k) \end{align} \tag{10.11}\]

Then, the Taylor series expansion is performed on the state equation about the point \(\mathbf{x}_{k-1} = \mathbf{x}_{k-1}^+\) and \(\mathbf{w}_{k-1}= 0\) to obtain the following:

\[ \begin{align} \mathbf{x}_k &= \mathbf{f}_{k-1}(\hat{\mathbf{x}}_{k-1}^+,\mathbf{u}_{k-1},0) + \frac{\partial \mathbf{f}_{k-1}}{\partial \mathbf{x}}\vert_{\hat{\mathbf{x}}_{k-1}} \left(\mathbf{x}_{k-1}-\hat{\mathbf{x}}_{k-1}^+\right) + \frac{\partial \mathbf{f}_{k-1}}{\partial \mathbf{w}}\vert_{\hat{\mathbf{x}}_{k-1}}\mathbf{w}_{k-1} \\ &= \mathbf{f}_{k-1}(\hat{\mathbf{x}}_{k-1},\mathbf{u}_{k-1},0) + \mathbf{F}_{k-1}\left(\mathbf{x}_{k-1}-\hat{\mathbf{x}}_{k-1}^+\right)+\mathbf{L}_{k-1}\mathbf{w}_{k-1} \\ &= \mathbf{F}_{k-1}\mathbf{x}_{k-1}+\left[\mathbf{f}_{k-1}\left(\hat{\mathbf{x}}_{k-1},\mathbf{u}_{k-1},0\right)-\mathbf{F}_{k-1}\hat{\mathbf{x}}_{k-1}^+\right]+\mathbf{L}_{k-1}\mathbf{w}_{k-1} \\ &= \mathbf{F}_{k-1}\mathbf{x}_{k-1}+\tilde{\mathbf{u}}_{k-1}+\tilde{\mathbf{w}}_{k-1} \end{align} \tag{10.12}\]

The known signal \(\tilde{\mathbf{u}}_k\) and the noise signal \(\tilde{\mathbf{w}}_k\) are defined as

\[ \begin{align} \tilde{\mathbf{u}}_k &= \mathbf{f}_k\left(\hat{\mathbf{x}}_k^+,\mathbf{u}_k,0\right) - \mathbf{F}_k\hat{\mathbf{x}}_k^+ \\ \tilde{\mathbf{w}}_k &\sim (0,\mathbf{L}_k\mathbf{Q}_k\mathbf{L}_k^T) \end{align} \]

Similarly, the measurement equation is linearized around \(\mathbf{x}_k=\hat{\mathbf{x}}_k^-\) and \(\mathbf{v}_k=0\) to obtain the following

\[ \begin{align} \mathbf{y}_k &= \mathbf{h}_k(\hat{\mathbf{x}}_k^-,0) + \frac{\partial \mathbf{h}_k}{\partial \mathbf{x}}\vert_{\hat{\mathbf{x}}_k^-} (\mathbf{x}_k-\hat{\mathbf{x}}_k^-) + \frac{\partial \mathbf{h}_k}{\partial \mathbf{v}}\vert_{\hat{\mathbf{x}}_k^-}\mathbf{v}_k \\ &= \mathbf{h}_k\left(\hat{\mathbf{x}}_k^-,0\right) + \mathbf{H}_k\left(\mathbf{x}_k-\hat{\mathbf{x}}_k^-\right) + \mathbf{M}_k\mathbf{v}_k \\ &= \mathbf{H}_k\mathbf{x}_k+\left[\mathbf{h}_k(\hat{\mathbf{x}}_k^-,0)-\mathbf{H}_k\hat{\mathbf{x}}_k^-\right]+\mathbf{M}_k \mathbf{v}_k \\ &= \mathbf{H}_k\mathbf{x}_k + \mathbf{z}_k + \tilde{\mathbf{v}}_k \end{align} \tag{10.13}\]

The known signal \(\mathbf{z}_k\) and the noise signal \(\tilde{\mathbf{v}}_k\) are defined as

\[ \begin{align} \mathbf{z}_k &= \mathbf{h}_k\left(\hat{\mathbf{x}}_k^-,0\right) - \mathbf{H}_k\hat{\mathbf{x}}_k^- \\ \tilde{\mathbf{v}}_k &\sim \left(0,\mathbf{M}_k\mathbf{R}_k\mathbf{M}_k^T\right) \end{align} \]

We have a linear state-space equation system in Eq. 10.12 and a linear measurement equation in Eq. 10.13. That means, we can use the standard Kalman filter equations to estimate the state. This results in the following equations for the discrete-time EKF.

\[ \begin{align} \mathbf{P}_k^- &= \mathbf{F}_{k-1}\mathbf{P}_{k-1}^+\mathbf{F}_{k-1}^T+\mathbf{L}_{k-1}\mathbf{Q}_{k-1}\mathbf{L}_{k-1}^T \\ \mathbf{K}_k &= \mathbf{P}_k^-\mathbf{H}_k^T\left(\mathbf{H}_k\mathbf{P}_k^-\mathbf{H}_k^T+\mathbf{M}_k\mathbf{R}_k\mathbf{M}_k^T\right)^{-1} \\ \hat{\mathbf{x}}_k^- &= \mathbf{f}_{k-1}\left(\hat{\mathbf{x}}_{k-1}^+,\mathbf{u}_{k-1},0\right) \\ \mathbf{z}_k &= \mathbf{h}_k(\hat{\mathbf{x}}_k^-,0) - \mathbf{H}_k\hat{\mathbf{x}}_k^- \\ \hat{\mathbf{x}}_k^+ &= \hat{\mathbf{x}}_k^- + \mathbf{K}_k(\mathbf{y}_k-\mathbf{H}_k\hat{\mathbf{x}}_k^--\mathbf{z}_k) \\ &= \hat{\mathbf{x}}_k^-+\mathbf{K}_k\left[\mathbf{y}_k-\mathbf{h}_k(\hat{\mathbf{x}}_k^-,0)\right] \\ \mathbf{P}_k^+ &= \left(\mathbf{I}-\mathbf{K}_k\mathbf{H}_k\right)\mathbf{P}_k^- \end{align} \tag{10.14}\]

10.2.6 Algorithm: discrete-time EKF

  1. The system and measurement equations are given as \[ \begin{align} \mathbf{x}_k &= \mathbf{f}_{k-1}\left(\mathbf{x}_{k-1},\mathbf{u}_{k-1},\mathbf{w}_{k-1}\right) \\ \mathbf{y}_k &= \mathbf{h}_k\left(\mathbf{x}_k,\mathbf{v}_k\right) \\ \mathbf{w}_k &\sim \left(0,\mathbf{Q}_k\right) \\ \mathbf{v}_k &\sim \left(0,\mathbf{R}_k\right) \end{align} \]

  2. Initialize the filter as \[ \begin{align} \hat{\mathbf{x}}_0^+ &= E(\mathbf{x}_0) \\ \mathbf{P}_0^+ &= E\left[\left(\mathbf{x}_0-\hat{\mathbf{x}}_0\right)\left(\mathbf{x}_0-\hat{\mathbf{x}}_0\right)^T\right] \end{align} \]

  3. For \(k=1,2,\dots\) do the following

    1. Compute the following partial derivative matrices: \[ \begin{align} \mathbf{F}_{k-1} &= \frac{\partial \mathbf{f}_{k-1}}{\partial \mathbf{x}} \lvert_{\hat{\mathbf{x}}_{k-1}^+} \\ \mathbf{L}_{k-1} &= \frac{\partial \mathbf{f}_{k-1}}{\partial \mathbf{w}} \lvert_{\hat{\mathbf{x}}_{k-1}^+} \\ \end{align} \]

    2. Perform the time update of the state estimate and estimation-error covariance as follows: \[ \begin{align} \mathbf{P}_k^- &= \mathbf{F}_{k-1}\mathbf{P}_{k-1}^+\mathbf{F}_{k-1}^T+\mathbf{L}_{k-1}\mathbf{Q}_{k-1}\mathbf{L}_{k-1}^T \\ \hat{\mathbf{x}}_k^- &= \mathbf{f}_{k-1}\left(\hat{\mathbf{x}}_{k-1}^+,\mathbf{u}_{k-1},0\right) \end{align} \]

    3. Compute the following partial derivative matrices: \[ \begin{align} \mathbf{H}_k &= \frac{\partial \mathbf{h}_k}{\partial \mathbf{x}}\lvert_{\hat{\mathbf{x}}_k^-} \\ \mathbf{M}_k &= \frac{\partial \mathbf{h}_k}{\partial \mathbf{v}}\lvert_{\hat{\mathbf{x}}_k^-} \end{align} \]

    4. Perform the measurement update of the state estimate and estimation-error covariance as follows: \[ \begin{align} \mathbf{K}_k &= \mathbf{P}_k^-\mathbf{H}_k^T\left(\mathbf{H}_k\mathbf{P}_k^-\mathbf{H}_k^T+\mathbf{M}_k\mathbf{R}_k\mathbf{M}_k^T\right)^{-1} \\ \hat{\mathbf{x}}_k^+ &= \hat{\mathbf{x}}_k^- + \mathbf{K}_k\left[\mathbf{y}_k-\mathbf{h}_k\left(\hat{\mathbf{x}}_k^-,0\right)\right] \\ \mathbf{P}_k^+ &= \left(\mathbf{I}-\mathbf{K}_k\mathbf{H}_k\right)\mathbf{P}_k^- \end{align} \]

10.3 Higher order Approaches

More refined linearization techniques can be used to reduce the linearization error in the EKF for highly nonlinear systems. Two such approaches are discussed in here.

  1. Iterated EKF
  2. second order EKF

10.3.1 Iterated EKF

  • In the EKF, the term \(\mathbf{h}(\mathbf{x}_k,\mathbf{v}_k)\) is approximated by expanding it in a Taylor series around \(\hat{\mathbf{x}}_k^-\). Based on that, the measurement-update equations were derived.

  • The reason for expanding \(\mathbf{h}(.)\) around \(\hat{\mathbf{x}}_k^-\) is because, that was the best estimate of \(\mathbf{x}_k\) we have before taking the measurement at time \(k\) into account.

  • But, after the posteriori estimate, we have \(\hat{\mathbf{x}}_k^+\) which is a better estimate of \(\mathbf{x}_k\). So, we can reduce the linearization error by reformulating the Taylor series expansion of \(\mathbf{h}(.)\) around the new estimate \(\hat{\mathbf{x}}_k^+\). This can be repeated \(N\) times.

The algorithm of iterated EKF is given as

  1. The system and measurement equations are given as \[ \begin{align} \mathbf{x}_k &= \mathbf{f}_{k-1}\left(\mathbf{x}_{k-1},\mathbf{u}_{k-1},\mathbf{w}_{k-1}\right) \\ \mathbf{y}_k &= \mathbf{h}_k\left(\mathbf{x}_k,\mathbf{v}_k\right) \\ \mathbf{w}_k &\sim \left(0,\mathbf{Q}_k\right) \\ \mathbf{v}_k &\sim \left(0,\mathbf{R}_k\right) \end{align} \]

  2. Initialize the filter as \[ \begin{align} \hat{\mathbf{x}}_0^+ &= E(\mathbf{x}_0) \\ \mathbf{P}_0^+ &= E\left[\left(\mathbf{x}_0-\hat{\mathbf{x}}_0\right)\left(\mathbf{x}_0-\hat{\mathbf{x}}_0\right)^T\right] \end{align} \]

  3. For \(k=1,2,\dots\) do the following

    1. Compute the following partial derivative matrices: \[ \begin{align} \mathbf{F}_{k-1} &= \frac{\partial \mathbf{f}_{k-1}}{\partial \mathbf{x}} \lvert_{\hat{\mathbf{x}}_{k-1}^+} \\ \mathbf{L}_{k-1} &= \frac{\partial \mathbf{f}_{k-1}}{\partial \mathbf{w}} \lvert_{\hat{\mathbf{x}}_{k-1}^+} \\ \end{align} \]

    2. Perform the time update of the state estimate and estimation-error covariance as follows: \[ \begin{align} \mathbf{P}_k^- &= \mathbf{F}_{k-1}\mathbf{P}_{k-1}^+\mathbf{F}_{k-1}^T+\mathbf{L}_{k-1}\mathbf{Q}_{k-1}\mathbf{L}_{k-1}^T \\ \hat{\mathbf{x}}_k^- &= \mathbf{f}_{k-1}\left(\hat{\mathbf{x}}_{k-1}^+,\mathbf{u}_{k-1},0\right) \end{align} \]

    3. Perform the measurement update by initializing the iterated EKF estimate to the standard EKF estimate: \[ \begin{align} \hat{\mathbf{x}}_{k,0}^+ &= \hat{\mathbf{x}}_k^- \\ \mathbf{P}_{k,0}^+ &= \mathbf{P}_{k}^- \end{align} \]

    4. For \(i=0,1,\dots,N\) evaluate the following equations (where \(N\) is the desired number of measurement-update iterations) \[ \begin{align} \mathbf{H}_{k,i} &= \frac{\partial \mathbf{h}}{\partial \mathbf{x}}\lvert_{\hat{\mathbf{x}}_{k,i}^+} \\ \mathbf{M}_{k,i} &= \frac{\partial \mathbf{h}}{\partial \mathbf{v}}\lvert_{\hat{\mathbf{x}}_{k,i}^+} \\ \mathbf{K}_{k,i} &= \mathbf{P}_k^-\mathbf{H}_{k,i}^T\left(\mathbf{H}_{k,i}\mathbf{P}_k^-\mathbf{H}_{k,i}^T+\mathbf{M}_{k,i}\mathbf{R}_k\mathbf{M}_{k,i}^T\right)^{-1} \\ \hat{\mathbf{x}}_{k,i+1}^+ &= \hat{\mathbf{x}}_k^- + \mathbf{K}_{k,i}\left[\mathbf{y}_k-\mathbf{h}(\hat{\mathbf{x}}_{k,i}^+)-\mathbf{H}_{k,i}(\hat{\mathbf{x}}_k^--\hat{\mathbf{x}}_{k,i}^+\right] \\ \mathbf{P}_{k,i+1}^+ &= \left(\mathbf{I}-\mathbf{K}_{k,i}\mathbf{H}_{k,i}\right)\mathbf{P}_k^- \end{align} \]

    5. The final a posteriori state estimate and estimation-error covariance are given as \[ \begin{align} \hat{\mathbf{x}}_k^+ &= \hat{\mathbf{x}}_{k,N+1}^+ \\ \mathbf{P}_k^+ &= \mathbf{P}_{k,N+1}^+ \end{align} \]

Note

In the above algorithm, in step d, both \(\hat{\mathbf{x}}_{k,i}^+,\mathbf{P}_{k,i}^+\) will be updated with \(\hat{\mathbf{x}}_k^-,\mathbf{P}_k^-\) only! Not from previous iteration values i.e. not from \(\hat{\mathbf{x}}_{k,i-1}^+,\mathbf{P}_{k,i-1}^+\).

This is because, during the iterative update, we’re refining the state estimate and not our belief about how uncertain the prior was.

The above algorithm was described for discrete-time EKF, but it is straightforward for the continuous and hybrid EKFs.

10.3.2 Second-order EKF

Here, the concept is that the Taylor series expansion is performed up to second order terms instead of first order like in the above case.

The algorithm and equations can be referred in section 13.3.2 in Simon (2006).

10.4 Parameter estimation

The state estimation theory can be used to not only estimate the states of a system, but also to estimate the unknown parameters of the system.

Suppose that we have a discrete-time system model, but the system matrices depend in a nonlinear way on an unknown parameter vector \(\mathbf{p}\).

\[ \begin{align} \mathbf{x}_{k+1} &= \mathbf{F}_k(\mathbf{p})\mathbf{x}_k + \mathbf{G}_k(\mathbf{p})\mathbf{u}_k + \mathbf{L}_k(\mathbf{p})\mathbf{w}_k \\ \mathbf{y}_k &= \mathbf{H}_k\mathbf{x}_k + \mathbf{v}_k \\ \mathbf{p} &= constant \end{align} \]

Here, it is assumed that the measurement is independent of \(\mathbf{p}\) for the notational convenience, but it is straightforward otherwise.

In order to estimate the parameter \(\mathbf{p}\), we first augment the state with the parameter to obtain the augmented state vector \(\mathbf{x}'\).

\[ \mathbf{x}_k' = \begin{bmatrix} \mathbf{x}_k \\ \mathbf{p}_k \\ \end{bmatrix} \]

Since, \(\mathbf{p}_k\) is constant, it can be modeled as \(\mathbf{p}_{k+1} = \mathbf{p}_k + \mathbf{w}_{pk}\), where \(\mathbf{w}_{pk}\) is a small artificial noise term that allows the Kalman filter to change its estimate of \(\mathbf{p}_k\).

The augmented system model can be written as

\[ \begin{align} \mathbf{x}_{k+1}' &= \begin{bmatrix} \mathbf{F}_k(\mathbf{p}_k)\mathbf{x}_k + \mathbf{G}_k(\mathbf{p}_k)\mathbf{u}_k + \mathbf{L}_k(\mathbf{p}_k)\mathbf{w}_k \\ \mathbf{p}_k + \mathbf{w}_{pk} \end{bmatrix} \\ &= \mathbf{f}(\mathbf{x}_k',\mathbf{u}_k,\mathbf{w}_k,\mathbf{w}_{pk}) \\ \mathbf{y}_k &= \begin{bmatrix} \mathbf{H}_k & 0 \end{bmatrix} \begin{bmatrix} \mathbf{x}_k \\ \mathbf{p}_k \end{bmatrix} + \mathbf{v}_k \end{align} \]

Again, \(\mathbf{w}_{pk}\) is an artificial noise term that we add to the system that allows Kalman filter to modify its estimate of parameter vector \(\mathbf{p}\).

A simple example problem was given in Simon (2006), Example 13.4, for the parameter estimation.