{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Probabilistic derivation of the Kalman filter\n", "This derivation of the filter is based on the explanation by [Pradu](https://math.stackexchange.com/questions/840662/an-explanation-of-the-kalman-filter).\n", "\n", "Our goal is to express the mean and covariance of the state vector $\\mathbf{x}_t\\sim \\mathcal{N}(\\mathbf{\\hat{x}}_t,\\mathbf{P}_t)$ that needs to be estimated, conditioned on the measurement vector $\\mathbf{y}_t=\\mathbf{m}$, i.e., the mean $\\mathbf{\\hat{x}}_{t|\\mathbf{y}_t=\\mathbf{m}}$ and covariance $\\mathbf{P}_{t|\\mathbf{y}_t=\\mathbf{m}}$.\n", "\n", "## System description\n", "\n", "The state transition equation is\n", "\n", "$$\n", "\\mathbf{x}_{t+1} = \\mathbf{A}_t\\mathbf{x}_t + \\mathbf{B}_t\\mathbf{u}_t + \\mathbf{q}_t,\n", "$$\n", "\n", "where $\\mathbf{A}_t$ is the state transition matrix, $\\mathbf{B}_t$ is the input matrix, $\\mathbf{u}_t \\sim \\mathcal{N}(\\mathbf{\\hat{u}}_t,\\mathbf{U}_t)$ is the input vector, and $\\mathbf{q}_t \\sim \\mathcal{N}(\\mathbf{0},\\mathbf{Q}_t)$ is the process noise. The measurement equation is\n", "\n", "$$\n", "\\mathbf{y}_{t} = \\mathbf{C}_t\\mathbf{x}_t + \\mathbf{D}_t \\mathbf{u}_t + \\mathbf{r}_t,\n", "$$\n", "\n", "where $\\mathbf{y}_{t}$ is the measurement, $\\mathbf{C}_t$ is the output matrix, $\\mathbf{D}_t$ is the feed-forward matrix, and $\\mathbf{r}_t \\sim \\mathcal{N}(\\mathbf{0},\\mathbf{R}_t)$ is the measurement noise. 
It is assumed that $\\mathbf{q}_t$, $\\mathbf{u}_t$, $\\mathbf{x}_t$, and $\\mathbf{r}_t$ are uncorrelated.\n", "\n", "## State propagation update\n", "\n", "The mean and covariance of the a priori state estimate (the state estimate before the measurement at the current time step is taken into account)\n", "follow directly from the mean and covariance of a linear combination of random variables (explained below),\n", "\n", "$$\n", "\\mathbf{\\hat{x}}_{t} = \\mathbf{A}_{t-1} \\mathbf{\\hat{x}}_{t-1} + \\mathbf{B}_{t-1} \\mathbf{\\hat{u}}_{t-1} \\\\\n", "\\mathbf{P}_t = \\mathbf{A}_{t-1} \\mathbf{P}_{t-1} { \\mathbf{A}_{t-1} }^T + \\mathbf{B}_{t-1} \\mathbf{U}_{t-1} {\\mathbf{B}_{t-1}}^T + \\mathbf{Q}_{t-1}\n", "$$\n", "\n", "\n", "## Mean and Covariance of a Linear Combination of Random Variables\n", "\n", "Let $\\mathbf{y} = \\mathbf{A}\\mathbf{x}_1 + \\mathbf{B}\\mathbf{x}_2$ where $\\mathbf{x}_1$ and $\\mathbf{x}_2$ are multivariate\n", "random variables. The expected value of $\\mathbf{y}$ is\n", "\n", "$$\n", "\\mathbb{E}\\left[{\\mathbf{y}}\\right] = \\mathbb{E}\\left[{ \\mathbf{A}\\mathbf{x}_1 + \\mathbf{B}\\mathbf{x}_2 }\\right] \\\\\n", " = \\mathbf{A}\\mathbb{E}\\left[{\\mathbf{x}_1}\\right] + \\mathbf{B}\\mathbb{E}\\left[{\\mathbf{x}_2}\\right].\n", "$$\n", "\n", "The covariance matrix for $\\mathbf{y}$ is\n", "\n", "$$\n", "\\sigma(\\mathbf{y},\\mathbf{y}) = \\mathbb{E}\\left[{ ( \\mathbf{y} - \\mathbb{E}\\left[{\\mathbf{y}}\\right] ) {( \\mathbf{y} - \\mathbb{E}\\left[{\\mathbf{y}}\\right] )}^T }\\right] \\\\\n", "= \\mathbb{E}\\left[{ ( \\mathbf{A}(\\mathbf{x}_1 -\\mathbb{E}\\left[{ \\mathbf{x}_1}\\right]) + \\mathbf{B}(\\mathbf{x}_2 -\\mathbb{E}\\left[{ \\mathbf{x}_2}\\right]) ) {( \\mathbf{A}(\\mathbf{x}_1 -\\mathbb{E}\\left[{ \\mathbf{x}_1}\\right]) + \\mathbf{B}(\\mathbf{x}_2 -\\mathbb{E}\\left[{ \\mathbf{x}_2}\\right]) )}^T }\\right] \\\\\n", "= \\mathbf{A} \\sigma(\\mathbf{x}_1,\\mathbf{x}_1){\\mathbf{A}}^T + \\mathbf{B} 
\\sigma(\\mathbf{x}_2,\\mathbf{x}_2){\\mathbf{B}}^T\n", " +\\mathbf{A} \\sigma(\\mathbf{x}_1,\\mathbf{x}_2){\\mathbf{B}}^T + \\mathbf{B} \\sigma(\\mathbf{x}_2,\\mathbf{x}_1){\\mathbf{A}}^T.\n", "$$\n", "\n", "If $\\mathbf{x}_1$ and $\\mathbf{x}_2$ are independent, then\n", "\n", "$$\n", "\\sigma(\\mathbf{y},\\mathbf{y}) = \\mathbf{A} \\sigma(\\mathbf{x}_1,\\mathbf{x}_1){\\mathbf{A}}^T + \\mathbf{B} \\sigma(\\mathbf{x}_2,\\mathbf{x}_2){\\mathbf{B}}^T.\n", "$$\n", "\n", "The covariance matrix for vector-valued random\n", "variables is defined as:\n", "$$\n", "\\sigma(\\mathbf{x},\\mathbf{y}) = \\mathbb{E}\\left[{ ( \\mathbf{x} - \\mathbb{E}\\left[{\\mathbf{x}}\\right] ) {( \\mathbf{y} - \\mathbb{E}\\left[{\\mathbf{y}}\\right] )}^T }\\right] \\\\\n", " = \\mathbb{E}\\left[{\\mathbf{x}{\\mathbf{y}}^T}\\right] - \\mathbb{E}\\left[{\\mathbf{x}}\\right]{\\mathbb{E}\\left[{\\mathbf{y}}\\right]}^T.\n", "$$\n", "\n", "\n", "## Measurement propagation update\n", "\n", "Let the joint distribution between $\\mathbf{x}_t$ and $\\mathbf{y}_t$ be\n", "\n", "$$\n", "\\left[ \\begin{array}{c}\n", "\\mathbf{x}_t \\\\\n", "\\mathbf{y}_t\n", "\\end{array} \\right] \\sim\n", "\\mathcal{N} \\left(\n", "\\left[ \\begin{array}{c}\n", "\\mathbf{\\hat{x}}_t \\\\\n", "\\mathbf{\\hat{y}}_t\n", "\\end{array}\n", "\\right],\n", "\\left[ \\begin{array}{cc}\n", "\\mathbf{\\Sigma}_{xx,t} & \\mathbf{\\Sigma}_{xy,t} \\\\\n", "\\mathbf{\\Sigma}_{yx,t} & \\mathbf{\\Sigma}_{yy,t}\n", "\\end{array} \\right]\n", "\\right)\n", "$$\n", "\n", "The expected measurement from state $\\mathbf{x}_t$, input $\\mathbf{u}_t$ is\n", "\n", "$$\n", "\\mathbf{\\hat{y}}_{t} = \\mathbf{C}_t\\mathbf{\\hat{x}}_t + \\mathbf{D}_t \\mathbf{\\hat{u}}_t.\n", "$$\n", "\n", "The covariance sub-matrices are\n", "$$\n", "\\mathbf{\\Sigma}_{xx,t} = \\mathbf{P}_t \\\\\n", "\\mathbf{\\Sigma}_{xy,t} = \\mathbb{E}\\left[{\\mathbf{x}_t {\\mathbf{y}_t}^T}\\right] - \\mathbb{E}\\left[{\\mathbf{x}_t}\\right] 
{\\mathbb{E}\\left[{\\mathbf{y}_t}\\right]}^T \\\\\n", " = \\mathbb{E}\\left[{\\mathbf{x}_t {\\left( \\mathbf{C}_t\\mathbf{x}_t + \\mathbf{D}_t \\mathbf{u}_t + \\mathbf{r}_t \\right)}^T}\\right] - \\mathbb{E}\\left[{\\mathbf{x}_t}\\right] {\\mathbb{E}\\left[{\\mathbf{C}_t\\mathbf{x}_t + \\mathbf{D}_t \\mathbf{u}_t + \\mathbf{r}_t}\\right]}^T \\\\\n", " = \\mathbf{P}_t { \\mathbf{C}_t }^T \\\\\n", "\\mathbf{\\Sigma}_{yx,t} = {\\mathbf{\\Sigma}_{xy,t}}^T = \\mathbf{C}_t \\mathbf{P}_t \\\\\n", "\\mathbf{\\Sigma}_{yy,t} = \\mathbf{C}_t\\mathbf{P}_t{\\mathbf{C}_t}^T + \\mathbf{D}_t \\mathbf{U}_t {\\mathbf{D}_t}^T + \\mathbf{R}_t\n", "$$\n", "\n", "where $\\mathbb{E}\\left[{\\mathbf{x}}\\right]$ denotes the expectation of $\\mathbf{x}$.\n", "\n", "We can verify that indeed\n", "\n", "$$\n", "\\mathbb{E}\\left[{\\mathbf{x}_t {\\left( \\mathbf{C}_t\\mathbf{x}_t + \\mathbf{D}_t \\mathbf{u}_t + \\mathbf{r}_t \\right)}^T}\\right] - \\mathbb{E}\\left[{\\mathbf{x}_t}\\right] {\\mathbb{E}\\left[{\\mathbf{C}_t\\mathbf{x}_t + \\mathbf{D}_t \\mathbf{u}_t + \\mathbf{r}_t}\\right]}^T = \\mathbf{P}_t { \\mathbf{C}_t }^T .\n", "$$\n", "\n", "The expectations of $\\mathbf{x}_t$, $\\mathbf{u}_t$ and $\\mathbf{r}_t$ are $\\mathbb{E}\\left[{\\mathbf{x}_t}\\right] = \\mathbf{\\hat{x}}_t$, $\\mathbb{E}\\left[{\\mathbf{u}_t}\\right] = \\mathbf{\\hat{u}}_t$ and $\\mathbb{E}\\left[{\\mathbf{r}_t}\\right] = \\mathbf{0}$.\n", "\n", "Expanding the product inside the first expectation, we get\n", "\n", "$$\n", "\\mathbb{E}\\left[{\\mathbf{x}_t {\\left( \\mathbf{C}_t\\mathbf{x}_t + \\mathbf{D}_t \\mathbf{u}_t + \\mathbf{r}_t \\right)}^T}\\right] =\n", "\\mathbb{E}\\left[\\mathbf{x}_t {\\mathbf{x}_t}^T { \\mathbf{C}_t}^T + \\mathbf{x}_t {\\mathbf{u}_t}^T {\\mathbf{D}_t}^T + \\mathbf{x}_t {\\mathbf{r}_t}^T\\right] = \\\\\n", "\\mathbb{E}\\left[\\mathbf{x}_t {\\mathbf{x}_t}^T\\right]{ \\mathbf{C}_t}^T + \\mathbb{E}\\left[\\mathbf{x}_t {\\mathbf{u}_t}^T\\right]{\\mathbf{D}_t}^T + \\mathbb{E}\\left[\\mathbf{x}_t 
{\\mathbf{r}_t}^T\\right]\n", "$$\n", "\n", "If we assume that $\\mathbf{x}_t$ is independent of both $\\mathbf{u}_t$ and $\\mathbf{r}_t$, then\n", "\n", "$$\n", "\\mathbb{E}\\left[\\mathbf{x}_t {\\mathbf{x}_t}^T\\right] = \\sigma(\\mathbf{x}_t,\\mathbf{x}_t) + \\mathbb{E}\\left[\\mathbf{x}_t\\right] \\mathbb{E}\\left[\\mathbf{x}_t\\right]^T =\n", "\\mathbf{P}_t + \\mathbf{\\hat{x}}_t {\\mathbf{\\hat{x}}_t}^T \\\\\n", "\\mathbb{E}\\left[\\mathbf{x}_t {\\mathbf{u}_t}^T\\right] = \\mathbb{E}\\left[\\mathbf{x}_t\\right] \\mathbb{E}\\left[\\mathbf{u}_t\\right]^T =\n", "\\mathbf{\\hat{x}}_t {\\mathbf{\\hat{u}}_t}^T \\\\\n", "\\mathbb{E}\\left[\\mathbf{x}_t {\\mathbf{r}_t}^T\\right] = \\mathbb{E}\\left[\\mathbf{x}_t\\right] \\mathbb{E}\\left[\\mathbf{r}_t\\right]^T =\n", "\\mathbf{\\hat{x}}_t \\mathbf{0}^T = \\mathbf{0}\n", "$$\n", "\n", "hence the first term is\n", "\n", "$$\n", "\\mathbf{P}_t {\\mathbf{C}_t}^T + \\mathbf{\\hat{x}}_t {\\mathbf{\\hat{x}}_t}^T {\\mathbf{C}_t}^T + \\mathbf{\\hat{x}}_t {\\mathbf{\\hat{u}}_t}^T {\\mathbf{D}_t}^T .\n", "$$\n", "\n", "The second factor of the second term is\n", "\n", "$$\n", "{\\mathbb{E}\\left[{\\mathbf{C}_t\\mathbf{x}_t + \\mathbf{D}_t \\mathbf{u}_t + \\mathbf{r}_t}\\right]}^T = \\\\\n", "\\mathbb{E}\\left[\\mathbf{x}_t\\right]^T {\\mathbf{C}_t}^T + \\mathbb{E}\\left[\\mathbf{u}_t\\right]^T {\\mathbf{D}_t}^T + \\mathbb{E}\\left[\\mathbf{r}_t\\right]^T = \\\\\n", "{\\mathbf{\\hat{x}}_t}^T {\\mathbf{C}_t}^T + {\\mathbf{\\hat{u}}_t}^T {\\mathbf{D}_t}^T ,\n", "$$\n", "\n", "therefore the second term evaluates to\n", "\n", "$$\n", "\\mathbf{\\hat{x}}_t {\\mathbf{\\hat{x}}_t}^T {\\mathbf{C}_t}^T + \\mathbf{\\hat{x}}_t {\\mathbf{\\hat{u}}_t}^T {\\mathbf{D}_t}^T .\n", "$$\n", "\n", "If we subtract this term from the first, we get $\\mathbf{P}_t {\\mathbf{C}_t}^T$ as expected.\n", "\n", "Given a measurement $\\mathbf{m}$, the conditional distribution of $\\mathbf{x}_t$ given $\\mathbf{y}_t=\\mathbf{m}$ is a normal distribution with the following 
properties\n", "\n", "$$\n", "\\mathbf{\\hat{x}}_{t|\\mathbf{y}_t=\\mathbf{m}} = \\mathbb{E}\\left[{\\mathbf{x}_t|\\mathbf{y}_t=\\mathbf{m}}\\right] = \\mathbf{\\hat{x}}_t + \\mathbf{\\Sigma}_{xy,t} \\mathbf{\\Sigma}_{yy,t}^{-1}( \\mathbf{m} - \\mathbf{\\hat{y}}_t ) \\\\\n", "\\mathbf{P}_{t|\\mathbf{y}_t=\\mathbf{m}} = \\text{Var}(\\mathbf{x}_t|\\mathbf{y}_t=\\mathbf{m}) = \\mathbf{\\Sigma}_{xx,t} - \\mathbf{\\Sigma}_{xy,t} \\mathbf{\\Sigma}_{yy,t}^{-1} \\mathbf{\\Sigma}_{yx,t} \\\\\n", "$$\n", "\n", "Substituting, we get the final equations of the Kalman filter\n", "$$\n", "\\mathbf{\\hat{x}}_{t|\\mathbf{y}_t=\\mathbf{m}} = \\mathbf{\\hat{x}}_t + \\mathbf{P}_t { \\mathbf{C}_t }^T \\left( \\mathbf{C}_t\\mathbf{P}_t{\\mathbf{C}_t}^T + \\mathbf{D}_t \\mathbf{U}_t {\\mathbf{D}_t}^T + \\mathbf{R}_t\\right)^{-1}( \\mathbf{m} - \\mathbf{C}_t\\mathbf{\\hat{x}}_t - \\mathbf{D}_t \\mathbf{\\hat{u}}_t ) \\\\\n", "\\mathbf{P}_{t|\\mathbf{y}_t=\\mathbf{m}} = \\mathbf{P}_t - \\mathbf{P}_t { \\mathbf{C}_t }^T \\left( \\mathbf{C}_t\\mathbf{P}_t{\\mathbf{C}_t}^T + \\mathbf{D}_t \\mathbf{U}_t {\\mathbf{D}_t}^T + \\mathbf{R}_t\\right)^{-1} \\mathbf{C}_t \\mathbf{P}_t \\\\\n", "$$\n", "\n", "All that remains is to derive the conditional expectation and covariance of jointly normal random variables, which were used in the derivation above. The derivation below allows the observed value to be uncertain, with covariance $\\mathbf{T}$; the equations above are the special case of an exactly known measurement, $\\mathbf{\\mu}_{\\mathbf{t}}=\\mathbf{m}$ and $\\mathbf{T}=\\mathbf{0}$.\n", "\n", "## Conditional Density of Multivariate Normal Random Variables\n", "\n", "Let $\\mathbf{x}$, $\\mathbf{y}$ be jointly normal with means $\\mathbb{E}\\left[{\\mathbf{x}}\\right]=\\mathbf{\\mu}_{\\mathbf{x}}$, $\\mathbb{E}\\left[{\\mathbf{y}}\\right] = \\mathbf{\\mu}_{\\mathbf{y}}$ and covariance\n", "$$\n", "\\left[ \\begin{array}{cc}\n", "\\mathbf{\\Sigma}_{xx} & \\mathbf{\\Sigma}_{xy} \\\\\n", "\\mathbf{\\Sigma}_{yx} & \\mathbf{\\Sigma}_{yy}\n", "\\end{array} \\right].\n", "$$\n", "\n", "Let us introduce a new variable $\\mathbf{z}$, which is a linear combination of variables $\\mathbf{x}$ and $\\mathbf{y}$. 
It is Gaussian, since it is a linear combination of jointly Gaussian random variables:\n", "$$\n", "\\mathbf{z} = \\mathbf{x} - \\mathbf{\\Sigma}_{xy} \\mathbf{\\Sigma}_{yy}^{-1} \\mathbf{y}.\n", "$$\n", "\n", "$\\mathbf{z}$ and $\\mathbf{y}$ are independent because $\\mathbf{z}$ and $\\mathbf{y}$ are jointly normal and\n", "\n", "$$\n", "\\sigma( \\mathbf{z}, \\mathbf{y} ) = \\sigma( \\mathbf{x} - \\mathbf{\\Sigma}_{xy} \\mathbf{\\Sigma}_{yy}^{-1} \\mathbf{y}, \\mathbf{y} ) \\\\\n", "\\sigma( \\mathbf{z}, \\mathbf{y} ) = \\mathbb{E}\\left[{ \\left(\\mathbf{x} - \\mathbf{\\Sigma}_{xy} \\mathbf{\\Sigma}_{yy}^{-1} \\mathbf{y} \\right) {\\mathbf{y}}^T }\\right] - \\mathbb{E}\\left[{\\mathbf{x} - \\mathbf{\\Sigma}_{xy} \\mathbf{\\Sigma}_{yy}^{-1} \\mathbf{y} }\\right] {\\mathbb{E}\\left[{\\mathbf{y}}\\right]}^T \\\\\n", "\\sigma( \\mathbf{z}, \\mathbf{y} ) = \\mathbb{E}\\left[{ \\mathbf{x}{\\mathbf{y}}^T }\\right] - \\mathbf{\\Sigma}_{xy} \\mathbf{\\Sigma}_{yy}^{-1} \\mathbb{E}\\left[{ \\mathbf{y} {\\mathbf{y}}^T}\\right] - \\mathbb{E}\\left[{\\mathbf{x}}\\right]{\\mathbb{E}\\left[{\\mathbf{y}}\\right]}^T + \\mathbf{\\Sigma}_{xy} \\mathbf{\\Sigma}_{yy}^{-1} \\mathbb{E}\\left[{\\mathbf{y}}\\right] {\\mathbb{E}\\left[{\\mathbf{y}}\\right]}^T \\\\\n", "\\sigma( \\mathbf{z}, \\mathbf{y} ) = \\mathbf{\\Sigma}_{xy} - \\mathbf{\\Sigma}_{xy} \\mathbf{\\Sigma}_{yy}^{-1} \\mathbf{\\Sigma}_{yy} = \\mathbf{0}\\\\\n", "$$\n", "\n", "Let $\\mathbf{t}\\sim \\mathcal{N}(\\mathbf{\\mu}_{\\mathbf{t}},\\mathbf{T})$ be the observed value of $\\mathbf{y}$, which may itself be uncertain. Note that because $\\mathbf{z}$ and $\\mathbf{y}$ are independent, $\\mathbb{E}\\left[{\\mathbf{z}|\\mathbf{y}=\\mathbf{t}}\\right] = \\mathbb{E}\\left[{\\mathbf{z}}\\right] = \\mathbf{\\mu}_{\\mathbf{x}} - \\mathbf{\\Sigma}_{xy} \\mathbf{\\Sigma}_{yy}^{-1}\\mathbf{\\mu}_{\\mathbf{y}}$. 
The conditional expectation of $\\mathbf{x}$ given $\\mathbf{y}=\\mathbf{t}$ is\n", "\n", "$$\n", "\\mathbb{E}\\left[{\\mathbf{x}|\\mathbf{y}=\\mathbf{t}}\\right] = \\mathbb{E}\\left[{\\mathbf{z}+\\mathbf{\\Sigma}_{xy} \\mathbf{\\Sigma}_{yy}^{-1} \\mathbf{y}|\\mathbf{y}=\\mathbf{t}}\\right] \\\\\n", "\\mathbb{E}\\left[{\\mathbf{x}|\\mathbf{y}=\\mathbf{t}}\\right] = \\mathbb{E}\\left[{\\mathbf{z}|\\mathbf{y}=\\mathbf{t}}\\right]+\\mathbf{\\Sigma}_{xy} \\mathbf{\\Sigma}_{yy}^{-1} \\mathbb{E}\\left[{\\mathbf{y}|\\mathbf{y}=\\mathbf{t}}\\right] \\\\\n", "\\mathbb{E}\\left[{\\mathbf{x}|\\mathbf{y}=\\mathbf{t}}\\right] = \\mathbf{\\mu}_{\\mathbf{x}} - \\mathbf{\\Sigma}_{xy} \\mathbf{\\Sigma}_{yy}^{-1}\\mathbf{\\mu}_{\\mathbf{y}}+\\mathbf{\\Sigma}_{xy} \\mathbf{\\Sigma}_{yy}^{-1} \\mathbf{\\mu}_{\\mathbf{t}} \\\\\n", "\\mathbb{E}\\left[{\\mathbf{x}|\\mathbf{y}=\\mathbf{t}}\\right] = \\mathbf{\\mu}_{\\mathbf{x}} + \\mathbf{\\Sigma}_{xy} \\mathbf{\\Sigma}_{yy}^{-1}( \\mathbf{\\mu}_{\\mathbf{t}} - \\mathbf{\\mu}_{\\mathbf{y}} ) \\\\\n", "$$\n", "\n", "The conditional covariance is\n", "\n", "$$\n", "\\text{Var}(\\mathbf{x}|\\mathbf{y}=\\mathbf{t}) = \\text{Var}(\\mathbf{z}+\\mathbf{\\Sigma}_{xy} \\mathbf{\\Sigma}_{yy}^{-1} \\mathbf{y}|\\mathbf{y}=\\mathbf{t}) \\\\\n", "\\text{Var}(\\mathbf{x}|\\mathbf{y}=\\mathbf{t}) = \\text{Var}(\\mathbf{z}|\\mathbf{y}=\\mathbf{t}) + \\text{Var}(\\mathbf{\\Sigma}_{xy} \\mathbf{\\Sigma}_{yy}^{-1} \\mathbf{y}|\\mathbf{y}=\\mathbf{t}) \\\\\n", "\\text{Var}(\\mathbf{x}|\\mathbf{y}=\\mathbf{t}) = \\text{Var}(\\mathbf{z}) + \\mathbf{\\Sigma}_{xy} \\mathbf{\\Sigma}_{yy}^{-1} \\mathbf{T} \\mathbf{\\Sigma}_{yy}^{-1} \\mathbf{\\Sigma}_{yx} \\\\\n", "\\text{Var}(\\mathbf{x}|\\mathbf{y}=\\mathbf{t}) = \\text{Var}(\\mathbf{x} - \\mathbf{\\Sigma}_{xy} \\mathbf{\\Sigma}_{yy}^{-1} \\mathbf{y}) + \\mathbf{\\Sigma}_{xy} \\mathbf{\\Sigma}_{yy}^{-1} \\mathbf{T} \\mathbf{\\Sigma}_{yy}^{-1} \\mathbf{\\Sigma}_{yx} \\\\\n", "\\text{Var}(\\mathbf{x}|\\mathbf{y}=\\mathbf{t}) = 
\\mathbf{\\Sigma}_{xx} + \\mathbf{\\Sigma}_{xy} \\mathbf{\\Sigma}_{yy}^{-1} \\mathbf{\\Sigma}_{yy} {(\\mathbf{\\Sigma}_{xy} \\mathbf{\\Sigma}_{yy}^{-1})}^T - \\mathbf{\\Sigma}_{xy} {(\\mathbf{\\Sigma}_{xy}\\mathbf{\\Sigma}_{yy}^{-1})}^T - \\mathbf{\\Sigma}_{xy} \\mathbf{\\Sigma}_{yy}^{-1} \\mathbf{\\Sigma}_{yx} + \\mathbf{\\Sigma}_{xy} \\mathbf{\\Sigma}_{yy}^{-1} \\mathbf{T} \\mathbf{\\Sigma}_{yy}^{-1} \\mathbf{\\Sigma}_{yx} \\\\\n", "\\text{Var}(\\mathbf{x}|\\mathbf{y}=\\mathbf{t}) = \\mathbf{\\Sigma}_{xx} + \\mathbf{\\Sigma}_{xy} \\mathbf{\\Sigma}_{yy}^{-1} \\mathbf{\\Sigma}_{yx} - \\mathbf{\\Sigma}_{xy} \\mathbf{\\Sigma}_{yy}^{-1}\\mathbf{\\Sigma}_{yx} - \\mathbf{\\Sigma}_{xy} \\mathbf{\\Sigma}_{yy}^{-1} \\mathbf{\\Sigma}_{yx} + \\mathbf{\\Sigma}_{xy} \\mathbf{\\Sigma}_{yy}^{-1} \\mathbf{T} \\mathbf{\\Sigma}_{yy}^{-1} \\mathbf{\\Sigma}_{yx} \\\\\n", "\\text{Var}(\\mathbf{x}|\\mathbf{y}=\\mathbf{t}) = \\mathbf{\\Sigma}_{xx} - \\mathbf{\\Sigma}_{xy} \\mathbf{\\Sigma}_{yy}^{-1} \\mathbf{\\Sigma}_{yx} + \\mathbf{\\Sigma}_{xy} \\mathbf{\\Sigma}_{yy}^{-1} \\mathbf{T} \\mathbf{\\Sigma}_{yy}^{-1} \\mathbf{\\Sigma}_{yx} \\\\\n", "\\text{Var}(\\mathbf{x}|\\mathbf{y}=\\mathbf{t}) = \\mathbf{\\Sigma}_{xx} + \\mathbf{\\Sigma}_{xy} \\mathbf{\\Sigma}_{yy}^{-1} \\left(\\mathbf{T} - \\mathbf{\\Sigma}_{yy} \\right) \\mathbf{\\Sigma}_{yy}^{-1} \\mathbf{\\Sigma}_{yx} \\\\\n", "$$\n", "\n", "In summary,\n", "\n", "$$\n", "\\mathbb{E}\\left[{\\mathbf{x}|\\mathbf{y}=\\mathbf{t}}\\right] = \\mathbf{\\mu}_{\\mathbf{x}} + \\mathbf{\\Sigma}_{xy} \\mathbf{\\Sigma}_{yy}^{-1}( \\mathbf{\\mu}_{\\mathbf{t}} - \\mathbf{\\mu}_{\\mathbf{y}} ) \\\\\n", "\\text{Var}(\\mathbf{x}|\\mathbf{y}=\\mathbf{t}) = \\mathbf{\\Sigma}_{xx} + \\mathbf{\\Sigma}_{xy} \\mathbf{\\Sigma}_{yy}^{-1}\\left(\\mathbf{T} - \\mathbf{\\Sigma}_{yy}\\right) \\mathbf{\\Sigma}_{yy}^{-1} \\mathbf{\\Sigma}_{yx} \\\\\n", "\\mathbb{E}\\left[{\\mathbf{y}|\\mathbf{x}=\\mathbf{s}}\\right] = \\mathbf{\\mu}_{\\mathbf{y}} + 
\\mathbf{\\Sigma}_{yx} \\mathbf{\\Sigma}_{xx}^{-1}( \\mathbf{\\mu}_{\\mathbf{s}} - \\mathbf{\\mu}_{\\mathbf{x}} ) \\\\\n", "\\text{Var}(\\mathbf{y}|\\mathbf{x}=\\mathbf{s}) = \\mathbf{\\Sigma}_{yy} + \\mathbf{\\Sigma}_{yx} \\mathbf{\\Sigma}_{xx}^{-1}\\left(\\mathbf{S} - \\mathbf{\\Sigma}_{xx}\\right) \\mathbf{\\Sigma}_{xx}^{-1} \\mathbf{\\Sigma}_{xy}.\n", "$$\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.4" } }, "nbformat": 4, "nbformat_minor": 2 }