{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "
\n", " \n", " \"QuantEcon\"\n", " \n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Classical Filtering With Linear Algebra" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Contents\n", "\n", "- [Classical Filtering With Linear Algebra](#Classical-Filtering-With-Linear-Algebra) \n", " - [Overview](#Overview) \n", " - [Infinite Horizon Prediction and Filtering Problems](#Infinite-Horizon-Prediction-and-Filtering-Problems) \n", " - [Finite Dimensional Prediction](#Finite-Dimensional-Prediction) \n", " - [Combined Finite Dimensional Control and Prediction](#Combined-Finite-Dimensional-Control-and-Prediction) \n", " - [Exercises](#Exercises) " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Overview\n", "\n", "This is a sequel to the earlier lecture [Classical Control with Linear Algebra](.html).\n", "\n", "That lecture used linear algebra – in particular, the [LU decomposition](https://en.wikipedia.org/wiki/LU_decomposition) – to formulate and solve a class of linear-quadratic optimal control problems.\n", "\n", "In this lecture, we’ll be using a closely related decomposition, the [Cholesky decomposition](https://en.wikipedia.org/wiki/Cholesky_decomposition) , to solve linear prediction and filtering problems.\n", "\n", "We exploit the useful fact that there is an intimate connection between two superficially different classes of problems:\n", "\n", "- deterministic linear-quadratic (LQ) optimal control problems \n", "- linear least squares prediction and filtering problems \n", "\n", "\n", "The first class of problems involves no randomness, while the second is all about randomness.\n", "\n", "Nevertheless, essentially the same mathematics solves both type of problem.\n", "\n", "This connection, which is often termed “duality,” is present whether one uses “classical” or “recursive” solution procedures.\n", "\n", "In fact we saw duality at work earlier when we formulated control and prediction problems recursively in lectures [LQ dynamic programming problems](../dynamic_programming/lqcontrol.html), [A first look at the Kalman filter](../tools_and_techniques/kalman.html), and [The permanent income model](../dynamic_programming/perm_income.html).\n", "\n", "A useful consequence of duality is that\n", "\n", "- With every LQ control problem there is implicitly affiliated a linear least squares prediction or filtering problem. \n", "- With every linear least squares prediction or filtering problem there is implicitly affiliated a LQ control problem. \n", "\n", "\n", "An understanding of these connections has repeatedly proved useful in cracking interesting applied problems.\n", "\n", "For example, Sargent [[Sar87]](../zreferences.html#sargent1987) [chs. IX, XIV] and Hansen and Sargent [[HS80]](../zreferences.html#hansar1980) formulated and solved control and filtering problems using $ z $-transform methods.\n", "\n", "In this lecture we investigate these ideas using mostly elementary linear algebra." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### References\n", "\n", "Useful references include [[Whi63]](../zreferences.html#whittle1963), [[HS80]](../zreferences.html#hansar1980), [[Orf88]](../zreferences.html#orfanidisoptimum1988), [[AP91]](../zreferences.html#athanasios1991), and [[Mut60]](../zreferences.html#muth1960)." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Setup" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "hide-output": true }, "outputs": [], "source": [ "using InstantiateFromURL\n", "# optionally add arguments to force installation: instantiate = true, precompile = true\n", "github_project(\"QuantEcon/quantecon-notebooks-julia\", version = \"0.8.0\")" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "hide-output": false }, "outputs": [], "source": [ "using LinearAlgebra, Statistics" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Infinite Horizon Prediction and Filtering Problems\n", "\n", "We pose two related prediction and filtering problems.\n", "\n", "We let $ Y_t $ be a univariate $ m^{\\rm th} $ order moving average, covariance stationary stochastic process,\n", "\n", "\n", "\n", "$$\n", "Y_t = d(L) u_t \\tag{1}\n", "$$\n", "\n", "where $ d(L) = \\sum^m_{j=0} d_j L^j $, and $ u_t $ is a serially uncorrelated stationary random process satisfying\n", "\n", "\n", "\n", "$$\n", "\\begin{aligned}\n", " \\mathbb{E} u_t &= 0\\\\\n", " \\mathbb{E} u_t u_s &=\n", " \\begin{cases}\n", " 1 & \\text{ if } t = s \\\\\n", " 0 & \\text{ otherwise}\n", " \\end{cases}\n", "\\end{aligned} \\tag{2}\n", "$$\n", "\n", "We impose no conditions on the zeros of $ d(z) $.\n", "\n", "A second covariance stationary process is $ X_t $ given by\n", "\n", "\n", "\n", "$$\n", "X_t = Y_t + \\varepsilon_t \\tag{3}\n", "$$\n", "\n", "where $ \\varepsilon_t $ is a serially uncorrelated stationary\n", "random process with $ \\mathbb{E} \\varepsilon_t = 0 $ and $ \\mathbb{E} \\varepsilon_t \\varepsilon_s $ = $ 0 $ for all distinct $ t $ and $ s $.\n", "\n", "We also assume that $ \\mathbb{E} \\varepsilon_t u_s = 0 $ for all $ t $ and $ s $.\n", "\n", "The **linear least squares prediction problem** is to find the $ L_2 $\n", "random variable $ \\hat X_{t+j} $ among linear combinations of\n", "$ \\{ X_t,\\ X_{t-1},\n", "\\ldots \\} $ that minimizes $ \\mathbb{E}(\\hat X_{t+j} - X_{t+j})^2 $.\n", "\n", "That is, the problem is to find a $ \\gamma_j (L) = \\sum^\\infty_{k=0} \\gamma_{jk}\\, L^k $ such that $ \\sum^\\infty_{k=0} \\vert \\gamma_{jk} \\vert^2 < \\infty $ and $ \\mathbb{E} [\\gamma_j \\, (L) X_t -X_{t+j}]^2 $ is minimized.\n", "\n", "The **linear least squares filtering problem** is to find a $ b\\,(L) = \\sum^\\infty_{j=0} b_j\\, L^j $ such that $ \\sum^\\infty_{j=0}\\vert b_j \\vert^2 < \\infty $ and $ \\mathbb{E} [b\\, (L) X_t -Y_t ]^2 $ is minimized.\n", "\n", "Interesting versions of these problems related to the permanent income theory were studied by [[Mut60]](../zreferences.html#muth1960)." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Problem formulation\n", "\n", "These problems are solved as follows.\n", "\n", "The covariograms of $ Y $ and $ X $ and their cross covariogram are, respectively,\n", "\n", "\n", "\n", "$$\n", "\\begin{aligned}\n", " C_X (\\tau) &= \\mathbb{E}X_t X_{t-\\tau} \\\\\n", " C_Y (\\tau) &= \\mathbb{E}Y_t Y_{t-\\tau} \\qquad \\tau = 0, \\pm 1, \\pm 2, \\ldots \\\\\n", " C_{Y,X} (\\tau) &= \\mathbb{E}Y_t X_{t-\\tau}\n", "\\end{aligned} \\tag{4}\n", "$$\n", "\n", "The covariance and cross covariance generating functions are defined as\n", "\n", "\n", "\n", "$$\n", "\\begin{aligned}\n", " g_X(z) &= \\sum^\\infty_{\\tau = - \\infty} C_X (\\tau) z^\\tau \\\\\n", " g_Y(z) &= \\sum^\\infty_{\\tau = - \\infty} C_Y (\\tau) z^\\tau \\\\\n", " g_{YX} (z) &= \\sum^\\infty_{\\tau = - \\infty} C_{YX} (\\tau) z^\\tau\n", "\\end{aligned} \\tag{5}\n", "$$\n", "\n", "The generating functions can be computed by using the following facts.\n", "\n", "Let $ v_{1t} $ and $ v_{2t} $ be two mutually and serially uncorrelated white noises with unit variances.\n", "\n", "That is, $ \\mathbb{E}v^2_{1t} = \\mathbb{E}v^2_{2t} = 1, \\mathbb{E}v_{1t} = \\mathbb{E}v_{2t} = 0, \\mathbb{E}v_{1t} v_{2s} = 0 $ for all $ t $ and $ s $, $ \\mathbb{E}v_{1t} v_{1t-j} = \\mathbb{E}v_{2t} v_{2t-j} = 0 $ for all $ j \\not = 0 $.\n", "\n", "Let $ x_t $ and $ y_t $ be two random process given by\n", "\n", "$$\n", "\\begin{aligned}\n", " y_t &= A(L) v_{1t} + B(L) v_{2t} \\\\\n", " x_t &= C(L) v_{1t} + D(L) v_{2t}\n", "\\end{aligned}\n", "$$\n", "\n", "Then, as shown for example in [[Sar87]](../zreferences.html#sargent1987) [ch. XI], it is true that\n", "\n", "\n", "\n", "$$\n", "\\begin{aligned}\n", " g_y(z) &= A(z) A(z^{-1}) + B (z) B(z^{-1}) \\\\\n", " g_x (z) &= C(z) C(z^{-1}) + D(z) D(z^{-1}) \\\\\n", " g_{yx} (z) &= A(z) C(z^{-1}) + B(z) D(z^{-1})\n", "\\end{aligned} \\tag{6}\n", "$$\n", "\n", "Applying these formulas to [(1)](#equation-eq-24) – [(4)](#equation-eq-27), we have\n", "\n", "\n", "\n", "$$\n", "\\begin{aligned}\n", " g_Y(z) &= d(z)d(z^{-1}) \\\\\n", " g_X(z) &= d(z)d(z^{-1}) + h\\\\\n", " g_{YX} (z) &= d(z) d(z^{-1})\n", "\\end{aligned} \\tag{7}\n", "$$\n", "\n", "The key step in obtaining solutions to our problems is to factor the covariance generating function $ g_X(z) $ of $ X $.\n", "\n", "The solutions of our problems are given by formulas due to Wiener and Kolmogorov.\n", "\n", "These formulas utilize the Wold moving average representation of the $ X_t $ process,\n", "\n", "\n", "\n", "$$\n", "X_t = c\\,(L)\\,\\eta_t \\tag{8}\n", "$$\n", "\n", "where $ c(L) = \\sum^m_{j=0} c_j\\, L^j $, with\n", "\n", "\n", "\n", "$$\n", "c_0 \\eta_t\n", "= X_t - \\mathbb{\\hat E} [X_t \\vert X_{t-1}, X_{t-2}, \\ldots] \\tag{9}\n", "$$\n", "\n", "Here $ \\mathbb{\\hat E} $ is the linear least squares projection operator.\n", "\n", "Equation [(9)](#equation-eq-32) is the condition that $ c_0 \\eta_t $ can be the one-step ahead error in predicting $ X_t $ from its own past values.\n", "\n", "Condition [(9)](#equation-eq-32) requires that $ \\eta_t $ lie in the closed\n", "linear space spanned by $ [X_t,\\ X_{t-1}, \\ldots] $.\n", "\n", "This will be true if and only if the zeros of $ c(z) $ do not lie inside the unit circle.\n", "\n", "It is an implication of [(9)](#equation-eq-32) that $ \\eta_t $ is a serially\n", "uncorrelated random process, and that a normalization can be imposed so\n", "that $ \\mathbb{E}\\eta_t^2 = 1 $.\n", "\n", "Consequently, an implication of 
[(8)](#equation-eq-31) is\n", "that the covariance generating function of $ X_t $ can be expressed\n", "as\n", "\n", "\n", "\n", "$$\n", "g_X(z) = c\,(z)\,c\,(z^{-1}) \tag{10}\n", "$$\n", "\n", "It remains to discuss how $ c(L) $ is to be computed.\n", "\n", "Combining [(6)](#equation-eq-29) and [(10)](#equation-eq-33) gives\n", "\n", "\n", "\n", "$$\n", "d(z) \,d(z^{-1}) + h = c \, (z) \,c\,(z^{-1}) \tag{11}\n", "$$\n", "\n", "Therefore, we have already shown constructively how to factor the covariance generating function $ g_X(z) = d(z)\,d\,(z^{-1}) + h $.\n", "\n", "We now introduce the **annihilation operator**:\n", "\n", "\n", "\n", "$$\n", "\left[\n", " \sum^\infty_{j= - \infty} f_j\, L^j\n", "\right]_+\n", "\equiv \sum^\infty_{j=0} f_j\,L^j \tag{12}\n", "$$\n", "\n", "In words, $ [\phantom{00}]_+ $ means “ignore negative powers of $ L $”.\n", "\n", "We have defined the solution of the prediction problem as $ \mathbb{\hat E} [X_{t+j} \vert X_t,\, X_{t-1}, \ldots] = \gamma_j\, (L) X_t $.\n", "\n", "Assuming that the roots of $ c(z) = 0 $ all lie outside the unit circle, the Wiener-Kolmogorov formula for $ \gamma_j (L) $ holds:\n", "\n", "\n", "\n", "$$\n", "\gamma_j\, (L) =\n", "\left[\n", " {c (L) \over L^j}\n", "\right]_+ c\,(L)^{-1} \tag{13}\n", "$$\n", "\n", "We have defined the solution of the filtering problem as $ \mathbb{\hat E}[Y_t \mid X_t, X_{t-1}, \ldots] = b(L) X_t $.\n", "\n", "The Wiener-Kolmogorov formula for $ b(L) $ is\n", "\n", "$$\n", "b(L) = \left[{g_{YX} (L) \over c(L^{-1})}\right]_+ c(L)^{-1}\n", "$$\n", "\n", "or\n", "\n", "\n", "\n", "$$\n", "b(L) = \left[ {d(L)d(L^{-1}) \over c(L^{-1})} \right]_+ c(L)^{-1} \tag{14}\n", "$$\n", "\n", "Formulas [(13)](#equation-eq-36) and [(14)](#equation-eq-37) are discussed in detail in [[Whi83]](../zreferences.html#whittle1983) and [[Sar87]](../zreferences.html#sargent1987).\n", "\n", "There the interested reader can find several examples of the use of these formulas in economics.\n", "Some classic examples using these formulas are due to [[Mut60]](../zreferences.html#muth1960).\n", "\n", "As an example of the usefulness of formula [(14)](#equation-eq-37), we let $ X_t $ be a stochastic process with Wold moving average representation\n", "\n", "$$\n", "X_t = c(L) \eta_t\n", "$$\n", "\n", "where $ \mathbb{E}\eta^2_t = 1, \hbox { and } c_0 \eta_t = X_t - \mathbb{\hat E} [X_t \vert X_{t-1}, \ldots], c(L) = \sum^m_{j=0} c_j L^j $.\n", "\n", "Suppose that at time $ t $, we wish to predict a geometric sum of future $ X $’s, namely\n", "\n", "$$\n", "y_t \equiv \sum^\infty_{j=0} \delta^j X_{t+j} = {1 \over 1 - \delta L^{-1}}\n", "X_t\n", "$$\n", "\n", "given knowledge of $ X_t, X_{t-1}, \ldots $.\n", "\n", "We shall use [(14)](#equation-eq-37) to obtain the answer.\n", "\n", "Using the standard formulas [(6)](#equation-eq-29), we have that\n", "\n", "$$\n", "\begin{aligned}\n", " g_{yx}(z) &= (1-\delta z^{-1})^{-1} c(z) c (z^{-1}) \\\n", " g_x (z) &= c(z) c (z^{-1})\n", "\end{aligned}\n", "$$\n", "\n", "Then [(14)](#equation-eq-37) becomes\n", "\n", "\n", "\n", "$$\n", "b(L)=\left[{c(L)\over 1-\delta L^{-1}}\right]_+ c(L)^{-1} \tag{15}\n", "$$\n", "\n", "In order to evaluate the term in the annihilation operator, we use the following result from [[HS80]](../zreferences.html#hansar1980).\n", "\n", "**Proposition** Let\n", "\n", "- $ g(z) = \sum^\infty_{j=0} g_j \, z^j $ where $ \sum^\infty_{j=0} \vert g_j \vert^2 < + \infty $ \n", "- $ h\,(z^{-1}) = 
(1- \delta_1 z^{-1}) \ldots (1-\delta_n z^{-1}) $, where $ \vert \delta_j \vert < 1 $, for $ j = 1, \ldots, n $ \n", "\n", "\n", "Then\n", "\n", "\n", "\n", "$$\n", "\left[{g(z)\over h(z^{-1})}\right]_+ = {g(z)\over h(z^{-1})} - \sum^n_{j=1}\n", "\ {\delta_j g (\delta_j) \over \prod^n_{k=1 \atop k \not = j} (\delta_j -\n", "\delta_k)} \ \left({1 \over z- \delta_j}\right) \tag{16}\n", "$$\n", "\n", "and, alternatively,\n", "\n", "\n", "\n", "$$\n", "\left[\n", " {g(z)\over h(z^{-1})}\n", "\right]_+\n", "=\sum^n_{j=1} B_j\n", "\left(\n", " {zg(z)-\delta_j g (\delta_j) \over z- \delta_j}\n", "\right) \tag{17}\n", "$$\n", "\n", "where $ B_j = 1 / \prod^n_{k=1\atop k \not = j} (1 - \delta_k / \delta_j) $.\n", "\n", "Applying formula [(17)](#equation-eq-40) of the proposition to evaluating [(15)](#equation-eq-38) with $ g(z) = c(z) $ and $ h(z^{-1}) = 1 - \delta z^{-1} $ gives\n", "\n", "$$\n", "b(L)=\left[{Lc(L)-\delta c(\delta)\over L-\delta}\right] c(L)^{-1}\n", "$$\n", "\n", "or\n", "\n", "$$\n", "b(L) =\n", "\left[\n", " {1-\delta c (\delta) L^{-1} c (L)^{-1}\over 1-\delta L^{-1}}\n", "\right]\n", "$$\n", "\n", "Thus, we have\n", "\n", "\n", "\n", "$$\n", "\mathbb{\hat E}\n", "\left[\n", " \sum^\infty_{j=0} \delta^j X_{t+j}\vert X_t,\, X_{t-1},\n", " \ldots\n", "\right] =\n", "\left[\n", " {1-\delta c (\delta) L^{-1} c(L)^{-1} \over 1 - \delta L^{-1}}\n", "\right]\n", "\, X_t \tag{18}\n", "$$\n", "\n", "This formula is useful in solving stochastic versions of problem 1 of lecture [Classical Control with Linear Algebra](lu_tricks.html) in which the randomness emerges because $ \{a_t\} $ is a stochastic\n", "process.\n", "\n", "The problem is to maximize\n", "\n", "\n", "\n", "$$\n", "\mathbb{E}_0\n", "\lim_{N \rightarrow \infty}\\n", "\sum^N_{t=0} \beta^t\n", "\left[\n", " a_t\, y_t - {1 \over 2}\ hy^2_t-{1 \over 2}\ [d(L)y_t]^2\n", "\right] \tag{19}\n", "$$\n", "\n", "where $ \mathbb{E}_t $ is mathematical expectation conditioned on information\n", "known at $ t $, and where $ \{ a_t\} $ is a covariance\n", "stationary stochastic process with Wold moving average representation\n", "\n", "$$\n", "a_t = c(L)\, \eta_t\n", "$$\n", "\n", "where\n", "\n", "$$\n", "c(L) = \sum^{\tilde n}_{j=0} c_j L^j\n", "$$\n", "\n", "and\n", "\n", "$$\n", "\eta_t =\n", "a_t - \mathbb{\hat E} [a_t \vert a_{t-1}, \ldots]\n", "$$\n", "\n", "The problem is to maximize [(19)](#equation-eq-42) with respect to a contingency plan\n", "expressing $ y_t $ as a function of information known at $ t $,\n", "which is assumed to be\n", "$ (y_{t-1},\ y_{t-2}, \ldots, a_t, \ a_{t-1}, \ldots) $.\n", "\n", "The solution of this problem can be achieved in two steps.\n", "\n", "First, ignoring the uncertainty, we can solve the problem assuming that $ \{a_t\} $ is a known sequence.\n", "\n", "The solution is, from above,\n", "\n", "$$\n", "c(L) y_t = c(\beta L^{-1})^{-1} a_t\n", "$$\n", "\n", "or\n", "\n", "\n", "\n", "$$\n", "(1-\lambda_1 L) \ldots (1 - \lambda_m L) y_t\n", "= \sum^m_{j=1} A_j\n", "\sum^\infty_{k=0} (\lambda_j \beta)^k\, a_{t+k} \tag{20}\n", "$$\n", "\n", "Second, the solution of the problem under uncertainty is obtained by\n", "replacing the terms on the right-hand side of the above expressions with\n", "their linear least squares predictors.\n", "\n", "Using [(18)](#equation-eq-41) and [(20)](#equation-eq-43), we have\n", "the following solution\n", "\n", "$$\n", "(1-\lambda_1 L) \ldots (1-\lambda_m 
L) y_t\n", "=\n", "\sum^m_{j=1} A_j\n", " \left[\n", " \frac{1-\beta \lambda_j \, c (\beta \lambda_j) L^{-1} c(L)^{-1} }\n", " { 1-\beta \lambda_j L^{-1} }\n", " \right] a_t\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Finite Dimensional Prediction\n", "\n", "Let $ (x_1, x_2, \ldots, x_T)^\prime = x $ be a $ T \times 1 $ vector of random variables with mean $ \mathbb{E} x = 0 $ and covariance matrix $ \mathbb{E} xx^\prime = V $.\n", "\n", "Here $ V $ is a $ T \times T $ positive definite matrix.\n", "\n", "We shall regard the random variables as being\n", "ordered in time, so that $ x_t $ is thought of as the value of some\n", "economic variable at time $ t $.\n", "\n", "For example, $ x_t $ could be generated by the random process described by the Wold representation presented in equation [(8)](#equation-eq-31).\n", "\n", "In this case, $ V_{ij} $ is given by the coefficient on $ z^{\mid i-j \mid} $ in the expansion of $ g_x (z) = d(z) \, d(z^{-1}) + h $, which equals\n", "$ h+\sum^\infty_{k=0} d_k d_{k+\mid i-j \mid} $ when $ i = j $ and $ \sum^\infty_{k=0} d_k d_{k+\mid i-j \mid} $ when $ i \not = j $.\n", "\n", "We shall be interested in constructing $ j $-step ahead linear least squares predictors of the form\n", "\n", "$$\n", "\mathbb{\hat E}\n", "\left[\n", " x_T\vert x_{T-j}, x_{T-j + 1}, \ldots, x_1\n", "\right]\n", "$$\n", "\n", "where $ \mathbb{\hat E} $ is the linear least squares projection operator.\n", "\n", "The solution of this problem can be exhibited by first constructing an\n", "orthonormal basis of random variables $ \varepsilon $ for $ x $.\n", "\n", "Since $ V $ is positive definite and symmetric, we know that there\n", "exists a (Cholesky) decomposition of $ V $ such that\n", "\n", "$$\n", "V = L^{-1} (L^{-1})^\prime\n", "$$\n", "\n", "or\n", "\n", "$$\n", "L \, V \, L^\prime = I\n", "$$\n", "\n", "where $ L $ is lower-triangular, and therefore so is $ L^{-1} $.\n", "\n", "Form the random vector $ Lx = \varepsilon $.\n", "\n", "Then $ \varepsilon $ is an orthonormal basis for $ x $, since $ L $ is nonsingular, and $ \mathbb{E} \, \varepsilon \, \varepsilon^\prime =\n", "L \mathbb{E} xx^\prime L^\prime = I $.\n", "\n", "It is convenient to write out the equations $ Lx = \varepsilon $ and $ L^{-1} \varepsilon = x $\n", "\n", "\n", "\n", "$$\n", "\begin{aligned}\n", " L_{11} x_1 &= \varepsilon_1 \\\n", " L_{21}x_1 + L_{22} x_2 &= \varepsilon_2 \\\n", " \vdots \\\n", " L_{T1} \, x_1 + L_{T2} \, x_2 + \cdots + L_{TT} \, x_T &= \varepsilon_T\n", "\end{aligned} \tag{21}\n", "$$\n", "\n", "or\n", "\n", "\n", "\n", "$$\n", "\sum^{t-1}_{j=0} L_{t,t-j}\, x_{t-j} = \varepsilon_t, \quad t = 1, \, 2, \ldots T \tag{22}\n", "$$\n", "\n", "We also have\n", "\n", "\n", "\n", "$$\n", "x_t = \sum^{t-1}_{j=0} L^{-1}_{t,t-j}\, \varepsilon_{t-j}\ . 
\tag{23}\n", "$$\n", "\n", "Notice from [(23)](#equation-eq-55) that $ x_t $ is in the space spanned by\n", "$ \varepsilon_t, \, \varepsilon_{t-1}, \ldots, \varepsilon_1 $, and from [(22)](#equation-eq-54) that\n", "$ \varepsilon_t $ is in the space spanned by $ x_t,\, x_{t-1}, \ldots,\, x_1 $.\n", "\n", "Therefore, we have that for $ t-1\geq m \geq 1 $\n", "\n", "\n", "\n", "$$\n", "\mathbb{\hat E}\n", "[ x_t \mid x_{t-m},\, x_{t-m-1}, \ldots, x_1 ] =\n", "\mathbb{\hat E}\n", "[x_t \mid \varepsilon_{t-m}, \varepsilon_{t-m-1},\ldots, \varepsilon_1] \tag{24}\n", "$$\n", "\n", "For $ t-1 \geq m \geq 1 $ rewrite [(23)](#equation-eq-55) as\n", "\n", "\n", "\n", "$$\n", "x_t = \sum^{m-1}_{j=0} L_{t,t-j}^{-1}\, \varepsilon_{t-j} + \sum^{t-1}_{j=m}\n", "L^{-1}_{t, t-j}\, \varepsilon_{t-j} \tag{25}\n", "$$\n", "\n", "Representation [(25)](#equation-eq-57) is an orthogonal decomposition of $ x_t $ into a part $ \sum^{t-1}_{j=m} L_{t, t-j}^{-1}\, \varepsilon_{t-j} $ that lies in the space spanned by\n", "$ [x_{t-m},\, x_{t-m-1},\, \ldots, x_1] $, and an orthogonal\n", "component not in this space." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Implementation\n", "\n", "Code that computes solutions to LQ control and filtering problems using the methods described here and in [Classical Control with Linear Algebra](lu_tricks.html) can be found in the file [control_and_filter.jl](https://github.com/QuantEcon/QuantEcon.lectures.code/blob/master/lu_tricks/control_and_filter.jl).\n", "\n", "Here’s how it looks" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "hide-output": false }, "outputs": [ { "data": { "text/plain": [ "optimal_y (generic function with 2 methods)" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "using Polynomials.PolyCompat, LinearAlgebra\n", "import Polynomials.PolyCompat: roots, coeffs\n", "import QuantEcon: MVNSampler # used in simulate_a below\n", "\n", "function LQFilter(d, h, y_m;\n", " r = nothing,\n", " β = nothing,\n", " h_eps = nothing)\n", "\n", " m = length(d) - 1\n", " m == length(y_m) ||\n", " throw(ArgumentError(\"y_m and d must be of same length = $m\"))\n", "\n", " # define the coefficients of ϕ up front\n", " ϕ = zeros(2m + 1)\n", " for i in -m:m\n", " ϕ[m-i+1] = sum(diag(d*d', -i))\n", " end\n", " ϕ[m+1] = ϕ[m+1] + h\n", "\n", " # if r is given calculate the vector ϕ_r\n", " if isnothing(r)\n", " k = nothing\n", " ϕ_r = nothing\n", " else\n", " k = size(r, 1) - 1\n", " ϕ_r = zeros(2k + 1)\n", "\n", " for i = -k:k\n", " ϕ_r[k-i+1] = sum(diag(r*r', -i))\n", " end\n", "\n", " if !isnothing(h_eps)\n", " ϕ_r[k+1] = ϕ_r[k+1] + h_eps\n", " end\n", " end\n", "\n", " # if β is given, define the transformed variables\n", " if isnothing(β)\n", " β = 1.0\n", " else\n", " d = β.^(collect(0:m)/2) .* d\n", " y_m = y_m .* β.^(- collect(1:m)/2)\n", " end\n", "\n", " return (d = d, h = h, y_m = y_m, m = m, ϕ = ϕ, β = β,\n", " ϕ_r = ϕ_r, k = k)\n", "end\n", "\n", "function construct_W_and_Wm(lqf, N)\n", "\n", " d, m = lqf.d, lqf.m\n", "\n", " W = zeros(N + 1, N + 1)\n", " W_m = zeros(N + 1, m)\n", "\n", " # terminal conditions\n", " D_m1 = zeros(m + 1, m + 1)\n", " M = zeros(m + 1, m)\n", "\n", " # (1) construct the D_{m+1} matrix using the formula\n", "\n", " for j in 1:(m+1)\n", " for k in j:(m+1)\n", " D_m1[j, k] = dot(d[1:j, 1], d[k-j+1:k, 1])\n", " end\n", " end\n", "\n", " # Make the matrix symmetric\n", " D_m1 = D_m1 + D_m1' - Diagonal(diag(D_m1))\n", "\n", " # (2) Construct the M matrix using the entries of D_m1\n",
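 " # Each entry M[i, j] = D_m1[i - j, m + 1] copies an element of the last\n", " # column of D_{m+1}; M then fills the terminal-condition block\n", " # W[1:(m + 1), (m + 2):(2m + 1)] constructed below\n",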
"\n", " for j in 1:m\n", " for i in (j + 1):(m + 1)\n", " M[i, j] = D_m1[i-j, m+1]\n", " end\n", " end\n", " M\n", "\n", " # Euler equations for t = 0, 1, ..., N-(m+1)\n", " ϕ, h = lqf.ϕ, lqf.h\n", "\n", " W[1:(m + 1), 1:(m + 1)] = D_m1 + h * I\n", " W[1:(m + 1), (m + 2):(2m + 1)] = M\n", "\n", " for (i, row) in enumerate((m + 2):(N + 1 - m))\n", " W[row, (i + 1):(2m + 1 + i)] = ϕ'\n", " end\n", "\n", " for i in 1:m\n", " W[N - m + i + 1 , end-(2m + 1 - i)+1:end] = ϕ[1:end-i]\n", " end\n", "\n", " for i in 1:m\n", " W_m[N - i + 2, 1:(m - i)+1] = ϕ[(m + 1 + i):end]\n", " end\n", "\n", " return W, W_m\n", "end\n", "\n", "function roots_of_characteristic(lqf)\n", " m, ϕ = lqf.m, lqf.ϕ\n", "\n", " # Calculate the roots of the 2m-polynomial\n", " ϕ_poly = Poly(ϕ[end:-1:1])\n", " proots = roots(ϕ_poly)\n", " # sort the roots according to their length (in descending order)\n", " roots_sorted = sort(proots, by=abs)[end:-1:1]\n", " z_0 = sum(ϕ) / polyval(poly(proots), 1.0)\n", " z_1_to_m = roots_sorted[1:m] # we need only those outside the unit circle\n", " λ = 1 ./ z_1_to_m\n", " return z_1_to_m, z_0, λ\n", "end\n", "\n", "function coeffs_of_c(lqf)\n", " m = lqf.m\n", " z_1_to_m, z_0, λ = roots_of_characteristic(lqf)\n", " c_0 = (z_0 * prod(z_1_to_m) * (-1.0)^m)^(0.5)\n", " c_coeffs = coeffs(poly(z_1_to_m)) * z_0 / c_0\n", " return c_coeffs\n", "end\n", "\n", "function solution(lqf)\n", " z_1_to_m, z_0, λ = roots_of_characteristic(lqf)\n", " c_0 = coeffs_of_c(lqf)[end]\n", " A = zeros(lqf.m)\n", " for j in 1:m\n", " denom = 1 - λ/λ[j]\n", " A[j] = c_0^(-2) / prod(denom[1:m .!= j])\n", " end\n", " return λ, A\n", "end\n", "\n", "function construct_V(lqf; N = nothing)\n", " if isnothing(N)\n", " error(\"N must be provided!!\")\n", " end\n", " if !(N isa Integer)\n", " throw(ArgumentError(\"N must be Integer!\"))\n", " end\n", "\n", " ϕ_r, k = lqf.ϕ_r, lqf.k\n", " V = zeros(N, N)\n", " for i in 1:N\n", " for j in 1:N\n", " if abs(i-j) ≤ k\n", " V[i, j] = ϕ_r[k + abs(i-j)+1]\n", " end\n", " end\n", " end\n", " return V\n", "end\n", "\n", "function simulate_a(lqf, N)\n", " V = construct_V(N + 1)\n", " d = MVNSampler(zeros(N + 1), V)\n", " return rand(d)\n", "end\n", "\n", "function predict(lqf, a_hist, t)\n", " N = length(a_hist) - 1\n", " V = construct_V(N + 1)\n", "\n", " aux_matrix = zeros(N + 1, N + 1)\n", " aux_matrix[1:t+1 , 1:t+1 ] = Matrix(I, t + 1, t + 1)\n", " L = cholesky(V).U'\n", " Ea_hist = inv(L) * aux_matrix * L * a_hist\n", "\n", " return Ea_hist\n", "end\n", "\n", "function optimal_y(lqf, a_hist, t = nothing)\n", " β, y_m, m = lqf.β, lqf.y_m, lqf.m\n", "\n", " N = length(a_hist) - 1\n", " W, W_m = construct_W_and_Wm(lqf, N)\n", "\n", " F = lu(W, Val(true))\n", "\n", " L, U = F\n", " D = diagm(0 => 1.0 ./ diag(U))\n", " U = D * U\n", " L = L * diagm(0 => 1.0 ./ diag(D))\n", "\n", " J = reverse(Matrix(I, N + 1, N + 1), dims = 2)\n", "\n", " if isnothing(t) # if the problem is deterministic\n", " a_hist = J * a_hist\n", "\n", " # transform the a sequence if β is given\n", " if β != 1\n", " a_hist = reshape(a_hist * (β^(collect(N:0)/ 2)), N + 1, 1)\n", " end\n", "\n", " ā = a_hist - W_m * y_m # ā from the lecutre\n", " Uy = \\(L, ā) # U @ ȳ = L^{-1}ā from the lecture\n", " ȳ = \\(U, Uy) # ȳ = U^{-1}L^{-1}ā\n", " # Reverse the order of ȳ with the matrix J\n", " J = reverse(Matrix(I, N + m + 1, N + m + 1), dims = 2)\n", " y_hist = J * vcat(ȳ, y_m) # y_hist : concatenated y_m and ȳ\n", " # transform the optimal sequence back if β is given\n", " if β != 1\n", " y_hist = y_hist .* β.^(- 
collect(-m:N)/2)\n", " end\n", "\n", " else # if the problem is stochastic and we look at it at time t\n", " Ea_hist = reshape(predict(lqf, a_hist, t), N + 1, 1)\n", " Ea_hist = J * Ea_hist\n", "\n", " ā = Ea_hist - W_m * y_m # ā from the lecture\n", " Uy = \(L, ā) # U ȳ = L^{-1} ā from the lecture\n", " ȳ = \(U, Uy) # ȳ = U^{-1}L^{-1}ā\n", "\n", " # Reverse the order of ȳ with the matrix J\n", " J = reverse(Matrix(I, N + m + 1, N + m + 1), dims = 2)\n", " y_hist = J * vcat(ȳ, y_m) # y_hist : concatenated y_m and ȳ\n", " end\n", " return y_hist, L, U, ȳ\n", "end" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let’s use this code to tackle two interesting examples." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Example 1\n", "\n", "Consider a stochastic process with moving average representation\n", "\n", "$$\n", "x_t = (1 - 2 L) \varepsilon_t\n", "$$\n", "\n", "where $ \varepsilon_t $ is a serially uncorrelated random process with mean zero and variance unity.\n", "\n", "We want to use the Wiener-Kolmogorov formula [(13)](#equation-eq-36) to compute the linear least squares forecasts $ \mathbb{\hat E} [x_{t+j} \mid x_t, x_{t-1}, \ldots] $, for $ j = 1,\, 2 $.\n", "\n", "We can do everything we want by setting $ d = r $, generating an instance of LQFilter, and then calling the pertinent functions on it" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "hide-output": false }, "outputs": [ { "data": { "text/plain": [ "(d = [1.0, -2.0], h = 0.0, y_m = [0.0], m = 1, ϕ = [-2.0, 5.0, -2.0], β = 1.0, ϕ_r = [-2.0, 5.0, -2.0], k = 1)" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m = 1\n", "y_m = zeros(m)\n", "d = [1.0, -2.0]\n", "r = [1.0, -2.0]\n", "h = 0.0\n", "example = LQFilter(d, h, y_m, r=d)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The Wold representation is computed by `coeffs_of_c(example)`.\n", "\n", "Let’s check that it “flips roots” as required" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "hide-output": false }, "outputs": [ { "data": { "text/plain": [ "2-element Array{Float64,1}:\n", " 2.0\n", " -1.0" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "coeffs_of_c(example)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "hide-output": false }, "outputs": [ { "data": { "text/plain": [ "([2.0], -2.0, [0.5])" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "roots_of_characteristic(example)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let’s form the covariance matrix of a time series vector of length $ N $\n", "and put it in $ V $.\n", "\n", "Then we’ll take a Cholesky decomposition of $ V = L^{-1} (L^{-1})^\prime = L_i L_i^\prime $, where $ L_i = L^{-1} $, and use it to form the vector of “moving average representations” $ x = L_i \varepsilon $ and the vector of “autoregressive representations” $ L x = \varepsilon $" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "hide-output": false }, "outputs": [ { "data": { "text/plain": [ "5×5 Array{Float64,2}:\n", " 5.0 -2.0 0.0 0.0 0.0\n", " -2.0 5.0 -2.0 0.0 0.0\n", " 0.0 -2.0 5.0 -2.0 0.0\n", " 0.0 0.0 -2.0 5.0 -2.0\n", " 0.0 0.0 0.0 -2.0 5.0" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "V = construct_V(example,N=5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice how the lower rows of the “moving average representations” are converging to the appropriate 
infinite history Wold representation" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "hide-output": false }, "outputs": [ { "data": { "text/plain": [ "5×5 LowerTriangular{Float64,Array{Float64,2}}:\n", " 2.23607 ⋅ ⋅ ⋅ ⋅ \n", " -0.894427 2.04939 ⋅ ⋅ ⋅ \n", " 0.0 -0.9759 2.01187 ⋅ ⋅ \n", " 0.0 0.0 -0.9941 2.00294 ⋅ \n", " 0.0 0.0 0.0 -0.998533 2.00073" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "F = cholesky(V)\n", "Li = F.L" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice how the lower rows of the “autoregressive representations” are converging to the appropriate infinite history autoregressive representation" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "hide-output": false }, "outputs": [ { "data": { "text/plain": [ "5×5 LowerTriangular{Float64,Array{Float64,2}}:\n", " 0.447214 ⋅ ⋅ ⋅ ⋅ \n", " 0.19518 0.48795 ⋅ ⋅ ⋅ \n", " 0.0946762 0.236691 0.49705 ⋅ ⋅ \n", " 0.0469898 0.117474 0.246696 0.499266 ⋅ \n", " 0.0234518 0.0586295 0.123122 0.249176 0.499817" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "L = inv(Li)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Remark** Let $ \\pi (z) = \\sum^m_{j=0} \\pi_j z^j $ and let $ z_1, \\ldots,\n", "z_k $ be the zeros of $ \\pi (z) $ that are inside the unit circle, $ k < m $.\n", "\n", "Then define\n", "\n", "$$\n", "\\theta (z) = \\pi (z) \\Biggl( {(z_1 z-1) \\over (z-z_1)} \\Biggr)\n", "\\Biggl( { (z_2 z-1) \\over (z-z_2) } \\Biggr ) \\ldots \\Biggl({(z_kz-1) \\over\n", "(z-z_k) }\\Biggr)\n", "$$\n", "\n", "The term multiplying $ \\pi (z) $ is termed a “Blaschke factor”.\n", "\n", "Then it can be proved directly that\n", "\n", "$$\n", "\\theta (z^{-1}) \\theta (z) = \\pi (z^{-1}) \\pi (z)\n", "$$\n", "\n", "and that the zeros of $ \\theta (z) $ are not inside the unit circle." 
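] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following cell is a short numerical check of the remark, added here as an illustration (the polynomial $ \pi (z) = 1 - 2z $ and the names in the code are ours): multiplying by the Blaschke factor moves the zero $ z_1 = 0.5 $ to $ 1/z_1 = 2 $ while leaving $ \theta (z^{-1}) \theta (z) = \pi (z^{-1}) \pi (z) $ unchanged" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "# numerical check of the Blaschke factor remark (illustrative values)\n", "π_poly(z) = 1 - 2z  # π(z), with zero z₁ = 0.5 inside the unit circle\n", "z₁ = 0.5\n", "θ(z) = π_poly(z) * (z₁ * z - 1) / (z - z₁)  # multiply by the Blaschke factor\n", "\n", "# θ(z^{-1})θ(z) - π(z^{-1})π(z) should vanish up to roundoff\n", "for z in (exp(0.3im), exp(1.1im), 2.0 + 0.5im)\n", "    @show abs(θ(1 / z) * θ(z) - π_poly(1 / z) * π_poly(z))\n", "end\n", "\n", "θ(2.0)  # ≈ 0: the zero has been flipped from 0.5 to 2.0"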
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Example 2\n", "\n", "Consider a stochastic process $ X_t $ with moving average\n", "representation\n", "\n", "$$\n", "X_t = (1 - \\sqrt 2 L^2) \\varepsilon_t\n", "$$\n", "\n", "where $ \\varepsilon_t $ is a serially uncorrelated random process\n", "with mean zero and variance unity.\n", "\n", "Let’s find a Wold moving average representation for $ x_t $.\n", "\n", "Let’s use the Wiener-Kolomogorov formula [(13)](#equation-eq-36) to compute the linear least squares forecasts\n", "$ \\mathbb{\\hat E}\\left[X_{t+j} \\mid X_{t-1}, \\ldots\\right] \\hbox { for } j = 1,\\, 2,\\, 3 $.\n", "\n", "We proceed in the same way as example 1" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "hide-output": false }, "outputs": [ { "data": { "text/plain": [ "(d = [1.0, 0.0, -1.4142135623730951], h = 0.0, y_m = [0.0, 0.0], m = 2, ϕ = [-1.4142135623730951, 0.0, 3.0000000000000004, 0.0, -1.4142135623730951], β = 1.0, ϕ_r = [-1.4142135623730951, 0.0, 3.0000000000000004, 0.0, -1.4142135623730951], k = 2)" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m = 2\n", "y_m = [0.0, 0.0]\n", "d = [1, 0, -sqrt(2)]\n", "r = [1, 0, -sqrt(2)]\n", "h = 0.0\n", "example = LQFilter(d, h, y_m, r = d)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "hide-output": false }, "outputs": [ { "data": { "text/plain": [ "3-element Array{Float64,1}:\n", " 1.4142135623731003\n", " -0.0\n", " -1.0000000000000064" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "coeffs_of_c(example)" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "hide-output": false }, "outputs": [ { "data": { "text/plain": [ "([1.1892071150027195, -1.1892071150027195], -1.4142135623731096, [0.8408964152537157, -0.8408964152537157])" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "roots_of_characteristic(example)" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "hide-output": false }, "outputs": [ { "data": { "text/plain": [ "8×8 Array{Float64,2}:\n", " 3.0 0.0 -1.41421 0.0 … 0.0 0.0 0.0\n", " 0.0 3.0 0.0 -1.41421 0.0 0.0 0.0\n", " -1.41421 0.0 3.0 0.0 0.0 0.0 0.0\n", " 0.0 -1.41421 0.0 3.0 -1.41421 0.0 0.0\n", " 0.0 0.0 -1.41421 0.0 0.0 -1.41421 0.0\n", " 0.0 0.0 0.0 -1.41421 … 3.0 0.0 -1.41421\n", " 0.0 0.0 0.0 0.0 0.0 3.0 0.0\n", " 0.0 0.0 0.0 0.0 -1.41421 0.0 3.0" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "V = construct_V(example, N = 8)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "hide-output": false }, "outputs": [ { "data": { "text/plain": [ "3×8 Array{Float64,2}:\n", " 0.0 0.0 0.0 -0.92582 0.0 1.46385 0.0 0.0\n", " 0.0 0.0 0.0 0.0 -0.966092 0.0 1.43759 0.0\n", " 0.0 0.0 0.0 0.0 0.0 -0.966092 0.0 1.43759" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "F = cholesky(V)\n", "Li = F.L\n", "Li[end-2:end, :]" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "hide-output": false }, "outputs": [ { "data": { "text/plain": [ "8×8 LowerTriangular{Float64,Array{Float64,2}}:\n", " 0.57735 ⋅ ⋅ ⋅ … ⋅ ⋅ ⋅ \n", " 0.0 0.57735 ⋅ ⋅ ⋅ ⋅ ⋅ \n", " 0.308607 0.0 0.654654 ⋅ ⋅ ⋅ ⋅ \n", " 0.0 0.308607 0.0 0.654654 ⋅ ⋅ ⋅ \n", " 0.19518 0.0 0.414039 0.0 ⋅ ⋅ ⋅ \n", " 0.0 0.19518 0.0 0.414039 … 0.68313 ⋅ ⋅ \n", " 0.131165 0.0 0.278243 0.0 0.0 0.695608 ⋅ \n", " 0.0 0.131165 0.0 0.278243 0.459078 0.0 
0.695608" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "L = inv(Li)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Prediction\n", "\n", "It immediately follows from the “orthogonality principle” of least squares (see [[AP91]](../zreferences.html#athanasios1991) or [[Sar87]](../zreferences.html#sargent1987) [ch. X]) that\n", "\n", "\n", "\n", "$$\n", "\\begin{aligned}\n", " \\mathbb{\\hat E} & [x_t \\mid x_{t-m},\\, x_{t-m+1}, \\ldots x_1]\n", " = \\sum^{t-1}_{j=m} L^{-1}_{t,t-j}\\, \\varepsilon_{t-j} \\\\\n", " & = [L_{t, 1}^{-1}\\, L^{-1}_{t,2},\\, \\ldots, L^{-1}_{t,t-m}\\ 0 \\ 0 \\ldots 0] L \\, x\n", "\\end{aligned} \\tag{26}\n", "$$\n", "\n", "This can be interpreted as a finite-dimensional version of the Wiener-Kolmogorov $ m $-step ahead prediction formula.\n", "\n", "We can use [(26)](#equation-eq-58) to represent the linear least squares projection of\n", "the vector $ x $ conditioned on the first $ s $ observations\n", "$ [x_s, x_{s-1} \\ldots, x_1] $.\n", "\n", "We have\n", "\n", "\n", "\n", "$$\n", "\\mathbb{\\hat E}[x \\mid x_s, x_{s-1}, \\ldots, x_1]\n", "= L^{-1}\n", "\\left[\n", " \\begin{matrix}\n", " I_s & 0 \\\\\n", " 0 & 0_{(t-s)}\n", " \\end{matrix}\n", "\\right] L x \\tag{27}\n", "$$\n", "\n", "This formula will be convenient in representing the solution of control problems under uncertainty.\n", "\n", "Equation [(23)](#equation-eq-55) can be recognized as a finite dimensional version of a moving average representation.\n", "\n", "Equation [(22)](#equation-eq-54) can be viewed as a finite dimension version of an autoregressive representation.\n", "\n", "Notice that even\n", "if the $ x_t $ process is covariance stationary, so that $ V $\n", "is such that $ V_{ij} $ depends only on $ \\vert i-j\\vert $, the\n", "coefficients in the moving average representation are time-dependent,\n", "there being a different moving average for each $ t $.\n", "\n", "If\n", "$ x_t $ is a covariance stationary process, the last row of\n", "$ L^{-1} $ converges to the coefficients in the Wold moving average\n", "representation for $ \\{ x_t\\} $ as $ T \\rightarrow \\infty $.\n", "\n", "Further, if $ x_t $ is covariance stationary, for fixed $ k $\n", "and $ j > 0, \\, L^{-1}_{T,T-j} $ converges to\n", "$ L^{-1}_{T-k, T-k-j} $ as $ T \\rightarrow \\infty $.\n", "\n", "That is,\n", "the “bottom” rows of $ L^{-1} $ converge to each other and to the\n", "Wold moving average coefficients as $ T \\rightarrow \\infty $.\n", "\n", "This last observation gives one simple and widely-used practical way of\n", "forming a finite $ T $ approximation to a Wold moving average\n", "representation.\n", "\n", "First, form the covariance matrix\n", "$ \\mathbb{E}xx^\\prime = V $, then obtain the Cholesky decomposition\n", "$ L^{-1} L^{-1^\\prime} $ of $ V $, which can be accomplished\n", "quickly on a computer.\n", "\n", "The last row of $ L^{-1} $ gives the approximate Wold moving average coefficients.\n", "\n", "This method can readily be generalized to multivariate systems.\n", "\n", "\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Combined Finite Dimensional Control and Prediction\n", "\n", "Consider the finite-dimensional control problem, maximize\n", "\n", "$$\n", "\\mathbb{E} \\, \\sum^N_{t=0} \\,\n", "\\left\\{\n", " a_t y_t - {1 \\over 2} h y^2_t - {1 \\over 2} [d(L) y_t ]^2\n", "\\right\\},\\ \\quad h > 0\n", "$$\n", "\n", "where $ d(L) = d_0 + d_1 L+ \\ldots + d_m L^m $, $ L $ is the\n", "lag operator, $ 
\bar a = [a_N, a_{N-1}, \ldots, a_1, a_0]^\prime $ is a\n", "random vector with mean zero and $ \mathbb{E}\,\bar a \bar a^\prime = V $.\n", "\n", "The variables $ y_{-1}, \ldots, y_{-m} $ are given.\n", "\n", "Maximization is over choices of $ y_0,\n", "y_1, \ldots, y_N $, where $ y_t $ is required to be a linear function\n", "of $ \{y_{t-s-1}, t+m-1\geq s\geq 0;\ a_{t-s}, t\geq s\geq 0\} $.\n", "\n", "We saw in the lecture [Classical Control with Linear Algebra](lu_tricks.html) that the solution of this problem under certainty could be represented in feedback-feedforward form\n", "\n", "$$\n", "U \bar y\n", " = L^{-1}\bar a + K\n", " \left[\n", " \begin{matrix}\n", " y_{-1}\\\n", " \vdots\\\n", " y_{-m}\n", " \end{matrix}\n", " \right]\n", "$$\n", "\n", "for some $ (N+1)\times m $ matrix $ K $.\n", "\n", "Using a version of formula [(26)](#equation-eq-58), we can express $ \mathbb{\hat E}[\bar a \mid a_s,\, a_{s-1}, \ldots, a_0 ] $ as\n", "\n", "$$\n", "\mathbb{\hat E}\n", "[ \bar a \mid a_s,\, a_{s-1}, \ldots, a_0]\n", "= \tilde U^{-1}\n", "\left[\n", " \begin{matrix}\n", " 0 & 0 \\\n", " 0 & I_{(s+1)}\n", " \end{matrix}\n", "\right]\n", "\tilde U \bar a\n", "$$\n", "\n", "where $ I_{(s + 1)} $ is the $ (s+1) \times (s+1) $ identity\n", "matrix, and $ V = \tilde U^{-1} (\tilde U^{-1})^{\prime} $, where\n", "$ \tilde U $ is the *upper* triangular Cholesky factor of the\n", "covariance matrix $ V $.\n", "\n", "(We have reversed the time axis in dating the $ a $’s relative to earlier.)\n", "\n", "The time axis can be reversed in representation [(27)](#equation-eq-59) by replacing $ L $ with $ L^T $.\n", "\n", "The optimal decision rule to use at time $ 0 \leq t \leq N $ is then\n", "given by the $ (N-t +1)^{\rm th} $ row of\n", "\n", "$$\n", "U \bar y = L^{-1} \tilde U^{-1}\n", " \left[\n", " \begin{matrix}\n", " 0 & 0 \\\n", " 0 & I_{(t+1)}\n", " \end{matrix}\n", " \right]\n", " \tilde U \bar a + K\n", " \left[\n", " \begin{matrix}\n", " y_{-1}\\\n", " \vdots\\\n", " y_{-m}\n", " \end{matrix}\n", " \right]\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exercises" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Exercise 1\n", "\n", "Let $ Y_t = (1 - 2 L ) u_t $ where $ u_t $ is a mean zero\n", "white noise with $ \mathbb{E} u^2_t = 1 $. 
Let\n", "\n", "$$\n", "X_t = Y_t + \\varepsilon_t\n", "$$\n", "\n", "where $ \\varepsilon_t $ is a serially uncorrelated white noise with\n", "$ \\mathbb{E} \\varepsilon^2_t = 9 $, and $ \\mathbb{E} \\varepsilon_t u_s = 0 $ for all\n", "$ t $ and $ s $.\n", "\n", "Find the Wold moving average representation for $ X_t $.\n", "\n", "Find a formula for the $ A_{1j} $’s in\n", "\n", "$$\n", "\\mathbb{E} \\widehat X_{t+1} \\mid X_t, X_{t-1}, \\ldots = \\sum^\\infty_{j=0} A_{1j}\n", "X_{t-j}\n", "$$\n", "\n", "Find a formula for the $ A_{2j} $’s in\n", "\n", "$$\n", "\\mathbb{\\hat E} X_{t+2} \\mid X_t, X_{t-1}, \\ldots = \\sum^\\infty_{j=0} A_{2j}\n", "X_{t-j}\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Exercise 2\n", "\n", "(Multivariable Prediction) Let $ Y_t $ be an $ (n\\times 1) $\n", "vector stochastic process with moving average representation\n", "\n", "$$\n", "Y_t = D(L) U_t\n", "$$\n", "\n", "where $ D(L) = \\sum^m_{j=0} D_j L^J, D_j $ an $ n \\times n $\n", "matrix, $ U_t $ an $ (n \\times 1) $ vector white noise with\n", ":math: mathbb{E} U_t =0 for all $ t $, $ \\mathbb{E} U_t U_s' = 0 $ for all $ s \\neq t $,\n", "and $ \\mathbb{E} U_t U_t' = I $ for all $ t $.\n", "\n", "Let $ \\varepsilon_t $ be an $ n \\times 1 $ vector white noise with mean $ 0 $ and contemporaneous covariance matrix $ H $, where $ H $ is a positive definite matrix.\n", "\n", "Let $ X_t = Y_t +\\varepsilon_t $.\n", "\n", "Define the covariograms as $ C_X\n", "(\\tau) = \\mathbb{E} X_t X^\\prime_{t-\\tau}, C_Y (\\tau) = \\mathbb{E} Y_t Y^\\prime_{t-\\tau},\n", "C_{YX} (\\tau) = \\mathbb{E} Y_t X^\\prime_{t-\\tau} $.\n", "\n", "Then define the matrix\n", "covariance generating function, as in [(21)](lu_tricks.html#equation-onetwenty), only interpret all the\n", "objects in [(21)](lu_tricks.html#equation-onetwenty) as matrices.\n", "\n", "Show that the covariance generating functions are given by\n", "\n", "$$\n", "\\begin{aligned}\n", " g_y (z) &= D (z) D (z^{-1})^\\prime \\\\\n", " g_X (z) &= D (z) D (z^{-1})^\\prime + H \\\\\n", " g_{YX} (z) &= D (z) D (z^{-1})^\\prime\n", "\\end{aligned}\n", "$$\n", "\n", "A factorization of $ g_X (z) $ can be found (see [[Roz67]](../zreferences.html#rozanov1967) or [[Whi83]](../zreferences.html#whittle1983)) of the form\n", "\n", "$$\n", "D (z) D (z^{-1})^\\prime + H = C (z) C (z^{-1})^\\prime, \\quad C (z) =\n", "\\sum^m_{j=0} C_j z^j\n", "$$\n", "\n", "where the zeros of $ \\vert C(z)\\vert $ do not lie inside the unit\n", "circle.\n", "\n", "A vector Wold moving average representation of $ X_t $ is then\n", "\n", "$$\n", "X_t = C(L) \\eta_t\n", "$$\n", "\n", "where $ \\eta_t $ is an $ (n\\times 1) $ vector white noise that\n", "is “fundamental” for $ X_t $.\n", "\n", "That is, $ X_t - \\mathbb{\\hat E}\\left[X_t \\mid X_{t-1}, X_{t-2}\n", "\\ldots\\right] = C_0 \\, \\eta_t $.\n", "\n", "The optimum predictor of $ X_{t+j} $ is\n", "\n", "$$\n", "\\mathbb{\\hat E} \\left[X_{t+j} \\mid X_t, X_{t-1}, \\ldots\\right]\n", " = \\left[{C(L) \\over L^j} \\right]_+ \\eta_t\n", "$$\n", "\n", "If $ C(L) $ is invertible, i.e., if the zeros of $ \\det $\n", "$ C(z) $ lie strictly outside the unit circle, then this formula can\n", "be written\n", "\n", "$$\n", "\\mathbb{\\hat E} \\left[X_{t+j} \\mid X_t, X_{t-1}, \\ldots\\right]\n", " = \\left[{C(L) \\over L^J} \\right]_+ C(L)^{-1}\\, X_t\n", "$$" ] } ], "metadata": { "date": 1591310626.6953862, "download_nb": 1, "download_nb_path": "https://julia.quantecon.org/", "filename": "classical_filtering.rst", 
"filename_with_path": "time_series_models/classical_filtering", "kernelspec": { "display_name": "Julia 1.4.2", "language": "julia", "name": "julia-1.4" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "1.4.2" }, "title": "Classical Filtering With Linear Algebra" }, "nbformat": 4, "nbformat_minor": 2 }