{ "metadata": { "name": "" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Notes taken from the book \"An Introduction to State Space Time Series Analysis\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Why traditional regression models fail on time series data\n", "- In short, regression models assume that the irregular noise is Gaussian with zero mean and constant variance, and that it is uncorrelated (in particular, not autocorrelated). Statistics such as the F-test and t-test all rely on the validity of this assumption.\n", "- If the `Acf` (correlogram) of the residuals (errors) shows non-zero correlations, then:\n", " - (++-- pattern) positive autocorrelation at lag 1: a positive residual tends to be followed by one or more positive residuals, and a negative residual by several negative ones. The error variance is seriously underestimated, so there is a risk of being overly optimistic.\n", " - (-+-- pattern) negative autocorrelation at lag 1: a negative residual tends to be followed by a positive residual, and vice versa. The error variance is seriously overestimated, so there is a risk of being overly pessimistic." ] }, { "cell_type": "code", "collapsed": false, "input": [ "%pylab inline\n", "%load_ext rmagic\n", "%R library(forecast)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Populating the interactive namespace from numpy and matplotlib\n", "The rmagic extension is already loaded. To reload it, use:\n", " %reload_ext rmagic\n" ] }, { "metadata": {}, "output_type": "display_data", "text": [ "This is forecast 4.8 \n", "\n" ] } ], "prompt_number": 3 }, { "cell_type": "markdown", "metadata": {}, "source": [ "**State Model**\n", "- The primary task of time series analysis is to uncover the dynamic evolution of observations measured over time. 
It is assumed that the dynamic properties cannot be observed directly from the data. The unobserved dynamic process at time t is referred to as the *state* of the time series.\n", "- The *state* can be as simple as a *level*, *slope* or *seasonal* component, or it can include *explanatory and descriptive variables*." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Local Level Model**\n", "\n", "- a basic example of a state space model\n", "- in this model the `level` component is allowed to vary over time\n", "- the level component can be thought of as the equivalent of the intercept in the classical regression model, except that the intercept in regression is fixed.\n", "- mathematically, it is formulated as\n", "$$y_t = \\mu_t + \\epsilon_t,\\ \\epsilon_t \\sim N(0, \\sigma_{\\epsilon}^2) \\text{ [observation equation]}$$\n", "$$\\mu_{t+1} = \\mu_t + \\xi_t, \\ \\xi_t \\sim N(0, \\sigma_{\\xi}^2) \\text{ [state equation]}$$\n", " where $\\mu_t$ is the unobserved level at time t, $\\epsilon_t$ (the irregular part) is the observation disturbance at time t, and $\\xi_t$ is the level disturbance at time t.\n", "\n", " In other words, the hidden state is modelled as a random variable.\n", "- the whole model is also called the **random walk with noise** model because its state equation defines a random walk.\n", "- when the state disturbances are all fixed at $\\xi_t=0$, the model reduces to a deterministic model in which the level does NOT vary over time. When the level is allowed to vary over time, it is treated as a stochastic process."
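, "\n", "- as a short worked step (not quoted from the book; it is just repeated substitution of the state equation, and it assumes the disturbances $\\xi_s$ and $\\epsilon_t$ are mutually independent), the level can be unrolled as\n", "$$\\mu_t = \\mu_1 + \\sum_{s=1}^{t-1} \\xi_s, \\qquad y_t = \\mu_1 + \\sum_{s=1}^{t-1} \\xi_s + \\epsilon_t$$\n", " so that, conditional on the initial level $\\mu_1$,\n", "$$\\mathrm{Var}(\\mu_t \\mid \\mu_1) = (t-1)\\,\\sigma_{\\xi}^2, \\qquad \\mathrm{Var}(y_t \\mid \\mu_1) = (t-1)\\,\\sigma_{\\xi}^2 + \\sigma_{\\epsilon}^2$$\n", " Setting $\\sigma_{\\xi}^2 = 0$ removes the sum and leaves $y_t = \\mu_1 + \\epsilon_t$, which is the deterministic-level case described above."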
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Local Linear Trend Model**\n", "\n", "$$y_t = \\mu_t + \\epsilon_t, \\ \\epsilon_t \\sim N(0, \\sigma_{\\epsilon}^2)$$\n", "$$\\mu_{t+1} = \\mu_t + \\nu_t + \\xi_t, \\ \\xi_t \\sim N(0, \\sigma_{\\xi}^2)$$\n", "$$\\nu_{t+1} = \\nu_t + \\zeta_t, \\ \\zeta_t \\sim N(0, \\sigma_{\\zeta}^2)$$\n", "So the local linear trend model contains two state equations: one for the level $\\mu_t$ and one for the slope $\\nu_t$. In the time series literature, the slope is also referred to as the drift." ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] } ], "metadata": {} } ] }