{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# ARCH Modeling"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "_This setup code is required to run in an IPython notebook_"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "import seaborn as sns\n",
    "\n",
    "sns.set_style(\"darkgrid\")\n",
    "plt.rc(\"figure\", figsize=(16, 6))\n",
    "plt.rc(\"savefig\", dpi=90)\n",
    "plt.rc(\"font\", family=\"sans-serif\")\n",
    "plt.rc(\"font\", size=14)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setup"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "These examples will all make use of financial data from Yahoo! Finance.  This data set can be loaded from `arch.data.sp500`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [],
   "source": [
    "import datetime as dt\n",
    "\n",
    "import arch.data.sp500\n",
    "\n",
    "st = dt.datetime(1988, 1, 1)\n",
    "en = dt.datetime(2018, 1, 1)\n",
    "data = arch.data.sp500.load()\n",
    "market = data[\"Adj Close\"]\n",
    "returns = 100 * market.pct_change().dropna()\n",
    "ax = returns.plot()\n",
    "xlim = ax.set_xlim(returns.index.min(), returns.index.max())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Specifying Common Models"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The simplest way to specify a model is to use the model constructor `arch.arch_model` which can specify most common models.  The simplest invocation of `arch` will return a model with a constant mean, GARCH(1,1) volatility process and normally distributed errors.\n",
    "\n",
    "\n",
    "$$ r_t  =  \\mu + \\epsilon_t$$\n",
    "\n",
    "$$\\sigma^2_t   =  \\omega + \\alpha \\epsilon_{t-1}^2 + \\beta \\sigma_{t-1}^2 $$\n",
    "\n",
    "$$\\epsilon_t  =  \\sigma_t e_t,\\,\\,\\, e_t  \\sim  N(0,1) $$\n",
    "\n",
    "\n",
    "The model is estimated by calling `fit`.  The optional inputs `iter` controls the frequency of output form the optimizer, and `disp` controls whether convergence information is returned.  The results class returned offers direct access to the estimated parameters and related quantities, as well as a `summary` of the estimation results."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### GARCH (with a Constant Mean)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The default set of options produces a model with a constant mean, GARCH(1,1) conditional variance and normal errors."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [],
   "source": [
    "from arch import arch_model\n",
    "\n",
    "am = arch_model(returns)\n",
    "res = am.fit(update_freq=5)\n",
    "print(res.summary())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`plot()` can be used to quickly visualize the standardized residuals and conditional volatility.  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [],
   "source": [
    "fig = res.plot(annualize=\"D\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### GJR-GARCH"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Additional inputs can be used to construct other models.  This example sets `o` to 1, which includes one lag of an asymmetric shock which transforms a GARCH model into a GJR-GARCH model with variance dynamics given by \n",
    "\n",
    "$$\n",
    "\\sigma^2_t   =  \\omega + \\alpha \\epsilon_{t-1}^2 + \\gamma \\epsilon_{t-1}^2 I_{[\\epsilon_{t-1}<0]}+ \\beta \\sigma_{t-1}^2 \n",
    "$$\n",
    "\n",
    "where $I$ is an indicator function that takes the value 1 when its argument is true.\n",
    "\n",
    "The log likelihood improves substantially with the introduction of an asymmetric term, and the parameter estimate is highly significant."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [],
   "source": [
    "am = arch_model(returns, p=1, o=1, q=1)\n",
    "res = am.fit(update_freq=5, disp=\"off\")\n",
    "print(res.summary())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### TARCH/ZARCH"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "TARCH (also known as ZARCH) model the _volatility_ using absolute values.  This model is specified using `power=1.0` since the default power, 2, corresponds to variance processes that evolve in squares.\n",
    "\n",
    "The volatility process in a TARCH model is given by \n",
    "\n",
    "$$\n",
    "\\sigma_t  =  \\omega + \\alpha \\left|\\epsilon_{t-1}\\right| + \\gamma \\left|\\epsilon_{t-1}\\right| I_{[\\epsilon_{t-1}<0]}+ \\beta \\sigma_{t-1} \n",
    "$$\n",
    "\n",
    "More general models with other powers ($\\kappa$) have volatility dynamics given by \n",
    "\n",
    "$$\n",
    "\\sigma_t^\\kappa   = \\omega + \\alpha \\left|\\epsilon_{t-1}\\right|^\\kappa + \\gamma \\left|\\epsilon_{t-1}\\right|^\\kappa I_{[\\epsilon_{t-1}<0]}+ \\beta \\sigma_{t-1}^\\kappa \n",
    "$$\n",
    "\n",
    "where the conditional variance is $\\left(\\sigma_t^\\kappa\\right)^{2/\\kappa}$.\n",
    "\n",
    "The TARCH model also improves the fit, although the change in the log likelihood is less dramatic."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [],
   "source": [
    "am = arch_model(returns, p=1, o=1, q=1, power=1.0)\n",
    "res = am.fit(update_freq=5)\n",
    "print(res.summary())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Student's T Errors"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Financial returns are often heavy tailed, and a Student's T distribution is a simple method to capture this feature.  The call to `arch` changes the distribution from a Normal to a Students's T.\n",
    "\n",
    "The standardized residuals appear to be heavy tailed with an estimated degree of freedom near 10.  The log-likelihood also shows a large increase."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [],
   "source": [
    "am = arch_model(returns, p=1, o=1, q=1, power=1.0, dist=\"StudentsT\")\n",
    "res = am.fit(update_freq=5)\n",
    "print(res.summary())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Fixing Parameters\n",
    "In some circumstances, fixed rather than estimated parameters might be of interest.  A model-result-like class can be generated using the `fix()` method.  The class returned is identical to the usual model result class except that information about inference (standard errors, t-stats, etc) is not available. \n",
    "\n",
    "In the example, I fix the parameters to a symmetric version of the previously estimated model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [],
   "source": [
    "fixed_res = am.fix([0.0235, 0.01, 0.06, 0.0, 0.9382, 8.0])\n",
    "print(fixed_res.summary())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "df = pd.concat([res.conditional_volatility, fixed_res.conditional_volatility], axis=1)\n",
    "df.columns = [\"Estimated\", \"Fixed\"]\n",
    "subplot = df.plot()\n",
    "subplot.set_xlim(xlim)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Building a Model From Components"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Models can also be systematically assembled from the three model components:\n",
    "\n",
    "* A mean model (`arch.mean`)\n",
    "    * Zero mean (`ZeroMean`) - useful if using residuals from a model estimated separately\n",
    "    * Constant mean (`ConstantMean`) - common for most liquid financial assets\n",
    "    * Autoregressive (`ARX`) with optional exogenous regressors\n",
    "    * Heterogeneous (`HARX`) autoregression with optional exogenous regressors\n",
    "    * Exogenous regressors only (`LS`)\n",
    "* A volatility process (`arch.volatility`)\n",
    "    * ARCH (`ARCH`)\n",
    "    * GARCH (`GARCH`)\n",
    "    * GJR-GARCH (`GARCH` using `o` argument) \n",
    "    * TARCH/ZARCH (`GARCH` using `power` argument set to `1`)\n",
    "    * Power GARCH and Asymmetric Power GARCH (`GARCH` using `power`)\n",
    "    * Exponentially Weighted Moving Average Variance with estimated coefficient (`EWMAVariance`)\n",
    "    * Heterogeneous ARCH (`HARCH`)\n",
    "    * Parameterless Models\n",
    "        * Exponentially Weighted Moving Average Variance, known as RiskMetrics (`EWMAVariance`)\n",
    "        * Weighted averages of EWMAs, known as the RiskMetrics 2006 methodology (`RiskMetrics2006`)\n",
    "* A distribution (`arch.distribution`)\n",
    "    * Normal (`Normal`)\n",
    "    * Standardized Students's T (`StudentsT`)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Mean Models"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The first choice is the mean model.  For many liquid financial assets, a constant mean (or even zero) is adequate.  For other series, such as inflation, a more complicated model may be required.  These examples make use of Core CPI downloaded from the [Federal Reserve Economic Data](https://fred.stlouisfed.org/) site."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [],
   "source": [
    "import arch.data.core_cpi\n",
    "\n",
    "core_cpi = arch.data.core_cpi.load()\n",
    "ann_inflation = 100 * core_cpi.CPILFESL.pct_change(12).dropna()\n",
    "fig = ann_inflation.plot()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "All mean models are initialized with constant variance and normal errors.  For `ARX` models, the `lags` argument specifies the lags to include in the model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [],
   "source": [
    "from arch.univariate import ARX\n",
    "\n",
    "ar = ARX(100 * ann_inflation, lags=[1, 3, 12])\n",
    "print(ar.fit().summary())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Volatility Processes"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Volatility processes can be added to a mean model using the `volatility` property.  This example adds an ARCH(5) process to model volatility. The arguments `iter` and `disp` are used in `fit()` to suppress estimation output."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [],
   "source": [
    "from arch.univariate import ARCH, GARCH\n",
    "\n",
    "ar.volatility = ARCH(p=5)\n",
    "res = ar.fit(update_freq=0, disp=\"off\")\n",
    "print(res.summary())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Plotting the standardized residuals and the conditional volatility shows some large (in magnitude) errors, even when standardized."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [],
   "source": [
    "fig = res.plot()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Distributions"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally the distribution can be changed from the default normal to a standardized Student's T using the `distribution` property of a mean model.\n",
    "\n",
    "The Student's t distribution improves the model, and the degree of freedom is estimated to be near 8."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [],
   "source": [
    "from arch.univariate import StudentsT\n",
    "\n",
    "ar.distribution = StudentsT()\n",
    "res = ar.fit(update_freq=0, disp=\"off\")\n",
    "print(res.summary())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## WTI Crude"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The next example uses West Texas Intermediate Crude data from FRED.  Three models are fit using alternative distributional assumptions.  The results are printed, where we can see that the normal has a much lower log-likelihood than either the Standard Student's T or the Standardized Skew Student's T -- however, these two are fairly close.  The closeness of the T and the Skew T indicate that returns are not heavily skewed."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [],
   "source": [
    "from collections import OrderedDict\n",
    "\n",
    "import arch.data.wti\n",
    "\n",
    "crude = arch.data.wti.load()\n",
    "crude_ret = 100 * crude.DCOILWTICO.dropna().pct_change().dropna()\n",
    "res_normal = arch_model(crude_ret).fit(disp=\"off\")\n",
    "res_t = arch_model(crude_ret, dist=\"t\").fit(disp=\"off\")\n",
    "res_skewt = arch_model(crude_ret, dist=\"skewt\").fit(disp=\"off\")\n",
    "lls = pd.Series(\n",
    "    OrderedDict(\n",
    "        (\n",
    "            (\"normal\", res_normal.loglikelihood),\n",
    "            (\"t\", res_t.loglikelihood),\n",
    "            (\"skewt\", res_skewt.loglikelihood),\n",
    "        )\n",
    "    )\n",
    ")\n",
    "print(lls)\n",
    "params = pd.DataFrame(\n",
    "    OrderedDict(\n",
    "        (\n",
    "            (\"normal\", res_normal.params),\n",
    "            (\"t\", res_t.params),\n",
    "            (\"skewt\", res_skewt.params),\n",
    "        )\n",
    "    )\n",
    ")\n",
    "params"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The standardized residuals can be computed by dividing the residuals by the conditional volatility.  These are plotted along with the (unstandardized, but scaled) residuals. The non-standardized residuals are more peaked in the center indicating that the distribution is somewhat more heavy tailed than that of the standardized residuals."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [],
   "source": [
    "std_resid = res_normal.resid / res_normal.conditional_volatility\n",
    "unit_var_resid = res_normal.resid / res_normal.resid.std()\n",
    "df = pd.concat([std_resid, unit_var_resid], axis=1)\n",
    "df.columns = [\"Std Resids\", \"Unit Variance Resids\"]\n",
    "subplot = df.plot(kind=\"kde\", xlim=(-4, 4))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "## Simulation\n",
    "\n",
    "All mean models expose a method to simulate returns from assuming the\n",
    "model is correctly specified.  There are two required parameters, `params`\n",
    "which are the model parameters, and `nobs`, the number of observations\n",
    "to produce.\n",
    "\n",
    "Below we simulate from a GJR-GARCH(1,1) with Skew-t errors using parameters\n",
    "estimated on the WTI series.  The simulation returns a `DataFrame` with 3 columns:\n",
    "\n",
    "* `data`: The simulated data, which includes any mean dynamics.\n",
    "* `volatility`: The conditional volatility series\n",
    "* `errors`: The simulated errors generated to produce the model. The errors are\n",
    "  the difference between the data and its conditional mean, and can be transformed\n",
    "  into the standardized errors by dividing by the volatility."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": false,
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "res = arch_model(crude_ret, p=1, o=1, q=1, dist=\"skewt\").fit(disp=\"off\")\n",
    "pd.DataFrame(res.params)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": false,
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "sim_mod = arch_model(None, p=1, o=1, q=1, dist=\"skewt\")\n",
    "\n",
    "sim_data = sim_mod.simulate(res.params, 1000)\n",
    "sim_data.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "Simulations can be reproduced using a NumPy `RandomState`. This requires \n",
    "constructing a model from components where the `RandomState` instance is\n",
    "passed into to the distribution when the model is created.\n",
    "\n",
    "The cell below contains code that builds a model with a constant mean,\n",
    "GJR-GARCH volatility and Skew $t$ errors initialized with a user-provided\n",
    "`RandomState`. Saving the initial state allows it to be restored later so\n",
    "that the simulation can be run with the same random values.   "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": false,
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from arch.univariate import ConstantMean, SkewStudent\n",
    "\n",
    "rs = np.random.RandomState([892380934, 189201902, 129129894, 9890437])\n",
    "# Save the initial state to reset later\n",
    "state = rs.get_state()\n",
    "\n",
    "dist = SkewStudent(seed=rs)\n",
    "vol = GARCH(p=1, o=1, q=1)\n",
    "repro_mod = ConstantMean(None, volatility=vol, distribution=dist)\n",
    "\n",
    "repro_mod.simulate(res.params, 1000).head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "Resetting the state using `set_state` shows that calling `simulate`\n",
    "using the same underlying state in the `RandomState` produces the\n",
    "same objects. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": false,
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "# Reset the state to the initial state\n",
    "rs.set_state(state)\n",
    "repro_mod.simulate(res.params, 1000).head()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.9"
  },
  "pycharm": {
   "stem_cell": {
    "cell_type": "raw",
    "metadata": {
     "collapsed": false
    },
    "source": []
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}