{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Poisson-Zero Hidden Markov Model\n", "\n", "## Summary\n", "\n", "The following exposition uses [`pymc3-hmm`](https://github.com/AmpersandTV/pymc3-hmm) to both simulate and estimate a hidden Markov model (HMM) with emissions consisting of a Poisson random variable and the point at zero." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "\n", "import theano\n", "import theano.tensor as tt\n", "\n", "import pymc3 as pm\n", "\n", "from pymc3_hmm.utils import compute_steady_state\n", "from pymc3_hmm.distributions import Constant, SwitchingProcess, DiscreteMarkovChain\n", "from pymc3_hmm.step_methods import FFBSStep, TransMatConjugateStep" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Introduction\n", "\n", "Our observation model can be described as follows:\n", "\n", "\\begin{align}\n", " \\label{eq:pois-zero-model}\n", " Y_t &= S_t \\epsilon_t,\\quad\n", " \\epsilon_t \\sim \\operatorname{Pois}\\left( \\mu \\right)\n", " \\\\\n", " S_t &\\sim \\operatorname{Bern}\\left( \\pi_t \\right)\n", " \\;,\n", "\\end{align}\n", "where $y_t \\sim Y_t$ are the observed values sampled from the observation distribution, $Y_t$, spanning $t \\in \\left\\{0, \\dots, T \\right\\}$.\n", "\n", "The \"hidden\" state sequence, $\\{S_t\\}$, is driven by the following Markov relationship:\n", "\n", "\\begin{equation*}\n", " \\operatorname{P}\\left(S_t \\mid S_{t-1}\\right) \\triangleq \\Gamma_{t,t-1} \\in \\mathbb{R}^{2 \\times 2}_{[0, 1]}\n", "\\end{equation*}\n", "\n", "The marginal state probability, $\\pi_t$, is then given by\n", "\n", "\\begin{equation*}\n", " \\begin{aligned}\n", " \\operatorname{P}\\left( S_t \\right)\n", " &= \\int_{S_{t-1}} \\operatorname{P}\\left(S_t \\mid S_{t-1}\\right)\n", " \\operatorname{dP}\\left(S_{t-1}\\right)\n", " \\\\\n", " &=\n", " \\begin{pmatrix}\n", " \\Gamma^{(0, 
0)}_{t,t-1} & \\Gamma^{(0, 1)}_{t,t-1}\n", " \\\\\n", " \\Gamma^{(1, 0)}_{t,t-1} & \\Gamma^{(1, 1)}_{t,t-1}\n", " \\end{pmatrix}^\\top\n", " \\begin{pmatrix}\n", " \\pi_{t-1}\n", " \\\\\n", " 1 - \\pi_{t-1}\n", " \\end{pmatrix}\n", " \\\\\n", " &=\n", " \\begin{pmatrix}\n", " \\pi_{t}\n", " \\\\\n", " 1 - \\pi_{t}\n", " \\end{pmatrix}\n", " \\;.\n", " \\end{aligned}\n", "\\end{equation*}\n", "\n", "Our model is effectively a zero-inflated Poisson with a simple first-order Markov process as the mixing distribution.\n", "\n", "\n", "In the remainder of this exposition, we will assume Dirichlet priors for each row of $\\Gamma_{t, t-1}$, the conjugate Gamma prior for $\\mu$, and the steady-state value of $\\Gamma_{t, t-1}$ for $\\pi_0$.\n", "\n", "## Simulation" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "def create_poisson_zero_hmm(mu, observed, p_0_a=np.r_[5, 1], p_1_a=np.r_[3, 5]):\n", "\n", " p_0_rv = pm.Dirichlet(\"p_0\", p_0_a)\n", " p_1_rv = pm.Dirichlet(\"p_1\", p_1_a)\n", "\n", " P_tt = tt.stack([p_0_rv, p_1_rv])\n", " P_rv = pm.Deterministic(\"P_t\", tt.shape_padleft(P_tt))\n", "\n", " pi_0_tt = compute_steady_state(P_rv)\n", "\n", " S_rv = DiscreteMarkovChain(\"S_t\", P_rv, pi_0_tt, shape=np.shape(observed)[-1])\n", " S_rv.tag.test_value = np.array(observed > 0, dtype=np.int32)\n", "\n", " Y_rv = SwitchingProcess(\n", " \"Y_t\", [Constant.dist(np.array(0, dtype=np.int64)), pm.Poisson.dist(mu)], S_rv, observed=observed\n", " )\n", "\n", " return Y_rv\n" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "np.random.seed(2032)\n", "\n", "mu_true = 5000\n", "\n", "with pm.Model() as sim_model:\n", " _ = create_poisson_zero_hmm(mu_true, np.zeros(10000))\n", "\n", "sim_point = pm.sample_prior_predictive(samples=1, model=sim_model)\n", "sim_point[\"Y_t\"] = sim_point[\"Y_t\"].squeeze()\n", "\n", "y_t = sim_point[\"Y_t\"]" ] }, { "cell_type": "markdown", "metadata": {}, 
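"source": [ "As a quick sanity check on the simulation, we can compare the share of exact zeros against the mean of the nonzero observations (using `y_t` and `mu_true` from the cells above); since a $\\operatorname{Pois}(5000)$ draw is essentially never zero, the zeros effectively identify the zero state:" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Fraction of observations produced by the zero state\n", "print(\"zero fraction:\", np.mean(y_t == 0))\n", "\n", "# The nonzero observations should concentrate around mu_true\n", "print(\"nonzero mean:\", y_t[y_t > 0].mean())" ] }, { "cell_type": "markdown", "metadata": {},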
"source": [ "## Estimation\n", "\n", "We will use the \"true\" data-generating observation model to estimate the parameters $\\mu$ and $\\Gamma_{t, t-1}$ (the latter as rows denoted by `p_0` and `p_1`). For demonstration purposes, we choose hyper-parameters for the $\\mu$ prior that are \"far\" from the true $\\mu$ value.\n", "\n", "The sampling steps for $S_t$ are performed using forward-filtering backward-sampling (FFBS). The steps for $\\Gamma_{t, t-1}$ are conjugate samples conditional on the sampled state sequence $\\{S_t\\}$ provided by FFBS. Both of these step methods are available in `pymc3-hmm`.\n", "\n", "While $\\mu$ could also be sampled exactly using a conjugate step—conditional on $\\{S_t\\}$—we instead sample this parameter using PyMC3's built-in NUTS." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Sequential sampling (1 chains in 1 job)\n", "CompoundStep\n", ">TransMatConjugateStep: [p_0, p_1]\n", ">FFBSStep: [S_t]\n", ">NUTS: [mu]\n" ] }, { "data": { "text/html": [ "\n", "