{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "fa7bdc16",
   "metadata": {},
   "source": [
    "\n",
    "<a id='speed'></a>\n",
    "<div id=\"qe-notebook-header\" align=\"right\" style=\"text-align:right;\">\n",
    "        <a href=\"https://quantecon.org/\" title=\"quantecon.org\">\n",
    "                <img style=\"width:250px;display:inline;\" width=\"250px\" src=\"https://assets.quantecon.org/img/qe-menubar-logo.svg\" alt=\"QuantEcon\">\n",
    "        </a>\n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c4fabd8d",
   "metadata": {},
   "source": [
    "# Numba\n",
    "\n",
    "In addition to what’s in Anaconda, this lecture will need the following libraries:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "25fb67ab",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "!pip install quantecon"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a9c81f9f",
   "metadata": {},
   "source": [
    "Please also make sure that you have the latest version of Anaconda, since old\n",
    "versions are a [common source of errors](https://python-programming.quantecon.org/troubleshooting.html).\n",
    "\n",
    "Let’s start with some imports:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "79988914",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import quantecon as qe\n",
    "import matplotlib.pyplot as plt"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a50fd849",
   "metadata": {},
   "source": [
    "## Overview\n",
    "\n",
    "In an [earlier lecture](https://python-programming.quantecon.org/need_for_speed.html) we learned about vectorization, which is one method to improve speed and efficiency in numerical work.\n",
    "\n",
    "Vectorization involves sending array processing\n",
    "operations in batch to efficient low-level code.\n",
    "\n",
    "However, as [discussed previously](https://python-programming.quantecon.org/need_for_speed.html#numba-p-c-vectorization), vectorization has several weaknesses.\n",
    "\n",
    "One is that it is highly memory-intensive when working with large amounts of data.\n",
    "\n",
    "Another is that the set of algorithms that can be entirely vectorized is not universal.\n",
    "\n",
    "In fact, for some algorithms, vectorization is ineffective.\n",
    "\n",
    "Fortunately, a new Python library called [Numba](https://numba.pydata.org/)\n",
    "solves many of these problems.\n",
    "\n",
    "It does so through something called **just in time (JIT) compilation**.\n",
    "\n",
    "The key idea is to compile functions to native machine code instructions on the fly.\n",
    "\n",
    "When it succeeds, the compiled code is extremely fast.\n",
    "\n",
    "Beyond speed gains from compilation, Numba is specifically designed for numerical work and can also do other tricks such as [Multithreaded Loops in Numba](#multithreading).\n",
    "\n",
    "This lecture introduces the main ideas.\n",
    "\n",
    "\n",
    "<a id='numba-link'></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0fce08c1",
   "metadata": {},
   "source": [
    "## Compiling Functions\n",
    "\n",
    "\n",
    "<a id='index-1'></a>\n",
    "As stated above, Numba’s primary use is compiling functions to fast native\n",
    "machine code during runtime.\n",
    "\n",
    "\n",
    "<a id='quad-map-eg'></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6571db8d",
   "metadata": {},
   "source": [
    "### An Example\n",
    "\n",
    "Let’s consider a problem that is difficult to vectorize: generating the trajectory of a difference equation given an initial condition.\n",
    "\n",
    "We will take the difference equation to be the quadratic map\n",
    "\n",
    "$$\n",
    "x_{t+1} = \\alpha x_t (1 - x_t)\n",
    "$$\n",
    "\n",
    "In what follows we set"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "731e1f21",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "α = 4.0"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9f35a946",
   "metadata": {},
   "source": [
    "Here’s the plot of a typical trajectory, starting from $ x_0 = 0.1 $, with $ t $ on the x-axis"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b0c839c4",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "def qm(x0, n):\n",
    "    x = np.empty(n+1)\n",
    "    x[0] = x0\n",
    "    for t in range(n):\n",
    "      x[t+1] = α * x[t] * (1 - x[t])\n",
    "    return x\n",
    "\n",
    "x = qm(0.1, 250)\n",
    "fig, ax = plt.subplots()\n",
    "ax.plot(x, 'b-', lw=2, alpha=0.8)\n",
    "ax.set_xlabel('$t$', fontsize=12)\n",
    "ax.set_ylabel('$x_{t}$', fontsize = 12)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4b0cf0d9",
   "metadata": {},
   "source": [
    "To speed the function `qm` up using Numba, our first step is"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e9dfeef1",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "from numba import jit\n",
    "\n",
    "qm_numba = jit(qm)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e01266ea",
   "metadata": {},
   "source": [
    "The function `qm_numba` is a version of `qm` that is “targeted” for\n",
    "JIT-compilation.\n",
    "\n",
    "We will explain what this means momentarily.\n",
    "\n",
    "Let’s time and compare identical function calls across these two versions, starting with the original function `qm`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0b20a15d",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "n = 10_000_000\n",
    "\n",
    "with qe.Timer() as timer1:\n",
    "    qm(0.1, int(n))\n",
    "time1 = timer1.elapsed"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d05c30d5",
   "metadata": {},
   "source": [
    "Now let’s try qm_numba"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2215a7f7",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "with qe.Timer() as timer2:\n",
    "    qm_numba(0.1, int(n))\n",
    "time2 = timer2.elapsed"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8c656d57",
   "metadata": {},
   "source": [
    "This is already a very large speed gain.\n",
    "\n",
    "In fact, the next time and all subsequent times it runs even faster as the function has been compiled and is in memory:\n",
    "\n",
    "\n",
    "<a id='qm-numba-result'></a>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ca05a763",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "with qe.Timer() as timer3:\n",
    "    qm_numba(0.1, int(n))\n",
    "time3 = timer3.elapsed"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a85ed910",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "time1 / time3  # Calculate speed gain"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7f5e8c28",
   "metadata": {},
   "source": [
    "This kind of speed gain is impressive relative to how simple and clear the modification is."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "34ad02a3",
   "metadata": {},
   "source": [
    "### How and When it Works\n",
    "\n",
    "Numba attempts to generate fast machine code using the infrastructure provided by the [LLVM Project](https://llvm.org/).\n",
    "\n",
    "It does this by inferring type information on the fly.\n",
    "\n",
    "(See our [earlier lecture](https://python-programming.quantecon.org/need_for_speed.html) on scientific computing for a discussion of types.)\n",
    "\n",
    "The basic idea is this:\n",
    "\n",
    "- Python is very flexible and hence we could call the function qm with many\n",
    "  types.  \n",
    "  - e.g., `x0` could be a NumPy array or a list, `n` could be an integer or a float, etc.  \n",
    "- This makes it hard to *pre*-compile the function (i.e., compile before runtime).  \n",
    "- However, when we do actually call the function, say by running `qm(0.5, 10)`,\n",
    "  the types of `x0` and `n` become clear.  \n",
    "- Moreover, the types of other variables in `qm` can be inferred once the input types are known.  \n",
    "- So the strategy of Numba and other JIT compilers is to wait until this\n",
    "  moment, and *then* compile the function.  \n",
    "\n",
    "\n",
    "That’s why it is called “just-in-time” compilation.\n",
    "\n",
    "Note that, if you make the call `qm(0.5, 10)` and then follow it with `qm(0.9, 20)`, compilation only takes place on the first call.\n",
    "\n",
    "The compiled code is then cached and recycled as required.\n",
    "\n",
    "This is why, in the code above, `time3` is smaller than `time2`."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1642c032",
   "metadata": {},
   "source": [
    "## Decorator Notation\n",
    "\n",
    "In the code above we created a JIT compiled version of `qm` via the call"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "352f4943",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "qm_numba = jit(qm)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "dceaaee0",
   "metadata": {},
   "source": [
    "In practice this would typically be done using an alternative *decorator* syntax.\n",
    "\n",
    "(We discuss decorators in a [separate lecture](https://python-programming.quantecon.org/python_advanced_features.html) but you can skip the details at this stage.)\n",
    "\n",
    "Let’s see how this is done.\n",
    "\n",
    "To target a function for JIT compilation we can put `@jit` before the function definition.\n",
    "\n",
    "Here’s what this looks like for `qm`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "467ef1f2",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "@jit\n",
    "def qm(x0, n):\n",
    "    x = np.empty(n+1)\n",
    "    x[0] = x0\n",
    "    for t in range(n):\n",
    "        x[t+1] = α * x[t] * (1 - x[t])\n",
    "    return x"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d89bc36d",
   "metadata": {},
   "source": [
    "This is equivalent to adding `qm = jit(qm)` after the function definition.\n",
    "\n",
    "The following now uses the jitted version:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "fc322522",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "with qe.Timer(precision=4):\n",
    "    qm(0.1, 100_000)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e3582e18",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "with qe.Timer(precision=4):\n",
    "    qm(0.1, 100_000)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "70bec448",
   "metadata": {},
   "source": [
    "Numba also provides several arguments for decorators to accelerate computation and cache functions – see [here](https://numba.readthedocs.io/en/stable/user/performance-tips.html)."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b8ee6cf8",
   "metadata": {},
   "source": [
    "## Type Inference\n",
    "\n",
    "Successful type inference is a key part of JIT compilation.\n",
    "\n",
    "As you can imagine, inferring types is easier for simple Python objects (e.g., simple scalar data types such as floats and integers).\n",
    "\n",
    "Numba also plays well with NumPy arrays, which have well-defined types.\n",
    "\n",
    "In an ideal setting, Numba can infer all necessary type information.\n",
    "\n",
    "This allows it to generate native machine code, without having to call the Python runtime environment.\n",
    "\n",
    "In such a setting, Numba will be on par with machine code from low-level languages.\n",
    "\n",
    "When Numba cannot infer all type information, it will raise an error.\n",
    "\n",
    "For example, in the (artificial) setting below, Numba is unable to determine the type of function `mean` when compiling the function `bootstrap`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "56ba5d42",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "@jit\n",
    "def bootstrap(data, statistics, n):\n",
    "    bootstrap_stat = np.empty(n)\n",
    "    n = len(data)\n",
    "    for i in range(n_resamples):\n",
    "        resample = np.random.choice(data, size=n, replace=True)\n",
    "        bootstrap_stat[i] = statistics(resample)\n",
    "    return bootstrap_stat\n",
    "\n",
    "# No decorator here.\n",
    "def mean(data):\n",
    "    return np.mean(data)\n",
    "\n",
    "data = np.array((2.3, 3.1, 4.3, 5.9, 2.1, 3.8, 2.2))\n",
    "n_resamples = 10\n",
    "\n",
    "# This code throws an error\n",
    "try:\n",
    "    bootstrap(data, mean, n_resamples)\n",
    "except Exception as e:\n",
    "    print(e)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0661071e",
   "metadata": {},
   "source": [
    "We can fix this error easily in this case by compiling `mean`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a7cab997",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "@jit\n",
    "def mean(data):\n",
    "    return np.mean(data)\n",
    "\n",
    "with qe.Timer():\n",
    "    bootstrap(data, mean, n_resamples)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7d6771a7",
   "metadata": {},
   "source": [
    "## Compiling Classes\n",
    "\n",
    "As mentioned above, at present Numba can only compile a subset of Python.\n",
    "\n",
    "However, that subset is ever expanding.\n",
    "\n",
    "Notably, Numba is now quite effective at compiling classes.\n",
    "\n",
    "If a class is successfully compiled, then its methods act as JIT-compiled\n",
    "functions.\n",
    "\n",
    "To give one example, let’s consider the class for analyzing the Solow growth model we\n",
    "created in [this lecture](https://python-programming.quantecon.org/python_oop.html).\n",
    "\n",
    "To compile this class we use the `@jitclass` decorator:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "adfdc5da",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "from numba import float64\n",
    "from numba.experimental import jitclass"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5b84a8c3",
   "metadata": {},
   "source": [
    "Notice that we also imported something called `float64`.\n",
    "\n",
    "This is a data type representing standard floating point numbers.\n",
    "\n",
    "We are importing it here because Numba needs a bit of extra help with types when it tries to deal with classes.\n",
    "\n",
    "Here’s our code:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5e713f20",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "solow_data = [\n",
    "    ('n', float64),\n",
    "    ('s', float64),\n",
    "    ('δ', float64),\n",
    "    ('α', float64),\n",
    "    ('z', float64),\n",
    "    ('k', float64)\n",
    "]\n",
    "\n",
    "@jitclass(solow_data)\n",
    "class Solow:\n",
    "    r\"\"\"\n",
    "    Implements the Solow growth model with the update rule\n",
    "\n",
    "        k_{t+1} = [(s z k^α_t) + (1 - δ)k_t] /(1 + n)\n",
    "\n",
    "    \"\"\"\n",
    "    def __init__(self, n=0.05,  # population growth rate\n",
    "                       s=0.25,  # savings rate\n",
    "                       δ=0.1,   # depreciation rate\n",
    "                       α=0.3,   # share of labor\n",
    "                       z=2.0,   # productivity\n",
    "                       k=1.0):  # current capital stock\n",
    "\n",
    "        self.n, self.s, self.δ, self.α, self.z = n, s, δ, α, z\n",
    "        self.k = k\n",
    "\n",
    "    def h(self):\n",
    "        \"Evaluate the h function\"\n",
    "        # Unpack parameters (get rid of self to simplify notation)\n",
    "        n, s, δ, α, z = self.n, self.s, self.δ, self.α, self.z\n",
    "        # Apply the update rule\n",
    "        return (s * z * self.k**α + (1 - δ) * self.k) / (1 + n)\n",
    "\n",
    "    def update(self):\n",
    "        \"Update the current state (i.e., the capital stock).\"\n",
    "        self.k =  self.h()\n",
    "\n",
    "    def steady_state(self):\n",
    "        \"Compute the steady state value of capital.\"\n",
    "        # Unpack parameters (get rid of self to simplify notation)\n",
    "        n, s, δ, α, z = self.n, self.s, self.δ, self.α, self.z\n",
    "        # Compute and return steady state\n",
    "        return ((s * z) / (n + δ))**(1 / (1 - α))\n",
    "\n",
    "    def generate_sequence(self, t):\n",
    "        \"Generate and return a time series of length t\"\n",
    "        path = []\n",
    "        for i in range(t):\n",
    "            path.append(self.k)\n",
    "            self.update()\n",
    "        return path"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3806f5a1",
   "metadata": {},
   "source": [
    "First we specified the types of the instance data for the class in\n",
    "`solow_data`.\n",
    "\n",
    "After that, targeting the class for JIT compilation only requires adding\n",
    "`@jitclass(solow_data)` before the class definition.\n",
    "\n",
    "When we call the methods in the class, the methods are compiled just like functions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "673e2195",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "s1 = Solow()\n",
    "s2 = Solow(k=8.0)\n",
    "\n",
    "T = 60\n",
    "fig, ax = plt.subplots()\n",
    "\n",
    "# Plot the common steady state value of capital\n",
    "ax.plot([s1.steady_state()]*T, 'k-', label='steady state')\n",
    "\n",
    "# Plot time series for each economy\n",
    "for s in s1, s2:\n",
    "    lb = f'capital series from initial state {s.k}'\n",
    "    ax.plot(s.generate_sequence(T), 'o-', lw=2, alpha=0.6, label=lb)\n",
    "ax.set_ylabel('$k_{t}$', fontsize=12)\n",
    "ax.set_xlabel('$t$', fontsize=12)\n",
    "ax.legend()\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3995ffda",
   "metadata": {},
   "source": [
    "## Dangers and Limitations\n",
    "\n",
    "Let’s review the above and add some cautionary notes."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "68a003db",
   "metadata": {},
   "source": [
    "### Limitations\n",
    "\n",
    "As we’ve seen, Numba needs to infer type information on\n",
    "all variables to generate fast machine-level instructions.\n",
    "\n",
    "For simple routines, Numba infers types very well.\n",
    "\n",
    "For larger ones, or for routines using external libraries, it can easily fail.\n",
    "\n",
    "Hence, it’s prudent when using Numba to focus on speeding up small, time-critical snippets of code.\n",
    "\n",
    "This will give you much better performance than blanketing your Python programs with `@njit` statements."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8f76df71",
   "metadata": {},
   "source": [
    "### A Gotcha: Global Variables\n",
    "\n",
    "Here’s another thing to be careful about when using Numba.\n",
    "\n",
    "Consider the following example"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ae5033c0",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "a = 1\n",
    "\n",
    "@jit\n",
    "def add_a(x):\n",
    "    return a + x\n",
    "\n",
    "print(add_a(10))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0efff9e9",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "a = 2\n",
    "\n",
    "print(add_a(10))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8c6ea2ff",
   "metadata": {},
   "source": [
    "Notice that changing the global had no effect on the value returned by the\n",
    "function.\n",
    "\n",
    "When Numba compiles machine code for functions, it treats global variables as constants to ensure type stability.\n",
    "\n",
    "\n",
    "<a id='multithreading'></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "216bf4b6",
   "metadata": {},
   "source": [
    "## Multithreaded Loops in Numba\n",
    "\n",
    "In addition to JIT compilation, Numba provides powerful support for parallel computing on CPUs.\n",
    "\n",
    "By distributing computations across multiple CPU cores, we can achieve significant speed gains for many numerical algorithms.\n",
    "\n",
    "The key tool for parallelization in Numba is the `prange` function, which tells Numba to execute loop iterations in parallel across available CPU cores.\n",
    "\n",
    "This approach to multithreading works well for a wide range of problems in scientific computing and quantitative economics.\n",
    "\n",
    "To illustrate, let’s look first at a simple, single-threaded (i.e., non-parallelized) piece of code.\n",
    "\n",
    "The code simulates updating the wealth $ w_t $ of a household via the rule\n",
    "\n",
    "$$\n",
    "w_{t+1} = R_{t+1} s w_t + y_{t+1}\n",
    "$$\n",
    "\n",
    "Here\n",
    "\n",
    "- $ R $ is the gross rate of return on assets  \n",
    "- $ s $ is the savings rate of the household and  \n",
    "- $ y $ is labor income.  \n",
    "\n",
    "\n",
    "We model both $ R $ and $ y $ as independent draws from a lognormal\n",
    "distribution.\n",
    "\n",
    "Here’s the code:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bd78c9c4",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "from numpy.random import randn\n",
    "from numba import njit\n",
    "\n",
    "@njit\n",
    "def h(w, r=0.1, s=0.3, v1=0.1, v2=1.0):\n",
    "    \"\"\"\n",
    "    Updates household wealth.\n",
    "    \"\"\"\n",
    "\n",
    "    # Draw shocks\n",
    "    R = np.exp(v1 * randn()) * (1 + r)\n",
    "    y = np.exp(v2 * randn())\n",
    "\n",
    "    # Update wealth\n",
    "    w = R * s * w + y\n",
    "    return w"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "626958f7",
   "metadata": {},
   "source": [
    "Let’s have a look at how wealth evolves under this rule."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c4ab1b4c",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "fig, ax = plt.subplots()\n",
    "\n",
    "T = 100\n",
    "w = np.empty(T)\n",
    "w[0] = 5\n",
    "for t in range(T-1):\n",
    "    w[t+1] = h(w[t])\n",
    "\n",
    "ax.plot(w)\n",
    "ax.set_xlabel('$t$', fontsize=12)\n",
    "ax.set_ylabel('$w_{t}$', fontsize=12)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9831304d",
   "metadata": {},
   "source": [
    "Now let’s suppose that we have a large population of households and we want to\n",
    "know what median wealth will be.\n",
    "\n",
    "This is not easy to solve with pencil and paper, so we will use simulation\n",
    "instead.\n",
    "\n",
    "In particular, we will simulate a large number of households and then\n",
    "calculate median wealth for this group.\n",
    "\n",
    "Suppose we are interested in the long-run average of this median over time.\n",
    "\n",
    "It turns out that, for the specification that we’ve chosen above, we can\n",
    "calculate this by taking a one-period snapshot of what has happened to median\n",
    "wealth of the group at the end of a long simulation.\n",
    "\n",
    "Moreover, provided the simulation period is long enough, initial conditions\n",
    "don’t matter.\n",
    "\n",
    "- This is due to something called ergodicity, which we will discuss [later on](https://python.quantecon.org/finite_markov.html#id15).  \n",
    "\n",
    "\n",
    "So, in summary, we are going to simulate 50,000 households by\n",
    "\n",
    "1. arbitrarily setting initial wealth to 1 and  \n",
    "1. simulating forward in time for 1,000 periods.  \n",
    "\n",
    "\n",
    "Then we’ll calculate median wealth at the end period.\n",
    "\n",
    "Here’s the code:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5212abf4",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "@njit\n",
    "def compute_long_run_median(w0=1, T=1000, num_reps=50_000):\n",
    "\n",
    "    obs = np.empty(num_reps)\n",
    "    for i in range(num_reps):\n",
    "        w = w0\n",
    "        for t in range(T):\n",
    "            w = h(w)\n",
    "        obs[i] = w\n",
    "\n",
    "    return np.median(obs)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e3ec8732",
   "metadata": {},
   "source": [
    "Let’s see how fast this runs:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8150b838",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "with qe.Timer():\n",
    "    compute_long_run_median()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ac0096d9",
   "metadata": {},
   "source": [
    "To speed this up, we’re going to parallelize it via multithreading.\n",
    "\n",
    "To do so, we add the `parallel=True` flag and change `range` to `prange`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4462f963",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "from numba import prange\n",
    "\n",
    "@njit(parallel=True)\n",
    "def compute_long_run_median_parallel(w0=1, T=1000, num_reps=50_000):\n",
    "\n",
    "    obs = np.empty(num_reps)\n",
    "    for i in prange(num_reps):\n",
    "        w = w0\n",
    "        for t in range(T):\n",
    "            w = h(w)\n",
    "        obs[i] = w\n",
    "\n",
    "    return np.median(obs)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5809f376",
   "metadata": {},
   "source": [
    "Let’s look at the timing:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "66a51ae4",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "with qe.Timer():\n",
    "    compute_long_run_median_parallel()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "de5736b7",
   "metadata": {},
   "source": [
    "The speed-up is significant."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "22aad262",
   "metadata": {},
   "source": [
    "## Exercises"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "dbec1695",
   "metadata": {},
   "source": [
    "## Exercise 13.1\n",
    "\n",
    "[Previously](https://python-programming.quantecon.org/python_by_example.html#pbe_ex5) we considered how to approximate $ \\pi $ by\n",
    "Monte Carlo.\n",
    "\n",
    "Use the same idea here, but make the code efficient using Numba.\n",
    "\n",
    "Compare speed with and without Numba when the sample size is large."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1fbd17ff",
   "metadata": {},
   "source": [
    "## Solution\n",
    "\n",
    "Here is one solution:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "dfde1c3d",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "from random import uniform\n",
    "\n",
    "@jit\n",
    "def calculate_pi(n=1_000_000):\n",
    "    count = 0\n",
    "    for i in range(n):\n",
    "        u, v = uniform(0, 1), uniform(0, 1)\n",
    "        d = np.sqrt((u - 0.5)**2 + (v - 0.5)**2)\n",
    "        if d < 0.5:\n",
    "            count += 1\n",
    "\n",
    "    area_estimate = count / n\n",
    "    return area_estimate * 4  # dividing by radius**2"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1328d6ed",
   "metadata": {},
   "source": [
    "Now let’s see how fast it runs:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "19de0380",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "with qe.Timer():\n",
    "    calculate_pi()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1fb6f817",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "with qe.Timer():\n",
    "    calculate_pi()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6245e73c",
   "metadata": {},
   "source": [
    "If we switch off JIT compilation by removing `@njit`, the code takes around\n",
    "150 times as long on our machine.\n",
    "\n",
    "So we get a speed gain of 2 orders of magnitude–which is huge–by adding four\n",
    "characters."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7ff808bb",
   "metadata": {},
   "source": [
    "## Exercise 13.2\n",
    "\n",
    "In the [Introduction to Quantitative Economics with Python](https://intro.quantecon.org/intro.html) lecture series you can\n",
    "learn all about finite-state Markov chains.\n",
    "\n",
    "For now, let’s just concentrate on simulating a very simple example of such a chain.\n",
    "\n",
    "Suppose that the volatility of returns on an asset can be in one of two regimes — high or low.\n",
    "\n",
    "The transition probabilities across states are as follows\n",
    "\n",
    "![https://python-programming.quantecon.org/_static/lecture_specific/sci_libs/nfs_ex1.png](https://python-programming.quantecon.org/_static/lecture_specific/sci_libs/nfs_ex1.png)\n",
    "\n",
    "For example, let the period length be one day, and suppose the current state is high.\n",
    "\n",
    "We see from the graph that the state tomorrow will be\n",
    "\n",
    "- high with probability 0.8  \n",
    "- low with probability 0.2  \n",
    "\n",
    "\n",
    "Your task is to simulate a sequence of daily volatility states according to this rule.\n",
    "\n",
    "Set the length of the sequence to `n = 1_000_000` and start in the high state.\n",
    "\n",
    "Implement a pure Python version and a Numba version, and compare speeds.\n",
    "\n",
    "To test your code, evaluate the fraction of time that the chain spends in the low state.\n",
    "\n",
    "If your code is correct, it should be about 2/3.\n",
    "\n",
    "- Represent the low state as 0 and the high state as 1.  \n",
    "- If you want to store integers in a NumPy array and then apply JIT compilation, use `x = np.empty(n, dtype=np.int_)`.  "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "53d97ec5",
   "metadata": {},
   "source": [
    "## Solution\n",
    "\n",
    "We let\n",
    "\n",
    "- 0 represent “low”  \n",
    "- 1 represent “high”  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7c539f59",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "p, q = 0.1, 0.2  # Prob of leaving low and high state respectively"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1eddfa1e",
   "metadata": {},
   "source": [
    "Here’s a pure Python version of the function"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5954ad6e",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "def compute_series(n):\n",
    "    x = np.empty(n, dtype=np.int_)\n",
    "    x[0] = 1  # Start in state 1\n",
    "    U = np.random.uniform(0, 1, size=n)\n",
    "    for t in range(1, n):\n",
    "        current_x = x[t-1]\n",
    "        if current_x == 0:\n",
    "            x[t] = U[t] < p\n",
    "        else:\n",
    "            x[t] = U[t] > q\n",
    "    return x"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7fc40f9e",
   "metadata": {},
   "source": [
    "Let’s run this code and check that the fraction of time spent in the low\n",
    "state is about 0.666"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "afa5bf2c",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "n = 1_000_000\n",
    "x = compute_series(n)\n",
    "print(np.mean(x == 0))  # Fraction of time x is in state 0"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1ec52ef2",
   "metadata": {},
   "source": [
    "This is (approximately) the right output.\n",
    "\n",
    "Now let’s time it:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "520e94d7",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "with qe.Timer():\n",
    "    compute_series(n)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "53d384d6",
   "metadata": {},
   "source": [
    "Next let’s implement a Numba version, which is easy"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8ba58873",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "compute_series_numba = jit(compute_series)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e76eedfc",
   "metadata": {},
   "source": [
    "Let’s check we still get the right numbers"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "59c7b1a8",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "x = compute_series_numba(n)\n",
    "print(np.mean(x == 0))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "aff2810c",
   "metadata": {},
   "source": [
    "Let’s see the time"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "48c8b880",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "with qe.Timer():\n",
    "    compute_series_numba(n)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "55f5bcbb",
   "metadata": {},
   "source": [
    "This is a nice speed improvement for one line of code!"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d600f289",
   "metadata": {},
   "source": [
    "## Exercise 13.3\n",
    "\n",
    "In [an earlier exercise](#speed_ex1), we used Numba to accelerate an\n",
    "effort to compute the constant $ \\pi $ by Monte Carlo.\n",
    "\n",
    "Now try adding parallelization and see if you get further speed gains.\n",
    "\n",
    "You should not expect huge gains here because, while there are many\n",
    "independent tasks (draw point and test if in circle), each one has low\n",
    "execution time.\n",
    "\n",
    "Generally speaking, parallelization is less effective when the individual\n",
    "tasks to be parallelized are very small relative to total execution time.\n",
    "\n",
    "This is due to overheads associated with spreading all of these small tasks across multiple CPUs.\n",
    "\n",
    "Nevertheless, with suitable hardware, it is possible to get nontrivial speed gains in this exercise.\n",
    "\n",
    "For the size of the Monte Carlo simulation, use something substantial, such as\n",
    "`n = 100_000_000`."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "947d1f82",
   "metadata": {},
   "source": [
    "## Solution\n",
    "\n",
    "Here is one solution:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3bba504d",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "from random import uniform\n",
    "\n",
    "@njit(parallel=True)\n",
    "def calculate_pi(n=1_000_000):\n",
    "    count = 0\n",
    "    for i in prange(n):\n",
    "        u, v = uniform(0, 1), uniform(0, 1)\n",
    "        d = np.sqrt((u - 0.5)**2 + (v - 0.5)**2)\n",
    "        if d < 0.5:\n",
    "            count += 1\n",
    "\n",
    "    area_estimate = count / n\n",
    "    return area_estimate * 4  # dividing by radius**2"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "611baef8",
   "metadata": {},
   "source": [
    "Now let’s see how fast it runs:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "744bce64",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "with qe.Timer():\n",
    "    calculate_pi()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e17c0ca4",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "with qe.Timer():\n",
    "    calculate_pi()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "29cd98ed",
   "metadata": {},
   "source": [
    "By switching parallelization on and off (selecting `True` or\n",
    "`False` in the `@njit` annotation), we can test the speed gain that\n",
    "multithreading provides on top of JIT compilation.\n",
    "\n",
    "On our workstation, we find that parallelization increases execution speed by\n",
    "a factor of 2 or 3.\n",
    "\n",
    "(If you are executing locally, you will get different numbers, depending mainly\n",
    "on the number of CPUs on your machine.)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c265d699",
   "metadata": {},
   "source": [
    "## Exercise 13.4\n",
    "\n",
    "In [our lecture on SciPy](https://python-programming.quantecon.org/scipy.html), we discussed pricing a call option in a\n",
    "setting where the underlying stock price had a simple and well-known\n",
    "distribution.\n",
    "\n",
    "Here we discuss a more realistic setting.\n",
    "\n",
    "We recall that the price of the option obeys\n",
    "\n",
    "$$\n",
    "P = \\beta^n \\mathbb E \\max\\{ S_n - K, 0 \\}\n",
    "$$\n",
    "\n",
    "where\n",
    "\n",
    "1. $ \\beta $ is a discount factor,  \n",
    "1. $ n $ is the expiry date,  \n",
    "1. $ K $ is the strike price and  \n",
    "1. $ \\{S_t\\} $ is the price of the underlying asset at each time $ t $.  \n",
    "\n",
    "\n",
    "Suppose that `n, β, K = 20, 0.99, 100`.\n",
    "\n",
    "Assume that the stock price obeys\n",
    "\n",
    "$$\n",
    "\\ln \\frac{S_{t+1}}{S_t} = \\mu + \\sigma_t \\xi_{t+1}\n",
    "$$\n",
    "\n",
    "where\n",
    "\n",
    "$$\n",
    "\\sigma_t = \\exp(h_t),\n",
    "    \\quad\n",
    "        h_{t+1} = \\rho h_t + \\nu \\eta_{t+1}\n",
    "$$\n",
    "\n",
    "Here $ \\{\\xi_t\\} $ and $ \\{\\eta_t\\} $ are IID and standard normal.\n",
    "\n",
    "(This is a **stochastic volatility** model, where the volatility $ \\sigma_t $\n",
    "varies over time.)\n",
    "\n",
    "Use the defaults `μ, ρ, ν, S0, h0 = 0.0001, 0.1, 0.001, 10, 0`.\n",
    "\n",
    "(Here `S0` is $ S_0 $ and `h0` is $ h_0 $.)\n",
    "\n",
    "By generating $ M $ paths $ s_0, \\ldots, s_n $, compute the Monte Carlo estimate\n",
    "\n",
    "$$\n",
    "\\hat P_M\n",
    "    := \\beta^n \\mathbb E \\max\\{ S_n - K, 0 \\}\n",
    "    \\approx\n",
    "    \\frac{1}{M} \\sum_{m=1}^M \\max \\{S_n^m - K, 0 \\}\n",
    "$$\n",
    "\n",
    "of the price, applying Numba and parallelization."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0bf46788",
   "metadata": {},
   "source": [
    "## Solution\n",
    "\n",
    "With $ s_t := \\ln S_t $, the price dynamics become\n",
    "\n",
    "$$\n",
    "s_{t+1} = s_t + \\mu + \\exp(h_t) \\xi_{t+1}\n",
    "$$\n",
    "\n",
    "Using this fact, the solution can be written as follows."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "91d2f581",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "from numpy.random import randn\n",
    "M = 10_000_000\n",
    "\n",
    "n, β, K = 20, 0.99, 100\n",
    "μ, ρ, ν, S0, h0 = 0.0001, 0.1, 0.001, 10, 0\n",
    "\n",
    "@njit(parallel=True)\n",
    "def compute_call_price_parallel(β=β,\n",
    "                                μ=μ,\n",
    "                                S0=S0,\n",
    "                                h0=h0,\n",
    "                                K=K,\n",
    "                                n=n,\n",
    "                                ρ=ρ,\n",
    "                                ν=ν,\n",
    "                                M=M):\n",
    "    current_sum = 0.0\n",
    "    # For each sample path\n",
    "    for m in prange(M):\n",
    "        s = np.log(S0)\n",
    "        h = h0\n",
    "        # Simulate forward in time\n",
    "        for t in range(n):\n",
    "            s = s + μ + np.exp(h) * randn()\n",
    "            h = ρ * h + ν * randn()\n",
    "        # And add the value max{S_n - K, 0} to current_sum\n",
    "        current_sum += np.maximum(np.exp(s) - K, 0)\n",
    "\n",
    "    return β**n * current_sum / M"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6847029d",
   "metadata": {},
   "source": [
    "Try swapping between `parallel=True` and `parallel=False` and noting the run time.\n",
    "\n",
    "If you are on a machine with many CPUs, the difference should be significant."
   ]
  }
 ],
 "metadata": {
  "date": 1768359378.222308,
  "filename": "numba.md",
  "kernelspec": {
   "display_name": "Python",
   "language": "python3",
   "name": "python3"
  },
  "title": "Numba"
 },
 "nbformat": 4,
 "nbformat_minor": 5
}