{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# PyHEP Numba tutorial on February 3, 2021" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Simple stuff\n", "\n", "Even if you've used Numba before, let's start with the basics.\n", "\n", "In fact, let's start with NumPy itself." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### NumPy" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "NumPy accelerates Python code by replacing loops in Python's virtual machine (with type-checks at runtime) with precompiled loops that transform arrays into arrays." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data = np.arange(1000000)\n", "data" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%timeit\n", "\n", "output = np.empty(len(data))\n", "\n", "for i, x in enumerate(data):\n", " output[i] = x**2" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%timeit\n", "\n", "output = data**2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "But if you have to compute a complex formula, NumPy would have to _make an array for each intermediate step_.\n", "\n", "(There are tricks for circumventing this, but we won't get into that.)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "energy = np.random.normal(100, 10, 1000000)\n", "px = np.random.normal(0, 10, 1000000)\n", "py = np.random.normal(0, 10, 1000000)\n", "pz = np.random.normal(0, 10, 1000000)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%timeit\n", "\n", "mass = np.sqrt(energy**2 - px**2 - py**2 - pz**2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The above is equivalent to" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%timeit\n", "\n", "tmp1 = energy**2\n", "tmp2 = px**2\n", "tmp3 = py**2\n", "tmp4 = pz**2\n", "tmp5 = tmp1 - tmp2\n", "tmp6 = tmp5 - tmp3\n", "tmp7 = tmp6 - tmp4\n", "mass = np.sqrt(tmp7)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Numba\n", "\n", "(I always mistype \"numba\" as \"numpy\"...)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numba as nb" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Numba lets us compile a function to compute a whole formula in one step.\n", "\n", "```python\n", "@nb.jit\n", "```\n", "\n", "means \"JIT-compile\" and\n", "\n", "```python\n", "@nb.njit\n", "```\n", "\n", "means \"really JIT-compile\" because the original has a fallback mode that's getting deprecated. If we're using Numba at all, we don't want it to fall back to ordinary Python." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "@nb.njit\n", "def compute_mass(energy, px, py, pz):\n", " mass = np.empty(len(energy))\n", " for i in range(len(energy)):\n", " mass[i] = np.sqrt(energy[i]**2 - px[i]**2 - py[i]**2 - pz[i]**2)\n", " return mass" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `compute_mass` function is now \"ready to be compiled.\" It will be compiled when we give it arguments, so that it can propagate types." 
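] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You don't have to wait for that first call if you don't want to: `@nb.njit` also accepts an explicit signature, in which case Numba compiles eagerly. Here is a minimal sketch (the name `compute_mass_eager` is just for illustration):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Eager compilation: the signature declares 1-d float64 arrays in, a 1-d float64 array out,\n", "# so Numba compiles when this cell runs instead of waiting for the first call.\n", "@nb.njit(nb.float64[:](nb.float64[:], nb.float64[:], nb.float64[:], nb.float64[:]))\n", "def compute_mass_eager(energy, px, py, pz):\n", "    mass = np.empty(len(energy))\n", "    for i in range(len(energy)):\n", "        mass[i] = np.sqrt(energy[i]**2 - px[i]**2 - py[i]**2 - pz[i]**2)\n", "    return mass\n", "\n", "compute_mass_eager(energy, px, py, pz)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Back to the lazily compiled `compute_mass`: the first call below is what triggers its compilation."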
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "compute_mass(energy, px, py, pz)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%timeit\n", "\n", "mass = compute_mass(energy, px, py, pz)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "_(Note to self: show `fastmath` and `parallel`.)_" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Entering the JIT'ed code\n", "\n", "What's wrong with the following?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "@nb.njit\n", "def compute_mass_i(energy_i, px_i, py_i, pz_i):\n", " return np.sqrt(energy_i**2 - px_i**2 - py_i**2 - pz_i**2)\n", "\n", "compute_mass_i(energy[0], px[0], py[0], pz[0])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%timeit\n", "\n", "mass = np.empty(len(energy))\n", "for i in range(len(energy)):\n", " mass[i] = compute_mass_i(energy[i], px[i], py[i], pz[i])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What do you think about this one?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "@nb.njit\n", "def compute_mass_arrays(energy, px, py, pz):\n", " return np.sqrt(energy**2 - px**2 - py**2 - pz**2)\n", "\n", "compute_mass_arrays(energy, px, py, pz)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%timeit\n", "\n", "mass = compute_mass_arrays(energy, px, py, pz)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "_(Note to self: show `nb.vectorize`.)_" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Type considerations" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Dynamically typed programs that are not statically typed\n", "\n", "Much of Python's flexibility comes from the fact that it does not need to know the types of all variables before it starts to run.\n", "\n", "That dynamism makes it easier to express complex logic (what I call \"bookkeeping\"), but it is a hurdle for speed. Dynamic typing was the first thing to go in Didier Verna's _How to make LISP go faster than C_." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def perfectly_valid_script(data):\n", " output = np.empty(len(data))\n", " for i, group in enumerate(data):\n", " minimum = \"nothing yet\"\n", " for x in group:\n", " if minimum == \"nothing yet\":\n", " minimum = x\n", " elif x < minimum:\n", " minimum = x\n", " output[i] = minimum\n", " return output" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data = np.random.normal(2, 1, (100000, 3))\n", "data" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "perfectly_valid_script(data)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "invalid_for_numba = nb.njit(perfectly_valid_script)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "raises-exception" ] }, "outputs": [], "source": [ "invalid_for_numba(data)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "How can we fix it up?" 
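] }, { "cell_type": "markdown", "metadata": {}, "source": [ "One possible fix, as a sketch (not the only way to do it): give `minimum` a single type for its whole lifetime, for instance by starting from a floating-point sentinel like `np.inf` instead of a string." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "@nb.njit\n", "def valid_for_numba(data):\n", "    output = np.empty(len(data))\n", "    for i, group in enumerate(data):\n", "        minimum = np.inf   # a float sentinel, so \"minimum\" is always a float64\n", "        for x in group:\n", "            if x < minimum:\n", "                minimum = x\n", "        output[i] = minimum\n", "    return output\n", "\n", "valid_for_numba(data)"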
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### What is \"type unification?\"\n", "\n", "Consider the following:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def another_valid_script(data):\n", " output = np.empty(len(data))\n", " for i, group in enumerate(data):\n", " total = np.sum(group)\n", " if total < 0:\n", " return \"we don't like negative sums\"\n", " else:\n", " output[i] = total\n", " return output" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "another_valid_script(data[:10])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "invalid_for_numba = nb.njit(another_valid_script)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "raises-exception" ] }, "outputs": [], "source": [ "invalid_for_numba(data)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Does it matter whether we limit `data` to the first 10 elements?\n", "\n", "How can we fix it up?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Avoid lists and dicts\n", "\n", "Although Numba developers are doing a lot of work on supporting Python `list` and `dict` (of identically typed contents), I find it to be too easy to run into unsupported cases. The main problem is that their contents _must_ be typed, but the Python language didn't have ways to express types.\n", "\n", "(Yes, Python has type annotations now, but they're not fully integrated into Numba yet.)\n", "\n", "Let's start with something that works and make small changes to it." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "@nb.njit\n", "def output_a_list(data):\n", " output = []\n", " for group in data:\n", " total = 0.0\n", " for x in group:\n", " total += x\n", " output.append(total)\n", " return output" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data = np.random.normal(2, 1, (10, 3))\n", "data" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "output_a_list(data)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Closures versus arguments\n", "\n", "In Python, you can create functions at runtime, and these functions can reference data defined outside of the function." 
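] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Numba can handle some of this: simple values referenced from outside a JIT'ed function are frozen into the compiled code as constants. A small sketch (the names below are made up for illustration):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "scale = 123.0\n", "\n", "@nb.njit\n", "def scale_first_elements(data):\n", "    # \"scale\" is not an argument; Numba captures it as a compile-time constant\n", "    out = np.empty(len(data))\n", "    for i, group in enumerate(data):\n", "        out[i] = scale * group[0]\n", "    return out\n", "\n", "scale_first_elements(data)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Mutable state referenced from outside the function is trickier, as the next example shows."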
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "accumulate = nb.typed.List([])\n", "\n", "def yet_another_valid_script(data):\n", " for group in data:\n", " total = 0.0\n", " for x in group:\n", " total += x\n", " accumulate.append(total)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "accumulate" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "yet_another_valid_script(data)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "accumulate" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "try_it_in_numba = nb.njit(yet_another_valid_script)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "raises-exception" ] }, "outputs": [], "source": [ "try_it_in_numba(data)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "accumulate" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Functions calling functions\n", "\n", "The `@nb.njit` decorator brackets a region of code to make fast. But that doesn't mean you need to write monolithic functions; JIT'ed functions can be rapidly executed within other JIT'ed functions." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "@nb.njit\n", "def find_minimum(group):\n", " out = None\n", " for x in group:\n", " if out is None:\n", " out = x\n", " elif x < out:\n", " out = x\n", " return out" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "@nb.njit\n", "def run_over_all(data):\n", " out = np.empty(len(data))\n", " for i, group in enumerate(data):\n", " out[i] = find_minimum(group)\n", " return out" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data = np.random.normal(2, 1, (1000000, 3))\n", "data" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "run_over_all(data)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%timeit\n", "\n", "run_over_all(data)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Functional functions\n", "\n", "A JIT'ed function can also be _passed into_ a JIT'ed function as an argument. Same constraints apply." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "@nb.njit\n", "def run_over_all(do_what, data):\n", " out = np.empty(len(data))\n", " for i, group in enumerate(data):\n", " out[i] = do_what(group)\n", " return out" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "run_over_all(find_minimum, data)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%timeit\n", "\n", "run_over_all(find_minimum, data)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What about functions _returning_ functions?" 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "@nb.njit\n", "def function_returning_a_function():\n", "    def what_the_heck(x):\n", "        return x + 1\n", "    return what_the_heck" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "f = function_returning_a_function()\n", "f" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "f(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "But the reason you might want to do this—to make a closure—doesn't work, so I consider this a curiosity." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "@nb.njit\n", "def function_returning_a_function(y):\n", "    def what_the_heck(x):\n", "        return x + y\n", "    return what_the_heck" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "raises-exception" ] }, "outputs": [], "source": [ "function_returning_a_function(1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### What about classes?\n", "\n", "Numba has a whole system for annotating classes to be used in JIT'ed functions: `@nb.experimental.jitclass` ([see documentation](https://numba.pydata.org/numba-doc/latest/user/jitclass.html)).\n", "\n", "However, I rarely use it—you find yourself having to annotate more and more until it doesn't look like Python. The value of Numba is that you can develop quickly without leaving the Python environment. When you're doing something so complex as to need classes, you might want to write a C++ program bound to Python with [pybind11](https://pybind11.readthedocs.io/) or ROOT.\n", "\n", "But here's a simple one:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "@nb.experimental.jitclass([\n", "    (\"E\", nb.float64),\n", "    (\"px\", nb.float64),\n", "    (\"py\", nb.float64),\n", "    (\"pz\", nb.float64),\n", "])\n", "class Lorentz:\n", "    def __init__(self, E, px, py, pz):\n", "        self.E = E\n", "        self.px = px\n", "        self.py = py\n", "        self.pz = pz\n", "\n", "    @property\n", "    def mass(self):\n", "        return np.sqrt(self.E**2 - self.px**2 - self.py**2 - self.pz**2)\n", "\n", "@nb.njit\n", "def add_Lorentz(one, two):\n", "    return Lorentz(one.E + two.E, one.px + two.px, one.py + two.py, one.pz + two.pz)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "objects = nb.typed.List([Lorentz(E, px, py, pz) for E, px, py, pz in zip(\n", "    np.random.normal(100, 10, 10000),\n", "    np.random.normal(0, 10, 10000),\n", "    np.random.normal(0, 10, 10000),\n", "    np.random.normal(0, 10, 10000),\n", ")])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "@nb.njit\n", "def calculate_masses(one, two):\n", "    out = np.empty(len(one))\n", "    for i in range(len(one)):\n", "        out[i] = add_Lorentz(one[i], two[i]).mass\n", "    return out" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "calculate_masses(objects[:5000], objects[5000:])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It works, but small deviations from this example hit pickling errors (because Numba passes class objects through low-level ↔ Python conversions by pickling them) and other errors deep in the Numba internals.\n", "\n", "Iteration over objects is easier (and faster) with Awkward Array (see below), which will have Lorentz vector objects defined by the
[Vector](https://github.com/scikit-hep/vector) library (in development)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Auto-generating functions\n", "\n", "The biggest strength of JIT compilers is that you can generate the code to compile after you know a lot about the problem.\n", "\n", "In C++, this is the realm of templates and compile-time metaprogramming. In Numba, it can be an f-string:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "args = [\n", "    np.random.normal(0, 1, 1000000),\n", "    np.random.normal(0, 1, 1000000),\n", "    np.random.normal(0, 1, 1000000),\n", "    np.random.normal(0, 1, 1000000),\n", "    np.random.normal(0, 1, 1000000),\n", "]\n", "\n", "code = f\"\"\"\n", "@nb.vectorize\n", "def sum_in_quadrature({\", \".join(\"p%d\" % i for i in range(len(args)))}):\n", "    # compiled for exactly {len(args)} arguments\n", "    return np.sqrt({\" + \".join(\"p%d**2\" % i for i in range(len(args)))})\n", "\"\"\"\n", "\n", "print(code)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "exec(code)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sum_in_quadrature(*args)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The point is that you have the full power of Python at your disposal to examine your data and generate code before compiling it, and you don't have to hack together scripts to make source code files before you can use them.\n", "\n", "Numba has a `@nb.generated_jit` decorator for specializing a function for specific types." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "@nb.generated_jit\n", "def add_unmasked(data, mask):\n", "    # At this level, data and mask are Numba *types* in plain Python.\n", "    if isinstance(mask, nb.types.NoneType):\n", "        def implementation(data, mask):\n", "            # At this level, data and mask are *values* in a JIT'ed context.\n", "            return np.sum(data)\n", "        return implementation\n", "    else:\n", "        def implementation(data, mask):\n", "            # The names \"data\" and \"mask\" have to match the outer function.\n", "            result = 0\n", "            for d, m in zip(data, mask):\n", "                if not m:\n", "                    result += d\n", "            return result\n", "        return implementation" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data = np.random.normal(0, 1, 1000000)\n", "mask = np.random.randint(0, 2, 1000000).astype(np.bool_)\n", "data, mask" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "add_unmasked(data, mask)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "add_unmasked(data, None)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Debugging JIT'ed functions\n", "\n", "You might argue that my examples are too simple, but that's the best way to use Numba: by merging complex logic into a procedure bit by bit. 
You only benefit from the \"JIT\" aspect if you're building it up interactively—in a Python terminal, iPython, or a Jupyter notebook.\n", "\n", "**If your workflow is \"edit script, save, run from the beginning, edit script again,\" then you might as well be using a non-JIT compiler, like C++.**\n", "\n", "You can figure out how to write a function in one environment and then copy it into the script you'll be submitting to the GRID, you know.\n", "\n", "**Recommendations:**\n", "\n", " * If you're still figuring out the logic of what you want to compute, write pure Python functions.\n", " * If you have an algorithm and want to make it fast, enclose it in a function or a few functions and put `@nb.njit` at the tops of just those functions.\n", " * If that doesn't work, break it down into small pieces and try to `@nb.njit` each one individually.\n", " * There's [documentation on using Numba in gdb](https://numba.pydata.org/numba-doc/latest/user/troubleshoot.html#debugging-jit-compiled-code-with-gdb), but note that `print` works:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "@nb.njit\n", "def do_stuff(data):\n", " for group in data:\n", " print(group)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data = np.random.normal(0, 1, (1000000, 3))\n", "data" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "do_stuff(data[:10]) # not too many!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Personally, I wouldn't bother with gdb. I'd just use `print`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Awkward Arrays in Numba\n", "\n", "The [Awkward Array](https://awkward-array.org) library is for doing array-at-a-time, precompiled calculations like NumPy, but on complex data structures. However, it also works well with Numba's element-at-a-time programming model because\n", "\n", " * it gets complex data structures into Numba way more easily than `@nb.jitclass`,\n", " * those data structures don't have to be \"boxed\" and \"unboxed\" into Python objects (e.g. Python's 28-byte structs for passing around each 8-byte number),\n", " * some problems are easier in array-at-a-time programming, others are easier in element-at-a-time programming.\n", "\n", "Also, [Uproot](https://uproot.readthedocs.io) makes Awkward Arrays, so it's an easy data source to access in HEP." 
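] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To give a flavor before loading real data, here is a minimal sketch with a hand-made Awkward Array (the array and function are invented for illustration): a JIT'ed function can loop over variable-length lists directly." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import awkward as ak\n", "\n", "small_example = ak.Array([[1.1, 2.2, 3.3], [], [4.4, 5.5]])\n", "\n", "@nb.njit\n", "def sum_everything(array):\n", "    # nested for loops over an Awkward Array work inside @nb.njit\n", "    total = 0.0\n", "    for sublist in array:\n", "        for x in sublist:\n", "            total += x\n", "    return total\n", "\n", "sum_everything(small_example)"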
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Awkward Arrays as arguments" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import uproot\n", "import skhep_testdata" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "events = uproot.open(skhep_testdata.data_path(\"uproot-HZZ-objects.root\"))[\"events\"]\n", "events.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "muons = events[\"muonp4\"].array()\n", "muons" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import particle\n", "from hepunits import MeV, GeV\n", "\n", "# Get the Z mass Pythonically!\n", "z_mass = particle.Particle.from_string(\"Z\").mass * MeV / GeV\n", "\n", "@nb.njit\n", "def single_mass(p1, p2):\n", " mass2 = ((p1.fE + p2.fE)**2\n", " - (p1.fP.fX + p2.fP.fX)**2 - (p1.fP.fY + p2.fP.fY)**2 - (p1.fP.fZ + p2.fP.fZ)**2)\n", " if mass2 > 0:\n", " return np.sqrt(mass2)\n", " else:\n", " return 0\n", "\n", "@nb.njit\n", "def calculate_masses(groups_of_particles):\n", " masses = np.empty(len(groups_of_particles))\n", " \n", " # Loop over groups of variable numbers of particles in each event.\n", " for num, group in enumerate(groups_of_particles):\n", " best = None\n", " # Consider all unique pairs of particles in each group.\n", " for i in range(len(group)):\n", " for j in range(i + 1, len(group)):\n", " # Compute the mass of each pair of particles.\n", " mass = single_mass(group[i], group[j])\n", " # If it's closer to the Z mass, keep it.\n", " if best is None:\n", " best = mass\n", " elif abs(mass - z_mass) < abs(best - z_mass):\n", " best = mass\n", " if best is None:\n", " masses[num] = 0\n", " else:\n", " masses[num] = best\n", " return masses" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "calculate_masses(muons)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "calculate_masses(events[\"jetp4\"].array())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Incidentally, this is how you would do the same thing with array-at-a-time functions in Awkward Array." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import awkward as ak" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Use mostly the same mass function, with \"np.where\" replacing \"if\".\n", "def vectorized_mass(p1, p2):\n", " mass2 = ((p1.fE + p2.fE)**2\n", " - (p1.fP.fX + p2.fP.fX)**2 - (p1.fP.fY + p2.fP.fY)**2 - (p1.fP.fZ + p2.fP.fZ)**2)\n", " return np.where(mass2 > 0, np.sqrt(mass2), 0)\n", "\n", "def calculate_masses_awkward(groups_of_particles):\n", " # Get the first and second in each unique pair of particles.\n", " first, second = ak.unzip(ak.combinations(groups_of_particles, 2))\n", "\n", " # Apply the function to each pair.\n", " nested_masses = vectorized_mass(first, second)\n", "\n", " # Determine how we're going to pick our favorite pair.\n", " objective_metric = abs(nested_masses - z_mass)\n", " selector = ak.argmin(objective_metric, axis=1, keepdims=True)\n", "\n", " # Apply it to the nested_masses, replace \"None\" with \"0\", and flatten the output.\n", " return ak.fill_none(nested_masses[selector], 0)[:, 0]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "calculate_masses_awkward(muons)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The method you choose depends on what seems more natural to you, what's quicker to type, and perhaps what has better performance overall.\n", "\n", "Generally speaking, element-at-a-time calculations on Awkward Arrays with Numba\n", "\n", " * will be faster than array-at-a-time calculations without Numba,\n", " * will use less memory (e.g. iterating over combinations rather than creating an array of all combinations),\n", " * will require more lines of code,\n", " * will require more fiddling to get the types right.\n", "\n", "It's also a good technique to use Numba to make arrays of booleans or arrays of integers to slice an Awkward Array." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "@nb.njit\n", "def has_Z_boson(group):\n", " for i in range(len(group)):\n", " for j in range(i + 1, len(group)):\n", " mass = single_mass(group[i], group[j])\n", " if 40 < mass < 120:\n", " return True\n", " return False\n", "\n", "@nb.njit\n", "def mask_for_Z_bosons(groups_of_particles):\n", " mask = np.empty(len(groups_of_particles), np.bool_)\n", " for num, group in enumerate(groups_of_particles):\n", " mask[num] = has_Z_boson(group)\n", " return mask" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mask = mask_for_Z_bosons(muons)\n", "mask" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "muons" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "muons[mask]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Awkward Arrays as return values\n", "\n", "Using complex data structures as input is a very different thing from creating them as output.\n", "\n", "When using them as input, you have an object that you can extract elements from with square brackets, pick out fields with a dot, or iterate over with a `for` loop.\n", "\n", "If you want to output a _part_ of that input, it's still not a problem." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "@nb.njit\n", "def first_with_Z_boson(groups_of_particles):\n", " for group in groups_of_particles:\n", " if has_Z_boson(group):\n", " return group\n", " return None" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "single_event = first_with_Z_boson(muons)\n", "single_event" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "single_event.tolist()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "But if you want to make a whole new structure that didn't exist before, that's more difficult because you need a syntax for constructing it.\n", "\n", "We've seen the difficulties of making lists (`nb.typed.List`), dicts (`nb.typed.Dict`), and class instances (`nb.jitclass`), because Numba's world must be statically typed and the Python has traditionally been a dynamically typed language.\n", "\n", "Awkward Array has a facility for building structures: [ak.ArrayBuilder](https://awkward-array.readthedocs.io/en/latest/_auto/ak.ArrayBuilder.html)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "builder = ak.ArrayBuilder()\n", "builder" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "with builder.list():\n", " builder.integer(1)\n", " builder.real(2.2)\n", " builder.string(\"three\")\n", "\n", "builder" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "builder = ak.ArrayBuilder()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "with builder.record():\n", " builder.field(\"x\").integer(1)\n", " builder.field(\"y\").real(2.2)\n", " builder.field(\"z\").string(\"three\")\n", "\n", "builder" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "An ArrayBuilder is not an Awkward Array, but `snapshot()` will turn it into one." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "builder.snapshot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here's how they differ:\n", "\n", " * Awkward Arrays are _immutable_: arrays can be transformed onto new arrays, but the original arrays can't be changed in place.\n", " * ArrayBuilders are _append-only_: new data can be added onto the end, but not into the middle.\n", " * ArrayBuilder types start specific and get more general, as needed." 
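] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The first point is easy to check (a small sketch; the exact error message depends on the Awkward Array version):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "immutable_example = ak.Array([[1, 2, 3], [], [4, 5]])\n", "try:\n", "    immutable_example[0] = 999   # in-place element assignment is not allowed\n", "except Exception as err:\n", "    print(type(err).__name__, err)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The last point, types starting specific and getting more general as needed, is what the next few cells demonstrate:"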
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "builder = ak.ArrayBuilder()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ak.type(builder)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "builder.integer(1)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ak.type(builder)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "builder.integer(2)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ak.type(builder)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "builder.real(3.3)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ak.type(builder)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "with builder.list():\n", " pass" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ak.type(builder)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "with builder.list():\n", " builder.integer(4)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ak.type(builder)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "ArrayBuilders can be used in Numba JIT-compiled functions, which makes it the easiest way to make data structures in these functions:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "@nb.njit\n", "def is_part_of_Z_boson(groups_of_particles, builder):\n", " for group in groups_of_particles:\n", " builder.begin_list()\n", " for i in range(len(group)):\n", " good_Z = False\n", " for j in range(i + 1, len(group)):\n", " mass = single_mass(group[i], group[j])\n", " if 40 < mass < 120:\n", " good_Z = True\n", " builder.boolean(good_Z)\n", " builder.end_list()\n", " return builder" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "nested_booleans = is_part_of_Z_boson(muons, ak.ArrayBuilder())\n", "nested_booleans" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "nested_booleans.snapshot()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "muons" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "muons[nested_booleans.snapshot()]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There are some limitations:\n", "\n", " * Numba doesn't support `with` statements in `@nb.njit` mode (check their [supported Python features](https://numba.pydata.org/numba-doc/dev/reference/pysupported.html#constructs) page to see if this has changed).\n", " * `with builder.list()` → `builder.begin_list()` and `builder.end_list()`\n", " * `with builder.record()` → `builder.begin_record()` and `builder.end_record()`\n", " * etc.\n", " * You can't create an ArrayBuilder inside the JIT-compiled function; pass it in.\n", " * You can't `snapshot()` the ArrayBuilder inside the JIT-compiled function; do that outside.\n", " * Because ArrayBuilder is not statically typed, it is _slower_ than any equivalent NumPy." 
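] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For the record case, here is a sketch of the same translation (assuming `begin_record`, `field`, and `end_record` are available in the JIT'ed context, as the list above suggests; the field names are invented for illustration):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "@nb.njit\n", "def summarize_events(groups_of_particles, builder):\n", "    # begin_record/field/end_record take the place of \"with builder.record()\" inside @nb.njit\n", "    for group in groups_of_particles:\n", "        builder.begin_record()\n", "        builder.field(\"n_muons\")\n", "        builder.integer(len(group))\n", "        builder.field(\"has_Z\")\n", "        builder.boolean(has_Z_boson(group))\n", "        builder.end_record()\n", "    return builder\n", "\n", "summarize_events(muons, ak.ArrayBuilder()).snapshot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As for the last point, speed, here is a direct comparison:"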
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "@nb.njit\n", "def mask_for_Z_bosons_NumPy(groups_of_particles):\n", "    mask = np.empty(len(groups_of_particles), np.bool_)\n", "    for num, group in enumerate(groups_of_particles):\n", "        mask[num] = has_Z_boson(group)\n", "    return mask\n", "\n", "@nb.njit\n", "def mask_for_Z_bosons_ArrayBuilder(groups_of_particles, builder):\n", "    for group in groups_of_particles:\n", "        builder.append(has_Z_boson(group))\n", "    return builder" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%timeit\n", "\n", "mask_for_Z_bosons_NumPy(muons)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%timeit\n", "\n", "mask_for_Z_bosons_ArrayBuilder(muons, ak.ArrayBuilder())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(We're working on a TypedArrayBuilder to take advantage of static typing.)\n", "\n", "Also useful: you can `append` parts of an Awkward Array into an ArrayBuilder (which references it like a pointer, so it doesn't matter how complex it is)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "@nb.njit\n", "def muons_that_are_part_of_Z_boson(groups_of_particles, builder):\n", "    for group in groups_of_particles:\n", "        builder.begin_list()\n", "        for i in range(len(group)):\n", "            good_Z = False\n", "            for j in range(i + 1, len(group)):\n", "                mass = single_mass(group[i], group[j])\n", "                if 40 < mass < 120:\n", "                    good_Z = True\n", "            # Instead of inserting a boolean, conditionally insert the original particle record.\n", "            if good_Z:\n", "                builder.append(group[i])\n", "        builder.end_list()\n", "    return builder" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "selected_muons = muons_that_are_part_of_Z_boson(muons, ak.ArrayBuilder())\n", "selected_muons.snapshot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Any questions?\n", "\n", "That's all I've prepared. We can use the space below for free-form questions." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.6" }, "toc-autonumbering": false, "toc-showcode": false, "toc-showmarkdowntxt": false }, "nbformat": 4, "nbformat_minor": 4 }