{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "      \n",
    "<img src=\"https://raw.githubusercontent.com/scikit-hep/mplhep/master/docs/source/_static/mplhep.png\" width=\"70%\"/>\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# mplhep"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "cell_style": "split",
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "## Styling\n",
    "- Easy publication grade styling\n",
    "- Distribute styles and fonts\n",
    "- Save real analyzer time\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-07-17T11:00:22.181763Z",
     "start_time": "2020-07-17T11:00:22.176738Z"
    },
    "cell_style": "split",
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "## Plotting\n",
    "- extend `mpl` with domain specfic functions\n",
    "- maximally align with `mpl` API to be intuitive\n",
    "- upstream as much as possible"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "cell_style": "center",
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Histogram plotting\n",
    "- `matplotlib` now has basic histogram plotting support - `plt.stairs`\n",
    "- some functionality remains too specific to be upstreamed - like histograms with error bars"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-07-06T21:17:39.006595Z",
     "start_time": "2021-07-06T21:17:38.469123Z"
    },
    "scrolled": true,
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "import numpy as np \n",
    "np.random.seed(2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-07-06T21:17:40.790067Z",
     "start_time": "2021-07-06T21:17:40.782646Z"
    },
    "scrolled": true,
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "H = np.histogram(np.random.normal(2.5, .5, 100), bins=np.arange(0,6, 0.5))\n",
    "print(\"Type:\", type(H))\n",
    "h, bins = H\n",
    "print(\"Values:\", h)\n",
    "print(\"Bins:\", bins)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### Simple Histograms"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-07-06T21:17:55.427268Z",
     "start_time": "2021-07-06T21:17:55.305116Z"
    },
    "cell_style": "center",
    "scrolled": true,
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "fig, ax = plt.subplots()\n",
    "\n",
    "ax.stairs(*H, label='Empty')\n",
    "ax.stairs(h, bins + 4, fill=True, label='Filled')\n",
    "ax.legend()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-07-06T21:18:02.962038Z",
     "start_time": "2021-07-06T21:18:02.740143Z"
    },
    "scrolled": true,
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "import mplhep as hep\n",
    "f, axs = plt.subplots(1,2, figsize=(14, 5))\n",
    "\n",
    "hep.histplot(H, ax=axs[0])\n",
    "hep.histplot(h, bins, yerr=True, ax=axs[1]);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-07-16T20:49:13.523865Z",
     "start_time": "2020-07-16T20:49:13.114937Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "#### Primary goal is to stay unobtrusive\n",
    "- if you know how `plt.hist()` works, `mplhep.histplot()` should behave like you'd expect\n",
    "- kwargs you are used to should work"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-07-06T21:18:22.052943Z",
     "start_time": "2021-07-06T21:18:21.837703Z"
    },
    "scrolled": true,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "f, axs = plt.subplots(1,2, figsize=(14, 5))\n",
    "\n",
    "hep.histplot(H, ax=axs[0], histtype='fill', hatch='///', edgecolor='white')\n",
    "hep.histplot(H, ax=axs[1], histtype='errorbar', yerr=True, c='black', capsize=4)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-07-16T20:49:13.523865Z",
     "start_time": "2020-07-16T20:49:13.114937Z"
    },
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "#### and be able to do same things as `plt.hist`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-07-06T21:27:28.320742Z",
     "start_time": "2021-07-06T21:27:28.014771Z"
    },
    "scrolled": true,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "f, axs = plt.subplots(1,3, figsize=(21, 5))\n",
    "\n",
    "hep.histplot([h, h*2], bins=bins, ax=axs[0], yerr=True)\n",
    "hep.histplot([h, h*2], bins=bins, ax=axs[1], stack=True, histtype='fill')\n",
    "hep.histplot([h, h*2], bins=bins, ax=axs[2], binwnorm=[20, 9]);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "#### Be convenient for a physicist"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-07-06T21:27:31.736450Z",
     "start_time": "2021-07-06T21:27:31.360100Z"
    },
    "scrolled": true,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "f, axs = plt.subplots(1,3, figsize=(21, 5))\n",
    "\n",
    "hep.histplot([h, h*2], bins=bins, ax=axs[0], yerr=True, label=[\"MC1\", \"MC2\"])\n",
    "hep.histplot(np.random.poisson(h*3), bins=bins, ax=axs[1], yerr=True, label=\"Data\")\n",
    "\n",
    "hep.histplot([h, h*2], bins=bins, ax=axs[2], stack=True, label=[\"MC1\", \"MC2\"], density=True)\n",
    "hep.histplot(np.random.poisson(h*3), bins=bins, ax=axs[2], yerr=True, histtype='errorbar', label=\"Data\", density=True, color='k')\n",
    "for ax in axs:\n",
    "    ax.legend()\n",
    "axs[0].set_title(\"Some MCs\")\n",
    "axs[1].set_title(\"Draw Poisson Data\")\n",
    "axs[2].set_title(\"Data/MC Shape comparison\"); "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "#### Be flexible about input types"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-07-06T21:27:39.266655Z",
     "start_time": "2021-07-06T21:27:38.930880Z"
    },
    "scrolled": true,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "f, axs = plt.subplots(1,3, figsize=(21, 5))\n",
    "\n",
    "my_hist = [1,2,3,4,2]\n",
    "hep.histplot(my_hist, bins=range(len(my_hist)+1), ax=axs[0])\n",
    "hep.histplot(H, yerr=True, ax=axs[1])\n",
    "hep.histplot(h, bins, yerr=True, ax=axs[2])\n",
    "\n",
    "axs[0].set_title(\"'Manual' inputs\")\n",
    "axs[1].set_title(\"Numpy Tuple\")\n",
    "axs[2].set_title(\"Unwrapped Tuple\");"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "#### Process anything that implements UHI PlottableProtocol"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-07-06T21:27:43.144392Z",
     "start_time": "2021-07-06T21:27:42.910807Z"
    },
    "cell_style": "split",
    "scrolled": true,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "import uproot4\n",
    "from skhep_testdata import data_path\n",
    "fname = data_path(\"uproot-hepdata-example.root\")\n",
    "f = uproot4.open(fname)\n",
    "print(f.keys())\n",
    "print(f['hpx'])\n",
    "hep.histplot(f['hpx']);"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-07-06T21:27:46.062529Z",
     "start_time": "2021-07-06T21:27:44.486291Z"
    },
    "cell_style": "split",
    "scrolled": true,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "import ROOT\n",
    "h = ROOT.TH1F(\"h1\", \"h1\", 50, -2.5, 2.5)\n",
    "h.FillRandom(\"gaus\", 10000)\n",
    "\n",
    "hep.histplot(h);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "#### Integrate tightly with `hist`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-07-06T21:27:48.517899Z",
     "start_time": "2021-07-06T21:27:48.356315Z"
    },
    "cell_style": "split",
    "scrolled": true,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "import hist\n",
    "\n",
    "h = hist.Hist(\n",
    "    hist.axis.Regular(10, 0.0, 1.0, label='X', name='x'),\n",
    ")\n",
    "h.fill(np.random.normal(0.5, 0.2, 1000))\n",
    "\n",
    "hep.histplot(h);"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-07-06T21:27:49.878761Z",
     "start_time": "2021-07-06T21:27:49.750711Z"
    },
    "cell_style": "split",
    "scrolled": true,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "import hist\n",
    "\n",
    "h = hist.Hist(\n",
    "    hist.axis.IntCategory([], \n",
    "        growth=True, label='Categorical Bins', name='x'),\n",
    ")\n",
    "h.fill(np.random.normal(5, 1, 1000))\n",
    "\n",
    "hep.histplot(h);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "## 2D histograms\n",
    "- minimal wrap on `plt.pcolormesh()`, which alrady has almost everything\n",
    "- fix(flip) indexing\n",
    "- add some annotation sugar ala `sns.heatmap`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-07-06T21:27:53.796602Z",
     "start_time": "2021-07-06T21:27:53.373629Z"
    },
    "scrolled": true,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "h2 = hist.Hist(\n",
    "    hist.axis.Regular(10, 0.0, 1.0, label='Some label'),\n",
    "    hist.axis.Regular(10, 0, 1)\n",
    ")\n",
    "h2.fill(np.random.normal(0.5, 0.2, 1000), np.random.normal(0.5, 0.2, 1000))\n",
    "hep.hist2dplot(h2, labels=True, cbar=False);"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-07-06T21:27:56.113230Z",
     "start_time": "2021-07-06T21:27:55.826249Z"
    },
    "scrolled": true,
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "print(f['hpxpy'])\n",
    "hep.hist2dplot(f['hpxpy']);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Styling\n",
    "- Primary purpose of `mplhep` is to serve and distribute styles \n",
    "    - **ALICE**\n",
    "    - **ATLAS**\n",
    "    - **CMS**\n",
    "    - **LHCb**\n",
    "- To ensure plots looks the same on any framework fonts need to be included\n",
    " - I am liable to go on a rant, so suffice to say:\n",
    " - We package an open look-alike of Helvetica called Tex Gyre Heros"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-07-06T21:28:27.033642Z",
     "start_time": "2021-07-06T21:28:26.504467Z"
    },
    "scrolled": false,
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "hep.style.use([hep.style.ATLAS])#, {'xtick.direction': 'out'}])\n",
    "hep.histplot(np.histogram(np.random.normal(10, 3, 1000)));\n",
    "hep.atlas.label();"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-07-06T21:28:29.304879Z",
     "start_time": "2021-07-06T21:28:28.653700Z"
    },
    "scrolled": false,
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "hep.style.use(\"CMS\")\n",
    "hep.histplot(np.histogram(np.random.normal(10, 3, 1000)))\n",
    "hep.cms.label() "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-07-06T21:28:36.927781Z",
     "start_time": "2021-07-06T21:28:36.126814Z"
    },
    "scrolled": false,
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "hep.style.use()\n",
    "hep.style.use([hep.style.LHCb2])\n",
    "hep.histplot(np.histogram(np.random.normal(10, 3, 1000)))\n",
    "hep.lhcb.label() "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-07-06T21:28:49.819290Z",
     "start_time": "2021-07-06T21:28:49.455881Z"
    },
    "scrolled": false,
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "hep.style.use()\n",
    "hep.style.use([hep.style.ALICE])\n",
    "hep.histplot(np.histogram(np.random.normal(10, 3, 1000)))\n",
    "hep.alice.label() "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "###  Label styles"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-07-06T21:29:01.747874Z",
     "start_time": "2021-07-06T21:29:00.785268Z"
    },
    "scrolled": true,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "hep.style.use()\n",
    "fig, axs = plt.subplots(1, 5, figsize=(18, 3))\n",
    "for i, ax in enumerate(axs):\n",
    "    hep.cms.label(ax=ax, loc=i)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Extended examples"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "#### Ratio Plot"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-07-06T21:29:18.292012Z",
     "start_time": "2021-07-06T21:29:17.972896Z"
    },
    "scrolled": true,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "a = hist.Hist.new.Reg(20,-2,2).Int64().fill(np.random.uniform(-2,2,size=3000))\n",
    "b = hist.Hist.new.Reg(20,-2,2).Int64().fill(np.random.normal(0,0.5,size=5000))\n",
    "tot = a + b\n",
    "data = tot.copy()\n",
    "data[...] = np.random.poisson(tot.values())\n",
    "\n",
    "from hist.intervals import ratio_uncertainty"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-07-06T21:29:20.722295Z",
     "start_time": "2021-07-06T21:29:20.395854Z"
    },
    "code_folding": [],
    "scrolled": false,
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "fig, (ax, rax) = plt.subplots(2, 1, figsize=(6,6), gridspec_kw=dict(height_ratios=[3, 1], hspace=0), sharex=True)\n",
    "\n",
    "hep.histplot([a, b], ax=ax, stack=True, histtype='fill', label=[\"MC1\", \"MC2\"])\n",
    "hep.histplot(data, ax=ax, histtype='errorbar', color='k', capsize=4, yerr=True, label=\"Data\")\n",
    "\n",
    "errps = {'hatch':'////', 'facecolor':'none', 'lw': 0, 'color': 'k', 'alpha': 0.4}\n",
    "ax.stairs(\n",
    "    values=tot.values() + np.sqrt(tot.values()),\n",
    "    baseline=tot.values() - np.sqrt(tot.values()),\n",
    "    edges=tot.axes[0].edges, **errps, label='Stat. unc.')\n",
    "\n",
    "yerr = ratio_uncertainty(data.values(), tot.values(), 'poisson')\n",
    "rax.stairs(1+yerr[1], edges=tot.axes[0].edges, baseline=1-yerr[0], **errps)\n",
    "hep.histplot(data.values()/tot.values(), tot.axes[0].edges, yerr=np.sqrt(data.values())/tot.values(),\n",
    "    ax=rax, histtype='errorbar', color='k', capsize=4, label=\"Data\")\n",
    "\n",
    "\n",
    "rax.axhline(1, ls='--', color='k')\n",
    "rax.set_ylim(0.7, 1.3)\n",
    "ax.set_xlim(-2, 2)\n",
    "ax.legend()\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "#### 2D Ratio plot"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-07-06T19:58:38.989373Z",
     "start_time": "2021-07-06T19:58:37.143452Z"
    },
    "cell_style": "center",
    "scrolled": true,
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "a = hist.Hist.new.Reg(5,-2,2).Reg(5,-2,2).Int64().fill(*np.random.normal(0,1,size=(2,1000)))\n",
    "b = hist.Hist.new.Reg(5,-2,2).Reg(5,-2,2).Int64().fill(*np.random.uniform(-2,2,size=(2,1000)))\n",
    "\n",
    "ratio = a.values() / b.values()\n",
    "err_down, err_up = ratio_uncertainty(a.values(), b.values(), 'poisson')\n",
    "\n",
    "def unctext(ra, u, d):\n",
    "    ra, u, d = f'{ra:.2f}', f'{u:.2f}', f'{d:.2f}'\n",
    "    return '$'+ra+'_{-'+d+'}^{+'+u+'}$'\n",
    "\n",
    "labels = np.vectorize(unctext)(ratio, err_up, err_down)\n",
    "\n",
    "hep.hist2dplot(ratio, labels=labels, cmap='cividis');"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-07-16T22:53:19.617488Z",
     "start_time": "2020-07-16T22:53:19.607120Z"
    },
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### mplhep in publications\n",
    "Package has already helped produce plots in several publication\n",
    " - We can create experiment TDR guidelines compatible plots in python\n",
    "\n",
    "- [Simultaneous Jet Energy and Mass Calibrations with Neural Networks](https://cds.cern.ch/record/2706189), ATLAS Collaboration, 2019\n",
    "- [Integration and Performance of New Technologies in the CMS Simulation](https://arxiv.org/abs/2004.02327), Kevin Pedro, 2020 (Fig 3,4)\n",
    "- [GeantV: Results from the prototype of concurrent vector particle transport simulation in HEP](https://arxiv.org/abs/2005.00949), Amadio et al, 2020 (Fig 25,26)\n",
    "- [Search for the standard model Higgs boson decaying to charm quarks](https://cds.cern.ch/record/2682638), CMS Collaboration, 2019 (Fig 1)"
   ]
  }
 ],
 "metadata": {
  "celltoolbar": "Slideshow",
  "kernelspec": {
   "display_name": "Python (def)",
   "language": "python",
   "name": "def"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.3"
  },
  "rise": {
   "autolaunch": true,
   "center": false,
   "enable_chalkboard": true,
   "scroll": true
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}