{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Stagg examples"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Conversions\n",
    "\n",
    "The main purpose of Stagg is to move **st**atistical **agg**regations, such as histograms, from one framework to another. This requires converting high-level domain concepts.\n",
    "\n",
    "Consider the following example: in Numpy, a histogram is simply a 2-tuple of arrays with special meaning—bin contents, then bin edges."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(array([     3,      6,      9,     17,     34,     53,     88,    145,\n",
       "           219,    362,    583,    890,   1414,   2082,   3073,   4567,\n",
       "          6650,   9493,  13497,  18696,  25706,  34175,  45639,  59338,\n",
       "         76917,  96418, 120250, 147785, 177579, 210677, 246305, 283236,\n",
       "        321129, 357500, 392646, 424978, 452731, 475951, 490446, 497232,\n",
       "        497458, 490322, 475074, 453326, 425909, 393028, 358993, 321558,\n",
       "        284107, 246317, 210293, 177366, 147453, 119625,  97069,  75632,\n",
       "         59476,  45713,  34588,  25589,  18934,  13608,   9658,   6656,\n",
       "          4692,   3177,   2137,   1388,    866,    570,    365,    207,\n",
       "           147,     71,     40,     29,     21,      9,      4,      4]),\n",
       " array([-5.   , -4.875, -4.75 , -4.625, -4.5  , -4.375, -4.25 , -4.125,\n",
       "        -4.   , -3.875, -3.75 , -3.625, -3.5  , -3.375, -3.25 , -3.125,\n",
       "        -3.   , -2.875, -2.75 , -2.625, -2.5  , -2.375, -2.25 , -2.125,\n",
       "        -2.   , -1.875, -1.75 , -1.625, -1.5  , -1.375, -1.25 , -1.125,\n",
       "        -1.   , -0.875, -0.75 , -0.625, -0.5  , -0.375, -0.25 , -0.125,\n",
       "         0.   ,  0.125,  0.25 ,  0.375,  0.5  ,  0.625,  0.75 ,  0.875,\n",
       "         1.   ,  1.125,  1.25 ,  1.375,  1.5  ,  1.625,  1.75 ,  1.875,\n",
       "         2.   ,  2.125,  2.25 ,  2.375,  2.5  ,  2.625,  2.75 ,  2.875,\n",
       "         3.   ,  3.125,  3.25 ,  3.375,  3.5  ,  3.625,  3.75 ,  3.875,\n",
       "         4.   ,  4.125,  4.25 ,  4.375,  4.5  ,  4.625,  4.75 ,  4.875,\n",
       "         5.   ]))"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import numpy\n",
    "\n",
    "numpy_hist = numpy.histogram(numpy.random.normal(0, 1, int(10e6)), bins=80, range=(-5, 5))\n",
    "numpy_hist"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We convert it into its Stagg equivalent with a connector, a module with two functions: `tostagg` and `tonumpy`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<Histogram at 0x711474a41588>"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import stagg.connect.numpy\n",
    "\n",
    "stagg_hist = stagg.connect.numpy.tostagg(numpy_hist)\n",
    "stagg_hist"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This object is an instance of a class built up from simple pieces, as its `dump` shows."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Histogram(\n",
      "  axis=[\n",
      "    Axis(binning=RegularBinning(num=80, interval=RealInterval(low=-5.0, high=5.0)))\n",
      "  ],\n",
      "  counts=\n",
      "    UnweightedCounts(\n",
      "      counts=\n",
      "        InterpretedInlineInt64Buffer(\n",
      "          buffer=\n",
      "              [     3      6      9     17     34     53     88    145    219    362\n",
      "                  583    890   1414   2082   3073   4567   6650   9493  13497  18696\n",
      "                25706  34175  45639  59338  76917  96418 120250 147785 177579 210677\n",
      "               246305 283236 321129 357500 392646 424978 452731 475951 490446 497232\n",
      "               497458 490322 475074 453326 425909 393028 358993 321558 284107 246317\n",
      "               210293 177366 147453 119625  97069  75632  59476  45713  34588  25589\n",
      "                18934  13608   9658   6656   4692   3177   2137   1388    866    570\n",
      "                  365    207    147     71     40     29     21      9      4      4])))\n"
     ]
    }
   ],
   "source": [
    "stagg_hist.dump()"
   ]
  },
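  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that `RegularBinning` stores only the number of bins and the interval, not the 81 edges that Numpy carries. As a rough illustration (plain Python, not Stagg's implementation; `regular_edges` is a hypothetical helper), the edges can be recovered from those three numbers:\n",
    "\n",
    "```python\n",
    "def regular_edges(num, low, high):\n",
    "    # Hypothetical helper: num equal-width bins on [low, high] have num + 1 edges.\n",
    "    width = (high - low) / num\n",
    "    return [low + i * width for i in range(num + 1)]\n",
    "\n",
    "edges = regular_edges(80, -5.0, 5.0)\n",
    "print(len(edges), edges[0], edges[1], edges[-1])  # 81 -5.0 -4.875 5.0\n",
    "```\n",
    "\n",
    "These values match the second array of `numpy_hist` above."
   ]
  },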
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now it can be converted to a ROOT histogram with another connector."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Welcome to JupyROOT 6.14/04\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "<ROOT.TH1D object (\"root_hist\") at 0x6510de522e70>"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import stagg.connect.root\n",
    "\n",
    "root_hist = stagg.connect.root.toroot(stagg_hist, \"root_hist\")\n",
    "root_hist"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAArgAAAHYCAIAAAApvgy/AAAABmJLR0QAAAAAAAD5Q7t/AAAexklEQVR4nO3dUZqquqIuUHK/3S+gM2s3Q2jGmZ0RWsZ5yJ05WUicVk2UQI3xUJ9FoQa1yG8SkrAsSwMAsOX/HV0AAKBeggIAUCQoAABFggIAUCQoAABFggIAUCQoAABFggIAUCQoAABFggIAUCQoAABFggIAUCQoAABFggIAUCQoAABFggIAUCQoAABFggIAUCQoAABFggIAUCQoAABFggIAUCQoAABFggIAUCQoAABFggIAUCQoAABFggIAUCQoAABF/zm6APXquq5pmnme7/d7vA3AK0IIRxeBP1uW5ZXdwov7/TTDMMSf0zT1fe9VAnhdCCqX2r3+HmlR2BaDQtS27XEFAYAjGaPwTNd1fd/noQEAmqaZpqnUKz1N05Uqjp8YFDbfv2EYuq6bpinfOE3Tsix933+mYAAc7kkCWO02z3PpT+M4Pr/7MAxnCRM/LijE928VCEII8U3t+z5+Pk70FgKwoycJ4EXDMPyx+3+aplVNVK0fFBRiSHxsHoiBYFmW2H4wz3NsNRrHseu6EMLtdjuguAB8UTyfr5oEht8eK+bYlpz/KdXfr39XTA+StqxCQKxTVs/SNM08z+f4Rrpcy+12u9/vq41t2y7Lcr/fb7dbrPXzfdq2jTs8/vr4UAD80VGVS9M06atd2tJkY9Jvt1u+c/6neMJPd2/b9nkVEPdsf8ufNP4p3o5/WhUgf5adX4KXvf4eXS0oLMuyygGrXzd3yD86+RsMwDccGBTyp16dzx/r7/Sn/NcXa4G4W6o+7vf7Km2kIqUaZ1WAA1PC8pX36IJdD8uy9H0fG3ZCCKZLAvg58s7icRzz69vTBDlN08zznO8Zq4xvDBpIfQdPKpq0zytjFyp0waDQ/M4KUgLAT7M65z+pAh7/9I7RhbfbbZ7nEMJqHMOJXDMoAMBzn7noILYixI6JcRzPOLn1NYNCbEvI+yCeaNt2NTzVVIwA1/BYBaSGhNWZv3na/PDtZ4+tCHGenhgXznJVZHLBoJD3OLySFYZhiJdENr8vnz1p6xAAudjsn6qAWC/En6s/xSvn39FVPY5jqlNWldFfztbwOe8ZTXmYJ5dHJs3DdRD5kJb8CggAvuGoyuX56b15uDZ+80/xe//jQ608XhzRbF31sHqWVMWsLuP8vNef2gJf/+fFaTsBeK621SOf9Cxs/mn3nojPPMuXvP4e1fVeAnABtQUFHv3cZabPOKAUgAp1XVcaRrAaBX9tVwt9T4LCxY4UoFpaFOr3c1sUGoEAoALady/jgkEBgMP5zla515PcBedRAAD2IigAAEWCAgBQJCgAwEe96dLKNz2soADAh8RFkla+8Ti1rcjTdV34LW0chiFtTFX4NE0hhL7vQwj5GhBpz3yixs2HzTc+f9jd7D599LGud0QAp1M6FceFD9rM8+V1Viv1RPf7fXP7UW63WypPuh0XjIgb89tNtoREvjG9Dul2fH3ixnQ7X0Uiv50eNj7X81UqVs/+R1ebE8MsHwCHK52Ku67ruu71r7ynOKXnSxY3v8u8OtK4T9M0fd+nI0r75Ic5DMM0TbGF4MWHjW0M6RFWd3xS7Bdf24O7HuI6TLn8r8MwxDW8N+9VW9MTAN8TuyRSo3q+WlL8NdYFsWU+X8Avb7RPlUVq8//Yekt5F0C88Vh5bQ4g2Nz4OG90vtvmXdq2jdVivpT2bl5seXiT2HKSN0OlP6WDjz8379I8NLAcfkQAlE7FqbshyddlflyjOW3MT/ibLfnpXnnb+6paeZNU+Hx96ng7HmAsfOxQiPXX/X5PdVkqZ9oYD+TxYdPR5Q+bH37+RH/0enV5cLVaehdX63w3/+7USbcf7y4oABzuSVBotsYoPJ7zVzfy0QwpKOSDA5bftcPrnfQ7iiWJlX2TDU1Ix/hYc8W7pINKj7Aa
hfDkYdOR5n9d/j3i4YnTBIXS8awSwOY4juXfuTI94HtKCsCrngSFzXP+Y5W/urEZFGLsyOVfr1//br2jx1op2ty4+WqsXopXHnZ1l81HeLFIm46/PDL1MOU9OvM8ry4R2VzrM+7zc9b6BCC3qmhjpTAMQ6xcY/3y7jKkkQHxdowp+VPnI+rSWIppmuZ5jn/Kr2kcxzHezsftlR42RqVVFTmO46XGKDS/e2tSd0vanr/9qSFhs68hb2I644sAcDGl8+o3WhTSaIPHFoXNSxDzhufSF/HdbVYoeYNHqqdSa0e+MY1CWNV9aeOqif3xufIxCi8Oy3j9lamrjmyy4R7fDgqfKCgAZU+CwuZXtSdBIZ7kN4PC6gHzqQg26923it94X9kYt7++8S8ftuT16rKuS1RTV0II4Xa7paaYYRjGcVyWJd2I26dpyq9JbU5y0S1UIoRfL+65LP+8tSRczI6n4vxiyNf3yS+wZNPr79F/3l2UJ+KFs/kIg3meU/vJ6rLR1BMzjmO+/VOFhWt6JQG8nidgd69U9o/7iAg7OnIwYxx/kTcbNL/f3WEY5nl+HPGR/hrvMo7jZkMWALCLgxvqY99B+vWxu+Fx++ouq/LreoDXhfDrxRYFXQ98iVNx/V5/j6p4L590JpV6p0p38emE1wkKvIlTcf1OFhR25NMJm0rjDP5mjIL0QIlTcf3OMZgR+KRv1+ubdzTCEX6I42dmBACar1/K95lL/wQFAD5kdUl82phPclyntHR1aU7otEZ2Pnhu816bS2PHjX3frx4hPezj866G9qf9938xX5/F6RSud0Swi6b5n8ofkCspnYo36536K6PVXJCPcySvlsaO00Hm98qnnmwK62WnSSTTNJTPn3f1usWnaN6wzLQWBQA+Km9UqL8tocmWX2p+T/Oz2mGaprSIw/1+j9f2p6kC83vll+yt1jVML0VaIjGtBRV3y1+3OH9xXoZ8Eal9CQoAfE4+L07TNOM45hXek2b5aFWbPu8O2FGqgzcr49WKx9HmxNJ5emiyaYiX7AKEdMf4sFH+osXgsspYb+l0aJrmkkEhFBxdLgDWayI3/656+76PzfK32y11wPd9n5ri00R88zzHtX5iFfvWlonHHPBkFOHffK1Pq0inw+n7PgaFtPz0NE3jOH5yBYMLBoUn/TEAHKvruvQ1Om/ST7+mqfrTbvf7PZ/FP0lNEbfb7a0V52r5oceS5L5dkmEYYiTKHyEe+zAMKSSlfdIqB997utddMCgAULPUSJ6+JUdxZZ/UDJxW/Gl+D/5fVc+HrPxUqpgft6ehBvmWVYNKaquIOy/LUmoayQ92HMe+72OLS9/3784KJlyCqzEVEpWLFw2mYX15Gmgeatx4EeD9fo9/PaQfOX7XTy3/qRUkDSbImwHSCMR8uePUdpKOMR14Go4Qr3HIxfEcq9aXvIH8MzNgCgpwQZ+ZXHkzkZjXmVe0bdv3/Wr537xmbZomhHC/3/Pa9JMd86uCtW0bQmjbdp7nVDfHXoDYNRDbPOL2VObb7ZbulXLA/X6Px542xv3zSRFiZ0eahiFuPKwP/cXLKM/iekcEX3XgDAcmVyAqnYqbbF6BZms6gfwKiDQfQPw1TiQQbyzZbATLv2cpeKv0jKW/bu7w+sYvPexfer26vNq6HVYigQMXe7TOJNFfnoo3LyxMW0qrCvMlVo+En0tQ4HBOxfV7/T1y1QMAUCQoAABFggIAUCQoAABFggIAUCQoAABFggIAnMBRE1NeMChYZhqgWnFm4rQu1Cv7bz5CsncBD/NkIqm4/HTf9yGE/JBTBfd4xx1rvQsGhSfTVQJwoLi6UvN7Lai8MtusI2MmeHyQcRyn31Z15xnFiJCv9bASF5VYliUtNt38XiYqVnD5SptxhYgdi3e1ybNMBwZmZuRwpVNxXOcpZYJ8qafNu8SgsLlYcwoHq+UlzyiFnqbQv5C/OOnw8xczvkppncmmacZxfF4bmpkRgNqlBoP4DTh9D44r
MaalqJ+LSzum3JBWXMzjSHqcOteJiBX/k36H0pb8GOd5jrd375ERFAD4kLi6dN4eEKu6tIBk8zsurJrZn8vXixrHMa61uNoeb5y9k6Ip9NG81X8+/HzAjkL4dXQR1jaLpD+CKLWxj+MYQ0BsPF/V6PFn13W32+3Fof7x+/QwDKk1Po2BiAMjYkSY5zmGkvP6/LUPggKcW1V18GZhKkwzHCh1scd+h77vdxlY1rZt0zTzPD8OCcxHQjRHfCPf3eOK29M0xVfgHXQ9APAJj5c57NURMI5jqjtjv8Pqere2bWMuud1uuzzjZ6SWlSbrNJnnOd7OR2aM4/i+XhVBAYBPWFV4zVYr+mqfP45RSF+p411ut9tqVGO6PY5jqmJPIV7NEW/H4RpxjGfKOtM0zfMcN7Zt+76WkqtdTOjySH6UU1yOeIpCsq/SqTiv/KK0W6zt4giGtE9sTn+8PDKN8G9+txbkT51u59dMpsf/7jEdb/OSjW9fx/F6dXm1alVQ4Ec5RR18ikKyr+en4rxFfbX9sff9G0qPT05QgB/hFHXwKQrJvpyK62fCJQBgB4ICAFB0wXkUSothaAcD+Bhr9l7GBYOCQABwrJ98Hr7e+AxdDwBAkaAAABQJCgBAkaAAABQJCgBAkaAAABQJCgBAkaAAABRdcMIloDYh/HrcaKUoOAVBAc5hs649hc1AcN7DgZ9GUIDT8BUc+DxjFACAIkEBACi6YNeDZaYBYC8XDAoCAQDsRdcDAFAkKAAARYICAFAkKAAARYICAFAkKAAARYICAFBUUVDoum61ZRiGruumaVptn6ap67phGD5SLgD4uWoJCl3XzfOcZ4IQwjiOTdP0fZ9niGEY+r5vmmaaphDCY4wAAPZSxcyM0zTN85xvia0FaY7FGAhiXBjH8X6/x9uxXUFWAIA3qaJFoe/72+2Wb5mmqW3b9GvbtjE6xJ+pgWEYhlXCAAB2dHxQ6LrudrutBhzM85x3N8SOic37Nk2jRQEA3uTgrofYJPB6Tb9qadhUWj3yCetIAcCmI4PCNE3jOO5eSav1AWAvRwaF1YCDpmn6vm/b9kkDQ9d18VKIKO75eF0lALCLg4NCngnmeU6DFldxIfU4bAYFAOBNQj0N9SGEdN3jNE1938df89txtzT4MYSwihQhVHREsKMQfi3LP0eXYjcXOxxIrlcNVTGPwqN4KUScWKlpmtvtlvoX7vd73/epXUGjAtcTwq+ji/AJm4cpPUBtag8+aZ6lx+3N1uiE60U5fqAf+237xx44V3K9aqjSFoWkNFDRAEYA+IDjJ1wCAKolKAAARYICAFAkKAAARYICAFAkKAAARYICAFBU+zwK31BaZvpiM2AAwAdcMCgIBACwF10PAECRoAAAFAkKAECRoAAAFAkKAECRoAAAFAkKAECRoAAAFAkKAECRoAAAFAkKAECRoAAAFF1wUSirRwLAXi4YFAQCOK8Qfj1uXJZ/Pl8SILpgUIAT2awXf6zNQOAlgmMJCnAwX5eBmhnMCAAUCQoAQJGgAAAUCQoAQJGgAAAUCQoAQJGgAAAUCQoAQJGgAAAUCQoAQJGgAAAUXXCtB8tMA8BeLhgUBAIA2IuuBwCgSFAAAIoEBQCgSFAAAIoEBQCgSFAAAIoEBQCgSFAAAIoEBQCgSFAAAIoEBQCg6IJrPUCdQvh1dBHOavOlW5Z/Pl8S+IEuGBSsHkm11G3fsPmiSV3wMRcMCgIBAOzFGAUAoEhQAACKBAUAoEhQAACKBAUAoEhQAACKBAUAoOj4oDBN0zAMXddN07T6U2n7NE1d1w3D8JECAsDPdXBQGIah7/sYBfq+77ou/SmEMI7j4/Z4l6ZppmkKITzGCABgL+HYeQxDCPf7PeaAaZr6vo/lGYZhHMdUtny3/Ha6Y/6AZmakTiH8MoXzXryYVOt61dCRLQqxgk+tBXmtP01T27Zpz7ZtY0dD/JnuMgzDPM8fKi4A/DxHBoWu61LsisMOmt8hYJ7nvLuh
67rNQPDYogAA7KiKRaFSDrjf78/3XLU0bCqtHvnExZqJAGAvVQSF6be+79P4g29T6wPAXo6/PDJKlzs+70dY9UGsRjkAAPs6MigMw1DqJmjbNk8MqcdhlQmMTgCAtzo4KKSfzb+bB+LlDOkKiHme8+sd0l3GcfzjkAUA4NsO7nq43W7jOIYQQgh9399ut3Ttw+126/t+tb1pmvv9nu7SaFQAgHeqYl6IJ0MN0mWTL97lejNdcBnmCNqRF5NqXa8autzxXO4d4jLUbTvyYlKt61VDtVz1AABUSFAAAIoEBQCgSFAAAIqqmMIZ4KtC+PW40QhH2J2gAPvbrMPY0WYg8LLDOwgK8Ba+2gLXcMGgUFo/4mIXtgLAB1wwKAgEALAXVz0AAEWCAgBQJCgAAEWCAgBQJCgAAEWCAgBQJCgAAEWCAgBQJCgAAEWCAgBQJCgAAEWCAgBQdMFFoaweCQB7uWBQEAgAYC+6HgCAIkEBACgSFACAIkEBACgSFACAIkEBACgSFACAIkEBACi64IRL8Ekh/Dq6CPyfzbdjWf75fEngMgQF+FvqoUpsvhGSHPwlXQ8AQJGgAAAUCQoAQNEFxyhYZhoA9nLBoCAQAMBedD0AAEWCAgBQJCgAAEWCAgBQJCgAAEWCAgBQJCgAAEWCAgBQJCgAAEWCAgBQJCgAAEWCAgBQdMFFoaweCQB7uWBQEAgAYC+6HgCAIkEBACgSFACAIkEBACgSFACAogte9QBvEsKvo4vAd2y+ccvyz+dLAmckKMAXqF1OZ/Mtk/ngdboeAICi44PCNE3DMHRdNwzD6k9x+zRNj3fZ3B8A2NfBQWEYhr7vYxQYxzGffTmEMI5j0zR933ddt7pL0zTTNIUQHmMEALCXcOyExyGE2+2W2gbSr8MwjOOYyhZCuN/vMS7kt+PPPCuEcPARcWEh/DJG4Rq8lbzP9aqh47se8taCtm1jrT9NU9u2+fYYJuLPdJdhGOZ5/lRJAeDHOTgoLMuSB4V5nuOv6UbUdd1mIHhsUQAAdnR8i0IUBxw0v9sMnuyWtzRsCl+344EAwJVUERS6ruv7vm3bXfp1lq/7+ycFgEs6PiiEEOZ5vt/vr/QgrPog4l3yTgoAYEcHB4UQQmxIWFX2aVRjlHocVrsZnQAAb3XkFM6pPWBV38fJlOL8CvGvscmh+R0U4vWTTdOM4/jHIQsAwLcdeblnnCxhtTG1JeR/zedamKYpTrgUrcp/vQtYqYeL7y/DW8n7XK8aqv14YqPC5vZma3TC9d4h6qF2uQxvJe9zvWqo9tUjSwMVDWAEgA84/qoHAKBaggIAUCQoAABFggIAUCQoAABFggIAUFT75ZHfUFoN8mIXtgLAB1wwKAgEALCXCwYF+Hsh/Dq6CLzX5ltsukZ4JCjANnXGhW2+udIhbDKYEQAoEhQAgCJBAQAoEhQAgCJBAQAoEhQAgCJBAQAoEhQAgCJBAQAoEhQAgCJBAQAouuBaD5aZBoC9XDAoCAQAsBddDwBAkaAAABQJCgBAkaAAABQJCgBAkaAAABRd8PJI+JIQfh1dBIB6CQrQLMs/RxeBKmymRh8PfjhBAaBpCoFAgxMYowAAFAkKAECRoAAAFF1wjILVIwFgLxcMCgIBAOxF1wMAUCQoAABFggIAUCQoAABFggIAUCQoAABFggIAUCQoAABFggIAUCQoAABFggIAUCQoAABFF1wUCkpC+HV0EQBO5oJBwTLTPLEs/xxdBE5mM1/6IPFzXDAoCATAXjYDgaYpfhRjFACAIkEBACgSFACAIkEBACgSFACAIkEBACgSFACAolqCwjAMmxu7rpumabV9mqau6zbvAgDsqIqgME3TOI6rQBBCGMexaZq+77uuS9uHYej7Pt4rhPAYIwCAvRwcFGLbQKz4c7G1YFmWaZqWZZnnOQWCcRzv9/s0TdM0tW2rXQEA3uf4FoWu626322pjDAHp1xQI4s/UwDAMwzzPHykmAPxEBweFONTgsVVg
nue8u6Hrus1AEPfR+wAAb3KyRaFWLQ2bSqtHPmEdKQDYdLKg8Aq1PgDs5WRBoeu6eClEFDsd8k4KgA/YXGl6c01qOLtKg0LbtvnIg9TjsBkU4NHmeRx2sRkIfOS4qkqDQpwsIV48OU3TPM/3+7353XiQxj+O4/jHIQv8WL7eAfy9SoNCvGYyza9wu91S/8L9fu/7PrUraFQAgPcJlQ/9i40Km9ubrdEJIdR+RHxGCL+0KPBJPnJE16uGKm1RSEoDFQ1gBIAPOH5mRgCgWoICAFAkKAAARYICAFAkKAAARYICAFBU++WR31BaPfJiF7YCwAdcMCgIBACwF10PAEDRBVsU+Gms2gfwPoICV2COfWqwmVl9ODk7QQFgB5uBQHMXF2CMAgBQJCgAAEWCAgBQJCgAAEWCAgBQJCgAAEWCAgBQJCgAAEWCAgBQdMGZGS0zfWHmueN0zOvM2V0wKAgE1+YMy4mY15kL0PUAABQJCgBAkaAAABQJCgBAkaAAABQJCgBAkaAAABQJCgBA0QUnXOIaTErDhZmukRMRFKiX8yaXZLpGzkXXAwBQJCgAAEUX7HqweiQA7OWCQUEgAIC96HoAAIoEBQCg6IJdDwBnZHIF6iQocDxXkIPJFaiWoEAVfG0CqJMxCgBAkaAAABQJCgBAkaAAABQZzMhHGcUNX+KaSQ4nKPBpznHwItdMUgNdDwBAkaAAABRdsOvBMtMAsJcLBgWBoBJ6UuFNjHDkky4YFKiHMxfszghHPswYhWOU+keqcopCNsq5t1OU8xSFbJRzV6co5CVpUQC4Av0RvImgwA40e8Kx9EfwPoIC+/DFBeCSBAW+xncUOBH9Efw9QYEvc5aBU9AfwS7OGhSGYWiapuu6ruu+dMcQwvOJFv64w14P8u6n2OlI/3BOqaScz3nTX9/hFacopzf9yQ7fyAre9C89yMWc74Cnaer7vm3bpmnmeb7dbjE0RJ+qPo//n6lhh0qK4Uh33KGSYtSwQyXFcKSf3OFjz3Iu5zue2IQwTVPTNMMwjOOYH8JlPkmf32HrS8Z/f+ZLUW0xHOknd6ikGB870qb5n8ftqfOihnJe5k0/nfMdTwjhfr+nHofHX6/xSXrrDr8zwX9Xp4ZVj+ZPeCle3KGSYjjST+5QSTEOPNJ/f3n4/6eL0hClGg7kLG/66ZxsjEJsSFiNS5im6asjFS7psUngSU/ksvwTwn8NSwRK8vNDPF2E8OvJWWX1J6eXyzhZUNgU00MS/jTN59/v8Jln2aOc/y3f9787PUUVR+pN33GHSopRww6VFOMsR7o64cSTzL5PcZ6X4lKuEBTy5oSLNfgAwLEsCgUAFJ0sKOSXPKw2AgC7O1lQaJqmbdu+7+PtNO3SgeUBgAs7X1CIzQkhhBDCOI73+/1xh+7fPl/Ir6q5kNM0DcPQdd2qIac2qZz5BFw1q7OcwzAMw1D5ex3V+QImZ/lAnuUfPKr8VHm6qudF5wsKTdMsy3K/3+/3+7Isj2/GNE3zPB9Rrm/qum6e5zr/S4dh6Ps+lq3v+2o/+nk5x3Gsf0zyNE3jOFb1pk/TFEKYpilOflp/9VbbC5g7ywfyLP/gUc2nyuaEVc8XLJfTtm3btkeX4lWpRSTmntrkBYtFPbQ4RU3T3G630q9Vud/vcQLy2t70/B/ndrtV+15X+wLmzvKBPMs/+FL9qXI5W9XzJadsUXhunufKc3Gu7/t4Uq7QanqrzZGk9cjf9LZtqy1n0zRd11X4ps/znFoR4o1qX8M6X8CV+j+Q5/oHr/lUGZ2r6vmao5PK/pqmSV842ratNn4uy9K2bfye0VQck6P0Ne7ogrykqfULXK6qN/3x22T9r2FVL+Bz9b+Ylf+Dn+JUeaKq56uuMOHSpnjiiz1wS5WzMA3DUHN/Wy52DTa/X9Waxf71pvqRbqdwig9n5U7xgaz8H/xEp8rmDFXPN5wyKMTxVo/b479i/t7E8VlxIPeHCpd5Us44FKuGj9HzFzPfJ57y
8iW4PumVcsbz3eHNvK8U9RQu2476KZV8IP+ohn/wknpOlX9UT9Wzu1MGhS+p87/0cQaIvu/rLGoUr/aJ48yrOo8kcWB5bac5fqxzfSCr/Qc/3akyOUUhX3TKoPDkEtV4TXD+9szzfNQQmCflfCxk27aHZM/nhawnyz+/LjmEUM+/5RkvoU4D2fKSn+4o6lHVB7Kkqn/wknpOlc9VVfXs77jhEe/SZOOG4vtU/6CSaguZv5ix763CcsaC3W63+78dXa4/qO3FzMey1Xx5ZFLbC5ic6AN5in/wXM0lPGPV86LazwXfsBqPU/lg46jaj9QqEdf5Ym7G9vovaK7wTc9fwNrK9qjaQp7oA3mKf/BctW/6cs6q50Vhqbvd6dtWlwjzN7yYP4f3+gfypu/oki/mZYMCAPD3LjgzIwCwF0EBACgSFACAIkEBACgSFACAIkEBACgSFACAIkEBACgSFACAIkEBACgSFACAIkEBACgSFACAIkEBACgSFACAIkEBACgSFACAIkEBACgSFACAIkEBACgSFACAIkEBACgSFACAIkEBACgSFACAIkEBACgSFACAIkEBACgSFACAIkEBACj6X26Xz+t+LqzaAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<IPython.core.display.Image object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "import ROOT\n",
    "canvas = ROOT.TCanvas()\n",
    "root_hist.Draw()\n",
    "canvas.Draw()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It can also be converted to a Pandas DataFrame with yet another connector."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>unweighted</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>[-5.0, -4.875)</th>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-4.875, -4.75)</th>\n",
       "      <td>6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-4.75, -4.625)</th>\n",
       "      <td>9</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-4.625, -4.5)</th>\n",
       "      <td>17</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-4.5, -4.375)</th>\n",
       "      <td>34</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-4.375, -4.25)</th>\n",
       "      <td>53</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-4.25, -4.125)</th>\n",
       "      <td>88</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-4.125, -4.0)</th>\n",
       "      <td>145</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-4.0, -3.875)</th>\n",
       "      <td>219</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-3.875, -3.75)</th>\n",
       "      <td>362</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-3.75, -3.625)</th>\n",
       "      <td>583</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-3.625, -3.5)</th>\n",
       "      <td>890</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-3.5, -3.375)</th>\n",
       "      <td>1414</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-3.375, -3.25)</th>\n",
       "      <td>2082</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-3.25, -3.125)</th>\n",
       "      <td>3073</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-3.125, -3.0)</th>\n",
       "      <td>4567</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-3.0, -2.875)</th>\n",
       "      <td>6650</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-2.875, -2.75)</th>\n",
       "      <td>9493</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-2.75, -2.625)</th>\n",
       "      <td>13497</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-2.625, -2.5)</th>\n",
       "      <td>18696</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-2.5, -2.375)</th>\n",
       "      <td>25706</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-2.375, -2.25)</th>\n",
       "      <td>34175</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-2.25, -2.125)</th>\n",
       "      <td>45639</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-2.125, -2.0)</th>\n",
       "      <td>59338</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-2.0, -1.875)</th>\n",
       "      <td>76917</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-1.875, -1.75)</th>\n",
       "      <td>96418</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-1.75, -1.625)</th>\n",
       "      <td>120250</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-1.625, -1.5)</th>\n",
       "      <td>147785</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-1.5, -1.375)</th>\n",
       "      <td>177579</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-1.375, -1.25)</th>\n",
       "      <td>210677</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[1.25, 1.375)</th>\n",
       "      <td>210293</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[1.375, 1.5)</th>\n",
       "      <td>177366</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[1.5, 1.625)</th>\n",
       "      <td>147453</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[1.625, 1.75)</th>\n",
       "      <td>119625</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[1.75, 1.875)</th>\n",
       "      <td>97069</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[1.875, 2.0)</th>\n",
       "      <td>75632</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[2.0, 2.125)</th>\n",
       "      <td>59476</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[2.125, 2.25)</th>\n",
       "      <td>45713</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[2.25, 2.375)</th>\n",
       "      <td>34588</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[2.375, 2.5)</th>\n",
       "      <td>25589</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[2.5, 2.625)</th>\n",
       "      <td>18934</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[2.625, 2.75)</th>\n",
       "      <td>13608</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[2.75, 2.875)</th>\n",
       "      <td>9658</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[2.875, 3.0)</th>\n",
       "      <td>6656</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[3.0, 3.125)</th>\n",
       "      <td>4692</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[3.125, 3.25)</th>\n",
       "      <td>3177</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[3.25, 3.375)</th>\n",
       "      <td>2137</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[3.375, 3.5)</th>\n",
       "      <td>1388</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[3.5, 3.625)</th>\n",
       "      <td>866</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[3.625, 3.75)</th>\n",
       "      <td>570</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[3.75, 3.875)</th>\n",
       "      <td>365</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[3.875, 4.0)</th>\n",
       "      <td>207</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[4.0, 4.125)</th>\n",
       "      <td>147</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[4.125, 4.25)</th>\n",
       "      <td>71</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[4.25, 4.375)</th>\n",
       "      <td>40</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[4.375, 4.5)</th>\n",
       "      <td>29</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[4.5, 4.625)</th>\n",
       "      <td>21</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[4.625, 4.75)</th>\n",
       "      <td>9</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[4.75, 4.875)</th>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[4.875, 5.0)</th>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>80 rows × 1 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                 unweighted\n",
       "[-5.0, -4.875)            3\n",
       "[-4.875, -4.75)           6\n",
       "[-4.75, -4.625)           9\n",
       "[-4.625, -4.5)           17\n",
       "[-4.5, -4.375)           34\n",
       "[-4.375, -4.25)          53\n",
       "[-4.25, -4.125)          88\n",
       "[-4.125, -4.0)          145\n",
       "[-4.0, -3.875)          219\n",
       "[-3.875, -3.75)         362\n",
       "[-3.75, -3.625)         583\n",
       "[-3.625, -3.5)          890\n",
       "[-3.5, -3.375)         1414\n",
       "[-3.375, -3.25)        2082\n",
       "[-3.25, -3.125)        3073\n",
       "[-3.125, -3.0)         4567\n",
       "[-3.0, -2.875)         6650\n",
       "[-2.875, -2.75)        9493\n",
       "[-2.75, -2.625)       13497\n",
       "[-2.625, -2.5)        18696\n",
       "[-2.5, -2.375)        25706\n",
       "[-2.375, -2.25)       34175\n",
       "[-2.25, -2.125)       45639\n",
       "[-2.125, -2.0)        59338\n",
       "[-2.0, -1.875)        76917\n",
       "[-1.875, -1.75)       96418\n",
       "[-1.75, -1.625)      120250\n",
       "[-1.625, -1.5)       147785\n",
       "[-1.5, -1.375)       177579\n",
       "[-1.375, -1.25)      210677\n",
       "...                     ...\n",
       "[1.25, 1.375)        210293\n",
       "[1.375, 1.5)         177366\n",
       "[1.5, 1.625)         147453\n",
       "[1.625, 1.75)        119625\n",
       "[1.75, 1.875)         97069\n",
       "[1.875, 2.0)          75632\n",
       "[2.0, 2.125)          59476\n",
       "[2.125, 2.25)         45713\n",
       "[2.25, 2.375)         34588\n",
       "[2.375, 2.5)          25589\n",
       "[2.5, 2.625)          18934\n",
       "[2.625, 2.75)         13608\n",
       "[2.75, 2.875)          9658\n",
       "[2.875, 3.0)           6656\n",
       "[3.0, 3.125)           4692\n",
       "[3.125, 3.25)          3177\n",
       "[3.25, 3.375)          2137\n",
       "[3.375, 3.5)           1388\n",
       "[3.5, 3.625)            866\n",
       "[3.625, 3.75)           570\n",
       "[3.75, 3.875)           365\n",
       "[3.875, 4.0)            207\n",
       "[4.0, 4.125)            147\n",
       "[4.125, 4.25)            71\n",
       "[4.25, 4.375)            40\n",
       "[4.375, 4.5)             29\n",
       "[4.5, 4.625)             21\n",
       "[4.625, 4.75)             9\n",
       "[4.75, 4.875)             4\n",
       "[4.875, 5.0)              4\n",
       "\n",
       "[80 rows x 1 columns]"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import stagg.connect.pandas\n",
    "\n",
    "pandas_hist = stagg.connect.pandas.topandas(stagg_hist)\n",
    "pandas_hist"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Serialization\n",
    "\n",
    "The `stagg_hist` object is also a [Flatbuffers](http://google.github.io/flatbuffers/) object, which has a [multi-lingual](https://google.github.io/flatbuffers/flatbuffers_support.html), [random-access](https://github.com/mzaks/FlatBuffersSwift/wiki/FlatBuffers-Explained), [small-footprint](http://google.github.io/flatbuffers/md__benchmarks.html) serialization:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "bytearray(b'\\x04\\x00\\x00\\x00\\x90\\xff\\xff\\xff\\x10\\x00\\x00\\x00\\x00\\x01\\n\\x00\\x10\\x00\\x0c\\x00\\x0b\\x00\\x04\\x00\\n\\x00\\x00\\x00`\\x00\\x00\\x00\\x00\\x00\\x00\\x01\\x04\\x00\\x00\\x00\\x01\\x00\\x00\\x00\\x0c\\x00\\x00\\x00\\x08\\x00\\x0c\\x00\\x0b\\x00\\x04\\x00\\x08\\x00\\x00\\x00\\x10\\x00\\x00\\x00\\x00\\x00\\x00\\x02\\x08\\x00(\\x00\\x1c\\x00\\x04\\x00\\x08\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x14\\xc0\\x00\\x00\\x00\\x00\\x00\\x00\\x14@\\x01\\x00\\x00\\x00\\x00\\x00\\x00\\x00P\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x08\\x00\\n\\x00\\t\\x00\\x04\\x00\\x08\\x00\\x00\\x00\\x0c\\x00\\x00\\x00\\x00\\x02\\x06\\x00\\x08\\x00\\x04\\x00\\x06\\x00\\x00\\x00\\x04\\x00\\x00\\x00\\x80\\x02\\x00\\x00\\x03\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x06\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\t\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x11\\x00\\x00\\x00\\x00\\x00\\x00\\x00\"\\x00\\x00\\x00\\x00\\x00\\x00\\x005\\x00\\x00\\x00\\x00\\x00\\x00\\x00X\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x91\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\xdb\\x00\\x00\\x00\\x00\\x00\\x00\\x00j\\x01\\x00\\x00\\x00\\x00\\x00\\x00G\\x02\\x00\\x00\\x00\\x00\\x00\\x00z\\x03\\x00\\x00\\x00\\x00\\x00\\x00\\x86\\x05\\x00\\x00\\x00\\x00\\x00\\x00\"\\x08\\x00\\x00\\x00\\x00\\x00\\x00\\x01\\x0c\\x00\\x00\\x00\\x00\\x00\\x00\\xd7\\x11\\x00\\x00\\x00\\x00\\x00\\x00\\xfa\\x19\\x00\\x00\\x00\\x00\\x00\\x00\\x15%\\x00\\x00\\x00\\x00\\x00\\x00\\xb94\\x00\\x00\\x00\\x00\\x00\\x00\\x08I\\x00\\x00\\x00\\x00\\x00\\x00jd\\x00\\x00\\x00\\x00\\x00\\x00\\x7f\\x85\\x00\\x00\\x00\\x00\\x00\\x00G\\xb2\\x00\\x00\\x00\\x00\\x00\\x00\\xca\\xe7\\x00\\x00\\x00\\x00\\x00\\x00u,\\x01\\x00\\x00\\x00\\x00\\x00\\xa2x\\x01\\x00\\x00\\x00\\x00\\x00\\xba\\xd5\\x01\\x00\\x00\\x00\\x00\\x00IA\\x02\\x00\\x00\\x00\\x00\\x00\\xab\\xb5\\x02\\x00\\x00\\x00\\x00\\x00\\xf56\\x03\\x00\\x00\\x00\\x00\\x00!\\xc2\\x03\\x00\\x00\\x00\\x00\\x00dR\\x04\\x00\\x00\\x00\\x00\\x00i\\xe6\\x04\\x00\\x00\\x00\\x00\\x00|t\\x05\\x00\\x0
0\\x00\\x00\\x00\\xc6\\xfd\\x05\\x00\\x00\\x00\\x00\\x00\\x12|\\x06\\x00\\x00\\x00\\x00\\x00{\\xe8\\x06\\x00\\x00\\x00\\x00\\x00/C\\x07\\x00\\x00\\x00\\x00\\x00\\xce{\\x07\\x00\\x00\\x00\\x00\\x00P\\x96\\x07\\x00\\x00\\x00\\x00\\x002\\x97\\x07\\x00\\x00\\x00\\x00\\x00R{\\x07\\x00\\x00\\x00\\x00\\x00\\xc2?\\x07\\x00\\x00\\x00\\x00\\x00\\xce\\xea\\x06\\x00\\x00\\x00\\x00\\x00\\xb5\\x7f\\x06\\x00\\x00\\x00\\x00\\x00D\\xff\\x05\\x00\\x00\\x00\\x00\\x00Qz\\x05\\x00\\x00\\x00\\x00\\x00\\x16\\xe8\\x04\\x00\\x00\\x00\\x00\\x00\\xcbU\\x04\\x00\\x00\\x00\\x00\\x00-\\xc2\\x03\\x00\\x00\\x00\\x00\\x00u5\\x03\\x00\\x00\\x00\\x00\\x00\\xd6\\xb4\\x02\\x00\\x00\\x00\\x00\\x00\\xfd?\\x02\\x00\\x00\\x00\\x00\\x00I\\xd3\\x01\\x00\\x00\\x00\\x00\\x00-{\\x01\\x00\\x00\\x00\\x00\\x00p\\'\\x01\\x00\\x00\\x00\\x00\\x00T\\xe8\\x00\\x00\\x00\\x00\\x00\\x00\\x91\\xb2\\x00\\x00\\x00\\x00\\x00\\x00\\x1c\\x87\\x00\\x00\\x00\\x00\\x00\\x00\\xf5c\\x00\\x00\\x00\\x00\\x00\\x00\\xf6I\\x00\\x00\\x00\\x00\\x00\\x00(5\\x00\\x00\\x00\\x00\\x00\\x00\\xba%\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x1a\\x00\\x00\\x00\\x00\\x00\\x00T\\x12\\x00\\x00\\x00\\x00\\x00\\x00i\\x0c\\x00\\x00\\x00\\x00\\x00\\x00Y\\x08\\x00\\x00\\x00\\x00\\x00\\x00l\\x05\\x00\\x00\\x00\\x00\\x00\\x00b\\x03\\x00\\x00\\x00\\x00\\x00\\x00:\\x02\\x00\\x00\\x00\\x00\\x00\\x00m\\x01\\x00\\x00\\x00\\x00\\x00\\x00\\xcf\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x93\\x00\\x00\\x00\\x00\\x00\\x00\\x00G\\x00\\x00\\x00\\x00\\x00\\x00\\x00(\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x1d\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x15\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\t\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x04\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x04\\x00\\x00\\x00\\x00\\x00\\x00\\x00')"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "stagg_hist.tobuffer()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Numpy size:  1288\n",
      "ROOT size:   1962\n",
      "Pandas size: 2975\n",
      "Stagg size:   792\n"
     ]
    }
   ],
   "source": [
    "print(\"Numpy size: \", numpy_hist[0].nbytes + numpy_hist[1].nbytes)\n",
    "\n",
    "tmessage = ROOT.TMessage()\n",
    "tmessage.WriteObject(root_hist)\n",
    "print(\"ROOT size:  \", tmessage.Length())\n",
    "\n",
    "import pickle\n",
    "print(\"Pandas size:\", len(pickle.dumps(pandas_hist)))\n",
    "\n",
    "print(\"Stagg size:  \", len(stagg_hist.tobuffer()))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Stagg is generally forseen as a memory format, like [Apache Arrow](https://arrow.apache.org), but for statistical aggregations. Like Arrow, it reduces the need to implement $N(N - 1)/2$ conversion functions among $N$ statistical libraries to just $N$ conversion functions. (See the figure on Arrow's website.)"
   ]
  },
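  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The counting argument above can be checked with a few lines of plain Python. This is only an illustrative sketch: `pairwise_converters` and `hub_converters` are hypothetical names, not part of Stagg or Arrow.\n",
    "\n",
    "```python\n",
    "def pairwise_converters(n):\n",
    "    # One converter per unordered pair of libraries: N(N - 1)/2.\n",
    "    return n * (n - 1) // 2\n",
    "\n",
    "def hub_converters(n):\n",
    "    # One converter per library, to and from a shared format: N.\n",
    "    return n\n",
    "\n",
    "for n in (2, 4, 8, 16):\n",
    "    print(n, pairwise_converters(n), hub_converters(n))\n",
    "```\n",
    "\n",
    "The gap grows quadratically, which is the motivation for a shared interchange format."
   ]
  },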
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Translation of conventions\n",
    "\n",
    "Stagg also intends to be as close to zero-copy as possible. This means that it must make graceful translations among conventions. Different histogramming libraries handle overflow bins in different ways:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "RegularBinning(\n",
      "  num=80,\n",
      "  interval=RealInterval(low=-5.0, high=5.0),\n",
      "  overflow=RealOverflow(loc_underflow=BinLocation.below1, loc_overflow=BinLocation.above1))\n",
      "Bin contents length: 82\n"
     ]
    }
   ],
   "source": [
    "fromroot = stagg.connect.root.tostagg(root_hist)\n",
    "fromroot.axis[0].binning.dump()\n",
    "print(\"Bin contents length:\", len(fromroot.counts.array))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "RegularBinning(num=80, interval=RealInterval(low=-5.0, high=5.0))\n",
      "Bin contents length: 80\n"
     ]
    }
   ],
   "source": [
    "stagg_hist.axis[0].binning.dump()\n",
    "print(\"Bin contents length:\", len(stagg_hist.counts.array))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And yet we want to be able to manipulate them as though these differences did not exist."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "sum_hist = fromroot + stagg_hist"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "RegularBinning(\n",
      "  num=80,\n",
      "  interval=RealInterval(low=-5.0, high=5.0),\n",
      "  overflow=RealOverflow(loc_underflow=BinLocation.above1, loc_overflow=BinLocation.above2))\n",
      "Bin contents length: 82\n"
     ]
    }
   ],
   "source": [
    "sum_hist.axis[0].binning.dump()\n",
    "print(\"Bin contents length:\", len(sum_hist.counts.array))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The binning structure keeps track of the existence of underflow/overflow bins and where they are located.\n",
    "\n",
    "   * ROOT's convention is to put underflow before the normal bins (`below1`) and overflow after (`above1`), so that the normal bins are effectively 1-indexed.\n",
    "   * Boost.Histogram's convention is to put overflow after the normal bins (`above1`) and underflow after that (`above2`), so that underflow is accessed via `myhist[-1]` in Numpy.\n",
    "   * Numpy histograms don't have underflow/overflow bins.\n",
    "   * Pandas could have `Intervals` that extend to infinity.\n",
    "\n",
    "Stagg accepts all of these, so that it doesn't have to manipulate the bin contents buffer it receives, but knows how to deal with them if it has to combine histograms that follow different conventions."
   ]
  },
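  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One way to picture how histograms with different overflow conventions can be combined is sketched below in plain Python. It is not Stagg's implementation: `split_bins`, `add`, and the string location codes are hypothetical stand-ins for `BinLocation`, and only `below1`, `above1`, and `above2` are handled.\n",
    "\n",
    "```python\n",
    "def split_bins(contents, num, loc_underflow=None, loc_overflow=None):\n",
    "    # Assumed buffer order: 'below1', normal bins, 'above1', 'above2'.\n",
    "    flows = {'underflow': loc_underflow, 'overflow': loc_overflow}\n",
    "    layout = []\n",
    "    for loc in ('below1', 'normal', 'above1', 'above2'):\n",
    "        if loc == 'normal':\n",
    "            layout.extend(['normal'] * num)\n",
    "        else:\n",
    "            layout.extend(name for name, where in flows.items() if where == loc)\n",
    "    assert len(layout) == len(contents), 'buffer length does not match layout'\n",
    "    # Missing flow bins are treated as empty.\n",
    "    out = {'normal': [], 'underflow': 0, 'overflow': 0}\n",
    "    for slot, value in zip(layout, contents):\n",
    "        if slot == 'normal':\n",
    "            out['normal'].append(value)\n",
    "        else:\n",
    "            out[slot] = value\n",
    "    return out\n",
    "\n",
    "def add(a, b):\n",
    "    # Add bin by bin once both histograms share one logical layout.\n",
    "    return {'normal': [x + y for x, y in zip(a['normal'], b['normal'])],\n",
    "            'underflow': a['underflow'] + b['underflow'],\n",
    "            'overflow': a['overflow'] + b['overflow']}\n",
    "\n",
    "root_like = split_bins([1, 10, 20, 30, 2], 3, 'below1', 'above1')\n",
    "boost_like = split_bins([10, 20, 30, 2, 1], 3, 'above2', 'above1')\n",
    "plain = split_bins([10, 20, 30], 3)\n",
    "total = add(add(root_like, boost_like), plain)\n",
    "print(total)\n",
    "```\n",
    "\n",
    "The input buffers are only read, never reordered in place; the layout description alone says where each flow bin lives."
   ]
  },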
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Binning types\n",
    "\n",
    "All the different axis types have an equivalent in Stagg (and not all are single-dimensional)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "IntegerBinning(min=5, max=10)\n",
      "RegularBinning(num=100, interval=RealInterval(low=-5.0, high=5.0))\n",
      "HexagonalBinning(qmin=0, qmax=100, rmin=0, rmax=100, coordinates=HexagonalBinning.cube_xy)\n",
      "EdgesBinning(edges=[0.01 0.05 0.1 0.5 1 5 10 50 100])\n",
      "IrregularBinning(\n",
      "  intervals=[\n",
      "    RealInterval(low=0.0, high=5.0),\n",
      "    RealInterval(low=10.0, high=100.0),\n",
      "    RealInterval(low=-10.0, high=10.0)\n",
      "  ],\n",
      "  overlapping_fill=IrregularBinning.all)\n",
      "CategoryBinning(categories=['one', 'two', 'three'])\n",
      "SparseRegularBinning(bins=[5 3 -2 8 -100], bin_width=10.0)\n",
      "FractionBinning(error_method=FractionBinning.clopper_pearson)\n",
      "PredicateBinning(predicates=['signal region', 'control region'])\n",
      "VariationBinning(\n",
      "  variations=[\n",
      "    Variation(assignments=[\n",
      "        Assignment(identifier='x', expression='nominal')\n",
      "      ]),\n",
      "    Variation(\n",
      "      assignments=[\n",
      "        Assignment(identifier='x', expression='nominal + sigma')\n",
      "      ]),\n",
      "    Variation(\n",
      "      assignments=[\n",
      "        Assignment(identifier='x', expression='nominal - sigma')\n",
      "      ])\n",
      "  ])\n"
     ]
    }
   ],
   "source": [
    "import stagg\n",
    "stagg.IntegerBinning(5, 10).dump()\n",
    "stagg.RegularBinning(100, stagg.RealInterval(-5, 5)).dump()\n",
    "stagg.HexagonalBinning(0, 100, 0, 100, stagg.HexagonalBinning.cube_xy).dump()\n",
    "stagg.EdgesBinning([0.01, 0.05, 0.1, 0.5, 1, 5, 10, 50, 100]).dump()\n",
    "stagg.IrregularBinning([stagg.RealInterval(0, 5),\n",
    "                        stagg.RealInterval(10, 100),\n",
    "                        stagg.RealInterval(-10, 10)],\n",
    "                       overlapping_fill=stagg.IrregularBinning.all).dump()\n",
    "stagg.CategoryBinning([\"one\", \"two\", \"three\"]).dump()\n",
    "stagg.SparseRegularBinning([5, 3, -2, 8, -100], 10).dump()\n",
    "stagg.FractionBinning(error_method=stagg.FractionBinning.clopper_pearson).dump()\n",
    "stagg.PredicateBinning([\"signal region\", \"control region\"]).dump()\n",
    "stagg.VariationBinning([stagg.Variation([stagg.Assignment(\"x\", \"nominal\")]),\n",
    "                        stagg.Variation([stagg.Assignment(\"x\", \"nominal + sigma\")]),\n",
    "                        stagg.Variation([stagg.Assignment(\"x\", \"nominal - sigma\")])]).dump()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The meanings of these binning classes are given in [the specification](https://github.com/diana-hep/stagg/blob/master/specification.adoc#integerbinning), but many of them can be converted into one another, and converting to `CategoryBinning` (strings) often makes the intent clear."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CategoryBinning(categories=['5', '6', '7', '8', '9', '10'])\n",
      "CategoryBinning(\n",
      "  categories=['[-5, -4)', '[-4, -3)', '[-3, -2)', '[-2, -1)', '[-1, 0)', '[0, 1)', '[1, 2)', '[2, 3)', '[3, 4)', '[4, 5)'])\n",
      "CategoryBinning(\n",
      "  categories=['[0.01, 0.05)', '[0.05, 0.1)', '[0.1, 0.5)', '[0.5, 1)', '[1, 5)', '[5, 10)', '[10, 50)', '[50, 100)'])\n",
      "CategoryBinning(categories=['[0, 5)', '[10, 100)', '[-10, 10)'])\n",
      "CategoryBinning(categories=['[50, 60)', '[30, 40)', '[-20, -10)', '[80, 90)', '[-1000, -990)'])\n",
      "CategoryBinning(categories=['pass', 'all'])\n",
      "CategoryBinning(categories=['signal region', 'control region'])\n",
      "CategoryBinning(categories=['x := nominal', 'x := nominal + sigma', 'x := nominal - sigma'])\n"
     ]
    }
   ],
   "source": [
    "stagg.IntegerBinning(5, 10).toCategoryBinning().dump()\n",
    "stagg.RegularBinning(10, stagg.RealInterval(-5, 5)).toCategoryBinning().dump()\n",
    "stagg.EdgesBinning([0.01, 0.05, 0.1, 0.5, 1, 5, 10, 50, 100]).toCategoryBinning().dump()\n",
    "stagg.IrregularBinning([stagg.RealInterval(0, 5),\n",
    "                        stagg.RealInterval(10, 100),\n",
    "                        stagg.RealInterval(-10, 10)],\n",
    "                       overlapping_fill=stagg.IrregularBinning.all).toCategoryBinning().dump()\n",
    "stagg.SparseRegularBinning([5, 3, -2, 8, -100], 10).toCategoryBinning().dump()\n",
    "stagg.FractionBinning(error_method=stagg.FractionBinning.clopper_pearson).toCategoryBinning().dump()\n",
    "stagg.PredicateBinning([\"signal region\", \"control region\"]).toCategoryBinning().dump()\n",
    "stagg.VariationBinning([stagg.Variation([stagg.Assignment(\"x\", \"nominal\")]),\n",
    "                        stagg.Variation([stagg.Assignment(\"x\", \"nominal + sigma\")]),\n",
    "                        stagg.Variation([stagg.Assignment(\"x\", \"nominal - sigma\")])]).toCategoryBinning().dump()"
   ]
  },
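  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To see where such category labels can come from, here is a small plain-Python sketch that rebuilds the labels for an integer axis and a regular axis. The helpers `integer_labels` and `regular_labels` are hypothetical, and the `'{:g}'` number formatting is an assumption chosen to resemble the output above, not a statement about how Stagg formats them.\n",
    "\n",
    "```python\n",
    "def integer_labels(lo, hi):\n",
    "    # An integer axis includes both endpoints.\n",
    "    return [str(i) for i in range(lo, hi + 1)]\n",
    "\n",
    "def regular_labels(num, low, high):\n",
    "    # Equal-width, half-open intervals [lo, hi).\n",
    "    width = (high - low) / num\n",
    "    edges = [low + i * width for i in range(num + 1)]\n",
    "    return ['[{:g}, {:g})'.format(lo, hi) for lo, hi in zip(edges[:-1], edges[1:])]\n",
    "\n",
    "print(integer_labels(5, 10))\n",
    "print(regular_labels(10, -5, 5))\n",
    "```"
   ]
  },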
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This technique can also clear up confusion about overflow bins."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CategoryBinning(\n",
      "  categories=['{nan}', '[-5, -3)', '[-3, -1)', '[-1, 1)', '[1, 3)', '[3, 5)', '[5, +inf]', '[-inf, -5)'])\n"
     ]
    }
   ],
   "source": [
    "stagg.RegularBinning(5, stagg.RealInterval(-5, 5), stagg.RealOverflow(\n",
    "    loc_underflow=stagg.BinLocation.above2,\n",
    "    loc_overflow=stagg.BinLocation.above1,\n",
    "    loc_nanflow=stagg.BinLocation.below1\n",
    "    )).toCategoryBinning().dump()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Fancy binning types\n",
    "\n",
    "You might also be wondering about `FractionBinning`, `PredicateBinning`, and `VariationBinning`.\n",
    "\n",
    "`FractionBinning` is an axis of two bins: #passing and #total, #failing and #total, or #passing and #failing. Adding it to another axis effectively makes an \"efficiency plot.\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>unweighted</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th rowspan=\"10\" valign=\"top\">pass</th>\n",
       "      <th>[-5.0, -4.0)</th>\n",
       "      <td>9</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-4.0, -3.0)</th>\n",
       "      <td>25</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-3.0, -2.0)</th>\n",
       "      <td>29</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-2.0, -1.0)</th>\n",
       "      <td>35</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-1.0, 0.0)</th>\n",
       "      <td>54</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[0.0, 1.0)</th>\n",
       "      <td>67</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[1.0, 2.0)</th>\n",
       "      <td>60</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[2.0, 3.0)</th>\n",
       "      <td>84</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[3.0, 4.0)</th>\n",
       "      <td>80</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[4.0, 5.0)</th>\n",
       "      <td>94</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th rowspan=\"10\" valign=\"top\">all</th>\n",
       "      <th>[-5.0, -4.0)</th>\n",
       "      <td>99</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-4.0, -3.0)</th>\n",
       "      <td>119</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-3.0, -2.0)</th>\n",
       "      <td>109</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-2.0, -1.0)</th>\n",
       "      <td>109</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-1.0, 0.0)</th>\n",
       "      <td>95</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[0.0, 1.0)</th>\n",
       "      <td>104</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[1.0, 2.0)</th>\n",
       "      <td>102</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[2.0, 3.0)</th>\n",
       "      <td>106</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[3.0, 4.0)</th>\n",
       "      <td>112</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[4.0, 5.0)</th>\n",
       "      <td>122</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                   unweighted\n",
       "pass [-5.0, -4.0)           9\n",
       "     [-4.0, -3.0)          25\n",
       "     [-3.0, -2.0)          29\n",
       "     [-2.0, -1.0)          35\n",
       "     [-1.0, 0.0)           54\n",
       "     [0.0, 1.0)            67\n",
       "     [1.0, 2.0)            60\n",
       "     [2.0, 3.0)            84\n",
       "     [3.0, 4.0)            80\n",
       "     [4.0, 5.0)            94\n",
       "all  [-5.0, -4.0)          99\n",
       "     [-4.0, -3.0)         119\n",
       "     [-3.0, -2.0)         109\n",
       "     [-2.0, -1.0)         109\n",
       "     [-1.0, 0.0)           95\n",
       "     [0.0, 1.0)           104\n",
       "     [1.0, 2.0)           102\n",
       "     [2.0, 3.0)           106\n",
       "     [3.0, 4.0)           112\n",
       "     [4.0, 5.0)           122"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "h = stagg.Histogram([stagg.Axis(stagg.FractionBinning()),\n",
    "                     stagg.Axis(stagg.RegularBinning(10, stagg.RealInterval(-5, 5)))],\n",
    "                    stagg.UnweightedCounts(\n",
    "                        stagg.InterpretedInlineBuffer.fromarray(\n",
    "                            numpy.array([[  9,  25,  29,  35,  54,  67,  60,  84,  80,  94],\n",
    "                                         [ 99, 119, 109, 109,  95, 104, 102, 106, 112, 122]]))))\n",
    "df = stagg.connect.pandas.topandas(h)\n",
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th colspan=\"2\" halign=\"left\">unweighted</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th>all</th>\n",
       "      <th>pass</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>[-5.0, -4.0)</th>\n",
       "      <td>99</td>\n",
       "      <td>9</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-4.0, -3.0)</th>\n",
       "      <td>119</td>\n",
       "      <td>25</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-3.0, -2.0)</th>\n",
       "      <td>109</td>\n",
       "      <td>29</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-2.0, -1.0)</th>\n",
       "      <td>109</td>\n",
       "      <td>35</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-1.0, 0.0)</th>\n",
       "      <td>95</td>\n",
       "      <td>54</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[0.0, 1.0)</th>\n",
       "      <td>104</td>\n",
       "      <td>67</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[1.0, 2.0)</th>\n",
       "      <td>102</td>\n",
       "      <td>60</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[2.0, 3.0)</th>\n",
       "      <td>106</td>\n",
       "      <td>84</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[3.0, 4.0)</th>\n",
       "      <td>112</td>\n",
       "      <td>80</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[4.0, 5.0)</th>\n",
       "      <td>122</td>\n",
       "      <td>94</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "             unweighted     \n",
       "                    all pass\n",
       "[-5.0, -4.0)         99    9\n",
       "[-4.0, -3.0)        119   25\n",
       "[-3.0, -2.0)        109   29\n",
       "[-2.0, -1.0)        109   35\n",
       "[-1.0, 0.0)          95   54\n",
       "[0.0, 1.0)          104   67\n",
       "[1.0, 2.0)          102   60\n",
       "[2.0, 3.0)          106   84\n",
       "[3.0, 4.0)          112   80\n",
       "[4.0, 5.0)          122   94"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df = df.unstack(level=0)\n",
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[-5.0, -4.0)    0.090909\n",
       "[-4.0, -3.0)    0.210084\n",
       "[-3.0, -2.0)    0.266055\n",
       "[-2.0, -1.0)    0.321101\n",
       "[-1.0, 0.0)     0.568421\n",
       "[0.0, 1.0)      0.644231\n",
       "[1.0, 2.0)      0.588235\n",
       "[2.0, 3.0)      0.792453\n",
       "[3.0, 4.0)      0.714286\n",
       "[4.0, 5.0)      0.770492\n",
       "dtype: float64"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df[\"unweighted\", \"pass\"] / df[\"unweighted\", \"all\"]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`PredicateBinning` means that each bin represents a predicate (if-then rule) in the filling procedure. Stagg doesn't _have_ a filling procedure, but filling-libraries can use this to encode relationships among histograms that a fitting-library can take advantage of, for combined signal-control region fits, for instance. It's possible for those regions to overlap: an input datum might satisfy more than one predicate, and `overlapping_fill` determines which bin(s) were chosen: `first`, `last`, or `all`.\n",
    "\n",
    "`VariationBinning` means that each bin represents a variation of one of the paramters used to calculate the fill-variables. This is used to determine sensitivity to systematic effects, by varying them and re-filling. In this kind of binning, the same input datum enters every bin."
   ]
  },
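  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Since Stagg has no filling procedure of its own, the three `overlapping_fill` options are easiest to see with a toy filler. The sketch below is plain Python; `fill_predicates`, the example predicates, and the data are all hypothetical and only illustrate what a filling library might do.\n",
    "\n",
    "```python\n",
    "def fill_predicates(data, predicates, overlapping_fill='all'):\n",
    "    # predicates: list of (label, function) pairs, one bin per predicate.\n",
    "    counts = [0] * len(predicates)\n",
    "    for datum in data:\n",
    "        matched = [i for i, (_, test) in enumerate(predicates) if test(datum)]\n",
    "        if overlapping_fill == 'first':\n",
    "            matched = matched[:1]\n",
    "        elif overlapping_fill == 'last':\n",
    "            matched = matched[-1:]\n",
    "        # With 'all', every satisfied predicate gets the datum.\n",
    "        for i in matched:\n",
    "            counts[i] += 1\n",
    "    return counts\n",
    "\n",
    "predicates = [('signal region', lambda x: x > 0),\n",
    "              ('control region', lambda x: x < 2)]\n",
    "data = [-1, 1, 1, 3]\n",
    "for mode in ('first', 'last', 'all'):\n",
    "    print(mode, fill_predicates(data, predicates, mode))\n",
    "```\n",
    "\n",
    "The two data points equal to 1 satisfy both predicates, so they are the only ones whose bin assignment depends on the chosen mode."
   ]
  },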
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>unweighted</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th rowspan=\"10\" valign=\"top\">x := nominal</th>\n",
       "      <th>[-5.0, -4.0)</th>\n",
       "      <td>35</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-4.0, -3.0)</th>\n",
       "      <td>1348</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-3.0, -2.0)</th>\n",
       "      <td>21465</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-2.0, -1.0)</th>\n",
       "      <td>135923</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-1.0, 0.0)</th>\n",
       "      <td>341627</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[0.0, 1.0)</th>\n",
       "      <td>340649</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[1.0, 2.0)</th>\n",
       "      <td>135983</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[2.0, 3.0)</th>\n",
       "      <td>21584</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[3.0, 4.0)</th>\n",
       "      <td>1355</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[4.0, 5.0)</th>\n",
       "      <td>30</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th rowspan=\"10\" valign=\"top\">x := nominal + sigma</th>\n",
       "      <th>[-5.0, -4.0)</th>\n",
       "      <td>14</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-4.0, -3.0)</th>\n",
       "      <td>597</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-3.0, -2.0)</th>\n",
       "      <td>10968</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-2.0, -1.0)</th>\n",
       "      <td>84154</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-1.0, 0.0)</th>\n",
       "      <td>272295</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[0.0, 1.0)</th>\n",
       "      <td>367137</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[1.0, 2.0)</th>\n",
       "      <td>209741</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[2.0, 3.0)</th>\n",
       "      <td>50026</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[3.0, 4.0)</th>\n",
       "      <td>4854</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[4.0, 5.0)</th>\n",
       "      <td>213</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                   unweighted\n",
       "x := nominal         [-5.0, -4.0)          35\n",
       "                     [-4.0, -3.0)        1348\n",
       "                     [-3.0, -2.0)       21465\n",
       "                     [-2.0, -1.0)      135923\n",
       "                     [-1.0, 0.0)       341627\n",
       "                     [0.0, 1.0)        340649\n",
       "                     [1.0, 2.0)        135983\n",
       "                     [2.0, 3.0)         21584\n",
       "                     [3.0, 4.0)          1355\n",
       "                     [4.0, 5.0)            30\n",
       "x := nominal + sigma [-5.0, -4.0)          14\n",
       "                     [-4.0, -3.0)         597\n",
       "                     [-3.0, -2.0)       10968\n",
       "                     [-2.0, -1.0)       84154\n",
       "                     [-1.0, 0.0)       272295\n",
       "                     [0.0, 1.0)        367137\n",
       "                     [1.0, 2.0)        209741\n",
       "                     [2.0, 3.0)         50026\n",
       "                     [3.0, 4.0)          4854\n",
       "                     [4.0, 5.0)           213"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "xdata = numpy.random.normal(0, 1, int(1e6))\n",
    "sigma = numpy.random.uniform(-0.1, 0.8, int(1e6))\n",
    "\n",
    "h = stagg.Histogram([stagg.Axis(stagg.VariationBinning([\n",
    "                        stagg.Variation([stagg.Assignment(\"x\", \"nominal\")]),\n",
    "                        stagg.Variation([stagg.Assignment(\"x\", \"nominal + sigma\")])])),\n",
    "                     stagg.Axis(stagg.RegularBinning(10, stagg.RealInterval(-5, 5)))],\n",
    "                    stagg.UnweightedCounts(\n",
    "                        stagg.InterpretedInlineBuffer.fromarray(\n",
    "                            numpy.concatenate([\n",
    "                                numpy.histogram(xdata, bins=10, range=(-5, 5))[0],\n",
    "                                numpy.histogram(xdata + sigma, bins=10, range=(-5, 5))[0]]))))\n",
    "df = stagg.connect.pandas.topandas(h)\n",
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th colspan=\"2\" halign=\"left\">unweighted</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th>x := nominal</th>\n",
       "      <th>x := nominal + sigma</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>[-5.0, -4.0)</th>\n",
       "      <td>35</td>\n",
       "      <td>14</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-4.0, -3.0)</th>\n",
       "      <td>1348</td>\n",
       "      <td>597</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-3.0, -2.0)</th>\n",
       "      <td>21465</td>\n",
       "      <td>10968</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-2.0, -1.0)</th>\n",
       "      <td>135923</td>\n",
       "      <td>84154</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[-1.0, 0.0)</th>\n",
       "      <td>341627</td>\n",
       "      <td>272295</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[0.0, 1.0)</th>\n",
       "      <td>340649</td>\n",
       "      <td>367137</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[1.0, 2.0)</th>\n",
       "      <td>135983</td>\n",
       "      <td>209741</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[2.0, 3.0)</th>\n",
       "      <td>21584</td>\n",
       "      <td>50026</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[3.0, 4.0)</th>\n",
       "      <td>1355</td>\n",
       "      <td>4854</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>[4.0, 5.0)</th>\n",
       "      <td>30</td>\n",
       "      <td>213</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "               unweighted                     \n",
       "             x := nominal x := nominal + sigma\n",
       "[-5.0, -4.0)           35                   14\n",
       "[-4.0, -3.0)         1348                  597\n",
       "[-3.0, -2.0)        21465                10968\n",
       "[-2.0, -1.0)       135923                84154\n",
       "[-1.0, 0.0)        341627               272295\n",
       "[0.0, 1.0)         340649               367137\n",
       "[1.0, 2.0)         135983               209741\n",
       "[2.0, 3.0)          21584                50026\n",
       "[3.0, 4.0)           1355                 4854\n",
       "[4.0, 5.0)             30                  213"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.unstack(level=0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Collections\n",
    "\n",
    "You can gather many objects (histograms, functions, ntuples) into a `Collection`, partly for convenience of encapsulating all of them in one object."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Collection(\n",
      "  objects={\n",
      "    'one': Histogram(\n",
      "      axis=[\n",
      "        Axis(\n",
      "          binning=\n",
      "            RegularBinning(\n",
      "              num=80,\n",
      "              interval=RealInterval(low=-5.0, high=5.0),\n",
      "              overflow=RealOverflow(loc_underflow=BinLocation.below1, loc_overflow=BinLocation.above1)),\n",
      "          statistics=[\n",
      "            Statistics(\n",
      "              moments=[\n",
      "                Moments(sumwxn=InterpretedInlineInt64Buffer(buffer=[1e+07]), n=0),\n",
      "                Moments(sumwxn=InterpretedInlineFloat64Buffer(buffer=[1e+07]), n=0, weightpower=1),\n",
      "                Moments(sumwxn=InterpretedInlineFloat64Buffer(buffer=[1e+07]), n=0, weightpower=2),\n",
      "                Moments(sumwxn=InterpretedInlineFloat64Buffer(buffer=[2641.38]), n=1, weightpower=1),\n",
      "                Moments(\n",
      "                  sumwxn=InterpretedInlineFloat64Buffer(buffer=[1.00103e+07]),\n",
      "                  n=2,\n",
      "                  weightpower=1)\n",
      "              ])\n",
      "          ])\n",
      "      ],\n",
      "      counts=\n",
      "        UnweightedCounts(\n",
      "          counts=\n",
      "            InterpretedInlineFloat64Buffer(\n",
      "              buffer=\n",
      "                  [0.00000e+00 3.00000e+00 6.00000e+00 9.00000e+00 1.70000e+01 3.40000e+01\n",
      "                   5.30000e+01 8.80000e+01 1.45000e+02 2.19000e+02 3.62000e+02 5.83000e+02\n",
      "                   8.90000e+02 1.41400e+03 2.08200e+03 3.07300e+03 4.56700e+03 6.65000e+03\n",
      "                   9.49300e+03 1.34970e+04 1.86960e+04 2.57060e+04 3.41750e+04 4.56390e+04\n",
      "                   5.93380e+04 7.69170e+04 9.64180e+04 1.20250e+05 1.47785e+05 1.77579e+05\n",
      "                   2.10677e+05 2.46305e+05 2.83236e+05 3.21129e+05 3.57500e+05 3.92646e+05\n",
      "                   4.24978e+05 4.52731e+05 4.75951e+05 4.90446e+05 4.97232e+05 4.97458e+05\n",
      "                   4.90322e+05 4.75074e+05 4.53326e+05 4.25909e+05 3.93028e+05 3.58993e+05\n",
      "                   3.21558e+05 2.84107e+05 2.46317e+05 2.10293e+05 1.77366e+05 1.47453e+05\n",
      "                   1.19625e+05 9.70690e+04 7.56320e+04 5.94760e+04 4.57130e+04 3.45880e+04\n",
      "                   2.55890e+04 1.89340e+04 1.36080e+04 9.65800e+03 6.65600e+03 4.69200e+03\n",
      "                   3.17700e+03 2.13700e+03 1.38800e+03 8.66000e+02 5.70000e+02 3.65000e+02\n",
      "                   2.07000e+02 1.47000e+02 7.10000e+01 4.00000e+01 2.90000e+01 2.10000e+01\n",
      "                   9.00000e+00 4.00000e+00 4.00000e+00 0.00000e+00]))),\n",
      "    'two': Histogram(\n",
      "      axis=[\n",
      "        Axis(binning=RegularBinning(num=80, interval=RealInterval(low=-5.0, high=5.0)))\n",
      "      ],\n",
      "      counts=\n",
      "        UnweightedCounts(\n",
      "          counts=\n",
      "            InterpretedInlineInt64Buffer(\n",
      "              buffer=\n",
      "                  [     3      6      9     17     34     53     88    145    219    362\n",
      "                      583    890   1414   2082   3073   4567   6650   9493  13497  18696\n",
      "                    25706  34175  45639  59338  76917  96418 120250 147785 177579 210677\n",
      "                   246305 283236 321129 357500 392646 424978 452731 475951 490446 497232\n",
      "                   497458 490322 475074 453326 425909 393028 358993 321558 284107 246317\n",
      "                   210293 177366 147453 119625  97069  75632  59476  45713  34588  25589\n",
      "                    18934  13608   9658   6656   4692   3177   2137   1388    866    570\n",
      "                      365    207    147     71     40     29     21      9      4      4])))\n",
      "  })\n"
     ]
    }
   ],
   "source": [
    "stagg.Collection({\"one\": fromroot, \"two\": stagg_hist}).dump()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Not only for convenience: [you can also define](https://github.com/diana-hep/stagg/blob/master/specification.adoc#Collection) an `Axis` in the `Collection` to subdivide all contents by that `Axis`. For instance, you can make a collection of qualitatively different histograms all have a signal and control region with `PredicateBinning`, or all have systematic variations with `VariationBinning`.\n",
    "\n",
    "It is not necessary to rely on naming conventions to communicate this information from filler to fitter."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Histogram → histogram conversions\n",
    "\n",
    "I said in the introduction that Stagg does not fill histograms and does not plot histograms—the two things data analysts are expecting to do. These would be done by user-facing libraries.\n",
    "\n",
    "Stagg does, however, transform histograms into other histograms, and not just among formats. You can combine histograms with `+`. In addition to adding histogram counts, it combines auxiliary statistics appropriately (if possible)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [],
   "source": [
    "h1 = stagg.Histogram([\n",
    "    stagg.Axis(stagg.RegularBinning(10, stagg.RealInterval(-5, 5)),\n",
    "               statistics=[stagg.Statistics(\n",
    "                   moments=[\n",
    "                       stagg.Moments(stagg.InterpretedInlineBuffer.fromarray(numpy.array([10])), n=1),\n",
    "                       stagg.Moments(stagg.InterpretedInlineBuffer.fromarray(numpy.array([20])), n=2)],\n",
    "                   quantiles=[\n",
    "                       stagg.Quantiles(stagg.InterpretedInlineBuffer.fromarray(numpy.array([30])), p=0.5)],\n",
    "                   mode=stagg.Modes(stagg.InterpretedInlineBuffer.fromarray(numpy.array([40]))),\n",
    "                   min=stagg.Extremes(stagg.InterpretedInlineBuffer.fromarray(numpy.array([50]))),\n",
    "                   max=stagg.Extremes(stagg.InterpretedInlineBuffer.fromarray(numpy.array([60]))))])],\n",
    "    stagg.UnweightedCounts(stagg.InterpretedInlineBuffer.fromarray(numpy.arange(10))))\n",
    "h2 = stagg.Histogram([\n",
    "    stagg.Axis(stagg.RegularBinning(10, stagg.RealInterval(-5, 5)),\n",
    "               statistics=[stagg.Statistics(\n",
    "                   moments=[\n",
    "                       stagg.Moments(stagg.InterpretedInlineBuffer.fromarray(numpy.array([100])), n=1),\n",
    "                       stagg.Moments(stagg.InterpretedInlineBuffer.fromarray(numpy.array([200])), n=2)],\n",
    "                   quantiles=[\n",
    "                       stagg.Quantiles(stagg.InterpretedInlineBuffer.fromarray(numpy.array([300])), p=0.5)],\n",
    "                   mode=stagg.Modes(stagg.InterpretedInlineBuffer.fromarray(numpy.array([400]))),\n",
    "                   min=stagg.Extremes(stagg.InterpretedInlineBuffer.fromarray(numpy.array([500]))),\n",
    "                   max=stagg.Extremes(stagg.InterpretedInlineBuffer.fromarray(numpy.array([600]))))])],\n",
    "    stagg.UnweightedCounts(stagg.InterpretedInlineBuffer.fromarray(numpy.arange(100, 200, 10))))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Histogram(\n",
      "  axis=[\n",
      "    Axis(\n",
      "      binning=RegularBinning(num=10, interval=RealInterval(low=-5.0, high=5.0)),\n",
      "      statistics=[\n",
      "        Statistics(\n",
      "          moments=[\n",
      "            Moments(sumwxn=InterpretedInlineInt64Buffer(buffer=[110]), n=1),\n",
      "            Moments(sumwxn=InterpretedInlineInt64Buffer(buffer=[220]), n=2)\n",
      "          ],\n",
      "          min=Extremes(values=InterpretedInlineInt64Buffer(buffer=[50])),\n",
      "          max=Extremes(values=InterpretedInlineInt64Buffer(buffer=[600])))\n",
      "      ])\n",
      "  ],\n",
      "  counts=\n",
      "    UnweightedCounts(\n",
      "      counts=InterpretedInlineInt64Buffer(buffer=[100 111 122 133 144 155 166 177 188 199])))\n"
     ]
    }
   ],
   "source": [
    "(h1 + h2).dump()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The corresponding moments of `h1` and `h2` were matched and added, quantiles and modes were dropped (no way to combine them), and the correct minimum and maximum were picked; the histogram contents were added as well."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Another important histogram → histogram conversion is axis-reduction, which can take three forms:\n",
    "\n",
    "   * slicing an axis, either dropping the eliminated bins or adding them to underflow/overflow (if possible, depends on binning type);\n",
    "   * rebinning by combining neighboring bins;\n",
    "   * projecting out an axis, removing it entirely, summing over all existing bins.\n",
    "\n",
    "All of these operations use a Pandas-inspired `loc`/`iloc` syntax."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [],
   "source": [
    "h = stagg.Histogram(\n",
    "    [stagg.Axis(stagg.RegularBinning(10, stagg.RealInterval(-5, 5)))],\n",
    "    stagg.UnweightedCounts(\n",
    "        stagg.InterpretedInlineBuffer.fromarray(numpy.array([0, 10, 20, 30, 40, 50, 60, 70, 80, 90]))))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`loc` slices in the data's coordinate system. `1.5` rounds up to bin index `6`. The first five bins get combined into an overflow bin: `150 = 10 + 20 + 30 + 40 + 50`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Histogram(\n",
      "  axis=[\n",
      "    Axis(\n",
      "      binning=\n",
      "        RegularBinning(\n",
      "          num=4,\n",
      "          interval=RealInterval(low=1.0, high=5.0),\n",
      "          overflow=\n",
      "            RealOverflow(\n",
      "              loc_underflow=BinLocation.above1,\n",
      "              minf_mapping=RealOverflow.missing,\n",
      "              pinf_mapping=RealOverflow.missing,\n",
      "              nan_mapping=RealOverflow.missing)))\n",
      "  ],\n",
      "  counts=UnweightedCounts(counts=InterpretedInlineInt64Buffer(buffer=[60 70 80 90 150])))\n"
     ]
    }
   ],
   "source": [
    "h.loc[1.5:].dump()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`iloc` slices by bin index number."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Histogram(\n",
      "  axis=[\n",
      "    Axis(\n",
      "      binning=\n",
      "        RegularBinning(\n",
      "          num=4,\n",
      "          interval=RealInterval(low=1.0, high=5.0),\n",
      "          overflow=\n",
      "            RealOverflow(\n",
      "              loc_underflow=BinLocation.above1,\n",
      "              minf_mapping=RealOverflow.missing,\n",
      "              pinf_mapping=RealOverflow.missing,\n",
      "              nan_mapping=RealOverflow.missing)))\n",
      "  ],\n",
      "  counts=UnweightedCounts(counts=InterpretedInlineInt64Buffer(buffer=[60 70 80 90 150])))\n"
     ]
    }
   ],
   "source": [
    "h.iloc[6:].dump()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Slices have a `start`, `stop`, and `step` (`start:stop:step`). The `step` parameter rebins:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Histogram(\n",
      "  axis=[\n",
      "    Axis(binning=RegularBinning(num=5, interval=RealInterval(low=-5.0, high=5.0)))\n",
      "  ],\n",
      "  counts=UnweightedCounts(counts=InterpretedInlineInt64Buffer(buffer=[10 50 90 130 170])))\n"
     ]
    }
   ],
   "source": [
    "h.iloc[::2].dump()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Thus, you can slice and rebin as part of the same operation."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Projecting uses the same mechanism, except that `None` passed as an axis's slice projects it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Histogram(\n",
      "  axis=[\n",
      "    Axis(binning=RegularBinning(num=10, interval=RealInterval(low=-5.0, high=5.0)))\n",
      "  ],\n",
      "  counts=\n",
      "    UnweightedCounts(\n",
      "      counts=InterpretedInlineInt64Buffer(buffer=[45 145 245 345 445 545 645 745 845 945])))\n"
     ]
    }
   ],
   "source": [
    "h2 = stagg.Histogram(\n",
    "    [stagg.Axis(stagg.RegularBinning(10, stagg.RealInterval(-5, 5))),\n",
    "     stagg.Axis(stagg.RegularBinning(10, stagg.RealInterval(-5, 5)))],\n",
    "    stagg.UnweightedCounts(\n",
    "        stagg.InterpretedInlineBuffer.fromarray(numpy.arange(100))))\n",
    "\n",
    "h2.iloc[:, None].dump()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Thus, all three axis reduction operations can be performed in a single syntax."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In general, an n-dimensional Stagg histogram can be sliced like an n-dimensional Numpy array. This includes integer and boolean indexing (though that necessarily changes the binning to `IrregularBinning`)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Histogram(\n",
      "  axis=[\n",
      "    Axis(\n",
      "      binning=\n",
      "        IrregularBinning(\n",
      "          intervals=[\n",
      "            RealInterval(low=-1.0, high=0.0),\n",
      "            RealInterval(low=-2.0, high=-1.0),\n",
      "            RealInterval(low=1.0, high=2.0),\n",
      "            RealInterval(low=2.0, high=3.0),\n",
      "            RealInterval(low=-4.0, high=-3.0)\n",
      "          ]))\n",
      "  ],\n",
      "  counts=UnweightedCounts(counts=InterpretedInlineInt64Buffer(buffer=[40 30 60 70 10])))\n"
     ]
    }
   ],
   "source": [
    "h.iloc[[4, 3, 6, 7, 1]].dump()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Histogram(\n",
      "  axis=[\n",
      "    Axis(\n",
      "      binning=\n",
      "        IrregularBinning(\n",
      "          intervals=[\n",
      "            RealInterval(low=-5.0, high=-4.0),\n",
      "            RealInterval(low=-3.0, high=-2.0),\n",
      "            RealInterval(low=-1.0, high=0.0),\n",
      "            RealInterval(low=1.0, high=2.0),\n",
      "            RealInterval(low=3.0, high=4.0)\n",
      "          ]))\n",
      "  ],\n",
      "  counts=UnweightedCounts(counts=InterpretedInlineInt64Buffer(buffer=[0 20 40 60 80])))\n"
     ]
    }
   ],
   "source": [
    "h.iloc[[True, False, True, False, True, False, True, False, True, False]].dump()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`loc` for numerical binnings accepts\n",
    "\n",
    "   * a real number\n",
    "   * a real-valued slice\n",
    "   * `None` for projection\n",
    "   * ellipsis (`...`)\n",
    "\n",
    "`loc` for categorical binnings accepts\n",
    "\n",
    "   * a string\n",
    "   * an iterable of strings\n",
    "   * an _empty_ slice\n",
    "   * `None` for projection\n",
    "   * ellipsis (`...`)\n",
    "\n",
    "`iloc` accepts\n",
    "\n",
    "   * an integer\n",
    "   * an integer-valued slice\n",
    "   * `None` for projection\n",
    "   * integer-valued array-like\n",
    "   * boolean-valued array-like\n",
    "   * ellipsis (`...`)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Bin counts → Numpy\n",
    "\n",
    "Frequently, one wants to extract bin counts from a histogram. The `loc`/`iloc` syntax above creates _histograms_ from _histograms_, not bin counts.\n",
    "\n",
    "A histogram's `counts` property has a slice syntax."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[-999  999    0    0    0    0    0    0    0    0    0    0]\n",
      " [-999  999    2    3    4    5    6    7    8    9   10   11]\n",
      " [-999  999    4    6    8   10   12   14   16   18   20   22]\n",
      " [-999  999    6    9   12   15   18   21   24   27   30   33]\n",
      " [-999  999    8   12   16   20   24   28   32   36   40   44]\n",
      " [-999  999   10   15   20   25   30   35   40   45   50   55]\n",
      " [-999  999   12   18   24   30   36   42   48   54   60   66]\n",
      " [-999  999   14   21   28   35   42   49   56   63   70   77]\n",
      " [-999  999   16   24   32   40   48   56   64   72   80   88]\n",
      " [-999  999   18   27   36   45   54   63   72   81   90   99]\n",
      " [-999  999 -999 -999 -999 -999 -999 -999 -999 -999 -999 -999]\n",
      " [-999  999  999  999  999  999  999  999  999  999  999  999]]\n"
     ]
    }
   ],
   "source": [
    "allcounts = numpy.arange(12) * numpy.arange(12)[:, None]   # multiplication table\n",
    "allcounts[10, :] = -999   # underflows\n",
    "allcounts[11, :] = 999    # overflows\n",
    "allcounts[:, 0]  = -999   # underflows\n",
    "allcounts[:, 1]  = 999    # overflows\n",
    "print(allcounts)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [],
   "source": [
    "h2 = stagg.Histogram(\n",
    "    [stagg.Axis(stagg.RegularBinning(10, stagg.RealInterval(-5, 5),\n",
    "                    stagg.RealOverflow(loc_underflow=stagg.RealOverflow.above1,\n",
    "                                       loc_overflow=stagg.RealOverflow.above2))),\n",
    "     stagg.Axis(stagg.RegularBinning(10, stagg.RealInterval(-5, 5),\n",
    "                    stagg.RealOverflow(loc_underflow=stagg.RealOverflow.below2,\n",
    "                                       loc_overflow=stagg.RealOverflow.below1)))],\n",
    "    stagg.UnweightedCounts(\n",
    "        stagg.InterpretedInlineBuffer.fromarray(allcounts)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[ 0  0  0  0  0  0  0  0  0  0]\n",
      " [ 2  3  4  5  6  7  8  9 10 11]\n",
      " [ 4  6  8 10 12 14 16 18 20 22]\n",
      " [ 6  9 12 15 18 21 24 27 30 33]\n",
      " [ 8 12 16 20 24 28 32 36 40 44]\n",
      " [10 15 20 25 30 35 40 45 50 55]\n",
      " [12 18 24 30 36 42 48 54 60 66]\n",
      " [14 21 28 35 42 49 56 63 70 77]\n",
      " [16 24 32 40 48 56 64 72 80 88]\n",
      " [18 27 36 45 54 63 72 81 90 99]]\n"
     ]
    }
   ],
   "source": [
    "print(h2.counts[:, :])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To get the underflows and overflows, set the slice extremes to `-inf` and `+inf`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[-999 -999 -999 -999 -999 -999 -999 -999 -999 -999]\n",
      " [   0    0    0    0    0    0    0    0    0    0]\n",
      " [   2    3    4    5    6    7    8    9   10   11]\n",
      " [   4    6    8   10   12   14   16   18   20   22]\n",
      " [   6    9   12   15   18   21   24   27   30   33]\n",
      " [   8   12   16   20   24   28   32   36   40   44]\n",
      " [  10   15   20   25   30   35   40   45   50   55]\n",
      " [  12   18   24   30   36   42   48   54   60   66]\n",
      " [  14   21   28   35   42   49   56   63   70   77]\n",
      " [  16   24   32   40   48   56   64   72   80   88]\n",
      " [  18   27   36   45   54   63   72   81   90   99]\n",
      " [ 999  999  999  999  999  999  999  999  999  999]]\n"
     ]
    }
   ],
   "source": [
    "print(h2.counts[-numpy.inf:numpy.inf, :])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[-999    0    0    0    0    0    0    0    0    0    0  999]\n",
      " [-999    2    3    4    5    6    7    8    9   10   11  999]\n",
      " [-999    4    6    8   10   12   14   16   18   20   22  999]\n",
      " [-999    6    9   12   15   18   21   24   27   30   33  999]\n",
      " [-999    8   12   16   20   24   28   32   36   40   44  999]\n",
      " [-999   10   15   20   25   30   35   40   45   50   55  999]\n",
      " [-999   12   18   24   30   36   42   48   54   60   66  999]\n",
      " [-999   14   21   28   35   42   49   56   63   70   77  999]\n",
      " [-999   16   24   32   40   48   56   64   72   80   88  999]\n",
      " [-999   18   27   36   45   54   63   72   81   90   99  999]]\n"
     ]
    }
   ],
   "source": [
    "print(h2.counts[:, -numpy.inf:numpy.inf])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Also note that the underflows are now all below the normal bins and overflows are now all above the normal bins, regardless of how they were arranged in the Stagg object. This allows analysis code to be independent of histogram source."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Other types\n",
    "\n",
    "Stagg can attach fit functions to histograms, can store standalone functions, such as lookup tables, and can store ntuples for unweighted fits or machine learning."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}