{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Make template background model\n",
    "\n",
    "## Introduction \n",
    "\n",
    "In this tutorial, we will create a template background model from scratch. Often, background models are pre-computed and provided for analysis, but it's educational to see how the sausage is made.\n",
    "\n",
    "We will use the \"off observations\", i.e. those without significant gamma-ray emission sources in the field of view from the [H.E.S.S. first public test data release](https://www.mpi-hd.mpg.de/hfm/HESS/pages/dl3-dr1/). This model could then be used in the analysis of sources from that dataset (not done here).\n",
    "\n",
    "We will make a background model that is radially symmetric in the field of view, i.e. only depends on field of view offset angle and energy. At the end, we will save the model in the `BKG_2D` as defined in the [spec](https://gamma-astro-data-formats.readthedocs.io/en/latest/irfs/full_enclosure/bkg/index.html).\n",
    "\n",
    "Note that this is just a quick and dirty example. Actual background model production is done with more sophistication usually using 100s or 1000s of off runs, e.g. concerning non-radial symmetries, binning and smoothing of the distributions, and treating other dependencies such as zenith angle, telescope configuration or optical efficiency. Another aspect not shown here is how to use AGN observations to make background models, by cutting out the part of the field of view that contains gamma-rays from the AGN.\n",
    "\n",
    "We will mainly be using the following classes:\n",
    "        \n",
    "* [gammapy.data.DataStore](https://docs.gammapy.org/dev/api/gammapy.data.DataStore.html) to load the runs to use to build the bkg model.\n",
    "* [gammapy.irf.Background2D](https://docs.gammapy.org/dev/api/gammapy.irf.Background2D.html) to represent and write the background model."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setup\n",
    "\n",
    "As always, we start the notebook with some setup and imports."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%matplotlib inline\n",
    "import matplotlib.pyplot as plt\n",
    "from matplotlib.colors import LogNorm"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from copy import deepcopy\n",
    "import numpy as np\n",
    "import astropy.units as u\n",
    "from astropy.coordinates import SkyCoord, Angle\n",
    "from astropy.io import fits\n",
    "from astropy.table import Table, vstack"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "from gammapy.utils.nddata import sqrt_space\n",
    "from gammapy.data import DataStore\n",
    "from gammapy.irf import Background2D"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Select off data\n",
    "\n",
    "We start by selecting the observations used to estimate the background model.\n",
    "\n",
    "In this case, we just take all \"off runs\" as defined in the observation table."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "data_store = DataStore.from_dir(\"$GAMMAPY_DATA/hess-dl3-dr1\")\n",
    "# Select just the off data runs\n",
    "obs_table = data_store.obs_table\n",
    "obs_table = obs_table[obs_table[\"TARGET_NAME\"] == \"Off data\"]\n",
    "observations = data_store.get_observations(obs_table[\"OBS_ID\"])\n",
    "print(\"Number of observations:\", len(observations))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Background model\n",
    "\n",
    "The background model we will estimate is a differential background rate model in unit `s-1 MeV-1 sr-1` as a function of reconstructed energy and field of fiew offset.\n",
    "\n",
    "We estimate it by histogramming off data events and then smoothing a bit (not using a good method) to get a less noisy estimate. To get the differential rate, we divide by observation time and also take bin sizes into account to get the rate per energy and solid angle. So overall we fill two arrays called `counts` and `exposure` with `exposure` filled so that `background_rate = counts / exposure` will give the final background rate we're interested in.\n",
    "\n",
    "The processing can be done either one observation at a time, or first for counts and then for exposure. Either way is fine. Here we do one observation at a time, starting with empty histograms and then accumulating counts and exposure. Since this is a multi-step algorithm, we put the code to do this computation in a `BackgroundModelEstimator` class.\n",
    "\n",
    "This functionality was already in Gammapy previously, and will be added back again soon, after `gammapy.irf` has been restructured and improved."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "class BackgroundModelEstimator:\n",
    "    def __init__(self, ebounds, offset):\n",
    "        self.counts = self._make_bkg2d(ebounds, offset, unit=\"\")\n",
    "        self.exposure = self._make_bkg2d(ebounds, offset, unit=\"s MeV sr\")\n",
    "\n",
    "    @staticmethod\n",
    "    def _make_bkg2d(ebounds, offset, unit):\n",
    "        ebounds = ebounds.to(\"MeV\")\n",
    "        offset = offset.to(\"deg\")\n",
    "        shape = len(ebounds) - 1, len(offset) - 1\n",
    "        return Background2D(\n",
    "            energy_lo=ebounds[:-1],\n",
    "            energy_hi=ebounds[1:],\n",
    "            offset_lo=offset[:-1],\n",
    "            offset_hi=offset[1:],\n",
    "            data=np.zeros(shape) * u.Unit(unit),\n",
    "        )\n",
    "\n",
    "    def run(self, observations):\n",
    "        for obs in observations:\n",
    "            self.fill_counts(obs)\n",
    "            self.fill_exposure(obs)\n",
    "\n",
    "    def fill_counts(self, obs):\n",
    "        events = obs.events\n",
    "        data = self.counts.data\n",
    "        counts = np.histogram2d(\n",
    "            x=events.energy.to(\"MeV\"),\n",
    "            y=events.offset.to(\"deg\"),\n",
    "            bins=(data.axes[0].bins, data.axes[1].bins),\n",
    "        )[0]\n",
    "        data.data += counts\n",
    "\n",
    "    def fill_exposure(self, obs):\n",
    "        data = self.exposure.data\n",
    "        energy_width = data.axes[0].bin_width\n",
    "        offset = data.axes[1].nodes\n",
    "        offset_width = data.axes[1].bin_width\n",
    "        solid_angle = 2 * np.pi * offset * offset_width\n",
    "        time = obs.observation_time_duration\n",
    "        exposure = time * energy_width[:, None] * solid_angle[None, :]\n",
    "        data.data += exposure\n",
    "\n",
    "    @property\n",
    "    def background_rate(self):\n",
    "        rate = deepcopy(self.counts)\n",
    "        rate.data.data /= self.exposure.data.data\n",
    "        return rate\n",
    "\n",
    "\n",
    "def background2d_peek(bkg):\n",
    "    data = bkg.data\n",
    "    x = data.axes[0].bins\n",
    "    y = data.axes[1].bins\n",
    "    c = data.data.T.value\n",
    "    plt.pcolormesh(x, y, c, norm=LogNorm())\n",
    "    plt.semilogx()\n",
    "    plt.colorbar()\n",
    "    plt.xlabel(\"Energy (TeV)\")\n",
    "    plt.ylabel(\"Offset (deg)\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%time\n",
    "ebounds = np.logspace(-1, 2, 20) * u.TeV\n",
    "offset = sqrt_space(start=0, stop=3, num=10) * u.deg\n",
    "estimator = BackgroundModelEstimator(ebounds, offset)\n",
    "estimator.run(observations)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's have a quick look at what we did ..."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "background2d_peek(estimator.background_rate)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# You could save the background model to a file like this\n",
    "# estimator.background_rate.to_fits().writeto('background_model.fits', overwrite=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Zenith dependence\n",
    "\n",
    "The background models used in H.E.S.S. usually depend on the zenith angle of the observation. That kinda makes sense because the energy threshold increases with zenith angle, and since the background is related to (but not given by) the charged cosmic ray spectrum that is a power-law and falls steeply, we also expect the background rate to change.\n",
    "\n",
    "Let's have a look at the dependence we get for this configuration used here (Hillas reconstruction, standard cuts, see H.E.S.S. release notes for more information)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "x = obs_table[\"ZEN_PNT\"]\n",
    "y = obs_table[\"SAFE_ENERGY_LO\"]\n",
    "plt.plot(x, y, \"o\")\n",
    "plt.xlabel(\"Zenith (deg)\")\n",
    "plt.ylabel(\"Energy threshold (TeV)\");"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "x = obs_table[\"ZEN_PNT\"]\n",
    "y = obs_table[\"EVENT_COUNT\"] / obs_table[\"ONTIME\"]\n",
    "plt.plot(x, y, \"o\")\n",
    "plt.xlabel(\"Zenith (deg)\")\n",
    "plt.ylabel(\"Rate (events / sec)\")\n",
    "plt.ylim(0, 10);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The energy threshold increases, as expected. It's a bit surprising that the total background rate doesn't decreases with increasing zenith angle. That's a bit of luck for this configuration, and because we're looking at the rate of background events in the whole field of view. As shown below, the energy threshold increases (reducing the total rate), but the rate at a given energy increases with zenith angle (increasing the total rate). Overall the background does change with zenith angle and that dependency should be taken into account.\n",
    "\n",
    "The remaining scatter you see in the plots above (in energy threshold and rate) is due to dependence on telescope optical efficiency, atmospheric changes from run to run and other effects. If you're interested in this, [2014APh....54...25H](http://adsabs.harvard.edu/abs/2014APh....54...25H) has some infos. We'll not consider this futher.\n",
    "\n",
    "When faced with the question whether and how to model the zenith angle dependence, we're faced with a complex optimisation problem: the closer we require off runs to be in zenith angle, the fewer off runs and thus event statistic we have available, which will lead do noise in the background model. The choice of zenith angle binning or \"on-off observation mathching\" strategy isn't the only thing that needs to be optimised, there's also energy and offset binnings and smoothing scales. And of course good settings will depend on the way you plan to use the background model, i.e. the science measurement you plan to do. Some say background modeling is the hardest part of IACT data analysis.\n",
    "\n",
    "Here we'll just code up something simple: make three background models, one from the off runs with zenith angle 0 to 20 deg, one from 20 to 40 deg, and one from 40 to 90 deg."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "zenith_bins = [\n",
    "    {\"min\": 0, \"max\": 20},\n",
    "    {\"min\": 20, \"max\": 40},\n",
    "    {\"min\": 40, \"max\": 90},\n",
    "]\n",
    "\n",
    "\n",
    "def make_model(observations):\n",
    "    ebounds = np.logspace(-1, 2, 20) * u.TeV\n",
    "    offset = sqrt_space(start=0, stop=3, num=10) * u.deg\n",
    "    estimator = BackgroundModelEstimator(ebounds, offset)\n",
    "    estimator.run(observations)\n",
    "    return estimator.background_rate\n",
    "\n",
    "\n",
    "def make_models():\n",
    "    for zenith in zenith_bins:\n",
    "        mask = zenith[\"min\"] <= obs_table[\"ZEN_PNT\"]\n",
    "        mask &= obs_table[\"ZEN_PNT\"] < zenith[\"max\"]\n",
    "        obs_ids = obs_table[\"OBS_ID\"][mask]\n",
    "        observations = data_store.get_observations(obs_ids)\n",
    "        yield make_model(observations)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%time\n",
    "models = list(make_models())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "background2d_peek(models[0])\n",
    "plt.figure()\n",
    "background2d_peek(models[2])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "energy = models[0].data.axes[0].nodes.to(\"TeV\")\n",
    "y = models[0].data.evaluate(energy=energy, offset=\"0.5 deg\")\n",
    "plt.plot(energy, y, label=\"0 < zen < 20\")\n",
    "y = models[1].data.evaluate(energy=energy, offset=\"0.5 deg\")\n",
    "plt.plot(energy, y, label=\"20 < zen < 40\")\n",
    "y = models[2].data.evaluate(energy=energy, offset=\"0.5 deg\")\n",
    "plt.plot(energy, y, label=\"40 < zen < 90\")\n",
    "plt.loglog()\n",
    "plt.xlabel(\"Energy (TeV)\")\n",
    "plt.ylabel(\"Bkg rate (s-1 sr-1 MeV-1)\")\n",
    "plt.legend();"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Index tables\n",
    "\n",
    "So now we have radially symmetric background models for three zenith angle bins. To be able to use it from the high-level Gammapy classes like e.g. the MapMaker though, we also have to create a [HDU index table](https://gamma-astro-data-formats.readthedocs.io/en/latest/data_storage/hdu_index/index.html) that declares which background model to use for each observation.\n",
    "\n",
    "It sounds harder than it actually is. Basically you have to some code to make a new `astropy.table.Table`. The most tricky part is that before you can make the HDU index table, you have to decide where to store the data, because the HDU index table is a reference to the data location. Let's decide in this example that we want to re-use all existing files in `$GAMMAPY_DATA/hess-dl3-dr1` and put all the new HDUs (for background models and new index files) bundled in a single FITS file called `hess-dl3-dr3-with-background.fits.gz`, which we will put  in `$GAMMAPY_DATA/hess-dl3-dr1`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "filename = \"hess-dl3-dr3-with-background.fits.gz\"\n",
    "\n",
    "# Make a new table with one row for each observation\n",
    "# pointing to the background model HDU\n",
    "rows = []\n",
    "for obs_row in data_store.obs_table:\n",
    "    obs_row[\"ZEN_PNT\"]\n",
    "    # TODO: pick the right background model\n",
    "    # based on zenith angle\n",
    "    bkg_idx = 0\n",
    "    hdu_name = \"BKG{}\".format(bkg_idx)\n",
    "    row = {\n",
    "        \"OBS_ID\": obs_row[\"OBS_ID\"],\n",
    "        \"HDU_TYPE\": \"bkg\",\n",
    "        \"HDU_CLASS\": \"bkg_2d\",\n",
    "        \"FILE_DIR\": \"\",\n",
    "        \"FILE_NAME\": filename,\n",
    "        \"HDU_NAME\": hdu_name,\n",
    "    }\n",
    "    rows.append(row)\n",
    "\n",
    "hdu_table_bkg = Table(rows=rows)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Make a copy of the original HDU index table\n",
    "hdu_table = data_store.hdu_table.copy()\n",
    "hdu_table.meta.pop(\"BASE_DIR\")\n",
    "\n",
    "# Add the rows for the background HDUs\n",
    "hdu_table = vstack([hdu_table, hdu_table_bkg])\n",
    "hdu_table.sort(\"OBS_ID\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "hdu_table[:7]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Put index tables and background models in a FITS file\n",
    "hdu_list = fits.HDUList()\n",
    "\n",
    "hdu = fits.BinTableHDU(hdu_table)\n",
    "hdu.name = \"HDU_INDEX\"\n",
    "hdu_list.append(hdu)\n",
    "\n",
    "hdu = fits.BinTableHDU(data_store.obs_table)\n",
    "hdu_list.append(hdu)\n",
    "\n",
    "for idx, model in enumerate(models):\n",
    "    hdu = model.to_fits()\n",
    "    hdu.name = \"BKG{}\".format(idx)\n",
    "    hdu_list.append(hdu)\n",
    "\n",
    "print([_.name for _ in hdu_list])\n",
    "\n",
    "import os\n",
    "\n",
    "path = (\n",
    "    Path(os.environ[\"GAMMAPY_DATA\"])\n",
    "    / \"hess-dl3-dr1/hess-dl3-dr3-with-background.fits.gz\"\n",
    ")\n",
    "hdu_list.writeto(str(path), overwrite=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Let's see if it's possible to access the data\n",
    "ds2 = DataStore.from_file(path)\n",
    "ds2.info()\n",
    "obs = ds2.obs(20136)\n",
    "obs.events\n",
    "obs.aeff\n",
    "background2d_peek(obs.bkg)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exercises\n",
    "\n",
    "- Play with the parameters here (energy binning, offset binning, zenith binning)\n",
    "- Try to figure out why there are outliers on the zenith vs energy threshold curve.\n",
    "- Does azimuth angle or optical efficiency have an effect on background rate?\n",
    "- Use the background models for a 3D analysis (see \"hess\" notebook)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}