{ "cells": [ { "cell_type": "markdown", "id": "6af675ed", "metadata": {}, "source": [ "# Hypothesis tests with pyhf\n", "\n", "This notebook will provide you with the tools to do sensitivity estimates which can be used for search region optimization or sensitivity projections.\n", "\n", "## p-value for discovery of a new signal\n", "\n", "In searches for new physics we want to know how significant a potential deviation from our Standard Model (SM) expectation is. We do this with a hypothesis test where we try to exclude the SM (\"background only\") hypothesis. We use a so-called **p-value** $p_0$ for this, abstractly defined by:\n", "\n", "$$p_0 = \\int\\limits_{t_\\mathrm{obs}}^{\\infty}p(t|H_0)\\mathrm{d}t$$\n", "\n", "where $t$ is a test statistic (a number we calculate from our data observations) and $p(t|H_0)$ is the probability distribution for $t$ under the assumption of our **null hypothesis** $H_0$, in this case the background only hypothesis. This p-value is then typically converted into a number of standard deviations $z$, the **significance** (\"number of sigmas\") via the inverse of the cumulative standard normal distribution $\\Phi$:\n", "\n", "$$z = \\Phi^{-1}(1 - p)$$\n", "\n", "The typical convention for particle physics is to speak of *evidence* when $z>3$ and of an *observation* when $z>5$.\n", "\n", "So what do we use for $t$? We want to use something that discriminates well between our null hypothesis and an **alternative hypothesis** that we have in mind. When we try to discover new physics, our null hypothesis is the absence and the alternative hypothesis the presence of a signal. We can parametrize this by a **signal strength** parameter $\\mu$. 
The test statistics used in almost all LHC searches are based on the **profile likelihood ratio**\n", "\n", "$$\\Lambda_\\mu = \\frac{L(\\mu, \\hat{\\hat{\\theta}})}{L(\\hat{\\mu}, \\hat{\\theta})}$$\n", "\n", "where $\\theta$ are the other parameters of our model that are not part of the test, the so-called **nuisance parameters**. In contrast, the parameter that we want to test, $\\mu$, is called our **parameter of interest** (POI). The nuisance parameters include all fit parameters, like normalization factors and parameters for describing uncertainties. $L(\\mu, \\hat{\\hat{\\theta}})$ is the likelihood function, maximized under the condition that our parameter of interest takes the value $\\mu$, and $L(\\hat{\\mu}, \\hat{\\theta})$ is the unconditionally maximized likelihood. So roughly speaking, we are calculating the fraction of the maximum possible likelihood that we can get under our test condition. If it is high, that speaks for our hypothesis; if it is low, against it. The test statistic $t_\\mu$ is then defined as\n", "\n", "$$t_\\mu = -2\\ln\\Lambda_\\mu$$\n", "\n", "giving us a test statistic where **high values speak against the null hypothesis**.\n", "\n", "
\n", " Question 7a: If we want to discover a new signal (using the p-value $p_0$), which value of $\\mu$ are we testing against? Or in other words, what is our null Hypothesis?\n", "
\n", "\n", "All that's left now is to know the distribution $p(t_\\mu|H_0)$. [Wilks' theorem](https://en.wikipedia.org/wiki/Wilks%27_theorem) tells us that the distribution of $t_\\mu$ is asymptotically (for large sample sizes) a chi-square distribution. For the discovery p-value we use a slightly modified version of the test statistic, called $q_0$, where $\\hat{\\mu}$ is required to be $\\geq 0$ ($q_0=0$ for $\\hat{\\mu} < 0$). For $q_0$ the p-value in the asymptotic limit collapses to a very simple formula:\n", "\n", "$$p_0 = 1 - \\Phi(\\sqrt{q_0})$$\n", "\n", "In other words, the significance is simply $z = \\sqrt{q_0}$. The asymptotic limit often matches quite well even for fairly small sample sizes, but it should be kept in mind that this is an approximation. Alternatively, one can evaluate $p(t_\\mu|H_0)$ by Monte Carlo sampling (\"toys\").\n", "\n", "## CLs for exclusion of an absent signal\n", "\n", "Now, sadly, not all searches find evidence for new physics. What we still can do in such a case is to try to exclude models by rejecting the hypothesis of a signal being present. That usually means we test against $\\mu=1$ or some other value $>0$. The rest of the procedure is very similar, with one small detail worth mentioning: In high energy physics it is very common to use a quantity called $CL_s$ instead of the plain p-value. It is defined by\n", "\n", "$$CL_s = \\frac{CL_{s+b}}{CL_{b}}$$\n", "\n", "where $CL_{s+b}$ is the p-value for rejecting the hypothesis of signal + background being present (what would be the \"normal\" p-value) and $CL_{b}$ is the p-value for rejecting the background only hypothesis, but now using the test statistic for $\\mu=1$ (so this is different from $p_0$!). We won't go into further details on how to calculate those p-values. `pyhf` has the formulas included and does it automatically for us. 
The asymptotic distributions for all different variants are described in the paper \"Asymptotic formulae for likelihood-based tests of new physics\" ([arXiv:1007.1727](https://arxiv.org/abs/1007.1727)).\n", "\n", "Just a qualitative explanation of why we use $CL_s$ instead of the p-value: We want to avoid excluding signals in cases where we don't have sensitivity, but observe an *underfluctuation* of the data. In these cases $CL_{s+b}$ and $CL_b$ will be very similar and consequently lead to a large value of $CL_{s}$, telling us the signal is **not** excluded. If our observations exactly match the background expectation, $CL_b = 0.5$ in the asymptotic limit, so on average the \"p-values\" we get with $CL_s$ are twice as high.\n", "\n", "The typical convention for particle physics is to speak of an **exclusion** of a signal if $CL_s < 0.05$. That's usually what is meant by \"limit at 95% confidence level\"." ] }, { "cell_type": "markdown", "id": "59665ed7", "metadata": {}, "source": [ "## Discovery or exclusion of a signal for a cut & count experiment" ] }, { "cell_type": "markdown", "id": "fbc8e9e6", "metadata": {}, "source": [ "Let's start with a simple case where we only want to count the number of events in a certain search region. We assume a certain number of expected background events `b`, expected signal events `s` and a total uncertainty on the expected background `delta_b` ($\\sigma_b$).\n", "\n", "The likelihood function for this can be formulated as a primary measurement of `n` events and a control (\"auxiliary\") measurement of `m` events that constrains our background parameter within the uncertainty. So, a product of 2 Poisson distributions:\n", "\n", "$$L(s, b) = \\mathrm{Pois}(n|s + b)\\cdot \\mathrm{Pois}(m|\\tau b)$$\n", "\n", "The parameter $\\tau$ can be given in terms of $\\sigma_b$ by asking the question \"How many more events do I have to measure in the control region to get the relative uncertainty $\\sigma_b / b$?\" 
That gives\n", "\n", "$$\\tau = \\frac{b}{\\sigma_b^2}$$\n", "\n", "Equivalently, we can replace $b$ by $\\gamma b$ and $s$ by $\\mu s$ to fit normalization factors (initialized to 1) and keep $s$ and $b$ fixed to our expectation.\n", "\n", "$$L'(\\mu, \\gamma) = L(\\mu s, \\gamma b)$$\n", "\n", "`pyhf` has a convenience function to create the specification for such a model: `pyhf.simplemodels.hepdata_like`. It also works for arbitrarily many bins, but for now let's go with one bin and 5 expected background events, 7 expected signal events and an uncertainty of 2 on the expected background events:" ] }, { "cell_type": "code", "execution_count": null, "id": "29ba6b01", "metadata": {}, "outputs": [], "source": [ "import pyhf\n", "from scipy import stats\n", "import numpy as np" ] }, { "cell_type": "code", "execution_count": null, "id": "228261b1", "metadata": {}, "outputs": [], "source": [ "s = 7\n", "b = 5\n", "delta_b = 2" ] }, { "cell_type": "code", "execution_count": null, "id": "5d65cb13", "metadata": {}, "outputs": [], "source": [ "model = pyhf.simplemodels.hepdata_like(\n", "    signal_data=[s], bkg_data=[b], bkg_uncerts=[delta_b]\n", ")" ] }, { "cell_type": "markdown", "id": "fd0168e3", "metadata": {}, "source": [ "The model comes with a \"parameter of interest\" (POI) called `mu` that is our signal strength:" ] }, { "cell_type": "code", "execution_count": null, "id": "c9a435b3", "metadata": {}, "outputs": [], "source": [ "model.config.poi_name" ] }, { "cell_type": "markdown", "id": "403d454b", "metadata": {}, "source": [ "In addition, we have one nuisance parameter, the constrained background normalization $\\gamma$, called `uncorr_bkguncrt` here:" ] }, { "cell_type": "code", "execution_count": null, "id": "eea9a3a6", "metadata": {}, "outputs": [], "source": [ "model.config.par_order" ] }, { "cell_type": "markdown", "id": "13498031", "metadata": {}, "source": [ "Its initial value should be 1:" ] }, { "cell_type": "code", "execution_count": null, "id": "535f5314", 
"metadata": {}, "outputs": [], "source": [ "gamma_initial = 1" ] }, { "cell_type": "markdown", "id": "6d6d4528", "metadata": {}, "source": [ "So the expected data in our model scales with `mu`. For `mu=1` we get `5 + 1 * 7 = 12`:" ] }, { "cell_type": "code", "execution_count": null, "id": "c27bd580", "metadata": {}, "outputs": [], "source": [ "model.expected_actualdata([1, gamma_initial])" ] }, { "cell_type": "markdown", "id": "de757850", "metadata": {}, "source": [ "For `mu=2` we get `5 + 2 * 7 = 19`:" ] }, { "cell_type": "code", "execution_count": null, "id": "e30c4ab7", "metadata": {}, "outputs": [], "source": [ "model.expected_actualdata([2, gamma_initial])" ] }, { "cell_type": "markdown", "id": "c823d879", "metadata": {}, "source": [ "The auxiliary data corresponds to $\\tau b$ in the formula above:" ] }, { "cell_type": "code", "execution_count": null, "id": "041e884e", "metadata": {}, "outputs": [], "source": [ "model.config.auxdata" ] }, { "cell_type": "markdown", "id": "484c7fb9", "metadata": {}, "source": [ "It is determined by our background expectation `b` and its uncertainty `delta_b`:" ] }, { "cell_type": "code", "execution_count": null, "id": "22f7a85c", "metadata": {}, "outputs": [], "source": [ "b ** 2 / (delta_b ** 2)" ] }, { "cell_type": "markdown", "id": "c69d76d2", "metadata": {}, "source": [ "To get the p-value for rejection of the background only hypothesis, we call `pyhf.infer.hypotest` with the test value 0 of our POI $\\mu$ using the `q0` test statistic.\n", "\n", "We want to know which p-value we would get if we observed an excess of events equal to precisely the expected signal, so we plug in `s + b` for the data:" ] }, { "cell_type": "code", "execution_count": null, "id": "4d42e783", "metadata": {}, "outputs": [], "source": [ "pvalue = pyhf.infer.hypotest(\n", "    poi_test=0,\n", "    data=[s + b] + model.config.auxdata,\n", "    pdf=model,\n", "    test_stat=\"q0\"\n", ")\n", "pvalue" ] }, { "cell_type": "markdown", "id": "69482321", "metadata": {}, "source": [ "We can convert 
this into a significance (number of standard deviations) using the inverse of the cumulative standard normal distribution $\\Phi$\n", "\n", "$$z = \\Phi^{-1}(1 - p)$$\n", "\n", "The function [`scipy.stats.norm.isf`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.norm.html#scipy.stats.norm) (\"inverse survival function\") calculates $\\Phi^{-1}(1 - p)$ in a numerically stable way (also for small p-values)." ] }, { "cell_type": "code", "execution_count": null, "id": "78ca9b05", "metadata": {}, "outputs": [], "source": [ "def pvalue_to_significance(pvalue):\n", " return stats.norm.isf(pvalue)" ] }, { "cell_type": "code", "execution_count": null, "id": "6ad12b0a", "metadata": {}, "outputs": [], "source": [ "pvalue_to_significance(pvalue)" ] }, { "cell_type": "markdown", "id": "8ebf754d", "metadata": {}, "source": [ "That would not count as \"Evidence\" yet.\n", "\n", "
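For orientation, here is a minimal sketch (not part of the original analysis code) that prints the one-sided p-values corresponding to the $3\\sigma$ and $5\\sigma$ conventions. It uses `scipy.stats.norm.sf`, the survival function that `isf` inverts:\n", "\n", "```python\n", "from scipy import stats\n", "\n", "# one-sided tail probability of a standard normal distribution\n", "for z in (3, 5):\n", "    print(f\"z = {z}: p = {stats.norm.sf(z):.3g}\")\n", "```\n", "\n", "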
\n", " Exercise 7b: How many excess events would we need to observe in our search region (assuming unchanged expected background) to have the potential to find evidence (3 $\\sigma$) for a new signal?\n", "
" ] }, { "cell_type": "markdown", "id": "35a7c630", "metadata": {}, "source": [ "Similarly, we can test for exclusion and calculate $CL_s$. For that we use 1 as the test value for $\\mu$ and the `qtilde` test statistic.\n", "\n", "We want to know if we could exclude a signal if we did not observe any more data than our background expectation, so we set our data to `b`:" ] }, { "cell_type": "code", "execution_count": null, "id": "38991cf2", "metadata": {}, "outputs": [], "source": [ "CLs = pyhf.infer.hypotest(\n", "    poi_test=1,\n", "    data=[b] + model.config.auxdata,\n", "    pdf=model,\n", "    test_stat=\"qtilde\"\n", ")\n", "CLs" ] }, { "cell_type": "markdown", "id": "7a54c716", "metadata": {}, "source": [ "
\n", " Question 7c: Would that signal count as excluded?\n", "
" ] }, { "cell_type": "markdown", "id": "2540c262", "metadata": {}, "source": [ "
\n", " Tip: For this simple number counting scenario, often used for optimizing search sensitivity, you don't need to spin up pyhf to calculate the expected significance or CLs. The maximum likelihood estimates can be calculated exactly. Have a look at numbercounting.py for some functions to help you with that.

For a function that calculates the expected significance in one step as a formula, have a look at http://www.pp.rhul.ac.uk/~cowan/stat/medsig/medsigNote.pdf\n", "
" ] }, { "cell_type": "code", "execution_count": null, "id": "35a055cd", "metadata": {}, "outputs": [], "source": [ "from numbercounting import z0_exp, cls_exp" ] }, { "cell_type": "code", "execution_count": null, "id": "0648aabe", "metadata": {}, "outputs": [], "source": [ "z0_exp(s, b, delta_b)" ] }, { "cell_type": "code", "execution_count": null, "id": "f101cab3", "metadata": {}, "outputs": [], "source": [ "cls_exp(s, b, delta_b)" ] }, { "cell_type": "markdown", "id": "d682cdd6", "metadata": {}, "source": [ "These functions support numpy arrays, so you can easily run them on millions of different values!" ] }, { "cell_type": "code", "execution_count": null, "id": "5bbd8344", "metadata": {}, "outputs": [], "source": [ "N = 1_000_000\n", "s_random = 20 * np.random.rand(N)\n", "b_random = 20 * np.random.rand(N)\n", "db_rel_random = np.random.rand(N)" ] }, { "cell_type": "code", "execution_count": null, "id": "a79c17fe", "metadata": {}, "outputs": [], "source": [ "%time z0_exp(s_random, b_random, db_rel_random * b_random)" ] }, { "cell_type": "code", "execution_count": null, "id": "941742db", "metadata": {}, "outputs": [], "source": [ "%time cls_exp(s_random, b_random, db_rel_random * b_random)" ] }, { "cell_type": "markdown", "id": "0b1ea2e8", "metadata": {}, "source": [ "## Run an upper limit scan on the signal strength" ] }, { "cell_type": "code", "execution_count": null, "id": "b15741cb", "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "id": "4d08b81f", "metadata": {}, "source": [ "Often one does not only quote limits for a particular assumed signal strength, but gives a limit on the signal strength itself. 
This is especially interesting for single-bin (cut & count) search regions, since it is quite model independent (everybody can simulate their own model and calculate the number of excess events in a certain search region to determine if it is excluded by such a limit).\n", "\n", "To do that, we need to invert the hypothesis test, i.e. find the smallest signal strength that is excluded with $CL_s<0.05$.\n", "\n", "So, let's do a scan!\n", "\n", "Ok, before we do so, let me introduce another concept: In addition to the expected $CL_s$ values we usually also show the expected 1 and 2 $\\sigma$ bands to get a feeling for the range in which we expect to actually observe $CL_s$ values when we do the analysis on real (and therefore fluctuating) data. `pyhf.infer.hypotest` can return, as a second return value, the `[-2, -1, 0, 1, 2]` $\\sigma$ bounds on $CL_s$ if we pass `return_expected_set=True`.\n", "\n", "For example:" ] }, { "cell_type": "code", "execution_count": null, "id": "d6edf2ed", "metadata": {}, "outputs": [], "source": [ "pyhf.infer.hypotest(1, [b] + model.config.auxdata, model, return_expected_set=True, test_stat=\"qtilde\")" ] }, { "cell_type": "markdown", "id": "8b4e2d54", "metadata": {}, "source": [ "So the first return value is the observed $CL_s$ value (which is in our case the same as the expected one, since we plugged the exact background expectation into our model) and the second return value is a list of 5 $CL_s$ values for the `[-2, -1, 0, 1, 2]` $\\sigma$ bounds. So in our case the third value in that list is the same as our \"observed\" $CL_s$ value.\n", "\n", "Now for the actual scan - let's find the lowest value of $\\mu$ (and its 1 and 2 $\\sigma$ bands) that is excluded with $CL_s<0.05$. 
Let's scan with 31 points between 0 and 3:" ] }, { "cell_type": "code", "execution_count": null, "id": "1659aa08", "metadata": {}, "outputs": [], "source": [ "mu_scan = np.linspace(0, 3, 31)\n", "results = [\n", "    pyhf.infer.hypotest(mu, [b] + model.config.auxdata, model, return_expected_set=True, test_stat=\"qtilde\")\n", "    for mu in mu_scan\n", "]\n", "# for this example we only need the expected band (second return value)\n", "# let's also convert this to a numpy array, such that we can slice it column-wise\n", "results = np.array([r[1] for r in results])" ] }, { "cell_type": "markdown", "id": "def14c8f", "metadata": {}, "source": [ "This is often visualized with interpolated lines and a green and a yellow band for the 1 and 2 sigma bounds (\"Brazil plot\"):" ] }, { "cell_type": "code", "execution_count": null, "id": "9ae235c6", "metadata": {}, "outputs": [], "source": [ "def plot_scan(scan_parameters, results, exclusion_level=0.05, ax=None):\n", "    ax = ax or plt.gca()\n", "    ax.axhline(exclusion_level, linestyle=\"--\", color=\"red\")\n", "    ax.plot(scan_parameters, results[:, 2], \"--\", color=\"black\", label=\"Expected\")\n", "    ax.fill_between(\n", "        scan_parameters, results[:, 1], results[:, 3], alpha=0.5, color=\"green\", label=\"Expected +/- 1 σ\"\n", "    )\n", "    ax.fill_between(\n", "        scan_parameters, results[:, 0], results[:, 4], alpha=0.5, color=\"yellow\", label=\"Expected +/- 2 σ\"\n", "    )\n", "    return ax" ] }, { "cell_type": "code", "execution_count": null, "id": "888b1144", "metadata": {}, "outputs": [], "source": [ "ax = plot_scan(mu_scan, results)\n", "ax.set_xlabel(r\"Signal strength $\\mu$\")\n", "ax.set_ylabel(\"$CL_s$\")\n", "ax.legend()" ] }, { "cell_type": "markdown", "id": "4c1fc03f", "metadata": {}, "source": [ "By looking where the red line crosses the expected line and the error bands we can conclude that the minimum signal strength we expect to exclude in case of no excess events is slightly below 1. 
We would expect that limit to fluctuate between $\\approx$ 0.6 and 1.4 at $1\\sigma$ level and 0.5 to 2 at $2\\sigma$ level.\n", "\n", "`pyhf` also has a convenience function to run the scan and interpolate this for us, so we don't need to read it from the plot with a ruler ;)" ] }, { "cell_type": "code", "execution_count": null, "id": "11059940", "metadata": {}, "outputs": [], "source": [ "pyhf.infer.intervals.upperlimit([b] + model.config.auxdata, model, mu_scan)" ] }, { "cell_type": "markdown", "id": "8b727fa2", "metadata": {}, "source": [ "
\n", " Exercise 7d [medium]: According to this method, what would be the upper limit on the number of excess events in case of exactly 0 expected background events (no uncertainty)? Follow up question: Is that correct? Or in other words: How well does the asymptotic limit do here? Hint: To create a model with negligible background you have to set the background expectation to a small value (e.g. 1e-10) since the $\\lambda$ parameter of a Poisson distribution has to be strictly greater than 0.\n", "
" ] }, { "cell_type": "markdown", "id": "51b3c87c", "metadata": {}, "source": [ "## Run an upper limit scan on signal parameters" ] }, { "cell_type": "code", "execution_count": null, "id": "d69d148e", "metadata": {}, "outputs": [], "source": [ "import json\n", "import mplhep as hep" ] }, { "cell_type": "markdown", "id": "b5cffbe2", "metadata": {}, "source": [ "Now, often we are not only interested in one particular signal, but we might have a certain class of signal models in mind and ask ourselves which parameters of that model are excluded.\n", "\n", "What we can do is run an upper limit scan for each parameter point and check for which points the excluded signal strength is below 1!\n", "\n", "To get something that looks realistic, I prepared a simplified version of a fit to 9 bins using the [published data from the ATLAS SUSY 1L Wh analysis](https://doi.org/10.17182/hepdata.90607.v3). If you are interested in how the procedure of simplifying the model works, have a look at [dump_signal_grid.ipynb](dump_signal_grid.ipynb).\n", "\n", "Let's load it! 
We have the following background expectations for each bin:" ] }, { "cell_type": "code", "execution_count": null, "id": "8a8b1503", "metadata": {}, "outputs": [], "source": [ "with open(\"example_background.json\") as f:\n", " b_9bins = json.load(f)" ] }, { "cell_type": "code", "execution_count": null, "id": "179f9fad", "metadata": {}, "outputs": [], "source": [ "b_9bins" ] }, { "attachments": { "image.png": { "image/png": "" } }, "cell_type": "markdown", "id": "803b0cae", "metadata": {}, "source": [ "And we have a long list of signal models for a SUSY process with a chargino/neutralino pair ($\\tilde{\\chi}_1^{\\pm}/\\tilde{\\chi}_2^{0}$) that each decay into the lightest neutralino $\\tilde{\\chi}_1^{0}$, while emitting a W Boson and a Higgs Boson.\n", "\n", "![image.png](attachment:image.png)\n", "\n", "The model parameters are the $\\tilde{\\chi}_1^{\\pm}/\\tilde{\\chi}_2^{0}$ mass (assumed to be the same) and the $\\tilde{\\chi}_1^{0}$ mass. We call them `x` and `y` in the following." 
] }, { "cell_type": "code", "execution_count": null, "id": "bc16a5ee", "metadata": {}, "outputs": [], "source": [ "with open(\"example_signals.json\") as f:\n", "    signals_9bins = json.load(f)" ] }, { "cell_type": "code", "execution_count": null, "id": "4cf37b69", "metadata": {}, "outputs": [], "source": [ "signals_9bins" ] }, { "cell_type": "markdown", "id": "f418bad5", "metadata": {}, "source": [ "Let's plot a few random signal models against the background expectation for all 9 bins:" ] }, { "cell_type": "code", "execution_count": null, "id": "3d96799c", "metadata": {}, "outputs": [], "source": [ "hep.histplot(b_9bins, label=\"background\", color=\"black\")\n", "for i in np.random.permutation(len(signals_9bins))[:3]:\n", "    signal = signals_9bins[i]\n", "    hep.histplot(signal[\"data\"], label=f\"i={i} x={signal['x']} y={signal['y']}\", linestyle=\"--\")\n", "plt.legend()" ] }, { "cell_type": "markdown", "id": "c19761fa", "metadata": {}, "source": [ "So by running the previous cell a few times you can see that the shapes of the different signals differ considerably, which makes a multi-bin search very powerful!\n", "\n", "A little background information: Those 9 bins originally come from 3 different, statistically independent signal regions. The first 3 bins are optimized for signals with low masses, the next 3 bins for signals with medium masses and the last 3 bins for signals with high masses. 
Here we don't care about the x-axis though, since we just want to do a statistical analysis.\n", "\n", "So let's start with a 1D scan of all points with a neutralino mass (`y`) of 150 GeV:" ] }, { "cell_type": "code", "execution_count": null, "id": "a88481c7", "metadata": {}, "outputs": [], "source": [ "# select all signals with y=150 and sort them by x\n", "signals_150 = sorted([i for i in signals_9bins if i[\"y\"] == 150], key=lambda k: k[\"x\"])" ] }, { "cell_type": "markdown", "id": "12a5dbae", "metadata": {}, "source": [ "We will run an upper limit scan on the signal strength for each of these signal points (so in some sense we are actually already doing a 2D scan).\n", "\n", "We will again use the `pyhf.simplemodels.hepdata_like` function and completely neglect any uncertainties:" ] }, { "cell_type": "code", "execution_count": null, "id": "bf27006d", "metadata": {}, "outputs": [], "source": [ "limits = []\n", "for signal in signals_150:\n", "    model_s = pyhf.simplemodels.hepdata_like(\n", "        signal_data=list(signal[\"data\"]), bkg_data=list(b_9bins), bkg_uncerts=[0] * 9\n", "    )\n", "    limit_obs, limit_exp = pyhf.infer.intervals.upperlimit(\n", "        list(b_9bins) + model_s.config.auxdata,\n", "        model_s,\n", "        np.linspace(0, 5, 51)\n", "    )\n", "    # for now, we are just looking at the expected limit, since we didn't input any real data\n", "    limits.append(limit_exp)\n", "limits = np.array(limits)" ] }, { "cell_type": "code", "execution_count": null, "id": "1e337758", "metadata": {}, "outputs": [], "source": [ "ax = plot_scan([i[\"x\"] for i in signals_150], limits, exclusion_level=1)\n", "ax.set_xlabel(r\"$\\tilde{\\chi}_1^{\\pm}/\\tilde{\\chi}_2^{0}$ mass [GeV]\")\n", "ax.set_ylabel(\"Upper limit on signal strength\")\n", "ax.legend()" ] }, { "cell_type": "markdown", "id": "c4c2fb43", "metadata": {}, "source": [ "So the point at 300 GeV is at the boundary of exclusion and then the expected exclusion reaches up to around 850 GeV for these signals." 
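, "\n", "\n", "Instead of reading such crossing points off the plot, one can interpolate them. The following is a minimal sketch with a hypothetical helper `crossings` and made-up numbers (not the limits computed above); it linearly interpolates every position where a curve crosses a given level:\n", "\n", "```python\n", "import numpy as np\n", "\n", "\n", "def crossings(x, y, level=1.0):\n", "    \"\"\"Linearly interpolate all x positions where y crosses level.\"\"\"\n", "    x = np.asarray(x, dtype=float)\n", "    d = np.asarray(y, dtype=float) - level\n", "    # indices where the sign changes between neighbouring points\n", "    # (points lying exactly on the level are not counted)\n", "    idx = np.where(d[:-1] * d[1:] < 0)[0]\n", "    return x[idx] - d[idx] * (x[idx + 1] - x[idx]) / (d[idx + 1] - d[idx])\n", "\n", "\n", "# hypothetical masses [GeV] and expected upper limits on mu\n", "mass = [300, 400, 600, 800, 900]\n", "limit = [1.1, 0.5, 0.4, 0.8, 1.2]\n", "print(crossings(mass, limit))\n", "```\n", "\n", "With the arrays from the scan above, the same helper could be called as `crossings([i[\"x\"] for i in signals_150], limits[:, 2])` for the median expected limit."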
] }, { "cell_type": "markdown", "id": "be4c7091", "metadata": {}, "source": [ "## Let's go 2D\n", "\n", "As you have seen, this example has 2 parameters, so we can run a 2D scan against both parameters. For this we don't do an upper limit scan for each point, but just calculate $CL_s$ for $\\mu=1$ and look at the contour of $CL_s=0.05$ in the grid of the 2 parameters." ] }, { "cell_type": "code", "execution_count": null, "id": "22016ff0", "metadata": {}, "outputs": [], "source": [ "x = []\n", "y = []\n", "cls = []\n", "for signal in signals_9bins:\n", "    model_s = pyhf.simplemodels.hepdata_like(\n", "        signal_data=list(signal[\"data\"]), bkg_data=list(b_9bins), bkg_uncerts=[0] * 9\n", "    )\n", "    # named cls_exp_band to avoid shadowing the cls_exp function imported from numbercounting\n", "    cls_obs, cls_exp_band = pyhf.infer.hypotest(\n", "        1,\n", "        list(b_9bins) + model_s.config.auxdata,\n", "        model_s,\n", "        test_stat=\"qtilde\",\n", "        return_expected_set=True\n", "    )\n", "    x.append(signal[\"x\"])\n", "    y.append(signal[\"y\"])\n", "    cls.append(cls_exp_band)\n", "x = np.array(x)\n", "y = np.array(y)\n", "cls = np.array(cls)" ] }, { "cell_type": "markdown", "id": "9fc256c4", "metadata": {}, "source": [ "For better interpolation, we convert the $CL_s$ values to significances since they change much more linearly and are therefore easier to interpolate between:" ] }, { "cell_type": "code", "execution_count": null, "id": "1370d8d4", "metadata": {}, "outputs": [], "source": [ "z = pvalue_to_significance(cls)" ] }, { "cell_type": "code", "execution_count": null, "id": "3d342cdf", "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(ncols=2, figsize=(12, 3))\n", "ax[0].plot(sorted(cls[:, 2]))\n", "ax[0].set_title(\"Sorted CLs values\")\n", "ax[1].plot(sorted(z[:, 2]))\n", "ax[1].set_title(\"Sorted Significance values\")" ] }, { "cell_type": "markdown", "id": "21b3ea17", "metadata": {}, "source": [ "The significance level corresponding to a p-value of `0.05` is" ] }, { "cell_type": "code", "execution_count": null, "id": "26c38a64", "metadata": {}, "outputs": [], 
"source": [ "level = pvalue_to_significance(0.05)\n", "level" ] }, { "cell_type": "markdown", "id": "6ef6149c", "metadata": {}, "source": [ "You'll have the number `1.64` in your head after some time working on new physics searches :)\n", "\n", "So, let's draw the contour in the 2D grid of our signal mass parameters. For such grids we don't overdo it and leave out the 2 sigma band. We use the `tricontour` functions of `matplotlib` to do a triangulation of our points in 3D (`x`, `y`, `z`) space and draw contours along that interpolated hill.\n", "\n", "Reminder: the columns of expected $CL_s$ values are for `[-2, -1, 0, 1, 2]` sigma." ] }, { "cell_type": "code", "execution_count": null, "id": "c2c72c94", "metadata": {}, "outputs": [], "source": [ "opt = dict(levels=[level], colors=[\"black\"])\n", "plt.tricontour(x, y, z[:, 2], linestyles=\"dashed\", **opt)\n", "plt.tricontour(x, y, z[:, 1], linestyles=\"dotted\", **opt)\n", "plt.tricontour(x, y, z[:, 3], linestyles=\"dotted\", **opt)\n", "plt.tricontourf(x, y, z[:, 2], levels=100, cmap=\"Spectral_r\")\n", "plt.scatter(x, y, c=z[:, 2], marker='o', edgecolor=\"black\", cmap=\"Spectral_r\")\n", "plt.colorbar(label=\"Significance of CLs\")\n", "plt.xlabel(r\"$\\tilde{\\chi}_1^{\\pm}/\\tilde{\\chi}_2^{0}$ mass [GeV]\")\n", "plt.ylabel(r\"$\\tilde{\\chi}_1^{0}$ mass [GeV]\")" ] }, { "cell_type": "markdown", "id": "fd7358f2", "metadata": {}, "source": [ "
\n", " Question 7e: Which values of the signal parameters are expected to be excluded?\n", "
\n", "\n", "
\n", " Exercise 7f [medium]: How far would the expected limit go if we had twice the luminosity? Bonus Question [hard]: How far would the expected 3 $\\sigma$ contour, based on the discovery p-value, reach?\n", "
" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.2" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 5 }