{ "cells": [ { "cell_type": "markdown", "id": "6af675ed", "metadata": {}, "source": [ "# Hypothesis tests with pyhf\n", "\n", "This notebook will provide you with the tools to do sensitivity estimates that can be used for search region optimization or sensitivity projections.\n", "\n", "## p-value for discovery of a new signal\n", "\n", "In searches for new physics we want to know how significant a potential deviation from our Standard Model (SM) expectation is. We do this with a hypothesis test in which we try to exclude the SM (\"background-only\") hypothesis. We use a so-called **p-value** $p_0$ for this, abstractly defined by:\n", "\n", "$$p_0 = \\int\\limits_{t_\\mathrm{obs}}^{\\infty}p(t|H_0)\\mathrm{d}t$$\n", "\n", "where $t$ is a test statistic (a number we calculate from our data observations) and $p(t|H_0)$ is the probability distribution for $t$ under the assumption of our **null hypothesis** $H_0$, in this case the background-only hypothesis. This p-value is then typically converted into a number of standard deviations $z$, the **significance** (\"number of sigmas\"), via the inverse of the cumulative standard normal distribution $\\Phi$:\n", "\n", "$$z = \\Phi^{-1}(1 - p_0)$$\n", "\n", "The convention in particle physics is to speak of *evidence* when $z>3$ and of an *observation* when $z>5$.\n", "\n", "So what do we use for $t$? We want to use something that discriminates well between our null hypothesis and an **alternative hypothesis** that we have in mind. When we try to discover new physics, our null hypothesis is the absence of a signal and the alternative hypothesis its presence. We can parametrize this by a **signal strength** parameter $\mu$. 
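The conversion from $p_0$ to a significance $z$ can be sketched in a few lines of `scipy`. This is a minimal illustration (the helper name `significance` is ours, not part of the notebook):

```python
from scipy.stats import norm

def significance(p0):
    """Convert a one-sided p-value into a Gaussian significance z = Phi^-1(1 - p0)."""
    # isf(p) is the inverse survival function, i.e. Phi^-1(1 - p);
    # it stays numerically stable for very small p-values
    return norm.isf(p0)

print(significance(1.35e-3))  # about 3 sigma (evidence threshold)
print(significance(2.87e-7))  # about 5 sigma (observation threshold)
```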
The test statistics used in almost all LHC searches are based on the **profile likelihood ratio**\n", "\n", "$$\\Lambda_\\mu = \\frac{L(\\mu, \\hat{\\hat{\\theta}})}{L(\\hat{\\mu}, \\hat{\\theta})}$$\n", "\n", "where $\\theta$ are the other parameters of our model that are not part of the test, the so-called **nuisance parameters**. In contrast, the parameter that we want to test, $\\mu$, is called our **parameter of interest** (POI). The nuisance parameters comprise all other fit parameters, such as normalization factors and parameters describing uncertainties. $L(\\mu, \\hat{\\hat{\\theta}})$ is the likelihood function, maximized under the condition that our parameter of interest takes the value $\\mu$, and $L(\\hat{\\mu}, \\hat{\\theta})$ is the unconditionally maximized likelihood. Roughly speaking, we are calculating the fraction of the maximum possible likelihood that we can reach under our test condition: if it is high, that speaks for our hypothesis; if it is low, against it. The test statistic $t_\\mu$ is then defined as\n", "\n", "$$t_\\mu = -2\\ln\\Lambda_\\mu$$\n", "\n", "giving us a test statistic where **high values speak against the null hypothesis**.\n", "\n", "
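For a single-bin counting experiment with no nuisance parameters, $t_\mu$ can be written down directly. The sketch below (our own illustration using `scipy`, not pyhf) computes it for $n \sim \mathrm{Poisson}(\mu s + b)$ with known signal and background expectations $s$ and $b$:

```python
from scipy.stats import poisson

def t_mu(mu, n, s, b):
    """Profile likelihood ratio test statistic t_mu = -2 ln(Lambda_mu) for a
    single-bin counting experiment n ~ Poisson(mu * s + b).  With no nuisance
    parameters the profiling in the numerator is trivial and the
    unconditional maximum likelihood estimate of mu is exact."""
    mu_hat = (n - b) / s  # unconditional MLE of the signal strength
    # (for n < b this goes negative; real analyses often restrict mu >= 0)
    log_l_mu = poisson.logpmf(n, mu * s + b)      # conditional maximum
    log_l_hat = poisson.logpmf(n, mu_hat * s + b)  # unconditional maximum
    return -2.0 * (log_l_mu - log_l_hat)

# at the best-fit point the ratio is 1, so t_mu = 0; elsewhere t_mu > 0
print(t_mu(0.0, n=20, s=5.0, b=10.0))
```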
We can use pyhf to calculate the expected significance or CLs. The maximum likelihood estimates can be calculated exactly. Have a look at `numbercounting.py` for some functions to help you with that. If an expected yield would be exactly zero, use a tiny positive value instead (e.g. `1e-10`), since the $\\lambda$ parameter of a Poisson distribution has to be strictly greater than 0.\n"