{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Higgs decay to two photons\n",
"
\n",
"The Standard Model predicted the decay of the [Higgs bosons](https://en.wikipedia.org/wiki/Higgs_boson) into photons. The process is depicted by the diagrams below:\n",
"\n",
"At the [Large Hadron Collider](https://en.wikipedia.org/wiki/Large_Hadron_Collider), this process has been measured. This figure shows how an Higgs boson decay looks in the CMS detector:\n",
"\n",
"This ROOTbook illustrates a simplified fitting procedure aiming to identify the peak due to the Higgs boson decay over the exponentially falling background.\n",
"\n",
"## Importing input data into a ROOT file\n",
"First of all we import the input data, here simplistically stored into a text file, into a [ROOT file](https://root.cern.ch/doc/master/classTFile.html)."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Read 30770 events from the file.\n"
]
}
],
"source": [
"TTree tree(\"HiggsTree\",\"The tree cont\");\n",
"auto nevt = tree.ReadFile(\"Hgg.txt\",\"x\");\n",
"if (nevt <= 0) { \n",
" Error(\"fitHgg\",\"Error reading data from input file \");\n",
" }\n",
"std::cout << \"Read \" << nevt << \" events from the file.\" << std::endl;"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"## Modelling the distributions with RooFit\n",
"[ROOT](www.root.cern.ch) provides a tool for *rich modelling* of statistical distributions: [RooFit](https://root.cern.ch/roofit-20-minutes). We now create a RooFit model using the [RooWorkspace](https://root.cern.ch/doc/master/classRooWorkspace.html) factory: a tool which allow to express complex entities with a convenient syntax."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\u001b[1mRooFit v3.60 -- Developed by Wouter Verkerke and David Kirkby\u001b[0m \n",
" Copyright (C) 2000-2013 NIKHEF, University of California & Stanford University\n",
" All rights reserved, please read http://roofit.sourceforge.net/license.txt\n",
"\n"
]
}
],
"source": [
"RooWorkspace w(\"Workspace for the Higgs fit\");\n",
"w.factory(\"x[110,160]\"); // invariant mass\n",
"\n",
"// Backround Model\n",
"w.factory(\"nbackground[10000, 0, 10000]\");\n",
"w.var(\"nbackground\")->setVal(nevt);\n",
"w.var(\"nbackground\")->setMin(0.1*nevt);\n",
"w.var(\"nbackground\")->setMax(10*nevt);\n",
"\n",
"// Create exponential distribution characterised by two components\n",
"w.factory(\"a1[ 7.5, -500, 500]\");\n",
"w.factory(\"a2[-1.5, -500, 500]\");\n",
"w.factory(\"expr::z('-(a1*x/100 + a2*(x/100)^2)', a1, a2, x)\");\n",
"w.factory(\"Exponential::bmodel(z, 1)\");\n",
"\n",
"// Signal Model\n",
"w.factory(\"nsignal[100, 0.0, 1000.0]\");\n",
"w.factory(\"mass[130, 110, 150]\");\n",
"w.factory(\"width[1, 0.5, 5]\");\n",
"w.factory(\"Gaussian::smodel(x, mass, width)\");\n",
"\n",
"// Signal + Background Model\n",
"w.factory(\"SUM::model(nbackground*bmodel, nsignal*smodel)\");"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Extract the signal and signal+background model from the workspace:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"auto smodel = w.pdf(\"smodel\");\n",
"auto model = w.pdf(\"model\");\n",
"auto x = w.var(\"x\");"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We create now the RooFit data set importing the data from the ROOT tree"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[#1] INFO:Eval -- RooAbsReal::attachToTree(x) TTree Float_t branch x will be converted to double precision\n",
"(RooDataSet &) Name: data Title: data\n"
]
}
],
"source": [
"RooDataSet dataset(\"data\",\"data\",*x,RooFit::Import(tree) )"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The workspace can be printed on screen: this is very useful to inspect the models created."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"RooWorkspace(Workspace for the Higgs fit) Workspace for the Higgs fit contents\n",
"\n",
"variables\n",
"---------\n",
"(a1,a2,mass,nbackground,nsignal,width,x)\n",
"\n",
"p.d.f.s\n",
"-------\n",
"RooExponential::bmodel[ x=z c=1 ] = 0.000616625\n",
"RooAddPdf::model[ nbackground * bmodel + nsignal * smodel ] = 0.000610556\n",
"RooGaussian::smodel[ x=x mean=mass sigma=width ] = 3.72665e-06\n",
"\n",
"functions\n",
"--------\n",
"RooFormulaVar::z[ actualVars=(a1,a2,x) formula=\"-(a1*x/100+a2*(x/100)^2)\" ] = -7.39125\n",
"\n"
]
}
],
"source": [
"w.Print();"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Fit of the Data\n",
"The fitting procedure results in quite some output. It is important to learn how to read the output of a fit, for example the value of the parameters, their uncertainties, the correlation matrix and the status of the fitting procedure.\n",
"\n",
"** Be patient: the fit is unbinned and can take a while **"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[#1] INFO:Minization -- p.d.f. provides expected number of events, including extended term in likelihood.\n",
"[#1] INFO:NumericIntegration -- RooRealIntegral::init(bmodel_Int[x]) using numeric integrator RooIntegrator1D to calculate Int(x)\n",
"[#1] INFO:Minization -- RooMinimizer::optimizeConst: activating const optimization\n",
"[#1] INFO:NumericIntegration -- RooRealIntegral::init(bmodel_Int[x]) using numeric integrator RooIntegrator1D to calculate Int(x)\n",
"[#1] INFO:Minization -- The following expressions will be evaluated in cache-and-track mode: (bmodel,smodel)\n",
" **********\n",
" ** 1 **SET PRINT 1\n",
" **********\n",
" **********\n",
" ** 2 **SET NOGRAD\n",
" **********\n",
" PARAMETER DEFINITIONS:\n",
" NO. NAME VALUE STEP SIZE LIMITS\n",
" 1 a1 7.50000e+00 1.00000e+02 -5.00000e+02 5.00000e+02\n",
" 2 a2 -1.50000e+00 1.00000e+02 -5.00000e+02 5.00000e+02\n",
" 3 mass 1.30000e+02 4.00000e+00 1.10000e+02 1.50000e+02\n",
" 4 nbackground 1.00000e+04 3.46150e+03 3.07700e+03 3.07700e+05\n",
" 5 nsignal 1.00000e+02 5.00000e+01 0.00000e+00 1.00000e+03\n",
" 6 width 1.00000e+00 2.50000e-01 5.00000e-01 5.00000e+00\n",
" **********\n",
" ** 3 **SET ERR 0.5\n",
" **********\n",
" **********\n",
" ** 4 **SET PRINT 1\n",
" **********\n",
" **********\n",
" ** 5 **SET STR 1\n",
" **********\n",
" NOW USING STRATEGY 1: TRY TO BALANCE SPEED AGAINST RELIABILITY\n",
" **********\n",
" ** 6 **MIGRAD 3000 1\n",
" **********\n",
" FIRST CALL TO USER FUNCTION AT NEW START POINT, WITH IFLAG=4.\n",
"[#1] INFO:Minization -- RooNLLVar::evaluatePartition(nll_model_data) first = 0 last = 30770 Likelihood offset now set to -156454\n",
" START MIGRAD MINIMIZATION. STRATEGY 1. CONVERGENCE WHEN EDM .LT. 1.00e-03\n",
" FCN=-382.008 FROM MIGRAD STATUS=INITIATE 121 CALLS 122 TOTAL\n",
" EDM= unknown STRATEGY= 1 NO ERROR MATRIX \n",
" EXT PARAMETER CURRENT GUESS STEP FIRST \n",
" NO. NAME VALUE ERROR SIZE DERIVATIVE \n",
" 1 a1 7.50000e+00 1.00000e+02 0.00000e+00 8.25702e+04\n",
" 2 a2 -1.50000e+00 1.00000e+02 0.00000e+00 2.24355e+05\n",
" 3 mass 1.24328e+02 4.00000e+00 0.00000e+00 4.66358e+01\n",
" 4 nbackground 1.00000e+04 3.46150e+03 0.00000e+00 -9.08535e+04\n",
" 5 nsignal 3.12951e+02 5.00000e+01 3.64250e-01 -6.58428e+02\n",
" 6 width 1.00000e+00 2.50000e-01 0.00000e+00 -1.31987e+02\n",
" ERR DEF= 0.5\n",
" MIGRAD MINIMIZATION HAS CONVERGED.\n",
" MIGRAD WILL VERIFY CONVERGENCE AND ERROR MATRIX.\n",
" COVARIANCE MATRIX CALCULATED SUCCESSFULLY\n",
" FCN=-13661.8 FROM MIGRAD STATUS=CONVERGED 393 CALLS 394 TOTAL\n",
" EDM=2.73773e-06 STRATEGY= 1 ERROR MATRIX ACCURATE \n",
" EXT PARAMETER STEP FIRST \n",
" NO. NAME VALUE ERROR SIZE DERIVATIVE \n",
" 1 a1 8.62159e+00 7.99131e-01 6.71813e-06 3.95343e+00\n",
" 2 a2 -2.02240e+00 3.02257e-01 2.53440e-06 1.05738e+01\n",
" 3 mass 1.24043e+02 5.15995e-01 2.10145e-03 -5.18608e-02\n",
" 4 nbackground 3.04968e+04 1.94972e+02 1.62355e-04 1.89450e-01\n",
" 5 nsignal 2.73230e+02 8.78269e+01 1.19000e-02 4.11368e-03\n",
" 6 width 1.48353e+00 4.52588e-01 1.61020e-02 1.06792e-03\n",
" ERR DEF= 0.5\n",
" EXTERNAL ERROR MATRIX. NDIM= 25 NPAR= 6 ERR DEF=0.5\n",
" 6.386e-01 -2.412e-01 4.880e-03 -2.017e+01 -2.020e+01 5.434e-02 \n",
" -2.412e-01 9.136e-02 -1.149e-03 7.853e+00 7.866e+00 -2.129e-02 \n",
" 4.880e-03 -1.149e-03 2.663e-01 8.075e+00 8.083e+00 -5.954e-02 \n",
" -2.017e+01 7.853e+00 8.075e+00 3.801e+04 7.529e+03 -2.255e+01 \n",
" -2.020e+01 7.866e+00 8.083e+00 7.529e+03 7.816e+03 -2.257e+01 \n",
" 5.434e-02 -2.129e-02 -5.954e-02 -2.255e+01 -2.257e+01 2.090e-01 \n",
" PARAMETER CORRELATION COEFFICIENTS \n",
" NO. GLOBAL 1 2 3 4 5 6\n",
" 1 0.99864 1.000 -0.999 0.012 -0.129 -0.286 0.149\n",
" 2 0.99865 -0.999 1.000 -0.007 0.133 0.294 -0.154\n",
" 3 0.26833 0.012 -0.007 1.000 0.080 0.177 -0.252\n",
" 4 0.43699 -0.129 0.133 0.080 1.000 0.437 -0.253\n",
" 5 0.66859 -0.286 0.294 0.177 0.437 1.000 -0.558\n",
" 6 0.57982 0.149 -0.154 -0.252 -0.253 -0.558 1.000\n",
" **********\n",
" ** 7 **SET ERR 0.5\n",
" **********\n",
" **********\n",
" ** 8 **SET PRINT 1\n",
" **********\n",
" **********\n",
" ** 9 **HESSE 3000\n",
" **********\n",
" COVARIANCE MATRIX CALCULATED SUCCESSFULLY\n",
" FCN=-13661.8 FROM HESSE STATUS=OK 40 CALLS 434 TOTAL\n",
" EDM=2.78595e-06 STRATEGY= 1 ERROR MATRIX ACCURATE \n",
" EXT PARAMETER INTERNAL INTERNAL \n",
" NO. NAME VALUE ERROR STEP SIZE VALUE \n",
" 1 a1 8.62159e+00 8.80308e-01 1.34363e-06 1.72440e-02\n",
" 2 a2 -2.02240e+00 3.33009e-01 5.06880e-07 -4.04480e-03\n",
" 3 mass 1.24043e+02 5.19935e-01 8.40581e-05 -3.02425e-01\n",
" 4 nbackground 3.04968e+04 1.96490e+02 3.24710e-05 -9.61368e-01\n",
" 5 nsignal 2.73230e+02 9.09340e+01 4.76002e-04 3.61233e+00\n",
" 6 width 1.48353e+00 4.67716e-01 6.44080e-04 -5.97862e-01\n",
" ERR DEF= 0.5\n",
" EXTERNAL ERROR MATRIX. NDIM= 25 NPAR= 6 ERR DEF=0.5\n",
" 7.749e-01 -2.928e-01 2.922e-03 -2.578e+01 -2.578e+01 7.361e-02 \n",
" -2.928e-01 1.109e-01 -3.724e-04 9.988e+00 9.989e+00 -2.867e-02 \n",
" 2.922e-03 -3.724e-04 2.704e-01 9.088e+00 9.089e+00 -6.868e-02 \n",
" -2.578e+01 9.988e+00 9.088e+00 3.861e+04 8.112e+03 -2.562e+01 \n",
" -2.578e+01 9.989e+00 9.089e+00 8.112e+03 8.386e+03 -2.563e+01 \n",
" 7.361e-02 -2.867e-02 -6.868e-02 -2.562e+01 -2.563e+01 2.235e-01 \n",
" PARAMETER CORRELATION COEFFICIENTS \n",
" NO. GLOBAL 1 2 3 4 5 6\n",
" 1 0.99888 1.000 -0.999 0.006 -0.149 -0.320 0.177\n",
" 2 0.99889 -0.999 1.000 -0.002 0.153 0.328 -0.182\n",
" 3 0.29339 0.006 -0.002 1.000 0.089 0.191 -0.279\n",
" 4 0.45101 -0.149 0.153 0.089 1.000 0.451 -0.276\n",
" 5 0.69618 -0.320 0.328 0.191 0.451 1.000 -0.592\n",
" 6 0.61576 0.177 -0.182 -0.279 -0.276 -0.592 1.000\n",
"[#1] INFO:Minization -- RooMinimizer::optimizeConst: deactivating const optimization\n"
]
}
],
"source": [
"auto r = model->fitTo(dataset, RooFit::Minimizer(\"Minuit\"),RooFit::Save(true), RooFit::Offset(true));"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Visualisation of the Result\n"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[#1] INFO:NumericIntegration -- RooRealIntegral::init(bmodel_Int[x]) using numeric integrator RooIntegrator1D to calculate Int(x)\n",
"[#1] INFO:Plotting -- RooAbsPdf::plotOn(model) directly selected PDF components: (bmodel)\n",
"[#1] INFO:Plotting -- RooAbsPdf::plotOn(model) indirectly selected PDF components: (z)\n",
"[#1] INFO:NumericIntegration -- RooRealIntegral::init(bmodel_Int[x]) using numeric integrator RooIntegrator1D to calculate Int(x)\n",
"[#1] INFO:Plotting -- RooAbsPdf::plotOn(model) directly selected PDF components: (smodel)\n",
"[#1] INFO:Plotting -- RooAbsPdf::plotOn(model) indirectly selected PDF components: ()\n",
"[#1] INFO:NumericIntegration -- RooRealIntegral::init(bmodel_Int[x]) using numeric integrator RooIntegrator1D to calculate Int(x)\n"
]
}
],
"source": [
"auto plot = x->frame();\n",
"dataset.plotOn(plot);\n",
"model->plotOn(plot);\n",
"model->plotOn(plot, RooFit::Components(\"bmodel\"),RooFit::LineStyle(kDashed));\n",
"model->plotOn(plot, RooFit::Components(\"smodel\"),RooFit::LineColor(kRed));\n",
"model->paramOn(plot);"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
\n",
"
\n",
"\n",
"\n"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%jsroot on\n",
"TCanvas c;\n",
"plot->Draw();\n",
"c.Draw();"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Persistification of the Model on Disk\n",
"The ROOT I/O is used to write on disk full RooFit models in order to ease their sharing among scientists, also from different experiments - this establishes an important common language which allows to compare and combine analyses."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"w.writeToFile(\"HiggsModel.root\",true);"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "ROOT C++",
"language": "c++",
"name": "root"
},
"language_info": {
"codemirror_mode": "text/x-c++src",
"file_extension": ".C",
"mimetype": " text/x-c++src",
"name": "c++"
}
},
"nbformat": 4,
"nbformat_minor": 0
}