{ "metadata": { "name": "" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "heading", "level": 1, "metadata": {}, "source": [ "RooFit Basics" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The RooFit package allows to model the expected distribution of events in a physics analysis. Models can be used to perform likelihood fits, produce plots, and generate \"toy Monte Carlo\" samples for various studies. For more information: http://roofit.sourceforge.net/." ] }, { "cell_type": "code", "collapsed": false, "input": [ "import ROOT\n", "import rootnotes\n", "import rootprint\n", "import utils\n", "from ProgressBar import ProgressBar\n", "import math\n", "from ROOT import RooRealVar, RooFormulaVar, RooVoigtian, RooChebychev, RooArgList, \\\n", " RooArgSet, RooAddPdf, RooDataSet, RooCategory, RooSimultaneous, \\\n", " RooBreitWigner, RooCBShape, RooFFTConvPdf" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 1 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The tree we are reading stores some simple custom classes. The following command allows ROOT to know how the structure of these classes and access their data members. Uncomment the second line to print the content of the Loader.C file." ] }, { "cell_type": "code", "collapsed": false, "input": [ "ROOT.gROOT.LoadMacro(\"Loader.C+\")\n", "# print open(\"Loader.C\").read()" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 2, "text": [ "0" ] } ], "prompt_number": 2 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Load the input file" ] }, { "cell_type": "code", "collapsed": false, "input": [ "tree = ROOT.TChain(\"T\")\n", "tree.Add(\"/afs/cern.ch/user/d/demattia/work/public/TagAndProbe_ZMuMu.root\")" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 3, "text": [ "1" ] } ], "prompt_number": 3 }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Define RooFit variables and PDFs" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Define pdf for fitting $Z\\rightarrow\\mu\\mu$" ] }, { "cell_type": "code", "collapsed": false, "input": [ "mass = RooRealVar(\"mass\", \"mass\", 60, 120)\n", "# Construct signal pdf\n", "mean = RooRealVar(\"mean\", \"mean\", 90, 80, 100)\n", "width = RooRealVar(\"width\", \"width\", 2.4952, 1, 3)\n", "width.setConstant(ROOT.kTRUE)\n", "sigma = RooRealVar(\"sigma\", \"sigma\", 1.2, 0.2, 2)\n", "signal = RooVoigtian(\"signal\", \"signal\", mass, mean, width, sigma)\n", "# Construct background pdf\n", "a0 = RooRealVar(\"a0\",\"a0\",0.1,-1,1)\n", "a1 = RooRealVar(\"a1\",\"a1\",0.004,-1,1)\n", "background = RooChebychev(\"background\",\"background\",mass,RooArgList(a0,a1))\n", "# Construct composite pdf\n", "nsig = RooRealVar(\"nsig\", \"nsig\", 2000, 0, 100000)\n", "nbkg = RooRealVar(\"nbkg\", \"nbkg\", 100, 0, 1000)\n", "model = RooAddPdf(\"model\", \"model\", RooArgList(signal, background), RooArgList(nsig, nbkg))\n", "\n", "\n", "# alpha = RooRealVar(\"alpha\", \"alpha\", 80, 60, 100)\n", "# CBn = RooRealVar(\"n\", \"n\", 1, -10, 10)\n", "# BW = RooBreitWigner(\"BW\",\"Breit Wigner theory\",mass,mean,width)\n", "# cryBall = RooCBShape(\"cryBall\",\"Crystal Ball resolution model\", mass, mean, sigma, alpha, CBn)\n", "# model = RooAddPdf(\"model\", \"model\", RooArgList(cryBall, background), RooArgList(nsig, nbkg))" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 4 }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Prepare datasets" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Define binning" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import bisect\n", "\n", "ptBins = [20, 40, 70]\n", "\n", "# Returns the bins corresponding to the given pt values\n", "def findBins(pt1, pt2):\n", " return (bisect.bisect_right(ptBins, pt1)-1, bisect.bisect_right(ptBins, pt2)-1)\n", "\n", "# This returns the position in an single dimension array given the bins\n", "def findPosition(bin1, bin2):\n", " return bin1+(len(ptBins)-1)*bin2" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 5 }, { "cell_type": "code", "collapsed": false, "input": [ "datasetAllMap = {}\n", "for ptBin1 in range(0, len(ptBins)):\n", " for ptBin2 in range(0, len(ptBins)):\n", " datasetAllMap[(ptBin1, ptBin2)] = RooDataSet(\"datasetAll_\"+str(ptBin1)+\"_\"+str(ptBin2),\n", " \"datasetAll_\"+str(ptBin1)+\"_\"+str(ptBin2), RooArgSet(mass))" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 6 }, { "cell_type": "code", "collapsed": false, "input": [ "hAllMap = {}\n", "for ptBin1 in range(0, len(ptBins)):\n", " for ptBin2 in range(0, len(ptBins)):\n", " hAllMap[(ptBin1,ptBin2)] = ROOT.TH1F(\"hAll\"+\"_\"+str(ptBin1)+\"_\"+str(ptBin2),\n", " \"All events passing old trigger \"+str(ptBin1)+\"_\"+str(ptBin2), 100, 60, 120)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 7 }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Event loop" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Utility functions" ] }, { "cell_type": "code", "collapsed": false, "input": [ "muMass = 0.1057\n", "\n", "def passSelection(track):\n", " if ( track.pt > 26 and abs(track.eta) < 2 and track.isolation/track.pt < 0.1 and track.trackerLayersWithMeasurement >= 6\n", " and track.trackQuality and track.dxy < 30. and track.dz < 30. ):\n", " return True\n", "\n", "def passDileptonSelection(track1, track2):\n", " if track1.charge == track2.charge: return False\n", " if utils.deltaR(track1.phi, track1.eta, track2.phi, track2.eta) < 0.2: return False\n", " return True\n", "\n", "def computeMass(mu1, mu2):\n", " muon1 = ROOT.TLorentzVector()\n", " muon2 = ROOT.TLorentzVector()\n", " muon1.SetPtEtaPhiM(mu1.pt, mu1.eta, mu1.phi, muMass)\n", " muon2.SetPtEtaPhiM(mu2.pt, mu2.eta, mu2.phi, muMass)\n", " return (muon1+muon2).M()\n", "\n", "def fillSingleCandidate(track1, track2, histoMap, datasetMap):\n", " if not passDileptonSelection(track1, track2): return False\n", " mass.setVal(computeMass(track1,track2))\n", " histoMap[findBins(track1.pt, track2.pt)].Fill(mass.getVal())\n", " datasetMap[findBins(track1.pt, track2.pt)].add(RooArgSet(mass))\n", " return True\n", "\n", "def fillCandidates(tracks, histoMap, datasetMap):\n", " if len(tracks) == 2:\n", " fillSingleCandidate(tracks[0], tracks[1], histoMap, datasetMap)\n", " elif len(tracks) > 2:\n", " for i in range(0, len(tracks)):\n", " for j in range(i+1, len(tracks)):\n", " fillSingleCandidate(tracks[i], tracks[j], histoMap, datasetMap)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 8 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Event loop" ] }, { "cell_type": "code", "collapsed": false, "input": [ "%%rootprint\n", "allCandidates = 0\n", "passCandidates = 0\n", "\n", "processedEvents = 0\n", "\n", "maxEvents = 10000\n", "\n", "# p = ProgressBar(maxEvents)\n", "for event in tree:\n", " if processedEvents > maxEvents:\n", " break\n", "\n", " # p.animate(processedEvents)\n", " processedEvents += 1\n", " \n", " fillCandidates(event.muons, hAllMap, datasetAllMap)\n", "\n", "for bin1 in range(len(ptBins)-1):\n", " for bin2 in range(len(ptBins)-1):\n", " print \"candidates[\"+str(ptBins[bin1])+\"-\"+str(ptBins[bin1+1])+\\\n", " \", \"+str(ptBins[bin2])+\"-\"+str(ptBins[bin2+1])+\"] =\",\\\n", " hAllMap[bin1,bin2].GetEntries()" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 19 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Draw the histograms. Note how the code running on the events and the code drawing the plots is separated in two different cells. This way it is very fast to do style changes to the plots and redraw them quickly." ] }, { "cell_type": "code", "collapsed": false, "input": [ "canvas = rootnotes.canvas(\"AllCanvas\", (800, 800))\n", "canvas.Divide(len(ptBins)-1,len(ptBins)-1)\n", "for ptBin1 in range(0, len(ptBins)-1):\n", " for ptBin2 in range(0, len(ptBins)-1):\n", " canvas.cd(findPosition(ptBin1, ptBin2)+1)\n", " hAllMap[(ptBin1, ptBin2)].Draw()\n", "canvas" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 16 }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Fit the Invariant Mass Distributions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We fit the datasets. This is an extended unbinned maximum likelihood fit. Extended means that we fit also for a scale parameter multiplying the PDF, therefore obtaining an estimate for the total number of Z candidates." ] }, { "cell_type": "code", "collapsed": false, "input": [ "#%%rootprint\n", "# Perform a fit of model to data\n", "\n", "# Use this for minos (better error estimate, but takes longer)\n", "# fr = model.fitTo(combData, ROOT.RooFit.Save(ROOT.kTRUE), ROOT.RooFit.Extended(ROOT.kTRUE), ROOT.RooFit.Minos(ROOT.kTRUE))\n", "\n", "# fr = model.fitTo(combData, ROOT.RooFit.Save(ROOT.kTRUE), ROOT.RooFit.Extended(ROOT.kTRUE))" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 11 }, { "cell_type": "code", "collapsed": false, "input": [ "frMap = {}\n", "# Construct dataset in (x,sample)\n", "for ptBin1 in range(0, len(ptBins)-1):\n", " for ptBin2 in range(0, len(ptBins)-1):\n", " frMap[(ptBin1, ptBin2)] = model.fitTo(datasetAllMap[(ptBin1, ptBin2)],\n", " ROOT.RooFit.Save(ROOT.kTRUE),\n", " ROOT.RooFit.Extended(ROOT.kTRUE))\n", " # Use this to run with Minos enabled. Minos allows for a more precise esimation of the parameter errors,\n", " # but the fit will take longer.\n", " # frMap[(ptBin1, ptBin2)] = model.fitTo(datasetAllMap[(ptBin1, ptBin2)],\n", " # ROOT.RooFit.Save(ROOT.kTRUE),\n", " # ROOT.RooFit.Extended(ROOT.kTRUE),\n", " # ROOT.RooFit.Minos(ROOT.kTRUE))" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 12 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Print the results of each fit showing the parameters and the outcome." ] }, { "cell_type": "code", "collapsed": false, "input": [ "%%rootprint\n", "for ptBin1 in range(0, len(ptBins)-1):\n", " for ptBin2 in range(0, len(ptBins)-1):\n", " frMap[(ptBin1, ptBin2)].Print(\"v\")" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 17 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Plot the fit results. The first cell contains a function to simplify the loop in the second cell." ] }, { "cell_type": "code", "collapsed": false, "input": [ "def plotResults(ptBin1, ptBin2, Data, canvas2):\n", " frame2 = mass.frame(ROOT.RooFit.Bins(30),ROOT.RooFit.Title(\"All events\"))\n", " # Plot all data tagged as physics sample\n", " Data.plotOn(frame2)\n", " model.plotOn(frame2)\n", " model.plotOn(frame2, ROOT.RooFit.Components(\"background\"), ROOT.RooFit.LineStyle(ROOT.kDashed))\n", "\n", " canvas2.cd(findPosition(ptBin1, ptBin2)+1)\n", " frame2.GetYaxis().SetTitleOffset(1.4)\n", " frame2.Draw()" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 14 }, { "cell_type": "code", "collapsed": false, "input": [ "# Canvases for fit results\n", "canvas2 = ROOT.TCanvas(\"RooFitCanvas\", \"RooFitCanvas\", 800, 800)\n", "canvas2.Divide(len(ptBins)-1,len(ptBins)-1)\n", "\n", "for ptBin1 in range(0, len(ptBins)-1):\n", " for ptBin2 in range(0, len(ptBins)-1):\n", " plotResults(ptBin1, ptBin2, datasetAllMap[(ptBin1,ptBin2)], canvas2)\n", "\n", "canvas2\n", "#canvas2.Print(\"allFits.pdf\")" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 20 }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Question 1**: Repeat the fit using Minos. What is the difference?" ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 15 }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Question 2**: Remove the background from the pdf and redo the fit. What happens to the precision of the parameters?" ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 15 } ], "metadata": {} } ] }