"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Machine Learning for $t\\bar{t}Z$ Opposite-sign dilepton analysis \n",
"This notebook uses ATLAS Open Data http://opendata.atlas.cern to show you the steps to implement Machine Learning in the $t\\bar{t}Z$ Opposite-sign dilepton analysis, following the ATLAS published paper [Measurement of the $t\\bar{t}Z$ and $t\\bar{t}W$ cross sections in proton-proton collisions at $\\sqrt{s}$ = 13 TeV with the ATLAS detector](https://journals.aps.org/prd/pdf/10.1103/PhysRevD.99.072009).\n",
"\n",
"The whole notebook takes a few hours to follow through.\n",
"\n",
"Notebooks are web applications that allow you to create and share documents that can contain for example:\n",
"1. live code\n",
"2. visualisations\n",
"3. narrative text\n",
"\n",
"Notebooks are a perfect platform to develop Machine Learning for your work, since you'll need exactly those 3 things: code, visualisations and narrative text!\n",
"\n",
"We're interested in Machine Learning because we can design an algorithm to figure out for itself how to do various analyses, potentially saving us countless human-hours of design and analysis work.\n",
"\n",
"Machine Learning use within ATLAS includes: \n",
"* particle tracking\n",
"* particle identification\n",
"* signal/background classification\n",
"* and more!\n",
"\n",
"This notebook will focus on signal/background classification.\n",
"\n",
"By the end of this notebook you will be able to:\n",
"1. run Boosted Decision Trees to classify signal and background\n",
"2. know some things you can change to improve your Boosted Decision Tree\n",
"\n",
"Feynman diagram pictures are borrowed from our friends at https://www.particlezoo.net"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"
"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Introduction (from Section 1)\n",
"\n",
"Properties of the top quark have been explored by the\n",
"Large Hadron Collider (LHC) and previous collider experiments in great detail. \n",
"\n",
"Other properties of the top quark are\n",
"now becoming accessible, owing to the large centerof-mass energy and luminosity at the LHC.\n",
"\n",
"Measurements of top-quark pairs in association with a Z boson ($t\\bar{t}Z$) provide a direct probe of the\n",
"weak couplings of the top quark. These couplings\n",
"may be modified in the presence of physics beyond the\n",
"Standard Model (BSM). Measurements of the $t\\bar{t}Z$ production cross sections, $\\sigma_{t\\bar{t}Z}$, can be used to\n",
"set constraints on the weak couplings of the top quark. \n",
"\n",
"The production of $t\\bar{t}Z$ is often an important\n",
"background in searches involving final states with multiple\n",
"leptons and b-quarks. These processes also constitute an\n",
"important background in measurements of the associated\n",
"production of the Higgs boson with top quarks.\n",
"\n",
"This paper presents measurements of the $t\\bar{t}Z$ cross section using proton–proton (pp) collision data\n",
"at a center-of mass energy $\\sqrt{s} = 13 TeV.\n",
"\n",
"The final states of top-quark pairs produced in association with a\n",
"Z boson contain up to four isolated, prompt leptons. In this analysis, events with two opposite-sign\n",
"(OS) leptons are considered. The dominant backgrounds\n",
"in this channel are Z+jets and $t\\bar{t}$, \n",
"\n",
"(In this paper, lepton is used to denote electron or muon, and prompt lepton is used to denote a lepton produced in a Z or W\n",
"boson decay, or in the decay of a τ-lepton which arises from a Z or W boson decay.)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Data and simulated samples (from Section 3)\n",
"\n",
"The data were collected with the ATLAS detector at a proton–proton (pp) collision\n",
"energy of 13 TeV. \n",
"\n",
"Monte Carlo (MC) simulation samples are used to model the expected signal and background distributions\n",
"in the different control, validation and signal regions described below. All samples were processed through the\n",
"same reconstruction software as used for the data. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Opposite-sign dilepton analysis (from Section 5A)\n",
"\n",
"The OS dilepton analysis targets the $t\\bar{t}Z$ process, where both top quarks decay hadronically and the Z boson\n",
"decays to a pair of leptons (electrons or muons). Events are required to have exactly two OSSF leptons.\n",
"Events with additional isolated leptons are rejected. The invariant mass of the lepton pair is required to be in the Z boson mass window, |$m_{ll} − m_Z$| < 10 GeV. The leading (subleading) lepton is required to have a\n",
"transverse momentum of at least 30 (15) GeV.\n",
"\n",
"The OS dilepton analysis is affected by large backgrounds from Z+jets or $t\\bar{t}$ production, both characterized\n",
"by the presence of two leptons. In order to improve the signal-to-background ratio and constrain these\n",
"backgrounds from data, three separate analysis regions are considered, depending on the number of\n",
"jets ($n_{jets}$) and number of b-tagged jets ($n_{b-tags}$): 2l-Z-5j2b, 2l-Z-6j1b and 2l-Z-6j2b. The signal region\n",
"requirements are summarized in Table 1 below."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"| Variable | 2l-Z-6j1b | 2l-Z-5j2b | 2l-Z-6j2b |\n",
"|------|------|------|------|\n",
"| Leptons | = 2, same flavour and opposite sign | = 2, same flavour and opposite sign | = 2, same flavour and opposite sign |\n",
"| $m_{ll}$ | $|m_{ll} − m_Z |$ < 10 GeV | $|m_{ll} − m_Z |$ < 10 GeV | $|m_{ll} − m_Z |$ < 10 GeV |\n",
"| $p_T$ (leading lepton) | > 30 GeV | > 30 GeV | > 30 GeV |\n",
"| $p_T$ (subleading lepton) | > 15 GeV | > 15 GeV | > 15 GeV |\n",
"| $n_{b-tags}$ | 1 | $\\geq$2 | $\\geq$2 |\n",
"| $n_{jets}$ | $\\geq$6 | 5 | $\\geq$6 |\n",
"\n",
"Table 1: Summary of the event selection requirements in the OS dilepton signal regions.\n",
"\n",
"This is Table 2 of the ATLAS published paper [Measurement of the $t\\bar{t}Z$ and $t\\bar{t}W$ cross sections in proton-proton collisions at $\\sqrt{s}$ = 13 TeV with the ATLAS detector](https://journals.aps.org/prd/pdf/10.1103/PhysRevD.99.072009).\n",
"\n",
"In signal region 2l-Z-5j2b, exactly five jets\n",
"are required, of which at least two must be b-tagged. In 2l-Z-6j1b (2l-Z-6j2b), at least six jets are required with\n",
"exactly one (at least two) being b-tagged jets."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Contents: \n",
"\n",
"[Running a Jupyter notebook](#running) \n",
"[To setup first time](#setupfirsttime) \n",
"[To setup everytime](#setupeverytime) \n",
" [Lumi, fraction, file path](#fraction) \n",
" [Samples to process](#samples_to_process) \n",
" [Get data from files](#get_data_from_files) \n",
" [Find the good jets!](#good_jets) \n",
" [Find the good leptons!](#good_leptons) \n",
" [Let's calculate some variables](#calc_variables) \n",
" [Load data](#load_data) \n",
" [Samples to plot](#samples_SR) \n",
" [Function to plot Data/MC](#plot_data) \n",
"\n",
"[Boosted Decision Tree (BDT) in 6j2b Region](#BDT_6j2b) \n",
" [Training and Testing split](#BDT_6j2b_train_test_split) \n",
" [Training](#BDT_6j2b_training) \n",
" [Signal Region plot](#plot_6j2b_SR) \n",
" \n",
"[Boosted Decision Tree (BDT) in 5j2b Region](#BDT_5j2b) \n",
" [Training and Testing split](#BDT_5j2b_train_test_split) \n",
" [Training](#BDT_5j2b_training) \n",
" [Signal Region plot](#plot_5j2b_SR) \n",
" \n",
"[Boosted Decision Tree (BDT) in 6j1b Region](#BDT_6j1b) \n",
" [Training and Testing split](#BDT_6j1b_train_test_split) \n",
" [Training](#BDT_6j1b_training) \n",
" [Signal Region plot](#plot_6j1b_SR) \n",
"\n",
"[Control Region plots](#data_CR) \n",
" [6j2b Control Region plot](#plot_6j2b_CR) \n",
" [5j2b Control Region plot](#plot_5j2b_CR) \n",
" [6j1b Control Region plot](#plot_6j1b_CR) \n",
"\n",
"[Data-driven ttbar estimate](#data_driven) \n",
"[Function to plot data from histograms](#plot_data_from_hist) \n",
" [6j2b Signal Region plot](#6j2b_data_driven) \n",
" [5j2b Signal Region plot](#5j2b_data_driven) \n",
" [6j1b Signal Region plot](#6j1b_data_driven) \n",
" \n",
"[BDT feature importances](#BDT_feature_importances) \n",
" \n",
"[Going further](#going_further) "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Running a Jupyter notebook\n",
"\n",
"To run the whole Jupyter notebook, in the top menu click Cell -> Run All.\n",
"\n",
"To propagate a change you've made to a piece of code, click Cell -> Run All Below.\n",
"\n",
"You can also run a single code cell, by using the keyboard shortcut Shift+Enter."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[Back to contents](#contents)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## First time setup on your computer (no need on mybinder)\n",
"This first cell only needs to be run the first time you open this notebook on your computer. \n",
"\n",
"If you close Jupyter and re-open on the same computer, you won't need to run this first cell again.\n",
"\n",
"If you open on mybinder, you don't need to run this cell."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Requirement already satisfied: pip in /Users/me338/.local/lib/python3.8/site-packages (21.0.1)\n",
"Collecting uproot3\n",
" Downloading uproot3-3.14.4-py3-none-any.whl (117 kB)\n",
"\u001b[K |████████████████████████████████| 117 kB 2.9 MB/s eta 0:00:01\n",
"\u001b[?25hRequirement already satisfied: pandas in /Users/me338/.local/lib/python3.8/site-packages (1.1.4)\n",
"Collecting pandas\n",
" Downloading pandas-1.2.3-cp38-cp38-macosx_10_9_x86_64.whl (10.5 MB)\n",
"\u001b[K |████████████████████████████████| 10.5 MB 10.2 MB/s eta 0:00:01 |████████████▋ | 4.1 MB 10.2 MB/s eta 0:00:01\n",
"\u001b[?25hRequirement already satisfied: sklearn in /Users/me338/.local/lib/python3.8/site-packages (0.0)\n",
"Requirement already satisfied: numpy>=1.16.5 in /Users/me338/.local/lib/python3.8/site-packages (from pandas) (1.19.2)\n",
"Requirement already satisfied: pytz>=2017.3 in /opt/anaconda3/lib/python3.8/site-packages (from pandas) (2020.1)\n",
"Requirement already satisfied: python-dateutil>=2.7.3 in /opt/anaconda3/lib/python3.8/site-packages (from pandas) (2.8.1)\n",
"Requirement already satisfied: six>=1.5 in /opt/anaconda3/lib/python3.8/site-packages (from python-dateutil>=2.7.3->pandas) (1.15.0)\n",
"Requirement already satisfied: scikit-learn in /opt/anaconda3/lib/python3.8/site-packages (from sklearn) (0.23.1)\n",
"Collecting awkward0\n",
" Downloading awkward0-0.15.5-py3-none-any.whl (87 kB)\n",
"\u001b[K |████████████████████████████████| 87 kB 12.2 MB/s eta 0:00:01\n",
"\u001b[?25hCollecting uproot3-methods\n",
" Downloading uproot3_methods-0.10.0-py3-none-any.whl (32 kB)\n",
"Requirement already satisfied: cachetools in /Users/me338/.local/lib/python3.8/site-packages (from uproot3) (4.1.1)\n",
"Requirement already satisfied: threadpoolctl>=2.0.0 in /opt/anaconda3/lib/python3.8/site-packages (from scikit-learn->sklearn) (2.1.0)\n",
"Requirement already satisfied: scipy>=0.19.1 in /opt/anaconda3/lib/python3.8/site-packages (from scikit-learn->sklearn) (1.5.0)\n",
"Requirement already satisfied: joblib>=0.11 in /opt/anaconda3/lib/python3.8/site-packages (from scikit-learn->sklearn) (0.16.0)\n",
"Installing collected packages: awkward0, uproot3-methods, uproot3, pandas\n",
" Attempting uninstall: pandas\n",
" Found existing installation: pandas 1.1.4\n",
" Uninstalling pandas-1.1.4:\n",
" Successfully uninstalled pandas-1.1.4\n",
"Successfully installed awkward0-0.15.5 pandas-1.2.3 uproot3-3.14.4 uproot3-methods-0.10.0\n"
]
}
],
"source": [
"import sys\n",
"!{sys.executable} -m pip install --upgrade --user pip # update the pip package installer\n",
"!{sys.executable} -m pip install -U uproot3 pandas sklearn --user # install required packages"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[Back to contents](#contents)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## To setup everytime\n",
"Cell -> Run All Below\n",
"\n",
"to be done every time you re-open this notebook.\n",
"\n",
"We're going to be using a number of tools to help us:\n",
"* uproot: lets us read .root files typically used in particle physics into data formats used in Machine Learning\n",
"* pandas: lets us store data as dataframes, a format widely used in Machine Learning\n",
"* numpy: provides numerical calculations such as histogramming\n",
"* matplotlib: common tool for making plots, figures, images, visualisations"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"import pandas as pd # to store data as dataframes\n",
"import numpy as np # for numerical calculations such as histogramming\n",
"import uproot3 # to read .root files as dataframes\n",
"import matplotlib.pyplot as plt # for plotting\n",
"from matplotlib.lines import Line2D # for dashed line in legend\n",
"from matplotlib.ticker import AutoMinorLocator,MaxNLocator # for minor ticks and forcing integer tick labels\n",
"\n",
"import infofile # local file containing cross-sections and sum of weights"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Lumi, fraction, file path\n",
"\n",
"General definitions of fraction of data used, where to access the input files.\n",
"\n",
"The notebook will run quicker if you download the input files to your computer, rather than reading them through the web. \n",
"\n",
"1. download by clicking https://atlas-opendata.web.cern.ch/atlas-opendata/samples/2020/exactly2lep/\n",
"2. Make a folder called exactly2lep inside notebooks-collection-opendata/13-TeV-examples/uproot_python/\n",
"3. Unzip the downloaded file in \"exactly2lep\"\n",
"4. Uncomment tuple_path = \"exactly2lep/\"\n",
"5. Comment out tuple_path = \"https://atlas-opendata.web.cern.ch/atlas-opendata/samples/2020/exactly2lep/\""
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"#lumi = 0.5 # 0.5 fb-1 for data_A\n",
"#lumi = 1.9 # 1.9 fb-1 for data_B\n",
"#lumi = 2.9 # 2.9 fb-1 for data_C\n",
"#lumi = 4.7 # 4.7 fb-1 for data_D\n",
"#lumi = 9.5 # 9.5 fb-1 for data_B+C+D\n",
"lumi = 10 # 10 fb-1 for data_A+B+C+D\n",
"\n",
"fraction = 1 # fraction of data to use, increase this for better results\n",
"MC_to_data_ratio = 1 # fraction of simulated data to use, increase this for better results\n",
" \n",
"tuple_path = \"exactly2lep/\" # local \n",
"#tuple_path = \"https://atlas-opendata.web.cern.ch/atlas-opendata/samples/2020/exactly2lep/\" # web address"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[Back to contents](#contents)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Samples to process\n",
"\n",
"In this notebook we only process the background samples that have a significant contribution. You can add in the uncommneted background samples later if you wish."
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [],
"source": [
"samples = {\n",
"\n",
" 'data': {\n",
" 'list' : [\n",
" 'data_A', # period A from data16\n",
" 'data_B', # period B from data16\n",
" 'data_C', # period C from data16\n",
" 'data_D' # period D from data16\n",
" ] \n",
" },\n",
" \n",
" r'$t\\bar{t}Z$' : { # ttZ(->ee) and ttZ(->μμ) signal (associated production of a top-quark pair with one vector boson)\n",
" 'list' : [\n",
" 'ttee',\n",
" 'ttmumu'\n",
" ]\n",
" },\n",
"\n",
" r'$t\\bar{t}$' : { # ttbar semileptonic / all-leptonic\n",
" 'list' : ['ttbar_lep']\n",
" },\n",
" \n",
" 'Z' : { # Z+jets (Events containing Z bosons with associated jets) \n",
" 'list' : [\n",
" 'Zmumu_PTV0_70_CVetoBVeto', # Z->μμ + jets with 0 < pT(Z) < 70 GeV whilst vetoing c and b-jets\n",
" 'Zmumu_PTV0_70_CFilterBVeto', # Z->μμ + jets with 0 < pT(Z) < 70 GeV and a requirement for c-jets, whilst vetoing b-jets\n",
" 'Zmumu_PTV0_70_BFilter', # Z->μμ + jets with 0 < pT(Z) < 70 GeV and a requirement for b-jets\n",
" 'Zmumu_PTV70_140_CVetoBVeto', # Z->μμ + jets with 70 < pT(Z) < 140 GeV whilst vetoing c and b-jets\n",
" 'Zmumu_PTV70_140_CFilterBVeto', # Z->μμ + jets with 70 < pT(Z) < 140 GeV and a requirement for c-jets, whilst vetoing b-jets\n",
" 'Zmumu_PTV70_140_BFilter', # Z->μμ + jets with 70 < pT(Z) < 140 GeV and a requirement for b-jets\n",
" 'Zmumu_PTV140_280_CVetoBVeto', # Z->μμ + jets with 140 < pT(Z) < 280 GeV whilst vetoing c and b-jets\n",
" 'Zmumu_PTV140_280_CFilterBVeto', # Z->μμ + jets with 140 < pT(Z) < 280 GeV and a requirement for c-jets, whilst vetoing b-jets\n",
" 'Zmumu_PTV140_280_BFilter', # Z->μμ + jets with 140 < pT(Z) < 280 GeV and a requirement for b-jets\n",
" 'Zmumu_PTV280_500_CVetoBVeto', # Z->μμ + jets with 280 < pT(Z) < 500 GeV whilst vetoing c and b-jets\n",
" 'Zmumu_PTV280_500_CFilterBVeto', # Z->μμ + jets with 280 < pT(Z) < 500 GeV and a requirement for c-jets, whilst vetoing b-jets\n",
" 'Zmumu_PTV280_500_BFilter', # Z->μμ + jets with 280 < pT(Z) < 500 GeV and a requirement for b-jets\n",
" 'Zmumu_PTV500_1000', # Z->μμ + jets, with 500 < pT(Z) < 1000 GeV \n",
" 'Zmumu_PTV1000_E_CMS', # Z->μμ + jets with 1000 GeV < pT(Z) < centre-of-mass energy\n",
" 'Zee_PTV0_70_CVetoBVeto', # Z->ee + jets with 0 < pT(Z) < 70 GeV whilst vetoing c and b-jets\n",
" 'Zee_PTV0_70_CFilterBVeto', # Z->ee + jets with 0 < pT(Z) < 70 GeV and a requirement for c-jets, whilst vetoing b-jets\n",
" 'Zee_PTV0_70_BFilter', # Z->ee + jets with 0 < pT(Z) < 70 GeV and a requirement for b-jets\n",
" 'Zee_PTV70_140_CVetoBVeto', # Z->ee + jets with 70 < pT(Z) < 140 GeV whilst vetoing c and b-jets\n",
" 'Zee_PTV70_140_CFilterBVeto', # Z->ee + jets with 70 < pT(Z) < 140 GeV and a requirement for c-jets, whilst vetoing b-jets\n",
" 'Zee_PTV70_140_BFilter', # Z->ee + jets with 70 < pT(Z) < 140 GeV and a requirement for b-jets\n",
" 'Zee_PTV140_280_CVetoBVeto', # Z->ee + jets with 140 < pT(Z) < 280 GeV whilst vetoing c and b-jets\n",
" 'Zee_PTV140_280_CFilterBVeto', # Z->ee + jets with 140 < pT(Z) < 280 GeV and a requirement for c-jets, whilst vetoing b-jets\n",
" 'Zee_PTV140_280_BFilter', # Z->ee + jets with 140 < pT(Z) < 280 GeV and a requirement for b-jets\n",
" 'Zee_PTV280_500_CVetoBVeto', # Z->μμ + jets with 280 < pT(Z) < 500 GeV whilst vetoing c and b-jets\n",
" 'Zee_PTV280_500_CFilterBVeto', # Z->ee + jets with 280 < pT(Z) < 500 GeV and a requirement for c-jets, whilst vetoing b-jets\n",
" 'Zee_PTV280_500_BFilter', # Z->ee + jets with 280 < pT(Z) < 500 GeV and a requirement for b-jets \n",
" 'Zee_PTV500_1000', # Z->ee + jets with 500 < pT(Z) < 1000 GeV\n",
" 'Zee_PTV1000_E_CMS', # Z->ee + jets with 1000 GeV < pT(Z) < centre-of-mass energy\n",
" ]\n",
" }, \n",
"\n",
" 'Other' : { # background with at least 2 prompt leptons, other than Z+jets and ttbar \n",
" 'list' : [\n",
" 'ZqqZll', # Z(->qq)Z(->ll)\n",
" 'single_top_wtchan', # Wt\n",
" 'single_antitop_wtchan', # Wt\n",
" 'lllv', # W(->lv)Z(->ll)\n",
" 'ttW',\n",
" 'llll', # ZZ->llll\n",
" 'llvv', # ZZ->llvv and WW->lvlv\n",
" 'Ztautau_PTV0_70_CVetoBVeto', # Z->ττ + jets with 0 < pT(Z) < 70 GeV whilst vetoing c and b-jets\n",
" 'Ztautau_PTV0_70_CFilterBVeto', # Z->ττ + jets with 0 < pT(Z) < 70 GeV and a requirement for c-jets, whilst vetoing b-jets\n",
" 'Ztautau_PTV0_70_BFilter', # Z->ττ + jets with 0 < pT(Z) < 70 GeV and a requirement for b-jets\n",
" 'Ztautau_PTV70_140_CVetoBVeto', # Z->μμ + jets with 70 < pT(Z) < 140 GeV whilst vetoing c and b-jets\n",
" 'Ztautau_PTV70_140_CFilterBVeto', # Z->ττ + jets with 70 < pT(Z) < 140 GeV and a requirement for c-jets, whilst vetoing b-jets\n",
" 'Ztautau_PTV70_140_BFilter', # Z->ττ + jets with 70 < pT(Z) < 140 GeV and a requirement for b-jets\n",
" 'Ztautau_PTV140_280_CVetoBVeto', # Z->μμ + jets with 140 < pT(Z) < 280 GeV whilst vetoing c and b-jets\n",
" 'Ztautau_PTV140_280_CFilterBVeto', # Z->ττ + jets with 140 < pT(Z) < 280 GeV and a requirement for c-jets, whilst vetoing b-jets\n",
" 'Ztautau_PTV140_280_BFilter', # Z->ττ + jets with 140 < pT(Z) < 280 GeV and a requirement for b-jets\n",
" 'Ztautau_PTV280_500_CVetoBVeto', # Z->μμ + jets with 280 < pT(Z) < 500 GeV whilst vetoing c and b-jets\n",
" 'Ztautau_PTV280_500_CFilterBVeto', # Z->ττ + jets with 280 < pT(Z) < 500 GeV and a requirement for c-jets, whilst vetoing b-jets\n",
" 'Ztautau_PTV280_500_BFilter', # Z->ττ + jets with 280 < pT(Z) < 500 GeV and a requirement for b-jets\n",
" 'Ztautau_PTV500_1000', # Z->ττ + jets with 500 < pT(Z) < 1000 GeV\n",
" 'Ztautau_PTV1000_E_CMS' # Z->ττ + jets with 1000 GeV < pT(Z) < centre-of-mass energy\n",
" ] \n",
" }, \n",
" \n",
"}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[Back to contents](#contents)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Get data from files"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"define function to get data from files\n",
"\n",
"The datasets used in this notebook have already been filtered to include exactly 2 leptons per event, so that processing is quicker."
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {},
"outputs": [],
"source": [
"def get_data_from_files(samples):\n",
"\n",
" data = {} # define empty dictionary to hold dataframes\n",
" for s in samples: # loop over samples\n",
" print('Processing '+s+' samples') # print which sample\n",
" frames = [] # define empty list to hold data\n",
" for val in samples[s]['list']: # loop over each file\n",
" if s == 'data': prefix = \"Data/\" # Data prefix\n",
" else: prefix = \"MC/mc_\"+str(infofile.infos[val][\"DSID\"])+\".\" # MC prefix\n",
" fileString = tuple_path+prefix+val+\".exactly2lep.root\" # file name to open\n",
" temp = read_file(fileString,val) # call the function read_file defined below\n",
" if 'data' in val: # processing data file\n",
" print('\\t\\tnumber of events ',len(temp)) # print total number of events for file\n",
" #else: # processing Monte Carlo simulation file\n",
" # print('\\t\\tsum of weights ',sum(temp['totalWeight'])) # print total weight for file\n",
" \n",
" frames.append(temp) # append dataframe returned from read_file to list of dataframes\n",
" data[s] = pd.concat(frames) # dictionary entry is concatenated dataframes\n",
" \n",
" return data # return dictionary of dataframes\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[Back to contents](#contents)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"define function to calculate weight of MC event\n",
"\n",
"paper: \"Simulated events were corrected so that the object\n",
"identification, reconstruction and trigger efficiencies, energy scales and energy resolutions match those\n",
"determined from data control samples.\""
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {},
"outputs": [],
"source": [
"def get_xsec_weight(sample): \n",
" info = infofile.infos[sample] # get infofile containing cross-sections, sums of weights, dataset IDs\n",
" return (lumi*1000*info[\"xsec\"]/info[\"sumw\"])/MC_to_data_ratio #*1000 to go from fb-1 to pb-1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
""
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {},
"outputs": [],
"source": [
"def calc_weight(xsec_weight,mcWeight,scaleFactor_PILEUP,scaleFactor_ELE,\n",
" scaleFactor_MUON, scaleFactor_LepTRIGGER\n",
" , scaleFactor_BTAG # uncomment this to get better Data vs MC match\n",
" ):\n",
" return xsec_weight*mcWeight*scaleFactor_PILEUP*scaleFactor_ELE*scaleFactor_MUON*scaleFactor_LepTRIGGER#*scaleFactor_BTAG"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[Back to contents](#contents)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Find the good jets!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"define functions to find jets passing some minimum requirements"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {},
"outputs": [],
"source": [
"def find_good_jet_indices(jet_pt,jet_jvt,\n",
" jet_eta # uncomment this when you want to impose stricter requirements on jets\n",
" ): # get indices of good jets\n",
" good_jets = [] # list to hold whether jets are good\n",
" for i in range(len(jet_pt)): # loop over jets\n",
" if jet_pt[i]>25000: # paper: \"Jets are accepted if they fulfill the requirements pT > 25 GeV\"\n",
" # paper: jets with pT < 60 GeV and |η| < 2.4 are required to satisfy pileup rejection criteria (JVT)\n",
" #if jet_pt[i]<60000: # extra requirements for pt < 60 GeV and |η|<2.4\n",
" if jet_pt[i]<60000 and abs(jet_eta[i])<2.4: # extra requirements for pt < 60 GeV and |η|<2.4\n",
" if jet_jvt[i]<0.59: \n",
" good_jets.append(0) # if jvt<0.59, this isn't a good jet\n",
" continue # move onto next jet\n",
" good_jets.append(1) # append good jet if gets here\n",
" continue # move onto next jet\n",
" good_jets.append(0) # if pt<25000, this isn't a good jet\n",
" \n",
" string_ints = [str(i) for i in good_jets] # Convert each integer to a string\n",
" str_of_ints = \"\".join(string_ints) # Combine each string\n",
" return str_of_ints # return list of whether jets are good\n",
"\n",
"# calculate number of good jets\n",
"def calc_goodjet_n(good_jets): # jet indices\n",
" return good_jets.count('1') # count number of 1 in good jets list\n",
"\n",
"# selection on number of good jets\n",
"def select_good_jet_n(good_jets):\n",
" return good_jets.count('1')<5 # throw away if fewer than 5 good jets"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"define functions to calculate number of b-jets"
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {},
"outputs": [],
"source": [
"# calculate number of (good) b-jets\n",
"def calc_bjet_n(jet_MV2c10,good_jets): # MV2c10 scores and jet indices\n",
" bjet_n = 0 # start counter for number of b-jets\n",
" good_jets_indices = [i for i,b in enumerate(good_jets) if b=='1'] # get the good jets\n",
" for i in good_jets_indices: # loop over good jets\n",
" if jet_MV2c10[i]>0.6459: # for 77% b-tag efficiency from https://cds.cern.ch/record/2160731/files/ATL-PHYS-PUB-2016-012.pdf Table 2 \n",
" bjet_n+=1 # increment the counter for the number of b-jets\n",
" return bjet_n # return the number of b-jets\n",
"\n",
"# selection on number of b-jets\n",
"def select_bjet_n(bjet_n):\n",
" # throw away if fewer than 1 b-jets\n",
" return bjet_n<1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[Back to contents](#contents)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Find the good leptons!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"define function to calculate dilepton invariant mass (mll)"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {},
"outputs": [],
"source": [
"def calc_inv_mass_pair(pt_1,eta_1,phi_1,E_1,pt_2,eta_2,phi_2,E_2): # pt,eta,phi,energy of 2 objects\n",
" px_1 = pt_1*np.cos(phi_1) # x-momentum of object 1\n",
" py_1 = pt_1*np.sin(phi_1) # y-momentum of object 1\n",
" pz_1 = pt_1*np.sinh(eta_1) # z-momentum of object 1\n",
" px_2 = pt_2*np.cos(phi_2) # x-momentum of object 2\n",
" py_2 = pt_2*np.sin(phi_2) # y-momentum of object 2\n",
" pz_2 = pt_2*np.sinh(eta_2) # z-momentum of object 2\n",
" sumpx = px_1 + px_2 # x-momentum of combined system\n",
" sumpy = py_1 + py_2 # y-momentum of combined system\n",
" sumpz = pz_1 + pz_2 # z-momentum of combined system\n",
" sump = np.sqrt(sumpx**2 + sumpy**2 + sumpz**2) # total momentum of combined system\n",
" sumE = E_1 + E_2 # total energy of combined system\n",
" return np.sqrt(sumE**2 - sump**2)/1000 # /1000 to go from MeV to GeV\n",
"\n",
"# calculate dilepton invariant mass\n",
"def calc_mll(lep_pt,lep_eta,lep_phi,lep_E): # lepton pt,eta,phi,energy\n",
" return calc_inv_mass_pair(lep_pt[0],lep_eta[0],lep_phi[0],lep_E[0],lep_pt[1],lep_eta[1],lep_phi[1],lep_E[1])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"define lepton selections for Opposite-sign dilepton analysis. Information is taken from Section 4. Object Reconstruction of the ATLAS published paper [Measurement of the $t\\bar{t}Z$ and $t\\bar{t}W$ cross sections in proton-proton collisions at $\\sqrt{s}$ = 13 TeV with the ATLAS detector](https://journals.aps.org/prd/pdf/10.1103/PhysRevD.99.072009)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
""
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {},
"outputs": [],
"source": [
"# selection on nth good lepton\n",
"def select_leptons(mll, lep_ptcone30, lep_pt, lep_type, lep_etcone20, lep_tracksigd0pvunbiased, lep_eta, \n",
" lep_charge\n",
" ,lep_z0 # uncomment this to apply stricter requirements\n",
" ): # variables for good lepton\n",
" \n",
" # paper: \"invariant mass of the lepton pair is required to be in the Z boson mass window, |mll − mZ| < 10 GeV\"\n",
" if (mll < 81.12) or (mll > 101.12): return True\n",
" \n",
" # paper: \"total sum of [...] transverse momenta in a surrounding cone [...] is required to be less than 6% of [...] pT\"\n",
" if lep_ptcone30[0]>0.06*lep_pt[0] or lep_ptcone30[1]>0.06*lep_pt[1]: return True # bad lepton if ptcone>6%pt\n",
" \n",
" # paper: \"sum of [...] transverse energies [...] within a cone of size ∆Rη = 0.2 around any electron [...] is required to be less than 6% of [...] pT\"\n",
" if lep_type[0]==11 and lep_etcone20[0]>0.06*lep_pt[0]: return True # bad electron if etcone>6%pt\n",
" if lep_type[1]==11 and lep_etcone20[1]>0.06*lep_pt[1]: return True # bad electron if etcone>6%pt\n",
" \n",
" # paper: \"significance of [...] d0 is required to satisfy |d0|/σ(d0) < 5 for electrons and |d0|/σ(d0) < 3 for muons\"\n",
" if lep_type[0]==13 and lep_tracksigd0pvunbiased[0]>3: return True # bad muon if σ(d0)>3\n",
" if lep_type[1]==13 and lep_tracksigd0pvunbiased[1]>3: return True # bad muon if σ(d0)>3\n",
" if lep_tracksigd0pvunbiased[0]>5 or lep_tracksigd0pvunbiased[1]>5: return True # bad electron if σ(d0)>5\n",
" \n",
" # paper Table 2: pT (leading lepton) > 30 GeV and pT (subleading lepton) > 15 GeV\n",
" if lep_pt[0]<30000 or lep_pt[1]<15000: return True # minimum pt requirments on leptons\n",
" \n",
" # paper: \"Muons are required to have |η| < 2.5\"\n",
" if abs(lep_eta[0])>2.5 or abs(lep_eta[1])>2.5: return True # bad lepton if |η|>2.5\n",
" \n",
" # paper: \"Opposite-sign\"\n",
" if lep_charge[0]+lep_charge[1]!=0: return True # throw away when charges don't add to 0\n",
" \n",
" # paper: \"longitudinal impact parameter [...], z0, is required to satisfy |z0 sinθ| < 0.5 mm\"\n",
" theta_i = 2*np.arctan(np.exp(-lep_eta[0])) # calculate theta angle\n",
" if abs(lep_z0[0]*np.sin(theta_i))>0.5: return True # bad lepton if z0*sinθ > 0.5mm\n",
" theta_i = 2*np.arctan(np.exp(-lep_eta[1])) # calculate theta angle\n",
" if abs(lep_z0[1]*np.sin(theta_i))>0.5: return True # bad lepton if z0*sinθ > 0.5mm\n",
" \n",
" return False # don't throw away event if gets here\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[Back to contents](#contents)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Let's calculate some variables!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"define function to return which channel. 121 is ee, 169 is mumu, 143 is emu."
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {},
"outputs": [],
"source": [
"# return number to represent which channel \n",
"def calc_channel(lep_type):\n",
" return lep_type[0]*lep_type[1] \n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"define functions to calculate invariant mass, ∆R and pT"
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {},
"outputs": [],
"source": [
"def calc_inv_mass_triplet(pt_1,eta_1,phi_1,E_1,pt_2,eta_2,phi_2,E_2,pt_3,eta_3,phi_3,E_3): # pt,eta,phi,energy of 3 objects\n",
" px_1 = pt_1*np.cos(phi_1) # x-momentum of object 1\n",
" py_1 = pt_1*np.sin(phi_1) # y-momentum of object 1\n",
" pz_1 = pt_1*np.sinh(eta_1) # z-momentum of object 1\n",
" px_2 = pt_2*np.cos(phi_2) # x-momentum of object 2\n",
" py_2 = pt_2*np.sin(phi_2) # y-momentum of object 2\n",
" pz_2 = pt_2*np.sinh(eta_2) # z-momentum of object 2\n",
" px_3 = pt_3*np.cos(phi_3) # x-momentum of object 3\n",
" py_3 = pt_3*np.sin(phi_3) # y-momentum of object 3\n",
" pz_3 = pt_3*np.sinh(eta_3) # z-momentum of object 3\n",
" sumpx = px_1 + px_2 + px_3 # x-momentum of combined system\n",
" sumpy = py_1 + py_2 + py_3 # y-momentum of combined system\n",
" sumpz = pz_1 + pz_2 + pz_3 # z-momentum of combined system\n",
" sump = np.sqrt(sumpx**2 + sumpy**2 + sumpz**2) # total momentum of combined system\n",
" sumE = E_1 + E_2 + E_3 # total energy of combined system\n",
" return np.sqrt(sumE**2 - sump**2)/1000 # /1000 to go from MeV to GeV\n",
"\n",
"def calc_delta_R(eta_1,phi_1,eta_2,phi_2): # eta,phi of 2 objects\n",
" delta_eta = eta_1-eta_2 # Δη between the 2 objects\n",
" delta_phi = phi_1-phi_2 # Δϕ between the 2 objects\n",
" if delta_phi >= np.pi: delta_phi -= 2*np.pi # use π periodicity to get number between -π and π\n",
" elif delta_phi < -np.pi: delta_phi += 2*np.pi # use π periodicity to get number between -π and π\n",
" return np.sqrt(delta_eta**2 + delta_phi**2) # return ∆R for this object\n",
"\n",
"\n",
"def calc_pT_triplet(pt_1,eta_1,phi_1,pt_2,eta_2,phi_2,pt_3,eta_3,phi_3): # pt,eta,phi of 3 objects\n",
" px_1 = pt_1*np.cos(phi_1) # x-momentum of object 1\n",
" py_1 = pt_1*np.sin(phi_1) # y-momentum of object 1\n",
" px_2 = pt_2*np.cos(phi_2) # x-momentum of object 2\n",
" py_2 = pt_2*np.sin(phi_2) # y-momentum of object 2\n",
" px_3 = pt_3*np.cos(phi_3) # x-momentum of object 3\n",
" py_3 = pt_3*np.sin(phi_3) # y-momentum of object 3\n",
" pz_3 = pt_3*np.sinh(eta_3) # z-momentum of object 3\n",
" sumpx = px_1 + px_2 + px_3 # x-momentum of combined system\n",
" sumpy = py_1 + py_2 + py_3 # y-momentum of combined system\n",
" return np.sqrt(sumpx**2 + sumpy**2)/1000 # /1000 to go from MeV to GeV"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[Back to contents](#contents)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"define functions to calculate variables for 2b6j BDT"
]
},
{
"cell_type": "code",
"execution_count": 48,
"metadata": {},
"outputs": [],
"source": [
"# functin to calculate pt of the lepton pair\n",
"def calc_ptll(lep_pt,lep_phi): # pt of dilepton system\n",
" px_0 = lep_pt[0]*np.cos(lep_phi[0]) # x-momentum of lep_0\n",
" py_0 = lep_pt[0]*np.sin(lep_phi[0]) # y-momentum of lep_0\n",
" px_1 = lep_pt[1]*np.cos(lep_phi[1]) # x-momentum of lep_1\n",
" py_1 = lep_pt[1]*np.sin(lep_phi[1]) # y-momentum of lep_1\n",
" sumpx = px_0 + px_1 # x-momentum of dilepton system\n",
" sumpy = py_0 + py_1 # y-momentum of dilepton system\n",
" return np.sqrt(sumpx**2 + sumpy**2)/1000 #/1000 to go from MeV to GeV \n",
"\n",
"# function to calculate pt of ith jet\n",
"def calc_pti_jet(jet_pt,good_jets,index): # jet pt and index\n",
" good_jets_indices = [i for i,b in enumerate(good_jets) if b=='1'] # get the good jets\n",
" if index0.6459)+int(jet_MV2c10[j]>0.6459)+int(jet_MV2c10[k]>0.6459)\n",
" if n_bjets==1: # only 1 b-jet for top quark candidate\n",
" if jet_MV2c10[i]>0.6459: # if i is the b-jet\n",
" b = i # assign index i to b\n",
" j1 = j # assign index j to j1\n",
" j2 = k # assing index k to j2\n",
" elif jet_MV2c10[j]>0.6459: # if j is is the b-jet\n",
" b = j # assign index j to b\n",
" j1 = i # assign index i to j1\n",
" j2 = k # assign index k to j2\n",
" else: # if k is the b-jet\n",
" b = k # assign index k to b\n",
" j1 = i # assign index i to j1\n",
" j2 = j # assign index j to j2\n",
" px_b = jet_pt[b]*np.cos(jet_phi[b]) # x-momentum of the b-jet\n",
" py_b = jet_pt[b]*np.sin(jet_phi[b]) # y-momentum of the b-jet\n",
" pz_b = jet_pt[b]*np.sinh(jet_eta[b]) # z-momentum of the b-jet\n",
" px_j1 = jet_pt[j1]*np.cos(jet_phi[j1]) # x-momentum of (non-b) jet 1\n",
" py_j1 = jet_pt[j1]*np.sin(jet_phi[j1]) # y-momentum of (non-b) jet 1\n",
" pz_j1 = jet_pt[j1]*np.sinh(jet_eta[j1]) # z-momentum of (non-b) jet 1\n",
" px_j2 = jet_pt[j2]*np.cos(jet_phi[j2]) # x-momentum of (non-b) jet 2\n",
" py_j2 = jet_pt[j2]*np.sin(jet_phi[j2]) # y-momentum of (non-b) jet 2\n",
" pz_j2 = jet_pt[j2]*np.sinh(jet_eta[j2]) # z-momentum of (non-b) jet 2\n",
" sumpx_t = px_b + px_j1 + px_j2 # x-momentum of 3-jet system\n",
" sumpy_t = py_b + py_j1 + py_j2 # y-momentum of 3-jet system\n",
" sumpz_t = pz_b + pz_j1 + pz_j2 # z-momentum of 3-jet system\n",
" sump_t = np.sqrt(sumpz_t**2 + sumpx_t**2 + sumpy_t**2) # total momentum of 3-jet system\n",
" sumE_t = jet_E[i] + jet_E[j] + jet_E[k] # total energy of 3-jet system\n",
" M_bjj = np.sqrt(sumE_t**2 - sump_t**2)/1000 # mass of 3-jet system\n",
" sumpx_W = px_j1 + px_j2 # x-momentum of W candidate\n",
" sumpy_W = py_j1 + py_j2 # y-momentum of W candidate\n",
" sumpz_W = pz_j1 + pz_j2 # z-momentum of W candidate\n",
" sump_W = np.sqrt(sumpz_W**2 + sumpx_W**2 + sumpy_W**2) # total momentum of W candidate\n",
" sumE_W = jet_E[j1] + jet_E[j2] # total energy of W candidate\n",
" M_W = np.sqrt(sumE_W**2 - sump_W**2)/1000 # mass of W candidate\n",
" if abs(M_bjj - 172.5)<15 and abs(M_W - 80)<15: # within 15 GeV of top and W mass windows\n",
" Nmbjj_top += 1 # increment counter for number of top-quark candidates\n",
" return Nmbjj_top # return number of top-quark candidates\n",
"\n",
"# function to calculate mass of the combination between any two jets with the smallest ∆R\n",
"def calc_MjjMindR(jet_pt,jet_eta,jet_phi,jet_E, # jet pt,eta,phi,energy\n",
" good_jets): # jet indices\n",
" good_jets_indices = [i for i,b in enumerate(good_jets) if b=='1'] # get the good jets\n",
" MindR = 99 # dummy value for smallest ∆R\n",
" for jet_i in good_jets_indices[:8]: # loop over good jets\n",
" for jet_j in good_jets_indices[jet_i+1:8]: # loop over remaining good jets\n",
" dR = calc_delta_R(jet_eta[jet_i],jet_phi[jet_i],jet_eta[jet_j],jet_eta[jet_j]) # calculate ∆R\n",
" if dR < MindR: # if this ∆R is smaller than the smallest ∆R found previously\n",
" MindR = dR # set this ∆R to be the smallest ∆R\n",
" i1r = jet_i # i1r is the index of the 1st jet to be considered later\n",
" i2r = jet_j # i2r is the index of the 2nd jet to be considered later\n",
" # use function calc_inv_mass_pair defined above to get the jet-jet invariant mass\n",
" MjjMindR = calc_inv_mass_pair(jet_pt[i1r],jet_eta[i1r],jet_phi[i1r],jet_E[i1r],\n",
" jet_pt[i2r],jet_eta[i2r],jet_phi[i2r],jet_E[i2r])\n",
" return MjjMindR # return mass of the combination between any two jets with the smallest ∆R\n",
"\n",
"# function to calculate mass of the two jets with the highest b-tag weight\n",
"def calc_MbbPtOrd(jet_pt,jet_eta,jet_phi,jet_E,jet_MV2c10, # jet pt,eta,phi,energy,MV2c10\n",
" good_jets):\n",
" good_jets_indices = [i for i,b in enumerate(good_jets) if b=='1'] # get the good jets\n",
" MV2c10_scores = {} # dictionary to hold indices and MV2c10 scores\n",
" for i in good_jets_indices[:8]: # loop over good jets\n",
" MV2c10_scores[i] = jet_MV2c10[i] # dictionary entry of index and MV2c10 score\n",
" max_MV2c10_index = max(MV2c10_scores, key=MV2c10_scores.get) # get index of jet with maximum MV2c10 score\n",
" del MV2c10_scores[max_MV2c10_index] # delete jet with maximum MV2c10 score from dictionary\n",
" second_MV2c10_index = max(MV2c10_scores, key=MV2c10_scores.get) # get index of jet with second MV2c10 score\n",
" # use function calc_inv_mass_pair defined above to get mass of the two jets with the highest b-tag weight\n",
" return calc_inv_mass_pair(jet_pt[max_MV2c10_index],jet_eta[max_MV2c10_index],jet_phi[max_MV2c10_index],\n",
" jet_E[max_MV2c10_index],jet_pt[second_MV2c10_index],jet_eta[second_MV2c10_index],\n",
" jet_phi[second_MV2c10_index],jet_E[second_MV2c10_index])\n",
"\n",
"# function to calculate jet centrality (scalar sum of pT divided by sum of E for all jets)\n",
"def calc_CentrJet(jet_pt,jet_E, # jet pt and energy\n",
" good_jets): # jet indices\n",
" good_jets_indices = [i for i,b in enumerate(good_jets) if b=='1'] # get the good jets\n",
" sum_pt = 0 # start counter for pt of all jets\n",
" sum_E = 0 # start counter for energy of all jets\n",
" for i in good_jets_indices[:8]: # loop over good jets\n",
" sum_pt += jet_pt[i] # add single jet pt to sum of jet pt\n",
" sum_E += jet_E[i] # add single jet energy to sum of jet energy\n",
" return sum_pt/sum_E # ratio of total pt to total energy\n",
"\n",
"# function to calculate average ∆R for all jet pairs\n",
"def calc_dRjjave_jet(jet_eta,jet_phi, # jet eta,phi\n",
" good_jets): # jet indices\n",
" good_jets_indices = [i for i,b in enumerate(good_jets) if b=='1'] # get the good jets\n",
" delta_R = [] # list to hold ∆R between all jet pairs\n",
" for i in good_jets_indices[:8]: # loop over good jets\n",
" for j in good_jets_indices[i+1:8]: # loop over remaining good jets\n",
" delta_R.append(calc_delta_R(jet_eta[i],jet_phi[i],jet_eta[j],jet_phi[j])) # append ∆R for this jet pair to list\n",
" return sum(delta_R)/len(delta_R) # average ∆R of all jet pairs\n",
"\n",
"# function to calculate maximum mass between a lepton and the tagged jet with the smallest ∆R\n",
"def calc_MaxMMindRlepb(lep_pt,lep_eta,lep_phi,lep_E,jet_pt,jet_eta,jet_phi,jet_E,jet_MV2c10, # pt,eta,phi,E,MV2c10\n",
" good_jets):\n",
" good_jets_indices = [i for i,b in enumerate(good_jets) if b=='1'] # get the good jets\n",
" # what if the b-jet isn't one of the first 8 good jets? Use the jet within the first 8 with highes MV2c10 score\n",
" if len(good_jets)>8 and jet_MV2c10[good_jets_indices[0]]<0.6459 and jet_MV2c10[good_jets_indices[1]]<0.6459 and jet_MV2c10[good_jets_indices[2]]<0.6459 and jet_MV2c10[good_jets_indices[3]]<0.6459 and jet_MV2c10[good_jets_indices[4]]<0.6459 and jet_MV2c10[good_jets_indices[5]]<0.6459 and jet_MV2c10[good_jets_indices[6]]<0.6459 and jet_MV2c10[good_jets_indices[7]]<0.6459:\n",
" MV2c10_scores = {} # dictionary to hold indices and btagging scores\n",
" for i in good_jets_indices[:8]: # loop over good jets \n",
" MV2c10_scores[i] = jet_MV2c10[i] # dictionary entry of index and btagging score \n",
" ijetr = max(MV2c10_scores, key=MV2c10_scores.get) # get index of jet with maximum btagging score\n",
" MaxMMindRlepb = 0 # dummy value for maximum mass between a lepton and the tagged jet with the smallest ∆R\n",
" for lep_i in [0,1]: # loop over leptons\n",
" MindRlepb = 99 # dummy value for smallest ∆R\n",
" for jet_j in good_jets_indices[:8]: # loop over good jets\n",
" if jet_MV2c10[jet_j]>0.6459: # for 77% b-tag efficiency from https://cds.cern.ch/record/2160731/files/ATL-PHYS-PUB-2016-012.pdf Table 2\n",
" dRlepb = calc_delta_R(lep_eta[lep_i],lep_phi[lep_i],jet_eta[jet_j],jet_phi[jet_j]) # get ∆R\n",
" if dRlepb < MindRlepb: # if this ∆R is smaller than the smallest ∆R found previously for this lep\n",
" MindRlepb = dRlepb # set this ∆R to be the minimum ∆R for this lepton\n",
" ijetr = jet_j # ijetr is the index of the jet to be considered later\n",
" # use function calc_inv_mass_pair to get the lep-b pair invariant mass\n",
" MMindRlepb = calc_inv_mass_pair(lep_pt[lep_i],lep_eta[lep_i],lep_phi[lep_i],lep_E[lep_i],\n",
" jet_pt[ijetr],jet_eta[ijetr],jet_phi[ijetr],jet_E[ijetr])\n",
" if MMindRlepb > MaxMMindRlepb: # if this lep-b pair mass is bigger than the biggest previous\n",
" MaxMMindRlepb = MMindRlepb # set this lep-b pair mass to be the maximum mass\n",
" return MaxMMindRlepb # return maximum mass between a lepton and the tagged jet with the smallest ∆R\n",
"\n",
"# function to calculate First Fox-Wolfram moment\n",
"def calc_H1(lep_pt,lep_eta,lep_phi,lep_E,jet_pt,jet_eta,jet_phi,jet_E, # lepton and jet pt,eta,phi,energy\n",
" good_jets): # jet indices\n",
" good_jets_indices = [i for i,b in enumerate(good_jets) if b=='1'] # get the good jets\n",
" E_viss = 0 # start counter for visible energy in the event\n",
" H1 = 0 # start counter for H1\n",
" for lep_i in range(2): # loop over leptons\n",
" E_viss += lep_E[lep_i] # add energy of leptons to visible energy counter\n",
" for jet_i in range(len(jet_pt)): # loop over all jets (not just good jets)\n",
" E_viss += jet_E[jet_i] # add energy of jets to visible energy counter\n",
" for lep_i in range(2): # loop over leptons\n",
" px_i = lep_pt[lep_i]*np.cos(lep_phi[lep_i]) # x-momentum of lepton i\n",
" py_i = lep_pt[lep_i]*np.sin(lep_phi[lep_i]) # y-momentum of lepton i\n",
" pz_i = lep_pt[lep_i]*np.sinh(lep_eta[lep_i]) # z-momentum of lepton i\n",
" for lep_j in range(lep_i,2): # loop over leptons\n",
" px_j = lep_pt[lep_j]*np.cos(lep_phi[lep_j]) # x-momentum of lepton j\n",
" py_j = lep_pt[lep_j]*np.sin(lep_phi[lep_j]) # y-momentum of lepton j\n",
" pz_j = lep_pt[lep_j]*np.sinh(lep_eta[lep_j]) # z-momentum of lepton j\n",
" numerator = px_i*px_j + py_i*py_j + pz_i*pz_j # dot product of lepton momenta\n",
" H1 += numerator/(E_viss**2) # H1 calculation for this lepton pairing\n",
" for jet_i in good_jets_indices[:8]: # loop over good jets\n",
" px_i = jet_pt[jet_i]*np.cos(jet_phi[jet_i]) # x-momentum of jet i\n",
" py_i = jet_pt[jet_i]*np.sin(jet_phi[jet_i]) # y-momentum of jet i\n",
" pz_i = jet_pt[jet_i]*np.sinh(jet_eta[jet_i]) # z-momentum of jet i\n",
" for jet_j in good_jets_indices[jet_i:]: # loop over good jets\n",
" px_j = jet_pt[jet_j]*np.cos(jet_phi[jet_j]) # x-momentum of jet j\n",
" py_j = jet_pt[jet_j]*np.sin(jet_phi[jet_j]) # y-momentum of jet j\n",
" pz_j = jet_pt[jet_j]*np.sinh(jet_eta[jet_j]) # z-momentum of jet j\n",
" numerator = px_i*px_j + py_i*py_j + pz_i*pz_j # dot product of jet-pair momenta\n",
" H1 += numerator/(E_viss**2) # H1 calculation for this jet pairing\n",
" for lep_i in range(2): # loop over leptons\n",
" px_i = lep_pt[lep_i]*np.cos(lep_phi[lep_i]) # x-momentum of lepton\n",
" py_i = lep_pt[lep_i]*np.sin(lep_phi[lep_i]) # y-momentum of lepton\n",
" pz_i = lep_pt[lep_i]*np.sinh(lep_eta[lep_i]) # z-momentum of lepton\n",
" for jet_j in good_jets_indices[:8]: # loop over jets\n",
" px_j = jet_pt[jet_j]*np.cos(jet_phi[jet_j]) # x-momentum of jet\n",
" py_j = jet_pt[jet_j]*np.sin(jet_phi[jet_j]) # y-momentum of jet\n",
" pz_j = jet_pt[jet_j]*np.sinh(jet_eta[jet_j]) # z-momentum of jet\n",
" numerator = px_i*px_j + py_i*py_j + pz_i*pz_j # dot product of lepton-jet pair momenta\n",
" H1 += numerator/(E_viss**2) # H1 calcuation for this lepton-jet pairing\n",
" return H1 # return H1 \n",
"\n",
"# function to calculate sum of jet pT , up to 6 jets\n",
"def calc_HT_jet6(jet_pt, # jet pt\n",
" good_jets): # jet indices\n",
" good_jets_indices = [i for i,b in enumerate(good_jets) if b=='1'] # get the good jets\n",
" sum_pt = 0 # start counter for total jet pt\n",
" for i in good_jets_indices[:6]: # loop over good jets, up to 6 jets \n",
" sum_pt += jet_pt[i] # add individual jet pt to total jet pt\n",
" return sum_pt/1000 # /1000 to go from MeV to GeV\n",
"\n",
"# function to calculate η of dilepton system\n",
"def calc_eta_ll(lep_pt,lep_eta,lep_phi): # lepton pt,eta,phi\n",
" px_0 = lep_pt[0]*np.cos(lep_phi[0]) # x-momentum of lep_0\n",
" py_0 = lep_pt[0]*np.sin(lep_phi[0]) # y-momentum of lep_0\n",
" pz_0 = lep_pt[0]*np.sinh(lep_eta[0]) # z-momentum of lep_0\n",
" px_1 = lep_pt[1]*np.cos(lep_phi[1]) # x-momentum of lep_1\n",
" py_1 = lep_pt[1]*np.sin(lep_phi[1]) # y-momentum of lep_1\n",
" pz_1 = lep_pt[1]*np.sinh(lep_eta[1]) # z-momentum of lep_1\n",
" sumpx = px_0 + px_1 # x-momentum of dilepton system\n",
" sumpy = py_0 + py_1 # y-momentum of dilepton system\n",
" sumpz = pz_0 + pz_1 # z-momentum of dilepton system\n",
" sump = np.sqrt(sumpz**2 + sumpx**2 + sumpy**2) # total momentum of dilepton system\n",
" return np.arctanh(sumpz/sump) # arctan between z-momentum and total momentum\n",
"\n",
"# function to calculate sum of the two closest 2 jet invariant masses from from jjj1 and jjj2, divided by 2\n",
"def calc_MWavg(jet_pt,jet_eta,jet_phi,jet_E,jet_MV2c10, # jet pt,eta,phi,energy,MV2c10\n",
" good_jets): # jet indices\n",
" good_jets_indices = [i for i,b in enumerate(good_jets) if b=='1'] # get the good jets\n",
" if len(good_jets_indices)<6: return 0 # don't calculate this for 5j event\n",
" b1_i = 0 # index of first jet\n",
" dR1 = 99 # dummy value for min ∆R between 1st b-jet and another jet\n",
" for i in good_jets_indices[:8]: # loop over good jets\n",
" if i==b1_i: continue # if this jet is the 1st b-jet, move onto the next jet\n",
" delta_R = calc_delta_R(jet_eta[b1_i],jet_phi[b1_i],jet_eta[i],jet_phi[i]) # ∆R between the two jets\n",
" if delta_R < dR1: # if this ∆R is smaller than the smallest ∆R found previously for the 1st b-jet\n",
" dR1 = delta_R # set this ∆R as the smallest ∆R for the 1st b-jet\n",
" jj1_1_i = i # assign index jj1_1_i to the jet closest to the 1st b-jet\n",
" dR2 = 99 # dummy value for second smallest ∆R between the 1st b-jet and another jet\n",
" for i in good_jets_indices[:8]: # loop over good jets\n",
" if i in [b1_i,jj1_1_i]: continue # if this jet is the 1st b-jet or the jet closest to the 1st b-jet, move on\n",
" delta_R = calc_delta_R(jet_eta[b1_i],jet_phi[b1_i],jet_eta[i],jet_phi[i]) # ∆R between the two jets\n",
" if delta_R < dR2: # if this ∆R is smaller than the second smallest ∆R found previously for the 1st b-jet\n",
" dR2 = delta_R # set this ∆R as the second smallest ∆R for the 1st b-jet\n",
" jj1_2_i = i # assign index jj1_2_i to the jet second closest to the 1st b-jet\n",
" for i in good_jets_indices[:8]: # looking for the second \"b-jet\"\n",
" if i in [b1_i,jj1_1_i,jj1_2_i]: continue # if this jet is in the 3-jet system of the 1st b-jet, move on\n",
" b2_i = i # assign index i to the next highest pt jet\n",
" break # finish this for loop\n",
" dR1 = 99 # dummy value for min ∆R between 2nd b-jet and another jet\n",
" for i in good_jets_indices[:8]: # loop over good jets \n",
" if i in [b1_i,jj1_1_i,jj1_2_i,b2_i]: continue # if i is part of 1st b-jet system or is the 2nd b-jet, move on\n",
" delta_R = calc_delta_R(jet_eta[b2_i],jet_phi[b2_i],jet_eta[i],jet_phi[i]) # ∆R between the two jets\n",
" if delta_R < dR1: # if this ∆R is smaller than the smallest ∆R found previously for the 2nd b-jet\n",
" dR1 = delta_R # set this ∆R as the smallest ∆R for the 2nd b-jet\n",
" jj2_1_i = i # assign index jj2_1_i to the jet closest to the 2nd b-jet (if not in 1st b-jet system)\n",
" dR2 = 99 # dummy value for second smallest ∆R between the 2nd b-jet and another jet\n",
" for i in good_jets_indices[:8]: # loop over good jets\n",
" if i in [b1_i,jj1_1_i,jj1_2_i,b2_i,jj2_1_i]: continue # if this jet has been used previously, move on\n",
" delta_R = calc_delta_R(jet_eta[b2_i],jet_phi[b2_i],jet_eta[i],jet_phi[i]) # ∆R between the two jets\n",
" if delta_R < dR2: # if this ∆R is smaller than the second smallest ∆R found previously for the 2nd b-jet\n",
" dR2 = delta_R # set this ∆R as the second smallest ∆R for the 2nd b-jet\n",
" jj2_2_i = i # assign index jj2_2_i to the jet second closest to the 2nd b-jet\n",
" M1 = [] # list to hold masses of W candidates from 1st b-jet\n",
" # inv mass between 1st b-jet and the jet closest to it\n",
" M1.append(calc_inv_mass_pair(jet_pt[b1_i],jet_eta[b1_i],jet_phi[b1_i],jet_E[b1_i],\n",
" jet_pt[jj1_1_i],jet_eta[jj1_1_i],jet_phi[jj1_1_i],jet_E[jj1_1_i]))\n",
" # inv mass between 1st b-jet and the jet second closest to it\n",
" M1.append(calc_inv_mass_pair(jet_pt[b1_i],jet_eta[b1_i],jet_phi[b1_i],jet_E[b1_i],\n",
" jet_pt[jj1_2_i],jet_eta[jj1_2_i],jet_phi[jj1_2_i],jet_E[jj1_2_i]))\n",
" # inv mass between the two jets closest to the 1st b-jet\n",
" M1.append(calc_inv_mass_pair(jet_pt[jj1_1_i],jet_eta[jj1_1_i],jet_phi[jj1_1_i],jet_E[jj1_1_i],\n",
" jet_pt[jj1_2_i],jet_eta[jj1_2_i],jet_phi[jj1_2_i],jet_E[jj1_2_i]))\n",
" M2 = [] # list to hold masses of W candidates from 2nd b-jet\n",
" # inv mass between 2nd b-jet and the jet closest to it\n",
" M2.append(calc_inv_mass_pair(jet_pt[b2_i],jet_eta[b2_i],jet_phi[b2_i],jet_E[b2_i],\n",
" jet_pt[jj2_1_i],jet_eta[jj2_1_i],jet_phi[jj2_1_i],jet_E[jj2_1_i]))\n",
" # inv mass between 2nd b-jet and the jet second closest to it\n",
" M2.append(calc_inv_mass_pair(jet_pt[b2_i],jet_eta[b2_i],jet_phi[b2_i],jet_E[b2_i],\n",
" jet_pt[jj2_2_i],jet_eta[jj2_2_i],jet_phi[jj2_2_i],jet_E[jj2_2_i]))\n",
" # inv mass between the two jets closest to the 2nd b-jet\n",
" M2.append(calc_inv_mass_pair(jet_pt[jj2_1_i],jet_eta[jj2_1_i],jet_phi[jj2_1_i],jet_E[jj2_1_i],\n",
" jet_pt[jj2_2_i],jet_eta[jj2_2_i],jet_phi[jj2_2_i],jet_E[jj2_2_i]))\n",
" difM = 1000000 # dummy value for the difference between the masses of the 1st and 2nd W candidates\n",
" for i in range(3): # loop over [0,1,2]\n",
" for j in range(3): # loop over [0,1,2]\n",
" difM12 = abs(M1[i]-M2[j]) # get absolute difference between W candidate 1 and W candidate 2\n",
" if difM120.6459: #77% WP to decide if it's a b-jet\n",
" return jet_pt[i]/1000 # /1000 to go from MeV to GeV\n",
" # what if the b-jet isn't one of the first 8 good jets? Use the jet within the first 8 with highes MV2c10 score\n",
" if len(good_jets)>8 and jet_MV2c10[good_jets_indices[0]]<0.6459 and jet_MV2c10[good_jets_indices[1]]<0.6459 and jet_MV2c10[good_jets_indices[2]]<0.6459 and jet_MV2c10[good_jets_indices[3]]<0.6459 and jet_MV2c10[good_jets_indices[4]]<0.6459 and jet_MV2c10[good_jets_indices[5]]<0.6459 and jet_MV2c10[good_jets_indices[6]]<0.6459 and jet_MV2c10[good_jets_indices[7]]<0.6459:\n",
" MV2c10_scores = {} # dictionary to hold indices and btagging scores\n",
" for i in good_jets_indices[:8]: # loop over good jets\n",
" MV2c10_scores[i] = jet_MV2c10[i] # dictionary entry of index and btagging score\n",
" max_MV2c10_index = max(MV2c10_scores, key=MV2c10_scores.get) # get index of jet with maximum btagging score\n",
" return jet_pt[max_MV2c10_index]/1000 # /1000 to go from MeV to GeV"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[Back to contents](#contents)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"define function to calculate variables for BDT in 2b5j or 1b6j signal regions"
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {},
"outputs": [],
"source": [
"# function to calculate mass of the two untagged jets with the highest pT\n",
"def calc_MuuPtOrd(jet_pt,jet_eta,jet_phi,jet_E,jet_MV2c10, # jet pt,eta,phi,energy,MV2c10\n",
" good_jets): # jet indices\n",
" good_jets_indices = [i for i,b in enumerate(good_jets) if b=='1'] # get the good jets\n",
" MV2c10_scores = {} # dictionary to hold indices and MV2c10 scores\n",
" for i in good_jets_indices: # loop over good jets\n",
" MV2c10_scores[i] = jet_MV2c10[i] # dictionary entry of index and MV2c10 score\n",
" max_MV2c10_index = max(MV2c10_scores, key=MV2c10_scores.get) # get index of jet with maximum MV2c10 score\n",
" del MV2c10_scores[max_MV2c10_index] # delete jet with maximum MV2c10 score from dictionary\n",
" second_MV2c10_index = max(MV2c10_scores, key=MV2c10_scores.get) # get index of jet with second MV2c10 score\n",
" for i in good_jets_indices: # loop over good jets\n",
" if i==max_MV2c10_index or i==second_MV2c10_index: continue # if this jet is a b-jet, move onto next jet\n",
" for j in good_jets_indices[i+1:]: # loop over remaining good jets\n",
" if j==max_MV2c10_index or j==second_MV2c10_index: continue # if j is a b-jet, move onto next jet\n",
" # use function calc_inv_mass_pair defined above to get mass of jet pair\n",
" return calc_inv_mass_pair(jet_pt[i],jet_eta[i],jet_phi[i],jet_E[i],\n",
" jet_pt[j],jet_eta[j],jet_phi[j],jet_E[j])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"define function to calculate number of heavy-flavour jets"
]
},
{
"cell_type": "code",
"execution_count": 50,
"metadata": {},
"outputs": [],
"source": [
"def calc_HF_n(jet_trueflav,good_jets): # jet true flavours and jet indices\n",
" good_jets_indices = [i for i,b in enumerate(good_jets) if b=='1'] # get the good jets\n",
" HF_n = 0 # start counter for number of heavy-flavour jets\n",
" for jet_i in good_jets_indices: # loop over good jets\n",
" if jet_trueflav[jet_i]==4 or jet_trueflav[jet_i]==5: # c=4, b=5\n",
" HF_n += 1 # increment counter for number of heavy-flavour jets\n",
" return HF_n # return number of heavy-flavour jets"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[Back to contents](#contents)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Can we process the data yet?!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"define function to read files"
]
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {},
"outputs": [],
"source": [
"def read_file(path,sample):\n",
" print(\"\\tProcessing: \"+sample) # print which sample is being processed\n",
" data_all = pd.DataFrame() # define empty pandas DataFrame to hold all data for this sample\n",
" tree = uproot3.open(path)[\"mini\"] # open the tree called mini\n",
" numevents = uproot3.numentries(path, \"mini\") # number of events\n",
" if 'data' in sample: fraction_MC=fraction # process fraction*1 of measured data\n",
" else: fraction_MC=fraction*MC_to_data_ratio # process fraction*MC_to_data_ratio of simulated data\n",
" entrystop=numevents*fraction_MC # stop after fraction of events we want to process\n",
" branches = ['lep_pt','lep_eta','lep_phi','lep_E','lep_charge','lep_type','lep_ptcone30',\n",
" 'lep_etcone20','lep_tracksigd0pvunbiased','met_et',\n",
" 'jet_pt','jet_eta','jet_phi','jet_E','jet_jvt','jet_MV2c10'\n",
" ,'lep_z0' # uncomment this for stricter lepton requirements\n",
" ]\n",
" if 'data' not in sample: \n",
" xsec_weight = get_xsec_weight(sample) # get cross-section weight\n",
" branches.extend(['mcWeight','scaleFactor_PILEUP','scaleFactor_ELE',\n",
" 'scaleFactor_MUON','scaleFactor_LepTRIGGER'\n",
" ,'scaleFactor_BTAG' # uncomment this for better Data vs MC match\n",
" ])\n",
" if 'Zmumu' in sample or 'Zee' in sample:\n",
" branches.extend(['jet_trueflav'])\n",
" \n",
" for data in tree.iterate(branches, \n",
" outputtype=pd.DataFrame, # choose output type as pandas DataFrame\n",
" entrystop=entrystop): # process up to numevents*fraction\n",
" \n",
" nIn = len(data.index) # number of events in this batch\n",
" #print('\\t\\t initial \\t\\t\\t\\t',nIn) \n",
" \n",
" # get index of 0th good jet\n",
" data['good_jets'] = np.vectorize(find_good_jet_indices)(data.jet_pt, data.jet_jvt \n",
" ,data.jet_eta # uncomment this for stricter requirements\n",
" )\n",
" \n",
" data['goodjet_n'] = np.vectorize(calc_goodjet_n)(data.good_jets)\n",
"\n",
" # select on number of good jets\n",
" fail = data[ np.vectorize(select_good_jet_n)(data.good_jets) ].index # get events that fail the selection\n",
" data.drop(fail,inplace=True) # drop the events that have fewer than 5 jets\n",
" if len(data.index)==0: continue # move onto next batch if no events left\n",
" #print('\\t\\t at least 6 jets \\t\\t\\t',len(data.index))\n",
" \n",
" # calculate number of b-jets at 77% Working Point using function calc_bjet_n defined above\n",
" data['bjet_n'] = np.vectorize(calc_bjet_n)(data.jet_MV2c10, data.good_jets)\n",
" \n",
" # select on number of b-jets\n",
" fail = data[ np.vectorize(select_bjet_n)(data.bjet_n) ].index # get events that fail this selection\n",
" data.drop(fail,inplace=True) # drop the events with fewer than 1 b-jets\n",
" if len(data.index)==0: continue # move onto next batch if no events left\n",
" #print('\\t\\t at least 2 b-jets \\t\\t\\t',len(data.index))\n",
" \n",
" # calculation of 2-lepton invariant mass \n",
" data['mll'] = np.vectorize(calc_mll)(data.lep_pt, data.lep_eta, data.lep_phi, data.lep_E)\n",
" \n",
" # select on whether leptons are good \n",
" fail = data[ np.vectorize(select_leptons)(data.mll, data.lep_ptcone30, data.lep_pt, data.lep_type,\n",
" data.lep_etcone20, data.lep_tracksigd0pvunbiased, \n",
" data.lep_eta, data.lep_charge\n",
" ,data.lep_z0 # uncomment this for stricter requirements\n",
" ) ].index # get events that fail this selection\n",
" data.drop(fail, inplace=True) # drop events where leptons aren't good\n",
" if len(data.index)==0: continue # move onto next batch if no events left\n",
" #print('\\t\\t leptons are good \\t\\t\\t',len(data.index))\n",
" \n",
" data['channel'] = np.vectorize(calc_channel)(data.lep_type) # ee, mm or em\n",
"\n",
" data['ptll'] = np.vectorize(calc_ptll)(data.lep_pt, data.lep_phi) # pt of dilepton system\n",
" data['pt4_jet'] = np.vectorize(calc_pti_jet)(data.jet_pt, data.good_jets, 3) # pt of 4th jet\n",
" data['pt6_jet'] = np.vectorize(calc_pti_jet)(data.jet_pt, data.good_jets, 5) # pt of 6th jet\n",
" data['dRll'] = np.vectorize(calc_dRll)(data.lep_eta, data.lep_phi) # ∆R between the two leptons\n",
" \n",
" # get number of jet pairs with mass within a window of 30 GeV around 85 GeV (~Z/W mass)\n",
" data['NJetPairsZMass'] = np.vectorize(calc_NJetPairsZMass)(data.jet_pt, data.jet_eta, data.jet_phi,\n",
" data.jet_E, data.good_jets)\n",
" \n",
" # get mass of the combination between any two jets with the smallest ∆R\n",
" data['MjjMindR'] = np.vectorize(calc_MjjMindR)(data.jet_pt, data.jet_eta, data.jet_phi, data.jet_E,\n",
" data.good_jets)\n",
" \n",
" # get number of top-quark candidates\n",
" data['Nmbjj_top'] = np.vectorize(calc_Nmbjj_top)(data.jet_pt, data.jet_eta, data.jet_phi, data.jet_E,\n",
" data.jet_MV2c10, data.good_jets)\n",
" \n",
" # get mass of the two jets with the highest b-tag weight\n",
" data['MbbPtOrd'] = np.vectorize(calc_MbbPtOrd)(data.jet_pt, data.jet_eta, data.jet_phi, data.jet_E,\n",
" data.jet_MV2c10, data.good_jets)\n",
" \n",
" # get jet centrality (scalar sum of pT divided by sum of E for all jets)\n",
" data['CentrJet'] = np.vectorize(calc_CentrJet)(data.jet_pt, data.jet_E, data.good_jets)\n",
" \n",
" # get average ∆R for all jet pairs\n",
" data['dRjjave_jet'] = np.vectorize(calc_dRjjave_jet)(data.jet_eta, data.jet_phi, data.good_jets)\n",
" \n",
" # get maximum mass between a lepton and the tagged jet with the smallest ∆R\n",
" data['MaxMMindRlepb'] = np.vectorize(calc_MaxMMindRlepb)(data.lep_pt, data.lep_eta, data.lep_phi,\n",
" data.lep_E, data.jet_pt, data.jet_eta, \n",
" data.jet_phi, data.jet_E, data.jet_MV2c10,\n",
" data.good_jets)\n",
" # get sum of jet pT , up to 6 jets\n",
" data['HT_jet6'] = np.vectorize(calc_HT_jet6)(data.jet_pt, data.good_jets)\n",
" \n",
" data['eta_ll'] = np.vectorize(calc_eta_ll)(data.lep_pt, data.lep_eta, data.lep_phi) # η of dilepton system\n",
" \n",
" # get sum of the two closest 2 jet invariant masses from from jjj1 and jjj2, divided by 2\n",
" data['MWavg'] = np.vectorize(calc_MWavg)(data.jet_pt, data.jet_eta, data.jet_phi, data.jet_E, \n",
" data.jet_MV2c10, data.good_jets)\n",
" \n",
" # get ∆R between two jets with the highest b-tagging weight in the event\n",
" data['dRbb'] = np.vectorize(calc_dRbb)(data.jet_eta, data.jet_phi, data.jet_MV2c10, data.good_jets)\n",
" \n",
" # get pT of the first b-jet (ordered according to the pT)\n",
" data['pT1b'] = np.vectorize(calc_pT1b)(data.jet_pt, data.jet_MV2c10, data.good_jets)\n",
" \n",
" # get H1 (First Fox-Wolfram moment)\n",
" data['H1'] = np.vectorize(calc_H1)(data.lep_pt, data.lep_eta, data.lep_phi, data.lep_E,\n",
" data.jet_pt, data.jet_eta, data.jet_phi, data.jet_E,\n",
" data.good_jets)\n",
" \n",
" data['pt5_jet'] = np.vectorize(calc_pti_jet)(data.jet_pt, data.good_jets, 4) # pt of 5th jet\n",
" \n",
" # get mass of the two untagged jets with the highest pT\n",
" data['MuuPtOrd'] = np.vectorize(calc_MuuPtOrd)(data.jet_pt, data.jet_eta, data.jet_phi, data.jet_E,\n",
" data.jet_MV2c10, data.good_jets)\n",
" \n",
" if 'data' not in sample: # if MC\n",
" # calculate total weight of event from all individual weights\n",
" data['totalWeight'] = np.vectorize(calc_weight)(xsec_weight,data.mcWeight,data.scaleFactor_PILEUP,\n",
" data.scaleFactor_ELE,data.scaleFactor_MUON,\n",
" data.scaleFactor_LepTRIGGER\n",
" ,data.scaleFactor_BTAG\n",
" )\n",
" data.drop(['mcWeight','scaleFactor_PILEUP','scaleFactor_ELE',\n",
" 'scaleFactor_MUON','scaleFactor_LepTRIGGER'\n",
" ,'scaleFactor_BTAG' # uncomment this for better Data vs MC match\n",
" ], axis=1, inplace=True)\n",
" \n",
" if 'Zmumu' in sample or 'Zee' in sample:\n",
" # calculate number of heavy-flavour jets at truth level\n",
" data['HF_n'] = np.vectorize(calc_HF_n)(data.jet_trueflav,data.good_jets)\n",
" data.drop(['jet_trueflav'], axis=1, inplace=True)\n",
"\n",
" # drop variables that don't need to be saved to csv\n",
" data.drop(['good_jets','lep_pt','lep_eta','lep_phi','lep_E','lep_charge','lep_type','lep_ptcone30',\n",
" 'lep_etcone20','lep_tracksigd0pvunbiased','met_et','jet_pt','jet_eta','jet_phi',\n",
" 'jet_E','jet_jvt','jet_MV2c10'\n",
" ,'lep_z0'\n",
" ], \n",
" axis=1, inplace=True)\n",
" \n",
" nOut = len(data.index) # number of events passing selections in this batch\n",
" data_all = data_all.append(data) # append dataframe from this batch to the dataframe for the whole sample\n",
" \n",
"\n",
" return data_all # return dataframe containing events passing all selections\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[Back to contents](#contents)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This is where the processing happens"
]
},
{
"cell_type": "code",
"execution_count": 52,
"metadata": {
"scrolled": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Processing data samples\n",
"\tProcessing: data_A\n",
"\t\tnumber of events 185\n",
"\tProcessing: data_B\n",
"\t\tnumber of events 684\n",
"\tProcessing: data_C\n",
"\t\tnumber of events 1007\n",
"\tProcessing: data_D\n",
"\t\tnumber of events 1710\n",
"Processing $t\\bar{t}Z$ samples\n",
"\tProcessing: ttee\n",
"\tProcessing: ttmumu\n",
"Processing $t\\bar{t}$ samples\n",
"\tProcessing: ttbar_lep\n",
"Processing Z samples\n",
"\tProcessing: Zmumu_PTV0_70_CVetoBVeto\n",
"\tProcessing: Zmumu_PTV0_70_CFilterBVeto\n",
"\tProcessing: Zmumu_PTV0_70_BFilter\n",
"\tProcessing: Zmumu_PTV70_140_CVetoBVeto\n",
"\tProcessing: Zmumu_PTV70_140_CFilterBVeto\n",
"\tProcessing: Zmumu_PTV70_140_BFilter\n",
"\tProcessing: Zmumu_PTV140_280_CVetoBVeto\n",
"\tProcessing: Zmumu_PTV140_280_CFilterBVeto\n",
"\tProcessing: Zmumu_PTV140_280_BFilter\n",
"\tProcessing: Zmumu_PTV280_500_CVetoBVeto\n",
"\tProcessing: Zmumu_PTV280_500_CFilterBVeto\n",
"\tProcessing: Zmumu_PTV280_500_BFilter\n",
"\tProcessing: Zmumu_PTV500_1000\n",
"\tProcessing: Zmumu_PTV1000_E_CMS\n",
"\tProcessing: Zee_PTV0_70_CVetoBVeto\n",
"\tProcessing: Zee_PTV0_70_CFilterBVeto\n",
"\tProcessing: Zee_PTV0_70_BFilter\n",
"\tProcessing: Zee_PTV70_140_CVetoBVeto\n",
"\tProcessing: Zee_PTV70_140_CFilterBVeto\n",
"\tProcessing: Zee_PTV70_140_BFilter\n",
"\tProcessing: Zee_PTV140_280_CVetoBVeto\n",
"\tProcessing: Zee_PTV140_280_CFilterBVeto\n",
"\tProcessing: Zee_PTV140_280_BFilter\n",
"\tProcessing: Zee_PTV280_500_CVetoBVeto\n",
"\tProcessing: Zee_PTV280_500_CFilterBVeto\n",
"\tProcessing: Zee_PTV280_500_BFilter\n",
"\tProcessing: Zee_PTV500_1000\n",
"\tProcessing: Zee_PTV1000_E_CMS\n",
"Processing Other samples\n",
"\tProcessing: ZqqZll\n",
"\tProcessing: single_top_wtchan\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
":13: RuntimeWarning: invalid value encountered in sqrt\n",
" return np.sqrt(sumE**2 - sump**2)/1000 # /1000 to go from MeV to GeV\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\tProcessing: single_antitop_wtchan\n",
"\tProcessing: lllv\n",
"\tProcessing: ttW\n",
"\tProcessing: llll\n",
"\tProcessing: llvv\n",
"\tProcessing: Ztautau_PTV0_70_CVetoBVeto\n",
"\tProcessing: Ztautau_PTV0_70_CFilterBVeto\n",
"\tProcessing: Ztautau_PTV0_70_BFilter\n",
"\tProcessing: Ztautau_PTV70_140_CVetoBVeto\n",
"\tProcessing: Ztautau_PTV70_140_CFilterBVeto\n",
"\tProcessing: Ztautau_PTV70_140_BFilter\n",
"\tProcessing: Ztautau_PTV140_280_CVetoBVeto\n",
"\tProcessing: Ztautau_PTV140_280_CFilterBVeto\n",
"\tProcessing: Ztautau_PTV140_280_BFilter\n",
"\tProcessing: Ztautau_PTV280_500_CVetoBVeto\n",
"\tProcessing: Ztautau_PTV280_500_CFilterBVeto\n",
"\tProcessing: Ztautau_PTV280_500_BFilter\n",
"\tProcessing: Ztautau_PTV500_1000\n",
"\tProcessing: Ztautau_PTV1000_E_CMS\n"
]
}
],
"source": [
"data_dict = get_data_from_files(samples) # process all files"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[Back to contents](#contents)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Samples to plot\n",
"\n",
"Define dictionary of samples and colours to use for plotting"
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {},
"outputs": [],
"source": [
"samples_SR = {\n",
"\n",
" 'data': {\n",
" 'list' : ['data_A','data_B','data_C','data_D']\n",
" },\n",
" \n",
" 'Other' : {\n",
" 'color' : \"#79b278\" # purple \n",
" }, \n",
" \n",
" 'Z + 0 HF' : {\n",
" 'color' : \"#ce0000\" # light green \n",
" },\n",
" \n",
" 'Z + 1 HF' : {\n",
" 'color' : \"#ffcccc\" # middle green\n",
" }, \n",
" \n",
" 'Z + 2 HF' : {\n",
" 'color' : \"#ff6666\" # dark green\n",
" }, \n",
" \n",
" r'$t\\bar{t}$' : {\n",
" 'color' : \"#f8f8f8\" # almost white\n",
" }, \n",
" \n",
" r'$t\\bar{t}Z$' : {\n",
" 'color' : \"#00ccfd\" # blue\n",
" }, \n",
" \n",
"}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Regroup Z+jets into Z + 0 HF, Z + 1 HF, Z + 2 HF\n",
"\n",
"paper: \"The simulated Z +\n",
"jets background is split into three components, Z + 0HF,\n",
"Z + 1HF and Z + 2HF, depending on the number of\n",
"reconstructed jets which are matched to a generator-level\n",
"b- or c-hadron (heavy-flavor, or HF jets)\""
]
},
{
"cell_type": "code",
"execution_count": 54,
"metadata": {},
"outputs": [],
"source": [
"data_dict['Z + 0 HF'] = data_dict['Z'][data_dict['Z']['HF_n']==0] # find the events with 0 heavy-flavour partons\n",
"data_dict['Z + 1 HF'] = data_dict['Z'][data_dict['Z']['HF_n']==1] # find the events with 1 heavy-flavour parton\n",
"data_dict['Z + 2 HF'] = data_dict['Z'][data_dict['Z']['HF_n']>=2] # find the events with 2 heavy-flavour partons"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[Back to contents](#contents)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Function to plot Data and MC"
]
},
{
"cell_type": "code",
"execution_count": 55,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"def plot_data(data, samples_to_plot=samples_SR, x_variable='BDT_6j2b_output',\n",
" region_label='2L-Z-6j2b'):\n",
" \n",
" # *******************\n",
" # general definitions (shouldn't need to change)\n",
" \n",
" bin_edges = np.arange(start=-1, # The interval includes this value\n",
" stop=1+0.1, # The interval doesn't include this value\n",
" step=0.1 ) # Spacing between values\n",
" bin_centres = np.arange(start=-1+0.1/2, # The interval includes this value\n",
" stop=1+0.1/2, # The interval doesn't include this value\n",
" step=0.1 ) # Spacing between values\n",
"\n",
" mc_x = [] # define list to hold the MC histogram entries\n",
" mc_weights = [] # define list to hold the MC weights\n",
" mc_colors = [] # define list to hold the MC bar colors\n",
" mc_labels = [] # define list to hold the MC legend labels\n",
" \n",
" main_axes = plt.gca() # get current axes\n",
"\n",
" mc_stat_err_squared = np.zeros(len(bin_centres)) # define array to hold the MC statistical uncertainties\n",
" for s in samples_to_plot: # loop over samples\n",
" if s!='data': # if not data\n",
" mc_x.append( data[s][x_variable] ) # append to the list of Monte Carlo histogram entries\n",
" totalWeights = data[s]['totalWeight'] # get the totalWeight column\n",
" mc_weights.append( totalWeights ) # append to the list of Monte Carlo weights\n",
" mc_colors.append( samples_to_plot[s]['color'] ) # append to the list of Monte Carlo bar colors\n",
" mc_labels.append( s ) # append to the list of Monte Carlo legend labels \n",
" weights_squared,_ = np.histogram(data[s][x_variable], bins=bin_edges,\n",
" weights=totalWeights**2) # square the totalWeights\n",
" mc_stat_err_squared = np.add(mc_stat_err_squared,weights_squared) # add weights_squared for s \n",
"\n",
"\n",
" # plot the Monte Carlo bars\n",
" mc_heights = main_axes.hist(mc_x, bins=bin_edges, \n",
" weights=mc_weights, stacked=True, \n",
" color=mc_colors, label=mc_labels )\n",
"\n",
" mc_x_tot = mc_heights[0][-1] # stacked background MC y-axis value\n",
" mc_x_err = np.sqrt( mc_stat_err_squared ) # statistical error on the MC bars\n",
"\n",
" # histogram the data\n",
" data_x,_ = np.histogram(data['data'][x_variable], bins=bin_edges ) \n",
"\n",
" # statistical error on the data\n",
" data_x_errors = np.sqrt(data_x)\n",
"\n",
" # plot the data points\n",
" main_axes.errorbar(x=bin_centres, \n",
" y=data_x, \n",
" yerr=data_x_errors,\n",
" fmt='ko', # 'k' means black and 'o' is for circles \n",
" label='Data')\n",
"\n",
" # plot the statistical uncertainty\n",
" main_axes.bar(bin_centres, # x\n",
" 2*mc_x_err, # heights\n",
" alpha=0.5, # half transparency\n",
" bottom=mc_x_tot-mc_x_err, color='none', \n",
" hatch=\"////\", width=0.1, label='Stat. Unc.' )\n",
"\n",
" # set the x-limit of the main axes\n",
" main_axes.set_xlim( left=-1, right=1 ) \n",
"\n",
" # separation of x axis minor ticks\n",
" main_axes.xaxis.set_minor_locator( AutoMinorLocator() ) \n",
"\n",
" # set the axis tick parameters for the main axes\n",
" main_axes.tick_params(which='both', # ticks on both x and y axes\n",
" direction='in', # Put ticks inside and outside the axes\n",
" top=True, # draw ticks on the top axis\n",
" right=True ) # draw ticks on right axis\n",
"\n",
" # x-axis label\n",
" main_axes.set_xlabel(x_variable, fontsize=13)\n",
"\n",
" # y-axis label\n",
" main_axes.set_ylabel('Events / 0.1')\n",
"\n",
" # force y-axis ticks to be integers\n",
" main_axes.yaxis.set_major_locator(MaxNLocator(integer=True))\n",
"\n",
" # set y-axis limits for main axes\n",
" main_axes.set_ylim(bottom=0,\n",
" top=np.amax(data_x)+np.sqrt(np.amax(data_x))*5 )\n",
"\n",
" # add minor ticks on y-axis for main axes\n",
" main_axes.yaxis.set_minor_locator( AutoMinorLocator() ) \n",
"\n",
" # Add text 'ATLAS Open Data' on plot\n",
" plt.text(0.05, # x\n",
" 0.92, # y\n",
" 'ATLAS Open Data', # text\n",
" transform=main_axes.transAxes, # coordinate system used is that of main_axes\n",
" fontsize=13 ) \n",
"\n",
" # Add text 'for education' on plot\n",
" plt.text(0.05, # x\n",
" 0.87, # y\n",
" 'for education', # text\n",
" transform=main_axes.transAxes, # coordinate system used is that of main_axes\n",
" style='italic',\n",
" fontsize=8 ) \n",
"\n",
" # Add energy and luminosity\n",
" lumi_used = str(lumi*fraction) # luminosity to write on the plot\n",
" plt.text(0.05, # x\n",
" 0.81, # y\n",
" '$\\sqrt{s}$=13 TeV, '+lumi_used+' fb$^{-1}$', # text\n",
" transform=main_axes.transAxes ) # coordinate system used is that of main_axes\n",
"\n",
" # Add a label for the analysis region\n",
" plt.text(0.05, # x\n",
" 0.75, # y\n",
" region_label, # text \n",
" transform=main_axes.transAxes ) # coordinate system used is that of main_axes\n",
"\n",
" # draw the legend\n",
" main_axes.legend(ncol=2, # number of columns\n",
" frameon=False ) # no box around the legend \n",
"\n",
"\n",
" "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[Back to contents](#contents)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## BDT in 6j2b region\n",
"\n",
"In order to separate signal from background, boosted decision trees (BDTs) are used. The BDTs are\n",
"constructed and trained against all the contributing backgrounds, using as input 17 variables for 2l-Z-6j2b. You can see the variables used in Table 11 of the ATLAS published paper [Measurement of the $t\\bar{t}Z$ and $t\\bar{t}W$ cross sections in proton-proton collisions at $\\sqrt{s}$ = 13 TeV with the ATLAS detector](https://journals.aps.org/prd/pdf/10.1103/PhysRevD.99.072009).\n",
"\n",
"We start by using fewer variables, you can add the other variables back in later."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Choose variables for use in BDT"
]
},
{
"cell_type": "code",
"execution_count": 56,
"metadata": {},
"outputs": [],
"source": [
"BDT_6j2b_inputs = ['ptll','pt4_jet','pt6_jet','dRll','NJetPairsZMass','Nmbjj_top','MjjMindR','MbbPtOrd',\n",
" 'CentrJet','dRjjave_jet','MaxMMindRlepb','HT_jet6','eta_ll','MWavg','dRbb','pT1b'\n",
" ,'H1' # try add these variables first\n",
" #,'pt5_jet','MuuPtOrd' # maybe try add these variables later?\n",
" ] # variables to use in BDT"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Select the data in the 6j2b Signal Region."
]
},
{
"cell_type": "code",
"execution_count": 57,
"metadata": {},
"outputs": [],
"source": [
"# define function to select only events with leptons of same type, at least 6 jets and at least 2 b-tagged jets\n",
"def select_6j2b_SR(channel, goodjet_n, bjet_n):\n",
" if channel==143: return True # throw away if emu\n",
" if goodjet_n<6: return True # throw away if fewer than 6 jets\n",
" if bjet_n<2: return True # throw away if fewer than 2 b-tagged jets\n",
" return False # keep this event if it gets to this stage"
]
},
{
"cell_type": "code",
"execution_count": 58,
"metadata": {},
"outputs": [],
"source": [
"data_6j2b_SR = {} # define empty dict\n",
"for s in samples_SR:\n",
" fail = data_dict[s][ np.vectorize(select_6j2b_SR)(data_dict[s]['channel'], data_dict[s]['goodjet_n'],\n",
" data_dict[s]['bjet_n']) ].index # get events that fail the selection\n",
" data_6j2b_SR[s] = data_dict[s].drop(fail).copy()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[Back to contents](#contents)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Organise data ready for BDT"
]
},
{
"cell_type": "code",
"execution_count": 59,
"metadata": {},
"outputs": [],
"source": [
"# for sklearn data is usually organised \n",
"# into one 2D array of shape (n_samples x n_features) \n",
"# containing all the data and one array of categories \n",
"# of length n_samples \n",
"\n",
"all_MC = [] # define empty list that will contain all features for the MC\n",
"all_y = [] # define empty list that will contain labels whether an event in signal or background\n",
"for s in samples_SR: # loop over the different keys in the dictionary of dataframes\n",
" if s=='data': continue # skip data for BDT training\n",
" all_MC.append(data_6j2b_SR[s][BDT_6j2b_inputs]) # append the MC dataframe to the list containing all MC features\n",
" if s==r'$t\\bar{t}Z$': # only signal MC should pass this\n",
" all_y.append(np.ones(data_6j2b_SR[s].shape[0])) # signal events are labelled with 1\n",
" else: # only background MC should pass this\n",
" all_y.append(np.zeros(data_6j2b_SR[s].shape[0])) # background events are labelled with 0\n",
"X = np.concatenate(all_MC) # concatenate the list of MC dataframes into a single 2D array of features, called X\n",
"y = np.concatenate(all_y) # concatenate the list of lables into a single 1D array of labels, called y"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[Back to contents](#contents)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### The Training and Testing split (6j2b)\n",
"One of the first things to do is split your data into a training and testing set. This will split your data into train-test sets: 75%-25%. It will also shuffle entries so you will not get the first 75% of X for training and the last 25% for testing. This is particularly important in cases where you load all signal events first and then the background events.\n",
"\n",
"Here we split our data into two independent samples. The split is to create a training and testing set. The first will be used for training the classifier and the second to evaluate its performance.\n",
"\n",
"We don't want to test on events that we used to train on, this prevents overfitting to some subset of data so the network would be good for the test data but much worse at any *new* data it sees."
]
},
{
"cell_type": "code",
"execution_count": 60,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.model_selection import train_test_split\n",
"\n",
"# make train and test sets\n",
"X_train, X_test, y_train, y_test = train_test_split(X, y, \n",
" random_state=492 )"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[Back to contents](#contents)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Training Decision Trees (6j2b)\n",
"\n",
"We'll use SciKit Learn (sklearn) in this tutorial. Other possible tools include keras and pytorch. \n",
"\n",
"After instantiating our GradientBoostingClassifier, call the fit() method with the training sample as an argument. This will train the tree, now we are ready to evaluate the performance on the held out testing set."
]
},
{
"cell_type": "code",
"execution_count": 61,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"GradientBoostingClassifier()"
]
},
"execution_count": 61,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from sklearn.ensemble import GradientBoostingClassifier # BoostType\n",
"\n",
"bdt_6j2b = GradientBoostingClassifier()\n",
"bdt_6j2b.fit(X_train, y_train) # fit BDT to training set"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The fit() method returns the trained classifier. When printed out all the hyper-parameters are listed."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[Back to contents](#contents)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Apply the trained BDT to all Data and MC"
]
},
{
"cell_type": "code",
"execution_count": 62,
"metadata": {},
"outputs": [],
"source": [
"bdt_outputs = bdt_6j2b.decision_function(X) # apply bdt to all X\n",
"min_bdt_output = min(bdt_outputs) # get minimum BDT output score\n",
"max_bdt_output = max(bdt_outputs) # get maximum BDT output score"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Map the BDT decision function to the range [-1,1]"
]
},
{
"cell_type": "code",
"execution_count": 63,
"metadata": {},
"outputs": [],
"source": [
"for s in samples_SR: # loop over samples\n",
" X_s = data_6j2b_SR[s][BDT_6j2b_inputs] # get the BDT input features\n",
" bdt_output_on_X_s = bdt_6j2b.decision_function(X_s) # get decision function for this sample\n",
" mapped_bdt_output_on_X_s = (bdt_output_on_X_s-min_bdt_output)/(max_bdt_output-min_bdt_output)*2 - 1 # map to [-1,1]\n",
" data_6j2b_SR[s]['BDT_6j2b_output'] = mapped_bdt_output_on_X_s # save BDT output"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[Back to contents](#contents)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6j2b Signal Region plot"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Use the plot_data() function defined [above](#plot_data) to compare Data with MC in BDT outputs in the 6j2b Signal Region. \n",
"\n",
"The Figure below shows the BDT output distribution in the signal\n",
"region 2l-Z-6j2b."
]
},
{
"cell_type": "code",
"execution_count": 64,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYYAAAEGCAYAAABhMDI9AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAABHrElEQVR4nO3deXxTVfr48c9jy1aKIJRFQVpxhQpUqOCGFDdwKQwqbsUBBasy7iijw4zrtzOOC6g/R6WiolAFRR0FFQSkiAsoYAHBURQLggutWLZKgfb5/ZHFpE3SpE2atH3er1deTc4999yT2zYn99xzniOqijHGGONyULQrYIwxJrZYw2CMMcaLNQzGGGO8WMNgjDHGizUMxhhjvFjDYIwxxkt8tCvgS2Jioh533HEB8+zYsYPWrVvXeHs4yigqKqJ9+/YRPUY43kdDqWdd/E6DOUZjqWddvI/6Us+6+B8KRxkrV67craqtAh4kGKoac4+EhAStzjXXXFOr7eEoo2/fvhE/RjjeR0OpZ138ToM5RmOpZ128D9X6Uc+6+B8KRxnAHg3DZ3C97UrKzMys1fZwlRHpY4TjfQSjPtSzLn6ndXEuw1GPWPidh6uMujiG1TM0ojE487lly5a6Z8+eaFejWunp6axYsSLa1aiW1TO8rJ7hVR/qWR/qCCAiparasrblxOQVQ8uWLcnOzmbOnDnRrkpA2dnZ0a5CUKye4WX1DK/6UM9Yr+OcOXNcdQzLN+qYvGJIT0/X+tA6G2NMLBGRlaqaXttyYvKKwRhjTPREtGEQkTYiMltE/iciX4nIyR7bxouIikhSJOtgjDEmNJGex/A4ME9VLxaRpkACgIgcDpwDbI7w8Y0xxoQoYlcMItIaOB14DkBV96lqiXPzZGACUCc3OHJychARXnzxRQA2b95MYmKi+xEfH0/Tpk3dr1NTUwFISUlhxowZAcv++OOPERGuuuqqKtuKiooYM2YMnTt3JjExkUMPPZRzzz2Xn376yW95FRUVTJo0idTUVBISEmjXrh0XXXQR69evr8UZqL3CwkJEhJYtW9KqVSsOOeQQ0tPTuffee9mxY0fQ5eTn5xMfH5PzKk0jM2DAANLS0khPr3WXfIMTya6kI4Ai4AUR+UJEpopISxEZBmxV1dURPLZbRUUFzz77LG3btiU3NxeArl27snv3bvcjIyODv/3tb+7X69atC7r8KVOm0LZtW1599dUqH5AjR45k165dfPHFF+zevZvVq1dz+eWXIyJ+y7vqqquYNGkSkydPpqSkhC+//JKOHTvSv39/1qxZU7OTEEZff/01u3btYtu2bTzxxBMsWrSI9PR0fv3112hXzZiQLF26lIKCgnoxDLXOhWOWnK8HkA4cAPo7Xz8OPAwsB1o70wqBpMr7du3aVfv27et+TJkyJeBsv0DeffddjY+P17lz5yqga9eurZLnzDPP1HvuuadKenJysk6fPt1v2du3b9fmzZtrXl6etmvXTv/f//t/XtsTExP1nXfeCbquS5cuVUDz8/OrbMvIyNAzzzzT/RrQyZMna+/evTUxMVEzMjJ0w4YN7u379+/XnJwcPfroo7V169Z6yimn6Oeff+7ePmrUKB05cqSOHTtWW7durYcddpg+88wzfuv2/fffK6A//PCDV/qvv/6q7du317/97W+qqrpnzx4dPny4duzYUVu1aqUnnHCCvv/++6qqunXrVm3evLkC2rJlS23ZsqVOmzZNVVVHjx6tXbp00cTERO3evbvm5eUFfd6MacymTJni/qwECjUcn9/hKMRnwdDJs5LAAGARsM3ZIBQ6G47NQCfPfYOZfh6sP/3pT5qZmamqqr169dIbbrihSp6aNgyPPfaYJiUlaVlZmd50003as2dPr+3nnXee9ujRQ6dMmaKrVq3SAwcOBKzrXXfdpV26dPG5berUqRoXF6elpaWq6mgYunfvrhs2bNDS0lL9y1/+ot27d3cf429/+5v269dPv/vuOz1w4IBOnTpV27Vrp9u3b1dVR8PQvHlzfeutt7S8vFxff/11jY+P18LCQp/H99cwqKpeccUV2r9/f1VV3bVrl06fPl137typ+/bt04ceekhbtWql27ZtU1XVxYsXa1xcnM/3V1xcrAcOHNBXXnlFmzRpouvWrQt4vowJ1Q8//KAzZ86s8ryhAFZoLIfEUNWfgR9E5Fhn0pnAKlXtoKopqpoCbAH6OPOG3Y8//sjcuXO5+uqrARgzZgwzZszg999/D0v5ubm5ZGVl0bRpU8aMGcPatWv59NNP3dtnzZrFyJEjeeGFFzjllFNo164dt9xyC3v37vVZXlFREZ07d/a57bDDDqO8vJzt27e708aPH89RRx1FixYteOihh/juu+9Yvnw5qsoTTzzBww8/TLdu3YiLi2PMmDEceuihvPPOO+79zzjjDIYOHcpBBx3EhRdeSJs2bSgoKAj5PHTp0sXdlZSYmMjIkSNp1aoVTZo04Y477qBp06Z8/vnnAcsYM2YM7dq1Iy4ujssuu4xevXqRn58fcl3qky1btjBs2DCOPvpojjzySG6++Wb27dtHQUEB7777rjvfvffeyyOPPBLFmjYcixYtYtWqVVWeG2+Rvgt4I5DnHJG0Eah6hzaCnnvuOdq2bcsFF1wAOPr8J0yYwKxZsxg9enStyl66dCnr16/nlVdeAaBXr16kp6czZcoUTj7ZMSo3MTGRu+66i7vuuot9+/Yxb948rrzySg4++GDuv//+KmW2b9+erVu3+jzejz/+SFxcHG3btnWnpaSkuJ8nJCTQvn17tmzZQnFxMbt37yYzM9Prfsb+/fvZsmWL+/Whhx7qdYyWLVuya9eukM/Fli1baNeuHQC///47d9xxB++++y7FxcUcdNBB7Nq1i6KiIr/7V1RUcO+99zJr1ix+/vlnRIQ9e/YE3Cfc/jbnrrCW98/MfwXcrqpceOGFXH/99bz11luUl5eTnZ3NxIkTSU1NZcWKFZx33nlhqUt5eTlxcXFhKStcysrKwlpes2bNqs3z0Ucfcdttt9GmTRseffRRWrVqRdu2bZk/fz5vvPEG3bp1C2ud6rOIzmNQ1QJVTVfVXqr6J1X9rdL2FFUtjsSxKyoqeO655ygpKaFLly506tSJHj16UF5ezpQpU2pdvutG9jnnnEOnTp3o1KkT69ev59VXX6WkpKRK/qZNmzJ06FDOOussv9/KhwwZwpYtW1i6dGmVbS+//DIDBw6kRYsW7rTCwkL389LSUoqKiujSpQtJSUm0bNmShQsXUlJS4n7s2bOHO++8s1bvu7LffvuNBQsWcMYZZwAwadIkPvzwQxYtWsSOHTsoKSnhkEMOcXUnctBBVf/kXnnlFaZOncrrr7/Ob7/9RklJCb1793bv0xB98MEHNG/e3D2aLS4ujsmTJzN16lT3l5e0tDRmzZoFwPr168nIyKBbt2488cQT7nJmzJhBv379SEtL49prr6W8vBxwfCkZP348vXv39rqKbcxOO+00TjzxRN566y0OHDhA//79eeuttygoKLBGoZIGO/N53rx5/PDDD3zyyScUFBS4H3PnzmXZsmWsXbs2qHL279/P3r173Y+ysjK2b9/O7Nmz+c9//uNV9vr162nevDnTp08H4LbbbuPzzz9n7969VFRUkJ+fz+LFixkwYIDPY51++ulcccUVZGVlsXDhQvbt28fPP//MjTfeyPLly6t0J0yePJnvvvuOvXv3cuedd9KtWzf69++PiHDzzTdz++23s2HDBgB2797N/Pnz+fHHH2txVv9w4MABli1bxvDhw2nVqhW33XYbADt37qRZs2a0a9eOffv2cf/993s1lJ06daK8vJzvv//enbZz507i4+Np3749FRUVPP/886xeXSeD1qJm3bp19O3b1yvt4IMPJiUlhb///e9ceumlFBQUcOmllwLwv//9j/nz5/PZZ59x3333sX//fr766itmzZrFxx9/TEFBAXFxceTl5QGwZ88e+vfvz+rVqznttNPq/P3Fqq+//hrXWi+ez423mGwYduzYUesgelOmTOFPf/oTffv2dX+j79SpE4MHD+bkk08O+qrh6quvpkWLFu5H69atefHFFznkkEMYO3asV9nJyclcd9117rIrKiq46qqr6NChA4cccgjjxo3j9ttvZ/z48X6P99JLL3HTTTdx00030aZNG3r06MGWLVtYtmwZJ5xwglfesWPHcuGFF9K+fXtWr17NW2+95e4yuO+++xg2bBjDhg3j4IMP5uijj+aZZ56hoqKihmfU4dhjj6VVq1YkJSUxbtw4BgwYwMqVK0lKckxgd12qH3bYYRx55JEkJCR4dXkdc8wxXH/99fTr1482bdowffp0Ro0aRf/+/TnqqKPo3Lkz69ev99t4Nlbnn38+zZo1IykpiQ4dOvDLL7+waNEiVq5cyYknnkhaWhqLFi1i48aNgOMK5KKLLopyrWNLcXExrVu3Jj4+3ut5Q+ARRC/wSkBBsiB69ZSIsHTpUvs2GCZ1fY9h4cKF3H///Xz44YfutJ07d3LEEUfwwAMPsH79ep588knAcfM5MTGR22+/HYDjjz+euXPnMmfOHH788Uf+9a+qx0pMTGT37t1hfEfhFY17DCtWrODuu+/m3Xff9XrekFgQPWPqsTPPPJPS0lJeeuklwHGDePz48YwePZqOHTsGNQjgzDPPZPbs2Wzbtg2A7du3s2nTpojWuz477rjjKC4u5vjjj6e0tNT9/JNPPol21WJOw7iOMqaeERHefPNNxo0bxwMPPEBFRQXnnXce//znP9mzZw8PPvggaWlp3HWX/yuZHj168H//93+cc845VFRU0KRJE/7zn/+QnJxch++k/khMTOSzzz5zv/Z8brw16K6kb7/9liuvvJK+ffu6L8tr67///S/btm2r1cIdr7zyCpdccglxcXE89NBD7jH8xhhTG9aVFITp06czefLkahuFUG7IfvXVV+4gezX1/PPPu28ST5gwwRoFY0xMabANwwcffMCTTz7JnXfeyerVq7nllls45ZRTGDJkCHv27GHVqlUMHjyY4cOHe41+2rp1K8OGDWPAgAHce++9AHzxxRecdtppDB48mI8//pjU1FQmTJjgDmo3btw4Nm7cSGlpKX/+858ZNGgQF198MeAYpeOK4vjjjz/y5JNP8sUXX5CRkcH//vc/zj33XMAxvPakk06if//+LF68GICBAwdy880306NHD/cwRGOMibhwxNUI9+Ooo47Sa665Rt9+++2ahgxRVdXBgwerquq0adP0wQcfVFXVxx9/XF955RV98cUXdezYsVX2ueSSS/Sbb75RVdXMzEwtKyvTwYMHa3Fxse7atUuPOOIIVVW94IILdO/evaqqOmTIEK2oqNDbbrtN586dq6qq+/btU1XV3bt3q6rq888/ry+//LJ+9913+pe//EVVVTdt2qTXXHON7tixQzMyMrS0tFRLSkp0yJAhWlZWpsnJyfrbb7/pxo0b9c9//nOtzoUxpuF6++239ZprrlFgg4bhMzgmbz63bt3aPbO4psrKymjevDngGOP7/PPPA46Zty1atGDVqlVV7hPs27ePJUuWcM011wDw66+/snnzZo455hjatWvH3r173RNiSktLadasGapKRUUFIsK6det49NFHAWjSpAlbt25l/Pjx/PLLL2zZsoWpU6eyZs0aevXqBcCXX37J8ccfz4cffsiFF15IixYtKC8vp3nz5nzzzTecf/75tGnThg0bNnDEEUfU6nwYYxquzMxMMjMzefbZZ4NfHCWABtuV9NVXX9G9e3fAMRRw//79lJWV8frrrzNw4EC+/vpr9we0y+7duzn55JPJz88nPz+fVatWUVJS4g4zkJuby7HHHsu+ffvc9wiWLFniDnxXUlLiDuNw4MABHnjgAW666SYWLFhAp06d6NmzJ1999RXHHuuIK+hqGA4cOMC+ffvcxzj//PNZt24daWlpAKxevbrW9zWMMSZYDbZhWLt2LT179gTg5ptv5txzzyUjI4Nbb72VNm3aUFZWVmVSTNu2benWrRunnnoqZ511Fu+88w5paWl89dVXnHHGGcyePZsePXrQtGlTmjZtyi233ML06dM5/vjjAbjooovo168fp59+Oj/88AOnnHIK48aN4+abb2b//v20bduWXr16ceONNzJt2jR3wzB48GAWLFjAqaeeyoYNG7jqqqtYt26du+Fau3at+xjGGBNpDXq4qjGx6s033+S+++7zSluzZg3vvPOOe0BCTZSVlfHnP/+ZlStX0q5dO2bNmuUVkgQcwRcvuOACvvzyS3ea5+zq0aNHs2TJElq3dkRXuPrqq7nppptqXKdYNWDAAHbt2kV8fHyDWcUtXMNVY/IegzF17YsAy63WxAnVfOEaPnw4w4cPd7/Ozc0lLy+PwYMH+90nPz+fadOmMW3aNL95nnvuOQ455BC+/fZbZs6cyV//+ld3hNZQPPzww+6RdQ2VryjGxiEmu5LCEUTPmPrim2++4f7772f69Ok+w5KH4q233mLUqFEAXHzxxSxatKhBhy83DuEOoheTDYNrVFJmZma0q2JMRO3fv58rrriCRx99lK5du9a6vK1bt3L44YcDEB8fT+vWrd2r63n67rvvSEtLcz+eeeYZr+133HGHe1uwIerrgy1btrivoDyfA1x11VVe56RTp05eC2PFsszMTNdIzrCMSrKuJBMRGzduJCcnhx07djB79uxoVydm/eMf/yA1NdW97oIv/fv3p6ysjN27d7N9+3b3aLV///vfAbueAjnyyCO9FoxyTeZ0aahdSYsWLWL9+vVceumlXs8BXnjhBXe+77//ngEDBtR62Hx9FdErBhEpFJG1IlIgIis80m8Ukf+JyDoReSiSdajsyiuvRESCegRy9dVX06FDB6/RQnv37qVfv3707t2b1NRU7rnnnir7/frrr17fSDp37ux+7RqyWtmgQYOYP3++V9pjjz3G9ddfH3IdXebNm8exxx7LUUcdxYMPPuhz/yeeeILu3buTlZVFYWFhSCOjunXrxnPPPRd0/sYoPz+f119/vdqQLcuXL6egoICpU6cydOhQ98JQvhqFzp0788MPPwCOIdM7duyIyZArsrIirI9guJb2nD17NvHx8dxyyy3Mnj2btLQ09zoW4Fi3YciQIfzjH/9g6NChkToFMa0uupIGqWqa6065iAwChgG9VTUVqLNVzn/66SeOPfbYoGf/BTJ69GjmzZvnldasWTM++OADVq9eTUFBAfPmzWPZsmVeedq1a+f+x77uuuu49dZb3a+bNm3q81iXX345M2fO9EqbOXMml19+ech1BMe8jr/85S+899577nWr169fXyXfU089xYIFCwKG41i7di0XXHCB18MVBtr499tvv3HVVVfx0ksv0apVq7CVO3ToUF588UUAZs+ezRlnnFHtl5zGIpilPUtLS8nMzOSSSy7h2muvjXKNoyca9xiuBx5U1TIAVa2zT5FZs2aRlZXlfv3iiy/St29fevXqFfKCN6effnqV/kcRITExEXD0He/fvz/kf0pfa/hefPHFvPPOO+4risLCQn788cdqVznzVUdwhBs+6qij6NatG02bNuWyyy7jrbfe8spz3XXXsXHjRs4991wmT54MOL6BZmVl0b17dy6++GJKS0vp2bMnc+fO9Xp06NAhpPfcGD3zzDNs27aN66+/3qtfuyYjiDyNGTOGX3/9laOOOopJkyb5vRpsrAIt7VleXs5ll13GcccdxwMPPBCtKsaGcMTVCPCN+3tgFbASyHamFQD3AcuBJcCJlffr27dvbUOHqKpqcXGxzps3z/36lltucT/fuXOndu/eXcvKylRV9bfffvPa97TTTtPevXtXeSxYsMCd5/vvv9fU1FSv/Q4cOKC9e/fWli1b6oQJEwLW75577tGHH37Y/Xr9+vV6wQUXuOMsXX/99friiy+qqur555+v//3vf1VV9V//+peOHz8+qHPgq46vvfaajhkzxv36pZdecsdv8pScnKxFRUXucgD96KOPVFX1qquu8qp7ZcXFxXrttddqt27d9J///GdQdTWNByvKw/oIRlFRkfbu3bvKc5fs7GwdMmSI7t+/P8zvtu4AK7QexEo6TVW3ikgHYIGI/A/HDe+2wEnAicCrItLN+aYAKCoqIj39jzka2dnZNVr/oLCwkNtvv53+/fvz008/uWdCg2NN3N9//53x48czatQor+NBzcc4x8XFUVBQQElJCcOHD3fPbg6G5xq+AL///rv727erO2nYsGHMnDkzKv33hx9+OKeeeioAI0eO5IknnnAvN1lZu3btqox0MSaaCgsLOeyww6o8B8ca6StXriQ/P7/erQOdm5vreZM8KRxlRvQMqOpW589tIvIm0A/YArzhbAg+E5EKHG+myLVf+/btwzITsW/fvowYMYLXXnuNX375hZtvvtm9LSEhgS+//NI9/nfs2LGMGzfOvd01K7KyRx55hLPOOqvaY7dp04ZBgwYxb968oBsGVWXUqFE+1/AdNmwYt956K6tWraK0tJS+ffsGVaYvnjcowTFszxXvKZDK3WLWd23qE8+lPZ966in389zcXO69915SUlK8upSPPfbYWnft1QXPL84iUhyOMiPWMIhIS+AgVd3lfH4OcD+wGxgELBaRY4CmQFjejC8jR45k9OjRnHbaaV43+TZs2MDRRx/NZZddxvr169m7d6/XfjW5YigqKqJJkya0adOG33//nQULFvDXv/416P3PPPNMdwPQoUMHtm/fzq5du0hOTiYxMZFBgwZx9dVXV7npfOaZZ/LSSy8F9eEOcOKJJ7Jhwwa+//57OnfuzMyZM3n55Zer3W/z5s18+umnnHzyybz88ssh35cxJpoCLe3p0WFhiOwVQ0fgTee3ynjgZVWdJyJNgedF5EtgHzBKI/hb6datG+Xl5WRkZHil5+Tk8Omnn9KyZUtSU1N59tlnQyr38ssvJz8/n+LiYrp06cJ9993HiSeeyKhRoygvL6eiooJLLrmECy64IOgyq1vD9/LLL2f48OFeI5QqKir49ttvfd5k9lXHMWPGEB8fz5NPPsngwYMpLy/n6quvDip667HHHst//vMfrr76anr06FHtcFlj/NG+MTm31jg1iiB6H330Eaecckqtww3Eoi+//JLnn3+eSZMmRbsqxpgoC1cQvUbRMBhjTGMQroYhJr9CWxA909C9+eabXvMX0tLSOOigg3jvvfdqVe6HH35Inz59iI+PDxiKxDXfxmXatGnccMMNgCM8hueM/DvvvLNWdTKRF+4genbFYAzAypXhLS/EUWOusNuLFy/22+UZTNjtwsJCdu7cySOPPMLQoUP9xjtKTExk9+7d7tfTpk1jxYoVPPnkk15rM5j6xdZjMKaBcIXd/uSTT2p9H8y1KE9DvJ9m6o41DMZEUbjDbgfr999/d0dpBdi+fbtXwLjJkyczY8YMoHZRXGvDNZIwPz+/zo/d2DXorxU//PADgwYNokePHqSmpvL4448DjuBygfpfy8vLq/T/JiUl+Q2NvGbNGk4++WRSU1Pp2bOne07EeeedR0lJid96gOOP37rNGq9gw26npaUxduxY3n77bfffZOWIu6Fo0aKFO3hjQUEB999/v9d2z+CO0WgUIiEuLo60tDRSU1Pp3bs3jz76KBUVgSOzFhYWBjXHp6Fp0FcM8fHxPProo/Tp04ddu3bRt29fzj777Gr3c4W1cPnpp5/o168f//jHP6rkPXDgACNHjmT69On07t2bX3/9lSZNmgDw7rvvAo5vZ77q0aNHj/C8UVMvucJur1q1KmC+5cuXu/NXd4+hocjLy2PZsmWUlZWRkpJCTk6OVwDMmnA1hgDbtm3jiiuuYOfOnVXW3vbkahiuuOKKWh27vmnQVwyHHnooffr0AaBVq1Z0796drVu3hlSGK0zFHXfc4TO0xfvvv0+vXr3o3bs34IgRFBcXBzj6e4uLi6utx/Tp00lLS+P444/3mo1pGq5Ihd1uCPLy8sjOzqasrAyATZs2kZ2dHTD8e6g6dOhAbm4uTz75JKpKYWEhAwYMoE+fPvTp04dPPvkEgDvvvJOlS5eSlpbG5MmT/eZraGKyYYjEcNXCwkK++OIL+vfvH9J+kydPJj4+nhtvvNHn9m+++QYRYfDgwfTp04eHHgq87pCvepSWllJQUMBTTz3F1VdfHVL9TP0UqbDbn3/+OV26dOG1117j2muvDWpGe6yZOHEipaWlXmmlpaVMnDgxrMdxRUXYtm0bHTp0YMGCBaxatYpZs2Zx0003AfDggw8yYMAACgoK3KFqfOWLtnAPV41o2O2aPsIVdttl165d2qdPH3399ddVVXXUqFH62muvVbtfQUGBHn744frzzz/7zfPwww9rSkqKFhUV6Z49e/Skk07ShQsXqqp32Gpf9VBVHThwoC5atMj9+vDDD68SAtyYxkREFKjyEJFalduyZcsqaa1bt9aff/5ZS0pKdOTIkXr88cdr7969tUWLFqqqunjxYj3//PPd+f3lixWEKex2TF4xhNP+/fu56KKLyMrK4sILL/Sbz7UQ+HnnnQc47gtkZWXx9NNP07FjR3c+z4lJK1asoEuXLpx++ukkJSWRkJDAeeed57PPOFA9LGqpMX/wNzor3KO2Nm7cSFxcHB06dGDy5Ml07NiR1atXs2LFCr/L7Aabr75r0A2DqjJmzBi6d+/ObbfdFjDvCy+8QEFBgfuG8e23387AgQM5//zzvfINHz7cPVojPT2dwYMHs3btWkpLSzlw4ABLliypclO5unq4ug8++ugjWrduTevW4bkaNKY+ysnJISEhwSstISGBnJycsB2jqKiI6667jhtuuAERYceOHRx66KEcdNBBTJ8+nfLycsBxT9Az/L6/fA1OOC47wv0IV1fS0qVLFdCePXu6V2B75513dNSoUdq2bVvt3Lmzdu7cWU866SSv/bZu3aqAHnfccV6rt11xxRU+jzN9+nTt0aOHpqam6h133OFOT05O1uLiYr/1UHV0Jd18882alpamqampunz58rC8d2PqsxkzZmizZs0U0OTkZJ0xY0atyzzooIO0d+/e2qNHD+3Vq5c+/PDDWl7uWP3tm2++0Z49e2qvXr10woQJ7m6nffv26aBBg7RXr146adIkv/liBWHqSrKQGBFSXl5Ohw4d+Pnnn93DV40xwbMJbqFr0CExXKOSMjMzyczMjHZ1aiQ1NZWxY8dao2BMDVmDELw5c+a4RnFaED1jjDF/aNBht41p6CIVdnvSpEn06NGDXr16ceaZZ7Jp0yaf+Szstgkk4l1JIhIHrAC2quoFInIm8DCORmk3MFpVv410PYwJ6Nprw1velCkBNw8fPpzhw4e7X7vCbgeKSxRMSIwTTjiBFStWkJCQwNNPP82ECRNqNGnu1ltvtbDbjVhdXDHcDHzl8fppIEtV04CXgb/XQR2MiVmusNvTp0+vdbjsQYMGuYd6nnTSSWzZsiUcVTSNTEQbBhHpApwPTPVIVuBg5/PWwI+RrIMxsSySYbefe+45zj33XJ/bXGG3XY+7777ba/vkyZPDEsXV1E+RvmJ4DJgAeMa2HQu8KyJbgCuBByvvVFRURHp6uvuRm5sb4WoaEx2RCrs9Y8YMVqxYwR133OFze2MMu52Tk0Nqaiq9evUiLS3NHbX2scceqxKbyZdg8/kK61/5nk445ebmuj8rgaRwlBmxhkFELgC2qWrlNRNvBc5T1S7AC8Ckyvu2b9+eFStWuB/O4FDGNCiusNtPPvlkwHzLly+noKCAqVOnMnTo0Go/sBcuXEhOTg5vv/02zZo1i0TVwyY/P5/27dtXGZoaanp1Pv30U+bOncuqVatYs2YNCxcu5PDDDwfC3zDUtezsbPdnJVAcjjIjecVwKjBURAqBmcAZIvIO0FtVlzvzzAJOiWAdjIlJkQq7/cUXX3Dttdfy9ttv06FDh7CVGwn5+fmMGDGC1157zT2Zrabp1fnpp59ISkpyN5RJSUkcdthhPPHEE/z4448MGjSIQYMGAXD99deTnp5Oamoq99xzD4DPfDV9zxkZGVx88cUcd9xxZGVl4Zoy8Pnnn3PKKafQu3dv+vXr5xWKo65FrGFQ1btUtYuqpgCXAR8Aw4DWInKMM9vZeN+YNqZRiFTY7TvuuIPdu3czYsQI0tLSvJbrjCXhbBRGjBhR7fHOOeccfvjhB4455hjGjRvHkiVLALjppps47LDDWLx4MYsXLwYcXU4rVqxgzZo1LFmyhDVr1vjMV1NffPEFjz32GOvXr2fjxo18/PHH7Nu3j0svvZTHH3+c1atXs3DhQlq0aFGr49RKOOJqVPcAMoC5zufDgbXAaiAf6FY5f7jDbhtjYktSUpIuXrzYK23x4sVhSffnwIEDunjxYr377ru1Y8eO+sILL6hq1fD4Tz/9tJ5wwgnas2dPTUpK0ldeecVnPn9Gjx5dJax/YmKiu85nnXWWO/26667T6dOn65o1a/SUU04J6n0EQphiJdVJSAxVzXc2Aqjqm8CbdXFcY0xsCteVQuX0QOLi4sjIyCAjI4OePXvy4osvMnr0aK8833//PY888giff/45hxxyCKNHj3av4R6sdu3a8dtvv7lfb9++naSkP+4Je973iYuL48CBAyGVXxds5rMxps7VdaPw9ddfs2HDBvfrgoICkpOTAe/Q2jt37qRly5a0bt2aX375xWsmeuUQ3IHe26xZs9xrNUybNq3a+xLHHnssP/30E59//jkAu3btimqDYUH0jDFRUxeNAsDu3bu58cYbKSkpIT4+nqOOOso9DD47O5shQ4a47yGccMIJHHfccRx++OGceuqp7jIq5xs7dizXXXeda5io2wUXXMDKlSvp27cvcXFxHHnkkTzzzDMB69e0aVNmzZrFjTfeyO+//06LFi1YuHAhO3fuZOzYse51YvyxIHrGGGN8siB6xhhjIsIaBmOMMV6sYTDGGOPFGgZjjDFeYrJhcI1Kct5lN8YYE8CcOXNcMeVsVJIxxpg/2KgkY4wxEWENgzFh5Aq5YEx9Zg2DMcYYL9YwGGOM8VKjhkFEzg53RYwxxsSGml4xPBfWWlRiw1WNqR2719G4hHu4qt/oqiLytr9NQLtwHNyf1q1buyMfGmOMCcwVifrZZ5/dEY7yAoXdHgCMBHZXShegXzgObowxJvYEahiWAaWquqTyBhH5OtgDiEgcsALYqqoXiMgRwEwcVx0rgStVdV9o1TbGGBMpfu8xqOq5qupz1WtVPT2EY9wMfOXx+t/AZFU9CvgNGBNCWcYYYyIsosNVRaQLcD4w1flagDOA2c4sLwJ/imQdjDHGhKamw1WDvTP8GDABqHC+bgeUqKprMdMtQOfKOxUVFZGenu5+2I1oY4zxLTc31/1ZCSSFo8yarvk8pboMInIBsE1VV4pIRiiFt2/fHguiZ4wx1cvOznYNVUVEisNRZo0aBlVdGUS2U4GhInIe0Bw4GHgcaCMi8c6rhi7A1prUwRhjTGT47UoSkdYi8qCI/E9EtovIryLylTOtTXUFq+pdqtpFVVOAy4APVDULWAxc7Mw2Cnir9m/DGGNMuAS6x/AqjlFDGaraVlXbAYOcaa/W4ph/BW4TkW9x3HOI6CxqY0zobOZ04xaoKylFVf/tmaCqPwP/FpGrQzmIquYD+c7nG7EJcsYYE7MCXTFsEpEJItLRlSAiHUXkr8APka+aMcaYaAjUMFyKo6tnifMew3Yc3/rbApdEslIWRM/UR3l5eSxbtowlS5aQkpJCXl5etKtkGglb89mYGJSXl0d2djalpaXutISEBHJzc8nKyqrz+rjuD+Tn50dlfxMdtuazMTFk4sSJXo0CQGlpKRMnToxSjYypOWsYjAmDzZs3h5QeSdalZWor0DyGw+qyIsbUZ127dg0pPVJcXVplZWUAbNq0iezsbGscTEgCXTFMFZFlzgltGSJS0/AZxjR4OTk5JCQkeKUlJCSQk5NTp/WwLi0TDoHCbp8HZOAYiTQcWCYib4hItohE9GuQjUoy9U1WVha5ubk0a9YMgOTk5KjceI6lLi1Td6I6Ksm5yM65wBCgk6pGZKKajUoy9VW0R/OkpKSwadOmKunJyckUFhYGXU6034epmaiMSlLV71X1KVUdCpxW24MbY6qqTTiKWOnSMvVbjUcl2XKcxsSeWOnSMvWb3VA2poHJysri2WefBawryNRMSFcMInKIiPSKVGWMMcZEX7UNg4jki8jBItIWWAU8KyKTIl81Y4wx0RDMFUNrVd0JXAi8pKr9gbMiWSkbrmpM9NjM6fon3MNVg2kY4kXkUBwRVeeG46DVad26Nbm5uWRmZtbF4YwxTjZzun7KzMwkNzcXYEc4ygumYbgPmA98q6qfi0g3YEN1O4lIcxH5TERWi8g6EbnPmZ4nIl+LyJci8ryINKndWzAmPGzVMps5bRyCaRh+UtVeqjoO3CuwBXOPoQw4Q1V7A2nAEBE5CcgDjgN6Ai2AsTWpuDEm/GzmtIHgGob/F2SaF3XY7XzZxPlQVX3XuU2Bz4AuQdfWGBNRsRIM0ESX33kMInIycArQXkRu89h0MBAXTOEiEgesBI4C/qOqyz22NQGuBG6uQb2NMRGQk5Pjc8EhmznduAS6YmgKJOJoPFp5PHYCFwdTuKqWq2oajquCfiJyvMfmp4APVXVp5f2KiopIT093P5w3VYwxEWYzp+uf3Nxc92clkBSOMv1eMajqEhzrPU9T1apRuUKgqiUishhH8L0vReQeoD1wra/87du3x4LoGRMdNnO6fsnOznYNVUVEisNRZjAhMZqJSC6Q4plfVc8ItJOItAf2OxuFFsDZwL9FZCwwGDhTVStqXHNjjDEREUzD8BrwDDAVKA+h7EOBF533GQ4CXlXVuSJyANgEfCoiAG+o6v2hVdsYE+ssdHf9FUzDcEBVnw61YFVdA5zgI90C95kGK9gPwfbt2/Paa695zZvIz88nMzOTsrIy9u/fT0pKCjk5OXTu3JkRI0ZUyW9MpATzIT1HRMYBb+KYmwCAqm6PWK2MaeD8NQr79+9n//79gGPW8ZgxY2jSpAlz5syxRsHUmWDmMYwC7gA+wTH0dCVgd4aNqYXKjcKIESNo2bKlOxSFS1lZGS1btqySv3379lWuTkJNN8afaq8YVPWIuqiIJ1cQvczMTIuXZIJWn/q0ZaVz3MWKfLjzUnhwFlzvOzblL9u2Vcm/+M2qVxye3U2uc1A53TO/XYE0HHPmzHEFHa2bIHoikiAif3eOTEJEjhaRC8JxcH8siJ5pFDwbhfQM6Hi473yudI/8gRqFYNNNwxGNIHovAPtwzIIG2Ar8XzgObkyjVblRAPhLDjRt5p2veYIj3Vd+at4ovPbaaxF6Y6YhCObm85GqeqmIXA6gqqXiHGdqTLhEuxvItQZBWVmZezRQJGf7Jv39UvJm5TFw4KmA42bzkl87cFHTJpRpBfv376dr167cf//9HHpoB7KyvPPLyiZejcWgVqeDj+6pQOkasXdn6rtgGoZ9zglqCiAiR+IxOsmY+s7fGgRAxBqHLVu2VEkbOHAgxcXFnH322QAsWLDAf34/VxAhpxvjQzBdSfcC84DDRSQPWARMiGSljKlLNV2DIKqjg6xRMBEUzKik90VkJXASIMDNqhqWeBz+2KgkU5dqsgZBbfr2MzIyqgxLDdnCX6qmpWeElm4ajGiMSpoDnAPkq+rcSDcKYKOSTN0KdQ2C2jYKxoRbuEclBXOP4RHgUuBBEfkcmAnMVdW94aiAMdEW6hoEg4aPqPENX1f63uN9Fh20vcfvr10BADTzu6U+zAUxkVPtFYOqLnEu69kNmAJcAmyLdMWMqSshr0FQ2779FflhrX8sco3yWrJkCSkpKeTl5UW7SiYEQQW0c45KysRx5dAHeDGSlTKmroW0BkFtG4U7L4XRVUclNRTRGOVlwiuYewyvAl8BZwBP4pjXcGOkK2ZMrNp7/H72Hr+f+b8uJOnvlzJ/Vh57R58aUnpDVtNRXiZ2BHPF8BxwuaqGshZDrdioJFMfDBw40O98hFDSG5qajPIytVNno5JEZAKAqs4HLqy07Z/hOLg/NirJhCrYPu1A8wtKSkp8ptuN2NCEOsrL1F5dxkq6zOP5XZW2DQnHwY0JB3992r4ah0BDSdu0aeMzvXL+hqimk/V8ycnJISEhwSst0CivQDIyMmyIbxQEahjEz3Nfr6vuLHK4iCwWkfUisk5Ebq60fbyIqIgkhVBfY6oIpU/bopJWFe5AfCGP8jIxJ9A9BvXz3NdrXw4A41V1lYi0AlaKyAJVXS8ih+OYNGedjqbWAexC6dP2tQ6Ce37BLmBXSbXzFBqSSE3WC2mUl4k9qurzAZQDO3H8uxxwPne93u9vvwDlvQWc7Xw+G+gNFAJJlfP27dtXTeMwY8YMTUhIUBxfNhTQhIQEnTFjRtBlJCcne+3veiQnJ1fJu3fvXp0/f74mJSXp/Pnzde/eve5Hr169ND4+vkq6v/z1/bF48WJNSkrSxYsXe52jUNP9GThwoA4cODDo32OkymhMgBUa4mezr4ffriRVjVPVg1W1larGO5+7XjcJpfERkRTgBGC5iAwDtqrq6lDKMA1TOIY2htKnvWTJErKyssjLy2PgwIFe6evXr6dHjx5V0n3lbwgGDR9B8f85roxkZYXjMeWDkNJNwxTUBLfaEJFE4HXgFhxXHn/D0Y3kV1FREenp6e7X2dnZ7gkypmEJx9BGV7fTmDFjKCsrIzk52W93VKChpCeffHLQ+RuEcMzg7ntGXdTUBJCbm+sakQQQlnu2EW0YRKQJjkYhT1XfEJGewBHAaudaP12AVSLST1V/du3Xvn17VqxYEcmqmRjRtWtXNm3a5DM9FNanXQPhmMF9bVFd1NQE4PnFWUTCEuQ0mPUYasS5yttzwFeqOglAVdeqagdVTVHVFGAL0MezUTCNSziHNppacIXmrrxeQ3XppkGKWMMAnApcCZwhIgXOx3kRPJ6ph2xoozGxJ2INg6p+pKqiqr1UNc35eLdSnhStg/UdTGzLysripJNOYuDAgRQWFkatUViwYIHXcpomOP4mwZWUlPDxxx9HdiU7ExGRvGIwxjRwgeY7rFu3jtTUVFu0qB6K+KikmrAgesZEXjgW++nSxf8kuNTU1KDCjJjaq/OlPaPBgugZUz8Emu+wOq4NS3ZR7fwIf2yxn+DVZRA9Y4wJLBxDXn0IJTCiCT9rGIwxtefvw39XSfWNhQ+22E90xeQ9BmNMPeNrXkOu/26i6uZB2GI/0WVXDKbeqMl6ATYksn6yxX6iyxoGU2/UNjS0NRL1h82Ij66Y7Eqy4arGF3/rI7jTH/0AVuQHXE9h788NNCBeAxNKYEQT/uGq4gjhHVvS09PVgug1Lq5v9oG+1btGqAQKnR1Kuqm95l+GFIHfJ+3rv+MimL8L8wcRWamq6dXnDMy6kky9Yo2CMZFnVwym3nBdMZiGxRVA0ZdwXDE0pqsOu2IwxhgTEdYwGGOM8WKjkoyJgmY33VSr/cueeCJMNTENQbhHJcVkw+AKomfqh8bUh2tMLHJ9iX722WfDEkQvJhsGY2JZbb/tGxPrIrnm8/Misk1EvqyUfqOI/E9E1onIQ5E6vjHGmJqJ5BXDNOBJ4CVXgogMAoYBvVW1TEQ6RPD4xjRY4bhqsfsUxp+INQyq+qGIpFRKvh54UFXLnHm2Rer4xpj6z+5bRUdd32M4BhggIjnAXuB2Vf28jutgGjG7P2BM9ep6HkM80BY4CbgDeFVEpHKmoqIi0tPT3Q8bodQwhBoi20Jnm9pqDMuD5ubmuj8rgaRwlFnXVwxbgDfUEYfjMxGpwPFGijwztW/fHguJ0fCEGiLbFo43teFveVCgQUVpzc7Odr8vESkOR5l1fcXwX2AQgIgcAzQFwvJGTOwLtHB8MOmmYavpFaU/tjxozUVyuOorwKfAsSKyRUTGAM8D3ZxDWGcCozQWo/iZoIVyqb539KnsPX4/e4/fz/xfF5L090uZPysv6HTTcNVk0SVXuj+2PGjNRXJU0uV+No2M1DFN3arppXpNQ2dv2WKL7DREtWkUAnUzdu3alU2bNvlMN4FZ2G1TYykpKT7/8ZKTkyksLKySHgths21U0h9iZR5D805d4MFZkJ7xR6LHCn3Vpftb6Mf1xcWzOykhIYHc3NwGdY/BU7jCbsdkSAwLolc/2KV6/RYzk+Rq0SgE0piWB7WlPU3MqOsrBvu2H3vC0TDUdnnQQEuDQuMK8mgL9Zioy8nJISEhwSstISGBnJycKNXIGBMO1jCYGsvKyiI3N9e9NGNycnKD7r81prGIyXsMpv7Iysri2WefBRrHpboxjYFdMRhjjPFiDYMxxhgvMdkwuIarOodfGWOMCWDOnDmuyaW25rMxJrrCMoQ4++nal9HI2ZrPpt6yeQjGl73H769lCc0CbrVBEaGLya4kY4wJla33ET7WMJigBPrn+vjjjykpKQkqvzGREKlAfI2VdSWZoAT651qwYIH905mokSkfuGMoDWp1OrjW7vCIrVRdenVhNRqbmDwbNiop9uTl5XHyySdTVlZGWVkZ77//PiNGjAgp3ZiICCXgXoiB+OqLcI9KsiB6JiieAfBqup5C0YUX1mmdTf1Q20B8tQ3CB9UH4qsvLIieiYqaNgoNcRF2YxqqqNxjEJFbgbGAAmuBq1R1bzTqYkIzcOBAnyupBZX++uuRrp4xJgzq/IpBRDoDNwHpqno8EAdcVtf1MMYY41u0upLigRYiEg8kAD9GqR7GGGMqqfOGQVW3Ao8Am4GfgB2q6jVkpaioiPT0dPfDwmMYY4xvubm57s9KICkcZdb5PQYROQQYBhwBlACvichIVZ3hytO+fXtsVJIxxlQvOzvbNVQVESkOR5nRuPl8FvC9qhYBiMgbwCnAjIB7maiyOEfGNB7RaBg2AyeJSALwO3AmYJcHUdSYFks3DU/tg/BBdYH4aqu+/Y9F4x7DcmA2sArHUNWDALuJYIwxMSIq8xhU9R7gnmgc2xgTW2rbTVnbmdMuFojvDzbzuZHxF/W0pKQkYGhiYxoyaxS8xWTDYEH0gpeRkRH0H6avP+a8vDw++eQTVq9eTVxcHFu3bvWZ35iGbNDwERT/nyPaqqyscDymfBBSuj95eXksW7aMJUuWkJKSEpHwMBZEz3gJ5qZWWVmZz1hGM2fO5Nprr/UKkJeQkMBTTz3FoYce6pXfRiWZWBWOrqRIBeLLy8sjOzub0tJSd1pCQgK5ublkZWXV+piVWRA9EzR/Ae4mTJjg1SgAlJaWMmHCBJ/5jTGhmThxolejAI7/sYkTJ0apRsGxhXrqMdclallZGSkpKeTk5Pj8FuIvwF1RUZHPcouKivj999/DXl9jGpvNmzeHlB4r7IqhnnJdorq+8W/atIns7OyQ+i8PP/zwkNKNMaHp2rVrSOmxwu4x1FMpKSls2rSpSnpycjKFhYXeidde67OMvA0byF66lNIDB9xpCfHx5A4YQNbRR4ezusbENMl+utZl2D2GCLNRSdULxyVq1tFHkztgAM0OcvwZJCcmWqNgTBhlZWWRm5tLs2aOmdXJyckRaRRsVFIj42+8dKdOnfjll1+q5O/YsSPl5eXe+f1cMbhkOBvg/MzMcFXbmHolUlcMLnUVEiNcVwx28zmKgvljGTFiBHl5eZx88snu+wlLlixhz549NGvWzGtUUbNmzdizZw+vv/66V/7IRoExpv4LFG/p7LPPBmDBggXVlNJw/tNisivJ/GHLli1VhowOHDiQ4uJipkyZ4r5E7dq1K1OmTKG4uNiGmBpjasWuGKIk2KGmgSaWjQJeaNsWgPwhQ+CTTxyPEFkXkjG+zZw5k88++4yysjKOOeYY7r//fi67LPSViOtLVFUXu2KIgnAMNTXGRNbMmTMZN26c+/908+bNjBs3jpkzZ0a5ZpFnDUOE+QpMN378eJ+zIcePH+8zvzGm7t19990+/0/vvvvuKNWo7sRkV5JruGpmZiaZMdjNkZeXx5gxYygrKyM5OdlvN5Arb+Ubx75GEwH88ssvzJ8/P6Qbx9YNFAXOZRRrpbbrmMdCHRoQX122P/gZ+v3D5s0+88urb8CDsyA944/EFflw56VBpwca2RTInDlzXMP7bbhqNIQ8YcXHUNGUl19m0+7dVdKTExMpvOKKsNbXVBKOD1TzhwbcuIT6fxrpIa/BsOGqURIoKFawk1ZyTjzR54zjnBNPDGtdjYm4Bnzl0pj/T6PSMIjIEOBxIA6YqqoPem73F9wNHN/YJ06cyObNm+natWvAbpzalvH+++9XiTIaaMaxr/y+uoJcM4vHLFlCWUUFyYmJ5Jx4YsRmHOd+9RXZ3btHpOyQVPMhkvvGG2RfeGEdVabmYqGeX6RX/6XwdeCiCNfjhDBc2cfM32cldf1/GkgIn3tJ4ThenXcliUgc8A1wNrAF+By4XFXXu/K0bNlS9+zZU2XfULpx5syZ4/P+REhlDBlCZnKyV1qol5dzNm2qUkY4tweTJ/2NN1hRzQdZtfU47jgyTz/d//YPPwy4PZg86VdeyYrp0yN6jMfS0wk0y2MJBNwOkAUEGj8WTBn//PtFHN3P/wfMhs821Go7wAu3TeOqSaMjeoxwlPH+X9+q1e99zsSJtf4fCcf/0BuDB3P++ef73f7OO+8E3A6OL6OVP7dC+cwSkVJVbRnwIEGIxhVDP+BbVd0IICIzgWHA+oB7EVo3zpwJE8icO7dqGS+/7LuM664j68MPvcvw8ccS6uVlLDQMwaiujLvnvcinR23zu/29ee8F3B5Mnq37f+NvP86O6DHeO6c322441//2J98LuB3gt9um8VqAD9xgyvj2yfcCfmB++9m3tdoejHAcIxxl1Pb3XpyQQGY1V6NzcnIC5qluO0uXVnvFO2/kSC6cP9//9g8/DLgdYI4qrVq18gqDU93nXiRWWozGFcPFwBBVHet8fSXQX1Vv8MizFyj32K0IKAb6Bih6ZaXXrYEdPvKFo4y2QGegKbAP2Aps91OmvzLCtT2YPEk4zl9tyoiFeobjGOF4H42lnnXxPqB+1LMu/of85anuMysJcC3MHqeqzas5RrVi8uZzON6YMcaYmonGBLetgOdKMF2cacYYY2JANBqGz4GjReQIEWkKXAa8HYV6GGOM8aHOGwZVPQDcAHwM7AZ6Ai385ReRISLytYh8KyJ3eqQfISLLnemznI1M2IlIWxFZICIbnD8P8ZFnkIgUeDz2isifnNumicj3HtvSolVPZ75yj7q87ZEeS+czTUQ+FZF1IrJGRC712Bax8+nvb81jezPnufnWea5SPLbd5Uz/WkQGh6tONaznbSKy3nnuFolIssc2n7//KNVztIgUedRnrMe2Uc6/kQ0iMirK9ZzsUcdvRKTEY1udnE8ReV5EtonIl362i4g84XwPa0Skj8e20M+lqkblAXQHjgXygXQ/eeKA74BuOG70rgZ6OLe9ClzmfP4McH2E6vkQcKfz+Z3Av6vJ3xbHjegE5+tpwMV1cD6Dqiew2096zJxP4BjgaOfzw4CfgDaRPJ+B/tY88owDnnE+vwyY5Xzew5m/GXCEs5y4CJ2/YOo5yOPv73pXPQP9/qNUz9HAkz72bQtsdP48xPn8kGjVs1L+G4Hno3A+Twf6AF/62X4e8B4gwEnA8tqcy6gF0VPVr1T162qyuYe2quo+YCYwTEQEOANwjXF7EfhThKo6zFl+sMe5GHhPVUuryRduodbTLdbOp6p+o6obnM9/BLbxx6iLSPH5t1Ypj2fdZwNnOs/dMGCmqpap6vfAt87yolJPVV3s8fe3DMd9vLoWzPn0ZzCwQFW3q+pvwAJgSIzU83LglQjVxS9V/RD/Ix/BUeeX1GEZ0EZEDqWG5zLWo6t2Bn7weL3FmdYOKFFHt5RneiR0VNWfnM9/BjpWk/8yqv7h5Dgv7yaLSKSWeQq2ns1FZIWILHN1dxHD51NE+uH4JvedR3Ikzqe/vzWfeZznageOcxfMvuES6rHG4Pgm6eLr9x8JwdbzIufvcraIuAalxOT5dHbJHQF84JFcV+ezOv7eR43OZUSHq4rIQqCTj00TVfWtSB47FIHq6flCVVVE/E78cLbQPQHPWSx34fgAbArkAn8F7o9iPZNVdauIdAM+EJG1VD+2Ohr1dJ3P6cAoVa1wJoftfDZ0IjISSMd7InaV37+qfue7hIibA7yiqmUici2Oq7EzolSXYFwGzFZVzzlWsXQ+wyaiDYOqnlXLIvwNbf0Vx6VSvPObW62GvAaqp4j8IiKHqupPzg+qQFNvLwHeVFX3ArIe347LROQF4PZo1lNVtzp/bhSRfOAEHGF1Yup8isjBwDs4vkQs8yg7bOezkmCGUbvybBGReByTkX4Nct9wCepYInIWjoZ4oKq6Fwb38/uPxAdZtfVU1V89Xk7Fcf/JtW9GpX3zw17DP44V7O/uMuAvngl1eD6r4+991OhcxnpXks+hreq4q7IYR38+OFa5jNQVyNvO8oM5TpX+R+eHn6sf/0+Az1EFYVBtPUXkEFfXi4gkAacC62PtfDp/12/i6DOdXWlbpM5nMMOoPet+MfCB89y9DVwmjlFLRwBHA5+FqV4h11NETgCmAENVdZtHus/ffxTreajHy6HAV87n84FznPU9BDgH76vwOq2ns67H4bh5+6lHWl2ez+q8DfzZOTrpJGCH80tUzc5lXdxR93MXfTiO/q4y4BdgvjP9MOBdj3zn4Qi69x2Ob4+u9G44/vm+BV4DmkWonu2ARcAGYCHQ1pmejiMyrCtfCo7W+aBK+38ArMXxATYDSIxWPYFTnHVZ7fw5JhbPJzAS2A8UeDzSIn0+ff2t4eimGup83tx5br51nqtuHvtOdO73NXBuhP93qqvnQuf/lOvcvV3d7z9K9fwXsM5Zn8XAcR77Xu08z98CV0Wzns7X9wIPVtqvzs4nji+cPzn/L7bguHd0HXCdc7sA/3G+h7V4jPSsybmMyYV6jDHGRE+sdyUZY4ypY9YwGGOM8WINgzHGGC/WMBhjjPFiDYMxxhgv1jCYRklEuorIbhE5LMj8GSJyoPqcxtR/1jCYiBCRfBEpc3747hZHOOBbPLYXiiM8+S4R2SEi/xORKSJytHN7V499d4vIARHZ5/F6XRB16CAiL4rIryKyUxyhkQ8DUNXNqpqojiB9iMh5IvKBiBSLyG8islREBkTo9IRERFJEREUkrMHwxBH2+ttwlmkaBmsYTCQ94PzwTcQxaS1HRM722D5WVVsBbXBEhxSgQERO8vjgdu2fD/zTIy010IFFpDmOiXT7cIR3bwNk4VgDxJdDgP8HHIUjkuvLwHvyR2A3YxoNaxhMnVBHvKP1OIIMVt6mqvq1qmbjCDnwaBgOOQpHYzBOVYtVtUJV16nqTqj6LVxV81T1TVUtUdUDqvo0jkbkRM9CxbHoySYR2S6ORYMSg6mMiFwvjsVgdogjEucAj233iiPwoGf+fBH5u/PlaufPr51XS/9w5lERucV5JbRLRBaLyFF+ysBjn9NE5GQc625087gKywjmvZiGzxoGE3HO+C2nAsfhEWvGj1nASSKSUMvDDsIRdmOasyvpfyJya7A7i0hPIAlHeAGXOCAT6IVjoaljgElBlHU58ADwZxwhQZ4F5onHymrV6O38eazzaukBj23ZOOI2dcARXuJtEYmrrkBV/RRHSIWNHldh+UHWxzRw1jCYSJoojmUQ9wAfAXlUH1xuC46/S59Lk4YgCUfj8BlwKI6urIkiklXdjiLSAUfE2UfUuWCQh7+q6g5V/QW4G0fgsur+j64CpqjqcufVyHPAGuCK0N6ST4+q6req+jswATgS6B+Gck0jZg2DiaQcVW2jqgk4QgL3AJ6vZp8uQAXwWy2PvQvYqqqPq+o+VV2BI+hewFXEnDenFwPv41j7obJNHs8LcSznmVRNXQ4Hvq+U9h3eYZJrqtD1RB2rthURnRXbTANiDYOpE6q6Bce60hdWk/VSHOvV1nZp1ALAV4TIQAsDpQBLcSzNeoP6jjDp2f2TgiM6cHE1dfnBmddTN/5YWWsX0LLSds9htBX45y7X2f3WHsdVV5VyfQzNDVSuacSsYTB1QkQ6ASP440Zq5e1Hi8jTOGLah2PxnWlAOxH5i4jEiUhvHKOS3vBz/ONwdHe9oqqBjv8vETnY2d10LzBd/1hdLlBdrhWRfiISLyJXAWk4Rj4BrAT6iEhf5/YbcCwh6VKE40P8aB9l3yoiRzpHYT2IY7H35R7l/klE2otIKyCn0r4/Ax3EsSiSMW7WMJhI+odrxAuOBuEXvPvVpzpH0+zEsVpbPHCCqn5S2wOr6iYccfbHAjuB2cC9qjrLzy5/xbEW7i2V5k943pMod9ZzLY51FzYCtwVRl5eB+3B0Zf0KXA+c56wjzpu+k4B5OGLudwQ+9tj/d+AfwCsiUiIinkukTsXR2BXhuEk9TP9YenIyjsVvvsNxBfVOpaotxrE4/PfOcgdiDNh6DKZxEpEjcSxc0lE9VjmrT8SxXvYAVf0o2nUxDYtdMZjGqg+wA8c3bWOMB2sYTL0kIgMqdfl4Pv5Wzb6TgMeBG/3cYK6zuhgTi6wryRhjjBe7YjDGGOPFGgZjjDFerGEwxhjjxRoGY4wxXqxhMMYY48UaBmOMMV7+Pym/U5HAWAK/AAAAAElFTkSuQmCC\n",
"text/plain": [
"