{ "cells": [ { "cell_type": "markdown", "id": "24b92c4c", "metadata": {}, "source": [ "# AGC Training Demonstration\n", "________\n", "\n", "This notebook demonstrates a workflow for training a BDT with MLFlow logging and model storage at a coffea-casa analysis facility. The ML task shown here is a simple signal/background classification, chosen so that we can focus on the technical aspects of the workflow. As such, the task and the features used are not necessarily well-motivated from a physics standpoint. This is a precursor to a talk at CHEP 2023, which will demonstrate both training and inference at a coffea-casa analysis facility in the AGC context, using a BDT which predicts the parton association of the jets in an event in order to provide more sophisticated event variables for the ttbar cross-section measurement. To avoid dealing with background cross-sections, the background samples are restricted to wjets.\n", "\n", "The steps in this notebook are as follows:\n", "1. Process ROOT files into feature and label columns using coffea + dask\n", "2. Prepare columns for training\n", " - Separate dataset into training and testing samples (even and odd event numbers)\n", " - Preprocess features using `sklearn` PowerTransformer\n", "3. Train BDTs for each event sample and perform hyperparameter optimization\n", " - Perform hyperparameter optimization by random sampling, using n-fold cross-validation\n", " - Use distributed dask to parallelize\n", " - Log training results for each trial in MLFlow\n", " - Log models for each trial in MLFlow\n", "4.
Test and register models\n", " - Test each model against the appropriate test sample (even/odd event numbers)\n", " - Register the best model in the MLFlow model repository\n", " - Generate an NVIDIA Triton model config file and move it to the Triton inference server (this last step will eventually become more automated using the triton-mlflow plugin)" ] }, { "cell_type": "markdown", "id": "58232012-8ecb-40bb-bcd1-9cc60f38746a", "metadata": {}, "source": [ "___\n", "### Imports" ] }, { "cell_type": "code", "execution_count": 1, "id": "2130f252", "metadata": { "tags": [] }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/opt/conda/lib/python3.8/site-packages/requests/__init__.py:102: RequestsDependencyWarning: urllib3 (1.26.7) or chardet (5.0.0)/charset_normalizer (2.0.9) doesn't match a supported version!\n", " warnings.warn(\"urllib3 ({}) or chardet ({})/charset_normalizer ({}) doesn't match a supported \"\n" ] } ], "source": [ "import awkward as ak\n", "import hist\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import uproot\n", "import time\n", "import os\n", "\n", "# initial coffea processing\n", "from coffea import processor\n", "from coffea.processor import servicex\n", "from coffea.nanoevents import NanoAODSchema\n", "\n", "# for training/hyperparameter optimization\n", "from dask.distributed import Client, LocalCluster\n", "from sklearn.preprocessing import PowerTransformer\n", "from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, roc_auc_score\n", "from sklearn.model_selection import ParameterSampler, train_test_split, KFold, cross_validate\n", "import mlflow\n", "from mlflow.models.signature import infer_signature\n", "from mlflow.tracking import MlflowClient\n", "from xgboost import XGBClassifier\n", "\n", "import utils # contains code for bookkeeping and cosmetics, as well as some boilerplate" ] }, { "cell_type": "markdown", "id": "7a8695ae-6e0f-4f86-8815-720f664c56d4", "metadata": {}, "source": [ 
"___\n", "### Configuration\n", "\n", "The configuration options below are separated into two categories: those used for processing the feature set, and those used for ML." ] }, { "cell_type": "code", "execution_count": 2, "id": "b547996c", "metadata": { "tags": [] }, "outputs": [], "source": [ "### FILE PROCESSING OPTIONS\n", "\n", "# input files per process (ttbar and wjets)\n", "N_FILES_MAX_PER_SAMPLE = 1\n", "\n", "# enable Dask (essentially whether to use DaskExecutor or FuturesExecutor with coffea)\n", "USE_DASK_PROCESSING = True\n", "\n", "# analysis facility: set to \"coffea_casa\" for coffea-casa environments, \"EAF\" for FNAL, \"local\" for local setups\n", "AF = \"coffea_casa\"\n", "\n", "# chunk size to use for file processing\n", "CHUNKSIZE = 250_000\n", "\n", "# \"ssl-dev\" allows for the switch to local data on /data\n", "AF_NAME = \"coffea_casa\" # \"ssl-dev\" allows for the switch to local data on /data\n", "\n", "# scaling for local setups with FuturesExecutor\n", "NUM_CORES = 8\n", "\n", "\n", "### MACHINE LEARNING OPTIONS\n", "\n", "# enable Dask (whether to use dask for hyperparameter optimization)\n", "USE_DASK_ML = True\n", "\n", "# enable MLFlow logging (to store metrics and models of hyperparameter optimization trials)\n", "USE_MLFLOW = True\n", "\n", "# enable MLFlow model logging/registering\n", "MODEL_LOGGING = False\n", "\n", "# number of folds for cross-validation\n", "N_FOLD = 5\n", "\n", "# number of trials (per model) for hyperparameter optimization. Total number of trials will be 2*N_TRIALS\n", "N_TRIALS = 20\n", "\n", "# number of events to use for training (choose smaller number, i.e. 
10000, for quick demo)\n", "N_TRAIN = 10000\n", "\n", "# name to use for saving model to triton server\n", "MODEL_NAME = \"sigbkg_bdt\"\n", "\n", "# if True, write over previous versions of model in triton directory\n", "WRITE_OVER = True" ] }, { "cell_type": "markdown", "id": "d8674317-8ecd-402c-a1fe-d5de1074705e", "metadata": {}, "source": [ "___\n", "### Defining the `coffea` Processor\n", "\n", "This processor returns columns associated with the features and labels we will use to train a BDT to distinguish signal from background events. `coffea`'s column accumulator is utilized. The cuts here are much looser than those used in the AGC ttbar notebook in order to provide ample discrimination power for demonstration purposes." ] }, { "cell_type": "code", "execution_count": 3, "id": "ce9f9967", "metadata": { "tags": [] }, "outputs": [], "source": [ "# function to create column accumulator from list\n", "def col_accumulator(a):\n", " return processor.column_accumulator(np.array(a))\n", "\n", "# coffea processor\n", "class ProcessFeatures(processor.ProcessorABC):\n", " def __init__(self):\n", " super().__init__()\n", " \n", " def process(self, events):\n", " \n", " process = events.metadata[\"process\"] # \"ttbar\" or \"wjets\"\n", " \n", " electrons = events.Electron\n", " muons = events.Muon\n", " jets = events.Jet\n", " even = (events.event%2==0) # whether the event number is even\n", "\n", " # single lepton requirement\n", " event_filters = ((ak.count(electrons.pt, axis=1) + ak.count(muons.pt, axis=1)) == 1)\n", " # at least four jets\n", " event_filters = event_filters & (ak.count(jets.pt, axis=1) >= 4)\n", " # at least one b-tagged jet (\"tag\" means score above threshold)\n", " B_TAG_THRESHOLD = 0.5\n", " event_filters = event_filters & (ak.sum(jets.btagCSVV2 >= B_TAG_THRESHOLD, axis=1) >= 1)\n", "\n", " # apply event filters\n", " electrons = electrons[event_filters]\n", " muons = muons[event_filters]\n", " jets = jets[event_filters]\n", " even = 
even[event_filters]\n", " \n", " \n", " ### CALCULATE FEATURES\n", " \n", " ## calculate aplanarity and sphericity from sphericity tensor of all jets in event\n", " \n", " # sum of squared jet momenta (normalization of the sphericity tensor)\n", " sum_p = ak.sum(jets.p**2, axis=-1) \n", "\n", " # off-diagonal elements\n", " Sxy = np.divide(ak.sum(np.multiply(jets.px, jets.py), axis=-1), sum_p)\n", " Sxz = np.divide(ak.sum(np.multiply(jets.px, jets.pz), axis=-1), sum_p)\n", " Syz = np.divide(ak.sum(np.multiply(jets.py, jets.pz), axis=-1), sum_p)\n", " \n", " # diagonal elements\n", " Sxx = np.divide(ak.sum(jets.px**2, axis=-1), sum_p)\n", " Syy = np.divide(ak.sum(jets.py**2, axis=-1), sum_p)\n", " Szz = np.divide(ak.sum(jets.pz**2, axis=-1), sum_p)\n", "\n", " # combine elements into sphericity tensor\n", " flat = np.stack((Sxx,Sxy,Sxz,Sxy,Syy,Syz,Sxz,Syz,Szz),axis=1).to_numpy()\n", " sphericity_tensor = flat.reshape((flat.shape[0],3,3))\n", "\n", " # find eigenvalues (sorted in ascending order) then calculate features\n", " eigenvalues = ak.sort(np.linalg.eigvals(sphericity_tensor),axis=-1)\n", " aplanarity = (3/2)*eigenvalues[...,0]\n", " sphericity = (3/2)*(eigenvalues[...,0]+eigenvalues[...,1])\n", " \n", " ## calculate lepton eta and phi (should be one value per event due to single-lepton requirement)\n", " lepton_eta = (ak.sum(electrons.eta,axis=-1) + ak.sum(muons.eta,axis=-1))\n", " lepton_phi = (ak.sum(electrons.phi,axis=-1) + ak.sum(muons.phi,axis=-1))\n", " \n", " ## populate array of features\n", " features = np.zeros((len(jets),10))\n", " features[:,0] = ak.num(jets) # number of jets in each event\n", " features[:,1] = ak.sum(jets.pt, axis=-1) # event HT\n", " features[:,2] = (ak.sum(electrons.pt,axis=-1) + ak.sum(muons.pt,axis=-1)) # lepton pt\n", " features[:,3] = jets.nConstituents[...,0] # leading jet number of constituents\n", " features[:,4] = ak.sum(jets.nConstituents,axis=-1) # total event number of constituents\n", " features[:,5] = jets.pt[...,0] # leading jet pT\n", " features[:,6] = aplanarity # aplanarity\n", "
features[:,7] = sphericity # sphericity\n", " features[:,8] = np.sqrt((lepton_eta-jets.eta[...,0])**2 + \n", " (lepton_phi-jets.phi[...,0])**2) # delta R between leading jet and lepton\n", " features[:,9] = np.sqrt((jets.eta[...,1]-jets.eta[...,0])**2 + \n", " (jets.phi[...,1]-jets.phi[...,0])**2) # delta R between two leading jets\n", " \n", " ### calculate labels (1 for ttbar, 0 for wjets)\n", " if process=='ttbar':\n", " labels = np.ones(len(jets))\n", " else:\n", " labels = np.zeros(len(jets))\n", " \n", "\n", " output = {\"nevents\": {events.metadata[\"dataset\"]: len(jets)}, \n", " \"features\": col_accumulator(features),\n", " \"labels\": col_accumulator(labels),\n", " \"even\": col_accumulator(even),}\n", "\n", " return output\n", "\n", " def postprocess(self, accumulator):\n", " return accumulator" ] }, { "cell_type": "markdown", "id": "f586fab7-16cf-492b-9e00-bf8fb856bf87", "metadata": {}, "source": [ "___\n", "### \"Fileset\" construction and metadata\n", "\n", "Here, we gather all the required information about the files we want to process: paths to the files and associated metadata."
] }, { "cell_type": "code", "execution_count": 4, "id": "29341dd9", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "{'ttbar__nominal': {'files': ['https://xrootd-local.unl.edu:1094//store/user/AGC/nanoAOD/TT_TuneCUETP8M1_13TeV-powheg-pythia8/cmsopendata2015_ttbar_19980_PU25nsData2015v1_76X_mcRun2_asymptotic_v12_ext3-v1_00000_0000.root'],\n", " 'metadata': {'process': 'ttbar',\n", " 'variation': 'nominal',\n", " 'nevts': 1334428,\n", " 'xsec': 729.84}},\n", " 'wjets__nominal': {'files': ['https://xrootd-local.unl.edu:1094//store/user/AGC/nanoAOD/WJetsToLNu_TuneCUETP8M1_13TeV-amcatnloFXFX-pythia8/cmsopendata2015_wjets_20547_PU25nsData2015v1_76X_mcRun2_asymptotic_v12_ext2-v1_10000_0000.root'],\n", " 'metadata': {'process': 'wjets',\n", " 'variation': 'nominal',\n", " 'nevts': 1249076,\n", " 'xsec': 15487.164}}}" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fileset = utils.construct_fileset(N_FILES_MAX_PER_SAMPLE, \n", " use_xcache=False, \n", " af_name=AF_NAME) # local files on /data for ssl-dev\n", "\n", "# get rid of all processes except ttbar (signal) and wjets (background)\n", "processes = list(fileset.keys())\n", "for process in processes:\n", " if ((process!=\"ttbar__nominal\") & (process!=\"wjets__nominal\")):\n", " fileset.pop(process)\n", "fileset" ] }, { "cell_type": "markdown", "id": "1ff864b1-9b18-4fdf-a001-9b7f6e9e673e", "metadata": {}, "source": [ "___\n", "### Process Files" ] }, { "cell_type": "code", "execution_count": 5, "id": "78fce979", "metadata": { "scrolled": true, "tags": [] }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/opt/conda/lib/python3.8/site-packages/distributed/client.py:1288: VersionMismatchWarning: Mismatched versions found\n", "\n", "+---------+----------------+----------------+----------------+\n", "| Package | client | scheduler | workers |\n", "+---------+----------------+----------------+----------------+\n", "| python | 
3.8.16.final.0 | 3.8.16.final.0 | 3.8.15.final.0 |\n", "+---------+----------------+----------------+----------------+\n", " warnings.warn(version_module.VersionMismatchWarning(msg[0][\"warning\"]))\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "[########################################] | 100% Completed | 28.1s\r" ] } ], "source": [ "NanoAODSchema.warn_missing_crossrefs = False # silences warnings about branches we will not use here\n", "\n", "if USE_DASK_PROCESSING:\n", " executor = processor.DaskExecutor(client=utils.get_client(AF))\n", "else:\n", " executor = processor.FuturesExecutor(workers=NUM_CORES)\n", "\n", "run = processor.Runner(executor=executor, schema=NanoAODSchema, savemetrics=True, metadata_cache={}, \n", " chunksize=CHUNKSIZE, maxchunks=1)\n", "filemeta = run.preprocess(fileset, treename=\"Events\") # pre-processing\n", "all_columns, metrics = run(fileset, \"Events\", processor_instance=ProcessFeatures()) # processing" ] }, { "cell_type": "markdown", "id": "be8f321a-940a-4e25-86f0-e5889331ad86", "metadata": {}, "source": [ "___\n", "### Inspecting the Features\n", "\n", "Let's look at the features we will use for training, comparing signal to background distributions. Note that some of the features we use would need to be appropriately calibrated to use in an actual analysis." 
] }, { "cell_type": "code", "execution_count": 6, "id": "9f17abd8-74ed-40db-9ea8-73341c8bf627", "metadata": { "tags": [] }, "outputs": [], "source": [ "# grab the numpy arrays from the column_accumulator objects\n", "features = all_columns[\"features\"].value\n", "labels = all_columns[\"labels\"].value\n", "even = all_columns[\"even\"].value" ] }, { "cell_type": "code", "execution_count": 7, "id": "70f32c49-ecd0-43e3-b678-c1af245fd4a5", "metadata": { "scrolled": true, "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Initial Sig/Bkg Ratio = 6.3710530291584835\n", "Final Sig/Bkg Ratio = 1.0\n" ] } ], "source": [ "### balance signal and background samples\n", "\n", "# separate into signal and background regions\n", "features_signal = features[labels==1]\n", "features_background = features[labels==0]\n", "even_signal = even[labels==1]\n", "even_background = even[labels==0]\n", "\n", "sb_ratio = len(features_signal)/len(features_background)\n", "print(\"Initial Sig/Bkg Ratio = \", sb_ratio)\n", "\n", "if len(features_signal)>len(features_background):\n", " features_signal = features_signal[:len(features_background)]\n", " even_signal = even_signal[:len(features_background)]\n", " \n", "if len(features_background)>len(features_signal):\n", " features_background = features_background[:len(features_signal)]\n", " even_background = even_background[:len(features_signal)]\n", "\n", "sb_ratio = len(features_signal)/len(features_background)\n", "print(\"Final Sig/Bkg Ratio = \", sb_ratio)" ] }, { "cell_type": "code", "execution_count": 8, "id": "b8f12ad9-b2b4-4bfd-9f3f-833b90dbf54d", "metadata": { "tags": [] }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "### FIRST FOUR FEATURES\n", "\n", "# define histogram\n", "h = hist.Hist(\n", " hist.axis.Regular(11, 4, 15, name=\"njet\", label=\"Number of Jets\", flow=False),\n", " hist.axis.Regular(50, 0, 1000, name=\"ht\", label=\"$H_T$\", flow=True),\n", " hist.axis.Regular(50, 0, 200, name=\"leptonpt\", label=\"Lepton $p_T$\", flow=False),\n", " hist.axis.Regular(50, 0, 100, name=\"ljnconst\", label=\"Leading Jet Number of Constituents\", flow=False),\n", " hist.axis.StrCategory([\"Signal\", \"Background\"], name=\"truthlabel\", label=\"Truth Label\"),\n", ")\n", "\n", "# fill histogram\n", "h.fill(njet = features_signal[:,0], ht = features_signal[:,1], \n", " leptonpt = features_signal[:,2], ljnconst = features_signal[:,3], \n", " truthlabel=\"Signal\")\n", "h.fill(njet = features_background[:,0], ht = features_background[:,1], \n", " leptonpt = features_background[:,2], ljnconst = features_background[:,3], \n", " truthlabel=\"Background\")\n", "\n", "# make plots\n", "fig,axs = plt.subplots(4,1,figsize=(8,16))\n", "\n", "h.project(\"njet\",\"truthlabel\").plot(density=True, ax=axs[0])\n", "axs[0].legend()\n", "h.project(\"ht\",\"truthlabel\").plot(density=True, ax=axs[1])\n", "axs[1].legend()\n", "h.project(\"leptonpt\",\"truthlabel\").plot(density=True, ax=axs[2])\n", "axs[2].legend()\n", "h.project(\"ljnconst\",\"truthlabel\").plot(density=True, ax=axs[3])\n", "axs[3].legend()\n", "\n", "fig.show()" ] }, { "cell_type": "code", "execution_count": 9, "id": "0faafb50-e784-4939-94fa-d680abd6cc3e", "metadata": { "tags": [] }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "### NEXT FOUR FEATURES\n", "\n", "# define histogram\n", "h = hist.Hist(\n", " hist.axis.Regular(50, 0, 300, name=\"totalnconst\", label=\"Sum of Number of Constituents\", flow=False),\n", " hist.axis.Regular(50, 0, 300, name=\"ljpt\", label=\"Leading Jet $p_T$\", flow=False),\n", " hist.axis.Regular(50, 0, 0.5, name=\"aplanarity\", label=\"Aplanarity\", flow=False),\n", " hist.axis.Regular(50, 0, 0.9, name=\"sphericity\", label=\"Sphericity\", flow=False),\n", " hist.axis.StrCategory([\"Signal\", \"Background\"], name=\"truthlabel\", label=\"Truth Label\"),\n", ")\n", "\n", "# fill histogram\n", "h.fill(totalnconst = features_signal[:,4], ljpt = features_signal[:,5], \n", " aplanarity = features_signal[:,6], sphericity = features_signal[:,7],\n", " truthlabel=\"Signal\")\n", "h.fill(totalnconst = features_background[:,4], ljpt = features_background[:,5], \n", " aplanarity = features_background[:,6], sphericity = features_background[:,7],\n", " truthlabel=\"Background\")\n", "\n", "fig,axs = plt.subplots(4,1,figsize=(8,16))\n", "\n", "h.project(\"totalnconst\",\"truthlabel\").plot(density=True, ax=axs[0])\n", "axs[0].legend()\n", "h.project(\"ljpt\",\"truthlabel\").plot(density=True, ax=axs[1])\n", "axs[1].legend()\n", "h.project(\"aplanarity\",\"truthlabel\").plot(density=True, ax=axs[2])\n", "axs[2].legend()\n", "h.project(\"sphericity\",\"truthlabel\").plot(density=True, ax=axs[3])\n", "axs[3].legend()\n", "\n", "fig.show()" ] }, { "cell_type": "code", "execution_count": 10, "id": "d6247a1c-a231-464f-afc6-f9d47f95882d", "metadata": { "tags": [] }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "### LAST TWO FEATURES\n", "\n", "# define histogram\n", "h = hist.Hist(\n", " hist.axis.Regular(50, 0, 8, name=\"ljlepdeltar\", label=\"$\\Delta R$ Between Leading Jet and Lepton\", flow=False),\n", " hist.axis.Regular(50, 0, 8, name=\"twoljdeltar\", label=\"$\\Delta R$ Between Two Leading Jets\", flow=False),\n", " hist.axis.StrCategory([\"Signal\", \"Background\"], name=\"truthlabel\", label=\"Truth Label\"),\n", ")\n", "\n", "# fill histogram\n", "h.fill(ljlepdeltar = features_signal[:,8], twoljdeltar = features_signal[:,9], \n", " truthlabel=\"Signal\")\n", "h.fill(ljlepdeltar = features_background[:,8], twoljdeltar = features_background[:,9], \n", " truthlabel=\"Background\")\n", "\n", "fig,axs = plt.subplots(2,1,figsize=(8,8))\n", "\n", "h.project(\"ljlepdeltar\",\"truthlabel\").plot(density=True, ax=axs[0])\n", "axs[0].legend()\n", "h.project(\"twoljdeltar\",\"truthlabel\").plot(density=True, ax=axs[1])\n", "axs[1].legend()\n", "\n", "fig.show()" ] }, { "cell_type": "markdown", "id": "3782415a-23fc-4c3e-ac2a-f0b3c12831f0", "metadata": {}, "source": [ "___\n", "### Preprocess and Split Events into Train/Test/Split" ] }, { "cell_type": "code", "execution_count": 11, "id": "d20ccfe7-fe4a-4a14-9102-d1e61fc96efe", "metadata": { "tags": [] }, "outputs": [], "source": [ "# re-combine events\n", "features = np.concatenate((features_signal, features_background))\n", "even = np.concatenate((even_signal, even_background))\n", "labels = np.concatenate((np.ones(len(even_signal)), np.zeros(len(even_background))))" ] }, { "cell_type": "code", "execution_count": 12, "id": "9d7b20bd-76e6-4005-b351-0f51d8620f23", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "% Signal for Even-Numbered Events = 49.70423176095859\n", "% Signal for Odd-Numbered Events = 50.293453724604966\n" ] } ], "source": [ "# separate dataset into even and odd features\n", "features_even 
= features[even]\n", "labels_even = labels[even]\n", "\n", "features_odd = features[np.invert(even)]\n", "labels_odd = labels[np.invert(even)]\n", "\n", "\n", "# shuffle events\n", "shuffle_ind_even = list(range(len(labels_even)))\n", "np.random.shuffle(shuffle_ind_even)\n", "features_even = features_even[shuffle_ind_even]\n", "labels_even = labels_even[shuffle_ind_even]\n", "\n", "shuffle_ind_odd = list(range(len(labels_odd)))\n", "np.random.shuffle(shuffle_ind_odd)\n", "features_odd = features_odd[shuffle_ind_odd]\n", "labels_odd = labels_odd[shuffle_ind_odd]\n", "\n", "print(\"% Signal for Even-Numbered Events = \", 100*np.average(labels_even))\n", "print(\"% Signal for Odd-Numbered Events = \", 100*np.average(labels_odd))" ] }, { "cell_type": "code", "execution_count": 13, "id": "1b7ee366-d729-455d-956a-e3d9d6726ddd", "metadata": { "tags": [] }, "outputs": [], "source": [ "# preprocess features so that they are more Gaussian-like\n", "power = PowerTransformer(method='yeo-johnson', standardize=True) #define preprocessor\n", "\n", "features_odd = power.fit_transform(features_odd)\n", "features_even = power.fit_transform(features_even)" ] }, { "cell_type": "markdown", "id": "2248f7c2-78a4-49d7-b47b-cbce6e42cc66", "metadata": {}, "source": [ "___\n", "### Set up MLFlow" ] }, { "cell_type": "code", "execution_count": 58, "id": "0a5f4f15-7757-446f-bcfa-612e5d7b67dc", "metadata": { "tags": [] }, "outputs": [], "source": [ "# function to provide necessary environment variables to workers\n", "def initialize_mlflow(): \n", " \n", " os.system(\"pip install boto3\")\n", "\n", " os.environ['MLFLOW_TRACKING_URI'] = \"https://mlflow.software-dev.ncsa.cloud\"\n", " os.environ['MLFLOW_S3_ENDPOINT_URL'] = \"https://mlflow-minio-api.software-dev.ncsa.cloud\"\n", " os.environ['AWS_ACCESS_KEY_ID'] = \"bengal1\"\n", " os.environ['AWS_SECRET_ACCESS_KEY'] = \"leftfoot1\"\n", " \n", " mlflow.set_tracking_uri('https://mlflow.software-dev.ncsa.cloud') \n", " 
mlflow.set_experiment(\"agc-training-demo\")" ] }, { "cell_type": "code", "execution_count": 30, "id": "0ac30d5d-22e3-49d7-a904-37e9441c2858", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "experiment_id = 15\n" ] } ], "source": [ "# set up trials\n", "\n", "if USE_MLFLOW:\n", " mlflow.set_tracking_uri('https://mlflow.software-dev.ncsa.cloud') \n", " mlflow.set_experiment(\"agc-training-demo\") # this will create the experiment if it does not yet exist\n", "\n", " # grab experiment\n", " current_experiment=dict(mlflow.get_experiment_by_name(\"agc-training-demo\"))\n", " experiment_id=current_experiment['experiment_id']\n", " print(\"experiment_id = \", experiment_id)\n", "\n", " # create runs ahead of time (avoids conflicts when parallelizing mlflow logging)\n", " run_id_list=[]\n", " for n in range(N_TRIALS*2):\n", " run = MlflowClient().create_run(experiment_id=experiment_id, run_name=f\"run-{n}\")\n", " run_id_list.append(run.info.run_id)\n", "\n", " #after running the above lines, you should be able to see the runs at https://mlflow.software-dev.ncsa.cloud/#/experiments/10/" ] }, { "cell_type": "code", "execution_count": null, "id": "163eb5e4-6913-43a5-b567-adfae7828969", "metadata": {}, "outputs": [], "source": [ "# make directory to save models for triton to\n", "if not os.path.isdir(f\"/mnt/{MODEL_NAME}\"):\n", " os.mkdir(f\"/mnt/{MODEL_NAME}\")\n", "!ls /mnt/" ] }, { "cell_type": "markdown", "id": "597900cf-b61e-477f-b5b7-c5eba81f9eca", "metadata": {}, "source": [ "___\n", "### Training/Hyperparameter Optimization" ] }, { "cell_type": "code", "execution_count": 32, "id": "9a44e10b-0841-441f-80bf-eca39a79d81d", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Example of Trial Parameters: \n" ] }, { "data": { "text/plain": [ "{'reg_lambda': 0.25,\n", " 'reg_alpha': 0.25,\n", " 'n_estimators': 502,\n", " 'min_child_weight': 736.8421052631578,\n", " 'max_depth': 
8,\n", " 'learning_rate': 1e-05,\n", " 'gamma': 0.29763514416313164,\n", " 'booster': 'gbtree',\n", " 'trial_num': 0,\n", " 'parity': 'even',\n", " 'run_id': '094907b6d9e1454b9b57338fa0af720f'}" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# generate N_TRIALS random samples from parameter space \n", "sampler = ParameterSampler(\n", " {\n", " 'max_depth': np.arange(2, 50, 2, dtype=int), # maximum tree depth\n", " 'n_estimators': np.arange(2, 700, 20, dtype=int), # number of boosting rounds\n", " 'learning_rate': np.logspace(-5, 1, 10), # boosted learning rate\n", " 'min_child_weight': np.linspace(0, 1000, 20), # minimum weight needed in child\n", " 'reg_lambda': [0, 0.25, 0.5, 0.75, 1], # L2 weight regularization\n", " 'reg_alpha': [0, 0.25, 0.5, 0.75, 1], # L1 weight regularization\n", " 'gamma': np.logspace(-4, 2, 20), # minimum loss reduction to make split in tree\n", " 'booster': ['gbtree'], # which booster to use\n", " },\n", " n_iter = N_TRIALS, # number of trials to perform\n", " random_state=1,\n", ") \n", "\n", "samples_even = list(sampler)\n", "samples_odd = list(sampler)\n", "\n", "# add additional info to each trial\n", "for i in range(N_TRIALS):\n", " samples_even[i]['trial_num'] = i\n", " samples_even[i]['parity'] = 'even' # categorizes this trial as for even event numbers\n", " \n", " samples_odd[i]['trial_num'] = i\n", " samples_odd[i]['parity'] = 'odd' # categorizes this trial as for odd event numbers\n", " \n", " if USE_MLFLOW: \n", " samples_even[i]['run_id'] = run_id_list[i]\n", " samples_odd[i]['run_id'] = run_id_list[i+N_TRIALS]\n", " \n", "print(\"Example of Trial Parameters: \")\n", "samples_even[0]" ] }, { "cell_type": "code", "execution_count": 45, "id": "1ed91e33-db32-4d62-bc6a-f0ca8e43f87c", "metadata": { "tags": [] }, "outputs": [], "source": [ "def fit_model(params, features, labels, cv, mlflowclient=None, use_mlflow=True, model_logging=True): \n", " \n", " trial_num = 
params[\"trial_num\"]\n", " parity = params[\"parity\"]\n", " \n", " if use_mlflow:\n", " run_id = params[\"run_id\"]\n", " \n", " for param, value in params.items(): \n", " mlflowclient.log_param(run_id, param, value) \n", " \n", " # remove parameters that are not used for XGBClassifier\n", " params_copy = params.copy()\n", " params_copy.pop(\"trial_num\")\n", " params_copy.pop(\"run_id\", None) # key is absent when MLFlow logging is disabled\n", " params_copy.pop(\"parity\")\n", " \n", " # initialize model with current trial parameters\n", " model = XGBClassifier(random_state=5, \n", " nthread=-1,\n", " **params_copy) \n", "\n", " # perform n-fold cross-validation\n", " result = cross_validate(model, features, labels, \n", " scoring=['roc_auc','accuracy','precision','f1','recall'], \n", " cv=cv, n_jobs=-1, \n", " return_train_score=True, return_estimator=True)\n", " \n", " trial_metrics = {\n", " \"avg_train_roc_auc\": result['train_roc_auc'].mean(), \n", " \"avg_train_accuracy\": result['train_accuracy'].mean(), \n", " \"avg_train_precision\": result['train_precision'].mean(), \n", " \"avg_train_f1\": result['train_f1'].mean(), \n", " \"avg_train_recall\": result['train_recall'].mean(), \n", " \"avg_test_roc_auc\": result['test_roc_auc'].mean(), \n", " \"avg_test_accuracy\": result['test_accuracy'].mean(), \n", " \"avg_test_precision\": result['test_precision'].mean(), \n", " \"avg_test_f1\": result['test_f1'].mean(), \n", " \"avg_test_recall\": result['test_recall'].mean(), \n", " \"avg_fit_time\": result['fit_time'].mean(),\n", " \"avg_score_time\": result['score_time'].mean(),\n", " }\n", " \n", " if use_mlflow:\n", " \n", " for metric, value in trial_metrics.items():\n", " # log each averaged cross-validation metric (scores and timing)\n", " mlflowclient.log_metric(run_id, metric, value)\n", "\n", " # manually end run\n", " mlflowclient.set_terminated(run_id)\n", " \n", " # fit model with all events\n", " model.fit(features,labels)\n", "\n", " # log model in mlflow\n", " if model_logging and use_mlflow:\n", " signature = infer_signature(features, 
model.predict(features))\n", " with mlflow.start_run(run_id=run_id, nested=True) as run:\n", " mlflow.xgboost.log_model(model, \"model\", signature=signature)\n", " \n", " if model_logging:\n", " return {'score': result['test_roc_auc'].mean()}\n", " \n", " else: # return models as well if we do not log in mlflow\n", " return {'score': result['test_roc_auc'].mean(),\n", " 'model': model,\n", " 'all_metrics': trial_metrics}" ] }, { "cell_type": "code", "execution_count": 46, "id": "7eea145e-bec1-46ac-842c-358a9e6dddb1", "metadata": { "tags": [] }, "outputs": [], "source": [ "# folds to use for cross-validation\n", "folds = KFold(N_FOLD, random_state=5, shuffle=True)\n", "\n", "# set mlflowclient\n", "mlflowclient = MlflowClient()" ] }, { "cell_type": "code", "execution_count": null, "id": "c579e34a-c104-49bd-b086-c88625ec6b42", "metadata": { "tags": [] }, "outputs": [], "source": [ "# set up dask client\n", "if USE_DASK_ML:\n", " client = utils.get_client()\n", " client.run(initialize_mlflow)" ] }, { "cell_type": "code", "execution_count": 48, "id": "883d824f-d63e-460a-8dff-0b0cd588eef9", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Hyperparameter optimization took time = 22.939964532852173\n", "Highest AUC = 0.9555297656474304\n", "Parameters for Model with Highest AUC = \n" ] }, { "data": { "text/plain": [ "{'reg_lambda': 0.25,\n", " 'reg_alpha': 0.75,\n", " 'n_estimators': 382,\n", " 'min_child_weight': 52.63157894736842,\n", " 'max_depth': 20,\n", " 'learning_rate': 0.46415888336127725,\n", " 'gamma': 0.01623776739188721,\n", " 'booster': 'gbtree',\n", " 'trial_num': 8,\n", " 'parity': 'even',\n", " 'run_id': 'ca16dcce8f54489e97029d33c844d4a2'}" ] }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "## MODEL 1 OPTIMIZATION\n", "\n", "if USE_DASK_ML:\n", " start_time = time.time() \n", " futures = client.map(fit_model,\n", " samples_even, \n", " 
features=features_even[:N_TRAIN,:], \n", " labels=labels_even[:N_TRAIN],\n", " cv=folds,\n", " mlflowclient=mlflowclient,\n", " use_mlflow=USE_MLFLOW,\n", " model_logging=MODEL_LOGGING) \n", " \n", " res = client.gather(futures) \n", " time_elapsed = time.time() - start_time\n", " \n", "else:\n", " start_time = time.time() \n", " res = [] # fit_model returns a dict per trial, so collect results in a list\n", " for i in range(len(samples_even)):\n", " res.append(fit_model(samples_even[i], \n", " features=features_even[:N_TRAIN,:],\n", " labels=labels_even[:N_TRAIN], \n", " cv=folds,\n", " mlflowclient=mlflowclient,\n", " use_mlflow=USE_MLFLOW,\n", " model_logging=MODEL_LOGGING))\n", " \n", " time_elapsed = time.time() - start_time\n", " \n", "print(\"Hyperparameter optimization took time = \", time_elapsed)\n", "\n", "scores = [res[i]['score'] for i in range(N_TRIALS)]\n", "print(\"Highest AUC = \", max(scores))\n", "print(\"Parameters for Model with Highest AUC = \")\n", "samples_even[np.argmax(scores)]" ] }, { "cell_type": "code", "execution_count": 49, "id": "b468c2b8-f341-4fcb-bf2f-0043d2c05b5e", "metadata": { "tags": [] }, "outputs": [], "source": [ "# load best model\n", "if MODEL_LOGGING:\n", " best_run_id = samples_even[np.argmax(scores)]['run_id']\n", " best_model_path = f'runs:/{best_run_id}/model'\n", " best_model = mlflow.xgboost.load_model(best_model_path)\n", "else:\n", " best_run_id = samples_even[np.argmax(scores)]['run_id']\n", " best_model = res[np.argmax(scores)]['model']" ] }, { "cell_type": "code", "execution_count": 50, "id": "43c3624d-ccd6-4eda-9469-f4a4e913f9d9", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Train Accuracy = 0.9687\n", "Train Precision = 0.9723344103392568\n", "Train f1 = 0.9685205672332293\n", "Train Recall = 0.9647365257463434\n", "Train ROC_AUC = 0.9686928785649266\n", "\n", "Test Accuracy = 0.8906696764484575\n", "Test Precision = 0.8886894075403949\n", "Test f1 = 0.891022275556889\n", "Test Recall = 
0.8933674236727327\n", "Test ROC_AUC = 0.8906680515442417\n" ] } ], "source": [ "predictions_train = best_model.predict(features_even)\n", "predictions_test = best_model.predict(features_odd)\n", "\n", "print(\"Train Accuracy = \", accuracy_score(predictions_train[:N_TRAIN], labels_even[:N_TRAIN]))\n", "print(\"Train Precision = \", precision_score(predictions_train[:N_TRAIN], labels_even[:N_TRAIN]))\n", "print(\"Train f1 = \", f1_score(predictions_train[:N_TRAIN], labels_even[:N_TRAIN]))\n", "print(\"Train Recall = \", recall_score(predictions_train[:N_TRAIN], labels_even[:N_TRAIN]))\n", "print(\"Train ROC_AUC = \", roc_auc_score(predictions_train[:N_TRAIN], labels_even[:N_TRAIN]))\n", "print()\n", "print(\"Test Accuracy = \", accuracy_score(predictions_test, labels_odd))\n", "print(\"Test Precision = \", precision_score(predictions_test, labels_odd))\n", "print(\"Test f1 = \", f1_score(predictions_test, labels_odd))\n", "print(\"Test Recall = \", recall_score(predictions_test, labels_odd))\n", "print(\"Test ROC_AUC = \", roc_auc_score(predictions_test, labels_odd))" ] }, { "cell_type": "code", "execution_count": 51, "id": "c4a6c55b-3274-486d-b24d-4ea2db801320", "metadata": { "tags": [] }, "outputs": [], "source": [ "# register best model in the MLflow model registry\n", "if MODEL_LOGGING:\n", " result = mlflow.register_model(best_model_path, \"sig-bkg-bdt\")" ] }, { "cell_type": "code", "execution_count": null, "id": "5275e7ec-1130-4a7c-ba52-baf11ecc2087", "metadata": {}, "outputs": [], "source": [ "# save registered model to triton directory\n", "if len(os.listdir(f\"/mnt/{MODEL_NAME}\"))==0 or WRITE_OVER:\n", " model_version=1\n", "else:\n", " # compare version directories as integers (string max would rank '9' above '10')\n", " model_version=max((int(d) for d in next(os.walk(f\"/mnt/{MODEL_NAME}\"))[1]), default=0)+1\n", "\n", "if not WRITE_OVER:\n", " os.mkdir(f\"/mnt/{MODEL_NAME}/{model_version}\")\n", " \n", " \n", "print(f\"Saving Model to /mnt/{MODEL_NAME}/{model_version}/xgboost.model\")\n", "best_model.save_model(f\"/mnt/{MODEL_NAME}/{model_version}/xgboost.model\")" ] }, { 
"cell_type": "code", "execution_count": 53, "id": "7d508519-625f-448d-af76-249365ba88fa", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Hyperparameter optimization took time = 29.643147945404053\n", "Highest AUC = 0.9561495431371775\n", "Parameters for Model with Highest AUC = \n" ] }, { "data": { "text/plain": [ "{'reg_lambda': 0.25,\n", " 'reg_alpha': 0.75,\n", " 'n_estimators': 382,\n", " 'min_child_weight': 52.63157894736842,\n", " 'max_depth': 20,\n", " 'learning_rate': 0.46415888336127725,\n", " 'gamma': 0.01623776739188721,\n", " 'booster': 'gbtree',\n", " 'trial_num': 8,\n", " 'parity': 'odd',\n", " 'run_id': '81ee7a5a20b340a7bae11bfeb84e3dbc'}" ] }, "execution_count": 53, "metadata": {}, "output_type": "execute_result" } ], "source": [ "## MODEL 2 OPTIMIZATION\n", "\n", "if USE_DASK_ML:\n", "    start_time = time.time() \n", "    futures = client.map(fit_model,\n", "                         samples_odd, \n", "                         features=features_odd[:N_TRAIN,:], \n", "                         labels=labels_odd[:N_TRAIN],\n", "                         cv=folds,\n", "                         mlflowclient=mlflowclient,\n", "                         use_mlflow=USE_MLFLOW,\n", "                         model_logging=MODEL_LOGGING) \n", " \n", "    res = client.gather(futures) \n", "    time_elapsed = time.time() - start_time\n", " \n", "else:\n", "    start_time = time.time() \n", "    # fit_model returns a dict, so collect results in a list (a float array cannot hold dicts)\n", "    res = []\n", "    for i in range(len(samples_odd)):\n", "        res.append(fit_model(samples_odd[i], \n", "                             features=features_odd[:N_TRAIN,:],\n", "                             labels=labels_odd[:N_TRAIN], \n", "                             cv=folds,\n", "                             mlflowclient=mlflowclient,\n", "                             use_mlflow=USE_MLFLOW,\n", "                             model_logging=MODEL_LOGGING))\n", " \n", "    time_elapsed = time.time() - start_time\n", " \n", "print(\"Hyperparameter optimization took time = \", time_elapsed)\n", "\n", "scores = [res[i]['score'] for i in range(N_TRIALS)]\n", "print(\"Highest AUC = \", max(scores))\n", "print(\"Parameters for Model with Highest AUC = \")\n", "samples_odd[np.argmax(scores)]" ] }, { "cell_type": "code", "execution_count": 54, "id": 
"44d5cb73-e65d-4275-ba19-871f371f2b73", "metadata": { "tags": [] }, "outputs": [], "source": [ "# load best model\n", "if MODEL_LOGGING:\n", " best_run_id = samples_odd[np.argmax(scores)]['run_id']\n", " best_model_path = f'runs:/{best_run_id}/model'\n", " best_model = mlflow.xgboost.load_model(best_model_path)\n", "else:\n", " best_run_id = samples_odd[np.argmax(scores)]['run_id']\n", " best_model = res[np.argmax(scores)]['model']" ] }, { "cell_type": "code", "execution_count": 55, "id": "5eb501bd-369e-4cd6-98d3-2295b10ec3fe", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Train Accuracy = 0.9708\n", "Train Precision = 0.9761003784106752\n", "Train f1 = 0.9710719239151973\n", "Train Recall = 0.9660950128129312\n", "Train ROC_AUC = 0.9708697105875088\n", "\n", "Test Accuracy = 0.8914758076748066\n", "Test Precision = 0.9021971315227342\n", "Test f1 = 0.8920570264765784\n", "Test Recall = 0.8821423243323885\n", "Test ROC_AUC = 0.8916341731179142\n" ] } ], "source": [ "predictions_train = best_model.predict(features_odd)\n", "predictions_test = best_model.predict(features_even)\n", "\n", "print(\"Train Accuracy = \", accuracy_score(predictions_train[:N_TRAIN], labels_odd[:N_TRAIN]))\n", "print(\"Train Precision = \", precision_score(predictions_train[:N_TRAIN], labels_odd[:N_TRAIN]))\n", "print(\"Train f1 = \", f1_score(predictions_train[:N_TRAIN], labels_odd[:N_TRAIN]))\n", "print(\"Train Recall = \", recall_score(predictions_train[:N_TRAIN], labels_odd[:N_TRAIN]))\n", "print(\"Train ROC_AUC = \", roc_auc_score(predictions_train[:N_TRAIN], labels_odd[:N_TRAIN]))\n", "print()\n", "print(\"Test Accuracy = \", accuracy_score(predictions_test, labels_even))\n", "print(\"Test Precision = \", precision_score(predictions_test, labels_even))\n", "print(\"Test f1 = \", f1_score(predictions_test, labels_even))\n", "print(\"Test Recall = \", recall_score(predictions_test, labels_even))\n", "print(\"Test ROC_AUC = \", 
roc_auc_score(predictions_test, labels_even))" ] }, { "cell_type": "code", "execution_count": 56, "id": "dac8afee-f563-4d49-8613-43ede355bac4", "metadata": { "tags": [] }, "outputs": [], "source": [ "# register best model in the MLflow model registry\n", "if MODEL_LOGGING:\n", " result = mlflow.register_model(best_model_path, \"sig-bkg-bdt\")" ] }, { "cell_type": "code", "execution_count": null, "id": "ddd079d1-54b7-4ba3-8886-f30523023536", "metadata": {}, "outputs": [], "source": [ "# save registered model to triton directory\n", "if len(os.listdir(f\"/mnt/{MODEL_NAME}\"))==0:\n", " model_version=1\n", "else:\n", " # compare version directories as integers (string max would rank '9' above '10')\n", " model_version=max((int(d) for d in next(os.walk(f\"/mnt/{MODEL_NAME}\"))[1]), default=0)+1\n", "if WRITE_OVER:\n", "    # when overwriting, the second model goes to version 2; set the version before creating its directory\n", "    model_version=2\n", "    if not os.path.isdir(f\"/mnt/{MODEL_NAME}/{model_version}\"):\n", "        os.mkdir(f\"/mnt/{MODEL_NAME}/{model_version}\")\n", " \n", "if not WRITE_OVER:\n", " os.mkdir(f\"/mnt/{MODEL_NAME}/{model_version}\")\n", " \n", "print(f\"Saving Model to /mnt/{MODEL_NAME}/{model_version}/xgboost.model\")\n", "best_model.save_model(f\"/mnt/{MODEL_NAME}/{model_version}/xgboost.model\")" ] }, { "cell_type": "markdown", "id": "30d7c971-dbae-49dd-b386-ad91f006b6ea", "metadata": {}, "source": [ "After running hyperparameter optimization for both models, we can view the results of the trials at [this 
link](https://mlflow.software-dev.ncsa.cloud/#/experiments/10/s?searchInput=&orderByKey=metrics.%60avg_test_roc_auc%60&orderByAsc=false&startTime=ALL&lifecycleFilter=Active&modelVersionFilter=All%20Runs&showMultiColumns=true&categorizedUncheckedKeys%5Battributes%5D%5B0%5D=&categorizedUncheckedKeys%5Bparams%5D%5B0%5D=&categorizedUncheckedKeys%5Bmetrics%5D%5B0%5D=&categorizedUncheckedKeys%5Btags%5D%5B0%5D=&diffSwitchSelected=false&preSwitchCategorizedUncheckedKeys%5Battributes%5D%5B0%5D=&preSwitchCategorizedUncheckedKeys%5Bparams%5D%5B0%5D=&preSwitchCategorizedUncheckedKeys%5Bmetrics%5D%5B0%5D=&preSwitchCategorizedUncheckedKeys%5Btags%5D%5B0%5D=&postSwitchCategorizedUncheckedKeys%5Battributes%5D%5B0%5D=&postSwitchCategorizedUncheckedKeys%5Bparams%5D%5B0%5D=&postSwitchCategorizedUncheckedKeys%5Bmetrics%5D%5B0%5D=&postSwitchCategorizedUncheckedKeys%5Btags%5D%5B0%5D=). The best models can be seen at the top of the list with links to their associated registered models." ] }, { "cell_type": "markdown", "id": "93aa7192-4706-44f9-aecd-a2c3e93004fc", "metadata": {}, "source": [ "___\n", "### Create Triton config file for our model" ] }, { "cell_type": "code", "execution_count": 32, "id": "a8ade1fd-801e-49f3-8f22-f0538939d1ab", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "name: \"sigbkg_bdt\"\n", "backend: \"fil\"\n", "max_batch_size: 500000\n", "input [\n", " {\n", " name: \"input__0\"\n", " data_type: TYPE_FP32\n", " dims: [ 10 ]\n", " }\n", "]\n", "output [\n", " {\n", " name: \"output__0\"\n", " data_type: TYPE_FP32\n", " dims: [ 2 ]\n", " }\n", " \n", "]\n", "instance_group [{ kind: KIND_GPU }]\n", "parameters [\n", " {\n", " key: \"model_type\"\n", " value: { string_value: \"xgboost\" }\n", " },\n", " {\n", " key: \"predict_proba\"\n", " value: { string_value: \"true\" }\n", " },\n", " {\n", " key: \"output_class\"\n", " value: { string_value: \"true\" }\n", " },\n", " {\n", " key: \"threshold\"\n", " value: { 
string_value: \"0.5\" }\n", " },\n", " {\n", " key: \"algo\"\n", " value: { string_value: \"ALGO_AUTO\" }\n", " },\n", " {\n", " key: \"storage_type\"\n", " value: { string_value: \"AUTO\" }\n", " },\n", " {\n", " key: \"blocks_per_sm\"\n", " value: { string_value: \"0\" }\n", " }\n", "]\n", "\n", "dynamic_batching { }\n" ] } ], "source": [ "config_txt = utils.generate_triton_config(MODEL_NAME, features_even.shape[1], predict_proba=True)\n", "print(config_txt)" ] }, { "cell_type": "code", "execution_count": 33, "id": "dba8851f-b7f9-4073-860d-06e83f183c54", "metadata": {}, "outputs": [], "source": [ "# save config file \n", "with open(f'/mnt/{MODEL_NAME}/config.pbtxt', 'w') as the_file:\n", " the_file.write(config_txt)" ] }, { "cell_type": "code", "execution_count": 34, "id": "b439dde3-f360-4b9d-ad98-91aab151c2cb", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['3', '2', '4', 'config.pbtxt', '1']\n" ] } ], "source": [ "# print contents of triton server model directory\n", "print(os.listdir(f\"/mnt/{MODEL_NAME}\"))" ] } ], "metadata": { "jupytext": { "formats": "ipynb,py:percent" }, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.16" } }, "nbformat": 4, "nbformat_minor": 5 }