{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## TriScale\n", "# Case Study - Failure Detection\n", "\n", "This notebook presents a case study of the TriScale framework. It revisits the analysis of [Blink](https://www.usenix.org/conference/nsdi19/presentation/holterbach), an algorithm that detects failures and reroutes traffic directly in the data plane. Parts of this case study are described in the [TriScale paper](https://doi.org/10.5281/zenodo.3464273).\n", "\n", "\n", "## List of Imports" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import os\n", "import copy\n", "from pathlib import Path\n", "import zipfile\n", "\n", "import pandas as pd\n", "import numpy as np\n", "import plotly.graph_objects as go\n", "\n", "import triscale\n", "import triplots" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Download Source Files and Data\n", "[[Back to top](#TriScale)]\n", "\n", "The dataset for this case study is available on Zenodo: \n", "\n", "[10.5281/zenodo.3451417](https://doi.org/10.5281/zenodo.3451417)\n", "\n", "\n", "The wget commands below download the required files to reproduce this case study. 
Downloading and unzipping might take a while...\n", "> **The .zip file is ~100 kB**" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Nothing to download\n" ] } ], "source": [ "# Set `download = True` to download (and extract) the data for this case study\n", "# If needed, adjust the record_id for the file version you are interested in.\n", "\n", "# For reproducing the results of the TriScale paper, set `record_id = 3666724`\n", "\n", "download = True\n", "record_id = 3666724 # v3.0.1 (https://doi.org/10.5281/zenodo.3666724)\n", "\n", "files = ['UseCase_FailureDetection.zip']\n", "if download:\n", "    for file in files:\n", "        print(file)\n", "        url = 'https://zenodo.org/record/'+str(record_id)+'/files/'+file\n", "        os.system('wget %s' %url)\n", "        # Extract the archive into the current directory\n", "        if file[-4:] == '.zip':\n", "            with zipfile.ZipFile(file,\"r\") as zip_file:\n", "                zip_file.extractall()\n", "    print('Done.')\n", "else:\n", "    print('Nothing to download')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We now import the custom module for the case study. " ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "import UseCase_FailureDetection.failuredetection as fd" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Evaluation objectives\n", "[[Back to top](#TriScale)]\n", "\n", "In this case study, 30 prefixes of 15 different internet traces have been selected. For each of these prefixes, 5 artificial traces have been generated, all of which include a failure. We are interested in evaluating\n", "1. The ratio of failures which are correctly detected (true positives)\n", "2. The time taken until the traffic is rerouted\n", "\n", "The experiment has been designed and performed by the authors of [the Blink paper](https://www.usenix.org/conference/nsdi19/presentation/holterbach). 
In this case study, we only perform the data analysis, using the _TriScale_ approach to generalize the results.\n", "\n", "### 1. Compute the Metrics\n", "For each prefix, we compute two metrics:\n", "1. The true positive rate; that is, the ratio of failures correctly detected by the algorithm. Since there are 5 synthetic traces per prefix, this metric has values in {0, 0.2, 0.4, 0.6, 0.8, 1}\n", "2. The median time taken to reroute the traffic (considering only the failures that have been detected)\n", "\n", "The computation of metric values is performed by the `compute_metrics()` function below." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Output retrieved from file. Skipping computation.\n" ] }, { "data": { "text/html": [ "
|  | Protocol | Trace | Prefix | TPR | Speed_s |
|---|---|---|---|---|---|
| 0 | blink | 1 | 0 | 0.6 | 1.998730 |
| 1 | blink | 1 | 1 | 1.0 | 1.579861 |
| 2 | blink | 1 | 2 | 0.0 | NaN |
| 3 | blink | 1 | 3 | 1.0 | 1.707236 |
| 4 | blink | 1 | 4 | 0.8 | 1.419164 |
| ... | ... | ... | ... | ... | ... |
| 1345 | infinite_timeout | 15 | 25 | 0.8 | 1.681014 |
| 1346 | infinite_timeout | 15 | 26 | 0.0 | NaN |
| 1347 | infinite_timeout | 15 | 27 | 0.4 | 2.107471 |
| 1348 | infinite_timeout | 15 | 28 | 1.0 | 0.717849 |
| 1349 | infinite_timeout | 15 | 29 | 1.0 | 0.743098 |

1350 rows × 5 columns
\n", "