{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# StatsBomb 360 Data Engineering\n", "##### Notebook to engineer previously parsed CSV data from the [StatsBomb Open Data GitHub repository](https://github.com/statsbomb/open-data) using [pandas](http://pandas.pydata.org/).\n", "\n", "### By [Edd Webster](https://www.twitter.com/eddwebster)\n", "Notebook first written: 29/10/2021
\n", "Notebook last updated: 05/12/2021\n", "\n", "![StatsBomb](../../img/logos/stats-bomb-logo.png)\n", "\n", "![StatsBomb 360](../../img/logos/stats-bomb-360-logo.png)\n", "\n", "Click [here](#section5) to jump straight to the Aggregated Data section and skip the [Project Brief](#section2), [Data Sources](#section3), and [Data Engineering](#section4) sections." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "___\n", "\n", "\n", "## Introduction\n", "This notebook engineers publicly available [StatsBomb](https://statsbomb.com/) 360 data, using [pandas](http://pandas.pydata.org/) for data manipulation through DataFrames.\n", "\n", "For more information about this notebook and the author, I can be reached through the following channels:\n", "* [eddwebster.com](https://www.eddwebster.com/);\n", "* edd.j.webster@gmail.com;\n", "* [@eddwebster](https://www.twitter.com/eddwebster);\n", "* [linkedin.com/in/eddwebster](https://www.linkedin.com/in/eddwebster/);\n", "* [github/eddwebster](https://github.com/eddwebster/); and\n", "* [public.tableau.com/profile/edd.webster](https://public.tableau.com/profile/edd.webster).\n", "\n", "![Edd Webster](../../img/edd_webster/fifa21eddwebsterbanner.png)\n", "\n", "The accompanying GitHub repository for this notebook can be found [here](https://github.com/eddwebster/football_analytics) and a static version of this notebook can be found [here](https://nbviewer.org/github/eddwebster/football_analytics/blob/master/notebooks/2_data_parsing/Parma%20Calcio%201913%20-%20StatsBomb%20Data%20Parsing%20and%20Engineering.ipynb)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "___\n", "\n", "## Notebook Contents\n", "1. [Notebook Dependencies](#section1)
\n", "2. [Project Brief](#section2)
\n", "3. [Data Sources](#section3)
\n", " 1. [Introduction](#section3.1)
\n", " 2. [Read in the Datasets](#section3.2)
\n", " 3. [Initial Data Handling](#section3.3)
\n", "4. [Data Engineering](#section4)
\n", " 1. [Assign Raw DataFrame to Engineered DataFrame](#section4.1)
\n", " 2. [Sort the DataFrame](#section4.2)
\n", " 3. [Determine Each Player's Most Frequent Position](#section4.3)
\n", " 4. [Determine Each Player's Total Minutes Played](#section4.4)
\n", " 5. [Isolate In-Play Events](#section4.5)
\n", " 6. [Break Down All location Attributes](#section4.6)
\n", " 7. [Create New Attributes](#section4.7)
\n", " 8. [Fill Null Values](#section4.8)
\n", " 9. [Export Events Dataset](#section4.9)
\n", "5. [Aggregated Data](#section5)
\n", " 1. [Groupby and Aggregate by Player and Match](#section5.1)
\n", " 2. [Groupby and Aggregate by Player for the Entire Tournament](#section5.2)
\n", "6. [Subset Data](#section6)
\n", " 1. [Passing Matrix Data](#section6.1)
\n", " 2. [Passing Network Data](#section6.2)
\n", " 3. [...](#section6.3)
\n", "7. [Summary](#section7)
\n", "8. [Next Steps](#section8)
\n", "9. [References](#section9)
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "___\n", "\n", "\n", "\n", "## 1. Notebook Dependencies\n", "\n", "This notebook was written using [Python 3](https://docs.python.org/3.7/) and requires the following libraries:\n", "* [`Jupyter notebooks`](https://jupyter.org/) for the notebook environment in which this project is presented;\n", "* [`NumPy`](http://www.numpy.org/) for multidimensional array computing; and\n", "* [`pandas`](http://pandas.pydata.org/) for data analysis and manipulation.\n", "\n", "The libraries listed above can be obtained by downloading and installing the [Conda](https://anaconda.org/anaconda/conda) distribution, available on all platforms (Windows, Linux and Mac OSX). Step-by-step guides on how to install Anaconda can be found for Windows [here](https://medium.com/@GalarnykMichael/install-python-on-windows-anaconda-c63c7c3d1444) and Mac [here](https://medium.com/@GalarnykMichael/install-python-on-mac-anaconda-ccd9f2014072), as well as in the Anaconda documentation itself [here](https://docs.anaconda.com/anaconda/install/)." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Import Libraries and Modules" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Setup Complete\n" ] } ], "source": [ "# Python ≥3.5 (ideally)\n", "import platform\n", "import sys, getopt\n", "assert sys.version_info >= (3, 5)\n", "import csv\n", "\n", "# Import Dependencies\n", "%matplotlib inline\n", "\n", "# Math Operations\n", "import numpy as np\n", "from math import pi\n", "\n", "# Datetime\n", "import datetime\n", "from datetime import date\n", "import time\n", "\n", "# Data Preprocessing\n", "import pandas as pd\n", "import pandas_profiling as pp\n", "import os\n", "import re\n", "import chardet\n", "import random\n", "from io import BytesIO\n", "from pathlib import Path\n", "\n", "# Reading Directories\n", "import glob\n", "import os\n", "\n", "# Working with JSON\n", "import json\n", "from pandas.io.json import json_normalize\n", "\n", "# Data Visualisation\n", "import matplotlib as mpl\n", "import matplotlib.pyplot as plt\n", "import seaborn as sns\n", "import missingno as msno\n", "\n", "# Progress Bar\n", "from tqdm import tqdm\n", "\n", "# Display in Jupyter\n", "from IPython.display import Image, YouTubeVideo\n", "from IPython.core.display import HTML\n", "\n", "# Ignore Warnings\n", "import warnings\n", "warnings.filterwarnings(action=\"ignore\", message=\"^internal gelsd\")\n", "\n", "print(\"Setup Complete\")" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "from rfpimp import *\n", "\n", "\n", "# Machine learning\n", "from sklearn import preprocessing, model_selection, svm, metrics\n", "from sklearn.model_selection import train_test_split, cross_val_predict, cross_val_score, GridSearchCV\n", "from sklearn.ensemble import RandomForestClassifier\n", "from sklearn.ensemble import ExtraTreesClassifier\n", "from sklearn.ensemble import 
GradientBoostingClassifier\n", "from sklearn.linear_model import LogisticRegression\n", "from sklearn.tree import DecisionTreeClassifier\n", "from xgboost import XGBClassifier\n", "from sklearn.utils import resample\n", "from sklearn.feature_selection import RFE\n", "from imblearn.over_sampling import SMOTE\n", "from collections import Counter\n", "\n", "from sklearn.model_selection import StratifiedKFold\n", "from sklearn.utils.class_weight import compute_class_weight\n", "\n", "from imblearn.over_sampling import SMOTENC\n", "\n", "from sklearn.preprocessing import StandardScaler\n", "from sklearn.decomposition import PCA\n", "\n", "from imblearn.under_sampling import TomekLinks\n", "from imblearn.under_sampling import EditedNearestNeighbours\n", "from imblearn.combine import SMOTETomek\n", "from imblearn.combine import SMOTEENN\n", "from sklearn.metrics import matthews_corrcoef\n", "from sklearn.metrics import brier_score_loss\n", "\n", "from sklearn.calibration import CalibratedClassifierCV, calibration_curve\n", "\n", "from imblearn.ensemble import BalancedRandomForestClassifier\n", "from sklearn.ensemble import AdaBoostClassifier\n", "from sklearn.datasets import make_classification\n", "\n", "import pickle\n", "from sklearn.metrics import plot_confusion_matrix\n", "import scikitplot as skplt" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "scrolled": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Python: 3.7.6\n", "NumPy: 1.20.3\n", "pandas: 1.3.2\n", "matplotlib: 3.4.2\n" ] } ], "source": [ "# Python / module versions used here for reference\n", "print('Python: {}'.format(platform.python_version()))\n", "print('NumPy: {}'.format(np.__version__))\n", "print('pandas: {}'.format(pd.__version__))\n", "print('matplotlib: {}'.format(mpl.__version__))" ] }, { "cell_type": 
"markdown", "metadata": {}, "source": [ "### Defined Variables" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "# Define today's date as a DDMMYYYY string\n", "today = datetime.datetime.now().strftime('%d%m%Y')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Defined Filepaths" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "# Set up initial paths to subfolders\n", "base_dir = os.path.join('..', '..')\n", "data_dir = os.path.join(base_dir, 'data')\n", "data_dir_sb = os.path.join(base_dir, 'data', 'sb')\n", "img_dir = os.path.join(base_dir, 'img')\n", "fig_dir = os.path.join(base_dir, 'img', 'fig')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create Directory Structure" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "# Make the directory structure, creating any missing parent folders\n", "for folder in ['combined', 'competitions', 'events', 'tactics', 'lineups']:\n", "    path = os.path.join(data_dir_sb, 'raw', folder)\n", "    os.makedirs(path, exist_ok=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Custom Functions" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "# Define custom functions used in the notebook\n", "\n", "## Function to read JSON files that also handles the encoding of special characters e.g. accents in names of players and teams\n", "def read_json_file(filename):\n", "    with open(filename, 'rb') as json_file:\n", "        return BytesIO(json_file.read()).getvalue().decode('unicode_escape')\n", "\n", "\n", "## Function to flatten pandas DataFrames with nested JSON columns. 
Source: https://stackoverflow.com/questions/39899005/how-to-flatten-a-pandas-dataframe-with-some-columns-as-json\n", "def flatten_nested_json_df(df):\n", "\n", "    df = df.reset_index()\n", "\n", "    print(f\"original shape: {df.shape}\")\n", "    print(f\"original columns: {df.columns}\")\n", "\n", "\n", "    # search for columns to explode/flatten\n", "    s = (df.applymap(type) == list).all()\n", "    list_columns = s[s].index.tolist()\n", "\n", "    s = (df.applymap(type) == dict).all()\n", "    dict_columns = s[s].index.tolist()\n", "\n", "    print(f\"lists: {list_columns}, dicts: {dict_columns}\")\n", "    while len(list_columns) > 0 or len(dict_columns) > 0:\n", "        new_columns = []\n", "\n", "        for col in dict_columns:\n", "            print(f\"flattening: {col}\")\n", "            # explode dictionaries horizontally, adding new columns\n", "            horiz_exploded = pd.json_normalize(df[col]).add_prefix(f'{col}.')\n", "            horiz_exploded.index = df.index\n", "            df = pd.concat([df, horiz_exploded], axis=1).drop(columns=[col])\n", "            new_columns.extend(horiz_exploded.columns) # inplace\n", "\n", "        for col in list_columns:\n", "            print(f\"exploding: {col}\")\n", "            # explode lists vertically, adding new rows\n", "            df = df.drop(columns=[col]).join(df[col].explode().to_frame())\n", "            new_columns.append(col)\n", "\n", "        # check if there are still dict or list fields to flatten\n", "        s = (df[new_columns].applymap(type) == list).all()\n", "        list_columns = s[s].index.tolist()\n", "\n", "        s = (df[new_columns].applymap(type) == dict).all()\n", "        dict_columns = s[s].index.tolist()\n", "\n", "        print(f\"lists: {list_columns}, dicts: {dict_columns}\")\n", "\n", "    print(f\"final shape: {df.shape}\")\n", "    print(f\"final columns: {df.columns}\")\n", "    return df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Notebook Settings" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "# Display all columns of displayed pandas DataFrames\n", "pd.set_option('display.max_columns', 
None)\n", "# Suppress the SettingWithCopyWarning raised on chained assignment\n", "pd.options.mode.chained_assignment = None" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "\n", "\n", "## 2. Project Brief\n", "This Jupyter notebook is part of a series of notebooks to parse and engineer StatsBomb Event data.\n", "\n", "This particular notebook is the **StatsBomb Data Engineering** notebook for 360 data, which takes a previously parsed dataset of StatsBomb 360 data from the StatsBomb Open Data GitHub Repository and prepares it for data analysis.\n", "\n", "Links to these notebooks in the [`football_analytics`](https://github.com/eddwebster/football_analytics) GitHub repository are listed below:\n", "* [Data Parsing](https://github.com/eddwebster/football_analytics/tree/master/notebooks/2_data_parsing)\n", " + [StatsBomb Data Parsing](https://github.com/eddwebster/football_analytics/blob/master/notebooks/2_data_parsing/ELO%20Team%20Ratings%20Data%20Parsing.ipynb)\n", "* [Data Engineering](https://github.com/eddwebster/football_analytics/tree/master/notebooks/3_data_engineering)\n", " + [StatsBomb Data Engineering](https://github.com/eddwebster/football_analytics/blob/master/notebooks/3_data_engineering/FBref%20Player%20Stats%20Data%20Engineering.ipynb)\n", "\n", "**Notebook Conventions**:
\n", "* Variables that refer to a `DataFrame` object are prefixed with `df_`.\n", "* Variables that refer to a collection of `DataFrame` objects (e.g., a list, a set or a dict) are prefixed with `dfs_`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "\n", "\n", "## 3. Data Sources" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3.1. Reading In CSV Data\n", "The following cells read in the previously prepared DataFrame of StatsBomb 360 data." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "scrolled": true }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/opt/anaconda3/lib/python3.7/site-packages/IPython/core/interactiveshell.py:3441: DtypeWarning: Columns (18,45,69,78,79,80,82,83,84,89,90,92,93,98,100,101,102,108,109,112,113,114,116,118,119,120,121,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,161,164,168,171) have mixed types.Specify dtype option on import or set low_memory=False.\n", " exec(code_obj, self.user_global_ns, self.user_ns)\n" ] }, { "data": { "text/html": [ "
DataFrame preview (rows 0 to 4 of the StatsBomb 360 events data for Finland v Russia, UEFA Euro 2020, match_id 3788753); the same rows are shown in the text/plain output that follows.
" ], "text/plain": [ " level_0 id index period timestamp \\\n", "0 0 9427b18a-6b10-411f-90da-3d6240b80c71 1 1 00:00:00.000 \n", "1 1 542c58bf-5c6c-43ca-9d8d-e086c7f08aaf 2 1 00:00:00.000 \n", "2 2 a0dfe8a0-a0b9-443e-89e3-a8ba6596fa33 3 1 00:00:00.000 \n", "3 3 c7156352-f4b7-4140-aa51-6e26fd019a11 4 1 00:00:00.000 \n", "4 4 94dbc5c3-ef37-445e-9154-3d9f9ea9245d 5 1 00:00:00.490 \n", "\n", " minute second possession duration type.id type.name \\\n", "0 0 0 1 0.000000 35 Starting XI \n", "1 0 0 1 0.000000 35 Starting XI \n", "2 0 0 1 0.000000 18 Half Start \n", "3 0 0 1 0.000000 18 Half Start \n", "4 0 0 2 1.373215 30 Pass \n", "\n", " possession_team.id possession_team.name play_pattern.id play_pattern.name \\\n", "0 1835 Finland 1 Regular Play \n", "1 1835 Finland 1 Regular Play \n", "2 1835 Finland 1 Regular Play \n", "3 1835 Finland 1 Regular Play \n", "4 796 Russia 9 From Kick Off \n", "\n", " team.id team.name tactics.formation \\\n", "0 1835 Finland 352.0 \n", "1 796 Russia 3421.0 \n", "2 1835 Finland NaN \n", "3 796 Russia NaN \n", "4 796 Russia NaN \n", "\n", " tactics.lineup \\\n", "0 [{'player': {'id': 8667, 'name': 'Lukáš Hrádec... \n", "1 [{'player': {'id': 21298, 'name': 'Matvey Safo... 
\n", "2 NaN \n", "3 NaN \n", "4 NaN \n", "\n", " related_events location player.id \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 ['c7156352-f4b7-4140-aa51-6e26fd019a11'] NaN NaN \n", "3 ['a0dfe8a0-a0b9-443e-89e3-a8ba6596fa33'] NaN NaN \n", "4 ['c0935bbe-3eb4-4a21-9eee-45f380d1f26d'] [60.0, 40.0] 6299.0 \n", "\n", " player.name position.id position.name \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 Aleksey Miranchuk 18.0 Right Attacking Midfield \n", "\n", " pass.recipient.id pass.recipient.name pass.length pass.angle \\\n", "0 NaN NaN NaN NaN \n", "1 NaN NaN NaN NaN \n", "2 NaN NaN NaN NaN \n", "3 NaN NaN NaN NaN \n", "4 31917.0 Igor Diveev 22.357325 3.069967 \n", "\n", " pass.height.id pass.height.name pass.end_location pass.body_part.id \\\n", "0 NaN NaN NaN NaN \n", "1 NaN NaN NaN NaN \n", "2 NaN NaN NaN NaN \n", "3 NaN NaN NaN NaN \n", "4 1.0 Ground Pass [37.7, 41.6] 38.0 \n", "\n", " pass.body_part.name pass.type.id pass.type.name carry.end_location \\\n", "0 NaN NaN NaN NaN \n", "1 NaN NaN NaN NaN \n", "2 NaN NaN NaN NaN \n", "3 NaN NaN NaN NaN \n", "4 Left Foot 65.0 Kick Off NaN \n", "\n", " under_pressure duel.type.id duel.type.name pass.aerial_won counterpress \\\n", "0 NaN NaN NaN NaN NaN \n", "1 NaN NaN NaN NaN NaN \n", "2 NaN NaN NaN NaN NaN \n", "3 NaN NaN NaN NaN NaN \n", "4 NaN NaN NaN NaN NaN \n", "\n", " duel.outcome.id duel.outcome.name dribble.outcome.id dribble.outcome.name \\\n", "0 NaN NaN NaN NaN \n", "1 NaN NaN NaN NaN \n", "2 NaN NaN NaN NaN \n", "3 NaN NaN NaN NaN \n", "4 NaN NaN NaN NaN \n", "\n", " pass.outcome.id pass.outcome.name ball_receipt.outcome.id \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " ball_receipt.outcome.name interception.outcome.id \\\n", "0 NaN NaN \n", "1 NaN NaN \n", "2 NaN NaN \n", "3 NaN NaN \n", "4 NaN NaN \n", "\n", " interception.outcome.name shot.statsbomb_xg shot.end_location \\\n", "0 NaN NaN 
NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " shot.outcome.id shot.outcome.name shot.type.id shot.type.name \\\n", "0 NaN NaN NaN NaN \n", "1 NaN NaN NaN NaN \n", "2 NaN NaN NaN NaN \n", "3 NaN NaN NaN NaN \n", "4 NaN NaN NaN NaN \n", "\n", " shot.body_part.id shot.body_part.name shot.technique.id \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " shot.technique.name shot.freeze_frame goalkeeper.end_location \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " goalkeeper.type.id goalkeeper.type.name goalkeeper.position.id \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " goalkeeper.position.name out pass.outswinging pass.technique.id \\\n", "0 NaN NaN NaN NaN \n", "1 NaN NaN NaN NaN \n", "2 NaN NaN NaN NaN \n", "3 NaN NaN NaN NaN \n", "4 NaN NaN NaN NaN \n", "\n", " pass.technique.name clearance.head clearance.body_part.id \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " clearance.body_part.name pass.switch off_camera pass.cross \\\n", "0 NaN NaN NaN NaN \n", "1 NaN NaN NaN NaN \n", "2 NaN NaN NaN NaN \n", "3 NaN NaN NaN NaN \n", "4 NaN NaN NaN NaN \n", "\n", " clearance.left_foot dribble.overrun dribble.nutmeg clearance.right_foot \\\n", "0 NaN NaN NaN NaN \n", "1 NaN NaN NaN NaN \n", "2 NaN NaN NaN NaN \n", "3 NaN NaN NaN NaN \n", "4 NaN NaN NaN NaN \n", "\n", " pass.no_touch foul_committed.advantage foul_won.advantage \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " pass.assisted_shot_id pass.shot_assist shot.key_pass_id shot.first_time \\\n", "0 NaN NaN NaN NaN \n", "1 NaN NaN NaN NaN \n", "2 NaN NaN NaN NaN \n", "3 NaN NaN NaN NaN \n", "4 NaN NaN NaN NaN \n", "\n", " clearance.other 
pass.miscommunication clearance.aerial_won \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " pass.through_ball ball_recovery.recovery_failure goalkeeper.outcome.id \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " goalkeeper.outcome.name goalkeeper.body_part.id goalkeeper.body_part.name \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " shot.aerial_won foul_committed.card.id foul_committed.card.name \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " foul_committed.offensive foul_won.defensive substitution.outcome.id \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " substitution.outcome.name substitution.replacement.id \\\n", "0 NaN NaN \n", "1 NaN NaN \n", "2 NaN NaN \n", "3 NaN NaN \n", "4 NaN NaN \n", "\n", " substitution.replacement.name 50_50.outcome.id 50_50.outcome.name \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " pass.goal_assist goalkeeper.technique.id goalkeeper.technique.name \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " pass.cut_back miscontrol.aerial_won pass.straight foul_committed.type.id \\\n", "0 NaN NaN NaN NaN \n", "1 NaN NaN NaN NaN \n", "2 NaN NaN NaN NaN \n", "3 NaN NaN NaN NaN \n", "4 NaN NaN NaN NaN \n", "\n", " foul_committed.type.name match_id pass.inswinging pass.deflected \\\n", "0 NaN 3788753 NaN NaN \n", "1 NaN 3788753 NaN NaN \n", "2 NaN 3788753 NaN NaN \n", "3 NaN 3788753 NaN NaN \n", "4 NaN 3788753 NaN NaN \n", "\n", " injury_stoppage.in_chain shot.one_on_one bad_behaviour.card.id \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN 
\n", "\n", " bad_behaviour.card.name shot.deflected block.deflection \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " foul_committed.penalty foul_won.penalty block.save_block \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " goalkeeper.punched_out player_off.permanent shot.saved_off_target \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " goalkeeper.shot_saved_off_target shot.saved_to_post \\\n", "0 NaN NaN \n", "1 NaN NaN \n", "2 NaN NaN \n", "3 NaN NaN \n", "4 NaN NaN \n", "\n", " goalkeeper.shot_saved_to_post shot.open_goal \\\n", "0 NaN NaN \n", "1 NaN NaN \n", "2 NaN NaN \n", "3 NaN NaN \n", "4 NaN NaN \n", "\n", " goalkeeper.penalty_saved_to_post dribble.no_touch block.offensive \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " shot.follows_dribble ball_recovery.offensive shot.redirect \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " goalkeeper.lost_in_play goalkeeper.success_in_play match_date \\\n", "0 NaN NaN 2021-06-16 \n", "1 NaN NaN 2021-06-16 \n", "2 NaN NaN 2021-06-16 \n", "3 NaN NaN 2021-06-16 \n", "4 NaN NaN 2021-06-16 \n", "\n", " kick_off home_score away_score match_status match_status_360 \\\n", "0 15:00:00.000 0 1 available available \n", "1 15:00:00.000 0 1 available available \n", "2 15:00:00.000 0 1 available available \n", "3 15:00:00.000 0 1 available available \n", "4 15:00:00.000 0 1 available available \n", "\n", " last_updated last_updated_360 match_week \\\n", "0 2021-11-11T14:00:16.105809 2021-09-22T16:39:05.697512 2 \n", "1 2021-11-11T14:00:16.105809 2021-09-22T16:39:05.697512 2 \n", "2 2021-11-11T14:00:16.105809 2021-09-22T16:39:05.697512 2 \n", "3 2021-11-11T14:00:16.105809 2021-09-22T16:39:05.697512 2 \n", "4 
2021-11-11T14:00:16.105809 2021-09-22T16:39:05.697512 2 \n", "\n", " competition.competition_id competition.country_name \\\n", "0 55 Europe \n", "1 55 Europe \n", "2 55 Europe \n", "3 55 Europe \n", "4 55 Europe \n", "\n", " competition.competition_name season.season_id season.season_name \\\n", "0 UEFA Euro 43 2020 \n", "1 UEFA Euro 43 2020 \n", "2 UEFA Euro 43 2020 \n", "3 UEFA Euro 43 2020 \n", "4 UEFA Euro 43 2020 \n", "\n", " home_team.home_team_id home_team.home_team_name home_team.home_team_gender \\\n", "0 1835 Finland male \n", "1 1835 Finland male \n", "2 1835 Finland male \n", "3 1835 Finland male \n", "4 1835 Finland male \n", "\n", " home_team.home_team_group home_team.country.id home_team.country.name \\\n", "0 Group B 77 Finland \n", "1 Group B 77 Finland \n", "2 Group B 77 Finland \n", "3 Group B 77 Finland \n", "4 Group B 77 Finland \n", "\n", " home_team.managers away_team.away_team_id \\\n", "0 [{'id': 3622, 'name': 'Markku Kanerva', 'nickn... 796 \n", "1 [{'id': 3622, 'name': 'Markku Kanerva', 'nickn... 796 \n", "2 [{'id': 3622, 'name': 'Markku Kanerva', 'nickn... 796 \n", "3 [{'id': 3622, 'name': 'Markku Kanerva', 'nickn... 796 \n", "4 [{'id': 3622, 'name': 'Markku Kanerva', 'nickn... 796 \n", "\n", " away_team.away_team_name away_team.away_team_gender \\\n", "0 Russia male \n", "1 Russia male \n", "2 Russia male \n", "3 Russia male \n", "4 Russia male \n", "\n", " away_team.away_team_group away_team.country.id away_team.country.name \\\n", "0 Group B 188 Russia \n", "1 Group B 188 Russia \n", "2 Group B 188 Russia \n", "3 Group B 188 Russia \n", "4 Group B 188 Russia \n", "\n", " away_team.managers metadata.data_version \\\n", "0 [{'id': 365, 'name': 'Stanislav Cherchesov', '... 1.1.0 \n", "1 [{'id': 365, 'name': 'Stanislav Cherchesov', '... 1.1.0 \n", "2 [{'id': 365, 'name': 'Stanislav Cherchesov', '... 1.1.0 \n", "3 [{'id': 365, 'name': 'Stanislav Cherchesov', '... 1.1.0 \n", "4 [{'id': 365, 'name': 'Stanislav Cherchesov', '... 
1.1.0 \n", "\n", " metadata.shot_fidelity_version metadata.xy_fidelity_version \\\n", "0 2 2 \n", "1 2 2 \n", "2 2 2 \n", "3 2 2 \n", "4 2 2 \n", "\n", " competition_stage.id competition_stage.name stadium.id \\\n", "0 10 Group Stage 4726 \n", "1 10 Group Stage 4726 \n", "2 10 Group Stage 4726 \n", "3 10 Group Stage 4726 \n", "4 10 Group Stage 4726 \n", "\n", " stadium.name stadium.country.id stadium.country.name \\\n", "0 Saint-Petersburg Stadium 188 Russia \n", "1 Saint-Petersburg Stadium 188 Russia \n", "2 Saint-Petersburg Stadium 188 Russia \n", "3 Saint-Petersburg Stadium 188 Russia \n", "4 Saint-Petersburg Stadium 188 Russia \n", "\n", " referee.id referee.name referee.country.id \\\n", "0 293 Danny Desmond Makkelie 160 \n", "1 293 Danny Desmond Makkelie 160 \n", "2 293 Danny Desmond Makkelie 160 \n", "3 293 Danny Desmond Makkelie 160 \n", "4 293 Danny Desmond Makkelie 160 \n", "\n", " referee.country.name competition_id season_id country_name \\\n", "0 Netherlands 55 43 Europe \n", "1 Netherlands 55 43 Europe \n", "2 Netherlands 55 43 Europe \n", "3 Netherlands 55 43 Europe \n", "4 Netherlands 55 43 Europe \n", "\n", " competition_name competition_gender competition_youth \\\n", "0 UEFA Euro male False \n", "1 UEFA Euro male False \n", "2 UEFA Euro male False \n", "3 UEFA Euro male False \n", "4 UEFA Euro male False \n", "\n", " competition_international season_name match_updated \\\n", "0 True 2020 2021-11-11T14:00:16.105809 \n", "1 True 2020 2021-11-11T14:00:16.105809 \n", "2 True 2020 2021-11-11T14:00:16.105809 \n", "3 True 2020 2021-11-11T14:00:16.105809 \n", "4 True 2020 2021-11-11T14:00:16.105809 \n", "\n", " match_updated_360 match_available_360 \\\n", "0 2021-11-11T13:54:37.507376 2021-11-11T13:54:37.507376 \n", "1 2021-11-11T13:54:37.507376 2021-11-11T13:54:37.507376 \n", "2 2021-11-11T13:54:37.507376 2021-11-11T13:54:37.507376 \n", "3 2021-11-11T13:54:37.507376 2021-11-11T13:54:37.507376 \n", "4 2021-11-11T13:54:37.507376 2021-11-11T13:54:37.507376 
\n", "\n", " match_available \n", "0 2021-11-11T14:00:16.105809 \n", "1 2021-11-11T14:00:16.105809 \n", "2 2021-11-11T14:00:16.105809 \n", "3 2021-11-11T14:00:16.105809 \n", "4 2021-11-11T14:00:16.105809 " ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_sb_events_raw = pd.read_csv(os.path.join(data_dir_sb, 'raw', 'combined', 'combined_sb_360.csv'))\n", " \n", "# Display DataFrame\n", "df_sb_events_raw.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3.2. Initial Data Handling\n", "Let's assess the quality of the dataset by looking at the first and last rows in pandas using the [head()](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.head.html) and [tail()](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.tail.html) methods." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 3.2.1. Summary Report\n", "The initial step of the data handling and Exploratory Data Analysis (EDA) is to create a quick summary report of the dataset using [pandas Profiling Report](https://github.com/pandas-profiling/pandas-profiling)." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "# Summary of the data using pandas Profiling Report\n", "#pp.ProfileReport(df_sb_events_raw)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 3.2.2. Further Inspection\n", "The following commands give a more bespoke summary of the dataset. 
Some of the commands include content covered in the [pandas Profiling](https://github.com/pandas-profiling/pandas-profiling) summary above, but use the standard [pandas](https://pandas.pydata.org/) functions and methods that most people will be more familiar with.\n", "\n", "First, check the quality of the dataset by looking at the first and last rows in pandas using the [head()](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.head.html) and [tail()](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.tail.html) methods." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " level_0 id index period timestamp \\\n", "0 0 9427b18a-6b10-411f-90da-3d6240b80c71 1 1 00:00:00.000 \n", "1 1 542c58bf-5c6c-43ca-9d8d-e086c7f08aaf 2 1 00:00:00.000 \n", "2 2 a0dfe8a0-a0b9-443e-89e3-a8ba6596fa33 3 1 00:00:00.000 \n", "3 3 c7156352-f4b7-4140-aa51-6e26fd019a11 4 1 00:00:00.000 \n", "4 4 94dbc5c3-ef37-445e-9154-3d9f9ea9245d 5 1 00:00:00.490 \n", "\n", " minute second possession duration type.id type.name \\\n", "0 0 0 1 0.000000 35 Starting XI \n", "1 0 0 1 0.000000 35 Starting XI \n", "2 0 0 1 0.000000 18 Half Start \n", "3 0 0 1 0.000000 18 Half Start \n", "4 0 0 2 1.373215 30 Pass \n", "\n", " possession_team.id possession_team.name play_pattern.id play_pattern.name \\\n", "0 1835 Finland 1 Regular Play \n", "1 1835 Finland 1 Regular Play \n", "2 1835 Finland 1 Regular Play \n", "3 1835 Finland 1 Regular Play \n", "4 796 Russia 9 From Kick Off \n", "\n", " team.id team.name tactics.formation \\\n", "0 1835 Finland 352.0 \n", "1 796 Russia 3421.0 \n", "2 1835 Finland NaN \n", "3 796 Russia NaN \n", "4 796 Russia NaN \n", "\n", " tactics.lineup \\\n", "0 [{'player': {'id': 8667, 'name': 'Lukáš Hrádec... \n", "1 [{'player': {'id': 21298, 'name': 'Matvey Safo... 
\n", "2 NaN \n", "3 NaN \n", "4 NaN \n", "\n", " related_events location player.id \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 ['c7156352-f4b7-4140-aa51-6e26fd019a11'] NaN NaN \n", "3 ['a0dfe8a0-a0b9-443e-89e3-a8ba6596fa33'] NaN NaN \n", "4 ['c0935bbe-3eb4-4a21-9eee-45f380d1f26d'] [60.0, 40.0] 6299.0 \n", "\n", " player.name position.id position.name \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 Aleksey Miranchuk 18.0 Right Attacking Midfield \n", "\n", " pass.recipient.id pass.recipient.name pass.length pass.angle \\\n", "0 NaN NaN NaN NaN \n", "1 NaN NaN NaN NaN \n", "2 NaN NaN NaN NaN \n", "3 NaN NaN NaN NaN \n", "4 31917.0 Igor Diveev 22.357325 3.069967 \n", "\n", " pass.height.id pass.height.name pass.end_location pass.body_part.id \\\n", "0 NaN NaN NaN NaN \n", "1 NaN NaN NaN NaN \n", "2 NaN NaN NaN NaN \n", "3 NaN NaN NaN NaN \n", "4 1.0 Ground Pass [37.7, 41.6] 38.0 \n", "\n", " pass.body_part.name pass.type.id pass.type.name carry.end_location \\\n", "0 NaN NaN NaN NaN \n", "1 NaN NaN NaN NaN \n", "2 NaN NaN NaN NaN \n", "3 NaN NaN NaN NaN \n", "4 Left Foot 65.0 Kick Off NaN \n", "\n", " under_pressure duel.type.id duel.type.name pass.aerial_won counterpress \\\n", "0 NaN NaN NaN NaN NaN \n", "1 NaN NaN NaN NaN NaN \n", "2 NaN NaN NaN NaN NaN \n", "3 NaN NaN NaN NaN NaN \n", "4 NaN NaN NaN NaN NaN \n", "\n", " duel.outcome.id duel.outcome.name dribble.outcome.id dribble.outcome.name \\\n", "0 NaN NaN NaN NaN \n", "1 NaN NaN NaN NaN \n", "2 NaN NaN NaN NaN \n", "3 NaN NaN NaN NaN \n", "4 NaN NaN NaN NaN \n", "\n", " pass.outcome.id pass.outcome.name ball_receipt.outcome.id \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " ball_receipt.outcome.name interception.outcome.id \\\n", "0 NaN NaN \n", "1 NaN NaN \n", "2 NaN NaN \n", "3 NaN NaN \n", "4 NaN NaN \n", "\n", " interception.outcome.name shot.statsbomb_xg shot.end_location \\\n", "0 NaN NaN 
NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " shot.outcome.id shot.outcome.name shot.type.id shot.type.name \\\n", "0 NaN NaN NaN NaN \n", "1 NaN NaN NaN NaN \n", "2 NaN NaN NaN NaN \n", "3 NaN NaN NaN NaN \n", "4 NaN NaN NaN NaN \n", "\n", " shot.body_part.id shot.body_part.name shot.technique.id \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " shot.technique.name shot.freeze_frame goalkeeper.end_location \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " goalkeeper.type.id goalkeeper.type.name goalkeeper.position.id \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " goalkeeper.position.name out pass.outswinging pass.technique.id \\\n", "0 NaN NaN NaN NaN \n", "1 NaN NaN NaN NaN \n", "2 NaN NaN NaN NaN \n", "3 NaN NaN NaN NaN \n", "4 NaN NaN NaN NaN \n", "\n", " pass.technique.name clearance.head clearance.body_part.id \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " clearance.body_part.name pass.switch off_camera pass.cross \\\n", "0 NaN NaN NaN NaN \n", "1 NaN NaN NaN NaN \n", "2 NaN NaN NaN NaN \n", "3 NaN NaN NaN NaN \n", "4 NaN NaN NaN NaN \n", "\n", " clearance.left_foot dribble.overrun dribble.nutmeg clearance.right_foot \\\n", "0 NaN NaN NaN NaN \n", "1 NaN NaN NaN NaN \n", "2 NaN NaN NaN NaN \n", "3 NaN NaN NaN NaN \n", "4 NaN NaN NaN NaN \n", "\n", " pass.no_touch foul_committed.advantage foul_won.advantage \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " pass.assisted_shot_id pass.shot_assist shot.key_pass_id shot.first_time \\\n", "0 NaN NaN NaN NaN \n", "1 NaN NaN NaN NaN \n", "2 NaN NaN NaN NaN \n", "3 NaN NaN NaN NaN \n", "4 NaN NaN NaN NaN \n", "\n", " clearance.other 
pass.miscommunication clearance.aerial_won \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " pass.through_ball ball_recovery.recovery_failure goalkeeper.outcome.id \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " goalkeeper.outcome.name goalkeeper.body_part.id goalkeeper.body_part.name \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " shot.aerial_won foul_committed.card.id foul_committed.card.name \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " foul_committed.offensive foul_won.defensive substitution.outcome.id \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " substitution.outcome.name substitution.replacement.id \\\n", "0 NaN NaN \n", "1 NaN NaN \n", "2 NaN NaN \n", "3 NaN NaN \n", "4 NaN NaN \n", "\n", " substitution.replacement.name 50_50.outcome.id 50_50.outcome.name \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " pass.goal_assist goalkeeper.technique.id goalkeeper.technique.name \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " pass.cut_back miscontrol.aerial_won pass.straight foul_committed.type.id \\\n", "0 NaN NaN NaN NaN \n", "1 NaN NaN NaN NaN \n", "2 NaN NaN NaN NaN \n", "3 NaN NaN NaN NaN \n", "4 NaN NaN NaN NaN \n", "\n", " foul_committed.type.name match_id pass.inswinging pass.deflected \\\n", "0 NaN 3788753 NaN NaN \n", "1 NaN 3788753 NaN NaN \n", "2 NaN 3788753 NaN NaN \n", "3 NaN 3788753 NaN NaN \n", "4 NaN 3788753 NaN NaN \n", "\n", " injury_stoppage.in_chain shot.one_on_one bad_behaviour.card.id \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN 
\n", "\n", " bad_behaviour.card.name shot.deflected block.deflection \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " foul_committed.penalty foul_won.penalty block.save_block \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " goalkeeper.punched_out player_off.permanent shot.saved_off_target \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " goalkeeper.shot_saved_off_target shot.saved_to_post \\\n", "0 NaN NaN \n", "1 NaN NaN \n", "2 NaN NaN \n", "3 NaN NaN \n", "4 NaN NaN \n", "\n", " goalkeeper.shot_saved_to_post shot.open_goal \\\n", "0 NaN NaN \n", "1 NaN NaN \n", "2 NaN NaN \n", "3 NaN NaN \n", "4 NaN NaN \n", "\n", " goalkeeper.penalty_saved_to_post dribble.no_touch block.offensive \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " shot.follows_dribble ball_recovery.offensive shot.redirect \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "\n", " goalkeeper.lost_in_play goalkeeper.success_in_play match_date \\\n", "0 NaN NaN 2021-06-16 \n", "1 NaN NaN 2021-06-16 \n", "2 NaN NaN 2021-06-16 \n", "3 NaN NaN 2021-06-16 \n", "4 NaN NaN 2021-06-16 \n", "\n", " kick_off home_score away_score match_status match_status_360 \\\n", "0 15:00:00.000 0 1 available available \n", "1 15:00:00.000 0 1 available available \n", "2 15:00:00.000 0 1 available available \n", "3 15:00:00.000 0 1 available available \n", "4 15:00:00.000 0 1 available available \n", "\n", " last_updated last_updated_360 match_week \\\n", "0 2021-11-11T14:00:16.105809 2021-09-22T16:39:05.697512 2 \n", "1 2021-11-11T14:00:16.105809 2021-09-22T16:39:05.697512 2 \n", "2 2021-11-11T14:00:16.105809 2021-09-22T16:39:05.697512 2 \n", "3 2021-11-11T14:00:16.105809 2021-09-22T16:39:05.697512 2 \n", "4 
2021-11-11T14:00:16.105809 2021-09-22T16:39:05.697512 2 \n", "\n", " competition.competition_id competition.country_name \\\n", "0 55 Europe \n", "1 55 Europe \n", "2 55 Europe \n", "3 55 Europe \n", "4 55 Europe \n", "\n", " competition.competition_name season.season_id season.season_name \\\n", "0 UEFA Euro 43 2020 \n", "1 UEFA Euro 43 2020 \n", "2 UEFA Euro 43 2020 \n", "3 UEFA Euro 43 2020 \n", "4 UEFA Euro 43 2020 \n", "\n", " home_team.home_team_id home_team.home_team_name home_team.home_team_gender \\\n", "0 1835 Finland male \n", "1 1835 Finland male \n", "2 1835 Finland male \n", "3 1835 Finland male \n", "4 1835 Finland male \n", "\n", " home_team.home_team_group home_team.country.id home_team.country.name \\\n", "0 Group B 77 Finland \n", "1 Group B 77 Finland \n", "2 Group B 77 Finland \n", "3 Group B 77 Finland \n", "4 Group B 77 Finland \n", "\n", " home_team.managers away_team.away_team_id \\\n", "0 [{'id': 3622, 'name': 'Markku Kanerva', 'nickn... 796 \n", "1 [{'id': 3622, 'name': 'Markku Kanerva', 'nickn... 796 \n", "2 [{'id': 3622, 'name': 'Markku Kanerva', 'nickn... 796 \n", "3 [{'id': 3622, 'name': 'Markku Kanerva', 'nickn... 796 \n", "4 [{'id': 3622, 'name': 'Markku Kanerva', 'nickn... 796 \n", "\n", " away_team.away_team_name away_team.away_team_gender \\\n", "0 Russia male \n", "1 Russia male \n", "2 Russia male \n", "3 Russia male \n", "4 Russia male \n", "\n", " away_team.away_team_group away_team.country.id away_team.country.name \\\n", "0 Group B 188 Russia \n", "1 Group B 188 Russia \n", "2 Group B 188 Russia \n", "3 Group B 188 Russia \n", "4 Group B 188 Russia \n", "\n", " away_team.managers metadata.data_version \\\n", "0 [{'id': 365, 'name': 'Stanislav Cherchesov', '... 1.1.0 \n", "1 [{'id': 365, 'name': 'Stanislav Cherchesov', '... 1.1.0 \n", "2 [{'id': 365, 'name': 'Stanislav Cherchesov', '... 1.1.0 \n", "3 [{'id': 365, 'name': 'Stanislav Cherchesov', '... 1.1.0 \n", "4 [{'id': 365, 'name': 'Stanislav Cherchesov', '... 
1.1.0 \n", "\n", " metadata.shot_fidelity_version metadata.xy_fidelity_version \\\n", "0 2 2 \n", "1 2 2 \n", "2 2 2 \n", "3 2 2 \n", "4 2 2 \n", "\n", " competition_stage.id competition_stage.name stadium.id \\\n", "0 10 Group Stage 4726 \n", "1 10 Group Stage 4726 \n", "2 10 Group Stage 4726 \n", "3 10 Group Stage 4726 \n", "4 10 Group Stage 4726 \n", "\n", " stadium.name stadium.country.id stadium.country.name \\\n", "0 Saint-Petersburg Stadium 188 Russia \n", "1 Saint-Petersburg Stadium 188 Russia \n", "2 Saint-Petersburg Stadium 188 Russia \n", "3 Saint-Petersburg Stadium 188 Russia \n", "4 Saint-Petersburg Stadium 188 Russia \n", "\n", " referee.id referee.name referee.country.id \\\n", "0 293 Danny Desmond Makkelie 160 \n", "1 293 Danny Desmond Makkelie 160 \n", "2 293 Danny Desmond Makkelie 160 \n", "3 293 Danny Desmond Makkelie 160 \n", "4 293 Danny Desmond Makkelie 160 \n", "\n", " referee.country.name competition_id season_id country_name \\\n", "0 Netherlands 55 43 Europe \n", "1 Netherlands 55 43 Europe \n", "2 Netherlands 55 43 Europe \n", "3 Netherlands 55 43 Europe \n", "4 Netherlands 55 43 Europe \n", "\n", " competition_name competition_gender competition_youth \\\n", "0 UEFA Euro male False \n", "1 UEFA Euro male False \n", "2 UEFA Euro male False \n", "3 UEFA Euro male False \n", "4 UEFA Euro male False \n", "\n", " competition_international season_name match_updated \\\n", "0 True 2020 2021-11-11T14:00:16.105809 \n", "1 True 2020 2021-11-11T14:00:16.105809 \n", "2 True 2020 2021-11-11T14:00:16.105809 \n", "3 True 2020 2021-11-11T14:00:16.105809 \n", "4 True 2020 2021-11-11T14:00:16.105809 \n", "\n", " match_updated_360 match_available_360 \\\n", "0 2021-11-11T13:54:37.507376 2021-11-11T13:54:37.507376 \n", "1 2021-11-11T13:54:37.507376 2021-11-11T13:54:37.507376 \n", "2 2021-11-11T13:54:37.507376 2021-11-11T13:54:37.507376 \n", "3 2021-11-11T13:54:37.507376 2021-11-11T13:54:37.507376 \n", "4 2021-11-11T13:54:37.507376 2021-11-11T13:54:37.507376 
\n", "\n", " match_available \n", "0 2021-11-11T14:00:16.105809 \n", "1 2021-11-11T14:00:16.105809 \n", "2 2021-11-11T14:00:16.105809 \n", "3 2021-11-11T14:00:16.105809 \n", "4 2021-11-11T14:00:16.105809 " ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Display the first five rows of the DataFrame, df_sb_events_raw\n", "df_sb_events_raw.head()" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
level_0idindexperiodtimestampminutesecondpossessiondurationtype.idtype.namepossession_team.idpossession_team.nameplay_pattern.idplay_pattern.nameteam.idteam.nametactics.formationtactics.lineuprelated_eventslocationplayer.idplayer.nameposition.idposition.namepass.recipient.idpass.recipient.namepass.lengthpass.anglepass.height.idpass.height.namepass.end_locationpass.body_part.idpass.body_part.namepass.type.idpass.type.namecarry.end_locationunder_pressureduel.type.idduel.type.namepass.aerial_woncounterpressduel.outcome.idduel.outcome.namedribble.outcome.iddribble.outcome.namepass.outcome.idpass.outcome.nameball_receipt.outcome.idball_receipt.outcome.nameinterception.outcome.idinterception.outcome.nameshot.statsbomb_xgshot.end_locationshot.outcome.idshot.outcome.nameshot.type.idshot.type.nameshot.body_part.idshot.body_part.nameshot.technique.idshot.technique.nameshot.freeze_framegoalkeeper.end_locationgoalkeeper.type.idgoalkeeper.type.namegoalkeeper.position.idgoalkeeper.position.nameoutpass.outswingingpass.technique.idpass.technique.nameclearance.headclearance.body_part.idclearance.body_part.namepass.switchoff_camerapass.crossclearance.left_footdribble.overrundribble.nutmegclearance.right_footpass.no_touchfoul_committed.advantagefoul_won.advantagepass.assisted_shot_idpass.shot_assistshot.key_pass_idshot.first_timeclearance.otherpass.miscommunicationclearance.aerial_wonpass.through_ballball_recovery.recovery_failuregoalkeeper.outcome.idgoalkeeper.outcome.namegoalkeeper.body_part.idgoalkeeper.body_part.nameshot.aerial_wonfoul_committed.card.idfoul_committed.card.namefoul_committed.offensivefoul_won.defensivesubstitution.outcome.idsubstitution.outcome.namesubstitution.replacement.idsubstitution.replacement.name50_50.outcome.id50_50.outcome.namepass.goal_assistgoalkeeper.technique.idgoalkeeper.technique.namepass.cut_backmiscontrol.aerial_wonpass.straightfoul_committed.type.idfoul_committed.type.namematch_idpass.inswingingpass.deflectedinjury_stoppage.in_chainshot.one_on_on
ebad_behaviour.card.idbad_behaviour.card.nameshot.deflectedblock.deflectionfoul_committed.penaltyfoul_won.penaltyblock.save_blockgoalkeeper.punched_outplayer_off.permanentshot.saved_off_targetgoalkeeper.shot_saved_off_targetshot.saved_to_postgoalkeeper.shot_saved_to_postshot.open_goalgoalkeeper.penalty_saved_to_postdribble.no_touchblock.offensiveshot.follows_dribbleball_recovery.offensiveshot.redirectgoalkeeper.lost_in_playgoalkeeper.success_in_playmatch_datekick_offhome_scoreaway_scorematch_statusmatch_status_360last_updatedlast_updated_360match_weekcompetition.competition_idcompetition.country_namecompetition.competition_nameseason.season_idseason.season_namehome_team.home_team_idhome_team.home_team_namehome_team.home_team_genderhome_team.home_team_grouphome_team.country.idhome_team.country.namehome_team.managersaway_team.away_team_idaway_team.away_team_nameaway_team.away_team_genderaway_team.away_team_groupaway_team.country.idaway_team.country.nameaway_team.managersmetadata.data_versionmetadata.shot_fidelity_versionmetadata.xy_fidelity_versioncompetition_stage.idcompetition_stage.namestadium.idstadium.namestadium.country.idstadium.country.namereferee.idreferee.namereferee.country.idreferee.country.namecompetition_idseason_idcountry_namecompetition_namecompetition_gendercompetition_youthcompetition_internationalseason_namematch_updatedmatch_updated_360match_available_360match_available
19268129928aa68a18-79d2-4b8b-ba5f-9d3124f1cd202993200:50:23.15095231510.8709330Pass907Wales4From Throw In907WalesNaNNaN['a0bdd045-65d0-4601-9378-a5dacdba3257', 'dcbf...[110.7, 0.1]3086.0Ben Davies6.0Left Back6399.0Gareth Frank Bale6.1032781.2542272.0Low Pass[112.6, 5.9]NaNNaN67.0Throw-inNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN9.0IncompleteNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN3788744NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN2021-06-1215:00:00.00011availableavailable2021-06-20T12:57:59.2582021-09-22T16:38:18.433799155EuropeUEFA Euro432020907WalesmaleGroup A249WalesNaN773SwitzerlandmaleGroup A221SwitzerlandNaN1.1.02210Group Stage4549Bakı Olimpiya Stadionu16Azerbaijan76Clément Turpin78France5543EuropeUEFA EuromaleFalseTrue20202021-11-11T14:00:16.1058092021-11-11T13:54:37.5073762021-11-11T13:54:37.5073762021-11-11T14:00:16.105809
1926822993a0bdd045-65d0-4601-9378-a5dacdba32572994200:50:24.0219524151NaN42Ball Receipt*907Wales4From Throw In907WalesNaNNaN['8aa68a18-79d2-4b8b-ba5f-9d3124f1cd20'][113.7, 8.0]6399.0Gareth Frank Bale16.0Left MidfieldNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN9.0IncompleteNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN3788744NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN2021-06-1215:00:00.00011availableavailable2021-06-20T12:57:59.2582021-09-22T16:38:18.433799155EuropeUEFA Euro432020907WalesmaleGroup A249WalesNaN773SwitzerlandmaleGroup A221SwitzerlandNaN1.1.02210Group Stage4549Bakı Olimpiya Stadionu16Azerbaijan76Clément Turpin78France5543EuropeUEFA EuromaleFalseTrue20202021-11-11T14:00:16.1058092021-11-11T13:54:37.5073762021-11-11T13:54:37.5073762021-11-11T14:00:16.105809
1926832994dcbfc5e1-b22f-460f-8132-cf6fe7a590f82995200:50:24.02195241510.0000010Interception907Wales4From Throw In773SwitzerlandNaNNaN['8aa68a18-79d2-4b8b-ba5f-9d3124f1cd20'][7.5, 74.2]8814.0Nico Elvedi3.0Right Center BackNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN4.0WonNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN3788744NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN2021-06-1215:00:00.00011availableavailable2021-06-20T12:57:59.2582021-09-22T16:38:18.433799155EuropeUEFA Euro432020907WalesmaleGroup A249WalesNaN773SwitzerlandmaleGroup A221SwitzerlandNaN1.1.02210Group Stage4549Bakı Olimpiya Stadionu16Azerbaijan76Clément Turpin78France5543EuropeUEFA EuromaleFalseTrue20202021-11-11T14:00:16.1058092021-11-11T13:54:37.5073762021-11-11T13:54:37.5073762021-11-11T14:00:16.105809
192684299597cb188e-82e3-49cb-9ccf-9a5346f042b12996200:50:25.40495251510.0000034Half End907Wales4From Throw In773SwitzerlandNaNNaN['edfa93d5-a744-41a4-a10d-f1d09017011d']NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN3788744NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN2021-06-1215:00:00.00011availableavailable2021-06-20T12:57:59.2582021-09-22T16:38:18.433799155EuropeUEFA Euro432020907WalesmaleGroup A249WalesNaN773SwitzerlandmaleGroup A221SwitzerlandNaN1.1.02210Group Stage4549Bakı Olimpiya Stadionu16Azerbaijan76Clément Turpin78France5543EuropeUEFA EuromaleFalseTrue20202021-11-11T14:00:16.1058092021-11-11T13:54:37.5073762021-11-11T13:54:37.5073762021-11-11T14:00:16.105809
1926852996edfa93d5-a744-41a4-a10d-f1d09017011d2997200:50:25.40495251510.0000034Half End907Wales4From Throw In907WalesNaNNaN['97cb188e-82e3-49cb-9ccf-9a5346f042b1']NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN3788744NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN2021-06-1215:00:00.00011availableavailable2021-06-20T12:57:59.2582021-09-22T16:38:18.433799155EuropeUEFA Euro432020907WalesmaleGroup A249WalesNaN773SwitzerlandmaleGroup A221SwitzerlandNaN1.1.02210Group Stage4549Bakı Olimpiya Stadionu16Azerbaijan76Clément Turpin78France5543EuropeUEFA EuromaleFalseTrue20202021-11-11T14:00:16.1058092021-11-11T13:54:37.5073762021-11-11T13:54:37.5073762021-11-11T14:00:16.105809
\n", "
" ], "text/plain": [ " level_0 id index period \\\n", "192681 2992 8aa68a18-79d2-4b8b-ba5f-9d3124f1cd20 2993 2 \n", "192682 2993 a0bdd045-65d0-4601-9378-a5dacdba3257 2994 2 \n", "192683 2994 dcbfc5e1-b22f-460f-8132-cf6fe7a590f8 2995 2 \n", "192684 2995 97cb188e-82e3-49cb-9ccf-9a5346f042b1 2996 2 \n", "192685 2996 edfa93d5-a744-41a4-a10d-f1d09017011d 2997 2 \n", "\n", " timestamp minute second possession duration type.id \\\n", "192681 00:50:23.150 95 23 151 0.87093 30 \n", "192682 00:50:24.021 95 24 151 NaN 42 \n", "192683 00:50:24.021 95 24 151 0.00000 10 \n", "192684 00:50:25.404 95 25 151 0.00000 34 \n", "192685 00:50:25.404 95 25 151 0.00000 34 \n", "\n", " type.name possession_team.id possession_team.name \\\n", "192681 Pass 907 Wales \n", "192682 Ball Receipt* 907 Wales \n", "192683 Interception 907 Wales \n", "192684 Half End 907 Wales \n", "192685 Half End 907 Wales \n", "\n", " play_pattern.id play_pattern.name team.id team.name \\\n", "192681 4 From Throw In 907 Wales \n", "192682 4 From Throw In 907 Wales \n", "192683 4 From Throw In 773 Switzerland \n", "192684 4 From Throw In 773 Switzerland \n", "192685 4 From Throw In 907 Wales \n", "\n", " tactics.formation tactics.lineup \\\n", "192681 NaN NaN \n", "192682 NaN NaN \n", "192683 NaN NaN \n", "192684 NaN NaN \n", "192685 NaN NaN \n", "\n", " related_events location \\\n", "192681 ['a0bdd045-65d0-4601-9378-a5dacdba3257', 'dcbf... 
[110.7, 0.1] \n", "192682 ['8aa68a18-79d2-4b8b-ba5f-9d3124f1cd20'] [113.7, 8.0] \n", "192683 ['8aa68a18-79d2-4b8b-ba5f-9d3124f1cd20'] [7.5, 74.2] \n", "192684 ['edfa93d5-a744-41a4-a10d-f1d09017011d'] NaN \n", "192685 ['97cb188e-82e3-49cb-9ccf-9a5346f042b1'] NaN \n", "\n", " player.id player.name position.id position.name \\\n", "192681 3086.0 Ben Davies 6.0 Left Back \n", "192682 6399.0 Gareth Frank Bale 16.0 Left Midfield \n", "192683 8814.0 Nico Elvedi 3.0 Right Center Back \n", "192684 NaN NaN NaN NaN \n", "192685 NaN NaN NaN NaN \n", "\n", " pass.recipient.id pass.recipient.name pass.length pass.angle \\\n", "192681 6399.0 Gareth Frank Bale 6.103278 1.254227 \n", "192682 NaN NaN NaN NaN \n", "192683 NaN NaN NaN NaN \n", "192684 NaN NaN NaN NaN \n", "192685 NaN NaN NaN NaN \n", "\n", " pass.height.id pass.height.name pass.end_location pass.body_part.id \\\n", "192681 2.0 Low Pass [112.6, 5.9] NaN \n", "192682 NaN NaN NaN NaN \n", "192683 NaN NaN NaN NaN \n", "192684 NaN NaN NaN NaN \n", "192685 NaN NaN NaN NaN \n", "\n", " pass.body_part.name pass.type.id pass.type.name carry.end_location \\\n", "192681 NaN 67.0 Throw-in NaN \n", "192682 NaN NaN NaN NaN \n", "192683 NaN NaN NaN NaN \n", "192684 NaN NaN NaN NaN \n", "192685 NaN NaN NaN NaN \n", "\n", " under_pressure duel.type.id duel.type.name pass.aerial_won \\\n", "192681 NaN NaN NaN NaN \n", "192682 NaN NaN NaN NaN \n", "192683 NaN NaN NaN NaN \n", "192684 NaN NaN NaN NaN \n", "192685 NaN NaN NaN NaN \n", "\n", " counterpress duel.outcome.id duel.outcome.name dribble.outcome.id \\\n", "192681 NaN NaN NaN NaN \n", "192682 NaN NaN NaN NaN \n", "192683 NaN NaN NaN NaN \n", "192684 NaN NaN NaN NaN \n", "192685 NaN NaN NaN NaN \n", "\n", " dribble.outcome.name pass.outcome.id pass.outcome.name \\\n", "192681 NaN 9.0 Incomplete \n", "192682 NaN NaN NaN \n", "192683 NaN NaN NaN \n", "192684 NaN NaN NaN \n", "192685 NaN NaN NaN \n", "\n", " ball_receipt.outcome.id ball_receipt.outcome.name \\\n", "192681 NaN NaN \n", 
"192682 9.0 Incomplete \n", "192683 NaN NaN \n", "192684 NaN NaN \n", "192685 NaN NaN \n", "\n", " interception.outcome.id interception.outcome.name shot.statsbomb_xg \\\n", "192681 NaN NaN NaN \n", "192682 NaN NaN NaN \n", "192683 4.0 Won NaN \n", "192684 NaN NaN NaN \n", "192685 NaN NaN NaN \n", "\n", " shot.end_location shot.outcome.id shot.outcome.name shot.type.id \\\n", "192681 NaN NaN NaN NaN \n", "192682 NaN NaN NaN NaN \n", "192683 NaN NaN NaN NaN \n", "192684 NaN NaN NaN NaN \n", "192685 NaN NaN NaN NaN \n", "\n", " shot.type.name shot.body_part.id shot.body_part.name \\\n", "192681 NaN NaN NaN \n", "192682 NaN NaN NaN \n", "192683 NaN NaN NaN \n", "192684 NaN NaN NaN \n", "192685 NaN NaN NaN \n", "\n", " shot.technique.id shot.technique.name shot.freeze_frame \\\n", "192681 NaN NaN NaN \n", "192682 NaN NaN NaN \n", "192683 NaN NaN NaN \n", "192684 NaN NaN NaN \n", "192685 NaN NaN NaN \n", "\n", " goalkeeper.end_location goalkeeper.type.id goalkeeper.type.name \\\n", "192681 NaN NaN NaN \n", "192682 NaN NaN NaN \n", "192683 NaN NaN NaN \n", "192684 NaN NaN NaN \n", "192685 NaN NaN NaN \n", "\n", " goalkeeper.position.id goalkeeper.position.name out pass.outswinging \\\n", "192681 NaN NaN NaN NaN \n", "192682 NaN NaN NaN NaN \n", "192683 NaN NaN NaN NaN \n", "192684 NaN NaN NaN NaN \n", "192685 NaN NaN NaN NaN \n", "\n", " pass.technique.id pass.technique.name clearance.head \\\n", "192681 NaN NaN NaN \n", "192682 NaN NaN NaN \n", "192683 NaN NaN NaN \n", "192684 NaN NaN NaN \n", "192685 NaN NaN NaN \n", "\n", " clearance.body_part.id clearance.body_part.name pass.switch \\\n", "192681 NaN NaN NaN \n", "192682 NaN NaN NaN \n", "192683 NaN NaN NaN \n", "192684 NaN NaN NaN \n", "192685 NaN NaN NaN \n", "\n", " off_camera pass.cross clearance.left_foot dribble.overrun \\\n", "192681 NaN NaN NaN NaN \n", "192682 NaN NaN NaN NaN \n", "192683 NaN NaN NaN NaN \n", "192684 NaN NaN NaN NaN \n", "192685 NaN NaN NaN NaN \n", "\n", " dribble.nutmeg 
clearance.right_foot pass.no_touch \\\n", "192681 NaN NaN NaN \n", "192682 NaN NaN NaN \n", "192683 NaN NaN NaN \n", "192684 NaN NaN NaN \n", "192685 NaN NaN NaN \n", "\n", " foul_committed.advantage foul_won.advantage pass.assisted_shot_id \\\n", "192681 NaN NaN NaN \n", "192682 NaN NaN NaN \n", "192683 NaN NaN NaN \n", "192684 NaN NaN NaN \n", "192685 NaN NaN NaN \n", "\n", " pass.shot_assist shot.key_pass_id shot.first_time clearance.other \\\n", "192681 NaN NaN NaN NaN \n", "192682 NaN NaN NaN NaN \n", "192683 NaN NaN NaN NaN \n", "192684 NaN NaN NaN NaN \n", "192685 NaN NaN NaN NaN \n", "\n", " pass.miscommunication clearance.aerial_won pass.through_ball \\\n", "192681 NaN NaN NaN \n", "192682 NaN NaN NaN \n", "192683 NaN NaN NaN \n", "192684 NaN NaN NaN \n", "192685 NaN NaN NaN \n", "\n", " ball_recovery.recovery_failure goalkeeper.outcome.id \\\n", "192681 NaN NaN \n", "192682 NaN NaN \n", "192683 NaN NaN \n", "192684 NaN NaN \n", "192685 NaN NaN \n", "\n", " goalkeeper.outcome.name goalkeeper.body_part.id \\\n", "192681 NaN NaN \n", "192682 NaN NaN \n", "192683 NaN NaN \n", "192684 NaN NaN \n", "192685 NaN NaN \n", "\n", " goalkeeper.body_part.name shot.aerial_won foul_committed.card.id \\\n", "192681 NaN NaN NaN \n", "192682 NaN NaN NaN \n", "192683 NaN NaN NaN \n", "192684 NaN NaN NaN \n", "192685 NaN NaN NaN \n", "\n", " foul_committed.card.name foul_committed.offensive foul_won.defensive \\\n", "192681 NaN NaN NaN \n", "192682 NaN NaN NaN \n", "192683 NaN NaN NaN \n", "192684 NaN NaN NaN \n", "192685 NaN NaN NaN \n", "\n", " substitution.outcome.id substitution.outcome.name \\\n", "192681 NaN NaN \n", "192682 NaN NaN \n", "192683 NaN NaN \n", "192684 NaN NaN \n", "192685 NaN NaN \n", "\n", " substitution.replacement.id substitution.replacement.name \\\n", "192681 NaN NaN \n", "192682 NaN NaN \n", "192683 NaN NaN \n", "192684 NaN NaN \n", "192685 NaN NaN \n", "\n", " 50_50.outcome.id 50_50.outcome.name pass.goal_assist \\\n", "192681 NaN NaN NaN \n", 
"192682 NaN NaN NaN \n", "192683 NaN NaN NaN \n", "192684 NaN NaN NaN \n", "192685 NaN NaN NaN \n", "\n", " goalkeeper.technique.id goalkeeper.technique.name pass.cut_back \\\n", "192681 NaN NaN NaN \n", "192682 NaN NaN NaN \n", "192683 NaN NaN NaN \n", "192684 NaN NaN NaN \n", "192685 NaN NaN NaN \n", "\n", " miscontrol.aerial_won pass.straight foul_committed.type.id \\\n", "192681 NaN NaN NaN \n", "192682 NaN NaN NaN \n", "192683 NaN NaN NaN \n", "192684 NaN NaN NaN \n", "192685 NaN NaN NaN \n", "\n", " foul_committed.type.name match_id pass.inswinging pass.deflected \\\n", "192681 NaN 3788744 NaN NaN \n", "192682 NaN 3788744 NaN NaN \n", "192683 NaN 3788744 NaN NaN \n", "192684 NaN 3788744 NaN NaN \n", "192685 NaN 3788744 NaN NaN \n", "\n", " injury_stoppage.in_chain shot.one_on_one bad_behaviour.card.id \\\n", "192681 NaN NaN NaN \n", "192682 NaN NaN NaN \n", "192683 NaN NaN NaN \n", "192684 NaN NaN NaN \n", "192685 NaN NaN NaN \n", "\n", " bad_behaviour.card.name shot.deflected block.deflection \\\n", "192681 NaN NaN NaN \n", "192682 NaN NaN NaN \n", "192683 NaN NaN NaN \n", "192684 NaN NaN NaN \n", "192685 NaN NaN NaN \n", "\n", " foul_committed.penalty foul_won.penalty block.save_block \\\n", "192681 NaN NaN NaN \n", "192682 NaN NaN NaN \n", "192683 NaN NaN NaN \n", "192684 NaN NaN NaN \n", "192685 NaN NaN NaN \n", "\n", " goalkeeper.punched_out player_off.permanent shot.saved_off_target \\\n", "192681 NaN NaN NaN \n", "192682 NaN NaN NaN \n", "192683 NaN NaN NaN \n", "192684 NaN NaN NaN \n", "192685 NaN NaN NaN \n", "\n", " goalkeeper.shot_saved_off_target shot.saved_to_post \\\n", "192681 NaN NaN \n", "192682 NaN NaN \n", "192683 NaN NaN \n", "192684 NaN NaN \n", "192685 NaN NaN \n", "\n", " goalkeeper.shot_saved_to_post shot.open_goal \\\n", "192681 NaN NaN \n", "192682 NaN NaN \n", "192683 NaN NaN \n", "192684 NaN NaN \n", "192685 NaN NaN \n", "\n", " goalkeeper.penalty_saved_to_post dribble.no_touch block.offensive \\\n", "192681 NaN NaN NaN \n", 
"192682 NaN NaN NaN \n", "192683 NaN NaN NaN \n", "192684 NaN NaN NaN \n", "192685 NaN NaN NaN \n", "\n", " shot.follows_dribble ball_recovery.offensive shot.redirect \\\n", "192681 NaN NaN NaN \n", "192682 NaN NaN NaN \n", "192683 NaN NaN NaN \n", "192684 NaN NaN NaN \n", "192685 NaN NaN NaN \n", "\n", " goalkeeper.lost_in_play goalkeeper.success_in_play match_date \\\n", "192681 NaN NaN 2021-06-12 \n", "192682 NaN NaN 2021-06-12 \n", "192683 NaN NaN 2021-06-12 \n", "192684 NaN NaN 2021-06-12 \n", "192685 NaN NaN 2021-06-12 \n", "\n", " kick_off home_score away_score match_status match_status_360 \\\n", "192681 15:00:00.000 1 1 available available \n", "192682 15:00:00.000 1 1 available available \n", "192683 15:00:00.000 1 1 available available \n", "192684 15:00:00.000 1 1 available available \n", "192685 15:00:00.000 1 1 available available \n", "\n", " last_updated last_updated_360 match_week \\\n", "192681 2021-06-20T12:57:59.258 2021-09-22T16:38:18.433799 1 \n", "192682 2021-06-20T12:57:59.258 2021-09-22T16:38:18.433799 1 \n", "192683 2021-06-20T12:57:59.258 2021-09-22T16:38:18.433799 1 \n", "192684 2021-06-20T12:57:59.258 2021-09-22T16:38:18.433799 1 \n", "192685 2021-06-20T12:57:59.258 2021-09-22T16:38:18.433799 1 \n", "\n", " competition.competition_id competition.country_name \\\n", "192681 55 Europe \n", "192682 55 Europe \n", "192683 55 Europe \n", "192684 55 Europe \n", "192685 55 Europe \n", "\n", " competition.competition_name season.season_id season.season_name \\\n", "192681 UEFA Euro 43 2020 \n", "192682 UEFA Euro 43 2020 \n", "192683 UEFA Euro 43 2020 \n", "192684 UEFA Euro 43 2020 \n", "192685 UEFA Euro 43 2020 \n", "\n", " home_team.home_team_id home_team.home_team_name \\\n", "192681 907 Wales \n", "192682 907 Wales \n", "192683 907 Wales \n", "192684 907 Wales \n", "192685 907 Wales \n", "\n", " home_team.home_team_gender home_team.home_team_group \\\n", "192681 male Group A \n", "192682 male Group A \n", "192683 male Group A \n", "192684 
male Group A \n", "192685 male Group A \n", "\n", " home_team.country.id home_team.country.name home_team.managers \\\n", "192681 249 Wales NaN \n", "192682 249 Wales NaN \n", "192683 249 Wales NaN \n", "192684 249 Wales NaN \n", "192685 249 Wales NaN \n", "\n", " away_team.away_team_id away_team.away_team_name \\\n", "192681 773 Switzerland \n", "192682 773 Switzerland \n", "192683 773 Switzerland \n", "192684 773 Switzerland \n", "192685 773 Switzerland \n", "\n", " away_team.away_team_gender away_team.away_team_group \\\n", "192681 male Group A \n", "192682 male Group A \n", "192683 male Group A \n", "192684 male Group A \n", "192685 male Group A \n", "\n", " away_team.country.id away_team.country.name away_team.managers \\\n", "192681 221 Switzerland NaN \n", "192682 221 Switzerland NaN \n", "192683 221 Switzerland NaN \n", "192684 221 Switzerland NaN \n", "192685 221 Switzerland NaN \n", "\n", " metadata.data_version metadata.shot_fidelity_version \\\n", "192681 1.1.0 2 \n", "192682 1.1.0 2 \n", "192683 1.1.0 2 \n", "192684 1.1.0 2 \n", "192685 1.1.0 2 \n", "\n", " metadata.xy_fidelity_version competition_stage.id \\\n", "192681 2 10 \n", "192682 2 10 \n", "192683 2 10 \n", "192684 2 10 \n", "192685 2 10 \n", "\n", " competition_stage.name stadium.id stadium.name \\\n", "192681 Group Stage 4549 Bakı Olimpiya Stadionu \n", "192682 Group Stage 4549 Bakı Olimpiya Stadionu \n", "192683 Group Stage 4549 Bakı Olimpiya Stadionu \n", "192684 Group Stage 4549 Bakı Olimpiya Stadionu \n", "192685 Group Stage 4549 Bakı Olimpiya Stadionu \n", "\n", " stadium.country.id stadium.country.name referee.id referee.name \\\n", "192681 16 Azerbaijan 76 Clément Turpin \n", "192682 16 Azerbaijan 76 Clément Turpin \n", "192683 16 Azerbaijan 76 Clément Turpin \n", "192684 16 Azerbaijan 76 Clément Turpin \n", "192685 16 Azerbaijan 76 Clément Turpin \n", "\n", " referee.country.id referee.country.name competition_id season_id \\\n", "192681 78 France 55 43 \n", "192682 78 France 55 43 
\n", "192683 78 France 55 43 \n", "192684 78 France 55 43 \n", "192685 78 France 55 43 \n", "\n", " country_name competition_name competition_gender competition_youth \\\n", "192681 Europe UEFA Euro male False \n", "192682 Europe UEFA Euro male False \n", "192683 Europe UEFA Euro male False \n", "192684 Europe UEFA Euro male False \n", "192685 Europe UEFA Euro male False \n", "\n", " competition_international season_name match_updated \\\n", "192681 True 2020 2021-11-11T14:00:16.105809 \n", "192682 True 2020 2021-11-11T14:00:16.105809 \n", "192683 True 2020 2021-11-11T14:00:16.105809 \n", "192684 True 2020 2021-11-11T14:00:16.105809 \n", "192685 True 2020 2021-11-11T14:00:16.105809 \n", "\n", " match_updated_360 match_available_360 \\\n", "192681 2021-11-11T13:54:37.507376 2021-11-11T13:54:37.507376 \n", "192682 2021-11-11T13:54:37.507376 2021-11-11T13:54:37.507376 \n", "192683 2021-11-11T13:54:37.507376 2021-11-11T13:54:37.507376 \n", "192684 2021-11-11T13:54:37.507376 2021-11-11T13:54:37.507376 \n", "192685 2021-11-11T13:54:37.507376 2021-11-11T13:54:37.507376 \n", "\n", " match_available \n", "192681 2021-11-11T14:00:16.105809 \n", "192682 2021-11-11T14:00:16.105809 \n", "192683 2021-11-11T14:00:16.105809 \n", "192684 2021-11-11T14:00:16.105809 \n", "192685 2021-11-11T14:00:16.105809 " ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Display the last five rows of the DataFrame, df_sb_events_raw\n", "df_sb_events_raw.tail()" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(192686, 197)\n" ] } ], "source": [ "# Print the shape of the DataFrame, df_sb_events_raw\n", "print(df_sb_events_raw.shape)" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Index(['level_0', 'id', 'index', 'period', 'timestamp', 'minute', 'second',\n", " 'possession', 
'duration', 'type.id',\n", " ...\n", " 'country_name', 'competition_name', 'competition_gender',\n", " 'competition_youth', 'competition_international', 'season_name',\n", " 'match_updated', 'match_updated_360', 'match_available_360',\n", " 'match_available'],\n", " dtype='object', length=197)\n" ] } ], "source": [ "# Print the column names of the DataFrame, df_sb_events_raw\n", "print(df_sb_events_raw.columns)" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "level_0 int64\n", "id object\n", "index int64\n", "period int64\n", "timestamp object\n", " ... \n", "season_name int64\n", "match_updated object\n", "match_updated_360 object\n", "match_available_360 object\n", "match_available object\n", "Length: 197, dtype: object" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Data types of the features of the raw DataFrame, df_sb_events_raw\n", "df_sb_events_raw.dtypes" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "level_0 int64\n", "id object\n", "index int64\n", "period int64\n", "timestamp object\n", "minute int64\n", "second int64\n", "possession int64\n", "duration float64\n", "type.id int64\n", "type.name object\n", "possession_team.id int64\n", "possession_team.name object\n", "play_pattern.id int64\n", "play_pattern.name object\n", "team.id int64\n", "team.name object\n", "tactics.formation float64\n", "tactics.lineup object\n", "related_events object\n", "location object\n", "player.id float64\n", "player.name object\n", "position.id float64\n", "position.name object\n", "pass.recipient.id float64\n", "pass.recipient.name object\n", "pass.length float64\n", "pass.angle float64\n", "pass.height.id float64\n", "pass.height.name object\n", "pass.end_location object\n", "pass.body_part.id float64\n", "pass.body_part.name object\n", "pass.type.id float64\n", 
"pass.type.name object\n", "carry.end_location object\n", "under_pressure object\n", "duel.type.id float64\n", "duel.type.name object\n", "pass.aerial_won object\n", "counterpress object\n", "duel.outcome.id float64\n", "duel.outcome.name object\n", "dribble.outcome.id float64\n", "dribble.outcome.name object\n", "pass.outcome.id float64\n", "pass.outcome.name object\n", "ball_receipt.outcome.id float64\n", "ball_receipt.outcome.name object\n", "interception.outcome.id float64\n", "interception.outcome.name object\n", "shot.statsbomb_xg float64\n", "shot.end_location object\n", "shot.outcome.id float64\n", "shot.outcome.name object\n", "shot.type.id float64\n", "shot.type.name object\n", "shot.body_part.id float64\n", "shot.body_part.name object\n", "shot.technique.id float64\n", "shot.technique.name object\n", "shot.freeze_frame object\n", "goalkeeper.end_location object\n", "goalkeeper.type.id float64\n", "goalkeeper.type.name object\n", "goalkeeper.position.id float64\n", "goalkeeper.position.name object\n", "out object\n", "pass.outswinging object\n", "pass.technique.id float64\n", "pass.technique.name object\n", "clearance.head object\n", "clearance.body_part.id float64\n", "clearance.body_part.name object\n", "pass.switch object\n", "off_camera object\n", "pass.cross object\n", "clearance.left_foot object\n", "dribble.overrun object\n", "dribble.nutmeg object\n", "clearance.right_foot object\n", "pass.no_touch object\n", "foul_committed.advantage object\n", "foul_won.advantage object\n", "pass.assisted_shot_id object\n", "pass.shot_assist object\n", "shot.key_pass_id object\n", "shot.first_time object\n", "clearance.other object\n", "pass.miscommunication object\n", "clearance.aerial_won object\n", "pass.through_ball object\n", "ball_recovery.recovery_failure object\n", "goalkeeper.outcome.id float64\n", "goalkeeper.outcome.name object\n", "goalkeeper.body_part.id float64\n", "goalkeeper.body_part.name object\n", "shot.aerial_won object\n", 
"foul_committed.card.id float64\n", "foul_committed.card.name object\n", "foul_committed.offensive object\n", "foul_won.defensive object\n", "substitution.outcome.id float64\n", "substitution.outcome.name object\n", "substitution.replacement.id float64\n", "substitution.replacement.name object\n", "50_50.outcome.id float64\n", "50_50.outcome.name object\n", "pass.goal_assist object\n", "goalkeeper.technique.id float64\n", "goalkeeper.technique.name object\n", "pass.cut_back object\n", "miscontrol.aerial_won object\n", "pass.straight object\n", "foul_committed.type.id float64\n", "foul_committed.type.name object\n", "match_id int64\n", "pass.inswinging object\n", "pass.deflected object\n", "injury_stoppage.in_chain object\n", "shot.one_on_one object\n", "bad_behaviour.card.id float64\n", "bad_behaviour.card.name object\n", "shot.deflected object\n", "block.deflection object\n", "foul_committed.penalty object\n", "foul_won.penalty object\n", "block.save_block object\n", "goalkeeper.punched_out object\n", "player_off.permanent object\n", "shot.saved_off_target object\n", "goalkeeper.shot_saved_off_target object\n", "shot.saved_to_post object\n", "goalkeeper.shot_saved_to_post object\n", "shot.open_goal object\n", "goalkeeper.penalty_saved_to_post object\n", "dribble.no_touch object\n", "block.offensive object\n", "shot.follows_dribble object\n", "ball_recovery.offensive object\n", "shot.redirect object\n", "goalkeeper.lost_in_play object\n", "goalkeeper.success_in_play object\n", "match_date object\n", "kick_off object\n", "home_score int64\n", "away_score int64\n", "match_status object\n", "match_status_360 object\n", "last_updated object\n", "last_updated_360 object\n", "match_week int64\n", "competition.competition_id int64\n", "competition.country_name object\n", "competition.competition_name object\n", "season.season_id int64\n", "season.season_name int64\n", "home_team.home_team_id int64\n", "home_team.home_team_name object\n", "home_team.home_team_gender 
object\n", "home_team.home_team_group object\n", "home_team.country.id int64\n", "home_team.country.name object\n", "home_team.managers object\n", "away_team.away_team_id int64\n", "away_team.away_team_name object\n", "away_team.away_team_gender object\n", "away_team.away_team_group object\n", "away_team.country.id int64\n", "away_team.country.name object\n", "away_team.managers object\n", "metadata.data_version object\n", "metadata.shot_fidelity_version int64\n", "metadata.xy_fidelity_version int64\n", "competition_stage.id int64\n", "competition_stage.name object\n", "stadium.id int64\n", "stadium.name object\n", "stadium.country.id int64\n", "stadium.country.name object\n", "referee.id int64\n", "referee.name object\n", "referee.country.id int64\n", "referee.country.name object\n", "competition_id int64\n", "season_id int64\n", "country_name object\n", "competition_name object\n", "competition_gender object\n", "competition_youth bool\n", "competition_international bool\n", "season_name int64\n", "match_updated object\n", "match_updated_360 object\n", "match_available_360 object\n", "match_available object\n", "dtype: object\n" ] } ], "source": [ "# Displays all columns\n", "with pd.option_context('display.max_rows', None, 'display.max_columns', None):\n", " print(df_sb_events_raw.dtypes)" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Plot visualisation of the missing values for each feature of the raw DataFrame, df_shots_raw\n", "msno.matrix(df_sb_events_raw, figsize = (30, 7))" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "duration 52722\n", "tactics.formation 192459\n", "tactics.lineup 192459\n", "related_events 7079\n", "location 1530\n", " ... \n", "goalkeeper.success_in_play 192684\n", "home_team.home_team_group 63094\n", "home_team.managers 12516\n", "away_team.away_team_group 63094\n", "away_team.managers 12516\n", "Length: 131, dtype: int64" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Counts of missing values\n", "null_value_stats = df_sb_events_raw.isnull().sum(axis=0)\n", "null_value_stats[null_value_stats != 0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "\n", "\n", "## 4. Data Engineering\n", "The next ext step is to wrangle the dataset to into a format that’s suitable for analysis, through the creation of bespoke in-possession and out-of-possession metrics.\n", "\n", "This section is broken down into the following subsections:\n", "\n", "4.1. [Assign Raw DataFrame to Engineered DataFrame](#section4.1)
\n", "4.2. [Clean Column Names](#section4.2)
\n", "4.3. [Sort the DataFrame](#section4.3)
\n", "4.4. [Determine Each Player's Most Frequent Position](#section4.4)
\n", "4.5. [Determine Each Player's Total Minutes Played](#section4.5)
\n", "4.6. [Break Down All location Attributes](#section4.6)
\n", "4.7. [Create New Attributes](#section4.7)
\n", "4.8. [Fill Null Values](#section4.8)
\n", "4.9. [Export Events Dataset](#section4.9)
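
The opening steps of this section (taking a working copy of the raw DataFrame, replacing dots in column names, and ordering events chronologically) can be sketched on a toy DataFrame. This is a minimal illustration under stated assumptions rather than the notebook's own code: the three rows are invented, the names `df_toy_raw` and `df_toy` are hypothetical, and including `period` in the sort key is an assumption made here because StatsBomb timestamps restart at the beginning of each period.

```python
import pandas as pd

# Toy stand-in for the raw events DataFrame (invented rows, not real StatsBomb data)
df_toy_raw = pd.DataFrame({
    'match_date': ['2021-06-12', '2021-06-11', '2021-06-12'],
    'period': [2, 1, 1],
    'timestamp': ['00:00:05.000', '00:10:00.000', '00:44:59.000'],
    'type.name': ['Pass', 'Shot', 'Carry'],
})

# Take an explicit copy so that later edits leave the raw DataFrame untouched
df_toy = df_toy_raw.copy()

# Replace dots with underscores, matching the dot literally (regex=False)
df_toy.columns = df_toy.columns.str.replace('.', '_', regex=False)

# Order events chronologically: by date, then period, then time within the period
df_toy = (df_toy
          .sort_values(['match_date', 'period', 'timestamp'])
          .reset_index(drop=True)
          )

print(df_toy)
```

Sorting on `timestamp` without `period` would place the second-half pass at 00:00:05 ahead of the first-half carry at 00:44:59, which is why the sketch sorts on `period` first.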
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 4.1. Assign Raw DataFrame to Engineered DataFrame" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [], "source": [ "# Assign Raw DataFrame to Engineered DataFrame\n", "df_sb_events = df_sb_events_raw" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 4.2. Clean Column Names\n", "Remove dots (.) from column names." ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/opt/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:1: FutureWarning: The default value of regex will change from True to False in a future version.\n", " \"\"\"Entry point for launching an IPython kernel.\n" ] } ], "source": [ "# Replace dots (.) in column names with an underscore (_)\n", "df_sb_events.columns = df_sb_events.columns.str.replace('[.]', '_')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 4.3. Sort DataFrame\n", "Sort DataFrame into correct order of events by time and date, required for creating accurate features." ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [], "source": [ "# Sort DataFrame\n", "\n", "## Create a 'Full_Fixture Data' attribute from the date, teams, and goals scored\n", "df_sb_events['Full_Fixture_Date'] = df_sb_events['match_date'].astype(str) + ' ' + df_sb_events['home_team_home_team_name'].astype(str) + ' ' + df_sb_events['home_score'].astype(str) + ' ' + ' vs. ' + ' ' + df_sb_events['away_score'].astype(str) + ' ' + df_sb_events['away_team_away_team_name'].astype(str)\n", "\n", "## Sort the DataFrame by the newly created 'Full_Fixture_Date' attribute\n", "df_sb_events = df_sb_events.sort_values(['Full_Fixture_Date', 'match_date', 'timestamp'], ascending=[True, True, True])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 4.4. 
Determine Each Player's Most Frequent Playing Position\n", "A player's primary position is taken to be the position in which the player records the most events in the Events data, i.e. the position with the highest event count.\n", "\n", "The positions determined below are used as each player's primary position." ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
" ], "text/plain": [ " player_name team_name primary_position_name\n", "2 Aaron Ramsey Wales Left Center Midfield\n", "3 Adam Hložek Czech Republic Left Wing\n", "6 Adama Traoré Diarra Spain Right Back\n", "8 Admir Mehmedi Switzerland Left Center Forward\n", "11 Adrien Rabiot France Left Center Midfield" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Determine Each Player's Most Frequent Playing Position\n", "\n", "## Group by player, team, and position, and count the events recorded in each position\n", "df_sb_player_positions = (df_sb_events\n", " .groupby(['player_name', 'team_name', 'position_name'])\n", " .agg({'type_name': 'count'})\n", " .reset_index()\n", " )\n", "\n", "## Rename columns after groupby and aggregation\n", "df_sb_player_positions.columns = ['player_name', 'team_name', 'primary_position_name', 'count']\n", "\n", "## Drop level\n", "#df_sb_player_positions.columns = df_sb_player_positions.columns.droplevel(level=0)\n", "\n", "## Reset index (this adds an 'index' column, which is dropped below)\n", "df_sb_player_positions = df_sb_player_positions.reset_index()\n", "\n", "## Sort by player name, then by event count descending\n", "df_sb_player_positions = df_sb_player_positions.sort_values(['player_name', 'count'], ascending=[True, False])\n", "\n", "## Keep the most frequent position per player and team, then drop the helper columns\n", "df_sb_player_positions = (df_sb_player_positions\n", " .groupby(['player_name', 'team_name']).head(1)\n", " .drop(['index', 'count'], axis=1)\n", " )\n", "\n", "## Display DataFrame\n", "df_sb_player_positions.head(5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Aggregate the positions into Goalkeepers, Defenders, Midfielders, and Forwards." 
] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([nan, 'Right Center Midfield', 'Right Back', 'Center Forward',\n", " 'Center Defensive Midfield', 'Left Center Back', 'Left Back',\n", " 'Right Midfield', 'Right Center Back', 'Goalkeeper', 'Left Wing',\n", " 'Left Midfield', 'Left Center Midfield', 'Right Wing',\n", " 'Right Defensive Midfield', 'Center Attacking Midfield',\n", " 'Left Defensive Midfield', 'Left Wing Back', 'Center Back',\n", " 'Right Wing Back', 'Right Attacking Midfield',\n", " 'Left Attacking Midfield', 'Left Center Forward',\n", " 'Right Center Forward'], dtype=object)" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Show all unique values for position\n", "df_sb_events['position_name'].unique()" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [], "source": [ "# Map a defined dictionary of grouped positions, per specific position\n", "\n", "## Define a dictionary of positions\n", "dict_positions_grouped = {'Goalkeeper': 'Goalkeeper',\n", " 'Left Center Back': 'Defender',\n", " 'Center Back': 'Defender',\n", " 'Right Center Back': 'Defender',\n", " 'Left Back': 'Defender',\n", " 'Right Back': 'Defender',\n", " 'Left Wing Back': 'Defender',\n", " 'Right Wing Back': 'Defender',\n", " 'Left Defensive Midfield': 'Midfield',\n", " 'Center Defensive Midfield': 'Midfield',\n", " 'Right Defensive Midfield': 'Midfield',\n", " 'Left Center Midfield': 'Midfield',\n", " 'Center Midfield': 'Midfield',\n", " 'Right Center Midfield': 'Midfield',\n", " 'Left Midfield': 'Midfield',\n", " 'Right Midfield': 'Midfield',\n", " 'Left Attacking Midfield': 'Midfield',\n", " 'Right Attacking Midfield': 'Midfield',\n", " 'Center Attacking Midfield': 'Midfield',\n", " 'Left Center Forward': 'Forward',\n", " 'Center Forward': 'Forward',\n", " 'Right Center Forward': 'Forward',\n", " 'Left Wing': 'Forward',\n", " 'Right Wing': 
'Forward',\n", " 'Secondary Striker': 'Forward'\n", " }\n", "\n", "## Map grouped positions to DataFrame\n", "df_sb_player_positions['primary_position_name_grouped'] = df_sb_player_positions['primary_position_name'].map(dict_positions_grouped)" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array(['Midfield', 'Forward', 'Defender', 'Goalkeeper'], dtype=object)" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Show all unique values for the grouped position\n", "df_sb_player_positions['primary_position_name_grouped'].unique()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, create an `outfielder_goalkeeper` attribute." ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [], "source": [ "# Separate Goalkeepers and Outfielders\n", "df_sb_player_positions['outfielder_goalkeeper'] = np.where(df_sb_player_positions['primary_position_name'].isnull(), np.nan, (np.where(df_sb_player_positions['primary_position_name'] == 'Goalkeeper', 'Goalkeeper', 'Outfielder')))" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [], "source": [ "# Export DataFrame as a CSV file\n", "\n", "## Write the CSV file only if it does not already exist\n", "if not os.path.exists(os.path.join(data_dir_sb, 'engineered', 'combined', 'sb_360', 'sb_events_grouped_position.csv')):\n", " df_sb_player_positions.to_csv(os.path.join(data_dir_sb, 'engineered', 'combined', 'sb_360', 'sb_events_grouped_position.csv'), index=None, header=True)\n", "\n", "## Otherwise, leave the existing file untouched\n", "else:\n", " pass" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 4.5. Determine Each Player's Total Minutes" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
" ], "text/plain": [ " player_name mins_total\n", "144 Gianluigi Donnarumma 729\n", "197 Jorge Luiz Frello Filho 721\n", "195 Jordan Pickford 699\n", "250 Leonardo Bonucci 681\n", "190 John Stones 679" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Determine Each Player's Total Minutes Played\n", "\n", "## Group by player and fixture, and take the first and last minute in which the player records an event\n", "df_sb_player_minutes = (df_sb_events\n", " .groupby(['player_name', 'Full_Fixture_Date'])\n", " .agg({'minute': ['min', 'max']})\n", " )\n", "\n", "## Drop the top level of the column MultiIndex, leaving 'min' and 'max'\n", "df_sb_player_minutes.columns = df_sb_player_minutes.columns.droplevel(level=0)\n", "\n", "## Reset index\n", "df_sb_player_minutes = df_sb_player_minutes.reset_index()\n", "\n", "\n", "## Treat a first event within the opening 5 minutes as a start from minute 0\n", "df_sb_player_minutes['min'] = np.where(df_sb_player_minutes['min'] <= 5, 0, df_sb_player_minutes['min']) \n", "\n", "## Determine the total minutes played per match\n", "df_sb_player_minutes['mins_total'] = df_sb_player_minutes['max'] - df_sb_player_minutes['min'] \n", "\n", "## Sum the total minutes played across all matches\n", "df_sb_player_minutes = (df_sb_player_minutes\n", " .groupby(['player_name'])\n", " .agg({'mins_total': ['sum']})\n", " )\n", "\n", "## Reset index\n", "df_sb_player_minutes = df_sb_player_minutes.reset_index()\n", "\n", "## Rename columns after groupby and aggregation\n", "df_sb_player_minutes.columns = ['player_name', 'mins_total']\n", "\n", "## Sort by 'mins_total' descending\n", "df_sb_player_minutes = df_sb_player_minutes.sort_values(['mins_total'], ascending=[False])\n", "\n", "## Display DataFrame\n", "df_sb_player_minutes.head(5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 4.6. 
Break Down All `location` Attributes\n", "Separate all location attributes for X, Y (and sometimes Z) coordinates" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "location\n", "pass_end_location\n", "carry_end_location\n", "shot_end_location\n", "goalkeeper_end_location\n" ] } ], "source": [ "# Display all location columns\n", "for col in df_sb_events.columns:\n", " if 'location' in col:\n", " print(col)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There are the following five 'location' attributes:\n", "- `location`\n", "- `pass.end_location`\n", "- `carry.end_location`\n", "- `shot.end_location`\n", "- `goalkeeper.end_location`\n", "\n", "From reviewing the official documentation [[link](https://statsbomb.com/stat-definitions/)], the five attributes have the following dimensionality:\n", "- `location` [x, y]\n", "- `pass.end_location` [x, y]\n", "- `carry.end_location` [x, y]\n", "- `shot.end_location` [x, y, z]\n", "- `goalkeeper.end_location` [x, y]" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "\"\\n# CURRENTLY NOT WORKING, NEED TO FIX\\n\\n# Normalize 'shot.freeze_frame' attribute - see: https://stackoverflow.com/questions/52795561/flattening-nested-json-in-pandas-data-frame\\n\\n## explode all columns with lists of dicts\\ndf_sb_events_normalize = df_sb_events.apply(lambda x: x.explode()).reset_index(drop=True)\\n\\n## list of columns with dicts\\ncols_to_normalize = ['shot.freeze_frame']\\n\\n## if there are keys, which will become column names, overlap with excising column names. 
add the current column name as a prefix\\nnormalized = list()\\n\\nfor col in cols_to_normalize:\\n    d = pd.json_normalize(df_sb_events_normalize[col], sep='_')\\n    d.columns = [f'{col}_{v}' for v in d.columns]\\n    normalized.append(d.copy())\\n\\n## combine df with the normalized columns\\ndf_sb_events_normalize = pd.concat([df_sb_events_normalize] + normalized, axis=1).drop(columns=cols_to_normalize)\\n\\n## display(df_lineup_select_normalize)\\ndf_sb_events_normalize.head(30)\\n\"" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "\"\"\"\n", "# CURRENTLY NOT WORKING, NEED TO FIX\n", "\n", "# Normalize 'shot.freeze_frame' attribute - see: https://stackoverflow.com/questions/52795561/flattening-nested-json-in-pandas-data-frame\n", "\n", "## explode all columns with lists of dicts\n", "df_sb_events_normalize = df_sb_events.apply(lambda x: x.explode()).reset_index(drop=True)\n", "\n", "## list of columns with dicts\n", "cols_to_normalize = ['shot.freeze_frame']\n", "\n", "## if there are keys, which will become column names, overlap with existing column names. add the current column name as a prefix\n", "normalized = list()\n", "\n", "for col in cols_to_normalize:\n", "    d = pd.json_normalize(df_sb_events_normalize[col], sep='_')\n", "    d.columns = [f'{col}_{v}' for v in d.columns]\n", "    normalized.append(d.copy())\n", "\n", "## combine df with the normalized columns\n", "df_sb_events_normalize = pd.concat([df_sb_events_normalize] + normalized, axis=1).drop(columns=cols_to_normalize)\n", "\n", "## display(df_lineup_select_normalize)\n", "df_sb_events_normalize.head(30)\n", "\"\"\"" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/opt/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:12: FutureWarning: The default value of regex will change from True to False in a future version. 
In addition, single character regular expressions will *not* be treated as literal strings when regex=True.\n", " if sys.path[0] == '':\n", "/opt/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:13: FutureWarning: The default value of regex will change from True to False in a future version. In addition, single character regular expressions will *not* be treated as literal strings when regex=True.\n", " del sys.path[0]\n", "/opt/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:14: FutureWarning: The default value of regex will change from True to False in a future version. In addition, single character regular expressions will *not* be treated as literal strings when regex=True.\n", " \n", "/opt/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:15: FutureWarning: The default value of regex will change from True to False in a future version. In addition, single character regular expressions will *not* be treated as literal strings when regex=True.\n", " from ipykernel import kernelapp as app\n", "/opt/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:16: FutureWarning: The default value of regex will change from True to False in a future version. In addition, single character regular expressions will *not* be treated as literal strings when regex=True.\n", " app.launch_new_instance()\n", "/opt/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:19: FutureWarning: The default value of regex will change from True to False in a future version. In addition, single character regular expressions will *not* be treated as literal strings when regex=True.\n", "/opt/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:20: FutureWarning: The default value of regex will change from True to False in a future version. 
In addition, single character regular expressions will *not* be treated as literal strings when regex=True.\n", "/opt/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:21: FutureWarning: The default value of regex will change from True to False in a future version. In addition, single character regular expressions will *not* be treated as literal strings when regex=True.\n", "/opt/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:22: FutureWarning: The default value of regex will change from True to False in a future version. In addition, single character regular expressions will *not* be treated as literal strings when regex=True.\n", "/opt/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:23: FutureWarning: The default value of regex will change from True to False in a future version. In addition, single character regular expressions will *not* be treated as literal strings when regex=True.\n", "/opt/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:26: FutureWarning: Columnar iteration over characters will be deprecated in future releases.\n", "/opt/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:27: FutureWarning: Columnar iteration over characters will be deprecated in future releases.\n", "/opt/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:28: FutureWarning: Columnar iteration over characters will be deprecated in future releases.\n", "/opt/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:29: FutureWarning: Columnar iteration over characters will be deprecated in future releases.\n", "/opt/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:30: FutureWarning: Columnar iteration over characters will be deprecated in future releases.\n" ] }, { "data": { "text/html": [ "
level_0idindexperiodtimestampminutesecondpossessiondurationtype_idtype_namepossession_team_idpossession_team_nameplay_pattern_idplay_pattern_nameteam_idteam_nametactics_formationtactics_lineuprelated_eventslocationplayer_idplayer_nameposition_idposition_namepass_recipient_idpass_recipient_namepass_lengthpass_anglepass_height_idpass_height_namepass_end_locationpass_body_part_idpass_body_part_namepass_type_idpass_type_namecarry_end_locationunder_pressureduel_type_idduel_type_namepass_aerial_woncounterpressduel_outcome_idduel_outcome_namedribble_outcome_iddribble_outcome_namepass_outcome_idpass_outcome_nameball_receipt_outcome_idball_receipt_outcome_nameinterception_outcome_idinterception_outcome_nameshot_statsbomb_xgshot_end_locationshot_outcome_idshot_outcome_nameshot_type_idshot_type_nameshot_body_part_idshot_body_part_nameshot_technique_idshot_technique_nameshot_freeze_framegoalkeeper_end_locationgoalkeeper_type_idgoalkeeper_type_namegoalkeeper_position_idgoalkeeper_position_nameoutpass_outswingingpass_technique_idpass_technique_nameclearance_headclearance_body_part_idclearance_body_part_namepass_switchoff_camerapass_crossclearance_left_footdribble_overrundribble_nutmegclearance_right_footpass_no_touchfoul_committed_advantagefoul_won_advantagepass_assisted_shot_idpass_shot_assistshot_key_pass_idshot_first_timeclearance_otherpass_miscommunicationclearance_aerial_wonpass_through_ballball_recovery_recovery_failuregoalkeeper_outcome_idgoalkeeper_outcome_namegoalkeeper_body_part_idgoalkeeper_body_part_nameshot_aerial_wonfoul_committed_card_idfoul_committed_card_namefoul_committed_offensivefoul_won_defensivesubstitution_outcome_idsubstitution_outcome_namesubstitution_replacement_idsubstitution_replacement_name50_50_outcome_id50_50_outcome_namepass_goal_assistgoalkeeper_technique_idgoalkeeper_technique_namepass_cut_backmiscontrol_aerial_wonpass_straightfoul_committed_type_idfoul_committed_type_namematch_idpass_inswingingpass_deflectedinjury_stoppage_in_chainshot_one_on_on
ebad_behaviour_card_idbad_behaviour_card_nameshot_deflectedblock_deflectionfoul_committed_penaltyfoul_won_penaltyblock_save_blockgoalkeeper_punched_outplayer_off_permanentshot_saved_off_targetgoalkeeper_shot_saved_off_targetshot_saved_to_postgoalkeeper_shot_saved_to_postshot_open_goalgoalkeeper_penalty_saved_to_postdribble_no_touchblock_offensiveshot_follows_dribbleball_recovery_offensiveshot_redirectgoalkeeper_lost_in_playgoalkeeper_success_in_playmatch_datekick_offhome_scoreaway_scorematch_statusmatch_status_360last_updatedlast_updated_360match_weekcompetition_competition_idcompetition_country_namecompetition_competition_nameseason_season_idseason_season_namehome_team_home_team_idhome_team_home_team_namehome_team_home_team_genderhome_team_home_team_grouphome_team_country_idhome_team_country_namehome_team_managersaway_team_away_team_idaway_team_away_team_nameaway_team_away_team_genderaway_team_away_team_groupaway_team_country_idaway_team_country_nameaway_team_managersmetadata_data_versionmetadata_shot_fidelity_versionmetadata_xy_fidelity_versioncompetition_stage_idcompetition_stage_namestadium_idstadium_namestadium_country_idstadium_country_namereferee_idreferee_namereferee_country_idreferee_country_namecompetition_idseason_idcountry_namecompetition_namecompetition_gendercompetition_youthcompetition_internationalseason_namematch_updatedmatch_updated_360match_available_360match_availableFull_Fixture_Datelocation_xlocation_ypass_end_location_xpass_end_location_ycarry_end_location_xcarry_end_location_yshot_end_location_xshot_end_location_yshot_end_location_zgoalkeeper_end_location_xgoalkeeper_end_location_y
128670019edeac2-e63f-4795-8a8b-17a6e9fdb6e31100:00:00.0000010.0000035Starting XI909Turkey1Regular Play909Turkey4141.0[{'player': {'id': 30357, 'name': 'Uğurcan Çak...NaNnanNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNnanNaNNaNNaNNaNnanNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNnanNaNNaNNaNNaNNaNNaNNaNNaNNaNnanNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN3788741NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN2021-06-1121:00:00.00003availableavailable2021-06-12T12:49:02.0702021-09-22T16:38:05.059090155EuropeUEFA Euro432020909TurkeymaleGroup A233Turkey[{'id': 701, 'name': 'Şenol Güneş', 'nickname'...914ItalymaleGroup A112Italy[{'id': 2997, 'name': 'Roberto Mancini', 'nick...1.1.02210Group Stage381Stadio Olimpico (Roma)112Italy293Danny Desmond Makkelie160Netherlands5543EuropeUEFA EuromaleFalseTrue20202021-11-11T14:00:16.1058092021-11-11T13:54:37.5073762021-11-11T13:54:37.5073762021-11-11T14:00:16.1058092021-06-11 Turkey 0 vs. 3 ItalyNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
128671189072e2e-b64f-4099-846b-b22cf000f9c72100:00:00.0000010.0000035Starting XI909Turkey1Regular Play914Italy433.0[{'player': {'id': 7036, 'name': 'Gianluigi Do...NaNnanNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNnanNaNNaNNaNNaNnanNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNnanNaNNaNNaNNaNNaNNaNNaNNaNNaNnanNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN3788741NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN2021-06-1121:00:00.00003availableavailable2021-06-12T12:49:02.0702021-09-22T16:38:05.059090155EuropeUEFA Euro432020909TurkeymaleGroup A233Turkey[{'id': 701, 'name': 'Şenol Güneş', 'nickname'...914ItalymaleGroup A112Italy[{'id': 2997, 'name': 'Roberto Mancini', 'nick...1.1.02210Group Stage381Stadio Olimpico (Roma)112Italy293Danny Desmond Makkelie160Netherlands5543EuropeUEFA EuromaleFalseTrue20202021-11-11T14:00:16.1058092021-11-11T13:54:37.5073762021-11-11T13:54:37.5073762021-11-11T14:00:16.1058092021-06-11 Turkey 0 vs. 3 ItalyNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
128672246c6901e-3b12-495a-b68a-19ca15798ed03100:00:00.0000010.0000018Half Start909Turkey1Regular Play914ItalyNaNNaN['9e5b0646-91cc-49a1-bf88-39bde773b949']nanNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNnanNaNNaNNaNNaNnanNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNnanNaNNaNNaNNaNNaNNaNNaNNaNNaNnanNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN3788741NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN2021-06-1121:00:00.00003availableavailable2021-06-12T12:49:02.0702021-09-22T16:38:05.059090155EuropeUEFA Euro432020909TurkeymaleGroup A233Turkey[{'id': 701, 'name': 'Şenol Güneş', 'nickname'...914ItalymaleGroup A112Italy[{'id': 2997, 'name': 'Roberto Mancini', 'nick...1.1.02210Group Stage381Stadio Olimpico (Roma)112Italy293Danny Desmond Makkelie160Netherlands5543EuropeUEFA EuromaleFalseTrue20202021-11-11T14:00:16.1058092021-11-11T13:54:37.5073762021-11-11T13:54:37.5073762021-11-11T14:00:16.1058092021-06-11 Turkey 0 vs. 3 ItalyNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
12867339e5b0646-91cc-49a1-bf88-39bde773b9494100:00:00.0000010.0000018Half Start909Turkey1Regular Play909TurkeyNaNNaN['46c6901e-3b12-495a-b68a-19ca15798ed0']nanNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNnanNaNNaNNaNNaNnanNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNnanNaNNaNNaNNaNNaNNaNNaNNaNNaNnanNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN3788741NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN2021-06-1121:00:00.00003availableavailable2021-06-12T12:49:02.0702021-09-22T16:38:05.059090155EuropeUEFA Euro432020909TurkeymaleGroup A233Turkey[{'id': 701, 'name': 'Şenol Güneş', 'nickname'...914ItalymaleGroup A112Italy[{'id': 2997, 'name': 'Roberto Mancini', 'nick...1.1.02210Group Stage381Stadio Olimpico (Roma)112Italy293Danny Desmond Makkelie160Netherlands5543EuropeUEFA EuromaleFalseTrue20202021-11-11T14:00:16.1058092021-11-11T13:54:37.5073762021-11-11T13:54:37.5073762021-11-11T14:00:16.1058092021-06-11 Turkey 0 vs. 3 ItalyNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
13067320035c2a7c70-becf-4520-856d-d80b55367c752004200:00:00.0004501100.0000018Half Start909Turkey3From Free Kick914ItalyNaNNaN['e767b1ab-b5c8-4ac4-87c9-d7a884399777']nanNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNnanNaNNaNNaNNaNnanNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNnanNaNNaNNaNNaNNaNNaNNaNNaNNaNnanNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN3788741NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN2021-06-1121:00:00.00003availableavailable2021-06-12T12:49:02.0702021-09-22T16:38:05.059090155EuropeUEFA Euro432020909TurkeymaleGroup A233Turkey[{'id': 701, 'name': 'Şenol Güneş', 'nickname'...914ItalymaleGroup A112Italy[{'id': 2997, 'name': 'Roberto Mancini', 'nick...1.1.02210Group Stage381Stadio Olimpico (Roma)112Italy293Danny Desmond Makkelie160Netherlands5543EuropeUEFA EuromaleFalseTrue20202021-11-11T14:00:16.1058092021-11-11T13:54:37.5073762021-11-11T13:54:37.5073762021-11-11T14:00:16.1058092021-06-11 Turkey 0 vs. 3 ItalyNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
1306742004e767b1ab-b5c8-4ac4-87c9-d7a8843997772005200:00:00.0004501100.0000018Half Start909Turkey3From Free Kick909TurkeyNaNNaN['5c2a7c70-becf-4520-856d-d80b55367c75']nanNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNnanNaNNaNNaNNaNnanNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNnanNaNNaNNaNNaNNaNNaNNaNNaNNaNnanNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN3788741NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN2021-06-1121:00:00.00003availableavailable2021-06-12T12:49:02.0702021-09-22T16:38:05.059090155EuropeUEFA Euro432020909TurkeymaleGroup A233Turkey[{'id': 701, 'name': 'Şenol Güneş', 'nickname'...914ItalymaleGroup A112Italy[{'id': 2997, 'name': 'Roberto Mancini', 'nick...1.1.02210Group Stage381Stadio Olimpico (Roma)112Italy293Danny Desmond Makkelie160Netherlands5543EuropeUEFA EuromaleFalseTrue20202021-11-11T14:00:16.1058092021-11-11T13:54:37.5073762021-11-11T13:54:37.5073762021-11-11T14:00:16.1058092021-06-11 Turkey 0 vs. 3 ItalyNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
130675200516b05b12-cfcb-46cd-b74b-b74b6f8e6ceb2006200:00:00.0004501100.0000019Substitution909Turkey3From Free Kick909TurkeyNaNNaNNaNnan29989.0Yusuf Yazıcı13.0Right Center MidfieldNaNNaNNaNNaNNaNNaNnanNaNNaNNaNNaNnanNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNnanNaNNaNNaNNaNNaNNaNNaNNaNNaNnanNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN103.0Tactical6971.0Cengiz ÜnderNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN3788741NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN2021-06-1121:00:00.00003availableavailable2021-06-12T12:49:02.0702021-09-22T16:38:05.059090155EuropeUEFA Euro432020909TurkeymaleGroup A233Turkey[{'id': 701, 'name': 'Şenol Güneş', 'nickname'...914ItalymaleGroup A112Italy[{'id': 2997, 'name': 'Roberto Mancini', 'nick...1.1.02210Group Stage381Stadio Olimpico (Roma)112Italy293Danny Desmond Makkelie160Netherlands5543EuropeUEFA EuromaleFalseTrue20202021-11-11T14:00:16.1058092021-11-11T13:54:37.5073762021-11-11T13:54:37.5073762021-11-11T14:00:16.1058092021-06-11 Turkey 0 vs. 3 ItalyNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
130676200635570177-df15-407e-b232-7b9c244ac9da2007200:00:00.0004501100.0000019Substitution909Turkey3From Free Kick914ItalyNaNNaNNaNnan6964.0Alessandro Florenzi2.0Right BackNaNNaNNaNNaNNaNNaNnanNaNNaNNaNNaNnanNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNnanNaNNaNNaNNaNNaNNaNNaNNaNNaNnanNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN103.0Tactical11514.0Giovanni Di LorenzoNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN3788741NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN2021-06-1121:00:00.00003availableavailable2021-06-12T12:49:02.0702021-09-22T16:38:05.059090155EuropeUEFA Euro432020909TurkeymaleGroup A233Turkey[{'id': 701, 'name': 'Şenol Güneş', 'nickname'...914ItalymaleGroup A112Italy[{'id': 2997, 'name': 'Roberto Mancini', 'nick...1.1.02210Group Stage381Stadio Olimpico (Roma)112Italy293Danny Desmond Makkelie160Netherlands5543EuropeUEFA EuromaleFalseTrue20202021-11-11T14:00:16.1058092021-11-11T13:54:37.5073762021-11-11T13:54:37.5073762021-11-11T14:00:16.1058092021-06-11 Turkey 0 vs. 3 ItalyNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
1306702000225c7ac7-1c71-4225-8be3-d943e85b518d2001100:00:00.18900110NaN42Ball Receipt*909Turkey3From Free Kick909TurkeyNaNNaN['9e0765b1-85a1-4d49-9b0c-55c6276646d5']70.2, 56.429989.0Yusuf Yazıcı13.0Right Center MidfieldNaNNaNNaNNaNNaNNaNnanNaNNaNNaNNaNnanNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNnanNaNNaNNaNNaNNaNNaNNaNNaNNaNnanNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN3788741NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN2021-06-1121:00:00.00003availableavailable2021-06-12T12:49:02.0702021-09-22T16:38:05.059090155EuropeUEFA Euro432020909TurkeymaleGroup A233Turkey[{'id': 701, 'name': 'Şenol Güneş', 'nickname'...914ItalymaleGroup A112Italy[{'id': 2997, 'name': 'Roberto Mancini', 'nick...1.1.02210Group Stage381Stadio Olimpico (Roma)112Italy293Danny Desmond Makkelie160Netherlands5543EuropeUEFA EuromaleFalseTrue20202021-11-11T14:00:16.1058092021-11-11T13:54:37.5073762021-11-11T13:54:37.5073762021-11-11T14:00:16.1058092021-06-11 Turkey 0 vs. 3 Italy70.256.4NaNNaNNaNNaNNaNNaNNaNNaNNaN
130677200737939db0-8973-469e-ac21-e3246b8eff002008200:00:00.1894501110.7375630Pass914Italy9From Kick Off914ItalyNaNNaN['2f3a9e17-a5fb-4e00-a12b-41bc392ea588']60.0, 40.07788.0Ciro Immobile23.0Center Forward7024.0Jorge Luiz Frello Filho11.940268-2.7641971.0Ground Pass48.9, 35.640.0Right Foot65.0Kick OffnanNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNnanNaNNaNNaNNaNNaNNaNNaNNaNNaNnanNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN3788741NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN2021-06-1121:00:00.00003availableavailable2021-06-12T12:49:02.0702021-09-22T16:38:05.059090155EuropeUEFA Euro432020909TurkeymaleGroup A233Turkey[{'id': 701, 'name': 'Şenol Güneş', 'nickname'...914ItalymaleGroup A112Italy[{'id': 2997, 'name': 'Roberto Mancini', 'nick...1.1.02210Group Stage381Stadio Olimpico (Roma)112Italy293Danny Desmond Makkelie160Netherlands5543EuropeUEFA EuromaleFalseTrue20202021-11-11T14:00:16.1058092021-11-11T13:54:37.5073762021-11-11T13:54:37.5073762021-11-11T14:00:16.1058092021-06-11 Turkey 0 vs. 3 Italy60.040.048.935.6NaNNaNNaNNaNNaNNaNNaN
\n", "
" ], "text/plain": [ " level_0 id index period \\\n", "128670 0 19edeac2-e63f-4795-8a8b-17a6e9fdb6e3 1 1 \n", "128671 1 89072e2e-b64f-4099-846b-b22cf000f9c7 2 1 \n", "128672 2 46c6901e-3b12-495a-b68a-19ca15798ed0 3 1 \n", "128673 3 9e5b0646-91cc-49a1-bf88-39bde773b949 4 1 \n", "130673 2003 5c2a7c70-becf-4520-856d-d80b55367c75 2004 2 \n", "130674 2004 e767b1ab-b5c8-4ac4-87c9-d7a884399777 2005 2 \n", "130675 2005 16b05b12-cfcb-46cd-b74b-b74b6f8e6ceb 2006 2 \n", "130676 2006 35570177-df15-407e-b232-7b9c244ac9da 2007 2 \n", "130670 2000 225c7ac7-1c71-4225-8be3-d943e85b518d 2001 1 \n", "130677 2007 37939db0-8973-469e-ac21-e3246b8eff00 2008 2 \n", "\n", " timestamp minute second possession duration type_id \\\n", "128670 00:00:00.000 0 0 1 0.00000 35 \n", "128671 00:00:00.000 0 0 1 0.00000 35 \n", "128672 00:00:00.000 0 0 1 0.00000 18 \n", "128673 00:00:00.000 0 0 1 0.00000 18 \n", "130673 00:00:00.000 45 0 110 0.00000 18 \n", "130674 00:00:00.000 45 0 110 0.00000 18 \n", "130675 00:00:00.000 45 0 110 0.00000 19 \n", "130676 00:00:00.000 45 0 110 0.00000 19 \n", "130670 00:00:00.189 0 0 110 NaN 42 \n", "130677 00:00:00.189 45 0 111 0.73756 30 \n", "\n", " type_name possession_team_id possession_team_name \\\n", "128670 Starting XI 909 Turkey \n", "128671 Starting XI 909 Turkey \n", "128672 Half Start 909 Turkey \n", "128673 Half Start 909 Turkey \n", "130673 Half Start 909 Turkey \n", "130674 Half Start 909 Turkey \n", "130675 Substitution 909 Turkey \n", "130676 Substitution 909 Turkey \n", "130670 Ball Receipt* 909 Turkey \n", "130677 Pass 914 Italy \n", "\n", " play_pattern_id play_pattern_name team_id team_name \\\n", "128670 1 Regular Play 909 Turkey \n", "128671 1 Regular Play 914 Italy \n", "128672 1 Regular Play 914 Italy \n", "128673 1 Regular Play 909 Turkey \n", "130673 3 From Free Kick 914 Italy \n", "130674 3 From Free Kick 909 Turkey \n", "130675 3 From Free Kick 909 Turkey \n", "130676 3 From Free Kick 914 Italy \n", "130670 3 From Free Kick 909 Turkey 
\n", "130677 9 From Kick Off 914 Italy \n", "\n", " tactics_formation tactics_lineup \\\n", "128670 4141.0 [{'player': {'id': 30357, 'name': 'Uğurcan Çak... \n", "128671 433.0 [{'player': {'id': 7036, 'name': 'Gianluigi Do... \n", "128672 NaN NaN \n", "128673 NaN NaN \n", "130673 NaN NaN \n", "130674 NaN NaN \n", "130675 NaN NaN \n", "130676 NaN NaN \n", "130670 NaN NaN \n", "130677 NaN NaN \n", "\n", " related_events location player_id \\\n", "128670 NaN nan NaN \n", "128671 NaN nan NaN \n", "128672 ['9e5b0646-91cc-49a1-bf88-39bde773b949'] nan NaN \n", "128673 ['46c6901e-3b12-495a-b68a-19ca15798ed0'] nan NaN \n", "130673 ['e767b1ab-b5c8-4ac4-87c9-d7a884399777'] nan NaN \n", "130674 ['5c2a7c70-becf-4520-856d-d80b55367c75'] nan NaN \n", "130675 NaN nan 29989.0 \n", "130676 NaN nan 6964.0 \n", "130670 ['9e0765b1-85a1-4d49-9b0c-55c6276646d5'] 70.2, 56.4 29989.0 \n", "130677 ['2f3a9e17-a5fb-4e00-a12b-41bc392ea588'] 60.0, 40.0 7788.0 \n", "\n", " player_name position_id position_name \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "130674 NaN NaN NaN \n", "130675 Yusuf Yazıcı 13.0 Right Center Midfield \n", "130676 Alessandro Florenzi 2.0 Right Back \n", "130670 Yusuf Yazıcı 13.0 Right Center Midfield \n", "130677 Ciro Immobile 23.0 Center Forward \n", "\n", " pass_recipient_id pass_recipient_name pass_length pass_angle \\\n", "128670 NaN NaN NaN NaN \n", "128671 NaN NaN NaN NaN \n", "128672 NaN NaN NaN NaN \n", "128673 NaN NaN NaN NaN \n", "130673 NaN NaN NaN NaN \n", "130674 NaN NaN NaN NaN \n", "130675 NaN NaN NaN NaN \n", "130676 NaN NaN NaN NaN \n", "130670 NaN NaN NaN NaN \n", "130677 7024.0 Jorge Luiz Frello Filho 11.940268 -2.764197 \n", "\n", " pass_height_id pass_height_name pass_end_location pass_body_part_id \\\n", "128670 NaN NaN nan NaN \n", "128671 NaN NaN nan NaN \n", "128672 NaN NaN nan NaN \n", "128673 NaN NaN nan NaN \n", "130673 NaN NaN nan NaN \n", "130674 NaN NaN 
nan NaN \n", "130675 NaN NaN nan NaN \n", "130676 NaN NaN nan NaN \n", "130670 NaN NaN nan NaN \n", "130677 1.0 Ground Pass 48.9, 35.6 40.0 \n", "\n", " pass_body_part_name pass_type_id pass_type_name carry_end_location \\\n", "128670 NaN NaN NaN nan \n", "128671 NaN NaN NaN nan \n", "128672 NaN NaN NaN nan \n", "128673 NaN NaN NaN nan \n", "130673 NaN NaN NaN nan \n", "130674 NaN NaN NaN nan \n", "130675 NaN NaN NaN nan \n", "130676 NaN NaN NaN nan \n", "130670 NaN NaN NaN nan \n", "130677 Right Foot 65.0 Kick Off nan \n", "\n", " under_pressure duel_type_id duel_type_name pass_aerial_won \\\n", "128670 NaN NaN NaN NaN \n", "128671 NaN NaN NaN NaN \n", "128672 NaN NaN NaN NaN \n", "128673 NaN NaN NaN NaN \n", "130673 NaN NaN NaN NaN \n", "130674 NaN NaN NaN NaN \n", "130675 NaN NaN NaN NaN \n", "130676 NaN NaN NaN NaN \n", "130670 NaN NaN NaN NaN \n", "130677 NaN NaN NaN NaN \n", "\n", " counterpress duel_outcome_id duel_outcome_name dribble_outcome_id \\\n", "128670 NaN NaN NaN NaN \n", "128671 NaN NaN NaN NaN \n", "128672 NaN NaN NaN NaN \n", "128673 NaN NaN NaN NaN \n", "130673 NaN NaN NaN NaN \n", "130674 NaN NaN NaN NaN \n", "130675 NaN NaN NaN NaN \n", "130676 NaN NaN NaN NaN \n", "130670 NaN NaN NaN NaN \n", "130677 NaN NaN NaN NaN \n", "\n", " dribble_outcome_name pass_outcome_id pass_outcome_name \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "130674 NaN NaN NaN \n", "130675 NaN NaN NaN \n", "130676 NaN NaN NaN \n", "130670 NaN NaN NaN \n", "130677 NaN NaN NaN \n", "\n", " ball_receipt_outcome_id ball_receipt_outcome_name \\\n", "128670 NaN NaN \n", "128671 NaN NaN \n", "128672 NaN NaN \n", "128673 NaN NaN \n", "130673 NaN NaN \n", "130674 NaN NaN \n", "130675 NaN NaN \n", "130676 NaN NaN \n", "130670 NaN NaN \n", "130677 NaN NaN \n", "\n", " interception_outcome_id interception_outcome_name shot_statsbomb_xg \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", 
"128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "130674 NaN NaN NaN \n", "130675 NaN NaN NaN \n", "130676 NaN NaN NaN \n", "130670 NaN NaN NaN \n", "130677 NaN NaN NaN \n", "\n", " shot_end_location shot_outcome_id shot_outcome_name shot_type_id \\\n", "128670 nan NaN NaN NaN \n", "128671 nan NaN NaN NaN \n", "128672 nan NaN NaN NaN \n", "128673 nan NaN NaN NaN \n", "130673 nan NaN NaN NaN \n", "130674 nan NaN NaN NaN \n", "130675 nan NaN NaN NaN \n", "130676 nan NaN NaN NaN \n", "130670 nan NaN NaN NaN \n", "130677 nan NaN NaN NaN \n", "\n", " shot_type_name shot_body_part_id shot_body_part_name \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "130674 NaN NaN NaN \n", "130675 NaN NaN NaN \n", "130676 NaN NaN NaN \n", "130670 NaN NaN NaN \n", "130677 NaN NaN NaN \n", "\n", " shot_technique_id shot_technique_name shot_freeze_frame \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "130674 NaN NaN NaN \n", "130675 NaN NaN NaN \n", "130676 NaN NaN NaN \n", "130670 NaN NaN NaN \n", "130677 NaN NaN NaN \n", "\n", " goalkeeper_end_location goalkeeper_type_id goalkeeper_type_name \\\n", "128670 nan NaN NaN \n", "128671 nan NaN NaN \n", "128672 nan NaN NaN \n", "128673 nan NaN NaN \n", "130673 nan NaN NaN \n", "130674 nan NaN NaN \n", "130675 nan NaN NaN \n", "130676 nan NaN NaN \n", "130670 nan NaN NaN \n", "130677 nan NaN NaN \n", "\n", " goalkeeper_position_id goalkeeper_position_name out pass_outswinging \\\n", "128670 NaN NaN NaN NaN \n", "128671 NaN NaN NaN NaN \n", "128672 NaN NaN NaN NaN \n", "128673 NaN NaN NaN NaN \n", "130673 NaN NaN NaN NaN \n", "130674 NaN NaN NaN NaN \n", "130675 NaN NaN NaN NaN \n", "130676 NaN NaN NaN NaN \n", "130670 NaN NaN NaN NaN \n", "130677 NaN NaN NaN NaN \n", "\n", " pass_technique_id pass_technique_name clearance_head \\\n", "128670 NaN NaN NaN 
\n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "130674 NaN NaN NaN \n", "130675 NaN NaN NaN \n", "130676 NaN NaN NaN \n", "130670 NaN NaN NaN \n", "130677 NaN NaN NaN \n", "\n", " clearance_body_part_id clearance_body_part_name pass_switch \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "130674 NaN NaN NaN \n", "130675 NaN NaN NaN \n", "130676 NaN NaN NaN \n", "130670 NaN NaN NaN \n", "130677 NaN NaN NaN \n", "\n", " off_camera pass_cross clearance_left_foot dribble_overrun \\\n", "128670 NaN NaN NaN NaN \n", "128671 NaN NaN NaN NaN \n", "128672 NaN NaN NaN NaN \n", "128673 NaN NaN NaN NaN \n", "130673 NaN NaN NaN NaN \n", "130674 NaN NaN NaN NaN \n", "130675 NaN NaN NaN NaN \n", "130676 NaN NaN NaN NaN \n", "130670 NaN NaN NaN NaN \n", "130677 NaN NaN NaN NaN \n", "\n", " dribble_nutmeg clearance_right_foot pass_no_touch \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "130674 NaN NaN NaN \n", "130675 NaN NaN NaN \n", "130676 NaN NaN NaN \n", "130670 NaN NaN NaN \n", "130677 NaN NaN NaN \n", "\n", " foul_committed_advantage foul_won_advantage pass_assisted_shot_id \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "130674 NaN NaN NaN \n", "130675 NaN NaN NaN \n", "130676 NaN NaN NaN \n", "130670 NaN NaN NaN \n", "130677 NaN NaN NaN \n", "\n", " pass_shot_assist shot_key_pass_id shot_first_time clearance_other \\\n", "128670 NaN NaN NaN NaN \n", "128671 NaN NaN NaN NaN \n", "128672 NaN NaN NaN NaN \n", "128673 NaN NaN NaN NaN \n", "130673 NaN NaN NaN NaN \n", "130674 NaN NaN NaN NaN \n", "130675 NaN NaN NaN NaN \n", "130676 NaN NaN NaN NaN \n", "130670 NaN NaN NaN NaN \n", "130677 NaN NaN NaN NaN \n", "\n", " pass_miscommunication clearance_aerial_won 
pass_through_ball \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "130674 NaN NaN NaN \n", "130675 NaN NaN NaN \n", "130676 NaN NaN NaN \n", "130670 NaN NaN NaN \n", "130677 NaN NaN NaN \n", "\n", " ball_recovery_recovery_failure goalkeeper_outcome_id \\\n", "128670 NaN NaN \n", "128671 NaN NaN \n", "128672 NaN NaN \n", "128673 NaN NaN \n", "130673 NaN NaN \n", "130674 NaN NaN \n", "130675 NaN NaN \n", "130676 NaN NaN \n", "130670 NaN NaN \n", "130677 NaN NaN \n", "\n", " goalkeeper_outcome_name goalkeeper_body_part_id \\\n", "128670 NaN NaN \n", "128671 NaN NaN \n", "128672 NaN NaN \n", "128673 NaN NaN \n", "130673 NaN NaN \n", "130674 NaN NaN \n", "130675 NaN NaN \n", "130676 NaN NaN \n", "130670 NaN NaN \n", "130677 NaN NaN \n", "\n", " goalkeeper_body_part_name shot_aerial_won foul_committed_card_id \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "130674 NaN NaN NaN \n", "130675 NaN NaN NaN \n", "130676 NaN NaN NaN \n", "130670 NaN NaN NaN \n", "130677 NaN NaN NaN \n", "\n", " foul_committed_card_name foul_committed_offensive foul_won_defensive \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "130674 NaN NaN NaN \n", "130675 NaN NaN NaN \n", "130676 NaN NaN NaN \n", "130670 NaN NaN NaN \n", "130677 NaN NaN NaN \n", "\n", " substitution_outcome_id substitution_outcome_name \\\n", "128670 NaN NaN \n", "128671 NaN NaN \n", "128672 NaN NaN \n", "128673 NaN NaN \n", "130673 NaN NaN \n", "130674 NaN NaN \n", "130675 103.0 Tactical \n", "130676 103.0 Tactical \n", "130670 NaN NaN \n", "130677 NaN NaN \n", "\n", " substitution_replacement_id substitution_replacement_name \\\n", "128670 NaN NaN \n", "128671 NaN NaN \n", "128672 NaN NaN \n", "128673 NaN NaN \n", "130673 NaN NaN \n", "130674 NaN NaN \n", "130675 6971.0 Cengiz 
Ünder \n", "130676 11514.0 Giovanni Di Lorenzo \n", "130670 NaN NaN \n", "130677 NaN NaN \n", "\n", " 50_50_outcome_id 50_50_outcome_name pass_goal_assist \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "130674 NaN NaN NaN \n", "130675 NaN NaN NaN \n", "130676 NaN NaN NaN \n", "130670 NaN NaN NaN \n", "130677 NaN NaN NaN \n", "\n", " goalkeeper_technique_id goalkeeper_technique_name pass_cut_back \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "130674 NaN NaN NaN \n", "130675 NaN NaN NaN \n", "130676 NaN NaN NaN \n", "130670 NaN NaN NaN \n", "130677 NaN NaN NaN \n", "\n", " miscontrol_aerial_won pass_straight foul_committed_type_id \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "130674 NaN NaN NaN \n", "130675 NaN NaN NaN \n", "130676 NaN NaN NaN \n", "130670 NaN NaN NaN \n", "130677 NaN NaN NaN \n", "\n", " foul_committed_type_name match_id pass_inswinging pass_deflected \\\n", "128670 NaN 3788741 NaN NaN \n", "128671 NaN 3788741 NaN NaN \n", "128672 NaN 3788741 NaN NaN \n", "128673 NaN 3788741 NaN NaN \n", "130673 NaN 3788741 NaN NaN \n", "130674 NaN 3788741 NaN NaN \n", "130675 NaN 3788741 NaN NaN \n", "130676 NaN 3788741 NaN NaN \n", "130670 NaN 3788741 NaN NaN \n", "130677 NaN 3788741 NaN NaN \n", "\n", " injury_stoppage_in_chain shot_one_on_one bad_behaviour_card_id \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "130674 NaN NaN NaN \n", "130675 NaN NaN NaN \n", "130676 NaN NaN NaN \n", "130670 NaN NaN NaN \n", "130677 NaN NaN NaN \n", "\n", " bad_behaviour_card_name shot_deflected block_deflection \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", 
"130674 NaN NaN NaN \n", "130675 NaN NaN NaN \n", "130676 NaN NaN NaN \n", "130670 NaN NaN NaN \n", "130677 NaN NaN NaN \n", "\n", " foul_committed_penalty foul_won_penalty block_save_block \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "130674 NaN NaN NaN \n", "130675 NaN NaN NaN \n", "130676 NaN NaN NaN \n", "130670 NaN NaN NaN \n", "130677 NaN NaN NaN \n", "\n", " goalkeeper_punched_out player_off_permanent shot_saved_off_target \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "130674 NaN NaN NaN \n", "130675 NaN NaN NaN \n", "130676 NaN NaN NaN \n", "130670 NaN NaN NaN \n", "130677 NaN NaN NaN \n", "\n", " goalkeeper_shot_saved_off_target shot_saved_to_post \\\n", "128670 NaN NaN \n", "128671 NaN NaN \n", "128672 NaN NaN \n", "128673 NaN NaN \n", "130673 NaN NaN \n", "130674 NaN NaN \n", "130675 NaN NaN \n", "130676 NaN NaN \n", "130670 NaN NaN \n", "130677 NaN NaN \n", "\n", " goalkeeper_shot_saved_to_post shot_open_goal \\\n", "128670 NaN NaN \n", "128671 NaN NaN \n", "128672 NaN NaN \n", "128673 NaN NaN \n", "130673 NaN NaN \n", "130674 NaN NaN \n", "130675 NaN NaN \n", "130676 NaN NaN \n", "130670 NaN NaN \n", "130677 NaN NaN \n", "\n", " goalkeeper_penalty_saved_to_post dribble_no_touch block_offensive \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "130674 NaN NaN NaN \n", "130675 NaN NaN NaN \n", "130676 NaN NaN NaN \n", "130670 NaN NaN NaN \n", "130677 NaN NaN NaN \n", "\n", " shot_follows_dribble ball_recovery_offensive shot_redirect \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "130674 NaN NaN NaN \n", "130675 NaN NaN NaN \n", "130676 NaN NaN NaN \n", "130670 NaN NaN NaN \n", "130677 NaN NaN NaN \n", "\n", " 
goalkeeper_lost_in_play goalkeeper_success_in_play match_date \\\n", "128670 NaN NaN 2021-06-11 \n", "128671 NaN NaN 2021-06-11 \n", "128672 NaN NaN 2021-06-11 \n", "128673 NaN NaN 2021-06-11 \n", "130673 NaN NaN 2021-06-11 \n", "130674 NaN NaN 2021-06-11 \n", "130675 NaN NaN 2021-06-11 \n", "130676 NaN NaN 2021-06-11 \n", "130670 NaN NaN 2021-06-11 \n", "130677 NaN NaN 2021-06-11 \n", "\n", " kick_off home_score away_score match_status match_status_360 \\\n", "128670 21:00:00.000 0 3 available available \n", "128671 21:00:00.000 0 3 available available \n", "128672 21:00:00.000 0 3 available available \n", "128673 21:00:00.000 0 3 available available \n", "130673 21:00:00.000 0 3 available available \n", "130674 21:00:00.000 0 3 available available \n", "130675 21:00:00.000 0 3 available available \n", "130676 21:00:00.000 0 3 available available \n", "130670 21:00:00.000 0 3 available available \n", "130677 21:00:00.000 0 3 available available \n", "\n", " last_updated last_updated_360 match_week \\\n", "128670 2021-06-12T12:49:02.070 2021-09-22T16:38:05.059090 1 \n", "128671 2021-06-12T12:49:02.070 2021-09-22T16:38:05.059090 1 \n", "128672 2021-06-12T12:49:02.070 2021-09-22T16:38:05.059090 1 \n", "128673 2021-06-12T12:49:02.070 2021-09-22T16:38:05.059090 1 \n", "130673 2021-06-12T12:49:02.070 2021-09-22T16:38:05.059090 1 \n", "130674 2021-06-12T12:49:02.070 2021-09-22T16:38:05.059090 1 \n", "130675 2021-06-12T12:49:02.070 2021-09-22T16:38:05.059090 1 \n", "130676 2021-06-12T12:49:02.070 2021-09-22T16:38:05.059090 1 \n", "130670 2021-06-12T12:49:02.070 2021-09-22T16:38:05.059090 1 \n", "130677 2021-06-12T12:49:02.070 2021-09-22T16:38:05.059090 1 \n", "\n", " competition_competition_id competition_country_name \\\n", "128670 55 Europe \n", "128671 55 Europe \n", "128672 55 Europe \n", "128673 55 Europe \n", "130673 55 Europe \n", "130674 55 Europe \n", "130675 55 Europe \n", "130676 55 Europe \n", "130670 55 Europe \n", "130677 55 Europe \n", "\n", " 
competition_competition_name season_season_id season_season_name \\\n", "128670 UEFA Euro 43 2020 \n", "128671 UEFA Euro 43 2020 \n", "128672 UEFA Euro 43 2020 \n", "128673 UEFA Euro 43 2020 \n", "130673 UEFA Euro 43 2020 \n", "130674 UEFA Euro 43 2020 \n", "130675 UEFA Euro 43 2020 \n", "130676 UEFA Euro 43 2020 \n", "130670 UEFA Euro 43 2020 \n", "130677 UEFA Euro 43 2020 \n", "\n", " home_team_home_team_id home_team_home_team_name \\\n", "128670 909 Turkey \n", "128671 909 Turkey \n", "128672 909 Turkey \n", "128673 909 Turkey \n", "130673 909 Turkey \n", "130674 909 Turkey \n", "130675 909 Turkey \n", "130676 909 Turkey \n", "130670 909 Turkey \n", "130677 909 Turkey \n", "\n", " home_team_home_team_gender home_team_home_team_group \\\n", "128670 male Group A \n", "128671 male Group A \n", "128672 male Group A \n", "128673 male Group A \n", "130673 male Group A \n", "130674 male Group A \n", "130675 male Group A \n", "130676 male Group A \n", "130670 male Group A \n", "130677 male Group A \n", "\n", " home_team_country_id home_team_country_name \\\n", "128670 233 Turkey \n", "128671 233 Turkey \n", "128672 233 Turkey \n", "128673 233 Turkey \n", "130673 233 Turkey \n", "130674 233 Turkey \n", "130675 233 Turkey \n", "130676 233 Turkey \n", "130670 233 Turkey \n", "130677 233 Turkey \n", "\n", " home_team_managers \\\n", "128670 [{'id': 701, 'name': 'Şenol Güneş', 'nickname'... \n", "128671 [{'id': 701, 'name': 'Şenol Güneş', 'nickname'... \n", "128672 [{'id': 701, 'name': 'Şenol Güneş', 'nickname'... \n", "128673 [{'id': 701, 'name': 'Şenol Güneş', 'nickname'... \n", "130673 [{'id': 701, 'name': 'Şenol Güneş', 'nickname'... \n", "130674 [{'id': 701, 'name': 'Şenol Güneş', 'nickname'... \n", "130675 [{'id': 701, 'name': 'Şenol Güneş', 'nickname'... \n", "130676 [{'id': 701, 'name': 'Şenol Güneş', 'nickname'... \n", "130670 [{'id': 701, 'name': 'Şenol Güneş', 'nickname'... \n", "130677 [{'id': 701, 'name': 'Şenol Güneş', 'nickname'... 
\n", "\n", " away_team_away_team_id away_team_away_team_name \\\n", "128670 914 Italy \n", "128671 914 Italy \n", "128672 914 Italy \n", "128673 914 Italy \n", "130673 914 Italy \n", "130674 914 Italy \n", "130675 914 Italy \n", "130676 914 Italy \n", "130670 914 Italy \n", "130677 914 Italy \n", "\n", " away_team_away_team_gender away_team_away_team_group \\\n", "128670 male Group A \n", "128671 male Group A \n", "128672 male Group A \n", "128673 male Group A \n", "130673 male Group A \n", "130674 male Group A \n", "130675 male Group A \n", "130676 male Group A \n", "130670 male Group A \n", "130677 male Group A \n", "\n", " away_team_country_id away_team_country_name \\\n", "128670 112 Italy \n", "128671 112 Italy \n", "128672 112 Italy \n", "128673 112 Italy \n", "130673 112 Italy \n", "130674 112 Italy \n", "130675 112 Italy \n", "130676 112 Italy \n", "130670 112 Italy \n", "130677 112 Italy \n", "\n", " away_team_managers \\\n", "128670 [{'id': 2997, 'name': 'Roberto Mancini', 'nick... \n", "128671 [{'id': 2997, 'name': 'Roberto Mancini', 'nick... \n", "128672 [{'id': 2997, 'name': 'Roberto Mancini', 'nick... \n", "128673 [{'id': 2997, 'name': 'Roberto Mancini', 'nick... \n", "130673 [{'id': 2997, 'name': 'Roberto Mancini', 'nick... \n", "130674 [{'id': 2997, 'name': 'Roberto Mancini', 'nick... \n", "130675 [{'id': 2997, 'name': 'Roberto Mancini', 'nick... \n", "130676 [{'id': 2997, 'name': 'Roberto Mancini', 'nick... \n", "130670 [{'id': 2997, 'name': 'Roberto Mancini', 'nick... \n", "130677 [{'id': 2997, 'name': 'Roberto Mancini', 'nick... 
\n", "\n", " metadata_data_version metadata_shot_fidelity_version \\\n", "128670 1.1.0 2 \n", "128671 1.1.0 2 \n", "128672 1.1.0 2 \n", "128673 1.1.0 2 \n", "130673 1.1.0 2 \n", "130674 1.1.0 2 \n", "130675 1.1.0 2 \n", "130676 1.1.0 2 \n", "130670 1.1.0 2 \n", "130677 1.1.0 2 \n", "\n", " metadata_xy_fidelity_version competition_stage_id \\\n", "128670 2 10 \n", "128671 2 10 \n", "128672 2 10 \n", "128673 2 10 \n", "130673 2 10 \n", "130674 2 10 \n", "130675 2 10 \n", "130676 2 10 \n", "130670 2 10 \n", "130677 2 10 \n", "\n", " competition_stage_name stadium_id stadium_name \\\n", "128670 Group Stage 381 Stadio Olimpico (Roma) \n", "128671 Group Stage 381 Stadio Olimpico (Roma) \n", "128672 Group Stage 381 Stadio Olimpico (Roma) \n", "128673 Group Stage 381 Stadio Olimpico (Roma) \n", "130673 Group Stage 381 Stadio Olimpico (Roma) \n", "130674 Group Stage 381 Stadio Olimpico (Roma) \n", "130675 Group Stage 381 Stadio Olimpico (Roma) \n", "130676 Group Stage 381 Stadio Olimpico (Roma) \n", "130670 Group Stage 381 Stadio Olimpico (Roma) \n", "130677 Group Stage 381 Stadio Olimpico (Roma) \n", "\n", " stadium_country_id stadium_country_name referee_id \\\n", "128670 112 Italy 293 \n", "128671 112 Italy 293 \n", "128672 112 Italy 293 \n", "128673 112 Italy 293 \n", "130673 112 Italy 293 \n", "130674 112 Italy 293 \n", "130675 112 Italy 293 \n", "130676 112 Italy 293 \n", "130670 112 Italy 293 \n", "130677 112 Italy 293 \n", "\n", " referee_name referee_country_id referee_country_name \\\n", "128670 Danny Desmond Makkelie 160 Netherlands \n", "128671 Danny Desmond Makkelie 160 Netherlands \n", "128672 Danny Desmond Makkelie 160 Netherlands \n", "128673 Danny Desmond Makkelie 160 Netherlands \n", "130673 Danny Desmond Makkelie 160 Netherlands \n", "130674 Danny Desmond Makkelie 160 Netherlands \n", "130675 Danny Desmond Makkelie 160 Netherlands \n", "130676 Danny Desmond Makkelie 160 Netherlands \n", "130670 Danny Desmond Makkelie 160 Netherlands \n", "130677 Danny 
Desmond Makkelie 160 Netherlands \n", "\n", " competition_id season_id country_name competition_name \\\n", "128670 55 43 Europe UEFA Euro \n", "128671 55 43 Europe UEFA Euro \n", "128672 55 43 Europe UEFA Euro \n", "128673 55 43 Europe UEFA Euro \n", "130673 55 43 Europe UEFA Euro \n", "130674 55 43 Europe UEFA Euro \n", "130675 55 43 Europe UEFA Euro \n", "130676 55 43 Europe UEFA Euro \n", "130670 55 43 Europe UEFA Euro \n", "130677 55 43 Europe UEFA Euro \n", "\n", " competition_gender competition_youth competition_international \\\n", "128670 male False True \n", "128671 male False True \n", "128672 male False True \n", "128673 male False True \n", "130673 male False True \n", "130674 male False True \n", "130675 male False True \n", "130676 male False True \n", "130670 male False True \n", "130677 male False True \n", "\n", " season_name match_updated match_updated_360 \\\n", "128670 2020 2021-11-11T14:00:16.105809 2021-11-11T13:54:37.507376 \n", "128671 2020 2021-11-11T14:00:16.105809 2021-11-11T13:54:37.507376 \n", "128672 2020 2021-11-11T14:00:16.105809 2021-11-11T13:54:37.507376 \n", "128673 2020 2021-11-11T14:00:16.105809 2021-11-11T13:54:37.507376 \n", "130673 2020 2021-11-11T14:00:16.105809 2021-11-11T13:54:37.507376 \n", "130674 2020 2021-11-11T14:00:16.105809 2021-11-11T13:54:37.507376 \n", "130675 2020 2021-11-11T14:00:16.105809 2021-11-11T13:54:37.507376 \n", "130676 2020 2021-11-11T14:00:16.105809 2021-11-11T13:54:37.507376 \n", "130670 2020 2021-11-11T14:00:16.105809 2021-11-11T13:54:37.507376 \n", "130677 2020 2021-11-11T14:00:16.105809 2021-11-11T13:54:37.507376 \n", "\n", " match_available_360 match_available \\\n", "128670 2021-11-11T13:54:37.507376 2021-11-11T14:00:16.105809 \n", "128671 2021-11-11T13:54:37.507376 2021-11-11T14:00:16.105809 \n", "128672 2021-11-11T13:54:37.507376 2021-11-11T14:00:16.105809 \n", "128673 2021-11-11T13:54:37.507376 2021-11-11T14:00:16.105809 \n", "130673 2021-11-11T13:54:37.507376 2021-11-11T14:00:16.105809 
\n", "130674 2021-11-11T13:54:37.507376 2021-11-11T14:00:16.105809 \n", "130675 2021-11-11T13:54:37.507376 2021-11-11T14:00:16.105809 \n", "130676 2021-11-11T13:54:37.507376 2021-11-11T14:00:16.105809 \n", "130670 2021-11-11T13:54:37.507376 2021-11-11T14:00:16.105809 \n", "130677 2021-11-11T13:54:37.507376 2021-11-11T14:00:16.105809 \n", "\n", " Full_Fixture_Date location_x location_y \\\n", "128670 2021-06-11 Turkey 0 vs. 3 Italy NaN NaN \n", "128671 2021-06-11 Turkey 0 vs. 3 Italy NaN NaN \n", "128672 2021-06-11 Turkey 0 vs. 3 Italy NaN NaN \n", "128673 2021-06-11 Turkey 0 vs. 3 Italy NaN NaN \n", "130673 2021-06-11 Turkey 0 vs. 3 Italy NaN NaN \n", "130674 2021-06-11 Turkey 0 vs. 3 Italy NaN NaN \n", "130675 2021-06-11 Turkey 0 vs. 3 Italy NaN NaN \n", "130676 2021-06-11 Turkey 0 vs. 3 Italy NaN NaN \n", "130670 2021-06-11 Turkey 0 vs. 3 Italy 70.2 56.4 \n", "130677 2021-06-11 Turkey 0 vs. 3 Italy 60.0 40.0 \n", "\n", " pass_end_location_x pass_end_location_y carry_end_location_x \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "130674 NaN NaN NaN \n", "130675 NaN NaN NaN \n", "130676 NaN NaN NaN \n", "130670 NaN NaN NaN \n", "130677 48.9 35.6 NaN \n", "\n", " carry_end_location_y shot_end_location_x shot_end_location_y \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "130674 NaN NaN NaN \n", "130675 NaN NaN NaN \n", "130676 NaN NaN NaN \n", "130670 NaN NaN NaN \n", "130677 NaN NaN NaN \n", "\n", " shot_end_location_z goalkeeper_end_location_x \\\n", "128670 NaN NaN \n", "128671 NaN NaN \n", "128672 NaN NaN \n", "128673 NaN NaN \n", "130673 NaN NaN \n", "130674 NaN NaN \n", "130675 NaN NaN \n", "130676 NaN NaN \n", "130670 NaN NaN \n", "130677 NaN NaN \n", "\n", " goalkeeper_end_location_y \n", "128670 NaN \n", "128671 NaN \n", "128672 NaN \n", "128673 NaN \n", "130673 NaN \n", "130674 NaN \n", 
"130675 NaN \n", "130676 NaN \n", "130670 NaN \n", "130677 NaN " ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#\n", "\n", "##\n", "df_sb_events['location'] = df_sb_events['location'].astype(str)\n", "df_sb_events['pass_end_location'] = df_sb_events['pass_end_location'].astype(str)\n", "df_sb_events['carry_end_location'] = df_sb_events['carry_end_location'].astype(str)\n", "df_sb_events['shot_end_location'] = df_sb_events['shot_end_location'].astype(str)\n", "df_sb_events['goalkeeper_end_location'] = df_sb_events['goalkeeper_end_location'].astype(str)\n", "df_sb_events['shot_end_location'] = df_sb_events['shot_end_location'].astype(str)\n", "\n", "##\n", "df_sb_events['location'] = df_sb_events['location'].str.replace('[','')\n", "df_sb_events['pass_end_location'] = df_sb_events['pass_end_location'].str.replace('[','')\n", "df_sb_events['carry_end_location'] = df_sb_events['carry_end_location'].str.replace('[','')\n", "df_sb_events['shot_end_location'] = df_sb_events['shot_end_location'].str.replace('[','')\n", "df_sb_events['goalkeeper_end_location'] = df_sb_events['goalkeeper_end_location'].str.replace('[','')\n", "\n", "##\n", "df_sb_events['location'] = df_sb_events['location'].str.replace(']','')\n", "df_sb_events['pass_end_location'] = df_sb_events['pass_end_location'].str.replace(']','')\n", "df_sb_events['carry_end_location'] = df_sb_events['carry_end_location'].str.replace(']','')\n", "df_sb_events['shot_end_location'] = df_sb_events['shot_end_location'].str.replace(']','')\n", "df_sb_events['goalkeeper_end_location'] = df_sb_events['goalkeeper_end_location'].str.replace(']','')\n", "\n", "##\n", "df_sb_events['location_x'], df_sb_events['location_y'] = df_sb_events['location'].str.split(',', 1).str\n", "df_sb_events['pass_end_location_x'], df_sb_events['pass_end_location_y'] = df_sb_events['pass_end_location'].str.split(',', 1).str\n", "df_sb_events['carry_end_location_x'], 
df_sb_events['carry_end_location_y'] = df_sb_events['carry_end_location'].str.split(',', 1).str\n", "df_sb_events['shot_end_location_x'], df_sb_events['shot_end_location_y'], df_sb_events['shot_end_location_z'] = df_sb_events['shot_end_location'].str.split(',', 3).str[0:3].str\n", "df_sb_events['goalkeeper_end_location_x'], df_sb_events['goalkeeper_end_location_y'] = df_sb_events['goalkeeper_end_location'].str.split(',', 1).str\n", "\n", "## Convert to float\n", "df_sb_events['location_x'] = df_sb_events['location_x'].astype(float)\n", "df_sb_events['location_y'] = df_sb_events['location_y'].astype(float)\n", "df_sb_events['pass_end_location_x'] = df_sb_events['pass_end_location_x'].astype(float)\n", "df_sb_events['pass_end_location_y'] = df_sb_events['pass_end_location_y'].astype(float)\n", "df_sb_events['carry_end_location_x'] = df_sb_events['carry_end_location_x'].astype(float)\n", "df_sb_events['carry_end_location_y'] = df_sb_events['carry_end_location_y'].astype(float)\n", "df_sb_events['shot_end_location_x'] = df_sb_events['shot_end_location_x'].astype(float)\n", "df_sb_events['shot_end_location_y'] = df_sb_events['shot_end_location_y'].astype(float)\n", "df_sb_events['goalkeeper_end_location_x'] = df_sb_events['goalkeeper_end_location_x'].astype(float)\n", "df_sb_events['goalkeeper_end_location_y'] = df_sb_events['goalkeeper_end_location_y'].astype(float)\n", "\n", "## Display DataFrame\n", "df_sb_events.head(10)" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(192686, 209)" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_sb_events.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 4.8. 
Create New Attributes\n", "\n", "Baseline attributes required for determining in-possession and out-of-possession metrics in a later section:\n", "* **Team**: the team or, in this case, the country that the player is playing for;\n", "* **Opponent**: the team or, in this case, the country that the player is playing against;\n", "* **Minutes played**: the number of minutes played; and\n", "* **Games played**: the total number of matches played (for the aggregated version only)." ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "scrolled": false }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " level_0 id index period \\\n", "128670 0 19edeac2-e63f-4795-8a8b-17a6e9fdb6e3 1 1 \n", "128671 1 89072e2e-b64f-4099-846b-b22cf000f9c7 2 1 \n", "128672 2 46c6901e-3b12-495a-b68a-19ca15798ed0 3 1 \n", "128673 3 9e5b0646-91cc-49a1-bf88-39bde773b949 4 1 \n", "130673 2003 5c2a7c70-becf-4520-856d-d80b55367c75 2004 2 \n", "\n", " timestamp minute second possession duration type_id \\\n", "128670 00:00:00.000 0 0 1 0.0 35 \n", "128671 00:00:00.000 0 0 1 0.0 35 \n", "128672 00:00:00.000 0 0 1 0.0 18 \n", "128673 00:00:00.000 0 0 1 0.0 18 \n", "130673 00:00:00.000 45 0 110 0.0 18 \n", "\n", " type_name possession_team_id possession_team_name play_pattern_id \\\n", "128670 Starting XI 909 Turkey 1 \n", "128671 Starting XI 909 Turkey 1 \n", "128672 Half Start 909 Turkey 1 \n", "128673 Half Start 909 Turkey 1 \n", "130673 Half Start 909 Turkey 3 \n", "\n", " play_pattern_name team_id team_name tactics_formation \\\n", "128670 Regular Play 909 Turkey 4141.0 \n", "128671 Regular Play 914 Italy 433.0 \n", "128672 Regular Play 914 Italy NaN \n", "128673 Regular Play 909 Turkey NaN \n", "130673 From Free Kick 914 Italy NaN \n", "\n", " tactics_lineup \\\n", "128670 [{'player': {'id': 30357, 'name': 'Uğurcan Çak... \n", "128671 [{'player': {'id': 7036, 'name': 'Gianluigi Do... 
\n", "128672 NaN \n", "128673 NaN \n", "130673 NaN \n", "\n", " related_events location player_id \\\n", "128670 NaN nan NaN \n", "128671 NaN nan NaN \n", "128672 ['9e5b0646-91cc-49a1-bf88-39bde773b949'] nan NaN \n", "128673 ['46c6901e-3b12-495a-b68a-19ca15798ed0'] nan NaN \n", "130673 ['e767b1ab-b5c8-4ac4-87c9-d7a884399777'] nan NaN \n", "\n", " player_name position_id position_name pass_recipient_id \\\n", "128670 NaN NaN NaN NaN \n", "128671 NaN NaN NaN NaN \n", "128672 NaN NaN NaN NaN \n", "128673 NaN NaN NaN NaN \n", "130673 NaN NaN NaN NaN \n", "\n", " pass_recipient_name pass_length pass_angle pass_height_id \\\n", "128670 NaN NaN NaN NaN \n", "128671 NaN NaN NaN NaN \n", "128672 NaN NaN NaN NaN \n", "128673 NaN NaN NaN NaN \n", "130673 NaN NaN NaN NaN \n", "\n", " pass_height_name pass_end_location pass_body_part_id \\\n", "128670 NaN nan NaN \n", "128671 NaN nan NaN \n", "128672 NaN nan NaN \n", "128673 NaN nan NaN \n", "130673 NaN nan NaN \n", "\n", " pass_body_part_name pass_type_id pass_type_name carry_end_location \\\n", "128670 NaN NaN NaN nan \n", "128671 NaN NaN NaN nan \n", "128672 NaN NaN NaN nan \n", "128673 NaN NaN NaN nan \n", "130673 NaN NaN NaN nan \n", "\n", " under_pressure duel_type_id duel_type_name pass_aerial_won \\\n", "128670 NaN NaN NaN NaN \n", "128671 NaN NaN NaN NaN \n", "128672 NaN NaN NaN NaN \n", "128673 NaN NaN NaN NaN \n", "130673 NaN NaN NaN NaN \n", "\n", " counterpress duel_outcome_id duel_outcome_name dribble_outcome_id \\\n", "128670 NaN NaN NaN NaN \n", "128671 NaN NaN NaN NaN \n", "128672 NaN NaN NaN NaN \n", "128673 NaN NaN NaN NaN \n", "130673 NaN NaN NaN NaN \n", "\n", " dribble_outcome_name pass_outcome_id pass_outcome_name \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " ball_receipt_outcome_id ball_receipt_outcome_name \\\n", "128670 NaN NaN \n", "128671 NaN NaN \n", "128672 NaN NaN \n", "128673 NaN NaN \n", "130673 NaN NaN 
\n", "\n", " interception_outcome_id interception_outcome_name shot_statsbomb_xg \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " shot_end_location shot_outcome_id shot_outcome_name shot_type_id \\\n", "128670 nan NaN NaN NaN \n", "128671 nan NaN NaN NaN \n", "128672 nan NaN NaN NaN \n", "128673 nan NaN NaN NaN \n", "130673 nan NaN NaN NaN \n", "\n", " shot_type_name shot_body_part_id shot_body_part_name \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " shot_technique_id shot_technique_name shot_freeze_frame \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " goalkeeper_end_location goalkeeper_type_id goalkeeper_type_name \\\n", "128670 nan NaN NaN \n", "128671 nan NaN NaN \n", "128672 nan NaN NaN \n", "128673 nan NaN NaN \n", "130673 nan NaN NaN \n", "\n", " goalkeeper_position_id goalkeeper_position_name out pass_outswinging \\\n", "128670 NaN NaN NaN NaN \n", "128671 NaN NaN NaN NaN \n", "128672 NaN NaN NaN NaN \n", "128673 NaN NaN NaN NaN \n", "130673 NaN NaN NaN NaN \n", "\n", " pass_technique_id pass_technique_name clearance_head \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " clearance_body_part_id clearance_body_part_name pass_switch \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " off_camera pass_cross clearance_left_foot dribble_overrun \\\n", "128670 NaN NaN NaN NaN \n", "128671 NaN NaN NaN NaN \n", "128672 NaN NaN NaN NaN \n", "128673 NaN NaN NaN NaN \n", "130673 NaN NaN NaN NaN \n", "\n", " dribble_nutmeg clearance_right_foot pass_no_touch \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN 
NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " foul_committed_advantage foul_won_advantage pass_assisted_shot_id \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " pass_shot_assist shot_key_pass_id shot_first_time clearance_other \\\n", "128670 NaN NaN NaN NaN \n", "128671 NaN NaN NaN NaN \n", "128672 NaN NaN NaN NaN \n", "128673 NaN NaN NaN NaN \n", "130673 NaN NaN NaN NaN \n", "\n", " pass_miscommunication clearance_aerial_won pass_through_ball \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " ball_recovery_recovery_failure goalkeeper_outcome_id \\\n", "128670 NaN NaN \n", "128671 NaN NaN \n", "128672 NaN NaN \n", "128673 NaN NaN \n", "130673 NaN NaN \n", "\n", " goalkeeper_outcome_name goalkeeper_body_part_id \\\n", "128670 NaN NaN \n", "128671 NaN NaN \n", "128672 NaN NaN \n", "128673 NaN NaN \n", "130673 NaN NaN \n", "\n", " goalkeeper_body_part_name shot_aerial_won foul_committed_card_id \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " foul_committed_card_name foul_committed_offensive foul_won_defensive \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " substitution_outcome_id substitution_outcome_name \\\n", "128670 NaN NaN \n", "128671 NaN NaN \n", "128672 NaN NaN \n", "128673 NaN NaN \n", "130673 NaN NaN \n", "\n", " substitution_replacement_id substitution_replacement_name \\\n", "128670 NaN NaN \n", "128671 NaN NaN \n", "128672 NaN NaN \n", "128673 NaN NaN \n", "130673 NaN NaN \n", "\n", " 50_50_outcome_id 50_50_outcome_name pass_goal_assist \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " 
goalkeeper_technique_id goalkeeper_technique_name pass_cut_back \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " miscontrol_aerial_won pass_straight foul_committed_type_id \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " foul_committed_type_name match_id pass_inswinging pass_deflected \\\n", "128670 NaN 3788741 NaN NaN \n", "128671 NaN 3788741 NaN NaN \n", "128672 NaN 3788741 NaN NaN \n", "128673 NaN 3788741 NaN NaN \n", "130673 NaN 3788741 NaN NaN \n", "\n", " injury_stoppage_in_chain shot_one_on_one bad_behaviour_card_id \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " bad_behaviour_card_name shot_deflected block_deflection \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " foul_committed_penalty foul_won_penalty block_save_block \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " goalkeeper_punched_out player_off_permanent shot_saved_off_target \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " goalkeeper_shot_saved_off_target shot_saved_to_post \\\n", "128670 NaN NaN \n", "128671 NaN NaN \n", "128672 NaN NaN \n", "128673 NaN NaN \n", "130673 NaN NaN \n", "\n", " goalkeeper_shot_saved_to_post shot_open_goal \\\n", "128670 NaN NaN \n", "128671 NaN NaN \n", "128672 NaN NaN \n", "128673 NaN NaN \n", "130673 NaN NaN \n", "\n", " goalkeeper_penalty_saved_to_post dribble_no_touch block_offensive \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " 
shot_follows_dribble ball_recovery_offensive shot_redirect \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " goalkeeper_lost_in_play goalkeeper_success_in_play match_date \\\n", "128670 NaN NaN 2021-06-11 \n", "128671 NaN NaN 2021-06-11 \n", "128672 NaN NaN 2021-06-11 \n", "128673 NaN NaN 2021-06-11 \n", "130673 NaN NaN 2021-06-11 \n", "\n", " kick_off home_score away_score match_status match_status_360 \\\n", "128670 21:00:00.000 0 3 available available \n", "128671 21:00:00.000 0 3 available available \n", "128672 21:00:00.000 0 3 available available \n", "128673 21:00:00.000 0 3 available available \n", "130673 21:00:00.000 0 3 available available \n", "\n", " last_updated last_updated_360 match_week \\\n", "128670 2021-06-12T12:49:02.070 2021-09-22T16:38:05.059090 1 \n", "128671 2021-06-12T12:49:02.070 2021-09-22T16:38:05.059090 1 \n", "128672 2021-06-12T12:49:02.070 2021-09-22T16:38:05.059090 1 \n", "128673 2021-06-12T12:49:02.070 2021-09-22T16:38:05.059090 1 \n", "130673 2021-06-12T12:49:02.070 2021-09-22T16:38:05.059090 1 \n", "\n", " competition_competition_id competition_country_name \\\n", "128670 55 Europe \n", "128671 55 Europe \n", "128672 55 Europe \n", "128673 55 Europe \n", "130673 55 Europe \n", "\n", " competition_competition_name season_season_id season_season_name \\\n", "128670 UEFA Euro 43 2020 \n", "128671 UEFA Euro 43 2020 \n", "128672 UEFA Euro 43 2020 \n", "128673 UEFA Euro 43 2020 \n", "130673 UEFA Euro 43 2020 \n", "\n", " home_team_home_team_id home_team_home_team_name \\\n", "128670 909 Turkey \n", "128671 909 Turkey \n", "128672 909 Turkey \n", "128673 909 Turkey \n", "130673 909 Turkey \n", "\n", " home_team_home_team_gender home_team_home_team_group \\\n", "128670 male Group A \n", "128671 male Group A \n", "128672 male Group A \n", "128673 male Group A \n", "130673 male Group A \n", "\n", " home_team_country_id home_team_country_name \\\n", 
"128670 233 Turkey \n", "128671 233 Turkey \n", "128672 233 Turkey \n", "128673 233 Turkey \n", "130673 233 Turkey \n", "\n", " home_team_managers \\\n", "128670 [{'id': 701, 'name': 'Şenol Güneş', 'nickname'... \n", "128671 [{'id': 701, 'name': 'Şenol Güneş', 'nickname'... \n", "128672 [{'id': 701, 'name': 'Şenol Güneş', 'nickname'... \n", "128673 [{'id': 701, 'name': 'Şenol Güneş', 'nickname'... \n", "130673 [{'id': 701, 'name': 'Şenol Güneş', 'nickname'... \n", "\n", " away_team_away_team_id away_team_away_team_name \\\n", "128670 914 Italy \n", "128671 914 Italy \n", "128672 914 Italy \n", "128673 914 Italy \n", "130673 914 Italy \n", "\n", " away_team_away_team_gender away_team_away_team_group \\\n", "128670 male Group A \n", "128671 male Group A \n", "128672 male Group A \n", "128673 male Group A \n", "130673 male Group A \n", "\n", " away_team_country_id away_team_country_name \\\n", "128670 112 Italy \n", "128671 112 Italy \n", "128672 112 Italy \n", "128673 112 Italy \n", "130673 112 Italy \n", "\n", " away_team_managers \\\n", "128670 [{'id': 2997, 'name': 'Roberto Mancini', 'nick... \n", "128671 [{'id': 2997, 'name': 'Roberto Mancini', 'nick... \n", "128672 [{'id': 2997, 'name': 'Roberto Mancini', 'nick... \n", "128673 [{'id': 2997, 'name': 'Roberto Mancini', 'nick... \n", "130673 [{'id': 2997, 'name': 'Roberto Mancini', 'nick... 
\n", "\n", " metadata_data_version metadata_shot_fidelity_version \\\n", "128670 1.1.0 2 \n", "128671 1.1.0 2 \n", "128672 1.1.0 2 \n", "128673 1.1.0 2 \n", "130673 1.1.0 2 \n", "\n", " metadata_xy_fidelity_version competition_stage_id \\\n", "128670 2 10 \n", "128671 2 10 \n", "128672 2 10 \n", "128673 2 10 \n", "130673 2 10 \n", "\n", " competition_stage_name stadium_id stadium_name \\\n", "128670 Group Stage 381 Stadio Olimpico (Roma) \n", "128671 Group Stage 381 Stadio Olimpico (Roma) \n", "128672 Group Stage 381 Stadio Olimpico (Roma) \n", "128673 Group Stage 381 Stadio Olimpico (Roma) \n", "130673 Group Stage 381 Stadio Olimpico (Roma) \n", "\n", " stadium_country_id stadium_country_name referee_id \\\n", "128670 112 Italy 293 \n", "128671 112 Italy 293 \n", "128672 112 Italy 293 \n", "128673 112 Italy 293 \n", "130673 112 Italy 293 \n", "\n", " referee_name referee_country_id referee_country_name \\\n", "128670 Danny Desmond Makkelie 160 Netherlands \n", "128671 Danny Desmond Makkelie 160 Netherlands \n", "128672 Danny Desmond Makkelie 160 Netherlands \n", "128673 Danny Desmond Makkelie 160 Netherlands \n", "130673 Danny Desmond Makkelie 160 Netherlands \n", "\n", " competition_id season_id country_name competition_name \\\n", "128670 55 43 Europe UEFA Euro \n", "128671 55 43 Europe UEFA Euro \n", "128672 55 43 Europe UEFA Euro \n", "128673 55 43 Europe UEFA Euro \n", "130673 55 43 Europe UEFA Euro \n", "\n", " competition_gender competition_youth competition_international \\\n", "128670 male False True \n", "128671 male False True \n", "128672 male False True \n", "128673 male False True \n", "130673 male False True \n", "\n", " season_name match_updated match_updated_360 \\\n", "128670 2020 2021-11-11T14:00:16.105809 2021-11-11T13:54:37.507376 \n", "128671 2020 2021-11-11T14:00:16.105809 2021-11-11T13:54:37.507376 \n", "128672 2020 2021-11-11T14:00:16.105809 2021-11-11T13:54:37.507376 \n", "128673 2020 2021-11-11T14:00:16.105809 2021-11-11T13:54:37.507376 
\n", "130673 2020 2021-11-11T14:00:16.105809 2021-11-11T13:54:37.507376 \n", "\n", " match_available_360 match_available \\\n", "128670 2021-11-11T13:54:37.507376 2021-11-11T14:00:16.105809 \n", "128671 2021-11-11T13:54:37.507376 2021-11-11T14:00:16.105809 \n", "128672 2021-11-11T13:54:37.507376 2021-11-11T14:00:16.105809 \n", "128673 2021-11-11T13:54:37.507376 2021-11-11T14:00:16.105809 \n", "130673 2021-11-11T13:54:37.507376 2021-11-11T14:00:16.105809 \n", "\n", " Full_Fixture_Date location_x location_y \\\n", "128670 2021-06-11 Turkey 0 vs. 3 Italy NaN NaN \n", "128671 2021-06-11 Turkey 0 vs. 3 Italy NaN NaN \n", "128672 2021-06-11 Turkey 0 vs. 3 Italy NaN NaN \n", "128673 2021-06-11 Turkey 0 vs. 3 Italy NaN NaN \n", "130673 2021-06-11 Turkey 0 vs. 3 Italy NaN NaN \n", "\n", " pass_end_location_x pass_end_location_y carry_end_location_x \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " carry_end_location_y shot_end_location_x shot_end_location_y \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " shot_end_location_z goalkeeper_end_location_x \\\n", "128670 NaN NaN \n", "128671 NaN NaN \n", "128672 NaN NaN \n", "128673 NaN NaN \n", "130673 NaN NaN \n", "\n", " goalkeeper_end_location_y Team Opponent next_event \\\n", "128670 NaN Turkey Italy Starting XI \n", "128671 NaN Italy Turkey Half Start \n", "128672 NaN Italy Turkey Half Start \n", "128673 NaN Turkey Italy Half Start \n", "130673 NaN Italy Turkey Half Start \n", "\n", " previous_event next_team_possession previous_team_possession \\\n", "128670 NaN Turkey NaN \n", "128671 Starting XI Turkey Turkey \n", "128672 Starting XI Turkey Turkey \n", "128673 Half Start Turkey Turkey \n", "130673 Half Start Turkey Turkey \n", "\n", " possession_retained endloc_x endloc_y dist1 dist2 diffdist \n", "128670 1 NaN NaN NaN NaN NaN \n", "128671 1 
NaN NaN NaN NaN NaN \n", "128672 1 NaN NaN NaN NaN NaN \n", "128673 1 NaN NaN NaN NaN NaN \n", "130673 1 NaN NaN NaN NaN NaN " ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Engineer team/opponent labels, neighbouring-event context, and distance-to-goal features\n", "\n", "# Team performing the event and the opposing team\n", "df_sb_events['Team'] = np.where(df_sb_events['team_name'] == df_sb_events['home_team_home_team_name'], df_sb_events['home_team_home_team_name'], df_sb_events['away_team_away_team_name'])\n", "df_sb_events['Opponent'] = np.where(df_sb_events['team_name'] == df_sb_events['away_team_away_team_name'], df_sb_events['home_team_home_team_name'], df_sb_events['away_team_away_team_name'])\n", "\n", "# Previous/next event type and possession team (shift() is applied to the whole DataFrame, not per match)\n", "df_sb_events['next_event'] = df_sb_events['type_name'].shift(-1)\n", "df_sb_events['previous_event'] = df_sb_events['type_name'].shift(+1)\n", "df_sb_events['next_team_possession'] = df_sb_events['possession_team_name'].shift(-1)\n", "df_sb_events['previous_team_possession'] = df_sb_events['possession_team_name'].shift(+1)\n", "df_sb_events['possession_retained'] = np.where((df_sb_events['possession_team_name'] == df_sb_events['next_team_possession']), 1, 0)\n", "\n", "# End location: pass end for passes, carry end for carries, otherwise the event's own location\n", "df_sb_events['endloc_x'] = np.where(df_sb_events['type_name'] == 'Pass', df_sb_events['pass_end_location_x'], np.where(df_sb_events['type_name'] == 'Carry', df_sb_events['carry_end_location_x'], df_sb_events['location_x']))\n", "df_sb_events['endloc_y'] = np.where(df_sb_events['type_name'] == 'Pass', df_sb_events['pass_end_location_y'], np.where(df_sb_events['type_name'] == 'Carry', df_sb_events['carry_end_location_y'], df_sb_events['location_y']))\n", "\n", "# Distance to the centre of the goal at (120, 40) before (dist1) and after (dist2) the action, and the distance gained (diffdist)\n", "df_sb_events['dist1'] = np.sqrt((df_sb_events['location_x'] - 120)**2 + (df_sb_events['location_y'] - 40)**2)\n", "df_sb_events['dist2'] = np.sqrt((df_sb_events['endloc_x'] - 120)**2 + (df_sb_events['endloc_y'] - 40)**2)\n", "df_sb_events['diffdist'] = df_sb_events['dist1'] - df_sb_events['dist2']\n", "\n", "# Display DataFrame\n", "df_sb_events.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 4.9. 
Create In-Possession Metrics\n", "\n", "The in-possession metrics (passes and carries) created in this section include the following:\n", "* **Open Play Passes p90**: the number of attempted passes in open play, per 90 minutes (p90);\n", "* **Pass Completion %**: the number of completed passes divided by the number of attempted passes;\n", "* **Being Pressured Change in Pass %**: how pass completion % changes under pressure, calculated as Pressured Pass % minus Pass %;\n", "* **Deep Progressions p90**: the number of passes and dribbles/carries into the opposition final third, p90;\n", "* **xGBuildup p90**: xG Chain is the total xG of every possession the player is involved in; xG Buildup is the same minus shots and key passes. To determine this: (1) find all the possessions each player is involved in; (2) find all the shots within those possessions; (3) sum their xG (you might take the highest xG per possession, or you might treat the shots as dependent events); and (4) assign that sum to each player, however involved they were;\n", "* **Carries p90**: the number of carries, defined as when a player controls the ball at their feet while moving or standing still, p90;\n", "* **Carry %**: the percentage of a player's carries that were successful; and\n", "* **Carry Length p90**: average carry length, p90.\n", "\n", "Progressive passes and carries are defined as actions that move the ball at least 25% closer to the opposition goal or that get the ball into the penalty box." ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
DataFrame preview: first 5 rows of df_sb_events (index 128670, 128671, 128672, 128673, 130673; match_id 3788741, Turkey 0 vs. 3 Italy, 2021-06-11). The HTML table markup was lost in extraction; the same rows are shown in the text/plain output that follows. Columns appended after diffdist in this step: Passes, Successful Passes, Short Passes, Successful Short Passes, Medium Passes, Successful Medium Passes, Long Passes, Successful Long Passes, Final Third Passes, Successful Final Third Passes, Penalty Area Passes, Successful Penalty Area Passes, Under Pressure Passes, Successful Under Pressure Passes, Throughballs, Successful Throughballs, Switches, Successful Switches, Crosses, Successful Crosses, Penalty Area Crosses, Successful Penalty Area Crosses, Progressive Passes, Successful Progressive Passes, Pass Progressive Distance, Carries, Final Third Carries, Progressive Carries, Carry Distance, Carry Progressive Distance. All of these are 0 for the five preview rows (Starting XI and Half Start events).\n", "
" ], "text/plain": [ " level_0 id index period \\\n", "128670 0 19edeac2-e63f-4795-8a8b-17a6e9fdb6e3 1 1 \n", "128671 1 89072e2e-b64f-4099-846b-b22cf000f9c7 2 1 \n", "128672 2 46c6901e-3b12-495a-b68a-19ca15798ed0 3 1 \n", "128673 3 9e5b0646-91cc-49a1-bf88-39bde773b949 4 1 \n", "130673 2003 5c2a7c70-becf-4520-856d-d80b55367c75 2004 2 \n", "\n", " timestamp minute second possession duration type_id \\\n", "128670 00:00:00.000 0 0 1 0.0 35 \n", "128671 00:00:00.000 0 0 1 0.0 35 \n", "128672 00:00:00.000 0 0 1 0.0 18 \n", "128673 00:00:00.000 0 0 1 0.0 18 \n", "130673 00:00:00.000 45 0 110 0.0 18 \n", "\n", " type_name possession_team_id possession_team_name play_pattern_id \\\n", "128670 Starting XI 909 Turkey 1 \n", "128671 Starting XI 909 Turkey 1 \n", "128672 Half Start 909 Turkey 1 \n", "128673 Half Start 909 Turkey 1 \n", "130673 Half Start 909 Turkey 3 \n", "\n", " play_pattern_name team_id team_name tactics_formation \\\n", "128670 Regular Play 909 Turkey 4141.0 \n", "128671 Regular Play 914 Italy 433.0 \n", "128672 Regular Play 914 Italy NaN \n", "128673 Regular Play 909 Turkey NaN \n", "130673 From Free Kick 914 Italy NaN \n", "\n", " tactics_lineup \\\n", "128670 [{'player': {'id': 30357, 'name': 'Uğurcan Çak... \n", "128671 [{'player': {'id': 7036, 'name': 'Gianluigi Do... 
\n", "128672 NaN \n", "128673 NaN \n", "130673 NaN \n", "\n", " related_events location player_id \\\n", "128670 NaN nan NaN \n", "128671 NaN nan NaN \n", "128672 ['9e5b0646-91cc-49a1-bf88-39bde773b949'] nan NaN \n", "128673 ['46c6901e-3b12-495a-b68a-19ca15798ed0'] nan NaN \n", "130673 ['e767b1ab-b5c8-4ac4-87c9-d7a884399777'] nan NaN \n", "\n", " player_name position_id position_name pass_recipient_id \\\n", "128670 NaN NaN NaN NaN \n", "128671 NaN NaN NaN NaN \n", "128672 NaN NaN NaN NaN \n", "128673 NaN NaN NaN NaN \n", "130673 NaN NaN NaN NaN \n", "\n", " pass_recipient_name pass_length pass_angle pass_height_id \\\n", "128670 NaN NaN NaN NaN \n", "128671 NaN NaN NaN NaN \n", "128672 NaN NaN NaN NaN \n", "128673 NaN NaN NaN NaN \n", "130673 NaN NaN NaN NaN \n", "\n", " pass_height_name pass_end_location pass_body_part_id \\\n", "128670 NaN nan NaN \n", "128671 NaN nan NaN \n", "128672 NaN nan NaN \n", "128673 NaN nan NaN \n", "130673 NaN nan NaN \n", "\n", " pass_body_part_name pass_type_id pass_type_name carry_end_location \\\n", "128670 NaN NaN NaN nan \n", "128671 NaN NaN NaN nan \n", "128672 NaN NaN NaN nan \n", "128673 NaN NaN NaN nan \n", "130673 NaN NaN NaN nan \n", "\n", " under_pressure duel_type_id duel_type_name pass_aerial_won \\\n", "128670 NaN NaN NaN NaN \n", "128671 NaN NaN NaN NaN \n", "128672 NaN NaN NaN NaN \n", "128673 NaN NaN NaN NaN \n", "130673 NaN NaN NaN NaN \n", "\n", " counterpress duel_outcome_id duel_outcome_name dribble_outcome_id \\\n", "128670 NaN NaN NaN NaN \n", "128671 NaN NaN NaN NaN \n", "128672 NaN NaN NaN NaN \n", "128673 NaN NaN NaN NaN \n", "130673 NaN NaN NaN NaN \n", "\n", " dribble_outcome_name pass_outcome_id pass_outcome_name \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " ball_receipt_outcome_id ball_receipt_outcome_name \\\n", "128670 NaN NaN \n", "128671 NaN NaN \n", "128672 NaN NaN \n", "128673 NaN NaN \n", "130673 NaN NaN 
\n", "\n", " interception_outcome_id interception_outcome_name shot_statsbomb_xg \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " shot_end_location shot_outcome_id shot_outcome_name shot_type_id \\\n", "128670 nan NaN NaN NaN \n", "128671 nan NaN NaN NaN \n", "128672 nan NaN NaN NaN \n", "128673 nan NaN NaN NaN \n", "130673 nan NaN NaN NaN \n", "\n", " shot_type_name shot_body_part_id shot_body_part_name \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " shot_technique_id shot_technique_name shot_freeze_frame \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " goalkeeper_end_location goalkeeper_type_id goalkeeper_type_name \\\n", "128670 nan NaN NaN \n", "128671 nan NaN NaN \n", "128672 nan NaN NaN \n", "128673 nan NaN NaN \n", "130673 nan NaN NaN \n", "\n", " goalkeeper_position_id goalkeeper_position_name out pass_outswinging \\\n", "128670 NaN NaN NaN NaN \n", "128671 NaN NaN NaN NaN \n", "128672 NaN NaN NaN NaN \n", "128673 NaN NaN NaN NaN \n", "130673 NaN NaN NaN NaN \n", "\n", " pass_technique_id pass_technique_name clearance_head \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " clearance_body_part_id clearance_body_part_name pass_switch \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " off_camera pass_cross clearance_left_foot dribble_overrun \\\n", "128670 NaN NaN NaN NaN \n", "128671 NaN NaN NaN NaN \n", "128672 NaN NaN NaN NaN \n", "128673 NaN NaN NaN NaN \n", "130673 NaN NaN NaN NaN \n", "\n", " dribble_nutmeg clearance_right_foot pass_no_touch \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN 
NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " foul_committed_advantage foul_won_advantage pass_assisted_shot_id \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " pass_shot_assist shot_key_pass_id shot_first_time clearance_other \\\n", "128670 NaN NaN NaN NaN \n", "128671 NaN NaN NaN NaN \n", "128672 NaN NaN NaN NaN \n", "128673 NaN NaN NaN NaN \n", "130673 NaN NaN NaN NaN \n", "\n", " pass_miscommunication clearance_aerial_won pass_through_ball \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " ball_recovery_recovery_failure goalkeeper_outcome_id \\\n", "128670 NaN NaN \n", "128671 NaN NaN \n", "128672 NaN NaN \n", "128673 NaN NaN \n", "130673 NaN NaN \n", "\n", " goalkeeper_outcome_name goalkeeper_body_part_id \\\n", "128670 NaN NaN \n", "128671 NaN NaN \n", "128672 NaN NaN \n", "128673 NaN NaN \n", "130673 NaN NaN \n", "\n", " goalkeeper_body_part_name shot_aerial_won foul_committed_card_id \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " foul_committed_card_name foul_committed_offensive foul_won_defensive \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " substitution_outcome_id substitution_outcome_name \\\n", "128670 NaN NaN \n", "128671 NaN NaN \n", "128672 NaN NaN \n", "128673 NaN NaN \n", "130673 NaN NaN \n", "\n", " substitution_replacement_id substitution_replacement_name \\\n", "128670 NaN NaN \n", "128671 NaN NaN \n", "128672 NaN NaN \n", "128673 NaN NaN \n", "130673 NaN NaN \n", "\n", " 50_50_outcome_id 50_50_outcome_name pass_goal_assist \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " 
goalkeeper_technique_id goalkeeper_technique_name pass_cut_back \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " miscontrol_aerial_won pass_straight foul_committed_type_id \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " foul_committed_type_name match_id pass_inswinging pass_deflected \\\n", "128670 NaN 3788741 NaN NaN \n", "128671 NaN 3788741 NaN NaN \n", "128672 NaN 3788741 NaN NaN \n", "128673 NaN 3788741 NaN NaN \n", "130673 NaN 3788741 NaN NaN \n", "\n", " injury_stoppage_in_chain shot_one_on_one bad_behaviour_card_id \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " bad_behaviour_card_name shot_deflected block_deflection \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " foul_committed_penalty foul_won_penalty block_save_block \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " goalkeeper_punched_out player_off_permanent shot_saved_off_target \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " goalkeeper_shot_saved_off_target shot_saved_to_post \\\n", "128670 NaN NaN \n", "128671 NaN NaN \n", "128672 NaN NaN \n", "128673 NaN NaN \n", "130673 NaN NaN \n", "\n", " goalkeeper_shot_saved_to_post shot_open_goal \\\n", "128670 NaN NaN \n", "128671 NaN NaN \n", "128672 NaN NaN \n", "128673 NaN NaN \n", "130673 NaN NaN \n", "\n", " goalkeeper_penalty_saved_to_post dribble_no_touch block_offensive \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " 
shot_follows_dribble ball_recovery_offensive shot_redirect \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " goalkeeper_lost_in_play goalkeeper_success_in_play match_date \\\n", "128670 NaN NaN 2021-06-11 \n", "128671 NaN NaN 2021-06-11 \n", "128672 NaN NaN 2021-06-11 \n", "128673 NaN NaN 2021-06-11 \n", "130673 NaN NaN 2021-06-11 \n", "\n", " kick_off home_score away_score match_status match_status_360 \\\n", "128670 21:00:00.000 0 3 available available \n", "128671 21:00:00.000 0 3 available available \n", "128672 21:00:00.000 0 3 available available \n", "128673 21:00:00.000 0 3 available available \n", "130673 21:00:00.000 0 3 available available \n", "\n", " last_updated last_updated_360 match_week \\\n", "128670 2021-06-12T12:49:02.070 2021-09-22T16:38:05.059090 1 \n", "128671 2021-06-12T12:49:02.070 2021-09-22T16:38:05.059090 1 \n", "128672 2021-06-12T12:49:02.070 2021-09-22T16:38:05.059090 1 \n", "128673 2021-06-12T12:49:02.070 2021-09-22T16:38:05.059090 1 \n", "130673 2021-06-12T12:49:02.070 2021-09-22T16:38:05.059090 1 \n", "\n", " competition_competition_id competition_country_name \\\n", "128670 55 Europe \n", "128671 55 Europe \n", "128672 55 Europe \n", "128673 55 Europe \n", "130673 55 Europe \n", "\n", " competition_competition_name season_season_id season_season_name \\\n", "128670 UEFA Euro 43 2020 \n", "128671 UEFA Euro 43 2020 \n", "128672 UEFA Euro 43 2020 \n", "128673 UEFA Euro 43 2020 \n", "130673 UEFA Euro 43 2020 \n", "\n", " home_team_home_team_id home_team_home_team_name \\\n", "128670 909 Turkey \n", "128671 909 Turkey \n", "128672 909 Turkey \n", "128673 909 Turkey \n", "130673 909 Turkey \n", "\n", " home_team_home_team_gender home_team_home_team_group \\\n", "128670 male Group A \n", "128671 male Group A \n", "128672 male Group A \n", "128673 male Group A \n", "130673 male Group A \n", "\n", " home_team_country_id home_team_country_name \\\n", 
"128670 233 Turkey \n", "128671 233 Turkey \n", "128672 233 Turkey \n", "128673 233 Turkey \n", "130673 233 Turkey \n", "\n", " home_team_managers \\\n", "128670 [{'id': 701, 'name': 'Şenol Güneş', 'nickname'... \n", "128671 [{'id': 701, 'name': 'Şenol Güneş', 'nickname'... \n", "128672 [{'id': 701, 'name': 'Şenol Güneş', 'nickname'... \n", "128673 [{'id': 701, 'name': 'Şenol Güneş', 'nickname'... \n", "130673 [{'id': 701, 'name': 'Şenol Güneş', 'nickname'... \n", "\n", " away_team_away_team_id away_team_away_team_name \\\n", "128670 914 Italy \n", "128671 914 Italy \n", "128672 914 Italy \n", "128673 914 Italy \n", "130673 914 Italy \n", "\n", " away_team_away_team_gender away_team_away_team_group \\\n", "128670 male Group A \n", "128671 male Group A \n", "128672 male Group A \n", "128673 male Group A \n", "130673 male Group A \n", "\n", " away_team_country_id away_team_country_name \\\n", "128670 112 Italy \n", "128671 112 Italy \n", "128672 112 Italy \n", "128673 112 Italy \n", "130673 112 Italy \n", "\n", " away_team_managers \\\n", "128670 [{'id': 2997, 'name': 'Roberto Mancini', 'nick... \n", "128671 [{'id': 2997, 'name': 'Roberto Mancini', 'nick... \n", "128672 [{'id': 2997, 'name': 'Roberto Mancini', 'nick... \n", "128673 [{'id': 2997, 'name': 'Roberto Mancini', 'nick... \n", "130673 [{'id': 2997, 'name': 'Roberto Mancini', 'nick... 
\n", "\n", " metadata_data_version metadata_shot_fidelity_version \\\n", "128670 1.1.0 2 \n", "128671 1.1.0 2 \n", "128672 1.1.0 2 \n", "128673 1.1.0 2 \n", "130673 1.1.0 2 \n", "\n", " metadata_xy_fidelity_version competition_stage_id \\\n", "128670 2 10 \n", "128671 2 10 \n", "128672 2 10 \n", "128673 2 10 \n", "130673 2 10 \n", "\n", " competition_stage_name stadium_id stadium_name \\\n", "128670 Group Stage 381 Stadio Olimpico (Roma) \n", "128671 Group Stage 381 Stadio Olimpico (Roma) \n", "128672 Group Stage 381 Stadio Olimpico (Roma) \n", "128673 Group Stage 381 Stadio Olimpico (Roma) \n", "130673 Group Stage 381 Stadio Olimpico (Roma) \n", "\n", " stadium_country_id stadium_country_name referee_id \\\n", "128670 112 Italy 293 \n", "128671 112 Italy 293 \n", "128672 112 Italy 293 \n", "128673 112 Italy 293 \n", "130673 112 Italy 293 \n", "\n", " referee_name referee_country_id referee_country_name \\\n", "128670 Danny Desmond Makkelie 160 Netherlands \n", "128671 Danny Desmond Makkelie 160 Netherlands \n", "128672 Danny Desmond Makkelie 160 Netherlands \n", "128673 Danny Desmond Makkelie 160 Netherlands \n", "130673 Danny Desmond Makkelie 160 Netherlands \n", "\n", " competition_id season_id country_name competition_name \\\n", "128670 55 43 Europe UEFA Euro \n", "128671 55 43 Europe UEFA Euro \n", "128672 55 43 Europe UEFA Euro \n", "128673 55 43 Europe UEFA Euro \n", "130673 55 43 Europe UEFA Euro \n", "\n", " competition_gender competition_youth competition_international \\\n", "128670 male False True \n", "128671 male False True \n", "128672 male False True \n", "128673 male False True \n", "130673 male False True \n", "\n", " season_name match_updated match_updated_360 \\\n", "128670 2020 2021-11-11T14:00:16.105809 2021-11-11T13:54:37.507376 \n", "128671 2020 2021-11-11T14:00:16.105809 2021-11-11T13:54:37.507376 \n", "128672 2020 2021-11-11T14:00:16.105809 2021-11-11T13:54:37.507376 \n", "128673 2020 2021-11-11T14:00:16.105809 2021-11-11T13:54:37.507376 
\n", "130673 2020 2021-11-11T14:00:16.105809 2021-11-11T13:54:37.507376 \n", "\n", " match_available_360 match_available \\\n", "128670 2021-11-11T13:54:37.507376 2021-11-11T14:00:16.105809 \n", "128671 2021-11-11T13:54:37.507376 2021-11-11T14:00:16.105809 \n", "128672 2021-11-11T13:54:37.507376 2021-11-11T14:00:16.105809 \n", "128673 2021-11-11T13:54:37.507376 2021-11-11T14:00:16.105809 \n", "130673 2021-11-11T13:54:37.507376 2021-11-11T14:00:16.105809 \n", "\n", " Full_Fixture_Date location_x location_y \\\n", "128670 2021-06-11 Turkey 0 vs. 3 Italy NaN NaN \n", "128671 2021-06-11 Turkey 0 vs. 3 Italy NaN NaN \n", "128672 2021-06-11 Turkey 0 vs. 3 Italy NaN NaN \n", "128673 2021-06-11 Turkey 0 vs. 3 Italy NaN NaN \n", "130673 2021-06-11 Turkey 0 vs. 3 Italy NaN NaN \n", "\n", " pass_end_location_x pass_end_location_y carry_end_location_x \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " carry_end_location_y shot_end_location_x shot_end_location_y \\\n", "128670 NaN NaN NaN \n", "128671 NaN NaN NaN \n", "128672 NaN NaN NaN \n", "128673 NaN NaN NaN \n", "130673 NaN NaN NaN \n", "\n", " shot_end_location_z goalkeeper_end_location_x \\\n", "128670 NaN NaN \n", "128671 NaN NaN \n", "128672 NaN NaN \n", "128673 NaN NaN \n", "130673 NaN NaN \n", "\n", " goalkeeper_end_location_y Team Opponent next_event \\\n", "128670 NaN Turkey Italy Starting XI \n", "128671 NaN Italy Turkey Half Start \n", "128672 NaN Italy Turkey Half Start \n", "128673 NaN Turkey Italy Half Start \n", "130673 NaN Italy Turkey Half Start \n", "\n", " previous_event next_team_possession previous_team_possession \\\n", "128670 NaN Turkey NaN \n", "128671 Starting XI Turkey Turkey \n", "128672 Starting XI Turkey Turkey \n", "128673 Half Start Turkey Turkey \n", "130673 Half Start Turkey Turkey \n", "\n", " possession_retained endloc_x endloc_y dist1 dist2 diffdist \\\n", "128670 1 NaN NaN NaN NaN NaN \n", "128671 1 
NaN NaN NaN NaN NaN \n", "128672 1 NaN NaN NaN NaN NaN \n", "128673 1 NaN NaN NaN NaN NaN \n", "130673 1 NaN NaN NaN NaN NaN \n", "\n", " Passes Successful Passes Short Passes Successful Short Passes \\\n", "128670 0 0 0 0 \n", "128671 0 0 0 0 \n", "128672 0 0 0 0 \n", "128673 0 0 0 0 \n", "130673 0 0 0 0 \n", "\n", " Medium Passes Successful Medium Passes Long Passes \\\n", "128670 0 0 0 \n", "128671 0 0 0 \n", "128672 0 0 0 \n", "128673 0 0 0 \n", "130673 0 0 0 \n", "\n", " Successful Long Passes Final Third Passes \\\n", "128670 0 0 \n", "128671 0 0 \n", "128672 0 0 \n", "128673 0 0 \n", "130673 0 0 \n", "\n", " Successful Final Third Passes Penalty Area Passes \\\n", "128670 0 0 \n", "128671 0 0 \n", "128672 0 0 \n", "128673 0 0 \n", "130673 0 0 \n", "\n", " Successful Penalty Area Passes Under Pressure Passes \\\n", "128670 0 0 \n", "128671 0 0 \n", "128672 0 0 \n", "128673 0 0 \n", "130673 0 0 \n", "\n", " Successful Under Pressure Passes Throughballs \\\n", "128670 0 0 \n", "128671 0 0 \n", "128672 0 0 \n", "128673 0 0 \n", "130673 0 0 \n", "\n", " Successful Throughballs Switches Successful Switches Crosses \\\n", "128670 0 0 0 0 \n", "128671 0 0 0 0 \n", "128672 0 0 0 0 \n", "128673 0 0 0 0 \n", "130673 0 0 0 0 \n", "\n", " Successful Crosses Penalty Area Crosses \\\n", "128670 0 0 \n", "128671 0 0 \n", "128672 0 0 \n", "128673 0 0 \n", "130673 0 0 \n", "\n", " Successful Penalty Area Crosses Progressive Passes \\\n", "128670 0 0 \n", "128671 0 0 \n", "128672 0 0 \n", "128673 0 0 \n", "130673 0 0 \n", "\n", " Successful Progressive Passes Pass Progressive Distance Carries \\\n", "128670 0 0.0 0 \n", "128671 0 0.0 0 \n", "128672 0 0.0 0 \n", "128673 0 0.0 0 \n", "130673 0 0.0 0 \n", "\n", " Final Third Carries Progressive Carries Carry Distance \\\n", "128670 0 0 0.0 \n", "128671 0 0 0.0 \n", "128672 0 0 0.0 \n", "128673 0 0 0.0 \n", "130673 0 0 0.0 \n", "\n", " Carry Progressive Distance \n", "128670 0.0 \n", "128671 0.0 \n", "128672 0.0 \n", "128673 0.0 
\n", "130673 0.0 " ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Create in-possession metrics\n", "\n", "## Define masks\n", "pass_mask = df_sb_events['type_name'] == 'Pass'\n", "success_mask = df_sb_events.pass_outcome_name.isna()\n", "openplay_mask = df_sb_events['pass_type_name'].isna()\n", "shortpass_mask = (df_sb_events.pass_length >= 5) & (df_sb_events.pass_length < 15)\n", "mediumpass_mask = (df_sb_events.pass_length >= 15) & (df_sb_events.pass_length < 30)\n", "longpass_mask = (df_sb_events.pass_length >= 30)\n", "finalthird_mask = (df_sb_events.endloc_x > 80) & (df_sb_events.location_x <= 80)\n", "penaltyarea_mask = (df_sb_events.endloc_x > 102) & (np.abs(df_sb_events.endloc_y - 40) < 22)\n", "pressure_mask = df_sb_events.under_pressure == True\n", "throughball_mask = df_sb_events.pass_through_ball == True\n", "switch_mask = df_sb_events.pass_switch == True\n", "cross_mask = df_sb_events.pass_cross == True\n", "dist_mask = (df_sb_events['dist1'] - df_sb_events['dist2'])/df_sb_events['dist1'] > 0.25\n", "## Note: `~` binds tighter than `&`, so box_mask selects events that start at or before x = 102 and within the width of the penalty area (|y - 40| < 22)\n", "box_mask = ~(df_sb_events.location_x > 102) & (np.abs(df_sb_events.location_y - 40) < 22)\n", "prog_mask = dist_mask | (box_mask & penaltyarea_mask)\n", "carry_mask = df_sb_events.type_name == 'Carry'\n", "\n", "\n", "## Apply defined masks\n", "\n", "### Passes\n", "df_sb_events['Passes'] = np.where(pass_mask, 1, 0)\n", "df_sb_events['Successful Passes'] = np.where(pass_mask & success_mask, 1, 0)\n", "df_sb_events['Short Passes'] = np.where(pass_mask & shortpass_mask, 1, 0)\n", "df_sb_events['Successful Short Passes'] = np.where((df_sb_events['Short Passes']==1) & success_mask, 1, 0)\n", "df_sb_events['Medium Passes'] = np.where(pass_mask & mediumpass_mask, 1, 0)\n", "df_sb_events['Successful Medium Passes'] = np.where((df_sb_events['Medium Passes']==1) & success_mask, 1, 0)\n", "df_sb_events['Long Passes'] = np.where(pass_mask & longpass_mask, 1, 0)\n", "df_sb_events['Successful Long Passes'] = 
np.where((df_sb_events['Long Passes']==1) & success_mask, 1, 0)\n", "df_sb_events['Final Third Passes'] = np.where(pass_mask & finalthird_mask & openplay_mask, 1, 0)\n", "df_sb_events['Successful Final Third Passes'] = np.where((df_sb_events['Final Third Passes']==1) & success_mask, 1, 0)\n", "df_sb_events['Penalty Area Passes'] = np.where(pass_mask & penaltyarea_mask & openplay_mask, 1, 0)\n", "df_sb_events['Successful Penalty Area Passes'] = np.where((df_sb_events['Penalty Area Passes']==1) & success_mask, 1, 0)\n", "df_sb_events['Under Pressure Passes'] = np.where(pass_mask & pressure_mask, 1, 0)\n", "df_sb_events['Successful Under Pressure Passes'] = np.where(pass_mask & pressure_mask & success_mask, 1, 0)\n", "df_sb_events['Throughballs'] = np.where(throughball_mask, 1, 0)\n", "df_sb_events['Successful Throughballs'] = np.where(throughball_mask & success_mask, 1, 0)\n", "df_sb_events['Switches'] = np.where(switch_mask, 1, 0)\n", "df_sb_events['Successful Switches'] = np.where(switch_mask & success_mask, 1, 0)\n", "df_sb_events['Crosses'] = np.where(cross_mask, 1, 0)\n", "df_sb_events['Successful Crosses'] = np.where(cross_mask & success_mask, 1, 0)\n", "df_sb_events['Penalty Area Crosses'] = np.where(cross_mask & penaltyarea_mask & openplay_mask, 1, 0)\n", "df_sb_events['Successful Penalty Area Crosses'] = np.where(cross_mask & penaltyarea_mask & openplay_mask & success_mask,\n", " 1,0)\n", "### Progressive Passes\n", "df_sb_events['Progressive Passes'] = np.where(pass_mask & prog_mask, 1, 0)\n", "df_sb_events['Successful Progressive Passes'] = np.where(pass_mask & prog_mask & success_mask, 1, 0)\n", "df_sb_events['Pass Progressive Distance'] = np.where(pass_mask & (df_sb_events.diffdist > 0), df_sb_events.diffdist, 0)\n", "\n", "### Carries\n", "df_sb_events['Carries'] = np.where(carry_mask, 1, 0)\n", "df_sb_events['Final Third Carries'] = np.where(carry_mask & finalthird_mask, 1, 0)\n", "df_sb_events['Progressive Carries'] = np.where(carry_mask & prog_mask, 
 1, 0)\n", "df_sb_events['Carry Distance'] = np.where(carry_mask, np.sqrt((df_sb_events.location_x - df_sb_events.endloc_x)**2 + (df_sb_events.location_y - df_sb_events.endloc_y)**2), 0)\n", "df_sb_events['Carry Progressive Distance'] = np.where(carry_mask & (df_sb_events.diffdist > 0), df_sb_events.diffdist, 0)\n", "\n", "\n", "## Display DataFrame\n", "df_sb_events.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that the in-possession stats for passing and carrying have been created, the next stage is to aggregate these stats per player." ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " player_name Passes Successful Passes Short Passes \\\n", "0 Aaron Ramsey 154 118 65 \n", "1 Adam Hložek 25 16 11 \n", "2 Adama Traoré Diarra 16 14 7 \n", "3 Admir Mehmedi 25 20 17 \n", "4 Adrien Rabiot 202 176 101 \n", "\n", " Successful Short Passes Medium Passes Successful Medium Passes \\\n", "0 52 52 45 \n", "1 8 10 7 \n", "2 7 5 5 \n", "3 14 4 4 \n", "4 92 83 72 \n", "\n", " Long Passes Successful Long Passes Final Third Passes \\\n", "0 33 21 18 \n", "1 4 1 4 \n", "2 3 2 0 \n", "3 1 1 2 \n", "4 12 10 20 \n", "\n", " Successful Final Third Passes Penalty Area Passes \\\n", "0 11 16 \n", "1 0 5 \n", "2 0 1 \n", "3 2 2 \n", "4 17 10 \n", "\n", " Successful Penalty Area Passes Under Pressure Passes \\\n", "0 7 29 \n", "1 3 3 \n", "2 0 3 \n", "3 0 7 \n", "4 3 41 \n", "\n", " Successful Under Pressure Passes Throughballs Successful Throughballs \\\n", "0 21 0 0 \n", "1 1 0 0 \n", "2 3 0 0 \n", "3 6 0 0 \n", "4 32 1 0 \n", "\n", " Switches Successful Switches Crosses Successful Crosses \\\n", "0 11 6 6 1 \n", "1 1 1 3 2 \n", "2 2 1 2 1 \n", "3 1 1 0 0 \n", "4 2 2 4 0 \n", "\n", " Penalty Area Crosses Successful Penalty Area Crosses Progressive Passes \\\n", "0 4 1 33 \n", "1 3 2 10 \n", "2 0 0 0 \n", "3 0 0 3 \n", "4 4 0 16 \n", "\n", " Successful Progressive Passes Total Pass Length \\\n", "0 15 3215.313361 \n", "1 5 484.649314 \n", "2 0 329.780347 \n", "3 0 296.842837 \n", "4 8 3228.600479 \n", "\n", " Pass Progressive Distance Carries Final Third Carries \\\n", "0 1090.839803 128 5 \n", "1 204.505469 22 2 \n", "2 26.341535 19 3 \n", "3 95.074445 19 0 \n", "4 1035.295274 177 11 \n", "\n", " Progressive Carries Carry Distance Carry Progressive Distance \n", "0 5 786.272278 311.926128 \n", "1 5 216.430147 149.074994 \n", "2 6 244.339460 194.735133 \n", "3 0 88.004891 17.880685 \n", "4 13 1133.280008 614.950937 " ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#\n", "\n", "##\n", "dict_onball_agg = 
 {'Passes':'sum',\n", "                   'Successful Passes':'sum',\n", "                   'Short Passes':'sum',\n", "                   'Successful Short Passes':'sum',\n", "                   'Medium Passes':'sum',\n", "                   'Successful Medium Passes':'sum',\n", "                   'Long Passes':'sum',\n", "                   'Successful Long Passes':'sum',\n", "                   'Final Third Passes':'sum',\n", "                   'Successful Final Third Passes':'sum',\n", "                   'Penalty Area Passes':'sum',\n", "                   'Successful Penalty Area Passes':'sum',\n", "                   'Under Pressure Passes':'sum',\n", "                   'Successful Under Pressure Passes':'sum',\n", "                   'Throughballs':'sum',\n", "                   'Successful Throughballs':'sum',\n", "                   'Switches':'sum',\n", "                   'Successful Switches':'sum',\n", "                   'Crosses':'sum',\n", "                   'Successful Crosses':'sum',\n", "                   'Penalty Area Crosses':'sum',\n", "                   'Successful Penalty Area Crosses':'sum',\n", "                   'Progressive Passes':'sum',\n", "                   'Successful Progressive Passes':'sum',\n", "                   'pass_length':'sum',\n", "                   'Pass Progressive Distance':'sum',\n", "                   'Carries':'sum',\n", "                   'Final Third Carries':'sum',\n", "                   'Progressive Carries':'sum',\n", "                   'Carry Distance':'sum',\n", "                   'Carry Progressive Distance':'sum'\n", "                  }\n", "\n", "## Group by player and sum each metric\n", "df_sb_events_grouped_passing_carrying = df_sb_events.groupby('player_name').agg(dict_onball_agg).reset_index()\n", "\n", "## Rename the summed pass_length column\n", "df_sb_events_grouped_passing_carrying.rename(columns={'pass_length':'Total Pass Length'}, errors='raise', inplace=True)\n", "\n", "## Display DataFrame\n", "df_sb_events_grouped_passing_carrying.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 4.10. 
Determine Expected Threat\n", "\n", "The" ] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [], "source": [ "touchbasedactions = ['Pass',\n", " 'Carry',\n", " 'Dribble',\n", " 'Foul Won',\n", " 'Interception',\n", " 'Duel',\n", " '50/50',\n", " 'Ball Recovery',\n", " 'Dispossessed',\n", " 'Block',\n", " 'Clearance',\n", " 'Miscontrol',\n", " 'Goal Keeper',\n", " 'Shot'\n", " ]\n", "\n", "df_sb_events['isTouch'] = np.where(df_sb_events.type_name.isin(touchbasedactions), 1, 0)\n", "\n", "df_sb_events = df_sb_events[(df_sb_events.isTouch==1)].reset_index(drop=True)\n", "\n", "binx = [10*i for i in range(13)]\n", "biny = [10*i for i in range(9)]\n", "\n", "for cols in ['location_x','endloc_x']:\n", " s = pd.cut(df_sb_events[cols], bins=binx, include_lowest=True)\n", " df_sb_events['zone_'+cols] = pd.Series(data=pd.IntervalIndex(s).right, index = s.index)/10\n", " \n", "for cols in ['location_y','endloc_y']:\n", " s = pd.cut(df_sb_events[cols], bins=biny, include_lowest=True)\n", " df_sb_events['zone_'+cols] = pd.Series(data=pd.IntervalIndex(s).right, index = s.index)/10\n", "\n", "df_sb_events['zone_start'] = df_sb_events['zone_location_x'] + (df_sb_events['zone_location_y']-1)*12\n", "df_sb_events['zone_end'] = df_sb_events['zone_endloc_x'] + (df_sb_events['zone_endloc_y']-1)*12\n", "\n", "xtd = np.array([[0.00638303, 0.00779616, 0.00844854, 0.00977659, 0.01126267,\n", " 0.01248344, 0.01473596, 0.0174506 , 0.02122129, 0.02756312,\n", " 0.03485072, 0.0379259 ],\n", " [0.00750072, 0.00878589, 0.00942382, 0.0105949 , 0.01214719,\n", " 0.0138454 , 0.01611813, 0.01870347, 0.02401521, 0.02953272,\n", " 0.04066992, 0.04647721],\n", " [0.0088799 , 0.00977745, 0.01001304, 0.01110462, 0.01269174,\n", " 0.01429128, 0.01685596, 0.01935132, 0.0241224 , 0.02855202,\n", " 0.05491138, 0.06442595],\n", " [0.00941056, 0.01082722, 0.01016549, 0.01132376, 0.01262646,\n", " 0.01484598, 0.01689528, 0.0199707 , 0.02385149, 0.03511326,\n", " 0.10805102, 
 0.25745362],\n", " [0.00941056, 0.01082722, 0.01016549, 0.01132376, 0.01262646,\n", " 0.01484598, 0.01689528, 0.0199707 , 0.02385149, 0.03511326,\n", " 0.10805102, 0.25745362],\n", " [0.0088799 , 0.00977745, 0.01001304, 0.01110462, 0.01269174,\n", " 0.01429128, 0.01685596, 0.01935132, 0.0241224 , 0.02855202,\n", " 0.05491138, 0.06442595],\n", " [0.00750072, 0.00878589, 0.00942382, 0.0105949 , 0.01214719,\n", " 0.0138454 , 0.01611813, 0.01870347, 0.02401521, 0.02953272,\n", " 0.04066992, 0.04647721],\n", " [0.00638303, 0.00779616, 0.00844854, 0.00977659, 0.01126267,\n", " 0.01248344, 0.01473596, 0.0174506 , 0.02122129, 0.02756312,\n", " 0.03485072, 0.0379259 ]]).flatten()\n", "\n", "## Map each zone number (1 to 96) to its xT value\n", "startXTdf_sb_events = pd.DataFrame(data=xtd,columns=['xT_start'])\n", "startXTdf_sb_events['zone_start'] = [i+1 for i in range(96)]\n", "endXTdf_sb_events = pd.DataFrame(data=xtd,columns=['xT_end'])\n", "endXTdf_sb_events['zone_end'] = [i+1 for i in range(96)]\n", "\n", "## Attach the start-zone and end-zone xT to each event, then take the difference\n", "df_sb_events = df_sb_events.merge(startXTdf_sb_events, on=['zone_start'], how='left')\n", "df_sb_events = df_sb_events.merge(endXTdf_sb_events, on=['zone_end'], how='left')\n", "df_sb_events['xT'] = df_sb_events['xT_end'] - df_sb_events['xT_start']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that the xT values have been calculated, the next stage is to aggregate them per player." ] }, { "cell_type": "code", "execution_count": 51, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " player_name xT xT Facilitated\n", "0 Aaron Ramsey 0.850962 0.608308\n", "1 Adam Hložek 0.208467 0.053769\n", "2 Adama Traoré Diarra 0.059728 -0.022413\n", "3 Admir Mehmedi 0.004645 -0.004682\n", "4 Adrien Rabiot 0.425950 0.341875\n", ".. ... ... ...\n", "484 İbrahim Halil Dervişoğlu 0.038171 0.022885\n", "485 İlkay Gündoğan 0.049985 0.207556\n", "486 İrfan Can Kahveci 0.091662 0.024946\n", "487 Ľubomír Šatka 0.124838 0.191737\n", "488 Šime Vrsaljko 0.381311 0.377846\n", "\n", "[489 rows x 3 columns]" ] }, "execution_count": 51, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_sb_events['xT'] = np.where(df_sb_events.type_name.isin(['Pass','Carry']) & df_sb_events.pass_outcome_name.isna(),df_sb_events.xT,0)\n", "\n", "df_sb_events['xT Facilitated'] = np.where(df_sb_events.team_name==df_sb_events.team_name.shift(-1), df_sb_events.xT.shift(-1).fillna(value=0), 0)\n", "\n", "df_sb_events_grouped_xt = (df_sb_events\n", " .groupby('player_name')\n", " .agg({'xT':'sum',\n", " 'xT Facilitated':'sum'\n", " }\n", " )\n", " .reset_index()\n", " )\n", "\n", "## Display DataFrame\n", "df_sb_events_grouped_xt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 4.11. 
Determine xGChain and xGBuildup\n", "\n", "The" ] }, { "cell_type": "code", "execution_count": 52, "metadata": {}, "outputs": [], "source": [ "def player_xgc(match_id):\n", " # xGChain: credit each Pass/Carry with the combined xG of the open-play shots in its possession\n", " gamedf = df[(df.match_id==match_id)&(df.period<=4)].reset_index(drop=True)\n", " typemask = gamedf.type_name == 'Shot'\n", " openplay = gamedf.shot_type_name == 'Open Play'\n", " sameteam = gamedf.team_name == gamedf.possession_team_name\n", " gamedf['OPS'] = np.where(typemask & openplay & sameteam,1,0)\n", " gamedf['oneminusxG'] = 1.0 - gamedf['shot_statsbomb_xg']\n", " aggdict = {'OPS':'sum','oneminusxG':np.prod}\n", " grouped = gamedf[gamedf.OPS==1].groupby(['team_name','possession']).agg(aggdict).reset_index()\n", " # xGCond = 1 - prod(1 - xG): chance that at least one shot in the possession is scored\n", " grouped['oneminusxG'] = 1.0 - grouped['oneminusxG']\n", " grouped.rename(columns={'oneminusxG':'xGCond'},inplace=True)\n", " grouped.drop(columns='OPS',inplace=True)\n", " gamedf = gamedf.merge(grouped,how='left')\n", " # Assign the result back instead of calling fillna(inplace=True) on a column selection\n", " gamedf['xGCond'] = gamedf['xGCond'].fillna(value=0)\n", " gamedf['xGCond'] = np.where(gamedf.type_name.isin(['Pass','Carry']),gamedf.xGCond,0)\n", " groupdf = gamedf.groupby(['player_name','possession']).agg({'xGCond':'mean'}).reset_index()\n", " return groupdf" ] }, { "cell_type": "code", "execution_count": 53, "metadata": {}, "outputs": [], "source": [ "def player_xgb(match_id):\n", " # xGBuildup: same as player_xgc, but shot assists and goal assists receive no credit\n", " gamedf = df[(df.match_id==match_id)&(df.period<=4)].reset_index(drop=True)\n", " typemask = gamedf.type_name == 'Shot'\n", " openplay = gamedf.shot_type_name == 'Open Play'\n", " sameteam = gamedf.team_name == gamedf.possession_team_name\n", " gamedf['OPS'] = np.where(typemask & openplay & sameteam,1,0)\n", " gamedf['oneminusxG'] = 1.0 - gamedf['shot_statsbomb_xg']\n", " aggdict = {'OPS':'sum','oneminusxG':np.prod}\n", " grouped = gamedf[gamedf.OPS==1].groupby(['team_name','possession']).agg(aggdict).reset_index()\n", " grouped['oneminusxG'] = 1.0 - grouped['oneminusxG']\n", " grouped.rename(columns={'oneminusxG':'xGCond'},inplace=True)\n", " 
grouped.drop(columns='OPS',inplace=True)\n", " gamedf = gamedf.merge(grouped,how='left')\n", " # Assign the result back instead of calling fillna(inplace=True) on a column selection\n", " gamedf['xGCond'] = gamedf['xGCond'].fillna(value=0)\n", " gamedf['xGCond'] = np.where(gamedf.type_name.isin(['Pass','Carry']),gamedf.xGCond,0)\n", " # Zero out shot assists and goal assists so only the build-up actions keep their value\n", " gamedf.loc[(gamedf.pass_shot_assist==True)|(gamedf.pass_goal_assist==True),\n", " 'xGCond'] = 0\n", " groupdf = gamedf.groupby(['player_name','possession']).agg({'xGCond':'mean'}).reset_index()\n", " return groupdf" ] }, { "cell_type": "code", "execution_count": 54, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Reading Games: 100%|████████████████████████████| 51/51 [00:21<00:00, 2.33it/s]\n" ] } ], "source": [ "xgcdfs = []\n", "xgbdfs = []\n", "\n", "# player_xgc and player_xgb read the events from the global name df\n", "df = df_sb_events\n", "\n", "for g in tqdm(df.match_id.unique(), desc='Reading Games'):\n", " xgcdfs.append(player_xgc(g))\n", " xgbdfs.append(player_xgb(g))\n", " \n", "xgcdf = pd.concat(xgcdfs, ignore_index=True)\n", "xgbdf = pd.concat(xgbdfs, ignore_index=True)" ] }, { "cell_type": "code", "execution_count": 55, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(489, 489)" ] }, "execution_count": 55, "metadata": {}, "output_type": "execute_result" } ], "source": [ "xgbdf.rename(columns={'xGCond':'xGBuildup'}, inplace=True)\n", "xgcdf.rename(columns={'xGCond':'xGChain'}, inplace=True)\n", "\n", "df_sb_events_grouped_xgbuildup = xgbdf.groupby('player_name').xGBuildup.sum().reset_index()\n", "df_sb_events_grouped_xgchain = xgcdf.groupby('player_name').xGChain.sum().reset_index()\n", "len(df_sb_events_grouped_xgbuildup), len(df_sb_events_grouped_xgchain)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Combine xGChain and xGBuildup into a single per-player DataFrame." ] }, { "cell_type": "code", "execution_count": 56, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "489" ] }, "execution_count": 56, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_sb_events_grouped_xg = 
df_sb_events_grouped_xgbuildup.merge(df_sb_events_grouped_xgchain, how='left')\n", "len(df_sb_events_grouped_xg)" ] }, { "cell_type": "code", "execution_count": 57, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " player_name xGBuildup xGChain\n", "0 Aaron Ramsey 1.289348 1.369214\n", "1 Adam Hložek 0.254720 0.367832\n", "2 Adama Traoré Diarra 0.042583 0.042583\n", "3 Admir Mehmedi 0.010521 0.010521\n", "4 Adrien Rabiot 2.131149 2.165416\n", ".. ... ... ...\n", "484 İbrahim Halil Dervişoğlu 0.080879 0.080879\n", "485 İlkay Gündoğan 2.183368 2.183368\n", "486 İrfan Can Kahveci 0.447084 0.493820\n", "487 Ľubomír Šatka 0.216438 0.222530\n", "488 Šime Vrsaljko 0.719440 0.725648\n", "\n", "[489 rows x 3 columns]" ] }, "execution_count": 57, "metadata": {}, "output_type": "execute_result" } ], "source": [ "## Display DataFrame\n", "df_sb_events_grouped_xg" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 4.12. Determine Out-of-Possession Metrics\n", "Determine all out-of-possession metrics including dribbles, turnovers and pressures, the latter using the in-possession sequences determined in the previous section.\n", "\n", "Defensive metrics are possession-adjusted to mitigate the possession-heavy style effects of certain teams.\n", "\n", "The out-of-possession metrics (tackles, interceptions, turnovers, and pressures) that are created in the following section include following:\n", "* **Tackles & Dribbles Past p90**: ...\n", "* **Dribblers tackled %**: the number of dribblers tackled divided by dribblers tackled plus times dribbled past;\n", "* **Aerial Wins %**: the percentage of aerial battles won divided by total aerial battles;\n", "* **Aerial Wins p90**: the number of aerial duels a player wins, per 90 minutes;\n", "* **Fouls p90**: the number of fouls per 90 minutes\n", "* **Pressures p90**: the number of times applying pressure to opposing player who is receiving, carrying or releasing the ball\n", "* **Pressured Long Balls**: \n", "* **Unpressured Long Balls**: the number of completed long balls while not under pressure per 90.\n", "* **Tackles p90**: the number of tackles per 90 minutes (ideally this would be pAdj Tackles, the 
number of tackles adjusted proportionally to the possession volume of a team. Unfortunately, in the time available for this task, this is difficult to determine);\n", "* **Interceptions**: the number of interceptions per 90 minutes (ideally this would be pAdj Interceptions, the number of interceptions adjusted proportionally to the possession volume of a team. Unfortunately, in the time available for this task, this is difficult to determine); \n", "* **Average Defensive Action Distance**: the average distance from the goal line that the player successfully makes a defensive action;\n", "* **Clearances p90**: the number of times a player makes a clearance or plays a long ball while under pressure, per 90 minutes;\n", "* **Blocks p90**: the number of blocks, per 90 minutes. A 'block' is defined as blocking the ball by standing in its path; and\n", "* **Blocks/Shot p90**: the number of blocks made per shot faced, per 90 minutes. " ] }, { "cell_type": "code", "execution_count": 58, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Initial number : 100\n", "Suspicious cases : 14\n", "Final number :98\n" ] } ], "source": [ "\n", "defenders = df_sb_events[df_sb_events.position_name.isin(['Left Center Back', 'Right Center Back', 'Center Back'])].player_name.unique()\n", "\n", "print(\"Initial number : \"+str(len(defenders)))\n", "\n", "flagnames = ['Francisco Javier Calvo Quesada',\n", " 'Joshua Kimmich',\n", " 'Luis Carlos Tejada Hansell',\n", " 'Michael Lang',\n", " 'Nicolás Alejandro Tagliafico',\n", " 'Gabriel Iván Mercado',\n", " 'Hörður Björgvin Magnússon',\n", " 'Birkir Már Sævarsson',\n", " 'Fedor Kudryashov',\n", " 'Éver Maximiliano David Banega',\n", " 'Edson Omar Álvarez Velázquez',\n", " 'Marcus Rashford',\n", " 'İlkay Gündoğan',\n", " 'Dylan Bronn'\n", " ]\n", "\n", "print(\"Suspicious cases : \"+str(len(flagnames)))\n", "\n", "defenders = list(set(defenders) - set(flagnames))\n", "\n", "print(\"Final number 
:\"+str(len(defenders)))" ] }, { "cell_type": "code", "execution_count": 59, "metadata": {}, "outputs": [], "source": [ "def game_poss(match_id):\n", " gamedf = df[(df.match_id==match_id)&(df.period<=4)].reset_index(drop=True)\n", " team1 = gamedf.team_name[0]\n", " team2 = gamedf.team_name[1]\n", " gamedf['time_seconds'] = gamedf['minute']*60 + gamedf['second']\n", " gamedf['Successful Pressures'] = 0\n", " passes1 = len(gamedf[(gamedf.team_name==team1)&(gamedf.type_name=='Pass')]) \n", " passes2 = len(gamedf[(gamedf.team_name==team2)&(gamedf.type_name=='Pass')]) \n", " poss1 = round(passes1*100/(passes1+passes2))\n", " poss2 = 100 - poss1\n", " tacklemask = gamedf.duel_type_name=='Tackle'\n", " tacklesuccess = gamedf.duel_outcome_name.isin(['Success In Play', 'Won',\n", " 'Success Out'])\n", " interceptmask = gamedf.type_name == 'Interception'\n", " interceptsuccess = gamedf.interception_outcome_name.isin(['Success In Play', 'Won',\n", " 'Success Out'])\n", " dribbled_past = gamedf.type_name == 'Dribbled Past'\n", " fouls = gamedf.type_name == 'Foul Committed'\n", " aerialL = gamedf.duel_type_name=='Aerial Lost'\n", " aerialW = gamedf.pass_aerial_won.notna() | gamedf.shot_aerial_won.notna() | \\\n", " gamedf.clearance_aerial_won.notna() | gamedf.miscontrol_aerial_won.notna() \n", " blocks = gamedf.type_name == 'Block'\n", " passblock = gamedf.block_offensive.isna() & gamedf.block_deflection.isna() &\\\n", " gamedf.block_save_block.isna()\n", " pressures = gamedf.type_name=='Pressure'\n", " pressuredf = gamedf[pressures]\n", " for indx in list(pressuredf.index):\n", " t = pressuredf['time_seconds'][indx]\n", " possession_team_name = pressuredf['possession_team_name'][indx]\n", " \n", " if t+5>=gamedf.time_seconds.max():\n", " t_end = gamedf.time_seconds.max()\n", " else:\n", " t_end = t+5\n", " \n", " index_after_five_seconds = list(gamedf[(gamedf.time_seconds>=t) & \n", " (gamedf.time_seconds<=t_end)].index)\n", " possession_teams = 
gamedf['possession_team_name'][index_after_five_seconds].unique().tolist()\n", " \n", " if len(possession_teams) == 2:\n", " gamedf.loc[indx,'Successful Pressures'] = 1\n", " successful_dribbles = gamedf.dribble_outcome_name == 'Complete'\n", " failed_dribbles = gamedf.dribble_outcome_name == 'Incomplete'\n", " miscontrols = gamedf.type_name == 'Miscontrol'\n", " dispossessions = gamedf.type_name == 'Dispossessed'\n", "\n", " gamedf['Tackles'] = np.where(tacklemask, 1, 0)\n", " gamedf['Tackles Won'] = np.where(tacklesuccess, 1, 0)\n", " gamedf['Interceptions'] = np.where(interceptmask, 1, 0)\n", " gamedf['Interceptions Won'] = np.where(interceptsuccess, 1, 0)\n", " gamedf['Dribbled Past'] = np.where(dribbled_past,1,0)\n", " gamedf['Fouls'] = np.where(fouls,1,0)\n", " gamedf['Aerial Challenges Lost'] = np.where(aerialL,1,0)\n", " gamedf['Aerial Challenges Won'] = np.where(aerialW,1,0)\n", " gamedf['Blocks'] = np.where(blocks,1,0)\n", " gamedf['Blocked Passes'] = np.where(blocks & passblock,1,0)\n", " gamedf['Pressures'] = np.where(pressures,1,0)\n", " gamedf['Successful Dribbles'] = np.where(successful_dribbles,1,0)\n", " gamedf['Failed Dribbles'] = np.where(failed_dribbles,1,0)\n", " gamedf['Miscontrols'] = np.where(miscontrols,1,0)\n", " gamedf['Dispossessions'] = np.where(dispossessions,1,0)\n", " gamedf['Ball Recovery'] = np.where(gamedf.type_name=='Ball Recovery',1,0)\n", " gamedf['Clearances'] = np.where(gamedf.type_name=='Clearance',1,0)\n", "\n", " aggdict = {'Tackles':'sum', 'Tackles Won':'sum','Interceptions':'sum',\n", " 'Interceptions Won':'sum','Dribbled Past':'sum','Fouls':'sum',\n", " 'Aerial Challenges Lost':'sum','Aerial Challenges Won':'sum',\n", " 'Blocks':'sum','Blocked Passes':'sum','Pressures':'sum',\n", " 'Successful Pressures':'sum','Successful Dribbles':'sum',\n", " 'Failed Dribbles':'sum','Miscontrols':'sum','Dispossessions':'sum',\n", " 'Ball Recovery':'sum','Clearances':'sum'}\n", "\n", " groupedstats = 
gamedf.groupby(['player_name','team_name']).agg(aggdict).reset_index()\n", " groupedstats = groupedstats.sort_values(by=['team_name','Successful Pressures'],\n", " ascending=False).reset_index(drop=True)\n", " #groupedstats.rename(columns={\"player_name\": \"name\",\"team_name\":'team'}, errors=\"raise\",inplace=True)\n", " groupedstats['Possession %'] = np.where(groupedstats.team_name==team1,poss1,poss2) \n", " groupedstats['True Tackles'] = groupedstats['Tackles'] + groupedstats['Fouls'] + \\\n", " groupedstats['Dribbled Past']\n", " groupedstats['True Tackle Win%'] = groupedstats['Tackles']*100/groupedstats['True Tackles']\n", " groupedstats['True Interceptions'] = groupedstats['Interceptions'] + \\\n", " groupedstats['Blocked Passes']\n", " groupedstats['Defensive Acts'] = groupedstats['Tackles'] + groupedstats['Interceptions'] + \\\n", " groupedstats['Clearances'] + groupedstats['Ball Recovery'] + \\\n", " groupedstats['Blocks']\n", " return groupedstats" ] }, { "cell_type": "code", "execution_count": 60, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Reading all games: 100%|████████████████████████| 51/51 [00:02<00:00, 19.91it/s]\n" ] } ], "source": [ "groupgamedfs = []\n", "\n", "for game in tqdm(df.match_id.unique(),desc='Reading all games'):\n", " groupgamedfs.append(game_poss(game))\n", "\n", "groupgamedfs = pd.concat(groupgamedfs,ignore_index=True)" ] }, { "cell_type": "code", "execution_count": 61, "metadata": {}, "outputs": [], "source": [ "df_sb_events_grouped_defending = groupgamedfs" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The final step is to aggregate the defensive stats per player." ] }, { "cell_type": "code", "execution_count": 62, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " player_name team_name Padj_Defensive Acts Turnovers \\\n", "0 Aaron Ramsey Wales 21.892675 13 \n", "1 Adam Hložek Czech Republic 5.126062 6 \n", "2 Adama Traoré Diarra Spain 3.328074 3 \n", "3 Admir Mehmedi Switzerland 5.815748 3 \n", "4 Adrien Rabiot France 36.342057 9 \n", ".. ... ... ... ... \n", "484 İbrahim Halil Dervişoğlu Turkey 1.965380 1 \n", "485 İlkay Gündoğan Germany 37.793995 2 \n", "486 İrfan Can Kahveci Turkey 11.965380 5 \n", "487 Ľubomír Šatka Slovakia 31.639326 1 \n", "488 Šime Vrsaljko Croatia 26.000000 5 \n", "\n", " Aerial Challenges Aerial Win % True Tackle Win% Padj_Pressures \\\n", "0 12 27.083333 100.000000 0.0 \n", "1 6 75.000000 0.000000 0.0 \n", "2 0 0.000000 0.000000 0.0 \n", "3 0 0.000000 50.000000 0.0 \n", "4 5 41.666667 100.000000 0.0 \n", ".. ... ... ... ... \n", "484 4 25.000000 0.000000 0.0 \n", "485 1 0.000000 66.666667 0.0 \n", "486 1 0.000000 33.333333 0.0 \n", "487 18 47.619048 66.666667 0.0 \n", "488 7 50.000000 100.000000 0.0 \n", "\n", " Padj_Successful Pressures Dribbles \n", "0 0.0 5 \n", "1 0.0 1 \n", "2 0.0 5 \n", "3 0.0 0 \n", "4 0.0 2 \n", ".. ... ... 
\n", "484 0.0 1 \n", "485 0.0 0 \n", "486 0.0 3 \n", "487 0.0 0 \n", "488 0.0 4 \n", "\n", "[489 rows x 10 columns]" ] }, "execution_count": 62, "metadata": {}, "output_type": "execute_result" } ], "source": [ "for cols in ['Tackles', 'True Interceptions', 'Pressures', 'Successful Pressures', 'Defensive Acts']:\n", " df_sb_events_grouped_defending['Padj_'+cols] = 2.0*df_sb_events_grouped_defending[cols]/(1.0 + np.exp(-0.1*(df_sb_events_grouped_defending['Possession %']-50)))\n", "#df_sb_events_grouped_defending['Padj Defensive Actions'] = df_sb_events_grouped_defending['Padj_Tackles'] + df_sb_events_grouped_defending['Padj_True Interceptions']\n", "\n", "df_sb_events_grouped_defending['Turnovers'] = df_sb_events_grouped_defending['Failed Dribbles'] + df_sb_events_grouped_defending['Miscontrols'] + df_sb_events_grouped_defending['Dispossessions']\n", "df_sb_events_grouped_defending['Dribbles'] = df_sb_events_grouped_defending['Successful Dribbles']+df_sb_events_grouped_defending['Failed Dribbles']\n", "df_sb_events_grouped_defending['Dribble Success %'] = df_sb_events_grouped_defending['Successful Dribbles']*100/df_sb_events_grouped_defending['Dribbles']\n", "df_sb_events_grouped_defending['Aerial Challenges'] = df_sb_events_grouped_defending['Aerial Challenges Lost'] + df_sb_events_grouped_defending['Aerial Challenges Won']\n", "df_sb_events_grouped_defending['Aerial Win %'] = df_sb_events_grouped_defending['Aerial Challenges Won']*100/df_sb_events_grouped_defending['Aerial Challenges']\n", "df_sb_events_grouped_defending = df_sb_events_grouped_defending.fillna(value=0)\n", "\n", "aggdict = {'Padj_Defensive Acts':'sum',\n", " 'Turnovers':'sum',\n", " 'Aerial Challenges':'sum',\n", " 'Aerial Win %':'mean',\n", " 'True Tackle Win%':'mean',\n", " 'Padj_Pressures':'sum',\n", " 'Padj_Successful Pressures':'sum',\n", " 'Dribbles':'sum'\n", " }\n", "\n", "df_sb_events_grouped_defending = df_sb_events_grouped_defending.groupby(['player_name', 
'team_name']).agg(aggdict).reset_index()\n", "\n", "## Display DataFrame\n", "df_sb_events_grouped_defending" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "\n", "\n", "## 5. Modeling\n", "The following section creates an Expected Pass model that is used to evaluate `Pass Completion Above Expected`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 5.1. ..." ] }, { "cell_type": "code", "execution_count": 63, "metadata": {}, "outputs": [], "source": [ "# Extract open play passes from Events DataFrame\n", "df_sb_events = df_sb_events.query(\"(type_name == 'Pass') & (pass_type_name not in ['Free Kick', 'Corner', 'Throw-in', 'Kick Off'])\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "At this stage, the event-level DataFrame undergoes no further data engineering; all of the following steps operate on the aggregated DataFrame. A copy can therefore be exported now." ] }, { "cell_type": "code", "execution_count": 74, "metadata": {}, "outputs": [], "source": [ "# Export a copy of the event-level DataFrame\n", "df_sb_events.to_csv(data_dir + '/export/sb_360_events.csv', index=None, header=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 5.2. Feature Engineering" ] }, { "cell_type": "code", "execution_count": 64, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " duration angle pass_body_part_name location_x location_y endloc_x \\\n", "4 1.396321 1.770789 Right Foot 49.3 43.4 44.8 \n", "5 2.476772 0.920982 Left Foot 36.2 30.4 70.7 \n", "7 1.513611 1.936746 Right Foot 45.5 63.2 35.0 \n", "11 1.478865 1.201462 Right Foot 36.2 34.6 42.2 \n", "13 1.126932 1.001164 Right Foot 47.9 11.8 53.6 \n", "\n", " endloc_y pass_cross pass_cut_back pass_deflected pass_height_name \\\n", "4 65.6 0 0 0 Ground Pass \n", "5 75.8 0 0 0 High Pass \n", "7 35.8 0 0 0 Ground Pass \n", "11 19.1 0 0 0 Ground Pass \n", "13 2.9 0 0 0 Ground Pass \n", "\n", " distance pass_switch pass_through_ball play_pattern_name \\\n", "4 22.651488 0 0 From Kick Off \n", "5 57.021142 1 0 From Kick Off \n", "7 29.342968 0 0 From Kick Off \n", "11 16.620774 0 0 From Kick Off \n", "13 10.568827 0 0 From Kick Off \n", "\n", " under_pressure success \n", "4 0 1.0 \n", "5 0 1.0 \n", "7 0 1.0 \n", "11 0 1.0 \n", "13 0 1.0 " ] }, "execution_count": 64, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Feature engineering\n", "passes = df_sb_events.copy()\n", "passes.loc[passes['pass_outcome_name'].isnull(),'success'] = 1\n", "passes.loc[passes['pass_outcome_name'].notnull(), 'success'] = 0\n", "passes['x_dist'] = passes['endloc_x'] - passes['location_x'] + 1e-5\n", "passes['y_dist'] = passes['endloc_y'] - passes['location_y']\n", "passes['distance'] = np.sqrt((passes['x_dist']**2 + passes['y_dist']**2))\n", "passes['angle'] = np.abs(np.arctan2(passes['y_dist'],passes['x_dist']))\n", "feature_cols = ['duration', 'angle', 'pass_body_part_name', 'location_x', 'location_y', 'endloc_x','endloc_y',\n", " 'pass_cross', 'pass_cut_back', 'pass_deflected', 'pass_height_name', 'distance', \n", " 'pass_switch', 'pass_through_ball', 'play_pattern_name', 'under_pressure', 'success']\n", "pass_final = passes[feature_cols]\n", "bool_cols = ['pass_cross', 'pass_cut_back', 'pass_deflected','pass_switch', 'pass_through_ball','under_pressure']\n", "for 
col in bool_cols:\n", " pass_final[col] = np.where(pass_final[col].isna(), 0, 1)\n", "features = pass_final.drop('success', axis=1)\n", "labels = pass_final['success']\n", "\n", "pass_final.head()" ] }, { "cell_type": "code", "execution_count": 65, "metadata": {}, "outputs": [], "source": [ "# Label-encoding the categorical columns\n", "cont_cols = ['duration', 'angle', 'distance','location_x', 'location_y', 'endloc_x','endloc_y'] \n", "cat_features = features.drop(cont_cols, axis=1)\n", "cont_features = features[cont_cols]\n", "def display_all(df):\n", " with pd.option_context(\"display.max_rows\", 1000): \n", " with pd.option_context(\"display.max_columns\", 1000): \n", " display(df.head(20).transpose())\n", "def label_encode(df):\n", " # Convert df to label encoded\n", " df_le = pd.DataFrame({col: df[col].astype('category').cat.codes for col in df}, index=df.index)\n", " # Save mappings as a dict\n", " mappings = {col: {n: cat for n, cat in enumerate(df[col].astype('category').cat.categories)} \n", " for col in df}\n", " return df_le, mappings\n", "cat_features_le, mappings = label_encode(cat_features)\n", "features_le = cont_features.merge(cat_features_le, left_index=True, right_index=True)" ] }, { "cell_type": "code", "execution_count": 70, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Brier loss: 0.06823\n" ] } ], "source": [ "# First try with a Random Forest\n", "X = features_le\n", "y = labels\n", " \n", "m = RandomForestClassifier(n_estimators=100, n_jobs=-1, random_state=42)\n", "cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)\n", " \n", "# Define a function to calculate the Brier loss using cross-validation\n", "def get_loss(X, y=y, m=m, cv=cv):\n", " scores = cross_val_score(m, X, y, cv=cv, scoring='neg_brier_score')\n", " return np.mean(scores)*-1\n", " \n", "loss = get_loss(X=X)\n", "print('Brier loss:', \"{0:.5f}\".format(loss))" ] }, { "cell_type": "code", "execution_count": 71, "metadata": {}, 
"outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Make a dendrogram for correlated features\n", "from numpy.random import rand\n", "from numpy.random import seed\n", "import scipy \n", "from scipy.cluster import hierarchy as hc\n", "seed(42)\n", "copyX = X + rand(*X.shape) / 100000.0\n", "def dendrogram(X):\n", " corr = np.round(scipy.stats.spearmanr(X).correlation, 4)\n", " corr_condensed = hc.distance.squareform(1-corr)\n", " z = hc.linkage(corr_condensed, method='average')\n", " fig = plt.figure(figsize=(10,8))\n", " dendrogram = hc.dendrogram(z, labels=X.columns, \n", " orientation='right', leaf_font_size=16)\n", " plt.show()\n", " return\n", "dendrogram(copyX)" ] }, { "cell_type": "code", "execution_count": 72, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "original 0.06823\n", "location_x 0.06827\n", "endloc_x 0.06931\n", "location_y 0.06874\n", "endloc_y 0.07324\n", "distance 0.06867\n", "duration 0.07411\n" ] } ], "source": [ "# Remove the correlated features and check if the Brier loss reduces.\n", "feats = ['location_x', 'endloc_x',\n", " 'location_y', 'endloc_y',\n", " 'distance', 'duration']\n", "print('original', \"{0:.5f}\".format(loss))\n", "for feat in feats:\n", " loss_feats = get_loss(X=X.drop(feat, axis=1)) \n", " print(feat, \"{0:.5f}\".format(loss_feats))" ] }, { "cell_type": "code", "execution_count": 73, "metadata": {}, "outputs": [ { "ename": "KeyboardInterrupt", "evalue": "", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mKeyboardInterrupt\u001b[0m Traceback (most recent call last)", "\u001b[0;32m/var/folders/d7/wvbp_b411h75p3zvbmrvql080000gn/T/ipykernel_14558/3665184918.py\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 13\u001b[0m \u001b[0;32mreturn\u001b[0m 
\u001b[0mimp_df\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 14\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 15\u001b[0;31m \u001b[0mimp1\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mget_imp\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mX\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mX\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 16\u001b[0m \u001b[0mimp1\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mreset_index\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mplot\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'Feature'\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m'Importance'\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mfigsize\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m10\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m6\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mlegend\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mFalse\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m;\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mKeyboardInterrupt\u001b[0m: " ] } ], "source": [ "# Feature importance extraction\n", "\n", "## Define a function to get feature importance using the drop-column method\n", "def get_imp(X, y=y, m=m, cv=cv):\n", " baseline = get_loss(X=X, y=y, m=m, cv=cv)\n", " imp = []\n", " for col in X.columns:\n", " s = get_loss(X=X.drop(col, axis=1), y=y, m=m, cv=cv)\n", " change_in_score = s - baseline\n", " imp.append(change_in_score)\n", " imp_df = pd.DataFrame(data={'Feature': X.columns, 'Importance': np.array(imp)})\n", " imp_df = imp_df.set_index('Feature').sort_values('Importance', ascending=False)\n", " return imp_df\n", "\n", "imp1 = get_imp(X=X)\n", "imp1.reset_index().plot('Feature', 'Importance', figsize=(10,6), legend=False);" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "plot_importances(imp1, imp_range=(min(imp1.values), max(imp1.values)))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Drop the least 3 important features and check Brier loss again.\n", "X2 = X.drop(['location_x','pass_cut_back','pass_switch'], axis=1)\n", "loss2 = get_loss(X=X2)\n", "print('Brier loss:', \"{0:.5f}\".format(loss2))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Improvement !! Ok...now do a train-test split.
Then use an XGBClassifier\n", "X_train, X_test, y_train, y_test = train_test_split(X2, y, test_size=0.20, shuffle=True, stratify=y, random_state=42)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%time\n", "\n", "xgb = XGBClassifier(objective='binary:logistic', random_state=42, n_jobs=-1)\n", "xgb.fit(X_train, y_train)\n", "# scikit-learn names this scorer 'neg_brier_score' (higher is better), so negate the mean to report the loss\n", "scores = cross_val_score(xgb, X_train, y_train, cv=cv, scoring='neg_brier_score')\n", "print('Brier loss:', \"{0:.5f}\".format(np.mean(scores)*-1))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Do a randomised search for the best choice of parameters.\n", "\n", "## Define a function to print the results of the search\n", "def print_gs_results(gs, print_all=True):\n", "    if print_all:\n", "        print('Grid scores:')\n", "        means = gs.cv_results_['mean_test_score']*-1\n", "        stds = gs.cv_results_['std_test_score']\n", "        for mean, std, params in zip(means, stds, gs.cv_results_['params']):\n", "            print(\"%0.5f (+/-%0.05f) for %r\"\n", "                  % (mean, std * 2, params))\n", "        print()\n", "        print('Best:', \"{0:.5f}\".format(gs.best_score_*-1),'using %s' % gs.best_params_)\n", "    else:\n", "        print('Best:', \"{0:.5f}\".format(gs.best_score_*-1),'using %s' % gs.best_params_)\n", "    return\n", "\n", "## Create the parameter grid\n", "from sklearn.model_selection import StratifiedKFold, train_test_split, cross_val_score, GridSearchCV, RandomizedSearchCV\n", "params = {\n", "    'learning_rate': [0.0001, 0.001, 0.01, 0.1, 0.2, 0.3],\n", "    'n_estimators': [int(x) for x in np.linspace(start=100, stop=500, num=5)],\n", "    'max_depth': [i for i in range(3, 5)],\n", "}\n", "\n", "## Create the randomised grid search model\n", "## See http://scikit-learn.sourceforge.net/stable/modules/generated/sklearn.grid_search.RandomizedSearchCV.html\n", "## \"n_iter = number of parameter settings that are sampled.
n_iter trades off runtime vs quality of the solution\"\n", "rgs = RandomizedSearchCV(estimator=xgb, param_distributions=params, n_iter=20, cv=cv, random_state=42, n_jobs=-1,\n", "                         scoring='neg_brier_score', return_train_score=True)\n", "\n", "# Fit rgs\n", "rgs.fit(X_train, y_train)\n", "\n", "# Print results\n", "print_gs_results(gs=rgs, print_all=False)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Some tests of the model\n", "\n", "## Define a function to help fit models and print the results\n", "def print_results(xgb, X_train=X_train, y_train=y_train, X_test=X_test, y_test=y_test):\n", "    # Fit model\n", "    xgb.fit(X_train, y_train)\n", "\n", "    # Get predicted probabilities\n", "    y_pred_proba_xgb = xgb.predict_proba(X_test)[:,1]\n", "\n", "    # Print results\n", "    print('Actual passes:', sum(y_test))\n", "    print('Predicted passes (xgb):', '{0:.2f}'.format(sum(y_pred_proba_xgb)))\n", "    print('Brier loss (xgb):', '{0:.5f}'.format(brier_score_loss(y_test, y_pred_proba_xgb)))\n", "    return\n", "\n", "# Evaluate best models on the hold-out set\n", "best_xgb = rgs.best_estimator_\n", "\n", "print_results(xgb=best_xgb)\n", "print_results(xgb=xgb)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "calibrated_xgb = CalibratedClassifierCV(rgs.best_estimator_, cv=cv, method='sigmoid')\n", "print_results(xgb=calibrated_xgb)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Save the tuned model; the with-block closes the file handle once the dump completes\n", "filename = 'final_model_sb_360.sav'\n", "with open(filename, 'wb') as f:\n", "    pickle.dump(best_xgb, f)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plot_confusion_matrix(best_xgb, X_test, y_test, cmap=plt.cm.Blues)" ] }, { "cell_type":
"code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "skplt.metrics.plot_precision_recall(y_test, calibrated_xgb.predict_proba(X_test))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "len(X2),len(passes)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Fit the model to the entire dataset\n", "best_xgb.fit(X2, y)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pass_final['xp'] = best_xgb.predict_proba(X2)[:,1]\n", "pass_final['name'] = passes['player_name']\n", "pass_final" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Groupby to get actual and expected pass success\n", "df_sb_events_grouped_pass_completion = pass_final.groupby('name').agg({'duration':'count','success':'sum','xp':'sum'}).reset_index()\n", "\n", "df_sb_events_grouped_pass_completion.rename(columns={'duration':'Open Play Passes',\n", " 'success':'Successful Open Play Passes',\n", " 'xp':'Expected Pass Success',\n", " 'name':'player_name'},inplace=True)\n", "\n", "df_sb_events_grouped_pass_completion" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Define 'Pass Completed Above Expected'\n", "df_sb_events_grouped_pass_completion['Pass Completed Above Expected'] = df_sb_events_grouped_pass_completion['Successful Open Play Passes']/df_sb_events_grouped_pass_completion['Expected Pass Success']\n", "\n", "df_sb_events_grouped_pass_completion" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "\n", "\n", "## 6. 
Unify Data for Final Dataset\n", "\n", "The DataFrames derived from the original `df_sb_events` DataFrame are:\n", "- `df_sb_player_positions`\n", "- `df_sb_player_minutes`\n", "- `df_sb_events_grouped_passing_carrying`\n", "- `df_sb_events_grouped_xt`\n", "- `df_sb_events_grouped_xg`\n", "- `df_sb_events_grouped_defending`\n", "- `df_sb_events_grouped_pass_completion`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "### 6.0. Import Previous Saved Datasets\n", "Temporary step: load the previously saved DataFrames, which contain the correct data, to work around a bug currently in this notebook." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#df_sb_player_positions\n", "#df_sb_player_minutes\n", "df_sb_events_grouped_passing_carrying = pd.read_csv(data_dir_sb + '/engineered/combined/sb_360/' + 'sb_events_grouped_passing.csv')\n", "df_sb_events_grouped_xt = pd.read_csv(data_dir_sb + '/engineered/combined/sb_360/' + 'sb_events_grouped_xt.csv')\n", "df_sb_events_grouped_xg = pd.read_csv(data_dir_sb + '/engineered/combined/sb_360/' + 'sb_events_grouped_xg.csv')\n", "df_sb_events_grouped_defending = pd.read_csv(data_dir_sb + '/engineered/combined/sb_360/' + 'sb_events_grouped_defending.csv')\n", "df_sb_events_grouped_pass_completion = pd.read_csv(data_dir_sb + '/engineered/combined/sb_360/' + 'sb_events_grouped_pass_completion.csv')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "### 6.1.
Join Datasets" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "defenders = df.player_name.unique()\n", "#defenders = df[df.position_name.isin(['Left Center Back','Right Center Back', 'Center Back'])].player_name.unique()\n", "#flagnames = ['Francisco Javier Calvo Quesada','Joshua Kimmich',\n", "# 'Luis Carlos Tejada Hansell',\n", "# 'Michael Lang','Nicolás Alejandro Tagliafico',\n", "# 'Gabriel Iván Mercado','Hörður Björgvin Magnússon','Birkir Már Sævarsson',\n", "# 'Fedor Kudryashov','Éver Maximiliano David Banega','Edson Omar Álvarez Velázquez',\n", "# 'Marcus Rashford', 'İlkay Gündoğan', 'Dylan Bronn']\n", "#defenders = list(set(defenders) - set(flagnames))\n", "\n", "defenders" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Collect each team's open play shots." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "teams = df[df.player_name.isin(defenders)].team_name.unique()\n", "teamOPSdf = df[(df.team_name.isin(teams))&(df.shot_type_name=='Open Play')].groupby('team_name').agg({'player_name':'count'}).reset_index()\n", "teamOPSdf.rename(columns={'team_name':'team','player_name':'Open Play Shots'},inplace=True)\n", "teamOPSdf" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Combine the separate, aggregated DataFrames into a single DataFrame." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Left-join the aggregated player-level DataFrames onto the minutes DataFrame, keyed on player_name\n", "df_sb_events_grouped_combined = pd.merge(df_sb_player_minutes, df_sb_player_positions, left_on=['player_name'], right_on=['player_name'], how='left')\n", "df_sb_events_grouped_combined = pd.merge(df_sb_events_grouped_combined, df_sb_events_grouped_passing_carrying, left_on=['player_name'], right_on=['player_name'], how='left')\n", "df_sb_events_grouped_combined = pd.merge(df_sb_events_grouped_combined, df_sb_events_grouped_xt, left_on=['player_name'], right_on=['player_name'], how='left')\n", "df_sb_events_grouped_combined = pd.merge(df_sb_events_grouped_combined, df_sb_events_grouped_xg, left_on=['player_name'], right_on=['player_name'], how='left')\n", "df_sb_events_grouped_combined = pd.merge(df_sb_events_grouped_combined, df_sb_events_grouped_defending, left_on=['player_name'], right_on=['player_name'], how='left')\n", "df_sb_events_grouped_combined = pd.merge(df_sb_events_grouped_combined, df_sb_events_grouped_pass_completion, left_on=['player_name'], right_on=['player_name'], how='left')\n", "df_sb_events_grouped_combined = df_sb_events_grouped_combined.sort_values(['player_name', 'team_name', 'mins_total'], ascending=[True, True, True])\n", "df_sb_events_grouped_combined.head()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_sb_events_grouped_combined[df_sb_events_grouped_combined['player_name'] == 'Harry Maguire']" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_sb_player_positions.shape" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_sb_events_grouped_combined.shape" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#df_sb_player_positions\n", "#df_sb_player_minutes\n", 
"df_sb_events_grouped_passing_carrying = pd.read_csv(data_dir_sb + '/engineered/combined/sb_360/' + 'sb_events_grouped_passing.csv')\n", "df_sb_events_grouped_xt = pd.read_csv(data_dir_sb + '/engineered/combined/sb_360/' + 'sb_events_grouped_xt.csv')\n", "df_sb_events_grouped_xg = pd.read_csv(data_dir_sb + '/engineered/combined/sb_360/' + 'sb_events_grouped_xg.csv')\n", "df_sb_events_grouped_defending = pd.read_csv(data_dir_sb + '/engineered/combined/sb_360/' + 'sb_events_grouped_defending.csv')\n", "df_sb_events_grouped_pass_completion = pd.read_csv(data_dir_sb + '/engineered/combined/sb_360/' + 'sb_events_grouped_pass_completion.csv')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_sb_events_grouped_passing_carrying = df_sb_events_grouped_passing_carrying[df_sb_events_grouped_passing_carrying.player_name.isin(defenders)]\n", "df_sb_events_grouped_passing_carrying['Final Third Entries'] = df_sb_events_grouped_passing_carrying['Successful Final Third Passes'] + df_sb_events_grouped_passing_carrying['Final Third Carries']\n", "df_sb_events_grouped_passing_carrying['Progressive Moves'] = df_sb_events_grouped_passing_carrying['Successful Progressive Passes'] + df_sb_events_grouped_passing_carrying['Progressive Carries']\n", "\n", "cols = ['Passes',\n", " 'Successful Passes',\n", " 'Open Play Passes',\n", " 'Successful Open Play Passes',\n", " 'Long Passes',\n", " 'Successful Long Passes',\n", " 'Successful Final Third Passes',\n", " 'Under Pressure Passes',\n", " 'Successful Under Pressure Passes',\n", " 'Progressive Passes',\n", " 'Successful Progressive Passes',\n", " 'Total Pass Length', 'Carries',\n", " 'Pass Progressive Distance',\n", " 'Carry Distance',\n", " 'Carry Progressive Distance',\n", " 'Final Third Entries',\n", " 'Progressive Moves'\n", " ]\n", "\n", "df_sb_events_grouped_passing_carrying = df_sb_events_grouped_passing_carrying[['player_name']+cols]\n", "\n", "for c in ['Successful 
Passes','Successful Open Play Passes','Successful Long Passes','Successful Under Pressure Passes',\n", " 'Successful Progressive Passes']:\n", " if 'Successful' in c:\n", " cplayer_name = ' '.join(c.split(' ')[1:])\n", " df_sb_events_grouped_passing_carrying[cplayer_name+' Success %'] = df_sb_events_grouped_passing_carrying[c]*100/df_sb_events_grouped_passing_carrying[cplayer_name]\n", " df_sb_events_grouped_passing_carrying = df_sb_events_grouped_passing_carrying.drop(columns=cplayer_name)\n", "\n", "df_sb_events_grouped_passing_carrying['PPF'] = df_sb_events_grouped_passing_carrying['Pass Progressive Distance']/df_sb_events_grouped_passing_carrying['Total Pass Length']\n", "df_sb_events_grouped_passing_carrying['CPF'] = df_sb_events_grouped_passing_carrying['Carry Progressive Distance']/df_sb_events_grouped_passing_carrying['Carry Distance']\n", "df_sb_events_grouped_passing_carrying = df_sb_events_grouped_passing_carrying.drop(columns=['Pass Progressive Distance','Carry Progressive Distance'])\n", "df_sb_events_grouped_combined = df_sb_events_grouped_combined.merge(df_sb_events_grouped_passing_carrying[df_sb_events_grouped_passing_carrying.player_name.isin(defenders)],how='left')\n", "df_sb_events_grouped_combined = df_sb_events_grouped_combined.merge(teamOPSdf,how='left')\n", "# df_sb_events_grouped_combined = df_sb_events_grouped_combined.merge(passangles,how='left')\n", "# df_sb_events_grouped_combined = df_sb_events_grouped_combined.merge(ppassangles,how='left')\n", "df_sb_events_grouped_combined = df_sb_events_grouped_combined.merge(xpdf,how='left')\n", "df_sb_events_grouped_combined['Successful Passes and Carries'] = df_sb_events_grouped_combined['Successful Passes'] + df_sb_events_grouped_combined['Carries']\n", "\n", "for cols in ['xT', 'xT Facilitated']:\n", " df_sb_events_grouped_combined[cols] = df_sb_events_grouped_combined[cols]*100/df_sb_events_grouped_combined['Successful Passes and Carries']\n", "\n", "for cols in ['xGBuildup', 'xGChain']:\n", " 
df_sb_events_grouped_combined[cols] = df_sb_events_grouped_combined[cols]*10/df_sb_events_grouped_combined['Open Play Shots']\n", "\n", "# Each column must appear only once here: the loop below rescales the column in place,\n", "# so a repeated entry would be converted to per-90 twice\n", "per90cols = ['Padj_Defensive Acts',\n", " 'Turnovers',\n", " 'Aerial Challenges',\n", " 'Dribbles',\n", " 'Padj_Pressures',\n", " 'Padj_Successful Pressures',\n", " 'Successful Passes',\n", " 'Successful Long Passes',\n", " 'Successful Final Third Passes',\n", " 'Successful Under Pressure Passes',\n", " 'Successful Progressive Passes',\n", " 'Total Pass Length',\n", " 'Carries',\n", " 'Carry Distance',\n", " 'Final Third Entries',\n", " #'Progressive Passes',\n", " #'Progressive Carries',\n", " 'Progressive Moves',\n", " 'Successful Open Play Passes',\n", " 'Successful Carries',\n", " 'Successful Progressive Carries',\n", " 'Successful Passes and Carries',\n", " ]\n", "\n", "for col in per90cols:\n", " df_sb_events_grouped_combined[col] = df_sb_events_grouped_combined[col]/df_sb_events_grouped_combined['mins_total']*90\n", " df_sb_events_grouped_combined[col+'_p90'] = df_sb_events_grouped_combined[col]\n", "\n", "df_sb_events_grouped_combined.drop(columns='Open Play Shots',inplace=True)\n", "df_sb_events_grouped_combined['Turnovers per 100 Touches'] = df_sb_events_grouped_combined['Turnovers']*100/(df_sb_events_grouped_combined['Carries'] + df_sb_events_grouped_combined['Dribbles'])\n", "\n", "for c in df_sb_events_grouped_combined.columns.tolist()[3:]:\n", " df_sb_events_grouped_combined['Percentile '+c] = df_sb_events_grouped_combined[c].rank(pct = True)\n", "\n", "df_sb_events_grouped_combined.head()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Temporary step: importing a previously saved version\n", "#df_sb_events_grouped_combined = pd.read_csv(data_dir + '/sb/engineered/combined/sb_360/' + 'sb_events_agg_all.csv')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ 
"len(df_sb_events_grouped_combined)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "### 6.3. Rename Columns\n", "Final cleaning of dataset by renaming the columns" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_sb_events_grouped_combined.head()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_sb_events_grouped_combined.shape" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_sb_events_grouped_combined.columns" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Displays all columns\n", "with pd.option_context('display.max_rows', None, 'display.max_columns', None):\n", " print(df_sb_events_grouped_combined.dtypes)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# To add\n", "#age\tdob\tpob\tcob\tsecond_citizenship\tcurrent_club\tcurrent_club_country\theight\tfoot\tmarket_value_euros\tmarket_value_pounds\tjoined\tage_when_joining\tyears_since_joining\tcontract_expires\tyears_until_contract_expiry\tcontract_option\ton_loan_from\ton_loan_from_country\tloan_contract_expiry\tplayer_agent" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Rename columns\n", "df_sb_events_grouped_combined = (df_sb_events_grouped_combined\n", " .rename(columns={'player_name': 'player',\n", " 'mins_total': 'mins_total',\n", " 'team_name': 'team',\n", " 'primary_position_name': 'position',\n", " 'Passes': 'passes',\n", " 'Successful Passes': 'completed_passes',\n", " 'Open Play Passes_x': 'open_play_passes',\n", " 'Successful Open Play Passes_x': 'completed_open_play_passes',\n", " 'Short Passes': 'short_passes',\n", " 'Successful Short Passes': 'completed_short_passes',\n", " 'Medium Passes': 'medium_passes',\n", " 'Successful Medium Passes': 
'completed_medium_passes',\n", " 'Long Passes': 'long_passes',\n", " 'Successful Long Passes': 'completed_long_passes',\n", " 'Final Third Passes': 'final_third_pass',\n", " 'Successful Final Third Passes': 'completed_final_third_pass',\n", " 'Penalty Area Passes': 'penalty_area_passes',\n", " 'Successful Penalty Area Passes': 'completed_penalty_area_passes',\n", " 'Under Pressure Passes': 'under_pressure_passes',\n", " 'Successful Under Pressure Passes': 'completed_under_pressure_passes',\n", " 'Throughballs': 'throughballs',\n", " 'Successful Throughballs': 'completed_throughballs',\n", " 'Switches': 'switches',\n", " 'Successful Switches': 'completed_switches',\n", " 'Crosses': 'crosses',\n", " 'Successful Crosses': 'completed_crosses',\n", " 'Penalty Area Crosses': 'penalty_area_crosses',\n", " 'Successful Penalty Area Crosses': 'completed_penalty_area_crosses',\n", " 'Progressive Passes': 'progressive_passes',\n", " 'Successful Progressive Passes': 'completed_progressive_passes',\n", " 'Total Pass Length': 'total_pass_distance',\n", " 'Pass Progressive Distance': 'progressive_pass_distance',\n", " 'Carries': 'carries',\n", " 'Successful Carries': 'completed_carries',\n", " 'Final Third Carries': 'final_third_carries',\n", " 'Successful Final Third Carries': 'completed_final_third_carries',\n", " 'Progressive Carries': 'progressive_carries',\n", " 'Successful Progressive Carries': 'completed_progressive_carries',\n", " 'Carry Distance': 'carry_distance',\n", " 'Carry Progressive Distance': 'progressive_carry_distance',\n", " 'xT': 'xt',\n", " 'xT Facilitated': 'xt_facilitated',\n", " 'xGBuildup': 'xgbuildup',\n", " 'xGChain': 'xgchain',\n", " 'Padj_Defensive Acts': 'padj_defensive_acts',\n", " 'Turnovers': 'turnovers',\n", " 'Aerial Challenges': 'aerial_challenges',\n", " 'Aerial Win %': 'aerial_win_%',\n", " 'True Tackle Win%': 'true_tackle_win_%',\n", " 'Padj_Pressures': 'padj_pressures',\n", " 'Padj_Successful Pressures': 'padj_completed_pressures',\n", " 
'Dribbles': 'dribbles',\n", " 'Padj_Tackles and Interceptions': 'padj_tackles_and_interceptions',\n", " 'Tack/DP %': 'tackle/dp_%',\n", " 'Expected Pass Success': 'completed_expected_passes',\n", " 'Pass Completed Above Expected': 'passes_completed_above_expected',\n", " 'Successful Open Play Passes': 'completed_open_play_passes',\n", " 'Final Third Entries': 'final_third_entries',\n", " 'Progressive Moves': 'progressive_moves',\n", " 'Passes Success %': 'completed_passes_%',\n", " 'Open Play Passes Success %': 'completed_open_play_passes_%',\n", " 'Long Passes Success %': 'completed_long_passes_%',\n", " 'Under Pressure Passes Success %': 'completed_under_pressure_passes_%',\n", " 'Progressive Passes Success %': 'completed_progressive_passes_%',\n", " 'PPF': 'ppf',\n", " 'CPF': 'cpf',\n", " 'Open Play Passes': 'open_play_passes',\n", " 'Successful Passes and Carries': 'completed_passes_and_carries', \n", " 'Padj_Defensive Acts_p90': 'padj_defensive_acts_p90',\n", " 'Turnovers_p90': 'turnovers_p90',\n", " 'Aerial Challenges_p90': 'aerial_challenges_p90',\n", " 'Dribbles_p90': 'dribbles_p90',\n", " 'Padj_Pressures_p90': 'padj_pressures_p90',\n", " 'Padj_Successful Pressures_p90': 'padj_completed_pressures_p90',\n", " 'Successful Passes_p90': 'completed_passes_p90',\n", " 'Successful Long Passes_p90': 'completed_long_passes_p90',\n", " 'Successful Final Third Passes_p90': 'completed_final_third_passes_p90',\n", " 'Successful Under Pressure Passes_p90': 'completed_under_pressure_passes_p90', \n", " 'Successful Progressive Passes_p90': 'completed_progressive_passes_p90',\n", " 'Total Pass Length_p90': 'total_pass_length_p90',\n", " 'Carries_p90': 'carries_p90', \n", " 'Carry Distance_p90': 'carry_distance_p90',\n", " 'Final Third Entries_p90': 'final_third_entries_p90',\n", " 'Progressive Moves_p90': 'progressive_moves_p90',\n", " 'Successful Open Play Passes_p90': 'completed_open_play_passes_p90',\n", " 'Successful Carries_p90': 'completed_carries_p90',\n", " 
'Successful Progressive Carries_p90': 'completed_progressive_carries_p90',\n", " 'Successful Passes and Carries_p90': 'completed_passes_and_carries_p90',\n", " 'Turnovers per 100 Touches': 'turnovers_per_100_touches',\n", " 'Percentile primary_position_name': 'percentile_primary_position_name',\n", " 'Percentile Passes': 'percentile_passes', \n", " 'Percentile Successful Passes': 'percentile_completed_passes',\n", " 'Percentile Open Play Passes_x': 'percentile_open_play_passes',\n", " 'Percentile Successful Open Play Passes_x': 'percentile_completed_open_play_passes', \n", " 'Percentile Short Passes': 'percentile_short_passes',\n", " 'Percentile Successful Short Passes': 'percentile_completed_short_passes',\n", " 'Percentile Medium Passes': 'percentile_medium_passes',\n", " 'Percentile Successful Medium Passes': 'percentile_completed_medium_passes',\n", " 'Percentile Long Passes': 'percentile_long_passes',\n", " 'Percentile Successful Long Passes': 'percentile_completed_long_passes',\n", " 'Percentile Final Third Passes': 'percentile_final_third_passes',\n", " 'Percentile Successful Final Third Passes': 'percentile_completed_final_third_passes',\n", " 'Percentile Penalty Area Passes': 'percentile_penalty_area_passes',\n", " 'Percentile Successful Penalty Area Passes': 'percentile_completed_penalty_area_passes',\n", " 'Percentile Under Pressure Passes': 'percentile_under_pressure_passes',\n", " 'Percentile Successful Under Pressure Passes': 'percentile_completed_under_pressure_passes',\n", " 'Percentile Throughballs': 'percentile_throughballs',\n", " 'Percentile Successful Throughballs': 'percentile_completed_throughballs',\n", " 'Percentile Switches': 'percentile_switches',\n", " 'Percentile Successful Switches': 'percentile_completed_switches',\n", " 'Percentile Crosses': 'percentile_crosses',\n", " 'Percentile Successful Crosses': 'percentile_completed_crosses',\n", " 'Percentile Penalty Area Crosses': 'percentile_penalty_area_crosses',\n", " 'Percentile 
Successful Penalty Area Crosses': 'percentile_completed_penalty_area_crosses', \n", " 'Percentile Progressive Passes': 'percentile_progressive_passes',\n", " 'Percentile Successful Progressive Passes': 'percentile_completed_progressive_passes',\n", " 'Percentile Total Pass Length': 'percentile_total_pass_length', \n", " 'Percentile Pass Progressive Distance': 'percentile_pass_progressive_distance',\n", " 'Percentile Carries': 'percentile_carries',\n", " 'Percentile Successful Carries': 'percentile_completed_carries',\n", " 'Percentile Final Third Carries': 'percentile_final_third_carries',\n", " 'Percentile Successful Final Third Carries': 'percentile_completed_final_third_carries',\n", " 'Percentile Progressive Carries': 'percentile_progressive_carries',\n", " 'Percentile Successful Progressive Carries': 'percentile_completed_progressive_carries',\n", " 'Percentile Carry Distance': 'percentile_carry_distance',\n", " 'Percentile Carry Progressive Distance': 'percentile_progressive_carry_distance',\n", " 'Percentile xT': 'percentile_xt', \n", " 'Percentile xT Facilitated': 'percentile_xt_facilitated',\n", " 'Percentile xGBuildup': 'percentile_xgbuildup',\n", " 'Percentile xGChain': 'percentile_xgchain', \n", " 'Percentile team': 'percentile_team',\n", " 'Percentile Padj_Defensive Acts': 'percentile_padj_defensive_acts',\n", " 'Percentile Turnovers': 'percentile_turnovers',\n", " 'Percentile Aerial Challenges': 'percentile_aerial_challenges',\n", " 'Percentile Aerial Win %': 'percentile_aerial_win_%',\n", " 'Percentile True Tackle Win%': 'percentile_true_tackle_win_%',\n", " 'Percentile Padj_Pressures': 'percentile_padj_pressures',\n", " 'Percentile Padj_Successful Pressures': 'percentile_padj_completed_pressures',\n", " 'Percentile Dribbles': 'percentile_dribbles',\n", " 'Percentile Padj_Tackles and Interceptions': 'percentile_padj_tackles_and_interceptions',\n", " 'Percentile Tack/DP %': 'percentile_tack/dp_%',\n", " 'Percentile Open Play Passes_y': 
'percentile_',\n", " 'Percentile Successful Open Play Passes_y': 'percentile_completed_open_play_passes',\n", " 'Percentile Expected Pass Success': 'percentile_completed_expected_pass',\n", " 'Percentile Pass Completed Above Expected': 'percentile_pass_completed_above_expected',\n", " 'Percentile Successful Open Play Passes': 'percentile_completed_open_play_passes',\n", " 'Percentile Final Third Entries': 'percentile_final_third_entries',\n", " 'Percentile Progressive Moves': 'percentile_progressive_moves',\n", " 'Percentile Passes Success %': 'percentile_completed_passes_%',\n", " 'Percentile Open Play Passes Success %': 'percentile_completed_open_play_passes_%', \n", " 'Percentile Long Passes Success %': 'percentile_completed_long_passes_%',\n", " 'Percentile Under Pressure Passes Success %': 'percentile_completed_under_pressure_passes_%',\n", " 'Percentile Progressive Passes Success %': 'percentile_completed_progressive_passes_%',\n", " 'Percentile PPF': 'percentile_ppf',\n", " 'Percentile CPF': 'percentile_cpf',\n", " 'Percentile Open Play Passes': 'percentile_open_play_passes',\n", " 'Percentile Successful Passes and Carries': 'percentile_completed_passes_and_carries',\n", " 'Percentile Padj_Defensive Acts_p90': 'percentile_padj_defensive_acts_p90', \n", " 'Percentile Turnovers_p90': 'percentile_turnovers_p90',\n", " 'Percentile Aerial Challenges_p90': 'percentile_aerial_challenges_p90',\n", " 'Percentile Dribbles_p90': 'percentile_dribbles_p90', \n", " 'Percentile Padj_Pressures_p90': 'percentile_padj_pressures_p90',\n", " 'Percentile Padj_Successful Pressures_p90': 'percentile_padj_completed_pressures_p90',\n", " 'Percentile Successful Passes_p90': 'percentile_completed_passes_p90',\n", " 'Percentile Successful Long Passes_p90': 'percentile_completed_long_passes_p90',\n", " 'Percentile Successful Final Third Passes_p90': 'percentile_completed_final_third_passes_p90',\n", " 'Percentile Successful Under Pressure Passes_p90': 
'percentile_completed_under_pressure_passes_p90',\n", " 'Percentile Successful Progressive Passes_p90': 'percentile_completed_progressive_passes_p90',\n", " 'Percentile Total Pass Length_p90': 'percentile_total_pass_distance_p90',\n", " 'Percentile Carries_p90': 'percentile_carries_p90',\n", " 'Percentile Carry Distance_p90': 'percentile_carry_distance_p90',\n", " 'Percentile Final Third Entries_p90': 'percentile_final_third_entries_p90',\n", " 'Percentile Progressive Moves_p90': 'percentile_progressive_moves_p90',\n", " 'Percentile Successful Open Play Passes_p90': 'percentile_completed_open_play_passes_p90',\n", " 'Percentile Successful Carries_p90': 'percentile_completed_carries_p90',\n", " 'Percentile Successful Progressive Carries_p90': 'percentile_completed_progressive_carries_p90',\n", " 'Percentile Successful Passes and Carries_p90': 'percentile_completed_passes_and_carries_p90',\n", " 'Percentile Turnovers per 100 Touches': 'percentile_turnovers_per_100_touches'\n", " }\n", " )\n", " )\n", "\n", "# Display DataFrame\n", "df_sb_events_grouped_combined.head()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Drop columns\n", "df_sb_events_grouped_combined = (df_sb_events_grouped_combined\n", " .drop(['Open Play Passes_y',\n", " 'Successful Open Play Passes_y',\n", " 'percentile_team'\n", " ], axis=1\n", " )\n", " )\n", "\n", "# Display DataFrame\n", "df_sb_events_grouped_combined.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "### 6.2. 
Filter for Center Backs\n", "Keep only players whose position is Left Center Back, Center Back, or Right Center Back." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Filter for Center Backs\n", "df_sb_events_grouped_combined_center_backs = df_sb_events_grouped_combined[(df_sb_events_grouped_combined['position'] == 'Left Center Back') | (df_sb_events_grouped_combined['position'] == 'Center Back') | (df_sb_events_grouped_combined['position'] == 'Right Center Back')]\n", "\n", "# Display DataFrame\n", "df_sb_events_grouped_combined_center_backs.head()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "len(df_sb_events_grouped_combined_center_backs)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "\n", "\n", "## 7. Export Final DataFrames" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Export DataFrames\n", "df_sb_events.to_csv(data_dir + '/export/' + 'sb_events.csv', index=None, header=True)\n", "df_sb_events_grouped_combined.to_csv(data_dir + '/export/' + 'sb_events_agg_all.csv', index=None, header=True)\n", "df_sb_events_grouped_combined_center_backs.to_csv(data_dir + '/export/' + 'sb_events_agg_center_backs.csv', index=None, header=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "\n", "\n", "## 8. Subset DataFrames\n", "The following code creates DataFrames for additional Tableau visualisations that are not the focus of this task submission, but are included for reference and may be used if there is sufficient time.\n", "\n", "These DataFrames include bespoke datasets for the following visualisations:\n", "- Radar\n", "- Passing Matrix\n", "- Passing Network" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 8.1. 
Radar Data" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_sb_radar = df_sb_events_grouped_combined" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_sb_radar.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Filter for Center Backs" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Filter for Center Backs (copy so that columns added later do not write to a view of the combined DataFrame)\n", "df_sb_radar = df_sb_radar[(df_sb_radar['position'] == 'Left Center Back') | (df_sb_radar['position'] == 'Center Back') | (df_sb_radar['position'] == 'Right Center Back')].copy()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Determine Min and Max of Selected Attributes" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Minimum and maximum of each radar attribute, used later for min-max normalisation\n", "\n", "## Min\n", "min_completed_op_passes_p90 = df_sb_radar['completed_open_play_passes_p90'].min()\n", "min_completed_progressive_passes_p90 = df_sb_radar['completed_progressive_passes_p90'].min()\n", "min_completed_pass_percentage_completion = df_sb_radar['completed_passes_%'].min()\n", "min_completed_pressure_pass_percentage_completion = df_sb_radar['completed_under_pressure_passes_%'].min()\n", "min_completed_carries_p90 = df_sb_radar['completed_carries_p90'].min()\n", "min_completed_progressive_carries_p90 = df_sb_radar['completed_progressive_carries_p90'].min()\n", "min_carry_distance_p90 = df_sb_radar['carry_distance_p90'].min()\n", "min_xgbuildup = df_sb_radar['xgbuildup'].min()\n", "min_xt = df_sb_radar['xt'].min()\n", "min_padj_tackles_and_interceptions = df_sb_radar['padj_tackles_and_interceptions'].min()\n", "min_tack_dp_percentage_completion = df_sb_radar['tackle/dp_%'].min()\n", "min_aerial_win_percentage_completion = df_sb_radar['aerial_win_%'].min()\n", "\n", "## Max\n", "max_completed_op_passes_p90 = df_sb_radar['completed_open_play_passes_p90'].max()\n", 
"max_completed_progressive_passes_p90 = df_sb_radar['completed_progressive_passes_p90'].max()\n", "max_completed_pass_percentage_completion = df_sb_radar['completed_passes_%'].max()\n", "max_completed_pressure_pass_percentage_completion = df_sb_radar['completed_under_pressure_passes_%'].max()\n", "max_completed_carries_p90 = df_sb_radar['completed_carries_p90'].max()\n", "max_completed_progressive_carries_p90 = df_sb_radar['completed_progressive_carries_p90'].max()\n", "max_carry_distance_p90 = df_sb_radar['carry_distance_p90'].max()\n", "max_xgbuildup = df_sb_radar['xgbuildup'].max()\n", "max_xt = df_sb_radar['xt'].max()\n", "max_padj_tackles_and_interceptions = df_sb_radar['padj_tackles_and_interceptions'].max()\n", "max_tack_dp_percentage_completion = df_sb_radar['tackle/dp_%'].max()\n", "max_aerial_win_percentage_completion = df_sb_radar['aerial_win_%'].max()\n", "\n", "## Manual Max Changes - for normalisation\n", "# max_completed_op_passes_p90 = 102.857\n", "# max_completed_carries_p90 = 123.75\n", "# max_carry_distance_p90 = 1574.36\n", "\n", "\n", "\"\"\"\n", "## All custom max - not used\n", "max_completed_op_passes_p90 = 25.0\n", "max_completed_pass_percentage_completion = 100.0\n", "max_completed_pressure_pass_percentage_completion = 100.0\n", "max_completed_carries_p90 = 100.0\n", "max_completed_progressive_carries_p90 = 50.0\n", "max_carry_distance_p90 = 850.0\n", "max_xgbuildup = 0.5 \n", "max_xt = 4.0\n", "max_padj_tackles_and_interceptions = 40.0\n", "max_tack_dp_percentage_completion = 100.0\n", "max_aerial_win_percentage_completion = 100.0\n", "\"\"\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "## Print statements\n", "print(f'completed_op_passes_p90 is {min_completed_op_passes_p90:.1f} and the maximum value is {max_completed_op_passes_p90:.1f}')\n", "print(f'completed_pass_% is {min_completed_pass_percentage_completion:.1f} and the maximum value is 
{max_completed_pass_percentage_completion:.1f}')\n", "print(f'completed_pressure_pass_% is {min_completed_pressure_pass_percentage_completion:.1f} and the maximum value is {max_completed_pressure_pass_percentage_completion:.1f}')\n", "print(f'completed_carries_p90 is {min_completed_carries_p90:.1f} and the maximum value is {max_completed_carries_p90:.1f}')\n", "print(f'completed_progressive_carries_p90 is {min_completed_progressive_carries_p90:.1f} and the maximum value is {max_completed_progressive_carries_p90:.1f}')\n", "print(f'carry_distance_p90 is {min_carry_distance_p90:.1f} and the maximum value is {max_carry_distance_p90:.1f}')\n", "print(f'xgbuildup is {min_xgbuildup:.1f} and the maximum value is {max_xgbuildup:.1f}')\n", "print(f'xt is {min_xt:.1f} and the maximum value is {max_xt:.1f}')\n", "print(f'padj_tackles_and_interceptions is {min_padj_tackles_and_interceptions:.1f} and the maximum value is {max_padj_tackles_and_interceptions:.1f}')\n", "print(f'tack_dp_percentage_completion is {min_tack_dp_percentage_completion:.1f} and the maximum value is {max_tack_dp_percentage_completion:.1f}')\n", "print(f'aerial_win_percentage_completion is {min_aerial_win_percentage_completion:.1f} and the maximum value is {max_aerial_win_percentage_completion:.1f}')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Normalise Columns" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Min-max normalise each attribute to the range [0, 1]: (x - min) / (max - min)\n", "df_sb_radar['completed_open_play_passes_p90_normalised'] = df_sb_radar['completed_open_play_passes_p90'].apply(lambda x: (x - min_completed_op_passes_p90) / (max_completed_op_passes_p90 - min_completed_op_passes_p90))\n", "df_sb_radar['completed_progressive_passes_p90_normalised'] = df_sb_radar['completed_progressive_passes_p90'].apply(lambda x: (x - min_completed_progressive_passes_p90) / (max_completed_progressive_passes_p90 - min_completed_progressive_passes_p90))\n", 
"df_sb_radar['completed_passes_%_normalised'] = df_sb_radar['completed_passes_%'].apply(lambda x: (x - min_completed_pass_percentage_completion) / (max_completed_pass_percentage_completion - min_completed_pass_percentage_completion))\n", "df_sb_radar['completed_under_pressure_passes_%_normalised'] = df_sb_radar['completed_under_pressure_passes_%'].apply(lambda x: (x - min_completed_pressure_pass_percentage_completion) / (max_completed_pressure_pass_percentage_completion - min_completed_pressure_pass_percentage_completion))\n", "df_sb_radar['completed_carries_p90_normalised'] = df_sb_radar['completed_carries_p90'].apply(lambda x: (x - min_completed_carries_p90) / (max_completed_carries_p90 - min_completed_carries_p90))\n", "df_sb_radar['completed_progressive_carries_p90_normalised'] = df_sb_radar['completed_progressive_carries_p90'].apply(lambda x: (x - min_completed_progressive_carries_p90) / (max_completed_progressive_carries_p90 - min_completed_progressive_carries_p90))\n", "df_sb_radar['carry_distance_p90_normalised'] = df_sb_radar['carry_distance_p90'].apply(lambda x: (x - min_carry_distance_p90) / (max_carry_distance_p90 - min_carry_distance_p90))\n", "df_sb_radar['xgbuildup_normalised'] = df_sb_radar['xgbuildup'].apply(lambda x: (x - min_xgbuildup) / (max_xgbuildup - min_xgbuildup))\n", "df_sb_radar['xt_normalised'] = df_sb_radar['xt'].apply(lambda x: (x - min_xt) / (max_xt - min_xt))\n", "df_sb_radar['padj_tackles_and_interceptions_normalised'] = df_sb_radar['padj_tackles_and_interceptions'].apply(lambda x: (x - min_padj_tackles_and_interceptions) / (max_padj_tackles_and_interceptions - min_padj_tackles_and_interceptions))\n", "df_sb_radar['tackle/dp_%_normalised'] = df_sb_radar['tackle/dp_%'].apply(lambda x: (x - min_tack_dp_percentage_completion) / (max_tack_dp_percentage_completion - min_tack_dp_percentage_completion))\n", "df_sb_radar['aerial_win_%_normalised'] = df_sb_radar['aerial_win_%'].apply(lambda x: (x - min_aerial_win_percentage_completion) / 
(max_aerial_win_percentage_completion - min_aerial_win_percentage_completion))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_sb_radar[df_sb_radar['player'] == 'Sergio Ramos García']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Select Columns of Interest" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "#\n", "\n", "## Define columns\n", "cols = ['player',\n", " 'mins_total',\n", " 'team',\n", " 'position',\n", " 'completed_open_play_passes_p90', \n", " 'completed_progressive_passes_p90',\n", " 'completed_passes_%',\n", " 'completed_under_pressure_passes_%',\n", " 'completed_carries_p90',\n", " 'completed_progressive_carries_p90',\n", " 'carry_distance_p90',\n", " 'xgbuildup',\n", " 'xt',\n", " 'padj_tackles_and_interceptions',\n", " 'tackle/dp_%',\n", " 'aerial_win_%',\n", " 'completed_open_play_passes_p90_normalised', \n", " 'completed_progressive_passes_p90_normalised',\n", " 'completed_passes_%_normalised',\n", " 'completed_under_pressure_passes_%_normalised',\n", " 'completed_carries_p90_normalised',\n", " 'completed_progressive_carries_p90_normalised',\n", " 'carry_distance_p90_normalised',\n", " 'xgbuildup_normalised',\n", " 'xt_normalised',\n", " 'padj_tackles_and_interceptions_normalised',\n", " 'tackle/dp_%_normalised',\n", " 'aerial_win_%_normalised'\n", " ]\n", "\n", "## Select columns of interest\n", "df_sb_radar_select = df_sb_radar[cols]\n", "\n", "## Drop duplicate column (duplicate 'team', temporary solution, needs to be moved up)\n", "df_sb_radar_select = df_sb_radar_select.loc[:,~df_sb_radar_select.columns.duplicated()]\n", "\n", "## \n", "df_sb_radar_select.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Melt DataFrame" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "df_sb_radar_melt = pd.melt(df_sb_radar_select, 
id_vars=['player', 'mins_total', 'team', 'position'], var_name='attribute_name', value_name='value')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_sb_radar_melt.head()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_sb_radar_melt.shape" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_sb_radar.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Export Final Dataset" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Export DataFrames\n", "df_sb_radar_melt.to_csv(data_dir + '/export/' + '/sb_events_agg_all_long.csv', index=None, header=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 8.2. Passing Matrix Data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### Copy the Events DataFrame" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df1 = df_sb_events.copy()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### Isolate In-Play Events\n", "Remove non-event rows so that only players' actions remain, i.e. drop lineups, halves, etc." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# List unique values in the df1['type_name'] column\n", "df1['type_name'].unique()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Event types to keep\n", "lst_events = ['Pass', 'Ball Receipt*', 'Carry', 'Duel', 'Miscontrol', 'Pressure', 'Ball Recovery', 'Dribbled Past', 'Dribble', 'Shot', 'Block', 'Goal Keeper', 'Clearance', 'Dispossessed', 'Foul Committed', 'Foul Won', 'Interception', 'Shield', 'Half End', 'Substitution', 'Tactical Shift', 'Injury Stoppage', 'Player Off', 'Player On', 'Offside', 'Referee Ball-Drop', 'Error']" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Keep only the rows whose event type is in lst_events (copy to avoid a SettingWithCopyWarning on later assignment)\n", "df1 = df1[df1['type_name'].isin(lst_events)].copy()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df1['df_name'] = 'df1'" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df2 = df_sb_events.copy()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df2['df_name'] = 'df2'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### Concatenate DataFrames" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_sb_events_passing = pd.concat([df1, df2])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_sb_events_passing.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### ..." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_sb_events_passing['Pass_X'] = np.where(df_sb_events_passing['df_name'] == 'df1', df_sb_events_passing['location_x'], df_sb_events_passing['pass_end_location_x'])\n", "df_sb_events_passing['Pass_Y'] = np.where(df_sb_events_passing['df_name'] == 'df1', df_sb_events_passing['location_y'], df_sb_events_passing['pass_end_location_y'])\n", "df_sb_events_passing['Carry_X'] = np.where(df_sb_events_passing['df_name'] == 'df1', df_sb_events_passing['location_x'], df_sb_events_passing['carry_end_location_x'])\n", "df_sb_events_passing['Carry_Y'] = np.where(df_sb_events_passing['df_name'] == 'df1', df_sb_events_passing['location_y'], df_sb_events_passing['carry_end_location_y'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### Export Dataset" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Export \n", "#df_sb_events_passing.to_csv(data_dir_sb + '/events/engineered/' + '/sb_events_passing_matrix.csv', index=None, header=True)\n", "\n", "# Export \n", "df_sb_events_passing.to_csv(data_dir + '/export/' + '/sb_events_passing_matrix.csv', index=None, header=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 8.3. 
Create Passing Network Data\n", "\n", "See: https://community.tableau.com/s/question/0D54T00000C6YbE/football-passing-network" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_sb_pass_network = df_sb_events_passing.copy()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_sb_pass_network = df_sb_pass_network[df_sb_pass_network['type_name'] == 'Pass']" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_sb_pass_network['player_recipient'] = np.where(df_sb_pass_network['df_name'] == 'df1', df_sb_pass_network['player_name'], df_sb_pass_network['pass_recipient_name'])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_sb_pass_network.head()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sorted(df_sb_pass_network.columns)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_sb_pass_network.shape" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Select columns of interest\n", "\n", "## Define columns\n", "cols = ['df_name',\n", " 'id',\n", " 'index',\n", " 'competition_name',\n", " 'season_name',\n", " 'match_date',\n", " 'kick_off',\n", " 'Full_Fixture_Date',\n", " 'Team',\n", " 'Opponent',\n", " 'home_team_home_team_name',\n", " 'away_team_away_team_name',\n", " 'home_score',\n", " 'away_score',\n", " 'player_recipient',\n", " 'player_name',\n", " 'pass_recipient_name',\n", " 'position_id',\n", " 'position_name',\n", " 'type_name',\n", " 'pass_type_name',\n", " 'pass_outcome_name',\n", " 'location_x',\n", " 'location_y', \n", " 'pass_end_location_x',\n", " 'pass_end_location_y', \n", " 'carry_end_location_x',\n", " 'carry_end_location_y',\n", " 'Pass_X',\n", " 'Pass_Y',\n", " 'Carry_X',\n", " 'Carry_Y'\n", " ]\n", "\n", "##\n", 
"df_sb_pass_network_select = df_sb_pass_network[cols]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_sb_pass_network_select['pass_to_from'] = df_sb_pass_network_select['player_name'] + ' - ' + df_sb_pass_network_select['pass_recipient_name']" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# List unique values in the df_sb_pass_network_select['pass.outcome.name'] column\n", "df_sb_pass_network_select['pass_outcome_name'].unique()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_sb_pass_network_select = df_sb_pass_network_select[df_sb_pass_network_select['pass_outcome_name'].isnull()]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_sb_pass_network_select.shape" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_sb_pass_network_select = df_sb_pass_network_select.sort_values(['season_name', 'match_date', 'kick_off', 'Full_Fixture_Date', 'index', 'id', 'df_name'], ascending=[True, True, True, True, True, True, True])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_sb_pass_network_select['Pass_X'] = df_sb_pass_network_select['Pass_X'].astype(str).astype(float)\n", "df_sb_pass_network_select['Pass_Y'] = df_sb_pass_network_select['Pass_Y'].astype(str).astype(float)\n", "df_sb_pass_network_select['Carry_X'] = df_sb_pass_network_select['Carry_X'].astype(str).astype(float)\n", "df_sb_pass_network_select['Carry_Y'] = df_sb_pass_network_select['Carry_Y'].astype(str).astype(float)\n", "df_sb_pass_network_select['location_x'] = df_sb_pass_network_select['location_x'].astype(str).astype(float)\n", "df_sb_pass_network_select['location_y'] = df_sb_pass_network_select['location_y'].astype(str).astype(float)\n", "df_sb_pass_network_select['pass_end_location_x'] = 
df_sb_pass_network_select['pass_end_location_x'].astype(str).astype(float)\n", "df_sb_pass_network_select['pass_end_location_y'] = df_sb_pass_network_select['pass_end_location_y'].astype(str).astype(float)\n", "df_sb_pass_network_select['carry_end_location_x'] = df_sb_pass_network_select['carry_end_location_x'].astype(str).astype(float)\n", "df_sb_pass_network_select['carry_end_location_y'] = df_sb_pass_network_select['carry_end_location_y'].astype(str).astype(float)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "df_sb_pass_network_select.dtypes" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_sb_pass_network_select.head()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#\n", "\n", "##\n", "df_sb_pass_network_grouped = (df_sb_pass_network_select\n", " .groupby(['competition_name',\n", " 'season_name',\n", " 'match_date',\n", " 'kick_off',\n", " 'Full_Fixture_Date',\n", " 'Team',\n", " 'Opponent',\n", " 'home_team_home_team_name',\n", " 'away_team_away_team_name',\n", " 'home_score',\n", " 'away_score',\n", " 'pass_to_from',\n", " 'player_name',\n", " 'pass_recipient_name',\n", " 'player_recipient'\n", " ])\n", " .agg({'pass_to_from': ['count']\n", " })\n", " )\n", "\n", "##\n", "df_sb_pass_network_grouped.columns = df_sb_pass_network_grouped.columns.droplevel(level=0)\n", "\n", "##\n", "df_sb_pass_network_grouped = df_sb_pass_network_grouped.reset_index()\n", "\n", "## \n", "df_sb_pass_network_grouped.columns = ['competition_name',\n", " 'season_name',\n", " 'match_date',\n", " 'kick_off',\n", " 'full_fixture_date',\n", " 'team',\n", " 'opponent',\n", " 'home_team_name',\n", " 'away_team_name',\n", " 'home_score',\n", " 'away_score',\n", " 'pass_to_from',\n", " 'player_name',\n", " 'pass_recipient_name',\n", " 'player_recipient',\n", " 'count_passes',\n", " ]\n", "\n", "##\n", 
"#df_sb_pass_network_grouped['count_passes'] = df_sb_pass_network_grouped['count_passes'] / 2\n", "\n", "##\n", "df_sb_pass_network_grouped = df_sb_pass_network_grouped.sort_values(['season_name', 'match_date', 'kick_off', 'full_fixture_date', 'team', 'opponent', 'pass_to_from'], ascending=[True, True, True, True, True, True, True])\n", "\n", "##\n", "df_sb_pass_network_grouped.head()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_sb_pass_network_grouped.shape" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Select columns of interest\n", "\n", "## Define columns\n", "cols = ['Full_Fixture_Date',\n", " 'player_name',\n", " 'position_id',\n", " 'position_name',\n", " 'Pass_X',\n", " 'Pass_Y'\n", " ]\n", "\n", "##\n", "df_sb_pass_network_avg_pass = df_sb_pass_network_select[cols]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_sb_pass_network_avg_pass " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#\n", "\n", "##\n", "df_sb_pass_network_avg_pass_grouped = (df_sb_pass_network_avg_pass \n", " .groupby(['Full_Fixture_Date',\n", " 'player_name',\n", " 'position_id',\n", " 'position_name',\n", " ])\n", " .agg({'Pass_X': ['mean'],\n", " 'Pass_Y': ['mean']\n", " })\n", " )\n", "\n", "##\n", "df_sb_pass_network_avg_pass_grouped.columns = df_sb_pass_network_avg_pass_grouped .columns.droplevel(level=0)\n", "\n", "##\n", "df_sb_pass_network_avg_pass_grouped = df_sb_pass_network_avg_pass_grouped.reset_index()\n", "\n", "## \n", "df_sb_pass_network_avg_pass_grouped.columns = ['full_fixture_date',\n", " 'player_name',\n", " 'position_id',\n", " 'position_name',\n", " 'avg_location_pass_x',\n", " 'avg_location_pass_y'\n", " ]\n", "\n", "##\n", "df_sb_pass_network_avg_pass_grouped['avg_location_pass_x'] = 
df_sb_pass_network_avg_pass_grouped['avg_location_pass_x'].round(decimals=1)\n", "df_sb_pass_network_avg_pass_grouped['avg_location_pass_y'] = df_sb_pass_network_avg_pass_grouped['avg_location_pass_y'].round(decimals=1)\n", "\n", "## Sort by fixture and player\n", "df_sb_pass_network_avg_pass_grouped = df_sb_pass_network_avg_pass_grouped.sort_values(['full_fixture_date', 'player_name'], ascending=[True, True])\n", "\n", "## Display DataFrame\n", "df_sb_pass_network_avg_pass_grouped.head()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Join the grouped pass counts to the average pass locations, matching player_recipient to player_name within each fixture\n", "df_sb_pass_network_final = pd.merge(df_sb_pass_network_grouped, df_sb_pass_network_avg_pass_grouped, left_on=['full_fixture_date', 'player_recipient'], right_on=['full_fixture_date', 'player_name'])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "## Rename columns\n", "df_sb_pass_network_final = df_sb_pass_network_final.rename(columns={'player_name_x': 'player_name',\n", " #'player_name_x': 'player_name'\n", " }\n", " )" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "df_sb_pass_network_final.head()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_sb_pass_network_final.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### Export Dataset" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Export to the StatsBomb engineered events directory\n", "df_sb_pass_network_final.to_csv(data_dir_sb + '/engineered/events/' + 'sb_events_passing_network.csv', index=None, header=True)\n", "\n", "# Export to the export directory\n", "df_sb_pass_network_final.to_csv(data_dir + '/export/' + 'sb_events_passing_network.csv', index=None, header=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 8.4. 
Extract Lineups from DataFrame" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# List unique values in the df_sb['type.name'] column\n", "df_sb['type.name'].unique()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The starting XI players and formation can be found in the rows where `type.name` is 'Starting XI'." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_lineup = df_sb[df_sb['type.name'] == 'Starting XI']" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_lineup" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Streamline DataFrame to include just the columns of interest\n", "\n", "## Define columns\n", "cols = ['id', 'type.name', 'match_date', 'kick_off', 'Full_Fixture_Date', 'team.id', 'team.name', 'tactics.formation', 'tactics.lineup', 'competition_name', 'season_name', 'home_team.home_team_name', 'away_team.away_team_name', 'Team', 'Opponent', 'home_score', 'away_score']\n", "\n", "## Select only columns of interest\n", "df_lineup_select = df_lineup[cols]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_lineup_select" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The lineup data extracted so far is shown above. To get the starting XI players, we need to break down the nested `tactics.lineup` attribute." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# Normalize tactics.lineup - see: https://stackoverflow.com/questions/52795561/flattening-nested-json-in-pandas-data-frame\n", "\n", "## explode all columns with lists of dicts\n", "df_lineup_select_normalize = df_lineup_select.apply(lambda x: x.explode()).reset_index(drop=True)\n", "\n", "## list of columns with dicts\n", "cols_to_normalize = ['tactics.lineup']\n", "\n", "## prefix each new column with the source column name, in case the normalized keys overlap with existing column names\n", "normalized = list()\n", "\n", "for col in cols_to_normalize:\n", " d = pd.json_normalize(df_lineup_select_normalize[col], sep='_')\n", " d.columns = [f'{col}_{v}' for v in d.columns]\n", " normalized.append(d.copy())\n", "\n", "## combine df with the normalized columns\n", "df_lineup_select_normalize = pd.concat([df_lineup_select_normalize] + normalized, axis=1).drop(columns=cols_to_normalize)\n", "\n", "## display the first 30 rows\n", "df_lineup_select_normalize.head(30)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_lineup_engineered = df_lineup_select_normalize" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Streamline DataFrame to include just the columns of interest\n", "\n", "## Define columns\n", "cols = ['id', 'match_date', 'kick_off', 'Full_Fixture_Date', 'type.name', 'season_name', 'competition_name', 'home_team.home_team_name', 'away_team.away_team_name', 'Team', 'Opponent', 'home_score', 'away_score', 'tactics.formation', 'tactics.lineup_jersey_number', 'tactics.lineup_position_id', 'tactics.lineup_player_name', 'tactics.lineup_position_name']\n", "\n", "## Select only columns of interest (copy to avoid a SettingWithCopyWarning on later assignment)\n", "df_lineup_engineered_select = df_lineup_engineered[cols].copy()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { 
"scrolled": true }, "outputs": [], "source": [ "df_lineup_engineered_select['tactics.formation'] = df_lineup_engineered_select['tactics.formation'].astype('Int64')\n", "df_lineup_engineered_select['tactics.lineup_jersey_number'] = df_lineup_engineered_select['tactics.lineup_jersey_number'].astype('Int64')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_lineup_engineered_select.head(5)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_lineup_engineered_select.columns" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "## Rename columns\n", "df_lineup_engineered_select = df_lineup_engineered_select.rename(columns={'id': 'Match_Id',\n", " 'match_date': 'Match_Date',\n", " 'kick_off': 'Kick_Off',\n", " 'type.name': 'Type_Name',\n", " 'season_name': 'Season',\n", " 'competition_name': 'Competition',\n", " 'home_team.home_team_name': 'Home_Team',\n", " 'away_team.away_team_name': 'Away_Team',\n", " 'home_score': 'Home_Score',\n", " 'away_score': 'Away_Score',\n", " 'tactics.formation': 'Formation',\n", " 'tactics.lineup_jersey_number': 'Shirt_Number',\n", " 'tactics.lineup_position_id': 'Position_Number',\n", " 'tactics.lineup_player_name': 'Player_Name',\n", " 'tactics.lineup_position_name': 'Position_Name'\n", " }\n", " \n", " )\n", "\n", "## Display DataFrame\n", "df_lineup_engineered_select.head()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Convert Match_Date from string to datetime64[ns]\n", "df_lineup_engineered_select['Match_Date']= pd.to_datetime(df_lineup_engineered_select['Match_Date'])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\"\"\"\n", "# THIS IS NOT WORKING ATM\n", "\n", "# Convert Kick_Off from string to datetime64[ns]\n", "df_lineup_engineered_select['Kick_Off']= 
pd.to_datetime(df_lineup_engineered_select['Kick_Off'], format='%H:%M', errors='ignore')\n", "df_lineup_engineered_select['Kick_Off'] = df_lineup_engineered_select['Kick_Off'].dt.time\n", "\"\"\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_lineup_engineered_select.dtypes" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Put hyphens between numbers in Formation attribute\n", "\n", "## Convert Formation attribute from Integer to String\n", "df_lineup_engineered_select['Formation'] = df_lineup_engineered_select['Formation'].astype(str)\n", "\n", "## Define custom function to add hyphen between letters: StackOverflow: https://stackoverflow.com/questions/29382285/python-making-a-function-that-would-add-between-letters\n", "def f(s):\n", " m = s[0]\n", " for i in s[1:]:\n", " m += '-' + i\n", " return m\n", " \n", "## Apply custom function\n", "df_lineup_engineered_select['Formation'] = df_lineup_engineered_select.apply(lambda row: f(row['Formation']),axis=1)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "lst_formation = df_lineup_engineered_select['Formation'].unique().tolist()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "lst_formation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### Add Position Coordinates" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_formations_coords = pd.read_csv(data_dir_sb + '/sb_formation_coordinates.csv')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#df_formations_coords['Id'] = df_formations_coords['Id'].astype('Int8')\n", "#df_formations_coords['Player_Number'] = df_formations_coords['Player_Number'].astype('Int8')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, 
"outputs": [], "source": [ "df_lineup_engineered_select = pd.merge(df_lineup_engineered_select, df_formations_coords, how='left', left_on=['Formation', 'Position_Number'], right_on=['Formation', 'Player_Number'])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#df_lineup_engineered_select = df_lineup_engineered_select.drop(['Player_Number'], axis=1)\n", "df_lineup_engineered_select = df_lineup_engineered_select.drop(['Id'], axis=1)\n", "df_lineup_engineered_select = df_lineup_engineered_select.drop(['Player_Position'], axis=1)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_lineup_engineered_select.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### Add Opponent Data to Each Row" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Select columns of interest\n", "\n", "## Define columns\n", "cols = ['Match_Date',\n", " 'Competition',\n", " 'Full_Fixture_Date',\n", " 'Team',\n", " 'Formation'\n", " ]\n", "\n", "##\n", "df_lineup_opponent = df_lineup_engineered_select[cols]\n", "\n", "##\n", "df_lineup_opponent = df_lineup_opponent.drop_duplicates()\n", "\n", "##\n", "df_lineup_opponent.head()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Join DataFrame to itself on 'Date', 'Fixture', 'Team'/'Opponent', and 'Event', to join Team and Opponent together\n", "df_lineup_engineered_opponent_select = pd.merge(df_lineup_engineered_select, df_lineup_opponent, how='left', left_on=['Match_Date', 'Competition', 'Full_Fixture_Date', 'Opponent'], right_on = ['Match_Date', 'Competition', 'Full_Fixture_Date', 'Team'])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Clean Data\n", "\n", "## Drop columns\n", "df_lineup_engineered_opponent_select = df_lineup_engineered_opponent_select.drop(columns=['Team_y'])\n", 
"\n", "\n", "## Rename columns\n", "df_lineup_engineered_opponent_select = df_lineup_engineered_opponent_select.rename(columns={'Team_x': 'Team',\n", " 'Formation_x': 'Formation',\n", " 'Formation_y': 'Opponent_Formation'\n", " }\n", " )\n", "\n", "## Display DataFrame\n", "df_lineup_engineered_opponent_select.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### Export DataFrame" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Export \n", "df_lineup_engineered_opponent_select.to_csv(data_dir_sb + '/lineups/engineered/' + '/sb_lineups.csv', index=None, header=True)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Export \n", "df_lineup_engineered_opponent_select.to_csv(data_dir + '/export/' + '/sb_lineups.csv', index=None, header=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 8.5. Tactical Shifts" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_tactics = df_sb[df_sb['type.name'] == 'Tactical Shift']" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_tactics" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Select columns of interest\n", "\n", "##\n", "cols = ['id', 'type.name', 'team.id', 'team.name', 'tactics.formation', 'tactics.lineup']\n", "\n", "##\n", "df_tactics_select = df_tactics[cols]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_tactics_select" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Normalize tactics.lineup - see: https://stackoverflow.com/questions/52795561/flattening-nested-json-in-pandas-data-frame\n", "\n", "## explode all columns with lists of dicts\n", "df_tactics_select_normalize = df_tactics_select.apply(lambda x: 
x.explode()).reset_index(drop=True)\n", "\n", "## list of columns with dicts\n", "cols_to_normalize = ['tactics.lineup']\n", "\n", "## prefix each new column with the source column name, in case the normalized keys overlap with existing column names\n", "normalized = list()\n", "for col in cols_to_normalize:\n", " d = pd.json_normalize(df_tactics_select_normalize[col], sep='_')\n", " d.columns = [f'{col}_{v}' for v in d.columns]\n", " normalized.append(d.copy())\n", "\n", "## combine df with the normalized columns\n", "df_tactics_select_normalize = pd.concat([df_tactics_select_normalize] + normalized, axis=1).drop(columns=cols_to_normalize)\n", "\n", "## display the first 10 rows\n", "df_tactics_select_normalize.head(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 9. Summary\n", "This notebook engineers previously parsed CSV data from the [StatsBomb Open Data GitHub repository](https://github.com/statsbomb/open-data) using [pandas](http://pandas.pydata.org/), to create several datasets for visualisation in [Tableau](https://public.tableau.com/profile/edd.webster)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 10. Next Steps\n", "The next stage is to visualise this data in Tableau." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 11. References\n", "\n", "#### Data\n", "* [StatsBomb](https://statsbomb.com/) data\n", "* [StatsBomb](https://github.com/statsbomb/open-data/tree/master/data) open data GitHub repository\n", "\n", "#### Visualisation\n", "* [Passing networks](https://community.tableau.com/s/question/0D54T00000C6YbE/football-passing-network)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "***Visit my website [eddwebster.com](https://www.eddwebster.com) or my [GitHub Repository](https://github.com/eddwebster) for more projects. 
If you'd like to get in contact, my Twitter handle is [@eddwebster](http://www.twitter.com/eddwebster) and my email is: edd.j.webster@gmail.com.***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[Back to the top](#top)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.6" }, "varInspector": { "cols": { "lenName": 16, "lenType": 16, "lenVar": 40 }, "kernels_config": { "python": { "delete_cmd_postfix": "", "delete_cmd_prefix": "del ", "library": "var_list.py", "varRefreshCmd": "print(var_dic_list())" }, "r": { "delete_cmd_postfix": ") ", "delete_cmd_prefix": "rm(", "library": "var_list.r", "varRefreshCmd": "cat(var_dic_list()) " } }, "oldHeight": 642, "position": { "height": "40px", "left": "1118px", "right": "20px", "top": "-7px", "width": "489px" }, "types_to_exclude": [ "module", "function", "builtin_function_or_method", "instance", "_Feature" ], "varInspector_section_display": "none", "window_display": true } }, "nbformat": 4, "nbformat_minor": 2 }