{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Simple Soccer Simulator\n", "\n", "## About \n", "\n", "This notebook will demonstrate, through increasingly complex scenarios, why it's (generally) advantageous to make \"safe\" passes while playing soccer. This is something I totally didn't get as a kid, but it's been on my mind a lot lately. \n", "\n", "I thought about throwing a GUI behind this, but that's a lot of work, so I'm leaving it as-is for now. \n", "\n", "My ultimate goal? To convince you (and myself) that making safe passes, not losing possession, and carefully making your way up the field is important; that the difference between an 80% pass completion rate and an 85% pass completion rate is significant, especially when everyone on the team is on board and goes for the higher pass completion rate. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Simulating a pass\n", "\n", "Ok, let's get this out of the way because everything else will build on it. Here's a small snippet of code that takes as input a person's pass completion rate (between 0 and 1) and uses a weighted random setup (#Ithinkthisworks) to determine if the pass was successful or not. 
Yeah, yeah, yeah, there are a lot of factors that determine if a pass is successful beyond the player's pass completion rate, but this is #SimpleSoccerSimulator, after all :) " ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import random\n", "\n", "# returns True if the pass is completed successfully, given R chance of successful pass completion\n", "# returns False if the pass was unsuccessful\n", "def make_pass(R):\n", "    if R >= 1 or R <= 0:\n", "        raise ValueError(\"Warning: R should be a float between 0 and 1 (exclusive).\")\n", "    temp = random.random()\n", "    if temp > R:\n", "        return False\n", "    return True" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And below is a demo of how it works. We assume a 95% pass completion rate, make 1000 passes, and (hopefully) approximately 950 of them are successful. This doesn't actually show that this is happening truly randomly, so let's just hope I did it right. " ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "947\n" ] } ], "source": [ "R = 0.95\n", "a = []\n", "for x in xrange(1000):\n", "    a.append(make_pass(R))\n", "\n", "print sum(a) # number of True elements in the list (confirms that make_pass works as expected)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Make N consecutive passes (simple)\n", "\n", "Can we all agree that in order to score, we usually need to string together a few passes? Let's say we need to string together N passes before we can score (this is obviously an oversimplification - sometimes a goal happens after 1 pass or none or 30, but bear with me ...). 
Assuming everyone on the team has the same pass completion rate of R (again, this is oversimplified, I know), this will calculate the chance of successfully completing N passes (first function) and introduce randomness in a simulation using the make_pass function above (second function). " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Am I wrong? \n", "\n", "Probably. And let's think about how/why in order to decide if the rest of my arguments are useful. \n", "\n", "Who knows if [this](http://thesportjournal.org/article/analysis-of-goal-scoring-patterns-in-the-2012-european-football-championship/) is a legit source? Not me, but it's got #data, and that gives me something to build on. \n", "\n", "According to that site, \n", "> Since the landmark work of Reep and Benjamin (42), many studies have focused on goal scoring patterns in various national and international football tournaments. Reep and Benjamin (42) showed that approximately 80% of goals scored were the result of a short sequence of three or less passes and that 1 in 10 shots tend to lead to a goal. More recently, Hughes and Franks (21) showed that in the 1990 and 1994 World Cup tournaments 84% and 80% of goals respectively came from possessions of four or less passes. In addition, 80% and 77% of the shots at goal were a result of a sequence of four or less passes.\n", "\n", "Ok, so maybe you don't have to make N passes in a row in order to score, but I made this notebook before reading that, and you do (presumably) need to string together passes to make it down the field, so the following bit from that same site, which says that an overwhelming percentage of goals are scored from inside the penalty area, might help convince you that the rest of this exercise is interesting. \n", "> Other research has examined the position on the pitch from which goals are scored. In a recent study Wright et al. 
(50) showed that from 167 goals from English Premier League, 87% of goals were scored inside the penalty area which is similar to the 90% observed by Olsen (36) for the 1986 World Cup whereas Dufour (14) reported 80% for the 1990 World Cup. Yiannakos and Armatas (51) reported that 44.4% of goals scored were inside the penalty area, 35.2% inside the goal area, and 20.4% outside the penalty area, for the 2004 European Championship in Portugal. Finally, Hughes et al. (22) showed that successful teams in the 1986 Football World Cup made more attempts inside the penalty area in comparison to unsuccessful teams.\n", "\n", "Another part of my argument that is missing is this: why not dribble? " ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def chance_of_n_consecutive_passes_simple(N, R):\n", "    return R**N\n", "\n", "def simulate_n_consecutive_passes_simple(N, R):\n", "    result_str = \"\"\n", "    for a_pass in xrange(N):\n", "        result_str += \"Pass \" + str(a_pass+1) + \" of \" + str(N)\n", "        if make_pass(R):\n", "            result_str += \" completed.\\n\"\n", "        else:\n", "            result_str += \" failed.\\n\"\n", "            return (False, result_str)\n", "    result_str += str(N) + \" passes completed. Success!\\n\"\n", "    return (True, result_str)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## How much difference does 5% make, anyway? \n", "\n", "Ok, so now we're getting into the #goodstuff. The following 3 lines give the chance of completing 5 passes if each player has a 95% vs 90% vs 80% pass completion rate. Notice that the 5% difference between 95% completion and 90% completion results in 77% vs 59% chance of completing 5 passes; that means that with a 90% pass completion rate, your team only connects 5 passes about 76% as often (0.59 / 0.77) as it would if everyone had a 95% completion rate. 
And with each player having an 80% pass completion rate, which still sounds good until you get into this math ... you're more likely to lose the ball while making 5 passes than to actually connect 5 passes in a row (only about a 33% chance)!" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.7737809375\n", "0.59049\n", "0.32768\n" ] } ], "source": [ "print chance_of_n_consecutive_passes_simple(5,.95)\n", "print chance_of_n_consecutive_passes_simple(5,.9)\n", "print chance_of_n_consecutive_passes_simple(5,.8)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "False\n", "Pass 1 of 3 completed.\n", "Pass 2 of 3 failed.\n", "\n" ] } ], "source": [ "simulation = simulate_n_consecutive_passes_simple(3,.5)\n", "print simulation[0]\n", "print simulation[1]" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# repeats the given simulation until it succeeds and returns the number of tries that took\n", "def try_until_success(function, *args):\n", "    success = False\n", "    count = 0\n", "    result_str = \"\"\n", "    while not success:\n", "        count += 1\n", "        result_str += \"Try \" + str(count) + \":\\n\"\n", "        result = function(*args)\n", "        success = result[0]\n", "        result_str += result[1]\n", "    return count" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "8\n", "3\n" ] } ], "source": [ "print try_until_success(simulate_n_consecutive_passes_simple, 10,.8)\n", "print try_until_success(simulate_n_consecutive_passes_simple, 10,.9)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Make N consecutive passes (complex)\n", "\n", "Now assume that the player making pass i has their own completion rate R_i, so the chance of completing all N passes is the product of the R_i. 
" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from operator import mul\n", "\n", "def chance_of_n_consecutive_passes_complex(N, R):\n", "    if len(R) != N:\n", "        raise ValueError(\"Warning: R should be an array of length N!\")\n", "    return reduce(mul, R, 1)\n", "\n", "def simulate_n_consecutive_passes_complex(N, R):\n", "    if len(R) != N:\n", "        raise ValueError(\"Warning: R should be an array of length N!\")\n", "\n", "    result_str = \"\"\n", "    for a_pass in xrange(N):\n", "        result_str += \"Pass \" + str(a_pass+1) + \" of \" + str(N)\n", "        if make_pass(R[a_pass]):\n", "            result_str += \" completed.\\n\"\n", "        else:\n", "            result_str += \" failed.\\n\"\n", "            return (False, result_str)\n", "    result_str += str(N) + \" passes completed. Success!\\n\"\n", "    return (True, result_str)\n" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.03125\n", "0.03125\n" ] } ], "source": [ "# confirms that the simple matches the complex in a simple case\n", "print chance_of_n_consecutive_passes_complex(5, [0.5, 0.5, 0.5, 0.5, 0.5])\n", "print chance_of_n_consecutive_passes_simple(5, 0.5)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.857375\n", "0.722\n" ] } ], "source": [ "print chance_of_n_consecutive_passes_complex(3,[.95, 0.95, 0.95])\n", "print chance_of_n_consecutive_passes_complex(3,[.95, 0.8, 0.95])" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "True\n", "Pass 1 of 3 completed.\n", "Pass 2 of 3 completed.\n", "Pass 3 of 3 completed.\n", "3 passes completed. 
Success!\n", "\n" ] } ], "source": [ "simulation = simulate_n_consecutive_passes_complex(3,[.95, 0.5, 0.95])\n", "print simulation[0]\n", "print simulation[1]" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2\n", "5\n" ] } ], "source": [ "print try_until_success(simulate_n_consecutive_passes_complex, 10,[.9, .9, .9, .9, .9, .9, .9, .9, 0.8, 0.9])\n", "print try_until_success(simulate_n_consecutive_passes_complex, 10,[.9, .9, .9, .9, .9, .9, .9, .9, 0.9, 0.9])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Create teams, where each player has a pass completion rate" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false }, "outputs": [], "source": [ "DELTA = 0.1 # the ball moves 0.1 per step toward the opposing goal, so it takes about 10 net steps from midfield (0) to score\n", "\n", "class Player:\n", "    def __init__(self, name, (completion_rate, total_passes)):\n", "        # clamp the rate into the open interval that make_pass accepts\n", "        if completion_rate <= 0:\n", "            completion_rate = 0.01\n", "        self.completion_rate = min(completion_rate, 0.98)\n", "        self.total_passes = total_passes\n", "        self.name = name\n", "\n", "    def get_completion_rate(self):\n", "        return self.completion_rate\n", "\n", "    def update(self, pass_bool):\n", "        return ## comment this out if you want the pass completion rate to dynamically change as the players play\n", "        successful_passes = self.completion_rate * self.total_passes\n", "        if pass_bool:\n", "            successful_passes += 1\n", "        self.total_passes += 1\n", "        self.completion_rate = successful_passes*1.0 / self.total_passes\n", "        if self.completion_rate > 0.98:\n", "            self.completion_rate = 0.98\n", "\n", "class Team:\n", "    # a single constructor: Python keeps only the last __init__ defined, so players is optional here\n", "    def __init__(self, name, players=None, max_players=11):\n", "        self.players = list(players) if players is not None else []\n", "        if len(self.players) > max_players:\n", "            raise ValueError(\"Too many players on 
the field!\")\n", "        self.score = 0\n", "        self.max_players = max_players\n", "        self.name = name\n", "\n", "    def add_player(self, name, stats):\n", "        # stats is a (completion_rate, total_passes) tuple, as expected by Player\n", "        if len(self.players) >= self.max_players:\n", "            raise ValueError(\"Too many players on the field!\")\n", "        self.players.append(Player(name, stats))\n", "\n", "    def update_score(self):\n", "        self.score += 1\n", "\n", "class Game:\n", "    def __init__(self, teamA, teamB):\n", "        self.teamA = teamA\n", "        self.teamB = teamB\n", "        self.ball_position = 0\n", "        self.possession = random.choice(self.teamA.players)\n", "        self.possessing_team = self.teamA\n", "\n", "        self.consecutive_passes = 0\n", "\n", "    def get_winner(self):\n", "        if self.teamA.score > self.teamB.score:\n", "            return self.teamA\n", "        elif self.teamB.score > self.teamA.score:\n", "            return self.teamB\n", "        else:\n", "            return None\n", "\n", "    def print_status(self):\n", "\n", "        print \"-1 ||\", self.teamA.name, \"--> <--\", self.teamB.name, \"|| 1\"\n", "\n", "        print self.teamA.name, \"score:\", self.teamA.score\n", "        print self.teamB.name, \"score:\", self.teamB.score\n", "\n", "        print 'Ball at:', self.ball_position\n", "\n", "        print 'Possession:', self.possession.name\n", "        print 'Possessing team:', self.possessing_team.name\n", "        print 'Consecutive passes:', self.consecutive_passes\n", "\n", "    def update_ball_position(self, delta):\n", "        self.ball_position += delta\n", "\n", "\n", "    def step(self):\n", "\n", "        pass_success = make_pass(self.possession.completion_rate)\n", "        #print 'successful pass?', pass_success\n", "        self.possession.update(pass_success)\n", "\n", "        if self.teamA == self.possessing_team:\n", "\n", "            self.update_ball_position(DELTA)\n", "\n", "            if pass_success:\n", "                self.consecutive_passes += 1\n", "                #print 'pass succeeded'\n", "                other_players = list(self.teamA.players)\n", "                other_players.remove(self.possession)\n", "                #for p in other_players:\n", "                #    print p.name\n", "                self.possession = random.choice(other_players)\n", "                #print 
self.possession.name\n", "            else:\n", "                self.consecutive_passes = 0\n", "                #print 'pass failed'\n", "                self.possession = random.choice(self.teamB.players)\n", "                self.possessing_team = self.teamB\n", "                #print self.possessing_team.name\n", "                #print self.possession.name\n", "\n", "\n", "        else:\n", "            self.update_ball_position(0-DELTA)\n", "\n", "            if pass_success:\n", "                self.consecutive_passes += 1\n", "                other_players = list(self.teamB.players)\n", "                other_players.remove(self.possession)\n", "                self.possession = random.choice(other_players)\n", "            else:\n", "                self.consecutive_passes = 0\n", "\n", "                self.possession = random.choice(self.teamA.players)\n", "                self.possessing_team = self.teamA\n", "\n", "\n", "        if self.ball_position > 1:\n", "            self.teamA.update_score()\n", "            self.ball_position = 0\n", "            self.possession = random.choice(self.teamB.players)\n", "            self.possessing_team = self.teamB\n", "            #print 'Required', self.consecutive_passes, 'consecutive passes to score'\n", "            self.consecutive_passes = 0\n", "\n", "        elif self.ball_position < -1:\n", "            self.teamB.update_score()\n", "            self.ball_position = 0\n", "            self.possession = random.choice(self.teamA.players)\n", "            self.possessing_team = self.teamA\n", "            #print 'Required', self.consecutive_passes, 'consecutive passes to score'\n", "            self.consecutive_passes = 0\n", "\n" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false }, "outputs": [], "source": [ "p1 = Player('p1', (0.9,10))\n", "p2 = Player('p2', (0.9,10))\n", "p3 = Player('p3', (0.85,10))\n", "\n", "p4 = Player('p4', (0.9,10))\n", "p5 = Player('p5', (0.9,10))\n", "p6 = Player('p6', (0.9,10))\n", "\n", "\n", "t1 = Team('t1', [p1,p2,p3], 3)\n", "t2 = Team('t2', [p4,p5,p6], 3)\n", "\n", "game = Game(t1, t2)" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "-1 || t1 --> <-- t2 || 1\n", "t1 score: 0\n", "t2 score: 0\n", "Ball at: 
0\n", "Possession: p3\n", "Possessing team: t1\n", "Consecutive passes: 0\n" ] } ], "source": [ "game.print_status()" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": false }, "outputs": [], "source": [ "#game.step()\n", "#game.print_status()" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "-1 || t1 --> <-- t2 || 1\n", "t1 score: 3\n", "t2 score: 2\n", "Ball at: 0.5\n", "Possession: p2\n", "Possessing team: t1\n", "Consecutive passes: 11\n" ] }, { "data": { "text/plain": [ "'t1'" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "for x in xrange(100):\n", "    game.step()\n", "game.print_status()\n", "winner = game.get_winner() # None if the game ended in a tie\n", "winner.name if winner is not None else 'none'" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "Counter({'none': 287, 't1': 319, 't2': 394})" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from collections import Counter\n", "\n", "winners = []\n", "for g in xrange(1000):\n", "    game = Game(Team('t1', [p1,p2,p3], 3), Team('t2', [p4,p5,p6], 3))\n", "    for time in xrange(100):\n", "        game.step()\n", "    winner = game.get_winner()\n", "    if winner is None:\n", "        winners.append('none')\n", "    else:\n", "        winners.append(winner.name)\n", "\n", "Counter(winners)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Ok, but what about playing for the fun of the game? Most of us are playing adult co-ed soccer, after all. \n", "\n", "Do you have more fun when your team has the ball or when the other team has the ball? 
Next, I'll do some calculations to show that we're more likely to have more possession over the course of the game (i.e., each individual player is likely to touch the ball more times) if we all have a better pass completion rate than if even one player decides to put all their eggs in the immediate fun gratification basket and do something risky every time they get the ball. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "http://tempofreesoccer.blogspot.com/2013/06/passing-passes-per-turnover-and.html" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "TODO: \n", "\n", "1. set \"safe\", \"normal\", and \"risky\" pass completion rates for N players and the % chance that a given player will make the choice to carry out each of these 3 pass types\n", "2. simulation differences are not in the pass completion rates but in the % chances (e.g., one team full of players that consistently make safe/normal passes vs. a team that consistently makes risky passes) \n", "3. pre-define that safe passes go back a little, risky passes go forward more than normal\n", "\n", "Can I \"show\" that teams that make more safe/normal passes win more often than teams that make more risky passes? \n", "\n", "Something about shooting likelihood? Incorporating a \"choice\" of making a shot from distance X from goal (where the farther you get the harder the shot is to make). Simulating punting (e.g., if keeper gets it, they will punt it and the ball re-starts at the midfield with 50/50 chance of going to your team ... or other options off the punt maybe?). 
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.11" } }, "nbformat": 4, "nbformat_minor": 0 }