{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Bayesian Statistics Made Simple\n", "===\n", "\n", "Code and exercises from my workshop on Bayesian statistics in Python.\n", "\n", "Copyright 2018 Allen Downey\n", "\n", "MIT License: https://opensource.org/licenses/MIT" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from __future__ import print_function, division\n", "\n", "%matplotlib inline\n", "\n", "import numpy as np\n", "\n", "from thinkbayes2 import Suite\n", "import thinkplot\n", "\n", "import warnings\n", "warnings.filterwarnings('ignore')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The likelihood function\n", "\n", "\n", "Here's a definition for `Bandit`, which extends `Suite` and defines a likelihood function that computes the probability of the data (a win or a loss) for a given value of `x` (the probability of winning).\n", "\n", "Note that `hypo` is a percentage in the range 0 to 100, so `Likelihood` divides it by 100 to get `x`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "class Bandit(Suite):\n", "    \n", "    def Likelihood(self, data, hypo):\n", "        \"\"\"\n", "        hypo is the prob of win (0-100)\n", "        data is a string, either 'W' or 'L'\n", "        \"\"\"\n", "        x = hypo / 100\n", "        if data == 'W':\n", "            return x\n", "        else:\n", "            return 1-x" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We'll start with a uniform distribution from 0 to 100." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "bandit = Bandit(range(101))\n", "thinkplot.Pdf(bandit)\n", "thinkplot.Config(xlabel='x', ylabel='Probability')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can update with a single loss:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "bandit.Update('L')\n", "thinkplot.Pdf(bandit)\n", "thinkplot.Config(xlabel='x', ylabel='Probability', legend=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another loss:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "bandit.Update('L')\n", "thinkplot.Pdf(bandit)\n", "thinkplot.Config(xlabel='x', ylabel='Probability', legend=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And a win:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "bandit.Update('W')\n", "thinkplot.Pdf(bandit)\n", "thinkplot.Config(xlabel='x', ylabel='Probability', legend=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Starting over, here's what it looks like after 1 win and 9 losses." 
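, "\n", "\n", "As a rough check on the result, here is a sketch of the same update in closed form. It assumes we treat `x` as a continuous probability between 0 and 1, which only approximates the 101-point grid used here. With a uniform prior, 1 win and 9 losses give\n", "\n", "$$p(x \\mid \\text{data}) \\propto x \\, (1 - x)^9,$$\n", "\n", "which is a $\\mathrm{Beta}(2, 10)$ distribution. Its mean is $2 / 12 \\approx 0.167$ and its mode is $1 / 10$, so on the 0 to 100 scale we should expect a posterior mean near 17 and a most likely value of 10."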
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "bandit = Bandit(range(101))\n", "\n", "for outcome in 'WLLLLLLLLL':\n", "    bandit.Update(outcome)\n", "\n", "thinkplot.Pdf(bandit)\n", "thinkplot.Config(xlabel='x', ylabel='Probability', legend=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The posterior mean is about 17%." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "bandit.Mean()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The most likely value is the observed proportion, 1/10 (that is, 10 on the 0 to 100 scale)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "bandit.MAP()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The 90% posterior credible interval has a 90% chance of containing the true value, provided that the prior distribution truly represents our background knowledge." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "bandit.CredibleInterval(90)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Multiple bandits" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now suppose we have several bandits and we want to decide which one to play." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For this example, we have 4 machines with these probabilities of winning:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "actual_probs = [0.10, 0.20, 0.30, 0.40]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following function simulates playing one machine once." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from random import random\n", "from collections import Counter\n", "\n", "counter = Counter()\n", "\n", "def flip(p):\n", "    return random() < p\n", "\n", "def play(i):\n", "    counter[i] += 1  # count how many times machine i is played\n", "    p = actual_probs[i]\n", "    if flip(p):\n", "        return 'W'\n", "    else:\n", "        return 'L'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here's a test, playing machine 3 twenty times:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for i in range(20):\n", "    result = play(3)\n", "    print(result, end=' ')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now I'll make 4 `Bandit` objects to represent our beliefs about the 4 machines." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "prior = range(101)\n", "beliefs = [Bandit(prior) for i in range(4)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This function displays the four posterior distributions:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "options = dict(yticklabels='invisible')\n", "\n", "def plot(beliefs, **options):\n", "    thinkplot.preplot(rows=2, cols=2)\n", "    for i, b in enumerate(beliefs):\n", "        thinkplot.subplot(i+1)\n", "        thinkplot.Pdf(b, label=i)\n", "        thinkplot.Config(**options)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "plot(beliefs, legend=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now suppose we play each machine 10 times. This function updates our beliefs about one of the machines based on one outcome." 
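, "\n", "\n", "In equation form, a single update is Bayes's rule applied over the grid of hypotheses. As a sketch, writing $d$ for the outcome and $h$ for a hypothesis from 0 to 100 (notation introduced here for illustration), and assuming `Update` multiplies each prior probability by the likelihood and renormalizes, the update computes\n", "\n", "$$P(h \\mid d) = \\frac{P(h) \\, P(d \\mid h)}{\\sum_{h'} P(h') \\, P(d \\mid h')},$$\n", "\n", "where $P(d \\mid h)$ is the value returned by `Likelihood`: $h / 100$ for a win and $1 - h / 100$ for a loss."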
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def update(beliefs, i, outcome):\n", "    beliefs[i].Update(outcome)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for i in range(4):\n", "    for _ in range(10):\n", "        outcome = play(i)\n", "        update(beliefs, i, outcome)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plot(beliefs, legend=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After playing each machine 10 times, we have some information about their probabilities:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "[belief.Mean() for belief in beliefs]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Bayesian Bandits\n", "\n", "To get more information, we could play each machine 100 times, but while we are gathering data, we are not making good use of it. The kernel of the Bayesian Bandits algorithm is that it collects and uses data at the same time. In other words, it balances exploration and exploitation.\n", "\n", "The following function chooses among the machines so that the probability of choosing each machine is equal to its \"probability of superiority\", that is, the posterior probability that it is the best machine.\n", "\n", "`Random` draws a random value from the posterior distribution.\n", "\n", "`argmax` returns the index of the machine whose draw was the highest." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def choose(beliefs):\n", "    ps = [b.Random() for b in beliefs]\n", "    return np.argmax(ps)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here's an example." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "choose(beliefs)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Putting it all together, the following function chooses a machine, plays once, and updates `beliefs`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def choose_play_update(beliefs, verbose=False):\n", "    i = choose(beliefs)\n", "    outcome = play(i)\n", "    update(beliefs, i, outcome)\n", "    if verbose:\n", "        print(i, outcome, beliefs[i].Mean())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here's an example:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "counter = Counter()\n", "choose_play_update(beliefs, verbose=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Trying it out" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's start again with a fresh set of machines:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "beliefs = [Bandit(prior) for i in range(4)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can play a few times and see how `beliefs` gets updated:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "num_plays = 100\n", "\n", "for i in range(num_plays):\n", "    choose_play_update(beliefs)\n", "    \n", "plot(beliefs)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can summarize `beliefs` by printing the posterior mean and credible interval:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for i, b in enumerate(beliefs):\n", "    print(b.Mean(), b.CredibleInterval(90))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The credible intervals usually contain the true values (10, 20, 30, and 40).\n", "\n", "The estimates are still rough, especially for 
the lower-probability machines. But that's a feature, not a bug: the goal is to play the high-probability machines most often. Making the estimates more precise is a means to that end, but not an end in itself.\n", "\n", "Let's see how many times each machine got played. If things go according to plan, the machines with higher probabilities should get played more often." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for machine, count in sorted(counter.items()):\n", "    print(machine, count)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "**Exercise:** Go back and run this section again with a different value of `num_plays` and see how it does." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.4" } }, "nbformat": 4, "nbformat_minor": 1 }