# Introduction to Simulations #

In this notebook we will learn:

- Comparisons
- For-loops
- The basic structure of a simulation built with for-loops
- Using `sum` with a comparison to count successes in a simulation


In [None]:
from datascience import *
import numpy as np

%matplotlib inline
import matplotlib.pyplot as plots
plots.style.use('fivethirtyeight')

## Comparison ##

In [None]:
3 > 1

In [None]:
type(3 > 1)

In [None]:
True

In [None]:
# Raises a NameError: Python's Boolean literals are capitalized (True, False)
true

In [None]:
# Raises a SyntaxError: a single = means assignment; use == to compare
3 = 3

In [None]:
3 == 3.0

In [None]:
10 != 2

In [None]:
x = 14
y = 3

In [None]:
x > 15

In [None]:
12 < x

In [None]:
x < 20

In [None]:
12 < x < 20

In [None]:
10 < x-y < 13

In [None]:
x > 13 and y < 3.14159
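A chained comparison such as `12 < x < 20` can be read as two comparisons joined by `and`. The short sketch below spells that out for the same `x` and `y`; the variable names `chained`, `spelled_out`, and `difference_in_range` are introduced here only for illustration.

```python
x = 14
y = 3

# A chained comparison...
chained = 12 < x < 20
# ...gives the same answer as joining the two pieces with `and`.
spelled_out = (12 < x) and (x < 20)
print(chained, spelled_out)

# The middle term can be an arithmetic expression; here x - y is 11.
difference_in_range = 10 < x - y < 13
print(difference_in_range)
```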

## Comparisons with arrays

In [None]:
pets = make_array('cat', 'cat', 'dog', 'cat', 'dog', 'rabbit')

In [None]:
pets == 'cat'

In [None]:
1 + 1 + 0 + 1 + 0 + 0

In [None]:
sum(make_array(True, True, False, True, False, False))

In [None]:
sum(pets == 'dog')

In [None]:
np.count_nonzero(pets == 'dog')
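Counting with `sum` works because Python treats `True` as 1 and `False` as 0 in arithmetic. The sketch below shows the same idea with a plain Python list standing in for the array (an assumption made so the example needs no libraries); `is_dog` is a name introduced here.

```python
pets = ['cat', 'cat', 'dog', 'cat', 'dog', 'rabbit']

# One True/False value per pet, like the array comparison pets == 'dog'
is_dog = [pet == 'dog' for pet in pets]
print(is_dog)

# Booleans behave like the numbers 1 and 0 when added
print(True + True, False + 0)

# So summing the comparisons counts how many are True
print(sum(is_dog))
```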

In [None]:
x = np.arange(20, 31)

In [None]:
x > 28

## For-loops ##

Python's `for` statement runs a block of code once for each item in a list or array. The structure is:

    for variable in list_or_array:
        body of loop


In [None]:
rainbow = make_array('red', 'orange', 'yellow', 'green', 'blue', 'indigo', 'violet')

for color in rainbow:
    print(color)

In [None]:
for thing in rainbow:
    print(thing)

In [None]:
num_array = np.arange(1, 3.25, 0.25)

## This for-loop is meaningless, don't try to figure out what's being computed
## we just want to demonstrate that a for-loop can involve multiple steps

for i in num_array:
    i2 = i**2
    i3 = i2 - 1
    i4 = i3*(1.09)
    print(i4)

In [None]:
for k in np.arange(11):
    print(k**3)

In [None]:
num_list = [1, 2, 3, 4, 5, 6, 7, 8]

for k in num_list:
    print((k - 1)**0.5)

### Appending Arrays

We'll see that appending to an array is a good way to keep track of the results of multiple simulations.

In [None]:
first = np.arange(4)
second = np.arange(10, 17)
second

In [None]:
np.append(first, 6)

In [None]:
first

In [None]:
np.append(first, second)

In [None]:
first

In [None]:
second

In [None]:
squares = make_array() # an empty array

num_array = np.arange(11)

for i in num_array:
    squares = np.append(squares, i**2)
 
squares

## Simulation

Let's play a game: we each roll a die. 

If my number is bigger: you pay me a dollar.

If they're the same: we do nothing.

If your number is bigger: I pay you a dollar.

Steps:
1. Find a way to simulate the roll of a die, then generalize to two dice.
2. Compute how much money we win/lose based on the result.
3. Do steps 1 and 2 10,000 times.
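The three steps above can be sketched compactly as follows. This is only a preview under stated assumptions: it uses the standard library's `random.randint` as a stand-in for the array-based tools developed in the rest of the notebook, and the helper names `roll_die` and `winnings` are hypothetical.

```python
import random

def roll_die():
    # Step 1: one roll of a fair six-sided die (hypothetical helper)
    return random.randint(1, 6)

def winnings(my_roll, your_roll):
    # Step 2: +1 if I win, -1 if you win, 0 on a tie
    if my_roll > your_roll:
        return 1
    elif my_roll < your_roll:
        return -1
    return 0

random.seed(0)  # fixed seed so reruns give the same sequence

# Step 3: repeat steps 1 and 2 10,000 times and total my winnings
total = 0
for _ in range(10000):
    total += winnings(roll_die(), roll_die())

print(total)
```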

### Random Selection

The `np.random.choice` function can help here.

In [None]:
die_faces = np.arange(1, 7)
die_faces

In [None]:
np.random.choice(die_faces)

In [None]:
np.random.choice(die_faces, 10)
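`np.random.choice` draws with replacement by default, so every roll is equally likely to show any face regardless of earlier rolls. The sketch below illustrates the same behavior with the standard library's `random.choices` (a stand-in chosen here, not the call the notebook uses) and tallies how often each face appears in 6,000 rolls; each count should be near 1,000.

```python
import random
from collections import Counter

die_faces = list(range(1, 7))

random.seed(1)  # fixed seed so reruns give the same sequence
rolls = random.choices(die_faces, k=6000)  # sampling with replacement

# Tally how many times each face came up
counts = Counter(rolls)
for face in die_faces:
    print(face, counts[face])
```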

### Conditional Statements

In [None]:
# Work in progress
def one_round(my_roll, your_roll):
    if my_roll > your_roll:
        return 1

In [None]:
one_round(4, 3)

In [None]:
one_round(2, 6)

In [None]:
# Final correct version
def one_round(my_roll, your_roll):
    if my_roll > your_roll:
        return 1       # I win a dollar
    elif your_roll > my_roll:
        return -1      # I lose a dollar
    elif your_roll == my_roll:
        return 0       # tie: no money changes hands

In [None]:
one_round(1, 1)

In [None]:
one_round(6, 5)

In [None]:
one_round(7, -1)

In [None]:
def simulate_one_round():
    my_roll = np.random.choice(die_faces)
    your_roll = np.random.choice(die_faces)
    return one_round(my_roll, your_roll)

In [None]:
simulate_one_round()

### Repeated Betting ###

In [None]:
results = make_array()
results

In [None]:
results = np.append(results, simulate_one_round())
results

In [None]:
game_outcomes = make_array()

for i in np.arange(5):
    game_outcomes = np.append(game_outcomes, simulate_one_round())
 
game_outcomes

In [None]:
game_outcomes = make_array()

for i in np.arange(10000):
    game_outcomes = np.append(game_outcomes, simulate_one_round())
 
game_outcomes

In [None]:
len(game_outcomes)

In [None]:
results = Table().with_column('My winnings', game_outcomes)

In [None]:
results

In [None]:
results.group('My winnings').barh('My winnings')

In [None]:
game_outcomes = make_array()

for i in np.arange(10000):
    game_outcomes = np.append(game_outcomes, simulate_one_round())
 
results = Table().with_column('My winnings', game_outcomes)

results.group('My winnings').barh('My winnings')

## Would this game be a good way to make money? ##

In [None]:
sum(results.column(0))
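The simulated total can be compared against an exact calculation. With two fair dice there are only 36 equally likely pairs of rolls, so a short enumeration gives the expected winnings per round. This sketch uses only the standard library; the names `wins`, `losses`, `ties`, and `expected_winnings` are introduced here.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
wins = losses = ties = 0

# Walk through all 36 equally likely (my roll, your roll) pairs
for mine, yours in product(faces, repeat=2):
    if mine > yours:
        wins += 1
    elif mine < yours:
        losses += 1
    else:
        ties += 1

# Each pair has probability 1/36; I gain 1 on a win and lose 1 on a loss
expected_winnings = Fraction(wins - losses, 36)

print(wins, losses, ties)   # 15 15 6
print(expected_winnings)    # 0
```

Because wins and losses are equally likely, the long-run average is zero dollars per round, and the simulated total should hover around zero relative to the 10,000 rounds played.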

In [None]:
# Bonus question: This simulation is relatively simple. 
# Can you find a way to run it without using a for loop?

my_rolls = np.random.choice(np.arange(1,7), size = 10000)
your_rolls = np.random.choice(np.arange(1,7), size = 10000)

results = Table().with_columns("Mine", my_rolls, "Yours", your_rolls)

results = results.with_column("Results", results.apply(one_round, "Mine", "Yours"))

results.group("Results")

results.group("Results").barh("Results")
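Another possible answer to the bonus question skips `apply` as well. Since a round's payoff depends only on whether the difference between the two rolls is positive, negative, or zero, `np.sign` of that difference gives the same 1, -1, or 0 that `one_round` returns. This is a sketch of one approach, not the notebook's intended solution; the name `vectorized_outcomes` is introduced here.

```python
import numpy as np

np.random.seed(0)  # fixed seed so reruns give the same rolls

my_rolls = np.random.choice(np.arange(1, 7), size=10000)
your_rolls = np.random.choice(np.arange(1, 7), size=10000)

# The sign of the difference is 1 if I win, -1 if you win, 0 on a tie
vectorized_outcomes = np.sign(my_rolls - your_rolls)

print(vectorized_outcomes[:10])
print(np.sum(vectorized_outcomes))
```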

### Another example: simulating heads in 100 coin tosses

If 100 people individually flipped their own fair coin at the same time (or one very bored person flipped a fair coin 100 times), would it be reasonable if 40 or fewer of them came up heads?



In [None]:
coin = make_array('heads', 'tails')

In [None]:
sum(np.random.choice(coin, 100) == 'heads')

In [None]:
# Simulate one outcome

def num_heads():
    return sum(np.random.choice(coin, 100) == 'heads')

In [None]:
# Decide how many times you want to repeat the experiment

repetitions = 10000

In [None]:
# Simulate that many outcomes

outcomes = make_array()

for i in np.arange(repetitions):
    outcomes = np.append(outcomes, num_heads())
 
heads = Table().with_column('Heads', outcomes)
heads.hist(bins = np.arange(29.5, 70.6), right_end = 40)

In [None]:
heads = Table().with_column('Heads', outcomes)
heads.hist(bins = np.arange(29.5, 70.6), right_end = 40)

The yellow section: how many outcomes is that?

In [None]:
sum(heads.column(0) <= 40)

In [None]:
sum(outcomes <= 40)

Then what proportion is that?

In [None]:
# Proportion of the simulated outcomes with 40 or fewer heads
sum(outcomes <= 40) / repetitions
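The simulated proportion can be checked against the exact probability. For 100 tosses of a fair coin, each of the 2^100 sequences is equally likely, and `math.comb(100, k)` counts the sequences with exactly `k` heads. The sketch below uses only the standard library; the name `p_at_most_40` is introduced here.

```python
from math import comb

n = 100  # number of tosses

# Add up the number of sequences with 0, 1, ..., 40 heads,
# then divide by the total number of sequences
p_at_most_40 = sum(comb(n, k) for k in range(41)) / 2**n

print(round(p_at_most_40, 4))
```

The result is a little under 3%, so a simulated proportion in that neighborhood is what we should expect.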

What interval captures the middle 95% of these outcomes?

In [None]:
np.percentile(outcomes, make_array(2.5, 97.5))
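For comparison, the same interval can be worked out from the exact distribution of the number of heads rather than from simulated outcomes. The sketch below finds the smallest head count whose cumulative probability reaches 2.5% and 97.5%; the helper `quantile` is hypothetical, and `np.percentile` may interpolate slightly differently on simulated data.

```python
from math import comb

n = 100  # number of tosses

# Exact probability of each possible number of heads, 0 through 100
pmf = [comb(n, k) / 2**n for k in range(n + 1)]

def quantile(q):
    # Smallest head count whose cumulative probability reaches q
    cumulative = 0.0
    for k, prob in enumerate(pmf):
        cumulative += prob
        if cumulative >= q:
            return k
    return n

low, high = quantile(0.025), quantile(0.975)
print(low, high)
```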

## Famous Monty Hall Problem ##

On the game show *Let's Make a Deal*, one of the more popular games was a simple guessing game involving three doors. One door hid a desirable prize (an expensive vacation, a new car, or something of similar value). Each of the other two doors hid a fake prize, often a goat. The game was played as follows:

1. The player picks a door.
2. Monty Hall (the show's host) has *a different* door opened, revealing one of the two goats.
3. Monty offers the player the opportunity to switch to the *other* unopened door.

The mathematical/probability/statistical question is this: should the player switch doors?

To put it another way, which strategy gives the player the higher chance of winning: picking a door and sticking with it, or picking a door and automatically switching once another door has been opened?

**Strategy 1:** The pick & stick (pick a door and don't switch when given the chance)

**Strategy 2:** The pick & switch (pick a door, but automatically switch to the other when it's offered)

Let's use simulations to decide which strategy is better.


In [None]:
doors = make_array('car', 'first goat', 'second goat')

goats = make_array('first goat', 'second goat')

def other_goat(a_goat):
    # Given one goat, return the name of the other one
    if a_goat == 'first goat':
        return 'second goat'
    elif a_goat == 'second goat':
        return 'first goat'


In [None]:
def monty_hall():

    # The contestant picks one of the three doors at random
    contestant_choice = np.random.choice(doors)

    # If the contestant picked a goat, Monty must reveal the other goat
    if contestant_choice == 'first goat':
        monty_choice = 'second goat'
        remaining_door = 'car'

    elif contestant_choice == 'second goat':
        monty_choice = 'first goat'
        remaining_door = 'car'

    # If the contestant picked the car, Monty reveals either goat at random
    elif contestant_choice == 'car':
        monty_choice = np.random.choice(goats)
        remaining_door = other_goat(monty_choice)

    return [contestant_choice, monty_choice, remaining_door]

In [None]:
games = Table(['Strategy 1 Prize', 'Revealed', 'Strategy 2 Prize'])

reps = 10000

for i in range(reps):
    games.append(monty_hall())
 
games

In [None]:
sum(games.column('Strategy 1 Prize') == 'car') / reps

In [None]:
sum(games.column('Strategy 2 Prize') == 'car') / reps
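The two simulated proportions can be compared with an exact calculation. Sticking wins only when the first pick is the car, while switching wins whenever the first pick is a goat, because Monty then has to reveal the other goat and the remaining door hides the car. A minimal sketch of that reasoning, using only the standard library (the names `stick_wins` and `switch_wins` are introduced here):

```python
from fractions import Fraction

doors = ['car', 'first goat', 'second goat']

stick_wins = Fraction(0)
switch_wins = Fraction(0)

# Each door is the contestant's first pick with probability 1/3
for first_pick in doors:
    chance = Fraction(1, len(doors))
    if first_pick == 'car':
        stick_wins += chance    # sticking keeps the car
    else:
        switch_wins += chance   # switching moves from a goat to the car

print(stick_wins, switch_wins)  # 1/3 2/3
```

The simulated proportions for Strategy 1 and Strategy 2 should land close to these values.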