{ "cells": [ { "cell_type": "markdown", "metadata": { "tags": [ "s1", "content", "l1" ] }, "source": [ "# Probability Theory\n", "\n", "## Conditional Probability\n", "\n", "For her vacation in Hawaii, Elizabeth has packed 5 pants, 4 skirts and 8 shirts. On her first day venturing out to explore the islands, Elizabeth is running late. In a rush to get dressed, she opens her briefcase and picks out two pieces of clothing at random. Given that her first pick is a skirt, what is the probability that she is in luck and picks a shirt to go with it?\n", "\n", "This is a typical example of conditional probability, which is the probability of a second event happening given that a first event has already occurred. This particular case of conditional probability deals with dependent events. Elizabeth has to pull out a second piece of clothing that matches her first pick, so whether she ends up with a matching pair depends on both her first and her second pick.\n", "\n", "Given that the first pick is a skirt, the second pick is a shirt with probability 8/16. As most of you might have already guessed, the probability of the whole sequence (a skirt first, then a shirt) is: $$P(MatchingPair) = \\frac{4}{17} \\times \\frac{8}{16}$$
\n", "(Hint): She first picks a skirt with probability 4/17, which leaves her with 3 remaining skirts, 5 pants and 8 shirts. Out of the 16 remaining clothes, her only favourable picks are the 8 shirts, which gives 8/16. The final answer combines the first and second picks: 4/17 × 8/16.\n", "\n", "## Definition:\n", "\n", "In probability theory, conditional probability is the probability of an event given that another event has occurred. It is denoted P(B|A), the probability of event B given that event A has already occurred. To put this into perspective, let us assume that the probability of rain on any random day is 10%. If we know that there is a storm on that day, then the probability of rain will be much higher.\n", "\n", "The concept of conditional probability is one of the most fundamental concepts in probability theory. While it may seem trivial, conditional probabilities can be quite confusing and require careful attention.
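The counting argument from the clothing example can be checked by brute force. This is a minimal sketch, assuming every ordered pair of distinct items is equally likely; the names `wardrobe`, `pairs` and `probability` are illustrative and not part of the original notebook. It applies the definition of conditional probability by counting outcomes instead of multiplying fractions:

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical labelling of the 17 items: 5 pants, 4 skirts, 8 shirts.
wardrobe = ['pants'] * 5 + ['skirt'] * 4 + ['shirt'] * 8

# Every ordered pair of distinct items (17 * 16 = 272 equally likely outcomes).
pairs = list(permutations(range(len(wardrobe)), 2))

def probability(event):
    '''Fraction of equally likely ordered pairs for which event(first, second) holds.'''
    favourable = sum(1 for i, j in pairs if event(wardrobe[i], wardrobe[j]))
    return Fraction(favourable, len(pairs))

p_skirt_first = probability(lambda first, second: first == 'skirt')
p_skirt_then_shirt = probability(lambda first, second: first == 'skirt' and second == 'shirt')

# Conditional probability from the definition: P(shirt second | skirt first).
p_shirt_given_skirt = p_skirt_then_shirt / p_skirt_first

print(p_skirt_first)        # 4/17
print(p_skirt_then_shirt)   # 2/17
print(p_shirt_given_skirt)  # 1/2
```

The joint probability 2/17 equals 4/17 × 8/16, and dividing it by the probability of a skirt on the first pick (4/17) recovers the conditional probability 8/16 = 1/2.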
\n", "Note: If events A and B are independent, then P(B|A) is simply P(B).\n", "\n", "### Kolmogorov definition:\n", "\n", "Given two events A and B from the sigma-field of a probability space with P(B) > 0, the conditional probability of A given B is defined as the quotient of the probability that both A and B occur and the probability of B:\n", "\n", "$$P(A|B) = \\frac{P(A∩B)}{P(B)}$$\n", "\n", "\n", "## Independent Events:\n", "Now let us prove the note mentioned earlier for independent events using this equation (with the roles of A and B swapped, which changes nothing by symmetry):\n", "\n", "We know that, $$P(A|B) = \\frac{P(A∩B)}{P(B)}$$\n", "\n", "For independent events, we have P(A∩B) = P(A) × P(B). Substituting this in the equation above, we get:\n", "\n", "$$P(A|B) = \\frac{P(A) \\times P(B)}{P(B)}$$
\n", "\n", "$$P(A|B) = P(A)$$\n", "\n", "## Mutually Exclusive Events:\n", "\n", "Two events are said to be mutually exclusive when they cannot occur at the same time: either one or the other can occur, but not both. Their joint probability is therefore zero: $$P(A∩B) = 0$$\n", "and the probability that either occurs is the sum of the individual probabilities: $$P(A∪B)=P(A)+P(B)$$\n", "\n", "Therefore the conditional probability P(A|B) for mutually exclusive events is 0 (provided P(B) > 0).\n", "\n", "### Exercise\n", "\n", "You toss a fair coin three times. Write a program to compute the following:\n", "\n", "1. What is the probability of three heads, HHH? [use P1 as variable for final answer]\n", "2. What is the probability that you observe exactly one head? [use P2 as variable for final answer]\n", "3. Given that you have observed at least one head, what is the probability that you observe at least two heads? [use P3 as variable for final answer]" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true, "tags": [ "s1", "ce", "l1" ] }, "outputs": [], "source": [ "#" ] }, { "cell_type": "markdown", "metadata": { "tags": [ "s1", "l1", "hint" ] }, "source": [] }, { "cell_type": "code", "execution_count": 4, "metadata": { "tags": [ "s1", "l1", "ans" ] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.125 0.375 0.5714285714285714\n" ] } ], "source": [ "# Let us first analyse the problem. 
We assume that the coin tosses are independent\n", "\n", "# Part 1: Since the tosses are independent and each lands heads with probability 1/2, multiply the three probabilities\n", "P1 = 0.5*0.5*0.5\n", "\n", "# Part 2: Let us create the sample space\n", "# S = [HHH,HHT,HTH,THH,HTT,TTH,THT,TTT]; the favourable outcomes are [HTT, THT, TTH]\n", "P2 = 3/8\n", "\n", "# Part 3: Let s1 be the event that you observe at least one head, and s2 the event that you observe at least two heads\n", "# We have to find P(s2|s1), where P(s1) = 1 - P{TTT} = 7/8 and P(s2) = 4/8\n", "# Since s2 implies s1, P(s2 ∩ s1) = P(s2), so P(s2|s1) = P(s2 ∩ s1)/P(s1) = (4/8)/(7/8)\n", "P3 = (4/8)*(8/7)\n", "\n", "# Print the answers:\n", "print(P1,P2,P3)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "tags": [ "s1", "hid", "l1" ] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "True\n" ] } ], "source": [ "ref_tmp_var = False\n", "\n", "try:\n", "    if (P1 == 0.125) and (P2 == 0.375) and (abs(P3 - 0.571) < 0.1):\n", "        ref_assert_var = True\n", "        ref_tmp_var = True\n", "    else:\n", "        ref_assert_var = False\n", "        print('Please follow the instructions given and use the same variables provided in the instructions.')\n", "except Exception:\n", "    print('Please follow the instructions given and use the same variables provided in the instructions.')\n", "\n", "assert ref_tmp_var" ] }, { "cell_type": "markdown", "metadata": { "tags": [ "l2", "content", "s2" ] }, "source": [ "## Bayes' Theorem:\n", "\n", "Bayes' Theorem (or Bayes' Rule) is named after the famous English statistician Thomas Bayes. The theorem provides a way to revise existing predictions or hypotheses given new or additional evidence. It is a direct application of conditional probability.\n", "\n", "The formula is as follows:\n", "\n", "$$P(A|B) = \\frac{P(B|A)P(A)}{P(B)}$$\n", "\n", "### Derivation:\n", "\n", "Let us derive this equation using conditional probability. 
We know from conditional probabilities that:\n", "\n", "$$P(A|B) = \\frac{P(A∩B)}{P(B)}$$\n", "\n", "Rearranging the terms, we get:\n", "$$P(A∩B) = P(A|B).P(B)$$\n", "\n", "Similarly,\n", "\n", "$$P(B|A) = \\frac{P(A∩B)}{P(A)}$$\n", "\n", "$$P(A∩B) = P(B|A).P(A)$$\n", "\n", "Now equating the above two results, we get:\n", "$$P(A|B).P(B) = P(B|A).P(A)$$\n", "\n", "$$P(A|B) = \\frac{P(B|A).P(A)}{P(B)}$$\n", "\n", "Let us better understand the theorem through a simple example.\n", "\n", "Example: It is the flu season and you visit your personal physician for a regular checkup. The doctor selects you at random to test for bird flu. The virus is currently suspected to affect 1 in 10000 people across the globe. The test has a false positive rate of 2% (you test positive when you do not have the flu) and a false negative rate of 0%. Your test result shows you tested positive. What is the probability that you have bird flu?\n", "\n", "Solution:
\n", "Let us first break down the problem:
\n", "Let P(B) be the probability that you have bird flu
\n", "Let P(T) be the probability of testing positive
\n", "From the above problem statement, it is clear that we are required to find P(B|T), the probability that you have bird flu given that you tested positive.\n", "\n", "As per Bayes' Theorem,\n", "\n", "$$P(B|T) = \\frac{P(T|B).P(B)}{P(T)}$$\n", "\n", "From total probability, we have:\n", "$$P(T) = P(T|B).P(B) + P(T|NB).P(NB)$$\n", "\n", "where P(NB) is the probability of not having bird flu.\n", "\n", "We have,\n", "P(B) = 1/10000 = 0.0001
\n", "P(NB) = 1 - P(B) = 0.9999
\n", "P(T|B) = 1
[if you have bird flu you always test positive, since the false negative rate is 0]
\n", "P(T|NB) = 0.02 (false positive rate)\n", "\n", "Therefore, P(T) = P(T|B).P(B) + P(T|NB).P(NB)
\n", "P(T) = (1 X 0.0001) + (0.02 X 0.9999) = 0.020098 ≈ 0.02\n", "\n", "$$P(B|T) = \\frac{P(T|B).P(B)}{P(T)}$$\n", "
\n", "$$P(B|T) = \\frac{1 \\times 0.0001}{0.02} \\approx 0.005$$\n", "
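The arithmetic above can be reproduced with a short sketch. The helper name `bayes_posterior` and its parameter names are illustrative assumptions, not part of the original notebook; the inputs are the prevalence, false negative rate and false positive rate stated in the example:

```python
def bayes_posterior(prior, p_pos_given_true, p_pos_given_false):
    # Total probability of a positive test: P(T) = P(T|B).P(B) + P(T|NB).P(NB)
    p_pos = p_pos_given_true * prior + p_pos_given_false * (1 - prior)
    # Bayes theorem: P(B|T) = P(T|B).P(B) / P(T)
    return p_pos_given_true * prior / p_pos

p_bird_flu = 1 / 10000     # P(B): prevalence, 1 in 10000
p_pos_if_flu = 1.0         # P(T|B): the false negative rate is 0
p_pos_if_healthy = 0.02    # P(T|NB): false positive rate of 2%

posterior = bayes_posterior(p_bird_flu, p_pos_if_flu, p_pos_if_healthy)
print(round(posterior, 4))  # 0.005
```

Without rounding P(T) to 0.02, the posterior is about 0.004976, which still rounds to 0.5%.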
\n", "
Even though you tested positive, there is only a 0.5% chance you have bird flu.\n", "\n", "\n", "\n", "### Exercise\n", "\n", "The blue M&M was introduced in 1995. Before then, the color mix in a bag of plain M&Ms was (30% Brown, 20% Yellow, 20% Red, 10% Green, 10% Orange, 10% Tan). Afterward it was (24% Blue, 20% Green, 16% Orange, 14% Yellow, 13% Red, 13% Brown). \n", "A friend of mine has two bags of M&Ms, and he tells me that one is from 1994 and one from 1996. He won't tell me which is which, but he gives me one M&M from each bag. One is yellow and one is green. What is the probability that the yellow M&M came from the 1994 bag?" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "tags": [ "l2", "ce", "s2" ] }, "outputs": [], "source": [ "#" ] }, { "cell_type": "markdown", "metadata": { "tags": [ "l2", "s2", "hint" ] }, "source": [] }, { "cell_type": "code", "execution_count": 7, "metadata": { "tags": [ "l2", "s2", "ans" ] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.7407\n" ] } ], "source": [ "# Use the variable \"ans\" to denote the final probability\n", "# Hypothesis A: yellow came from the 1994 bag (20% yellow) and green from the 1996 bag (20% green)\n", "# Hypothesis B: yellow came from the 1996 bag (14% yellow) and green from the 1994 bag (10% green)\n", "# Both hypotheses have prior 0.5; the likelihood must account for both M&Ms\n", "ans = (0.5*0.2*0.2)/(0.5*0.2*0.2 + 0.5*0.14*0.1)\n", "print(round(ans, 4))" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "tags": [ "l2", "hid", "s2" ] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "True\n" ] } ], "source": [ "ref_tmp_var = False\n", "\n", "try:\n", "    if abs(ans - 0.7407) < 0.01:\n", "        ref_assert_var = True\n", "        ref_tmp_var = True\n", "    else:\n", "        ref_assert_var = False\n", "        print('Please follow the instructions given and use the same variables provided in the instructions.')\n", "except Exception:\n", "    print('Please follow the instructions given and use the same variables provided in the instructions.')\n", "\n", "assert ref_tmp_var" ] } ], "metadata": { "executed_sections": [], "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { 
"codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.1" } }, "nbformat": 4, "nbformat_minor": 2 }