# Joint Distributions

Consider two discrete random variables X and Y. The function given by
f(x, y) = P(X = x, Y = y) for each pair of values (x, y) within the
range of X and Y is called the joint probability distribution of X and Y.

The joint probability mass function of the discrete random variables X and Y can be written in terms of conditional probabilities:

${\begin{aligned}\mathrm {P} (X=x\ \mathrm {and} \ Y=y)=\mathrm {P} (Y=y\mid X=x)\cdot \mathrm {P} (X=x)=\mathrm {P} (X=x\mid Y=y)\cdot \mathrm {P} (Y=y)\end{aligned}}$


### Example

A coin is tossed twice. Let X denote the number of heads on the first toss and Y the total number of heads on the 2 tosses. 
Assume that the coin is biased and a head has a 60% chance of occurring:

* X = number of heads on the first toss (0 or 1)
* Y = total number of heads in the 2 tosses (0, 1 or 2)

Compute the joint probability table and assign the values to the variables p_12, p_11, p_01 and p_00, where p_xy stands for P(X = x, Y = y).

In [None]:
# Assign the joint probabilities below; p_xy stands for P(X = x, Y = y)
p_h = 0.6        # probability of a head
p_t = 1 - p_h    # probability of a tail
p_12 = 0
p_11 = 0
p_01 = 0
p_00 = 0

In [None]:
p_12 = p_h * p_h
p_11 = p_h * p_t
p_01 = p_t * p_h
p_00 = p_t * p_t

print("p_12 %.2f, p_11 %.2f, p_01 %.2f, p_00 %.2f" % (p_12, p_11, p_01, p_00))

In [None]:
ref_tmp_var = False

try:
    if (abs(p_12 - 0.36) < 1e-6) and (abs(p_11 - 0.24) < 1e-6) and (abs(p_01 - 0.24) < 1e-6) and (abs(p_00 - 0.16) < 1e-6):
        ref_assert_var = True
        ref_tmp_var = True
    else:
        ref_assert_var = False
        print('Please follow the instructions given and use the same variables provided in the instructions.')
except Exception:
    print('Please follow the instructions given and use the same variables provided in the instructions.')

assert ref_tmp_var

\begin{array}{ l | c | c | r }
     \hline
     \text{Outcome} & \text{1st toss} & \text{2nd toss} & \text{Joint probability} \\ 
     \hline
     HH & 0.6 & 0.6 & 0.36 \\ 
     \hline
     HT & 0.6 & 0.4 & 0.24 \\ 
     \hline
     TH & 0.4 & 0.6 & 0.24 \\ 
     \hline
     TT & 0.4 & 0.4 & 0.16 \\ 
   \hline
\end{array}



Expressed in terms of X and Y, the joint probability distribution is:

\begin{array}{ l | c | c | r }
     \hline
     \text{Outcome} & X & Y & \text{Joint probability} \\ 
     \hline
     HH & 1 & 2 & 0.36 \\ 
     \hline
     HT & 1 & 1 & 0.24 \\ 
     \hline
     TH & 0 & 1 & 0.24 \\ 
     \hline
     TT & 0 & 0 & 0.16 \\ 
   \hline
\end{array}

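We can now organize the above as a table with one row per value of Y and one column per value of X. Combinations that cannot occur, such as X = 1 with Y = 0, have probability 0: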

\begin{array}{ l | c | r }
     \hline
     Y:X-> & 0 & 1 \\ 
     \hline
     0 & 0.16 & 0 \\ 
     \hline
     1 & 0.24 & 0.24 \\ 
     \hline
     2 & 0 & 0.36 \\ 
     \hline
\end{array}
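
The table above can also be built programmatically. The following is a minimal sketch, assuming the two tosses are independent and storing the result in a nested dictionary of the form p_xy[x][y]; the names `p_xy`, `first` and `second` are illustrative choices rather than part of the exercise.

```python
from itertools import product

p_h = 0.6          # probability of a head, as given in the example
p_t = 1 - p_h      # probability of a tail

# p_xy[x][y] = P(X = x, Y = y); start every cell at zero
p_xy = {x: {y: 0.0 for y in (0, 1, 2)} for x in (0, 1)}

# Enumerate the four outcomes of two independent tosses (1 = head, 0 = tail)
for first, second in product((1, 0), repeat=2):
    prob = (p_h if first else p_t) * (p_h if second else p_t)
    x = first              # number of heads on the first toss
    y = first + second     # total number of heads over both tosses
    p_xy[x][y] += prob

for x in p_xy:
    for y in p_xy[x]:
        print(f"P(X={x}, Y={y}) = {p_xy[x][y]:.2f}")
```

Cells that no outcome maps to, such as p_xy[1][0], simply stay at zero, which matches the zeros in the table.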


## Marginal Distribution

Given two random variables X and Y whose joint distribution is known, the marginal distribution of X is the probability distribution of X alone, regardless of the value of Y.

For discrete random variables, the marginal distribution of X is obtained by summing the joint distribution over all values of Y: $f_X(x) = \sum_y f(x, y)$. Likewise, the marginal distribution of Y is obtained by summing over all values of X: $f_Y(y) = \sum_x f(x, y)$.

Let us consider the above joint distribution again:

\begin{array}{ l | c | r }
     \hline
     Y : X-> & 0 & 1 \\ 
     \hline
     0 & 0.16 & 0 \\ 
     \hline
     1 & 0.24 & 0.24 \\ 
     \hline
     2 & 0 & 0.36 \\ 
     \hline
\end{array}


### Example

* Compute the marginal distributions f(x) and f(y). Assign the resulting lists to the variables fX and fY.

In [None]:
#Exercise
fX = []
fY = []

Hint: summing each column of the joint table gives f(x), and summing each row gives f(y).

In [5]:
fX = [0.4, 0.6]
fY = [0.16, 0.48, 0.36]

print("fX: ", fX)
print("fY: ", fY)

fX:  [0.4, 0.6]
fY:  [0.16, 0.48, 0.36]


In [8]:
ref_tmp_var = False

try:
    if fX == [0.4, 0.6] and fY == [0.16, 0.48, 0.36]: 
        ref_assert_var = True
        ref_tmp_var = True
    else:
        ref_assert_var = False
        print('Please follow the instructions given and use the same variables provided in the instructions.')
except Exception:
    print('Please follow the instructions given and use the same variables provided in the instructions.')

assert ref_tmp_var



For the above joint distribution, the marginal distributions are given below:

Marginal Distribution of X:

\begin{array}{ l | c | r }
     \hline
     X-> & 0 & 1 \\ 
     \hline
     f(x) & 0.4 & 0.6 \\ 
     \hline
\end{array}


Marginal Distribution of Y:

\begin{array}{ l | c | c | r }
     \hline
     Y-> & 0 & 1  & 2 \\ 
     \hline
     f(y) & 0.16 & 0.48 & 0.36 \\ 
     \hline
\end{array}
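
The row and column sums can also be computed rather than typed in by hand. This is a minimal sketch in plain Python, assuming the joint table is stored as a nested dictionary `joint[x][y]` (a hypothetical layout chosen for illustration) with the values from the coin example.

```python
# Joint table from the coin example, stored as joint[x][y] = P(X = x, Y = y)
joint = {
    0: {0: 0.16, 1: 0.24, 2: 0.00},
    1: {0: 0.00, 1: 0.24, 2: 0.36},
}

# Marginal of X: for each x, add up the probabilities over all y
fX = [sum(joint[x].values()) for x in sorted(joint)]

# Marginal of Y: for each y, add up the probabilities over all x
y_values = sorted({y for row in joint.values() for y in row})
fY = [sum(joint[x][y] for x in joint) for y in y_values]

# Round only for display; floating-point sums can carry tiny errors
print("fX:", [round(p, 2) for p in fX])
print("fY:", [round(p, 2) for p in fY])
```

Because the sums are floating-point values, comparing them with a small tolerance is safer than testing for exact equality.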

http://www.sci.csueastbay.edu/~btrumbo/Stat3401/Hand3401/JointDistnsCor.pdf


### Corpus of words

Consider a corpus (collection) of 100 words of text. The words are tabulated below by frequency of occurrence, using the following notation:

* c(w) = count of the word w
* P(w) = probability of the word w
* X = word length
* Y = number of vowels


The word counts and probabilities, together with the values of X and Y for each word, are:

\begin{array}{ l | c | c | c | r }
     \hline
     word & c(w) & P(w) & X & Y  \\ 
     \hline
     the & 30 & 0.30  & 3 & 1 \\ 
     \hline
     to & 18 & 0.18  & 2 & 1 \\ 
     \hline
     will & 16 & 0.16  & 4 & 1 \\ 
     \hline
     of & 10 & 0.10  & 2 & 1 \\ 
     \hline
     hello & 7 & 0.07  & 5 & 2 \\ 
     \hline
     in & 6 & 0.06  & 2 & 1 \\ 
     \hline
     tools & 4 & 0.04  & 5 & 2 \\ 
     \hline
     pose & 3 & 0.03  & 4 & 2 \\ 
     \hline
     taste & 3 & 0.03  & 5 & 2 \\ 
     \hline
     PGM & 3 & 0.03  & 3 & 0 \\ 
     \hline
\end{array}

From the above table, it is evident that the word "the" occurs 30 times (count column) out of a total of 100 words. Hence the probability of the word "the" is 0.30 (30/100 = 0.30). The X column refers to the length of the word. In this case x=3. The Y column refers to the number of vowels. In this case y=1. Similarly for the word "to" the probability of occurrence is 0.18 (18/100 = 0.18). X and Y are 2 and 1 respectively.

To arrive at the joint probability distribution of X and Y, we must consider every combination of X and Y that is observed. For example, consider all the words with a length of 2 (X=2) and exactly 1 vowel (Y=1). There are three such words: "to", "of" and "in". The joint probability is the sum of their individual probabilities, 0.18, 0.10 and 0.06. Hence for X=2, Y=1 the joint probability is 0.18+0.10+0.06 = 0.34. Calculating the joint probabilities for all other combinations of X and Y in the same way gives the joint probability distribution table.

The joint probability distribution looks like this:

\begin{array}{ l | c | c | c | r }
     \hline
     Y/X-> & 2 & 3 & 4 & 5 \\ 
     \hline
     0 & 0 & 0.03 & 0 & 0 \\ 
     \hline
     1 & 0.34 & 0.30 & 0.16 & 0 \\ 
     \hline
     2 & 0 & 0 & 0.03 & 0.14 \\ 
     \hline
\end{array}
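
The grouping step described above can be sketched in Python as follows. The word counts are copied from the corpus table; the helper `num_vowels` and the dictionary names are hypothetical, and the sketch assumes that only a, e, i, o and u count as vowels.

```python
from collections import defaultdict

# Word counts copied from the corpus table
counts = {
    "the": 30, "to": 18, "will": 16, "of": 10, "hello": 7,
    "in": 6, "tools": 4, "pose": 3, "taste": 3, "PGM": 3,
}
total = sum(counts.values())  # 100 words in the corpus

def num_vowels(word):
    """Count the vowels a, e, i, o, u in a word, ignoring case."""
    return sum(ch in "aeiou" for ch in word.lower())

# Accumulate counts per (x, y) pair first, so that the only
# floating-point operation is the final division by the total
joint_counts = defaultdict(int)
for word, c in counts.items():
    joint_counts[(len(word), num_vowels(word))] += c

# joint[(x, y)] = P(X = x, Y = y)
joint = {xy: c / total for xy, c in joint_counts.items()}

for (x, y), p in sorted(joint.items()):
    print(f"P(X={x}, Y={y}) = {p:.2f}")
```

Only the combinations that actually occur appear in `joint`; every other cell of the table is implicitly zero.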

### Exercise

Find the marginal distribution of X and Y from the above joint probability distribution.

Assign them to the variables fX and fY respectively.



In [2]:
#Exercise
fX = []
fY = []

Hint: sum over each row to get the entries of fY, and over each column to get the entries of fX.

In [9]:
fX = [0.34, 0.33, 0.19, 0.14]
fY = [0.03, 0.80, 0.17]

print("fX: ", fX)
print("fY: ", fY)

fX:  [0.34, 0.33, 0.19, 0.14]
fY:  [0.03, 0.8, 0.17]


In [10]:
ref_tmp_var = False

try:
    if fX == [0.34, 0.33, 0.19, 0.14] and fY == [0.03, 0.8, 0.17]: 
        ref_assert_var = True
        ref_tmp_var = True
    else:
        ref_assert_var = False
        print('Please follow the instructions given and use the same variables provided in the instructions.')
except Exception:
    print('Please follow the instructions given and use the same variables provided in the instructions.')

assert ref_tmp_var



For the above joint distribution, the marginal distributions of X and Y are given below:

Marginal Distribution of X:

\begin{array}{ l | c | c | c | r }
     \hline
     X-> & 2 & 3 & 4 & 5 \\ 
     \hline
     f(X) & 0.34 & 0.33 & 0.19 & 0.14 \\ 
     \hline
\end{array}


Marginal Distribution of Y:

\begin{array}{ l | c | c | r }
     \hline
     Y-> & 0 & 1  & 2 \\ 
     \hline
     f(Y) & 0.03 & 0.80 & 0.17 \\ 
     \hline
\end{array}




## Fraud Modeling Example

Consider a simple model of fraudulent transactions. The data contain the variables Sex (S), Age (A), Fraud (F) and Jewelry (J), along with the joint probability P = P(S, A, F, J) of each combination:

| S   | A   | F   | J   |       P        |
|-----|-----|-----|-----|----------------|
| S_0 | A_0 | F_0 | J_0 |         0.0025 |
| S_0 | A_0 | F_0 | J_1 |         0.0100 |
| S_0 | A_0 | F_1 | J_0 |         0.1069 |
| ... | ... | ... | ... |          ...   |
| S_1 | A_2 | F_1 | J_1 |         0.0079 |


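The value F = No corresponds to F_1.

* Compute p(S, A, F, J | F=No) and assign it to p_SAFJ. To do so, keep only the rows with F = F_1 and divide their probabilities by their sum, so that the conditional probabilities add up to 1.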

In [11]:
import pandas as pd

fraud_data = pd.read_csv('https://raw.githubusercontent.com/colaberry/data/master/Fraud/fraud_data.csv')
fraud_data.head()

Unnamed: 0,S,A,F,J,P
0,S_0,A_0,F_0,J_0,0.0025
1,S_0,A_0,F_0,J_1,0.01
2,S_0,A_0,F_1,J_0,0.1069
3,S_0,A_0,F_1,J_1,0.0056
4,S_0,A_1,F_0,J_0,0.0008


Hint: use fraud_data['F'].str.contains('F_1') to select the rows where F is F_1.

In [13]:
# Keep only the rows with F = F_1; .copy() avoids writing to a slice of fraud_data
p_SAFJ = fraud_data[fraud_data['F'].str.contains('F_1')].copy()
# Renormalise so that the conditional probabilities sum to 1
p_SAFJ['P'] = p_SAFJ['P'] / p_SAFJ['P'].sum()
p_SAFJ


Unnamed: 0,S,A,F,J,P
2,S_0,A_0,F_1,J_0,0.118778
3,S_0,A_0,F_1,J_1,0.006222
6,S_0,A_1,F_1,J_0,0.19
7,S_0,A_1,F_1,J_1,0.01
10,S_0,A_2,F_1,J_0,0.166222
11,S_0,A_2,F_1,J_1,0.008778
14,S_1,A_0,F_1,J_0,0.118778
15,S_1,A_0,F_1,J_1,0.006222
18,S_1,A_1,F_1,J_0,0.19
19,S_1,A_1,F_1,J_1,0.01


In [14]:
ref_tmp_var = False

try:
    if abs(p_SAFJ['P'].loc[2] - 0.118778) < 1e-4:
        ref_assert_var = True
        ref_tmp_var = True
    else:
        ref_assert_var = False
        print('Please follow the instructions given and use the same variables provided in the instructions.')
except Exception:
    print('Please follow the instructions given and use the same variables provided in the instructions.')

assert ref_tmp_var
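
The conditioning step generalises to any column and value: p(rest | F = f) = p(rest, F = f) / P(F = f). The following is a minimal sketch of a reusable helper, assuming pandas is available. The function `condition_on` and the table `toy` are hypothetical, and the numbers in `toy` are made up for illustration; they are not taken from the fraud data.

```python
import pandas as pd

# Hypothetical toy joint table (NOT the fraud data); probabilities sum to 1
toy = pd.DataFrame({
    "F": ["F_0", "F_0", "F_1", "F_1"],
    "J": ["J_0", "J_1", "J_0", "J_1"],
    "P": [0.02, 0.08, 0.72, 0.18],
})

def condition_on(joint, column, value, prob_col="P"):
    """Return the rows where joint[column] == value, with prob_col renormalised to sum to 1."""
    rows = joint[joint[column] == value].copy()  # copy so the input table is left untouched
    rows[prob_col] = rows[prob_col] / rows[prob_col].sum()
    return rows

p_given_f1 = condition_on(toy, "F", "F_1")
print(p_given_f1)
```

Dividing by the sum of the selected rows is the same as dividing by the marginal probability P(F = F_1), since that marginal is exactly the sum of the joint probabilities over the rows where F is F_1.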

