# Business Application: Expected Value

## Objectives 

* Extend our metrics into business applications
* Identify and understand a cost benefit matrix
* Use a confusion matrix with a cost benefit matrix to solve for expected value

## Class Notes 

### How do we use business goals as evaluation metrics?

So far we've discussed one metric, AUC, that helps us evaluate the trade-off between false positive rate and true positive rate. However, to Everyday Business Guy, AUC means nothing: he is more familiar with accuracy and would rather just have the model with the best accuracy. To make the model easier to understand and connect it to Everyday Business Guy's goals, we want to associate the confusion matrix with an expected value, using a cost benefit matrix.

### What is expected value?

Expected value weights the value of each outcome in the confusion matrix by the probability of that outcome. Expected value says "this is the expected outcome of the model, should we choose to use it for the business."

Mathematically, expected value's equation looks like this:

$EV = \sum_x p(o_x) \cdot v(o_x)$

where:

$EV$: Expected Value  
$p(o_x)$: [P]robability of [o]bservation [x] occurring  
$v(o_x)$: The [v]alue of [o]bservation [x] occurring
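
As a minimal sketch of the formula, the snippet below sums probability times value over a list of outcomes. The function name `expected_value` and the coin-bet numbers are hypothetical, chosen only for illustration.

```python
def expected_value(outcomes):
    """Sum p(o_x) * v(o_x) over (probability, value) pairs."""
    total_probability = sum(p for p, _ in outcomes)
    # The probabilities must cover every possible outcome exactly once.
    assert abs(total_probability - 1.0) < 1e-9, "probabilities must sum to 1"
    return sum(p * v for p, v in outcomes)

# Hypothetical bet: win 10 dollars on heads, lose 4 dollars on tails.
coin_bet = [(0.5, 10.0), (0.5, -4.0)]
print(expected_value(coin_bet))  # 3.0
```

A positive result means the bet pays off on average, even though any single flip can lose money.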


### How do we find these values?

The technique we'll use is a _cost benefit matrix_. This is very similar to our confusion matrix:

$\begin{bmatrix}TP & FP\\FN & TN\end{bmatrix}$

To find probabilities, we divide each count by the total number of observations, which turns it into this:

$\begin{bmatrix}p(TP) & p(FP)\\p(FN) & p(TN)\end{bmatrix}$

and our cost benefit matrix will be somewhat similar:

$\begin{bmatrix}b(TP) & c(FP)\\c(FN) & b(TN)\end{bmatrix}$

where:

$b$ represents benefit (the benefit of correctly predicting positives and negatives), while  
$c$ represents cost (the cost of misclassifying positives and negatives).

To simplify, we'll treat benefits as positive values and costs as negative values:

$\begin{bmatrix}v(TP) & -v(FP)\\-v(FN) & v(TN)\end{bmatrix}$


### Example: Ads for a startup

You run a startup service and are interested in acquiring customers. The estimated CLV (customer lifetime value) of a client is about \$75, while the estimated acquisition cost runs at about \$30. To maximize profit, you're interested in a model that accurately targets the correct clients for your business. You come to these conclusions on this ad:

Benefit of a True Positive: \$45. Our CLV is \$75, and acquiring the customer with this marketing technique costs \$30, so the net gain is \$75 - \$30.<br />
Benefit of a True Negative: \$0. We would never have acquired this customer (0), and we never spent money to acquire them (0).<br />
Cost of a False Positive: \$30. We spent \$30 on a client that would not have used the service.<br />
Cost of a False Negative: \$0. We did not spend money on this client, so technically our cost is 0, though we would have gained if our model had correctly targeted this client.<br />

This is represented in our cost benefit matrix:

$\begin{bmatrix}45 & -30\\0 & 0\end{bmatrix}$
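
The same matrix can be derived from the two business inputs rather than written by hand. The sketch below assumes the layout used in these notes (TP and FP in the first row, FN and TN in the second) and that untargeted clients cost nothing; `ad_cost_benefit` is a hypothetical helper name.

```python
def ad_cost_benefit(clv, acquisition_cost):
    """Return [[v(TP), v(FP)], [v(FN), v(TN)]] with costs as negative values."""
    true_positive = clv - acquisition_cost  # customer acquired: lifetime value minus ad spend
    false_positive = -acquisition_cost      # ad spend wasted on a non-customer
    false_negative = 0                      # no money spent (the missed gain is not counted)
    true_negative = 0                       # no money spent, no customer
    return [[true_positive, false_positive], [false_negative, true_negative]]

print(ad_cost_benefit(clv=75, acquisition_cost=30))  # [[45, -30], [0, 0]]
```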

We end up with a confusion matrix that predicts response to the targeted ad this way:

$\begin{bmatrix}45423 & 13041\\98724 & 12324\end{bmatrix}$

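and solve for probabilities by dividing each cell by the total number of observations, 169512 (here $p(h_x, a)$ is the probability that an observation lands in cell $x$ of the confusion matrix, out of all observations):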

$p(h_{TP}, a) = 45423 / 169512 = 0.26796333$

$p(h_{TN}, a) = 12324 / 169512 = 0.0727028175$

$p(h_{FP}, a) = 13041 / 169512 = 0.07693260654$

$p(h_{FN}, a) = 98724 / 169512 = 0.5824012459$

We'll multiply these probabilities against the cost benefit analysis:

$0.26796333 * 45 = 12.05834985$

$0.0727028175 * 0 = 0$

$0.07693260654 * -30 = -2.307978196$

$0.5824012459 * 0 = 0$

and sum the results to get the resulting **expected value**:

$12.05834985 + 0 + (-2.307978196) + 0 \approx 9.75$

which means that, on average, if we target the clients and they respond as predicted, we can expect a return of almost 10 dollars per client.

Let's write a function that accepts a confusion matrix as well as the cost benefit matrix:

```python
import numpy as np

def find_expected_value(confusion, cost_benefit):
    # Normalise counts to probabilities; a matrix that already sums to 1 comes back unchanged.
    probabilities = confusion.astype('float') / confusion.sum()
    # Weight each cell's value by its probability and sum over all four cells.
    return (probabilities * cost_benefit.astype('float')).sum()

cb = np.array([[45.0, -30.0], [0, 0]])

conf = np.array([[45423, 13041], [98724, 12324]])
print(find_expected_value(conf, cb))  # approximately 9.7503716551
```


At this point, the marketing team has drawn up a new ad targeting a different market. This ad is more expensive, but the target market is more specific, which means it's easier to predict whether a client will respond to the ad or not. We build a model whose predicted confusion matrix looks something like this:


```python
new_conf = np.array([[62153, 7501], [4735, 32041]])
```

and a cost benefit matrix:


```python
new_cb = np.array([[25.0, -50.0], [0, 0]])
```

How does our end result come out? Does this more expensive ad end up being worth the return?

```python
print(find_expected_value(new_conf, new_cb))  # approximately 11.0755895894
```


### Evaluating a new decision line

We can use cost benefit analysis to help us draw new decision lines as well. For example, we may want to target a client with an ad only when the likelihood of a response is high enough to make a profit (expected value > 0). We can use the following math to determine that new boundary:

$p(x)~b(TP) - (1 - p(x))~c(FP) > 0$

which then simplifies to this:

$p(x)~b(TP) > (1 - p(x))~c(FP)$

In our first ad, we had a $TP$ benefit of 45 dollars and a $FP$ cost of 30 dollars:

$p(x) * 45 > (1 - p(x)) * 30$

which then simplifies to (collecting terms gives $75~p(x) > 30$, then solving for $p(x)$):

$p(x) > \dfrac{30}{75} = .4$

so we should _only_ target customers with an expected probability greater than .40.
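
Solving the inequality for a general benefit and cost gives $p(x) > \dfrac{c(FP)}{b(TP) + c(FP)}$. The sketch below wraps that in a small function, assuming (as in the ad example) that true negatives and false negatives carry zero value; `break_even_probability` and `should_target` are hypothetical names.

```python
def break_even_probability(benefit_tp, cost_fp):
    """Response probability at which targeting a client breaks even.

    Solves p * benefit_tp - (1 - p) * cost_fp = 0 for p, with both
    arguments given as positive dollar amounts.
    """
    return cost_fp / (benefit_tp + cost_fp)

def should_target(probability, threshold):
    # Target only clients whose predicted response probability beats break-even.
    return probability > threshold

threshold = break_even_probability(benefit_tp=45.0, cost_fp=30.0)
print(threshold)                       # 0.4
print(should_target(0.55, threshold))  # True
print(should_target(0.25, threshold))  # False
```

Raising the cost of a false positive, or lowering the benefit of a true positive, pushes the threshold up, so fewer clients qualify.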

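How does our second ad fare? It has a $TP$ benefit of 25 dollars and a $FP$ cost of 50 dollars, so:

$p(x) > \dfrac{50}{75} \approx .67$

With this second ad the break-even probability is higher, so we should target a smaller, more confident audience in order to make a profit.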

#### A side note on sklearn confusion matrices:

scikit-learn's confusion matrices follow the convention shown on [wikipedia](http://en.wikipedia.org/wiki/Confusion_matrix), with actual classes in rows and predicted classes in columns, and with labels in sorted order (so the negative class comes first). This is the opposite of what we expect given the previous diagrams. An easy fix is to swap the arguments passed to the confusion matrix function and set the label order explicitly:

```python
from sklearn import metrics
metrics.confusion_matrix(y_predicted, y_actual, labels=[1, 0])
```

And this will create your matrix as expected:

$\begin{bmatrix}TP & FP\\FN & TN\end{bmatrix}$
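
To make the layout concrete, the sketch below tallies the four cells by hand in plain Python, with predictions in rows and actual labels in columns. The helper `layout_confusion` and the toy labels are hypothetical; this is not scikit-learn's implementation, only an illustration of the arrangement the swapped-argument call is meant to produce.

```python
def layout_confusion(y_predicted, y_actual):
    """Return [[TP, FP], [FN, TN]] for binary labels where 1 is the positive class."""
    tp = fp = fn = tn = 0
    for predicted, actual in zip(y_predicted, y_actual):
        if predicted == 1 and actual == 1:
            tp += 1  # predicted positive, actually positive
        elif predicted == 1 and actual == 0:
            fp += 1  # predicted positive, actually negative
        elif predicted == 0 and actual == 1:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return [[tp, fp], [fn, tn]]

# Toy labels, made up for illustration only.
y_actual    = [1, 1, 1, 0, 0, 0, 0, 1]
y_predicted = [1, 0, 1, 0, 1, 0, 0, 1]
print(layout_confusion(y_predicted, y_actual))  # [[3, 1], [1, 3]]
```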

### Applications to car auction / lemons

In the lemons data set there was a column called "VehBCost," which represents the acquisition price of the vehicle. Imagine you run a car acquisition business that attempts to purchase used vehicles and flip them for 10% more given marginal repairs (let's assume the repairs cost 0). We'll use the benchmark results for probabilities in the confusion matrix, and then assume the following cost benefits for each result:

A true positive (TP) was a lemon, and our model told us ahead of time that it would be a bad purchase, so we did not buy it. Benefit is 0.

A true negative (TN) was not a lemon, so purchasing it earns, on average, 10% of the VehBCost.

A false positive (FP) was not actually a lemon, but our model suggested we shouldn't purchase it. Cost is 0.

A false negative (FN) was a lemon, but our model suggested we purchase it. Our cost is the entire VehBCost.

Our confusion matrix from the benchmark looks like so:

```python
# computed AUC comes out to be 0.64
lemons_confusion = np.array([[292, 66], [5997, 44733]])
```

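The average VehBCost is \$6728.22. Using that average, we can assume the following cost benefit values (a false negative loses the full \$6728.22, and a true negative earns 10% of it, \$672.82):

```python
lemons_cb = np.array([[0, 0], [-6728.22, 672.82]])
```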
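
The matrix can also be derived from the purchase price rather than typed in by hand. This is a minimal sketch, assuming the average VehBCost of \$6728.22 quoted above, a 10% resale markup, and zero repair cost; `lemons_cost_benefit` is a hypothetical helper, and "positive" here means the model flagged the car as a lemon.

```python
def lemons_cost_benefit(avg_cost, markup=0.10):
    """Return [[v(TP), v(FP)], [v(FN), v(TN)]] where 'positive' means lemon."""
    true_positive = 0.0                          # lemon avoided: nothing gained or lost
    false_positive = 0.0                         # good car skipped: no money spent
    false_negative = -avg_cost                   # lemon bought: lose the whole purchase price
    true_negative = round(markup * avg_cost, 2)  # good car bought and resold at a markup
    return [[true_positive, false_positive], [false_negative, true_negative]]

print(lemons_cost_benefit(6728.22))
```

Keeping the markup as a parameter makes it easy to see how sensitive the expected value is to the resale margin.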

What do we end up getting as an expected value?

What should our new decision line be?

## Your Turn

Continue working on personal projects or the lemons project with the remainder of class today.

## Additional Resources

* [Wikipedia](http://en.wikipedia.org/wiki/Expected_value) is always a good start.
* [Khan Academy](https://www.khanacademy.org/math/probability/random-variables-topic/expected-value/v/term-life-insurance-and-death-probability) has a great set of videos that dive deep into expected value.
