{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# 21 - Meta Learners\n", " \n", " \n", "Just to recap, we are now interested in finding treatment effect heterogeneity, that is, identifying how units respond differently to the treatment. In this framework, we want to estimate\n", " \n", "$\n", "\\tau(x) = E[Y_i(1) − Y_i(0)|X] = E[\\tau_i|X]\n", "$\n", " \n", "or, $E[\\delta Y_i(t)|X]$ in the continuous case. In other words, we want to know how sensitive the units are to the treatment. This is super useful in the case where we can't treat everyone and need to do some prioritization of the treatment, for example, when you want to give discounts but have a limited budget. \n", " \n", "Previously, we saw how we could transform the outcome variable $Y$ so that we can plug it into a predictive model and get a Conditional Average Treatment Effect (CATE) estimate. There, we had to pay a price in the form of increased variance. That's something we see a lot in Data Science. There isn't a single best method because each one has its downsides and upsides. For that reason, it is worth learning about many techniques so you can trade off one for another depending on the circumstances. In that spirit, this chapter will focus on giving you more tools to have at your disposal. \n", " \n", "![img](data/img/meta-learners/learned-new-move.png)\n", " \n", "Meta learners are a simple way to leverage off-the-shelf predictive machine learning methods in order to solve the same problem we've been looking at so far: estimating the CATE. Again, none of them is the single best one and each one has its weaknesses. \n", "I'll try to go over them, but keep in mind that this stuff is highly dependent on the context. Not only that, meta-learners deploy predictive ML models which can vary from linear regression and boosted decision trees to neural networks and Gaussian processes. 
The success of the meta learner will also be highly dependent on what machine learning method it uses as its components. Oftentimes you just have to try out many different things and see what works best. " ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "ExecuteTime": { "end_time": "2023-07-27T11:43:09.109336Z", "start_time": "2023-07-27T11:43:07.076775Z" }, "tags": [ "hide-input" ] }, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "from matplotlib import pyplot as plt\n", "import seaborn as sns\n", "from nb21 import cumulative_gain, elast" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here, we will use the same data we had before, regarding investment advertisement emails. Again, the goal here is to figure out who will respond better to the email. There is a little twist, though. This time, we will use non-random data to train the models and random data to validate them. Dealing with non-random data is a much harder task, because the meta learners will need to debias the data **AND** estimate the CATE." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "ExecuteTime": { "end_time": "2023-07-27T11:43:09.141766Z", "start_time": "2023-07-27T11:43:09.111209Z" } }, "outputs": [ { "data": { "text/html": [ "
|   | age  | income  | insurance | invested | em1 | em2 | em3 | converted |
|---|------|---------|-----------|----------|-----|-----|-----|-----------|
| 0 | 44.1 | 5483.80 | 6155.29   | 14294.81 | 0   | 0   | 1   | 0         |
| 1 | 39.8 | 2737.92 | 50069.40  | 7468.15  | 1   | 0   | 0   | 0         |
| 2 | 49.0 | 2712.51 | 5707.08   | 5095.65  | 0   | 0   | 1   | 1         |
| 3 | 39.7 | 2326.37 | 15657.97  | 6345.20  | 0   | 0   | 0   | 0         |
| 4 | 35.3 | 2787.26 | 27074.44  | 14114.86 | 1   | 1   | 0   | 0         |
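To make the "off-the-shelf ML for CATE estimation" idea concrete before diving into the specific learners, here is a minimal sketch of one common meta learner, the S-learner, on synthetic data. This is an illustrative toy example, not the chapter's email dataset: the column names (`age`, `income`, `em1`, `invested`) and the simulated effect are assumptions made up for the demo. The S-learner fits a single model with the treatment as just another feature, then estimates the CATE as the difference between predictions with the treatment switched on and off.

```python
# Hypothetical S-learner sketch on simulated data (not the chapter's dataset).
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(123)
n = 5000
age = rng.uniform(20, 60, n)
income = rng.lognormal(8, 0.5, n)
em1 = rng.binomial(1, 0.5, n)          # simulated (randomized) email treatment

tau = 0.05 * (age - 20)                # true effect grows with age (assumed)
invested = 0.001 * income + tau * em1 + rng.normal(0, 1, n)

df = pd.DataFrame({"age": age, "income": income,
                   "em1": em1, "invested": invested})

# S-learner: one model, treatment included as a regular feature
X = ["age", "income", "em1"]
model = GradientBoostingRegressor().fit(df[X], df["invested"])

# CATE estimate: prediction with treatment forced to 1 minus forced to 0
cate = (model.predict(df[X].assign(em1=1))
        - model.predict(df[X].assign(em1=0)))
```

The `assign` trick makes the counterfactual predictions explicit: every unit is scored twice, once as treated and once as untreated, and the per-unit difference is the CATE estimate. With this simulated data, older units should show a larger estimated effect than younger ones.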