{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Project: Analyzing A/B test results to decide whether to launch a new homepage design\n", " \n", "--By Lu Tang" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this project, I will do data analysis to help our client decide whether to launch two new features on their website. \n", "\n", "Here's the customer funnel for typical new users on their site:\n", "\n", "**View home page > Explore courses > View course overview page > Enroll in course > Complete course**\n", "\n", "Our client loses users as they go down the stages of this funnel, with only a few making it to the end. To increase student engagement, our client is performing **A/B tests** to try out changes that will hopefully increase conversion rates from one stage to the next.\n", "\n", "A/B tests are used to test changes on a web page by running an experiment where a **control group** sees the old version, while the **experiment group** sees the new version. A **metric** is then chosen to measure the level of engagement from users in each group. These results are then used to judge whether one version is more effective than the other. A/B testing is very much like hypothesis testing with the following hypotheses:\n", ">- **Null Hypothesis**: The new version is no better, or even worse, than the old version\n", ">- **Alternative Hypothesis**: The new version is better than the old version\n", "\n", "If we fail to reject the null hypothesis, the results would suggest keeping the old version. If we reject the null hypothesis, the results would suggest launching the change. These tests can be used for a wide variety of changes, from large feature additions to small adjustments in color, to see which change improves the chosen metric the most." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Feature Changes: Change homepage design. 
\n", "Our client hopes that this new, more engaging design will increase the number of users who explore their courses, that is, move on to the second stage of the funnel.\n", "\n", "**Metric: Click-through rate (CTR)**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Click-through rate (CTR) is often defined as the number of clicks divided by the number of views. Since our client uses cookies, we can identify unique users and make sure we don't count the same user multiple times. For this experiment, we'll define our click-through rate as:\n", "\n", "CTR: # clicks by *unique* users / # views by *unique* users\n", "\n", "H0: CTR_{new} - CTR_{old} <= 0\n", "\n", "H1: CTR_{new} - CTR_{old} > 0 " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**The company has collected the data in homepage_actions.csv. We will use Python to analyze it.**" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# load libraries\n", "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "%matplotlib inline\n", "# set a seed so we get the same results every time we run the code\n", "np.random.seed(42)" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
| | timestamp | id | group | action |
|---|---|---|---|---|
| 0 | 2016-09-24 17:42:27.839496 | 804196 | experiment | view |
| 1 | 2016-09-24 19:19:03.542569 | 434745 | experiment | view |
| 2 | 2016-09-24 19:36:00.944135 | 507599 | experiment | view |
| 3 | 2016-09-24 19:59:02.646620 | 671993 | control | view |
| 4 | 2016-09-24 20:26:14.466886 | 536734 | experiment | view |