{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Logistic Homework" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "from datascience import *\n", "import numpy as np\n", "import math\n", "\n", "%matplotlib inline\n", "import matplotlib.pyplot as plots\n", "plots.style.use('fivethirtyeight')\n", "from scipy import stats\n", "\n", "import pandas\n", "\n", "sepsis = Table.read_table('sepsis.csv')\n", "\n", "import statsmodels.formula.api as sfm\n", "\n", "# Use the standard-library warnings module; np.warnings is not available in newer NumPy releases\n", "import warnings\n", "warnings.filterwarnings('ignore', category=np.VisibleDeprecationWarning)\n", "\n", "# Convert a boolean (True/False) to 1/0\n", "def boolean_to_binary(x):\n", "    return int(x)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The data we call `sepsis` were downloaded from the [UCI Machine Learning Repository](https://archive.ics.uci.edu/ml/datasets/Sepsis+survival+minimal+clinical+records) and were originally published in [this PLOS ONE article](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0187990). The table records whether patients admitted to the hospital with sepsis survived. The variable names are long but self-explanatory." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
| age_years | sex_0male_1female | episode_number | hospital_outcome_1alive_0dead | \n", "
|---|---|---|---|
| 21 | 1 | 1 | 1 | \n", "
| 20 | 1 | 1 | 1 | \n", "
| 21 | 1 | 1 | 1 | \n", "
... (110201 rows omitted)
" ], "text/plain": [ "| Actual | 0 | 1 | \n", "
|---|---|---|
| 0 | 102 | 8003 | \n", "
| 1 | 435 | 101664 | \n", "