{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Bayesian Survival Analysis\n", "\n", "Author: Austin Rochford\n", "\n", "[Survival analysis](https://en.wikipedia.org/wiki/Survival_analysis) studies the distribution of the time to an event. It has applications across medicine, biology, engineering, and social science. This tutorial shows how to fit and analyze a Bayesian survival model in Python using PyMC3.\n", "\n", "We illustrate these concepts by analyzing a [mastectomy data set](https://vincentarelbundock.github.io/Rdatasets/doc/HSAUR/mastectomy.html) from `R`'s [HSAUR](https://cran.r-project.org/web/packages/HSAUR/index.html) package." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "from matplotlib import pyplot as plt\n", "import numpy as np\n", "import pymc3 as pm\n", "from pymc3.distributions.timeseries import GaussianRandomWalk\n", "import seaborn as sns\n", "import pandas as pd\n", "from theano import tensor as T" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "# Load the mastectomy data set from PyMC3's example data\n", "df = pd.read_csv(pm.get_data('mastectomy.csv'))\n", "# Encode the event indicator and the metastasis flag as 0/1 integers\n", "df['event'] = df['event'].astype(np.int64)\n", "df['metastized'] = (df['metastized'] == 'yes').astype(np.int64)\n", "n_patients = df.shape[0]\n", "patients = np.arange(n_patients)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
|   | time | event | metastized |
|---|---|---|---|
| 0 | 23 | 1 | 0 |
| 1 | 47 | 1 | 0 |
| 2 | 69 | 1 | 0 |
| 3 | 70 | 0 | 0 |
| 4 | 100 | 0 | 0 |