{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"execution": {},
"id": "view-in-github"
},
"source": [
" "
]
},
{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"# Bonus Tutorial: Fitting to data\n",
"\n",
"**Week 3, Day 1: Bayesian Decisions**\n",
"\n",
"**By Neuromatch Academy**\n",
"\n",
"**Content creators:** Vincent Valton, Konrad Kording\n",
"\n",
"**Content reviewers:** Matt Krause, Jesse Livezey, Karolina Stosio, Saeed Salehi, Michael Waskom\n",
"\n",
"**Production editors:** Gagana B, Spiros Chavlis\n",
"\n",
"
\n",
"\n",
"**Note:** This is bonus material, included from NMA 2020. It has not been substantially revised. This means that the notation and standards are slightly different and some of the references to other days in NMA are outdated. We include it here because it covers fitting Bayesian models to data, which may be of interest to many students."
]
},
{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"---\n",
"# Tutorial objectives\n",
"\n",
"In the first two tutorials, we learned about Bayesian models and decisions more intuitively, using demos. In this notebook, we will dive into using math and code to fit Bayesian models to data.\n",
"\n",
"We'll have a look at computing all the necessary steps to perform model inversion (estimate the model parameters such as $p_{common}$ that generated data similar to that of a participant). We will describe all the steps of the generative model first, and in the last exercise we will use all these steps to estimate the parameter $p_{common}$ of a single participant using simulated data.\n",
"\n",
"The generative model will be a Bayesian model we saw in Tutorial 2: a mixture of Gaussian prior and a Gaussian likelihood.\n",
"\n",
"Steps:\n",
"\n",
"* First, we'll create the prior, likelihood, posterior, etc., in a form that will make it easier for us to visualize what is being computed and estimated at each step of the generative model:\n",
" 1. Creating a mixture of Gaussian prior for multiple possible stimulus inputs\n",
" 2. Generating the likelihood for multiple possible stimulus inputs\n",
" 3. Estimating our posterior as a function of the stimulus input\n",
" 4. Estimating a participant response given the posterior\n",
"\n",
"* Next, we'll perform the model inversion/fitting:\n",
" 5. Create a distribution for the input as a function of possible inputs\n",
" 6. Marginalization\n",
" 7. Generate some data using the generative model provided\n",
" 8. Perform model inversion (model fitting) using the generated data and see if you recover the original parameters."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"execution": {}
},
"outputs": [],
"source": [
"# @title Tutorial slides\n",
"# @markdown These are the slides for the videos in this tutorial\n",
"from IPython.display import IFrame\n",
"link_id = \"sqnd5\"\n",
"print(f\"If you want to download the slides: https://osf.io/download/{link_id}/\")\n",
"IFrame(src=f\"https://mfr.ca-1.osf.io/render?url=https://osf.io/{link_id}/?direct%26mode=render%26action=download%26mode=render\", width=854, height=480)"
]
},
{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"---\n",
"# Setup"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"execution": {}
},
"outputs": [],
"source": [
"# @title Install and import feedback gadget\n",
"\n",
"!pip3 install vibecheck datatops --quiet\n",
"\n",
"from vibecheck import DatatopsContentReviewContainer\n",
"def content_review(notebook_section: str):\n",
" return DatatopsContentReviewContainer(\n",
" \"\", # No text prompt\n",
" notebook_section,\n",
" {\n",
" \"url\": \"https://pmyvdlilci.execute-api.us-east-1.amazonaws.com/klab\",\n",
" \"name\": \"neuromatch_cn\",\n",
" \"user_key\": \"y1x3mpx5\",\n",
" },\n",
" ).render()\n",
"\n",
"\n",
"feedback_prefix = \"W3D1_T3_Bonus\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "both",
"execution": {}
},
"outputs": [],
"source": [
"# Imports\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"import matplotlib as mpl\n",
"from scipy.optimize import minimize"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"execution": {}
},
"outputs": [],
"source": [
"# @title Figure Settings\n",
"import logging\n",
"logging.getLogger('matplotlib.font_manager').disabled = True\n",
"\n",
"import ipywidgets as widgets\n",
"%matplotlib inline\n",
"%config InlineBackend.figure_format = 'retina'\n",
"plt.style.use(\"https://raw.githubusercontent.com/NeuromatchAcademy/course-content/NMA2020/nma.mplstyle\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"execution": {}
},
"outputs": [],
"source": [
"# @title Plotting functions\n",
"\n",
"def plot_myarray(array, xlabel, ylabel, title):\n",
" \"\"\" Plot an array with labels.\n",
"\n",
" Args :\n",
" array (numpy array of floats)\n",
" xlabel (string) - label of x-axis\n",
" ylabel (string) - label of y-axis\n",
" title (string) - title of plot\n",
"\n",
" Returns:\n",
" None\n",
" \"\"\"\n",
" fig = plt.figure()\n",
" ax = fig.add_subplot(111)\n",
" colormap = ax.imshow(array, extent=[-10, 10, 8, -8])\n",
" cbar = plt.colorbar(colormap, ax=ax)\n",
" cbar.set_label('probability')\n",
" ax.invert_yaxis()\n",
" ax.set_xlabel(xlabel)\n",
" ax.set_title(title)\n",
" ax.set_ylabel(ylabel)\n",
" ax.set_aspect('auto')\n",
" return None\n",
"\n",
"\n",
"def plot_my_bayes_model(model) -> None:\n",
" \"\"\"Pretty-print a simple Bayes Model (ex 7), defined as a function:\n",
"\n",
" Args:\n",
" - model: function that takes a single parameter value and returns\n",
" the negative log-likelihood of the model, given that parameter\n",
" Returns:\n",
" None, draws plot\n",
" \"\"\"\n",
" x = np.arange(-10,10,0.07)\n",
"\n",
" # Plot neg-LogLikelihood for different values of alpha\n",
" alpha_tries = np.arange(0.01, 0.3, 0.01)\n",
" nll = np.zeros_like(alpha_tries)\n",
" for i_try in np.arange(alpha_tries.shape[0]):\n",
" nll[i_try] = model(np.array([alpha_tries[i_try]]))\n",
"\n",
" plt.figure()\n",
" plt.plot(alpha_tries, nll)\n",
" plt.xlabel('p_independent value')\n",
" plt.ylabel('negative log-likelihood')\n",
"\n",
" # Mark minima\n",
" ix = np.argmin(nll)\n",
" plt.scatter(alpha_tries[ix], nll[ix], c='r', s=144)\n",
"\n",
" #plt.axvline(alpha_tries[np.argmin(nll)])\n",
" plt.title('Sample Output')\n",
" plt.show()\n",
"\n",
" return None\n",
"\n",
"\n",
"def plot_simulated_behavior(true_stim, behaviour):\n",
" fig = plt.figure(figsize=(7, 7))\n",
" ax = fig.add_subplot(1,1,1)\n",
" ax.set_facecolor('xkcd:light grey')\n",
" plt.plot(true_stim, true_stim - behaviour, '-k', linewidth=2, label='data')\n",
" plt.axvline(0, ls='dashed', color='grey')\n",
" plt.axhline(0, ls='dashed', color='grey')\n",
" plt.legend()\n",
" plt.xlabel('Position of true visual stimulus (cm)')\n",
" plt.ylabel('Participant deviation from true stimulus (cm)')\n",
" plt.title('Participant behavior')\n",
" plt.show()\n",
"\n",
" return None"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"execution": {}
},
"outputs": [],
"source": [
"# @title Helper Functions\n",
"\n",
"def my_gaussian(x_points, mu, sigma):\n",
" \"\"\"\n",
" Returns a Gaussian estimated at points `x_points`, with parameters: `mu` and `sigma`\n",
"\n",
" Args :\n",
" x_points (numpy arrays of floats)- points at which the gaussian is evaluated\n",
" mu (scalar) - mean of the Gaussian\n",
" sigma (scalar) - std of the gaussian\n",
"\n",
" Returns:\n",
" Gaussian evaluated at `x`\n",
" \"\"\"\n",
" p = np.exp(-(x_points-mu)**2/(2*sigma**2))\n",
" return p / sum(p)\n",
"\n",
"\n",
"def moments_myfunc(x_points, function):\n",
" \"\"\"\n",
" DO NOT EDIT THIS FUNCTION !!!\n",
"\n",
" Returns the mean, median and mode of an arbitrary function\n",
"\n",
" Args :\n",
" x_points (numpy array of floats) - x-axis values\n",
" function (numpy array of floats) - y-axis values of the function evaluated at `x_points`\n",
"\n",
" Returns:\n",
" (tuple of 3 scalars): mean, median, mode\n",
" \"\"\"\n",
"\n",
" # Calc mode of arbitrary function\n",
" mode = x_points[np.argmax(function)]\n",
"\n",
" # Calc mean of arbitrary function\n",
" mean = np.sum(x_points * function)\n",
"\n",
" # Calc median of arbitrary function\n",
" cdf_function = np.zeros_like(x_points)\n",
" accumulator = 0\n",
" for i in np.arange(x_points.shape[0]):\n",
" accumulator = accumulator + function[i]\n",
" cdf_function[i] = accumulator\n",
" idx = np.argmin(np.abs(cdf_function - 0.5))\n",
" median = x_points[idx]\n",
"\n",
" return mean, median, mode"
]
},
{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"---\n",
"# Introduction\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"execution": {}
},
"outputs": [],
"source": [
"# @title Video 1: Intro\n",
"from ipywidgets import widgets\n",
"from IPython.display import YouTubeVideo\n",
"from IPython.display import IFrame\n",
"from IPython.display import display\n",
"\n",
"\n",
"class PlayVideo(IFrame):\n",
" def __init__(self, id, source, page=1, width=400, height=300, **kwargs):\n",
" self.id = id\n",
" if source == 'Bilibili':\n",
" src = f'https://player.bilibili.com/player.html?bvid={id}&page={page}'\n",
" elif source == 'Osf':\n",
" src = f'https://mfr.ca-1.osf.io/render?url=https://osf.io/download/{id}/?direct%26mode=render'\n",
" super(PlayVideo, self).__init__(src, width, height, **kwargs)\n",
"\n",
"\n",
"def display_videos(video_ids, W=400, H=300, fs=1):\n",
" tab_contents = []\n",
" for i, video_id in enumerate(video_ids):\n",
" out = widgets.Output()\n",
" with out:\n",
" if video_ids[i][0] == 'Youtube':\n",
" video = YouTubeVideo(id=video_ids[i][1], width=W,\n",
" height=H, fs=fs, rel=0)\n",
" print(f'Video available at https://youtube.com/watch?v={video.id}')\n",
" else:\n",
" video = PlayVideo(id=video_ids[i][1], source=video_ids[i][0], width=W,\n",
" height=H, fs=fs, autoplay=False)\n",
" if video_ids[i][0] == 'Bilibili':\n",
" print(f'Video available at https://www.bilibili.com/video/{video.id}')\n",
" elif video_ids[i][0] == 'Osf':\n",
" print(f'Video available at https://osf.io/{video.id}')\n",
" display(video)\n",
" tab_contents.append(out)\n",
" return tab_contents\n",
"\n",
"\n",
"video_ids = [('Youtube', 'YSKDhnbjKmA'), ('Bilibili', 'BV13g4y1i7je')]\n",
"tab_contents = display_videos(video_ids, W=854, H=480)\n",
"tabs = widgets.Tab()\n",
"tabs.children = tab_contents\n",
"for i in range(len(tab_contents)):\n",
" tabs.set_title(i, video_ids[i][0])\n",
"display(tabs)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"execution": {}
},
"outputs": [],
"source": [
"# @title Submit your feedback\n",
"content_review(f\"{feedback_prefix}_Intro_Video\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"Here is a graphical representation of the generative model:\n",
"\n",
"