{ "cells": [ { "cell_type": "markdown", "metadata": { "colab_type": "text", "execution": {}, "id": "view-in-github" }, "source": [ "\"Open   \"Open" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "# Tutorial 3: \"Why\" models\n", "**Week 1, Day 1: Model Types**\n", "\n", "**By Neuromatch Academy**\n", "\n", "__Content creators:__ Matt Laporte, Byron Galbraith, Konrad Kording\n", "\n", "__Content reviewers:__ Dalin Guo, Aishwarya Balwani, Madineh Sarvestani, Maryam Vaziri-Pashkam, Michael Waskom, Ella Batty\n", "\n", "__Post-production team:__ Gagana B, Spiros Chavlis\n", "\n", "We would like to acknowledge [Steinmetz _et al._ (2019)](https://www.nature.com/articles/s41586-019-1787-x) for sharing their data, a subset of which is used here.\n" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "___\n", "# Tutorial Objectives\n", "\n", "*Estimated timing of tutorial: 45 minutes*\n", "\n", "This is tutorial 3 of a 3-part series on different flavors of models used to understand neural data. In parts 1 and 2 we explored mechanisms that would produce the data. In this tutorial we will explore models and techniques that can potentially explain *why* the spiking data we have observed is produced the way it is.\n", "\n", "To understand why different spiking behaviors may be beneficial, we will learn about the concept of entropy. Specifically, we will:\n", "\n", "- Write code to compute formula for entropy, a measure of information\n", "- Compute the entropy of a number of toy distributions\n", "- Compute the entropy of spiking activity from the Steinmetz dataset" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "execution": {} }, "outputs": [], "source": [ "# @title Tutorial slides\n", "# @markdown These are the slides for the videos in all tutorials today\n", "from IPython.display import IFrame\n", "link_id = \"6dxwe\"\n", "print(f\"If you want to download the slides: https://osf.io/download/{link_id}/\")\n", "IFrame(src=f\"https://mfr.ca-1.osf.io/render?url=https://osf.io/{link_id}/?direct%26mode=render%26action=download%26mode=render\", width=854, height=480)" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "---\n", "# Setup" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "execution": {} }, "outputs": [], "source": [ "# @title Install and import feedback gadget\n", "\n", "!pip3 install vibecheck datatops --quiet\n", "\n", "from vibecheck import DatatopsContentReviewContainer\n", "def content_review(notebook_section: str):\n", " return DatatopsContentReviewContainer(\n", " \"\", # No text prompt\n", " notebook_section,\n", " {\n", " \"url\": \"https://pmyvdlilci.execute-api.us-east-1.amazonaws.com/klab\",\n", " \"name\": \"neuromatch_cn\",\n", " \"user_key\": \"y1x3mpx5\",\n", " },\n", " ).render()\n", "\n", "\n", "feedback_prefix = \"W1D1_T3\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "both", "execution": {} }, "outputs": [], "source": [ "# Imports\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "from scipy import stats" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "execution": {} }, "outputs": [], "source": [ "# @title Figure Settings\n", "import logging\n", "logging.getLogger('matplotlib.font_manager').disabled = True\n", "\n", "import ipywidgets as widgets # interactive display\n", "%matplotlib inline\n", "%config 
InlineBackend.figure_format = 'retina'\n", "plt.style.use(\"https://raw.githubusercontent.com/NeuromatchAcademy/course-content/main/nma.mplstyle\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "execution": {} }, "outputs": [], "source": [ "# @title Plotting Functions\n", "\n", "def plot_pmf(pmf,isi_range):\n", " \"\"\"Plot the probability mass function.\"\"\"\n", " ymax = max(0.2, 1.05 * np.max(pmf))\n", " pmf_ = np.insert(pmf, 0, pmf[0])\n", " plt.plot(bins, pmf_, drawstyle=\"steps\")\n", " plt.fill_between(bins, pmf_, step=\"pre\", alpha=0.4)\n", " plt.title(f\"Neuron {neuron_idx}\")\n", " plt.xlabel(\"Inter-spike interval (s)\")\n", " plt.ylabel(\"Probability mass\")\n", " plt.xlim(isi_range)\n", " plt.ylim([0, ymax])" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "execution": {} }, "outputs": [], "source": [ "#@title Download Data\n", "import io\n", "import requests\n", "r = requests.get('https://osf.io/sy5xt/download')\n", "if r.status_code != 200:\n", " print('Could not download data')\n", "else:\n", " steinmetz_spikes = np.load(io.BytesIO(r.content), allow_pickle=True)['spike_times']" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "---\n", "# \"Why\" models" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "execution": {} }, "outputs": [], "source": [ "# @title Video 1: “Why” models\n", "from ipywidgets import widgets\n", "from IPython.display import YouTubeVideo\n", "from IPython.display import IFrame\n", "from IPython.display import display\n", "\n", "\n", "class PlayVideo(IFrame):\n", " def __init__(self, id, source, page=1, width=400, height=300, **kwargs):\n", " self.id = id\n", " if source == 'Bilibili':\n", " src = f'https://player.bilibili.com/player.html?bvid={id}&page={page}'\n", " elif source == 'Osf':\n", " src = f'https://mfr.ca-1.osf.io/render?url=https://osf.io/download/{id}/?direct%26mode=render'\n", " super(PlayVideo, self).__init__(src, width, height, **kwargs)\n", "\n", "\n", "def display_videos(video_ids, W=400, H=300, fs=1):\n", " tab_contents = []\n", " for i, video_id in enumerate(video_ids):\n", " out = widgets.Output()\n", " with out:\n", " if video_ids[i][0] == 'Youtube':\n", " video = YouTubeVideo(id=video_ids[i][1], width=W,\n", " height=H, fs=fs, rel=0)\n", " print(f'Video available at https://youtube.com/watch?v={video.id}')\n", " else:\n", " video = PlayVideo(id=video_ids[i][1], source=video_ids[i][0], width=W,\n", " height=H, fs=fs, autoplay=False)\n", " if video_ids[i][0] == 'Bilibili':\n", " print(f'Video available at https://www.bilibili.com/video/{video.id}')\n", " elif video_ids[i][0] == 'Osf':\n", " print(f'Video available at https://osf.io/{video.id}')\n", " display(video)\n", " tab_contents.append(out)\n", " return tab_contents\n", "\n", "\n", "video_ids = [('Youtube', 'OOIDEr1e5Gg'), ('Bilibili', 'BV16t4y1Q7DR')]\n", "tab_contents = display_videos(video_ids, W=854, H=480)\n", "tabs = widgets.Tab()\n", "tabs.children = tab_contents\n", "for i in range(len(tab_contents)):\n", " tabs.set_title(i, video_ids[i][0])\n", "display(tabs)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "execution": {} }, "outputs": [], "source": [ "# @title Submit your feedback\n", "content_review(f\"{feedback_prefix}_Why_models_Video\")" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "---\n", "# Section 1: Optimization and Information\n", "\n", 
"*Remember that the notation section is located after the Summary for quick reference!*\n", "\n", "Neurons can only fire so often in a fixed period of time, as the act of emitting a spike consumes energy that is depleted and must eventually be replenished. To communicate effectively for downstream computation, the neuron would need to make good use of its limited spiking capability. This becomes an optimization problem:\n", "\n", "What is the optimal way for a neuron to fire in order to maximize its ability to communicate information?\n", "\n", "In order to explore this question, we first need to have a quantifiable measure for information. Shannon introduced the concept of entropy to do just that, and defined it as\n", "\n", "\\begin{equation}\n", "H_b(X) = -\\sum_{x\\in X} p(x) \\log_b p(x)\n", "\\end{equation}\n", "\n", "where $H$ is entropy measured in units of base $b$ and $p(x)$ is the probability of observing the event $x$ from the set of all possible events in $X$. See the Bonus Section 1 for a more detailed look at how this equation was derived.\n", "\n", "The most common base of measuring entropy is $b=2$, so we often talk about *bits* of information, though other bases are used as well (e.g. when $b=e$ we call the units *nats*)." ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "First, let's explore how entropy changes between some simple discrete probability distributions. In the rest of this tutorial we will refer to these as probability mass functions (PMF), where $p(x_i)$ equals the $i^{th}$ value in an array, and mass refers to how much of the distribution is contained at that value.\n", "\n", "For our first PMF, we will choose one where all of the probability mass is located in the middle of the distribution." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": {} }, "outputs": [], "source": [ "n_bins = 50 # number of points supporting the distribution\n", "x_range = (0, 1) # will be subdivided evenly into bins corresponding to points\n", "\n", "bins = np.linspace(*x_range, n_bins + 1) # bin edges\n", "\n", "pmf = np.zeros(n_bins)\n", "pmf[len(pmf) // 2] = 1.0 # middle point has all the mass\n", "\n", "# Since we already have a PMF, rather than un-binned samples, `plt.hist` is not\n", "# suitable. Instead, we directly plot the PMF as a step function to visualize\n", "# the histogram:\n", "pmf_ = np.insert(pmf, 0, pmf[0]) # this is necessary to align plot steps with bin edges\n", "plt.plot(bins, pmf_, drawstyle=\"steps\")\n", "# `fill_between` provides area shading\n", "plt.fill_between(bins, pmf_, step=\"pre\", alpha=0.4)\n", "plt.xlabel(\"x\")\n", "plt.ylabel(\"p(x)\")\n", "plt.xlim(x_range)\n", "plt.ylim(0, 1)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "If we were to draw a sample from this distribution, we know exactly what we would get every time. Distributions where all the mass is concentrated on a single event are known as *deterministic*.\n", "\n", "How much entropy is contained in a deterministic distribution? We will compute this in the next exercise." ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "## Coding Exercise 1: Computing Entropy\n", "\n", "Your first exercise is to implement a method that computes the entropy of a discrete probability distribution, given its mass function. 
Remember that we are interested in entropy in units of _bits_, so be sure to use the correct log function.\n", "\n", "Recall that $\\log(0)$ is undefined. When evaluated at $0$, NumPy log functions (such as `np.log2`) return `-inf`, so the term $0 \\cdot \\log_2 0$ evaluates to `np.nan` (\"Not a Number\"). By convention, these undefined terms, which correspond to points in the distribution with zero mass, are excluded from the sum that computes the entropy." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": {} }, "outputs": [], "source": [ "def entropy(pmf):\n", " \"\"\"Given a discrete distribution, return the Shannon entropy in bits.\n", "\n", " This is a measure of information in the distribution. For a totally\n", " deterministic distribution, where samples are always found in the same bin,\n", " then samples from the distribution give no more information and the entropy\n", " is 0.\n", "\n", " For now this assumes `pmf` arrives as a well-formed distribution (that is,\n", " `np.sum(pmf)==1` and `not np.any(pmf < 0)`)\n", "\n", " Args:\n", " pmf (np.ndarray): The probability mass function for a discrete distribution\n", " represented as an array of probabilities.\n", " Returns:\n", " h (number): The entropy of the distribution in `pmf`.\n", "\n", " \"\"\"\n", " ############################################################################\n", " # Exercise for students: compute the entropy of the provided PMF\n", " # 1. Exclude the points in the distribution with no mass (where `pmf==0`).\n", " # Hint: this is equivalent to including only the points with `pmf>0`.\n", " # 2. Implement the equation for Shannon entropy (in bits).\n", " # When ready to test, comment or remove the next line\n", " raise NotImplementedError(\"Exercise: implement the equation for entropy\")\n", " ############################################################################\n", "\n", " # reduce to non-zero entries to avoid an error from log2(0)\n", " pmf = ...\n", "\n", " # implement the equation for Shannon entropy (in bits)\n", " h = ...\n", "\n", " # return the absolute value (avoids getting a -0 result)\n", " return np.abs(h)\n", "\n", "# Call entropy function and print result\n", "print(f\"{entropy(pmf):.2f} bits\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "both", "execution": {} }, "outputs": [], "source": [ "# to_remove solution\n", "def entropy(pmf):\n", " \"\"\"Given a discrete distribution, return the Shannon entropy in bits.\n", "\n", " This is a measure of information in the distribution. 
For a totally\n", " deterministic distribution, where samples are always found in the same bin,\n", " then samples from the distribution give no more information and the entropy\n", " is 0.\n", "\n", " For now this assumes `pmf` arrives as a well-formed distribution (that is,\n", " `np.sum(pmf)==1` and `not np.any(pmf < 0)`)\n", "\n", " Args:\n", " pmf (np.ndarray): The probability mass function for a discrete distribution\n", " represented as an array of probabilities.\n", " Returns:\n", " h (number): The entropy of the distribution in `pmf`.\n", " \"\"\"\n", " # reduce to non-zero entries to avoid an error from log2(0)\n", " pmf = pmf[pmf > 0]\n", "\n", " # implement the equation for Shannon entropy (in bits)\n", " h = -np.sum(pmf * np.log2(pmf))\n", "\n", " # return the absolute value (avoids getting a -0 result)\n", " return np.abs(h)\n", "\n", "# Call entropy function and print result\n", "print(f\"{entropy(pmf):.2f} bits\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "execution": {} }, "outputs": [], "source": [ "# @title Submit your feedback\n", "content_review(f\"{feedback_prefix}_Optimization_and_Information_Exercise\")" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "We expect zero surprise from a deterministic distribution. If we had done this calculation by hand, it would simply be $-1\\log_2 1 = -0=0$." ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "Note that changing the location of the peak (i.e. the point and bin on which all the mass rests) doesn't alter the entropy. The entropy is about how predictable a sample is with respect to a distribution. A single peak is deterministic regardless of which point it sits on - the following plot shows a PMF that would also have zero entropy." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "execution": {} }, "outputs": [], "source": [ "# @markdown Execute this cell to visualize another PMF with zero entropy\n", "pmf = np.zeros(n_bins)\n", "pmf[2] = 1.0 # arbitrary point has all the mass\n", "\n", "pmf_ = np.insert(pmf, 0, pmf[0])\n", "plt.plot(bins, pmf_, drawstyle=\"steps\")\n", "plt.fill_between(bins, pmf_, step=\"pre\", alpha=0.4)\n", "plt.xlabel(\"x\")\n", "plt.ylabel(\"p(x)\")\n", "plt.xlim(x_range)\n", "plt.ylim(0, 1);" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "What about a distribution with mass split equally between two points?" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "execution": {} }, "outputs": [], "source": [ "# @markdown Execute this cell to visualize a PMF with split mass\n", "\n", "pmf = np.zeros(n_bins)\n", "pmf[len(pmf) // 3] = 0.5\n", "pmf[2 * len(pmf) // 3] = 0.5\n", "\n", "pmf_ = np.insert(pmf, 0, pmf[0])\n", "plt.plot(bins, pmf_, drawstyle=\"steps\")\n", "plt.fill_between(bins, pmf_, step=\"pre\", alpha=0.4)\n", "plt.xlabel(\"x\")\n", "plt.ylabel(\"p(x)\")\n", "plt.xlim(x_range)\n", "plt.ylim(0, 1)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "Here, the entropy calculation is: $-(0.5 \\log_2 0.5 + 0.5\\log_2 0.5)=1$" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "There is 1 bit of entropy. 
This means that before we take a random sample, there is 1 bit of uncertainty about which point in the distribution the sample will fall on: it will either be the first peak or the second one.\n", "\n", "Likewise, if we make one of the peaks taller (i.e. its point holds more of the probability mass) and the other one shorter, the entropy will decrease because of the increased certainty that the sample will fall on one point and not the other: $-(0.2 \\log_2 0.2 + 0.8\\log_2 0.8)\\approx 0.72$\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "Try changing the number and weighting of the peaks, and see how the entropy varies." ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "If we split the probability mass among even more points, the entropy continues to increase. Let's derive the general form for $N$ points of equal mass, where $p_i=p=1/N$:\n", "\n", "\\begin{align}\n", "-\\sum_i p_i \\log_b p_i &= -\\sum_{i=1}^N \\frac{1}{N} \\log_b \\frac{1}{N} \\\\\n", "&= -\\log_b \\frac{1}{N} \\\\\n", "&= \\log_b N\n", "\\end{align}" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "If we have $N$ discrete points, the _uniform distribution_ (where all points have equal mass) is the distribution with the highest entropy: $\\log_b N$. This upper bound on entropy is useful when considering binning strategies, as any estimate of entropy over $N$ discrete points (or bins) must be in the interval $[0, \\log_b N]$.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "execution": {} }, "outputs": [], "source": [ "# @markdown Execute this cell to visualize a PMF of uniform distribution\n", "\n", "pmf = np.ones(n_bins) / n_bins # [1/N] * N\n", "\n", "pmf_ = np.insert(pmf, 0, pmf[0])\n", "plt.plot(bins, pmf_, drawstyle=\"steps\")\n", "plt.fill_between(bins, pmf_, step=\"pre\", alpha=0.4)\n", "plt.xlabel(\"x\")\n", "plt.ylabel(\"p(x)\")\n", "plt.xlim(x_range)\n", "plt.ylim(0, 1)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "Here, there are 50 points and the entropy of the uniform distribution is $\\log_2 50\\approx 5.64$. If we construct _any_ discrete distribution $X$ over 50 points (or bins) and calculate an entropy of $H_2(X)>\\log_2 50$, something must be wrong with our implementation of the discrete entropy computation."
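, "\n", "\n", "As a quick check of this bound (a minimal sketch, assuming the `entropy` function from Coding Exercise 1 has been completed), the next cell compares the entropy of the uniform PMF to $\\log_2 N$.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": {} }, "outputs": [], "source": [ "# Check the upper bound (assumes `entropy` from Coding Exercise 1 is implemented):\n", "# a uniform PMF over n_bins points should reach the maximum of log2(n_bins).\n", "pmf_uniform_check = np.ones(n_bins) / n_bins\n", "print(f\"Uniform entropy: {entropy(pmf_uniform_check):.2f} bits\")\n", "print(f\"log2(N) bound: {np.log2(n_bins):.2f} bits\")" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "The two printed values should match, since no discrete distribution over these $N$ bins can exceed the bound."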
] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "---\n", "# Section 2: Information, neurons, and spikes\n", "\n", "*Estimated timing to here from start of tutorial: 20 min*" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "execution": {} }, "outputs": [], "source": [ "# @title Video 2: Entropy of different distributions\n", "from ipywidgets import widgets\n", "from IPython.display import YouTubeVideo\n", "from IPython.display import IFrame\n", "from IPython.display import display\n", "\n", "\n", "class PlayVideo(IFrame):\n", " def __init__(self, id, source, page=1, width=400, height=300, **kwargs):\n", " self.id = id\n", " if source == 'Bilibili':\n", " src = f'https://player.bilibili.com/player.html?bvid={id}&page={page}'\n", " elif source == 'Osf':\n", " src = f'https://mfr.ca-1.osf.io/render?url=https://osf.io/download/{id}/?direct%26mode=render'\n", " super(PlayVideo, self).__init__(src, width, height, **kwargs)\n", "\n", "\n", "def display_videos(video_ids, W=400, H=300, fs=1):\n", " tab_contents = []\n", " for i, video_id in enumerate(video_ids):\n", " out = widgets.Output()\n", " with out:\n", " if video_ids[i][0] == 'Youtube':\n", " video = YouTubeVideo(id=video_ids[i][1], width=W,\n", " height=H, fs=fs, rel=0)\n", " print(f'Video available at https://youtube.com/watch?v={video.id}')\n", " else:\n", " video = PlayVideo(id=video_ids[i][1], source=video_ids[i][0], width=W,\n", " height=H, fs=fs, autoplay=False)\n", " if video_ids[i][0] == 'Bilibili':\n", " print(f'Video available at https://www.bilibili.com/video/{video.id}')\n", " elif video_ids[i][0] == 'Osf':\n", " print(f'Video available at https://osf.io/{video.id}')\n", " display(video)\n", " tab_contents.append(out)\n", " return tab_contents\n", "\n", "\n", "video_ids = [('Youtube', 'o6nyrx3KH20'), ('Bilibili', 'BV1df4y1976g')]\n", "tab_contents = display_videos(video_ids, W=854, H=480)\n", "tabs = widgets.Tab()\n", "tabs.children = tab_contents\n", "for i in range(len(tab_contents)):\n", " tabs.set_title(i, video_ids[i][0])\n", "display(tabs)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "execution": {} }, "outputs": [], "source": [ "# @title Submit your feedback\n", "content_review(f\"{feedback_prefix}_Entropy_of_different_distributions_Video\")" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "Recall the discussion of spike times and inter-spike intervals (ISIs) from Tutorial 1. What does the information content (or distributional entropy) of these measures say about our theory of nervous systems?\n", "\n", "We'll consider three hypothetical neurons that all have the same mean ISI, but with different distributions:\n", "\n", "1. Deterministic\n", "2. Uniform\n", "3. Exponential\n", "\n", "Fixing the mean of the ISI distribution is equivalent to fixing its inverse: the neuron's mean firing rate. If a neuron has a fixed energy budget and each of its spikes has the same energy cost, then by fixing the mean firing rate, we are normalizing for energy expenditure. This provides a basis for comparing the entropy of different ISI distributions. In other words: if our neuron has a fixed budget, what ISI distribution should it express (all else being equal) to maximize the information content of its outputs?\n", "\n", "Let's construct our three distributions and see how their entropies differ." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": {} }, "outputs": [], "source": [ "n_bins = 50\n", "mean_isi = 0.025\n", "isi_range = (0, 0.25)\n", "\n", "bins = np.linspace(*isi_range, n_bins + 1)\n", "mean_idx = np.searchsorted(bins, mean_isi)\n", "\n", "# 1. all mass concentrated on the ISI mean\n", "pmf_single = np.zeros(n_bins)\n", "pmf_single[mean_idx] = 1.0\n", "\n", "# 2. mass uniformly distributed about the ISI mean\n", "pmf_uniform = np.zeros(n_bins)\n", "pmf_uniform[0:2*mean_idx] = 1 / (2 * mean_idx)\n", "\n", "# 3. mass exponentially distributed about the ISI mean\n", "pmf_exp = stats.expon.pdf(bins[1:], scale=mean_isi)\n", "pmf_exp /= np.sum(pmf_exp)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "execution": {} }, "outputs": [], "source": [ "# @markdown Run this cell to plot the three PMFs\n", "fig, axes = plt.subplots(ncols=3, figsize=(15, 5))\n", "\n", "dists = [# (subplot title, pmf, ylim)\n", " (\"(1) Deterministic\", pmf_single, (0, 1.05)),\n", " (\"(1) Uniform\", pmf_uniform, (0, 1.05)),\n", " (\"(1) Exponential\", pmf_exp, (0, 1.05))]\n", "\n", "for ax, (label, pmf_, ylim) in zip(axes, dists):\n", " pmf_ = np.insert(pmf_, 0, pmf_[0])\n", " ax.plot(bins, pmf_, drawstyle=\"steps\")\n", " ax.fill_between(bins, pmf_, step=\"pre\", alpha=0.4)\n", " ax.set_title(label)\n", " ax.set_xlabel(\"Inter-spike interval (s)\")\n", " ax.set_ylabel(\"Probability mass\")\n", " ax.set_xlim(isi_range)\n", " ax.set_ylim(ylim)\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": {} }, "outputs": [], "source": [ "print(\n", " f\"Deterministic: {entropy(pmf_single):.2f} bits\",\n", " f\"Uniform: {entropy(pmf_uniform):.2f} bits\",\n", " f\"Exponential: {entropy(pmf_exp):.2f} bits\",\n", " sep=\"\\n\",\n", ")" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "---\n", "# Section 3: Calculate entropy of ISI distributions from data\n", "\n", "*Estimated timing to here from start of tutorial: 25 min*" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "## Section 3.1: Computing probabilities from histogram" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "execution": {} }, "outputs": [], "source": [ "# @title Video 3: Probabilities from histogram\n", "from ipywidgets import widgets\n", "from IPython.display import YouTubeVideo\n", "from IPython.display import IFrame\n", "from IPython.display import display\n", "\n", "\n", "class PlayVideo(IFrame):\n", " def __init__(self, id, source, page=1, width=400, height=300, **kwargs):\n", " self.id = id\n", " if source == 'Bilibili':\n", " src = f'https://player.bilibili.com/player.html?bvid={id}&page={page}'\n", " elif source == 'Osf':\n", " src = f'https://mfr.ca-1.osf.io/render?url=https://osf.io/download/{id}/?direct%26mode=render'\n", " super(PlayVideo, self).__init__(src, width, height, **kwargs)\n", "\n", "\n", "def display_videos(video_ids, W=400, H=300, fs=1):\n", " tab_contents = []\n", " for i, video_id in enumerate(video_ids):\n", " out = widgets.Output()\n", " with out:\n", " if video_ids[i][0] == 'Youtube':\n", " video = YouTubeVideo(id=video_ids[i][1], width=W,\n", " height=H, fs=fs, rel=0)\n", " print(f'Video available at https://youtube.com/watch?v={video.id}')\n", " else:\n", " video = PlayVideo(id=video_ids[i][1], source=video_ids[i][0], width=W,\n", " height=H, fs=fs, autoplay=False)\n", " if video_ids[i][0] == 'Bilibili':\n", " 
print(f'Video available at https://www.bilibili.com/video/{video.id}')\n", " elif video_ids[i][0] == 'Osf':\n", " print(f'Video available at https://osf.io/{video.id}')\n", " display(video)\n", " tab_contents.append(out)\n", " return tab_contents\n", "\n", "\n", "video_ids = [('Youtube', 'e2U_-07O9jo'), ('Bilibili', 'BV1Jk4y1B7cz')]\n", "tab_contents = display_videos(video_ids, W=854, H=480)\n", "tabs = widgets.Tab()\n", "tabs.children = tab_contents\n", "for i in range(len(tab_contents)):\n", " tabs.set_title(i, video_ids[i][0])\n", "display(tabs)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "execution": {} }, "outputs": [], "source": [ "# @title Submit your feedback\n", "content_review(f\"{feedback_prefix}_Probabilities_from_histogram_Video\")" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "In the previous example we created the PMFs by hand to illustrate idealized scenarios. How would we compute them from data recorded from actual neurons?\n", "\n", "One way is to convert the ISI histograms we've previously computed into discrete probability distributions using the following equation:\n", "\n", "\\begin{equation}\n", "p_i = \\frac{n_i}{\\sum\\nolimits_{i}n_i}\n", "\\end{equation}\n", "\n", "where $p_i$ is the probability of an ISI falling within a particular interval $i$ and $n_i$ is the count of how many ISIs were observed in that interval." ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "### Coding Exercise 3.1: Probability Mass Function\n", "\n", "Your second exercise is to implement a method that will produce a probability mass function from an array of ISI bin counts.\n", "\n", "To verify your solution, we will compute the probability distribution of ISIs from real neural data taken from the Steinmetz dataset." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": {} }, "outputs": [], "source": [ "def pmf_from_counts(counts):\n", " \"\"\"Given counts, normalize by the total to estimate probabilities.\"\"\"\n", " ###########################################################################\n", " # Exercise: Compute the PMF. 
Remove the next line to test your function\n", " raise NotImplementedError(\"Student exercise: compute the PMF from ISI counts\")\n", " ###########################################################################\n", "\n", " pmf = ...\n", "\n", " return pmf\n", "\n", "\n", "# Get neuron index\n", "neuron_idx = 283\n", "\n", "# Get counts of ISIs from Steinmetz data\n", "isi = np.diff(steinmetz_spikes[neuron_idx])\n", "bins = np.linspace(*isi_range, n_bins + 1)\n", "counts, _ = np.histogram(isi, bins)\n", "\n", "# Compute pmf\n", "pmf = pmf_from_counts(counts)\n", "\n", "# Visualize\n", "plot_pmf(pmf, isi_range)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "both", "execution": {} }, "outputs": [], "source": [ "# to_remove solution\n", "def pmf_from_counts(counts):\n", " \"\"\"Given counts, normalize by the total to estimate probabilities.\"\"\"\n", " pmf = counts / np.sum(counts)\n", " return pmf\n", "\n", "\n", "# Get neuron index\n", "neuron_idx = 283\n", "\n", "# Get counts of ISIs from Steinmetz data\n", "isi = np.diff(steinmetz_spikes[neuron_idx])\n", "bins = np.linspace(*isi_range, n_bins + 1)\n", "counts, _ = np.histogram(isi, bins)\n", "\n", "# Compute pmf\n", "pmf = pmf_from_counts(counts)\n", "\n", "# Visualize\n", "with plt.xkcd():\n", " plot_pmf(pmf, isi_range)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "execution": {} }, "outputs": [], "source": [ "# @title Submit your feedback\n", "content_review(f\"{feedback_prefix}_Probability_mass_function_Exercise\")" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "## Section 3.2: Calculating entropy from pmf" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "execution": {} }, "outputs": [], "source": [ "# @title Video 4: Calculating entropy from pmf\n", "from ipywidgets import widgets\n", "from IPython.display import YouTubeVideo\n", "from IPython.display import IFrame\n", "from IPython.display import display\n", "\n", "\n", "class PlayVideo(IFrame):\n", " def __init__(self, id, source, page=1, width=400, height=300, **kwargs):\n", " self.id = id\n", " if source == 'Bilibili':\n", " src = f'https://player.bilibili.com/player.html?bvid={id}&page={page}'\n", " elif source == 'Osf':\n", " src = f'https://mfr.ca-1.osf.io/render?url=https://osf.io/download/{id}/?direct%26mode=render'\n", " super(PlayVideo, self).__init__(src, width, height, **kwargs)\n", "\n", "\n", "def display_videos(video_ids, W=400, H=300, fs=1):\n", " tab_contents = []\n", " for i, video_id in enumerate(video_ids):\n", " out = widgets.Output()\n", " with out:\n", " if video_ids[i][0] == 'Youtube':\n", " video = YouTubeVideo(id=video_ids[i][1], width=W,\n", " height=H, fs=fs, rel=0)\n", " print(f'Video available at https://youtube.com/watch?v={video.id}')\n", " else:\n", " video = PlayVideo(id=video_ids[i][1], source=video_ids[i][0], width=W,\n", " height=H, fs=fs, autoplay=False)\n", " if video_ids[i][0] == 'Bilibili':\n", " print(f'Video available at https://www.bilibili.com/video/{video.id}')\n", " elif video_ids[i][0] == 'Osf':\n", " print(f'Video available at https://osf.io/{video.id}')\n", " display(video)\n", " tab_contents.append(out)\n", " return tab_contents\n", "\n", "\n", "video_ids = [('Youtube', 'Xjy-jj-6Oz0'), ('Bilibili', 'BV1vA411e7Cd')]\n", "tab_contents = display_videos(video_ids, W=854, H=480)\n", "tabs = widgets.Tab()\n", "tabs.children = tab_contents\n", "for i in range(len(tab_contents)):\n", " 
tabs.set_title(i, video_ids[i][0])\n", "display(tabs)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "execution": {} }, "outputs": [], "source": [ "# @title Submit your feedback\n", "content_review(f\"{feedback_prefix}_Calculating_entropy_from_pmf_Video\")" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "Now that we have the probability distribution for the actual neuron spiking activity, we can calculate its entropy." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": {} }, "outputs": [], "source": [ "print(f\"Entropy for Neuron {neuron_idx}: {entropy(pmf):.2f} bits\")" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "### Interactive Demo 3.2: Entropy of neurons\n", "\n", "We can combine the above distribution plot and entropy calculation with an interactive widget to explore how the different neurons in the dataset vary in spiking activity and relative information. Note that the mean firing rate across neurons is not fixed, so some neurons with a uniform ISI distribution may have higher entropy than neurons with a more exponential distribution.\n", "\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "execution": {} }, "outputs": [], "source": [ "# @markdown **Run the cell** to enable the sliders.\n", "\n", "def _pmf_from_counts(counts):\n", " \"\"\"Given counts, normalize by the total to estimate probabilities.\"\"\"\n", " pmf = counts / np.sum(counts)\n", " return pmf\n", "\n", "\n", "def _entropy(pmf):\n", " \"\"\"Given a discrete distribution, return the Shannon entropy in bits.\"\"\"\n", " # reduce to non-zero entries to avoid an error from log2(0)\n", " pmf = pmf[pmf > 0]\n", " h = -np.sum(pmf * np.log2(pmf))\n", " # absolute value applied to avoid getting a -0 result\n", " return np.abs(h)\n", "\n", "\n", "@widgets.interact(neuron=widgets.IntSlider(0, min=0, max=(len(steinmetz_spikes)-1)))\n", "def steinmetz_pmf(neuron):\n", " \"\"\" Given a neuron from the Steinmetz data, compute its PMF and entropy \"\"\"\n", " isi = np.diff(steinmetz_spikes[neuron])\n", " bins = np.linspace(*isi_range, n_bins + 1)\n", " counts, _ = np.histogram(isi, bins)\n", " pmf = _pmf_from_counts(counts)\n", "\n", " plot_pmf(pmf, isi_range)\n", " plt.title(f\"Neuron {neuron}: H = {_entropy(pmf):.2f} bits\")\n", " plt.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "execution": {} }, "outputs": [], "source": [ "# @title Submit your feedback\n", "content_review(f\"{feedback_prefix}_Entropy_of_neurons_Interactive_Demo\")" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "---\n", "# Section 4: Reflecting on why models\n", "\n", "*Estimated timing to here from start of tutorial: 35 min*" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "## Think! 3: Reflecting on why models\n", "\n", "Please discuss the following questions for around 10 minutes with your group:\n", "\n", "- Have you seen why models before?\n", "- Have you ever done one?\n", "- Why are why models useful?\n", "- When are they possible? Does your field have why models?\n", "- What do we learn from constructing them?" 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "execution": {} }, "outputs": [], "source": [ "# @title Submit your feedback\n", "content_review(f\"{feedback_prefix}_Reflecting_on_why_models_Discussion\")" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "---\n", "# Summary\n", "\n", "*Estimated timing of tutorial: 45 minutes*" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "execution": {} }, "outputs": [], "source": [ "# @title Video 5: Summary of model types\n", "from ipywidgets import widgets\n", "from IPython.display import YouTubeVideo\n", "from IPython.display import IFrame\n", "from IPython.display import display\n", "\n", "\n", "class PlayVideo(IFrame):\n", " def __init__(self, id, source, page=1, width=400, height=300, **kwargs):\n", " self.id = id\n", " if source == 'Bilibili':\n", " src = f'https://player.bilibili.com/player.html?bvid={id}&page={page}'\n", " elif source == 'Osf':\n", " src = f'https://mfr.ca-1.osf.io/render?url=https://osf.io/download/{id}/?direct%26mode=render'\n", " super(PlayVideo, self).__init__(src, width, height, **kwargs)\n", "\n", "\n", "def display_videos(video_ids, W=400, H=300, fs=1):\n", " tab_contents = []\n", " for i, video_id in enumerate(video_ids):\n", " out = widgets.Output()\n", " with out:\n", " if video_ids[i][0] == 'Youtube':\n", " video = YouTubeVideo(id=video_ids[i][1], width=W,\n", " height=H, fs=fs, rel=0)\n", " print(f'Video available at https://youtube.com/watch?v={video.id}')\n", " else:\n", " video = PlayVideo(id=video_ids[i][1], source=video_ids[i][0], width=W,\n", " height=H, fs=fs, autoplay=False)\n", " if video_ids[i][0] == 'Bilibili':\n", " print(f'Video available at https://www.bilibili.com/video/{video.id}')\n", " elif video_ids[i][0] == 'Osf':\n", " print(f'Video available at https://osf.io/{video.id}')\n", " display(video)\n", " tab_contents.append(out)\n", " return tab_contents\n", "\n", "\n", "video_ids = [('Youtube', 'X4K2RR5qBK8'), ('Bilibili', 'BV1F5411e7ww')]\n", "tab_contents = display_videos(video_ids, W=854, H=480)\n", "tabs = widgets.Tab()\n", "tabs.children = tab_contents\n", "for i in range(len(tab_contents)):\n", " tabs.set_title(i, video_ids[i][0])\n", "display(tabs)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "execution": {} }, "outputs": [], "source": [ "# @title Submit your feedback\n", "content_review(f\"{feedback_prefix}_Summary_of_model_types_Video\")" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "Congratulations! You've finished your first NMA tutorial. In this 3 part tutorial series, we used different types of models to understand the spiking behavior of neurons recorded in the Steinmetz data set.\n", "\n", " - We used \"what\" models to discover that the ISI distribution of real neurons is closest to an exponential distribution\n", " - We used \"how\" models to discover that balanced excitatory and inhibitory inputs, coupled with a leaky membrane, can give rise to neuronal spiking with exhibiting such an exponential ISI distribution\n", " - We used \"why\" models to discover that exponential ISI distributions contain the most information when the mean spiking is constrained\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "---\n", "# Notation\n", "\n", "\\begin{align}\n", "H(X) &\\quad \\text{entropy of random variable X}\\\\\n", "b &\\quad \\text{base, e.g. 
b=2 or b=e}\\\\\n", "x &\\quad \\text{event x}\\\\\n", "p(x) &\\quad \\text{probability of observing event x}\\\\\n", "\\text{ISI} &\\quad \\text{interspike interval}\\\\\n", "n_i &\\quad \\text{count of observed ISIs in interval i}\\\\\n", "p_i &\\quad \\text{probability of an ISI falling within a particular interval i}\n", "\\end{align}" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "---\n", "# Bonus" ] }, { "cell_type": "markdown", "metadata": { "execution": {} }, "source": [ "## Bonus Section 1: The foundations for Entropy\n", "\n", "In his foundational [1948 paper](https://en.wikipedia.org/wiki/A_Mathematical_Theory_of_Communication) on information theory, Claude Shannon began with three criteria for a function $H$ defining the entropy of a discrete distribution of probability masses $p_i\\in p(X)$ over the points $x_i\\in X$:\n", "1. $H$ should be continuous in the $p_i$.\n", " - That is, $H$ should change smoothly in response to smooth changes to the mass $p_i$ on each point $x_i$.\n", "2. If all the points have equal shares of the probability mass, $p_i=1/N$, $H$ should be a monotonically increasing function of $N$.\n", " - That is, if $X_N$ is the support with $N$ discrete points and $p(x\\in X_N)$ assigns constant mass to each point, then $H(X_1) < H(X_2) < H(X_3) < \\dots$\n", "3. $H$ should be preserved by (invariant to) the equivalent (de)composition of distributions.\n", " - For example (from Shannon's paper) if we have a discrete distribution over three points with masses $(\\frac{1}{2},\\frac{1}{3},\\frac{1}{6})$, then their entropy can be represented in terms of a direct choice between the three and calculated $H(\\frac{1}{2},\\frac{1}{3},\\frac{1}{6})$. However, it could also be represented in terms of a series of two choices:\n", " 1. either we sample the point with mass $1/2$ or not (_not_ is the other $1/2$, whose subdivisions are not given in the first choice),\n", " 2. if (with probability $1/2$) we _don't_ sample the first point, we sample one of the two remaining points, whose original masses $1/3$ and $1/6$ renormalize to conditional masses $\\frac{2}{3}$ and $\\frac{1}{3}$.\n", " \n", " Thus in this case we require that $H(\\frac{1}{2},\\frac{1}{3},\\frac{1}{6})=H(\\frac{1}{2},\\frac{1}{2}) + \\frac{1}{2}H(\\frac{2}{3}, \\frac{1}{3})$, where the second term is weighted by the probability $\\frac{1}{2}$ of reaching the second choice.\n", "\n", "There is a unique function (up to a linear scaling factor) which satisfies these 3 requirements:\n", "\n", "\\begin{equation}\n", "H_b(X) = -\\sum_{x\\in X} p(x) \\log_b p(x)\n", "\\end{equation}\n", "\n", "The base of the logarithm $b>1$ controls the units of entropy. The two most common cases are $b=2$ for units of _bits_, and $b=e$ for _nats_.\n", "\n", "We can view this function as the expectation of the self-information over a distribution:\n", "\n", "\\begin{align}\n", "H_b(X) &= \\mathbb{E}_{x\\in X} \\left[I_b(x)\\right]\\\\\n", "I_b(x) &= -\\log_b p(x)\n", "\\end{align}\n", "\n", "Self-information is just the negative logarithm of probability, and is a measure of how surprising an event sampled from the distribution would be. Events with $p(x)=1$ are certain to occur, and their self-information is zero (as is the entropy of the distribution they compose) meaning they are totally unsurprising. 
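For example, a fair coin flip assigns probability $\\frac{1}{2}$ to each outcome, so observing either outcome carries $-\\log_2 \\frac{1}{2} = 1$ bit of self-information. 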
The smaller the probability of an event, the higher its self-information, and the more surprising the event would be to observe.\n" ] } ], "metadata": { "colab": { "collapsed_sections": [], "include_colab_link": true, "name": "W1D1_Tutorial3", "provenance": [], "toc_visible": true }, "kernel": { "display_name": "Python 3", "language": "python", "name": "python3" }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.17" } }, "nbformat": 4, "nbformat_minor": 0 }