{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Discrete Data and the Multinomial Distribution" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Preliminaries\n", "\n", "- Goal \n", "  - Simple Bayesian and maximum likelihood-based density estimation for discretely valued data sets \n", "- Materials \n", "  - Mandatory\n", "    - These lecture notes\n", "  - Optional\n", "    - Bishop pp. 67-70, 74-76, 93-94 " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Discrete Data: the 1-of-K Coding Scheme\n", "\n", "- Consider a coin-tossing experiment with outcomes $x \\in\\{0,1\\}$ (tail and head) and let $0\\leq \\mu \\leq 1$ represent the probability of heads. This model can be written as a [**Bernoulli distribution**](https://en.wikipedia.org/wiki/Bernoulli_distribution):\n", "$$ \n", "p(x|\\mu) = \\mu^{x}(1-\\mu)^{1-x}\n", "$$\n", "  - Note that the variable $x$ acts as a (binary) **selector** for the tail or head probabilities. Think of this as an 'if'-statement in programming." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- Now consider a $K$-sided coin (e.g., a six-faced _die_ (pl.: dice)). How should we encode outcomes?" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- **Option 1**: $x \\in \\{1,2,\\ldots,K\\}$.\n", "  - E.g., for $K=6$, if the die lands on the 3rd face $\\,\\Rightarrow x=3$.\n", " \n", "- **Option 2**: $x=(x_1,\\ldots,x_K)^T$ with **binary selection variables**\n", "$$\n", "x_k = \\begin{cases} 1 & \\text{if die landed on $k$th face}\\\\\n", "0 & \\text{otherwise} \\end{cases}\n", "$$ \n", "  - E.g., for $K=6$, if the die lands on the 3rd face $\\,\\Rightarrow x=(0,0,1,0,0,0)^T$.\n", "  - This coding scheme is called a **1-of-K** or **one-hot** coding scheme." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- It turns out that the one-hot coding scheme is mathematically more convenient!" ] }
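, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- As a small illustration, the code cell below (a minimal sketch in Julia, using no packages) builds the one-hot encoding for a die that lands on the 3rd face." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "# Minimal sketch: one-hot (1-of-K) encoding of a single die roll\n", "K = 6                # number of faces of the die\n", "face = 3             # Option 1 encoding: the die landed on the 3rd face\n", "x = zeros(Int, K)    # Option 2 encoding: start from a vector of K zeros\n", "x[face] = 1          # switch on the selector for the observed face\n", "print(x)             # prints [0, 0, 1, 0, 0, 0]" ] }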
, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### The Categorical Distribution\n", "\n", "- Consider a $K$-sided die. We use a one-hot coding scheme. Assume the probabilities $p(x_k=1) = \\mu_k$ with $\\sum_k \\mu_k = 1$. The data generating distribution is then (note the similarity to the Bernoulli distribution)\n", "\n", "$$\n", "p(x|\\mu) = \\mu_1^{x_1} \\mu_2^{x_2} \\cdots \\mu_K^{x_K}=\\prod_{k=1}^K \\mu_k^{x_k} \\tag{B-2.26}\n", "$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- This generalized Bernoulli distribution is called the [**categorical distribution**](https://en.wikipedia.org/wiki/Categorical_distribution) (or sometimes the 'multi-noulli' distribution).\n", "\n", "" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Bayesian Density Estimation for a Loaded Die\n", "\n", "- Now let's proceed with Bayesian density estimation, i.e., let's learn the parameters of a model $p(x|\\theta)$ from an observed data set $D=\\{x_1,\\ldots,x_N\\}$ of $N$ independent-and-identically-distributed (IID) rolls of a $K$-sided die, with \n", "\n", "$$\n", "x_{nk} = \\begin{cases} 1 & \\text{if the $n$th throw landed on $k$th face}\\\\\n", "0 & \\text{otherwise} \\end{cases}\n", "$$ " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "##### Model specification\n", "\n", "- The data generating distribution is\n", "$$\n", "p(D|\\mu) = \\prod_n \\prod_k \\mu_k^{x_{nk}} = \\prod_k \\mu_k^{\\sum_n x_{nk}} = \\prod_k \\mu_k^{m_k} \\tag{B-2.29}\n", "$$\n", "where $m_k= \\sum_n x_{nk}$ is the total number of throws that landed on the $k$-th face. Note that $\\sum_k m_k = N$.\n", "  - This distribution depends on the observations **only** through the quantities $\\{m_k\\}$." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- We need a prior for the parameters $\\mu = (\\mu_1,\\mu_2,\\ldots,\\mu_K)$. In the [binary coin toss example](https://nbviewer.jupyter.org/github/bertdv/BMLIP/blob/master/lessons/notebooks/Bayesian-Machine-Learning.ipynb#beta-prior), we used a [beta distribution](https://en.wikipedia.org/wiki/Beta_distribution) that was conjugate to the binomial and whose parameters could be interpreted as prior pseudo-counts. \n", "\n", "- The generalization of the beta prior to the $K$ parameters $\\{\\mu_k\\}$ is the [Dirichlet distribution](https://en.wikipedia.org/wiki/Dirichlet_distribution):\n", "$$\n", "p(\\mu|\\alpha) = \\mathrm{Dir}(\\mu|\\alpha) = \\frac{\\Gamma\\left(\\sum_k \\alpha_k\\right)}{\\Gamma(\\alpha_1)\\cdots \\Gamma(\\alpha_K)} \\prod_{k=1}^K \\mu_k^{\\alpha_k-1} \n", "$$\n", "where $\\Gamma(\\cdot)$ is the [Gamma function](https://en.wikipedia.org/wiki/Gamma_function). \n", "\n", "- The Gamma function can be interpreted as a generalization of the factorial function to the real ($\\mathbb{R}$) numbers. If $n$ is a natural number ($1,2,3, \\ldots $), then $\\Gamma(n) = (n-1)!$, where $(n-1)! = (n-1)\\cdot (n-2) \\cdots 2 \\cdot 1$.\n", "\n", "- As before for the Beta distribution in the coin toss experiment, you can interpret $\\alpha_k-1$ as the prior number of (pseudo-)observations that the die landed on the $k$-th face." ] }
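, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- The code cell below gives a minimal sketch of such a prior, assuming the Distributions.jl package is available: it constructs $\\mathrm{Dir}(\\mu|\\alpha)$ for a six-sided die and shows the pseudo-count interpretation through the prior mean $\\mathrm{E}[\\mu_k] = \\alpha_k / \\sum_k \\alpha_k$." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "# Sketch of a Dirichlet prior over die probabilities (assumes Distributions.jl is installed)\n", "using Distributions\n", "\n", "alpha = [2.0, 2.0, 2.0, 2.0, 2.0, 2.0]   # hypothetical choice: one pseudo-observation (alpha_k - 1 = 1) per face\n", "prior = Dirichlet(alpha)                 # p(mu|alpha) = Dir(mu|alpha)\n", "\n", "println(mean(prior))    # prior mean E[mu_k] = alpha_k / sum(alpha) = 1/6 for every face\n", "println(rand(prior))    # one draw of mu from the prior: nonnegative entries that sum to 1" ] }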
, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "##### Inference for $\\{\\mu_k\\}$\n", "\n", "- The posterior for $\\{\\mu_k\\}$ can be obtained through Bayes rule:\n", "\n", "$$\\begin{align*}\n", "p(\\mu|D,\\alpha) &\\propto p(D|\\mu) \\cdot p(\\mu|\\alpha) \\\\\n", " &\\propto \\prod_k \\mu_k^{m_k} \\cdot \\prod_k \\mu_k^{\\alpha_k-1} \\\\\n", " &= \\prod_k \\mu_k^{\\alpha_k + m_k -1}\\\\\n", " &\\propto \\mathrm{Dir}\\left(\\mu\\,|\\,\\alpha + m \\right) \\tag{B-2.41} \\\\\n", " &= \\frac{\\Gamma\\left(\\sum_k (\\alpha_k + m_k) \\right)}{\\Gamma(\\alpha_1+m_1) \\Gamma(\\alpha_2+m_2) \\cdots \\Gamma(\\alpha_K + m_K)} \\prod_{k=1}^K \\mu_k^{\\alpha_k + m_k -1}\n", "\\end{align*}$$\n", "where $m = (m_1,m_2,\\ldots,m_K)^T$." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- We recognize the $(\\alpha_k-1)$'s as prior pseudo-counts, and the Dirichlet distribution turns out to be a [conjugate prior](https://en.wikipedia.org/wiki/Conjugate_prior) to the categorical/multinomial:\n", "\n", "$$\\begin{align*}\n", "\\underbrace{\\text{Dirichlet}}_{\\text{posterior}} &\\propto \\underbrace{\\text{categorical}}_{\\text{likelihood}} \\cdot \\underbrace{\\text{Dirichlet}}_{\\text{prior}}\n", "\\end{align*}$$\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- This is actually a generalization of the conjugate relation that we found for the binary coin toss: \n", "\n", "$$\\begin{align*}\n", "\\underbrace{\\text{Beta}}_{\\text{posterior}} &\\propto \\underbrace{\\text{binomial}}_{\\text{likelihood}} \\cdot \\underbrace{\\text{Beta}}_{\\text{prior}}\n", "\\end{align*}$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "##### Prediction of next toss for the loaded die\n", "\n", "- Let's apply what we have learned about the loaded die to compute the probability that we throw the $k$-th face at the next toss. \n", "\n", "$$\\begin{align*}\n", "p(x_{\\bullet,k}=1|D) &= \\int p(x_{\\bullet,k}=1|\\mu)\\,p(\\mu|D) \\,\\mathrm{d}\\mu \\\\\n", " &= \\int \\mu_k \\times \\mathrm{Dir}(\\mu|\\,\\alpha+m) \\,\\mathrm{d}\\mu \\\\\n", " &= \\mathrm{E}\\left[ \\mu_k \\right] \\\\\n", " &= \\frac{m_k + \\alpha_k }{ N+ \\sum_k \\alpha_k}\n", "\\end{align*}$$\n", " \n", "- (You can [find the mean of the Dirichlet distribution at its Wikipedia site](https://en.wikipedia.org/wiki/Dirichlet_distribution)). \n", "\n", "- This result is simply a generalization of [**Laplace's rule of succession**](https://en.wikipedia.org/wiki/Rule_of_succession)." ] }
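, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- The code cell below sketches this workflow end-to-end for a hypothetical loaded die (it assumes the Distributions.jl package is available): simulate $N$ rolls, compute the counts $m_k$, form the posterior $\\mathrm{Dir}(\\mu|\\alpha+m)$, and evaluate the predictive probabilities $(m_k + \\alpha_k)/(N + \\sum_k \\alpha_k)$ for the next toss." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "# Sketch: Bayesian estimation for a hypothetical loaded die (assumes Distributions.jl is installed)\n", "using Distributions\n", "\n", "mu_true = [0.125, 0.125, 0.125, 0.125, 0.125, 0.375]   # loaded die: face 6 comes up more often\n", "N = 100\n", "D = rand(Categorical(mu_true), N)           # N IID rolls, encoded as outcomes 1..6 (Option 1)\n", "m = [sum(D .== k) for k in 1:6]             # data frequencies m_k (equivalent to summing the one-hot vectors)\n", "\n", "alpha = ones(6)                             # Dirichlet prior with all pseudo-counts alpha_k = 1 (uniform prior)\n", "posterior = Dirichlet(alpha + m)            # conjugacy: the posterior is again a Dirichlet (B-2.41)\n", "\n", "p_next = (m .+ alpha) ./ (N + sum(alpha))   # predictive probability for the next toss (rule of succession)\n", "println(isapprox(mean(posterior), p_next))  # the predictive equals the posterior mean of mu (prints true)" ] }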
, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Categorical, Multinomial and Related Distributions\n", "\n", "- In the above derivation, we noticed that the data generating distribution for $N$ die tosses $D=\\{x_1,\\ldots,x_N\\}$ only depends on the **data frequencies** $m_k$:\n", "$$\n", "p(D|\\mu) = \\prod_n \\underbrace{\\prod_k \\mu_k^{x_{nk}}}_{\\text{categorical dist.}} = \\prod_k \\mu_k^{\\sum_n x_{nk}} = \\prod_k \\mu_k^{m_k} \\tag{B-2.29}\n", "$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- A related distribution is the distribution over data frequency observations $D_m=\\{m_1,\\ldots,m_K\\}$, which is called the **multinomial distribution**,\n", "$$\n", "p(D_m|\\mu) =\\frac{N!}{m_1! m_2!\\ldots m_K!} \\,\\prod_k \\mu_k^{m_k}\\,.\n", "$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- When used as a likelihood function for $\\mu$, it makes no difference whether you use $p(D|\\mu)$ or $p(D_m|\\mu)$. Why? " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- Verify for yourself that ([Exercise](http://nbviewer.jupyter.org/github/bertdv/BMLIP/blob/master/lessons/exercises/Exercises-The-Multinomial-Distribution.ipynb)): \n", "  - the categorical distribution is a special case of the multinomial for $N=1$. \n", "  - the Bernoulli is a special case of the categorical distribution for $K=2$.\n", "  - the binomial is a special case of the multinomial for $K=2$.\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Maximum Likelihood Estimation for the Multinomial\n", "\n", "##### Maximum likelihood as a special case of Bayesian estimation\n", "- We can get the maximum likelihood estimate $\\hat{\\mu}_k$ for $\\mu_k$ based on $N$ throws of a $K$-sided die from the Bayesian framework by using a uniform prior for $\\mu$ and taking the mode of the posterior for $\\mu$:\n", "$$\\begin{align*}\n", "\\hat{\\mu}_k &= \\arg\\max_{\\mu_k} p(D|\\mu) \\\\\n", "&= \\arg\\max_{\\mu_k} p(D|\\mu)\\cdot \\mathrm{Uniform}(\\mu) \\\\\n", "&= \\arg\\max_{\\mu_k} p(D|\\mu) \\cdot \\left.\\mathrm{Dir}(\\mu|\\alpha)\\right|_{\\alpha=(1,1,\\ldots,1)} \\\\\n", "&= \\arg\\max_{\\mu_k} \\left.p(\\mu|D,\\alpha)\\right|_{\\alpha=(1,1,\\ldots,1)} \\\\\n", "&= \\arg\\max_{\\mu_k} \\left.\\mathrm{Dir}\\left( \\mu | m + \\alpha \\right)\\right|_{\\alpha=(1,1,\\ldots,1)} \\\\\n", "&= \\frac{m_k}{\\sum_k m_k} = \\frac{m_k}{N}\n", "\\end{align*}$$\n", "where we used the fact that the [mode of the Dirichlet distribution](https://en.wikipedia.org/wiki/Dirichlet_distribution#Mode) $\\mathrm{Dir}(\\mu|\\alpha_1,\\ldots,\\alpha_K)$ is located at $\\mu_k = (\\alpha_k-1)/(\\sum_k\\alpha_k - K)$." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "##### Maximum likelihood estimation by optimizing a constrained log-likelihood\n", "- Of course, we shouldn't have to go through the full Bayesian framework to get the maximum likelihood estimate. Alternatively, we can find the maximum likelihood (ML) solution directly by optimizing the (constrained) log-likelihood.\n", "\n", "- The log-likelihood for the multinomial distribution is given by\n", "\n", "$$\\begin{align*}\n", "\\mathrm{L}(\\mu) &\\triangleq \\log p(D_m|\\mu) \\propto \\log \\prod_k \\mu_k^{m_k} = \\sum_k m_k \\log \\mu_k \n", "\\end{align*}$$ \n", "\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- When doing ML estimation, we must obey the constraint $\\sum_k \\mu_k = 1$, which can be accomplished by a [Lagrange multiplier](https://en.wikipedia.org/wiki/Lagrange_multiplier). The **constrained log-likelihood** with Lagrange multiplier is then\n", "\n", "$$\n", "\\tilde{\\mathrm{L}}(\\mu) = \\sum_k m_k \\log \\mu_k + \\lambda \\cdot \\big(1 - \\sum_k \\mu_k \\big)\n", "$$\n", "\n", "- The method of Lagrange multipliers is a mathematical method for transforming a constrained optimization problem into an unconstrained optimization problem (see Bishop App.E). Unconstrained optimization problems can be solved by setting the derivative to zero." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- Setting the derivative of $\\tilde{\\mathrm{L}}(\\mu)$ to zero yields the **sample proportion** for $\\mu_k$:\n", "$$\\begin{equation*}\n", "\\nabla_{\\mu_k} \\tilde{\\mathrm{L}}(\\mu) = \\frac{m_k }{\\hat\\mu_k } - \\lambda \\overset{!}{=} 0 \\; \\Rightarrow \\; \\hat\\mu_k = \\frac{m_k }{\\lambda}\\,,\n", "\\end{equation*}$$\n", "where the constraint fixes $\\lambda = N$, since\n", "$$\\begin{equation*}\n", "\\sum_k \\hat \\mu_k = \\sum_k \\frac{m_k}{\\lambda} = \\frac{N}{\\lambda} \\overset{!}{=} 1\\,,\n", "\\end{equation*}$$\n", "so that $\\hat\\mu_k = \\frac{m_k}{N}$." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Recap Maximum Likelihood Estimation for Gaussian and Multinomial Distributions\n", "\n", "Given $N$ IID observations $D=\\{x_1,\\dotsc,x_N\\}$.\n", "\n", "- For a **multivariate Gaussian** model $p(x_n|\\theta) = \\mathcal{N}(x_n|\\mu,\\Sigma)$, we obtain ML estimates\n", "\n", "$$\\begin{align*}\n", "\\hat \\mu &= \\frac{1}{N} \\sum_n x_n \\qquad &\\text{(sample mean)} \\\\\n", "\\hat \\Sigma &= \\frac{1}{N} \\sum_n (x_n-\\hat\\mu)(x_n - \\hat \\mu)^T \\qquad &\\text{(sample covariance)}\n", "\\end{align*}$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- For discrete outcomes modeled by a 1-of-K **categorical distribution** we find\n", "\n", "$$\\begin{align*}\n", "\\hat\\mu_k = \\frac{1}{N} \\sum_n x_{nk} \\quad \\left(= \\frac{m_k}{N} \\right) \\qquad \\text{(sample proportion)}\n", "\\end{align*}$$\n", " \n", " " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ " \n", "- Note the similarity for the means between discrete and continuous data. " ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "open(\"../../styles/aipstyle.html\") do f\n", "    display(\"text/html\", read(f, String))\n", "end" ] } ], "metadata": { "@webio": { "lastCommId": null, "lastKernelId": null }, "anaconda-cloud": {}, "celltoolbar": "Slideshow", "kernelspec": { "display_name": "Julia 1.8.2", "language": "julia", "name": "julia-1.8" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "1.8.2" } }, "nbformat": 4, "nbformat_minor": 4 }