{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# The General EM Algorithm" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "\n", "### Preliminaries\n", "\n", "- Goal \n", " - A more formal treatment of the general EM algorithm \n", "- Materials\n", " - Mandatory\n", " - These lecture notes\n", " - Optional \n", " - Bishop pp. 55-57 for Jensen's inequality\n", " - Bishop pp. 439-443 for EM applied to GMM\n", " - Bishop pp. 450-455 for the general EM algorithm\n", " - Liang (2015) [Technical Details about the EM algorithm](./files/Liang-2015-technical-details-about-the-EM-algorithm.pdf)\n", " - Bo and Batzoglou (2008) [What is the expectation algorithm?](./files/Bo-2008-What-is-the-expectation-algorithm.pdf)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### The Kullback-Leibler Divergence\n", "\n", "- In order to prove that the EM algorithm works, we will need [Gibbs inequality](https://en.wikipedia.org/wiki/Gibbs%27_inequality), which is a famous theorem in information theory.\n", "\n", "- Definition: the **Kullback-Leibler divergence** (a.k.a. **relative entropy**) is a distance measure between two distributions $q$ and $p$ and is defined as\n", "\n", "$$\n", "D_{\\text{KL}}(q \\parallel p) \\triangleq \\sum_z q(z) \\log \\frac{q(z)}{p(z)}\n", "$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- Theorem: **Gibbs Inequality** ([proof](https://en.wikipedia.org/wiki/Gibbs%27_inequality#Proof) uses Jensen inquality): \n", "$$\\boxed{ D_{\\text{KL}}(q \\parallel p) \\geq 0 }$$\n", "with equality only iff $p=q$." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- Note that the KL divergence is an asymmetric distance measure, i.e. in general \n", "\n", "$$D_{\\text{KL}}(q \\parallel p) \\neq D_{\\text{KL}}(p \\parallel q)$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### EM by maximizing a Lower Bound on the Log-Likelihood\n", "\n", "- Consider a model for observations $x$, hidden variables $z$ and tuning parameters $\\theta$. Note that, for **any** distribution $q(z)$, we can expand the log-likelihood as follows: \n", "\n", "$$\\begin{align*}\n", "\\mathrm{L}(\\theta) &\\triangleq \\log p(x|\\theta) \\\\\n", " &= \\sum_z q(z) \\log p(x|\\theta) \\\\\n", " &= \\sum_z q(z) \\left( \\log p(x|\\theta) - \\log \\frac{q(z)}{p(z|x,\\theta)}\\right) + \\sum_z q(z) \\log \\frac{q(z)}{p(z|x,\\theta)} \\\\\n", " &= \\sum_z q(z) \\log \\frac{p(x,z|\\theta)}{q(z)} + \\underbrace{D_{\\text{KL}}\\left( q(z) \\parallel p(z|x,\\theta) \n", "\\right)}_{\\text{Kullback-Leibler div.}} \\tag{1} \\\\\n", " &\\geq \\sum_z q(z) \\log \\frac{p(x,z|\\theta)}{q(z)} \\quad \\text{(use Gibbs inequality)} \\\\\n", " &= \\underbrace{\\sum_z q(z) \\log p(x,z|\\theta)}_{\\text{expected complete-data log-likelihood}} + \\underbrace{\\mathcal{H}\\left[ q\\right]}_{\\text{entropy of }q} \\\\\n", "&\\triangleq \\mathrm{LB}(q,\\theta) \n", "\\end{align*}$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "- Hence, $\\mathrm{LB}(q,\\theta)$ is a **lower bound** on the log-likelihood $\\log p(x|\\theta)$. 
" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- Technically, the Expectation-Maximization (EM) algorithm is defined by coordinate ascent on $\\mathrm{LB}(q,\\theta)$:\n", "\n", "$$\\begin{align*}\n", " &\\textrm{ } \\\\\n", " &\\textrm{Initialize }: \\theta^{(0)}\\\\\n", " &\\textrm{for }m = 1,2,\\ldots \\textrm{until convergence}\\\\\n", " &\\quad q^{(m+1)} = \\arg\\max_q \\mathrm{LB}(q,\\theta^{(m)}) &\\quad \\text{% update responsibilities} \\\\\n", " &\\quad \\theta^{(m+1)} = \\arg\\max_\\theta \\mathrm{LB}(q^{(m+1)},\\theta) &\\quad \\text{% re-estimate parameters}\n", "\\end{align*}$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "- Since $\\mathrm{LB}(q,\\theta) \\leq \\mathrm{L}(\\theta)$ (for all choices of $q(z)$), maximizing the lower-bound $\\mathrm{LB}$ will also maximize the log-likelihood wrt $\\theta$. The _reason_ to maximize $\\mathrm{LB}$ rather than log-likelihood $\\mathrm{L}$ directly is that $\\arg\\max \\mathrm{LB}$ often leads to easier expressions. E.g., see this illustrative figure (Bishop p.453):\n", "\n", "" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### EM Details" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "##### Maximizing $\\mathrm{LB}$ w.r.t. $q$ \n", "\n", "- Note that\n", "$$\n", "\\mathrm{LB}(q,\\theta) = \\mathrm{L}(\\theta) - D_{\\text{KL}}\\left( q(z) \\parallel p(z|x,\\theta) \\right)\n", "$$\n", "and consequenty, maximizing $\\mathrm{LB}$ over $q$ leads to minimization of the KL-divergence. Specifically, it follows from Gibbs inequality that\n", "$$\n", " q^{(m+1)}(z) := p(z|x,\\theta^{(m)})\n", "$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "##### Maximizing $\\mathrm{LB}$ w.r.t. $\\theta$\n", "\n", "- It also follows from (the last line of) the multi-line derivation above that maximizing $\\mathrm{LB}$ w.r.t. $\\theta$ amounts to maximization of the _expected complete-data log-likelihood_ (where the complete data set is defined as $\\{(x_i,z_i)\\}_{i=1}^N$). Hence, the EM algorithm comprises iterations of\n", "$$\n", "\\boxed{\\textbf{EM}:\\, \\theta^{(m+1)} := \\underbrace{\\arg\\max_\\theta}_{\\text{M-step}} \\underbrace{\\sum_z \\overbrace{p(z|x,\\theta^{(m)})}^{=q^{(m+1)}(z)} \\log p(x,z|\\theta)}_{\\text{E-step}} }\n", "$$\n", "\n", "" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "\n", "### Hoemwork exercise: EM for GMM Revisited\n", "\n", "##### E-step\n", "- Write down the GMM generative model\n", "- The complete-data set is $D_c=\\{x_1,z_1,x_2,z_2,\\ldots,x_n,z_n\\}$. Write down the _complete-data_ likelihood $p(D_c|\\theta)$\n", "- Write down the complete-data _log_-likelihood $\\log p(D_c|\\theta)$\n", "- Write down the _expected_ complete-data log-likelihood $\\mathrm{E}_Z\\left[ \\log p(D_c|\\theta) \\right]$\n", "\n", "\n", "##### M-step\n", "- Maximize $\\mathrm{E}_Z\\left[ \\log p(D_c|\\theta) \\right]$ w.r.t. $\\theta=\\{\\pi,\\mu,\\Sigma\\}$\n", "\n", "- Verify that your solution is the same as the 'intuitive' solution of the previous lesson. " ] }, { "cell_type": "markdown", "metadata": { "collapsed": true, "slideshow": { "slide_type": "slide" } }, "source": [ "### Homework Exercise: EM for Three Coins problem\n", "\n", "- You have three coins in your pocket. 
{ "cell_type": "markdown", "metadata": { "collapsed": true, "slideshow": { "slide_type": "slide" } }, "source": [ "### Homework Exercise: EM for Three Coins problem\n", "\n", "- You have three coins in your pocket. For each coin, outcomes $\\in \\{\\mathrm{H},\\mathrm{T}\\}$.\n", "$$\n", "p(\\mathrm{H}) = \\begin{cases} \\lambda & \\text{for coin }0 \\\\\n", " \\rho & \\text{for coin }1 \\\\\n", " \\theta & \\text{for coin }2 \\end{cases}\n", "$$\n", "\n", " \n", "- **Scenario**. Toss coin $0$. If Heads comes up, toss coin $1$ three times; otherwise, toss coin $2$ three times.\n", "\n", "- The observed sequences **after** each toss of coin $0$ were $\\langle \\mathrm{HHH}\\rangle$, $\\langle \\mathrm{HTH}\\rangle$, $\\langle \\mathrm{HHT}\\rangle$, and $\\langle\\mathrm{HTT}\\rangle$ (the outcomes of coin $0$ itself were not observed)\n", "\n", "- **Task**. Use EM to estimate the most likely values for $\\lambda$, $\\rho$ and $\\theta$\n", "\n", "\n", "\n" ] },
{ "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Report Card on EM\n", "\n", "- EM is a general procedure for learning in the presence of unobserved variables." ] },
{ "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- In a sense, it is a **family of algorithms**. The update rules you will derive depend on the probability model assumed." ] },
{ "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- (Good!) **No tuning parameters** such as a learning rate, unlike gradient descent-type algorithms" ] },
{ "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- (Bad!) EM is an iterative procedure that is very sensitive to initial conditions! EM converges to a **local optimum**." ] },
{ "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- Start from trash $\\rightarrow$ end up with trash. Hence, we need a good and fast initialization procedure (often used: K-Means, see Bishop p.424)" ] },
{ "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- EM is also used to train hidden Markov models (HMMs), among other latent-variable models." ] },
{ "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### (OPTIONAL SLIDE) The Free-Energy Principle\n", "\n", "- The negative lower bound $\\mathrm{F}(q,\\theta) \\triangleq -\\mathrm{LB}(q,\\theta)$ also appears in various scientific disciplines. In statistical physics and variational calculus, $F$ is known as the **free energy** functional. Hence, the EM algorithm is a special case of **free energy minimization**. " ] },
{ "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- It follows from Eq. 1 above that\n", "\n", "$$\\begin{align*}\n", "\\mathrm{F}(q,\\theta) &\\triangleq \\sum_z q(z) \\log \\frac{q(z)}{p(x,z|\\theta)} \\\\\n", " &= \\underbrace{- \\mathrm{L}(\\theta)}_{-\\text{log-likelihood}} + \\underbrace{D_{\\text{KL}}\\left( q(z) \\parallel p(z|x,\\theta) \n", "\\right)}_{\\text{divergence}}\n", "\\end{align*}$$ " ] },
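{ "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- The optional cell below illustrates this numerically with the same toy two-component mixture used earlier in these notes (all values are arbitrary illustration choices): scanning a grid of candidate distributions $q$, the free energy is minimized (up to the grid resolution) at the posterior $p(z|x,\\theta)$, where it equals $-\\mathrm{L}(\\theta)$." ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "# Free energy F(q,θ) = Σ_z q(z) log( q(z) / p(x,z|θ) ) for the toy mixture:\n", "# minimizing F over q recovers the posterior, and min_q F(q,θ) = -log p(x|θ).\n", "\n", "gausspdf(x, m, s) = exp(-0.5*((x - m)/s)^2) / (s*sqrt(2*pi))\n", "x = 1.5; w = [0.3, 0.7]; m = [0.0, 2.0]; s = [1.0, 1.0]   # illustrative values\n", "joint = [w[k]*gausspdf(x, m[k], s[k]) for k in 1:2]       # p(x, z=k | θ)\n", "\n", "F(q) = sum(q[k]*log(q[k]/joint[k]) for k in 1:2)          # free energy functional\n", "\n", "qgrid = 0.01:0.01:0.99                                    # candidate values for q(z=1)\n", "Fs    = [F([q1, 1 - q1]) for q1 in qgrid]\n", "println(\"min_q F(q)   = \", minimum(Fs))\n", "println(\"-log p(x|θ)  = \", -log(sum(joint)))              # approx. equal to min_q F(q)\n", "println(\"argmin q(1)  = \", qgrid[findmin(Fs)[2]], \"   posterior p(z=1|x,θ) = \", joint[1]/sum(joint))" ] },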
{ "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- The **Free-Energy Principle** (FEP) is an influential [neuroscientific theory](https://en.wikipedia.org/wiki/Free_energy_principle) that claims that information processing in the brain is also an example of free-energy minimization, see [Friston, 2009](./files/Friston-2009-The-free-energy-principle-a-rough-guide-to-the-brain.pdf)." ] },
{ "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- According to FEP, the brain contains a generative model $p(x,z,a,\\theta)$ for its environment. Here, $x$ are the sensory signals (observations); $z$ corresponds to (hidden) environmental causes of the sensorium; $a$ represents our actions (control signals for movement); and $\\theta$ describes the fixed 'physical rules' of the world." ] },
{ "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### (OPTIONAL SLIDE) The Free-Energy Principle, cont'd\n", "\n", "- Solely through free-energy minimization, the brain infers an approximate posterior $q(z,a,\\theta)$; this single inference process gives rise to our **perception** (inference of $z$), **actions** (inference of $a$), and **learning** (inference of $\\theta$)." ] },
{ "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- Free energy can be interpreted as (a generalized notion of the sum of) prediction errors. Free-energy minimization aims to minimize prediction errors through perception, actions, and learning. (The next picture is from a [2012 tutorial presentation](http://slideplayer.com/slide/9925565/) by Karl Friston) " ] },
{ "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "" ] },
{ "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### (OPTIONAL SLIDE) The Free-Energy Principle, cont'd\n", "\n", "- $\\Rightarrow$ The brain is \"nothing but\" an approximate Bayesian agent that tries to build a model for its world. Actions (behavior) are selected to fulfill prior expectations about the world. " ] },
{ "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- In our [BIASlab research team](http://biaslab.org) in the Signal Processing Systems group (FLUX floor 7), we work on developing (approximately Bayesian) **artificial** intelligent agents that learn purposeful behavior through interactions with their environments, inspired by the free-energy principle. Applications include\n", "  - robotics\n", "  - self-learning of new signal processing algorithms, e.g., for hearing aids\n", "  - self-learning how to drive\n", "\n" ] },
{ "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- Let me know (email bert.de.vries@tue.nl) if you're interested in developing machine learning applications that are directly inspired by how the brain computes. We often have open traineeships or MSc thesis projects available. " ] },
{ "cell_type": "markdown", "metadata": { "collapsed": true, "slideshow": { "slide_type": "skip" } }, "source": [ "---\n", "The cell below loads the style file" ] },
{ "cell_type": "code", "execution_count": 1, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "open(\"../../styles/aipstyle.html\") do f\n", "    display(\"text/html\", readstring(f))\n", "end" ] } ], "metadata": { "anaconda-cloud": {}, "celltoolbar": "Slideshow", "kernelspec": { "display_name": "Julia 0.6.1", "language": "julia", "name": "julia-0.6" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "0.6.1" } }, "nbformat": 4, "nbformat_minor": 1 }