{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Week 13 Day 1: Review\n",
    "\n",
    "| Week | Monday                       | Wednesday                  | Friday              |\n",
    "|------|------------------------------|----------------------------|---------------------|\n",
    "| 1  | 8-27  Introduction             | 8-29 Using Python part 1   | 8-31 Using Python part 2 |\n",
    "| 2  | 9-03  *Labor day*              | 9-05 OO programming (4)    | 9-07 Error accumulation (2) |\n",
    "| 3  | 9-10  Numerical tools          | 9-12 Plotting (3)          | 9-14 Using git |\n",
    "| 4  | 9-17  Random numbers (5)       | 9-19 Monte Carlo (5)       | 9-21  **Project selection** |\n",
    "| 5  | 9-24  Integration rules (6)    | 9-26 MC Integration (6)    | 9-28 Numerical differentiation (7) |\n",
    "| 6  | 10-01 Vectorization (8)        | 10-03 Linear algebra (8)   | 10-05 Linear regression (8) |\n",
    "| 7  | 10-08 Structured tabular data  | 10-10 Cuts and histograms  | 10-12 *Fall reading days* |\n",
    "| 8  | 10-15 Generating distributions | 10-17 Minimization and fitting | 10-19 Fitting tools |\n",
    "| 9  | 10-22 Confidence intervals     | 10-24 Markov Chain Monte Carlo | 10-26 Performance computing (14) |\n",
    "| 10 | 10-29 Intro to ODEs (9)        | 10-31 Runge–Kutta algorithm (9) | 11-02 Solving ODE problems (9) |\n",
    "| 11 | 11-05 Fourier Series (10)      | 11-07 Fast Fourier Transform (10) | 11-09 **Project progress report** |\n",
    "| 12 | 11-12 *Veterans day*           | 11-14 **Project progress report** | 11-16 Filtering signals (10) |\n",
    "| 13 | 11-19 **Review**               | 11-21 **Student requested topics** | 11-23 *Thanksgiving* |\n",
    "| 14 | 11-26 Static computation graphs | 11-28 Applied ML topics | 11-30 Sharing and documenting code |\n",
    "| 15 | 12-03 **Term project presentations** | 12-05 **Term project presentations** | 12-07 **Term project presentations** |"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import math\n",
    "import numpy as np\n",
    "import numba\n",
    "import pandas as pd\n",
    "import matplotlib.pyplot as plt\n",
    "import seaborn as sns\n",
    "from scipy import integrate, optimize, stats, misc"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2.3: Error accumulation\n",
    "\n",
    "Compare several summing methods to the very accurate \"fsum\"; which fairs best?\n",
    "\n",
    "<!--\n",
    "print(fsummed - sum(vals))\n",
    "print(fsummed - np.sum(vals))\n",
    "-->"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "np.random.seed(42)\n",
    "vals = np.random.normal(0, 1, 1_000_000)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fsummed = math.fsum(vals)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3.3: Using git\n",
    "\n",
    "Describe the following commands:\n",
    "\n",
    "1. `git add -u .`\n",
    "2. `git commit -m \"hello\"`\n",
    "3. `git push`\n",
    "4. `git grep \"hello\"`\n",
    "\n",
    "<!--\n",
    "1) Adds all changes (not new files) in the current directory or below into the staging area\n",
    "2) Saves the staging area into history locally, with a message of \"hello\"\n",
    "3) Pushes changes in history to a remote (like github)\n",
    "4) Looks for the phrase \"hello\" in the current repository\n",
    "-->"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4.1: Integration rules\n",
    "\n",
    "Integrate the following function. You *should* use tools provided; don't write your own integrator unless it is necessary for some reason!\n",
    "\n",
    "$$\n",
    "f(x) = \\int _0 ^1 \\sin(\\cos x) \\, dx \\approx 0.73864...\n",
    "$$\n",
    "\n",
    "(From [Wolfram Alpha](https://www.wolframalpha.com/input/?i=integrate+sin(cos+x)+from+x%3D0+to+1&lk=3))\n",
    "\n",
    "<!--\n",
    "integrate.quad(f,0,1)[0]\n",
    "-->"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def f(x):\n",
    "    return np.sin(np.cos(x))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4.2: MC Integration\n",
    "\n",
    "What situations would you want to use MC integration instead of a regular grid and an integration rule?\n",
    "\n",
    "Integrate the above function with MC. Note there is not a built in function to do so, since it's so easy to do.\n",
    "\n",
    "<!--\n",
    "arr = np.random.rand(10_000_000)\n",
    "np.mean(f(arr))\n",
    "-->"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4.3: Numerical differentiation\n",
    "\n",
    "Can you find a numerical derivative for the function above at 1/2? Analytic solution shown.\n",
    "\n",
    "<!--\n",
    "misc.derivative(f, .5, .00001)\n",
    "(f(.50001)-f(.49999)) / .00002\n",
    "-->"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "real_deriv = -np.cos(np.cos(0.5)) * np.sin(0.5)\n",
    "real_deriv"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5.3, 8.2, 8.3 : Fitting\n",
    "\n",
    "We've already done a bit of this, so let's just hit a highlight or two.\n",
    "\n",
    "Functions:\n",
    "\n",
    "* Interpolation: connect the dots. Cubic splines were probably the nicest visual interpolation method.\n",
    "* Linear regression: `np.polyfit` or `np.linalg.lstsq`\n",
    "* Non-linear least squares: `optimize.curve_fit`\n",
    "\n",
    "Distributions:\n",
    "\n",
    "* Binned fits: See above, be careful and include error\n",
    "* Unbinned fits: NLL (use `optimize.minimize` or custom package)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Try a quick polyfit here:\n",
    "\n",
    "<!--\n",
    "c = np.polyfit(x,y,1)\n",
    "yy = np.poly1d(c)(x)\n",
    "...\n",
    "plt.plot(x,yy)\n",
    "print(*c)\n",
    "-->"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "np.random.seed(42)\n",
    "x = np.linspace(0, 2, 200)\n",
    "y = x * 2 + 0.5 + np.random.normal(0, 0.5, 200)\n",
    "\n",
    "# Add fit here\n",
    "\n",
    "plt.plot(x, y, \".\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 6.1, 6.2: Structured tabular data and cuts\n",
    "\n",
    "Play with the following dataframe. Make a scatter plot of `sepal_width` vs. `sepal_length` with each of the three species a different color.\n",
    "\n",
    "<!--\n",
    "sp = df.groupby('species')\n",
    "print(*sp.groups.keys())\n",
    "#\n",
    "for name, group in sp:\n",
    "    locals()[name] = group\n",
    "#\n",
    "plt.plot('sepal_length', 'sepal_width', '.', data=setosa, label='setosa')\n",
    "plt.plot('sepal_length', 'sepal_width', '.', data=versicolor, label='versicolor')\n",
    "plt.plot('sepal_length', 'sepal_width', '.', data=virginica, label='virginica')\n",
    "plt.xlabel('Sepal length')\n",
    "plt.ylabel('Sepal width')\n",
    "plt.legend(title='Species')\n",
    "plt.show()\n",
    "-->"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df = pd.read_csv(\n",
    "    \"https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv\"\n",
    ")\n",
    "\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note, we could do this with seaborn:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "sns.scatterplot(\"sepal_length\", \"sepal_width\", \"species\", data=df)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 9.2: MCMC\n",
    "\n",
    "Let's look at the metropolis algorithm (MCMC but without computing a posterior, so simpler) to produce samples from $\\sin(x)**2 / x**2$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def p(x):\n",
    "    return np.sin(x) ** 2 / x ** 2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Normalization not needed for algorithm, but makes the plots line up:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "integ, err = integrate.quad(p, -15, 15.000001)\n",
    "print(integ, err)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "x1 = np.linspace(-15, 15, 400)\n",
    "y1 = p(x1) / integ"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now compute the metropolis algorithm:\n",
    "\n",
    "<!--\n",
    "@numba.njit\n",
    "def metro(p, samples, sigma):\n",
    "    x = np.zeros(samples+1)\n",
    "    x[0] = np.random.rand()\n",
    "#\n",
    "    for i in range(samples):\n",
    "        # suggest new position\n",
    "        x_Star = np.random.normal(x[0], sigma)\n",
    "#\n",
    "        # Compute alpha - the fractional chance of moving to a new point\n",
    "        alpha = p(x_Star)/p(x[i])\n",
    "#\n",
    "        # Accept/reject based on alpha\n",
    "        accept = alpha > np.random.rand()\n",
    "#\n",
    "        # Add the current (moved?) point\n",
    "        x[i+1] = x_Star if accept else x[i]\n",
    "#\n",
    "    return x\n",
    "-->"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "samples = 100_000\n",
    "sigma = 10\n",
    "x = np.empty(samples + 1)\n",
    "x[0] = np.random.rand()\n",
    "\n",
    "for i in range(samples):\n",
    "    # suggest new position xStar\n",
    "    ...\n",
    "\n",
    "    # Compute alpha - the fractional chance of moving to a new point\n",
    "    ...\n",
    "\n",
    "    # Accept/reject based on alpha\n",
    "    ...\n",
    "\n",
    "    # Add the current (moved?) point\n",
    "    ..."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.plot(x1, y1)\n",
    "# plt.hist(x[500:], density=True, bins=100)\n",
    "# plt.yscale('log')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Side project: This is great, but very slow. We can make this a function, and add a couple of `@numba.njit`'s, and get a massive speedup!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 10.2: Runge-Kutta algorithms\n",
    "\n",
    "Let's look at ODEs, and revisit the RK2 algorithm. Remember you can do the following:\n",
    "\n",
    "$$\n",
    "\\mathbf{u} =\n",
    "\\left(\n",
    "\\begin{matrix}\n",
    "\\dot{x} \\\\\n",
    "x\n",
    "\\end{matrix}\n",
    "\\right)\n",
    "$$\n",
    "\n",
    "$$\n",
    "\\mathbf{f}(t, \\mathbf{u}) =\n",
    "\\dot{\\mathbf{u}}\n",
    "$$\n",
    "\n",
    "To get a first order ODE."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We'll start with one, though:\n",
    "\n",
    "$$\n",
    "u =\n",
    "\\left(\n",
    "\\begin{matrix}\n",
    "x \\\\\n",
    "y \\\\\n",
    "z\n",
    "\\end{matrix}\n",
    "\\right)\n",
    "$$\n",
    "\n",
    "$$\n",
    "\\dot{u} =\n",
    "\\left(\n",
    "\\begin{matrix}\n",
    "\\sigma (y - x) \\\\\n",
    "\\rho x - x z - y \\\\\n",
    "x y - \\beta z\n",
    "\\end{matrix}\n",
    "\\right)\n",
    "$$\n",
    "\n",
    "Constants given below. (example from [Bokeh](https://bokeh.pydata.org/en/latest/docs/gallery/lorenz.html))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "sigma = 10\n",
    "rho = 28\n",
    "beta = 8 / 3\n",
    "theta = 3 * np.pi / 4\n",
    "\n",
    "\n",
    "def lorenz(t, xyz):\n",
    "    x, y, z = xyz\n",
    "    x_dot = sigma * (y - x)\n",
    "    y_dot = x * rho - x * z - y\n",
    "    z_dot = x * y - beta * z\n",
    "    return np.array([x_dot, y_dot, z_dot])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "initial = (-10, -7, 35)\n",
    "t = np.arange(0, 50, 0.006)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### RK2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def rk2_ivp(f, init_y, t):\n",
    "    steps = len(t)\n",
    "    order = len(init_y)\n",
    "\n",
    "    y = np.empty((steps, order))\n",
    "    y[0] = init_y\n",
    "\n",
    "    for n in range(steps - 1):\n",
    "        h = t[n + 1] - t[n]\n",
    "\n",
    "        k1 = h * f(t[n], y[n])  # 1.1\n",
    "        k2 = h * f(t[n] + h / 2, y[n] + k1 / 2)  # 1.2\n",
    "\n",
    "        y[n + 1] = y[n] + k2  # 1.0\n",
    "\n",
    "    return y"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "xyz = rk2_ivp(lorenz, initial, t)\n",
    "xprime = np.cos(theta) * xyz[:, 0] - np.sin(theta) * xyz[:, 1]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.figure(figsize=(9, 8))\n",
    "plt.plot(xprime, xyz[:, 2])\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### RK4"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def rk4_ivp(f, init_y, t):\n",
    "    steps = len(t)\n",
    "    order = len(init_y)\n",
    "\n",
    "    y = np.empty((steps, order))\n",
    "    y[0] = init_y\n",
    "\n",
    "    for n in range(steps - 1):\n",
    "        h = t[n + 1] - t[n]\n",
    "\n",
    "        k1 = h * f(t[n], y[n])  # 2.1\n",
    "        k2 = h * f(t[n] + h / 2, y[n] + k1 / 2)  # 2.2\n",
    "        k3 = h * f(t[n] + h / 2, y[n] + k2 / 2)  # 2.3\n",
    "        k4 = h * f(t[n] + h, y[n] + k3)  # 2.4\n",
    "\n",
    "        y[n + 1] = y[n] + 1 / 6 * (k1 + 2 * k2 + 2 * k3 + k4)  # 2.0\n",
    "\n",
    "    return y"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "xyz = rk4_ivp(lorenz, initial, t)\n",
    "xprime = np.cos(theta) * xyz[:, 0] - np.sin(theta) * xyz[:, 1]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.figure(figsize=(9, 8))\n",
    "plt.plot(xprime, xyz[:, 2])\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Solve IVP"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "res = integrate.solve_ivp(lorenz, (0, 100), initial, t_eval=t)\n",
    "xprime = np.cos(theta) * res.y[0] - np.sin(theta) * res.y[1]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.figure(figsize=(9, 8))\n",
    "plt.plot(xprime, res.y[2])\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Ode Int\n",
    "\n",
    "Let's try this with ODE Int, since that's in older SciPy's. The main difference is the reversed order for f, and"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def lorenz_flip(xyz, t):\n",
    "    return lorenz(t, xyz)\n",
    "\n",
    "\n",
    "xyz = integrate.odeint(lorenz_flip, initial, t)\n",
    "xprime = np.cos(theta) * xyz[:, 0] - np.sin(theta) * xyz[:, 1]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.figure(figsize=(9, 8))\n",
    "plt.plot(xprime, xyz[:, 2])\n",
    "plt.show()"
   ]
  }
 ],
 "metadata": {
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}