{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Week 13 Day 1: Review\n", "\n", "| Week | Monday | Wednesday | Friday |\n", "|------|------------------------------|----------------------------|---------------------|\n", "| 1 | 8-27 Introduction | 8-29 Using Python part 1 | 8-31 Using Python part 2 |\n", "| 2 | 9-03 *Labor day* | 9-05 OO programming (4) | 9-07 Error accumulation (2) |\n", "| 3 | 9-10 Numerical tools | 9-12 Plotting (3) | 9-14 Using git |\n", "| 4 | 9-17 Random numbers (5) | 9-19 Monte Carlo (5) | 9-21 **Project selection** |\n", "| 5 | 9-24 Integration rules (6) | 9-26 MC Integration (6) | 9-28 Numerical differentiation (7) |\n", "| 6 | 10-01 Vectorization (8) | 10-03 Linear algebra (8) | 10-05 Linear regression (8) |\n", "| 7 | 10-08 Structured tabular data | 10-10 Cuts and histograms | 10-12 *Fall reading days* |\n", "| 8 | 10-15 Generating distributions | 10-17 Minimization and fitting | 10-19 Fitting tools |\n", "| 9 | 10-22 Confidence intervals | 10-24 Markov Chain Monte Carlo | 10-26 Performance computing (14) |\n", "| 10 | 10-29 Intro to ODEs (9) | 10-31 Runge–Kutta algorithm (9) | 11-02 Solving ODE problems (9) |\n", "| 11 | 11-05 Fourier Series (10) | 11-07 Fast Fourier Transform (10) | 11-09 **Project progress report** |\n", "| 12 | 11-12 *Veterans day* | 11-14 **Project progress report** | 11-16 Filtering signals (10) |\n", "| 13 | 11-19 **Review** | 11-21 **Student requested topics** | 11-23 *Thanksgiving* |\n", "| 14 | 11-26 Static computation graphs | 11-28 Applied ML topics | 11-30 Sharing and documenting code |\n", "| 15 | 12-03 **Term project presentations** | 12-05 **Term project presentations** | 12-07 **Term project presentations** |" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import math\n", "import numpy as np\n", "import numba\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "import seaborn as sns\n", "from scipy import integrate, optimize, stats, misc" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2.3: Error accumulation\n", "\n", "Compare several summing methods to the very accurate \"fsum\"; which fairs best?\n", "\n", "" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.random.seed(42)\n", "vals = np.random.normal(0, 1, 1_000_000)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fsummed = math.fsum(vals)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3.3: Using git\n", "\n", "Describe the following commands:\n", "\n", "1. `git add -u .`\n", "2. `git commit -m \"hello\"`\n", "3. `git push`\n", "4. `git grep \"hello\"`\n", "\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 4.1: Integration rules\n", "\n", "Integrate the following function. You *should* use tools provided; don't write your own integrator unless it is necessary for some reason!\n", "\n", "$$\n", "f(x) = \\int _0 ^1 \\sin(\\cos x) \\, dx \\approx 0.73864...\n", "$$\n", "\n", "(From [Wolfram Alpha](https://www.wolframalpha.com/input/?i=integrate+sin(cos+x)+from+x%3D0+to+1&lk=3))\n", "\n", "" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def f(x):\n", " return np.sin(np.cos(x))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 4.2: MC Integration\n", "\n", "What situations would you want to use MC integration instead of a regular grid and an integration rule?\n", "\n", "Integrate the above function with MC. Note there is not a built in function to do so, since it's so easy to do.\n", "\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 4.3: Numerical differentiation\n", "\n", "Can you find a numerical derivative for the function above at 1/2? Analytic solution shown.\n", "\n", "" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "real_deriv = -np.cos(np.cos(0.5)) * np.sin(0.5)\n", "real_deriv" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 5.3, 8.2, 8.3 : Fitting\n", "\n", "We've already done a bit of this, so let's just hit a highlight or two.\n", "\n", "Functions:\n", "\n", "* Interpolation: connect the dots. Cubic splines were probably the nicest visual interpolation method.\n", "* Linear regression: `np.polyfit` or `np.linalg.lstsq`\n", "* Non-linear least squares: `optimize.curve_fit`\n", "\n", "Distributions:\n", "\n", "* Binned fits: See above, be careful and include error\n", "* Unbinned fits: NLL (use `optimize.minimize` or custom package)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Try a quick polyfit here:\n", "\n", "" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.random.seed(42)\n", "x = np.linspace(0, 2, 200)\n", "y = x * 2 + 0.5 + np.random.normal(0, 0.5, 200)\n", "\n", "# Add fit here\n", "\n", "plt.plot(x, y, \".\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 6.1, 6.2: Structured tabular data and cuts\n", "\n", "Play with the following dataframe. Make a scatter plot of `sepal_width` vs. `sepal_length` with each of the three species a different color.\n", "\n", "" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df = pd.read_csv(\n", " \"https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv\"\n", ")\n", "\n", "df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note, we could do this with seaborn:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sns.scatterplot(\"sepal_length\", \"sepal_width\", \"species\", data=df)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 9.2: MCMC\n", "\n", "Let's look at the metropolis algorithm (MCMC but without computing a posterior, so simpler) to produce samples from $\\sin(x)**2 / x**2$" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def p(x):\n", " return np.sin(x) ** 2 / x ** 2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Normalization not needed for algorithm, but makes the plots line up:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "integ, err = integrate.quad(p, -15, 15.000001)\n", "print(integ, err)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x1 = np.linspace(-15, 15, 400)\n", "y1 = p(x1) / integ" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now compute the metropolis algorithm:\n", "\n", "" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "samples = 100_000\n", "sigma = 10\n", "x = np.empty(samples + 1)\n", "x[0] = np.random.rand()\n", "\n", "for i in range(samples):\n", " # suggest new position xStar\n", " ...\n", "\n", " # Compute alpha - the fractional chance of moving to a new point\n", " ...\n", "\n", " # Accept/reject based on alpha\n", " ...\n", "\n", " # Add the current (moved?) point\n", " ..." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.plot(x1, y1)\n", "# plt.hist(x[500:], density=True, bins=100)\n", "# plt.yscale('log')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Side project: This is great, but very slow. We can make this a function, and add a couple of `@numba.njit`'s, and get a massive speedup!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 10.2: Runge-Kutta algorithms\n", "\n", "Let's look at ODEs, and revisit the RK2 algorithm. Remember you can do the following:\n", "\n", "$$\n", "\\mathbf{u} =\n", "\\left(\n", "\\begin{matrix}\n", "\\dot{x} \\\\\n", "x\n", "\\end{matrix}\n", "\\right)\n", "$$\n", "\n", "$$\n", "\\mathbf{f}(t, \\mathbf{u}) =\n", "\\dot{\\mathbf{u}}\n", "$$\n", "\n", "To get a first order ODE." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We'll start with one, though:\n", "\n", "$$\n", "u =\n", "\\left(\n", "\\begin{matrix}\n", "x \\\\\n", "y \\\\\n", "z\n", "\\end{matrix}\n", "\\right)\n", "$$\n", "\n", "$$\n", "\\dot{u} =\n", "\\left(\n", "\\begin{matrix}\n", "\\sigma (y - x) \\\\\n", "\\rho x - x z - y \\\\\n", "x y - \\beta z\n", "\\end{matrix}\n", "\\right)\n", "$$\n", "\n", "Constants given below. (example from [Bokeh](https://bokeh.pydata.org/en/latest/docs/gallery/lorenz.html))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sigma = 10\n", "rho = 28\n", "beta = 8 / 3\n", "theta = 3 * np.pi / 4\n", "\n", "\n", "def lorenz(t, xyz):\n", " x, y, z = xyz\n", " x_dot = sigma * (y - x)\n", " y_dot = x * rho - x * z - y\n", " z_dot = x * y - beta * z\n", " return np.array([x_dot, y_dot, z_dot])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "initial = (-10, -7, 35)\n", "t = np.arange(0, 50, 0.006)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### RK2" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def rk2_ivp(f, init_y, t):\n", " steps = len(t)\n", " order = len(init_y)\n", "\n", " y = np.empty((steps, order))\n", " y[0] = init_y\n", "\n", " for n in range(steps - 1):\n", " h = t[n + 1] - t[n]\n", "\n", " k1 = h * f(t[n], y[n]) # 1.1\n", " k2 = h * f(t[n] + h / 2, y[n] + k1 / 2) # 1.2\n", "\n", " y[n + 1] = y[n] + k2 # 1.0\n", "\n", " return y" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "xyz = rk2_ivp(lorenz, initial, t)\n", "xprime = np.cos(theta) * xyz[:, 0] - np.sin(theta) * xyz[:, 1]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.figure(figsize=(9, 8))\n", "plt.plot(xprime, xyz[:, 2])\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### RK4" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def rk4_ivp(f, init_y, t):\n", " steps = len(t)\n", " order = len(init_y)\n", "\n", " y = np.empty((steps, order))\n", " y[0] = init_y\n", "\n", " for n in range(steps - 1):\n", " h = t[n + 1] - t[n]\n", "\n", " k1 = h * f(t[n], y[n]) # 2.1\n", " k2 = h * f(t[n] + h / 2, y[n] + k1 / 2) # 2.2\n", " k3 = h * f(t[n] + h / 2, y[n] + k2 / 2) # 2.3\n", " k4 = h * f(t[n] + h, y[n] + k3) # 2.4\n", "\n", " y[n + 1] = y[n] + 1 / 6 * (k1 + 2 * k2 + 2 * k3 + k4) # 2.0\n", "\n", " return y" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "xyz = rk4_ivp(lorenz, initial, t)\n", "xprime = np.cos(theta) * xyz[:, 0] - np.sin(theta) * xyz[:, 1]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.figure(figsize=(9, 8))\n", "plt.plot(xprime, xyz[:, 2])\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Solve IVP" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "res = integrate.solve_ivp(lorenz, (0, 100), initial, t_eval=t)\n", "xprime = np.cos(theta) * res.y[0] - np.sin(theta) * res.y[1]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.figure(figsize=(9, 8))\n", "plt.plot(xprime, res.y[2])\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Ode Int\n", "\n", "Let's try this with ODE Int, since that's in older SciPy's. The main difference is the reversed order for f, and" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def lorenz_flip(xyz, t):\n", " return lorenz(t, xyz)\n", "\n", "\n", "xyz = integrate.odeint(lorenz_flip, initial, t)\n", "xprime = np.cos(theta) * xyz[:, 0] - np.sin(theta) * xyz[:, 1]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.figure(figsize=(9, 8))\n", "plt.plot(xprime, xyz[:, 2])\n", "plt.show()" ] } ], "metadata": { "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.2" } }, "nbformat": 4, "nbformat_minor": 2 }