{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Univariate and Multivariate Optimization\n", "\n", "The first two lectures of the course will serve to satisfy two objectives:\n", "\n", "1. Acquaint students with the Python programming language\n", "2. Review an important class of statistical computing algorithms: *optimization*\n", "\n", "For some of you, one or both of these topics will likely be review. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's begin by importing the packages we will need for this section.\n", "\n", "> ### Import statements\n", "> Much of Python's power resides in **modules**, either those included in base Python or from third parties, which contain functions and classes that provide key functionality for specialized tasks. We will import several of these modules, here to enable us to more easily peform scientific computing tasks, such as linear algebra, data manipulation, and plotting. The import clause will bring the module into the current session.\n", "> Here we also create **aliases** for each module, so that they may be accessed more easily." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "import numpy as np\n", "import pandas as pd\n", "import seaborn as sns\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> For example, the seaborn package provides some high-level plotting capability. Here, we will call the set_context function from seaborn, which allows us to adjust the size of the labels, lines, and other elements of plots. The 'notebook' argument tells seaborn to set these elements to be suitable for display within a Jupyter Notebook." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# Set some Seaborn options\n", "sns.set(context='notebook', style='ticks', palette='viridis')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Optimization\n", "\n", "Optimization is the process of finding the *minima* or *maxima* of a function. Consider a function:\n", "\n", "$$f: \\mathbf{R} \\rightarrow \\mathbf{R}$$\n", "\n", "where $f',f''$ are continuous. A point $x^*$ is a *global* maximum if:\n", "\n", "$$f(x) \\le f(x^*) \\, \\forall \\, x$$\n", "\n", "or a *local* maximum if:\n", "\n", "$$f(x) \\le f(x^*)$$ \n", "$$\\forall \\, x:|x-x^*| \\lt \\epsilon$$\n", "\n", "Necessary conditions:\n", "\n", "1. $f'(x^*) = 0$\n", "2. $f''(x^*) \\le 0$ (sufficient if $f''(x^*) \\lt 0$)\n", "\n", "We will consider **local search** methods that generate a series of values that converge to the maximum:\n", "\n", "$$x_0, x_1, x_2, \\ldots \\rightarrow \\text{argmax}(f)$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Example: Maximum Likelihood\n", "\n", "**Maximum likelihood** (ML) is an approach for estimating the parameters of statistical models. The resulting estimates from ML have good theoretical properties, so it is a widely-used method. \n", "\n", "There is a ton of theory regarding ML. We will restrict ourselves to the mechanics here.\n", "\n", "Say we have some data $y = y_1,y_2,\\ldots,y_n$ that is distributed according to some distribution:\n", "\n", "
\n", "$$Pr(Y_i=y_i | \\theta)$$\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> ### Random number generation\n", "> The numpy module contains a random submodule with functions for generating random values from several important probability distributions. For example, if we want to generate 100 values from a **Poisson distribution** with a mean of 5, we can make the following function call:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "y = np.random.poisson(5, size=100)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> The variable y is now the label associated with the resulting 100 values that we have sampled. They are stored in a data structure called an ndarray." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "numpy.ndarray" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(y)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> We can arbitrarily acces values of this array by **indexing** them, specifying values or ranges of values within square brackets. For example, to get the first value:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "4" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "y[0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> Notice Python begins indexing values starting at zero, rather than one. To extract a sub-array, we can use a *slice*, denoted by the boundaries of the sub-array separated with a colon:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([4, 6, 5, 3, 3, 4, 8, 2, 5, 6])" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "y[:10]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's plot a histogram of the sampled values:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXoAAAESCAYAAADzBx6nAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAEKxJREFUeJzt3X9o1IUfx/HX3LjZrGPO1N2cKVnaYpK0gxEl1ZRG4a/9Ndm0IVZgtCxbYWFOnEIrQZNpkwrZH8NAiOEv3AINNCosChyaM39lutP9UG5t6PTu8/2jGu57fr/7cXf7bG+fD+gPP7vbvY7k6fnx7rMEx3EcAQDMGuX2AABAfBF6ADCO0AOAcYQeAIwj9ABgHKEHAOMIPQAYR+gBwDhCDwDGEXoAMI7QA4BxSW498I0bN9TY2Kjx48crMTHRrRkAMKKEQiG1tLQoOztbo0eP7td9XAt9Y2OjiouL3Xp4ABjRamtr5ff7+3Vb10I/fvx4SX+PTU9Pd2sGAIwogUBAxcXFPQ3tD9dC/+/pmvT0dGVmZro1AwBGpIGc8uYfYwHAOEIPAMYRegAwjtADgHGEHgCMI/QAYByhBwDjXHsfPTCcdQS71NXZ7faMCCljPHrAm+L2DIwwhB64i67Obv38/e9uz4iQ89QjhB4DxqkbADCO0AOAcYQeAIwj9ABgHKEHAOMIPQAYR+gBwDhCDwDGEXoAMI7QA4BxhB4AjCP0AGAcoQcA4wg9ABhH6AHAOEIPAMYRegAwjtADgHGEHgCMI/QAYByhBwDjCD0AGJfU1w2uXbum9957T3/88Yc8Ho+mTJmi9evXKy0tTb/++qvWrl2rmzdvatKkSfrkk080bty4odgNAOinPl/RJyQk6JVXXlF9fb327t2ryZMna9OmTXIcR++++67Wrl2r+vp6+f1+bdq0aSg2AwAGoM/Qp6amKjc3t+fXs2bN0uXLl3X8+HElJyfL7/dLkhYvXqyDBw/e9XsEg0H9+eefvf4LBAIxegoAgP+nz1M3dwqHw9q1a5fy8vLU3NysjIyMnq+lpaUpHA7r+vXrSk1N7XW/mpoaVVVVxWbxMNcR7FJXZ7fbMyKkjPHoAW+K2zMAuGBAoa+oqFBKSoqWLFmib775pt/3KykpUUFBQa9jgUBAxcXFA3n4EaGrs1s/f/+72zMi5Dz1CKEH7lH9Dn1lZaUuXLig6upqjRo1Sj6fT5cvX+75ent7uxISEiJezUuS1+uV1+uNzWIAwID06+2VmzdvVmNjo7Zt2yaPxyNJys7O1o0bN/TTTz9Jkr766iu9+OKL8VsKABiUPl/Rnz59WtXV1Zo6daoWL14sScrMzNS2bdv08ccfq7y8vNfbKwEAw0ufoX/00Ud16tSpu37tySef1N69e2M+CgAQO3wyFgCMI/QAYByhBwDjBvQ+eiDWhusHzLpv3nJ7AhAzhB6uGq4fMHtsZqbbE4CY4dQNABhH6AHAOEIPAMYRegAwjtADgHGEHgCMI/QAYByhBwDjCD0AGEfoAcA4Qg8AxhF6ADCO0AOAcYQeAIwj9ABgHKEHAOMIPQAYR+gBwDhCDwDGEXoAMI7QA4BxhB4AjCP0AGAcoQcA4wg9ABiX5PaAweoIdqmrs9vtGRG6b95yewIA9DJiQ9/V2a2fv//d7RkRHpuZ6fYEAOiFUzcAYByhBwDjCD0AGEfoAcC4fv1jbGVlperr63Xp0iXt3btX06dPlyTl5eXJ4/EoOTlZklRWVqbZs2fHby0AYMD6Ffo5c+bo5ZdfVnFxccTXtm7d2hN+AMDw06/Q+/3+qB4kGAwqGAz2OhYIBKL6ngCA/on6ffRlZWVyHEc5OTlatWqVvF5vxG1qampUVVUV7UMBAAYhqtDX1tbK5/Opu7tbGzdu1Pr167Vp06aI25WUlKigoKDXsUAgcNdTQQCA2Ioq9D6fT5Lk8XhUVFSkFStW3PV2Xq/3rq/0AQDxN+i3V3Z1damjo0OS5DiODhw4oKysrJgNAwDERr9e0W/YsEENDQ1qbW3VsmXLlJqaqurqapWWlioUCikcDmvatGkqLy+P914AwAD1K/Rr1qzRmjVrIo7X1dXFfBAAILb4ZCwAGEfoAcA4Qg8AxhF6ADCO0AOAcYQeAIwj9ABgHKEHAOMIPQAYR+gBwDhCDwDGEXoAMI7QA4BxhB4AjCP0AGAcoQcA4wg9ABhH6AHAOEIPAMYRegAwjtADgHGEHgCMI/QAYByhBwDjCD0AGEfoAcA4Qg8AxhF6ADCO0AOAcYQeAIwj9ABgHKEHAOMIPQAYR+gBwDhCDwDGEXoAMK7P0FdWViovL08zZsxQU1NTz/Fz586psLBQ+fn5Kiws1Pnz5+O5EwAwSH2Gfs6cOaqtrdWkSZN6HS8vL1dRUZHq6+tVVFSktWvXxm0kAGDwkvq6gd/vjzjW1tamEydOaOfOnZKkefPmqaKiQu3t7UpLS4u4fTAYVDAY7HUsEAgMdjMAYAD6DP3dNDc3a+LEiUpMTJQkJSYmasKECWpubr5r6GtqalRVVRXdUgDAoAwq9ANVUlKigoKCXscCgYCKi4uH4uEB4J42qND7fD5duXJFoVBIiYmJCoVCunr1qnw+311v7/V65fV6oxoKABicQb29cty4ccrKytK+ffskSfv27VNWVtZdT9sAANzV5yv6DRs2qKGhQa2trVq2bJlSU1O1f/9+rVu3TqtXr9b27dvl9XpVWVk5FHsBAAPUZ+jXrFmjNWvWRByfNm2adu/eHZdRAIDY4ZOxAGAcoQcA4wg9ABg3JO+jh/tCt0O60nzd7RkRum/ecnsCYB6hv0fcuHFLvx0/5/aMCI/NzHR7AmAep24AwDhCDwDGEXoAMI7QA4BxhB4AjCP0AGAcoQcA4wg9ABhH6AHAOEIPAMYRegAwjtADgHGEHgCMI/QAYByhBwDjCD0AGMcPHgFGkOH6k8JSxnj0gDfF7Rn4Hwg9MIIM158UlvPUI4R+GOPUDQAYR+gBwDhCDwDGEXoAMI7QA4BxhB4AjCP0AGAcoQcA4wg9ABhH6AHAOEIPAMYRegAwjtADgHFRX70yLy9PHo9HycnJkqSysjLNnj076mEAgNiIyWWKt27dqunTp8fiWwEAYmxIrkcfDAYVDAZ7HQsEAkPx0ABwz4tJ6MvKyuQ4jnJycrRq1Sp5vd5eX6+pqVFVVVUsHgoAMEBRh762tlY+n0/d3d3auHGj1q9fr02bNvW6TUlJiQoKCnodCwQCKi4ujvbhAQB9iDr0Pp9PkuTxeFRUVKQVK1ZE3Mbr9Ua8ygcADI2o3l7Z1dWljo4OSZLjODpw4ICysrJiMgwAEBtRvaJva2tTaWmpQqGQwuGwpk2bpvLy8lhtAwDEQFShnzx5surq6mK1BQAQB3wyFgCMI/QAYByhBwDjCD0AGEfoAcA4Qg8AxhF6ADCO0AOAcYQeAIwj9ABgHKEHAOMIPQAYNyQ/ShCAbaHbIV1pvu72jAijRiUoHHbcnhEhZYxHD3hThuzxCD2AqN24cUu/HT/n9owIj83M1G/H/3R7RoScpx4Z0tBz6gYAjCP0AGAcoQcA4wg9ABhH6AHAOEIPAMYRegAwjtADgHGEHgCMI/QAYByhBwDjCD0AGEfoAcA4Qg8AxhF6ADCO0AOAcYQeAIwj9ABgHKEHAOMIPQAYR+gBwDhCDwDGRR36c+fOqbCwUPn5+SosLNT58+djMAsAECtRh768vFxFRUWqr69XUVGR1q5dG4tdAIAYSYrmzm1tbTpx4oR27twpSZo3b54qKirU3t6utLS0ntsFg0EFg8Fe97106ZIkKRAIDOqxW68G1X6tZZDL4+fKlUR2DQC7BoZdAzNcdzU3p+hW6K9B3fffZoZCoX7fJ6rQNzc3a+LEiUpMTJQkJSYmasKECWpubu4V+pqaGlVVVd31exQXF0czAQDuSS0tLZoyZUq/bhtV6PurpKREBQUFvY51d3fr4sWLmjp1as8fFMNdIBBQcXGxamtrlZ6e7vacIcFz5jlbNVKfcygUUktLi7Kzs/t9n6hC7/P5dOXKFYVCISUmJioUCunq1avy+Xy9buf1euX1eiPu//DDD0fz8K5JT09XZmam2zOGFM/53sBzHhn6+0r+X1H9Y+y4ceOUlZWlffv2SZL27dunrKysXqdtAADuivrUzbp167R69Wpt375dXq9XlZWVsdgFAIiRqEM/bdo07d69OxZbAABxkLhu3bp1bo8YSZKTk5Wbm6vk5GS3pwwZnvO9gedsV4LjOI7bIwAA8cO1bgDAOEIPAMYR+n66du2aXn31VeXn52v+/Pl644031N7e7vasIVFVVaUZM2aoqanJ7Slxd/PmTZWXl+uFF17Q/Pnz9eGHH7o9Ke4OHz6sRYsWaeHChZo/f74aGhrcnhRzlZWVysvLi/h9fM9clNFBv1y7ds354Ycfen790UcfOe+//76Li4ZGY2Ojs3z5cue5555zTp065facuKuoqHA2btzohMNhx3Ecp6WlxeVF8RUOhx2/39/z//bkyZPOrFmznFAo5PKy2Dp27Jhz+fJl5/nnn+/1+3jp0qVOXV2d4ziOU1dX5yxdutStiXHFK/p+Sk1NVW5ubs+vZ82apcuXL7u4KP66u7u1fv16lZeXKyEhwe05cdfZ2am6ujqtXLmy5/k++OCDLq+Kv1GjRqmjo0OS1NHRoQkTJmjUKFtp8Pv9EZ/Y//eijPPmzZP090UZT5w4YfJv6kNyrRtrwuGwdu3apby8PLenxNWnn36qBQsWaPLkyW5PGRIXL15Uamqqqqqq9OOPP2rMmDFauXKl/H6/29PiJiEhQVu2bNHrr7+ulJQUdXZ2aseOHW7PGhL9vSijBbb+2B4iFRUVSklJ0ZIlS9yeEje//PKLjh8/rqKiIrenDJnbt2/r4sWLevzxx/X111+rrKxMpaWl+uuvwV1OdiS4ffu2duzYoe3bt+vw4cP67LPP9Pbbb6uzs9PtaYghQj9AlZWVunDhgrZs2WLur7d3OnbsmM6ePas5c+YoLy9PgUBAy5cv19GjR92eFjcZGRlKSkrq+av8E088obFjx+rcuXMuL4ufkydP6urVq8rJyZEk5eTk6L777tOZM2dcXhZ/d16UUdL/vCijBXZLFQebN29WY2Ojtm3bJo/H4/acuHrttdd09OhRHTp0SIcOHVJ6erq+/PJLPfPMM25Pi5u0tDTl5ubqu+++k/T3OzLa2toGfKXAkSQ9PV2BQEBnz56VJJ05c0atra166KGHXF4Wf/fSRRn5ZGw/nT59WvPmzdPUqVM1evRoSVJmZqa2bdvm8rKhkZeXp+rqak2fPt3tKXF18eJFffDBB7p+/bqSkpL01ltv6dlnn3V7Vlzt2bNHn3/+ec8/QL/55puaO3euy6tia8OGDWpoaFBra6vGjh2r1NRU7d+/X2fOnNHq1asVDAZ7Lso4Ui+f/v8QegAwjlM3AGAcoQcA4wg9ABhH6AHAOEIPAMYRegAwjtADgHGEHgCMI/TAP7744guVlpb2OlZRUaGNGze6tAiIDUIP/GPBggU6cuSIgsGgpL+v7HjgwAEtXLjQ5WVAdAg98I8JEybI7/fr4MGDkqQjR45o7Nixys7OdnkZEB1CD9yhoKBAe/bskfT3xb54NQ8LCD1wh7lz5+rUqVNqamrSt99+q/nz57s9CYgaoQfukJycrPz8fL3zzjuaOXOmMjIy3J4ERI3QA/9l0aJFampq4rQNzCD0wH/JyMjQ6NGjlZ+f7/YUICYIPXCHcDisnTt36qWXXtL999/v9hwgJpLcHgAMF11dXXr66aeVkZGhL774wu05QMzwowQBwDhO3QCAcYQeAIwj9ABgHKEHAOMIPQAYR+gBwLj/ANAh7fhqKmJbAAAAAElFTkSuQmCC\n", "text/plain": [ "