{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/fonnesbeck/Bios8366/blob/master/notebooks/Section1_1-Univeriate-and-Multivariate-Optimization.ipynb)\n", "\n", "# Univariate and Multivariate Optimization\n", "\n", "The first two lectures of the course serve two objectives:\n", "\n", "1. Acquaint students with the Python programming language\n", "2. Review an important class of statistical computing algorithms: *optimization*\n", "\n", "For some of you, one or both of these topics will likely be review. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's begin by importing the packages we will need for this section.\n", "\n", "> ### Import statements\n", "> Much of Python's power resides in **modules**, either those included in base Python or from third parties, which contain functions and classes that provide key functionality for specialized tasks. We will import several of these modules here to help us perform scientific computing tasks, such as linear algebra, data manipulation, and plotting. The `import` statement brings the module into the current session.\n", "> Here we also create **aliases** for each module, so that they may be accessed more easily." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "import numpy as np\n", "import pandas as pd\n", "import seaborn as sns\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> For example, the `seaborn` package provides some high-level plotting capability. Here, we will call the `set` function from `seaborn`, whose `context` argument allows us to adjust the size of the labels, lines, and other elements of plots. 
The `'notebook'` argument tells `seaborn` to set these elements to be suitable for display within a Jupyter Notebook." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Set some Seaborn options\n", "sns.set(context='notebook', style='ticks', palette='viridis')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Optimization\n", "\n", "Optimization is the process of finding the *minima* or *maxima* of a function. Consider a function:\n", "\n", "$$f: \\mathbf{R} \\rightarrow \\mathbf{R}$$\n", "\n", "where $f',f''$ are continuous. A point $x^*$ is a *global* maximum if:\n", "\n", "$$f(x) \\le f(x^*) \\, \\forall \\, x$$\n", "\n", "or a *local* maximum if, for some $\\epsilon \\gt 0$:\n", "\n", "$$f(x) \\le f(x^*)$$\n", "$$\\forall \\, x:|x-x^*| \\lt \\epsilon$$\n", "\n", "Necessary conditions for a local maximum:\n", "\n", "1. $f'(x^*) = 0$\n", "2. $f''(x^*) \\le 0$ (sufficient, together with condition 1, if $f''(x^*) \\lt 0$)\n", "\n", "We will consider **local search** methods that generate a sequence of values that converges to the maximum:\n", "\n", "$$x_0, x_1, x_2, \\ldots \\rightarrow \\text{argmax}(f)$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Example: Maximum Likelihood\n", "\n", "**Maximum likelihood** (ML) is an approach for estimating the parameters of statistical models. Under regularity conditions, ML estimates are consistent and asymptotically efficient, which is why the method is so widely used.\n", "\n", "There is a ton of theory regarding ML. We will restrict ourselves to the mechanics here.\n", "\n", "Say we have some data $y = y_1,y_2,\\ldots,y_n$ that are distributed according to some distribution:\n", "\n", "