{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Measurement Errors in Linear Regression" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Lets use simulated data here. \n", "First we will model the distance of 100 supernovae (for a particular cosmology) as a function of redshift.\n", "\n", "We rely on that astroML has a common API with scikit-learn, extending the functionality of the latter.\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "\n", "import numpy as np\n", "from matplotlib import pyplot as plt" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from astropy.cosmology import LambdaCDM\n", "from astroML.datasets import generate_mu_z\n", "\n", "z_sample, mu_sample, dmu = generate_mu_z(100, random_state=0)\n", "\n", "cosmo = LambdaCDM(H0=70, Om0=0.30, Ode0=0.70, Tcmb0=0)\n", "z = np.linspace(0.01, 2, 1000)\n", "mu_true = cosmo.distmod(z)\n", "\n", "plt.errorbar(z_sample, mu_sample, dmu, fmt='.')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Simple linear regression\n", "\n", "Regression defined as the relation between a dependent variable, $y$, and a set of independent variables, $x$, \n", "that describes the expectation value of y given x: $ E[y|x] $\n", "\n", "We will start with the most familiar linear regression, a straight-line fit to data.\n", "A straight-line fit is a model of the form\n", "$$\n", "y = ax + b\n", "$$\n", "where $a$ is commonly known as the *slope*, and $b$ is commonly known as the *intercept*.\n", "\n", "We can use Scikit-Learn's LinearRegression estimator to fit this data and construct the best-fit line:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from sklearn.linear_model import LinearRegression as LinearRegression_sk \n", "\n", "linear_sk = LinearRegression_sk()\n", "linear_sk.fit(z_sample[:,None], mu_sample)\n", "\n", "mu_fit_sk = linear_sk.predict(z[:, None])\n", "\n", "#------------------------------------------------------------\n", "# Plot the results\n", "fig = plt.figure(figsize=(8, 6))\n", "\n", "ax = fig.add_subplot(111)\n", "\n", "ax.plot(z, mu_fit_sk, '-k')\n", "ax.plot(z, mu_true, '--', c='gray')\n", "ax.errorbar(z_sample, mu_sample, dmu, fmt='.k', ecolor='gray', lw=1)\n", "\n", "ax.set_xlim(0.01, 1.8)\n", "ax.set_ylim(36.01, 48)\n", "\n", "ax.set_ylabel(r'$\\mu$')\n", "ax.set_xlabel(r'$z$')\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Measurement errors\n", "\n", "Modifications to LinearRegression in astroML take measurement errors into account on the dependent variable." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from astroML.linear_model import LinearRegression\n", "\n", "linear = LinearRegression()\n", "linear.fit(z_sample[:,None], mu_sample, dmu)\n", "\n", "mu_fit = linear.predict(z[:, None])\n", "\n", "#------------------------------------------------------------\n", "# Plot the results\n", "fig = plt.figure(figsize=(8, 6))\n", "\n", "ax = fig.add_subplot(111)\n", "\n", "ax.plot(z, mu_fit_sk, '-k')\n", "ax.plot(z, mu_fit, '-k', color='red')\n", "\n", "ax.plot(z, mu_true, '--', c='gray')\n", "ax.errorbar(z_sample, mu_sample, dmu, fmt='.k', ecolor='gray', lw=1)\n", "\n", "ax.set_xlim(0.01, 1.8)\n", "ax.set_ylim(36.01, 48)\n", "\n", "ax.set_ylabel(r'$\\mu$')\n", "ax.set_xlabel(r'$z$')\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Basis function regression\n", "\n", "If we consider a function in terms of the sum of bases (this can be polynomials, Gaussians, quadratics, cubics) then we can solve for the coefficients using regression. \n", "\n", "### Polynomial basis functions\n", "\n", "polynomial regression: $$𝑦=π‘Ž_0+π‘Ž_1π‘₯+π‘Ž_2π‘₯^2+π‘Ž_3π‘₯^3+β‹―$$\n", "\n", "Notice that this is still a linear modelβ€”the linearity refers to the fact that the coefficients $π‘Ž_𝑛$ never multiply or divide each other. What we have effectively done here is to take our one-dimensional $π‘₯$ values and projected them into a higher dimension, so that a linear fit can fit more complicated relationships between $π‘₯$ and $𝑦$." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from astroML.linear_model import PolynomialRegression\n", "\n", "# 2nd degree polynomial regression\n", "polynomial = PolynomialRegression(degree=2)\n", "polynomial.fit(z_sample[:,None], mu_sample, dmu)\n", "\n", "mu_fit_poly = polynomial.predict(z[:, None])\n", "\n", "#------------------------------------------------------------\n", "# Plot the results\n", "fig = plt.figure(figsize=(8, 6))\n", "\n", "ax = fig.add_subplot(111)\n", "\n", "ax.plot(z, mu_fit_poly, '-k', color='red')\n", "\n", "ax.plot(z, mu_true, '--', c='gray')\n", "ax.errorbar(z_sample, mu_sample, dmu, fmt='.k', ecolor='gray', lw=1)\n", "\n", "ax.set_xlim(0.01, 1.8)\n", "ax.set_ylim(36.01, 48)\n", "\n", "ax.set_ylabel(r'$\\mu$')\n", "ax.set_xlabel(r'$z$')\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Gaussian basis functions\n", "\n", "Of course, other basis functions are possible.\n", "For example, one useful pattern is to fit a model that is not a sum of polynomial bases, but a sum of Gaussian bases. E.g. we could substitute $π‘₯^2$ for Gaussians (where we fix $𝜎$ and $πœ‡$ and fit for the amplitude) as long as the attribute we are fitting for is linear. 
, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from astroML.linear_model import BasisFunctionRegression\n", "\n", "#------------------------------------------------------------\n", "# Define our number of Gaussians\n", "nGaussians = 10\n", "basis_mu = np.linspace(0, 2, nGaussians)[:, None]\n", "basis_sigma = 3 * (basis_mu[1] - basis_mu[0])\n", "\n", "gauss_basis = BasisFunctionRegression('gaussian', mu=basis_mu, sigma=basis_sigma)\n", "gauss_basis.fit(z_sample[:, None], mu_sample, dmu)\n", "\n", "mu_fit_gauss = gauss_basis.predict(z[:, None])\n", "\n", "#------------------------------------------------------------\n", "# Plot the results\n", "fig = plt.figure(figsize=(8, 6))\n", "\n", "ax = fig.add_subplot(111)\n", "\n", "ax.plot(z, mu_fit_gauss, '-', color='red')\n", "\n", "ax.plot(z, mu_true, '--', c='gray')\n", "ax.errorbar(z_sample, mu_sample, dmu, fmt='.k', ecolor='gray', lw=1)\n", "\n", "ax.set_xlim(0.01, 1.8)\n", "ax.set_ylim(36.01, 48)\n", "\n", "ax.set_ylabel(r'$\\mu$')\n", "ax.set_xlabel(r'$z$')\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Regularization and cross-validation\n", "\n", "As the complexity of a model increases, it fits the training data more and more closely.\n", "\n", "This does not, however, give a better description of the underlying relation: we are overfitting the data (the model has high variance; a small change in a training point can change the model dramatically).\n", "\n", "We can evaluate this using a training set (50-70% of the sample), a cross-validation set (15-25%) and a test set (15-25%). See book sub-chapter 8.11 and e.g. figures [8.13](http://www.astroml.org/book_figures/chapter8/fig_cross_val_B.html) and [8.14](http://www.astroml.org/book_figures/chapter8/fig_cross_val_C.html).\n", "\n", "For cases where we are concerned about overfitting, we can apply constraints (usually on smoothness, the number of coefficients, or the size of the coefficients). See book sub-chapter 8.3, and e.g. figure [8.4](http://www.astroml.org/book_figures/chapter8/fig_rbf_ridge_mu_z.html).\n", "\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.5" } }, "nbformat": 4, "nbformat_minor": 2 }