{ "cells": [ { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "###### The cell below loads the visual style of the notebook when run." ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "collapsed": false }, "outputs": [ ], "source": [ "from IPython.core.display import HTML\n", "css_file = '../../styles/styles.css'\n", "HTML(open(css_file, \"r\").read())" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "# Fitting a model to data" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "
\n", "
\n", "

Learning Objectives

\n", "
\n", "
\n", "\n", "> * Be able to describe what a figure of merit is, and what it is for.\n", "> * Learn how to use ```scipy``` to fit models to data.\n", "> * Understand why sometimes model-fitting algorithms do not give the right answer.\n" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "## Theory\n", "\n", "Probably the most common task for an observational astronomer is fitting a model to some data. Suppose we have a model $M$ with some parameters $\\theta$. For example, our model, $M$ could be a straight line $y = mx+c$, with parameters $\\theta = (m,c)$. Model fitting is the process of finding the parameters which *best fit* our data, and the corresponding uncertainties on them. To do this, we obviously have to have some objective **measure** of how well our model fits our data. Such a measure is often called a **figure of merit**.\n", "\n", "### Figures of merit\n", "\n", "### The sum-of-the-squares\n", "\n", "Suppose we have a model $y(x)$ and we observe a set of points $(x_i,y_i)$. If our model is a good fit to the data then the model prediction at a certain data point $y(x_i)$ will lie close to the observation $y_i$. Therefore, the sum-of-the squares statistic\n", "\n", "$${\\rm SS} = \\sum_i [y_i - y(x_i)]^2,$$\n", "\n", "is a measure of how well our model fits the data. Smaller values of ${\\rm SS}$ imply a better fit. For scientific data, the sum-of-the-squares is not the best figure of merit. This is because every data point contributes equally to the sum, regardless of it's uncertainty.\n", "\n", "### The $\\chi^2$ statistic\n", "\n", "This is the most commonly used measure of goodness-of-fit. Suppose again that we have a model $y(x)$ and we observe a set of points $(x_i,y_i)$, each with corresponding uncertainties $\\sigma_i$. The $\\chi^2$ statistic is \n", "\n", "$$\\chi^2 = \\sum_i \\left(\\frac{y_i - y(x_i)}{\\sigma_i} \\right) ^2,$$\n", "\n", "where $y(x_i)$ is the prediction of our model at $x_i$. Intuitively, we can see that $\\chi^2$ measures goodness of fit. The quantity $ [y_i - y(x_i)]/\\sigma_i $ is a measure of how many error bars a data point is away from our model. We saw in the [lectures](http://slittlefair.staff.shef.ac.uk/teaching/phy241/lectures/L10/index.html#gaussian) that if our model fits well, most data points lie within 1$\\sigma$, but a third lie further away. Therefore, we'd expect $\\chi^2$ for a good fit to roughly equal the number of data points. Poor fits will yield larger versions of $\\chi^2$. A common way to fit a model is therefore to find the model parameters which **minimise** $\\chi^2$.\n", "\n", "If you want to prove to yourself that minimising $\\chi^2$ is not just intuitively correct, but formally correct, have a look at the companion notebook `08a-why_chisq.ipynb`." ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "----------\n", "\n", "## Model fitting in Python\n", "\n", "Model fitting is therefore just a matter of finding the model parameters that minimise $\\chi^2$. You could do this by hand, of course, but this is the kind of task that computers are designed for. Python has several different methods available in libraries. They all do the same thing - they attempt to automatically search parameter space for the values that minimise $\\chi^2$. Which you want to use depends a little on your task. We'll look at two cases: fitting simple polynomial models (e.g a straight line), and fitting more complex models\n", "\n", "So we want to fit a straight line to data using python, and plot the results. This is very simple. First we need to import the relevant modules" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "collapsed": false }, "outputs": [ ], "source": [ "%matplotlib inline\n", "import numpy as np\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "Now we wish to load in the data to fit. This is stored in a comma separated file. The loadtxt function in numpy will do this simply and will load each column into arrays called x, y and e." ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "collapsed": false }, "outputs": [ ], "source": [ "# note the \"unpack\" optional argument to unpack the columns automatically\n", "x, y, e = np.loadtxt('/home/user/PHY241/data/Session7/data.txt', unpack=True)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "Now we have our arrays of data we can plot it to see what it looks like." ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [ ], "source": [ "fig,ax = plt.subplots()\n", "ax.errorbar(x, y, yerr=e, fmt='.')\n", "ax.set_xlabel('x')\n", "ax.set_ylabel('y')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "# Fitting models with SciPy\n", "\n", "In this case our data looks like a straight line will fit it. If we want to fit a polynomial to some data, the easiest way is to use the ```numpy.polyfit``` package. I show how to do that in the companion notebook `08b-polynomials.ipynb`. For this lab, we are going to learn a different approach, which has the advantage that it will scale to more complex models.\n", "\n", "If we want to fit a model which is more complex than a polynomial, we can use scipy's **optimize** library. This library has many routines for finding the parameters of a model which best fit your data. We're going to look at the [curve fit](http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html#scipy.optimize.curve_fit) function in scipy's [optimize](http://docs.scipy.org/doc/scipy/reference/optimize.html) package, which uses the Levenberg-Marquardt algorithm to find the parameters which best fit our data.\n", "\n", "### The Levenberg-Marquardt algorithm: theory\n", "\n", "Finding the model parameters which minimise $\\chi^2$ for a function requires an algorithm, or recipe. You could imagine using a brute-force method where you calculate $\\chi^2$ over a wide range of possible parameters, and find the best ones. Such a method does work, but is very slow - impractically so for models that have more than two parameters.\n", "\n", "The Levenberg-Marquardt algorithm is an attempt to find the best parameters more efficiently. It is a type of \"gradient-search\" method, illustrated in the figure below:\n", "\n", "\n", "\n", "The plot above shows how $\\chi^2$ depends upon the model parameter for a model with a single free parameter. From a given starting point, the Levenberg-Marquardt algorithm finds the \"downhill\" direction in $\\chi^2$, and steps in that direction until $\\chi^2$ stops decreasing. This location gives the parameter values which minimise $\\chi^2$. The steps taken are shown by the red dots and solid red line. The idea is easily extended to models with more than one free parameter (for example, a straight line fit has two - the slope and the gradient):\n", "\n", "\n", "\n", "The greyscale in the image above shows the $\\chi^2$ space for a model with two parameters, $a$ and $b$. Dark areas represent low values of $\\chi^2$, blue contours show lines of constant $\\chi^2$. From a given starting point, the Levenberg-Marquardt algorithm follows the \"downhill\" direction in $\\chi^2$, and steps in that direction until $\\chi^2$ stops decreasing. This location gives the parameter values which minimise $\\chi^2$. The steps taken are shown by the red dots and solid red line.\n", "\n", "This second figure also illustrates an important point about model-fitting: there can be more than one \"valley\" of low $\\chi^2$ in the parameter space, and the Levenberg-Marquardt algorithm always sets off in the local downhill direction until the minimum is reached. This can lead to the function finding a local $\\chi^2$ minimum, which is not the *global* $\\chi^2$ minimum, as illustrated by the dashed red line.\n", "\n", "What this means for you in practice is that the \"best-fit\" you find for a model **may not be the overall best fit**, and will depend upon your starting position. It is **always** good practice when fitting a model to data to check that starting from different positions gives the same solution.\n", "\n", "### The Levenberg-Marquardt algorithm: practice\n", "\n", "Suppose we want to fit a model (any model) to some data. The first step is to write **a function that calculates the prediction of our model** at some location, $x$. The first argument of this function should be the location $x$. The remaining arguments are our model parameters. Therefore, for a straight line, our function looks like this (as always, read the documentation string carefully to understand how the function works):" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [ ], "source": [ "def funcToFit(x, m, c):\n", " \"\"\"Function to calculate a straight line\n", "\n", " Args\n", " --------\n", " x: np.ndarray, float\n", " the positions at which to evaluate the function\n", " m: float\n", " the gradient of the straight line\n", " c: float\n", " the intercept of the line\n", "\n", " Returns\n", " ---------\n", " np.ndarray, float\n", " the values predicted by our model at positions x\n", " \"\"\"\n", " return m*x + c" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "Now we can use the [curve fit](http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html#scipy.optimize.curve_fit) function in scipy's [optimize](http://docs.scipy.org/doc/scipy/reference/optimize.html) package, which uses the Levenberg-Marquardt algorithm to find the parameters which best fit our data.\n", "\n", "Look *closely* at the comments in the code below to understand how it works. ```curve_fit``` returns two variables. The first is a ```list``` of the best fitting model parameters. The second is the *covariance matrix*. You'll learn more about covariance matrices next year. For now you can assume that the square root of the diagonal elements of this matrix are our uncertainties, and these can easily be extracted using numpy's ```diag``` function " ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "collapsed": false }, "outputs": [ ], "source": [ "from scipy.optimize import curve_fit\n", "\n", "# we need an initial guess for our starting parameters\n", "# this is a list with one entry for each parameter\n", "p0 = [2, 2]\n", "\n", "\"\"\"\n", "We run curve fit itself below\n", "\n", "The first argument is the function that defines our model\n", "The next two arguments are the \"x\" and \"y\" data\n", "The fourth argument is our starting guess\n", "The final (optional) argument is the error bars, \"e\".\n", "\n", "Without the error bars, curve_fit will minimise the Sum-of-Squares.\n", "\n", "Curve fit returns two variables. The first is the best fitting\n", "model parameters, the second is the covariance matrix.\n", "\"\"\"\n", "popt,pcov = curve_fit(funcToFit, x, y, p0, sigma=e)\n", "\n", "# let's print out our fits\n", "# I use numpy's diag function to pick out the diagonal\n", "# elements of the covariance matrix\n", "print( popt )\n", "print( np.sqrt(np.diag(pcov)) )\n", "\n", "# also, let's use python's string formatting to print out in a \n", "# nice format. Note how I ensure the correct number of decimal places\n", "m, c = popt\n", "em, ec = np.sqrt(np.diag(pcov))\n", "print( 'Gradient = {0:.3f} +/- {1:.3f}'.format(m, em) )\n", "print( 'Intercept = {0:.1f} +/- {1:.1f}'.format(c, ec) )" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "
\n", "
\n", "

Plot the best fit

\n", "
\n", "
\n", "\n", "> Make a plot of the data and the best fit to it. Use error bars for the data points. For the best fit, plot a solid line.\n", "\n", "> You will need to use your function ```funcToFit``` and the best fit parameters ```popt``` to calculate the best fitting model at the points ```x```. You can then plot the result against ```x``` on the same graph as your data." ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "## More complex models\n", "\n", "Generalising the process to more complex models is easy - all we do is change the function we fit our data with - ```funcToFit``` (note, we could call this function anything, but ```funcToFit``` tells you what it is!)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "
\n", "
\n", "

Some complex modelling

\n", "
\n", "
\n", "\n", "> The data file ```/home/user/PHY241/data/Session7/complex_model.txt``` contains some data in the same format as ```data.txt``` but the model we think explains the data is more complex. We have reason to believe the data is explained by the model \n", "\n", "> $$ y = a \\cos{bx} + b \\sin{ax} $$\n", "\n", "> In the cell below, write a new version of ```funcToFit``` which encodes this model. Use the version of ```funcToFit``` above as a template to make sure you get it right.\n", "\n", "> The test cell below should run without errors if you've done things right..." ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "collapsed": false }, "outputs": [ ], "source": [ "# YOUR CODE HERE" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "collapsed": false }, "outputs": [ ], "source": [ "# this cell should raise no errors!\n", "from nose.tools import assert_almost_equal\n", "assert_almost_equal(funcToFit(2, 12, 2), -9.6548801743765917, places=6)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "
\n", "
\n", "

Fit the data

\n", "
\n", "
\n", "\n", "> In the cell(s) below, read the data from the file ```/home/user/PHY241/data/Session7/complex_model.txt``` into variables ```x```, ```y``` and ```e```.\n", "\n", "> Use ```curve_fit``` to fit your function above to the data, and plot the data and best fitting model together.\n", "\n", "> Start from a guess for the parameters of $a=-12$, $b=2$. Does your model fit your data? If you didn't get a good fit, try again from a starting position of $a=5$, $b=2$. Do you get a good fit now?\n", "\n", "> Create a markdown cell, explaining the results you found above." ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "collapsed": true }, "outputs": [ ], "source": [ "# YOUR CODE HERE - USE AS MANY CELLS AS YOU LIKE" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "---------\n", "
\n", "
\n", "

Homework #7

\n", "

Fitting Exoplanet Transit Data

\n", "
\n", "
\n", "\n", "## Important - this homework is OPTIONAL. There is no due date and it won't be marked. It's here because it's cool.\n", "\n", "As a reminder, we can use ```curve_fit``` to fit more or less any model we like, as long as we can write that model as a Python function. In the file ```transit.py``` I have written a Python function that calculates the shape of an exoplanet transit lightcurve. We can import this function from the Python file in the same we would import functions from ```scipy``` or ```numpy``` Let's import the function and look at it's documentation:" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "collapsed": false }, "outputs": [ ], "source": [ "from transit import transit\n", "help(transit)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "Your task in this homework is to fit the transit lightcurve of the exoplanet Wasp-4b. However, we don't want to fit all the parameters taken by the function ```transit```! I have been very kind, and found reference values for all but the planetary radius, the stellar radius and the inclination. These are the three parameters we want to fit, so we need to write a new function that hard-codes the other parameters in." ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "
\n", "
\n", "

Q1: Writing our function to fit (4 points)

\n", "
\n", "
\n", "\n", "> Complete the function below to write a new function which takes an array of times, plus three parameters for the planetary radius, the inclination and the limb darkening. Your function must call the transit function above, with the other parameters fixed at values found from the literature. These are:\n", "\n", "> Orbital period, $P = 1.38823187\\,$ days.\n", "\n", "> Time of mid-transit, $T_0 = 54748.150490$ days.\n", "\n", "> The limb darkening parameter, $\\mu = 0.311$." ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "collapsed": false, "deletable": false }, "outputs": [ ], "source": [ "def funcToFit(t, Rs, Rp, i):\n", " \"\"\"Calculate the transit shape due to an exoplanet.\n", "\n", " This function implements the model of exoplanetary transits found in\n", " Sackett et al (1999, ASIC, 532, 189). \n", "\n", " Args\n", " ----\n", " t: np.ndarray, float\n", " the times at which to calcualte the transit lightcurve. Units of days (MJD)\n", " Rs: float\n", " radius of star, divided by the orbital separation\n", " Rp: float\n", " radius of exoplanet, divided by the orbital separation\n", " i: float\n", " inclination, in degrees\n", "\n", " Returns\n", " --------\n", " np.ndarray, float\n", " the transit lightcurve, normalised to 1 outside of transit\n", " \"\"\"\n", " # WRITE YOUR CODE HERE" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "collapsed": false, "deletable": false }, "outputs": [ ], "source": [ "from nose.tools import assert_almost_equal, assert_equal\n", "t = [54822.925253333335,54748.150490]\n", "rp = 0.02883\n", "rs = 0.01827\n", "i = 88.8\n", "result = funcToFit(t,rs,rp,i)\n", "assert_equal(len(result),2)\n", "assert_almost_equal(result[0],1,places=5)\n", "assert_almost_equal(result[1],0.28553724389678403,places=5)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "
\n", "
\n", "

Q2: Fit the transit data (4 points)

\n", "
\n", "
\n", "\n", "> The transit data (time, flux, error) are stored in the text file ```/home/user/PHY241/data/Session7/wasp4_transit.txt```. Using the examples provided above as a template, read in the data file and fit the data using ```curve_fit```. Store the best fitting parameters in a variable named ```popt``` and the covariance matrix in a variable named ```pcov```.\n", "\n", "> You will need reasonable starting guesses for the scaled star and planet radii and the inclination. I suggest 0.2, 0.02 and 88 degrees respectively." ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "collapsed": false, "deletable": false }, "outputs": [ ], "source": [ "# YOUR CODE HERE" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "collapsed": false, "deletable": false }, "outputs": [ ], "source": [ "assert_almost_equal(popt[0],1.72786592e-01,places=3)\n", "assert_almost_equal(popt[1],2.70230620e-02,places=3)\n", "assert_almost_equal(popt[2],9.02249791e+01,places=3)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "
\n", "
\n", "

Q3: Plot the fit (2 points)

\n", "
\n", "
\n", "\n", "> Plot the data and best fit on a single plot." ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "collapsed": false, "deletable": false }, "outputs": [ ], "source": [ "# YOUR CODE HERE" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (system-wide)", "language": "python", "metadata": { "cocalc": { "description": "Python 3 programming language", "priority": 100, "url": "https://www.python.org/" } }, "name": "python3", "resource_dir": "/ext/jupyter/kernels/python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.9" } }, "nbformat": 4, "nbformat_minor": 4 }