{ "cells": [ { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "# Exercises: Maximum Likelihood Estimation \n", "By Christopher van Hoecke, Max Margenot, and Delaney Mackenzie\n", "\n", "## Lecture Link : \n", "https://www.quantopian.com/lectures/maximum-likelihood-estimation\n", "\n", "###IMPORTANT NOTE: \n", "This lecture corresponds to the Maximum Likelihood Estimation lecture, which is part of the Quantopian lecture series. This homework expects you to rely heavily on the code presented in the corresponding lecture. Please copy and paste regularly from that lecture when starting to work on the problems, as trying to do them from scratch will likely be too difficult.\n", "\n", "Part of the Quantopian Lecture Series:\n", "\n", "* [www.quantopian.com/lectures](https://www.quantopian.com/lectures)\n", "* [github.com/quantopian/research_public](https://github.com/quantopian/research_public)\n", "\n", "Notebook released under the Creative Commons Attribution 4.0 License.\n", "\n", "----" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "## Key concepts\n", "Normal Distribution MLE Estimators: \n", "$$\n", "\\hat\\mu = \\frac{1}{T}\\sum_{t=1}^{T} x_t \\\\\\qquad \\hat\\sigma = \\sqrt{\\frac{1}{T}\\sum_{t=1}^{T}{(x_t - \\hat\\mu)^2}}\n", "$$\n", "\n", "\n", "Exponential Distribution MLE Estimators: \n", "$$\\hat\\lambda = \\frac{\\sum_{t=1}^{T} x_t}{T}$$" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "deletable": true, "editable": true }, "outputs": [], "source": [ "# Useful Libraries\n", "import pandas as pd\n", "import math\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import scipy\n", "import scipy.stats as stats" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "----" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "# Exercise 1: Normal Distribution\n", "- Given the equations above, write down functions to calculate the MLE estimators $\\hat{\\mu}$ and $\\hat{\\sigma}$ of the normal distribution. \n", "- Given the sample normally distributed set, find the maximum likelihood $\\hat{\\mu}$ and $\\hat{\\sigma}$.\n", "- Fit the data to a normal distribution using SciPy. Compare SciPy's calculated parameters with your calculated values of $\\hat{\\mu}$ and $\\hat{\\sigma}$.\n", "- Plot a normal distribution PDF with your estimated parameters" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "deletable": true, "editable": true }, "outputs": [], "source": [ "# Normal mean and standard deviation MLE estimators\n", "def normal_mu(X):\n", " # Get the number of observations\n", " T = #______# Your code goes here\n", " # Sum the observations\n", " s = #______# Your code goes here\n", " return 1.0/T * s\n", "\n", "def normal_sigma(X):\n", " T = #______# Your code goes here\n", " # Get the mu MLE\n", " mu = #______# Your code goes here\n", " # Sum the square of the differences\n", " s = #______# Your code goes here\n", " # Compute sigma^2\n", " sigma_squared = \n", " return math.sqrt(sigma_squared)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "deletable": true, "editable": true }, "outputs": [], "source": [ "# Normal Distribution Sample Data\n", "TRUE_MEAN = 40\n", "TRUE_STD = 10\n", "X = np.random.normal(TRUE_MEAN, TRUE_STD, 10000000)\n", "\n", "# Use your functions to compute the MLE mu and sigma\n", "mu = #______# Your code goes here\n", "std = #______# Your code goes here\n", "\n", "print 'Maximum likelihood value of mu:', mu\n", "print 'Maximum likelihood value for sigma:', std" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "deletable": true, "editable": true }, "outputs": [], "source": [ "# Fit the distribution using SciPy and compare those parameters with yours \n", "scipy_mu, scipy_std = #______# Your code goes here\n", "print 'Scipy Maximum likelihood value of mu:', scipy_mu\n", "print 'Scipy Maximum likelihood value for sigma:', scipy_std" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "deletable": true, "editable": true }, "outputs": [], "source": [ "# Get the PDF, fill it with your calculated parameters, and plot it along x\n", "x = np.linspace(0, 80, 80)\n", "\n", "plt.hist(X, bins=x, normed='true')\n", "plt.plot(pdf(x, loc=mu, scale=std), color='red')\n", "plt.xlabel('Value')\n", "plt.ylabel('Observed Frequency')\n", "plt.legend(['Fitted Distribution PDF', 'Observed Data', ]);" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "------" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "# Exercise 2: Exponential Distribution\n", "- Given the equations above, write down functions to calculate the MLE estimator $\\hat{\\lambda}$ of the exponential distribution\n", "- Given the sample exponentially distributed set, find the maximum likelihood\n", "- Fit the data to an exponential distribution using SciPy. Compare SciPy's calculated parameter with your calculated values of $\\hat{\\lambda}$\n", "- Plot an exponential distribution PDF with your estimated parameter" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "deletable": true, "editable": true }, "outputs": [], "source": [ "def exp_lambda(X):\n", " T = #______# Your code goes here\n", " s = #______# Your code goes here\n", " return s/T" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "deletable": true, "editable": true }, "outputs": [], "source": [ "# Exponential distribution sample data\n", "TRUE_LAMBDA = 5\n", "X = np.random.exponential(TRUE_LAMBDA, 1000)\n", "\n", "# Use your functions to compute the MLE lambda\n", "lam = #______# Your code goes here\n", "print \"Lambda estimate: \", lam" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "deletable": true, "editable": true }, "outputs": [], "source": [ "# Fit the distribution using SciPy and compare that parameter with yours \n", "_, l = #______# Your code goes here\n", "print 'Scipy lambds estimate: ', l" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "deletable": true, "editable": true }, "outputs": [], "source": [ "# Get the PDF, fill it with your calculated parameter, and plot it along x\n", "x = range(0, 80)\n", "\n", "plt.hist(X, bins=x, normed='true')\n", "plt.plot(pdf(x, scale=l), color = 'red')\n", "plt.xlabel('Value')\n", "plt.ylabel('Observed Frequency')\n", "plt.legend(['Fitted Distribution PDF', 'Observed Data', ]);" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "----" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "# Exercise 3 : Fitting Data Using MLE\n", "- Using the MLE estimators laid out in the lecture, fit the returns for SPY from 2014 to 2015 to a normal distribution. \n", "- Check for normality using the Jarque-Bera test" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "deletable": true, "editable": true }, "outputs": [], "source": [ "prices = get_pricing('SPY', \n", " fields='price', \n", " start_date='2016-01-04', \n", " end_date='2016-01-05', \n", " frequency = 'minute')\n", "returns = prices.pct_change()[1:]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "deletable": true, "editable": true }, "outputs": [], "source": [ "mu = #______# Your code goes here\n", "std = #______# Your code goes here\n", "\n", "x = np.linspace(#______# Your code goes here)\n", "h = plt.hist(#______# Your code goes here)\n", "l = plt.plot(#______# Your code goes here)\n", "plt.show(h, l);" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "Recall that this fit **only** makes sense **if** we have normally distributed data." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "deletable": true, "editable": true }, "outputs": [], "source": [ "alpha = 0.05\n", "stat, pval = #______# Your code goes here\n", "print pval\n", "\n", "if pval > alpha: \n", " print 'Accept our null hypothesis'\n", "if pval < alpha: \n", " print 'Reject our null hypothesis'" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "---\n", "\n", "Congratulations on completing the Maximum Likelihood Estimation exercises!\n", "\n", "As you learn more about writing trading models and the Quantopian platform, enter the daily [Quantopian Contest](https://www.quantopian.com/contest). Your strategy will be evaluated for a cash prize every day.\n", "\n", "Start by going through the [Writing a Contest Algorithm](https://www.quantopian.com/tutorials/contest) tutorial." ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "*This presentation is for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation for any security; nor does it constitute an offer to provide investment advisory or other services by Quantopian, Inc. (\"Quantopian\"). Nothing contained herein constitutes investment advice or offers any opinion with respect to the suitability of any security, and any views expressed herein should not be taken as advice to buy, sell, or hold any security or as an endorsement of any security or company. In preparing the information contained herein, Quantopian, Inc. has not taken into account the investment needs, objectives, and financial circumstances of any particular investor. Any views expressed and data illustrated herein were prepared based upon information, believed to be reliable, available to Quantopian, Inc. at the time of publication. Quantopian makes no guarantees as to their accuracy or completeness. All information is subject to change and may quickly become unreliable for various reasons, including changes in market conditions or economic circumstances.*" ] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.12" } }, "nbformat": 4, "nbformat_minor": 2 }