{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# A papermill example: Fitting a model\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Specify default parameters\n", "\n", "This is a \"parameters\" cell, which defines the default values that papermill can override at runtime." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "parameters" ] }, "outputs": [], "source": [ "# Our default parameters\n", "# This cell has a \"parameters\" tag, which means that it defines the parameters for use in the notebook\n", "start_date = \"2001-08-05\"\n", "stop_date = \"2016-01-01\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Set up our packages and create the data\n", "\n", "We'll run `plt.ioff()` so that we don't get double plots in the notebook." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "import papermill as pm\n", "plt.ioff()\n", "np.random.seed(1337)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Generate some fake data by date\n", "dates = pd.date_range(\"2010-01-01\", \"2017-01-01\")\n", "data = pd.DataFrame(np.random.randn(len(dates)), index=dates, columns=['mydata'])\n", "data = data.rolling(100).mean()  # Smooth it so it looks purdy" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Choose a subset of data to highlight\n", "\n", "Here we use the **start_date** and **stop_date** parameters, which are defined above by default, but can\n", "be overridden at runtime by papermill." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data_highlight = data.loc[start_date:stop_date]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We use the `pm.record()` function to keep track of how many records were included in the\n", "highlighted section. 
This lets us inspect this value after running the notebook with papermill.\n", "\n", "We also raise a `ValueError` if there's a bug in the start/stop times that leaves no records to highlight.\n", "papermill will capture and display the error if it's triggered." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "num_records = len(data_highlight)\n", "pm.record('num_records', num_records)\n", "if num_records == 0:\n", "    raise ValueError(\"I have no data to highlight! Check that your dates are correct!\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Make our plot\n", "\n", "Below we'll generate a matplotlib figure with our highlighted dates. Calling `pm.display()` makes papermill\n", "store the figure under the key that we've specified (`highlight_dates_fig`). This will let us inspect the\n", "output later on." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots()\n", "ax.plot(data.index, data['mydata'], c='k', alpha=.5)\n", "ax.plot(data_highlight.index, data_highlight['mydata'], c='r', lw=3)\n", "ax.set(title=\"Start: {}\\nStop: {}\".format(start_date, stop_date))\n", "pm.display('highlight_dates_fig', fig)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.5" } }, "nbformat": 4, "nbformat_minor": 2 }