{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Radio Light Curve Module\n",
"\n",
"**Lecturer:** David Kaplan
\n",
"**Jupyter Notebook Authors:** Dougal Dobie, David Kaplan & Cameron Hummels\n",
"\n",
"This is a Jupyter notebook lesson taken from the GROWTH Summer School 2020. For other lessons and their accompanying lectures, please see: http://growth.caltech.edu/growth-school-2020.html\n",
"\n",
"## Objective\n",
"Produce and analyze radio light curve data, fitting a broken power law to it.\n",
"\n",
"## Key steps\n",
"- Plot the radio light curve\n",
"- Determine its spectral index\n",
"- Scale the data based on its spectral index\n",
"- Fit the data with a power law\n",
"- Fit the data with a broken power law with a smooth turnover\n",
"- Interpret these results with a physical model\n",
"\n",
"## Required dependencies\n",
"\n",
"See GROWTH school webpage for detailed instructions on how to install these modules and packages. Nominally, you should be able to install the python modules with `pip install `. The external astromatic packages are easiest installed using package managers (e.g., `rpm`, `apt-get`).\n",
"\n",
"### Python modules\n",
"* python 3\n",
"* astropy\n",
"* numpy\n",
"* scipy\n",
"* matplotlib\n",
"* emcee\n",
"* corner\n",
"\n",
"### External packages\n",
"None"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"import emcee\n",
"from astropy.io import ascii\n",
"import corner\n",
"import os\n",
"from timeit import default_timer as timer\n",
"import matplotlib.colors as colors\n",
"import matplotlib.cm as cmx\n",
"from scipy.optimize import least_squares, curve_fit\n",
"from scipy.stats import f"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Load the data\n",
"We write a basic function to load in the radio lightcurve. It's pretty easy using `astropy`. But you should look at the file `radio_lightcurve.dat` and make sure that the data returned look appropriate. It's also good to keep track of some basic info like:\n",
"1. How many observations are there?\n",
"2. What is the first observation date? What is the last?\n",
"3. What is the lowest frequency? What is the highest frequency?\n",
"4. Do all of the observations actually report _detections_, where the source is detected at $>3\\sigma$ significance?\n",
"\n",
"We will use the `astropy` `Table` module, so each data-set works as a table. This way all relevant columns get treated together. This isn't required, but it makes things easier."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The first few lines of `radio_lightcurve.dat`\n",
"\n",
"```\n",
"delta_t telescope frequency flux rms\n",
"10.37 VLA 6.2 7.8 2.5\n",
"16.42 VLA 3.0 18.7 6.3\n",
"17.39 VLA 3.0 15.1 3.9\n",
"...\n",
"```\n",
"\n",
"The first line is the header (with `delta_t` = $\\Delta_t$ the time since the GW trigger in days)."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"def load_data(filename='data/radio_lightcurve.dat'):\n",
" data = ascii.read(filename)\n",
" return data\n",
"data = load_data()"
]
},
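{
"cell_type": "markdown",
"metadata": {},
"source": [
"A quick way to answer these questions is to interrogate the table directly; a sketch, assuming the column names shown above (`delta_t`, `frequency`, `flux`, `rms`):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Number of observations\n",
"print(\"N_obs = %d\"%len(data))\n",
"\n",
"# First and last observation (days post-trigger)\n",
"print(\"delta_t range: %.2f - %.2f days\"%(data['delta_t'].min(), data['delta_t'].max()))\n",
"\n",
"# Lowest and highest observing frequency (GHz)\n",
"print(\"frequency range: %.2f - %.2f GHz\"%(data['frequency'].min(), data['frequency'].max()))\n",
"\n",
"# Are all of the measurements >3-sigma detections?\n",
"print(\"all detected at >3 sigma:\", np.all(data['flux'] > 3*data['rms']))"
]
},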
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Answers to the questions above:\n",
"\n",
"1.\n",
"\n",
"2.\n",
"\n",
"3.\n",
"\n",
"4.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Make a plot of the radio lightcurve\n",
"Plot the flux density as a function of time. This works best as a log-log plot. Making a basic plot is easy, but we want to be a little fancier. There are some variations we might make later on so it helps to modularise your code *now*. \n",
"\n",
"For bonus points use different markers for each telescope, and use a colour scale to denote the observation frequency.\n",
"\n",
"The basic elements are below. Fill in the `plot_data` and `make_plot` functions\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"ename": "IndentationError",
"evalue": "expected an indented block (, line 7)",
"output_type": "error",
"traceback": [
"\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m7\u001b[0m\n\u001b[0;31m return\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mIndentationError\u001b[0m\u001b[0;31m:\u001b[0m expected an indented block\n"
]
}
],
"source": [
"def plot_data(ax, sm, data, scaled=False, **kwargs):\n",
" telescope_marker_dict = {'VLA':'s', 'ATCA':'o', 'GMRT':'d'}\n",
" \n",
" \n",
" for row in data:\n",
" #Loop over each row of the data, set the marker colour based on frequency and the marker style based on telescope\n",
" return\n",
"\n",
"def cmap_setup(cmap='viridis', min_freq=0, max_freq=17):\n",
" '''\n",
" This function will set up a scalar map for you to colour your markers by frequency\n",
" '''\n",
" freq_cmap = plt.cm.get_cmap(cmap)\n",
" \n",
" cNorm = colors.Normalize(vmin=min_freq, vmax=max_freq)\n",
" scalarMap = cmx.ScalarMappable(norm=cNorm, cmap=cmap)\n",
" sm = scalarMap\n",
" sm._A = []\n",
" \n",
" return sm \n",
" \n",
"def make_plot(data, scaled=False, model=None, params=None, tvals=np.arange(10,400), plot_models=False):\n",
" fig = plt.figure(figsize=(10,6))\n",
" ax = fig.add_subplot(111)\n",
" \n",
" #Get the scalar map, plot the data using the plot_data function above\n",
" sm = cmap_setup()\n",
" plot_data(ax, sm, data, scaled=scaled)\n",
" \n",
" \n",
" #Set up a colourbar\n",
"\n",
" \n",
" #Set axis scales to log\n",
" \n",
" \n",
" #Label axes, set axis limits etc.\n",
" \n",
"make_plot(data)"
]
},
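{
"cell_type": "markdown",
"metadata": {},
"source": [
"Below is one possible implementation (a sketch, not the only valid approach). It assumes the flux units in `radio_lightcurve.dat` are $\\mu$Jy, and it wires in hooks for the `plot_model()` and `plot_physical_models()` functions written later in this notebook:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def plot_data(ax, sm, data, scaled=False, **kwargs):\n",
"    telescope_marker_dict = {'VLA':'s', 'ATCA':'o', 'GMRT':'d'}\n",
"    \n",
"    for row in data:\n",
"        # marker colour from the observing frequency, marker style from the telescope\n",
"        color = sm.to_rgba(row['frequency'])\n",
"        marker = telescope_marker_dict[row['telescope']]\n",
"        if scaled:\n",
"            flux, rms = row['scaled_flux'], row['scaled_rms']\n",
"        else:\n",
"            flux, rms = row['flux'], row['rms']\n",
"        ax.errorbar(row['delta_t'], flux, yerr=rms, color=color, marker=marker, linestyle='', **kwargs)\n",
"    return\n",
"\n",
"def make_plot(data, scaled=False, model=None, params=None, tvals=np.arange(10,400), plot_models=False):\n",
"    fig = plt.figure(figsize=(10,6))\n",
"    ax = fig.add_subplot(111)\n",
"    \n",
"    sm = cmap_setup()\n",
"    plot_data(ax, sm, data, scaled=scaled)\n",
"    \n",
"    # colourbar, log axes and labels\n",
"    cbar = fig.colorbar(sm, ax=ax)\n",
"    cbar.set_label('Frequency (GHz)')\n",
"    ax.set_xscale('log')\n",
"    ax.set_yscale('log')\n",
"    ax.set_xlabel('Time since merger (days)')\n",
"    ax.set_ylabel('Flux density ($\\\\mu$Jy)')\n",
"    \n",
"    # hooks used later in the notebook\n",
"    if model is not None and params is not None:\n",
"        plot_model(model, params, tvals, ax)\n",
"    if plot_models:\n",
"        plot_physical_models(ax)\n",
"\n",
"make_plot(data)"
]
},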
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Determining the spectral index\n",
"As with many things in astrophysics, the emission in the radio regime is _non-thermal_ in origin (unlike the early emission in the optical/UV/infrared, which is a thermal blackbody). What that means is that generally has a power-law spectrum:\n",
"$$ S_\\nu(\\nu) \\propto \\nu^\\alpha$$\n",
"Strictly this only works for data observed at _exactly_ the same time, but real observations don't work that way. A single telescope can (usually) only observe at a single frequency, and different telescopes are separated in time by schedules, longitude, etc. So we need to be a bit more generous about how we define simultaneous. Luckily, the light-curve is mostly evolving pretty slowly, so this isn't necessarily a problem.\n",
"\n",
"Write a function to take a subset of the data that was observed roughly simultaneously and calculate the spectral index $\\alpha$ and its uncertainty. Using multi-band observation at 162 days post-merger calculate the spectral index. Questions:\n",
"1. How many observations did you use, over what time window?\n",
"2. What do you get for $\\alpha$ and its uncertainty?\n",
"3. Do you think using a finite time window introduced significant errors into this calculation?"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def calc_power_law(freq,S0,alpha):\n",
" S = S0 * (freq) ** alpha\n",
" return S\n",
"\n",
"def alpha_calc(data):\n",
" '''\n",
" Write a function to take in some subset of data and fit a power law to the spectrum using the curve_fit function.\n",
" Your function should return a tuple containing the spectral index and associated uncertainty.\n",
" \n",
" Hint: Don't forget to include uncertainties (and set absolute_sigma=True)\n",
" '''\n",
" \n",
" \n",
" return alpha, alpha_err"
]
},
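{
"cell_type": "markdown",
"metadata": {},
"source": [
"One possible way to fill in `alpha_calc()` is sketched below (the `p0` starting guess is an arbitrary assumption):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def alpha_calc(data):\n",
"    # fit S = S0 * nu**alpha to the (roughly) simultaneous measurements,\n",
"    # weighting by the measurement uncertainties\n",
"    popt, pcov = curve_fit(calc_power_law, data['frequency'], data['flux'],\n",
"                           sigma=data['rms'], absolute_sigma=True, p0=(50, -0.6))\n",
"    \n",
"    alpha = popt[1]\n",
"    alpha_err = np.sqrt(pcov[1][1])\n",
"    \n",
"    return alpha, alpha_err"
]
},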
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"#Select the data at the ~162 day epoch and print the spectral index + uncertainty\n",
"\n",
"sel_data = data[data['delta_t'] == 162.89]\n",
"\n",
"alpha, alpha_err = alpha_calc(sel_data)\n",
"\n",
"print(\"alpha = %.1f+/-%.1f\"%(alpha, alpha_err))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Answers to the questions above:\n",
"\n",
"1.\n",
"\n",
"2.\n",
"\n",
"3.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Scaling the data based on the spectral Index\n",
"Based on the $\\alpha$ you calculated above, what happens if you assume that _all_ of the observations can be fit using the same frequency power-law? i.e., if $\\alpha$ were the same at all times? If this is the case then scaling all of the data to a single frequency makes it easier to understand the temporal properties of the lightcurve as a function of only 1 variable, not 2.\n",
"\n",
"Write a function to take the observed data and scale it to a specific frequency based on an estimated spectral index. \n",
"\n",
"Don't forget to include uncertainties! You should add two columns to your data table called \"scaled_flux\" and \"scaled_rms\".\n",
"\n",
"Questions:\n",
"1. Can you think of reasons why the spectral index should stay the same? Why it should change?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Answer to the question above:\n",
"\n",
"1.\n"
]
},
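{
"cell_type": "markdown",
"metadata": {},
"source": [
"One way to set up the scaling (a sketch of the algebra, not the only valid convention): to move a measurement $S_\\nu$ from frequency $\\nu$ to the reference frequency $\\nu_0$,\n",
"\n",
"$$S_{\\nu_0} = S_\\nu \\left(\\frac{\\nu_0}{\\nu}\\right)^\\alpha$$\n",
"\n",
"and, propagating the measurement uncertainty $\\sigma_S$ and the spectral index uncertainty $\\sigma_\\alpha$ in quadrature,\n",
"\n",
"$$\\sigma_{S_{\\nu_0}}^2 = \\left(\\frac{\\nu_0}{\\nu}\\right)^{2\\alpha}\\sigma_S^2 + \\left[S_{\\nu_0}\\ln\\left(\\frac{\\nu_0}{\\nu}\\right)\\right]^2\\sigma_\\alpha^2$$"
]
},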
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def scale_data(data, alpha, alpha_err, ref_freq=3.0):\n",
" #calculate a scaling factor for the flux density and uncertainty\n",
" f_scale = \n",
" rms_scale = \n",
" \n",
" #scale the flux and uncertainty - don't forget to add errors in quadrature\n",
" scaled_flux = \n",
" scaled_rms = \n",
" \n",
" #Add two new columns to the data table\n",
" data['scaled_flux'] = scaled_flux\n",
" data['scaled_rms'] = scaled_rms\n",
" \n",
" return data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Modify your `plot_data()` and `make_plot()` functions to add a keyword parameter \"scaled\" that is by default False. If `scaled=True`, `plot_data()` should plot the scaled data instead of the observed data.\n",
"\n",
"Scale the data to 3 GHz based on your estimated spectral index and associated uncertainty, then plot the scaled lightcurve"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"data = scale_data(data, -0.6,0.1)\n",
"make_plot(data, scaled=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Fitting (some of) the data\n",
"We now want to characterise the radio lightcurve. You should be able to see that it initially rises according to a power law, peaks somewhere between 100 and 200 days post-merger and then declines according to a different power law.\n",
"\n",
"However, when we published the first paper demonstrating evidence of a turnover we only had observations up to 200 days post-merger. We will now determine evidence for a turnover using that subset of data.\n",
"\n",
"Create a new table called `tdata` with the data up to 200 days post-merger and plot the data using your `make_plot()` function."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"ename": "NameError",
"evalue": "global name 'ascii' is not defined",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mdata\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mload_data\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2\u001b[0m \u001b[0mdata\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mscale_data\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mdata\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m-\u001b[0m\u001b[0;36m0.6\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m0.05\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0;31m# use nice Table functionality to select only a subset of the data\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[0;31m# instead of needing to use explicit indices (like where()) we can just\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0;31m# index with a boolean array\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m\u001b[0m in \u001b[0;36mload_data\u001b[0;34m(filename)\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0mdatadir\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m''\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mload_data\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mfilename\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;34m'radio_lightcurve.dat'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 4\u001b[0;31m \u001b[0mdata\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mascii\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mread\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mfilename\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 5\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0mdata\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 6\u001b[0m \u001b[0mos\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mchdir\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mdatadir\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;31mNameError\u001b[0m: global name 'ascii' is not defined"
]
}
],
"source": [
"data = load_data()\n",
"data = scale_data(data, -0.6,0.05)\n",
"# use nice Table functionality to select only a subset of the data\n",
"# instead of needing to use explicit indices (like where()) we can just \n",
"# index with a boolean array\n",
"tdata = data[data['delta_t'] < 200]\n",
"\n",
"make_plot(tdata, scaled=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can fit this data with a \"smoothed broken power law\", which combines two power laws with a smoothing parameter around the break point. One functional form of this is given by\n",
"\n",
"$S_\\nu(t) = S_{\\rm \\nu,peak} \\left[ \\left(\\dfrac{t}{t_{\\rm peak}}\\right)^{-s\\delta_1} + \\left(\\dfrac{t}{t_{\\rm peak}}\\right)^{-s\\delta_2}\\right]^{-1/s}$\n",
"\n",
"Here the spectral index is still $\\alpha$, but we've also introduced _temporal_ power-law indices $\\delta_1$ (before the break) and $\\delta_2$ (after the break). $S_{\\rm \\nu,peak}$ is the flux density at peak, and $s$ controls the smoothness of the break.\n",
"\n",
"Write a function `smooth_broken_power_law()` that outputs a smoothed broken power law that also scales based on frequency and spectral index"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def smooth_broken_power_law(t, nu, S_peak, t_peak, delta_1, delta_2, alpha, log_s, nu0=3.0):\n",
"\n",
" \n",
" return "
]
},
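{
"cell_type": "markdown",
"metadata": {},
"source": [
"A sketch of one way to write it, combining the smoothed broken power law above with the same $(\\nu/\\nu_0)^\\alpha$ frequency scaling used elsewhere in this notebook (note that the smoothness is passed as $\\log_{10} s$):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def smooth_broken_power_law(t, nu, S_peak, t_peak, delta_1, delta_2, alpha, log_s, nu0=3.0):\n",
"    s = 10**log_s\n",
"    \n",
"    # broken power law in time, scaled in frequency assuming spectral index alpha\n",
"    S = S_peak * ((t/t_peak)**(-s*delta_1) + (t/t_peak)**(-s*delta_2))**(-1./s)\n",
"    \n",
"    return (nu/nu0)**alpha * S"
]
},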
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Determining evidence for a turnover\n",
"\n",
"We now want to fit a smoothed broken power law to our data and see if there is evidence for a turnover in the radio lightcurve. In our paper we do this via a parameter grid-search to minimise the goodness-of-fit parameter $\\chi^2$, i.e., compute $\\chi^2(S_{\\nu,\\rm peak},t_{\\rm peak},\\delta_1,\\delta_2,\\alpha,s)$ for a 6-dimension parameter grid. However, grid searches are slow and innefficient. It's better to concentrate your effort in the part of the fit where the data \"prefer\" to go. We can do this using a slightly more complicated statistical technique\n",
"\n",
"Here we will perform an Markov Chain Monte Carlo (MCMC) maximum likelihood fit using the [`emcee`](http://dfm.io/emcee/current/) package, to determine lightcurve parameters and the spectral index of the source. First you will need to write 3 functions that define your Probability, Prior and Likelihood.\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you want more details about MCMC fitting etc., there are many great resources out there (start with the `emcee` page above). But basically we can go back to Bayes' theorem:\n",
"\n",
"$$ P({\\rm model} | {\\rm data}) \\propto P({\\rm model}) P({\\rm data} | {\\rm model}) $$\n",
"\n",
"where $P({\\rm model} | {\\rm data})$ is the probability of the model given the data, which is what we want to find out. The term $P({\\rm model})$ is the *prior*: any information we have about what the model parameters could be (or could not be). The term $P({\\rm data} | {\\rm model})$ is the *likelihood*: how probable would it be to generate the data given the set of model parameters we have. If we can integrate this product over all possible model parameters, we can end up with the distributions of how likely it would be to have model parameters starting from the data, which is what we want. (Some of you will note that we have left out the denominator above which is often called the *evidence* for the data, but we are just treating is at a normalizing factor to be ignored)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We will use uniform *priors* with:\n",
"1. $\\delta_1>0$ (since we require the lightcurve to initially rise)\n",
"2. $0 0.0 and log_s < 2:\n",
" return 0.0\n",
" \n",
" else:\n",
" return -np.inf"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We will now write a likelihood function that takes the lightcurve parameters inside the tuple theta, along with the observed data. As before, we pass the log of the likelihood. The simple assumption is that the likelihood ${\\cal L}$:\n",
"\n",
"$$ {\\cal L} \\propto \\frac{1}{\\sigma^2}e^{-\\chi^2/2}$$\n",
"\n",
"So $\\log {\\cal L} = -\\chi^2 / 2 - \\log \\sigma^2$ plus a constant (which we will ignore). We will also be explicit about defining our $\\chi^2$:\n",
"\n",
"$$ \\chi^2(S_{\\nu,\\rm peak},t_{\\rm peak},\\delta_1,\\delta_2,\\alpha,s) = \\sum_i \\left(\\frac{{\\rm data}_i - {\\rm model}_i(S_{\\nu,\\rm peak},t_{\\rm peak},\\delta_1,\\delta_2,\\alpha,s)}{\\sigma_i}\\right)^2$$\n",
"\n",
"with $\\sigma_i$ the uncertainty on each point, data$_i$ the data, and model$_i$ the model (a function of the parameters). Minimizing $\\chi^2$ will maximize the likelihood (or the log likelihood)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def lnlike(theta, t, nu, S, S_err):\n",
" S_peak, t_peak, delta_1, delta_2, alpha, log_s = theta\n",
"\n",
" model = smooth_broken_power_law(t, nu, S_peak, t_peak, delta_1, delta_2, alpha, log_s)\n",
" inv_sigma2 = 1.0/S_err**2\n",
"\n",
" return -0.5*(np.sum((S-model)**2*inv_sigma2 - 2*np.log(inv_sigma2)))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now write a function to calculate the marginal probability using the `lnlike()` and `lnprior()` functions you calculated above. Since we are taking the log, the produce above becomes a sum:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def lnprob(theta, t, nu, S, S_err):\n",
" lp = lnprior(theta)\n",
"\n",
" if not np.isfinite(lp):\n",
" return -np.inf\n",
"\n",
" return lp + lnlike(theta, t, nu, S, S_err)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We will now fit the data using the `emcee` package. The function `get_starting_pos()` provided below will set up an array of \"walker\" starting positions for given lightcurve parameters. Examine the lightcurve and estimate some reasonable values for these parameters and add them to the function.\n",
"\n",
"Now write a function called `run_mcmc()` that will load the observed data, take the starting position and then run the emcee Ensemple Sampler. Use a small number of iterations and walkers initially (100/20) to see how long the code takes to run on your laptop. Then increase both parameters to a larger number so that the algorithm takes ~90 seconds to run."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def get_starting_pos(nwalkers, ndim=6):\n",
" S_peak = \n",
" t_peak = \n",
" delta_1 = \n",
" delta_2 = \n",
" alpha = \n",
" log_s = \n",
" \n",
" pos = [np.asarray([S_peak, t_peak, delta_1, delta_2, alpha, log_s]) + 1e-4*np.random.randn(ndim) for i in range(nwalkers)]\n",
" \n",
" return pos"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def run_mcmc(data, niters=500, nthreads=1, nwalkers=200, ndim=6):\n",
" t = data['delta_t']\n",
" nu = data['frequency']\n",
" S = data['flux']\n",
" S_err = data['rms']\n",
" \n",
" pos = get_starting_pos(nwalkers, ndim=ndim)\n",
" \n",
" sampler = emcee.EnsembleSampler(nwalkers, ndim, lnprob, args=(t, nu, S, S_err),threads=nthreads)\n",
" \n",
" start = timer()\n",
" sampler.run_mcmc(pos, niters, progress=True)\n",
" end = timer()\n",
" \n",
" print(\"Computation time: %f s\"%(end-start))\n",
" \n",
" return sampler"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We now want to inspect our chain to see if our algorithm has converged to a reasonable solution. First, extract the chain from the sampler, and then write a function to make a figure showing how each walker moves around the parameter space. Your figure should have 6 subplots (1 for each dimension), iteration number on the x-axis and parameter value on the y-axis.\n",
"\n",
"MCMC algorithms typically use a burn-in phase, where the sampler is moving towards the optimum solution and not yet accurately sampling the parameter space. Add a parameter chain_cut to your function that plots a vertical line at the end of the burn-in. Remember: we want *fuzzy caterpillars*."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"sampler = run_mcmc(tdata)\n",
"chain = sampler.chain"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def make_chain_plot(chain, chain_cut):\n",
" niters = chain.shape[1]\n",
" ndim = chain.shape[2]\n",
"\n",
" fig, axes = plt.subplots(ndim,1,sharex=True)\n",
" fig.set_size_inches(7, 20)\n",
" \n",
" param_names = ['$S_{{\\\\nu,\\\\rm peak}, 3\\.{\\\\rm GHz}}$', '$t_{{\\\\rm peak}}$','$\\\\delta_1$','$\\\\delta_2$', '$\\\\alpha$', '$\\\\log_{10}(s)$']\n",
"\n",
" for i, (ax,param_name) in enumerate(zip(axes,param_names)):\n",
" #plot the chain for the given parameter\n",
" \n",
" \n",
" \n",
" ax.set_ylabel(param_name)\n",
" ax.set_xlim(0,niters)\n",
" \n",
" \n",
" ax.axvline(chain_cut,c='r',linestyle='--')\n",
"\n",
"chain_cut = \n",
"\n",
"make_chain_plot(chain, chain_cut)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You might want to print some of those to make sure they make sense and are consistent with your corner plots above.\n",
"\n",
"Now that we know that our algorithm is converging, and we know how long the burn-in takes we can begin to estimate parameters. The function below will make a corner plot from the good part of your chain (we are ignoring \"thinning,\" which is needed in general)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"flat_samples = sampler.get_chain(discard=chain_cut, flat=True)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"ename": "NameError",
"evalue": "name 'flat_samples' is not defined",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0mplt\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msavefig\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0msavefile\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 6\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 7\u001b[0;31m \u001b[0mmake_corner_plot\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mflat_samples\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;31mNameError\u001b[0m: name 'flat_samples' is not defined"
]
}
],
"source": [
"def make_corner_plot(flat_samples, savefile='corner.png'):\n",
" param_names = ['$S_{{\\\\rm peak}, 3\\.{\\\\rm GHz}}$', '$t_{{\\\\rm peak}}$','$\\\\delta_1$','$\\\\delta_2$', '$\\\\alpha$', '$\\\\log_{10}(s)$']\n",
" \n",
" fig = corner.corner(flat_samples, labels=param_names, quantiles=[0.16, 0.5, 0.84], show_titles=True)\n",
" plt.savefig(savefile)\n",
"\n",
"make_corner_plot(flat_samples)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The function below will then extract the median and uncertainty (1 standard deviation) from the chain."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def get_best_params(chain):\n",
" ndim = chain.shape[1]\n",
" \n",
" vals = map(lambda v: (v[1], v[2]-v[1], v[1]-v[0]), zip(*np.percentile(chain, [16, 50, 84],axis=0)))\n",
" \n",
" param_names = ['S_peak', 't_peak', 'delta_1', 'delta_2', 'alpha', 'log_s']\n",
" \n",
" param_dict = dict(zip(param_names,vals))\n",
" \n",
" return param_dict\n",
" \n",
" \n",
"best_params = get_best_params(good_chain)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now write a function, `calc_chi2()`, that will calculate the $\\chi^2$ for the fit. We will use this later to compare different lightcurve models"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def calc_chi2(best_params, param_names, model, data, nu0=3.0):\n",
" args = []\n",
" for param in param_names:\n",
" val = best_params[param][0]\n",
" args.append(val)\n",
"\n",
" best_fit = model(data['delta_t'], nu0, *args)\n",
" \n",
" chi2 = np.sum((best_fit-data['scaled_flux'])**2/data['scaled_rms']**2)\n",
" \n",
" return chi2\n",
"\n",
"param_names = ['S_peak', 't_peak', 'delta_1', 'delta_2', 'alpha', 'log_s']\n",
"\n",
"chi2_best = calc_chi2(best_params, param_names, smooth_broken_power_law, tdata)\n",
"print(chi2_best)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We will now plot our best fit on top of the observational data.\n",
"\n",
"Write a function plot_model() that takes in a function that calculates the model fit (in this case, our smooth_broken_power_law function), the best parameters, an array of values to plot the model for and a matplotlib axis to plot it on. Modify your make_plot function to take in an optional argument for the model to plot. If the model is supplied your make_plot() function should call your plot_model() function."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def plot_model(model, params, tvals, ax):\n",
"\n",
" return\n",
"\n",
"plotting_data = scale_data(tdata, best_params['alpha'][0],np.max(best_params['alpha'][1:])) \n",
" \n",
"make_plot(plotting_data,scaled=True,model=smooth_broken_power_law,params=args, plot_models=False)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"But how do we know that the lightcurve has definitely turned over?\n",
"\n",
"We can perform a similar process as above to fit a standard power law to our data and then use an **[F-test](https://en.wikipedia.org/wiki/F-test#Regression_problems)** to determine which model (turnover or no turnover) provides the best fit. We have provided a `power_law()` function that calculates the a power-law fit to the data. Now write a series of functions to perform an MCMC fit using a standard power law; `lnprior_noturnover()`, `lnlike_noturnover()`, `lnprob_noturnover()`, `get_starting_pos_noturnover()`, `run_mcmc_noturnover()`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def power_law(t, nu, F0, delta_1, alpha, nu0=3.0):\n",
" return (nu/nu0)**alpha * F0 * t**delta_1\n",
"\n",
"def lnprior_noturnover(theta):\n",
" F0, delta_1, alpha = theta\n",
"\n",
" if delta_1 > 0.0:\n",
" return 0.0\n",
" \n",
" else:\n",
" return -np.inf\n",
" \n",
"def lnlike_noturnover(theta, t, nu, S, S_err):\n",
" \n",
"\n",
" return lnline_val\n",
"\n",
"def lnprob_noturnover(theta, t, nu, S, S_err):\n",
" lp = lnprior_noturnover(theta)\n",
"\n",
" if not np.isfinite(lp):\n",
" return -np.inf\n",
"\n",
" return lp + lnlike_noturnover(theta, t, nu, S, S_err)\n",
"\n",
"def get_starting_pos_noturnover(nwalkers, ndim=6):\n",
" F0 = \n",
" delta_1 = \n",
" alpha = \n",
" \n",
" pos = [np.asarray([F0, delta_1, alpha]) + 1e-4*np.random.randn(ndim) for i in range(nwalkers)]\n",
" \n",
" return pos\n",
"\n",
"def run_mcmc_noturnover(data, niters=1000, nthreads=1, nwalkers=200, ndim=3):\n",
" \n",
" return sampler"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now run the sampler for the standard power law fit and write a function to plot the chains and calculate the length of the burn-in."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"sampler_noturnover = run_mcmc_noturnover(tdata)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def make_chain_plot_noturnover(chain, chain_cut):\n",
" \n",
"\n",
"chain_noturnover = sampler_noturnover.chain\n",
"chain_cut_nt = \n",
"\n",
"make_chain_plot_noturnover(chain_noturnover, chain_cut)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Make a corner plot for the standard power law fit"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def make_corner_plot_noturnover(good_chain, savefile='corner.png'):\n",
" \n",
" \n",
" plt.savefig(savefile)\n",
"\n",
"good_chain_nt = chain_noturnover[:, chain_cut_nt:, :]\n",
"\n",
"make_corner_plot_noturnover(good_chain_nt)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"How do the values of $\\delta_1$ and $\\alpha$ compare to the previous fit. How does the calculated radio spectral index compare to the spectral index determined from fitting data all the way from the radio to X-rays ($\\alpha=-0.585\\pm 0.005$; Margutti et al. 2018, ApJ, 856, L18)?\n",
"\n",
"Now write a function to get the best parameters for the standard power law fit."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def get_best_params_noturnover(chain):\n",
" \n",
" return param_dict\n",
" \n",
" \n",
"best_params_nt = get_best_params_noturnover(good_chain_nt)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that we have the best parameters for the standard power law fit we can plot it over our data. Don't forget to scale the data based on the calculated best-fit spectral index!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"param_names_nt = ['F0', 'delta_1', 'alpha']\n",
"args_nt = []\n",
"for param in param_names_nt:\n",
" val = best_params_nt[param][0]\n",
" args_nt.append(val)\n",
" \n",
"\n",
"\n",
"plotting_data_nt = scale_data(tdata, best_params_nt['alpha'][0],np.max(best_params_nt['alpha'][1:])) \n",
" \n",
"make_plot(plotting_data_nt,scaled=True,model=power_law,params=args_nt)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Calculate the $\\chi^2$ for the standard power law fit. We will then use this, and the previously calculated $\\chi^2$ to perform an F-test and determine which model we prefer."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"chi2_nt = calc_chi2(best_params_nt, ['F0', 'delta_1', 'alpha'], power_law, tdata)\n",
"print(chi2_nt)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"An [F-test](https://en.wikipedia.org/wiki/F-test) is a generalised test that can be used to compare statistical models. In particular, it is useful when comparing two models where one is a restricted form of the other. Write a function calculate_ftest that calculates the F statistic for our two fits and then calculates the corresponding p-value. Hint: We have already imported the [scipy.stats F-distribution model](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.f.html), and we can access the cumulative distribution function using f.cdf()."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def calculate_ftest(chi2_t, p_t, chi2_nt, p_nt, n):\n",
" \n",
" return 1-pval\n",
"\n",
"n = \n",
"p_t = \n",
"p_nt = \n",
"\n",
"calculate_ftest(chi2_best, p_t, chi2_nt, p_nt, n)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If `1-pval` is very small, that suggests that the probability of getting such an `F-test` value by chance is very unlikely, so it would suggest that the turnover fit is preferred. In this case, which model is preferred? With what confidence can we say that we prefer one model over the other?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Determining Lightcurve Parameters\n",
"We're now going to use the full radio lightcurve to determine the best fitting parameters for the smooth broken power law. Load the full dataset and pass it to your previous run_mcmc() function."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"data = load_data()\n",
"\n",
"full_sampler = run_mcmc(data)\n",
"full_chain = full_sampler.chain"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Plot the chain and estimate the burn-in length"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"full_chain_cut = 200\n",
"\n",
"make_chain_plot(full_chain, full_chain_cut)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Make a corner plot"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"full_flat_samples = full_sampler.get_chain(discard=chain_cut, flat=True)\n",
"\n",
"make_corner_plot(full_flat_samples, savefile='corner_fulldata.png')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Calculate the best parameters and the $\\chi^2$ for the best fit."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now plot the full lightcurve. A physically motivated model of a relativistic jet and cocoon of ejecta (supplied in `data/jet_cocoon/contribution.txt`) can also be overplotted by using passing \"plot_models=True\" to the make_plot function."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now write a function to plot the best fitting model for the 3 GHz data and call it from your make_plot() function when you pass plot_models=True"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"def plot_physical_models(ax, fname='data/jet_cocoon_contribution.txt'):\n",
" \n",
" return ax"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"make_plot(plotting_data, scaled=True, model=smooth_broken_power_law, params=full_args, plot_models=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"So we have now determined the final temporal/spectral behavior of the radio data. What does this tell us? \n",
"\n",
"The fact that the spectral index stayed basically constant over time and that it was the same spectrum from radio to X-ray wavelengths (across ~9 order of magnitude in frequency) tells us that the emission is _optically-thin synchrotron emission_ ([Sari et al. 1998, ApJ, 497, L17](https://ui.adsabs.harvard.edu/abs/1998ApJ...497L..17S/abstract)), which we can use to make other inferences. We also know that with synchrotron emission you can get a change in the spectrum at high frequencies if the highest-energy electrons have cooled: we see no evidence for a cooling break. \n",
"\n",
"The temporal behavior can be used to constrain the geometry of the emission. This is what we (and others) did in a series of papers:\n",
"1. [Hallinan et al. (2017)](http://adsabs.harvard.edu/abs/2017Sci...358.1579H)\n",
"2. [Alexander et al. (2017)](http://adsabs.harvard.edu/abs/2017ApJ...848L..21A)\n",
"3. [Mooley et al. (2018)](http://adsabs.harvard.edu/abs/2018Natur.554..207M)\n",
"4. [Margutti et al. (2018)](http://adsabs.harvard.edu/abs/2018ApJ...856L..18M)\n",
"5. [Dobie et al. (2018)](http://adsabs.harvard.edu/abs/2018ApJ...858L..15D)\n",
"6. [Resmi et al. (2018)](http://adsabs.harvard.edu/abs/2018arXiv180302768R)\n",
"7. [Alexander et al. (2018)](http://adsabs.harvard.edu/abs/2018ApJ...863L..18A)\n",
"8. [Hajela et al. (2019)](https://ui.adsabs.harvard.edu/abs/2019ApJ...886L..17H/abstract)\n",
"9. [Makhathini et al. (2020)](https://ui.adsabs.harvard.edu/abs/2020arXiv200602382M/abstract)\n",
"\n",
"In general we can say that $\\alpha$ relates to the intrinsic distribution of relativistic electrons as:\n",
"$$\\alpha = -\\frac{p-1}{2}$$\n",
"where the electrons have a power-law distribution of energies with index $p$. Moreover, we can relate the same $p$ to the temporal index in the decay phase in a way that depends on the geomety:\n",
"1. If the emission is dominated by a jet (collimated, relativistic) we expect $\\delta_2=-p$ in the decay phase\n",
"2. If it is more spherical it should be:\n",
"$$\\delta_2 = -3\\frac{p-1}{4}$$\n",
"(Sari et al. 1998 again).\n",
"\n",
"Questions:\n",
"1. What do you get for $p$ based on your value of $\\alpha$?\n",
"2. Which scenario, jet or sphere, do you find best fits in the decay phase?\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"anaconda-cloud": {},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}