{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "7812dfbd",
   "metadata": {},
   "source": [
    "(rv)=\n",
    "\n",
    "# Radial velocity fitting"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ef499b30-6dcd-4de5-aafb-028290b5c8d0",
   "metadata": {},
   "outputs": [],
   "source": [
    "import exoplanet\n",
    "\n",
    "exoplanet.utils.docs_setup()\n",
    "print(f\"exoplanet.__version__ = '{exoplanet.__version__}'\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f644ca40",
   "metadata": {},
   "source": [
    "In this case study, we will demonstrate how to fit radial velocity observations of an exoplanetary system using *exoplanet*.\n",
    "We will follow [the getting started tutorial](https://radvel.readthedocs.io/en/latest/tutorials/K2-24_Fitting+MCMC.html) from [the excellent RadVel package](https://radvel.readthedocs.io) where they fit for the parameters of the two planets in [the K2-24 system](https://arxiv.org/abs/1511.04497).\n",
    "\n",
    "First, let's download the data from RadVel:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "151595c9",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "url = \"https://raw.githubusercontent.com/California-Planet-Search/radvel/master/example_data/epic203771098.csv\"\n",
    "data = pd.read_csv(url, index_col=0)\n",
    "\n",
    "x = np.array(data.t)\n",
    "y = np.array(data.vel)\n",
    "yerr = np.array(data.errvel)\n",
    "\n",
    "# Compute a reference time that will be used to normalize the trends model\n",
    "x_ref = 0.5 * (x.min() + x.max())\n",
    "\n",
    "# Also make a fine grid that spans the observation window for plotting purposes\n",
    "t = np.linspace(x.min() - 5, x.max() + 5, 1000)\n",
    "\n",
    "plt.errorbar(x, y, yerr=yerr, fmt=\".k\")\n",
    "plt.xlabel(\"time [days]\")\n",
    "_ = plt.ylabel(\"radial velocity [m/s]\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bc35e597",
   "metadata": {},
   "source": [
    "Now, we know the periods and transit times for the planets [from the K2 light curve](https://arxiv.org/abs/1511.04497), so let's start by using the :func:`exoplanet.estimate_semi_amplitude` function to estimate the expected RV semi-amplitudes for the planets."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6bae5e6d",
   "metadata": {},
   "outputs": [],
   "source": [
    "import exoplanet as xo\n",
    "\n",
    "periods = [20.8851, 42.3633]\n",
    "period_errs = [0.0003, 0.0006]\n",
    "t0s = [2072.7948, 2082.6251]\n",
    "t0_errs = [0.0007, 0.0004]\n",
    "Ks = xo.estimate_semi_amplitude(periods, x, y, yerr, t0s=t0s)\n",
    "print(Ks, \"m/s\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "84a75b04",
   "metadata": {},
   "outputs": [],
   "source": [
    "0.5 * (\n",
    "    np.log(np.array(periods) + np.array(period_errs))\n",
    "    - np.log(np.array(periods) - np.array(period_errs))\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "51586651",
   "metadata": {},
   "outputs": [],
   "source": [
    "np.array(period_errs) / np.array(periods)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7f817fba",
   "metadata": {},
   "source": [
    "## The radial velocity model in PyMC3\n",
    "\n",
    "Now that we have the data and an estimate of the initial values for the parameters, let's start defining the probabilistic model in PyMC3.\n",
    "First, we'll define our priors on the parameters:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "da098a77",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pymc3 as pm\n",
    "import pymc3_ext as pmx\n",
    "import aesara_theano_fallback.tensor as tt\n",
    "\n",
    "with pm.Model() as model:\n",
    "    # Gaussian priors based on transit data (from Petigura et al.)\n",
    "    t0 = pm.Normal(\"t0\", mu=np.array(t0s), sd=np.array(t0_errs), shape=2)\n",
    "    logP = pm.Normal(\n",
    "        \"logP\",\n",
    "        mu=np.log(periods),\n",
    "        sd=np.array(period_errs) / np.array(periods),\n",
    "        shape=2,\n",
    "        testval=np.log(periods),\n",
    "    )\n",
    "    P = pm.Deterministic(\"P\", tt.exp(logP))\n",
    "\n",
    "    # Wide log-normal prior for semi-amplitude\n",
    "    logK = pm.Normal(\n",
    "        \"logK\", mu=np.log(Ks), sd=2.0, shape=2, testval=np.log(Ks)\n",
    "    )\n",
    "\n",
    "    # Eccentricity & argument of periasteron\n",
    "    ecs = pmx.UnitDisk(\"ecs\", shape=(2, 2), testval=0.01 * np.ones((2, 2)))\n",
    "    ecc = pm.Deterministic(\"ecc\", tt.sum(ecs**2, axis=0))\n",
    "    omega = pm.Deterministic(\"omega\", tt.arctan2(ecs[1], ecs[0]))\n",
    "    xo.eccentricity.vaneylen19(\n",
    "        \"ecc_prior\", multi=True, shape=2, fixed=True, observed=ecc\n",
    "    )\n",
    "\n",
    "    # Jitter & a quadratic RV trend\n",
    "    logs = pm.Normal(\"logs\", mu=np.log(np.median(yerr)), sd=5.0)\n",
    "    trend = pm.Normal(\"trend\", mu=0, sd=10.0 ** -np.arange(3)[::-1], shape=3)\n",
    "\n",
    "    # Then we define the orbit\n",
    "    orbit = xo.orbits.KeplerianOrbit(period=P, t0=t0, ecc=ecc, omega=omega)\n",
    "\n",
    "    # And a function for computing the full RV model\n",
    "    def get_rv_model(t, name=\"\"):\n",
    "        # First the RVs induced by the planets\n",
    "        vrad = orbit.get_radial_velocity(t, K=tt.exp(logK))\n",
    "        pm.Deterministic(\"vrad\" + name, vrad)\n",
    "\n",
    "        # Define the background model\n",
    "        A = np.vander(t - x_ref, 3)\n",
    "        bkg = pm.Deterministic(\"bkg\" + name, tt.dot(A, trend))\n",
    "\n",
    "        # Sum over planets and add the background to get the full model\n",
    "        return pm.Deterministic(\"rv_model\" + name, tt.sum(vrad, axis=-1) + bkg)\n",
    "\n",
    "    # Define the RVs at the observed times\n",
    "    rv_model = get_rv_model(x)\n",
    "\n",
    "    # Also define the model on a fine grid as computed above (for plotting)\n",
    "    rv_model_pred = get_rv_model(t, name=\"_pred\")\n",
    "\n",
    "    # Finally add in the observation model. This next line adds a new contribution\n",
    "    # to the log probability of the PyMC3 model\n",
    "    err = tt.sqrt(yerr**2 + tt.exp(2 * logs))\n",
    "    pm.Normal(\"obs\", mu=rv_model, sd=err, observed=y)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "605230c2",
   "metadata": {},
   "source": [
    "Now, we can plot the initial model:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a5cba2e3",
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.errorbar(x, y, yerr=yerr, fmt=\".k\")\n",
    "\n",
    "with model:\n",
    "    plt.plot(t, pmx.eval_in_model(model.vrad_pred), \"--k\", alpha=0.5)\n",
    "    plt.plot(t, pmx.eval_in_model(model.bkg_pred), \":k\", alpha=0.5)\n",
    "    plt.plot(t, pmx.eval_in_model(model.rv_model_pred), label=\"model\")\n",
    "\n",
    "plt.legend(fontsize=10)\n",
    "plt.xlim(t.min(), t.max())\n",
    "plt.xlabel(\"time [days]\")\n",
    "plt.ylabel(\"radial velocity [m/s]\")\n",
    "_ = plt.title(\"initial model\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "dd70b3a8",
   "metadata": {},
   "source": [
    "In this plot, the background is the dotted line, the individual planets are the dashed lines, and the full model is the blue line.\n",
    "\n",
    "It doesn't look amazing so let's fit for the maximum a posterior parameters."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e9ed3b98",
   "metadata": {},
   "outputs": [],
   "source": [
    "with model:\n",
    "    map_soln = pmx.optimize(start=model.test_point, vars=[trend])\n",
    "    map_soln = pmx.optimize(start=map_soln, vars=[t0, trend, logK, logP, logs])\n",
    "    map_soln = pmx.optimize(start=map_soln, vars=[ecs])\n",
    "    map_soln = pmx.optimize(start=map_soln)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5af974ba",
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.errorbar(x, y, yerr=yerr, fmt=\".k\")\n",
    "plt.plot(t, map_soln[\"vrad_pred\"], \"--k\", alpha=0.5)\n",
    "plt.plot(t, map_soln[\"bkg_pred\"], \":k\", alpha=0.5)\n",
    "plt.plot(t, map_soln[\"rv_model_pred\"], label=\"model\")\n",
    "\n",
    "plt.legend(fontsize=10)\n",
    "plt.xlim(t.min(), t.max())\n",
    "plt.xlabel(\"time [days]\")\n",
    "plt.ylabel(\"radial velocity [m/s]\")\n",
    "_ = plt.title(\"MAP model\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f26f5f52",
   "metadata": {},
   "source": [
    "That looks better.\n",
    "\n",
    "## Sampling\n",
    "\n",
    "Now that we have our model set up and a good estimate of the initial parameters, let's start sampling.\n",
    "There are substantial covariances between some of the parameters so we'll use the `pmx.sample` function from [pymc3-ext](https://github.com/exoplanet-dev/pymc3-ext) which wraps `pm.sample` function with some better defaults and tuning strategies."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "090279e4",
   "metadata": {},
   "outputs": [],
   "source": [
    "np.random.seed(42)\n",
    "with model:\n",
    "    trace = pmx.sample(\n",
    "        tune=1000,\n",
    "        draws=1000,\n",
    "        cores=2,\n",
    "        chains=2,\n",
    "        target_accept=0.9,\n",
    "        return_inferencedata=True,\n",
    "    )"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d455c287",
   "metadata": {},
   "source": [
    "After sampling, it's always a good idea to do some convergence checks.\n",
    "First, let's check the number of effective samples and the Gelman-Rubin statistic for our parameters of interest:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "05343243",
   "metadata": {},
   "outputs": [],
   "source": [
    "import arviz as az\n",
    "\n",
    "az.summary(\n",
    "    trace, var_names=[\"trend\", \"logs\", \"omega\", \"ecc\", \"t0\", \"logK\", \"P\"]\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "49efcc83",
   "metadata": {},
   "source": [
    "It looks like everything is pretty much converged here. Not bad for 14 parameters and about a minute of runtime...\n",
    "\n",
    "Then we can make a [corner plot](https://corner.readthedocs.io) of any combination of the parameters.\n",
    "For example, let's look at period, semi-amplitude, and eccentricity:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a5939a96",
   "metadata": {},
   "outputs": [],
   "source": [
    "import corner\n",
    "\n",
    "with model:\n",
    "    _ = corner.corner(trace, var_names=[\"P\", \"logK\", \"ecc\", \"omega\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "856086dd",
   "metadata": {},
   "source": [
    "Finally, let's plot the plosterior constraints on the RV model and compare those to the data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b797deb4",
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.errorbar(x, y, yerr=yerr, fmt=\".k\")\n",
    "\n",
    "# Compute the posterior predictions for the RV model\n",
    "rv_pred = trace.posterior[\"rv_model_pred\"].values\n",
    "pred = np.percentile(rv_pred, [16, 50, 84], axis=(0, 1))\n",
    "plt.plot(t, pred[1], color=\"C0\", label=\"model\")\n",
    "art = plt.fill_between(t, pred[0], pred[2], color=\"C0\", alpha=0.3)\n",
    "art.set_edgecolor(\"none\")\n",
    "\n",
    "plt.legend(fontsize=10)\n",
    "plt.xlim(t.min(), t.max())\n",
    "plt.xlabel(\"time [days]\")\n",
    "plt.ylabel(\"radial velocity [m/s]\")\n",
    "_ = plt.title(\"posterior constraints\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5617a97e",
   "metadata": {},
   "source": [
    "## Phase plots\n",
    "\n",
    "It might be also be interesting to look at the phased plots for this system.\n",
    "Here we'll fold the dataset on the median of posterior period and then overplot the posterior constraint on the folded model orbits."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a638bdcc",
   "metadata": {},
   "outputs": [],
   "source": [
    "for n, letter in enumerate(\"bc\"):\n",
    "    plt.figure()\n",
    "\n",
    "    # Get the posterior median orbital parameters\n",
    "    p = np.median(trace.posterior[\"P\"].values[:, :, n])\n",
    "    t0 = np.median(trace.posterior[\"t0\"].values[:, :, n])\n",
    "\n",
    "    # Compute the median of posterior estimate of the background RV\n",
    "    # and the contribution from the other planet. Then we can remove\n",
    "    # this from the data to plot just the planet we care about.\n",
    "    other = np.median(\n",
    "        trace.posterior[\"vrad\"].values[:, :, :, (n + 1) % 2], axis=(0, 1)\n",
    "    )\n",
    "    other += np.median(trace.posterior[\"bkg\"].values, axis=(0, 1))\n",
    "\n",
    "    # Plot the folded data\n",
    "    x_fold = (x - t0 + 0.5 * p) % p - 0.5 * p\n",
    "    plt.errorbar(x_fold, y - other, yerr=yerr, fmt=\".k\")\n",
    "\n",
    "    # Compute the posterior prediction for the folded RV model for this\n",
    "    # planet\n",
    "    t_fold = (t - t0 + 0.5 * p) % p - 0.5 * p\n",
    "    inds = np.argsort(t_fold)\n",
    "    pred = np.percentile(\n",
    "        trace.posterior[\"vrad_pred\"].values[:, :, inds, n],\n",
    "        [16, 50, 84],\n",
    "        axis=(0, 1),\n",
    "    )\n",
    "    plt.plot(t_fold[inds], pred[1], color=\"C0\", label=\"model\")\n",
    "    art = plt.fill_between(\n",
    "        t_fold[inds], pred[0], pred[2], color=\"C0\", alpha=0.3\n",
    "    )\n",
    "    art.set_edgecolor(\"none\")\n",
    "\n",
    "    plt.legend(fontsize=10)\n",
    "    plt.xlim(-0.5 * p, 0.5 * p)\n",
    "    plt.xlabel(\"phase [days]\")\n",
    "    plt.ylabel(\"radial velocity [m/s]\")\n",
    "    plt.title(\"K2-24{0}\".format(letter))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ab693478",
   "metadata": {},
   "source": [
    "## Citations\n",
    "\n",
    "As described in the [citation tutorial](https://docs.exoplanet.codes/en/stable/tutorials/citation/), we can use [citations.get_citations_for_model](https://docs.exoplanet.codes/en/stable/user/api/#exoplanet.citations.get_citations_for_model) to construct an acknowledgement and BibTeX listing that includes the relevant citations for this model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7f182c2f",
   "metadata": {},
   "outputs": [],
   "source": [
    "with model:\n",
    "    txt, bib = xo.citations.get_citations_for_model()\n",
    "print(txt)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "728c81fe",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(bib.split(\"\\n\\n\")[0] + \"\\n\\n...\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ddfbf0f3-d094-4b70-9d1b-077512dd6a95",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}