{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "a47326c5",
   "metadata": {},
   "source": [
    "# Fit a model to observations\n",
    "\n",
    "In the **Quickstart example** notebook we saw a quick introduction to forward modeling the upper atmosphere and He triplet signal of HD 209458 b. In this notebook we will go over an advanced-level tutorial on retrieving the properties of the upper atmosphere of HAT-P-11 b using ``p-winds`` models."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "39455c13",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "import matplotlib.pylab as pylab\n",
    "import astropy.constants as c\n",
    "import astropy.units as u\n",
    "from astropy.convolution import convolve\n",
    "from scipy.optimize import minimize\n",
    "from p_winds import parker, hydrogen, helium, transit, microphysics, tools\n",
    "\n",
    "# Uncomment the next line if you have a MacBook with retina screen\n",
    "# %config InlineBackend.figure_format = 'retina'\n",
    "pylab.rcParams['figure.figsize'] = 9.0,6.5\n",
    "pylab.rcParams['font.size'] = 18"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fb4efbda",
   "metadata": {},
   "source": [
    "Let's start with the observation of the He triplet transmission spectrum of HAT-P-11 b using the CARMENES spectrograph. This data is openly available in the [DACE platform](https://dace.unige.ch/openData/). But we will retrieve it from a [public Gist](https://gist.github.com/ladsantos/a8433928e384819a3632adc469bed803) for convenience."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f06da648",
   "metadata": {},
   "outputs": [],
   "source": [
    "# The observed transmission spectrum\n",
    "data_url = 'https://gist.githubusercontent.com/ladsantos/a8433928e384819a3632adc469bed803/raw/a584e6e83073d1ad3444248624927838588f22e4/HAT-P-11_b_He.dat'\n",
    "# We skip 2 rows instead of 1 to have an odd number of rows and allow a fast convolution later\n",
    "He_spec = np.loadtxt(data_url, skiprows=2)\n",
    "wl_obs = He_spec[:, 0]  # Angstrom\n",
    "f_obs = 1 - He_spec[:, 1] * 0.01  # Normalized flux\n",
    "u_obs = He_spec[:, 2] * 0.01  # Flux uncertainty\n",
    "\n",
    "# Convert in-vacuum wavelengths to in-air\n",
    "s = 1E4 / np.mean(wl_obs)\n",
    "n = 1 + 0.0000834254 + 0.02406147 / (130 - s ** 2) + 0.00015998 / (38.9 - s ** 2)\n",
    "wl_obs /= n\n",
    "\n",
    "# We will also need to know the instrumental profile that\n",
    "# widens spectral lines. We take the width from Allart et al. (2018),\n",
    "# the paper describing the HAT-P-11 b data.\n",
    "def gaussian(x, mu=0.0, sigma=1.0):\n",
    "    return 1 / sigma / (2 * np.pi) ** 0.5 * np.exp(-0.5 * (x - mu) ** 2 / sigma ** 2)\n",
    "\n",
    "instrumental_profile_width_v = 3.7  # Instrumental profile FWHM in km / s (assumed Gaussian)\n",
    "sigma_wl = instrumental_profile_width_v / (2 * (2 * np.log(2)) ** 0.5) / \\\n",
    "    c.c.to(u.km / u.s).value * np.mean(wl_obs)  # Same unit as wl_obs\n",
    "instrumental_profile = gaussian(wl_obs, np.mean(wl_obs), sigma=sigma_wl)\n",
    "\n",
    "plt.errorbar(wl_obs, f_obs, yerr=u_obs)\n",
    "plt.xlabel(r'Wavelength (${\\rm \\AA}$)')\n",
    "plt.ylabel('Normalized flux')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fa59a54f",
   "metadata": {},
   "source": [
    "Now we set up the simulation. This is quite a dense cell of configurations, but you should be familiar with all of it if you followed the quickstart example."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6d98ff3a",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Set up the simulation\n",
    "\n",
    "# Fixed parameters of HAT-P-11 b (not to be sampled)\n",
    "R_pl = 0.389  # Planetary radius (Jupiter radii)\n",
    "M_pl = 0.09  # Planetary mass (Jupiter masses)\n",
    "a_pl = 0.05254  # Semi-major axis (au)\n",
    "planet_to_star_ratio = 0.057989\n",
    "impact_parameter = 0.132\n",
    "h_fraction = 0.90  # H number fraction\n",
    "he_fraction = 1 - h_fraction  # He number fraction\n",
    "he_h_fraction = he_fraction / h_fraction\n",
    "mean_f_ion = 0.90  # Initially assumed, but the model relaxes for it\n",
    "mu_0 = (1 + 4 * he_h_fraction) / (1 + he_h_fraction + mean_f_ion)  \n",
    "# mu_0 is the constant mean molecular weight (assumed for now, will be updated later)\n",
    "\n",
    "# Physical constants\n",
    "m_h = c.m_p.to(u.g).value  # Hydrogen atom mass in g\n",
    "m_He = 4 * 1.67262192369e-27  # Helium atomic mass in kg\n",
    "k_B = 1.380649e-23  # Boltzmann's constant in kg / (m / s) ** 2 / K\n",
    "\n",
    "# Free parameters (to be sampled with the optimization algorithm)\n",
    "# The reason why we set m_dot and T in log is because\n",
    "# we will fit for them in log space\n",
    "log_m_dot_0 = np.log10(2E10)  # Planetary mass loss rate (g / s)\n",
    "log_T_0 = np.log10(6000)  # Atmospheric temperature (K)\n",
    "v_wind_0 = -2E3  # Line-of-sight wind velocity (m / s)\n",
    "\n",
    "# Altitudes samples (this can be a very important setting)\n",
    "r = np.logspace(0, np.log10(20), 100)\n",
    "\n",
    "# First guesses of fractions (not to be fit, but necessary for the calculation)\n",
    "initial_f_ion = 0.0  # Fraction of ionized hydrogen\n",
    "initial_f_he = np.array([1.0, 0.0])  # Fraction of singlet, triplet helium\n",
    "\n",
    "# Model settings\n",
    "relax_solution = True  # This will iteratively relax the solutions until convergence\n",
    "exact_phi = True  # Exact calculation of H photoionization\n",
    "sample_phases = np.linspace(-0.50, 0.50, 5)  # Phases that we will average to obtain the final spectrum\n",
    "# The phases -0.5 and +0.5 correspond to the times of first and fourth transit contact\n",
    "w0, w1, w2, f0, f1, f2, a_ij = microphysics.he_3_properties()\n",
    "w_array = np.array([w0, w1, w2])  # Central wavelengths of the triplet\n",
    "f_array = np.array([f0, f1, f2])  # Oscillator strengths of the triplet\n",
    "a_array = np.array([a_ij, a_ij, a_ij])  # This is the same for all lines in then triplet\n",
    "n_samples = len(sample_phases)\n",
    "transit_grid_size = 100  # Also very important to constrain computation time\n",
    "supersampling = 5  # This is used to improve the hard pixel edges in the ray tracing"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4cb69d50",
   "metadata": {},
   "source": [
    "The full spectrum of HAT-P-11 until 2600 Å is not known. But we can use a proxy for which we do have a full spectrum: HD 40307. It has a similar size, spectral type, effective temperature, and surface gravity as HAT-P-11. We take the spectrum from the [MUSCLES database](https://archive.stsci.edu/prepds/muscles/). There is a convenience function in `tools` that calculates the spectrum arriving at a planet based on the MUSCLES SEDs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "83c456a2",
   "metadata": {},
   "outputs": [],
   "source": [
    "data_url = 'https://gist.githubusercontent.com/ladsantos/c7d1aae1ecc755bae9f1c8ef1545cf8d/raw/cb444d9b4ff9853672dab80a4aab583975557449/HAT-P-11_spec.dat'\n",
    "spec = np.loadtxt(data_url, skiprows=1)\n",
    "host_spectrum = {'wavelength': spec[:, 0], 'flux_lambda': spec[:, 1],\n",
    "                 'wavelength_unit': u.angstrom,\n",
    "                 'flux_unit': u.erg / u.s / u.cm ** 2 / u.angstrom}\n",
    "\n",
    "plt.loglog(host_spectrum['wavelength'], host_spectrum['flux_lambda'])\n",
    "plt.xlabel(r'Wavelength (${\\rm \\AA}$)')\n",
    "plt.ylabel(r'Flux density (erg s$^{-1}$ cm$^{-2}$ ${\\rm \\AA}^{-1}$)')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4fc80f5d",
   "metadata": {},
   "source": [
    "Before we start fitting the observed data to models, we have to do a few sanity checks and assess if all the moving parts of ``p-winds`` will work well for the configuration you set in the cell above. Most numerical issues are caused when using the ``scipy.integrate`` routines.\n",
    "\n",
    "We start by assessing if the atmospheric model behaves well."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5537fd15",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Calculate the model\n",
    "def atmospheric_model(theta):\n",
    "    log_m_dot, log_T = theta\n",
    "    m_dot = 10 ** log_m_dot\n",
    "    T = 10 ** log_T\n",
    "    \n",
    "    f_r, mu_bar = hydrogen.ion_fraction(r, R_pl, T, h_fraction,\n",
    "                                m_dot, M_pl, mean_f_ion,\n",
    "                                spectrum_at_planet=host_spectrum,\n",
    "                                initial_f_ion=initial_f_ion, \n",
    "                                relax_solution=relax_solution,\n",
    "                                exact_phi=exact_phi, return_mu=True)\n",
    "\n",
    "    # Update the structure for the revised ion fraction\n",
    "    updated_mean_f_ion = np.mean(f_r)\n",
    "    vs = parker.sound_speed(T, mu_bar)\n",
    "    rs = parker.radius_sonic_point(M_pl, vs)\n",
    "    rhos = parker.density_sonic_point(m_dot, rs, vs)\n",
    "    r_array = r * R_pl / rs\n",
    "    v_array, rho_array = parker.structure(r_array)\n",
    "    \n",
    "    # Calculate the helium population\n",
    "    f_he_1, f_he_3 = helium.population_fraction(r, v_array, rho_array, f_r,\n",
    "        R_pl, T, h_fraction, vs, rs, rhos,\n",
    "        spectrum_at_planet=host_spectrum,\n",
    "        initial_state=initial_f_he, relax_solution=relax_solution)\n",
    "\n",
    "    # Number density of helium nuclei\n",
    "    n_he = (rho_array * rhos * he_fraction / (1 + 4 * he_fraction) / m_h)\n",
    "    \n",
    "    # Number density distribution of helium\n",
    "    n_he_1 = f_he_1 * n_he\n",
    "    n_he_3 = f_he_3 * n_he\n",
    "    n_he_ion = (1 - f_he_1 - f_he_3) * n_he\n",
    "    \n",
    "    # Return the important outputs (number densities [cm ** -3] of helium and \n",
    "    # the profile of velocities of the outflow [km / s])\n",
    "    return n_he_1, n_he_3, n_he_ion, v_array * vs\n",
    "\n",
    "# Let's test if the model function is working\n",
    "theta = (log_m_dot_0, log_T_0)\n",
    "y0 = (initial_f_ion, initial_f_he)\n",
    "n_he_1, n_he_3, n_he_ion, v_array = atmospheric_model(theta)\n",
    "\n",
    "plt.semilogy(r, n_he_1, color='C0', label='He singlet')\n",
    "plt.semilogy(r, n_he_3, color='C1', label='He triplet')\n",
    "plt.semilogy(r, n_he_ion, color='C2', label='He ionized')\n",
    "plt.xlabel(r'Radius (R$_\\mathrm{pl}$)')\n",
    "plt.ylabel('Number density (cm$^{-3}$)')\n",
    "plt.xlim(1, 10)\n",
    "plt.ylim(1E-2, 1E10)\n",
    "plt.legend()\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "23759e18",
   "metadata": {},
   "source": [
    "Seems to be working fine. Now we do a sanity check for the radiative transfer. There is not a lot of things that can break here, but we do it anyway."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e975a383",
   "metadata": {},
   "outputs": [],
   "source": [
    "# The transmission spectrum model\n",
    "def transmission_model(wavelength_array, v_wind, n_he_3_distribution, log_T, v_array):\n",
    "\n",
    "    # Set up the transit configuration. We use SI units to avoid too many \n",
    "    # headaches with unit conversion\n",
    "    R_pl_physical = R_pl * 71492000  # Planet radius in m\n",
    "    r_SI = r * R_pl_physical  # Array of altitudes in m\n",
    "    v_SI = v_array * 1000  # Velocity of the outflow in m / s\n",
    "    n_he_3_SI = n_he_3_distribution * 1E6  # Volumetric densities in 1 / m ** 3\n",
    "\n",
    "    # Set up the ray tracing\n",
    "    f_maps = []\n",
    "    t_depths = []\n",
    "    r_maps = []\n",
    "    for i in range(n_samples):\n",
    "        flux_map, transit_depth, r_map = transit.draw_transit(\n",
    "            planet_to_star_ratio,\n",
    "            impact_parameter=impact_parameter,\n",
    "            supersampling=supersampling,\n",
    "            phase=sample_phases[i],\n",
    "            planet_physical_radius=R_pl_physical,\n",
    "            grid_size=transit_grid_size\n",
    "                                   )\n",
    "        f_maps.append(flux_map)\n",
    "        t_depths.append(transit_depth)\n",
    "        r_maps.append(r_map)\n",
    "    # Do the radiative transfer\n",
    "    spectra = []\n",
    "\n",
    "    for i in range(n_samples):\n",
    "        spec = transit.radiative_transfer_2d(f_maps[i], r_maps[i], \n",
    "                                        r_SI, n_he_3_SI, v_SI, w_array, f_array, a_array,\n",
    "                                        wavelength_array, 10 ** log_T, m_He, bulk_los_velocity=v_wind,\n",
    "                                            wind_broadening_method='average')\n",
    "        # We add the transit depth because ground-based observations\n",
    "        # lose the continuum information and they are not sensitive to\n",
    "        # the loss of light by the opaque disk of the planet, only\n",
    "        # by the atmosphere\n",
    "        spectra.append(spec + t_depths[i])\n",
    "\n",
    "    spectra = np.array(spectra)\n",
    "    # Finally we take the mean of the spectra we calculated for each phase\n",
    "    spectrum = np.mean(spectra, axis=0)\n",
    "    return spectrum\n",
    "\n",
    "# Here we divide wl_obs by 1E10 to convert angstrom to m\n",
    "t_spectrum = transmission_model(wl_obs / 1E10, v_wind_0, n_he_3, log_T_0, v_array)\n",
    "plt.errorbar(wl_obs, f_obs, yerr=u_obs)\n",
    "plt.plot(wl_obs, t_spectrum, color='k', lw=2)\n",
    "plt.xlabel(r'Wavelength (${\\rm \\AA}$)')\n",
    "plt.ylabel('Normalized flux')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ebff07be",
   "metadata": {},
   "source": [
    "Alright, it seems that our first guess was not very good. But we shall soon make this an actual fit. For now, let's write a cascading model that combines both the atmosphere and the radiative transfer. This function will also convolve our predicted spectrum with the instrumental profile.\n",
    "\n",
    "Also, in the next cell you can do a trial-and-error process to have a better starting guess for the escape rate and the temperature. It took me a minute to find out that the escape rate `1E10` g / s and temperature `6000` K are a much better first guess to fit the data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "39b0d119",
   "metadata": {},
   "outputs": [],
   "source": [
    "def cascading_model(theta, wavelength_array):\n",
    "    log_m_dot, log_T, v_wind = theta\n",
    "    n_he_1, n_he_3, n_he_ion, v = atmospheric_model((log_m_dot, log_T))\n",
    "    t_spec = transmission_model(wavelength_array, v_wind, n_he_3, log_T, v)\n",
    "    t_spec_conv = convolve(t_spec, instrumental_profile, boundary='extend')\n",
    "    return t_spec_conv\n",
    "\n",
    "# First guess\n",
    "theta0 = (log_m_dot_0, \n",
    "          log_T_0, \n",
    "          v_wind_0)\n",
    "\n",
    "t_spec = cascading_model(theta0, wl_obs / 1E10)\n",
    "\n",
    "plt.errorbar(wl_obs, f_obs, yerr=u_obs)\n",
    "plt.plot(wl_obs, t_spec, color='k', lw=2)\n",
    "plt.xlabel(r'Wavelength (${\\rm \\AA}$)')\n",
    "plt.ylabel('Normalized flux')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "eba5b671",
   "metadata": {},
   "source": [
    "Great, it seems that the cascading model is also working well. We will fit it to the observations using a maximum likelihood estimation. The log-likelihood is defined as:\n",
    "\n",
    "$$\n",
    "\\ln{p(y | x, \\sigma, \\log{\\dot{m}}, \\log{T}, v_{\\rm wind})} = -\\frac{1}{2} \\sum_n \\left[ \\frac{\\left(y_n - y_{\\rm model}\\right)^2}{\\sigma^2} + \\ln{\\left(2\\pi \\sigma^2 \\right)} \\right]\n",
    "$$\n",
    "\n",
    "We do one sneaky trick in the calculation of log-likelihood here to avoid some numerical issues. The problem is that the solvers, which calculates the steady-state ionization of He, for some reason, can ocassionally become numerically unstable in some very specific cases and lose precision, yielding a `RuntimeError`. These solutions are of no use to us, but we do not want them to stop our optimization. So we discard them by making the log-likelihood function return `-np.inf` in those cases."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8c6fa1b8",
   "metadata": {},
   "outputs": [],
   "source": [
    "def log_likelihood(theta, x, y, yerr):\n",
    "    try:\n",
    "        model = cascading_model(theta, x)\n",
    "        sigma2 = yerr ** 2\n",
    "        return -0.5 * np.sum((y - model) ** 2 / sigma2 + np.log(sigma2))\n",
    "    except RuntimeError:\n",
    "        return -np.inf"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d9135eab",
   "metadata": {},
   "source": [
    "With all that set, we use `scipy.optimize.minimize()` to maximize the likelihood of our solution and find the best fit. This calculation takes a few minutes to run on a computer with a 3.1 GHz CPU, so I commented the line that actually does this calculation as to not use the resources of online platforms that compile this notebook and upset the powers that be. But you should try running it in your own computer. \n",
    "\n",
    "In some cases you may run into runtime warnings, but the result should be robust. The actual computation time depends on how bad the first guess was, so you will probably save some time if you do a first fit by eye and than optimize it. You can also try changing the `method` option of `minimize()`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4f96d1f3",
   "metadata": {},
   "outputs": [],
   "source": [
    "nll = lambda *args: -log_likelihood(*args)\n",
    "args = (wl_obs / 1E10, f_obs, u_obs)\n",
    "# %time soln = minimize(nll, theta0, args=args, method='Nelder-Mead')\n",
    "# log_mdot_ml, logT_ml, v_wind_ml = soln.x"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "df6c88fb",
   "metadata": {},
   "source": [
    "When I started from a very good guess (`m_dot = 2E10`, `T_0 = 6000`, `v_wind_0 = -2000.0`), `minimize()` converges to a best fit solution of $\\dot{m} = 4.9 \\times 10^{10}$ g s$^{-1}$, $T = 8100$ K, and $v_{\\rm wind} = -1.9$ km s$^{-1}$ in about 6 minutes in a 3.1 GHz CPU with four threads."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7cac7265",
   "metadata": {},
   "outputs": [],
   "source": [
    "theta_ml = (np.log10(4.9E10), np.log10(8100), -1.9E3)\n",
    "\n",
    "t_spec = cascading_model(theta_ml, wl_obs / 1E10)\n",
    "\n",
    "plt.errorbar(wl_obs, f_obs, yerr=u_obs)\n",
    "plt.plot(wl_obs, t_spec, color='k', lw=2)\n",
    "plt.xlabel(r'Wavelength (${\\rm \\AA}$)')\n",
    "plt.ylabel('Normalized flux')\n",
    "plt.show()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}