{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "a47326c5",
   "metadata": {},
   "source": [
    "# Fit a model to observations\n",
    "\n",
    "In the **Quickstart example** notebook we saw a quick introduction to forward modeling the upper atmosphere and He triplet signal of HD 209458 b. In this notebook we will go over an advanced-level tutorial on retrieving the properties of the upper atmosphere of HAT-P-11 b using ``p-winds`` models."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "39455c13",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "import matplotlib.pylab as pylab\n",
    "import astropy.constants as c\n",
    "import astropy.units as u\n",
    "from scipy.optimize import minimize\n",
    "from astropy.io import fits\n",
    "from astropy.convolution import convolve\n",
    "from p_winds import tools, parker, hydrogen, helium, transit, microphysics\n",
    "\n",
    "# Uncomment the next line if you have a MacBook with retina screen\n",
    "# %config InlineBackend.figure_format = 'retina'\n",
    "pylab.rcParams['figure.figsize'] = 9.0,6.5\n",
    "pylab.rcParams['font.size'] = 18"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fb4efbda",
   "metadata": {},
   "source": [
    "Let's start with the observation of the He triplet transmission spectrum of HAT-P-11 b using the CARMENES spectrograph. This data is openly available in the [DACE platform](https://dace.unige.ch/openData/). But we will retrieve it from a [public Gist](https://gist.github.com/ladsantos/a8433928e384819a3632adc469bed803) for convenience."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f06da648",
   "metadata": {},
   "outputs": [],
   "source": [
    "# The observed transmission spectrum\n",
    "data_url = 'https://gist.githubusercontent.com/ladsantos/a8433928e384819a3632adc469bed803/raw/a584e6e83073d1ad3444248624927838588f22e4/HAT-P-11_b_He.dat'\n",
    "# We skip 2 rows instead of 1 to have an odd number of rows and allow a fast convolution later\n",
    "He_spec = np.loadtxt(data_url, skiprows=2)\n",
    "wl_obs = He_spec[:, 0]  # Angstrom\n",
    "f_obs = 1 - He_spec[:, 1] * 0.01  # Normalized flux\n",
    "u_obs = He_spec[:, 2] * 0.01  # Flux uncertainty\n",
    "\n",
    "# Convert in-vacuum wavelengths to in-air\n",
    "s = 1E4 / np.mean(wl_obs)\n",
    "n = 1 + 0.0000834254 + 0.02406147 / (130 - s ** 2) + 0.00015998 / (38.9 - s ** 2)\n",
    "wl_obs /= n\n",
    "\n",
    "# We will also need to know the instrumental profile that\n",
    "# widens spectral lines. We take the width from Allart et al. (2018),\n",
    "# the paper describing the HAT-P-11 b data.\n",
    "def gaussian(x, mu=0.0, sigma=1.0):\n",
    "    return 1 / sigma / (2 * np.pi) ** 0.5 * np.exp(-0.5 * (x - mu) ** 2 / sigma ** 2)\n",
    "\n",
    "instrumental_profile_width_v = 3.7  # Instrumental profile FWHM in km / s (assumed Gaussian)\n",
    "sigma_wl = instrumental_profile_width_v / \\\n",
    "    c.c.to(u.km / u.s).value * np.mean(wl_obs)  # Same unit as wl_obs\n",
    "instrumental_profile = gaussian(wl_obs, np.mean(wl_obs), sigma=sigma_wl)\n",
    "\n",
    "plt.errorbar(wl_obs, f_obs, yerr=u_obs)\n",
    "plt.xlabel(r'Wavelength (${\\rm \\AA}$)')\n",
    "plt.ylabel('Normalized flux')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fa59a54f",
   "metadata": {},
   "source": [
    "Now we set up the simulation. This is quite a dense cell of configurations, but you should be familiar with all of it if you followed the quickstart example."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6d98ff3a",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Set up the simulation\n",
    "\n",
    "# Fixed parameters of HAT-P-11 b (not to be sampled)\n",
    "R_pl = 0.389  # Planetary radius (Jupiter radii)\n",
    "M_pl = 0.09  # Planetary mass (Jupiter masses)\n",
    "a_pl = 0.05254  # Semi-major axis (au)\n",
    "planet_to_star_ratio = 0.057989\n",
    "baseline = planet_to_star_ratio ** 2\n",
    "impact_parameter = 0.132\n",
    "h_he = 0.9  # H/He fraction (assumed for now, but can be a free parameter)\n",
    "mean_f_ion = 0.90  # Initially assumed, but the model relaxes for it\n",
    "\n",
    "# Physical constants\n",
    "m_h = c.m_p.to(u.g).value  # Hydrogen atom mass in g\n",
    "m_He = 4 * 1.67262192369e-27  # Helium atomic mass in kg\n",
    "k_B = 1.380649e-23  # Boltzmann's constant in kg / (m / s) ** 2 / K\n",
    "\n",
    "# Free parameters (to be sampled with MCMC)\n",
    "# The reason why we set m_dot and T in log is because\n",
    "# we will fit for them in log space\n",
    "log_m_dot_0 = np.log10(7E10)  # Planetary mass loss rate (g / s)\n",
    "log_T_0 = np.log10(9000)  # Atmospheric temperature (K)\n",
    "v_wind_0 = -2E3  # Line-of-sight wind velocity (m / s)\n",
    "\n",
    "# Altitudes samples (this can be a very important setting)\n",
    "r = np.logspace(0, np.log10(20), 100)\n",
    "\n",
    "# First guesses of fractions (not to be fit, but necessary for the calculation)\n",
    "initial_f_ion = 0.0  # Fraction of ionized hydrogen\n",
    "initial_f_he = np.array([1.0, 0.0])  # Fraction of singlet, triplet helium\n",
    "\n",
    "# Model settings\n",
    "relax_solution = True  # This will iteratively relax the solutions until convergence\n",
    "sample_phases = np.array([-0.5, -0.25, 0.0, 0.25, 0.5])  # Phases that we will average to obtain the final spectrum\n",
    "w0, w1, w2, f0, f1, f2, a_ij = microphysics.he_3_properties()\n",
    "w_array = np.array([w0, w1, w2])  # Central wavelengths of the triplet\n",
    "f_array = np.array([f0, f1, f2])  # Oscillator strengths of the triplet\n",
    "a_array = np.array([a_ij, a_ij, a_ij])  # This is the same for all lines in then triplet\n",
    "n_samples = len(sample_phases)\n",
    "transit_grid_size = 121  # Also very important to constrain computation time\n",
    "atol = 1E-8  # Absolute numerical tolerance for solve_ivp solver\n",
    "rtol = 1E-8  # Relative numerical tolerance for solve_ivp solver"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4cb69d50",
   "metadata": {},
   "source": [
    "The full spectrum of HAT-P-11 until 2600 Å is not known. But we can use a proxy for which we do have a full spectrum: HD 40307. It has a similar size, spectral type, effective temperature, and surface gravity as HAT-P-11. We take the spectrum from the [MUSCLES database](https://archive.stsci.edu/prepds/muscles/). The original file is fairly large, so for convenience we also retrieve the relevant section of the spectrum from a public Gist (already scaled to correspond to the irradiation at the planet HAT-P-11 b)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "83c456a2",
   "metadata": {},
   "outputs": [],
   "source": [
    "data_url = 'https://gist.githubusercontent.com/ladsantos/c7d1aae1ecc755bae9f1c8ef1545cf8d/raw/cb444d9b4ff9853672dab80a4aab583975557449/HAT-P-11_spec.dat'\n",
    "spec = np.loadtxt(data_url, skiprows=1)\n",
    "host_spectrum = {'wavelength': spec[:, 0], 'flux_lambda': spec[:, 1],\n",
    "                 'wavelength_unit': u.angstrom,\n",
    "                 'flux_unit': u.erg / u.s / u.cm ** 2 / u.angstrom}\n",
    "\n",
    "plt.loglog(host_spectrum['wavelength'], host_spectrum['flux_lambda'])\n",
    "plt.xlabel(r'Wavelength (${\\rm \\AA}$)')\n",
    "plt.ylabel(r'Flux density (erg s$^{-1}$ cm$^{-2}$ ${\\rm \\AA}^{-1}$)')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4fc80f5d",
   "metadata": {},
   "source": [
    "Before we start fitting the observed data to models, we have to do a few sanity checks and assess if all the moving parts of ``p-winds`` will work well for the configuration you set in the cell above. Most numerical issues are caused when using the ``scipy.integrate`` routines.\n",
    "\n",
    "We start by assessing if the atmospheric model behaves well."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5537fd15",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Calculate the model\n",
    "def atmospheric_model(theta):\n",
    "    log_m_dot, log_T = theta\n",
    "    m_dot = 10 ** log_m_dot\n",
    "    T = 10 ** log_T\n",
    "    \n",
    "    vs = parker.sound_speed(T, h_he, mean_f_ion)  # Speed of sound (km/s, assumed to be constant)\n",
    "    rs = parker.radius_sonic_point(M_pl, vs)  # Radius at the sonic point (jupiterRad)\n",
    "    rhos = parker.density_sonic_point(m_dot, rs, vs)  # Density at the sonic point (g/cm^3)\n",
    "    r_array = r * R_pl / rs  # altitudes in unit of radius at the sonic point\n",
    "    v_array, rho_array = parker.structure(r_array)  # velocities and densities in units at the sonic point\n",
    "    \n",
    "    f_r = hydrogen.ion_fraction(r, R_pl, T, h_he,\n",
    "                                m_dot, M_pl, mean_f_ion,\n",
    "                                spectrum_at_planet=host_spectrum,\n",
    "                                initial_f_ion=initial_f_ion, relax_solution=relax_solution)\n",
    "\n",
    "    f_he_1, f_he_3 = helium.population_fraction(r, v_array, rho_array, f_r,\n",
    "        R_pl, T, h_he, vs, rs, rhos, host_spectrum,\n",
    "        initial_state=initial_f_he, atol=atol, rtol=rtol, relax_solution=relax_solution,\n",
    "                                                solver='Radau')\n",
    "\n",
    "    # Number density of helium nuclei\n",
    "    n_he = (rho_array * rhos * (1 - h_he) / (1 + 4 * (1 - h_he)) / m_h)\n",
    "    \n",
    "    # Number density distribution of helium\n",
    "    n_he_1 = f_he_1 * n_he\n",
    "    n_he_3 = f_he_3 * n_he\n",
    "    n_he_ion = (1 - f_he_1 - f_he_3) * n_he\n",
    "    \n",
    "    return n_he_1, n_he_3, n_he_ion\n",
    "\n",
    "# Let's test if the model function is working\n",
    "theta = (log_m_dot_0, log_T_0)\n",
    "y0 = (initial_f_ion, initial_f_he)\n",
    "n_he_1, n_he_3, n_he_ion = atmospheric_model(theta)\n",
    "\n",
    "plt.semilogy(r, n_he_1, color='C0', label='He singlet')\n",
    "plt.semilogy(r, n_he_3, color='C1', label='He triplet')\n",
    "plt.semilogy(r, n_he_ion, color='C2', label='He ionized')\n",
    "plt.xlabel(r'Radius (R$_\\mathrm{pl}$)')\n",
    "plt.ylabel('Number density (cm$^{-3}$)')\n",
    "plt.xlim(1, 10)\n",
    "plt.ylim(1E-2, 1E10)\n",
    "plt.legend()\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "23759e18",
   "metadata": {},
   "source": [
    "Seems to be working fine. Now we do a sanity check for the radiative transfer. There is not a lot of things that can break here, but we do it anyway."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e975a383",
   "metadata": {},
   "outputs": [],
   "source": [
    "# The transmission spectrum model\n",
    "def transmission_model(wavelength_array, v_wind, n_he_3_distribution, log_T):\n",
    "\n",
    "    # Set up the transit configuration\n",
    "    R_pl_physical = R_pl * 71492000  # Planet radius in m\n",
    "    r_SI = r * R_pl_physical  # Array of altitudes in m\n",
    "    n_he_3_SI = n_he_3_distribution * 1E6  # Volumetric densities in 1 / m ** 3\n",
    "\n",
    "    # Set up the simplified ray tracing.\n",
    "    f_maps = []\n",
    "    d_maps = []\n",
    "    for i in range(n_samples):\n",
    "        flux_map, density_map = transit.draw_transit(\n",
    "            planet_to_star_ratio,\n",
    "            impact_parameter=impact_parameter,\n",
    "            phase=sample_phases[i],\n",
    "            density_profile=n_he_3_SI,\n",
    "            profile_radius=r_SI,\n",
    "            planet_physical_radius=R_pl_physical,\n",
    "            grid_size=transit_grid_size\n",
    "                                   )\n",
    "        f_maps.append(flux_map)\n",
    "        d_maps.append(density_map)\n",
    "    \n",
    "    # Do the radiative transfer\n",
    "    spectra = []\n",
    "    v_turb = (5 / 3 * k_B * (10 ** log_T) / m_He) ** 0.5\n",
    "    for i in range(n_samples):\n",
    "        spec = transit.radiative_transfer(f_maps[i], d_maps[i],\n",
    "                                          wavelength_array, w_array, f_array, a_array, \n",
    "                                          10 ** log_T, m_He, v_wind, turbulence_speed=v_turb)\n",
    "        spectra.append(spec)\n",
    "\n",
    "    spectra = np.array(spectra)\n",
    "    spectrum = np.mean(spectra, axis=0)\n",
    "    return spectrum\n",
    "\n",
    "# Here we divide wl_obs by 1E10 to convert angstrom to m\n",
    "t_spectrum = transmission_model(wl_obs / 1E10, v_wind_0, n_he_3, log_T_0)\n",
    "plt.errorbar(wl_obs, f_obs, yerr=u_obs)\n",
    "plt.plot(wl_obs, t_spectrum + planet_to_star_ratio ** 2, color='k', lw=2)\n",
    "plt.xlabel(r'Wavelength (${\\rm \\AA}$)')\n",
    "plt.ylabel('Normalized flux')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ebff07be",
   "metadata": {},
   "source": [
    "Alright, it seems that our first guess does not fit very well. But we shall soon fix this issue. For now, let's write a cascading model that combines both the atmosphere and the radiative transfer. This function will also convolve our predicted spectrum with the instrumental profile."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "39b0d119",
   "metadata": {},
   "outputs": [],
   "source": [
    "def cascading_model(theta, wavelength_array):\n",
    "    log_m_dot, log_T, v_wind = theta\n",
    "    n_he_1, n_he_3, n_he_ion = atmospheric_model((log_m_dot, log_T))\n",
    "    t_spec = transmission_model(wavelength_array, v_wind, n_he_3, log_T) + baseline\n",
    "    t_spec_conv = convolve(t_spec, instrumental_profile, boundary='extend')\n",
    "    return t_spec\n",
    "\n",
    "theta0 = (log_m_dot_0, log_T_0, v_wind_0)\n",
    "\n",
    "t_spec = cascading_model(theta0, wl_obs / 1E10)\n",
    "\n",
    "plt.errorbar(wl_obs, f_obs, yerr=u_obs)\n",
    "plt.plot(wl_obs, t_spec, color='k', lw=2)\n",
    "plt.xlabel(r'Wavelength (${\\rm \\AA}$)')\n",
    "plt.ylabel('Normalized flux')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "eba5b671",
   "metadata": {},
   "source": [
    "Great, it seems that the cascading model is also working well. We will fit it to the observations using a maximum likelihood estimation. The log-likelihood is defined as:\n",
    "\n",
    "$$\n",
    "\\ln{p(y | x, \\sigma, \\log{\\dot{m}}, \\log{T}, v_{\\rm wind})} = -\\frac{1}{2} \\sum_n \\left[ \\frac{\\left(y_n - y_{\\rm model}\\right)^2}{\\sigma^2} + \\ln{\\left(2\\pi \\sigma^2 \\right)} \\right]\n",
    "$$\n",
    "\n",
    "We do one sneaky trick in the calculation of log-likelihood here to avoid some numerical issues. The problem is that ``scipy.integrate.solve_ivp()``, which calculates the steady-state ionization of He, for some reason, can ocassionally become numerically unstable in some very specific cases and lose precision, yielding a `ValueError`. These solutions are of no use to us, but we do not want them to stop our optimization. So we discard them by making the log-likelihood function return `-np.inf` in those cases."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8c6fa1b8",
   "metadata": {},
   "outputs": [],
   "source": [
    "def log_likelihood(theta, x, y, yerr):\n",
    "    try:\n",
    "        model = cascading_model(theta, x)\n",
    "        sigma2 = yerr ** 2 + model ** 2\n",
    "        return -0.5 * np.sum((y - model) ** 2 / sigma2 + np.log(sigma2))\n",
    "    except ValueError:\n",
    "        return -np.inf"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d9135eab",
   "metadata": {},
   "source": [
    "With all that set, we use `scipy.optimize.minimize()` to maximize the likelihood of our solution and find the best fit. This calculation takes a few minutes to run on a computer with a 3.1 GHz CPU, so I commented the line that actually does this calculation as to not use the resources of online platforms that compile this notebook and upset the powers that be. But you should try running it in your own computer. \n",
    "\n",
    "You will likely run into some warnings, but the result should be robust. The actual computation time depends on how bad the first guess was, so you will probably save some time if you do a first fit by eye and than optimize it. You can also try changing the `method` option of `minimize()`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4f96d1f3",
   "metadata": {},
   "outputs": [],
   "source": [
    "nll = lambda *args: -log_likelihood(*args)\n",
    "args = (wl_obs / 1E10, f_obs, u_obs)\n",
    "# soln = minimize(nll, theta0, args=args, method='Nelder-Mead')\n",
    "# m_ml, b_ml, log_f_ml = soln.x"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "df6c88fb",
   "metadata": {},
   "source": [
    "When I started from a very good guess (`m_dot = 7E10`, `T_0 = 12000`, `v_wind_0 = -2000.0`), `minimize()` converges to a best fit solution of $\\dot{m} = 5.7 \\times 10^{10}$ g s$^{-1}$, $T = 11235$ K, and $v_{\\rm wind} = -2.0$ km s$^{-1}$ in about 5 minutes in a 3.1 GHz CPU."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}