{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "35354229",
   "metadata": {},
   "source": [
    "# Setting up and running absolute hydration free energy calculations\n",
    "\n",
    "This tutorial gives a step-by-step process to set up absolute hydration free energy (AHFE) simulation campaign using OpenFE. In this tutorial we are performing an absolute hydration free energy calculation of benzene.\n",
    "\n",
    "The absolute hydration free energy is obtained through a thermodynamic cycle. The interactions of the molecule are decoupled both in solvent, giving $\\Delta G$(solvent) and in vacuum, giving $\\Delta G$(vacuum), which allows calculation of the absolute hydration free energy, $\\Delta G$(AHFE). \n",
    "\n",
    "Note: In this protocol, the coulombic interactions of the molecule are fully turned off (annihilated), while the Lennard-Jones interactions are decoupled, meaning the intermolecular interactions turned off, keeping the intramolecular Lennard-Jones interactions."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "aa696921-9fef-4a5e-ab66-a5023fa56a80",
   "metadata": {},
   "source": [
    "## <img src=\"ahfe_cycle.png\" alt=\"Drawing\" style=\"width: 300px;\"/>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "fc97de03",
   "metadata": {},
   "outputs": [],
   "source": [
    "%matplotlib inline\n",
    "import openfe"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2fea29c3",
   "metadata": {},
   "source": [
    "## 1. Loading the ligand\n",
    "\n",
    "First we must load the chemical models between which we wish to calculate free energies.\n",
    "In this example these are initially stored in a molfile (`.sdf`) containing multiple molecules.\n",
    "This can be loaded using the `SDMolSupplier` class from rdkit and passed to openfe."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "41cf8be7",
   "metadata": {},
   "outputs": [],
   "source": [
    "from rdkit import Chem\n",
    "supp = Chem.SDMolSupplier(\"../cookbook/assets/benzene.sdf\", removeHs=False)\n",
    "ligands = [openfe.SmallMoleculeComponent.from_rdkit(mol) for mol in supp]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d0cb1329",
   "metadata": {},
   "source": [
    "## 2. Creating `ChemicalSystem`s\n",
    "\n",
    "OpenFE describes complex molecular systems as being composed of `Component`s. For example, we have `SmallMoleculeComponent` for each small molecule in the `LigandNetwork`. We'll create a `SolventComponent` to describe the solvent.\n",
    "\n",
    "The `Component`s are joined in a `ChemicalSystem`, which describes all the particles in the simulation.\n",
    "\n",
    "In state A of the hydration free energy protocol, the ligand is fully interacting in the solvent and the `ChemicalSystem` contains both the ligand and the solvent. In the other endstate, state B, the ligand is fully decoupled in the solvent. Therefore, the `ChemicalSystem` in state B only contains the solvent.\n",
    "\n",
    "Note that for AHFE simulations, we are not separately defining the vacuum state, but the protocol creates that based on the solvent states."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "9d2fbc22",
   "metadata": {},
   "outputs": [],
   "source": [
    "# defaults are water with NaCl at 0.15 M\n",
    "solvent = openfe.SolventComponent()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "710285ca",
   "metadata": {},
   "outputs": [],
   "source": [
    "# In state A the ligand is fully interacting in the solvent\n",
    "systemA = openfe.ChemicalSystem({\n",
    "    'ligand': ligands[0],\n",
    "    'solvent': solvent,\n",
    "}, name=ligands[0].name)\n",
    "# In state B the ligand is fully decoupled in the solvent, therefore we are only defining the solvent here\n",
    "systemB = openfe.ChemicalSystem({'solvent': solvent})"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "33aa384c-54f4-4b42-8b6d-38b07b4e3c47",
   "metadata": {},
   "source": [
    "## 3. Defining the AHFE simulation settings and creating a `Protocol`"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a0993eec-50f4-49d2-83ba-e3c166b9377e",
   "metadata": {},
   "source": [
    "There are various different parameters which can be set to determine how the AHFE simulation will take place. \n",
    "\n",
    "The easiest way to customize protocol settings is to start with the default settings, and modify them. Many settings carry units with them."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "c31c712b-4844-477b-8aa4-ded6f0c8ca5f",
   "metadata": {},
   "outputs": [],
   "source": [
    "from openfe.protocols.openmm_afe import AbsoluteSolvationProtocol"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "fb839094",
   "metadata": {},
   "outputs": [],
   "source": [
    "settings = AbsoluteSolvationProtocol.default_settings()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "8b99f77f-c70c-436d-b4eb-fb462a4b043e",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'alchemical_settings': {},\n",
      " 'integrator_settings': {'barostat_frequency': <Quantity(25, 'timestep')>,\n",
      "                         'constraint_tolerance': 1e-06,\n",
      "                         'langevin_collision_rate': <Quantity(1.0, '1 / picosecond')>,\n",
      "                         'n_restart_attempts': 20,\n",
      "                         'reassign_velocities': False,\n",
      "                         'remove_com': False,\n",
      "                         'timestep': <Quantity(4, 'femtosecond')>},\n",
      " 'lambda_settings': {'lambda_elec': [0.0,\n",
      "                                     0.25,\n",
      "                                     0.5,\n",
      "                                     0.75,\n",
      "                                     1.0,\n",
      "                                     1.0,\n",
      "                                     1.0,\n",
      "                                     1.0,\n",
      "                                     1.0,\n",
      "                                     1.0,\n",
      "                                     1.0,\n",
      "                                     1.0,\n",
      "                                     1.0,\n",
      "                                     1.0],\n",
      "                     'lambda_restraints': [0.0,\n",
      "                                           0.0,\n",
      "                                           0.0,\n",
      "                                           0.0,\n",
      "                                           0.0,\n",
      "                                           0.0,\n",
      "                                           0.0,\n",
      "                                           0.0,\n",
      "                                           0.0,\n",
      "                                           0.0,\n",
      "                                           0.0,\n",
      "                                           0.0,\n",
      "                                           0.0,\n",
      "                                           0.0],\n",
      "                     'lambda_vdw': [0.0,\n",
      "                                    0.0,\n",
      "                                    0.0,\n",
      "                                    0.0,\n",
      "                                    0.0,\n",
      "                                    0.12,\n",
      "                                    0.24,\n",
      "                                    0.36,\n",
      "                                    0.48,\n",
      "                                    0.6,\n",
      "                                    0.7,\n",
      "                                    0.77,\n",
      "                                    0.85,\n",
      "                                    1.0]},\n",
      " 'partial_charge_settings': {'nagl_model': None,\n",
      "                             'number_of_conformers': None,\n",
      "                             'off_toolkit_backend': 'ambertools',\n",
      "                             'partial_charge_method': 'am1bcc'},\n",
      " 'protocol_repeats': 3,\n",
      " 'solvation_settings': {'solvent_model': 'tip3p',\n",
      "                        'solvent_padding': <Quantity(1.2, 'nanometer')>},\n",
      " 'solvent_engine_settings': {'compute_platform': None},\n",
      " 'solvent_equil_output_settings': {'checkpoint_interval': <Quantity(250, 'picosecond')>,\n",
      "                                   'checkpoint_storage_filename': 'checkpoint.chk',\n",
      "                                   'equil_npt_structure': 'equil_npt_structure.pdb',\n",
      "                                   'equil_nvt_structure': 'equil_nvt_structure.pdb',\n",
      "                                   'forcefield_cache': 'db.json',\n",
      "                                   'log_output': 'equil_simulation.log',\n",
      "                                   'minimized_structure': 'minimized.pdb',\n",
      "                                   'output_indices': 'not water',\n",
      "                                   'preminimized_structure': 'system.pdb',\n",
      "                                   'production_trajectory_filename': 'production_equil.xtc',\n",
      "                                   'trajectory_write_interval': <Quantity(20, 'picosecond')>},\n",
      " 'solvent_equil_simulation_settings': {'equilibration_length': <Quantity(0.2, 'nanosecond')>,\n",
      "                                       'equilibration_length_nvt': <Quantity(0.1, 'nanosecond')>,\n",
      "                                       'minimization_steps': 5000,\n",
      "                                       'production_length': <Quantity(0.5, 'nanosecond')>},\n",
      " 'solvent_forcefield_settings': {'constraints': 'hbonds',\n",
      "                                 'forcefields': ['amber/ff14SB.xml',\n",
      "                                                 'amber/tip3p_standard.xml',\n",
      "                                                 'amber/tip3p_HFE_multivalent.xml',\n",
      "                                                 'amber/phosaa10.xml'],\n",
      "                                 'hydrogen_mass': 3.0,\n",
      "                                 'nonbonded_cutoff': <Quantity(1.0, 'nanometer')>,\n",
      "                                 'nonbonded_method': 'PME',\n",
      "                                 'rigid_water': True,\n",
      "                                 'small_molecule_forcefield': 'openff-2.0.0'},\n",
      " 'solvent_output_settings': {'checkpoint_interval': <Quantity(250, 'picosecond')>,\n",
      "                             'checkpoint_storage_filename': 'solvent_checkpoint.nc',\n",
      "                             'forcefield_cache': 'db.json',\n",
      "                             'output_filename': 'solvent.nc',\n",
      "                             'output_indices': 'not water',\n",
      "                             'output_structure': 'hybrid_system.pdb'},\n",
      " 'solvent_simulation_settings': {'early_termination_target_error': <Quantity(0.0, 'kilocalorie_per_mole')>,\n",
      "                                 'equilibration_length': <Quantity(1.0, 'nanosecond')>,\n",
      "                                 'minimization_steps': 5000,\n",
      "                                 'n_replicas': 14,\n",
      "                                 'production_length': <Quantity(10.0, 'nanosecond')>,\n",
      "                                 'real_time_analysis_interval': <Quantity(250, 'picosecond')>,\n",
      "                                 'real_time_analysis_minimum_time': <Quantity(500, 'picosecond')>,\n",
      "                                 'sampler_method': 'repex',\n",
      "                                 'sams_flatness_criteria': 'logZ-flatness',\n",
      "                                 'sams_gamma0': 1.0,\n",
      "                                 'time_per_iteration': <Quantity(1, 'picosecond')>},\n",
      " 'thermo_settings': {'ph': None,\n",
      "                     'pressure': <Quantity(0.986923267, 'standard_atmosphere')>,\n",
      "                     'redox_potential': None,\n",
      "                     'temperature': <Quantity(298.15, 'kelvin')>},\n",
      " 'vacuum_engine_settings': {'compute_platform': None},\n",
      " 'vacuum_equil_output_settings': {'checkpoint_interval': <Quantity(250, 'picosecond')>,\n",
      "                                  'checkpoint_storage_filename': 'checkpoint.chk',\n",
      "                                  'equil_npt_structure': 'equil_structure.pdb',\n",
      "                                  'equil_nvt_structure': None,\n",
      "                                  'forcefield_cache': 'db.json',\n",
      "                                  'log_output': 'equil_simulation.log',\n",
      "                                  'minimized_structure': 'minimized.pdb',\n",
      "                                  'output_indices': 'not water',\n",
      "                                  'preminimized_structure': 'system.pdb',\n",
      "                                  'production_trajectory_filename': 'production_equil.xtc',\n",
      "                                  'trajectory_write_interval': <Quantity(20, 'picosecond')>},\n",
      " 'vacuum_equil_simulation_settings': {'equilibration_length': <Quantity(0.2, 'nanosecond')>,\n",
      "                                      'equilibration_length_nvt': None,\n",
      "                                      'minimization_steps': 5000,\n",
      "                                      'production_length': <Quantity(0.5, 'nanosecond')>},\n",
      " 'vacuum_forcefield_settings': {'constraints': 'hbonds',\n",
      "                                'forcefields': ['amber/ff14SB.xml',\n",
      "                                                'amber/tip3p_standard.xml',\n",
      "                                                'amber/tip3p_HFE_multivalent.xml',\n",
      "                                                'amber/phosaa10.xml'],\n",
      "                                'hydrogen_mass': 3.0,\n",
      "                                'nonbonded_cutoff': <Quantity(1.0, 'nanometer')>,\n",
      "                                'nonbonded_method': 'nocutoff',\n",
      "                                'rigid_water': True,\n",
      "                                'small_molecule_forcefield': 'openff-2.0.0'},\n",
      " 'vacuum_output_settings': {'checkpoint_interval': <Quantity(250, 'picosecond')>,\n",
      "                            'checkpoint_storage_filename': 'vacuum_checkpoint.nc',\n",
      "                            'forcefield_cache': 'db.json',\n",
      "                            'output_filename': 'vacuum.nc',\n",
      "                            'output_indices': 'not water',\n",
      "                            'output_structure': 'hybrid_system.pdb'},\n",
      " 'vacuum_simulation_settings': {'early_termination_target_error': <Quantity(0.0, 'kilocalorie_per_mole')>,\n",
      "                                'equilibration_length': <Quantity(0.5, 'nanosecond')>,\n",
      "                                'minimization_steps': 5000,\n",
      "                                'n_replicas': 14,\n",
      "                                'production_length': <Quantity(2.0, 'nanosecond')>,\n",
      "                                'real_time_analysis_interval': <Quantity(250, 'picosecond')>,\n",
      "                                'real_time_analysis_minimum_time': <Quantity(500, 'picosecond')>,\n",
      "                                'sampler_method': 'repex',\n",
      "                                'sams_flatness_criteria': 'logZ-flatness',\n",
      "                                'sams_gamma0': 1.0,\n",
      "                                'time_per_iteration': <Quantity(1, 'picosecond')>}}\n"
     ]
    }
   ],
   "source": [
    "settings"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f2d77ba4-f6d7-465f-8933-1cf35748d71f",
   "metadata": {},
   "source": [
    "Displaying the default values:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "82c3e1ea-2bf6-47ec-bf1d-fe06adb6ea10",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "298.15 kelvin"
      ],
      "text/latex": [
       "$298.15\\ \\mathrm{kelvin}$"
      ],
      "text/plain": [
       "298.15 <Unit('kelvin')>"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "settings.thermo_settings.temperature  # Simulation temperature"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "05d407e9-8ddb-4c53-be73-88ec65f4a93e",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[0.0, 0.25, 0.5, 0.75, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "settings.lambda_settings.lambda_elec # List of floats of lambda values for the electrostatics"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "cbaa744c-9b3e-45ca-b6e1-227e80446623",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "1.0 nanosecond"
      ],
      "text/latex": [
       "$1.0\\ \\mathrm{nanosecond}$"
      ],
      "text/plain": [
       "1.0 <Unit('nanosecond')>"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "settings.solvent_simulation_settings.equilibration_length # Length of the NPT equilibration in the solvent, prior to running the AHFE calculation"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9f81f879-9eea-4af0-b772-e28832912487",
   "metadata": {},
   "source": [
    "By default we run 3 repeats with solvent and vacuum simulation lengths of 10 ns and 2 ns over 14 lambda windows. \n",
    "To speed things up here we instead run 1 repeat with both solvent and vacuum simulation lengths of 0.5 ns over 14 lambda windows.\n",
    "\n",
    "Changing default values:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "e83630f0",
   "metadata": {},
   "outputs": [],
   "source": [
    "from openff.units import unit\n",
    "\n",
    "# change the values\n",
    "settings.protocol_repeats = 1\n",
    "settings.lambda_settings.lambda_elec = [0.0, 0.26, 0.5, 0.75, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "55067780-d228-4661-8c1e-5cb0217fd2dc",
   "metadata": {},
   "outputs": [],
   "source": [
    "settings.solvent_simulation_settings.equilibration_length = 10 * unit.picosecond\n",
    "settings.solvent_simulation_settings.production_length = 500 * unit.picosecond\n",
    "settings.vacuum_simulation_settings.equilibration_length = 10 * unit.picosecond\n",
    "settings.vacuum_simulation_settings.production_length = 500 * unit.picosecond\n",
    "settings.solvent_engine_settings.compute_platform = 'CUDA'"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "af14d630-e933-4054-9444-e96af7dfcc80",
   "metadata": {
    "slideshow": {
     "slide_type": ""
    },
    "tags": []
   },
   "source": [
    "Here a view of all the settings that the user can modify as shown in the examples above:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "137d3f96-00fc-4400-bf5d-b8ed355dbbec",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'alchemical_settings': {},\n",
      " 'integrator_settings': {'barostat_frequency': <Quantity(25, 'timestep')>,\n",
      "                         'constraint_tolerance': 1e-06,\n",
      "                         'langevin_collision_rate': <Quantity(1.0, '1 / picosecond')>,\n",
      "                         'n_restart_attempts': 20,\n",
      "                         'reassign_velocities': False,\n",
      "                         'remove_com': False,\n",
      "                         'timestep': <Quantity(4, 'femtosecond')>},\n",
      " 'lambda_settings': {'lambda_elec': [0.0,\n",
      "                                     0.26,\n",
      "                                     0.5,\n",
      "                                     0.75,\n",
      "                                     1.0,\n",
      "                                     1.0,\n",
      "                                     1.0,\n",
      "                                     1.0,\n",
      "                                     1.0,\n",
      "                                     1.0,\n",
      "                                     1.0,\n",
      "                                     1.0,\n",
      "                                     1.0,\n",
      "                                     1.0],\n",
      "                     'lambda_restraints': [0.0,\n",
      "                                           0.0,\n",
      "                                           0.0,\n",
      "                                           0.0,\n",
      "                                           0.0,\n",
      "                                           0.0,\n",
      "                                           0.0,\n",
      "                                           0.0,\n",
      "                                           0.0,\n",
      "                                           0.0,\n",
      "                                           0.0,\n",
      "                                           0.0,\n",
      "                                           0.0,\n",
      "                                           0.0],\n",
      "                     'lambda_vdw': [0.0,\n",
      "                                    0.0,\n",
      "                                    0.0,\n",
      "                                    0.0,\n",
      "                                    0.0,\n",
      "                                    0.12,\n",
      "                                    0.24,\n",
      "                                    0.36,\n",
      "                                    0.48,\n",
      "                                    0.6,\n",
      "                                    0.7,\n",
      "                                    0.77,\n",
      "                                    0.85,\n",
      "                                    1.0]},\n",
      " 'partial_charge_settings': {'nagl_model': None,\n",
      "                             'number_of_conformers': None,\n",
      "                             'off_toolkit_backend': 'ambertools',\n",
      "                             'partial_charge_method': 'am1bcc'},\n",
      " 'protocol_repeats': 1,\n",
      " 'solvation_settings': {'solvent_model': 'tip3p',\n",
      "                        'solvent_padding': <Quantity(1.2, 'nanometer')>},\n",
      " 'solvent_engine_settings': {'compute_platform': 'CUDA'},\n",
      " 'solvent_equil_output_settings': {'checkpoint_interval': <Quantity(250, 'picosecond')>,\n",
      "                                   'checkpoint_storage_filename': 'checkpoint.chk',\n",
      "                                   'equil_npt_structure': 'equil_npt_structure.pdb',\n",
      "                                   'equil_nvt_structure': 'equil_nvt_structure.pdb',\n",
      "                                   'forcefield_cache': 'db.json',\n",
      "                                   'log_output': 'equil_simulation.log',\n",
      "                                   'minimized_structure': 'minimized.pdb',\n",
      "                                   'output_indices': 'not water',\n",
      "                                   'preminimized_structure': 'system.pdb',\n",
      "                                   'production_trajectory_filename': 'production_equil.xtc',\n",
      "                                   'trajectory_write_interval': <Quantity(20, 'picosecond')>},\n",
      " 'solvent_equil_simulation_settings': {'equilibration_length': <Quantity(0.2, 'nanosecond')>,\n",
      "                                       'equilibration_length_nvt': <Quantity(0.1, 'nanosecond')>,\n",
      "                                       'minimization_steps': 5000,\n",
      "                                       'production_length': <Quantity(0.5, 'nanosecond')>},\n",
      " 'solvent_forcefield_settings': {'constraints': 'hbonds',\n",
      "                                 'forcefields': ['amber/ff14SB.xml',\n",
      "                                                 'amber/tip3p_standard.xml',\n",
      "                                                 'amber/tip3p_HFE_multivalent.xml',\n",
      "                                                 'amber/phosaa10.xml'],\n",
      "                                 'hydrogen_mass': 3.0,\n",
      "                                 'nonbonded_cutoff': <Quantity(1.0, 'nanometer')>,\n",
      "                                 'nonbonded_method': 'PME',\n",
      "                                 'rigid_water': True,\n",
      "                                 'small_molecule_forcefield': 'openff-2.0.0'},\n",
      " 'solvent_output_settings': {'checkpoint_interval': <Quantity(250, 'picosecond')>,\n",
      "                             'checkpoint_storage_filename': 'solvent_checkpoint.nc',\n",
      "                             'forcefield_cache': 'db.json',\n",
      "                             'output_filename': 'solvent.nc',\n",
      "                             'output_indices': 'not water',\n",
      "                             'output_structure': 'hybrid_system.pdb'},\n",
      " 'solvent_simulation_settings': {'early_termination_target_error': <Quantity(0.0, 'kilocalorie_per_mole')>,\n",
      "                                 'equilibration_length': <Quantity(0.01, 'nanosecond')>,\n",
      "                                 'minimization_steps': 5000,\n",
      "                                 'n_replicas': 14,\n",
      "                                 'production_length': <Quantity(0.5, 'nanosecond')>,\n",
      "                                 'real_time_analysis_interval': <Quantity(250, 'picosecond')>,\n",
      "                                 'real_time_analysis_minimum_time': <Quantity(500, 'picosecond')>,\n",
      "                                 'sampler_method': 'repex',\n",
      "                                 'sams_flatness_criteria': 'logZ-flatness',\n",
      "                                 'sams_gamma0': 1.0,\n",
      "                                 'time_per_iteration': <Quantity(1, 'picosecond')>},\n",
      " 'thermo_settings': {'ph': None,\n",
      "                     'pressure': <Quantity(0.986923267, 'standard_atmosphere')>,\n",
      "                     'redox_potential': None,\n",
      "                     'temperature': <Quantity(298.15, 'kelvin')>},\n",
      " 'vacuum_engine_settings': {'compute_platform': None},\n",
      " 'vacuum_equil_output_settings': {'checkpoint_interval': <Quantity(250, 'picosecond')>,\n",
      "                                  'checkpoint_storage_filename': 'checkpoint.chk',\n",
      "                                  'equil_npt_structure': 'equil_structure.pdb',\n",
      "                                  'equil_nvt_structure': None,\n",
      "                                  'forcefield_cache': 'db.json',\n",
      "                                  'log_output': 'equil_simulation.log',\n",
      "                                  'minimized_structure': 'minimized.pdb',\n",
      "                                  'output_indices': 'not water',\n",
      "                                  'preminimized_structure': 'system.pdb',\n",
      "                                  'production_trajectory_filename': 'production_equil.xtc',\n",
      "                                  'trajectory_write_interval': <Quantity(20, 'picosecond')>},\n",
      " 'vacuum_equil_simulation_settings': {'equilibration_length': <Quantity(0.2, 'nanosecond')>,\n",
      "                                      'equilibration_length_nvt': None,\n",
      "                                      'minimization_steps': 5000,\n",
      "                                      'production_length': <Quantity(0.5, 'nanosecond')>},\n",
      " 'vacuum_forcefield_settings': {'constraints': 'hbonds',\n",
      "                                'forcefields': ['amber/ff14SB.xml',\n",
      "                                                'amber/tip3p_standard.xml',\n",
      "                                                'amber/tip3p_HFE_multivalent.xml',\n",
      "                                                'amber/phosaa10.xml'],\n",
      "                                'hydrogen_mass': 3.0,\n",
      "                                'nonbonded_cutoff': <Quantity(1.0, 'nanometer')>,\n",
      "                                'nonbonded_method': 'nocutoff',\n",
      "                                'rigid_water': True,\n",
      "                                'small_molecule_forcefield': 'openff-2.0.0'},\n",
      " 'vacuum_output_settings': {'checkpoint_interval': <Quantity(250, 'picosecond')>,\n",
      "                            'checkpoint_storage_filename': 'vacuum_checkpoint.nc',\n",
      "                            'forcefield_cache': 'db.json',\n",
      "                            'output_filename': 'vacuum.nc',\n",
      "                            'output_indices': 'not water',\n",
      "                            'output_structure': 'hybrid_system.pdb'},\n",
      " 'vacuum_simulation_settings': {'early_termination_target_error': <Quantity(0.0, 'kilocalorie_per_mole')>,\n",
      "                                'equilibration_length': <Quantity(0.01, 'nanosecond')>,\n",
      "                                'minimization_steps': 5000,\n",
      "                                'n_replicas': 14,\n",
      "                                'production_length': <Quantity(0.5, 'nanosecond')>,\n",
      "                                'real_time_analysis_interval': <Quantity(250, 'picosecond')>,\n",
      "                                'real_time_analysis_minimum_time': <Quantity(500, 'picosecond')>,\n",
      "                                'sampler_method': 'repex',\n",
      "                                'sams_flatness_criteria': 'logZ-flatness',\n",
      "                                'sams_gamma0': 1.0,\n",
      "                                'time_per_iteration': <Quantity(1, 'picosecond')>}}\n"
     ]
    }
   ],
   "source": [
    "settings"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8c32f1b9-b09e-41ac-b66d-3fbc8ee433ec",
   "metadata": {},
   "source": [
    "### Creating the `Protocol`\n",
    "With the Settings inspected and adjusted, we can provide these to the `Protocol`. This `Protocol` defines the procedure to estimate a free energy difference between two chemical systems, with the details of the two end states yet to be defined."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "23f6322b-0336-4aa9-b9d0-ebe533dc5753",
   "metadata": {},
   "outputs": [],
   "source": [
    "protocol = AbsoluteSolvationProtocol(settings=settings)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7f1dfcff-bb06-4508-9f16-4e3de18a5f93",
   "metadata": {},
   "source": [
    "## 4. Running the AHFE simulation"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fe618563-c445-4fb3-bc0c-6c08148ddede",
   "metadata": {},
   "source": [
    "### (a) Using the CLI"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "38129396-ba19-46e9-9eb0-36b0e3a7ad3c",
   "metadata": {},
   "source": [
    "Once we have the ChemicalSystems, and the Protocol, we can create the Transformation. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "1231fad8-b37d-4008-a851-5ad546386286",
   "metadata": {},
   "outputs": [],
   "source": [
    "transformation = openfe.Transformation(\n",
    "            stateA=systemA,\n",
    "            stateB=systemB,\n",
    "            mapping=None,\n",
    "            protocol=protocol,  # use protocol created above\n",
    "            name=f\"{systemA.name}\"\n",
    "        )"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "451e3345-0b30-4dbf-93b5-28847fad1c9a",
   "metadata": {},
   "source": [
    "We'll write out the transformation to disk, so that it can be run using the openfe quickrun command:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "b69efbcd-b888-4afc-a33e-01860889f408",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pathlib\n",
    "# first we create the directory\n",
    "transformation_dir = pathlib.Path(\"ahfe_json\")\n",
    "transformation_dir.mkdir(exist_ok=True)\n",
    "\n",
    "# then we write out the transformation\n",
    "transformation.dump(transformation_dir / f\"{transformation.name}.json\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5b4f7397-d79e-4bea-a473-1750b27d7b44",
   "metadata": {},
   "source": [
    "You can run the AHFE simulation from the CLI by using the `openfe quickrun` command. It takes a transformation JSON as input, and the flags -o to give the final output JSON file and -d for the directory where simulation results should be stored. For example,\n",
    "\n",
    "`openfe quickrun path/to/transformation.json -o results.json -d working-directory`\n",
    "\n",
    "where path/to/transformation.json is the path to one of the files created above."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bc919dc3-1281-49d2-ad92-4a3355d8bbe5",
   "metadata": {},
   "source": [
    "### (b) Using the Python API"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "21b64394-fdc7-47e9-aadd-488a55c1d0bb",
   "metadata": {},
   "source": [
    "**Creating the `ProtocolDAG`**\n",
    "\n",
    "Once we have the two `ChemicalSystem`s, and the `Protocol`, we can create the `ProtocolDAG`.\n",
    "\n",
    "This creates a directed-acyclic-graph (DAG) of computational tasks necessary for creating an estimate of the free energy difference between the two chemical systems."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "44ba94ca",
   "metadata": {},
   "outputs": [],
   "source": [
    "dag = protocol.create(stateA=systemA, stateB=systemB, mapping=None)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4283dfe4",
   "metadata": {},
   "source": [
    "To summarize, this `ProtocolDAG` contains:\n",
    "- chemical models of both sides of the alchemical transformation in `systemA` and `systemB`\n",
    "- a description of the exact computational algorithm to use to perform the estimate in `protocol`\n",
    "- the `mapping` is set to `None` since no atoms are mapped in the AHFE protocol"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1e29d1c8",
   "metadata": {},
   "source": [
    "**Executing the simulation**\n",
    "\n",
    "The DAG contains many invdividual jobs. We can execute them sequentially in this notebook using the `gufe.protocols.execute` function.\n",
    "\n",
    "In a more realistic (expansive) situation we would farm off the individual jobs to a HPC cluster or cloud compute service so they could be executed in parallel.\n",
    "\n",
    "Note: we use the `shared_basedir` and `scratch_basedir` argument of `execute_DAG` in order to set the directory where the simulation files are written to"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "id": "6dbedb47-46b9-4c22-ad74-a580174a359c",
   "metadata": {},
   "outputs": [],
   "source": [
    "from gufe.protocols import execute_DAG\n",
    "import pathlib"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "be690ef1-3243-4114-b56f-5babbb660af6",
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/Users/hannahbaumann/openfe/openfe/protocols/openmm_rfe/_rfe_utils/compute.py:56: UserWarning: Non-GPU platform selected: CPU, this may significantly impact simulation performance\n",
      "  warnings.warn(wmsg)\n",
      "WARNING:root:Non-GPU platform selected: CPU, this may significantly impact simulation performance\n",
      "WARNING:openmmtools.multistate.multistatereporter:Warning: The openmmtools.multistate API is experimental and may change in future releases\n",
      "WARNING:root:Non-GPU platform selected: CPU, this may significantly impact simulation performance\n",
      "WARNING:openmmtools.multistate.multistatesampler:Warning: The openmmtools.multistate API is experimental and may change in future releases\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Please cite the following:\n",
      "\n",
      "        Friedrichs MS, Eastman P, Vaidyanathan V, Houston M, LeGrand S, Beberg AL, Ensign DL, Bruns CM, and Pande VS. Accelerating molecular dynamic simulations on graphics processing unit. J. Comput. Chem. 30:864, 2009. DOI: 10.1002/jcc.21209\n",
      "        Eastman P and Pande VS. OpenMM: A hardware-independent framework for molecular simulations. Comput. Sci. Eng. 12:34, 2010. DOI: 10.1109/MCSE.2010.27\n",
      "        Eastman P and Pande VS. Efficient nonbonded interactions for molecular dynamics on a graphics processing unit. J. Comput. Chem. 31:1268, 2010. DOI: 10.1002/jcc.21413\n",
      "        Eastman P and Pande VS. Constant constraint matrix approximation: A robust, parallelizable constraint method for molecular simulations. J. Chem. Theor. Comput. 6:434, 2010. DOI: 10.1021/ct900463w\n",
      "        Chodera JD and Shirts MR. Replica exchange and expanded ensemble simulations as Gibbs multistate: Simple improvements for enhanced mixing. J. Chem. Phys., 135:194110, 2011. DOI:10.1063/1.3660669\n",
      "        \n"
     ]
    }
   ],
   "source": [
    "# Finally we can run the simulations\n",
    "path = pathlib.Path('./ahfe_results')\n",
    "path.mkdir()\n",
    "\n",
    "# Execute the DAG\n",
    "dag_results = execute_DAG(dag, scratch_basedir=path, shared_basedir=path, n_retries=3)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d91f78b8-58dc-4780-880f-5a643cbb18c9",
   "metadata": {},
   "source": [
    "## 6. Analysis"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "27f1f7d0-a294-41a7-8b97-70878747a22d",
   "metadata": {},
   "source": [
    "Finally now that we've run our simulations, let's go ahead and gather the free\n",
    "energies for both phases."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8dfef043-35de-4943-a728-571a83c13161",
   "metadata": {},
   "source": [
    "### Python API"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cea729d2-231f-4759-8a11-d6d09c1428fb",
   "metadata": {},
   "source": [
    "If you executed the simulations using the Python API, you will have generated a `dag_results` object. You can analyze these results by calling the Protocols' `gather()` method. This takes a **list** of completed DAG results and returns a `AbsoluteSolvationProtocolResult` which can return a free energy estimate and uncertainty by calling the `get_estimate()` and `get_uncertainty()` methods."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d0f6b8fb-dde2-442a-b641-3160a70e2f84",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Get the complex and solvent results\n",
    "protocol_results = protocol.gather([dag_results])\n",
    "\n",
    "print(f\"AHFE dG: {protocol_results.get_estimate()}, err {protocol_results.get_uncertainty()}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b23748c3-3fa6-45f2-8fcb-e471bd28dd3d",
   "metadata": {},
   "source": [
    "You can save the `AbsoluteSolvationProtocolResult` to a JSON output file in the following manner:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "847b5663-2e6c-4cb2-aa9d-39d132fcff3d",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Save the results in a json file\n",
    "import gzip\n",
    "import json\n",
    "import gufe\n",
    "outdict = {\n",
    "    \"estimate\": protocol_results.get_estimate(),\n",
    "    \"uncertainty\": protocol_results.get_uncertainty(),\n",
    "    \"protocol_result\": protocol_results.to_dict(),\n",
    "    \"unit_results\": {\n",
    "        unit.key: unit.to_keyed_dict()\n",
    "        for unit in dag_results.protocol_unit_results\n",
    "    }\n",
    "}\n",
    "\n",
    "with open(\"ahfe_json/benzene_results.json\") as stream:\n",
    "    json.dump(outdict, stream, cls=gufe.tokenization.JSON_HANDLER.encoder)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "43f3626a-74c1-4835-be7b-fd6bf0c81562",
   "metadata": {},
   "source": [
    "### CLI / Quickrun"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "aa3978a2-ed90-4706-b3be-554237d40751",
   "metadata": {},
   "source": [
    "If you ran the simulation using the CLI (i.e. by calling `openfe quickrun` ) you will end up with the same JSON output file as the one created in the previous cell. To get your simulation results you can load them back into Python in the following manner:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "id": "f62f69a1-09c0-4a4a-9b37-9663b51a75ac",
   "metadata": {},
   "outputs": [],
   "source": [
    "import gzip\n",
    "import json\n",
    "import gufe\n",
    "\n",
    "outfile = \"ahfe_json/benzene_results.json\"\n",
    "with open(outfile) as stream:\n",
    "    results = json.load(stream)\n",
    "    estimate = results['estimate']\n",
    "    uncertainty = results['uncertainty']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "id": "3b459b28-a4dc-4fa9-a961-b106c45d79ce",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'magnitude': -0.7536190151704971,\n",
       " 'unit': 'kilocalorie_per_mole',\n",
       " ':is_custom:': True,\n",
       " 'pint_unit_registry': 'openff_units'}"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "estimate"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}