{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "\"\"\"Intermolecular Interactions and Symmetry-Adapted Perturbation Theory\"\"\"\n",
    "\n",
    "__authors__ = \"Konrad Patkowski\"\n",
    "__email__   = [\"patkowsk@auburn.edu\"]\n",
    "\n",
    "__copyright__ = \"(c) 2008-2020, The Psi4Education Developers\"\n",
    "__license__   = \"BSD-3-Clause\"\n",
    "__date__      = \"2020-07-16\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This lab activity is designed to teach students about weak intermolecular interactions, and the calculation and interpretation of the interaction energy between two molecules. The interaction energy can be broken down into physically meaningful contributions (electrostatics, induction, dispersion, and exchange) using symmetry-adapted perturbation theory (SAPT). In this exercise, we will calculate complete interaction energies and their SAPT decomposition using the procedures from the Psi4 software package, processing and analyzing the data with NumPy and Matplotlib.\n",
    "\n",
    "Prerequisite knowledge: the Hartree-Fock method, molecular orbitals, electron correlation and the MP2 theory. The lab also assumes all the standard Python prerequisites of all Psi4Education labs.\n",
    "\n",
    "Learning Objectives: \n",
    "1. Recognize and appreciate the ubiquity and diversity of intermolecular interactions.\n",
    "2. Compare and contrast the supermolecular and perturbative methods of calculating interaction energy.\n",
    "3. Analyze and interpret the electrostatic, induction, dispersion, and exchange SAPT contributions at different intermolecular separations.\n",
    "\n",
    "Author: Konrad Patkowski, Auburn University (patkowsk@auburn.edu; ORCID: 0000-0002-4468-207X)\n",
    "\n",
    "Copyright: Psi4Education Project, 2020\n",
    "\n",
    "# Weak intermolecular interactions \n",
    "\n",
    "In this activity, you will examine some properties of weak interactions between molecules. As the molecular subunits are not connected by any covalent (or ionic) bonds, we often use the term *noncovalent interactions*. Suppose we want to calculate the interaction energy between molecule A and molecule B for a certain geometry of the A-B complex (obviously, this interaction energy depends on how far apart the molecules are and how they are oriented). The simplest way of doing so is by subtraction (in the so-called *supermolecular approach*):\n",
    "\n",
    "\\begin{equation}\n",
    "E_{\\rm int}=E_{\\rm A-B}-E_{\\rm A}-E_{\\rm B}\n",
    "\\end{equation}\n",
    "\n",
    "where $E_{\\rm X}$ is the total energy of system X, computed using our favorite electronic structure theory and basis set. A negative value of $E_{\\rm int}$ means that A and B have a lower energy when they are together than when they are apart, so they do form a weakly bound complex that might be stable at least at very low temperatures. A positive value of $E_{\\rm int}$ means that the A-B complex is unbound - it is energetically favorable for A and B to go their separate ways. \n",
    "\n",
    "Let's consider a simple example of two interacting helium atoms and calculate $E_{\\rm int}$ at a few different interatomic distances $R$. You will use Psi4 to calculate the total energies that you need to perform subtraction. When you do so for a couple different $R$, you will be able to sketch the *potential energy curve* - the graph of $E_{\\rm int}(R)$ as a function of $R$.\n",
    "\n",
    "OK, but how should you pick the electronic structure method to calculate $E_{\\rm A-B}$, $E_{\\rm A}$, and $E_{\\rm B}$? Let's start with the simplest choice and try out the Hartree-Fock (HF) method. In case HF is not accurate enough, we will also try the coupled-cluster method with single, double, and perturbative triple excitations - CCSD(T). If you haven't heard about CCSD(T) before, let's just state that it is **(1)** usually very accurate (it's even called the *gold standard* of electronic structure theory) and **(2)** very expensive for larger molecules. For the basis set, let's pick the augmented correlation consistent triple-zeta (aug-cc-pVTZ) basis of Dunning which should be quite OK for both HF and CCSD(T).\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# A simple Psi4 input script to compute the potential energy curve for two helium atoms\n",
    "\n",
    "%matplotlib notebook\n",
    "import time\n",
    "import numpy as np\n",
    "import scipy\n",
    "from scipy.optimize import *\n",
    "np.set_printoptions(precision=5, linewidth=200, threshold=2000, suppress=True)\n",
    "import psi4\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "# Set Psi4 & NumPy Memory Options\n",
    "psi4.set_memory('2 GB')\n",
    "psi4.core.set_output_file('output.dat', False)\n",
    "\n",
    "numpy_memory = 2\n",
    "\n",
    "psi4.set_options({'basis': 'aug-cc-pVTZ',\n",
    "              'e_convergence': 1e-10,\n",
    "              'd_convergence': 1e-10,\n",
    "              'INTS_TOLERANCE': 1e-15})\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We need to collect some data points to graph the function $E_{\\rm int}(R)$. Therefore, we set up a list of distances $R$ for which we will run the calculations (we go with 11 of them). For each distance, we need to remember three values ($E_{\\rm A-B}$, $E_{\\rm A}$, and $E_{\\rm B}$). For this purpose, we will prepare two $11\\times 3$ NumPy arrays to hold the HF and CCSD(T) results. \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "distances = [4.0,4.5,5.0,5.3,5.6,6.0,6.5,7.0,8.0,9.0,10.0]\n",
    "ehf = np.zeros((11,3))\n",
    "eccsdt = np.zeros((11,3))\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We are almost ready to crunch some numbers! One question though: how are we going to tell Psi4 whether we want $E_{\\rm A-B}$, $E_{\\rm A}$, or $E_{\\rm B}$? \n",
    "We need to define three different geometries. The $E_{\\rm A-B}$ one has two helium atoms $R$ atomic units from each other - we can place one atom at $(0,0,0)$ and the other at $(0,0,R)$. The other two geometries involve one actual helium atom, with a nucleus and two electrons, and one *ghost atom* in place of the other one. A ghost atom does not have a nucleus or electrons, but it does carry the same basis functions as an actual atom - we need to calculate all energies in the same basis set, with functions centered at both $(0,0,0)$ and $(0,0,R)$, to prevent the so-called *basis set superposition error*. In Psi4, the syntax `Gh(X)` denotes a ghost atom where basis functions for atom type X are located. \n",
    "\n",
    "Using ghost atoms, we can now easily define geometries for the $E_{\\rm A}$ and $E_{\\rm B}$ calculations.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "for i in range(len(distances)):\n",
    "  dimer = psi4.geometry(\"\"\"\n",
    "  He 0.0 0.0 0.0\n",
    "  --\n",
    "  He 0.0 0.0 \"\"\"+str(distances[i])+\"\"\"\n",
    "  units bohr\n",
    "  symmetry c1\n",
    "  \"\"\")\n",
    "\n",
    "  psi4.energy('ccsd(t)')   #HF will be calculated along the way\n",
    "  ehf[i,0] = psi4.variable('HF TOTAL ENERGY')\n",
    "  eccsdt[i,0] = psi4.variable('CCSD(T) TOTAL ENERGY')\n",
    "  psi4.core.clean()\n",
    "\n",
    "  monomerA = psi4.geometry(\"\"\"\n",
    "  He 0.0 0.0 0.0\n",
    "  --\n",
    "  Gh(He) 0.0 0.0 \"\"\"+str(distances[i])+\"\"\"\n",
    "  units bohr\n",
    "  symmetry c1\n",
    "  \"\"\")\n",
    "\n",
    "  psi4.energy('ccsd(t)')   #HF will be calculated along the way\n",
    "  ehf[i,1] = psi4.variable('HF TOTAL ENERGY')\n",
    "  eccsdt[i,1] = psi4.variable('CCSD(T) TOTAL ENERGY')\n",
    "  psi4.core.clean()\n",
    "\n",
    "  monomerB = psi4.geometry(\"\"\"\n",
    "  Gh(He) 0.0 0.0 0.0\n",
    "  --\n",
    "  He 0.0 0.0 \"\"\"+str(distances[i])+\"\"\"\n",
    "  units bohr\n",
    "  symmetry c1\n",
    "  \"\"\")\n",
    "\n",
    "  psi4.energy('ccsd(t)')   #HF will be calculated along the way\n",
    "  ehf[i,2] = psi4.variable('HF TOTAL ENERGY')\n",
    "  eccsdt[i,2] = psi4.variable('CCSD(T) TOTAL ENERGY')\n",
    "  psi4.core.clean()\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We have completed the $E_{\\rm A-B}$, $E_{\\rm A}$, or $E_{\\rm B}$ calculations for all 11 distances $R$ (it didn't take that long, did it?). We will now perform the subtraction to form NumPy arrays with $E_{\\rm int}(R)$ values for each method, converted from atomic units (hartrees) to kcal/mol, and graph the resulting potential energy curves using the matplotlib library. \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#COMPLETE the two lines below to generate interaction energies. Convert them from atomic units to kcal/mol.\n",
    "einthf = \n",
    "eintccsdt = \n",
    "\n",
    "print ('HF PEC',einthf)\n",
    "print ('CCSD(T) PEC',eintccsdt)\n",
    "\n",
    "plt.plot(distances,einthf,'r+',linestyle='-',label='HF')\n",
    "plt.plot(distances,eintccsdt,'bo',linestyle='-',label='CCSD(T)')\n",
    "plt.hlines(0.0,4.0,10.0)\n",
    "plt.legend(loc='upper right')\n",
    "plt.show()\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "*Questions* \n",
    "1. Which curve makes more physical sense?\n",
    "2. Why does helium form a liquid at very low temperatures?\n",
    "3. You learned in freshman chemistry that two helium atoms do not form a molecule because there are two electrons on a bonding orbital and two electrons on an antibonding orbital. How does this information relate to the behavior of HF (which does assume a molecular orbital for every electron) and CCSD(T) (which goes beyond the molecular orbital picture)?\n",
    "4. When you increase the size of the interacting molecules, the CCSD(T) method quickly gets much more expensive and your calculation might take weeks instead of seconds. It gets especially expensive for the calculation of $E_{\\rm A-B}$ because A-B has more electrons than either A or B. Your friend suggests to use CCSD(T) only for the easier terms $E_{\\rm A}$ and $E_{\\rm B}$ and subtract them from $E_{\\rm A-B}$ calculated with a different, cheaper method such as HF. Why is this a really bad idea?\n",
    "\n",
    "*To answer the questions above, please double click this Markdown cell to edit it. When you are done entering your answers, run this cell as if it was a code cell, and your Markdown source will be recompiled.*\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A nice feature of the supermolecular approach is that it is very easy to use - you just need to run three standard energy calculations, and modern quantum chemistry codes such as Psi4 give you a lot of methods to choose from. However, the accuracy of subtraction hinges on error cancellation, and we have to be careful to ensure that the errors do cancel between $E_{\\rm A-B}$ and $E_{\\rm A}+E_{\\rm B}$. Another drawback of the supermolecular approach is that it is not particularly rich in physical insight. All that we get is a single number $E_{\\rm int}$ that tells us very little about the underlying physics of the interaction. Therefore, one may want to find an alternative approach where $E_{\\rm int}$ is computed directly, without subtraction, and it is obtained as a sum of distinct, physically meaningful terms. Symmetry-adapted perturbation theory (SAPT) is such an alternative approach.\n",
    "\n",
    "# Symmetry-Adapted Perturbation Theory (SAPT)\n",
    "\n",
    "SAPT is a perturbation theory aimed specifically at calculating the interaction energy between two molecules. Contrary to the supermolecular approach, SAPT obtains the interaction energy directly - no subtraction of similar terms is needed. Moreover, the result is obtained as a sum of separate corrections accounting for the electrostatic, induction, dispersion, and exchange contributions to interaction energy, so the SAPT decomposition facilitates the understanding and physical interpretation of results.\n",
    "- *Electrostatic energy* arises from the Coulomb interaction between charge densities of isolated molecules.\n",
    "- *Induction energy* is the energetic effect of mutual polarization between the two molecules.\n",
    "- *Dispersion energy* is a consequence of intermolecular electron correlation, usually explained in terms of correlated fluctuations of electron density on both molecules.\n",
    "- *Exchange energy* is a short-range repulsive effect that is a consequence of the Pauli exclusion principle.\n",
    "\n",
    "In this activity, we will explore the simplest level of the SAPT theory called SAPT0 (see [Parker:2014] for the definitions of different levels of SAPT). A particular SAPT correction $E^{(nk)}$ corresponds to effects that are of $n$th order in the intermolecular interaction and $k$th order in the intramolecular electron correlation. In SAPT0, intramolecular correlation is neglected, and intermolecular interaction is included through second order:\n",
    "\n",
    "\\begin{equation}\n",
    "E_{\\rm int}^{\\rm SAPT0}=E^{(10)}_{\\rm elst}+E^{(10)}_{\\rm exch}+E^{(20)}_{\\rm ind,resp}+E^{(20)}_{\\rm exch-ind,resp}+E^{(20)}_{\\rm disp}+E^{(20)}_{\\rm exch-disp}+\\delta E^{(2)}_{\\rm HF}\n",
    "\\end{equation}\n",
    "\n",
    "In this equation, the consecutive corrections account for the electrostatic, first-order exchange, induction, exchange induction, dispersion, and exchange dispersion effects, respectively. The additional subscript ''resp'' denotes that these corrections are computed including response effects - the HF orbitals of each molecule are relaxed in the electric field generated by the other molecule. The last term $\\delta E^{(2)}_{\\rm HF}$ approximates third- and higher-order induction and exchange induction effects and is taken from a supermolecular HF calculation.\n",
    "\n",
    "Sticking to our example of two helium atoms, let's now calculate the SAPT0 interaction energy contributions using Psi4. In the results that follow, we will group $E^{(20)}_{\\rm ind,resp}$, $E^{(20)}_{\\rm exch-ind,resp}$, and $\\delta E^{(2)}_{\\rm HF}$ to define the total induction effect (including its exchange quenching), and group $E^{(20)}_{\\rm disp}$ with $E^{(20)}_{\\rm exch-disp}$ to define the total dispersion effect.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "distances = [4.0,4.5,5.0,5.3,5.6,6.0,6.5,7.0,8.0,9.0,10.0]\n",
    "eelst = np.zeros((11))\n",
    "eexch = np.zeros((11))\n",
    "eind = np.zeros((11))\n",
    "edisp = np.zeros((11))\n",
    "esapt = np.zeros((11))\n",
    "\n",
    "for i in range(len(distances)):\n",
    "  dimer = psi4.geometry(\"\"\"\n",
    "  He 0.0 0.0 0.0\n",
    "  --\n",
    "  He 0.0 0.0 \"\"\"+str(distances[i])+\"\"\"\n",
    "  units bohr\n",
    "  symmetry c1\n",
    "  \"\"\")\n",
    "\n",
    "  psi4.energy('sapt0')\n",
    "  eelst[i] = psi4.variable('SAPT ELST ENERGY') * 627.509\n",
    "  eexch[i] = psi4.variable('SAPT EXCH ENERGY') * 627.509\n",
    "  eind[i] = psi4.variable('SAPT IND ENERGY') * 627.509\n",
    "  edisp[i] = psi4.variable('SAPT DISP ENERGY') * 627.509\n",
    "  esapt[i] = psi4.variable('SAPT TOTAL ENERGY') * 627.509\n",
    "  psi4.core.clean()\n",
    "\n",
    "plt.close()\n",
    "plt.ylim(-0.2,0.4)\n",
    "plt.plot(distances,eelst,'r+',linestyle='-',label='SAPT0 elst')\n",
    "plt.plot(distances,eexch,'bo',linestyle='-',label='SAPT0 exch')\n",
    "plt.plot(distances,eind,'g^',linestyle='-',label='SAPT0 ind')\n",
    "plt.plot(distances,edisp,'mx',linestyle='-',label='SAPT0 disp')\n",
    "plt.plot(distances,esapt,'k*',linestyle='-',label='SAPT0 total')\n",
    "plt.hlines(0.0,4.0,10.0)\n",
    "plt.legend(loc='upper right')\n",
    "plt.show()\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "*Questions* \n",
    "1. What is the origin of attraction between two helium atoms?\n",
    "2. For the interaction of two helium atoms, which SAPT terms are *long-range* (vanish with distance like some inverse power of $R$) and which are *short-range* (vanish exponentially with $R$ just like the overlap of molecular orbitals)?\n",
    "3. The dispersion energy decays at large $R$ like $R^{-n}$. Find the value of $n$ by fitting a function to the five largest-$R$ results. You can use `scipy.optimize.curve_fit` to perform the fitting, but you have to define the appropriate function first.\n",
    "Does the optimal exponent $n$ obtained by your fit agree with what you know about van der Waals dispersion forces? Is the graph of dispersion energy shaped like the $R^{-n}$ graph for large $R$? What about intermediate $R$?\n",
    "\n",
    "*Do you know how to calculate $R^{-n}$ if you have an array with $R$ values? If not, look it up in the NumPy documentation!* \n" 
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#COMPLETE the definition of function f below.\n",
    "def f\n",
    "\n",
    "ndisp = scipy.optimize.curve_fit(f,distances[-5:],edisp[-5:])\n",
    "print (\"Optimal dispersion exponent:\",ndisp[0][0])\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Interaction between two water molecules\n",
    "\n",
    "For the next part, you will perform the same analysis and obtain the supermolecular and SAPT0 data for the interaction of two water molecules. We now have many more degrees of freedom: in addition to the intermolecular distance $R$, we can change the relative orientation of two molecules, or even their internal geometries (O-H bond lengths and H-O-H angles). In this way, the potential energy curve becomes a multidimensional *potential energy surface*. It is hard to graph functions of more than two variables, so we will stick to the distance dependence of the interaction energies. Therefore, we will assume one particular orientation of two water molecules (a hydrogen-bonded one) and vary the intermolecular distance $R$ while keeping the orientation, and molecular geometries, constant. The geometry of the A-B complex has been defined for you, but you have to request all the necessary Psi4 calculations and extract the numbers that you need. To save time, we will downgrade the basis set to aug-cc-pVDZ and use MP2 (an approximate method that captures most of electron correlation) in place of CCSD(T).\n",
    "\n",
    "*Hints:* To prepare the geometries for the individual water molecules A and B, copy and paste the A-B geometry, but use the Gh(O2)... syntax to define the appropriate ghost atoms. Remember to run `psi4.core.clean()` after each calculation.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "distances_h2o = [2.7,3.0,3.5,4.0,4.5,5.0,6.0,7.0,8.0,9.0]\n",
    "ehf_h2o = np.zeros((10,3))\n",
    "emp2_h2o = np.zeros((10,3))\n",
    "psi4.set_options({'basis': 'aug-cc-pVDZ'})\n",
    "\n",
    "for i in range(len(distances_h2o)):\n",
    "  dimer = psi4.geometry(\"\"\"\n",
    "  O1\n",
    "  H1 O1 0.96\n",
    "  H2 O1 0.96 H1 104.5\n",
    "  --\n",
    "  O2 O1 \"\"\"+str(distances_h2o[i])+\"\"\" H1 5.0 H2 0.0\n",
    "  X O2 1.0 O1 120.0 H2 180.0\n",
    "  H3 O2 0.96 X 52.25 O1 90.0\n",
    "  H4 O2 0.96 X 52.25 O1 -90.0\n",
    "  units angstrom\n",
    "  symmetry c1\n",
    "  \"\"\")\n",
    "\n",
    "#COMPLETE the MP2 energy calculations for A-B, A, and B, and prepare the data for the graph.\n",
    "#Copy and paste the A-B geometry, but use the Gh(O2)... syntax to define the appropriate ghost atoms for the A and B calculations. \n",
    "#Remember to run psi4.core.clean() after each calculation.\n",
    "\n",
    "print ('HF PEC',einthf_h2o)\n",
    "print ('MP2 PEC',eintmp2_h2o)\n",
    "\n",
    "plt.close()\n",
    "plt.plot(distances_h2o,einthf_h2o,'r+',linestyle='-',label='HF')\n",
    "plt.plot(distances_h2o,eintmp2_h2o,'bo',linestyle='-',label='MP2')\n",
    "plt.hlines(0.0,2.5,9.0)\n",
    "plt.legend(loc='upper right')\n",
    "plt.show()\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "eelst_h2o = np.zeros((10))\n",
    "eexch_h2o = np.zeros((10))\n",
    "eind_h2o = np.zeros((10))\n",
    "edisp_h2o = np.zeros((10))\n",
    "esapt_h2o = np.zeros((10))\n",
    "\n",
    "#COMPLETE the SAPT calculations for 10 distances to prepare the data for the graph.\n",
    "\n",
    "plt.close()\n",
    "plt.ylim(-10.0,10.0)\n",
    "plt.plot(distances_h2o,eelst_h2o,'r+',linestyle='-',label='SAPT0 elst')\n",
    "plt.plot(distances_h2o,eexch_h2o,'bo',linestyle='-',label='SAPT0 exch')\n",
    "plt.plot(distances_h2o,eind_h2o,'g^',linestyle='-',label='SAPT0 ind')\n",
    "plt.plot(distances_h2o,edisp_h2o,'mx',linestyle='-',label='SAPT0 disp')\n",
    "plt.plot(distances_h2o,esapt_h2o,'k*',linestyle='-',label='SAPT0 total')\n",
    "plt.hlines(0.0,2.5,9.0)\n",
    "plt.legend(loc='upper right')\n",
    "plt.show()\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before we proceed any further, let us check one thing about your first MP2 water-water interaction energy calculation, the one that produced `eintmp2_h2o[0]`. Here's the geometry of that complex again:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#all x,y,z in Angstroms\n",
    "atomtypes = [\"O1\",\"H1\",\"H2\",\"O2\",\"H3\",\"H4\"]\n",
    "coordinates = np.array([[0.116724185090,     1.383860971547,     0.000000000000],\n",
    "                        [0.116724185090,     0.423860971547,     0.000000000000],\n",
    "                        [-0.812697549673,    1.624225775439,     0.000000000000],\n",
    "                        [-0.118596320329,   -1.305864713301,     0.000000000000],\n",
    "                        [0.362842754701,    -1.642971982825,    -0.759061990794],\n",
    "                        [0.362842754701,    -1.642971982825,     0.759061990794]])\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First, write the code to compute the four O-H bond lengths and two H-O-H bond angles in the two molecules. *(Hint: if the angles look weird, maybe they are still in radians - don't forget to convert them to degrees.)* Are the two water molecules identical?\n",
    "\n",
    "Then, check the values of the MP2 energy for these two molecules (the numbers $E_{\\rm A}$ and $E_{\\rm B}$ that you subtracted to get the interaction energy). If the molecules are the same, why are the MP2 energies close but not the same?\n",
    "\n",
    "*Hints:* The most elegant way to write this code is to define functions `distance(point1,point2)` for the distance between two points $(x_1,y_1,z_1)$ and $(x_2,y_2,z_2)$, and `angle(vec1,vec2)` for the angle between two vectors $(x_{v1},y_{v1},z_{v1})$ and $(x_{v2},y_{v2},z_{v2})$. Recall that the cosine of this angle is related to the dot product $(x_{v1},y_{v1},z_{v1})\\cdot(x_{v2},y_{v2},z_{v2})$. If needed, check the documentation on how to calculate the dot product of two NumPy vectors. \n",
    "\n",
    "When you are parsing the NumPy array with the coordinates, remember that `coordinates[k,:]` is the vector of $(x,y,z)$ values for atom number $k$, $k=0,1,2,\\ldots,N_{\\rm atoms}-1$. \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "#COMPLETE the distance and angle calculations below.\n",
    "ro1h1 = \n",
    "ro1h2 = \n",
    "ro2h3 = \n",
    "ro2h4 = \n",
    "ah1o1h2 = \n",
    "ah3o2h4 = \n",
    "print ('O-H distances: %5.3f %5.3f %5.3f %5.3f' % (ro1h1,ro1h2,ro2h3,ro2h4))\n",
    "print ('H-O-H angles: %6.2f %6.2f' % (ah1o1h2,ah3o2h4))\n",
    "print ('MP2 energy of molecule 1: %18.12f hartrees' % emp2_h2o[0,1])\n",
    "print ('MP2 energy of molecule 2: %18.12f hartrees' % emp2_h2o[0,2])\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can now proceed with the analysis of the SAPT0 energy components for the complex of two water molecules. *Please edit this Markdown cell to write your answers.*\n",
    "1. Which of the four SAPT terms are long-range, and which are short-range this time?\n",
    "2. For the terms that are long-range and decay with $R$ like $R^{-n}$, estimate $n$ by fitting a proper function to the 5 data points with the largest $R$, just like you did for the two interacting helium atoms (using `scipy.optimize.curve_fit`). How would you explain the power $n$ that you obtained for the electrostatic energy?\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#COMPLETE the optimizations below. \n",
    "nelst_h2o = \n",
    "nind_h2o = \n",
    "ndisp_h2o = \n",
    "print (\"Optimal electrostatics exponent:\",nelst_h2o[0][0])\n",
    "print (\"Optimal induction exponent:\",nind_h2o[0][0])\n",
    "print (\"Optimal dispersion exponent:\",ndisp_h2o[0][0])\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The water molecules are polar - each one has a nonzero dipole moment, and at large distances we expect the electrostatic energy to be dominated by the dipole-dipole interaction (at short distances, when the orbitals of two molecules overlap, the multipole approximation is not valid and the electrostatic energy contains the short-range *charge penetration* effects). Let's check if this is indeed the case. In preparation for this, we first find the HF dipole moment vector for each water molecule. \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "waterA = psi4.geometry(\"\"\"\n",
    "O 0.116724185090 1.383860971547 0.000000000000\n",
    "H 0.116724185090 0.423860971547 0.000000000000\n",
    "H -0.812697549673 1.624225775439 0.000000000000\n",
    "units angstrom\n",
    "noreorient\n",
    "nocom\n",
    "symmetry c1\n",
    "\"\"\")\n",
    "\n",
    "comA = waterA.center_of_mass()\n",
    "comA = np.array([comA[0],comA[1],comA[2]])\n",
    "E, wfn = psi4.energy('HF',return_wfn=True)\n",
    "dipoleA = np.array([psi4.variable('SCF DIPOLE X'),psi4.variable('SCF DIPOLE Y'),\n",
    "                    psi4.variable('SCF DIPOLE Z')])*0.393456   # conversion from Debye to a.u.\n",
    "psi4.core.clean()\n",
    "print(\"COM A in a.u.\",comA)\n",
    "print(\"Dipole A in a.u.\",dipoleA)\n",
    "\n",
    "waterB = psi4.geometry(\"\"\"\n",
    "O -0.118596320329 -1.305864713301 0.000000000000\n",
    "H 0.362842754701 -1.642971982825 -0.759061990794\n",
    "H 0.362842754701 -1.642971982825 0.759061990794\n",
    "units angstrom\n",
    "noreorient\n",
    "nocom\n",
    "symmetry c1\n",
    "\"\"\")\n",
    "\n",
    "comB = waterB.center_of_mass()\n",
    "comB = np.array([comB[0],comB[1],comB[2]])\n",
    "E, wfn = psi4.energy('HF',return_wfn=True)\n",
    "dipoleB = np.array([psi4.variable('SCF DIPOLE X'),psi4.variable('SCF DIPOLE Y'),\n",
    "                    psi4.variable('SCF DIPOLE Z')])*0.393456   # conversion from Debye to a.u.\n",
    "psi4.core.clean()\n",
    "print(\"COM B in a.u.\",comB)\n",
    "print(\"Dipole B in a.u.\",dipoleB)\n",
    "\n",
    "comA_to_comB = comB - comA\n",
    "print(\"Vector from COMA to COMB:\",comA_to_comB)\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Our goal now is to plot the electrostatic energy from SAPT against the interaction energy between two dipoles $\\boldsymbol{\\mu_A}$ and $\\boldsymbol{\\mu_B}$:\n",
    "\n",
    "\\begin{equation}\n",
    "E_{\\rm dipole-dipole}=\\frac{\\boldsymbol{\\mu_A}\\cdot\\boldsymbol{\\mu_B}}{R^3}-\\frac{3(\\boldsymbol{\\mu_A}\\cdot{\\mathbf R})(\\boldsymbol{\\mu_B}\\cdot{\\mathbf R})}{R^5} \n",
    "\\end{equation}\n",
    "\n",
    "Program this formula in the `dipole_dipole` function below, taking ${\\mathbf R}$, $\\boldsymbol{\\mu_A}$, and $\\boldsymbol{\\mu_B}$ in atomic units and calculating the dipole-dipole interaction energy, also in atomic units (which we will later convert to kcal/mol). \n", 
    "With your new function, we can populate the `edipdip` array of dipole-dipole interaction energies for all intermolecular separations, and plot these energies alongside the actual electrostatic energy data from SAPT. \n",
    "\n",
    "Note that ${\\mathbf R}$ is the vector from the center of mass of molecule A to the center of mass of molecule B. For the shortest intermolecular distance, the atomic coordinates are listed in the code above, so `R = comA_to_comB`. For any other distance, we obtained the geometry of the complex by shifting one water molecule away from the other along the O-O direction, so we need to shift the center of mass of the second molecule in the same way.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#the geometries are related to each other by a shift of 1 molecule along the O-O vector:\n",
    "OA_to_OB = (np.array([-0.118596320329,-1.305864713301,0.000000000000])-np.array(\n",
    "    [0.116724185090,1.383860971547,0.000000000000]))/0.529177249\n",
    "OA_to_OB_unit = OA_to_OB/np.sqrt(np.sum(OA_to_OB*OA_to_OB))\n",
    "print(\"Vector from OA to OB:\",OA_to_OB,OA_to_OB_unit)\n",
    "\n",
    "def dipole_dipole(R,dipA,dipB):\n",
    "#COMPLETE the definition of the dipole-dipole energy. All your data are in atomic units.\n",
    "\n",
    "edipdip = []\n",
    "for i in range(len(distances_h2o)):\n",
    "  shiftlength = (distances_h2o[i]-distances_h2o[0])/0.529177249\n",
    "  R = comA_to_comB + shiftlength*OA_to_OB_unit\n",
    "  edipdip.append(dipole_dipole(R,dipoleA,dipoleB)*627.509)\n",
    "\n",
    "edipdip = np.array(edipdip)\n",
    "print (edipdip)\n",
    "\n",
    "plt.close()\n",
    "plt.ylim(-10.0,10.0)\n",
    "plt.plot(distances_h2o,eelst_h2o,'r+',linestyle='-',label='SAPT0 elst')\n",
    "plt.plot(distances_h2o,edipdip,'bo',linestyle='-',label='dipole-dipole')\n",
    "plt.hlines(0.0,2.5,9.0)\n",
    "plt.legend(loc='upper right')\n",
    "plt.show()\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We clearly have a favorable dipole-dipole interaction, which results in negative (attractive) electrostatic energy. This is how the origins of hydrogen bonding might have been explained to you in your freshman chemistry class: two polar molecules have nonzero dipole moments and the dipole-dipole interaction can be strongly attractive. However, your SAPT components show you that it's not a complete explanation: the two water molecules are bound not only by electrostatics, but by two other SAPT components as well. Can you quantify the relative (percentage) contributions of electrostatics, induction, and dispersion to the overall interaction energy at the van der Waals minimum? This minimum is the second point on your curve, so, for example, `esapt_h2o[1]` is the total SAPT interaction energy.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#now let's examine the SAPT0 contributions at the van der Waals minimum, which is the 2nd point on the curve\n",
    "#COMPLETE the calculation of percentages.\n",
    "percent_elst = \n",
    "percent_ind  = \n",
    "percent_disp = \n",
    "print ('At the van der Waals minimum, electrostatics, induction, and dispersion')\n",
    "print (' contribute %5.1f, %5.1f, and %5.1f percent of interaction energy, respectively.'\n",
    " % (percent_elst,percent_ind,percent_disp))\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You have now completed some SAPT calculations and analyzed the meaning of different corrections. Can you complete the table below to indicate whether different SAPT corrections can be positive (repulsive), negative (attractive), or both, and why?\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#Type in your answers below.\n",
    "#COMPLETE this table. Do not remove the comment (#) signs.\n",
    "#\n",
    "#SAPT term       Positive/Negative/Both?     Why?\n",
    "#Electrostatics\n",
    "#Exchange\n",
    "#Induction\n",
    "#Dispersion\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Ternary diagrams\n",
    "\n",
    "Higher levels of SAPT calculations can give very accurate interaction energies, but are more computationally expensive than SAPT0. SAPT0 is normally sufficient for qualitative accuracy and basic understanding of the interaction physics. One important use of SAPT0 is to *classify different intermolecular complexes according to the type of interaction*, and a nice way to display the results of this classification is provided by a *ternary diagram*.\n",
    "\n",
    "The relative importance of attractive electrostatic, induction, and dispersion contributions to a SAPT interaction energy for a particular structure can be marked as a point inside a triangle, with the distance to each vertex of the triangle depicting the relative contribution of a given type (the more dominant a given contribution is, the closer the point lies to the corresponding vertex). If the electrostatic contribution is repulsive, we can display the relative magnitudes of electrostatic, induction, and dispersion terms in the same way, but we need the second triangle (the left one). The combination of two triangles forms the complete diagram and we can mark lots of different points corresponding to different complexes and geometries.\n",
    "\n",
    "Let's now mark all your systems on a ternary diagram, in blue for two helium atoms and in red for two water molecules. What kinds of interaction are represented? Compare your diagram with the one pictured below, prepared for 2510 different geometries of the complex of two water molecules, with all kinds of intermolecular distances and orientations (this graph is taken from [Smith:2016]). What conclusions can you draw about the interaction of two water molecules at *any* orientation?\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "def ternary(sapt, title='', labeled=True, view=True, saveas=None, relpath=False, graphicsformat=['pdf']):\n",
    "#Adapted from the QCDB ternary diagram code by Lori Burns\n",
    "    \"\"\"Takes array of arrays *sapt* in form [elst, indc, disp] and builds formatted\n",
    "    two-triangle ternary diagrams. Either fully-readable or dotsonly depending\n",
    "    on *labeled*.\n",
    "    \"\"\"\n",
    "    from matplotlib.path import Path\n",
    "    import matplotlib.patches as patches\n",
    "\n",
    "    # initialize plot\n",
    "    plt.close()\n",
    "    fig, ax = plt.subplots(figsize=(6, 3.6))\n",
    "    plt.xlim([-0.75, 1.25])\n",
    "    plt.ylim([-0.18, 1.02])\n",
    "    plt.xticks([])\n",
    "    plt.yticks([])\n",
    "    ax.set_aspect('equal')\n",
    "\n",
    "    if labeled:\n",
    "        # form and color ternary triangles\n",
    "        codes = [Path.MOVETO, Path.LINETO, Path.LINETO, Path.CLOSEPOLY]\n",
    "        pathPos = Path([(0., 0.), (1., 0.), (0.5, 0.866), (0., 0.)], codes)\n",
    "        pathNeg = Path([(0., 0.), (-0.5, 0.866), (0.5, 0.866), (0., 0.)], codes)\n",
    "        ax.add_patch(patches.PathPatch(pathPos, facecolor='white', lw=2))\n",
    "        ax.add_patch(patches.PathPatch(pathNeg, facecolor='#fff5ee', lw=2))\n",
    "\n",
    "        # label corners\n",
    "        ax.text(1.0,\n",
    "                -0.15,\n",
    "                u'Elst (−)',\n",
    "                verticalalignment='bottom',\n",
    "                horizontalalignment='center',\n",
    "                family='Times New Roman',\n",
    "                weight='bold',\n",
    "                fontsize=18)\n",
    "        ax.text(0.5,\n",
    "                0.9,\n",
    "                u'Ind (−)',\n",
    "                verticalalignment='bottom',\n",
    "                horizontalalignment='center',\n",
    "                family='Times New Roman',\n",
    "                weight='bold',\n",
    "                fontsize=18)\n",
    "        ax.text(0.0,\n",
    "                -0.15,\n",
    "                u'Disp (−)',\n",
    "                verticalalignment='bottom',\n",
    "                horizontalalignment='center',\n",
    "                family='Times New Roman',\n",
    "                weight='bold',\n",
    "                fontsize=18)\n",
    "        ax.text(-0.5,\n",
    "                0.9,\n",
    "                u'Elst (+)',\n",
    "                verticalalignment='bottom',\n",
    "                horizontalalignment='center',\n",
    "                family='Times New Roman',\n",
    "                weight='bold',\n",
    "                fontsize=18)\n",
    "\n",
    "    xvals = []\n",
    "    yvals = []\n",
    "    cvals = []\n",
    "    geomindex = 0 # first 11 points are He-He, the next 10 are H2O-H2O\n",
    "    for sys in sapt:\n",
    "        [elst, indc, disp] = sys\n",
    "\n",
    "        # calc ternary posn and color\n",
    "        Ftop = abs(indc) / (abs(elst) + abs(indc) + abs(disp))\n",
    "        Fright = abs(elst) / (abs(elst) + abs(indc) + abs(disp))\n",
    "        xdot = 0.5 * Ftop + Fright\n",
    "        ydot = 0.866 * Ftop\n",
    "        if geomindex <= 10:\n",
    "          cdot = 'b'\n",
    "        else:\n",
    "          cdot = 'r'\n",
    "        if elst > 0.:\n",
    "            xdot = 0.5 * (Ftop - Fright)\n",
    "            ydot = 0.866 * (Ftop + Fright)\n",
    "        #print elst, indc, disp, '', xdot, ydot, cdot\n",
    "\n",
    "        xvals.append(xdot)\n",
    "        yvals.append(ydot)\n",
    "        cvals.append(cdot)\n",
    "        geomindex += 1\n",
    "\n",
    "    sc = ax.scatter(xvals, yvals, c=cvals, s=15, marker=\"o\", \n",
    "                    edgecolor='none', vmin=0, vmax=1, zorder=10)\n",
    "\n",
    "    # remove figure outline\n",
    "    ax.spines['top'].set_visible(False)\n",
    "    ax.spines['right'].set_visible(False)\n",
    "    ax.spines['bottom'].set_visible(False)\n",
    "    ax.spines['left'].set_visible(False)\n",
    "\n",
    "    # save and show\n",
    "    plt.show()\n",
    "    return 1\n",
    "\n",
    "sapt = []\n",
    "for i in range(11):\n",
    "  sapt.append([eelst[i],eind[i],edisp[i]])\n",
    "for i in range(10):\n",
    "  sapt.append([eelst_h2o[i],eind_h2o[i],edisp_h2o[i]])\n",
    "idummy = ternary(sapt)\n",
    "from IPython.display import Image\n",
    "Image(filename='water2510.png')\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Some further reading:\n",
    "\n",
    "1. How is the calculation of SAPT corrections actually programmed? The Psi4NumPy projects has some tutorials on this topic: https://github.com/psi4/psi4numpy/tree/master/Tutorials/07_Symmetry_Adapted_Perturbation_Theory \n",
    "2. A classic (but recently updated) book on the theory of interactions between molecules: \"The Theory of Intermolecular Forces\"\n",
    "\t> [[Stone:2013](https://www.worldcat.org/title/theory-of-intermolecular-forces/oclc/915959704)] A. Stone, Oxford University Press, 2013\n",
    "3. The classic review paper on SAPT: \"Perturbation Theory Approach to Intermolecular Potential Energy Surfaces of van der Waals Complexes\"\n",
    "\t> [[Jeziorski:1994](http://pubs.acs.org/doi/abs/10.1021/cr00031a008)] B. Jeziorski, R. Moszynski, and K. Szalewicz, *Chem. Rev.* **94**, 1887 (1994)\n",
    "4. A brand new (as of 2020) review of SAPT, describing new developments and inprovements to the theory: \"Recent developments in symmetry‐adapted perturbation theory\"\n",
    "\t> [[Patkowski:2020](https://onlinelibrary.wiley.com/doi/abs/10.1002/wcms.1452)] K. Patkowski, *WIREs Comput. Mol. Sci.* **10**, e1452 (2020)\n",
    "5. The definitions and practical comparison of different levels of SAPT: \"Levels of symmetry adapted perturbation theory (SAPT). I. Efficiency and performance for interaction energies\"\n",
    "\t> [[Parker:2014](http://aip.scitation.org/doi/10.1063/1.4867135)] T. M. Parker, L. A. Burns, R. M. Parrish, A. G. Ryno, and C. D. Sherrill, *J. Chem. Phys.* **140**, 094106 (2014)\n",
    "6. An example study making use of the SAPT0 classification of interaction types, with lots of ternary diagrams in the paper and in the supporting information: \"Revised Damping Parameters for the D3 Dispersion Correction to Density Functional Theory\"\n",
    "\t> [[Smith:2016](https://pubs.acs.org/doi/abs/10.1021/acs.jpclett.6b00780)] D. G. A. Smith, L. A. Burns, K. Patkowski, and C. D. Sherrill, *J. Phys. Chem. Lett.* **7**, 2197 (2016).\n"
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.3"
  },
  "latex_envs": {
   "LaTeX_envs_menu_present": true,
   "bibliofile": "biblio.bib",
   "cite_by": "apalike",
   "current_citInitial": 1,
   "eqLabelWithNumbers": true,
   "eqNumInitial": 1,
   "labels_anchors": false,
   "latex_user_defs": false,
   "report_style_numbering": false,
   "user_envs_cfg": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}