{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "Advanced MPDS API usage: pVT-data and EoS fitting\n",
    "==========\n",
    "\n",
    "- **Complexity level**: green karate belt\n",
    "- **Requirements**: familiarity with computational thermodynamics\n",
    "\n",
    "Among the opened MPDS data there is a considerable number of the diagrams **cell parameters** _vs._ **temperature** `VT`, as well as **cell parameters** _vs._ **pressure** `pV`. They provide us the link to the thermodynamic properties. Let's explore this link.\n",
    "\n",
    "We will calculate an isothermal bulk modulus, as well as its pressure derivative, using the `pV`-data from the MPDS API and fitting to the equation of state (EoS).\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "**Important! Before you proceed:** the notebooks running at the third-party servers are not secure. Using this notebook assumes you authenticate at the MPDS server with your own API key. Please run this notebook only if you have an open-access account (_i.e._ an **access** section of your MPDS account reads: `Programmatic data access: only open data`).\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "Please **do not** run this notebook at the third-party servers if you have an elevated API access to the MPDS, since there's a nonzero probability of key leakage!\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "Be sure to **always invalidate** (revoke) your API key at your [MPDS account](https://mpds.io/#modal/menu) after using the notebooks.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "Now let's proceed with the authentication part. First, apply for an [MPDS account](https://mpds.io/open-data-api), if you have none. Then copy your API key, run the next cell, paste the key in the appeared prompt input, and hit **Enter**.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os, getpass\n",
    "os.environ['MPDS_KEY'] = getpass.getpass()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "OK, now you may talk to the MPDS server programmatically from this notebook on your behalf.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For fitting we will use a `pytheos` pure-Python library written by Prof. Sang-Heon Dan Shim. (Note there's a wide range of the other EoS fitting tools!)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install pytheos"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "Make sure `pytheos` works, using this simple example:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from pytheos import BM3Model\n",
    "\n",
    "v0, k0, k0p = 10, 200, 4\n",
    "\n",
    "exp_bm3 = BM3Model()\n",
    "v_data = v0 * np.linspace(0.6, 1, 20)\n",
    "\n",
    "fit_params = exp_bm3.make_params(v0=v0, k0=k0, k0p=k0p)\n",
    "p_bm3 = exp_bm3.eval(fit_params, v=v_data)\n",
    "p_data = p_bm3 + np.random.normal(0.0, 2, 20)\n",
    "\n",
    "fit_params['v0'].vary = False\n",
    "fitresult_bm3 = exp_bm3.fit(p_data, fit_params, v=v_data, weights=None)\n",
    "\n",
    "print(fitresult_bm3.fit_report())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "If there were no errors at the previous step, let's continue.\n",
    "\n",
    "We will download the available isothermal bulk moduli and their pressure derivatives from the MPDS.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install mpds_client"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import warnings\n",
    "import pandas as pd\n",
    "import numpy as np\n",
    "from numpy.linalg import det\n",
    "from ase.geometry import cellpar_to_cell\n",
    "\n",
    "from mpds_client import MPDSDataRetrieval"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "Note, if your API key isn't valid, the API returns an HTTP error `403`.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "client = MPDSDataRetrieval()\n",
    "\n",
    "dfrm_k0p = client.get_dataframe({\"classes\": \"binary\", \"elements\": \"O\", \"props\": \"pressure derivative of isothermal bulk modulus\"})\n",
    "dfrm_k0p = dfrm_k0p[np.isfinite(dfrm_k0p['Phase'])] # only data for the existing distinct phases\n",
    "avg_k0p =  dfrm_k0p.groupby('Phase')['Value'].median().to_frame().reset_index().rename(columns={'Value': 'avg_k0p'})\n",
    "\n",
    "dfrm_k0 = client.get_dataframe({\"props\": \"isothermal bulk modulus\"}, phases=set(dfrm_k0p['Phase'].tolist()))\n",
    "avg_k0 =  dfrm_k0.groupby('Phase')['Value'].median().to_frame().reset_index().rename(columns={'Value': 'avg_k0'})\n",
    "avg_k0 =  avg_k0.merge(avg_k0p, how='inner', on='Phase')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's fit the isothermal bulk moduli and their pressure derivatives from the MPDS `pV`-data and compare with the experimental values:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def get_cell_volume(a, b, c, alpha, beta, gamma):\n",
    "    '''\n",
    "    Calculate V from cell parameters.\n",
    "    NB ab_normal and a_direction are standard.\n",
    "    '''\n",
    "    return abs(\n",
    "        det(\n",
    "            cellpar_to_cell([a, b, c, alpha, beta, gamma])\n",
    "        )\n",
    "    )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "pvts = {}\n",
    "\n",
    "for matrix in client.get_data(\n",
    "    {\"props\": \"cell parameters - pressure diagram\"},\n",
    "    phases=set(dfrm_k0p['Phase'].tolist()), # only those phases we have experimental bulk modulus\n",
    "    fields={}                               # all fields\n",
    "):\n",
    "    try: decks = matrix['sample']['measurement'][0]['property']['matrix']\n",
    "    except KeyError:\n",
    "        warnings.warn('Error: there is no expected property in the entry %s' % matrix)\n",
    "        continue\n",
    "\n",
    "    p_all, v_all, t_all = [], [], []\n",
    "    for deck in decks:\n",
    "        p, t, v = deck[0], deck[1], get_cell_volume(*deck[2:])\n",
    "        p_all.append(p)\n",
    "        v_all.append(v)\n",
    "        t_all.append(t)\n",
    "\n",
    "    # TODO here in principle we should do something smarter\n",
    "    # than just omitting the data for the same phase\n",
    "    if matrix['sample']['material']['phase_id'] in pvts and len(pvts[ matrix['sample']['material']['phase_id'] ][0]) > len(p_all):\n",
    "            warnings.warn('Skipping entry %s' % matrix['sample']['material']['entry'])\n",
    "            continue\n",
    "\n",
    "    pvts[ matrix['sample']['material']['phase_id'] ] = [matrix['sample']['material']['entry'], p_all, v_all, t_all]\n",
    "\n",
    "pvts = [[key] + value for key, value in pvts.items()]\n",
    "pvts = pd.DataFrame(pvts, columns=['Phase', 'Entry', 'P', 'V', 'T'])\n",
    "print(pvts)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "dfrm = avg_k0.merge(pvts, how='inner', on='Phase')\n",
    "\n",
    "for n, system in dfrm.iterrows():\n",
    "\n",
    "    params = exp_bm3.make_params(v0=system['V'][0], k0=system['avg_k0'], k0p=system['avg_k0p'])\n",
    "\n",
    "    fitresult_bm3 = exp_bm3.fit(system['P'], params, v=system['V'], weights=None)\n",
    "    if abs(fitresult_bm3.params['k0'].value - system['avg_k0']) > 50:\n",
    "        # show the discrepancies, if any\n",
    "        print(\"*\" * 30 + \" Distinct phase https://mpds.io/#phase_id/%s \" % system['Phase'] + \"*\" * 30)\n",
    "        print(\"BM_fit: %4.1f \t BM_exp: %4.1f\" % (fitresult_bm3.params['k0'].value, system['avg_k0']))\n",
    "        print(\"BM0p_fit: %1.2f \t BM0p_exp: %1.2f\" % (fitresult_bm3.params['k0p'].value, system['avg_k0p']))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "Were you able to follow everything? Please, try to answer:\n",
    "- Why did the discrepancies occur?\n",
    "- What other thermodynamic properties can be calculated _via_ EoS?\n",
    "- Given a certain (_e.g._ hypothetical) crystalline structure, how can we calculate its isothermal bulk modulus? What about its adiabatic bulk modulus?\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "**PS** don't forget to [invalidate](https://mpds.io/#modal/menu) (revoke) your API key.\n"
   ]
  }
 ],
 "metadata": {},
 "nbformat": 4,
 "nbformat_minor": 2
}