{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "Basic MPDS API usage: machine-learning and peer-reviewed data\n",
    "==========\n",
    "\n",
    "- **Complexity level**: beginner\n",
    "- **Requirements**: understanding how APIs work\n",
    "\n",
    "Let's play a bit with the MPDS API, fetching different kinds of data.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "**Important! Before you proceed:** notebooks running on third-party servers are not secure. This notebook assumes that you authenticate at the MPDS server with your own API key. Please run it only if you have an open-access account (_i.e._ the **access** section of your MPDS account reads: `Programmatic data access: only open data`).\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "Please **do not** run this notebook on third-party servers if you have elevated API access to the MPDS, since there is a nonzero probability of key leakage!\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "Be sure to **always invalidate** (revoke) your API key at your [MPDS account](https://mpds.io/#modal/menu) after using the notebooks.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "Now let's proceed with the authentication part. First, apply for an [MPDS account](https://mpds.io/open-data-api) if you do not have one. Then copy your API key, run the next cell, paste the key into the prompt that appears, and hit **Enter**.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os, getpass\n",
    "os.environ['MPDS_KEY'] = getpass.getpass()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "OK, now you may talk to the MPDS server programmatically from this notebook on your behalf.\n"
   ]
  },
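  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "Since the key is now stored in the `MPDS_KEY` environment variable, it may be worth confirming that it was captured, without echoing it in full. The snippet below is only a minimal sketch: `mask_key` is a hypothetical helper written for this notebook and is not part of `mpds_client`.\n",
    "\n",
    "```python\n",
    "import os\n",
    "\n",
    "def mask_key(key, visible=4):\n",
    "    # hypothetical helper: show only the first few characters, hide the rest\n",
    "    if not key:\n",
    "        return '(empty)'\n",
    "    return key[:visible] + '*' * max(len(key) - visible, 0)\n",
    "\n",
    "# report whether a key was captured, without printing it in full\n",
    "print('MPDS_KEY is set to:', mask_key(os.environ.get('MPDS_KEY', '')))\n",
    "```\n",
    "\n",
    "If the output reads `(empty)`, re-run the authentication cell above and paste the key again.\n"
   ]
  },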
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# the requirement is quoted so that the shell does not treat > as an output redirection\n",
    "!pip install \"mpds_client>=0.0.17\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from mpds_client import MPDSDataRetrieval, MPDSDataTypes, APIError"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "[x for x in dir(MPDSDataTypes) if not x.startswith('__')]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The cell above lists the data types available in the client. In the future we'll also add _ab initio_ data, but peer-reviewed data are (and will remain) the default."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "example_props = [ # NB these props support machine-learning data type\n",
    "'isothermal bulk modulus',\n",
    "'enthalpy of formation',\n",
    "'heat capacity at constant pressure',\n",
    "'Seebeck coefficient',\n",
    "'values of electronic band gap', # NB both direct + indirect gaps\n",
    "'temperature for congruent melting',\n",
    "'Debye temperature',\n",
    "'linear thermal expansion coefficient'\n",
    "]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's customize the returned data fields (that's optional):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "desired_fields = {\n",
    "    'P':[ # *P*hysical property entries\n",
    "        'sample.material.entry',\n",
    "        'sample.material.phase',\n",
    "        'sample.material.chemical_elements',\n",
    "        'sample.material.chemical_formula'\n",
    "    ],\n",
    "    'S':[ # Crystalline *S*tructure entries\n",
    "        'entry',\n",
    "        'phase',\n",
    "        'chemical_elements',\n",
    "        'chemical_formula'\n",
    "    ],\n",
    "    'C':[ # Phase diagrams, i.e. *C*onstitution entries\n",
    "        'entry',\n",
    "        lambda: 'MANY-PHASE', # constants are given like this (on purpose)\n",
    "        'chemical_elements',\n",
    "        lambda: 'MANY-FORMULAE'\n",
    "    ]\n",
    "    # NB. P-S-C are interconnected by means of the distinct phases\n",
    "}"
   ]
  },
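  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "One way to read the field specification above: a string is a dotted path into the nested entry, and a callable supplies a constant for that column. The sketch below illustrates this reading under stated assumptions. It is a simplified stand-in, not the actual `mpds_client` code; `resolve_fields` and the sample entry (including its identifier) are hypothetical.\n",
    "\n",
    "```python\n",
    "def resolve_fields(entry, spec):\n",
    "    # hypothetical helper: build one flat row from a nested entry\n",
    "    row = []\n",
    "    for field in spec:\n",
    "        if callable(field):\n",
    "            # constants are given as zero-argument callables\n",
    "            row.append(field())\n",
    "            continue\n",
    "        value = entry\n",
    "        for key in field.split('.'):\n",
    "            # walk the dotted path; a missing key yields None\n",
    "            value = value.get(key) if isinstance(value, dict) else None\n",
    "        row.append(value)\n",
    "    return row\n",
    "\n",
    "# made-up entry, shaped only for this example\n",
    "sample = {'sample': {'material': {'entry': 'P-EXAMPLE', 'chemical_formula': 'LaO'}}}\n",
    "spec = ['sample.material.entry', lambda: 'MANY-PHASE', 'sample.material.chemical_formula']\n",
    "print(resolve_fields(sample, spec))\n",
    "```\n",
    "\n",
    "Giving every entry type the same number of columns, with constants filling the gaps, keeps the resulting rows uniform regardless of whether they come from P, S or C entries.\n"
   ]
  },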
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "Note that if the key is not valid, the API returns an HTTP `403` error.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "client = MPDSDataRetrieval(dtype=MPDSDataTypes.MACHINE_LEARNING)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "for prop in example_props:\n",
    "\n",
    "    print(\"*\" * 100)\n",
    "    print(\"Considering %s\" % prop)\n",
    "\n",
    "    try:\n",
    "        for card in client.get_data({\n",
    "            \"props\": prop,\n",
    "            # one of the props defined above\n",
    "\n",
    "            \"classes\": \"transitional, superconductor\",\n",
    "            # a transition metal atom must be present,\n",
    "            # and the material must be classified as a superconductor in the original publication\n",
    "\n",
    "            \"aetypes\": \"all 7-vertex\",\n",
    "            # atomic environment type, e.g. hexagonal pyramid, pentagonal bipyramid etc.\n",
    "\n",
    "            \"aeatoms\": \"X-S\",\n",
    "            # atomic environment atoms: any atom in the center, sulphur at the vertices (ligands)\n",
    "\n",
    "            \"years\": \"2010-2019\"\n",
    "            # only recent results (has no effect for MACHINE_LEARNING, as all are from 2018)\n",
    "        }, fields=desired_fields):\n",
    "\n",
    "            print(\"%s %s %s\" % (card[0], \"-\".join(card[2]), card[3]))\n",
    "\n",
    "    except APIError as ex:\n",
    "\n",
    "        if ex.code == 1:\n",
    "            print(\"No matches.\")\n",
    "\n",
    "        else:\n",
    "            print(\"Error %s: %s\" % (ex.code, ex.msg))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "client.dtype = MPDSDataTypes.PEER_REVIEWED\n",
    "\n",
    "print(client.get_data({\"elements\": \"O\", \"classes\": \"binary\", \"sgs\": \"I4/mmm\"}))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import random\n",
    "prop = random.choice(example_props)\n",
    "\n",
    "print(client.get_data({\"props\": prop, \"elements\": \"O\", \"classes\": \"binary, lanthanoid, non-disordered\"}))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "Were you able to follow everything? Please explain (tentatively) what happens under the hood when we call `client.get_data` or `client.get_dataframe`.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "**PS** don't forget to [invalidate](https://mpds.io/#modal/menu) (revoke) your API key.\n"
   ]
  }
 ],
 "metadata": {},
 "nbformat": 4,
 "nbformat_minor": 2
}