{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "Basic MPDS API usage: machine-learning and peer-reviewed data\n", "==========\n", "\n", "- **Complexity level**: beginner\n", "- **Requirements**: understanding how APIs work\n", "\n", "Let's play a bit with the MPDS API, fetching different kinds of data.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "**Important! Before you proceed:** notebooks running on third-party servers are not secure. Using this notebook assumes you authenticate at the MPDS server with your own API key. Please run this notebook only if you have an open-access account (_i.e._ the **access** section of your MPDS account reads: `Programmatic data access: only open data`).\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "Please **do not** run this notebook on third-party servers if you have elevated API access to the MPDS, since there's a nonzero probability of key leakage!\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "Be sure to **always invalidate** (revoke) your API key in your [MPDS account](https://mpds.io/#modal/menu) after using the notebooks.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "Now let's proceed with the authentication part. First, apply for an [MPDS account](https://mpds.io/open-data-api) if you don't have one. 
Then copy your API key, run the next cell, paste the key into the prompt that appears, and hit **Enter**.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os, getpass\n", "os.environ['MPDS_KEY'] = getpass.getpass()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "OK, now you may talk to the MPDS server programmatically from this notebook on your behalf.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!pip install \"mpds_client>=0.0.17\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from mpds_client import MPDSDataRetrieval, MPDSDataTypes, APIError" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "[x for x in dir(MPDSDataTypes) if not x.startswith('__')]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the future we'll also add _ab initio_ data, but peer-reviewed data are (and will remain) the default." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "example_props = [ # NB these props support machine-learning data type\n", "'isothermal bulk modulus',\n", "'enthalpy of formation',\n", "'heat capacity at constant pressure',\n", "'Seebeck coefficient',\n", "'values of electronic band gap', # NB both direct + indirect gaps\n", "'temperature for congruent melting',\n", "'Debye temperature',\n", "'linear thermal expansion coefficient'\n", "]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's customize the returned data fields (this is optional):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "desired_fields = {\n", "    'P':[ # *P*hysical property entries\n", "        'sample.material.entry',\n", "        'sample.material.phase',\n", "        'sample.material.chemical_elements',\n", "        'sample.material.chemical_formula'\n", "    ],\n", "    'S':[ # Crystalline *S*tructure entries\n", "        'entry',\n", "        'phase',\n", "        'chemical_elements',\n", "        'chemical_formula'\n", "    ],\n", "    'C':[ # Phase diagrams, i.e. *C*onstitution entries\n", "        'entry',\n", "        lambda: 'MANY-PHASE', # constants are given like this (on purpose)\n", "        'chemical_elements',\n", "        lambda: 'MANY-FORMULAE'\n", "    ]\n", "    # NB. 
P-S-C are interconnected by means of the distinct phases\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "Note that if the key isn't valid, the API returns an HTTP `403` error.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "client = MPDSDataRetrieval(dtype=MPDSDataTypes.MACHINE_LEARNING)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for prop in example_props:\n", "\n", "    print(\"*\" * 100)\n", "    print(\"Considering %s\" % prop)\n", "\n", "    try:\n", "        for card in client.get_data({\n", "            \"props\": prop,\n", "            # we defined our props above\n", "\n", "            \"classes\": \"transitional, superconductor\",\n", "            # a transition metal atom must be present,\n", "            # and the material must be classified as a superconductor in the original publication\n", "\n", "            \"aetypes\": \"all 7-vertex\",\n", "            # atomic environment type, e.g. hexagonal pyramid, pentagonal bipyramid etc.\n", "\n", "            \"aeatoms\": \"X-S\",\n", "            # atomic environment atoms: any atom in the center, sulphur in the vertices (ligands)\n", "\n", "            \"years\": \"2010-2019\"\n", "            # only recent results (has no effect for MACHINE_LEARNING, as all are 2018)\n", "        }, fields=desired_fields):\n", "\n", "            print(\"%s %s %s\" % (card[0], \"-\".join(card[2]), card[3]))\n", "\n", "    except APIError as ex:\n", "\n", "        if ex.code == 1:\n", "            print(\"No matches.\")\n", "\n", "        else:\n", "            print(\"Error %s: %s\" % (ex.code, ex.msg))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "client.dtype = MPDSDataTypes.PEER_REVIEWED\n", "\n", "print(client.get_data({\"elements\": \"O\", \"classes\": \"binary\", \"sgs\": \"I4/mmm\"}))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import random\n", "prop = random.choice(example_props)\n", "\n", "print(client.get_data({\"props\": prop, \"elements\": \"O\", \"classes\": \"binary, lanthanoid, 
non-disordered\"}))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "Were you able to follow everything? Please explain what (tentatively) happens under the hood when we call `client.get_data` or `client.get_dataframe`.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "**PS** Don't forget to [invalidate](https://mpds.io/#modal/menu) (revoke) your API key.\n" ] } ], "metadata": {}, "nbformat": 4, "nbformat_minor": 2 }