{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# myChEMBL drug ADME data\n",
    "\n",
    "### myChEMBL team, ChEMBL group, EMBL-EBI.\n",
    "\n",
    "A demo notebook illustrating one way of extracting _in vivo_ ADME data from myChEMBL for marketed drugs (as defined by the FDA Orange Book)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Populating the interactive namespace from numpy and matplotlib\n"
     ]
    }
   ],
   "source": [
    "%matplotlib inline\n",
    "%pylab inline\n",
    "\n",
    "from __future__  import print_function, division\n",
    "\n",
    "import warnings\n",
    "\n",
    "import psycopg2\n",
    "import pandas as pd\n",
    "pd.set_option('display.max_colwidth', 1000)\n",
    "from IPython.display import HTML, Javascript"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Configuration"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "application/javascript": [
       "IPython.notebook.kernel.execute(\"current_url = \" + \"'\"+document.URL+\"'\");"
      ],
      "text/plain": [
       "<IPython.core.display.Javascript object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Get the base URL of the VM:\n",
    "display(Javascript('IPython.notebook.kernel.execute(\"current_url = \" + \"\\'\"+document.URL+\"\\'\");'))\n",
    "\n",
    "# Base URL for report cards...\n",
    "# NB This currently uses the main (external) ChEMBL instance, as myChEMBL doesn't yet offer report cards\n",
    "\n",
    "report_url = 'https://www.ebi.ac.uk/chembl'\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "cur_base_url = current_url.split('http://')[1].split('/')[0]\n",
    "base_url = 'localhost:8000' if (cur_base_url == 'localhost:9612' or cur_base_url == '127.0.0.1:9612') else current_url.split('http://')[1].split(':')[0] + ':8000'\n",
    "img_url = 'http://' + base_url + '/chemblws' "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'http://tannin.windows.ebi.ac.uk:8000/chemblws'"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "img_url"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# PostgreSQL connection details...\n",
    "\n",
    "username, password, hostname, database = 'mychembl', 'read', 'localhost', 'chembl_21'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# Connect to PostgreSQL...\n",
    "\n",
    "conn = psycopg2.connect(host=hostname, database=database, user=username, password=password)\n",
    "\n",
    "cursor = conn.cursor()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Load Marketed Drugs\n",
    "\n",
    "Here, we query myChEMBL for a list of marketed drugs, and for any alternative salt/solvate forms of those drugs. This is done by obtaining the parent form (_i.e._ the bioactive component) for each drug, and requesting all versions (_i.e._ salt/solvate forms) containing that parent.\n",
    "\n",
    "This is done because data may be recorded in the literature for versions of the drug other than the marketed form. Interest is _usually_ in the parent form (_i.e._ stripped of any salt/solvate components), as this is the bioactive component (ignoring prodrugs and active metabolites for the present). Thus, to ensure the greatest amount of data is obtained for a given bioactive compound, it normally makes sense to extend the query out to versions sharing the same parent. \n",
    "\n",
    "Note that in some cases this might not be appropriate, as different salt forms might not behave the same in certain assays (_e.g._ dissolution rates and hence oral bioavailabilities might differ). If this was of concern, the query below could be simplified so as only to retrieve versions flagged as the marketed drugs (a simplified version is included but commented out).\n",
    "\n",
    "In practice, the differences between the two approaches appear to be fairly small, and the more inclusive query is used mainly for illustrative purposes."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# Get list of marketed drugs, including any alternative salt forms...\n",
    "\n",
    "drugs_sql = \"\"\"\n",
    "select\n",
    "    c.chembl_id as parent_chembl_id\n",
    "  , a.chembl_id\n",
    "  , a.pref_name\n",
    "  , d.canonical_smiles as parent_smiles\n",
    "from\n",
    "    molecule_dictionary a\n",
    "  , molecule_hierarchy b\n",
    "  , chembl_id_lookup c\n",
    "  , compound_structures d\n",
    "  , ( --  List of parent structures (i.e. the bioactive form) of marketed drugs\n",
    "    select distinct\n",
    "        b.parent_molregno \n",
    "    from\n",
    "          molecule_dictionary a \n",
    "        , molecule_hierarchy b\n",
    "    where\n",
    "        a.molregno =  b.molregno\n",
    "    and a.max_phase = 4\n",
    "  ) e\n",
    "where\n",
    "    a.molregno = b.molregno\n",
    "and b.parent_molregno = c.entity_id and c.entity_type = 'COMPOUND'\n",
    "and b.parent_molregno = d.molregno\n",
    "and b.parent_molregno = e.parent_molregno -- restrict to cases where parent is that of as marketed drug\n",
    "order by\n",
    "    parent_chembl_id\n",
    "  , chembl_id\n",
    "\"\"\"\n",
    "\n",
    "# Simplified query that doesn't pull in other salt forms...\n",
    "\n",
    "# drugs_sql = \"\"\"\n",
    "# select\n",
    "#     c.chembl_id as parent_chembl_id\n",
    "#   , a.chembl_id\n",
    "#   , a.pref_name\n",
    "#   , d.canonical_smiles as parent_smiles\n",
    "# from\n",
    "#     molecule_dictionary a\n",
    "#   , molecule_hierarchy b\n",
    "#   , chembl_id_lookup c\n",
    "#   , compound_structures d\n",
    "# where\n",
    "#     a.molregno = b.molregno\n",
    "# and b.parent_molregno = c.entity_id and c.entity_type = 'COMPOUND'\n",
    "# and b.parent_molregno = d.molregno\n",
    "# and a.max_phase = 4\n",
    "# order by\n",
    "#     parent_chembl_id\n",
    "#   , chembl_id\n",
    "# \"\"\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# Run query...\n",
    "\n",
    "cursor.execute(drugs_sql)\n",
    "\n",
    "# Get column names...\n",
    "\n",
    "columns = [x[0] for x in cursor.description]\n",
    "\n",
    "# Retrieve data...\n",
    "\n",
    "rows = cursor.fetchall()\n",
    "\n",
    "# Convert to Pandas dataframe...\n",
    "\n",
    "drugs = pd.DataFrame(rows, columns=columns)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(2947, 4)"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "drugs.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>parent_chembl_id</th>\n",
       "      <th>chembl_id</th>\n",
       "      <th>pref_name</th>\n",
       "      <th>parent_smiles</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>CHEMBL1000</td>\n",
       "      <td>CHEMBL1000</td>\n",
       "      <td>CETIRIZINE</td>\n",
       "      <td>OC(=O)COCCN1CCN(CC1)C(c2ccccc2)c3ccc(Cl)cc3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>CHEMBL1000</td>\n",
       "      <td>CHEMBL1201113</td>\n",
       "      <td>CETIRIZINE HYDROCHLORIDE</td>\n",
       "      <td>OC(=O)COCCN1CCN(CC1)C(c2ccccc2)c3ccc(Cl)cc3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>CHEMBL1000</td>\n",
       "      <td>CHEMBL1607273</td>\n",
       "      <td>CETIRIZINE DIHYDROCHLORIDE</td>\n",
       "      <td>OC(=O)COCCN1CCN(CC1)C(c2ccccc2)c3ccc(Cl)cc3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>CHEMBL1002</td>\n",
       "      <td>CHEMBL1002</td>\n",
       "      <td>LEVOSALBUTAMOL</td>\n",
       "      <td>CC(C)(C)NC[C@H](O)c1ccc(O)c(CO)c1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>CHEMBL1002</td>\n",
       "      <td>CHEMBL1201061</td>\n",
       "      <td>LEVALBUTEROL HYDROCHLORIDE</td>\n",
       "      <td>CC(C)(C)NC[C@H](O)c1ccc(O)c(CO)c1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>CHEMBL1002</td>\n",
       "      <td>CHEMBL2062258</td>\n",
       "      <td>LEVALBUTEROL TARTRATE</td>\n",
       "      <td>CC(C)(C)NC[C@H](O)c1ccc(O)c(CO)c1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>CHEMBL1002</td>\n",
       "      <td>CHEMBL2106337</td>\n",
       "      <td>LEVALBUTEROL SULFATE</td>\n",
       "      <td>CC(C)(C)NC[C@H](O)c1ccc(O)c(CO)c1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>CHEMBL1004</td>\n",
       "      <td>CHEMBL1004</td>\n",
       "      <td>DOXYLAMINE</td>\n",
       "      <td>CN(C)CCOC(C)(c1ccccc1)c2ccccn2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>CHEMBL1004</td>\n",
       "      <td>CHEMBL1200392</td>\n",
       "      <td>DOXYLAMINE SUCCINATE</td>\n",
       "      <td>CN(C)CCOC(C)(c1ccccc1)c2ccccn2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>CHEMBL1005</td>\n",
       "      <td>CHEMBL1005</td>\n",
       "      <td>REMIFENTANIL</td>\n",
       "      <td>CCC(=O)N(c1ccccc1)C2(CCN(CCC(=O)OC)CC2)C(=O)OC</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  parent_chembl_id      chembl_id                   pref_name  \\\n",
       "0       CHEMBL1000     CHEMBL1000                  CETIRIZINE   \n",
       "1       CHEMBL1000  CHEMBL1201113    CETIRIZINE HYDROCHLORIDE   \n",
       "2       CHEMBL1000  CHEMBL1607273  CETIRIZINE DIHYDROCHLORIDE   \n",
       "3       CHEMBL1002     CHEMBL1002              LEVOSALBUTAMOL   \n",
       "4       CHEMBL1002  CHEMBL1201061  LEVALBUTEROL HYDROCHLORIDE   \n",
       "5       CHEMBL1002  CHEMBL2062258       LEVALBUTEROL TARTRATE   \n",
       "6       CHEMBL1002  CHEMBL2106337        LEVALBUTEROL SULFATE   \n",
       "7       CHEMBL1004     CHEMBL1004                  DOXYLAMINE   \n",
       "8       CHEMBL1004  CHEMBL1200392        DOXYLAMINE SUCCINATE   \n",
       "9       CHEMBL1005     CHEMBL1005                REMIFENTANIL   \n",
       "\n",
       "                                    parent_smiles  \n",
       "0     OC(=O)COCCN1CCN(CC1)C(c2ccccc2)c3ccc(Cl)cc3  \n",
       "1     OC(=O)COCCN1CCN(CC1)C(c2ccccc2)c3ccc(Cl)cc3  \n",
       "2     OC(=O)COCCN1CCN(CC1)C(c2ccccc2)c3ccc(Cl)cc3  \n",
       "3               CC(C)(C)NC[C@H](O)c1ccc(O)c(CO)c1  \n",
       "4               CC(C)(C)NC[C@H](O)c1ccc(O)c(CO)c1  \n",
       "5               CC(C)(C)NC[C@H](O)c1ccc(O)c(CO)c1  \n",
       "6               CC(C)(C)NC[C@H](O)c1ccc(O)c(CO)c1  \n",
       "7                  CN(C)CCOC(C)(c1ccccc1)c2ccccn2  \n",
       "8                  CN(C)CCOC(C)(c1ccccc1)c2ccccn2  \n",
       "9  CCC(=O)N(c1ccccc1)C2(CCN(CCC(=O)OC)CC2)C(=O)OC  "
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "drugs.head(10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Inspect drugs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# Take a copy, as the dataframe will be altered...\n",
    "\n",
    "drugs2 = drugs.copy()\n",
    "\n",
    "# Add images for parent structure and salt form(s)...\n",
    "\n",
    "drugs2['parent_img']       = drugs2['parent_chembl_id'].apply(lambda x: '<img src=\"{}/image/{}?dimensions=250\" >'.format(img_url, x))\n",
    "drugs2['version_img']      = drugs2['chembl_id'].apply(lambda x: '<img src=\"{}/image/{}?dimensions=250\" >'.format(img_url, x))\n",
    "\n",
    "# Add links to report cards (NB these are currently calls out to the main ChEMBL instance, as myChEMBL doesn't yet offer report cards)...\n",
    "\n",
    "drugs2['parent_chembl_id'] = drugs2['parent_chembl_id'].apply(lambda x:   '<a target=\"_blank\" href=\"{}/compound/inspect/{}\">{}</a>'.format(report_url, x, x))\n",
    "drugs2['chembl_id']        = drugs2['chembl_id'].apply(lambda x: '<a target=\"_blank\" href=\"{}/compound/inspect/{}\">{}</a>'.format(report_url, x, x))\n",
    "\n",
    "# Subset and reorder columns for easier inspection...\n",
    "\n",
    "drugs2 = drugs2[['parent_chembl_id', 'parent_img', 'chembl_id', 'version_img', 'pref_name', 'parent_smiles']]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>parent_chembl_id</th>\n",
       "      <th>parent_img</th>\n",
       "      <th>chembl_id</th>\n",
       "      <th>version_img</th>\n",
       "      <th>pref_name</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1000\">CHEMBL1000</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1000?dimensions=250\" ></td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1000\">CHEMBL1000</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1000?dimensions=250\" ></td>\n",
       "      <td>CETIRIZINE</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1000\">CHEMBL1000</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1000?dimensions=250\" ></td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1201113\">CHEMBL1201113</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1201113?dimensions=250\" ></td>\n",
       "      <td>CETIRIZINE HYDROCHLORIDE</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1000\">CHEMBL1000</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1000?dimensions=250\" ></td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1607273\">CHEMBL1607273</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1607273?dimensions=250\" ></td>\n",
       "      <td>CETIRIZINE DIHYDROCHLORIDE</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1002\">CHEMBL1002</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1002?dimensions=250\" ></td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1002\">CHEMBL1002</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1002?dimensions=250\" ></td>\n",
       "      <td>LEVOSALBUTAMOL</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1002\">CHEMBL1002</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1002?dimensions=250\" ></td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1201061\">CHEMBL1201061</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1201061?dimensions=250\" ></td>\n",
       "      <td>LEVALBUTEROL HYDROCHLORIDE</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "HTML(drugs2.head().drop('parent_smiles', 1).to_html(escape=False))\n",
    "\n",
    "# HTML(drugs2.drop('parent_smiles', 1).to_html(escape=False)) # Uncomment to view full table"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Load ADME data\n",
    "\n",
    "This query pulls back _in vivo_ PK data for humans and a variety of species used in drug development and safety studies."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# ADME data query...\n",
    "\n",
    "data_sql = \"\"\"\n",
    "SELECT\n",
    "      cil.chembl_id         as chembl_id\n",
    "    , act.standard_type     as data_type\n",
    "    , act.standard_relation as modifier\n",
    "    , act.standard_value    as value\n",
    "    , act.standard_units    as units\n",
    "    , ass.assay_organism    as organism\n",
    "    , ass.assay_tissue      as tissue\n",
    "    , td.chembl_id          as target_id\n",
    "FROM\n",
    "      activities act\n",
    "    , assays ass\n",
    "    , target_dictionary td\n",
    "    , chembl_id_lookup cil\n",
    "WHERE \n",
    "        act.assay_id                = ass.assay_id\n",
    "    AND td.tid                      = ass.tid\n",
    "    AND act.standard_value          IS NOT NULL\n",
    "    AND act.data_validity_comment   IS NULL\n",
    "    AND((act.standard_type          = 'Cmax'\n",
    "    AND act.standard_units          = 'nM')\n",
    "    OR (act.standard_type           = 'F'\n",
    "    AND act.standard_units          = '%%')\n",
    "    OR(act.standard_type            = 'Tmax'\n",
    "    AND act.standard_units          = 'hr')\n",
    "    OR (act.standard_type           = 'T1/2'\n",
    "    AND act.standard_units          = 'hr')\n",
    "    OR (act.standard_type           = 'Vd'\n",
    "    AND act.standard_units          = 'L.kg-1')\n",
    "    OR (act.standard_type           = 'Vdss'\n",
    "    AND act.standard_units          = 'L.kg-1')\n",
    "    OR (act.standard_type           = 'CL'\n",
    "    AND act.standard_units          = 'mL.min-1.kg-1'))\n",
    "    AND ass.assay_test_type         = 'In vivo'\n",
    "    AND (ass.assay_tissue           IS NULL\n",
    "    OR lower(ass.assay_tissue) NOT IN ('blood', 'liver', 'adipose', 'spleen', 'prostate', 'skin', 'jejunum', 'lung', 'duodenum', 'hypothalamus', 'cortex', 'pancreas', 'interstitial fluid', 'lung epithelial lining fluid', 'small intestine', 'adrenal gland', 'heart', 'retina', 'brain', 'tumour', 'interstitial fluid (thigh)', 'stomach', 'kidney', 'intestinal segment', 'brain', 'csf', 'liver', 'Pancreas', 'testes', 'hypothalamus', 'peritoneal fluid', 'adrenals', 'intestine', 'gi tract', 'choroid/sclera', 'not recorded', 'muscle', 'kidney', 'bile', 'cornea', 'ovary'))\n",
    "    AND upper(ass.assay_organism)  IN ('HOMO SAPIENS', 'RATTUS NORVEGICUS', 'MUS MUSCULUS', 'CANIS LUPUS FAMILIARIS', 'MACACA MULATTA', 'MACACA FASCICULARIS','MACACA','CERCOPITHECIDAE','MARMOSETS', 'PRIMATES','MONKEY')\n",
    "    AND act.molregno = cil.entity_id AND cil.entity_type = 'COMPOUND'\n",
    "ORDER BY\n",
    "      chembl_id\n",
    "    , target_id\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# Run query...\n",
    "\n",
    "cursor.execute(data_sql)\n",
    "\n",
    "# Get column names...\n",
    "\n",
    "columns = [x[0] for x in cursor.description]\n",
    "\n",
    "# Retrieve data...\n",
    "\n",
    "rows = cursor.fetchall()\n",
    "\n",
    "# Convert to Pandas dataframe...\n",
    "\n",
    "data = pd.DataFrame(rows, columns=columns)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(41380, 8)"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>chembl_id</th>\n",
       "      <th>data_type</th>\n",
       "      <th>modifier</th>\n",
       "      <th>value</th>\n",
       "      <th>units</th>\n",
       "      <th>organism</th>\n",
       "      <th>tissue</th>\n",
       "      <th>target_id</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>CHEMBL100049</td>\n",
       "      <td>T1/2</td>\n",
       "      <td>=</td>\n",
       "      <td>3.28</td>\n",
       "      <td>hr</td>\n",
       "      <td>Canis lupus familiaris</td>\n",
       "      <td>None</td>\n",
       "      <td>CHEMBL373</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>CHEMBL100181</td>\n",
       "      <td>T1/2</td>\n",
       "      <td>=</td>\n",
       "      <td>0.6</td>\n",
       "      <td>hr</td>\n",
       "      <td>Canis lupus familiaris</td>\n",
       "      <td>None</td>\n",
       "      <td>CHEMBL373</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>CHEMBL100367</td>\n",
       "      <td>Vd</td>\n",
       "      <td>=</td>\n",
       "      <td>0.29</td>\n",
       "      <td>L.kg-1</td>\n",
       "      <td>Rattus norvegicus</td>\n",
       "      <td>None</td>\n",
       "      <td>CHEMBL376</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>CHEMBL100367</td>\n",
       "      <td>T1/2</td>\n",
       "      <td>=</td>\n",
       "      <td>5.7</td>\n",
       "      <td>hr</td>\n",
       "      <td>Rattus norvegicus</td>\n",
       "      <td>None</td>\n",
       "      <td>CHEMBL376</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>CHEMBL100367</td>\n",
       "      <td>Vd</td>\n",
       "      <td>=</td>\n",
       "      <td>0.34</td>\n",
       "      <td>L.kg-1</td>\n",
       "      <td>Rattus norvegicus</td>\n",
       "      <td>None</td>\n",
       "      <td>CHEMBL376</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      chembl_id data_type modifier value   units                organism  \\\n",
       "0  CHEMBL100049      T1/2        =  3.28      hr  Canis lupus familiaris   \n",
       "1  CHEMBL100181      T1/2        =   0.6      hr  Canis lupus familiaris   \n",
       "2  CHEMBL100367        Vd        =  0.29  L.kg-1       Rattus norvegicus   \n",
       "3  CHEMBL100367      T1/2        =   5.7      hr       Rattus norvegicus   \n",
       "4  CHEMBL100367        Vd        =  0.34  L.kg-1       Rattus norvegicus   \n",
       "\n",
       "  tissue  target_id  \n",
       "0   None  CHEMBL373  \n",
       "1   None  CHEMBL373  \n",
       "2   None  CHEMBL376  \n",
       "3   None  CHEMBL376  \n",
       "4   None  CHEMBL376  "
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Inspect data table"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# Take a copy, as the dataframe will be altered...\n",
    "\n",
    "data2 = data.copy()\n",
    "\n",
    "# Add images...\n",
    "\n",
    "data2['img'] = data2['chembl_id'].apply(lambda x: '<img src=\"{}/image/{}?dimensions=250\" >'.format(img_url, x))\n",
    "\n",
    "# Add links to report cards (NB these are currently calls out to the main ChEMBL instance, as myChEMBL doesn't yet offer report cards)...\n",
    "\n",
    "data2['chembl_id'] = data2['chembl_id'].apply(lambda x: '<a target=\"_blank\" href=\"{}/compound/inspect/{}\">{}</a>'.format(report_url, x, x))\n",
    "data2['target_id'] = data2['target_id'].apply(lambda x: '<a target=\"_blank\" href=\"{}/target/inspect/{}\">{}</a>'.format(report_url, x, x))\n",
    "\n",
    "# Subset and reorder columns for easier inspection...\n",
    "\n",
    "data2 = data2[['chembl_id', 'img', 'data_type', 'modifier', 'value', 'units', 'organism', 'tissue', 'target_id']]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>chembl_id</th>\n",
       "      <th>img</th>\n",
       "      <th>data_type</th>\n",
       "      <th>modifier</th>\n",
       "      <th>value</th>\n",
       "      <th>units</th>\n",
       "      <th>organism</th>\n",
       "      <th>tissue</th>\n",
       "      <th>target_id</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL100049\">CHEMBL100049</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL100049?dimensions=250\" ></td>\n",
       "      <td>T1/2</td>\n",
       "      <td>=</td>\n",
       "      <td>3.28</td>\n",
       "      <td>hr</td>\n",
       "      <td>Canis lupus familiaris</td>\n",
       "      <td>None</td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/target/inspect/CHEMBL373\">CHEMBL373</a></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL100181\">CHEMBL100181</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL100181?dimensions=250\" ></td>\n",
       "      <td>T1/2</td>\n",
       "      <td>=</td>\n",
       "      <td>0.6</td>\n",
       "      <td>hr</td>\n",
       "      <td>Canis lupus familiaris</td>\n",
       "      <td>None</td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/target/inspect/CHEMBL373\">CHEMBL373</a></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL100367\">CHEMBL100367</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL100367?dimensions=250\" ></td>\n",
       "      <td>Vd</td>\n",
       "      <td>=</td>\n",
       "      <td>0.29</td>\n",
       "      <td>L.kg-1</td>\n",
       "      <td>Rattus norvegicus</td>\n",
       "      <td>None</td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/target/inspect/CHEMBL376\">CHEMBL376</a></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL100367\">CHEMBL100367</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL100367?dimensions=250\" ></td>\n",
       "      <td>T1/2</td>\n",
       "      <td>=</td>\n",
       "      <td>5.7</td>\n",
       "      <td>hr</td>\n",
       "      <td>Rattus norvegicus</td>\n",
       "      <td>None</td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/target/inspect/CHEMBL376\">CHEMBL376</a></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL100367\">CHEMBL100367</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL100367?dimensions=250\" ></td>\n",
       "      <td>Vd</td>\n",
       "      <td>=</td>\n",
       "      <td>0.34</td>\n",
       "      <td>L.kg-1</td>\n",
       "      <td>Rattus norvegicus</td>\n",
       "      <td>None</td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/target/inspect/CHEMBL376\">CHEMBL376</a></td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "HTML(data2.head(5).to_html(escape=False))\n",
    "\n",
    "# HTML(data2.head(5).to_html(escape=False))  # NB Uncomment to render all rows"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Merge Drug and ADME data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# Merge Pandas data frames...\n",
    "# NB This could obviously also be done in the database using SQL.\n",
    "\n",
    "df = pd.merge(drugs, data, how='inner', left_on='chembl_id', right_on='chembl_id')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(5013, 11)"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>parent_chembl_id</th>\n",
       "      <th>chembl_id</th>\n",
       "      <th>pref_name</th>\n",
       "      <th>parent_smiles</th>\n",
       "      <th>data_type</th>\n",
       "      <th>modifier</th>\n",
       "      <th>value</th>\n",
       "      <th>units</th>\n",
       "      <th>organism</th>\n",
       "      <th>tissue</th>\n",
       "      <th>target_id</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>CHEMBL1005</td>\n",
       "      <td>CHEMBL1005</td>\n",
       "      <td>REMIFENTANIL</td>\n",
       "      <td>CCC(=O)N(c1ccccc1)C2(CCN(CCC(=O)OC)CC2)C(=O)OC</td>\n",
       "      <td>CL</td>\n",
       "      <td>=</td>\n",
       "      <td>37</td>\n",
       "      <td>mL.min-1.kg-1</td>\n",
       "      <td>Homo sapiens</td>\n",
       "      <td>None</td>\n",
       "      <td>CHEMBL372</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>CHEMBL1005</td>\n",
       "      <td>CHEMBL1005</td>\n",
       "      <td>REMIFENTANIL</td>\n",
       "      <td>CCC(=O)N(c1ccccc1)C2(CCN(CCC(=O)OC)CC2)C(=O)OC</td>\n",
       "      <td>Vdss</td>\n",
       "      <td>=</td>\n",
       "      <td>0.4</td>\n",
       "      <td>L.kg-1</td>\n",
       "      <td>Homo sapiens</td>\n",
       "      <td>None</td>\n",
       "      <td>CHEMBL612558</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>CHEMBL1005</td>\n",
       "      <td>CHEMBL1005</td>\n",
       "      <td>REMIFENTANIL</td>\n",
       "      <td>CCC(=O)N(c1ccccc1)C2(CCN(CCC(=O)OC)CC2)C(=O)OC</td>\n",
       "      <td>T1/2</td>\n",
       "      <td>=</td>\n",
       "      <td>0.8</td>\n",
       "      <td>hr</td>\n",
       "      <td>Homo sapiens</td>\n",
       "      <td>None</td>\n",
       "      <td>CHEMBL612558</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>CHEMBL1005</td>\n",
       "      <td>CHEMBL1005</td>\n",
       "      <td>REMIFENTANIL</td>\n",
       "      <td>CCC(=O)N(c1ccccc1)C2(CCN(CCC(=O)OC)CC2)C(=O)OC</td>\n",
       "      <td>CL</td>\n",
       "      <td>=</td>\n",
       "      <td>37</td>\n",
       "      <td>mL.min-1.kg-1</td>\n",
       "      <td>Homo sapiens</td>\n",
       "      <td>None</td>\n",
       "      <td>CHEMBL612558</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>CHEMBL1009</td>\n",
       "      <td>CHEMBL1009</td>\n",
       "      <td>LEVODOPA</td>\n",
       "      <td>N[C@@H](Cc1ccc(O)c(O)c1)C(=O)O</td>\n",
       "      <td>CL</td>\n",
       "      <td>=</td>\n",
       "      <td>23</td>\n",
       "      <td>mL.min-1.kg-1</td>\n",
       "      <td>Homo sapiens</td>\n",
       "      <td>None</td>\n",
       "      <td>CHEMBL372</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  parent_chembl_id   chembl_id     pref_name  \\\n",
       "0       CHEMBL1005  CHEMBL1005  REMIFENTANIL   \n",
       "1       CHEMBL1005  CHEMBL1005  REMIFENTANIL   \n",
       "2       CHEMBL1005  CHEMBL1005  REMIFENTANIL   \n",
       "3       CHEMBL1005  CHEMBL1005  REMIFENTANIL   \n",
       "4       CHEMBL1009  CHEMBL1009      LEVODOPA   \n",
       "\n",
       "                                    parent_smiles data_type modifier value  \\\n",
       "0  CCC(=O)N(c1ccccc1)C2(CCN(CCC(=O)OC)CC2)C(=O)OC        CL        =    37   \n",
       "1  CCC(=O)N(c1ccccc1)C2(CCN(CCC(=O)OC)CC2)C(=O)OC      Vdss        =   0.4   \n",
       "2  CCC(=O)N(c1ccccc1)C2(CCN(CCC(=O)OC)CC2)C(=O)OC      T1/2        =   0.8   \n",
       "3  CCC(=O)N(c1ccccc1)C2(CCN(CCC(=O)OC)CC2)C(=O)OC        CL        =    37   \n",
       "4                  N[C@@H](Cc1ccc(O)c(O)c1)C(=O)O        CL        =    23   \n",
       "\n",
       "           units      organism tissue     target_id  \n",
       "0  mL.min-1.kg-1  Homo sapiens   None     CHEMBL372  \n",
       "1         L.kg-1  Homo sapiens   None  CHEMBL612558  \n",
       "2             hr  Homo sapiens   None  CHEMBL612558  \n",
       "3  mL.min-1.kg-1  Homo sapiens   None  CHEMBL612558  \n",
       "4  mL.min-1.kg-1  Homo sapiens   None     CHEMBL372  "
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# Uncomment to save data as spreadsheet...\n",
    "\n",
    "# df.to_csv('myChEMBL_drugs_ADME.csv')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Inspect merged table"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# Take a copy, as the dataframe will be altered...\n",
    "\n",
    "df2 = df.copy()\n",
    "\n",
    "# Add images...\n",
    "\n",
    "df2['parent_img']       = df2['parent_chembl_id'].apply(lambda x: '<img src=\"{}/image/{}?dimensions=250\" >'.format(img_url, x))\n",
    "\n",
    "# Add links to report cards (NB these are currently calls out to the main ChEMBL instance, as myChEMBL doesn't yet offer report cards)...\n",
    "\n",
    "df2['parent_chembl_id'] = df2['parent_chembl_id'].apply(lambda x: '<a target=\"_blank\" href=\"{}/compound/inspect/{}\">{}</a>'.format(report_url, x, x))\n",
    "df2['chembl_id']        = df2['chembl_id'].apply(lambda x:        '<a target=\"_blank\" href=\"{}/compound/inspect/{}\">{}</a>'.format(report_url, x, x))\n",
    "\n",
    "df2['target_id']        = df2['target_id'].apply(lambda x:   '<a target=\"_blank\" href=\"{}/target/inspect/{}\">{}</a>'.format(report_url, x, x))\n",
    "\n",
    "# Subset and reorder columns for easier inspection...\n",
    "\n",
    "df2 = df2[['parent_chembl_id', 'parent_img', 'chembl_id', 'pref_name', 'parent_smiles', 'data_type', 'modifier', 'value', 'units', 'organism', 'tissue', 'target_id']]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>parent_chembl_id</th>\n",
       "      <th>parent_img</th>\n",
       "      <th>chembl_id</th>\n",
       "      <th>pref_name</th>\n",
       "      <th>data_type</th>\n",
       "      <th>modifier</th>\n",
       "      <th>value</th>\n",
       "      <th>units</th>\n",
       "      <th>organism</th>\n",
       "      <th>tissue</th>\n",
       "      <th>target_id</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1005\">CHEMBL1005</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1005?dimensions=250\" ></td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1005\">CHEMBL1005</a></td>\n",
       "      <td>REMIFENTANIL</td>\n",
       "      <td>CL</td>\n",
       "      <td>=</td>\n",
       "      <td>37</td>\n",
       "      <td>mL.min-1.kg-1</td>\n",
       "      <td>Homo sapiens</td>\n",
       "      <td>None</td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/target/inspect/CHEMBL372\">CHEMBL372</a></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1005\">CHEMBL1005</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1005?dimensions=250\" ></td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1005\">CHEMBL1005</a></td>\n",
       "      <td>REMIFENTANIL</td>\n",
       "      <td>Vdss</td>\n",
       "      <td>=</td>\n",
       "      <td>0.4</td>\n",
       "      <td>L.kg-1</td>\n",
       "      <td>Homo sapiens</td>\n",
       "      <td>None</td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/target/inspect/CHEMBL612558\">CHEMBL612558</a></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1005\">CHEMBL1005</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1005?dimensions=250\" ></td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1005\">CHEMBL1005</a></td>\n",
       "      <td>REMIFENTANIL</td>\n",
       "      <td>T1/2</td>\n",
       "      <td>=</td>\n",
       "      <td>0.8</td>\n",
       "      <td>hr</td>\n",
       "      <td>Homo sapiens</td>\n",
       "      <td>None</td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/target/inspect/CHEMBL612558\">CHEMBL612558</a></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1005\">CHEMBL1005</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1005?dimensions=250\" ></td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1005\">CHEMBL1005</a></td>\n",
       "      <td>REMIFENTANIL</td>\n",
       "      <td>CL</td>\n",
       "      <td>=</td>\n",
       "      <td>37</td>\n",
       "      <td>mL.min-1.kg-1</td>\n",
       "      <td>Homo sapiens</td>\n",
       "      <td>None</td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/target/inspect/CHEMBL612558\">CHEMBL612558</a></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1009\">CHEMBL1009</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1009?dimensions=250\" ></td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1009\">CHEMBL1009</a></td>\n",
       "      <td>LEVODOPA</td>\n",
       "      <td>CL</td>\n",
       "      <td>=</td>\n",
       "      <td>23</td>\n",
       "      <td>mL.min-1.kg-1</td>\n",
       "      <td>Homo sapiens</td>\n",
       "      <td>None</td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/target/inspect/CHEMBL372\">CHEMBL372</a></td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "HTML(df2.head().drop('parent_smiles', 1).to_html(escape=False)) # NB smiles column excluded from rendering of table only\n",
    "\n",
    "# HTML(df2.drop('parent_smiles', 1).to_html(escape=False)) # NB Uncomment to render all rows"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>parent_chembl_id</th>\n",
       "      <th>parent_img</th>\n",
       "      <th>chembl_id</th>\n",
       "      <th>pref_name</th>\n",
       "      <th>data_type</th>\n",
       "      <th>modifier</th>\n",
       "      <th>value</th>\n",
       "      <th>units</th>\n",
       "      <th>organism</th>\n",
       "      <th>tissue</th>\n",
       "      <th>target_id</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1006</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1422?dimensions=250\" ></td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td>SITAGLIPTIN</td>\n",
       "      <td>Vdss</td>\n",
       "      <td>=</td>\n",
       "      <td>2.8</td>\n",
       "      <td>L.kg-1</td>\n",
       "      <td>Homo sapiens</td>\n",
       "      <td>None</td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/target/inspect/CHEMBL372\">CHEMBL372</a></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1007</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1422?dimensions=250\" ></td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td>SITAGLIPTIN</td>\n",
       "      <td>CL</td>\n",
       "      <td>=</td>\n",
       "      <td>6</td>\n",
       "      <td>mL.min-1.kg-1</td>\n",
       "      <td>Homo sapiens</td>\n",
       "      <td>None</td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/target/inspect/CHEMBL372\">CHEMBL372</a></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1008</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1422?dimensions=250\" ></td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td>SITAGLIPTIN</td>\n",
       "      <td>CL</td>\n",
       "      <td>=</td>\n",
       "      <td>6</td>\n",
       "      <td>mL.min-1.kg-1</td>\n",
       "      <td>Homo sapiens</td>\n",
       "      <td>None</td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/target/inspect/CHEMBL372\">CHEMBL372</a></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1009</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1422?dimensions=250\" ></td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td>SITAGLIPTIN</td>\n",
       "      <td>CL</td>\n",
       "      <td>=</td>\n",
       "      <td>1.3</td>\n",
       "      <td>mL.min-1.kg-1</td>\n",
       "      <td>Homo sapiens</td>\n",
       "      <td>None</td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/target/inspect/CHEMBL372\">CHEMBL372</a></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1010</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1422?dimensions=250\" ></td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td>SITAGLIPTIN</td>\n",
       "      <td>CL</td>\n",
       "      <td>=</td>\n",
       "      <td>6</td>\n",
       "      <td>mL.min-1.kg-1</td>\n",
       "      <td>Canis lupus familiaris</td>\n",
       "      <td>None</td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/target/inspect/CHEMBL373\">CHEMBL373</a></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1011</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1422?dimensions=250\" ></td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td>SITAGLIPTIN</td>\n",
       "      <td>CL</td>\n",
       "      <td>=</td>\n",
       "      <td>60</td>\n",
       "      <td>mL.min-1.kg-1</td>\n",
       "      <td>Rattus norvegicus</td>\n",
       "      <td>None</td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/target/inspect/CHEMBL376\">CHEMBL376</a></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1012</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1422?dimensions=250\" ></td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td>SITAGLIPTIN</td>\n",
       "      <td>CL</td>\n",
       "      <td>=</td>\n",
       "      <td>60</td>\n",
       "      <td>mL.min-1.kg-1</td>\n",
       "      <td>Rattus norvegicus</td>\n",
       "      <td>None</td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/target/inspect/CHEMBL376\">CHEMBL376</a></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1013</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1422?dimensions=250\" ></td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td>SITAGLIPTIN</td>\n",
       "      <td>T1/2</td>\n",
       "      <td>=</td>\n",
       "      <td>1.7</td>\n",
       "      <td>hr</td>\n",
       "      <td>Rattus norvegicus</td>\n",
       "      <td>Plasma</td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/target/inspect/CHEMBL376\">CHEMBL376</a></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1014</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1422?dimensions=250\" ></td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td>SITAGLIPTIN</td>\n",
       "      <td>T1/2</td>\n",
       "      <td>=</td>\n",
       "      <td>1.7</td>\n",
       "      <td>hr</td>\n",
       "      <td>Rattus norvegicus</td>\n",
       "      <td>None</td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/target/inspect/CHEMBL376\">CHEMBL376</a></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1015</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1422?dimensions=250\" ></td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td>SITAGLIPTIN</td>\n",
       "      <td>Cmax</td>\n",
       "      <td>=</td>\n",
       "      <td>330</td>\n",
       "      <td>nM</td>\n",
       "      <td>Rattus norvegicus</td>\n",
       "      <td>None</td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/target/inspect/CHEMBL376\">CHEMBL376</a></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1016</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1422?dimensions=250\" ></td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td>SITAGLIPTIN</td>\n",
       "      <td>CL</td>\n",
       "      <td>=</td>\n",
       "      <td>60</td>\n",
       "      <td>mL.min-1.kg-1</td>\n",
       "      <td>Rattus norvegicus</td>\n",
       "      <td>None</td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/target/inspect/CHEMBL376\">CHEMBL376</a></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1017</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1422?dimensions=250\" ></td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td>SITAGLIPTIN</td>\n",
       "      <td>T1/2</td>\n",
       "      <td>=</td>\n",
       "      <td>12</td>\n",
       "      <td>hr</td>\n",
       "      <td>Homo sapiens</td>\n",
       "      <td>None</td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/target/inspect/CHEMBL612558\">CHEMBL612558</a></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1018</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1422?dimensions=250\" ></td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td>SITAGLIPTIN</td>\n",
       "      <td>CL</td>\n",
       "      <td>=</td>\n",
       "      <td>6</td>\n",
       "      <td>mL.min-1.kg-1</td>\n",
       "      <td>Homo sapiens</td>\n",
       "      <td>None</td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/target/inspect/CHEMBL612558\">CHEMBL612558</a></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1019</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1422?dimensions=250\" ></td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td>SITAGLIPTIN</td>\n",
       "      <td>Vdss</td>\n",
       "      <td>=</td>\n",
       "      <td>2.8</td>\n",
       "      <td>L.kg-1</td>\n",
       "      <td>Homo sapiens</td>\n",
       "      <td>None</td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/target/inspect/CHEMBL612558\">CHEMBL612558</a></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1020</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1422?dimensions=250\" ></td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td>SITAGLIPTIN</td>\n",
       "      <td>CL</td>\n",
       "      <td>=</td>\n",
       "      <td>28</td>\n",
       "      <td>mL.min-1.kg-1</td>\n",
       "      <td>Macaca mulatta</td>\n",
       "      <td>None</td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/target/inspect/CHEMBL614875\">CHEMBL614875</a></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1021</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1422?dimensions=250\" ></td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL539077\">CHEMBL539077</a></td>\n",
       "      <td>None</td>\n",
       "      <td>Cmax</td>\n",
       "      <td>=</td>\n",
       "      <td>330</td>\n",
       "      <td>nM</td>\n",
       "      <td>Rattus norvegicus</td>\n",
       "      <td>None</td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/target/inspect/CHEMBL376\">CHEMBL376</a></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1022</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1422?dimensions=250\" ></td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL539077\">CHEMBL539077</a></td>\n",
       "      <td>None</td>\n",
       "      <td>T1/2</td>\n",
       "      <td>=</td>\n",
       "      <td>1.7</td>\n",
       "      <td>hr</td>\n",
       "      <td>Rattus norvegicus</td>\n",
       "      <td>None</td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/target/inspect/CHEMBL376\">CHEMBL376</a></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1023</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1422?dimensions=250\" ></td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL539077\">CHEMBL539077</a></td>\n",
       "      <td>None</td>\n",
       "      <td>CL</td>\n",
       "      <td>=</td>\n",
       "      <td>60</td>\n",
       "      <td>mL.min-1.kg-1</td>\n",
       "      <td>Rattus norvegicus</td>\n",
       "      <td>None</td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/target/inspect/CHEMBL376\">CHEMBL376</a></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1024</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1422?dimensions=250\" ></td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL539077\">CHEMBL539077</a></td>\n",
       "      <td>None</td>\n",
       "      <td>Cmax</td>\n",
       "      <td>=</td>\n",
       "      <td>330</td>\n",
       "      <td>nM</td>\n",
       "      <td>Cercopithecidae</td>\n",
       "      <td>Plasma</td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/target/inspect/CHEMBL612558\">CHEMBL612558</a></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1025</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1422?dimensions=250\" ></td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL539077\">CHEMBL539077</a></td>\n",
       "      <td>None</td>\n",
       "      <td>CL</td>\n",
       "      <td>=</td>\n",
       "      <td>60</td>\n",
       "      <td>mL.min-1.kg-1</td>\n",
       "      <td>Rattus norvegicus</td>\n",
       "      <td>None</td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/target/inspect/CHEMBL612558\">CHEMBL612558</a></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1026</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1422?dimensions=250\" ></td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL539077\">CHEMBL539077</a></td>\n",
       "      <td>None</td>\n",
       "      <td>Cmax</td>\n",
       "      <td>=</td>\n",
       "      <td>330</td>\n",
       "      <td>nM</td>\n",
       "      <td>Rattus norvegicus</td>\n",
       "      <td>Plasma</td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/target/inspect/CHEMBL612558\">CHEMBL612558</a></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1027</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1422?dimensions=250\" ></td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL539077\">CHEMBL539077</a></td>\n",
       "      <td>None</td>\n",
       "      <td>CL</td>\n",
       "      <td>=</td>\n",
       "      <td>6</td>\n",
       "      <td>mL.min-1.kg-1</td>\n",
       "      <td>Canis lupus familiaris</td>\n",
       "      <td>None</td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/target/inspect/CHEMBL612558\">CHEMBL612558</a></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1028</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1422?dimensions=250\" ></td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL539077\">CHEMBL539077</a></td>\n",
       "      <td>None</td>\n",
       "      <td>Cmax</td>\n",
       "      <td>=</td>\n",
       "      <td>2200</td>\n",
       "      <td>nM</td>\n",
       "      <td>Canis lupus familiaris</td>\n",
       "      <td>Plasma</td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/target/inspect/CHEMBL612558\">CHEMBL612558</a></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1029</th>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1422\">CHEMBL1422</a></td>\n",
       "      <td><img src=\"http://tannin.windows.ebi.ac.uk:8000/chemblws/image/CHEMBL1422?dimensions=250\" ></td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL539077\">CHEMBL539077</a></td>\n",
       "      <td>None</td>\n",
       "      <td>CL</td>\n",
       "      <td>=</td>\n",
       "      <td>28</td>\n",
       "      <td>mL.min-1.kg-1</td>\n",
       "      <td>Cercopithecidae</td>\n",
       "      <td>None</td>\n",
       "      <td><a target=\"_blank\" href=\"https://www.ebi.ac.uk/chembl/target/inspect/CHEMBL612558\">CHEMBL612558</a></td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# An example of a case where extra data is obtained by looking at other salt forms: CHEMBL539077 (the hydrochloride salt)\n",
    "# would not have been found using the simple query.\n",
    "\n",
    "HTML(df2[df2['parent_chembl_id'].str.contains('CHEMBL1422')].drop('parent_smiles', 1).to_html(escape=False))"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.11"
  },
  "widgets": {
   "state": {},
   "version": "1.1.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}