{
 "metadata": {
  "name": "",
  "signature": "sha256:72f81ee69cf1a5bedc59313dc16cb7ef381aeb0d670fa934b481c31fc5d898f1"
 },
 "nbformat": 3,
 "nbformat_minor": 0,
 "worksheets": [
  {
   "cells": [
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "# myChEMBL ADMESARfari webservice tutorial"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "### myChEMBL team, ChEMBL Group, EMBL-EBI.\n",
      "\n",
      "This notebook is intended to illustrate the use of the ADMESARfari webservice API from Python. Since the webservices for ADMESARfari are written using Cornice (https://github.com/mozilla-services/cornice) we have an exposed SPORE (https://github.com/SPORE/specifications) endpoint. This allows us to use a Python library, such as Respire (https://github.com/spiral-project/respire) to parse the JSON description of the methods available, which is provided by the SPORE endpoint (https://www.ebi.ac.uk/chembl/admesarfari/rest/spore) and to automatically generate callable methods from Python without handcoding the necessary boilerplate code.\n",
      "\n",
      "We will cover\n",
      "* Using Respire to create an API client\n",
      "* Making basic GET requests to the service\n",
      "* Using an input compound to run a prediction\n",
      "* Using an input FASTA sequence to run a prediction\n",
      "* Presentation of results from both."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# Let do our imports first.\n",
      "import respire,urllib,re\n",
      "from IPython.display import HTML,JSON\n"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 3
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# We just need to monkey-patch the URL join method in this instance, \n",
      "# since it truncates the URL due to the way ADMESARfari is hosted\n",
      "\n",
      "def urljoin_patched(base,path):\n",
      "    return base+path\n",
      "\n",
      "respire.client.urljoin = urljoin_patched\n",
      "    \n",
      "# Create our client and associated methods\n",
      "api_client = respire.client_from_url('http://wwwdev.ebi.ac.uk/chembl/admesarfari/rest/spore')    \n",
      "\n"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 6
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# What methods do we have available?\n",
      "# Iterate over the parsed endpoint, pulling out applicable methods, the paths and the descriptions.\n",
      "# We'll add some HTML elements to the output.\n",
      "tc=[]\n",
      "ts = '<table><tr>'\n",
      "te = '</tr></table>'\n",
      "\n",
      "for method in api_client.description.methods:\n",
      "        methodname = method\n",
      "        method = api_client.description.methods[methodname]\n",
      "\n",
      "        if method['method']!='HEAD':\n",
      "            tc.append(\"<tr><th>\"+methodname+\"</th></tr>\")\n",
      "            tc.append('<tr><td>'+method['path']+'</td><td>'+method['description']+'</td></tr>')\n",
      "\n",
      "\n",
      "h = HTML(ts+\"\".join(tc)+te)\n",
      "h\n"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<table><tr><tr><th>get_textsearch</th></tr><tr><td>/rest/:TEXT/search</td><td> Return a set of target ids where the search term appears</td></tr><tr><th>get_blast</th></tr><tr><td>/rest/:FASTA/blast</td><td>BLAST the input sequence(s) against the ADME SARfari set of target sequences.\n",
        "\n",
        "    This requires:\n",
        "\n",
        "    * A URL encoded FASTA sequence (May extended to other formats via a keyword)\n",
        "    </td></tr><tr><th>get_celltypes</th></tr><tr><td>/rest/celltypes</td><td> Retrieve the list of cell types</td></tr><tr><th>get_targetsequence</th></tr><tr><td>/rest/targetsequence/:TARGET_ID</td><td> Return the Target Sequence and Variation information for a particular Target ID\n",
        "\n",
        "    </td></tr><tr><th>post_postsimsubsdf</th></tr><tr><td>/rest/simsubsdf/:VALUE</td><td> Return a set of molecules via either a similarity or substructure search\n",
        "\n",
        "    Requires:\n",
        "    * The POST body to contain CTAB or SMILES\n",
        "    * A similarity cut-off value (100 will perform a sub-structure search)\n",
        "\n",
        "    This returns a gzipped SDF file.\n",
        "\n",
        "    </td></tr><tr><th>get_orthologuematrix</th></tr><tr><td>/rest/orthologuematrix/:TAX_IDS</td><td>Retrieves the orthologue mapping matrix for a specific set of Taxonomy IDs.\n",
        "\n",
        "    Requires: Comma seperated list of Taxonomy IDs\n",
        "\n",
        "    </td></tr><tr><th>post_modelpredictor2</th></tr><tr><td>/rest/modelpredictor2</td><td>Run the input CTAB through the ADME SARfari SciKit/RDKit Bayesian model.\n",
        "\n",
        "    This requires a URL encoded CTAB</td></tr><tr><th>get_target</th></tr><tr><td>/rest/target/:TARGET_ID</td><td> Return the Target information for a particular Target ID\n",
        "\n",
        "    </td></tr><tr><th>get_targetalignment</th></tr><tr><td>/rest/targetalignment/:TARGET_ID/:TAX_IDS</td><td> Return the alignment information for a particular Target ID\n",
        "\n",
        "    </td></tr><tr><th>get_bioactivity</th></tr><tr><td>/rest/:MOLREGNO/bioactivity</td><td>Retrieve the bioactivity and assay data for a particular molregno.\n",
        "\n",
        "    Requires: Molregno (Int)\n",
        "\n",
        "    If the Molregno == -1 then it will bring back all records\n",
        "\n",
        "    Returns: Datatables format JSON.\n",
        "    </td></tr><tr><th>post_postblast</th></tr><tr><td>/rest/blast</td><td>BLAST the input sequence(s) against the ADME SARfari set of protein databases\n",
        "\n",
        "    This requires:\n",
        "\n",
        "    * A FASTA sequence as the POST body\n",
        "    </td></tr><tr><th>get_targetcompounds</th></tr><tr><td>/rest/:TARGET_ID/targetcompounds</td><td>Retrieve the compound SMILES associated with an ADME SARfari target (via activity)\n",
        "\n",
        "    Requires: ADME SARfari Target ID\n",
        "\n",
        "    </td></tr><tr><th>get_expressionmatrix</th></tr><tr><td>/rest/expressionmatrix/:TISSUE_IDS</td><td>Retrieves the tissue target expression level matrix\n",
        "\n",
        "    </td></tr><tr><th>get_taxids</th></tr><tr><td>/rest/taxids</td><td> Return the list of taxonomy IDs used.</td></tr><tr><th>get_alignmentdendrogram</th></tr><tr><td>/rest/alignmentdendrogram/:TARGET_ID/:TAX_IDS</td><td> Return the dendrogram tree information for a particular Target ID (from the relevant orthologues)\n",
        "\n",
        "    Requires a target ID and a comma seperated list of tax IDs\n",
        "\n",
        "    </td></tr><tr><th>get_targetinvivomatrix</th></tr><tr><td>/rest/:TARGET_ID/targetinvivomatrix</td><td>Retrieves the invivo matrix for a particular target\n",
        "\n",
        "    Requires: ADME SARfari internal target id\n",
        "\n",
        "    This will return an object with xcats,ycats and data elements (primarily used with the Highcharts Heatmap plugin.)\n",
        "\n",
        "    </td></tr><tr><th>get_tissues</th></tr><tr><td>/rest/tissues</td><td> Retrieve the list of tissues</td></tr><tr><th>get_targetbioactivity</th></tr><tr><td>/rest/:TARGET_ID/targetbioactivity</td><td>Retrieve the bioactivity and assay data for a particular target\n",
        "\n",
        "    Requires: ADME SARfari Target ID\n",
        "\n",
        "    </td></tr><tr><th>get_modelpredictor2</th></tr><tr><td>/rest/:CTAB/modelpredictor2</td><td>Run the input CTAB through the ADME SARfari SciKit/RDKit Bayesian model.\n",
        "\n",
        "    This requires a URL encoded CTAB</td></tr></tr></table>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 7,
       "text": [
        "<IPython.core.display.HTML at 0x7fdf17838f50>"
       ]
      }
     ],
     "prompt_number": 7
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# Let's set a few lookup dictionaries\n",
      "\n",
      "# Taxonmy ID look up\n",
      "# Get the taxids\n",
      "\n",
      "taxids = api_client.get_taxids()['results']\n",
      "t = {}\n",
      "\n",
      "# Create taxonomy look-up\n",
      "\n",
      "for taxid in taxids:\n",
      "    t[taxid['taxid']]=taxid['name']\n",
      "\n",
      "taxids = t\n",
      "\n",
      "# Get tissues\n",
      "tissues = api_client.get_tissues()['results'][0]\n",
      "alltissues = str(\",\".join(tissues.keys()))\n",
      "cells = api_client.get_celltypes()['results'][0]\n",
      "\n"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 8
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# Get Human expression levels (Could take a while!)\n",
      "expressionlevels = api_client.get_expressionmatrix(TISSUE_IDS=alltissues)['expression_matrix']\n",
      "print \"Levels found:\",expressionlevels.__len__()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "Levels found: 459\n"
       ]
      }
     ],
     "prompt_number": 9
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# Let's use an input compound and predict it's ADME profile\n",
      "# We'll use Gleevec (CHEMBL941) as our input\n",
      "\n",
      "gleevec_ctab = \"\"\"\n",
      " SciTegic12111210002D\n",
      "\n",
      " 37 41  0  0  0  0            999 V2000\n",
      "    6.9208   -3.0042    0.0000 C   0  0\n",
      "    7.5250   -2.6417    0.0000 N   0  0\n",
      "    3.2167   -3.0417    0.0000 C   0  0\n",
      "    5.6875   -3.0167    0.0000 C   0  0\n",
      "    6.3000   -2.6542    0.0000 N   0  0\n",
      "    3.8292   -2.6792    0.0000 N   0  0\n",
      "    8.1417   -2.9833    0.0000 C   0  0\n",
      "    0.1292   -2.0000    0.0000 N   0  0\n",
      "    5.0667   -2.6667    0.0000 C   0  0\n",
      "   -1.1083   -2.6917    0.0000 N   0  0\n",
      "    6.9250   -3.7167    0.0000 N   0  0\n",
      "    2.6000   -2.6917    0.0000 C   0  0\n",
      "    4.4542   -3.0292    0.0000 C   0  0\n",
      "    8.7542   -2.6125    0.0000 C   0  0\n",
      "    5.6917   -3.7292    0.0000 C   0  0\n",
      "    3.2250   -3.7500    0.0000 O   0  0\n",
      "    9.3458   -1.5375    0.0000 N   0  0\n",
      "    0.7417   -1.6417    0.0000 C   0  0\n",
      "    2.6000   -1.9792    0.0000 C   0  0\n",
      "    1.9875   -3.0500    0.0000 C   0  0\n",
      "    5.0667   -4.0917    0.0000 C   0  0\n",
      "    0.1167   -2.7167    0.0000 C   0  0\n",
      "   -0.4708   -1.6417    0.0000 C   0  0\n",
      "   -0.5000   -3.0542    0.0000 C   0  0\n",
      "   -1.1000   -1.9792    0.0000 C   0  0\n",
      "    8.1583   -3.6958    0.0000 C   0  0\n",
      "    7.5458   -4.0625    0.0000 C   0  0\n",
      "    1.3667   -1.9917    0.0000 C   0  0\n",
      "    4.4542   -3.7417    0.0000 C   0  0\n",
      "    1.9667   -1.6292    0.0000 C   0  0\n",
      "    1.3750   -2.7042    0.0000 C   0  0\n",
      "    8.7417   -1.9083    0.0000 C   0  0\n",
      "   -1.7375   -3.0292    0.0000 C   0  0\n",
      "    9.3750   -2.9625    0.0000 C   0  0\n",
      "    9.9750   -1.8833    0.0000 C   0  0\n",
      "    6.3042   -4.0875    0.0000 C   0  0\n",
      "    9.9917   -2.5958    0.0000 C   0  0\n",
      "  2  1  1  0\n",
      "  3  6  1  0\n",
      "  4  5  1  0\n",
      "  5  1  1  0\n",
      "  6 13  1  0\n",
      "  7  2  2  0\n",
      "  8 18  1  0\n",
      "  9  4  1  0\n",
      " 10 25  1  0\n",
      " 11  1  2  0\n",
      " 12  3  1  0\n",
      " 13  9  2  0\n",
      " 14  7  1  0\n",
      " 15  4  2  0\n",
      " 16  3  2  0\n",
      " 17 32  2  0\n",
      " 18 28  1  0\n",
      " 19 12  2  0\n",
      " 20 12  1  0\n",
      " 21 15  1  0\n",
      " 22  8  1  0\n",
      " 23  8  1  0\n",
      " 24 22  1  0\n",
      " 25 23  1  0\n",
      " 26  7  1  0\n",
      " 27 11  1  0\n",
      " 28 31  1  0\n",
      " 29 21  2  0\n",
      " 30 19  1  0\n",
      " 31 20  2  0\n",
      " 32 14  1  0\n",
      " 33 10  1  0\n",
      " 34 14  2  0\n",
      " 35 37  2  0\n",
      " 36 15  1  0\n",
      " 37 34  1  0\n",
      " 27 26  2  0\n",
      " 29 13  1  0\n",
      " 17 35  1  0\n",
      " 28 30  2  0\n",
      " 24 10  1  0\n",
      "M  END\n",
      "\"\"\"\n",
      "\n",
      "predictions = api_client.post_modelpredictor2(data=urllib.quote(gleevec_ctab))['results']\n",
      "\n",
      "# How many ADME targets were predicted?\n",
      "print predictions.__len__()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "8\n"
       ]
      }
     ],
     "prompt_number": 10
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# Let's view the predictions\n",
      "tc=[]\n",
      "ts = '<table><tr>'\n",
      "te = '</tr></table>'\n",
      "\n",
      "for prediction in predictions:    \n",
      "        tc.append(\"<tr><th>\"+prediction['PROTEIN_ACCESSION']+\"</th><th>\"+prediction['full_name']+\"</th></tr>\")\n",
      "        if prediction['function'] != None:\n",
      "            pfunc = prediction['function']\n",
      "        else:\n",
      "            pfunc = 'Unknown'\n",
      "            \n",
      "        tc.append('<tr><td>'+taxids[prediction['taxid']]+'</td><td>'+pfunc+'</td></tr>')\n",
      "        \n",
      "h = HTML(ts+\"\".join(tc)+te)\n",
      "h"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<table><tr><tr><th>CHEMBL3356</th><th>Cytochrome P450 1A2</th></tr><tr><td>Human</td><td>Cytochromes P450 are a group of heme-thiolate monooxygenases. In liver microsomes, this enzyme is involved in an NADPH-dependent electron transport pathway. It oxidizes a variety of structurally unrelated compounds, including steroids, fatty acids, and xenobiotics. Most active in catalyzing 2-hydroxylation. Caffeine is metabolized primarily by cytochrome CYP1A2 in the liver through an initial N3-demethylation. Also acts in the metabolism of aflatoxin B1 and acetaminophen. Participates in the bioactivation of carcinogenic aromatic and heterocyclic amines. Catalizes the N-hydroxylation of heterocyclic amines and the O-deethylation of phenacetin.</td></tr><tr><th>CHEMBL5393</th><th>ATP-binding cassette sub-family G member 2</th></tr><tr><td>Human</td><td>Xenobiotic transporter that may play an important role in the exclusion of xenobiotics from the brain. May be involved in brain-to-blood efflux. Appears to play a major role in the multidrug resistance phenotype of several cancer cell lines. When overexpressed, the transfected cells become resistant to mitoxantrone, daunorubicin and doxorubicin, display diminished intracellular accumulation of daunorubicin, and manifest an ATP-dependent increase in the efflux of rhodamine 123.</td></tr><tr><th>CHEMBL340</th><th>Cytochrome P450 3A4</th></tr><tr><td>Human</td><td>Cytochromes P450 are a group of heme-thiolate monooxygenases. In liver microsomes, this enzyme is involved in an NADPH-dependent electron transport pathway. It performs a variety of oxidation reactions (e.g. caffeine 8-oxidation, omeprazole sulphoxidation, midazolam 1''''-hydroxylation and midazolam 4-hydroxylation) of structurally unrelated compounds, including steroids, fatty acids, and xenobiotics. Acts as a 1,8-cineole 2-exo-monooxygenase. The enzyme also hydroxylates etoposide.</td></tr><tr><th>CHEMBL3397</th><th>Cytochrome P450 2C9</th></tr><tr><td>Human</td><td>Cytochromes P450 are a group of heme-thiolate monooxygenases. In liver microsomes, this enzyme is involved in an NADPH-dependent electron transport pathway. It oxidizes a variety of structurally unrelated compounds, including steroids, fatty acids, and xenobiotics. This enzyme contributes to the wide pharmacokinetics variability of the metabolism of drugs such as S-warfarin, diclofenac, phenytoin, tolbutamide and losartan.</td></tr><tr><th>CHEMBL3622</th><th>Cytochrome P450 2C19</th></tr><tr><td>Human</td><td>Responsible for the metabolism of a number of therapeutic agents such as the anticonvulsant drug S-mephenytoin, omeprazole, proguanil, certain barbiturates, diazepam, propranolol, citalopram and imipramine.</td></tr><tr><th>CHEMBL3577</th><th>Retinal dehydrogenase 1</th></tr><tr><td>Human</td><td>Binds free retinal and cellular retinol-binding protein-bound retinal. Can convert/oxidize retinaldehyde to retinoic acid (By similarity).</td></tr><tr><th>CHEMBL289</th><th>Cytochrome P450 2D6</th></tr><tr><td>Human</td><td>Responsible for the metabolism of many drugs and environmental chemicals that it oxidizes. It is involved in the metabolism of drugs such as antiarrhythmics, adrenoceptor antagonists, and tricyclic antidepressants.</td></tr><tr><th>CHEMBL6035</th><th>Thioredoxin reductase 1, cytoplasmic</th></tr><tr><td>Rat</td><td>Unknown</td></tr></tr></table>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 11,
       "text": [
        "<IPython.core.display.HTML at 0x7fdf187b2550>"
       ]
      }
     ],
     "prompt_number": 11
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# Now lets look at expression levels of these targets\n",
      "# Select only HIGH expression levels\n",
      "tc=[]\n",
      "ts = '<table><tr>'\n",
      "te = '</tr></table>'\n",
      "\n",
      "for prediction in predictions:    \n",
      "        tc.append(\"<tr><th>\"+prediction['PROTEIN_ACCESSION']+\"</th><th>\"+prediction['full_name']+\"</th></tr>\")\n",
      "        # This dumps out expression levels for all tissues and cell types!\n",
      "        targetexpression=[]\n",
      "\n",
      "        for humexp in expressionlevels:\n",
      "\n",
      "            for tissue in tissues:\n",
      "\n",
      "                for cell in cells:\n",
      "\n",
      "                    percell = humexp[str(tissue)]\n",
      "\n",
      "                    if str(cell) in percell:\n",
      "\n",
      "                        if percell[str(cell)]['target_id']==prediction['target_id']:\n",
      "\n",
      "                            expstring = \"Tissue =\",percell[str(cell)]['tissue'],\", Cell =\",cells[str(cell)],\" Level =\",percell[str(cell)]['exp_level'],\" Type =\",percell[str(cell)]['expression_type'],\" Reliability =\",percell[str(cell)]['reliability']\n",
      "\n",
      "                            level = percell[str(cell)]['exp_level']\n",
      "                            if re.match('High|Strong',level):\n",
      "                                 targetexpression.append(expstring)\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "        for exp in targetexpression:\n",
      "            tc.append('<tr><td>'+\" \".join(exp)+'</td></tr>')\n",
      "\n",
      "h = HTML(ts+\"\".join(tc)+te)\n",
      "h"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<table><tr><tr><th>CHEMBL3356</th><th>Cytochrome P450 1A2</th></tr><tr><td>Tissue = liver , Cell = hepatocytes  Level = High  Type = APE  Reliability = Medium</td></tr><tr><th>CHEMBL5393</th><th>ATP-binding cassette sub-family G member 2</th></tr><tr><td>Tissue = vulva/anal+skin , Cell = epidermal cells  Level = Strong  Type = Staining  Reliability = Uncertain</td></tr><tr><td>Tissue = tonsil , Cell = squamous epithelial cells  Level = Strong  Type = Staining  Reliability = Uncertain</td></tr><tr><td>Tissue = lung , Cell = macrophages  Level = Strong  Type = Staining  Reliability = Uncertain</td></tr><tr><td>Tissue = nasopharynx , Cell = respiratory epithelial cells  Level = Strong  Type = Staining  Reliability = Uncertain</td></tr><tr><td>Tissue = vagina , Cell = squamous epithelial cells  Level = Strong  Type = Staining  Reliability = Uncertain</td></tr><tr><td>Tissue = uterus,+post-menopause , Cell = glandular cells  Level = Strong  Type = Staining  Reliability = Uncertain</td></tr><tr><td>Tissue = stomach,+upper , Cell = glandular cells  Level = Strong  Type = Staining  Reliability = Uncertain</td></tr><tr><td>Tissue = testis , Cell = cells in seminiferus ducts  Level = Strong  Type = Staining  Reliability = Uncertain</td></tr><tr><td>Tissue = adrenal+gland , Cell = glandular cells  Level = Strong  Type = Staining  Reliability = Uncertain</td></tr><tr><td>Tissue = bone+marrow , Cell = hematopoietic cells  Level = Strong  Type = Staining  Reliability = Uncertain</td></tr><tr><td>Tissue = bronchus , Cell = respiratory epithelial cells  Level = Strong  Type = Staining  Reliability = Uncertain</td></tr><tr><td>Tissue = colon , Cell = glandular cells  Level = Strong  Type = Staining  Reliability = Uncertain</td></tr><tr><td>Tissue = cervix,+uterine , Cell = glandular cells  Level = Strong  Type = Staining  Reliability = Uncertain</td></tr><tr><td>Tissue = spleen , Cell = cells in red pulp  Level = Strong  Type = Staining  Reliability = Uncertain</td></tr><tr><td>Tissue = spleen , Cell = cells in white pulp  Level = Strong  Type = Staining  Reliability = Uncertain</td></tr><tr><td>Tissue = epididymis , Cell = glandular cells  Level = Strong  Type = Staining  Reliability = Uncertain</td></tr><tr><td>Tissue = esophagus , Cell = squamous epithelial cells  Level = Strong  Type = Staining  Reliability = Uncertain</td></tr><tr><td>Tissue = heart+muscle , Cell = myocytes  Level = Strong  Type = Staining  Reliability = Uncertain</td></tr><tr><td>Tissue = gallbladder , Cell = glandular cells  Level = Strong  Type = Staining  Reliability = Uncertain</td></tr><tr><td>Tissue = seminal+vesicle , Cell = glandular cells  Level = Strong  Type = Staining  Reliability = Uncertain</td></tr><tr><td>Tissue = small+intestine , Cell = glandular cells  Level = Strong  Type = Staining  Reliability = Uncertain</td></tr><tr><td>Tissue = skin , Cell = keratinocytes  Level = Strong  Type = Staining  Reliability = Uncertain</td></tr><tr><td>Tissue = skin , Cell = Langerhans  Level = Strong  Type = Staining  Reliability = Uncertain</td></tr><tr><td>Tissue = skin , Cell = fibroblasts  Level = Strong  Type = Staining  Reliability = Uncertain</td></tr><tr><td>Tissue = skin , Cell = melanocytes  Level = Strong  Type = Staining  Reliability = Uncertain</td></tr><tr><td>Tissue = skeletal+muscle , Cell = myocytes  Level = Strong  Type = Staining  Reliability = Uncertain</td></tr><tr><th>CHEMBL340</th><th>Cytochrome P450 3A4</th></tr><tr><td>Tissue = duodenum , Cell = glandular cells  Level = Strong  Type = Staining  Reliability = Supportive</td></tr><tr><td>Tissue = liver , Cell = hepatocytes  Level = Strong  Type = Staining  Reliability = Supportive</td></tr><tr><td>Tissue = small+intestine , Cell = glandular cells  Level = Strong  Type = Staining  Reliability = Supportive</td></tr><tr><th>CHEMBL3397</th><th>Cytochrome P450 2C9</th></tr><tr><td>Tissue = liver , Cell = hepatocytes  Level = High  Type = APE  Reliability = High</td></tr><tr><th>CHEMBL3622</th><th>Cytochrome P450 2C19</th></tr><tr><td>Tissue = liver , Cell = hepatocytes  Level = Strong  Type = Staining  Reliability = Supportive</td></tr><tr><th>CHEMBL3577</th><th>Retinal dehydrogenase 1</th></tr><tr><th>CHEMBL289</th><th>Cytochrome P450 2D6</th></tr><tr><td>Tissue = cerebellum , Cell = cells in granular layer  Level = Strong  Type = Staining  Reliability = Uncertain</td></tr><tr><td>Tissue = duodenum , Cell = glandular cells  Level = Strong  Type = Staining  Reliability = Uncertain</td></tr><tr><td>Tissue = liver , Cell = hepatocytes  Level = Strong  Type = Staining  Reliability = Uncertain</td></tr><tr><td>Tissue = small+intestine , Cell = glandular cells  Level = Strong  Type = Staining  Reliability = Uncertain</td></tr><tr><th>CHEMBL6035</th><th>Thioredoxin reductase 1, cytoplasmic</th></tr></tr></table>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 12,
       "text": [
        "<IPython.core.display.HTML at 0x7fdf0c7e5fd0>"
       ]
      }
     ],
     "prompt_number": 12
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# Now lets see how many activity points we have per target\n",
      "# Let's view the predictions\n",
      "tc=[]\n",
      "ts = '<table><tr>'\n",
      "te = '</tr></table>'\n",
      "\n",
      "for prediction in predictions:\n",
      "        try:\n",
      "            activity = api_client.get_targetbioactivity(TARGET_ID=str(prediction['target_id']))['results']\n",
      "            print activity.__len__(),\" activity points\"\n",
      "        except:\n",
      "            print \"Error retrieving data points!\"\n",
      "        #tc.append(\"<tr><th>\"+prediction['PROTEIN_ACCESSION']+\"</th><th>\"+prediction['full_name']+\"</th></tr>\")\n",
      "        #tc.append('<tr><td>Activity points</td><td>'+str(activity.__len__())+'</td></tr>')\n",
      "        \n",
      "h = HTML(ts+\"\".join(tc)+te)\n",
      "h"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "66105  activity points\n",
        "10257"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "  activity points\n",
        "Error retrieving data points!"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "59288"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "  activity points\n",
        "79250"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "  activity points\n",
        "Error retrieving data points!"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "Error retrieving data points!"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "Error retrieving data points!"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n"
       ]
      },
      {
       "html": [
        "<table><tr></tr></table>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 13,
       "text": [
        "<IPython.core.display.HTML at 0x7fdf1881af50>"
       ]
      }
     ],
     "prompt_number": 13
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# How many compounds do we have per target?\n",
      "tc=[]\n",
      "ts = '<table><tr>'\n",
      "te = '</tr></table>'\n",
      "\n",
      "for prediction in predictions:\n",
      "        try:\n",
      "            targetcompounds = api_client.get_targetcompounds(TARGET_ID=str(prediction['target_id']))['results']\n",
      "            count = targetcompounds.__len__(),\" compounds\"\n",
      "        except:\n",
      "            count = 0\n",
      "            print \"Error retrieving data points!\"\n",
      "        tc.append(\"<tr><th>\"+prediction['PROTEIN_ACCESSION']+\"</th><th>\"+prediction['full_name']+\"</th></tr>\")\n",
      "        tc.append('<tr><td>Activity points</td><td>'+str(count)+'</td></tr>')\n",
      "        \n",
      "h = HTML(ts+\"\".join(tc)+te)\n",
      "h\n"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<table><tr><tr><th>CHEMBL3356</th><th>Cytochrome P450 1A2</th></tr><tr><td>Activity points</td><td>(11791, ' compounds')</td></tr><tr><th>CHEMBL5393</th><th>ATP-binding cassette sub-family G member 2</th></tr><tr><td>Activity points</td><td>(484, ' compounds')</td></tr><tr><th>CHEMBL340</th><th>Cytochrome P450 3A4</th></tr><tr><td>Activity points</td><td>(15913, ' compounds')</td></tr><tr><th>CHEMBL3397</th><th>Cytochrome P450 2C9</th></tr><tr><td>Activity points</td><td>(11124, ' compounds')</td></tr><tr><th>CHEMBL3622</th><th>Cytochrome P450 2C19</th></tr><tr><td>Activity points</td><td>(11707, ' compounds')</td></tr><tr><th>CHEMBL3577</th><th>Retinal dehydrogenase 1</th></tr><tr><td>Activity points</td><td>(75307, ' compounds')</td></tr><tr><th>CHEMBL289</th><th>Cytochrome P450 2D6</th></tr><tr><td>Activity points</td><td>(9717, ' compounds')</td></tr><tr><th>CHEMBL6035</th><th>Thioredoxin reductase 1, cytoplasmic</th></tr><tr><td>Activity points</td><td>(39319, ' compounds')</td></tr></tr></table>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 15,
       "text": [
        "<IPython.core.display.HTML at 0x7fdf17838910>"
       ]
      }
     ],
     "prompt_number": 15
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [],
     "language": "python",
     "metadata": {},
     "outputs": []
    }
   ],
   "metadata": {}
  }
 ]
}