{ "metadata": { "name": "", "signature": "sha256:72f81ee69cf1a5bedc59313dc16cb7ef381aeb0d670fa934b481c31fc5d898f1" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# myChEMBL ADMESARfari webservice tutorial" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### myChEMBL team, ChEMBL Group, EMBL-EBI.\n", "\n", "This notebook is intended to illustrate the use of the ADMESARfari webservice API from Python. Since the webservices for ADMESARfari are written using Cornice (https://github.com/mozilla-services/cornice) we have an exposed SPORE (https://github.com/SPORE/specifications) endpoint. This allows us to use a Python library, such as Respire (https://github.com/spiral-project/respire) to parse the JSON description of the methods available, which is provided by the SPORE endpoint (https://www.ebi.ac.uk/chembl/admesarfari/rest/spore) and to automatically generate callable methods from Python without handcoding the necessary boilerplate code.\n", "\n", "We will cover\n", "* Using Respire to create an API client\n", "* Making basic GET requests to the service\n", "* Using an input compound to run a prediction\n", "* Using an input FASTA sequence to run a prediction\n", "* Presentation of results from both." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "# Let's do our imports first.\n", "import respire,urllib,re\n", "from IPython.display import HTML,JSON\n" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 3 }, { "cell_type": "code", "collapsed": false, "input": [ "# We need to monkey-patch the URL join method in this instance,\n", "# since the default join truncates the URL due to the way ADMESARfari is hosted\n", "\n", "def urljoin_patched(base,path):\n", " return base+path\n", "\n", "respire.client.urljoin = urljoin_patched\n", " \n", "# Create our client and associated methods\n", "api_client = respire.client_from_url('http://wwwdev.ebi.ac.uk/chembl/admesarfari/rest/spore') \n", "\n" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 6 }, { "cell_type": "code", "collapsed": false, "input": [ "# What methods do we have available?\n", "# Iterate over the parsed endpoint, pulling out the applicable methods, their paths and their descriptions.\n", "# We'll add some HTML elements to the output.\n", "tc=[]\n", "ts = '
get_textsearch | |
---|---|
/rest/:TEXT/search | Return a set of target ids where the search term appears |
get_blast | |
/rest/:FASTA/blast | BLAST the input sequence(s) against the ADME SARfari set of target sequences. Requires: a URL-encoded FASTA sequence (may be extended to other formats via a keyword). |
get_celltypes | |
/rest/celltypes | Retrieve the list of cell types |
get_targetsequence | |
/rest/targetsequence/:TARGET_ID | Return the Target Sequence and Variation information for a particular Target ID |
post_postsimsubsdf | |
/rest/simsubsdf/:VALUE | Return a set of molecules via either a similarity or substructure search. Requires: the POST body to contain CTAB or SMILES; a similarity cut-off value (100 will perform a sub-structure search). Returns a gzipped SDF file. |
get_orthologuematrix | |
/rest/orthologuematrix/:TAX_IDS | Retrieves the orthologue mapping matrix for a specific set of Taxonomy IDs. Requires: comma-separated list of Taxonomy IDs. |
post_modelpredictor2 | |
/rest/modelpredictor2 | Run the input CTAB through the ADME SARfari SciKit/RDKit Bayesian model. This requires a URL encoded CTAB |
get_target | |
/rest/target/:TARGET_ID | Return the Target information for a particular Target ID |
get_targetalignment | |
/rest/targetalignment/:TARGET_ID/:TAX_IDS | Return the alignment information for a particular Target ID |
get_bioactivity | |
/rest/:MOLREGNO/bioactivity | Retrieve the bioactivity and assay data for a particular molregno. Requires: Molregno (Int). If the Molregno == -1 then it will bring back all records. Returns: Datatables format JSON. |
post_postblast | |
/rest/blast | BLAST the input sequence(s) against the ADME SARfari set of protein databases. Requires: a FASTA sequence as the POST body. |
get_targetcompounds | |
/rest/:TARGET_ID/targetcompounds | Retrieve the compound SMILES associated with an ADME SARfari target (via activity). Requires: ADME SARfari Target ID. |
get_expressionmatrix | |
/rest/expressionmatrix/:TISSUE_IDS | Retrieves the tissue target expression level matrix |
get_taxids | |
/rest/taxids | Return the list of taxonomy IDs used. |
get_alignmentdendrogram | |
/rest/alignmentdendrogram/:TARGET_ID/:TAX_IDS | Return the dendrogram tree information for a particular Target ID (from the relevant orthologues). Requires a target ID and a comma-separated list of tax IDs. |
get_targetinvivomatrix | |
/rest/:TARGET_ID/targetinvivomatrix | Retrieves the in vivo matrix for a particular target. Requires: ADME SARfari internal target ID. Returns an object with xcats, ycats and data elements (primarily used with the Highcharts Heatmap plugin). |
get_tissues | |
/rest/tissues | Retrieve the list of tissues |
get_targetbioactivity | |
/rest/:TARGET_ID/targetbioactivity | Retrieve the bioactivity and assay data for a particular target. Requires: ADME SARfari Target ID. |
get_modelpredictor2 | |
/rest/:CTAB/modelpredictor2 | Run the input CTAB through the ADME SARfari SciKit/RDKit Bayesian model. This requires a URL encoded CTAB |
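
Each path in the listing above contains `:NAME` placeholders that are filled in with URL-encoded arguments when a method is called. The following is a minimal Python 3 sketch of that expansion, not Respire's actual implementation. The `expand_path` helper, the `BASE_URL` value and the target ID `42` are assumptions made for illustration. It also shows why a standard URL join can drop part of a hosted base path when the method path starts with `/`, which is the kind of truncation the monkey-patch in this notebook works around.

```python
import re
from urllib.parse import quote, urljoin

# Assumed base URL for illustration only; the real value comes from the
# SPORE description served by the webservice.
BASE_URL = "http://wwwdev.ebi.ac.uk/chembl/admesarfari"


def expand_path(template, **params):
    """Replace each :NAME placeholder with its URL-encoded value (hypothetical helper)."""
    def substitute(match):
        name = match.group(1)
        return quote(str(params[name]), safe="")
    return re.sub(r":([A-Za-z_][A-Za-z0-9_]*)", substitute, template)


# 42 is a made-up target ID, used only to show the mechanics.
path = expand_path("/rest/:TARGET_ID/targetcompounds", TARGET_ID=42)

# The standard join treats a leading "/" as "start from the host root",
# so the "/chembl/admesarfari" prefix is lost.
standard = urljoin(BASE_URL, path)

# Plain concatenation keeps the prefix.
concatenated = BASE_URL + path

print(standard)
print(concatenated)
```

Plain concatenation is only safe here because every path in the listing starts with `/` and the base has no trailing slash. A more general client would need to normalise both sides.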
CHEMBL3356 | Cytochrome P450 1A2 |
---|---|
Human | Cytochromes P450 are a group of heme-thiolate monooxygenases. In liver microsomes, this enzyme is involved in an NADPH-dependent electron transport pathway. It oxidizes a variety of structurally unrelated compounds, including steroids, fatty acids, and xenobiotics. Most active in catalyzing 2-hydroxylation. Caffeine is metabolized primarily by cytochrome CYP1A2 in the liver through an initial N3-demethylation. Also acts in the metabolism of aflatoxin B1 and acetaminophen. Participates in the bioactivation of carcinogenic aromatic and heterocyclic amines. Catalyzes the N-hydroxylation of heterocyclic amines and the O-deethylation of phenacetin. |
CHEMBL5393 | ATP-binding cassette sub-family G member 2 |
Human | Xenobiotic transporter that may play an important role in the exclusion of xenobiotics from the brain. May be involved in brain-to-blood efflux. Appears to play a major role in the multidrug resistance phenotype of several cancer cell lines. When overexpressed, the transfected cells become resistant to mitoxantrone, daunorubicin and doxorubicin, display diminished intracellular accumulation of daunorubicin, and manifest an ATP-dependent increase in the efflux of rhodamine 123. |
CHEMBL340 | Cytochrome P450 3A4 |
Human | Cytochromes P450 are a group of heme-thiolate monooxygenases. In liver microsomes, this enzyme is involved in an NADPH-dependent electron transport pathway. It performs a variety of oxidation reactions (e.g. caffeine 8-oxidation, omeprazole sulphoxidation, midazolam 1'-hydroxylation and midazolam 4-hydroxylation) of structurally unrelated compounds, including steroids, fatty acids, and xenobiotics. Acts as a 1,8-cineole 2-exo-monooxygenase. The enzyme also hydroxylates etoposide. |
CHEMBL3397 | Cytochrome P450 2C9 |
Human | Cytochromes P450 are a group of heme-thiolate monooxygenases. In liver microsomes, this enzyme is involved in an NADPH-dependent electron transport pathway. It oxidizes a variety of structurally unrelated compounds, including steroids, fatty acids, and xenobiotics. This enzyme contributes to the wide pharmacokinetics variability of the metabolism of drugs such as S-warfarin, diclofenac, phenytoin, tolbutamide and losartan. |
CHEMBL3622 | Cytochrome P450 2C19 |
Human | Responsible for the metabolism of a number of therapeutic agents such as the anticonvulsant drug S-mephenytoin, omeprazole, proguanil, certain barbiturates, diazepam, propranolol, citalopram and imipramine. |
CHEMBL3577 | Retinal dehydrogenase 1 |
Human | Binds free retinal and cellular retinol-binding protein-bound retinal. Can convert/oxidize retinaldehyde to retinoic acid (By similarity). |
CHEMBL289 | Cytochrome P450 2D6 |
Human | Responsible for the metabolism of many drugs and environmental chemicals that it oxidizes. It is involved in the metabolism of drugs such as antiarrhythmics, adrenoceptor antagonists, and tricyclic antidepressants. |
CHEMBL6035 | Thioredoxin reductase 1, cytoplasmic |
Rat | Unknown |
CHEMBL3356 | Cytochrome P450 1A2 |
---|---|
Tissue = liver , Cell = hepatocytes Level = High Type = APE Reliability = Medium | |
CHEMBL5393 | ATP-binding cassette sub-family G member 2 |
Tissue = vulva/anal+skin , Cell = epidermal cells Level = Strong Type = Staining Reliability = Uncertain | |
Tissue = tonsil , Cell = squamous epithelial cells Level = Strong Type = Staining Reliability = Uncertain | |
Tissue = lung , Cell = macrophages Level = Strong Type = Staining Reliability = Uncertain | |
Tissue = nasopharynx , Cell = respiratory epithelial cells Level = Strong Type = Staining Reliability = Uncertain | |
Tissue = vagina , Cell = squamous epithelial cells Level = Strong Type = Staining Reliability = Uncertain | |
Tissue = uterus,+post-menopause , Cell = glandular cells Level = Strong Type = Staining Reliability = Uncertain | |
Tissue = stomach,+upper , Cell = glandular cells Level = Strong Type = Staining Reliability = Uncertain | |
Tissue = testis , Cell = cells in seminiferus ducts Level = Strong Type = Staining Reliability = Uncertain | |
Tissue = adrenal+gland , Cell = glandular cells Level = Strong Type = Staining Reliability = Uncertain | |
Tissue = bone+marrow , Cell = hematopoietic cells Level = Strong Type = Staining Reliability = Uncertain | |
Tissue = bronchus , Cell = respiratory epithelial cells Level = Strong Type = Staining Reliability = Uncertain | |
Tissue = colon , Cell = glandular cells Level = Strong Type = Staining Reliability = Uncertain | |
Tissue = cervix,+uterine , Cell = glandular cells Level = Strong Type = Staining Reliability = Uncertain | |
Tissue = spleen , Cell = cells in red pulp Level = Strong Type = Staining Reliability = Uncertain | |
Tissue = spleen , Cell = cells in white pulp Level = Strong Type = Staining Reliability = Uncertain | |
Tissue = epididymis , Cell = glandular cells Level = Strong Type = Staining Reliability = Uncertain | |
Tissue = esophagus , Cell = squamous epithelial cells Level = Strong Type = Staining Reliability = Uncertain | |
Tissue = heart+muscle , Cell = myocytes Level = Strong Type = Staining Reliability = Uncertain | |
Tissue = gallbladder , Cell = glandular cells Level = Strong Type = Staining Reliability = Uncertain | |
Tissue = seminal+vesicle , Cell = glandular cells Level = Strong Type = Staining Reliability = Uncertain | |
Tissue = small+intestine , Cell = glandular cells Level = Strong Type = Staining Reliability = Uncertain | |
Tissue = skin , Cell = keratinocytes Level = Strong Type = Staining Reliability = Uncertain | |
Tissue = skin , Cell = Langerhans Level = Strong Type = Staining Reliability = Uncertain | |
Tissue = skin , Cell = fibroblasts Level = Strong Type = Staining Reliability = Uncertain | |
Tissue = skin , Cell = melanocytes Level = Strong Type = Staining Reliability = Uncertain | |
Tissue = skeletal+muscle , Cell = myocytes Level = Strong Type = Staining Reliability = Uncertain | |
CHEMBL340 | Cytochrome P450 3A4 |
Tissue = duodenum , Cell = glandular cells Level = Strong Type = Staining Reliability = Supportive | |
Tissue = liver , Cell = hepatocytes Level = Strong Type = Staining Reliability = Supportive | |
Tissue = small+intestine , Cell = glandular cells Level = Strong Type = Staining Reliability = Supportive | |
CHEMBL3397 | Cytochrome P450 2C9 |
Tissue = liver , Cell = hepatocytes Level = High Type = APE Reliability = High | |
CHEMBL3622 | Cytochrome P450 2C19 |
Tissue = liver , Cell = hepatocytes Level = Strong Type = Staining Reliability = Supportive | |
CHEMBL3577 | Retinal dehydrogenase 1 |
CHEMBL289 | Cytochrome P450 2D6 |
Tissue = cerebellum , Cell = cells in granular layer Level = Strong Type = Staining Reliability = Uncertain | |
Tissue = duodenum , Cell = glandular cells Level = Strong Type = Staining Reliability = Uncertain | |
Tissue = liver , Cell = hepatocytes Level = Strong Type = Staining Reliability = Uncertain | |
Tissue = small+intestine , Cell = glandular cells Level = Strong Type = Staining Reliability = Uncertain | |
CHEMBL6035 | Thioredoxin reductase 1, cytoplasmic |
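
Several tissue names in the expression table above contain `+` where a space would be expected (for example `small+intestine`). This looks like form-style URL encoding, although that is an assumption, since the service's encoding is not documented here. Under that assumption, a small Python 3 sketch for tidying the names before display could look like this. The `clean_tissue` helper is hypothetical.

```python
from urllib.parse import unquote_plus


def clean_tissue(name):
    """Decode '+' (and any %XX escapes) in a tissue name and trim whitespace.

    Assumes the names are form-style URL-encoded, which is a guess based on
    the '+' characters seen in the output.
    """
    return unquote_plus(name).strip()


# Sample values copied from the table above.
for raw in ["small+intestine", "uterus,+post-menopause", "vulva/anal+skin", "liver"]:
    print(raw, "->", clean_tissue(raw))
```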
CHEMBL3356 | Cytochrome P450 1A2 |
---|---|
Activity points | (11791, ' compounds') |
CHEMBL5393 | ATP-binding cassette sub-family G member 2 |
Activity points | (484, ' compounds') |
CHEMBL340 | Cytochrome P450 3A4 |
Activity points | (15913, ' compounds') |
CHEMBL3397 | Cytochrome P450 2C9 |
Activity points | (11124, ' compounds') |
CHEMBL3622 | Cytochrome P450 2C19 |
Activity points | (11707, ' compounds') |
CHEMBL3577 | Retinal dehydrogenase 1 |
Activity points | (75307, ' compounds') |
CHEMBL289 | Cytochrome P450 2D6 |
Activity points | (9717, ' compounds') |
CHEMBL6035 | Thioredoxin reductase 1, cytoplasmic |
Activity points | (39319, ' compounds') |
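
The values in the table above, such as `(484, ' compounds')`, have the shape of a Python tuple converted to text rather than a formatted sentence. The cell that produced them is not shown here, so the following is only a sketch of one way to get a cleaner label, assuming the count and the word were combined as a tuple. The `format_count` helper is hypothetical.

```python
def format_count(count, label="compounds"):
    """Return a readable 'N label' string instead of a tuple representation."""
    return "{} {}".format(count, label)


# Converting a tuple to text reproduces the style seen in the table.
as_tuple = str((484, " compounds"))

# Formatting the two parts into one string gives a cleaner cell value.
as_text = format_count(484)

print(as_tuple)
print(as_text)
```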