{ "metadata": { "name": "", "signature": "sha256:2dcd973ed5f93f6f77f5ed461217275423bf82bf0a6a144974fc6865e7408a84" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# MMPA on ChEMBL hERG data using Pandas\n", "\n", "The idea here is to try out using Pandas to visualize and work with the output of Jameed Hussain's MMPA code in the IPython notebook. The code is available in the RDKit Contrib dir. Jameed gave a couple tutorials on use of the tools at the [2013 UGM](https://github.com/rdkit/UGM_2013). The notebooks from his tutorials are [here](http://nbviewer.ipython.org/urls/raw.github.com/rdkit/UGM_2013/master/Tutorials/mmpa_tutorial/mmp_tutorial1.ipynb) and [here](http://nbviewer.ipython.org/urls/raw.github.com/rdkit/UGM_2013/master/Tutorials/mmpa_tutorial/mmp_tutorial2.ipynb).\n", "\n", "I'll use a ChEMBL hERG dataset. This was somewhat inspired/informed by Paul's work here: https://github.com/pzc/herg_chembl_jcim" ] }, { "cell_type": "code", "collapsed": false, "input": [ "from rdkit import Chem,DataStructs\n", "import time,random\n", "from collections import defaultdict\n", "import psycopg2\n", "from rdkit.Chem import Draw,PandasTools,rdMolDescriptors\n", "from rdkit.Chem.Draw import IPythonConsole\n", "from rdkit import rdBase\n", "from __future__ import print_function\n", "import requests\n", "from xml.etree import ElementTree\n", "import pandas as pd\n", "%load_ext sql\n", "print(rdBase.rdkitVersion)\n" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "2014.09.1pre\n" ] } ], "prompt_number": 1 }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Preparation\n", "\n", "Start by finding the hERG data in ChEMBL" ] }, { "cell_type": "code", "collapsed": false, "input": [ "%sql postgresql://localhost/chembl_19 \\\n", " select * from chembl_id_lookup where chembl_id = 'CHEMBL240';" ], "language": "python", "metadata": {}, 
"outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "1 rows affected.\n" ] }, { "html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
chembl_id entity_type entity_id status
CHEMBL240 TARGET 165 ACTIVE
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 3, "text": [ "[(u'CHEMBL240', u'TARGET', 165, u'ACTIVE')]" ] } ], "prompt_number": 3 }, { "cell_type": "code", "collapsed": false, "input": [ "%sql select count(*) from activities join assays using (assay_id) where tid=165;" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "1 rows affected.\n" ] }, { "html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", "
count
14397
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 10, "text": [ "[(14397L,)]" ] } ], "prompt_number": 10 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Look at all the activity units available." ] }, { "cell_type": "code", "collapsed": false, "input": [ "%sql select distinct(standard_type) from activities join assays using (assay_id) where tid=165;" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "27 rows affected.\n" ] }, { "html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
standard_type
Ratio
Fold change
Ratio IC50
EC25
Imax
EC50
Activity
IC25
IC60
EC10
IP
IC50
QT interval
Time
Ratio Ki
Log IC50
ED50
pIC50
Inflection point
Inhibition
Ki
IC20
V1/2
Potency
INH
pKi
log IC50
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 11, "text": [ "[(u'Ratio',),\n", " (u'Fold change',),\n", " (u'Ratio IC50',),\n", " (u'EC25',),\n", " (u'Imax',),\n", " (u'EC50',),\n", " (u'Activity',),\n", " (u'IC25',),\n", " (u'IC60',),\n", " (u'EC10',),\n", " (u'IP',),\n", " (u'IC50',),\n", " (u'QT interval',),\n", " (u'Time',),\n", " (u'Ratio Ki',),\n", " (u'Log IC50',),\n", " (u'ED50',),\n", " (u'pIC50',),\n", " (u'Inflection point',),\n", " (u'Inhibition',),\n", " (u'Ki',),\n", " (u'IC20',),\n", " (u'V1/2',),\n", " (u'Potency',),\n", " (u'INH',),\n", " (u'pKi',),\n", " (u'log IC50',)]" ] } ], "prompt_number": 11 }, { "cell_type": "code", "collapsed": false, "input": [ "%sql select count(*) from activities join assays using (assay_id) where tid=165 and standard_type='Ki';" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "1 rows affected.\n" ] }, { "html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", "
count
2327
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 12, "text": [ "[(2327L,)]" ] } ], "prompt_number": 12 }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Create the data set\n", "\n", "Pull all hERG assay Ki values where the value is not qualified and the SMILES doesn't include a dot (the MMPA code doesn't get along with dot-separated SMILES).\n", "\n", "*Reproducibility note:* though the queries here are shown against chembl_19, I did the original data export from chembl_18, so the pairs shown may differ somewhat from what you'd get." ] }, { "cell_type": "code", "collapsed": false, "input": [ "data = %sql select canonical_smiles,molregno,activity_id,standard_value,standard_units from activities \\\n", " join assays using (assay_id) \\\n", " join compound_structures using (molregno) \\\n", " where tid=165 and standard_type='Ki' and standard_value is not null and standard_relation='=' \\\n", " and canonical_smiles not like '%.%';\n" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "1099 rows affected.\n" ] } ], "prompt_number": 4 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Convert to a Pandas DataFrame and write it to a text file" ] }, { "cell_type": "code", "collapsed": false, "input": [ "df = data.DataFrame()" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 5 }, { "cell_type": "code", "collapsed": false, "input": [ "df.to_csv('../data/herg_data.txt',sep=\" \",index=False)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 42 }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Build the matched pairs\n", "\n", "Call the MMPA fragmentation program.\n", "\n", "This can take a few minutes." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "!python $RDBASE/Contrib/mmpa/rfrag.py < ../data/herg_data.txt > ../data/herg_fragmented.txt" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "[03:43:47] SMILES Parse Error: syntax error for input: canonical_smiles\r\n", "Can't generate mol for: canonical_smiles\r\n" ] } ], "prompt_number": 43 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now generate the MMPs that differ by less than 10% of the molecule. Generate symmetrically so that we can detect tforms in both directions." ] }, { "cell_type": "code", "collapsed": false, "input": [ "!python $RDBASE/Contrib/mmpa/indexing.py -s -r 0.1 < ../data/herg_fragmented.txt > ../data/mmps_default.txt" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 153 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Read those into a Pandas data frame and look at some of the data." ] }, { "cell_type": "code", "collapsed": false, "input": [ "mmps = pd.read_csv('../data/mmps_default.txt',header=None,names=('smiles1','smiles2','molregno1','molregno2','tform','core'))" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 6 }, { "cell_type": "code", "collapsed": false, "input": [ "mmps[mmps.molregno1==290813]" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
smiles1 smiles2 molregno1 molregno2 tform core
0 CCn1nc(Cc2ccc(OC3CCC3)cc2)cc1C4CCN(C[C@H]5CN(C[C@@H]5c6cccc(F)c6)[C@@H](C(=O)O)C(C)(C)C)CC4 CC[C@@H](C)[C@@H](N1C[C@H](CN2CCC(CC2)c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)[C@H](C1)c6cccc(F)c6)C(=O)O 290813 290814 [*:1]C(C)(C)C>>[*:1][C@H](C)CC [*:1][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1
2 CCn1nc(Cc2ccc(OC3CCC3)cc2)cc1C4CCN(C[C@H]5CN(C[C@@H]5c6cccc(F)c6)[C@@H](C(=O)O)C(C)(C)C)CC4 CC[C@H](C)[C@@H](N1C[C@H](CN2CCC(CC2)c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)[C@H](C1)c6cccc(F)c6)C(=O)O 290813 290815 [*:1]C(C)(C)C>>[*:1][C@@H](C)CC [*:1][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1
92 CCn1nc(Cc2ccc(OC3CCC3)cc2)cc1C4CCN(C[C@H]5CN(C[C@@H]5c6cccc(F)c6)[C@@H](C(=O)O)C(C)(C)C)CC4 CC[C@@H](C)[C@@H](N1C[C@H](CN2CCC(CC2)c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)[C@H](C1)c6cccc(F)c6)C(=O)O 290813 290814 [*:1]C([*:2])([*:3])C>>[*:1]CC([*:2])[*:3] [*:3]C.[*:1]C.[*:2][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1
94 CCn1nc(Cc2ccc(OC3CCC3)cc2)cc1C4CCN(C[C@H]5CN(C[C@@H]5c6cccc(F)c6)[C@@H](C(=O)O)C(C)(C)C)CC4 CC[C@H](C)[C@@H](N1C[C@H](CN2CCC(CC2)c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)[C@H](C1)c6cccc(F)c6)C(=O)O 290813 290815 [*:1]C([*:2])([*:3])C>>[*:1]CC([*:2])[*:3] [*:3]C.[*:1]C.[*:2][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1
1368 CCn1nc(Cc2ccc(OC3CCC3)cc2)cc1C4CCN(C[C@H]5CN(C[C@@H]5c6cccc(F)c6)[C@@H](C(=O)O)C(C)(C)C)CC4 CC[C@@H](C)[C@@H](N1C[C@H](CN2CCC(CC2)c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)[C@H](C1)c6cccc(F)c6)C(=O)O 290813 290814 [*:1]C([*:2])(C)C>>[*:1]C([*:2])CC [*:1]C.[*:2][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1
1370 CCn1nc(Cc2ccc(OC3CCC3)cc2)cc1C4CCN(C[C@H]5CN(C[C@@H]5c6cccc(F)c6)[C@@H](C(=O)O)C(C)(C)C)CC4 CC[C@@H](C)[C@@H](N1C[C@H](CN2CCC(CC2)c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)[C@H](C1)c6cccc(F)c6)C(=O)O 290813 290814 [*:1]C([*:2])(C)C>>[*:1]C[C@H]([*:2])C [*:1]C.[*:2][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1
1372 CCn1nc(Cc2ccc(OC3CCC3)cc2)cc1C4CCN(C[C@H]5CN(C[C@@H]5c6cccc(F)c6)[C@@H](C(=O)O)C(C)(C)C)CC4 CC[C@H](C)[C@@H](N1C[C@H](CN2CCC(CC2)c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)[C@H](C1)c6cccc(F)c6)C(=O)O 290813 290815 [*:1]C([*:2])(C)C>>[*:1]C[C@@H]([*:2])C [*:1]C.[*:2][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1
1374 CCn1nc(Cc2ccc(OC3CCC3)cc2)cc1C4CCN(C[C@H]5CN(C[C@@H]5c6cccc(F)c6)[C@@H](C(=O)O)C(C)(C)C)CC4 CC[C@H](C)[C@@H](N1C[C@H](CN2CCC(CC2)c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)[C@H](C1)c6cccc(F)c6)C(=O)O 290813 290815 [*:1]C([*:2])(C)C>>[*:1]C([*:2])CC [*:1]C.[*:2][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1
1470 CCn1nc(Cc2ccc(OC3CCC3)cc2)cc1C4CCN(C[C@H]5CN(C[C@@H]5c6cccc(F)c6)[C@@H](C(=O)O)C(C)(C)C)CC4 CCn1nc(Cc2ccc(OC(C)C)cc2)cc1C3CCN(C[C@H]4CN(C[C@@H]4c5cccc(F)c5)[C@@H](C(=O)O)C(C)(C)C)CC3 290813 290921 [*:1]C1CCC1>>[*:1]C(C)C [*:1]Oc1ccc(Cc2cc(C3CCN(C[C@H]4CN([C@@H](C(=O)O)C(C)(C)C)C[C@@H]4c4cccc(F)c4)CC3)n(CC)n2)cc1
1472 CCn1nc(Cc2ccc(OC3CCC3)cc2)cc1C4CCN(C[C@H]5CN(C[C@@H]5c6cccc(F)c6)[C@@H](C(=O)O)C(C)(C)C)CC4 CCn1nc(Cc2ccc(OC(F)(F)F)cc2)cc1C3CCN(C[C@H]4CN(C[C@@H]4c5cccc(F)c5)[C@@H](C(=O)O)C(C)(C)C)CC3 290813 292218 [*:1]C1CCC1>>[*:1]C(F)(F)F [*:1]Oc1ccc(Cc2cc(C3CCN(C[C@H]4CN([C@@H](C(=O)O)C(C)(C)C)C[C@@H]4c4cccc(F)c4)CC3)n(CC)n2)cc1
1504 CCn1nc(Cc2ccc(OC3CCC3)cc2)cc1C4CCN(C[C@H]5CN(C[C@@H]5c6cccc(F)c6)[C@@H](C(=O)O)C(C)(C)C)CC4 CC[C@@H](C)[C@@H](N1C[C@H](CN2CCC(CC2)c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)[C@H](C1)c6cccc(F)c6)C(=O)O 290813 290814 [*:1]C([*:2])C([*:3])(C)C>>[*:1]C([*:2])[C@@H]([*:3])CC [*:3]C.[*:1]C(=O)O.[*:2]N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1
1506 CCn1nc(Cc2ccc(OC3CCC3)cc2)cc1C4CCN(C[C@H]5CN(C[C@@H]5c6cccc(F)c6)[C@@H](C(=O)O)C(C)(C)C)CC4 CC[C@@H](C)[C@@H](N1C[C@H](CN2CCC(CC2)c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)[C@H](C1)c6cccc(F)c6)C(=O)O 290813 290814 [*:1]C([*:2])C([*:3])(C)C>>[*:3]C[C@@H](C)C([*:1])[*:2] [*:3]C.[*:1]C(=O)O.[*:2]N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1
1508 CCn1nc(Cc2ccc(OC3CCC3)cc2)cc1C4CCN(C[C@H]5CN(C[C@@H]5c6cccc(F)c6)[C@@H](C(=O)O)C(C)(C)C)CC4 CC[C@H](C)[C@@H](N1C[C@H](CN2CCC(CC2)c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)[C@H](C1)c6cccc(F)c6)C(=O)O 290813 290815 [*:1]C([*:2])C([*:3])(C)C>>[*:1]C([*:2])[C@H]([*:3])CC [*:3]C.[*:1]C(=O)O.[*:2]N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1
1510 CCn1nc(Cc2ccc(OC3CCC3)cc2)cc1C4CCN(C[C@H]5CN(C[C@@H]5c6cccc(F)c6)[C@@H](C(=O)O)C(C)(C)C)CC4 CC[C@H](C)[C@@H](N1C[C@H](CN2CCC(CC2)c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)[C@H](C1)c6cccc(F)c6)C(=O)O 290813 290815 [*:1]C([*:2])C([*:3])(C)C>>[*:3]C[C@H](C)C([*:1])[*:2] [*:3]C.[*:1]C(=O)O.[*:2]N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1
\n", "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 7, "text": [ " smiles1 smiles2 molregno1 molregno2 tform core\n", "0 CCn1nc(Cc2ccc(OC3CCC3)cc2)cc1C4CCN(C[C@H]5CN(C[C@@H]5c6cccc(F)c6)[C@@H](C(=O)O)C(C)(C)C)CC4 CC[C@@H](C)[C@@H](N1C[C@H](CN2CCC(CC2)c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)[C@H](C1)c6cccc(F)c6)C(=O)O 290813 290814 [*:1]C(C)(C)C>>[*:1][C@H](C)CC [*:1][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1\n", "2 CCn1nc(Cc2ccc(OC3CCC3)cc2)cc1C4CCN(C[C@H]5CN(C[C@@H]5c6cccc(F)c6)[C@@H](C(=O)O)C(C)(C)C)CC4 CC[C@H](C)[C@@H](N1C[C@H](CN2CCC(CC2)c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)[C@H](C1)c6cccc(F)c6)C(=O)O 290813 290815 [*:1]C(C)(C)C>>[*:1][C@@H](C)CC [*:1][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1\n", "92 CCn1nc(Cc2ccc(OC3CCC3)cc2)cc1C4CCN(C[C@H]5CN(C[C@@H]5c6cccc(F)c6)[C@@H](C(=O)O)C(C)(C)C)CC4 CC[C@@H](C)[C@@H](N1C[C@H](CN2CCC(CC2)c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)[C@H](C1)c6cccc(F)c6)C(=O)O 290813 290814 [*:1]C([*:2])([*:3])C>>[*:1]CC([*:2])[*:3] [*:3]C.[*:1]C.[*:2][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1\n", "94 CCn1nc(Cc2ccc(OC3CCC3)cc2)cc1C4CCN(C[C@H]5CN(C[C@@H]5c6cccc(F)c6)[C@@H](C(=O)O)C(C)(C)C)CC4 CC[C@H](C)[C@@H](N1C[C@H](CN2CCC(CC2)c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)[C@H](C1)c6cccc(F)c6)C(=O)O 290813 290815 [*:1]C([*:2])([*:3])C>>[*:1]CC([*:2])[*:3] [*:3]C.[*:1]C.[*:2][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1\n", "1368 CCn1nc(Cc2ccc(OC3CCC3)cc2)cc1C4CCN(C[C@H]5CN(C[C@@H]5c6cccc(F)c6)[C@@H](C(=O)O)C(C)(C)C)CC4 CC[C@@H](C)[C@@H](N1C[C@H](CN2CCC(CC2)c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)[C@H](C1)c6cccc(F)c6)C(=O)O 290813 290814 [*:1]C([*:2])(C)C>>[*:1]C([*:2])CC [*:1]C.[*:2][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1\n", "1370 CCn1nc(Cc2ccc(OC3CCC3)cc2)cc1C4CCN(C[C@H]5CN(C[C@@H]5c6cccc(F)c6)[C@@H](C(=O)O)C(C)(C)C)CC4 
CC[C@@H](C)[C@@H](N1C[C@H](CN2CCC(CC2)c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)[C@H](C1)c6cccc(F)c6)C(=O)O 290813 290814 [*:1]C([*:2])(C)C>>[*:1]C[C@H]([*:2])C [*:1]C.[*:2][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1\n", "1372 CCn1nc(Cc2ccc(OC3CCC3)cc2)cc1C4CCN(C[C@H]5CN(C[C@@H]5c6cccc(F)c6)[C@@H](C(=O)O)C(C)(C)C)CC4 CC[C@H](C)[C@@H](N1C[C@H](CN2CCC(CC2)c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)[C@H](C1)c6cccc(F)c6)C(=O)O 290813 290815 [*:1]C([*:2])(C)C>>[*:1]C[C@@H]([*:2])C [*:1]C.[*:2][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1\n", "1374 CCn1nc(Cc2ccc(OC3CCC3)cc2)cc1C4CCN(C[C@H]5CN(C[C@@H]5c6cccc(F)c6)[C@@H](C(=O)O)C(C)(C)C)CC4 CC[C@H](C)[C@@H](N1C[C@H](CN2CCC(CC2)c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)[C@H](C1)c6cccc(F)c6)C(=O)O 290813 290815 [*:1]C([*:2])(C)C>>[*:1]C([*:2])CC [*:1]C.[*:2][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1\n", "1470 CCn1nc(Cc2ccc(OC3CCC3)cc2)cc1C4CCN(C[C@H]5CN(C[C@@H]5c6cccc(F)c6)[C@@H](C(=O)O)C(C)(C)C)CC4 CCn1nc(Cc2ccc(OC(C)C)cc2)cc1C3CCN(C[C@H]4CN(C[C@@H]4c5cccc(F)c5)[C@@H](C(=O)O)C(C)(C)C)CC3 290813 290921 [*:1]C1CCC1>>[*:1]C(C)C [*:1]Oc1ccc(Cc2cc(C3CCN(C[C@H]4CN([C@@H](C(=O)O)C(C)(C)C)C[C@@H]4c4cccc(F)c4)CC3)n(CC)n2)cc1\n", "1472 CCn1nc(Cc2ccc(OC3CCC3)cc2)cc1C4CCN(C[C@H]5CN(C[C@@H]5c6cccc(F)c6)[C@@H](C(=O)O)C(C)(C)C)CC4 CCn1nc(Cc2ccc(OC(F)(F)F)cc2)cc1C3CCN(C[C@H]4CN(C[C@@H]4c5cccc(F)c5)[C@@H](C(=O)O)C(C)(C)C)CC3 290813 292218 [*:1]C1CCC1>>[*:1]C(F)(F)F [*:1]Oc1ccc(Cc2cc(C3CCN(C[C@H]4CN([C@@H](C(=O)O)C(C)(C)C)C[C@@H]4c4cccc(F)c4)CC3)n(CC)n2)cc1\n", "1504 CCn1nc(Cc2ccc(OC3CCC3)cc2)cc1C4CCN(C[C@H]5CN(C[C@@H]5c6cccc(F)c6)[C@@H](C(=O)O)C(C)(C)C)CC4 CC[C@@H](C)[C@@H](N1C[C@H](CN2CCC(CC2)c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)[C@H](C1)c6cccc(F)c6)C(=O)O 290813 290814 [*:1]C([*:2])C([*:3])(C)C>>[*:1]C([*:2])[C@@H]([*:3])CC [*:3]C.[*:1]C(=O)O.[*:2]N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1\n", "1506 
CCn1nc(Cc2ccc(OC3CCC3)cc2)cc1C4CCN(C[C@H]5CN(C[C@@H]5c6cccc(F)c6)[C@@H](C(=O)O)C(C)(C)C)CC4 CC[C@@H](C)[C@@H](N1C[C@H](CN2CCC(CC2)c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)[C@H](C1)c6cccc(F)c6)C(=O)O 290813 290814 [*:1]C([*:2])C([*:3])(C)C>>[*:3]C[C@@H](C)C([*:1])[*:2] [*:3]C.[*:1]C(=O)O.[*:2]N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1\n", "1508 CCn1nc(Cc2ccc(OC3CCC3)cc2)cc1C4CCN(C[C@H]5CN(C[C@@H]5c6cccc(F)c6)[C@@H](C(=O)O)C(C)(C)C)CC4 CC[C@H](C)[C@@H](N1C[C@H](CN2CCC(CC2)c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)[C@H](C1)c6cccc(F)c6)C(=O)O 290813 290815 [*:1]C([*:2])C([*:3])(C)C>>[*:1]C([*:2])[C@H]([*:3])CC [*:3]C.[*:1]C(=O)O.[*:2]N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1\n", "1510 CCn1nc(Cc2ccc(OC3CCC3)cc2)cc1C4CCN(C[C@H]5CN(C[C@@H]5c6cccc(F)c6)[C@@H](C(=O)O)C(C)(C)C)CC4 CC[C@H](C)[C@@H](N1C[C@H](CN2CCC(CC2)c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)[C@H](C1)c6cccc(F)c6)C(=O)O 290813 290815 [*:1]C([*:2])C([*:3])(C)C>>[*:3]C[C@H](C)C([*:1])[*:2] [*:3]C.[*:1]C(=O)O.[*:2]N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1" ] } ], "prompt_number": 7 }, { "cell_type": "markdown", "metadata": {}, "source": [ "There are dupes... drop them:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "mmps=mmps.drop_duplicates(subset=(\"molregno1\",\"molregno2\"))" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 8 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Add a couple molecule columns and remove the SMILES columns:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "PandasTools.AddMoleculeColumnToFrame(mmps,'smiles1','mol1')\n", "PandasTools.AddMoleculeColumnToFrame(mmps,'smiles2','mol2')\n", "mmps = mmps[['mol1','mol2','molregno1','molregno2','tform','core']]\n", "mmps.head()" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
mol1 mol2 molregno1 molregno2 tform core
0 \"Mol\"/ \"Mol\"/ 290813 290814 [*:1]C(C)(C)C>>[*:1][C@H](C)CC [*:1][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1
1 \"Mol\"/ \"Mol\"/ 290814 290813 [*:1][C@H](C)CC>>[*:1]C(C)(C)C [*:1][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1
2 \"Mol\"/ \"Mol\"/ 290813 290815 [*:1]C(C)(C)C>>[*:1][C@@H](C)CC [*:1][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1
3 \"Mol\"/ \"Mol\"/ 290815 290813 [*:1][C@@H](C)CC>>[*:1]C(C)(C)C [*:1][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1
4 \"Mol\"/ \"Mol\"/ 290814 290815 [*:1][C@H](C)CC>>[*:1][C@@H](C)CC [*:1][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1
\n", "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 9, "text": [ " mol1 mol2 molregno1 molregno2 tform core\n", "0 \"Mol\"/ \"Mol\"/ 290813 290814 [*:1]C(C)(C)C>>[*:1][C@H](C)CC [*:1][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1\n", "1 \"Mol\"/ \"Mol\"/ 290814 290813 [*:1][C@H](C)CC>>[*:1]C(C)(C)C [*:1][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1\n", "2 \"Mol\"/ \"Mol\"/ 290813 290815 [*:1]C(C)(C)C>>[*:1][C@@H](C)CC [*:1][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1\n", "3 \"Mol\"/ \"Mol\"/ 290815 290813 [*:1][C@@H](C)CC>>[*:1]C(C)(C)C [*:1][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1\n", "4 \"Mol\"/ \"Mol\"/ 290814 290815 [*:1][C@H](C)CC>>[*:1][C@@H](C)CC [*:1][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1" ] } ], "prompt_number": 9 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now join back on the original data so that we have activities in the table again. Pandas makes this easy." ] }, { "cell_type": "code", "collapsed": false, "input": [ "t1=df[['molregno','standard_value']]\n", "mmpdds = mmps.merge(t1,left_on='molregno1',right_on='molregno',suffixes=(\"_1\",\"_2\")).\\\n", " merge(t1,left_on='molregno2',right_on='molregno',suffixes=(\"_1\",\"_2\"))\n", "mmpdds.head()" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
mol1 mol2 molregno1 molregno2 tform core molregno_1 standard_value_1 molregno_2 standard_value_2
0 \"Mol\"/ \"Mol\"/ 290813 290814 [*:1]C(C)(C)C>>[*:1][C@H](C)CC [*:1][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1 290813 3500 290814 3400
1 \"Mol\"/ \"Mol\"/ 290815 290814 [*:1][C@@H](C)CC>>[*:1][C@H](C)CC [*:1][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1 290815 5700 290814 3400
2 \"Mol\"/ \"Mol\"/ 290879 290814 [*:1]C(C)(C)C>>[*:1]C1CCC1 [*:1]Oc1ccc(Cc2cc(C3CCN(C[C@H]4CN([C@@H](C(=O)O)[C@H](C)CC)C[C@@H]4c4cccc(F)c4)CC3)n(CC)n2)cc1 290879 5600 290814 3400
3 \"Mol\"/ \"Mol\"/ 290813 290815 [*:1]C(C)(C)C>>[*:1][C@@H](C)CC [*:1][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1 290813 3500 290815 5700
4 \"Mol\"/ \"Mol\"/ 290814 290815 [*:1][C@H](C)CC>>[*:1][C@@H](C)CC [*:1][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1 290814 3400 290815 5700
\n", "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 10, "text": [ " mol1 mol2 molregno1 molregno2 tform core molregno_1 standard_value_1 molregno_2 standard_value_2\n", "0 \"Mol\"/ \"Mol\"/ 290813 290814 [*:1]C(C)(C)C>>[*:1][C@H](C)CC [*:1][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1 290813 3500 290814 3400\n", "1 \"Mol\"/ \"Mol\"/ 290815 290814 [*:1][C@@H](C)CC>>[*:1][C@H](C)CC [*:1][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1 290815 5700 290814 3400\n", "2 \"Mol\"/ \"Mol\"/ 290879 290814 [*:1]C(C)(C)C>>[*:1]C1CCC1 [*:1]Oc1ccc(Cc2cc(C3CCN(C[C@H]4CN([C@@H](C(=O)O)[C@H](C)CC)C[C@@H]4c4cccc(F)c4)CC3)n(CC)n2)cc1 290879 5600 290814 3400\n", "3 \"Mol\"/ \"Mol\"/ 290813 290815 [*:1]C(C)(C)C>>[*:1][C@@H](C)CC [*:1][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1 290813 3500 290815 5700\n", "4 \"Mol\"/ \"Mol\"/ 290814 290815 [*:1][C@H](C)CC>>[*:1][C@@H](C)CC [*:1][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1 290814 3400 290815 5700" ] } ], "prompt_number": 10 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Calculate pKi values and the difference between them:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import math\n", "mmpdds['pKi_1']=mmpdds.apply(lambda row:-1*math.log10(float(row['standard_value_1'])*1e-9),axis=1)\n", "mmpdds['pKi_2']=mmpdds.apply(lambda row:-1*math.log10(float(row['standard_value_2'])*1e-9),axis=1)\n", "mmpdds['delta']=mmpdds['pKi_2']-mmpdds['pKi_1']" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 11 }, { "cell_type": "markdown", "metadata": {}, "source": [ "And, remove some extra columns:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "mmpdds=mmpdds[['mol1','mol2','molregno1','molregno2','pKi_1','pKi_2','delta','tform','core']]\n", "mmpdds.head()\n" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
mol1 mol2 molregno1 molregno2 pKi_1 pKi_2 delta tform core
0 \"Mol\"/ \"Mol\"/ 290813 290814 5.455932 5.468521 0.012589 [*:1]C(C)(C)C>>[*:1][C@H](C)CC [*:1][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1
1 \"Mol\"/ \"Mol\"/ 290815 290814 5.244125 5.468521 0.224396 [*:1][C@@H](C)CC>>[*:1][C@H](C)CC [*:1][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1
2 \"Mol\"/ \"Mol\"/ 290879 290814 5.251812 5.468521 0.216709 [*:1]C(C)(C)C>>[*:1]C1CCC1 [*:1]Oc1ccc(Cc2cc(C3CCN(C[C@H]4CN([C@@H](C(=O)O)[C@H](C)CC)C[C@@H]4c4cccc(F)c4)CC3)n(CC)n2)cc1
3 \"Mol\"/ \"Mol\"/ 290813 290815 5.455932 5.244125 -0.211807 [*:1]C(C)(C)C>>[*:1][C@@H](C)CC [*:1][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1
4 \"Mol\"/ \"Mol\"/ 290814 290815 5.468521 5.244125 -0.224396 [*:1][C@H](C)CC>>[*:1][C@@H](C)CC [*:1][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1
\n", "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 12, "text": [ " mol1 mol2 molregno1 molregno2 pKi_1 pKi_2 delta tform core\n", "0 \"Mol\"/ \"Mol\"/ 290813 290814 5.455932 5.468521 0.012589 [*:1]C(C)(C)C>>[*:1][C@H](C)CC [*:1][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1\n", "1 \"Mol\"/ \"Mol\"/ 290815 290814 5.244125 5.468521 0.224396 [*:1][C@@H](C)CC>>[*:1][C@H](C)CC [*:1][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1\n", "2 \"Mol\"/ \"Mol\"/ 290879 290814 5.251812 5.468521 0.216709 [*:1]C(C)(C)C>>[*:1]C1CCC1 [*:1]Oc1ccc(Cc2cc(C3CCN(C[C@H]4CN([C@@H](C(=O)O)[C@H](C)CC)C[C@@H]4c4cccc(F)c4)CC3)n(CC)n2)cc1\n", "3 \"Mol\"/ \"Mol\"/ 290813 290815 5.455932 5.244125 -0.211807 [*:1]C(C)(C)C>>[*:1][C@@H](C)CC [*:1][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1\n", "4 \"Mol\"/ \"Mol\"/ 290814 290815 5.468521 5.244125 -0.224396 [*:1][C@H](C)CC>>[*:1][C@@H](C)CC [*:1][C@H](C(=O)O)N1C[C@H](CN2CCC(c3cc(Cc4ccc(OC5CCC5)cc4)nn3CC)CC2)[C@@H](c2cccc(F)c2)C1" ] } ], "prompt_number": 12 }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Analysis\n", "\n", "Let's start by grouping related transforms together and seeing how often they occur:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "gs=mmpdds.groupby('tform')" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 13 }, { "cell_type": "code", "collapsed": false, "input": [ "vs = [(len(y),x) for x,y in gs]\n", "vs.sort(reverse=True)\n", "vs[:5]" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 14, "text": [ "[(34, '[*:1]F>>[*:1]Cl'),\n", " (34, '[*:1]Cl>>[*:1]F'),\n", " (15, '[*:1]C[*:2]>>[*:1]CC[*:2]'),\n", " (15, '[*:1]CC[*:2]>>[*:1]C[*:2]'),\n", " (14, '[*:1]F>>[*:1]C#N')]" ] } ], "prompt_number": 14 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Look at the summary stats for one of the 
frequent transformations:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "gs['delta'].describe()['[*:1]F>>[*:1]Cl']" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 15, "text": [ "count 34.000000\n", "mean 0.419827\n", "std 0.383555\n", "min -0.421005\n", "25% 0.154902\n", "50% 0.455932\n", "75% 0.675167\n", "max 1.149398\n", "dtype: float64" ] } ], "prompt_number": 15 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Gather those summary stats for the tforms that occur at least 5 times and convert them into another data frame " ] }, { "cell_type": "code", "collapsed": false, "input": [ "rows=[]\n", "for c,k in vs:\n", " if c>=5:\n", " descr=gs['delta'].describe()[k]\n", " rows.append((k,descr['count'],descr['mean'],descr['std']))\n" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 102 }, { "cell_type": "code", "collapsed": false, "input": [ "ndf = pd.DataFrame(rows,columns=('tform','count_val','mean_val','std_val'))" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 103 }, { "cell_type": "code", "collapsed": false, "input": [ "ndf.head()" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
tform count_val mean_val std_val
0 [*:1]F>>[*:1]Cl 34 0.419827 0.383555
1 [*:1]Cl>>[*:1]F 34 -0.419827 0.383555
2 [*:1]C[*:2]>>[*:1]CC[*:2] 15 -0.115020 0.222562
3 [*:1]CC[*:2]>>[*:1]C[*:2] 15 0.115020 0.222562
4 [*:1]F>>[*:1]C#N 14 0.155265 0.089983
\n", "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 104, "text": [ " tform count_val mean_val std_val\n", "0 [*:1]F>>[*:1]Cl 34 0.419827 0.383555\n", "1 [*:1]Cl>>[*:1]F 34 -0.419827 0.383555\n", "2 [*:1]C[*:2]>>[*:1]CC[*:2] 15 -0.115020 0.222562\n", "3 [*:1]CC[*:2]>>[*:1]C[*:2] 15 0.115020 0.222562\n", "4 [*:1]F>>[*:1]C#N 14 0.155265 0.089983" ] } ], "prompt_number": 104 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Add the two bits of the transformation as molecules so that we can visualize them" ] }, { "cell_type": "code", "collapsed": false, "input": [ "ndf['react']=ndf.apply(lambda row:row['tform'].split('>>')[0],axis=1)\n", "ndf['prod']=ndf.apply(lambda row:row['tform'].split('>>')[1],axis=1)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 105 }, { "cell_type": "code", "collapsed": false, "input": [ "PandasTools.AddMoleculeColumnToFrame(ndf,'react','reactmol')\n", "PandasTools.AddMoleculeColumnToFrame(ndf,'prod','prodmol')\n", "ndf.head()" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
tformcount_valmean_valstd_valreactprodreactmolprodmol
0 [*:1]F>>[*:1]Cl 34 0.419827 0.383555 [*:1]F [*:1]Cl \"Mol\"/ \"Mol\"/
1 [*:1]Cl>>[*:1]F 34-0.419827 0.383555 [*:1]Cl [*:1]F \"Mol\"/ \"Mol\"/
2 [*:1]C[*:2]>>[*:1]CC[*:2] 15-0.115020 0.222562 [*:1]C[*:2] [*:1]CC[*:2] \"Mol\"/ \"Mol\"/
3 [*:1]CC[*:2]>>[*:1]C[*:2] 15 0.115020 0.222562 [*:1]CC[*:2] [*:1]C[*:2] \"Mol\"/ \"Mol\"/
4 [*:1]F>>[*:1]C#N 14 0.155265 0.089983 [*:1]F [*:1]C#N \"Mol\"/ \"Mol\"/
\n", "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 106, "text": [ " tform count_val mean_val std_val react prod reactmol prodmol\n", "0 [*:1]F>>[*:1]Cl 34 0.419827 0.383555 [*:1]F [*:1]Cl \"Mol\"/ \"Mol\"/\n", "1 [*:1]Cl>>[*:1]F 34 -0.419827 0.383555 [*:1]Cl [*:1]F \"Mol\"/ \"Mol\"/\n", "2 [*:1]C[*:2]>>[*:1]CC[*:2] 15 -0.115020 0.222562 [*:1]C[*:2] [*:1]CC[*:2] \"Mol\"/ \"Mol\"/\n", "3 [*:1]CC[*:2]>>[*:1]C[*:2] 15 0.115020 0.222562 [*:1]CC[*:2] [*:1]C[*:2] \"Mol\"/ \"Mol\"/\n", "4 [*:1]F>>[*:1]C#N 14 0.155265 0.089983 [*:1]F [*:1]C#N \"Mol\"/ \"Mol\"/" ] } ], "prompt_number": 106 }, { "cell_type": "markdown", "metadata": {}, "source": [ "And now let's see all the transforms that, on average, reduce hERG binding by at least 0.3 log units." ] }, { "cell_type": "code", "collapsed": false, "input": [ "ndf[ndf.mean_val<-.3]" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
tformcount_valmean_valstd_valreactprodreactmolprodmol
1 [*:1]Cl>>[*:1]F 34-0.419827 0.383555 [*:1]Cl [*:1]F \"Mol\"/ \"Mol\"/
13 [*:1]Cl>>[*:1]OC 9-0.342221 0.694424 [*:1]Cl [*:1]OC \"Mol\"/ \"Mol\"/
16 [*:1]Cl>>[*:1]C#N 7-0.437406 0.431559 [*:1]Cl [*:1]C#N \"Mol\"/ \"Mol\"/
21 [*:1]C1CC1>>[*:1]C 6-0.655480 0.580657 [*:1]C1CC1 [*:1]C \"Mol\"/ \"Mol\"/
25 [*:1]F>>[*:1]OC 5-0.462994 0.192751 [*:1]F [*:1]OC \"Mol\"/ \"Mol\"/
\n", "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 107, "text": [ " tform count_val mean_val std_val react prod reactmol prodmol\n", "1 [*:1]Cl>>[*:1]F 34 -0.419827 0.383555 [*:1]Cl [*:1]F \"Mol\"/ \"Mol\"/\n", "13 [*:1]Cl>>[*:1]OC 9 -0.342221 0.694424 [*:1]Cl [*:1]OC \"Mol\"/ \"Mol\"/\n", "16 [*:1]Cl>>[*:1]C#N 7 -0.437406 0.431559 [*:1]Cl [*:1]C#N \"Mol\"/ \"Mol\"/\n", "21 [*:1]C1CC1>>[*:1]C 6 -0.655480 0.580657 [*:1]C1CC1 [*:1]C \"Mol\"/ \"Mol\"/\n", "25 [*:1]F>>[*:1]OC 5 -0.462994 0.192751 [*:1]F [*:1]OC \"Mol\"/ \"Mol\"/" ] } ], "prompt_number": 107 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Not much signal in there when you take the standard deviation into account, but to continue showing what's possible with Pandas, we can at least look at the pairs for the last one:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "mmpdds[mmpdds['tform']=='[*:1]F>>[*:1]OC'][['mol1','mol2','pKi_1','pKi_2','delta']]" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
mol1mol2pKi_1pKi_2delta
85 \"Mol\"/ \"Mol\"/ 6.370590 5.835350-0.535241
86 \"Mol\"/ \"Mol\"/ 7.744727 6.991400-0.753328
303 \"Mol\"/ \"Mol\"/ 5.862329 5.576918-0.285411
349 \"Mol\"/ \"Mol\"/ 7.244125 6.801343-0.442782
487 \"Mol\"/ \"Mol\"/ 6.033389 5.735182-0.298207
\n", "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 19, "text": [ " mol1 mol2 pKi_1 pKi_2 delta\n", "85 \"Mol\"/ \"Mol\"/ 6.370590 5.835350 -0.535241\n", "86 \"Mol\"/ \"Mol\"/ 7.744727 6.991400 -0.753328\n", "303 \"Mol\"/ \"Mol\"/ 5.862329 5.576918 -0.285411\n", "349 \"Mol\"/ \"Mol\"/ 7.244125 6.801343 -0.442782\n", "487 \"Mol\"/ \"Mol\"/ 6.033389 5.735182 -0.298207" ] } ], "prompt_number": 19 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Those pairs all apply the same structural modification to the same core." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Look at bigger transformations\n", "\n", "Try increasing the cutoff when running the pair-generation algorithm to see if we get more/larger tforms." ] }, { "cell_type": "code", "collapsed": false, "input": [ "!python $RDBASE/Contrib/mmpa/indexing.py -s -r 0.25 < ../data/herg_fragmented.txt > ../data/mmps_larger.txt" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 11 }, { "cell_type": "code", "collapsed": false, "input": [ "mmps = pd.read_csv('../data/mmps_larger.txt',header=None,names=('smiles1','smiles2','molregno1','molregno2','tform','core'))\n", "mmps=mmps.drop_duplicates(subset=(\"molregno1\",\"molregno2\"))\n", "PandasTools.AddMoleculeColumnToFrame(mmps,'smiles1','mol1')\n", "PandasTools.AddMoleculeColumnToFrame(mmps,'smiles2','mol2')\n", "mmps = mmps[['mol1','mol2','molregno1','molregno2','tform','core']]\n", "t1=df[['molregno','standard_value']]\n", "mmpdds = mmps.merge(t1,left_on='molregno1',right_on='molregno',suffixes=(\"_1\",\"_2\")).\\\n", " merge(t1,left_on='molregno2',right_on='molregno',suffixes=(\"_1\",\"_2\"))\n", " \n", "import math\n", "mmpdds['pKi_1']=mmpdds.apply(lambda row:-1*math.log10(float(row['standard_value_1'])*1e-9),axis=1)\n", "mmpdds['pKi_2']=mmpdds.apply(lambda row:-1*math.log10(float(row['standard_value_2'])*1e-9),axis=1)\n", "mmpdds['delta']=mmpdds['pKi_2']-mmpdds['pKi_1']\n", 
"mmpdds=mmpdds[['mol1','mol2','molregno1','molregno2','pKi_1','pKi_2','delta','tform','core']]\n", "mmpdds.head()\n" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
mol1mol2molregno1molregno2pKi_1pKi_2deltatformcore
0 \"Mol\"/ \"Mol\"/ 1333317 1333318 5.823909 5.958607 0.134699 [*:1]Cc1ccccc1[*:2]>>[*:1]Cc1cccc([*:2])c1 [*:2]F.[*:1]n1ccc2c1ncnc2OC1CCN(Cc2cscn2)CC1
1 \"Mol\"/ \"Mol\"/ 1333327 1333318 6.221849 5.958607-0.263241 [*:1]C(F)(F)F>>[*:1]F [*:1]c1cccc(Cn2ccc3c2ncnc3OC2CCN(Cc3cscn3)CC2)c1
2 \"Mol\"/ \"Mol\"/ 1333330 1333318 6.022276 5.958607-0.063669 [*:1]c1cccc(Cl)c1[*:2]>>[*:1]c1cccc([*:2])c1 [*:2]F.[*:1]Cn1ccc2c1ncnc2OC1CCN(Cc2cscn2)CC1
3 \"Mol\"/ \"Mol\"/ 1333321 1333318 5.920819 5.958607 0.037789 [*:1]c1cccc(Cl)c1>>[*:1]c1cccc(F)c1 [*:1]Cn1ccc2c1ncnc2OC1CCN(Cc2cscn2)CC1
4 \"Mol\"/ \"Mol\"/ 1333329 1333318 5.638272 5.958607 0.320335 [*:1]c1cccc(F)c1[*:2]>>[*:1]c1cccc([*:2])c1 [*:2]F.[*:1]Cn1ccc2c1ncnc2OC1CCN(Cc2cscn2)CC1
\n", "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 67, "text": [ " mol1 mol2 molregno1 molregno2 pKi_1 pKi_2 delta tform core\n", "0 \"Mol\"/ \"Mol\"/ 1333317 1333318 5.823909 5.958607 0.134699 [*:1]Cc1ccccc1[*:2]>>[*:1]Cc1cccc([*:2])c1 [*:2]F.[*:1]n1ccc2c1ncnc2OC1CCN(Cc2cscn2)CC1\n", "1 \"Mol\"/ \"Mol\"/ 1333327 1333318 6.221849 5.958607 -0.263241 [*:1]C(F)(F)F>>[*:1]F [*:1]c1cccc(Cn2ccc3c2ncnc3OC2CCN(Cc3cscn3)CC2)c1\n", "2 \"Mol\"/ \"Mol\"/ 1333330 1333318 6.022276 5.958607 -0.063669 [*:1]c1cccc(Cl)c1[*:2]>>[*:1]c1cccc([*:2])c1 [*:2]F.[*:1]Cn1ccc2c1ncnc2OC1CCN(Cc2cscn2)CC1\n", "3 \"Mol\"/ \"Mol\"/ 1333321 1333318 5.920819 5.958607 0.037789 [*:1]c1cccc(Cl)c1>>[*:1]c1cccc(F)c1 [*:1]Cn1ccc2c1ncnc2OC1CCN(Cc2cscn2)CC1\n", "4 \"Mol\"/ \"Mol\"/ 1333329 1333318 5.638272 5.958607 0.320335 [*:1]c1cccc(F)c1[*:2]>>[*:1]c1cccc([*:2])c1 [*:2]F.[*:1]Cn1ccc2c1ncnc2OC1CCN(Cc2cscn2)CC1" ] } ], "prompt_number": 67 }, { "cell_type": "code", "collapsed": false, "input": [ "gs=mmpdds.groupby('tform')\n", "vs = [(len(y),x) for x,y in gs]\n", "vs.sort(reverse=True)\n", "vs[:5]" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 68, "text": [ "[(30, '[*:1]F>>[*:1]Cl'),\n", " (30, '[*:1]Cl>>[*:1]F'),\n", " (14, '[*:1]F>>[*:1]C#N'),\n", " (14, '[*:1]C#N>>[*:1]F'),\n", " (13, '[*:1]c1cccc([*:2])c1>>[*:1]c1ccc([*:2])cc1')]" ] } ], "prompt_number": 68 }, { "cell_type": "code", "collapsed": false, "input": [ "rows=[]\n", "for c,k in vs:\n", "    if c>=5:\n", "        descr=gs['delta'].describe()[k]\n", "        rows.append((k,descr['count'],descr['mean'],descr['std']))\n", "ndf = pd.DataFrame(rows,columns=('tform','count_val','mean_val','std_val'))\n", "ndf['react']=ndf.apply(lambda row:row['tform'].split('>>')[0],axis=1)\n", "ndf['prod']=ndf.apply(lambda row:row['tform'].split('>>')[1],axis=1)\n", "PandasTools.AddMoleculeColumnToFrame(ndf,'react','reactmol')\n", "PandasTools.AddMoleculeColumnToFrame(ndf,'prod','prodmol')\n" ], 
"language": "python", "metadata": {}, "outputs": [], "prompt_number": 69 }, { "cell_type": "code", "collapsed": false, "input": [ "ndf[ndf.mean_val<-.3].sort(columns='mean_val')\n" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
tformcount_valmean_valstd_valreactprodreactmolprodmol
17 [*:1]Cc1ccccc1>>[*:1]C 8-0.847827 0.827364 [*:1]Cc1ccccc1 [*:1]C \"Mol\"/ \"Mol\"/
31 [*:1]c1ccc(Cl)cc1>>[*:1]c1ccccc1F 6-0.684998 0.122674 [*:1]c1ccc(Cl)cc1 [*:1]c1ccccc1F \"Mol\"/ \"Mol\"/
49 [*:1]C1CC1>>[*:1]C 5-0.647260 0.443866 [*:1]C1CC1 [*:1]C \"Mol\"/ \"Mol\"/
11 [*:1]C#N>>[*:1]C(N)=O 10-0.594131 0.339552 [*:1]C#N [*:1]C(N)=O \"Mol\"/ \"Mol\"/
1 [*:1]Cl>>[*:1]F 30-0.469615 0.368912 [*:1]Cl [*:1]F \"Mol\"/ \"Mol\"/
24 [*:1]Cl>>[*:1]C#N 7-0.437406 0.431559 [*:1]Cl [*:1]C#N \"Mol\"/ \"Mol\"/
15 [*:1]C(F)(F)F>>[*:1]C 9-0.376203 0.524325 [*:1]C(F)(F)F [*:1]C \"Mol\"/ \"Mol\"/
34 [*:1]F>>[*:1]OC 6-0.363874 0.297777 [*:1]F [*:1]OC \"Mol\"/ \"Mol\"/
36 [*:1]Cl>>[*:1]OC 6-0.345202 0.855605 [*:1]Cl [*:1]OC \"Mol\"/ \"Mol\"/
44 [*:1]c1cccc(C)c1>>[*:1]c1ccccc1 5-0.310806 0.185063 [*:1]c1cccc(C)c1 [*:1]c1ccccc1 \"Mol\"/ \"Mol\"/
5 [*:1]c1ccc([*:2])cc1>>[*:1]c1cccc([*:2])c1 13-0.301016 0.518341 [*:1]c1ccc([*:2])cc1 [*:1]c1cccc([*:2])c1 \"Mol\"/ \"Mol\"/
\n", "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 70, "text": [ " tform count_val mean_val std_val react prod reactmol prodmol\n", "17 [*:1]Cc1ccccc1>>[*:1]C 8 -0.847827 0.827364 [*:1]Cc1ccccc1 [*:1]C \"Mol\"/ \"Mol\"/\n", "31 [*:1]c1ccc(Cl)cc1>>[*:1]c1ccccc1F 6 -0.684998 0.122674 [*:1]c1ccc(Cl)cc1 [*:1]c1ccccc1F \"Mol\"/ \"Mol\"/\n", "49 [*:1]C1CC1>>[*:1]C 5 -0.647260 0.443866 [*:1]C1CC1 [*:1]C \"Mol\"/ \"Mol\"/\n", "11 [*:1]C#N>>[*:1]C(N)=O 10 -0.594131 0.339552 [*:1]C#N [*:1]C(N)=O \"Mol\"/ \"Mol\"/\n", "1 [*:1]Cl>>[*:1]F 30 -0.469615 0.368912 [*:1]Cl [*:1]F \"Mol\"/ \"Mol\"/\n", "24 [*:1]Cl>>[*:1]C#N 7 -0.437406 0.431559 [*:1]Cl [*:1]C#N \"Mol\"/ \"Mol\"/\n", "15 [*:1]C(F)(F)F>>[*:1]C 9 -0.376203 0.524325 [*:1]C(F)(F)F [*:1]C \"Mol\"/ \"Mol\"/\n", "34 [*:1]F>>[*:1]OC 6 -0.363874 0.297777 [*:1]F [*:1]OC \"Mol\"/ \"Mol\"/\n", "36 [*:1]Cl>>[*:1]OC 6 -0.345202 0.855605 [*:1]Cl [*:1]OC \"Mol\"/ \"Mol\"/\n", "44 [*:1]c1cccc(C)c1>>[*:1]c1ccccc1 5 -0.310806 0.185063 [*:1]c1cccc(C)c1 [*:1]c1ccccc1 \"Mol\"/ \"Mol\"/\n", "5 [*:1]c1ccc([*:2])cc1>>[*:1]c1cccc([*:2])c1 13 -0.301016 0.518341 [*:1]c1ccc([*:2])cc1 [*:1]c1cccc([*:2])c1 \"Mol\"/ \"Mol\"/" ] } ], "prompt_number": 70 }, { "cell_type": "code", "collapsed": false, "input": [ "tform='[*:1]c1ccc(Cl)cc1>>[*:1]c1ccccc1F'\n", "mmpdds[mmpdds['tform']==tform][['molregno1','molregno2','mol1','mol2','pKi_1','pKi_2','delta']]" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
molregno1molregno2mol1mol2pKi_1pKi_2delta
2184 408189 408192 \"Mol\"/ \"Mol\"/ 6.688246 6.149354-0.538892
2711 408196 408198 \"Mol\"/ \"Mol\"/ 6.686133 5.886057-0.800076
2712 408196 408198 \"Mol\"/ \"Mol\"/ 6.659556 5.886057-0.773499
2713 408196 408198 \"Mol\"/ \"Mol\"/ 6.669586 5.886057-0.783530
2714 408196 408198 \"Mol\"/ \"Mol\"/ 6.419075 5.886057-0.533018
2715 408196 408198 \"Mol\"/ \"Mol\"/ 6.567031 5.886057-0.680974
\n", "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 95, "text": [ " molregno1 molregno2 mol1 mol2 pKi_1 pKi_2 delta\n", "2184 408189 408192 \"Mol\"/ \"Mol\"/ 6.688246 6.149354 -0.538892\n", "2711 408196 408198 \"Mol\"/ \"Mol\"/ 6.686133 5.886057 -0.800076\n", "2712 408196 408198 \"Mol\"/ \"Mol\"/ 6.659556 5.886057 -0.773499\n", "2713 408196 408198 \"Mol\"/ \"Mol\"/ 6.669586 5.886057 -0.783530\n", "2714 408196 408198 \"Mol\"/ \"Mol\"/ 6.419075 5.886057 -0.533018\n", "2715 408196 408198 \"Mol\"/ \"Mol\"/ 6.567031 5.886057 -0.680974" ] } ], "prompt_number": 95 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Again, a nice set of modifications to a consistent core.\n", "\n", "Note that there are really only two distinct pairs here; the extra rows arise from repeated measurements in the paper (we'll see those below).\n", "\n", "# An aside\n", "\n", "This is a brief exploration to look at additional data that's available from ChEMBL. This isn't the ideal example for it, but hopefully it will still be a useful start.\n", "\n", "We would assume that the last set of examples all came from the same paper, but we can confirm that.\n", "\n", "Start by getting the unique ChEMBL compound numbers:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "regnos = list(mmpdds[mmpdds['tform']==tform]['molregno1'])\n", "regnos += list(mmpdds[mmpdds['tform']==tform]['molregno2'])\n", "regnos=tuple(set(regnos))" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 101 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now get the documents that have Ki values for those compounds:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "%sql select distinct(activities.doc_id) from activities join assays using (assay_id) \\\n", " where tid=165 and standard_type='Ki' and molregno in :regnos;" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "1 rows affected.\n" ] }, { "html": [ "\n", " 
\n", " \n", " \n", " \n", " \n", " \n", "
doc_id
37427
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 97, "text": [ "[(37427,)]" ] } ], "prompt_number": 97 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Query our local ChEMBL instance to get info about the document:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "docid = _[0]['doc_id']\n", "%sql select * from docs where doc_id=:docid;" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "1 rows affected.\n" ] }, { "html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
doc_idjournalyearvolumeissuefirst_pagelast_pagepubmed_iddoichembl_idtitledoc_typeauthorsabstract
37427Bioorg. Med. Chem. Lett.20071761675167817257843NoneCHEMBL1139118NonePUBLICATIONNoneNone
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 53, "text": [ "[(37427, u'Bioorg. Med. Chem. Lett.', 2007, u'17', u'6', u'1675', u'1678', 17257843L, None, u'CHEMBL1139118', None, u'PUBLICATION', None, None)]" ] } ], "prompt_number": 53 }, { "cell_type": "markdown", "metadata": {}, "source": [ "ChEMBL doesn't have the article title, but we can get that easily enough from PubMed:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "pmid = _[0]['pubmed_id']\n", "txt=requests.get('http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi?db=pubmed&id=%d'%pmid).text\n", "et = ElementTree.fromstring(txt.encode('utf-8'))\n", "et.findall(\".//*[@Name='Title']\")[0].text" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 54, "text": [ "'A novel, non-substrate-based series of glycine type 1 transporter inhibitors derived from high-throughput screening.'" ] } ], "prompt_number": 54 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Pull the other assays from the paper:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "%sql select * from assays where assay_id in (select distinct(assay_id) from activities where doc_id = :docid);" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "6 rows affected.\n" ] }, { "html": [ "\n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
assay_iddoc_iddescriptionassay_typeassay_test_typeassay_categoryassay_organismassay_tax_idassay_strainassay_tissueassay_cell_typeassay_subcellular_fractiontidrelationship_typeconfidence_scorecurated_bysrc_idsrc_assay_idchembl_idcell_idbao_format
45422937427Metabolic stability in human liver microsomes assessed as half lifeAIn vitroNoneHomo sapiens9606NoneLiverNoneMicrosomes102164S2Autocuration1NoneCHEMBL903412NoneBAO_0000251
45422637427Displacement of [3H]5-hydroxytrytamine from human 5HT1B receptor expressed in HEK293 cellsBNoneNoneHomo sapiens9606NoneNoneHEK293None106D9Intermediate1NoneCHEMBL903407722BAO_0000219
45422737427Displacement of [3H]dofetilide from human ERG channel expressed in HEK293 cellsBNoneNoneHomo sapiens9606NoneNoneHEK293None165D9Intermediate1NoneCHEMBL903410722BAO_0000219
45422837427Inhibition of human recombinant CYP2D6 at 1.5 uMANoneNoneHomo sapiens9606NoneNoneNoneNone11365D9Intermediate1NoneCHEMBL903411NoneBAO_0000357
45422437427Displacement of [3H]NPTS from human GlyT1C expressed in HEK293 cellsBNoneNoneHomo sapiens9606NoneNoneHEK293None11682D9Intermediate1NoneCHEMBL903408722BAO_0000219
45422537427Inhibition of [3H]glycine uptake at human GlyT2 expressed in HEK293 cellsBNoneNoneHomo sapiens9606NoneNoneHEK293None11596D9Intermediate1NoneCHEMBL903409722BAO_0000219
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 98, "text": [ "[(454229, 37427, u'Metabolic stability in human liver microsomes assessed as half life', u'A', u'In vitro', None, u'Homo sapiens', 9606L, None, u'Liver', None, u'Microsomes', 102164, u'S', 2, u'Autocuration', 1, None, u'CHEMBL903412', None, u'BAO_0000251'),\n", " (454226, 37427, u'Displacement of [3H]5-hydroxytrytamine from human 5HT1B receptor expressed in HEK293 cells', u'B', None, None, u'Homo sapiens', 9606L, None, None, u'HEK293', None, 106, u'D', 9, u'Intermediate', 1, None, u'CHEMBL903407', 722, u'BAO_0000219'),\n", " (454227, 37427, u'Displacement of [3H]dofetilide from human ERG channel expressed in HEK293 cells', u'B', None, None, u'Homo sapiens', 9606L, None, None, u'HEK293', None, 165, u'D', 9, u'Intermediate', 1, None, u'CHEMBL903410', 722, u'BAO_0000219'),\n", " (454228, 37427, u'Inhibition of human recombinant CYP2D6 at 1.5 uM', u'A', None, None, u'Homo sapiens', 9606L, None, None, None, None, 11365, u'D', 9, u'Intermediate', 1, None, u'CHEMBL903411', None, u'BAO_0000357'),\n", " (454224, 37427, u'Displacement of [3H]NPTS from human GlyT1C expressed in HEK293 cells', u'B', None, None, u'Homo sapiens', 9606L, None, None, u'HEK293', None, 11682, u'D', 9, u'Intermediate', 1, None, u'CHEMBL903408', 722, u'BAO_0000219'),\n", " (454225, 37427, u'Inhibition of [3H]glycine uptake at human GlyT2 expressed in HEK293 cells', u'B', None, None, u'Homo sapiens', 9606L, None, None, u'HEK293', None, 11596, u'D', 9, u'Intermediate', 1, None, u'CHEMBL903409', 722, u'BAO_0000219')]" ] } ], "prompt_number": 98 }, { "cell_type": "markdown", "metadata": {}, "source": [ "And look at the values for our compounds:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "assayData=%sql select * from activities join assays using (assay_id) \\\n", " where activities.doc_id=:docid \\\n", " and molregno in :regnos \\\n", " and standard_value is not null \\\n", " and assay_id!=454227;\n", "assayData" 
], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "21 rows affected.\n" ] }, { "html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
" \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
assay_idactivity_iddoc_idrecord_idmolregnostandard_relationpublished_valuepublished_unitsstandard_valuestandard_unitsstandard_flagstandard_typeactivity_commentpublished_typedata_validity_commentpotential_duplicatepublished_relationpchembl_valuebao_endpointuo_unitsqudt_unitsdoc_id_1descriptionassay_typeassay_test_typeassay_categoryassay_organismassay_tax_idassay_strainassay_tissueassay_cell_typeassay_subcellular_fractiontidrelationship_typeconfidence_scorecurated_bysrc_idsrc_assay_idchembl_idcell_idbao_format
454224202047937427671221408196=26nM26nM1IC50NoneIC50NoneNone=7.59BAO_0000190UO_0000065http://www.openphacts.org/units/Nanomolar37427Displacement of [3H]NPTS from human GlyT1C expressed in HEK293 cellsBNoneNoneHomo sapiens9606NoneNoneHEK293None11682D9Intermediate1NoneCHEMBL903408722BAO_0000219
454224202048037427671220408196=31nM31nM1IC50NoneIC50NoneNone=7.51BAO_0000190UO_0000065http://www.openphacts.org/units/Nanomolar37427Displacement of [3H]NPTS from human GlyT1C expressed in HEK293 cellsBNoneNoneHomo sapiens9606NoneNoneHEK293None11682D9Intermediate1NoneCHEMBL903408722BAO_0000219
454224202048137427671219408196=24nM24nM1IC50NoneIC50NoneNone=7.62BAO_0000190UO_0000065http://www.openphacts.org/units/Nanomolar37427Displacement of [3H]NPTS from human GlyT1C expressed in HEK293 cellsBNoneNoneHomo sapiens9606NoneNoneHEK293None11682D9Intermediate1NoneCHEMBL903408722BAO_0000219
454224202048237427671218408196=38nM38nM1IC50NoneIC50NoneNone=7.42BAO_0000190UO_0000065http://www.openphacts.org/units/Nanomolar37427Displacement of [3H]NPTS from human GlyT1C expressed in HEK293 cellsBNoneNoneHomo sapiens9606NoneNoneHEK293None11682D9Intermediate1NoneCHEMBL903408722BAO_0000219
454224202048537427671215408198=10.2nM10.2nM1KiNoneKiNoneNone=7.99BAO_0000192UO_0000065http://www.openphacts.org/units/Nanomolar37427Displacement of [3H]NPTS from human GlyT1C expressed in HEK293 cellsBNoneNoneHomo sapiens9606NoneNoneHEK293None11682D9Intermediate1NoneCHEMBL903408722BAO_0000219
454224202048637427671214408196=15.9nM15.9nM1KiNoneKiNoneNone=7.80BAO_0000192UO_0000065http://www.openphacts.org/units/Nanomolar37427Displacement of [3H]NPTS from human GlyT1C expressed in HEK293 cellsBNoneNoneHomo sapiens9606NoneNoneHEK293None11682D9Intermediate1NoneCHEMBL903408722BAO_0000219
454224202048837427671212408192=61.4nM61.4nM1KiNoneKiNoneNone=7.21BAO_0000192UO_0000065http://www.openphacts.org/units/Nanomolar37427Displacement of [3H]NPTS from human GlyT1C expressed in HEK293 cellsBNoneNoneHomo sapiens9606NoneNoneHEK293None11682D9Intermediate1NoneCHEMBL903408722BAO_0000219
454224202049037427671210408189=26.8nM26.8nM1KiNoneKiNoneNone=7.57BAO_0000192UO_0000065http://www.openphacts.org/units/Nanomolar37427Displacement of [3H]NPTS from human GlyT1C expressed in HEK293 cellsBNoneNoneHomo sapiens9606NoneNoneHEK293None11682D9Intermediate1NoneCHEMBL903408722BAO_0000219
454225202049537427671215408198=385nM385nM1IC50NoneIC50NoneNone=6.41BAO_0000190UO_0000065http://www.openphacts.org/units/Nanomolar37427Inhibition of [3H]glycine uptake at human GlyT2 expressed in HEK293 cellsBNoneNoneHomo sapiens9606NoneNoneHEK293None11596D9Intermediate1NoneCHEMBL903409722BAO_0000219
454225202049637427671214408196=319nM319nM1IC50NoneIC50NoneNone=6.50BAO_0000190UO_0000065http://www.openphacts.org/units/Nanomolar37427Inhibition of [3H]glycine uptake at human GlyT2 expressed in HEK293 cellsBNoneNoneHomo sapiens9606NoneNoneHEK293None11596D9Intermediate1NoneCHEMBL903409722BAO_0000219
454225202049837427671212408192=31.5nM31.5nM1IC50NoneIC50NoneNone=7.50BAO_0000190UO_0000065http://www.openphacts.org/units/Nanomolar37427Inhibition of [3H]glycine uptake at human GlyT2 expressed in HEK293 cellsBNoneNoneHomo sapiens9606NoneNoneHEK293None11596D9Intermediate1NoneCHEMBL903409722BAO_0000219
454226202050637427671214408196>1000nM1000nM1KiNoneKiNoneNone>NoneBAO_0000192UO_0000065http://www.openphacts.org/units/Nanomolar37427Displacement of [3H]5-hydroxytrytamine from human 5HT1B receptor expressed in HEK293 cellsBNoneNoneHomo sapiens9606NoneNoneHEK293None106D9Intermediate1NoneCHEMBL903407722BAO_0000219
454226202051037427671210408189=1490nM1490nM1KiNoneKiNoneNone=5.83BAO_0000192UO_0000065http://www.openphacts.org/units/Nanomolar37427Displacement of [3H]5-hydroxytrytamine from human 5HT1B receptor expressed in HEK293 cellsBNoneNoneHomo sapiens9606NoneNoneHEK293None106D9Intermediate1NoneCHEMBL903407722BAO_0000219
454228202052937427671215408198=43%43%1InhibitionNoneInhibitionNoneNone=NoneBAO_0000201UO_0000187http://qudt.org/vocab/unit#Percent37427Inhibition of human recombinant CYP2D6 at 1.5 uMANoneNoneHomo sapiens9606NoneNoneNoneNone11365D9Intermediate1NoneCHEMBL903411NoneBAO_0000357
454228202053037427671214408196=96%96%1InhibitionNoneInhibitionNoneNone=NoneBAO_0000201UO_0000187http://qudt.org/vocab/unit#Percent37427Inhibition of human recombinant CYP2D6 at 1.5 uMANoneNoneHomo sapiens9606NoneNoneNoneNone11365D9Intermediate1NoneCHEMBL903411NoneBAO_0000357
454228202053237427671212408192=85%85%1InhibitionNoneInhibitionNoneNone=NoneBAO_0000201UO_0000187http://qudt.org/vocab/unit#Percent37427Inhibition of human recombinant CYP2D6 at 1.5 uMANoneNoneHomo sapiens9606NoneNoneNoneNone11365D9Intermediate1NoneCHEMBL903411NoneBAO_0000357
454228202053437427671210408189=47%47%1InhibitionNoneInhibitionNoneNone=NoneBAO_0000201UO_0000187http://qudt.org/vocab/unit#Percent37427Inhibition of human recombinant CYP2D6 at 1.5 uMANoneNoneHomo sapiens9606NoneNoneNoneNone11365D9Intermediate1NoneCHEMBL903411NoneBAO_0000357
454229202053937427671215408198=17min0.283hr1T1/2Nonet1/2NoneNone=NoneBAO_0002115UO_0000032http://qudt.org/vocab/unit#Hour37427Metabolic stability in human liver microsomes assessed as half lifeAIn vitroNoneHomo sapiens9606NoneLiverNoneMicrosomes102164S2Autocuration1NoneCHEMBL903412NoneBAO_0000251
454229202054037427671214408196=22min0.367hr1T1/2Nonet1/2NoneNone=NoneBAO_0002115UO_0000032http://qudt.org/vocab/unit#Hour37427Metabolic stability in human liver microsomes assessed as half lifeAIn vitroNoneHomo sapiens9606NoneLiverNoneMicrosomes102164S2Autocuration1NoneCHEMBL903412NoneBAO_0000251
454229202054237427671212408192=14min0.233hr1T1/2Nonet1/2NoneNone=NoneBAO_0002115UO_0000032http://qudt.org/vocab/unit#Hour37427Metabolic stability in human liver microsomes assessed as half lifeAIn vitroNoneHomo sapiens9606NoneLiverNoneMicrosomes102164S2Autocuration1NoneCHEMBL903412NoneBAO_0000251
454229202054437427671210408189=9min0.15hr1T1/2Nonet1/2NoneNone=NoneBAO_0002115UO_0000032http://qudt.org/vocab/unit#Hour37427Metabolic stability in human liver microsomes assessed as half lifeAIn vitroNoneHomo sapiens9606NoneLiverNoneMicrosomes102164S2Autocuration1NoneCHEMBL903412NoneBAO_0000251
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 102, "text": [ "[(454224, 2020479L, 37427, 671221, 408196, u'=', Decimal('26'), u'nM', Decimal('26'), u'nM', 1, u'IC50', None, u'IC50', None, None, u'=', Decimal('7.59'), u'BAO_0000190', u'UO_0000065', u'http://www.openphacts.org/units/Nanomolar', 37427, u'Displacement of [3H]NPTS from human GlyT1C expressed in HEK293 cells', u'B', None, None, u'Homo sapiens', 9606L, None, None, u'HEK293', None, 11682, u'D', 9, u'Intermediate', 1, None, u'CHEMBL903408', 722, u'BAO_0000219'),\n", " (454224, 2020480L, 37427, 671220, 408196, u'=', Decimal('31'), u'nM', Decimal('31'), u'nM', 1, u'IC50', None, u'IC50', None, None, u'=', Decimal('7.51'), u'BAO_0000190', u'UO_0000065', u'http://www.openphacts.org/units/Nanomolar', 37427, u'Displacement of [3H]NPTS from human GlyT1C expressed in HEK293 cells', u'B', None, None, u'Homo sapiens', 9606L, None, None, u'HEK293', None, 11682, u'D', 9, u'Intermediate', 1, None, u'CHEMBL903408', 722, u'BAO_0000219'),\n", " (454224, 2020481L, 37427, 671219, 408196, u'=', Decimal('24'), u'nM', Decimal('24'), u'nM', 1, u'IC50', None, u'IC50', None, None, u'=', Decimal('7.62'), u'BAO_0000190', u'UO_0000065', u'http://www.openphacts.org/units/Nanomolar', 37427, u'Displacement of [3H]NPTS from human GlyT1C expressed in HEK293 cells', u'B', None, None, u'Homo sapiens', 9606L, None, None, u'HEK293', None, 11682, u'D', 9, u'Intermediate', 1, None, u'CHEMBL903408', 722, u'BAO_0000219'),\n", " (454224, 2020482L, 37427, 671218, 408196, u'=', Decimal('38'), u'nM', Decimal('38'), u'nM', 1, u'IC50', None, u'IC50', None, None, u'=', Decimal('7.42'), u'BAO_0000190', u'UO_0000065', u'http://www.openphacts.org/units/Nanomolar', 37427, u'Displacement of [3H]NPTS from human GlyT1C expressed in HEK293 cells', u'B', None, None, u'Homo sapiens', 9606L, None, None, u'HEK293', None, 11682, u'D', 9, u'Intermediate', 1, None, u'CHEMBL903408', 722, u'BAO_0000219'),\n", " (454224, 2020485L, 37427, 671215, 408198, 
u'=', Decimal('10.2'), u'nM', Decimal('10.2'), u'nM', 1, u'Ki', None, u'Ki', None, None, u'=', Decimal('7.99'), u'BAO_0000192', u'UO_0000065', u'http://www.openphacts.org/units/Nanomolar', 37427, u'Displacement of [3H]NPTS from human GlyT1C expressed in HEK293 cells', u'B', None, None, u'Homo sapiens', 9606L, None, None, u'HEK293', None, 11682, u'D', 9, u'Intermediate', 1, None, u'CHEMBL903408', 722, u'BAO_0000219'),\n", " (454224, 2020486L, 37427, 671214, 408196, u'=', Decimal('15.9'), u'nM', Decimal('15.9'), u'nM', 1, u'Ki', None, u'Ki', None, None, u'=', Decimal('7.80'), u'BAO_0000192', u'UO_0000065', u'http://www.openphacts.org/units/Nanomolar', 37427, u'Displacement of [3H]NPTS from human GlyT1C expressed in HEK293 cells', u'B', None, None, u'Homo sapiens', 9606L, None, None, u'HEK293', None, 11682, u'D', 9, u'Intermediate', 1, None, u'CHEMBL903408', 722, u'BAO_0000219'),\n", " (454224, 2020488L, 37427, 671212, 408192, u'=', Decimal('61.4'), u'nM', Decimal('61.4'), u'nM', 1, u'Ki', None, u'Ki', None, None, u'=', Decimal('7.21'), u'BAO_0000192', u'UO_0000065', u'http://www.openphacts.org/units/Nanomolar', 37427, u'Displacement of [3H]NPTS from human GlyT1C expressed in HEK293 cells', u'B', None, None, u'Homo sapiens', 9606L, None, None, u'HEK293', None, 11682, u'D', 9, u'Intermediate', 1, None, u'CHEMBL903408', 722, u'BAO_0000219'),\n", " (454224, 2020490L, 37427, 671210, 408189, u'=', Decimal('26.8'), u'nM', Decimal('26.8'), u'nM', 1, u'Ki', None, u'Ki', None, None, u'=', Decimal('7.57'), u'BAO_0000192', u'UO_0000065', u'http://www.openphacts.org/units/Nanomolar', 37427, u'Displacement of [3H]NPTS from human GlyT1C expressed in HEK293 cells', u'B', None, None, u'Homo sapiens', 9606L, None, None, u'HEK293', None, 11682, u'D', 9, u'Intermediate', 1, None, u'CHEMBL903408', 722, u'BAO_0000219'),\n", " (454225, 2020495L, 37427, 671215, 408198, u'=', Decimal('385'), u'nM', Decimal('385'), u'nM', 1, u'IC50', None, u'IC50', None, None, u'=', Decimal('6.41'), 
u'BAO_0000190', u'UO_0000065', u'http://www.openphacts.org/units/Nanomolar', 37427, u'Inhibition of [3H]glycine uptake at human GlyT2 expressed in HEK293 cells', u'B', None, None, u'Homo sapiens', 9606L, None, None, u'HEK293', None, 11596, u'D', 9, u'Intermediate', 1, None, u'CHEMBL903409', 722, u'BAO_0000219'),\n", " (454225, 2020496L, 37427, 671214, 408196, u'=', Decimal('319'), u'nM', Decimal('319'), u'nM', 1, u'IC50', None, u'IC50', None, None, u'=', Decimal('6.50'), u'BAO_0000190', u'UO_0000065', u'http://www.openphacts.org/units/Nanomolar', 37427, u'Inhibition of [3H]glycine uptake at human GlyT2 expressed in HEK293 cells', u'B', None, None, u'Homo sapiens', 9606L, None, None, u'HEK293', None, 11596, u'D', 9, u'Intermediate', 1, None, u'CHEMBL903409', 722, u'BAO_0000219'),\n", " (454225, 2020498L, 37427, 671212, 408192, u'=', Decimal('31.5'), u'nM', Decimal('31.5'), u'nM', 1, u'IC50', None, u'IC50', None, None, u'=', Decimal('7.50'), u'BAO_0000190', u'UO_0000065', u'http://www.openphacts.org/units/Nanomolar', 37427, u'Inhibition of [3H]glycine uptake at human GlyT2 expressed in HEK293 cells', u'B', None, None, u'Homo sapiens', 9606L, None, None, u'HEK293', None, 11596, u'D', 9, u'Intermediate', 1, None, u'CHEMBL903409', 722, u'BAO_0000219'),\n", " (454226, 2020506L, 37427, 671214, 408196, u'>', Decimal('1000'), u'nM', Decimal('1000'), u'nM', 1, u'Ki', None, u'Ki', None, None, u'>', None, u'BAO_0000192', u'UO_0000065', u'http://www.openphacts.org/units/Nanomolar', 37427, u'Displacement of [3H]5-hydroxytrytamine from human 5HT1B receptor expressed in HEK293 cells', u'B', None, None, u'Homo sapiens', 9606L, None, None, u'HEK293', None, 106, u'D', 9, u'Intermediate', 1, None, u'CHEMBL903407', 722, u'BAO_0000219'),\n", " (454226, 2020510L, 37427, 671210, 408189, u'=', Decimal('1490'), u'nM', Decimal('1490'), u'nM', 1, u'Ki', None, u'Ki', None, None, u'=', Decimal('5.83'), u'BAO_0000192', u'UO_0000065', u'http://www.openphacts.org/units/Nanomolar', 37427, 
u'Displacement of [3H]5-hydroxytrytamine from human 5HT1B receptor expressed in HEK293 cells', u'B', None, None, u'Homo sapiens', 9606L, None, None, u'HEK293', None, 106, u'D', 9, u'Intermediate', 1, None, u'CHEMBL903407', 722, u'BAO_0000219'),\n", " (454228, 2020529L, 37427, 671215, 408198, u'=', Decimal('43'), u'%', Decimal('43'), u'%', 1, u'Inhibition', None, u'Inhibition', None, None, u'=', None, u'BAO_0000201', u'UO_0000187', u'http://qudt.org/vocab/unit#Percent', 37427, u'Inhibition of human recombinant CYP2D6 at 1.5 uM', u'A', None, None, u'Homo sapiens', 9606L, None, None, None, None, 11365, u'D', 9, u'Intermediate', 1, None, u'CHEMBL903411', None, u'BAO_0000357'),\n", " (454228, 2020530L, 37427, 671214, 408196, u'=', Decimal('96'), u'%', Decimal('96'), u'%', 1, u'Inhibition', None, u'Inhibition', None, None, u'=', None, u'BAO_0000201', u'UO_0000187', u'http://qudt.org/vocab/unit#Percent', 37427, u'Inhibition of human recombinant CYP2D6 at 1.5 uM', u'A', None, None, u'Homo sapiens', 9606L, None, None, None, None, 11365, u'D', 9, u'Intermediate', 1, None, u'CHEMBL903411', None, u'BAO_0000357'),\n", " (454228, 2020532L, 37427, 671212, 408192, u'=', Decimal('85'), u'%', Decimal('85'), u'%', 1, u'Inhibition', None, u'Inhibition', None, None, u'=', None, u'BAO_0000201', u'UO_0000187', u'http://qudt.org/vocab/unit#Percent', 37427, u'Inhibition of human recombinant CYP2D6 at 1.5 uM', u'A', None, None, u'Homo sapiens', 9606L, None, None, None, None, 11365, u'D', 9, u'Intermediate', 1, None, u'CHEMBL903411', None, u'BAO_0000357'),\n", " (454228, 2020534L, 37427, 671210, 408189, u'=', Decimal('47'), u'%', Decimal('47'), u'%', 1, u'Inhibition', None, u'Inhibition', None, None, u'=', None, u'BAO_0000201', u'UO_0000187', u'http://qudt.org/vocab/unit#Percent', 37427, u'Inhibition of human recombinant CYP2D6 at 1.5 uM', u'A', None, None, u'Homo sapiens', 9606L, None, None, None, None, 11365, u'D', 9, u'Intermediate', 1, None, u'CHEMBL903411', None, u'BAO_0000357'),\n", " 
(454229, 2020539L, 37427, 671215, 408198, u'=', Decimal('17'), u'min', Decimal('0.283'), u'hr', 1, u'T1/2', None, u't1/2', None, None, u'=', None, u'BAO_0002115', u'UO_0000032', u'http://qudt.org/vocab/unit#Hour', 37427, u'Metabolic stability in human liver microsomes assessed as half life', u'A', u'In vitro', None, u'Homo sapiens', 9606L, None, u'Liver', None, u'Microsomes', 102164, u'S', 2, u'Autocuration', 1, None, u'CHEMBL903412', None, u'BAO_0000251'),\n", " (454229, 2020540L, 37427, 671214, 408196, u'=', Decimal('22'), u'min', Decimal('0.367'), u'hr', 1, u'T1/2', None, u't1/2', None, None, u'=', None, u'BAO_0002115', u'UO_0000032', u'http://qudt.org/vocab/unit#Hour', 37427, u'Metabolic stability in human liver microsomes assessed as half life', u'A', u'In vitro', None, u'Homo sapiens', 9606L, None, u'Liver', None, u'Microsomes', 102164, u'S', 2, u'Autocuration', 1, None, u'CHEMBL903412', None, u'BAO_0000251'),\n", " (454229, 2020542L, 37427, 671212, 408192, u'=', Decimal('14'), u'min', Decimal('0.233'), u'hr', 1, u'T1/2', None, u't1/2', None, None, u'=', None, u'BAO_0002115', u'UO_0000032', u'http://qudt.org/vocab/unit#Hour', 37427, u'Metabolic stability in human liver microsomes assessed as half life', u'A', u'In vitro', None, u'Homo sapiens', 9606L, None, u'Liver', None, u'Microsomes', 102164, u'S', 2, u'Autocuration', 1, None, u'CHEMBL903412', None, u'BAO_0000251'),\n", " (454229, 2020544L, 37427, 671210, 408189, u'=', Decimal('9'), u'min', Decimal('0.15'), u'hr', 1, u'T1/2', None, u't1/2', None, None, u'=', None, u'BAO_0002115', u'UO_0000032', u'http://qudt.org/vocab/unit#Hour', 37427, u'Metabolic stability in human liver microsomes assessed as half life', u'A', u'In vitro', None, u'Homo sapiens', 9606L, None, u'Liver', None, u'Microsomes', 102164, u'S', 2, u'Autocuration', 1, None, u'CHEMBL903412', None, u'BAO_0000251')]" ] } ], "prompt_number": 102 }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, 
"outputs": [] } ], "metadata": {} } ] }