{ "metadata": { "name": "Pandas_RDKit_UGM" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "heading", "level": 1, "metadata": {}, "source": [ "Ames mutagenicity dataset analysis using RDKit and PANDAS" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This example shows how RDKit and PANDAS (http://pandas.pydata.org/), together with a bit of scikit-learn (http://scikit-learn.org/stable/), can be combined to do a simple data analysis. The Ames mutagenicity dataset compiled in Hansen et al., J. Chem. Inf. Model., 2009, 49 (9), pp 2077\u20132081, DOI: 10.1021/ci900161g, is used as an example. The smiles_cas_N6512.smi input file can be obtained as supplementary information from the article (http://pubs.acs.org/doi/suppl/10.1021/ci900161g)." ] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Data import and preparation" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import pandas as pd\n", "from rdkit import Chem,rdBase\n", "from rdkit.Chem import PandasTools\n", "print 'PANDAS version ',pd.__version__\n" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "PANDAS version 0.10.0\n" ] } ], "prompt_number": 1 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Read the input file, which is tab-separated text, into a pandas dataframe. The input file has no column names, which is why they are passed via the names parameter." ] }, { "cell_type": "code", "collapsed": false, "input": [ "data = pd.read_table(open('smiles_cas_N6512.smi','r'),header=None,names=['smiles','cas','mutagenic'])" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 2 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The head() method shows the first 5 rows of a dataframe. It takes an optional integer parameter that specifies how many rows are displayed. In most cases in this notebook it is set to 2 rows to save space."
] }, { "cell_type": "code", "collapsed": false, "input": [ "data.head()" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>smiles</th><th>cas</th><th>mutagenic</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>O=C1c2ccccc2C(=O)c3c1ccc4c3[nH]c5c6C(=O)c7ccccc7C(=O)c6c8[nH]c9c%10C(=O)c%11ccccc%11C(=O)c%10ccc9c8c45</td><td>2475-33-4</td><td>0</td></tr>\n",
"<tr><th>1</th><td>NNC(=O)CNC(=O)\\C=N\\#N</td><td>820-75-7</td><td>1</td></tr>\n",
"<tr><th>2</th><td>O=C1NC(=O)\\C(=N/#N)\\C=N1</td><td>2435-76-9</td><td>1</td></tr>\n",
"<tr><th>3</th><td>NC(=O)CNC(=O)\\C=N\\#N</td><td>817-99-2</td><td>1</td></tr>\n",
"<tr><th>4</th><td>CCCCN(CC(O)C1=C\\C(=N/#N)\\C(=O)C=C1)N=O</td><td>116539-70-9</td><td>1</td></tr>\n",
"</tbody>\n",
"</table>\n",
"
" ], "output_type": "pyout", "prompt_number": 3, "text": [ " smiles \\\n", "0 O=C1c2ccccc2C(=O)c3c1ccc4c3[nH]c5c6C(=O)c7ccccc7C(=O)c6c8[nH]c9c%10C(=O)c%11ccccc%11C(=O)c%10ccc9c8c45 \n", "1 NNC(=O)CNC(=O)\\C=N\\#N \n", "2 O=C1NC(=O)\\C(=N/#N)\\C=N1 \n", "3 NC(=O)CNC(=O)\\C=N\\#N \n", "4 CCCCN(CC(O)C1=C\\C(=N/#N)\\C(=O)C=C1)N=O \n", "\n", " cas mutagenic \n", "0 2475-33-4 0 \n", "1 820-75-7 1 \n", "2 2435-76-9 1 \n", "3 817-99-2 1 \n", "4 116539-70-9 1 " ] } ], "prompt_number": 3 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The CAS numbers are well suited to serve as row keys." ] }, { "cell_type": "code", "collapsed": false, "input": [ "data = data.set_index('cas')\n", "data.head()" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>smiles</th><th>mutagenic</th></tr>\n",
"<tr><th>cas</th><th></th><th></th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>2475-33-4</th><td>O=C1c2ccccc2C(=O)c3c1ccc4c3[nH]c5c6C(=O)c7ccccc7C(=O)c6c8[nH]c9c%10C(=O)c%11ccccc%11C(=O)c%10ccc9c8c45</td><td>0</td></tr>\n",
"<tr><th>820-75-7</th><td>NNC(=O)CNC(=O)\\C=N\\#N</td><td>1</td></tr>\n",
"<tr><th>2435-76-9</th><td>O=C1NC(=O)\\C(=N/#N)\\C=N1</td><td>1</td></tr>\n",
"<tr><th>817-99-2</th><td>NC(=O)CNC(=O)\\C=N\\#N</td><td>1</td></tr>\n",
"<tr><th>116539-70-9</th><td>CCCCN(CC(O)C1=C\\C(=N/#N)\\C(=O)C=C1)N=O</td><td>1</td></tr>\n",
"</tbody>\n",
"</table>\n",
"" ], "output_type": "pyout", "prompt_number": 4, "text": [ " smiles \\\n", "cas \n", "2475-33-4 O=C1c2ccccc2C(=O)c3c1ccc4c3[nH]c5c6C(=O)c7ccccc7C(=O)c6c8[nH]c9c%10C(=O)c%11ccccc%11C(=O)c%10ccc9c8c45 \n", "820-75-7 NNC(=O)CNC(=O)\\C=N\\#N \n", "2435-76-9 O=C1NC(=O)\\C(=N/#N)\\C=N1 \n", "817-99-2 NC(=O)CNC(=O)\\C=N\\#N \n", "116539-70-9 CCCCN(CC(O)C1=C\\C(=N/#N)\\C(=O)C=C1)N=O \n", "\n", " mutagenic \n", "cas \n", "2475-33-4 0 \n", "820-75-7 1 \n", "2435-76-9 1 \n", "817-99-2 1 \n", "116539-70-9 1 " ] } ], "prompt_number": 4 }, { "cell_type": "markdown", "metadata": {}, "source": [ "PANDAS has a describe method that shows basic statistics of a dataframe. In this case, the only information that can be obtained is the number of rows (6512) and the roughly balanced mutagenicity class distribution." ] }, { "cell_type": "code", "collapsed": false, "input": [ "data.describe()" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>mutagenic</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>count</th><td>6512.000000</td></tr>\n",
"<tr><th>mean</th><td>0.537930</td></tr>\n",
"<tr><th>std</th><td>0.498598</td></tr>\n",
"<tr><th>min</th><td>0.000000</td></tr>\n",
"<tr><th>25%</th><td>0.000000</td></tr>\n",
"<tr><th>50%</th><td>1.000000</td></tr>\n",
"<tr><th>75%</th><td>1.000000</td></tr>\n",
"<tr><th>max</th><td>1.000000</td></tr>\n",
"</tbody>\n",
"</table>\n",
"
" ], "output_type": "pyout", "prompt_number": 5, "text": [ " mutagenic\n", "count 6512.000000\n", "mean 0.537930\n", "std 0.498598\n", "min 0.000000\n", "25% 0.000000\n", "50% 1.000000\n", "75% 1.000000\n", "max 1.000000" ] } ], "prompt_number": 5 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The PandasTools module from rdkit offers functionality to add a molecule object type to the dataframe. In order to accelerate future substructure searches a substructure fingerprint can be precomputed during molecule construction. When the dataframe is displayed, the molecules are rendered as images using the RDKit's built-in drawing code." ] }, { "cell_type": "code", "collapsed": false, "input": [ "PandasTools.AddMoleculeColumnToFrame(data,smilesCol='smiles',molCol='molecule',includeFingerprints=False)\n", "data.head(2)" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>smiles</th><th>mutagenic</th><th>molecule</th></tr>\n",
"<tr><th>cas</th><th></th><th></th><th></th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>2475-33-4</th><td>O=C1c2ccccc2C(=O)c3c1ccc4c3[nH]c5c6C(=O)c7ccccc7C(=O)c6c8[nH]c9c%10C(=O)c%11ccccc%11C(=O)c%10ccc9c8c45</td><td>0</td><td>[molecule image]</td></tr>\n",
"<tr><th>820-75-7</th><td>NNC(=O)CNC(=O)\\C=N\\#N</td><td>1</td><td>None</td></tr>\n",
"</tbody>\n",
"</table>\n",
"
" ], "output_type": "pyout", "prompt_number": 6, "text": [ " smiles \\\n", "cas \n", "2475-33-4 O=C1c2ccccc2C(=O)c3c1ccc4c3[nH]c5c6C(=O)c7ccccc7C(=O)c6c8[nH]c9c%10C(=O)c%11ccccc%11C(=O)c%10ccc9c8c45 \n", "820-75-7 NNC(=O)CNC(=O)\\C=N\\#N \n", "\n", " mutagenic \\\n", "cas \n", "2475-33-4 0 \n", "820-75-7 1 \n", "\n", " molecule \n", "cas \n", "2475-33-4 \"Mol\"/ \n", "820-75-7 None " ] } ], "prompt_number": 6 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Some problematic smiles strings (e.g. compound 820-75-7) lead to empty molecules. The respective rows can be filtered using the \"notnull\" mask. " ] }, { "cell_type": "code", "collapsed": false, "input": [ "data = data.ix[data['molecule'].notnull()]\n", "data.describe()" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>mutagenic</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>count</th><td>6450.000000</td></tr>\n",
"<tr><th>mean</th><td>0.533798</td></tr>\n",
"<tr><th>std</th><td>0.498895</td></tr>\n",
"<tr><th>min</th><td>0.000000</td></tr>\n",
"<tr><th>25%</th><td>0.000000</td></tr>\n",
"<tr><th>50%</th><td>1.000000</td></tr>\n",
"<tr><th>75%</th><td>1.000000</td></tr>\n",
"<tr><th>max</th><td>1.000000</td></tr>\n",
"</tbody>\n",
"</table>\n",
"
" ], "output_type": "pyout", "prompt_number": 7, "text": [ " mutagenic\n", "count 6450.000000\n", "mean 0.533798\n", "std 0.498895\n", "min 0.000000\n", "25% 0.000000\n", "50% 1.000000\n", "75% 1.000000\n", "max 1.000000" ] } ], "prompt_number": 7 }, { "cell_type": "code", "collapsed": false, "input": [ "data.head(2)" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>smiles</th><th>mutagenic</th><th>molecule</th></tr>\n",
"<tr><th>cas</th><th></th><th></th><th></th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>2475-33-4</th><td>O=C1c2ccccc2C(=O)c3c1ccc4c3[nH]c5c6C(=O)c7ccccc7C(=O)c6c8[nH]c9c%10C(=O)c%11ccccc%11C(=O)c%10ccc9c8c45</td><td>0</td><td>[molecule image]</td></tr>\n",
"<tr><th>105149-00-6</th><td>CC(=O)OC1(CCC2C3C=C(Cl)C4=CC(=O)OCC4(C)C3CCC12C)C(=O)C</td><td>0</td><td>[molecule image]</td></tr>\n",
"</tbody>\n",
"</table>\n",
"" ], "output_type": "pyout", "prompt_number": 8, "text": [ " smiles \\\n", "cas \n", "2475-33-4 O=C1c2ccccc2C(=O)c3c1ccc4c3[nH]c5c6C(=O)c7ccccc7C(=O)c6c8[nH]c9c%10C(=O)c%11ccccc%11C(=O)c%10ccc9c8c45 \n", "105149-00-6 CC(=O)OC1(CCC2C3C=C(Cl)C4=CC(=O)OCC4(C)C3CCC12C)C(=O)C \n", "\n", " mutagenic \\\n", "cas \n", "2475-33-4 0 \n", "105149-00-6 0 \n", "\n", " molecule \n", "cas \n", "2475-33-4 \"Mol\"/ \n", "105149-00-6 \"Mol\"/ " ] } ], "prompt_number": 8 }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Simple and substructure-based statistics" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Pandas offers grouping functionality that allows for more detailed statistics." ] }, { "cell_type": "code", "collapsed": false, "input": [ "data.groupby('mutagenic').describe().unstack()" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th colspan=\"8\">mutagenic</th></tr>\n",
"<tr><th></th><th>count</th><th>mean</th><th>std</th><th>min</th><th>25%</th><th>50%</th><th>75%</th><th>max</th></tr>\n",
"<tr><th>mutagenic</th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>3007</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td></tr>\n",
"<tr><th>1</th><td>3443</td><td>1</td><td>0</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
"</tbody>\n",
"</table>\n",
"
" ], "output_type": "pyout", "prompt_number": 9, "text": [ " mutagenic \n", " count mean std min 25% 50% 75% max\n", "mutagenic \n", "0 3007 0 0 0 0 0 0 0\n", "1 3443 1 0 1 1 1 1 1" ] } ], "prompt_number": 9 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The \"ix\" row selection, the groupby and other PANDAS methods are able to handle boolean mask arrays similarily to what numpy does. This capability can be extended to substructure filters using the RDKit PandasTools integration." ] }, { "cell_type": "code", "collapsed": false, "input": [ "#define a structure pattern\n", "from rdkit.Chem.Draw import IPythonConsole\n", "nitroso = Chem.MolFromSmiles('N=O')\n", "nitroso" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "pyout", "png": "iVBORw0KGgoAAAANSUhEUgAAAcIAAACWCAYAAABNcIgQAAAEnklEQVR4nO3cO6tUVwCG4c+oCFp4\nKYLxeIGUAUWIFtoIsbEQ0b/gJRLEzkZQay29gCDeAqkUbKwtJBBR1MJCsBMxKdTGC2kSmBSDGJ3x\nYPA4Yr7ngYE5a2/2XlMML4vZ68waDAaDAECprz73BADgcxJCAKoJIQDVhBCAakIIQDUhBKCaEAJQ\nTQgBqCaEAFQTQgCqCSEA1YQQgGpCCEA1IQSgmhACUE0IAagmhABUE0IAqgkhANWEEIBqQghANSEE\noJoQAlBNCAGoJoQAVBNCAKoJIQDVhBCAakIIQDUhBKCaEAJQTQgBqCaEAFQTQgCqCSEA1YQQgGpC\nCEA1IQSgmhACUE0IAagmhABUE0IAqgkhANWEEIBqQghANSEEoJoQAlBNCAGoJoQAVBNCAKoJIQDV\nhBCAakIIQDUhBKCaEAJQTQgBqCaEAFQTQgCqCSEA1YQQgGpCCEA1IQSgmhACUE0IAagmhABUE0IA\nqgkhANWEEIBqcz73BKDFX38lT5+Oji/LHx9+kQULkoULZ25SgBUhjPPiRTJrVvLo0dvjZ84kGzcO\n3z99Ojznp5/ePueXX5Iffhi95r17ydTU6Gvs4Ptehw9/ks8LzawI4SNdvpzs3p18//30561Zk/z+\n+7gjYwfHW7BgdOzPP5Mff0yuXk1WrkyePEkWL07On39TbeC9hBA+0pEjyb59yY0bwxXi+8ydmyxb\nNu7I2MEPt21bMn9+8vDhMIBJcu5csmlTcvdusnr1x10f/ueEEKZx+nSyaNGbv2/eHD1nz57k55+T\nCxeSnTsnNrU3E/rtt7cjmCS7diVHjybHjydnz054UvBlEUKYxv37w8XWa48fJ3Pe+dbMnp2cOpVs\n357s2DHR6SW3byfffZd8/fXosW3bhstUYFpCCNM4eXL4s9trZ84kFy+OnrdhQ7JlS3Lo0PD9xDx7\nlixZMv7Yt98m165NcDLwZRJCmCHHjg0XZ/PmTfCmK1Ykt24lf/89ulT99ddk1aoJTga+TLZPwAxZ\nunS4Ijx1aoI3Xb8+ef58/I+X168nmzdPcDLwZbIihBm0f//w2ZTBYPTYJ9lQv3p1snVrcuBAcuVK\n8s03wxsdPDhcIe7a9d8/BJSZNRiM+8oCM+3OnWTdutHxQabZc/Gu/
fuTEyfeHnv5Mtm7N7l0KVm+\nfLiPcO3a4QbHqamPmjM0EEKYkE/+L9ZevUoePBg+JPPvrRTAtIQQgGoelgGgmhACUE0IAagmhABU\nE0IAqgkhANWEEIBqQghANSEEoJoQAlBNCAGoJoQAVBNCAKoJIQDVhBCAakIIQDUhBKCaEAJQTQgB\nqCaEAFQTQgCqCSEA1YQQgGpCCEA1IQSgmhACUE0IAagmhABUE0IAqgkhANWEEIBqQghANSEEoJoQ\nAlBNCAGoJoQAVBNCAKoJIQDVhBCAakIIQDUhBKCaEAJQTQgBqCaEAFQTQgCqCSEA1YQQgGpCCEA1\nIQSgmhACUE0IAagmhABUE0IAqgkhANWEEIBqQghANSEEoJoQAlBNCAGoJoQAVBNCAKoJIQDVhBCA\nakIIQDUhBKCaEAJQTQgBqCaEAFT7B4TFl1SMHnlRAAAAAElFTkSuQmCC\n", "prompt_number": 10, "text": [ "" ] } ], "prompt_number": 10 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Importing the RDKit PandasTools module has the side-effects of adding HTML rendering of molecules (discussed above) and adding a __ge__ (>=) operator that triggers a substructure search, i.e. molX >= molY returns the same boolean result as checking if the substructure molY is contained in molX.
\n", "Thus, the next two examples show the mutagenicity class distribution for molecules depending on whether they contain a nitroso or a naphthalene motif." ] }, { "cell_type": "code", "collapsed": false, "input": [ "data.groupby(data['molecule'] >= nitroso).describe().unstack()" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th colspan=\"8\">mutagenic</th></tr>\n",
"<tr><th></th><th>count</th><th>mean</th><th>std</th><th>min</th><th>25%</th><th>50%</th><th>75%</th><th>max</th></tr>\n",
"<tr><th>molecule</th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>False</th><td>5217</td><td>0.461760</td><td>0.498583</td><td>0</td><td>0</td><td>0</td><td>1</td><td>1</td></tr>\n",
"<tr><th>True</th><td>1233</td><td>0.838605</td><td>0.368044</td><td>0</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
"</tbody>\n",
"</table>\n",
"
" ], "output_type": "pyout", "prompt_number": 11, "text": [ " mutagenic \n", " count mean std min 25% 50% 75% max\n", "molecule \n", "False 5217 0.461760 0.498583 0 0 0 1 1\n", "True 1233 0.838605 0.368044 0 1 1 1 1" ] } ], "prompt_number": 11 }, { "cell_type": "code", "collapsed": false, "input": [ "polyarom = Chem.MolFromSmiles('c1cccc2c1cccc2')\n", "polyarom" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "pyout", "png": "iVBORw0KGgoAAAANSUhEUgAAAcIAAACWCAYAAABNcIgQAAAWNUlEQVR4nO3de1RVdeIF8K1WDjZr\nJhAfRcxEk1xeiigoig9QZCTKZsDSlVnOMFr4GBEubxAzi8RHMmKMSJpoOj4rRUFpUBQUFXkFCNcy\nQcVmFSh3xhne9/dH+VtW91wB4X7P7ezPWv3T3Wu1a63W9px77vn20el0OhARESlUX9EFiIiIROIQ\nEhGRonEIiYhI0TiERESkaBxCIiJSNA4hEREpGoeQiIgUjUNIRESKxiEkIiJF4xASEZGicQiJiEjR\nOIRERKRoHEIiIlI0DiERESkah5CIiBSNQ0hERIrGISTqpqKiIvj5+eGRRx4R+pefnx+KiopE/+cg\nMll9eEI9Udd1dHTAzc0NDg4OCAoKEtrl/fffR1VVFc6fP4++fflnW6Kuekh0ASJTtGPHDtTU1CA7\nOxsWFhZCu9jZ2WHYsGHYuXMnXn31VaFdiEwR//hI1EVarRYRERGIjY0VPoIAYGFhgdjYWISHh0Or\n1YquQ2RyOIREXbR69WpYWlpi8eLFoqv8vyVLlsDS0hKJiYmiqxCZHH5HSNQFly9fhpOTE44cOQJv\nb2/RdX7g6NGj8Pf3R3l5OZ555hnRdYhMBoeQqAsCAgLQ3NyMjIwM0VX08vPzg5mZGfbv3y+6CpHJ\n4K1Rok7Kzs7GkSNHsGHDBsnM2bNnsWDBAvTGny91Oh0WLFiAs2fPSmbWr1+Pw4cP4+TJkz3+zyf6\nueIQEnVCW1sbli1bhkWLFknedmxvb8fChQsxYMAA9OnTp8c79OnTB2ZmZli4cCHa29v1ZlQqFRYt\nWoTg4GDJDBH9EIeQqBPS0tJQX1+P+Ph4yczWrVtx/fp1g5kHtWLFCly/fh3btm2TzCxfvhx1dXVI\nS0vrtR5EPyccQqL7aGhoQExMDFasWIFf/epXejO3b99GbGwsVq5cCXNz817rYm5ujjfffBMxMTFo\nbGzUm3nsscewcuVKxMXF4fbt273Whejngg/LEN1HSEgITp06ZfDNLWq1GtnZ2SgqKkK/fv16tU97\neztGjRoFHx8frFmzRjIzevRoTJ06FevWrevVPkSmjkNIZEBFRQVcXFyQk5ODCRMm6M1UV1djxIgR\nyMrKgpeXl1F65eTkwNfXF2VlZVCpVHozJ06cwPTp0w1miIhDSGSQn58fBgwYgH379klmnnvuOfTv\n3x8HDhwwYjPA398fra2tOHz4sGRm5syZaGpqku3PPYjkgENIJOHo0aMICAhAZWUlbGxs9GYyMzPh\n7+9vMNNbrly5AkdHRxw8eBC+vr56M1999RUcHBwMZoiUjkNIpEdrayuGDx+OmTNnYtWqVZKZESNG\nwN/fH2+//baRG34nOjoaH3/8McrKyvDwww/rzcTExODgwYMGM0RKxqdGifRITk7Gv//9b0RGRhrM\naLVaREVFGbHZD0VHR0Or1WLTpk2SmaioqPtmiJSMV4REP/LNN9/A1tYWSUlJkscaf
f3111CpVNi4\ncaPwo4+2b9+O4OBgaDQaDBo0yGDm8uXLsLS0NHJDInnjEBL9SFBQEIqLi3H27FnJN8S88cYbKCkp\nMZgxFp1Oh3HjxsHFxQUpKSmSGXd3d4wePRrvv/++kRsSyRuHkOgepaWlcHV1xenTp+Hu7q43c/Hi\nRbi7u6OgoACjR482ckP9CgoKMHHiRBQWFsLZ2dlg5uLFixgxYoSRGxLJF4eQ6B5TpkyBlZUVduzY\nofdznU6HSZMm4emnn8b27duN3M6wV199FdevX0dOTo5k5pVXXkFdXZ3BDJHScAiJvnfgwAHMmzcP\nVVVVsLKy0pvZv38/AgMDodFoMGTIECM3NOzGjRuws7PDhx9+iICAAMmMSqVCeno6/P39jdyQSJ74\n1CgRgKamJoSFhSEiIkJyBP/73/8iJCQE4eHhshtBALCyskJkZCTCwsLQ1NQkmYmIiIBarZbMECkN\nh5AI353jp9PpoFarDWb69+9vMCNaaGgoABh8v2hYWBh0Oh3ee+89Y9UikjXeGiXFq6urg0qlwtat\nW/Hiiy/qzdTW1sLe3h7p6emStx3lYv/+/Xjttddw6dIl/OY3v9Gb2bt3LwIDA1FdXY0nnnjCyA2J\n5IVDSIo3b948XL161eCp7nPnzsW//vUvHD9+3HjFHoCXlxesra2Rnp4umfH09ISNjY3Bsw2JlIBD\nSIp2/vx5eHh44MKFCxg5cqTeTF5eHqZMmYLi4mI4Ojoat2A3debUjJKSEri5uSE/Px9jxowxckMi\n+eAQkmLpdDqMHz8ew4cPR2pqqt5MR0cH3NzcMHbsWJP7IXpQUBAuXLhg8BzF+fPno6KiAvn5+cJf\nDEAkCh+WIcXatWsXqqqqJF+qDQDp6emoqakxmJGrt99+G1evXpX8TeTdTGVlJXbt2mXEZkTywiEk\nxWpsbISbmxsGDx6s93OtVovIyEjExcXBwsLCyO0enIWFBWJjYxEREQGtVqs3M3jwYLi5uaGxsdHI\n7Yjkg7dGSbG+/PJLODo64tChQ/Dx8fnJ5y0tLfjggw8QGBiIRx55REDDB3e/f4fjx49jxowZqKio\nwO9+9zsBDYnE4xCSoqnVamRmZqK0tBQPPfSQ6DpG1dbWBmdnZzz77LNYs2aN6DpEwvDWKClafHw8\nGhoaJB+W+TnbvHkzbt26heXLl4uuQiQUrwhJ8bZs2YLIyEhoNBoMHDhQdB2jqK+vx7Bhw5CYmIi/\n/OUvousQCcUhJMXr6OjA2LFjMX78eCQlJYmuYxR//etfUVBQgIKCAsmfVhApBYeQCMCZM2cwefJk\nFBUVYfjw4aLr9KrPP/8co0aNQm5uLsaPHy+6DpFwHEKi782ePRvffvstPvvsM9FVetXUqVMxePBg\n7N69W3QVIlngEBJ97/r167Czs8OuXbswY8YM0XV6xaeffoo5c+agqqoKTz75pOg6RLLALweIvvfk\nk09CrVYjJCQEzc3Nouv0uObmZoSEhECtVnMEie7BISS6R0REBNra2vC3v/1NdJUel5SUhPb2dkRE\nRIiuQiQrvDVK9CO7d+/G66+/jurqajz++OOi6/SImzdvQqVSITU1FbNnzxZdh0hWOIREekyePBnD\nhg1DWlqa6Co9IjAwEF988QVyc3NFVyGSHQ4hkR7FxcUYM2YMzpw5Azc3N9F1HsjdMxfPnz8PFxcX\n0XWIZIdDSCQhMDAQly5dMumz+nQ6HTw8PODg4PCzubol6ml8WIZIQkJCAiorK/GPf/xDdJVu2717\nNyorK/HOO++IrkIkWxxCIgmDBw9GdHQ0wsPDcefOHdF1uuzOnTsIDw9HTEyM5JmLRMQhJDIoODgY\nAwYMMMljihITE/Hoo49i6dKloqsQyRq/IyS6j0OHDmH27NmorKzEU089JbpOp1y9ehUODg7Ys2cP\nnn/+edF1iGSNQ0jUCb6+vvj1r39tMt8Xzpo1C
1qtFpmZmaKrEMkeh5CoEy5dugRnZ2dkZ2dj8uTJ\nejM3b95ERkaGUfo899xzkj/2z83NxbRp01BaWgp7e3uj9CEyZRxCok5aunQpTp06hcLCQvTr1+8n\nn1+8eNFoh9ympaVh9OjRP/n77e3tcHV1xeTJk7FhwwajdCEydRxCok66desWbG1tkZCQINtT3bds\n2YLo6GhoNBqYm5uLrkNkEvjUKFEnmZub480330RMTAwaGxtF1/mJ27dvIzY2FitXruQIEnUBrwiJ\nuqC9vR2jRo2Cj4+P7H5SoVarkZ2djaKiIr23bolIPw4hURfl5OTA19cXZWVlUKlUousAAKqrqzFi\nxAhkZWXBy8tLdB0ik8IhJOoGf39/tLa24vDhw6KrAPjuKdL+/fvjwIEDoqsQmRwOIVE3XLlyBY6O\njjh48CB8fX2FdsnMzIS/vz8qKythY2MjtAuRKeIQEnVTcHAwsrOz4eHhIbRHXl4efHx8+HMJom7i\nU6NEJs5Uj4gikgteERJ1Q3l5OVxcXHDy5EnhV4T5+fnw8vJCUVERnJychHYhMkUcQqJu8Pb2xsCB\nA7Fnzx7RVQAAL730Em7duoXs7GzRVYhMDoeQqIsyMjIwa9YsVFVVwdraWnQdAMC1a9dgZ2eHvXv3\nws/PT3QdIpPC7wiJuqC5uRnBwcFQq9WyGUEAsLa2RmhoKIKDg9Hc3Cy6DpFJ4RASdcHGjRvR2tqK\nyMhI0VV+IioqCi0tLUhOThZdhcik8NYoUSfdvHkTKpUKKSkpmDNnjug6en300UcICgqCRqPB0KFD\nRdchMgkcQqJOmj9/Pqqrq5GbmyvbnyzodDpMmjQJ9vb2SE1NFV2HyCRwCIk64cKFCxg/fjzOnTuH\nUaNGia5jUFFREcaOHYuzZ8/C1dVVdB0i2eMQEt2HTqfDhAkToFKpsHXrVtF1OuVPf/oTNBoN8vLy\nZHv1SiQXfFiG6D727NmDiooKJCQkiK7Sae+++y7Ky8uxd+9e0VWIZI9DSGTAnTt3EBYWhqioKAwZ\nMkR0nU4bMmQIoqKioFarcefOHdF1iGSNQ0hkwJo1azBgwAAsW7ZMdJUuCwkJgZmZGdauXSu6CpGs\n8TtCIgk1NTWwt7fH7t278cILL4iu0y2ffPIJXn75ZVy6dAm//e1vRdchkiUOIZGEl19+GQ0NDcjK\nyhJd5YFMnz4dFhYW2LVrl+gqRLLEISTS49SpU/D29kZpaSns7e1F13kglZWVGDlyJP75z39i4sSJ\nousQyQ6HkOhHOjo64OrqCg8PD2zcuFF0nR6xePFinDlzBoWFhejbl48GEN2L/0cQ/ci2bdtw7do1\nrFy5UnSVHvPWW2+htrYWH374oegqRLLDISS6R2NjI6KjoxEfHw9zc3PRdXqMubk54uPjERUVhcbG\nRtF1iGSFt0aJ7hEeHo6srCwUFxejX79+ouv0qPb2dri4uMDX1xerV68WXYdINjiERN/TaDQYPnw4\nMjMzMWXKFNF1ekVOTg58fX3x+eefw9bWVnQdIlngEBJ9b8aMGejXrx8+/vhj0VV61R/+8AfodDp8\n+umnoqsQyQKHkAhAVlYW/vjHP6KiogJPP/206Dq96sqVK3BwcMAnn3yC6dOni65DJByHkBSvtbUV\nzs7OeOGFF0zqxdoPIjIyEocOHUJpaSkefvhh0XWIhOJTo6R4KSkpaGxsRExMjOgqRhMTE4Pbt2/j\n73//u+gqRMLxipAU7dtvv4WtrS3Wr1+PefPmia5jVNu2bUNoaCg0Gg0sLS1F1yEShkNIirZo0SIU\nFhaioKBAcQfYdnR0YNy4cXBzc0NycrLoOkTC8NYoKVZZWRlSU1ORlJSkdwRbWlqQkpKClpYWAe16\nhqF/h759+2LDhg3YvHkzysrKBLQjkgcOISlWXl4evLy84O7urvfz9vZ2JCYmYt26dUZu1nPWrl2L\nxMREdHR06
P183Lhx8PT0RF5enpGbEckHh5AUa+jQocjLy0Ntba3ez83MzLB69Wq88847qKurM3K7\nB1dXV4eEhAQkJibiF7/4hd5MbW0t8vPz8fjjjxu5HZF88DtCUjQfHx8MGTIEO3bskMx4enriqaee\nMrkXVr/22muora3FiRMnJDOvvPIKvvnmGxw7dsyIzYjkhUNIilZRUQEXFxfk5ORgwoQJejMlJSVw\nc3NDXl4exo4da+SG3XPu3DlMmDABhYWFcHZ21ps5ffo0pk6dipKSEjg4OBi5IZF8cAhJ8RYuXIhz\n587hwoULkmf1vf766ygrK8OZM2dk/3RpR0cHxowZA1dXV8nfCXZ0dMDNzQ3u7u7YtGmTkRsSyQu/\nIyTFW7VqFWpqagzeHl21ahWqqqqwc+dOIzbrnp07d+LLL7/EW2+9JZlJT09HTU2NwQyRUnAISfEs\nLCwQGxuLiIgIaLVavZlBgwYhLi4OUVFR+M9//mPkhp2n1WoRHh6O5cuXY9CgQZKZyMhIxMXFwcLC\nwsgNieSHQ0gEYPHixbC0tDR4Tt+SJUvwy1/+UtZn+SUmJsLS0hJLliyRzLz77rsYNGgQFi1aZMRm\nRPLF7wiJvvfZZ5/Bz88PFRUVeOaZZ/Rmjh49ioCAAFRWVsLGxsbIDQ374osv4OTkhIyMDHh7e+vN\nXL58GU5OTjhy5IhkhkhpOIRE9/Dz84OZmRn2798vmXn22Wfx6KOPYt++fUZsdn8zZ87E//73Pxw5\nckQyExAQgObmZmRkZBixGZG8cQiJ7nH3qurw4cOYNm2a3kxVVRWcnZ1x7NgxeHp6GreghOzsbDz/\n/PMoLy+XvJo9fvw4ZsyYYTBDpEQcQqIfCQ0NxbFjx1BSUoKHHnpIbyYkJATHjx83mDGWtrY2jBw5\nEr///e8lXwfX1tYGZ2dn+Pr6Yu3atUZuSCRvfFiG6Efi4+NRX1+PtLQ0ycyKFSvumzGWLVu2oL6+\nHvHx8QYzDQ0NBjNESsUrQiI9Nm/ejOjoaFy+fFnyJwZ3MxqNBgMHDjRyw+/U19fD1tYWCQkJWLBg\nQbczRErGISTS4+7bWSZNmoT169cbzEycOBHvvfeekRt+Jzg4GPn5+Th37pzkW3GWLVuG06dP4/z5\n85IZIiXjEBJJyMvLw5QpU1BcXAxHR0e9mfz8fHh5eaGoqAhOTk5G7VdeXg4XFxecPHkSHh4eejN3\n36V64sQJyQyR0nEIiQzozE8SXnrpJdy6dQvZ2dlGbAZ4e3tj4MCB2LNnj2Rm2rRpeOyxx2T3Uw8i\nOeEQEhnw1VdfwcHBAQcPHoSvr6/ezLVr12BnZ4e9e/fCz8/PKL0yMjIwa9YsVFVVwdraWm/m6NGj\nePHFFw1miIhPjRIZZGNjg5CQEISEhKC1tVVvxtraGqGhoQgODkZzc3Ovd2pubkZwcDDUarXkwDU3\nN2Pp0qVYtmwZR5DoPjiERPcRFRUFrVaL5ORkg5mWlhaDmZ6yceNGtLa2IjIyUjKzadMmtLS0IDo6\nutf7EJk63hol6oT09HQsXboUGo1G8lSHjz76CEFBQdBoNBg6dGiv9Lh58yZUKhVSUlIwZ84cvZmv\nv/4aKpUKycnJmDt3bq/0IPo54RASdYJOp8O4cePg4uKClJQUycykSZNgb2+P1NTUXukxf/58VFdX\nIzc3V/KA4DfeeAPl5eU4ffq07A8RJpID3hol6oQ+ffpgw4YNSEtLQ2lpqWQmMTERwHej2NN0Oh36\n9OmD1atXSw7cxYsX8cEHHyApKYkjSNRJvCIk6oK5c+fixo0byMnJEV3lJ+5ekdrY2CA9PV10HSKT\nwSEk6oIbN27Azs4O27dvh7+/v+g6P3DgwAHMmzcPVVVVsLKyEl2HyGTw1ihRF1hZWSEiIgJqtRpN\nTU2i6/y/pqYmhIWFITIykiNI1EUcQqIuUqvV0Ol0ku8gFeHu8UuhoaGCmxC
ZHt4aJeqGffv24c9/\n/jOqq6vxxBNPCO1SV1cHlUqFbdu2YebMmUK7EJkiDiFRN3l6esLS0hIhISFCe6xbtw4NDQ04ceKE\n0B5Epkrs0dpEJmz9+vWIi4uDp6en0B7Tpk2TPJmeiO6PV4RERKRofFiGiIgUjUNIRESKxiEkIiJF\n4xASEZGicQiJiEjROIRERKRoHEIiIlI0DiERESkah5CIiBSNQ0hERIrGISQiIkXjEBIRkaJxCImI\nSNE4hEREpGgcQiIiUjQOIRERKdr/AbYCZQZaKmKkAAAAAElFTkSuQmCC\n", "prompt_number": 12, "text": [ "" ] } ], "prompt_number": 12 }, { "cell_type": "code", "collapsed": false, "input": [ "data.groupby(data['molecule'] >= polyarom).describe().unstack()" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th colspan=\"8\">mutagenic</th></tr>\n",
"<tr><th></th><th>count</th><th>mean</th><th>std</th><th>min</th><th>25%</th><th>50%</th><th>75%</th><th>max</th></tr>\n",
"<tr><th>molecule</th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>False</th><td>5533</td><td>0.488162</td><td>0.499905</td><td>0</td><td>0</td><td>0</td><td>1</td><td>1</td></tr>\n",
"<tr><th>True</th><td>917</td><td>0.809160</td><td>0.393177</td><td>0</td><td>1</td><td>1</td><td>1</td><td>1</td></tr>\n",
"</tbody>\n",
"</table>\n",
"" ], "output_type": "pyout", "prompt_number": 13, "text": [ " mutagenic \n", " count mean std min 25% 50% 75% max\n", "molecule \n", "False 5533 0.488162 0.499905 0 0 0 1 1\n", "True 917 0.809160 0.393177 0 1 1 1 1" ] } ], "prompt_number": 13 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Thus, 917 compounds contain the naphthalene substructure, and about 81% of them are classified as mutagenic." ] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Performing a simple fingerprint-based machine learning experiment" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "RDKit has several molecular fingerprint implementations that can serve as a molecule representation for building a mutagenicity model. They work best with the scikit-learn machine-learning methods when the fingerprints are first converted to numpy arrays. A row-wise custom computation is most easily done in Pandas by combining a lambda function with the dataframe.apply method. Unfortunately, this pattern does not work directly for functions returning an array, which is why the fingerprints are wrapped in a dummy container class FP."
] }, { "cell_type": "code", "collapsed": false, "input": [ "import numpy\n", "from rdkit.Chem import AllChem\n", "from rdkit import DataStructs\n", "\n", "class FP:\n", "    def __init__(self, fp):\n", "        self.fp = fp\n", "    def __str__(self):\n", "        return self.fp.__str__()\n", "\n", "def computeFP(x):\n", "    #compute depth-2 Morgan fingerprint hashed to 1024 bits\n", "    fp = AllChem.GetMorganFingerprintAsBitVect(x,2,nBits=1024)\n", "    res = numpy.zeros(len(fp),numpy.int32)\n", "    #convert the fingerprint to a numpy array and wrap it into the dummy container\n", "    DataStructs.ConvertToNumpyArray(fp,res)\n", "    return FP(res)\n", "\n", "\n", "data['FP'] = data.apply(lambda row: computeFP(row['molecule']), axis=1)\n", "#filter potentially failed fingerprint computations\n", "data = data.ix[data['FP'].notnull()]" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 14 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The \"ix\" row filter works for row indices as well as boolean masks. We use this here to randomly split the data into training and test sets." ] },
{ "cell_type": "code", "collapsed": false, "input": [ "import random \n", "rand = random.Random()\n", "rand.seed(42)\n", "train = rand.sample(data.index, len(data)//2)\n", "trainData = data.ix[train]\n", "testData = data.drop(train)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 15 }, { "cell_type": "code", "collapsed": false, "input": [ "from sklearn.ensemble import RandomForestClassifier\n", "model = RandomForestClassifier(random_state = 42)\n", "#resolve wrapped fingerprints\n", "X = [x.fp for x in trainData['FP']]\n", "y = trainData['mutagenic']\n", "model.fit(X,y)\n", "#resolve wrapped fingerprints and apply the model on the test data\n", "prediction = model.predict([x.fp for x in testData['FP']])" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 16 }, { "cell_type": "markdown", "metadata": {}, "source": [ "A simple numerical report can easily be obtained from scikit-learn using dataframe columns." ] }, { "cell_type": "code", "collapsed": false, "input": [ "from sklearn import metrics\n", "print metrics.confusion_matrix(testData['mutagenic'],prediction)\n", "print metrics.classification_report(testData['mutagenic'],prediction)\n" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "[[1107 417]\n", " [ 372 1329]]\n", " precision recall f1-score support\n", "\n", " 0 0.75 0.73 0.74 1524\n", " 1 0.76 0.78 0.77 1701\n", "\n", "avg / total 0.76 0.76 0.76 3225\n", "\n" ] } ], "prompt_number": 17 },
{ "cell_type": "markdown", "metadata": {}, "source": [ "For a more detailed analysis, probabilistic predictions can be obtained from the scikit-learn RandomForest model and inserted directly into the dataframe." ] }, { "cell_type": "code", "collapsed": false, "input": [ "testData['prediction'] = model.predict([x.fp for x in testData['FP']])\n", "testData['probability'] = [p[1] for p in model.predict_proba([x.fp for x in testData['FP']])]" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 18 }, { "cell_type": "code", "collapsed": false, "input": [ "testData.head(2)" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>smiles</th><th>mutagenic</th><th>molecule</th><th>FP</th><th>prediction</th><th>probability</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>2475-33-4</th><td>O=C1c2ccccc2C(=O)c3c1ccc4c3[nH]c5c6C(=O)c7ccccc7C(=O)c6c8[nH]c9c%10C(=O)c%11ccccc%11C(=O)c%10ccc9c8c45</td><td>0</td><td>[molecule image]</td><td>[0 0 0 ..., 0 0 0]</td><td>1</td><td>0.525000</td></tr>\n",
"<tr><th>105149-00-6</th><td>CC(=O)OC1(CCC2C3C=C(Cl)C4=CC(=O)OCC4(C)C3CCC12C)C(=O)C</td><td>0</td><td>[molecule image]</td><td>[0 0 0 ..., 0 0 0]</td><td>0</td><td>0.208333</td></tr>\n",
"</tbody>\n",
"</table>\n",
"
" ], "output_type": "pyout", "prompt_number": 19, "text": [ " smiles \\\n", "2475-33-4 O=C1c2ccccc2C(=O)c3c1ccc4c3[nH]c5c6C(=O)c7ccccc7C(=O)c6c8[nH]c9c%10C(=O)c%11ccccc%11C(=O)c%10ccc9c8c45 \n", "105149-00-6 CC(=O)OC1(CCC2C3C=C(Cl)C4=CC(=O)OCC4(C)C3CCC12C)C(=O)C \n", "\n", " mutagenic \\\n", "2475-33-4 0 \n", "105149-00-6 0 \n", "\n", " molecule \\\n", "2475-33-4 \"Mol\"/ \n", "105149-00-6 \"Mol\"/ \n", "\n", " FP prediction probability \n", "2475-33-4 [0 0 0 ..., 0 0 0] 1 0.525000 \n", "105149-00-6 [0 0 0 ..., 0 0 0] 0 0.208333 " ] } ], "prompt_number": 19 }, { "cell_type": "code", "collapsed": false, "input": [ "testData.sort(columns='probability').head(2)" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>smiles</th><th>mutagenic</th><th>molecule</th><th>FP</th><th>prediction</th><th>probability</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>109-43-3</th><td>CCCCOC(=O)CCCCCCCCC(=O)OCCCC</td><td>0</td><td>[molecule image]</td><td>[0 0 0 ..., 0 0 0]</td><td>0</td><td>0</td></tr>\n",
"<tr><th>111-87-5</th><td>CCCCCCCCO</td><td>0</td><td>[molecule image]</td><td>[0 0 0 ..., 0 0 0]</td><td>0</td><td>0</td></tr>\n",
"</tbody>\n",
"</table>\n",
"
" ], "output_type": "pyout", "prompt_number": 20, "text": [ " smiles mutagenic \\\n", "109-43-3 CCCCOC(=O)CCCCCCCCC(=O)OCCCC 0 \n", "111-87-5 CCCCCCCCO 0 \n", "\n", " molecule \\\n", "109-43-3 \"Mol\"/ \n", "111-87-5 \"Mol\"/ \n", "\n", " FP prediction probability \n", "109-43-3 [0 0 0 ..., 0 0 0] 0 0 \n", "111-87-5 [0 0 0 ..., 0 0 0] 0 0 " ] } ], "prompt_number": 20 }, { "cell_type": "code", "collapsed": false, "input": [ "testData.sort(columns='probability',ascending=False).head(2)" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>smiles</th><th>mutagenic</th><th>molecule</th><th>FP</th><th>prediction</th><th>probability</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>5296-38-8</th><td>C(Oc1ccc(Cc2ccccc2)cc1)C3CO3</td><td>1</td><td>[molecule image]</td><td>[0 0 0 ..., 0 0 0]</td><td>1</td><td>1</td></tr>\n",
"<tr><th>17024-19-0</th><td>[O-][N+](=O)c1ccc2ccc3ccccc3c2c1</td><td>1</td><td>[molecule image]</td><td>[0 0 0 ..., 0 0 0]</td><td>1</td><td>1</td></tr>\n",
"</tbody>\n",
"</table>\n",
"" ], "output_type": "pyout", "prompt_number": 21, "text": [ " smiles mutagenic \\\n", "5296-38-8 C(Oc1ccc(Cc2ccccc2)cc1)C3CO3 1 \n", "17024-19-0 [O-][N+](=O)c1ccc2ccc3ccccc3c2c1 1 \n", "\n", " molecule \\\n", "5296-38-8 \"Mol\"/ \n", "17024-19-0 \"Mol\"/ \n", "\n", " FP prediction probability \n", "5296-38-8 [0 0 0 ..., 0 0 0] 1 1 \n", "17024-19-0 [0 0 0 ..., 0 0 0] 1 1 " ] } ], "prompt_number": 21 }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Analysis of the learned SAR using PANDAS" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Pandas offers a range of data visualization tools that can easily be used to generate reports for the performance analysis of machine learning models. The next example creates a plot of the true binary mutagenicity class distribution on the test set for each bin of the discretized predicted mutagenicity probability." ] }, { "cell_type": "code", "collapsed": false, "input": [ "#assign the predicted probabilities to discrete bins\n", "testData['binnedProb'] = pd.cut(testData['probability'],bins=[-0.1,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.1],labels=False)\n", "\n", "temp = testData.groupby(['binnedProb','mutagenic'])['mutagenic'].size().unstack()\n", "print temp\n", "temp.plot(kind='bar',stacked=True,)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "mutagenic 0 1\n", "binnedProb \n", "0 289 26\n", "1 203 53\n", "2 240 66\n", "3 194 107\n", "4 181 120\n", "5 151 144\n", "6 107 209\n", "7 55 244\n", "8 63 335\n", "9 41 397\n" ] }, { "output_type": "pyout", "prompt_number": 22, "text": [ "" ] }, { "output_type": "display_data", "png": 
"iVBORw0KGgoAAAANSUhEUgAAAXMAAAEHCAYAAABcCaZFAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3X1UVPedx/H38GBTIyomgu5gD66oMIAyEZE8mIwRdVMj\n1bjBojHEh6TH7O5JY7aN7jZdbBvBk9hdNXHXTTTFpivaJAqx0WITJ010UzCiboIrxIOG59TFZyQg\n3P2DOgEfRmQYGC6f1zmcM87ce3/fmSufufzu796fxTAMAxER6dH8ursAERHxnMJcRMQEFOYiIiag\nMBcRMQGFuYiICSjMRURMoF1h3tTUhN1uZ8aMGQCkp6cTFhaG3W7Hbreza9cu17IZGRmMHDmSyMhI\n8vLyvFO1iIi0EdCehdasWYPNZuP8+fMAWCwWli5dytKlS9ssV1RUxNatWykqKqKiooKkpCSKi4vx\n89MfACIi3nTTMC8vL+e9997jn//5n/nlL38JgGEYXO9ao5ycHFJTUwkMDCQ8PJyIiAjy8/NJTEx0\nLWOxWDqxfBGR3sPdNZ43PWR+9tlneemll9ocXVssFtatW8fYsWNZtGgRZ86cAaCyspKwsDDXcmFh\nYVRUVFy3oI7+/Mu//ItH63fGj2ro/vZVg2+0rxq6rv2bcRvmO3fuJCQkBLvd3mZjS5YsobS0lEOH\nDjF06FCee+65G25DR+IiIt7nNsz3799Pbm4uw4cPJzU1lQ8++IDHH3+ckJAQLBYLFouFxYsXk5+f\nD4DVaqWsrMy1fnl5OVar1bvvQERE3If5ypUrKSsro7S0lOzsbB588EE2b95MVVWVa5nt27cTGxsL\nQHJyMtnZ2TQ0NFBaWkpJSQkJCQmdWrDD4ejU7amGntm+avCN9lWDb7QPYDHa0xkDOJ1OfvnLX5Kb\nm8v8+fM5fPgwFouF4cOHs2HDBkJDQ4GWL4BNmzYREBDAmjVrmDZtWtsGLZZ29f+IiMg3bpad7Q7z\nznKjggYNGsTp06e7spRuFxwcTG1tbXeXISI3Mah/f07/ZWh2RwUHBVF77lyH1+8xYd4bj9h743sW\n6YksFgue/qZacD+0sF01uFlfV/OIiJiAwlxExAQU5iIiJqAwFxExAdOG+dmzZ/n3f//3Lm2zsrKS\nRx99tEvbFBEBE49mOXHiBDNmzOB//ud/Om2bnU2jWUR6Bo1m8dCJEyeIjIxkwYIFjB49mnnz5pGX\nl8c999zDqFGjKCgoID09ndWrV7vWiY2N5eTJkyxbtozjx49jt9t5/vnnuXjxIklJSYwbN44xY8aQ\nm5vrWufnP/85kZGRTJw4kblz57q2d/z4cR566CHi4+O5//77OXbsGABPPPEEzzzzDPfeey8jRozg\n7bffdtV75WrYpqYm/vEf/5HY2FjGjh3LK6+80lUfm4j0Qu26n3l3On78OG+//TY2m43x48ezdetW\n1z1jVq5cSVxc3DXrWCwWVq1axeeff05hYSHQEq7bt28nKCiIU6dOcffdd5OcnExBQQHvvPMOR44c\noaGhgbvuuov4+HgAnnrqKTZs2EBERAR/+tOfePrpp3n//fcBqK6uZt++fRw9epTk5GRmz57dpob/\n/M//5Msvv+Tw4cP4+fn1uguiRKRr+XyYDx8+nOjoaACio6NJSkoCICYmhhMnTlw3zOHaP2eam5tZ\nvnw5H330EX5+flRWVlJTU8O+ffuYOXMmffr0oU+fPq7ZlC5evMj+/fvb9IE3NDQALV8WM2fOBCAq\nKoqamppr2n///fdZsmSJ69bBwcHBnnwMIiJu+XyYf+tb33I99vPzo0+fPq7Hly9fJiAggObmZtcy\n9fX1193Ob37zG06dOsXBgwfx9/dn+PDh1NfXX9MPdeVxc3MzwcHBriP7q12po/U6V1N/uIh0FZ/u\nM2+P8PBwDh48CMDBgwcpLS0FICgoyDXNHcC5c+cICQnB39+fv
Xv3cvLkSSwWC/feey/vvvsuX3/9\nNRcuXOB3v/uda/3hw4fz1ltvAS3BfOTIkXbXNWXKFDZs2EBTUxOAullExKt8Psyvntyi9b8tFguz\nZ8+mtraWmJgYXn31VUaPHg3AHXfcwb333ktsbCzPP/888+bN48CBA4wZM4Zf//rXREVFARAfH09y\ncjJjxozhu9/9LrGxsQwYMABoOZrfuHEjcXFxxMTEtDlpenUdVz9evHgx3/nOdxgzZgxxcXFs2bKl\nkz8ZEZFvmHZo4q24ePEit99+O3V1dTzwwAO89tprN+yL70wamijSM/SEoYk+32feFZ566imKioqo\nr6/niSee6JIgFxHpTO06Mm9qaiI+Pp6wsDDeffddamtrmTNnDidPniQ8PJxt27YxcOBAADIyMti0\naRP+/v6sXbuWqVOntm3QB4/Mu0tvfM8iPVFPODJvV5/5mjVrsNlsrv7gzMxMpkyZQnFxMZMnTyYz\nMxOAoqIitm7dSlFREbt37+bpp59uM9JERES846ZhXl5eznvvvcfixYtd3wq5ubmkpaUBkJaWxo4d\nOwDIyckhNTWVwMBAwsPDiYiIcE32LCIi3nPTPvNnn32Wl156iXOtpjuqqalxzfkZGhrqumimsrKS\nxMRE13JhYWFUVFRcs8309HTXY4fD4ROToYqI+BKn04nT6Wz38m7DfOfOnYSEhGC322+4UYvFcs3w\nwatfv1rrMBcRuRlP5+D0dP7N7nD1ge6KFSvcLu82zK/cA+W9996jvr6ec+fOMX/+fEJDQ6murmbI\nkCFUVVUREhICgNVqpayszLV+eXk5VqvVg7cjIgKnz5/36ASkxcPJmHsCt33mK1eupKysjNLSUrKz\ns3nwwQf59a9/TXJyMllZWQBkZWW57lOSnJxMdnY2DQ0NlJaWUlJSQkJCgvffhYhIL3dLV4Be6TJZ\ntmwZe/bsYdSoUXzwwQcsW7YMAJvNRkpKCjabjYceeoj169e77YLpqP79B7m6d7zx07//oHbXUltb\ny6xZs+jXrx/h4eG60lNEukWPvAK05QvCm2W3v5bU1FQANm7cSGFhIdOnT2f//v3YbLabt6Jx5iLt\n4uk4704Z4+1B+51Wg5v1FebXb6FdtVy8eJFBgwbx+eefExERAbQM1fyrv/orMjIybt6KwlykXRTm\nPXymIV9XXFxMQECAK8gBxo4dy+eff96NVYlIb6Qw98CFCxfo379/m+euvvWuiEhXUJh7oF+/fm0u\npgI4e/YsQUFB3VSRiPRWCnMPjBo1isuXL/PFF1+4njt8+DAxMTHdWJWI9EY6AXr9Fm5pNIvFYuH1\n11/n4MGDPPzww/z3f/+3a/ILt63oBKhIu+gEqElPgAYFBdPy0Xjnp2X77bN+/XouXbpESEgIjz32\nGP/xH//RriAXEelMPfLI3Cx643sW6QgdmZv0yFxERNpSmIuImIDCXETEBBTmIiImoDAXETEBhbmI\niAkozEVETEBhLiJiAm7DvL6+ngkTJhAXF4fNZmP58uVAy4TMYWFh2O127HY7u3btcq2TkZHByJEj\niYyMJC8vz7vVi4gI0I4rQOvq6ujbty+XL1/mvvvu4+WXX+b9998nKCiIpUuXtlm2qKiIuXPnUlBQ\nQEVFBUlJSRQXF+Pn9813RmdcAerpTN03096ZvF955RV+9atf8dlnn5Gamsobb7xxS+3oClCR9tEV\noJ1wBWjfvn0BaGhooKmpieDglvuWXG+jOTk5pKamEhgYSHh4OBEREeTn53e09hu6MlO3t37a+0Vh\ntVp54YUXWLhwYWe9NRGRDgm42QLNzc3cddddHD9+nCVLlhAdHc1bb73FunXr2Lx5M/Hx8axevZqB\nAwdSWVlJYmKia92wsDAqKiqu2WZ6errrscPhwOFwdMqb6WqzZs0C4MCBA5SXl3dzNSJiJk6nE6fT\n2e7lbxrmfn5+HDp0iLNnz
zJt2jScTidLlizhpz/9KQAvvPACzz33HBs3brzu+i23q22rdZibgbpK\nRKSzXX2gu2LFCrfLt3s0y4ABA5g+fToHDhwgJCQEi8WCxWJh8eLFrq4Uq9VKWVmZa53y8nKsVust\nvoWe53pfWCIiXcltmJ86dYozZ84AcOnSJfbs2YPdbqe6utq1zPbt24mNjQUgOTmZ7OxsGhoaKC0t\npaSkhISEBC+W7xt0ZC4i3c1tN0tVVRVpaWk0NzfT3NzM/PnzmTx5Mo8//jiHDh3CYrEwfPhwNmzY\nAIDNZiMlJQWbzUZAQADr16/vFUetveE9iohv65GTU3TGMCG326d9R9tNTU00NjayYsUKKioqeO21\n1wgICMDf37997Whooki7aGiiSSenCA4K8uKkcS3bb4+f//zn9O3bl1WrVvHmm2/y7W9/mxdffLGz\n3qaISLv1yCNzs+iN71mkI3RkbtIjcxERaUthLiJiAgpzERETuOkVoF0lODi41w3xu3KfGxERT/nM\nCVARkRvRCVCdABUR6RUU5iIiJqAwFxExAYW5iIgJ+FyYD+rf33V73Y7+DOrfv7vfhkin0O+DtJfP\njWbxhbPGIr5Cvw8tNJpFo1lERHoFhbmIiAkozEVETMBtmNfX1zNhwgTi4uKw2WwsX74cgNraWqZM\nmcKoUaOYOnWqa2o5gIyMDEaOHElkZCR5eXnerd7EPD3xpZNeIr3LTU+A1tXV0bdvXy5fvsx9993H\nyy+/TG5uLnfeeSc//vGPWbVqFadPnyYzM5OioiLmzp1LQUEBFRUVJCUlUVxcjJ/fN98ZOgHaPt19\nwkd8g34fWnT374Mv7AePT4D27dsXgIaGBpqamggODiY3N5e0tDQA0tLS2LFjBwA5OTmkpqYSGBhI\neHg4ERER5Ofnd7h46T6+MCROf52ItN9N75rY3NzMXXfdxfHjx1myZAnR0dHU1NQQGhoKQGhoKDU1\nNQBUVlaSmJjoWjcsLIyKioprtpmenu567HA4cDgcHr4N6Wynz5/3/Ejk/PlurcHT9qXFoP79Oe3B\nZxkcFETtuXOdWFHv4HQ6cTqd7V7+pmHu5+fHoUOHOHv2LNOmTWPv3r1tXr9yFHQj13utdZiLiG/T\nl2r3uPpAd8WKFW6Xb/dolgEDBjB9+nQ+/fRTQkNDqa6uBqCqqoqQkBAArFYrZWVlrnXKy8uxWq23\nUr+IT1FXj/QUbsP81KlTrpEqly5dYs+ePdjtdpKTk8nKygIgKyuLmTNnApCcnEx2djYNDQ2UlpZS\nUlJCQkKCl9+CiPdcOSrt6I8n3RMit8JtN0tVVRVpaWk0NzfT3NzM/PnzmTx5Mna7nZSUFDZu3Eh4\neDjbtm0DwGazkZKSgs1mIyAggPXr1/e62YNERLqDz92bpY/FQqOHbQQCDRqKZY6hWN3Yvi/UoP3g\nGzX4zH5ws77PzAF6RUuQe/axNaK/BkSkd9Hl/CIiJqAwFxExAYW5iIgJKMxFRExAYS4iYgIKcxER\nE1CYi4iYgMJcRMQEFOYiIiagMBcRMQGFuYiICSjMRURMQGEuImICCnMRERNQmIuImIDbMC8rK2PS\npElER0cTExPD2rVrgZYJmcPCwrDb7djtdnbt2uVaJyMjg5EjRxIZGUleXp53q/cSzfsoIj2N25mG\nqqurqa6uJi4ujgsXLjBu3Dh27NjBtm3bCAoKYunSpW2WLyoqYu7cuRQUFFBRUUFSUhLFxcX4+X3z\nnXGz2TJappnzfE4PzaxikplVurF9X6hB+8E3avCZ/eBmfbdH5kOGDCEuLg6Afv36ERUVRUVFBdyg\nqJycHFJTUwkMDCQ8PJyIiAjy8/M7XLyIiLRPu6eNO3HiBIWFhSQmJrJv3z7WrVvH5s2biY+PZ/Xq\n1QwcOJDKykoSExNd64SFhbnCv7X09HTXY4fDgcPh8OhNiIiYjdPpxOl0tnv5dk3ofOHCBRw
OBz/5\nyU+YOXMmX331FYMHDwbghRdeoKqqio0bN/IP//APJCYmMm/ePAAWL17Md7/7XR555JFvGlQ3S4+o\nwWf+rOzG9n2hBu0H36jBZ/ZDR7tZABobG5k9ezaPPfYYM2fOBCAkJMR1sm/x4sWurhSr1UpZWZlr\n3fLycqxWa4eLFxGR9nEb5oZhsGjRImw2Gz/84Q9dz1dVVbkeb9++ndjYWACSk5PJzs6moaGB0tJS\nSkpKSEhI8FLpIiJyhds+83379vHmm28yZswY7HY7ACtXrmTLli0cOnQIi8XC8OHD2bBhAwA2m42U\nlBRsNhsBAQGsX7/+L90mcqsCafmzzJP1RaT3aFefeac2qD7zdtfg2efQvZ9BSwXqq+3uz8AXatB+\n6MQaPOkzFxER36cwFxExAYW5iIgJKMxFRExAYS4iYgIKcxERE1CYi4iYgMJcRMQEFOYiIiagMBcR\nMQGFuYiICSjMRURMQGEuImICCnMRERNQmIuImIDCXETEBNyGeVlZGZMmTSI6OpqYmBjWrl0LQG1t\nLVOmTGHUqFFMnTqVM2fOuNbJyMhg5MiRREZGkpeX593qRUQEuMlMQ9XV1VRXVxMXF8eFCxcYN24c\nO3bs4I033uDOO+/kxz/+MatWreL06dNkZmZSVFTE3LlzKSgooKKigqSkJIqLi/Hz++Y7oyfMNNTH\nYqHRg9YDgQbNNNTts7tohhvfqEH7oRNrcLO+2zlAhwwZwpAhQwDo168fUVFRVFRUkJuby4cffghA\nWloaDoeDzMxMcnJySE1NJTAwkPDwcCIiIsjPzycxMbHNdtPT012PHQ4HDoejg2/PO1qCvOMfeqNH\ns3eKiIDT6cTpdLZ7ebdh3tqJEycoLCxkwoQJ1NTUEBoaCkBoaCg1NTUAVFZWtgnusLAwKioqrtlW\n6zAXEZFrXX2gu2LFCrfLt+sE6IULF5g9ezZr1qwhKCiozWsWi+UvXQLX5+41ERHpHDcN88bGRmbP\nns38+fOZOXMm0HI0Xl1dDUBVVRUhISEAWK1WysrKXOuWl5djtVq9Ubd4WSAtfXye/AR2edUivZfb\nMDcMg0WLFmGz2fjhD3/oej45OZmsrCwAsrKyXCGfnJxMdnY2DQ0NlJaWUlJSQkJCghfLF2/55rxB\nx388OYksIrfG7WiWjz/+mPvvv58xY8a4uksyMjJISEggJSWFL7/8kvDwcLZt28bAgQMBWLlyJZs2\nbSIgIIA1a9Ywbdq0tg32gNEs3T2SxBdq8JX9oFEUPjKKohvb94UafGY/uMtOd2HuDQrznlGDr+wH\nhYiPhEg3tu8LNfjMfnCzvq4AFRExAYW5iIgJKMxFREyg3RcNiXS1K8MjPVlfpLdQmIvP0m0VRNpP\n3SwiIiagMBcRMQGFuYiICSjMRURMQGEuImICCnMRERNQmIuImIDCXETEBBTmIiImoDAXETEBt2G+\ncOFCQkNDiY2NdT2Xnp5OWFgYdrsdu93Orl27XK9lZGQwcuRIIiMjycvL817VIiLShtswX7BgAbt3\n727znMViYenSpRQWFlJYWMhDDz0EQFFREVu3bqWoqIjdu3fz9NNP09zc7L3KRUTExW2YT5w4keDg\n4Guev95sFzk5OaSmphIYGEh4eDgRERHk5+d3XqUiInJDHbpr4rp169i8eTPx8fGsXr2agQMHUllZ\nSWJiomuZsLAwKioqrrt+enq667HD4cDhcHSkDBGv0214pbs4nU6cTme7l7/lMF+yZAk//elPAXjh\nhRd47rnn2Lhx43WXvTIJ9NVah7mIL9NteKW7XH2gu2LFCrfL3/JolpCQECwWCxaLhcWLF7u6UqxW\nK2VlZa7lysvLsVqtt7p5ERHpgFsO86qqKtfj7du3u0a6JCcnk52dTUNDA6WlpZSUlJCQkNB5lYqI\nyA257WZJTU3lww8/5NSpUwwbNowVK1bgdDo5dOgQFou
F4cOHs2HDBgBsNhspKSnYbDYCAgJYv379\nDbtZRESkc1mM6w1N8WaDFst1R8O0ft2TPsq/bMVtGzdd2+MaPGvfF2rQfvCNGiwWSyfsheuPQOuq\nGjxt3xdq8Jn94GZ9XQEqImICCnMRERNQmIuImIDCXETEBBTmIiImoDAXETEBhbmIiAkozEVETEBh\nLiJiAgpzERETUJiLiJiAwlxExAQU5iIiJqAwFxExAYW5iIgJdGhCZxHpGp5OKH1lG2J+bo/MFy5c\nSGhoqGtqOIDa2lqmTJnCqFGjmDp1KmfOnHG9lpGRwciRI4mMjCQvL897VYv0Et9MKN3xn8auLlq6\nhdswX7BgAbt3727zXGZmJlOmTKG4uJjJkyeTmZkJQFFREVu3bqWoqIjdu3fz9NNP09zc7L3KRUTE\nxW2YT5w4keDg4DbP5ebmkpaWBkBaWho7duwAICcnh9TUVAIDAwkPDyciIoL8/HwvlS0iIq3dcp95\nTU0NoaGhAISGhlJTUwNAZWUliYmJruXCwsKoqKi47jbS09Ndjx0OBw6H41bLEBExNafTidPpbPfy\nHp0AtVgsf5nw9savX0/rMBcRkWtdfaC7YsUKt8vf8tDE0NBQqqurAaiqqiIkJAQAq9VKWVmZa7ny\n8nKsVuutbl5ERDrglsM8OTmZrKwsALKyspg5c6br+ezsbBoaGigtLaWkpISEhITOrVZERK7LbTdL\namoqH374IadOnWLYsGH87Gc/Y9myZaSkpLBx40bCw8PZtm0bADabjZSUFGw2GwEBAaxfv95tF4yI\niHQei2EYRpc2aLHgrsmWLwBPS3Lfxk3X9rgGz9r3hRq0H3yjBl/ZD559AnTKfujOGjxtv9NqcLO+\nLucXETEBhbmIiAkozEVETEBhLiJiAgpzERETUJiLiJiAwlxExAQU5iIiJqAwFxExAYW5iIgJKMxF\nRExAYS4iYgIeTU4hItIVAmm5UZUn65udwlxEfF4j4MndIxs9+iroGdTNIiJiAgpzERET6HA3S3h4\nOP3798ff35/AwEDy8/Opra1lzpw5nDx50jUL0cCBAzuzXhHpYuqv7hk6fGRusVhwOp0UFhaSn58P\nQGZmJlOmTKG4uJjJkyeTmZnZaYWKSPf4pr+6Yz+NXV5x7+RRN8vVUxjl5uaSlpYGQFpaGjt27PBk\n8yIi0k4d7maxWCwkJSXh7+/PD37wA5588klqamoIDQ0FIDQ0lJqamuuum56e7nrscDhwOBwdLUNE\nxJScTidOp7Pdy3d4QueqqiqGDh3Kn//8Z6ZMmcK6detITk7m9OnTrmUGDRpEbW1t2wY1oXOPqEH7\nwTdq0H7wjRpMPaHz0KFDARg8eDCzZs0iPz+f0NBQqqurgZawDwkJ6ejmRUTkFnQozOvq6jh//jwA\nFy9eJC8vj9jYWJKTk8nKygIgKyuLmTNndl6lIiJyQx3qM6+pqWHWrFkAXL58mXnz5jF16lTi4+NJ\nSUlh48aNrqGJIiI9nafDM69sw5s63Gfe4QbVZ94jatB+8I0atB98owZf2Q9e6TMXERHfoTAXETEB\nhbmIiAkozEVETEBhLiJiAgpzERETUJiLiJiAwlxExAQU5iIiJqAwFxExAYW5iIgJKMxFRExAYS4i\nYgIKcxERE1CYi4iYQA8Mc2d3F4Bq8IX2QTX4QvugGnyhfS+F+e7du4mMjGTkyJGsWrWqk7fu7OTt\ndYSzuwug+2vo7vZBNfhC+6AafKF9L4R5U1MTf//3f8/u3bspKipiy5YtHD16tLObERGRVjo9zPPz\n84mIiCA8PJzAwEC+//3vk5OT09nNiIhIK50+B+hbb73F73//e1577TUA3nzzTf70pz+xbt26lgYt\nnk6LKiLSO7mL64DObuxmYd3F80eLiPQKnd7NYrVaKSsrc/27rKyMsLCwzm5GRERa6fQwj4+Pp6Sk\nhBMnTtDQ0MDWrVt
JTk7u7GZERKSVTu9mCQgI4JVXXmHatGk0NTWxaNEioqKiOrsZERFppdNPgHrD\n0aNHycnJoaKiAoCwsDCSk5N94kvijTfeYMGCBV5v5+jRo1RWVjJhwgT69evnen737t38zd/8jdfb\nB/j4448ZNGgQNpsNp9PJgQMHsNvtTJ48uUvav57HH3+czZs3d0vbH330Efn5+cTGxjJ16tQuafOT\nTz4hKiqKAQMGUFdXR2ZmJgcPHiQ6Opp/+qd/YsCAAV6vYe3atcyaNYthw4Z5va3r+frrr8nOzsZq\ntZKUlMRvfvMb9u/fj81m46mnniIwMLBL6jh+/DjvvPMO5eXl+Pn5MXr0aObOnUv//v27pP2r+XyY\nr1q1ii1btvD973/f1fdeVlbG1q1bmTNnDsuXL+/W+oYNG9bmHIE3rF27lldffZWoqCgKCwtZs2YN\nM2fOBMBut1NYWOjV9gGWL1/O3r17aWpqYtKkSfzxj39k+vTp7NmzhxkzZvCjH/3I6zXMmDEDi8XS\n5iT6Bx98wIMPPojFYiE3N9er7SckJJCfnw/Aa6+9xquvvsqsWbPIy8vj4Ycf7pL/izabjSNHjhAQ\nEMCTTz7J7bffzt/+7d/yhz/8gSNHjvDOO+94vYYBAwbQt29fRowYwdy5c3n00UcZPHiw19u9Yu7c\nuTQ1NVFXV8fAgQO5cOECjzzyCH/4wx8AyMrK8noNa9asYefOnTzwwAP87ne/w263M3DgQLZv3876\n9euZNGmS12u4huHjIiIijIaGhmue//rrr40RI0Z0SQ0xMTE3/OnTp4/X24+OjjbOnz9vGIZhlJaW\nGuPGjTP+9V//1TAMw4iLi/N6+4ZhGFFRUUZjY6Nx8eJFo1+/fsaZM2cMwzCMuro6IzY2tktqiIuL\nM+bOnWt88MEHhtPpNPbu3WsMGTLEcDqdhtPp7JL2rxg3bpzx1VdfGYZhGBcuXDCio6O93r5hGEZk\nZKTrsd1ub/PamDFjuqSGuLg4o6mpyfj9739vLFiwwLjzzjuNadOmGb/61a+Mc+fOeb39mJgYwzAM\no7Gx0Rg8eLDR2NhoGIZhNDc3u17ztujoaOPy5cuGYRjGxYsXjfvvv98wDMM4efKkMXbs2C6p4Wqd\n3mfe2fz9/amoqCA8PLzN85WVlfj7+3dJDV999RW7d+8mODj4mtfuuecer7dvGIarayU8PByn08ns\n2bM5efJklw317NOnDwEBAQQEBDBixAjXn/Pf/va38fPrmlv8HDhwgDVr1vDiiy/y0ksvYbfbue22\n23jggQckwWH8AAAF80lEQVS6pP2mpiZqa2sxDIOmpibX0ejtt99OQEDX/CpFR0ezadMmFi5cyNix\nYykoKGD8+PEUFxfTp0+fLqkBwM/Pj6lTpzJ16lQaGhrYtWsXW7Zs4bnnnuPUqVNebbu5uZmvv/6a\nuro6Ll26xNmzZ7njjjuor6+nubnZq21fYbFYaGxsxN/fn/r6ei5evAjAd77zHRobG7ukhqv5fJj/\n27/9G0lJSURERLj66MrKyigpKeGVV17pkhqmT5/OhQsXsNvt17zWFUESEhLCoUOHiIuLA6Bfv37s\n3LmTRYsWceTIEa+3D/Ctb32Luro6+vbty8GDB13PnzlzpsvC3N/fn6VLl5KSksKzzz5LSEgIly9f\n7pK2Ac6dO8e4ceOAll/mqqoqhg4dyvnz57ushtdff51nnnmGX/ziFwwePJh77rmHsLAwhg0bxuuv\nv95ldbTWp08fvve97/G9733PFWre9NhjjxEVFUVgYCCrV69m4sSJ3HPPPXzyySekpaV5vX2AxYsX\nM378eCZMmMBHH33E888/D7Qc+N1xxx1dUsPVfL7PHFqOiPLz86moqMBisWC1WomPj++yo6HuVlZW\nRmBgIEOGDGnzvGEY7Nu3j/vuu8/rNdTX13Pbbbdd8/ypU6eoqqoiNjbW6zVcbefOn
ezfv5+VK1d2\nedut1dXVUVNTw/Dhw7uszbNnz1JaWsrly5cJCwu75v+GNx07dozRo0d3WXvXc+LECfr378+gQYM4\nfvw4Bw4cIDIykrFjx3ZZDZ999hn/+7//S0xMDJGRkV3W7o30iDAXERH3euD9zEVE5GoKcxERE1CY\ni4iYgMJcRMQEFOYiIiagMBcRMQGFuYiICSjMRURMQGEuImICCnMRERNQmIuImIDCXETEBBTmIiIm\noDAXETEBhbmIiAkozEVETEBhLiJiAgpzERETUJiLiJiAwlxExAQU5iIiJqAwFxExgf8HPiTB\nc+MSl4kAAAAASUVORK5CYII=\n" } ], "prompt_number": 22 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The same approach can also be used to plot the frequency of compounds containing a specific substructure (naphthalene in this case) with respect to the predicted probability of being mutagenic. This suggests that the model may have learned to recognize this motif (i.e., the fingerprint bits corresponding to it) as an indicator of mutagenicity." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "polyarom = Chem.MolFromSmiles('c1cccc2c1cccc2')\n", "polyarom" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "pyout", "png": "iVBORw0KGgoAAAANSUhEUgAAAcIAAACWCAYAAABNcIgQAAAWNUlEQVR4nO3de1RVdeIF8K1WDjZr\nJhAfRcxEk1xeiigoig9QZCTKZsDSlVnOMFr4GBEubxAzi8RHMmKMSJpoOj4rRUFpUBQUFXkFCNcy\nQcVmFSh3xhne9/dH+VtW91wB4X7P7ezPWv3T3Wu1a63W9px77vn20el0OhARESlUX9EFiIiIROIQ\nEhGRonEIiYhI0TiERESkaBxCIiJSNA4hEREpGoeQiIgUjUNIRESKxiEkIiJF4xASEZGicQiJiEjR\nOIRERKRoHEIiIlI0DiERESkah5CIiBSNQ0hERIrGISTqpqKiIvj5+eGRRx4R+pefnx+KiopE/+cg\nMll9eEI9Udd1dHTAzc0NDg4OCAoKEtrl/fffR1VVFc6fP4++fflnW6Kuekh0ASJTtGPHDtTU1CA7\nOxsWFhZCu9jZ2WHYsGHYuXMnXn31VaFdiEwR//hI1EVarRYRERGIjY0VPoIAYGFhgdjYWISHh0Or\n1YquQ2RyOIREXbR69WpYWlpi8eLFoqv8vyVLlsDS0hKJiYmiqxCZHH5HSNQFly9fhpOTE44cOQJv\nb2/RdX7g6NGj8Pf3R3l5OZ555hnRdYhMBoeQqAsCAgLQ3NyMjIwM0VX08vPzg5mZGfbv3y+6CpHJ\n4K1Rok7Kzs7GkSNHsGHDBsnM2bNnsWDBAvTGny91Oh0WLFiAs2fPSmbWr1+Pw4cP4+TJkz3+zyf6\nueIQEnVCW1sbli1bhkWLFknedmxvb8fChQsxYMAA9OnTp8c79OnTB2ZmZli4cCHa29v1ZlQqFRYt\nWoTg4GDJDBH9EIeQqBPS0tJQX1+P+Ph4yczWrVtx/fp1g5kHtWLFCly/fh3btm2TzCxfvhx1dXVI\nS0vrtR5EPyccQqL7aGhoQExMDFasWIFf/epXejO3b99GbGwsVq5cCXNz817rYm5ujjfffBMxMTFo\nbGzUm3nsscewcuVKxMXF4fbt273Whejngg/LEN1HSEgITp06ZfDNLWq1GtnZ2SgqKkK/fv16tU97\neztGjRoFHx8frFmzRjIzevRoTJ06FevWrevVPkSmjkNIZEBFRQVcXFyQk5ODCRMm6M1UV1djxIgR\nyMrKgpeXl1F65eTkwNfXF2VlZVCpVHozJ06cwPTp0w1miIhDSGSQn58fBgwYgH379klmnnvuOfTv\n3x8HDhwwYjPA398fra2tOHz4sGRm5syZaGpqku3PPYjkgENIJOHo0aMICAhAZWUlbGxs9GYyMzPh\n7+9vMNNbrly5AkdHRxw8eBC+vr56M1999RUcHBwMZoiUjkNIpEdrayuGDx+OmTNnYtWqVZKZESNG\nwN/fH2+//baRG34nOjoaH3/8McrKyvDwww/rzcTExODgwYMGM0RKxqdGifRITk7Gv//9b0RGRhrM\naLVaREVFGbHZD0VHR0Or1WLTpk2SmaioqPtmiJSMV4REP/LNN9/A1tYWSUlJkscaff3111CpVNi4\ncaPwo4+2b9+O4OBgaDQaDBo0yGDm8uXLsLS0NHJDInnjEBL9SFBQEIqLi3H27FnJN8S88cYbKCkp\nMZgxFp1Oh3HjxsHFxQUpKSmSGXd3d4wePRrvv/++kRsSyRuHkOgepaWlcHV1xenTp+Hu7q43c/Hi\nRbi7u6OgoACjR482ckP9CgoKMHHiRBQWFsLZ2dlg5uLFixgxYoSRGxLJF4eQ6B5TpkyBlZUVd
uzY\nofdznU6HSZMm4emnn8b27duN3M6wV199FdevX0dOTo5k5pVXXkFdXZ3BDJHScAiJvnfgwAHMmzcP\nVVVVsLKy0pvZv38/AgMDodFoMGTIECM3NOzGjRuws7PDhx9+iICAAMmMSqVCeno6/P39jdyQSJ74\n1CgRgKamJoSFhSEiIkJyBP/73/8iJCQE4eHhshtBALCyskJkZCTCwsLQ1NQkmYmIiIBarZbMECkN\nh5AI353jp9PpoFarDWb69+9vMCNaaGgoABh8v2hYWBh0Oh3ee+89Y9UikjXeGiXFq6urg0qlwtat\nW/Hiiy/qzdTW1sLe3h7p6emStx3lYv/+/Xjttddw6dIl/OY3v9Gb2bt3LwIDA1FdXY0nnnjCyA2J\n5IVDSIo3b948XL161eCp7nPnzsW//vUvHD9+3HjFHoCXlxesra2Rnp4umfH09ISNjY3Bsw2JlIBD\nSIp2/vx5eHh44MKFCxg5cqTeTF5eHqZMmYLi4mI4Ojoat2A3debUjJKSEri5uSE/Px9jxowxckMi\n+eAQkmLpdDqMHz8ew4cPR2pqqt5MR0cH3NzcMHbsWJP7IXpQUBAuXLhg8BzF+fPno6KiAvn5+cJf\nDEAkCh+WIcXatWsXqqqqJF+qDQDp6emoqakxmJGrt99+G1evXpX8TeTdTGVlJXbt2mXEZkTywiEk\nxWpsbISbmxsGDx6s93OtVovIyEjExcXBwsLCyO0enIWFBWJjYxEREQGtVqs3M3jwYLi5uaGxsdHI\n7Yjkg7dGSbG+/PJLODo64tChQ/Dx8fnJ5y0tLfjggw8QGBiIRx55REDDB3e/f4fjx49jxowZqKio\nwO9+9zsBDYnE4xCSoqnVamRmZqK0tBQPPfSQ6DpG1dbWBmdnZzz77LNYs2aN6DpEwvDWKClafHw8\nGhoaJB+W+TnbvHkzbt26heXLl4uuQiQUrwhJ8bZs2YLIyEhoNBoMHDhQdB2jqK+vx7Bhw5CYmIi/\n/OUvousQCcUhJMXr6OjA2LFjMX78eCQlJYmuYxR//etfUVBQgIKCAsmfVhApBYeQCMCZM2cwefJk\nFBUVYfjw4aLr9KrPP/8co0aNQm5uLsaPHy+6DpFwHEKi782ePRvffvstPvvsM9FVetXUqVMxePBg\n7N69W3QVIlngEBJ97/r167Czs8OuXbswY8YM0XV6xaeffoo5c+agqqoKTz75pOg6RLLALweIvvfk\nk09CrVYjJCQEzc3Nouv0uObmZoSEhECtVnMEie7BISS6R0REBNra2vC3v/1NdJUel5SUhPb2dkRE\nRIiuQiQrvDVK9CO7d+/G66+/jurqajz++OOi6/SImzdvQqVSITU1FbNnzxZdh0hWOIREekyePBnD\nhg1DWlqa6Co9IjAwEF988QVyc3NFVyGSHQ4hkR7FxcUYM2YMzpw5Azc3N9F1HsjdMxfPnz8PFxcX\n0XWIZIdDSCQhMDAQly5dMumz+nQ6HTw8PODg4PCzubol6ml8WIZIQkJCAiorK/GPf/xDdJVu2717\nNyorK/HOO++IrkIkWxxCIgmDBw9GdHQ0wsPDcefOHdF1uuzOnTsIDw9HTEyM5JmLRMQhJDIoODgY\nAwYMMMljihITE/Hoo49i6dKloqsQyRq/IyS6j0OHDmH27NmorKzEU089JbpOp1y9ehUODg7Ys2cP\nnn/+edF1iGSNQ0jUCb6+vvj1r39tMt8Xzpo1C1qtFpmZmaKrEMkeh5CoEy5dugRnZ2dkZ2dj8uTJ\nejM3b95ERkaGUfo899xzkj/2z83NxbRp01BaWgp7e3uj9CEyZRxCok5aunQpTp06hcLCQvTr1+8n\nn1+8eNFoh9ympaVh9OjRP/n77e3tcHV1xeTJk7FhwwajdCEydRxCok66desWbG1tkZCQINtT3bds\n2YLo6GhoNBqYm5uLrkNkEvjUKFEnmZub480330RMTAwaG
xtF1/mJ27dvIzY2FitXruQIEnUBrwiJ\nuqC9vR2jRo2Cj4+P7H5SoVarkZ2djaKiIr23bolIPw4hURfl5OTA19cXZWVlUKlUousAAKqrqzFi\nxAhkZWXBy8tLdB0ik8IhJOoGf39/tLa24vDhw6KrAPjuKdL+/fvjwIEDoqsQmRwOIVE3XLlyBY6O\njjh48CB8fX2FdsnMzIS/vz8qKythY2MjtAuRKeIQEnVTcHAwsrOz4eHhIbRHXl4efHx8+HMJom7i\nU6NEJs5Uj4gikgteERJ1Q3l5OVxcXHDy5EnhV4T5+fnw8vJCUVERnJychHYhMkUcQqJu8Pb2xsCB\nA7Fnzx7RVQAAL730Em7duoXs7GzRVYhMDoeQqIsyMjIwa9YsVFVVwdraWnQdAMC1a9dgZ2eHvXv3\nws/PT3QdIpPC7wiJuqC5uRnBwcFQq9WyGUEAsLa2RmhoKIKDg9Hc3Cy6DpFJ4RASdcHGjRvR2tqK\nyMhI0VV+IioqCi0tLUhOThZdhcik8NYoUSfdvHkTKpUKKSkpmDNnjug6en300UcICgqCRqPB0KFD\nRdchMgkcQqJOmj9/Pqqrq5GbmyvbnyzodDpMmjQJ9vb2SE1NFV2HyCRwCIk64cKFCxg/fjzOnTuH\nUaNGia5jUFFREcaOHYuzZ8/C1dVVdB0i2eMQEt2HTqfDhAkToFKpsHXrVtF1OuVPf/oTNBoN8vLy\nZHv1SiQXfFiG6D727NmDiooKJCQkiK7Sae+++y7Ky8uxd+9e0VWIZI9DSGTAnTt3EBYWhqioKAwZ\nMkR0nU4bMmQIoqKioFarcefOHdF1iGSNQ0hkwJo1azBgwAAsW7ZMdJUuCwkJgZmZGdauXSu6CpGs\n8TtCIgk1NTWwt7fH7t278cILL4iu0y2ffPIJXn75ZVy6dAm//e1vRdchkiUOIZGEl19+GQ0NDcjK\nyhJd5YFMnz4dFhYW2LVrl+gqRLLEISTS49SpU/D29kZpaSns7e1F13kglZWVGDlyJP75z39i4sSJ\nousQyQ6HkOhHOjo64OrqCg8PD2zcuFF0nR6xePFinDlzBoWFhejbl48GEN2L/0cQ/ci2bdtw7do1\nrFy5UnSVHvPWW2+htrYWH374oegqRLLDISS6R2NjI6KjoxEfHw9zc3PRdXqMubk54uPjERUVhcbG\nRtF1iGSFt0aJ7hEeHo6srCwUFxejX79+ouv0qPb2dri4uMDX1xerV68WXYdINjiERN/TaDQYPnw4\nMjMzMWXKFNF1ekVOTg58fX3x+eefw9bWVnQdIlngEBJ9b8aMGejXrx8+/vhj0VV61R/+8AfodDp8\n+umnoqsQyQKHkAhAVlYW/vjHP6KiogJPP/206Dq96sqVK3BwcMAnn3yC6dOni65DJByHkBSvtbUV\nzs7OeOGFF0zqxdoPIjIyEocOHUJpaSkefvhh0XWIhOJTo6R4KSkpaGxsRExMjOgqRhMTE4Pbt2/j\n73//u+gqRMLxipAU7dtvv4WtrS3Wr1+PefPmia5jVNu2bUNoaCg0Gg0sLS1F1yEShkNIirZo0SIU\nFhaioKBAcQfYdnR0YNy4cXBzc0NycrLoOkTC8NYoKVZZWRlSU1ORlJSkdwRbWlqQkpKClpYWAe16\nhqF/h759+2LDhg3YvHkzysrKBLQjkgcOISlWXl4evLy84O7urvfz9vZ2JCYmYt26dUZu1nPWrl2L\nxMREdHR06P183Lhx8PT0RF5enpGbEckHh5AUa+jQocjLy0Ntba3ez83MzLB69Wq88847qKurM3K7\nB1dXV4eEhAQkJibiF7/4hd5MbW0t8vPz8fjjjxu5HZF88DtCUjQfHx8MGTIEO3bskMx4enriqaee\nMrkXVr/22muora3FiRMnJDOvvPIKvvnmGxw7dsyIzYjkhUNIilZRUQEXFxfk5ORgwoQJejMlJSVw\nc3NDXl4exo4da+SG3
XPu3DlMmDABhYWFcHZ21ps5ffo0pk6dipKSEjg4OBi5IZF8cAhJ8RYuXIhz\n587hwoULkmf1vf766ygrK8OZM2dk/3RpR0cHxowZA1dXV8nfCXZ0dMDNzQ3u7u7YtGmTkRsSyQu/\nIyTFW7VqFWpqagzeHl21ahWqqqqwc+dOIzbrnp07d+LLL7/EW2+9JZlJT09HTU2NwQyRUnAISfEs\nLCwQGxuLiIgIaLVavZlBgwYhLi4OUVFR+M9//mPkhp2n1WoRHh6O5cuXY9CgQZKZyMhIxMXFwcLC\nwsgNieSHQ0gEYPHixbC0tDR4Tt+SJUvwy1/+UtZn+SUmJsLS0hJLliyRzLz77rsYNGgQFi1aZMRm\nRPLF7wiJvvfZZ5/Bz88PFRUVeOaZZ/Rmjh49ioCAAFRWVsLGxsbIDQ374osv4OTkhIyMDHh7e+vN\nXL58GU5OTjhy5IhkhkhpOIRE9/Dz84OZmRn2798vmXn22Wfx6KOPYt++fUZsdn8zZ87E//73Pxw5\nckQyExAQgObmZmRkZBixGZG8cQiJ7nH3qurw4cOYNm2a3kxVVRWcnZ1x7NgxeHp6GreghOzsbDz/\n/PMoLy+XvJo9fvw4ZsyYYTBDpEQcQqIfCQ0NxbFjx1BSUoKHHnpIbyYkJATHjx83mDGWtrY2jBw5\nEr///e8lXwfX1tYGZ2dn+Pr6Yu3atUZuSCRvfFiG6Efi4+NRX1+PtLQ0ycyKFSvumzGWLVu2oL6+\nHvHx8QYzDQ0NBjNESsUrQiI9Nm/ejOjoaFy+fFnyJwZ3MxqNBgMHDjRyw+/U19fD1tYWCQkJWLBg\nQbczRErGISTS4+7bWSZNmoT169cbzEycOBHvvfeekRt+Jzg4GPn5+Th37pzkW3GWLVuG06dP4/z5\n85IZIiXjEBJJyMvLw5QpU1BcXAxHR0e9mfz8fHh5eaGoqAhOTk5G7VdeXg4XFxecPHkSHh4eejN3\n36V64sQJyQyR0nEIiQzozE8SXnrpJdy6dQvZ2dlGbAZ4e3tj4MCB2LNnj2Rm2rRpeOyxx2T3Uw8i\nOeEQEhnw1VdfwcHBAQcPHoSvr6/ezLVr12BnZ4e9e/fCz8/PKL0yMjIwa9YsVFVVwdraWm/m6NGj\nePHFFw1miIhPjRIZZGNjg5CQEISEhKC1tVVvxtraGqGhoQgODkZzc3Ovd2pubkZwcDDUarXkwDU3\nN2Pp0qVYtmwZR5DoPjiERPcRFRUFrVaL5ORkg5mWlhaDmZ6yceNGtLa2IjIyUjKzadMmtLS0IDo6\nutf7EJk63hol6oT09HQsXboUGo1G8lSHjz76CEFBQdBoNBg6dGiv9Lh58yZUKhVSUlIwZ84cvZmv\nv/4aKpUKycnJmDt3bq/0IPo54RASdYJOp8O4cePg4uKClJQUycykSZNgb2+P1NTUXukxf/58VFdX\nIzc3V/KA4DfeeAPl5eU4ffq07A8RJpID3hol6oQ+ffpgw4YNSEtLQ2lpqWQmMTERwHej2NN0Oh36\n9OmD1atXSw7cxYsX8cEHHyApKYkjSNRJvCIk6oK5c+fixo0byMnJEV3lJ+5ekdrY2CA9PV10HSKT\nwSEk6oIbN27Azs4O27dvh7+/v+g6P3DgwAHMmzcPVVVVsLKyEl2HyGTw1ihRF1hZWSEiIgJqtRpN\nTU2i6/y/pqYmhIWFITIykiNI1EUcQqIuUqvV0Ol0ku8gFeHu8UuhoaGCmxCZHt4aJeqGffv24c9/\n/jOqq6vxxBNPCO1SV1cHlUqFbdu2YebMmUK7EJkiDiFRN3l6esLS0hIhISFCe6xbtw4NDQ04ceKE\n0B5Epkrs0dpEJmz9+vWIi4uDp6en0B7Tpk2TPJmeiO6PV4RERKRofFiGiIgUjUNIRESKxiEkIiJF\n4xASEZGicQiJiEjROIRERKRoHEIiIlI0DiERESkah5CIiBSNQ0hERIrGISQiIkXjEBI
RkaJxCImI\nSNE4hEREpGgcQiIiUjQOIRERKdr/AbYCZQZaKmKkAAAAAElFTkSuQmCC\n", "prompt_number": 23, "text": [ "" ] } ], "prompt_number": 23 }, { "cell_type": "code", "collapsed": false, "input": [ "testData.groupby(['binnedProb',testData['molecule'] >= polyarom])['mutagenic'].size().unstack().plot(kind='bar',stacked=True,)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "pyout", "prompt_number": 24, "text": [ "" ] }, { "output_type": "display_data", "png": "iVBORw0KGgoAAAANSUhEUgAAAXMAAAEHCAYAAABcCaZFAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3X9cVXWex/HX5YeZgYop6IKzsKHiRYQ7KtIPFRO0ViW0\nicIy1h/VQ3dmK9uZ1bYanJ1RnLJddfLxcIoa2hrRmhTGTaMfXqf0UagjtvPAhFxUfquDvxANhLN/\nMN5A8YLAhcvx/Xw8eDyu555zvp97j7zv4Xu/53wthmEYiIhIj+bR3QWIiEjHKcxFRExAYS4iYgIK\ncxERE1CYi4iYgMJcRMQE2hTm9fX12Gw2Zs6cCUBqaipBQUHYbDZsNhvbt293rLty5UqGDRtGWFgY\nOTk5rqlaRESa8WrLSmvWrMFqtXL+/HkALBYLS5YsYcmSJc3Wy8/PZ9OmTeTn51NaWkpcXBwFBQV4\neOgPABERV2o1zEtKSvjwww/593//d1599VUADMOgpWuNsrKySE5Oxtvbm+DgYEJDQ8nNzSUmJsax\njsVi6cTyRURuHs6u8Wz1lPnZZ5/l5ZdfbnZ2bbFYWLduHZGRkSxYsIAzZ84AUFZWRlBQkGO9oKAg\nSktLWyyovT8///nPO7R9Z/yohu5vXzW4R/uqoevab43TMN+2bRv+/v7YbLZmO1u0aBFFRUXk5eUx\nZMgQnnvuuevuQ2fiIiKu5zTM9+zZQ3Z2NiEhISQnJ/PZZ5/x+OOP4+/vj8ViwWKxsHDhQnJzcwEI\nDAykuLjYsX1JSQmBgYGufQUiIuI8zFesWEFxcTFFRUVkZmZy77338vbbb1NeXu5YZ8uWLURERACQ\nkJBAZmYmtbW1FBUVUVhYSHR0dKcWHBsb26n7Uw09s33V4B7tqwb3aB/AYrSlMwaw2+28+uqrZGdn\nM3fuXA4ePIjFYiEkJIQNGzYQEBAANH4AvPnmm3h5ebFmzRqmTZvWvEGLpU39PyIi8r3WsrPNYd5Z\nrlfQgAEDOH36dFeW0qP4+flRVVXV3WWI3JQG9O3L6b8NzW4vP19fqs6da/f2PSbMdcbunN4fke5j\nsVjo6G+fBedDC9tUg5PtdTWPiIgJKMxFRExAYS4iYgIKcxERE7ipw/x3v/sdP/nJTzp1n6mpqaxe\nvbpT9yki0pqbOsxdcasB3b5ARLpDjw/zo0ePEhYWxrx58xgxYgSPPvooOTk53H333QwfPpy9e/dS\nVVVFYmIikZGR3Hnnnfzv//7vNfs5efIkP/rRj4iOjiY6Opo9e/YAUF1dzbx58xg9ejSRkZFs2bIF\nAB8fH8e277//PvPmzbtmn0eOHOH+++9n7NixTJw4kcOHD7voXRCRm12b7mfu7o4cOcIf/vAHrFYr\n48aNY9OmTezevZvs7GxWrFjB0KFDGTNmDFu3bmXnzp08/vjjHDhwoNmYzaeffppnn32Wu+++m+PH\nj3PfffeRn5/Pf/zHf+Dn58fXX38N4LhDZNMz8Kv
Pxq/8+8knn2TDhg2Ehoby1VdfsXjxYj799FNX\nvx0ichMyRZiHhIQQHh4OQHh4OHFxcQBERERQVFTEsWPH+OCDDwCYPHkyf/3rXx0TbVzxySefcOjQ\nIce/z58/z4ULF/j000/ZtGmTY3n//v3bVNOFCxfYs2cPDz30kGNZbW1t+16giEgrTBHmt9xyi+Ox\nh4cHvXr1AhrPkOvr6/H09Lzmyqmrz6YNw+Crr75ybHv1c1druv3Fixeveb6hoQE/Pz8OHDhwYy9G\nRKQdenyfeVtMmDCBd999F2i8YdigQYOa9XkDTJ06lbVr1zr+ffDgQQDi4+N57bXXHMuvdLMEBATw\nzTff0NDQ4OhHh+8n3vD19SUkJIT333/fsfxKV42ISGczRZhfr8/6yuOf//zn7N+/n8jISJ5//nky\nMjIcz11Zd+3atezbt4/IyEjCw8PZsGEDAC+88AKnT58mIiKCqKgo7HY7AGlpacyYMYO7776bv/u7\nv3Psp+k+3333XdLT04mKimLUqFFkZ2e79H0QkZuXbrTVQ+j9Eek+utGWiIh0iTaFeX19PTabjZkz\nZwJQVVVFfHw8w4cPZ+rUqY5+ZICVK1cybNgwwsLCyMnJcU3VIiLSTJvCfM2aNVitVkdfcFpaGvHx\n8RQUFDBlyhTS0tIAyM/PZ9OmTeTn57Njxw4WL15MQ0OD66oXERGgDWFeUlLChx9+yMKFCx39NdnZ\n2aSkpACQkpLC1q1bAcjKyiI5ORlvb2+Cg4MJDQ11TPYsIiKu0+o482effZaXX36Zc02mO6qsrHTM\n+RkQEEBlZSUAZWVlxMTEONYLCgqitLT0mn2mpqY6HsfGxrrFZKgiIu7Ebrc7Rs+1hdMw37ZtG/7+\n/thstuvutOlQvOs9f7WmYS4i0pqOzsHZ0fk3u8PVJ7rLly93ur7TMN+zZw/Z2dl8+OGHXLp0iXPn\nzjF37lwCAgKoqKhg8ODBlJeX4+/vD0BgYCDFxcWO7UtKSggMDOzAyxERgdPnz3doaKClg5Mx9wRO\n+8xXrFhBcXExRUVFZGZmcu+99/Lf//3fJCQkOC68ycjIIDExEYCEhAQyMzOpra2lqKiIwsJCoqOj\nXf8q3ExsbCzp6endXYaI3ERu6N4sV7pMli5dSlJSEunp6QQHB7N582YArFYrSUlJWK1WvLy8WL9+\nfYfu79237wDOnz/d7u1b4+vrx7lzVa2uFxwczIkTJ/D09AQa34eCggIGDx7c4vqtdT2JiHQ2t74C\ntDEQXVle266qDAkJIT09nXvvvbdNe508eTJz585l/vz5HS3QQVeAys2so1dgdsrVlx1ov9Nq0BWg\nnevMmTPMmDEDf39/BgwYwMyZM1sctQPw7bffMmnSJPr378+gQYN45JFHHM998803xMfHc/vttxMW\nFsZ7773XVS9BRExGYd5GTT8R6+vrWbBgAcePH+f48ePceuut/PjHP25xuxdffJH77ruPM2fOUFpa\nyr/8y78Ajfc7j4+P57HHHuPkyZNkZmayePHiZvdUFxFpK4V5GxiGQWJiIn5+fvj5+fHEE08wa9Ys\nevfujY+PD88//zy7du1qcdtevXpx9OhRSktL6dWrF3fddRfQOOwzJCSElJQUPDw8iIqKYvbs2To7\nF5F2UZi3gcViISsri9OnT3P69GneffddnnrqKYKDg+nXrx+TJk3i7NmzLfZn/frXv8YwDKKjoxk1\nahRvvfUWAMeOHeOrr75yfED4+fnx+9//3nEBlojIjTDFTENd7ZVXXqGgoIDc3Fz8/f3Jy8vjhz/8\nIYZhXDOKJSAggN/+9rcA7N69m7i4OCZOnMgPfvADJk2apJuRiUin0Jl5O1RXV3PrrbfSr18/qqqq\nnF6Z9d5771FSUgI0zh9qsVjw9PRkxowZFBQU8M4771BXV0ddXR179+7lm2++6aqXISIm4tZh7uvr\nR+OAHtf8NO7
/xj3zzDNcvHiRgQMHctddd3H//fdfd1z5vn37iImJwdfXlwceeIC1a9cSHByMj48P\nOTk5ZGZmEhgYyJAhQ1i2bJkmfRaRdnHrcebyPb0/cjPTOHONMxcRuSkozEVETEBhLiJiAgpzERET\nUJiLiJiAwlxExAQU5iIiJqAwFxExAadhfunSJcaPH09UVBRWq5Vly5YBjRMyBwUFYbPZsNlsbN++\n3bHNypUrGTZsGGFhYbrviIhIF2n1CtCamhr69OnD5cuXueeee3jllVf49NNP8fX1ZcmSJc3Wzc/P\nZ86cOezdu5fS0lLi4uIoKCjAw+P7z4wbuQK0ozNyt6atM3b7+Pg4Lte/cOECvXv3dkwh99vf/pbk\n5GSX1XiFrgCVm5muAG09A1q9a2KfPn0AqK2tpb6+Hj+/xvuZtLTTrKwskpOT8fb2Jjg4mNDQUHJz\nc4mJiWlX8R2dkbs1bZ2xu7q62vHY2RRyly9fxstLN6IUka7Xap95Q0MDUVFRBAQEMHnyZMLDwwFY\nt24dkZGRLFiwgDNnzgBQVlZGUFCQY9ugoKAWp1NLTU11/Njt9k56KV3PbrcTFBTEr3/9a4YMGcL8\n+fPJyMhgwoQJzdbz8PDg//7v/wD47rvv+Nd//Vf+/u//nsGDB7No0SIuXbrUHeWLiBuz2+3NsrI1\nrZ5Genh4kJeXx9mzZ5k2bRp2u51Fixbx0ksvAY3Toj333HOkp6e3uH1LdxNsS2E9RWVlJadPn+b4\n8ePU19eTmZnpdP2lS5dSVFTEwYMH8fLyYs6cOfziF79gxYoVXVSxiPQEsbGxxMbGOv7t7FbbcAOj\nWfr168f06dPZt28f/v7+WCwWLBYLCxcuJDc3F4DAwECKi4sd25SUlBAYGHiDL6Fn8fDwYPny5Xh7\ne9O7d2+n6xqGweuvv86rr75K//798fHxYdmyZa1+AIiItMZpmJ86dcrRhXLx4kU+/vhjbDYbFRUV\njnW2bNlCREQEAAkJCWRmZlJbW0tRURGFhYVER0e7sPzuN2jQIHr16tWmdU+ePElNTQ1jxoxxTBV3\n//33c+rUKRdXKSJm57Sbpby8nJSUFBoaGmhoaGDu3LlMmTKFxx9/nLy8PCwWCyEhIWzYsAEAq9VK\nUlISVqsVLy8v1q9ff91JG8zi6td32223UVNT4/h30w++gQMHcuutt5Kfn8+QIUO6rEYRuQkYXex6\nTba0HDAMF/605+UHBwcbn376qWEYhrFz504jKCio2fOHDx82brnlFiMvL8+4ePGi8dRTTxkWi8U4\ncuSIYRiG8fTTTxtJSUnGiRMnDMMwjJKSEuOjjz5qtd1uOFQibqOjWdDR35/OyKJOqcEJt74C1M/X\n14WTxjXuv6OuPjMfPnw4L730EnFxcYwYMYIJEyY0W2fVqlWEhoYSExNDv379iI+Pp6CgoMN1iMjN\nTdPG9RB6f+RmpouGNG2ciMhNQWEuImICCnMRERNwmxuJ+Pn5mX4YY0dcuSeOiEhL3OYLUBGR69EX\noPoCVETkpqAwFxExAYW5iIgJKMxFREzA7cJ8QN++jtvrtvdnQN++3f0yRDqFfh+krdxuNIs7fGss\n4i70+9BIo1k6YQ7QruZN44vu6D5ERG4mbhfmdQAd/Ays6/DHgYhIz+J2feYiInLjnIb5pUuXGD9+\nPFFRUVitVpYtWwZAVVUV8fHxDB8+nKlTpzqmlgNYuXIlw4YNIywsjJycHNdWb2Id/eJLX3qJ3Fxa\n/QK0pqaGPn36cPnyZe655x5eeeUVsrOzGThwID/72c9YtWoVp0+fJi0tjfz8fObMmcPevXspLS0l\nLi6OgoICPDy+/8xoyxegHe1mgZ5/y4Du/sJH3IM7fPHmDrr798EdjkOHL+fv06cPALW1tdTX1+Pn\n50d2djYpKSkApKSksHXrVgCysrJITk7G29ub4OBgQkNDyc3NbXfx0n3cYUic/
joRabtWvwBtaGjg\nhz/8IUeOHGHRokWEh4dTWVlJQEAAAAEBAVRWVgJQVlZGTEyMY9ugoCBKS0uv2WdqaqrjcWxsLLGx\nsR18GebT0VE9HR3Rc/r8+Y6fiZw/3601dLR9aTSgb19Od+C99PP1percuU6s6OZgt9ux2+1tXr/V\nMPfw8CAvL4+zZ88ybdo0du7c2ez5K2dB19PSc03DXFrW0VE9GtEjnUUfqt3j6hPd5cuXO12/zaNZ\n+vXrx/Tp09m/fz8BAQFUVFQAUF5ejr+/PwCBgYEUFxc7tikpKSEwMPBG6hc3ceUvg478mGG8v7p6\npKdwGuanTp1yjFS5ePEiH3/8MTabjYSEBDIyMgDIyMggMTERgISEBDIzM6mtraWoqIjCwkKio6Nd\n/BLEFb7/y6D9P3VdXbQLXDkrbe9PR7onRG6E026W8vJyUlJSaGhooKGhgblz5zJlyhRsNhtJSUmk\np6cTHBzM5s2bAbBarSQlJWG1WvHy8mL9+vWaPUhEpAu45b1ZNDSxM96Hjr0H7nAcuns4mjvU4DZD\n4rqxfYBeFkuH/tLzBmrNcBx60r1ZRESupgEBrVOYi9vq7uGZIj2Jwlzcls7GRNpON9oSETEBnZmL\niLSiJ8yzoDAXEWlFT5hnQd0sIiImoDAXETEBhbmIiAkozEVETEBhLiJiAhrNIuLGesKQOHEPCnMR\nJ7r7lgI9YUicuAeFuYgTuqWA9BTqMxcRMQGFuYiICTgN8+LiYiZPnkx4eDijRo1i7dq1QOOEzEFB\nQdhsNmw2G9u3b3dss3LlSoYNG0ZYWBg5OTmurd5FNO+jiPQ0TmcaqqiooKKigqioKKqrqxkzZgxb\nt25l8+bN+Pr6smTJkmbr5+fnM2fOHPbu3UtpaSlxcXEUFBTg4fH9Z0ZPmGmou2c1Ac001Dk1dHzG\nqe6uwV2OQ3fPNKTj0Hp2Oj0zHzx4MFFRUQD4+PgwcuRISktLgZYPTlZWFsnJyXh7exMcHExoaCi5\nubntLr67dHQyYzNMZCwiPUubR7McPXqUAwcOEBMTw+7du1m3bh1vv/02Y8eOZfXq1fTv35+ysjJi\nYmIc2wQFBTnCv6nU1FTH49jYWGJjYzv0IkREzMZut2O329u8fpsmdK6uriY2NpYXXniBxMRETpw4\nwaBBgwB48cUXKS8vJz09nZ/85CfExMTw6KOPArBw4UL+8R//kdmzZ3/fYA/oZunuP+ncoQYdB/eo\nwV2Og7pZ3OQ4tLebBaCuro4HH3yQxx57jMTERAD8/f0dX/YtXLjQ0ZUSGBhIcXGxY9uSkhICAwPb\nXbyIiLSN0zA3DIMFCxZgtVp55plnHMvLy8sdj7ds2UJERAQACQkJZGZmUltbS1FREYWFhURHR7uo\ndBERucJpn/nu3bt55513GD16NDabDYAVK1awceNG8vLysFgshISEsGHDBgCsVitJSUlYrVa8vLxY\nv3793/48ERERV2pTn3mnNqg+8x5Rg46De9TgLsdBfeZuchw60mcuIiLuTzfaEhGnuvvOkdI2CnMR\ncUp3juwZ1M0iImICCnMRERNQmIuImIDCXETEBBTmIiImoDAXETEBhbmIiAkozEVETEBhLiJiAgpz\nERETUJiLiJiAwlxExAQU5iIiJuA0zIuLi5k8eTLh4eGMGjWKtWvXAlBVVUV8fDzDhw9n6tSpnDlz\nxrHNypUrGTZsGGFhYeTk5Li2ehERAVqZaaiiooKKigqioqKorq5mzJgxbN26lbfeeouBAwfys5/9\njFWrVnH69GnS0tLIz89nzpw57N27l9LSUuLi4igoKMDD4/vPDM001DNq0HFwjxp0HNyjBnc5Ds62\nd3o/88GDBzN48GAAfHx8GDlyJKWlpWRnZ7Nr1y4AUlJSiI2NJS0tjaysLJKTk/H29iY4OJjQ0FBy\nc3OJiYlptt/U1FTH49jYWGJjY9v58kREz
Mlut2O329u8fpsnpzh69CgHDhxg/PjxVFZWEhAQAEBA\nQACVlZUAlJWVNQvuoKAgSktLr9lX0zAXEZFrXX2iu3z5cqfrt+kL0Orqah588EHWrFmDr69vs+cs\nFsvf/gRpmbPnRESkc7Qa5nV1dTz44IPMnTuXxMREoPFsvKKiAoDy8nL8/f0BCAwMpLi42LFtSUkJ\ngYGBrqhbRESacBrmhmGwYMECrFYrzzzzjGN5QkICGRkZAGRkZDhCPiEhgczMTGpraykqKqKwsJDo\n6GgXli8iItDKaJYvvviCiRMnMnr0aEd3ycqVK4mOjiYpKYnjx48THBzM5s2b6d+/PwArVqzgzTff\nxMvLizVr1jBt2rTmDWo0S4+oQcfBPWrQcXCPGtzlODjNTmdh7goK855Rg46De9Sg4+AeNbjLcXC2\nva4AFRExAYW5iIgJKMxFRExAYS4iYgIKcxERE1CYi4iYgMJcRMQEFOYiIiagMBcRMQGFuYiICSjM\nRURMQGEuImICCnMRERNQmIuImIDCXETEBJyG+fz58wkICCAiIsKxLDU1laCgIGw2Gzabje3btzue\nW7lyJcOGDSMsLIycnBzXVS0iIs04DfN58+axY8eOZsssFgtLlizhwIEDHDhwgPvvvx+A/Px8Nm3a\nRH5+Pjt27GDx4sU0NDS4rnIREXFwGuYTJkzAz8/vmuUtzXaRlZVFcnIy3t7eBAcHExoaSm5ubudV\nKiIi1+XVno3WrVvH22+/zdixY1m9ejX9+/enrKyMmJgYxzpBQUGUlpa2uH1qaqrjcWxsLLGxse0p\nQ0TEtOx2O3a7vc3r33CYL1q0iJdeegmAF198keeee4709PQW170yCfTVmoa5iIhc6+oT3eXLlztd\n/4ZHs/j7+2OxWLBYLCxcuNDRlRIYGEhxcbFjvZKSEgIDA2909yIi0g43HObl5eWOx1u2bHGMdElI\nSCAzM5Pa2lqKioooLCwkOjq68yoVEZHrctrNkpyczK5duzh16hRDhw5l+fLl2O128vLysFgshISE\nsGHDBgCsVitJSUlYrVa8vLxYv379dbtZRESkc1mMloamuLJBi6XF0TBNn4eOluS8jVa37nANHWvf\nHWrQcXCPGnQc3KMGdzkOzrbXFaAiIiagMBcRMQGFuYiICSjMRURMQGEuImICCnMRERNQmIuImIDC\nXETEBBTmIiImoDAXETEBhbmIiAkozEVETEBhLiJiAgpzERETUJiLiJiAwlxExASchvn8+fMJCAhw\nTA0HUFVVRXx8PMOHD2fq1KmcOXPG8dzKlSsZNmwYYWFh5OTkuK5qERFpxmmYz5s3jx07djRblpaW\nRnx8PAUFBUyZMoW0tDQA8vPz2bRpE/n5+ezYsYPFixfT0NDguspFRMTBaZhPmDABPz+/Zsuys7NJ\nSUkBICUlha1btwKQlZVFcnIy3t7eBAcHExoaSm5urovKFhGRppxO6NySyspKAgICAAgICKCyshKA\nsrIyYmJiHOsFBQVRWlra4j5SU1Mdj2NjY4mNjb3RMkRETM1ut2O329u8/g2HeVMWi+VvE51e//mW\nNA1zERG51tUnusuXL3e6/g2PZgkICKCiogKA8vJy/P39AQgMDKS4uNixXklJCYGBgTe6exERaYcb\nDvOEhAQyMjIAyMjIIDEx0bE8MzOT2tpaioqKKCwsJDo6unOrFRGRFjntZklOTmbXrl2cOnWKoUOH\n8otf/IKlS5eSlJREeno6wcHBbN68GQCr1UpSUhJWqxUvLy/Wr1/vtAtGREQ6j8UwDKNLG7RYcNZk\n4wdAR0ty3karW3e4ho617w416Di4Rw06Du5Rg7scB2fb6wpQERETUJiLiJiAwlxExAQU5iIiJqAw\nFxExAYW5iIgJKMxFRExAYS4iYgIKcxERE1CYi4iYgMJcRMQEFOYiIiagMBcRMQGFuYiICSjMRURM\nQGEuI
mIC7Z7QOTg4mL59++Lp6Ym3tze5ublUVVXx8MMPc+zYMccsRP379+/MekVEpAXtPjO3WCzY\n7XYOHDhAbm4uAGlpacTHx1NQUMCUKVNIS0vrtEJFROT6OtTNcvUURtnZ2aSkpACQkpLC1q1bO7J7\nERFpo3Z3s1gsFuLi4vD09OSpp57iiSeeoLKykoCAAAACAgKorKxscdvU1FTH49jYWGJjY9tbhoiI\nKdntdux2e5vXb/eEzuXl5QwZMoSTJ08SHx/PunXrSEhI4PTp0451BgwYQFVVVfMGNaFzj6hBx8E9\natBxcI8a3OU4uGRC5yFDhgAwaNAgZs2aRW5uLgEBAVRUVACNYe/v79/e3YuIyA1oV5jX1NRw/vx5\nAC5cuEBOTg4REREkJCSQkZEBQEZGBomJiZ1XqYiIXFe7+swrKyuZNWsWAJcvX+bRRx9l6tSpjB07\nlqSkJNLT0x1DE0VExPXa3Wfe7gbVZ94jatBxcI8adBzcowZ3OQ4u6TMXERH3oTAXETEBhbmIiAko\nzEVETEBhLiJiAgpzERETUJiLiJiAwlxExAQU5iIiJqAwFxExAYW5iIgJKMxFRExAYS4iYgIKcxER\nE1CYi4iYQA8Mc3t3F4BqcIf2QTW4Q/ugGtyhfReF+Y4dOwgLC2PYsGGsWrWqk/du7+T9tYe9uwug\n+2vo7vZBNbhD+6Aa3KF9F4R5fX09P/7xj9mxYwf5+fls3LiRQ4cOdXYzIiLSRKeHeW5uLqGhoQQH\nB+Pt7c0jjzxCVlZWZzcjIiJNdPocoO+//z4fffQRr7/+OgDvvPMOX331FevWrWts0GLpzOZERG4a\nzuLaq7Mbay2su3j+aBGRm0Knd7MEBgZSXFzs+HdxcTFBQUGd3YyIiDTR6WE+duxYCgsLOXr0KLW1\ntWzatImEhITObkZERJro9G4WLy8vfvOb3zBt2jTq6+tZsGABI0eO7OxmRESkiU7/AtQVDh06RFZW\nFqWlpQAEBQWRkJDgFh8Sb731FvPmzXN5O4cOHaKsrIzx48fj4+PjWL5jxw7uu+8+l7cP8MUXXzBg\nwACsVit2u519+/Zhs9mYMmVKl7Tfkscff5y33367W9r+/PPPyc3NJSIigqlTp3ZJm19++SUjR46k\nX79+1NTUkJaWxp///GfCw8N5/vnn6devn8trWLt2LbNmzWLo0KEub6sl3333HZmZmQQGBhIXF8e7\n777Lnj17sFqtPPnkk3h7e3dJHUeOHOGDDz6gpKQEDw8PRowYwZw5c+jbt2+XtH81tw/zVatWsXHj\nRh555BFH33txcTGbNm3i4YcfZtmyZd1a39ChQ5t9R+AKa9eu5bXXXmPkyJEcOHCANWvWkJiYCIDN\nZuPAgQMubR9g2bJl7Ny5k/r6eiZPnsyf/vQnpk+fzscff8zMmTP56U9/6vIaZs6cicViafYl+mef\nfca9996LxWIhOzvbpe1HR0eTm5sLwOuvv85rr73GrFmzyMnJYcaMGV3yf9FqtfL111/j5eXFE088\nwW233caPfvQjPvnkE77++ms++OADl9fQr18/+vTpwx133MGcOXN46KGHGDRokMvbvWLOnDnU19dT\nU1ND//79qa6uZvbs2XzyyScAZGRkuLyGNWvWsG3bNiZNmsT//M//YLPZ6N+/P1u2bGH9+vVMnjzZ\n5TVcw3BzoaGhRm1t7TXLv/vuO+OOO+7okhpGjRp13Z9evXq5vP3w8HDj/PnzhmEYRlFRkTFmzBjj\nP//zPw2miCf4AAAHP0lEQVTDMIyoqCiXt28YhjFy5Eijrq7OuHDhguHj42OcOXPGMAzDqKmpMSIi\nIrqkhqioKGPOnDnGZ599ZtjtdmPnzp3G4MGDDbvdbtjt9i5p/4oxY8YYJ06cMAzDMKqrq43w8HCX\nt28YhhEWFuZ4bLPZmj03evToLqkhKirKqK+vNz766CNj3rx5xsCBA41
p06YZv/vd74xz5865vP1R\no0YZhmEYdXV1xqBBg4y6ujrDMAyjoaHB8ZyrhYeHG5cvXzYMwzAuXLhgTJw40TAMwzh27JgRGRnZ\nJTVcrdP7zDubp6cnpaWlBAcHN1teVlaGp6dnl9Rw4sQJduzYgZ+f3zXP3XXXXS5v3zAMR9dKcHAw\ndrudBx98kGPHjnXZUM9evXrh5eWFl5cXd9xxh+PP+VtvvRUPj665xc++fftYs2YNv/rVr3j55Zex\n2Wz07t2bSZMmdUn79fX1VFVVYRgG9fX1jrPR2267DS+vrvlVCg8P580332T+/PlERkayd+9exo0b\nR0FBAb169eqSGgA8PDyYOnUqU6dOpba2lu3bt7Nx40aee+45Tp065dK2Gxoa+O6776ipqeHixYuc\nPXuW22+/nUuXLtHQ0ODStq+wWCzU1dXh6enJpUuXuHDhAgA/+MEPqKur65Iarub2Yf5f//VfxMXF\nERoa6uijKy4uprCwkN/85jddUsP06dOprq7GZrNd81xXBIm/vz95eXlERUUB4OPjw7Zt21iwYAFf\nf/21y9sHuOWWW6ipqaFPnz78+c9/diw/c+ZMl4W5p6cnS5YsISkpiWeffRZ/f38uX77cJW0DnDt3\njjFjxgCNv8zl5eUMGTKE8+fPd1kNb7zxBk8//TS//OUvGTRoEHfddRdBQUEMHTqUN954o8vqaKpX\nr1488MADPPDAA45Qc6XHHnuMkSNH4u3tzerVq5kwYQJ33XUXX375JSkpKS5vH2DhwoWMGzeO8ePH\n8/nnn/Nv//ZvQOOJ3+23394lNVzN7fvMofGMKDc3l9LSUiwWC4GBgYwdO7bLzoa6W3FxMd7e3gwe\nPLjZcsMw2L17N/fcc4/La7h06RK9e/e+ZvmpU6coLy8nIiLC5TVcbdu2bezZs4cVK1Z0edtN1dTU\nUFlZSUhISJe1efbsWYqKirh8+TJBQUHX/N9wpcOHDzNixIgua68lR48epW/fvgwYMIAjR46wb98+\nwsLCiIyM7LIa/vKXv/DNN98watQowsLCuqzd6+kRYS4iIs71wPuZi4jI1RTmIiImoDAXETEBhbm4\nvaNHj7b4BesTTzzh8olPUlNTWb16NQD/9E//xD/8wz9gs9kYM2YMX3755Q3tq+mVuyKd7eYYDiKm\ndOWe+a5ksVgct3W2WCy88sorzJ49m48//pinnnqKgwcPNlu/oaHhukM1dS9/cSWdmUuPcPnyZR57\n7DGsVisPPfQQFy9eJDY21jHm3cfHhxdeeIGoqCjuvPNOTpw4ATSeTT/99NPcfffd3HHHHfzhD39w\n7PPll18mOjqayMhIUlNTHct/9atfMWLECCZMmMDhw4eb1XFl8NeECRP49ttvgcYLuZYuXcqYMWN4\n77332LhxI6NHjyYiIoKlS5c2237JkiWMGjWKuLg4l19cIzcXhbn0CIcPH+af//mfyc/Pp2/fvqxf\nv77ZmW5NTQ133nkneXl5TJw4sdlZe0VFBbt372bbtm2OcM3JyeHbb78lNzeXAwcOsH//fj7//HP2\n79/Ppk2bOHjwIB9++CF79+5tsZ4//vGPjB49Gmg84x44cCD79+9nwoQJLF26lJ07d5KXl8fevXsd\n0yZeuHCBcePG8Ze//IVJkyaxfPlyV71dchNSmEuPMHToUO68806g8QrAL774otnzvXr1Yvr06QCM\nGTOGo0ePAo1Be+WmZCNHjqSyshJoDPOcnBxH//fhw4cpLCzkiy++YPbs2fTu3RtfX99m9+I3DIOf\n/vSn2Gw23njjDdLT0x3PPfzwwwDs3buXyZMnc/vtt+Pp6cmjjz7Kn/70J6DxEvgr67X0GkQ6Qn3m\n0iM0PQs3DOOa/uemtz318PBodpl/03uWNL1GbtmyZTz55JPN9rNmzZpm6zR93LTP/Gq33XabY52r\nt2+pr/x6y0XaS2fm0iMcP37cMXr
k97//fYdvYTBt2jTefPNNx71ESktLOXnyJBMnTmTr1q1cunSJ\n8+fPs23btmbbtXbB9Lhx49i1axd//etfqa+vJzMz03H/noaGBt577z3Ha5gwYUKHXoNIUzozF7dn\nsVgYMWIEr732GvPnzyc8PJxFixbxxz/+sdk6TR9f/e+rH8fHx3Po0CFH142vry/vvPMONpuNhx9+\nmMjISPz9/YmOjr6mlpbqu2LIkCGkpaUxefJkDMNgxowZzJw5E2g8e8/NzeWXv/wlAQEBbNq0qSNv\ni0gzujeLiIgJqJtFRMQEFOYiIiagMBcRMQGFuYiICSjMRURMQGEuImIC/w9svrqPeG2MqAAAAABJ\nRU5ErkJggg==\n" } ], "prompt_number": 24 }, { "cell_type": "markdown", "metadata": {}, "source": [ "This observation is supported by comparing the predicted probability distributions of molecules with and without the substructure. The predicted probability of being mutagenic is nearly twice as high when the substructure is present in a molecule." ] }, { "cell_type": "code", "collapsed": false, "input": [ "temp = testData.copy()\n", "temp['containsMotif'] = temp['molecule'] >= polyarom\n", "temp.boxplot('probability',by='containsMotif')" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "pyout", "prompt_number": 25, "text": [ "" ] }, { "output_type": "display_data", "png":
"iVBORw0KGgoAAAANSUhEUgAAAXoAAAEYCAYAAABSnD3BAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3XlcVPX6B/DPYTGJSPC6MygmJCTLIIRb6mhuWZpRGtrC\n4JJaVN7uTfNVPwWXyso2vXbJTEGEi0tFNxV9mY65BamolVa4jCCkxaYWKgrf3x9cTgzLCAhz5sx8\n3q+XNYf5nnOewePDl+ec8xxJCCFAREQ2y0HpAIiIqGUx0RMR2TgmeiIiG8dET0Rk45joiYhsHBM9\nEZGNY6K3IY6OjggJCYFWq0VoaCgOHDjQrNs3GAwYM2aM2TG7d+9u9v1agre3N4qKimp9/Y477lAg\nGlPvv/8+rly5ctNx06ZNw4kTJxq9fYPBAAcHB6xatUr+2pEjR+Dg4IClS5eaXTchIQG//vqrSQw/\n/fQTAGDDhg245557cP/99zc6JmpeTPQ25Pbbb0dWVhaOHDmCN954A3PnzrV4DLt27cL+/fubvL4Q\nAkrc2iFJUqO+bkkffPABSktLbzpu5cqV8Pf3b/T2JUlCQEAA1q9fL38tJSUFwcHBN/38a9asQX5+\nvkkMfn5+AIBVq1bhk08+wddff93omKh5MdHbqIsXL6Jt27YAKpPnyy+/jMDAQAQFBcn/oGfNmoWF\nCxcCALZt24bBgwdDCAG9Xo8ZM2bg3nvvRc+ePbF58+Za2y8qKsK4ceMQHByMfv364fvvv4fRaER8\nfDzee+89hISEYO/evSbr/P777xg+fDgCAgIwbdo0eRZtNBrRs2dPREVFITAwELm5uXXGW/M3ipiY\nGCQkJAConJHPmTMHQUFB6NOnD06dOiXv87HHHkN4eDjCw8PlH0KFhYUYMWKEHIu5Hy4vvfQSAgIC\nMGzYMBQUFODUqVMIDQ2V38/OzjZZrnLy5EkMGzZM/g3rzJkzAFDvZ9PpdBg/fjz8/f3x5JNPAgA+\n/PBD5OfnY8iQIfLMeObMmbj33nsREBCA2NhYeX86nQ6HDx8GUPmbyGuvvQatVot+/frht99+A1A5\nyw4MDIRWq4VOp5OPj27duuHatWv47bffIITAtm3b8MADD8jflyNHjqBv374IDg5GREQESkpKsHHj\nRhw8eBBPPPEEevfujatXr0Kn0+HQoUNYsGAB9u3bh8mTJ2P27Nn1fm/JQgTZDEdHR6HVaoWfn59o\n06aNOHz4sBBCiI0bN4rhw4eLiooKceHCBdG1a1dx/vx5UVpaKnr16iV27twpevbsKU6fPi2EECIq\nKko88MADQgghsrOzhUajEVevXhW7du0SDz30kBBCiJiYGLFgwQIhhBA7d+4UWq1WCCFEbGysWLp0\naZ3xPffcc+LNN98UQgiRnp4uJEkShYWF4syZM8LBwUFkZGTUG++vv/5qsv+qGBISEoQQQnh7e4vX\nX39dCCFEYmKiPG7ixIli7969Qgghzp49K/z9/YUQQjz//PNi4cKFQgghNm/eLMdSkyRJIjk5WQgh\nxIIFC0RMTIwQQoghQ4aII0eOCCGEmDt3rli+fHmtdcPDw8UXX3whhBDi2rVrorS01Oxna9OmjcjL\nyxMVFRWiX79+Yt++ffJnqx5bUVGREEKIGzduCJ1OJ44dOyaEEEKn04lDhw7JcX/11VdCCCFmz54t\nFi1aJIQQIjAwUOTn5wshhLh48aIQQgiDwSAeeughsWzZMrF8+XKxb98+ER0dLWJjY8U777wjr/fN\nN98IIYSYN2+emDVrVq191lyu+R4phzN6G+Li4oKsrCycOHEC6enpeOqppwAAe/fuxaRJkyBJEjp0\n6IDBgwcjMzMTLi4uWLlyJYYPH47nn38e3bt3B1D5q/yECRMAAD4+PrjrrrvkumuVffv2ydsfMmQI\nCgsLcfnyZQCod3a8b98+REZGAgBGjhwJDw8P+b1u3bohPDxcH
lcz3u++++6mZYSJEycCACIjI+Xz\nBDt27EBMTAxCQkLw8MMP4/Lly/jzzz+xZ88eedY8evRok1iqc3BwwOOPPw4AePLJJ+XfUqZOnYrV\nq1ejoqIC69evx6RJk0zWu3z5MvLz8/Hwww8DAFq1agUXFxezny08PBxdunSBJEnQarUwGo11xpSa\nmorQ0FD07t0bP/74Y511+VatWuHBBx8EAISGhsrbGjBgAKKiovDJJ5/gxo0bAP76+xo/fjzWr1+P\nlJQU+XsJAJcuXcLFixcxcOBAAEBUVBS++eYb+f36/r5v9h5ZjpPSAVDL6Nu3LwoKCvD7779DkiST\nf3BCCDlpHjt2DO3bt0deXp7Z7Tk41J4TNOUfcX3ruLq6mh0nSRKcnJxQUVEhf83cCcqqzyeEQEZG\nBlq1atXgWOpT/fsWERGBuLg4DB06FGFhYfX+oKhvO3XFetttt8lfc3R0lBNxdWfOnMHSpUtx8OBB\ntGnTBtHR0bh69Wqtcc7OzvJrBwcHeVsfffQRMjMzsXnzZoSGhuLQoUPyuI4dO6JVq1bYsWMHPvjg\nA+zfv7/OH671xV8XazjHQazR26yffvoJFRUVaNeuHQYOHIjU1FRUVFTg999/x549exAeHo6zZ8/i\n3XffRVZWFrZu3YrMzEwAlf+QN2zYACEETp06hdOnT6Nnz54m2x84cCDWrVsHoLK+3L59e7i5ucHN\nzU2e2dc0YMAAuSa9fft2FBcX1zmuZrzffPMNwsPD0bVrVxw/fhxlZWUoKSnBzp07TdZLTU2V/9+/\nf38AwIgRI/Dhhx/KY44ePQoAGDRoEJKTkwEAW7durTeWiooKbNiwAQCQnJwsz2pbt26NkSNHYubM\nmYiOjq61npubGzQaDdLS0gAA165dw5UrV+r9bOZ+6Li5ueHSpUsAKmfXrq6uuPPOO3HhwgVs3bq1\n3vXqcurUKYSHhyMuLg7t27fHuXPnTN5fsGABlixZAgcHB/nE+J133gkPDw/5t5m1a9fK9f3qsZH1\n4ozehly5cgUhISEAKpN1QkICJEnCI488ggMHDshXUbz99tvo0KEDhg8fjqVLl6JTp05YtWoV9Hq9\nXEbo2rUrwsPDcenSJcTHx6NVq1aQJEmeocXGxmLy5MkIDg6Gq6urfFJ0zJgxeOyxx5CWlobly5dj\nwIABcnzz58/HxIkTsXbtWvTr1w+dOnWSE0X1mV998QLAhAkTEBAQgO7du6N3794mn7+4uBjBwcFo\n3bo1UlJSAFSezHzuuecQHByMGzduYPDgwVixYoUcS0pKCvr3749u3brV+T11dXVFZmYmFi1ahI4d\nO8o/TABg0qRJ+PzzzzFixIg61127di2mT5+OefPmwdnZGRs3bqz3s504caLe2e8zzzyDUaNGwdPT\nE19//TVCQkLg5+cHLy8v3HfffXWuU31b1f/eZs+ejezsbAghMGzYMAQFBWH37t3y+/369atzvYSE\nBMyYMQOlpaXo0aMHVq9eDQDyifvbb7/9lq62ohZm0TMCpAp6vV5s2rSp2bd77do1cePGDSGEEPv3\n7xchISHNtu2aJyyr27Vrl9BoNE3a7pkzZ4QkSaK8vLzWe2+//bYYOnSomDp1ap1jH3jgAZGYmNik\n/RI1J87oyWJycnIwYcIEVFRUoFWrVli5cmWzbdvSteBHHnkEZ86cwc6dO+XLWGvasmWL/HrNmjVY\ntWoV9uzZY6kQiWRM9FRL1a/lzc3Hx0e+zrs53bhxA6dPn2727Zrz+eefW3R/RLeCJ2PJanl7e+PN\nN99Er1690LZtW0yePBnXrl2DwWCARqPBW2+9hc6dO2PKlCkoKyvDrFmz4OnpCU9PT/z9739HWVmZ\nyfbeeOMNtG/fHt27d5dPxALA5s2bERISgjZt2qBr166Ii4urFcuqVavg6emJLl26mLQFiI2NlS8z\nrUmn02HVqlX46aefMGPGD
Bw4cABubm5o27YtDh48iI4dO5qchP3ss8+g1Wpv9dtGVAsTPVm15ORk\nbN++HadOncIvv/yCRYsWQZIkXLhwAcXFxcjJyUF8fDwWLVqEzMxMHD16FEePHpVPoFY5f/48CgsL\nkZ+fj4SEBDzzzDP45ZdfAFTeRZqUlISLFy9i8+bN+Oijj+SrZaoYDAacPHkS27dvx5IlS+Tb+m92\naaEkSfDz80N8fDz69euHy5cvo6ioCGFhYWjXrh22bdsmj1+7di2ioqKa89tHBICJnqyYJEmIiYmB\np6cnPDw88Oqrr8pX0zg4OCAuLg7Ozs5o3bo1kpOTMW/ePLRr1w7t2rXD/PnzsXbtWpPtLVy4EM7O\nzhg0aBAefPBB+VLPwYMHo1evXgCAwMBAREZGYvfu3Sbrzp8/Hy4uLggICEB0dLQch2jgtfh1jXv6\n6aeRlJQEoLKlxPbt22vdeEXUHJjoyap5eXnJr7t27So30Grfvr3JTVD5+fkml0hWHwsAHh4ecHFx\nkZe7desmv5+RkYEhQ4agQ4cOcHd3R3x8PAoLCxsUx6144okn8N///helpaVYv349Bg0ahI4dO97y\ndolqYqInq5aTk2PyukuXLgBql0y6dOli0jKg+lig8hr76h0gz549C09PTwCV18OPGzcO586dQ0lJ\nCWbMmGFyB25dcVSt21B1lXg0Gg369u2Lzz77DElJSfXW+oluFRM9WS0hBFasWIG8vDwUFRVh8eLF\ncq+cmiZOnIhFixahoKAABQUFWLBgQa3EOX/+fFy/fh179uzB5s2bMX78eADAH3/8AQ8PD7Rq1QqZ\nmZlITk6ulZgXLVqEK1eu4Mcff8SaNWvk/jcN1bFjR5w7dw7Xr183+frTTz+NJUuW4IcffkBERESj\ntknUUEz0ZLUkScKkSZMwYsQI9OjRA76+vnjttddMes5Uee211xAWFoagoCAEBQUhLCwMr732mryd\nzp07w8PDA126dMFTTz2F+Ph43H333QCAFStWYN68ebjzzjuxcOHCWklckiQMHjwYPj4+GDZsGF5+\n+WUMGzZMfq/mXah1uf/++9GrVy906tRJvssXqOyZk5OTg0ceeQStW7e+9W8aUR0k0dCzSUQW1r17\nd6xatQpDhw5VOpQW5evri/j4eJv/nKQcszP6yZMno2PHjggMDKx3zAsvvABfX18EBwcjKyur2QMk\nsmWfffYZJElikqcWZTbRR0dHIz09vd73t2zZgpMnTyI7Oxsff/wxZs6c2ewBEtkqnU6HZ599Fv/6\n17+UDoVsnNkWCAMHDqz34QcA8OWXX8o3ePTp0wclJSW4cOECLxGjZlH16D1bZTAYlA6B7MQt9brJ\ny8szub5Yo9Hg3LlztRI9Hz5ARNTy6jvlestNzWpuuL6kznO+zU+SYiFErNJhEDVYbGysyQPNqfmY\nm1Df0uWVnp6eyM3NlZfPnTvX6BtJiIioZd1Soh87diwSExMBAN9++y3c3d1Zn7coo9IBEDWKuXN+\n1HLMlm4mTpyI3bt3o6CgAF5eXoiLi5Pv7Js+fTpGjx6NLVu2wMfHB66uri3Wx5zqNnIkW9qSurAN\nszIscsOUJEms0RMRYmMr/1DzM5dnmeiJyGIkCWAqaBnm8ix73agYr8Mm9TEoHYBdYqInIrJxLN0Q\nkcWwdNNyWLqxUTypRUQNwRm9ikmSAULolA6D7FTbtkBxcWPXMgDQNWoNDw+gqKix+7E/nNETUbMr\nLq4swzTmz65djV+n8T9MqCbO6FWM9U5SkqWOPx7nDcMZPRGRHWOiVzWD0gEQNQrv/VAGE72K/e+Z\nL0REZrFGT0RNwhq9dWGNnojIjjHRqxjrnaQ2PGaVwURPRGTjWKMnoiZhjd66sEZvo9jrhogagole\nxeLiDEqHQNQorNErg4meiMjGsUavYqxdkpJYo7curNETEdkxJnpVMygdAFGjsEavDCZ6FWO
vGyJq\nCNboiahpJMly+2L+uClzedbJwrEQkY2QICx3Mrbld2PTWLpRMdY7SW14zCqDiZ6IyMaxRk9ETcLr\n6K0Lr6O3Uex1Q0QNwUSvYux1Q2rDGr0ymOiJiGwca/QqxtolKYk1euvCGj0RkR1jolc1g9IBEDUK\na/TKuGmiT09Ph5+fH3x9fbFkyZJa7xcUFGDUqFHQarUICAjAmjVrWiJOqgN73RBRQ5it0ZeXl6Nn\nz57YsWMHPD09ce+99yIlJQX+/v7ymNjYWFy7dg1vvPEGCgoK0LNnT1y4cAFOTn91V2CNnsj2sEZv\nXZpco8/MzISPjw+8vb3h7OyMyMhIpKWlmYzp3LkzLl26BAC4dOkS/va3v5kkeSIiUpbZjJyXlwcv\nLy95WaPRICMjw2TMtGnTMHToUHTp0gWXL1/G+vXr69yWXq+Ht7c3AMDd3R1arRY6nQ7AX3U7Ljdu\nuepr1hIPl+1rGWj8+jWP3YasDxhgMCj/ea1tueq10WjEzZgt3WzatAnp6elYuXIlACApKQkZGRlY\ntmyZPGbRokUoKCjA+++/j1OnTmH48OE4evQo3Nzc/toJSzctwmAwVPvHQGRZTSmpNOWYZemmYZpc\nuvH09ERubq68nJubC41GYzJm//79GD9+PACgR48e6N69O37++edbjZkagEme1IbHrDLMJvqwsDBk\nZ2fDaDSirKwMqampGDt2rMkYPz8/7NixAwBw4cIF/Pzzz7jrrrtaLmKSsdcNETXETe+M3bp1K2bN\nmoXy8nJMmTIFc+fORXx8PABg+vTpKCgoQHR0NHJyclBRUYG5c+di0qRJpjth6aZFSJIBQuiUDoPs\nFEs31sVcnmULBBVjoiclMdFbFyZ6G8V/AKQkXkdvXdjrhojIjjHRq5pB6QCIGqX6NeBkOUz0VqJt\n28pfURvzB2j8Om3bKvs5icjyWKO3Eqx3ktrwmLUurNETEdkxJnoVY72T1IbHrDKY6ImIbBxr9FaC\n9U5SGx6z1oU1eiIiO8ZEr2Ksd5La8JhVBhM9EZGNY43eSrDeSWrDY9a6sEZPRGTHmOhVjPVOUhse\ns8pgoicisnGs0VsJ1jtJbaoa67U0Dw+gqMgy+1Izc3nWycKxEJGNaMqEgRMNZbB0o2Ksd5L6GJQO\nwC4x0RMR2TjW6K0Ea/RkD3j8tRxeR09EZMeY6FWMNXpSm6gog9Ih2CUmeiKyGL1e6QjsE2v0VoI1\neiK6FazRExHZMSZ6FWONntSGx6wymOiJiGwcE72K6XQ6pUMgahSDQad0CHaJJ2OtBE/Gkj3g8ddy\neDLWRrHeSepjUDoAu8RET0Rk41i6sRIs3ZA94PHXcli6ISKyYzdN9Onp6fDz84Ovry+WLFlS5xiD\nwYCQkBAEBATwShALYo2e1Ia9bpRh9glT5eXliImJwY4dO+Dp6Yl7770XY8eOhb+/vzympKQEzz33\nHLZt2waNRoOCgoIWD5qI1Im9bpRhdkafmZkJHx8feHt7w9nZGZGRkUhLSzMZk5ycjEcffRQajQYA\n0K5du5aLlkzwtydSGx6zyjA7o8/Ly4OXl5e8rNFokJGRYTImOzsb169fx5AhQ3D58mW8+OKLeOqp\np2ptS6/Xw9vbGwDg7u4OrVYr/6VXlSDsfRmwrni4zGUuW+9y1Wuj0YibMXvVzaZNm5Ceno6VK1cC\nAJKSkpCRkYFly5bJY2JiYnD48GF8/fXXKC0tRb9+/bB582b4+vr+tRNedXNTTbkawWAwyH/5Lbkf\noubSlGOWGsZcnjU7o/f09ERubq68nJubK5doqnh5eaFdu3ZwcXGBi4sLBg0ahKNHj5okeiIiUo7Z\nGn1YWBiys7NhNBpRVlaG1NRUjB071mTMww8/jL1796K8vBylpaXIyMjAPffc06JBUyXOjEht2OtG\nGWYTvZOTE5YvX46RI0finnvuweOPPw5/f3/Ex8cjPj4
eAODn54dRo0YhKCgIffr0wbRp05joiahO\ncXFKR2CfeGeslWCNnuyBJBkghE7pMGwS74wlIrJjnNFbCfa6IXvA46/lcEZPRGTHmOhVrPqNE0Rq\nwF43ymCiJyKLYa8bZbBGbyVYoyeiW8EaPRGRHWOiVzHW6ElteMwqg4meiMjGMdGrGHvdkNqw140y\neDLWSvBkLNkDHn8thydjbRTrnaQ+BqUDsEtM9ERENo6lGyvB0g3ZAx5/LYelGyIiO8ZEr2Ks0ZPa\nsNeNMsw+M5YsR0ACJEvs56//Elkae90ogzV6K8EaPRHdCtboiYjsGBO9irFGT2rDY1YZTPRERDaO\niV7F2OuG1Ia9bpTBk7FWgidjyR7w+Gs5PBlro1jvJPUxKB2AXWKiJyKycSzdWAmWbsge8PhrOSzd\nEBHZMSZ6FWONntSGvW6UwURPRBbDXjfKYI3eSrBGT0S3gjV6IiI7xkSvYqzRk9rwmFUGEz0RkY1j\nolcx9rohtWGvG2XcNNGnp6fDz88Pvr6+WLJkSb3jvvvuOzg5OeGzzz5r1gCJyHbExSkdgX0ym+jL\ny8sRExOD9PR0HD9+HCkpKThx4kSd4+bMmYNRo0bx6hoLYr2T1MegdAB2yWyiz8zMhI+PD7y9veHs\n7IzIyEikpaXVGrds2TI89thjaN++fYsFSkRETWP24eB5eXnw8vKSlzUaDTIyMmqNSUtLw86dO/Hd\nd99Bkup+wrVer4e3tzcAwN3dHVqtVq4xV81M7X0ZsK54uMzl5l/WWVk86l2uem00GnEzZm+Y2rRp\nE9LT07Fy5UoAQFJSEjIyMrBs2TJ5zPjx4/HPf/4Tffr0gV6vx5gxY/Doo4+a7oQ3TN0Ub5gie8Dj\nr+WYy7NmZ/Senp7Izc2Vl3Nzc6HRaEzGHDp0CJGRkQCAgoICbN26Fc7Ozhg7duytxk03YTAY5J/y\nRGpQ2etGp3AU9sdsog8LC0N2djaMRiO6dOmC1NRUpKSkmIw5ffq0/Do6Ohpjxoxhkieyc/WVcAEg\nIaH+9fibf8swm+idnJywfPlyjBw5EuXl5ZgyZQr8/f0RHx8PAJg+fbpFgqS6cTZP1ooJ27qwqZmV\nYI2eiG4Fm5rZqOpn34nUgMesMpjoiYhsHEs3VoKlGyK6FSzdEBHZMSZ6FWO9k9SGx6wymOiJiGwc\na/RWgjV6IroVrNETEdkxJnoVY72T1IbHrDKY6InIYo4cUToC+8REr2LsdUNqU1KiUzoEu8RET0Rk\n48x2ryTrxn70pAYGQ+UfAIiLM6CqH71OV/mHWh4TPRG1qOoJ/dtvgdhYBYOxUyzdqBhn86Q2V6/q\nlA7BLjHRE5HFtG6tdAT2iaUbFWONntSgeo1+2zYDYmN1AFijtyQmeiJqUdUT+hdfsEavBPa6sRLs\ndUO2yvSqG2D+/MrXnNE3L3N5loneSjDRkz3Q64E1a5SOwjaxqZmNYt8QUh+D0gHYJSZ6IrIYrVbp\nCOwTSzdWgqUbIroVLN0QEdkxJnoVY42e1IbHrDKY6InIYtiPXhm8YcqKSFJj19A1eh8eHo1ehajZ\nHDmiUzoEu8REbyWacoKUJ1ZJbTijVwYTvaoZ0JRZPZElVb8z9uhR9rpRAmv0REQ2jole1XRKB0DU\nSDqlA7BLvGFKxVijJ7Xx9ATy8pSOwjbxhikbFRVlUDoEoka5csWgdAh2iYlexfR6pSMgapy2bZWO\nwD6xdENELer99ysfOAIAu3cDgwdXvh43Dpg1S7m4bM0t9aNPT0/HrFmzUF5ejqlTp2LOnDkm769b\ntw5vvfUWhBBwc3PDRx99hKCgoAYHQET2w8cHOHlS6ShsU5Nr9OXl5YiJiUF6ejqOHz+OlJQUnDhx\nwmTMXXfdhW+++QbHjh3D//3f/+GZZ55pvsjJLPYNIbU5f96gdAh2yWyiz8zMhI+PD7y9veHs7IzI\nyEikpaWZjOnXrx/
atGkDAOjTpw/OnTvXctESkar9L1WQhZm9MzYvLw9eXl7yskajQUZGRr3jV61a\nhdGjR9f5nl6vh7e3NwDA3d0dWq0Wuv/dFlc1M+Vy45YNBh10OuuJh8tcrms5JsaAvXsBd3cd8vN1\n0Gor39frdZg1S/n41Lpc9dpoNOJmzNboN23ahPT0dKxcuRIAkJSUhIyMDCxbtqzW2F27duG5557D\nvn374FGjcxZr9C2D19GT2lROTJSOwjY1uUbv6emJ3NxceTk3NxcajabWuGPHjmHatGn48ssvayV5\nakkGpQMgapSSEoPSIdgls4k+LCwM2dnZMBqNKCsrQ2pqKsaOHWsyJicnBxEREUhKSoKPj0+LBktE\n6nbffUpHYJ/M1uidnJywfPlyjBw5EuXl5ZgyZQr8/f0RHx8PAJg+fToWLFiA4uJizJw5EwDg7OyM\nzMzMlo+cwL4hpDaPPaZTOgS7xBumVIw1elKb2NjKP9T82OvGRrHXDamN0WhQOgS7xAePqBh73ZAa\nVH/wSEIC8L+rrMEHj1gOSzdEZDEs3bQclm6IiOwYE72KGXjnCamMu7tB6RDsEhM9EVmMVqt0BPaJ\niV7FDAad0iEQNYqOZ18VwZOxKsbr6ImoCk/G2iyD0gEQNQrPKymDiZ6IyMaxdKNiLN0QURWWboiI\n7BgTvYqx1w2pzSOPGJQOwS4x0asYe92Q2hw4oHQE9ok1eiKyGG9voAGPOKUmYI2eiBQTE1OZ4L29\ngbNn/3odE6NsXPaEM3oVMxgMvNOQVKVTJwPOn9cpHYZN4oyeiMiOMdGrGHvdkNrwmbHKYOlGxXjD\nFKmNwcCnSrUUlm5slkHpAIgaZc0ag9Ih2CUmeiIiG8eHg6uaTukAiG7K9OHgOj4cXAFM9ETUomom\ndD4c3PJYulEx9rohtTEaDUqHYJeY6FWMvW5IbfjMWGXw8koiIhvAyyuJiOwYE72K8fmbpDY8ZpXB\nRE9EZOOY6FWMvW5IbdhtVRk8Gati7HVDRFV4MtZmGZQOgKhRWKNXBhO9qh1ROgCiRjlyhMesEm6a\n6NPT0+Hn5wdfX18sWbKkzjEvvPACfH19ERwcjKysrGYPkupTonQARI1SUsJjVglmE315eTliYmKQ\nnp6O48ePIyUlBSdOnDAZs2XLFpw8eRLZ2dn4+OOPMXPmzBYNmIiIGsdsos/MzISPjw+8vb3h7OyM\nyMhIpKWlmYz58ssvERUVBQDo06cPSkpKcOHChZaLmGTBwUalQyBqFKPRqHQIdsls98q8vDx4eXnJ\nyxqNBhkZtU32AAAIvElEQVQZGTcdc+7cOXTs2NFknCRJzREv1SBJCUqHQNQoCQk8Zi3NbKJvaHKu\neUlPzfV4aSURkXLMlm48PT2Rm5srL+fm5kKj0Zgdc+7cOXh6ejZzmERE1FRmE31YWBiys7NhNBpR\nVlaG1NRUjB071mTM2LFjkZiYCAD49ttv4e7uXqtsQ0REyjFbunFycsLy5csxcuRIlJeXY8qUKfD3\n90d8fDwAYPr06Rg9ejS2bNkCHx8fuLq6YvXq1RYJnIiIGsYiLRCo4RwdHREUFCQvp6WloWvXrnWO\nveOOO/DHH39YKjSiehUWFmLYsGEAgPPnz8PR0RHt27eHJEnIzMyEkxOfWqokJnor4+bmhsuXLzf7\nWCJLiYuLg5ubG1566SX5a+Xl5XB0dFQwKvvGFghW7s8//8SwYcMQGhqKoKAgfPnll7XG/Prrrxg0\naBBCQkIQGBiIvXv3AgC2b9+O/v37IzQ0FBMmTMCff/5p6fDJTgkhoNfrMWPGDPTt2xezZ89GXFwc\nli5dKo8JCAhATk4OACApKQl9+vRBSEgIZsyYgYqKCqVCt0lM9FbmypUrCAkJQUhICB599FG0bt0a\nn3/+OQ4dOoSdO3fiH//4R611kpOTMWrUKGRlZeHo0aPQarUoKCjA4sWL8fXXX+PQo
UMIDQ3Fu+++\nq8AnInslSRLy8/Nx4MABkwRf/X0AOHHiBNavX4/9+/cjKysLDg4OWLdunaXDtWksnFkZFxcXk35B\n169fx9y5c7Fnzx44ODggPz8fv/32Gzp06CCPCQ8Px+TJk3H9+nWMGzcOwcHBMBgMOH78OPr37w8A\nKCsrk18TWcr48ePN3o8jhJAnI2FhYQAqJzudOnWyVIh2gYneyq1btw4FBQU4fPgwHB0d0b17d1y9\netVkzMCBA7Fnzx589dVX0Ov1eOmll+Dh4YHhw4cjOTlZociJgNtvv11+7eTkZFKSqX4cR0VF4fXX\nX7dobPaEpRsrd+nSJXTo0AGOjo7YtWsXzp49W2tMTk4O2rdvj6lTp2Lq1KnIyspC3759sW/fPpw6\ndQpAZa0/Ozvb0uETyby9vXH48GEAwOHDh3HmzBlIkoT7778fGzduxO+//w4AKCoqkmv31Dw4o7cy\nNX/NfeKJJzBmzBgEBQUhLCwM/v7+tcbu2rUL77zzDpydneHm5obExES0a9cOa9aswcSJE3Ht2jUA\nwOLFi+Hr62u5D0N2r/rx/OijjyIxMREBAQHo06cPevbsCQDw9/fHokWLMGLECFRUVMDZ2RkrVqyo\n97JiajxeXklEZONYuiEisnFM9ERENo6JnojIxjHRExHZOCZ6IiIbx0RPijMajXBxcUHv3r2bbZtn\nz55FSkrKTcfl5+dj/PjxTdqHXq+Hq6urSQfRWbNmwcHBAUVFRfWud/HiRXz00Uf1xjBx4kQEBwfj\n/fffx+zZs9G5c+c6WwgQNRSvoyer4OPjI99M0xzOnDmD5ORkTJw40ey4Ll26YMOGDU3ahyRJ8PX1\nRVpaGp544glUVFRg586dtZ7CVlNxcTFWrFiBmTNn1orh/PnzOHjwoMnNba6urk2Kj6gKZ/RkdRIT\nExEcHAytVounn34aQOWsf+jQoQgODsawYcPkx1fq9Xq8+OKLGDBgAHr06IFNmzYBAF555RXs2bMH\nISEh+OCDD3D27FkMGjQIoaGhCA0NxYEDB+TtBgYGAgDWrFmDiIgIPPDAA7j77rsxZ84cAJUtdvV6\nPQIDAxEUFIQPPvhAjvXxxx9HamoqAMBgMOC+++4zacf77rvvIjAwEIGBgfJ6r7zyCk6dOoWQkBDM\nmTMHZ8+elWMYMWIE8vLyEBISgn379rXY95jsjCBS2JkzZ0RAQIAQQogffvhB3H333aKwsFAIIURx\ncbEQQoiHHnpIJCYmCiGE+PTTT8W4ceOEEEJERUWJCRMmCCGEOH78uPDx8RFCCGEwGMRDDz0k76O0\ntFRcvXpVCCHEL7/8IsLCwmrte/Xq1eKuu+4Sly5dElevXhXdunUTubm54uDBg2L48OHyti5evCiE\nEEKv14uNGzeKvn37iuLiYjFt2jSxe/du4e3tLQoLC8XBgwdFYGCgKC0tFX/88Yfo1auXyMrKEkaj\nUd5nzRhqvieEELGxseKdd965tW8y2TXO6Mmq7Ny5ExMmTEDbtm0BAO7u7gAqn0c8adIkAMCTTz4p\n99yXJAnjxo0DUHkr/YULFwBUdkWsrqysDFOnTkVQUBAmTJiA48eP17n/+++/H25ubrjttttwzz33\nICcnBz169MDp06fxwgsvYNu2bXBzczNZJyIiAikpKcjIyMDAgQPl/e/duxcRERFwcXGBq6srIiIi\nsGfPHrOfv2bcRM2BiZ6siiRJ9Sa7+r7eqlWrm45577330LlzZxw7dgwHDx5EWVlZneNuu+02+bWj\noyNu3LgBd3d3HD16FDqdDv/+978xdepUk3gff/xxzJs3DyNGjDDp7VLzswghzLbsJWopTPRkVYYM\nGYINGzbIV60UFxcDAPr374///Oc/ACpbNw8aNMjsdmo+ZvHSpUtyj/PExESUl5c3KB4hBAoLC1Fe\nXo6IiAgsXLjQ5HkBQgh07doVixcvxrPPPit/X
ZIkDBw4EF988QWuXLmCP//8E1988QUGDhyIO+64\ng4+AJIviVTdkVXr16oVXX30VgwcPhqOjI3r37o1PP/0Uy5YtQ3R0NN5++2106NABq1evltepOYsG\ngODgYDg6OkKr1SI6OhrPPvus3D1x1KhRuOOOO2qtI0lSrRm3JEnIy8tDdHS03Ev9zTffrLXuM888\nU+trISEh0Ov1CA8PBwBMmzYNwcHBAIABAwYgMDAQo0ePxrPPPlvnZyBqLuxeSYozGo0YM2YMvv/+\ne6VDsUqxsbFwc3Or8zGSRA3B0g0pzsnJCRcvXmzWG6Zsxcsvv4x169aZ/AZC1Fic0RMR2TjO6ImI\nbBwTPRGRjWOiJyKycUz0REQ2jomeiMjG/T/3DEuDtURCmgAAAABJRU5ErkJggg==\n" } ], "prompt_number": 25 }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Advanced use case" ] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Storing RDKit BitVector objects directly in Pandas series" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This allows RDKit fingerprint operations, such as computing a Tanimoto similarity matrix, to be carried out directly on the dataframe." ] }, { "cell_type": "code", "collapsed": false, "input": [ "#del trainData['explFP']\n", "def addExplFP(df,molColumn):\n", "    fpCache = []\n", "    for mol in df[molColumn]:\n", "        res = AllChem.GetMorganFingerprintAsBitVect(mol,2,nBits=1024)\n", "        fpCache.append(res)\n", "    # fill an object array element by element so that each cell holds one BitVect\n", "    arr = np.empty((len(df),), dtype=object)\n", "    for i,fp in enumerate(fpCache):\n", "        arr[i] = fp\n", "    S = pd.Series(arr,index=df.index,name='explFP')\n", "    return df.join(pd.DataFrame(S))\n", "trainData = addExplFP(trainData,'molecule')\n", "trainData.head(2)" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
smilesmutagenicmoleculeFPexplFP
1640-39-7 CC1=Nc2ccccc2C1(C)C 0 \"Mol\"/ [0 0 0 ..., 0 0 0] [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
100-39-0 BrCc1ccccc1 1 \"Mol\"/ [0 0 0 ..., 0 0 0] [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
\n", "
" ], "output_type": "pyout", "prompt_number": 26, "text": [ " smiles mutagenic \\\n", "1640-39-7 CC1=Nc2ccccc2C1(C)C 0 \n", "100-39-0 BrCc1ccccc1 1 \n", "\n", " molecule \\\n", "1640-39-7 \"Mol\"/ \n", "100-39-0 \"Mol\"/ \n", "\n", " FP \\\n", "1640-39-7 [0 0 0 ..., 0 0 0] \n", "100-39-0 [0 0 0 ..., 0 0 0] \n", "\n", " explFP \n", "1640-39-7 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] \n", "100-39-0 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] " ] } ], "prompt_number": 26 }, { "cell_type": "code", "collapsed": false, "input": [ "from rdkit import DataStructs\n", "fpList = trainData['explFP'].tolist()\n", "dm=[]\n", "for i,fp in enumerate(fpList): \n", " dm.extend(DataStructs.BulkTanimotoSimilarity(fp,fpList[1+i:],returnDistance=True))\n", "dm = array(dm)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 27 }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is also possible to store the fingerprints as numpy arrays, which allows for direct application in scikit-learn." ] }, { "cell_type": "code", "collapsed": false, "input": [ "def convertToNumpy(df,fpCol):\n", " fpCache = []\n", " for fp in df[fpCol]:\n", " res = numpy.zeros(len(fp),numpy.int32)\n", " DataStructs.ConvertToNumpyArray(fp,res)\n", " fpCache.append(res)\n", " '''\n", " it is necessary to constructs an empty object array in advance and fill that later,\n", " because directly initializing an array with the fingerprint would trigger the numpy\n", " type recognition and result in a array of integers that again would trigger pandas\n", " to construct a Series object per bit position\n", " ''' \n", " arr = np.empty((len(df),), dtype=np.object)\n", " arr[:]=fpCache\n", " S = pd.Series(arr,index=df.index,name='npFP')\n", " return df.join(pd.DataFrame(S))\n", " \n", "trainData = convertToNumpy(trainData,'explFP')\n", "trainData.head()" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
smilesmutagenicmoleculeFPexplFPnpFP
1640-39-7 CC1=Nc2ccccc2C1(C)C 0 \"Mol\"/ [0 0 0 ..., 0 0 0] [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
100-39-0 BrCc1ccccc1 1 \"Mol\"/ [0 0 0 ..., 0 0 0] [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
79-94-7 CC(C)(c1cc(Br)c(O)c(Br)c1)c2cc(Br)c(O)c(Br)c2 0 \"Mol\"/ [0 0 0 ..., 0 0 0] [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
1822-51-1 ClCc1ccncc1 0 \"Mol\"/ [0 0 0 ..., 0 0 0] [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
594-71-8 CC(C)(Cl)[N+](=O)[O-] 1 \"Mol\"/ [0 0 0 ..., 0 0 0] [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
\n", "
" ], "output_type": "pyout", "prompt_number": 28, "text": [ " smiles mutagenic \\\n", "1640-39-7 CC1=Nc2ccccc2C1(C)C 0 \n", "100-39-0 BrCc1ccccc1 1 \n", "79-94-7 CC(C)(c1cc(Br)c(O)c(Br)c1)c2cc(Br)c(O)c(Br)c2 0 \n", "1822-51-1 ClCc1ccncc1 0 \n", "594-71-8 CC(C)(Cl)[N+](=O)[O-] 1 \n", "\n", " molecule \\\n", "1640-39-7 \"Mol\"/ \n", "100-39-0 \"Mol\"/ \n", "79-94-7 \"Mol\"/ \n", "1822-51-1 \"Mol\"/ \n", "594-71-8 \"Mol\"/ \n", "\n", " FP \\\n", "1640-39-7 [0 0 0 ..., 0 0 0] \n", "100-39-0 [0 0 0 ..., 0 0 0] \n", "79-94-7 [0 0 0 ..., 0 0 0] \n", "1822-51-1 [0 0 0 ..., 0 0 0] \n", "594-71-8 [0 0 0 ..., 0 0 0] \n", "\n", " explFP \\\n", "1640-39-7 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] \n", "100-39-0 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] \n", "79-94-7 [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] \n", "1822-51-1 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] \n", "594-71-8 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] \n", "\n", " npFP \n", "1640-39-7 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] \n", "100-39-0 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] \n", "79-94-7 [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] \n", "1822-51-1 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] \n", "594-71-8 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] " ] } ], "prompt_number": 28 }, { "cell_type": "markdown", "metadata": {}, "source": [ "This avoids having to use wrapper classes and additional list comprehensions to prepare the data for scikit-learn.\n" ] }, { "cell_type": "code", "collapsed": false, "input": [ "model = RandomForestClassifier()\n", "#resolve wrapped fingerprints\n", "#X = [x.fp for x in trainData['FP']]\n", "#y = trainData['mutagenic']\n", 
"model.fit(np.vstack(trainData['npFP']),trainData['mutagenic'])" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "pyout", "prompt_number": 29, "text": [ "RandomForestClassifier(bootstrap=True, compute_importances=False,\n", " criterion='gini', max_depth=None, max_features='auto',\n", " min_density=0.1, min_samples_leaf=1, min_samples_split=1,\n", " n_estimators=10, n_jobs=1, oob_score=False,\n", " random_state=,\n", " verbose=0)" ] } ], "prompt_number": 29 }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Clustering and GroupBy Statistics" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Pandas makes it very simple to compute statistics for data categories. The only things required is a discrete dataframe column that allows the data being grouped with respect to the unique values occuring in that column. This concept can be also used for computing property distribution with respect to molecular structures if the molecules are classified into categories. One way to obtain the latter would be to conduct a simple structural clustering using the molecular fingerprints." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "from rdkit import DataStructs\n", "from rdkit.ML.Cluster import Butina\n", " \n", "def ClusterFps(fps,cutoff=0.2):\n", "    # first generate the distance matrix:\n", "    dists = []\n", "    nfps = len(fps)\n", "    for i in range(1,nfps):\n", "        sims = DataStructs.BulkTanimotoSimilarity(fps[i],fps[:i])\n", "        dists.extend([1-x for x in sims])\n", "\n", "    # now cluster the data:\n", "    cs = Butina.ClusterData(dists,nfps,cutoff,isDistData=True)\n", "    return cs\n", "\n", "#the explicit bitvector column constructed a few steps earlier can now be used directly to perform a Butina clustering using the RDKit implementation\n", "bt_clusters = ClusterFps(trainData['explFP'].tolist(),cutoff=0.3)\n" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 30 }, { "cell_type": "markdown", "metadata": {}, "source": [ "\"bt_clusters\" is a tuple of tuples that groups the indices of the compounds belonging to the same cluster. In order to add the cluster information to the dataframe, this representation has to be converted into either a list or a numpy array. After the next step the dataframe contains an additional column holding the index of the centroid of the cluster the respective row is assigned to. Rows that belong to clusters with fewer than five members are marked with -1." ] }, { "cell_type": "code", "collapsed": false, "input": [ "cluster_map = np.empty(len(trainData))\n", "#-1 marks rows that are not assigned to a cluster of at least 5 members\n", "cluster_map[:]=-1\n", "for _c in bt_clusters:\n", "    if len(_c)<5: continue\n", "    #the first entry of each cluster tuple is used as its centroid\n", "    centroid = _c[0]\n", "    for it in _c:\n", "        cluster_map[it]=centroid\n", "trainData['btCluster'] = cluster_map\n" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 31 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using the centroid index it is simple to add the corresponding molecule object to the dataframe rows. This results in a dataframe that associates each molecule with its centroid structure." 
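, "\n", "\n", "The two steps (mapping cluster tuples to a per-row centroid position, then looking up the row of that centroid) can be sketched on toy data. The cluster tuples, index labels and column names in the following cell are hypothetical; as in the cell above, the first entry of each tuple is taken to be the centroid." ] }, { "cell_type": "code", "collapsed": false, "language": "python", "metadata": {}, "outputs": [], "input": [ "import numpy as np\n", "import pandas as pd\n", "#hypothetical toy frame with four rows and made-up index labels\n", "toy = pd.DataFrame({'name': ['m0','m1','m2','m3']}, index=['id-0','id-1','id-2','id-3'])\n", "#hypothetical clustering result: row positions grouped per cluster, centroid first\n", "toy_clusters = ((0, 2), (1, 3))\n", "centroid_pos = np.empty(len(toy))\n", "centroid_pos[:] = -1\n", "for members in toy_clusters:\n", "    for pos in members:\n", "        centroid_pos[pos] = members[0]\n", "toy['centroid_pos'] = centroid_pos\n", "#translate the centroid position into the index label of the centroid row\n", "toy['centroid_label'] = [toy.index[int(p)] for p in centroid_pos]\n", "toy" 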
] }, { "cell_type": "code", "collapsed": false, "input": [ "#look up the molecule of the row at the given centroid position\n", "def getMoleculeForCluster(c):\n", "    return trainData.ix[trainData.index[int(c)]]['molecule']\n" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 32 }, { "cell_type": "code", "collapsed": false, "input": [ "#keep only the rows that were assigned to a cluster\n", "cData = trainData.ix[trainData['btCluster'] != -1]\n", "cData['Centroid'] = cData.apply(lambda row: getMoleculeForCluster(row['btCluster']),axis=1)\n", "cData['Centroid CAS'] = cData.apply(lambda row: str(trainData.index[int(row['btCluster'])]),axis=1)\n", "cData[['molecule','Centroid','btCluster','Centroid CAS']].tail(2)" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
moleculeCentroidbtClusterCentroid CAS
954-46-1 \"Mol\"/ \"Mol\"/ 1802 159092-71-4
51938-12-6 \"Mol\"/ \"Mol\"/ 2888 51938-13-7
\n", "
" ], "output_type": "pyout", "prompt_number": 33, "text": [ " molecule \\\n", "954-46-1 \"Mol\"/ \n", "51938-12-6 \"Mol\"/ \n", "\n", " Centroid \\\n", "954-46-1 \"Mol\"/ \n", "51938-12-6 \"Mol\"/ \n", "\n", " btCluster Centroid CAS \n", "954-46-1 1802 159092-71-4 \n", "51938-12-6 2888 51938-13-7 " ] } ], "prompt_number": 33 }, { "cell_type": "code", "collapsed": false, "input": [ "#add an example numeric column to compute statistics for\n", "cData['nAtoms'] = cData.apply(lambda row: row['molecule'].GetNumAtoms(),axis=1)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 34 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, the cluster-wise statistic can be easily obtained by using the centroid index column as grouping key. Additionally, it is possible by using multiple keys and choosing the right ordering to show the statistics directly associated with the centroid structure." ] }, { "cell_type": "code", "collapsed": false, "input": [ "from IPython.display import HTML,display\n", "temp = dict([(row[1]['btCluster'],row[1]['Centroid']) for row in cData.iterrows()])\n", "\n", "tC = cData.set_index('Centroid',False)\n", "\n", "#groupby clusterID (string doesn't work well) and compute statistic\n", "tC = tC.groupby('btCluster')\n", "tempC = tC.describe()[['mutagenic','nAtoms']]\n", "#display(HTML(tempC.to_html()))\n", "\n", "#the MCS was dropped because describe doesn't work on non-numerics, thus it has to be mapped back\n", "#every substatistic is now associated with an instance of the MCS\n", "def mapMCS(row):\n", " return temp[row.name[0]]\n", "tempC['Centroid'] = tempC.apply(lambda row: mapMCS(row),axis=1)\n", "#display(HTML(tempC.to_html()))\n", "\n", "#create a second index level using the MCS and reorder the indices\n", "index = tempC.index\n", "tempC = tempC.set_index('Centroid', append=True)\n", "#display(HTML(tempC.to_html()))\n", "tempC = tempC.reorder_levels([0,'Centroid',1])\n", 
"display(HTML(tempC.ix[1802].to_html()))" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
mutagenicnAtoms
Centroid
\"Mol\"/count 5 5.000000
mean 1 20.600000
std 0 2.880972
min 1 17.000000
25% 1 20.000000
50% 1 20.000000
75% 1 21.000000
max 1 25.000000
" ], "output_type": "display_data", "text": [ "" ] } ], "prompt_number": 35 } ], "metadata": {} } ] }