{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Building a Similarity Comparison Set, Revisited\n", "\n", "Goal: construct a set of molecular pairs that can be used to compare similarity methods to each other.\n", "\n", "The earlier version of this notebook (http://rdkit.blogspot.ch/2013/10/building-similarity-comparison-set-goal.html or https://github.com/greglandrum/rdkit_blog/blob/master/notebooks/Building%20A%20Similarity%20Comparison%20Set.ipynb) included a number of molecules that have counterions (from salts). Because this isn't really what we're interested in (and because the single-atom fragments that make up many salts triggered a bug in the RDKit's Morgan fingerprint implementation), I repeat the analysis here and restrict it to single-fragment molecules (those that do not include a `.` in the SMILES).\n", "\n", "The other big difference from the previous post is that an updated version of ChEMBL is used; this time it's ChEMBL21.\n", "\n", "I want to start with molecules that have some connection to each other, so I will pick pairs that have a baseline similarity: a Tanimoto similarity using count-based Morgan0 fingerprints of at least 0.7. I also create a second set of somewhat more closely related molecules where the baseline similarity is 0.6 with a Morgan1 fingerprint. Both thresholds were selected empirically.\n", "\n", "**Note:** this notebook and the data it uses/generates can be found in the GitHub repo: https://github.com/greglandrum/rdkit_blog" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "I'm going to use ChEMBL as my data source, so I'll start by adding a table with Morgan0 fingerprints that only contains molecules with molwt<=600 and a single fragment (we recognize this because there is no '.' 
in the SMILES):\n", "\n", " chembl_21=# select molregno,morgan_fp(m,0) mfp0 into table rdk.tfps_smaller from rdk.mols \n", " join compound_properties using (molregno) \n", " join compound_structures using (molregno) \n", " where mw_monoisotopic<=600 and canonical_smiles not like '%.%';\n", " SELECT 1372487\n", " chembl_21=# create index sfps_mfp0_idx on rdk.tfps_smaller using gist(mfp0);\n", " CREATE INDEX\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And now I'll build the set of pairs using Python. This is definitely doable in SQL, but my SQL-fu isn't that strong.\n", "\n", "Start by getting a set of 35K random small molecules with MW<=600:" ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2016.03.1\n", "Thu Apr 21 11:41:19 2016\n" ] } ], "source": [ "from rdkit import Chem\n", "from rdkit import rdBase\n", "print(rdBase.rdkitVersion)\n", "import time\n", "print(time.asctime())" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import psycopg2\n", "cn = psycopg2.connect(dbname='chembl_21')\n", "curs = cn.cursor()\n", "curs.execute('select molregno,m from rdk.mols join rdk.tfps_smaller using (molregno) order by random() limit 35000')\n", "qs = curs.fetchall()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And now find one neighbor for 25K of those from the mfp0 table of smallish molecules:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done: 0\n", "Done: 1000\n", "Done: 2000\n", "Done: 3000\n", "Done: 4000\n", "Done: 5000\n", "Done: 6000\n", "Done: 7000\n", "Done: 8000\n", "Done: 9000\n", "Done: 10000\n", "Done: 11000\n", "Done: 12000\n", "Done: 13000\n", "Done: 14000\n", "Done: 15000\n", "Done: 16000\n", "Done: 17000\n", "Done: 18000\n", "Done: 
19000\n", "Done: 20000\n", "Done: 21000\n", "Done: 22000\n", "Done: 23000\n", "Done: 24000\n", "Done: 25000\n" ] } ], "source": [ "cn.rollback()\n", "curs.execute('set rdkit.tanimoto_threshold=0.7')\n", "\n", "keep=[]\n", "for i,row in enumerate(qs):\n", " curs.execute('select molregno,m from rdk.mols join (select molregno from rdk.tfps_smaller where mfp0%%morgan_fp(%s,0) '\n", " 'and molregno!=%s limit 1) t2 using (molregno)',(row[1],row[0]))\n", " d = curs.fetchone()\n", " if not d: continue\n", " keep.append((row[0],row[1],d[0],d[1]))\n", " if len(keep)==25000: break\n", " if not i%1000: print('Done: %d'%i)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, write those out to a file so that we can use them elsewhere:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import gzip\n", "outf = gzip.open('../data/chembl21_25K.pairs.txt.gz','wb+')\n", "for idx1,smi1,idx2,smi2 in keep: outf.write(('%d %s %d %s\\n'%(idx1,smi1,idx2,smi2)).encode('UTF-8'))\n", "# close explicitly so the gzip stream is flushed and finalized\n", "outf.close()\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Early analysis of the data\n", "\n", "Start by loading the pairs from the file we saved and creating RDKit molecules from them:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from rdkit import Chem\n", "from rdkit.Chem.Draw import IPythonConsole\n", "IPythonConsole.ipython_useSVG=True\n", "from rdkit.Chem import Draw\n", "import gzip\n" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [], "source": [ "rows=[]\n", "for row in gzip.open('../data/chembl21_25K.pairs.txt.gz').readlines():\n", " row = row.split()\n", " row[1] = Chem.MolFromSmiles(row[1])\n", " row[3] = Chem.MolFromSmiles(row[3])\n", " rows.append(row)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Look at some pairs:" ] }, { "cell_type": "code", "execution_count": 6, 
"metadata": { "collapsed": false }, "outputs": [ { "data": { "image/svg+xml": [ "<!-- grid image of the first five molecule pairs, one pair per row -->" ], "text/plain": [ "" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "t = []\n", "for x in rows[:5]:\n", " t.append(x[1])\n", " t.append(x[3])\n", " \n", "Draw.MolsToGridImage(t,molsPerRow=2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Take a look at property distributions.\n", "\n", "Each plot below contains two histograms. 
The one in blue is for the first set of molecules; the one in green is for the neighbor molecules." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from rdkit.Chem import Descriptors" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [], "source": [ "mws = [(Descriptors.MolWt(x[1]),Descriptors.MolWt(x[3])) for x in rows]\n", "nrots = [(Descriptors.NumRotatableBonds(x[1]),Descriptors.NumRotatableBonds(x[3])) for x in rows]\n", "logps = [(Descriptors.MolLogP(x[1]),Descriptors.MolLogP(x[3])) for x in rows]" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Populating the interactive namespace from numpy and matplotlib\n" ] } ], "source": [ "%pylab inline" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAYQAAAEPCAYAAABCyrPIAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAF7hJREFUeJzt3X+s5XV95/HnC1hAK1JWlznduZQZQ1FwiTCbzlbZpret\njrDNAjEpRZtFKyYm4Go12ZVx/+COaTL1D13cKCbrL8BdF9GunTEh/Ao5yUoLTEcoozPCmDLI3DpH\n3XZxXWLLyHv/OJ9hDnfOde7ce+6vc5+P5GS+532+3+/9fDIz93U+3+/38/2mqpAk6aTlboAkaWUw\nECRJgIEgSWoMBEkSYCBIkhoDQZIEzCEQkpyW5OEkjybZk+SmVr8pycEk32qvywa22Zpkf5J9SbYM\n1DcleTzJk0luXpwuSZLmI3OZh5Dk5VX1XJKTgQeB9wOXA/+3qj4xY90LgC8Dvw5MAPcDv1ZVleRh\n4H1VtSvJXcAnq+qe0XZJkjQfczpkVFXPtcXTgFOAIymSIatfCdxRVYer6gCwH9icpAOcUVW72nq3\nA1fNt+GSpNGaUyAkOSnJo8Ah4L6BX+rvS/JYks8lObPV1gPPDGw+3WrrgYMD9YOtJklaAeY6Qnih\nqi6hfwhoc5ILgVuA11TVxfSD4uOL10xJ0mI75URWrqqfJOkCl804d/BZ4BtteRo4Z+CziVabrX6M\nJN5gSZLmoaqGHcqfk7lcZfTqI4eDkrwMeAvw3XZO4Ii3Ad9uyzuBa5KcmmQjcB7wSFUdAp5NsjlJ\ngGuBHbP93Koa29dNN9207G2wb/bP/o3fa6HmMkL4FeC2JCfRD5CvVNVdSW5PcjHwAnAAeG/7Rb43\nyZ3AXuB54Po62tIbgFuB04G7quruBfdAkjQSxw2EqtoDbBpSv/YXbLMd2D6kvhu46ATbKElaAs5U\nXgaTk5PL3YRFM859A/u32o17/xZqThPTllqSWontkqSVLAm1mCeVJUlrg4EgSQIMBElSYyBIkgAD\nQVp0nYkOSY55dSY6x99YWkJeZSQtsiQwNeSDKUYyu1Q6wquMJEkjYSBIkgADQVrRPP+gpXRCt7+W\ntLR6072h5x96U70lb4vGnyMESRJgIEiSGgNBkgQYCJKkxkCQJAEGgiSpMRAkSYCBIElqDARJEmAg\nSJIaA0GSBMwhEJKcluThJI8m2ZPkplY/K8m9SZ5Ick+SMwe22Zpkf5J9SbYM1DcleTzJk0luXpwu\nSZLm47iBUFX/APx2VV0CXAxcnmQzcCNwf1W9FngA2AqQ5ELgauAC4HLgliRHHtjwGeC6qjofOD/J\nW0fdIUnS/MzpkFFVPdcWT6N/h9QCrgRua/XbgKva8hXAHVV1uKoOAPuBzUk6wBlVtautd/vANpKk\nZTanQEhyUpJHgUPAfe2X+rqq6gFU1SHg7Lb6euCZgc2nW209cHCgfrDVJEkrwJyeh1BVLwCXJHkl\n8PUkr6c/SnjJaqNs2NTU1IvLk5OTTE5OjnL3krTqdbtdut3uyPZ3Qg/IqaqfJOkClwG9JOuqqtcO\nB/2wrTYNnDOw2USrzVYfajAQJEnHmvlledu2bQva31yuMnr1kSuIkrwMeAuwD9gJvKut9k5gR1ve\nCVyT5NQkG4HzgEfaYaVnk2xuJ5mvHdhGkrTM5jJC+BXgtiQn0Q+Qr1TVXUkeAu5M8m7gafpXFlFV\ne5PcCewFngeur6ojh5NuAG4FTgfuqqq7R9obSdK8HTcQqmoPsGlI/e+AN8+yzXZg+5D6buCiE2+m\nJGmxOVNZkgQYCJKkxkCQJAEGgrQidDobSHLMS1pKJzQPQdLi6PWeZvjcTkNBS8cRgiQJMBAkSY2B\nIEkCDARJUmMgSJIAA0GS1BgI0io32xyGTmfDcjdNq4zzEKRVbrY5DL2ecxh0YhwhSJIAA0GS1BgI\nkiTAQJAkNQaCNEedic6xV/JMdJa7WdLIeJWRNEe96R5MzahN9
ZalLdJicIQgSQIMBGlkfMiNVjsP\nGUkj4kNutNo5QpAkAXMIhCQTSR5I8p0ke5L8+1a/KcnBJN9qr8sGttmaZH+SfUm2DNQ3JXk8yZNJ\nbl6cLkmS5mMuh4wOAx+qqseSvALYneS+9tknquoTgysnuQC4GrgAmADuT/JrVVXAZ4DrqmpXkruS\nvLWq7hlddyRJ83XcEUJVHaqqx9ryT4F9wPr28bCDo1cCd1TV4ao6AOwHNifpAGdU1a623u3AVQts\nvyRpRE7oHEKSDcDFwMOt9L4kjyX5XJIzW2098MzAZtOtth44OFA/yNFgkSQtszlfZdQOF30N+EBV\n/TTJLcBHq6qS/AnwceA9o2rY1NTUi8uTk5NMTk6OateSNBa63S7dbndk+5tTICQ5hX4YfKmqdgBU\n1Y8GVvks8I22PA2cM/DZRKvNVh9qMBAkScea+WV527ZtC9rfXA8ZfQHYW1WfPFJo5wSOeBvw7ba8\nE7gmyalJNgLnAY9U1SHg2SSb05+tcy2wY0GtlySNzHFHCEkuBf4Q2JPkUfozbz4CvCPJxcALwAHg\nvQBVtTfJncBe4Hng+naFEcANwK3A6cBdVXX3SHsjSZq34wZCVT0InDzko1l/mVfVdmD7kPpu4KIT\naaAkaWk4U1mSBBgIkqTGQJAkAQaCtKYNu2V3p7NhuZulZeLtr6U1bNgtu3s9b9e9VjlCkCQBBoIk\nqTEQJEmAgSBJagwESRJgIEiSGgNBkgQYCNL4OpljJp0loTPROf62WpOcmCaNq58DU8eWe1O9pW6J\nVglHCJIkwECQJDUGgjTDsBu+9Z/6Ko03zyFIMwy74VufoaDx5ghB0gmbbRTlrbNXN0cIkk7YbKMo\nb529ujlCkCQBBoIkqTEQJEnAHAIhyUSSB5J8J8meJO9v9bOS3JvkiST3JDlzYJutSfYn2Zdky0B9\nU5LHkzyZ5ObF6ZIkaT7mMkI4DHyoql4PvBG4IcnrgBuB+6vqtcADwFaAJBcCVwMXAJcDt+ToRdyf\nAa6rqvOB85O8daS9kbRwC7kHkvdPWtWOe5VRVR0CDrXlnybZB0wAVwK/1Va7DejSD4krgDuq6jBw\nIMl+YHOSp4EzqmpX2+Z24CrgntF1R9KCLeQeSN4/aVU7oXMISTYAFwMPAeuqqgcvhsbZbbX1wDMD\nm0232nrg4ED9YKtJklaAOc9DSPIK4GvAB9pIYeZFyMOmds7b1NTUi8uTk5NMTk6OcveStOp1u126\n3e7I9jenQEhyCv0w+FJV7WjlXpJ1VdVL0gF+2OrTwDkDm0+02mz1oQYDQZJ0rJlflrdt27ag/c31\nkNEXgL1V9cmB2k7gXW35ncCOgfo1SU5NshE4D3ikHVZ6NsnmdpL52oFtJEnL7LgjhCSXAn8I7Eny\nKP1DQx8BPgbcmeTdwNP0ryyiqvYmuRPYCzwPXF9VRw4n3QDcCpwO3FVVd4+2O5Kk+ZrLVUYPAifP\n8vGbZ9lmO7B9SH03cNGJNFCStDScqSxJAgwESVJjIEiSAANBktQYCFpTOhMd77UjzcInpmlN6U33\nvNeONAtHCJIkwECQJDUGgqQl1elsGH4ep7NhuZu25nkOQdKS6vWeZtjNkXu9HLuylpQjBEkSYCBI\nkhoDQZIEGAiSpMZAkCQBBoIkqTEQJEmAgSBJagwESRJgIEiSGgNBkgQYCJKkxkDQWJrtjpqSZnfc\nQEjy+SS9JI8P1G5KcjDJt9rrsoHPtibZn2Rfki0D9U1JHk/yZJKbR98V6aijd9Sc+ZI0m7mMEL4I\nvHVI/RNVtam97gZIcgFwNXABcDlwS45+LfsMcF1VnQ+cn2TYPiWtVSfj866X2XGfh1BV30xy7pCP\nho2/rwTuqKrDwIEk+4HNSZ4GzqiqXW2924GrgHvm2W5J4+bnHPd5153Ohjb6e6l1687l0KEDi9a0\ntWIh5xDel+SxJJ9Lcmarr
QeeGVhnutXWAwcH6gdbTZLmbLZDgcNCQiduvk9MuwX4aFVVkj8BPg68\nZ3TNgqmpqReXJycnmZycHOXuJWnV63a7dLvdke1vXoFQVT8aePtZ4BtteRo4Z+CziVabrT6rwUCQ\nJB1r5pflbdu2LWh/cz1kFAbOGSQZPMvzNuDbbXkncE2SU5NsBM4DHqmqQ8CzSTa3k8zXAjsW1HJJ\n0kgdd4SQ5MvAJPCqJN8HbgJ+O8nFwAvAAeC9AFW1N8mdwF7geeD6qjpyrd8NwK3A6cBdR65MkiSt\nDHO5yugdQ8pf/AXrbwe2D6nvBi46odZJkpaMM5UlSYCBIElqDARJEmAgSJIaA0GSBBgIkqTGQJAk\nAQaCJKkxECRJgIEgSWoMBEmr35CnrfmktRM33+chSNLKMeRpa4NPWtPcOEKQJAEGgiSpMRAkSYCB\nIElqDARJEmAgSJIaA0GSBBgIWoU6E51jJiE5EUlaOCemadXpTfeOmYQETkSSFsoRgiQJMBAkSc1x\nAyHJ55P0kjw+UDsryb1JnkhyT5IzBz7bmmR/kn1JtgzUNyV5PMmTSW4efVckSQsxlxHCF4G3zqjd\nCNxfVa8FHgC2AiS5ELgauAC4HLglSdo2nwGuq6rzgfOTzNynJGkZHTcQquqbwN/PKF8J3NaWbwOu\nastXAHdU1eGqOgDsBzYn6QBnVNWutt7tA9tIklaA+Z5DOLuqegBVdQg4u9XXA88MrDfdauuBgwP1\ng60mSYuu09kw/FLlzoblbtqKMqrLTmtE+3nR1NTUi8uTk5NMTk6O+kdohet0NtDrPb3czdAY6P87\nOvbXVK+XY1deRbrdLt1ud2T7m28g9JKsq6peOxz0w1afBs4ZWG+i1Warz2owELQ2zfafGFb3f2Jp\nVGZ+Wd62bduC9jfXQ0bhpf8LdwLvasvvBHYM1K9JcmqSjcB5wCPtsNKzSTa3k8zXDmwjSVoBjjtC\nSPJlYBJ4VZLvAzcBfwp8Ncm7gafpX1lEVe1NciewF3geuL6qjnzFuwG4FTgduKuq7h5tVyRJC3Hc\nQKiqd8zy0ZtnWX87sH1IfTdw0Qm1TpK0ZJypLEkCDARJUmMgSJIAA0GS1BgIkiTAQJAkNQaCJAkw\nECRJjYGgZdGZ6Ay/++REZ7mbJq1Zo7rbqXRCetM9mBpSn+oteVsk9TlCkCQBBoKktexkjnvoci09\nXMdDRpLWrp9z3EOX4/pwnWEcIUiSAANBktQYCJIkwECQJDUGghbdsKs0JK08XmWkRTf8Kg1DQVpp\nHCFIkgADQZLmZw6T2lYbDxlJ0nzMYVLbauMIQZIWyWq77cWCAiHJgSR/neTRJI+02llJ7k3yRJJ7\nkpw5sP7WJPuT7EuyZaGNl6SV7OgFFS999esrz0JHCC8Ak1V1SVVtbrUbgfur6rXAA8BWgCQXAlcD\nFwCXA7fE6w8lacVYaCBkyD6uBG5ry7cBV7XlK4A7qupwVR0A9gObkSStCAsNhALuS7IryXtabV1V\n9QCq6hBwdquvB54Z2Ha61SRJK8BCrzK6tKp+kOSfAfcmeYJjZyAde9/YOZiamnpxeXJyksnJyfm2\nUZLGUrfbpdvtjmx/CwqEqvpB+/NHSf6c/iGgXpJ1VdVL0gF+2FafBs4Z2Hyi1YYaDARJ0rFmflne\ntm3bgvY370NGSV6e5BVt+ZeALcAeYCfwrrbaO4EdbXkncE2SU5NsBM4DHpnvz5ckjdZCRgjrgK8n\nqbaf/15V9yb5K+DOJO8GnqZ/ZRFVtTfJncBe4Hng+qqa1+EkSRp3nc6GYy5PXbfuXA4dOrBoP3Pe\ngVBVTwEXD6n/HfDmWbbZDmyf78/UytOZ6NCbPnZm5rr16zh08NAytEgaD8NuCrnYj+301hVakN50\nb+ym70trlbeukCQBBoIkLb0VeqdUDxlJ0lJboXdKdYQgSQIMBM3RbLfxlTQ+PGSkORn+XGT
w2cjS\n+HCEIEkCDARJUmMgSJIAA0GSVo9Fnr/gSWVJWi0Wef6CIwRJEmAgSJIaA0GSBBgIa8pss407nQ3L\n3TRJK4CBsIYcnW380lfvx0+vyDsvSlpaXmWkFXvnRUlLyxGCJAkwECRJjYEgSQIMhFXHK4UkLZYl\nD4QklyX5bpInk3x4qX/+aueVQpIWy5JeZZTkJOBTwO8CfwvsSrKjqr67lO1Ybt1ul8nJydHudKVc\nKfQUsHFpf+SSGvf+0V3uBiyyLjC5zG1YuZZ6hLAZ2F9VT1fV88AdwJVL3IZl1+12Z/2sM9FZ3d/y\nDyx3AxbZgeVuwGLrLncDFll3uRuwoi31PIT1wDMD7w/SD4lV5/Dhwzz11FPH1J977jl+9rOfHfO8\n4dNOO403vOENL77vdDa0wz9DTL30rfMBJC0FJ6bN06c+9Wk++ME/PvaDk+kfvhli9+7dbNq0CfAZ\nxZJWnlQN+6W0SD8s+Q1gqqoua+9vBKqqPjZjvaVrlCSNkaqa97fKpQ6Ek4En6J9U/gHwCPD2qtq3\nZI2QJA21pIeMqurnSd4H3Ev/hPbnDQNJWhmWdIQgSVq5VtRM5XGYtJbk80l6SR4fqJ2V5N4kTyS5\nJ8mZA59tTbI/yb4kW5an1XOTZCLJA0m+k2RPkve3+rj077QkDyd5tPXvplYfi/4dkeSkJN9KsrO9\nH5v+JTmQ5K/b3+EjrTZO/TszyVdbe7+T5F+NtH9VtSJe9MPpe8C5wD8BHgNet9ztmkc//jVwMfD4\nQO1jwH9syx8G/rQtXwg8Sv/Q3YbW/yx3H35B3zrAxW35FfTPB71uXPrX2vzy9ufJwEP0L4sem/61\ndn8Q+G/AznH699na/DfAWTNq49S/W4E/asunAGeOsn8raYQwFpPWquqbwN/PKF8J3NaWbwOuastX\nAHdU1eGqOgDsZwXPy6iqQ1X1WFv+KbAPmGBM+gdQVc+1xdPo/0cqxqh/SSaAfwN8bqA8Nv2jf932\nzN9rY9G/JK8EfrOqvgjQ2v0sI+zfSgqEYZPW1i9TW0bt7KrqQf+XKnB2q8/s8zSrpM9JNtAfCT0E\nrBuX/rXDKY8Ch4D7qmoXY9Q/4D8D/4GXToIZp/4VcF+SXUne02rj0r+NwI+TfLEd8vuvSV7OCPu3\nkgJhLVnVZ/KTvAL4GvCBNlKY2Z9V27+qeqGqLqE/8tmc5PWMSf+S/B7Qa6O8X3St+qrsX3NpVW2i\nPwq6IclvMiZ/f/RHrJuAT7c+/j/gRkbYv5UUCNPArw68n2i1cdBLsg4gSQf4YatPA+cMrLfi+5zk\nFPph8KWq2tHKY9O/I6rqJ/RvfHMZ49O/S4ErkvwN8D+A30nyJeDQmPSPqvpB+/NHwJ/TP0QyLn9/\nB4Fnquqv2vs/ox8QI+vfSgqEXcB5Sc5NcipwDbBzmds0X+Gl38B2Au9qy+8EdgzUr0lyapKNwHn0\nJ+utZF8A9lbVJwdqY9G/JK8+coVGkpcBb6F/nmQs+ldVH6mqX62q19D///VAVf074BuMQf+SvLyN\nXknyS8AWYA/j8/fXA55Jcn4r/S7wHUbZv+U+az7jDPpl9K9c2Q/cuNztmWcfvkz/1t7/AHwf+CPg\nLOD+1rd7gV8eWH8r/bP/+4Aty93+4/TtUvp3anqM/tUL32p/Z/90TPp3UevTY8DjwH9q9bHo34y+\n/hZHrzIai/7RP8Z+5N/mniO/Q8alf629b6D/5fkx4H/Sv8poZP1zYpokCVhZh4wkScvIQJAkAQaC\nJKkxECRJgIEgSWoMBEkSYCBIJLkqyQtHJvy0yZEvJPnowDqvSvKPSf5LuwXxjwc+e2Nb/5+3969M\n8r+XvifSwhgIUn/W7v8C3j5Qewr4vYH3vw98G6D6d5j82ySva5+9kf6Etje1978BPLyYDZYWg4Gg\nNa3d4uBS4DpeGgjPAfuSbGrv/wC4c+Dzv+RoALyJ/l1
EB98/uFhtlhaLgaC17krg7qr6Hv1bC18y\n8NkdwNvbMwQO078lyREPcjQANgJfBX69vX8T8BeL2mppERgIWuveTv8XP8BXgHe05QLupn+Du2va\nZ4M3LPwL4NL2XIgDVfWP8OKI41/iISOtQqcsdwOk5ZLkLOB3gH+RpOg/NrOAT0P/iVRJdgMfov84\nwhef4FdV30vyy8C/pX/4CGA3/ZsZPlVHn7wmrRqOELSW/T5we1VtrKrXVNW59E8mn8PR0cDHgQ9X\n1f8Zsv1DwAc4GggPAX+M5w+0ShkIWsv+APj6jNqf0b9l8AsAVbW3qr40y/YP0n/oyJEHlvwl/fMJ\nBoJWJW9/LUkCHCFIkhoDQZIEGAiSpMZAkCQBBoIkqTEQJEmAgSBJagwESRIA/x9z721WBVtXZAAA\nAABJRU5ErkJggg==\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "_=hist(([x for x,y in mws],[y for x,y in mws]),bins=20,histtype='bar')\n", "xlabel('AMW')" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYEAAAEPCAYAAACk43iMAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAFS5JREFUeJzt3X+QXeV93/H3B1GMf2BKk6LtSKQipcLgMsZqrbRD2rkZ\nY344E0Q7LYPbBgg00xnIONNMM0bpTFn9JTvNOMRtodPaAeHiIXJTAmkZEFQsU6d1UQAbbClCM6kU\npLEu9jjFYzqN+fHtH/cIXa122bur1d7dfd6vmTs697vPOfvcq7vnc5/zM1WFJKlNZ4y7A5Kk8TEE\nJKlhhoAkNcwQkKSGGQKS1DBDQJIaNlIIJDk3yVeT7Evy7SQ/leS8JLuS7E/yRJJzh9pvTXKga3/V\nUH1TkheTvJzk7tPxgiRJoxt1JPBbwGNVdQnwEeCPgDuBp6rqYmA3sBUgyaXADcAlwLXAPUnSLede\n4Laq2ghsTHL1or0SSdK8zRkCST4I/O2qug+gqt6sqteALcCOrtkO4Ppu+jrgoa7dQeAAsDnJBHBO\nVe3p2j0wNI8kaQxGGQlcCHwvyX1Jnk/y75O8D1hbVX2AqjoKnN+1Xwe8MjT/ka62Djg8VD/c1SRJ\nYzJKCJwJbAL+bVVtAl5nsClo+vUmvP6EJK0wZ47Q5jDwSlX9Yff8dxmEQD/J2qrqd5t6Xu1+fgS4\nYGj+9V1ttvpJkhgokrQAVZW5Wx0350ig2+TzSpKNXenjwLeBR4FbutrNwCPd9KPAjUnOSnIhcBHw\nbLfJ6LUkm7sdxTcNzTPT7/VRxV133TX2PiyXx0p6LwCYnPZYxM/1Snov/Fws3WMhRhkJAHwaeDDJ\nnwP+GPgFYA2wM8mtwCEGRwRRVXuT7AT2Am8At9fx3t0B3A+czeBoo8cX1GtJ0qIYKQSq6pvAx2b4\n0ZWztN8ObJ+h/hxw2Xw6KEk6fTxjeJnr9Xrj7sKy4XtxnO/Fcb4XpyYL3Y50OiWp5dgvaVRJ3tkP\n8I5JFrzdVhpFEmqxdwxLklYvQ0CSGmYISFLDDAHpFE1MbCDJCQ9ppRj1PAFJs+j3D3HyVVMMAq0M\njgQkqWGGgCQ1zBCQpIYZApLUMENAkhpmCEhSwwwBSWqYISB
JDTMEJKlhhoAkNcwQkKSGGQKS1DBD\nQJIaZghIUsMMAUlqmCEgSQ0zBCSpYYaAJDXMEJCkhhkCktQwQ0CSGjZSCCQ5mOSbSV5I8mxXOy/J\nriT7kzyR5Nyh9luTHEiyL8lVQ/VNSV5M8nKSuxf/5UiS5mPUkcDbQK+qPlpVm7vancBTVXUxsBvY\nCpDkUuAG4BLgWuCeJOnmuRe4rao2AhuTXL1Ir0OStACjhkBmaLsF2NFN7wCu76avAx6qqjer6iBw\nANicZAI4p6r2dO0eGJpHkjQGo4ZAAU8m2ZPkn3S1tVXVB6iqo8D5XX0d8MrQvEe62jrg8FD9cFeT\nJI3JmSO2u6KqvpPkLwK7kuxnEAzDpj+XJC1zI4VAVX2n+/e7SX4P2Az0k6ytqn63qefVrvkR4IKh\n2dd3tdnqM5qcnHxnutfr0ev1RumqJDVjamqKqampU1pGqt79C3yS9wFnVNUPk7wf2AVsAz4OfL+q\nPpfkM8B5VXVnt2P4QeCnGGzueRL4q1VVSb4OfBrYA/xX4AtV9fgMv7Pm6pe0XAyOe5j+eQ1MTitN\ngp9rnU5JqKrM3fK4UUYCa4GHk1TX/sGq2pXkD4GdSW4FDjE4Ioiq2ptkJ7AXeAO4fWiNfgdwP3A2\n8NhMASBJWjpzjgTGwZGAVhJHAlouFjIS8IxhSWqYISBJDTMEpCU2MbGBJCc81rxnzUm1ifUT4+6q\nGjDqeQKSFkm/f4jp+xDe/tHJ+xD6k/0l65Pa5UhAkhpmCEhSwwwBSWqYISBJDTMEJKlhhoA0zUyH\ncE5MbBh3t6TTwkNEpWlmOoSz35/XmfjSiuFIQJIaZghIUsMMAUlqmCEgSQ0zBCSpYYaAJDXMEJCk\nhhkCktQwQ0CSGmYISFLDDAFJapghIEkNMwSkUazBG8FrVfIqotIo3sIbwWtVciQgSQ0zBCSpYYaA\nJDXMEJCkho0cAknOSPJ8kke75+cl2ZVkf5Inkpw71HZrkgNJ9iW5aqi+KcmLSV5OcvfivhRJ0nzN\nZyTwy8Deoed3Ak9V1cXAbmArQJJLgRuAS4BrgXuSHLtB673AbVW1EdiY5OpT7L8k6RSMFAJJ1gOf\nBL44VN4C7OimdwDXd9PXAQ9V1ZtVdRA4AGxOMgGcU1V7unYPDM0jSRqDUUcCvwn8KlBDtbVV1Qeo\nqqPA+V19HfDKULsjXW0dcHiofrirSZLGZM6TxZL8LNCvqm8k6b1L03qXn83b5OTkO9O9Xo9e791+\ntSS1Z2pqiqmpqVNaxihnDF8BXJfkk8B7gXOSfBk4mmRtVfW7TT2vdu2PABcMzb++q81Wn9FwCEiS\nTjb9C/K2bdvmvYw5NwdV1a9V1U9U1U8CNwK7q+rngd8Hbuma3Qw80k0/CtyY5KwkFwIXAc92m4xe\nS7K521F809A8kqQxOJVrB30W2JnkVuAQgyOCqKq9SXYyOJLoDeD2qjq2qegO4H7gbOCxqnr8FH6/\nJOkUzSsEquoZ4Jlu+vvAlbO02w5sn6H+HHDZ/LspSTodPGNYkhpmCEhSwwwBSWqYISBJDTMEJKlh\nhoAkNcwQkKSGGQKS1DBDQJIaZghIUsMMAUlqmCEgSQ0zBCSpYYaAtAJNrJ8gyQmPifUT4+6WVqBT\nuZ+ApDHpH+nD5LTaZH8sfdHK5khAkhpmCEhSwwwBSWqYISBJDTMEJKlhhoAkNcwQkKSGGQKS1DBD\nQJIaZghIUsMMAUlqmCEgSQ0zBCSpYXOGQJL3JPlfSV5I8lKSu7r6eUl2Jdmf5Ikk5w7NszXJgST7\nklw1VN+U5MUkLye5+/S8JEnSqOYMgar6M+BnquqjwOXAtUk2A3cCT1XVxcBuYCtAkkuBG4BLgGuB\ne5KkW9y9wG1VtRHYmOTqxX5BkqTRjbQ5qKr+bzf5Hgb3IChgC7Cjq+8Aru+mrwMeqqo3q+ogcADY\nnGQCOKeq9nTtHhiaR5I
0BiOFQJIzkrwAHAWe7Fbka6uqD1BVR4Hzu+brgFeGZj/S1dYBh4fqh7ua\npDlMTGw44S5i0mIZ6c5iVfU28NEkHwQeTvJhBqOBE5otZscmJyffme71evR6vcVcvLSi9PuHOPFP\nzCAQTE1NMTU1dUrLmNftJavqB0mmgGuAfpK1VdXvNvW82jU7AlwwNNv6rjZbfUbDISBJOtn0L8jb\ntm2b9zJGOTrox48d+ZPkvcAngH3Ao8AtXbObgUe66UeBG5OcleRC4CLg2W6T0WtJNnc7im8amkeS\nNAajjAT+ErAjyRkMQuN3quqxJF8Hdia5FTjE4Iggqmpvkp3AXuAN4PaqOjaOvQO4HzgbeKyqHl/U\nVyNJmpc5Q6CqXgI2zVD/PnDlLPNsB7bPUH8OuGz+3ZQknQ6eMSxJDTMEJKlhhoAkNcwQkKSGGQKS\n1DBDQJIaZghIUsMMAUlqmCEgSQ0zBCSpYYaAJDXMEJCkhhkCas7E+okT7tKVhIn1E+PuljQW87qp\njLQa9I/0YXJabbI/lr5I4+ZIQKue9+eVZudIQKue9+eVZudIQJIaZghIUsMMAUlqmCEgSQ0zBCSp\nYYaAJDXMEJCkhhkCktQwQ0CSGmYISFLDDAFJapghIEkNmzMEkqxPsjvJt5O8lOTTXf28JLuS7E/y\nRJJzh+bZmuRAkn1Jrhqqb0ryYpKXk9x9el6SJGlUo4wE3gR+pao+DPwt4I4kHwLuBJ6qqouB3cBW\ngCSXAjcAlwDXAvfk+PV77wVuq6qNwMYkVy/qq5EkzcucIVBVR6vqG930D4F9wHpgC7Cja7YDuL6b\nvg54qKrerKqDwAFgc5IJ4Jyq2tO1e2BoHknSGMxrn0CSDcDlwNeBtVXVh0FQAOd3zdYBrwzNdqSr\nrQMOD9UPdzVJ0piMfFOZJB8A/hPwy1X1wyQ1rcn056dkcnLyneler0ev11vMxUvSijc1NcXU1NQp\nLWOkEEhyJoMA+HJVPdKV+0nWVlW/29Tzalc/AlwwNPv6rjZbfUbDISBJOtn0L8jbtm2b9zJG3Rz0\n28DeqvqtodqjwC3d9M3AI0P1G5OcleRC4CLg2W6T0WtJNnc7im8amkeSNAZzjgSSXAH8I+ClJC8w\n2Ozza8DngJ1JbgUOMTgiiKram2QnsBd4A7i9qo5tKroDuB84G3isqh5f3JcjSZqPOUOgqv4AWDPL\nj6+cZZ7twPYZ6s8Bl82ng5Kk08czhiWpYYaAJDXMEJCkhhkCktQwQ0CSGmYISFLDDAFJapghIEkN\nMwQkqWGGgCQ1zBCQpIYZApLUMENAkhpmCEhSwwwBaRWamNhAkhMeExMbxt0tLUOGgLQK9fuHGNz/\n6fij/71DJwfD+onxdlRjN/KN5iWtcG8BkyeW+pP9cfREy4gjAUlqmCEgSQ0zBCSpYYaAJDXMEJCk\nhhkCktQwQ0CSGmYISFLDDAFJapghIEkNMwQkqWFzhkCSLyXpJ3lxqHZekl1J9id5Ism5Qz/bmuRA\nkn1Jrhqqb0ryYpKXk9y9+C9FkjRfo4wE7gOunla7E3iqqi4GdgNbAZJcCtwAXAJcC9yTJN089wK3\nVdVGYGOS6cuU5uQlkqXFNWcIVNXXgD+dVt4C7OimdwDXd9PXAQ9V1ZtVdRA4AGxOMgGcU1V7unYP\nDM0jjWzGSyT3D423U9IKttB9AudXVR+gqo4C53f1dcArQ+2OdLV1wOGh+uGuJp26NXidfGmBFut+\nArVIy5Hmz+vkSwu20BDoJ1lbVf1uU8+rXf0IcMFQu/Vdbbb6rCYnJ9+Z7vV69Hq9BXZVklanqakp\npqamTmkZo4ZAuscxjwK3AJ8DbgYeGao/mOQ3GWzuuQh4tqoqyWtJNgN7gJuAL7zbLxwOAUnSyaZ/\nQd62bdu8lzFnCCT5CtADfizJnwB3AZ8FvprkVuAQgyOCqKq9SXYCe4E3gNur6timojuA+
4Gzgceq\n6vF591aStKjmDIGq+oez/OjKWdpvB7bPUH8OuGxevZMknVaeMSxJDTMEJKlhhoAkNcwQkKSGGQKS\n1DBDQJIaZghIUsMMAUlqmCEgSQ0zBCSpYYaAJO/Y1rDFup+ApBXs+B3bhmuZubFWFUcCktQwQ0CS\nGmYISFLDDAFJapghIEkNMwQkqWGGgMbK49OXsTWc/H+zfmLcvdIi8zwBjZXHpy9jbwGTJ5b6k/1x\n9ESnkSMBLT9+A5WWjCMBLT9+A5WWjCMBSWqYISBJDTMEJKlhhoCkefPQ3tXDENCimmnlsOY9azza\nZ5U5fmjv8cegppXGo4O0qGY67v/tH8WjfVrQHdo7bO26tRw9fHRMHdIoljwEklwD3M1gFPKlqvrc\nUvdB0mngob0r0pJuDkpyBvBvgKuBDwOfSvKhpezDSjM1NTXuLiwbvheaiZ+LU7PU+wQ2Aweq6lBV\nvQE8BGxZ4j6sKOP8gC+37fv+sa9sp2tnsp+LU7PUIbAOeGXo+eGupiUw35X6TDv/3v7R24Mh/9Cj\nf8Qhv+Y2487k7x2a1xeKifUTJ31ef+Pzv7EEvV+93DE8gkOHDrFhw4aR2j799NP0er3T2p9jJiY2\nnHRExhlnnTFYUQ85tnPOnbZadqbtRxj+7M30+QZO+ry+Pvn66ehZM1JVc7darF+W/E1gsqqu6Z7f\nCdT0ncNJlq5TkrSKVNW8LsO71CGwBtgPfBz4DvAs8Kmq2rdknZAkvWNJNwdV1VtJfgnYxfFDRA0A\nSRqTJR0JSJKWl2Vz2Ygkfz/Jt5K8lWTTtJ9tTXIgyb4kV42rj+OQ5K4kh5M83z2uGXefllqSa5L8\nUZKXk3xm3P0ZpyQHk3wzyQtJnh13f5ZSki8l6Sd5cah2XpJdSfYneSLJuePs41KZ5b1Y0Lpi2YQA\n8BLwd4FnhotJLgFuAC4BrgXuyfRz01e/z1fVpu7x+Lg7s5Q8wfAkbwO9qvpoVW0ed2eW2H0MPgfD\n7gSeqqqLgd3A1iXv1XjM9F7AAtYVyyYEqmp/VR0Apq/gtwAPVdWbVXUQOMDgpLOWtBZ6wzzB8ERh\nGf3dLqWq+hrwp9PKW4Ad3fQO4Pol7dSYzPJewALWFSvhwzT9BLMjtHeC2S8l+UaSL7Yy3B3iCYYn\nKuDJJHuS/OK4O7MMnF9VfYCqOgqcP+b+jNu81xVLfe2gJ5O8OPR4qfv355ayH8vNHO/LPcBPVtXl\nwFHg8+PtrcbsiqraBHwSuCPJT4+7Q8tMy0e6LGhdsdSHiH5iAbMdAS4Yer6+q60a83hf/gPw+6ez\nL8vQEeAnhp6vuv//+aiq73T/fjfJwww2l31tvL0aq36StVXVTzIBvDruDo1LVX136OnI64rlujlo\neLvWo8CNSc5KciFwEYOTzJrQfbCP+XvAt8bVlzHZA1yU5C8nOQu4kcFnojlJ3pfkA930+4GraO/z\nEE5eP9zSTd8MPLLUHRqjE96Lha4rls21g5JcD/xr4MeB/5LkG1V1bVXtTbIT2Au8AdxebZ3c8OtJ\nLmdwVMhB4J+OtztLyxMMT7AWeLi7rMqZwINVtWvMfVoySb4C9IAfS/InwF3AZ4GvJrkVOMTgSMJV\nb5b34mcWsq7wZDFJathy3RwkSVoChoAkNcwQkKSGGQKS1DBDQJIaZghIUsMMAeldJLk5yRe66buS\n/Mq4+yQtJkNAkhpmCGjV6y45sS/Jfd3NR/5jko8n+Vr3/G90Nyd5uLthy/9I8tfmWOblSf5nd8XG\n3z12xcYkH+uW8XySX0/yUle/OcnvJXm6+53/cileuzQXQ0Ct+CvAv+puPvIh4FNV9dPAPwf+BbAN\neL6qPtI9//Icy9sB/Gp3xcZvMThtH+C3gV/srvT5Fide1fJjDG6c9BHgH0y/g540DoaAWvG/q2pv\nN/1t4L91098CNgBX0K34q+pp4C8cu1jbdEk+CJzb3
dgDBoHwd7rRwAeq6tgFDr8ybdYnq+r/VNX/\nA/4z4GWgNXaGgFrxZ0PTbw89f5uZL6Q41x2aZvv5u803/UJdXrhLY2cIqBVzrdT/O/CPAZL0gO9W\n1Q9nalhVPwC+n+SKrvTzwDNV9RrwgyQf6+o3Tpv1E0n+fJL3MrgN4h/M/2VIi2vZXEpaOs1qlulj\nzyeB+5J8E3gduGmO5d0C/Ltuhf7HwC909duALyZ5C3gGeG1onmcZbAZaB3y5qp6f/8uQFpeXkpYW\nUZL3V9Xr3fRngImq+mdJbgb+elV9erw9lE7kSEBaXD+bZCuDv62DHL/rlbQsORKQpIa5Y1iSGmYI\nSFLDDAFJapghIEkNMwQkqWGGgCQ17P8DPIzoHjTWDbcAAAAASUVORK5CYII=\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "_=hist(([x for x,y in logps],[y for x,y in logps]),bins=20,histtype='bar')\n", "xlabel('mollogp')" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYEAAAEPCAYAAACk43iMAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAG0lJREFUeJzt3X+UXOV93/H3R+KHwPywcKLdWEsQBIsgQgqiWdvg2pMD\nFsitkfojimw3gCFtY2RDk5PWktsT7fYklXHdGic54Dp2YHFwscw5RIqrSoIjhjgmIEX8kGwJSScg\nIcnewRhbBmMbJL79Y56VLrsz2pnV7s6Mns/rnD1797vPc+e5F7Sfuc+de68iAjMzy9OUVg/AzMxa\nxyFgZpYxh4CZWcYcAmZmGXMImJllzCFgZpaxhkJA0u9L+rakLZLulXSSpOmS1kvaIWmdpDML7ZdJ\n2iVpu6R5hfrctI6dkm6fiA0yM7PGjRoCkt4OfAKYGxG/DpwAfAhYCjwUERcAG4Blqf0cYBFwITAf\nuEOS0uruBG6KiNnAbElXj/P2mJlZExqdDpoKvEXSCcApwH5gATCQfj8ALEzL1wL3RcTBiNgN7AJ6\nJXUDp0fEptTunkIfMzNrgVFDICK+C/xP4Hmqf/wPRMRDQFdEVFKbQWBG6jIT2FtYxf5UmwnsK9T3\npZqZmbVII9NBb6X6rv8c4O1Ujwg+Agy/34TvP2Fm1mFOaKDNVcCzEfESgKQHgMuBiqSuiKikqZ4X\nUvv9wNmF/j2pVq8+giQHipnZGESERm91RCPnBJ4H3iVpWjrBeyWwDVgN3JDaXA+sSsurgcXpE0Tn\nAucDG9OU0QFJvWk91xX61NqQjv1avnx5y8eQ49g9/tZ/efyt/RqLUY8EImKjpPuBJ4HX0/cvAqcD\nKyXdCOyh+okgImKbpJVUg+J14OY4MrolwN3ANGBNRKwd06jNzGxcNDIdRET0A/3Dyi9RnSqq1X4F\nsKJGfTNwcZNjNDOzCeIrhidAqVRq9RDGrJPHDh5/q3n8nUdjnUeaSJKiHcdlZtbOJBETcGLYzMyO\nUw4BM7OMOQTMzDLmEDAzy5hDwMwsYw4BM7OMOQTMzDLmEDAzy5hDwMwsYw4BM7OMOQTMzDLmEDAz\ny5hDwMwsYw4BM7OMOQTMzDLmEDAzy5hDwMwsY6OGgKTZkp6U9ET6fkDSLZKmS1ovaYekdZL
OLPRZ\nJmmXpO2S5hXqcyVtkbRT0u0TtVFmZtaYUUMgInZGxKURMRe4DPgJ8ACwFHgoIi4ANgDLACTNARYB\nFwLzgTskDT3u7E7gpoiYDcyWdPV4b5CZmTWu2emgq4B/jIi9wAJgINUHgIVp+Vrgvog4GBG7gV1A\nr6Ru4PSI2JTa3VPoY2ZmLdBsCPw28NW03BURFYCIGARmpPpMYG+hz/5UmwnsK9T3pZqZmbVIwyEg\n6USq7/K/nkoxrMnwn7PW3T0LSSO+urtntXpoZmaHndBE2/nA5oh4Mf1ckdQVEZU01fNCqu8Hzi70\n60m1evWa+vr6Di+XSiVKpVITQ229SmUPtXKxUtHIxmZmY1AulymXy8e0DkU09gZe0v8B1kbEQPr5\nNuCliLhN0ieB6RGxNJ0Yvhd4J9XpngeBd0RESHoMuAXYBPxf4E8jYm2N14pGx9WuqufCa22D6PRt\nM7P2JImIaOqdZkMhIOlUYA9wXkS8nGpnASupvrvfAyyKiB+l3y0DbgJeB26NiPWpfhlwNzANWBMR\nt9Z5PYeAmVmTJiwEJptDwMyseWMJAV8xbGaWMYeAmVnGHAJmZhlzCJiZZcwhYGaWMYeAmVnGHAJm\nZhlzCJiZZcwhYGaWMYeAmVnGHAJmZhlzCJiZZcwhYGaWMYeAmVnGHAJmZhlzCJiZZcwhYGaWMYeA\nmVnGHAJmZhlrKAQknSnp65K2S/qOpHdKmi5pvaQdktZJOrPQfpmkXan9vEJ9rqQtknZKun0iNmg0\n3T3dSKr51d3T3YohmZm1TEMPmpd0N/BIRNwl6QTgLcCngB9ExGckfRKYHhFLJc0B7gV+A+gBHgLe\nEREh6XHg4xGxSdIa4PMRsa7G603Yg+YlQV+dX/Yxbg+B94PmzWyyTciD5iWdAfyziLgLICIORsQB\nYAEwkJoNAAvT8rXAfandbmAX0CupGzg9IjaldvcU+piZWQs0Mh10LvCipLskPSHpi5JOBboiogIQ\nEYPAjNR+JrC30H9/qs0E9hXq+1LNzMxa5IQG28wFlkTEP0j6HLCUkXMd4zrH0dfXd3i5VCpRKpXG\nc/VmZh2vXC5TLpePaR2jnhOQ1AX8fUScl35+D9UQ+BWgFBGVNNXzcERcKGkpEBFxW2q/FlgO7Blq\nk+qLgfdFxMdqvKbPCZiZNWlCzgmkKZ+9kman0pXAd4DVwA2pdj2wKi2vBhZLOknSucD5wMY0ZXRA\nUq+qfyGvK/QxM7MWaGQ6COAW4F5JJwLPAh8FpgIrJd1I9V3+IoCI2CZpJbANeB24ufC2fglwNzAN\nWBMRa8drQ8zMrHkNfUR0sh3X00FTBYdGlrtmdjG4b3BcXtvM8jSW6aBGjwRsvByiZghV+iqTPRIz\nM982wswsZw4BM7OMOQTMzDLmEDAzy5hDwMwsYw4BM7OMOQTMzDJ23IZAd/esmg+OMTOzI47bi8Uq\nlT3Uu4GbmZlVHbdHAmZmNjqHgJlZxhwCZmYZcwiYmWXMIWBmljGHgJlZxhwCZmYZcwiYmWXMIWBm\nlrGGQkDSbklPS3pS0sZUmy5pvaQdktZJOrPQfpmkXZK2S5pXqM+VtEXSTkm3j//mmJlZMxo9EngD\nKEXEpRHRm2pLgYci4gJgA7AMQNIcYBFwITAfuENHbtpzJ3BTRMwGZku6epy2w8zMxqDREFCNtguA\ngbQ8ACxMy9cC90XEwYjYDewCeiV1A6dHxKbU7p5CHzMza4FGQyCAByVtkvS7qdYVERWAiBgEZqT6\nTGBvoe/+VJsJ7CvU96WamZm1SKN3Eb0iIr4n6ReB9ZJ2MPIWnbVu2TlmfX19h5dLpRKlUmk8V29m\n1vHK5TLlcvmY1tFQCETE99L370v6a6AXqEjqiohKmup5ITXfD5xd6N6TavXqNRVDwMzMRhr+Brm/\nv7/pdYw6HSTpVEmnpeW3APOArcBq4IbU7HpgVVpeDSy
WdJKkc4HzgY1pyuiApN50ovi6Qh8zM2uB\nRo4EuoAHJEVqf29ErJf0D8BKSTcCe6h+IoiI2CZpJbANeB24OSKGpoqWAHcD04A1EbF2XLfGzMya\nMmoIRMRzwCU16i8BV9XpswJYUaO+Gbi4+WGamdlE8BXDZmYZcwiYmWXMIWBmljGHgJlZxhwCZmYZ\ncwg0qbunG0kjvrp7uls9NDOzpjV62whLKvsr0Fej3leZ9LGYmR0rHwmYmWXMIWBmljGHgJlZxhwC\nZmYZcwiYmWXMIWBmljGHgJlZxhwCZmYZcwiYmWXMIWBmljGHgJlZxhwCZmYZazgEJE2R9ISk1enn\n6ZLWS9ohaZ2kMwttl0naJWm7pHmF+lxJWyTtlHT7+G6KmZk1q5kjgVuBbYWflwIPRcQFwAZgGYCk\nOcAi4EJgPnCHJKU+dwI3RcRsYLakq49x/GZmdgwaCgFJPcAHgC8VyguAgbQ8ACxMy9cC90XEwYjY\nDewCeiV1A6dHxKbU7p5CHzMza4FGjwQ+B/wnIAq1roioAETEIDAj1WcCewvt9qfaTGBfob4v1czM\nrEVGfaiMpH8OVCLiKUmlozSNo/yuaX19fYeXS6USpdLRXtrMLD/lcplyuXxM62jkyWJXANdK+gBw\nCnC6pK8Ag5K6IqKSpnpeSO33A2cX+vekWr16TcUQMDOzkYa/Qe7v7296HaNOB0XEpyLilyPiPGAx\nsCEifgf4G+CG1Ox6YFVaXg0slnSSpHOB84GNacrogKTedKL4ukKfttPdPavms4Rb+dqS6O6eNSlj\nMLM8HMszhj8NrJR0I7CH6ieCiIhtklZS/STR68DNETE0VbQEuBuYBqyJiLXH8PoTqlLZQ+0ZrokP\ngvqvDZXK5ASRmeWhqRCIiEeAR9LyS8BVddqtAFbUqG8GLm5+mGZmNhF8xbCZWcYcAmZmGXMImJll\nzCFgZpYxh4CZWcYcAmZmGXMImJllzCFgZpYxh0CnmUrt20n0dLd6ZGbWgY7lthHWCoeAvpHlSl9l\nskdiZscBHwmYmWXMIWBmljGHgJlZxhwCZmYZcwiYmWXMIWBmljGHgJlZxhwCZmYZcwiYmWVs1BCQ\ndLKkxyU9KWmrpOWpPl3Sekk7JK2TdGahzzJJuyRtlzSvUJ8raYuknZJun5hNMjOzRo0aAhHxc+A3\nI+JS4BJgvqReYCnwUERcAGwAlgFImgMsAi4E5gN3SFJa3Z3ATRExG5gt6erx3iAzM2tcQ9NBEfFq\nWjyZ6v2GAlgADKT6ALAwLV8L3BcRByNiN7AL6JXUDZweEZtSu3sKfczMrAUaCgFJUyQ9CQwCD6Y/\n5F0RUQGIiEFgRmo+E9hb6L4/1WYC+wr1falmZmYt0tBdRCPiDeBSSWcAD0i6iOrRwJuajefA+vr6\nDi+XSiVKpdJ4rt7MrOOVy2XK5fIxraOpW0lHxI8llYFrgIqkroiopKmeF1Kz/cDZhW49qVavXlMx\nBMzMbKThb5D7+/ubXkcjnw76haFP/kg6BXg/sB1YDdyQml0PrErLq4HFkk6SdC5wPrAxTRkdkNSb\nThRfV+hjZmYt0MiRwC8BA5KmUA2Nr0XEGkmPASsl3QjsofqJICJim6SVwDbgdeDmiBiaKloC3A1M\nA9ZExNpx3RozM2vKqCEQEVuBuTXqLwFX1emzAlhRo74ZuLj5YZqZ2UTwFcNmZhlzCJiZZcwhYGaW\nMYeAmVnGHAJmZhlzCJiZZcwhYGaWMYeAmVnGHAJmZhlzCJiZZcwhYGaWMYeAmVnGHAJmZhlzCJiZ\nZcwhYGaWMYeAmVnGHAJmZhlzCJiZZcwhYGaWsVFDQFKPpA2SviNpq6RbUn26pPWSdkhaJ+nMQp9l\nknZJ2i5pXqE+V9IWSTsl3T4xm2RmZo1q5EjgIPAHEXER8G5giaRfBZYCD0XEBcAGYBmApDnAIuBC\nYD5whySldd0J3BQ
Rs4HZkq4e160xM7OmjBoCETEYEU+l5VeA7UAPsAAYSM0GgIVp+Vrgvog4GBG7\ngV1Ar6Ru4PSI2JTa3VPoY2ZmLdDUOQFJs4BLgMeAroioQDUogBmp2Uxgb6Hb/lSbCewr1PelmpmZ\ntcgJjTaUdBpwP3BrRLwiKYY1Gf7zMenr6zu8XCqVKJVK47l6M7OOVy6XKZfLx7SOhkJA0glUA+Ar\nEbEqlSuSuiKikqZ6Xkj1/cDZhe49qVavXlMxBMzMbKThb5D7+/ubXkej00F/CWyLiM8XaquBG9Ly\n9cCqQn2xpJMknQucD2xMU0YHJPWmE8XXFfo07IP/5oO8fdbbR3z1nNfDo48+2uzqzMyyNuqRgKQr\ngI8AWyU9SXXa51PAbcBKSTcCe6h+IoiI2CZpJbANeB24OSKGpoqWAHcD04A1EbG22QE/8vAjvLzw\nZTj1zfWTv3kyW7Zs4fLLL292lWZm2Ro1BCLiW8DUOr++qk6fFcCKGvXNwMXNDLCms4DT3lyacoqv\nezMza5b/cpqZZcwhYGaWMYeAmVnGHAJmZhlzCJiZZazhK4YnW7lc5qc//emI+qFDh1owGjOz41Pb\nhsDVVy/klFPePaL+6k9ebcFozMyOT20bAqecciEHDvy/kb84cQrjfJsiM7Ns+ZyAmVnGHAJmZhlz\nCJiZZcwhYGaWMYeAmVnGHAJmZhlzCJiZZcwhYGaWMYeAmVnGHAJmZhlzCJiZZWzUEJD0ZUkVSVsK\ntemS1kvaIWmdpDMLv1smaZek7ZLmFepzJW2RtFPS7eO/KWZm1qxGjgTuAq4eVlsKPBQRFwAbgGUA\nkuYAi4ALgfnAHZKU+twJ3BQRs4HZkoav08zMJtmoIRARfwf8cFh5ATCQlgeAhWn5WuC+iDgYEbuB\nXUCvpG7g9IjYlNrdU+hjbaa7exaSRnxNPXlqzbokunu6Wz1sMxuDsd5KekZEVAAiYlDSjFSfCfx9\nod3+VDsI7CvU96W6TYDu7llUKntq/q6r6xwGB3cftX+178jbdb/xmqCvTp++SnODNLO2MF7PExj3\nG/z/7Gd7OfIXp5S+rBH1/ohXf6eadTPrPOVymXK5fEzrGGsIVCR1RUQlTfW8kOr7gbML7XpSrV69\nrmnTzubnP+8b4/DMzI5/pVKJUql0+Of+/v6m19HoR0SVvoasBm5Iy9cDqwr1xZJOknQucD6wMSIG\ngQOSetOJ4usKfczMrEUa+YjoV4FHqX6i53lJHwU+Dbxf0g7gyvQzEbENWAlsA9YAN0fE0LzEEuDL\nwE5gV0SsHe+NsQZMxSd2zeywUaeDIuLDdX51VZ32K4AVNeqbgYubGp2Nv0PUPLnrE7tmefIVw2Zm\nGXMImJllzCFgZpYxh4CZWcYcAmZmGXMImJllzCFgZpYxh4CZWcYcAmZmGXMI2KTp7un2LSvM2sx4\n3UrabFSV/RXfssKszfhIwMwsYw4BG3f1Hk9pZu3H00E27uo/2cxBYNZufCRgZpYxh4CZWcYcAmZm\nGXMImJllbNJDQNI1kp6RtFPSJyf79c3M7IhJDQFJU4A/B64GLgI+JOlXJ3MMk6FcLrd6CGPWDmOv\n9xHTqSdPHfWK47POqn1VcqP9W60d9v+x8Pg7z2R/RLQX2BURewAk3QcsAJ6Z5HFMqE7+H6kdxl7v\nI6ZvvKZRrzj+4Q8rNfs22r+7e1Z6/Tfr6jqHwcHdRx/4OCiXy5RKpQl/nYni8XeeyZ4OmgnsLfy8\nL9XM2sKRAHrzV+XFPXWPMNrpSMKsWW17sdjPfvYMZ5zxwRH1H78anPaN05hy4pvz67XvvsaJJ544\nWcOz3Byi5lEENHYkMeWkKbzx2hs1+3fN7GJw3yAAn/3s7fT39zfcv9i33mvD5BzJDL3+8PFP1lGU\njY0iah86T8iLSe8C+iLimvTzUiAi4rZh7SZvUGZmx5GIaOrS/MkOganADuBK4HvAR
uBDEbF90gZh\nZmaHTep0UEQckvRxYD3V8xFfdgCYmbXOpB4JmJlZe2mrK4Y7/UIySbslPS3pSUkbWz2e0Uj6sqSK\npC2F2nRJ6yXtkLRO0pmtHOPR1Bn/ckn7JD2Rvq5p5RjrkdQjaYOk70jaKumWVO+I/V9j/J9I9U7Z\n/ydLejz9W90qaXmqd8r+rzf+pvd/2xwJpAvJdlI9X/BdYBOwOCI65hoCSc8Cl0XED1s9lkZIeg/w\nCnBPRPx6qt0G/CAiPpOCeHpELG3lOOupM/7lwMsR8b9aOrhRSOoGuiPiKUmnAZupXjPzUTpg/x9l\n/L9NB+x/AEmnRsSr6Vzlt4BbgH9NB+x/qDv++TS5/9vpSODwhWQR8TowdCFZJxHttU+PKiL+Dhge\nWAuAgbQ8ACyc1EE1oc74oQMeXBARgxHxVFp+BdgO9NAh+7/O+Ieu+Wn7/Q8QEa+mxZOpnh8NOmT/\nQ93xQ5P7v53+YB0PF5IF8KCkTZL+XasHM0YzIqIC1X/owIwWj2csPi7pKUlfatfD+SJJs4BLgMeA\nrk7b/4XxP55KHbH/JU2R9CQwCDwYEZvooP1fZ/zQ5P5vpxA4HlwREXOBDwBL0nRFp2uP+cLG3QGc\nFxGXUP3H0dbTEmkq5X7g1vSOevj+buv9X2P8HbP/I+KNiLiU6hFYr6SL6KD9X2P8cxjD/m+nENgP\n/HLh555U6xgR8b30/fvAA1SnuDpNRVIXHJ73faHF42lKRHw/jpzo+gvgN1o5nqORdALVP6BfiYhV\nqdwx+7/W+Dtp/w+JiB8DZeAaOmj/DymOfyz7v51CYBNwvqRzJJ0ELAZWt3hMDZN0anpXhKS3APOA\nb7d2VA0Rb55DXA3ckJavB1YN79Bm3jT+9A93yL+ivf8b/CWwLSI+X6h10v4fMf5O2f+SfmFoqkTS\nKcD7qZ7X6Ij9X2f8z4xl/7fNp4Og+hFR4PMcuZDs0y0eUsMknUv13X9QPUlzb7uPX9JXgRLwNqAC\nLAf+Gvg6cDawB1gUET9q1RiPps74f5Pq/PQbwG7gPwzN8bYTSVcAfwts5cid6j5F9Sr6lbT5/j/K\n+D9MZ+z/i6me+J2Svr4WEX8i6Sw6Y//XG/89NLn/2yoEzMxscrXTdJCZmU0yh4CZWcYcAmZmGXMI\nmJllzCFgZpYxh4CZWcYcApYVSe+T9O5xbLdc0h/UqJ8jaWsT42qqfTMkPZc+/242gkPAjjvp1rr1\nlIDLG1hNo+2OptmLcCbqoh1fDGR1OQRsQqV3uNskfVHStyWtlXRy+t3Dkuam5bdJei4tXy/pgfRw\nj2clLZH0++khGY9KemuN17lL0p2SHgNuSw8HeUDVh/w8KunXJJ0D/B7wH9O6rpD0LyQ9Jmlzer1f\nbLRd4eUvSa+xQ9Lv1hjbFEmfUfUhIE8d5Q6zJ0r6q7S/VkqalvpfmcbxdLoz5Imp/pykvjSmpyXN\nTvWzVH0gylZJf0G6rUa6tck3VH0QyRZJvzWW/6Z2fHEI2GQ4H/iziPg14ADVB3fUUnzHehHVe7n3\nAn8CvJLu0PoYcF2d/jMj4l0R8YdAP/BERPwT4L9QvcnZHuALwOciYm5EfAv4ZupzGfA14D832q7w\nuhdz5Mjhj4bdvwXgJuBHEfHOtD3/PgXNcBcAfx4Rc4CXgZtTYN4F/FbalhOBjxX6vJDG9AXgD1Nt\neRrvxVRvZTJ0Y8ZrgP0RcWl6CM/aOvvRMuIQsMnwXEQMzXdvBmY10OfhiHg1Il4EfgR8I9W3HqX/\n1wvL7wG+AhARDwNnDd3gb5iz07vmLVT/iF5UZ91Ha7cqIl6LiB8AGxh599h5wHWq3vv9ceAs4B01\nXuP5iHgsLf9V2oYLgGcj4h9TfQB4b6HPA+l7cb++N/UnItZw5ME7W4H3S1oh6T0R8XKdbbWMOARs\nMvy8sHyI6g32AA5y5P/BaUfpE4Wf3yj0H+4nw
/o04s+AP03vjH+vxjgaaVd8LdV4bQGfSO/AL42I\nX4mIh2q8Rr172R/tSVFD+6W4X4cTQETsAuZSDYM/lvRfj7Jey4RDwCZDvT9iu4F/mpbHe376m8C/\nBZBUAl5MDz15GTij0O4Mqs+0huqtg4c02g5ggaSTJL0NeB/V26IXraM6tXNCGs870u1/hztH0jvT\n8ofTNuxI9fNS/Xeo3jv+aP4W+Eh6rfnAW9PyLwE/jYivAv+DaiBY5hwCNhnqvSv/LPAxSZupTpE0\n2/9obfqByyQ9Dfx3jvzh/hvgXw6d8AX6gPslbQK+X+jfaDuALVT/MD8K/Lf0WMKiLwHbgCfSx0C/\nQO137c9QfSLdNqp/uL8QET+n+vD5+9O2HAL+d51tLm77e9NrLQSeT/WLgY1pWuqPgD+u098y4ltJ\nm5llzEcCZmYZcwiYmWXMIWBmljGHgJlZxhwCZmYZcwiYmWXMIWBmljGHgJlZxv4/HyYHQPsNkegA\nAAAASUVORK5CYII=\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "_=hist(([x for x,y in nrots],[y for x,y in nrots]),bins=20,histtype='bar')\n", "xlabel('num rotatable bonds')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "and a histogram of the similarities we used to construct the set" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from rdkit import DataStructs\n", "from rdkit.Chem import rdMolDescriptors\n", "sims = [DataStructs.TanimotoSimilarity(rdMolDescriptors.GetMorganFingerprint(x[1],0),rdMolDescriptors.GetMorganFingerprint(x[3],0)) for x in rows]" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAYwAAAEPCAYAAABRHfM8AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAGTVJREFUeJzt3X2QZXV95/H3RxBRgiO6y0wyoODD8OCCODGj68PaioIY\nA9RWyYIugpBEA6vuZrXCuLthZitVI5qsaLlaa8nK4EMo1KTAFQEJNoIBQQQGGYRJlIcZnSaCssQ1\nhIfv/nHPwKXpnvn1vd3Tt2fer6quOfd3f79zvvfQ9Oeec+79nVQVkiRty9PmuwBJ0sJgYEiSmhgY\nkqQmBoYkqYmBIUlqYmBIkppsMzCSnJNkIsm6vraPJrktyU1Jvpbk2X3PrUyyoXv+iL725UnWJbkj\nydl97bslOb8bc02S58/mC5QkzY6WI4zPA0dOarsMeGlVHQZsAFYCJDkYOA44CDgK+HSSdGM+A5xa\nVcuAZUm2rPNU4P6qeglwNvDRIV6PJGmObDMwqupq4BeT2i6vqse6h9cC+3TLRwPnV9UjVXUnvTBZ\nkWQJsGdVXd/1Ow84tls+BljbLX8VOHzA1yJJmkOzcQ3jFODibnkpcE/fc5u6tqXAxr72jV3bk8ZU\n1aPAL5M8dxbqkiTNoqECI8l/AR6uqr+cpXoAsu0ukqTtbddBByY5GXgr8Ma+5k3Avn2P9+napmvv\nH/PTJLsAz66q+6fZphNfSdIAqmroN+OtRxih751/krcAHwKOrqqH+vpdBBzfffJpf+DFwHVVtRl4\nIMmK7iL4u4AL+8ac1C2/Hbhia4VU1YL9OfPMM+e9Buuf/zp2ttqtf/5/Zss2jzCSfBkYA56X5G7g\nTODDwG7At7oPQV1bVadV1fokFwDrgYeB0+qJak8HzgV2By6uqku69nOALyTZANwHHD9Lr02SNIu2\nGRhV9Y4pmj+/lf5rgDVTtN8AHDJF+0P0PoorSRphftN7OxobG5vvEoZi/fNnIdcO1r+jyGye35pr\nSWoh1StJoyAJtR0vekuSdnIGhiSpiYEhSWpiYEiSmhgYkqQmBoYkqYmBIUlqYmBIkpoYGJKkJgaG\nJKmJgSFJamJgSJKaGBiSpCYGhiSpiYEhSWpiYEiSmhgYkqQm27ynt2bHN795GZ/4xOcGGnvssUfx\n3ve+e5YrkqSZ8Rat28mJJ76HL37xMeDNMxy5nkMP/TY333zlXJQlaScwW7do9Qhju3oFcNwMx1wJ\nfHsOapGkmfEahiSpiYEhSWpiYEiSmhgYkqQmBoYkqYmBIUlqss3ASHJOkokk6/ra9kpyWZLbk1ya\nZFHfcyuTbEhyW5Ij+tqXJ1mX5I4kZ/e175bk/G7MNUmeP5svUJI0O1qOMD4PHDmp7Qzg8qo6ALgC\nWAmQ5GB6XzQ4CDgK+HSSLV8W+QxwalUtA5Yl2bLOU4H7q+olwNnAR4d4PZKkObLNwKiqq4FfTGo+\nBljbLa8Fju2WjwbOr6pHqupOYAOwIskSYM+qur7rd17fmP51fRU4fIDXIUmaY4New9i7qiYAqmoz\nsHfXvhS4p6/fpq5tKbCxr31j1/akMVX1KPDLJM8dsC5J0hyZrYvesznB09DznUiSZt+gc0lNJFlc\nVRPd6aZ7u/ZNwL59/fbp2qZr7x/z0yS7AM+uqvun2/CqVaseXx4bG2NsbGzAlyBJO6bx8XHGx8dn\nfb2tgRGe/M7/IuBk4CzgJODCvvYvJfk4vVNNLwauq6pK8kCSFcD1wLuAT/aNOQn4HvB2ehfRp9Uf\nGJKkp5r8Znr16tWzst5tBkaSLwNjwPOS3A2cCXwE+EqSU4C76KZgrar1SS4A1gMPA6f1zUd+OnAu\nsDtwcVVd0rWfA3whyQbgPuD4WXllkqRZtc3AqKp3TPPUm6bpvwZYM0X7DcAhU7Q/xMzn/JYkbWd+\n01uS1MTAkCQ1MTAkSU0MDElSEwNDktTEwJAkNTEwJElNDAxJU
hMDQ5LUxMCQJDUxMCRJTQwMSVIT\nA0OS1MTAkCQ1MTAkSU0MDElSEwNDktTEwJAkNTEwJElNDAxJUhMDQ5LUxMCQJDUxMCRJTQwMSVIT\nA0OS1MTAkCQ1MTAkSU0MDElSEwNDktRkqMBI8p+S/DDJuiRfSrJbkr2SXJbk9iSXJlnU139lkg1J\nbktyRF/78m4ddyQ5e5iaJElzY+DASPJbwPuA5VV1KLArcAJwBnB5VR0AXAGs7PofDBwHHAQcBXw6\nSbrVfQY4taqWAcuSHDloXZKkuTHsKaldgD2S7Ao8E9gEHAOs7Z5fCxzbLR8NnF9Vj1TVncAGYEWS\nJcCeVXV91++8vjGSpBExcGBU1U+BvwDuphcUD1TV5cDiqpro+mwG9u6GLAXu6VvFpq5tKbCxr31j\n1yZJGiG7DjowyXPoHU28AHgA+EqSdwI1qevkx0NZtWrV48tjY2OMjY3N5uolacEbHx9nfHx81tc7\ncGAAbwJ+XFX3AyT5a+DVwESSxVU10Z1uurfrvwnYt2/8Pl3bdO1T6g8MSdJTTX4zvXr16llZ7zDX\nMO4GXpVk9+7i9eHAeuAi4OSuz0nAhd3yRcDx3Sep9gdeDFzXnbZ6IMmKbj3v6hsjSRoRAx9hVNV1\nSb4K3Ag83P37WWBP4IIkpwB30ftkFFW1PskF9ELlYeC0qtpyuup04Fxgd+Diqrpk0LokSXNjmFNS\nVNVqYPKxzv30TldN1X8NsGaK9huAQ4apRZI0t/ymtySpiYEhSWpiYEiSmhgYkqQmBoYkqYmBIUlq\nYmBIkpoYGJKkJgaGJKmJgSFJamJgSJKaGBiSpCYGhiSpiYEhSWpiYEiSmhgYkqQmBoYkqYmBIUlq\nYmBIkpoYGJKkJgaGJKmJgSFJamJgSJKaGBiSpCYGhiSpiYEhSWpiYEiSmhgYkqQmQwVGkkVJvpLk\ntiS3Jnllkr2SXJbk9iSXJlnU139lkg1d/yP62pcnWZfkjiRnD1OTJGluDHuE8Qng4qo6CHgZ8CPg\nDODyqjoAuAJYCZDkYOA44CDgKODTSdKt5zPAqVW1DFiW5Mgh65IkzbKBAyPJs4HXVdXnAarqkap6\nADgGWNt1Wwsc2y0fDZzf9bsT2ACsSLIE2LOqru/6ndc3RpI0IoY5wtgf+HmSzyf5QZLPJnkWsLiq\nJgCqajOwd9d/KXBP3/hNXdtSYGNf+8auTZI0QnYdcuxy4PSq+n6Sj9M7HVWT+k1+PJRVq1Y9vjw2\nNsbY2Nhsrl6SFrzx8XHGx8dnfb3DBMZG4J6q+n73+Gv0AmMiyeKqmuhON93bPb8J2Ldv/D5d23Tt\nU+oPDEnSU01+M7169epZWe/Ap6S60073JFnWNR0O3ApcBJzctZ0EXNgtXwQcn2S3JPsDLwau605b\nPZBkRXcR/F19YyRJI2KYIwyA9wNfSvJ04MfAu4FdgAuSnALcRe+TUVTV+iQXAOuBh4HTqmrL6arT\ngXOB3el96uqSIeuSJM2yoQKjqm4GfmeKp940Tf81wJop2m8ADhmmFknS3PKb3pKkJgaGJKmJgSFJ\namJgSJKaGBiSpCYGhiSpiYEhSWpiYEiSmhgYkqQmBoYkqYmBIUlqYmBIkpoYGJKkJgaGJKmJgSFJ\namJgSJKaGBiSpCYGhiSpiYGxAKxffyNJBvpZsmS/+S5f0g5iqHt6a/t45JEHgRpo7MREZrcYSTst\njzAkSU0MDElSEwNDktTEwJAkNTEwJElNDAxJUhMDQ5LUxMCQJDUZOjCSPC3JD5Jc1D3eK8llSW5P\ncmmSRX19VybZkOS2JEf0tS9Psi7JHUnOHrYmSdLsm40jjA8A6/senwFcXlUHAFcAKwGSHAwcBxwE\nHAV8OsmWryF/Bji1qpYBy5IcOQt1SZJm0VCBkWQf4K3A5/qajwHWdstrgWO75aOB86vqkaq6E9gA\nrEiyBNizqq7v+p3XN0aSN
CKGPcL4OPAhnjzR0eKqmgCoqs3A3l37UuCevn6buralwMa+9o1dmyRp\nhAw8+WCS3wUmquqmJGNb6TrYrHnTWLVq1ePLY2NjjI1tbdOStPMZHx9nfHx81tc7zGy1rwGOTvJW\n4JnAnkm+AGxOsriqJrrTTfd2/TcB+/aN36drm659Sv2BIUl6qslvplevXj0r6x34lFRVfbiqnl9V\nLwSOB66oqhOBrwMnd91OAi7sli8Cjk+yW5L9gRcD13WnrR5IsqK7CP6uvjGSpBExF/fD+AhwQZJT\ngLvofTKKqlqf5AJ6n6h6GDitqracrjodOBfYHbi4qi6Zg7okSUOYlcCoqiuBK7vl+4E3TdNvDbBm\nivYbgENmoxZJ0tzwm96SpCYGhiSpiYEhSWpiYOzwnkGSGf8sWbLffBcuacTMxaekNFIeYpDvTk5M\nZNudJO1UPMKQJDUxMCRJTQwMSVITA0OS1MTAkCQ1MTAkSU0MDElSEwNDktTEwJAkNTEwJElNDAxJ\nUhMDQ5LUxMCQJDUxMCRJTQwMSVITA0OS1MTAkCQ1MTAkSU0MDElSEwNDktTEwJAkNTEwJElNDAxJ\nUpOBAyPJPkmuSHJrkluSvL9r3yvJZUluT3JpkkV9Y1Ym2ZDktiRH9LUvT7IuyR1Jzh7uJUmS5sIw\nRxiPAH9cVS8F/jVwepIDgTOAy6vqAOAKYCVAkoOB44CDgKOATydJt67PAKdW1TJgWZIjh6hLkjQH\nBg6MqtpcVTd1y/8I3AbsAxwDrO26rQWO7ZaPBs6vqkeq6k5gA7AiyRJgz6q6vut3Xt8YSdKImJVr\nGEn2Aw4DrgUWV9UE9EIF2LvrthS4p2/Ypq5tKbCxr31j1yZJGiFDB0aS3wC+CnygO9KoSV0mP5Yk\nLUC7DjM4ya70wuILVXVh1zyRZHFVTXSnm+7t2jcB+/YN36drm659SqtWrXp8eWxsjLGxsWFegiTt\ncMbHxxkfH5/19aZq8AOAJOcBP6+qP+5rOwu4v6rOSvInwF5VdUZ30ftLwCvpnXL6FvCSqqok1wLv\nB64HvgF8sqoumWJ7NUy98+nEE9/DF7+4HHjPDEdeCYwx+IFaBhy7O/DQjEctXvwCNm++c4DtSZor\nSaiqbLvn1g18hJHkNcA7gVuS3Ejvr9KHgbOAC5KcAtxF75NRVNX6JBcA64GHgdP6/vqfDpxL76/U\nxVOFhba3hxgkaCYmhv6dlDSiBg6MqvousMs0T79pmjFrgDVTtN8AHDJoLZKkuec3vSVJTQwMSVIT\nA0OS1MTAkCQ1MTAkSU0MDElSEwNDs+wZJBnoZ8mS/ea7eElbMdTUINJTDfaFP/BLf9Ko8whDktTE\nwJAkNTEwJElNDAxJUhMDQ5LUxMCQJDUxMCRJTQwMSVITA0OS1MTAkCQ1MTA0Qgabh8o5qKTtw7mk\nNEIGm4fKOaik7cMjDElSEwNDktTEwJAkNTEwJElNDAztAPx0lbQ9GBjaAWz5dNXMfiYmNns7WWkG\n/FitdmLeTlaaCY8wJElNRiYwkrwlyY+S3JHkT+a7HmnrvG6inc9IBEaSpwGfAo4EXgqckOTA+a1q\nLtw+3wUMaXy+CxjS+Cyua9DrJncNtLXx8fGhK55P1r9jGInAAFYAG6rqrqp6GDgfOGaea5oDd8x3\nAUMan+8ChjQ+3wUM7G1vO3ZBX6Bf6H9wF3r9s2VULnovBe7pe7yRXohIO5jeqazBDHqBfveBtvm0\npz2Lxx77fzMet3jxC9i8+c4Zj9PoG5XA2OHtttvT2WWX69ljj9+b0bhHH72PX/1qjorSPBj0k1nD\nfCprsG0+9lgGGjddQK1evXqbYwcNqZ0l3JYs2W/g05qzIVWDvWuZ1SKSVwGrquot3eMzgKqqsyb1\nm/9iJWkBqqqhPws+KoGxC70rwocDPwOuA06oqtvmtTBJ0uNG4pRUVT2a5D8Al9G7EH+OYSF
Jo2Uk\njjAkSaNvVD5W2/TFvSRjSW5M8sMk3+5rvzPJzd1z122/qh/f/lZrT/LBrrYfJLklySNJntMydnsY\nsv553fddDduq/9lJLkpyU1f/ya1jt4ch618I+/85Sf6qq/PaJAe3jp1rQ9Y+Cvv+nCQTSdZtpc8n\nk2zofn8O62uf+b6vqnn/oRdcfwe8AHg6cBNw4KQ+i4BbgaXd43/R99yPgb1GtfZJ/d8GXD7I2FGr\nf773/Qx+d1YCa7b83gD30TsduyD2/3T1L6D9/1Hgv3XLB4zK7/8wtY/Cvu9qeC1wGLBumuePAr7R\nLb8SuHaYfT8qRxgtX9x7B/C1qtoEUFU/73suzN/R0ky/dHgC8JcDjp0Lw9QP87vvoa3+AvbslvcE\n7quqRxrHzrVh6oeFsf8PBq4AqKrbgf2S/MvGsXNpmNph/vc9VXU18IutdDkGOK/r+z1gUZLFDLjv\nRyUwpvri3tJJfZYBz03y7STXJzmx77kCvtW1/8Ec1zpZS+0AJHkm8BbgazMdO4eGqR/md99DW/2f\nAg5O8lPgZuADMxg714apHxbG/r8Z+LcASVYAzwf2aRw7l4apHeZ/37eY7jUOtO9H4lNSjXYFlgNv\nBPYArklyTVX9HfCaqvpZl/zfSnJbl7yj5veAq6vql/NdyICmqn8h7PsjgRur6o1JXkSvzkPnu6gZ\nmLL+qvpHFsb+/wjwiSQ/AG4BbgQend+Smm2t9oWw7ycb6rsYo3KEsYlecm+xT9fWbyNwaVX9U1Xd\nB3wHeBlAVf2s+/cfgL9m+04r0lL7Fsfz5NM5Mxk7V4apf773PbTV/27grwCq6u+BnwAHNo6da8PU\nvyD2f1U9WFWnVNXyqjoJ2Jve+f/53v/D1D4K+77FJmDfvsdbXuNg+34+L9j0XZjZhScuwOxG7wLM\nQZP6HAh8q+v7LHppf3C3/Btdnz2A7wJHjFLtXb9F9C5WPnOmY0e4/nnd9zP43fmfwJnd8mJ6h+LP\nXSj7fyv1L5T9vwh4erf8B8C5M/ndG9Ha533f99W4H3DLNM+9lScuer+KJy56D7Tvt/uL28qLfgu9\nb3tvAM7o2t4D/GFfnw/S+6TUOuB9Xdv+3Yu9kV6InDGitZ8EfLll7EKpfxT2fUv9wG8Cl3a/N+vo\nzSKwYPb/dPUvoP3/qu7524CvAotGZf8PWvsI7fsvAz+lN2HY3fSORif/v/speuFwM7B8mH3vF/ck\nSU1G5RqGJGnEGRiSpCYGhiSpiYEhSWpiYEiSmhgYkqQmBoa2iySPJTmv7/EuSf4hyUXd45OS3NtN\noX5jknO79nOT/Lhr/356t/MlyV5JLktye5JLkyyaQS3/J8mzZ/klNm0zyaIkf9TX/vokX59mzGeT\nHDjHdY36VBYaIQaGtpdfAf8qyTO6x2/myZOfAZxfvSkYXl5VJ3dtBXywqpbTm+b7f3XtZ9CbavoA\nerOJrmwtpKreVlX/d8DXMZC+be4FnDb56WnG/GFV/WiO63rt5Lb0bpksPYWBoe3pYuB3u+XJ06TD\ntidG+w7wom75GGBtt7wWOHZy5yRLklzZHZ2sS/Karv0nSZ6b5AVJbkvy+e5I5YtJDk9ydff4FV3/\n1+eJG0jdkGSPSdv5YHq3GCbJx5P8Tbf8hiRf6N8msAZ4Ybeus7pV7JnkK10tX+hb77eTLO+WH0zy\nZ91NcP62b4rt/jrOTHJe9/ztSX6/a98jyeXdEdrNSY7uG/Ng32v8TpILgVuTPKs7Krqx23dv38Z/\nG+0EDAxtL0Vvzv0TuqOMQ4HvTerz77o/pD9IctIU6zia3jQMAIuragKgqjbTmxRusncAl3RHJy+j\nN5XDllq2eBHwse5I5UB60268FvgQ8OGuz38GTuvW8zrg15O2c1XXDvDbwB7du/TX0Qu5/m2eAfx9\ndyS15S5nhwHvpzc32ouSvHqK17IH8LdVdVi3vemm0z4
EGANeDfxpkiXAPwHHVtUr6M32/Bd9/fv3\nxcvpTblzIL1pIzZ1R3uHApdMsz3tRAwMbTdV9UN6E6WdAHyDpx5RbDkltbyq1va1/3k3vfTvA6ds\nWd3k1U+xyeuBdyf5U+DQqvpV196/3Z9U1fpu+Vbgb7rlW7paoTex3MeTvI/eHdYem7SdG4DfTrIn\nvTl9rgF+h15gXDXFNie7rqp+Vr15em7q226/h6rq4r7tTdUH4MKq+ufqzeh8Bb0ZVAN8JMnNwOXA\nbyWZKmCvq6q7u+VbgDcnWZPktVX14Fbq107CwND2dhHwMZ56OmprPtiFyJFVdVvXNpHencPo3kXf\nO3lQVV0F/Bt60zafm+TfT7Huh/qWH+t7/Bjd/WKq6izgVOCZwHeTLJu0nUeAO4GT6YXLVcAbgBc1\nXoPor+FRpr5PzcMNfeDJwZnu8TuB5wEvr6qX09tXu08xdkugUlUb6N1/5hbgz5L81228Bu0EDAxt\nL1veYf9vYHVV3Trk+i6i9wcaejPpXviUDSbPB+6tqnOAz9H7AzhdXdNK8sKqurWqPkrvqGWqTy5d\nRW825e8AVwPvpTeT6WQP8sTtVmei9cY3xyTZLcnzgNfTq3cRvf3wWJI30JvSeqvrTfKbwK+r6sv0\nAn6qfaedzEK6454WtgKo3j3ZPzXTcVM4C7ggySnAXcBxU/QZAz6U5GF6f6i33Na3f53TLff7j90f\n2kfpnbb65hR9rqJ3zeOaqvp1kl/zxPWLx9ddVfcn+W6Sdd16Lp60npnWNtk6YJzeEcV/r6rNSb4E\nfL07JfV9elN1b2u9hwAfS/IY8M/AH03TTzsRpzeXdhBJzgQerKr/Md+1aMfkKSlJUhOPMCRJTTzC\nkCQ1MTAkSU0MDElSEwNDktTEwJAkNTEwJElN/j9a/zsR/7adNQAAAABJRU5ErkJggg==\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "_=hist(sims,bins=20)\n", "xlabel('MFP0 sims within pairs')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "compare to MFP2 similarity (more on this in a later post)" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": false }, "outputs": [], "source": [ "sims2 = [DataStructs.TanimotoSimilarity(rdMolDescriptors.GetMorganFingerprint(x[1],2),rdMolDescriptors.GetMorganFingerprint(x[3],2)) for x in rows]" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAYsAAAEPCAYAAACzwehFAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzsnXd4FcXXx7+THiB0CL1Kl6ogikoEEQQrIgJiF2zYEBVU\nBKzYFZWiUvSniB0VBUUgKEVB6YJUpfcaIBCSnPePc/fdevveEjif55knd3dnZ+fu3czZmdMUEUEQ\nBEEQfJEQ6w4IgiAI8Y8IC0EQBMEvIiwEQRAEv4iwEARBEPwiwkIQBEHwiwgLQRAEwS8RFRZKqfFK\nqd1KqRVejvdRSi33lHlKqaaR7I8gCIIQGpGeWUwE0NnH8U0ALiai5gCeA/B+hPsjCIIghEBSJBsn\nonlKqZo+jv9u2PwdQNVI9kcQBEEIjXjSWdwJYHqsOyEIgiDYiejMIlCUUpcAuA3AhbHuiyAIgmAn\n5sJCKdUMwHsAuhDRQR/1JIiVIAhCCBCRCreNaCxDKU+xH1CqBoCvANxERBv9NUREcV+GDRsW8z5I\nP6WfRbWP0k/3i1tEdGahlJoMIAtAOaXUFgDDAKQAICJ6D8BQAGUBjFZKKQCniKhNJPskCIIgBE+k\nraH6+DneD0C/SPZBEARBCJ94soY6LcjKyop1FwJC+ukuRaGfRaGPgPQzXlFurmlFEqUUFZW+CoIg\nxAtKKVARUXALgiAIRRwRFoIgCIJfRFgIgiAIfhFhIQiCIPhFhIUgCILgFxEWgiAIgl9EWAiCIAh+\nEWEhCIIg+EWEhSAIguAXERaCIAiCX0RYCIIgCH4RYSEIgiD4RYSFIAiC4BcRFoIgCIJfRFgIgiAI\nfhFhIQiCIPhFhIUgCILgFxEWgiAIgl9EWAiCIAh+EWEhCIIg+EWEhSAIguAXERaCIAiCX0RYCIIg\nCH4RYSEIgiD4RYSFIAiC4BcRFoIgCIJfIioslFLjlVK7lVIrfNQZpZRar5RappRqEcn+CIIgCKER\n6ZnFRACdvR1USl0OoC4R1QNwF4CxEe6PIAiCEAIRFRZENA/AQR9VrgbwkafuHwBKKaUyI9knQRCE\ncJg9G1CKS5kyse5N9Ii1zqIqgK2G7e2efYIgCHFJx47650OHgIoVY9eXaBJrYSEIglBkOPdc+769\ne6Pfj1iQFOPrbwdQ3bBdzbPPkeHDh///56ysLGRlZUWqX4IgCDbuvx+49dZY98I32dnZyM7Odr1d\nRUSuN2q6gFK1AHxPRE0djnUFcB8RdVNKtQXwJhG19dIORbqvgiAI/lDKvB3vw5JSCkSk/Nf0TURn\nFkqpyQCyAJRTSm0BMAxACgAioveI6EelVFel1AYAxwDcFsn+CIIghIsmHGbPBjp0iG1foknEZxZu\nITMLQRCE4HFrZiEKbkEQBMEvIiwEQRAEv4iwEARBEPwiwkIQBEHwiwgLQRAEwS+xdsoTBEEoUhw+\nDLz6Kof66NcPaNYs1j2KDmI6KwhCxNm5EyheHChZMtY9CZ8LLwTmz+fPJUsCK1YANWvGtk++ENNZ\nQRDinsJC4IYbgCpVgPLlgY8+inWPwiMnRxcUAHDkCLBgQez6E01EWAiCEDGmTQM+/5w/nzoF3HUX\nUFDg/nXWrweuugpo3x74+mv329coUQKoVk3fVgpo2DBy14snRFgIghAxcnPN23l53oXFxo1Au3Y8\nCxk4MLiYS5dfDnz/PfDrr0DPnrw0FAmUAmrU0LdLlAAyz5AMPCIsBEGIGFdeCZxzjr49ZAiQkuJc\n9+abeUln507gjTeA//0vsGvk5LCg0SgoAFatCr3Pvjh61LzslJMDzJ0bmWvFG2INJQiC6/z3H5eW\nLYHffgPmzeOscsZ8EHl5PMhXrgyULg38+6+9DSe2bAHefRdISgIeegioUAFo3RpYvJiPFy8OtHWM\nXR0+xYvzTGL3bn1fnTqRuVa8IdZQgiC4ytdfA716sY6ienV+E
zeu8wPAvn1AVhbw999ARgYwdSrr\nN954g4+npQELFwItWpjPO3wYaNoU2OrJr9m4MbB0Kb/hv/ACH+/fH2jTJnLfb+5coHdv4PhxYMAA\n4LnnInctN3DLGkqEhSCcpuTm8kD233/A9dcD11wTnes2bWpeBhoyhAdyI8OGAc88o2+3bAn89Rdb\nS23ezMpqq6AAWCfRvr1535o1zkrmTZv42keP8gykU6fQv5ORrCx96al0adaPVK/u85SYUiTyWQiC\nEHl++YUdxDp35rd0jTvvBCZP5s+ffsr1opF/waqTcNJR5Ofbt5UCbrnFd9u1awOpqcDJk7xdqhQv\nY1kpLAQuu0zXZcyaBaxcCdSrF9h38IZVR3HoEC+x9e4dXrtFAVFwC0IRZsAAfmO+/nrgggt4MNOY\nM0f/TBS6ItY6sPvj9dd157tmzYAHHrDXufdeoFYt/pyaGvhSTvXqwJdf8kykTRvgu+9YYFg5cMCs\n9D550h0LKSfT2fr1w2+3KCDLUIJQRDl+nBWuRj7/nAUHwJZI06bpx779lpd3AmXDBm5j7VqekXzz\njXnm4osjR1gJXKsWkJzsXCcnhwfwGjXcX8YhApo04SUqgO/TypU8MwmXFStYSB8+DDz8cPzn5BYP\nbkE4w0lJAdLTzfuM4TQ+/JAHsqwsYPTo4AQFwDOCf/7hgXfWLODllwM/t2RJXvLxJigAFjzt2kVm\nvV8pYOZMjt10ww3Azz+7IygA4Ngxnrns3w8cPOhOm0UBmVkIQhHmyy/ZPyE3l72jx451r+02bXRz\nVAC45x4WOmcyhYVApUrA3r36vgULgPPPj12f/CEzC0EQ0KMHL/kcPequoABYr6A8Q0yxYsBttwV+\n7rRprLuIlHNcrDh+3CwoALbeOhOQmYUgCF5ZuBBYvZojrTZoENg5I0eyySrA/hK//WZ2xivqXHEF\n8MMP/DkzE1i2jGcb8Yr4WQiCEBB79wI33ggsWcI+Ch99ZFeMu0nDhqwU17j3Xva4Dpc5c9gZr1Mn\nZ3PZaHHyJPDBB6zgvvHG+A5PDoiwEATBD3v2AMuXA+PGAV99pe8fPBh48cXIXbdDB7PZLsB6lUmT\n9GWtYHn+eeCpp/hzejrwyCPs2JcknmJ+EWEhCIJXVq3iWcSBA0BCAitmNfr0AT75hGMxzZwJ1K0L\ndOwY+rU2beLlpipVeHvjRjbfXbrUXO/HHzk6bChkZrLwM+LWjOV0RxTcgiA4kpcH3H8/CwrALCgS\nEtjbeM0aoHlztqC69NLgzGI1iIC+fVnYVKumt1G3LicIss4ijh61t3HkCDBxIvDZZ77zXJQpY983\nfXrwfRZCR2YWghAkRDzwBeqg5jYnTvCsoFo1ex8KC/nt/eefzfvPPRe49lrg4ovZWa1lS7MVT5ky\nunAxMmUKz1IyM4HXXuMAgPfeC9xxB/tgWONNffABHwOARx/lXNUA0KoVK7qLFWNnvXvu4YRFe/bo\nM4ZrrmHHPycWLgS6dGHholGnDputessnceQI8Oab7PzXr9+Z42ltxa2ZBYioSBTuqiDElpUriapX\nJwKILryQ6PDh6F5/yxai2rX5+uXKEU2cSNSsGVGNGkRvvkm0bh0fM5ayZYmWL9fbuOwye52UFPu1\nRo601zOW5GTn/d98o7cxfz7Rjz8SHT+u7+vSxXubO3f6/v6PPWa+bsOGRCdOONc9/3y9XrlyRDt2\nBHybTys8Y2f4Y7AbjUSjiLAQok1BAdHJk+Z9WVnmwe3pp6Pbp3vv9T1g//ADUVKSvq0U0dKl+vn3\n3+88SDdpYr/WOef4FhbeBMaDD/r+DjVqOLeVlkZ09Kheb/58oq+/Jjp0SN83d679vDVr7NfYu9de\n76uvgrvXvvjrL6JZs7wLqnjCLWEhOgtBcODbbzn8dHo68OCD+v7Dh831rNuR5PffzVnaAM4ZYeTI\nEWD8eA54l5bGS0daqO/cX
ODtt+3tZmbq0WmNBBIeIzmZo7saadnS9zldu+qfleI2SpUym/Q+8wyH\nAunenZfQ9u/n/XXq8PfSKF1aV6wbKV3avDyVkACcdZb/7xMId9/N2f86duRw7CdOuNNu3OOGxPFV\nAHQB8A+AdQAedzheEsB3AJYBWAngVi/tuCxvBcGZU6eIihc3v5XOnMnHPvqIKCGB95UqxctSGrm5\nRMuWEe3Z436ffv/d/hZfvDjReefp2xUrmpdaCgvt3ys11dzGs88S5eURTZ7MS1o5OXr93buJOnTg\ndm+6iej993nJy3h+RgYvxfXvT3TxxUQvv+z/u5w6RfTaazxLmjnT3k8iovR083XGjtWPTZ9O1KYN\nUbt2PPvwxl9/EV1wAVHTpkT/+5//fgXC8eP2Gcsrr7jTdqRAUViGAltbbQBQE0CyRyA0tNQZAuBF\nz+fyAPYDSHJoy/WbKAhOHDtmHxA++0w/vmQJ0aefEm3erO/bs4eoUSOuW6wYD2gau3cT/fknt2vl\nt9+ILr2U9QiLFnnv0xNPmPtTqhSfe+IE0ahRPOj/+6/v7/Xdd7qgA4g6d+aB2qhDaNmShR4RD+TF\nivH+qlWJWrSw35dSpfzeTq/8+ivR448TffCBWWDs2WMXap98Evp13GT/fvs9GDQo1r3yTVERFm0B\nTDdsD7bOLjz73vF8rg1gnZe23L2DguCDfv30waBBA/O6uRNPPmkeQJo25f0zZugD7llnmRW4u3fz\nm7l2Ttmy3q8zYYJ9kNLeuFeuJJo3j2cIRtavJ/r4Y125fcUV5nOvvZZo0yZ7m/fdR7R2Lb+V+9JX\nKMWCKhSys4kSE/W2Hn9cP9a5s/k6F1xAlJ8f2nUiQenS5v7980+se+SboiIsrgPwnmG7L4BRljol\nAMwGsAPAEQCXe2nL5VsoxDOPPcbLHy1amJd6okVhIdG0afxG609QFBQQValiHkAqVCDq1Im/g3H/\nE0/o582fbx+AV6zgZZpt28yDf2Eh0SOP2Jei0tL0zxddxDONXbuI+vbVFd2Jiazcve0287lXX81v\nyk5K6owMonPPNe8zKs4TEliZHiqPPGJuu359/VhmpvnYiBGhX8dtTpyw36tAlt5iiVvCIh6c5TsD\nWEpEHZRSdQHMVEo1IyKbC8/w4cP//3NWVhaysrKi1kkhenz5pe7gtWcPO5GtXBndPigFdOsWWN1B\ng4AdO8z79u5l72hfpKba9x08CDRuzD4INWtyGzVrsk/Do4+yI9rq1Xp9o3L1t9+AqVOB4cO5vkZB\nAfD005yT4rPPOHIqwL4YBw4AEyaw74Qxy15ODlCxIiuKDx0CypXTlcwAK6LPO8/393MiP5/zaCdY\nTGuMyudLLmH/DoDrxdO/uTXiLMDf59FHo98Xb2RnZyM7O9v9ht2QON4KeBlqhmHbaRlqGoB2hu1Z\nAM51aMtNYSvEmKNH9bVxK6++an5zK148un0LhtxcXo7xtVyj6Qmsy1B//GGv2727efuaa3RdQWoq\n0TPPmJdvrNcePdp7P777zr4vJYXNZlevJmrVynxMKaKpU3lm5zT7+Pnn4O5VXh7RJZfo5zdrRlSn\nDtHll5sV88ePEw0dSnTzzUTff+/O7+QWJ0/a78Po0bHulW9QRJahEqEruFPACu5GljrvAhjm+ZwJ\nYCuAsg5tuX0PhRihKWuTkvR/tNxcXj8/eZLo9dfN/4ylS8e2v744eNA+eGiKbqMO4K+/7AruvDy2\n6NHqNW5s3gaIqlUzb9euTbR9O1v3rF5tH8S//JKVzk7C4umndf2JtbRty0LBqlh+8EFeArOel5jI\nv1cw/PKL/bq7drn3W0QL48tM+/bxpU9xokgIC+4nugBYC2A9gMGefXcB6O/5XBnATwBWeEpvL+24\nfhOF6LNkiX3QmTdPX/OvW5ffKo11wrG4iTT5+fx2buzvjBlEY8YQXXcdzwROnfJ+fm4u6xq
0c60m\no5q3uFZq1tTPdbLaqlmT/5YoQVSypPnYlCnOggIgqlWL27z6avP+889nAT5hgj6jKVGC6MYbiSZN\nYn1NoMybZ247IYHowAHf9+bdd1kn4M+zO9qsWkU0Z05w3z9WFBlh4VYRYXF6kJ1tH6g6dTJv9+rF\nlkHa9kMPxaav+flE48cTPf88Wwd549tveQBViuiBB+zHCwt52W3jRvZH6NePw3IQ8X5vAzhA1KOH\nfi+UInrpJaIbbuDwFR07mmce1iWp22/n0qEDW00VFtqVx1p55hnuz65dRJUrm4+98AIf272bZwfG\nmcuddwZ3T++6SxcUb77pvV5hIX8/oxD0JViM7N5NdMstbI786afB9S8Qxo3TBed118W/wBBhIRRJ\n8vLMb9I33GA3lbzpJvYZGDWKl1WiwZEjbJ20dau+7/bb9T6VLMmDvZEVK1hQ7NnDg5vReik3l8NQ\nLFigD+jGJZ7y5YlefJFo9myz+az25g7w/uefNx+zWl35KuecY/+eixaxbkKbwb31ll0v0Lq1XWBp\nYS3eest8rFgx+zX+/ZdnJOXL8z20zqx27mQrLF/s3Gn/PoHqL4zPl1Lsj+IWJ0/al/6++8699iOB\nCAuhyHLyJA+y06fzW9mcOfqaeJky5lhG3vjmG6KePdkhKieH189//DG05Yrt2/XlnuRk/Z/fOig8\n+aR+zoQJuuI6LY1nP5qw2LKFTWe1N2hfA7pSbEZarhwvZ519tvl448bmbatOwVf7F10U/L0g4tmL\nta369VkJ/c035v0NGtjPv/RSc5233gq+D7m55mU0a4wrXxjNiQHfM5hQ+mWdwUXrhSZURFgIpxVb\nthD99JM+2BcWsq+BMfyERna2+R+2XTt9EE1I4IE3M5Otj77/nmjfPqKuXYkqVeK35Pvu4zf0pk05\njIa2NKKVqlX5OtaBuHhxvT/169sH1Icf5mMdOvgWENbSoQOfV1BgXn7TBJFxu1Yt/wJIK889F/rv\nMWWKPeBfr14sSB9+mIVhs2Yc3sSK9d4E6uG8YQM74FWowEthZcvykldmJtHbbwfed+OyZmIiW525\nyfDh5mcv3oMJirAQTltOnND/4YsVsy8/PPdc4ANxejrPQLwdz8wkat7cvC8piQduJ3PRJUu4Dy1b\n2o81bMjHGjQITlj06aN/t3LlzMesyvPMTA6TMXgwx3Py1mZyMiuUw8E6q9FKkya+Q7OPGGHu/++/\nB3Y9bx7jwcZ1OniQrbh69gzPcdAXy5fz72CNShyPiLAQTlvef988WFSubD4+Y0Zwg7HVE9nf8eRk\ntvG31itRQleyzp9vFyZVqvCxa64JvG9nn232MbB6WVs9wNu04RmR0zFr6dUrvN/Bm7AAeGnJaAp8\n8iR/D03Z+8UXrG/566/Ar2dVrGvl2WfD+x5nOm4JCwlRLsQdS5aYt61es507A716BZaprmVL4Oab\nfdextpOczOGyrdx1l57e84IL9IxwGiVKAP37A1u2+O8XACQmcij0ypX1faVLm+tY804fP657s1uP\nWSlRwvfxnBz2+v7nH2DsWGDUKPYg1yhf3vu5Dz4IVKgATJsGLFoEVK3KocLbtmWP7x49gCee4Ax5\ngdKrl31fWhpw5ZWBtyFEjngI9yGcoQwdCrz/PlCpEvDhh5wTGgDKljXX44mlzrff6uEgjCQkAAMH\n8iBJpA/exYpxSItly+znpKQAzZoBc+bo+ypW5NAWxvAXSgFXX82ft2wBNm2y53tYt46LlYoVObTH\n1q3m/c8/zzkqLrmE+/766/4FjVP7TmRk8P0dMwbYuJHTwO7bB7Rvz/m5d+0CLryQjyml3+OxY4HF\nizmcx0svce6JgwdZEBw4YA4vcvw4pyutXZvbBvjcN94ARowIrJ9GXnuNn4H//tPzd191lf5cCDHG\njelJNApkGSruOHnSnotg7ly2dDJmPLNSUMCOVsalhnr
19OPWZRxOv64zcKD35ZGBA7nOoUNsDktk\nD+9tLP36sSe00Xdg3Dhe6y5ThpXJ1atzHgsiDi6oKdOt+gRvpUEDXuO27v/+e3NwvtRU9ocIdAkr\nOdmuAG/Rgn0MDh5kRb7TeaNHs1+Ft3bnzNHv9ZEjHFX1xAkOw2G1lCpZ0p7jIt5Ddp9pwKVlqJgL\ngYA7KsKCZs1is8SuXc05lSPNkSM8APTurSsM69bVB4fbb2chYUzZ2aKFs8AoKCC68kr7AJWaqtcx\n2slrxcgdd3gf6OrWZcc3gAf6V15xbk8r6emsh9BSiJYpY7aesTpc+dN/OJUSJdg81ygYEhM5Zai1\nbtmyPCAPGcIJgjp2ZKW2VT9SqRI7Co4erTuIde1q7qs3n4zevTkkuNOxhATfIbf37WPBrtV/+WV2\nfNP6UKECh0Y/3Vm+nJ95UXDHYTnThcWWLeZQEJmZ/KYXDYx5EBITia66KrBB8oEHOCSEUWj89ptz\n3dat2XchP59NM43HEhPN/fFl3WQtSplzUzgdHzbMvK9NG+/34vzzA7umVVgQcfwlbd8557BpsOaA\nZyylSukztsJCogED7LGZWrbk42edpe9LSmJHQY327Z37d8895mcpJYWtsMqXN2ek88bhw/zS8Oef\n+r7Vq3mmtHu3//OLOsbnRUxn47Cc6cJi5kz7P72/QG5//MFmoU2acJiGULF6GPuzwrGWVq24L4MG\nOS+NGNvv1Yu9no0DbufO5v4MHhzc9Veu5Dfpyy7jf3SjqWyfPvbcCo0b83XWr+cB8auv2HegQQN+\n67cmv/FX0tJ4EA/mnKFD2epr0iT7sYoV+X4eOWI/Nnmyfp82b2YT5LPOYiHXqRNbKFlNhfv2Df3Z\nONMQpzzfg/S5AL4BsAQc6G8lgBVuXDyojp7BwmLFCqJbbzWvT591lu8p8IED9mQ1//3HeZZr12aT\nTWMYhC1beMZQrx6bb2rr/UREWVnmfw5tmSfUogW7A5z1EwsXmsNwX3ml+bs9/XTg17r+en47f+EF\nom7dWB9x9CgvmWnObQ0bmuMvffSReW3e6ASXlsZhPxo2DLwPiYn8Jh7Kvbr2WvN2ZqZ5aaxWLfN1\n1q9nE9aHH2Yh27AhL8Np4cStDoja/W3ShH97o7Ax8sUXvCzmK0ZWOMydyx798f6Wnpcn4T58DdJr\nAVwFTnlaUytuXDyojp6hwmLrVvObbPnyRHffbY5h5MS339oHBWvqz9RU/Z+zTRvzsVq12Fu5oIC9\nZ885h52mJk3iwbJbN/OAbhywnn3W9wCYlMRtWHM3aG/N1nwWGRnm7/bBB97brlmTg7slJHBf3n6b\ng90Z62Rl2Z3ftHucnMz5tn15Sf/6K3t5G/e1bau/cVoV3wkJ/HsFkvfCGt68XTv2HNe2H3lEvw+H\nD9vDf0yd6uxdDrDAcLrfxqWwpCT2pDZi1G9kZHDMKzcxvny0axf/egAJJOh9kJ7nxoXC7ugZICyO\nHeM3yfR0Hrxuuom9V63/4Pv26eecOMHrxcaZABHPIqyD00032dv6+2+ub7Wq0QZ+4+ykXDme4Wjb\n55/PzmvGweb554nOO8/3oJiUZPeAVopnS/PmcdgP47Hzzzd/t/x8DpGdmGgemBMTnRP8WEN1A87f\nVytNmjgLQk2oGAdvrYwZw6FKNm+26yGU4kGwUSMe3JOTvQsj60yrXz8Oh/3ii/pb/7Ztdgskrdx9\nt/fvNXSoXfdhdYAEOLihEWuk2uefd++Z37PHfv2xY/leFSvGxhPxOBjv2cMBE63WgPFINIVFRwAf\nAOgNoLtW3Lh4UB09A4SFk4ln167mAbtqVT3ZyrZtumVK2bL2GDhjx/KbYPHiRCNHEj36qL19TSFq\nVGKHUq65huiNN7wn1/FX0tPNfW/alAfZlBTva8L5+fYlgRtvtLftNDBrgfsAe4Khs8/mpTptOyPD\nu2K7WDFe2jEmwLG
+7QdThgwxbztlYXPyLgecw5MYy+efswVW8eJ8b++7j2cnVsG4ejVfJy+PQ3dY\ndTQTJwb9aHvF6frWmdH48e5d70wkmsLiYwB/AvgQwERPmeDGxYPq6BkgLG65xf4PXq8erxefdx6b\nzc6eTbR3L9e/915z3bZtWWBs2MBhoWfPZmWxxr//mmcbGRn6W9uxYxxkz9+A42/AN26npPCST+PG\n7KtgHXCN1xoyRO+nVSeRlqYf++UXVjgfOsTb1iWfvn0D62taGgckXL2aczhoepTUVDaftdb35lOh\n+XUY8ZapLpBiFVxG/xMNqzVaVhabwxr3pabqg7BS/Kxo5Oez0n/qVI7Ua+3DhAks+Js0sd+z/v0D\ne9Nft44V9P5CkRPxrEx7Ibr3XnvYDwn3ER5R1Vm4caGwO3qaCYu9e9kxasQIntIS8fKL9S342mv5\n+MSJet4Hpdi+3Zp3QPuHU8r8dnb33dz+3Lnm+kqZczBo/bLqNgAeAIcMcRZo3spnn+ntPvaYfdbR\nvj0vg/zyi3kAsobp1n76AQPM+6y6CMD5XG9Fs7JyGjCdhIt1X/XqrAd45x0WPBpGE9lwS+XKPEBn\nZPB3W7WKFcGaoE1JYUs5a3DFli1Z2d26NeslBgzQ7/GPP+rCz2lJ7brrnPtyzjl2/579+9k4wsiU\nKfqzWKUKL4n6IydHX141zrAjoSM504imsJgIoLEbFwuro0VcWBQU8HLK+PHsoGVUZDZowD4TO3ey\nWeiFF7IVizFfgtM/rzUFpq+yYwcnGrLu/+kn/ueuVo2XG7SZh3HdvWNHcwKbJUucHeuMb/WNG3O7\nxYt7X//XBrUHHuDBpXx57o+T49vx48EPtJpA8HasUSOi997z7YcB8D0x3uvkZBZUEyaYlwjfe4/v\nj3W57+KLeanK6MwWaLHmhmjViq+xahXRhx/qOqdDh/T7Vro0vxhYo7hOmMB1rdZt5cvrn2vX5mt4\n6096ur50+f77+vfv2VMXRtYZyeDBwf+/fP0156HQMgrGEzNm8P9HUhIL4XgnmsJiDYA8j1WUmM56\n4ZdfeJ29YUNeNrJiXGe2LjUA/AAaPW6tSzpO5bLLzIOVt5KQwG+AI0faj61cGdjSk9V81ckx7scf\nefbRs6d3hzB/RSmzEh1gBfWpU76FjrdSvLgepdV6T4ztG4851Xf67ayzwIwM9uO45x7z/t69+Z4d\nOMBCw+j97lSSkznHxeDB5twJAHtuG1m5kuiSS1joTprEb/GaE6TVYisri8OFd+tm3p+Zyff8rbe4\nj8YMgU48CH+VAAAgAElEQVTf8/nnnU1ItVDymje8VkaMcPM/Lbbk59sNRyKRutVNoiksajoVNy4e\nVEfjWFgcOmR2LEtOZv2AhlOeZeMgn5xszjccaElKcrZ+SUjQB+uEBLaPJ7IPPAArPQO93vbtHJr7\n0kudHdNCeXP29r2M223b+s9VHUhp1459S7Sc2d7qXXstUZ067nwXgGeG69YF5ptRvDi/VWusXm1+\nth5/XD9WWKhn+NN+a2NI8EGDnK/x8svm87SiWVsdOcIhVc47j+ipp+zLnd27O5+vGSIYU69qfimn\nC5s22b93Vlase+WbiAsLACU9f8s6FTcuHlRH+QvHJWvX2h8g4xp2fr7dCzocJahWnAbstDR+oHNy\neLnAaGbrpPxdtMg5kY+1pKSwAHSj38GW9HTnexhsqVaN78Pixc7mtFpJTQ3dqsupePOvcNr/55+s\n5G3WjAfsY8f4+Xr9dbtVmD8P7sJCoo8/tn+XevV4pma9fseO3p/vtm155ustV0fbtuzdTGR/noYN\nC/c/TOfECX6uNSOPaLN/v/27X3ppbPoSKG4JC1/5LCZ7/v7lsYb6y1D+9HHeGUeSQ6D3EyeA3buB\nZ54BXnyRQ3GXK8chsa+9Fjh8OPRrJSZyGO+GDe3HBw3iHAVly3Lo7Yce4kcaAPLy7
PXz84HsbA4r\n/cYbHKp76lQOUW2kXj3gk0+89/umm0L7Pk5UqmTezsvj7zxlCofyDpXjx4EGDYA2bYAjR/T9qanm\neidPcl0jyclAejpQs2bw19Xuv9P+xER9u1gx4NNPgdGjgRUrgPHjgSefBOrXBx5+GLjuOvP5GRnA\nRReZz581i88vLOQw3zfeCJQsaf8uSUn2fBeNGjn3s359YOFCYPt2DqduJDUVmD2bn6G0NN6Xn2+u\nc+qUc7veWL+e82JUrszfW7t/+/cD55zDz3XNmsCMGcG16wZO/+sXXxz9fsQENyRONAp3NT759Vf7\n28aoUeYgb02bsonm5s2s3wj0rbRECQ7XkJzMb9beIoka61vNPLVIsU4xlVau5GOTJvGsIT2d6PLL\nA++fVtq1s7/5+1rqAXjZ5KGHWHncrRv3b+xYZ6U0Eb8NB5p/2tgvJx2RsTjdU6ewGH362NfjwylG\nf4zSpdnKyerVftllzs/c9u1stmv0BzGWBx/U6378sT6LSEzUQ5DPncu/eUICL/1lZPBz64u1a82/\n60032etMnao/gzVrsj9QMFidOj/8kPdbw6o3aRJcu25hzCCYkhL/y2yesRPhFv8VgHYAins+9wXw\nOoAablw8qI7GsbDYs8c+iDl5XmsK2htuMPsclC/PistAlNWhlDff5H46BaXbs4cV8pG47muv2cNX\nACz49uzhnAsas2ezUrdLF+dwFRq9egV27WHD9PV7b+EvAKIaNezpQxMS+DwnM+E33tB/p0B/Lyfz\nVKeyfLk5zDvAgvTNN3kA1sjN9a8fqltXr//jj7oyulgxXnrUOHDALLSU0p3yvLFqFd/fMWPMVnJG\nNm/ml6jDh9nK7+23+X/C6LzojUqVzN9F87Ow6twaNfLfViTYt4+NFjp35u8Y70RTWKwAoAA0B7AU\nwH0A5rpx8aA6GsfCgshsfnnFFWx77usteNo09kPQfBEGD46csNBMJp1mNHv3Ouc+DrcvxYqxnf/8\n+c75JIxhEqx5p52snjT8Od0lJfE/8nPPsfXXwYO+Z2O1atkVz1pIdCf9zJw5nO9h2jT+HMhMJzHR\nWVA7CYsePbzfi6efZsHhpFy2FmNuC6v57c0368c2bnT+jm6xZ495ZqdZhvnioYf0+unpum/Hrl26\nkExNNQvQaGKcobduHb1UAaESTWGxxPP3aQB3GPdFs8SzsNi71z7A+RscjOaTvt58Ay1Wm3qtKMVJ\nk4jYDt64NHbddc4Z3MIpzZrxgH733eZrWYtmk//hh4G1S8RvqP7q3XqrHkEWYLNQf0p5axDFpCS+\nntN51auz8QARL/UE0vfUVH4D9/YbAWyuSmSfWRiLP6W7UvxcXXkl3ysNq1La6M1dUGC2xGvalJXq\nbvHxx/Z+aopwbxQW8nPx7LN2J8CjRzkqcbBLW27hFKLcyVQ+noimsJgLYAiAdQAqAUgAsNKNiwfV\nUW3EiEMWLLD/Q/gaGMIpaWn2abq232j3bnyg69Vju/gZM+yDy6pV7vbv9dcDS06kzSycst45WQkR\n+fZ/0IqTs6Cv9s8+2x4oT7ve2LHOfVmyhI8HKixq1+b6x49zxFyjMAPMUXUPHuTluNRUcyh3b6V0\nadYxXX+92QLPyNq1eijzs882CxIitjB64gl+oz9wwJV/if9n1ixzf8uWLRrB97yRl2fXCU6bFute\n+SaawqISgIEALvJs1wBwsxsXD6qjcSwsnN6OL7kksIEk2OLLga5JE15ndwrX8N9/dmcxgNevX3st\nNIe3cEpaGs8CPvnEfszJRNbJTDSUkprKMx+j06PTkpvGokVmnUPp0vqAunmz84BuXSaqU8f8vFhz\niDdtyrqQI0dYX6C9DATimNmuXWDP6K5dbOjgFKvJuAzYuXNgeoVgGDqUf9MaNewRbYsiEyfq/4d9\n+sS/8IuasAj7AkAXAP94ZiaPe6mT5dGHrAIwx
0sdV2+gm4waZf8nvu66wBWbbpb69VkoGEOEVKrE\nb49O4UEWLuTvcPIkD37ehFHp0uysNmWKc+yocPrboIF5n9XCRyn2MYjWPVSKXwAGDmTlcO/evE8p\n83r/qVMct8h6vnVZy8lq59lnuZ4xr8Y55/Dg46tvZ59tnu2MGeP/+VywQPcrycw0x1ravNl+DS0p\n1sGD3J9mzTiEibZ0eOoUhxnZtSu0/5fTAU1xXxQoEsLCs2S1Aez1nQxgGYCGljqlAPwNoKpnu7yX\ntty+h67xzjuBD0RJSe6aX3orycnmgX/gQGeTVGO2PCLvDmSJieyp/v77/JboL5FPMOWpp8zbr79u\n7nuvXoE5DropLHwdX7mSw2ZoMZWMyxJK2WdwRgFjZOFCe9u+YlmlprIlzoIFnN9i+vTAns8uXczt\n3HabfmzXLvv3XbyYj1l1HS+/zEtp7drpz9gnnwT//yJEl6IiLNoCmG7YHmydXQC4B8AzAbTl5v1z\nlXHjAhuELriAQ4g7ZbGzlgoV3B8EnaKhaspvjRo1vJ+/eLHZ+sctgVGhAi9VDBzI98aqW2na1J7Z\nzs1iXfbyZ+H04ot2gV+9OltVjRvH69paDKju3b3rAf77z74EVrOmnuPc2q8bbgjt+eza1dzOnXea\nj7/8sv5bPvCAvt/q5d6qlT1ZUrlyofVJiB5uCYsw/GEDoiqArYbtbZ59RuoDKKuUmqOUWqyUctEX\nODr06BFYvUGD2Hs4J8d3vdatgZ9+ArKygKrWuxUGmoetkeRk/XNBAXsvO5Gezh68hYX6PiJ3+rV3\nLzB2LHsoX3UVeyAbWbkSqFgxsLaUCv76ffuat53uk5GaNc0e4ABwyy3AmjVA//58T8eOBTZsAL76\nCihTxns748aZ923eDHTrxuXpp9lr/frrgUcfZY/uUHjmGY4eAADVqwNDhnCEgX//Ze/qRx8F9u0D\ndu0C3npLP8/q2V6mDD8jRqzbwumLg/M6o5QqCbaCqgaeHUw2HBtNRPe62IdWADoAKA5goVJqIRFt\nsFYcPnz4/3/OyspCVlaWS10Ij337AqvXvTtw+eVAu3a+6y1ZwmEcjh0Lv28aXbs6D6SpqTxgLF0K\nDBjAIUqcyM01Cwp/pKfzOUbq1gUuvRT49VceWI3s3cv3ZcgQYMcOe3t33w08+KD/61oFmFK+hVpm\nJrBzp3lf+fIs1P/4A9i61XwsKQm47DL+bfr357YrVgRuu81/35y4+WbgnnvMoVgmTuS/P/wAvPQS\n8PnnobWt0bQpC8SFC/n37t0bWL0aOHqUQ3nMnu38UpKZyb+LRo0aQJ8+LOCWLuXQKy+/HF7fBPfJ\nzs5Gdna2+w17m3IA+ArASADXAPjOs53qORaQnwV4GWqGYdtpGepxAMMM2x8AuM6hLdenZ26xf799\nScbXsom2zBBo8RX0LtDy229ELVrY90+Y4JwZzlsbTg58WklJYdPIkiWdQ4v06MGOVL6ukZhoz4eg\n/fQ//OA/v3coxbrsVKcOK3Evvti8PymJFd5EvLR0xRW89BROoLy8PN8OkJ06hf14eo0+qxWj38WC\nBRxye+dOe7RZLYlWbi7rWzZssF9r0ybWpZzJyu94Ay4tQ3k/ACyzbD8JYD6AckEIi0ToCu4UsIK7\nkaVOQwAzPXWLgfNl2JItxbOwcPJV+OEH//+k0Sxbtzp7/kZD2e5rYLaWYsWcw4MfOMD+G40bB5cN\nL5TSqpU9RhOgC4W//iJq3tx8rG5d9mcJxDpp7159MHVy8jKWQYN8t7V1KyvQr7qKs/Y54S+3yB13\ncL033tD3ZWYSvfSSvp2Wxkp9X8yYoYcOKVuW/y+E2BMNYbEGQIJl361gy6XNAV+ATWfXAlgPYLBn\n310A+hvqDPK0uwLA/V7aiciNdAOnMBqvvspept5yN0e7bNxof1MEnPfFsrzyirPJ8Y03mrc1D3jj\nQKuUXZkbS
vGWmbBFC1Zu+xN4kyZ5f1aee07vc926nBho6FD93LZtWdHfti0rm0+c8P3sGQ0SEhOd\n4zoZ05RaS9myejBJa8DF55/n2Edjx/qPF0VkF0r9+vk/R4g80RAWLwO41GF/FwDr3bh4UB3lLxyX\nzJ9v/yd0Sg2qFTcT6wRa/v3XnsNaGxCi3RdfpVcvZwdBa6rPcGJXaSavSnFIEmvmurQ070tzgViA\npaY62+Bv3epc/4kniJYt4yiwJ0/yzOPXX4l27/b93B08aG/rySft9axLgi1acL0ZM3iGU1jI6Wyt\nlnDvvKO3sXIlL8EZgz9asZr9Gi2rTidmzuQshMWLc7yueCfiwiLeSjwLC3/r8LEuycm83PHYY/Zj\na9ZELoChr+KUuMnXgGwNsBdOqVOHv/f27fz7Wb3IO3bk0BlVq4ZuHvzccxwORAviSMRr/E5127fX\n6yxfruu7SpXipZ/jx9nkeutW83PnlD3w9dftz6f1jd8azK93b/2Y9ix07arPaoxhT2rV8q6PWL5c\n9z5v2JDzvp9uFBTYn11vYVbihWjMLM4DsBzAUQALnfQI0SzxLCx+/tm9gSxS5frrnT24f/qJlaix\n7p+xOIW5cFO3YozFRGTXT5Qsyfu9ZebTBvM6dZxjW1lnRk89pV/LmItdKxdfzMtbjz/OSnPjsS5d\ndIV/cjJ70Bu59lq9bqVKzuE8rC8JWsh6Ivb1sPbH6uxnzeX90kve/xdyc7lNb6HLI8Wvv7LOqEIF\njjYcKZwEtPU3iTeiISz+BNAJQCqA6wH85MYFQ+4of+G4ZMoU9waySBan3BILFtiTyrhZlAosIJ6x\nWK2h0tI4uJ5T3UAU5tZ9KSnm38+q+FeK9/tq9+GHuc6pU+boukrZHSqLFTNfz1e+dauOy6rMr1nT\n3FZBAT9/48ZxOHAn8vJ42alLFx5IjbGMnHKxLF1qPt8aPfjZZ9k57/vv9ToffcTLadaIANGgoMBu\nfThvXuSuZ32mInktN4iGsFjiazva5XQXFsEOqKEUp7hPI0cSDRkS+WuXLOldcWwtThn27r7buc17\n77XvP+88/ofOyODw0dbBMDXV/PtZo9lq+Sys0WGtZelS+2B+9dXOHuFE+iBtTeJjLdp1MzOJHnnE\nfEyLYOsmb7+tz4Yee4zjRc2dy+FdiHjmrH2ntm3Npt+DBvG6vfHehZsPY/NmzsFRrx4r//2Rk2O/\nh59+Gl4fvHHokP1aw4dH5lpuEQ1hsQlAd0Mxbbtx8aA6yl84Lvn88+AGzlgVJ2ExenRgob+jWZz0\nBKEuQyllfxNs3ZoHtIED+fv3728+npnJv+vmzRyTqmZN5zSrxoRXxmL1RSlWjAPyJSXx0tXixSz8\nnF4QlOI395tv5uimhw7p/jGpqURff+3+83v77fr1W7bUzV+rV+dw7HPnsrJ92zZ70Mzixe2ZBo0p\nXUPBmiwrkIHfGMeqShX/xgGh4iSYhgyJzLXcwi1h4dWDG8CvAK40bM81bBOAr32ce0bhLURGvJHg\nENxl506gVKno98UXRPZ99eoBf/0VWlvHj+vfPSODvcg7dtQ90rt1M5/TuTPw7rvA8uXA4MEchmTW\nLGDdOmDOHK4zcCCwdq3zNevXN3uFlysHTPbEP9i0iT2/lyzhkBvnnQesWMHH0tKAG24Ahg7l7Y8+\nYk/xP/4A/vmHPaozM+3X27CBw480bw4kJgZ3f7ZsASZM0LeXLtU/b90KtGrFn1NTgT//BCpUMJ9f\noQJQpw57hGvUqRNcH6ysX+9724nPPwcmTQIOHwZ69Qo8PEywOD2b1apF5lpxhxsSJxqFuxqfOCk5\n47E4maTefbfzUk4si9PMYt8+DpUdievVrcs6iNq1WTdiNTU1mu126MDhuYmckzw1bmzOrw7Yl9Wq\nVtWfnUOH+G39nXdYeWrNRXLZZXrdffs40c7UqWyhtHCh2XGuY0fWT/hi1y6
Oitu7N+sXnKLOeiut\nW/NS2h138CypShVer9+xg/tZowY/S+HmwzDO4pKT9Si48cCpU/b7NW5crHvlG8/YiXCL9wPAJMPn\nW9y4WFgd5S8clziFtojH4hSCZMoU/zkUol2s1jeA8z+pW8XqE+Mv4u+YMZwwyprEqF07TmDkpLMw\n7nv2We/PkrVNLULsli32+6Ll2DDu87dMZQz5kp5OtG6dOersddfp6WStbTdrprej5baIBPn5rEd5\n9FE930q8cOyY/Xn47LNY98o30RAWSw2fY6rc9vTBrXvnOsHks4hlcRoE33vPdw6FaJcePZx9MHbt\nCuz8pCSirKzgfEecQrcHUlJT2SS5QQMeZDWzVScFvVZuvdX3s2Q1Yz7/fN4/bFhgfZo61XvbTtkG\ntXwU+/fr/hOHDhH984/Z8EEpPS7WmY5x5tOoESdCimeiISyWOH2OVYlnYTFpUmiDTbSL05t5r16x\n75dWGjfm++kUIiWQmYVSRJMnBzcDSUsLL3fIgAH258GXsFDK7gX9yy+sTM7IsM9KtHwRxuUmY2na\nVP++l1/u37/BmJUwOZnjN/3wAyveq1VjpbqRWbPYwVDiPJmZPp1n5fEuKIiI3BIWvhTc1ZRSowAo\nw2ejruMBV5QmpwHNmsW6B4FBZN/3yy/R74c3Vq9mRW7FisC2beZjeXnO/TdCxIpjp3oJCfYQ68nJ\nwPvvA88+aw7FHQxVqnC733zDytVrruEw5r76aOzHtm3AFVewstuJ5s357733At9+CyxYwO0XFnLo\n8W+/5TDsOTn8W3bqBJx1FvDqq86GCzNmsNL+0CHg/vtZOdumDRsBAMCdd3Ko+Hr1eLtDBy6CmS5d\nYt2DGOBNigC4xVdxQ1IFU7ir8ck337j7hh2JYlWcasWa7zrWpXZtXqs27ktK4vhb/oIyJiURffyx\nfX9mpnN9LYqs0U8AcDYEMJbmzdlk9PrrOSSG0Su7bl3OaGesr+kAALM3d06Os/lskybsQ3LhhRxC\nY9w49nUoLOQQJbm5dp2B1dfn+usDe3adQpDMnu3Gf0VkKSjgvOzjxzt7rQs6nrET4ZawG4hWiWdh\n4WSDH2/l0kud97sRpdXtYrWzP+cc+z5rSUwkuvJKtsqxHvOmkyhfngP3OeX5MJbbbuOYSLVr29ft\nnZy0hgwxL4W98gqHNV+zxnzuggX2cytV4kCCRERr15r1Ny+84P0ZtArYOnUCe3bz883WW2edxUIs\n3rEKaG+pawUit4SF17SqSqnvfJWITneKGKEuYUSTkyed/Sz8pRCNBb//bt7essWevrNYMfN2QQHw\n/ffAzz+b9591Fi+jGNPHauzbx9kLfWU6zMgAxozhFKSbNnGmQ43du52zHi5bxsOYxtSp7K/QsKG5\nXo0a5tSlKSmcta58ed6eMoWXizTGjPHeT2vSyKZNgcWLzf1wIjERmDkTePNNzsq3YAFQooTvc/yR\nlweMGMHZ+cLN8ufEsWPsg6KxcaP9dxcigDcpAmAvgCUAHgVwMYD2xuKGpAqmcFfjE6dEOUWl9O7t\nWyEbL+Xnn3XP4uRk9hL2FQY+mOIvZLxTuHEieygOpbhfxiiuAHtFjxvHPhq3325eNrn/fp4VJSVx\nbCUj775rbqdVK9/P4Wef8bWN3u49ephjQUUDq2f7d9+5235+vt0QYNYsd69xOuEZOxFu8X6AM9d1\nAfAhgKUAngPQxI2LhtRR/sJxyS23xH4wDbX07WtPLBTr4mT22q2bfV/DhpG7nvEa+/Y5/+733Weu\nq6W/tcbAsgqjbt34/E2bzPqRlBRz+O+8PHb8S0zkJTBrgD8n/v3X/h38Zbhzm3r1zNcfOND9a/zw\nA1uKpaRwPCvBO24JC6/LUERUQEQziOgWcC7tDQCylVIDIjDBKdJoFivxTLlyzlY6J04An30W/f74\nompV+74ffrDv++cfd65HZN+XksJ/167
lZaHu3YH8fHOdRo3M20eO8F+rZZPWlsbixbz0tXu3eXkt\nL8+8JJaczL9NXh4vgbVo4f+7OIX7sP7uJ04ADz3ES1cjRtitxMJFCxHibdsNunble5Wby8tnQhTw\nJUnA4cm7A/gCwGIAQwFUdUNKBVu4q/HJ5MmRfdN2o3Ts6JyfIdo5uAMpVarE5/X+9z/z7+4tkCDA\n3tGJiRyV1pvvx8MPm7PT1a/P1lVPP81LVo8/7j98hxNPPaW32b+//fj995v78cYboT333jh8mO9N\n+/bs6S7EFs/YiXCL9wPAR2CdxXMAznbjYmF1lL9wXPLww9Ed3EIpP/3kvNxiXTKIh2J0HNOKlgo1\nlGKNimotTjm/nUrbtpxedMgQoquuYvNUb3WNwmH0aNYneEumpJXUVHuiokceCe2Z3LyZ86470a6d\n+Rp9+4b+7AvxTzSERSGAHE85Yig5AI64cfGgOspfOC7p0iUyg6abxVsfL7ggen3wl6gI4DhF1hDf\nAJvFhnpdf34TVmHhq7413LlTqBRjvgeAzV+XLQusr9YcGlrWPjd58knzNT74wP1rCPGDW8LCl84i\ngYgyPKWkoWQQUUl3F8OKNkePxroH/pkxw3n/jh3R64O/tfFSpXgt3Um3Ek4/rWa3Vqw6pzJl2Cs/\nKclu7qp5OmvUqcMmsBoJCWxua6VyZf9myq1b2/Uip075PidQfvwReO454NdfWU/x8stA797swX7H\nHe5cQzi98SoshMDZvz/WPQgdpWLdA53Dh4H//c95cDf6N7hJRgZQs6Z5X1IS55T47jvgt9+A0qX1\nY0a/CAA4/3weiDt2BM49F/jwQ84PYbyvI0ZwCJMvvgAaNADKljW3cd55wGuvsY9Fp07mYxdeyH8P\nHfIv9AAO+TFyJPdb4/33OWfH0KHAJZdwfx99lHNs3Hmn/zYFAQDCnppEq3BX4xN/HsDxXJzCgce6\nOCmDr746evfAmFHwllvMS2BXXsl6hK5dOQ/F8OHsAd2vn+75PGaMuT2nzHGTJpnDkGRl6UEAe/fm\nMCDXXkt0/Li+hFi2LNGvv3p/Dj/6SG8vIYHD0BBx28b+XH89e6W3bMlmrf6CD4bLwYMc3Vai1sYG\nz9iJcEvYDUSrxLOwqF498gNopEo4iuNoliZNvB/TnPV8lTp1OERE3br2Y04h0X0VLWyH1Wnu9tt5\nv9VJU4umq/Hnn85KdackP2+/ba7TqJH359Aa6kSLD2UMjQHYX25Gjgz/f8AbBw+ylZd2rXvuidy1\nBGfcEhayDOUCTn4BRQWnMBjxiHUt3wgRh7fwxaZNHPZj40b7MeMykxWn+6OFTVm2zLxf27bqVzTf\niWPHOGTHfffxZyNKOfdD893wtm3Emt5T237tNU4VW7480KMHkJ5urvf3397bDJfp0zkdrcbYsUUn\nDbFgRoSFCxDFugeh4yucdiy4/HLnGFa+8oTn5XEI79tv5zV5J5KSgGHD7Pvr12eFthFN35CaCowf\nz21rPPAAnwOwnsLIpZfy39xc8/78fFZUd+jAbf3xh70fr7zCcawAjtU0bBg7IvboYXbqu+oq5+8H\nsNL6ssv4+3TrxjHLatUC+vUDPv2Ut7/4gh3ajHTu7L3NcLHe2xIlis4LimDBjelJNAp3NT6JR8e2\nQEu8+Vl8+63zspJT6HFjSUmxm7VqJSGBzUWtZqlaOftse/1p01iXUbw4+1SULk1UpozdMe+TT1iv\n8eSTnM/89tvtEX4bNuRQ4976npWlt/fZZ+Zj1uRU9eoF9kwOH24+z5ihr7CQc3jfdVd0UoLedx/r\noTIy+Pct6ixfzlGQmza1J4uKR+DSMlTYDUSrxLOwMK7JFrXi5AAXyzJqlHPeim+/jdw1nfwqjDko\nrOXff82//8mTHNpbO251fqxWjWjHDrPi3FjS04k+/5zbsuYdsQrzatUCeyZvusl83kUXea87fTo/\nwzV
rspI8Epw4Edm83dGisJB/A+2+KsU6qHjGLWER8WUopVQXpdQ/Sql1SqnHfdRrrZQ6pZTqHuk+\nuU1RCFFuXafWiLe+f/+9sznv1VdH7ppOJqmHD3uvn52tf963j01QN2zQ91n1K4cPA2+8wWa6ycm8\npGY0wc3NBW68kbPm1aplPrdtW31fQgLw9NMBfCEA115rvo/XXms+fugQsGIF61d69GC9wubNwG23\nmb+LW6SmOi8vFjWOHjVncSQy62ROa9yQON4KWCeyAUBNAMkAlgFo6KXeLADTAHT30pbbAtc1vC1/\nFIWihceOdT+MxeoBHeni9MbvKzf59On8u69d69xX6/20Rsdt1co57/cffxAdPcqRZitXZnPhgwe5\nzJgReB7scePY8zs9nejii+1LTYsW6UtyTnGxikKmvFjSvr1+r0qXJtqyJdY98o1n7ES4JewGfDbO\n0WqnG7YHA3jcod6DAO4BMKEoCotATDfjtQRrNhqNEg3hpRR/93btOAue8VhCAudMcDqvRAkevImI\nBlm6LrkAABv5SURBVAyw38uOHVmvoQkgpeyCJz3d3m6jRuxTQcTt//GHOe/FyZOBPYtbt9r9VBYt\nMtexmtga+1O8OAcCFLyTk0P0zDPsb/P337HujX/cEhaRnhhWBbDVsL3Ns+//UUpVAXANEY0BEEf+\nxIETbxZFweAU0jrW+DKTdQsiDj8yaZLd7DY93ft9adRIt8yymrEmJHD4jzfe0MN0EAGzZpmXnUpa\nguXUrw/Mm8fXXb2avbzPOw+oVw9YtIiXidLS2LKoY0fg8cftYUc0VqzgaxqxZh60hl0xWm8dOwas\nWuXctsCUKMHe8K++CjRuHOveRI94GObeBGDUZXgVGMOHD///z1lZWciy5pKMEUXZbtxq7x+vJCS4\nn3fhyBGOx1S5snm/NnhWqWL3mVi8GFi4ELjgAnt/Dhzg0BrWtfkTJ1jP0b07sHMn57FISmKhWKoU\ncOWVwJdfAv37Ay+8AOzZo7fXrx8LAID1DLNnc9m+Hfj4Y/t3uvBCNrXNy+PthATgiit4XX3VKuCc\nc9gs948/gJwcIDOT+2NEE5REwLhxfF7nztxPIf7Jzs5GtlGx5hZuTE+8FfAy1AzDtm0ZCsAmT/kX\nHNF2F4CrHNpyeXLmHrFetgmn+LL6iadizT7nZnFaFiIi+vBD59AjWppQY7gOX+Wqq5yjzr7/vrmN\nOnU4XLixTqNGzm3WqOH9eVy+nFPOtmhBNGcOh9nQLMxKlOBlqd27iRYs4CWvoUP1do0mtsa8GIAe\nPuRMJzubIwGUK0f0wgux7o1/PGMnwi1hN+CzcU7Nqim4U8AK7kY+6k9EEdRZBBJ6O15LUel7JM2T\nnRTcRM7CICODzWD/+cd7KHOrzqVrV87jbTQJTkoievNN+7nvvaeHaC9fnv09nEKyXHdd4M+n1e/j\nppvsdf77j2jdOvO+Vq3M5zklUjrTKCiw++v4itcVD7glLCKqsyCiAgADAPwM4G8AU4hojVLqLqVU\nf6dTItkfwY7bSzuRwhoZ1k3I8tRpy0jWJbo2bYDRo4HatTl0uZPJbVaW3bN73z6gUiVeNqpWjZe3\n3nzT2Wy5cWNO5bpkCZuwZmXx0tazz3LkWoDNb/v2Dfz7lShh3nYKoV6zJutIjFjTxlq3A+XUKdab\nuJUGN5bk5vLyoJGtW53rnna4IXGiUaC97sUhsX7rPhNKoNnsQiklSpi3ExJ4ecb4Zl26NNG+fb6X\nntLT+Xlo08a8v3x58/Myf76zxdfFF5vr7dmjO+VZZ4BNmnCdQLyJ167VLb7OPptnOYFw8CBHwG3W\njL29b7yRI+AGsxx14gTRhRfq/X7xxcDPjUdOnrQvTUbKkdEtPGMnwi1hNxCtIsJCSqRK69bm7ZQU\nDjturTdokLN+QysdO/LzYNUDJSbqz8q6dfbj5cvzoG/lhRe8X6tev
eC8iQsKWPgUFob2jLdsaf4+\nf/3lfI1Ro4juuIPzjhOxZ7r1XoSSVzxeOHbM/ls891yse+Ubt4TFaeBTKZwJBGqebPWADgRjoD6A\nrYEWLrTXe/11+3JY/fq8zHP++WyGCwDVq5vrZGTwktWqVRzo0Mk7vFkz+z7rd9YC8CUl8bJUMN7E\nCQlAhQqhJbvKzweWLtW3Cwp4mczKiBEcaHH8eKBPH+CTT+zfISEhvhJuBYvTsq0xU+LpjAgLIa6w\nZqLTeOCBwM73F6rcCesAUFDgrCMpLLTrKerUYT+I667TTXBHjzYPiE8/zSarTZuyyasVb9+tf3+g\nZUv+XKwYD77Z2azTuOEGFkLt2+v1S5VigdSzJ/D22z6/clAkJbEw1EhJYT8QKzNnmrd/+YXNbbt1\n4+3ERGDUqKLtl1SiBJtba6Snc6TfMwI3pifRKNzV+CTWyyhnQtm/P7B6oWQt7N3bvF2qFNHq1c51\nrd7exjJoED8P991n3m/ViWglJYWXbfr0YQulffvsz1ZeHvdl/35eL1+1ylzP6E1s9SgfNcq9Z3zv\nXqJ77+VQJDNnOtfp3998/Vdf5f2Fhaw32bHDvf7EilOn7EuRP/0U6175xjN2ItwSdgPRKiIszuyy\na1fk2m7b1rydmspK6GDbqVCBn4d77vFfNyWF6OuvzQNPRgZRbq7zM3bgACuaAY5FZkxRunw5m292\n6GC+RjDmtW6Qk0N0552sAxo8mEOmnG446SyiEeY9HNwSFrIMJRQJHn44cm1v2WLePnkytCi32nJW\nnz7+w6jk5QF//mkOtZGTA/z8s3P9MWN0T+7jx4FHHuHPI0YAzZsDF18MrFljPufss4P/DuFQogSb\n+S5aBLz4YnyGkgmXYsWAu+/Wtxs3Brp0iV1/okkRXj0UziRWroxc29aQHoCeCjUYunuC63/+uVm3\nYQy/oVGuHIcft1K7tn1fQYHdJ6OggMOIjBih79u501zn0KHA+y4EzpgxHPL90CEWFNZYX6crIiyE\nIoEbFjSJic6OdE44DfBOZGbqymtt4LbGWipbFti1S99OTgZ++omtaJKT9aCDAPDNNzxD+f13brdh\nQ1agLlzI94CI+/bCC2xZlJRkPt+IkxAU3OGMUWobcWMtKxqFuxqfxHo9X0rsSkYG/23QgGj9en4e\nevY013GKv3XZZaxncGpTCyOSmEh0223mY1WrEm3erD97Y8fq9c89V6+XkEA0dWps/h+E+MIzdiLc\nIjoLQXCga9fA6uXk8N+1a3UT2LJlzXWcZkU//8yznPLl7ce02U9BATB/vvlYQoLZrv+uu3j56d9/\nWVfwyy/ASy8Bc+dGNrugcOYhwkI4o1GK9QdW7r3X93mNGgGdOpn37d/Pf1evNu/3tpz1/ffsd+Br\nia1ZM6BuXf6cmAg88wzPHfbsYWe5ggJgwgT2q9i+neNSPfYYhyrXKCzkWFQ338x1BSEk3JieRKNw\nV+OTWC+FnAklkqlrjdFgteIvdPtLL3E4DmNGvEmT+HmoVMlcNynJe3RfpxDoZ53F+1u2ZN+Ew4eJ\nZs1iX4Xdu3UT2mrV9NhRAJv8bt9ufz5HjDC3/8EH7j7/O3dyZN06dYgefJDDfgjxg2fsRLhFZhZC\nkcDoqRws/kw409Pt+5xCcgCcse6GG/j4E0+wcjkpiUNc3HIL17FaIeXncwIiJ4hYSa5RrRq3TcQm\nvbt2sbVNhw4cWuSFF3QT2m3bgPXr9XNPnuTsbVZmzzZvz5rl3JdQuesu4McfgU2bgLfeYg924fRD\nhIVQJDBaEwWLU2gKDaX0mEtGevVyrr9wITBlCheN/HxzrCSn+EHffee9D+3bc3vjxrE5rWYmu3+/\n2TQW4HhQvqhQwb6vRQvf2+GyYYPvbeH0QISFUCRo0yb0c7ds4bfyNm2AJ58ErrqKFcUpKRxvyUmn\ncMEF9n1JSXr8pypVzMfGjdNjI
5UpYz7mSyfRqhXw2ms8W+nfn2cuvrjvPt2uPyXFPGOpX5/1FVZ6\n9tTbzcgArrnG9zWCxdheQgLfX+H0Q/GSVvyjlKJ47WtRjqJZVPjvv9AiyhqpVo2XnIxLN5mZ7Hm8\ncaO+LyGBS36++fy0NBYYRJwYyeojccEFbL10+eXAjBn6/hIlnGcEjz8OPPQQJ0bSeOklYPBgfXvI\nEF56AoDly4EffuCAgVWqsPdwgwZsDZWTw8mLnJ7Fbt14mUjjzjvZ09otiDji7tq1/N3DWTIU3Ecp\nBSIKe5QSpzyhSGANIx4KxnDeGrt329uuUME5i92JE+ZtqzOctpxl9aT29o7z0ku8vj99OtCuHe9b\nvNhc56+/+O+SJVxH68MTT7AXMcCzHW3G44R1Wczt7IhKAbfd5m6bQvwhy1BCXOFkxgqYlcBuo5Te\nflIS8M47bKIaCKVK6X9HjuTP1nSsSnGIiNatOTWq0bciJ4dzU2hYU5tq219/bRZWkycH1j8AGDZM\n72elSs5LVYLgDxEWQlzRuLHz/kBCb4TKRRdxEL6ZMzlPdI8erNsoXdr3eWlp+qzh2DFdCV+njrle\n+fIcfG7RIp7JWHUGxvwOTz8N3Hor+1b07q0LIKsQLV488O/Xti0vvS1cyN+zQYPAzxUEDREWQlyx\nfLnz/sREZ6slNzh4EBgwgKPF9ujBDm8Ar+tryZicEvacPAkcOcKf8/N5WQmw61asmdSeeELfV6EC\n8Nxz+rH0dGDiRLYomjyZ9R2AfbZiXRLzR4UKLDT8CUBB8IYICyGuOHKElctWkpP1wTgcnHQf8+fz\nwLx3L7BsmZ4VrkcPni089RSv8ytl9tmw6iK0t/0HH9RnAmlpPEsxUrs2z2D+/psV64GYslqvVZSz\nzQlFE7GGcgGxhnKPSy9lpzHrT03Epq9WBXAwpKWxqemXX5r3lynDswuNhAQ9PtOmTcBZZ3lXUqen\nc06KlBRWVHfowPu3beM4TW3bcuTYcNmzh0N4rF/P1/r0Uz0kuiD4wi1rKBEWLiDCwj3+/JOFgtVi\n58iR8PMGdO/OimIr1arZLaW0R23pUvaFCIQePYAvvuAYTR06AOvW8fLPjBmBt+GLY8fYe7taNaB6\n9fDbE84M3BIWsgwlxA2VK7Py1WmpKCPDOUIrwEtDRkc4pXiZJimJB9bzz+eERN6SAVmVxcZlsKZN\ngUsu0bc1HYIT8+bxm/9LL7GgAHhpy+g3EQ7Fi/N3EUEhxAIRFmcIRWGNe/ZsHoytHtAA+xj4yl43\ncqQ+yBOxwjk/nxMA3X8/0LevPUYSwHGNmjUz78vI0D8nJfHM4PPPefnKl9nprl287KTFbtKIdPiL\nuXOBb7+1K8EFwU1kGcoFZBnKHbSft0YNYOvWwM9Tios3Z7NmzewDOAC0bMnObl99xUtIGjfcYI79\nZOXzz9kEdfJkfQZh5NxzeTlNo1Eje9hyt3j4YQ4/DnAu7nnzfM9+hDMPWYYSXCEhgRW48USxYsHV\nJ/LtlezNoU+bUfz+u3m/ddtKz57s6OYUx0kpFhZGrNtukZurCwqAzY6nT4/MtQRBhMUZTsWKwMqV\n0V+m8hVsz5ppzkqFCqxL8EbFinp+6jfe4MRANWva62m6EessYssW39fXeOsts36jfHn21H7rLQ4K\nWKcOh+QwDuhukpxsF1jGJTRBcJOICwulVBel1D9KqXVKqccdjvdRSi33lHlKKR/DgBAq1jASGhUr\n8oDjL+eD2xhNVQE9WB7Ab8xWRo7kTG+TJ7MZ6Zw5egY5jYQE4PnnWQFcWMg6i2nTeOa0cSN7RBvR\nHACtYb0DvRdZWWxaO3o0x3Dau5d1ICkpHIV240a2vvIn/EIlKYkd+DTHwdtvBzp3jsy1BCHs7Em+\nClgYbQBQE0AygGUAGlrqtAVQyvO5C4DfvbQVbsKoiBHrLHKBlA0bnPc3a8bf4bzz3L9m8+bOmeC
s\npXFjolOn9Pv5zDPm4+npzvf9wAGisWOJOnUi6tKF6MsviRYssLe/Zg3XnzDBvH/QIN7/++/mTHaP\nPhq5ZyUS5OYSHTwY614I8QpcypQXUQW3UqotgGFEdLlne7Cn446+uEqp0gBWEpHNOFAU3OGxdSvn\nO7C+tb/2GjBwIL9lu50UR3NYs2IN7Q2wwtjovHb99TwrKFmSEwf5SmBkZPVqoEkTfVsp9nvQorK+\n9x7HgGrenE1ateW35cvZ6qlBA/fzPQhCLCkqIcqrAjDatWwD4CuNzZ0AREXnMq1bA1Wr8sBrHby1\nJYzmzYFzztFDYgdKVhaQne18zHqtO+7gDHS5ueYEOSkpdh+KL74Irh8aWqwmDSLepwmL/v25WGne\nnIsgCM7EjYJbKXUJgNsA2PQaQuh068bmlErpgsGI0czyzz85MmpWlu/2NIYN47f0CRNYkayFugB4\nnd44G0hJ4UQ/l14KXHklB+krX55DZn/8sXeHu2CxBspLSAjf81sQhMjPLLYDMMbcrObZZ0Ip1QzA\newC6ENFB63GN4cOH///nrKwsZPka1aKIt+WWeKBvX93q5+RJ+3Hrm3jr1sD//seRU7X4SAkJnA/h\njjuAV17h3NBK6YpbLfHNgAHARx8Bhw+zr0J6OueF2LePzz37bP06d97JxW0aNgRefJGD/yUksCWS\nr8RAgnC6kZ2djWxv0/0wiLTOIhHAWgAdAewEsAhAbyJaY6hTA8AsADcRkVcL93jWWVSurOcyiCeq\nVOHYRhUr8vZFF/Esw8jMmfy2b+X229nSJjmZ3/x79ox8f90kL083nxWEM5kiE0hQKdUFwFvgJa/x\nRDRSKXUXWNH9nlLqfQDdAWwGoACcIiKbXiOehcUNN7BXb6AkJdnzO7tNaiqbdVapou/bsYN1F8Y6\nx4/bQ4LPng107Khvp6fzbCFS+SQEQYgcRcaDm4hmEFEDIqpHRCM9+8YR0Xuez/2IqBwRtSKilk6C\nIt4JNk2lMY2mN6pUYUXs1Knm/ampnEnNyqOPcvhtgJedJk40CwqtzX/+YUujO+/k2ZBT7ghrwL3c\n3OCT7QiCcHohsaFc4LffgIsvDqxugwas3PVVv1gxc1C4oUOBl1/mqKMTJ7IJrDH9aHIysHMn6xC2\nbWOFrpZzORRycji66d9/8/YddwAffBB6e4IgxI4iM7M4E9ixw/fxhAT2Pp4xg9/sc3Kc62gYl4AA\nnomcOAEcOABcfTUHpnvtNfa8zsgAxo/nzGxKsfdyOIIC4DYXLOAEOz/8wMJNEIQzG5lZuMBrrwGD\nBjkfq1aNI5saQ0osWcI+DUZef51NV6tW5ZlEIDF+iIqGQ6AgCLGjyCi43SKehYVVcQwATzzB0U57\n9mRfAiOFhUCXLmyJBLB/ws8/Rz8+kyAIpz8iLOKM118HHnmEPz/0EDup+SI/n5elCguBrl3FxFMQ\nhMggwiIOyc3lwd+aplMQBCFWiLAQBEEQ/CLWUIIgCELUEGEhCIIg+EWEhSAIguAXERaCIAiCX0RY\nCIIgCH4RYSEIgiD4RYSFIAiC4BcRFoIgCIJfRFgIgiAIfhFhIQiCIPhFhIUgCILgFxEWgiAIgl9E\nWAiCIAh+EWEhCIIg+EWEhSAIguAXERaCIAiCX0RYCIIgCH4RYSEIgiD4RYSFIAiC4BcRFoIgCIJf\nRFgIgiAIfom4sFBKdVFK/aOUWqeUetxLnVFKqfVKqWVKqRaR7pMgCIIQHBEVFkqpBADvAOgMoAmA\n3kqphpY6lwOoS0T1ANwFYGwk+xRpsrOzY92FgJB+uktR6GdR6CMg/YxXIj2zaANgPRFtJqJTAKYA\nuNpS52oAHwEAEf0BoJRSKjPC/YoYReUBkn66S1HoZ1HoIyD9jFciLSyqAthq2N7m2eerznaHOoIg\nCEIMEQW3IAiC4BdFRJFrXKm2AIYTURfP9mAAREQvGeqMBTC
HiD7zbP8DoD0R7ba0FbmOCoIgnMYQ\nkQq3jSQ3OuKDxQDOUkrVBLATQC8AvS11vgNwH4DPPMLlkFVQAO58WUEQBCE0IiosiKhAKTUAwM/g\nJa/xRLRGKXUXH6b3iOhHpVRXpdQGAMcA3BbJPgmCIAjBE9FlKEEQBOH0IC4U3AE67mUppZYqpVYp\npeYY9v+nlFruObYoVn1USg3y9GGJUmqlUipfKVU60O8XJ/2Myr0MsJ8llVLfeRw1Vyqlbg303Djq\nZzzdz9JKqa89/fldKdU40HPjqJ/R+l8fr5TarZRa4aOOoyNxlO9lsP1sadgf/L0kopgWsMDaAKAm\ngGQAywA0tNQpBeBvAFU92+UNxzYBKBPrPlrqXwHgl1DOjVU/o3Uvg/jNhwB4Ufu9AewHL5vG1f30\n1s84vJ8vAxjq+dwgXp9Pb/2M8v28EEALACu8HL8cwA+ez+cB+D3a9zKcfoZ6L+NhZhGI414fAF8R\n0XYAIKJ9hmMK8eFcaKQ3gE9DPDdW/QSicy+BwPpJADI8nzMA7Cei/ADPjYd+AvF1PxsDmA0ARLQW\nQC2lVIUAz42HfgJRup9ENA/AQR9VvDkSR/NehtNPIIR7GQ/CIhDHvfoAyiql5iilFiulbjIcIwAz\nPfv7xbCPAAClVDqALgC+CvZcFwinn0B07iUQWD/fAdBYKbUDwHIADwZxbjz0E4iv+7kcQHcAUEq1\nAVADQLUAz42HfgLRu5/+8PY9onkvA8GXw3PQ9zLSprNukQSgFYAOAIoDWKiUWkhEGwC0I6KdnreP\nmUqpNR6JGyuuBDCPiA7FsA+B4NTPeLqXnQEsJaIOSqm6nv40i1FffOHYTyI6ivi6nyMBvKWUWgJg\nJYClAApi1Bdf+OpnPN1PI0XRrD/oexkPM4vt4LcHjWqefUa2AfiJiE4Q0X4AvwJoDgBEtNPzdy+A\nb8BTwVj0UaMXzEs7wZwbLuH0M1r3Egisn7cB+NrTn40A/gXQMMBz46GfcXU/iSiHiG4nolZEdAuA\niuB167i6nz76Gc376Y/tAKobtrXvEc17GQje+hnavYyU8iUIJU0idKVQClgp1MhSpyGAmZ66xcBv\nHI09n0t46hQHMB/AZbHoo6deKbCCMz3Yc+Ogn1G5l0H85u8CGOb5nAmeTpeNt/vpo5/xdj9LAUj2\nfO4HYFI8Pp8++hm1++m5Ri0AK70c6wpdcdwWuoI7avcyzH6GdC8j9iWC/MJdAKwFsB7AYM++uwD0\nN9QZBLaIWgHgfs++2p4fZClYgAyOcR9vATA5kHPjrZ/RvJeB9BNAZQA/eX7vFQB6x+P99NbPOLyf\nbT3H1wD4EkCpOL2fjv2M8v/6ZAA7AJwEsAU8e7T+D70DFgzLAbSK0b0MqZ+h3ktxyhMEQRD8Eg86\nC0EQBCHOEWEhCIIg+EWEhSAIguAXERaCIAiCX0RYCIIgCH4RYSEIgiD4RYSFcMaglCpUSn1k2E5U\nSu1VSn3n2b5FKbVHcfj2pUqpSZ79k5RSmzz7/1Sc0RFKqTJKqZ+VUmuVUj8ppUoF0ZdpSqmSLn9F\nQYgYIiyEM4ljAM5WSqV6tjvBHGgNAKYQh5poSUS3evYRgEFE1AocknycZ/9gcAjtBuBIqUMC7QjR\n/7V396xVREEYx/9P4RewMZUKaay0sTOIYGGhqJWiCJF8ACtTaWMpsbQ0asBC7BJQbCx8aSQgaIwp\ng52JnVYiZCzOLFz2bnJIjNG4z6+aXfacs9vc2dkLc+JMRHzb4nOY7TgnC+ubZ8DpjNst2qHeFO4V\nMJrxOWAm4xngfPtiSSOSXmZV8kHSsTy/LGmvpAOSliQ9yArlkaSTkt7k8dGtPKTZdnOysD4Jyh4D\nl7K6OAy8bV1zMX/Y30ka75jjLKVFAsC+iFgBiIgvlKZ3bZeB51mVHKG0WWjupTEKTGWFcojSMmQM\nmARubPYhzf6E3dKi3Gx
bRMRHSQcpVcVThiuJxxFxrWPoHUk3ga/ARDNde/qOcfPAtKQ9wGxEvM/z\ng+suR8SnjBeBFxkvUJrSmf11riysj+aAKYY/QW3kev6XcSoilvLcSrPzmKQRYLU9KCJeA8cpraEf\nSrrSMfePgXht4HgNv9DZP8LJwvqkeZu/D9yKiMXfnG8OuJrxODA7tKC0H1iNiGngHmUTr/Xuq8tu\n3FjH/kN+a7E+CYAoe7nf3ey4DreBJ5ImgM/AhY5rTgCTkn4C34FmS+DBOdeLN1rbbEe5RbmZmVX5\nM5SZmVU5WZiZWZWThZmZVTlZmJlZlZOFmZlVOVmYmVmVk4WZmVU5WZiZWdUvHgZ4RfSgnIQAAAAA\nSUVORK5CYII=\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "_=scatter(sims,sims2,marker='o',edgecolors='none')\n", "xlabel('MFP0 sim')\n", "ylabel('MFP2 sim')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Look at the distribution of MFP0 similarities in random molecule pairs (more on this in a later post) " ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import random\n", "idxs = list(range(len(rows)))\n", "random.shuffle(idxs)\n", "ms1 = [x[1] for x in rows]\n", "ms2 = [rows[x][3] for x in idxs]\n", "sims = [DataStructs.TanimotoSimilarity(rdMolDescriptors.GetMorganFingerprint(x,0),rdMolDescriptors.GetMorganFingerprint(y,0)) for x,y in zip(ms1,ms2)]" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAYIAAAEPCAYAAABP1MOPAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAGvJJREFUeJzt3X+0XWV95/H3B1ISEYggJdcJmlARSFQKUW9x2SWHugTS\nWpLSGRptBxCYuvhhnVpnJuka58aOayG2uqDLhraCJFktTVOrEiCGQMnRoRUSICGBBIijiUlqriMq\nQlGamO/8sZ8btjcnueeeH3efe5/Pa62z1nOe8zx7f/c59+7v3s8+59mKCMzMLF9HVR2AmZlVy4nA\nzCxzTgRmZplzIjAzy5wTgZlZ5pwIzMwyN2IikDRZ0iOSNkraImkg1Q9I2i3p8fS4uNRnkaTtkrZJ\nurBUP0fSZknPSrq5O5tkZmajoWZ+RyDp2Ih4SdLRwD8DfwDMBV6IiM8OazsLuBN4B3Aq8ADwpogI\nSY8AN0TEBkmrgVsi4r7ObpKZmY1GU0NDEfFSKk4GJgFD2UMNms8DVkTE/ojYAWwH+iX1AcdHxIbU\nbjkwv9XAzcysM5pKBJKOkrQR2AvcX9qZ3yBpk6TbJE1NddOBXaXue1LddGB3qX53qjMzswo1e0Zw\nICLOpRjq6Zc0G1gC/FJEnEORID7TvTDNzKxbJo2mcUT8WFIduHjYtYHPA3en8h7g9aXXTk11h6s/\nhCRPgGRm1oKIaDRkf0TNfGvo5KFhH0mvAt4LPJ3G/IdcCjyZyquABZKOkXQacDqwPiL2As9L6pck\n4HLgriNsTE89BgYGKo/BMU2suByTY+r0o1XNnBG8Dlgm6SiKxPH3EbFa0nJJ5wAHgB3Ah9IOfKuk\nlcBWYB9wXbwS4fXAUmAKsDoi1rQcuZmZdcSIiSAitgBzGtRffoQ+NwI3Nqh/DHjrKGM0M7Mu8i+L\nm1Sr1aoO4RCOqXm9GJdjao5j6r6mflA21iRFL8ZlZtbLJBHduFhsZmYTmxOBmVnmnAjMzDLnRGBm\nljknAjOzzDkRmJllzonAzCxzTgRmZplzIjAzy5wTgfWUvr6ZSGr50dc3s+pNMBt3PMWE9ZRihvJ2\nPnu1NR2v2XjmKSbMzKwlTgRmZplzIjAzy5wTgZlZ5pwIzMwy50RgZpY5JwIzs8w5EZiZZc6JwMws\nc04EZmaZGzERSJos6RFJGyVtkTSQ6k+UtFbSM5LukzS11GeRpO2Stkm6sFQ/R9JmSc9Kurk7m2Rm\nZqMxYiKIiJeBCyLiXOAcYK6kfmAh8EBEnAk8CCwCkDQbuAyYBcwFlqiYQAbgVuDqiDgDOEPSRZ3e\nIMvdZE9aZzZKTQ0NRcRLqTgZmEQxK9g8YFmqXwbMT+VLgBURsT8idgDbgX5JfcDxEbEhtVte6mPW\nIS9T/Hm29hgc3FlBzGbVaioRSDpK0kZgL3B/2plPi4hBgIjYC5ySmk8HdpW670l104Hdpfrdqc7M\nzCo0qZlGEXEAOFfSCcCXJb2ZQ+cK7ujcv4sXLz5YrtVq1Gq1Ti7ezGzcq9fr1Ov1tpcz6vsRSPo4\n8BJwDVCLiME07LMuImZJWghERNyU2q8BBoCdQ21S/QLg/Ii4tsE6fD+CTHXifgS+n4Hlqmv3I5B0\n8tA3giS9CngvsA1YBVyZml0B3JXKq4AFko6RdBpwOrA+DR89L6k/XTy+vNTHzMwq0szQ0OuAZZKO\nokgcfx8RqyU9DKyUdBXF0f5lABGxVdJKYCuwD7iudHh/PbAUmAKsjog1Hd0aMzMbNd+q0jqqr29m\nB75546Ehs1a0OjTkRGAd1Qtj/E4Elivfs9isI/yDNMuPzwisoybCGYHPKGy88hmBmZm1xInAzCxz\nTgRmZplzIjAzy5wTgZlZ5pwIzMwy50RgZpY5JwIzs8w5EZiZZc6JwMwsc04EZmaZcyIwM8ucE4GZ\nWeacCMzMMudEYGaWOScCM7PMORGYmWXOicDMLHNOBGZmmXMiM
DPL3IiJQNKpkh6U9JSkLZI+nOoH\nJO2W9Hh6XFzqs0jSdknbJF1Yqp8jabOkZyXd3J1NMjOz0VBEHLmB1Af0RcQmSccBjwHzgN8BXoiI\nzw5rPwu4E3gHcCrwAPCmiAhJjwA3RMQGSauBWyLivgbrjJHist4kCWjnsxv//f23a1WRRERotP1G\nPCOIiL0RsSmVXwS2AdOH1tugyzxgRUTsj4gdwHagPyWU4yNiQ2q3HJg/2oDNzKyzRnWNQNJM4Bzg\nkVR1g6RNkm6TNDXVTQd2lbrtSXXTgd2l+t28klDMzKwik5ptmIaFvgh8JCJelLQE+JM05PNJ4DPA\nNZ0KbPHixQfLtVqNWq3WqUWbmU0I9Xqder3e9nJGvEYAIGkScA/w1Yi4pcHrM4C7I+JsSQuBiIib\n0mtrgAFgJ7AuImal+gXA+RFxbYPl+RrBOOVrBL5GYNXp2jWC5AvA1nISSGP+Qy4FnkzlVcACScdI\nOg04HVgfEXuB5yX1q9hbXA7cNdqArbv6+mYiqeWHmY0/Iw4NSXoX8LvAFkkbKQ6X/hj4gKRzgAPA\nDuBDABGxVdJKYCuwD7iudHh/PbAUmAKsjog1Hd0aa9vg4E7aP6I2s/GkqaGhseahoep4aMdDQzZ+\ndXtoyMzMJignAjOzzDkRmJllzonAzCxzTgRmZplzIjAzy5wTgZlZ5pwIzMwy50RgZpY5JwIzs8w5\nEZiZZc6JwMwsc04EZmaZcyIwM8ucE4GZWeacCMzMMudEYGaWOScCM7PMORGYmWXOicDMLHNOBGZm\nmXMiMDPLnBOBmVnmRkwEkk6V9KCkpyRtkfQHqf5ESWslPSPpPklTS30WSdouaZukC0v1cyRtlvSs\npJu7s0lmZjYazZwR7Ac+GhFvBt4JXC/pLGAh8EBEnAk8CCwCkDQbuAyYBcwFlkhSWtatwNURcQZw\nhqSLOro1ZmY2aiMmgojYGxGbUvlFYBtwKjAPWJaaLQPmp/IlwIqI2B8RO4DtQL+kPuD4iNiQ2i0v\n9TEzs4qM6hqBpJnAOcDDwLSIGIQiWQCnpGbTgV2lbntS3XRgd6l+d6ozM7MKTWq2oaTjgC8CH4mI\nFyXFsCbDn7dl8eLFB8u1Wo1ardbJxZt1yWReGQkdvWnTZrB3747OhWMTWr1ep16vt70cRYy8/5Y0\nCbgH+GpE3JLqtgG1iBhMwz7rImKWpIVARMRNqd0aYADYOdQm1S8Azo+IaxusL5qJyzqv2Im18967\nf7v9/bdvrZJERIz6SKTZoaEvAFuHkkCyCrgyla8A7irVL5B0jKTTgNOB9Wn46HlJ/eni8eWlPmZm\nVpERzwgkvQv4OrCF4lAngD8G1gMrgddTHO1fFhE/Sn0WAVcD+yiGktam+rcBS4EpwOqI+Mhh1ukz\nghb19c1kcHBnm0sZ30fU472///atVa2eETQ1NDTWnAha56Gd8d/ff/vWqm4PDZmZ2QTlRGBmljkn\nAjOzzDkRmJllzonAzCxzTgRmZplzIjAzy5wTgZlZ5pwIzMwy50RgZpY5JwIzs8w5EZiZZc6JwMws\nc04EZmaZcyIwM8ucE4GZWeacCMzMMudEYGaWOScCM7PMORGYmWXOicDMLHNOBGZmmXMiMDPL3IiJ\nQNLtkgYlbS7VDUjaLenx9Li49NoiSdslbZN0Yal+jqTNkp6VdHPnN8XMzFrRzBnBHcBFDeo/GxFz\n0mMNgKRZwGXALGAusESSUvtbgasj4gzgDEmNlmlmZmNsxEQQEQ8BP2zwkhrUzQNWRMT+iNgBbAf6\nJfUBx0fEhtRuOTC/tZDNzKyT2rlGcIOkTZJukzQ11U0HdpXa7El104Hdpfrdqc7MzCo2qcV+S4A/\niYiQ9EngM8A1nQsLFi9efLBcq9Wo1WqdXLyZ2bhXr9ep1+ttL0cRMXIjaQZwd0ScfaTXJC0EIiJu\nSq+tAQaAncC6iJiV6hcA5
0fEtYdZXzQTlx2quCTTznvn/tX2nwK83HLvadNmsHfvjjbWb+OZJCKi\n0bD9ETU7NCRK1wTSmP+QS4EnU3kVsEDSMZJOA04H1kfEXuB5Sf3p4vHlwF2jDdZs4nuZIpG09hgc\n3FlBzDbejTg0JOlOoAa8VtJ3KI7wL5B0DnAA2AF8CCAitkpaCWwF9gHXlQ7trweWUhzyrB76ppGZ\nmVWrqaGhseahodZ5aMj9/b+Tr24PDZmZ2QTlRGBmljknAjOzzDkRmJllzomgx/T1zURSyw8zs9Hy\nt4Z6jL/14/7+1pC1yt8aMjOzljgRmJllzonAzCxzTgRmZplzIjAzy5wTgZlZ5pwIzMwy50RgZpY5\nJwIzs8w5EZiZZc6JwMwsc04EZmaZcyIwM8ucE4GZWeacCMzMMudEYGaWOScCM7PMjZgIJN0uaVDS\n5lLdiZLWSnpG0n2SppZeWyRpu6Rtki4s1c+RtFnSs5Ju7vymmJlZK5o5I7gDuGhY3ULggYg4E3gQ\nWAQgaTZwGTALmAss0Ss30r0VuDoizgDOkDR8mWZmVoERE0FEPAT8cFj1PGBZKi8D5qfyJcCKiNgf\nETuA7UC/pD7g+IjYkNotL/UxM7MKtXqN4JSIGASIiL3AKal+OrCr1G5PqpsO7C7V7051ZmZWsUkd\nWk50aDkHLV68+GC5VqtRq9U6vQozs3GtXq9Tr9fbXo4iRt6HS5oB3B0RZ6fn24BaRAymYZ91ETFL\n0kIgIuKm1G4NMADsHGqT6hcA50fEtYdZXzQT10RUXFJpZ9vdP/f+uf7vWLH/iAiN3PLnNTs0pPQY\nsgq4MpWvAO4q1S+QdIyk04DTgfVp+Oh5Sf3p4vHlpT5mZlahEYeGJN0J1IDXSvoOxRH+p4B/kHQV\nxdH+ZQARsVXSSmArsA+4rnRofz2wFJgCrI6INZ3dFDMza0VTQ0NjzUND43towv09NGTV6PbQkJmZ\nTVBOBGZmmXMiMDPLnBOBmVnmnAjMJpTJSGr50dc3s+oNsAr4W0M9xt8acv+q++f6vzcR+FtDPaSv\nb2bLR2RmZmPNZwRd0N5RffVHhO6fd//x/L+XO58RmJlZS5wIzMwy50RgZpY5JwIzs8w5EZiZZc6J\nwMwsc04EZmaZcyIwM8ucE4GZWeacCMzMMudEYGaWOScCM7PMORGYmWXOicDMLHNOBGZmmWsrEUja\nIekJSRslrU91J0paK+kZSfdJmlpqv0jSdknbJF3YbvBmZta+ds8IDgC1iDg3IvpT3ULggYg4E3gQ\nWAQgaTZwGTALmAsskW/JZWZWuXYTgRosYx6wLJWXAfNT+RJgRUTsj4gdwHagHzMzq1S7iSCA+yVt\nkHRNqpsWEYMAEbEXOCXVTwd2lfruSXU9p517Dvskx8zGm0lt9n9XRHxX0i8CayU9w6E3TG3pBqiL\nFy8+WK7VatRqtVZjHLXBwZ20f99YM7Puqtfr1Ov1tpfTsZvXSxoAXgSuobhuMCipD1gXEbMkLQQi\nIm5K7dcAAxHxSINlVXrz+vZuPg/t3UC8+puXu3/O/acAL7fce9q0Gezdu6ON9Vs7xvzm9ZKOlXRc\nKr8auBDYAqwCrkzNrgDuSuVVwAJJx0g6DTgdWN/q+s2sG16mSCStPYqzaRtv2hkamgZ8WVKk5fxt\nRKyV9CiwUtJVwE6KbwoREVslrQS2AvuA6yo97DczM6CDQ0Od5KGh8Ty04P659+/FfUouxnxoyMzM\nJgYnAjOzzDkRmJllzonAzCxzTgRmZpmbkInAU0SYmTVvQn59tNqvf7bbfzzH7v7u76+PVslfHzUz\ns5Y4EZiZZc6JwMwsc04EZmaZcyIwM8ucE4GZWeacCMysgya39Ruevr6ZVW9Alvw7gsZLqLD/eI7d\n/d3fv0OoUqu/I2j3nsVd86Y3vb2lfied9JoOR2JmNrH17BkBbGip77HH/hYvvbSbqo9qfEb
g/u7f\nWv9e3CeNFxPujABaOyM46qgpHY7DzGxi88ViM7PMORGYmWXOicDMLHNOBGbWQ/w7hCqMeSKQdLGk\npyU9K+l/jPX6zayXvUzxraPWHoODOyuIefwb00Qg6Sjgc8BFwJuB90s6ayxjaF296gAaqFcdwDhS\nrzqABupVBzBO1KsO4BD1er3qEDpqrM8I+oHtEbEzIvYBK4B5YxxDi+pVB9BAveoAxpF61QE0UK86\ngHGiXnUAh3AiaM90YFfp+e5UZ2bWAb7G0Iqe/UHZCSf8Zkv9fvKTf+1wJGY2fgxdY2jN4OCUNFfZ\nyD7xiU8cUjdt2gz27t3R8vqrMqZTTEg6D1gcERen5wuBiIibhrXzb8zNzFrQyhQTY50IjgaeAd4D\nfBdYD7w/IraNWRBmZvZzxnRoKCJ+JukGYC3F9YnbnQTMzKrVk7OPmpnZ2Knsl8XN/LBM0p9L2i5p\nk6RzeiEuSWdK+hdJP5X00R6J6QOSnkiPhyS9tQdiuiTFs1HSeknvqjqmUrt3SNon6dKqY5J0vqQf\nSXo8Pf5nt2NqJq7UppY+vyclras6JkkfS/E8LmmLpP2SunoDkiZiOkHSqrSP2iLpym7G02RMr5H0\npfT/97Ck2SMuNCLG/EGRgL4JzAB+AdgEnDWszVzg3lT+FeDhHonrZOBtwP8GPtojMZ0HTE3li7v9\nXjUZ07Gl8luBbVXHVGr3T8A9wKVVxwScD6zq9t9RC3FNBZ4CpqfnJ1cd07D27wMeqDomYBFw49B7\nBDwHTKo4pk8DH0/lM5t5n6o6I2jmh2XzgOUAEfEIMFXStKrjiojvR8RjwP4uxzKamB6OiOfT04fp\n/m8zmonppdLT44ADVceUfBj4IvC9LsczmphG/S2PNjUT1weAf4yIPVD83fdATGXvB/6uB2IK4PhU\nPh54LiK6uW9oJqbZwIMAEfEMMFPSLx5poVUlgmZ+WDa8zZ4GbaqIa6yNNqZrgK92NaImY5I0X9I2\n4G7gqqpjkvQfgPkRcStjs/Nt9rN7ZxpauLep0/ixiesM4CRJ6yRtkPSfeyAmACS9iuLM9x97IKbP\nAbMl/SvwBPCRHojpCeBSAEn9wBuAU4+00J79QZmNnqQLgA8Cv1p1LAAR8RXgK5J+Ffgk8N6KQ7oZ\nKI+pjvWReCOPAW+IiJckzQW+QrETrtokYA7wa8CrgW9I+kZEfLPasAD4TeChiPhR1YFQzJu2MSJ+\nTdIbgfslnR0RL1YY06eAWyQ9DmwBNgI/O1KHqhLBHoosNeTUVDe8zetHaFNFXGOtqZgknQ38NXBx\nRPywF2IaEhEPSfolSSdFxA8qjOntwAoVPx09GZgraV9ErKoqpvIOIyK+KmlJl9+npuKiONL8fkT8\nFPippK8Dv0wxPl1VTEMW0P1hIWgupg8CNwJExP+V9G3gLODRqmKKiBconYGnmL51xKV282LLES54\nHM0rFzyOobjgMWtYm1/nlYvF5zE2F4tHjKvUdgD4o16IKf1hbAfO66HP742l8hxgV9UxDWt/B92/\nWNzM+zStVO4HdvTI53cWcH9qeyzFkeXsqj8/iovYzwGv6pH36S+AgaHPkmLY5qSKY5oK/EIq/xdg\n6YjL7fabeYQNupjiV8bbgYWp7kPA75fafC5t9BPAnF6Iq/Rh/wj4AfAd4LiKY/p8+ud4nOI0cH0P\nvE//HXgyxfTPwDurjmlY2y/Q5UTQ5Pt0fXqfNgL/AvxKt2Nq9r0CPkbxzaHNwId7JKYrgDvH4j1q\n8vN7HXBfeo82U8yUUHVM56XXt1F8MWLqSMv0D8rMzDLnW1WamWXOicDMLHNOBGZmmXMiMDPLnBOB\nmVnmnAjMzDLnRGCjIumApOWl50dL+n+SVqXnV0j6XpoqeKOkpal+qaRvpfpHVdy2FEknSlor6RlJ\n90maOopY7pF0Qovb8TZJN7fSt1PSPD5zqowhxfHXks6
qOg6rjhOBjda/AW+RNDk9fy8/PwkWwIqI\nmBMR50bElakugI9FxByKqXv/KtUvpJgm90yKGRMXNRtIRLwvIn7cykZExGMR8V9b6QsgacL870TE\n70fE08PrJ9I22pH5g7ZWrAZ+I5UbTQc80mRuXwfemMrzgGWpvAyYP7yxpD5JX0tnE5uHbnIj6duS\nTpI0Q9I2SXekM4u/kfQeFTfpeUbS2xss83xJd6fygKTb0xH6NyV9uFHQkl6Q9GeSNgLnSfq4ipvu\nbJb0l6V26yR9StIj6QYiQ/FOkfR3kp6S9CVgSqnP+9NyNkv61LB1flrFzWHWqripzlCc7zvMdn0t\nnS09LWlJ6bUlKd4tkgaGxTvnMNt4Y4p3k6RPN3pfbAIYq59q+zExHsCPgbcA/wBMppga4d2km6tQ\nTAHwPYqpJR4Hrkj1B+f2Af4T8I1U/uGw5f+gwTo/CixKZQGvTuVvASdRzLvy76S5cCgm/LotlS8B\nvtxgmQdvCEMxb9RDFJMwvhb4PnB0gz4HgN8uPX9Nqbwc+I1UXgf8aSrPBe5P5T8sxfVWYB/FPEyv\nA3ambRm6cc4lpXVemMpfAtakNmdTzHrZaLteSu+JKO4Pfmk53tR/HfCWUrxzhm9jiufp0rJPqPrv\nz4/uPHxGYKMWEU8CMynOBu7l0DOAoaGhORGxrFT/Z2lq3Gt4ZXbE4XOcNJrzZAPwQUn/Czg7Iv4t\n1ZfX++2I2JrKT1HsTKGYLG1GE5t1b0Tsj4jngEGKOaWG20+xMx7yHhW3AtwMXAC8ufTaULvHSut/\nN/A3ABGxhWIOLYB3AOsi4gcRcQD429QW4N8jYm1pW76W2hxpu9ZHceOSoDhbG5qWfIGkxyiS9+z0\nONI2Pg/8RNJtkn4L+Mlh1mfjnBOBtWoV8KeMbjrgj6XkcFFEbEt1g0p3npPUR4M7h0XE/6HYMe4B\nlkr6vQbLfrlUPlB6foDmplsf3r9Rn5+mnSvpGslfUBxtnw3cRmmop7S8nx1h/TpMuWzfsLheBkhx\nHG65hyRXSTOBPwIuiIhfphjem8KhDm5jRPyMYkbUL1LcGnLNYdZn45wTgY3W0A7rC8AnIuKpNpe3\nCrgyla8A7jpkhdIbgO9FxO0UO9xG37Q50nWJTt2AprycKRQ73OckHQf8xyb6fx34XQBJb6EY3gFY\nD7w7Xe84muJMqz7KeMr603WTo4DfoRj2OgF4EXghJd65Iy1T0qsphpPWUAzPnX2YPjbO+Q5lNlpD\nR4t7KKYJH1W/Bm4CVkq6imKc/LIGbWrAf5O0D3gBGLptYnmZhysfad2jjfVgfUQ8L+nzFMNQ36XY\nmY/U/1bgDklPUUwR/Gha1l5JC3ll539vRNzTROyHe+1Ris/mdODBiPgygKRNab27KJJDo+WUy8cD\nd0kaOnP4wyPEYuOYp6E2m0AknU9xw6RLqo7Fxg8PDZmZZc5nBGZmmfMZgZlZ5pwIzMwy50RgZpY5\nJwIzs8w5EZiZZc6JwMwsc/8fjnBH7oJn0H0AAAAASUVORK5CYII=\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "_=hist(sims,bins=20)\n", "xlabel('MFP0 sim in random pairs')" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "collapsed": false }, "outputs": [], "source": [ "cn = None\n", "curs=None" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Try molecules that are a bit more similar.\n", "Use a similarity 
threshold of 0.6 for the pairs, computed with count-based Morgan1 (MFP1) fingerprints." ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "As above, start by adding a table with Morgan1 fingerprints for the smaller molecules:\n", "\n", "    chembl_21=# select molregno,morgan_fp(m,1) mfp1 into table rdk.tfps1_smaller from rdk.mols \n", "      join compound_properties using (molregno) \n", "      join compound_structures using (molregno) \n", "      where mw_monoisotopic<=600 and canonical_smiles not like '%.%';\n", "    SELECT 1372487\n", "    chembl_21=# create index sfps_mfp1_idx on rdk.tfps1_smaller using gist(mfp1);\n", "    CREATE INDEX\n" ] },
{ "cell_type": "code", "execution_count": 20, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import psycopg2\n", "cn = psycopg2.connect(dbname='chembl_21')\n", "curs = cn.cursor()\n", "# pull 35K random (molregno, molecule) query rows from the single-fragment, MW<=600 table\n", "curs.execute('select molregno,m from rdk.mols join rdk.tfps1_smaller using (molregno) order by random() limit 35000')\n", "qs = curs.fetchall()\n" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "Now loop over the query molecules and find one neighbor for each, stopping once there are 25K pairs:" ] },
{ "cell_type": "code", "execution_count": 21, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done: 0\n", "Done: 1000\n", "Done: 2000\n", "Done: 3000\n", "Done: 4000\n", "Done: 5000\n", "Done: 6000\n", "Done: 7000\n", "Done: 8000\n", "Done: 9000\n", "Done: 10000\n", "Done: 11000\n", "Done: 12000\n", "Done: 13000\n", "Done: 14000\n", "Done: 15000\n", "Done: 16000\n", "Done: 17000\n", "Done: 18000\n", "Done: 19000\n", "Done: 20000\n", "Done: 21000\n", "Done: 22000\n", "Done: 23000\n", "Done: 24000\n" ] } ], "source": [ "# clear any pending transaction, then set the cutoff used by the % similarity operator\n", "cn.rollback()\n", "curs.execute('set rdkit.tanimoto_threshold=0.6')\n", "\n", "keep=[]\n", "for i,row in enumerate(qs):\n", "    # find at most one other molecule whose MFP1 similarity to the query is above the threshold\n", "    curs.execute('select molregno,m from rdk.mols join (select molregno from rdk.tfps1_smaller where mfp1%%morgan_fp(%s,1) '\n", "                 'and molregno!=%s limit 1) t2 using (molregno)',(row[1],row[0]))\n", "    d = curs.fetchone()\n", "    if not d: continue\n", "    keep.append((row[0],row[1],d[0],d[1]))\n", "    if len(keep)==25000: break\n", "    if not i%1000: print('Done: %d'%i)\n" ] },
{ "cell_type": "code", "execution_count": 22, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import gzip\n", "# write one pair per line: molregno1 smiles1 molregno2 smiles2\n", "with gzip.open('../data/chembl21_25K.mfp1.pairs.txt.gz','wb') as outf:\n", "    for idx1,smi1,idx2,smi2 in keep:\n", "        outf.write(('%d %s %d %s\\n'%(idx1,smi1,idx2,smi2)).encode('UTF-8'))\n" ] },
{ "cell_type": "code", "execution_count": 26, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "25000\n" ] } ], "source": [ "print(len(keep))" ] },
{ "cell_type": "code", "execution_count": 23, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from rdkit import Chem\n", "from rdkit.Chem.Draw import IPythonConsole\n", "IPythonConsole.ipython_useSVG=True\n", "from rdkit.Chem import Draw\n", "import gzip\n" ] },
{ "cell_type": "code", "execution_count": 24, "metadata": { "collapsed": false }, "outputs": [], "source": [ "rows=[]\n", "# open in text mode and iterate lazily so that only the first pairs are read\n", "with gzip.open('../data/chembl21_25K.mfp1.pairs.txt.gz','rt') as inf:\n", "    for row in inf:\n", "        row = row.split()\n", "        row[1] = Chem.MolFromSmiles(row[1])\n", "        row[3] = Chem.MolFromSmiles(row[3])\n", "        rows.append(row)\n", "        if len(rows)>100: break # we aren't going to use all the pairs, so there's no sense in reading them all in" ] },
{ "cell_type": "code", "execution_count": 25, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# draw the first five pairs, one pair per row\n", "t = []\n", "for x in rows[:5]:\n", "    t.append(x[1])\n", "    t.append(x[3])\n", "\n", "Draw.MolsToGridImage(t,molsPerRow=2)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "I won't repeat the property analysis for this set, but these pairs will also be useful later.
" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.1" } }, "nbformat": 4, "nbformat_minor": 0 }