{ "cells": [ { "cell_type": "markdown", "id": "75ab2410", "metadata": {}, "source": [
"A while ago there was a [question on Twitter](https://twitter.com/GeorgeK_86/status/1425807309700276227) about highlighting bonds which changed in a reaction. I put together a quick bit of example code to answer that question and made a note to do a blog post on the topic. I'm finally getting around to doing that blog post.\n"
] }, { "cell_type": "code", "execution_count": 1, "id": "a2a456ca", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2021.09.2\n" ] } ], "source": [
"from rdkit import Chem\n",
"from rdkit.Chem import Draw\n",
"from rdkit.Chem.Draw import IPythonConsole\n",
"from rdkit.Chem import rdChemReactions\n",
"import rdkit\n",
"print(rdkit.__version__)"
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here's something similar to the reaction from the question:" ] }, { "cell_type": "code", "execution_count": 2, "id": "a1a62215", "metadata": {}, "outputs": [], "source": [
"rxn1 = rdChemReactions.ReactionFromRxnBlock('''$RXN\n",
"\n",
"      Mrv2102  111820212128\n",
"\n",
"  2  1\n",
"$MOL\n",
"\n",
"  Mrv2102 11182121282D          \n",
"\n",
" 13 13  0  0  0  0            999 V2000\n",
"   -7.5723    2.6505    0.0000 C   0  0  0  0  0  0  0  0  0  1  0  0\n",
"   -6.8579    2.2380    0.0000 O   0  0  0  0  0  0  0  0  0  2  0  0\n",
"   -6.8580    1.4130    0.0000 C   0  0  0  0  0  0  0  0  0  3  0  0\n",
"   -6.1435    1.0004    0.0000 O   0  0  0  0  0  0  0  0  0  4  0  0\n",
"   -7.5725    1.0005    0.0000 C   0  0  0  0  0  0  0  0  0  5  0  0\n",
"   -7.5725    0.1755    0.0000 N   0  0  0  0  0  0  0  0  0  6  0  0\n",
"   -8.2869   -0.2369    0.0000 C   0  0  0  0  0  0  0  0  0  7  0  0\n",
"   -8.2870   -1.0620    0.0000 C   0  0  0  0  0  0  0  0  0  8  0  0\n",
"   -9.0015   -1.4745    0.0000 C   0  0  0  0  0  0  0  0  0  9  0  0\n",
"   -9.0015   -2.2995    0.0000 C   0  0  0  0  0  0  0  0  0 10  0  0\n",
"   -8.2870   -2.7120    0.0000 C   0  0  0  0  0  0  0  0  0 11  0  0\n",
"   -7.5726   -2.2995    0.0000 C   0  0  0  0  0  0  0  0  0 12  0  0\n",
"   -7.5726   -1.4745    0.0000 C   0  0  0  0  0  0  0  0  0 13  0  0\n",
"  1  2  1  0  0  0  0\n",
"  2  3  1  0  0  0  0\n",
"  3  4  2  0  0  0  0\n",
"  3  5  1  0  0  0  0\n",
"  5  6  1  0  0  0  0\n",
"  6  7  2  0  0  0  0\n",
"  7  8  1  0  0  0  0\n",
"  8  9  1  0  0  0  0\n",
"  8 13  2  0  0  0  0\n",
"  9 10  2  0  0  0  0\n",
" 10 11  1  0  0  0  0\n",
" 11 12  2  0  0  0  0\n",
" 12 13  1  0  0  0  0\n",
"M  END\n",
"$MOL\n",
"\n",
"  Mrv2102 11182121282D          \n",
"\n",
" 12 11  0  0  0  0            999 V2000\n",
"   -3.7934    0.7703    0.0000 C   0  0  0  0  0  0  0  0  0 14  0  0\n",
"   -3.0790    1.1828    0.0000 C   0  0  0  0  0  0  0  0  0 15  0  0\n",
"   -2.3645    0.7703    0.0000 C   0  0  0  0  0  0  0  0  0 16  0  0\n",
"   -3.7934   -0.0547    0.0000 C   0  0  0  0  0  0  0  0  0 17  0  0\n",
"   -4.5078   -0.4672    0.0000 O   0  0  0  0  0  0  0  0  0 18  0  0\n",
"   -3.0789   -0.4671    0.0000 O   0  0  0  0  0  0  0  0  0 19  0  0\n",
"   -1.6500    1.1828    0.0000 O   0  0  0  0  0  0  0  0  0 20  0  0\n",
"   -2.3645   -0.0547    0.0000 O   0  0  0  0  0  0  0  0  0 21  0  0\n",
"   -3.0788   -1.2922    0.0000 C   0  0  0  0  0  0  0  0  0 22  0  0\n",
"   -1.6500   -0.4672    0.0000 C   0  0  0  0  0  0  0  0  0 23  0  0\n",
"   -2.3644   -1.7046    0.0000 C   0  0  0  0  0  0  0  0  0 24  0  0\n",
"   -1.6500   -1.2922    0.0000 C   0  0  0  0  0  0  0  0  0 25  0  0\n",
"  1  2  2  0  0  0  0\n",
"  1  4  1  0  0  0  0\n",
"  2  3  1  0  0  0  0\n",
"  3  7  2  0  0  0  0\n",
"  3  8  1  0  0  0  0\n",
"  4  5  2  0  0  0  0\n",
"  4  6  1  0  0  0  0\n",
"  6  9  1  0  0  0  0\n",
"  8 10  1  0  0  0  0\n",
"  9 11  1  0  0  0  0\n",
" 10 12  1  0  0  0  0\n",
"M  END\n",
"$MOL\n",
"\n",
"  Mrv2102 11182121282D          \n",
"\n",
" 25 26  0  0  0  0            999 V2000\n",
"    5.1328    0.9532    0.0000 C   0  0  0  0  0  0  0  0  0  5  0  0\n",
"    5.8002    0.4683    0.0000 N   0  0  0  0  0  0  0  0  0  6  0  0\n",
"    5.5453   -0.3163    0.0000 C   0  0  0  0  0  0  0  0  0  7  0  0\n",
"    4.7203   -0.3163    0.0000 C   0  0  0  0  0  0  0  0  0 14  0  0\n",
"    4.4654    0.4683    0.0000 C   0  0  0  0  0  0  0  0  0 15  0  0\n",
"    5.1328    1.7782    0.0000 C   0  0  0  0  0  0  0  0  0  3  0  0\n",
"    3.6807    0.7232    0.0000 C   0  0  0  0  0  0  0  0  0 16  0  0\n",
"    4.2354   -0.9838    0.0000 C   0  0  0  0  0  0  0  0  0 17  0  0\n",
"    6.0302   -0.9838    0.0000 C   0  0  0  0  0  0  0  0  0  8  0  0\n",
"    6.8507   -0.8975    0.0000 C   0  0  0  0  0  0  0  0  0  9  0  0\n",
"    7.3356   -1.5650    0.0000 C   0  0  0  0  0  0  0  0  0 10  0  0\n",
"    7.0001   -2.3187    0.0000 C   0  0  0  0  0  0  0  0  0 11  0  0\n",
"    6.1796   -2.4049    0.0000 C   0  0  0  0  0  0  0  0  0 12  0  0\n",
"    5.6947   -1.7375    0.0000 C   0  0  0  0  0  0  0  0  0 13  0  0\n",
"    3.4149   -0.8975    0.0000 O   0  0  0  0  0  0  0  0  0 18  0  0\n",
"    4.5709   -1.7375    0.0000 O   0  0  0  0  0  0  0  0  0 19  0  0\n",
"    4.0860   -2.4049    0.0000 C   0  0  0  0  0  0  0  0  0 22  0  0\n",
"    3.2655   -2.3187    0.0000 C   0  0  0  0  0  0  0  0  0 24  0  0\n",
"    3.5092    1.5302    0.0000 O   0  0  0  0  0  0  0  0  0 20  0  0\n",
"    3.0676    0.1712    0.0000 O   0  0  0  0  0  0  0  0  0 21  0  0\n",
"    2.2830    0.4261    0.0000 C   0  0  0  0  0  0  0  0  0 23  0  0\n",
"    1.6699   -0.1259    0.0000 C   0  0  0  0  0  0  0  0  0 25  0  0\n",
"    5.8473    2.1907    0.0000 O   0  0  0  0  0  0  0  0  0  4  0  0\n",
"    4.4183    2.1907    0.0000 O   0  0  0  0  0  0  0  0  0  2  0  0\n",
"    4.4183    3.0157    0.0000 C   0  0  0  0  0  0  0  0  0  1  0  0\n",
"  1  2  1  0  0  0  0\n",
"  2  3  1  0  0  0  0\n",
"  3  4  1  0  0  0  0\n",
"  4  5  1  0  0  0  0\n",
"  1  5  1  0  0  0  0\n",
"  1  6  1  0  0  0  0\n",
"  5  7  1  0  0  0  0\n",
"  4  8  1  0  0  0  0\n",
"  3  9  1  0  0  0  0\n",
" 10 11  2  0  0  0  0\n",
" 11 12  1  0  0  0  0\n",
" 12 13  2  0  0  0  0\n",
" 13 14  1  0  0  0  0\n",
"  9 10  1  0  0  0  0\n",
"  9 14  2  0  0  0  0\n",
"  8 15  2  0  0  0  0\n",
"  8 16  1  0  0  0  0\n",
" 16 17  1  0  0  0  0\n",
" 17 18  1  0  0  0  0\n",
"  7 19  2  0  0  0  0\n",
"  7 20  1  0  0  0  0\n",
" 20 21  1  0  0  0  0\n",
" 21 22  1  0  0  0  0\n",
"  6 23  2  0  0  0  0\n",
"  6 24  1  0  0  0  0\n",
" 24 25  1  0  0  0  0\n",
"M  END\n",
"''')"
] }, { "cell_type": "markdown", "id": "8899c183", "metadata": {}, "source": [ "Let's take a look at the reaction:" ] }, { "cell_type": "code", "execution_count": 3, "id": "30457372", "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [
"IPythonConsole.molSize = (600,250)\n",
"IPythonConsole.highlightByReactant = True\n",
"rxn1\n"
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can ask the reaction to tell us which atoms in the reactants are modified in the reaction:" ] }, { "cell_type": "code", "execution_count": 4, "id": "6d3abe01", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "((4, 5, 6), (0, 1))" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [
"rxn1.Initialize()\n",
"rxn1.GetReactingAtoms()"
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The information about which atoms react is enough to figure out which bonds change, but we have to do some additional work for this:" ] }, { "cell_type": "code", "execution_count": 5, "id": "28f25400", "metadata": {}, "outputs": [], "source": [
"from collections import namedtuple\n",
"AtomInfo = namedtuple('AtomInfo',('mapnum','reactant','reactantAtom','product','productAtom'))\n",
"def map_reacting_atoms_to_products(rxn,reactingAtoms):\n",
"    ''' figures out which atoms in the products each mapped atom in the reactants maps to '''\n",
"    res = []\n",
"    for ridx,reacting in enumerate(reactingAtoms):\n",
"        reactant = rxn.GetReactantTemplate(ridx)\n",
"        for raidx in reacting:\n",
"            mapnum = reactant.GetAtomWithIdx(raidx).GetAtomMapNum()\n",
"            foundit=False\n",
"            for pidx,product in enumerate(rxn.GetProducts()):\n",
"                for paidx,patom in enumerate(product.GetAtoms()):\n",
"                    if patom.GetAtomMapNum()==mapnum:\n",
"                        res.append(AtomInfo(mapnum,ridx,raidx,pidx,paidx))\n",
"                        foundit = True\n",
"                        break\n",
"                if foundit:\n",
"                    break\n",
"    return res\n",
"def get_mapped_neighbors(atom):\n",
"    ''' finds all mapped neighbors of a mapped atom; keys are sorted map-number pairs '''\n",
"    res = {}\n",
"    amap = atom.GetAtomMapNum()\n",
"    if not amap:\n",
"        return res\n",
"    for nbr in atom.GetNeighbors():\n",
"        nmap = nbr.GetAtomMapNum()\n",
"        if nmap:\n",
"            if amap>nmap:\n",
"                res[(nmap,amap)] = (atom.GetIdx(),nbr.GetIdx())\n",
"            else:\n",
"                res[(amap,nmap)] = (nbr.GetIdx(),atom.GetIdx())\n",
"    return res\n",
"\n",
"BondInfo = namedtuple('BondInfo',('product','productAtoms','productBond','status'))\n",
"def find_modifications_in_products(rxn):\n",
"    ''' returns a 2-tuple with the modified atoms and bonds from the reaction '''\n",
"    reactingAtoms = rxn.GetReactingAtoms()\n",
"    amap = map_reacting_atoms_to_products(rxn,reactingAtoms)\n",
"    res = []\n",
"    seen = set()\n",
"    # this is all driven from the list of reacting atoms:\n",
"    for _,ridx,raidx,pidx,paidx in amap:\n",
"        reactant = rxn.GetReactantTemplate(ridx)\n",
"        ratom = reactant.GetAtomWithIdx(raidx)\n",
"        product = rxn.GetProductTemplate(pidx)\n",
"        patom = product.GetAtomWithIdx(paidx)\n",
"\n",
"        rnbrs = get_mapped_neighbors(ratom)\n",
"        pnbrs = get_mapped_neighbors(patom)\n",
"        for tpl in pnbrs:\n",
"            pbond = product.GetBondBetweenAtoms(*pnbrs[tpl])\n",
"            if (pidx,pbond.GetIdx()) in seen:\n",
"                continue\n",
"            seen.add((pidx,pbond.GetIdx()))\n",
"            if tpl not in rnbrs:\n",
"                # new bond in product\n",
"                res.append(BondInfo(pidx,pnbrs[tpl],pbond.GetIdx(),'New'))\n",
"            else:\n",
"                # present in both reactants and products, check to see if it changed\n",
"                rbond = reactant.GetBondBetweenAtoms(*rnbrs[tpl])\n",
"                if rbond.GetBondType()!=pbond.GetBondType():\n",
"                    res.append(BondInfo(pidx,pnbrs[tpl],pbond.GetIdx(),'Changed'))\n",
"    return amap,res\n"
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's look at what that function returns for our reaction:" ] }, { "cell_type": "code", "execution_count": 6, "id": "c0dbffb9", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [
"[AtomInfo(mapnum=5, reactant=0, reactantAtom=4, product=0, productAtom=0), AtomInfo(mapnum=6, reactant=0, reactantAtom=5, product=0, productAtom=1), AtomInfo(mapnum=7, reactant=0, reactantAtom=6, product=0, productAtom=2), AtomInfo(mapnum=14, reactant=1, reactantAtom=0, product=0, productAtom=3), AtomInfo(mapnum=15, reactant=1, reactantAtom=1, product=0, productAtom=4)]\n",
"[BondInfo(product=0, productAtoms=(4, 0), productBond=4, status='New'), BondInfo(product=0, productAtoms=(2, 1), productBond=1, status='Changed'), BondInfo(product=0, productAtoms=(3, 2), productBond=2, status='New'), BondInfo(product=0, productAtoms=(4, 3), productBond=3, status='Changed')]\n"
] } ], "source": [
"atms,bnds = find_modifications_in_products(rxn1)\n",
"print(atms)\n",
"print(bnds)\n"
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, define the function which we'll use to draw the product molecule with highlights shown for bonds and atoms involved in the reaction:" ] }, { "cell_type": "code", "execution_count": 7, "id": "cc197a20", "metadata": {}, "outputs": [], "source": [
"from IPython.display import Image\n",
"def draw_product_with_modified_bonds(rxn,atms,bnds,productIdx=None,showAtomMaps=False):\n",
"    if productIdx is None:\n",
"        pcnts = [x.GetNumAtoms() for x in rxn.GetProducts()]\n",
"        largestProduct = list(sorted(zip(pcnts,range(len(pcnts))),reverse=True))[0][1]\n",
"        productIdx = largestProduct\n",
"    d2d = Draw.rdMolDraw2D.MolDraw2DCairo(350,300)\n",
"    pmol = Chem.Mol(rxn.GetProductTemplate(productIdx))\n",
"    Chem.SanitizeMol(pmol)\n",
"    if not showAtomMaps:\n",
"        for atom in pmol.GetAtoms():\n",
"            atom.SetAtomMapNum(0)\n",
"    bonds_to_highlight=[]\n",
"    highlight_bond_colors={}\n",
"    atoms_seen = set()\n",
"    for binfo in bnds:\n",
"        if binfo.product==productIdx and binfo.status=='New':\n",
"            bonds_to_highlight.append(binfo.productBond)\n",
"            atoms_seen.update(binfo.productAtoms)\n",
"            highlight_bond_colors[binfo.productBond] = (1,.4,.4)\n",
"        if binfo.product==productIdx and binfo.status=='Changed':\n",
"            bonds_to_highlight.append(binfo.productBond)\n",
"            atoms_seen.update(binfo.productAtoms)\n",
"            highlight_bond_colors[binfo.productBond] = (.4,.4,1)\n",
"    atoms_to_highlight=set()\n",
"    for ainfo in atms:\n",
"        if ainfo.product != productIdx or ainfo.productAtom in atoms_seen:\n",
"            continue\n",
"        atoms_to_highlight.add(ainfo.productAtom)\n",
"\n",
"    d2d.drawOptions().useBWAtomPalette()\n",
"    d2d.drawOptions().continuousHighlight=False\n",
"    d2d.drawOptions().highlightBondWidthMultiplier = 24\n",
"    d2d.drawOptions().setHighlightColour((.9,.9,0))\n",
"    d2d.drawOptions().fillHighlights=False\n",
"    atoms_to_highlight.update(atoms_seen)\n",
"    d2d.DrawMolecule(pmol,highlightAtoms=atoms_to_highlight,highlightBonds=bonds_to_highlight,\n",
"                     highlightBondColors=highlight_bond_colors)\n",
"    d2d.FinishDrawing()\n",
"    return d2d.GetDrawingText()\n"
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can draw the highlighted product molecule:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Image(draw_product_with_modified_bonds(rxn1,atms,bnds))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's look at some other reactions. I will use the SI data from " ] }, { "cell_type": "code", "execution_count": 9, "id": "3a9d8374", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"<thead>\n",
"<tr><th></th><th>rxn_Class</th><th>patentID</th><th>rxnSmiles_Mapping_NameRxn</th><th>reactantSet_NameRxn</th><th>NameRxn_Mapping_Complete</th><th>rxnSmiles_Mapping_IndigoTK</th><th>reactantSet_IndigoTK</th><th>IndigoTK_Mapping_Complete</th><th>rxnSmiles_IndigoAutoMapperKNIME</th><th>reactantSet_IndigoAutoMapperKNIME</th><th>IndigoAutoMapperKNIME_Mapping_Complete</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>6</td><td>US05849732</td><td>C.CCCCCC.CO.O=C(OCc1ccccc1)[NH:1][CH2:2][CH2:3...</td><td>set([3, 4])</td><td>True</td><td>C(OC([NH:11][CH2:12][CH2:13][CH2:14][CH2:15][C...</td><td>set([0, 2])</td><td>True</td><td>C.CCCCCC.CO.[CH3:10][O:11][C:12]([C@@H:14]([NH...</td><td>set([3, 4])</td><td>True</td></tr>\n",
"<tr><th>1</th><td>2</td><td>US20120114765A1</td><td>O[C:1](=[O:2])[c:3]1[cH:4][c:5]([N+:6](=[O:7])...</td><td>set([0, 1])</td><td>True</td><td>[Cl:1][c:2]1[cH:3][n:4][cH:5][c:6]([Cl:20])[c:...</td><td>set([0, 1])</td><td>True</td><td>[NH2:1][c:2]1[c:11]2[c:6]([cH:7][n:8][cH:9][cH...</td><td>set([0, 1])</td><td>True</td></tr>\n",
"<tr><th>2</th><td>1</td><td>US08003648B2</td><td>Cl.O=[CH:1][c:2]1[cH:3][cH:4][c:5](-[c:6]2[n:7...</td><td>set([1, 3])</td><td>True</td><td>[CH2:1]([NH:3][CH2:4][CH3:5])[CH3:2].C([BH3-])...</td><td>set([0, 3])</td><td>True</td><td>[CH3:1][CH2:2][NH:3][CH2:4][CH3:5].[CH3:6][c:7...</td><td>set([0, 1])</td><td>True</td></tr>\n",
"<tr><th>3</th><td>1</td><td>US09045475B2</td><td>CC(=O)O[BH-](OC(C)=O)OC(C)=O.ClCCl.O=[C:1]([CH...</td><td>set([2, 3])</td><td>True</td><td>[nH:1]1[c:5]2[n:6][cH:7][c:8]([O:10][c:11]3[cH...</td><td>set([0, 3])</td><td>True</td><td>CC(O[BH-](OC(=O)C)OC(=O)C)=O.[CH3:14][C:15]1([...</td><td>set([1, 3])</td><td>True</td></tr>\n",
"<tr><th>4</th><td>2</td><td>US08188098B2</td><td>CCN(C(C)C)C(C)C.ClCCl.Cl[C:1](=[O:2])[O:3][CH:...</td><td>set([2, 5])</td><td>True</td><td>Cl[C:2]([O:4][CH:5]1[CH2:9][CH2:8][CH2:7][CH2:...</td><td>set([0, 2])</td><td>True</td><td>CCN(C(C)C)C(C)C.[CH3:10][CH2:11][O:12][c:13]1[...</td><td>set([1, 4])</td><td>True</td></tr>\n",
"</tbody>\n",
"</table>\n", "
" ], "text/plain": [ " rxn_Class patentID \\\n", "0 6 US05849732 \n", "1 2 US20120114765A1 \n", "2 1 US08003648B2 \n", "3 1 US09045475B2 \n", "4 2 US08188098B2 \n", "\n", " rxnSmiles_Mapping_NameRxn reactantSet_NameRxn \\\n", "0 C.CCCCCC.CO.O=C(OCc1ccccc1)[NH:1][CH2:2][CH2:3... set([3, 4]) \n", "1 O[C:1](=[O:2])[c:3]1[cH:4][c:5]([N+:6](=[O:7])... set([0, 1]) \n", "2 Cl.O=[CH:1][c:2]1[cH:3][cH:4][c:5](-[c:6]2[n:7... set([1, 3]) \n", "3 CC(=O)O[BH-](OC(C)=O)OC(C)=O.ClCCl.O=[C:1]([CH... set([2, 3]) \n", "4 CCN(C(C)C)C(C)C.ClCCl.Cl[C:1](=[O:2])[O:3][CH:... set([2, 5]) \n", "\n", " NameRxn_Mapping_Complete \\\n", "0 True \n", "1 True \n", "2 True \n", "3 True \n", "4 True \n", "\n", " rxnSmiles_Mapping_IndigoTK reactantSet_IndigoTK \\\n", "0 C(OC([NH:11][CH2:12][CH2:13][CH2:14][CH2:15][C... set([0, 2]) \n", "1 [Cl:1][c:2]1[cH:3][n:4][cH:5][c:6]([Cl:20])[c:... set([0, 1]) \n", "2 [CH2:1]([NH:3][CH2:4][CH3:5])[CH3:2].C([BH3-])... set([0, 3]) \n", "3 [nH:1]1[c:5]2[n:6][cH:7][c:8]([O:10][c:11]3[cH... set([0, 3]) \n", "4 Cl[C:2]([O:4][CH:5]1[CH2:9][CH2:8][CH2:7][CH2:... set([0, 2]) \n", "\n", " IndigoTK_Mapping_Complete \\\n", "0 True \n", "1 True \n", "2 True \n", "3 True \n", "4 True \n", "\n", " rxnSmiles_IndigoAutoMapperKNIME \\\n", "0 C.CCCCCC.CO.[CH3:10][O:11][C:12]([C@@H:14]([NH... \n", "1 [NH2:1][c:2]1[c:11]2[c:6]([cH:7][n:8][cH:9][cH... \n", "2 [CH3:1][CH2:2][NH:3][CH2:4][CH3:5].[CH3:6][c:7... \n", "3 CC(O[BH-](OC(=O)C)OC(=O)C)=O.[CH3:14][C:15]1([... \n", "4 CCN(C(C)C)C(C)C.[CH3:10][CH2:11][O:12][c:13]1[... 
\n", "\n", " reactantSet_IndigoAutoMapperKNIME IndigoAutoMapperKNIME_Mapping_Complete \n", "0 set([3, 4]) True \n", "1 set([0, 1]) True \n", "2 set([0, 1]) True \n", "3 set([1, 3]) True \n", "4 set([1, 4]) True " ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "df = pd.read_csv('../data/reaction_data_ci6b00564/dataSetB.csv')\n", "df.head()" ] }, { "cell_type": "code", "execution_count": 11, "id": "c3e97304", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[AtomInfo(mapnum=1, reactant=3, reactantAtom=1, product=0, productAtom=0), AtomInfo(mapnum=7, reactant=4, reactantAtom=0, product=0, productAtom=6)]\n", "[BondInfo(product=0, productAtoms=(6, 0), productBond=5, status='New')]\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "rxnclass = 1\n", "class_smis = df[df['rxn_Class']==rxnclass].rxnSmiles_Mapping_NameRxn.to_list()\n", "rxn = rdChemReactions.ReactionFromSmarts(class_smis[3],useSmiles=True)\n", "rxn.Initialize()\n", "atms,bnds = find_modifications_in_products(rxn)\n", "print(atms)\n", "print(bnds)\n", "Image(draw_product_with_modified_bonds(rxn,atms,bnds))" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[AtomInfo(mapnum=1, reactant=1, reactantAtom=1, product=0, productAtom=0), AtomInfo(mapnum=2, reactant=1, reactantAtom=2, product=0, productAtom=9), AtomInfo(mapnum=16, reactant=2, reactantAtom=0, product=0, productAtom=18), AtomInfo(mapnum=17, reactant=2, reactantAtom=1, product=0, productAtom=16), AtomInfo(mapnum=19, reactant=2, reactantAtom=3, product=0, productAtom=15)]\n", "[BondInfo(product=0, productAtoms=(9, 0), productBond=8, status='Changed'), BondInfo(product=0, productAtoms=(18, 0), productBond=18, status='New'), BondInfo(product=0, productAtoms=(15, 9), 
productBond=14, status='New'), BondInfo(product=0, productAtoms=(16, 18), productBond=17, status='Changed'), BondInfo(product=0, productAtoms=(15, 16), productBond=15, status='Changed')]\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "rxnclass = 4\n", "class_smis = df[df['rxn_Class']==rxnclass].rxnSmiles_Mapping_NameRxn.to_list()\n", "rxn = rdChemReactions.ReactionFromSmarts(class_smis[1],useSmiles=True)\n", "rxn.Initialize()\n", "atms,bnds = find_modifications_in_products(rxn)\n", "print(atms)\n", "print(bnds)\n", "Image(draw_product_with_modified_bonds(rxn,atms,bnds))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Look at an example where there are no changed bonds in the products but where there is a changed atom:" ] }, { "cell_type": "code", "execution_count": 13, "id": "4b06fb79", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[AtomInfo(mapnum=1, reactant=1, reactantAtom=1, product=0, productAtom=0)]\n", "[]\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ],
"source": [ "# pick a reaction class and collect the mapped reaction SMILES for it\n", "rxnclass = 7\n", "class_smis = df[df['rxn_Class']==rxnclass].rxnSmiles_Mapping_NameRxn.to_list()\n", "# build the reaction from the second entry in that class\n", "rxn = rdChemReactions.ReactionFromSmarts(class_smis[1],useSmiles=True)\n", "rxn.Initialize()\n", "# find the atoms and bonds which were modified in the products\n", "atms,bnds = find_modifications_in_products(rxn)\n", "print(atms)\n", "print(bnds)\n", "# draw the product with the modified bonds highlighted\n", "Image(draw_product_with_modified_bonds(rxn,atms,bnds))" ] }, { "cell_type": "code", "execution_count": null, "id": "7a94bd23", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.4" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 5 }