{ "cells": [ { "cell_type": "markdown", "id": "4477b206-34bf-4bae-a780-47ba8c3e9251", "metadata": {}, "source": [ "The RDKit's [Contrib](https://github.com/rdkit/rdkit/tree/master/Contrib) directory includes implementations of two descriptors, `SA_Score` and `NP_Score`, which are not included in the core RDKit because they both use large data files. Still, the descriptors can be useful and questions about how to use them come up every once in a while, so here's a really short blog post showing how to use them.\n", "\n", "Both descriptors are implemented in Python, so they're currently only accessible from Python. We're working on making them available in KNIME too (in fact this blog post was prompted by a question Alice at KNIME asked me as she was working on the node), but that's going to take a bit longer.\n", "\n", "If you want to learn more about the descriptors themselves, the publications describing for those two descriptors are:\n", "\n", "1. SA_Score: http://www.jcheminf.com/content/1/1/8 (this one is open access)\n", "2. NP_Score: http://pubs.acs.org/doi/abs/10.1021/ci700286x (this one is not open access)\n", "\n" ] }, { "cell_type": "code", "execution_count": 1, "id": "3584082d-620d-4d60-9337-ac0b0f574688", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2023.09.2\n" ] } ], "source": [ "from rdkit import Chem\n", "from rdkit.Chem.Draw import IPythonConsole\n", "from rdkit.Chem import rdDepictor\n", "rdDepictor.SetPreferCoordGen(True)\n", "import rdkit\n", "print(rdkit.__version__)" ] }, { "cell_type": "markdown", "id": "eb47435f-4f44-45fa-993c-6835c2bfbe07", "metadata": {}, "source": [ "We'll use a compound from one of the papers in J. Med. Chem.'s ASAP section as I was writing this post (https://pubs.acs.org/doi/10.1021/acs.jmedchem.3c01626). \n", "\n", "> Mini rant: it was easy for me to get the structure of this compound since J. Med. Chem. suggests that authors provide SMILES strings for the compounds in their papers and compliance is pretty good. Too bad this isn't true in computational/cheminformatics journals.\n" ] }, { "cell_type": "code", "execution_count": 2, "id": "2b35641b-616e-4801-bd7a-95059df2c6d2", "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAcIAAACWCAIAAADCEh9HAAAABmJLR0QA/wD/AP+gvaeTAAAgAElEQVR4nO3deVxU5f4H8O8M+76oiCwSiGigxqKQuS+VKfZLE810XFK5lolmJrnkeF1HbQFLr5iWqFeLW1aY3byoqRi4gOKCCgKKLIKyDzvM+f7+eHAYERU4c2YG/b5f93VfeeZwvs+BM595znOeOUeEiEAIIaStxNpuACGEtG8Uo4QQwgvFKCGE8EIxSgghvFCMEkIILxSjhBDCC8UoIYTwQjFKCCG8UIwSQggvFKOEEMILxSghhPBCMUoIIbxQjBJCCC8Uo4QQwgvFKCGE8EIxSgghvFCMEkIILxSjhBDCC8UoIYTwQjFKCCG8UIy2E2VlcP8+KBTabgchpCmKUZ3366/g5QVWVmBnB3Z2sGQJ1NZqu02EkEYUo7rt0CEICoJJkyAvDyoqIDIS9u2DmTO13SxCSCMRPadep/XpA97esGdP45I//oAxY+DyZejdW3vNIoQ0ot6oDrt3D65cgUmTHlo4ejRYWcHx41pqEyGkKYpRHZaXBwDg5NR0ubMz5OZqvjmEkGZRjOowY2MAALm86XK5HExMNN8cQkizKEZ1mKsrmJrC1asPLSwshJwc8PLSUpsIIU1RjOowAwOYNAm+/BJKShoXrlkDNjYwZoz2mkUIeQhdqddtBQUwbBhUVMDkyWBjAzExEBsLP/8Mb7yh7ZYRQhpQjOq86mrYswfi4qCmBjw8YMYMcHXVdpsIIY3opF7nxcRAYiK8/TZ8/TXk5cHOndpuECHkIRSjOi8xEXbsgMREKCuDHTvghx+03SBCyEMoRgkhhBeKUUII4YViVOeJRA/9B10SJETHUIwSQggvFKOEEMILxSghhPBCMUoIIbzoa7sB5Cl+tbSM9/UNMDMbKBZ/4etr3rHjZ9puEiFEFfVGdd1luXzThQuXyssrETdduPBdaqq2W0QIeQjFKCGE8EIn9YRoFIe4MC1NdYmfhcV0e3tttYfwRzHabohEIgCgO3K1dxxAXFnZNHt7V/Z0AwBHIyPtNonwRDFKiBb0t7TsZ2Gh7VYQ9aCxUUII4YVitHXKysrGjh07cuTImJgYTdb9+eefg4KCNFnxwoULffv2HTJkyPnz5zVZV/P++uuvvn37enh4HDt2TGNFt2Rnz7t5k/3vbm2txuoSQSBpsejoaFdXVwAwNDQEgMDAwPT0dEErKhSKcePGqf69TExMbt26JWjRiooKmUxmZmYGAHp6eiKRKCgoKDMzU9CiWpGRkfH2228DgFgsBgB9ff0PP/ywsLBQ0KJ1HOeXkBCenf1bQQH7X1l9vaAVidAoRlskKSlp6NChLMjc3d1Hjx7NUsbY2Hj58uXl5eVCFD169GivXr1Y0QEDBgwZMsTY2BgATE1NV61aVVFRIUTR6OjoF154gRXt16/fe++9x4qamZlJpdKqqiohimpeRUWFVCpV/j5DQkLGjx+vr68PADY2NjKZrKamRu1FC+vqvsnOrlQo/BISzpWVqX37RFsoRp+isLAwJCRET08PAGxtbcPCwurr6xExJycnODiY9WIcHBwiIiIUCoW6it65c0cikbAsc3Z2joyMZMuzs7MlEgm7ZO/o6BgZGclxnLqKJiYmDho0iBX19fU9derUkxvTTnEcFxUV1bVrVwBo0tG+cePG6NGj2Z726NHj8OHD6ipax3H78/OHXLzol5Cw++5ditFnDMXoY9XV1UVERHTs2JGd7gUHB9+/f7/JOvHx8f7+/uyNFxAQcO7cHZ5F5XJcvXqrkZERAJibm2/YsKG6urrJOmfPnn355ZdZUX9//7i4OJ5FCwoKlB8VHTp0YB8ViYmJ77//vjKmjx492rt3b1Z0+PDhly83bVW7cO5c3SuvDFL+6uLj49lyhUIxZcqUmJgYRIyJifH09GTrjBw5Mjk5mWfRs6WlQcnJfgkJfgkJ76emplZUUIw+YyhGm6d6Qj1ixIgrV648bk1l76Zr127GxtVBQXj7dlsqchxGRWHXrjhgQNpThyM5jouMjLS3t2ddKolEcvfu3TYUra2tDQsLs7KyAgADA4OQkJCSkpK8vLzZs2ezjvbevXuVKysUisjISDs7u1deeVcsRokE8/PbsqdakZuLwcEoFuOgQXu6dOnS5Oxh79697G/91ltvpaWlNftraUPRzOrqhTdvsgAdd+XKqZISRKznuKDk5KvCDAQRraAYbSo1NVV5Tbx79+5RUVEt+Sm5XL5x41VjYwRAMzNcswYrK1tRND4e/f0RAAGwf388d+5SS36qvLxcKpWyrmsbxi4f7Xa1JD6KioqWLs03MEAAtLHBr77C2tpW7KnmVVfjhg1oYYEAaGSEq1ZVyuXyJuuwHbe0tFTd8WY76S0sWlpZ+WVWVkBiol9CwtCLF/+dl1envuEXomsoRhs9mkqPnlA/WVYWSiQoEiEAOjlhZCSy984ff+Bffz205qFDyK635+Q09JIA0MEBIyKwtUOsN2/eVOa+u7t7S3I/JSVlzJgx7Ec8PDx+//13bOXJbEoKjhnTkPseHnjoECJiTQ1GRODp042rVVZiRAQWFLRuj1qloABlMlQd2MjPR5kM2RW46Gjs1q2hnYGBmJb2pE3dv3+/2cEN5ZCxj4/PyZMnn9we1mfv0qXL6KiofgkJn926VajjnzOEN4pRxAeHfufOndncF4lEkpeX1+atnTiB3t4Nb92AADxzBrt1Q319vHixcR1HR9y5E8PC0NISAdDQEENCsLS07btw7Ngx1bHLS5ea788WFxeHhoayCVvW1tbsknSzqdoSMTHo5dWwpyNH4pkzCIC2to0n+/n5CIBJSW3fr6e6fh0B0NkZlV3MpCQEwNhYHDmyoW29euHRoy3dYLOX2lQnMAQGBmZkZDT7sydPnvT29marTZ8/P0WY2RRE11CMNr1io7zswEd9PUZEYKdOCIDTpmG3bujqiv37N/Y0HR1x40Y0NEQAHD8eH/OubB12TaxTp07KD4N79+4pX1WObCpfzc/PbzZVW1W0thbDwtDaGgEwLAwBsEcPnDat4VWNxainJy5a1LCExei2bQ3DDmFhWFfX6s02yc1bt25VVlbKZDILCwsAMDExCQ0NLVO5TJSVlaWcROHk5KTeSRRExz3XMSro/CFELC7GTz7BnBzs1g3Dw9HODrdvb3jJ0RF378avv8Zjx9RYEBGxqKhImYzKKZB//fXXSy+9xEJh6NChSUlJzaZqm4vm56NUikVFCIBRUWhkhMePNyzXTIweOoQGBg39fRaj+fm4aRPymUrPvoZgbm6umpuqxwyb6CaXy5WrmZqahoaGPjr2Sp5tz2+MSiQSNt1a0NnsTLdu+P33uHMn2tg0nPCyGBXO1atXR44cyXKT9U8BwNXV9aeffkLEEydOKFN1yJAhSWrKudLShtD85BPs2ROrqzUXo/n5OGMGBgSgQtEQoyodcV6a/az9+++/+/bty36B7MKUSCSaPHlyVlaWeqqSduU5jdFr166x69EjRowQ+ruV+CBGFQp8+WWcOhVR+BhlYmJiXF1dXVxcDA0N2XV8Qc89lTEql6OTE65Zo9EYzc9HGxv817/UHKPM2bNn+/fvz3KzX79+cXFxbKKbjY1N165dPTw8YmNj1VmPtCvPaYxev34dAFxcXDRTjsUoIl6+jAYGeOqUhmIUESMjIwFg/PjxiBgbG6v8ZueaNWsqWzUnqwWUMYqIUVFoYoLnzmkuRhFx61a0tcVjx9Qfo4ioUCh27tzJ5uqKxeIjR44g4tSpUwHge/bXJc+r5/oOTyYmJhqu2Ls3zJ0LixeDxm6+bGBgAABsFpe/v7+Li0tQUFBycvKKFSsE3f2gIBg0CFasEK5CM/7xD3Bzg3XrBNm4WCyeNWtWWlqaVCrt06fPsGHDAIDNjmK9e/LcEjxGx40bJxKJTE1N5XK50LXahTVr4M4dyM3VQmlDQ8Pz589HRUW5uLhooNzWrXDqlAbqNNLTg+3b4eRJAUuYmZmtWrXq/Pnz7POJEBA6RouKik6cOAEAVVVVO3bsELRWq2jxgRxWVrB5s+bKNdlTCw3ecd3dHT75RGPVGvj5wZw5gldhFycJYYSN0eXLl5eUlPj6+gLA6tWr8/LyBC2ns379FcaOhfp6KC4GuRymTIHERAgM1Haz1K20FE6cADc3KC2FjAwoLIQVKyA9HV58UcCi3bpBejp06ABFRZCRASUlsHEjnDgBlZUCFlWiB2QREDRGk5OTd+7cqa+vHxkZOXr06LKyspUrVwpXTpf16gUdOkBcHNjaQmAgiETg6wsdOmi7Weo2bhwMHQqpqbB7N3TrBmvWgKEhuLmBoaGARQ0MwM0N9PRg82bo1g22b4fsbBg6FB7c8Y4QwQkYox999FF9ff28efN69eoVHh5uZGS0a9euhIQE4SoSQojmCRWjBw8ejImJsbW1/eyzzwDA3d39ww8/5DhuwYIFunAGJBLpW1u7mprSw8EJIXwJEqO1tbWffvopAKxdu7bDg3NXqVRqb28fFxcXFRUlRNFW6lZSklFRcULbzRDcczh4x2YfPU97TLRMkBj94osvbt686enpOUfloqmFhcXq1asBYPHixRUVFULU1XHPz9v7edrT5+5TijxK/TGan58vk8kA4KuvvmoyL2TWrFl9+/bNzs7+/PPP1V6XPOeen+wmukb9McpuhDNu3LjXXnutaTGxODw8XCQSbdy4MTMzU+2lCSFE89QcoxcuXNi7d6+hoeHGjRubXeGVV16ZOHFiVVXV0qVL1Vu6VZ6fnosmzzp79vx50KB9YnG2Bmo9gVic07//xB49PtJuM8jzQ50xiogLFizgOG7RokXdu3d/3Gqff/65mZnZgQMHTmn4q4Lapq9/a9Cgfd26/aLthgglJUUWGytRKPLMzU/17/+Vjc1vmqxubX2gf/+J5ua/IMrj4/9z48afGihKY6ME1Buj+/fvP336dOfOnZ/c03Ryclq8eDEALFy4kOM4NTZAxykUObGxkrS0L7XdEMFVVFyKj19UVHRMk0VLS6/Gx/9HLr+uyaKEgBpjtKqqatmyZQCwYcMGdiPbJwgNDXVxcbl48eL333+vrgYQQohWqC1GZTLZnTt3fHx8pk+f/tSVTUxMNmzYAADLli0rLS1VVxtarnNniIiA1as1X5kQ8qxRT4yyOUwikSg8PFwsbtE2J0+ePHjw4Hv37q0T6PaQj60L3t7AcRAcDO+8AwAwfjxs26aJ0loZR9NkUWUtGjEkzxX1xOjixYsrKyvfeecd5ZNpWyIsLIxNgUpNTVVLM1oiNxeuXoVlyx5aUlyssfrkWaD8hKAPDAJqiVH2/U7leXrL+fj4zJw5s7a2ll1x0ph33oFdu+DsWU3W1BVTp06NiIhQKBRCbNzJycnNzc1Q0Bs6tYChoaGbm5uTk5MQG6+rqwsPDx8wYEB9fb0Q2yftEd8Yra+vnzlzJiKGhoba2dkVt9KSJUssLS0PHTr055+amJ7C9OwJc+fCnDlQV6exmlrDhp7Z/586derf//733Llz/fz82O201evgwYPp6enKZ45qGOsSIqKbm1t6enpMTIzaSxw6dMjLy2vhwoXx8fHsiNVKP7SkpCQ9PV3zdclj8XyW0w8//KCWZvTq1evgwYM8G9MSgwfjmjVYVIR2dvjll4iIAQG4dq3gdcvLy4cMGWJgYODr6yt4MRXLly8HAD09vd9//x0Ro6Oj3dzc2O88MDAwPT1d7RVTU1P79+9vb2/v6Oh46tQptW+/WefPn3dzc+vcubOrq2tcXJzat5+SkjJmzBj2e/Pw8GC/zJiYmM6dOzs6Ovbo0ePMmTNqL/oojuO+/fZb9gCoF1544fLlyxooqlRa+r+MjKkpKcMzMt4tKvpJk6V1HN8Y5ThOefZkYmJi0ybKCVIjRowQ+shgMYqIu3ejhQXm5goeoxzH7d69u0uXLmwfv2ThrSllZWXKX++ECRNu3bpVU1MTFhbGFhoaGoaEhJSWlqqlVklJyccff8xO6s3MzABAJBK9++67gj69PTs7e+rUqWyMkj0iRSwWz5gxIzc3Vy3bLy6uWrBgAXvyko2NTVhYWG1t7bVr115//XX2WzU3N2cfVMHBwfnsCaXCOH36dN++fVU7H/r6+h9++GFhYaFwRZWKi39LTNTLzl5SWLj/7l1Zbq7wXY/2QycesKxQKCIjI+3s7Nh7QCKRCHE43riBpaWNMcpxOGgQzpjREKM3bmBRkdpr4vnz51955RV20Pft2/eXX35Rf42nKSkpkclkLGKUuZmTkxMcHMymVXTp0oUNmLa5BPsLdu7cWfkXvHXrlkwmYxFjamoaGhoql8vVuFOIyD4PVPcrLy9PKpUqHyItlUqrqqravH2FAiMjsXNnzssrQHlYFhcXh4aGso8Ka2trmUyWn58fGhrKnr1qbm4ulUqrq6vVuJuImJ2dLZFI2EeFo6Pjrl27cnNzQ0JC2K1/bGxsZDJZTU2Neos2kZ4elJ4+QdASrZWXl7do0aK9e/dG8XD9+nX+LdGJGGWKioqaHKDqOjKKizE0FI2McMmSxhhFxCtX0MgIO3bENWuwb1+0tcWwMKyrU0tNVM0pBwcHnjklUHvOnz8/YMAAlvJ+fn6xsbFt2PLZs2dffvllthF/f//4+HjlS1lZWarv/8jISI7j1LI7TxidyMzMlEgk7CVnZ+fIyMg2bP+vv/CllxAAAXDKlEuXLl168od9ampqUFAQK9q9e/eoqCg17CRiZWWl8iPQxMSE3fcHEUtLS+vr669fv/7GG2+woj169Dh8+LBaijYrPX3i1auedXWa6Pm2RE5OjvIMj48NGzbwb4wOxSjTZBDq0KFDfLamUOCOHWhnhwCop4fz5z8Uo4j48ccIgEuX4rBhDe+Z3r3x2DFeu/BoL4kd+rqgSe/49OnTHMcpH7ksEomCgoJu377dwq016SU9LiVPnjzp4+PDir733rbERF67cOkSzp59nm2tT58+x48fb3a1o0eP9urVi602e3bSpUst3X5WFkokKBIhADo5IQvh48dx8uQwtrVhw4ZdeszmYmJilEVHjBhx5cqVtuzhA9HR0a6ursqPioyMDOVLEomkZ8+ef/zxByv64oOnBo4cOTI5OZlP0cepqEhMSuqYlGSbmTmvouKCECVaZcaMGQBgamr65ptvBvGglhNEnYtRJiYmxsvLS3lkXL16tQ0bOXMGAwIawjEgANk1gJUr8eefG9cpK8N3321YEh2Nbm4N6wcGYlpaW1qugWs4PLHc7Nq1q2puVlRUSKVSExMTaNk5OOslsXN21kt68vqsK+fl9ZKFRalIhBIJ3r3b6pYXFmJICOrro74++vpO27ZtW319/RPWr6ur27p1a0DAe/r6KBajRIJPHSvatAmNjREAzcxw7VqsqmpIVQA0MSnv12/gTz895dJKXV1dREREx44dAUBfXz84OPj+/fut3dOLFy8OHjyYHUXe3t4nT55UfVUul3fr1o29Om7cuLS0tNra2rCwMCsrKwAwMDAICQkpKSlpbdHmKOTy01lZH9XV5SNiXV3hvXvf3LjxSkIC5OSsUMf22ygxMVEsFhsaGqampmqxGUo6GqOI2OTIaNXhmJnJTZzYEIguLvjjjy0tWlWF69ejuTkCoJERhoZiWVlLz8RVLzu8+OKLf/75Z0uragPLTTaMaGpqKpVKKysrW3gOHh0d/cILLyg/Km7dutXCoiUl3OLFaGiIAGhpiZs2YQuHberqMCICO3ZEANTXx+BgbHk0sSEdVtTaGmUyfMLQ5TffoEiEQUF4+zZWVKBU2pCqpqYolWLLB1oLCwuVY5e2trZhYWF1LRstYj/IrsWzH2z2o4K9O9h1QpabpaWlBQUFyp/t0KHD4362BRRyeeydOyGXLjkmJEBCAty/H6H6cl7epsREfW2d4HMcN3DgQAD49NNPtdKAR+lujDKqR8YTjiqliooKmUzm6ztELFaYmmJoKLbhwkZuLgYHo54e2ttnd+3q8tSiLTz0dZBqbjo5ObHcVD0H37Ztm+r6Fy5cUH5RzcfHp23zmW7exKCghg85d3dko4jffIOvvoqZmY2rffllwwyKY8ewd++G9YcPx7ZN5UhJwTFjGjbi4YFsrOif/8RRox5K5GXLMDQUOQ6jotDFBQEaU7UNrl+/PmrUKPbrUp6DPw5LRmtra2UyFhcXP3n7ubm5wcHB7MDr2LEjO/ASExNV/0ZNerJPwHG1paV/3r49JympE0vPhAS4csUtK+uTysqHTgfLy+MTEqCmJvNxmxLUvn37AKBz587qmmHCn67HKNOSjh7Hcfv373d2dmbnqosWnbxzh1fRc+dQIvmn6jDio+uo5QxO606cOOHt7c32NCAg4MyZM+wc/KWXXlIeqerr6TQ4cgQ9PRty7dVX8f33EQD/7/8aV5gzBydOxLfeagzc337jUxAR8dAh9PBo2OCbb+Lbb6NIhDNnNq4wdix+8AH279+wjr8/qlwwa6Mm4zxpzY0W8RnFSkhIYL0zAPD19WWfbU3OGFTHVZuorq6Ojo6ePn36jh3Dlel59WrPnJzlqmOgeXmbi4qiqqtvVlQkpqQMv3r1RUT1XC1slcrKSjYe9d1332m++uO0jxhlnjDsqPoJrDyS1FX0cSewqhcx+F9P0K5HZyzl5eWxM3rBxt0aTtU7dcKxY3H+fBw2DC0t8ddfG16dMwcnTcLZs9HMDKXSJ52Jt0ptLYaFobU1zpuHEybg2LFobIzKHtvYsbhgAY4bh126YEQEqmtuhepcXeU5OHupySX+Nl9TffRAfdxVfqaqqio6OloikbC/LAC8+ebLV6965uRIKyoSHt1+Xt4X1671vXjRMimpQ1ra/1VX30TEqqqUtrW2zdgD2319fbU776WJ9hSj+MhF8A8++CA5OVm9vaRHsYEC1cspR48eVR767u7u6prdonVyuVwqlSqnQC5YsCAiIkK1lyTEVeCCAszKwvnzceJE3LgRnZ0bxmFYjN6/35aLUU+Vn49FRThhAoaE4JIl2KNHQ0yzGL17F4WYW8HOwZVzdbds2bJy5Uo1Tjh99ECVy+WqsykcHBy2bNly4MCBoKAg9v0Ixs/Pb/369a29XJOXtzkxUb+wcC+fNrdKZiYOHbrf3t6x2VNDLWpnMcpkZ2ePHj2aHQEsQEUi0bx58wQdK8nMzJw0aRI7HBlLS8vNmzcLPe1Z827cuKGcc6Yc1/vvf/8raFEWozU12KMHLl6M+CBGBcVitKwMHRxw3TrEBzEqKOU0W3YBSiwWz5kzR43fN2l2FpryG1BsMgbj6ekplUpTUtrYo7x/f2dCAly4YK6xPumkSQiAEonOvePaZYwi4tGjR9mBqKenxz7PBf3GoVJsbKyHh4dIJPLz88vLy9NARW2JjIy0tLQ0MzObPXt2bW2t0OVYjCJiTAwaGOCVK5qLUUT84Qc0McHbtzURo4jIcdzmzZsBwMrK6sIFQeZgqn4nol+/fnFxcQqF4tNPPwUAa2vr8PBwtbxfMjLeTUiAa9f8OE7waPv7bxSJGv5Mukb9D1jWpCFDhtTX13fq1EljFQcOHJiSksJxXEJCAhtJfFZNmzattLS0vLz822+/Zd8o14yRI2H8ePhIs4/1nDQJBgyAJUs0VE4kEo0dOxYA7O3tfXx85HJ5cXGxeu+85+/vHxcXFxkZaW9vf/78+Z07d4rFYvaVpz59+oSEhKjlRoJdu24zMnKtrEzMyVnBf2tPwHGwYAEgQmgouLgIWqot2neMkmfSl1/C2bNw9KhGi37zDfz2GyQlabQo89prr9na2iYkJKh3syKRaNq0aSkpKUuXLhXoGRN6elaurj+KRAb5+Z+Xlh4WogTz3XeQkABOTqDZWxO3VHuNUbrr+DPMwQFWrYJbtzRatEcPWLwYsrI0WlQDLC0t169fb29vL9D2zcz6dekiBcDbt9+rq8sTooRcDitXAgBs3gwqF8Z0SHuNUfKMcXaGB5PZAABCQiAwEB58oVwobm7g7Nz4z2XLYPhwEOau+U09S/2ALl2WWlgMr6mxX7z4IyH2aM0auHsX+veHSZPUvm310Nd2AwgBABg7FlQHYPX1YetWEPqhsRs3PvRPU1M4dkzYitolWHaLHR33DRvmm5t7uWvXvh9//LF6t+7nB127wpYtoDJNRrdQb5TohNDQpqEWEQHBwVpqDWklM7MuO3d+JxKJli5depbfY84WL4b580E16uPi4F//gofvWK1bKEYJ0bJn4wT/jTfemD9/fl1d3ZQpU8rKytq8nf/9D775Br7/vnHJf/8LmZlqaKFw2muMDtDTq3V2/t3KCgDS7O1rnZ3t2/lRSJ4rz0Z0NrFp0yYfH5/09PT58+fz2c7w4fDJJ3D/vrraJbj2GqNGHGeQlWVSUgIARvn5BllZNMrb3hUXw5Urjf9rR+8iwhgZGUVFRVlYWOzZs2fv3r1t3s7bb0Pv3pqbxssfhQ/RFYcPw+nTjf8sL4cHN34h7cDFixd9fHzc3d3DwsJmzZr14YcfSqVSUXNXhXr3/uPKlR7NbuSzzxr+IywM+vWDGTNgyBDhmqw2FKNEV0ydCjt2NP5z+XI4flx7rXkWCTeS8Msvv4wfPz4kJCQ8PNzX11dfX9/IyOjWY6b+OjiIMzKa345yboa3N3zwAXzwgXa+ENFa7TZG2afcszW0RJ5P7X2cNDs7e86cOQDQvXv3ysrKd999t76+PjAwcMWK5r8hKhLZPW5fO3SAXbsa/nv1aoiKgm3bBGmzerXbGFVFkUram/YenUocx02fPr2wsPCNN96YN29ecHDw9evXvby8tm7dqno3qTawsoLPP4eQEDA0VFdjhdJeLzERQnTB2rVrjx8/7ujouGfPnoMHD+7cudPY2Hj//v08M5SZMgW8vSFPkK+YqtMz0Rsl7d+yZWBk9NCSqVPhwYNjiI76++/4NWvWiMXiPXv2VFVVBQcHA8BXX33Vp0+fNm/T0vKhI+Hrr2HIkKbHhjMgWnYAAAdOSURBVK6hGCU6ISCg6ZIXX4QHT19/xmnsBF+9hUpK4B//8AoImDRkiMvgwUOHDh1cVFQ0bty4uXPn8tms6mwNAPD0hAsXYOVKGDUKHBx4NVg47TZGBw6EoiK4fx8uXoRjx6BDB3jwSBlCdN8zMDYaHAzJyZYvv7xv1Spu9Wpxff1iLy/5t99+q/ZCn3wCP/4IiLB7t9q3rR7tdmz0/HkYOhR69oRXX4Xu3eHNNyElRdttIuR58e238J//gLk57NkD8fHi9eshIeGtbduSOnTooPZa69eDsTHs2QP8vqwvoPYZo2lp8Npr4O8PxcVQUAC5uWBpCSNGQGGhtltGiM5JTk5W7631b96ERYsAALZvh06dYNo0UChg+XIYPFiQWzC5ucFHHwEiLFyoo/Nx2meMfvEFODnB9u0NJ/JdusCBA1BTAwKcUBAitPfff18mk7kI8HCMoqKiBQsWeHt7R0REqGubNTUwaRKUl8OMGTBlCsydC5mZ4O8Pj5kkqh7LloGDA5w5A/v2CVilzdpnjJ48CWPHgp5e4xLWGz15UnttIqQVCgoKACA/P5/juClTpoSGhqrl4UhK9fX1W7ZscXd337JlCwDcu3cPAEpLS1nRPB5ziHJyQC4HDw/4+mvYvh1+/BGsreHHH0HQ53WZm8P69QAAS5YAj7tHCUY7T9LjqWNH/OqrpgsXLkQ/P220hpBWS09PZw9Y7tOnT1xcnHo3fuzYsd69e7M3+IgRIy5fvlxXVxcREWFra8vGLsVi8YABA2Qy2c2bN9uw/bIyTE3F5GQ0NUUA3L9fvc1vHsdhQAAC4PLlmijXKu0zRl1ccOXKpgunT8fBg7XRGkLaYunSpebm5izUZs6cmZuby3+bN2/eDAoKYgHq7u4eFRWFiEeOHPH09GQLAwICXn/9dWNjY/ZPkUjk7++/cePGtLS01taKjkYLC5w1i3+rWyo+Hp2dbw0aNDkjI0NzVVugfcbo2LEYGPjQEo7DXr0aHjpOSDtRXl4ulUpZqJmZmUml0qqqKj6bMjIyUm6qurq62VRFxLKysgMHDkyYMMHU1FR5Yjp58vHVqzE5uZmNx8ejTIYFBY1LYmNx3z5MS8Py8ra1t42mT58JAOPHj9do1adpnzH6668oFuPx441LvvsO9fTw4kXttYmQNkpLS1OGnbOzc2RkZKt+nOM49jx61ruUSCR3795tNlUf/dmqqqro6GiJRGJlZeXuXg2AAOjmhiEhGBuLHNew2vr1CPBQx1MqxYED277LbXb37l1LS0sAOHLkiBbKP0b7jFFEXLQIDQxw4kRctgwDA1FfH8PDtd0mQtru+PHjyu9QDhs2LCkpqSU/de7cuf79+7Of6tevX1xcXLOp+tTt1NTUHD6M772HHTogC1OWp4sXY3w8rluHrq5oYoKxsQ3raytGEXHdunUA4OnpWVdXp50WPKLdxiginj2Lq1bh++/junV47Zq2W0MIXwqFIjIy0s7Ojg2YSiSS/Pz8x62ck5MjkUjYt6EcHR0jIyM5jns0VVvbhvp6jI3FkBB0cGgI086dce1aHDgQP/0UvbywthZRqzFaU1PTvXt3ANi6dat2WvCI9hyjhDyLioqKQkNDDQ0NAcDa2lomk9XU1KiuUFNTs3r1ajMzMwAwMTH57LPPysvLm01VPs1QKPDUKVy4EFeuxPXrceBAlMvRwQE3bEDUaowi4s8//wwAtra2BarjtdojQt38WgAhz7fU1NRFixYdPnwYADw8PL744ovAwED2Esdx/v7+iYmJgYGBW7Zssbe337Jly7p16+RyuYmJSUhIyPLlyy0sLNTYmA0b4I8/IDYWfvwR3nsPkpNh9244dgxiY9VYpHVef/31//3vf56ennPmzFG9UPZk1dXVVVVVqksGDBgwcOBAvq3Rdo4TQh4rJibGy8uLvVVHjhx59epVtvzMmTMnTpxAxOjoaFdXV7ZCYGCgQDOBWG+UGTUKJ0zQcm8UES9fvmxjY8M3/gCkUin/xrTbOzwR8hwYOXLkxYsXt23bJpVKjx496uPjM3PmzHXr1gUEBFy/fn3UqFFHjhwBAG9v7/Dw8MGDB2ugSWFh8NJLUFysgVJP0rt376ioqIULF/r4+LS8N2psbNzkftJq6IoC0Ek9Ie3AvXv3VqxYsWvXLo7jrKysTE1N7927p1AoOnXqtHbt2lmzZumpfjda3ZQn9czy5bB+PQwcqM2Tep3SPr9TT8hzxs7ObseOHYmJiYMHDy4tLWVzmIKDg69duxYcHCxohj5qxQp4MJBAAKg3Ski7s3Tp0oKCgnHjxo0ePVozFUtKoLLyoZvPFxZCTY3u3o5ewyhGCSGEFzqpJ4QQXihGCSGEF4pRQgjhhWKUEEJ4oRglhBBeKEYJIYQXilFCCOGFYpQQQnihGCWEEF4oRgkhhBeKUUII4YVilBBCeKEYJYQQXihGCSGEF4pRQgjhhWKUEEJ4oRglhBBeKEYJIYQXilFCCOGFYpQQQnj5fxyHuFQEpQJtAAACxnpUWHRyZGtpdFBLTCByZGtpdCAyMDIzLjA5LjIAAHicdZJfSFNRHMfPPXf37v+925rOOZ135v4IIWEQBIudPZRoPdRDgQXtSImXkqIQxIdElB4ijKikYgVREEgWKT00SnfMWP/UIHowISPMsAjN6A+kQfee4xy4vHD4fc73fH//4M4P3ZkC2ieD3FepnSrtdHBmoGiRE1FEC7xBRKoeec7IBD77YAT0Aa44c4bljJUSIJuxbABYixBmY07P+nVdy1/lM7F3uKaea7zGzHlL5c2av+X/Zuby7g6UnWFVByvgAAdVCHnMG1RoEBRBVESjCo0mbDKr0GxRLFbFagM2O7BLQJIV2QEcTuB0AXldBEpu4C4AlsIINHuwp0gr4MXeYhUW+xRfiQpLSnGpX4X+MlymqFAJKIFyUL4eBCpARRAoIRwK43BEc3O4SMQFVuBygIgPh/3YLWrjiVyx16DNbDSZPUWiaLNL7gIr73S4xJJSfzjiE8r84ZDi+cVpG678KL11Nejs5EhMv9z+KKOZTBfliul7sbt9TN8cMMSEK0y/uccQs8c/RXU+dnhgKGP4PqTzeEcynXg+SXUhmUwHjzB/Eglkbher8743SrZtZfqAt4Vcq2T6RLqHNMpMb2zrJ1U/H1MWq8bIqQ+dlNvCY8Te8IDWfzMxSy48+0H7bqyFw4nPfVRv3WkZ3mc7GGO6a7hu/y2qb389S4LVg2mda5v7ieRkuXWpHpJ4O0I9Y6EWMn4xgHR+cjVK7LtfUf1QSCDSDbbjeU9/TNq0QDmzIKNEaoF6LjfUoEzPN6rfP1qDTgQf0V7umXYUtS5RPh1MonO/H1JuupRCW45PU249MIpOXm+l7MNfkbOZcXf9Etr7d3RQ56cbpPif+peUxcYlxHd+oZ6uaik+3sR47kwKTe0opzMs+pMoPjVP+d2LdjTZvUjnLPwH8bbPlpFdjnkAAAMkelRYdE1PTCByZGtpdCAyMDIzLjA5LjIAAHicfVZJbhsxELzrFfyACfbC7WhbThAEloHYyR9yz/+RLopDjgHSktiQxzXFmuqFuji8fl1//v3nxouvl4tz4YtPrdX9kRDC5dXhi3t6+f7j5p4/Hp+OK89vv28f707JabJ77P0Z+/jx9npcIffNiSeSKuzIRwkUsgs+tNe8k92zY89apOEC17jEieHIS6HOxyUvcepudjVSos5X1vtG42s48IU9LhnfQ/CiRVPBN81s9yyQ2RgfyDNpMSQo01piASWAMRmQ9sAKRvahcGjAyLEugRTuyKq5dEozfIkkIMUX0oNT15sTQ6b6lOshUza7IzcP0UdNIdXGKWszSYFMXokOnbxON0Xs3pDgDF8gEziz55xGhkpYZohaioon5SNFtBFa7sikkhqylLR5+JakeuYMa+s5uLe70JZ3q4CtUKbDUR3PlNZVx2iglqZy7L9JEwv2b6kHkvepZx3lFDrnppw4HiUahlApa6HoJHSwROu4hqTN0+c+Eyr13tzMBEaWMGSidMpUN5vXjmxjAdbvkBhzhqyF8x1ZLfFrJJKkvtxHV+MsukailaLPtXbOnNNmd7RSMpNSN6mmzGskBl0GMh/ItNkdOcpWwbHekWT/XiPRSuWEtENhs3vunLk/exHecJbOOZG04azuvblkflLTGTd+aujO56OUds4r9WxSLyUr42Upvdyun46v+4H29Ha7zgMNb57nltqSeTwp1jyF8I7zsCFbaZ4pZCvPg0NtlXk8qK06DwG1RedZTy3QaaZTC0OczW5qYeizGU0tDIk2i6mFoZIgE2EIJShFGFpthFILQ66NSmphKLaRyLiLz6OvMfPQzM1M+0xDGXfZlaGZYSphHJ0GkyLw0GwDyHy24XIaM4rAQzJDsgUekhkWW+AhmeGyBTnPAXWty0/9rggyJAuKwILMMoDNFmRIFthsQYZkgc0WZGiW1DFDs0CzBRmaBZrBPDQLNCOc+0JbGJrxU609+7jQ6jbO4lMoNu9n8WgrjHpKusJko5oOarrvNN1BA53bBX8fvx3t++U/0IH0Tt1arkMAAAGcelRYdFNNSUxFUyByZGtpdCAyMDIzLjA5LjIAAHicTZI5biQxDEWvMmEXUC2Ii7igMZGACduHqLwTpz78kJTcdqZ69bl+zvm8zXlc8Lrwuq7b/aLXdXte/IqPefv7cTzHTE0+p8wpx5zjuC4+QkH/4oWf8OfrZs1sOJ333ro7Our50B8Gvfd4BSITLeQivJCZUiBoRtj9fNhvAhhEMmzsMMUMG03dtUSqYiHiZmwrjMU4CDU3XBoX70UABLbGbZNBlbpIpMaG7CGKUdBHjwQPiAZoNcAKNUpvA8QJS2bQFc9H/Ca2qL6EKCmMWghsWkLRyldsiJ5QDIJg64Y9ycDhizirLc2gJNQMeGtY3yQ0+CbcRH1npsoz2mDZUQRJpDHAzozjTfrqEccaRhuqsK5ZrIYuln3nApMFsgaMe7zMwcWESZKZxebjGO7+S9ZpVcjO+LuCrG1l/7Z15CtdbWJ3N3Zs7qt/x5K9LYGwBMoSWL5B/MrRLdeT3uaRQHkLy39aCuoA62hguR+Vyq28LN3u52Xt87N1NjAKHV//AScWrNx3H/o0AAAAAElFTkSuQmCC", "text/plain": [ "" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m = Chem.MolFromSmiles('FC1=CN=C(NC2=NC=C(C(N3CCN(C(C4CC4)=O)CC3)=O)C=C2)N=C1C5=CC=C6N=C(N(CC)CC)SC6=C5')\n", "m" ] }, { "cell_type": "markdown", "id": "43e3183b-cbac-40af-9400-df60e1f281cd", "metadata": {}, "source": [ "# Using the RDKit installed from conda-forge\n", "\n", "The Contrib directory is installed with the RDKit conda package, you just need to tell python to look in the appropriate place:" ] }, { "cell_type": "code", "execution_count": 3, "id": "3f93ea6e-9aef-4b78-b2d3-0facb2922cbc", "metadata": {}, "outputs": [], "source": [ "import sys\n", "import os\n", "sys.path.append(os.path.join(os.environ['CONDA_PREFIX'],'share','RDKit','Contrib'))\n", "\n", "from SA_Score import sascorer\n", "from NP_Score import npscorer" ] }, { "cell_type": "markdown", "id": "bc45599d-9821-43d9-bf92-ab658ee7bed1", "metadata": {}, "source": [ "# Using the RDKit installed from pypi\n", "\n", "When Chris (@kuelumbus on GitHub) packages the RDKit for pypi, he copies the Contrib directory into the rest of the Python code, so you do:\n", "\n", "```\n", "from rdkit.Contrib.SA_Score import sascorer\n", "from rdkit.Contrib.NP_Score import npscorer\n", "```" ] }, { "cell_type": "markdown", "id": "c2ca986c-e94c-4d4d-bd48-e99ca6717899", "metadata": {}, "source": [ "# Calculating the scores\n", "\n", "This is the same regardless of how you have the RDKit installed:" ] }, { "cell_type": "code", "execution_count": 4, "id": "532d7847-7d51-4458-a009-9abecb9e8fb4", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2.8716389090191434" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sascorer.calculateScore(m)" ] }, { "cell_type": "markdown", "id": "cb999625-0504-4999-8215-6c88bdeaecdf", "metadata": {}, "source": [ "The SA_Score ranges from 1 to 10 with 1 being easy to make and 10 being hard to make." ] }, { "cell_type": "code", "execution_count": 5, "id": "3cb1f499-a021-4e0e-9737-be898babc028", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "reading NP model ...\n", "model in\n" ] }, { "data": { "text/plain": [ "-1.960519718438019" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fscore = npscorer.readNPModel()\n", "npscorer.scoreMol(m,fscore)" ] }, { "cell_type": "markdown", "id": "b4c1c5f7-fa0c-4e37-a783-cacacdfc7792", "metadata": {}, "source": [ "The NP score ranges from -5 to 5, so this is pretty low.\n", "\n", "You can also get a confidence value:" ] }, { "cell_type": "code", "execution_count": 6, "id": "7f3ea417-47e4-4eab-a41e-9bbe9f2db3ea", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "NPLikeness(nplikeness=-1.960519718438019, confidence=1.0)" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "npscorer.scoreMolWConfidence(m,fscore)" ] }, { "cell_type": "markdown", "id": "2d84d483-bbf7-4f28-9ee1-98a5983acc59", "metadata": {}, "source": [ "That's it for this post. As I said in the intro, it's a short one!" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.6" } }, "nbformat": 4, "nbformat_minor": 5 }