{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "A [question asked on Mastodon](https://aus.social/@polymerreaction/109543412170217264) made me realize that we don't have a tutorial anywhere on descriptor calculation. Here's a first pass at one. This will eventually end up in the RDKit documentation." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Start by doing the usual imports:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "ExecuteTime": { "end_time": "2022-12-20T05:18:53.601332Z", "start_time": "2022-12-20T05:18:53.477333Z" } }, "outputs": [ { "data": { "text/plain": [ "'2022.09.1'" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from rdkit import Chem\n", "import rdkit\n", "rdkit.__version__" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A test molecule:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "ExecuteTime": { "end_time": "2022-12-20T05:18:54.571085Z", "start_time": "2022-12-20T05:18:54.561585Z" } }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "doravirine = Chem.MolFromSmiles('Cn1c(n[nH]c1=O)Cn2ccc(c(c2=O)Oc3cc(cc(c3)Cl)C#N)C(F)(F)F')\n", "doravirine" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `Descriptors` module has a list of the available descriptors. 
The list is made of (name, function) 2-tuples:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "ExecuteTime": { "end_time": "2022-12-20T05:18:55.586164Z", "start_time": "2022-12-20T05:18:55.574240Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "208\n", "[('MaxEStateIndex', ), ('MinEStateIndex', ), ('MaxAbsEStateIndex', ), ('MinAbsEStateIndex', ), ('qed', )]\n" ] } ], "source": [ "from rdkit.Chem import Descriptors\n", "print(len(Descriptors._descList))\n", "print(Descriptors._descList[:5])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can use those functions to directly calculate the corresponding descriptor. So, for example, the value of `MaxEStateIndex` for doravirine is:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "ExecuteTime": { "end_time": "2022-12-20T05:19:00.014047Z", "start_time": "2022-12-20T05:19:00.001327Z" } }, "outputs": [ { "data": { "text/plain": [ "13.412553309006833" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Descriptors._descList[0][1](doravirine)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As an aside, if we just want a few named descriptors, it's a lot clearer (and easier to write the code!) if we call the individual descriptor functions directly:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "ExecuteTime": { "end_time": "2022-12-20T05:19:01.156995Z", "start_time": "2022-12-20T05:19:01.145963Z" } }, "outputs": [ { "data": { "text/plain": [ "13.412553309006833" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Descriptors.MaxEStateIndex(doravirine)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Often we want to calculate all the descriptors. 
As of the 2022.09 release of the RDKit there's no real convenience function for descriptor calculation, so let's create one:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "ExecuteTime": { "end_time": "2022-12-20T05:19:02.305280Z", "start_time": "2022-12-20T05:19:02.302703Z" } }, "outputs": [], "source": [ "def getMolDescriptors(mol, missingVal=None):\n", "    ''' calculate the full list of descriptors for a molecule\n", "    \n", "        missingVal is used if the descriptor cannot be calculated\n", "    '''\n", "    res = {}\n", "    for nm,fn in Descriptors._descList:\n", "        # some of the descriptor functions can throw errors if they fail, catch those here:\n", "        try:\n", "            val = fn(mol)\n", "        except Exception:\n", "            # print the error message:\n", "            import traceback\n", "            traceback.print_exc()\n", "            # and set the descriptor value to whatever missingVal is\n", "            val = missingVal\n", "        res[nm] = val\n", "    return res\n", "    " ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "ExecuteTime": { "end_time": "2022-12-20T05:19:03.356076Z", "start_time": "2022-12-20T05:19:03.319709Z" } }, "outputs": [ { "data": { "text/plain": [ "{'MaxEStateIndex': 13.412553309006833,\n", " 'MinEStateIndex': -4.871620672188628,\n", " 'MaxAbsEStateIndex': 13.412553309006833,\n", " 'MinAbsEStateIndex': 0.045220418860841605,\n", " 'qed': 0.6914051268589834,\n", " 'MolWt': 425.754,\n", " 'HeavyAtomMolWt': 414.66600000000005,\n", " 'ExactMolWt': 425.050251552,\n", " 'NumValenceElectrons': 150,\n", " 'NumRadicalElectrons': 0,\n", " 'MaxPartialCharge': 0.4197525104273902,\n", " 'MinPartialCharge': -0.45079941098947357,\n", " 'MaxAbsPartialCharge': 0.45079941098947357,\n", " 'MinAbsPartialCharge': 0.4197525104273902,\n", " 'FpDensityMorgan1': 1.3103448275862069,\n", " 'FpDensityMorgan2': 2.0344827586206895,\n", " 'FpDensityMorgan3': 2.6206896551724137,\n", " 'BCUT2D_MWHI': 35.495691906445956,\n", " 'BCUT2D_MWLOW': 10.182401353178228,\n", " 'BCUT2D_CHGHI': 2.363442602497932,\n", " 'BCUT2D_CHGLO': 
-2.1532454345808123,\n", " 'BCUT2D_LOGPHI': 2.362094239067197,\n", " 'BCUT2D_LOGPLOW': -2.2620565247489415,\n", " 'BCUT2D_MRHI': 6.30376236817795,\n", " 'BCUT2D_MRLOW': -0.13831572005086737,\n", " 'BalabanJ': 2.1143058157682066,\n", " 'BertzCT': 1236.821427505276,\n", " 'Chi0': 21.344570503761737,\n", " 'Chi0n': 14.619315272563007,\n", " 'Chi0v': 15.375244218581463,\n", " 'Chi1': 13.595574016164479,\n", " 'Chi1n': 7.8933192308003095,\n", " 'Chi1v': 8.271283703809537,\n", " 'Chi2n': 5.882827756329733,\n", " 'Chi2v': 6.319263536801718,\n", " 'Chi3n': 3.9307609940961763,\n", " 'Chi3v': 4.148978884332168,\n", " 'Chi4n': 2.4772835642835087,\n", " 'Chi4v': 2.7023697348309867,\n", " 'HallKierAlpha': -3.519999999999999,\n", " 'Ipc': 2291995.915536308,\n", " 'Kappa1': 20.220355828454835,\n", " 'Kappa2': 7.4789147435283585,\n", " 'Kappa3': 4.168020338062062,\n", " 'LabuteASA': 164.8909024413842,\n", " 'PEOE_VSA1': 9.303962601591405,\n", " 'PEOE_VSA10': 11.3129633249809,\n", " 'PEOE_VSA11': 5.824404497999927,\n", " 'PEOE_VSA12': 5.749511833283905,\n", " 'PEOE_VSA13': 5.559266895052007,\n", " 'PEOE_VSA14': 11.86604191564695,\n", " 'PEOE_VSA2': 9.361636831863176,\n", " 'PEOE_VSA3': 9.893218992372859,\n", " 'PEOE_VSA4': 23.531818506063985,\n", " 'PEOE_VSA5': 0.0,\n", " 'PEOE_VSA6': 11.600939890232516,\n", " 'PEOE_VSA7': 24.26546827384644,\n", " 'PEOE_VSA8': 18.267148868031594,\n", " 'PEOE_VSA9': 18.177429210401844,\n", " 'SMR_VSA1': 17.908108096824506,\n", " 'SMR_VSA10': 11.600939890232516,\n", " 'SMR_VSA2': 5.261891554738487,\n", " 'SMR_VSA3': 19.331562912184786,\n", " 'SMR_VSA4': 7.04767198267719,\n", " 'SMR_VSA5': 12.72105492335605,\n", " 'SMR_VSA6': 0.0,\n", " 'SMR_VSA7': 73.27433730199388,\n", " 'SMR_VSA8': 0.0,\n", " 'SMR_VSA9': 17.568244979360085,\n", " 'SlogP_VSA1': 15.98587324705553,\n", " 'SlogP_VSA10': 13.171245143024459,\n", " 'SlogP_VSA11': 11.49902366656781,\n", " 'SlogP_VSA12': 11.600939890232516,\n", " 'SlogP_VSA2': 19.331562912184786,\n", " 'SlogP_VSA3': 
19.76872690603324,\n", " 'SlogP_VSA4': 11.33111286753076,\n", " 'SlogP_VSA5': 16.95130748139392,\n", " 'SlogP_VSA6': 40.05138621360316,\n", " 'SlogP_VSA7': 5.022633313741326,\n", " 'SlogP_VSA8': 0.0,\n", " 'SlogP_VSA9': 0.0,\n", " 'TPSA': 105.70000000000002,\n", " 'EState_VSA1': 28.738272135679853,\n", " 'EState_VSA10': 22.760319511168106,\n", " 'EState_VSA11': 0.0,\n", " 'EState_VSA2': 28.704757542634727,\n", " 'EState_VSA3': 6.06636706846161,\n", " 'EState_VSA4': 21.397409935657397,\n", " 'EState_VSA5': 19.18040611960041,\n", " 'EState_VSA6': 6.069221312792274,\n", " 'EState_VSA7': 0.0,\n", " 'EState_VSA8': 10.197363616602075,\n", " 'EState_VSA9': 21.599694398771053,\n", " 'VSA_EState1': 47.48050639865553,\n", " 'VSA_EState10': 5.842061004535676,\n", " 'VSA_EState2': 24.16343117595945,\n", " 'VSA_EState3': 14.921853617262808,\n", " 'VSA_EState4': -2.8980189732872814,\n", " 'VSA_EState5': -1.0781549918202147,\n", " 'VSA_EState6': 6.092225491490601,\n", " 'VSA_EState7': -3.945179835565914,\n", " 'VSA_EState8': -0.2762282865821226,\n", " 'VSA_EState9': 1.3919488437959202,\n", " 'FractionCSP3': 0.17647058823529413,\n", " 'HeavyAtomCount': 29,\n", " 'NHOHCount': 1,\n", " 'NOCount': 8,\n", " 'NumAliphaticCarbocycles': 0,\n", " 'NumAliphaticHeterocycles': 0,\n", " 'NumAliphaticRings': 0,\n", " 'NumAromaticCarbocycles': 1,\n", " 'NumAromaticHeterocycles': 2,\n", " 'NumAromaticRings': 3,\n", " 'NumHAcceptors': 7,\n", " 'NumHDonors': 1,\n", " 'NumHeteroatoms': 12,\n", " 'NumRotatableBonds': 4,\n", " 'NumSaturatedCarbocycles': 0,\n", " 'NumSaturatedHeterocycles': 0,\n", " 'NumSaturatedRings': 0,\n", " 'RingCount': 3,\n", " 'MolLogP': 2.65458,\n", " 'MolMR': 94.87570000000002,\n", " 'fr_Al_COO': 0,\n", " 'fr_Al_OH': 0,\n", " 'fr_Al_OH_noTert': 0,\n", " 'fr_ArN': 0,\n", " 'fr_Ar_COO': 0,\n", " 'fr_Ar_N': 4,\n", " 'fr_Ar_NH': 1,\n", " 'fr_Ar_OH': 0,\n", " 'fr_COO': 0,\n", " 'fr_COO2': 0,\n", " 'fr_C_O': 0,\n", " 'fr_C_O_noCOO': 0,\n", " 'fr_C_S': 0,\n", " 'fr_HOCCN': 0,\n", " 
'fr_Imine': 0,\n", " 'fr_NH0': 4,\n", " 'fr_NH1': 1,\n", " 'fr_NH2': 0,\n", " 'fr_N_O': 0,\n", " 'fr_Ndealkylation1': 0,\n", " 'fr_Ndealkylation2': 0,\n", " 'fr_Nhpyrrole': 1,\n", " 'fr_SH': 0,\n", " 'fr_aldehyde': 0,\n", " 'fr_alkyl_carbamate': 0,\n", " 'fr_alkyl_halide': 3,\n", " 'fr_allylic_oxid': 0,\n", " 'fr_amide': 0,\n", " 'fr_amidine': 0,\n", " 'fr_aniline': 0,\n", " 'fr_aryl_methyl': 0,\n", " 'fr_azide': 0,\n", " 'fr_azo': 0,\n", " 'fr_barbitur': 0,\n", " 'fr_benzene': 1,\n", " 'fr_benzodiazepine': 0,\n", " 'fr_bicyclic': 0,\n", " 'fr_diazo': 0,\n", " 'fr_dihydropyridine': 0,\n", " 'fr_epoxide': 0,\n", " 'fr_ester': 0,\n", " 'fr_ether': 1,\n", " 'fr_furan': 0,\n", " 'fr_guanido': 0,\n", " 'fr_halogen': 4,\n", " 'fr_hdrzine': 0,\n", " 'fr_hdrzone': 0,\n", " 'fr_imidazole': 0,\n", " 'fr_imide': 0,\n", " 'fr_isocyan': 0,\n", " 'fr_isothiocyan': 0,\n", " 'fr_ketone': 0,\n", " 'fr_ketone_Topliss': 0,\n", " 'fr_lactam': 0,\n", " 'fr_lactone': 0,\n", " 'fr_methoxy': 0,\n", " 'fr_morpholine': 0,\n", " 'fr_nitrile': 1,\n", " 'fr_nitro': 0,\n", " 'fr_nitro_arom': 0,\n", " 'fr_nitro_arom_nonortho': 0,\n", " 'fr_nitroso': 0,\n", " 'fr_oxazole': 0,\n", " 'fr_oxime': 0,\n", " 'fr_para_hydroxylation': 0,\n", " 'fr_phenol': 0,\n", " 'fr_phenol_noOrthoHbond': 0,\n", " 'fr_phos_acid': 0,\n", " 'fr_phos_ester': 0,\n", " 'fr_piperdine': 0,\n", " 'fr_piperzine': 0,\n", " 'fr_priamide': 0,\n", " 'fr_prisulfonamd': 0,\n", " 'fr_pyridine': 1,\n", " 'fr_quatN': 0,\n", " 'fr_sulfide': 0,\n", " 'fr_sulfonamd': 0,\n", " 'fr_sulfone': 0,\n", " 'fr_term_acetylene': 0,\n", " 'fr_tetrazole': 0,\n", " 'fr_thiazole': 0,\n", " 'fr_thiocyan': 0,\n", " 'fr_thiophene': 0,\n", " 'fr_unbrch_alkane': 0,\n", " 'fr_urea': 0}" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "getMolDescriptors(doravirine)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Suppose I want to generate the full set of descriptors for a bunch of molecules..." 
] }, { "cell_type": "code", "execution_count": 8, "metadata": { "ExecuteTime": { "end_time": "2022-12-20T05:19:07.446239Z", "start_time": "2022-12-20T05:19:07.335355Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "canonical_smiles molregno activity_id standard_value standard_units\r\n", "N[C@@H]([C@@H]1CC[C@H](CC1)NS(=O)(=O)c2ccc(F)cc2F)C(=O)N3CC[C@H](F)C3 29272 671631 49000 nM\r\n", "N[C@@H](C1CCCCC1)C(=O)N2CCSC2 29758 674222 28000 nM\r\n", "N[C@@H]([C@@H]1CC[C@H](CC1)NC(=O)c2ccc(F)c(F)c2)C(=O)N3CCSC3 29449 675583 5900 nM\r\n", "N[C@@H]([C@@H]1CC[C@H](CC1)NS(=O)(=O)c2ccc(F)cc2F)C(=O)N3CCCC3 29244 675588 35000 nM\r\n", "N[C@@H]([C@@H]1CC[C@H](CC1)NS(=O)(=O)c2ccc(OC(F)(F)F)cc2)C(=O)N3CC[C@@H](F)C3 29265 679299 6000 nM\r\n", "N[C@@H]([C@@H]1CC[C@H](CC1)NS(=O)(=O)c2ccc(F)cc2F)C(=O)N3CC[C@@H](F)C3 29253 679302 52000 nM\r\n", "N[C@@H]([C@@H]1CC[C@H](CC1)NC(=O)c2ccc(F)c(F)c2)C(=O)N3CCCC3 29482 683566 29000 nM\r\n", "N[C@@H]([C@@H]1CC[C@H](CC1)NC(=O)c2ccccc2C(F)(F)F)C(=O)N3CCSC3 29340 685042 39000 nM\r\n", "N[C@@H]([C@@H]1CC[C@H](CC1)NC(=O)OCc2ccccc2)C(=O)N3CC[C@@H](F)C3 29213 685047 43000 nM\r\n" ] } ], "source": [ "!head ../data/herg_data.txt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "We can read in all the molecules using a \"Supplier\" object; there's more about this [in the documentation](https://www.rdkit.org/docs/GettingStartedInPython.html#reading-sets-of-molecules)." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "ExecuteTime": { "end_time": "2022-12-20T05:19:08.934866Z", "start_time": "2022-12-20T05:19:08.767216Z" } }, "outputs": [ { "data": { "text/plain": [ "1090" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "suppl = Chem.SmilesMolSupplier('../data/herg_data.txt')\n", "mols = [m for m in suppl]\n", "len(mols)" ] }, { "cell_type": "markdown", "metadata": { "ExecuteTime": { "end_time": "2022-12-20T04:36:08.368224Z", "start_time": 
"2022-12-20T04:36:08.365600Z" } }, "source": [ "Now calculate the descriptors. This takes a bit (10-20 seconds on my machine) for the ~1100 molecules I read in." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "ExecuteTime": { "end_time": "2022-12-20T05:19:18.220827Z", "start_time": "2022-12-20T05:19:09.896088Z" } }, "outputs": [], "source": [ "allDescrs = [getMolDescriptors(m) for m in mols]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The problem here is that we have a list of dictionaries... that's not useful for most things. Let's convert it to a pandas dataframe:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "ExecuteTime": { "end_time": "2022-12-20T05:19:18.393439Z", "start_time": "2022-12-20T05:19:18.221859Z" } }, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>MaxEStateIndex</th><th>MinEStateIndex</th><th>MaxAbsEStateIndex</th><th>MinAbsEStateIndex</th><th>qed</th><th>MolWt</th><th>HeavyAtomMolWt</th><th>ExactMolWt</th><th>NumValenceElectrons</th><th>NumRadicalElectrons</th><th>...</th><th>fr_sulfide</th><th>fr_sulfonamd</th><th>fr_sulfone</th><th>fr_term_acetylene</th><th>fr_tetrazole</th><th>fr_thiazole</th><th>fr_thiocyan</th><th>fr_thiophene</th><th>fr_unbrch_alkane</th><th>fr_urea</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>13.787943</td><td>-4.120373</td><td>13.787943</td><td>0.074317</td><td>0.759946</td><td>419.469</td><td>395.277</td><td>419.149047</td><td>156</td><td>0</td><td>...</td><td>0</td><td>1</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td></tr>\n",
"    <tr><th>1</th><td>12.032152</td><td>-0.232407</td><td>12.032152</td><td>0.186944</td><td>0.777429</td><td>228.361</td><td>208.201</td><td>228.129634</td><td>86</td><td>0</td><td>...</td><td>1</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td></tr>\n",
"    <tr><th>2</th><td>13.255664</td><td>-1.036185</td><td>13.255664</td><td>0.017845</td><td>0.835147</td><td>383.464</td><td>360.280</td><td>383.147904</td><td>142</td><td>0</td><td>...</td><td>1</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td></tr>\n",
"    <tr><th>3</th><td>13.787093</td><td>-4.072560</td><td>13.787093</td><td>0.015196</td><td>0.786287</td><td>401.479</td><td>376.279</td><td>401.158469</td><td>150</td><td>0</td><td>...</td><td>0</td><td>1</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td></tr>\n",
"    <tr><th>4</th><td>13.326286</td><td>-4.859254</td><td>13.326286</td><td>0.063966</td><td>0.625645</td><td>467.485</td><td>442.285</td><td>467.150190</td><td>174</td><td>0</td><td>...</td><td>0</td><td>1</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"<p>5 rows × 208 columns</p>\n",
"</div>
" ], "text/plain": [ " MaxEStateIndex MinEStateIndex MaxAbsEStateIndex MinAbsEStateIndex \\\n", "0 13.787943 -4.120373 13.787943 0.074317 \n", "1 12.032152 -0.232407 12.032152 0.186944 \n", "2 13.255664 -1.036185 13.255664 0.017845 \n", "3 13.787093 -4.072560 13.787093 0.015196 \n", "4 13.326286 -4.859254 13.326286 0.063966 \n", "\n", " qed MolWt HeavyAtomMolWt ExactMolWt NumValenceElectrons \\\n", "0 0.759946 419.469 395.277 419.149047 156 \n", "1 0.777429 228.361 208.201 228.129634 86 \n", "2 0.835147 383.464 360.280 383.147904 142 \n", "3 0.786287 401.479 376.279 401.158469 150 \n", "4 0.625645 467.485 442.285 467.150190 174 \n", "\n", " NumRadicalElectrons ... fr_sulfide fr_sulfonamd fr_sulfone \\\n", "0 0 ... 0 1 0 \n", "1 0 ... 1 0 0 \n", "2 0 ... 1 0 0 \n", "3 0 ... 0 1 0 \n", "4 0 ... 0 1 0 \n", "\n", " fr_term_acetylene fr_tetrazole fr_thiazole fr_thiocyan fr_thiophene \\\n", "0 0 0 0 0 0 \n", "1 0 0 0 0 0 \n", "2 0 0 0 0 0 \n", "3 0 0 0 0 0 \n", "4 0 0 0 0 0 \n", "\n", " fr_unbrch_alkane fr_urea \n", "0 0 0 \n", "1 0 0 \n", "2 0 0 \n", "3 0 0 \n", "4 0 0 \n", "\n", "[5 rows x 208 columns]" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "df = pd.DataFrame(allDescrs)\n", "df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And now we have something that we could use to build models, filter, etc." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.4" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 4 }