{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Getting Started with Projections" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here, we are showing some basic functionalities for analysing MD trajectories using `Projection` classes. First, let's import HTMD:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Please cite HTMD: Doerr et al.(2016)JCTC,12,1845. \n", "https://dx.doi.org/10.1021/acs.jctc.6b00049\n", "Documentation: http://software.acellera.com/\n", "To update: conda update htmd -c acellera -c psi4\n", "\n", "You are on the latest HTMD version (unpackaged : /home/joao/maindisk/software/repos/Acellera/htmd/htmd).\n", "\n", "Populating the interactive namespace from numpy and matplotlib\n" ] } ], "source": [ "from htmd.ui import *\n", "config(viewer='webgl')\n", "#for inline plotting in Jupyter notebooks\n", "%pylab inline" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Projecting simulations" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" } }, "source": [ "Get the data for this tutorial [here](http://pub.htmd.org/tutorials/projections/data.tar.gz). Alternatively, you can download the data using `wget`:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import os\n", "assert os.system('wget -rcN -np -nH -q --cut-dirs=2 -R index.html* http://pub.htmd.org/tutorials/projections/data/') == 0" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" } }, "source": [ "We will be doing two types of data projections:\n", "* Projecting single Molecule objects\n", "* Projecting lists of simulations" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Projecting single Molecule objects" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "One can load a previously simulated single MD trajectory (in this case, NTL9 protein) into a `Molecule` object and visualize it using:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2018-03-15 23:57:31,506 - htmd.molecule.molecule - INFO - Removed 17 atoms. 631 atoms remaining in the molecule.\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "7fa612a772bd414eae15f79ed23e8421", "version_major": 2, "version_minor": 0 }, "text/plain": [ "A Jupyter Widget" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "mol = Molecule('data/ntl9_structure.pdb')\n", "mol.read('data/ntl9_trajectory.xtc')\n", "mol.filter('protein')\n", "mol.view()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Getting the RMSD along time" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "HTMD has many projection classes. One of those classes is `MetricRmsd`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using the `MetricRmsd` class, one may calculate the RMSD of a `Molecule` object against a reference structure (`refmol`, e.g. the crystal structure) for a given atom selection (`trajrmsdstr`):" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "scrolled": true, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 16.36052132 15.79398251 14.12205887 14.77201653 13.90877914\n", " 12.92834282 11.54159355 10.69192505 10.77845764 12.86611652]\n" ] } ], "source": [ "crystal = Molecule('data/ntl9_crystal.pdb')\n", "metr = MetricRmsd(refmol=crystal, trajrmsdstr='protein and name CA')\n", "proj = metr.project(mol)\n", "print(proj)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The values are in Angstroms. They are quite high values, because the sample trajectory is of the unfolded protein (see the visualization above)." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Plot the RMSD along time" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The previously obtained RMSD values can be easily plotted using matplotlib:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYIAAAEKCAYAAAAfGVI8AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4wLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvFvnyVgAAIABJREFUeJzt3Xd4VHX6/vH3k95IAiQQeui9R5oKdimCLlaaogiirrvW365bbOt+1bVXkCbqIpa1iwVclSI1KL0ngPSElhBKQpLn90eGNWLKEDNzpjyv65qLyeTMOTfnSubOmTmfzxFVxRhjTPAKcTqAMcYYZ1kRGGNMkLMiMMaYIGdFYIwxQc6KwBhjgpwVgTHGBDkrAmOMCXJWBMYYE+SsCIwxJsiFOR3AHUlJSZqamup0DGOM8SvLly/fr6rJlS3nF0WQmppKenq60zGMMcaviMh2d5azt4aMMSbIWREYY0yQsyIwxpggZ0VgjDFBzorAGGOCnBWBMcYEOSsCY4wJcgFdBAu37Gfagq2cLCp2OooxxvisgC6CL9bs5ZHP1jHw+fks2Lzf6TjGGOOTAroIHrm8PZNGdSe/sJiRU5dwy5vp7Dh4zOlYxhjjUwK6CESES9qnMPuuvtx3aWvmbdrPhc/M5enZGzlWUOh0PGOM8QkBXQSnRIWHcvv5Lfjm3n4M6JDCi99s4cKn5/LJyt2oqtPxjDHGUUFRBKfUS4jm+eu68t743tSKjeAPM3/k2lcXs3Z3jtPRjDHGMUFVBKeclVqLT35/Do8N7ciW7DwGv7iAv364moNHC5yOZowxXheURQAQGiIM69GYb+85jxv6pPL2sh2c/9R3vL5wG4V2uqkxJogEbRGckhATzoOD2/PFH8+lQ4N4HvxkLYNeWMDCLXa6qTEmOAR9EZzSqm4N/j2mJxNHdudoQSHDpyzhthnL2XnITjc1xgQ2jxWBiEwTkSwRWXPa43eIyEYRWSsi//LU9qtCROjfIYWv7+7HPRe34tsN2Vz49FyenbOJ4wVFTsczxhiP8OQRwXSgf+kHROR84HKgk6q2B57y4ParLCo8lDsubMl/7+nHJe1TeP6/m7nombnMWrXHTjc1xgQcjxWBqs4DDp728K3A46qa71omy1Pbrw71E6N5cVhX3hnXi/jocG5/6weum7SY9XtynY5mjDHVxtufEbQCzhWRJSIyV0TO8vL2q6Rns9p8dsc5PHpFBzbtO8KgF+bz94/WcMhONzXGBABvF0EYUBPoBdwHvCsiUtaCIjJORNJFJD07O9ubGcsUGiKM7NWEb+89j1G9mjBjyXbOf/o73ly8naJie7vIGOO/vF0EO4EPtMRSoBhIKmtBVZ2kqmmqmpacnOzVkBVJjIng4cs78Pkfz6VtSjx//2gNg16Yz+LMA05HM8aYKvF2EXwEXAAgIq2ACMAvT9hvkxLPW2N7MmFEN46cKOS6SYu5/a0f2HX4uNPRjDHmjHjy9NGZwCKgtYjsFJExwDSgmeuU0reBG9SPT8MREQZ0rMfXd/fjzota8vW6fVz49Hc8//VmTpy0002NMf5B/OF1OC0tTdPT052OUamdh47x2OcbmLV6Dw0So/nboLb075BCOR+DGGOMR4nIclVNq2w5G1lcjRrWjOHlEd2YObYXNaLCuHXGD4yYsoSNe484Hc0YY8plReABvZuXnG76j8vbs3Z3LgNfmM+rczOcjmWMMWWyIvCQsNAQRvVO5bt7z+OCNnX411cb2bzPjgyMMb7HisDDasZG8PjQjsRGhPLIZ+tsigpjjM+xIvCC2nGR3HlRK+Zv3s/X6316Vg1jTBCyIvCSUb2b0KJOHI/OWkd+oZ1aaozxHVYEXhIeGsIDl7Vj+4FjTFuwzek4xhjzP1YEXtS3VTIXta3LS99sJiv3hNNxjDEGsCLwur8NasvJIuWJLzc6HcUYYwArAq9LTYrlpnOa8v4PO1mx47DTcYwxxorACb+/oAXJNSJ56JO1FNsU1sYYh1kROCAuMow/9W/Dih2H+WjFLqfjGGOCnBWBQ4Z2bUDnRok8/sUGjuYXOh3HGBPErAgcEhIiPDi4HVlH8nn52y1OxzHGBDErAgd1a1yToV0bMGX+Vn46cMzpOMaYIGVF4LA/DWhDWKjw6Kx1TkcxxgQpKwKH1Y2P4vbzWzB73T4WbPbLq3YaY/ycFYEPGHNOUxrXiuGRz9ZSWFTsdBxjTJCxIvABUeGh/HVQWzbty2PGkp+cjmOMCTJWBD7iknZ1OadFEs/M2cShowVOxzHGBBErAh8hIjwwuB15+YU8M2eT03GMMUHEisCHtKpbg1G9mjBjyXbW78l1Oo4xJkhYEfiYOy9qSUJ0OI98ape1NMZ4hxWBj0mMieDuS1qzKPMAX67Z63QcY0wQsCLwQcPOakSblBr88/P1nDhpl7U0xniWFYEPCgsN4YHB7dh56DiT52U6HccYE+A8VgQiMk1EskRkTanHHhKRXSKywnUb6Knt+7s+zZMY0CGFV77LYE/OcafjGGMCmCePCKYD/ct4/FlV7eK6fe7B7fu9vwxsS5Eqj3+xwekoxpgA5rEiUNV5wEFPrT8YNKoVwy19m/Hxit2kb7NdaYzxDCc+I/i9iKxyvXVU04Ht+5Vbz2tOSnwUD3+6zi5raYzxCG8XwQSgOdAF2AM8Xd6CIjJORNJFJD07O9tb+XxOTEQY9w9sw+pdOfxn+U6n4xhjApBXi0BV96lqkaoWA5OBHhUsO0lV01Q1LTk52XshfdCQzvXp3qQm//pqA7knTjodxxgTYLxaBCJSr9SXvwPWlLes+ZmI8NDg9hw4WsBL39hlLY0x1cuTp4/OBBYBrUVkp4iMAf4lIqtFZBVwPnCXp7YfaDo2TODq7g157futZGbnOR3HGBNAPHnW0DBVraeq4araUFWnquooVe2oqp1UdYiq7vHU9gPRfZe2ITIslEdnrXc6ijEmgNjIYj+SXCOSP1zYgm82ZPHtxiyn4xhjAoQVgZ8Z3acpTZNi+cdn6ygo9M/LWh4vsPmTjPElVgR+JiIshL9f1pbM7KO8sWib03HOSH5hEU99tZEOD33F3z9aQ5GNizDGJ1gR+KEL2tTlvNbJPP/1Zvbn5Tsdxy1rduUw5MXveenbLXRumMCbi7dz67+X2+yqxvgAKwI/9bdB7Th+suQvbF9WUFjMM3M2cfnL33PoWAHTRqfxwW1n8/CQ9sxZv48RU5bYNZqNcZgVgZ9qUSeO0X1SeSd9B2t25Tgdp0xrd+dw+cvf88J/N3N5l/rMuasfF7SpC8ANfVJ5ZXg3Vu/K4cqJC9lx8JjDaY0JXlYEfuyOC1tSKyaChz9d61OXtTxZVMxzX2/i8pe+Z39ePlOuT+OZa7qQEBP+i+UGdKzHjJt7sv9IPkMnLGTtbt8sNGMCnRWBH0uIDue+S1uzbNshPl3lG0My1u/J5YqXv+e5rzdzWad6zLmrLxe1q1vu8mel1uL9W/sQHiJc++pi5m8O3nmljHGKlPeXpIgMreiJqvqBRxKVIS0tTdPT0721Ob9SVKwMeWkBB48W8M095xEdEepIjpNFxUz8LoMXvtlMQnQ4//xdRy5tn+L28/fmnGD0a0vZkpXHk1d34nddG3owrTHBQUSWq2paZcuFVfC9wRV8TwGvFYEpX2iI8NCQ9lw9cRET5mZw98WtvJ5h494j3PveSlbvymFw5/o8PKQ9tWIjzmgdKQlRvDu+N7e8sZy73lnJ3px8xvdrhoh4KLUx5pRyi0BVb/RmEFN1Z6XWYnDn+rw6N4Nr0hrSsGaMV7ZbWFTMq/Myef7rzdSICmPCiG4M6Fiv8ieWIz4qnOk3ncV9763iiS83sDfnOA8Mbk9oiJWBMZ5U0RHB/4jIIKA9EHXqMVV9xFOhzJm7f0Ab5qzby2Ofb+DlEd08vr3N+0qOAlbuzGFQp3o8MqQ9teMif/N6I8NCee7aLqQkRDFpXib7cvN57rouRIU785aXMcGg0g+LRWQicC1wByDA1UATD+cyZ6h+YjS39mvBrNV7WJx5wGPbKSwqZsJ3GQx6YQE7Dh3n5eHdeHl4t2opgVNCQoS/DGzLA5e146t1exk1dQmHj9lYA2M8xZ2zhvqo6vXAIVV9GOgNNPJsLFMVt/RrRoPEaB7+dJ1Hpm/YkpXHVRMX8cSXG7iwbR1m39WXQZ2q/lZQZW46pykvDuvKyh05XDVxEbsOH/fYtowJZu4UwanfvmMiUh84CTT1XCRTVVHhofxlYFvW78nl7WU/Vdt6i4qVSfMyGPjCfLYdOMoLw7ryyohuJFXjUUB5LutUnzfG9GBf7gmGvvI963bnenybxgQbd4rgMxFJBJ4EfgC2AW97MpSpuoEdU+jZtBZPfbWRnGO//bKWmdl5XD1xIf/3+Qb6tUpm9l19GdK5vlfP5unVrDb/Gd+HEBGueXUR32/Z77VtGxMM3CmCf6nqYVV9n5LPBtoAj3o2lqkqEeHBwe3JOX6S5/67qcrrKSpWpszPZMDz88nIPspz13Zh0qju1KkRVfmTPaB1Sg0+uK0PDRKjGf3aUj5escuRHMYEIneKYNGpO6qar6o5pR8zvqdd/XiG9WjMG4u2s3nfkTN+/tb9R7n21UU8Oms957ZMYs5dfbmiawPHz+mvlxDNu+N7061xTf749gomzcvwqak1jPFX5RaBiKSISHcgWkS6ikg31+08wDsnqpsqu+eS1sRGhPLIZ+vcfrEsLlamLdjKgOfnsWnfEZ6+ujOTr0+jTrwzRwFlSYgO540xPRjUqR7/9/kGHvlsHcV2XQNjfpOKxhFcCowGGgJPU3LqKMAR4C+ejWV+q1qxEdx1cSse/nQdX6/P4uIK5vsB2H7gKPf9ZxVLtx7k/NbJPDa0EykJvlMApUWGhfLidV1JiY9i6oKtZOXm8/Q1nW2sgTFVVO5cQ/9bQORK1+cDjrG5hqrmZFExA5+fT0FRMbPv6ktk2K9fKIuLlTcXb+fxLzYQFiI8MLgdV3Vv6PjbQO6aMj+TR2etp0fTWkwelfarGU6NCWbuzjXkzmcEDUUkXkpMEZEfROSSashoPCw8NIQHBrdj+4FjTFuw7Vff33HwGMOnLObBT9bSo2ktZt/dl6vTGvlNCQDcfG4zXhjWlRU/HeaqiQvZbWMNjDlj7hTBTaqaC1wC1AFuBB73aCpTbc5tmcxFbevy0jebyco9Afx8FHDpc/NYsyuXJ67syPQbz6JeQrTDaatmSOf6TL/pLPbmnGDoKwvZsNfGGhhzJtwpglN/Hg4EXlPVlaUeM37gb4PacrJIeeLLjew4eIyRU5fw94/W0L1JTb66qy/XntXYr44CytKneRLv3dobRbl6wiIWZXhumg1jAo07RbBcRGZTUgRfiUgNoNizsUx1Sk2K5aZzmvL+Dzu59Ll5rNxxmP/7XUfeuKkHDRL98yigLG1S4vngtrNJSYjihmlL+XTlbqcjGeMX3CmCMcCfgbNU9RgQQcnbQ8aP/P6CFqTWjqFb45KjgOE9/f8ooCwNEqP5z/g+dGmUyB0zf2TK/EynIxnj89wpgneBekAugKoeUNVVlT1JRKaJSJaIrCnje/eKiIpI0hknNlUSFxnGt/eex79v7um16xU4JSGmZKzBwI4pPDprPf+wsQbGVMidIpgIDAc2i8jjItLGzXVPB/qf/qCINAIuBqpvVjTjlkA8AihPVHgoLw7rxug+qUxdsJU/vP0j+YVFTscyxidVWgSq+rWqjgC6UTLh3BwRWSgiN4pIuSdtq+o84GAZ33oW+H+UXO7SGI8JDREeHNyO+we04bNVe7hh2lJyjv/2ifiMCTTuHBEgIrUpGWV8M/Aj8DwlxTDnTDYmIkOAXa4zj4zxOBHhln7Nef66LizffohrJi5iT46NNTCmNHeuUPYBMJ+S+YUGq+oQVX1HVe8A4tzdkIjEAH8FHnBz+XEiki4i6dnZ2e5uxpgyXd6lAdNv7MGuw8cZ+spCNlVhMj5jAlWFRSAiIcAKVW2nqo+p6p7S33dn6HIpzSm5oM1KEdlGyRxGP4hISlkLq+okVU1T1bTk5OQz2IwxZTu7RRLv3NKLomLlqgkLWeLBS3oa408qLAJVLQYGVMeGVHW1qtZR1VRVTQV2At1UdW91rN8Yd7Svn8AHt/UhuUYko6Yu5dsNWU5HMsZx7nxGMFtErpQzPOVERGZSct2C1iKyU0TGVCmhMdWsYc0Y3r+1D61S4rjl38uZu8neejTBzZ3ZR48AsUAhcIKS6SVUVeM9H6+EzT5qPOHwsQKGT17Cluw8plyfRt9W9hakCSzVNvuoqtZQ1RBVjVDVeNfXXisBYzwlMSaCGTf3pHlyHGPfSGfBZrsWsglO7pw19DsRSSj1daKIXOHZWMZ4R83YkjJomhTLzW8sY+EWKwMTfNz5jOBB13WKAVDVw8CDnotkjHfVcpVB41ox3PT6Mhbb2UQmyLhTBGUtU9ElLo3xO7XjInlrbC8a1YzhxteW2amlJqi4UwTpIvKMiDQXkWYi8iyw3NPBjPG2JFcZ1E+M4sbpy1i2rawZUowJPO4UwR1AAfAOJTORHgdu92QoY5ySXCOSmWN7kRIfxehpS1m+3crABD53zho6qqp/PjXKV1X/oqpHvRHOGCfUiY9i5rhe1ImP4oZpy/jhp0NORzLGo9yadM6YYFM3PoqZY3tROy6CG6YuZcWOw05HMsZjrAiMKUdKQkkZ1IyNYNTUJazaaWVgApMVgTEVqJ8YzcxxvUiMCWfklCWs2ZVT+ZOM8TPuDChrKCIfiki2iOwTkfdFpKE3whnjCxokRjNzbC9qRIUzwsrABCB3jgheAz6h5LrFDYBPXY8ZEzQa1ozh7XG9iIsMY+TUJazbnet0JGOqjTtFkKyqr6lqoes2HbDZuUzQaVQrhpljexEdHsqIKYtZv8fKwAQGd4pgv4iMFJFQ120kYMMuTVBqXLvkyCAyLJQRU5awca9d6cz4P3eK4CbgGmAvsAe4CrjRk6GM8WVNascyc1wvwkOF4ZMXs9kue2n8nDtF0Mh1neJk1xXGrgAaeTqYMb6saVIsb43tRUiIMGzyErZk5TkdyZgqc6cIXnTzMWOCSvPkOGaO7QXAsMmLyci2MjD+qdwiEJHeInIPkCwid5e6PQSEei2hMT6sRZ04Zo7tiaoybNJiMq0MjB+q6IggAoijZMrpGqVuuZR8TmCMAVrWrcFbY3tRVKwMm7yYbfttKi7jX9y5ZnETVd3upTxlsmsWG3+wYW8uwycvITIshLfH9aJJ7VinI5kgV53XLHa0BIzxF21S4vn3mJ4cP1nEsEmL2XHwmNORjHGLzTVkTDVqVz+eGTf35GhBEddZGRg/YUVgTDVrXz+BGTf35MiJkwybvJidh6wMjG+rsAhE5HwR+UBE1rpu/xGR87yUzRi/1aFBAjNu7kXO8ZIy2H34uNORjClXRaePDgKmUTLJ3HBgBPA5ME1EBnonnjH+q2PDBP49pieHj5aUwZ4cKwPjmyo6IrgPuMI14dxKVV2hqtOAK4A/eSeeMf6tc6NE3hjTgwN5BQyfvIS9OSecjmTMr1RUBCmquvL0B1V1FVC3shWLyDQRyRKRNaUe+4eIrBKRFSIyW0TqVy22Mf6ja+OavH5TD7JyTzB88mKycq0MjG+pqAgqGhXjzoiZ6UD/0x57UlU7qWoX4DPgATfWY4zf696kpAz25p7gusmLyTpiZWB8R0VF0FxEPinj9inQrLIVq+o84OBpj5WewD0WqHg0mzEBJC21FtNv7MHenBMMn7yE7CP5TkcyBqhgZLGI9Kvoiao6t9KVi6QCn6lqh1KP/RO4HsgBzlfV7HKeOw4YB9C4cePu27fbuDYTGBZnHuDG15bRqFY0b43tRVJcpNORTIByd2RxpVNMlFphONAB2KWqWW4+J5XTiqDU9+4HolT1wcrWY1NMmECzMGM/N01fRmrtWGbc3JPaVgbGA37zFBMiMlFE2rvuJwArgTeAH0VkWDVkfAu4shrWY4zf6dM8iak3nMXW/UcZOXUpxwuKnI5kglhFnxGcq6prXfdvBDapakegO/D/qrIxEWlZ6sshwIaqrMeYQHB2iyQmjuzO+j25PP7FeqfjmCBWUREUlLp/MfARgKrudWfFIjITWAS0FpGdIjIGeFxE1ojIKuAS4I9Vi21MYDi/TR1uPDuV1xdtZ+6mMj8uM0Hq8LECRr+2lDW7cjy+rYqK4LCIXCYiXYGzgS8BRCQMiK5sxao6TFXrqWq4qjZU1amqeqWqdnCdQjpYVXdVz3/DGP/1p/5taFknjvveW8mhowWVP8EEhTcXbee7jdmEhYrHt1VREdwC/B54Dbiz1JHAhcAsTwczJlhEhYfy3HVdOHSsgL98uBp3T+AwgevEySKmL9zG+a2TaZMS7/HtlVsEqrpJVfurahdVnV7q8a9U9R6PJzMmiLSvn8DdF7fmizV7ef8HO1AOdu+l7+DA0QLG92vule2FlfcNEXmhoieq6h+qP44xwWtc32Z8uyGLhz5ZS8+mtWhUK8bpSMYBhUXFTJqfSdfGifRoWssr26zoraHxwDnAbiAdWH7azRhTjUJDhKev6QzA3e+uoKjY3iIKRp+v2cuOg8cZ3685Ip7/fAAqLoJ6wCTgUmAUEA58oqqvq+rr3ghnTLBpVCuGh4e0Z9m2Q7w6L8PpOMbLVJWJ32XQPDmWi9tWOrdntanoM4IDqjpRVc8HRgOJwFoRGeWtcMYEo6HdGjCwYwrPztnklVMHje+Yv3k/6/bkckvf5oSEeOdoANy4VKWIdAPuBEYCX2BvCxnjUSLCP6/oSM2YCO58ZwUnTtqo42AxcW4GdeMjubyrd2for2iKiYdFZDlwNzAXSFPVMaq6zmvpjAlSNWMjeOrqzmzJyuPxL2wAfjBYtfMwCzMOcNPZTYkMC/Xqtis6Ivg7kAB0Bh4DfnBdVGa1a2SwMcaD+rZKZnSfVKYv3Mb8zTbqONBNnJtBjagwhvds7PVtl3v6KNDUaymMMWX684A2LNiyn3vfW8lXd/YlMSbC6UjGA7btP8oXa/Yyvl9zakSFe337FX1YvL2sG7CTktNKjTEeFhUeynPXduFAXgF//XCNjToOUJPmZxIeGsKNZ6c6sv2KPiOIF5H7ReQlEblEStwBZALXeC+iMcGtQ4ME7rq4FbNW7+HDH23UcaDJOnKC/yzfyZXdGlKnRpQjGSr6jOBNoDWwGrgZmA1cBVyuqpd7IZsxxmV8v+aclVqTBz5ey46Dx5yOY6rR9O+3cbKomHF9K70CsMdUVATNVHW0qr4KDAPSgMtUdYV3ohljTgkNEZ65pgsA97y70kYdB4gjJ07y5uLtDOiQQtOkWMdyVFQEJ0/dUdUiYKuqHvF8JGNMWRrViuHBwe1Yuu0gk+ZlOh3HVIOZS3/iyIlCr00uV56KiqCziOS6bkeATqfui0iutwIaY352VfeG9G+fwjNzNtqoYz+XX1jE1AVb6dO8Np0aJjqapaKzhkJVNd51q6GqYaXue36CbGPMr4gI/ze0I4kxEdxlo4792sc/7mZfbr7jRwPgxhQTxhjfUis2giev6sTmrDye+NJGHfuj4mJl4rwM2tWL59yWSU7HsSIwxh+d17oO1/duwmvf26hjfzRn/T4ys49yS79mXptquiJWBMb4qfsHtKV5ciz3vreSw8fsWsf+QlWZODeDRrWiGdSxntNxACsCY/xWdEQoz13btWTU8Uc26thfLN16kB9/OszYc5sRFuobL8G+kcIYUyUdGyZw50UtmbVqDx+tsFHH/mDi3AxqxUZwdfdGTkf5HysCY/zc+H7N6d6kJg98tJadh2zUsS/bsDeXbzdmM7pPKtER3p1quiJWBMb4ubDQEJ69pgvFqjbq2MdNmptJTEQo1/du4nSUX7AiMCYANK4dw4ND2rNk60GmzLdRx75o1+HjfLJyN9ed1djnphO3IjAmQFzdvSGXtq/LU7M3sm63Df73NacK+uZzfe9SLx4rAhGZJiJZIrKm1GNPisgG15XOPhQRZ8dVGxNARITHhnYiMSaCO9/50UYd+5BDRwt4e+kOhnSpT/3EaKfj/IonjwimA/1Pe2wO0EFVOwGbgPs9uH1jgk6t2Aj+dVUnNu3L48mvNjodx7i8sWg7x08W+cR0EmXxWBGo6jzg4GmPzVbVQteXi4GGntq+McHq/NZ1GNWrCVMXbOX7LfudjhP0jhcU8fqibVzYpg6t6tZwOk6ZnPyM4Cbgi/K+KSLjRCRdRNKzs20IvTFn4i8D29IsOZZ73l1JzrGTlT/BeMy76Ts4eLSA8ef55tEAOFQEIvJXoBCYUd4yqjpJVdNUNS05Odl74YwJACWjjruwPy+fv328pvInGI8oLCpm8vxMujVOJK1JTafjlMvrRSAiNwCXASPUxsQb4zGdGibyxwtb8unK3Xxso44dMWv1HnYeOs74fs19YnK58ni1CESkP/AnYIiq2hBIYzzs1vOa061xIn/7aA27Dh93Ok5QKZlcLpMWdeK4qG1dp+NUyJOnj84EFgGtRWSniIwBXgJqAHNEZIWITPTU9o0xrlHH13ahuFi5992VFNuoY6+Zuymb9XtyGde3GSEhvns0ABDmqRWr6rAyHp7qqe0ZY8rWpHYsDwxux5/eX83UBVsZ27eZ05GCwsS5GaTER3FFlwZOR6mUjSw2Jghck9aIS9rV5cmvNrJ+j4069rQVOw6zOPMgY85pSkSY77/M+n5CY8xvVjLquCPx0eF2rWMveHVuBvFRYQzr2djpKG6xIjAmSNSOi+RfV3Vkw94jPD3bRh17SmZ2Hl+u3cuo3k2Ii/TYu+/VyorAmCByQZu6jOjZmCkLtrIww0Yde8Lk+ZmEh4Ywuo/vTS5XHisCY4LMXwe1JbV2LPe+u5Kc4zbquDpl5Z7g/eW7uLp7Q5JrRDodx21WBMYEmZiIMJ69tgv7juTzgI06rlbTvt9GYXEx4/zszCwrAmOCUJdGifzhgpZ8vMJGHVeX3BMnmbF4OwM61qNJ7Vin45wRKwJjgtTt5zenq2vU8W4bdfybvbXkJ47kFzK+r+9OLlceKwJjgtSpax0XFZdc69hGHVddfmER0xZs5ewWtenYMMHpOGfMisCYIJaaFMswzRZ5AAAMXklEQVTfL2vHoswDTLZrHVfZhz/sIutIvs9eeKYy/nGSqzHGY647qxHfbsjisS82UKwwvl8zn54p09cUFSuT5mXSvn4857RIcjpOldgRgTFBTkR4cXhXBneuzxNfbuBvH62hsKjY6Vh+Y866vWTuP+rzU01XxI4IjDFEhoXy/LVdaJAYzcS5GezJOcGLw7oS6ycjY52iqkyYm0njWjEM6JDidJwqsyMCYwwAISHCnwe04R9XdOC7jVlcN2kxWUdOOB3Lpy3ZepCVOw4ztm8zwkL99+XUf5MbYzxiVK8mTL4+jS1ZeQx9ZSFbsvKcjuSzJs7NICkugqu7N3Q6ym9iRWCM+ZUL29bl7XG9OHGyiCsnLGTp1oNOR/I56/fk8t3GbEb3SSUqPNTpOL+JFYExpkydGyXy4W1nUzsugpFTlvDpyt1OR/Ipr87NIDYilFG9Up2O8ptZERhjytWoVgzvj+9D50YJ3DHzR16dm4GqDTzbcfAYn67aw7AejUmICXc6zm9mRWCMqVDN2AjeHNOTQZ3q8dgXG3jg47UUBfko5KkLtiLAmHP9Z6rpiti5YcaYSkWFh/LidV1pmBjNq/My2ZNznBeGdSUmIvheQg4eLeDtZT9xeZcG1EuIdjpOtbAjAmOMW0JChPsHtuWRy9vzzYYshk1aTPaRfKdjed3rC7dx4mQx4/v511TTFbEiMMacket7pzJxZHc27jvC0Anfk5EdPKeXHiso5PVF27iobR1a1q3hdJxqY0VgjDljl7RPYebYXhzLLzm9dNm24Di99J1lOzh87KTfTi5XHisCY0yVdG1ckw9u60PNmAhGTFnCrFV7nI7kUSeLipkyfytpTWqSllrL6TjVyorAGFNlTWrH8sGtfejYIIHb3/qByfMyA/b00s9W7WbX4eMBdzQAVgTGmN+oZmwEM27uycCOKfzz8/U8/Om6gDu9VFV5dW4mLevEcUGbOk7HqXYeKwIRmSYiWSKyptRjV4vIWhEpFpE0T23bGONdUeGhvDSsGzef05TpC7dx67+Xc7ygyOlY1ea7Tdls2HuEW/o1JyTEP6earognjwimA/1Pe2wNMBSY58HtGmMcEBIi/O2ydjw4uB1z1u9j2OTF7M8LjNNLJ36XQb2EKIZ0ru90FI/wWBGo6jzg4GmPrVfVjZ7apjHGeTee3ZSJI7uzfk8uQ19ZSKafn17640+HWLL1IGPOaUpEWGC+m+6z/ysRGSci6SKSnp2d7XQcY8wZuLR9CjPH9SIvv5ArJyxk+Xb/Pb104twMEqLDGdajsdNRPMZni0BVJ6lqmqqmJScnOx3HGHOGujWuyQe39iEhOpzhk5fwxWr/O700IzuP2ev2cX3vJgF9tTafLQJjjP9LTYrlg9vOpn39eG576wemLtjqdKQzMmluJhGhIdzQJ9XpKB5lRWCM8ahasRG8NbYXl7ZL4R+frePhT/1j9tJ9uSf48MddXJ3WkKS4SKfjeJQnTx+dCSwCWovIThEZIyK/E5GdQG9gloh85antG2N8R1R4KC+P6MZNZzflte+3cfuMHzhx0rdPL522YCuFxcWMOzfwBpCdzmNveqnqsHK+9aGntmmM8V2hIcIDg9vRoGY0j85ax7DJi5lyfRq1ffCv7ZzjJ5mx5CcGdqxH49oxTsfxOHtryBjjVWPOacorw7uxbncuV05YyLb9R52O9CszlmwnL78wIKeTKEvgfgxujPFZAzrWo058JDe/ns7QCQuZckMa3RrX9Mq2T5wsIvtIPvvz8jmQV8D+vHzXrYDsvHz2H8lnza4czm2ZRIcGCV7J5DTxhwmi0tLSND093ekYxphqtnX/UUa/tpS9OSd4/rqu9O+QcsbrUFXy8gvZn1fAAdeLenZeAfuP5P/vRf7nF/wC8vILy1xPjagwkuMiSYqLpE58JHdd3IrmyXG/9b/oKBFZrqqVTudjRWCMcdSBvHzGvJ7Oyp2HefCydow+uymqyuFjJ//34v3zX+357D/i+vrozy/2+YXFv1qvCNSMiSApLoLasZEk1YgkKS6CpLjIkhf8GiX3a8dFUjs2gqjwUAf+955lRWCM8RvHC4r449s/MnvdPpJrRHLoaAGFZZxiGhoi1IoteQFPiotwvaBH/urFPjkuklqxEYSFBvfHoO4WgX1GYIxxXHREKBNGdufVeRlszT7qekE//cU+ksTo8ICc/dNpVgTGGJ8QGiLcdl4Lp2MEpeA+bjLGGGNFYIwxwc6KwBhjgpwVgTHGBDkrAmOMCXJWBMYYE+SsCIwxJshZERhjTJDziykmRCQb2F7FpycB+6sxjr+z/fEz2xe/ZPvjlwJhfzRR1Uov+u4XRfBbiEi6O3NtBAvbHz+zffFLtj9+KZj2h701ZIwxQc6KwBhjglwwFMEkpwP4GNsfP7N98Uu2P34paPZHwH9GYIwxpmLBcERgjDGmAgFTBCLSX0Q2isgWEflzGd+PFJF3XN9fIiKp3k/pHW7si7tFZJ2IrBKR/4pIEydyektl+6PUcleJiIpIQJ8p4s7+EJFrXD8ja0XkLW9n9BY3flcai8i3IvKj6/dloBM5PU5V/f4GhAIZQDMgAlgJtDttmduAia771wHvOJ3bwX1xPhDjun9roO4Ld/eHa7kawDxgMZDmdG6Hfz5aAj8CNV1f13E6t4P7YhJwq+t+O2Cb07k9cQuUI4IewBZVzVTVAuBt4PLTlrkceN11/z/AhSISiNe8q3RfqOq3qnrM9eVioKGXM3qTOz8bAP8A/gWc8GY4B7izP8YCL6vqIQBVzfJyRm9xZ18oEO+6nwDs9mI+rwmUImgA7Cj19U7XY2Uuo6qFQA5Q2yvpvMudfVHaGOALjyZyVqX7Q0S6Ao1U9TNvBnOIOz8frYBWIvK9iCwWkf5eS+dd7uyLh4CRIrIT+By4wzvRvCtQrllc1l/2p58O5c4ygcDt/6eIjATSgH4eTeSsCveHiIQAzwKjvRXIYe78fIRR8vbQeZQcLc4XkQ6qetjD2bzNnX0xDJiuqk+LSG/gTde+KPZ8PO8JlCOCnUCjUl835NeHcP9bRkTCKDnMO+iVdN7lzr5ARC4C/goMUdV8L2VzQmX7owbQAfhORLYBvYBPAvgDY3d/Vz5W1ZOquhXYSEkxBBp39sUY4F0AVV0ERFEyB1FACZQiWAa0FJGmIhJByYfBn5y2zCfADa77VwHfqOsToABT6b5wvRXyKiUlEKjv/55S4f5Q1RxVTVLVVFVNpeQzkyGqmu5MXI9z53flI0pOKEBEkih5qyjTqym9w5198RNwIYCItKWkCLK9mtILAqIIXO/5/x74ClgPvKuqa0XkEREZ4lpsKlBbRLYAdwPlnkboz9zcF08CccB7IrJCRE7/4Q8Ybu6PoOHm/vgKOCAi64BvgftU9YAziT3HzX1xDzBWRFYCM4HRgfgHpI0sNsaYIBcQRwTGGGOqzorAGGOCnBWBMcYEOSsCY4wJclYExhgT5KwIjDEmyFkRmIAnIrVd4yVWiMheEdlV6uuFHtpmVxGZUsXnvi0igTiS1/goG0dggoqIPATkqepTHt7Oe8CjqrqyCs/tB4xU1bHVn8yYX7MjAhPURCTP9e95IjJXRN4VkU0i8riIjBCRpSKyWkSau5ZLFpH3RWSZ63Z2GeusAXQ6VQIi8pCITBOR70QkU0T+4Ho8VkRmichKEVkjIte6VjEfuMg1J5YxHmc/aMb8rDPQlpLJCDOBKaraQ0T+SMn0w3cCzwPPquoCEWlMyfQEbU9bTxqw5rTH2lAyf08NYKOITAD6A7tVdRCAiCQAqGqxayqUzsDy6v9vGvNLVgTG/GyZqu4BEJEMYLbr8dW4JmEDLgLalbqmUbyI1FDVI6XWU49fT0w2yzXLa76IZAF1Xet9SkSeAD5T1fmlls8C6mNFYLzAisCYn5Wejru41NfF/Py7EgL0VtXjFaznOCWzVJa37iIgTFU3iUh3YCDwmIjMVtVHXMtEudZjjMfZZwTGnJnZlMxYCYCIdCljmfVAi8pWJCL1gWOq+m/gKaBbqW+3Atb+tqjGuMeOCIw5M38AXhaRVZT8/swDxpdeQFU3iEhCGW8Zna4j8KSIFAMngVsBRKQucPzU21TGeJqdPmqMB4jIXcARVT3jsQSu5+aq6tTqT2bMr9lbQ8Z4xgR++bnAmTgMvF6NWYypkB0RGGNMkLMjAmOMCXJWBMYYE+SsCIwxJshZERhjTJCzIjDGmCD3/wGbljGrhVZKJQAAAABJRU5ErkJggg==\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "time_range = mol.fstep*np.arange(len(proj))\n", "plt.plot(time_range, proj)\n", "_ = plt.xlabel('Time (ns)')\n", "_ = plt.ylabel('RMSD to crystal')" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true, "slideshow": { "slide_type": "slide" } }, "source": [ "## Projecting a list of MD simulations" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With the arrival of Markov state models, the paradigm of running one single long simulation has shifted to running hundreds or thousands of shorter simulations." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With the analysis of a large amount of MD trajectories in mind, in HTMD, the `simlist` function can be used to bundle all the trajectories together:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Creating simlist: 100%|██████████| 314/314 [00:00<00:00, 2580.37it/s]\n" ] } ], "source": [ "sims = simlist(glob('data/1/filtered/*/'), 'data/1/filtered/filtered.pdb')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Use Metric class to project a list of simulations" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using the `Metric` class, one can calculate all the projections of an entire simlist at once. In this case, the `MetricDistance` projection class is used, which calculates the matrix of distances between two selections (`sel1` and `sel2`):" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Projecting trajectories: 100%|██████████| 313/313 [00:56<00:00, 5.57it/s]\n", "2018-03-15 23:58:31,995 - htmd.projections.metric - INFO - Frame step 9.999999974752427e-07ns was read from the trajectories. If it looks wrong, redefine it by manually setting the MetricData.fstep property.\n" ] } ], "source": [ "metr = Metric(sims)\n", "metr.set(MetricDistance(sel1='protein and name CA', sel2='resname MOL and noh', metric='distances'))\n", "data = metr.project()\n", "data.fstep = 0.1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here, the projection is the matrix of distances between each carbon alpha of the protein and each heavy atom of the MOL molecule (which in the case of these trajectories, it's a ligand binding to the protein)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "The `Metric.project` method returns a `MetricData` object which contains the trajectory and projection information:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "MetricData object with 313 trajectories of 6260.0ns aggregate simulation time\n", "Centers: None\n", "K: None\n", "N: None\n", "description: at 0x7f424512be80\n", "fstep: 0.1\n", "parent: at 0x7f42aa8aba70\n", "trajectories shape: (313,)\n" ] } ], "source": [ "print(data)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Working with MetricData objects" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `MetricData` object contains the information on each trajectory, including the corresponding projection. For example, for simulation 14:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "sim: simid = 14\n", "projection: np.array(shape=(200, 2493))\n", "reference: np.array(shape=(200, 2))\n", "cluster: None\n", "\n" ] } ], "source": [ "print(data.trajectories[14])" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "The projection data can be easily accessed as a whole in the `MetricData.dat` property, which is an array of size equal to the number of trajectories. We can check the projection data of simulation 14 with:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[ 25.02148819, 22.95205688, 22.94793129, ..., 32.08512497,\n", " 35.54423523, 36.99856567],\n", " [ 26.67520523, 23.7568512 , 23.29753304, ..., 32.78013992,\n", " 36.31970978, 37.90874863],\n", " [ 25.62865257, 23.9732132 , 23.40108299, ..., 34.10222626,\n", " 37.37914276, 38.52599335],\n", " ..., \n", " [ 42.73937607, 43.45667648, 40.32186127, ..., 42.97974014,\n", " 40.95644379, 42.0711174 ],\n", " [ 37.95848846, 40.92785263, 39.53430176, ..., 38.68696976,\n", " 37.77161789, 39.62480164],\n", " [ 42.03954697, 42.95910263, 42.93307495, ..., 42.20518112,\n", " 41.23760986, 42.07869339]], dtype=float32)" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data.dat[14] # equivalent to data.trajectories[14].projection" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(313,)" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data.dat.shape" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(62600, 2493)\n" ] } ], "source": [ "bundledprojs = np.vstack(data.dat) # np.concatenate can be used instead as well\n", "print(bundledprojs.shape)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Operations and calculations" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "One can use numpy functions like `np.min`, `np.max`, `np.average` to do calculations over the projected data:\n", "* get the global minimum over all data" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2.92014\n" ] } ], "source": [ "global_minimum = np.min(bundledprojs)\n", "print(global_minimum)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* get the maximum distance on each frame:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(62600,)\n" ] }, { "data": { "text/plain": [ "array([ 54.39825058, 53.99809647, 55.22867203, ..., 40.54018021,\n", " 39.33503342, 37.30210876], dtype=float32)" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "max_eachframe = np.max(bundledprojs,axis=1)\n", "print(max_eachframe.shape)\n", "max_eachframe" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "* or get the average of each pair across all frames:" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(2493,)\n" ] }, { "data": { "text/plain": [ "array([ 34.95511627, 33.69743729, 33.09692001, ..., 32.52411652,\n", " 33.0621109 , 33.91365433], dtype=float32)" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "avg_acrossframes = np.average(bundledprojs,axis=0)\n", "print(avg_acrossframes.shape)\n", "avg_acrossframes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `axis` property is important to make the operation over the desired dimension:\n", "\n", "* `axis=1`, in this case, operates over all pairs, to give a result for each frame\n", "* `axis=0`, in this case, operates over all frames, to give a result for each pair\n", "\n", "But how can we know to what particular atoms of the system corresponds a given indexed pair?" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Playing with the projection mapping" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The mapping (meaning of each index) of the projection can be obtained from the `MetricData.map` property:\n", "\n", "* the mapping is a pandas `DataFrame` object\n", "\n", "The `head` method of `DataFrame` can be used to print the top entries of the mapping:" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
atomIndexesdescriptiontype
0[4, 4480]distance between GLU 1 CA and MOL 278 C3distance
1[19, 4480]distance between ALA 2 CA and MOL 278 C3distance
2[29, 4480]distance between ASP 3 CA and MOL 278 C3distance
3[41, 4480]distance between CYX 4 CA and MOL 278 C3distance
4[51, 4480]distance between GLY 5 CA and MOL 278 C3distance
\n", "
" ], "text/plain": [ " atomIndexes description type\n", "0 [4, 4480] distance between GLU 1 CA and MOL 278 C3 distance\n", "1 [19, 4480] distance between ALA 2 CA and MOL 278 C3 distance\n", "2 [29, 4480] distance between ASP 3 CA and MOL 278 C3 distance\n", "3 [41, 4480] distance between CYX 4 CA and MOL 278 C3 distance\n", "4 [51, 4480] distance between GLY 5 CA and MOL 278 C3 distance" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data.map.head()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "One can assess particular elements of the mapping through:\n", "\n", "* `iloc` - i(nteger)loc(ation):" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
atomIndexesdescriptiontype
20[321, 4480]distance between ARG 21 CA and MOL 278 C3distance
21[345, 4480]distance between GLU 22 CA and MOL 278 C3distance
\n", "
" ], "text/plain": [ " atomIndexes description type\n", "20 [321, 4480] distance between ARG 21 CA and MOL 278 C3 distance\n", "21 [345, 4480] distance between GLU 22 CA and MOL 278 C3 distance" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data.map.iloc[20:22]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " * `loc` (using the DataFrame index):" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "atomIndexes [321, 4480]\n", "description distance between ARG 21 CA and MOL 278 C3\n", "type distance\n", "Name: 20, dtype: object" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data.map.loc[20]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* or direct assess to the array elements:" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'distance between ARG 21 CA and MOL 278 C3'" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data.map['description'][20]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "One can use the mapping to find out to what particular pair of elements the minimum distance found in the trajectories corresponds to." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `np.where` functionality gives us the array coordinates where a given condition is matched (in this case, when the distance is equal to the minimum):" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2.92014\n", "[2196]\n", "[ 2.9201436]\n" ] } ], "source": [ "print(global_minimum)\n", "frame, index = np.where(bundledprojs == global_minimum)\n", "print(index)\n", "print(bundledprojs[frame, index]) # confirm that indeed these coordinates correspond the inquired value" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then, using the index in the mapping shown before:" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
atomIndexesdescriptiontype
2196[4113, 4497]distance between GLY 258 CA and MOL 278 N3distance
\n", "
" ], "text/plain": [ " atomIndexes description type\n", "2196 [4113, 4497] distance between GLY 258 CA and MOL 278 N3 distance" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data.map.iloc[index]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "gives us the atom indexes and description of the pair." ] } ], "metadata": { "anaconda-cloud": {}, "celltoolbar": "Slideshow", "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.4" } }, "nbformat": 4, "nbformat_minor": 1 }