{ "cells": [ { "cell_type": "markdown", "id": "satisfactory-stanley", "metadata": {}, "source": [ "# Improving performance" ] }, { "cell_type": "code", "execution_count": 1, "id": "rural-wings", "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "# dask and distributed are extra installs\n", "from dask.distributed import Client, LocalCluster\n", "import matplotlib.pyplot as plt\n", "import mdtraj as md\n", "traj = md.load(\"5550217/kras.xtc\", top=\"5550217/kras.pdb\")\n", "topology = traj.topology" ] }, { "cell_type": "markdown", "id": "boxed-fiction", "metadata": {}, "source": [ "Much of the core computational effort in Contact Map Explorer is performed by MDTraj, which uses OpenMP during the nearest-neighbors calculation. This already provides excellent performance for a bottleneck in the contact map creation process. However, Contact Map Explorer also has a few other tricks to further enhance performance." ] }, { "cell_type": "markdown", "id": "assigned-cause", "metadata": {}, "source": [ "## Dask\n", "\n", "For multi-frame contact maps and contact trajectories, Contact Map Explorer can use Dask to parallelize across frames. Note that Dask is not required to install Contact Map Explorer, so you must install Dask separately to benefit from it.\n", "\n", "When using Dask, a few things are different:\n", "\n", "1. You need to provide a `distributed.Client` to the `DaskContactFrequency` or `DaskContactTrajectory`.\n", "2. You need to provide the filename (and any other arguments needed by MDTraj) instead of the trajectory itself.\n", "\n", "Dask might not give any performance boost on a single machine, but it can be very useful when parallelizing across multiple machines. Because this directly takes a `Client`, it is easy to interface this with tools like [dask-jobqueue](https://jobqueue.dask.org/en/latest/)." ] }, { "cell_type": "code", "execution_count": 2, "id": "southeast-cement", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n",
"Client\n", "
| \n",
"\n",
"Cluster\n", "
| \n",
"