{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# System building: Protein with Ligand" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" } }, "source": [ "In this tutorial, we will show how to build a protein-ligand system for simulating binding. The sample system is trypsin (the protein) and benzamidine (the ligand).\n", "\n", "Let's start with some imports and definitions:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Please cite HTMD: Doerr et al.(2016)JCTC,12,1845. \n", "https://dx.doi.org/10.1021/acs.jctc.6b00049\n", "Documentation: http://software.acellera.com/\n", "To update: conda update htmd -c acellera -c psi4\n", "\n", "You are on the latest HTMD version (unpackaged : /home/joao/maindisk/software/repos/cuzzo87/htmd/htmd).\n", "\n" ] } ], "source": [ "from htmd.ui import *\n", "from htmd.home import home\n", "from os.path import join\n", "config(viewer='webgl')\n", "datadir = home(dataDir='building-protein-ligand')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Load the protein-ligand complex" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The protein-ligand complex can be obtained from the Protein Data Bank (PDB ID: 3PTB). It is also included in the data distributed with HTMD, and either source can be used:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2018-04-30 14:51:38,239 - htmd.molecule.readers - INFO - Using local copy for 3PTB: /home/joao/maindisk/software/repos/cuzzo87/htmd/htmd/data/pdb/3ptb.pdb\n", "2018-04-30 14:51:38,361 - htmd.molecule.molecule - WARNING - Residue insertions were detected in the Molecule. 
It is recommended to renumber the residues using the Molecule.renumberResidues() method.\n", "2018-04-30 14:51:38,473 - htmd.molecule.molecule - WARNING - Residue insertions were detected in the Molecule. It is recommended to renumber the residues using the Molecule.renumberResidues() method.\n" ] } ], "source": [ "# One can download it directly from the RCSB servers\n", "prot = Molecule('3PTB')\n", "# Or use the pdb file found in the HTMD data directory\n", "prot = Molecule(join(datadir, 'trypsin.pdb'))" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "326c6f1c1fd140408f489101f575f0d6", "version_major": 2, "version_minor": 0 }, "text/plain": [ "A Jupyter Widget" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "prot.view()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Clean the structures" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" } }, "source": [ "The PDB crystal structure contains the protein as well as water molecules, a calcium ion and a ligand. Here we will start by removing the ligand from the protein Molecule as we will add it later to manipulate it separately." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2018-04-30 14:51:38,911 - htmd.molecule.molecule - INFO - Removed 9 atoms. 
1692 atoms remaining in the molecule.\n" ] }, { "data": { "text/plain": [ "array([1630, 1631, 1632, 1633, 1634, 1635, 1636, 1637, 1638], dtype=int32)" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "prot.remove('resname BEN')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Preparing the protein" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this step, we prepare the protein for simulation by adding hydrogens, setting the protonation states, and optimizing the protein (see the protein preparation tutorial for more details):" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "scrolled": true }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2018-04-30 14:51:39,057 - propka - INFO - No pdbfile provided\n", "2018-04-30 14:51:41,578 - htmd.builder.preparation - WARNING - The following residue has not been optimized: CA\n", "2018-04-30 14:51:49,983 - htmd.builder.preparationdata - INFO - The following residues are in a non-standard state: CYS 6 A (CYX), HIS 22 A (HIE), CYS 24 A (CYX), HIS 39 A (HIP), CYS 40 A (CYX), HIS 72 A (HID), CYS 108 A (CYX), CYS 115 A (CYX), CYS 136 A (CYX), CYS 147 A (CYX), CYS 161 A (CYX), CYS 172 A (CYX), CYS 182 A (CYX), CYS 196 A (CYX), CYS 209 A (CYX)\n", "2018-04-30 14:51:50,022 - htmd.builder.preparationdata - WARNING - Dubious protonation state: the pKa of 4 residues is within 1.0 units of pH 7.0.\n", "2018-04-30 14:51:50,026 - htmd.builder.preparationdata - WARNING - Dubious protonation state: HIS 39 A (pKa= 7.46)\n", "2018-04-30 14:51:50,029 - htmd.builder.preparationdata - WARNING - Dubious protonation state: GLU 51 A (pKa= 6.10)\n", "2018-04-30 14:51:50,030 - htmd.builder.preparationdata - WARNING - Dubious protonation state: ASP 170 A (pKa= 6.49)\n", "2018-04-30 14:51:50,031 - htmd.builder.preparationdata - WARNING - Dubious protonation state: N+ 0T A (pKa= 7.49)\n", "2018-04-30 
14:51:50,121 - htmd.builder.preparationdata - WARNING - Found N-terminus 80.7% buried (> 50.0% threshold)\n", "2018-04-30 14:51:50,122 - htmd.builder.preparationdata - WARNING - Found C-terminus involved in H bonds\n" ] } ], "source": [ "prot = proteinPrepare(prot, pH=7.0)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Define segments" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" } }, "source": [ "To build a system in HTMD, each chemical molecule must be placed in its own segment. This prevents the builder from accidentally bonding different molecules and allows us to add caps to them." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "prot = autoSegment(prot, sel='protein')\n", "prot.set('segid', 'W', sel='water')\n", "prot.set('segid', 'CA', sel='resname CA')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Center the protein at the origin:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "prot.center()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Let's work on the ligand!" 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Load the ligand from the HTMD data:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2018-04-30 14:51:50,585 - htmd.molecule.readers - WARNING - Element of atom ID 1 could not be automatically guessed from its MOL2 atomtype (ca).\n", "2018-04-30 14:51:50,586 - htmd.molecule.readers - WARNING - Element of atom ID 2 could not be automatically guessed from its MOL2 atomtype (ca).\n", "2018-04-30 14:51:50,587 - htmd.molecule.readers - WARNING - Element of atom ID 3 could not be automatically guessed from its MOL2 atomtype (ca).\n", "2018-04-30 14:51:50,588 - htmd.molecule.readers - WARNING - Element of atom ID 4 could not be automatically guessed from its MOL2 atomtype (ca).\n", "2018-04-30 14:51:50,589 - htmd.molecule.readers - WARNING - Element of atom ID 5 could not be automatically guessed from its MOL2 atomtype (ca).\n", "2018-04-30 14:51:50,590 - htmd.molecule.readers - WARNING - Element of atom ID 6 could not be automatically guessed from its MOL2 atomtype (ca).\n", "2018-04-30 14:51:50,592 - htmd.molecule.readers - WARNING - Element of atom ID 7 could not be automatically guessed from its MOL2 atomtype (ce).\n", "2018-04-30 14:51:50,593 - htmd.molecule.readers - WARNING - Element of atom ID 8 could not be automatically guessed from its MOL2 atomtype (ha).\n", "2018-04-30 14:51:50,594 - htmd.molecule.readers - WARNING - Element of atom ID 9 could not be automatically guessed from its MOL2 atomtype (ha).\n", "2018-04-30 14:51:50,595 - htmd.molecule.readers - WARNING - Element of atom ID 10 could not be automatically guessed from its MOL2 atomtype (ha).\n", "2018-04-30 14:51:50,596 - htmd.molecule.readers - WARNING - Element of atom ID 11 could not be automatically guessed from its MOL2 atomtype (ha).\n", "2018-04-30 14:51:50,597 - htmd.molecule.readers - WARNING - Element of atom ID 12 could not be automatically 
guessed from its MOL2 atomtype (ha).\n", "2018-04-30 14:51:50,598 - htmd.molecule.readers - WARNING - Element of atom ID 13 could not be automatically guessed from its MOL2 atomtype (nh).\n", "2018-04-30 14:51:50,599 - htmd.molecule.readers - WARNING - Element of atom ID 14 could not be automatically guessed from its MOL2 atomtype (nh).\n", "2018-04-30 14:51:50,600 - htmd.molecule.readers - WARNING - Element of atom ID 15 could not be automatically guessed from its MOL2 atomtype (hn).\n", "2018-04-30 14:51:50,601 - htmd.molecule.readers - WARNING - Element of atom ID 16 could not be automatically guessed from its MOL2 atomtype (hn).\n", "2018-04-30 14:51:50,602 - htmd.molecule.readers - WARNING - Element of atom ID 17 could not be automatically guessed from its MOL2 atomtype (hn).\n", "2018-04-30 14:51:50,603 - htmd.molecule.readers - WARNING - Element of atom ID 18 could not be automatically guessed from its MOL2 atomtype (hn).\n" ] } ], "source": [ "ligand = Molecule(join(datadir, 'benzamidine.mol2'))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's center the ligand and visualize it:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "290df5adc31f4790af97103b1223b5c4", "version_major": 2, "version_minor": 0 }, "text/plain": [ "A Jupyter Widget" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "ligand.center()\n", "ligand.view()" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "# We can give a convenient segid and resname to the ligand\n", "# The resname should be MOL to match the parameters in the\n", "# rtf and prm files.\n", "ligand.set('segid','L')\n", "ligand.set('resname','MOL')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "But the ligand is now located inside the protein...\n", "We would like the ligand to be:\n", "\n", 
"* At a certain distance from the protein\n", "* Rotated randomly, to provide different starting conditions" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Let's randomize the ligand position" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "ligand.rotateBy(uniformRandomRotation())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This takes care of rotating the ligand about its own center.\n", "We still need to position it far from the protein.\n", "First, find the radius of the protein:\n", "\n", "