{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Protein Ligand Complex MD Setup tutorial using BioExcel Building Blocks (biobb)\n", "### Based on the official Gromacs tutorial: http://www.mdtutorials.com/gmx/complex/index.html\n", "***\n", "This tutorial aims to illustrate the process of **setting up a simulation system** containing a **protein in complex with a ligand**, step by step, using the **BioExcel Building Blocks library (biobb)**. The particular example used is the **T4 lysozyme** L99A/M102Q protein (PDB code 3HTB, [https://doi.org/10.2210/pdb3HTB/pdb](https://doi.org/10.2210/pdb3HTB/pdb)), in complex with the **2-propylphenol** small molecule (3-letter Code JZ4, [https://www.rcsb.org/ligand/JZ4](https://www.rcsb.org/ligand/JZ4)). \n", "***\n", "**Biobb modules** used:\n", "\n", " - [biobb_io](https://github.com/bioexcel/biobb_io): Tools to fetch biomolecular data from public databases.\n", " - [biobb_model](https://github.com/bioexcel/biobb_model): Tools to model macromolecular structures.\n", " - [biobb_chemistry](https://github.com/bioexcel/biobb_chemistry): Tools to manipulate chemical data.\n", " - [biobb_gromacs](https://github.com/bioexcel/biobb_gromacs): Tools to setup and run Molecular Dynamics simulations.\n", " - [biobb_analysis](https://github.com/bioexcel/biobb_analysis): Tools to analyse Molecular Dynamics trajectories.\n", " - [biobb_structure_utils](https://github.com/bioexcel/biobb_structure_utils): Tools to modify or extract information from a PDB structure file.\n", " \n", "### Auxiliary libraries used\n", "\n", "* [jupyter](https://jupyter.org/): Free software, open standards, and web services for interactive computing across all programming languages.\n", "* [nglview](http://nglviewer.org/#nglview): Jupyter/IPython widget to interactively view molecular structures and trajectories in notebooks.\n", "* [plotly](https://plot.ly/python/offline/): Python interactive graphing library integrated in Jupyter notebooks.\n", "* [simpletraj](https://github.com/arose/simpletraj): Lightweight coordinate-only trajectory reader based on code from GROMACS, MDAnalysis and VMD.\n", "\n", "### Conda Installation and Launch\n", "\n", "```console\n", "git clone https://github.com/bioexcel/biobb_wf_protein-complex_md_setup.git\n", "cd biobb_wf_protein-complex_md_setup\n", "conda env create -f conda_env/environment.yml\n", "conda activate biobb_Protein-Complex_MDsetup_tutorial\n", "jupyter-notebook biobb_wf_protein-complex_md_setup/notebooks/biobb_Protein-Complex_MDsetup_tutorial.ipynb\n", "```\n", "\n", "***\n", "### Pipeline steps:\n", " 1. [Input Parameters](#input)\n", " 2. [Fetching PDB Structure](#fetch)\n", " 3. [Fix Protein Structure](#fix)\n", " 4. [Create Protein System Topology](#top)\n", " 5. [Create ligand system topology](#ligtop)\n", " 6. [Preparing Ligand Restraints](#restraints)\n", " 7. [Create new protein-ligand complex structure file](#complex)\n", " 8. [Create new protein-ligand complex topology file](#complextop)\n", " 9. [Create Solvent Box](#box)\n", " 10. [Fill the Box with Water Molecules](#water)\n", " 11. [Adding Ions](#ions)\n", " 12. [Energetically Minimize the System](#min)\n", " 13. [Equilibrate the System (NVT)](#nvt)\n", " 14. [Equilibrate the System (NPT)](#npt)\n", " 15. [Free Molecular Dynamics Simulation](#free)\n", " 16. [Post-processing and Visualizing Resulting 3D Trajectory](#post)\n", " 17. [Output Files](#output)\n", " 18. [Questions & Comments](#questions)\n", " \n", "***\n", "***\n", "\"Bioexcel2\n", "***\n" ] }, { "cell_type": "markdown", "metadata": { "jp-MarkdownHeadingCollapsed": true }, "source": [ "## Initializing colab\n", "The two cells below are used only in case this notebook is executed via **Google Colab**. Take into account that, for running conda on **Google Colab**, the **condacolab** library must be installed. As [explained here](https://pypi.org/project/condacolab/), the installation requires a **kernel restart**, so when running this notebook in **Google Colab**, don't run all cells until this **installation** is properly **finished** and the **kernel** has **restarted**." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Only executed when using google colab\n", "import sys\n", "if 'google.colab' in sys.modules:\n", " import subprocess\n", " from pathlib import Path\n", " try:\n", " subprocess.run([\"conda\", \"-V\"], check=True)\n", " except FileNotFoundError:\n", " subprocess.run([sys.executable, \"-m\", \"pip\", \"install\", \"condacolab\"], check=True)\n", " import condacolab\n", " condacolab.install()\n", " # Clone repository\n", " repo_URL = \"https://github.com/bioexcel/biobb_wf_protein-complex_md_setup.git\"\n", " repo_name = Path(repo_URL).name.split('.')[0]\n", " if not Path(repo_name).exists():\n", " subprocess.run([\"mamba\", \"install\", \"-y\", \"git\"], check=True)\n", " subprocess.run([\"git\", \"clone\", repo_URL], check=True)\n", " print(\"⏬ Repository properly cloned.\")\n", " # Install environment\n", " print(\"⏳ Creating environment...\")\n", " env_file_path = f\"{repo_name}/conda_env/environment.yml\"\n", " subprocess.run([\"mamba\", \"env\", \"update\", \"-n\", \"base\", \"-f\", env_file_path], check=True)\n", " print(\"🎨 Install NGLView dependencies...\")\n", " subprocess.run([\"mamba\", \"install\", \"-y\", \"-c\", \"conda-forge\", \"nglview==3.0.8\", \"ipywidgets=7.7.2\"], check=True)\n", " print(\"👍 Conda environment successfully created and updated.\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Enable widgets for colab\n", "if 'google.colab' in sys.modules:\n", " from google.colab import output\n", " output.enable_custom_widget_manager()\n", " import os\n", " os.chdir(\"biobb_wf_protein-complex_md_setup/biobb_wf_protein-complex_md_setup/notebooks\")\n", " print(f\"📂 New working directory: {os.getcwd()}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## Input parameters\n", "**Input parameters** needed:\n", " - **pdbCode**: PDB code of the protein-ligand complex structure (e.g. 3HTB, [https://doi.org/10.2210/pdb3HTB/pdb](https://doi.org/10.2210/pdb3HTB/pdb))\n", " - **ligandCode**: Small molecule 3-letter code for the ligand structure (e.g. JZ4, [https://www.rcsb.org/ligand/JZ4](https://www.rcsb.org/ligand/JZ4))\n", " - **mol_charge**: Charge of the small molecule, needed to add hydrogen atoms." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "import nglview\n", "import ipywidgets\n", "import os\n", "import zipfile\n", "\n", "pdbCode = \"3HTB\"\n", "ligandCode = \"JZ4\"\n", "mol_charge = 0" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "***\n", "## Fetching PDB structure\n", "Downloading **PDB structure** with the **protein-ligand complex** from the RCSB PDB database.
\n", "Alternatively, a PDB file can be used as starting structure.
\n", "Splitting the molecule in **three different files**: \n", "- **proteinFile**: Protein structure\n", "- **ligandFile**: Ligand structure\n", "- **complexFile**: Protein-ligand complex structure \n", "\n", "***\n", "**Building Blocks** used:\n", " - [Pdb](https://biobb-io.readthedocs.io/en/latest/api.html#module-api.pdb) from **biobb_io.api.pdb**\n", "***" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# Downloading desired PDB file \n", "# Import module\n", "from biobb_io.api.pdb import pdb\n", "\n", "# Create properties dict and inputs/outputs\n", "downloaded_pdb = pdbCode+'.orig.pdb'\n", "prop = {\n", " 'pdb_code': pdbCode,\n", " 'filter': False\n", "}\n", "\n", "# Create and launch bb\n", "pdb(output_pdb_path=downloaded_pdb,\n", " properties=prop)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Extracting Protein, Ligand and Protein-Ligand Complex to three different files\n", "# Import module\n", "from biobb_structure_utils.utils.extract_heteroatoms import extract_heteroatoms\n", "from biobb_structure_utils.utils.extract_molecule import extract_molecule\n", "from biobb_structure_utils.utils.cat_pdb import cat_pdb\n", "\n", "# Create properties dict and inputs/outputs\n", "proteinFile = pdbCode+'.pdb'\n", "ligandFile = ligandCode+'.pdb'\n", "complexFile = pdbCode+'_'+ligandCode+'.pdb'\n", "\n", "prop = {\n", " 'heteroatoms' : [{\"name\": \"JZ4\"}]\n", "}\n", "\n", "extract_heteroatoms(input_structure_path=downloaded_pdb,\n", " output_heteroatom_path=ligandFile,\n", " properties=prop)\n", "\n", "extract_molecule(input_structure_path=downloaded_pdb,\n", " output_molecule_path=proteinFile)\n", "\n", "print(proteinFile, ligandFile, complexFile)\n", "\n", "cat_pdb(input_structure1=proteinFile,\n", " input_structure2=ligandFile,\n", " output_structure_path=complexFile)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Visualizing 3D structures\n", "Visualizing the generated **PDB structures** using **NGL**: \n", "- **Protein structure** (Left)\n", "- **Ligand structure** (Center)\n", "- **Protein-ligand complex** (Right) " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# Show structures: protein, ligand and protein-ligand complex\n", "view1 = nglview.show_structure_file(proteinFile)\n", "view1._remote_call('setSize', target='Widget', args=['350px','400px'])\n", "view1.camera='orthographic'\n", "view1\n", "view2 = nglview.show_structure_file(ligandFile)\n", "view2.add_representation(repr_type='ball+stick')\n", "view2._remote_call('setSize', target='Widget', args=['350px','400px'])\n", "view2.camera='orthographic'\n", "view2\n", "view3 = nglview.show_structure_file(complexFile)\n", "view3.add_representation(repr_type='licorice', radius='.5', selection=ligandCode)\n", "view3._remote_call('setSize', target='Widget', args=['350px','400px'])\n", "view3.camera='orthographic'\n", "view3\n", "ipywidgets.HBox([view1, view2, view3])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "***\n", "## Fix protein structure\n", "**Checking** and **fixing** (if needed) the protein structure:
\n", "- **Modeling** **missing side-chain atoms**, modifying incorrect **amide assignments**, choosing **alternative locations**.
\n", "- **Checking** for missing **backbone atoms**, **heteroatoms**, **modified residues** and possible **atomic clashes**.\n", "\n", "***\n", "**Building Blocks** used:\n", " - [FixSideChain](https://biobb-model.readthedocs.io/en/latest/model.html#module-model.fix_side_chain) from **biobb_model.model.fix_side_chain**\n", "***" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Check & Fix Protein Structure\n", "# Import module\n", "from biobb_model.model.fix_side_chain import fix_side_chain\n", "\n", "# Create prop dict and inputs/outputs\n", "fixed_pdb = pdbCode+'_fixed.pdb'\n", "\n", "# Create and launch bb\n", "fix_side_chain(input_pdb_path=proteinFile,\n", " output_pdb_path=fixed_pdb)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "***\n", "## Create protein system topology\n", "**Building GROMACS topology** corresponding to the protein structure.
\n", "Force field used in this tutorial is [**amber99sb-ildn**](https://dx.doi.org/10.1002%2Fprot.22711): AMBER **parm99** force field with **corrections on backbone** (sb) and **side-chain torsion potentials** (ildn). Water molecules type used in this tutorial is [**spc/e**](https://pubs.acs.org/doi/abs/10.1021/j100308a038).
\n", "Adding **hydrogen atoms** if missing. Automatically identifying **disulfide bridges**.
\n", "\n", "Generating two output files: \n", "- **GROMACS structure** (gro file)\n", "- **GROMACS topology** ZIP compressed file containing:\n", " - *GROMACS topology top file* (top file)\n", " - *GROMACS position restraint file/s* (itp file/s)\n", "***\n", "**Building Blocks** used:\n", " - [Pdb2gmx](https://biobb-md.readthedocs.io/en/latest/gromacs.html#module-gromacs.pdb2gmx) from **biobb_gromacs.gromacs.pdb2gmx**\n", "***" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Create Protein system topology\n", "# Import module\n", "from biobb_gromacs.gromacs.pdb2gmx import pdb2gmx\n", "\n", "# Create inputs/outputs\n", "output_pdb2gmx_gro = pdbCode+'_pdb2gmx.gro'\n", "output_pdb2gmx_top_zip = pdbCode+'_pdb2gmx_top.zip'\n", "prop = {\n", " 'force_field' : 'amber99sb-ildn',\n", " 'water_type': 'spce'\n", "}\n", "\n", "# Create and launch bb\n", "pdb2gmx(input_pdb_path=fixed_pdb,\n", " output_gro_path=output_pdb2gmx_gro,\n", " output_top_zip_path=output_pdb2gmx_top_zip,\n", " properties=prop)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "***\n", "## Create ligand system topology\n", "**Building GROMACS topology** corresponding to the ligand structure.
\n", "Force field used in this tutorial step is **amberGAFF**: [General AMBER Force Field](http://ambermd.org/antechamber/gaff.html), designed for rational drug design.
\n", "- [Step 1](#ligandTopologyStep1): Add **hydrogen atoms** if missing.\n", "- [Step 2](#ligandTopologyStep2): **Energetically minimize the system** with the new hydrogen atoms. \n", "- [Step 3](#ligandTopologyStep3): Generate **ligand topology** (parameters). \n", "***\n", "**Building Blocks** used:\n", " - [ReduceAddHydrogens](https://biobb-chemistry.readthedocs.io/en/latest/ambertools.html#module-ambertools.reduce_add_hydrogens) from **biobb_chemistry.ambertools.reduce_add_hydrogens**\n", " - [BabelMinimize](https://biobb-chemistry.readthedocs.io/en/latest/babelm.html#module-babelm.babel_minimize) from **biobb_chemistry.babelm.babel_minimize** \n", " - [AcpypeParamsGMX](https://biobb-chemistry.readthedocs.io/en/latest/acpype.html#module-acpype.acpype_params_gmx) from **biobb_chemistry.acpype.acpype_params_gmx** \n", "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Step 1: Add **hydrogen atoms**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Create Ligand system topology, STEP 1\n", "# Reduce_add_hydrogens: add Hydrogen atoms to a small molecule (using Reduce tool from Ambertools package)\n", "# Import module\n", "from biobb_chemistry.ambertools.reduce_add_hydrogens import reduce_add_hydrogens\n", "\n", "# Create prop dict and inputs/outputs\n", "output_reduce_h = ligandCode+'.reduce.H.pdb' \n", "prop = {\n", " 'nuclear' : 'true'\n", "}\n", "\n", "# Create and launch bb\n", "reduce_add_hydrogens(input_path=ligandFile,\n", " output_path=output_reduce_h,\n", " properties=prop)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Step 2: **Energetically minimize the system** with the new hydrogen atoms. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Create Ligand system topology, STEP 2\n", "# Babel_minimize: Structure energy minimization of a small molecule after being modified adding hydrogen atoms\n", "# Import module\n", "from biobb_chemistry.babelm.babel_minimize import babel_minimize\n", "\n", "# Create prop dict and inputs/outputs\n", "output_babel_min = ligandCode+'.H.min.mol2' \n", "prop = {\n", " 'method' : 'sd',\n", " 'criteria' : '1e-10',\n", " 'force_field' : 'GAFF'\n", "}\n", "\n", "\n", "# Create and launch bb\n", "babel_minimize(input_path=output_reduce_h,\n", " output_path=output_babel_min,\n", " properties=prop)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Visualizing 3D structures\n", "Visualizing the small molecule generated **PDB structures** using **NGL**: \n", "- **Original Ligand Structure** (Left)\n", "- **Ligand Structure with hydrogen atoms added** (with Reduce program) (Center)\n", "- **Ligand Structure with hydrogen atoms added** (with Reduce program), **energy minimized** (with Open Babel) (Right) " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# Show different structures generated (for comparison)\n", "\n", "view1 = nglview.show_structure_file(ligandFile)\n", "view1.add_representation(repr_type='ball+stick')\n", "view1._remote_call('setSize', target='Widget', args=['350px','400px'])\n", "view1.camera='orthographic'\n", "view1\n", "view2 = nglview.show_structure_file(output_reduce_h)\n", "view2.add_representation(repr_type='ball+stick')\n", "view2._remote_call('setSize', target='Widget', args=['350px','400px'])\n", "view2.camera='orthographic'\n", "view2\n", "view3 = nglview.show_structure_file(output_babel_min)\n", "view3.add_representation(repr_type='ball+stick')\n", "view3._remote_call('setSize', target='Widget', args=['350px','400px'])\n", "view3.camera='orthographic'\n", "view3\n", "ipywidgets.HBox([view1, view2, view3])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Step 3: Generate **ligand topology** (parameters)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Create Ligand system topology, STEP 3\n", "# Acpype_params_gmx: Generation of topologies for GROMACS with ACPype\n", "# Import module\n", "from biobb_chemistry.acpype.acpype_params_gmx import acpype_params_gmx\n", "\n", "# Create prop dict and inputs/outputs\n", "output_acpype_gro = ligandCode+'params.gro'\n", "output_acpype_itp = ligandCode+'params.itp'\n", "output_acpype_top = ligandCode+'params.top'\n", "output_acpype = ligandCode+'params'\n", "prop = {\n", " 'basename' : output_acpype,\n", " 'charge' : mol_charge\n", "}\n", "\n", "# Create and launch bb\n", "acpype_params_gmx(input_path=output_babel_min, \n", " output_path_gro=output_acpype_gro,\n", " output_path_itp=output_acpype_itp,\n", " output_path_top=output_acpype_top,\n", " properties=prop)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "***\n", "## Preparing Ligand Restraints\n", "In subsequent steps of the pipeline, such as the equilibration stages of the **protein-ligand complex** system, it is recommended to apply some **restraints** to the small molecule, to avoid a possible change in position due to protein repulsion. **Position restraints** will be applied to the ligand, using a **force constant of 1000 KJ/mol\\*nm^2** on the three coordinates: x, y and z. In this steps the **restriction files** will be created and integrated in the **ligand topology**.\n", "- [Step 1](#restraintsStep1): Creating an index file with a new group including just the **small molecule heavy atoms**.\n", "- [Step 2](#restraintsStep2): Generating the **position restraints** file.\n", "***\n", "**Building Blocks** used:\n", " - [MakeNdx](https://biobb-md.readthedocs.io/en/latest/gromacs.html#module-gromacs.make_ndx) from **biobb_gromacs.gromacs.make_ndx** \n", " - [Genrestr](https://biobb-md.readthedocs.io/en/latest/gromacs.html#module-gromacs.genrestr) from **biobb_gromacs.gromacs.genrestr** \n", "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Step 1: Creating an index file for the small molecule heavy atoms" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# MakeNdx: Creating index file with a new group (small molecule heavy atoms)\n", "from biobb_gromacs.gromacs.make_ndx import make_ndx\n", "\n", "# Create prop dict and inputs/outputs\n", "output_ligand_ndx = ligandCode+'_index.ndx'\n", "prop = {\n", " 'selection': \"0 & ! a H*\"\n", "}\n", "\n", "# Create and launch bb\n", "make_ndx(input_structure_path=output_acpype_gro,\n", " output_ndx_path=output_ligand_ndx,\n", " properties=prop)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Step 2: Generating the position restraints file" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Genrestr: Generating the position restraints file\n", "from biobb_gromacs.gromacs.genrestr import genrestr\n", "\n", "# Create prop dict and inputs/outputs\n", "output_restraints_top = ligandCode+'_posres.itp'\n", "prop = {\n", " 'force_constants': \"1000 1000 1000\",\n", " 'restrained_group': \"System\"\n", "}\n", "\n", "# Create and launch bb\n", "genrestr(input_structure_path=output_acpype_gro,\n", " input_ndx_path=output_ligand_ndx,\n", " output_itp_path=output_restraints_top,\n", " properties=prop)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "***\n", "## Create new protein-ligand complex structure file\n", "Building new **protein-ligand complex** PDB file with:\n", "- The new **protein system** with fixed problems from *Fix Protein Structure* step and hydrogens atoms added from *Create Protein System Topology* step.\n", "- The new **ligand system** with hydrogens atoms added from *Create Ligand System Topology* step. \n", "\n", "This new structure is needed for **GROMACS** as it is **force field-compliant**, it **has all the new hydrogen atoms**, and the **atom names are matching the newly generated protein and ligand topologies**.\n", "***\n", "**Building Blocks** used:\n", " - [GMXTrjConvStr](https://biobb-analysis.readthedocs.io/en/latest/gromacs.html#module-gromacs.gmx_trjconv_str) from **biobb_analysis.gromacs.gmx_trjconv_str**\n", "***" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# biobb analysis module\n", "from biobb_analysis.gromacs.gmx_trjconv_str import gmx_trjconv_str\n", "from biobb_structure_utils.utils.cat_pdb import cat_pdb\n", "\n", "# Convert gro (with hydrogens) to pdb (PROTEIN)\n", "proteinFile_H = pdbCode+'_'+ligandCode+'_complex_H.pdb'\n", "prop = {\n", " 'selection' : 'System'\n", "}\n", "\n", "# Create and launch bb\n", "gmx_trjconv_str(input_structure_path=output_pdb2gmx_gro,\n", " input_top_path=output_pdb2gmx_gro,\n", " output_str_path=proteinFile_H, \n", " properties=prop)\n", "\n", "# Convert gro (with hydrogens) to pdb (LIGAND)\n", "ligandFile_H = ligandCode+'_complex_H.pdb'\n", "prop = {\n", " 'selection' : 'System'\n", "}\n", "\n", "# Create and launch bb\n", "gmx_trjconv_str(input_structure_path=output_acpype_gro,\n", " input_top_path=output_acpype_gro,\n", " output_str_path=ligandFile_H, \n", " properties=prop)\n", "\n", "\n", "# Concatenating both PDB files: Protein + Ligand\n", "complexFile_H = pdbCode+'_'+ligandCode+'_H.pdb'\n", "\n", "# Create and launch bb\n", "cat_pdb(input_structure1=proteinFile_H,\n", " input_structure2=ligandFile_H,\n", " output_structure_path=complexFile_H)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "***\n", "## Create new protein-ligand complex topology file\n", "Building new **protein-ligand complex** GROMACS topology file with:\n", "- The new **protein system** topology generated from *Create Protein System Topology* step.\n", "- The new **ligand system** topology generated from *Create Ligand System Topology* step. \n", "\n", "NOTE: From this point on, the **protein-ligand complex structure and topology** generated can be used in a regular MD setup.\n", "***\n", "**Building Blocks** used:\n", " - [AppendLigand](https://biobb-md.readthedocs.io/en/latest/modules.html) from **biobb_gromacs.gromacs_extra.append_ligand** (NOTE: link should be updated with the documentation)\n", "***" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# AppendLigand: Append a ligand to a GROMACS topology\n", "# Import module\n", "from biobb_gromacs.gromacs_extra.append_ligand import append_ligand\n", "\n", "# Create prop dict and inputs/outputs\n", "output_complex_top = pdbCode+'_'+ligandCode+'_complex.top.zip'\n", "\n", "posresifdef = \"POSRES_\"+ligandCode.upper()\n", "prop = {\n", " 'posres_name': posresifdef\n", "}\n", "\n", "# Create and launch bb\n", "append_ligand(input_top_zip_path=output_pdb2gmx_top_zip,\n", " input_posres_itp_path=output_restraints_top,\n", " input_itp_path=output_acpype_itp, \n", " output_top_zip_path=output_complex_top,\n", " properties=prop)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "***\n", "## Create solvent box\n", "Define the unit cell for the **protein-ligand complex** to fill it with water molecules.
\n", "**Truncated octahedron** box is used for the unit cell. This box type is the one which best reflects the geometry of the solute/protein, in this case a **globular protein**, as it approximates a sphere. It is also convenient for the computational point of view, as it accumulates **less water molecules at the corners**, reducing the final number of water molecules in the system and making the simulation run faster.
A **protein to box** distance of **0.8 nm** is used, and the protein is **centered in the box**. \n", "\n", "***\n", "**Building Blocks** used:\n", " - [Editconf](https://biobb-md.readthedocs.io/en/latest/gromacs.html#module-gromacs.editconf) from **biobb_gromacs.gromacs.editconf** \n", "***" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Editconf: Create solvent box\n", "# Import module\n", "from biobb_gromacs.gromacs.editconf import editconf\n", "\n", "# Create prop dict and inputs/outputs\n", "output_editconf_gro = pdbCode+'_'+ligandCode+'_complex_editconf.gro'\n", "\n", "prop = {\n", " 'box_type': 'octahedron',\n", " 'distance_to_molecule': 0.8\n", "}\n", "\n", "# Create and launch bb\n", "editconf(input_gro_path=complexFile_H, \n", " output_gro_path=output_editconf_gro,\n", " properties=prop)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "***\n", "## Fill the box with water molecules\n", "Fill the unit cell for the **protein-ligand complex** with water molecules.
\n", "The solvent type used is the default **Simple Point Charge water (SPC)**, a generic equilibrated 3-point solvent model. \n", "\n", "***\n", "**Building Blocks** used:\n", " - [Solvate](https://biobb-md.readthedocs.io/en/latest/gromacs.html#module-gromacs.solvate) from **biobb_gromacs.gromacs.solvate** \n", "***" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Solvate: Fill the box with water molecules\n", "from biobb_gromacs.gromacs.solvate import solvate\n", "\n", "# Create prop dict and inputs/outputs\n", "output_solvate_gro = pdbCode+'_'+ligandCode+'_solvate.gro'\n", "output_solvate_top_zip = pdbCode+'_'+ligandCode+'_solvate_top.zip'\n", "\n", "# Create and launch bb\n", "solvate(input_solute_gro_path=output_editconf_gro,\n", " output_gro_path=output_solvate_gro,\n", " input_top_zip_path=output_complex_top,\n", " output_top_zip_path=output_solvate_top_zip)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Visualizing 3D structure\n", "Visualizing the **protein-ligand complex** with the newly added **solvent box** using **NGL**
\n", "Note the **octahedral box** filled with **water molecules** surrounding the **protein structure**, which is **centered** right in the middle of the box." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "#Show protein\n", "view = nglview.show_structure_file(output_solvate_gro)\n", "view.clear_representations()\n", "view.add_representation(repr_type='cartoon', selection='protein', color='sstruc')\n", "view.add_representation(repr_type='licorice', radius='.5', selection=ligandCode)\n", "view.add_representation(repr_type='line', linewidth='1', selection='SOL', opacity='.3')\n", "view._remote_call('setSize', target='Widget', args=['','600px'])\n", "view.camera='orthographic'\n", "view" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "***\n", "## Adding ions\n", "Add ions to neutralize the **protein-ligand complex** and reach a desired ionic concentration.\n", "- [Step 1](#ionsStep1): Creating portable binary run file for ion generation\n", "- [Step 2](#ionsStep2): Adding ions to **neutralize** the system and reach a **0.05 molar ionic concentration**\n", "***\n", "**Building Blocks** used:\n", " - [Grompp](https://biobb-md.readthedocs.io/en/latest/gromacs.html#module-gromacs.grompp) from **biobb_gromacs.gromacs.grompp** \n", " - [Genion](https://biobb-md.readthedocs.io/en/latest/gromacs.html#module-gromacs.genion) from **biobb_gromacs.gromacs.genion** \n", "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Step 1: Creating portable binary run file for ion generation" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Grompp: Creating portable binary run file for ion generation\n", "from biobb_gromacs.gromacs.grompp import grompp\n", "\n", "# Create prop dict and inputs/outputs\n", "prop = {\n", " 'mdp':{\n", " 'nsteps':'5000'\n", " },\n", " 'simulation_type':'minimization',\n", " 'maxwarn': 1\n", "}\n", "output_gppion_tpr = pdbCode+'_'+ligandCode+'_complex_gppion.tpr'\n", "\n", "# Create and launch bb\n", "grompp(input_gro_path=output_solvate_gro,\n", " input_top_zip_path=output_solvate_top_zip, \n", " output_tpr_path=output_gppion_tpr,\n", " properties=prop)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Step 2: Adding ions to neutralize the system and reach a 0.05 molar concentration\n", "Replace **solvent molecules** with **ions** to **neutralize** the system and reaching a **0.05 molar ionic concentration**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Genion: Adding ions to reach a 0.05 molar concentration\n", "from biobb_gromacs.gromacs.genion import genion\n", "\n", "# Create prop dict and inputs/outputs\n", "prop={\n", " 'neutral':True,\n", " 'concentration':0.05\n", "}\n", "output_genion_gro = pdbCode+'_'+ligandCode+'_genion.gro'\n", "output_genion_top_zip = pdbCode+'_'+ligandCode+'_genion_top.zip'\n", "\n", "# Create and launch bb\n", "genion(input_tpr_path=output_gppion_tpr,\n", " output_gro_path=output_genion_gro, \n", " input_top_zip_path=output_solvate_top_zip,\n", " output_top_zip_path=output_genion_top_zip, \n", " properties=prop)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Visualizing 3D structure\n", "Visualizing the **protein-ligand complex** with the newly added **ionic concentration** using **NGL**" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "#Show protein\n", "view = nglview.show_structure_file(output_genion_gro)\n", "view.clear_representations()\n", "view.add_representation(repr_type='cartoon', selection='protein', color='sstruc')\n", "view.add_representation(repr_type='licorice', radius='.5', selection=ligandCode)\n", "view.add_representation(repr_type='ball+stick', selection='NA')\n", "view.add_representation(repr_type='ball+stick', selection='CL')\n", "view._remote_call('setSize', target='Widget', args=['','600px'])\n", "view.camera='orthographic'\n", "view" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "***\n", "## Energetically minimize the system\n", "Energetically minimize the **protein-ligand complex** till reaching a desired potential energy.\n", "- [Step 1](#emStep1): Creating portable binary run file for energy minimization\n", "- [Step 2](#emStep2): Energetically minimize the **protein-ligand complex** till reaching a force of 500 kJ/mol*nm.\n", "- [Step 3](#emStep3): Checking **energy minimization** results. Plotting energy by time during the **minimization** process.\n", "***\n", "**Building Blocks** used:\n", " - [Grompp](https://biobb-md.readthedocs.io/en/latest/gromacs.html#module-gromacs.grompp) from **biobb_gromacs.gromacs.grompp** \n", " - [Mdrun](https://biobb-md.readthedocs.io/en/latest/gromacs.html#module-gromacs.mdrun) from **biobb_gromacs.gromacs.mdrun** \n", " - [GMXEnergy](https://biobb-analysis.readthedocs.io/en/latest/gromacs.html#module-gromacs.gmx_energy) from **biobb_analysis.gromacs.gmx_energy** \n", "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Step 1: Creating portable binary run file for energy minimization\n", "Method used to run the **energy minimization** is a **steepest descent**, with a **maximum force of 500 kJ/mol\\*nm^2**, and a minimization **step size of 1fs**. The **maximum number of steps** to perform if the maximum force is not reached is **5,000 steps**. " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# Grompp: Creating portable binary run file for mdrun\n", "from biobb_gromacs.gromacs.grompp import grompp\n", "\n", "# Create prop dict and inputs/outputs\n", "prop = {\n", " 'mdp':{\n", " 'nsteps':'5000',\n", " 'emstep': 0.01,\n", " 'emtol':'500'\n", " },\n", " 'simulation_type':'minimization'\n", "}\n", "output_gppmin_tpr = pdbCode+'_'+ligandCode+'_gppmin.tpr'\n", "\n", "# Create and launch bb\n", "grompp(input_gro_path=output_genion_gro,\n", " input_top_zip_path=output_genion_top_zip,\n", " output_tpr_path=output_gppmin_tpr,\n", " properties=prop)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Step 2: Running Energy Minimization\n", "Running **energy minimization** using the **tpr file** generated in the previous step." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# Mdrun: Running minimization\n", "from biobb_gromacs.gromacs.mdrun import mdrun\n", "\n", "# Create prop dict and inputs/outputs\n", "output_min_trr = pdbCode+'_'+ligandCode+'_min.trr'\n", "output_min_gro = pdbCode+'_'+ligandCode+'_min.gro'\n", "output_min_edr = pdbCode+'_'+ligandCode+'_min.edr'\n", "output_min_log = pdbCode+'_'+ligandCode+'_min.log'\n", "\n", "# Create and launch bb\n", "mdrun(input_tpr_path=output_gppmin_tpr,\n", " output_trr_path=output_min_trr, \n", " output_gro_path=output_min_gro,\n", " output_edr_path=output_min_edr, \n", " output_log_path=output_min_log)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Step 3: Checking Energy Minimization results\n", "Checking **energy minimization** results. Plotting **potential energy** by time during the minimization process. " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# GMXEnergy: Getting system energy by time \n", "from biobb_analysis.gromacs.gmx_energy import gmx_energy\n", "\n", "# Create prop dict and inputs/outputs\n", "output_min_ene_xvg = pdbCode+'_'+ligandCode+'_min_ene.xvg'\n", "prop = {\n", " 'terms': [\"Potential\"]\n", "}\n", "\n", "# Create and launch bb\n", "gmx_energy(input_energy_path=output_min_edr, \n", " output_xvg_path=output_min_ene_xvg, \n", " properties=prop)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import plotly.graph_objs as go\n", "\n", "#Read data from file and filter energy values higher than 1000 kJ/mol\n", "with open(output_min_ene_xvg, 'r') as energy_file:\n", " x, y = zip(*[\n", " (float(line.split()[0]), float(line.split()[1]))\n", " for line in energy_file\n", " if not line.startswith((\"#\", \"@\"))\n", " if float(line.split()[1]) < 1000\n", " ])\n", "\n", "# Create a scatter plot\n", "fig = go.Figure(data=go.Scatter(x=x, y=y, mode='lines'))\n", "\n", "# Update layout\n", "fig.update_layout(title=\"Energy Minimization\",\n", " xaxis_title=\"Energy Minimization Step\",\n", " yaxis_title=\"Potential Energy kJ/mol\",\n", " height=600)\n", "\n", "# Show the figure (renderer changes for colab and jupyter)\n", "rend = 'colab' if 'google.colab' in sys.modules else ''\n", "fig.show(renderer=rend)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "***\n", "## Equilibrate the system (NVT)\n", "Equilibrate the **protein-ligand complex** system in NVT ensemble (constant Number of particles, Volume and Temperature). To avoid temperature coupling problems, a new *\"system\"* group will be created including the **protein** + the **ligand** to be assigned to a single thermostatting group.\n", "\n", "- [Step 1](#eqNVTStep1): Creating an index file with a new group including the **protein-ligand complex**.\n", "- [Step 2](#eqNVTStep3): Creating portable binary run file for system equilibration\n", "- [Step 3](#eqNVTStep3): Equilibrate the **protein-ligand complex** with NVT ensemble.\n", "- [Step 4](#eqNVTStep4): Checking **NVT Equilibration** results. Plotting **system temperature** by time during the **NVT equilibration** process. \n", "***\n", "**Building Blocks** used:\n", "- [MakeNdx](https://biobb-md.readthedocs.io/en/latest/gromacs.html#module-gromacs.make_ndx) from **biobb_gromacs.gromacs.make_ndx** \n", "- [Grompp](https://biobb-md.readthedocs.io/en/latest/gromacs.html#module-gromacs.grompp) from **biobb_gromacs.gromacs.grompp** \n", "- [Mdrun](https://biobb-md.readthedocs.io/en/latest/gromacs.html#module-gromacs.mdrun) from **biobb_gromacs.gromacs.mdrun** \n", "- [GMXEnergy](https://biobb-analysis.readthedocs.io/en/latest/gromacs.html#module-gromacs.gmx_energy) from **biobb_analysis.gromacs.gmx_energy** \n", "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Step 1: Creating an index file with a new group including the **protein-ligand complex**" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# MakeNdx: Creating index file with a new group (protein-ligand complex)\n", "from biobb_gromacs.gromacs.make_ndx import make_ndx\n", "\n", "# Create prop dict and inputs/outputs\n", "output_complex_ndx = pdbCode+'_'+ligandCode+'_index.ndx'\n", "prop = {\n", " 'selection': \"\\\"Protein\\\"|\\\"Other\\\"\" \n", "}\n", "\n", "# Create and launch bb\n", "make_ndx(input_structure_path=output_min_gro,\n", " output_ndx_path=output_complex_ndx,\n", " properties=prop)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Step 2: Creating portable binary run file for system equilibration (NVT)\n", "Note that for the purposes of temperature coupling, the **protein-ligand complex** (*Protein_Other*) is considered as a single entity." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# Grompp: Creating portable binary run file for NVT System Equilibration\n", "from biobb_gromacs.gromacs.grompp import grompp\n", "\n", "# Create prop dict and inputs/outputs\n", "output_gppnvt_tpr = pdbCode+'_'+ligandCode+'gppnvt.tpr'\n", "prop = {\n", " 'mdp':{\n", " 'nsteps':'5000',\n", " 'tc-grps': 'Protein_Other Water_and_ions',\n", " 'Define': '-DPOSRES -D' + posresifdef\n", " },\n", " 'simulation_type':'nvt'\n", "}\n", "\n", "# Create and launch bb\n", "grompp(input_gro_path=output_min_gro,\n", " input_top_zip_path=output_genion_top_zip,\n", " input_ndx_path=output_complex_ndx,\n", " output_tpr_path=output_gppnvt_tpr,\n", " properties=prop)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Step 3: Running NVT equilibration" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# Mdrun: Running NVT System Equilibration \n", "from biobb_gromacs.gromacs.mdrun import mdrun\n", "\n", "# Create prop dict and inputs/outputs\n", "output_nvt_trr = pdbCode+'_'+ligandCode+'_nvt.trr'\n", "output_nvt_gro = pdbCode+'_'+ligandCode+'_nvt.gro'\n", "output_nvt_edr = pdbCode+'_'+ligandCode+'_nvt.edr'\n", "output_nvt_log = pdbCode+'_'+ligandCode+'_nvt.log'\n", "output_nvt_cpt = pdbCode+'_'+ligandCode+'_nvt.cpt'\n", "\n", "# Create and launch bb\n", "mdrun(input_tpr_path=output_gppnvt_tpr,\n", " output_trr_path=output_nvt_trr,\n", " output_gro_path=output_nvt_gro,\n", " output_edr_path=output_nvt_edr,\n", " output_log_path=output_nvt_log,\n", " output_cpt_path=output_nvt_cpt)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Step 4: Checking NVT Equilibration results\n", "Checking **NVT Equilibration** results. Plotting **system temperature** by time during the NVT equilibration process. " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# GMXEnergy: Getting system temperature by time during NVT Equilibration \n", "from biobb_analysis.gromacs.gmx_energy import gmx_energy\n", "\n", "# Create prop dict and inputs/outputs\n", "output_nvt_temp_xvg = pdbCode+'_'+ligandCode+'_nvt_temp.xvg'\n", "prop = {\n", " 'terms': [\"Temperature\"]\n", "}\n", "\n", "# Create and launch bb\n", "gmx_energy(input_energy_path=output_nvt_edr, \n", " output_xvg_path=output_nvt_temp_xvg, \n", " properties=prop)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import plotly.graph_objs as go\n", "\n", "# Read temperature data from file \n", "with open(output_nvt_temp_xvg, 'r') as temperature_file:\n", " x, y = zip(*[\n", " (float(line.split()[0]), float(line.split()[1]))\n", " for line in temperature_file\n", " if not line.startswith((\"#\", \"@\"))\n", " ])\n", "\n", "# Create a scatter plot\n", "fig = go.Figure(data=go.Scatter(x=x, y=y, mode='lines+markers'))\n", "\n", "# Update layout\n", "fig.update_layout(title=\"Temperature during NVT Equilibration\",\n", " xaxis_title=\"Time (ps)\",\n", " yaxis_title=\"Temperature (K)\",\n", " height=600)\n", "\n", "# Show the figure (renderer changes for colab and jupyter)\n", "rend = 'colab' if 'google.colab' in sys.modules else ''\n", "fig.show(renderer=rend)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "***\n", "## Equilibrate the system (NPT)\n", "Equilibrate the **protein-ligand complex** system in NPT ensemble (constant Number of particles, Pressure and Temperature) .\n", "- [Step 1](#eqNPTStep1): Creating portable binary run file for system equilibration\n", "- [Step 2](#eqNPTStep2): Equilibrate the **protein-ligand complex** with NPT ensemble.\n", "- [Step 3](#eqNPTStep3): Checking **NPT Equilibration** results. Plotting **system pressure and density** by time during the **NPT equilibration** process.\n", "***\n", "**Building Blocks** used:\n", " - [Grompp](https://biobb-md.readthedocs.io/en/latest/gromacs.html#module-gromacs.grompp) from **biobb_gromacs.gromacs.grompp** \n", " - [Mdrun](https://biobb-md.readthedocs.io/en/latest/gromacs.html#module-gromacs.mdrun) from **biobb_gromacs.gromacs.mdrun** \n", " - [GMXEnergy](https://biobb-analysis.readthedocs.io/en/latest/gromacs.html#module-gromacs.gmx_energy) from **biobb_analysis.gromacs.gmx_energy** \n", "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Step 1: Creating portable binary run file for system equilibration (NPT)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# Grompp: Creating portable binary run file for (NPT) System Equilibration\n", "from biobb_gromacs.gromacs.grompp import grompp\n", "\n", "# Create prop dict and inputs/outputs\n", "output_gppnpt_tpr = pdbCode+'_'+ligandCode+'_gppnpt.tpr'\n", "prop = {\n", " 'mdp':{\n", " 'type': 'npt',\n", " 'nsteps':'5000',\n", " 'tc-grps': 'Protein_Other Water_and_ions',\n", " 'Define': '-DPOSRES -D' + posresifdef\n", " },\n", " 'simulation_type':'npt'\n", "}\n", "\n", "# Create and launch bb\n", "grompp(input_gro_path=output_nvt_gro,\n", " input_top_zip_path=output_genion_top_zip,\n", " input_ndx_path=output_complex_ndx,\n", " output_tpr_path=output_gppnpt_tpr,\n", " input_cpt_path=output_nvt_cpt,\n", " properties=prop)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Step 2: Running NPT equilibration" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# Mdrun: Running NPT System Equilibration\n", "from biobb_gromacs.gromacs.mdrun import mdrun\n", "\n", "# Create prop dict and inputs/outputs\n", "output_npt_trr = pdbCode+'_'+ligandCode+'_npt.trr'\n", "output_npt_gro = pdbCode+'_'+ligandCode+'_npt.gro'\n", "output_npt_edr = pdbCode+'_'+ligandCode+'_npt.edr'\n", "output_npt_log = pdbCode+'_'+ligandCode+'_npt.log'\n", "output_npt_cpt = pdbCode+'_'+ligandCode+'_npt.cpt'\n", "\n", "# Create and launch bb\n", "mdrun(input_tpr_path=output_gppnpt_tpr,\n", " output_trr_path=output_npt_trr,\n", " output_gro_path=output_npt_gro,\n", " output_edr_path=output_npt_edr,\n", " output_log_path=output_npt_log,\n", " output_cpt_path=output_npt_cpt)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Step 3: Checking NPT Equilibration results\n", "Checking **NPT Equilibration** results. Plotting **system pressure and density** by time during the **NPT equilibration** process. " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# GMXEnergy: Getting system pressure and density by time during NPT Equilibration \n", "from biobb_analysis.gromacs.gmx_energy import gmx_energy\n", "\n", "# Create prop dict and inputs/outputs\n", "output_npt_pd_xvg = pdbCode+'_'+ligandCode+'_npt_PD.xvg'\n", "prop = {\n", " 'terms': [\"Pressure\",\"Density\"]\n", "}\n", "\n", "# Create and launch bb\n", "gmx_energy(input_energy_path=output_npt_edr, \n", " output_xvg_path=output_npt_pd_xvg, \n", " properties=prop)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import plotly.graph_objs as go\n", "from plotly.subplots import make_subplots\n", "\n", "# Read pressure and density data from file \n", "with open(output_npt_pd_xvg,'r') as pd_file:\n", " x, y, z = zip(*[\n", " (float(line.split()[0]), float(line.split()[1]), float(line.split()[2]))\n", " for line in pd_file\n", " if not line.startswith((\"#\", \"@\"))\n", " ])\n", "\n", "trace1 = go.Scatter(\n", " x=x,y=y\n", ")\n", "trace2 = go.Scatter(\n", " x=x,y=z\n", ")\n", "\n", "fig = make_subplots(rows=1, cols=2, print_grid=False)\n", "fig.append_trace(trace1, 1, 1)\n", "fig.append_trace(trace2, 1, 2)\n", "\n", "fig.update_layout(\n", " height=500,\n", " title='Pressure and Density during NPT Equilibration',\n", " showlegend=False,\n", " xaxis1_title='Time (ps)',\n", " yaxis1_title='Pressure (bar)',\n", " xaxis2_title='Time (ps)',\n", " yaxis2_title='Density (Kg*m^-3)'\n", ")\n", "\n", "# Show the figure (renderer changes for colab and jupyter)\n", "rend = 'colab' if 'google.colab' in sys.modules else ''\n", "fig.show(renderer=rend)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "***\n", "## Free Molecular Dynamics Simulation\n", "Upon completion of the **two equilibration phases (NVT and NPT)**, the system is now well-equilibrated at the desired temperature and pressure. The **position restraints** can now be released. The last step of the **protein-ligand complex** MD setup is a short, **free MD simulation**, to ensure the robustness of the system. \n", "- [Step 1](#mdStep1): Creating portable binary run file to run a **free MD simulation**.\n", "- [Step 2](#mdStep2): Run short MD simulation of the **protein-ligand complex**.\n", "- [Step 3](#mdStep3): Checking results for the final step of the setup process, the **free MD run**. Plotting **Root Mean Square deviation (RMSd)** and **Radius of Gyration (Rgyr)** by time during the **free MD run** step. \n", "***\n", "**Building Blocks** used:\n", " - [Grompp](https://biobb-md.readthedocs.io/en/latest/gromacs.html#module-gromacs.grompp) from **biobb_gromacs.gromacs.grompp** \n", " - [Mdrun](https://biobb-md.readthedocs.io/en/latest/gromacs.html#module-gromacs.mdrun) from **biobb_gromacs.gromacs.mdrun** \n", " - [GMXRms](https://biobb-analysis.readthedocs.io/en/latest/gromacs.html#module-gromacs.gmx_rms) from **biobb_analysis.gromacs.gmx_rms** \n", " - [GMXRgyr](https://biobb-analysis.readthedocs.io/en/latest/gromacs.html#module-gromacs.gmx_rgyr) from **biobb_analysis.gromacs.gmx_rgyr** \n", "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Step 1: Creating portable binary run file to run a free MD simulation" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# Grompp: Creating portable binary run file for mdrun\n", "from biobb_gromacs.gromacs.grompp import grompp\n", "\n", "# Create prop dict and inputs/outputs\n", "prop = {\n", " 'mdp':{\n", " #'nsteps':'500000' # 1 ns (500,000 steps x 2fs per step)\n", " #'nsteps':'5000' # 10 ps (5,000 steps x 2fs per step)\n", " 'nsteps':'25000' # 50 ps (25,000 steps x 2fs per step)\n", " },\n", " 'simulation_type':'free'\n", "}\n", "output_gppmd_tpr = pdbCode+'_'+ligandCode + '_gppmd.tpr'\n", "\n", "# Create and launch bb\n", "grompp(input_gro_path=output_npt_gro,\n", " input_top_zip_path=output_genion_top_zip,\n", " output_tpr_path=output_gppmd_tpr,\n", " input_cpt_path=output_npt_cpt,\n", " properties=prop)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Step 2: Running short free MD simulation" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# Mdrun: Running free dynamics\n", "from biobb_gromacs.gromacs.mdrun import mdrun\n", "\n", "# Create prop dict and inputs/outputs\n", "output_md_trr = pdbCode+'_'+ligandCode+'_md.trr'\n", "output_md_gro = pdbCode+'_'+ligandCode+'_md.gro'\n", "output_md_edr = pdbCode+'_'+ligandCode+'_md.edr'\n", "output_md_log = pdbCode+'_'+ligandCode+'_md.log'\n", "output_md_cpt = pdbCode+'_'+ligandCode+'_md.cpt'\n", "\n", "# Create and launch bb\n", "mdrun(input_tpr_path=output_gppmd_tpr,\n", " output_trr_path=output_md_trr,\n", " output_gro_path=output_md_gro,\n", " output_edr_path=output_md_edr,\n", " output_log_path=output_md_log,\n", " output_cpt_path=output_md_cpt)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Step 3: Checking free MD simulation results\n", "Checking results for the final step of the setup process, the **free MD run**. Plotting **Root Mean Square deviation (RMSd)** and **Radius of Gyration (Rgyr)** by time during the **free MD run** step. **RMSd** against the **experimental structure** (input structure of the pipeline) and against the **minimized and equilibrated structure** (output structure of the NPT equilibration step)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# GMXRms: Computing Root Mean Square deviation to analyse structural stability \n", "# RMSd against minimized and equilibrated snapshot (backbone atoms) \n", "\n", "from biobb_analysis.gromacs.gmx_rms import gmx_rms\n", "\n", "# Create prop dict and inputs/outputs\n", "output_rms_first = pdbCode+'_'+ligandCode+'_rms_first.xvg'\n", "prop = {\n", " 'selection': 'Backbone'\n", "}\n", "\n", "# Create and launch bb\n", "gmx_rms(input_structure_path=output_gppmd_tpr,\n", " input_traj_path=output_md_trr,\n", " output_xvg_path=output_rms_first, \n", " properties=prop)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# GMXRms: Computing Root Mean Square deviation to analyse structural stability \n", "# RMSd against experimental structure (backbone atoms) \n", "\n", "from biobb_analysis.gromacs.gmx_rms import gmx_rms\n", "\n", "# Create prop dict and inputs/outputs\n", "output_rms_exp = pdbCode+'_'+ligandCode+'_rms_exp.xvg'\n", "prop = {\n", " 'selection': 'Backbone'\n", "}\n", "\n", "# Create and launch bb\n", "gmx_rms(input_structure_path=output_gppmin_tpr,\n", " input_traj_path=output_md_trr,\n", " output_xvg_path=output_rms_exp, \n", " properties=prop)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import plotly.graph_objs as go\n", "\n", "# Read RMS vs first snapshot data from file \n", "with open(output_rms_first,'r') as rms_first_file:\n", " x, y = zip(*[\n", " (float(line.split()[0]), float(line.split()[1]))\n", " for line in rms_first_file\n", " if not line.startswith((\"#\", \"@\"))\n", " ])\n", "\n", "# Read RMS vs experimental structure data from file \n", "with open(output_rms_exp,'r') as rms_exp_file:\n", " x2, y2 = zip(*[\n", " (float(line.split()[0]), float(line.split()[1]))\n", " for line in rms_exp_file\n", " if not line.startswith((\"#\", \"@\"))\n", " ])\n", "\n", "fig = make_subplots()\n", "fig.add_trace(go.Scatter(x=x, y=y, mode=\"lines+markers\", name=\"RMSd vs first\"))\n", "fig.add_trace(go.Scatter(x=x, y=y2, mode=\"lines+markers\", name=\"RMSd vs exp\"))\n", "\n", "# Set layout including height\n", "fig.update_layout(\n", " title=\"RMSd during free MD Simulation\",\n", " xaxis=dict(title=\"Time (ps)\"),\n", " yaxis=dict(title=\"RMSd (nm)\"),\n", " height=600\n", ")\n", "\n", "# Show the figure (renderer changes for colab and jupyter)\n", "rend = 'colab' if 'google.colab' in sys.modules else ''\n", "fig.show(renderer=rend)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# GMXRgyr: Computing Radius of Gyration to measure the protein compactness during the free MD simulation \n", "\n", "from biobb_analysis.gromacs.gmx_rgyr import gmx_rgyr\n", "\n", "# Create prop dict and inputs/outputs\n", "output_rgyr = pdbCode+'_'+ligandCode+'_rgyr.xvg'\n", "prop = {\n", " 'selection': 'Backbone'\n", "}\n", "\n", "# Create and launch bb\n", "gmx_rgyr(input_structure_path=output_gppmin_tpr,\n", " input_traj_path=output_md_trr,\n", " output_xvg_path=output_rgyr, \n", " properties=prop)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import plotly.graph_objs as go\n", "\n", "# Read Rgyr data from file\n", "with open(output_rgyr, 'r') as rgyr_file:\n", " x, y = zip(*[\n", " (float(line.split()[0]), float(line.split()[1]))\n", " for line in rgyr_file\n", " if not line.startswith((\"#\", \"@\"))\n", " ])\n", "\n", "# Create a scatter plot\n", "fig = go.Figure(data=go.Scatter(x=x, y=y, mode='lines+markers'))\n", "\n", "# Update layout\n", "fig.update_layout(title=\"Radius of Gyration\",\n", " xaxis_title=\"Time (ps)\",\n", " yaxis_title=\"Rgyr (nm)\",\n", " height=600)\n", "\n", "# Show the figure (renderer changes for colab and jupyter)\n", "rend = 'colab' if 'google.colab' in sys.modules else ''\n", "fig.show(renderer=rend)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "***\n", "## Post-processing and Visualizing resulting 3D trajectory\n", "Post-processing and Visualizing the **protein-ligand complex system** MD setup **resulting trajectory** using **NGL**\n", "- [Step 1](#ppStep1): *Imaging* the resulting trajectory, **stripping out water molecules and ions** and **correcting periodicity issues**.\n", "- [Step 2](#ppStep2): Generating a *dry* structure, **removing water molecules and ions** from the final snapshot of the MD setup pipeline.\n", "- [Step 3](#ppStep3): Visualizing the *imaged* trajectory using the *dry* structure as a **topology**. \n", "***\n", "**Building Blocks** used:\n", " - [GMXImage](https://biobb-analysis.readthedocs.io/en/latest/gromacs.html#module-gromacs.gmx_image) from **biobb_analysis.gromacs.gmx_image** \n", " - [GMXTrjConvStr](https://biobb-analysis.readthedocs.io/en/latest/gromacs.html#module-gromacs.gmx_trjconv_str) from **biobb_analysis.gromacs.gmx_trjconv_str** \n", "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Step 1: *Imaging* the resulting trajectory.\n", "Stripping out **water molecules and ions** and **correcting periodicity issues** " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# GMXImage: \"Imaging\" the resulting trajectory\n", "# Removing water molecules and ions from the resulting structure\n", "from biobb_analysis.gromacs.gmx_image import gmx_image\n", "\n", "# Create prop dict and inputs/outputs\n", "output_imaged_traj = pdbCode+'_imaged_traj.trr'\n", "prop = {\n", " 'center_selection': 'Protein_Other',\n", " 'output_selection': 'Protein_Other',\n", " 'pbc' : 'mol',\n", " 'center' : True\n", "}\n", "\n", "# Create and launch bb\n", "gmx_image(input_traj_path=output_md_trr,\n", " input_top_path=output_gppmd_tpr,\n", " input_index_path=output_complex_ndx,\n", " output_traj_path=output_imaged_traj, \n", " properties=prop)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Step 2: Generating the output *dry* structure.\n", "**Removing water molecules and ions** from the resulting structure" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# GMXTrjConvStr: Converting and/or manipulating a structure\n", "# Removing water molecules and ions from the resulting structure\n", "# The \"dry\" structure will be used as a topology to visualize \n", "# the \"imaged dry\" trajectory generated in the previous step.\n", "from biobb_analysis.gromacs.gmx_trjconv_str import gmx_trjconv_str\n", "\n", "# Create prop dict and inputs/outputs\n", "output_dry_gro = pdbCode+'_md_dry.gro'\n", "prop = {\n", " 'selection': 'Protein_Other'\n", "}\n", "\n", "# Create and launch bb\n", "gmx_trjconv_str(input_structure_path=output_md_gro,\n", " input_top_path=output_gppmd_tpr,\n", " input_index_path=output_complex_ndx,\n", " output_str_path=output_dry_gro, \n", " properties=prop)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Step 3: Visualizing the generated dehydrated trajectory.\n", "Using the **imaged trajectory** (output of the [Post-processing step 1](#ppStep1)) with the **dry structure** (output of the [Post-processing step 2](#ppStep2)) as a topology." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# Show trajectory\n", "view = nglview.show_simpletraj(nglview.SimpletrajTrajectory(output_imaged_traj, output_dry_gro), gui=True)\n", "view" ] }, { "cell_type": "markdown", "metadata": { "variables": { "output_genion_top_zip": "

NameError: name 'output_genion_top_zip' is not defined

\n", "output_gppmd_tpr": "

NameError: name 'output_gppmd_tpr' is not defined

\n", "output_md_cpt": "

NameError: name 'output_md_cpt' is not defined

\n", "output_md_gro": "

NameError: name 'output_md_gro' is not defined

\n", "output_md_trr": "

NameError: name 'output_md_trr' is not defined

\n", "output_rgyr": "

NameError: name 'output_rgyr' is not defined

\n", "output_rms_exp": "

NameError: name 'output_rms_exp' is not defined

\n", "output_rms_first": "

NameError: name 'output_rms_first' is not defined

\n" } }, "source": [ "\n", "## Output files\n", "\n", "Important **Output files** generated:\n", " - {{output_md_gro}}: **Final structure** (snapshot) of the MD setup protocol.\n", " - {{output_md_trr}}: **Final trajectory** of the MD setup protocol.\n", " - {{output_md_cpt}}: **Final checkpoint file**, with information about the state of the simulation. It can be used to **restart** or **continue** a MD simulation.\n", " - {{output_gppmd_tpr}}: **Final tpr file**, GROMACS portable binary run input file. This file contains the starting structure of the **MD setup free MD simulation step**, together with the molecular topology and all the simulation parameters. It can be used to **extend** the simulation.\n", " - {{output_genion_top_zip}}: **Final topology** of the MD system. It is a compressed zip file including a **topology file** (.top) and a set of auxiliary **include topology** files (.itp).\n", "\n", "**Analysis** (MD setup check) output files generated:\n", " - {{output_rms_first}}: **Root Mean Square deviation (RMSd)** against **minimized and equilibrated structure** of the final **free MD run step**.\n", " - {{output_rms_exp}}: **Root Mean Square deviation (RMSd)** against **experimental structure** of the final **free MD run step**.\n", " - {{output_rgyr}}: **Radius of Gyration** of the final **free MD run step** of the **setup pipeline**.\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "***\n", "\n", "\n", "## Questions & Comments\n", "\n", "Questions, issues, suggestions and comments are really welcome!\n", "\n", "* GitHub issues:\n", " * [https://github.com/bioexcel/biobb](https://github.com/bioexcel/biobb)\n", "\n", "* BioExcel forum:\n", " * [https://ask.bioexcel.eu/c/BioExcel-Building-Blocks-library](https://ask.bioexcel.eu/c/BioExcel-Building-Blocks-library)\n" ] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.3" } }, "nbformat": 4, "nbformat_minor": 4 }