{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Protein Ligand Complex MD Setup tutorial using BioExcel Building Blocks (biobb)\n",
"### Based on the official Gromacs tutorial: http://www.mdtutorials.com/gmx/complex/index.html\n",
"***\n",
"This tutorial aims to illustrate the process of **setting up a simulation system** containing a **protein in complex with a ligand**, step by step, using the **BioExcel Building Blocks library (biobb)**. The particular example used is the **T4 lysozyme** L99A/M102Q protein (PDB code 3HTB, [https://doi.org/10.2210/pdb3HTB/pdb](https://doi.org/10.2210/pdb3HTB/pdb)), in complex with the **2-propylphenol** small molecule (3-letter Code JZ4, [https://www.rcsb.org/ligand/JZ4](https://www.rcsb.org/ligand/JZ4)). \n",
"***\n",
"**Biobb modules** used:\n",
"\n",
" - [biobb_io](https://github.com/bioexcel/biobb_io): Tools to fetch biomolecular data from public databases.\n",
" - [biobb_model](https://github.com/bioexcel/biobb_model): Tools to model macromolecular structures.\n",
" - [biobb_chemistry](https://github.com/bioexcel/biobb_chemistry): Tools to manipulate chemical data.\n",
" - [biobb_gromacs](https://github.com/bioexcel/biobb_gromacs): Tools to setup and run Molecular Dynamics simulations.\n",
" - [biobb_analysis](https://github.com/bioexcel/biobb_analysis): Tools to analyse Molecular Dynamics trajectories.\n",
" - [biobb_structure_utils](https://github.com/bioexcel/biobb_structure_utils): Tools to modify or extract information from a PDB structure file.\n",
" \n",
"### Auxiliary libraries used\n",
"\n",
"* [jupyter](https://jupyter.org/): Free software, open standards, and web services for interactive computing across all programming languages.\n",
"* [nglview](http://nglviewer.org/#nglview): Jupyter/IPython widget to interactively view molecular structures and trajectories in notebooks.\n",
"* [plotly](https://plot.ly/python/offline/): Python interactive graphing library integrated in Jupyter notebooks.\n",
"* [simpletraj](https://github.com/arose/simpletraj): Lightweight coordinate-only trajectory reader based on code from GROMACS, MDAnalysis and VMD.\n",
"\n",
"### Conda Installation and Launch\n",
"\n",
"```console\n",
"git clone https://github.com/bioexcel/biobb_wf_protein-complex_md_setup.git\n",
"cd biobb_wf_protein-complex_md_setup\n",
"conda env create -f conda_env/environment.yml\n",
"conda activate biobb_Protein-Complex_MDsetup_tutorial\n",
"jupyter-notebook biobb_wf_protein-complex_md_setup/notebooks/biobb_Protein-Complex_MDsetup_tutorial.ipynb\n",
"```\n",
"\n",
"***\n",
"### Pipeline steps:\n",
" 1. [Input Parameters](#input)\n",
" 2. [Fetching PDB Structure](#fetch)\n",
" 3. [Fix Protein Structure](#fix)\n",
" 4. [Create Protein System Topology](#top)\n",
" 5. [Create ligand system topology](#ligtop)\n",
" 6. [Preparing Ligand Restraints](#restraints)\n",
" 7. [Create new protein-ligand complex structure file](#complex)\n",
" 8. [Create new protein-ligand complex topology file](#complextop)\n",
" 9. [Create Solvent Box](#box)\n",
" 10. [Fill the Box with Water Molecules](#water)\n",
" 11. [Adding Ions](#ions)\n",
" 12. [Energetically Minimize the System](#min)\n",
" 13. [Equilibrate the System (NVT)](#nvt)\n",
" 14. [Equilibrate the System (NPT)](#npt)\n",
" 15. [Free Molecular Dynamics Simulation](#free)\n",
" 16. [Post-processing and Visualizing Resulting 3D Trajectory](#post)\n",
" 17. [Output Files](#output)\n",
" 18. [Questions & Comments](#questions)\n",
" \n",
"***\n",
"***\n",
"\n",
"***\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"jp-MarkdownHeadingCollapsed": true
},
"source": [
"## Initializing colab\n",
"The two cells below are used only in case this notebook is executed via **Google Colab**. Take into account that, for running conda on **Google Colab**, the **condacolab** library must be installed. As [explained here](https://pypi.org/project/condacolab/), the installation requires a **kernel restart**, so when running this notebook in **Google Colab**, don't run all cells until this **installation** is properly **finished** and the **kernel** has **restarted**."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Only executed when using google colab\n",
"import sys\n",
"if 'google.colab' in sys.modules:\n",
" import subprocess\n",
" from pathlib import Path\n",
" try:\n",
" subprocess.run([\"conda\", \"-V\"], check=True)\n",
" except FileNotFoundError:\n",
" subprocess.run([sys.executable, \"-m\", \"pip\", \"install\", \"condacolab\"], check=True)\n",
" import condacolab\n",
" condacolab.install()\n",
" # Clone repository\n",
" repo_URL = \"https://github.com/bioexcel/biobb_wf_protein-complex_md_setup.git\"\n",
" repo_name = Path(repo_URL).name.split('.')[0]\n",
" if not Path(repo_name).exists():\n",
" subprocess.run([\"mamba\", \"install\", \"-y\", \"git\"], check=True)\n",
" subprocess.run([\"git\", \"clone\", repo_URL], check=True)\n",
" print(\"⏬ Repository properly cloned.\")\n",
" # Install environment\n",
" print(\"⏳ Creating environment...\")\n",
" env_file_path = f\"{repo_name}/conda_env/environment.yml\"\n",
" subprocess.run([\"mamba\", \"env\", \"update\", \"-n\", \"base\", \"-f\", env_file_path], check=True)\n",
" print(\"🎨 Install NGLView dependencies...\")\n",
" subprocess.run([\"mamba\", \"install\", \"-y\", \"-c\", \"conda-forge\", \"nglview==3.0.8\", \"ipywidgets=7.7.2\"], check=True)\n",
" print(\"👍 Conda environment successfully created and updated.\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Enable widgets for colab\n",
"if 'google.colab' in sys.modules:\n",
" from google.colab import output\n",
" output.enable_custom_widget_manager()\n",
" import os\n",
" os.chdir(\"biobb_wf_protein-complex_md_setup/biobb_wf_protein-complex_md_setup/notebooks\")\n",
" print(f\"📂 New working directory: {os.getcwd()}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"## Input parameters\n",
"**Input parameters** needed:\n",
" - **pdbCode**: PDB code of the protein-ligand complex structure (e.g. 3HTB, [https://doi.org/10.2210/pdb3HTB/pdb](https://doi.org/10.2210/pdb3HTB/pdb))\n",
" - **ligandCode**: Small molecule 3-letter code for the ligand structure (e.g. JZ4, [https://www.rcsb.org/ligand/JZ4](https://www.rcsb.org/ligand/JZ4))\n",
" - **mol_charge**: Charge of the small molecule, needed to add hydrogen atoms."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"import nglview\n",
"import ipywidgets\n",
"import os\n",
"import zipfile\n",
"\n",
"pdbCode = \"3HTB\"\n",
"ligandCode = \"JZ4\"\n",
"mol_charge = 0"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"***\n",
"## Fetching PDB structure\n",
"Downloading **PDB structure** with the **protein-ligand complex** from the RCSB PDB database.
\n",
"Alternatively, a PDB file can be used as starting structure.
\n",
"Splitting the molecule in **three different files**: \n",
"- **proteinFile**: Protein structure\n",
"- **ligandFile**: Ligand structure\n",
"- **complexFile**: Protein-ligand complex structure \n",
"\n",
"***\n",
"**Building Blocks** used:\n",
" - [Pdb](https://biobb-io.readthedocs.io/en/latest/api.html#module-api.pdb) from **biobb_io.api.pdb**\n",
"***"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"# Downloading desired PDB file \n",
"# Import module\n",
"from biobb_io.api.pdb import pdb\n",
"\n",
"# Create properties dict and inputs/outputs\n",
"downloaded_pdb = pdbCode+'.orig.pdb'\n",
"prop = {\n",
" 'pdb_code': pdbCode,\n",
" 'filter': False\n",
"}\n",
"\n",
"# Create and launch bb\n",
"pdb(output_pdb_path=downloaded_pdb,\n",
" properties=prop)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Extracting Protein, Ligand and Protein-Ligand Complex to three different files\n",
"# Import module\n",
"from biobb_structure_utils.utils.extract_heteroatoms import extract_heteroatoms\n",
"from biobb_structure_utils.utils.extract_molecule import extract_molecule\n",
"from biobb_structure_utils.utils.cat_pdb import cat_pdb\n",
"\n",
"# Create properties dict and inputs/outputs\n",
"proteinFile = pdbCode+'.pdb'\n",
"ligandFile = ligandCode+'.pdb'\n",
"complexFile = pdbCode+'_'+ligandCode+'.pdb'\n",
"\n",
"prop = {\n",
" 'heteroatoms' : [{\"name\": \"JZ4\"}]\n",
"}\n",
"\n",
"extract_heteroatoms(input_structure_path=downloaded_pdb,\n",
" output_heteroatom_path=ligandFile,\n",
" properties=prop)\n",
"\n",
"extract_molecule(input_structure_path=downloaded_pdb,\n",
" output_molecule_path=proteinFile)\n",
"\n",
"print(proteinFile, ligandFile, complexFile)\n",
"\n",
"cat_pdb(input_structure1=proteinFile,\n",
" input_structure2=ligandFile,\n",
" output_structure_path=complexFile)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Visualizing 3D structures\n",
"Visualizing the generated **PDB structures** using **NGL**: \n",
"- **Protein structure** (Left)\n",
"- **Ligand structure** (Center)\n",
"- **Protein-ligand complex** (Right) "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"# Show structures: protein, ligand and protein-ligand complex\n",
"view1 = nglview.show_structure_file(proteinFile)\n",
"view1._remote_call('setSize', target='Widget', args=['350px','400px'])\n",
"view1.camera='orthographic'\n",
"view1\n",
"view2 = nglview.show_structure_file(ligandFile)\n",
"view2.add_representation(repr_type='ball+stick')\n",
"view2._remote_call('setSize', target='Widget', args=['350px','400px'])\n",
"view2.camera='orthographic'\n",
"view2\n",
"view3 = nglview.show_structure_file(complexFile)\n",
"view3.add_representation(repr_type='licorice', radius='.5', selection=ligandCode)\n",
"view3._remote_call('setSize', target='Widget', args=['350px','400px'])\n",
"view3.camera='orthographic'\n",
"view3\n",
"ipywidgets.HBox([view1, view2, view3])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"***\n",
"## Fix protein structure\n",
"**Checking** and **fixing** (if needed) the protein structure:
\n",
"- **Modeling** **missing side-chain atoms**, modifying incorrect **amide assignments**, choosing **alternative locations**.
\n",
"- **Checking** for missing **backbone atoms**, **heteroatoms**, **modified residues** and possible **atomic clashes**.\n",
"\n",
"***\n",
"**Building Blocks** used:\n",
" - [FixSideChain](https://biobb-model.readthedocs.io/en/latest/model.html#module-model.fix_side_chain) from **biobb_model.model.fix_side_chain**\n",
"***"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Check & Fix Protein Structure\n",
"# Import module\n",
"from biobb_model.model.fix_side_chain import fix_side_chain\n",
"\n",
"# Create prop dict and inputs/outputs\n",
"fixed_pdb = pdbCode+'_fixed.pdb'\n",
"\n",
"# Create and launch bb\n",
"fix_side_chain(input_pdb_path=proteinFile,\n",
" output_pdb_path=fixed_pdb)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"***\n",
"## Create protein system topology\n",
"**Building GROMACS topology** corresponding to the protein structure.
\n",
"Force field used in this tutorial is [**amber99sb-ildn**](https://dx.doi.org/10.1002%2Fprot.22711): AMBER **parm99** force field with **corrections on backbone** (sb) and **side-chain torsion potentials** (ildn). Water molecules type used in this tutorial is [**spc/e**](https://pubs.acs.org/doi/abs/10.1021/j100308a038).
\n",
"Adding **hydrogen atoms** if missing. Automatically identifying **disulfide bridges**.
\n",
"\n",
"Generating two output files: \n",
"- **GROMACS structure** (gro file)\n",
"- **GROMACS topology** ZIP compressed file containing:\n",
" - *GROMACS topology top file* (top file)\n",
" - *GROMACS position restraint file/s* (itp file/s)\n",
"***\n",
"**Building Blocks** used:\n",
" - [Pdb2gmx](https://biobb-md.readthedocs.io/en/latest/gromacs.html#module-gromacs.pdb2gmx) from **biobb_gromacs.gromacs.pdb2gmx**\n",
"***"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Create Protein system topology\n",
"# Import module\n",
"from biobb_gromacs.gromacs.pdb2gmx import pdb2gmx\n",
"\n",
"# Create inputs/outputs\n",
"output_pdb2gmx_gro = pdbCode+'_pdb2gmx.gro'\n",
"output_pdb2gmx_top_zip = pdbCode+'_pdb2gmx_top.zip'\n",
"prop = {\n",
" 'force_field' : 'amber99sb-ildn',\n",
" 'water_type': 'spce'\n",
"}\n",
"\n",
"# Create and launch bb\n",
"pdb2gmx(input_pdb_path=fixed_pdb,\n",
" output_gro_path=output_pdb2gmx_gro,\n",
" output_top_zip_path=output_pdb2gmx_top_zip,\n",
" properties=prop)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"***\n",
"## Create ligand system topology\n",
"**Building GROMACS topology** corresponding to the ligand structure.
\n",
"Force field used in this tutorial step is **amberGAFF**: [General AMBER Force Field](http://ambermd.org/antechamber/gaff.html), designed for rational drug design.
\n",
"- [Step 1](#ligandTopologyStep1): Add **hydrogen atoms** if missing.\n",
"- [Step 2](#ligandTopologyStep2): **Energetically minimize the system** with the new hydrogen atoms. \n",
"- [Step 3](#ligandTopologyStep3): Generate **ligand topology** (parameters). \n",
"***\n",
"**Building Blocks** used:\n",
" - [ReduceAddHydrogens](https://biobb-chemistry.readthedocs.io/en/latest/ambertools.html#module-ambertools.reduce_add_hydrogens) from **biobb_chemistry.ambertools.reduce_add_hydrogens**\n",
" - [BabelMinimize](https://biobb-chemistry.readthedocs.io/en/latest/babelm.html#module-babelm.babel_minimize) from **biobb_chemistry.babelm.babel_minimize** \n",
" - [AcpypeParamsGMX](https://biobb-chemistry.readthedocs.io/en/latest/acpype.html#module-acpype.acpype_params_gmx) from **biobb_chemistry.acpype.acpype_params_gmx** \n",
"***"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"### Step 1: Add **hydrogen atoms**"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Create Ligand system topology, STEP 1\n",
"# Reduce_add_hydrogens: add Hydrogen atoms to a small molecule (using Reduce tool from Ambertools package)\n",
"# Import module\n",
"from biobb_chemistry.ambertools.reduce_add_hydrogens import reduce_add_hydrogens\n",
"\n",
"# Create prop dict and inputs/outputs\n",
"output_reduce_h = ligandCode+'.reduce.H.pdb' \n",
"prop = {\n",
" 'nuclear' : 'true'\n",
"}\n",
"\n",
"# Create and launch bb\n",
"reduce_add_hydrogens(input_path=ligandFile,\n",
" output_path=output_reduce_h,\n",
" properties=prop)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"### Step 2: **Energetically minimize the system** with the new hydrogen atoms. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Create Ligand system topology, STEP 2\n",
"# Babel_minimize: Structure energy minimization of a small molecule after being modified adding hydrogen atoms\n",
"# Import module\n",
"from biobb_chemistry.babelm.babel_minimize import babel_minimize\n",
"\n",
"# Create prop dict and inputs/outputs\n",
"output_babel_min = ligandCode+'.H.min.mol2' \n",
"prop = {\n",
" 'method' : 'sd',\n",
" 'criteria' : '1e-10',\n",
" 'force_field' : 'GAFF'\n",
"}\n",
"\n",
"\n",
"# Create and launch bb\n",
"babel_minimize(input_path=output_reduce_h,\n",
" output_path=output_babel_min,\n",
" properties=prop)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Visualizing 3D structures\n",
"Visualizing the small molecule generated **PDB structures** using **NGL**: \n",
"- **Original Ligand Structure** (Left)\n",
"- **Ligand Structure with hydrogen atoms added** (with Reduce program) (Center)\n",
"- **Ligand Structure with hydrogen atoms added** (with Reduce program), **energy minimized** (with Open Babel) (Right) "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"# Show different structures generated (for comparison)\n",
"\n",
"view1 = nglview.show_structure_file(ligandFile)\n",
"view1.add_representation(repr_type='ball+stick')\n",
"view1._remote_call('setSize', target='Widget', args=['350px','400px'])\n",
"view1.camera='orthographic'\n",
"view1\n",
"view2 = nglview.show_structure_file(output_reduce_h)\n",
"view2.add_representation(repr_type='ball+stick')\n",
"view2._remote_call('setSize', target='Widget', args=['350px','400px'])\n",
"view2.camera='orthographic'\n",
"view2\n",
"view3 = nglview.show_structure_file(output_babel_min)\n",
"view3.add_representation(repr_type='ball+stick')\n",
"view3._remote_call('setSize', target='Widget', args=['350px','400px'])\n",
"view3.camera='orthographic'\n",
"view3\n",
"ipywidgets.HBox([view1, view2, view3])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"### Step 3: Generate **ligand topology** (parameters)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Create Ligand system topology, STEP 3\n",
"# Acpype_params_gmx: Generation of topologies for GROMACS with ACPype\n",
"# Import module\n",
"from biobb_chemistry.acpype.acpype_params_gmx import acpype_params_gmx\n",
"\n",
"# Create prop dict and inputs/outputs\n",
"output_acpype_gro = ligandCode+'params.gro'\n",
"output_acpype_itp = ligandCode+'params.itp'\n",
"output_acpype_top = ligandCode+'params.top'\n",
"output_acpype = ligandCode+'params'\n",
"prop = {\n",
" 'basename' : output_acpype,\n",
" 'charge' : mol_charge\n",
"}\n",
"\n",
"# Create and launch bb\n",
"acpype_params_gmx(input_path=output_babel_min, \n",
" output_path_gro=output_acpype_gro,\n",
" output_path_itp=output_acpype_itp,\n",
" output_path_top=output_acpype_top,\n",
" properties=prop)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"***\n",
"## Preparing Ligand Restraints\n",
"In subsequent steps of the pipeline, such as the equilibration stages of the **protein-ligand complex** system, it is recommended to apply some **restraints** to the small molecule, to avoid a possible change in position due to protein repulsion. **Position restraints** will be applied to the ligand, using a **force constant of 1000 KJ/mol\\*nm^2** on the three coordinates: x, y and z. In this steps the **restriction files** will be created and integrated in the **ligand topology**.\n",
"- [Step 1](#restraintsStep1): Creating an index file with a new group including just the **small molecule heavy atoms**.\n",
"- [Step 2](#restraintsStep2): Generating the **position restraints** file.\n",
"***\n",
"**Building Blocks** used:\n",
" - [MakeNdx](https://biobb-md.readthedocs.io/en/latest/gromacs.html#module-gromacs.make_ndx) from **biobb_gromacs.gromacs.make_ndx** \n",
" - [Genrestr](https://biobb-md.readthedocs.io/en/latest/gromacs.html#module-gromacs.genrestr) from **biobb_gromacs.gromacs.genrestr** \n",
"***"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"### Step 1: Creating an index file for the small molecule heavy atoms"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# MakeNdx: Creating index file with a new group (small molecule heavy atoms)\n",
"from biobb_gromacs.gromacs.make_ndx import make_ndx\n",
"\n",
"# Create prop dict and inputs/outputs\n",
"output_ligand_ndx = ligandCode+'_index.ndx'\n",
"prop = {\n",
" 'selection': \"0 & ! a H*\"\n",
"}\n",
"\n",
"# Create and launch bb\n",
"make_ndx(input_structure_path=output_acpype_gro,\n",
" output_ndx_path=output_ligand_ndx,\n",
" properties=prop)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"### Step 2: Generating the position restraints file"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Genrestr: Generating the position restraints file\n",
"from biobb_gromacs.gromacs.genrestr import genrestr\n",
"\n",
"# Create prop dict and inputs/outputs\n",
"output_restraints_top = ligandCode+'_posres.itp'\n",
"prop = {\n",
" 'force_constants': \"1000 1000 1000\",\n",
" 'restrained_group': \"System\"\n",
"}\n",
"\n",
"# Create and launch bb\n",
"genrestr(input_structure_path=output_acpype_gro,\n",
" input_ndx_path=output_ligand_ndx,\n",
" output_itp_path=output_restraints_top,\n",
" properties=prop)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"***\n",
"## Create new protein-ligand complex structure file\n",
"Building new **protein-ligand complex** PDB file with:\n",
"- The new **protein system** with fixed problems from *Fix Protein Structure* step and hydrogens atoms added from *Create Protein System Topology* step.\n",
"- The new **ligand system** with hydrogens atoms added from *Create Ligand System Topology* step. \n",
"\n",
"This new structure is needed for **GROMACS** as it is **force field-compliant**, it **has all the new hydrogen atoms**, and the **atom names are matching the newly generated protein and ligand topologies**.\n",
"***\n",
"**Building Blocks** used:\n",
" - [GMXTrjConvStr](https://biobb-analysis.readthedocs.io/en/latest/gromacs.html#module-gromacs.gmx_trjconv_str) from **biobb_analysis.gromacs.gmx_trjconv_str**\n",
"***"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# biobb analysis module\n",
"from biobb_analysis.gromacs.gmx_trjconv_str import gmx_trjconv_str\n",
"from biobb_structure_utils.utils.cat_pdb import cat_pdb\n",
"\n",
"# Convert gro (with hydrogens) to pdb (PROTEIN)\n",
"proteinFile_H = pdbCode+'_'+ligandCode+'_complex_H.pdb'\n",
"prop = {\n",
" 'selection' : 'System'\n",
"}\n",
"\n",
"# Create and launch bb\n",
"gmx_trjconv_str(input_structure_path=output_pdb2gmx_gro,\n",
" input_top_path=output_pdb2gmx_gro,\n",
" output_str_path=proteinFile_H, \n",
" properties=prop)\n",
"\n",
"# Convert gro (with hydrogens) to pdb (LIGAND)\n",
"ligandFile_H = ligandCode+'_complex_H.pdb'\n",
"prop = {\n",
" 'selection' : 'System'\n",
"}\n",
"\n",
"# Create and launch bb\n",
"gmx_trjconv_str(input_structure_path=output_acpype_gro,\n",
" input_top_path=output_acpype_gro,\n",
" output_str_path=ligandFile_H, \n",
" properties=prop)\n",
"\n",
"\n",
"# Concatenating both PDB files: Protein + Ligand\n",
"complexFile_H = pdbCode+'_'+ligandCode+'_H.pdb'\n",
"\n",
"# Create and launch bb\n",
"cat_pdb(input_structure1=proteinFile_H,\n",
" input_structure2=ligandFile_H,\n",
" output_structure_path=complexFile_H)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"***\n",
"## Create new protein-ligand complex topology file\n",
"Building new **protein-ligand complex** GROMACS topology file with:\n",
"- The new **protein system** topology generated from *Create Protein System Topology* step.\n",
"- The new **ligand system** topology generated from *Create Ligand System Topology* step. \n",
"\n",
"NOTE: From this point on, the **protein-ligand complex structure and topology** generated can be used in a regular MD setup.\n",
"***\n",
"**Building Blocks** used:\n",
" - [AppendLigand](https://biobb-md.readthedocs.io/en/latest/modules.html) from **biobb_gromacs.gromacs_extra.append_ligand** (NOTE: link should be updated with the documentation)\n",
"***"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# AppendLigand: Append a ligand to a GROMACS topology\n",
"# Import module\n",
"from biobb_gromacs.gromacs_extra.append_ligand import append_ligand\n",
"\n",
"# Create prop dict and inputs/outputs\n",
"output_complex_top = pdbCode+'_'+ligandCode+'_complex.top.zip'\n",
"\n",
"posresifdef = \"POSRES_\"+ligandCode.upper()\n",
"prop = {\n",
" 'posres_name': posresifdef\n",
"}\n",
"\n",
"# Create and launch bb\n",
"append_ligand(input_top_zip_path=output_pdb2gmx_top_zip,\n",
" input_posres_itp_path=output_restraints_top,\n",
" input_itp_path=output_acpype_itp, \n",
" output_top_zip_path=output_complex_top,\n",
" properties=prop)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"***\n",
"## Create solvent box\n",
"Define the unit cell for the **protein-ligand complex** to fill it with water molecules.
\n",
"**Truncated octahedron** box is used for the unit cell. This box type is the one which best reflects the geometry of the solute/protein, in this case a **globular protein**, as it approximates a sphere. It is also convenient for the computational point of view, as it accumulates **less water molecules at the corners**, reducing the final number of water molecules in the system and making the simulation run faster.
A **protein to box** distance of **0.8 nm** is used, and the protein is **centered in the box**. \n",
"\n",
"***\n",
"**Building Blocks** used:\n",
" - [Editconf](https://biobb-md.readthedocs.io/en/latest/gromacs.html#module-gromacs.editconf) from **biobb_gromacs.gromacs.editconf** \n",
"***"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Editconf: Create solvent box\n",
"# Import module\n",
"from biobb_gromacs.gromacs.editconf import editconf\n",
"\n",
"# Create prop dict and inputs/outputs\n",
"output_editconf_gro = pdbCode+'_'+ligandCode+'_complex_editconf.gro'\n",
"\n",
"prop = {\n",
" 'box_type': 'octahedron',\n",
" 'distance_to_molecule': 0.8\n",
"}\n",
"\n",
"# Create and launch bb\n",
"editconf(input_gro_path=complexFile_H, \n",
" output_gro_path=output_editconf_gro,\n",
" properties=prop)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"***\n",
"## Fill the box with water molecules\n",
"Fill the unit cell for the **protein-ligand complex** with water molecules.
\n",
"The solvent type used is the default **Simple Point Charge water (SPC)**, a generic equilibrated 3-point solvent model. \n",
"\n",
"***\n",
"**Building Blocks** used:\n",
" - [Solvate](https://biobb-md.readthedocs.io/en/latest/gromacs.html#module-gromacs.solvate) from **biobb_gromacs.gromacs.solvate** \n",
"***"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Solvate: Fill the box with water molecules\n",
"from biobb_gromacs.gromacs.solvate import solvate\n",
"\n",
"# Create prop dict and inputs/outputs\n",
"output_solvate_gro = pdbCode+'_'+ligandCode+'_solvate.gro'\n",
"output_solvate_top_zip = pdbCode+'_'+ligandCode+'_solvate_top.zip'\n",
"\n",
"# Create and launch bb\n",
"solvate(input_solute_gro_path=output_editconf_gro,\n",
" output_gro_path=output_solvate_gro,\n",
" input_top_zip_path=output_complex_top,\n",
" output_top_zip_path=output_solvate_top_zip)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Visualizing 3D structure\n",
"Visualizing the **protein-ligand complex** with the newly added **solvent box** using **NGL**
\n",
"Note the **octahedral box** filled with **water molecules** surrounding the **protein structure**, which is **centered** right in the middle of the box."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"#Show protein\n",
"view = nglview.show_structure_file(output_solvate_gro)\n",
"view.clear_representations()\n",
"view.add_representation(repr_type='cartoon', selection='protein', color='sstruc')\n",
"view.add_representation(repr_type='licorice', radius='.5', selection=ligandCode)\n",
"view.add_representation(repr_type='line', linewidth='1', selection='SOL', opacity='.3')\n",
"view._remote_call('setSize', target='Widget', args=['','600px'])\n",
"view.camera='orthographic'\n",
"view"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"***\n",
"## Adding ions\n",
"Add ions to neutralize the **protein-ligand complex** and reach a desired ionic concentration.\n",
"- [Step 1](#ionsStep1): Creating portable binary run file for ion generation\n",
"- [Step 2](#ionsStep2): Adding ions to **neutralize** the system and reach a **0.05 molar ionic concentration**\n",
"***\n",
"**Building Blocks** used:\n",
" - [Grompp](https://biobb-md.readthedocs.io/en/latest/gromacs.html#module-gromacs.grompp) from **biobb_gromacs.gromacs.grompp** \n",
" - [Genion](https://biobb-md.readthedocs.io/en/latest/gromacs.html#module-gromacs.genion) from **biobb_gromacs.gromacs.genion** \n",
"***"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"### Step 1: Creating portable binary run file for ion generation"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Grompp: Creating portable binary run file for ion generation\n",
"from biobb_gromacs.gromacs.grompp import grompp\n",
"\n",
"# Create prop dict and inputs/outputs\n",
"prop = {\n",
" 'mdp':{\n",
" 'nsteps':'5000'\n",
" },\n",
" 'simulation_type':'minimization',\n",
" 'maxwarn': 1\n",
"}\n",
"output_gppion_tpr = pdbCode+'_'+ligandCode+'_complex_gppion.tpr'\n",
"\n",
"# Create and launch bb\n",
"grompp(input_gro_path=output_solvate_gro,\n",
" input_top_zip_path=output_solvate_top_zip, \n",
" output_tpr_path=output_gppion_tpr,\n",
" properties=prop)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"### Step 2: Adding ions to neutralize the system and reach a 0.05 molar concentration\n",
"Replace **solvent molecules** with **ions** to **neutralize** the system and reaching a **0.05 molar ionic concentration**"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Genion: Adding ions to reach a 0.05 molar concentration\n",
"from biobb_gromacs.gromacs.genion import genion\n",
"\n",
"# Create prop dict and inputs/outputs\n",
"prop={\n",
" 'neutral':True,\n",
" 'concentration':0.05\n",
"}\n",
"output_genion_gro = pdbCode+'_'+ligandCode+'_genion.gro'\n",
"output_genion_top_zip = pdbCode+'_'+ligandCode+'_genion_top.zip'\n",
"\n",
"# Create and launch bb\n",
"genion(input_tpr_path=output_gppion_tpr,\n",
" output_gro_path=output_genion_gro, \n",
" input_top_zip_path=output_solvate_top_zip,\n",
" output_top_zip_path=output_genion_top_zip, \n",
" properties=prop)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Visualizing 3D structure\n",
"Visualizing the **protein-ligand complex** with the newly added **ionic concentration** using **NGL**"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"#Show protein\n",
"view = nglview.show_structure_file(output_genion_gro)\n",
"view.clear_representations()\n",
"view.add_representation(repr_type='cartoon', selection='protein', color='sstruc')\n",
"view.add_representation(repr_type='licorice', radius='.5', selection=ligandCode)\n",
"view.add_representation(repr_type='ball+stick', selection='NA')\n",
"view.add_representation(repr_type='ball+stick', selection='CL')\n",
"view._remote_call('setSize', target='Widget', args=['','600px'])\n",
"view.camera='orthographic'\n",
"view"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"***\n",
"## Energetically minimize the system\n",
"Energetically minimize the **protein-ligand complex** till reaching a desired potential energy.\n",
"- [Step 1](#emStep1): Creating portable binary run file for energy minimization\n",
"- [Step 2](#emStep2): Energetically minimize the **protein-ligand complex** till reaching a force of 500 kJ/mol*nm.\n",
"- [Step 3](#emStep3): Checking **energy minimization** results. Plotting energy by time during the **minimization** process.\n",
"***\n",
"**Building Blocks** used:\n",
" - [Grompp](https://biobb-md.readthedocs.io/en/latest/gromacs.html#module-gromacs.grompp) from **biobb_gromacs.gromacs.grompp** \n",
" - [Mdrun](https://biobb-md.readthedocs.io/en/latest/gromacs.html#module-gromacs.mdrun) from **biobb_gromacs.gromacs.mdrun** \n",
" - [GMXEnergy](https://biobb-analysis.readthedocs.io/en/latest/gromacs.html#module-gromacs.gmx_energy) from **biobb_analysis.gromacs.gmx_energy** \n",
"***"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"### Step 1: Creating portable binary run file for energy minimization\n",
"Method used to run the **energy minimization** is a **steepest descent**, with a **maximum force of 500 kJ/mol\\*nm^2**, and a minimization **step size of 1fs**. The **maximum number of steps** to perform if the maximum force is not reached is **5,000 steps**. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"# Grompp: Creating portable binary run file for mdrun\n",
"from biobb_gromacs.gromacs.grompp import grompp\n",
"\n",
"# Create prop dict and inputs/outputs\n",
"prop = {\n",
" 'mdp':{\n",
" 'nsteps':'5000',\n",
" 'emstep': 0.01,\n",
" 'emtol':'500'\n",
" },\n",
" 'simulation_type':'minimization'\n",
"}\n",
"output_gppmin_tpr = pdbCode+'_'+ligandCode+'_gppmin.tpr'\n",
"\n",
"# Create and launch bb\n",
"grompp(input_gro_path=output_genion_gro,\n",
" input_top_zip_path=output_genion_top_zip,\n",
" output_tpr_path=output_gppmin_tpr,\n",
" properties=prop)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"### Step 2: Running Energy Minimization\n",
"Running **energy minimization** using the **tpr file** generated in the previous step."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"# Mdrun: Running minimization\n",
"from biobb_gromacs.gromacs.mdrun import mdrun\n",
"\n",
"# Create prop dict and inputs/outputs\n",
"output_min_trr = pdbCode+'_'+ligandCode+'_min.trr'\n",
"output_min_gro = pdbCode+'_'+ligandCode+'_min.gro'\n",
"output_min_edr = pdbCode+'_'+ligandCode+'_min.edr'\n",
"output_min_log = pdbCode+'_'+ligandCode+'_min.log'\n",
"\n",
"# Create and launch bb\n",
"mdrun(input_tpr_path=output_gppmin_tpr,\n",
" output_trr_path=output_min_trr, \n",
" output_gro_path=output_min_gro,\n",
" output_edr_path=output_min_edr, \n",
" output_log_path=output_min_log)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"### Step 3: Checking Energy Minimization results\n",
"Checking **energy minimization** results. Plotting **potential energy** by time during the minimization process. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"# GMXEnergy: Getting system energy by time \n",
"from biobb_analysis.gromacs.gmx_energy import gmx_energy\n",
"\n",
"# Create prop dict and inputs/outputs\n",
"output_min_ene_xvg = pdbCode+'_'+ligandCode+'_min_ene.xvg'\n",
"prop = {\n",
" 'terms': [\"Potential\"]\n",
"}\n",
"\n",
"# Create and launch bb\n",
"gmx_energy(input_energy_path=output_min_edr, \n",
" output_xvg_path=output_min_ene_xvg, \n",
" properties=prop)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import plotly.graph_objs as go\n",
"\n",
"#Read data from file and filter energy values higher than 1000 kJ/mol\n",
"with open(output_min_ene_xvg, 'r') as energy_file:\n",
" x, y = zip(*[\n",
" (float(line.split()[0]), float(line.split()[1]))\n",
" for line in energy_file\n",
" if not line.startswith((\"#\", \"@\"))\n",
" if float(line.split()[1]) < 1000\n",
" ])\n",
"\n",
"# Create a scatter plot\n",
"fig = go.Figure(data=go.Scatter(x=x, y=y, mode='lines'))\n",
"\n",
"# Update layout\n",
"fig.update_layout(title=\"Energy Minimization\",\n",
" xaxis_title=\"Energy Minimization Step\",\n",
" yaxis_title=\"Potential Energy kJ/mol\",\n",
" height=600)\n",
"\n",
"# Show the figure (renderer changes for colab and jupyter)\n",
"rend = 'colab' if 'google.colab' in sys.modules else ''\n",
"fig.show(renderer=rend)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"***\n",
"## Equilibrate the system (NVT)\n",
"Equilibrate the **protein-ligand complex** system in NVT ensemble (constant Number of particles, Volume and Temperature). To avoid temperature coupling problems, a new *\"system\"* group will be created including the **protein** + the **ligand** to be assigned to a single thermostatting group.\n",
"\n",
"- [Step 1](#eqNVTStep1): Creating an index file with a new group including the **protein-ligand complex**.\n",
"- [Step 2](#eqNVTStep3): Creating portable binary run file for system equilibration\n",
"- [Step 3](#eqNVTStep3): Equilibrate the **protein-ligand complex** with NVT ensemble.\n",
"- [Step 4](#eqNVTStep4): Checking **NVT Equilibration** results. Plotting **system temperature** by time during the **NVT equilibration** process. \n",
"***\n",
"**Building Blocks** used:\n",
"- [MakeNdx](https://biobb-md.readthedocs.io/en/latest/gromacs.html#module-gromacs.make_ndx) from **biobb_gromacs.gromacs.make_ndx** \n",
"- [Grompp](https://biobb-md.readthedocs.io/en/latest/gromacs.html#module-gromacs.grompp) from **biobb_gromacs.gromacs.grompp** \n",
"- [Mdrun](https://biobb-md.readthedocs.io/en/latest/gromacs.html#module-gromacs.mdrun) from **biobb_gromacs.gromacs.mdrun** \n",
"- [GMXEnergy](https://biobb-analysis.readthedocs.io/en/latest/gromacs.html#module-gromacs.gmx_energy) from **biobb_analysis.gromacs.gmx_energy** \n",
"***"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"### Step 1: Creating an index file with a new group including the **protein-ligand complex**"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"# MakeNdx: Creating index file with a new group (protein-ligand complex)\n",
"from biobb_gromacs.gromacs.make_ndx import make_ndx\n",
"\n",
"# Create prop dict and inputs/outputs\n",
"output_complex_ndx = pdbCode+'_'+ligandCode+'_index.ndx'\n",
"prop = {\n",
" 'selection': \"\\\"Protein\\\"|\\\"Other\\\"\" \n",
"}\n",
"\n",
"# Create and launch bb\n",
"make_ndx(input_structure_path=output_min_gro,\n",
" output_ndx_path=output_complex_ndx,\n",
" properties=prop)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"### Step 2: Creating portable binary run file for system equilibration (NVT)\n",
"Note that for the purposes of temperature coupling, the **protein-ligand complex** (*Protein_Other*) is considered as a single entity."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"# Grompp: Creating portable binary run file for NVT System Equilibration\n",
"from biobb_gromacs.gromacs.grompp import grompp\n",
"\n",
"# Create prop dict and inputs/outputs\n",
"output_gppnvt_tpr = pdbCode+'_'+ligandCode+'gppnvt.tpr'\n",
"prop = {\n",
" 'mdp':{\n",
" 'nsteps':'5000',\n",
" 'tc-grps': 'Protein_Other Water_and_ions',\n",
" 'Define': '-DPOSRES -D' + posresifdef\n",
" },\n",
" 'simulation_type':'nvt'\n",
"}\n",
"\n",
"# Create and launch bb\n",
"grompp(input_gro_path=output_min_gro,\n",
" input_top_zip_path=output_genion_top_zip,\n",
" input_ndx_path=output_complex_ndx,\n",
" output_tpr_path=output_gppnvt_tpr,\n",
" properties=prop)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"### Step 3: Running NVT equilibration"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"# Mdrun: Running NVT System Equilibration \n",
"from biobb_gromacs.gromacs.mdrun import mdrun\n",
"\n",
"# Create prop dict and inputs/outputs\n",
"output_nvt_trr = pdbCode+'_'+ligandCode+'_nvt.trr'\n",
"output_nvt_gro = pdbCode+'_'+ligandCode+'_nvt.gro'\n",
"output_nvt_edr = pdbCode+'_'+ligandCode+'_nvt.edr'\n",
"output_nvt_log = pdbCode+'_'+ligandCode+'_nvt.log'\n",
"output_nvt_cpt = pdbCode+'_'+ligandCode+'_nvt.cpt'\n",
"\n",
"# Create and launch bb\n",
"mdrun(input_tpr_path=output_gppnvt_tpr,\n",
" output_trr_path=output_nvt_trr,\n",
" output_gro_path=output_nvt_gro,\n",
" output_edr_path=output_nvt_edr,\n",
" output_log_path=output_nvt_log,\n",
" output_cpt_path=output_nvt_cpt)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"### Step 4: Checking NVT Equilibration results\n",
"Checking **NVT Equilibration** results. Plotting **system temperature** by time during the NVT equilibration process. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"# GMXEnergy: Getting system temperature by time during NVT Equilibration \n",
"from biobb_analysis.gromacs.gmx_energy import gmx_energy\n",
"\n",
"# Create prop dict and inputs/outputs\n",
"output_nvt_temp_xvg = pdbCode+'_'+ligandCode+'_nvt_temp.xvg'\n",
"prop = {\n",
" 'terms': [\"Temperature\"]\n",
"}\n",
"\n",
"# Create and launch bb\n",
"gmx_energy(input_energy_path=output_nvt_edr, \n",
" output_xvg_path=output_nvt_temp_xvg, \n",
" properties=prop)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import plotly.graph_objs as go\n",
"\n",
"# Read temperature data from file \n",
"with open(output_nvt_temp_xvg, 'r') as temperature_file:\n",
" x, y = zip(*[\n",
" (float(line.split()[0]), float(line.split()[1]))\n",
" for line in temperature_file\n",
" if not line.startswith((\"#\", \"@\"))\n",
" ])\n",
"\n",
"# Create a scatter plot\n",
"fig = go.Figure(data=go.Scatter(x=x, y=y, mode='lines+markers'))\n",
"\n",
"# Update layout\n",
"fig.update_layout(title=\"Temperature during NVT Equilibration\",\n",
" xaxis_title=\"Time (ps)\",\n",
" yaxis_title=\"Temperature (K)\",\n",
" height=600)\n",
"\n",
"# Show the figure (renderer changes for colab and jupyter)\n",
"rend = 'colab' if 'google.colab' in sys.modules else ''\n",
"fig.show(renderer=rend)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"***\n",
"## Equilibrate the system (NPT)\n",
"Equilibrate the **protein-ligand complex** system in NPT ensemble (constant Number of particles, Pressure and Temperature) .\n",
"- [Step 1](#eqNPTStep1): Creating portable binary run file for system equilibration\n",
"- [Step 2](#eqNPTStep2): Equilibrate the **protein-ligand complex** with NPT ensemble.\n",
"- [Step 3](#eqNPTStep3): Checking **NPT Equilibration** results. Plotting **system pressure and density** by time during the **NPT equilibration** process.\n",
"***\n",
"**Building Blocks** used:\n",
" - [Grompp](https://biobb-md.readthedocs.io/en/latest/gromacs.html#module-gromacs.grompp) from **biobb_gromacs.gromacs.grompp** \n",
" - [Mdrun](https://biobb-md.readthedocs.io/en/latest/gromacs.html#module-gromacs.mdrun) from **biobb_gromacs.gromacs.mdrun** \n",
" - [GMXEnergy](https://biobb-analysis.readthedocs.io/en/latest/gromacs.html#module-gromacs.gmx_energy) from **biobb_analysis.gromacs.gmx_energy** \n",
"***"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"### Step 1: Creating portable binary run file for system equilibration (NPT)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"# Grompp: Creating portable binary run file for (NPT) System Equilibration\n",
"from biobb_gromacs.gromacs.grompp import grompp\n",
"\n",
"# Create prop dict and inputs/outputs\n",
"output_gppnpt_tpr = pdbCode+'_'+ligandCode+'_gppnpt.tpr'\n",
"prop = {\n",
" 'mdp':{\n",
" 'type': 'npt',\n",
" 'nsteps':'5000',\n",
" 'tc-grps': 'Protein_Other Water_and_ions',\n",
" 'Define': '-DPOSRES -D' + posresifdef\n",
" },\n",
" 'simulation_type':'npt'\n",
"}\n",
"\n",
"# Create and launch bb\n",
"grompp(input_gro_path=output_nvt_gro,\n",
" input_top_zip_path=output_genion_top_zip,\n",
" input_ndx_path=output_complex_ndx,\n",
" output_tpr_path=output_gppnpt_tpr,\n",
" input_cpt_path=output_nvt_cpt,\n",
" properties=prop)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"### Step 2: Running NPT equilibration"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"# Mdrun: Running NPT System Equilibration\n",
"from biobb_gromacs.gromacs.mdrun import mdrun\n",
"\n",
"# Create prop dict and inputs/outputs\n",
"output_npt_trr = pdbCode+'_'+ligandCode+'_npt.trr'\n",
"output_npt_gro = pdbCode+'_'+ligandCode+'_npt.gro'\n",
"output_npt_edr = pdbCode+'_'+ligandCode+'_npt.edr'\n",
"output_npt_log = pdbCode+'_'+ligandCode+'_npt.log'\n",
"output_npt_cpt = pdbCode+'_'+ligandCode+'_npt.cpt'\n",
"\n",
"# Create and launch bb\n",
"mdrun(input_tpr_path=output_gppnpt_tpr,\n",
" output_trr_path=output_npt_trr,\n",
" output_gro_path=output_npt_gro,\n",
" output_edr_path=output_npt_edr,\n",
" output_log_path=output_npt_log,\n",
" output_cpt_path=output_npt_cpt)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"### Step 3: Checking NPT Equilibration results\n",
"Checking **NPT Equilibration** results. Plotting **system pressure and density** by time during the **NPT equilibration** process. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"# GMXEnergy: Getting system pressure and density by time during NPT Equilibration \n",
"from biobb_analysis.gromacs.gmx_energy import gmx_energy\n",
"\n",
"# Create prop dict and inputs/outputs\n",
"output_npt_pd_xvg = pdbCode+'_'+ligandCode+'_npt_PD.xvg'\n",
"prop = {\n",
" 'terms': [\"Pressure\",\"Density\"]\n",
"}\n",
"\n",
"# Create and launch bb\n",
"gmx_energy(input_energy_path=output_npt_edr, \n",
" output_xvg_path=output_npt_pd_xvg, \n",
" properties=prop)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import plotly.graph_objs as go\n",
"from plotly.subplots import make_subplots\n",
"\n",
"# Read pressure and density data from file \n",
"with open(output_npt_pd_xvg,'r') as pd_file:\n",
" x, y, z = zip(*[\n",
" (float(line.split()[0]), float(line.split()[1]), float(line.split()[2]))\n",
" for line in pd_file\n",
" if not line.startswith((\"#\", \"@\"))\n",
" ])\n",
"\n",
"trace1 = go.Scatter(\n",
" x=x,y=y\n",
")\n",
"trace2 = go.Scatter(\n",
" x=x,y=z\n",
")\n",
"\n",
"fig = make_subplots(rows=1, cols=2, print_grid=False)\n",
"fig.append_trace(trace1, 1, 1)\n",
"fig.append_trace(trace2, 1, 2)\n",
"\n",
"fig.update_layout(\n",
" height=500,\n",
" title='Pressure and Density during NPT Equilibration',\n",
" showlegend=False,\n",
" xaxis1_title='Time (ps)',\n",
" yaxis1_title='Pressure (bar)',\n",
" xaxis2_title='Time (ps)',\n",
" yaxis2_title='Density (Kg*m^-3)'\n",
")\n",
"\n",
"# Show the figure (renderer changes for colab and jupyter)\n",
"rend = 'colab' if 'google.colab' in sys.modules else ''\n",
"fig.show(renderer=rend)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"***\n",
"## Free Molecular Dynamics Simulation\n",
"Upon completion of the **two equilibration phases (NVT and NPT)**, the system is now well-equilibrated at the desired temperature and pressure. The **position restraints** can now be released. The last step of the **protein-ligand complex** MD setup is a short, **free MD simulation**, to ensure the robustness of the system. \n",
"- [Step 1](#mdStep1): Creating portable binary run file to run a **free MD simulation**.\n",
"- [Step 2](#mdStep2): Run short MD simulation of the **protein-ligand complex**.\n",
"- [Step 3](#mdStep3): Checking results for the final step of the setup process, the **free MD run**. Plotting **Root Mean Square deviation (RMSd)** and **Radius of Gyration (Rgyr)** by time during the **free MD run** step. \n",
"***\n",
"**Building Blocks** used:\n",
" - [Grompp](https://biobb-md.readthedocs.io/en/latest/gromacs.html#module-gromacs.grompp) from **biobb_gromacs.gromacs.grompp** \n",
" - [Mdrun](https://biobb-md.readthedocs.io/en/latest/gromacs.html#module-gromacs.mdrun) from **biobb_gromacs.gromacs.mdrun** \n",
" - [GMXRms](https://biobb-analysis.readthedocs.io/en/latest/gromacs.html#module-gromacs.gmx_rms) from **biobb_analysis.gromacs.gmx_rms** \n",
" - [GMXRgyr](https://biobb-analysis.readthedocs.io/en/latest/gromacs.html#module-gromacs.gmx_rgyr) from **biobb_analysis.gromacs.gmx_rgyr** \n",
"***"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"### Step 1: Creating portable binary run file to run a free MD simulation"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"# Grompp: Creating portable binary run file for mdrun\n",
"from biobb_gromacs.gromacs.grompp import grompp\n",
"\n",
"# Create prop dict and inputs/outputs\n",
"prop = {\n",
" 'mdp':{\n",
" #'nsteps':'500000' # 1 ns (500,000 steps x 2fs per step)\n",
" #'nsteps':'5000' # 10 ps (5,000 steps x 2fs per step)\n",
" 'nsteps':'25000' # 50 ps (25,000 steps x 2fs per step)\n",
" },\n",
" 'simulation_type':'free'\n",
"}\n",
"output_gppmd_tpr = pdbCode+'_'+ligandCode + '_gppmd.tpr'\n",
"\n",
"# Create and launch bb\n",
"grompp(input_gro_path=output_npt_gro,\n",
" input_top_zip_path=output_genion_top_zip,\n",
" output_tpr_path=output_gppmd_tpr,\n",
" input_cpt_path=output_npt_cpt,\n",
" properties=prop)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"### Step 2: Running short free MD simulation"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"# Mdrun: Running free dynamics\n",
"from biobb_gromacs.gromacs.mdrun import mdrun\n",
"\n",
"# Create prop dict and inputs/outputs\n",
"output_md_trr = pdbCode+'_'+ligandCode+'_md.trr'\n",
"output_md_gro = pdbCode+'_'+ligandCode+'_md.gro'\n",
"output_md_edr = pdbCode+'_'+ligandCode+'_md.edr'\n",
"output_md_log = pdbCode+'_'+ligandCode+'_md.log'\n",
"output_md_cpt = pdbCode+'_'+ligandCode+'_md.cpt'\n",
"\n",
"# Create and launch bb\n",
"mdrun(input_tpr_path=output_gppmd_tpr,\n",
" output_trr_path=output_md_trr,\n",
" output_gro_path=output_md_gro,\n",
" output_edr_path=output_md_edr,\n",
" output_log_path=output_md_log,\n",
" output_cpt_path=output_md_cpt)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"### Step 3: Checking free MD simulation results\n",
"Checking results for the final step of the setup process, the **free MD run**. Plotting **Root Mean Square deviation (RMSd)** and **Radius of Gyration (Rgyr)** by time during the **free MD run** step. **RMSd** against the **experimental structure** (input structure of the pipeline) and against the **minimized and equilibrated structure** (output structure of the NPT equilibration step)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# GMXRms: Computing Root Mean Square deviation to analyse structural stability \n",
"# RMSd against minimized and equilibrated snapshot (backbone atoms) \n",
"\n",
"from biobb_analysis.gromacs.gmx_rms import gmx_rms\n",
"\n",
"# Create prop dict and inputs/outputs\n",
"output_rms_first = pdbCode+'_'+ligandCode+'_rms_first.xvg'\n",
"prop = {\n",
" 'selection': 'Backbone'\n",
"}\n",
"\n",
"# Create and launch bb\n",
"gmx_rms(input_structure_path=output_gppmd_tpr,\n",
" input_traj_path=output_md_trr,\n",
" output_xvg_path=output_rms_first, \n",
" properties=prop)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# GMXRms: Computing Root Mean Square deviation to analyse structural stability \n",
"# RMSd against experimental structure (backbone atoms) \n",
"\n",
"from biobb_analysis.gromacs.gmx_rms import gmx_rms\n",
"\n",
"# Create prop dict and inputs/outputs\n",
"output_rms_exp = pdbCode+'_'+ligandCode+'_rms_exp.xvg'\n",
"prop = {\n",
" 'selection': 'Backbone'\n",
"}\n",
"\n",
"# Create and launch bb\n",
"gmx_rms(input_structure_path=output_gppmin_tpr,\n",
" input_traj_path=output_md_trr,\n",
" output_xvg_path=output_rms_exp, \n",
" properties=prop)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import plotly.graph_objs as go\n",
"\n",
"# Read RMS vs first snapshot data from file \n",
"with open(output_rms_first,'r') as rms_first_file:\n",
" x, y = zip(*[\n",
" (float(line.split()[0]), float(line.split()[1]))\n",
" for line in rms_first_file\n",
" if not line.startswith((\"#\", \"@\"))\n",
" ])\n",
"\n",
"# Read RMS vs experimental structure data from file \n",
"with open(output_rms_exp,'r') as rms_exp_file:\n",
" x2, y2 = zip(*[\n",
" (float(line.split()[0]), float(line.split()[1]))\n",
" for line in rms_exp_file\n",
" if not line.startswith((\"#\", \"@\"))\n",
" ])\n",
"\n",
"fig = make_subplots()\n",
"fig.add_trace(go.Scatter(x=x, y=y, mode=\"lines+markers\", name=\"RMSd vs first\"))\n",
"fig.add_trace(go.Scatter(x=x, y=y2, mode=\"lines+markers\", name=\"RMSd vs exp\"))\n",
"\n",
"# Set layout including height\n",
"fig.update_layout(\n",
" title=\"RMSd during free MD Simulation\",\n",
" xaxis=dict(title=\"Time (ps)\"),\n",
" yaxis=dict(title=\"RMSd (nm)\"),\n",
" height=600\n",
")\n",
"\n",
"# Show the figure (renderer changes for colab and jupyter)\n",
"rend = 'colab' if 'google.colab' in sys.modules else ''\n",
"fig.show(renderer=rend)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# GMXRgyr: Computing Radius of Gyration to measure the protein compactness during the free MD simulation \n",
"\n",
"from biobb_analysis.gromacs.gmx_rgyr import gmx_rgyr\n",
"\n",
"# Create prop dict and inputs/outputs\n",
"output_rgyr = pdbCode+'_'+ligandCode+'_rgyr.xvg'\n",
"prop = {\n",
" 'selection': 'Backbone'\n",
"}\n",
"\n",
"# Create and launch bb\n",
"gmx_rgyr(input_structure_path=output_gppmin_tpr,\n",
" input_traj_path=output_md_trr,\n",
" output_xvg_path=output_rgyr, \n",
" properties=prop)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import plotly.graph_objs as go\n",
"\n",
"# Read Rgyr data from file\n",
"with open(output_rgyr, 'r') as rgyr_file:\n",
" x, y = zip(*[\n",
" (float(line.split()[0]), float(line.split()[1]))\n",
" for line in rgyr_file\n",
" if not line.startswith((\"#\", \"@\"))\n",
" ])\n",
"\n",
"# Create a scatter plot\n",
"fig = go.Figure(data=go.Scatter(x=x, y=y, mode='lines+markers'))\n",
"\n",
"# Update layout\n",
"fig.update_layout(title=\"Radius of Gyration\",\n",
" xaxis_title=\"Time (ps)\",\n",
" yaxis_title=\"Rgyr (nm)\",\n",
" height=600)\n",
"\n",
"# Show the figure (renderer changes for colab and jupyter)\n",
"rend = 'colab' if 'google.colab' in sys.modules else ''\n",
"fig.show(renderer=rend)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"***\n",
"## Post-processing and Visualizing resulting 3D trajectory\n",
"Post-processing and Visualizing the **protein-ligand complex system** MD setup **resulting trajectory** using **NGL**\n",
"- [Step 1](#ppStep1): *Imaging* the resulting trajectory, **stripping out water molecules and ions** and **correcting periodicity issues**.\n",
"- [Step 2](#ppStep2): Generating a *dry* structure, **removing water molecules and ions** from the final snapshot of the MD setup pipeline.\n",
"- [Step 3](#ppStep3): Visualizing the *imaged* trajectory using the *dry* structure as a **topology**. \n",
"***\n",
"**Building Blocks** used:\n",
" - [GMXImage](https://biobb-analysis.readthedocs.io/en/latest/gromacs.html#module-gromacs.gmx_image) from **biobb_analysis.gromacs.gmx_image** \n",
" - [GMXTrjConvStr](https://biobb-analysis.readthedocs.io/en/latest/gromacs.html#module-gromacs.gmx_trjconv_str) from **biobb_analysis.gromacs.gmx_trjconv_str** \n",
"***"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"### Step 1: *Imaging* the resulting trajectory.\n",
"Stripping out **water molecules and ions** and **correcting periodicity issues** "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# GMXImage: \"Imaging\" the resulting trajectory\n",
"# Removing water molecules and ions from the resulting structure\n",
"from biobb_analysis.gromacs.gmx_image import gmx_image\n",
"\n",
"# Create prop dict and inputs/outputs\n",
"output_imaged_traj = pdbCode+'_imaged_traj.trr'\n",
"prop = {\n",
" 'center_selection': 'Protein_Other',\n",
" 'output_selection': 'Protein_Other',\n",
" 'pbc' : 'mol',\n",
" 'center' : True\n",
"}\n",
"\n",
"# Create and launch bb\n",
"gmx_image(input_traj_path=output_md_trr,\n",
" input_top_path=output_gppmd_tpr,\n",
" input_index_path=output_complex_ndx,\n",
" output_traj_path=output_imaged_traj, \n",
" properties=prop)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"### Step 2: Generating the output *dry* structure.\n",
"**Removing water molecules and ions** from the resulting structure"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"# GMXTrjConvStr: Converting and/or manipulating a structure\n",
"# Removing water molecules and ions from the resulting structure\n",
"# The \"dry\" structure will be used as a topology to visualize \n",
"# the \"imaged dry\" trajectory generated in the previous step.\n",
"from biobb_analysis.gromacs.gmx_trjconv_str import gmx_trjconv_str\n",
"\n",
"# Create prop dict and inputs/outputs\n",
"output_dry_gro = pdbCode+'_md_dry.gro'\n",
"prop = {\n",
" 'selection': 'Protein_Other'\n",
"}\n",
"\n",
"# Create and launch bb\n",
"gmx_trjconv_str(input_structure_path=output_md_gro,\n",
" input_top_path=output_gppmd_tpr,\n",
" input_index_path=output_complex_ndx,\n",
" output_str_path=output_dry_gro, \n",
" properties=prop)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"### Step 3: Visualizing the generated dehydrated trajectory.\n",
"Using the **imaged trajectory** (output of the [Post-processing step 1](#ppStep1)) with the **dry structure** (output of the [Post-processing step 2](#ppStep2)) as a topology."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"# Show trajectory\n",
"view = nglview.show_simpletraj(nglview.SimpletrajTrajectory(output_imaged_traj, output_dry_gro), gui=True)\n",
"view"
]
},
{
"cell_type": "markdown",
"metadata": {
"variables": {
"output_genion_top_zip": "
NameError: name 'output_genion_top_zip' is not defined
\n", "output_gppmd_tpr": "NameError: name 'output_gppmd_tpr' is not defined
\n", "output_md_cpt": "NameError: name 'output_md_cpt' is not defined
\n", "output_md_gro": "NameError: name 'output_md_gro' is not defined
\n", "output_md_trr": "NameError: name 'output_md_trr' is not defined
\n", "output_rgyr": "NameError: name 'output_rgyr' is not defined
\n", "output_rms_exp": "NameError: name 'output_rms_exp' is not defined
\n", "output_rms_first": "NameError: name 'output_rms_first' is not defined
\n" } }, "source": [ "\n", "## Output files\n", "\n", "Important **Output files** generated:\n", " - {{output_md_gro}}: **Final structure** (snapshot) of the MD setup protocol.\n", " - {{output_md_trr}}: **Final trajectory** of the MD setup protocol.\n", " - {{output_md_cpt}}: **Final checkpoint file**, with information about the state of the simulation. It can be used to **restart** or **continue** a MD simulation.\n", " - {{output_gppmd_tpr}}: **Final tpr file**, GROMACS portable binary run input file. This file contains the starting structure of the **MD setup free MD simulation step**, together with the molecular topology and all the simulation parameters. It can be used to **extend** the simulation.\n", " - {{output_genion_top_zip}}: **Final topology** of the MD system. It is a compressed zip file including a **topology file** (.top) and a set of auxiliary **include topology** files (.itp).\n", "\n", "**Analysis** (MD setup check) output files generated:\n", " - {{output_rms_first}}: **Root Mean Square deviation (RMSd)** against **minimized and equilibrated structure** of the final **free MD run step**.\n", " - {{output_rms_exp}}: **Root Mean Square deviation (RMSd)** against **experimental structure** of the final **free MD run step**.\n", " - {{output_rgyr}}: **Radius of Gyration** of the final **free MD run step** of the **setup pipeline**.\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "***\n", "\n", "\n", "## Questions & Comments\n", "\n", "Questions, issues, suggestions and comments are really welcome!\n", "\n", "* GitHub issues:\n", " * [https://github.com/bioexcel/biobb](https://github.com/bioexcel/biobb)\n", "\n", "* BioExcel forum:\n", " * [https://ask.bioexcel.eu/c/BioExcel-Building-Blocks-library](https://ask.bioexcel.eu/c/BioExcel-Building-Blocks-library)\n" ] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.3" } }, "nbformat": 4, "nbformat_minor": 4 }