{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "*This notebook contains material from [PyRosetta](https://RosettaCommons.github.io/PyRosetta.notebooks);\n", "content is available [on Github](https://github.com/RosettaCommons/PyRosetta.notebooks.git).*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "< [Setting up a membrane protein in the bilayer](http://nbviewer.jupyter.org/github/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/15.01-Accounting-for-the-lipid-bilayer.ipynb) | [Contents](toc.ipynb) | [Index](index.ipynb) | [Running Rosetta in Parallel](http://nbviewer.jupyter.org/github/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/16.00-Running-PyRosetta-in-Parallel.ipynb) >
"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Predicting the ∆∆G of single point mutations"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Accurately estimating the thermodynamic cost of a mutation is a building block of protein engineering and design. This task is especially tricky for membrane proteins because the calculations must account for the lipid bilayer. In this tutorial, we will walk through the protocol for estimating the ∆∆G for lipid facing positions using RosettaMP and the _franklin2019_ energy function. \n",
"\n",
"As an example, we will examine mutations in the integral membrane enzyme PagP. PagP is a beta-barrel protein that transfers a palmitoyl group fron the sn-1 position of a glycerophospholipid to the endotoxin of lipopolysacharide (LPS). The enzyme provides bacterial resistance to host immune defenses such as antimicrobial pepetides (Guo et al. 1998; Kawasaki et al. 2004). Recently, Marx & Fleming measured the energetic cost of making mutations at the V111 position on PagP (Marx & Fleming, 2017). Here, we will perform the same set of mutations with Rosetta and compare with the experimental values. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Objectives\n",
" - Setup a protein in an implicit lipid bilayer\n",
" - Compute the ∆∆G of mutation\n",
" - Analyze contributions to the change in stability\n",
" - Visualize the model in PyMOL"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## PyRosetta Initialization & Setup"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The first step is to initialize PyRosetta and load the protein of interest. In this tutorial, we will use PagP (PDB 3GP6). The starting structure is from the Orientations of Proteins in Membranes database (https://opm.phar.umich.edu/) which provides spatial arrangements of membrane proteins in the lipid bilayer. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install pyrosettacolabsetup\n",
"import pyrosettacolabsetup; pyrosettacolabsetup.install_pyrosetta()\n",
"import pyrosetta; pyrosetta.init()\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"PyRosetta-4 2020 [Rosetta PyRosetta4.MinSizeRel.python37.mac 2020.02+release.22ef835b4a2647af94fcd6421a85720f07eddf12 2020-01-05T17:31:56] retrieved from: http://www.pyrosetta.org\n",
"(C) Copyright Rosetta Commons Member Institutions. Created in JHU by Sergey Lyskov and PyRosetta Team.\n",
"\u001b[0mcore.init: \u001b[0mChecking for fconfig files in pwd and ./rosetta/flags\n",
"\u001b[0mcore.init: \u001b[0mRosetta version: PyRosetta4.MinSizeRel.python37.mac r242 2020.02+release.22ef835b4a2 22ef835b4a2647af94fcd6421a85720f07eddf12 http://www.pyrosetta.org 2020-01-05T17:31:56\n",
"\u001b[0mcore.init: \u001b[0mcommand: PyRosetta -ex1 -ex2aro -mp:lipids:has_pore false -database /Users/ralford/Applications/env/lib/python3.7/site-packages/pyrosetta-2020.2+release.22ef835b4a2-py3.7-macosx-10.14-x86_64.egg/pyrosetta/database\n",
"\u001b[0mbasic.random.init_random_generator: \u001b[0m'RNG device' seed mode, using '/dev/urandom', seed=1900191457 seed_offset=0 real_seed=1900191457\n",
"\u001b[0mbasic.random.init_random_generator: \u001b[0mRandomGenerator:init: Normal mode, seed=1900191457 RG_type=mt19937\n"
]
}
],
"source": [
"from pyrosetta import *\n",
"init( extra_options=\"-mp:lipids:has_pore false\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Make sure you are in the right directory for accessing the `.pdb` files:\n",
"\n",
"`cd google_drive/MyDrive/student-notebooks/`"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#cd google_drive/MyDrive/student-notebooks/"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[0mcore.chemical.GlobalResidueTypeSet: \u001b[0mFinished initializing fa_standard residue type set. Created 980 residue types\n",
"\u001b[0mcore.chemical.GlobalResidueTypeSet: \u001b[0mTotal time to initialize 1.42549 seconds.\n",
"\u001b[0mcore.import_pose.import_pose: \u001b[0mFile 'inputs/3gp6_A.pdb' automatically determined to be of type PDB\n",
"\u001b[0mcore.io.pose_from_sfr.PoseFromSFRBuilder: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m PDB reader is ignoring atom CD in residue 12 A. Pass flag -ignore_zero_occupancy false to change this behavior\n",
"\u001b[0mcore.io.pose_from_sfr.PoseFromSFRBuilder: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m PDB reader is ignoring atom OE1 in residue 12 A. Pass flag -ignore_zero_occupancy false to change this behavior\n",
"\u001b[0mcore.io.pose_from_sfr.PoseFromSFRBuilder: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m PDB reader is ignoring atom OE2 in residue 12 A. Pass flag -ignore_zero_occupancy false to change this behavior\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: CG on residue GLU 5\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: CD on residue GLU 5\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: OE1 on residue GLU 5\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: OE2 on residue GLU 5\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: CD on residue GLU 12\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: OE1 on residue GLU 12\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: OE2 on residue GLU 12\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: CG on residue ARG 36\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: CD on residue ARG 36\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: NE on residue ARG 36\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: CZ on residue ARG 36\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: NH1 on residue ARG 36\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: NH2 on residue ARG 36\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: CG on residue PHE 37\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: CD1 on residue PHE 37\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: CD2 on residue PHE 37\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: CE1 on residue PHE 37\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: CE2 on residue PHE 37\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: CZ on residue PHE 37\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: CG on residue TYR 39\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: CD1 on residue TYR 39\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: CD2 on residue TYR 39\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: CE1 on residue TYR 39\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: CE2 on residue TYR 39\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: CZ on residue TYR 39\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: OH on residue TYR 39\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: CG on residue ASN 40\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: OD1 on residue ASN 40\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: ND2 on residue ASN 40\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: OG1 on residue THR 139\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: CG2 on residue THR 139\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: CG on residue TYR 140\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: CD1 on residue TYR 140\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: CD2 on residue TYR 140\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: CE1 on residue TYR 140\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: CE2 on residue TYR 140\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: CZ on residue TYR 140\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: OH on residue TYR 140\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: CG on residue ASN 141\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: OD1 on residue ASN 141\n",
"\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: ND2 on residue ASN 141\n",
"\u001b[0mcore.pack.pack_missing_sidechains: \u001b[0mpacking residue number 5 because of missing atom number 6 atom name CG\n",
"\u001b[0mcore.pack.pack_missing_sidechains: \u001b[0mpacking residue number 12 because of missing atom number 7 atom name CD\n",
"\u001b[0mcore.pack.pack_missing_sidechains: \u001b[0mpacking residue number 36 because of missing atom number 6 atom name CG\n",
"\u001b[0mcore.pack.pack_missing_sidechains: \u001b[0mpacking residue number 37 because of missing atom number 6 atom name CG\n",
"\u001b[0mcore.pack.pack_missing_sidechains: \u001b[0mpacking residue number 39 because of missing atom number 6 atom name CG\n",
"\u001b[0mcore.pack.pack_missing_sidechains: \u001b[0mpacking residue number 40 because of missing atom number 6 atom name CG\n",
"\u001b[0mcore.pack.pack_missing_sidechains: \u001b[0mpacking residue number 139 because of missing atom number 6 atom name OG1\n",
"\u001b[0mcore.pack.pack_missing_sidechains: \u001b[0mpacking residue number 140 because of missing atom number 6 atom name CG\n",
"\u001b[0mcore.pack.pack_missing_sidechains: \u001b[0mpacking residue number 141 because of missing atom number 6 atom name CG\n",
"\u001b[0mcore.pack.task: \u001b[0mPacker task: initialize from command line()\n",
"\u001b[0mcore.scoring.ScoreFunctionFactory: \u001b[0mSCOREFUNCTION: \u001b[32mref2015\u001b[0m\n",
"\u001b[0mcore.scoring.etable: \u001b[0mStarting energy table calculation\n",
"\u001b[0mcore.scoring.etable: \u001b[0msmooth_etable: changing atr/rep split to bottom of energy well\n",
"\u001b[0mcore.scoring.etable: \u001b[0msmooth_etable: spline smoothing lj etables (maxdis = 6)\n",
"\u001b[0mcore.scoring.etable: \u001b[0msmooth_etable: spline smoothing solvation etables (max_dis = 6)\n",
"\u001b[0mcore.scoring.etable: \u001b[0mFinished calculating energy tables.\n",
"\u001b[0mbasic.io.database: \u001b[0mDatabase file opened: scoring/score_functions/hbonds/ref2015_params/HBPoly1D.csv\n",
"\u001b[0mbasic.io.database: \u001b[0mDatabase file opened: scoring/score_functions/hbonds/ref2015_params/HBFadeIntervals.csv\n",
"\u001b[0mbasic.io.database: \u001b[0mDatabase file opened: scoring/score_functions/hbonds/ref2015_params/HBEval.csv\n",
"\u001b[0mbasic.io.database: \u001b[0mDatabase file opened: scoring/score_functions/hbonds/ref2015_params/DonStrength.csv\n",
"\u001b[0mbasic.io.database: \u001b[0mDatabase file opened: scoring/score_functions/hbonds/ref2015_params/AccStrength.csv\n",
"\u001b[0mbasic.io.database: \u001b[0mDatabase file opened: scoring/score_functions/rama/fd/all.ramaProb\n",
"\u001b[0mbasic.io.database: \u001b[0mDatabase file opened: scoring/score_functions/rama/fd/prepro.ramaProb\n",
"\u001b[0mbasic.io.database: \u001b[0mDatabase file opened: scoring/score_functions/omega/omega_ppdep.all.txt\n",
"\u001b[0mbasic.io.database: \u001b[0mDatabase file opened: scoring/score_functions/omega/omega_ppdep.gly.txt\n",
"\u001b[0mbasic.io.database: \u001b[0mDatabase file opened: scoring/score_functions/omega/omega_ppdep.pro.txt\n",
"\u001b[0mbasic.io.database: \u001b[0mDatabase file opened: scoring/score_functions/omega/omega_ppdep.valile.txt\n",
"\u001b[0mbasic.io.database: \u001b[0mDatabase file opened: scoring/score_functions/P_AA_pp/P_AA\n",
"\u001b[0mbasic.io.database: \u001b[0mDatabase file opened: scoring/score_functions/P_AA_pp/P_AA_n\n",
"\u001b[0mcore.scoring.P_AA: \u001b[0mshapovalov_lib::shap_p_aa_pp_smooth_level of 1( aka low_smooth ) got activated.\n",
"\u001b[0mbasic.io.database: \u001b[0mDatabase file opened: scoring/score_functions/P_AA_pp/shapovalov/10deg/kappa131/a20.prop\n",
"\u001b[0mbasic.io.database: \u001b[0mDatabase file opened: scoring/score_functions/elec_cp_reps.dat\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[0mcore.scoring.elec.util: \u001b[0mRead 40 countpair representative atoms\n",
"\u001b[0mcore.pack.dunbrack.RotamerLibrary: \u001b[0mshapovalov_lib_fixes_enable option is true.\n",
"\u001b[0mcore.pack.dunbrack.RotamerLibrary: \u001b[0mshapovalov_lib::shap_dun10_smooth_level of 1( aka lowest_smooth ) got activated.\n",
"\u001b[0mcore.pack.dunbrack.RotamerLibrary: \u001b[0mBinary rotamer library selected: /Users/ralford/Applications/env/lib/python3.7/site-packages/pyrosetta-2020.2+release.22ef835b4a2-py3.7-macosx-10.14-x86_64.egg/pyrosetta/database/rotamer/shapovalov/StpDwn_0-0-0/Dunbrack10.lib.bin\n",
"\u001b[0mcore.pack.dunbrack.RotamerLibrary: \u001b[0mUsing Dunbrack library binary file '/Users/ralford/Applications/env/lib/python3.7/site-packages/pyrosetta-2020.2+release.22ef835b4a2-py3.7-macosx-10.14-x86_64.egg/pyrosetta/database/rotamer/shapovalov/StpDwn_0-0-0/Dunbrack10.lib.bin'.\n",
"\u001b[0mcore.pack.dunbrack.RotamerLibrary: \u001b[0mDunbrack 2010 library took 0.408346 seconds to load from binary\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 115 rotamers at 9 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating DensePDInteractionGraph\n"
]
}
],
"source": [
"pose = pose_from_pdb( \"inputs/3gp6_A.pdb\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Then, initialize the protein in the membrane using `AddMembraneMover.` Here, the protein is already oriented in the bilayer so we can estimate the transmembrane spans from the structure and orientation. Thus, we use the `from_structure` option to initialize the spanning topology. "
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[0mprotocols.membrane.AddMembraneMover: \u001b[0m=====================================================================\n",
"\u001b[0mprotocols.membrane.AddMembraneMover: \u001b[0m|| WELCOME TO THE WORLD OF MEMBRANE PROTEINS... ||\n",
"\u001b[0mprotocols.membrane.AddMembraneMover: \u001b[0m=====================================================================\n",
"\u001b[0mprotocols.membrane.AddMembraneMover: \u001b[0mNo membrane residue was found\n",
"\u001b[0mprotocols.membrane.AddMembraneMover: \u001b[0mAdding a new membrane residue to the pose\n",
"\u001b[0mprotocols.membrane.AddMembraneMover: \u001b[0mAdding a membrane residue representing the position of the membrane after residue 155\n",
"\u001b[0mcore.chemical.GlobalResidueTypeSet: \u001b[0mLoading (but possibly not actually using) 'MEM' from the PDB components dictionary for residue type 'pdb_MEM'\n",
"\u001b[0mprotocols.membrane.AddMembraneMover: \u001b[0m Edge \t Jump Jump #\n",
" \t0156--0001 001\n",
"0001--0155\n",
"\u001b[0mprotocols.DsspMover: \u001b[0mLLHHHHHHHHHHHHHHHHHLLLEEEEEEEEEEEELLLLLLLLLLEEEEEEEEELLLLLEEEEEEEEEELLLLLEEEEEEEEEEEEELLLLLLLEEEEEEEEEEEEELHHHLLLEEEEEEEEEEEEELLEEEEEEEELLLLLLLLEEEEEEEEEELL\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mFilling membrane spanning topology from structure with thickness 15\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mcreate topology from structure\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mTMspan start at 3\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mTMspan end at 3\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mTMspan start at 6\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mTMspan end at 7\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mTMspan start at 10\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mTMspan end at 11\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mTMspan start at 13\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mTMspan end at 14\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mTMspan start at 17\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mTMspan end at 18\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mTMspan start at 23\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mTMspan end at 33\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mprev_end: 0, end: 33\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mAdding TMspan from 23 to 33\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mTMspan start at 45\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mTMspan end at 53\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mprev_end: 33, end: 53\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mAdding TMspan from 45 to 53\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mTMspan start at 59\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mTMspan end at 68\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mprev_end: 53, end: 68\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mAdding TMspan from 59 to 68\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mTMspan start at 74\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mTMspan end at 86\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mprev_end: 68, end: 86\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mAdding TMspan from 74 to 86\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mTMspan start at 94\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mTMspan end at 106\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mprev_end: 86, end: 106\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mAdding TMspan from 94 to 106\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mTMspan start at 108\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mTMspan end at 108\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mTMspan start at 110\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mTMspan end at 110\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mTMspan start at 114\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mTMspan end at 126\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mprev_end: 106, end: 126\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mAdding TMspan from 114 to 126\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mTMspan start at 129\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mTMspan end at 136\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mprev_end: 126, end: 136\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mAdding TMspan from 129 to 136\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mTMspan start at 145\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mTMspan end at 154\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mprev_end: 136, end: 154\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mAdding TMspan from 145 to 154\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mTotal # of TM spans: 8\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mNumber of residues in spanfile: 243\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mSpan 1: start: 23, end: 33 orientation: 1\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mSpan 2: start: 45, end: 53 orientation: 1\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mSpan 3: start: 59, end: 68 orientation: 1\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mSpan 4: start: 74, end: 86 orientation: 1\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mSpan 5: start: 94, end: 106 orientation: 1\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mSpan 6: start: 114, end: 126 orientation: 1\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mSpan 7: start: 129, end: 136 orientation: 1\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mSpan 8: start: 145, end: 154 orientation: 1\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mTotal # of TM spans: 8\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mNumber of residues in spanfile: 243\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mSpan 1: start: 23, end: 33 orientation: 1\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mSpan 2: start: 45, end: 53 orientation: 1\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mSpan 3: start: 59, end: 68 orientation: 1\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mSpan 4: start: 74, end: 86 orientation: 1\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mSpan 5: start: 94, end: 106 orientation: 1\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mSpan 6: start: 114, end: 126 orientation: 1\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mSpan 7: start: 129, end: 136 orientation: 1\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mSpan 8: start: 145, end: 154 orientation: 1\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mWATCH OUT: Writing spanfile out.span!\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mwrote out.span\n",
"\u001b[0mbasic.io.database: \u001b[0mDatabase file opened: membrane/implicit_lipid_parameters.txt\n",
"\u001b[0mprotocols.membrane.SetMembranePositionMover: \u001b[0mCalling SetMembranePositionMover\n",
"\u001b[0mprotocols.membrane.SetMembranePositionMover: \u001b[0mStarting foldtree: Is membrane fixed? 1\n",
"\u001b[0mprotocols.membrane.SetMembranePositionMover: \u001b[0m Edge \t Jump Jump #\n",
" \t0156--0001 001\n",
"0001--0155\n",
"\u001b[0mprotocols.membrane.SetMembranePositionMover: \u001b[0mFinal foldtree: Is membrane fixed? 1\n",
"\u001b[0mprotocols.membrane.SetMembranePositionMover: \u001b[0m Edge \t Jump Jump #\n",
" \t0156--0001 001\n",
"0001--0155\n",
"\u001b[0mprotocols.membrane.AddMembraneMover: \u001b[0mFinal foldtree: Is membrane fixed? 1\n",
"\u001b[0mprotocols.membrane.AddMembraneMover: \u001b[0m Edge \t Jump Jump #\n",
" \t0156--0001 001\n",
"0001--0155\n",
"\u001b[0mprotocols.membrane.AddMembraneMover: \u001b[0mMembraneInfo: Information about this Membrane Protein\n",
"\u001b[0mprotocols.membrane.AddMembraneMover: \u001b[0mMembrane Residue Num: 156\n",
"\u001b[0mprotocols.membrane.AddMembraneMover: \u001b[0mMembrane Fold Tree Jump: 1\n",
"\u001b[0mprotocols.membrane.AddMembraneMover: \u001b[0mMembrane Thickness: 15\n",
"\u001b[0mprotocols.membrane.AddMembraneMover: \u001b[0mMembrane Steepness: 10\n",
"\u001b[0mprotocols.membrane.AddMembraneMover: \u001b[0mMembrane Spanning Topology\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mTotal # of TM spans: 8\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mNumber of residues in spanfile: 243\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mSpan 1: start: 23, end: 33 orientation: 1\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mSpan 2: start: 45, end: 53 orientation: 1\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mSpan 3: start: 59, end: 68 orientation: 1\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mSpan 4: start: 74, end: 86 orientation: 1\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mSpan 5: start: 94, end: 106 orientation: 1\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mSpan 6: start: 114, end: 126 orientation: 1\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mSpan 7: start: 129, end: 136 orientation: 1\n",
"\u001b[0mcore.conformation.membrane.SpanningTopology: \u001b[0mSpan 8: start: 145, end: 154 orientation: 1\n",
"\u001b[0mprotocols.membrane.AddMembraneMover: \u001b[0mImplicit Lipid Information\n",
"\u001b[0mprotocols.membrane.AddMembraneMover: \u001b[0mInformation about the Lipid Composition:\n",
"\u001b[0mprotocols.membrane.AddMembraneMover: \u001b[0mWater thickness: 15.351\n",
"\u001b[0mprotocols.membrane.AddMembraneMover: \u001b[0mChange in water density: 0.343\n",
"\u001b[0mprotocols.membrane.AddMembraneMover: \u001b[0mTransformed water thickness: 199.57\n",
"\u001b[0mprotocols.membrane.AddMembraneMover: \u001b[0mLipid chain type: 12:0/12:0\n",
"\u001b[0mprotocols.membrane.AddMembraneMover: \u001b[0mLipid headgroup type: PC\n",
"\u001b[0mprotocols.membrane.AddMembraneMover: \u001b[0mLipid composition name (short-form): DLPC\n",
"\u001b[0mprotocols.membrane.AddMembraneMover: \u001b[0mLipid composition name (long-form): 1,2-dilauroyl-sn-glycero-3-phosphocholine\n",
"\u001b[0mprotocols.membrane.AddMembraneMover: \u001b[0mDegrees of saturation: 0\n",
"\u001b[0mprotocols.membrane.AddMembraneMover: \u001b[0mTemperature (celcius): 37\n",
"\u001b[0mprotocols.membrane.AddMembraneMover: \u001b[0mAre we accommodating the pore?: no\n",
"\u001b[0mprotocols.membrane.AddMembraneMover: \u001b[0mIs this an alpha helical protein?: yes\n"
]
}
],
"source": [
"from pyrosetta.rosetta.protocols.membrane import *\n",
"add_memb = AddMembraneMover(\"from_structure\")\n",
"add_memb.apply(pose)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## ∆∆G of mutation calculations\n",
"\n",
"Next, we will compute the ∆∆G for several point mutations in PagP. In the Marx & Fleming experiment, position V111 was first mutated to alanine. Therefore, we will create this variant first. We will use the `mutate_residue` function from the `predict_ddG` PyRosetta module included in this package. In this tutorial, we will use a repack radius of 8.0 Å. \n",
"\n",
"An important note - Pyrosetta residue numbering may differ from the PDB numbering because PyRosetta requires continuous numbering for calculations. Here, the PyRosetta residue number for V111 is 104. "
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[0mprotocols.membrane.scoring.FaWaterToBilayerEnergy: \u001b[0mReading fa_water_to_bilayer parameters from the database\n",
"\u001b[0mbasic.io.database: \u001b[0mDatabase file opened: membrane/memb_fa_params_2019.txt\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 50 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating PDInteractionGraph\n"
]
}
],
"source": [
"from additional_scripts import predict_ddG\n",
"# Create a franklin2019 energy function\n",
"sfxn = create_score_function(\"franklin2019\")\n",
"# Repack and score the native conformation\n",
"reference_pose = predict_ddG.mutate_residue(pose, 104, \"A\", 8.0, sfxn)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To demonstrate the ΔΔG calculation, we will now compute the energetic cost\n",
"of mutating alanine to tryptophan."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 54 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating PDInteractionGraph\n",
"-1.6647217086966464\n"
]
}
],
"source": [
"# Score the alanine reference pose\n",
"score_A111 = sfxn.score(reference_pose)\n",
"# Repack and score the L111 conformation\n",
"pose_W111 = predict_ddG.mutate_residue(pose, 104, \"W\", 8.0, sfxn)\n",
"score_W111 = sfxn.score(pose_W111)\n",
"# Compute the ddG of mutation as mutant_score - native_score (final-initial\n",
"ddG = score_W111 - score_A111\n",
"print(ddG)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The ΔΔG for mutating alanine to tryptophan at position 111 is -1.84 Rosetta Energy Units (REU). A\n",
"Rosetta Energy Unit is an arbitrary unit for the Rosetta energy function. Next, we would like to\n",
"compute the ΔΔG for mutating alanine to all 19 canonical amino acids. To do so, we will generalize\n",
"the code above into a function for easy calculations of multiple single point mutations."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"def compute_ddG(pose, native_res, site_no, mutant_res, sfxn): \n",
" \"\"\"A function for computing the ddG of single point mutations\n",
" \n",
" Example: \n",
" $ compute_ddG(pose, \"V\", 49, \"A\", sfxn)\n",
" \n",
" Arguments: \n",
" - pose = Object containing the coordinates for the biomolecular system\n",
" - native_res = Native amino acid\n",
" - site_no = Host site amino acid position\n",
" - mutant_res = Mutant amino acid\n",
" = sfxn = Score function object\n",
" \"\"\"\n",
" \n",
" repacked_native = predict_ddG.mutate_residue(pose, site_no, native_res, 8.0, sfxn)\n",
" native_score = sfxn.score(repacked_native)\n",
" repacked_mutant = predict_ddG.mutate_residue(pose, site_no, mutant_res, 8.0, sfxn)\n",
" mutant_score = sfxn.score(repacked_mutant)\n",
" ddG = mutant_score - native_score\n",
" return ddG"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, we will write a loop that computes the ΔΔG for all canonical amino acids and store the results\n",
"in a python dictionary."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 50 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating DensePDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 50 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating DensePDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 50 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating DensePDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 58 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating PDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 50 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating DensePDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 59 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating PDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 50 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating DensePDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 65 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating PDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 50 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating DensePDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 52 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating PDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 50 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating DensePDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 50 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating PDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 50 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating DensePDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 61 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating PDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 50 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating DensePDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 53 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating PDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 50 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating DensePDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 71 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating PDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 50 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating DensePDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 52 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating PDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 50 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating DensePDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 61 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating PDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 50 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating DensePDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 63 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating PDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 50 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating DensePDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 50 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating PDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 50 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating DensePDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 75 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating PDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 50 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating DensePDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 77 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating PDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 50 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating DensePDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 103 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating PDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 50 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating DensePDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 103 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating PDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 50 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating DensePDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 52 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating PDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 50 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating DensePDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 54 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating PDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 50 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating DensePDInteractionGraph\n",
"\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 53 rotamers at 11 positions.\n",
"\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating PDInteractionGraph\n"
]
}
],
"source": [
"# List of canonical amino acid one-letter codes\n",
"amino_acids = [ 'A', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'K', 'L', 'M', 'N', 'P', 'Q', 'R', 'S', 'T', 'V', 'W', 'Y' ]\n",
"# Initialize an empty dictionary to store the data\n",
"ddG_data = {}\n",
"# Loop through all amino acids\n",
"for aa in amino_acids:\n",
" ddG = compute_ddG(reference_pose, 'A', 104, aa, sfxn)\n",
" ddG_data[ aa ] = ddG"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The next step is to compare the ΔΔG predictions to the experimentally measured values. The experimental data for Marx & Fleming are located in `inputs` in a file called `PagP_Marx_Fleming_set.dat`. We will parse the file and then import these values into a |dictionary."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"# Read contents of file into a list (file format is space delimited)\n",
"with open( 'inputs/PagP_Marx_Fleming_set.dat', 'rt' ) as f:\n",
" data = f.readlines()\n",
" data = [ x.strip() for x in data ]\n",
" data = [ x.split(' ') for x in data ]\n",
"\n",
"# Convert the list into a dictionary\n",
"exp_ddG_data = {}\n",
"for i in range(1, len(data)):\n",
" exp_ddG_data[ data[i][2] ] = float(data[i][3])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We now convert the dictionary format to numpy arrays that are compatible with analysis. Here, we\n",
"will compute the correlation coefficient and make a scatterplot of the experimentally measured vs.\n",
"predicted values."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0.3590768320982386\n"
]
}
],
"source": [
"import numpy as np\n",
"mutations = np.asarray( ddG_data.keys() )\n",
"ddG_values = np.asarray( list(ddG_data.values()) )\n",
"exp_values = np.asarray( list(exp_ddG_data.values()) )\n",
"\n",
"# Compute the correlation coefficient\n",
"corr = np.corrcoef( exp_values, ddG_values )\n",
"print(corr[0,1])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We initially find that the correlation coefficient is low (0.376). We will want to find any outliers in the dataset that are lowering this value."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"167.71057239138239\n"
]
}
],
"source": [
"def find_outliers(x):\n",
" upper_quartile = np.percentile(x, 75)\n",
" lower_quartile = np.percentile(x, 25)\n",
" IQR = (upper_quartile - lower_quartile)\n",
" quartile_set = (lower_quartile - IQR, upper_quartile + IQR)\n",
" for y in x.tolist():\n",
" if (y < quartile_set[0]) or (y > quartile_set[1]):\n",
" print(y)\n",
"find_outliers(ddG_values)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here, the ΔΔG value for proline is an outlier. We will investigate this more later. For now, we will remove it from the set and then recompute the correlation coefficient."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0.6720890940120559\n"
]
}
],
"source": [
"# Proline is the 13th, amino acid of 20\n",
"exp_data_no_P = []\n",
"pred_data_no_P = []\n",
"for i in range(0, 20):\n",
" if ( i != 12 ):\n",
" exp_data_no_P.append( list(exp_ddG_data.values())[i] )\n",
" pred_data_no_P.append( list(ddG_data.values())[i] )\n",
"\n",
"exp_data_no_P = np.asarray( exp_data_no_P )\n",
"pred_data_no_P = np.asarray( pred_data_no_P )\n",
"corr = np.corrcoef( exp_data_no_P, pred_data_no_P )\n",
"print(corr[0,1])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The new value of R = 0.692 is much more encouraging! Next, we will visualized the predicted vs. experimentally measured values with a scatterplot."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"