{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "*This notebook contains material from [PyRosetta](https://RosettaCommons.github.io/PyRosetta);\n", "content is available [on Github](https://github.com/RosettaCommons/PyRosetta.notebooks.git).*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "< [High-Resolution Movers](http://nbviewer.jupyter.org/github/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/05.01-High-Res-Movers.ipynb) | [Contents](toc.ipynb) | [Index](index.ipynb) | [Packing & Design](http://nbviewer.jupyter.org/github/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/06.00-Introduction-to-Packing-and-Design.ipynb) >

" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Refinement Protocol" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[0mcore.init: \u001b[0mChecking for fconfig files in pwd and ./rosetta/flags\n", "\u001b[0mcore.init: \u001b[0mRosetta version: PyRosetta4.Release.python36.mac r208 2019.04+release.fd666910a5e fd666910a5edac957383b32b3b4c9d10020f34c1 http://www.pyrosetta.org 2019-01-22T15:55:37\n", "\u001b[0mcore.init: \u001b[0mcommand: PyRosetta -ex1 -ex2aro -database /Users/kathyle/Computational Protein Prediction and Design/PyRosetta4.Release.python36.mac.release-208/pyrosetta/database\n", "\u001b[0mcore.init: \u001b[0m'RNG device' seed mode, using '/dev/urandom', seed=-1509889871 seed_offset=0 real_seed=-1509889871\n", "\u001b[0mcore.init.random: \u001b[0mRandomGenerator:init: Normal mode, seed=-1509889871 RG_type=mt19937\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/Users/kathyle/Computational Protein Prediction and Design/PyRosetta4.Release.python36.mac.release-208/pyrosetta/teaching.py:13: UserWarning: Import of 'rosetta' as a top-level module is deprecated and may be removed in 2018, import via 'pyrosetta.rosetta'.\n", " from rosetta.core.scoring import *\n" ] } ], "source": [ "# Notebook setup\n", "import sys\n", "if 'google.colab' in sys.modules:\n", " !pip install pyrosettacolabsetup\n", " import pyrosettacolabsetup\n", " pyrosettacolabsetup.setup()\n", " print (\"Notebook is set for PyRosetta use in Colab. 
Have fun!\")\n", "\n", "from pyrosetta import *\n", "from pyrosetta.teaching import *\n", "init()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Make sure you are in the directory with the pdb files:**\n", "\n", "`cd google_drive/My\\ Drive/student-notebooks/`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The entire standard Rosetta refinement protocol, similar to that presented in Bradley, Misura, & Baker 2005, is available as a `Mover`. Note that the protocol can require ~40 minutes for a 100-residue protein. Try running it on a fresh `pose` made from the same 1YY8 PDB:\n", "\n", "```\n", "sfxn = get_fa_scorefxn()\n", "pose = pose_from_pdb(\"1YY8.clean.pdb\")\n", "relax = pyrosetta.rosetta.protocols.relax.ClassicRelax()\n", "relax.set_scorefxn(sfxn)\n", "relax.apply(pose)\n", "```" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "nbgrader": { "grade": true, "grade_id": "cell-7e7532d4e9a15b1a", "locked": false, "points": 0, "schema_version": 1, "solution": true } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[0mcore.scoring.ScoreFunctionFactory: \u001b[0mSCOREFUNCTION: \u001b[32mref2015\u001b[0m\n", "\u001b[0mcore.scoring.etable: \u001b[0mStarting energy table calculation\n", "\u001b[0mcore.scoring.etable: \u001b[0msmooth_etable: changing atr/rep split to bottom of energy well\n", "\u001b[0mcore.scoring.etable: \u001b[0msmooth_etable: spline smoothing lj etables (maxdis = 6)\n", "\u001b[0mcore.scoring.etable: \u001b[0msmooth_etable: spline smoothing solvation etables (max_dis = 6)\n", "\u001b[0mcore.scoring.etable: \u001b[0mFinished calculating energy tables.\n", "\u001b[0mbasic.io.database: \u001b[0mDatabase file opened: scoring/score_functions/hbonds/ref2015_params/HBPoly1D.csv\n", "\u001b[0mbasic.io.database: \u001b[0mDatabase file opened: scoring/score_functions/hbonds/ref2015_params/HBFadeIntervals.csv\n", "\u001b[0mbasic.io.database: \u001b[0mDatabase file opened: 
scoring/score_functions/hbonds/ref2015_params/HBEval.csv\n", "\u001b[0mbasic.io.database: \u001b[0mDatabase file opened: scoring/score_functions/hbonds/ref2015_params/DonStrength.csv\n", "\u001b[0mbasic.io.database: \u001b[0mDatabase file opened: scoring/score_functions/hbonds/ref2015_params/AccStrength.csv\n", "\u001b[0mcore.chemical.GlobalResidueTypeSet: \u001b[0mFinished initializing fa_standard residue type set. Created 696 residue types\n", "\u001b[0mcore.chemical.GlobalResidueTypeSet: \u001b[0mTotal time to initialize 1.07793 seconds.\n", "\u001b[0mbasic.io.database: \u001b[0mDatabase file opened: scoring/score_functions/rama/fd/all.ramaProb\n", "\u001b[0mbasic.io.database: \u001b[0mDatabase file opened: scoring/score_functions/rama/fd/prepro.ramaProb\n", "\u001b[0mbasic.io.database: \u001b[0mDatabase file opened: scoring/score_functions/omega/omega_ppdep.all.txt\n", "\u001b[0mbasic.io.database: \u001b[0mDatabase file opened: scoring/score_functions/omega/omega_ppdep.gly.txt\n", "\u001b[0mbasic.io.database: \u001b[0mDatabase file opened: scoring/score_functions/omega/omega_ppdep.pro.txt\n", "\u001b[0mbasic.io.database: \u001b[0mDatabase file opened: scoring/score_functions/omega/omega_ppdep.valile.txt\n", "\u001b[0mbasic.io.database: \u001b[0mDatabase file opened: scoring/score_functions/P_AA_pp/P_AA\n", "\u001b[0mbasic.io.database: \u001b[0mDatabase file opened: scoring/score_functions/P_AA_pp/P_AA_n\n", "\u001b[0mcore.scoring.P_AA: \u001b[0mshapovalov_lib::shap_p_aa_pp_smooth_level of 1( aka low_smooth ) got activated.\n", "\u001b[0mbasic.io.database: \u001b[0mDatabase file opened: scoring/score_functions/P_AA_pp/shapovalov/10deg/kappa131/a20.prop\n", "\u001b[0mcore.import_pose.import_pose: \u001b[0mFile 'inputs/1YY8.clean.pdb' automatically determined to be of type PDB\n", "\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: CG on residue ARG 18\n", "\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ 
WARNING ]\u001b[0m missing heavyatom: CD on residue ARG 18\n", "\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: NE on residue ARG 18\n", "\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: CZ on residue ARG 18\n", "\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: NH1 on residue ARG 18\n", "\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: NH2 on residue ARG 18\n", "\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: CG on residue GLN:NtermProteinFull 214\n", "\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: CD on residue GLN:NtermProteinFull 214\n", "\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: OE1 on residue GLN:NtermProteinFull 214\n", "\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: NE2 on residue GLN:NtermProteinFull 214\n", "\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: CG on residue ARG 452\n", "\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: CD on residue ARG 452\n", "\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: NE on residue ARG 452\n", "\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: CZ on residue ARG 452\n", "\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: NH1 on residue ARG 452\n", "\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: NH2 on residue ARG 452\n", "\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: CG 
on residue GLN:NtermProteinFull 648\n", "\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: CD on residue GLN:NtermProteinFull 648\n", "\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: OE1 on residue GLN:NtermProteinFull 648\n", "\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: NE2 on residue GLN:NtermProteinFull 648\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mFound disulfide between residues 23 88\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 23 CYS\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 88 CYS\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 23 CYD\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 88 CYD\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mFound disulfide between residues 134 194\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 134 CYS\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 194 CYS\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 134 CYD\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 194 CYD\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mFound disulfide between residues 235 308\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 235 CYS\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 308 CYS\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 235 CYD\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 308 CYD\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mFound disulfide between residues 359 415\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 359 CYS\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 
415 CYS\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 359 CYD\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 415 CYD\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mFound disulfide between residues 457 522\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 457 CYS\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 522 CYS\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 457 CYD\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 522 CYD\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mFound disulfide between residues 568 628\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 568 CYS\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 628 CYS\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 568 CYD\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 628 CYD\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mFound disulfide between residues 669 742\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 669 CYS\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 742 CYS\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 669 CYD\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 742 CYD\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mFound disulfide between residues 793 849\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 793 CYS\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 849 CYS\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 793 CYD\n", "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 849 CYD\n", "\u001b[0mcore.pack.pack_missing_sidechains: \u001b[0mpacking residue number 18 because of missing atom 
number 6 atom name CG\n", "\u001b[0mcore.pack.pack_missing_sidechains: \u001b[0mpacking residue number 214 because of missing atom number 6 atom name CG\n", "\u001b[0mcore.pack.pack_missing_sidechains: \u001b[0mpacking residue number 452 because of missing atom number 6 atom name CG\n", "\u001b[0mcore.pack.pack_missing_sidechains: \u001b[0mpacking residue number 648 because of missing atom number 6 atom name CG\n", "\u001b[0mcore.pack.task: \u001b[0mPacker task: initialize from command line()\n", "\u001b[0mcore.scoring.ScoreFunctionFactory: \u001b[0mSCOREFUNCTION: \u001b[32mref2015\u001b[0m\n", "\u001b[0mbasic.io.database: \u001b[0mDatabase file opened: scoring/score_functions/elec_cp_reps.dat\n", "\u001b[0mcore.scoring.elec.util: \u001b[0mRead 40 countpair representative atoms\n", "\u001b[0mcore.pack.dunbrack.RotamerLibrary: \u001b[0mshapovalov_lib_fixes_enable option is true.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[0mcore.pack.dunbrack.RotamerLibrary: \u001b[0mshapovalov_lib::shap_dun10_smooth_level of 1( aka lowest_smooth ) got activated.\n", "\u001b[0mcore.pack.dunbrack.RotamerLibrary: \u001b[0mBinary rotamer library selected: /Users/kathyle/Computational Protein Prediction and Design/PyRosetta4.Release.python36.mac.release-208/pyrosetta/database/rotamer/shapovalov/StpDwn_0-0-0/Dunbrack10.lib.bin\n", "\u001b[0mcore.pack.dunbrack.RotamerLibrary: \u001b[0mUsing Dunbrack library binary file '/Users/kathyle/Computational Protein Prediction and Design/PyRosetta4.Release.python36.mac.release-208/pyrosetta/database/rotamer/shapovalov/StpDwn_0-0-0/Dunbrack10.lib.bin'.\n", "\u001b[0mcore.pack.dunbrack.RotamerLibrary: \u001b[0mDunbrack 2010 library took 0.475769 seconds to load from binary\n", "\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 85 rotamers at 4 positions.\n", "\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating DensePDInteractionGraph\n", "\u001b[0mprotocols.relax.ClassicRelax: 
\u001b[0mSetting up default relax setting\n", "\u001b[0mprotocols.relax.ClassicRelax: \u001b[0m\n", "\u001b[0mprotocols.relax.ClassicRelax: \u001b[0m\n", "\u001b[0mprotocols.relax.ClassicRelax: \u001b[0m===================================================================\n", "\u001b[0mprotocols.relax.ClassicRelax: \u001b[0m Stage 1\n", "\u001b[0mprotocols.relax.ClassicRelax: \u001b[0m Ramping repulsives with 8 outer cycles and 1 inner cycles\n", "\u001b[0mcore.pack.task: \u001b[0mPacker task: initialize from command line()\n", "\u001b[0mcore.pack.pack_rotamers: \u001b[0mbuilt 33948 rotamers at 868 positions.\n", "\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mInstantiating DensePDInteractionGraph\n", "\u001b[0mcore.pack.interaction_graph.interaction_graph_factory: \u001b[0mHigh IG memory usage (>25 MB). If this becomes an issue, consider using a different interaction graph type.\n" ] } ], "source": [ "### BEGIN SOLUTION\n", "sfxn = get_fa_scorefxn()\n", "pose = pose_from_pdb(\"inputs/1YY8.clean.pdb\")\n", "relax = pyrosetta.rosetta.protocols.relax.ClassicRelax()\n", "relax.set_scorefxn(sfxn)\n", "relax.apply(pose)\n", "### END SOLUTION" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Programming Exercises\n", "\n", "\n", "1. Use the `Mover` constructs to create a complex folding algorithm. Create a program to do the following:\n", " 1. Five small moves\n", " 2. Minimize\n", " 3. Five shear moves\n", " 4. Minimize\n", " 5. Apply the Monte Carlo Metropolis criterion\n", " 6. Repeat steps 1–5 100 times\n", " 7. Repeat steps 1–6 five times, each time decreasing the magnitude of the small and shear moves, from 25° to 5° in 5° increments.\n", "\n", "\n", "Sketch a flowchart, and submit both the flowchart and your code." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "2. *Ab initio folding algorithm*. 
Based on the Monte Carlo energy optimization algorithm from Workshop #4, write a complete program that will fold a protein. A suggested algorithm involves preliminary low-resolution modifications by fragment insertion (first 9-mers, then 3-mers), followed by high-resolution refinement using small, shear, and minimization movers. Output both your low-resolution intermediate structure and the final refined, high-resolution decoy.\n", "\n", " Test your code by attempting to fold domain 2 of the RecA protein (the last 60 amino acid residues of PDB ID 2REB). How do your results compare with the crystal structure? (Consider both your low-resolution and high-resolution results.) If your lowest-energy conformation differs from the native structure, explain why in terms of the limitations of the computational approach.\n", "\n", " *Bonus*: After using the `PyMOL_Mover` or `PyMOL_Observer` to record the trajectory, export the frames and combine them into an animation. Search the Internet for “PyMOL animation” for additional tools and tips. Animated GIF files are probably the best quality; MPEG and QuickTime formats are also popular, widely compatible, and uploadable to YouTube." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "3. *AraC N-terminal arm*. The AraC transcription factor is believed to be activated by the conformational change that occurs in the N-terminus when arabinose binds. Let’s test whether PyRosetta can capture this change. Specifically, we will start with the arabinose-bound form and see if PyRosetta can refold it to the apo form.\n", "\n", " Download the arabinose-bound form of the AraC transcription factor. Edit the PDB file so that it contains only the arabinose-binding domain, and also remove any non-protein atoms (especially the arabinose). Set up a move map to include only the 15 N-terminal residues. 
Perform an *ab initio* search to find the lowest-energy conformation. How does it compare to the apo crystal form?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Thought Questions\n", "1. With $kT$ = 1, what change in propensity of the rama score component would give a small move a 50% chance of being accepted?\n", "\n", "\n", "2. How would you test whether an algorithm is effective? That is, what kinds of measures can you use? What can you vary within an algorithm to make it more effective?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "< [High-Resolution Movers](http://nbviewer.jupyter.org/github/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/05.01-High-Res-Movers.ipynb) | [Contents](toc.ipynb) | [Index](index.ipynb) | [Packing & Design](http://nbviewer.jupyter.org/github/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/06.00-Introduction-to-Packing-and-Design.ipynb) >

" ] } ], "metadata": { "celltoolbar": "Create Assignment", "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.0" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 2 }