{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Before you turn this problem in, make sure everything runs as expected. First, **restart the kernel** (in the menubar, select Kernel$\\rightarrow$Restart) and then **run all cells** (in the menubar, select Cell$\\rightarrow$Run All).\n", "\n", "Make sure you fill in any place that says `YOUR CODE HERE` or \"YOUR ANSWER HERE\", as well as your name and collaborators below:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "NAME = \"\"\n", "COLLABORATORS = \"\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "*This notebook contains material from [PyRosetta](https://RosettaCommons.github.io/PyRosetta.notebooks);\n", "content is available [on Github](https://github.com/RosettaCommons/PyRosetta.notebooks.git).*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "< [RosettaCarbohydrates: Trees, Selectors and Movers](http://nbviewer.jupyter.org/github/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/13.01-Glycan-Trees-Selectors-and-Movers.ipynb) | [Contents](toc.ipynb) | [Index](index.ipynb) | [RNA in PyRosetta](http://nbviewer.jupyter.org/github/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/14.00-RNA-Basics.ipynb) >
"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# RosettaCarbohydrates: Modeling and Design\n",
"Keywords: carbohydrate, glycan, glucose, mannose, sugar, design, prediction\n",
"\n",
"## Overview\n",
"Here, you will learn how to model glycans and design optimal glycosylation positions in a protein.\n",
"\n",
"We will be using the RosettaCarbohydrate framework to build and model glycans. The `GlycanModeler`, which is our main method for modeling glycans, will be published in 2020. We will be using some custom glycan options to load pdbs. \n",
"First, one needs the `-include_sugars` option, which will tell Rosetta to load sugars and add the sugar_bb energy term to a default scorefunction. This scoreterm is like rama for the sugar dihedrals which connect each sugar residue. \n",
"\n",
"\t\t-include_sugars\n",
"\n",
"\n",
"When loading structures from the PDB that include glycans, we use these options. This includes an option to write out the structures in pdb format instead of the Rosetta format (which is actually better). Again, this is included in the config/flags files you will be using.\n",
"\n",
"\t\t-maintain_links\n",
"\t\t-auto_detect_glycan_connections\n",
"\t\t-alternate_3_letter_codes pdb_sugar\n",
"\t\t-write_glycan_pdb_codes\n",
"\n",
"\n",
"More information on working with glycans can be found at this page: [Working With Glycans](https://www.rosettacommons.org/docs/latest/application_documentation/carbohydrates/WorkingWithGlycans)\n",
"\n",
"## Algorithm\n",
" \n",
"The `GlycanModeler` essentially builds glycans from the root (The first residue of the Tree) out to the trees in a way that simulates a tree growing. It uses a notion of a 'layer' where the layer is defined as the number of residues to the glycan root (with the glycan root being layer 0). Within modeling, all glycan residues other than the ones being optimized are 'virtualized'. In Rosetta, the term 'Virtual' means that these residues are present, but not scored. (It should be noted that it is now possible to turn any residues Virtual and back to Real using two movers: `ConvertVirtualToRealMover` and `ConvertRealToVirtualMover`. )\n",
"\n",
"Within the modeling application, sampling of glycan DOFs is done through the `GlycanSampler`. The sampler attempts to sample the large amount of DOFs available to a glycan tree. The GlycanSampler is a `WeightedRandomSampler`, which is a container of highly specific sampling strategies, where each strategy is weighted by a particular probability. At each apply, the mover selects one of these samplers using the probability set to it. This is the same way the SnugDock algorithm for antibody modeling works. \n",
"\n",
"Sampling is always scaled with the number of glycan residues that you are modeling, so run-time will increase proportionally as well. \n",
"If you are modeling a huge viral particle with lots of glycans, one can use quench mode, which will optimize each glycan individually. \n",
"Tpyically for these cases, multiple rounds of glycan modeling is desired. \n",
"\n",
"\n",
"### GlycanSampler Major components\n",
"\n",
"Some of these components were covered in the previous tutorial.\n",
"\n",
"1. __Glycan Conformers__\n",
"\n",
"\tThese conformers have been generated through an in-depth bioinformatic analysis of the PDB using adaptive kernal density estimates and are unique for each linkage type including glycan residues connected to ASN residues. A conformer is a specific conformation of all of the backbone dihedrals of a particular glycan linkage. Essentialy glycan 'fragments' for a particular type of linkage.\n",
"\n",
"\n",
"2. __SugarBB Sampling__ \n",
"\n",
"\tThis sampling is done through turning the `sugar_bb` energy term into a set of probabilities using the -log(e) function. This allows us to sample on the QM derived torsonal potentials during modeling. \n",
"\n",
"\n",
"3. __Random Sampling and Shear Moves__\n",
"\n",
"\tWe sample random torsions at +/- 15 , +/- 45, +/- 90 degrees, each at decreasing probabilities at a 4:2:1 ratio of sampling Small,Medium,Large. \n",
"\tShear sampling is done where torsions are set for two residues in order to reduce downsteam effects and allow 'flipping' of the glycan torsions.\n",
"\n",
"\n",
"4. __Minimization__\n",
"\t\n",
"\tWe Minimize Sugar residues by randomly selecting a residue from what is set to model, and selecting all residues out to the tree that are not virtualized. This reduces computational time that would otherwise restrict the total number of glycan residues we could model at once.\n",
" \n",
"\n",
"5. __Packing__\n",
"\n",
"\tOf the residues set to optimize, we chooses a random residue and pack that residue and all residues out to the tree that are not virtualized. We pack the sugar residues (OH and constituents) and any neighboring protein sidechains. TaskOperations may be set to allow design of protein residues during this. We do packing this way to once again reduce total computational time.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Notebook setup\n",
"import sys\n",
"if 'google.colab' in sys.modules:\n",
" !pip install pyrosettacolabsetup\n",
" import pyrosettacolabsetup\n",
" pyrosettacolabsetup.setup()\n",
" print (\"Notebook is set for PyRosetta use in Colab. Have fun!\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Make sure you are in the directory with the pdb files:**\n",
"\n",
"`cd google_drive/My\\ Drive/student-notebooks/`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# General Setup and Inputs\n",
"\n",
"You will be using a few different inputs. We will be designing in glycosylation spots in order to block antibody binding at a highly curved epitope, and we will be loading a human structure from the PDB that has internal glycans. \n",
"\n",
"\n",
"## Notes for Tutorial Shortening\n",
"\n",
"\n",
"Typically, the value of `-glycan_sampler_rounds` is set to 25 (which typically is enough) and nstruct is about 5-10k per input structure. You may increase glycan_sampler_rounds to 100 and then decrease output to 1-2500 nstruct in order to have the same level of sampling, which will result in very good models as well. Since this is denovo modeling of glycans, more nstruct is almost always better. For some tutorials, we may decrease this value below our optimal value in order to shorten the length of the tutorial.\n",
"\n",
"\n",
"## General Notes\n",
"\n",
"We will use a flags file for all common options in this tutorial. Note that instead of passing this flag on init, you can instead put it into your working directory or a particular place in your home directory and rename it common. \n",
" \n",
"See this page for more info on using rosetta with custom config files: "
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.0"
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": true,
"sideBar": true,
"skip_h1_title": false,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {},
"toc_section_display": true,
"toc_window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 1
}