{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<!--NOTEBOOK_HEADER-->\n",
    "*This notebook contains material from [PyRosetta](https://RosettaCommons.github.io/PyRosetta.notebooks);\n",
    "content is available [on Github](https://github.com/RosettaCommons/PyRosetta.notebooks.git).*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<!--NAVIGATION-->\n",
    "< [RosettaAntibodyDesign](http://nbviewer.jupyter.org/github/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/12.02-RosettaAntibodyDesign-RAbD.ipynb) | [Contents](toc.ipynb) | [Index](index.ipynb) | [RosettaCarbohydrates: Trees, Selectors and Movers](http://nbviewer.jupyter.org/github/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/13.01-Glycan-Trees-Selectors-and-Movers.ipynb) ><p><a href=\"https://colab.research.google.com/github/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/13.00-RosettaCarbohydrates-Working-with-Glycans.ipynb\"><img align=\"left\" src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open in Colab\" title=\"Open in Google Colaboratory\"></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# RosettaCarbohydrates\n",
    "Keywords: carbohydrate, glycan, sugar, glucose, mannose, sugar, GlycanTreeSet, saccharide, furanose, pyranose, aldose, ketose\n",
    "\n",
    "## Overview\n",
    "\n",
    "In this chapter, we will focus on a special subset of non-peptide oligo- and polymers — carbohydrates.</p>\n",
    "\n",
    "Modeling carbohydrates — also known as saccharides, glycans, or simply sugars — comes with some special challenges. For one, most saccharide residues contain a ring as part of their backbone. This ring provides potentially new degrees of freedom when sampling. Additionally, carbohydrate structures are often branched, leading in Rosetta to more complicated `FoldTrees`.\n",
    "    \n",
    "This chapter includes a quick overview of carbohydrate nomenclature, structure, and basic interactions within Rosetta.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Carbohydrate Chemistry Background"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<div class=\"figure-right\" style=\"clear:right;float:right;width:300px\"><img src=\"./Media/Fig-pyranose_vs_furanose.png\" width=\"300px\"><small><b>Figure 1.</b> A pyranose (left) and a furanose (right).</small></div>\n",
    "<p>Sugars (<b>saccharides</b>) are defined as hyroxylated aldehydes and ketones. A typical monosaccharide has an equal number of carbon and oxygen atoms. For example, glucose has the molecular formula C<sub>6</sub>H<sub>12</sub>O<sub>6</sub>.</p><p>Sugars containing more than three carbons will spontaneously cyclize in aqueous environments to form five- or six-membered hemiacetals and hemiketals. Sugars with five-membered rings are called <b>furanoses</b>; those with six-membered rings are called <b>pyranoses</b> (Fig. 1).</p>\n",
    "<div class=\"figure-left\" style=\"clear:left;float:left;width:300px\"><img src=\"./Media/Fig-aldose_vs_ketose.png\" width=\"300px\"><small><b>Figure 2.</b> An aldose (left) and a ketose (right).</small></div>\n",
    "<p>A sugar is classified as an <b>aldose</b> or <b>ketose</b>, depending on whether it has an aldehyde or ketone in its linear form (Fig. 2).</p>\n",
    "<p>The different sugars have different names, depending on the stereochemistry at each of the carbon atoms in the molecule. For example, glucose has one set of stereochemistries, while mannose has another.</p>\n",
    "<p>In addition to their full names, many individual saccharide residues have three-letter codes, just like amino acid residues do. Glucose is \"Glc\" and mannose is \"Man\".</p>\n",
    "\n",
    "## Backbone Torsions, Residue Connections, and side-chains\n",
    "\n",
    "A glycan tree is made up of many sugar residues, each residue a ring.  The 'backbone' of a glycan is the connection between one residue and another. The chemical makeup of each sugar residue in this 'linkage' effects the propensity/energy of each bacbone dihedral angle.  In addition, sugars can be attached via different carbons of the parent glycan.  In this way, the chemical makeup and the attachment position effects the dihedral propensities. Typically, there are two backbone dihedral angles, but this could be up to 4+ angles depending on the connection.\n",
    "\n",
    "In IUPAC, the dihedrals of N are defined as the dihedrals between N and N-1 (IE - the parent linkage). The ASN (or other glycosylated protein residue's) dihedrals become part of the first glycan residue that is connected. For this first first glycan residue that is connected to an ASN, it has 4 torsions, while the ASN now has none!\n",
    "\n",
    "If you are creating a movemap for dihedral residues, please use the `MoveMapFactory` as this has the IUPAC nomenclature of glycan residues built in in order to allow proper DOF sampling of the backbone residues, especially for branching glycan trees.  In general, all of our samplers should use residue selectors and use the MoveMapFactory to build movemaps internally.\n",
    "\n",
    "A sugar's side-chains are the constitutents of the glycan ring, which are typically an OH group or an acetyl group.  These are sampled together at 60 degree angles by default during packing.  A higher granularity of rotamers cannot currently be handled in Rosetta, but 60 degrees seems adequete for our purposes.\n",
    "\n",
    "Within Rosetta, glycan connectivity information is stored in the `GlycanTreeSet`, which is continually updated to reflect any residue changes or additions to the pose. \n",
    "This info is always available through the function \n",
    "\n",
    "\t\tpose.glycan_tree_set()\n",
    "\n",
    "Chemical information of each glycan residue can be accessed through the CarbohydrateInfo object, which is stored in each ResidueType object: \n",
    "\n",
    "\t\tpose.residue_type(i).carbohydrate_info()\n",
    "      \n",
    "We will cover both of these classes in the next tutorial.\n",
    "\n",
    "## Documentation\n",
    "https://www.rosettacommons.org/docs/latest/application_documentation/carbohydrates/WorkingWithGlycans\n",
    "\n",
    "\n",
    "## References\n",
    "\n",
    "\n",
    "**Residue centric modeling and design of saccharide and glycoconjugate structures**\n",
    "Jason W. Labonte  Jared Adolf-Bryfogle  William R. Schief  Jeffrey J. Gray\n",
    "_Journal of Computational Chemistry_, 11/30/2016 - <https://doi.org/10.1002/jcc.24679>\n",
    "\n",
    "\n",
    "**Automatically Fixing Errors in Glycoprotein Structures with Rosetta**\n",
    "Brandon Frenz, Sebastian Rämisch, Andrew J. Borst, Alexandra C. Walls\n",
    "Jared Adolf-Bryfogle, William R. Schief, David Veesler, Frank DiMaio\n",
    "_Structure_, 1/2/2019"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Initialization\n",
    "\n",
    "<p>Let's use Pyrosetta to compare some common monosaccharide residues and see how they differ. As usual, we start by importing the `pyrosetta` and `rosetta` namespaces.</p>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install pyrosettacolabsetup\n",
    "import pyrosettacolabsetup; pyrosettacolabsetup.install_pyrosetta(cache_wheel_on_google_drive=False, serialization=False, mirror=0)\n",
    "import pyrosetta; pyrosetta.init()\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pyrosetta import *\n",
    "from pyrosetta.teaching import *\n",
    "from pyrosetta.rosetta import *"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First, one needs the `-include_sugars` option, which will tell Rosetta to load sugars and add the sugar_bb energy term to a default scorefunction.  This scoreterm is like rama for the sugar dihedrals which connect each sugar residue. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "PyRosetta-4 2019 [Rosetta PyRosetta4.Release.python36.mac 2019.39+release.93456a567a8125cafdf7f8cb44400bc20b570d81 2019-09-26T14:24:44] retrieved from: http://www.pyrosetta.org\n",
      "(C) Copyright Rosetta Commons Member Institutions. Created in JHU by Sergey Lyskov and PyRosetta Team.\n",
      "\u001b[0mcore.init: \u001b[0mChecking for fconfig files in pwd and ./rosetta/flags\n",
      "\u001b[0mcore.init: \u001b[0mReading fconfig.../Users/jadolfbr/.rosetta/flags/common\n",
      "\u001b[0mcore.init: \u001b[0m\n",
      "\u001b[0mcore.init: \u001b[0m\n",
      "\u001b[0mcore.init: \u001b[0mRosetta version: PyRosetta4.Release.python36.mac r233 2019.39+release.93456a567a8 93456a567a8125cafdf7f8cb44400bc20b570d81 http://www.pyrosetta.org 2019-09-26T14:24:44\n",
      "\u001b[0mcore.init: \u001b[0mcommand: PyRosetta -include_sugars -database /Users/jadolfbr/Library/Python/3.6/lib/python/site-packages/pyrosetta-2019.39+release.93456a567a8-py3.6-macosx-10.6-intel.egg/pyrosetta/database\n",
      "\u001b[0mbasic.random.init_random_generator: \u001b[0m'RNG device' seed mode, using '/dev/urandom', seed=1177525307 seed_offset=0 real_seed=1177525307\n",
      "\u001b[0mbasic.random.init_random_generator: \u001b[0mRandomGenerator:init: Normal mode, seed=1177525307 RG_type=mt19937\n"
     ]
    }
   ],
   "source": [
    "init('-include_sugars')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "When loading structures from the PDB that include glycans, we use these options. This includes an option to write out the structures in pdb format instead of the (better) Rosetta format.  We will be using these options in the next tutorial. \n",
    "\n",
    "\t\t-maintain_links\n",
    "\t\t-auto_detect_glycan_connections\n",
    "\t\t-alternate_3_letter_codes pdb_sugar\n",
    "\t\t-write_glycan_pdb_codes\n",
    "        -load_PDB_components false"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<ul><li>Set up the `PyMOLMover` for viewing structures.</li></ul>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "pm = PyMOLMover()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Creating Saccharides from Sequence\n",
    "\n",
    "We will use the function, `pose_from_saccharide_sequence()`, which must  be imported from the `core.pose` namespace. Unlike with peptide chains, one-letter-codes will not suffice when specifying saccharide chains, because there is too much information to convey; we must use at least four letters. The first three letters are the sugar's three-letter code; the fourth letter designates whether the residue is a furanose (`f`) or pyranose (`p`)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pyrosetta.rosetta.core.pose import pose_from_saccharide_sequence"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[0mcore.chemical.GlobalResidueTypeSet: \u001b[0mFinished initializing fa_standard residue type set.  Created 1251 residue types\n",
      "\u001b[0mcore.chemical.GlobalResidueTypeSet: \u001b[0mTotal time to initialize 1.25647 seconds.\n",
      "\u001b[0mcore.pose: \u001b[0m by appending by jump...\n",
      "\u001b[0mcore.conformation.carbohydrates.GlycanTreeSet: \u001b[0mSetting up Glycan Trees\n",
      "\u001b[0mcore.conformation.carbohydrates.GlycanTreeSet: \u001b[0mFound 1 glycan trees.\n",
      "\u001b[0mcore.pose: \u001b[0m by appending by jump...\n",
      "\u001b[0mcore.conformation.carbohydrates.GlycanTreeSet: \u001b[0mSetting up Glycan Trees\n",
      "\u001b[0mcore.conformation.carbohydrates.GlycanTreeSet: \u001b[0mFound 1 glycan trees.\n",
      "\u001b[0mcore.pose: \u001b[0m by appending by jump...\n",
      "\u001b[0mcore.conformation.carbohydrates.GlycanTreeSet: \u001b[0mSetting up Glycan Trees\n",
      "\u001b[0mcore.conformation.carbohydrates.GlycanTreeSet: \u001b[0mFound 1 glycan trees.\n"
     ]
    }
   ],
   "source": [
    "glucose = pose_from_saccharide_sequence('Glcp')\n",
    "galactose = pose_from_saccharide_sequence('Galp')\n",
    "mannose = pose_from_saccharide_sequence('Manp')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<ul><li>Use the `PyMOLMover` to compare the three monosacharides in PyMOL.</li></ul>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<ul><li>At which carbons do the three sugars differ?</li></ul>\n",
    "\n",
    "### L and D Forms\n",
    "\n",
    "<p>Just like with peptides, saccharides come in two enantiomeric forms, labelled <font style=\"font-variant: small-caps\">l</font> and <font style=\"font-variant: small-caps\">d</font>. (Note the small-caps, used in print.) These can be loaded into PyRosetta using the prefixes `L-` and `D-`.</p>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[0mcore.pose: \u001b[0m by appending by jump...\n",
      "\u001b[0mcore.conformation.carbohydrates.GlycanTreeSet: \u001b[0mSetting up Glycan Trees\n",
      "\u001b[0mcore.conformation.carbohydrates.GlycanTreeSet: \u001b[0mFound 1 glycan trees.\n"
     ]
    }
   ],
   "source": [
    "L_glucose = pose_from_saccharide_sequence('L-Glcp')\n",
    "D_glucose = pose_from_saccharide_sequence('D-Glcp')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<ul><li>Compare the two structures in PyMOL. Notice that all stereocenters are inverted between the two monosaccharides.</li></ul>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<ul><li>Which enantiomer is loaded by PyRosetta by default if <font style=\"font-variant: small-caps\">l</font> or <font style=\"font-variant: small-caps\">d</font> are not specified?</li></ul>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Anomers\n",
    "\n",
    "<p>The carbon that is at a higher oxidation state — that is, the carbon of the hemiacetal/-ketal in the cyclic form or the carbon that is the carbonyl carbon of the aldehyde or ketone in the linear form — is called the <b>anomeric carbon</b>. Because the carbonyl of an aldehyde or ketone is planar, a sugar molecule can cyclize into one of two forms, one in which the resulting hydroxyl group is pointing \"up\" and another in which the same hydroxyl group is pointing \"down\". These two <b>anomers</b> are labelled α and β.</p>\n",
    "<ul><li>Create a one-residue `Pose` for both α- and β-<font style=\"font-variant: small-caps\">d</font>-glucopyranose and use PyMOL to compare both.</li></ul>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[0mcore.pose: \u001b[0m by appending by jump...\n",
      "\u001b[0mcore.conformation.carbohydrates.GlycanTreeSet: \u001b[0mSetting up Glycan Trees\n",
      "\u001b[0mcore.conformation.carbohydrates.GlycanTreeSet: \u001b[0mFound 1 glycan trees.\n"
     ]
    }
   ],
   "source": [
    "alpha_D_glucose = pose_from_saccharide_sequence('a-D-Glcp')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<ul><li>For which anomer is the C1 hydroxyl group axial to the chair conformation of the six-membered pyranose ring?</li>\n",
    "<li>Which anomer of <font style=\"font-variant: small-caps\">d</font>-glucose would you predict to be the most stable? (Hint: remember what you learned in organic chemistry about axial and equatorial substituents.)</li></ul>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Linear Oligosaccharides & IUPAC Sequences\n",
    "\n",
    "Oligo- and polysaccharides are composed of simple monosaccharide residues connected by acetal and ketal linkages called __glycosidic bonds__. Any of the monosaccharide's _hydroxyl_ groups can be used to form a linkage to the anomeric carbon of another monosaccharide, leading to both _linear_ and _branched_ molecules.\n",
    "    \n",
    "Rosetta can create both _linear_ and _branched_ oligosaccharides from an __IUPAC__ sequence. (IUPAC is the international organization dedicated to chemical nomenclature.)\n",
    "\n",
    "<p>To  properly  build  a linear oligosaccharide, Rosetta  must  know  the  following  details  about  each sugar residue being created in the following order:</p>\n",
    "\n",
    " - Main-chain connectivity — →2) (`->2)`), →4) (`->4)`), →6) (`->6)`), _etc._; default value  is `->4)-`\n",
    " - Anomeric form — α (`a` or `alpha`) or β (`b` or `beta`); default value is `alpha`\n",
    " - Enantiomeric form — <font style=\"font-variant: small-caps\">l</font> (`L`) or <font style=\"font-variant: small-caps\">d</font> (`D`); default value is `D`\n",
    " - 3-Letter code — required; uses sentence case\n",
    " - Ring form code — <i>f</i> (for a furanose/5-membered ring), <i>p</i> (for a pyranose/6-membered ring); required\n",
    "  \n",
    "  \n",
    "Residues must be separated by hyphens. Glycosidic linkages can be specified with full IUPAC notation, _e.g._, `-(1->4)-` for “-(1→4)-”. (This means that the residue on the left connects from its C1 (anomeric) position to the hydoxyl oxygen at C4 of the residue on the right.) Rosetta will assume `-(1->` for  aldoses  and `-(2->` for ketoses.</p><p>Note that the standard is to write the IUPAC sequence of a saccharide chain <em>in reverse order from how they are numbered</em>.  Lets create three new oligosacharides from sequence."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[0mcore.pose: \u001b[0m by appending by jump...\n",
      "\u001b[0mcore.conformation.carbohydrates.GlycanTreeSet: \u001b[0mSetting up Glycan Trees\n",
      "\u001b[0mcore.conformation.carbohydrates.GlycanTreeSet: \u001b[0mFound 1 glycan trees.\n",
      "\u001b[0mcore.pose: \u001b[0m by appending by jump...\n",
      "\u001b[0mcore.conformation.carbohydrates.GlycanTreeSet: \u001b[0mSetting up Glycan Trees\n",
      "\u001b[0mcore.conformation.carbohydrates.GlycanTreeSet: \u001b[0mFound 1 glycan trees.\n",
      "\u001b[0mcore.pose: \u001b[0m by appending by jump...\n",
      "\u001b[0mcore.conformation.carbohydrates.GlycanTreeSet: \u001b[0mSetting up Glycan Trees\n",
      "\u001b[0mcore.conformation.carbohydrates.GlycanTreeSet: \u001b[0mFound 1 glycan trees.\n"
     ]
    }
   ],
   "source": [
    "maltotriose = pose_from_saccharide_sequence('a-D-Glcp-' * 3)\n",
    "lactose = pose_from_saccharide_sequence('b-D-Galp-(1->4)-a-D-Glcp')\n",
    "isomaltose = pose_from_saccharide_sequence('->6)-Glcp-' * 2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### General Residue Information\n",
    "When you print a `Pose` containing carbohydrate residues, the sugar residues will be listed as `Z` in the sequence."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "maltotriose\n",
      " PDB file name: alpha-D-Glcp-(1->4)-alpha-D-Glcp-(1->4)-alpha-D-Glcp\n",
      "Total residues: 3\n",
      "Sequence: ZZZ\n",
      "Fold tree:\n",
      "FOLD_TREE  EDGE 1 3 -1 \n",
      "\n",
      "isomaltose\n",
      " PDB file name: alpha-D-Glcp-(1->6)-alpha-D-Glcp\n",
      "Total residues: 2\n",
      "Sequence: ZZ\n",
      "Fold tree:\n",
      "FOLD_TREE  EDGE 1 2 -1 \n",
      "\n",
      "lactose\n",
      " PDB file name: beta-D-Galp-(1->4)-alpha-D-Glcp\n",
      "Total residues: 2\n",
      "Sequence: ZZ\n",
      "Fold tree:\n",
      "FOLD_TREE  EDGE 1 2 -1 \n"
     ]
    }
   ],
   "source": [
    "print(\"maltotriose\\n\", maltotriose)\n",
    "print(\"\\nisomaltose\\n\", isomaltose)\n",
    "print(\"\\nlactose\\n\", lactose)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "However, you can have Rosetta print out the sequences for individual chains, using the `chain_sequence()` method. If you do this, Rosetta is smart enough to give you a distinct sequence format for saccharide chains. (You may have noticed that the default file name for a `.pdb` file created from this `Pose` will be the same sequence.)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "alpha-D-Glcp-(1->4)-alpha-D-Glcp-(1->4)-alpha-D-Glcp\n"
     ]
    }
   ],
   "source": [
    "print(maltotriose.chain_sequence(1))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "alpha-D-Glcp-(1->6)-alpha-D-Glcp\n"
     ]
    }
   ],
   "source": [
    "print(isomaltose.chain_sequence(1))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "beta-D-Galp-(1->4)-alpha-D-Glcp\n"
     ]
    }
   ],
   "source": [
    "print(lactose.chain_sequence(1))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<p>Again, the standard is to show the sequence of a saccharide chain in reverse order from how they are numbered.</p>  This is also how phi, psi, and omega are defined. From i+1 to i. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1 ->4)-alpha-D-Glcp:reducing_end\n",
      "2 ->4)-beta-D-Galp:non-reducing_end\n"
     ]
    }
   ],
   "source": [
    "for res in lactose.residues: print(res.seqpos(), res.name())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<p>Notice that for polysaccharides, the upstream residue is called the <b>reducing end</b>, while the downstream residue is called the <b>non-reducing end</b>.</p>\n",
    "\n",
    "You will also see the terms parent and child being used across Rosetta.  Here, for Residue 2, residue 1 is the parent.  For Residue 1, Residue 2 is the child.  Due to branching, residues can have more than one child/non-reducing-end, but only a single parent residue. \n",
    "\n",
    "<p>Rosetta stores carbohydrate-specific information within `ResidueType`. If you print a residue, this additional information will be displayed.</p>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Residue 1: ->4)-alpha-D-Glcp:reducing_end:non-reducing_end (Glc, Z):\n",
      "Base: ->4)-alpha-D-Glcp\n",
      " Properties: POLYMER CARBOHYDRATE LOWER_TERMINUS UPPER_TERMINUS POLAR CYCLIC HEXOSE ALDOSE D_SUGAR PYRANOSE ALPHA_SUGAR\n",
      " Variant types: UPPER_TERMINUS_VARIANT LOWER_TERMINUS_VARIANT\n",
      " Main-chain atoms:  C1   C2   C3   C4   O4 \n",
      " Backbone atoms:    C1   C2   C3   C4   O4   C5   O5   VO5  VC1  H1   H2   H3   H4   HO4  H5 \n",
      " Ring atoms:    C1   C2   C3   C4   C5   O5 \n",
      " Side-chain atoms:  O1   O2   O3   C6   O6   HO1  HO2  HO3 1H6  2H6   HO6\n",
      "Carbohydrate Properties for this Residue:\n",
      " Basic Name: glucose\n",
      " IUPAC Name: alpha-D-glucopyranose\n",
      " Abbreviation: alpha-D-Glcp\n",
      " Classification: aldohexose\n",
      " Stereochemistry: D\n",
      " Ring Form: pyranose\n",
      " Anomeric Form: alpha\n",
      " Modifications: \n",
      "  none\n",
      " Polymeric Information:\n",
      "  Main chain connection: N/A\n",
      "  Branch connections: none\n",
      "Ring Conformer: 4C1 (chair): C-P parameters (q, phi, theta): 0.55, 180, 0; nu angles (degrees): 60, -60, 60, -60, 60, -60\n",
      "  O1 : axial\n",
      "  O2 : equatorial\n",
      "  O3 : equatorial\n",
      "  O4 : equatorial\n",
      "  C6 : equatorial\n",
      "Atom Coordinates:\n",
      "   C1 : 0, 0, 0\n",
      "   C2 : 1.55, 0, 0\n",
      "   C3 : 2.04812, 1.44664, 0\n",
      "   C4 : 1.50806, 2.11919, -1.26369\n",
      "   O4 : 1.94666, 3.46908, -1.30661\n",
      "   C5 : -0.0200415, 2.06186, -1.21358\n",
      "   O5 : -0.475077, 0.686176, -1.1593\n",
      "   VO5: -0.492509, 0.676579, -1.17187 (virtual)\n",
      "   VC1: 0.031762, 0.00822503, 0.00564973 (virtual)\n",
      "   O1 : -0.494034, 0.697555, 1.2082\n",
      "   O2 : 2.02401, -0.669275, 1.15922\n",
      "   O3 : 3.4779, 1.4716, 1.64563e-16\n",
      "   C6 : -0.614146, 2.71298, -2.43962\n",
      "   O6 : -0.225074, 4.07556, -2.53127\n",
      "   H1 : -0.370662, -1.03564, 0.00767336\n",
      "   H2 : 1.90812, -0.520035, -0.900727\n",
      "   H3 : 1.67301, 1.95456, 0.900727\n",
      "   H4 : 1.88381, 1.57916, -2.14527\n",
      "   HO4: 1.61609, 3.94572, -0.516717\n",
      "   H5 : -0.369153, 2.59396, -0.316372\n",
      "   HO1: -0.167832, 1.62167, 1.20877\n",
      "   HO2: 3.00401, -0.669275, 1.15922\n",
      "   HO3: 3.78886, 2.40096, 5.03844e-17\n",
      "  1H6 : -1.71106, 2.65811, -2.3783\n",
      "  2H6 : -0.261365, 2.17983, -3.33478\n",
      "   HO6: -0.621924, 4.47587, -3.33293\n",
      "Mirrored relative to coordinates in ResidueType: FALSE\n",
      "\n"
     ]
    }
   ],
   "source": [
    "print(glucose.residue(1))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<ul><li>Scanning the output from printing a glucose `Residue`, what is the general term for an aldose with six carbon atoms?</li>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exploring Carbohydrate Structure\n",
    "\n",
    "### Torsion Angles"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<p>Most bioolymers have predefined, named torsion angles for their main-chain and side-chain bonds, such as φ, ψ, and ω and the various χs for amino acid residues. The same is true for saccharide residues. The torsion angles of sugars are as follows:</p>\n",
    "<ul><div class=\"figure-right\" style=\"clear:right;float:right;width:300px\"><img src=\"./Media/Fig-saccharide_main_chain_torsions.png\" width=\"300px\"><small><b>Figure 3.</b> A disaccharide's main-chain torsion angles.</small></div><div class=\"figure-right\" style=\"clear:right;float:right;width:150px\"><img src=\"./Media/Fig-saccharide_ring_torsions.png\" width=\"150px\"><small><b>Figure 4.</b> A monosaccharide's internal ring torsion angles.</small></div><div class=\"figure-right\" style=\"clear:right;float:right;width:150px\"><img src=\"./Media/Fig-saccharide_side_chain_torsions.png\" width=\"150px\"><small><b>Figure 5.</b> A monosaccharide's side-chain torsion angles.</small></div><li>φ — The 1<sup>st</sup> glycosidic torsion back to the <em>previous</em> (<i>n</i>&minus;1) residue. The angle is defined by the cyclic oxygen, the two atoms across the bond, and the cyclic carbon numbered one less than the glycosidic linkage position. For aldopyranoses, φ(<i>n</i>) is thus defined as O5(<i>n</i>)–C1(<i>n</i>)–O<i>X</i>(<i>n</i>&minus;1)–C<i>X</i>(<i>n</i>&minus;1), where <i>X</i> is the position of the glycosidic linkage. For aldofuranoses, φ(<i>n</i>) is defined as O4(<i>n</i>)–C1(<i>n</i>)–O<i>X</i>(<i>n</i>&minus;1)–CX(<i>n</i>&minus;1) For 2-ketopyranoses, φ(<i>n</i>) is defined as O6(<i>n</i>)–C2(<i>n</i>)–O<i>X</i>(<i>n</i>&minus;1)–C<i>X</i>(<i>n</i>&minus;1). For 2-ketofuranoses, φ(<i>n</i>) is defined as O5(<i>n</i>)–C2(<i>n</i>)–OX(<i>n</i>&minus;1)–CX(<i>n</i>&minus;1). <i>Et cetera</i>&hellip;.</li><li>ψ — The 2<sup>nd</sup> glycosidic torsion back to the <em>previous</em> (<i>n</i>&minus;1) residue. The angle is defined by the \n",
    "anomeric carbon, the two atoms across the bond, and the cyclic carbon numbered two less than \n",
    "    the glycosidic linkage position. ψ(<i>n</i>) is thus defined as C<sub>anomeric</sub>(<i>n</i>)–O<i>X</i>(<i>n</i>&minus;1)–C<i>X</i>(<i>n</i>&minus;1)–C<i>X</i>&minus;1(<i>n</i>&minus;1), where <i>X</i> is the position of the glycosidic linkage.</li><li>ω — The 3<sup>rd</sup> (and any subsequent) glycosidic torsion(s) back to the <em>previous residue</em>. ω<sub>1</sub>(<i>n</i>) is defined as O<i>X</i>(<i>n</i>&minus;1)–C<i>X</i>(<i>n</i>&minus;1)–C<i>X</i>&minus;1(<i>n</i>&minus;1)–C<i>X</i>&minus;2(<i>n</i>&minus;1), where <i>X</i> is the position of the glycosidic linkage. (This only applies to sugars with exocyclic connectivities.).  The connection in Figure 3 has an exocyclic carbon, but the other potential connection points do not - so only phi and psi would available as bacbone torsion angles for those connection points. \n",
    "    </li><li>ν<sub>1</sub> – ν<sub><i>n</i></sub> — The internal ring torsion angles, where <i>n</i> is the number of atoms in the ring. ν<sub>1</sub> defines the torsion across bond C1–C2, <i>etc</i>.</li><li>χ<sub>1</sub> – χ<sub><i>n</i></sub> — The side-chain torsion angles, where <i>n</i> is the number of carbons in the sugar residue. The angle is defined by the carbon numbered one less than the glycosidic linkage position, the two atoms across the bond, and the polar hydrogen. The cyclic ring counts as carbon 0. For an aldopyranose, χ<sub>1</sub> is thus defined by O5–C1–O1–HO1, and χ<sub>2</sub> is defined by C1–C2–O2–HO2. χ<sub>5</sub> is defined by C4–C5–C6–O6, because it rotates the exocyclic carbon rather than twists the ring. χ<sub>6</sub> \n",
    "is defined by C5–C6–O6–HO6.</li></ul>\n",
    "\n",
    "Take special note of how φ, ψ, and ω are defined <em>in the reverse order</em> as the angles of the same names for amino acid residues!\n",
    "\n",
    "The `chi()` method of `Pose` works with sugar residues in the same way that it works with amino acid residues, where the first argument is the χ subscript and the second is the residue number of the `Pose`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "-60.49356158672178"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "galactose.chi(1, 1)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "-180.0"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "galactose.chi(2, 1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "180.0"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "galactose.chi(3, 1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "-59.999999999999986"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "galactose.chi(4, 1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "-59.999999999999986"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "galactose.chi(5, 1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "180.0"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "galactose.chi(6, 1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Likewise, we can use `set_chi()` to change these torsion angles and observe the changes in \n",
    "PyMOL, setting the option to keep history to true."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pyrosetta.rosetta.protocols.moves import AddPyMOLObserver"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [],
   "source": [
    "observer = AddPyMOLObserver(galactose, True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [],
   "source": [
    "pm.apply(galactose)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<ul><li>Perform the following torsion angle changes to galactose using `set_chi()` and observe which torsions move in PyMOL.<ul><li>Set χ<sub>1</sub> to 120&deg;.</li><li>Set χ<sub>2</sub> to 60&deg;.</li><li>Set χ<sub>3</sub> to 60&deg;.</li><li>Set χ<sub>4</sub> to 0&deg;.</li><li>Set χ<sub>5</sub> to 60&deg;.</li><li>Set χ<sub>6</sub> to &minus;60&deg;.</li></ul></li></ul>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [],
   "source": [
    "galactose.set_chi(1, 1, 180)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {
    "nbgrader": {
     "grade": true,
     "grade_id": "cell-cb2a132d9de834fa",
     "locked": false,
     "points": 0,
     "schema_version": 3,
     "solution": true
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(1, 120)\n",
      "(2, 60)\n",
      "(3, 60)\n",
      "(4, 0)\n",
      "(5, 60)\n"
     ]
    }
   ],
   "source": [
    "## BEGIN SOLUTION\n",
    "\n",
    "for chi_angle in zip([x for x in range(1, 6)], [120, 60, 60, 0, 60, -60]):\n",
    "    print(chi_angle)\n",
    "    galactose.set_chi(chi_angle[0] , 1, chi_angle[1])\n",
    "    \n",
    "## END SOLUTION"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Creating Saccharides from a PDB file"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `phi()`, `set_phi()`, `psi()`, `set_psi()`, `omega()`, and `set_omega()` methods of `Pose` also work with sugars. However, since `pose_from_saccharide_sequence()` may create a `Pose` with angles that cause the residues to wrap around onto each other, instead, let's reload some Pose's from `.pdb` files."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[0mcore.import_pose.import_pose: \u001b[0mFile 'inputs/maltotriose.pdb' automatically determined to be of type PDB\n",
      "\u001b[0mcore.io.pdb.pdb_reader: \u001b[0mParsing 0 .pdb records with unknown format to search for Rosetta-specific comments.\n",
      "\u001b[0mcore.io.pose_from_sfr.PoseFromSFRBuilder: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m Glc1 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.\n",
      "\u001b[0mcore.io.pose_from_sfr.PoseFromSFRBuilder: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m Glc2 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.\n",
      "\u001b[0mcore.io.pose_from_sfr.PoseFromSFRBuilder: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m Glc3 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.\n",
      "\u001b[0mcore.conformation.carbohydrates.GlycanTreeSet: \u001b[0mSetting up Glycan Trees\n",
      "\u001b[0mcore.conformation.carbohydrates.GlycanTreeSet: \u001b[0mFound 1 glycan trees.\n",
      "\u001b[0mcore.import_pose.import_pose: \u001b[0mFile 'inputs/isomaltose.pdb' automatically determined to be of type PDB\n",
      "\u001b[0mcore.io.pdb.pdb_reader: \u001b[0mParsing 0 .pdb records with unknown format to search for Rosetta-specific comments.\n",
      "\u001b[0mcore.io.pose_from_sfr.PoseFromSFRBuilder: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m Glc1 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.\n",
      "\u001b[0mcore.io.pose_from_sfr.PoseFromSFRBuilder: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m Glc2 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.\n",
      "\u001b[0mcore.conformation.carbohydrates.GlycanTreeSet: \u001b[0mSetting up Glycan Trees\n",
      "\u001b[0mcore.conformation.carbohydrates.GlycanTreeSet: \u001b[0mFound 1 glycan trees.\n"
     ]
    }
   ],
   "source": [
    "maltotriose = pose_from_file('inputs/glycans/maltotriose.pdb')\n",
    "isomaltose = pose_from_file('inputs/glycans/isomaltose.pdb')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<ul><li>Now, try out the torsion angle getters and setters for the glycosydic bonds.</li></ul>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {},
   "outputs": [],
   "source": [
    "pm.apply(maltotriose)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.0"
      ]
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "maltotriose.phi(1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.0"
      ]
     },
     "execution_count": 44,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "maltotriose.psi(1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "96.93460655617179"
      ]
     },
     "execution_count": 45,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "maltotriose.phi(2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "109.94421849476633"
      ]
     },
     "execution_count": 46,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "maltotriose.psi(2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.0"
      ]
     },
     "execution_count": 47,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "maltotriose.omega(2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "103.21420435050914"
      ]
     },
     "execution_count": 48,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "maltotriose.phi(3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "118.64096726060517"
      ]
     },
     "execution_count": 49,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "maltotriose.psi(3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<p>Notice how φ<sub>1</sub> and ψ<sub>1</sub> are undefined&mdash;the first residue is not connected to anything"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {},
   "outputs": [],
   "source": [
    "observer = AddPyMOLObserver(maltotriose, True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {},
   "outputs": [],
   "source": [
    "for i in (2, 3):\n",
    "    maltotriose.set_phi(i, 180)\n",
    "    maltotriose.set_psi(i, 180)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Isomaltose** is composed of (1→6) linkages, so in this case omega torsions are defined. Get and set φ<sub>2</sub>, ψ<sub>2</sub>, ω<sub>2</sub></p> for isomaltose"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "metadata": {},
   "outputs": [],
   "source": [
    "observer = AddPyMOLObserver(isomaltose, True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 71,
   "metadata": {
    "nbgrader": {
     "grade": true,
     "grade_id": "cell-2dd10351a8438fa7",
     "locked": false,
     "points": 0,
     "schema_version": 3,
     "solution": true
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "44.32677030464958\n",
      "-170.8693381707546\n",
      "49.383018645410004\n"
     ]
    }
   ],
   "source": [
    "## BEGIN SOLUTION\n",
    "\n",
    "print(isomaltose.phi(2))\n",
    "print(isomaltose.psi(2))\n",
    "print(isomaltose.omega(2))\n",
    "\n",
    "## END SOLUTION"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<p>Any cyclic residue also stores its ν angles.</p>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "metadata": {},
   "outputs": [],
   "source": [
    "pm.apply(glucose)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {},
   "outputs": [],
   "source": [
    "Glc1 = glucose.residue(1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "59.99999999999999\n",
      "-59.99999999999999\n",
      "60.00000000000001\n",
      "-59.999999999999986\n",
      "59.99999999999999\n"
     ]
    }
   ],
   "source": [
    "for i in range(1, 6): print(Glc1.nu(i))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<p>However, we generally care more about the ring conformation of a cyclic residue&rsquo;s rings, in this case, its only  ring with index of 1. (The output values here are the ideal angles, not the actual angles, which we viewed above.)</p>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "4C1 (chair): C-P parameters (q, phi, theta): 0.55, 180, 0; nu angles (degrees): 60, -60, 60, -60, 60, -60\n"
     ]
    }
   ],
   "source": [
    "print(Glc1.ring_conformer(1))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### RingConformers\n",
    "\n",
    "<p>The output above warrants a brief explanation. First, what does `4C1` mean? Most of us likely remember learning about chair and boat conformations in Organic Chemistry. Do you recall how there are two distinct chair conformations that can interconvert between each other? The names for these specific conformations are <sup>4</sup>C<sub>1</sub> and <sup>1</sup>C<sub>4</sub>. The nomenclature is as follows: Superscripts to the left of the capital letter are above the plane of the ring if it is oriented such that its carbon atoms proceed in a clockwise direction when viewed from above. Subscripts to the right of the letter are below the plane of the ring. The letter itself is an abbreviation, where, for example, C indicates a chair conformation and B a boat conformation. In all, there are 38 different ideal ring conformations that any six-membered cycle can take.</p><p>`C-P parameters` refers to the Cremer&ndash;Pople parameters for this conformation (Cremer D, Pople JA. J Am Chem Soc. 1975;97:1354–1358.). C&ndash;P parameters are an alternative coordinate system used to refer to a ring conformation.</p><p>\n",
    "    \n",
    "Finally, a `RingConformer` in Rosetta includes the values of the ν angles. Each conformer has a unique set of angles.\n",
    "    \n",
    "`Pose::set_nu()` does not exist, because it would rip a ring apart. Instead, to change a ring conformation, we need to use the `set_ring_conformer()` method, which takes a `RingConformer` object. Most of the time, you will not need to adjust the ring conformers, but you should be aware of it. \n",
    "\n",
    "We can ask a cyclic `ResidueType` for one of its `RingConformerSet`s to give us the `RingConformer` we want. (Each `RingConformerSet` includes the list of possible idealized ring conformers that such a ring can attain as well as information about the most energetically favorable one.) Then, we can et the conformation for our residue through `Pose`. (The arguments for `set_ring_conformer()` are the `Pose`’s sequence position, ring number, and the new conformer, respectively.)</p> \n",
    "\n",
    "<ul><div class=\"figure-center\" style=\"clear:middle;float:middle;width:300px\"><img src=\"./Media/chair_figure.tif\" width=\"300px\"><medium><b>Figure 5.</b> The two chair conformations of α-d-glucopyranose. In the <sup>1</sup>C<sup>4</sup> conformation (left), all of the substituents are axial; in the <sup>4</sup>C<sup>1</sup> conformation (right), they are equatorial. <sup>4</sup>C<sup>1</sup>  is the most stable conformation for the majority of the α-d-aldohexopyranoses. In this nomenclature, a superscript means that that numbered carbon is above the ring, if the atoms are arranged in a clockwise manner from C1. A subscripted number indicates a carbon below the plane of the ring."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "metadata": {},
   "outputs": [],
   "source": [
    "ring_set = Glc1.type().ring_conformer_set(1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "metadata": {},
   "outputs": [],
   "source": [
    "conformer = ring_set.get_ideal_conformer_by_name('1C4')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 59,
   "metadata": {},
   "outputs": [],
   "source": [
    "glucose.set_ring_conformation(1, 1, conformer)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 60,
   "metadata": {},
   "outputs": [],
   "source": [
    "pm.apply(glucose)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Modified Sugars, Branched Oligosaccharides, &amp; `.pdb` File `LINK` Records"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<p>Modified sugars can also be created in Rosetta, either from sequence or from file. In the former case, simply use the proper abbreviation for the modification after the “ring form code”. For example, the abbreviation for an <i>N</i>-acetyl group is “NAc”. Note the <i>N</i>-acetyl group in the PyMOL window.</p>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[0mcore.pose: \u001b[0m by appending by jump...\n",
      "\u001b[0mcore.conformation.carbohydrates.GlycanTreeSet: \u001b[0mSetting up Glycan Trees\n",
      "\u001b[0mcore.conformation.carbohydrates.GlycanTreeSet: \u001b[0mFound 1 glycan trees.\n"
     ]
    }
   ],
   "source": [
    "LacNAc = pose_from_saccharide_sequence('b-D-Galp-(1->4)-a-D-GlcpNAc')\n",
    "pm.apply(LacNAc)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<p>Rosetta can handle branched oligosaccharides as well, but when loading from a sequence, this requires the use of brackets, which is the standard IUPAC notation. For example, here is how one would load Lewis<sup>x</sup> (Le<sup>x</sup>), a common branched glyco-epitope, into Rosetta by sequence.</p>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[0mcore.pose: \u001b[0m by appending by jump...\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0mappending residue by a chemical bond in the foldtree: 3 ->4)-alpha-L-Fucp:non-reducing_end anchor:  O3     1 root:  C1\n",
      "\u001b[0mcore.conformation.carbohydrates.GlycanTreeSet: \u001b[0mSetting up Glycan Trees\n",
      "\u001b[0mcore.conformation.carbohydrates.GlycanTreeSet: \u001b[0mFound 1 glycan trees.\n"
     ]
    }
   ],
   "source": [
    "Lex = pose_from_saccharide_sequence('b-D-Galp-(1->4)-[a-L-Fucp-(1->3)]-D-GlcpNAc')\n",
    "pm.apply(Lex)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One can also load branched carbohydrates from a `.pdb` file. These `.pdb` files must include `LINK` records, which are a standard part of the PDB format. Open the `test/data/carbohydrates/Lex.pdb` file and look bear the top to see an example `LINK` record, which looks like this:\n",
    "\n",
    "```LINK         O3  Glc A   1                 C1  Fuc B   1     1555   1555  1.5   ```\n",
    "\n",
    "It tells us that there is a covalent linkage between O3 of glucose A1 and C1 of fucose B1 with a bond length of 1.5 Å. (The `1555`s indicate symmetry and are ignored by Rosetta.)\n",
    "\n",
    "Note that if the LINK records are not in order, or HETNAM records are not in a Rosetta format, we will fail to load.  In the next tutorial we will use auto-detection to do this. For now, we know Lex.pdb will load OK."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[0mcore.import_pose.import_pose: \u001b[0mFile 'inputs/Lex.pdb' automatically determined to be of type PDB\n",
      "\u001b[0mcore.io.pdb.pdb_reader: \u001b[0mParsing 0 .pdb records with unknown format to search for Rosetta-specific comments.\n",
      "\u001b[0mcore.io.pose_from_sfr.PoseFromSFRBuilder: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m Glc1 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.\n",
      "\u001b[0mcore.io.pose_from_sfr.PoseFromSFRBuilder: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m Gal2 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.\n",
      "\u001b[0mcore.io.pose_from_sfr.PoseFromSFRBuilder: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m Fuc3 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.\n",
      "\u001b[0mcore.chemical.AtomICoor: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ->4)-alpha-L-Fucp:non-reducing_end 3.  Returning BOGUS ID instead.\n",
      "\u001b[0mcore.conformation.Residue: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing an atom: 3  H1  that depends on a nonexistent polymer connection!\n",
      "\u001b[0mcore.conformation.Residue: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m  --> generating it using idealized coordinates.\n",
      "\u001b[0mcore.chemical.AtomICoor: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ->4)-alpha-L-Fucp:non-reducing_end 3.  Returning BOGUS ID instead.\n",
      "\u001b[0mcore.conformation.carbohydrates.GlycanTreeSet: \u001b[0mSetting up Glycan Trees\n",
      "\u001b[0mcore.conformation.carbohydrates.GlycanTreeSet: \u001b[0mFound 1 glycan trees.\n"
     ]
    }
   ],
   "source": [
    "Lex = pose_from_file('inputs/glycans/Lex.pdb')\n",
    "pm.apply(Lex)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You may notice when viewing the structure in PyMOL that the hybridization of the carbonyl of the amido functionality of the <i>N</i>-acetyl group is wrong. This is because of an error in the model deposited in the PDB from which this file was generated. This is, unfortunately, a very common problem with sugar structures found in the PDB.  It is always useful to use http://www.glycosciences.de to identify any errors in the solution PDB structure before working with them in Rosetta.  The referenced paper, __Automatically Fixing Errors in Glycoprotein Structures with Rosetta__ can be used as a guide to fixing these.\n",
    "\n",
    "You may\n",
    "also have noticed that the `inputs/glycans/Lex.pdb` file indicated in its `HETNAM` records that Glc1 was actually an <i>N</i>-acetylglycosamine (GlcNAc) with the indication `2-acetylamino-2-deoxy-`. This is optional and is helpful for human-readability, but Rosetta only needs to know the base `ResidueType` of each sugar residue; specific \n",
    "`VariantType`s needed — and most sugar modifications are treated as `VariantType`s — are determined automatically from \n",
    "the atom names in the `HETATM` records for the residue. Anything after the comma is ignored.</p><ul><li>Print out the `Pose` to see how the `FoldTree` is defined."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 70,
   "metadata": {
    "nbgrader": {
     "grade": true,
     "grade_id": "cell-b1af95b61160d9be",
     "locked": false,
     "points": 0,
     "schema_version": 3,
     "solution": true
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "PDB file name: inputs/Lex.pdb\n",
      "Total residues: 3\n",
      "Sequence: ZZZ\n",
      "Fold tree:\n",
      "FOLD_TREE  EDGE 1 2 -1  EDGE 1 3 -2  O3   C1  \n"
     ]
    }
   ],
   "source": [
    "## BEGIN SOLUTION\n",
    "print(Lex)\n",
    "## END SOLUTION"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note the `CHEMICAL` `Edge` (`-2`). This is Rosetta’s way of indicating a branch backbone connection. Unlike a standard \n",
    "`POLYMER` `Edge` (`-1`), this one tells you which atoms are involved.<p><ul><li>Print out the sequence of each chain."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<ul><li>Now print out information about each residue in the Pose to see which VariantTypes and ResiduePropertys are assigned to each.</li></ul>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<ul><li>What are the three `VariantType`s of residue 1?</li><li>Output the various torison angles and make sure that you understand to which angles they correspond.</li></ul>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<p>Can you see now why φ and ψ are defined the way they are? If they were defined as in AA residues, they would not have unique definitions, since GlcNAc is a branch point. A monosaccharide can have multiple children, but it can never have more than a single parent.</p><p>Note that for this oligosaccharide χ<sub>3</sub>(1) is equivalent to ψ(3) and χ<sub>4</sub>(1) is equivalent to ψ(2). Make sure that you understand why!</p>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 75,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(-97.03727535363538, -97.03727535363538)"
      ]
     },
     "execution_count": 75,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "Lex.chi(3, 1), Lex.psi(3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 76,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(135.6468768989725, 135.6468768989725)"
      ]
     },
     "execution_count": 76,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "Lex.chi(4, 1), Lex.psi(2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<p>For chemically modified sugars, χ angles are redefined at the positions where substitution has occurred. For new χs that have come into existence from the addition of new atoms and bonds, new definitions are added to new indices. For example, for Glc<i>N</i><sup>2</sup>Ac residue 1, χ<sub>C2–N2–C′–Cα′</sub> is accessed through `chi(7, 1)`.</p>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 77,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "-230.8915297047683"
      ]
     },
     "execution_count": 77,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "Lex.chi(2, 1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 78,
   "metadata": {},
   "outputs": [],
   "source": [
    "Lex.set_chi(2, 1, 180)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 79,
   "metadata": {},
   "outputs": [],
   "source": [
    "pm.apply(Lex)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 80,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "179.81012671885887"
      ]
     },
     "execution_count": 80,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "Lex.chi(7, 1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 81,
   "metadata": {},
   "outputs": [],
   "source": [
    "Lex.set_chi(7, 1, 0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 82,
   "metadata": {},
   "outputs": [],
   "source": [
    "pm.apply(Lex)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<ul><li>Play around with getting and setting the various torsion angles for Le<sup>x</sup></li></ul>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## <i>N</i>- and <i>O</i>-Linked Glycans"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<p>Branching does not have to occur at sugars; a glycan can be attached to the nitrogen of an ASN or the oxygen of a SER or THR. <i>N</i>-linked glycans themselves tend to be branched structures.</p>  \n",
    "\n",
    "We will cover more on linked glycan trees in the next tutorial through the `GlycanTreeSet` object - which is always present in a pose that has carbohydrates."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[0mcore.import_pose.import_pose: \u001b[0mFile 'inputs/N-linked_14-mer_glycan.pdb' automatically determined to be of type PDB\n",
      "\u001b[0mcore.io.pdb.pdb_reader: \u001b[0mParsing 0 .pdb records with unknown format to search for Rosetta-specific comments.\n",
      "\u001b[0mcore.io.pose_from_sfr.PoseFromSFRBuilder: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m Glc6 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.\n",
      "\u001b[0mcore.io.pose_from_sfr.PoseFromSFRBuilder: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m Glc7 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.\n",
      "\u001b[0mcore.io.pose_from_sfr.PoseFromSFRBuilder: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m Man8 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.\n",
      "\u001b[0mcore.io.pose_from_sfr.PoseFromSFRBuilder: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m Man9 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.\n",
      "\u001b[0mcore.io.pose_from_sfr.PoseFromSFRBuilder: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m Man10 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.\n",
      "\u001b[0mcore.io.pose_from_sfr.PoseFromSFRBuilder: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m Man11 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.\n",
      "\u001b[0mcore.io.pose_from_sfr.PoseFromSFRBuilder: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m Glc12 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.\n",
      "\u001b[0mcore.io.pose_from_sfr.PoseFromSFRBuilder: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m Glc13 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.\n",
      "\u001b[0mcore.io.pose_from_sfr.PoseFromSFRBuilder: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m Glc14 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.\n",
      "\u001b[0mcore.io.pose_from_sfr.PoseFromSFRBuilder: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m Man15 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.\n",
      "\u001b[0mcore.io.pose_from_sfr.PoseFromSFRBuilder: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m Man16 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.\n",
      "\u001b[0mcore.io.pose_from_sfr.PoseFromSFRBuilder: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m Man17 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.\n",
      "\u001b[0mcore.io.pose_from_sfr.PoseFromSFRBuilder: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m Man18 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.\n",
      "\u001b[0mcore.io.pose_from_sfr.PoseFromSFRBuilder: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m Man19 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.\n",
      "\u001b[0mcore.chemical.AtomICoor: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ->4)-beta-D-Glcp:2-AcNH 6.  Returning BOGUS ID instead.\n",
      "\u001b[0mcore.conformation.Residue: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing an atom: 6  H1  that depends on a nonexistent polymer connection!\n",
      "\u001b[0mcore.conformation.Residue: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m  --> generating it using idealized coordinates.\n",
      "\u001b[0mcore.chemical.AtomICoor: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ->3)-alpha-D-Manp:->6)-branch 15.  Returning BOGUS ID instead.\n",
      "\u001b[0mcore.conformation.Residue: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing an atom: 15  H1  that depends on a nonexistent polymer connection!\n",
      "\u001b[0mcore.conformation.Residue: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m  --> generating it using idealized coordinates.\n",
      "\u001b[0mcore.chemical.AtomICoor: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ->2)-alpha-D-Manp 18.  Returning BOGUS ID instead.\n",
      "\u001b[0mcore.conformation.Residue: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing an atom: 18  H1  that depends on a nonexistent polymer connection!\n",
      "\u001b[0mcore.conformation.Residue: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m  --> generating it using idealized coordinates.\n",
      "\u001b[0mcore.chemical.AtomICoor: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ->4)-beta-D-Glcp:2-AcNH 6.  Returning BOGUS ID instead.\n",
      "\u001b[0mcore.chemical.AtomICoor: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ->3)-alpha-D-Manp:->6)-branch 15.  Returning BOGUS ID instead.\n",
      "\u001b[0mcore.chemical.AtomICoor: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ->2)-alpha-D-Manp 18.  Returning BOGUS ID instead.\n",
      "\u001b[0mcore.conformation.carbohydrates.GlycanTreeSet: \u001b[0mSetting up Glycan Trees\n",
      "\u001b[0mcore.conformation.carbohydrates.GlycanTreeSet: \u001b[0mFound 1 glycan trees.\n",
      "PDB file name: inputs/N-linked_14-mer_glycan.pdb\n",
      "Total residues: 19\n",
      "Sequence: ANASAZZZZZZZZZZZZZZ\n",
      "Fold tree:\n",
      "FOLD_TREE  EDGE 1 5 -1  EDGE 2 6 -2  ND2  C1   EDGE 6 14 -1  EDGE 8 15 -2  O6   C1   EDGE 15 17 -1  EDGE 15 18 -2  O6   C1   EDGE 18 19 -1 \n"
     ]
    }
   ],
   "source": [
    "N_linked = pose_from_file('inputs/glycans/N-linked_14-mer_glycan.pdb')\n",
    "pm.apply(N_linked)\n",
    "print(N_linked)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "ANASA\n",
      "alpha-D-Glcp-(1->3)-alpha-D-Glcp-(1->3)-alpha-D-Glcp-(1->3)-alpha-D-Manp-(1->2)-alpha-D-Manp-(1->2)-alpha-D-Manp-(1->3)-beta-D-Manp-(1->4)-beta-D-GlcpNAc-(1->4)-beta-D-GlcpNAc-\n",
      "alpha-D-Manp-(1->2)-alpha-D-Manp-(1->3)-alpha-D-Manp-\n",
      "alpha-D-Manp-(1->2)-alpha-D-Manp-\n"
     ]
    }
   ],
   "source": [
    "for i in range(4): print(N_linked.chain_sequence(i + 1))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<ul><li>Which residue number is glycosylated above?</li></ul>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[0mcore.import_pose.import_pose: \u001b[0mFile 'inputs/O_glycan.pdb' automatically determined to be of type PDB\n",
      "\u001b[0mcore.io.pdb.pdb_reader: \u001b[0mParsing 0 .pdb records with unknown format to search for Rosetta-specific comments.\n",
      "\u001b[0mcore.io.pose_from_sfr.PoseFromSFRBuilder: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m Glc4 has an unfavorable ring conformation; the coordinates for this input structure may have been poorly assigned.\n",
      "\u001b[0mcore.chemical.AtomICoor: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ->4)-alpha-D-Glcp:non-reducing_end 4.  Returning BOGUS ID instead.\n",
      "\u001b[0mcore.conformation.Residue: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing an atom: 4  H1  that depends on a nonexistent polymer connection!\n",
      "\u001b[0mcore.conformation.Residue: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m  --> generating it using idealized coordinates.\n",
      "\u001b[0mcore.chemical.AtomICoor: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m IcoorAtomID::atom_id(): Cannot get atom_id for POLYMER_LOWER of residue ->4)-alpha-D-Glcp:non-reducing_end 4.  Returning BOGUS ID instead.\n",
      "\u001b[0mcore.conformation.carbohydrates.GlycanTreeSet: \u001b[0mSetting up Glycan Trees\n",
      "\u001b[0mcore.conformation.carbohydrates.GlycanTreeSet: \u001b[0mFound 1 glycan trees.\n"
     ]
    }
   ],
   "source": [
    "O_linked = pose_from_file('inputs/glycans/O_glycan.pdb')\n",
    "pm.apply(O_linked)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<ul><li>Print `O-linked` and the sequence of each of its chains.</li></ul>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`set_phi()` and `set_psi()` still work when a glycan is linked to a peptide. (Below, we use `pdb_info()` to give help us select the residue that we want. In this case, in the `.pdb` file, the glycan is chain B.)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [],
   "source": [
    "N_linked.set_phi(N_linked.pdb_info().pdb2pose(\"B\", 1), 180)\n",
    "pm.apply(N_linked)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<ul><li>Set ψ(B1) to 0&deg; and ω(B1) to 90&deg; and view the results in PyMOL.</li></ul>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<p>Notice that in this case ψ and ω affect the side-chain torsions (χs) of the asparagine residue. This is another case where there are multiple ways of both naming and accessing the same specific torsion angles.</p><p>One can also create conjugated glycans from sequences if performed in steps, first creating the peptide portion by loading from a `.pdb`\n",
    "file or from sequence and then using the `glycosylate_pose()` function, (which needs to be imported first.) For example,  to glycosylate an ASA peptide with a single glucose at position 2 of the peptide, we perform the following:</p>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Glycosylation by function\n",
    "\n",
    "Here, we will glycosylate a simple peptide using the function, `glycosylate_pose`.  In the next tutorial, we will use a Mover interface to this function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [],
   "source": [
    "peptide = pose_from_sequence('ASA')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [],
   "source": [
    "pm.apply(peptide)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pyrosetta.rosetta.core.pose.carbohydrates import glycosylate_pose, glycosylate_pose_by_file"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[0mcore.conformation.Conformation: \u001b[0mappending residue by a chemical bond in the foldtree: 5 ->4)-alpha-D-Glcp:non-reducing_end anchor:  OG     2 root:  C1\n",
      "\u001b[0mcore.pose.carbohydrates.util: \u001b[0mGlycosylated pose with a(n) Glcp-OGSER2 bond.\n",
      "\u001b[0mcore.pose.carbohydrates.util: \u001b[0mIdealizing glycosidic torsions.\n"
     ]
    }
   ],
   "source": [
    "glycosylate_pose(peptide, 2, 'Glcp')\n",
    "pm.apply(peptide)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here, we uset the main function to glycosylate a pose.  In the next tutorial, we will use a Mover interface to do so."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<p>It is also possible to glycosylate a pose with common glycans found in the database. These files end in the `.iupac`\n",
    "extension and are simply IUPAC sequences just as we have been using throughout this chapter.</p>\n",
    "\n",
    "Here is a list of some common iupacs.\n",
    "\n",
    "```\n",
    "bisected_fucosylated_N-glycan_core.iupac\n",
    "bisected_N-glycan_core.iupac\n",
    "common_names.txt\n",
    "core_1_O-glycan.iupac\n",
    "core_2_O-glycan.iupac\n",
    "core_3_O-glycan.iupac\n",
    "core_4_O-glycan.iupac\n",
    "core_5_O-glycan.iupac\n",
    "core_6_O-glycan.iupac\n",
    "core_7_O-glycan.iupac\n",
    "core_8_O-glycan.iupac\n",
    "fucosylated_N-glycan_core.iupac\n",
    "high-mannose_N-glycan_core.iupac\n",
    "hybrid_bisected_fucosylated_N-glycan_core.iupac\n",
    "hybrid_bisected_N-glycan_core.iupac\n",
    "hybrid_fucosylated_N-glycan_core.iupac\n",
    "hybrid_N-glycan_core.iupac\n",
    "man5.iupac\n",
    "man9.iupac\n",
    "N-glycan_core.iupac\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {},
   "outputs": [],
   "source": [
    "peptide = pose_from_sequence('ASA'); pm.apply(peptide)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[0mcore.conformation.Conformation: \u001b[0mappending residue by a chemical bond in the foldtree: 6 ->3)-alpha-D-Galp:2-AcNH anchor:  OG     2 root:  C1\n",
      "\u001b[0mcore.pose.carbohydrates.util: \u001b[0mGlycosylated pose with a(n) a-D-GalpNAc-(1->3)-a-D-GalpNAc--OGSER2 bond.\n",
      "\u001b[0mcore.pose.carbohydrates.util: \u001b[0mIdealizing glycosidic torsions.\n"
     ]
    }
   ],
   "source": [
    "glycosylate_pose_by_file(peptide, 2, 'core_5_O-glycan')\n",
    "pm.apply(peptide)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Conclusion\n",
    "\n",
    "You now have a grasp on the basics of RosettaCarbohydrates.  Please continue onto the next tutorial for more on glycan residue selection and various movers that can be of use when working with glycans. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Chapter contributors:**\n",
    "\n",
    "- Jared Adolf-Bryfogle (Scripps; Institute for Protein Innovation)\n",
    "- Jason Labonte (Jons Hopkins; Franklin and Marshall College)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<!--NAVIGATION-->\n",
    "< [RosettaAntibodyDesign](http://nbviewer.jupyter.org/github/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/12.02-RosettaAntibodyDesign-RAbD.ipynb) | [Contents](toc.ipynb) | [Index](index.ipynb) | [RosettaCarbohydrates: Trees, Selectors and Movers](http://nbviewer.jupyter.org/github/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/13.01-Glycan-Trees-Selectors-and-Movers.ipynb) ><p><a href=\"https://colab.research.google.com/github/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/13.00-RosettaCarbohydrates-Working-with-Glycans.ipynb\"><img align=\"left\" src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open in Colab\" title=\"Open in Google Colaboratory\"></a>"
   ]
  }
 ],
 "metadata": {
  "celltoolbar": "Create Assignment",
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.1"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": true,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}