{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<!--NOTEBOOK_HEADER-->\n",
    "*This notebook contains material from [PyRosetta](https://RosettaCommons.github.io/PyRosetta.notebooks);\n",
    "content is available [on Github](https://github.com/RosettaCommons/PyRosetta.notebooks.git).*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<!--NAVIGATION-->\n",
    "< [Working With Antibodies](http://nbviewer.jupyter.org/github/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/12.00-Working-With-Antibodies.ipynb) | [Contents](toc.ipynb) | [Index](index.ipynb) | [RosettaAntibodyDesign](http://nbviewer.jupyter.org/github/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/12.02-RosettaAntibodyDesign-RAbD.ipynb) ><p><a href=\"https://colab.research.google.com/github/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/12.01-RosettaAntibody-Framework-and-SimpleMetrics.ipynb\"><img align=\"left\" src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open in Colab\" title=\"Open in Google Colaboratory\"></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# RosettaAntibody Framework\n",
    "Keywords: CDRResidueSelector\n",
    "\n",
    "## Overview\n",
    "In this workshop we will learn how to use the RosettaAntibody framework.  The full RosettaAntibody (modeling) code is not available in PyRosetta, unfortunately - as it is based around an application. To use that, you will have to use either the ROSIE server, or the Rosetta application. \n",
    "\n",
    "For a full overview of the RosettaAntibody modeling application, see this paper: \n",
    "https://www.ncbi.nlm.nih.gov/pubmed/28125104\n",
    "\n",
    "Snugdock, and H3 modeling component of RosettaAntibody are available here as movers. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install pyrosettacolabsetup\n",
    "import pyrosettacolabsetup; pyrosettacolabsetup.install_pyrosetta()\n",
    "import pyrosetta; pyrosetta.init()\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Make sure you are in the directory with the pdb files:**\n",
    "\n",
    "`cd google_drive/MyDrive/student-notebooks/`"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Imports\n",
    "\n",
    "Lets import the antibody namespace so we can start using it.  Take a look at the different modules that are a part of the antibody module.\n",
    "\n",
    "Note that we can also do `from rosetta.protocols.antibody import *` in order to make accessing the enums much easier.  For the purpose of this workshop, we will use `antibody` to traverse the contents.  This makes it easier for you to use tab completion for exploration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "#Python\n",
    "from pyrosetta import *\n",
    "from pyrosetta.rosetta import *\n",
    "from pyrosetta.teaching import *\n",
    "\n",
    "#Core Includes\n",
    "from rosetta.core.select import residue_selector as selections\n",
    "\n",
    "from rosetta.protocols import antibody\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Intitlialization \n",
    "\n",
    "Here, we will initialize a typical run of Rosetta. We could use the `-input_ab_scheme` option with `AHo_Scheme`, but we will learn to instead pass this to our main antibody framework code. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "PyRosetta-4 2019 [Rosetta PyRosetta4.Release.python36.mac 2019.33+release.1e60c63beb532fd475f0f704d68d462b8af2a977 2019-08-09T15:19:57] retrieved from: http://www.pyrosetta.org\n",
      "(C) Copyright Rosetta Commons Member Institutions. Created in JHU by Sergey Lyskov and PyRosetta Team.\n",
      "\u001b[0mcore.init: \u001b[0mRosetta version: PyRosetta4.Release.python36.mac r230 2019.33+release.1e60c63beb5 1e60c63beb532fd475f0f704d68d462b8af2a977 http://www.pyrosetta.org 2019-08-09T15:19:57\n",
      "\u001b[0mcore.init: \u001b[0mcommand: PyRosetta -use_input_sc -ignore_unrecognized_res -ignore_zero_occupancy false -load_PDB_components false -no_fconfig -database /Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pyrosetta-2019.33+release.1e60c63beb5-py3.6-macosx-10.6-intel.egg/pyrosetta/database\n",
      "\u001b[0mbasic.random.init_random_generator: \u001b[0m'RNG device' seed mode, using '/dev/urandom', seed=967592561 seed_offset=0 real_seed=967592561\n",
      "\u001b[0mbasic.random.init_random_generator: \u001b[0mRandomGenerator:init: Normal mode, seed=967592561 RG_type=mt19937\n"
     ]
    }
   ],
   "source": [
    "init('-use_input_sc -ignore_unrecognized_res \\\n",
    "     -ignore_zero_occupancy false -load_PDB_components false -no_fconfig')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Import and copy pose\n",
    "\n",
    "Let's load an antibody - this this the same antibody we used to learn packing and design. :)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[0mcore.import_pose.import_pose: \u001b[0mFile 'inputs/2r0l_1_1.pdb' automatically determined to be of type PDB\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom:  OXT on residue ARG:CtermProteinFull 108\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom:  OXT on residue SER:CtermProteinFull 225\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom:  OXT on residue ARG:CtermProteinFull 464\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0mFound disulfide between residues 23 88\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 23 CYS\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 88 CYS\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 23 CYD\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 88 CYD\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0mFound disulfide between residues 130 204\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 130 CYS\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 204 CYS\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 130 CYD\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 204 CYD\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0mFound disulfide between residues 250 266\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 250 CYS\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 266 CYS\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 250 CYD\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 266 CYD\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0mFound disulfide between residues 258 328\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 258 CYS\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 328 CYS\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 258 CYD\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 328 CYD\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0mFound disulfide between residues 353 422\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 353 CYS\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 422 CYS\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 353 CYD\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 422 CYD\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0mFound disulfide between residues 385 401\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 385 CYS\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 401 CYS\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 385 CYD\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 401 CYD\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0mFound disulfide between residues 412 440\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 412 CYS\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 440 CYS\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 412 CYD\n",
      "\u001b[0mcore.conformation.Conformation: \u001b[0mcurrent variant for 440 CYD\n"
     ]
    }
   ],
   "source": [
    "#Import a pose\n",
    "pose = pose_from_pdb(\"inputs/2r0l_1_1.pdb\")\n",
    "original_pose = pose.clone()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## AntibodyInfo\n",
    "\n",
    "The main tool that we will use is the `AntibodyInfo` object.  This allows you to get a TON of information about the antibody to use in various custom protocols.  \n",
    "\n",
    "Note that this antibody has already been renumbered using the PyIgClassify server.\n",
    "\n",
    "Since we are not defining the numbering scheme and cdr definition during init, we will need to pass an Enum to the AntibodyInfo object."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[0mbasic.io.database: \u001b[0mDatabase file opened: sampling/antibodies/cluster_center_dihedrals.txt\n",
      "\u001b[0mprotocols.antibody.AntibodyNumberingParser: \u001b[0mAntibody numbering scheme definitions read successfully\n",
      "\u001b[0mprotocols.antibody.AntibodyNumberingParser: \u001b[0mAntibody CDR definition read successfully\n",
      "\u001b[0mantibody.AntibodyInfo: \u001b[0mSuccessfully finished the CDR definition\n",
      "\u001b[0mantibody.AntibodyInfo: \u001b[0mAC Detecting Regular CDR H3 Stem Type\n",
      "\u001b[0mantibody.AntibodyInfo: \u001b[0mARFWWRSFDYW\n",
      "\u001b[0mantibody.AntibodyInfo: \u001b[0mAC Finished Detecting Regular CDR H3 Stem Type: KINKED\n",
      "\u001b[0mantibody.AntibodyInfo: \u001b[0mAC Finished Detecting Regular CDR H3 Stem Type: Kink: 1 Extended: 0\n",
      "\u001b[0mantibody.AntibodyInfo: \u001b[0mSetting up CDR Cluster for H1\n",
      "\u001b[0mprotocols.antibody.cluster.CDRClusterMatcher: \u001b[0mLength: 13 Omega: TTTTTTTTTTTTT\n",
      "\u001b[0mantibody.AntibodyInfo: \u001b[0mSetting up CDR Cluster for H2\n",
      "\u001b[0mprotocols.antibody.cluster.CDRClusterMatcher: \u001b[0mLength: 10 Omega: TTTTTTTTTT\n",
      "\u001b[0mantibody.AntibodyInfo: \u001b[0mSetting up CDR Cluster for H3\n",
      "\u001b[0mprotocols.antibody.cluster.CDRClusterMatcher: \u001b[0mLength: 10 Omega: TTTTTTTTTT\n",
      "\u001b[0mantibody.AntibodyInfo: \u001b[0mSetting up CDR Cluster for L1\n",
      "\u001b[0mprotocols.antibody.cluster.CDRClusterMatcher: \u001b[0mLength: 11 Omega: TTTTTTTTTTT\n",
      "\u001b[0mantibody.AntibodyInfo: \u001b[0mSetting up CDR Cluster for L2\n",
      "\u001b[0mprotocols.antibody.cluster.CDRClusterMatcher: \u001b[0mLength: 8 Omega: TTTTTTTT\n",
      "\u001b[0mantibody.AntibodyInfo: \u001b[0mSetting up CDR Cluster for L3\n",
      "\u001b[0mprotocols.antibody.cluster.CDRClusterMatcher: \u001b[0mLength: 9 Omega: TTTTTTCTT\n"
     ]
    }
   ],
   "source": [
    "ab_info = antibody.AntibodyInfo(pose, antibody.AHO_Scheme, antibody.North)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Lets take a look at what AntibodyInfo prints"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "////////////////////////////////////////////////////////////////////////////////\n",
      "///                          Rosetta Antibody Info                           ///\n",
      "///                                                                          ///\n",
      "///             Antibody Type:  Regular Antibody\n",
      "///             Light Chain Type:  unknown\n",
      "/// Predict H3 Cterminus Base:  KINKED\n",
      "///                                                                          \n",
      "/// H1 info: \n",
      "///            length:  13\n",
      "///          sequence:  AASGFTISNSGIH\n",
      "///     north_cluster:  H1-13-1\n",
      "///         loop_info:  LOOP start: 131  stop: 143  cut: 137  size: 13  skip rate: 0  extended?: False\n",
      "\n",
      "/// H2 info: \n",
      "///            length:  10\n",
      "///          sequence:  WIYPTGGATD\n",
      "///     north_cluster:  H2-10-1\n",
      "///         loop_info:  LOOP start: 158  stop: 167  cut: 163  size: 10  skip rate: 0  extended?: False\n",
      "\n",
      "/// H3 info: \n",
      "///            length:  10\n",
      "///          sequence:  ARFWWRSFDY\n",
      "///     north_cluster:  H3-10-1\n",
      "///         loop_info:  LOOP start: 205  stop: 214  cut: 206  size: 10  skip rate: 0  extended?: False\n",
      "\n",
      "/// L1 info: \n",
      "///            length:  11\n",
      "///          sequence:  RASQDVSTAVA\n",
      "///     north_cluster:  L1-11-1\n",
      "///         loop_info:  LOOP start: 24  stop: 34  cut: 29  size: 11  skip rate: 0  extended?: False\n",
      "\n",
      "/// L2 info: \n",
      "///            length:  8\n",
      "///          sequence:  YSASFLYS\n",
      "///     north_cluster:  L2-8-1\n",
      "///         loop_info:  LOOP start: 49  stop: 56  cut: 53  size: 8  skip rate: 0  extended?: False\n",
      "\n",
      "/// L3 info: \n",
      "///            length:  9\n",
      "///          sequence:  QQSYTTPPT\n",
      "///     north_cluster:  L3-9-cis7-1\n",
      "///         loop_info:  LOOP start: 89  stop: 97  cut: 93  size: 9  skip rate: 0  extended?: False\n",
      "\n",
      "////////////////////////////////////////////////////////////////////////////////\n",
      "\n"
     ]
    }
   ],
   "source": [
    "print(ab_info)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Isn't that AWESOME!!**  I think so.  But I wrote a lot of that code!  \n",
    "\n",
    "Anyway, as you can see you can get a pretty fair bit of information out of the AntibodyInfo object.  In fact, most antibody-related code actually takes an AntibodyInfo object or constructs one from set numbering scheme, cdr definitions, and pose passed to it.  You will see this as we go.  \n",
    "\n",
    "Note the north_cluster here.  This is useful in some modeling tasks, but becomes much more relevant during antibody design.  More information on what we mean by north_cluster can be found in this paper, if you want to read ahead a bit. https://www.ncbi.nlm.nih.gov/pubmed/21035459"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Basic AntibodyInfo Access\n",
    "Now, lets use the AntibodyInfo class to get a bit of useful information out of our antibody."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "h1 131\n",
      "h2 167\n"
     ]
    }
   ],
   "source": [
    "print(\"h1\", ab_info.get_CDR_start(antibody.h1, pose))\n",
    "print(\"h2\", ab_info.get_CDR_end(antibody.h2, pose))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now lets use these enums a bit more.  They go in order from 1 to 8, with 7 and 8 being CDR4 loops - also known as H3 loops.  We won't worry about them just yet.  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1 H1\n",
      "2 H2\n",
      "3 H3\n",
      "4 L1\n",
      "5 L2\n",
      "6 L3\n",
      "L1 CDRNameEnum.l1\n",
      "l1 CDRNameEnum.l1\n",
      "L2 CDRNameEnum.l2\n",
      "l2 CDRNameEnum.l2\n",
      "L3 CDRNameEnum.l3\n",
      "H1 CDRNameEnum.h1\n",
      "H2 CDRNameEnum.h2\n",
      "H3 CDRNameEnum.h3\n",
      "CDRNameEnum.h3\n",
      "3\n"
     ]
    }
   ],
   "source": [
    "for i in range(1, 7):\n",
    "    print(i, ab_info.get_CDR_name(antibody.CDRNameEnum(i)))\n",
    "    \n",
    "for cdr in ['L1', 'l1', 'L2', 'l2', 'L3', 'H1', 'H2', 'H3']:\n",
    "    print(cdr, str(ab_info.get_CDR_name_enum(cdr)))\n",
    "          \n",
    "print(str(antibody.h3))\n",
    "print(int(antibody.h3))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Does this make enums a bit less confusing?  These are named integers.  The last function allows us to print either the actual cdr name enum or the integer from it.  The cool thing here is that we can loop through all of the CDRs just by using a range 1-6 and rosetta will understand it.  \n",
    "\n",
    "Note that we convert the integer into a `CDRNameEnum` in the function.  If we are storing the cdr name enums as indexes to a dictionary or list, we don't need this.  That is simply for the C++ code to work properly. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### AntibodyEnumManager\n",
    "So we have seen that some of this code we can do directly within AntibodyInfo itself.  Cool. But what if we need something more advanced?  Lets use the class that actually does all this conversion.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "AHO_Scheme\n",
      "North\n",
      "CDRNameEnum.h1\n",
      "framework_region\n"
     ]
    }
   ],
   "source": [
    "enum_manager = antibody.AntibodyEnumManager()\n",
    "print(enum_manager.numbering_scheme_enum_to_string(antibody.AHO_Scheme))\n",
    "print(enum_manager.cdr_definition_enum_to_string(antibody.North))\n",
    "print(enum_manager.cdr_name_string_to_enum(\"H1\"))\n",
    "print(enum_manager.antibody_region_enum_to_string(antibody.framework_region))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Use the function, `get_region_or_residue` and `get_CDRNameEnum_of_residue` and the manager to traverse the antibody and get relevant regions of all residues in the pose"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "nbgrader": {
     "grade": true,
     "grade_id": "cell-bb8eb8b9a8a8e905",
     "locked": false,
     "points": 0,
     "schema_version": 3,
     "solution": true
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1 framework_region\n",
      "2 framework_region\n",
      "3 framework_region\n",
      "4 framework_region\n",
      "5 framework_region\n",
      "6 framework_region\n",
      "7 framework_region\n",
      "8 framework_region\n",
      "9 framework_region\n",
      "10 framework_region\n",
      "11 framework_region\n",
      "12 framework_region\n",
      "13 framework_region\n",
      "14 framework_region\n",
      "15 framework_region\n",
      "16 framework_region\n",
      "17 framework_region\n",
      "18 framework_region\n",
      "19 framework_region\n",
      "20 framework_region\n",
      "21 framework_region\n",
      "22 framework_region\n",
      "23 framework_region\n",
      "24 L1\n",
      "25 L1\n",
      "26 L1\n",
      "27 L1\n",
      "28 L1\n",
      "29 L1\n",
      "30 L1\n",
      "31 L1\n",
      "32 L1\n",
      "33 L1\n",
      "34 L1\n",
      "35 framework_region\n",
      "36 framework_region\n",
      "37 framework_region\n",
      "38 framework_region\n",
      "39 framework_region\n",
      "40 framework_region\n",
      "41 framework_region\n",
      "42 framework_region\n",
      "43 framework_region\n",
      "44 framework_region\n",
      "45 framework_region\n",
      "46 framework_region\n",
      "47 framework_region\n",
      "48 framework_region\n",
      "49 L2\n",
      "50 L2\n",
      "51 L2\n",
      "52 L2\n",
      "53 L2\n",
      "54 L2\n",
      "55 L2\n",
      "56 L2\n",
      "57 framework_region\n",
      "58 framework_region\n",
      "59 framework_region\n",
      "60 framework_region\n",
      "61 framework_region\n",
      "62 framework_region\n",
      "63 framework_region\n",
      "64 framework_region\n",
      "65 framework_region\n",
      "66 framework_region\n",
      "67 framework_region\n",
      "68 framework_region\n",
      "69 framework_region\n",
      "70 framework_region\n",
      "71 framework_region\n",
      "72 framework_region\n",
      "73 framework_region\n",
      "74 framework_region\n",
      "75 framework_region\n",
      "76 framework_region\n",
      "77 framework_region\n",
      "78 framework_region\n",
      "79 framework_region\n",
      "80 framework_region\n",
      "81 framework_region\n",
      "82 framework_region\n",
      "83 framework_region\n",
      "84 framework_region\n",
      "85 framework_region\n",
      "86 framework_region\n",
      "87 framework_region\n",
      "88 framework_region\n",
      "89 L3\n",
      "90 L3\n",
      "91 L3\n",
      "92 L3\n",
      "93 L3\n",
      "94 L3\n",
      "95 L3\n",
      "96 L3\n",
      "97 L3\n",
      "98 framework_region\n",
      "99 framework_region\n",
      "100 framework_region\n",
      "101 framework_region\n",
      "102 framework_region\n",
      "103 framework_region\n",
      "104 framework_region\n",
      "105 framework_region\n",
      "106 framework_region\n",
      "107 framework_region\n",
      "108 framework_region\n",
      "109 framework_region\n",
      "110 framework_region\n",
      "111 framework_region\n",
      "112 framework_region\n",
      "113 framework_region\n",
      "114 framework_region\n",
      "115 framework_region\n",
      "116 framework_region\n",
      "117 framework_region\n",
      "118 framework_region\n",
      "119 framework_region\n",
      "120 framework_region\n",
      "121 framework_region\n",
      "122 framework_region\n",
      "123 framework_region\n",
      "124 framework_region\n",
      "125 framework_region\n",
      "126 framework_region\n",
      "127 framework_region\n",
      "128 framework_region\n",
      "129 framework_region\n",
      "130 framework_region\n",
      "131 H1\n",
      "132 H1\n",
      "133 H1\n",
      "134 H1\n",
      "135 H1\n",
      "136 H1\n",
      "137 H1\n",
      "138 H1\n",
      "139 H1\n",
      "140 H1\n",
      "141 H1\n",
      "142 H1\n",
      "143 H1\n",
      "144 framework_region\n",
      "145 framework_region\n",
      "146 framework_region\n",
      "147 framework_region\n",
      "148 framework_region\n",
      "149 framework_region\n",
      "150 framework_region\n",
      "151 framework_region\n",
      "152 framework_region\n",
      "153 framework_region\n",
      "154 framework_region\n",
      "155 framework_region\n",
      "156 framework_region\n",
      "157 framework_region\n",
      "158 H2\n",
      "159 H2\n",
      "160 H2\n",
      "161 H2\n",
      "162 H2\n",
      "163 H2\n",
      "164 H2\n",
      "165 H2\n",
      "166 H2\n",
      "167 H2\n",
      "168 framework_region\n",
      "169 framework_region\n",
      "170 framework_region\n",
      "171 framework_region\n",
      "172 framework_region\n",
      "173 framework_region\n",
      "174 framework_region\n",
      "175 framework_region\n",
      "176 framework_region\n",
      "177 framework_region\n",
      "178 framework_region\n",
      "179 framework_region\n",
      "180 framework_region\n",
      "181 framework_region\n",
      "182 framework_region\n",
      "183 framework_region\n",
      "184 framework_region\n",
      "185 framework_region\n",
      "186 framework_region\n",
      "187 framework_region\n",
      "188 framework_region\n",
      "189 framework_region\n",
      "190 framework_region\n",
      "191 framework_region\n",
      "192 framework_region\n",
      "193 framework_region\n",
      "194 framework_region\n",
      "195 framework_region\n",
      "196 framework_region\n",
      "197 framework_region\n",
      "198 framework_region\n",
      "199 framework_region\n",
      "200 framework_region\n",
      "201 framework_region\n",
      "202 framework_region\n",
      "203 framework_region\n",
      "204 framework_region\n",
      "205 H3\n",
      "206 H3\n",
      "207 H3\n",
      "208 H3\n",
      "209 H3\n",
      "210 H3\n",
      "211 H3\n",
      "212 H3\n",
      "213 H3\n",
      "214 H3\n",
      "215 framework_region\n",
      "216 framework_region\n",
      "217 framework_region\n",
      "218 framework_region\n",
      "219 framework_region\n",
      "220 framework_region\n",
      "221 framework_region\n",
      "222 framework_region\n",
      "223 framework_region\n",
      "224 framework_region\n",
      "225 framework_region\n",
      "226 antigen_region\n",
      "227 antigen_region\n",
      "228 antigen_region\n",
      "229 antigen_region\n",
      "230 antigen_region\n",
      "231 antigen_region\n",
      "232 antigen_region\n",
      "233 antigen_region\n",
      "234 antigen_region\n",
      "235 antigen_region\n",
      "236 antigen_region\n",
      "237 antigen_region\n",
      "238 antigen_region\n",
      "239 antigen_region\n",
      "240 antigen_region\n",
      "241 antigen_region\n",
      "242 antigen_region\n",
      "243 antigen_region\n",
      "244 antigen_region\n",
      "245 antigen_region\n",
      "246 antigen_region\n",
      "247 antigen_region\n",
      "248 antigen_region\n",
      "249 antigen_region\n",
      "250 antigen_region\n",
      "251 antigen_region\n",
      "252 antigen_region\n",
      "253 antigen_region\n",
      "254 antigen_region\n",
      "255 antigen_region\n",
      "256 antigen_region\n",
      "257 antigen_region\n",
      "258 antigen_region\n",
      "259 antigen_region\n",
      "260 antigen_region\n",
      "261 antigen_region\n",
      "262 antigen_region\n",
      "263 antigen_region\n",
      "264 antigen_region\n",
      "265 antigen_region\n",
      "266 antigen_region\n",
      "267 antigen_region\n",
      "268 antigen_region\n",
      "269 antigen_region\n",
      "270 antigen_region\n",
      "271 antigen_region\n",
      "272 antigen_region\n",
      "273 antigen_region\n",
      "274 antigen_region\n",
      "275 antigen_region\n",
      "276 antigen_region\n",
      "277 antigen_region\n",
      "278 antigen_region\n",
      "279 antigen_region\n",
      "280 antigen_region\n",
      "281 antigen_region\n",
      "282 antigen_region\n",
      "283 antigen_region\n",
      "284 antigen_region\n",
      "285 antigen_region\n",
      "286 antigen_region\n",
      "287 antigen_region\n",
      "288 antigen_region\n",
      "289 antigen_region\n",
      "290 antigen_region\n",
      "291 antigen_region\n",
      "292 antigen_region\n",
      "293 antigen_region\n",
      "294 antigen_region\n",
      "295 antigen_region\n",
      "296 antigen_region\n",
      "297 antigen_region\n",
      "298 antigen_region\n",
      "299 antigen_region\n",
      "300 antigen_region\n",
      "301 antigen_region\n",
      "302 antigen_region\n",
      "303 antigen_region\n",
      "304 antigen_region\n",
      "305 antigen_region\n",
      "306 antigen_region\n",
      "307 antigen_region\n",
      "308 antigen_region\n",
      "309 antigen_region\n",
      "310 antigen_region\n",
      "311 antigen_region\n",
      "312 antigen_region\n",
      "313 antigen_region\n",
      "314 antigen_region\n",
      "315 antigen_region\n",
      "316 antigen_region\n",
      "317 antigen_region\n",
      "318 antigen_region\n",
      "319 antigen_region\n",
      "320 antigen_region\n",
      "321 antigen_region\n",
      "322 antigen_region\n",
      "323 antigen_region\n",
      "324 antigen_region\n",
      "325 antigen_region\n",
      "326 antigen_region\n",
      "327 antigen_region\n",
      "328 antigen_region\n",
      "329 antigen_region\n",
      "330 antigen_region\n",
      "331 antigen_region\n",
      "332 antigen_region\n",
      "333 antigen_region\n",
      "334 antigen_region\n",
      "335 antigen_region\n",
      "336 antigen_region\n",
      "337 antigen_region\n",
      "338 antigen_region\n",
      "339 antigen_region\n",
      "340 antigen_region\n",
      "341 antigen_region\n",
      "342 antigen_region\n",
      "343 antigen_region\n",
      "344 antigen_region\n",
      "345 antigen_region\n",
      "346 antigen_region\n",
      "347 antigen_region\n",
      "348 antigen_region\n",
      "349 antigen_region\n",
      "350 antigen_region\n",
      "351 antigen_region\n",
      "352 antigen_region\n",
      "353 antigen_region\n",
      "354 antigen_region\n",
      "355 antigen_region\n",
      "356 antigen_region\n",
      "357 antigen_region\n",
      "358 antigen_region\n",
      "359 antigen_region\n",
      "360 antigen_region\n",
      "361 antigen_region\n",
      "362 antigen_region\n",
      "363 antigen_region\n",
      "364 antigen_region\n",
      "365 antigen_region\n",
      "366 antigen_region\n",
      "367 antigen_region\n",
      "368 antigen_region\n",
      "369 antigen_region\n",
      "370 antigen_region\n",
      "371 antigen_region\n",
      "372 antigen_region\n",
      "373 antigen_region\n",
      "374 antigen_region\n",
      "375 antigen_region\n",
      "376 antigen_region\n",
      "377 antigen_region\n",
      "378 antigen_region\n",
      "379 antigen_region\n",
      "380 antigen_region\n",
      "381 antigen_region\n",
      "382 antigen_region\n",
      "383 antigen_region\n",
      "384 antigen_region\n",
      "385 antigen_region\n",
      "386 antigen_region\n",
      "387 antigen_region\n",
      "388 antigen_region\n",
      "389 antigen_region\n",
      "390 antigen_region\n",
      "391 antigen_region\n",
      "392 antigen_region\n",
      "393 antigen_region\n",
      "394 antigen_region\n",
      "395 antigen_region\n",
      "396 antigen_region\n",
      "397 antigen_region\n",
      "398 antigen_region\n",
      "399 antigen_region\n",
      "400 antigen_region\n",
      "401 antigen_region\n",
      "402 antigen_region\n",
      "403 antigen_region\n",
      "404 antigen_region\n",
      "405 antigen_region\n",
      "406 antigen_region\n",
      "407 antigen_region\n",
      "408 antigen_region\n",
      "409 antigen_region\n",
      "410 antigen_region\n",
      "411 antigen_region\n",
      "412 antigen_region\n",
      "413 antigen_region\n",
      "414 antigen_region\n",
      "415 antigen_region\n",
      "416 antigen_region\n",
      "417 antigen_region\n",
      "418 antigen_region\n",
      "419 antigen_region\n",
      "420 antigen_region\n",
      "421 antigen_region\n",
      "422 antigen_region\n",
      "423 antigen_region\n",
      "424 antigen_region\n",
      "425 antigen_region\n",
      "426 antigen_region\n",
      "427 antigen_region\n",
      "428 antigen_region\n",
      "429 antigen_region\n",
      "430 antigen_region\n",
      "431 antigen_region\n",
      "432 antigen_region\n",
      "433 antigen_region\n",
      "434 antigen_region\n",
      "435 antigen_region\n",
      "436 antigen_region\n",
      "437 antigen_region\n",
      "438 antigen_region\n",
      "439 antigen_region\n",
      "440 antigen_region\n",
      "441 antigen_region\n",
      "442 antigen_region\n",
      "443 antigen_region\n",
      "444 antigen_region\n",
      "445 antigen_region\n",
      "446 antigen_region\n",
      "447 antigen_region\n",
      "448 antigen_region\n",
      "449 antigen_region\n",
      "450 antigen_region\n",
      "451 antigen_region\n",
      "452 antigen_region\n",
      "453 antigen_region\n",
      "454 antigen_region\n",
      "455 antigen_region\n",
      "456 antigen_region\n",
      "457 antigen_region\n",
      "458 antigen_region\n",
      "459 antigen_region\n",
      "460 antigen_region\n",
      "461 antigen_region\n",
      "462 antigen_region\n",
      "463 antigen_region\n",
      "464 antigen_region\n"
     ]
    }
   ],
   "source": [
    "### BEGIN SOLUTION\n",
    "\n",
    "for i in range(1, pose.size()+1):\n",
    "    region = ab_info.get_region_of_residue(pose, i)\n",
    "    if (region == antibody.cdr_region):\n",
    "        print(i, enum_manager.cdr_name_enum_to_string(ab_info.get_CDRNameEnum_of_residue(pose, i)))\n",
    "    else:\n",
    "        print(i, enum_manager.antibody_region_enum_to_string(region))\n",
    "              \n",
    "### END SOLUTION"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### CDR Clusters\n",
    "\n",
    "Consult either the PyRosetta docs on AntibodyInfo or the interactive notebook, then use AntibodyInfo to get the length and cluster of L1."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "nbgrader": {
     "grade": true,
     "grade_id": "cell-060502f6442e80a9",
     "locked": false,
     "points": 0,
     "schema_version": 3,
     "solution": true
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "11\n",
      "CDRClusterEnum.L1_11_1\n"
     ]
    }
   ],
   "source": [
    "### BEGIN SOLUTION\n",
    "\n",
    "print(ab_info.get_CDR_length(antibody.l1))\n",
    "print(ab_info.get_CDR_cluster(antibody.l1).cluster())\n",
    "\n",
    "### END SOLUTION"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The CDRCluster object has a lot of information about a particular cluster.  Let's use it to get the normalized distance, in degrees, between our L1 loop and its cluster center. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "7.137242784087944\n"
     ]
    }
   ],
   "source": [
    "L1_cluster = ab_info.get_CDR_cluster(antibody.l1)\n",
    "print(L1_cluster.normalized_distance_in_degrees())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Anything below 35 or 40 degrees is very close to the cluster center.  This is a structure with a very well-defined L1-11-1 loop - one of the most common L1 lengths and clusters."
   ]
  },
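  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To give a feel for what a normalized distance in degrees can mean, here is a minimal sketch. It assumes a dihedral distance of the form used in the North, Lehmann and Dunbrack CDR clustering: sum `2 * (1 - cos(delta))` over the loop dihedrals, divide by the number of dihedrals, and convert back to an angle with `acos(1 - D / 2)`. Whether `CDRCluster` computes its value in exactly this way is an assumption, and the function names and toy angles are hypothetical.\n",
    "\n",
    "```python\n",
    "import math\n",
    "\n",
    "def dihedral_distance(angles_a, angles_b):\n",
    "    '''Sum of 2 * (1 - cos(delta)) over paired dihedral angles given in degrees.'''\n",
    "    return sum(2.0 * (1.0 - math.cos(math.radians(a - b)))\n",
    "               for a, b in zip(angles_a, angles_b))\n",
    "\n",
    "def normalized_distance_in_degrees(angles_a, angles_b):\n",
    "    '''Average the distance per dihedral, then express it as an equivalent angle.'''\n",
    "    d_norm = dihedral_distance(angles_a, angles_b) / len(angles_a)\n",
    "    # Clamp against floating-point drift before calling acos.\n",
    "    cosine = max(-1.0, min(1.0, 1.0 - d_norm / 2.0))\n",
    "    return math.degrees(math.acos(cosine))\n",
    "\n",
    "# Toy (hypothetical) phi/psi values: every dihedral is off by exactly 10 degrees.\n",
    "loop = [-60.0, -45.0, -120.0, 130.0]\n",
    "center = [-70.0, -35.0, -110.0, 140.0]\n",
    "print(round(normalized_distance_in_degrees(loop, center), 6))\n",
    "```\n",
    "\n",
    "Under this definition, a loop whose dihedrals each deviate by the same angle from the cluster center gets that angle back as its normalized distance, which is why the value can be read as a typical per-dihedral deviation."
   ]
  },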
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Numbering Scheme Translation\n",
    "It may not seem like much, but translating between numbering schemes is very difficult to do without mistakes.   Rosetta now provides this ability in a highly tested and fairly easy-to-use implementation, which makes it much easier to follow antibody structural or sequence papers.  Let's take a look.  We'll use `AntibodyInfo` and its `get_landmark_resnum()` function for this. You could also use the `get_antibody_numbering_info()` function, which gives you all the conversions, though it is a bit trickier to use. \n",
    "\n",
    "#### Conserved Intra-Domain Cysteine\n",
    "\n",
    "The conserved cysteine residue that forms the intradomain disulfide bridge always carries the label \"23\" in the AHo numbering scheme, as it does in IMGT, whereas Kabat labels it L23 in Vk and Vl and H22 in VH.  Let's find this residue in our antibody. https://www.bioc.uzh.ch/plueckthun/antibody/Numbering/FR1/index.html"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [],
   "source": [
    "rosetta_num = ab_info.get_landmark_resnum(pose, antibody.Kabat_Scheme, 'H', 22)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "What are the chain and residue number in our AHo numbering scheme?  Is this a cysteine?  Is it part of a disulfide?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {
    "nbgrader": {
     "grade": true,
     "grade_id": "cell-497c180f7a03995a",
     "locked": false,
     "points": 0,
     "schema_version": 3,
     "solution": true
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "23 H \n",
      "Residue 130: CYS:disulfide (CYS, C):\n",
      "Base: CYS\n",
      " Properties: POLYMER PROTEIN CANONICAL_AA SC_ORBITALS METALBINDING DISULFIDE_BONDED ALPHA_AA L_AA\n",
      " Variant types: DISULFIDE\n",
      " Main-chain atoms:  N    CA   C  \n",
      " Backbone atoms:    N    CA   C    O    H    HA \n",
      " Side-chain atoms:  CB   SG  1HB  2HB \n",
      "Atom Coordinates:\n",
      "   N  : -13.918, -0.011, 40.022\n",
      "   CA : -15.022, -0.943, 39.837\n",
      "   C  : -16.073, -0.624, 40.895\n",
      "   O  : -15.877, -0.945, 42.066\n",
      "   CB : -14.515, -2.379, 40.021\n",
      "   SG : -15.8, -3.608, 40.319\n",
      "   H  : -13.6187, 0.217354, 40.9592\n",
      "   HA : -15.4065, -0.826975, 38.8236\n",
      "  1HB : -13.9648, -2.68746, 39.1317\n",
      "  2HB : -13.8236, -2.41565, 40.8626\n",
      "Mirrored relative to coordinates in ResidueType: FALSE\n",
      "\n"
     ]
    }
   ],
   "source": [
    "### BEGIN SOLUTION\n",
    "print(pose.pdb_info().pose2pdb(rosetta_num))\n",
    "print(pose.residue(rosetta_num))\n",
    "### END SOLUTION"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Ok.  Cool.  Let's do the same thing for the cysteine that is bonded to this residue. \n",
    "In IMGT this is residue 104 on the heavy chain.  Use tab completion on `antibody.IMGT_Scheme` for the enum.  https://www.bioc.uzh.ch/plueckthun/antibody/Numbering/FR3a/index.html"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {
    "nbgrader": {
     "grade": true,
     "grade_id": "cell-83b0649623e4b481",
     "locked": false,
     "points": 0,
     "schema_version": 3,
     "solution": true
    }
   },
   "outputs": [],
   "source": [
    "### BEGIN SOLUTION\n",
    "pre_cdr3_c = ab_info.get_landmark_resnum(pose, antibody.IMGT_Scheme, 'H', 104)\n",
    "### END SOLUTION"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Once again, what is the residue in our AHO-numbered antibody?  Is it a Cysteine?  Is it disulfide bonded?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {
    "nbgrader": {
     "grade": true,
     "grade_id": "cell-f462d92f2b7fc64c",
     "locked": false,
     "points": 0,
     "schema_version": 3,
     "solution": true
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "106 H \n",
      "Residue 204: CYS:disulfide (CYS, C):\n",
      "Base: CYS\n",
      " Properties: POLYMER PROTEIN CANONICAL_AA SC_ORBITALS METALBINDING DISULFIDE_BONDED ALPHA_AA L_AA\n",
      " Variant types: DISULFIDE\n",
      " Main-chain atoms:  N    CA   C  \n",
      " Backbone atoms:    N    CA   C    O    H    HA \n",
      " Side-chain atoms:  CB   SG  1HB  2HB \n",
      "Atom Coordinates:\n",
      "   N  : -14.312, -6.402, 36.316\n",
      "   CA : -14.452, -6.929, 37.646\n",
      "   C  : -15.678, -7.837, 37.662\n",
      "   O  : -16.599, -7.672, 36.856\n",
      "   CB : -14.501, -5.824, 38.705\n",
      "   SG : -15.935, -4.767, 38.638\n",
      "   H  : -14.9281, -5.66132, 36.0129\n",
      "   HA : -13.5885, -7.5585, 37.8613\n",
      "  1HB : -14.4721, -6.27099, 39.699\n",
      "  2HB : -13.6222, -5.18697, 38.607\n",
      "Mirrored relative to coordinates in ResidueType: FALSE\n",
      "\n"
     ]
    }
   ],
   "source": [
    "### BEGIN SOLUTION\n",
    "print(pose.pdb_info().pose2pdb(pre_cdr3_c))\n",
    "print(pose.residue(pre_cdr3_c))\n",
    "\n",
    "### END SOLUTION"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Sequence\n",
    "Let's explore the sequence of this antibody."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "DIQMTQSPSSLSASVGDRVTITCRASQDVSTAVAWYQQKPGKAPKLLIYSASFLYSGVPSRFSGSGSGTDFTLTISSLQPEDFATYYCQQSYTTPPTFGQGTKVEIKREVQLVESGGGLVQPGGSLRLSCAASGFTISNSGIHWVRQAPGKGLEWVGWIYPTGGATDYADSVKGRFTISADTSKNTAYLQMNSLRAEDTAVYYCARFWWRSFDYWGQGTLVTVSS\n",
      "L1 RASQDVSTAVA\n",
      "CDRNameEnum.h1 AASGFTISNSGIH\n",
      "CDRNameEnum.h2 WIYPTGGATD\n",
      "CDRNameEnum.h3 ARFWWRSFDY\n",
      "CDRNameEnum.l1 RASQDVSTAVA\n",
      "CDRNameEnum.l2 YSASFLYS\n",
      "CDRNameEnum.l3 QQSYTTPPT\n"
     ]
    }
   ],
   "source": [
    "ab_seq = ab_info.get_antibody_sequence()\n",
    "print(ab_seq)\n",
    "\n",
    "L1_seq = ab_info.get_CDR_sequence_with_stem(antibody.l1, pose)\n",
    "print(\"L1\", L1_seq)\n",
    "\n",
    "for i in range(1, 7):\n",
    "    cdr = antibody.CDRNameEnum(i)\n",
    "    print(cdr, ab_info.get_CDR_sequence_with_stem(cdr, pose))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Other AntibodyInfo functions\n",
    "Use tab completion to find other useful functions.  This includes movemap, loops, and fold-tree creation for specific tasks.  With ResidueSelectors, this functionality is not quite as useful, but you should know that it is here."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### AntibodyInfo Deprecated Functions\n",
    "All functions are fair game except `get_TaskFactory_AllCDRs` and `get_TaskFactory_OneCDR`.  These will be removed from AntibodyInfo, as they are extremely specific to a particular antibody modeling task."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Antibody Util and SimpleMetrics\n",
    "Util functions in Rosetta are stored in the `util.hh` file in each directory that has one.  Within PyRosetta, these functions come along when you import the namespace.  There are many that you should be aware of, as they make modeling and design tasks easier in custom protocols.\n",
    "\n",
    "We will go through some examples here.\n",
    "\n",
    "### Function: get_cdr_loops()\n",
    "The `get_cdr_loops()` function takes a `vector1_bool` of CDRs.  Use the enums to set H3 and L3 to true.\n",
    "A new `vector1_bool` starts with every entry set to false."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "vector1_bool[0, 0, 0, 0, 0, 0]\n",
      "LOOP  begin  end  cut  skip_rate  extended\n",
      "LOOP start: 203  stop: 216  cut: 210  size: 14  skip rate: 0  extended?: False\n",
      "\n",
      "LOOP start: 87  stop: 99  cut: 93  size: 13  skip rate: 0  extended?: False\n",
      "\n",
      "\n"
     ]
    }
   ],
   "source": [
    "h3_l3 = rosetta.utility.vector1_bool(6)\n",
    "print(h3_l3)\n",
    "\n",
    "h3_l3[antibody.h3] = True\n",
    "h3_l3[antibody.l3] = True\n",
    "\n",
    "#Here, we get cdr loops, and set the stem size to 2, \n",
    "# so we include 2 residues on either side of the CDR loop (called the stem), to help us in modeling.\n",
    "h3_l3_loops = antibody.get_cdr_loops(ab_info, pose, h3_l3, 2)\n",
    "print(h3_l3_loops)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Function: select_epitope_residues()\n",
    "We could use the NeighborhoodResidueSelector, as you have done in the past, to get neighbors.  Instead, let's use a general function to get all the epitope residues within 8 Angstroms of the paratope."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "267\n",
      "270\n",
      "271\n",
      "272\n",
      "273\n",
      "299\n",
      "300\n",
      "301\n",
      "302\n",
      "303\n",
      "304\n",
      "305\n",
      "307\n",
      "308\n",
      "309\n",
      "310\n",
      "313\n",
      "396\n",
      "397\n",
      "398\n",
      "454\n",
      "458\n",
      "Total Epitope Residues: 22\n"
     ]
    }
   ],
   "source": [
    "epi_residues = antibody.select_epitope_residues(ab_info, pose, 8)\n",
    "total=0\n",
    "for i in range(1, len(epi_residues)+1):\n",
    "    if epi_residues[i]:\n",
    "        print(i)\n",
    "        total+=1\n",
    "print(\"Total Epitope Residues:\", total)"
   ]
  },
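  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Rosetta's `vector1` containers are 1-indexed, so a loop over a `ResidueSubset` such as the one above has to run from 1 to `len(subset)` inclusive. Below is a minimal sketch of a helper that collects the selected positions. `OneIndexedMask` is a hypothetical stand-in for `vector1_bool` so that the example runs without PyRosetta, and `selected_indices` is not part of the PyRosetta API.\n",
    "\n",
    "```python\n",
    "class OneIndexedMask:\n",
    "    '''Hypothetical stand-in for a 1-indexed boolean vector (valid indices 1..len).'''\n",
    "\n",
    "    def __init__(self, flags):\n",
    "        self._flags = list(flags)\n",
    "\n",
    "    def __len__(self):\n",
    "        return len(self._flags)\n",
    "\n",
    "    def __getitem__(self, i):\n",
    "        if not 1 <= i <= len(self._flags):\n",
    "            raise IndexError('valid indices are 1..%d' % len(self._flags))\n",
    "        return self._flags[i - 1]\n",
    "\n",
    "def selected_indices(mask):\n",
    "    '''Return the 1-based positions whose flag is True.'''\n",
    "    # range() excludes its upper bound, hence len(mask) + 1.\n",
    "    return [i for i in range(1, len(mask) + 1) if mask[i]]\n",
    "\n",
    "mask = OneIndexedMask([False, True, False, True])\n",
    "print(selected_indices(mask))  # prints [2, 4]\n",
    "```\n",
    "\n",
    "The helper only relies on `len()` and 1-based indexing, which is how `epi_residues` is accessed in the loop above, so it should work on that subset as well."
   ]
  },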
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So that was cool.  But let's use the wonderful `ReturnResidueSubsetSelector` to take this `ResidueSubset` of the epitope residues and store it as a `ResidueSelector`!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {},
   "outputs": [],
   "source": [
    "epi_res_selector = selections.ReturnResidueSubsetSelector(epi_residues)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now what?  Let's use some SimpleMetrics with the selector to calculate something about these epitope residues."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### SasaMetric, TotalEnergyMetric, SelectedResiduesPyMOLMetric"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "SASA 531.9639835627297\n",
      "\u001b[0mcore.scoring.ScoreFunctionFactory: \u001b[0mSCOREFUNCTION: \u001b[32mref2015\u001b[0m\n",
      "\n",
      "TOTAL RESIDUE ENERGY -2.6964334237038683\n",
      "\n",
      "SELECTION select rosetta_sele, (chain A and resid 42,45,46,47,48,74,75,76,77,78,79,80,82,83,84,85,88,171,172,173,229,233)\n"
     ]
    }
   ],
   "source": [
    "import rosetta.core.simple_metrics.metrics as sm\n",
    "sasa_metric = sm.SasaMetric(epi_res_selector)\n",
    "print(\"\\nSASA\", sasa_metric.calculate(pose))\n",
    "\n",
    "total_metric = sm.TotalEnergyMetric(epi_res_selector)\n",
    "print(\"\\nTOTAL RESIDUE ENERGY\", total_metric.calculate(pose))\n",
    "\n",
    "# Let's use a handy metric to build a PyMOL selection of these residues\n",
    "pymol_metric = sm.SelectedResiduesPyMOLMetric(epi_res_selector)\n",
    "print(\"\\nSELECTION\", pymol_metric.calculate(pose))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now let's see which of these residues are most buried in the interface and which have the lowest energy. Note that this is not ddG; for that we would need to separate the chains, which can be done with the `protocols.toolbox.rigid_body.translate` function. \n",
    "\n",
    "Copy the PyMOL selection printed above (everything from `select` onward) and take a look at the residues in PyMOL.  Then run the code below."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### PerResidueSasaMetric"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 73,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(300, 0.0)\n",
      "(303, 0.0)\n",
      "(305, 0.0)\n",
      "(304, 0.4468042885105504)\n",
      "(267, 1.024686682355324)\n",
      "(398, 1.8244508447514138)\n",
      "(302, 4.098746729421277)\n",
      "(396, 4.098746729421277)\n",
      "(271, 5.380780270728442)\n",
      "(307, 5.504698647620041)\n",
      "(301, 7.689850871116937)\n",
      "(313, 8.068377879289471)\n",
      "(270, 8.322490061360945)\n",
      "(458, 14.530261630816586)\n",
      "(310, 17.873691916125896)\n",
      "(308, 39.92505047193878)\n",
      "(454, 46.22431607929343)\n",
      "(397, 54.69632267990853)\n",
      "(299, 60.31999848338232)\n",
      "(309, 75.13475653929288)\n",
      "(272, 76.34620896564937)\n",
      "(273, 100.45374379174626)\n"
     ]
    }
   ],
   "source": [
    "import rosetta.core.simple_metrics.per_residue_metrics as residue_sm\n",
    "import operator\n",
    "\n",
    "res_sasa_metric = residue_sm.PerResidueSasaMetric()\n",
    "res_sasa_metric.set_residue_selector(epi_res_selector)\n",
    "per_res_sasa = res_sasa_metric.calculate(pose)\n",
    "#print(per_res_sasa)\n",
    "\n",
    "# The returned map can be iterated like a dictionary; sort its items by SASA, lowest first.\n",
    "for ele in sorted(per_res_sasa.items(), key=operator.itemgetter(1), reverse=False):\n",
    "    print(ele)"
   ]
  },
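  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The sort above works because the map returned by `calculate()` exposes `.items()` like a dictionary. Here is a minimal sketch with a plain `dict` standing in for that map. The values are a few of the SASA numbers printed above, rounded to two decimals, and `most_buried` is a hypothetical helper rather than a PyRosetta function.\n",
    "\n",
    "```python\n",
    "import operator\n",
    "\n",
    "# A few pose-numbered residues and their SASA values, rounded from the output above.\n",
    "per_res_sasa = {273: 100.45, 300: 0.0, 304: 0.45, 303: 0.0, 267: 1.02, 305: 0.0}\n",
    "\n",
    "def most_buried(sasa_by_residue, n):\n",
    "    '''Return the n residue numbers with the smallest SASA, ties broken by residue number.'''\n",
    "    # itemgetter(1, 0) sorts each (resnum, sasa) pair by sasa first, then by resnum.\n",
    "    ranked = sorted(sasa_by_residue.items(), key=operator.itemgetter(1, 0))\n",
    "    return [resnum for resnum, _ in ranked[:n]]\n",
    "\n",
    "print(most_buried(per_res_sasa, 3))  # prints [300, 303, 305]\n",
    "```\n",
    "\n",
    "Breaking ties on the residue number keeps the order reproducible when several residues have exactly zero exposed surface."
   ]
  },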
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Cool. So the most buried residues at the interface are 300, 303, 305.  Convert those to the PDB chain/num using PDBInfo and take a look at them in PyMOL.\n",
    "\n",
    "### PerResidueEnergyMetric"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 76,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "300 75 A  0.0\n",
      "303 78 A  0.0\n",
      "305 80 A  0.0\n",
      "304 79 A  0.4468042885105504\n",
      "267 42 A  1.024686682355324\n",
      "398 173 A  1.8244508447514138\n",
      "302 77 A  4.098746729421277\n",
      "396 171 A  4.098746729421277\n",
      "271 46 A  5.380780270728442\n",
      "307 82 A  5.504698647620041\n",
      "301 76 A  7.689850871116937\n",
      "313 88 A  8.068377879289471\n",
      "270 45 A  8.322490061360945\n",
      "458 233 A  14.530261630816586\n",
      "310 85 A  17.873691916125896\n",
      "308 83 A  39.92505047193878\n",
      "454 229 A  46.22431607929343\n",
      "397 172 A  54.69632267990853\n",
      "299 74 A  60.31999848338232\n",
      "309 84 A  75.13475653929288\n",
      "272 47 A  76.34620896564937\n",
      "273 48 A  100.45374379174626\n"
     ]
    }
   ],
   "source": [
    "res_energy_metric = residue_sm.PerResidueEnergyMetric()\n",
    "res_energy_metric.set_residue_selector(epi_res_selector)\n",
    "\n",
    "per_res_energy = res_sasa_metric.calculate(pose)\n",
    "#print(per_res_sasa)\n",
    "\n",
    "#Convert the Map to a Dictionary, which are essentially the same thing. \n",
    "for ele in sorted(per_res_energy.items(), key=operator.itemgetter(1), reverse=False):\n",
    "    print(ele[0], pose.pdb_info().pose2pdb(ele[0]), ele[1])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Wow!  Why is 48A so high!?  A high per-residue energy may be due to the fact that we are working with a crystal structure that has not been pre-relaxed using the pareto-optimal protocol.  When using PDBs from the data bank for production runs, be sure to do this, outputting about 10 models and selecting the lowest-energy model.  Or, you could relax the structure into the crystal density.  Either works well. \n",
    "\n",
    "https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0059004"
   ]
  },
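  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To look at particular residues in PyMOL, PDB numbering like that printed by `pose2pdb` can be assembled into a selection string with the same shape as the `SelectedResiduesPyMOLMetric` output. This is a minimal sketch: `pymol_selection` is a hypothetical helper, not a PyRosetta function, and the three residues are the most buried ones listed above.\n",
    "\n",
    "```python\n",
    "from collections import defaultdict\n",
    "\n",
    "def pymol_selection(name, residues):\n",
    "    '''Build a PyMOL select command from (pdb_resnum, chain) pairs.'''\n",
    "    by_chain = defaultdict(list)\n",
    "    for resnum, chain in residues:\n",
    "        by_chain[chain].append(resnum)\n",
    "    parts = []\n",
    "    for chain in sorted(by_chain):\n",
    "        resids = ','.join(str(r) for r in sorted(by_chain[chain]))\n",
    "        parts.append('(chain %s and resid %s)' % (chain, resids))\n",
    "    return 'select %s, %s' % (name, ' or '.join(parts))\n",
    "\n",
    "# Pose residues 300, 303 and 305 map to 75, 78 and 80 on chain A in the output above.\n",
    "print(pymol_selection('buried_epitope', [(75, 'A'), (78, 'A'), (80, 'A')]))\n",
    "```\n",
    "\n",
    "Grouping by chain keeps the command short when a selection spans both the antibody and the antigen."
   ]
  },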
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Other Useful Antibody Tools\n",
    "\n",
    "### CDRResidueSelector"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 81,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "89 107 L \n",
      "90 108 L \n",
      "91 109 L \n",
      "92 110 L \n",
      "93 111 L \n",
      "94 135 L \n",
      "95 136 L \n",
      "96 137 L \n",
      "97 138 L \n",
      "205 107 H \n",
      "206 108 H \n",
      "207 109 H \n",
      "208 110 H \n",
      "209 111 H \n",
      "210 134 H \n",
      "211 135 H \n",
      "212 136 H \n",
      "213 137 H \n",
      "214 138 H \n"
     ]
    }
   ],
   "source": [
    "from rosetta.protocols.antibody.residue_selector import *\n",
    "\n",
    "cdr_selector = CDRResidueSelector(ab_info)\n",
    "cdr_selector.set_cdrs(h3_l3)\n",
    "sele = cdr_selector.apply(pose)\n",
    "# The subset is 1-indexed, so include len(sele) itself in the range.\n",
    "for i in range(1, len(sele)+1):\n",
    "    if sele[i]:\n",
    "        print(i, pose.pdb_info().pose2pdb(i))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### AntibodyRegionSelector\n",
    "We can use the AntibodyRegionSelector to select a specific region:\n",
    "`antigen_region`, `framework_region`, and `cdr_region`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 83,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "89 107 L \n",
      "90 108 L \n",
      "91 109 L \n",
      "92 110 L \n",
      "93 111 L \n",
      "94 135 L \n",
      "95 136 L \n",
      "96 137 L \n",
      "97 138 L \n",
      "205 107 H \n",
      "206 108 H \n",
      "207 109 H \n",
      "208 110 H \n",
      "209 111 H \n",
      "210 134 H \n",
      "211 135 H \n",
      "212 136 H \n",
      "213 137 H \n",
      "214 138 H \n"
     ]
    }
   ],
   "source": [
    "region_selector = AntibodyRegionSelector(ab_info)\n",
    "region_selector.set_region(antibody.antigen_region)\n",
    "sele = region_selector.apply(pose)\n",
    "\n",
    "# The subset is 1-indexed, so include len(sele) itself in the range.\n",
    "for i in range(1, len(sele)+1):\n",
    "    if sele[i]:\n",
    "        print(i, pose.pdb_info().pose2pdb(i))\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Other\n",
    "-  **TaskOperations** - Antibody-specific TaskOperations will be covered in the next workshop.\n",
    "-  **SnugDock** - SnugDock is available in the `rosetta.protocols.antibody.snugdock` namespace.  Both the full protocol, `SnugDockProtocol`, and the mover, `SnugDock`, are available and easy to set up through code, but their run time is extremely long.\n",
    "-  **AntibodyModelerProtocol** - This is the `Antibody_H3` app. Personally, I would use the Rosetta C++ application for this, with the specific options given in the docs; however, you can call it in PyRosetta.\n",
    "-  **AntibodyCDRGrafter** - This is the main grafting class used for RosettaAntibody and RosettaAntibodyDesign. It is in the `protocols.antibody` namespace. Documentation on this mover (an XML or code-level interface is available) can be found here: https://www.rosettacommons.org/docs/latest/scripting_documentation/RosettaScripts/Movers/movers_pages/antibodies/AntibodyCDRGrafter   "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## References\n",
    "Please cite these papers when using any part of RosettaAntibody.\n",
    "\n",
    "- J. Adolf-Bryfogle, O Kalyuzhniy, M Kubitz, B. D. Weitzner, X Hu, Y Adachi, W R. Schief, R L. Dunbrack Jr., \n",
    "    - \"Rosetta Antibody Design (RAbD): A General Framework for Computational Antibody Design\", PLOS Computational Biology (2018)\n",
    "\n",
    "- B. D. Weitzner*, J. R. Jeliazkov*, S. Lyskov*, N. M. Marze, D. Kuroda, R. Frick, J. Adolf-Bryfogle, N. Biswas, R. L. Dunbrack Jr., and J. J. Gray, \n",
    "    - \"Modeling and docking of antibody structures with Rosetta.\" Nature Protocols 12, 401–416 (2017)\n",
    "\n",
    "- B. D. Weitzner, D. Kuroda, N. M. Marze, J. Xu & J. J. Gray, \n",
    "    - \"Blind prediction performance of RosettaAntibody 3.0: Grafting, relaxation, kinematic loop modeling, and full CDR optimization.\" Proteins 82(8), 1611–1623 (2014)\n",
    "\n",
    "- A. Sivasubramanian,* A. Sircar,* S. Chaudhury & J. J. Gray, \n",
    "    - \"Toward high-resolution homology modeling of antibody Fv regions and application to antibody-antigen docking,\" Proteins 74(2), 497–514 (2009)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<!--NAVIGATION-->\n",
    "< [Working With Antibodies](http://nbviewer.jupyter.org/github/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/12.00-Working-With-Antibodies.ipynb) | [Contents](toc.ipynb) | [Index](index.ipynb) | [RosettaAntibodyDesign](http://nbviewer.jupyter.org/github/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/12.02-RosettaAntibodyDesign-RAbD.ipynb) ><p><a href=\"https://colab.research.google.com/github/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/12.01-RosettaAntibody-Framework-and-SimpleMetrics.ipynb\"><img align=\"left\" src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open in Colab\" title=\"Open in Google Colaboratory\"></a>"
   ]
  }
 ],
 "metadata": {
  "celltoolbar": "Create Assignment",
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.0"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": true,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}