{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before you turn this problem in, make sure everything runs as expected. First, **restart the kernel** (in the menubar, select Kernel$\\rightarrow$Restart) and then **run all cells** (in the menubar, select Cell$\\rightarrow$Run All).\n",
    "\n",
    "Make sure you fill in any place that says `YOUR CODE HERE` or \"YOUR ANSWER HERE\", as well as your name and collaborators below:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "NAME = \"\"\n",
    "COLLABORATORS = \"\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<!--NOTEBOOK_HEADER-->\n",
    "*This notebook contains material from [PyRosetta](https://RosettaCommons.github.io/PyRosetta.notebooks);\n",
    "content is available [on Github](https://github.com/RosettaCommons/PyRosetta.notebooks.git).*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<!--NAVIGATION-->\n",
    "< [Visualization with the `PyMOLMover`](http://nbviewer.jupyter.org/github/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/02.06-Visualization-and-PyMOL-Mover.ipynb) | [Contents](toc.ipynb) | [Index](index.ipynb) | [Visualization and `pyrosetta.distributed.viewer`](http://nbviewer.jupyter.org/github/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/02.08-Visualization-and-pyrosetta.distributed.viewer.ipynb) ><p><a href=\"https://colab.research.google.com/github/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/02.07-RosettaScripts-in-PyRosetta.ipynb\"><img align=\"left\" src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open in Colab\" title=\"Open in Google Colaboratory\"></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# RosettaScripts in PyRosetta\n",
    "\n",
    "Keywords: RosettaScripts, script, xml, XMLObjects\n",
    "\n",
    "## Overview\n",
    "\n",
    "RosettaScripts in another way to script custom modules in PyRosetta.  It is much simpler than PyRosetta, but can be extremely powerful, and with great documentation. There are also many publications that give `RosettaScript` examples, or whole protocols as a `RosettaScript` instead of a mover or application. In addition, some early Rosetta code was written with `RosettaScripts` in mind, and still may only be fully accessible via `RosettaScripts` in order to change important variables. \n",
    "\n",
    "Recent versions of Rosetta have enabled full RosettaScript protocols to be run in PyRosetta. A new class called `XMLObjects`, has also enabled the setup of specific rosetta class types in PyRosetta instead of constructing them from code. This tutorial will introduce how to use this integration to get the most out of Rosetta.  Note that some tutorials use RosettaScripts almost exclusively, such as the parametric protein design notebook, as it is simpler to use RS than setting up everything manually in code. \n",
    "\n",
    "\n",
    "## RosettaScripts\n",
    "\n",
    "A RosettaScript is made up of different sections where different types of Rosetta classes are constructed. You will see many of these types throughout the notebooks to come. Briefly:\n",
    "\n",
    "`ScoreFunctions`:  A scorefunction evaluates the energy of a pose through physical and statistal energy terms\n",
    "`ResidueSelectors`: These select a list of residues in a pose according to some criteria\n",
    "`Movers`:  These do things to a pose. They all have an `apply()` method that you will see shortly.\n",
    "`TaskOperations`:  These control side-chain packing and design\n",
    "`SimpleMetrics`: The return some metric value of a pose.  This value can be a real number, string, or a composite of values. \n",
    "\n",
    "### Skeleton RosettaScript Format\n",
    "\n",
    "```xml\n",
    "<ROSETTASCRIPTS>\n",
    "    <SCOREFXNS>\n",
    "    </SCOREFXNS>\n",
    "    <RESIDUE_SELECTORS>\n",
    "    </RESIDUE_SELECTORS>\n",
    "    <TASKOPERATIONS>\n",
    "    </TASKOPERATIONS>\n",
    "    <SIMPLE_METRICS>\n",
    "    </SIMPLE_METRICS>\n",
    "    <FILTERS>\n",
    "    </FILTERS>\n",
    "    <MOVERS>\n",
    "    </MOVERS>\n",
    "    <PROTOCOLS>\n",
    "    </PROTOCOLS>\n",
    "    <OUTPUT />\n",
    "</ROSETTASCRIPTS>\n",
    "```\n",
    "Anything outside of the \\< \\> notation is ignored and can be used to comment the xml file\n",
    "\n",
    "\n",
    "### RosettaScript Example\n",
    "```xml\n",
    "<ROSETTASCRIPTS>\n",
    "\t<SCOREFXNS>\n",
    "\t</SCOREFXNS>\n",
    "\t<RESIDUE_SELECTORS>\n",
    "\t\t<CDR name=\"L1\" cdrs=\"L1\"/>\n",
    "\t</RESIDUE_SELECTORS>\n",
    "\t<MOVE_MAP_FACTORIES>\n",
    "\t\t<MoveMapFactory name=\"movemap_L1\" bb=\"0\" chi=\"0\">\n",
    "\t\t\t<Backbone residue_selector=\"L1\" />\n",
    "\t\t\t<Chi residue_selector=\"L1\" />\n",
    "\t\t</MoveMapFactory>\n",
    "\t</MOVE_MAP_FACTORIES>\n",
    "\t<SIMPLE_METRICS>\n",
    "\t\t<TimingProfileMetric name=\"timing\" />\n",
    "\t\t<SelectedResiduesMetric name=\"rosetta_sele\" residue_selector=\"L1\" rosetta_numbering=\"1\"/>\n",
    "\t\t<SelectedResiduesPyMOLMetric name=\"pymol_selection\" residue_selector=\"L1\" />\n",
    "\t\t<SequenceMetric name=\"sequence\" residue_selector=\"L1\" />\n",
    "\t\t<SecondaryStructureMetric name=\"ss\" residue_selector=\"L1\" />\n",
    "\t</SIMPLE_METRICS>\n",
    "\t<MOVERS>\n",
    "\t\t<MinMover name=\"min_mover\" movemap_factory=\"movemap_L1\" tolerance=\".1\" /> \n",
    "\t\t<RunSimpleMetrics name=\"run_metrics1\" metrics=\"pymol_selection,sequence,ss,rosetta_sele\" prefix=\"m1_\" />\n",
    "\t\t<RunSimpleMetrics name=\"run_metrics2\" metrics=\"timing,ss\" prefix=\"m2_\" />\n",
    "\t</MOVERS>\n",
    "\t<PROTOCOLS>\n",
    "\t\t<Add mover_name=\"run_metrics1\"/>\n",
    "\t\t<Add mover_name=\"min_mover\" />\n",
    "\t\t<Add mover_name=\"run_metrics2\" />\n",
    "\t</PROTOCOLS>\n",
    "</ROSETTASCRIPTS>\n",
    "```\n",
    "\n",
    "Rosetta will carry out the order of operations specified in PROTOCOLS.  An important point is that SimpleMetrics and Filters never change the sequence or conformation of the structure.\n",
    "\n",
    "The movers do change the pose, and the output file will be the result of sequentially applying the movers in the protocols section. The standard scores of the output will be carried over from any protocol doing scoring, unless the OUTPUT tag is specified, in which case the corresponding score function from the SCOREFXNS block will be used.  \n",
    "\n",
    "## RosettaScripts Documentation\n",
    "\n",
    "It is recommended to read up on RosettaScripts here.  Note that each type of Rosetta class has a list and documentation of ALL accessible components.  This is extremely useful to get an idea of what Rosetta can do and how to use it in PyRosetta. \n",
    "\n",
    "https://www.rosettacommons.org/docs/latest/scripting_documentation/RosettaScripts/RosettaScripts\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Notebook setup\n",
    "import sys\n",
    "if 'google.colab' in sys.modules:\n",
    "    !pip install pyrosettacolabsetup\n",
    "    import pyrosettacolabsetup\n",
    "    pyrosettacolabsetup.mount_pyrosetta_install()\n",
    "    print (\"Notebook is set for PyRosetta use in Colab.  Have fun!\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Running Whole Protocols via RosettaScriptsParser\n",
    "\n",
    "Here we will use a whole the parser to generate a ParsedProtocol (mover).  This mover can then be run with the apply method on a pose of interest. \n",
    "\n",
    "Lets run the protocol above.  We will be running this on the file itself.  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pyrosetta import *\n",
    "from rosetta.protocols.rosetta_scripts import *\n",
    "\n",
    "init('-no_fconfig @inputs/rabd/common')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "pose = pose_from_pdb(\"inputs/rabd/my_ab.pdb\")\n",
    "original_pose = pose.clone()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "parser = RosettaScriptsParser()\n",
    "protocol = parser.generate_mover(\"inputs/min_L1.xml\")\n",
    "\n",
    "if not os.getenv(\"DEBUG\"):\n",
    "    protocol.apply(pose)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Running via XMLObjects and strings\n",
    "\n",
    "Next, we will use XMLObjects to create a protocol from a string.  Note that in-code, XMLOjbects uses special functionality of the `RosettaScriptsParser`.  Also note that the `XMLObjects` also has a `create_from_file` method that will take a path to an XML file.  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "pose = original_pose.clone()\n",
    "\n",
    "min_L1 = \"\"\"\n",
    "<ROSETTASCRIPTS>\n",
    "\t<SCOREFXNS>\n",
    "\t</SCOREFXNS>\n",
    "\t<RESIDUE_SELECTORS>\n",
    "\t\t<CDR name=\"L1\" cdrs=\"L1\"/>\n",
    "\t</RESIDUE_SELECTORS>\n",
    "\t<MOVE_MAP_FACTORIES>\n",
    "\t\t<MoveMapFactory name=\"movemap_L1\" bb=\"0\" chi=\"0\">\n",
    "\t\t\t<Backbone residue_selector=\"L1\" />\n",
    "\t\t\t<Chi residue_selector=\"L1\" />\n",
    "\t\t</MoveMapFactory>\n",
    "\t</MOVE_MAP_FACTORIES>\n",
    "\t<SIMPLE_METRICS>\n",
    "\t\t<TimingProfileMetric name=\"timing\" />\n",
    "\t\t<SelectedResiduesMetric name=\"rosetta_sele\" residue_selector=\"L1\" rosetta_numbering=\"1\"/>\n",
    "\t\t<SelectedResiduesPyMOLMetric name=\"pymol_selection\" residue_selector=\"L1\" />\n",
    "\t\t<SequenceMetric name=\"sequence\" residue_selector=\"L1\" />\n",
    "\t\t<SecondaryStructureMetric name=\"ss\" residue_selector=\"L1\" />\n",
    "\t</SIMPLE_METRICS>\n",
    "\t<MOVERS>\n",
    "\t\t<MinMover name=\"min_mover\" movemap_factory=\"movemap_L1\" tolerance=\".1\" /> \n",
    "\t\t<RunSimpleMetrics name=\"run_metrics1\" metrics=\"pymol_selection,sequence,ss,rosetta_sele\" prefix=\"m1_\" />\n",
    "\t\t<RunSimpleMetrics name=\"run_metrics2\" metrics=\"timing,ss\" prefix=\"m2_\" />\n",
    "\t</MOVERS>\n",
    "\t<PROTOCOLS>\n",
    "\t\t<Add mover_name=\"run_metrics1\"/>\n",
    "\t\t<Add mover_name=\"min_mover\" />\n",
    "\t\t<Add mover_name=\"run_metrics2\" />\n",
    "\t</PROTOCOLS>\n",
    "</ROSETTASCRIPTS>\n",
    "\"\"\"\n",
    "\n",
    "\n",
    "xml = XmlObjects.create_from_string(min_L1)\n",
    "protocol = xml.get_mover(\"ParsedProtocol\")\n",
    "\n",
    "if not os.getenv(\"DEBUG\"):\n",
    "    protocol.apply(pose)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Constructing Rosetta objects using XMLObjects\n",
    "\n",
    "### Pulling from whole script\n",
    "\n",
    "Here we will use our previous XMLObject that we setup using our script to pull a specific component from it.  Note that while this is very useful for running pre-defined Rosetta objects, we will not have any tab completion for it as it will be a generic type - which means we will be unable to further modify it.  \n",
    "\n",
    "Lets grab the residue selector and then see which residues are L1."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "L1_sele = xml.get_residue_selector(\"L1\")\n",
    "L1_res = L1_sele.apply(pose)\n",
    "for i in range(1, len(L1_res)+1):\n",
    "    if L1_res[i]:\n",
    "        print(\"L1 Residue: \", pose.pdb_info().pose2pdb(i), \":\", i )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Constructing from single section\n",
    "\n",
    "Here, we instead of parsing a whole script, we'll simply create the same L1 selector from the string itself.  This can be used for nearly every Rosetta class type in the script.  The 'static' part in the name means that we do not have to construct the XMLObject first, we can simply call its function. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "L1_sele = XmlObjects.static_get_residue_selector('<CDR name=\"L1\" cdrs=\"L1\"/>')\n",
    "L1_res = L1_sele.apply(pose)\n",
    "for i in range(1, len(L1_res)+1):\n",
    "    if L1_res[i]:\n",
    "        print(\"L1 Residue: \", pose.pdb_info().pose2pdb(i), \":\", i )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Do these residues match what we had before?  Why do both of these seem a bit slower?  The actual residue selection is extremely quick, but validating the XML against a schema (which checks to make sure the string that you passed is valid and works) takes time. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And that's it!  That should be everything you need to know about RosettaScripts in PyRosetta.  Enjoy! \n",
    "\n",
    "For XMLObjects, each type has a corresponding function (with and without static), these are listed below, but tab completion will help you here.  As you have seen above, the static functions are called on the class type, `XmlObjects`, while the non-static objects are called on an instance of the class after parsing a script, in our example, it was called `xml`.\n",
    "\n",
    "```\n",
    ".get_score_function   / .static_get_score_function\n",
    "\n",
    ".get_residue_selector / .static_get_residue_selector\n",
    "\n",
    ".get_simple_metric    / .static_get_simple_metric\n",
    "\n",
    ".get_filter           / .static_get_filter\n",
    "\n",
    ".get_mover            / .static_get_mover\n",
    "\n",
    ".get_task_operation   / .static_get_task_operation\n",
    "\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<!--NAVIGATION-->\n",
    "< [Visualization with the `PyMOLMover`](http://nbviewer.jupyter.org/github/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/02.06-Visualization-and-PyMOL-Mover.ipynb) | [Contents](toc.ipynb) | [Index](index.ipynb) | [Visualization and `pyrosetta.distributed.viewer`](http://nbviewer.jupyter.org/github/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/02.08-Visualization-and-pyrosetta.distributed.viewer.ipynb) ><p><a href=\"https://colab.research.google.com/github/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/02.07-RosettaScripts-in-PyRosetta.ipynb\"><img align=\"left\" src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open in Colab\" title=\"Open in Google Colaboratory\"></a>"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.1"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": true,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": true
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}