Before you turn this problem in, make sure everything runs as expected. First, **restart the kernel** (in the menubar, select Kernel$\rightarrow$Restart) and then **run all cells** (in the menubar, select Cell$\rightarrow$Run All).

Make sure you fill in any place that says `YOUR CODE HERE` or "YOUR ANSWER HERE", as well as your name and collaborators below:

In [None]:
NAME = ""
COLLABORATORS = ""

---

<!--NOTEBOOK_HEADER-->
*This notebook contains material from [PyRosetta](https://RosettaCommons.github.io/PyRosetta.notebooks);
content is available [on Github](https://github.com/RosettaCommons/PyRosetta.notebooks.git).*

<!--NAVIGATION-->
< [PyRosettaCluster Tutorial 1B. Reproduce simple protocol](http://nbviewer.jupyter.org/github/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/16.07-PyRosettaCluster-Reproduce-simple-protocol.ipynb) | [Contents](toc.ipynb) | [Index](index.ipynb) | [PyRosettaCluster Tutorial 3. Multiple decoys](http://nbviewer.jupyter.org/github/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/16.09-PyRosettaCluster-Multiple-decoys.ipynb) ><p><a href="https://colab.research.google.com/github/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/16.08-PyRosettaCluster-Multiple-protocols.ipynb"><img align="left" src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open in Colab" title="Open in Google Colaboratory"></a>

# PyRosettaCluster Tutorial 2. Multiple protocols

PyRosettaCluster Tutorial 2 is an example of using multiple user-provided PyRosetta protocols with `PyRosettaCluster`. Unlike Rosetta's `MultiplePoseMover`, which executes multiple protocols serially, `PyRosettaCluster` executes multiple protocols in parallel (provided the cluster has more than one distributed worker). The user defines the order in which the protocols execute. Each `Pose` or `PackedPose` object returned from the first user-provided PyRosetta protocol is automatically passed to the second user-provided PyRosetta protocol, and so on. That is, `protocol1` returns a `Pose` object, which is then used as input for `protocol2`; `protocol2` returns a new `Pose` object, which is then used as input for `protocol3`, and so on. By default, only the `Pose` objects returned by the final protocol are written to disk; if the user specifies `PyRosettaCluster(..., save_all=True, ...)`, all intermediate decoys are also written to disk. Each decoy contains all of the relevant information needed to reproduce it.
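
The hand-off between protocols can be pictured with a toy sketch in plain Python. This is not how `PyRosettaCluster` is implemented: strings stand in for `Pose` objects, and `run_chain`, `repack`, and `design` are hypothetical names used only for illustration. It shows only the ordering rule, in which each protocol receives whatever the previous one returned.

```python
def repack(pose, **kwargs):
    # The first protocol in a chain has no upstream result, so fall back to the
    # input named by the "s" keyword argument (mirroring `protocol1` in this notebook).
    if pose is None:
        pose = kwargs["s"]
    return pose + " -> repack"


def design(pose, **kwargs):
    # Later protocols simply transform whatever they are handed.
    return pose + " -> design"


def run_chain(protocols, **kwargs):
    """Toy stand-in: call each protocol in order, feeding each result to the next."""
    result = None
    for protocol in protocols:
        result = protocol(result, **kwargs)
    return result


final = run_chain([repack, design, repack], s="1QYS.pdb")
print(final)
```

The same protocol may appear more than once in the list, as `repack` does here.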

*Warning:* This notebook uses `pyrosetta.distributed.viewer` code, which runs in `jupyter notebook` and might not run if you're using `jupyterlab`.

*Note:* This Jupyter notebook uses parallelization and is **not** meant to be executed within a Google Colab environment.

*Note:* This Jupyter notebook requires the PyRosetta distributed layer, which is obtained by building PyRosetta with the `--serialization` flag or by installing PyRosetta from the RosettaCommons conda channel.

**Please see Chapter 16.00 for setup instructions**

*Note:* This Jupyter notebook is intended to be run within **Jupyter Lab**, but may still be run as a standalone Jupyter notebook.

### 1. Import packages

In [None]:
import bz2
import glob
import json
import logging
import os
import pyrosetta
import pyrosetta.distributed.io as io
import pyrosetta.distributed.viewer as viewer

from pyrosetta.distributed.cluster import PyRosettaCluster

logging.basicConfig(level=logging.INFO)

### 2. Initialize a compute cluster using `dask`:

See Tutorial 1A to review:
1. Click the "Dask" tab in Jupyter Lab <i>(arrow, left)</i>
2. Click the "+ NEW" button to launch a new compute cluster <i>(arrow, lower)</i>
3. Once the cluster has started, click the brackets to "inject client code" for the cluster into your notebook

Inject client code here, then run the cell:

In [None]:
if not os.getenv("DEBUG"):
    from dask.distributed import Client

    client = Client("tcp://127.0.0.1:40329")
else:
    client = None
client

### 3. Define the user-provided PyRosetta protocols:

User-provided PyRosetta protocols may return `Pose` or `PackedPose` objects, which are passed on to the next protocol. A protocol is not required to return a `Pose` or `PackedPose` object; it may, for example, return `None`. In such cases, the subsequent protocol receives an empty `PackedPose` object.
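
As a minimal sketch of the second case, the hypothetical protocol below (`log_only_protocol` is an illustrative name, not part of PyRosetta) only records a message and returns `None`. Following the description above, a protocol placed after it would be handed an empty `PackedPose` object rather than the result of an earlier protocol.

```python
import logging


def log_only_protocol(packed_pose_in, **kwargs):
    """Hypothetical protocol that logs the task input and returns nothing."""
    # "s" is the input-file keyword argument used by the tasks in this notebook;
    # .get() avoids a KeyError if a task does not define it.
    logging.info("Task input file: %s", kwargs.get("s"))
    return None
```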

In [None]:
def protocol1(packed_pose_in, **kwargs):
    """
    Repacks the input `PackedPose` object, which can be (a) input to the function
    automatically via the 'packed_pose_in' argument or (b) accessed through the 's'
    `kwargs` keyword argument, depending on the order in which the protocol is
    specified in the PyRosettaCluster.distribute() method.
    
    Args:
        packed_pose_in: A `PackedPose` object to be repacked. Optional.
        **kwargs: PyRosettaCluster keyword arguments.

    Returns:
        A `PackedPose` object.
    """
    import pyrosetta
    import pyrosetta.distributed.io as io
    import pyrosetta.distributed.tasks.rosetta_scripts as rosetta_scripts
    
    logging.info(
        "Now executing protocol number '{0}' called '{1}'.".format(
            kwargs["PyRosettaCluster_protocol_number"],
            kwargs["PyRosettaCluster_protocol_name"]
        )
    )
    
    if packed_pose_in is None:
        logging.info("Generating `packed_pose_in` from `kwargs['s']`.")
        packed_pose_in = io.pose_from_file(kwargs["s"])
    else:
        logging.info("Using `packed_pose_in` from `args`.")
        
    xml = """
        <ROSETTASCRIPTS>
          <TASKOPERATIONS>
            <RestrictToRepacking name="restrict_to_repacking"/>
          </TASKOPERATIONS>
          <MOVERS>
            <PackRotamersMover name="pack" task_operations="restrict_to_repacking" />
          </MOVERS>
          <PROTOCOLS>
            <Add mover="pack"/>
          </PROTOCOLS>
        </ROSETTASCRIPTS>
        """
    
    return rosetta_scripts.SingleoutputRosettaScriptsTask(xml)(packed_pose_in.pose.clone())

def protocol2(packed_pose_in, **kwargs):
    """
    Performs sequence design (Thr24Ser) on an input pose.
    
    Args:
        packed_pose_in: A `PackedPose` object to be designed.
        **kwargs: PyRosettaCluster keyword arguments.

    Returns:
        A `PackedPose` object.
    """
    import pyrosetta
    import pyrosetta.distributed.tasks.rosetta_scripts as rosetta_scripts

    xml = """
        <ROSETTASCRIPTS>
          <RESIDUE_SELECTORS>
            <Index name="T24" resnums="24A"/>
            <Not name="not24" selector="T24"/>
          </RESIDUE_SELECTORS>
          <TASKOPERATIONS>
            <ResfileCommandOperation name="design" command="PIKAA S" residue_selector="T24"/>
            <OperateOnResidueSubset name="prevent_repacking" selector="not24">
              <PreventRepackingRLT/>
            </OperateOnResidueSubset>
          </TASKOPERATIONS>
          <MOVERS>
            <PackRotamersMover name="pack" task_operations="design,prevent_repacking"/>
          </MOVERS>
          <PROTOCOLS>
            <Add mover="pack"/>
          </PROTOCOLS>
        </ROSETTASCRIPTS>
        """

    return rosetta_scripts.SingleoutputRosettaScriptsTask(xml)(packed_pose_in.pose.clone())

### 4. Define the user-provided kwargs:

In [None]:
def create_tasks():
    yield {
        "options": "-ex1",
        "extra_options": "-out:level 300 -multithreading:total_threads 1",
        "set_logging_handler": "interactive",
        "s": os.path.join(os.getcwd(), "inputs", "1QYS.pdb"),
    }
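
Each dictionary yielded by `create_tasks` describes one task, and its entries reach the protocols as `kwargs` (for example `kwargs["s"]` in `protocol1`). The sketch below is a hypothetical variant, not part of this tutorial, that yields one task per input file. The helper name `create_tasks_for`, its `input_dir` parameter, and the second filename are assumptions made for illustration, and the sketch only builds the dictionaries without launching anything.

```python
import os


def create_tasks_for(filenames, input_dir="inputs"):
    """Hypothetical generator: yield one task dictionary per input file."""
    for filename in filenames:
        yield {
            # Same option strings as the single task defined in this notebook.
            "options": "-ex1",
            "extra_options": "-out:level 300 -multithreading:total_threads 1",
            "set_logging_handler": "interactive",
            "s": os.path.join(os.getcwd(), input_dir, filename),
        }


# "other.pdb" is a placeholder filename, not a file shipped with the tutorial.
tasks = list(create_tasks_for(["1QYS.pdb", "other.pdb"]))
for task in tasks:
    print(task["s"])
```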

### 5. Launch the original simulation using `distribute()`:

In [None]:
if not os.getenv("DEBUG"):
    output_path = os.path.join(os.getcwd(), "outputs_2")

    PyRosettaCluster(
        tasks=create_tasks,
        client=client,
        scratch_dir=output_path,
        output_path=output_path,
    ).distribute(protocols=[protocol1, protocol2, protocol1])

While jobs are running, you may monitor their progress using the dask dashboard diagnostics within Jupyter Lab!

### 6. Visualize the resultant decoy:

Gather the input and output decoys from disk into memory:

In [None]:
if not os.getenv("DEBUG"):
    input_file = os.path.join(os.getcwd(), "inputs", "1QYS.pdb")
    output_file = glob.glob(os.path.join(output_path, "decoys", "*", "*.pdb.bz2"))[0]

    packed_poses = []
    for pdbfile in [input_file, output_file]:
        if pdbfile.endswith(".bz2"):
            with open(pdbfile, "rb") as f:
                packed_poses.append(io.pose_from_pdbstring(bz2.decompress(f.read()).decode()))
        elif pdbfile.endswith(".pdb"):
            with open(pdbfile, "r") as f:
                packed_poses.append(io.pose_from_pdbstring(f.read()))
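
The two branches above differ only in how the file text is read. One possible way to factor that out is sketched below using the standard-library `bz2.open` in text mode; `read_pdb_text` is a hypothetical helper name, and the round trip uses throwaway files rather than real decoys. The returned string could then be passed to `io.pose_from_pdbstring` as in the cell above.

```python
import bz2
import os
import tempfile


def read_pdb_text(path):
    """Return the text of a plain `.pdb` file or a bz2-compressed `.pdb.bz2` file."""
    if path.endswith(".bz2"):
        # Text mode ("rt") decompresses and decodes in one step.
        with bz2.open(path, "rt") as f:
            return f.read()
    with open(path, "r") as f:
        return f.read()


# Round-trip demonstration with throwaway files (not real decoys).
record = "REMARK example\n"
with tempfile.TemporaryDirectory() as tmp:
    plain_path = os.path.join(tmp, "example.pdb")
    packed_path = os.path.join(tmp, "example.pdb.bz2")
    with open(plain_path, "w") as f:
        f.write(record)
    with open(packed_path, "wb") as f:
        f.write(bz2.compress(record.encode()))
    plain_text = read_pdb_text(plain_path)
    packed_text = read_pdb_text(packed_path)
```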

The original Top7 (PDB ID: 1QYS) decoy and the designed Top7 decoy, with the T24S mutation highlighted, are shown below using the `pyrosetta.distributed.viewer` visualizer:

In [None]:
if not os.getenv("DEBUG"):
    resi_24 = pyrosetta.rosetta.core.select.residue_selector.ResidueIndexSelector("24A")

    view = viewer.init(packed_poses, window_size=(800, 600))
    view.add(viewer.setStyle())
    view.add(viewer.setStyle(colorscheme="whiteCarbon", radius=0.25))
    view.add(viewer.setStyle(residue_selector=resi_24, colorscheme="magentaCarbon", radius=0.5))
    view.add(viewer.setHydrogenBonds())
    view.add(viewer.setHydrogens(polar_only=True))
    view.add(viewer.setDisulfides(radius=0.25))
    view()

### Congrats! 
You have successfully run `PyRosettaCluster` with multiple user-provided PyRosetta protocols!
