<!--NOTEBOOK_HEADER-->
*This notebook contains material from [PyRosetta](https://RosettaCommons.github.io/PyRosetta.notebooks);
content is available [on Github](https://github.com/RosettaCommons/PyRosetta.notebooks.git).*


# *De Novo* Protein Design with PyRosetta

## Overview

**Make sure you are in the directory with the `.pdb` files:**

`cd google_drive/My\ Drive/student-notebooks/`

*Warning*: This notebook uses `pyrosetta.distributed.viewer` code, which runs in `jupyter notebook` and might not run if you're using `jupyterlab`.

*Note:* This Jupyter notebook requires the PyRosetta distributed layer, which is obtained by building PyRosetta with the `--serialization` flag or by installing PyRosetta from the RosettaCommons conda channel (for more information, visit: http://www.pyrosetta.org/dow).

In [1]:
import Bio.Data.IUPACData as IUPACData
import Bio.SeqUtils
import logging
logging.basicConfig(level=logging.INFO)
import os
import pyrosetta
import pyrosetta.distributed
import pyrosetta.distributed.viewer as viewer
import site
import sys

# Notebook setup
if 'google.colab' in sys.modules:
    !pip install pyrosettacolabsetup
    import pyrosettacolabsetup
    pyrosettacolabsetup.setup()
    print ("Notebook is set for PyRosetta use in Colab.  Have fun!")

Initialize PyRosetta:

In [2]:
flags = """
-linmem_ig 10
-ignore_unrecognized_res 1
-mute core.select.residue_selector.SecondaryStructureSelector
-mute core.select.residue_selector.PrimarySequenceNeighborhoodSelector
-mute protocols.DsspMover
"""
pyrosetta.distributed.init(flags)

INFO:pyrosetta.rosetta:Found rosetta database at: /Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pyrosetta-2019.33+release.1e60c63beb5-py3.6-macosx-10.6-intel.egg/pyrosetta/database; using it....
INFO:pyrosetta.rosetta:PyRosetta-4 2019 [Rosetta PyRosetta4.Release.python36.mac 2019.33+release.1e60c63beb532fd475f0f704d68d462b8af2a977 2019-08-09T15:19:57] retrieved from: http://www.pyrosetta.org
(C) Copyright Rosetta Commons Member Institutions. Created in JHU by Sergey Lyskov and PyRosetta Team.
INFO:rosetta:core.init: Checking for fconfig files in pwd and ./rosetta/flags
INFO:rosetta:core.init: Reading fconfig.../Users/jadolfbr/.rosetta/flags/common
INFO:rosetta:core.init: 
INFO:rosetta:core.init: 
INFO:rosetta:core.init: Rosetta version: PyRosetta4.Release.python36.mac r230 2019.33+release.1e60c63beb5 1e60c63beb532fd475f0f704d68d462b8af2a977 http://www.pyrosetta.org 2019-08-09T15:19:57
INFO:rosetta:core.init: [



Let's set up the pose of a *de novo* helical bundle from the PDB for downstream design: https://www.rcsb.org/structure/5J0J

In [3]:
start_pose = pyrosetta.io.pose_from_file("inputs/5J0J.clean.pdb")
pose = start_pose.clone()

INFO:rosetta:core.chemical.GlobalResidueTypeSet: Finished initializing fa_standard residue type set.  Created 980 residue types
INFO:rosetta:core.chemical.GlobalResidueTypeSet: Total time to initialize 0.975747 seconds.
INFO:rosetta:core.import_pose.import_pose: File '5J0J.clean.pdb' automatically determined to be of type PDB
INFO:rosetta:core.io.pdb.pdb_reader: Parsing 0 .pdb records with unknown format to search for Rosetta-specific comments.


## Design Strategy 

Minimize the crystal structure coordinates, convert chain "A" to poly-alanine, and then perform one-sided protein-protein interface design: design chain A while only repacking the homotrimer interface residues of chains B and C. In effect, we are redesigning a homotrimer into a heterotrimer. Backbone torsions are held fixed; using the `FastDesign` mover, only the side-chains of the homotrimer interface and of all of chain A are minimized. After design, repack and minimize all side-chains.

Before and after design, we want to relax the protein with a scorefunction that packs the rotamers and minimizes the side-chain degrees of freedom while optimizing for realistic energies. If you allow backbone minimization (which you may optionally choose to do), you would want to use a Cartesian scorefunction (in this case, `ref2015_cart.wts`), which gives the `cart_bonded` scoreterm a nonzero weight and thereby helps to close chain breaks in the backbone. Similarly, you might also want to turn on the `coordinate_constraint` scoreterm to penalize deviations of the backbone coordinates from their initial coordinates during minimization. In this tutorial, we demonstrate that concept but prevent backbone torsions from being minimized (however, feel free to turn on backbone minimization!):

In [4]:
relax_scorefxn = pyrosetta.create_score_function("ref2015_cart.wts")
relax_scorefxn.set_weight(pyrosetta.rosetta.core.scoring.ScoreType.coordinate_constraint, 1.0)
print("The starting pose total_score is {}".format(relax_scorefxn(start_pose)))

INFO:rosetta:core.scoring.etable: Starting energy table calculation
INFO:rosetta:core.scoring.etable: smooth_etable: changing atr/rep split to bottom of energy well
INFO:rosetta:core.scoring.etable: smooth_etable: spline smoothing lj etables (maxdis = 6)
INFO:rosetta:core.scoring.etable: smooth_etable: spline smoothing solvation etables (max_dis = 6)
INFO:rosetta:core.scoring.etable: Finished calculating energy tables.
INFO:rosetta:basic.io.database: Database file opened: scoring/score_functions/hbonds/ref2015_params/HBPoly1D.csv
INFO:rosetta:basic.io.database: Database file opened: scoring/score_functions/hbonds/ref2015_params/HBFadeIntervals.csv
INFO:rosetta:basic.io.database: Database file opened: scoring/score_functions/hbonds/ref2015_params/HBEval.csv
INFO:rosetta:basic.io.database: Database file opened: scoring/score_functions/hbonds/ref2015_params/DonStrength.csv
INFO:rosetta:basic.io.database: Datab

The starting pose total_score is 2113.5810054254503


For *design*, if we allowed backbone minimization, we would again turn on the `coordinate_constraint` scoreterm to penalize deviations of the backbone coordinates from their initial coordinates during minimization. Additionally, we will use "non-pairwise decomposable" scoreterms to guide the packer trajectories (i.e. fixed-backbone sequence design trajectories) toward our hypothetical design requirements. In this tutorial, we will make use of the following additional scoreterms during design:
- `aa_composition`: This scoring term is intended for use during design, to penalize deviations from a desired residue type composition. It applies to any amino acid composition requirements that the user specifies within a ResidueSelector. This scoreterm is also used by the `AddHelixSequenceConstraints` mover, which sets up ideal sequence constraints for each helix in a pose or in a selection.
- `voids_penalty`: This scoring term is intended for use during design, to penalize buried voids or cavities and to guide the packer to design solutions in which all buried volume is filled with side-chains.
- `aa_repeat`: This wholebody scoring term is intended for use during design, to penalize long stretches in which the same residue type repeats over and over (e.g. poly-Q sequences). 
- `buried_unsatisfied_penalty`: This scoring term is intended for use during design, to provide a penalty for buried hydrogen bond donors or acceptors that are unsatisfied. 
- `netcharge`: This scoring term is intended for use during design, to penalize deviations from a desired net charge in a pose or in a selection.
- `hbnet`: This scoring term is intended for use during design, to provide a bonus for hydrogen bond network formation.

In [5]:
design_scorefxn = pyrosetta.create_score_function("ref2015_cart.wts")
design_scorefxn.set_weight(pyrosetta.rosetta.core.scoring.ScoreType.coordinate_constraint, 10.0)
design_scorefxn.set_weight(pyrosetta.rosetta.core.scoring.ScoreType.aa_composition, 1.0)
design_scorefxn.set_weight(pyrosetta.rosetta.core.scoring.ScoreType.voids_penalty, 0.25)
design_scorefxn.set_weight(pyrosetta.rosetta.core.scoring.ScoreType.aa_repeat, 1.0)
design_scorefxn.set_weight(pyrosetta.rosetta.core.scoring.ScoreType.buried_unsatisfied_penalty, 1.0)
design_scorefxn.set_weight(pyrosetta.rosetta.core.scoring.ScoreType.hbnet, 1.0)
design_scorefxn.set_weight(pyrosetta.rosetta.core.scoring.ScoreType.netcharge, 1.0)
print("The starting pose total_score is {}".format(design_scorefxn(start_pose)))

INFO:rosetta:basic.io.database: Database file opened: scoring/score_functions/aa_repeat_energy/default_repeat_penalty_table.rpt_pen


The starting pose total_score is 23016.58100542545


 By using the `relax_scorefxn` before and after the `design_scorefxn`, we ensure that these "non-pairwise decomposable" scoreterms are not forcing unrealistic rotamers that would otherwise not be held in place without these additional scoreterms.

 Prior to any deviation from crystal structure coordinates, apply coordinate constraints to all of the backbone heavy atoms:

In [6]:
true_selector = pyrosetta.rosetta.core.select.residue_selector.TrueResidueSelector() # Select all residues

# Apply a virtual root onto the pose to prevent large lever-arm effects while minimizing with coordinate constraints
virtual_root = pyrosetta.rosetta.protocols.simple_moves.VirtualRootMover()
virtual_root.set_removable(True)
virtual_root.set_remove(False)
virtual_root.apply(pose)

# Construct the CoordinateConstraintGenerator
coord_constraint_gen = pyrosetta.rosetta.protocols.constraint_generator.CoordinateConstraintGenerator()
coord_constraint_gen.set_id("contrain_all_backbone_atoms!")
coord_constraint_gen.set_ambiguous_hnq(False)
coord_constraint_gen.set_bounded(False)
coord_constraint_gen.set_sidechain(False)
coord_constraint_gen.set_sd(1.0) # Sets the standard deviation of the constrained atoms to (an arbitrary) 1.0 Angstrom. Set higher or lower for different results.
coord_constraint_gen.set_ca_only(False)
coord_constraint_gen.set_residue_selector(true_selector)

# Apply the CoordinateConstraintGenerator using the AddConstraints mover
add_constraints = pyrosetta.rosetta.protocols.constraint_generator.AddConstraints()
add_constraints.add_generator(coord_constraint_gen)
add_constraints.apply(pose)

INFO:rosetta:protocols.constraint_generator.AddConstraints: Adding 843 constraints generated by ConstraintGenerator named contrain_all_backbone_atoms!
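
To build intuition for what `set_sd(1.0)` and the `coordinate_constraint` weight do, here is a minimal standalone sketch in plain Python (no PyRosetta required). It assumes that an unbounded coordinate constraint applies a harmonic penalty of the form `(d / sd)**2`, where `d` is the distance of an atom from its starting position; the function name `harmonic_penalty` is hypothetical and is not part of the PyRosetta API.

```python
import math

def harmonic_penalty(start_xyz, current_xyz, sd=1.0, weight=1.0):
    """Assumed harmonic form: weight * (distance / sd)**2."""
    d = math.dist(start_xyz, current_xyz)
    return weight * (d / sd) ** 2

start = (0.0, 0.0, 0.0)
for shift in (0.0, 0.5, 1.0, 2.0):
    moved = (shift, 0.0, 0.0)
    relax_like = harmonic_penalty(start, moved, sd=1.0, weight=1.0)    # weight used in relax_scorefxn
    design_like = harmonic_penalty(start, moved, sd=1.0, weight=10.0)  # weight used in design_scorefxn
    print(shift, relax_like, design_like)
```

Under this assumption, a smaller `sd` or a larger scoreterm weight both make the same displacement more expensive, which is why the design scorefunction (weight 10.0) would hold the backbone more tightly than the relax scorefunction (weight 1.0) if backbone minimization were enabled.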


 Prior to design, minimize with the `FastRelax` mover to optimize the pose within the `relax_scorefxn` scorefunction. Note: this takes ~1min 10s

In [7]:
tf = pyrosetta.rosetta.core.pack.task.TaskFactory()
tf.push_back(pyrosetta.rosetta.core.pack.task.operation.InitializeFromCommandline())
tf.push_back(pyrosetta.rosetta.core.pack.task.operation.IncludeCurrent())
tf.push_back(pyrosetta.rosetta.core.pack.task.operation.NoRepackDisulfides())
tf.push_back(pyrosetta.rosetta.core.pack.task.operation.OperateOnResidueSubset(
    pyrosetta.rosetta.core.pack.task.operation.PreventRepackingRLT(), true_selector)) # Set to RestrictToRepackingRLT for slower/better results
mm = pyrosetta.rosetta.core.kinematics.MoveMap()
mm.set_bb(False) # Set to true if desired
mm.set_chi(True)
mm.set_jump(False)
fast_relax = pyrosetta.rosetta.protocols.relax.FastRelax(scorefxn_in=relax_scorefxn, standard_repeats=1) # 2-5 repeats suggested for real applications
fast_relax.cartesian(True)
fast_relax.set_task_factory(tf)
fast_relax.set_movemap(mm)
fast_relax.minimize_bond_angles(True)
fast_relax.minimize_bond_lengths(True)
fast_relax.min_type("lbfgs_armijo_nonmonotone")
fast_relax.ramp_down_constraints(False)

if not os.getenv("DEBUG"):
    %time fast_relax.apply(pose)

# Optionally, instead of running this you could reload the saved pose from a previously run trajectory:
#pose = pyrosetta.pose_from_file("outputs/minimized_start_pose.pdb")

INFO:rosetta:protocols.relax.RelaxScriptManager: Reading relax scripts list from database.
INFO:rosetta:protocols.relax.RelaxScriptManager: Looking for default.txt
INFO:rosetta:protocols.relax.RelaxScriptManager: repeat %%nrepeats%%
INFO:rosetta:protocols.relax.RelaxScriptManager: ramp_repack_min 0.02  0.01     1.0
INFO:rosetta:protocols.relax.RelaxScriptManager: ramp_repack_min 0.250 0.01     0.5
INFO:rosetta:protocols.relax.RelaxScriptManager: ramp_repack_min 0.550 0.01     0.0
INFO:rosetta:protocols.relax.RelaxScriptManager: ramp_repack_min 1     0.00001  0.0
INFO:rosetta:protocols.relax.RelaxScriptManager: accept_to_best
INFO:rosetta:protocols.relax.RelaxScriptManager: endrepeat
INFO:rosetta:core.scoring.CartesianBondedEnergy: Creating new peptide-bonded energy container (211)
INFO:rosetta:core.scoring.CartesianBondedEnergy: Adding undefined angle VRT: X,ORIG,Y to DB with theta0 = 1.5708 , Kthet

CPU times: user 59.9 s, sys: 445 ms, total: 1min
Wall time: 1min


 Let's check the delta `total_score` per residue after minimizing in the `relax_scorefxn` scorefunction:

In [8]:
if not os.getenv("DEBUG"):
    initial_score_res = relax_scorefxn(start_pose)/start_pose.size()
    final_score_res = relax_scorefxn(pose)/pose.size()
    delta_total_score_res = final_score_res - initial_score_res
    print("{0} kcal/(mol*res) - {1} kcal/(mol*res) = {2} kcal/(mol*res)".format(final_score_res, initial_score_res, delta_total_score_res))

2.4514614619029005 kcal/(mol*res) - 10.064671454406906 kcal/(mol*res) = -7.6132099925040055 kcal/(mol*res)


 We can see that the crystal structure coordinates were not quite optimal according to the `relax_scorefxn` scorefunction. So which model is correct?

By how many Angstroms RMSD did the backbone Cα atoms move?

In [9]:
### BEGIN SOLUTION
pyrosetta.rosetta.core.scoring.CA_rmsd(start_pose, pose)
### END SOLUTION

0.0

 For downstream analysis, we want to save the pose:

In [10]:
minimized_start_pose = pose.clone()
pyrosetta.dump_pdb(minimized_start_pose, "outputs/minimized_start_pose.pdb")

True

## *De Novo* Protein Design

Prior to designing chain A, first let's make chain A poly-alanine so that we can re-design the sidechains onto the backbone:

In [11]:
# Since we will do direct pose manipulation, first remove the constraints
remove_constraints = pyrosetta.rosetta.protocols.constraint_generator.RemoveConstraints()
remove_constraints.add_generator(coord_constraint_gen)
remove_constraints.apply(pose)

INFO:rosetta:protocols.constraint_generator.RemoveConstraints: Removing 843 constraints from pose generated by contrain_all_backbone_atoms!


 Obtain chain A and convert it to poly-alanine

In [12]:
keep_chA = pyrosetta.rosetta.protocols.grafting.simple_movers.KeepRegionMover(res_start=str(pose.chain_begin(1)), res_end=str(pose.chain_end(1)))
keep_chA.apply(pose)
polyA_chA = pyrosetta.rosetta.protocols.pose_creation.MakePolyXMover(aa="ALA", keep_pro=0, keep_gly=0, keep_disulfide_cys=0)
polyA_chA.apply(pose)

INFO:rosetta:protocols.grafting.util: Returning 70 residues from 1 to 70
INFO:rosetta:protocols.pose_creation.MakePolyXMover: Pose is converted to poly ALA


 Obtain chains B and C

In [13]:
pose_chBC = minimized_start_pose.clone()
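# The "-1" drops the last residue of chain 3, which here is the virtual root added earlier (the log below reports residues 71 to 210)
keep_chBC = pyrosetta.rosetta.protocols.grafting.simple_movers.KeepRegionMover(res_start=str(pose_chBC.chain_begin(2)), res_end=str(pose_chBC.chain_end(3)-1))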
keep_chBC.apply(pose_chBC)

INFO:rosetta:protocols.grafting.util: Returning 140 residues from 71 to 210


 Append chains B and C onto the poly-alanine version of chain A

In [14]:
pyrosetta.rosetta.core.pose.append_pose_to_pose(pose1=pose, pose2=pose_chBC, new_chain=True)

 Pose is now considered to have only 2 chains.

In [15]:
pose.num_chains()

2

 Let's re-establish that there are 3 chains.

In [16]:
switch_chains = pyrosetta.rosetta.protocols.simple_moves.SwitchChainOrderMover()
switch_chains.chain_order("12")
switch_chains.apply(pose)

print(pose.pdb_info())
print("Now the number of chains = {}".format(pose.num_chains()))

INFO:rosetta:core.scoring.ScoreFunctionFactory: SCOREFUNCTION: ref2015
INFO:rosetta:protocols.simple_moves.SwitchChainOrderMover: Number of chains in pose: 2
INFO:rosetta:protocols.simple_moves.SwitchChainOrderMover: Now at chain: 1
INFO:rosetta:protocols.simple_moves.SwitchChainOrderMover: Now at chain: 2
INFO:rosetta:protocols.simple_moves.SwitchChainOrderMover: New pose's foldtree FOLD_TREE  EDGE 1 70 -1  EDGE 1 71 1  EDGE 71 210 -1


PDB file name: 
 Pose Range  Chain    PDB Range  |   #Residues         #Atoms

0001 -- 0070    A 0001  -- 0070  |   0070 residues;    00703 atoms
0071 -- 0139    B 0071  -- 0139  |   0069 residues;    01187 atoms
0140 -- 0210    C 0140  -- 0210  |   0071 residues;    01223 atoms
                           TOTAL |   0210 residues;    03113 atoms

Now the number of chains = 3


 Re-apply backbone coordinate constraints:

In [17]:
virtual_root.apply(pose)
add_constraints.apply(pose)

INFO:rosetta:protocols.constraint_generator.AddConstraints: Adding 843 constraints generated by ConstraintGenerator named contrain_all_backbone_atoms!


Have a look at the new pose: chain A is now ready to be designed!

Next, we need to apply certain movers that encode our design specifications. These activate the corresponding non-pairwise decomposable scoreterms in the `design_scorefxn` scorefunction, which take effect when we run the `FastDesign` mover (or any downstream mover that calls the packer).

 We will frequently be using the following residue selectors:

In [18]:
chain_A = pyrosetta.rosetta.core.select.residue_selector.ChainSelector("A")
chain_BC = pyrosetta.rosetta.core.select.residue_selector.NotResidueSelector(chain_A)

Apply the `AddCompositionConstraintMover` mover, which utilizes the `aa_composition` scoreterm. Applying the following mover to the pose imposes the design constraints that the residues selected by the ResidueSelector be 40% aliphatic or aromatic residues other than leucine (i.e. ALA, PHE, ILE, MET, PRO, VAL, TRP, or TYR) and 5% leucine. For documentation, see: https://www.rosettacommons.org/docs/latest/rosetta_basics/scoring/AACompositionEnergy

In [19]:
add_composition_constraint = pyrosetta.rosetta.protocols.aa_composition.AddCompositionConstraintMover()
add_composition_constraint.create_constraint_from_file_contents("""
PENALTY_DEFINITION
OR_PROPERTIES AROMATIC ALIPHATIC
NOT_TYPE LEU
FRACT_DELTA_START -0.05
FRACT_DELTA_END 0.05
PENALTIES 1 0 1 # The above two lines mean that if we're 5% below or 5% above the desired content, we get a 1-point penalty.
FRACTION 0.4 # Forty percent aromatic or aliphatic, but not leucine
BEFORE_FUNCTION CONSTANT
AFTER_FUNCTION CONSTANT
END_PENALTY_DEFINITION

PENALTY_DEFINITION
TYPE LEU
DELTA_START -1
DELTA_END 1
PENALTIES 1 0 1
FRACTION 0.05 # Five percent leucine
BEFORE_FUNCTION CONSTANT
AFTER_FUNCTION CONSTANT
END_PENALTY_DEFINITION
""")
add_composition_constraint.add_residue_selector(chain_A)
add_composition_constraint.apply(pose)

INFO:rosetta:protocols.aa_composition.AddCompositionConstraintMover: Initialized AACompositionConstraint object from file contents:

PENALTY_DEFINITION
OR_PROPERTIES AROMATIC ALIPHATIC
NOT_TYPE LEU
FRACT_DELTA_START -0.05
FRACT_DELTA_END 0.05
PENALTIES 1 0 1 # The above two lines mean that if we're 5% below or 5% above the desired content, we get a 1-point penalty.
FRACTION 0.4 # Forty percent aromatic or aliphatic, but not leucine
BEFORE_FUNCTION CONSTANT
AFTER_FUNCTION CONSTANT
END_PENALTY_DEFINITION

PENALTY_DEFINITION
TYPE LEU
DELTA_START -1
DELTA_END 1
PENALTIES 1 0 1
FRACTION 0.05 # Five percent leucine
BEFORE_FUNCTION CONSTANT
AFTER_FUNCTION CONSTANT
END_PENALTY_DEFINITION
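
As a quick sanity check on what the two penalty definitions are measuring, the following plain-Python sketch computes the observed fractions for a sequence. It is only an illustration: the residue set is taken from the list given in the text above, the helper name `composition_fractions` and the example sequence are hypothetical, and the actual penalty values are computed by Rosetta's `aa_composition` scoreterm, not by this code.

```python
# Residue types named in the text as aliphatic or aromatic, excluding leucine.
AROMATIC_OR_ALIPHATIC_NOT_LEU = set("AFIMPVWY")

# Target fractions from the two PENALTY_DEFINITION blocks.
TARGETS = {"aromatic_or_aliphatic_not_leu": 0.40, "leu": 0.05}

def composition_fractions(sequence):
    """Return the observed fraction of each constrained category."""
    n = len(sequence)
    in_category = sum(aa in AROMATIC_OR_ALIPHATIC_NOT_LEU for aa in sequence)
    return {
        "aromatic_or_aliphatic_not_leu": in_category / n,
        "leu": sequence.count("L") / n,
    }

example = "ALAVKEELLK"  # hypothetical sequence, for illustration only
for category, observed in composition_fractions(example).items():
    deviation = observed - TARGETS[category]
    print(category, round(observed, 3), round(deviation, 3))
```

A sequence whose deviations are close to zero in both categories is the kind of solution the packer is being steered toward for chain A.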


 Apply the `AddHelixSequenceConstraints` mover, which utilizes the `aa_composition` scoreterm. By default, this mover adds five types of sequence constraints to the designable residues in each alpha helix in the pose. Any of these behaviours may be disabled or modified by invoking advanced options, but no advanced options need be set in most cases. The five types of sequence constraints are:

- A strong sequence constraint requiring at least two negatively-charged residues in the first (N-terminal) three residues of each alpha-helix.

- A strong sequence constraint requiring at least two positively-charged residues in the last (C-terminal) three residues of each alpha-helix.

- A weak but strongly ramping sequence constraint penalizing helix-disfavoring residue types (by default, Asn, Asp, Ser, Gly, Thr, and Val) throughout each helix. (A single such residue is sometimes tolerated, but the penalty for having more than one residue in this category increases quadratically with the count of helix-disfavouring residues.)

- A weak sequence constraint coaxing the helix to have 10% alanine. Because this constraint is weak, deviations from this value are tolerated, but this should prevent an excessive abundance of alanine residues.

- A weak sequence constraint coaxing the helix to have at least 25% hydrophobic content. This constraint is also weak, so slightly less hydrophobic helices will be tolerated to some degree. Note that alanine is not considered to be "hydrophobic" within Rosetta. 

For documentation, see: https://www.rosettacommons.org/docs/latest/rosetta_basics/scoring/AACompositionEnergy
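
The five default helix constraints listed above can be paraphrased as simple counts over a helix sequence. The sketch below is a plain-Python illustration, not the mover's implementation: the helper name `helix_sequence_report` and the example helix are hypothetical, and the residue sets for "negatively charged" (Asp, Glu), "positively charged" (Lys, Arg), and "hydrophobic" (Phe, Ile, Leu, Met, Val, Trp, Tyr) are assumptions made for this example.

```python
NEGATIVE = set("DE")                # assumed negatively-charged types
POSITIVE = set("KR")                # assumed positively-charged types
HELIX_DISFAVORING = set("NDSGTV")   # default list quoted in the text
HYDROPHOBIC = set("FILMVWY")        # assumption; alanine deliberately excluded

def helix_sequence_report(helix_seq):
    """Summarize the quantities targeted by the five default constraints."""
    n = len(helix_seq)
    return {
        "nterm_negative": sum(aa in NEGATIVE for aa in helix_seq[:3]),   # want >= 2
        "cterm_positive": sum(aa in POSITIVE for aa in helix_seq[-3:]),  # want >= 2
        "disfavoring_count": sum(aa in HELIX_DISFAVORING for aa in helix_seq),
        "ala_fraction": helix_seq.count("A") / n,                        # want ~0.10
        "hydrophobic_fraction": sum(aa in HYDROPHOBIC for aa in helix_seq) / n,  # want >= 0.25
    }

example_helix = "DEELLKKALELAKKLLEKGR"  # hypothetical 20-residue helix
for name, value in helix_sequence_report(example_helix).items():
    print(name, value)
```

Running a designed helix through a check like this after design is one way to see whether the weak constraints (alanine and hydrophobic content) were actually satisfied or merely tolerated.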

In [20]:
add_helix_sequence_constraints = pyrosetta.rosetta.protocols.aa_composition.AddHelixSequenceConstraintsMover()
add_helix_sequence_constraints.set_residue_selector(chain_A)
add_helix_sequence_constraints.apply(pose)

INFO:rosetta:protocols.aa_composition.AddHelixSequenceConstraintsMover: Applying sequence constraints for helix 1, running from residue 2 through residue 32.
INFO:rosetta:protocols.aa_composition.AddHelixSequenceConstraintsMover: 	Adding N-terminal sequence constraints to helix.
INFO:rosetta:protocols.aa_composition.AddHelixSequenceConstraintsMover: 	Adding C-terminal sequence constraints to helix.
INFO:rosetta:protocols.aa_composition.AddHelixSequenceConstraintsMover: 	Adding sequence constraints to helix penalizing helix-disfavouring residue types.
INFO:rosetta:protocols.aa_composition.AddHelixSequenceConstraintsMover: 	Adding sequence constraints to helix to control fractional alanine content.
INFO:rosetta:protocols.aa_composition.AddHelixSequenceConstraintsMover: 	Adding sequence constraints to helix to enforce a minimum hydrophobic content.
INFO:rosetta:protocols.aa_composition.AddHelixSequenceConstraintsMover: Applying seque

Note: the `aa_repeat` scoreterm works out-of-the-box and does not need to be applied to the pose to work; it just needs to have a weight of >0 in the scorefunction used by the packer. It imposes a penalty for each stretch of repeating amino acids, with the penalty value depending nonlinearly on the length of the repeating stretch. By default, 1- or 2-residue stretches incur no penalty, 3-residue stretches incur a penalty of +1, 4-residue stretches incur a penalty of +10, and 5-residue stretches or longer incur a penalty of +100. Since the term is sequence-based, it is really only useful for design: it imposes an identical penalty for a fixed-sequence pose, regardless of its conformation. This also means that the term has no conformational derivatives, so the minimizer ignores it completely. The term is not pairwise-decomposable, but has been made packer-compatible, so it can direct the sequence composition during a packer run. For documentation, see: https://www.rosettacommons.org/docs/latest/rosetta_basics/scoring/Repeat-stretch-energy
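
The default penalty table described above is easy to reproduce in plain Python. The sketch below is only an illustration of that table applied to a one-letter sequence; the function names are hypothetical, and the result is the unweighted penalty (the scorefunction weight for `aa_repeat` would multiply it).

```python
from itertools import groupby

def repeat_penalty(stretch_length):
    """Default penalties quoted in the text: 0, 0, 1, 10, then 100."""
    if stretch_length <= 2:
        return 0
    if stretch_length == 3:
        return 1
    if stretch_length == 4:
        return 10
    return 100

def aa_repeat_penalty(sequence):
    """Sum the penalty over every run of identical residues."""
    return sum(repeat_penalty(len(list(run))) for _, run in groupby(sequence))

print(aa_repeat_penalty("ACDEFG"))      # no repeats
print(aa_repeat_penalty("AAAKKQQQQQ"))  # AAA, KK, QQQQQ
```

Because the penalty jumps by an order of magnitude with each additional repeated residue, even a modest weight is enough to discourage the packer from producing long homopolymeric stretches.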

 Similarly, the `voids_penalty` scoreterm does not need to be applied to the pose to work, it just needs to have a weight of >0 in the scorefunction used by the packer. For documentation, see: https://www.rosettacommons.org/docs/latest/rosetta_basics/scoring/VoidsPenaltyEnergy

 Similarly, the `buried_unsatisfied_penalty` scoreterm does not need to be applied to the pose to work, it just needs to have a weight of >0 in the scorefunction used by the packer. For documentation, see: https://www.rosettacommons.org/docs/latest/rosetta_basics/scoring/BuriedUnsatPenalty

 Similarly, the `hbnet` scoreterm does not need to be applied to the pose to work, it just needs to have a weight of >0 (ideally between 1.0 to 10.0) in the scorefunction used by the packer. For documentation, see: https://www.rosettacommons.org/docs/latest/rosetta_basics/scoring/HBNetEnergy

 Apply the `AddNetChargeConstraintMover` mover, which utilizes the `netcharge` scoreterm. In this case, we require that the net charge in chain A must be exactly 0.

In [21]:
add_net_charge_constraint = pyrosetta.rosetta.protocols.aa_composition.AddNetChargeConstraintMover()
add_net_charge_constraint.create_constraint_from_file_contents("""
DESIRED_CHARGE 0 #Desired net charge is zero.
PENALTIES_CHARGE_RANGE -1 1 #Penalties are listed in the observed net charge range of -1 to +1.
PENALTIES 1 0 1 #The penalties are 1 for an observed charge of -1, 0 for an observed charge of 0, and 1 for an observed charge of +1.
BEFORE_FUNCTION QUADRATIC #Ramp quadratically for observed net charges of -2 or less.
AFTER_FUNCTION QUADRATIC #Ramp quadratically for observed net charges of +2 or greater.
""")
add_net_charge_constraint.add_residue_selector(chain_A)
add_net_charge_constraint.apply(pose)

INFO:rosetta:protocols.aa_composition.AddNetChargeConstraintMover: Initialized NetChargeConstraint object from file contents:

DESIRED_CHARGE 0 #Desired net charge is zero.
PENALTIES_CHARGE_RANGE -1 1 #Penalties are listed in the observed net charge range of -1 to +1.
PENALTIES 1 0 1 #The penalties are 1 for an observed charge of -1, 0 for an observed charge of 0, and 1 for an observed charge of +1.
BEFORE_FUNCTION QUADRATIC #Ramp quadratically for observed net charges of -2 or less.
AFTER_FUNCTION QUADRATIC #Ramp quadratically for observed net charges of +2 or greater.
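
To make the constraint definition concrete, the sketch below counts the net charge of a one-letter sequence and looks up the tabulated penalty for observed charges inside the listed range. This is a plain-Python illustration under stated assumptions: Asp and Glu count as -1, Lys and Arg as +1, and every other residue (including His) as neutral; the helper names are hypothetical. The quadratic extrapolation that Rosetta applies outside the range is not reproduced here, so the lookup returns `None` in that case.

```python
# Assumed formal charges per residue type; everything else is treated as neutral.
CHARGE = {"D": -1, "E": -1, "K": 1, "R": 1}

def net_charge(sequence):
    return sum(CHARGE.get(aa, 0) for aa in sequence)

def tabulated_penalty(observed_charge, range_start=-1, penalties=(1, 0, 1)):
    """Penalty for an observed net charge within PENALTIES_CHARGE_RANGE.

    Returns None outside the tabulated range, where Rosetta would apply
    the BEFORE_FUNCTION / AFTER_FUNCTION extrapolation instead.
    """
    index = observed_charge - range_start
    if 0 <= index < len(penalties):
        return penalties[index]
    return None

for seq in ("DEKR", "DEKKR", "KKKK"):  # hypothetical sequences
    q = net_charge(seq)
    print(seq, q, tabulated_penalty(q))
```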


 Specify a custom `relaxscript` that optimizes the `ramp_repack_min` weights to prevent too many alanines from being designed (demonstrated at Pre-RosettaCON 2018):

 Specify TaskOperations to be applied to chain A. In this case, let's use the latest LayerDesign implementation (Note: this is still being actively developed) using the XmlObjects class.

In [22]:
layer_design_task = pyrosetta.rosetta.protocols.rosetta_scripts.XmlObjects.create_from_string("""
<RESIDUE_SELECTORS>
  <Layer name="surface" select_core="false" select_boundary="false" select_surface="true" use_sidechain_neighbors="true"/>
  <Layer name="boundary" select_core="false" select_boundary="true" select_surface="false" use_sidechain_neighbors="true"/>
  <Layer name="core" select_core="true" select_boundary="false" select_surface="false" use_sidechain_neighbors="true"/>
  <SecondaryStructure name="sheet" overlap="0" minH="3" minE="2" include_terminal_loops="false" use_dssp="true" ss="E"/>
  <SecondaryStructure name="entire_loop" overlap="0" minH="3" minE="2" include_terminal_loops="true" use_dssp="true" ss="L"/>
  <SecondaryStructure name="entire_helix" overlap="0" minH="3" minE="2" include_terminal_loops="false" use_dssp="true" ss="H"/>
  <And name="helix_cap" selectors="entire_loop">
   <PrimarySequenceNeighborhood lower="1" upper="0" selector="entire_helix"/>
  </And>
  <And name="helix_start" selectors="entire_helix">
    <PrimarySequenceNeighborhood lower="0" upper="1" selector="helix_cap"/>
  </And>
  <And name="helix" selectors="entire_helix">
    <Not selector="helix_start"/>
  </And>
  <And name="loop" selectors="entire_loop">
    <Not selector="helix_cap"/>
  </And>
</RESIDUE_SELECTORS>
<TASKOPERATIONS>
  <DesignRestrictions name="layer_design">
    <Action selector_logic="surface AND helix_start" aas="EHKPQR"/>
    <Action selector_logic="surface AND helix" aas="EHKQR"/>
    <Action selector_logic="surface AND sheet" aas="DEHKNQRST"/>
    <Action selector_logic="surface AND loop" aas="DEGHKNPQRST"/>
    <Action selector_logic="boundary AND helix_start" aas="ADEIKLMNPQRSTVWY"/>
    <Action selector_logic="boundary AND helix" aas="ADEIKLMNQRSTVWY"/>
    <Action selector_logic="boundary AND sheet" aas="DEFIKLNQRSTVWY"/>
    <Action selector_logic="boundary AND loop" aas="ADEFGIKLMNPQRSTVWY"/>
    <Action selector_logic="core AND helix_start" aas="AFILMPVWY"/>
    <Action selector_logic="core AND helix" aas="AFILMVWY"/>
    <Action selector_logic="core AND sheet" aas="FILVWY"/>
    <Action selector_logic="core AND loop" aas="AFGILMPVWY"/>
    <Action selector_logic="helix_cap" aas="DNST"/>
  </DesignRestrictions>
</TASKOPERATIONS>
""").get_task_operation("layer_design")

INFO:rosetta:protocols.rosetta_scripts.RosettaScriptsParser: Generating XML Schema for rosetta_scripts...
INFO:rosetta:protocols.rosetta_scripts.RosettaScriptsParser: ...done
INFO:rosetta:protocols.rosetta_scripts.RosettaScriptsParser: Initializing schema validator...
INFO:rosetta:protocols.rosetta_scripts.RosettaScriptsParser: ...done
INFO:rosetta:protocols.rosetta_scripts.RosettaScriptsParser: Validating input script...
INFO:rosetta:protocols.rosetta_scripts.RosettaScriptsParser: ...done
INFO:rosetta:protocols.rosetta_scripts.RosettaScriptsParser: Parsed script:
<ROSETTASCRIPTS>
	<RESIDUE_SELECTORS>
		<Layer name="surface" select_boundary="false" select_core="false" select_surface="true" use_sidechain_neighbors="true"/>
		<Layer name="boundary" select_boundary="true" select_core="false" select_surface="false" use_sidechain_neighbors="true"/>
		<Layer name="core" select_boundary="false" select_core="true" select_surface="false" u
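
When inspecting a PackerTask, it can help to work backwards from a residue's allowed amino acids to the layer rule that produced them. The following plain-Python sketch copies the `aas` strings from the `DesignRestrictions` block above into a dictionary and matches a list of allowed residue names against it. The helper name `matching_rules` is hypothetical, and the handling of names such as `HIS_D` or `ASP:NtermProteinFull` (truncating to the three-letter base name) is an assumption made for this illustration.

```python
# One-letter codes allowed by each Action in the DesignRestrictions block.
LAYER_RULES = {
    "surface AND helix_start": "EHKPQR",
    "surface AND helix": "EHKQR",
    "surface AND sheet": "DEHKNQRST",
    "surface AND loop": "DEGHKNPQRST",
    "boundary AND helix_start": "ADEIKLMNPQRSTVWY",
    "boundary AND helix": "ADEIKLMNQRSTVWY",
    "boundary AND sheet": "DEFIKLNQRSTVWY",
    "boundary AND loop": "ADEFGIKLMNPQRSTVWY",
    "core AND helix_start": "AFILMPVWY",
    "core AND helix": "AFILMVWY",
    "core AND sheet": "FILVWY",
    "core AND loop": "AFGILMPVWY",
    "helix_cap": "DNST",
}

THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}

def matching_rules(allowed_residue_names):
    """Return the layer rules whose allowed set equals the given residue names."""
    # Strip variant suffixes (":NtermProteinFull") and tautomer tags ("HIS_D").
    one_letter = {THREE_TO_ONE[name.split(":")[0][:3]] for name in allowed_residue_names}
    return [rule for rule, aas in LAYER_RULES.items() if set(aas) == one_letter]

print(matching_rules("GLU,HIS,HIS_D,LYS,GLN,ARG".split(",")))
print(matching_rules("ALA,PHE,ILE,LEU,MET,VAL,TRP,TYR".split(",")))
```

An empty result would suggest that another TaskOperation further restricted the position, or that the residue falls under a combination of selectors not covered by a single rule.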

 Also prepare a ResidueSelector for the heterotrimer interface within chains B and C, and the rest of chains B and C. We will use these with `RestrictAbsentCanonicalAASRLT`, `RestrictToRepackingRLT`, and `PreventRepackingRLT` TaskOperations:

In [23]:
interface = pyrosetta.rosetta.core.select.residue_selector.InterGroupInterfaceByVectorSelector()
interface.group1_selector(chain_A)
interface.group2_selector(chain_BC)
chain_BC_interface = pyrosetta.rosetta.core.select.residue_selector.AndResidueSelector(interface, chain_BC)
not_chain_BC_interface = pyrosetta.rosetta.core.select.residue_selector.NotResidueSelector(chain_BC_interface)
chain_BC_not_interface = pyrosetta.rosetta.core.select.residue_selector.AndResidueSelector(not_chain_BC_interface, chain_BC)

 For minimization, we also need the following ResidueSelector:

In [24]:
chain_A_and_BC_interface = pyrosetta.rosetta.core.select.residue_selector.OrResidueSelector(chain_A, chain_BC_interface)
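
The boolean selector combinations above behave like ordinary set operations on residue indices. The sketch below mirrors them with Python sets on a toy 12-residue system; the residue numbers and the contents of `interface` are hypothetical and only illustrate the logic, not the real pose.

```python
all_residues = set(range(1, 13))   # toy pose with 12 residues
chain_A = set(range(1, 5))         # residues 1-4
chain_BC = all_residues - chain_A  # NotResidueSelector(chain_A)
interface = {3, 4, 5, 6, 9}        # hypothetical interface spanning both groups

chain_BC_interface = interface & chain_BC                   # AndResidueSelector
not_chain_BC_interface = all_residues - chain_BC_interface  # NotResidueSelector
chain_BC_not_interface = not_chain_BC_interface & chain_BC  # AndResidueSelector
chain_A_and_BC_interface = chain_A | chain_BC_interface     # OrResidueSelector

print(sorted(chain_BC_interface))
print(sorted(chain_BC_not_interface))
print(sorted(chain_A_and_BC_interface))
```

Note that `chain_A_and_BC_interface` and `chain_BC_not_interface` together cover every residue exactly once, which is what lets one group be designed or repacked while the other is frozen.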

Make sure each ResidueSelector selects the regions as desired (in the following case, the `interface` ResidueSelector is visualized):

In [None]:
view = viewer.init(pose) \
    + viewer.setStyle() \
    + viewer.setStyle(residue_selector=interface, colorscheme="greyCarbon")
view()

Create a TaskFactory to be used with the `FastRelax` mover (we use this instead of the `FastDesign` mover because its constructor allows us to set a relax script):

In [25]:
tf.clear()
tf = pyrosetta.rosetta.core.pack.task.TaskFactory()

tf.push_back(pyrosetta.rosetta.core.pack.task.operation.InitializeFromCommandline())
tf.push_back(pyrosetta.rosetta.core.pack.task.operation.IncludeCurrent())
tf.push_back(pyrosetta.rosetta.core.pack.task.operation.NoRepackDisulfides())

# Prevent repacking on chain_BC_not_interface
tf.push_back(pyrosetta.rosetta.core.pack.task.operation.OperateOnResidueSubset(
    pyrosetta.rosetta.core.pack.task.operation.PreventRepackingRLT(), chain_BC_not_interface))

# Repack only on chain_BC_interface
tf.push_back(pyrosetta.rosetta.core.pack.task.operation.OperateOnResidueSubset(
    pyrosetta.rosetta.core.pack.task.operation.RestrictToRepackingRLT(), chain_BC_interface))

# Enable design on chain_A
aa_to_design = pyrosetta.rosetta.core.pack.task.operation.RestrictAbsentCanonicalAASRLT()
aa_to_design.aas_to_keep("ACDEFGHIKLMNPQRSTVWY")
tf.push_back(pyrosetta.rosetta.core.pack.task.operation.OperateOnResidueSubset(
    aa_to_design, chain_A))

# Apply layer design
tf.push_back(layer_design_task)

# Convert the task factory into a PackerTask
packer_task = tf.create_task_and_apply_taskoperations(pose)
# View the PackerTask
print(packer_task)

INFO:rosetta:core.pack.task: Packer task: initialize from command line()


#Packer_Task

resid	pack?	design?	allowed_aas
1	TRUE	TRUE	ASP:NtermProteinFull,ASN:NtermProteinFull,SER:NtermProteinFull,THR:NtermProteinFull
2	TRUE	TRUE	GLU,HIS,HIS_D,LYS,PRO,GLN,ARG
3	TRUE	TRUE	GLU,HIS,HIS_D,LYS,GLN,ARG
4	TRUE	TRUE	ALA,ASP,GLU,ILE,LYS,LEU,MET,ASN,GLN,ARG,SER,THR,VAL,TRP,TYR
5	TRUE	TRUE	GLU,HIS,HIS_D,LYS,GLN,ARG
6	TRUE	TRUE	GLU,HIS,HIS_D,LYS,GLN,ARG
7	TRUE	TRUE	ALA,ASP,GLU,ILE,LYS,LEU,MET,ASN,GLN,ARG,SER,THR,VAL,TRP,TYR
8	TRUE	TRUE	ALA,ASP,GLU,ILE,LYS,LEU,MET,ASN,GLN,ARG,SER,THR,VAL,TRP,TYR
9	TRUE	TRUE	GLU,HIS,HIS_D,LYS,GLN,ARG
10	TRUE	TRUE	GLU,HIS,HIS_D,LYS,GLN,ARG
11	TRUE	TRUE	ALA,PHE,ILE,LEU,MET,VAL,TRP,TYR
12	TRUE	TRUE	GLU,HIS,HIS_D,LYS,GLN,ARG
13	TRUE	TRUE	GLU,HIS,HIS_D,LYS,GLN,ARG
14	TRUE	TRUE	ALA,PHE,ILE,LEU,MET,VAL,TRP,TYR
15	TRUE	TRUE	ALA,ASP,GLU,ILE,LYS,LEU,MET,ASN,GLN,ARG,SER,THR,VAL,TRP,TYR
16	TRUE	TRUE	GLU,HIS,HIS_D,LYS,GLN,ARG
17	TRUE	TRUE	GLU,HIS,HIS_D,LYS,GLN,ARG
18	TRUE	TRUE	ALA,PHE,ILE,LEU,MET,VAL,TRP,TYR
19	TRUE	TRUE	GLU,HIS,HIS_D,LYS,GLN,ARG
20	TRU
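
Reading a long PackerTask printout by eye is error-prone, so a programmatic tally can help. The helper below is a minimal sketch that assumes the task object exposes `total_residue()`, `being_designed(i)`, and `being_packed(i)`; `summarize_packer_task` and `FakeTask` are hypothetical names introduced here, and `FakeTask` exists only so the example runs without PyRosetta:

```python
def summarize_packer_task(task):
    """Count designable, repack-only, and fixed positions in a PackerTask-like object."""
    counts = {"design": 0, "repack_only": 0, "fixed": 0}
    for i in range(1, task.total_residue() + 1):  # Rosetta residue numbering starts at 1
        if task.being_designed(i):
            counts["design"] += 1
        elif task.being_packed(i):
            counts["repack_only"] += 1
        else:
            counts["fixed"] += 1
    return counts


class FakeTask:
    """Stand-in used only to demonstrate the helper without PyRosetta."""

    def __init__(self, states):
        self.states = states  # one of "design", "repack", "fixed" per residue

    def total_residue(self):
        return len(self.states)

    def being_designed(self, i):
        return self.states[i - 1] == "design"

    def being_packed(self, i):
        return self.states[i - 1] in ("design", "repack")


toy_counts = summarize_packer_task(FakeTask(["design", "design", "repack", "fixed"]))
print(toy_counts)
```

With the notebook's objects, `summarize_packer_task(packer_task)` should report chain A as designable, the chain B/C interface as repack-only, and the rest of chains B and C as fixed.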

The PackerTask looks as intended. Now set up the `MoveMapFactory`:

In [26]:
# Set up a MoveMapFactory
mmf = pyrosetta.rosetta.core.select.movemap.MoveMapFactory()
mmf.all_bb(setting=False)  # Set to true if needed
mmf.all_bondangles(setting=False)
mmf.all_bondlengths(setting=False)
mmf.all_chi(setting=True)
mmf.all_jumps(setting=False)
mmf.set_cartesian(setting=False)

# Set movemap actions to turn on or off certain torsions, overriding the above defaults
enable = pyrosetta.rosetta.core.select.movemap.move_map_action.mm_enable
disable = pyrosetta.rosetta.core.select.movemap.move_map_action.mm_disable

# Set custom minimizable torsions
mmf.add_bondangles_action(action=enable, selector=chain_A_and_BC_interface)
mmf.add_bondlengths_action(action=enable, selector=chain_A_and_BC_interface)
mmf.add_chi_action(action=enable, selector=chain_A_and_BC_interface)

mmf.add_bondangles_action(action=disable, selector=chain_BC_not_interface)
mmf.add_bondlengths_action(action=disable, selector=chain_BC_not_interface)
mmf.add_chi_action(action=disable, selector=chain_BC_not_interface)
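
To see how the global defaults and the selector-specific actions combine, here is a simplified pure-Python model. It assumes that actions are applied in the order they were added and that each one overrides the default for the residues it selects; `resolve_dof` and the toy masks are hypothetical and are not part of PyRosetta:

```python
def resolve_dof(default, actions, n_residues):
    """Return a per-residue on/off list for one degree of freedom.

    `default` is the global setting; `actions` is an ordered list of
    (enable, mask) pairs, and later actions override earlier ones.
    """
    setting = [default] * n_residues
    for enable, mask in actions:
        for idx, selected in enumerate(mask):
            if selected:
                setting[idx] = enable
    return setting


# Toy 5-residue example (hypothetical masks).
chain_A_and_BC_interface_mask = [True, True, True, False, False]
chain_BC_not_interface_mask = [False, False, False, True, True]
actions = [(True, chain_A_and_BC_interface_mask), (False, chain_BC_not_interface_mask)]

chi = resolve_dof(default=True, actions=actions, n_residues=5)          # all_chi(True)
bond_angles = resolve_dof(default=False, actions=actions, n_residues=5)  # all_bondangles(False)
print(chi)
print(bond_angles)
```

In this model both degrees of freedom end up enabled only where `chain_A_and_BC_interface` applies, regardless of the global default.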

 Now let's double-check some more `pose` information to verify that we are ready for `FastDesign`:

In [27]:
display_pose = pyrosetta.rosetta.protocols.fold_from_loops.movers.DisplayPoseLabelsMover()
display_pose.tasks(tf)
display_pose.movemap_factory(mmf)
display_pose.apply(pose)

INFO:rosetta:[0mprotocols.fold_from_loops.DisplayPoseLabelsMover: [0mSEQUENCE       AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAYKIKETLKRLEDSLRELRRILEELKEMLERLEKNPDKDVIVEVLKVIVKAIEASVENQRISAENQKALATKYKIKETLKRLEDSLRELRRILEELKEMLERLEKNPDKDVIVEVLKVIVKAIEASVENQRISAENQKALAX
INFO:rosetta:[0mprotocols.fold_from_loops.DisplayPoseLabelsMover: [0mSTRUCTURE      LHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHLLLLLLLHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHLLLHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHLLLLHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHLLHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHLLLLHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHLL
INFO:rosetta:[0mprotocols.fold_from_loops.DisplayPoseLabelsMover: [0mMVMP_BB        ...................................................................................................................................................................................................................
INFO:rosetta:[0mprotocols.fold_from_loops.DisplayPoseLabelsMover: [0mMVMP_CHI       ********************

 We are ready to set up the `FastRelax` mover:

In [29]:
#fast_design = pyrosetta.rosetta.protocols.denovo_design.movers.FastDesign(scorefxn_in=design_scorefxn, standard_repeats=1) # 2-5 repeats suggested for real applications
fast_design = pyrosetta.rosetta.protocols.relax.FastRelax(scorefxn_in=design_scorefxn, standard_repeats=1, script_file="KillA2019")
fast_design.cartesian(False)
fast_design.set_task_factory(tf)
fast_design.set_movemap_factory(mmf)
fast_design.min_type("lbfgs_armijo_nonmonotone")
fast_design.ramp_down_constraints(False)

INFO:rosetta:protocols.relax.RelaxScriptManager: score12 : 16.2709
INFO:rosetta:protocols.relax.RelaxScriptManager: talaris2013 : 9.41743
INFO:rosetta:protocols.relax.RelaxScriptManager: talaris2014 : 9.34686
INFO:rosetta:protocols.relax.RelaxScriptManager: ref2015 : 6.875
INFO:rosetta:protocols.relax.RelaxScriptManager: beta_nov16 : 14.9379


Note that this takes ~37min 13s:

In [30]:
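if not os.getenv("DEBUG"):
    %time fast_design.apply(pose)
    # The next line replaces the pose designed above with one saved from a previously run trajectory,
    # which keeps the rest of the notebook reproducible. Comment it out to keep your own design,
    # or comment out the line above to skip the long run:
    pose = pyrosetta.pose_from_file("expected_outputs/designed_pose.pdb")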

INFO:rosetta:core.import_pose.import_pose: File 'designed_pose.pdb' automatically determined to be of type PDB
INFO:rosetta:core.io.pdb.pdb_reader: Parsing 219 .pdb records with unknown format to search for Rosetta-specific comments.


 Save this pose for downstream reference:

In [31]:
if not os.getenv("DEBUG"):
    designed_pose = pose.clone()
    #pyrosetta.dump_pdb(designed_pose, "outputs/designed_pose.pdb")

 Now that we have re-designed chain A, it is strongly recommended to use `FastRelax` with a Cartesian scorefunction to repack and minimize with a realistic scorefunction. Note: this takes several minutes (~6min 27s in one run; the run logged below took about 10 min).

In [32]:
tf.clear()
tf = pyrosetta.rosetta.core.pack.task.TaskFactory()
tf.push_back(pyrosetta.rosetta.core.pack.task.operation.InitializeFromCommandline())
tf.push_back(pyrosetta.rosetta.core.pack.task.operation.IncludeCurrent())
tf.push_back(pyrosetta.rosetta.core.pack.task.operation.NoRepackDisulfides())
tf.push_back(pyrosetta.rosetta.core.pack.task.operation.OperateOnResidueSubset(
    pyrosetta.rosetta.core.pack.task.operation.RestrictToRepackingRLT(), true_selector))
mm = pyrosetta.rosetta.core.kinematics.MoveMap()
mm.set_bb(False) # Set to true if desired
mm.set_chi(True)
mm.set_jump(False)
fast_relax = pyrosetta.rosetta.protocols.relax.FastRelax(scorefxn_in=relax_scorefxn, standard_repeats=1) # 2-5 repeats suggested for real applications
fast_relax.cartesian(True)
fast_relax.set_task_factory(tf)
fast_relax.set_movemap(mm)
fast_relax.minimize_bond_angles(True)
fast_relax.minimize_bond_lengths(True)
fast_relax.min_type("lbfgs_armijo_nonmonotone") # Minimizer used here with the Cartesian scorefunction
fast_relax.ramp_down_constraints(False)

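# Run the FastRelax trajectory (skipped when the DEBUG environment variable is set).
if not os.getenv("DEBUG"):
    %time fast_relax.apply(pose)

# For speed and reproducibility, load the pose from a previous trajectory.
# Note: this replaces the pose relaxed above; comment it out to keep your own result.
pose = pyrosetta.pose_from_file("expected_outputs/designed_relaxed_pose.pdb")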

INFO:rosetta:core.scoring.CartesianBondedEnergy: Creating new peptide-bonded energy container (210)
INFO:rosetta:protocols.relax.FastRelax: CMD: repeat  197.55  0  0  0.55
INFO:rosetta:core.pack.task: Packer task: initialize from command line()
INFO:rosetta:core.pack.pack_rotamers: built 12019 rotamers at 210 positions.
INFO:rosetta:core.pack.interaction_graph.interaction_graph_factory: Instantiating LinearMemoryInteractionGraph
INFO:rosetta:protocols.relax.FastRelax: CMD: ramp_repack_min  -241.67  0  0  0.011
INFO:rosetta:core.pack.task: Packer task: initialize from command line()
INFO:rosetta:core.pack.pack_rotamers: built 11198 rotamers at 210 positions.
INFO:rosetta:core.pack.interaction_graph.interaction_graph_factory: Instantiating LinearMemoryInteractionGraph
INFO:rosetta:protocols.relax.FastRelax: CMD: ramp_repack_min  -114.155  0  0  0.1375
INFO:rosetta:core.pack.task: Packer task: initiali

CPU times: user 10min 9s, sys: 2.31 s, total: 10min 12s
Wall time: 10min 11s


 Save the pose for downstream analysis:

In [33]:
designed_relaxed_pose = pose.clone()
pyrosetta.dump_pdb(designed_relaxed_pose, "expected_outputs/designed_relaxed_pose.pdb")

True

 Let's compare sequences to see what happened:

In [34]:
print(minimized_start_pose.chain_sequence(1))
print(designed_relaxed_pose.chain_sequence(1))

KYKIKETLKRLEDSLRELRRILEELKEMLERLEKNPDKDVIVEVLKVIVKAIEASVENQRISAENQKALA
DEHERRIMEERHRMEEKFHHEMERMHRKKHQRHQTTSKSEYEHMFETIMRMMEWFERSFERMQEMYRQRT
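
To make the comparison concrete, the following sketch lists the positions that differ between the two chain A sequences printed above. `mutated_positions` is a hypothetical helper written for this illustration; in a live session you would pass the two `chain_sequence(1)` strings directly instead of the copied literals:

```python
# Chain A sequences copied from the output above.
start_seq = "KYKIKETLKRLEDSLRELRRILEELKEMLERLEKNPDKDVIVEVLKVIVKAIEASVENQRISAENQKALA"
designed_seq = "DEHERRIMEERHRMEEKFHHEMERMHRKKHQRHQTTSKSEYEHMFETIMRMMEWFERSFERMQEMYRQRT"


def mutated_positions(before, after):
    """Return 1-indexed (position, old, new) tuples where the two sequences differ."""
    if len(before) != len(after):
        raise ValueError("sequences must have equal length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(before, after), start=1) if a != b]


mutations = mutated_positions(start_seq, designed_seq)
identity = 100.0 * (len(start_seq) - len(mutations)) / len(start_seq)
print("{} of {} positions changed ({:.1f}% identity)".format(len(mutations), len(start_seq), identity))
print(mutations[:5])  # first few substitutions
```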


View the resulting design!

In [None]:
chA = pyrosetta.rosetta.core.select.residue_selector.ChainSelector("A")
chB = pyrosetta.rosetta.core.select.residue_selector.ChainSelector("B")
chC = pyrosetta.rosetta.core.select.residue_selector.ChainSelector("C")
view = sum(
    [
        viewer.init(designed_relaxed_pose),
        viewer.setStyle(cartoon_color="lightgrey", radius=0.25),
        viewer.setSurface(residue_selector=chA, colorscheme="orangeCarbon", opacity=0.5, surface_type="VDW"),
        viewer.setSurface(residue_selector=chB, colorscheme="greenCarbon", opacity=0.5, surface_type="VDW"),
        viewer.setSurface(residue_selector=chC, colorscheme="violetCarbon", opacity=0.5, surface_type="VDW"),
        viewer.setDisulfides(radius=0.25),
        viewer.setZoom(factor=1.5)
    ]
)
view()

## Analysis:

 By how many Angstroms RMSD did the backbone Cα atoms move?

 What is the delta `total_score` from `minimized_start_pose` to `designed_relaxed_pose`?

 What is the per-residue energy difference for each mutated position between `minimized_start_pose` and `designed_relaxed_pose`?
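
One possible way to approach the per-residue question is sketched below. It assumes a scored pose exposes per-residue totals through `pose.energies().residue_total_energy(i)`; the helper names `residue_energies` and `mutation_energy_deltas` are hypothetical, and the toy numbers are made up rather than taken from this notebook:

```python
def residue_energies(pose, scorefxn):
    """Score `pose` and return its per-residue total energies (index 0 = residue 1)."""
    scorefxn(pose)  # scoring populates pose.energies()
    return [pose.energies().residue_total_energy(i) for i in range(1, pose.size() + 1)]


def mutation_energy_deltas(seq_before, seq_after, e_before, e_after):
    """Return (position, old_aa, new_aa, delta_energy) for each mutated position."""
    rows = zip(seq_before, seq_after, e_before, e_after)
    return [
        (i, a, b, after - before)
        for i, (a, b, before, after) in enumerate(rows, start=1)
        if a != b
    ]


# Toy demonstration with made-up sequences and energies.
toy = mutation_energy_deltas("KYK", "KEK", [-1.0, -2.0, -0.5], [-1.2, -3.5, -0.5])
print(toy)

# With the notebook's objects (requires PyRosetta; choice of scorefunction is an assumption):
# e0 = residue_energies(minimized_start_pose, relax_scorefxn)
# e1 = residue_energies(designed_relaxed_pose, relax_scorefxn)
# deltas = mutation_energy_deltas(minimized_start_pose.sequence(),
#                                 designed_relaxed_pose.sequence(), e0, e1)
# delta_total = relax_scorefxn(designed_relaxed_pose) - relax_scorefxn(minimized_start_pose)
# ca_rmsd = pyrosetta.rosetta.core.scoring.CA_rmsd(minimized_start_pose, designed_relaxed_pose)
```

Score both poses with the same scorefunction, otherwise the deltas are not comparable.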

 Are the sequence constraints imposed by the `aa_repeat` scoreterm satisfied? Rewrite the following Python code, which counts each residue type in chain A, so that it instead reports the longest stretch of each residue type in the primary amino acid sequence of chain A:

In [35]:
for aa in IUPACData.protein_letters:
    aa_selector = pyrosetta.rosetta.core.select.residue_selector.ResidueNameSelector(str.upper(Bio.SeqUtils.seq3(aa)))
    aa_and_chain_A = pyrosetta.rosetta.core.select.residue_selector.AndResidueSelector(chain_A, aa_selector)
    sel_res_count_metric = pyrosetta.rosetta.core.simple_metrics.metrics.SelectedResidueCountMetric()
    sel_res_count_metric.set_residue_selector(aa_and_chain_A)
    print(aa, int(sel_res_count_metric.calculate(designed_relaxed_pose)))

A 0
C 0
D 0
E 15
F 4
G 0
H 4
I 2
K 4
L 0
M 10
N 0
P 0
Q 4
R 12
S 3
T 3
V 0
W 1
Y 2
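
One way to approach the longest-stretch question is to work on the sequence string rather than on selectors. This is a minimal sketch using `itertools.groupby`; `longest_runs` is a hypothetical helper, and the sequence literal is copied from the chain A output printed earlier:

```python
from itertools import groupby


def longest_runs(sequence):
    """Map each residue letter to the length of its longest consecutive run."""
    runs = {}
    for letter, group in groupby(sequence):
        runs[letter] = max(runs.get(letter, 0), sum(1 for _ in group))
    return runs


# Chain A of designed_relaxed_pose as printed earlier in this notebook;
# in a live session use designed_relaxed_pose.chain_sequence(1) instead.
chain_A_seq = "DEHERRIMEERHRMEEKFHHEMERMHRKKHQRHQTTSKSEYEHMFETIMRMMEWFERSFERMQEMYRQRT"
chain_A_runs = longest_runs(chain_A_seq)
for aa, run in sorted(chain_A_runs.items()):
    print(aa, run)
```

A plain string scan includes the terminal residues. The selector-based count above seems to miss them (it reports `D 0` although the sequence starts with D), presumably because terminal residue types carry variant names such as `ASP:NtermProteinFull`.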


 Does chain A have 40% aromatic or aliphatic (but not leucine) residues and 5% leucine, satisfying the `aa_composition` scoreterm? For additional practice, rewrite the following Python implementation using PyRosetta ResidueSelectors and SimpleMetrics:

In [36]:
num_aro_ali = 0
num_leucine = 0

for aa in pose.chain_sequence(1):
    if aa in "WYFAVIMP":
        num_aro_ali += 1
    if aa == "L":
        num_leucine += 1
        
print("The % aromatic or aliphatic (non-leucine) residues in chain A is {0}%".format((num_aro_ali*100)/len(pose.chain_sequence(1))))
print("The % leucine in chain A is {0}%".format((num_leucine*100)/len(pose.chain_sequence(1))))

The % aromatic or aliphatic (non-leucine) residues in chain A is 27.142857142857142%
The % leucine in chain A is 0.0%


 Are the sequence constraints imposed by the `AddHelixSequenceConstraints` mover with the `aa_composition` scoreterm satisfied?

 Use SASA (solvent-accessible surface area) to assess whether there are more or fewer voids in `designed_relaxed_pose` than in `minimized_start_pose`, satisfying the `voids_penalty` scoreterm.

In [37]:
tf.clear()
tf = pyrosetta.rosetta.core.pack.task.TaskFactory()
tf.push_back(pyrosetta.rosetta.core.pack.task.operation.OperateOnResidueSubset(
    pyrosetta.rosetta.core.pack.task.operation.RestrictToRepackingRLT(), true_selector))
sasa = pyrosetta.rosetta.protocols.simple_filters.TaskAwareSASAFilter()
sasa.task_factory(tf)
print("The sasa of minimized_start_pose is {}".format(sasa.score(minimized_start_pose)))
print("The sasa of designed_relaxed_pose is {}".format(sasa.score(designed_relaxed_pose)))

The sasa of minimized_start_pose is 10608.90320076503
The sasa of designed_relaxed_pose is 11126.886879822758


 Is the net charge of chain A equal to exactly zero, satisfying the `netcharge` scoreterm?

In [38]:
# One method to calculate net charge of chain A
num_negative = 0
num_positive = 0
for aa in pose.chain_sequence(1):
    if aa in "DE":
        num_negative += 1
    if aa in "KR":
        num_positive += 1
print("The net charge of chain A = {0}".format(num_positive - num_negative))

# Another method to calculate net charge of chain A
test_pose = pose.clone()
keep_chA = pyrosetta.rosetta.protocols.grafting.simple_movers.KeepRegionMover(res_start=str(test_pose.chain_begin(1)), res_end=str(test_pose.chain_end(1)))
keep_chA.apply(test_pose)
net_charge = pyrosetta.rosetta.protocols.simple_filters.NetChargeFilter()
print("The net charge of chain A = {0}".format(net_charge.score(test_pose)))

INFO:rosetta:protocols.grafting.util: Returning 70 residues from 1 to 70
INFO:rosetta:protocols.simple_filters.NetChargeFilter: AA:  -1  ASP 1
INFO:rosetta:protocols.simple_filters.NetChargeFilter: AA:  -1  GLU 2
INFO:rosetta:protocols.simple_filters.NetChargeFilter: AA:  -1  GLU 4
INFO:rosetta:protocols.simple_filters.NetChargeFilter: AA:  +1  ARG 5
INFO:rosetta:protocols.simple_filters.NetChargeFilter: AA:  +1  ARG 6
INFO:rosetta:protocols.simple_filters.NetChargeFilter: AA:  -1  GLU 9
INFO:rosetta:protocols.simple_filters.NetChargeFilter: AA:  -1  GLU 10
INFO:rosetta:protocols.simple_filters.NetChargeFilter: AA:  +1  ARG 11
INFO:rosetta:protocols.simple_filters.NetChargeFilter: AA:  +1  ARG 13
INFO:rosetta:protocols.simple_filters.NetChargeFilter: AA:  -1  GLU 15
INFO:rosetta:protocols.simple_filters.NetChargeFilter: AA:  -1  GLU 16
INFO:rosetta:protocols.simple_filters.NetChargeFilte

The net charge of chain A = 0
The net charge of chain A = 0.0


 Are there any buried unsatisfied polar atoms, satisfying the `buried_unsatisfied_penalty` scoreterm?

In [39]:
uhb = pyrosetta.rosetta.protocols.rosetta_scripts.XmlObjects.create_from_string("""
<SCOREFXNS>
  <ScoreFunction name="fa_default" weights="ref2015"/>
</SCOREFXNS>
<FILTERS>
  <BuriedUnsatHbonds name="uhb_sc" use_reporter_behavior="true" use_hbnet_behavior="false" scorefxn="fa_default" report_bb_heavy_atom_unsats="false" report_sc_heavy_atom_unsats="true" cutoff="99999" print_out_info_to_pdb="false" ignore_surface_res="true" residue_surface_cutoff="20.0" ignore_bb_heavy_unsats="false" confidence="0"/>
  <BuriedUnsatHbonds name="uhb_bb" use_reporter_behavior="true" use_hbnet_behavior="false" scorefxn="fa_default" report_bb_heavy_atom_unsats="true" report_sc_heavy_atom_unsats="false" cutoff="99999" print_out_info_to_pdb="false" ignore_surface_res="true" residue_surface_cutoff="20.0" ignore_bb_heavy_unsats="false" confidence="0"/>
</FILTERS>
""")

INFO:rosetta:protocols.rosetta_scripts.RosettaScriptsParser: Generating XML Schema for rosetta_scripts...
INFO:rosetta:protocols.rosetta_scripts.RosettaScriptsParser: ...done
INFO:rosetta:protocols.rosetta_scripts.RosettaScriptsParser: Initializing schema validator...
INFO:rosetta:protocols.rosetta_scripts.RosettaScriptsParser: ...done
INFO:rosetta:protocols.rosetta_scripts.RosettaScriptsParser: Validating input script...
INFO:rosetta:protocols.rosetta_scripts.RosettaScriptsParser: ...done
INFO:rosetta:protocols.rosetta_scripts.RosettaScriptsParser: Parsed script:
<ROSETTASCRIPTS>
	<SCOREFXNS>
		<ScoreFunction name="fa_default" weights="ref2015"/>
	</SCOREFXNS>
	<FILTERS>
		<BuriedUnsatHbonds confidence="0" cutoff="99999" ignore_bb_heavy_unsats="false" ignore_surface_res="true" name="uhb_sc" print_out_info_to_pdb="false" report_bb_heavy_atom_unsats="false" report_sc_heavy_atom_unsats="true" residue_surface_cutoff="20.0" scorefxn="

In [40]:
print("The number of unsatisfied side-chain heavy atoms in minimized_start_pose is {}".format(uhb.get_filter("uhb_sc").score(minimized_start_pose)))
print("The number of unsatisfied backbone heavy atoms in minimized_start_pose is {}".format(uhb.get_filter("uhb_bb").score(minimized_start_pose)))

print("The number of unsatisfied side-chain heavy atoms in designed_relaxed_pose is {}".format(uhb.get_filter("uhb_sc").score(designed_relaxed_pose)))
print("The number of unsatisfied backbone heavy atoms in designed_relaxed_pose is {}".format(uhb.get_filter("uhb_bb").score(designed_relaxed_pose)))

INFO:rosetta:protocols.simple_filters.BuriedUnsatHbondFilter: /////////////////////////////////////////////////////////////////////////////////////////
INFO:rosetta:protocols.simple_filters.BuriedUnsatHbondFilter:
INFO:rosetta:protocols.simple_filters.BuriedUnsatHbondFilter:     USER HAS SPECIFIED CUSTOM REPORTING BEHAVIOR FOR # UNSATS
INFO:rosetta:protocols.simple_filters.BuriedUnsatHbondFilter:
INFO:rosetta:protocols.simple_filters.BuriedUnsatHbondFilter: /////////////////////////////////////////////////////////////////////////////////////////
INFO:rosetta:protocols.simple_filters.BuriedUnsatHbondFilter:
INFO:rosetta:basic.io.database: Database file opened: scoring/score_functions/hbonds/sp2_elec_params/HBPoly1D.csv
INFO:rosetta:basic.io.database: Database file opened: scoring/score_functions/hbonds/sp2_elec_params/HBFadeIntervals.csv
INFO:rosetta:basic.io.database: Database file opened: scoring/score_function

The number of unsatisfied side-chain heavy atoms in minimized_start_pose is 3.0


INFO:rosetta:protocols.vardist_solaccess: VarSolDistSasaCalculator::lookup
INFO:rosetta:protocols.vardist_solaccess: VarSolDistSasaCalculator::lookup
INFO:rosetta:protocols.simple_filters.BuriedUnsatHbondFilter:   buried unsats in input pose:
INFO:rosetta:protocols.simple_filters.BuriedUnsatHbondFilter:   all_heavy_atom_unsats = 14
INFO:rosetta:protocols.simple_filters.BuriedUnsatHbondFilter:   bb_heavy_atom_unsats = 11
INFO:rosetta:protocols.simple_filters.BuriedUnsatHbondFilter:   sc_heavy_atom_unsats = 3
INFO:rosetta:protocols.simple_filters.BuriedUnsatHbondFilter:   countable_nonheavy_unsats = 10
INFO:rosetta:protocols.simple_filters.BuriedUnsatHbondFilter:       Unsatisfied HEAVY polar atom at residue 4: ILE  N
INFO:rosetta:protocols.simple_filters.BuriedUnsatHbondFilter:       Unsatisfied Hpol polar atom at residue 4: ILE  H
INFO:rosetta:protocols.simple_filters.BuriedUnsatHbondFilter:       Unsatisfi

The number of unsatisfied backbone heavy atoms in minimized_start_pose is 11.0


INFO:rosetta:protocols.vardist_solaccess: VarSolDistSasaCalculator::lookup
INFO:rosetta:protocols.vardist_solaccess: VarSolDistSasaCalculator::lookup
INFO:rosetta:protocols.simple_filters.BuriedUnsatHbondFilter:   buried unsats in input pose:
INFO:rosetta:protocols.simple_filters.BuriedUnsatHbondFilter:   all_heavy_atom_unsats = 16
INFO:rosetta:protocols.simple_filters.BuriedUnsatHbondFilter:   bb_heavy_atom_unsats = 9
INFO:rosetta:protocols.simple_filters.BuriedUnsatHbondFilter:   sc_heavy_atom_unsats = 7
INFO:rosetta:protocols.simple_filters.BuriedUnsatHbondFilter:   countable_nonheavy_unsats = 11
INFO:rosetta:protocols.simple_filters.BuriedUnsatHbondFilter:       Unsatisfied HEAVY polar atom at residue 4: GLU  N
INFO:rosetta:protocols.simple_filters.BuriedUnsatHbondFilter:       Unsatisfied Hpol polar atom at residue 4: GLU  H
INFO:rosetta:protocols.simple_filters.BuriedUnsatHbondFilter:       Unsatisfie

The number of unsatisfied side-chain heavy atoms in designed_relaxed_pose is 7.0


INFO:rosetta:protocols.vardist_solaccess: VarSolDistSasaCalculator::lookup
INFO:rosetta:protocols.vardist_solaccess: VarSolDistSasaCalculator::lookup
INFO:rosetta:protocols.simple_filters.BuriedUnsatHbondFilter:   buried unsats in input pose:
INFO:rosetta:protocols.simple_filters.BuriedUnsatHbondFilter:   all_heavy_atom_unsats = 16
INFO:rosetta:protocols.simple_filters.BuriedUnsatHbondFilter:   bb_heavy_atom_unsats = 9
INFO:rosetta:protocols.simple_filters.BuriedUnsatHbondFilter:   sc_heavy_atom_unsats = 7
INFO:rosetta:protocols.simple_filters.BuriedUnsatHbondFilter:   countable_nonheavy_unsats = 11
INFO:rosetta:protocols.simple_filters.BuriedUnsatHbondFilter:       Unsatisfied HEAVY polar atom at residue 4: GLU  N
INFO:rosetta:protocols.simple_filters.BuriedUnsatHbondFilter:       Unsatisfied Hpol polar atom at residue 4: GLU  H
INFO:rosetta:protocols.simple_filters.BuriedUnsatHbondFilter:       Unsatisfie

The number of unsatisfied backbone heavy atoms in designed_relaxed_pose is 9.0


 Inspect the tracer output from the `BuriedUnsatHbonds` filter, and inspect `designed_relaxed_pose` very closely in PyMOL or py3Dmol. Do you agree with the `BuriedUnsatHbonds` filter? How does the number of buried unsatisfied heavy atoms compare to that of `minimized_start_pose`?

 Packer results are stochastic: they depend on a random number generator whose seed is reported when you run `pyrosetta.init()`, for example:
>rosetta:core.init.random: RandomGenerator:init: Normal mode, seed=937978431 RG_type=mt19937

How do your results compare with your neighbors' results? Ideally, you would run the same protocol hundreds of times and then use Rosetta filters that recapitulate your design requirements to narrow the results down to a number of designs that you can validate experimentally to answer your biological question.
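
The filtering step described above might look like the following sketch once each trajectory's metrics have been collected. The design names, metric values, and cutoffs are all made up for illustration and are not recommendations:

```python
# Hypothetical metrics for four design trajectories (values are made up).
designs = [
    {"name": "design_001", "total_score": -512.3, "sc_unsats": 7, "net_charge": 0},
    {"name": "design_002", "total_score": -530.8, "sc_unsats": 2, "net_charge": 0},
    {"name": "design_003", "total_score": -498.1, "sc_unsats": 1, "net_charge": 2},
    {"name": "design_004", "total_score": -525.0, "sc_unsats": 3, "net_charge": 0},
]


def passes(design, max_sc_unsats=3, required_charge=0):
    """Apply simple pass/fail thresholds (illustrative cutoffs only)."""
    return design["sc_unsats"] <= max_sc_unsats and design["net_charge"] == required_charge


# Keep designs that pass every filter, best (lowest) total_score first.
shortlist = sorted((d for d in designs if passes(d)), key=lambda d: d["total_score"])
shortlist_names = [d["name"] for d in shortlist]
print(shortlist_names)
```

In practice the metrics would come from filters such as the `BuriedUnsatHbonds` and `NetChargeFilter` calls shown earlier, evaluated on every output pose.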

## See Also

Note that these tools may not be available in PyRosetta, either through code or through XML (for example, Remodel), but they are extremely useful for *de novo* protein design and you should be aware of them.

- **RosettaRemodel**
  - https://www.rosettacommons.org/docs/latest/application_documentation/design/rosettaremodel
- **Sewing**
  - https://www.rosettacommons.org/docs/latest/scripting_documentation/RosettaScripts/composite_protocols/sewing/SEWING
- **Parametric Design**
  - See the previous workshop, [*De Novo* Parametric Backbone Design](http://nbviewer.jupyter.org/github/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/06.06-Introduction-to-Parametric-backbone-design.ipynb).
