## TaskOperation API

@Author: 吴炜坤

@email:weikun.wu@xtalpi.com/weikunwu@163.com

This chapter describes the TaskOperation API in detail. The main reference page is:
> http://new.rosettacommons.org/docs/latest/scripting_documentation/RosettaScripts/TaskOperations/TaskOperations-RosettaScripts#specialized-operations_packer-behavior-modification

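**Notes:**
1. Some TaskOperations are old enough that they can only be used through XmlObjects. See Chapter 9 for details on XmlObjects; only the usage syntax is shown here.
2. Some TaskOperations have gone unmaintained and no longer run correctly; they are omitted here.
3. Some TaskOperations were released only recently and are not yet available in the code; they are listed here anyway and may be updated in the future.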

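### By construction logic, TaskOperations fall into two types:

* Residue Level TaskOperations: set the rotamer freedom of the positions picked by a Selector (the "manual transmission" mode);
* Specialized Operations: apply rotamer operations across all positions according to preset logic (the "automatic transmission" mode);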

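## I. Residue Level TaskOperations

Residue Level TaskOperations (RLTs) are normally combined with a Selector that defines where they act. An RLT can be thought of as a user-customized version of a Specialized Operation. Note in particular that an RLT cannot be passed to a TaskFactory directly: it must first be wrapped with OperateOnResidueSubset to produce a standard TaskOperation.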
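
The wrapping step can be pictured with a toy model. The sketch below is plain Python, not PyRosetta: a dictionary of sets stands in for the PackerTask, and the names `operate_on_subset` and `prevent_repacking` are hypothetical. It only illustrates the idea that a residue-level operation acts on the selected positions, or on the complement when the subset is flipped (the third argument of OperateOnResidueSubset in the cells of this chapter).

```python
# Toy model only: a dict of sets stands in for a PackerTask (not PyRosetta code).
ALL_AAS = set("ACDEFGHIKLMNPQRSTVWY")

def operate_on_subset(task, rlt, selected, flip_subset=False):
    """Apply the per-residue function `rlt` to the selected positions,
    or to every other position when flip_subset is True."""
    targets = set(task) - set(selected) if flip_subset else set(selected)
    return {pos: (rlt(aas) if pos in targets else aas) for pos, aas in task.items()}

def prevent_repacking(allowed):
    """Hypothetical residue-level operation: remove every rotamer option."""
    return set()

task = {pos: set(ALL_AAS) for pos in range(1, 6)}
new_task = operate_on_subset(task, prevent_repacking, selected=[2, 3, 4])
print({pos: len(aas) for pos, aas in new_task.items()})
```

Positions 2 to 4 end up with no allowed types while positions 1 and 5 keep all twenty, which mirrors the shape of the PreventRepackingRLT output in section 1.1.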

The RLTs currently supported in Rosetta:
- RestrictToRepackingRLT
- PreventRepackingRLT
- RestrictAbsentCanonicalAASExceptNativeRLT
- RestrictAbsentCanonicalAASRLT
- DisallowIfNonnativeRLT
- IncludeCurrentRLT
- ExtraRotamersGenericRLT

In [1]:
# Initialize PyRosetta and load the PDB of a short helix.
from pyrosetta import *
init()
pose = pose_from_pdb('./data/helix.pdb')

PyRosetta-4 2021 [Rosetta PyRosetta4.conda.mac.cxx11thread.serialization.python37.Release 2021.31+release.c7009b3115c22daa9efe2805d9d1ebba08426a54 2021-08-07T10:04:12] retrieved from: http://www.pyrosetta.org
(C) Copyright Rosetta Commons Member Institutions. Created in JHU by Sergey Lyskov and PyRosetta Team.
core.init: {0} Checking for fconfig files in pwd and ./rosetta/flags
core.init: {0} Rosetta version: PyRosetta4.conda.mac.cxx11thread.serialization.python37.Release r292 2021.31+release.c7009b3115c c7009b3115c22daa9efe2805d9d1ebba08426a54 http://www.pyrosetta.org 2021-08-07T10:04:12
core.init: {0} command: PyRosetta -ex1 -ex2aro -database /opt/miniconda3/lib/python3.7/site-packages/pyrosetta/database
basic.random.init_random_generator: {0} 'RNG device' seed mode, using '/dev/urandom', seed=-10988823 seed_offset=0 real_seed=-10988823 thread_index=0
basic.random.init_random_generator: {0} RandomGenerator:init: Normal mode, seed=-10988823 RG_t

**RestrictToRepackingRLT was covered in the previous chapter and is not repeated here.**

In [2]:
# Define up front the residue range that the operations act on.
from pyrosetta.rosetta.core.pack.task.operation import *
from pyrosetta.rosetta.core.pack.task.operation import OperateOnResidueSubset
from pyrosetta.rosetta.core.select.residue_selector import ResidueIndexSelector
from pyrosetta.rosetta.core.pack.task import TaskFactory
# Select the residue range
select_pos = ResidueIndexSelector('2,3,4,5,6,7,8,9,10,11,12,13')

#### 1.1 PreventRepackingRLT

Turns off all rotamer freedom for the selected residues, so their side-chain conformations stay unchanged.

In [3]:
# Build a TaskOperation with OperateOnResidueSubset
packing_taskop = OperateOnResidueSubset(PreventRepackingRLT(), select_pos, False)

# Add the TaskOperation to a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(packing_taskop)

# Create the PackerTask
packer_task = pack_tf.create_task_and_apply_taskoperations(pose)
print(packer_task)

#Packer_Task

Threads to request: ALL AVAILABLE

resid	pack?	design?	allowed_aas
1	TRUE	TRUE	ALA:NtermProteinFull,CYS:NtermProteinFull,ASP:NtermProteinFull,GLU:NtermProteinFull,PHE:NtermProteinFull,GLY:NtermProteinFull,HIS:NtermProteinFull,HIS_D:NtermProteinFull,ILE:NtermProteinFull,LYS:NtermProteinFull,LEU:NtermProteinFull,MET:NtermProteinFull,ASN:NtermProteinFull,PRO:NtermProteinFull,GLN:NtermProteinFull,ARG:NtermProteinFull,SER:NtermProteinFull,THR:NtermProteinFull,VAL:NtermProteinFull,TRP:NtermProteinFull,TYR:NtermProteinFull
2	FALSE	FALSE	
3	FALSE	FALSE	
4	FALSE	FALSE	
5	FALSE	FALSE	
6	FALSE	FALSE	
7	FALSE	FALSE	
8	FALSE	FALSE	
9	FALSE	FALSE	
10	FALSE	FALSE	
11	FALSE	FALSE	
12	FALSE	FALSE	
13	FALSE	FALSE	
14	TRUE	TRUE	ALA:CtermProteinFull,CYS:CtermProteinFull,ASP:CtermProteinFull,GLU:CtermProteinFull,PHE:CtermProteinFull,GLY:CtermProteinFull,HIS:CtermProteinFull,HIS_D:CtermProteinFull,ILE:CtermProteinFull,LYS:CtermProteinFull,LEU:CtermProteinFull,MET:CtermProteinFull,ASN:CtermProt

#### 1.2 RestrictAbsentCanonicalAASExceptNativeRLT

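Restricts the rotamer freedom to the given list of amino acid types, while still allowing rotamers of the amino acid type currently present at each position (the native type).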

In [4]:
# Define the allowed freedom
design_with_wt = RestrictAbsentCanonicalAASExceptNativeRLT()
design_with_wt.aas_to_keep('QKI')

# Build a TaskOperation with OperateOnResidueSubset
packing_taskop = OperateOnResidueSubset(design_with_wt, select_pos, False)

# Add the TaskOperation to a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(packing_taskop)

# Create the PackerTask
packer_task = pack_tf.create_task_and_apply_taskoperations(pose)
print(packer_task)

#Packer_Task

Threads to request: ALL AVAILABLE

resid	pack?	design?	allowed_aas
1	TRUE	TRUE	ALA:NtermProteinFull,CYS:NtermProteinFull,ASP:NtermProteinFull,GLU:NtermProteinFull,PHE:NtermProteinFull,GLY:NtermProteinFull,HIS:NtermProteinFull,HIS_D:NtermProteinFull,ILE:NtermProteinFull,LYS:NtermProteinFull,LEU:NtermProteinFull,MET:NtermProteinFull,ASN:NtermProteinFull,PRO:NtermProteinFull,GLN:NtermProteinFull,ARG:NtermProteinFull,SER:NtermProteinFull,THR:NtermProteinFull,VAL:NtermProteinFull,TRP:NtermProteinFull,TYR:NtermProteinFull
2	TRUE	TRUE	GLU,ILE,LYS,GLN
3	TRUE	TRUE	ILE,LYS,LEU,GLN
4	TRUE	TRUE	ILE,LYS,GLN
5	TRUE	TRUE	ILE,LYS,GLN
6	TRUE	TRUE	ILE,LYS,GLN,TRP
7	TRUE	TRUE	ILE,LYS,GLN,VAL
8	TRUE	TRUE	GLU,ILE,LYS,GLN
9	TRUE	TRUE	ILE,LYS,GLN
10	TRUE	TRUE	ALA,ILE,LYS,GLN
11	TRUE	TRUE	GLU,ILE,LYS,GLN
12	TRUE	TRUE	ILE,LYS,GLN,ARG
13	TRUE	TRUE	ILE,LYS,ASN,GLN
14	TRUE	TRUE	ALA:CtermProteinFull,CYS:CtermProteinFull,ASP:CtermProteinFull,GLU:CtermProteinFull,PHE:CtermProteinFull,GLY:CtermProteinFu

#### 1.3 RestrictAbsentCanonicalAASRLT

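Works like RestrictAbsentCanonicalAASExceptNativeRLT: it restricts the rotamer freedom to the given list of amino acid types. The only difference is that RestrictAbsentCanonicalAASRLT ignores the current amino acid type, so the native residue is kept only if it is in the list.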
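
The difference between the two operations comes down to one set union. The following is a minimal pure-Python sketch of that logic, not the Rosetta implementation; the function name `restrict_absent` and the use of one-letter codes are illustrative assumptions.

```python
def restrict_absent(native, keep, except_native):
    """Return the allowed one-letter codes at a position.

    native        -- one-letter code of the residue currently at the position
    keep          -- string of one-letter codes, as passed to aas_to_keep()
    except_native -- True mimics the ...ExceptNativeRLT variant
    """
    allowed = set(keep)
    if except_native:
        allowed.add(native)  # the native type survives even if not listed
    return sorted(allowed)

# Position 2 of the helix is GLU (E) according to the output in section 1.2.
print(restrict_absent("E", "QKI", except_native=True))   # ['E', 'I', 'K', 'Q']
print(restrict_absent("E", "QKI", except_native=False))  # ['I', 'K', 'Q']
```

The two results correspond to the `GLU,ILE,LYS,GLN` and `ILE,LYS,GLN` rows printed for residue 2 in sections 1.2 and 1.3.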

In [5]:
# Define the allowed freedom
design_to = RestrictAbsentCanonicalAASRLT()
design_to.aas_to_keep('QKI')

# Build a TaskOperation with OperateOnResidueSubset
packing_taskop = OperateOnResidueSubset(design_to, select_pos, False)

# Add the TaskOperation to a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(packing_taskop)

# Create the PackerTask
packer_task = pack_tf.create_task_and_apply_taskoperations(pose)
print(packer_task)

#Packer_Task

Threads to request: ALL AVAILABLE

resid	pack?	design?	allowed_aas
1	TRUE	TRUE	ALA:NtermProteinFull,CYS:NtermProteinFull,ASP:NtermProteinFull,GLU:NtermProteinFull,PHE:NtermProteinFull,GLY:NtermProteinFull,HIS:NtermProteinFull,HIS_D:NtermProteinFull,ILE:NtermProteinFull,LYS:NtermProteinFull,LEU:NtermProteinFull,MET:NtermProteinFull,ASN:NtermProteinFull,PRO:NtermProteinFull,GLN:NtermProteinFull,ARG:NtermProteinFull,SER:NtermProteinFull,THR:NtermProteinFull,VAL:NtermProteinFull,TRP:NtermProteinFull,TYR:NtermProteinFull
2	TRUE	TRUE	ILE,LYS,GLN
3	TRUE	TRUE	ILE,LYS,GLN
4	TRUE	TRUE	ILE,LYS,GLN
5	TRUE	TRUE	ILE,LYS,GLN
6	TRUE	TRUE	ILE,LYS,GLN
7	TRUE	TRUE	ILE,LYS,GLN
8	TRUE	TRUE	ILE,LYS,GLN
9	TRUE	TRUE	ILE,LYS,GLN
10	TRUE	TRUE	ILE,LYS,GLN
11	TRUE	TRUE	ILE,LYS,GLN
12	TRUE	TRUE	ILE,LYS,GLN
13	TRUE	TRUE	ILE,LYS,GLN
14	TRUE	TRUE	ALA:CtermProteinFull,CYS:CtermProteinFull,ASP:CtermProteinFull,GLU:CtermProteinFull,PHE:CtermProteinFull,GLY:CtermProteinFull,HIS:CtermProteinFull,HIS_D:CtermP

#### 1.4 DisallowIfNonnativeRLT

Forbids mutation to the amino acid types in the list, but still allows rotamers of the amino acid type currently present at the position.
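
As a rough illustration of this rule (a sketch in plain Python, not Rosetta code; `disallow_if_nonnative` is a hypothetical helper name), the allowed set is all twenty canonical types minus the disallowed ones, with the native type exempt from removal:

```python
ALL_AAS = set("ACDEFGHIKLMNPQRSTVWY")

def disallow_if_nonnative(native, disallowed):
    """Remove the listed types unless they are the native type at this position."""
    banned = set(disallowed) - {native}
    return sorted(ALL_AAS - banned)

# Position 3 of the helix is LEU (L) according to the outputs in this chapter.
allowed = disallow_if_nonnative("L", "HLQKILRA")
print("".join(allowed))  # CDEFGLMNPSTVWY
```

LEU stays in the list for position 3 even though `L` is in the disallowed string, which is the behaviour visible in the printed PackerTask.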

In [6]:
# Define the allowed freedom
not_design_to = DisallowIfNonnativeRLT()
not_design_to.disallow_aas('HLQKILRA')

# Build a TaskOperation with OperateOnResidueSubset
packing_taskop = OperateOnResidueSubset(not_design_to, select_pos, False)

# Add the TaskOperation to a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(packing_taskop)

# Create the PackerTask
packer_task = pack_tf.create_task_and_apply_taskoperations(pose)
print(packer_task)

#Packer_Task

Threads to request: ALL AVAILABLE

resid	pack?	design?	allowed_aas
1	TRUE	TRUE	ALA:NtermProteinFull,CYS:NtermProteinFull,ASP:NtermProteinFull,GLU:NtermProteinFull,PHE:NtermProteinFull,GLY:NtermProteinFull,HIS:NtermProteinFull,HIS_D:NtermProteinFull,ILE:NtermProteinFull,LYS:NtermProteinFull,LEU:NtermProteinFull,MET:NtermProteinFull,ASN:NtermProteinFull,PRO:NtermProteinFull,GLN:NtermProteinFull,ARG:NtermProteinFull,SER:NtermProteinFull,THR:NtermProteinFull,VAL:NtermProteinFull,TRP:NtermProteinFull,TYR:NtermProteinFull
2	TRUE	TRUE	CYS,ASP,GLU,PHE,GLY,MET,ASN,PRO,SER,THR,VAL,TRP,TYR
3	TRUE	TRUE	CYS,ASP,GLU,PHE,GLY,LEU,MET,ASN,PRO,SER,THR,VAL,TRP,TYR
4	TRUE	TRUE	CYS,ASP,GLU,PHE,GLY,MET,ASN,PRO,GLN,SER,THR,VAL,TRP,TYR
5	TRUE	TRUE	CYS,ASP,GLU,PHE,GLY,LYS,MET,ASN,PRO,SER,THR,VAL,TRP,TYR
6	TRUE	TRUE	CYS,ASP,GLU,PHE,GLY,MET,ASN,PRO,SER,THR,VAL,TRP,TYR
7	TRUE	TRUE	CYS,ASP,GLU,PHE,GLY,MET,ASN,PRO,SER,THR,VAL,TRP,TYR
8	TRUE	TRUE	CYS,ASP,GLU,PHE,GLY,MET,ASN,PRO,SER,THR,VAL,TRP,TYR
9	TR

#### 1.5 IncludeCurrentRLT
Tells the Packer to also consider the rotamer state that each residue has in the input Pose while it runs.

In [7]:
# Build a TaskOperation with OperateOnResidueSubset
packing_taskop = OperateOnResidueSubset(IncludeCurrentRLT(), select_pos, False)

# Add the TaskOperation to a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(packing_taskop)

# Create the PackerTask
packer_task = pack_tf.create_task_and_apply_taskoperations(pose)

# Check whether the input rotamer conformation is included in packing:
packer_task.include_current(2)

True

In [8]:
# Check whether the input rotamer conformation is included in packing:
packer_task.include_current(1)

False

#### 1.6 ExtraRotamersGenericRLT
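The Rosetta packer samples rotamers discretely. By default it only uses the central, most populated conformation of each rotamer bin. The extra-rotamer controls increase this sampling: with the default expansion, conformations at +/-1 standard deviation around the mean $\chi$ are also considered.

ExtraRotamersGenericRLT has three kinds of parameters:
1. ex?: "?" takes a value from 1 to 4 and selects which side-chain $\chi$ angle gets expanded sampling.
 - ex1: extra sampling of the $\chi_{1}$ dihedral
 - ex1aro: extra sampling of the $\chi_{1}$ dihedral (aromatic amino acids only, FHWY)
 - ex1aro_exposed: extra sampling of the $\chi_{1}$ dihedral (aromatic amino acids only, FHWY), limited to positions exposed on the protein surface.
 - ex2: extra sampling of the $\chi_{2}$ dihedral
 - ex2aro: extra sampling of the $\chi_{2}$ dihedral (aromatic amino acids only, FHWY)
 - ex2aro_exposed: extra sampling of the $\chi_{2}$ dihedral (aromatic amino acids only, FHWY), limited to positions exposed on the protein surface.
 - ex3: extra sampling of the $\chi_{3}$ dihedral
 - ex4: extra sampling of the $\chi_{4}$ dihedral
2. ex?_sample_level: sets how many standard deviations around the $\chi$ angle are sampled.
 - 0 ...... no extra chi angles
 - 1 ...... sample at 1 standard deviation
 - 2 ...... sample at 1/2 standard deviation
 - 3 ...... sample at two full standard deviations
 - 4 ...... sample at two 1/2 standard deviations
 - 5 ...... sample at four 1/2 standard deviations
 - 6 ...... sample at three 1/3 standard deviations
 - 7 ...... sample at six 1/4 standard deviations

3. extrachi_cutoff: by default Rosetta does not build extra rotamers for residues on the protein surface unless the user sets this explicitly (EX_CUTOFF >=1-3, etc.). For each residue it counts the residues within 10 Å; when that count exceeds the threshold (18 by default), the residue is treated as "buried" and receives extra rotamer sampling. EX_CUTOFF is therefore often set explicitly to 0 so that every position receives extra rotamers.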
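
To make the sample levels concrete, the sketch below turns each level into a list of $\chi$ values around a mean. It is only one reading of the table above (for example, "two 1/2 standard deviations" is taken as steps of 0.5 SD, two on each side), and the mean, standard deviation, function names and the `>=` comparison in the cutoff check are all assumptions for illustration, not values or logic taken from Rosetta's rotamer library.

```python
# Multiples of the chi standard deviation sampled on each side of the mean.
# This mapping is an interpretation of the level table, not Rosetta source code.
LEVEL_STEPS = {
    0: [],
    1: [1.0],
    2: [0.5],
    3: [1.0, 2.0],
    4: [0.5, 1.0],
    5: [0.5, 1.0, 1.5, 2.0],
    6: [1 / 3, 2 / 3, 1.0],
    7: [0.25, 0.5, 0.75, 1.0, 1.25, 1.5],
}

def chi_samples(mean, sd, level):
    """Chi values sampled for one rotamer bin at the given sample level."""
    offsets = [k * sd for k in LEVEL_STEPS[level]]
    return sorted([mean - o for o in offsets] + [mean] + [mean + o for o in offsets])

def uses_extra_rotamers(n_neighbors, cutoff=18):
    """Assumed rule: extra rotamers are built once the neighbor count reaches the cutoff."""
    return n_neighbors >= cutoff

print(chi_samples(-60.0, 10.0, 1))       # [-70.0, -60.0, -50.0]
print(len(chi_samples(-60.0, 10.0, 7)))  # 13
print(uses_extra_rotamers(5, cutoff=0))  # True
```

Under this reading, level 1 triples the samples for one $\chi$ angle, so enabling it on both $\chi_{1}$ and $\chi_{2}$ multiplies the rotamers of an affected residue by roughly nine.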

In [9]:
# Configure extra rotamer sampling:
from pyrosetta.rosetta.core.select.residue_selector import ChainSelector
from pyrosetta.rosetta.core.pack.task.operation import RestrictToRepacking
from pyrosetta.rosetta.core.pack.task.operation import ExtraRotamers
from pyrosetta.rosetta.core.pack.task import ExtraRotSample

extract_chi = ExtraRotamersGenericRLT()
extract_chi.ex1(True)
extract_chi.ex2aro(True)
# extract_chi.ex1_sample_level(ExtraRotSample.EX_ONE_STDDEV) # levels 1-7 set the chi standard-deviation range
# extract_chi.ex2aro_sample_level(ExtraRotSample.EX_ONE_STDDEV) # levels 1-7 set the chi standard-deviation range
extract_chi.extrachi_cutoff(0) # number of neighbors a residue needs before extra chi sampling is used; 0 ignores the neighbor threshold (default is 18)

# Build a TaskOperation with OperateOnResidueSubset
select_pos = ChainSelector('1')
packing_taskop = OperateOnResidueSubset(extract_chi, select_pos, False)

# Add the TaskOperations to a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(RestrictToRepacking()) # restrict every position to repacking only
pack_tf.push_back(packing_taskop)

# Create the PackerTask
packer_task = pack_tf.create_task_and_apply_taskoperations(pose)

# Inspect the rotamer sampling settings of each residue:
print(packer_task.task_string(pose))

start
1 A NATAA EX ARO 1 EX ARO 2 EX_CUTOFF 0
2 A NATAA EX ARO 1 EX ARO 2 EX_CUTOFF 0
3 A NATAA EX ARO 1 EX ARO 2 EX_CUTOFF 0
4 A NATAA EX ARO 1 EX ARO 2 EX_CUTOFF 0
5 A NATAA EX ARO 1 EX ARO 2 EX_CUTOFF 0
6 A NATAA EX ARO 1 EX ARO 2 EX_CUTOFF 0
7 A NATAA EX ARO 1 EX ARO 2 EX_CUTOFF 0
8 A NATAA EX ARO 1 EX ARO 2 EX_CUTOFF 0
9 A NATAA EX ARO 1 EX ARO 2 EX_CUTOFF 0
10 A NATAA EX ARO 1 EX ARO 2 EX_CUTOFF 0
11 A NATAA EX ARO 1 EX ARO 2 EX_CUTOFF 0
12 A NATAA EX ARO 1 EX ARO 2 EX_CUTOFF 0
13 A NATAA EX ARO 1 EX ARO 2 EX_CUTOFF 0
14 A NATAA EX ARO 1 EX ARO 2 EX_CUTOFF 0



In [10]:
from pyrosetta.rosetta.protocols.minimization_packing import PackRotamersMover
pack_mover = PackRotamersMover()
pack_mover.task_factory(pack_tf)
pack_mover.apply(pose)

core.scoring.ScoreFunctionFactory: {0} SCOREFUNCTION: ref2015
core.scoring.etable: {0} Starting energy table calculation
core.scoring.etable: {0} smooth_etable: changing atr/rep split to bottom of energy well
core.scoring.etable: {0} smooth_etable: spline smoothing lj etables (maxdis = 6)
core.scoring.etable: {0} smooth_etable: spline smoothing solvation etables (max_dis = 6)
core.scoring.etable: {0} Finished calculating energy tables.
basic.io.database: {0} Database file opened: scoring/score_functions/hbonds/ref2015_params/HBPoly1D.csv
basic.io.database: {0} Database file opened: scoring/score_functions/hbonds/ref2015_params/HBFadeIntervals.csv
basic.io.database: {0} Database file opened: scoring/score_functions/hbonds/ref2015_params/HBEval.csv
basic.io.database: {0} Database file opened: scoring/score_functions/hbonds/ref2015_params/DonStrength.csv
basic.io.database: {0} Database file op

The run log contains the line "built 717 rotamers at 14 positions.", which means 717 rotamers were used when the InteractionGraph was computed.

#### Exercise
Try changing the "ex?" parameters and see how the number of rotamers in the InteractionGraph changes.

## II. Specialized Operations

### 2.1 Position/Identity Specification
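The TaskOperations in this category are defined automatically from residue position and identity. They are grouped as follows:
- General Specification (almost all of these can be reproduced with Residue Level TaskOperations, so they are not covered here)
- General Design Specification (partly overlaps with Residue Level TaskOperations, so only a selection is described)
- Property-based specification
- Interface/Neighborhood Specifications
- Input-based design

This part contains the great majority of TaskOperations that matter in practical work.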

### 2.1.1 General Design Specification

#### 1. RestrictToSpecifiedBaseResidueTypes
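Operations on BaseResidueTypes adjust the PackerPalette before the PackerTask is generated.
The PackerPalette defines the "default" list of amino acids; by default, rotamers of the 20 canonical amino acids may be sampled. An amino acid type that is not in this preset list will not appear in any later step.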

In [11]:
from pyrosetta.rosetta.core.pack.task.operation import RestrictToSpecifiedBaseResidueTypes
from pyrosetta.rosetta.core.pack.task.operation import RestrictToRepacking
from pyrosetta.rosetta.utility import vector1_std_string
# Allowed base types (no trailing comma, which would append an empty name)
mut_table = "TYR,VAL,TRP,CYS"
mut_table_list = vector1_std_string()
for aa in mut_table.split(','):
    mut_table_list.append(aa)

# Restrict to the listed base types
restric_to_basetype = RestrictToSpecifiedBaseResidueTypes()
restric_to_basetype.set_base_types(mut_table_list)

# Add the TaskOperations to a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(restric_to_basetype)
pack_tf.push_back(RestrictToRepacking())

# Create the PackerTask
packer_task = pack_tf.create_task_and_apply_taskoperations(pose)
print(packer_task)

#Packer_Task

Threads to request: ALL AVAILABLE

resid	pack?	design?	allowed_aas
1	FALSE	FALSE	
2	FALSE	FALSE	
3	FALSE	FALSE	
4	FALSE	FALSE	
5	FALSE	FALSE	
6	TRUE	FALSE	TRP
7	TRUE	FALSE	VAL
8	FALSE	FALSE	
9	FALSE	FALSE	
10	FALSE	FALSE	
11	FALSE	FALSE	
12	FALSE	FALSE	
13	FALSE	FALSE	
14	FALSE	FALSE	



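**So even though every position is set to repacking, once an amino acid type has been removed from the PackerPalette preset list, the rotamer freedom of some positions becomes "empty"!**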
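
The "empty" positions follow from intersecting two restrictions. Below is a small pure-Python sketch of that interaction, assuming each restriction simply intersects the allowed set; the helper names are hypothetical, and the four native types are the ones shown for residues 5 to 8 in the outputs of this chapter.

```python
PALETTE = {
    "ALA", "CYS", "ASP", "GLU", "PHE", "GLY", "HIS", "ILE", "LYS", "LEU",
    "MET", "ASN", "PRO", "GLN", "ARG", "SER", "THR", "VAL", "TRP", "TYR",
}

def restrict_to_base_types(allowed, base_types):
    """Keep only the listed base types."""
    return allowed & set(base_types)

def restrict_to_repacking(allowed, native):
    """Keep only the native type, if it is still allowed."""
    return allowed & {native}

natives = {5: "LYS", 6: "TRP", 7: "VAL", 8: "GLU"}
task = {}
for pos, native in natives.items():
    allowed = restrict_to_base_types(PALETTE, ["TYR", "VAL", "TRP", "CYS"])
    task[pos] = sorted(restrict_to_repacking(allowed, native))

print(task)  # {5: [], 6: ['TRP'], 7: ['VAL'], 8: []}
```

Only residues whose native type is in the allowed base-type list (TRP at 6, VAL at 7) keep anything to repack, matching the `TRUE FALSE` rows in the printed PackerTask.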

#### 2. ProhibitSpecifiedBaseResidueTypes
Following the same logic as RestrictToSpecifiedBaseResidueTypes, ProhibitSpecifiedBaseResidueTypes specifies which amino acid types should be "prohibited".

In [12]:
from pyrosetta.rosetta.core.pack.task.operation import ProhibitSpecifiedBaseResidueTypes
from pyrosetta.rosetta.utility import vector1_std_string
# Base types to prohibit
not_mut_table = "ALA,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR"
not_mut_table_list = vector1_std_string()
for aa in not_mut_table.split(','):
    not_mut_table_list.append(aa)

# Prohibit the listed base types
restric_to_basetype = ProhibitSpecifiedBaseResidueTypes()
restric_to_basetype.set_base_types(not_mut_table_list)

# Add the TaskOperation to a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(restric_to_basetype)

# Create the PackerTask
packer_task = pack_tf.create_task_and_apply_taskoperations(pose)
print(packer_task)

#Packer_Task

Threads to request: ALL AVAILABLE

resid	pack?	design?	allowed_aas
1	TRUE	TRUE	CYS:NtermProteinFull,VAL:NtermProteinFull,TRP:NtermProteinFull,TYR:NtermProteinFull
2	TRUE	TRUE	CYS,VAL,TRP,TYR
3	TRUE	TRUE	CYS,VAL,TRP,TYR
4	TRUE	TRUE	CYS,VAL,TRP,TYR
5	TRUE	TRUE	CYS,VAL,TRP,TYR
6	TRUE	TRUE	CYS,VAL,TRP,TYR
7	TRUE	TRUE	CYS,VAL,TRP,TYR
8	TRUE	TRUE	CYS,VAL,TRP,TYR
9	TRUE	TRUE	CYS,VAL,TRP,TYR
10	TRUE	TRUE	CYS,VAL,TRP,TYR
11	TRUE	TRUE	CYS,VAL,TRP,TYR
12	TRUE	TRUE	CYS,VAL,TRP,TYR
13	TRUE	TRUE	CYS,VAL,TRP,TYR
14	TRUE	TRUE	CYS:CtermProteinFull,VAL:CtermProteinFull,TRP:CtermProteinFull,TYR:CtermProteinFull



#### 3. ReadResfile
The rules for writing a Resfile were covered earlier. Here the resfile is written as follows:
```
NATAA
EX 1 EX 2
START
1 A PIKAA A
2 A PIKAA AKR
3 A PIKAA APM
4 A PIKAA APT
```
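This Resfile defines the following freedom:
The NATAA default means that every residue not listed explicitly may only be repacked. In addition, position 1 of chain A may only become ALA, position 2 of chain A may only become ALA, LYS or ARG, and so on.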
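
To show how the default line and the per-residue lines combine, here is a deliberately minimal parser sketch in plain Python. It handles only the constructs used in this example (a default command in the header, `START`, and `PIKAA` lines); it is not the Rosetta resfile reader, and `parse_resfile` and `allowed_aas` are hypothetical helper names.

```python
RESFILE = """\
NATAA
EX 1 EX 2
START
1 A PIKAA A
2 A PIKAA AKR
3 A PIKAA APM
4 A PIKAA APT
"""

def parse_resfile(text):
    """Split a resfile into its default command and per-residue commands."""
    header, _, body = text.partition("START")
    default = header.split()[0]  # first header token, e.g. NATAA
    per_residue = {}
    for line in body.strip().splitlines():
        resid, chain, command, *args = line.split()
        per_residue[(int(resid), chain)] = (command, "".join(args))
    return default, per_residue

def allowed_aas(pos, chain, native, default, per_residue):
    """One-letter codes allowed at a position (PIKAA and NATAA only)."""
    command, arg = per_residue.get((pos, chain), (default, ""))
    if command == "PIKAA":
        return sorted(arg)
    if command == "NATAA":
        return [native]
    raise ValueError(f"command not handled in this sketch: {command}")

default, per_residue = parse_resfile(RESFILE)
print(allowed_aas(2, "A", "E", default, per_residue))  # ['A', 'K', 'R']
print(allowed_aas(5, "A", "K", default, per_residue))  # ['K']
```

Residue 5 has no line of its own, so it falls back to the NATAA default and keeps only its native type.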

In [13]:
from pyrosetta.rosetta.core.pack.task.operation import ReadResfile
# Read the freedom definitions from the resfile
resfile_type = ReadResfile('./data/mutation.resfile')

# Add the TaskOperation to a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(resfile_type)

# Create the PackerTask
packer_task = pack_tf.create_task_and_apply_taskoperations(pose)
print(packer_task)

#Packer_Task

Threads to request: ALL AVAILABLE

resid	pack?	design?	allowed_aas
1	TRUE	TRUE	ALA:NtermProteinFull
2	TRUE	TRUE	ALA,LYS,ARG
3	TRUE	TRUE	ALA,MET,PRO
4	TRUE	TRUE	ALA,PRO,THR
5	TRUE	FALSE	LYS
6	TRUE	FALSE	TRP
7	TRUE	FALSE	VAL
8	TRUE	FALSE	GLU
9	TRUE	FALSE	GLN
10	TRUE	FALSE	ALA
11	TRUE	FALSE	GLU
12	TRUE	FALSE	ARG
13	TRUE	FALSE	ASN
14	TRUE	FALSE	GLY:CtermProteinFull



#### 4. LinkResidues
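LinkResidues chains positions together: it declares that one residue and a set of other residues must always mutate to the same residue type. However, this code has consistently caused kernel crashes in PyRosetta and cannot be used for now (2017-2021).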

### 2.1.2 Property-based specification

#### 1. RestrictToResidueProperties
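Specifies and modifies the amino acid types in the PackerPalette preset list according to the given residue properties.

The available residue properties are all recorded in pyrosetta.rosetta.core.chemical.ResidueProperty.
Commonly used options:
- METAL: metal ions
- POLAR: polar amino acids
- HYDROPHOBIC: hydrophobic amino acids
- CHARGED: charged amino acids
- NEGATIVE_CHARGE: negatively charged amino acids
- POSITIVE_CHARGE: positively charged amino acids
- AROMATIC: aromatic amino acids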

In [14]:
from pyrosetta.rosetta.core.pack.task.operation import RestrictToResidueProperties
from pyrosetta.rosetta.core.chemical import ResidueProperty
from pyrosetta.rosetta.utility import vector1_core_chemical_ResidueProperty

# Properties to allow
properties = vector1_core_chemical_ResidueProperty()
properties.append(ResidueProperty.NEGATIVE_CHARGE) # allow only negatively charged amino acids

restrict_to_properties = RestrictToResidueProperties()
restrict_to_properties.set_properties(properties)
# A selector can be given to limit the range:
# restrict_to_properties.set_selector()

# Add the TaskOperation to a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(restrict_to_properties)

# Create the PackerTask
packer_task = pack_tf.create_task_and_apply_taskoperations(pose)
print(packer_task)

#Packer_Task

Threads to request: ALL AVAILABLE

resid	pack?	design?	allowed_aas
1	TRUE	TRUE	ASP:NtermProteinFull,GLU:NtermProteinFull
2	TRUE	TRUE	ASP,GLU
3	TRUE	TRUE	ASP,GLU
4	TRUE	TRUE	ASP,GLU
5	TRUE	TRUE	ASP,GLU
6	TRUE	TRUE	ASP,GLU
7	TRUE	TRUE	ASP,GLU
8	TRUE	TRUE	ASP,GLU
9	TRUE	TRUE	ASP,GLU
10	TRUE	TRUE	ASP,GLU
11	TRUE	TRUE	ASP,GLU
12	TRUE	TRUE	ASP,GLU
13	TRUE	TRUE	ASP,GLU
14	TRUE	TRUE	ASP:CtermProteinFull,GLU:CtermProteinFull



#### 2. ProhibitResidueProperties
Following the same logic as RestrictToResidueProperties, ProhibitResidueProperties specifies which properties cause an amino acid type to be "removed".

In [15]:
from pyrosetta.rosetta.core.pack.task.operation import ProhibitResidueProperties
from pyrosetta.rosetta.core.chemical import ResidueProperty
from pyrosetta.rosetta.utility import vector1_core_chemical_ResidueProperty

# Properties to prohibit
not_properties = vector1_core_chemical_ResidueProperty()
not_properties.append(ResidueProperty.POLAR) # prohibit amino acids with the polar property

restrict_to_properties = ProhibitResidueProperties()
restrict_to_properties.set_properties(not_properties)
# A selector can be given to limit the range:
# restrict_to_properties.set_selector()

# Add the TaskOperation to a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(restrict_to_properties)

# Create the PackerTask
packer_task = pack_tf.create_task_and_apply_taskoperations(pose)
print(packer_task)

#Packer_Task

Threads to request: ALL AVAILABLE

resid	pack?	design?	allowed_aas
1	TRUE	TRUE	ALA:NtermProteinFull,CYS:NtermProteinFull,PHE:NtermProteinFull,GLY:NtermProteinFull,ILE:NtermProteinFull,LEU:NtermProteinFull,MET:NtermProteinFull,PRO:NtermProteinFull,VAL:NtermProteinFull,TRP:NtermProteinFull,TYR:NtermProteinFull
2	TRUE	TRUE	ALA,CYS,PHE,GLY,ILE,LEU,MET,PRO,VAL,TRP,TYR
3	TRUE	TRUE	ALA,CYS,PHE,GLY,ILE,LEU,MET,PRO,VAL,TRP,TYR
4	TRUE	TRUE	ALA,CYS,PHE,GLY,ILE,LEU,MET,PRO,VAL,TRP,TYR
5	TRUE	TRUE	ALA,CYS,PHE,GLY,ILE,LEU,MET,PRO,VAL,TRP,TYR
6	TRUE	TRUE	ALA,CYS,PHE,GLY,ILE,LEU,MET,PRO,VAL,TRP,TYR
7	TRUE	TRUE	ALA,CYS,PHE,GLY,ILE,LEU,MET,PRO,VAL,TRP,TYR
8	TRUE	TRUE	ALA,CYS,PHE,GLY,ILE,LEU,MET,PRO,VAL,TRP,TYR
9	TRUE	TRUE	ALA,CYS,PHE,GLY,ILE,LEU,MET,PRO,VAL,TRP,TYR
10	TRUE	TRUE	ALA,CYS,PHE,GLY,ILE,LEU,MET,PRO,VAL,TRP,TYR
11	TRUE	TRUE	ALA,CYS,PHE,GLY,ILE,LEU,MET,PRO,VAL,TRP,TYR
12	TRUE	TRUE	ALA,CYS,PHE,GLY,ILE,LEU,MET,PRO,VAL,TRP,TYR
13	TRUE	TRUE	ALA,CYS,PHE,GLY,ILE,LEU,MET,PRO,VAL,TRP,TYR


The amino acid types allowed for design are now limited to the non-polar ones.
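
The two property-based operations can be summarized as a filter over a property table. The sketch below is plain Python with a hand-written table: the POLAR and NEGATIVE_CHARGE memberships are read off the two outputs printed in this section rather than taken from the Rosetta database, the helper names are hypothetical, and only the single-property case is modelled.

```python
ALL_AAS = set("ACDEFGHIKLMNPQRSTVWY")

# Membership inferred from the outputs printed in this section (an assumption).
PROPERTIES = {
    "POLAR": set("DEHKNQRST"),
    "NEGATIVE_CHARGE": set("DE"),
}

def restrict_to_property(prop):
    """Keep only the types that carry the property."""
    return "".join(sorted(ALL_AAS & PROPERTIES[prop]))

def prohibit_property(prop):
    """Remove every type that carries the property."""
    return "".join(sorted(ALL_AAS - PROPERTIES[prop]))

print(restrict_to_property("NEGATIVE_CHARGE"))  # DE
print(prohibit_property("POLAR"))               # ACFGILMPVWY
```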

#### 3. ConservativeDesignOperation
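ConservativeDesignOperation designs proteins from amino-acid substitution frequencies: it decides which residue types are allowed at each position from a chosen substitution matrix, such as BLOSUM62. Every amino acid whose substitution score against the native residue is high enough is allowed, and together these types define the freedom at that position. The source text states the condition as a score greater than 0, but the output below also keeps zero-scoring substitutions (for example ALA to CYS at position 10), so the effective condition appears to be a score of 0 or above.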
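
The selection rule can be sketched with two rows of the BLOSUM62 matrix. This is an illustration in plain Python, not the Rosetta implementation: the function name is hypothetical, only the ALA and TRP rows are included, and the `>= 0` threshold is an assumption read off the output printed in this section.

```python
# Column order of the standard BLOSUM62 table.
AAS = "ARNDCQEGHILKMFPSTWYV"

# BLOSUM62 rows for ALA and TRP.
BLOSUM62_ROWS = {
    "A": [4, -1, -2, -2, 0, -1, -1, 0, -2, -1, -1, -1, -1, -2, -1, 1, 0, -3, -2, 0],
    "W": [-3, -3, -4, -4, -2, -2, -3, -2, -2, -3, -2, -3, -1, 1, -4, -3, -2, 11, 2, -3],
}

def conservative_set(native, min_score=0):
    """Types whose substitution score against the native type reaches min_score."""
    row = BLOSUM62_ROWS[native]
    return sorted(aa for aa, score in zip(AAS, row) if score >= min_score)

print(conservative_set("A"))  # ['A', 'C', 'G', 'S', 'T', 'V']
print(conservative_set("W"))  # ['F', 'W', 'Y']
```

These two sets correspond to the rows printed for residue 10 (ALA) and residue 6 (TRP). Raising `min_score` to 1 would drop CYS, GLY, THR and VAL from the ALA set.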

In [16]:
from pyrosetta.rosetta.protocols.task_operations import ConservativeDesignOperation
Conserve_design = ConservativeDesignOperation()
Conserve_design.set_data_source('blosum62') # choose the substitution matrix: chothia_76 or a BLOSUM matrix (30-100), e.g. blosum62

# Add the TaskOperation to a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(Conserve_design)

# Create the PackerTask
packer_task = pack_tf.create_task_and_apply_taskoperations(pose)
print(packer_task)

protocols.task_operations.ConservativeDesignOperation: {0} Loading conservative mutational data
#Packer_Task

Threads to request: ALL AVAILABLE

resid	pack?	design?	allowed_aas
1	TRUE	TRUE	ASP:NtermProteinFull,GLU:NtermProteinFull,ASN:NtermProteinFull,GLN:NtermProteinFull,SER:NtermProteinFull
2	TRUE	TRUE	ASP,GLU,HIS,HIS_D,LYS,ASN,GLN,ARG,SER
3	TRUE	TRUE	PHE,ILE,LEU,MET,VAL
4	TRUE	TRUE	ASP,GLU,HIS,HIS_D,LYS,MET,ASN,GLN,ARG,SER
5	TRUE	TRUE	GLU,LYS,ASN,GLN,ARG,SER
6	TRUE	TRUE	PHE,TRP,TYR
7	TRUE	TRUE	ALA,ILE,LEU,MET,THR,VAL
8	TRUE	TRUE	ASP,GLU,HIS,HIS_D,LYS,ASN,GLN,ARG,SER
9	TRUE	TRUE	ASP,GLU,HIS,HIS_D,LYS,MET,ASN,GLN,ARG,SER
10	TRUE	TRUE	ALA,CYS,GLY,SER,THR,VAL
11	TRUE	TRUE	ASP,GLU,HIS,HIS_D,LYS,ASN,GLN,ARG,SER
12	TRUE	TRUE	GLU,HIS,HIS_D,LYS,ASN,GLN,ARG
13	TRUE	TRUE	ASP,GLU,GLY,HIS,HIS_D,LYS,ASN,GLN,ARG,SER,THR
14	TRUE	TRUE	ALA:CtermProteinFull,GLY:CtermProteinFull,ASN:CtermProteinFull,SER:CtermProteinFull



#### 4. ConsensusLoopDesign
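ConsensusLoopDesign first locates the loop regions from the defined secondary structure, then assigns each loop a type (an ABEGO string) based on the backbone dihedral angles of the Pose.
For example, if a loop of 4 residues has the dihedral type GGBB, the operation searches a database for the sequences of loops of the same type, counts the amino-acid frequencies at each position, and uses an enrichment calculation to keep the amino acids that occur frequently in that region of dihedral space. **This makes it very useful for designing loops with a regular structure!**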
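
The two ideas in this description, an ABEGO string and an enrichment filter, can be sketched as follows. This is plain Python under stated assumptions: the torsion bin boundaries are the commonly quoted ABEGO definition and may not match Rosetta's exact boundaries, the torsion angles and enrichment scores are made-up numbers, and the filter simply keeps amino acids whose score reaches the threshold (the real enrichment formula is not shown in this chapter).

```python
def abego(phi, psi, omega=180.0):
    """Assign one ABEGO letter from backbone torsions in degrees (assumed bin boundaries)."""
    if abs(omega) < 90.0:
        return "O"  # cis peptide bond
    if phi < 0.0:
        return "A" if -75.0 <= psi < 50.0 else "B"
    return "G" if -100.0 <= psi < 100.0 else "E"

def abego_string(torsions):
    """ABEGO string for a list of (phi, psi) pairs."""
    return "".join(abego(phi, psi) for phi, psi in torsions)

def enriched(scores, threshold):
    """Amino acids whose (hypothetical) enrichment score reaches the threshold."""
    return sorted(aa for aa, score in scores.items() if score >= threshold)

# Hypothetical torsions for a 4-residue loop.
loop = [(60.0, 30.0), (70.0, 20.0), (-120.0, 130.0), (-110.0, 140.0)]
print(abego_string(loop))  # GGBB

# Hypothetical enrichment scores at one loop position.
scores = {"G": 1.2, "N": 0.6, "D": 0.4, "L": -0.8}
print(enriched(scores, 0.3))  # ['D', 'G', 'N']
print(enriched(scores, 0.5))  # ['G', 'N']
```

A higher threshold leaves fewer amino acids at the position, which is the sense in which 0.5 is stricter than 0.0.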

In [17]:
# denovo_pose
denovo_pose = pose_from_pdb('./data/EHEE_rd4_0976.pdb')

# Get the secondary structure of the pose:
from pyrosetta.rosetta.protocols.membrane import get_secstruct
secstruct = ''.join(get_secstruct(denovo_pose))

# ConsensusLoopDesign
from pyrosetta.rosetta.protocols.denovo_design.task_operations import ConsensusLoopDesignOperation
consensu_loop_design = ConsensusLoopDesignOperation()
# Alternatively, set the secondary structure with a string such as 'EHHHL'
consensu_loop_design.set_secstruct(secstruct)
# Also treat the residues at +1/-1 next to each loop conservatively; for example, PRO is naturally favoured around some loops. (Default is False)
consensu_loop_design.set_include_adjacent_residues(False) 
# Amino acids whose enrichment falls below the threshold are forbidden at that position; 0.5 is stricter than 0.0 and demands higher enrichment.
consensu_loop_design.set_enrichment_threshold(0.3) 

core.import_pose.import_pose: {0} File './data/EHEE_rd4_0976.pdb' automatically determined to be of type PDB
protocols.DsspMover: {0} LEEEEELLHHHHHHHHHHHHHLLLLEEEEEELLEEEEEEL


In [18]:
# Add the TaskOperation to a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(consensu_loop_design)

# Create the PackerTask
packer_task = pack_tf.create_task_and_apply_taskoperations(denovo_pose)
print(packer_task)

core.select.residue_selector.SecondaryStructureSelector: {0} Using given pose secondary structure: LEEEEELLHHHHHHHHHHHHHLLLLEEEEEELLEEEEEEL
protocols.denovo_design.task_operations.ConsensusLoopDesignOperation: {0} Restricting AAs in loop: Start: 7 Abego: BBEA Before: E After: H
basic.io.database: {0} Database file opened: protocol_data/denovo_design/aa_abego_frequencies.gz
protocols.denovo_design.task_operations.ConsensusLoopDesignOperation: {0} Residue: 7; forbidden aas: DEFGHIKLNPQRVY
protocols.denovo_design.task_operations.ConsensusLoopDesignOperation: {0} Residue: 8; forbidden aas: ACDEFHIKLMNPQRSTVY
protocols.denovo_design.task_operations.ConsensusLoopDesignOperation: {0} Restricting AAs in loop: Start: 22 Abego: AAGBBB Before: H After: E
protocols.denovo_design.task_operations.ConsensusLoopDesignOperation: {0} Residue: 22; forbidden aas: CDFGIPSTVW
protocols.denovo_design.task_operations.ConsensusLoopDesignOperation: {0}

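As shown, the loop positions 7, 8, 22, 23, 24, 32 and 33 are all restricted to the amino acid types that commonly occur in their current loop conformation. Try adjusting the enrichment_threshold parameter and see how the freedom changes.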

#### 5. DsspDesign
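Sets the allowed amino acids (rotamer range) of each residue according to its secondary-structure type. The allowed mutations per secondary-structure type are defined as follows: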
* Helix: ADEFIKLNQRSTVWY
* Strand: DEFHIKLNQRSTVWY
* Loop: ACDEFGHIKLMNPQRSTVWY
* HelixStart: ADEFHIKLNPQRSTVWY
* HelixCapping: DNST
* Nterm: ACDEFGHIKLMNPQRSTVWY
* Cterm: ACDEFGHIKLMNPQRSTVWY
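
The table above can be applied to a secondary-structure string as in the sketch below. This is plain Python, not the DsspDesignOperation code, and several points are assumptions made for illustration: HelixStart is taken as the first residue of a helix, HelixCapping as the loop residue immediately before a helix, and the termini take precedence over everything else. The string is the DSSP string printed in the logs of this chapter.

```python
DEFAULTS = {
    "Helix": "ADEFIKLNQRSTVWY",
    "Strand": "DEFHIKLNQRSTVWY",
    "Loop": "ACDEFGHIKLMNPQRSTVWY",
    "HelixStart": "ADEFHIKLNPQRSTVWY",
    "HelixCapping": "DNST",
    "Nterm": "ACDEFGHIKLMNPQRSTVWY",
    "Cterm": "ACDEFGHIKLMNPQRSTVWY",
}

def region(ss, resid):
    """Region name for a 1-based residue index (assumed precedence rules)."""
    i = resid - 1
    if i == 0:
        return "Nterm"
    if i == len(ss) - 1:
        return "Cterm"
    if ss[i] == "H":
        return "HelixStart" if ss[i - 1] != "H" else "Helix"
    if ss[i] == "E":
        return "Strand"
    if ss[i + 1] == "H":
        return "HelixCapping"  # loop residue right before a helix
    return "Loop"

ss = "LEEEEELLHHHHHHHHHHHHHLLLLEEEEEELLEEEEEEL"
for resid in (1, 2, 7, 8, 9, 10, 40):
    name = region(ss, resid)
    print(resid, name, DEFAULTS[name])
```

Residue 2 maps to the Strand set, which agrees with the types printed for residue 2 in the PackerTask of this section once HIS_D is counted together with HIS.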

In [19]:
from pyrosetta.rosetta.protocols.task_operations import DsspDesignOperation
dssp_design = DsspDesignOperation()

# To adjust the amino acid types per secondary structure, add further settings:
# dssp_design.set_restrictions_aa('Helix','ADEF') # secondary structure, amino acids (replace the allowed set for that secondary structure)
# dssp_design.set_restrictions_append('Helix','ADEF') # secondary structure, amino acids (add to the allowed set for that secondary structure)
# dssp_design.set_restrictions_exclude('Helix','ADEF') # secondary structure, amino acids (remove from the allowed set for that secondary structure)

# Add the TaskOperation to a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(dssp_design)

# Create the PackerTask
packer_task = pack_tf.create_task_and_apply_taskoperations(denovo_pose)
print(packer_task)

protocols.TaskOperations.DsspDesignOperation: {0} Initializing DSSP regions with default residues
protocols.TaskOperations.DsspDesignOperation: {0} Secondary structure: LEEEEELLHHHHHHHHHHHHHLLLLEEEEEELLEEEEEEL
#Packer_Task

Threads to request: ALL AVAILABLE

resid	pack?	design?	allowed_aas
1	TRUE	TRUE	ALA:NtermProteinFull,CYS:NtermProteinFull,ASP:NtermProteinFull,GLU:NtermProteinFull,PHE:NtermProteinFull,GLY:NtermProteinFull,HIS:NtermProteinFull,HIS_D:NtermProteinFull,ILE:NtermProteinFull,LYS:NtermProteinFull,LEU:NtermProteinFull,MET:NtermProteinFull,ASN:NtermProteinFull,PRO:NtermProteinFull,GLN:NtermProteinFull,ARG:NtermProteinFull,SER:NtermProteinFull,THR:NtermProteinFull,VAL:NtermProteinFull,TRP:NtermProteinFull,TYR:NtermProteinFull
2	TRUE	TRUE	ASP,GLU,PHE,HIS,HIS_D,ILE,LYS,LEU,ASN,GLN,ARG,SER,THR,VAL,TRP,TYR
3	TRUE	TRUE	ASP,GLU,PHE,HIS,HIS_D,ILE,LYS,LEU,ASN,GLN,ARG,SER,THR,VAL,TRP,TYR
4	TRUE	TRUE	ASP,GLU,PHE,HIS,HIS_D,ILE,LYS,LEU,ASN,GLN,ARG,SER,THR,VAL,TRP,TYR
5	TR

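#### 6. DesignByCavityProximity (awaiting release)
A TaskOperation added in 2021. It identifies cavities inside the protein and chooses the available amino-acid freedom according to the approximate size of those cavities.
Parameters:
- region_shell: every residue within X Å of the selected residue is set to designable; default is 8 Å
- regions_to_design: how many designable centre residues to choose; default is 1
- repack_non_selected: as the name suggests, residues outside region_shell may only be repacked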
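
The region_shell idea can be pictured as a distance filter around a chosen centre. The sketch below is plain Python with made-up coordinates; it does not detect cavities and is not the DesignByCavityProximity implementation. The helper names, the use of a single representative point per residue, and the "fixed" state used when `repack_non_selected` is off are all assumptions.

```python
import math

def residues_in_shell(coords, center, region_shell=8.0):
    """Residues whose representative point lies within region_shell of the centre residue."""
    origin = coords[center]
    return sorted(r for r, xyz in coords.items() if math.dist(xyz, origin) <= region_shell)

def build_task(coords, center, region_shell=8.0, repack_non_selected=True):
    """Label each residue as design, repack or fixed (toy stand-in for a PackerTask)."""
    shell = set(residues_in_shell(coords, center, region_shell))
    outside = "repack" if repack_non_selected else "fixed"
    return {r: ("design" if r in shell else outside) for r in coords}

# Hypothetical coordinates in Å, spaced 3.8 Å apart along x.
coords = {1: (0.0, 0.0, 0.0), 2: (3.8, 0.0, 0.0), 3: (7.6, 0.0, 0.0), 4: (11.4, 0.0, 0.0)}
print(residues_in_shell(coords, center=1))  # [1, 2, 3]
print(build_task(coords, center=1))
```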

In [20]:
# from pyrosetta.rosetta.protocols import rosetta_scripts
# xml = rosetta_scripts.XmlObjects.create_from_string('''
# <TASKOPERATIONS>
# <DesignByCavityProximity name="des_cavity" region_shell="8.0" regions_to_design="1" repack_non_selected="0" />
# </TASKOPERATIONS>
# ''')
# des_cavity = xml.get_task_operation('des_cavity')

# # Add the TaskOperation to a TaskFactory
# pack_tf = TaskFactory()
# pack_tf.push_back(des_cavity)

# # Create the PackerTask
# packer_task = pack_tf.create_task_and_apply_taskoperations(denovo_pose)
# print(packer_task)

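#### 7. DesignByResidueCentrality (awaiting release)
A TaskOperation added in 2021. It decides which positions may be designed from the residue interaction network inside the protein.
In practice it first computes the intra-protein interaction network of the whole Pose, then uses the centrality of each residue to set the probability that the position is selected. A higher centrality means more residues surround that position. Presumably, therefore, the freedom in the PackerTask differs from one run to the next, and the algorithm favours residues that matter for structure or function, while regions that contribute little to stability are designed less often.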
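
A centrality-weighted choice can be sketched as follows. This is a toy model in plain Python: it uses degree centrality (number of contacts) on a hypothetical contact list and samples one centre with probability proportional to that degree. Which centrality measure and sampling scheme the real operation uses is not stated in this chapter, so both are assumptions.

```python
import random

def degree_centrality(contacts, n_res):
    """Count the contacts of each residue (1-based indices)."""
    degree = {r: 0 for r in range(1, n_res + 1)}
    for a, b in contacts:
        degree[a] += 1
        degree[b] += 1
    return degree

def pick_center(degree, rng):
    """Sample one residue with probability proportional to its degree."""
    residues = list(degree)
    weights = [degree[r] for r in residues]
    return rng.choices(residues, weights=weights, k=1)[0]

# Hypothetical contact network for four residues.
contacts = [(1, 2), (2, 3), (2, 4), (3, 4)]
degree = degree_centrality(contacts, n_res=4)
print(degree)  # {1: 1, 2: 3, 3: 2, 4: 2}

rng = random.Random(0)
picks = [pick_center(degree, rng) for _ in range(1000)]
print({r: picks.count(r) for r in degree})
```

Because the centre is drawn at random, repeated runs with different seeds give different design regions, with the most connected residue (2 here) chosen most often on average.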

In [21]:
# from pyrosetta.rosetta.protocols import rosetta_scripts
# xml = rosetta_scripts.XmlObjects.create_from_string('''
# <TASKOPERATIONS>
# <DesignByResidueCentrality name="des_by_centrality" region_shell="8.0" regions_to_design="1" repack_non_selected="0" />
# </TASKOPERATIONS>
# ''')
# des_by_centrality = xml.get_task_operation('des_by_centrality')

# # Add the TaskOperation to a TaskFactory
# pack_tf = TaskFactory()
# pack_tf.push_back(des_by_centrality)

# # Create the PackerTask
# packer_task = pack_tf.create_task_and_apply_taskoperations(denovo_pose)
# print(packer_task)

#### 8. DesignRandomRegion (awaiting release)
A TaskOperation added in 2021. It simply picks some residues of the Pose at random for design. The design logic is again the region_shell approach.

In [22]:
# from pyrosetta.rosetta.protocols import rosetta_scripts
# xml = rosetta_scripts.XmlObjects.create_from_string('''
# <TASKOPERATIONS>
# <DesignRandomRegion name="des_random" region_shell="8.0" regions_to_design="1" repack_non_selected="0" />
# </TASKOPERATIONS>
# ''')
# des_random = xml.get_task_operation('des_random')

# # Add the TaskOperation to a TaskFactory
# pack_tf = TaskFactory()
# pack_tf.push_back(des_random)

# # Create the PackerTask
# packer_task = pack_tf.create_task_and_apply_taskoperations(denovo_pose)
# print(packer_task)

#### 9. NoRepackDisulfides
Sets the disulfide-bonded positions of the Pose to no_repack.

In [23]:
from pyrosetta.rosetta.core.pack.task.operation import NoRepackDisulfides
norepack_ss = NoRepackDisulfides()

# Add the TaskOperation to a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(norepack_ss)

# Create the PackerTask
packer_task = pack_tf.create_task_and_apply_taskoperations(denovo_pose)
print(packer_task)

#Packer_Task

Threads to request: ALL AVAILABLE

resid	pack?	design?	allowed_aas
1	TRUE	TRUE	ALA:NtermProteinFull,CYS:NtermProteinFull,ASP:NtermProteinFull,GLU:NtermProteinFull,PHE:NtermProteinFull,GLY:NtermProteinFull,HIS:NtermProteinFull,HIS_D:NtermProteinFull,ILE:NtermProteinFull,LYS:NtermProteinFull,LEU:NtermProteinFull,MET:NtermProteinFull,ASN:NtermProteinFull,PRO:NtermProteinFull,GLN:NtermProteinFull,ARG:NtermProteinFull,SER:NtermProteinFull,THR:NtermProteinFull,VAL:NtermProteinFull,TRP:NtermProteinFull,TYR:NtermProteinFull
2	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
3	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
4	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
5	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
6	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GL

Because this system contains no disulfide bonds, the TaskOperation has no visible effect here.

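#### 10. LayerDesign
LayerDesign splits the Pose into three regions (core, boundary and surface) and sets the rotamer freedom of each residue according to its layer. It is now somewhat dated; the current approach replaces it with the LayerSelector ResidueSelector combined with RLTs, which is far more flexible. The example below shows how to set up a LayerDesign-style protocol correctly: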
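
Conceptually, the layer assignment is a pair of thresholds on a per-residue burial measure. The sketch below is plain Python, not the LayerSelector code: the cutoffs 3.5 and 1.5 are the values passed to `set_cutoffs` in the cell below, while the neighbor counts, the function name and the exact comparison operators at the boundaries are assumptions.

```python
def layer(neighbor_count, core_cutoff=3.5, surface_cutoff=1.5):
    """Classify a residue by its side-chain neighbor count (assumed comparisons)."""
    if neighbor_count >= core_cutoff:
        return "core"
    if neighbor_count <= surface_cutoff:
        return "surface"
    return "boundary"

# Hypothetical neighbor counts for three residues.
counts = {1: 0.8, 2: 2.4, 3: 4.1}
print({resid: layer(n) for resid, n in counts.items()})
# {1: 'surface', 2: 'boundary', 3: 'core'}
```

Each layer is then intersected with a secondary-structure selection and paired with its own allowed amino acid list, which is what the `add_selector_rlto_pair` calls in the cell below do.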

In [24]:
from pyrosetta import init, pose_from_pdb
from pyrosetta.rosetta.core.select.residue_selector import LayerSelector
from pyrosetta.rosetta.core.select.residue_selector import SecondaryStructureSelector
from pyrosetta.rosetta.core.select.residue_selector import AndResidueSelector, NotResidueSelector
from pyrosetta.rosetta.core.select.residue_selector import PrimarySequenceNeighborhoodSelector, ChainSelector
from pyrosetta.rosetta.core.pack.task.operation import DesignRestrictions
from pyrosetta.rosetta.core.pack.task.operation import RestrictToRepacking
from pyrosetta.rosetta.core.pack.task.operation import RestrictAbsentCanonicalAASRLT

def layer_selection(pick_core, pick_boundary, pick_surface):
    # Layer selector
    layer = LayerSelector()
    layer.set_use_sc_neighbors(True)
    layer.set_layers(pick_core, pick_boundary, pick_surface)
    layer.set_ball_radius(2.0)
    layer.set_cutoffs(3.5, 1.5) # >= 4 neighbors defines a core residue (for miniproteins).

    return layer

def ss_selection(ss, min_E, min_H, is_overlap, is_include_ter_loop):
    # Secondary-structure selector
    ss_selector = SecondaryStructureSelector(ss)
    ss_selector.set_minE(min_E)
    ss_selector.set_minH(min_H)
    ss_selector.set_overlap(is_overlap)
    ss_selector.set_use_dssp(True)
    ss_selector.set_include_terminal_loops(is_include_ter_loop)
    return ss_selector

def restrict_to_design(residue_list):
    design_to = RestrictAbsentCanonicalAASRLT()
    design_to.aas_to_keep(residue_list)
    return design_to


# define selector
core_layer = layer_selection(1, 0, 0)
boundary_layer = layer_selection(0, 1, 0)
surface_layer = layer_selection(0, 0, 1)
sheet_selector = ss_selection('E', 2, 3, False, False)
entire_loop_selector = ss_selection('L', 2, 3, False, True)
entire_helix_selector = ss_selection('H', 2, 3, False, False)

# cap
helix_cap_selector = AndResidueSelector(entire_loop_selector, PrimarySequenceNeighborhoodSelector(1, 0, entire_helix_selector))
helix_start_selector = AndResidueSelector(entire_helix_selector, PrimarySequenceNeighborhoodSelector(0, 1, helix_cap_selector))
helix_selector = AndResidueSelector(entire_helix_selector, NotResidueSelector(helix_start_selector))
loop_selector = AndResidueSelector(entire_loop_selector, NotResidueSelector(helix_cap_selector))

# TASKOPERATIONS
layer_design_restrict = DesignRestrictions()
layer_design_restrict.add_selector_rlto_pair(AndResidueSelector(surface_layer, helix_start_selector), restrict_to_design('DEHKPQR'))
layer_design_restrict.add_selector_rlto_pair(AndResidueSelector(surface_layer, helix_selector), restrict_to_design('EHKQR'))
layer_design_restrict.add_selector_rlto_pair(AndResidueSelector(surface_layer, sheet_selector), restrict_to_design('EHKNQRST'))
layer_design_restrict.add_selector_rlto_pair(AndResidueSelector(surface_layer, loop_selector), restrict_to_design('DEGHKNPQRST'))
layer_design_restrict.add_selector_rlto_pair(AndResidueSelector(boundary_layer, helix_start_selector), restrict_to_design('ADEHIKLMNPQRSTVWY'))
layer_design_restrict.add_selector_rlto_pair(AndResidueSelector(boundary_layer, helix_selector), restrict_to_design('ADEHIKLMNQRSTVWY'))
layer_design_restrict.add_selector_rlto_pair(AndResidueSelector(boundary_layer, sheet_selector), restrict_to_design('DEFHIKLMNQRSTVWY'))
layer_design_restrict.add_selector_rlto_pair(AndResidueSelector(boundary_layer, loop_selector), restrict_to_design('ADEFGHIKLMNPQRSTVWY'))
layer_design_restrict.add_selector_rlto_pair(AndResidueSelector(core_layer, helix_start_selector), restrict_to_design('AFILMPVWY'))
layer_design_restrict.add_selector_rlto_pair(AndResidueSelector(core_layer, helix_selector), restrict_to_design('AFILMVWY'))
layer_design_restrict.add_selector_rlto_pair(AndResidueSelector(core_layer, sheet_selector), restrict_to_design('FILMVWY'))
layer_design_restrict.add_selector_rlto_pair(AndResidueSelector(core_layer, loop_selector), restrict_to_design('AFGILMPVWY'))
layer_design_restrict.add_selector_rlto_pair(helix_cap_selector, restrict_to_design('DNST'))

[0mcore.select.residue_selector.LayerSelector: {0} [0mSetting LayerSelector to use sidechain neighbors to determine burial.
[0mcore.select.residue_selector.LayerSelector: {0} [0mSet cutoffs for core and surface to 5.2 and 2, respectively, in LayerSelector.
[0mcore.select.residue_selector.LayerSelector: {0} [0mSetting LayerSelector to use sidechain neighbors to determine burial.
[0mcore.select.residue_selector.LayerSelector: {0} [0mSetting core=true boundary=false surface=false in LayerSelector.
[0mcore.select.residue_selector.LayerSelector: {0} [0mSetting radius for rolling ball algorithm to 2 in LayerSelector. (Note that this will have no effect if the sidechain neighbors method is used.)
[0mcore.select.residue_selector.LayerSelector: {0} [0mSet cutoffs for core and surface to 3.5 and 1.5, respectively, in LayerSelector.
[0mcore.select.residue_selector.LayerSelector: {0} [0mSetting LayerSelector to use sidechain neighbors to determine burial.
[0mcore.select.residue_sele

In [25]:
from pyrosetta.rosetta.core.pack.task import TaskFactory
# Add the TaskOperation to a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(layer_design_restrict)

# Generate the PackerTask
packer_task = pack_tf.create_task_and_apply_taskoperations(pose)
print(packer_task)

[0mcore.select.residue_selector.SecondaryStructureSelector: {0} [0mUsing dssp for secondary structure: LHHHHHHHHHHHLL
[0mcore.select.residue_selector.SecondaryStructureSelector: {0} [0mUsing dssp for secondary structure: LHHHHHHHHHHHLL
[0mcore.select.residue_selector.SecondaryStructureSelector: {0} [0mUsing dssp for secondary structure: LHHHHHHHHHHHLL
[0mcore.select.residue_selector.PrimarySequenceNeighborhoodSelector: {0} [0m]
[0mcore.select.residue_selector.PrimarySequenceNeighborhoodSelector: {0} [0m]
[0mcore.select.residue_selector.SecondaryStructureSelector: {0} [0mUsing dssp for secondary structure: LHHHHHHHHHHHLL
[0mcore.select.residue_selector.SecondaryStructureSelector: {0} [0mUsing dssp for secondary structure: LHHHHHHHHHHHLL
[0mcore.select.residue_selector.SecondaryStructureSelector: {0} [0mUsing dssp for secondary structure: LHHHHHHHHHHHLL
[0mcore.select.residue_selector.SecondaryStructureSelector: {0} [0mUsing dssp for secondary structure: LHHHHHHHHHHHLL


#### 11. SelectResiduesWithinChain
Sets the rotamer freedom of selected residues using numbering internal to a chain. If the modify_unselected_residues parameter is set to true, all other residues are set to no-repack.
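
Residue numbers given to this operation are counted within the chosen chain, not over the whole Pose. The plain-Python sketch below shows that bookkeeping for a hypothetical Pose with two chains of 24 residues each; the helper name and the chain lengths are assumptions made for illustration only.

```python
def chain_to_pose_numbering(chain_lengths, chain, residues_in_chain):
    """Convert 1-based within-chain indices to 1-based pose indices."""
    offset = sum(chain_lengths[: chain - 1])  # residues in all preceding chains
    for res in residues_in_chain:
        if not 1 <= res <= chain_lengths[chain - 1]:
            raise ValueError(f"residue {res} is outside chain {chain}")
    return [offset + res for res in residues_in_chain]

chain_lengths = [24, 24]  # hypothetical: two chains of 24 residues each
print(chain_to_pose_numbering(chain_lengths, 1, [1, 2, 3, 7, 8, 9]))
print(chain_to_pose_numbering(chain_lengths, 2, [1, 2, 3, 7, 8, 9]))
```

For a single-chain Pose the two numberings coincide, which is why the helix example in this section shows residues 1, 2, 3, 7, 8, and 9 directly.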

In [26]:
# Define the residue range:
from pyrosetta.rosetta.protocols.task_operations import SelectResiduesWithinChainOperation
repacking_define = SelectResiduesWithinChainOperation()
repacking_define.chain(1) # chain index (1, 2, 3, ...) in Pose order
for i in [1,2,3,7,8,9]:
 repacking_define.add_res(i) # which residues within the chain

# Define the packing state:
repacking_define.allow_design(False)
repacking_define.allow_repacking(True)
repacking_define.modify_unselected_residues(True) # set the unselected region to no_repack? (a quick shortcut)

# Add the TaskOperation to a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(repacking_define)

# Generate the PackerTask
packer_task = pack_tf.create_task_and_apply_taskoperations(pose)
print(packer_task)

[0mprotocols.TaskOperations.SelectResiduesWithinChainOperation: {0} [0mResidues set to repacking (all others are prevented from repacking): 1,2,3,7,8,9,
#Packer_Task

Threads to request: ALL AVAILABLE

resid	pack?	design?	allowed_aas
1	TRUE	FALSE	ASP:NtermProteinFull
2	TRUE	FALSE	GLU
3	TRUE	FALSE	LEU
4	FALSE	FALSE	
5	FALSE	FALSE	
6	FALSE	FALSE	
7	TRUE	FALSE	VAL
8	TRUE	FALSE	GLU
9	TRUE	FALSE	GLN
10	FALSE	FALSE	
11	FALSE	FALSE	
12	FALSE	FALSE	
13	FALSE	FALSE	
14	FALSE	FALSE	



#### 12. SelectBySASAOperation

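Selects the Layer regions of a protein (core, boundary, surface) by SASA and sets the rotamer freedom of each residue according to its layer.
Parameters:
- mode: sc = total side-chain SASA; mc = total SASA of the main chain plus CB.
- state: "monomer" (each chain is split into its own pose before the calculation), "bound" (no extra processing), "unbound" (the chains are translated 1000 Å apart according to the jump setting before SASA is evaluated).
- probe_radius: default 2.2; larger than the usual 1.4, but sufficient for this purpose.
- core_asa: default 0; residues with SASA below this value are treated as core.
- surface_asa: default 30; residues with SASA above this value are treated as surface.
- jumps: default 1; defines the jump at which the chains are separated when state is set to "unbound". If jump is set to 2, chains 1 and 2 are treated as one unit.
- is_design_core: whether the core may be designed.
- is_design_boundary: whether the boundary may be designed.
- is_design_surface: whether the surface may be designed.
- sym_dof_names: controls symmetric handling of homo-oligomers; takes the names of the symmetry degrees of freedom.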
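
The core_asa and surface_asa thresholds split residues into three layers. The plain-Python sketch below mimics that split on made-up SASA values; the function names, the SASA numbers, and the handling of values exactly at a threshold are assumptions, and the real operation computes SASA from the Pose.

```python
def sasa_layer(sasa, core_asa=0.0, surface_asa=30.0):
    """Classify one residue by its SASA value (in square angstroms)."""
    if sasa <= core_asa:
        return "core"
    if sasa >= surface_asa:
        return "surface"
    return "boundary"

def designable_positions(sasa_by_residue, core_asa, surface_asa, design_layers):
    """Return residues whose layer is in design_layers; all others stay fixed."""
    return [
        resid
        for resid, sasa in sorted(sasa_by_residue.items())
        if sasa_layer(sasa, core_asa, surface_asa) in design_layers
    ]

# Hypothetical per-residue SASA values.
sasa_by_residue = {1: 85.0, 2: 33.5, 3: 12.0, 4: 48.2, 5: 5.5, 6: 27.9}
print(designable_positions(sasa_by_residue, 20.0, 40.0, {"core"}))
```

Passing `{"core"}` corresponds to enabling design for the core only, as in the example cell of this section.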

In [27]:
# Define the parameters:
from pyrosetta.rosetta.protocols.task_operations import SelectBySASAOperation
mode="mc"
state="bound"
probe_radius=2.0
core_asa=20
surface_asa=40
jump = '1'
sym_dof_names = '' # names of the symmetry degrees of freedom
core = True
boundary = False
surface = False
select_sasa = SelectBySASAOperation(mode, state, probe_radius, core_asa, surface_asa, jump, sym_dof_names, core, boundary, surface)

# Add the TaskOperation to a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(select_sasa)

# Generate the PackerTask
packer_task = pack_tf.create_task_and_apply_taskoperations(denovo_pose)
print(packer_task)

#Packer_Task

Threads to request: ALL AVAILABLE

resid	pack?	design?	allowed_aas
1	FALSE	FALSE	
2	FALSE	FALSE	
3	FALSE	FALSE	
4	FALSE	FALSE	
5	FALSE	FALSE	
6	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
7	FALSE	FALSE	
8	FALSE	FALSE	
9	FALSE	FALSE	
10	FALSE	FALSE	
11	FALSE	FALSE	
12	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
13	FALSE	FALSE	
14	FALSE	FALSE	
15	FALSE	FALSE	
16	FALSE	FALSE	
17	FALSE	FALSE	
18	FALSE	FALSE	
19	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
20	FALSE	FALSE	
21	FALSE	FALSE	
22	FALSE	FALSE	
23	FALSE	FALSE	
24	FALSE	FALSE	
25	FALSE	FALSE	
26	FALSE	FALSE	
27	FALSE	FALSE	
28	FALSE	FALSE	
29	FALSE	FALSE	
30	FALSE	FALSE	
31	FALSE	FALSE	
32	FALSE	FALSE	
33	FALSE	FALSE	
34	FALSE	FALSE	
35	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
36	FALSE	FALSE	
37	TRUE	TRUE	ALA,CYS

Here, only the core of the protein is allowed to be designed.

#### 13. RestrictToTermini
Within the specified chain of the Pose, only the first (N-terminal) residue and/or the last (C-terminal) residue may be repacked.

In [28]:
from pyrosetta.rosetta.protocols.task_operations import RestrictToTerminiOperation
nc_repack = RestrictToTerminiOperation(chain=1, restrict_n_terminus=True, restrict_c_terminus=True)

# Add the TaskOperation to a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(nc_repack)

# Generate the PackerTask
packer_task = pack_tf.create_task_and_apply_taskoperations(pose)
print(packer_task)

#Packer_Task

Threads to request: ALL AVAILABLE

resid	pack?	design?	allowed_aas
1	TRUE	FALSE	ASP:NtermProteinFull
2	FALSE	FALSE	
3	FALSE	FALSE	
4	FALSE	FALSE	
5	FALSE	FALSE	
6	FALSE	FALSE	
7	FALSE	FALSE	
8	FALSE	FALSE	
9	FALSE	FALSE	
10	FALSE	FALSE	
11	FALSE	FALSE	
12	FALSE	FALSE	
13	FALSE	FALSE	
14	TRUE	FALSE	GLY:CtermProteinFull



### 2.1.3 Interface/Neighborhood Specifications

#### 1. DesignAround
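The earliest TaskOperation built on the region-shell mechanism. It defines a design shell and a repack shell by radius around the specified residues. Residues inside each shell are set to design or repacking, respectively, and everything else is set to no_repack.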
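
The two-shell logic can be sketched in plain Python as follows. This is an illustration only: the function name is hypothetical, each residue is reduced to a single made-up coordinate, and which atoms Rosetta actually measures between is not assumed here.

```python
import math

def shell_states(coords, targets, design_shell=7.0, repack_shell=10.0):
    """Label each residue 'design', 'repack', or 'fixed' by distance to the targets."""
    states = {}
    for resid, xyz in coords.items():
        nearest = min(math.dist(xyz, coords[t]) for t in targets)
        if nearest <= design_shell:
            states[resid] = "design"
        elif nearest <= repack_shell:
            states[resid] = "repack"
        else:
            states[resid] = "fixed"
    return states

# Hypothetical coordinates (one point per residue) spaced 3 angstroms apart on a line.
coords = {i: (3.0 * i, 0.0, 0.0) for i in range(1, 9)}
print(shell_states(coords, targets=[4]))
```

In this sketch the target residue itself sits at distance zero and is therefore labeled as designable; the real operation exposes separate switches for that choice.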

In [29]:
from pyrosetta.rosetta.protocols.task_operations import DesignAroundOperation
# Specify the hotspot residue(s)
around = DesignAroundOperation()
for i in [7]:
 around.include_residue(i)

around.design_shell(7.0) # radius of the design shell (not counting the specified residue)
around.repack_shell(10.0) # radius of the repack shell, >= design_shell (not counting the specified residue)
around.resnums_allow_design(1) # whether the residues in the specified resnum list may be designed
around.allow_design(1) # allow design inside the design shell

# Add the TaskOperation to a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(around)

# Generate the PackerTask
packer_task = pack_tf.create_task_and_apply_taskoperations(pose)
print(packer_task)

#Packer_Task

Threads to request: ALL AVAILABLE

resid	pack?	design?	allowed_aas
1	TRUE	FALSE	ASP:NtermProteinFull
2	FALSE	FALSE	
3	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
4	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
5	TRUE	FALSE	LYS
6	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
7	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
8	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
9	TRUE	FALSE	GLN
10	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
11	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
12	TRUE	FALSE	ARG
13	FALSE	FALSE	
14	FALSE	FALSE	



#### 2. DetectProteinLigandInterface
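This Operation deals only with residues at the ligand-protein interface. The cut1 and cut2 parameters define the design shell, and cut3 and cut4 define the repack shell.
Note that -ignore_unrecognized_res must be set when calling init (to false, as in cell In [30]); otherwise the ligand is dropped and interface detection fails.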
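
One way to read the four cutoffs is sketched below in plain Python. The rule used here (the smaller cutoff of each pair applies to the CA distance alone, the larger one additionally requires the CB to be closer to the ligand than the CA) is an assumption for illustration, not a statement of the exact Rosetta implementation; the function name and the distances are hypothetical.

```python
def ligand_shell(ca_dist, cb_dist, cut1=6.0, cut2=8.0, cut3=10.0, cut4=12.0):
    """Classify a residue from its CA and CB distances to the nearest ligand atom."""
    points_at_ligand = cb_dist < ca_dist  # side chain points toward the ligand
    if ca_dist <= cut1 or (ca_dist <= cut2 and points_at_ligand):
        return "design"
    if ca_dist <= cut3 or (ca_dist <= cut4 and points_at_ligand):
        return "repack"
    return "fixed"

# Hypothetical (CA distance, CB distance) pairs in angstroms.
examples = {"A": (5.0, 6.2), "B": (7.5, 6.4), "C": (7.5, 8.6), "D": (11.0, 9.8), "E": (11.0, 12.3)}
for name, (ca, cb) in examples.items():
    print(name, ligand_shell(ca, cb))
```

Under this reading, residues B and D are promoted to the tighter shell only because their side chains point toward the ligand.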

In [30]:
from pyrosetta.rosetta.protocols import rosetta_scripts
# ligand pose;
init('-ignore_unrecognized_res false')
ligand_complex_pose = pose_from_pdb('./data/1ckn.pdb')

# pick selection;
xml = rosetta_scripts.XmlObjects.create_from_string('''
<TASKOPERATIONS>
 <DetectProteinLigandInterface name="ligand_interface" cut1="6.0" cut2="8.0" cut3="10.0" cut4="12.0"
 design="true"/>
</TASKOPERATIONS>
''')
ligand_design = xml.get_task_operation('ligand_interface')

# Add the TaskOperation to a TaskFactory
tf = TaskFactory()
tf.push_back(ligand_design)

# Generate the PackerTask
packer_task = tf.create_task_and_apply_taskoperations(ligand_complex_pose)
print(packer_task)

PyRosetta-4 2021 [Rosetta PyRosetta4.conda.mac.cxx11thread.serialization.python37.Release 2021.31+release.c7009b3115c22daa9efe2805d9d1ebba08426a54 2021-08-07T10:04:12] retrieved from: http://www.pyrosetta.org
(C) Copyright Rosetta Commons Member Institutions. Created in JHU by Sergey Lyskov and PyRosetta Team.
[0mcore.init: {0} [0mChecking for fconfig files in pwd and ./rosetta/flags
[0mcore.init: {0} [0mRosetta version: PyRosetta4.conda.mac.cxx11thread.serialization.python37.Release r292 2021.31+release.c7009b3115c c7009b3115c22daa9efe2805d9d1ebba08426a54 http://www.pyrosetta.org 2021-08-07T10:04:12
[0mcore.init: {0} [0mcommand: PyRosetta -ignore_unrecognized_res false -database /opt/miniconda3/lib/python3.7/site-packages/pyrosetta/database
[0mbasic.random.init_random_generator: {0} [0m'RNG device' seed mode, using '/dev/urandom', seed=-706432799 seed_offset=0 real_seed=-706432799 thread_index=0
[0mbasic.random.init_random_generator: {0} [0mRandomGenerator:init: Normal mode,

#### 3. ProteinInterfaceDesign
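This Operation deals only with residues at the protein-protein interface (default cutoff 8 Å); residues outside the interface are set to no_repack. By default, design to the three unfavorable residue types C, G, and P is excluded.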
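
The default amino acid restriction can be reproduced with a few lines of plain Python. The helper name below is hypothetical; the excluded set (C, G, P) is the one stated above.

```python
CANONICAL_AAS = "ACDEFGHIKLMNPQRSTVWY"

def interface_design_aas(allow_all_aas=False, excluded="CGP"):
    """Return the one-letter codes offered at designable interface positions."""
    if allow_all_aas:
        return CANONICAL_AAS
    return "".join(aa for aa in CANONICAL_AAS if aa not in excluded)

print(interface_design_aas())      # 17 residue types, without C, G, or P
print(interface_design_aas(True))  # all 20 canonical residue types
```

The 17 remaining letters correspond to the residue types listed for designable positions in the PackerTask output of this section (HIS_D is a second histidine variant).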

In [31]:
from pyrosetta.rosetta.protocols.task_operations import ProteinInterfaceDesignOperation
# Load a dimeric coiled-coil structure:
interface_pose = pose_from_pdb('./data/6yek.pdb')
print(interface_pose.pdb_info()) # inspect the chain information

[0mcore.import_pose.import_pose: {0} [0mFile './data/6yek.pdb' automatically determined to be of type PDB
PDB file name: ./data/6yek.pdb
 Pose Range Chain PDB Range | #Residues #Atoms

0001 -- 0024 A 0300 -- 0323 | 0024 residues; 00396 atoms
0025 -- 0048 B 0300 -- 0323 | 0024 residues; 00396 atoms
 TOTAL | 0048 residues; 00792 atoms



In [32]:
# In this example, chain1 may be designed while chain2 may only be repacked.
interface_design = ProteinInterfaceDesignOperation()
interface_design.jump(1) # jump setting: everything before it counts as chain1, everything after as chain2. Must be set when there are more than two chains!
interface_design.interface_distance_cutoff(8.0)
interface_design.repack_chain1(True) # allow chain1 to repack?
interface_design.repack_chain2(True) # allow chain2 to repack?
interface_design.design_chain1(True) # allow chain1 to be designed?
interface_design.design_chain2(False) # allow chain2 to be designed?
interface_design.allow_all_aas(False) # if True, allow all amino acids at all positions (C, G, P are not excluded)

# Add the TaskOperation to a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(interface_design)

# Generate the PackerTask
packer_task = pack_tf.create_task_and_apply_taskoperations(interface_pose)
print(packer_task)

#Packer_Task

Threads to request: ALL AVAILABLE

resid	pack?	design?	allowed_aas
1	FALSE	FALSE	
2	TRUE	TRUE	ALA,ASP,GLU,PHE,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,GLN,ARG,SER,THR,VAL,TRP,TYR
3	TRUE	TRUE	ALA,ASP,GLU,PHE,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,GLN,ARG,SER,THR,VAL,TRP,TYR
4	FALSE	FALSE	
5	TRUE	TRUE	ALA,ASP,GLU,PHE,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,GLN,ARG,SER,THR,VAL,TRP,TYR
6	TRUE	TRUE	ALA,ASP,GLU,PHE,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,GLN,ARG,SER,THR,VAL,TRP,TYR
7	TRUE	TRUE	ALA,ASP,GLU,PHE,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,GLN,ARG,SER,THR,VAL,TRP,TYR
8	FALSE	FALSE	
9	TRUE	TRUE	ALA,ASP,GLU,PHE,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,GLN,ARG,SER,THR,VAL,TRP,TYR
10	TRUE	TRUE	ALA,ASP,GLU,PHE,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,GLN,ARG,SER,THR,VAL,TRP,TYR
11	FALSE	FALSE	
12	TRUE	TRUE	ALA,ASP,GLU,PHE,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,GLN,ARG,SER,THR,VAL,TRP,TYR
13	TRUE	TRUE	ALA,ASP,GLU,PHE,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,GLN,ARG,SER,THR,VAL,TRP,TYR
14	FALSE	FALSE	
15	FALSE	FALSE	
16	TRUE	TRUE	ALA,ASP,GLU,PHE,HIS,HIS_D,ILE,LYS,LE

#### 4. RestrictToInterface
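Based on the jump and the distance cutoff, the protein-protein interface is set to design and everything outside the interface is set to no_repack. Finer control over the degrees of freedom requires adding other TaskOperations, so this operation is less customizable than ProteinInterfaceDesign.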

In [33]:
from pyrosetta.rosetta.protocols.simple_task_operations import RestrictToInterface
Design_interface = RestrictToInterface(rb_jump_in=1, distance_in=8)

# Add the TaskOperation to a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(Design_interface)

# Generate the PackerTask
packer_task = pack_tf.create_task_and_apply_taskoperations(interface_pose)
print(packer_task)

#Packer_Task

Threads to request: ALL AVAILABLE

resid	pack?	design?	allowed_aas
1	FALSE	FALSE	
2	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
3	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
4	FALSE	FALSE	
5	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
6	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
7	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
8	FALSE	FALSE	
9	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
10	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
11	FALSE	FALSE	
12	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
13	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,G

#### 5. RestrictToInterfaceVectorOperation
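Computes the interface with a vector-based method. First, CB_dist_cutoff is used to identify a rough shell of candidate residues. Then, for each candidate, the distances between its side-chain atoms and the atoms of residues on the other chain are checked; if they are within nearby_atom_cutoff, the residue is counted as an interface residue. Residues that do not pass this first criterion are tested against a second one: the angle between the residue's CA-CB vector and the vector from its CB to the CB of a residue on the other side of the interface (CB-CB) must be smaller than vector_angle_cutoff, and the CB-CB distance must not exceed vector_dist_cutoff. Residues that satisfy both conditions are also counted as interface residues.

- CB_dist_cutoff: CB-CB distance (range: 8.0 to 15.0)
- nearby_atom_cutoff: distance between a side-chain atom and the atoms of the other chain (range: 4.0 to 8.0)
- vector_angle_cutoff: cutoff on the angle between the CA-CB and CB-CB vectors, obtained from their dot product (range: 60 to 90)
- vector_dist_cutoff: distance limit applied together with the CA-CB/CB-CB vector test (range: 7.0 to 12.0)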
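
The second (vector) criterion can be sketched in plain Python as follows. The function names and the coordinates are hypothetical, and the sketch covers only the angle and distance test described above, not the full interface detection.

```python
import math

def angle_between(u, v):
    """Angle in degrees between two 3D vectors, via the dot product."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

def passes_vector_test(ca, cb, partner_cb, vector_angle_cutoff=75.0, vector_dist_cutoff=9.0):
    """Check whether a side chain points at a partner CB that is close enough."""
    ca_cb = [b - a for a, b in zip(ca, cb)]          # CA -> CB direction
    cb_cb = [p - b for b, p in zip(cb, partner_cb)]  # CB -> partner CB direction
    close_enough = math.dist(cb, partner_cb) <= vector_dist_cutoff
    return close_enough and angle_between(ca_cb, cb_cb) <= vector_angle_cutoff

# Hypothetical coordinates: the first side chain points toward the partner, the second away.
print(passes_vector_test((0, 0, 0), (1.5, 0, 0), (8.0, 2.0, 0)))
print(passes_vector_test((0, 0, 0), (-1.5, 0, 0), (6.0, 2.0, 0)))
```

Both residues are within the distance limit, but only the first has its CA-CB vector aimed at the partner, so only the first passes.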

In [34]:
from pyrosetta.rosetta.protocols.task_operations import RestrictToInterfaceVectorOperation
# Define the parameters:
lower_chain_id = 1
upper_chain_id = 2
CB_dist_cutoff = 8.0
nearby_atom_cutoff = 5.0
vector_angle_cutoff = 75
vector_dist_cutoff = 9.0
include_all_water = False
restrict_to_interface = RestrictToInterfaceVectorOperation(lower_chain_id, upper_chain_id, CB_dist_cutoff, nearby_atom_cutoff,
 vector_angle_cutoff, vector_dist_cutoff, include_all_water)

# Add the TaskOperation to a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(restrict_to_interface)

# Generate the PackerTask
packer_task = pack_tf.create_task_and_apply_taskoperations(interface_pose)
print(packer_task)

#Packer_Task

Threads to request: ALL AVAILABLE

resid	pack?	design?	allowed_aas
1	FALSE	FALSE	
2	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
3	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
4	FALSE	FALSE	
5	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
6	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
7	FALSE	FALSE	
8	FALSE	FALSE	
9	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
10	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
11	FALSE	FALSE	
12	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
13	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
14	FALSE	FALSE	
15	FALSE	FALSE	
16	TRUE	TRUE	ALA,CYS,ASP

### 2.1.4 Input-based design

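Input-dependent TaskOperations are mostly related to sequence alignment.

#### 1. AlignedThread

Takes aligned sequences as input and compares the pose sequence with a sequence from the homologous family. Gap regions and conserved positions are not designed, and at aligned positions that differ, the residue is replaced with the amino acid of the homologous sequence.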

In [35]:
# Current sequence of the pose
pose.sequence()

'DELQKWVEQAERNG'

Create a mock aligned FASTA file (align.fasta):
```
> pose
DELQKWVEQAERNG
> fake_homology
DELQKLKKQAEQNG
```
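
To see which positions such an alignment would change, the two sequences can be compared column by column. The plain-Python sketch below is an illustration of that comparison, not the Rosetta implementation; the function name is hypothetical and the gap handling is an assumption.

```python
def threading_plan(template_seq, query_seq, start_res=1):
    """Map pose positions to the query residue wherever the aligned sequences differ."""
    plan = {}
    pose_pos = start_res
    for template_aa, query_aa in zip(template_seq, query_seq):
        if template_aa == "-":
            continue  # column absent from the pose; the pose index does not advance
        if query_aa != "-" and query_aa != template_aa:
            plan[pose_pos] = query_aa
        pose_pos += 1
    return plan

template = "DELQKWVEQAERNG"  # the 'pose' entry of the alignment
query = "DELQKLKKQAEQNG"     # the 'fake_homology' entry
print(threading_plan(template, query))
```

The positions returned here (6, 7, 8, and 12) are the ones that differ between the two entries of the mock alignment.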

In [36]:
from pyrosetta.rosetta.protocols.task_operations import AlignedThreadOperation
Aligned = AlignedThreadOperation()
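Aligned.alignment_file('./data/align.fasta') # only FASTA files are supported
Aligned.query_name('fake_homology') # name of the query sequence in the alignment file, i.e. the homologous sequence
Aligned.template_name('pose') # name of the template sequence in the alignment file; should match the input pose
Aligned.start_res(1) # residue at which the alignment starts

# Add the TaskOperation to a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(Aligned)

# Generate the PackerTask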
packer_task = pack_tf.create_task_and_apply_taskoperations(pose)
print(packer_task)

[0mprotocols.TaskOperations.AlignedThreadOperation: {0} [0mtemplate seq:
DELQKWVEQAERNG
query seq:
DELQKLKKQAEQNGDELQKMKKQAEQNGDELQKMKKQAEQNG
[0mprotocols.TaskOperations.AlignedThreadOperation: {0} [0msequence for threading: 
DELQKLKKQAEQNG
#Packer_Task

Threads to request: ALL AVAILABLE

resid	pack?	design?	allowed_aas
1	TRUE	FALSE	ASP:NtermProteinFull
2	TRUE	FALSE	GLU
3	TRUE	FALSE	LEU
4	TRUE	FALSE	GLN
5	TRUE	FALSE	LYS
6	TRUE	TRUE	LEU
7	TRUE	TRUE	LYS
8	TRUE	TRUE	LYS
9	TRUE	FALSE	GLN
10	TRUE	FALSE	ALA
11	TRUE	FALSE	GLU
12	TRUE	TRUE	GLN
13	TRUE	FALSE	ASN
14	TRUE	FALSE	GLY:CtermProteinFull



#### 2. RestrictNativeResidues
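Compares the current pose with a reference pose, ref_pose (the two poses must have the same length!). Based on sequence identity, positions that are conserved between the reference and the current pose are set to repack-only or no_repack, while positions that differ remain designable.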

In [37]:
# Load ref_pose
ref_pose = pose_from_pdb('./data/helix_ref.pdb')

[0mcore.import_pose.import_pose: {0} [0mFile './data/helix_ref.pdb' automatically determined to be of type PDB


In [38]:
# The sequences differ at positions 3, 6, 9, and 13 (1-based).
print(ref_pose.sequence())
print(pose.sequence())

DEMQKNVEHAERLG
DELQKWVEQAERNG
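
The comparison between the two sequences printed above can be written out in plain Python. This is only a sketch of the bookkeeping, with a hypothetical function name; the state labels stand for designable, repack-only, and no_repack positions.

```python
def native_restriction(pose_seq, ref_seq, prevent_repacking):
    """Label each position by comparing the pose with an equal-length reference."""
    if len(pose_seq) != len(ref_seq):
        raise ValueError("pose and reference must have the same length")
    conserved_state = "fixed" if prevent_repacking else "repack"
    return {
        pos: ("design" if pose_aa != ref_aa else conserved_state)
        for pos, (pose_aa, ref_aa) in enumerate(zip(pose_seq, ref_seq), start=1)
    }

states = native_restriction("DELQKWVEQAERNG", "DEMQKNVEHAERLG", prevent_repacking=True)
print([pos for pos, state in states.items() if state == "design"])
```

With `prevent_repacking=False`, the same four positions stay designable and the conserved positions become repack-only instead of fixed.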


In [39]:
from pyrosetta.rosetta.protocols.task_operations import RestrictNativeResiduesOperation
no_pack_to_native = RestrictNativeResiduesOperation()
no_pack_to_native.reference_pose(ref_pose)
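no_pack_to_native.prevent_repacking(1) # 1 = design only the non-conserved positions and set the rest to no_repack. 0 = design the non-conserved positions and set the rest to repack.

# Add the TaskOperation to a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(no_pack_to_native)

# Generate the PackerTask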
packer_task = pack_tf.create_task_and_apply_taskoperations(pose)
print(packer_task)

[0mprotocols.task_operations.RestrictNativeResiduesOperation: {0} [0m4 non-native, designable residues found in pose
#Packer_Task

Threads to request: ALL AVAILABLE

resid	pack?	design?	allowed_aas
1	FALSE	FALSE	
2	FALSE	FALSE	
3	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
4	FALSE	FALSE	
5	FALSE	FALSE	
6	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
7	FALSE	FALSE	
8	FALSE	FALSE	
9	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
10	FALSE	FALSE	
11	FALSE	FALSE	
12	FALSE	FALSE	
13	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
14	FALSE	FALSE	



In [40]:
from pyrosetta.rosetta.protocols.task_operations import RestrictNativeResiduesOperation
no_pack_to_native = RestrictNativeResiduesOperation()
no_pack_to_native.reference_pose(ref_pose)
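no_pack_to_native.prevent_repacking(0) # 1 = design only the non-conserved positions and set the rest to no_repack. 0 = design the non-conserved positions and set the rest to repack.

# Add the TaskOperation to a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(no_pack_to_native)

# Generate the PackerTask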
packer_task = pack_tf.create_task_and_apply_taskoperations(pose)
print(packer_task)

[0mprotocols.task_operations.RestrictNativeResiduesOperation: {0} [0m4 non-native, designable residues found in pose
#Packer_Task

Threads to request: ALL AVAILABLE

resid	pack?	design?	allowed_aas
1	TRUE	FALSE	ASP:NtermProteinFull
2	TRUE	FALSE	GLU
3	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
4	TRUE	FALSE	GLN
5	TRUE	FALSE	LYS
6	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
7	TRUE	FALSE	VAL
8	TRUE	FALSE	GLU
9	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
10	TRUE	FALSE	ALA
11	TRUE	FALSE	GLU
12	TRUE	FALSE	ARG
13	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
14	TRUE	FALSE	GLY:CtermProteinFull



#### 3. RestrictIdentitiesAtAlignedPositions
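Compares the pose with the sequence of an input PDB structure. Only positions that align are kept as they are; the other positions can be designed. The PDB structures do not need to have exactly the same length, but the region specified for the design alignment must not be longer than the pose itself.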

In [41]:
pose1 = pose_from_pdb('./data/three_helix_pose.pdb')
pose2 = pose_from_pdb('./data/homo1.pdb') # ref pose;

[0mcore.import_pose.import_pose: {0} [0mFile './data/three_helix_pose.pdb' automatically determined to be of type PDB
[0mcore.import_pose.import_pose: {0} [0mFile './data/homo1.pdb' automatically determined to be of type PDB


<center><img src="./img/structure_align.png" width = "700" height = "200" align=center /></center>
(Image source: XtalPi team)

In [42]:
from pyrosetta.rosetta.protocols.task_operations import RestrictIdentitiesAtAlignedPositionsOperation
aligned_position_design = RestrictIdentitiesAtAlignedPositionsOperation()
aligned_position_design.source_pose('./data/homo1.pdb') # homo1.pdb (pose2) serves as the reference pose
aligned_position_design.chain(1) # which chain of the reference pose is used for the alignment
aligned_position_design.design_only_target_residues(False) # allow residues in the repack shell to repack. Default: False
aligned_position_design.prevent_repacking(False) # disallow repacking?

# When ref_pose and pose differ in length, specify the range over which to align.
vector1 = pyrosetta.rosetta.utility.vector1_unsigned_long()
for i in range(1, pose1.total_residue()): # covers residues 1 to N-1
 vector1.append(i)
aligned_position_design.res_ids(vector1)

# Add the TaskOperation to a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(aligned_position_design)

# Generate the PackerTask
packer_task = pack_tf.create_task_and_apply_taskoperations(pose1)
print(packer_task)

[0mcore.import_pose.import_pose: {0} [0mFile './data/homo1.pdb' automatically determined to be of type PDB
[0mprotocols.TaskOperations.RestrictIdentitiesAtAlignedPositionsOperation: {0} [0mResidue nearest is: 1
[0mprotocols.TaskOperations.RestrictIdentitiesAtAlignedPositionsOperation: {0} [0mResidue nearest is: ASP1
[0mprotocols.TaskOperations.RestrictIdentitiesAtAlignedPositionsOperation: {0} [0mResidue nearest is: 2
[0mprotocols.TaskOperations.RestrictIdentitiesAtAlignedPositionsOperation: {0} [0mResidue nearest is: GLU2
[0mprotocols.TaskOperations.RestrictIdentitiesAtAlignedPositionsOperation: {0} [0mResidue nearest is: 3
[0mprotocols.TaskOperations.RestrictIdentitiesAtAlignedPositionsOperation: {0} [0mResidue nearest is: LEU3
[0mprotocols.TaskOperations.RestrictIdentitiesAtAlignedPositionsOperation: {0} [0mResidue nearest is: 3
[0mprotocols.TaskOperations.RestrictIdentitiesAtAlignedPositionsOperation: {0} [0mResidue nearest is: GLN4
[0mprotocols.TaskOperations.Re

#### 4. SeqprofConsensus
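Defines the sequence design freedom from a PSSM file. A PSSM file can be generated with the PSI-BLAST tool from an MSA file.
A mock MSA is again used as the example here:

The contents of msa.fasta are: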
```
>pose
DELQKWVEQAERNG
>fake_homology1
DELQKLKKQAEQNG
>fake_homology2
DELQKMKKQAEQNG
>fake_homology3
DELQKMKKQAEQNV
>fake_homology4
DELQKMMDQAEQNV
>fake_homology5
DELKDMMDQAEQNV
```

Command for generating the PSSM (the result file can be found in the data folder):
```
psiblast -subject sequence.fasta -in_msa msa.fasta -out_ascii_pssm output.pssm
```

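Key parameters of SeqprofConsensus that control the design freedom:
- filename: name of the input PSSM file.
- min_aa_probability: threshold on the PSSM score; only amino acids above the threshold are considered. The larger the value, the more frequently an amino acid must occur in the PSSM.
- probability_larger_than_current: whether a substituting amino acid must have a higher frequency than the amino acid currently in the Pose.
- keep_native: whether to keep the wild-type amino acid among the allowed residue types.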
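
How these three switches could interact at a single position is sketched below in plain Python. The PSSM row is made up, the function name is hypothetical, and the way the flags are combined is an assumption for illustration rather than the exact Rosetta logic.

```python
def allowed_from_profile(profile_row, native_aa, min_aa_probability=0.0,
                         probability_larger_than_current=True, keep_native=True):
    """Pick the residue types a profile row permits at one position."""
    native_score = profile_row.get(native_aa, float("-inf"))
    allowed = set()
    for aa, score in profile_row.items():
        if score < min_aa_probability:
            continue  # below the threshold
        if probability_larger_than_current and score < native_score:
            continue  # less favorable than the current residue
        allowed.add(aa)
    if keep_native:
        allowed.add(native_aa)
    return "".join(sorted(allowed))

# Hypothetical PSSM scores for one position whose current residue is W.
row = {"W": 1, "L": 2, "M": 4, "K": 0, "D": -3}
print(allowed_from_profile(row, "W"))
print(allowed_from_profile(row, "W", probability_larger_than_current=False))
```

Relaxing probability_larger_than_current lets K back in, because it passes the threshold even though it scores below the current residue.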

In [43]:
# load pssm
from pyrosetta.rosetta.protocols import rosetta_scripts
xml = rosetta_scripts.XmlObjects.create_from_string('''
<TASKOPERATIONS>
 <SeqprofConsensus name="pssm_design" 
 filename="./data/output.pssm" 
 min_aa_probability="0.0" 
 probability_larger_than_current="1" 
 convert_scores_to_probabilities="false" 
 keep_native="true"/>
</TASKOPERATIONS>
''')
pssm_design = xml.get_task_operation('pssm_design')

# Add the TaskOperation to a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(pssm_design)

# Generate the PackerTask
packer_task = pack_tf.create_task_and_apply_taskoperations(pose)
print(packer_task)

[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} [0mGenerating XML Schema for rosetta_scripts...
[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} [0m...done
[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} [0mInitializing schema validator...
[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} [0m...done
[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} [0mValidating input script...
[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} [0m...done
[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} [0mParsed script:
<ROSETTASCRIPTS>
	<TASKOPERATIONS>
		<SeqprofConsensus convert_scores_to_probabilities="false" filename="./data/output.pssm" keep_native="true" min_aa_probability="0.0" name="pssm_design" probability_larger_than_current="1"/>
	</TASKOPERATIONS>
	<PROTOCOLS/>
</ROSETTASCRIPTS>
[0mcore.scoring.ScoreFunctionFactory: {0} [0mSCOREFUNCTION: [32mref2015[0m
[0mprotocols.task_operations.SeqprofConsensusOperation: {0} [0mLoading seqprof

#### 5. ThreadSequenceOperation
Threads a sequence onto all positions of a Pose.

In [44]:
from pyrosetta.rosetta.protocols.task_operations import ThreadSequenceOperation
thread = ThreadSequenceOperation()
thread.target_sequence(pose2.sequence())

# Add the TaskOperation to a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(thread)

# Generate the PackerTask
packer_task = pack_tf.create_task_and_apply_taskoperations(pose1)
print(packer_task)

#Packer_Task

Threads to request: ALL AVAILABLE

resid	pack?	design?	allowed_aas
1	TRUE	TRUE	ASP:NtermProteinFull
2	TRUE	FALSE	GLU
3	TRUE	FALSE	LEU
4	TRUE	TRUE	GLN
5	TRUE	TRUE	LYS
6	TRUE	TRUE	TRP
7	TRUE	TRUE	VAL
8	TRUE	TRUE	GLU
9	TRUE	TRUE	GLN
10	TRUE	TRUE	ALA
11	TRUE	TRUE	GLU
12	TRUE	TRUE	ARG
13	TRUE	TRUE	ASN
14	TRUE	TRUE	GLY
15	TRUE	TRUE	VAL
16	TRUE	TRUE	SER
17	TRUE	TRUE	LEU
18	TRUE	FALSE	GLU
19	TRUE	FALSE	GLU
20	TRUE	TRUE	ILE
21	TRUE	TRUE	GLU
22	TRUE	FALSE	LYS
23	TRUE	TRUE	TRP
24	TRUE	TRUE	ILE
25	TRUE	FALSE	LYS
26	TRUE	FALSE	LYS
27	TRUE	TRUE	ALA
28	TRUE	TRUE	GLY
29	TRUE	TRUE	ASP
30	TRUE	TRUE	GLU
31	TRUE	FALSE	GLU
32	TRUE	TRUE	LEU
33	TRUE	TRUE	LEU
34	TRUE	FALSE	LYS
35	TRUE	TRUE	ARG
36	TRUE	TRUE	PHE
37	TRUE	TRUE	GLN
38	TRUE	TRUE	LYS
39	TRUE	TRUE	LYS
40	TRUE	TRUE	VAL
41	TRUE	FALSE	LYS
42	TRUE	TRUE	GLU
43	TRUE	FALSE	ARG:CtermProteinFull



### 2.2 Rotamer Specification

#### 2.2.1 InteractingRotamerExplosion
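Enhances rotamer sampling around a given position. For example, when designing all positions, one may want extra sampling of rotamers whose two-body interaction energy with residue 8 passes a threshold of -0.5 REU. (Cell In [45] uses target_seqpos="10A" and score_cutoff="0.5".)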

In [45]:
from pyrosetta.rosetta.protocols import rosetta_scripts
xml = rosetta_scripts.XmlObjects.create_from_string('''
<TASKOPERATIONS>
 <InteractingRotamerExplosion name="rotexpl"
 ex_level="2" score_cutoff="0.5" target_seqpos="10A" debug="0" />
</TASKOPERATIONS>
''')
rotexpl = xml.get_task_operation('rotexpl')

# Add the TaskOperation to a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(rotexpl)

[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} [0mGenerating XML Schema for rosetta_scripts...
[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} [0m...done
[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} [0mInitializing schema validator...
[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} [0m...done
[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} [0mValidating input script...
[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} [0m...done
[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} [0mParsed script:
<ROSETTASCRIPTS>
	<TASKOPERATIONS>
		<InteractingRotamerExplosion debug="0" ex_level="2" name="rotexpl" score_cutoff="0.5" target_seqpos="10A"/>
	</TASKOPERATIONS>
	<PROTOCOLS/>
</ROSETTASCRIPTS>
[0mcore.scoring.ScoreFunctionFactory: {0} [0mSCOREFUNCTION: [32mref2015[0m
[0mprotocols.jd2.parser.TaskOperationLoader: {0} [0mDefined TaskOperation named "rotexpl" of type InteractingRotamerExplosion
[0mprotocols.rosetta_scripts.Par

#### 2.2.2 ImportUnboundRotamers
Can be used to add the rotamers found in an input reference PDB structure to the sampling.

In [46]:
from pyrosetta.rosetta.protocols.task_operations import ImportUnboundRotamersOperation
favor_native_rotamer_ex = ImportUnboundRotamersOperation()
init('-packing::unboundrot ./data/three_helix_pose.pdb')

# Add the TaskOperation to a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(favor_native_rotamer_ex)

PyRosetta-4 2021 [Rosetta PyRosetta4.conda.mac.cxx11thread.serialization.python37.Release 2021.31+release.c7009b3115c22daa9efe2805d9d1ebba08426a54 2021-08-07T10:04:12] retrieved from: http://www.pyrosetta.org
(C) Copyright Rosetta Commons Member Institutions. Created in JHU by Sergey Lyskov and PyRosetta Team.
[0mcore.init: {0} [0mChecking for fconfig files in pwd and ./rosetta/flags
[0mcore.init: {0} [0mRosetta version: PyRosetta4.conda.mac.cxx11thread.serialization.python37.Release r292 2021.31+release.c7009b3115c c7009b3115c22daa9efe2805d9d1ebba08426a54 http://www.pyrosetta.org 2021-08-07T10:04:12
[0mcore.init: {0} [0mcommand: PyRosetta -packing::unboundrot ./data/three_helix_pose.pdb -database /opt/miniconda3/lib/python3.7/site-packages/pyrosetta/database
[0mbasic.random.init_random_generator: {0} [0m'RNG device' seed mode, using '/dev/urandom', seed=-732456573 seed_offset=0 real_seed=-732456573 thread_index=0
[0mbasic.random.init_random_generator: {0} [0mRandomGenerator:

#### 2.2.3 LimitAromaChi2

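Prevents the packer from using PHE, TYR and HIS rotamers whose $\chi_{2}$ angle lies far from 90 degrees. Such rotamers are rare, so they are excluded outright.

- include_trp: whether Trp is limited as well. Trp has a smoother energy landscape and can occupy a wider range of $\chi_{2}$ angles, so restricting it is not recommended.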
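
The filtering rule described above can be pictured with a small standalone sketch. It is not Rosetta's implementation: the helper names `fold_chi2` and `passes_chi2_limit` are hypothetical, and folding the angle onto [0, 180) is an assumption made for this sketch (it reflects the two-fold symmetry of a PHE/TYR ring).

```python
def fold_chi2(chi2_deg):
    """Map a chi2 angle in degrees onto [0, 180) (assumed ring symmetry)."""
    return chi2_deg % 180.0


def passes_chi2_limit(chi2_deg, chi2min=70.0, chi2max=110.0):
    """Return True if the folded chi2 angle lies inside [chi2min, chi2max]."""
    return chi2min <= fold_chi2(chi2_deg) <= chi2max


# Hypothetical chi2 values for four aromatic rotamers
candidate_chi2 = [90.0, -85.0, 30.0, 150.0]
kept = [chi for chi in candidate_chi2 if passes_chi2_limit(chi)]
print(kept)
```

The default window mirrors the `chi2min="70.0"` and `chi2max="110.0"` values used in the XML of this section.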

In [47]:
from pyrosetta.rosetta.protocols import rosetta_scripts
xml = rosetta_scripts.XmlObjects.create_from_string('''
<TASKOPERATIONS>
 <LimitAromaChi2 name="limit_chi2" chi2max="110.0"
 chi2min="70.0" include_trp="false" />
</TASKOPERATIONS>
''')
limit_chi2 = xml.get_task_operation('limit_chi2')

# Load the TaskOperation into a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(limit_chi2)

[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} [0mGenerating XML Schema for rosetta_scripts...
[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} [0m...done
[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} [0mInitializing schema validator...
[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} [0m...done
[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} [0mValidating input script...
[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} [0m...done
[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} [0mParsed script:
<ROSETTASCRIPTS>
	<TASKOPERATIONS>
		<LimitAromaChi2 chi2max="110.0" chi2min="70.0" include_trp="false" name="limit_chi2"/>
	</TASKOPERATIONS>
	<PROTOCOLS/>
</ROSETTASCRIPTS>
[0mcore.scoring.ScoreFunctionFactory: {0} [0mSCOREFUNCTION: [32mref2015[0m
[0mprotocols.jd2.parser.TaskOperationLoader: {0} [0mDefined TaskOperation named "limit_chi2" of type LimitAromaChi2
[0mprotocols.rosetta_scripts.ParsedProtocol: {0} [0mParsedProt

#### 2.2.4 SampleRotamersFromPDB
Restricts rotamer sampling to rotamers that closely resemble those in the reference PDB structure (for example, rotamers that differ by at most +/-5 degrees).
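
As a rough illustration of what "closely resemble" could mean, the sketch below compares chi angles with periodic wrap-around. The function names and the per-angle tolerance test are assumptions for illustration only; the actual comparison inside SampleRotamersFromPDB may differ.

```python
def angle_difference(a_deg, b_deg):
    """Smallest signed difference a - b in degrees, wrapped to [-180, 180)."""
    return (a_deg - b_deg + 180.0) % 360.0 - 180.0


def matches_reference(chis, ref_chis, tolerance=5.0):
    """True if every chi angle is within `tolerance` degrees of the reference."""
    if len(chis) != len(ref_chis):
        return False
    return all(abs(angle_difference(c, r)) <= tolerance
               for c, r in zip(chis, ref_chis))


# Hypothetical chi angles: 179 and -179 degrees are only 2 degrees apart
print(matches_reference([179.0, 60.0], [-179.0, 62.0]))
print(matches_reference([60.0], [70.0]))
```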

In [48]:
from pyrosetta.rosetta.protocols import rosetta_scripts
from pyrosetta.rosetta.protocols.minimization_packing import PackRotamersMover

# load native rotamers;
init('-packing::unboundrot ./data/three_helix_pose.pdb')

xml = rosetta_scripts.XmlObjects.create_from_string('''
<TASKOPERATIONS>
 <SampleRotamersFromPDB name="restric_native_ex" add_rotamer="1" debug="0" ccd="0"/>
</TASKOPERATIONS>
''')
restric_native_ex = xml.get_task_operation('restric_native_ex')

pose1 = pose_from_pdb('./data/three_helix_pose.pdb')

# Load the TaskOperations into a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(restric_native_ex)
pack_tf.push_back(RestrictToRepacking())

# Generate the PackerTask
packer_task = pack_tf.create_task_and_apply_taskoperations(pose1)
print(packer_task)

PyRosetta-4 2021 [Rosetta PyRosetta4.conda.mac.cxx11thread.serialization.python37.Release 2021.31+release.c7009b3115c22daa9efe2805d9d1ebba08426a54 2021-08-07T10:04:12] retrieved from: http://www.pyrosetta.org
(C) Copyright Rosetta Commons Member Institutions. Created in JHU by Sergey Lyskov and PyRosetta Team.
[0mcore.init: {0} [0mChecking for fconfig files in pwd and ./rosetta/flags
[0mcore.init: {0} [0mRosetta version: PyRosetta4.conda.mac.cxx11thread.serialization.python37.Release r292 2021.31+release.c7009b3115c c7009b3115c22daa9efe2805d9d1ebba08426a54 http://www.pyrosetta.org 2021-08-07T10:04:12
[0mcore.init: {0} [0mcommand: PyRosetta -packing::unboundrot ./data/three_helix_pose.pdb -database /opt/miniconda3/lib/python3.7/site-packages/pyrosetta/database
[0mbasic.random.init_random_generator: {0} [0m'RNG device' seed mode, using '/dev/urandom', seed=-631442782 seed_offset=0 real_seed=-631442782 thread_index=0
[0mbasic.random.init_random_generator: {0} [0mRandomGenerator:

#### 2.2.5 PruneBuriedUnsats
Removes rotamers that would introduce buried unsatisfied polar atoms (polar atoms that cannot pair up to form a hydrogen bond) into the structure.

Parameters:
- allow_even_trades - Allow residues that satisfy an unsat and create a new unsatisfiable one.
- atomic_depth_probe_radius - Probe radius for atomic depth calculation to determine burial.
- atomic_depth_resolution - Voxel resolution with which to calculate atomic depth.
- atomic_depth_cutoff - Atomic depth at which atoms are considered buried.
- minimum_hbond_energy - Minimum energy (out of the typical Rosetta -2.0) for an hbond to be considered to satisfy a polar.
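
To make the role of the cutoffs concrete, here is a toy sketch of how a "buried unsatisfied" test might combine them. Everything in it is an assumption for illustration: the helper names are hypothetical, burial is taken as "depth at or beyond the cutoff", and a polar atom is taken as satisfied when at least one hydrogen bond is at least as favourable as the energy threshold. It does not reproduce Rosetta's internal logic.

```python
def is_buried(atomic_depth, depth_cutoff=4.5):
    """Assumed rule: an atom is buried once its depth reaches the cutoff."""
    return atomic_depth >= depth_cutoff


def is_satisfied(hbond_energies, minimum_hbond_energy=-0.2):
    """Assumed rule: one sufficiently favourable h-bond satisfies the atom."""
    return any(energy <= minimum_hbond_energy for energy in hbond_energies)


def is_buried_unsat(atomic_depth, hbond_energies,
                    depth_cutoff=4.5, minimum_hbond_energy=-0.2):
    """True for a buried polar atom with no qualifying hydrogen bond."""
    return (is_buried(atomic_depth, depth_cutoff)
            and not is_satisfied(hbond_energies, minimum_hbond_energy))


# Hypothetical polar atoms: (depth in angstroms, h-bond energies)
print(is_buried_unsat(6.0, []))        # buried, no h-bond
print(is_buried_unsat(6.0, [-1.1]))    # buried, satisfied
print(is_buried_unsat(2.0, []))        # exposed
```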

In [49]:
from pyrosetta.rosetta.protocols.task_operations import PruneBuriedUnsatsOperation
prune_unsat = PruneBuriedUnsatsOperation()
prune_unsat.atomic_depth_cutoff(4.5)
prune_unsat.atomic_depth_probe_radius(2.3)
prune_unsat.atomic_depth_resolution(0.5)
prune_unsat.minimum_hbond_energy(-0.2)

# Load the TaskOperation into a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(prune_unsat)

### 2.3 Packer Behavior Modification
The TaskOperations in this section act on the packer itself and change how it samples or runs.

#### 2.3.1 ModifyAnnealer
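Changes the settings of the packer's simulated annealing, including:
- high_temp: the starting temperature
- low_temp: the final temperature
- disallow_quench: "quenching" means that only lower-energy conformations are accepted at each step. Set this option to true to switch quenching off if you want greater rotamer diversity. Quenching is enabled by default.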
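
The interplay of the two temperatures and quenching can be sketched with a generic simulated-annealing acceptance rule. This is a textbook Metropolis sketch, not the Rosetta annealer: the geometric cooling schedule, the function names and the `quench` argument are all assumptions for illustration.

```python
import math
import random


def temperature_schedule(high_temp, low_temp, n_steps):
    """Geometric cooling from high_temp to low_temp over n_steps (n_steps >= 2)."""
    ratio = (low_temp / high_temp) ** (1.0 / (n_steps - 1))
    return [high_temp * ratio ** i for i in range(n_steps)]


def accept_move(delta_energy, temperature, rng, quench=False):
    """Metropolis criterion; with quench=True only downhill moves pass."""
    if delta_energy <= 0.0:
        return True
    if quench:
        return False
    return rng.random() < math.exp(-delta_energy / temperature)


rng = random.Random(0)
temps = temperature_schedule(high_temp=100.0, low_temp=0.3, n_steps=5)
print([round(t, 2) for t in temps])
print(accept_move(-1.0, temps[0], rng))                # downhill move
print(accept_move(5.0, temps[-1], rng, quench=True))   # uphill move while quenching
```

At high temperature most uphill moves pass the Metropolis test, which is what gives the early phase its diversity; quenching removes that freedom entirely.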

In [50]:
from pyrosetta.rosetta.protocols import rosetta_scripts
xml = rosetta_scripts.XmlObjects.create_from_string('''
<TASKOPERATIONS>
 <ModifyAnnealer name="modify_annealer" high_temp="100" low_temp="0.3" disallow_quench="0"/>
</TASKOPERATIONS>
''')

modify_annealer = xml.get_task_operation('modify_annealer')

# Load the TaskOperation into a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(modify_annealer)

[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} [0mGenerating XML Schema for rosetta_scripts...
[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} [0m...done
[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} [0mInitializing schema validator...
[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} [0m...done
[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} [0mValidating input script...
[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} [0m...done
[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} [0mParsed script:
<ROSETTASCRIPTS>
	<TASKOPERATIONS>
		<ModifyAnnealer disallow_quench="0" high_temp="100" low_temp="0.3" name="modify_annealer"/>
	</TASKOPERATIONS>
	<PROTOCOLS/>
</ROSETTASCRIPTS>
[0mcore.scoring.ScoreFunctionFactory: {0} [0mSCOREFUNCTION: [32mref2015[0m
[0mprotocols.jd2.parser.TaskOperationLoader: {0} [0mDefined TaskOperation named "modify_annealer" of type ModifyAnnealer
[0mprotocols.rosetta_scripts.ParsedProtocol: {0} [0m

#### 2.3.2 RestrictInteractionGraphThreadsOperation
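This TaskOperation limits the number of threads the packer uses. By default, the PyRosetta builds installed through conda are compiled single-threaded; multi-threaded packing requires compiling and installing a multi-threaded PyRosetta build from the Rosetta source, which is somewhat laborious. When multithreading is available, using it sensibly can speed up development and testing.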

In [51]:
from pyrosetta.rosetta.core.pack.task.operation import RestrictInteractionGraphThreadsOperation
init('-multithreading:total_threads 16 -multithreading:interaction_graph_threads 16')

# Limit the number of threads the packer may use
limit_thread = RestrictInteractionGraphThreadsOperation()
limit_thread.set_thread_limit(8)

# Load the TaskOperations into a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(limit_thread)
pack_tf.push_back(RestrictToRepacking())

# run packer
from pyrosetta.rosetta.protocols.minimization_packing import PackRotamersMover
from pyrosetta import create_score_function
pack_mover = PackRotamersMover()
ref2015 = create_score_function('ref2015')
single_helix_pose = pose_from_pdb('./data/helix.pdb')

# There is no need to build a PackerTask by hand; passing the TaskFactory is enough.
pack_mover.task_factory(pack_tf)
pack_mover.score_function(ref2015)

# Run repacking
pack_mover.apply(single_helix_pose)

PyRosetta-4 2021 [Rosetta PyRosetta4.conda.mac.cxx11thread.serialization.python37.Release 2021.31+release.c7009b3115c22daa9efe2805d9d1ebba08426a54 2021-08-07T10:04:12] retrieved from: http://www.pyrosetta.org
(C) Copyright Rosetta Commons Member Institutions. Created in JHU by Sergey Lyskov and PyRosetta Team.
[0mcore.init: {0} [0mChecking for fconfig files in pwd and ./rosetta/flags
[0mcore.init: {0} [0mRosetta version: PyRosetta4.conda.mac.cxx11thread.serialization.python37.Release r292 2021.31+release.c7009b3115c c7009b3115c22daa9efe2805d9d1ebba08426a54 http://www.pyrosetta.org 2021-08-07T10:04:12
[0mcore.init: {0} [0mcommand: PyRosetta -multithreading:total_threads 16 -multithreading:interaction_graph_threads 16 -database /opt/miniconda3/lib/python3.7/site-packages/pyrosetta/database
[0mbasic.random.init_random_generator: {0} [0m'RNG device' seed mode, using '/dev/urandom', seed=-2083723108 seed_offset=0 real_seed=-2083723108 thread_index=0
[0mbasic.random.init_random_gene

#### 2.3.3 SetIGTypeOperation
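This TaskOperation changes how the InteractionGraph is computed. Four modes are available:
- lin_mem_ig: uses a linear-memory scheme. Energies of recently computed rotamer pairs are reused, but the energy of the same rotamer pair may be computed more than once, because older entries are discarded once a certain number is exceeded.
- lazy_ig: defers computing rotamer pair energies until they are actually needed, which can greatly reduce the amount of precomputation.
- double_lazy_ig: defers both the allocation of pair-energy storage and the energy calculation until a rotamer pair energy is actually needed.
- precompute_ig: computes the energies of all rotamer pairs up front (an O(N^2) interaction graph).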
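
The difference between precomputing and lazy evaluation can be illustrated with a toy pair-energy table. This is only a conceptual sketch: `toy_energy`, `precompute_pair_energies` and `LazyPairEnergies` are hypothetical names, and Rosetta's interaction graphs are far more elaborate.

```python
from itertools import combinations


def precompute_pair_energies(rotamers, energy_fn):
    """Evaluate every rotamer pair up front: N*(N-1)/2 evaluations."""
    return {pair: energy_fn(*pair) for pair in combinations(rotamers, 2)}


class LazyPairEnergies:
    """Evaluate a pair energy only on first request, then reuse the cached value."""

    def __init__(self, energy_fn):
        self._energy_fn = energy_fn
        self._cache = {}
        self.n_evaluations = 0

    def get(self, a, b):
        key = (a, b) if a <= b else (b, a)
        if key not in self._cache:
            self._cache[key] = self._energy_fn(*key)
            self.n_evaluations += 1
        return self._cache[key]


def toy_energy(a, b):
    """Hypothetical stand-in for a real two-body energy."""
    return abs(a - b) * 0.1


rotamers = list(range(6))
table = precompute_pair_energies(rotamers, toy_energy)
lazy = LazyPairEnergies(toy_energy)
lazy.get(0, 3)
lazy.get(3, 0)  # same pair, served from the cache
print(len(table), lazy.n_evaluations)
```

A linear-memory variant would additionally cap the size of the cache and evict old entries, which is why the same pair may be evaluated more than once in that mode.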

In [52]:
from pyrosetta.rosetta.protocols import rosetta_scripts
xml = rosetta_scripts.XmlObjects.create_from_string('''
<TASKOPERATIONS>
 <SetIGType name="set_ig" lin_mem_ig="1" lazy_ig="0" double_lazy_ig="0" precompute_ig="0"/>
</TASKOPERATIONS>
''')

set_ig = xml.get_task_operation('set_ig')

# Load the TaskOperation into a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(set_ig)

[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} [0mGenerating XML Schema for rosetta_scripts...
[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} [0m...done
[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} [0mInitializing schema validator...
[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} [0m...done
[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} [0mValidating input script...
[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} [0m...done
[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} [0mParsed script:
<ROSETTASCRIPTS>
	<TASKOPERATIONS>
		<SetIGType double_lazy_ig="0" lazy_ig="0" lin_mem_ig="1" name="set_ig" precompute_ig="0"/>
	</TASKOPERATIONS>
	<PROTOCOLS/>
</ROSETTASCRIPTS>
[0mcore.scoring.ScoreFunctionFactory: {0} [0mSCOREFUNCTION: [32mref2015[0m
[0mprotocols.jd2.parser.TaskOperationLoader: {0} [0mDefined TaskOperation named "set_ig" of type SetIGType
[0mprotocols.rosetta_scripts.ParsedProtocol: {0} [0mParsedProtocol 

### 2.4 Antibody and CDR Specific Operations
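The TaskOperations in this section are tailored to antibody structure design and make antibody design work much more convenient. They cover two core tasks: controlling how the CDRs are designed, and controlling the design and repack state of each region.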

#### 2.4.1 DisableAntibodyRegionOperation
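An antibody pose is divided into the CDR region, the framework region and the antigen region (if the pose is an antibody-antigen complex). DisableAntibodyRegionOperation directly switches off or reduces the rotamer freedom of a given region.

Parameters:
- region: one of AntibodyRegionEnum.antigen_region, AntibodyRegionEnum.cdr_region or AntibodyRegionEnum.framework_region
- disable_packing_and_design: True = no repacking (both packing and design are disabled); False = repacking stays allowed and only design is disabled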

In [53]:
from pyrosetta.rosetta.protocols.antibody import AntibodyInfo
from pyrosetta.rosetta.protocols.antibody.task_operations import DisableAntibodyRegionOperation
from pyrosetta.rosetta.protocols.antibody import AntibodyRegionEnum

# Read in the antibody structure
init('-ex1 -ex2 -input_ab_scheme Chothia_Scheme -use_input_sc')
antibody_pose = pose_from_pdb('./data/7OBF_B.pdb') # nanobody

# Set the rotamer freedom of the region
ab_info = AntibodyInfo(antibody_pose)
disable_region = DisableAntibodyRegionOperation(ab_info, AntibodyRegionEnum.framework_region,
                                                disable_packing_and_design=False)

# Load the TaskOperation into a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(disable_region)

# Generate the PackerTask
packer_task = pack_tf.create_task_and_apply_taskoperations(antibody_pose)
print(packer_task)

PyRosetta-4 2021 [Rosetta PyRosetta4.conda.mac.cxx11thread.serialization.python37.Release 2021.31+release.c7009b3115c22daa9efe2805d9d1ebba08426a54 2021-08-07T10:04:12] retrieved from: http://www.pyrosetta.org
(C) Copyright Rosetta Commons Member Institutions. Created in JHU by Sergey Lyskov and PyRosetta Team.
[0mcore.init: {0} [0mChecking for fconfig files in pwd and ./rosetta/flags
[0mcore.init: {0} [0mRosetta version: PyRosetta4.conda.mac.cxx11thread.serialization.python37.Release r292 2021.31+release.c7009b3115c c7009b3115c22daa9efe2805d9d1ebba08426a54 http://www.pyrosetta.org 2021-08-07T10:04:12
[0mcore.init: {0} [0mcommand: PyRosetta -ex1 -ex2 -input_ab_scheme Chothia_Scheme -use_input_sc -database /opt/miniconda3/lib/python3.7/site-packages/pyrosetta/database
[0mbasic.random.init_random_generator: {0} [0m'RNG device' seed mode, using '/dev/urandom', seed=-475246036 seed_offset=0 real_seed=-475246036 thread_index=0
[0mbasic.random.init_random_generator: {0} [0mRandomGen

#### Exercise
Try the different regions and see which parts of the antibody structure each one covers.

#### 2.4.2 DisableCDRsOperation
Compared with the plain DisableAntibodyRegionOperation, DisableCDRsOperation offers finer control: the state of each of the six CDRs can be set individually.
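
The per-CDR selection is passed as a vector of booleans, one slot per CDR. A small helper can build that mask from CDR names instead of hand-written 0/1 lists. This is a sketch: `cdr_mask` is a hypothetical helper, and the slot order shown is an assumption that should be confirmed against `CDRNameEnum` in your PyRosetta build.

```python
def cdr_mask(selected, order):
    """Return a list of booleans, one per CDR slot in `order`."""
    unknown = set(selected) - set(order)
    if unknown:
        raise ValueError(f"unknown CDR names: {sorted(unknown)}")
    return [name in selected for name in order]


# Assumed slot order; confirm it against CDRNameEnum in your PyRosetta build.
ASSUMED_CDR_ORDER = ["H1", "H2", "H3", "L1", "L2", "L3", "H4", "L4"]

mask = cdr_mask({"H1", "H2"}, ASSUMED_CDR_ORDER)
print(mask)
```

Each entry of the resulting list can then be appended to a `vector1_bool`, in the same way the cells in this section fill their vectors.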

In [54]:
from pyrosetta.rosetta.protocols.antibody.task_operations import DisableCDRsOperation

# Select the CDRs: 1 marks a selected CDR. Here CDR-H1 and CDR-H2 are selected; the others are left untouched.
from pyrosetta.rosetta.utility import vector1_bool
cdr_allow = [1,1,0,0,0,0,0,0] # slot order follows CDRNameEnum: H1 H2 H3 L1 L2 L3 H4 L4
cdr_allow_vector1 = vector1_bool()
for i in cdr_allow:
    cdr_allow_vector1.append(i)

# Switch off all rotamer freedom for CDR-H1 and CDR-H2:
disable_cdr = DisableCDRsOperation(ab_info, cdr_allow_vector1, disable_packing_and_design=True)

# Load the TaskOperation into a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(disable_cdr)

# Generate the PackerTask
packer_task = pack_tf.create_task_and_apply_taskoperations(antibody_pose)
print(packer_task)

#Packer_Task

Threads to request: ALL AVAILABLE

resid	pack?	design?	allowed_aas
1	TRUE	TRUE	ALA:NtermProteinFull,CYS:NtermProteinFull,ASP:NtermProteinFull,GLU:NtermProteinFull,PHE:NtermProteinFull,GLY:NtermProteinFull,HIS:NtermProteinFull,HIS_D:NtermProteinFull,ILE:NtermProteinFull,LYS:NtermProteinFull,LEU:NtermProteinFull,MET:NtermProteinFull,ASN:NtermProteinFull,PRO:NtermProteinFull,GLN:NtermProteinFull,ARG:NtermProteinFull,SER:NtermProteinFull,THR:NtermProteinFull,VAL:NtermProteinFull,TRP:NtermProteinFull,TYR:NtermProteinFull
2	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
3	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
4	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
5	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR
6	TRUE	TRUE	ALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GL

#### 2.4.3 AddCDRProfilesOperation

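CDR-H1, H2, L1, L2 and L3 of antibodies adopt canonical structures, so they can be grouped into a number of clusters. The sequences within a cluster are partly conserved and partly diverse; this information is captured in profiles.
The profiles record the frequency distribution of each amino acid in each CDR cluster. AddCDRProfilesOperation picks amino acid types from the profile, one position after another and at random according to that frequency distribution, and adds them to the list of designable residue types. The PackerTask is therefore different every time the packer runs. set_picking_rounds sets how many times an amino acid is picked at each position; the larger the value, the more rotamer freedom the position is likely to gain.

Note: when a CDR has too little structural diversity, the rotamer freedom is generated from a conservative-design scheme instead.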
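
The effect of the picking rounds can be mimicked with a frequency-weighted draw. The sketch below is a conceptual illustration only: `pick_allowed_types` and the toy frequencies are hypothetical and not taken from the Rosetta database, and the real operation works per position on real cluster profiles.

```python
import random


def pick_allowed_types(profile, native, picking_rounds, rng, include_native=True):
    """Draw `picking_rounds` amino acids from `profile` by frequency."""
    types = list(profile)
    weights = [profile[t] for t in types]
    allowed = set(rng.choices(types, weights=weights, k=picking_rounds))
    if include_native:
        allowed.add(native)
    return allowed


# Hypothetical frequencies for one CDR position
toy_profile = {"S": 0.5, "T": 0.3, "G": 0.15, "Y": 0.05}

for rounds in (1, 2, 8):
    allowed = pick_allowed_types(toy_profile, native="A", picking_rounds=rounds,
                                 rng=random.Random(rounds))
    print(rounds, sorted(allowed))
```

Because repeated draws can return the same amino acid, more rounds raise the chance of a larger allowed set without guaranteeing it, which matches the "may increase" wording above.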

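There are therefore two basic generation strategies:
1. PRIMARY STRATEGIES: the initial design scheme. (Sampling diversity certainly increases, but some positions become harder to control.)

 * seq_design_profiles: sample from the amino acid frequencies of the matching cluster (e.g. sample within H1-13-1). This is the default.
 * seq_design_profile_sets: sample from the amino acid frequencies of all clusters (e.g. H1-13-1, H1-10-1, ... exhaustively). Experimental.
 * seq_design_profile_sets_combined: profiles and profile_sets combined. Experimental.
 * seq_design_conservative: when no cluster is found or the cluster has few members, use amino acid frequencies derived from the BLOSUM62 matrix.
 * seq_design_basic: all 20 amino acids can be allowed.

2. FALLBACK STRATEGIES: if no CDR profile is found or the cluster contains too few CDR sequences, the fallback strategy is invoked automatically.

 * seq_design_conservative: choose the allowed mutations from the BLOSUM62 matrix. This is the default.
 * seq_design_basic: allow design to all 20 amino acids.
 * seq_design_none: disable design.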
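
The switch between primary and fallback strategy can be summarised in a few lines. This is a hedged sketch of the decision as described in the text, not the Rosetta source: `choose_strategy` is a hypothetical helper, and `cluster_size=None` is an assumed way to represent "no cluster found".

```python
def choose_strategy(cluster_size, stats_cutoff=10,
                    primary="seq_design_profiles",
                    fallback="seq_design_conservative"):
    """Return the strategy name to use for one CDR.

    cluster_size is None when no cluster matched the CDR.
    """
    if cluster_size is None or cluster_size < stats_cutoff:
        return fallback
    return primary


print(choose_strategy(250))   # well-populated cluster
print(choose_strategy(4))     # too few members
print(choose_strategy(None))  # no cluster found
```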


cons_design_data_source: the substitution matrix to use. Accepted values:
'chothia_1976', 'BLOSUM30', 'blosum30', 'BLOSUM35', 'blosum35', 'BLOSUM40', 'blosum40',
'BLOSUM45', 'blosum45', 'BLOSUM50', 'blosum50', 'BLOSUM55', 'blosum55', 'BLOSUM60', 'blosum60',
'BLOSUM62', 'blosum62', 'BLOSUM65', 'blosum65', 'BLOSUM70', 'blosum70', 'BLOSUM75', 'blosum75',
'BLOSUM80', 'blosum80', 'BLOSUM85', 'blosum85', 'BLOSUM90', 'blosum90', 'BLOSUM100', 'blosum100'

Important: when a profile-set mode is used, the light chain type must be specified with the command-line option -light_chain [lambda, kappa].

In [55]:
from pyrosetta.rosetta.protocols.antibody import *
from pyrosetta.rosetta.protocols.antibody.task_operations import *
from pyrosetta.rosetta.protocols.antibody.design import *

add_cdr_profile = AddCDRProfilesOperation()
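add_cdr_profile.set_include_native_type(True) # also allow the wild-type amino acid
add_cdr_profile.set_picking_rounds(2) # controls how richly sequences are sampled

# Restrict the operation to a single CDR.
add_cdr_profile.set_cdr_only(h1) # select CDR-H1

# Choose the sampling strategies
add_cdr_profile.set_primary_strategy(seq_design_profiles) # profile-set modes require the -light_chain option at init
add_cdr_profile.set_fallback_strategy(seq_design_conservative)

# Increase variability (choose one)
add_cdr_profile.set_no_probability(False) # if enabled, ignore the profile frequencies and allow every amino acid type seen in the profile
add_cdr_profile.set_sample_zero_probs_at(False) # raise the probability of amino acids with zero frequency in the profile so that they may be sampled
add_cdr_profile.set_stats_cutoff(10) # profile cutoff: clusters with fewer than 10 members use the FALLBACK STRATEGIES
add_cdr_profile.set_use_outliers(False) # whether to sample sequences that are outliers of the cluster
add_cdr_profile.set_cons_design_data_source('BLOSUM62') # substitution matrix used for conservative design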

[0mprotocols.task_operations.ConservativeDesignOperation: {0} [0mLoading conservative mutational data


In [56]:
# Load the TaskOperation into a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(add_cdr_profile)

# Generate the PackerTask
packer_task = pack_tf.create_task_and_apply_taskoperations(antibody_pose)
print(packer_task)

[0mbasic.io.database: {0} [0mDatabase file opened: sampling/antibodies/cluster_center_dihedrals.txt
[0mprotocols.antibody.AntibodyNumberingParser: {0} [0mAntibody numbering scheme definitions read successfully
[0mprotocols.antibody.AntibodyNumberingParser: {0} [0mAntibody CDR definition read successfully
[0mantibody.AntibodyInfo: {0} [0mSuccessfully finished the CDR definition
[0mantibody.AntibodyInfo: {0} [0m Could not setup Vl Vh Packing angle for camelid antibody
[0mantibody.AntibodyInfo: {0} [0mAC Detecting Camelid CDR H3 Stem Type
[0mantibody.AntibodyInfo: {0} [0mAC Finished Detecting Camelid CDR H3 Stem Type: NEUTRAL
[0mantibody.AntibodyInfo: {0} [0mSetting up CDR Cluster for H1
[0mprotocols.antibody.cluster.CDRClusterMatcher: {0} [0mLength: 13 Omega: TTTTTTTTTTTTT
[0mantibody.AntibodyInfo: {0} [0mSetting up CDR Cluster for H2
[0mprotocols.antibody.cluster.CDRClusterMatcher: {0} [0mLength: 10 Omega: TTTTTTTTTT
[0mantibody.AntibodyInfo: {0} [0mSetting up CD

#### Exercise
The amino acid freedom at CDR-H1 has been reduced. Try rerunning the code above and increasing the set_picking_rounds parameter. What do you observe?

#### 2.4.4 AddCDRProfileSetsOperation
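This module is still experimental. Each time a PackerTask is generated, this TaskOperation randomly picks one CDR cluster (e.g. H1-13-1, H1-13-2 or H1-13-3) and uses its amino acid frequencies, so sampling is no longer confined to the cluster that the pose's own CDR structure belongs to. If a CDR in the structure has no matching cluster, design of that CDR is skipped.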

In [57]:
from pyrosetta.rosetta.protocols.antibody.task_operations import AddCDRProfileSetsOperation
add_cdr_profile_set = AddCDRProfileSetsOperation()
add_cdr_profile_set.set_include_native_type(True) # also allow the wild-type amino acid
add_cdr_profile_set.set_picking_rounds(1) # number of sequence sampling rounds

# Select the CDRs to operate on.
cdr_allow = [1,0,0,0,0,0,0,0] # slot order follows CDRNameEnum: H1 H2 H3 L1 L2 L3 H4 L4
vector1 = pyrosetta.rosetta.utility.vector1_bool()
for i in cdr_allow:
    vector1.append(i)
add_cdr_profile_set.set_cdrs(vector1)

# Length-based sampling setting:
#add_cdr_profile_set.set_limit_only_to_length(True) # by default the CDR length is not restricted

# Cutoff settings:
add_cdr_profile_set.set_cutoff(10) # profile cutoff: clusters with fewer than 10 members use the FALLBACK STRATEGIES
add_cdr_profile_set.set_use_outliers(False) # whether to sample sequences that are outliers of the cluster

# Load the TaskOperation into a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(add_cdr_profile_set) # push the profile-set operation defined in this cell

# Generate the PackerTask
packer_task = pack_tf.create_task_and_apply_taskoperations(antibody_pose)
print(packer_task)

[0mbasic.io.database: {0} [0mDatabase file opened: sampling/antibodies/cluster_center_dihedrals.txt
[0mprotocols.antibody.AntibodyNumberingParser: {0} [0mAntibody numbering scheme definitions read successfully
[0mprotocols.antibody.AntibodyNumberingParser: {0} [0mAntibody CDR definition read successfully
[0mantibody.AntibodyInfo: {0} [0mSuccessfully finished the CDR definition
[0mantibody.AntibodyInfo: {0} [0m Could not setup Vl Vh Packing angle for camelid antibody
[0mantibody.AntibodyInfo: {0} [0mAC Detecting Camelid CDR H3 Stem Type
[0mantibody.AntibodyInfo: {0} [0mAC Finished Detecting Camelid CDR H3 Stem Type: NEUTRAL
[0mantibody.AntibodyInfo: {0} [0mSetting up CDR Cluster for H1
[0mprotocols.antibody.cluster.CDRClusterMatcher: {0} [0mLength: 13 Omega: TTTTTTTTTTTTT
[0mantibody.AntibodyInfo: {0} [0mSetting up CDR Cluster for H2
[0mprotocols.antibody.cluster.CDRClusterMatcher: {0} [0mLength: 10 Omega: TTTTTTTTTT
[0mantibody.AntibodyInfo: {0} [0mSetting up CD

#### 2.4.5 RestrictToCDRsAndNeighbors
Sets the selected CDRs and their neighboring residues to repack, and every other residue to no_repack.
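
The neighbor rule can be pictured as a simple distance test. The sketch below is a toy illustration: it uses one hypothetical coordinate per residue and the helper name `residues_to_repack` is invented here, whereas the real operation works on the pose and its own neighbor definition.

```python
import math


def distance(p, q):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def residues_to_repack(coords, cdr_residues, neighbor_distance=8.0):
    """Return CDR residues plus residues within neighbor_distance of any of them."""
    cdr = set(cdr_residues)
    selected = set(cdr)
    for resid, xyz in coords.items():
        if resid in cdr:
            continue
        if any(distance(xyz, coords[c]) <= neighbor_distance for c in cdr):
            selected.add(resid)
    return selected


# Hypothetical one-point-per-residue coordinates in angstroms
toy_coords = {1: (0.0, 0.0, 0.0), 2: (3.8, 0.0, 0.0),
              3: (7.6, 0.0, 0.0), 4: (30.0, 0.0, 0.0)}
print(sorted(residues_to_repack(toy_coords, cdr_residues=[1])))
```

Residues outside the returned set correspond to the positions that end up as no_repack.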

In [58]:
from pyrosetta.rosetta.protocols.antibody.task_operations import RestrictToCDRsAndNeighbors
from pyrosetta.rosetta.protocols.antibody import CDRNameEnum
from pyrosetta.rosetta.utility import vector1_bool

# Select the CDRs:
cdr_allow = [1, 1, 0, 0, 0, 0, 0, 0] # slot order follows CDRNameEnum: H1 H2 H3 L1 L2 L3 H4 L4
vector1 = vector1_bool()
for i in cdr_allow:
 vector1.append(i)
print(vector1)

# Restrict residues around the CDRs to repacking; non-neighboring regions are set to no_repack.
restrict_cdr_nbr = RestrictToCDRsAndNeighbors()
restrict_cdr_nbr.set_cdrs(vector1)
restrict_cdr_nbr.set_allow_design_cdr(0) # pack state of the CDRs: design (1) or repack only (0)
restrict_cdr_nbr.set_allow_design_neighbor_framework(0) # pack state of neighboring framework residues: design (1) or repack only (0)
restrict_cdr_nbr.set_allow_design_neighbor_antigen(0) # pack state of neighboring antigen residues: design (1) or repack only (0)
restrict_cdr_nbr.set_neighbor_distance(8.0)
restrict_cdr_nbr.set_stem_size(0) # extend each CDR by N residues at both ends when computing neighbors (the added residues still belong to the framework)

# Load the TaskOperation into a TaskFactory
pack_tf = TaskFactory()
pack_tf.push_back(restrict_cdr_nbr)

# Generate the PackerTask (currently buggy: running this raises an error)
# packer_task = pack_tf.create_task_and_apply_taskoperations(antibody_pose)
# print(packer_task)

vector1_bool[1, 1, 0, 0, 0, 0, 0, 0]
