{ "cells": [ { "cell_type": "markdown", "id": "mature-restriction", "metadata": {}, "source": [ "## Resfile System" ] }, { "cell_type": "markdown", "id": "monthly-september", "metadata": {}, "source": [ "@Author: 吴炜坤\n", "\n", "@email:weikun.wu@xtalpi.com/weikunwu@163.com" ] }, { "cell_type": "markdown", "id": "intelligent-associate", "metadata": {}, "source": [ "Resfile是控制Packer的外部输入文件,用于告诉Rosetta Packer如何对结构中的每一个氨基酸侧链自由度进行定义。Resfile以文件的形式被ReadResfile函数读取并生成对应的TaskOperation。用户可以很方便地在外部进行快速的定义,而不需要写出复杂的selector+RLT的方式。但Resfile系统的缺点是每个文件的定义都需要人去处理,无法自动化完成任务。" ] }, { "cell_type": "code", "execution_count": 1, "id": "understanding-satin", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "PyRosetta-4 2021 [Rosetta PyRosetta4.conda.mac.cxx11thread.serialization.python37.Release 2021.26+release.b308454c455dd04f6824cc8b23e54bbb9be2cdd7 2021-07-02T13:01:54] retrieved from: http://www.pyrosetta.org\n", "(C) Copyright Rosetta Commons Member Institutions. Created in JHU by Sergey Lyskov and PyRosetta Team.\n", "\u001b[0mcore.init: {0} \u001b[0mChecking for fconfig files in pwd and ./rosetta/flags\n", "\u001b[0mcore.init: {0} \u001b[0mRosetta version: PyRosetta4.conda.mac.cxx11thread.serialization.python37.Release r288 2021.26+release.b308454c455 b308454c455dd04f6824cc8b23e54bbb9be2cdd7 http://www.pyrosetta.org 2021-07-02T13:01:54\n", "\u001b[0mcore.init: {0} \u001b[0mcommand: PyRosetta -ex1 -ex2aro -database /opt/miniconda3/lib/python3.7/site-packages/pyrosetta/database\n", "\u001b[0mbasic.random.init_random_generator: {0} \u001b[0m'RNG device' seed mode, using '/dev/urandom', seed=-867162182 seed_offset=0 real_seed=-867162182 thread_index=0\n", "\u001b[0mbasic.random.init_random_generator: {0} \u001b[0mRandomGenerator:init: Normal mode, seed=-867162182 RG_type=mt19937\n", "\u001b[0mcore.chemical.GlobalResidueTypeSet: {0} \u001b[0mFinished initializing fa_standard residue type set. Created 984 residue types\n", "\u001b[0mcore.chemical.GlobalResidueTypeSet: {0} \u001b[0mTotal time to initialize 0.668191 seconds.\n", "\u001b[0mcore.import_pose.import_pose: {0} \u001b[0mFile './data/helix.pdb' automatically determined to be of type PDB\n", "\u001b[0mcore.conformation.Conformation: {0} \u001b[0m\u001b[1m[ WARNING ]\u001b[0m missing heavyatom: OXT on residue GLY:CtermProteinFull 14\n" ] } ], "source": [ "# 初始化PyRosetta\n", "from pyrosetta import init, pose_from_pdb\n", "from pyrosetta.rosetta.core.pack.task.operation import ReadResfile\n", "from pyrosetta.rosetta.core.pack.task import TaskFactory\n", "init()\n", "pose = pose_from_pdb('./data/helix.pdb')" ] }, { "cell_type": "markdown", "id": "crucial-scratch", "metadata": {}, "source": [ "### 一、Resfile的基本格式" ] }, { "cell_type": "markdown", "id": "attempted-consistency", "metadata": {}, "source": [ "在Resfile中,我们使用的编号策略是: PDB Numbering, 并且大小写敏感。通常一个Resfile的格式包括两部分:HEADER & BODYs.\n", "\n", "每个部分的作用:\n", "- HEADER: **控制全局**如何进行Rotamer搜索方式;\n", "- BODY: 记录明确指定的**特定位置或范围**氨基酸Rotamer搜索方式;\n" ] }, { "cell_type": "markdown", "id": "possible-novel", "metadata": {}, "source": [ "Resfile实例:" ] }, { "cell_type": "markdown", "id": "together-nutrition", "metadata": {}, "source": [ "<center><img src=\"./img/Resfile_format.jpg\" width = \"600\" height = \"200\" align=center /></center>\n", "(图片来源: 晶泰科技团队)" ] }, { "cell_type": "markdown", "id": "sealed-badge", "metadata": {}, "source": [ "### 二、HEADER的语法与编写" ] }, { "cell_type": "markdown", "id": "artificial-skiing", "metadata": {}, "source": [ "#### 2.1 全局自由度控制" ] }, { "cell_type": "markdown", "id": "least-cleaner", "metadata": {}, "source": [ "这部分语法实现的是除BODY部分中出现的氨基酸位点进行自由度控制,控制可以设置为Design/Repacking/No_repack三种基本状态。以下列举所有的语法:\n", "- ALLAA ......... # 允许设计为20种氨基酸\n", "- ALLAAxc ......... # 允许设计为**非半胱氨酸**以外的所有氨基酸\n", "- POLAR ......... # 允许设计极性氨基酸(DEHKNQRST)\n", "- APOLAR ......... # 允许设计非极性氨基酸(ACFGILMPVWY)\n", "- NOTAA ......... # 不允许设计为特定的氨基酸列表。(列表连续编写无空格)\n", "- PIKAA ......... # 只允许设计为特定的氨基酸列表。(列表连续编写无空格)\n", "- NATAA ......... # Repack当前氨基酸类型,只允许构象变化。\n", "- NATRO ......... # NoRepack不允许构象变化。\n", "- PROPERTY ......... # 只允许设计为有以下性质的氨基酸" ] }, { "cell_type": "markdown", "id": "major-finance", "metadata": {}, "source": [ "PROPERTY一般常用可选:\n", "- METAL: 金属离子\n", "- POLAR: 极性氨基酸\n", "- HYDROPHOBIC: 疏水氨基酸\n", "- CHARGED: 带电氨基酸\n", "- NEGATIVE_CHARGE: 带负电氨基酸\n", "- POSITIVE_CHARGE: 带正电氨基酸\n", "- AROMATIC: 芳香族氨基酸" ] }, { "cell_type": "markdown", "id": "academic-gather", "metadata": {}, "source": [ "编写一个NATRO相关的Resfile的demo:\n", "```\n", "NATRO\n", "START\n", "```" ] }, { "cell_type": "code", "execution_count": 2, "id": "genetic-spank", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "#Packer_Task\n", "\n", "Threads to request: ALL AVAILABLE\n", "\n", "resid\tpack?\tdesign?\tallowed_aas\n", "1\tFALSE\tFALSE\t\n", "2\tFALSE\tFALSE\t\n", "3\tFALSE\tFALSE\t\n", "4\tFALSE\tFALSE\t\n", "5\tFALSE\tFALSE\t\n", "6\tFALSE\tFALSE\t\n", "7\tFALSE\tFALSE\t\n", "8\tFALSE\tFALSE\t\n", "9\tFALSE\tFALSE\t\n", "10\tFALSE\tFALSE\t\n", "11\tFALSE\tFALSE\t\n", "12\tFALSE\tFALSE\t\n", "13\tFALSE\tFALSE\t\n", "14\tFALSE\tFALSE\t\n", "\n" ] } ], "source": [ "# restrict to baestype list\n", "resfile_type = ReadResfile('./data/NATRO.resfile')\n", "\n", "# 将TaskOperations加载至TaskFactory中\n", "pack_tf = TaskFactory()\n", "pack_tf.push_back(resfile_type)\n", "\n", "# 生成PackerTask\n", "packer_task = pack_tf.create_task_and_apply_taskoperations(pose)\n", "print(packer_task)" ] }, { "cell_type": "markdown", "id": "considered-lounge", "metadata": {}, "source": [ "#### 练习\n", "尝试写更多的Resfile并读取到上述的代码中,比较不同语法之间的差异。" ] }, { "cell_type": "markdown", "id": "exempt-dating", "metadata": {}, "source": [ "#### 2.2 Rotamer采样丰度的设置:\n", "\n", "Rosetta Pack采样Rotamer时是离散的,默认只会采纳每个格点的中心富集的构象,我们可以通过Extra Rotamer相关控制手段来增加Rotamer的采样,默认扩充采样时,采集Rotamer时会额外考虑平均$\\chi$的+/-1个标准差的构象。这种Extra Rotamer相关控制仅对**包埋**的残基有效!\n", "\n", "只要在Resfile中HEADER中使用Extra Rotamer Commands字段即可,目前针对不同的氨基酸有四种编写方式:\n", "\n", "1. EX \\<chi-id> LEVEL \\<level-value> 语法\n", "\n", " - EX \\<chi-id> 代表指定对侧链中第几个$\\chi$角进行扩大采样, 如EX 1 EX 2 代表同时对$\\chi_{1}$角和$\\chi_{2}$角扩大采样\n", " - LEVEL \\<level-value> 代表如何允许的$\\chi$角标准差范围,如果缺省这部分的参数默认为+/-1个标准差\n", "\n", "2. EX ARO \\<chi-id> LEVEL \\<level-value> 语法\n", " - EX ARO \\<chi-id> 代表指定对侧链中第几个$\\chi$角进行扩大采样, **但范围仅限于芳香族氨基酸(FHWY)!**\n", " - LEVEL \\<level-value> 代表如何允许的$\\chi$角标准差范围,如果缺省这部分的参数默认为+/-1个标准差\n", "\n", "3. EX_CUTOFF \\<number of neighbors> 语法\n", " - Rosetta默认不会对处于蛋白表面的氨基酸进行额外Rotamer采集,除非用户显式地设置(EX_CUTOFF >=0等),来改变“包埋”氨基酸的判断条件。默认为统计当前氨基酸10埃范围内残基数量,当数量大于设定的阈值时,认为是\"包埋\"的氨基酸,进行额外的Rotamer采样。因此通常EX_CUTOFF被显式地设置为0(默认值为18),来考虑所有的氨基酸位点都做Rotamer。\n", "\n", "4. USE_INPUT_SC 语法\n", " - 第一轮Packer时考虑初始输入的侧链Rotamer构象" ] }, { "cell_type": "markdown", "id": "toxic-avatar", "metadata": {}, "source": [ "LEVEL参数目前有7个级别:\n", "- 0 ...... no extra chi angles\n", "- 1 ...... sample at 1 standard deviation\n", "- 2 ...... sample at 1/2 standard deviation\n", "- 3 ...... sample at two full standard deviations\n", "- 4 ...... sample at two 1/2 standard deviations\n", "- 5 ...... sample at four 1/2 standard deviations\n", "- 6 ...... sample at three 1/3 standard deviations\n", "- 7 ...... sample at six 1/4 standard deviations" ] }, { "cell_type": "markdown", "id": "sticky-retention", "metadata": {}, "source": [ "以下举一个控制Rotamer丰度的Resfile例子:\n", "```\n", "NATRO\n", "EX 1 EX 2\n", "START\n", "```" ] }, { "cell_type": "code", "execution_count": 3, "id": "random-sucking", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "start\n", "1 A NATRO EX ARO 1 EX ARO 2\n", "2 A NATRO EX ARO 1 EX ARO 2\n", "3 A NATRO EX ARO 1 EX ARO 2\n", "4 A NATRO EX ARO 1 EX ARO 2\n", "5 A NATRO EX ARO 1 EX ARO 2\n", "6 A NATRO EX ARO 1 EX ARO 2\n", "7 A NATRO EX ARO 1 EX ARO 2\n", "8 A NATRO EX ARO 1 EX ARO 2\n", "9 A NATRO EX ARO 1 EX ARO 2\n", "10 A NATRO EX ARO 1 EX ARO 2\n", "11 A NATRO EX ARO 1 EX ARO 2\n", "12 A NATRO EX ARO 1 EX ARO 2\n", "13 A NATRO EX ARO 1 EX ARO 2\n", "14 A NATRO EX ARO 1 EX ARO 2\n", "\n" ] } ], "source": [ "# restrict to baestype list\n", "resfile_type = ReadResfile('./data/EX1EX2.resfile')\n", "\n", "# 将TaskOperations加载至TaskFactory中\n", "pack_tf = TaskFactory()\n", "pack_tf.push_back(resfile_type)\n", "\n", "# 生成PackerTask\n", "packer_task = pack_tf.create_task_and_apply_taskoperations(pose)\n", "\n", "# 查看每个残基的Rotamer采样级别:\n", "print(packer_task.task_string(pose))" ] }, { "cell_type": "markdown", "id": "several-sunglasses", "metadata": {}, "source": [ "可见所有位点的Rotamer都会额外采集$\\chi_{1}$和$\\chi_{2}$角。**(上述结果存在显示错误,我们并没有设置ARO,其实是没有设置ARO的。因此不影响实际的运行效果)**" ] }, { "cell_type": "markdown", "id": "aboriginal-constitution", "metadata": {}, "source": [ "除了全局控制,我们可以还可在HEADER中特定地给一些氨基酸设置额外Rotamer采集:\n", "```\n", "1. EX 1 EX 2\n", "\n", "2. EX ARO 2\n", "\n", "3. EX 1 LEVEL 7\n", "\n", "4. EX 1 EX ARO 1 LEVEL 4\n", "```" ] }, { "cell_type": "markdown", "id": "closed-actress", "metadata": {}, "source": [ "#### 练习\n", "思考上述语法的具体含义。" ] }, { "cell_type": "markdown", "id": "defined-trainer", "metadata": {}, "source": [ "### 三、BODY的语法与编写" ] }, { "cell_type": "markdown", "id": "internal-blond", "metadata": {}, "source": [ "BODY部分用于指明特定位点或范围的氨基酸Rotamer自由度。指定的形式一共有4种。\n", "- 特定单个位点指定\n", "- 指定位点范围\n", "- 指定链范围\n", "- 多重指定" ] }, { "cell_type": "markdown", "id": "recorded-bottle", "metadata": {}, "source": [ "#### 3.1 位点指定\n", "第一列为氨基酸的PDB编号(允许有insert code)。第二列为PDB链编号,第三列为COMMAND项。\n", "\n", "基本语法: \n", "```\n", "<PDBNUM>[<ICODE>] <CHAIN> <COMMANDS>\n", "```" ] }, { "cell_type": "markdown", "id": "wired-shareware", "metadata": {}, "source": [ "注: ICODE是指PDB中存在特殊插入编号字符时使用,如抗体等有特殊编号的系统,和PDBNUM连续编写如35A,35B等。正常的PDBNUM应该只有数字。" ] }, { "cell_type": "markdown", "id": "reserved-miniature", "metadata": {}, "source": [ "使用举例:\n", "```\n", "NATRO\n", "EX 1 EX 2\n", "START\n", "\n", "3 A ALLAA # 3号位允许设计为20种氨基酸\n", "4 A APOLAR # 4号位只允许在非极性氨基酸范围\n", "```" ] }, { "cell_type": "code", "execution_count": 4, "id": "necessary-depression", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "#Packer_Task\n", "\n", "Threads to request: ALL AVAILABLE\n", "\n", "resid\tpack?\tdesign?\tallowed_aas\n", "1\tFALSE\tFALSE\t\n", "2\tFALSE\tFALSE\t\n", "3\tTRUE\tTRUE\tALA,CYS,ASP,GLU,PHE,GLY,HIS,HIS_D,ILE,LYS,LEU,MET,ASN,PRO,GLN,ARG,SER,THR,VAL,TRP,TYR\n", "4\tTRUE\tTRUE\tALA,CYS,PHE,GLY,ILE,LEU,MET,PRO,VAL,TRP,TYR\n", "5\tFALSE\tFALSE\t\n", "6\tFALSE\tFALSE\t\n", "7\tFALSE\tFALSE\t\n", "8\tFALSE\tFALSE\t\n", "9\tFALSE\tFALSE\t\n", "10\tFALSE\tFALSE\t\n", "11\tFALSE\tFALSE\t\n", "12\tFALSE\tFALSE\t\n", "13\tFALSE\tFALSE\t\n", "14\tFALSE\tFALSE\t\n", "\n" ] } ], "source": [ "# restrict to baestype list\n", "resfile_type = ReadResfile('./data/position_mut.resfile')\n", "\n", "# 将TaskOperations加载至TaskFactory中\n", "pack_tf = TaskFactory()\n", "pack_tf.push_back(resfile_type)\n", "\n", "# 生成PackerTask\n", "packer_task = pack_tf.create_task_and_apply_taskoperations(pose)\n", "print(packer_task)" ] }, { "cell_type": "markdown", "id": "worth-mechanics", "metadata": { "tags": [] }, "source": [ "#### 3.2 位点范围进行指定\n", "范围氨基酸指定格式:\n", "```\n", "<PDBNUM>[<ICODE>] - <PDBNUM>[<ICODE>] <CHAIN> <COMMANDS>\n", "```" ] }, { "cell_type": "markdown", "id": "expanded-section", "metadata": {}, "source": [ "使用举例:\n", "```\n", "NATRO\n", "EX 1 EX 2\n", "START\n", "\n", "1 - 5 A APOLAR # A链1-5号位只允许在非极性氨基酸范围\n", "```" ] }, { "cell_type": "code", "execution_count": 5, "id": "induced-memorabilia", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "#Packer_Task\n", "\n", "Threads to request: ALL AVAILABLE\n", "\n", "resid\tpack?\tdesign?\tallowed_aas\n", "1\tTRUE\tTRUE\tALA:NtermProteinFull,CYS:NtermProteinFull,PHE:NtermProteinFull,GLY:NtermProteinFull,ILE:NtermProteinFull,LEU:NtermProteinFull,MET:NtermProteinFull,PRO:NtermProteinFull,VAL:NtermProteinFull,TRP:NtermProteinFull,TYR:NtermProteinFull\n", "2\tTRUE\tTRUE\tALA,CYS,PHE,GLY,ILE,LEU,MET,PRO,VAL,TRP,TYR\n", "3\tTRUE\tTRUE\tALA,CYS,PHE,GLY,ILE,LEU,MET,PRO,VAL,TRP,TYR\n", "4\tTRUE\tTRUE\tALA,CYS,PHE,GLY,ILE,LEU,MET,PRO,VAL,TRP,TYR\n", "5\tTRUE\tTRUE\tALA,CYS,PHE,GLY,ILE,LEU,MET,PRO,VAL,TRP,TYR\n", "6\tFALSE\tFALSE\t\n", "7\tFALSE\tFALSE\t\n", "8\tFALSE\tFALSE\t\n", "9\tFALSE\tFALSE\t\n", "10\tFALSE\tFALSE\t\n", "11\tFALSE\tFALSE\t\n", "12\tFALSE\tFALSE\t\n", "13\tFALSE\tFALSE\t\n", "14\tFALSE\tFALSE\t\n", "\n" ] } ], "source": [ "# restrict to baestype list\n", "resfile_type = ReadResfile('./data/range_mut.resfile')\n", "\n", "# 将TaskOperations加载至TaskFactory中\n", "pack_tf = TaskFactory()\n", "pack_tf.push_back(resfile_type)\n", "\n", "# 生成PackerTask\n", "packer_task = pack_tf.create_task_and_apply_taskoperations(pose)\n", "print(packer_task)" ] }, { "cell_type": "markdown", "id": "traditional-reconstruction", "metadata": {}, "source": [ "#### 3.3 链为单位进行指定\n", "链单位指定基本格式:\n", "```\n", "* <CHAIN> <COMMANDS>\n", "```" ] }, { "cell_type": "markdown", "id": "conscious-backing", "metadata": {}, "source": [ "使用举例:\n", "```\n", "NATRO\n", "EX 1 EX 2\n", "START\n", "\n", "* A PROPERTY HYDROPHOBIC # A链所有位点可设计为疏水的天然氨基酸\n", "```" ] }, { "cell_type": "code", "execution_count": 6, "id": "compatible-respondent", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "#Packer_Task\n", "\n", "Threads to request: ALL AVAILABLE\n", "\n", "resid\tpack?\tdesign?\tallowed_aas\n", "1\tTRUE\tTRUE\tPHE:NtermProteinFull,ILE:NtermProteinFull,LEU:NtermProteinFull,MET:NtermProteinFull,VAL:NtermProteinFull,TRP:NtermProteinFull,TYR:NtermProteinFull\n", "2\tTRUE\tTRUE\tPHE,ILE,LEU,MET,VAL,TRP,TYR\n", "3\tTRUE\tTRUE\tPHE,ILE,LEU,MET,VAL,TRP,TYR\n", "4\tTRUE\tTRUE\tPHE,ILE,LEU,MET,VAL,TRP,TYR\n", "5\tTRUE\tTRUE\tPHE,ILE,LEU,MET,VAL,TRP,TYR\n", "6\tTRUE\tTRUE\tPHE,ILE,LEU,MET,VAL,TRP,TYR\n", "7\tTRUE\tTRUE\tPHE,ILE,LEU,MET,VAL,TRP,TYR\n", "8\tTRUE\tTRUE\tPHE,ILE,LEU,MET,VAL,TRP,TYR\n", "9\tTRUE\tTRUE\tPHE,ILE,LEU,MET,VAL,TRP,TYR\n", "10\tTRUE\tTRUE\tPHE,ILE,LEU,MET,VAL,TRP,TYR\n", "11\tTRUE\tTRUE\tPHE,ILE,LEU,MET,VAL,TRP,TYR\n", "12\tTRUE\tTRUE\tPHE,ILE,LEU,MET,VAL,TRP,TYR\n", "13\tTRUE\tTRUE\tPHE,ILE,LEU,MET,VAL,TRP,TYR\n", "14\tTRUE\tTRUE\tPHE:CtermProteinFull,ILE:CtermProteinFull,LEU:CtermProteinFull,MET:CtermProteinFull,VAL:CtermProteinFull,TRP:CtermProteinFull,TYR:CtermProteinFull\n", "\n", "start\n", "\n" ] } ], "source": [ "# restrict to baestype list\n", "resfile_type = ReadResfile('./data/chain_range.resfile')\n", "\n", "# 将TaskOperations加载至TaskFactory中\n", "pack_tf = TaskFactory()\n", "pack_tf.push_back(resfile_type)\n", "\n", "# 生成PackerTask\n", "packer_task = pack_tf.create_task_and_apply_taskoperations(pose)\n", "print(packer_task)\n", "\n", "# 查看每个残基的Rotamer采样级别:\n", "print(packer_task.task_string(pose))" ] }, { "cell_type": "markdown", "id": "personalized-brisbane", "metadata": {}, "source": [ "#### 3.4 非标准氨基酸的指定\n", "当在BODY中想引入非标准氨基酸时,需要特殊的格式进行指定(2019年版本的Rosetta支持该语法)\n", "```\n", "<PDBNUM>[<ICODE>] <CHAIN> <COMMANDS> X[ncaa]\n", "```" ] }, { "cell_type": "markdown", "id": "proof-winner", "metadata": {}, "source": [ "不同的地方在于COMMANDs部分: 非标准氨基酸加入前必须加入\"X[ncaa]\" ncaa=非标准氨基酸的三字母缩写\n", "\n", "使用举例:\n", "```\n", "NATRO\n", "EX 1 EX 2\n", "START\n", "\n", "5 A PIKAA X[B36]X[A20] # 5号引入单点非标准氨基酸B36以及A20非标准氨基酸\n", "```" ] }, { "cell_type": "code", "execution_count": 7, "id": "selective-jumping", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "#Packer_Task\n", "\n", "Threads to request: ALL AVAILABLE\n", "\n", "resid\tpack?\tdesign?\tallowed_aas\n", "1\tFALSE\tFALSE\t\n", "2\tFALSE\tFALSE\t\n", "3\tFALSE\tFALSE\t\n", "4\tFALSE\tFALSE\t\n", "5\tTRUE\tTRUE\tB36,A20\n", "6\tFALSE\tFALSE\t\n", "7\tFALSE\tFALSE\t\n", "8\tFALSE\tFALSE\t\n", "9\tFALSE\tFALSE\t\n", "10\tFALSE\tFALSE\t\n", "11\tFALSE\tFALSE\t\n", "12\tFALSE\tFALSE\t\n", "13\tFALSE\tFALSE\t\n", "14\tFALSE\tFALSE\t\n", "\n" ] } ], "source": [ "from pyrosetta.rosetta.core.pack.palette import CustomBaseTypePackerPalette\n", "# restrict to baestype list\n", "resfile_type = ReadResfile('./data/ncaa.resfile')\n", "\n", "# 先在CustomBaseTypePackerPalette引入NCAA列表\n", "pp = CustomBaseTypePackerPalette()\n", "pp.add_type('B36')\n", "pp.add_type('A20')\n", "\n", "# 将TaskOperations加载至TaskFactory中\n", "pack_tf = TaskFactory()\n", "pack_tf.set_packer_palette(pp) ## 加载Palette到TaskFactory中;\n", "pack_tf.push_back(resfile_type)\n", "\n", "# 生成PackerTask\n", "packer_task = pack_tf.create_task_and_apply_taskoperations(pose)\n", "print(packer_task)" ] }, { "cell_type": "markdown", "id": "growing-raleigh", "metadata": {}, "source": [ "#### 3.5 多重逻辑指定\n", "在Resfile中,如果BODY部分指定发生了重叠,有两种处理方式:\n", "\n", "- 如果BODY是同一种指定级别,**按照交集逻辑**进行处理,**无交集时为空集**。\n", "- 不同指定级别时,单位点指定的优先级高于范围级指定的优先级" ] }, { "cell_type": "markdown", "id": "genetic-diary", "metadata": {}, "source": [ "使用举例:\n", "```\n", "NATRO\n", "EX 1 EX 2\n", "START\n", "\n", "1 - 5 A APOLAR # A链1-5号位只允许在非极性氨基酸范围内进行Rotamer搜索\n", "3 - 5 A POLAR # A链3-5号位只允许在极性氨基酸范围内进行Rotamer搜索\n", "```" ] }, { "cell_type": "code", "execution_count": 8, "id": "spatial-qatar", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "#Packer_Task\n", "\n", "Threads to request: ALL AVAILABLE\n", "\n", "resid\tpack?\tdesign?\tallowed_aas\n", "1\tTRUE\tTRUE\tALA:NtermProteinFull,CYS:NtermProteinFull,PHE:NtermProteinFull,GLY:NtermProteinFull,ILE:NtermProteinFull,LEU:NtermProteinFull,MET:NtermProteinFull,PRO:NtermProteinFull,VAL:NtermProteinFull,TRP:NtermProteinFull,TYR:NtermProteinFull\n", "2\tTRUE\tTRUE\tALA,CYS,PHE,GLY,ILE,LEU,MET,PRO,VAL,TRP,TYR\n", "3\tFALSE\tFALSE\t\n", "4\tFALSE\tFALSE\t\n", "5\tFALSE\tFALSE\t\n", "6\tFALSE\tFALSE\t\n", "7\tFALSE\tFALSE\t\n", "8\tFALSE\tFALSE\t\n", "9\tFALSE\tFALSE\t\n", "10\tFALSE\tFALSE\t\n", "11\tFALSE\tFALSE\t\n", "12\tFALSE\tFALSE\t\n", "13\tFALSE\tFALSE\t\n", "14\tFALSE\tFALSE\t\n", "\n" ] } ], "source": [ "# restrict to baestype list\n", "resfile_type = ReadResfile('./data/multi-logic.resfile')\n", "\n", "# 将TaskOperations加载至TaskFactory中\n", "pack_tf = TaskFactory()\n", "pack_tf.push_back(resfile_type)\n", "\n", "# 生成PackerTask\n", "packer_task = pack_tf.create_task_and_apply_taskoperations(pose)\n", "print(packer_task)" ] }, { "cell_type": "markdown", "id": "surprised-spread", "metadata": {}, "source": [ "同一指定级别,3-5号氨基酸自由度为空集。" ] }, { "cell_type": "markdown", "id": "acceptable-baking", "metadata": {}, "source": [ "另外一种情况: \n", "\n", "使用举例:\n", "```\n", "NATRO\n", "EX 1 EX 2\n", "START\n", "\n", "1 - 5 A APOLAR # A链1-5号位只允许在非极性氨基酸范围内进行Rotamer搜索\n", "3 A POLAR # A链3号氨基酸设计为极性氨基酸范围\n", "```" ] }, { "cell_type": "code", "execution_count": 9, "id": "sharing-steal", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "#Packer_Task\n", "\n", "Threads to request: ALL AVAILABLE\n", "\n", "resid\tpack?\tdesign?\tallowed_aas\n", "1\tTRUE\tTRUE\tALA:NtermProteinFull,CYS:NtermProteinFull,PHE:NtermProteinFull,GLY:NtermProteinFull,ILE:NtermProteinFull,LEU:NtermProteinFull,MET:NtermProteinFull,PRO:NtermProteinFull,VAL:NtermProteinFull,TRP:NtermProteinFull,TYR:NtermProteinFull\n", "2\tTRUE\tTRUE\tALA,CYS,PHE,GLY,ILE,LEU,MET,PRO,VAL,TRP,TYR\n", "3\tTRUE\tTRUE\tASP,GLU,HIS,HIS_D,LYS,ASN,GLN,ARG,SER,THR\n", "4\tTRUE\tTRUE\tALA,CYS,PHE,GLY,ILE,LEU,MET,PRO,VAL,TRP,TYR\n", "5\tTRUE\tTRUE\tALA,CYS,PHE,GLY,ILE,LEU,MET,PRO,VAL,TRP,TYR\n", "6\tFALSE\tFALSE\t\n", "7\tFALSE\tFALSE\t\n", "8\tFALSE\tFALSE\t\n", "9\tFALSE\tFALSE\t\n", "10\tFALSE\tFALSE\t\n", "11\tFALSE\tFALSE\t\n", "12\tFALSE\tFALSE\t\n", "13\tFALSE\tFALSE\t\n", "14\tFALSE\tFALSE\t\n", "\n" ] } ], "source": [ "# restrict to baestype list\n", "resfile_type = ReadResfile('./data/multi-logic2.resfile')\n", "\n", "# 将TaskOperations加载至TaskFactory中\n", "pack_tf = TaskFactory()\n", "pack_tf.push_back(resfile_type)\n", "\n", "# 生成PackerTask\n", "packer_task = pack_tf.create_task_and_apply_taskoperations(pose)\n", "print(packer_task)" ] }, { "cell_type": "markdown", "id": "south-migration", "metadata": {}, "source": [ "得到的结果: 3号设计为极性氨酸,1-2,4-5号氨基酸设计为非极性氨基酸。因为单位点优先级高于氨基酸范围指定。" ] }, { "cell_type": "markdown", "id": "prepared-serve", "metadata": {}, "source": [ "再举一个例子:\n", "```\n", "NATRO\n", "EX 1 EX 2\n", "START\n", "\n", "* A POLAR # A链所有氨基酸设计为极性氨酸\n", "1 - 5 A APOLAR # A链1-5号位只允许在非极性氨基酸范围内进行Rotamer搜索\n", "```" ] }, { "cell_type": "code", "execution_count": 10, "id": "apart-protein", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "#Packer_Task\n", "\n", "Threads to request: ALL AVAILABLE\n", "\n", "resid\tpack?\tdesign?\tallowed_aas\n", "1\tTRUE\tTRUE\tALA:NtermProteinFull,CYS:NtermProteinFull,PHE:NtermProteinFull,GLY:NtermProteinFull,ILE:NtermProteinFull,LEU:NtermProteinFull,MET:NtermProteinFull,PRO:NtermProteinFull,VAL:NtermProteinFull,TRP:NtermProteinFull,TYR:NtermProteinFull\n", "2\tTRUE\tTRUE\tALA,CYS,PHE,GLY,ILE,LEU,MET,PRO,VAL,TRP,TYR\n", "3\tTRUE\tTRUE\tALA,CYS,PHE,GLY,ILE,LEU,MET,PRO,VAL,TRP,TYR\n", "4\tTRUE\tTRUE\tALA,CYS,PHE,GLY,ILE,LEU,MET,PRO,VAL,TRP,TYR\n", "5\tTRUE\tTRUE\tALA,CYS,PHE,GLY,ILE,LEU,MET,PRO,VAL,TRP,TYR\n", "6\tTRUE\tTRUE\tASP,GLU,HIS,HIS_D,LYS,ASN,GLN,ARG,SER,THR\n", "7\tTRUE\tTRUE\tASP,GLU,HIS,HIS_D,LYS,ASN,GLN,ARG,SER,THR\n", "8\tTRUE\tTRUE\tASP,GLU,HIS,HIS_D,LYS,ASN,GLN,ARG,SER,THR\n", "9\tTRUE\tTRUE\tASP,GLU,HIS,HIS_D,LYS,ASN,GLN,ARG,SER,THR\n", "10\tTRUE\tTRUE\tASP,GLU,HIS,HIS_D,LYS,ASN,GLN,ARG,SER,THR\n", "11\tTRUE\tTRUE\tASP,GLU,HIS,HIS_D,LYS,ASN,GLN,ARG,SER,THR\n", "12\tTRUE\tTRUE\tASP,GLU,HIS,HIS_D,LYS,ASN,GLN,ARG,SER,THR\n", "13\tTRUE\tTRUE\tASP,GLU,HIS,HIS_D,LYS,ASN,GLN,ARG,SER,THR\n", "14\tTRUE\tTRUE\tASP:CtermProteinFull,GLU:CtermProteinFull,HIS:CtermProteinFull,HIS_D:CtermProteinFull,LYS:CtermProteinFull,ASN:CtermProteinFull,GLN:CtermProteinFull,ARG:CtermProteinFull,SER:CtermProteinFull,THR:CtermProteinFull\n", "\n" ] } ], "source": [ "# restrict to baestype list\n", "resfile_type = ReadResfile('./data/multi-logic3.resfile')\n", "\n", "# 将TaskOperations加载至TaskFactory中\n", "pack_tf = TaskFactory()\n", "pack_tf.push_back(resfile_type)\n", "\n", "# 生成PackerTask\n", "packer_task = pack_tf.create_task_and_apply_taskoperations(pose)\n", "print(packer_task)" ] }, { "cell_type": "markdown", "id": "informational-bolivia", "metadata": {}, "source": [ "得到的结果: 1-5号设计为非极性氨酸,其余氨基酸设计为极性氨基酸。因为氨基酸范围指定优先级大于链范围指定。" ] }, { "cell_type": "code", "execution_count": null, "id": "bright-niagara", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.10" } }, "nbformat": 4, "nbformat_minor": 5 }