{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Rosetta Script in PyRosetta\n", "\n", "@Author:Jian Huang\n", "\n", "@E-mail: jian.huang@xtalpi.com" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 1. Introduction" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "在Rosetta软件开始发展的初期,其功能是以单个的应用提交给用户使用的,例如我们经常使用的socre,relax等。作为开发者和设计者,当然不会满足于这些没办法定制自己独有设计过程的程序。Rosetta为了适应用户的不同需求,给定用户更高的自由度——直接使用比应用更底层的功能进行组装,形成自己的“应用”或Protocol。因此Rosetta Script诞生了,后续简称RS。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "首先观察以下Rosetta Script的标准框架:\n", "\n", "```\n", "<ROSETTASCRIPTS>\n", " <SCOREFXNS>\n", " </SCOREFXNS>\n", " <RESIDUE_SELECTORS>\n", " </RESIDUE_SELECTORS>\n", " <TASKOPERATIONS>\n", " </TASKOPERATIONS>\n", " <SIMPLE_METRICS>\n", " </SIMPLE_METRICS>\n", " <FILTERS>\n", " </FILTERS>\n", " <MOVERS>\n", " </MOVERS>\n", " <PROTOCOLS>\n", " </PROTOCOLS>\n", " <OUTPUT />\n", "</ROSETTASCRIPTS>\n", "```\n", "\n", "在一个RS基本流程中,用户输入一个蛋白构象(Pose),通过定义的Movers改变构象,设置Filters对改变后的构象进行评估,然后输出符合条件的构象。RS流程允许用户将rosetta的底层movers(如Packer,Minimizer等)流程化地组合起来,形成自己独特的设计流程,极度强大。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 2. RS基础\n", "仔细观察RS框架,可以发现每一项都是通过```<name> </name>```申明的,例如FILTER,可想而知,我们就应该在两个<>中间写入必要的内容进行操作了,而介于两种<>之间的空隙可以用来写comment:\n", "\n", "```\n", "<ROSETTASCRIPTS>\n", " <SCOREFXNS>\n", " </SCOREFXNS>\n", " \n", " 这块地方介于两项<>的中间,可以用来写注释,不会被rosetta读入进去!\n", " \n", " <RESIDUE_SELECTORS>\n", " </RESIDUE_SELECTORS>\n", " <TASKOPERATIONS>\n", " </TASKOPERATIONS>\n", " <SIMPLE_METRICS>\n", " </SIMPLE_METRICS>\n", " <FILTERS>\n", " </FILTERS>\n", " <MOVERS>\n", " \n", " 写在这里的定义会被rosetta读入进去!\n", " \n", " </MOVERS>\n", " <PROTOCOLS>\n", " </PROTOCOLS>\n", " <OUTPUT />\n", "</ROSETTASCRIPTS>\n", "```\n", "\n", "注意,下面两种写法等同:称之为一个tag\n", "\n", "```\n", "# 1\n", "<SCOREFXNS>\n", "</SCOREFXNS>\n", "\n", "# 2\n", "<SCOREFXNS/>(内容写在tag中间)\n", "```\n", "\n", "一般为了注重可读性,会使用缩进或空格的方式,将内层的层级凸显出来。这仅仅是为了script的可读性。当一个tag中存在选项列表的时候,列表中各个值应该以逗号分隔,且不应该含有空格。\n", "\n", "```\n", "<PackRotamers name=\"pack1\" task_operations=\"task1,task2,task3\" /> #This is allowed\n", "<PackRotamers name=\"pack2\" task_operations=\"task2, task2, task3\" /> #This will be misinterpreted\n", "```\n", "\n", "在RS中的这些tag中,可以创建由tag决定类型的实例,例如在movers中,我们可以写很多种不同的mover,并且可以给他们取不同的名字:\n", "\n", "```\n", "<MOVERS> #In this section, movers are defined.\n", " # We assume that task1, task2, and task3 were defined and given these unique names prior to this point in the script.\n", "\n", "\n", " <PackRotamers name=\"pack1\" task_operations=\"task1,task2,task3\" />\n", "\n", "\n", " #因为pack1是PackRotamers的一个实例,在后面我们可以使用直接使用pack1进行操作\n", "</MOVERS>\n", "```\n", "\n", "通常而言,即便RS中什么也不写,只有框架RS,rosetta也会将input文件中缺失的氢原子和侧链进行repack排布缺失原子后输出。此外,rosetta还会将标准的rosetta打分表给在输出文件的末尾。\n", "\n", "\n", "\n", "\n", "\n", "```\n", "<ROSETTASCRIPTS>\n", " <SCOREFXNS>\n", " <ScoreFunction name=\"molmech\" weights=\"mm_std_fa_elec_dslf_fa13\" />\n", " <ScoreFunction name=\"r15_cart\" weights=\"ref2015\" >\n", " <Reweight scoretype=\"pro_close\" weight=\"0.0\" />\n", " <Reweight scoretype=\"cart_bonded\" weight=\"0.625\" />\n", " </ScoreFunction>\n", " </SCOREFXNS>\n", " <RESIDUE_SELECTORS>\n", " </RESIDUE_SELECTORS>\n", " <TASKOPERATIONS>\n", " </TASKOPERATIONS>\n", " <FILTERS>\n", " </FILTERS>\n", " <MOVERS>\n", " </MOVERS>\n", " <APPLY_TO_POSE>\n", " </APPLY_TO_POSE>\n", " <PROTOCOLS>\n", " </PROTOCOLS>\n", " <OUTPUT scorefxn=\"r15_cart\" />\n", "</ROSETTASCRIPTS>\n", "```\n", "\n", "上面RS示例中的`<ScoreFunction/>` tag中展示了如何定义不同的能量函数。在这里定义了两个能量函数,第一个能量函数在rosetta里面调用的名字叫mm_std_fa_elec_dslf_fa13,第二个能量函数就是我们常用的ref2015,但这里又对ref2015中的某些能量项进行权重的修改:pro_close设定为0和cart_bonded设定为0.625。(之前我们也介绍过如何使用patch 文件,修改rosetta中内置能量函数的权重)。\n", "注意,虽然molmech能量函数被定义,但是在后面的tag之中从来没有被使用。这在rosetta中也是允许的。修改权重后的ref2015能量函数,在这里被实例化成“r15_cart”,在`<output/>` tag中被调用,这也就是告诉rosetta去使用实例化后的r15_cart对pose进行打分的意思。\n", "<br/>" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 3.控制RS的输出" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "在RS框架中存在`<output/>` tag,这个tag可以允许我们像以前使用rosetta的flag文件一样,设置输出选项来控制输出;`<SCOREFXNS/>` tag允许我们使用其他的一些能量函数。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3.1 修改打分函数" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**场景一**\n", "\n", "```\n", "<ROSETTASCRIPTS>\n", " <SCOREFXNS>\n", " <ScoreFunction name=\"molmech\" weights=\"mm_std_fa_elec_dslf_fa13\" />\n", " <ScoreFunction name=\"r15_cart\" weights=\"ref2015\" >\n", " <Reweight scoretype=\"pro_close\" weight=\"0.0\" />\n", " <Reweight scoretype=\"cart_bonded\" weight=\"0.625\" />\n", " </ScoreFunction>\n", " </SCOREFXNS>\n", " <RESIDUE_SELECTORS>\n", " </RESIDUE_SELECTORS>\n", " <TASKOPERATIONS>\n", " </TASKOPERATIONS>\n", " <FILTERS>\n", " </FILTERS>\n", " <MOVERS>\n", " </MOVERS>\n", " <APPLY_TO_POSE>\n", " </APPLY_TO_POSE>\n", " <PROTOCOLS>\n", " </PROTOCOLS>\n", " <OUTPUT scorefxn=\"r15_cart\" />\n", "</ROSETTASCRIPTS>\n", "```\n", "\n", "<br/>\n", "上面RS示例中的`<ScoreFunction/>` tag中展示了如何定义不同的能量函数。在这里定义了两个能量函数,第一个能量函数在rosetta里面调用的名字叫mm_std_fa_elec_dslf_fa13,第二个能量函数就是我们常用的ref2015,但这里又对ref2015中的某些能量项进行权重的修改:pro_close设定为0和cart_bonded设定为0.625。(之前我们也介绍过如何使用patch 文件,修改rosetta中内置能量函数的权重)。\n", "注意,虽然molmech能量函数被定义,但是在后面的tag之中从来没有被使用。这在rosetta中也是允许的。修改权重后的ref2015能量函数,在这里被实例化成“r15_cart”,在`<output/>` tag中被调用,这也就是告诉rosetta去使用实例化后的r15_cart对pose进行打分的意思。" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "PyRosetta-4 2021 [Rosetta PyRosetta4.conda.mac.cxx11thread.serialization.python37.Release 2021.31+release.c7009b3115c22daa9efe2805d9d1ebba08426a54 2021-08-07T10:04:12] retrieved from: http://www.pyrosetta.org\n", "(C) Copyright Rosetta Commons Member Institutions. Created in JHU by Sergey Lyskov and PyRosetta Team.\n", "\u001b[0mcore.init: {0} \u001b[0mChecking for fconfig files in pwd and ./rosetta/flags\n", "\u001b[0mcore.init: {0} \u001b[0mRosetta version: PyRosetta4.conda.mac.cxx11thread.serialization.python37.Release r292 2021.31+release.c7009b3115c c7009b3115c22daa9efe2805d9d1ebba08426a54 http://www.pyrosetta.org 2021-08-07T10:04:12\n", "\u001b[0mcore.init: {0} \u001b[0mcommand: PyRosetta -ex1 -ex2aro -database /opt/miniconda3/lib/python3.7/site-packages/pyrosetta/database\n", "\u001b[0mbasic.random.init_random_generator: {0} \u001b[0m'RNG device' seed mode, using '/dev/urandom', seed=-53983269 seed_offset=0 real_seed=-53983269 thread_index=0\n", "\u001b[0mbasic.random.init_random_generator: {0} \u001b[0mRandomGenerator:init: Normal mode, seed=-53983269 RG_type=mt19937\n" ] } ], "source": [ "# 下面我们尝试使用pyrosetta将上面RS示例中的打分函数提取出来,并对结构进行打分你\n", "from pyrosetta.rosetta.protocols.rosetta_scripts import *\n", "from pyrosetta import *\n", "init()" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[0mcore.chemical.GlobalResidueTypeSet: {0} \u001b[0mFinished initializing fa_standard residue type set. Created 983 residue types\n", "\u001b[0mcore.chemical.GlobalResidueTypeSet: {0} \u001b[0mTotal time to initialize 0.685656 seconds.\n", "\u001b[0mcore.import_pose.import_pose: {0} \u001b[0mFile 'data/my_ab.pdb' automatically determined to be of type PDB\n", "\u001b[0mcore.conformation.Conformation: {0} \u001b[0mFound disulfide between residues 771 845\n", "\u001b[0mcore.conformation.Conformation: {0} \u001b[0mcurrent variant for 771 CYS\n", "\u001b[0mcore.conformation.Conformation: {0} \u001b[0mcurrent variant for 845 CYS\n", "\u001b[0mcore.conformation.Conformation: {0} \u001b[0mcurrent variant for 771 CYD\n", "\u001b[0mcore.conformation.Conformation: {0} \u001b[0mcurrent variant for 845 CYD\n", "\u001b[0mcore.conformation.Conformation: {0} \u001b[0mFound disulfide between residues 891 956\n", "\u001b[0mcore.conformation.Conformation: {0} \u001b[0mcurrent variant for 891 CYS\n", "\u001b[0mcore.conformation.Conformation: {0} \u001b[0mcurrent variant for 956 CYS\n", "\u001b[0mcore.conformation.Conformation: {0} \u001b[0mcurrent variant for 891 CYD\n", "\u001b[0mcore.conformation.Conformation: {0} \u001b[0mcurrent variant for 956 CYD\n" ] } ], "source": [ "pose = pose_from_pdb(\"data/my_ab.pdb\")\n", "original_pose = pose.clone()" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "# 通过Xmlobjects对象读取\n", "with open (\"data/Example-ScoreFunction.xml\", 'r') as f:\n", " xml_lines = f.read()" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0mGenerating XML Schema for rosetta_scripts...\n", "\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0m...done\n", "\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0mInitializing schema validator...\n", "\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0m...done\n", "\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0mValidating input script...\n", "\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0m...done\n", "\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0mParsed script:\n", "<ROSETTASCRIPTS>\n", "\t<SCOREFXNS>\n", "\t\t<ScoreFunction name=\"molmech\" weights=\"mm_std_fa_elec_dslf_fa13\"/>\n", "\t\t<ScoreFunction name=\"r15_cart\" weights=\"ref2015\">\n", "\t\t\t<Reweight scoretype=\"pro_close\" weight=\"0.0\"/>\n", "\t\t\t<Reweight scoretype=\"cart_bonded\" weight=\"0.625\"/>\n", "\t\t</ScoreFunction>\n", "\t</SCOREFXNS>\n", "\t<RESIDUE_SELECTORS/>\n", "\t<TASKOPERATIONS/>\n", "\t<FILTERS/>\n", "\t<MOVERS/>\n", "\t<APPLY_TO_POSE/>\n", "\t<PROTOCOLS/>\n", "\t<OUTPUT scorefxn=\"r15_cart\"/>\n", "</ROSETTASCRIPTS>\n", "\u001b[0mcore.scoring.ScoreFunctionFactory: {0} \u001b[0mSCOREFUNCTION: \u001b[32mref2015\u001b[0m\n", "\u001b[0mcore.scoring.etable: {0} \u001b[0mStarting energy table calculation\n", "\u001b[0mcore.scoring.etable: {0} \u001b[0msmooth_etable: changing atr/rep split to bottom of energy well\n", "\u001b[0mcore.scoring.etable: {0} \u001b[0msmooth_etable: spline smoothing lj etables (maxdis = 6)\n", "\u001b[0mcore.scoring.etable: {0} \u001b[0msmooth_etable: spline smoothing solvation etables (max_dis = 6)\n", "\u001b[0mcore.scoring.etable: {0} \u001b[0mFinished calculating energy tables.\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/hbonds/ref2015_params/HBPoly1D.csv\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/hbonds/ref2015_params/HBFadeIntervals.csv\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/hbonds/ref2015_params/HBEval.csv\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/hbonds/ref2015_params/DonStrength.csv\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/hbonds/ref2015_params/AccStrength.csv\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/rama/fd/all.ramaProb\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/rama/fd/prepro.ramaProb\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/omega/omega_ppdep.all.txt\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/omega/omega_ppdep.gly.txt\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/omega/omega_ppdep.pro.txt\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/omega/omega_ppdep.valile.txt\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/P_AA_pp/P_AA\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/P_AA_pp/P_AA_n\n", "\u001b[0mcore.scoring.P_AA: {0} \u001b[0mshapovalov_lib::shap_p_aa_pp_smooth_level of 1( aka low_smooth ) got activated.\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/P_AA_pp/shapovalov/10deg/kappa131/a20.prop\n", "\u001b[0mcore.scoring.etable: {0} \u001b[0mStarting energy table calculation\n", "\u001b[0mcore.scoring.etable: {0} \u001b[0msmooth_etable: changing atr/rep split to bottom of energy well\n", "\u001b[0mcore.scoring.etable: {0} \u001b[0msmooth_etable: spline smoothing lj etables (maxdis = 6)\n", "\u001b[0mcore.scoring.etable: {0} \u001b[0msmooth_etable: spline smoothing solvation etables (max_dis = 6)\n", "\u001b[0mcore.scoring.etable: {0} \u001b[0mFinished calculating energy tables.\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/PairEPotential/pdb_pair_stats_fine\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/InterchainPotential/interchain_env_log.txt\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/InterchainPotential/interchain_pair_log.txt\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/EnvPairPotential/env_log.txt\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/EnvPairPotential/cbeta_den.txt\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/EnvPairPotential/pair_log.txt\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/EnvPairPotential/cenpack_log.txt\n", "\u001b[0mcore.scoring.ramachandran: {0} \u001b[0mshapovalov_lib::shap_rama_smooth_level of 4( aka highest_smooth ) got activated.\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/rama/shapovalov/kappa25/all.ramaProb\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/rama/flat/avg_L_rama.dat\n", "\u001b[0mcore.scoring.ramachandran: {0} \u001b[0mReading custom Ramachandran table from scoring/score_functions/rama/flat/avg_L_rama.dat.\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/rama/flat/sym_all_rama.dat\n", "\u001b[0mcore.scoring.ramachandran: {0} \u001b[0mReading custom Ramachandran table from scoring/score_functions/rama/flat/sym_all_rama.dat.\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/rama/flat/sym_G_rama.dat\n", "\u001b[0mcore.scoring.ramachandran: {0} \u001b[0mReading custom Ramachandran table from scoring/score_functions/rama/flat/sym_G_rama.dat.\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/rama/flat/sym_P_rama.dat\n", "\u001b[0mcore.scoring.ramachandran: {0} \u001b[0mReading custom Ramachandran table from scoring/score_functions/rama/flat/sym_P_rama.dat.\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/rama/flat/avg_L_rama_str.dat\n", "\u001b[0mcore.scoring.ramachandran: {0} \u001b[0mReading custom Ramachandran table from scoring/score_functions/rama/flat/avg_L_rama_str.dat.\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/rama/flat/sym_all_rama_str.dat\n", "\u001b[0mcore.scoring.ramachandran: {0} \u001b[0mReading custom Ramachandran table from scoring/score_functions/rama/flat/sym_all_rama_str.dat.\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/rama/flat/sym_G_rama_str.dat\n", "\u001b[0mcore.scoring.ramachandran: {0} \u001b[0mReading custom Ramachandran table from scoring/score_functions/rama/flat/sym_G_rama_str.dat.\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/rama/flat/sym_P_rama_str.dat\n", "\u001b[0mcore.scoring.ramachandran: {0} \u001b[0mReading custom Ramachandran table from scoring/score_functions/rama/flat/sym_P_rama_str.dat.\n", "\u001b[0mcore.mm.MMLJLibrary: {0} \u001b[0mMM lj sets added: 105\n", "\u001b[0mcore.mm.MMTorsionLibrary: {0} \u001b[0mMM torsion sets added fully assigned: 1039; wildcard: 48 and 1 virtual parameter.\n", "\u001b[0mprotocols.jd2.parser.ScoreFunctionLoader: {0} \u001b[0mdefined score function \"molmech\" with weights \"mm_std_fa_elec_dslf_fa13\"\n", "\u001b[0mprotocols.jd2.parser.ScoreFunctionLoader: {0} \u001b[0mdefined score function \"r15_cart\" with weights \"ref2015\"\n", "\u001b[0mprotocols.jd2.parser.ScoreFunctionLoader: {0} \u001b[0m setting r15_cart weight pro_close to 0\n", "\u001b[0mprotocols.jd2.parser.ScoreFunctionLoader: {0} \u001b[0m setting r15_cart weight cart_bonded to 0.625\n", "\u001b[0mcore.energy_methods.CartesianBondedEnergy: {0} \u001b[0mInitializing IdealParametersDatabase with default Ks=300 , 80 , 80 , 10 , 80\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/bondlength_bondangle/default-lengths.txt\n", "\u001b[0mcore.energy_methods.CartesianBondedEnergy: {0} \u001b[0mRead 759 bb-independent lengths.\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/bondlength_bondangle/default-angles.txt\n", "\u001b[0mcore.energy_methods.CartesianBondedEnergy: {0} \u001b[0mRead 1434 bb-independent angles.\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/bondlength_bondangle/default-torsions.txt\n", "\u001b[0mcore.energy_methods.CartesianBondedEnergy: {0} \u001b[0mRead 1 bb-independent torsions.\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/bondlength_bondangle/default-improper.txt\n", "\u001b[0mcore.energy_methods.CartesianBondedEnergy: {0} \u001b[0mRead 529 bb-independent improper tors.\n", "\u001b[0mprotocols.rosetta_scripts.ParsedProtocol: {0} \u001b[0mParsedProtocol mover with the following settings\n" ] } ], "source": [ "# Pyrosetta直接从string读入会花费较长时间,效率并不高\n", "xml = XmlObjects.create_from_string(xml_lines)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "<class 'pyrosetta.rosetta.core.scoring.ScoreFunction'>\n", "<class 'pyrosetta.rosetta.core.scoring.ScoreFunction'>\n" ] } ], "source": [ "# 以RS定义的名字,获取对应的能量函数对象\n", "molmech_scorefunc = xml.get_score_function(\"molmech\")\n", "r15_cart = xml.get_score_function(\"r15_cart\")\n", "\n", "# 查看类型\n", "print(type(molmech_scorefunc))\n", "print(type(r15_cart))" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[0mcore.energy_methods.CartesianBondedEnergy: {0} \u001b[0mCreating new peptide-bonded energy container (975)\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/elec_cp_reps.dat\n", "\u001b[0mcore.scoring.elec.util: {0} \u001b[0mRead 40 countpair representative atoms\n", "\u001b[0mcore.pack.dunbrack.RotamerLibrary: {0} \u001b[0mshapovalov_lib_fixes_enable option is true.\n", "\u001b[0mcore.pack.dunbrack.RotamerLibrary: {0} \u001b[0mshapovalov_lib::shap_dun10_smooth_level of 1( aka lowest_smooth ) got activated.\n", "\u001b[0mcore.pack.dunbrack.RotamerLibrary: {0} \u001b[0mBinary rotamer library selected: /opt/miniconda3/lib/python3.7/site-packages/pyrosetta/database/rotamer/shapovalov/StpDwn_0-0-0/Dunbrack10.lib.bin\n", "\u001b[0mcore.pack.dunbrack.RotamerLibrary: {0} \u001b[0mUsing Dunbrack library binary file '/opt/miniconda3/lib/python3.7/site-packages/pyrosetta/database/rotamer/shapovalov/StpDwn_0-0-0/Dunbrack10.lib.bin'.\n", "\u001b[0mcore.pack.dunbrack.RotamerLibrary: {0} \u001b[0mDunbrack 2010 library took 0.16608 seconds to load from binary\n" ] }, { "data": { "text/plain": [ "16135.774462142173" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# 该能量函数对象与前面章节中介绍的能量函数对象别无二致\n", "# 对pose进行打分:\n", "r15_cart(pose)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[0mcore.scoring.ScoreFunction: {0} \u001b[0m\n", "------------------------------------------------------------\n", " Scores Weight Raw Score Wghtd.Score\n", "------------------------------------------------------------\n", " fa_atr 1.000 -5695.317 -5695.317\n", " fa_rep 0.550 24719.870 13595.929\n", " fa_sol 1.000 3501.170 3501.170\n", " fa_intra_rep 0.005 2661.048 13.305\n", " fa_intra_sol_xover4 1.000 175.028 175.028\n", " lk_ball_wtd 1.000 -57.980 -57.980\n", " fa_elec 1.000 -1588.297 -1588.297\n", " hbond_sr_bb 1.000 -74.000 -74.000\n", " hbond_lr_bb 1.000 -511.787 -511.787\n", " hbond_bb_sc 1.000 -130.727 -130.727\n", " hbond_sc 1.000 -78.766 -78.766\n", " dslf_fa13 1.250 -1.572 -1.965\n", " omega 0.400 474.068 189.627\n", " fa_dun 0.700 5380.790 3766.553\n", " p_aa_pp 0.600 -289.096 -173.458\n", " yhh_planarity 0.625 0.000 0.000\n", " ref 1.000 372.765 372.765\n", " rama_prepro 0.450 191.068 85.981\n", " cart_bonded 0.625 4396.342 2747.714\n", "---------------------------------------------------\n", " Total weighted score: 16135.774\n" ] } ], "source": [ "# 查看打分的详细信息\n", "r15_cart.show(pose)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3.2 改变构象:RS中定义的Mover" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Movers tag是RS的核心部分,Movers指rosetta中一切可以改变构象的操作,包括改变原子坐标、Foldtree、constraint、序列、共价连接性质等。注意也有一些movers不改变原子坐标。\n", "\n", "\n", "movers的记录可以参考:https://www.rosettacommons.org/docs/latest/scripting_documentation/RosettaScripts/Movers/Movers-RosettaScripts\n", "\n", "\n", "我们以MinMover为例(https://www.rosettacommons.org/docs/latest/scripting_documentation/RosettaScripts/Movers/movers_pages/MinMover)\n", "\n", "该网址中给出了MinMover的定义范本,如下:\n", "```\n", "<MinMover name=\"(&string;)\" jump=\"(&string;)\"\n", " abs_score_convergence_threshold=\"(ℜ)\"\n", " max_iter=\"(200 &non_negative_integer;)\"\n", " type=\"(lbfgs_armijo_nonmonotone &minimizer_type;)\"\n", " tolerance=\"(0.01 ℜ)\" cartesian=\"(false &bool;)\"\n", " bondangle=\"(0 &bool;)\" bondlength=\"(0 &bool;)\" chi=\"(&bool;)\"\n", " bb=\"(&bool;)\" omega=\"(true &bool;)\"\n", " bb_task_operations=\"(&task_operation_comma_separated_list;)\"\n", " chi_task_operations=\"(&task_operation_comma_separated_list;)\"\n", " bondangle_task_operations=\"(&task_operation_comma_separated_list;)\"\n", " bondlength_task_operations=\"(&task_operation_comma_separated_list;)\"\n", " movemap_factory=\"(&string;)\" scorefxn=\"(&string;)\" >\n", " <MoveMap name=\"(&string;)\" bb=\"(&bool;)\" chi=\"(&bool;)\" jump=\"(&bool;)\" >\n", " <Jump number=\"(&non_negative_integer;)\" setting=\"(&bool;)\" />\n", " <Chain number=\"(&non_negative_integer;)\" chi=\"(&bool;)\" bb=\"(&bool;)\" />\n", " <Span begin=\"(&non_negative_integer;)\" end=\"(&non_negative_integer;)\"\n", " chi=\"(&bool;)\" bb=\"(&bool;)\" bondangle=\"(&bool;)\" bondlength=\"(&bool;)\" />\n", " <ResidueSelector selector=\"(&string;)\" chi=\"(&bool;)\" bb=\"(&bool;)\"\n", " bondangle=\"(&bool;)\" bondlength=\"(&bool;)\" />\n", " </MoveMap>\n", "</MinMover>\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "该MinMover的作用是用来进行侧链或骨架的能量最小化过程(调整二面角)。可以发现在MinMover中亚标签`<MoveMap> </MoveMap>`中可以让我们定义minimization的自由度,如chi,bb等参数设置。一个MinMover中有很多的选项让我们可以精确控制该Mover的行为,这也是RS中强大的一点。注意,大多数选项平常应用的时候本身都存在默认值,不需要修改,除非作为用户我们有修改的理由或明白我们在进行什么操作。\n", "\n", "\n", "在写mover的时候,用户需要给每个mover都写上一个独特的名字(name关键字),作为其实例化的名字。比如,我可以使用MinMover(相当于一个MinMover类),可以实例化各种不同名字的MinMover,并改变他们的属性,有些我可能想固定BB,有些我想固定CHI等等...可以发现这种类似于类实例化的方法,让我们有更高的自由度去操纵MinMover(其他Mover也是类似的)。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**场景二**\n", "\n", "在上面定义了molchem和r15_cart能量函数的基础上,希望定义两个MinMover:第一个命名为min_torsion,其中设定cartesian选项False,将使用默认的扭转角minimization的方法和定义的molchem能量函数;第二个命名为min_cart,需要将cartesian选项打开,并使用r15_cart能量函数。两种Mover都允许BB和CHI的改变。\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<br/>\n", "在基于场景一的RS上,我们需要在Movers tag中写入一下内容,进行场景二中的MinMover实例化。\n", "\n", "**注意**:\n", "当我们定义了Mover,实际调用使用的时候需要在```<PROTOCOLS></PROTOCOLS>```中申明才可以,以下例子中虽然定义了两个MinMover,但是在Protocol曾中仅使用了名为min_cart的mover。\n", "\n", "```\n", " <MOVERS> # 实例化两个MinMover,并改变其中的选项(属性)\n", " <MinMover name=\"min_torsion\" scorefxn=\"molmech\" chi=\"true\" bb=\"1\" cartesian=\"F\" >\n", " </MinMover>\n", " <MinMover name=\"min_cart\" scorefxn=\"r15_cart\" chi=\"true\" bb=\"1\" cartesian=\"T\" >\n", " </MinMover>\n", " </MOVERS>\n", " \n", " ......\n", " \n", " <PROTOCOLS>\n", " <Add mover=\"min_cart\" /> \n", " </PROTOCOLS>\n", " <OUTPUT scorefxn=\"r15_cart\" />\n", "```\n", "\n" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0mGenerating XML Schema for rosetta_scripts...\n", "\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0m...done\n", "\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0mInitializing schema validator...\n", "\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0m...done\n", "\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0mValidating input script...\n", "\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0m...done\n", "\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0mParsed script:\n", "<ROSETTASCRIPTS>\n", "\t<SCOREFXNS>\n", "\t\t<ScoreFunction name=\"molmech\" weights=\"mm_std_fa_elec_dslf_fa13\"/>\n", "\t\t<ScoreFunction name=\"r15_cart\" weights=\"ref2015\">\n", "\t\t\t<Reweight scoretype=\"pro_close\" weight=\"0.0\"/>\n", "\t\t\t<Reweight scoretype=\"cart_bonded\" weight=\"0.625\"/>\n", "\t\t</ScoreFunction>\n", "\t</SCOREFXNS>\n", "\t<RESIDUE_SELECTORS/>\n", "\t<TASKOPERATIONS/>\n", "\t<FILTERS/>\n", "\t<MOVERS>\n", "\t\t<MinMover bb=\"1\" cartesian=\"F\" chi=\"true\" name=\"min_torsion\" scorefxn=\"molmech\"/>\n", "\t\t<MinMover bb=\"1\" cartesian=\"T\" chi=\"true\" name=\"min_cart\" scorefxn=\"r15_cart\"/>\n", "\t</MOVERS>\n", "\t<APPLY_TO_POSE/>\n", "\t<PROTOCOLS>\n", "\t\t<Add mover=\"min_cart\"/>\n", "\t</PROTOCOLS>\n", "\t<OUTPUT scorefxn=\"r15_cart\"/>\n", "</ROSETTASCRIPTS>\n", "\u001b[0mcore.scoring.ScoreFunctionFactory: {0} \u001b[0mSCOREFUNCTION: \u001b[32mref2015\u001b[0m\n", "\u001b[0mprotocols.jd2.parser.ScoreFunctionLoader: {0} \u001b[0mdefined score function \"molmech\" with weights \"mm_std_fa_elec_dslf_fa13\"\n", "\u001b[0mprotocols.jd2.parser.ScoreFunctionLoader: {0} \u001b[0mdefined score function \"r15_cart\" with weights \"ref2015\"\n", "\u001b[0mprotocols.jd2.parser.ScoreFunctionLoader: {0} \u001b[0m setting r15_cart weight pro_close to 0\n", "\u001b[0mprotocols.jd2.parser.ScoreFunctionLoader: {0} \u001b[0m setting r15_cart weight cart_bonded to 0.625\n", "\u001b[0mprotocols.minimization_packing.MinMover: {0} \u001b[0mOptions chi, bb: 1, 1 omega: 1\n", "\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0mDefined mover named \"min_torsion\" of type MinMover\n", "\u001b[0mprotocols.minimization_packing.MinMover: {0} \u001b[0mOptions chi, bb: 1, 1 omega: 1\n", "\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0mDefined mover named \"min_cart\" of type MinMover\n", "\u001b[0mprotocols.rosetta_scripts.ParsedProtocol: {0} \u001b[0mParsedProtocol mover with the following settings\n", "\u001b[0mprotocols.rosetta_scripts.ParsedProtocol: {0} \u001b[0mAdded mover \"min_cart\"\n" ] } ], "source": [ "# 通过Xmlobjects对象读取\n", "with open (\"data/Example2-MinMover.xml\", 'r') as f2:\n", " xml2_lines = f2.read()\n", " \n", "xml2 = XmlObjects.create_from_string(xml2_lines)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[0mprotocols.rosetta_scripts.ParsedProtocol: {0} \u001b[0m=======================BEGIN MOVER MinMover - min_cart=======================\n", "\u001b[0mcore.pose.util: {0} \u001b[0m\u001b[1m[ WARNING ]\u001b[0m Unable to find atom_tree atom for this Rosetta branch connection angle: residue 771 BRANCH 1\n", "\u001b[0mcore.pose.util: {0} \u001b[0m\u001b[1m[ WARNING ]\u001b[0m Unable to find atom_tree atom for this Rosetta branch connection angle: residue 845 BRANCH 1\n", "\u001b[0mcore.pose.util: {0} \u001b[0m\u001b[1m[ WARNING ]\u001b[0m Unable to find atom_tree atom for this Rosetta branch connection angle: residue 891 BRANCH 1\n", "\u001b[0mcore.pose.util: {0} \u001b[0m\u001b[1m[ WARNING ]\u001b[0m Unable to find atom_tree atom for this Rosetta branch connection angle: residue 956 BRANCH 1\n", "\u001b[0mcore.optimization.Minimizer: {0} \u001b[0m\u001b[1m[ WARNING ]\u001b[0m LBFGS MAX CYCLES 200 EXCEEDED, BUT FUNC NOT CONVERGED!\n", "\u001b[0mprotocols.rosetta_scripts.ParsedProtocol: {0} \u001b[0msetting status to success\n" ] }, { "data": { "text/plain": [ "True" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# 场景二中使用的RS其实就已经是一个user-defined protocol了\n", "# 我们可以使用以下流程apply到pose上:\n", "\n", "# 拷贝一份原始pose到pose2,不操作原有pose\n", "pose2 = pose.clone()\n", "\n", "protocol = xml2.get_mover(\"ParsedProtocol\")\n", "\n", "protocol.apply(pose2)\n", "\n", "# followed by standard pose output procedure\n", "# pose2.dump_pdb(\"file_name.pdb\") with a user-defined file_name\n", "pose2.dump_pdb(\"./data/my_lab_test.pdb\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3.3 练习" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "在场景二中定义了两个MinMover,请尝试修改RS文件,让我们可以先进行min_torsion的MinMover(使用molchem能量函数,cartesian=False),再进行min_cart的MinMover(使用r15_cart能量函数,允许cartesian)。注意,到PROTOCOLS层中的任务是按照次序依次进行,每一步的Mover都是从上一步Mover操作后的构象接着操作的。\n", "\n", "1. 请写出新的xml格式的RS文件,并将该protocol应用到my_lab.pdb的pose中;\n", "2. 保存自定义RS运行结束后输出的构象PDB文件,并与原文件对比;\n", "3. 观察pyrosetta的输出,你得到了什么信息?\n", "4. 修改```<OUTPUT></OUTPUT>```标签的scorefunction的名字(根据定义的scorefunction),重复上述过程,会发生什么变化?最终构象会改变吗?\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.10" } }, "nbformat": 4, "nbformat_minor": 4 }