{ "cells": [ { "cell_type": "markdown", "id": "4d50c9fa-0422-4c70-b5c5-49a90edbffe7", "metadata": {}, "source": [ "# Basic Usage of XmlObjects\n", "\n", "@Author: 吴炜坤\n", "\n", "@E-mail: weikun.wu@xtalpi.com" ] }, { "cell_type": "markdown", "id": "71494982-1f97-412d-87ed-3a7150d91cb3", "metadata": {}, "source": [ "### 1. Why XmlObjects?\n", "In early versions of Rosetta, many movers, filters, and other components did not expose their parameters through the Python API. Such components could not be configured directly from Python; their parameters had to be read from an XML script, which then built the corresponding objects.\n", "\n", "XmlObjects is the most direct way to use an XML script inside PyRosetta, and it is especially useful when a Mover lacks a proper PyRosetta interface. The drawback is that loading XmlObjects is slow: startup takes longer than for a pure PyRosetta script." ] }, { "cell_type": "markdown", "id": "06da9647-eda0-4cbc-80a5-fb0ce6704f13", "metadata": {}, "source": [ "### 2. How to use XmlObjects?\n", "The two most convenient ways to create an XmlObjects instance are:\n", "1. create_from_string\n", "2. create_from_file" ] }, { "cell_type": "markdown", "id": "647d98fd-e38e-4c68-8843-8cd41ec39f26", "metadata": {}, "source": [ "#### 2.1 create_from_string\n", "Parses a RosettaScripts XML string and returns an XmlObjects instance." ] }, { "cell_type": "code", "execution_count": 3, "id": "0743b33d-8a9b-441a-9d3b-db8f052fe7cc", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "PyRosetta-4 2021 [Rosetta PyRosetta4.conda.mac.cxx11thread.serialization.python37.Release 2021.26+release.b308454c455dd04f6824cc8b23e54bbb9be2cdd7 2021-07-02T13:01:54] retrieved from: http://www.pyrosetta.org\n", "(C) Copyright Rosetta Commons Member Institutions. 
Created in JHU by Sergey Lyskov and PyRosetta Team.\n", "\u001b[0mcore.init: {0} \u001b[0mChecking for fconfig files in pwd and ./rosetta/flags\n", "\u001b[0mcore.init: {0} \u001b[0mRosetta version: PyRosetta4.conda.mac.cxx11thread.serialization.python37.Release r288 2021.26+release.b308454c455 b308454c455dd04f6824cc8b23e54bbb9be2cdd7 http://www.pyrosetta.org 2021-07-02T13:01:54\n", "\u001b[0mcore.init: {0} \u001b[0mcommand: PyRosetta -ex1 -ex2aro -database /opt/miniconda3/lib/python3.7/site-packages/pyrosetta/database\n", "\u001b[0mbasic.random.init_random_generator: {0} \u001b[0m'RNG device' seed mode, using '/dev/urandom', seed=-2025690693 seed_offset=0 real_seed=-2025690693 thread_index=0\n", "\u001b[0mbasic.random.init_random_generator: {0} \u001b[0mRandomGenerator:init: Normal mode, seed=-2025690693 RG_type=mt19937\n", "\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0mGenerating XML Schema for rosetta_scripts...\n", "\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0m...done\n", "\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0mInitializing schema validator...\n", "\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0m...done\n", "\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0mValidating input script...\n", "\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0m...done\n", "\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0mParsed script:\n", "<ROSETTASCRIPTS>\n", "\t<SCOREFXNS>\n", "\t\t<ScoreFunction name=\"sfxn1\" weights=\"ref2015\"/>\n", "\t</SCOREFXNS>\n", "\t<RESIDUE_SELECTORS/>\n", "\t<TASKOPERATIONS/>\n", "\t<FILTERS/>\n", "\t<MOVERS>\n", "\t\t<SetTorsion name=\"setTorsion\">\n", "\t\t\t<Torsion angle=\"perturb\" perturbation_magnitude=\"1.0\" perturbation_type=\"gaussian\" residue=\"1,2,3,4,5\" torsion_name=\"rama\"/>\n", "\t\t</SetTorsion>\n", "\t</MOVERS>\n", "\t<APPLY_TO_POSE/>\n", "\t<PROTOCOLS/>\n", 
"\t<OUTPUT/>\n", "</ROSETTASCRIPTS>\n", "\u001b[0mcore.scoring.ScoreFunctionFactory: {0} \u001b[0mSCOREFUNCTION: \u001b[32mref2015\u001b[0m\n", "\u001b[0mcore.scoring.etable: {0} \u001b[0mStarting energy table calculation\n", "\u001b[0mcore.scoring.etable: {0} \u001b[0msmooth_etable: changing atr/rep split to bottom of energy well\n", "\u001b[0mcore.scoring.etable: {0} \u001b[0msmooth_etable: spline smoothing lj etables (maxdis = 6)\n", "\u001b[0mcore.scoring.etable: {0} \u001b[0msmooth_etable: spline smoothing solvation etables (max_dis = 6)\n", "\u001b[0mcore.scoring.etable: {0} \u001b[0mFinished calculating energy tables.\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/hbonds/ref2015_params/HBPoly1D.csv\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/hbonds/ref2015_params/HBFadeIntervals.csv\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/hbonds/ref2015_params/HBEval.csv\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/hbonds/ref2015_params/DonStrength.csv\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/hbonds/ref2015_params/AccStrength.csv\n", "\u001b[0mcore.chemical.GlobalResidueTypeSet: {0} \u001b[0mFinished initializing fa_standard residue type set. 
Created 984 residue types\n", "\u001b[0mcore.chemical.GlobalResidueTypeSet: {0} \u001b[0mTotal time to initialize 0.682139 seconds.\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/rama/fd/all.ramaProb\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/rama/fd/prepro.ramaProb\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/omega/omega_ppdep.all.txt\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/omega/omega_ppdep.gly.txt\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/omega/omega_ppdep.pro.txt\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/omega/omega_ppdep.valile.txt\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/P_AA_pp/P_AA\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/P_AA_pp/P_AA_n\n", "\u001b[0mcore.scoring.P_AA: {0} \u001b[0mshapovalov_lib::shap_p_aa_pp_smooth_level of 1( aka low_smooth ) got activated.\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/P_AA_pp/shapovalov/10deg/kappa131/a20.prop\n", "\u001b[0mcore.scoring.etable: {0} \u001b[0mStarting energy table calculation\n", "\u001b[0mcore.scoring.etable: {0} \u001b[0msmooth_etable: changing atr/rep split to bottom of energy well\n", "\u001b[0mcore.scoring.etable: {0} \u001b[0msmooth_etable: spline smoothing lj etables (maxdis = 6)\n", "\u001b[0mcore.scoring.etable: {0} \u001b[0msmooth_etable: spline smoothing solvation etables (max_dis = 6)\n", "\u001b[0mcore.scoring.etable: {0} \u001b[0mFinished calculating energy tables.\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/PairEPotential/pdb_pair_stats_fine\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: 
scoring/score_functions/InterchainPotential/interchain_env_log.txt\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/InterchainPotential/interchain_pair_log.txt\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/EnvPairPotential/env_log.txt\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/EnvPairPotential/cbeta_den.txt\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/EnvPairPotential/pair_log.txt\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/EnvPairPotential/cenpack_log.txt\n", "\u001b[0mcore.scoring.ramachandran: {0} \u001b[0mshapovalov_lib::shap_rama_smooth_level of 4( aka highest_smooth ) got activated.\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/rama/shapovalov/kappa25/all.ramaProb\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/rama/flat/avg_L_rama.dat\n", "\u001b[0mcore.scoring.ramachandran: {0} \u001b[0mReading custom Ramachandran table from scoring/score_functions/rama/flat/avg_L_rama.dat.\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/rama/flat/sym_all_rama.dat\n", "\u001b[0mcore.scoring.ramachandran: {0} \u001b[0mReading custom Ramachandran table from scoring/score_functions/rama/flat/sym_all_rama.dat.\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/rama/flat/sym_G_rama.dat\n", "\u001b[0mcore.scoring.ramachandran: {0} \u001b[0mReading custom Ramachandran table from scoring/score_functions/rama/flat/sym_G_rama.dat.\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/rama/flat/sym_P_rama.dat\n", "\u001b[0mcore.scoring.ramachandran: {0} \u001b[0mReading custom Ramachandran table from 
scoring/score_functions/rama/flat/sym_P_rama.dat.\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/rama/flat/avg_L_rama_str.dat\n", "\u001b[0mcore.scoring.ramachandran: {0} \u001b[0mReading custom Ramachandran table from scoring/score_functions/rama/flat/avg_L_rama_str.dat.\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/rama/flat/sym_all_rama_str.dat\n", "\u001b[0mcore.scoring.ramachandran: {0} \u001b[0mReading custom Ramachandran table from scoring/score_functions/rama/flat/sym_all_rama_str.dat.\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/rama/flat/sym_G_rama_str.dat\n", "\u001b[0mcore.scoring.ramachandran: {0} \u001b[0mReading custom Ramachandran table from scoring/score_functions/rama/flat/sym_G_rama_str.dat.\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/rama/flat/sym_P_rama_str.dat\n", "\u001b[0mcore.scoring.ramachandran: {0} \u001b[0mReading custom Ramachandran table from scoring/score_functions/rama/flat/sym_P_rama_str.dat.\n", "\u001b[0mprotocols.jd2.parser.ScoreFunctionLoader: {0} \u001b[0mdefined score function \"sfxn1\" with weights \"ref2015\"\n", "\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0mDefined mover named \"setTorsion\" of type SetTorsion\n", "\u001b[0mprotocols.rosetta_scripts.ParsedProtocol: {0} \u001b[0mParsedProtocol mover with the following settings\n" ] } ], "source": [ "from pyrosetta.rosetta.protocols import rosetta_scripts \n", "from pyrosetta import init\n", "\n", "# Initialize PyRosetta before creating any objects:\n", "init()\n", "xml = rosetta_scripts.XmlObjects.create_from_string('''\n", "<SCOREFXNS>\n", " <ScoreFunction name=\"sfxn1\" weights=\"ref2015\"/>\n", "</SCOREFXNS>\n", "\n", "<RESIDUE_SELECTORS>\n", "</RESIDUE_SELECTORS>\n", "\n", "<TASKOPERATIONS>\n", "</TASKOPERATIONS>\n", "\n", "<FILTERS>\n", "</FILTERS>\n", "\n", "<MOVERS> \n", "\t<SetTorsion 
name=\"setTorsion\"> \n", "\t\t<Torsion residue=\"1,2,3,4,5\" torsion_name=\"rama\" angle=\"perturb\" perturbation_type=\"gaussian\" perturbation_magnitude=\"1.0\" /> \n", "\t</SetTorsion> \n", "</MOVERS>\n", "\n", "<APPLY_TO_POSE>\n", "</APPLY_TO_POSE>\n", "\n", "<PROTOCOLS>\n", "</PROTOCOLS>\n", "\n", "<OUTPUT />\n", "''')" ] }, { "cell_type": "markdown", "id": "837479ff-3402-4018-b646-0b94dcf4902e", "metadata": {}, "source": [ "**Syntax for retrieving Filters, Movers, Selectors, SimpleMetrics, and TaskOperations**" ] }, { "cell_type": "code", "execution_count": 9, "id": "5f7fa7a8-13fb-4286-99f9-848277443082", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "<pyrosetta.rosetta.protocols.simple_moves.SetTorsion at 0x7fc5f0380bf0>" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Retrieve a mover by the name given in the XML:\n", "mover_ = xml.get_mover(\"setTorsion\")\n", "mover_\n", "\n", "# The other object types follow the same pattern:\n", "# filter_ = xml.get_filter(name)\n", "# selector_ = xml.get_residue_selector(name)\n", "# score_ = xml.get_score_function(name)\n", "# sm_ = xml.get_simple_metric(name)\n", "# tf_ = xml.get_task_operation(name)" ] }, { "cell_type": "code", "execution_count": 10, "id": "ea7c9597-8eaf-4964-9404-f3f5001cc26c", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "vector1_std_string[ParsedProtocol, null, setTorsion]" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# List the names of all available objects of one type (here, movers):\n", "xml.list_movers()\n", "\n", "# The other object types follow the same pattern:\n", "# xml.list_filters()\n", "# xml.list_residue_selectors()\n", "# xml.list_score_functions()\n", "# xml.list_simple_metrics()\n", "# xml.list_task_operations()" ] }, { "cell_type": "markdown", "id": "707259f6-6bef-416f-9fdf-20ea673f425a", "metadata": {}, "source": [ "#### 2.2 create_from_file\n", "Parses a RosettaScripts XML file and returns an XmlObjects instance." ] }, { "cell_type": "code", "execution_count": 11, "id": "b4e9c8e4-7d98-4a65-94a0-6ff971cfd88e", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ 
"\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0mGenerating XML Schema for rosetta_scripts...\n", "\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0m...done\n", "\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0mInitializing schema validator...\n", "\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0m...done\n", "\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0mValidating input script...\n", "\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0m...done\n", "\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0mParsed script:\n", "<ROSETTASCRIPTS>\n", "\t<SCOREFXNS>\n", "\t\t<ScoreFunction name=\"molmech\" weights=\"mm_std_fa_elec_dslf_fa13\"/>\n", "\t\t<ScoreFunction name=\"r15_cart\" weights=\"ref2015\">\n", "\t\t\t<Reweight scoretype=\"pro_close\" weight=\"0.0\"/>\n", "\t\t\t<Reweight scoretype=\"cart_bonded\" weight=\"0.625\"/>\n", "\t\t</ScoreFunction>\n", "\t</SCOREFXNS>\n", "\t<RESIDUE_SELECTORS/>\n", "\t<TASKOPERATIONS/>\n", "\t<FILTERS/>\n", "\t<MOVERS>\n", "\t\t<MinMover bb=\"1\" cartesian=\"F\" chi=\"true\" name=\"min_torsion\" scorefxn=\"molmech\"/>\n", "\t\t<MinMover bb=\"1\" cartesian=\"T\" chi=\"true\" name=\"min_cart\" scorefxn=\"r15_cart\"/>\n", "\t</MOVERS>\n", "\t<APPLY_TO_POSE/>\n", "\t<PROTOCOLS>\n", "\t\t<Add mover=\"min_cart\"/>\n", "\t</PROTOCOLS>\n", "\t<OUTPUT scorefxn=\"r15_cart\"/>\n", "</ROSETTASCRIPTS>\n", "\u001b[0mcore.scoring.ScoreFunctionFactory: {0} \u001b[0mSCOREFUNCTION: \u001b[32mref2015\u001b[0m\n", "\u001b[0mcore.mm.MMLJLibrary: {0} \u001b[0mMM lj sets added: 105\n", "\u001b[0mcore.mm.MMTorsionLibrary: {0} \u001b[0mMM torsion sets added fully assigned: 1039; wildcard: 48 and 1 virtual parameter.\n", "\u001b[0mprotocols.jd2.parser.ScoreFunctionLoader: {0} \u001b[0mdefined score function \"molmech\" with weights \"mm_std_fa_elec_dslf_fa13\"\n", "\u001b[0mprotocols.jd2.parser.ScoreFunctionLoader: 
{0} \u001b[0mdefined score function \"r15_cart\" with weights \"ref2015\"\n", "\u001b[0mprotocols.jd2.parser.ScoreFunctionLoader: {0} \u001b[0m setting r15_cart weight pro_close to 0\n", "\u001b[0mprotocols.jd2.parser.ScoreFunctionLoader: {0} \u001b[0m setting r15_cart weight cart_bonded to 0.625\n", "\u001b[0mcore.energy_methods.CartesianBondedEnergy: {0} \u001b[0mInitializing IdealParametersDatabase with default Ks=300 , 80 , 80 , 10 , 80\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/bondlength_bondangle/default-lengths.txt\n", "\u001b[0mcore.energy_methods.CartesianBondedEnergy: {0} \u001b[0mRead 759 bb-independent lengths.\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/bondlength_bondangle/default-angles.txt\n", "\u001b[0mcore.energy_methods.CartesianBondedEnergy: {0} \u001b[0mRead 1434 bb-independent angles.\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/bondlength_bondangle/default-torsions.txt\n", "\u001b[0mcore.energy_methods.CartesianBondedEnergy: {0} \u001b[0mRead 1 bb-independent torsions.\n", "\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/bondlength_bondangle/default-improper.txt\n", "\u001b[0mcore.energy_methods.CartesianBondedEnergy: {0} \u001b[0mRead 529 bb-independent improper tors.\n", "\u001b[0mprotocols.minimization_packing.MinMover: {0} \u001b[0mOptions chi, bb: 1, 1 omega: 1\n", "\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0mDefined mover named \"min_torsion\" of type MinMover\n", "\u001b[0mprotocols.minimization_packing.MinMover: {0} \u001b[0mOptions chi, bb: 1, 1 omega: 1\n", "\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0mDefined mover named \"min_cart\" of type MinMover\n", "\u001b[0mprotocols.rosetta_scripts.ParsedProtocol: {0} \u001b[0mParsedProtocol mover with the following settings\n", 
"\u001b[0mprotocols.rosetta_scripts.ParsedProtocol: {0} \u001b[0mAdded mover \"min_cart\"\n" ] } ], "source": [ "from pyrosetta.rosetta.protocols import rosetta_scripts \n", "xml = rosetta_scripts.XmlObjects.create_from_file('./data/Example2-MinMover.xml')" ] }, { "cell_type": "markdown", "id": "f4ab1ba4-c717-4a15-9b5c-579f02bd225c", "metadata": {}, "source": [ "### 3. The full RosettaScripts API\n", "\n", "See: https://new.rosettacommons.org/docs/latest/scripting_documentation/RosettaScripts/RosettaScripts" ] }, { "cell_type": "markdown", "id": "0c7b732c-36bd-45d7-9075-bf7614dcb043", "metadata": {}, "source": [ "#### Closing remarks\n", "RosettaScripts is large and comprehensive; the Python API is flexible and fast!" ] }, { "cell_type": "code", "execution_count": null, "id": "6db9e0f5-c610-45eb-840d-499887a4ae37", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.10" } }, "nbformat": 4, "nbformat_minor": 5 }