{
"cells": [
{
"cell_type": "markdown",
"id": "4d50c9fa-0422-4c70-b5c5-49a90edbffe7",
"metadata": {},
"source": [
"# XmlObjects的基本用法\n",
"\n",
"@Author: 吴炜坤\n",
"\n",
"@E-mail: weikun.wu@xtalpi.com"
]
},
{
"cell_type": "markdown",
"id": "71494982-1f97-412d-87ed-3a7150d91cb3",
"metadata": {},
"source": [
"### 1. 为什么有XmlObject?\n",
"在最早期,Rosetta的mover、filter等都没有为python api流程参数控制的接口,导致部分的组件无法直接调用,只能通过xml脚本读取参数,然后生成对应的对象。\n",
"\n",
"XmlObjects是在PyRosetta中直接使用xml脚本最直接的方法,特别是某些Mover没有做好Pyrosetta接口时特别有用。但是缺点是加载XmlObjects的速度并不理想,比纯粹的PyRosetta脚本启动要慢。"
]
},
{
"cell_type": "markdown",
"id": "06da9647-eda0-4cbc-80a5-fb0ce6704f13",
"metadata": {},
"source": [
"### 2. 如何使用XmlObject?\n",
"最方便的方法有两种调用方式:\n",
"1. create_from_string\n",
"2. create_from_file"
]
},
{
"cell_type": "markdown",
"id": "647d98fd-e38e-4c68-8843-8cd41ec39f26",
"metadata": {},
"source": [
"#### 2.1 create_from_string\n",
"从字符串文本中提取信息,返回XmlObjects"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "0743b33d-8a9b-441a-9d3b-db8f052fe7cc",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"PyRosetta-4 2021 [Rosetta PyRosetta4.conda.mac.cxx11thread.serialization.python37.Release 2021.26+release.b308454c455dd04f6824cc8b23e54bbb9be2cdd7 2021-07-02T13:01:54] retrieved from: http://www.pyrosetta.org\n",
"(C) Copyright Rosetta Commons Member Institutions. Created in JHU by Sergey Lyskov and PyRosetta Team.\n",
"\u001b[0mcore.init: {0} \u001b[0mChecking for fconfig files in pwd and ./rosetta/flags\n",
"\u001b[0mcore.init: {0} \u001b[0mRosetta version: PyRosetta4.conda.mac.cxx11thread.serialization.python37.Release r288 2021.26+release.b308454c455 b308454c455dd04f6824cc8b23e54bbb9be2cdd7 http://www.pyrosetta.org 2021-07-02T13:01:54\n",
"\u001b[0mcore.init: {0} \u001b[0mcommand: PyRosetta -ex1 -ex2aro -database /opt/miniconda3/lib/python3.7/site-packages/pyrosetta/database\n",
"\u001b[0mbasic.random.init_random_generator: {0} \u001b[0m'RNG device' seed mode, using '/dev/urandom', seed=-2025690693 seed_offset=0 real_seed=-2025690693 thread_index=0\n",
"\u001b[0mbasic.random.init_random_generator: {0} \u001b[0mRandomGenerator:init: Normal mode, seed=-2025690693 RG_type=mt19937\n",
"\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0mGenerating XML Schema for rosetta_scripts...\n",
"\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0m...done\n",
"\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0mInitializing schema validator...\n",
"\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0m...done\n",
"\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0mValidating input script...\n",
"\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0m...done\n",
"\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0mParsed script:\n",
"\n",
"\t\n",
"\t\t\n",
"\t\n",
"\t\n",
"\t\n",
"\t\n",
"\t\n",
"\t\t\n",
"\t\t\t\n",
"\t\t\n",
"\t\n",
"\t\n",
"\t\n",
"\t\n",
"\n",
"\u001b[0mcore.scoring.ScoreFunctionFactory: {0} \u001b[0mSCOREFUNCTION: \u001b[32mref2015\u001b[0m\n",
"\u001b[0mcore.scoring.etable: {0} \u001b[0mStarting energy table calculation\n",
"\u001b[0mcore.scoring.etable: {0} \u001b[0msmooth_etable: changing atr/rep split to bottom of energy well\n",
"\u001b[0mcore.scoring.etable: {0} \u001b[0msmooth_etable: spline smoothing lj etables (maxdis = 6)\n",
"\u001b[0mcore.scoring.etable: {0} \u001b[0msmooth_etable: spline smoothing solvation etables (max_dis = 6)\n",
"\u001b[0mcore.scoring.etable: {0} \u001b[0mFinished calculating energy tables.\n",
"\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/hbonds/ref2015_params/HBPoly1D.csv\n",
"\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/hbonds/ref2015_params/HBFadeIntervals.csv\n",
"\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/hbonds/ref2015_params/HBEval.csv\n",
"\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/hbonds/ref2015_params/DonStrength.csv\n",
"\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/hbonds/ref2015_params/AccStrength.csv\n",
"\u001b[0mcore.chemical.GlobalResidueTypeSet: {0} \u001b[0mFinished initializing fa_standard residue type set. Created 984 residue types\n",
"\u001b[0mcore.chemical.GlobalResidueTypeSet: {0} \u001b[0mTotal time to initialize 0.682139 seconds.\n",
"\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/rama/fd/all.ramaProb\n",
"\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/rama/fd/prepro.ramaProb\n",
"\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/omega/omega_ppdep.all.txt\n",
"\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/omega/omega_ppdep.gly.txt\n",
"\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/omega/omega_ppdep.pro.txt\n",
"\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/omega/omega_ppdep.valile.txt\n",
"\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/P_AA_pp/P_AA\n",
"\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/P_AA_pp/P_AA_n\n",
"\u001b[0mcore.scoring.P_AA: {0} \u001b[0mshapovalov_lib::shap_p_aa_pp_smooth_level of 1( aka low_smooth ) got activated.\n",
"\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/P_AA_pp/shapovalov/10deg/kappa131/a20.prop\n",
"\u001b[0mcore.scoring.etable: {0} \u001b[0mStarting energy table calculation\n",
"\u001b[0mcore.scoring.etable: {0} \u001b[0msmooth_etable: changing atr/rep split to bottom of energy well\n",
"\u001b[0mcore.scoring.etable: {0} \u001b[0msmooth_etable: spline smoothing lj etables (maxdis = 6)\n",
"\u001b[0mcore.scoring.etable: {0} \u001b[0msmooth_etable: spline smoothing solvation etables (max_dis = 6)\n",
"\u001b[0mcore.scoring.etable: {0} \u001b[0mFinished calculating energy tables.\n",
"\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/PairEPotential/pdb_pair_stats_fine\n",
"\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/InterchainPotential/interchain_env_log.txt\n",
"\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/InterchainPotential/interchain_pair_log.txt\n",
"\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/EnvPairPotential/env_log.txt\n",
"\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/EnvPairPotential/cbeta_den.txt\n",
"\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/EnvPairPotential/pair_log.txt\n",
"\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/EnvPairPotential/cenpack_log.txt\n",
"\u001b[0mcore.scoring.ramachandran: {0} \u001b[0mshapovalov_lib::shap_rama_smooth_level of 4( aka highest_smooth ) got activated.\n",
"\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/rama/shapovalov/kappa25/all.ramaProb\n",
"\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/rama/flat/avg_L_rama.dat\n",
"\u001b[0mcore.scoring.ramachandran: {0} \u001b[0mReading custom Ramachandran table from scoring/score_functions/rama/flat/avg_L_rama.dat.\n",
"\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/rama/flat/sym_all_rama.dat\n",
"\u001b[0mcore.scoring.ramachandran: {0} \u001b[0mReading custom Ramachandran table from scoring/score_functions/rama/flat/sym_all_rama.dat.\n",
"\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/rama/flat/sym_G_rama.dat\n",
"\u001b[0mcore.scoring.ramachandran: {0} \u001b[0mReading custom Ramachandran table from scoring/score_functions/rama/flat/sym_G_rama.dat.\n",
"\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/rama/flat/sym_P_rama.dat\n",
"\u001b[0mcore.scoring.ramachandran: {0} \u001b[0mReading custom Ramachandran table from scoring/score_functions/rama/flat/sym_P_rama.dat.\n",
"\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/rama/flat/avg_L_rama_str.dat\n",
"\u001b[0mcore.scoring.ramachandran: {0} \u001b[0mReading custom Ramachandran table from scoring/score_functions/rama/flat/avg_L_rama_str.dat.\n",
"\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/rama/flat/sym_all_rama_str.dat\n",
"\u001b[0mcore.scoring.ramachandran: {0} \u001b[0mReading custom Ramachandran table from scoring/score_functions/rama/flat/sym_all_rama_str.dat.\n",
"\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/rama/flat/sym_G_rama_str.dat\n",
"\u001b[0mcore.scoring.ramachandran: {0} \u001b[0mReading custom Ramachandran table from scoring/score_functions/rama/flat/sym_G_rama_str.dat.\n",
"\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/rama/flat/sym_P_rama_str.dat\n",
"\u001b[0mcore.scoring.ramachandran: {0} \u001b[0mReading custom Ramachandran table from scoring/score_functions/rama/flat/sym_P_rama_str.dat.\n",
"\u001b[0mprotocols.jd2.parser.ScoreFunctionLoader: {0} \u001b[0mdefined score function \"sfxn1\" with weights \"ref2015\"\n",
"\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0mDefined mover named \"setTorsion\" of type SetTorsion\n",
"\u001b[0mprotocols.rosetta_scripts.ParsedProtocol: {0} \u001b[0mParsedProtocol mover with the following settings\n"
]
}
],
"source": [
"from pyrosetta.rosetta.protocols import rosetta_scripts \n",
"from pyrosetta import init\n",
"\n",
"# 初始化脚本:\n",
"init()\n",
"xml = rosetta_scripts.XmlObjects.create_from_string('''\n",
"\n",
" \n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
" \n",
"\t \n",
"\t\t \n",
"\t\t \n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"''')"
]
},
{
"cell_type": "markdown",
"id": "837479ff-3402-4018-b646-0b94dcf4902e",
"metadata": {},
"source": [
"**提取Filter、Mover、Selector、SimpleMetric、TaskOperation的语法语句**"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "5f7fa7a8-13fb-4286-99f9-848277443082",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
""
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# get a mover:\n",
"mover_ = xml.get_mover(\"setTorsion\")\n",
"mover_\n",
"\n",
"# 如此类推:\n",
"# filter_ = xml.get_filter(name)\n",
"# selector_ = xml.get_residue_selector(name)\n",
"# score_ = xml.get_score_function(name)\n",
"# sm_ = xml.get_simple_metric(name)\n",
"# tf_ = xml.get_task_operation(name)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "ea7c9597-8eaf-4964-9404-f3f5001cc26c",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"vector1_std_string[ParsedProtocol, null, setTorsion]"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#列表式所选内容。\n",
"xml.list_movers()\n",
"\n",
"# 如此类推:\n",
"# xml.list_filters()\n",
"# xml.list_residue_selectors()\n",
"# xml.list_score_functions()\n",
"# xml.list_simple_metrics()\n",
"# xml.list_task_operations()"
]
},
{
"cell_type": "markdown",
"id": "707259f6-6bef-416f-9fdf-20ea673f425a",
"metadata": {},
"source": [
"#### 2.2 create_from_file\n",
"从字符串文本中提取信息,返回XmlObjects"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "b4e9c8e4-7d98-4a65-94a0-6ff971cfd88e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0mGenerating XML Schema for rosetta_scripts...\n",
"\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0m...done\n",
"\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0mInitializing schema validator...\n",
"\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0m...done\n",
"\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0mValidating input script...\n",
"\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0m...done\n",
"\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0mParsed script:\n",
"\n",
"\t\n",
"\t\t\n",
"\t\t\n",
"\t\t\t\n",
"\t\t\t\n",
"\t\t\n",
"\t\n",
"\t\n",
"\t\n",
"\t\n",
"\t\n",
"\t\t\n",
"\t\t\n",
"\t\n",
"\t\n",
"\t\n",
"\t\t\n",
"\t\n",
"\t\n",
"\u001b[0mcore.scoring.ScoreFunctionFactory: {0} \u001b[0mSCOREFUNCTION: \u001b[32mref2015\u001b[0m\n",
"\u001b[0mcore.mm.MMLJLibrary: {0} \u001b[0mMM lj sets added: 105\n",
"\u001b[0mcore.mm.MMTorsionLibrary: {0} \u001b[0mMM torsion sets added fully assigned: 1039; wildcard: 48 and 1 virtual parameter.\n",
"\u001b[0mprotocols.jd2.parser.ScoreFunctionLoader: {0} \u001b[0mdefined score function \"molmech\" with weights \"mm_std_fa_elec_dslf_fa13\"\n",
"\u001b[0mprotocols.jd2.parser.ScoreFunctionLoader: {0} \u001b[0mdefined score function \"r15_cart\" with weights \"ref2015\"\n",
"\u001b[0mprotocols.jd2.parser.ScoreFunctionLoader: {0} \u001b[0m setting r15_cart weight pro_close to 0\n",
"\u001b[0mprotocols.jd2.parser.ScoreFunctionLoader: {0} \u001b[0m setting r15_cart weight cart_bonded to 0.625\n",
"\u001b[0mcore.energy_methods.CartesianBondedEnergy: {0} \u001b[0mInitializing IdealParametersDatabase with default Ks=300 , 80 , 80 , 10 , 80\n",
"\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/bondlength_bondangle/default-lengths.txt\n",
"\u001b[0mcore.energy_methods.CartesianBondedEnergy: {0} \u001b[0mRead 759 bb-independent lengths.\n",
"\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/bondlength_bondangle/default-angles.txt\n",
"\u001b[0mcore.energy_methods.CartesianBondedEnergy: {0} \u001b[0mRead 1434 bb-independent angles.\n",
"\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/bondlength_bondangle/default-torsions.txt\n",
"\u001b[0mcore.energy_methods.CartesianBondedEnergy: {0} \u001b[0mRead 1 bb-independent torsions.\n",
"\u001b[0mbasic.io.database: {0} \u001b[0mDatabase file opened: scoring/score_functions/bondlength_bondangle/default-improper.txt\n",
"\u001b[0mcore.energy_methods.CartesianBondedEnergy: {0} \u001b[0mRead 529 bb-independent improper tors.\n",
"\u001b[0mprotocols.minimization_packing.MinMover: {0} \u001b[0mOptions chi, bb: 1, 1 omega: 1\n",
"\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0mDefined mover named \"min_torsion\" of type MinMover\n",
"\u001b[0mprotocols.minimization_packing.MinMover: {0} \u001b[0mOptions chi, bb: 1, 1 omega: 1\n",
"\u001b[0mprotocols.rosetta_scripts.RosettaScriptsParser: {0} \u001b[0mDefined mover named \"min_cart\" of type MinMover\n",
"\u001b[0mprotocols.rosetta_scripts.ParsedProtocol: {0} \u001b[0mParsedProtocol mover with the following settings\n",
"\u001b[0mprotocols.rosetta_scripts.ParsedProtocol: {0} \u001b[0mAdded mover \"min_cart\"\n"
]
}
],
"source": [
"from pyrosetta.rosetta.protocols import rosetta_scripts \n",
"xml = rosetta_scripts.XmlObjects.create_from_file('./data/Example2-MinMover.xml')"
]
},
{
"cell_type": "markdown",
"id": "f4ab1ba4-c717-4a15-9b5c-579f02bd225c",
"metadata": {},
"source": [
"### 3. RosettaScript所有的API\n",
"\n",
"详见: https://new.rosettacommons.org/docs/latest/scripting_documentation/RosettaScripts/RosettaScripts"
]
},
{
"cell_type": "markdown",
"id": "0c7b732c-36bd-45d7-9075-bf7614dcb043",
"metadata": {},
"source": [
"#### 结语: \n",
"RosettaScript大而全,Python API灵活速度快!"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6db9e0f5-c610-45eb-840d-499887a4ae37",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.10"
}
},
"nbformat": 4,
"nbformat_minor": 5
}