{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# The Logic of ResidueSelectors\n", "@Author: 槐喆\n", "@email:zhe.huai@xtalpi.com\n", "\n", "@Proofread: 吴炜坤\n", "@email:weikun.wu@xtalpi.com" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A ResidueSelector picks a subset of residues out of a protein structure (Pose). These subsets drive much of the downstream modeling logic: they can define the degrees of freedom for design or sampling (for example, select the residues within 5 Å of the center of the protein core, then minimize their side-chain energies or apply other structural optimization), and they can be combined with SimpleMetrics, Filters, and similar tools to compute protein properties or parameters.\n", "\n", "Note: ResidueSelectors are a simple concept that beginners can pick up quickly, so this chapter is an easy one." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1. ResidueSelector and vector1_bool" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In PyRosetta, once a ResidueSelector is defined, calling apply (which performs the selection) returns the selected residue subset. The subset is stored in a `vector1<bool>` object. The following example shows how this works:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "PyRosetta-4 2021 [Rosetta PyRosetta4.conda.mac.cxx11thread.serialization.python37.Release 2021.31+release.c7009b3115c22daa9efe2805d9d1ebba08426a54 2021-08-07T10:04:12] retrieved from: http://www.pyrosetta.org\n", "(C) Copyright Rosetta Commons Member Institutions. 
Created in JHU by Sergey Lyskov and PyRosetta Team.\n", "\u001b[0mcore.init: {0} \u001b[0mChecking for fconfig files in pwd and ./rosetta/flags\n", "\u001b[0mcore.init: {0} \u001b[0mRosetta version: PyRosetta4.conda.mac.cxx11thread.serialization.python37.Release r292 2021.31+release.c7009b3115c c7009b3115c22daa9efe2805d9d1ebba08426a54 http://www.pyrosetta.org 2021-08-07T10:04:12\n", "\u001b[0mcore.init: {0} \u001b[0mcommand: PyRosetta -ex1 -ex2aro -database /opt/miniconda3/lib/python3.7/site-packages/pyrosetta/database\n", "\u001b[0mbasic.random.init_random_generator: {0} \u001b[0m'RNG device' seed mode, using '/dev/urandom', seed=1885576095 seed_offset=0 real_seed=1885576095 thread_index=0\n", "\u001b[0mbasic.random.init_random_generator: {0} \u001b[0mRandomGenerator:init: Normal mode, seed=1885576095 RG_type=mt19937\n", "\u001b[0mcore.chemical.GlobalResidueTypeSet: {0} \u001b[0mFinished initializing fa_standard residue type set. Created 983 residue types\n", "\u001b[0mcore.chemical.GlobalResidueTypeSet: {0} \u001b[0mTotal time to initialize 0.651952 seconds.\n", "\u001b[0mcore.import_pose.import_pose: {0} \u001b[0mFile './data/6LZ9_H_L.pdb' automatically determined to be of type PDB\n", "\u001b[0mcore.conformation.Conformation: {0} \u001b[0mFound disulfide between residues 21 94\n", "\u001b[0mcore.conformation.Conformation: {0} \u001b[0mcurrent variant for 21 CYS\n", "\u001b[0mcore.conformation.Conformation: {0} \u001b[0mcurrent variant for 94 CYS\n", "\u001b[0mcore.conformation.Conformation: {0} \u001b[0mcurrent variant for 21 CYD\n", "\u001b[0mcore.conformation.Conformation: {0} \u001b[0mcurrent variant for 94 CYD\n", "\u001b[0mcore.conformation.Conformation: {0} \u001b[0mFound disulfide between residues 141 206\n", "\u001b[0mcore.conformation.Conformation: {0} \u001b[0mcurrent variant for 141 CYS\n", "\u001b[0mcore.conformation.Conformation: {0} \u001b[0mcurrent variant for 206 CYS\n", "\u001b[0mcore.conformation.Conformation: {0} \u001b[0mcurrent variant for 
141 CYD\n", "\u001b[0mcore.conformation.Conformation: {0} \u001b[0mcurrent variant for 206 CYD\n" ] } ], "source": [ "# 导入链选择器\n", "from pyrosetta import pose_from_pdb, init\n", "from pyrosetta.rosetta.core.select.residue_selector import ChainSelector\n", "init()\n", "# 从pdb中读入生成pose对象,(肝细胞生长因子抗体PDB:6LZ9)\n", "pose = pose_from_pdb('./data/6LZ9_H_L.pdb')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<center><img src=\"./img/6LZ9_init.png\" width = \"400\" height = \"300\" align=center /> </center>\n", "(图片来源: 晶泰科技团队)" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "PDB file name: ./data/6LZ9_H_L.pdb\n", " Pose Range Chain PDB Range | #Residues #Atoms\n", "\n", "0001 -- 0081 H 0002 -- 0082 | 0081 residues; 01283 atoms\n", "0082 -- 0082 H 0082A -- 0082A | 0001 residues; 00011 atoms\n", "0083 -- 0083 H 0082B -- 0082B | 0001 residues; 00011 atoms\n", "0084 -- 0084 H 0082C -- 0082C | 0001 residues; 00019 atoms\n", "0085 -- 0102 H 0083 -- 0100 | 0018 residues; 00271 atoms\n", "0103 -- 0103 H 0100A -- 0100A | 0001 residues; 00010 atoms\n", "0104 -- 0104 H 0100B -- 0100B | 0001 residues; 00021 atoms\n", "0105 -- 0105 H 0100C -- 0100C | 0001 residues; 00021 atoms\n", "0106 -- 0106 H 0100D -- 0100D | 0001 residues; 00010 atoms\n", "0107 -- 0107 H 0100E -- 0100E | 0001 residues; 00017 atoms\n", "0108 -- 0118 H 0101 -- 0111 | 0011 residues; 00160 atoms\n", "0119 -- 0223 L 0001 -- 0105 | 0105 residues; 01600 atoms\n", " TOTAL | 0223 residues; 03434 atoms\n", "\n" ] } ], "source": [ "# 先来看抗体的残基基本信息:\n", "print(pose.pdb_info())" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "抗体含有的链数量:2\n", "抗体含有的氨基酸数量:223\n" ] } ], "source": [ "print(f'抗体含有的链数量:{pose.num_chains()}')\n", "print(f'抗体含有的氨基酸数量:{pose.total_residue()}')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ 
"可见抗体中,共有两条链。H链氨基酸范围是1-118,L链氨基酸范围是119-223。" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "vector1_bool[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]\n" ] } ], "source": [ "# 选择抗体的重链,PDB链号为\"H\":\n", "select_heavy_chain = ChainSelector('H')\n", "selected = select_heavy_chain.apply(pose)\n", "print(selected)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**结果解读**<br />\n", "知识点1: vector1_bool中被选择的氨基酸返回“1”,而没有被选择的氨基酸返回“0”<br /> \n", "知识点2: vector1_bool中是按照Pose编号进行编写的(从1开始),也就是说重链的编号从1 -> n, 轻链的编号从n+1 -> 223." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "验证选择器是否正确:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118]\n" ] } ], "source": [ "index_list = [index+1 for index, i in enumerate(selected) if i == 1]\n", "print(index_list)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "可见选择器正确选择了重链的所有氨基酸。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 二、 ResidueSelector的可视化" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "PyRosetta中内置SelectedResiduesPyMOLMetric的函数,可以直接显示被选择的氨基酸。" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "from pyrosetta.rosetta.core.simple_metrics.metrics import SelectedResiduesPyMOLMetric\n", "pymol_selected = SelectedResiduesPyMOLMetric()\n", "pymol_selected.set_residue_selector(select_heavy_chain)\n", "prefix = 'heavy_chain_'\n", "pymol_selected.apply(pose, prefix)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'select rosetta_sele, (chain H and resid 2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,82A,82B,82C,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,100A,100B,100C,100D,100E,101,102,103,104,105,106,107,108,109,110,111)'" ] }, 
"execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from pyrosetta.rosetta.core.simple_metrics import get_sm_data\n", "sm_data = get_sm_data(pose)\n", "string_metric = sm_data.get_string_metric_data()\n", "string_metric['heavy_chain_pymol_selection']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "第一步,在PyMol中的cmd对话框输入上述的选择命令;\n", "\n", "第二步,用棍棒形式呈现\n", "show sticks, rosetta_sele" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<center><img src=\"./img/6LZ9_heavychain.png\" width = \"400\" height = \"300\" align=center /> </center>\n", "(图片来源: 晶泰科技团队)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 三、 ResidueSelector的应用实例\n", "\n", "氨基酸选择器按功能可分为三大类:\n", "\n", "- 逻辑选择器\n", "- 非构象依赖选择器\n", "- 构象依赖选择器\n", "\n", "以下我们将逐步来讲解在实战中,都有哪些氨基酸选择可以为我们所用。<br /> \n", "这一节主要简单示例,下一节将详细讲解不同的API。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 3.1 逻辑选择器\n", "第一部分是逻辑选择器,很好理解,按照逻辑分类为Not、And、Or逻辑关系,可以将**两个**选择器进行逻辑的再次选择。\n", "在Rosetta中,负责逻辑定义的选择器为NotResidueSelector、AndResidueSelector、OrResidueSelector。\n", "以下做实例说明:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "vector1_bool[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# 
还是以之前读入的抗体pose为例。\n", "# 先定义选择的链Selector:\n", "select_heavy_chain = ChainSelector('H')\n", "select_light_chain = ChainSelector('L')\n", "select_light_chain.apply(pose)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "# 可视化,选择轻链\n", "pymol_selected = SelectedResiduesPyMOLMetric()\n", "pymol_selected.set_residue_selector(select_light_chain)\n", "prefix = 'light_chain_'\n", "pymol_selected.apply(pose, prefix)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'select rosetta_sele, (chain L and resid 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105)'" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "string_metric = sm_data.get_string_metric_data()\n", "string_metric['light_chain_pymol_selection']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<center><img src=\"./img/6LZ9_lightchain.png\" width = \"400\" height = \"300\" align=center /> </center>\n", "(图片来源: 晶泰科技团队)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "vector1_bool[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]\n" ] } ], "source": [ "#example1: 选择轻链**或**重链\n", "from pyrosetta.rosetta.core.select.residue_selector import OrResidueSelector\n", "light_or_heavy = OrResidueSelector(select_heavy_chain, select_light_chain)\n", "residue_selector = light_or_heavy.apply(pose)\n", "print(residue_selector)" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "# 可视化 选择轻链**或**重链\n", "pymol_selected = SelectedResiduesPyMOLMetric()\n", "pymol_selected.set_residue_selector(light_or_heavy)\n", "prefix = 'light_or_heavy_'\n", "pymol_selected.apply(pose, prefix)" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'select rosetta_sele, (chain H and resid 2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,82A,82B,82C,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,100A,100B,100C,100D,100E,101,102,103,104,105,106,107,108,109,110,111) or (chain L and resid 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105)'" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "string_metric = sm_data.get_string_metric_data()\n", "string_metric['light_or_heavy_pymol_selection']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<center><img src=\"./img/6LZ9_light_or_heavy.png\" width = \"400\" height = \"300\" align=center /> </center>\n", "(图片来源: 晶泰科技团队)" ] }, { "cell_type": "code", 
"execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "vector1_bool[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]\n" ] } ], "source": [ "#example2: 选择重链**且**轻链\n", "from pyrosetta.rosetta.core.select.residue_selector import AndResidueSelector\n", "light_and_heavy = AndResidueSelector(select_heavy_chain, select_light_chain)\n", "residue_selector = light_and_heavy.apply(pose)\n", "print(residue_selector)" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "# 可视化选择重链**且**轻链\n", "pymol_selected = SelectedResiduesPyMOLMetric()\n", "pymol_selected.set_residue_selector(light_and_heavy)\n", "prefix = 'light_and_heavy_'\n", "pymol_selected.apply(pose, prefix)" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'select rosetta_sele, '" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "string_metric = sm_data.get_string_metric_data()\n", "string_metric['light_and_heavy_pymol_selection']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<center><img src=\"./img/6LZ9_light_and_heavy.png\" width = \"400\" height = \"300\" align=center /> </center>\n", "(图片来源: 晶泰科技团队)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "重链和轻链之间没有交集,所以选择的结果是**空集**" ] }, { 
"cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "vector1_bool[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]\n" ] } ], "source": [ "#example3: 非选择器:\n", "from pyrosetta.rosetta.core.select.residue_selector import NotResidueSelector\n", "not_heavy = NotResidueSelector(select_heavy_chain)\n", "residue_selector = not_heavy.apply(pose)\n", "print(residue_selector)" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [], "source": [ "# 可视化选择 非重链\n", "pymol_selected = SelectedResiduesPyMOLMetric()\n", "pymol_selected.set_residue_selector(not_heavy)\n", "prefix = 'not_heavy_'\n", "pymol_selected.apply(pose, prefix)" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'select rosetta_sele, (chain L and resid 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105)'" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "string_metric = sm_data.get_string_metric_data()\n", 
"string_metric['not_heavy_pymol_selection']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<center><img src=\"./img/6LZ9_not_heavy.png\" width = \"400\" height = \"300\" align=center /> </center>\n", "(图片来源: 晶泰科技团队)" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "vector1_bool[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]\n" ] } ], "source": [ "#example4: 选择整个Pose\n", "from pyrosetta.rosetta.core.select.residue_selector import TrueResidueSelector\n", "true = TrueResidueSelector()\n", "residue_selector = true.apply(pose)\n", "print(residue_selector)" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [], "source": [ "# 可视化选择 整个Pose\n", "pymol_selected = SelectedResiduesPyMOLMetric()\n", "pymol_selected.set_residue_selector(true)\n", "prefix = 'entire_pose_'\n", "pymol_selected.apply(pose, prefix)" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'select rosetta_sele, (chain H and resid 
2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,82A,82B,82C,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,100A,100B,100C,100D,100E,101,102,103,104,105,106,107,108,109,110,111) or (chain L and resid 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105)'" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "string_metric = sm_data.get_string_metric_data()\n", "string_metric['entire_pose_pymol_selection']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<center><img src=\"./img/6LZ9_entire_pose.png\" width = \"400\" height = \"300\" align=center /> </center>\n", "(图片来源: 晶泰科技团队)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 3.2 非构象依赖的选择器\n", "这类选择器的定义不依赖于具体的构象,仅仅依靠属性就可以定义。如氨基酸的序号,氨基酸的名称等。此次简单举两个例子进行说明。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**3.2.1 ResidueIndexSelector**\n", "\n", "通过氨基酸的具体编号定义的选择器,不仅可以使用PDB编号、Pose编号,还可以指定氨基酸的范围进行选择。" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "vector1_bool[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]\n" ] } ], "source": [ "from pyrosetta.rosetta.core.select.residue_selector import ResidueIndexSelector\n", "# 根据具体的Pose编号选择:\n", "pose_index_selector = ResidueIndexSelector('40,42,44')\n", "residue_selector = pose_index_selector.apply(pose)\n", "print(residue_selector)" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [], "source": [ "# 可视化选择 特定的残基位点\n", "pymol_selected = SelectedResiduesPyMOLMetric()\n", "pymol_selected.set_residue_selector(pose_index_selector)\n", "prefix = 'index_select_'\n", "pymol_selected.apply(pose, prefix)" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'select rosetta_sele, (chain H and resid 41,43,45)'" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "string_metric = sm_data.get_string_metric_data()\n", "string_metric['index_select_pymol_selection']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<center><img src=\"./img/6LZ9_index.png\" width = \"400\" height = \"300\" align=center /> </center>\n", "(图片来源: 晶泰科技团队)" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "vector1_bool[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]\n" ] } ], "source": [ "#example1: 根据具体的PDB编号选择, 注意需要附带上PDB链的信息。\n", "pdb_index_selector = ResidueIndexSelector('62H,63H,64H')\n", "residue_selector = pdb_index_selector.apply(pose)\n", "print(residue_selector)" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [], "source": [ "# 可视化选择 根据PDB编号选择的残基\n", "pymol_selected = SelectedResiduesPyMOLMetric()\n", "pymol_selected.set_residue_selector(pdb_index_selector)\n", "prefix = 'pdb_index_select_'\n", "pymol_selected.apply(pose, prefix)" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'select rosetta_sele, (chain H and resid 62,63,64)'" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "string_metric = sm_data.get_string_metric_data()\n", "string_metric['pdb_index_select_pymol_selection']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<center><img src=\"./img/6LZ9_pdb_index.png\" width = \"400\" height = \"300\" align=center /> </center>\n", "(图片来源: 晶泰科技团队)" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "vector1_bool[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]\n" ] } ], "source": [ "#example2: 根据PDB的范围进行选择。\n", "range_selector = ResidueIndexSelector('42H-60H')\n", "residue_selector = range_selector.apply(pose)\n", "print(residue_selector)" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [], "source": [ "# 可视化选择 一定范围的残基\n", "pymol_selected = SelectedResiduesPyMOLMetric()\n", "pymol_selected.set_residue_selector(range_selector)\n", "prefix = 'range_select_'\n", "pymol_selected.apply(pose, prefix)" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'select rosetta_sele, (chain H and resid 42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60)'" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "string_metric = sm_data.get_string_metric_data()\n", "string_metric['range_select_pymol_selection']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<center><img src=\"./img/6LZ9_pdb_range.png\" width = \"400\" height = \"300\" align=center /> </center>\n", "(图片来源: 晶泰科技团队)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**3.2.2. 
ResidueNameSelector**\n", "\n", "通过氨基酸的具体残基名定义的选择器:" ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "vector1_bool[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0]\n" ] } ], "source": [ "#example1: 根据单个残基名进行选择:\n", "from pyrosetta.rosetta.core.select.residue_selector import *\n", "resname_selector = ResidueNameSelector('PHE')\n", "residue_selector = resname_selector.apply(pose)\n", "print(residue_selector)" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [], "source": [ "# 可视化选择 根据残基名选择的残基\n", "pymol_selected = SelectedResiduesPyMOLMetric()\n", "pymol_selected.set_residue_selector(resname_selector)\n", "prefix = 'resname_select_'\n", "pymol_selected.apply(pose, prefix)" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'select rosetta_sele, (chain H and resid 27,79,100) or (chain L and resid 21,49,62,83,87,96,98)'" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "string_metric = sm_data.get_string_metric_data()\n", "string_metric['resname_select_pymol_selection']" ] }, { "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "<center><img src=\"./img/6LZ9_residuename.png\" width = \"400\" height = \"300\" align=center 
/> </center>\n", "(图片来源: 晶泰科技团队)" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "vector1_bool[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0]\n" ] } ], "source": [ "#example2: 根据多个残基名进行选择:\n", "resname_selector = ResidueNameSelector('PHE,ASN')\n", "residue_selector = resname_selector.apply(pose)\n", "print(residue_selector)" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [], "source": [ "# 可视化选择多个残基名对应的残基\n", "pymol_selected = SelectedResiduesPyMOLMetric()\n", "pymol_selected.set_residue_selector(resname_selector)\n", "prefix = 'multi_resname_select_'\n", "pymol_selected.apply(pose, prefix)" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'select rosetta_sele, (chain H and resid 27,54,60,76,79,100) or (chain L and resid 21,31,34,49,62,77,83,87,96,98)'" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "string_metric = sm_data.get_string_metric_data()\n", "string_metric['multi_resname_select_pymol_selection']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<center><img src=\"./img/6LZ9_multi_residuename.png\" width = \"400\" height = \"300\" align=center /> </center>\n", "(图片来源: 晶泰科技团队)" ] }, { "cell_type": "code", 
"execution_count": 38, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "vector1_bool[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]\n" ] } ], "source": [ "#example3: 选择带修饰的氨基酸残基\n", "resname_selector = ResidueNameSelector('CYS')\n", "residue_selector = resname_selector.apply(pose)\n", "print(residue_selector)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "好像出现了问题,残基选择器似乎没有正确地选择我所需要的二硫键残基。让我们打印21号残基的信息,看看出了什么问题?" 
] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Residue 21: CYS:disulfide (CYS, C):\n", "Base: CYS\n", " Properties: POLYMER PROTEIN CANONICAL_AA SC_ORBITALS METALBINDING DISULFIDE_BONDED ALPHA_AA L_AA\n", " Variant types: DISULFIDE\n", " Main-chain atoms: N CA C \n", " Backbone atoms: N CA C O H HA \n", " Side-chain atoms: CB SG 1HB 2HB \n", "Atom Coordinates:\n", " N : 39.126, 55.553, 42.324\n", " CA : 37.869, 55.182, 41.689\n", " C : 37.774, 53.665, 41.73\n", " O : 38.654, 52.976, 41.209\n", " CB : 37.81, 55.713, 40.253\n", " SG : 36.265, 55.41, 39.34\n", " H : 39.995, 55.343, 41.854\n", " HA : 37.051, 55.626, 42.256\n", " 1HB : 37.967, 56.792, 40.257\n", " 2HB : 38.614, 55.268, 39.667\n", "Mirrored relative to coordinates in ResidueType: FALSE\n", "\n" ] } ], "source": [ "print(pose.residue(21))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Interpreting the result**<br />\n", "Selecting with the residue name CYS did not pick up the disulfide-bonded cysteines, because in Rosetta a cysteine that forms a disulfide bond is named\n", "CYS:disulfide. Next, we try selecting with that name instead." 
] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "vector1_bool[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]\n" ] } ], "source": [ "resname_selector = ResidueNameSelector('CYS:disulfide')\n", "residue_selector = resname_selector.apply(pose)\n", "print(residue_selector)" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [], "source": [ "# 可视化选择 带修饰的残基\n", "pymol_selected = SelectedResiduesPyMOLMetric()\n", "pymol_selected.set_residue_selector(resname_selector)\n", "prefix = 'ss_select_'\n", "pymol_selected.apply(pose, prefix)" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'select rosetta_sele, (chain H and resid 22,92) or (chain L and resid 23,88)'" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "string_metric = sm_data.get_string_metric_data()\n", "string_metric['ss_select_pymol_selection']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<center><img src=\"./img/6LZ9_ss.png\" width = \"400\" height = \"300\" align=center /> </center>\n", "(图片来源: 晶泰科技团队)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**结果解读**<br />\n", "现在可以正确选择到对应的二硫键氨基酸子集了!这些二硫键的位置是22H, 92H, 23L, 88L。" ] }, { "cell_type": "markdown", 
"metadata": {}, "source": [ "#### 3.3 构象依赖的选择器\n", "顾名思义,这类选择器与分子结构的具体构象有关,具体地由二面角、二级结构、氢键、邻居分子数量、相互作用界面、对称性等几个层次去进行定义。这里以NeighborhoodResidueSelector为例进行简要说明。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**3.3.1. NeighborhoodResidueSelector**\n", "\n", "选择邻近残基,默认选择10埃范围内的残基。有两种用法来选择,第一种选择半径范围内所有的氨基酸,第二种为选择**邻近范围内**的氨基酸" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[0mcore.select.residue_selector.NeighborhoodResidueSelector: {0} \u001b[0m\u001b[1m[ WARNING ]\u001b[0m ################ Cloning pose and building neighbor graph ################\n", "\u001b[0mcore.select.residue_selector.NeighborhoodResidueSelector: {0} \u001b[0m\u001b[1m[ WARNING ]\u001b[0m Ensure that pose is either scored or has update_residue_neighbors() called\n", "\u001b[0mcore.select.residue_selector.NeighborhoodResidueSelector: {0} \u001b[0m\u001b[1m[ WARNING ]\u001b[0m before using NeighborhoodResidueSelector for maximum performance!\n", "\u001b[0mcore.select.residue_selector.NeighborhoodResidueSelector: {0} \u001b[0m\u001b[1m[ WARNING ]\u001b[0m ##########################################################################\n" ] }, { "data": { "text/plain": [ "vector1_bool[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]" ] }, "execution_count": 43, "metadata": {}, "output_type": 
"execute_result" } ], "source": [ "# 比如选择PDB编号为H链42号氨基酸的10埃范围内所有的氨基酸(包括42号氨基酸):\n", "from pyrosetta.rosetta.core.select.residue_selector import NeighborhoodResidueSelector, ResidueIndexSelector\n", "residue1_selector = ResidueIndexSelector('42H')\n", "nbr_selector = NeighborhoodResidueSelector(residue1_selector, 10.0, True) # True 代表包括42号氨基酸。\n", "nbr_selector.apply(pose)" ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[0mcore.select.residue_selector.NeighborhoodResidueSelector: {0} \u001b[0m\u001b[1m[ WARNING ]\u001b[0m ################ Cloning pose and building neighbor graph ################\n", "\u001b[0mcore.select.residue_selector.NeighborhoodResidueSelector: {0} \u001b[0m\u001b[1m[ WARNING ]\u001b[0m Ensure that pose is either scored or has update_residue_neighbors() called\n", "\u001b[0mcore.select.residue_selector.NeighborhoodResidueSelector: {0} \u001b[0m\u001b[1m[ WARNING ]\u001b[0m before using NeighborhoodResidueSelector for maximum performance!\n", "\u001b[0mcore.select.residue_selector.NeighborhoodResidueSelector: {0} \u001b[0m\u001b[1m[ WARNING ]\u001b[0m ##########################################################################\n" ] } ], "source": [ "# 可视化选择PDB编号为H链42号氨基酸的10埃范围内所有的氨基酸且含42号氨基酸\n", "pymol_selected = SelectedResiduesPyMOLMetric()\n", "pymol_selected.set_residue_selector(nbr_selector)\n", "prefix = 'nbr_select_'\n", "pymol_selected.apply(pose, prefix)" ] }, { "cell_type": "code", "execution_count": 45, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'select rosetta_sele, (chain H and resid 39,40,41,42,43,44,88,89)'" ] }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ "string_metric = sm_data.get_string_metric_data()\n", "string_metric['nbr_select_pymol_selection']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<center><img src=\"./img/6LZ9_nbr.png\" width = \"400\" height = 
\"300\" align=center /> </center>\n", "(图片来源: 晶泰科技团队)" ] }, { "cell_type": "code", "execution_count": 46, "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[0mcore.select.residue_selector.NeighborhoodResidueSelector: {0} \u001b[0m\u001b[1m[ WARNING ]\u001b[0m ################ Cloning pose and building neighbor graph ################\n", "\u001b[0mcore.select.residue_selector.NeighborhoodResidueSelector: {0} \u001b[0m\u001b[1m[ WARNING ]\u001b[0m Ensure that pose is either scored or has update_residue_neighbors() called\n", "\u001b[0mcore.select.residue_selector.NeighborhoodResidueSelector: {0} \u001b[0m\u001b[1m[ WARNING ]\u001b[0m before using NeighborhoodResidueSelector for maximum performance!\n", "\u001b[0mcore.select.residue_selector.NeighborhoodResidueSelector: {0} \u001b[0m\u001b[1m[ WARNING ]\u001b[0m ##########################################################################\n" ] }, { "data": { "text/plain": [ "vector1_bool[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]" ] }, "execution_count": 46, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# 比如选择PDB编号为H链42号氨基酸10埃范围内所有的氨基酸(不包括42号氨基酸):\n", "nbr_selector = NeighborhoodResidueSelector(residue1_selector, 10.0, False) # True 代表包括1号氨基酸。\n", "nbr_selector.apply(pose)" ] }, { "cell_type": "code", "execution_count": 47, "metadata": 
{}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[0mcore.select.residue_selector.NeighborhoodResidueSelector: {0} \u001b[0m\u001b[1m[ WARNING ]\u001b[0m ################ Cloning pose and building neighbor graph ################\n", "\u001b[0mcore.select.residue_selector.NeighborhoodResidueSelector: {0} \u001b[0m\u001b[1m[ WARNING ]\u001b[0m Ensure that pose is either scored or has update_residue_neighbors() called\n", "\u001b[0mcore.select.residue_selector.NeighborhoodResidueSelector: {0} \u001b[0m\u001b[1m[ WARNING ]\u001b[0m before using NeighborhoodResidueSelector for maximum performance!\n", "\u001b[0mcore.select.residue_selector.NeighborhoodResidueSelector: {0} \u001b[0m\u001b[1m[ WARNING ]\u001b[0m ##########################################################################\n" ] } ], "source": [ "# Visualize all residues within 10 Å of residue 42 on chain H (PDB numbering), excluding residue 42\n", "pymol_selected = SelectedResiduesPyMOLMetric()\n", "pymol_selected.set_residue_selector(nbr_selector)\n", "prefix = 'nbr_noself_select_'\n", "pymol_selected.apply(pose, prefix)" ] }, { "cell_type": "code", "execution_count": 48, "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "'select rosetta_sele, (chain H and resid 39,40,41,43,44,88,89)'" ] }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "string_metric = sm_data.get_string_metric_data()\n", "string_metric['nbr_noself_select_pymol_selection']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<center><img src=\"./img/6LZ9_nbr_noself.png\" width = \"400\" height = \"300\" align=center /> </center>\n", "(Image source: XtalPi team)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", 
"nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.10" } }, "nbformat": 4, "nbformat_minor": 4 }