{ "cells": [ { "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "# Visualize antibody mix\n", "We will visualize a hypothetical polyclonal antibody mix same way we'd like to visualize the antibody mixes we deconvolve from deep mutational scanning experiments.\n", "\n", "The hypothetical mix represents antibodies targeting three major neutralizing \"epitopes\" on the SARS-CoV-2 receptor-binding domain (RBD) using the classification scheme of [Barnes et al (2020)](https://www.nature.com/articles/s41586-020-2852-1).\n", "In particular, [Barnes et al (2020)](https://www.nature.com/articles/s41586-020-2852-1) divided anti-RBD antibodies that bind to the receptor-binding motif into three classes (see also [Greaney et al (2021)](https://www.nature.com/articles/s41467-021-24435-8)).\n", "For each class, we will use prior deep mutational scanning on a single well-studied monoclonal of that class to antibody to make plausible choices for how mutations affect that antibody class (of course, in reality as there are many somewhat distinct antibodies in each class).\n", "The antibodies used to represent each class are:\n", "\n", " - *LY-CoV016*: class 1, mutation estimates from [Starr et al (2021), Science](https://science.sciencemag.org/content/371/6531/850)\n", " \n", " - *LY-CoV555*: class 2, mutation estimates from [Starr et al (2021), Cell Reports Medicine](https://doi.org/10.1016/j.xcrm.2021.100255)\n", " \n", " - *REGN10987*: class 3, mutation estimates from [Starr et al (2021), Science](https://science.sciencemag.org/content/371/6531/850)\n", " \n", "Read in the mutation-level escape values $\\beta_{m,e}$ for each mutation against each antibody class for this simulated hypothetical mix:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "execution": { "iopub.execute_input": "2021-11-19T17:52:32.686518Z", "iopub.status.busy": "2021-11-19T17:52:32.685997Z", "iopub.status.idle": "2021-11-19T17:52:33.365140Z", "shell.execute_reply": "2021-11-19T17:52:33.364578Z", "shell.execute_reply.started": "2021-11-19T17:52:32.686395Z" }, "tags": [] }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
epitopemutationescape
0class 1N331A0.08226
1class 1N331D0.77700
2class 1N331E0.08226
3class 1N331F0.08226
4class 1N331G1.70100
............
5791class 3T531R0.72620
5792class 3T531S0.72620
5793class 3T531V0.72480
5794class 3T531W0.72280
5795class 3T531Y0.72620
\n", "

5796 rows × 3 columns

\n", "
" ], "text/plain": [ " epitope mutation escape\n", "0 class 1 N331A 0.08226\n", "1 class 1 N331D 0.77700\n", "2 class 1 N331E 0.08226\n", "3 class 1 N331F 0.08226\n", "4 class 1 N331G 1.70100\n", "... ... ... ...\n", "5791 class 3 T531R 0.72620\n", "5792 class 3 T531S 0.72620\n", "5793 class 3 T531V 0.72480\n", "5794 class 3 T531W 0.72280\n", "5795 class 3 T531Y 0.72620\n", "\n", "[5796 rows x 3 columns]" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "\n", "mut_escape_df = pd.read_csv('RBD_mut_escape_df.csv')\n", "\n", "mut_escape_df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that the data frame only includes 1932 of the $201 \\times 19 = 3819$ possible amino-acid mutations to the RBD; this is because only about half of the mutations are functionally tolerated.\n", "\n", "We also choose simulated activities $a_{\\rm{wt},e}$ for each epitope $e$.\n", "We will let the activity of the polyclonal antibody mix be highest against the class 2 epitope, then next highest against the class 3 epitope, and lowest against the class 1 epitope ([experiments suggest](https://www.nature.com/articles/s41467-021-24435-8) this roughly corresponds to reality for SARS-CoV-2 polyclonal sera):" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "execution": { "iopub.execute_input": "2021-11-19T17:52:33.366476Z", "iopub.status.busy": "2021-11-19T17:52:33.366222Z", "iopub.status.idle": "2021-11-19T17:52:33.373864Z", "shell.execute_reply": "2021-11-19T17:52:33.373307Z", "shell.execute_reply.started": "2021-11-19T17:52:33.366456Z" }, "tags": [] }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
epitopeactivity
0class 11.099
1class 23.178
2class 32.197
\n", "
" ], "text/plain": [ " epitope activity\n", "0 class 1 1.099\n", "1 class 2 3.178\n", "2 class 3 2.197" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "activity_wt_df = pd.read_csv('RBD_activity_wt_df.csv')\n", "\n", "activity_wt_df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we will visualize the data using `Polyclonal` class provided by this package to model polyclonal antibody mixes:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "execution": { "iopub.execute_input": "2021-11-19T17:52:33.374885Z", "iopub.status.busy": "2021-11-19T17:52:33.374634Z", "iopub.status.idle": "2021-11-19T17:52:34.356100Z", "shell.execute_reply": "2021-11-19T17:52:34.355054Z", "shell.execute_reply.started": "2021-11-19T17:52:33.374864Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epitopes: ('class 1', 'class 2', 'class 3')\n", "Number of mutations: 1932\n", "Number of sites: 173\n" ] } ], "source": [ "import polyclonal\n", "\n", "poly_abs = polyclonal.Polyclonal(\n", " activity_wt_df=activity_wt_df,\n", " mut_escape_df=mut_escape_df)\n", "\n", "print(f\"Epitopes: {poly_abs.epitopes}\")\n", "print(f\"Number of mutations: {len(poly_abs.mutations)}\")\n", "print(f\"Number of sites: {len(poly_abs.sites)}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can access the activity values, the mutation escape values, and site-level summaries of the mutation escape values:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "execution": { "iopub.execute_input": "2021-11-19T17:52:34.358893Z", "iopub.status.busy": "2021-11-19T17:52:34.358415Z", "iopub.status.idle": "2021-11-19T17:52:34.369445Z", "shell.execute_reply": "2021-11-19T17:52:34.368731Z", "shell.execute_reply.started": "2021-11-19T17:52:34.358830Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
epitopeactivity
0class 11.099
1class 23.178
2class 32.197
\n", "
" ], "text/plain": [ " epitope activity\n", "0 class 1 1.099\n", "1 class 2 3.178\n", "2 class 3 2.197" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "poly_abs.activity_wt_df" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "execution": { "iopub.execute_input": "2021-11-19T17:52:34.371306Z", "iopub.status.busy": "2021-11-19T17:52:34.370879Z", "iopub.status.idle": "2021-11-19T17:52:34.406624Z", "shell.execute_reply": "2021-11-19T17:52:34.405933Z", "shell.execute_reply.started": "2021-11-19T17:52:34.371262Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
epitopesitewildtypemutantmutationescape
0class 1331NAN331A0.08226
1class 1331NDN331D0.77700
2class 1331NEN331E0.08226
3class 1331NFN331F0.08226
4class 1331NGN331G1.70100
.....................
5791class 3531TRT531R0.72620
5792class 3531TST531S0.72620
5793class 3531TVT531V0.72480
5794class 3531TWT531W0.72280
5795class 3531TYT531Y0.72620
\n", "

5796 rows × 6 columns

\n", "
" ], "text/plain": [ " epitope site wildtype mutant mutation escape\n", "0 class 1 331 N A N331A 0.08226\n", "1 class 1 331 N D N331D 0.77700\n", "2 class 1 331 N E N331E 0.08226\n", "3 class 1 331 N F N331F 0.08226\n", "4 class 1 331 N G N331G 1.70100\n", "... ... ... ... ... ... ...\n", "5791 class 3 531 T R T531R 0.72620\n", "5792 class 3 531 T S T531S 0.72620\n", "5793 class 3 531 T V T531V 0.72480\n", "5794 class 3 531 T W T531W 0.72280\n", "5795 class 3 531 T Y T531Y 0.72620\n", "\n", "[5796 rows x 6 columns]" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "poly_abs.mut_escape_df" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "execution": { "iopub.execute_input": "2021-11-19T17:52:34.407770Z", "iopub.status.busy": "2021-11-19T17:52:34.407543Z", "iopub.status.idle": "2021-11-19T17:52:34.452423Z", "shell.execute_reply": "2021-11-19T17:52:34.451572Z", "shell.execute_reply.started": "2021-11-19T17:52:34.407750Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
epitopesitewildtypemeantotal positivemaxmintotal negative
0class 1331N0.5506288.810042.62000.082260.0
1class 1332I0.98539618.722523.29700.082260.0
2class 1333T0.5539209.970562.60300.082260.0
3class 1334N0.66423611.956241.80800.082260.0
4class 1335L0.3641026.917941.59800.082260.0
...........................
514class 3527P0.73763512.539800.94690.723300.0
515class 3528K0.72466113.043900.72670.723800.0
516class 3529K0.72406713.033200.72570.719400.0
517class 3530S0.72474713.770200.73490.722800.0
518class 3531T0.72448913.765300.72620.720900.0
\n", "

519 rows × 8 columns

\n", "
" ], "text/plain": [ " epitope site wildtype mean total positive max min \\\n", "0 class 1 331 N 0.550628 8.81004 2.6200 0.08226 \n", "1 class 1 332 I 0.985396 18.72252 3.2970 0.08226 \n", "2 class 1 333 T 0.553920 9.97056 2.6030 0.08226 \n", "3 class 1 334 N 0.664236 11.95624 1.8080 0.08226 \n", "4 class 1 335 L 0.364102 6.91794 1.5980 0.08226 \n", ".. ... ... ... ... ... ... ... \n", "514 class 3 527 P 0.737635 12.53980 0.9469 0.72330 \n", "515 class 3 528 K 0.724661 13.04390 0.7267 0.72380 \n", "516 class 3 529 K 0.724067 13.03320 0.7257 0.71940 \n", "517 class 3 530 S 0.724747 13.77020 0.7349 0.72280 \n", "518 class 3 531 T 0.724489 13.76530 0.7262 0.72090 \n", "\n", " total negative \n", "0 0.0 \n", "1 0.0 \n", "2 0.0 \n", "3 0.0 \n", "4 0.0 \n", ".. ... \n", "514 0.0 \n", "515 0.0 \n", "516 0.0 \n", "517 0.0 \n", "518 0.0 \n", "\n", "[519 rows x 8 columns]" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "poly_abs.mut_escape_site_summary_df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also **plot** the relevant values characterizing the polyclonal mix.\n", "\n", "Here is the activity $a_{\\rm{wt},e}$ against each epitope for the unmutated protein:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "execution": { "iopub.execute_input": "2021-11-19T17:52:34.454177Z", "iopub.status.busy": "2021-11-19T17:52:34.453776Z", "iopub.status.idle": "2021-11-19T17:52:34.812542Z", "shell.execute_reply": "2021-11-19T17:52:34.811492Z", "shell.execute_reply.started": "2021-11-19T17:52:34.454138Z" } }, "outputs": [ { "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.Chart(...)" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# NBVAL_IGNORE_OUTPUT\n", "\n", "poly_abs.activity_wt_barplot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here is the mutation escape $\\beta_{m,e}$ for each epitope at each site.\n", "For compact site-level plotting, the mutation escape at each site is summarized by a single number (e.g., as the mean or total of the $\\beta_{m,e}$ for that site).\n", "Note that you can zoom on specific sites, and use the dropdown at the bottom of the plot to select different summary metrics of the escape at each site: " ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "execution": { "iopub.execute_input": "2021-11-19T17:52:34.814389Z", "iopub.status.busy": "2021-11-19T17:52:34.813969Z", "iopub.status.idle": "2021-11-19T17:52:39.292423Z", "shell.execute_reply": "2021-11-19T17:52:39.291375Z", "shell.execute_reply.started": "2021-11-19T17:52:34.814349Z" } }, "outputs": [ { "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.VConcatChart(...)" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# NBVAL_IGNORE_OUTPUT\n", "\n", "poly_abs.mut_escape_lineplot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we want to actually interrogate the effect of each single mutation on escape, we can look at the actual $\\beta_{m,e}$ in the form of a heatmap.\n", "Note that the heatmap is again zoomable, and you can mouse over specific mutations to get the value:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "execution": { "iopub.execute_input": "2021-11-19T17:52:39.295356Z", "iopub.status.busy": "2021-11-19T17:52:39.294694Z", "iopub.status.idle": "2021-11-19T17:52:42.847833Z", "shell.execute_reply": "2021-11-19T17:52:42.847363Z", "shell.execute_reply.started": "2021-11-19T17:52:39.295314Z" } }, "outputs": [ { "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.VConcatChart(...)" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# NBVAL_IGNORE_OUTPUT\n", "\n", "poly_abs.mut_escape_heatmap()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also project the site-level summary metrics of the mutation escape onto the protein structure.\n", "Here we do this using [PDB 6m0j](https://www.rcsb.org/structure/6M0J), which holds the SARS-CoV-2 RBD (chain `E`) in complex with ACE2 (chain `A`).\n", "Specific, the `Polyclonal` object has a method to make versions of the PDB in which the B-factor is re-assigned to one of the site-level summary metrics of escape (such as *mean* or *total positive*):" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "execution": { "iopub.execute_input": "2021-11-19T17:52:42.849528Z", "iopub.status.busy": "2021-11-19T17:52:42.849267Z", "iopub.status.idle": "2021-11-19T17:52:43.393002Z", "shell.execute_reply": "2021-11-19T17:52:43.392524Z", "shell.execute_reply.started": "2021-11-19T17:52:42.849479Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
epitopePDB file
0class 1RBD_mean_class_1.pdb
1class 2RBD_mean_class_2.pdb
2class 3RBD_mean_class_3.pdb
\n", "
" ], "text/plain": [ " epitope PDB file\n", "0 class 1 RBD_mean_class_1.pdb\n", "1 class 2 RBD_mean_class_2.pdb\n", "2 class 3 RBD_mean_class_3.pdb" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "poly_abs.mut_escape_pdb_b_factor(input_pdbfile='6M0J.pdb',\n", " chains='E',\n", " metric='mean',\n", " outfile='RBD_{metric}_{epitope}.pdb')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "These PDB files can then be colored in `pymol` by the escape metric.\n", "If you want the colors to match the same ones used above by the plotting from the `Polyclonal` object, get the colors and convert them to the RGB tuples used by `pymol`:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "execution": { "iopub.execute_input": "2021-11-19T17:52:43.394386Z", "iopub.status.busy": "2021-11-19T17:52:43.394105Z", "iopub.status.idle": "2021-11-19T17:52:43.399347Z", "shell.execute_reply": "2021-11-19T17:52:43.398742Z", "shell.execute_reply.started": "2021-11-19T17:52:43.394361Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "class 1: hex color is #1f77b4; RGB tuple is [0.122, 0.467, 0.706]\n", "class 2: hex color is #ff7f0e; RGB tuple is [1.0, 0.498, 0.055]\n", "class 3: hex color is #2ca02c; RGB tuple is [0.173, 0.627, 0.173]\n" ] } ], "source": [ "import matplotlib.colors\n", "\n", "for epitope, hex_color in poly_abs.epitope_colors.items():\n", " rgb = [round(val, 3) for val in matplotlib.colors.to_rgb(hex_color)]\n", " print(f\"{epitope}: hex color is {hex_color}; RGB tuple is {rgb}\")" ] }, { "cell_type": "markdown", "metadata": { "execution": { "iopub.execute_input": "2021-03-14T20:44:30.039666Z", "iopub.status.busy": "2021-03-14T20:44:30.039028Z", "iopub.status.idle": "2021-03-14T20:44:30.045508Z", "shell.execute_reply": "2021-03-14T20:44:30.044631Z", "shell.execute_reply.started": "2021-03-14T20:44:30.039616Z" } }, "source": [ "Then using these colors, we can use the `pymol spectrum` command to re-color by B-factor.\n", "Here is a Python script that can be run within `pymol` (via `run