{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Sera concentrations" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Choosing sera concentrations is an important consideration for DMS selections. We'll use simulated data to estimate the optimal set of sera concentrations that provides the best performance while minimizing the **number of concentrations** (or selection experiments) required." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import os\n", "import pickle\n", "\n", "import altair as alt\n", "import numpy as np\n", "import pandas as pd\n", "import polyclonal" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, we read in a simulated \"noisy\" dataset measured at six sera concentrations. The variants in this library were simulated to contain a Poisson-distributed number of mutations, with an average of three mutations per gene. The variants also generally span a wide range of escape fractions across the different sera concentrations." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
| | library | barcode | concentration | prob_escape | aa_substitutions | IC90 |
|---|---|---|---|---|---|---|
| 0 | avg3muts | AAAACTGCTGAGGAGA | 0.125 | 0.013540 | | 0.08212 |
| 1 | avg3muts | AAAAGCAGGCTACTCT | 0.125 | 0.066420 | | 0.08212 |
| 2 | avg3muts | AAAAGCTATAGGTGCC | 0.125 | 0.023120 | | 0.08212 |
| 3 | avg3muts | AAAAGGTATTAGTGGC | 0.125 | 0.005456 | | 0.08212 |
| 4 | avg3muts | AAAAGTGCCTTCGTTA | 0.125 | 0.054760 | | 0.08212 |
| ... | ... | ... | ... | ... | ... | ... |
| 239995 | avg3muts | CTTAAAATAGCTGGTC | 0.250 | 0.000000 | Y508W | 0.08212 |
| 239996 | avg3muts | CTTAAAATAGCTGGTC | 0.500 | 0.026480 | Y508W | 0.08212 |
| 239997 | avg3muts | CTTAAAATAGCTGGTC | 1.000 | 0.012260 | Y508W | 0.08212 |
| 239998 | avg3muts | CTTAAAATAGCTGGTC | 2.000 | 0.000000 | Y508W | 0.08212 |
| 239999 | avg3muts | CTTAAAATAGCTGGTC | 4.000 | 0.000000 | Y508W | 0.08212 |
240000 rows × 6 columns
| concentration | barcode | ICxx_against_wt |
|---|---|---|
| 0.125 | AAAAAATGTTCTATCC | 95.141 |
| 0.125 | AAAAACAATCCGGACT | 95.141 |
| 0.125 | AAAAACGCGGTCACTT | 95.141 |
| 0.125 | AAAAACTTGGCTAGCT | 95.141 |
| 0.125 | AAAAAGCAAGGCCCAG | 95.141 |
| ... | ... | ... |
| 4.000 | TTTGTATGGTCCATAT | 99.999 |
| 4.000 | TTTGTCTCGAATGGTG | 99.999 |
| 4.000 | TTTTAAGCTCATACGC | 99.999 |
| 4.000 | TTTTATCCACCGAACT | 99.999 |
| 4.000 | TTTTCGATCCTTGTCA | 99.999 |
137868 rows × 2 columns
\n", "