{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Homework 9.2: Data transformations and parameter estimation (30 pts)\n", "\n", "[Data download](https://s3.amazonaws.com/bebi103.caltech.edu/data/fret_binding_curve.csv)\n", "\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We often want to ascertain how tightly two proteins are bound by measuring their dissociation constant, $K_d$. This is usually done by performing a titration experiment and then computing a maximum likelihood estimate (MLE) of $K_d$. For example, imagine two proteins, $a$ and $b$, that may bind to each other in the reaction\n", "\n", "\\begin{align}\n", "ab \\rightleftharpoons a + b\n", "\\end{align}\n", "\n", "with dissociation constant $K_d$. At equilibrium\n", "\n", "\\begin{align}\n", "K_d = \\frac{c_a\\,c_b}{c_{ab}},\n", "\\end{align}\n", "\n", "where $c_i$ is the concentration of species $i$. If we add known amounts of $a$ and $b$ to a solution such that the total concentration of $a$ is $c_a^0$ and the total concentration of $b$ is $c_b^0$, we can compute the equilibrium concentrations of all species. Specifically, in addition to the equation above, we have conservation of mass equations,\n", "\n", "\\begin{align}\n", "c_a^0 &= c_a + c_{ab}\\\\[1em]\n", "c_b^0 &= c_b + c_{ab},\n", "\\end{align}\n", "\n", "fully specifying the problem. We can solve the three equations for $c_{ab}$ in terms of the known quantities $c_a^0$ and $c_b^0$, along with the parameter we are trying to measure, $K_d$. We get\n", "\n", "\\begin{align}\n", "c_{ab} = \\frac{2c_a^0\\,c_b^0}{K_d+c_a^0+c_b^0 + \\sqrt{\\left(K_d+c_a^0+c_b^0\\right)^2 - 4c_a^0\\,c_b^0}}.\n", "\\end{align}\n", "\n", "The technique, then, is to hold $c_a^0$ fixed and measure $c_{ab}$ for various $c_b^0$. We can then devise a variate-covariate model and obtain an MLE of $K_d$.\n", "\n", "In order to do this, though, we need some readout of $c_{ab}$. For this problem, we will use FRET (fluorescence resonance energy transfer) to monitor how much of $a$ is bound to $b$. Specifically, $a$ carries a fluorophore and $b$ is a receptor. When the two are unbound, we get a fluorescence signal per molecule of $f_0$. 
When they are bound, the receptor absorbs the light coming out of the fluorophore, so we get less fluorescence per molecule, which we will call $f_q$ (for \"quenched\"). Let $f$ be the total per-fluorophore fluorescence signal. Then, the measured fluorescence signal, $F$, is\n", "\n", "\\begin{align}\n", "F = c_a^0\\,V f = \\left(c_a \\,f_0 + c_{ab}\\, f_q\\right)V,\n", "\\end{align}\n", "\n", "where $V$ is the reaction volume.\n", "\n", "As is commonly done by biochemists, we can define a FRET efficiency, $e$, as\n", "\n", "\\begin{align}\n", "e = 1 - \\frac{f}{f_0}.\n", "\\end{align}\n", "\n", "If we measure $F_0$, the measured fluorescence when there is no $b$ protein in the sample, we can compute the FRET efficiency from the measured values $F$ and $F_0$ as\n", "\n", "\\begin{align}\n", "e = 1 - \\frac{c_a^0\\,V f}{c_a^0\\,Vf_0} = 1 - \\frac{F}{F_0}.\n", "\\end{align}\n", "\n", "Substituting in our expressions for $F$ and $F_0$, we get\n", "\n", "\\begin{align}\n", "e = 1 - \\frac{\\left(c_a \\,f_0 + c_{ab}\\, f_q\\right)V}{c_a^0\\,V f_0}\n", "= 1 - \\frac{c_a}{c_a^0} - \\frac{c_{ab}}{c_a^0}\\,\\frac{f_q}{f_0}.\n", "\\end{align}\n", "\n", "Using the fact that $c_a^0 = c_a + c_{ab}$, this becomes\n", "\n", "\\begin{align}\n", "e = \\left(1-\\frac{f_q}{f_0}\\right)\\frac{c_{ab}}{c_a^0}.\n", "\\end{align}\n", "\n", "In other words, the FRET efficiency is proportional to the fraction of $a$ that is bound, or\n", "\n", "\\begin{align}\n", "e = \\alpha \\, \\frac{c_{ab}}{c_a^0} = \\frac{2\\alpha\\,c_b^0}{K_d+c_a^0+c_b^0 + \\sqrt{\\left(K_d+c_a^0+c_b^0\\right)^2 - 4c_a^0\\,c_b^0}},\n", "\\end{align}\n", "\n", "where $\\alpha = 1 - f_q/f_0$. Biochemists then typically consider $e$ to be a variate (and $c_a^0$ and $c_b^0$ to be covariates) and then obtain MLEs for the parameters $\\alpha$ and $K_d$.\n", "\n", "**a)** Load in the data for one of these FRET efficiency titration curves. 
You can download the data set [here](https://s3.amazonaws.com/bebi103.caltech.edu/data/fret_binding_curve.csv). These are real data from here on campus, collected by a former student in this class, Emily Blythe. They were never published, but were preliminary experiments for [this publication](https://doi.org/10.1016/j.str.2019.09.011). To get the fluorescence for each measurement, you need to subtract the background fluorescence. Do that, and then also compute the FRET efficiency.\n", "\n", "**b)** One could follow the typical approach used by biochemists and build a variate-covariate model around the FRET efficiency, as described above, to obtain estimates for $K_d$ and $\\alpha$. Alternatively, one could directly use the measured (background-subtracted) fluorescence and build a variate-covariate model around the equation\n", "\n", "\\begin{align}\n", "F = \\left(c_a \\,f_0 + c_{ab}\\, f_q\\right)V,\n", "\\end{align}\n", "\n", "where there are now three parameters, $K_d$, $f_0V$, and $f_qV$, from which $\\alpha$ may be calculated as $\\alpha = 1 - (f_qV)/(f_0V)$. Which of these two approaches is preferred, and why?\n", "\n", "**c)** Provide MLEs for $\\alpha$ and $K_d$, along with confidence intervals, and display a graphical model assessment." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.5" } }, "nbformat": 4, "nbformat_minor": 4 }