{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## STAT5 Dimerization Model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The goal of the notebook is to study practical identifiability of [STAT5 Dimerization](doi:10.1021/pr5006923) Model with *LikelihoodProfiler*. STAT5 Dimerization is one of the Benchmark models for [dMod R package](https://github.com/dkaschek/dMod). We have translated the model to [Julia language](https://julialang.org/). [dMod BenchmarkModels repo](https://github.com/dkaschek/dMod/tree/master/BenchmarkModels/Boehm_JProteomeRes2014) contains the model files, experimental data and best-fit parameters.\n", "The model is defined by the following system of differential equations:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "using DiffEqBase, CSV, DataFrames, LikelihoodProfiler, LSODA\n", "using Plots\n", "\n", "# constants\n", "const cyt = 1.4\n", "const nuc = 0.45\n", "const ratio = 0.693\n", "const specC17 = 0.107\n", "\n", "\n", "# ode system\n", "function stat5_ode(du, u, p, time)\n", " # 8 states:\n", " (STAT5A, pApA, STAT5B, pApB, pBpB, nucpApA, nucpApB, nucpBpB) = u\n", " # 6 parameters\n", " (Epo_degradation_BaF3, k_exp_hetero, k_exp_homo, k_imp_hetero, k_imp_homo, k_phos) = p\n", " \n", " BaF3_Epo = 1.25e-7*exp(-1*Epo_degradation_BaF3*time)\n", "\n", " v1 = BaF3_Epo*(STAT5A^2)*k_phos\n", " v2 = BaF3_Epo*STAT5A*STAT5B*k_phos\n", " v3 = BaF3_Epo*(STAT5B^2)*k_phos\n", " v4 = k_imp_homo*pApA\n", " v5 = k_imp_hetero*pApB\n", " v6 = k_imp_homo*pBpB\n", " v7 = k_exp_homo*nucpApA\n", " v8 = k_exp_hetero*nucpApB\n", " v9 = k_exp_homo*nucpBpB\n", "\n", " du[1] = -2*v1 - v2 + 2*v7*(nuc/cyt) + v8*(nuc/cyt)\n", " du[2] = v1 - v4\n", " du[3] = -v2 -2*v3 + v8*(nuc/cyt) + 2*v9*(nuc/cyt)\n", " du[4] = v2 - v5\n", " du[5] = v3 - v6\n", " du[6] = v4*(cyt/nuc) - v7\n", " du[7] = v5*(cyt/nuc) - v8\n", " du[8] = v6*(cyt/nuc) - v9\n", " end;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's load the experimental dataset " ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "data = DataFrame!(CSV.File(\"data_stat5.csv\"));" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Observables, initial values and solver settings are set according to [model's files](https://github.com/dkaschek/dMod/tree/master/BenchmarkModels/Boehm_JProteomeRes2014) and [dMod settings](https://github.com/dkaschek/dMod/blob/master/R/PEtab2dMod.R)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "saveat = Float64.(data[!,:time])\n", "tspan = (0.,saveat[end])\n", "\n", "u0 = zeros(8)\n", "u0[1] = 207.6*ratio # STAT5A\n", "u0[3] = 207.6 - 207.6*ratio # STAT5B\n", "\n", "prob(p) = ODEProblem(stat5_ode, eltype(p).(u0), tspan, p)\n", "\n", "function solve_prob(p)\n", " _prob = prob(p)\n", "\n", " # solution\n", " sol = solve(_prob, lsoda(), saveat=saveat, reltol=1e-7,abstol=1e-7) #save_idxs=[1,2,3,4,5] \n", " STAT5A = sol[1,:]\n", " pApA = sol[2,:]\n", " STAT5B = sol[3,:]\n", " pApB = sol[4,:]\n", " pBpB = sol[5,:]\n", "\n", " # observables\n", " pSTAT5A_rel = (100 * pApB + 200 * pApA * specC17) ./ (pApB + STAT5A * specC17 + 2 * pApA * specC17)\n", " pSTAT5B_rel = -(100 * pApB - 200 * pBpB * (specC17 - 1)) ./ ((STAT5B * (specC17 - 1) - pApB) + 2 * pBpB * (specC17 - 1))\n", " rSTAT5A_rel = (100 * pApB + 100 * STAT5A * specC17 + 200 * pApA * specC17) ./ (2 * pApB + STAT5A * specC17 + 2 * pApA * specC17 - STAT5B * (specC17 - 1) - 2 * pBpB * (specC17 - 1))\n", "\n", " return [pSTAT5A_rel, pSTAT5B_rel, rSTAT5A_rel]\n", "end;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The [best-fit](https://github.com/dkaschek/dMod/blob/master/BenchmarkModels/Boehm_JProteomeRes2014/parameters_Boehm_JProteomeRes2014.tsv) parameters values:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "p_best = [\n", " 0.026982514033029, # Epo_degradation_BaF3\n", " 0.0000100067973851508, # k_exp_hetero\n", " 0.006170228086381, # k_exp_homo\n", " 0.0163679184468, # k_imp_hetero\n", " 97749.3794024716, # k_imp_homo\n", " 15766.5070195731, # k_phos\n", " 3.85261197844677, # sd_pSTAT5A_rel\n", " 6.59147818673419, # sd_pSTAT5B_rel\n", " 3.15271275648527 # sd_rSTAT5A_rel\n", "];" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's run the simulations and plot results for best-fit parameters values" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n", "\n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "0\n", "\n", "\n", "50\n", "\n", "\n", "100\n", "\n", "\n", "150\n", "\n", "\n", "200\n", "\n", "\n", "0\n", "\n", "\n", "20\n", "\n", "\n", "40\n", "\n", "\n", "60\n", "\n", "\n", "80\n", "\n", "\n", "time\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "pSTAT5A_rel\n", "\n", "\n", "\n", "pSTAT5B_rel\n", "\n", "\n", "\n", "rSTAT5A_rel\n", "\n", "\n" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sol = solve_prob(p_best)\n", "plot(saveat,sol, label=[:pSTAT5A_rel :pSTAT5B_rel :rSTAT5A_rel], xlabel=:time)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Loss function includes parameters transformation to log10 scale:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "function loss_func(p_init)\n", " p = exp10.(p_init)\n", "\n", " sim = solve_prob(p)\n", " σ = p[7:9]\n", " # loss\n", " return loss(sim,data,σ)\n", "end\n", "\n", "function loss(sim,data,σ)\n", " loss = 0.0\n", " obs = names(data)[2:end] \n", "\n", " for i in 1:length(obs)\n", " loss_i = loss_component(sim[i],data[!,i+1],σ[i])\n", " loss += loss_i\n", " end\n", " return loss\n", "end\n", "\n", "function loss_component(sim,data,σ)\n", " loss_i = 0.0\n", " \n", " for i in eachindex(sim)\n", " loss_i += ((sim[i]-data[i])/σ)^2 + 2*log(sqrt(2π)*σ)\n", " end\n", " return loss_i\n", "end;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we are ready to perform identifiability analysis with *LikelihoodProfiler*" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0.784934 seconds (1.69 M allocations: 135.207 MiB, 4.00% gc time)\n", " 1.044460 seconds (1.67 M allocations: 134.190 MiB, 6.98% gc time)\n", " 0.595737 seconds (902.18 k allocations: 72.128 MiB, 3.08% gc time)\n", " 0.470997 seconds (595.08 k allocations: 47.605 MiB, 4.63% gc time)\n", " 1.545490 seconds (2.59 M allocations: 205.508 MiB, 4.24% gc time)\n", " 0.301798 seconds (533.45 k allocations: 42.660 MiB, 5.99% gc time)\n", " 0.364914 seconds (709.19 k allocations: 56.716 MiB, 5.36% gc time)\n", " 0.655530 seconds (703.24 k allocations: 56.275 MiB, 3.77% gc time)\n", " 1.204388 seconds (1.84 M allocations: 147.707 MiB, 3.50% gc time)\n" ] } ], "source": [ "α = loss_func(log10.(p_best)) + 3.84 # chisq with 1 df\n", "\n", "# search CI with LikelihoodProfiler\n", "num_params = length(p_best)\n", "\n", "intervals = Vector{ParamInterval}(undef,num_params)\n", "p_log = log10.(p_best)\n", "\n", "tbounds = fill((-7.,7.), num_params)\n", "sbounds = (-5.,5.)\n", "for i in 1:num_params\n", " @time intervals[i] = get_interval(\n", " p_log,\n", " i,\n", " loss_func,\n", " :CICO_ONE_PASS,\n", " loss_crit = α,\n", " theta_bounds = tbounds,\n", " scan_bounds = sbounds,\n", " scan_tol = 1e-2,\n", " local_alg = :LN_NELDERMEAD,\n", " )\n", "end;" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/html": [ "

9 rows × 8 columns

paramsLSatusUStatusLBoundUBoundLCountUCountInitValues
SymbolSymbolSymbolUnion…Union…Int64Int64Float64
1Epo_degradation_BaF3BORDER_FOUND_BY_SCAN_TOLBORDER_FOUND_BY_SCAN_TOL-1.70803-1.41987523494-1.56892
2k_exp_heteroSCAN_BOUND_REACHEDBORDER_FOUND_BY_SCAN_TOL-3.1462941036-4.9997
3k_exp_homoBORDER_FOUND_BY_SCAN_TOLBORDER_FOUND_BY_SCAN_TOL-2.48322-1.98024237289-2.2097
4k_imp_heteroBORDER_FOUND_BY_SCAN_TOLBORDER_FOUND_BY_SCAN_TOL-1.85648-1.68961171179-1.78601
5k_imp_homoBORDER_FOUND_BY_SCAN_TOLSCAN_BOUND_REACHED0.190413128774.99011
6k_phosBORDER_FOUND_BY_SCAN_TOLBORDER_FOUND_BY_SCAN_TOL4.159944.272431431684.19774
7sd_pSTAT5A_relBORDER_FOUND_BY_SCAN_TOLBORDER_FOUND_BY_SCAN_TOL0.437580.7684131722430.585755
8sd_pSTAT5B_relBORDER_FOUND_BY_SCAN_TOLBORDER_FOUND_BY_SCAN_TOL0.7207680.9901632311860.818983
9sd_rSTAT5A_relBORDER_FOUND_BY_SCAN_TOLBORDER_FOUND_BY_SCAN_TOL0.4001020.6661052049290.498684
" ], "text/latex": [ "\\begin{tabular}{r|cccccccc}\n", "\t& params & LSatus & UStatus & LBound & UBound & LCount & UCount & InitValues\\\\\n", "\t\\hline\n", "\t& Symbol & Symbol & Symbol & Union… & Union… & Int64 & Int64 & Float64\\\\\n", "\t\\hline\n", "\t1 & Epo\\_degradation\\_BaF3 & BORDER\\_FOUND\\_BY\\_SCAN\\_TOL & BORDER\\_FOUND\\_BY\\_SCAN\\_TOL & -1.70803 & -1.41987 & 523 & 494 & -1.56892 \\\\\n", "\t2 & k\\_exp\\_hetero & SCAN\\_BOUND\\_REACHED & BORDER\\_FOUND\\_BY\\_SCAN\\_TOL & & -3.14629 & 4 & 1036 & -4.9997 \\\\\n", "\t3 & k\\_exp\\_homo & BORDER\\_FOUND\\_BY\\_SCAN\\_TOL & BORDER\\_FOUND\\_BY\\_SCAN\\_TOL & -2.48322 & -1.98024 & 237 & 289 & -2.2097 \\\\\n", "\t4 & k\\_imp\\_hetero & BORDER\\_FOUND\\_BY\\_SCAN\\_TOL & BORDER\\_FOUND\\_BY\\_SCAN\\_TOL & -1.85648 & -1.68961 & 171 & 179 & -1.78601 \\\\\n", "\t5 & k\\_imp\\_homo & BORDER\\_FOUND\\_BY\\_SCAN\\_TOL & SCAN\\_BOUND\\_REACHED & 0.190413 & & 1287 & 7 & 4.99011 \\\\\n", "\t6 & k\\_phos & BORDER\\_FOUND\\_BY\\_SCAN\\_TOL & BORDER\\_FOUND\\_BY\\_SCAN\\_TOL & 4.15994 & 4.27243 & 143 & 168 & 4.19774 \\\\\n", "\t7 & sd\\_pSTAT5A\\_rel & BORDER\\_FOUND\\_BY\\_SCAN\\_TOL & BORDER\\_FOUND\\_BY\\_SCAN\\_TOL & 0.43758 & 0.768413 & 172 & 243 & 0.585755 \\\\\n", "\t8 & sd\\_pSTAT5B\\_rel & BORDER\\_FOUND\\_BY\\_SCAN\\_TOL & BORDER\\_FOUND\\_BY\\_SCAN\\_TOL & 0.720768 & 0.990163 & 231 & 186 & 0.818983 \\\\\n", "\t9 & sd\\_rSTAT5A\\_rel & BORDER\\_FOUND\\_BY\\_SCAN\\_TOL & BORDER\\_FOUND\\_BY\\_SCAN\\_TOL & 0.400102 & 0.666105 & 204 & 929 & 0.498684 \\\\\n", "\\end{tabular}\n" ], "text/plain": [ "9×8 DataFrame. Omitted printing of 2 columns\n", "│ Row │ params │ LSatus │ UStatus │ LBound │ UBound │ LCount │\n", "│ │ \u001b[90mSymbol\u001b[39m │ \u001b[90mSymbol\u001b[39m │ \u001b[90mSymbol\u001b[39m │ \u001b[90mUnion…\u001b[39m │ \u001b[90mUnion…\u001b[39m │ \u001b[90mInt64\u001b[39m │\n", "├─────┼──────────────────────┼──────────────────────────┼──────────────────────────┼──────────┼──────────┼────────┤\n", "│ 1 │ Epo_degradation_BaF3 │ BORDER_FOUND_BY_SCAN_TOL │ BORDER_FOUND_BY_SCAN_TOL │ -1.70803 │ -1.41987 │ 523 │\n", "│ 2 │ k_exp_hetero │ SCAN_BOUND_REACHED │ BORDER_FOUND_BY_SCAN_TOL │ │ -3.14629 │ 4 │\n", "│ 3 │ k_exp_homo │ BORDER_FOUND_BY_SCAN_TOL │ BORDER_FOUND_BY_SCAN_TOL │ -2.48322 │ -1.98024 │ 237 │\n", "│ 4 │ k_imp_hetero │ BORDER_FOUND_BY_SCAN_TOL │ BORDER_FOUND_BY_SCAN_TOL │ -1.85648 │ -1.68961 │ 171 │\n", "│ 5 │ k_imp_homo │ BORDER_FOUND_BY_SCAN_TOL │ SCAN_BOUND_REACHED │ 0.190413 │ │ 1287 │\n", "│ 6 │ k_phos │ BORDER_FOUND_BY_SCAN_TOL │ BORDER_FOUND_BY_SCAN_TOL │ 4.15994 │ 4.27243 │ 143 │\n", "│ 7 │ sd_pSTAT5A_rel │ BORDER_FOUND_BY_SCAN_TOL │ BORDER_FOUND_BY_SCAN_TOL │ 0.43758 │ 0.768413 │ 172 │\n", "│ 8 │ sd_pSTAT5B_rel │ BORDER_FOUND_BY_SCAN_TOL │ BORDER_FOUND_BY_SCAN_TOL │ 0.720768 │ 0.990163 │ 231 │\n", "│ 9 │ sd_rSTAT5A_rel │ BORDER_FOUND_BY_SCAN_TOL │ BORDER_FOUND_BY_SCAN_TOL │ 0.400102 │ 0.666105 │ 204 │" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ENV[\"COLUMNS\"]=120\n", "res = DataFrame(\n", " params = [:Epo_degradation_BaF3,:k_exp_hetero,:k_exp_homo,:k_imp_hetero,:k_imp_homo,:k_phos,:sd_pSTAT5A_rel,:sd_pSTAT5B_rel,:sd_rSTAT5A_rel],\n", " LSatus = [k.result[1].status for k in intervals],\n", " UStatus = [k.result[2].status for k in intervals],\n", " LBound = [k.result[1].value for k in intervals],\n", " UBound = [k.result[2].value for k in intervals],\n", " LCount = [k.result[1].counter for k in intervals],\n", " UCount = [k.result[2].counter for k in intervals],\n", " InitValues = p_log\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Julia 1.5.0", "language": "julia", "name": "julia-1.5" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "1.5.0" } }, "nbformat": 4, "nbformat_minor": 4 }