{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"%load_ext autoreload\n",
"%autoreload 2\n",
"\n",
"import anndata\n",
"import matplotlib.pyplot as plt\n",
"import seaborn as sns\n",
"import logging\n",
"import numpy as np\n",
"import pandas as pd\n",
"import scipy.stats"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"import diffxpy.api as de"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Introduction"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Differential expression analysis is a group of statistical tests that are used to establish whether there a exists a significant variation across a set of tested conditions for each gene. In its easiset form, this test can test for the difference between two distinct groups: This scenario can be handled with (Welch's) T-test, rank sum tests or Wald and likelihood ratio tests (LRT). Wald tests and LRT allow for more adaptive assumptions on the noise model and can therefore be more statistically correct. Moreover, they also allow the testing of more complex effect, e.g. for the variation across many groups (a single p-value for: Is there any difference between four conditions?) or across continuous covariates (a single covariate for: Is a gene expression trajectory in time non-constant?). Below, we introduce these and similar scenarios. We dedicated separate tutorials to a selection of scenarios that require a longer introduction."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Testing a single coefficient"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The test of a single coefficient is the easiest differential expression test one can imagine, the comparison of two groups is a sub-scenario of this case."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Standard test"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Generate data:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here, we use a simulator provided by batchglm and pack the simulated data into an AnnData object. One can also directly supply arrays to diffxpy."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Transforming to str index.\n"
]
}
],
"source": [
"from batchglm.api.models.tf1.glm_nb import Simulator\n",
"\n",
"sim = Simulator(num_observations=200, num_features=100)\n",
"sim.generate_sample_description(num_batches=0, num_conditions=2)\n",
"sim.generate_params(\n",
" rand_fn_loc=lambda shape: np.random.uniform(-0.1, 0.1, shape),\n",
" rand_fn_scale=lambda shape: np.random.uniform(0.1, 2, shape)\n",
")\n",
"sim.generate_data()\n",
"\n",
"data = anndata.AnnData(\n",
" X=sim.x,\n",
" var=pd.DataFrame(index=[\"gene\" + str(i) for i in range(sim.x.shape[1])]),\n",
" obs=sim.sample_description\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Run differential expression test"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We first tackle this scenario with a Wald test.\n",
"\n",
"\n",
"The wald test checks if a certain coefficient introduces a significant difference in the expression of a gene.\n",
"\n",
"It needs a formula which describes the setup of the model and the factor of the formula `factor_loc_totest` which should be tested.\n",
"\n",
"Usually, this factor divides the samples into two groups, e.g. `condition 0` and `condition 1`.\n",
"In this case, diffxpy will automatically choose the coefficient to test.\n",
"If there are more than two groups specified by the factor, the coefficient which should be tested has to be set manually by specifying `coef_totest`. This coefficient should refer to one of the groups specified by `factor_loc_totest`, e.g. `condition 1`.\n"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"scrolled": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"WARNING: Logging before flag parsing goes to stderr.\n",
"I1101 08:29:16.389711 4648003008 estimator.py:51] training strategy:\n",
"{'max_steps': 1000, 'update_b_freq': 5}\n"
]
}
],
"source": [
"test = de.test.wald(\n",
" data=data,\n",
" formula_loc=\"~ 1 + condition\",\n",
" factor_loc_totest=\"condition\"\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Obtain the results"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The p-/q-values can be obtained by calling test.pval / test.qval:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([0.60139521, 0.22027086, 0.57619872, 0.93807501, 0.65553144,\n",
" 0.88983104, 0.49675858, 0.08771343, 0.07318849, 0.22164236])"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"test.pval[:10]"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([0.84703551, 0.53478002, 0.84703551, 0.99868475, 0.86406426,\n",
" 0.98870116, 0.82114705, 0.31758301, 0.31758301, 0.53478002])"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"test.qval[:10]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"test.summary() returns a pandas DataFrame with a quick overview of the test results:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"sns.scatterplot(\n",
" x=test.log10_pval_clean(),\n",
" y=test_rank.log10_pval_clean()\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Likelihood ratio test"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In a likelihood ratio test (LRT), one specifies a null (reduced) and an alternative (full) model. The difference set of coefficients of both models is tested. The LRT requires 2 models to be fit rather than one in the Wald test and therefore tends to be slightly slower."
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [],
"source": [
"#test_lrt = de.test.lrt(\n",
"# data=data, \n",
"# full_formula_loc=\"1+condition\",\n",
"# reduced_formula_loc=\"1\"\n",
"#)"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [],
"source": [
"#sns.scatterplot(\n",
"# x=test.log10_pval_clean(),\n",
"# y=test_lrt.log10_pval_clean()\n",
"#)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Two-sample wrapper"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For the special case of two group comparisons, one can also easily toggle between tests using the `two_sample wrapper`."
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"test_2s = de.test.two_sample(\n",
" data=data, \n",
" grouping=\"condition\",\n",
" test=\"t_test\"\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This yields exactly the same as calling t-test twice:"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
""
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXYAAAD8CAYAAABjAo9vAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvnQurowAAF2RJREFUeJzt3XtwnPV97/HPd3e18upmL7JsbMvENAG3HGIIFm4SZg5JoQltmXq4NTQxHsIcG8Jw0um0gaQMOZ3hZCbEZJimCRe7DalT2jQT6pIhSSGkpekJoUHGxCEBk3CrZGJbdiRb1m0vz/f8oUske23J3l3t7m/frxn9savleb6PzHz00/f5Pb+fubsAAOGIVboAAEBpEewAEBiCHQACQ7ADQGAIdgAIDMEOAIEh2AEgMAQ7AASGYAeAwCQqcdLFixf7qlWrKnFqAKhZO3fuPOjuHbN9riLBvmrVKnV3d1fi1ABQs8zszbl8jlYMAASGYAeAwBDsABAYgh0AAkOwA0BgShLsZnaFme0xs1+Y2SdLcUwACEU2m9fe/mG9eWhIe/uHlc3my3q+oqc7mllc0pck/a6kXknPmdk33f1nxR4bAGpdJpPTmwMj6v3ViJqScQ1n8hrO5rUq3aSGhnhZzlmKeezrJP3C3V+TJDP7mqT1kgh2AHUrilyHR8eUyboODo7prsdeVG//iDrTKW25do3aGhNaujBVlnOXohWzQlLPtNe9E+/NYGabzazbzLr7+vpKcFoAqE7ZbF5HRsc0PBZpNBdpNBupo6VRktTbP6JPfGO3slH59puetydP3X2rpK2S1NXVxQ7aAIITRa7BsYyGM3kdPJrVx/5+59Qo/Z5r1ujeJ/ZoV8+AevtHFJUx2EsxYt8raeW0150T7wFAXYgi18DwmN46PKKB4ZyiSFOhLo2P0u94dLdued/bJUmd6ZQS8fJNSizFkZ+TdI6ZnW1mSUnXS/pmCY4LAFUvl4t0cGhMe/tHdf3WZ3Xplqc1loumQn1Sb/+IFqUa1JlO6cENa7W4qaFsNRXdinH3nJndJukJSXFJX3b3nxZdGQBUuVwu0hu/GlLMTDdPG6HnI1dnOjUj3DvTKa1Ip/RPm9+tjuakksnydcJL8reAu3/b3c9197e7+2dKcUwAqGaZTE77B0eViMXUEI/pvb/RPvW9bd9/Tfd/5CJ1psdnvXSmU3pgw1p1NCW1It1U1lCXKrRsLwDUoihyDYxkFDNXT//YjJuj93/kIknS13f26us7e5VuSugfN71bkbsSMdOSlsayzVs/FksKAMAcZLN57Ts8otFsXkfHouNujt76yPPafOmvb47+wQUrNJbLa8XClFaU8WGkQhixA8BJRJHr4NExDWVyeuPgsBY0xLS0bUHBm6OJuOkbt7xHHa2NakvFtXBBo2Ixm/eaCXYAOIFsNq9XDhydujHamU7pgY9cJJcK3hxtiJlWLEqpo6VRiUTlGiK0YgCggGw2r7eOjM6Y7dLbP6KPPfK8RjP5wjdHm5NatihV0VCXGLEDwAy5XKQDR8cUuatvcKxgy2VBMq6v/+hNPXzjxYrHTMlETEvKPIXxVDBiB4AJ2Wxeew+P6K2BEWVzkbL5aGpUPqkznVLf4Jg+tO5takjE1NKY0PKFqaoJdYkROwAol4v0q+GMDgyO6ZZp/fT7/ugCffHD79Jt/7Br6r2HNqzVGS1JxWOmM1LJirddCjH3+V+Pq6ury7u7u+f9vABwrFwu0sv7B9U3bWndSZ3plO697gIdHsmqvTmpjtZGtTTGlW6qzGwXM9vp7l2zfa76ftUAwDw6cHR8lN6UjBfsp3e0NOrtHc0aHM0pn/eKhfqpoBUDoK5MzksfyeYVj5liNh7gAyPZglMYE3FTIh7T/1jepsUt1R/qEiN2AHUkilx79g3q6gee0aVbntb1W5/VwaMZfeC8JXrw6Vd1zzVrZkxhfGjDWi1qSmhZW0pL2hbURKhL9NgB1Ikocu07Mqo/euiHx43Kt9+0Thu//CN1tDTq45edo7e1N6khHtOytgVVdXN0rj12WjEAghdFrj37BzU0livYRz88ktXXNr9bUeSKxUypZEzpVG20XQoh2AEEKZPJqW8oo1w0vrrivzzfo4tWtRfsow8MZ7W0bYE625sqWHHpVM/fGABQIplMTnv6hvShiR2NPrT1Wf3BBSv0yi+P6PPXXTCjj77l2jVaeUZKSyY2mw4BI3YAwekbyhRcVvfhGy/W7d/Yrc9e/U4tW5hSY0NMyXhMZzRV54NGpyucKwFQt6JofF2Xvf3D6hscUy7ygr30eMy0q2dAn/znnyiTj7R84fhsl5BCXWLEDqDGZTI5HRjKKJOLlI9c2595XRvfe3bBXnpjIqbv3/5+pRriam9O1uzN0dkQ7ABq1mQv/dgt6vYfHtYDG9bOeP+BDWvVkDAta66d+eini3nsAGpSNpvXgaNj+uXhUR0ayujBp1/Vrp4BdaZTevjGi9XR2qChsWhqVkxrKq6WZG2P0pnHDiBYuVykPQeOzliJ8Z5r1ujeJ/ZoV8+A4jFTNm9akQ5j+uKpCuuOAYBgTb9Bun9wVF/43iszZr3c8ehu3fK+t6sznVIyEVN7c7LCFVcOI3YAVW98hD6om786c4TeN5jRrp4BSePh3t6c1AMb1qotFa/plkuxGLEDqGpR5Hrr8MhUqEszR+iTOtMpLVu4QEvbkmpJ1u9oXWLEDqDKHRoa39mo0Lz0yXZLZzqlBzesVcuCuFoba/sGaSkQ7ACqShS5Dg1llMnllUzElcnldWgoU3Be+rKFC/T9T7xPiXhMS1oag3vQ6HQR7ACqxuQqjJu2d0/10v/hf/22Ht3Zo3uuWaM7Ht39671Hb1irZQtTdT86L4RgB1A1Dg1lpkJdGm+3/N9v/Ux/cvm5+qunXtFdV56n9uaklrQ2ajmhfkL83QKgamRy+eN66U/+7IAWNyf1mavW6PzlbXpbe7M60020XU6CETuAiji2l97enFQyES/YS4/FYupoDWdZ3XLjVx6AeZfLRXpp3xFddf8PdMk9/66r7v+B9uwfVDrVoG0bu2asl75tY1ddP2x0OopaK8bMrpP0l5J+S9I6d5/TAjCsFQPUryhy9fYP68N/81/Hjcx33HqJ2puTx43k6aWPm6+1Yl6UdLWkh4o8DoCATW+7mJn6h7MF56VncnnFYkbbpUhFBbu7vyRJZvw2BXC8XC5S39ExjebyeuPgsL7wvZ+r7+iYtt+0rmAvPZmIV7DacNBjB1AWmUxOL+0f1HUP/VDvv/c/dNdjL+rPP7haHS2N+ux3XtKXPnzRjF76QzespZdeIrOO2M3sKUlnFvjWne7+2FxPZGabJW2WpLPOOmvOBQKoPdlsvuC+o3c8ult3XXmebv7qTv3J5ecyL71MZg12d7+8FCdy962StkrjN09LcUwA1SWKXAeHxjQ0llM+UsE++qJUgzrTKaWbklqUauAGaRkwjx1A0aLINTCS0S8HRnXzxCj94RsvLthHH87ktW1jl85sC3+LukopqsduZleZWa+k90j6lpk9UZqyANSKyfVdftxzeCrUJekL3/u5tly7ZkYf/cENa7VmZZtWL20l1Muo2FkxOyTtKFEtAGpMLhdp/+CoGhMxvWNJizpaGqeCfVfPgD73r3v01ZvW6dBQRssWLlBHc1LJJI2CcmNWDIBTFkWuA0dGtffwiDL5SPuPjOrux3+q269YrXetXDT1ub6jY3rj0LBSDXEtaWkk1OcJwQ7glEwuB3D1A8/o0i1P64a//ZEk6db3v0MP/+B1ffyycyRNTGHcsFYXrFyo31rWpoYG5qjPl6KWFDhdLCkA1JZcLtKBo2PK5SPFYqbrtz573E3Ru9efr0w+0m+e2Sp3qakxrsXNjfTSS2i+lhQAELhcLtLL+wd1y8SN0W/c8p6C0xibknG1xhJKxIwNMCqMVgyAkzpwdGwq1CVNbVM33eQ0xo7WRi1tZRpjpRHsAE4qm49mjNAffPpVff66C2ZMY9xy7RqtPCOl5W0L2ACjCtCKATDDsRtgpBpmbn6xq2dAf/v/XtP2m9bJXWpMxJRqjCmdop9eLQh2AJIKPz3amU5p2w1d+spHL9aNDz839d7HLztXzcm4YrEYywFUIWbFAHVucn2X4bG8cpHrxod/dNyMl3/+2HuVi1y5fKREPKYlLY20XCqAWTEAZpXN5vXKgaNTI/QTzXjJ5iOtSDdVqEqcKn7lAnUql4v01pHRGeu7nGjGCxtg1BaCHahDUeTaPziqvsGx42a83HPNGjaTrnG0YoA6MX22Sz5y5d2nRujTZ7z83TOv6ysfXadEzHh6tEYxYgfqwOTSulfd/wNdcs+/68N/819KxGJ6dGfPcSP0j192rs5obtBZZzRpCQ8b1SRG7EAdODSU0abt3TO2qfv7H76u//075+qv/+2VqS3qOlobtbxtAQt21ThG7EAdyOTyx812eeg/39CCBtMnPvibOmdJi5YtXKAVC1OEegAIdqAOJBPxgrNdXjs4rI9+5TmN5SItW5hibnog+FcE6kB7c1LbNnbN6KU/dMNaXdi5UDtuvYSt6gJDjx0IxLFrvEx/1D8WM61e2qodt15S8PsIC8EOBGBy1svkDdLJ+efTR+KxmKmjtbHClWI+0IoBalQUufoGx7S3f1j7joweN+tl0/ZuHRrKVLhKVALBDtSgY+elvzUwUnCNl0wuX6EKUUkEO1BDJkfpvQPD2nd4VB0t460V1njBdAQ7UCOmj9L/5+ee1l2Pvag//+BqvWvlItZ4wQzcPAVqRKGnR+94dLfuuvI83fzVnfq7Z17X129+j9ydWS91jmAHqtiMhbvcC/bRF6Ua1JlO6U9/d7XObGNtF9CKAarWsTdIXz0wVLCP3plO8ZARZiDYgSp1bOvlC9/7ubZce3wffdnClDpaWVoXv0YrBqhSxy7ctatnQJ/71z36p83vliT66Dghgh2oUpMLd00P976jY0om4jxBipOiFQNUqUILdzGFEXPBiB2oUizchdNFsANVjIW7cDqKasWY2RYze9nMdpvZDjNbVKrCAACnp9ge+3clne/uayS9IulTxZcEAChGUcHu7k+6e27i5bOSOosvCQBQjFLOirlJ0ndO9E0z22xm3WbW3dfXV8LTAgCmm/XmqZk9JenMAt+6090fm/jMnZJykh450XHcfaukrZLU1dXlp1UtAGBWswa7u19+su+b2Y2SrpR0mbsT2ABQYUVNdzSzKyTdLulSdx8uTUkAgGIU22P/oqRWSd81sxfM7MES1AQAKEJRI3Z3f0epCgEAlAZPngKnaPrmFzzmj2pEsAOnYHLzi8l10icX5mKTC1QTVncE5iiKXPuOjB637+im7d06NJSpcHXArzFiB+ZgcqQ+NJYruO9oJpevUGXA8RixA3MwuU3doaFMwX1Hk4l4hSoDjkewA3MwuU3dg0+/qnuuOX7fUTa/QDWhFQPMweQ2dbt6BnTvE3t015Xnqb05qeWLUjqzbQE3TlFVGLEDczB9m7pdPQO6+/GfqbkxQaijKjFiB+aAbepQSwh2YI7Ypg61glYMAASGYAeAwBDsABAYgh0AAkOwA0BgCHYACAzBDgCBIdgBIDAEOwAEhmAHgMAQ7AAQGIIdAAJDsANAYAh2AAgMwQ4AgSHYASAwBDsABIZgB4DAEOwAEBiCHQACQ7ADQGAIdgAITFHBbmZ3m9luM3vBzJ40s+WlKgwAcHqKHbFvcfc17n6hpMclfboENQEAilBUsLv7kWkvmyV5ceUAAIqVKPYAZvYZSRslHZb0/qIrAgAUZdYRu5k9ZWYvFvhaL0nufqe7r5T0iKTbTnKczWbWbWbdfX19pbsCAMAM5l6a7omZnSXp2+5+/myf7erq8u7u7pKcFwDqhZntdPeu2T5X7KyYc6a9XC/p5WKOBwAoXrE99s+a2WpJkaQ3Jd1SfEkAgGIUFezufk2pCgEAlAZPngJAYAh2AAgMwQ4AgSHYASAwBDsABIZgB4DAEOwAEBiCHQACQ7ADQGAIdgAIDMEOAIEh2AEgMAQ7AASGYAeAwBDsABAYgh0AAkOwA0BgCHYACAzBDgCBIdgBIDAEOwAEhmAHgMAQ7AAQGIIdAAJDsANAYAh2AAgMwQ4AgSHYASAwBDsABIZgB4DAEOwAEBiCHQACQ7ADQGBKEuxm9mdm5ma2uBTHAwCcvqKD3cxWSvqApP8uvhwAQLFKMWK/T9LtkrwExwpOFLn6Bse0t39YfYNjiiJ+TADKK1HMf2xm6yXtdfcfm9lsn90sabMknXXWWcWctmZEkWvP/kFt2t6t3v4RdaZT2raxS6uXtioWO/nPCwBO16wjdjN7ysxeLPC1XtJfSPr0XE7k7lvdvcvduzo6OoqtuyYcGspMhbok9faPaNP2bh0aylS4MgAhm3XE7u6XF3rfzN4p6WxJk6P1TknPm9k6d99X0iprVCaXnwr1Sb39I8rk8hWqCEA9OO1WjLv/RNKSyddm9oakLnc/WIK6gpBMxNWZTs0I9850SslEvIJVAQgd89jLqL05qW0bu9SZTknSVI+9vTlZ4coAhKyom6fTufuqUh0rFLGYafXSVu249RJlcnklE3G1Nye5cQqgrEoW7CgsFjN1tDZWugwAdYRWDAAEhmAHgMAQ7AAQGIIdAAJDsANAYAh2AAgMwQ4AgSHYASAwBDsABIZgB4DAEOwAEBiCHQACQ7ADQGAIdgAIDMEOAIEh2AEgMAQ7AASGYAeAwBDsABAYgh0AAkOwA0BgCHYACEyi0gXMVRS5Dg1llMnllUzE1d6cVCxmlS4LAKpOTQR7FLn27B/Upu3d6u0fUWc6pW0bu7R6aSvhDgDHqIlWzKGhzFSoS1Jv/4g2be/WoaFMhSsDgOpTE8GeyeWnQn1Sb/+IMrl8hSoCgOpVE8GeTMTVmU7NeK8znVIyEa9QRQBQvWoi2Nubk9q2sWsq3Cd77O3NyQpXBgDVpyZunsZiptVLW7Xj1kuYFQMAs6iJYJfGw72jtbHSZQBA1auJVgwAYO6KCnYz+0sz22tmL0x8/X6pCgMAnJ5StGLuc/d7S3AcAEAJ0IoBgMCUIthvM7PdZvZlM0uX4HgAgCKYu5/8A2ZPSTqzwLfulPSspIOSXNLdkpa5+00nOM5mSZsnXp4v6cXTrDkEizX+c6tXXD/Xz/Wfnre5e8dsH5o12OfKzFZJetzdz5/DZ7vdvaskJ65BXD/Xz/Vz/eU8R7GzYpZNe3mV6nsUDgBVodhZMZ8zsws13op5Q9LNRVcEAChKUcHu7jec5n+6tZjzBoDrr29cf30r+/WXrMcOAKgOzGMHgMBULNhZjmCcmf2ZmbmZLa50LfPJzO6eeP7hBTN70syWV7qm+WJmW8zs5Ynr32Fmiypd03wys+vM7KdmFplZ3cyOMbMrzGyPmf3CzD5ZznNVesR+n7tfOPH17QrXMu/MbKWkD0j670rXUgFb3H2Nu18o6XFJn650QfPou5LOd/c1kl6R9KkK1zPfXpR0taTvV7qQ+WJmcUlfkvR7ks6T9Mdmdl65zlfpYK9390m6XeOziuqKux+Z9rJZdfQzcPcn3T038fJZSZ2VrGe+uftL7r6n0nXMs3WSfuHur7l7RtLXJK0v18kqHex1uxyBma2XtNfdf1zpWirFzD5jZj2SPqL6GrFPd5Ok71S6CJTdCkk90173TrxXFmXdaGOW5Qge0PgyBJPLEXxe4/+TB2OW6/8LjbdhgnWy63f3x9z9Tkl3mtmnJN0m6f/Ma4FlNNu1T3zmTkk5SY/MZ23zYS7Xj/Ipa7C7++Vz+ZyZbdN4nzUoJ7p+M3unpLMl/djMpPE/xZ83s3Xuvm8eSyyruf77azzYvq2Agn22azezGyVdKekyD3DO8Sn829eLvZJWTnvdOfFeWVRyVkzdLkfg7j9x9yXuvsrdV2n8z7KLQgr12ZjZOdNerpf0cqVqmW9mdoXG7638obsPV7oezIvnJJ1jZmebWVLS9ZK+Wa6TVXLPU5YjqG+fNbPVkiJJb0q6pcL1zKcvSmqU9N2Jv9iedfe6uX4zu0rSX0vqkPQtM3vB3T9Y4bLKyt1zZnabpCckxSV92d1/Wq7z8eQpAASm0rNiAAAlRrADQGAIdgAIDMEOAIEh2AEgMAQ7AASGYAeAwBDsABCY/w9sMu/A8Q0EAwAAAABJRU5ErkJggg==\n",
"text/plain": [
"
"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"sns.scatterplot(\n",
" x=test_tt.log10_pval_clean(),\n",
" y=test_2s.log10_pval_clean()\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Inclusion of size factors"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"One can also use pre-computed size factors in diffxpy by supplying them to the test function."
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [],
"source": [
"size_factors = np.random.uniform(0.5, 1.5, (sim.x.shape[0]))"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"I1101 08:29:55.117454 4648003008 estimator.py:51] training strategy:\n",
"{'max_steps': 1000, 'update_b_freq': 5}\n"
]
}
],
"source": [
"test_sf = de.test.wald(\n",
" data=data,\n",
" formula_loc=\"~ 1 + condition\",\n",
" factor_loc_totest=\"condition\",\n",
" size_factors=size_factors\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And results can be retrieved as beford. Note that the results differ now as we imposed size factors without changing the data:"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
""
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYAAAAD8CAYAAAB+UHOxAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvnQurowAAHzVJREFUeJzt3XtwVPeVJ/Dv6daDRshIBiEMguAwrDYajyaYDrHDlAeX7YTssqEwOA9sE9u7FoTxzGyyiXGWMOMJSY2BTLkyaxMensTG5ZTNhDB2gYeHM6Zcg19IwSZYtmLAziBig6yViBCNWt195g/1bbfU96q7de/t1/1+qijU3Ze+V13UPf075/c7P1FVEBGR9/jyfQFERJQfDABERB7FAEBE5FEMAEREHsUAQETkUQwAREQexQBARORRDABERB7FAEBE5FFl+b6A0UyePFlnzZqV78sgIioabW1tH6lqXSbHFnQAmDVrFlpbW/N9GURERUNEfpfpsUwBERF5FAMAEZFHMQAQEXkUAwARkUc5EgBEZJGIdIjISRF5wOT1ShF5Jv76ayIyy4nzEhHR2NkOACLiB/AogC8CaALwNRFpGnHY/wTQo6p/BOBhABvtnpeIyE2xmKKrbwBney6hq28AsVj6zbPG8m/yyYlpoPMBnFTV0wAgIk8DWAKgPemYJQAejP/8CwCPiIgotyMjogIUiyk6zvXh3p2t6OwJoaE2gB0rg2isr4bPJ7b/TSym6O4PIxyJoqLMj9pAOXpCg4nHk6oqLM/jJCdSQNMBnEl63Bl/zvQYVY0AuABgkgPnJiJyXHd/OHEjB4DOnhDu3dmK7v5wyrHGt/7O3kv48MJl1E2oHPXfGIFi6ZYjWLDxRSzdcgTvnOvDuj3HE487zvXlZPRQcAvBRKQFQAsAzJw5M89XQ0ReFI5EEzd/Q2dPCOFIdNhzZt/6Ny5rxrPHzuKmpnrUBMoRjkQRi2niG31ycPnyvAbce8Mn4fcJvrf4j1ETqMCutk7cu7MVe9YsQF11pau/pxMB4CyAGUmPG+LPmR3TKSJlACYC6DZ7M1XdDmA7AASDQaaIiCjnKsr8aKgNDAsCDbUBVJT5hx2XfDOfO6MGqxfORmWZD3950xxs2PsWDrafT0kFGcHly/MacMf1n8Ddjx9NBI+f3H4tbr9uJi4PxlIChxucSAEdBTBHRK4WkQoAXwXw3IhjngPw9fjPywH8G/P/RFSoJlVVYMfKIBpqAwCQuIlPqqoYdpxxM587owbf/kIjNuxtx/Ktr+BrO17F1z93NebOqElJBRnB5d4bPok1T/16WJrpG0/9GlWV5fjK9lfxle2vup4Ksh0A4jn9+wAcAPA2gF2q+paIfF9EvhQ/7J8ATBKRkwC+BSBlqigRUaHw+QSN9dXYs2YBjqy9EXvWLDAt5ho389ULZ2Pt7uPDbuZrdx/H6oWzE4+N9JERXPw+MU0zGacYre7gFEdqAKr6PIDnRzz3N0k/XwZwmxPnIiLKBZ9P0ubgjZt5/0DE9GZeEygHMDSCEJFESqexvhofXAiZppmiSd/4zeoOTuJKYCKiMTJu5tNqAol0kaGhNoDe0GCiMPzgcyfwfnc/zvddxgcXQqiq9GPrHfOGpZm23H4tdrx0eth7jKw7OKngZgERERUTn08w9Ypx2LEyOGw20LY75uGKQBl+dtdnsOOl06gJDNUP/qP7Err7w9jddgZ/t+SP8XTLdQhHYlAAkWgUL58emh9jVXdwEgMAEZED6q+oxDMt1yGqCoEMmwX0yIq58Ilg5U9fHzZd9HfdIXz7n99MpIHmzqjBhiXXYHZdFcr8PkyZUOnqLCAGACIiG8zWAmxe3oyuvnBiaujEQDn+/vm3sX5xE2oC5egNDeKJl9/D/Yv+67Cb/+qFszG+wo9oTPGDfSfwzVsaR119bBcDABGRDWarhr/zi+N46NY/gYhg7e7j+Mnt1+Lrn7s6MVPIGAGMK/ehoTaAugmV+PYXGlNef/hQB364tNm1BWEMAEREWUru5RNVNZ0BNHXiONz1s6FFXuPK/fjGiDn/a3cfxzMt12HHyiA+vHAZa3cfR92EysQo4VI4irsXXO3qLCAGACIiEyMbthkN2kamfH5212dMp3Mmz/O/aDFNFAAa66sxvsJvOgr4ye3XIlDh3iwgTgMlIk8za+Fs1rCt41wf/n9/atO3f/zVu9i8vDll1XBVRVniufN9A6bTRIGhFFKF34e/umlOymKybzz1a0RcXAnMEQAReZZVC+dJEypMu4FuWHIN7n78aCJH/6MDHTh2pheb9nfgmZbrACAxWgCQmBq69fApbF7ejO/84viwQvF9Pz+GrosD2HnPfMyaPN50lDAYibn2+zMAEJFnWbV9/vm9nzW9GY+Pp2OMHP76xU1Y9WQbui4OQGRoPUDyjB2jnUQ4EkWgwo9frvkcLg/GcOr8RWzaPxQ8AGDlT1/HP6++3jSVVF7mXqKGKSAi8iyrts9+EcuVvcnH1QTK8fmmKdh5z/yhPQF6LiGS9I3daCcxvXY8rqyqxJTqcfALcPfjRxM3f+O9fEBKKmnz8mb4XbxLMwAQkWcZzdySNdQGEKjwp3QD3by8GVsPnxp23PTacfiLG+dg5U9fx5JHj2DFY6+h4/zoHTytzhkajGHT/g6sX9yEZ1quw/rFTdi0vwP9A0wBERE5zmjmNrIGUBOoQE2gIpG+KS/z4eLlCLouDgBAotXD+x9dwgO//M2wFNKqJ9tG3czF7JwblzUDALouDmDVk22JYxtqA/C7uDOkFHJb/mAwqK2trfm+DCIqYVbTPdMd5/cBp7v6sXzrKynHHll7I6bXjk97zlA4grc/7MPWw6dw/6JGAEgpFDdOrcaVVZkvBBORNlUNZnIsRwBElDOZ3mxzKZO2z2bHxWKKKdWRjHYOM44f+bt3A9iwtx2dPSFs2t+Bv/1SEzYsuQbjK/y4FI6i/opxiSZybmANgIhywmpufS42P3eDzyeYNjGAbXfOS1kDMLKDp9XvXhsoT9Qajp3pxZYXT2J2XRUaagO4ZvpEzJpU5WqAZAqIiHKiq28AS7ccSfm2nIvNz91k9s0ewLDnFIpbt7xs+rtPqqpwdFSUTQqIIwAiygmrKZfhSDRlJW4xSZ7qaQQy49v+fT8/hhNnLyAUjmL94ibMnVGT+HfG7+7zCSZVVaCizI9wJIru/nDOPgPWAIgoJ4zpj2ZbIBojAyOF4mYLZLcZi8usOnwaq4eNWoHVauRcfAYcARBRThjTH5Pz5dvunIcf7GtPWYnr5kbobjNGOqsXzsYTL783bF7/Ey+/h9ULZw+rFVitRs7FZ8ARABHlhLF/rjG3fujbbwwH288PO87tjdDdZox0PnFlwHQPgE9OHp/I/ft8MmpqzG0cARBRzozMl/t8PtNVsU5thG7W6dNtk6oqsPOe+aiqLE/p7rl293EoPp5O2tU3gKgqfnbXZ4bVB9zeDN7AEQARuSbdvH+rlbhObISer9y6zyeYMK4MH/ReNv1mr6qW20hu2t+BrosDrm8Gb2AAICJXZHIDNksLObU4zCq37va001hMcXkwig//cNlykZjVNpLPtFyX0wVyTAERkSsyLW6mpoWcufHlI7duBL1T5/uxu+0MNi5rTil61wbKEY5E8Q+3/Sm23TkvkfoxrtXJzyAdjgCIyBX5LG4C1tNO3cytj5wCaswCmlRVgSnVlbjqinF4t+tiSiO4Hx0YSv3kIu+fjCMAInKFVdvjXN3kzKadOllfMCsuG0Hv2Jle/OhAB5bNm4GaQDmumjgODbXj0Xs5kjIqWrv7OP7qpjk5y/sn4wiAiBwxsuBr9LkZa4HXbuM4t+oLo9U2kkcdx870YtWTbYmWD6NN+Zw9ZQIaagI5X/zGAEBEtlndFOfUTRjTDdipGTxGfcEIJh9cCNkOBKMVl9PNarJKSwXK/XlZ+cxmcERkm9ON3px8P6eng57tuYQFG19Mef7I2htx1cQAekNhhMJRRBUYV+7D5KqPi7q5mJrK/QCIKKecLvg6+X5OTQc1RhHGwq1//NW7iX19jc3bzW7uk5M2c3Fz2utY2CoCi8iVInJIRN6N/11rcVxURN6I/3nOzjmJqPA4XfB18v2cCCbJ/fxv2HQY6589gfsXNWLujJrEjb7MJ1lPezV6AeWrE6rdWUAPAPiVqs4B8Kv4YzMhVf10/M+XbJ6TiAqM0zNunHy/8jLzdhPlZZnf/qwWbj2yYi72rFmAxvpqhMLZBZpC2CDHbgpoCYCF8Z+fAHAYwFqb70lERcbp1IaT71fmE2xe3pyy125ZFu9lNYoAkEgjZbruILEf8GAEH164jLoJlejsCeVspXIyuwGgXlU/iP/8IYB6i+PGiUgrgAiAh1T1X2yel4gKTKZ76+b6/ULhKDbt78D6xU2oCZSjNzSITfs78MiKuUBVZu+RfHOfO6MGqxfOxqSqCogIYjFNbOoycgaQsfLXYFYETt4jINedUNMGABF5AcBUk5fWJT9QVRURq7HLJ1T1rIh8EsC/ichvVPWUxflaALQAwMyZM9NdHhHRqCrK/Oi6OIBVT7Ylnsu2nmDc3B8+1JHS4jl5Fs+cugn4+f/6LM73DaC7P4wfv/BbfPOWxsTrZqmktbuPY/3ipsSagVyuBk4bAFT1ZqvXROSciFylqh+IyFUAzpsdp6pn43+fFpHDAOYCMA0AqrodwHZgaBpo2t+AiGgUTnQcNVJSD37pGnx52yuWM4p6QoNY8dhrw9JA7R/0JV63SiXVBModXamcKbspoOcAfB3AQ/G/nx15QHxm0CVVHRCRyQAWANhk87xERBkZrZ6QzWpjn0+gqqMWetPNOLKqE0yprsSGJdeg/orcNYID7M8CegjALSLyLoCb448hIkEReSx+zKcAtIrImwBexFANoN3meYnIRfnYSMVNZh1HxzILx2p6qlELSDd91Wx208ZlzfjWrjdx9+NHEQrndic0rgQmKkF2+ujkc5PyXBrLamOrIu4TL7+Hb97SiDl1E1K6fY787GIxxYd/uIzf94bQ3R/G1sOnEpvEOzEDiCuBiTzM7g3c6ZWzhbDi1cxYFogZ6aRdq65P3MCNGTxGrj/d9FWfTzD1inG4EBrE/37mDcd3QssGAwBRibF7A3dy5WwhjyLGul+AUQtYvvWVYc8bn1Em01cLpSUE9wMgKjF2b+BOtGHIdDcwIH/1BjurjZ34jNzaCS0bHAEQlRi7O2E5MW0y0yCUz5GCnW/hbm5mn0ssAhOVGCduqnbz95kWWJ1uI51LhVrjYBGYyMOcyC/bbcNQGyjHtjvnYdWTbaN+Q873vsF2ON36Ih8YAIhKUD5vTrGY4t2ui/jxC78dtiH6tImpWx7mY+N2+hgDABE5KrkAfLB9qDuMVVonXS69UNMspYIBgIgclU1aJ12bhmxqGQwW2eM0UCJyVLZTJK2mQ2Y7lTQXm6uUXIuMfF8AEZUWp3bzymYkkU2wGKtC2MHLaUwBEZGjnFrlmk2BOBeziZxqkVFIOAIgIsc5sco1m5GEnZW5maZ1innKqhWOAIioIGUzkhjrytxsCs2lOGWVK4GJqCSMZRZQNiuRi6HBHcCVwETkQWNZ/ObUlNVixQBARJ4UiymiMc0qrVMK7R+SsQhMRJ7U3R/GD/a1Y+Oy5mGF5m13ziu6rp5jxREAEXlSOBLFwfbz6OoLY/3iJtQEytEbGsTkIk/rZIMBgIhKnlmB2JjVc+xML1Y92Qbg4wKwVzAAEFFJs5q9M6duAnasDOLhQx1YNm9GomtpbaA835ecM6wBEFFJs1rB2xMaxJy6Cfjrm/8LNuxtx/Ktr2DFY6/h3a6LRd3eIRscARBRSYvFYsNy/FsPn8KxM70IR6LoCSGxaQ1QGu0dssEAQEQlKxZTfNQfxoa97Yn0z8ZlzXji5fdQUeYvyfYO2WAKiIhKgllPn+7+cMo3/LW7j+N7/71pWCE4WbG3d8gGAwARFT3rVs0x02/4fp/A5xPThnPb7pwHgeL3vaGS6ftvhSkgIip6VoXeXauuH3Wl78j2DtGY4qlX38cNjfVYu/t4Qff8cQJHAEQlptR2rcqEVS5fVVO/4d8xD34fEp+L0d6hosyPFY+9hmtnTUrc/I33cXpzmULBEQBRCSmWjpVOG61Vc2N9AL9c8zlcGojivY/68b1/OYGuiwMpn4sRRGoC5Z4pDHMEQFRC7GyNWMwjh9E2j/H5BALBHf/0Gu5+/CiOnek1/VyMINIbGvRMYZgjAKISMtZpjcU+ckjXqnm0tQAGI4g8fKgDG5c1p9QASrFBnK0AICK3AXgQwKcAzFdV091bRGQRgB8D8AN4TFUfsnNeIjI31l2rSmG/W6tWzenWAiT/+8b6avxwaTNisRh2rboeqloSff+t2E0BnQBwK4CXrA4QET+ARwF8EUATgK+JSJPN8xKRiWz20U1Wygui0q0FSGYEkfqJAUyrCdja07gY2BoBqOrbACAy6oczH8BJVT0dP/ZpAEsAtNs5NxGlGuuuVaW4363BKrgZawG8LBdF4OkAziQ97ow/R0QuML7FZvPtdawjh9EUSlHZ66t9R5N2BCAiLwCYavLSOlV91ukLEpEWAC0AMHPmTKffnohMOL3fbSEVlY3gNvJaSrGom620AUBVb7Z5jrMAZiQ9bog/Z3W+7QC2A0AwGCyeeWhERc7J/W4Lqahcipu5OyUX00CPApgjIldj6Mb/VQArcnBeIsqTQisql9pm7k6xVQMQkaUi0gngegD7RORA/PlpIvI8AKhqBMB9AA4AeBvALlV9y95lE1EhY969OIhq4WZZgsGgtraaLi0gogJWSDUArxGRNlUNZnIsVwITkeOYdy8ODABE5Arm3QsfAwAR5YSxQ1cuRwT5OGcxYQAgIteZ1QR23jMfE8aVYTASc+XmzDpEemwHTUSuG7kuoG5CJc794TJu3fLyiC0cnZuUYqc1tlcwABCR60auC1i9cDa+8wt3d90qtLUIhYgBgIhcN3JdgNWuW6HBiGO9g7gWIT0GACKy5FRDt5HN5i6Fo6Y351Pn+x1LCbnR4K7UcCEYEZlyuoiaPCMnUOHHBxcuJ/r0N9QGsHl5Mzbt78CxM70Ahm7YdnsHeXEWEBeCEZFtTjd0G7kuYDDy8TaNU6or8a1dbyZu/sb57ObruRZhdAwARGTKKKLOnVGD1QtnJ/bTjcVijry/z+dLbNO47c556Lo4MOx15uvdxxoAEZmqKPPj801T8O0vNGLD3nZ8Zfur2LC3HR9dDOPchZDtQm1yjn7r4VPYvLyZ+focYw2AiEzFYorOnktY8dhrKVtFrl/chA17220vrBpZF4jE1LWFYV6RTQ2AIwAiMuXzCfw+MZ2uaUzjtDt3P3n7yiurKjGlelzJb8ReSBgAiMiS1Vz63tAgAC6sKnYMAERkyWwu/cZlzdh6+FTisZ1CbaFsHO9VnAVERJZG9vWPxhQ/2NeOY2d6bRdq2awt/1gEJqKMObmwqqtvAEu3HEkpMOdj4/hSwoVgROQKJxdWsVlb/rEGQER5wWZt+ccAQER5wWZt+ccUEBHlBTeOzz8GACLKGzZryy+mgIiIPIojACJylRd78hcLBgAicg0XexU2poCIyDVWm8o4ufk7jR0DABG5hou9ChsDABG5hou9ChsDABG5hou9ChuLwETkGi72Kmy2AoCI3AbgQQCfAjBfVU1bd4rI+wD6AEQBRDLtVEdExY+LvQqX3RHACQC3AtiWwbE3qupHNs9HREQOsRUAVPVtABDhcI6IqNjkqgisAA6KSJuItOTonERUoriVpDPSjgBE5AUAU01eWqeqz2Z4nj9T1bMiMgXAIRF5R1VfsjhfC4AWAJg5c2aGb09EXsHVxc5JOwJQ1ZtV9RqTP5ne/KGqZ+N/nwewB8D8UY7drqpBVQ3W1dVlegoi8giuLnaO6ykgEakSkWrjZwCfx1DxmIhyqFTSJlxd7BxbAUBElopIJ4DrAewTkQPx56eJyPPxw+oB/LuIvAngdQD7VHW/nfMSUXaMtMnSLUewYOOLWLrlCDrO9RVlEODqYueIauH+BwgGg9raarq0gIiy0NU3gKVbjgz75txQG8CeNQuKbo4+awCjE5G2TNdacSUwkQeUUtqEq4udwwBA5AFG2mTkCGAsaZNC2OCFq4udwWZwRB7gVFO2UqolEGsARJ7hxDf3UqollCrWAIgohRNpk1KqJRBTQESUBU7BLC0MAESUMW7wUlqYAiKijHEKZmlhACCirHAKZulgCoiIyKMYAIiIPIoBgIjIoxgAiIg8igGAiMijGACIiDyKAYCIyKMYAIiIPIoBgIjIoxgAiIg8igGAiMijGACIiDyKAYCIyKMYAIiIPIoBgIjIoxgAiIg8igGAiMijuCMYEY0qFlN094e5BWQJYgAgIkuxmKLjXB/u3dmKzp5QYhP4xvpqBoESwBQQEVnq7g8nbv4A0NkTwr07W9HdH87zlZETOAIoABxiU6EKR6KJm7+hsyeEcCSapysiJzEA5BmH2FTIKsr8aKgNDAsCDbUBVJT583hV5BRbKSAR2Swi74jIcRHZIyI1FsctEpEOETkpIg/YOWep4RCbCtmkqgrsWBlEQ20AABJfUCZVVQw7LhZTdPUN4GzPJXT1DSAW03xcLmXJ7gjgEIDvqmpERDYC+C6AtckHiIgfwKMAbgHQCeCoiDynqu02z10SOMSmQubzCRrrq7FnzQLLFCVHscXL1ghAVQ+qaiT+8FUADSaHzQdwUlVPq2oYwNMAltg5bykxhtjJOMQmJzj1rdznE9RVV2J67XjUVVem3NQ5ii1eTs4CugfAv5o8Px3AmaTHnfHnCJkPsYmyYXwrX7rlCBZsfBFLtxxBx7k+V1IzHMUWr7QpIBF5AcBUk5fWqeqz8WPWAYgAeMruBYlIC4AWAJg5c6bdtyt4mQyxibJl9a18z5oFqKuudPRcLBQXr7QBQFVvHu11EbkLwGIAN6mq2deLswBmJD1uiD9ndb7tALYDQDAY9EQlyRhiEzkll9/KjVHsyBoAR7GFz1YRWEQWAbgfwJ+r6iWLw44CmCMiV2Poxv9VACvsnJeIRpfLb+UcxRYvuzWARwBUAzgkIm+IyFYAEJFpIvI8AMSLxPcBOADgbQC7VPUtm+clolHkuraUrlBMhUnMszaFIRgMamtra74vg6gocYW5N4lIm6oGMzmWK4GJShRrS5QOm8EREXkUAwARkUcxABAReRQDABGRRzEAEBF5FGcBETmIUy+pmDAAEDmEbZGp2DAFROQQtkWmYsMAQOQQtkWmYsMAQOQQbu5DxYYBgMgh3NyHig2LwEQOYVtkKjYMAEQOYgM2KiZMAREReRQDABGRRzEAEBF5FGsARA5jOwgqFgwARA5iOwgqJkwBETmI7SComDAAEDmI7SComDAAEDmI7SComDAAEDmI7SComLAITOQgtoOgYsIAQOQwtoOgYsEUEBGRRzEAEBF5FAMAEZFHMQAQEXkUAwARkUeV3CwgNuIiIsqMrQAgIpsB/A8AYQCnANytqr0mx70PoA9AFEBEVYN2zmuFjbiIiDJnNwV0CMA1qtoM4LcAvjvKsTeq6qfduvkDbMRFRJQNWwFAVQ+qaiT+8FUADfYvaezYiIuIKHNOFoHvAfCvFq8pgIMi0iYiLaO9iYi0iEiriLR2dXVldQFsxEVElLm0AUBEXhCREyZ/liQdsw5ABMBTFm/zZ6p6LYAvAvgLEbnB6nyqul1Vg6oarKury+qXYSMuIqLMpS0Cq+rNo70uIncBWAzgJlVVi/c4G//7vIjsATAfwEtZX20abMRFRJQ5u7OAFgG4H8Cfq+oli2OqAPhUtS/+8+cBfN/OeUfDRlxERJmxWwN4BEA1gEMi8oaIbAUAEZkmIs/Hj6kH8O8i8iaA1wHsU9X9Ns9LREQ22RoBqOofWTz/ewD/Lf7zaQB/auc8RETkPLaCICLyKAYAIiKPYgAgIvIosZi5WRBEpAvA7/J9HTk0GcBH+b6IPONnwM/A678/YO8z+ISqZrSIqqADgNeISKubvZKKAT8DfgZe//2B3H0GTAEREXkUAwARkUcxABSW7fm+gALAz4Cfgdd/fyBHnwFrAEREHsURABGRRzEAFCgR+T8ioiIyOd/XkmsiskFEjsf7Sx0UkWn5vqZcEpHNIvJO/DPYIyI1+b6mXBOR20TkLRGJiYinZgSJyCIR6RCRkyLygJvnYgAoQCIyA0NdU/8j39eSJ5tVtVlVPw1gL4C/yfcF5Vg2W62WqhMAboULbeMLmYj4ATyKob1TmgB8TUSa3DofA0BhehhDbbY9WaBR1T8kPayCxz6HQttqNR9U9W1V7cj3deTBfAAnVfW0qoYBPA1gSZp/M2a2uoGS8+I7rZ1V1TdFvLuRjYj8EMBKABcA3Jjny8mnewA8k++LoJyZDuBM0uNOAJ9162QMAHkgIi8AmGry0joA/xdD6Z+SNtpnoKrPquo6AOtE5LsA7gPwtzm9QJel+/3jx6TbarWoZfIZkLsYAPLAaptNEfkTAFcDML79NwD4tYjMV9UPc3iJrku31WiSpwA8jxILAE5stVrssvg/4CVnAcxIetwQf84VDAAFRFV/A2CK8VhE3gcQVFVPNcYSkTmq+m784RIA7+TzenItk61WqWQdBTBHRK7G0I3/qwBWuHUyBgAqRA+JSCOAGIa6wa7O8/Xk2iMAKjG01SoAvKqqnvoMRGQpgP8HoA7APhF5Q1W/kOfLcp2qRkTkPgAHAPgB/FRV33LrfFwJTETkUZwGSkTkUQwAREQexQBARORRDABERB7FAEBE5FEMAEREHsUAQETkUQwAREQe9Z/nY+4EbfxSuQAAAABJRU5ErkJggg==\n",
"text/plain": [
"
"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"sns.scatterplot(\n",
" x=test.log10_pval_clean(),\n",
" y=test_sf.log10_pval_clean()\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Inclusion of continuous effects"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"One can also regress out size factors. Alternatively one can account for other continuous effects such as time, space or concentration. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We also provide a separate tutorial for continuous covariate modelling in the notebook \"modelling_continuous_covariates\". Please consider this section here a short introduction and refer to the dedicated tutorial for further information."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Numeric covariates"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Firstly, you have to indicate that you are supplying a continuous effect if you want to do so. We will otherwise turn it into a catgeorical effect and this will not produce the desired results. We do this so that we can make sure that there are no errors arising from numeric and catgeorical columns in pandas DataFrames. Here, we add the size factor into the anndata object to make it accessible to the model:"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [],
"source": [
"data.obs[\"size_factors\"] = size_factors"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"I1101 08:29:58.174185 4648003008 estimator.py:51] training strategy:\n",
"{'max_steps': 1000, 'update_b_freq': 5}\n"
]
}
],
"source": [
"test_regressed_sf = de.test.wald(\n",
" data=data,\n",
" formula_loc=\"~ 1 + condition + size_factors\",\n",
" factor_loc_totest=\"condition\",\n",
" as_numeric=[\"size_factors\"]\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Again, this gives different results to using size factors to scale to model only:"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
""
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXYAAAD8CAYAAABjAo9vAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvnQurowAAHQxJREFUeJzt3X1wG+WdB/DvT7KlyI4TG8dJACeE0jS9XM5Ao4ZCbuZgeClcaXIhofQ4EiglCZdhuOlBCW2Oo3NpOw2hxxwDTF56tCTAlAOageEtvLRc5wIpyBcIwWAChVwcQuIY2/hFsSzpuT+sVWxJK8nalbT76PuZYbCl9e6zq81vH/329zwrSikQEZE+POVuABER2YuBnYhIMwzsRESaYWAnItIMAzsRkWYY2ImINMPATkSkGQZ2IiLNMLATEWmmqhwbnTJlipo1a1Y5Nk1E5Fqtra3HlFJNuZYrS2CfNWsWQqFQOTZNRORaInIgn+WYiiEi0gwDOxGRZhjYiYg0w8BORKQZWwK7iFwqIu0i8qGI3G7HOomIqDCWq2JExAvgfgAXA+gA8KaIPK2UarO6biKiUojHFboGIohEY/BVedFY64PHI2VfV6HsKHdcAOBDpdSfAUBEfgtgMQAGdiJytHhcoSccweGe41j9cCs6usNobghg64ogZjdNRHd4GJFoDB4ReASIxOLwejzwCuDxeNKCdjyu0H6kDyu3hcasa9okP8KR0gV6OwL7qQAOjvq9A8A5NqyXiDTghB6sWbvaj/Ths97juOOpfejoDgMAOrrDuOeldvzTRV/B6u0ngv3GZS2464V2dPYP4ZdXngmPCAaGoqjxezGl1g+PR9A1EMHKbSE0TfRj47IWTJ80ATEF9IaH8c+PvY3O/iFsXRHEnGl1RT0GJbt5KiKrRCQkIqHOzs5SbZaIysgInkse2IWFG/6AJQ/sQvuRPkSjcXT2DeFQ9yA6+4YQj2d/9nI8rtKWz/Sa2bKZHBsYwsptIdT4vMmgblg6f0YyqAMjwf6HT+zFjeefgY7uMG55/G30D0Vx/t2v4ooHXkP7kT7E4wqRaAxNE/24c9FcAMDyB9/ARf/+31j+n2/g9su+iqaJfqzcFkLXQMTqoc3Kjh77IQAzRv3enHhtDKXUFgBbACAYDPIJ2kQVwOjB5uoNZ+vFmqU3/FUerHjwjbT0yf7O/rRlU9cdjysMDsXQ0R1GT3gYzQ2BMcG9sdaXFuw7usOoD1Qnf67xeZM/r9wWwo41C+Gr8uLmC2eje2A47VvALY+/je3XL8AHR/sRj8ftPdAp7OixvwlgtoicLiI+AN8F8LQN6yUil4tEY3n1hrP1YjNdHFZuC+FA12Daa0f7hzJeSD774viYHnzXQAQfHxtAc0MAm179CBuWtqC5IQAAaG4I4KRaX/J3Q3NDAD3h4bSfje1EojE01vpw+pTajN8COrrDONo3hPXPtOHYQCTntxQrLPfYlVJREbkJwE4AXgAPKqXetdwyInI9X5U3795wJBrLuI5MFwejx3z2jHrceP4ZqA9Uoyc8DIEas+zZM+px7Xmn4zubXx/Tgz+pphr3vrIfG5a2YO2Te3H3znasXzwPMxtrcLgnnAz2a5/cm5Zjb24I4J7vnImfP/d+cjvNDQHEEoG6xu/FYCSWtt/GxaCjO4zV21uxY81CNNX5Cz+4WdgyCZhS6jkAz9mxLiLSR2OtD1tXBMekRqbW+TMGPV+VN+M6Ml0cmhsCUABu/eacMcF38zXzccncqXix7SgA4Mbzz0i+D5zo2f/X6nPR2T+Eu3e2447L56I+UI3GiX5seP695N/uP9qP9Yvn4ctTawEIPus9jtsv+yoGIzFMrqlGU50v2Zb7r/4afvpsG362pAVTav04rbEGG5e14IdPnGjbL688E794/v1kO8wuZHYQpUqf7g4Gg4qzOxJVhtFVMSICn1dwpG/Ico69PlCFKzfvRkd3ONlznz5pAqZM9CV7xs0NAXzr3v9JW+eutRfgi+PRMevcdv0CDEXjadtpnOjDFQ+8lnZh+fV1X8fnAxH0hIdxyuQJ+PZ9u7Br7QU4taEmWUYZjsQQiyt82nscG55/H3sO9iT/vpAeu4i0KqWCuZYry7S9RFQ5PB5BY61vTHC+ZO5UPHrDOfB6JGcJpMcjmDOtDjvWLBxTMnm4N5wM6qk99w1LW/Bk60HcfOFXxvTggRPfDuZMC6StE4Dpdkbr6A7j84EIrtqyG80NAdxx+dxkOsbInccS90cn+LyYWudHZ/9QcvtbVwST2ysGBnYiKrrUG6Avth1F2+G+vHutHo+gqc6f7P0f7g1DRNDcEMiYbln75F7ccflc3PhwK7ZdvwBth/vG9sITF5JM2059rbrKY5ovNy4iD732MTYsbcFPn23DL5a24MgXQ2N7/suD+N0/nofjwzFUeT2YOtFf1Dp2BnYiKjrjBmjqzU6j7C+fQUypKZlL5k7Fpmvm4/hw5pur9YFqdHSH0RsexmOrvgEABQ2QqvJIWr78gX/4GhpqqvGb7y2Av0qw4txZuHtnO/Yc7MGd346lV/FsD2H94nn43m/ezJl6sgMDOxEVna/Ki0vmTsW1550+9mbn8vlorPXnVXueqdcPAHd++y+z9qh7BofR3FCT1hNPzf2bTRMQjsRw1wsnbrL2hIdx51Pv4vbLvjomFbPnYM9IOkadqMwZfSGbWufH2TPqsedgT7LuvVhVMZy2l4hskW3EZ2OtD//yrblpKZPV21sz1p5nqmvPVPb4YttRVHkEW1cEx9ShGzn2jctacFpjTVo+O3VE7Hc2v44POwewbsfe5ChSY39iSuHmC2dj06sfJata1n3rL3BSrQ9nz6hPfjswLkgTqkeqeIzc//pn2nDVlt1Y/uAbuPWbc5J/U8yqGPbYicgys8oVo9ft8Qi8HsmYMonG4mmvN030IxKN4VD3YDJ9Ylb26PF4xtxcFRGIjPTkAz4v6gPpqZdMg56MvPzKbSE8fdPCtDz5/VefjePDcdzy+NtjbtI+9NrHySqXhkA1Pg9H8PD3z0E0rnDXC+9l3Mb6Z9pMyzvtwMBORACsTdZlNjrUSDfE4woigiduPBddAxFsevWjZOqi2uvBr6/7Omp8XvSEh/FK2xEs+dqpuGrL7rTpAlJr4nPdCDVjNujJyMuHI+l58s8zTBOw9sm9ePSGc3Dy5JFvC6kXtw1LW9DZF0mWOXZ0h5O1/ayKIaKiytXjzsUsUEaisYzrNnq6P7h4DsLDsWTANG5M3vf7/RkvEpnKHgu5AWnW+x+OxfHr676OaFzhjsvnJi9AAEynCfAmvpF09qWnlIwe+urtrcltnFIfwPRJE/SY3ZGInMusx53vLIRGoBzNqBc3S3v8ZNE8TJvkT07kZby35pH/xdL5M8asy7hIGD3zUxM3QwsNjkaveXRe/pdXnom6CVW446l9+JuNr2L9M23JnDiA5DQBmfYRML+4GT1z42JZ7KAOsMdO5BrFnNc8W487H5mmDjDSDWYDfJRSCEeyB0NDtikHCpE66Mmoilm66XU0TfQnK2AGIzHcdukc/PCJvTitscZ0HwHzbwGn1Aewa+0FJZ2LnoGdyAVypUqsBn2zoJRvMDUbHepJjCzNtu5M742eT6ZYIzVT8/KHugfRNNGfNop10zXz8fRNC1EfyDwy1TjOZhe3UvTQU3GuGCIX6OwbwpIHdqUFwB1rFqYN1y9kAIzVHHuh6wbSbzimPpauVD3dzr4h7DvUO+YGKTC+eV2K/bQozhVDpJFsqZJcFSn5yNbjtirXus3eK9bgHTPGXOpWUlLlaHcmDOxELmA2X4mvyms5P24oZlDKtm6nBEOPR1Djs5aScgpWxRBZkO/zNa2s+0hvGF+Eh7Fx2din/Bh552wVKeVSzONSLPG4Qv9Q1PQ4W113KY8He+xEBSpVXtoYqZharTFtkj85JW7qTbvN18yH1zOyHrM5zouVCy7mcSmmroEIVjz4hulxLlQ5jgdvnhIVKNsNTauphdHrfmzVN3DVlt1pyxgldJFoDAGfF9GYwmAkho+PDeDeV/ajs3/I9EHOxQw0xTwuxXSoexALN/wh7XXj4RmFsvN45HvzlKkYogLZldvOtW5jlsLRjIc6GJNYLbpvFzr7h/Dz59rwvd+8iT0He9IGGRnpgMO9YUuDkcbTdkOm45IpPVHOFE6xUlrFPE/MMLATFcjOQJAa0AK+E+s2Hqw8Ou+7efl8/PTZtrSZEs1GbI6ezbCjO/OAIbsCTT7HJXV2xSUP7MInXQNprxkzLZZCptGoduTXy3EPhIGdqEB2BYJMQe7IF0PYdv0CNDcEsOdgDx567WM8esM52LX2AuxYsxBTan1jHvcGZB+xObok0uwbgF2BJp/jkqlE80DXYFG/SeQyuizTOM52pKeKdcHIhjdPiQpkV+23WR3679acZ7ruzr6hcY3YHD2s3/gGMHp0pZ2BJp/jkik9YTbJVjFTFqmKUXpZzDECZhjYiSywIxCY5WCHo3HTm3Zmw9dPmZz+gObUYf17Dvbg7p3tWL94Hs6YOhGBavsDTa7jkmmaAWOSrXxryIs9ytNOpa7VZyqGqMwKycGapQ2qqjwZZz9MTQd09g9h+uQJaK4PWJolsVCZ0hPGJFv5pCwypa9KmY93OpY7EpVZvuWHVnuo+f59qXrCmbYDIK9tu7Wk0irOFUPkEvnkYO2oPc8nHVDKwTRm7cknMJejhNBNmIohcoBcD5Cw+iCMfJVqO1Y5cRoFJ2FgJ3KBUvVQ3dITLkcJoZswFUNl46aqhnKz+iAMp23HqnKUELoJe+xUFqxqGJ9S9VAzbWfz8vloCFTbuh0z45lSwK7nn+qIVTFUFpVa1WBFqb7hRKNxfNobxtG+IXQNRPBk60H84OI5RZ+d0a2zQpYSq2LI0dySyzU4IW1UqkEu3eFhXP2rP435fNoO9xX9omvHk6BohKVUjIhcKSLvikhcRHJeRYgMbqpqqLS0Ubkuum672DuZ1Rz7PgBXAPijDW2hCuKmqga3lADapVwXXTdd7J3OUipGKfUeAIgw/0XjU66qhkJSKmY9yXg8js6+Ie2qMszmoSn2Rbdc29URc+xUNlZzxuMN0oXenMtUAnjJ3Kk4NhDB6u2t2t3oy3ckrN33HFjCaJ+cVTEi8jKA6RneWqeUeiqxzKsAblVKmZa6iMgqAKsAYObMmfMPHDhQaJuJCgrShVbiZNrWozeck3aDsVKqepxeveKEG93FYltVjFLqIjsapJTaAmALMFLuaMc6qXIVUkFR6M05oyf5uzXn4fhwHF4B4kpV7I0+J1evOP2iUyocoESuVEiQtnpzrqs/gqu37sbCDX/Ah0cHMq5LRLStljE4uXql0m50m7Fa7rhERDoAnAvgWRHZaU+ziLIrJEhbqcRJDRj3vrIfG5eNfQ7phqUt+MnT+7QuhQScXb3i5ItOKVmtitkBYIdNbSHKWyEVFFZuzqUGjD0He3DXC+347apv4LPe4+gaiODune3Yc7An52Aet+eAi129YuX4uGWum2JjVQy5UqFButBKnEwBo7N/CAJg2abXxyybrYeoQw64mNUrVo8PSyZHcK4YB3J7j05HZgFn2iQ/Ft2Xf6WNW+fIyXZO2nm+2nF8dP73w7liXEqHHp2OzHqpAMbVQ3RjDjjbOQnA1vPVjuNT6gdHOxGrYhzG7Xf1xzPtqttkmibW7KHSZkHNyTcezWQ7J+0+X+0+Pjqfj9kwsDuMG3t0hkqbLMswnnnB3TRHjiHbOWn3+Zrr+IwnUFfq+QgwFeM4br6r7+SBK07hxmHzuc5JO8/XbMdnvGnKSj4f2WN3GDf26Axu/rZRSm578k+2c7IY56vZ8Rlv2qeSz0f22B3GjT06g5u/bQB6V1NYkeucHP1edZUHVR7B4d6w7cdwvIHa7eejFeyxO1CuHp1Tbwi5+dtGJedj85HtnDTeO3lyAF39ESy6rzjHcLw3Vt18PlrFOnaXcXo5pFt7vUb9dNNEP248/wzUB6oxGInhzBmTcVKt3vlYuxS7Rr+Qc9+t56MZ1rFryuk3hNxaQxyJxtA00Y9bvzkHa5/cmwwcm6+Zj0n+anSHh7UJDsVS7Jx2IWlKt56PVjEV4zKVfEOomHxVXtx84exkUAdGjuvqh1vxaW+YKZo8lKJG3203nsuFgd1l3DjAxQ0aa304fUptxovm0b6hkg4Yc+o9lFwqOaftNEzFuAwnOSoOj0dQ489cRZEaxIv5Dcnp91CycXNFl24Y2F2G/3iKZ0qtP+2iuXn5fPzHyx+MWa6Y35Ccfg8ll0rNaTsNA7sL8R9PcWS6aDYEqvGDi+eg7XBfSb4h8R4K2YGBnUrCLWVnmS6apfyGVMmDasg+vHlKRef2wT+lrMTgDUiyAwcoUdG59eES5eKWbzdUehygRI7BvPH48B4KWcXA7iC69tSYN9afrueuWzGwO4Sb65dzYe293nQ+d92KOXaH0D0PzR6dvnQ/d52EOXaX0T0PzbyxvnQ/d92I5Y4OwTlgKBM3zBvDc9d5GNgdgvXLlMot9f88d52HOXYHYR6aRnNT7prnbmkwx+5CzEPTaG7KXfPcdRamYogcirlrKhQDO5FDMXdNhWIqhsihOPc+FYqBncjBmLumQlhKxYjIRhF5X0T2isgOEam3q2FERFQYqzn2lwDMU0q1APgAwI+sN4mIiKywFNiVUi8qpaKJX3cDaLbeJCIissLOqpjrATxv4/qIiKgAOW+eisjLAKZneGudUuqpxDLrAEQBPJJlPasArAKAmTNnFtRYO3GkHBHpKmdgV0pdlO19EbkOwOUALlRZ5idQSm0BsAUYmVJgfM20F+ePJiKdWa2KuRTAbQAWKaUG7WlS8XUNRJJBHRgZpr1yWwhdA5Eyt4yIyDqrOfb7ANQBeElE3hKRTTa0qejcNAcHEdF4WRqgpJT6sl0NKSU+g5OIdFaRc8VwDg4i0llFTinAOTiISGcVGdgB63NwsFySiJyqYgO7FSyXJCInq8gcu1UslyQiJ2NgLwDLJYnIyRjYC8BHlhGRkzGwF4DlkkTkZLx5WgCWSxKRkzGwF4iPLCMip2IqhohIMwzsRESaYWAnItIMAzsRkWYY2ImINMPATkSkGQZ2IiLNMLATEWmGgZ2ISDMM7EREmmFgJyLSDAM7EZFmGNiJiDTDwE5EpBkGdiIizTCwExFphoGdiEgzDOxERJphYCci0gwDOxGRZhjYiYg0w8BORKQZS4FdRNaLyF4ReUtEXhSRU+xqGBERFcZqj32jUqpFKXUWgGcA/KsNbSIiIgssBXal1Bejfq0FoKw1h4iIrKqyugIR+RmAFQB6AVxguUVERGRJzh67iLwsIvsy/LcYAJRS65RSMwA8AuCmLOtZJSIhEQl1dnbatwdERDSGKGVP9kREZgJ4Tik1L9eywWBQhUIhW7ZLRFQpRKRVKRXMtZzVqpjZo35dDOB9K+sjIiLrrObYfyEicwDEARwAcKP1JhERkRWWArtSaqldDSEiIntw5CkRkWYY2ImINMPATkSkGQZ2IiLNMLATEWmGgZ2ISDMM7EREmmFgJyLSDAM7EZFmGNiJiDTDwE5EpBkGdiIizTCwExFphoGdiEgzDOxERJphYCci0gwDOxGRZhjYiYg0w8BORKQZBnYiIs0wsBMRaYaBnYhIMwzsRESaYWAnItIMAzsRkWYY2ImINMPATkSkGQZ2IiLNMLATEWmGgZ2ISDNV5W4A5RaPK3QNRBCJxuCr8qKx1gePR8rdLCJyKAZ2h4vHFdqP9GHlthA6usNobghg64og5kyrY3AnooxsScWIyC0iokRkih3roxO6BiLJoA4AHd1hrNwWQtdApMwtIyKnshzYRWQGgEsA/J/15lCqSDSWDOqGju4wItFYmVpERE5nR4/9HgC3AVA2rItS+Kq8aG4IjHmtuSEAX5W3TC0iIqezFNhFZDGAQ0qpt21qD6VorPVh64pgMrgbOfbGWl+ZW0ZETpXz5qmIvAxgeoa31gH4MUbSMDmJyCoAqwBg5syZ42hiZfN4BHOm1WHHmoWsiiGivIhShWVQROSvALwCYDDxUjOATwEsUEp9lu1vg8GgCoVCBW2XiKhSiUirUiqYa7mCyx2VUu8AmDpqg58ACCqljhW6TiIiso4jT4mINGPbACWl1Cy71kVERIVjj52ISDMM7EREmmFgJyLSDAM7EZFmGNiJiDTDwE5EpBkGdiIizTCwExFphoGdiEgzDOxERJphYCci0gwDOxGRZhjYiYg0w8BORKQZBnYiIs0wsBMRaYaBnYhIMwzsRESaYWAnItIMAzsRkWYY2ImINMPATkSkGQZ2IiLNMLATEWmGgZ2ISDMM7EREmqkqdwPyFY8rdA1EEInG4KvyorHWB49Hyt0sIiLHcUVgj8cV2o/0YeW2EDq6w2huCGDriiDmTKtjcCciSuGKVEzXQCQZ1AGgozuMldtC6BqIlLllRETO44rAHonGkkHd0NEdRiQaK1OLiIicyxWB3VflRXNDYMxrzQ0B+Kq8ZWoREZFzuSKwN9b6sHVFMBncjRx7Y62vzC0jInIeSzdPReQnAFYC6Ey89GOl1HNWG5XK4xHMmVaHHWsWsiqGiCgHO6pi7lFK3W3DerLyeARNdf5ib4aIyPVckYohIqL82RHYbxKRvSLyoIg0mC0kIqtEJCQioc7OTrPFiIjIIlFKZV9A5GUA0zO8tQ7AbgDHACgA6wGcrJS6PtdGg8GgCoVC428tEVEFE5FWpVQw13I5c+xKqYvy3OBWAM/ksywRERWPpVSMiJw86tclAPZZaw4REVmVMxWT9Y9FtgM4CyOpmE8ArFZKHc7j7zoBHBjn5qZgJO1TSSpxn4HK3G/uc+Wwst+nKaWaci1kKbCXkoiE8skt6aQS9xmozP3mPleOUuw3yx2JiDTDwE5EpBk3BfYt5W5AGVTiPgOVud/c58pR9P12TY6diIjy46YeOxER5cGxgV1ENorI+4npCnaISL3Jcp+IyDsi8paIuHo46zj2+VIRaReRD0Xk9lK3004icqWIvCsicRExrRTQ6XMGxrXfOn3WJ4nISyKyP/H/jFOQiEgs8Tm/JSJPl7qddsn12YmIX0QeS7z/JxGZZde2HRvYAbwEYJ5SqgXABwB+lGXZC5RSZ2lQOpVzn0XEC+B+AJcBmAvg70Vkbklbaa99AK4A8Mc8ltXlcwby2G8NP+vbAbyilJoN4JXE75mEE5/zWUqpRaVrnn3y/Oy+D6BbKfVlAPcA2GDX9h0b2JVSLyqloolfdwNoLmd7SiHPfV4A4EOl1J+VUhEAvwWwuFRttJtS6j2lVHu521Fqee63Vp81Rtr+UOLnhwD8XRnbUmz5fHajj8cTAC4UEVseMuHYwJ7iegDPm7ynALwoIq0isqqEbSo2s30+FcDBUb93JF7Tna6fcza6fdbTRo1M/wzANJPlJiRmgt0tIm4N/vl8dsllEh26XgCNdmzcjgdtFCzbzJFKqacSy6wDEAXwiMlq/lopdUhEpgJ4SUTeV0rl87W+LGzaZ1fJZ5/z4KrPGbBtv10lx2ywSUopJSJmJXmnJT7rLwH4vYi8o5T6yO626qysgT3XzJEich2AywFcqEzqMpVShxL/PyoiOzDyFcix/+Bt2OdDAGaM+r058Zpj5TtDaI51uOpzBmzZb60+axE5IiInK6UOJyYQPGqyDuOz/rOIvArgbABuC+z5fHbGMh0iUgVgMoAuOzbu2FSMiFwK4DYAi5RSgybL1IpInfEzgEvg4hkm89lnAG8CmC0ip4uID8B3Abi2ciAfun3O46DbZ/00gGsTP18LIO1bi4g0iIg/8fMUAAsBtJWshfbJ57MbfTyWAfi9WQd23JRSjvwPwIcYyT+9lfhvU+L1UwA8l/j5SwDeTvz3Lka+4pa97cXc58Tvf4uRqpmPNNjnJRjJPw4BOAJgp+6fc777reFn3YiRapj9AF4GcFLi9SCAXyV+Pg/AO4nP+h0A3y93uy3sb9pnB+DfMNJxA4AJAB5P/Lt/A8CX7No2R54SEWnGsakYIiIqDAM7EZFmGNiJiDTDwE5EpBkGdiIizTCwExFphoGdiEgzDOxERJr5f5u/YsSMHDu8AAAAAElFTkSuQmCC\n",
"text/plain": [
"
"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"sns.scatterplot(\n",
" x=test_sf.log10_pval_clean(),\n",
" y=test_regressed_sf.log10_pval_clean()\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Spline basis transformation"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It may be desirable to not fit a linear trend to a continuous covariate but to allow smooth trends in this covariate, such as smooth trends of total counts, time, space or concentration. This can be solved by using a spline basis space representation of the continuous covariate. Diffxpy does this automatically in a separate wrapper `continuous_1d()`:"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"I1101 08:30:00.283493 4648003008 estimator.py:51] training strategy:\n",
"{'max_steps': 1000, 'update_b_freq': 5}\n"
]
}
],
"source": [
"test_spline_sf = de.test.continuous_1d(\n",
" data=data,\n",
" formula_loc=\"~ 1 + condition + size_factors\",\n",
" formula_scale=\"~ 1\",\n",
" factor_loc_totest=\"condition\",\n",
" continuous=\"size_factors\",\n",
" df=4,\n",
" quick_scale=False\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The spline model has more degrees of freedom (df=4 means 4 degrees of freedom) to fit the expression trend of each gene as a function of the size facotr than the simple linear model (1 degree of freedom) had. Accordingly, the p-values change again:"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
""
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXYAAAD8CAYAAABjAo9vAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvnQurowAAGWdJREFUeJzt3W1wXOV99/Hff3e18kqWkZBlYywbexrHE8cRD1YI4BcthRKSuvEQQ6fhwdA0thOXyT33UHBS12mnHmbqmIxnCAQbd0hrQqak8e0hTULB0HC3Q5sSyYDDk2JoIJINWFZkI8trrXbP1RfSLnrY9a60u9rds9/PjF7s7vGe62Dm50v/68mccwIA+Eeg1A0AABQWwQ4APkOwA4DPEOwA4DMEOwD4DMEOAD5DsAOAzxDsAOAzBDsA+EyoFDedO3euW7JkSSluDQAVq7Oz84RzriXbdSUJ9iVLlqijo6MUtwaAimVm7+RyHaUYAPAZgh0AfIZgBwCfIdgBwGcKEuxmdr2ZdZnZm2b2tUJ8JwBgevKeFWNmQUkPSvoDST2SfmFmP3LOvZbvdwNApfI8p77BmDzPU8JJzjmFQ0E114cVCFhR712I6Y6XS3rTOfc/kmRm/yRprSSCHUBVisc9dR0f0I9e7NFn2xbqz79/SD39UbU2RbTntlVaPq9BoVDxKuGF+OaFkrrHvO4ZfQ8AfM/znHoHhnS0/4x6B4YUj3s6diqqH73Yo1uvXJoKdUnq6Y9q06OdOnYqKs8r3rGkM7ZAycw2StooSYsXL56p2wJAUXie08loTO+ePKtN3+tM9ch337pKc2fX6A8vXqjfDsZSoZ7U0x9VdDihvsGYWhpqi9K2QvTYj0paNOZ16+h74zjnHnbOtTvn2ltasq6IBYCy5XlOXe8P6OXuU6lQl0ZC+8vf69TpIU8P/NsRNdWH1doUGfdnW5simlUTlKm8e+y/kLTMzJZqJND/RNLNBfheAChLfYMx7TrYpa9/5mP69hcu1ezakM4OJ3Ts1Fntfu4t1YZMm6/+iI6djOrRP7tcb584o/ufPaLe00Pasa5N9/7kNf31H328aO3LO9idc3Ezu1PSU5KCkh5xzr2ad8sAoEx5nqfbr1qq2x55IVWC2bGuTfs7u3XP9ctVEwwoGkvo7h8eTn3+0C2X6fRQXN/81y692H1Sf/WHK4rWvoIMyzrnfuqc+6hz7necc/cW4jsBoJzE456OnYzqnb5BxT2nLfsPjyvBbNl/WOtWLdLdPzysuOdSoZ78/CuPHdIHZ+N6sfukWpsiCgWLNyumJLs7AkC5Ss4/j8UTqgkFFAqYPM/p3Q+G9JXvdapldq2+9ccXpx0UnddQq57+qBKeS/t5Y6QmNcA6b3ZxBk4lgh0AUpKDohv2daRKKDtvbNOFjZFUqP/Fp5fLcyODoGPDu7Upotm1IbU2RZTwXNrPL2yM6PGNV2h+w6yyn8cOAL7QNxhLhbo00su++4eHFTAbmfHye7+jLfsPKxqLa8e6ttSMl2SNPZbw9ODNl+nxF97Rt266eNznD91ymYbiCS04L1LUUJfosQNASiyeSFtCSfbAGyM16umP6tips9rf2a1ta1aoMVKjk9Fh/eN//lrb1nxcZ4fj+mzbQj34syPatmaFmuvDammo1ckzMTXUhoq+nYBEsAOoYmPr6eFQUJFwMG0JZVZNQI996VMaTjh9945P6slfvqvbr1qaGkBtbYrooVtXafuPX9XTrx3XdSvm6Wuf+ZjMpKP9UT34b2/qM59YoKb6sHoHhoq+X4w5V7xJ8pm0t7c7jsYDUErp6ul7b2tXY11IR0+eVd9gTIfe7tP6q5aq/8ywvjxmdel3brlM//+N41rZ2qiLmusUDgX0t/8yEupJrU0R3XfTxdrx5Bu65/rl46Y+7l3fruXzG6Yc7mbW6Zxrz3YdNXYAVSldPX3XM10ajHlKeE7hYEA3fnKxukdXk469bvNjh/S5Sxfq4wsa1FhXI5PGhXryugsbI3rg5ksnTX3csK9DfYOxoj0bpRgAVWliPf3SRY26/aqluuO7L4ybEbNwQmlGGgnn3oEhzWuoVWvDLCW8WNoSTqQmmLFuH4snivZs9NgBVKVwKDhuH5fkjJeJM2JMlna/l77BmI4PDKlvMKbm+rD2rm8fNwtm7/p2NdeHJ90n+Xk4FCzasxHsAKrSxDBurg+n7VkPJ7y0Uxv3d3brvEiNPM9TIGBaPr9BBzav1vNbrtaBzatTNfRzhX6xUIoB4EsTZ7xMnIkyNoyTZZF05ZTjA0O676kubV+7Uhc11+nYyaj+8T9/rduvWqqdT72h/3PtR9XSMEuBgKXdhnfifWbiFCVmxQDwnbQzXrLMRBkeTqjr+Olxs1/23LpKLQ21OjucUE0woIBJ3f1R9Q3GtPu5t1L7vhzYvLpoe6uPleusGHrsAHzF85ze++CsBofi2rZmRSqAN+zryBjAnuf05olB/ctLPfruHZ9UMGAKhwKqCweU8EytTXUKBExH+8/oxt3/Ne7PFnsgdDoIdgC+ka6nvmNdm+57amSr3GQAe57TicEhnR1OKGimmmBAuw526enXjmvPf7ytSxc16qvXLNOi8yPq/m1UFzXXaUlzfWogdGK5ppgDodNBsAPwjXRz07fsP6xvf+ESnTwTV8I5/XZwSO+fGtKGR8dv9LX56o+od2BkbvlffHr5uFWlO29sU2NdTWogdGKJp5gDodNBjR2AbxztP6PVO36Wen3pokbddd1HtbApkjrF6KvXLNO2J16Z1OvevnalYglPkrT9x69N+vzxjVdoYVNd1kHZYqLGDsAXphKkY0slly5qnNTz3rGuTXXhYNppjXXhoBoCoYx7qSdG+8CZZr+UE+axAyhbyZr5Dd95Xqt3/Ew3fOd5db0/IM9zk67rHRhSLJ7QY1/6lK5bMS/tgqMt+w+n9kwfq7UpojOxhC5sjKi1KZLhAOrKicvKaSmAqpOuZj5xn5WJ4X/L3/+37vz9Zfro/Nlpe95nhxPa9cfj90rfeWObLmqu0wVzZmnBeZG0C4rm1pd3L30sSjEAylYu+6ykC//Njx3SP/zp5WlnsJw4HdOi8yP6f1+5SmfjnoImRcJBNUY+LPHM9IKiQqPHDqBs5bLPSqbwrw2ZHrrlsnE97923rtKsmoAaakOaN2eWFp9fp4VNdTq/vnbSqtSWhlotbKpTS0NtRYW6RLADKGO57LOSKfxfe3dA33jiVW1bs0KPb7xC29as0JxIaOSou4C/o4/pjgDKWrZZMZ7n9Pp7H2jToyOHTX/1mmVaMrdO738wpB1PvqEXu09K+vDgi/MiNdM65KIcMN0RgC/kMr2wMVKjnTe2aXZtSF957NC4hUXf/Ncu9Z4e0p7bVmnBebPG1dL9imAHUNH6BmP61funJUl3//DQpP3UH994RUUOgObD34UmAL4Xiyd0/7NHtLi5LuPComoKdYlgB1AGkguMjvafUe/A0KQFSOcSDgXVe3pI756Mph1Efev46bSLmvyMYAdQVNlCO5fVpef6juTMmX3/9Xbak47uf/ZI0Q+PLjfMigFQMBNnsDRFanSk9/Q5D7zoHRjSDd95ftJCouTe6bkcmpG8r+d5GvacjvZHdTI6nNqLXZKe33K1FjbVzfx/lALKdVYMPXYABZGu533sVDTrlgDZVpfmsq1AcubM/PMiqg0Fddc/v6xNj3aOm+pYbnumFxPBDqAg0gXw8YGhrFsCZFtdmin4o7F42tJOKQ6PLjcEO4CCSBfAfYOxrFsCZAviTMH/+nsDaevxYw+Pfn7L1TqweXXFLkiaLoIdQEGkC+D9nd3ac9uqc/aeAwHTspbZ+sGmK/Xvd/+eHt94hc6vqxmtmbu0wb9jXZt2P/dW2rJM8jsrea+XfDF4CqAgMg1yLmuZrf7o8Dm3BOh6f0C7Dnbp9quWjjsYIzlIKo30/qOxuF5/b2DcoKjkj4HRXLClAIAZNbYEMjHEz7UlQLI2v23NikkHY2zY15GaHdPSUKvegfTH1lXTwGgu8irFmNlNZvaqmXlmlvVfEQD+kGleeaYSyLnmoSdr842RmqwDrQyM5ibfHvsrkj4vaU8B2gKgAuQyr3wq1ydr8yejw2kPxhjbGz/XbwX4UF49dufc6865rkI1BkD5y/W4umQP/cTpIb136qy+ddPF2nPbKrXMrtWGfR06MTgk6cNe+P7O7kkrR9P1xqt9YDQX1NgBTEm2BUVje+gts2v1t2s/rm1PvJLqre9Y16b7nurS2WEvtWJ0zqyQ/uZzKxUOmn6w6Uo55+iN5yFrsJvZM5IuSPPRVufcE7neyMw2StooSYsXL865gQDKS7J0kqlkMrZHv23NitT+6NLIPwBb9h/W9rUrFTJNqaSD3GUtxTjnrnXOrUzzk3Ooj37Pw865dudce0tLy/RbDKCksg1gju3RZxoQvai5TqFgIGtJB9NDKQbAlGQbwBzbo880IFoTDGg44WWdBYPpyXe64w1m1iPpSkk/MbOnCtMsAOXsXAOYY3v0u597SztvHD8guue2VVowZ1bWPWIwfaw8BZCXdIdNS0q9FwkHFfechuPeuN79VKdNgpWnAIpo7P7nJwZj2vRo56RwznYANXPSi4dNwADkZOzc9J7+M9p64LBe6jmVCnVp6gOgzEkvDnrsALJKVzbZsa5NdeEgA6BliB47gJRMe7qcGByaNDVxy/7Dml0bYgC0DBHsACRlPlQ6Hvd0Zij9atOzw4mctgHAzKIUA0BS5j1gfrDpSv36xGDa+ejHTp3V/s5uff9Ln1JwdEMvBkBLj2AHICnzHjDDCU/3P3tEO9a1jTsEY/etq9QyO6zLFjcR5mWGYAcgSaoJBTKuEu09PaT7nurStjUr1Bip0ZlYQhc2ztL59eee0ojSoMYOVLnkgOnQcEL7vni5rlsxT9JIqO+8sU114YD2rm9X7+khbXq0U3f988uaN6dWc2prxv35dIdooDRYeQpUsXTTGB+8+TIFTDp26qx2P/eWHrj5Us1vmKVjp6I6PjCkvsGY9nd26//+wXIta5mtI72nWT06Q1h5CiCrdAOmf/79Q9q2ZoU2PdqZmrrYHx3WzX//3+PKNK+9O6AfbLoy7YBr8pxSlAalGKCKZRowbYzUjJu6mOm6ODs0liWCHahimXZYbG2K6MDm1ZPOJZ14XSgYYIFSGSLYgSrWFKnR7ltXjVtgtPvWVZoVDoybwpjpcI15s2szHrrBoGrpMHgKVLHegSFtPXBY669cogWNEf2m74zuf/aIek8PTRoETbc9b3L73XTb9rIlb+HlOnhKjx2oYrF4Qk+/dlyDsYRuf+QF/ek//EIvdp9Mu0vj2J0Ym+vD6huM6Wj/GfUNxtRcHx63Q2OmVawcezczmBUDVLFk7TzT2aTpBkFzOSAj02Arg6ozgx47UGXG1r6DAWnv+nadiSVyHgTNpTfOsXelRbADPpVu8HLiDo6fe+B51YYCals0R3smDKJm2qUxl954psFWdn2cGZRiAB/KVC6ZP6d2Um97/SMv6MDm1frYgjk5HVOX7I1P3FNmbG+cY+9Kix474EOZyiXRWObedq7H1OXaG+fYu9Khxw74UKZyScIpa287G3rj5Y8eO+BDmQYvZ9UEClL7pjde3uixAz6ULJdMrLHPra/V3Ppaets+R7ADFS7TitBzlUvYedHfCHaggmVbLESAVyeCHahgfYMx7Tr44ZF1J6PD2nWwS/fe0EaoVzGCHahgnufp9quWjjtkese6NnmeV+qmoYSYFQNUsIRTKtSlkSmNW/YfVoIdcqsawQ5UMOdc2vnqpdiOG+WDYAcqGJttIR2CHahgbLaFdBg8BSoYy/uRDsEOVIhMC5GYr46JCHagAuRyahGQlFeN3cx2mtkbZnbYzA6YWWOhGgbgQ5whiqnId/D0oKSVzrk2Sb+S9PX8mwRgIs4QxVTkFezOuaedc/HRlz+X1Jp/kwBMxLRGTEUhpzt+UdKTmT40s41m1mFmHb29vQW8LeB/TGvEVFi2FWpm9oykC9J8tNU598ToNVsltUv6vMthyVt7e7vr6OiYRnOB6pVpVgyqh5l1Oufas12XdVaMc+7aLDe6Q9IaSdfkEuoApodpjchVXtMdzex6SfdI+l3n3JnCNAkAkI98a+wPSGqQdNDMXjKz3QVoEwAgD3n12J1zHylUQwAAhcHKU6CIGPBEKRDsQJGwDQBKhW17gSJhGwCUCj12YBpyKbGwDQBKhWAHpijXEktyG4Cx4c42AJgJlGKAKcq1xMI2ACgVeuzAFOVaYuF0I5QKwQ5M0VRKLGwDgFKgFANMESUWlDt67MAUUWJBuSPYgWmgxIJyRikGAHyGYAcAnyHYAcBnCHYA8BmCHQB8hmAHAJ8h2AHAZwh2APAZgh0AfIZgBwCfIdgBwGcIdgDwGYIdAHyGYAcAnyHYAcBnCHYA8BmCHQB8hmAHAJ8h2AHAZwh2APAZgh0AfIZgBwCfIdgBwGfyCnYz225mh83sJTN72swuLFTDAADTk2+Pfadzrs05d4mkH0v6RgHaBADIQ17B7pz7YMzLekkuv+YAAPIVyvcLzOxeSeslnZJ0dd4tAgDkJWuP3cyeMbNX0vyslSTn3Fbn3CJJj0m68xzfs9HMOsyso7e3t3BPAAAYx5wrTPXEzBZL+qlzbmW2a9vb211HR0dB7gsA1cLMOp1z7dmuy3dWzLIxL9dKeiOf7wMA5C/fGvvfmdlySZ6kdyR9Of8mAQDykVewO+fWFaohAIDCYOUpAPgMwQ4APkOwA4DPEOwA4DMEOwD4DMEOAD5DsAOAzxDsAOAzBDsA+AzBDgA+Q7ADgM8Q7ADgMwQ7APgMwQ4APkOwA4DPEOwA4DMEOwD4DMEOAD5DsAOAzxDsAOAzBDsA+AzBDgA+Q7ADgM8Q7ADgMwQ7APhMqNQNwIc8z6lvMKZYPKFwKKjm+rACASt1swBUGIK9THieU9f7A9qwr0M9/VG1NkW0d327ls9vINwBTAmlmDLRNxhLhbok9fRHtWFfh/oGYyVuGYBKQ7CXiVg8kQr1pJ7+qGLxRIlaBKBSEexlIhwKqrUpMu691qaIwqFgiVoEoFIR7GWiuT6svevbU+GerLE314dL3DIAlYbB0zIRCJiWz2/Qgc2rmRUDIC8EexkJBEwtDbWlbgaACkcpBgB8hmAHAJ8pSLCb2V1m5sxsbiG+DwAwfXkHu5ktknSdpN/k3xwAQL4K0WPfJekeSa4A3wUAyFNewW5mayUddc69nMO1G82sw8w6ent787ktAOAcsk53NLNnJF2Q5qOtkv5SI2WYrJxzD0t6WJLa29vp3QNAkWQNdufcteneN7NPSFoq6WUzk6RWSYfM7HLn3HsFbSUAIGfTXqDknPulpHnJ12b2tqR259yJArQLADBNzGMHAJ8p2JYCzrklhfouAMD00WMHAJ8h2AHAZwh2APAZgh0AfIZgBwCfIdgBwGcIdgDwGYIdAHyGYAcAnyHYAcBnCHYA8BmCHQB8hmAHAJ8h2AHAZwh2APCZgu3HXmye59Q3GFMsnlA4FFRzfViBgJW6WQBQdioi2D3Pqev9AW3Y16Ge/qhamyLau75dy+c3EO4AMEFFlGL6BmOpUJeknv6oNuzrUN9grMQtA4DyUxHBHosnUqGe1NMfVSyeKFGLAKB8VUSwh0NBtTZFxr3X2hRROBQsUYsAoHxVRLA314e1d317KtyTNfbm+nCJWwYA5aciBk8DAdPy+Q06sHk1s2IAIIuKCHZpJNxbGmpL3QwAKHsVUYoBAOSOYAcAnyHYAcBnCHYA8BmCHQB8xpxzM39Ts15J70x4e66kEzPemPLAs1cnnr065fPsFznnWrJdVJJgT8fMOpxz7aVuRynw7Dx7teHZi/vslGIAwGcIdgDwmXIK9odL3YAS4tmrE89enYr+7GVTYwcAFEY59dgBAAVQVsFuZn9jZkfN7KXRn8+Wuk0zzczuMjNnZnNL3ZaZYmbbzezw6N/502Z2YanbNFPMbKeZvTH6/AfMrLHUbZopZnaTmb1qZp6Z+X6GjJldb2ZdZvammX2tmPcqq2Aftcs5d8noz09L3ZiZZGaLJF0n6TelbssM2+mca3POXSLpx5K+UeoGzaCDklY659ok/UrS10vcnpn0iqTPS/r3Ujek2MwsKOlBSZ+RtELSF8xsRbHuV47BXs12SbpHUlUNfDjnPhjzsl5V9PzOuaedc/HRlz+X1FrK9swk59zrzrmuUrdjhlwu6U3n3P8452KS/knS2mLdrByD/c7RX0sfMbOmUjdmppjZWklHnXMvl7otpWBm95pZt6RbVF099rG+KOnJUjcCRbFQUveY1z2j7xXFjB+0YWbPSLogzUdbJT0kabtGemzbJX1LI/+z+0KWZ/9LjZRhfOlcz+6ce8I5t1XSVjP7uqQ7Jf31jDawiLI9++g1WyXFJT02k20rtlyeHYU348HunLs2l+vMbK9G6q2+kenZzewTkpZKetnMpJFfxw+Z2eXOufdmsIlFk+vfu0aC7afyUbBne3Yzu0PSGknXOJ/NP57C37vfHZW0aMzr1tH3iqKsSjFmtmDMyxs0Mrjie865Xzrn5jnnljjnlmjk17TL/BLq2ZjZsjEv10p6o1RtmWlmdr1GxlU+55w7U+r2oGh+IWmZmS01s7CkP5H0o2LdrNzOPP2mmV2ikVLM25I2lbY5mCF/Z2bLJXka2fXzyyVuz0x6QFKtpIOjv6393DlXFc9vZjdI+rakFkk/MbOXnHOfLnGzisI5FzezOyU9JSko6RHn3KvFuh8rTwHAZ8qqFAMAyB/BDgA+Q7ADgM8Q7ADgMwQ7APgMwQ4APkOwA4DPEOwA4DP/C34AbCqfnDWhAAAAAElFTkSuQmCC\n",
"text/plain": [
"
"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"sns.scatterplot(\n",
" x=test_regressed_sf.log10_pval_clean(),\n",
" y=test_spline_sf.log10_pval_clean()\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Testing a multiple coefficients with a Wald test"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We know turn to tests that cannot be performed with T-tests or rank sum tests because they involve more than two groups (or more general: multiple coefficients)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Generate data:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We now simulate not only two conditions but four conditions, which results in 3 coefficients to be tested: Note that the first group is absorbed into the intercept as is standard in generalized linear models."
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Transforming to str index.\n"
]
}
],
"source": [
"from batchglm.api.models.tf1.glm_nb import Simulator\n",
"\n",
"sim = Simulator(num_observations=200, num_features=100)\n",
"sim.generate_sample_description(num_batches=0, num_conditions=4)\n",
"sim.generate_params(\n",
" rand_fn_loc=lambda shape: np.random.uniform(-0.1, 0.1, shape),\n",
" rand_fn_scale=lambda shape: np.random.uniform(0.1, 2, shape)\n",
")\n",
"sim.generate_data()\n",
"\n",
"data = anndata.AnnData(\n",
" X=sim.x,\n",
" var=pd.DataFrame(index=[\"gene\" + str(i) for i in range(sim.x.shape[1])]),\n",
" obs=sim.sample_description\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Run differential expression test"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can now choose whether we want to collectively test all coefficients of the condition factor or whether we test the significance of a selected set of coefficients."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Test a whole factor"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"I1101 08:30:21.557281 4648003008 estimator.py:51] training strategy:\n",
"{'max_steps': 1000, 'update_b_freq': 5}\n"
]
}
],
"source": [
"test_fac = de.test.wald(\n",
" data=data,\n",
" formula_loc=\"~ 1 + condition\",\n",
" factor_loc_totest=\"condition\"\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Again, we can look at results like before:"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"sns.scatterplot(\n",
" x=test_fac.log10_pval_clean(),\n",
" y=test_coef.log10_pval_clean()\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Running a test across multiple partitions of a data set"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In some scenarios, one wants to perform a test in multiple partitions of a data set. This can be testing the condition effect separately at each observed time point or in each cell type cluster for example."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"WATCH OUT: The use of expression-derived cell type cluster information is confounded with the tested expression."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Similar to what we describe here, one can also run a Welch's t-test, a rank sum test or a likelihood-ratio test on partitions of a data set."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Generate data:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We now simulate conditions across cell types."
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Transforming to str index.\n"
]
}
],
"source": [
"from batchglm.api.models.tf1.glm_nb import Simulator\n",
"\n",
"sim = Simulator(num_observations=200, num_features=100)\n",
"sim.generate_sample_description(num_batches=0, num_conditions=4)\n",
"sim.generate_params(\n",
" rand_fn_loc=lambda shape: np.random.uniform(-0.1, 0.1, shape),\n",
" rand_fn_scale=lambda shape: np.random.uniform(0.1, 2, shape)\n",
")\n",
"sim.generate_data()\n",
"sample_description = sim.sample_description\n",
"sample_description[\"cell_type\"] = np.repeat(\n",
" np.array([\"c1\", \"c2\", \"c3\", \"c4\"]), \n",
" int(sim.input_data.num_observations / 4)\n",
")\n",
"\n",
"data_part = anndata.AnnData(\n",
" X=sim.x,\n",
" var=pd.DataFrame(index=[\"gene\" + str(i) for i in range(sim.x.shape[1])]),\n",
" obs=sample_description\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Run differential expression test"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can now partition the data set by cell type and conduct a test across conditions in each cell type."
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"scrolled": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"I1101 08:30:25.862147 4648003008 estimator.py:51] training strategy:\n",
"{'max_steps': 1000, 'update_b_freq': 5}\n",
"I1101 08:30:27.281496 4648003008 estimator.py:51] training strategy:\n",
"{'max_steps': 1000, 'update_b_freq': 5}\n",
"I1101 08:30:28.577277 4648003008 estimator.py:51] training strategy:\n",
"{'max_steps': 1000, 'update_b_freq': 5}\n",
"I1101 08:30:30.068945 4648003008 estimator.py:51] training strategy:\n",
"{'max_steps': 1000, 'update_b_freq': 5}\n"
]
}
],
"source": [
"part = de.test.partition(\n",
" data=data_part,\n",
" parts=\"cell_type\"\n",
")\n",
"test_part = part.wald(\n",
" formula_loc=\"~ 1 + condition\",\n",
" factor_loc_totest=\"condition\"\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that there is one test and p-value for each partition for each gene now. We can summarize test statistics across partitions using summary:"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"