{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Python MAGIC EMT tutorial\n",
"\n",
"## MAGIC (Markov Affinity-Based Graph Imputation of Cells)\n",
"\n",
"- MAGIC imputes missing data values on sparse data sets, restoring the structure of the data\n",
"- It also provides dimensionality reduction and gene expression visualization\n",
"- MAGIC can be performed on a variety of datasets\n",
"- Here, we show the effectiveness of MAGIC on epithelial-to-mesenchymal transition (EMT) data\n",
" \n",
"Markov Affinity-based Graph Imputation of Cells (MAGIC) is an algorithm for denoising and transcript recovery in single-cell RNA sequencing data, as described in Van Dijk D et al. (2018), Recovering Gene Interactions from Single-Cell Data Using Data Diffusion, Cell, https://www.cell.com/cell/abstract/S0092-8674(18)30724-4.\n",
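"\n",
"As a rough illustration of the data-diffusion idea named in the paper title, the sketch below builds a Gaussian affinity matrix between cells, row-normalizes it into a Markov transition matrix, and averages each cell's expression over a few random-walk steps. It is not the `magic-impute` implementation: the function name `diffuse`, the fixed bandwidth `sigma`, the step count `t`, and the toy data are all assumptions made for this example.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"\n",
"def diffuse(X, sigma=1.0, t=3):\n",
"    # X: (cells x genes) array; sigma and t are illustrative assumptions\n",
"    X = np.asarray(X, dtype=float)\n",
"    # Squared Euclidean distance between every pair of cells\n",
"    sq_norms = np.sum(X ** 2, axis=1)\n",
"    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)\n",
"    sq_dists = np.maximum(sq_dists, 0.0)\n",
"    # Gaussian affinities, row-normalized into a Markov transition matrix\n",
"    affinity = np.exp(-sq_dists / (2.0 * sigma ** 2))\n",
"    markov = affinity / affinity.sum(axis=1, keepdims=True)\n",
"    # t random-walk steps share expression values among similar cells\n",
"    return np.linalg.matrix_power(markov, t) @ X\n",
"\n",
"\n",
"rng = np.random.default_rng(0)\n",
"toy = rng.poisson(1.0, size=(6, 4))  # hypothetical sparse counts\n",
"print(diffuse(toy).round(2))\n",
"```\n",
"\n",
"In this sketch, a larger `t` averages over longer walks on the cell graph and therefore smooths the data more strongly.\n",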
"\n",
"This tutorial covers loading, preprocessing, MAGIC imputation, and visualization of EMT data. You can edit it yourself at https://colab.research.google.com/github/KrishnaswamyLab/MAGIC/blob/master/python/tutorial_notebooks/emt_tutorial.ipynb\n",
"\n",
"### Table of Contents\n",
"\n",
"- Installation\n",
"- Loading data\n",
"- Data preprocessing\n",
"- Running MAGIC\n",
"- Visualizing gene-gene interactions\n",
"- Visualizing cell trajectories with PCA on MAGIC\n",
"- Using MAGIC data in downstream analysis\n",
"\n",
"\n",
"\n",
"### Installation \n",
"\n",
"If you haven't yet installed MAGIC, you can install it directly from this Jupyter Notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install --user magic-impute"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Importing MAGIC\n",
"\n",
"Here, we'll import MAGIC along with other popular packages that will come in handy."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import magic\n",
"import scprep\n",
"\n",
"import numpy as np\n",
"import pandas as pd\n",
"import matplotlib\n",
"import matplotlib.pyplot as plt\n",
"\n",
"# Matplotlib command for Jupyter notebooks only\n",
"%matplotlib inline"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Loading Data\n",
"\n",
"Load your data using one of the following `scprep.io` functions: `load_csv`, `load_tsv`, `load_fcs`, `load_mtx`, `load_10x`.\n",
"\n",
"You can read about how to use them with `help(scprep.io.load_csv)` or on https://scprep.readthedocs.io/."
]
},
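{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a minimal sketch of the loaders listed above, the cell below writes a tiny hypothetical count matrix to a temporary CSV file and reads it back with `scprep.io.load_csv`. The cell names, gene names, values, and the file name `toy_counts.csv` are made up for illustration; `cell_axis='row'` reflects the assumption that each row of the file is one cell."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import tempfile\n",
"\n",
"import pandas as pd\n",
"import scprep\n",
"\n",
"# Hypothetical toy counts: rows are cells, columns are genes\n",
"toy = pd.DataFrame(\n",
"    [[0, 3, 1], [2, 0, 0]],\n",
"    index=['cell_1', 'cell_2'],\n",
"    columns=['GeneA', 'GeneB', 'GeneC'],\n",
")\n",
"\n",
"with tempfile.TemporaryDirectory() as tmp:\n",
"    path = os.path.join(tmp, 'toy_counts.csv')\n",
"    toy.to_csv(path)\n",
"    # The first row holds gene names and the first column holds cell names\n",
"    loaded = scprep.io.load_csv(path, cell_axis='row')\n",
"\n",
"print(loaded.shape)\n",
"loaded.head()"
]
},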
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"|  | 5S_rRNA | 5_8S_rRNA | A1BG | A1BG-AS1 | A2M | A2M-AS1 | A2ML1 | A2ML1-AS1 | A4GALT | AAAS | ... | bP-2171C21.6 | chr22-38_28785274-29006793.1 | pk | snoU109 | snoU13 | snoU2-30 | snoU2_19 | snoZ196 | uc_338 | yR211F11.2 |\n",
"|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|\n",
"| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n",
"| 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n",
"| 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n",
"| 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n",
"| 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n",
"\n",
"5 rows × 28910 columns\n",
"\n",
"|  | 5S_rRNA | 5_8S_rRNA | A1BG | A1BG-AS1 | A2M | A2M-AS1 | A2ML1 | A2ML1-AS1 | A4GALT | AAAS | ... | bP-2171C21.6 | chr22-38_28785274-29006793.1 | pk | snoU109 | snoU13 | snoU2-30 | snoU2_19 | snoZ196 | uc_338 | yR211F11.2 |\n",
"|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|\n",
"| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n",
"| 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n",
"| 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n",
"| 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n",
"| 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n",
"\n",
"5 rows × 28910 columns\n",
"\n",
"|  | 5S_rRNA | 5_8S_rRNA | A1BG | A1BG-AS1 | A2M | A2M-AS1 | A2ML1 | A2ML1-AS1 | A4GALT | AAAS | ... | bP-2171C21.6 | chr22-38_28785274-29006793.1 | pk | snoU109 | snoU13 | snoU2-30 | snoU2_19 | snoZ196 | uc_338 | yR211F11.2 |\n",
"|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|\n",
"| 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.000000 | 0.000000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |\n",
"| 1 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.000000 | 3.186743 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |\n",
"| 2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 3.131851 | 0.000000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |\n",
"| 3 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.000000 | 0.000000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |\n",
"| 4 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.000000 | 0.000000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |\n",
"\n",
"5 rows × 28910 columns\n",
"\n",
"|  | CDH1 | VIM | ZEB1 |\n",
"|---|---|---|---|\n",
"| 0 | 1.241728 | 42.023885 | 0.009425 |\n",
"| 1 | 1.325326 | 45.031488 | 0.007015 |\n",
"| 2 | 0.929086 | 53.765334 | 0.059733 |\n",
"| 3 | 1.500046 | 31.481441 | 0.017711 |\n",
"| 4 | 0.784365 | 44.139448 | 0.007882 |\n",
"\n",
"|  | PC1 | PC2 | PC3 | PC4 | PC5 | PC6 | PC7 | PC8 | PC9 | PC10 | ... | PC91 | PC92 | PC93 | PC94 | PC95 | PC96 | PC97 | PC98 | PC99 | PC100 |\n",
"|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|\n",
"| 0 | 0.912814 | -18.696491 | -22.982646 | -32.488871 | 4.599422 | -17.646104 | -13.813356 | -19.805167 | 2.388570 | -4.619030 | ... | -0.388829 | -0.064307 | -0.738726 | -0.138937 | -0.747050 | -0.607294 | -0.239226 | -0.420499 | -0.309441 | 0.344087 |\n",
"| 1 | 7.065558 | 61.644281 | -112.929948 | 23.018526 | 1.633835 | 4.920916 | 8.173887 | -2.396617 | -1.942064 | -7.829058 | ... | -0.516180 | 1.463713 | 1.153961 | 1.642654 | -0.868328 | 0.147833 | -0.946518 | -0.521891 | 0.853309 | -0.239975 |\n",
"| 2 | -12.555619 | -57.807529 | 1.474938 | -8.535280 | 2.626631 | -13.697039 | 2.079862 | -1.538081 | 14.009396 | 1.824439 | ... | -0.571415 | 0.440768 | 2.178793 | -0.892517 | -0.366848 | 0.819920 | 0.240286 | 0.517275 | 0.146062 | 0.627752 |\n",
"| 3 | -107.818200 | 160.082200 | 15.099766 | 15.106573 | -11.030644 | 7.001513 | -11.414310 | 1.862218 | 18.806122 | -11.505140 | ... | 0.655219 | -0.736944 | 0.257272 | -0.551501 | 1.246661 | 0.687309 | 0.079348 | 0.480107 | 0.858098 | 1.597552 |\n",
"| 4 | -15.950164 | -25.788197 | -56.245822 | -29.657388 | 8.349882 | -12.885086 | -5.037923 | 2.962646 | -10.603164 | -13.955877 | ... | -2.423693 | -0.613807 | 0.724030 | -0.930602 | 0.730555 | 1.485058 | 0.809087 | 0.875107 | -0.036312 | 1.238325 |\n",
"\n",
"5 rows × 100 columns\n",
"\n",
"|  | 5S_rRNA | 5_8S_rRNA | A1BG | A1BG-AS1 | A2M | A2M-AS1 | A2ML1 | A2ML1-AS1 | A4GALT | AAAS | ... | bP-2171C21.6 | chr22-38_28785274-29006793.1 | pk | snoU109 | snoU13 | snoU2-30 | snoU2_19 | snoZ196 | uc_338 | yR211F11.2 |\n",
"|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|\n",
"| 0 | 5.764251e-05 | 1.533951e-04 | 0.003434 | 0.061766 | 0.000025 | 4.818677e-07 | 0.022142 | 1.013731e-06 | 0.028264 | 0.310166 | ... | 0.000050 | 0.590014 | 2.812985 | 0.003020 | 0.011418 | 0.000470 | 0.008308 | 5.631973e-05 | 0.050187 | 0.000001 |\n",
"| 1 | 4.580311e-06 | 3.944496e-05 | 0.003664 | 0.059943 | 0.000041 | 8.680537e-06 | 0.023411 | 2.492076e-06 | 0.017567 | 0.311516 | ... | 0.000003 | 0.568839 | 2.732127 | 0.000583 | 0.021170 | 0.000034 | 0.005238 | 7.709941e-07 | 0.033117 | 0.000001 |\n",
"| 2 | 5.582967e-05 | 1.564661e-04 | 0.002802 | 0.063171 | 0.000007 | 1.509788e-07 | 0.016387 | 4.905095e-07 | 0.030781 | 0.322526 | ... | 0.000083 | 0.584983 | 2.855123 | 0.004441 | 0.011834 | 0.000718 | 0.007337 | 1.519939e-04 | 0.054317 | 0.000002 |\n",
"| 3 | 4.720916e-07 | 1.636346e-07 | 0.002893 | 0.080461 | 0.000012 | 1.270483e-03 | 0.052790 | 8.470198e-04 | 0.044647 | 0.209806 | ... | 0.000050 | 0.528671 | 2.770666 | 0.002443 | 0.009754 | 0.000252 | 0.002916 | 5.943974e-06 | 0.026995 | 0.000342 |\n",
"| 4 | 2.949160e-05 | 1.334564e-04 | 0.003380 | 0.063658 | 0.000010 | 1.974939e-07 | 0.016863 | 2.620986e-07 | 0.023448 | 0.314540 | ... | 0.000051 | 0.607246 | 2.763649 | 0.002433 | 0.011103 | 0.000332 | 0.011267 | 7.097992e-05 | 0.046243 | 0.000002 |\n",
"\n",
"5 rows × 28910 columns\n",
"