{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Python MAGIC EMT tutorial\n",
"\n",
"## MAGIC (Markov Affinity-Based Graph Imputation of Cells)\n",
"\n",
"- MAGIC imputes missing data values on sparse data sets, restoring the structure of the data\n",
"- It also proves dimensionality reduction and gene expression visualizations\n",
"- MAGIC can be performed on a variety of datasets\n",
"- Here, we show the effectiveness of MAGIC on epithelial-to-mesenchymal transition (EMT) data\n",
" \n",
"Markov Affinity-based Graph Imputation of Cells (MAGIC) is an algorithm for denoising and transcript recover of single cells applied to single-cell RNA sequencing data, as described in Van Dijk D et al. (2018), Recovering Gene Interactions from Single-Cell Data Using Data Diffusion, Cell https://www.cell.com/cell/abstract/S0092-8674(18)30724-4.\n",
"\n",
"This tutorial shows loading, preprocessing, MAGIC imputation and visualization of myeloid and erythroid cells in mouse bone marrow. You can edit it yourself at https://colab.research.google.com/github/KrishnaswamyLab/MAGIC/blob/master/python/tutorial_notebooks/emt_tutorial.ipynb\n",
"\n",
"### Table of Contents\n",
"\n",
"Installation\n",
"
\n",
"Loading data\n",
"
\n",
"Data preprocessing\n",
"
\n",
"Running MAGIC\n",
"
\n",
"Visualizing gene-gene interactions\n",
"
\n",
"Visualizing cell trajectories with PCA on MAGIC\n",
"
\n",
"Using MAGIC data in downstream analysis\n",
"\n",
"\n",
"\n",
"### Installation \n",
"\n",
"If you haven't yet installed MAGIC, we can install it directly from this Jupyter Notebook."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Collecting git+git://github.com/KrishnaswamyLab/MAGIC.git#subdirectory=python\n",
" Cloning git://github.com/KrishnaswamyLab/MAGIC.git to /tmp/pip-req-build-rsiw3uia\n",
"Requirement already satisfied: numpy>=1.14.0 in /usr/lib/python3.6/site-packages (from magic==1.0.0) (1.14.3)\n",
"Requirement already satisfied: pandas>=0.21.0 in /usr/lib/python3.6/site-packages (from magic==1.0.0) (0.23.1)\n",
"Requirement already satisfied: scipy>=1.1.0 in /usr/lib/python3.6/site-packages (from magic==1.0.0) (1.1.0)\n",
"Requirement already satisfied: matplotlib in /usr/lib/python3.6/site-packages (from magic==1.0.0) (2.2.2)\n",
"Requirement already satisfied: scikit-learn>=0.19.1 in /usr/lib/python3.6/site-packages (from magic==1.0.0) (0.19.1)\n",
"Requirement already satisfied: graphtools>=0.1.8 in /old/home/dager/.local/lib/python3.6/site-packages (from magic==1.0.0) (0.1.8)\n",
"Requirement already satisfied: python-dateutil>=2.5.0 in /usr/lib/python3.6/site-packages (from pandas>=0.21.0->magic==1.0.0) (2.7.3)\n",
"Requirement already satisfied: pytz>=2011k in /usr/lib/python3.6/site-packages (from pandas>=0.21.0->magic==1.0.0) (2018.5)\n",
"Requirement already satisfied: cycler>=0.10 in /usr/lib/python3.6/site-packages (from matplotlib->magic==1.0.0) (0.10.0)\n",
"Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /usr/lib/python3.6/site-packages (from matplotlib->magic==1.0.0) (2.2.0)\n",
"Requirement already satisfied: six>=1.10 in /usr/lib/python3.6/site-packages (from matplotlib->magic==1.0.0) (1.11.0)\n",
"Requirement already satisfied: kiwisolver>=1.0.1 in /usr/lib/python3.6/site-packages (from matplotlib->magic==1.0.0) (1.0.1)\n",
"Requirement already satisfied: pygsp>=0.5.1 in /old/home/dager/.local/lib/python3.6/site-packages (from graphtools>=0.1.8->magic==1.0.0) (0.5.1)\n",
"Requirement already satisfied: future in /old/home/dager/.local/lib/python3.6/site-packages (from graphtools>=0.1.8->magic==1.0.0) (0.16.0)\n",
"Requirement already satisfied: setuptools in /usr/lib/python3.6/site-packages (from kiwisolver>=1.0.1->matplotlib->magic==1.0.0) (39.2.0)\n",
"Building wheels for collected packages: magic\n",
" Running setup.py bdist_wheel for magic ... \u001b[?25ldone\n",
"\u001b[?25h Stored in directory: /tmp/pip-ephem-wheel-cache-fz42s06z/wheels/8c/34/81/1e929fa29dc80dfe23a343266d8ca3e007d9d3ce6fe899d112\n",
"Successfully built magic\n",
"\u001b[31mtensorflow 1.8.0 requires astor>=0.6.0, which is not installed.\u001b[0m\n",
"\u001b[31mtensorflow 1.8.0 requires gast>=0.2.0, which is not installed.\u001b[0m\n",
"\u001b[31mtensorflow 1.8.0 requires grpcio>=1.8.6, which is not installed.\u001b[0m\n",
"\u001b[31mtensorflow 1.8.0 requires termcolor>=1.1.0, which is not installed.\u001b[0m\n",
"\u001b[31mscikit-image 0.13.0 requires PyWavelets>=0.4.0, which is not installed.\u001b[0m\n",
"\u001b[31mpycuda 2017.1.1 requires pytest>=2, which is not installed.\u001b[0m\n",
"\u001b[31mpandas-datareader 0.6.0 requires wrapt, which is not installed.\u001b[0m\n",
"Installing collected packages: magic\n",
"Successfully installed magic-1.0.0\n"
]
}
],
"source": [
"!pip install --user git+git://github.com/KrishnaswamyLab/MAGIC.git#subdirectory=python"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Importing MAGIC\n",
"\n",
"Here, we'll import MAGIC along with other popular packages that will come in handy."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"import magic\n",
"\n",
"import numpy as np\n",
"import pandas as pd\n",
"import matplotlib\n",
"import matplotlib.pyplot as plt\n",
"\n",
"# Matplotlib command for Jupyter notebooks only\n",
"%matplotlib inline"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Loading Data\n",
"\n",
"Load your data using one of the following `magic.io` methods: `load_csv`,`load_tsv`,`load_fcs`,`load_mtx`,`load_10x`. \n",
"\n",
"You can read about how to use them with `help(magic.io.load_csv)` or on https://magic.readthedocs.io/."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n", " | 5_8S_rRNA | \n", "A1BG | \n", "A1BG-AS1 | \n", "A2M | \n", "A2M-AS1 | \n", "A2ML1 | \n", "A2ML1-AS1 | \n", "A4GALT | \n", "AAAS | \n", "AACS | \n", "... | \n", "bP-2171C21.6 | \n", "chr22-38_28785274-29006793.1 | \n", "pk | \n", "snoU109 | \n", "snoU13 | \n", "snoU2-30 | \n", "snoU2_19 | \n", "snoZ196 | \n", "uc_338 | \n", "yR211F11.2 | \n", "
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
5S_rRNA | \n", "\n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " |
0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "1 | \n", "... | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "
0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "... | \n", "0 | \n", "0 | \n", "1 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "
0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "... | \n", "0 | \n", "1 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "
0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "... | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "
0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "... | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "
5 rows × 28909 columns
\n", "\n", " | 5_8S_rRNA | \n", "A1BG | \n", "A1BG-AS1 | \n", "A2M | \n", "A2M-AS1 | \n", "A2ML1 | \n", "A2ML1-AS1 | \n", "A4GALT | \n", "AAAS | \n", "AACS | \n", "... | \n", "bP-2171C21.6 | \n", "chr22-38_28785274-29006793.1 | \n", "pk | \n", "snoU109 | \n", "snoU13 | \n", "snoU2-30 | \n", "snoU2_19 | \n", "snoZ196 | \n", "uc_338 | \n", "yR211F11.2 | \n", "
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
5S_rRNA | \n", "\n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " |
0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "1 | \n", "... | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "
0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "... | \n", "0 | \n", "0 | \n", "1 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "
0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "... | \n", "0 | \n", "1 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "
0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "... | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "
0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "... | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "0 | \n", "
5 rows × 28909 columns
\n", "\n", " | 5_8S_rRNA | \n", "A1BG | \n", "A1BG-AS1 | \n", "A2M | \n", "A2M-AS1 | \n", "A2ML1 | \n", "A2ML1-AS1 | \n", "A4GALT | \n", "AAAS | \n", "AACS | \n", "... | \n", "bP-2171C21.6 | \n", "chr22-38_28785274-29006793.1 | \n", "pk | \n", "snoU109 | \n", "snoU13 | \n", "snoU2-30 | \n", "snoU2_19 | \n", "snoZ196 | \n", "uc_338 | \n", "yR211F11.2 | \n", "
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
5S_rRNA | \n", "\n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " |
0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "1.002377 | \n", "... | \n", "0.0 | \n", "0.000000 | \n", "0.00000 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "
0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.000000 | \n", "... | \n", "0.0 | \n", "0.000000 | \n", "1.34385 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "
0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.000000 | \n", "... | \n", "0.0 | \n", "1.320702 | \n", "0.00000 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "
0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.000000 | \n", "... | \n", "0.0 | \n", "0.000000 | \n", "0.00000 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "
0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.000000 | \n", "... | \n", "0.0 | \n", "0.000000 | \n", "0.00000 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "0.0 | \n", "
5 rows × 28909 columns
\n", "\n", " | CDH1 | \n", "VIM | \n", "ZEB1 | \n", "
---|---|---|---|
5S_rRNA | \n", "\n", " | \n", " | \n", " |
0 | \n", "0.470840 | \n", "18.768920 | \n", "0.020094 | \n", "
0 | \n", "0.452545 | \n", "17.342712 | \n", "0.020327 | \n", "
0 | \n", "0.450335 | \n", "20.126832 | \n", "0.024562 | \n", "
0 | \n", "0.586670 | \n", "14.195153 | \n", "0.014579 | \n", "
0 | \n", "0.441815 | \n", "19.200228 | \n", "0.021331 | \n", "
\n", " | PC1 | \n", "PC2 | \n", "PC3 | \n", "PC4 | \n", "PC5 | \n", "PC6 | \n", "PC7 | \n", "PC8 | \n", "PC9 | \n", "PC10 | \n", "... | \n", "PC91 | \n", "PC92 | \n", "PC93 | \n", "PC94 | \n", "PC95 | \n", "PC96 | \n", "PC97 | \n", "PC98 | \n", "PC99 | \n", "PC100 | \n", "
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
5S_rRNA | \n", "\n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " |
0 | \n", "-0.442736 | \n", "-12.775098 | \n", "-4.966400 | \n", "-6.013094 | \n", "-3.935735 | \n", "-1.701344 | \n", "-0.393281 | \n", "-0.268059 | \n", "0.406253 | \n", "-0.054690 | \n", "... | \n", "-0.027859 | \n", "-0.005808 | \n", "-0.037856 | \n", "0.021984 | \n", "-0.079401 | \n", "-0.022226 | \n", "0.132161 | \n", "0.002382 | \n", "0.016514 | \n", "0.016606 | \n", "
0 | \n", "1.390609 | \n", "20.902626 | \n", "-29.119468 | \n", "4.943013 | \n", "-0.355160 | \n", "1.257052 | \n", "0.234222 | \n", "0.398224 | \n", "0.902390 | \n", "-1.375229 | \n", "... | \n", "-0.075673 | \n", "0.029491 | \n", "-0.046063 | \n", "-0.057150 | \n", "-0.021239 | \n", "0.061170 | \n", "0.075315 | \n", "-0.063421 | \n", "0.091907 | \n", "0.090922 | \n", "
0 | \n", "-7.669914 | \n", "-22.456203 | \n", "-1.263977 | \n", "-5.836555 | \n", "-3.470875 | \n", "-2.354327 | \n", "-0.103232 | \n", "-0.675641 | \n", "-0.139599 | \n", "0.379178 | \n", "... | \n", "-0.002656 | \n", "-0.033608 | \n", "-0.033250 | \n", "0.067121 | \n", "-0.082855 | \n", "-0.034357 | \n", "0.107429 | \n", "-0.017970 | \n", "-0.016552 | \n", "-0.000571 | \n", "
0 | \n", "-38.678416 | \n", "48.573566 | \n", "4.139594 | \n", "2.379985 | \n", "-0.737370 | \n", "0.667385 | \n", "1.034789 | \n", "0.703243 | \n", "-1.325044 | \n", "-0.424376 | \n", "... | \n", "0.121966 | \n", "0.087281 | \n", "0.048118 | \n", "-0.061236 | \n", "0.003436 | \n", "0.042834 | \n", "-0.024626 | \n", "0.027015 | \n", "0.022872 | \n", "-0.021877 | \n", "
0 | \n", "-6.668670 | \n", "-14.098860 | \n", "-11.448382 | \n", "-5.098441 | \n", "-3.353138 | \n", "-1.019830 | \n", "0.003051 | \n", "0.104561 | \n", "0.102693 | \n", "-0.468140 | \n", "... | \n", "-0.021234 | \n", "-0.044628 | \n", "-0.068291 | \n", "0.036636 | \n", "-0.057767 | \n", "-0.045748 | \n", "0.097285 | \n", "-0.025363 | \n", "0.006584 | \n", "0.024934 | \n", "
5 rows × 100 columns
\n", "\n", " | 5_8S_rRNA | \n", "A1BG | \n", "A1BG-AS1 | \n", "A2M | \n", "A2M-AS1 | \n", "A2ML1 | \n", "A2ML1-AS1 | \n", "A4GALT | \n", "AAAS | \n", "AACS | \n", "... | \n", "bP-2171C21.6 | \n", "chr22-38_28785274-29006793.1 | \n", "pk | \n", "snoU109 | \n", "snoU13 | \n", "snoU2-30 | \n", "snoU2_19 | \n", "snoZ196 | \n", "uc_338 | \n", "yR211F11.2 | \n", "
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
5S_rRNA | \n", "\n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " |
0 | \n", "0.000268 | \n", "0.001872 | \n", "0.028677 | \n", "0.000066 | \n", "-0.000002 | \n", "0.008798 | \n", "-0.000027 | \n", "0.012401 | \n", "0.127582 | \n", "0.220276 | \n", "... | \n", "0.000102 | \n", "0.244597 | \n", "1.186667 | \n", "0.001656 | \n", "0.004390 | \n", "0.000438 | \n", "0.001343 | \n", "0.000233 | \n", "0.021397 | \n", "0.000100 | \n", "
0 | \n", "0.000191 | \n", "0.001136 | \n", "0.032752 | \n", "0.000089 | \n", "0.000142 | \n", "0.010621 | \n", "0.000037 | \n", "0.006106 | \n", "0.126881 | \n", "0.216288 | \n", "... | \n", "-0.000041 | \n", "0.237189 | \n", "1.166234 | \n", "0.000477 | \n", "0.004618 | \n", "0.000190 | \n", "0.001491 | \n", "-0.000059 | \n", "0.015732 | \n", "-0.000106 | \n", "
0 | \n", "0.000249 | \n", "0.002114 | \n", "0.026938 | \n", "0.000054 | \n", "-0.000034 | \n", "0.008515 | \n", "-0.000019 | \n", "0.013400 | \n", "0.127766 | \n", "0.220840 | \n", "... | \n", "0.000181 | \n", "0.242696 | \n", "1.187458 | \n", "0.001921 | \n", "0.004160 | \n", "0.000456 | \n", "0.001315 | \n", "0.000287 | \n", "0.022830 | \n", "0.000122 | \n", "
0 | \n", "-0.000068 | \n", "0.003063 | \n", "0.033745 | \n", "0.000099 | \n", "0.000477 | \n", "0.019462 | \n", "0.000236 | \n", "0.017406 | \n", "0.105561 | \n", "0.232602 | \n", "... | \n", "0.000216 | \n", "0.216487 | \n", "1.157581 | \n", "0.000599 | \n", "0.004030 | \n", "0.000312 | \n", "0.001397 | \n", "0.000006 | \n", "0.012409 | \n", "0.000276 | \n", "
0 | \n", "0.000252 | \n", "0.001726 | \n", "0.027796 | \n", "0.000049 | \n", "-0.000021 | \n", "0.007596 | \n", "-0.000028 | \n", "0.010562 | \n", "0.130447 | \n", "0.219588 | \n", "... | \n", "0.000114 | \n", "0.241396 | \n", "1.177227 | \n", "0.001586 | \n", "0.004302 | \n", "0.000404 | \n", "0.001455 | \n", "0.000230 | \n", "0.021483 | \n", "0.000095 | \n", "
5 rows × 28909 columns
\n", "