{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "*First compiled: September 16, 2017.*" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "# Graph Abstraction for Deep Learning" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The graph of neighborhood relations of data points in pixel space is uninformative if computed with a simple distance metric (Euclidean, cosine, correlation-based, etc.). Deep learning generates a feature space in which data points are positioned according to biological similarity, and hence yields a far more informative distance metric. Here, we demonstrate how graph abstraction is applied in this case.\n", "\n", "[Eulenberg, Köhler, *et al.*, Nat. Commun. (2017)](https://doi.org/10.1101/081364) showed that continuous biological processes can be reconstructed using deep learning. Their results can be reproduced from https://github.com/theislab/deepflow. A video showing how the deep-learning-based feature space organizes data according to biological similarity is available here: https://youtu.be/eyWcHIiCazE. We'll use their analysis as a starting point." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Running Scanpy version 0.2.8+21.g7c3f248 on 2017-10-04 23:36.\n" ] } ], "source": [ "import numpy as np\n", "import pandas as pd\n", "import scipy as sp\n", "import scanpy.api as sc\n", "\n", "sc.settings.set_figure_params(dpi=70)\n", "sc.settings.verbosity = 1\n", "sc.logging.print_version_and_date()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Download the features of [Eulenberg, Köhler, *et al.*, Nat. Commun. (2017)](https://doi.org/10.1101/081364) [here](http://falexwolf.de/scanpy_usage/170529_images/)."
] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "reading file ./write/data/DeepFlow_data/stain/stain_G1SG2_features.h5\n" ] } ], "source": [ "adata = sc.read('./data/G1SG2/stain_G1SG2_features.csv', cache=True)" ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "collapsed": false }, "outputs": [], "source": [ "annotation = pd.read_csv('./data/G1SG2/stain_G1SG2.lst', delimiter='\\t', header=None)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
 | 0 | 1 | 2
---|---|---|---
0 | 2 | 0.584232 | ../images/G2/40531_bright_field.jpg
1 | 12 | 0.555460 | ../images/G2/9425_bright_field.jpg
2 | 23 | 0.833180 | ../images/G2/28599_bright_field.jpg
3 | 40 | 0.628845 | ../images/G2/33523_bright_field.jpg
4 | 50 | 0.522287 | ../images/G2/16060_bright_field.jpg