{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Overview of CyTOF Data\n", "The original data was provided as two tab-separated matrices:\n", "* ``Plasma.txt`` (original name: 160202_CGI002_Plasma_Plasma_singlets.fcs_raw_events.txt)\n", "* ``PMA.txt`` (original name: 160202_CGI002_PMA_PMA_singlets.fcs_raw_events.txt)\n", "\n", "These files have individual cell measurements as rows and dimensions (e.g. antibodies) as columns. I kept only the dimensions of interest (the surface marker and phospho marker antibody columns) and renamed the files. I then semi-automatically identified roughly defined cell types using hierarchical clustering and the surface markers associated with each cell type.\n", "\n", "The files with assigned cell types are ``Plasma_CT.txt`` and ``PMA_CT.txt``." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "from clustergrammer_widget import *\n", "net = Network(clustergrammer_widget)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Plasma" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(110000, 28)\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/Users/nickfernandez/anaconda/lib/python2.7/site-packages/sklearn/cluster/k_means_.py:1382: RuntimeWarning: init_size=300 should be larger than k=1000. 
Setting it to 3*k\n", " init_size=init_size)\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "(1000, 28)\n" ] } ], "source": [ "# load Plasma treated data with defined cell types\n", "net.load_file('../cytof_data/Plasma_UCT.txt')\n", "\n", "# subsample the data so that both treatments have the same number of cells\n", "net.random_sample(axis='row',num_samples=110000, random_state=99)\n", "df_plasma = net.export_df()\n", "print(df_plasma.shape)\n", "\n", "net.normalize(axis='col', norm_type='zscore', keep_orig=False)\n", "net.downsample(ds_type='kmeans', axis='row', num_samples=1000)\n", "print(net.dat['mat'].shape)\n", "\n", "# clip z-scores since we do not care about extreme outliers\n", "net.clip(-10,10)\n", "net.write_matrix_to_tsv('../cytof_data/ds_plasma.txt')" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": true }, "outputs": [], "source": [ "net.set_cat_color('row', 1, 'Majority-Treatment: Plasma', 'blue')\n", "net.set_cat_color('row', 1, 'Majority-Treatment: PMA', 'red')\n", "\n", "# cell type category colors\n", "net.set_cat_color('row', 2, 'Majority-Category: CD14hi monocytes', 'yellow')\n", "net.set_cat_color('row', 2, 'Majority-Category: CD4 Tcells', 'blue')\n", "net.set_cat_color('row', 2, 'Majority-Category: NK cells_CD16hi', 'red')\n", "net.set_cat_color('row', 2, 'Majority-Category: NK cells_CD16hi_CD57hi', 'orange')\n", "net.set_cat_color('row', 2, 'Majority-Category: NK cells_CD56hi', '#FF6347')\n", "\n", "net.set_cat_color('col', 1, 'Marker-type: phospho marker', 'red')\n", "net.set_cat_color('col', 1, 'Marker-type: surface marker', 'blue')" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "239191a72ccb425b8f091b7985cd1a5d" } }, "metadata": {}, "output_type": "display_data" } ], "source": [ "net.cluster(views=[])\n", "net.widget()" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "# PMA" ] }, { 
"cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "22fc7751aae646f6999ead7aed0eea68" } }, "metadata": {}, "output_type": "display_data" } ], "source": [ "net.load_file('../cytof_data/PMA_UCT.txt')\n", "net.random_sample(axis='row',num_samples=110000, random_state=99)\n", "df_pma = net.export_df()\n", "\n", "net.load_df(df_pma)\n", "\n", "net.normalize(axis='col', norm_type='zscore', keep_orig=False)\n", "net.downsample(ds_type='kmeans', axis='row', num_samples=1000)\n", "net.dat['mat'].shape\n", "net.clip(-10,10)\n", "net.write_matrix_to_tsv('../cytof_data/ds_pma.txt')\n", "\n", "net.cluster(views=[])\n", "net.widget()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Plasma vs PMA Treated\n", "\n", "### Merge Plasma and PMA" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(220000, 28)\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/Users/nickfernandez/anaconda/lib/python2.7/site-packages/sklearn/cluster/k_means_.py:1382: RuntimeWarning: init_size=300 should be larger than k=2000. 
Setting it to 3*k\n", " init_size=init_size)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "837bba60fd0a4eebaaef8bc4a31090dc" } }, "metadata": {}, "output_type": "display_data" } ], "source": [ "df_merge = pd.concat([df_plasma, df_pma])\n", "print(df_merge.shape)\n", "net.load_df(df_merge)\n", "net.normalize(axis='col', norm_type='zscore', keep_orig=False)\n", "net.downsample(ds_type='kmeans', axis='row', num_samples=2000)\n", "net.clip(-10,10)\n", "net.dat['mat'].shape\n", "net.cluster(views=[])\n", "net.widget()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Plasma vs PMA based on Surface markers only" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(2000, 18)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "31d3fefbd5594e5ba3b39475ceda3399" } }, "metadata": {}, "output_type": "display_data" } ], "source": [ "df_merge = pd.concat([df_plasma, df_pma])\n", "net.load_df(df_merge)\n", "\n", "net.filter_cat('col', 1, 'Marker-type: surface marker')\n", "net.normalize(axis='col', norm_type='zscore', keep_orig=False)\n", "net.downsample(ds_type='kmeans', axis='row', num_samples=2000)\n", "net.clip(-10,10)\n", "print(net.dat['mat'].shape)\n", "\n", "net.cluster(views=[])\n", "net.widget()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Plasma vs PMA based on Phospho markers only" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(2000, 10)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "d5a17ef73c494b7dbac42c2f1db5baae" } }, "metadata": {}, "output_type": "display_data" } ], "source": [ "df_merge = pd.concat([df_plasma, df_pma])\n", "net.load_df(df_merge)\n", "\n", "net.filter_cat('col', 1, 'Marker-type: phospho marker')\n", "net.normalize(axis='col', norm_type='zscore', 
keep_orig=False)\n", "net.downsample(ds_type='kmeans', axis='row', num_samples=2000)\n", "net.clip(-10,10)\n", "print(net.dat['mat'].shape)\n", "\n", "net.cluster(views=[])\n", "net.widget()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "PMA- and Plasma-treated cells separate more clearly on phospho markers than on surface markers. This makes sense, since PMA treatment is expected to influence phosphorylation levels." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We see a cluster of monocytes and granulocytes with high levels of the phosphorylation markers pCREB, pMAPKAP2, pERK1 2, and pp38. Below we export this cluster by selecting it with the interactive dendrogram and calling the widget DataFrame export method, ``widget_df``:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": true }, "outputs": [], "source": [ "df_CD14hi = net.widget_df()" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "4ac5863b87894a07abefb6674c90975f" } }, "metadata": {}, "output_type": "display_data" } ], "source": [ "net.load_df(df_CD14hi)\n", "net.cluster(views=[])\n", "net.widget()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python [Root]", "language": "python", "name": "Python [Root]" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.12" }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": {}, "version_major": 1, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 1 }