{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "[[<- back to pattern_classification](https://github.com/rasbt/pattern_classification)]" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%load_ext watermark" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Sebastian Raschka 07/17/2015 \n", "\n", "CPython 3.4.3\n", "IPython 3.2.0\n" ] } ], "source": [ "%watermark -a \"Sebastian Raschka\" -d -v" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Hierarchical Agglomerative Clustering - Complete Linkage Clustering" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### - A quick Python tutorial" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Sections" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- [Generating some Sample Data](#Generating-some-Sample-Data)\n", "- [Pair-wise Distance Matrix, Rows](#Pair-wise-Distance-Matrix,-Rows)\n", "- [Apply Clustering - Complete Linkage](#Apply-Clustering---Complete-Linkage)\n", "- [Heatmaps](#Heatmaps)\n", " - [Heatmap of the Original Data](#Heatmap-of-the-Original-Data)\n", " - [Heatmap after Row-Clustering](#Heatmap-after-Row-Clustering)\n", " - [Heatmap plus Row-Dendrogram](#[Heatmap-plus-Row-Dendrogram)\n", " - [Adding a Column Dendrogram](#Adding-a-Column-Dendrogram)\n", "- [Important Warning About Adding Dendrograms](#Important-Warning-About-Adding-Dendrograms)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**This is a more technical-oriented tutorial that was born out of necessity to plot a heatmap including the dendrograms from a complete linkage clustering. Since I couldn't find any good resource online, I thought that it might be worthwhile sharing it. **" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Therefore, the theoretical aspects are a little bit brief, but nonetheless, here is a short introduction of what it is all about.**\n", "\n", "Complete linkage is one implementation of *hierarchical agglomerative clustering*. The principle of hierarchical agglomerative clustering is to start with a singleton cluster, and clusters are iteratively merged until one single cluster remains. This results in a \"cluster tree,\" which is also called *dendrogram*. The opposite approach -- starting with one cluster and divide into clusters until only singleton clusters remain -- is called *divisive hierarchical clustering*.\n", "\n", "The algorithm can be summarized via the following pseudocode\n", "\n", "
\n", "\n", "1: Compute a distance or similarity matrix. \n", "2: Each data point is represented as a singleton cluster. \n", "3: **Repeat** \n", "4:      Merge two closest clusters (e.g., based on distance between most similar or dissimilar members). \n", "5:      Update the distance (or similarity) matrix. \n", "6: **Until** one single cluster remains. \n", "
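The pseudocode above can be sketched in a few lines of plain NumPy. This is a deliberately naive illustration, not how SciPy implements it: the function name `naive_agglomerative`, its `linkage_fn` parameter, and the toy points are assumptions made up for this example. Passing `max` gives complete linkage and passing `min` gives single linkage.

```python
import numpy as np


def naive_agglomerative(points, linkage_fn=max):
    # Hypothetical helper for illustration only.
    # linkage_fn=max -> complete linkage, linkage_fn=min -> single linkage.
    points = np.asarray(points, dtype=float)
    # Step 1: pair-wise Euclidean distance matrix.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Step 2: every data point starts as its own singleton cluster.
    clusters = [[i] for i in range(len(points))]
    merges = []
    # Steps 3-6: repeat until one single cluster remains.
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage_fn(dist[i, j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), float(d)))
        # Step 5 is implicit: cluster distances are recomputed from `dist` in every pass.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return merges


toy = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 0.0], [5.0, 1.5]])
for left, right, d in naive_agglomerative(toy):
    print(left, right, round(d, 3))
# [0] [1] 1.0
# [2] [3] 1.5
# [0, 1] [2, 3] 5.22
```

Recomputing every cluster distance in each pass keeps the sketch short but makes it slow for anything beyond toy data; for real work, `scipy.cluster.hierarchy.linkage` is the better choice.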
\n", "\n", "Complete linkage compares the most dissimilar members of each pair of clusters in every iteration. The two clusters whose most dissimilar members are closest to each other, i.e., the pair with the smallest maximum pair-wise distance, are merged into a new cluster. \n", "\\begin{equation}\n", "d(C,D) = \\max_{i,j} \\, \\mathrm{dist}(C_i, D_j)\n", "\\end{equation}\n", "where $C_i$ ranges over all points in cluster $C$ and $D_j$ over all points in cluster $D$.\n", "\n", "\n", "In contrast, the *single linkage* algorithm compares the two most similar members instead of the most dissimilar ones.\n", "\\begin{equation}\n", "d(C,D) = \\min_{i,j} \\, \\mathrm{dist}(C_i, D_j)\n", "\\end{equation}\n", "where $C_i$ ranges over all points in cluster $C$ and $D_j$ over all points in cluster $D$.\n", "\n", "For more linkage methods, e.g., `centroid`, `average`, etc., please see the [`scipy.cluster.hierarchy.linkage` documentation](http://docs.scipy.org/doc/scipy/reference/generated/scipy.cluster.hierarchy.linkage.html#scipy.cluster.hierarchy.linkage)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**(A more detailed article about prototype-based, hierarchical, and density-based clustering is in the planning stage)**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
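As a small worked example of the two cluster distances defined above, the following sketch computes the complete and the single linkage distance between two made-up clusters. The arrays `C` and `D` are arbitrary illustrative points, not part of the tutorial data.

```python
import numpy as np
from scipy.spatial.distance import cdist

# Two toy clusters on a line (illustrative values only).
C = np.array([[0.0, 0.0], [1.0, 0.0]])
D = np.array([[4.0, 0.0], [6.0, 0.0]])

# All pair-wise distances between members of C (rows) and members of D (columns).
pairwise = cdist(C, D, metric='euclidean')

d_complete = pairwise.max()  # distance between the most dissimilar members
d_single = pairwise.min()    # distance between the most similar members
print(d_complete, d_single)
# 6.0 3.0
```

The complete linkage distance (6.0) comes from the two outermost points, the single linkage distance (3.0) from the two innermost ones.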
\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Generating some Sample Data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[[back to top](#Sections)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, we generate some random sample data to work with. Like in a typical application, the rows represent different observations (Samples 1-5), and the columns are the different features (*X, Y, Z*). " ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
             X         Y         Z
ID_0  6.964692  2.861393  2.268515
ID_1  5.513148  7.194690  4.231065
ID_2  9.807642  6.848297  4.809319
ID_3  3.921175  3.431780  7.290497
ID_4  4.385722  0.596779  3.980443
\n", "
" ], "text/plain": [ " X Y Z\n", "ID_0 6.964692 2.861393 2.268515\n", "ID_1 5.513148 7.194690 4.231065\n", "ID_2 9.807642 6.848297 4.809319\n", "ID_3 3.921175 3.431780 7.290497\n", "ID_4 4.385722 0.596779 3.980443" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "import numpy as np\n", "\n", "np.random.seed(123)\n", "\n", "variables = ['X', 'Y', 'Z']\n", "labels = ['ID_0','ID_1','ID_2','ID_3','ID_4']\n", "\n", "X = np.random.random_sample([5,3])*10\n", "df = pd.DataFrame(X, columns=variables, index=labels)\n", "df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Pair-wise Distance Matrix, Rows" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[[back to top](#Sections)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, we calculate the pair-wise distances for every row (i.e, between every sample across the different variables). We will use the default euclidean distance measure. The other available distance measures are listed in the [`scipy.spatial.distance.pdist` documentation](http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.spatial.distance.pdist.html#scipy.spatial.distance.pdist) with a nice and short explanatory paragraph. In addition, we use the [`squareform`](http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.spatial.distance.squareform.html#scipy.spatial.distance.squareform) function to return a symmetrical matrix of the pair-wise distances." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
          ID_0      ID_1      ID_2      ID_3      ID_4
ID_0  0.000000  4.973534  5.516653  5.899885  3.835396
ID_1  4.973534  0.000000  4.347073  5.104311  6.698233
ID_2  5.516653  4.347073  0.000000  7.244262  8.316594
ID_3  5.899885  5.104311  7.244262  0.000000  4.382864
ID_4  3.835396  6.698233  8.316594  4.382864  0.000000
\n", "
" ], "text/plain": [ " ID_0 ID_1 ID_2 ID_3 ID_4\n", "ID_0 0.000000 4.973534 5.516653 5.899885 3.835396\n", "ID_1 4.973534 0.000000 4.347073 5.104311 6.698233\n", "ID_2 5.516653 4.347073 0.000000 7.244262 8.316594\n", "ID_3 5.899885 5.104311 7.244262 0.000000 4.382864\n", "ID_4 3.835396 6.698233 8.316594 4.382864 0.000000" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from scipy.spatial.distance import pdist,squareform\n", "\n", "row_dist = pd.DataFrame(squareform(pdist(df, metric='euclidean')), columns=labels, index=labels)\n", "row_dist" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
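To make the difference between the condensed output of `pdist` and the square matrix returned by `squareform` concrete, here is a minimal sketch. It regenerates the sample array under the assumption that `RandomState(123)` yields the same numbers as the seeded call in the tutorial; the names `X_demo`, `condensed`, and `square` are made up for this example.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Same recipe as the sample data above (5 samples, 3 features).
rng = np.random.RandomState(123)
X_demo = rng.random_sample([5, 3]) * 10

condensed = pdist(X_demo, metric='euclidean')  # flat upper triangle
square = squareform(condensed)                 # symmetric 5 x 5 matrix

print(condensed.shape)  # (10,) because 5 * 4 / 2 = 10 pairs
print(square.shape)     # (5, 5)

# The condensed vector is the upper triangle of the square matrix, row by row.
upper = np.triu_indices(5, k=1)
print(np.allclose(square[upper], condensed))

# squareform also converts back from the square to the condensed form.
print(np.allclose(squareform(square), condensed))
```

The square form is convenient for display in a DataFrame, while the condensed form is what the clustering functions expect.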
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Apply Clustering - Complete Linkage" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[[back to top](#Sections)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When we apply the complete linkage agglomeration to our clusters, the `linkage` function returns a so-called `linkage matrix`. \n", "This `linkage matrix` consists of several rows where each row consists of 1 merge. The first and second column denote the most dissimilar members in each cluster, and the third row reports the distance between those members. The last column returns the count of members in the clusters." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "However, before we call the `linkage` function, let us take a careful look at its [documentation](http://docs.scipy.org/doc/scipy/reference/generated/scipy.cluster.hierarchy.linkage.html#scipy.cluster.hierarchy.linkage). \n", "\n", "> Parameters:\t\n", "**y** : ndarray \n", " A condensed or redundant distance matrix. A condensed distance matrix is a flat array containing the upper triangular of the distance matrix. This is the form that pdist returns. Alternatively, a collection of m observation vectors in n dimensions may be passed as an m by n array. \n", " \n", "> **method** : str, optional \n", "The linkage algorithm to use. See the Linkage Methods section below for full descriptions.\n", "\n", "> **metric** : str, optional \n", "The distance metric to use. See the distance.pdist function for a list of valid distance metrics.\n", "\n", "> Returns:\t\n", "**Z** : ndarray \n", "The hierarchical clustering encoded as a linkage matrix.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Thus, we can either pass a condensed distance matrix (upper triangular) from the `pdist` function, or we can pass the \"original\" data array and define the `'euclidean'` metric as function argument in `linkage`. 
However, we shouldn't pass the square-form distance matrix returned by `squareform`: `linkage` would treat its rows as observation vectors and compute distances between those rows, which yields incorrect distance values even though the overall clustering may turn out the same. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### a) Squareform distance matrix (wrong)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
           row label 1  row label 2   distance  no. of items in clust.
cluster 1            0            4   6.521973                       2
cluster 2            1            2   6.729603                       2
cluster 3            3            5   8.539247                       3
cluster 4            6            7  12.444824                       5
\n", "
" ], "text/plain": [ " row label 1 row label 2 distance no. of items in clust.\n", "cluster 1 0 4 6.521973 2\n", "cluster 2 1 2 6.729603 2\n", "cluster 3 3 5 8.539247 3\n", "cluster 4 6 7 12.444824 5" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from scipy.cluster.hierarchy import linkage\n", "\n", "row_clusters = linkage(row_dist, method='complete', metric='euclidean')\n", "pd.DataFrame(row_clusters, \n", " columns=['row label 1', 'row label 2', 'distance', 'no. of items in clust.'],\n", " index=['cluster %d' %(i+1) for i in range(row_clusters.shape[0])])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### b) Condensed distance matrix (correct)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
           row label 1  row label 2  distance  no. of items in clust.
cluster 1            0            4  3.835396                       2
cluster 2            1            2  4.347073                       2
cluster 3            3            5  5.899885                       3
cluster 4            6            7  8.316594                       5
\n", "
" ], "text/plain": [ " row label 1 row label 2 distance no. of items in clust.\n", "cluster 1 0 4 3.835396 2\n", "cluster 2 1 2 4.347073 2\n", "cluster 3 3 5 5.899885 3\n", "cluster 4 6 7 8.316594 5" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "row_clusters = linkage(pdist(df, metric='euclidean'), method='complete')\n", "pd.DataFrame(row_clusters, \n", " columns=['row label 1', 'row label 2', 'distance', 'no. of items in clust.'],\n", " index=['cluster %d' %(i+1) for i in range(row_clusters.shape[0])])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### c) Input sample matrix (correct)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
           row label 1  row label 2  distance  no. of items in clust.
cluster 1            0            4  3.835396                       2
cluster 2            1            2  4.347073                       2
cluster 3            3            5  5.899885                       3
cluster 4            6            7  8.316594                       5
\n", "
" ], "text/plain": [ " row label 1 row label 2 distance no. of items in clust.\n", "cluster 1 0 4 3.835396 2\n", "cluster 2 1 2 4.347073 2\n", "cluster 3 3 5 5.899885 3\n", "cluster 4 6 7 8.316594 5" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "row_clusters = linkage(df.values, method='complete', metric='euclidean')\n", "pd.DataFrame(row_clusters, \n", " columns=['row label 1', 'row label 2', 'distance', 'no. of items in clust.'],\n", " index=['cluster %d' %(i+1) for i in range(row_clusters.shape[0])])" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAWYAAAD8CAYAAABErA6HAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAADdZJREFUeJzt3XusJGWZx/HfjxmRi6vM6MYbmBlxvZCwLmPEiRot1ktG\nsl6iJkJUFC9/mF1hvQX9w0zPX8ao8bZZEu/gjWTxnuCqGEpBzSgwIDrDKu66C7giMsfRQUlGefzj\n1Dl9HM901Znut/o53d9PcqC76z2VJ09X/6jzUtWvI0IAgDyOmXYBAIC/RDADQDIEMwAkQzADQDIE\nMwAkQzADQDIbx92Bba63A4CjEBFe7fWxg7nZ+SR2AwBzw141kyUxlQEA6RDMAJAMwQwAyRDMAJAM\nwQwAyRDMAJAMwQwAyRDMAJDMRG4wwaLNm6WFhWlXgVm0aZO0f/+0q0BfPO5de7aDO/8W2RKtQAkc\nW7PH9hFvyWYqAwCSIZgBIJnWYLb9Nts/tn2T7c/Yvm8fhQHAvBoZzLa3SHqtpG0RcbqkDZLOKV8W\nAMyvtqsyfivpkKQTbP9J0gmSbi9eFQDMsZFnzBGxX9J7JP2fpF9I+k1EXNlHYQAwr0aeMds+VdK/\nStoi6YCk/7D90oj49Mpxg8Fg+XFVVaqqatJ1AsC6Vte16rruNHbkdcy2XyLpWRHxmub5yyVtj4h/\nXjGG65gbXGuKUji2Zs841zHfLGm77eO9uA7KMyXtnXSBAIChtjnmGyVdKulaST9sXv5Q6aIAYJ5x\nS/YE8ecmSuHYmj3ckg0A6wjBDADJEMwAkAzBDADJEMwAkAzBDADJEMwAkAzBDADJEMwAkAzBDADJ\nEMwAkAzBDADJEMwAkAzBDADJEMwAkAzBDADJtAaz7cfY3rPi54DtC/ooDgDm0ZpWMLF9jKTbJZ0Z\nEbc2r7GCSYNVJlAKx9bsmeQKJs+U9LOlUAYATN5ag/kcSZ8pUQgAYFHnqQzbx2pxGuO0iLhzxetM\nZTT4cxOlcGzNnlFTGRvXsJ/nSLpuZSgvGQwGy4+rqlJVVWssEQBmW13Xquu609i1nDFfJumrEXHJ\nYa9zxtzgrAalcGzNnlFnzJ2C2faJkv5X0taI+N1h2wjmBh8elMKxNXvGDuaWnRPMDT48KIVja/ZM\n8nI5AEBhBDMAJEMwA0AyBDMAJEMwA0AyBDMAJEMwA0AyBDMAJEMwA0AyBDMAJEMwA0AyBDMAJEMw\nA0AyBDMAJEMwA0AyBDMAJEMwA0AyrcFs+yTbl9veZ3uv7e19FAYA86rLKtnvl3RFRLzY9k
ZJJxau\nCQDm2sg1/2w/QNKeiHjkiDGs+ddgXTaUwrE1e8ZZ82+rpDttf9z29bY/bPuEyZcIAFjSFswbJW2T\n9O8RsU3S3ZLeWrwqAJhjbXPMt0m6LSJ+0Dy/XKsE82AwWH5cVZWqqppQeQAwG+q6Vl3XncaOnGOW\nJNvflvSaiPiJ7YGk4yPiohXbmWNuMA+IUji2Zs+oOeYuwfx4SR+RdKykn0k6PyIOrNhOMDf48OSz\nebO0sDDtKrBk0yZp//5pV5HDWMHcYecEc4Ngzof3JBfej6FxrsoAAPSMYAaAZAhmAEiGYAaAZAhm\nAEiGYAaAZAhmAEiGYAaAZAhmAEiGYAaAZAhmAEiGYAaAZAhmAEiGYAaAZAhmAEiGYAaAZAhmAEim\nbTFWSZLtn0v6raQ/SToUEWeWLAoA5lmnYJYUkqqIYLUuAChsLVMZq65NBQCYrK7BHJKutH2t7deW\nLAgA5l3XqYynRMT/2/5bSd+wfXNEXL20cTAYLA+sqkpVVU20SABY7+q6Vl3XncY61riWuO2dkg5G\nxHua57HWfcwqlmbPh/ckF96PIduKiFWniFunMmyfYPtvmscnSnq2pJsmWyIAYEmXqYwHS/qC7aXx\nn46IrxetCgDm2JqnMv5qB0xlLOPPtHx4T3Lh/RgaayoDANAvghkAkiGYASAZghkAkiGYASAZghkA\nkiGYASAZghkAkiGYASAZghkAkiGYASAZghkAkpmZLzHa/M7NWrhnYbpFXLVTOmvXdGuQtOm4Tdp/\nEcszSnxpTja8H0OjvsRoZoLZu6zYOf06MqAXQwRBLrwfQ3y7HACsIwQzACTTKZhtb7C9x/ZXShcE\nAPOu6xnzhZL2SmJ2CAAK67IY68mSzpb0EUmrTlQDACanyxnzeyW9RdK9hWsBAKglmG3/k6RfRcQe\ncbYMAL3Y2LL9yZKeZ/tsScdJur/tSyPivJWDBoPB8uOqqlRV1YTLBID1ra5r1XXdaWznG0xsP13S\nmyPiuYe9zg0mydCLIW5oyIX3Y2iSN5jQUgAorG0qY1lEfEvStwrWAgAQd/4B82Pz5sW5hCn+7NRg\n6jXIXuxFYp3PmAGscwsLU5/gHaz451Q590VmnDEDQDIEMwAkQzADQDIEMwAkQzADQDIEMwAkQzAD\nQDIEMwAkQzADQDIEMwAkQzADQDIEMwAkQzADQDIEMwAkQzADQDKtwWz7ONu7bd9ge6/td/RRGADM\nq9Yvyo+Ie2yfFRG/t71R0jW2nxoR1/RQHwDMnU5TGRHx++bhsZI2SNpfrCIAmHOdgtn2MbZvkHSH\npKsiYm/ZsgBgfnVa8y8i7pX0D7YfIOlrtquIqJe2DwaD5bFVVamqqslWCQDrXF3Xquu601jHGhdn\ntP12SX+IiHc3z2Ot+yjBu6zYOf06MqAXQ/bU1x/Ng2YMJeiFbUXEqqvCdrkq40G2T2oeHy/pWZL2\nTLZEAMCSLlMZD5V0ie1jtBjkn4yIb5YtCwDmV5fL5W6StK2HWgAA4s4/AEiHYAaAZAhmAEiGYAaA\nZAhmAEiGYAaAZAhmAEiGYAaAZAhmAEiGYAaAZAhmAEiGYAaAZAhmAEiGYAaAZAhmAEiGYAaAZLos\nLXWK7ats/9j2j2xf0EdhADCvuiwtdUjSGyLiBtv3k3Sd7W9ExL7CtQHAXGo9Y46IX0bEDc3jg5L2\nSXpY6cIAYF6taY7Z9hZJZ0jaXaIYAMAagrmZxrhc0oXNmTMAoIAuc8yyfR9Jn5P0qYj44uHbB4PB\n8uOqqlRV1YTKA4DZUNe16rruNNYRMXqAbUmXSLorIt6wyvZo20cfvMuKndOvIwN6MWRLCQ7PHGjG\nUIJe2FZEeLVtXaYyniLpZZLOsr2n+dkx0QoBAMtapzIi4hpxIwoA9IbABYBkCGYASIZgBoBkCGYA\nSIZgBoBkCGYASIZgBoBkCGYASIZgBoBkCGYASIZgBo
BkCGYASIZgBoBkCGYASIZgBoBkCGYASIZg\nBoBkWoPZ9sds32H7pj4KAoB51+WM+eOSWOMPAHrSGswRcbWkhR5qAQCIOWYASKd1lewuBoPB8uOq\nqlRV1SR2CwAzo65r1XXdaawjon2QvUXSVyLi9FW2RZd9lOZdVuycfh0Z0IshW0pweOZAM4YS9MK2\nIsKrbWMqAwCS6XK53GclfVfSo23favv88mUBwPxqnWOOiHP7KAQAsIipDABIhmAGgGQIZgBIhmAG\ngGQIZgBIhmAGgGQIZgBIhmAGgGQIZgBIhmAGgGQIZgBIhmAGgGQIZgBIhmAGgGQIZgBIhmAGgGS6\nrGCyw/bNtn9q+6I+igKAeTYymG1vkPRvknZIOk3SubYf10dhADCv2s6Yz5R0S0T8PCIOSbpM0vPL\nlwUA86stmB8u6dYVz29rXgMAFNIWzNFLFQCAZW2rZN8u6ZQVz0/R4lnzX7A9yZqOmgc56siAXgwl\nOTxzoBlDiXvhiCOfFNveKOm/JD1D0i8kfV/SuRGxr5/yAGD+jDxjjog/2v4XSV+TtEHSRwllAChr\n5BkzAKB/3PkHAMmkD2bbB5t/b7H9B9vX295re7ftV7T87mNtf8/2Pbbf1E/F5YzZi5favtH2D21/\nx/bf91N1GWP24vlNL/bYvs72P/ZTdRnj9KL5vQ80d/beaPuM8hWXM24vmt99ou0/2n5h2WqPrO2q\njAxWzrXcEhHbJMn2Vkmft+2I+MQRfvcuSa+X9IKyJfZmnF78t6SnRcQB2zskfUjS9qLVljVOL66M\niC8140+X9AVJjypZbGFH3QvbZ0t6VET8ne0nSbpY83tcLN3t/E5J/ylpapdtpD9jPpKI+B9Jb5R0\nwYgxd0bEtZIO9VbYFHTsxfci4kDzdLekk/uorW8de3H3iqf3k/Tr0nVNQ5deSHqepEua8bslnWT7\nwT2U16uOvZAWT+Qul3Rn8aJGWLfB3Ngj6bHTLiKJtfTi1ZKuKFjLtLX2wvYLbO+T9FW1f1jXs7Ze\nrHZ370z+R1stvbD9cC1+5cTFzUtTuzJivQdz3ivE+9epF7bPkvQqSbP8TYGtvYiIL0bE4yQ9V9In\ny5c0NV2Oi8PHzOqlWm29eJ+kt8bipWruML6Y9TDHPMoZkvZOu4gkWnvR/A+/D0vaERELvVQ1HZ2P\ni4i42vZG2w+MiLsK1zUNbb04/O7ek5vXZlFbL54g6bLmTuYHSXqO7UMR8eU+iltp3Z4x294i6V2S\nPthleNFipqxLL2w/QtLnJb0sIm7pp7L+dezFqW4+fba3SdIshnLHz8iXJZ3XjN8u6TcRcUfx4nrW\npRcR8ciI2BoRW7U4z/y6aYSytD7OmFf+WXWq7eslHSfpd5LeHxGXHukXbT9E0g8k3V/SvbYvlHRa\nRBwsWXBBR90LSW+XtEnSxU0mHYqIM4tVWt44vXiRpPNsH5J0UNI55crsxVH3IiKusH227Vsk3S3p\n/LKlFjfOcZEGd/4BQDLrdioDAGbVepjKaGX7lZIuPOzlayLi9VMoZ6roxRC9GKIXQ+uhF0xlAEAy\nTGUAQDIEMwAkQzADQDIEMwAkQzADQDJ/BraTRQpKBzTEAAAAAElFTkSuQmCC\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import matplotlib.pyplot as plt\n", "%matplotlib inline\n", "from scipy.cluster.hierarchy import dendrogram\n", "\n", "row_dendr = dendrogram(row_clusters, labels=labels)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
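As a quick programmatic check of the three `linkage` calls above, the following sketch compares the merge distances obtained from the square-form matrix, the condensed matrix, and the raw sample array. It assumes that `RandomState(123)` reproduces the tutorial data; `X_demo`, `Z_wrong`, and `Z_right` are names made up for this example. Some SciPy versions emit a warning when a square distance matrix is passed as observations, so the warning is silenced here.

```python
import warnings

import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import pdist, squareform

rng = np.random.RandomState(123)
X_demo = rng.random_sample([5, 3]) * 10
condensed = pdist(X_demo, metric='euclidean')

with warnings.catch_warnings():
    warnings.simplefilter('ignore')
    # Wrong: the 5 x 5 distance matrix is interpreted as 5 observations with 5 features.
    Z_wrong = linkage(squareform(condensed), method='complete', metric='euclidean')

# Correct: condensed distance matrix, or the raw samples plus a metric.
Z_right = linkage(condensed, method='complete')
Z_samples = linkage(X_demo, method='complete', metric='euclidean')

print(np.allclose(Z_wrong[:, 2], Z_right[:, 2]))  # False: the distances differ
print(np.allclose(Z_right, Z_samples))            # True: both correct inputs agree
```

The point of the check is the third column: only the condensed matrix and the raw samples give the true Euclidean merge distances.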
\n", "
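Reading the linkage matrix can be confusing at first, because indices at or above the number of samples refer to clusters created in earlier rows. The sketch below translates each row into the sample labels involved. `describe_merges` is a hypothetical helper written for this example, and the data is regenerated under the assumption that `RandomState(123)` matches the tutorial's seed.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage


def describe_merges(Z, names):
    # Hypothetical helper: map every cluster index to the sample names it contains.
    n = len(names)
    members = {i: [names[i]] for i in range(n)}
    described = []
    for step, (a, b, dist, count) in enumerate(Z):
        a, b = int(a), int(b)
        # The cluster formed in row `step` gets the index n + step.
        members[n + step] = members[a] + members[b]
        described.append((members[a], members[b], float(dist), int(count)))
    return described


rng = np.random.RandomState(123)
X_demo = rng.random_sample([5, 3]) * 10
names = ['ID_0', 'ID_1', 'ID_2', 'ID_3', 'ID_4']

Z = linkage(X_demo, method='complete', metric='euclidean')
merge_info = describe_merges(Z, names)
for left, right, dist, count in merge_info:
    print(left, '+', right, 'at', round(dist, 3), '->', count, 'samples')
```

For the tables above, this makes it explicit that index 5 in the third row stands for the cluster formed in the first row (samples 0 and 4).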
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Heatmaps" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[[back to top](#Sections)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This section is about the visualization of our hierarchical clustering. Here, we will plot simple heatmaps for each scenario: The original data, the data after row clustering, and eventually the data after we applied row and column clustering." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Heatmap of the Original Data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[[back to top](#Sections)]" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAANEAAAD7CAYAAAD0DXG/AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAESRJREFUeJzt3X2sHNV5x/Hvz9iGAHHDW9MkOLLr4LgvpNgFAqWQJaIN\nLykJaSOBQjE0pa2iGjdUUZNKla5VtRXQqjUooQlVbJyQIEFMAAFtgHYRLtQBrkMBQ4IBKwZCQiEY\n20Bj4OkfOy7X1zv7Mmd2Z3f8+0ijvXvvmTnH0j5+Zs6eeUYRgZkVN6PqAZiNOweRWSIHkVkiB5FZ\nIgeRWSIHkVmiWgWRpLmSnpR0UPb+oOz9e6seWztquVvSqVN+90lJt1U5rk4knSVpw7TtDUkfqXps\nVVHdvieS9DngfRHxx5K+DDwZEZdUPa48kn4FuA5YDMwCJoGPRMRTlQ6sR5L+CDgnIk6ueixVqWMQ\nzQQeAFYBnwaOiog3qh1VZ5IuAXYABwJbI+JvKh5STyQtBO4Ejo+Ip6seT1VqF0QA2anFbcBvRcSd\nVY+nG0n7AxuA14CjI2JnxUPqStIs4F7gkoi4rurxVGlm1QMYkNOAZ4Ejaf1POdIi4hVJ1wLbxiGA\nMn8NPLS3BxDUMIgkHQWcAhwPrJN0bUQ8V/GwevEmMBanBZIawFnAkoqHMhLqNjsn4EpgeURsAS4D\n/r7aUdVLNvO5CjgvInZUPZ5RUKsgAi4ENk+5DvoS8EuSTqxwTP0Yh0z0J8BhwD9Pm+b+ZNUDq0ot\nJxbMhqlumchs6BxEZokcRGaJHERmiRxEZokq/7JVkqcHbQ8RoZT9+/1cpfRXeRABfH8Ax7wCWDaA\n4wIsjP0HctyJiZ8xMTF7AEf+0ACOCRMTjzMxcUTpxy3rTpC399huW2I/IxFEZoOwz5D6cRBZbQ3r\ngr+2QXRs1QMooNEY1v+d5Wg0Dq56CB0N4sS4ncqX/UiKQVwTDdKgrokGZzDXRIMi3VbKxMK7e2z7\nLDWYWDAbBF8TmSVyEJkl8sSCWSJnIrNEzkRmiYY1xe0gstpyJjJL5Gsis0TDCiLfT2S1NaPHrR1J\nyyU9JOlhScu79WNWS/v0uE0n6VeBPwSOAX4N+KikBXn9OIisthIy0SJgfUS8lj0M4S7gE536ySVp\ne/Y6T9KrkiYlbZS0XtLSbv8ISZdLelzSg5IWd2tvVqbZPW5tPAycKOng7GEDZwCH5/XTbWJh6hLv\nTRGxBEDSfGCtJEXE6nY7Sjqd1nOCjpD0QVrlfY/r0p9ZafIyxIvATzvsFxGPZY+7+Q6tR95soFUr\nva9+OsoeQHUxcFGHZmcCV2ft1wPvkPTOIv2ZFZF3DXQYsHDK1k5EfDUijo6IDwEv0aGKQcoU9wZa\n54553gNsmfL+aVop8ccJfZr1LGWKW9LPR8RPskeVngV8MK9tShD1chPT9DZt7wC8YsrPx9JhtFZL\nzeYLNJsvln7cxFmz6yUdAuwEPhMRL+c1TAmixcDGDn9/Bpg75f3h2e/2MKiqPDYeGo1DaDQO+f/3\nK1ZsKuW4KZkoIk7qtW2hYJU0j9azf67o0Owm4Lys/XHASxHhUzkbmpQvW/vRz+zcAkmTwH60SnWt\njIg1uTtG3CrpdEmbaM1wXJA8WrM+jMQq7oiYk71uBvquzhERf1p
sWGbpvIrbLNHYrOKWdD4wfYHe\nuojwfIFVamyCKFuxsDp5JGYl8+mcWaKxyURmo2rWkPpxEFltOROZJfI1kVkiZyKzRA4is0Q+nTNL\n5ExklmhYU9yu9mO1VbRkFoCkL0h6JKs99w1J++b14yCy2ip6P1F2v9yFwJKIOJJWrJ2d149P56y2\nEq6JXqZ1W/j+kt6gdRtQ27uywZnIaqzo6VxEvAj8A/BDWs9Ffiki7sjrx5nIaisvQzyRbXmyksF/\nBswDtgLXSfpURFzTrv1IBNHCU6oeQX/+S69UPYS+HLfqtqqHUIm807np9eZu37PJ0cA9EfECgKS1\nwG8AbYPIp3NWW7N63Np4DDhO0tskCTiFDpWtRiITmQ1C0YmFiHhQ0hrgflrlgyeBr+S1dxBZbaWc\nZkXEpcClvbR1EFltedmPWSIHkVkir+I2S+RMZJbIhUrMEjkTmSXyNZFZImcis0QOIrNEPp0zS+RM\nZJbIU9xmiZyJzBL5msgs0bAyUcdglbQ9e50n6VVJk5I2SlovaWmXfRdJulfSa5L+vMxBm/WiaKES\nSe+XtGHKtlXSRXn9dMtEMeXnTRGxJOtkPrBWkrLHTbbzArAM+HiXPswGoujpXER8H1gMIGkGrXJZ\nN5TaT0Q8BVwM5EZnRDwfEffTqt9lNnQJNRamOgV4IiK25DVIuSbaACxK2N9soEq6Jjob+EanBilB\npIR9dzMxpQhY4yBoHFzWkW0cNB9rbWVLDSJJs4HfAf6iU7uUIFpMhzJC/ZhYUMZRbFw1FrW2XVbc\nWM5x865VvpttPTgNeCAinu/UqFAQZQW/LwMu76V5kT7MUuVlouOzbZcv5h/iHOCb3frpZ3ZugaRJ\nYD9gG7AyItbk7SjpF4D7gDnAm5KWA78cEdu7DcqsDCmnc5IOoDWpcGG3th2DKCLmZK+baVXG71lE\nPAfM7WcfszIl1p3bARzaS1uvWLDaGpsFqJLOB5ZP+/W6iFiWemyzFGOzADVbsbA6eSRmJRubIDIb\nVV7FbZbImcgskYPILJFP58wSjc0Ut9mo8umcWSIHkVmqXi+K3kzrxkFk9dVrKnIQmeXoNYgSCxg4\niKy+hjTH7SCy+prdY7tX0roZ1vdRZsM3o8etDUnvkHS9pEezWovH5XXjTGT1lTbHvRK4NSJ+T9JM\n4IC8hiMRRAfcUfUI+vNo1QPoVynlZMZQwfMsST8HnBgRSwEi4nVga8ndmI2BonWEYT7wvKRVWens\nqyTllkdwEFl9FQ+imcAS4EtZ6ewdwOfzuhmJ0zmzgchJEc3XWlsHTwNPR8R92fvrcRDZXilnirsx\nGxpz3nq/YtrVTkQ8J2mLpIUR8QNapbMeyevGQWT1lXaxsgy4Jisl/ARwQV5DB5HVV8IUd0Q8CBzT\nS1sHkdWXl/2YJRrSDUUOIqsvB5FZoiEVWXAQWX05E5kl8sSCWSJnIrNEzkRmiZyJzBI5iMwSDWmK\nu+tZo6Tt2es8Sa9mNyltlLRe0tIu+35K0oOS/lvSf0r6QFkDN+uq+P1EfeklE019gvim7CYlJM0H\n1kpS9rS8dp4EToqIrZJOBb4C5BZ8MCvVkCYWCncTEU8BFwMXdWhzb0TsultjPXB40f7M+jZCmaiT\nDcCiHtt+Grg1sT+z3o3JFLd6aiSdDPwBcEK7v/9sys8l/edgY6T5Q2huGcCBx2R2bjFdCjJlkwlX\nAadGxE/btem1UKXVU+O9rW2XFfeUdOCEIJK0GXgZeAPYGRHH5rUtHESS5gGXAZd3aPNeYC1wbkRs\nKtqXWSFpU9wBNCLixW4N+52dWyBpEtgP2AasjIg1Hfb9K+Ag4EpJ0CWizUqVfjrX0+VK1yCKiDnZ\n62Ygt4Bdzr4XAhf2s49ZadImFgK4Q9IbwJcj4qq8hl6xYPWVk4maz0Dz2a57nxARP5J0GHC7pMci\n4u52DUsJIknnA8un/XpdRCw
r4/hmheRkosbc1rbLivv3bBMRP8pen5d0A3AsMLggylYsrC7jWGal\nKXhNlNXd3icitkk6APhtYEVee5/OWX0Vn1h4J3BDNhk2E7gmIr6T19hBZPVVcIo7W9J2VK/tHURW\nX2OyYsFsdI3J2jmz0eVMZJbImcgskTORWSIHkVki1+I2S+RMZJbIEwtmiZyJzBI5E5kl2psy0Y7b\nqh5Bn46oegD9ab6v6hFUxLNzZon2pkxkNhCjXkbYbOQllhGWtI+kDZJu7tSNM5HVV/rp3HJaxUnf\n3qmRM5HV14wetzYkHQ6cDvwLXerPORNZfaVlon8EPgfM6dbQQWT1lTPF3XyoteWR9FHgJxGxQVKj\nWzeKiG5tBkpShL8nGqhx+57oZCAieirhm0dSxC09tj1j9/4k/S3w+8DrtEpmzwG+FRHntdvf10RW\nXwWviSLiLyNibkTMB84G/j0vgMCnc1Zn5X3Z2vF0zUFk9VVCEEXEXcBdndo4iKy+vIrbLJHXzpkl\n8ipus0TORGaJfE1klsiZyCzRkIKoY8KTtD17nSfpVUmTkjZKWi9paZd9Pybpwex+jAckfbjMgZt1\nlbCKux/dMtHUb2o3RcQSAEnzgbWSlD1qsp07IuLGrP2RwA3AmK3isrE2CpkoT/YksYuBizq02THl\n7YHA/xTpy6ywWT1uiVKuiTYAizo1kPRx4O+Ad9F6eKzZ8IzBxELXpeoR8W3g25JOBL4GvL9du4mv\nv/Vz4wOtzfYe38u20o3BFPdiWvefdxURd0uaKemQiHhh+t8nzk0YhY29o9j9KcNXl3XgUb4mkjQP\nuAy4okObBcqeYS5pCUC7ADIbmMRqP73qZ3ZugaRJWnf6bQNWRsSaDvv+LnCepJ3Adlo3N5kNzyic\nzkXEnOx1M7B/PweOiEuBSwuPzCxVwSwjaT9a9xDtC8wGboyIL+S194oFq6+C09cR8ZqkkyPiFUkz\ngXWSfjMi1rVrnxxEks6nVeRuqnURsSz12GZJEq53IuKV7MfZ2ZFezGubHETZioXVqccxK13CNZGk\nGcAksAC4MiJyZ6Jd7cfqK2F2LiLejIijgMOBkzrVn/M1kdVXToA0m9DsWHrkLRGxVdItwNFAs10b\nF28swsUbB6q04o1v9th2xh7FGw8FXo+IlyS9Dfg3YEVE3Nluf2ciqy/t22PD/53+i3cBV2fXRTOA\nr+UFEDiIrNZ6/XjvHkQR8RCwpOxezMbQcD7eDiKrMQeRWSIHkVkiB5FZIgeRWaJep7jTOIisxpyJ\nzBI5iMwS7U1B9GTVA+jTqdWuN+xX46akZWjDd2ZZB9qbgshsIBxEZokcRGaJ9htKLw4iqzFnIrNE\nw/l4u8aC1djMHrfdSZor6T8kPSLpYUm5Tz/Z1YtZTRX+eO8EPhsR35N0IPCApNsj4tFSezEbfcU+\n3hHxHPBc9vN2SY8C7wYcRLa3Sf94Zw9vWAysH1wvZiMrbRV3dip3PbA8IrbntXMQWY21/3g3m8/Q\nbD7bcU9Js4BvAV/PHlaX33Yk6s59sdIh9O8z47V2jpvHa+2cziyp7lyP5eClK6bXnROtZ429EBGf\n7ba/p7itxopNcQMnAOcCJ0vakG2ndurFrKYKz86to48E4yCyGvOyH7NEDiKzRF7FbZZoBBagStqe\nvc6T9KqkSUkbJa2XtLSXDiQdI+l1SZ8oY8BmvSs8O9d3L51M/UJkU0QsAZA0H1ir1mT86rydJe0D\nXAL8KzBeX1ZYDYxAJsoTEU8BFwMdl4gDy2gtm3i+SD9maUYjE3WyAViU90dJ7wE+BnwYOIbds9pu\nJm556+fGEdBYmDAqGzvNh1pb+UZ/dq7b6dk/AZ+PiMiWUeS2nzgjYRQ29hpHtrZdVlxb1pFHP4gW\nA7mPJQd+Hbi2FT8cCpwmaWdE3JTQp1kfRrgWd3aPxWXA5XltIuIXp7RfBdzsALLhGo1MNPU6Z
oGk\nSVrfYG0DVkbEmoGNzCzZCARRRMzJXjcD+xftJCIuKLqvWXEjEERm421MgkjS+cDyab9e1/MdUWYD\nMyZBlK1YWJ08ErPSDWcBqu9stRorXLzxq5J+LKmnr4AdRFZjhZf9rAJybwdv14tZTRW+Pfzu7LvQ\nAfZiNhbGZGLBbHTl1Z37Ls3mfaX14rpzRbju3ECVV3fusR7bLtqjv+x07uaIOLLtTlM4E1mNDWcB\nqmfnrMYKT3F/E7gHWChpi6SOy9aciazGCs/OnTP4XszGwgjXWBgHzR9UPYL+NZvNqofQl8Hc0l2m\n4dRYqG8QPV71CPrnICrb6BcqMRtx/rLVLNFwVnGPxJetlQ7ARlIZX7YOq7/Kg8hs3NV2YsFsWBxE\nZokcRGaJHERmiRxEZon+DxRFaOsioGz6AAAAAElFTkSuQmCC\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "fig = plt.figure()\n", "\n", "ax = fig.add_subplot(111)\n", "\n", "cax = ax.matshow(df, interpolation='nearest', cmap='hot_r')\n", "fig.colorbar(cax)\n", "\n", "ax.set_xticklabels([''] + list(df.columns))\n", "ax.set_yticklabels([''] + list(df.index))\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
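The heatmap above labels its axes by prepending an empty string to the label lists, which relies on where matplotlib happens to place its default ticks. A slightly more explicit variant, sketched here as an alternative and not as the tutorial's own method, sets one tick per row and column before assigning the labels. The name `df_demo` is made up, the data assumes `RandomState(123)` matches the tutorial's seed, and the `Agg` backend line is only there so the sketch also runs outside a notebook.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.RandomState(123)
df_demo = pd.DataFrame(rng.random_sample([5, 3]) * 10,
                       columns=['X', 'Y', 'Z'],
                       index=['ID_0', 'ID_1', 'ID_2', 'ID_3', 'ID_4'])

fig, ax = plt.subplots()
cax = ax.matshow(df_demo.values, interpolation='nearest', cmap='hot_r')
fig.colorbar(cax)

# One tick per column and per row, then label exactly those ticks.
ax.set_xticks(np.arange(df_demo.shape[1]))
ax.set_yticks(np.arange(df_demo.shape[0]))
ax.set_xticklabels(list(df_demo.columns))
ax.set_yticklabels(list(df_demo.index))

x_labels = [t.get_text() for t in ax.get_xticklabels()]
y_labels = [t.get_text() for t in ax.get_yticklabels()]
print(x_labels, y_labels)
```

With the tick positions fixed explicitly, the labels stay aligned with the cells even if the default tick locator changes between matplotlib versions.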
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Heatmap after Row-Clustering" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[[back to top](#Sections)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `dendrogram` function returns a dictionary with various items which are explained in detail in the scipy documenation [here](http://docs.scipy.org/doc/scipy/reference/generated/scipy.cluster.hierarchy.dendrogram.html). The dendrogram leave order can then be accessed via the `'leaves'` key, e.g.," ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "[1, 2, 3, 0, 4]" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "row_dendr['leaves']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Thus, in order to sort the DataFrame according to the clustering, we can simply use the `'leaves'` as indices like so: \n", "\n", " df.ix[row_dendr['leaves']]" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false }, "outputs": [ { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAANEAAAD7CAYAAAD0DXG/AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAEQ9JREFUeJzt3X2sHNV5x/Hvz9iGAHFDgKZpcGTX4LgvpNjltRRyiWjD\nS5qEtJFAoRia0lZRjRuqqEmlSraqtgJaNSZKaEoUGyckSBADQUAboF2EC3WA61DAvMSAFV5CQiEY\n20Br4OkfO4Tr653d2Tm7Ozvj30ca7d3rM3OO0H14Zs6ceUYRgZmVN6PqAZjVnYPILJGDyCyRg8gs\nkYPILJGDyCxRo4JI0lxJj0s6IPt+QPb9vVWPrRO13SHplCm/+4Skm6scVzeSzpC0cdr2uqQPVT22\nqqhp94kkfRY4NCL+RNJXgMcj4qKqx5VH0q8CVwOLgVnAJPChiHii0oEVJOmPgbMi4qSqx1KVJgbR\nTOBeYDXwKeCIiHi92lF1J+kiYAewP7A1Iv624iEVImkhcBtwXEQ8VfV4qtK4IALITi1uBn47Im6r\nejy9SNoX2Ai8ChwZETsrHlJPkmYBdwEXRcTVVY+nSjOrHsCQnAo8AxxO+/+UYy0iXpZ0FbCtDgGU\n+Rvg/j09gKCBQSTpCOBk4DhgvaSrIuLZiodVxBtALU4LJE0AZwBLKh7KWGja7JyAy4DlEfEkcAnw\nD9WOqlmymc/VwDkRsaPq8YyDRgURcD6wZcp10JeBX5Z0QoVj6kcdMtGfAgcD/zxtmvsTVQ+sKo2c\nWDAbpaZlIrORcxCZJXIQmSVyEJklchCZJar8ZqskTw/abiJCKfv3+3eV0l/lQQQQJw/+mCsegxUL\nBn9cgP+6dTjH/SrwR0M47rGrh3BQYMV1sOJjgz+uzhvMcd5esN22xH7GIojMhmGvEfXjILLGGtUF\nf2ODaOKAqkfQv7qt5pxYVPUIups9on6aG0TvrHoE/XMQDZYzkVkiXxOZJXIQmSXy6ZxZImcis0TO\nRGaJPMVtlsiZyCyRr4nMEo0qiPw8kTXWjIJbJ5KWS7pf0gOSlvfqx6yR9iq4TSfp12g/lXIU8OvA\nhyXlPljjILLGSshEi4ANEfFq9jKE24GPd+snl6Tt2ec8Sa9ImpS0SdIGSUt77LtI0l2SXpX0F93a\nmg3D7IJbBw8AJ0h6Z/aygdOBQ/L66TWxMPUR280RsQRA0nxgnSRFxJqcfZ8HlgFDePbRrLe8DPEC\n8NMu+0XEw9nrbr5L+5U3G2nXSu+rn66yF1BdCFzQpc1zEXEPUJe3HFjD5F0DHQwsnLJ1EhFfi4gj\nI+IDwIvAI3n9pExxb6R97mg2llKmuCX9fET8JHtV6RnAMXltU4IoqRrLVCsee+vniQPq+UCdldd6\nuL0NWuKs2TWSDqR9JvXpiHgpr2FKEC0GNiXs/zPDqspj9TCxaNenZFdeP5jjpmSiiDixaNtSQSRp\nHu13/1xapHmZPsxSjcvauamzcwskTQL70C7VtSoi1ubtKOkXgLuBOcAb2V3fX4mI7YljNitkLFZx\nR8Sc7HMLsG8/B85e8Ti39MjMEo1LJjKrrdqs4pZ0LjB9gd76iFiWemyzFLUJomzFwprkkZgNmE/n\nzBLVJhOZjatZI+rHQWSN5UxklsjXRGaJnInMEjmIzBL5dM4skTORWaJRTXG72o81VtmSWQCSPi/p\nwaz23Dcl7Z3Xj4PIGqtsyazsebnzgSURcTjtWDszrx+fzlljJVwTvUT7sfB9Jb1O+zGgp/MaOxNZ\nY5U9nYuIF4B/BH4IPAO8GBG35vXjTGSNlZchHsu2PFnJ4D8H5gFbgaslfTIiruzUfiyCaL/cGB9P\nD1U9gH4NpJxM/eSdzk2vN3fL7k2OBO6MiOcBJK0DfhPoGEQ+nbPGmlVw6+Bh4FhJb5Mk4GS6/K9o\nLDKR2TCUnViIiPskrQXuoV0+eBL4l7z2DiJrrJTTrIi4GLi4S
FsHkTWWl/2YJXIQmSXyKm6zRM5E\nZolcqMQskTORWSJfE5klciYyS+QgMkvk0zmzRM5EZok8xW2WyJnILJGvicwSjSoT9QxWSduzz3mS\nXpE0KWmTpA2SlvbY95OS7pP035L+U9L7BzVws17KFiqR9D5JG6dsWyVdkNdPkUwUU37eHBFLso7m\nA+skKXvlZCePAydGxFZJp9B+OvDYAn2aJSt7OhcRjwCLASTNoF0u69pB90NEPAFcCORGaETcFRFb\ns68bgEPK9mfWr4QaC1OdDDwWEU/mNUi9JtoILCrY9lPATYn9mRU2oGuiM4FvdmuQGkQq1Eg6CfhD\n4PhO//5/U37uVh/Zmqn1Q2jl/n++vNS/I0mzgd8F/rJbu9QgWkyPqmbZZMLlwCkR8dNObWYnDsLq\nbeK97e1NK+8czHHzrlW+l20FnArcGxHPdWtUOoiyot+XAJd2afNeYB1wdkRsLtuXWRl5mei4bHvT\nl/IPcRbwrV799Ds7t0DSJLAPsA1YFRFru+z718ABwGXtGnjsjIijC/RplizldE7SfrQnFc7v1bZn\nEEXEnOxzC+3q+IVFxPlFBmE2DIl153YABxVp6xUL1li1WoAq6Vxg+bRfr4+IZYM4vlkZtVqAmq1Y\nWDOIY5kNSq2CyGwceRW3WSJnIrNEDiKzRD6dM0tUqylus3Hk0zmzRA4is1RFL4reSOvGQWTNVTQV\nOYjMchQNop1p3TiIrLlGNMftILLmKvrI9Mtp3YzqfpTZ6M0ouHUg6R2SrpH0UFZnMbfUmzORNVfa\nHPcq4KaI+H1JM4H98hqORRDtuLnqEfTpsKoH0J/WoVWPoCIlz7Mk/RxwQkQsBYiI14Ctee19OmfN\nVbaOMMwHnpO0Oiubfbmk3NIIDiJrrvJBNBNYAnw5K5u9A/hcXjdjcTpnNhQ5KaL1anvr4ingqYi4\nO/t+DQ4i2yPlTHFPzIaJOW99XzntaicinpX0pKSFEfEo7dJZD+Z14yCy5kq7WFkGXJmVEn4MOC+v\noYPImithijsi7gOOKtLWQWTN5WU/ZolG9ECRg8iay0FklmhERRYcRNZczkRmiTyxYJbImcgskTOR\nWSJnIrNEDiKzRCOa4u561ihpe/Y5T9Ir2QNKmyRtkLS0x74flXSfpI2S7pX0wUEO3Kyn8s8T9aVX\nJpr65vDN2QNKSJoPrJOk7C15ndwaEddn7Q8HrgX21AeVrQojmlgo1U1EPAFcCFzQpc2OKV/3B/6n\nTF9mpY1JJupmI7CoWwNJHwP+Hng38DsJfZn1rwZT3OrVICKuA66TdALwdeB9ndqt+MZbP0+8v73Z\nnuP72TZwNZidWwxsKtIwIu6QNFPSgRHx/PR/X3F2wiis9o7ItjddMagDJwSRpC3AS8DrwM6IODqv\nbakgkjQPuAS4tEubBcDjERGSlgB0CiCzoUmb4g5gIiJe6NWwn9m5BZImgX2AbcCqiFjbZd/fA86R\ntBPYDpzZazBmA5V+OtfzkgV6BFFEzMk+twC5xety9r0YuLiffcwGKm1iIYBbJb0OfCUiLs9r6BUL\n1lw5maj1NLSe6bn38RHxI0kHA7dIejgi7ujUMDmIJJ0LLJ/26/URsSz12GZJcjLRxNz29qaV9+ze\nJiJ+lH0+J+la4GhgOEGUrVhYk3ocs4EreU2U1d3eKyK2SdqP9j3OlXntfTpnzVV+YuFdwLWSoB0j\nV0bEd/MaO4isuUpOcWfL2o7o2TDjILLmqsGKBbPxVoO1c2bjzZnILJEzkVkiZyKzRA4is0SuxW2W\nyJnILJEnFswSOROZJXImMku0J2WiR0+tegT9WRh9PeRbuYn4QNVD6I9uHsxxPDtnlmhPykRmQzHO\nZYTNaiGxjLCkvbIXMtzQrRtnImuu9NO55bQLlL69WyNnImuuGQW3DiQdApwGfJUe9eeciay50jLR\nPwGfBeb0auggsubKmeJu3
d/e8kj6MPCTiNgoaaJXN4qIXm2GSlI8UukI+le3+0RQr/tE0s1ERKES\nvvnHUMSNBduezi79Sfo74A+A12iXzZ4DfDsizum0v6+JrLlKXhNFxF9FxNyImE+7hvy/5wUQ+HTO\nmmxwN1u7nq45iKy5BhBEEXE7cHu3Ng4iay6v4jZL5LVzZom8itsskTORWSJfE5klciYySzSiIOqa\n8CRtzz7nSXpF0qSkTZI2SFra6+CSLpX0A0n3SVo8qEGbFZKwirsfvTLR1Du1myNiCYCk+cA6Scpe\nN7kbSacBh0bEYZKOAS4Djk0fsllB45CJ8mRvErsQuKBLs48AV2TtNwDvkPSuMv2ZlTKr4JYo5Zpo\nI7Coy7+/B3hyyvengEOAHyf0aVZcDSYWiixVn96m40K+L075+WjgmLIjslpqtZ6n1Xph8AeuwRT3\nYtrPn+d5Gpg75fsh2e92syxhEFZ/ExMHMjFx4M++r1y5eTAHHudrIknzgEvYNYlM9x3gnKz9scCL\nEeFTORudxGo/RfUzO7dA0iTtJ/22AasiYm3ujhE3STpN0mZgB3Be8mjN+jEOp3MRMSf73AL0/Ux0\nRPxZuWGZDUDJLCNpH9rPEO0NzAauj4jP57X3igVrrpLT1xHxqqSTIuJlSTOB9ZJ+KyLWd2qfHESS\nzqVd5G6q9RHh+QKrVsL1TkS8nP04OztS7vRhchBlKxbWpB7HbOASrokkzQAmgQXAZRGROxPtaj/W\nXAmzcxHxRkQcQfvWzInd6s/5msiaKydAWi1odS098paI2CrpRuBIoNWpjYs3luDijcM1sOKNbxRs\nO2O34o0HAa9FxIuS3gb8G7AyIm7rtL8zkTWX9i7Y8H+n/+LdwBXZddEM4Ot5AQQOImu0on/euwZR\nRNwPLBl0L2Y1NJo/bweRNZiDyCyRg8gskYPILJGDyCxR0SnuNA4iazBnIrNEDiKzRHtQEC38UtUj\n6NeOqgfQnxuSlqHV2B4URGbD4SAyS+QgMku0z0h6cRBZgzkTmSUazZ+3ayxYg80suO1K0lxJ/yHp\nQUkPSOr29hNnImuy0n/eO4HPRMT3Je0P3Cvploh4aKC9mI2/cn/eEfEs8Gz283ZJDwG/CDiIbE+T\n/uedvbxhMbBheL2Yja20VdzZqdw1wPKI2J7XzkFkDdb5z7vVeppW65mue0qaBXwb+EZEXNe17TjU\nnYu6rZ37dLX/zfpWs7Vz+giDqTtXsBy89MXpdedE+33Dz0fEZ3rt7ylua7ByU9zA8cDZwEmSNmbb\nKd16MWuo0rNz6+kjwTiIrMG87McskYPILJFXcZslGoMFqJK2Z5/zJL0iaVLSJkkbJC0t0oGkoyS9\nJunjgxiwWXGlZ+f67qWbqTdENkfEEgBJ84F1ak/Gr8nbWdJewEXAvwL1ullhDTAGmShPRDwBXAh0\nXSIOLKO9bOK5Mv2YpRmPTNTNRmBR3j9Keg/wUeCDwFHsmtV2seLGt36eOAwmFiaMymqndX97G7zx\nn53rdXr2BeBzERHZMorc9itOTxiF1d7E4e3tTSuvGtSRxz+IFgO5ryUHfgO4qh0/HAScKmlnRHwn\noU+zPoxxLe7sGYtLgEvz2kTEL01pvxq4wQFkozUemWjqdcwCSZO072BtA1ZFxNqhjcws2RgEUUTM\nyT63AKXfOx8R55Xd16y8MQgis3qrSRBJOhdYPu3X6ws/EWU2NDUJomzFwprkkZgN3GgWoPrJVmuw\n0sUbvybpx5IK3QJ2EFmDlV72sxrIfRy8Uy9mDVX68fA7snuhQ+zFrBZqMrFgNr7y6s59j1br7oH1\n4rpzZbju3FANru7cwwXbLtqtv+x07oaIOLzjTlM4E1mDjWYBqmfnrMFKT3F/C7gTWCjpSUldl605\nE1mDlZ6dO2v4vZjVwhjXWKiD1qNVj6B/rVar6iH0ZTiPdA/SaGosNDeIflD1CPrnIBq08S9
UYjbm\nfLPVLNFoVnGPxc3WSgdgY2kQN1tH1V/lQWRWd42dWDAbFQeRWSIHkVkiB5FZIgeRWaL/By2bXhMp\nCG8WAAAAAElFTkSuQmCC\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# reorder rows with respect to the clustering\n", "row_dendr = dendrogram(row_clusters, labels=labels, no_plot=True)\n", "df_rowclust = df.ix[row_dendr['leaves']]\n", "\n", "# plot\n", "fig = plt.figure()\n", "ax = fig.add_subplot(111)\n", "\n", "cax = ax.matshow(df_rowclust, interpolation='nearest', cmap='hot_r')\n", "fig.colorbar(cax)\n", "\n", "ax.set_xticklabels([''] + list(df_rowclust.columns))\n", "ax.set_yticklabels([''] + list(df_rowclust.index))\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Heatmap plus Row-Dendrogram" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[[back to top](#Sections)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, we can rotate the dendrogram by setting the `orientation` parameter to `'right'`. Note that\n", "we then have to sort the clustered data in reverse order so that the rows match the row labels in the heatmap:\n", "\n", "    df.iloc[row_dendr['leaves'][::-1]]" ] },
{ "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from scipy.cluster import hierarchy\n", "\n", "# makes dendrogram black (1)\n", "hierarchy.set_link_color_palette(['black'])\n", "\n", "# plot row dendrogram\n", "fig = plt.figure(figsize=(8,8))\n", "axd = fig.add_axes([0.09,0.1,0.2,0.6])\n", "row_dendr = dendrogram(row_clusters, orientation='right',\n", "                       color_threshold=np.inf)  # makes dendrogram black (2)\n", "\n", "axd.set_xticks([])\n", "axd.set_yticks([])\n", "\n", "# remove axes spines from dendrogram\n", "for i in axd.spines.values():\n", "    i.set_visible(False)\n", "\n", "# reorder rows with respect to the clustering;\n", "# positional indexing via .iloc replaces the deprecated .ix\n", "df_rowclust = df.iloc[row_dendr['leaves'][::-1]]\n", "\n", "# plot heatmap\n", "axm = fig.add_axes([0.26,0.1,0.6,0.6]) # x-pos, y-pos, width, height\n", "cax = axm.matshow(df_rowclust, interpolation='nearest', cmap='hot_r')\n", "fig.colorbar(cax)\n", "axm.set_xticklabels([''] + list(df_rowclust.columns))\n", "axm.set_yticklabels([''] + list(df_rowclust.index))\n", "\n", "plt.show()" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "
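The reversed ordering used for the right-oriented dendrogram is easy to get wrong, so it can help to check it without plotting. The sketch below is a minimal, hedged example: `df_demo` and its values are made up for illustration and are not part of this tutorial's data. It assumes that SciPy's `dendrogram` returns the positional leaf order under `'leaves'` and the matching labels under `'ivl'`, and that `no_plot=True` skips the drawing step.

```python
import pandas as pd
from scipy.cluster.hierarchy import dendrogram, linkage

# Hypothetical toy data: two tight pairs of samples (illustrative values only).
df_demo = pd.DataFrame(
    [[0.0, 0.0], [10.0, 10.0], [0.5, 0.2], [9.5, 10.2]],
    index=['ID_0', 'ID_1', 'ID_2', 'ID_3'],
    columns=['X', 'Y'])

# Complete linkage on Euclidean distances, as in the tutorial.
demo_clusters = linkage(df_demo.values, method='complete', metric='euclidean')

# no_plot=True returns the leaf order without drawing anything.
demo_dendr = dendrogram(demo_clusters, labels=list(df_demo.index),
                        orientation='right', no_plot=True)

# 'leaves' holds row positions, 'ivl' the corresponding labels;
# both are reversed here to follow the heatmap from top to bottom.
leaf_positions = demo_dendr['leaves'][::-1]
leaf_labels = demo_dendr['ivl'][::-1]

# Reorder by position and confirm that the labels agree.
df_demo_clust = df_demo.iloc[leaf_positions]
labels_match = list(df_demo_clust.index) == leaf_labels
print(labels_match)
```

In the notebook, the same comparison could be made between `df_rowclust.index` and the reversed `row_dendr['ivl']`, provided that `labels=labels` is passed to `dendrogram`.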
\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Adding a Column Dendrogram" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[[back to top](#Sections)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also add a second dendrogram for the column clustering at the top. Here, we first reorder the rows according to the row clustering:\n", "\n", "    df_rowclust = df.iloc[row_dendr['leaves'][::-1]]\n", "\n", "Then we use the leaf indices from the column clustering to reorder the columns, so that the values move together with their labels:\n", "\n", "    df_rowclust = df_rowclust.iloc[:, col_dendr['leaves']]" ] },
{ "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# compute pairwise distances for columns\n", "col_dists = pdist(df.T, metric='euclidean')\n", "col_clusters = linkage(col_dists, method='complete')\n", "\n", "# plot column dendrogram\n", "fig = plt.figure(figsize=(8,8))\n", "\n", "axd2 = fig.add_axes([0.38,0.74,0.36,0.10])\n", "col_dendr = dendrogram(col_clusters, orientation='top',\n", "                       color_threshold=np.inf)  # makes dendrogram black\n", "axd2.set_xticks([])\n", "axd2.set_yticks([])\n", "\n", "# plot row dendrogram\n", "axd1 = fig.add_axes([0.09,0.1,0.2,0.6])\n", "row_dendr = dendrogram(row_clusters, orientation='right',\n", "                       count_sort='ascending',\n", "                       color_threshold=np.inf)  # makes dendrogram black\n", "axd1.set_xticks([])\n", "axd1.set_yticks([])\n", "\n", "# remove axes spines from both dendrograms\n", "for i, j in zip(axd1.spines.values(), axd2.spines.values()):\n", "    i.set_visible(False)\n", "    j.set_visible(False)\n", "\n", "# reorder rows and columns with respect to the clustering;\n", "# .iloc moves the data itself (assigning to .columns would only relabel it)\n", "df_rowclust = df.iloc[row_dendr['leaves'][::-1]]\n", "df_rowclust = df_rowclust.iloc[:, col_dendr['leaves']]\n", "\n", "# plot heatmap\n", "axm = fig.add_axes([0.26,0.1,0.6,0.6])\n", "cax = axm.matshow(df_rowclust, interpolation='nearest', cmap='hot_r')\n", "fig.colorbar(cax)\n", "axm.set_xticklabels([''] + list(df_rowclust.columns))\n", "axm.set_yticklabels([''] + list(df_rowclust.index))\n", "\n", "plt.show()" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "
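When reordering columns, it is worth checking that the values move together with their labels. The following sketch uses a made-up DataFrame `df_demo` and a hypothetical leaf order `col_order` (standing in for `col_dendr['leaves']`); neither comes from the tutorial's data. It contrasts assigning a permuted index to `.columns`, which only renames the headers, with positional selection via `.iloc`, which permutes the data itself.

```python
import pandas as pd

# Hypothetical 2x3 table; names and values are illustrative only.
df_demo = pd.DataFrame([[1, 2, 3], [4, 5, 6]],
                       index=['ID_0', 'ID_1'],
                       columns=['X', 'Y', 'Z'])

# Hypothetical leaf order, standing in for col_dendr['leaves'].
col_order = [2, 0, 1]

# Relabeling: only the header changes, the values stay where they were.
relabeled = df_demo.copy()
relabeled.columns = df_demo.columns[col_order]

# Reordering: each column keeps its own values and moves as a whole.
reordered = df_demo.iloc[:, col_order]

print(relabeled)
print(reordered)
```

After relabeling, the column named `Z` still carries the values that originally belonged to `X`, so a heatmap drawn from it would show correct-looking labels over the wrong data. The `.iloc` variant avoids this.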
\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Important Warning About Adding Dendrograms" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[[back to top](#Sections)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When you create the first plot, always check that the labels on the heatmap and on the dendrogram match! For example, we can pass `labels` to the `labels` parameter of `dendrogram` and **NOT call `axd.set_yticks([])`**, so that the dendrogram shows its labels and we can compare them to the heatmap labels." ] },
{ "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from scipy.cluster import hierarchy\n", "\n", "# makes dendrogram black (1)\n", "hierarchy.set_link_color_palette(['black'])\n", "\n", "# plot row dendrogram\n", "fig = plt.figure(figsize=(8,8))\n", "axd = fig.add_axes([0.09,0.1,0.2,0.6])\n", "row_dendr = dendrogram(row_clusters, orientation='right',\n", "                       labels=labels,\n", "                       color_threshold=np.inf)  # makes dendrogram black (2)\n", "axd.set_xticks([])\n", "\n", "# uncomment to hide dendrogram labels\n", "#axd.set_yticks([])\n", "\n", "# remove axes spines from dendrogram\n", "for i in axd.spines.values():\n", "    i.set_visible(False)\n", "\n", "# reorder rows with respect to the clustering;\n", "# positional indexing via .iloc replaces the deprecated .ix\n", "df_rowclust = df.iloc[row_dendr['leaves'][::-1]]\n", "\n", "# plot heatmap\n", "axm = fig.add_axes([0.26,0.1,0.6,0.6]) # x-pos, y-pos, width, height\n", "cax = axm.matshow(df_rowclust, interpolation='nearest', cmap='hot_r')\n", "fig.colorbar(cax)\n", "axm.set_xticklabels([''] + list(df_rowclust.columns))\n", "axm.set_yticklabels([''] + list(df_rowclust.index))\n", "\n", "plt.show()" ] } ],
"metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.4.3" } }, "nbformat": 4, "nbformat_minor": 0 }