{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Linear algebra in SciPy\n", "\n", "> No one can be told what the matrix is. You have to see it for yourself.\n", ">\n", "> — Morpheus, *The Matrix*\n", "\n", "Just like Chapter 4, which dealt with the Fast Fourier Transform, this chapter\n", "will feature an elegant *method*. We\n", "want to highlight the packages available in SciPy to do linear algebra, which forms\n", "the basis of much scientific computing.\n", "\n", "## Linear algebra basics\n", "\n", "A chapter in a programming book is not really the right place to learn about\n", "linear algebra itself, so we assume familiarity with linear algebra concepts.\n", "At a minimum, you should know that linear algebra involves vectors (ordered\n", "collections of numbers) and their transformations by multiplying them with\n", "matrices (collections of vectors). If all of this sounded like gibberish to\n", "you, you should probably pick up an introductory linear algebra textbook before\n", "reading this. We highly recommend Gil Strang's *Linear Algebra and its\n", "Applications*. Introductory is all you need though — we hope to convey the power\n", "of linear algebra while keeping the operations relatively simple!\n", "\n", "As an aside, we will break Python notation convention in order to match linear\n", "algebra conventions: in Python, variables names should usually begin with a\n", "lower case letter. However, in linear algebra, matrices are denoted by\n", "a capital letter, while vectors and scalar values are lowercase. Since we're\n", "going to be dealing with quite a few matrices and vectors, following the\n", "linear algebra convention helps to keep them straight. Therefore, variables\n", "that represent matrices will start with a capital letter, while vectors and\n", "numbers will start with lowercase:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "\n", "m, n = (5, 6) # scalars\n", "M = np.ones((m, n)) # a matrix\n", "v = np.random.random((n,)) # a vector\n", "w = M @ v # another vector" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In mathematical notation, the vectors would typically be written in boldface,\n", "as in $\\mathbf{v}$ and $\\mathbf{w}$, while the scalars would not, as\n", "in $m$ and $n$. In Python code, we can't make that distinction, so we will rely instead\n", "on context to keep scalars and vectors straight.\n", "\n", "## Laplacian matrix of a graph\n", "\n", "We discussed graphs in chapter 3, where we represented image regions as\n", "nodes, connected by edges between them. But we used a rather simple method of\n", "analysis: we *thresholded* the graph, removing all edges above some value.\n", "Thresholding works in simple cases, but can easily fail, because all you need\n", "is one value to fall on the wrong side of the threshold for the approach\n", "to fail.\n", "\n", "As an example, suppose you are at war, and your enemy is camped just across the\n", "river from your forces. You want to cut them off, so you decide to blow up all\n", "the bridges between you. Intelligence suggests that you need $t$ kilograms of\n", "TNT to blow each bridge crossing the river, but the bridges in your own\n", "territory can withstand $t+1$ kg. You might, having just read chapter 3 of\n", "*Elegant SciPy*, order your commandos to detonate $t$ kg of TNT on every bridge\n", "in the region. 
But, if intelligence was wrong about just *one* bridge crossing\n",
"the river, and it remains standing, the enemy's army can come marching through!\n",
"Disaster!\n",
"\n",
"So, in this chapter, we will explore some alternative approaches to graph\n",
"analysis, based on linear algebra. It turns out that we can think of a graph, $G$,\n",
"as an *adjacency matrix*, in which we number the nodes of the graph from $0$\n",
"to $n-1$, and place a 1 in row $i$, column $j$ of the matrix whenever there is\n",
"an edge from node $i$ to node $j$. In other words, if we call the adjacency\n",
"matrix $A$, then $A_{i, j} = 1$ if and only if the edge $(i, j)$ is in $G$. We\n",
"can then use linear algebra techniques to study this matrix, often with\n",
"striking results.\n",
"\n",
"The *degree* of a node is defined as the number of edges touching it. For\n",
"example, if a node is connected to five other nodes in a graph, its degree\n",
"is 5. (Later, we will differentiate between out-degree and in-degree, when edges\n",
"have a \"from\" and \"to\".) In matrix terms, the degree corresponds to the *sum*\n",
"of the values in a row or column.\n",
"\n",
"The *Laplacian* matrix of a graph (just \"the Laplacian\" for short) is defined\n",
"as the *degree matrix*, $D$, which\n",
"contains the degree of each node along the diagonal and zero everywhere else,\n",
"minus the adjacency matrix $A$:\n",
"\n",
"$\n",
"L = D - A\n",
"$\n",
"\n",
"We can't fit all of the linear algebra theory needed to understand the\n",
"properties of this matrix into this chapter, but suffice it to say: it has\n",
"some *great* properties. We will exploit a couple in the following paragraphs.\n",
"\n",
"First, we will look at the *eigenvectors* of $L$.\n",
"An eigenvector $v$ of a matrix $M$ is a vector that\n",
"satisfies the property $Mv = \\lambda v$ for some number $\\lambda$,\n",
"known as the eigenvalue. In other words, $v$ is a special vector in\n",
"relation to $M$ because $Mv$ simply changes the size of the vector, without\n",
"changing its direction. As we will soon see, eigenvectors have many useful\n",
"properties — sometimes they even seem magical!\n",
"\n",
"As an example, consider a 3x3 rotation matrix $R$ that, when multiplied by any\n",
"3-dimensional vector $p$, rotates it $30^\\circ$ around the z-axis. $R$\n",
"will rotate all vectors except for those that lie *on* the z-axis. For those,\n",
"we'll see no effect, or $Rp = p$, i.e. $Rp = \\lambda p$ with\n",
"eigenvalue $\\lambda = 1$.\n",
"\n",
"\n",
"\n",
"**Exercise:** Consider the rotation matrix\n",
"\n",
"$\n",
"R = \\begin{bmatrix}\n",
"  \\cos \\theta & -\\sin \\theta & 0 \\\\\n",
"  \\sin \\theta & \\cos \\theta & 0 \\\\\n",
"  0 & 0 & 1 \\\\\n",
"\\end{bmatrix}\n",
"$\n",
"\n",
"When $R$ is multiplied by a 3-dimensional column-vector $p =\n",
"\\left[ x\\, y\\, z \\right]^T$, the resulting vector $R p$ is rotated\n",
"by an angle $\\theta$ around the z-axis.\n",
"\n",
"1. For $\\theta = 45^\\circ$, verify (by testing on a few arbitrary\n",
"   vectors) that $R$ rotates these vectors around the z-axis.\n",
"   Remember that matrix multiplication in Python is denoted with `@`.\n",
"\n",
"2. What does the matrix $S = RR$ do? Verify this in Python.\n",
"\n",
"3. Verify that multiplying by $R$ leaves the vector\n",
"   $\\left[ 0\\, 0\\, 1\\right]^T$ unchanged. In other words, $R p = 1 p$,\n",
"   which means $p$ is an eigenvector of $R$ with eigenvalue 1.\n",
"\n",
"4. 
Use `np.linalg.eig` to find the eigenvalues and eigenvectors of $R$, and\n", " verify that $\\left[0, 0, 1\\right]^T$ is indeed among them, and that it\n", " corresponds to the eigenvalue 1.\n", "\n", "\n", "\n", "**Solution:**\n", "\n", "Part 1:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "\n", "theta = np.deg2rad(45)\n", "R = np.array([[np.cos(theta), -np.sin(theta), 0],\n", " [np.sin(theta), np.cos(theta), 0],\n", " [ 0, 0, 1]])\n", "\n", "print(\"R times the x-axis:\", R @ [1, 0, 0])\n", "print(\"R times the y-axis:\", R @ [0, 1, 0])\n", "print(\"R times a 45 degree vector:\", R @ [1, 1, 0])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Part 2:\n", "\n", "Since multiplying a vector by $R$ rotates it 45 degrees, multiplying the result\n", "by $R$ again should result in the original vector being rotated 90 degrees.\n", "Matrix multiplication is associative, which means that $R(Rv) = (RR)v$, so\n", "$S = RR$ should rotate vectors by 90 degrees around the z-axis." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "S = R @ R\n", "S @ [1, 0, 0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Part 3:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(\"R @ z-axis:\", R @ [0, 0, 1])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "R rotates both the x and y axes, but not the z-axis.\n", "\n", "Part 4:\n", "\n", "Looking at the documentation of `eig`, we see that it returns two values:\n", "a 1D array of eigenvalues, and a 2D array where each column contains the\n", "eigenvector corresponding to each eigenvalue." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.linalg.eig(R)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In addition to some complex-valued eigenvalues and vectors, we see the value 1\n", "associated with the vector $\\left[0, 0, 1\\right]^T$.\n", "\n", "\n", "\n", "\n", "\n", "Back to the Laplacian. A common problem in network analysis is visualization.\n", "How do you draw nodes and edges in such a way that you don't get a complete\n", "mess such as this one?\n", "\n", "\n", "\n", "\n", "\n", "One way is to put nodes that share many edges close together. It turns out\n", "that this can be done by using the second-smallest eigenvalue of the Laplacian\n", "matrix, and its corresponding eigenvector, which is so important it has its\n", "own name: the\n", "[Fiedler vector](https://en.wikipedia.org/wiki/Algebraic_connectivity#The_Fiedler_vector).\n", "\n", "Let's use a minimal network to illustrate this. We start by creating the\n", "adjacency matrix:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "A = np.array([[0, 1, 1, 0, 0, 0],\n", " [1, 0, 1, 0, 0, 0],\n", " [1, 1, 0, 1, 0, 0],\n", " [0, 0, 1, 0, 1, 1],\n", " [0, 0, 0, 1, 0, 1],\n", " [0, 0, 0, 1, 1, 0]], dtype=float)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can use NetworkX to draw this network. 
First, we initialize matplotlib as\n", "usual:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Make plots appear inline, set custom plotting style\n", "%matplotlib inline\n", "import matplotlib.pyplot as plt\n", "plt.style.use('style/elegant.mplstyle')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, we can plot it:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import networkx as nx\n", "g = nx.from_numpy_matrix(A)\n", "layout = nx.spring_layout(g, pos=nx.circular_layout(g))\n", "nx.draw(g, pos=layout,\n", " with_labels=True, node_color='white')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "\n", "You can see that the nodes fall naturally into two groups, 0, 1, 2 and 3, 4, 5.\n", "Can the Fiedler vector tell us this? First, we must compute the degree matrix\n", "and the Laplacian. We first get the degrees by summing along either axis of $A$.\n", "(Either axis works because $A$ is symmetric.)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "d = np.sum(A, axis=0)\n", "print(d)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We then put those degrees into a diagonal matrix of the same shape\n", "as A, the *degree matrix*. We can use the `np.diag` function to do this:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "D = np.diag(d)\n", "print(D)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we get the Laplacian from the definition:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "L = D - A\n", "print(L)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Because $L$ is symmetric, we can use the `np.linalg.eigh` function to compute\n", "the eigenvalues and eigenvectors:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "val, Vec = np.linalg.eigh(L)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can verify that the values returned satisfy the definition of eigenvalues\n", "and eigenvectors. For example, one of the eigenvalues is 3:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.any(np.isclose(val, 3))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And we can check that multiplying the matrix $L$ by the corresponding eigenvector\n", "does indeed multiply the vector by 3:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "idx_lambda3 = np.argmin(np.abs(val - 3))\n", "v3 = Vec[:, idx_lambda3]\n", "\n", "print(v3)\n", "print(L @ v3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As mentioned above, the Fiedler vector is the vector corresponding to the\n", "second-smallest eigenvalue of $L$. Sorting the eigenvalues tells us which one\n", "is the second-smallest:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.plot(np.sort(val), linestyle='-', marker='o');" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "It's the first non-zero eigenvalue, close to 0.4. 
The Fiedler vector is the\n",
"corresponding eigenvector:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "f = Vec[:, np.argsort(val)[1]]\n",
"plt.plot(f, linestyle='-', marker='o');" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n",
"\n",
"It's pretty remarkable: by looking at the *sign* of the elements of the Fiedler\n",
"vector, we can separate the nodes into the two groups we identified in the\n",
"drawing!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "colors = ['orange' if eigv > 0 else 'gray' for eigv in f]\n",
"nx.draw(g, pos=layout, with_labels=True, node_color=colors)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n",
"\n",
"## Laplacians with brain data\n",
"\n",
"Let's demonstrate this process in a real-world example by laying out the brain cells in a worm, as shown in\n",
"[Figure 2](http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1001066)\n",
"from the\n",
"[Varshney *et al* paper](http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1001066)\n",
"that we introduced in Chapter 3. (Information on\n",
"how to do this is in the\n",
"[supplementary material](http://journals.plos.org/ploscompbiol/article/asset?unique&id=info:doi/10.1371/journal.pcbi.1001066.s001)\n",
"for the paper.) To obtain their\n",
"layout of the worm brain neurons, they used a related matrix, the\n",
"*degree-normalized Laplacian*.\n",
"\n",
"Because the order of the neurons is important in this analysis, we will use a\n",
"preprocessed dataset, rather than clutter this chapter with data cleaning. We\n",
"got the original data from Lav Varshney's\n",
"[website](http://www.ifp.illinois.edu/~varshney/elegans),\n",
"and the processed data is in our `data/` directory.\n",
"\n",
"First, let's load the data. There are four components:\n",
"- the network of chemical synapses, through which a *pre-synaptic neuron*\n",
"  sends a chemical signal to a *post-synaptic* neuron,\n",
"- the gap junction network, which contains direct electrical contacts between\n",
"  neurons,\n",
"- the neuron IDs (names), and\n",
"- the three neuron types:\n",
"  - *sensory neurons*, those that detect signals coming from the outside world,\n",
"    encoded as 0;\n",
"  - *motor neurons*, those that activate muscles, enabling the worm to move,\n",
"    encoded as 2; and\n",
"  - *interneurons*, the neurons in between, which enable complex signal processing\n",
"    to occur between sensory neurons and motor neurons, encoded as 1." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n",
"Chem = np.load('data/chem-network.npy')\n",
"Gap = np.load('data/gap-network.npy')\n",
"neuron_ids = np.load('data/neurons.npy')\n",
"neuron_types = np.load('data/neuron-types.npy')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We then simplify the network, adding the two kinds of connections together,\n",
"and removing the directionality of the network by taking the average of\n",
"in-connections and out-connections of neurons. This seems a bit like cheating,\n",
"but since we are only looking for the *layout* of the neurons on a graph, we\n",
"only care about *whether* neurons are connected, not in which direction.\n",
"We are going to call the resulting matrix the *connectivity* matrix, $C$, which\n",
"is just a different kind of adjacency matrix."
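] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To see what this averaging does, here is a minimal sketch on a made-up 2x2\n",
"matrix (the array `T` below is hypothetical, purely for illustration): a\n",
"one-way connection of weight 2 becomes a two-way connection of weight 1 in\n",
"each direction." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "T = np.array([[0., 2.],\n",
"              [0., 0.]])  # a single directed connection, node 0 to node 1\n",
"print((T + T.T) / 2)      # symmetric: weight 1 in each direction"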
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "A = Chem + Gap\n", "C = (A + A.T) / 2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To get the Laplacian matrix $L$, we need the degree matrix $D$, which contains\n", "the degree of node i at position [i, i], and zeros everywhere else." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "degrees = np.sum(C, axis=0)\n", "D = np.diag(degrees)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, we can get the Laplacian just like before:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "L = D - C" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The vertical coordinates in Fig 2 are given by arranging nodes such that, on\n", "average, neurons are as close as possible to \"just above\" their downstream\n", "neighbors. Varshney _et al_ call this measure \"processing depth,\" and it's\n", "obtained by solving a linear equation involving the Laplacian. We use\n", "`scipy.linalg.pinv`, the\n", "[pseudoinverse](https://en.wikipedia.org/wiki/Moore%E2%80%93Penrose_pseudoinverse),\n", "to solve it:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from scipy import linalg\n", "b = np.sum(C * np.sign(A - A.T), axis=1)\n", "z = linalg.pinv(L) @ b" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(Note the use of the `@` symbol, which was introduced in Python 3.5 to denote\n", "matrix multiplication. As we noted in the preface and in Chapter 5, in previous\n", "versions of Python, you would need to use the function `np.dot`.)\n", "\n", "In order to obtain the degree-normalized Laplacian, $Q$, we need the inverse\n", "square root of the $D$ matrix:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "Dinv2 = np.diag(1 / np.sqrt(degrees))\n", "Q = Dinv2 @ L @ Dinv2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we are able to extract the $x$ coordinates of the neurons to ensure that\n", "highly-connected neurons remain close: the eigenvector of $Q$ corresponding to\n", "its second-smallest eigenvalue, normalized by the degrees:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "val, Vec = linalg.eig(Q)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note from the documentation of `numpy.linalg.eig`:\n", "\n", "> \"The eigenvalues are not necessarily ordered.\"\n", "\n", "Although the documentation in SciPy's `eig` lacks this warning, it remains true\n", "in this case. We must therefore sort the eigenvalues and the corresponding\n", "eigenvector columns ourselves:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "smallest_first = np.argsort(val)\n", "val = val[smallest_first]\n", "Vec = Vec[:, smallest_first]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can find the eigenvector we need to compute the affinity coordinates:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = Dinv2 @ Vec[:, 1]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(The reasons for using this vector are too long to explain here, but appear in\n", "the paper's supplementary material, linked above. 
The short version is that\n", "choosing this vector minimizes the total length of the links between neurons.)\n", "\n", "There is one small kink that we must address before proceeding: eigenvectors\n", "are only defined up to a multiplicative constant. This follows simply from the\n", "definition of an eigenvector: suppose $v$ is an eigenvector of the matrix $M$,\n", "with corresponding eigenvalue $\\lambda$. Then $\\alpha v$ is also an eigenvector\n", "of $M$ for any scalar number $\\alpha$,\n", "because $Mv = \\lambda v$ implies $M(\\alpha v) = \\lambda (\\alpha v)$.\n", "So, it is\n", "arbitrary whether a software package returns $v$ or $-v$ when asked for the\n", "eigenvectors of $M$. In order to make sure we reproduce the layout from the\n", "Varshney *et al.* paper, we must make sure that the vector is pointing in the\n", "same direction as theirs, rather than the opposite direction. We do this by\n", "choosing an arbitrary neuron from their Figure 2, and checking the sign of `x`\n", "at that position. We then reverse it if it doesn't match its sign in Figure 2\n", "of the paper." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "vc2_index = np.argwhere(neuron_ids == 'VC02')\n", "if x[vc2_index] < 0:\n", " x = -x" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now it's just a matter of drawing the nodes and the edges. We color them\n", "according to the type stored in `neuron_types`, using the appealing and\n", "functional \"colorblind\"\n", "[colorbrewer palette](http://chrisalbon.com/python/seaborn_color_palettes.html):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from matplotlib.colors import ListedColormap\n", "from matplotlib.collections import LineCollection\n", "\n", "\n", "def plot_connectome(x_coords, y_coords, conn_matrix, *,\n", " labels=(), types=None, type_names=('',),\n", " xlabel='', ylabel=''):\n", " \"\"\"Plot neurons as points connected by lines.\n", "\n", " Neurons can have different types (up to 6 distinct colors).\n", "\n", " Parameters\n", " ----------\n", " x_coords, y_coords : array of float, shape (N,)\n", " The x-coordinates and y-coordinates of the neurons.\n", " conn_matrix : array or sparse matrix of float, shape (N, N)\n", " The connectivity matrix, with non-zero entry (i, j) if and only\n", " if node i and node j are connected.\n", " labels : array-like of string, shape (N,), optional\n", " The names of the nodes.\n", " types : array of int, shape (N,), optional\n", " The type (e.g. sensory neuron, interneuron) of each node.\n", " type_names : array-like of string, optional\n", " The name of each value of `types`. 
For example, if a 0 in\n",
"        `types` means \"sensory neuron\", then `type_names[0]` should\n",
"        be \"sensory neuron\".\n",
"    xlabel, ylabel : str, optional\n",
"        Labels for the axes.\n",
"    \"\"\"\n",
"    if types is None:\n",
"        types = np.zeros(x_coords.shape, dtype=int)\n",
"    ntypes = len(np.unique(types))\n",
"    colors = plt.rcParams['axes.prop_cycle'][:ntypes].by_key()['color']\n",
"    cmap = ListedColormap(colors)\n",
"\n",
"    fig, ax = plt.subplots()\n",
"\n",
"    # plot neuron locations:\n",
"    for neuron_type in range(ntypes):\n",
"        plotting = (types == neuron_type)\n",
"        pts = ax.scatter(x_coords[plotting], y_coords[plotting],\n",
"                         c=cmap(neuron_type), s=4, zorder=1)\n",
"        pts.set_label(type_names[neuron_type])\n",
"\n",
"    # add text labels:\n",
"    for x, y, label in zip(x_coords, y_coords, labels):\n",
"        ax.text(x, y, ' ' + label,\n",
"                verticalalignment='center', fontsize=3, zorder=2)\n",
"\n",
"    # plot edges\n",
"    pre, post = np.nonzero(conn_matrix)\n",
"    links = np.array([[x_coords[pre], x_coords[post]],\n",
"                      [y_coords[pre], y_coords[post]]]).T\n",
"    ax.add_collection(LineCollection(links, color='lightgray',\n",
"                                     lw=0.3, alpha=0.5, zorder=0))\n",
"\n",
"    ax.legend(scatterpoints=3, fontsize=6)\n",
"\n",
"    ax.set_xlabel(xlabel, fontsize=8)\n",
"    ax.set_ylabel(ylabel, fontsize=8)\n",
"\n",
"    plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, let's use that function to plot the neurons:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plot_connectome(x, z, C, labels=neuron_ids, types=neuron_types,\n",
"                type_names=['sensory neurons', 'interneurons',\n",
"                            'motor neurons'],\n",
"                xlabel='Affinity eigenvector 1', ylabel='Processing depth')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n",
"\n",
"There you are: a worm brain!\n",
"As discussed in the original paper, you can see the top-down processing from\n",
"sensory neurons to motor neurons through a network of interneurons. You can\n",
"also see two distinct groups of motor neurons: these correspond to the neck\n",
"(left) and body (right) segments of the worm.\n",
"\n",
"\n",
"\n",
"**Exercise:** How do you modify the above code to show the affinity view in\n",
"Figure 2B from the paper?\n",
"\n",
"\n",
"**Solution:** In the affinity view, instead of using the processing depth on the y-axis,\n",
"we use the normalized third eigenvector of Q, just like we did with x\n",
"(flipping its sign if necessary, also as we did with x)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "y = Dinv2 @ Vec[:, 2]\n",
"asjl_index = np.argwhere(neuron_ids == 'ASJL')\n",
"if y[asjl_index] < 0:\n",
"    y = -y\n",
"\n",
"plot_connectome(x, y, C, labels=neuron_ids, types=neuron_types,\n",
"                type_names=['sensory neurons', 'interneurons',\n",
"                            'motor neurons'],\n",
"                xlabel='Affinity eigenvector 1',\n",
"                ylabel='Affinity eigenvector 2')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"### Challenge: linear algebra with sparse matrices\n",
"\n",
"The above code uses NumPy arrays to hold the matrix and perform\n",
"the necessary computations. Because we are using a small graph of fewer than 300\n",
"nodes, this is feasible. 
However, for larger graphs, it would fail.\n",
"\n",
"For example, one might want to analyze the relationships between libraries\n",
"listed on the Python Package Index, or PyPI, which contains over one hundred thousand packages.\n",
"Holding the Laplacian matrix for this graph would take\n",
"up $8 \\left(100 \\times 10^3\\right)^2 = 8 \\times 10^{10}$ bytes, or 80GB,\n",
"of RAM. If you add to that the adjacency, symmetric adjacency, pseudoinverse,\n",
"and, say, two temporary matrices used during calculations, you climb up to\n",
"480GB, beyond the reach of most desktop computers.\n",
"\n",
"\"Ha!\", some of you might be thinking. \"Ha! My desktop has 512GB of RAM! It would\n",
"make short work of this so-called 'large' graph!\"\n",
"\n",
"Perhaps. But you might also want to analyze the Association for Computing\n",
"Machinery (ACM) citation graph, a network of over two million scholarly works\n",
"and references. *That* Laplacian would take up 32 terabytes of RAM.\n",
"\n",
"However, we know that the dependency and reference graphs are *sparse*:\n",
"packages usually depend on just a few other packages, not on the whole of PyPI.\n",
"And papers and books usually only reference a few others, too. So we can hold\n",
"the above matrices using the sparse data structures from `scipy.sparse` (see\n",
"Chapter 5), and use the linear algebra functions in `scipy.sparse.linalg` to\n",
"compute the values we need.\n",
"\n",
"Try to explore the documentation in `scipy.sparse.linalg` to come up with a\n",
"sparse version of the above computation.\n",
"\n",
"Hint: the pseudoinverse of a sparse matrix is, in general, not sparse, so you\n",
"can't use it here. Similarly, you can't get all the eigenvectors of a sparse\n",
"matrix, because they would together make up a dense matrix.\n",
"\n",
"You'll find parts of the solution below (and of course in the solutions\n",
"chapter), but we highly recommend that you try it out on your own.\n",
"\n",
"\n",
"\n",
"### Challenge accepted\n",
"\n",
"For the purposes of this challenge, we are going to use the small connectome\n",
"above, because it's easier to visualize what is going on. In later parts of the\n",
"challenge we'll use these techniques to analyze larger networks.\n",
"\n",
"We start with the adjacency matrix, A, in a sparse matrix format, in\n",
"this case CSR, which is the most common format for sparse linear algebra. We'll\n",
"append `s` to the names of all the matrices to indicate that they are sparse." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from scipy import sparse\n",
"import scipy.sparse.linalg\n",
"\n",
"As = sparse.csr_matrix(A)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can create our connectivity matrix in the same way:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "Cs = (As + As.T) / 2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In order to get the degree matrix, we can use the \"diags\" sparse format, which\n",
"stores only the diagonals of a matrix (the main diagonal and, optionally,\n",
"off-diagonals)."
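] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For instance, here is `sparse.diags` on a tiny, made-up array (a quick\n",
"illustration of the format, not part of the connectome analysis):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# a 3x3 diagonal matrix built from the values on its main diagonal\n",
"print(sparse.diags([1.0, 2.0, 3.0]).toarray())"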
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "degrees = np.ravel(Cs.sum(axis=0))\n", "Ds = sparse.diags(degrees)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Getting the Laplacian is straightforward:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "Ls = Ds - Cs" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we want to get the processing depth. Remember that getting the\n", "pseudo-inverse of the Laplacian matrix is out of the question, because it will\n", "be a dense matrix (the inverse of a sparse matrix is not generally sparse\n", "itself). However, we were actually using the pseudo-inverse to compute a\n", "vector $z$ that would satisfy $L z = b$,\n", "where $b = C \\odot \\textrm{sign}\\left(A - A^T\\right) \\mathbf{1}$.\n", "(You can see this in the supplementary material for Varshney *et al*.) With\n", "dense matrices, we can simply use $z = L^+b$. With sparse ones, though, we can\n", "use one of the *solvers* (see sidebox, \"Solvers\") in `sparse.linalg.isolve` to get the `z` vector after\n", "providing `L` and `b`, no inversion required!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "b = Cs.multiply((As - As.T).sign()).sum(axis=1)\n", "z, error = sparse.linalg.isolve.cg(Ls, b, maxiter=10000)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we must find the eigenvectors of $Q$, the degree-normalized Laplacian,\n", "corresponding to its second and third smallest eigenvalues.\n", "\n", "You might recall from Chapter 5 that the numerical data in sparse matrices is\n", "in the `.data` attribute. We use that to invert the degrees matrix:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "Dsinv2 = Ds.copy()\n", "Dsinv2.data = 1 / np.sqrt(Ds.data)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we use SciPy's sparse linear algebra functions to find the desired\n", "eigenvectors. The $Q$ matrix is symmetric, so we can use the `eigsh` function,\n", "specialized for symmetric matrices, to compute them. We use the `which` keyword\n", "argument to specify that we want the eigenvectors corresponding to the smallest\n", "eigenvalues, and `k` to specify that we need the 3 smallest:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\n", "Qs = Dsinv2 @ Ls @ Dsinv2\n", "vals, Vecs = sparse.linalg.eigsh(Qs, k=3, which='SM')\n", "sorted_indices = np.argsort(vals)\n", "Vecs = Vecs[:, sorted_indices]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we normalize the eigenvectors to get the x and y coordinates\n", "(and flip these if necessary):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "_dsinv, x, y = (Dsinv2 @ Vecs).T\n", "if x[vc2_index] < 0:\n", " x = -x\n", "if y[asjl_index] < 0:\n", " y = -y" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(Note that the eigenvector corresponding to the smallest eigenvalue is always a\n", "vector of all ones, which we're not interested in.)\n", "We can now reproduce the above plots!" 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plot_connectome(x, z, C, labels=neuron_ids, types=neuron_types,\n", " type_names=['sensory neurons', 'interneurons',\n", " 'motor neurons'],\n", " xlabel='Affinity eigenvector 1', ylabel='Processing depth')\n", "\n", "plot_connectome(x, y, C, labels=neuron_ids, types=neuron_types,\n", " type_names=['sensory neurons', 'interneurons',\n", " 'motor neurons'],\n", " xlabel='Affinity eigenvector 1',\n", " ylabel='Affinity eigenvector 2')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "\n", "\n", "\n", "\n", "> **Solvers {.callout}**\n", ">\n", "> SciPy has several sparse iterative solvers available, and it is not always\n", "> obvious which to use. Unfortunately, that question also has no easy answer:\n", "> different algorithms have different strengths in terms of speed of\n", "> convergence, stability, accuracy, and memory use (amongst others). It is also\n", "> not possible to predict, by looking at the input data, which algorithm will\n", "> perform best.\n", "> \n", "> Here is a rough guideline for choosing an iterative solver:\n", "> \n", "> - If A, the input matrix, is symmetric and positive definite, use the\n", "> Conjugate Gradient solver `cg`. If A is symmetric, but\n", "> near-singular or indefinite, try the Minimum Residual iteration\n", "> method `minres`.\n", "> \n", "> - For non-symmetric systems, try the Biconjugate Gradient Stabilized\n", "> method, `bicgstab`. The Conjugate Gradient Squared method, `cgs`,\n", "> is a bit faster, but has more erratic convergence.\n", "> \n", "> - If you need to solve many similar systems, use the LGMRES algorithm `lgmres`.\n", "> \n", "> - If A is not square, use the least squares algorithm `lsmr`.\n", "> \n", "> For further reading, see\n", "> \n", "> - **How Fast are Nonsymmetric Matrix Iterations?**,\n", "> Noël M. Nachtigal, Satish C. Reddy, and Lloyd N. Trefethen\n", "> SIAM Journal on Matrix Analysis and Applications 1992 13:3, 778-795.\n", "> \n", "> - **Survey of recent Krylov methods**, Jack Dongarra,\n", "> http://www.netlib.org/linalg/html_templates/node50.html\n", "\n", "\n", "## PageRank: linear algebra for reputation and importance\n", "\n", "Another application of linear algebra and eigenvectors is Google's PageRank\n", "algorithm, which is punnily named both for webpages and for one of its\n", "co-founders, Larry Page.\n", "\n", "To rank webpages by importance, you might count\n", "how many other webpages link to it. After all, if everyone is linking to a\n", "particular page, it must be good, right? But this metric is\n", "easily gamed: to make your own webpage rise in the rankings, just\n", "create as many other webpages as you can and have them all link to your\n", "original page.\n", "\n", "The key insight that drove Google's early success was that important webpages\n", "are not linked to by just many webpages, but by *important*\n", "webpages. And how do we know that those other pages are important? Because\n", "they themselves are linked to by important pages. And so on.\n", "\n", "This recursive definition implies that page\n", "importance can be measured by an eigenvector\n", "of the so-called *transition matrix*, which contains the links\n", "between webpages. Suppose you have your vector of importance $\\boldsymbol{r}$,\n", "and your matrix of links $M$. 
You don't know $\\boldsymbol{r}$ yet, but you\n",
"do know that the importance of a page is proportional to the sum of\n",
"importances of the pages that link to it: $\\boldsymbol{r} = \\alpha M \\boldsymbol{r}$,\n",
"or $M \\boldsymbol{r} = \\lambda \\boldsymbol{r}$, for $\\lambda = 1/\\alpha$. That's\n",
"just the definition of an eigenvector, with eigenvalue $\\lambda$!\n",
"\n",
"By ensuring some special properties are satisfied by the transition matrix, we\n",
"can further determine that the required eigenvalue is 1, and that it is the\n",
"largest eigenvalue of $M$.\n",
"\n",
"The transition matrix imagines a web\n",
"surfer, often named Webster, randomly clicking a link from each webpage he\n",
"visits, and then asks, what's the probability that he ends up at any given\n",
"page? This probability is called the PageRank.\n",
"\n",
"Since Google's rise, researchers have been applying PageRank to all sorts of\n",
"networks. We'll use an example by Stefano Allesina and Mercedes Pascual,\n",
"which they\n",
"[published](http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1000494)\n",
"in PLoS Computational Biology. They thought to apply the method to ecological\n",
"*food webs*, networks that link species to those that they eat.\n",
"\n",
"Naively, if you wanted to see how critical a species was for an ecosystem, you\n",
"would look at how many species eat it. If it's many, and that species\n",
"disappeared, then all its \"dependent\" species might disappear with it. In\n",
"network parlance, you could say that its *in-degree* determines its ecological\n",
"importance.\n",
"\n",
"Could PageRank be a better measure of importance for an ecosystem?\n",
"\n",
"Professor Allesina kindly provided us with a few food webs to play around\n",
"with. We've saved one of these, from the St. Marks National Wildlife Refuge in\n",
"Florida, in the Graph Markup Language format. The web was\n",
"[described](http://www.sciencedirect.com/science/article/pii/S0304380099000228)\n",
"in 1999 by Robert R. Christian and Joseph J. Luczkovich. In the dataset, a\n",
"node $i$ has an edge to node $j$ if species $i$ eats species $j$.\n",
"\n",
"We'll start by loading in the data, which NetworkX knows how to read trivially:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import networkx as nx\n",
"\n",
"stmarks = nx.read_gml('data/stmarks.gml')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, we get the sparse matrix corresponding to the graph. Because a matrix\n",
"only holds numerical information, we need to maintain a separate list of\n",
"species names corresponding to the matrix rows/columns:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "species = np.array(list(stmarks.nodes()))  # array for multi-indexing\n",
"Adj = nx.to_scipy_sparse_matrix(stmarks, dtype=np.float64)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "From the adjacency matrix, we can derive a *transition probability* matrix,\n",
"where every edge is replaced by a *probability* of 1 over the number of\n",
"outgoing edges from that species. 
In the food web, it might make more sense\n",
"to call this a lunch probability matrix.\n",
"\n",
"The total number of species in our matrix is going to be used a lot, so let's\n",
"call it $n$:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "n = len(species)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, we need the degrees, and, in particular, the *diagonal matrix* with the\n",
"inverse of each node's out-degree on the diagonal:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.seterr(divide='ignore')  # ignore division-by-zero errors\n",
"from scipy import sparse\n",
"\n",
"degrees = np.ravel(Adj.sum(axis=1))\n",
"Deginv = sparse.diags(1 / degrees).tocsr()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "Trans = (Deginv @ Adj).T" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Normally, the PageRank score would simply be the first eigenvector of the\n",
"transition matrix. If we call the transition matrix $M$ and the vector of\n",
"PageRank values $r$, we have:\n",
"\n",
"$$\n",
"\\boldsymbol{r} = M\\boldsymbol{r}\n",
"$$\n",
"\n",
"But the `np.seterr` call above is a clue that it's not quite\n",
"so simple. The PageRank approach only works when the\n",
"transition matrix is a *column-stochastic* matrix, in which every\n",
"column sums to 1. Additionally, every page must be reachable\n",
"from every other page, even if the path to reach it is very long.\n",
"\n",
"In our food web, this causes problems, because the bottom of the food chain,\n",
"what the authors call *detritus* (basically sea sludge), doesn't actually *eat*\n",
"anything (the Circle of Life notwithstanding), so you can't reach other species\n",
"from it.\n",
"\n",
"> *Young Simba:* But, Dad, don't we eat the antelope?\n",
">\n",
"> *Mufasa:* Yes, Simba, but let me explain. When we die, our bodies become the\n",
"> grass, and the antelope eat the grass. And so we are all connected in the\n",
"> great Circle of Life.\n",
">\n",
"> — *The Lion King*\n",
"\n",
"To deal with this, the PageRank algorithm uses a so-called \"damping\n",
"factor\", usually taken to be 0.85. This means that 85% of the time, the\n",
"algorithm follows a link at random, but for the other 15%, it randomly jumps to\n",
"any arbitrary page. It's as if every page had a low-probability link to every\n",
"other page. Or, in our case, it's as if shrimp, on rare occasions, ate sharks.\n",
"It might seem nonsensical, but bear with us! It is, in fact, the mathematical\n",
"representation of the Circle of Life. We'll set it to 0.85, but the exact value\n",
"doesn't really matter for this analysis: the results are similar for a large\n",
"range of possible damping factors.\n",
"\n",
"If we call the damping factor $d$, then the modified PageRank equation is:\n",
"\n",
"$$\n",
"\\boldsymbol{r} = dM\\boldsymbol{r} + \\frac{1-d}{n} \\boldsymbol{1}\n",
"$$\n",
"\n",
"and\n",
"\n",
"$$\n",
"(\\boldsymbol{I} - dM)\\boldsymbol{r} = \\frac{1-d}{n} \\boldsymbol{1}\n",
"$$\n",
"\n",
"We can solve this equation using `scipy.sparse.linalg`'s direct\n",
"solver, `spsolve`. Depending on the structure and size of a linear algebra\n",
"problem, though, it might be more efficient to use an iterative solver. See\n",
"the [`scipy.sparse.linalg` documentation](http://docs.scipy.org/doc/scipy/reference/sparse.linalg.html#solving-linear-problems)\n",
"for more information on this."
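] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Before solving, here is a quick diagnostic of our own (not from the paper):\n",
"counting the columns of `Trans` that fail to sum to 1 shows how many dangling\n",
"nodes, like detritus, the damping factor has to work around:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "col_sums = np.ravel(Trans.sum(axis=0))\n",
"# every column of a column-stochastic matrix would sum to 1\n",
"print('dangling nodes:', np.sum(~np.isclose(col_sums, 1)))"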
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from scipy.sparse.linalg import spsolve\n", "\n", "damping = 0.85\n", "beta = 1 - damping\n", "\n", "I = sparse.eye(n, format='csc') # Same sparse format as Trans\n", "\n", "pagerank = spsolve(I - damping * Trans,\n", " np.full(n, beta / n))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We now have the \"foodrank\" of the St. Marks food web!\n", "\n", "So how does a species' foodrank compare to the number of other species eating\n", "it?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def pagerank_plot(in_degrees, pageranks, names, *,\n", " annotations=[], **figkwargs):\n", " \"\"\"Plot node pagerank against in-degree, with hand-picked node names.\"\"\"\n", "\n", " fig, ax = plt.subplots(**figkwargs)\n", " ax.scatter(in_degrees, pageranks, c=[0.835, 0.369, 0], lw=0)\n", " for name, indeg, pr in zip(names, in_degrees, pageranks):\n", " if name in annotations:\n", " text = ax.text(indeg + 0.1, pr, name)\n", "\n", " ax.set_ylim(0, np.max(pageranks) * 1.1)\n", " ax.set_xlim(-1, np.max(in_degrees) * 1.1)\n", " ax.set_ylabel('PageRank')\n", " ax.set_xlabel('In-degree')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We now draw the plot. Having explored the dataset before writing this, we have\n", "pre-labeled some interesting nodes in the plot:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "interesting = ['detritus', 'phytoplankton', 'benthic algae', 'micro-epiphytes',\n", " 'microfauna', 'zooplankton', 'predatory shrimps', 'meiofauna',\n", " 'gulls']\n", "in_degrees = np.ravel(Adj.sum(axis=0))\n", "pagerank_plot(in_degrees, pagerank, species, annotations=interesting)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Sea sludge (\"detritus\") is the most important element both by number of\n", "species feeding on it (15) and by PageRank (>0.003). But the second most\n", "important element is *not* benthic algae, which feeds 13 other species, but\n", "rather phytoplankton, which feeds just 7! That's because other *important*\n", "species feed on it. On the bottom left, we've got sea gulls, who, we can now\n", "confirm, do bugger-all for the ecosystem. Those vicious *predatory shrimps*\n", "(we're not making this up) support the same number of species as phytoplankton,\n", "but they are less essential species, so they end up with a lower foodrank.\n", "\n", "Although we won't do it here, Allesina and Pascual go on to model the\n", "ecological impact of species extinction, and indeed find that PageRank\n", "predicts ecological importance better than in-degree.\n", "\n", "Before we wrap up though, we'll note that PageRank can be computed several\n", "different ways. One way, complementary to what we did above, is called the\n", "*power method*, and it's quite, well, powerful! It stems from the\n", "[Perron-Frobenius theorem](https://en.wikipedia.org/wiki/Perron%E2%80%93Frobenius_theorem),\n", "which states, among other things, that a stochastic matrix has 1 as an\n", "eigenvalue, and that this is its *largest* eigenvalue. (The corresponding\n", "eigenvector is the PageRank vector.) What this means is that, whenever we\n", "multiply *any* vector by $M$, its component pointing towards this major\n", "eigenvector stays the same, while *all other components shrink* by a\n", "multiplicative factor. 
The consequence is that if we multiply some random\n",
"starting vector by $M$ repeatedly, we should eventually get the PageRank\n",
"vector!\n",
"\n",
"SciPy makes this very efficient with its sparse matrix module:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def power(Trans, damping=0.85, max_iter=10**5):\n",
"    n = Trans.shape[0]\n",
"    r0 = np.full(n, 1/n)\n",
"    r = r0\n",
"    for _iter_num in range(max_iter):\n",
"        rnext = damping * Trans @ r + (1 - damping) / n\n",
"        if np.allclose(rnext, r):\n",
"            break\n",
"        r = rnext\n",
"    return r" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n",
"\n",
"**Exercise:** In the above iteration, note that `Trans` is *not*\n",
"column-stochastic, so the $r$ vector gets shrunk at each iteration. In order to\n",
"make the matrix stochastic, we would have to replace every zero-column by a\n",
"column of all $1/n$. Doing that explicitly would be too expensive, but the same\n",
"effect can be achieved cheaply within the iteration itself. How\n",
"can you modify the code above to ensure that $r$ remains a probability vector\n",
"throughout?\n",
"\n",
"\n",
"\n",
"**Solution:** In order to have a stochastic matrix, all columns of the\n",
"transition matrix must sum to 1. This is not satisfied when a species isn't\n",
"eaten by any others: that column will consist of all zeroes. Replacing all\n",
"those columns by $1/n \\boldsymbol{1}$, however, would be expensive.\n",
"\n",
"The key is to realise that the replaced columns would contribute the *same\n",
"amount* to *every entry* of the result. That is to say, adding these columns\n",
"would add a single value to each entry of the iteration's result. What value?\n",
"$1/n$ times the sum of the elements of $r$ that correspond to dangling nodes.\n",
"This can be expressed as a dot-product of a vector containing $1/n$ for\n",
"positions corresponding to dangling nodes, and zero elsewhere, with the vector\n",
"$r$ for the current iteration." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def power2(Trans, damping=0.85, max_iter=10**5):\n",
"    n = Trans.shape[0]\n",
"    # 1/n at positions of dangling nodes (all-zero columns), 0 elsewhere\n",
"    dangling = (1/n) * np.ravel(Trans.sum(axis=0) == 0)\n",
"    r0 = np.full(n, 1/n)\n",
"    r = r0\n",
"    for _ in range(max_iter):\n",
"        rnext = (damping * (Trans @ r + dangling @ r) +\n",
"                 (1 - damping) / n)\n",
"        if np.allclose(rnext, r):\n",
"            return rnext\n",
"        else:\n",
"            r = rnext\n",
"    return r" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Try this out manually for a few iterations. Notice that if you start with a\n",
"stochastic vector (a vector whose elements all sum to 1), the next vector will\n",
"still be a stochastic vector. Thus, the output PageRank from this function will\n",
"be a true probability vector, and the values will represent the probability\n",
"that we end up at a particular species when following links in the food chain.\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"**Exercise:** Verify that these three methods all give the same ranking for the\n",
"nodes. `numpy.corrcoef` might be a useful function for this.\n",
"\n",
"\n",
"\n",
"**Solution:** `np.corrcoef` gives the Pearson correlation coefficient between\n",
"all pairs of a list of vectors. This coefficient is equal to 1 exactly when two\n",
"vectors are increasing linear functions of each other, a relationship that\n",
"preserves their ordering. Therefore, a correlation\n",
"coefficient of 1 is sufficient to show that the above methods produce the same\n",
"ranking."
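] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As a complementary check (our own sketch, not part of the original text), a\n",
"rank correlation such as `scipy.stats.spearmanr` compares the orderings\n",
"directly rather than the values:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from scipy import stats\n",
"\n",
"# a Spearman correlation of 1.0 means identical rankings\n",
"rho, _ = stats.spearmanr(pagerank, power(Trans))\n",
"print(rho)"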
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pagerank_power = power(Trans)\n", "pagerank_power2 = power2(Trans)\n", "np.corrcoef([pagerank, pagerank_power, pagerank_power2])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "\n", "\n", "## Concluding remarks\n", "\n", "The field of linear algebra is far too broad to adequately cover in a chapter,\n", "but this chapter gave you a glimpse into its power, and of\n", "the way Python, NumPy, and SciPy make its elegant algorithms accessible." ] } ], "metadata": {}, "nbformat": 4, "nbformat_minor": 4 }