{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# [NTDS'19] tutorial 4: Manipulating graphs with NetworkX\n", "[ntds'19]: https://github.com/mdeff/ntds_2019\n", "\n", "[Effrosyni Simou](https://people.epfl.ch/effrosyni.simou), [EPFL LTS4](https://lts4.epfl.ch).\n", "Adapted from [NTDS'17 NetworkX demo](https://nbviewer.jupyter.org/github/mdeff/ntds_2017/blob/outputs/demos/04_networkx.ipynb)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this session we will get introduced to NetworkX, explore some of the most common network models, look at their basic properties and compare them." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1 Creating graphs using network models\n", "\n", "There are many libraries that deal with creation and manipulation of graph data.\n", "We will use NetworkX to create basic network models, as they are already implemented in the library.\n", "The full documentation of NetworkX 2.3 (installed in your `ntds_2019` environment) can be found [online](https://networkx.github.io/documentation/stable/)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "\n", "import collections\n", "\n", "import numpy as np\n", "from scipy import spatial\n", "from matplotlib import pyplot as plt\n", "import networkx as nx" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create an Erdős-Rényi graph with $N=100$ vertices, and a probability of connecting each pair of vertices equal to $p=0.15$." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "N = 100 # number of nodes\n", "p = 0.15 # probability of connection\n", "er = nx.erdos_renyi_graph(N, p)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can retrieve the adjacency matrix of the graph, from the `Graph` object `er` as follows:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "er_adj = nx.adjacency_matrix(er, range(N))\n", "er_adj = er_adj.todense()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can now visualise the adjacency matrix:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.spy(er_adj);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2 Plotting graphs\n", "\n", "With NetworkX and Matplotlib we can also plot a graph. For example, we can plot the Erdős-Rényi graph that we created before as follows:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "nx.draw(er)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2.1 Exercise\n", "\n", "Create a Barabasi-Albert graph and a Watts-Strogatz graph and plot them." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Create a Barabasi-Albert graph.\n", "ba = # your code here" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Create a Watts-Strogartz graph.\n", "ws = # your code here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3 Modifying graphs\n", "\n", "It's easy to add or remove edges, but also nodes. If we add an edge between nodes that don't yet exist, they will be automatically created." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "er.add_node(100)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "er.nodes()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Similarly, you can add and remove a collection of nodes or edges, and add and remove one node or edge:\n", "* Adding nodes with:\n", " * `G.add_node`: One node at a time\n", " * `G.add_nodes_from`: A container of nodes\n", "* Adding edges with:\n", " * `G.add_edge`: One edge at a time\n", " * `G.add_edges_from`: A container of edges\n", "* Removing nodes with:\n", " * `G.remove_node`: One node at a time\n", " * `G.remove_nodes_from`: A container of nodes\n", "* Removing edges with:\n", " * `G.remove_edge`: One edge at a time\n", " * `G.remove_edges_from`: A container of edges\n", "\n", "You can get the number of edges with `G.size()`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Add an edge between two non-existant vertices. Remove all nodes up to node 50. Draw the graph after each change." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "er.add_edge(101, 102)\n", "nx.draw(er)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "er.remove_nodes_from(range(50))\n", "nx.draw(er)\n", "er.nodes()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "er.size()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 4 Degree distribution" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`G.degree()` returns a ``DegreeView`` object with pairs of nodes and their degree.\n", "If we specify a node, `G.degree(node)` will return the degree of that node." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create an Erdős-Rényi network and plot a histogram of node degrees. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "N = 100 # number of nodes\n", "p = 0.15 # probability of connection\n", "er = nx.erdos_renyi_graph(N, p)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "d = er.degree()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(d)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Erdős-Rényi node degree histogram.\n", "degree_sequence = sorted([d for n, d in er.degree()], reverse=True) # degree sequence\n", "degreeCount = collections.Counter(degree_sequence)\n", "deg, cnt = zip(*degreeCount.items())\n", "\n", "fig, ax = plt.subplots()\n", "ax.bar(deg, cnt)\n", "ax.set_title(\"Degree Histogram\")\n", "ax.set_ylabel(\"Count\")\n", "ax.set_xlabel(\"Degree\");" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 4.1 Fitting a distribution\n", "\n", "Try to fit a Poisson distribution." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Poisson distribution.\n", "def poisson(mu, k):\n", " return np.exp(-mu) * mu**k * (np.math.factorial(k)**-1)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Erdős-Rényi node degree histogram.\n", "degree_sequence = sorted([d for n, d in er.degree()], reverse=True) # degree sequence\n", "degreeCount = collections.Counter(degree_sequence)\n", "deg, cnt = zip(*degreeCount.items())\n", "\n", "fig, ax = plt.subplots()\n", "ax.bar(deg, cnt, label='Histogram')\n", "\n", "# Poisson distribution\n", "mu = 2 * er.size() / 100\n", "k = np.linspace(1, 25, 25)\n", "deg = [100 * poisson(mu, i) for i in k]\n", "ax.plot(k, deg, color='r', label='Poisson distribution')\n", "\n", "ax.legend()\n", "ax.set_title(\"Degree Histogram\")\n", "ax.set_ylabel(\"Count\")\n", "ax.set_xlabel(\"Degree\");" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 4.2 Exercise\n", "\n", "Do it for the Barabasi-Albert and Watts-Strogatz networks." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# your code here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 5 Creating a graph that approximates a manifold\n", "\n", "We can represent data laying on a manifold (sampled from a manifold) as a graph of connected samples.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Generate 100 two-dimensional data points from a uniform random distribution in $[0, 1]$.\n", "These will be the coordinates of your nodes." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "N = 100\n", "nodes_coords = np.random.rand(N, 2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Two nodes are connected if the Euclidean distance between them is smaller than a threshold.\n", "In that case, the weight of the edge is set to\n", "$$w(i,j) = \\exp \\left( -{\\frac{\\operatorname{dist}^2(i,j)}{2\\sigma^2}} \\right),$$\n", "for some *kernel width* $\\sigma$." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sigma = 0.9\n", "threshold = 0.2" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def gaussian_kernel(dist, sigma):\n", " return np.exp(-dist**2 / (2*sigma**2))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dist = spatial.distance.pdist(nodes_coords, metric='euclidean')\n", "dist = spatial.distance.squareform(dist)\n", "adj = gaussian_kernel(dist, sigma)\n", "adj -= np.identity(N)\n", "adj[dist > threshold] = 0" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.spy(adj);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Plot the graph with NetworkX. \n", "\n", "Hints: \n", "* `nx.from_numpy_array(adj)` creates a graph object from an adjacency matrix (in numpy form)\n", "* `nx.draw(G,pos)` will draw vertices at coordinates specified in pos. Variable pos is a dictionary assigning a pair of coordinates to each node." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "g = nx.from_numpy_matrix(adj)\n", "plt.spy(nx.adjacency_matrix(g).todense());" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pos = dict(zip(range(N), nodes_coords))\n", "nx.draw(g, pos)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Plot a degree distribution of this graph. What can you say about the distribution?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# node degree histogram\n", "degree_sequence = sorted([d for n, d in g.degree()], reverse=True) # degree sequence\n", "degreeCount = collections.Counter(degree_sequence)\n", "deg, cnt = zip(*degreeCount.items())\n", "\n", "fig, ax = plt.subplots()\n", "ax.bar(deg, cnt)\n", "ax.set_title(\"Degree Histogram\")\n", "ax.set_ylabel(\"Count\")\n", "ax.set_xlabel(\"Degree\");" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 4 }