{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# [NTDS'19] assignment 2: learning with graphs\n", "[ntds'19]: https://github.com/mdeff/ntds_2019\n", "\n", "[Clément Vignac](https://people.epfl.ch/clement.vignac), [EPFL LTS4](https://lts4.epfl.ch) and\n", "[Guillermo Ortiz Jiménez](https://gortizji.github.io), [EPFL LTS4](https://lts4.epfl.ch)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Students\n", "\n", "* Team: ``\n", "* Students: ` (for the indivudual submission) or `` (for the team submission)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Rules\n", "\n", "Grading:\n", "* The first deadline is for individual submissions. The second deadline is for the team submission.\n", "* All team members will receive the same grade based on the team solution submitted on the second deadline.\n", "* As a fallback, a team can ask for individual grading. In that case, solutions submitted on the first deadline are graded.\n", "* Collaboration between team members is encouraged. No collaboration between teams is allowed.\n", "\n", "Submission:\n", "* Textual answers shall be short. Typically one to two sentences.\n", "* Code has to be clean.\n", "* You cannot import any other library than we imported.\n", "* When submitting, the notebook is executed and the results are stored. I.e., if you open the notebook again it should show numerical results and plots. We won't be able to execute your notebooks.\n", "* The notebook is re-executed from a blank state before submission. That is to be sure it is reproducible. You can click \"Kernel\" then \"Restart Kernel and Run All Cells\" in Jupyter." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Objective\n", "\n", "In this assignment you will experiment with the main concepts of spectral graph theory, as well as familizarize yourself with the main data science techniques for network data.\n", "\n", "The assignment is made of three parts:\n", "1. [Spectral Graph Theory](#sgt)\n", "1. [Regularization on graphs with Graph Signal Processing](#gsp)\n", "1. [Machine Learning on Graphs](#ml)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## Part I: Spectral Graph Theory\n", "### Eigenvectors and eigenvalues\n", "\n", "We will start by reviewing some of the main concepts in spectral graph theory and see some of its applications to dimensionality reduction and data clustering. To illustrate the main concepts we will use the standard two moon dataset." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "from scipy.spatial.distance import pdist, squareform\n", "import matplotlib.pyplot as plt\n", "from mpl_toolkits.mplot3d import Axes3D\n", "\n", "%matplotlib inline" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from pygsp.graphs import TwoMoons\n", "\n", "G = TwoMoons(moontype='synthesized', N=2000)\n", "X = G.coords\n", "Y = G.labels.astype(int)\n", "\n", "plt.scatter(X[:, 0], X[:, 1], c=Y)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Question 1: Graph construction\n", "Build a similarity graph using the euclidean distance between data points. \n", "**Note:** Use an RBF kernel to set the edge weights $w_{ij}=\\exp(-||x_i- x_j||_2^2 / ~ 2 \\sigma^2)$ of your adjacency and threshold the ones with the smallest magnitude." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def epsilon_similarity_graph(X: np.ndarray, sigma=1, epsilon=0):\n", " \"\"\" X (n x d): coordinates of the n data points in R^d.\n", " sigma (float): width of the kernel\n", " epsilon (float): threshold\n", " Return:\n", " adjacency (n x n ndarray): adjacency matrix of the graph.\n", " \"\"\"\n", " # Your code here\n", " return adjacency" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "adjacency = epsilon_similarity_graph(X, sigma=None, epsilon=None)\n", "plt.spy(adjacency)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "How do you choose `sigma`?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*Your answer here*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "How do you choose the threshold `epsilon`?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*Your answer here*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Question 2: Laplacian\n", "Build the combinatorial and normalized graph laplacians for this dataset." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def compute_laplacian(adjacency: np.ndarray, normalize: bool):\n", " \"\"\" Return:\n", " L (n x n ndarray): combinatorial or symmetric normalized Laplacian.\n", " \"\"\"\n", " # Your code here" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "laplacian_comb = compute_laplacian(adjacency, normalize=False)\n", "laplacian_norm = compute_laplacian(adjacency, normalize=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Question 3: Eigendecomposition\n", "\n", "For both Laplacian matrices, compute the eigendecomposition $L = U^\\top \\Lambda U$, where the columns $u_k \\in \\mathbb{R}^N$ of $U = [u_1, \\dots, u_N] \\in \\mathbb{R}^{N \\times N}$ are the eigenvectors and the diagonal elements $\\lambda_k = \\Lambda_{kk}$ are the corresponding eigenvalues. Make sure that the eigenvalues are ordered, i.e., $\\lambda_1 \\leq \\lambda_2 \\leq \\dots \\leq \\lambda_N$. \n", "\n", "Justify your choice of a solver for the eigendecomposition." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*Your answer here*" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def spectral_decomposition(laplacian: np.ndarray):\n", " \"\"\" Return:\n", " lamb (np.array): eigenvalues of the Laplacian\n", " U (np.ndarray): corresponding eigenvectors.\n", " \"\"\"\n", " # Your code here" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "lamb_comb, U_comb = spectral_decomposition(laplacian_comb)\n", "lamb_norm, U_norm = spectral_decomposition(laplacian_norm)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Question 4: Interpretation\n", "We plot the sorted eigenvalues as a function of their index:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.figure(figsize=(12,5))\n", "plt.subplot(121)\n", "plt.plot(lamb_comb)\n", "plt.xlabel('Index')\n", "plt.ylabel('Eigenvalue')\n", "plt.title('Eigenvalues $L_{comb}$')\n", "plt.subplot(122)\n", "plt.plot(lamb_norm)\n", "plt.xlabel('Index')\n", "plt.ylabel('Eigenvalue')\n", "plt.title('Eigenvalues $L_{norm}$')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What is the lowest eigenvalue $\\lambda_0$ and the corresponding eigenvector $u_0$? Answer for both Laplacian matrices." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*Your answer here*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When filtering a signal or computing polynomials, which Laplacian provides the best numerical stability? Justify your answer." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*Your answer here*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Question 5: Connected components\n", "The eigendecomposition provides an easy way to compute the number of connected components in the graph. Fill the following function:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def compute_number_connected_components(lamb: np.array, threshold: float):\n", " \"\"\" lamb: array of eigenvalues of a Laplacian\n", " Return:\n", " n_components (int): number of connected components.\n", " \"\"\"\n", " # Your code here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Tune the parameters $\\epsilon$ and $\\sigma$ of the similarity graph so that the graph is connected. Otherwise, clustering would be too simple!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(compute_number_connected_components(lamb_norm, threshold=None))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Spectral clustering\n", "\n", "Let us now see one application of spectral graph theory to clustering the two moon dataset.\n", "\n", "#### Question 6: Baseline\n", "\n", "As a baseline, let us first see how the simplest clustering algorithm, K-means, performs on this dataset. Use K-means to assign a cluster to each point." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from sklearn.cluster import KMeans\n", "\n", "# Your code here\n", "y_pred = # Vector with cluster assignments\n", "plt.scatter(X[:, 0], X[:, 1], c=y_pred)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "K-means cannot find a good solution to this problem. Why?" 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*Your answer here*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Question 7: Spectral clustering\n", "\n", "As opposed to naive K-means, spectral clustering doesn't operate on the input space but on the eigenspace of the graph that represents the data. Implement spectral clustering. You can use \n", "[this tutorial](http://lasa.epfl.ch/teaching/lectures/ML_Phd/Notes/tutoSC.pdf)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "class SpectralClustering():\n", " def __init__(self, n_classes: int, normalize: bool):\n", " self.n_classes = n_classes\n", " self.normalize = normalize\n", " self.laplacian = None\n", " self.e = None\n", " self.U = None\n", " self.clustering_method = # Your code here\n", " \n", " def fit_predict(self, adjacency):\n", " \"\"\" Your code should be correct both for the combinatorial\n", " and the symmetric normalized spectral clustering.\n", " Return:\n", " y_pred (np.ndarray): cluster assignments.\n", " \"\"\"\n", " # Your code here\n", " return y_pred" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(\"Connected components:\", compute_number_connected_components(lamb_norm, threshold=1e-12))\n", "spectral_clustering = SpectralClustering(n_classes=2, normalize=True)\n", "y_pred = spectral_clustering.fit_predict(adjacency)\n", "plt.scatter(X[:, 0], X[:, 1], c=y_pred)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Question 8: On your dataset\n", "\n", "Can you think of another 2D dataset in which k-means would badly perform, but spectral clustering would not?\n", "Construct it!\n", "For this question you can import any dataset of your choice, for example from `sklearn.datasets` or `pygsp.graphs`, but you can also get creative and define something of your own. First, create and plot the dataset." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Your code here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Run K-means:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Your code here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create the similarity graph, and run spectral clustering with both the combinatorial and normalized Laplacian matrices:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Your code here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*Comment your results here*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Dimensionality Reduction with Laplacian Eigenmaps\n", "\n", "Most datasets are very high-dimensional, which means it can be very hard to understand their geometry. Fortunately, there exists multiple techniques that can help us to reduce the dimensionality of the data, and allow us to visualize it. \n", "\n", "In this part of the assignment we will use MNIST to compare these techniques. Indeed, without dimensionality reduction it would be very difficult to answer questions like: are the different digits clustered together in different areas of space? 
\n", "\n", "But first, let's load our dataset:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from utils import load_mnist\n", "\n", "X_mnist, y_mnist = load_mnist()\n", "classes = np.unique(y_mnist)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Question 9: Laplacian eigenmaps\n", "\n", "Most dimensionality reduction algorithms are constructed such that some property of the dataset remains invariant in the lower dimensional representation. Before implementing laplacian eigenmaps, can you say what property of the data does this algorithm preserve?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*Your answer here*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Implement a function that uses Laplacian eigenmaps to do dimensionality reduction." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def laplacian_eigenmaps(X:np.ndarray, dim: int, sigma: float, epsilon: float, normalize: bool):\n", " \"\"\" Return:\n", " coords (n x dim array): new coordinates for the data points.\"\"\"\n", " # Your code here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Use this function to visualize MNIST in 2D. Feel free to play with the different parameters." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dim = 2\n", "# Your code here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Visualize MNIST in 3D:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dim = 3\n", "# Your code here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Question 10: Comparison with other methods \n", "We provide the visualization of MNIST with other methods:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from sklearn.decomposition import PCA\n", "from sklearn.manifold import TSNE, Isomap\n", "\n", "# This cell can take a few minutes to run\n", "run_this_cell = False\n", "\n", "if run_this_cell:\n", " # In 2d\n", " embeddings = [PCA(n_components=2, copy=True, whiten=True, tol=1e-5),\n", " Isomap(n_components=2, n_neighbors=5),\n", " TSNE(n_components=2)]\n", "\n", " for embedding in embeddings:\n", " X_embedded = embedding.fit_transform(X_mnist)\n", " fig = plt.figure()\n", " for i in classes:\n", " mask = y_mnist == i\n", " plt.scatter(X_embedded[mask, 0], X_embedded[mask, 1], label=i)\n", " plt.legend()\n", " plt.title('Embedding method: '+ type(embedding).__name__)\n", " plt.show()\n", "\n", " # In 3d\n", " embeddings = [PCA(n_components=3, copy=True, whiten=True, tol=1e-5),\n", " Isomap(n_components=3, n_neighbors=5),\n", " TSNE(n_components=3)]\n", "\n", " for embedding in embeddings:\n", " X_embedded = embedding.fit_transform(X_mnist)\n", " fig = plt.figure()\n", " ax = Axes3D(fig)\n", " for i in classes:\n", " mask = y_mnist == i\n", " ax.scatter(X_embedded[mask, 0], X_embedded[mask, 1], X_embedded[mask, 2], label=i)\n", " ax.legend()\n", " ax.title.set_text('Embedding method: '+ type(embedding).__name__)\n", " plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In a few words, what are the principles guiding the design of each method? Compare their results." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*Your answer here*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## Part II: Regularization on graphs with Graph Signal Processing\n", "\n", "In this part of the assignment we are going to familiarize ourselves with the main concepts in Graph Signal Processing and regularization on graphs in general. From now on, you can only use the following libraries as well as the functions that you implemented in the previous parts." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "from pygsp.graphs import Bunny" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this exercise we will use a nearest-neighbor graph constructed from the Stanford Bunny point cloud included in the PyGSP library." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "G = Bunny()\n", "adjacency = np.asarray(G.W.todense())\n", "n_nodes = adjacency.shape[0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will use the following function to plot our signals on this graph." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def plot_bunny(x=None, title='', vlim=[-0.03, 0.03]):\n", " fig = plt.gcf()\n", " ax = plt.gca()\n", " if not isinstance(ax, Axes3D):\n", " ax = plt.subplot(111, projection='3d')\n", " if x is not None:\n", " x = np.squeeze(x)\n", "\n", " p = ax.scatter(G.coords[:,0], G.coords[:,1], G.coords[:,2], c=x, marker='o',\n", " s=5, cmap='RdBu_r', vmin=vlim[0], vmax=vlim[1])\n", " ax.view_init(elev=-90, azim=90)\n", " ax.dist = 7\n", " ax.set_axis_off()\n", " ax.set_title(title)\n", " if x is not None:\n", " fig.colorbar(p)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.subplot(111, projection='3d')\n", "plot_bunny()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Question 11: Graph frequencies\n", "\n", "Let us start by constructing the normalized graph laplacians from the adjacency matrix and find its spectral decomposition." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "laplacian = compute_laplacian(adjacency, normalize=True)\n", "lam, U = spectral_decomposition(laplacian)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Plot the eigenvalues." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.figure(figsize=(6, 5))\n", "plt.plot(lam)\n", "plt.title('Eigenvalues $L_{norm}$')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To make things more clear we will plot some of its eigenvectors (0, 1, 3, 10, 100) as signals on the bunny graph." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.figure(figsize=(18, 9))\n", "plt.subplot(231, projection='3d')\n", "plot_bunny(x=U[:,0], title='Eigenvector #0')\n", "plt.subplot(232, projection='3d')\n", "plot_bunny(x=U[:,1], title='Eigenvector #1')\n", "plt.subplot(233, projection='3d')\n", "plot_bunny(x=U[:,2], title='Eigenvector #2')\n", "\n", "plt.subplot(234, projection='3d')\n", "plot_bunny(x=U[:,3], title='Eigenvector #3')\n", "plt.subplot(235, projection='3d')\n", "plot_bunny(x=U[:,10], title='Eigenvector #10')\n", "plt.subplot(236, projection='3d')\n", "plot_bunny(x=U[:,100], title='Eigenvector #100')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What can you say in terms of the variation (smoothness) of these signals? How can the smoothness of a signal be measured?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*Your answer here*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Question 12: Graph Fourier Transform\n", "\n", "Create a function to compute the Graph Fourier Transform (GFT) of a graph signal and its inverse.\n", "**Note**: You can assume that you have internal access to the eigendecomposition (`U` and `lam`) of the laplacian." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def GFT(signal: np.ndarray):\n", " # Your code here\n", "\n", "def iGFT(fourier_coefficients: np.ndarray):\n", " # Your code here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, let's create a graph signal:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = G.coords[:, 0] + G.coords[:, 1] + 3 * G.coords[:, 2]\n", "x /= np.linalg.norm(x) \n", "\n", "noise = np.random.randn(n_nodes)\n", "noise /= np.linalg.norm(noise) \n", "\n", "x_noisy = x + 0.3*noise\n", "\n", "plot_bunny(x_noisy, vlim=[min(x_noisy), max(x_noisy)])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "and plot its graph spectrum:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.figure(figsize=(10, 6))\n", "plt.plot(lam, np.abs(GFT(x_noisy)), 'r.') \n", "plt.plot(lam, np.abs(GFT(x)), 'g-')\n", "plt.xlabel('$\\lambda$')\n", "plt.ylabel('GFT')\n", "plt.legend(['$x_{noisy}$', '$x$'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Question 13: Graph filters\n", "\n", "We will try to extract the signal from the noise using graph filters. Let us start by creating three ideal graph filters." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ideal_lp = np.ones((n_nodes,))\n", "ideal_bp = np.ones((n_nodes,))\n", "ideal_hp = np.ones((n_nodes,))\n", "\n", "ideal_lp[lam >= 0.1] = 0 # Low-pass filter with cut-off at lambda=0.1\n", "ideal_bp[lam < 0.1] = 0 # Band-pass filter with cut-offs at lambda=0.1 and lambda=0.5\n", "ideal_bp[lam > 0.5] = 0\n", "ideal_hp[lam <= 1] = 0 # High-pass filter with cut-off at lambda=1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Additionally, create the ideal graph filter that implements the solution of Tikhonov regularization." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "alpha = 0.99 / np.max(lam)\n", "\n", "ideal_tk = # Your code here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's plot the spectral responses:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.plot(lam, ideal_lp, '-', label='LP')\n", "plt.plot(lam, ideal_bp, '-', label='BP')\n", "plt.plot(lam, ideal_hp, '-', label='HP')\n", "plt.plot(lam, ideal_tk, '-', label='Tikhonov')\n", "plt.xlabel('$\\lambda$')\n", "plt.ylabel('Spectral response')\n", "plt.legend(loc='lower right')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create a function to filter a signal given an ideal graph filter" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def ideal_graph_filter(x: np.ndarray, spectral_response: np.ndarray):\n", " \"\"\"Return a filtered signal.\"\"\"\n", " # Your code here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us visualize the results:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x_lp = ideal_graph_filter(x_noisy,ideal_lp)\n", "x_bp = ideal_graph_filter(x_noisy,ideal_bp)\n", "x_hp = ideal_graph_filter(x_noisy,ideal_hp)\n", "x_tk = ideal_graph_filter(x_noisy,ideal_tk)\n", "\n", "plt.figure(figsize=(18, 9))\n", "plt.subplot(231, projection='3d')\n", "plot_bunny(x=x, title='signal (true)', vlim=[min(x), max(x)])\n", "plt.subplot(232, projection='3d')\n", "plot_bunny(x=x_noisy, title='signal (noisy)', vlim=[min(x), max(x)])\n", "plt.subplot(233, projection='3d')\n", "plot_bunny(x=x_lp, title='Low-pass', vlim=[min(x_lp), max(x_lp)])\n", "plt.subplot(234, projection='3d')\n", "plot_bunny(x=x_bp, title='Band-pass', vlim=[min(x_bp), max(x_bp)])\n", "plt.subplot(235, projection='3d')\n", "plot_bunny(x=x_hp, title='High-pass', vlim=[min(x_hp), max(x_hp)])\n", "plt.subplot(236, projection='3d')\n", "plot_bunny(x=x_tk, title='Tikhonov denoised signal', vlim=[min(x_tk), max(x_tk)])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "How would you link to the observations you made before about the spectral decomposition of the laplacian?\n", "Also, judging from the results, what type of model prior do you think Tikhonov regularization enforces?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*Your answer here*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Question 14: Polynomial graph filters\n", "\n", "We have seen how we can use the GFT to define different filters that enhance or reduce certain frequency bands. However, to do so, we require an explicit eigendecomposition of the graph laplacian, which has a cost $O(n^3)$. For very large graphs this is very intense computationally. We will now see how we can obtain similar results by filtering the signals directly without resorting to an eigendecomposition.\n", "\n", "The key idea is to use a polynomial of the graph laplacian to define a graph filter, i.e., $g(L)x=\\sum_{k=1}^K \\alpha_k L^k x$, and use the fact that the powers of a diagonalizable matrix can be written in terms of powers of its eigenvalues. 
This is\n", "$$\n", "L^k=(U\\Lambda U^T)^k=U\\Lambda^k U^T = U\\begin{bmatrix}\n", "(\\lambda_1)^k &\\dots & 0\\\\\n", "\\vdots & \\ddots & \\vdots\\\\\n", "0 & \\dots & (\\lambda_N)^k\n", "\\end{bmatrix} U^T.\n", "$$\n", "\n", "This means that a polynomial of the graph laplacian acts independently on each eigenvalue of the graph, and has a frequency spectrum of\n", "$$g(\\lambda)=\\sum_{k=1}^K \\alpha_k \\lambda^k.$$\n", "Hence,\n", "$$g(L)x=\\sum_{k=1}^K \\alpha_k L^k x=\\sum_{k=1}^K \\alpha_k U\\Lambda^k U^T x=U \\left(\\sum_{k=1}^K \\alpha_k\\Lambda^k \\right)U^T x=\\operatorname{iGFT}\\left(g(\\Lambda)\\operatorname{GFT}(x)\\right).$$\n", "\n", "With these ingredients, we have reduced the design of graph filters in the vertex domain to a regression task that approximates a given spectral response by a polynomial. There are multiple ways to do this, but in this assignment we will implement a very simple strategy based on [least-squares regression](https://en.wikipedia.org/wiki/Polynomial_regression#Matrix_form_and_calculation_of_estimates)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Implement a function to find the coefficients of a polynomial that approximates a given ideal filter.\n", "**Hint:** `np.vander` and `np.linalg.lstsq`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def fit_polynomial(lam: np.ndarray, order: int, spectral_response: np.ndarray):\n", "    \"\"\" Return an array of polynomial coefficients of length 'order'.\"\"\"\n", "    # Your code here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Implement a function to compute the frequency response of that filter." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def polynomial_graph_filter_response(coeff: np.array, lam: np.ndarray):\n", "    \"\"\" Return an array of the same shape as lam.\n", "    response[i] is the spectral response at frequency lam[i]. \"\"\"\n", "    # Your code here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us fit the Tikhonov ideal filter with several polynomials of different orders." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.plot(lam, ideal_tk)\n", "orders = [1, 2, 3, 5, 10, 20]\n", "for order in orders:\n", "    coeff_tk = fit_polynomial(lam, order, ideal_tk)\n", "    plt.plot(lam, polynomial_graph_filter_response(coeff_tk, lam))\n", "\n", "plt.xlabel('$\\lambda$')\n", "plt.ylabel('Spectral response')\n", "plt.legend(orders)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So far, we have only defined a way to compute the coefficients of our laplacian polynomial. Let us now compute our graph filter." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def polynomial_graph_filter(coeff: np.array, laplacian: np.ndarray):\n", "    \"\"\" Return the laplacian polynomial with coefficients 'coeff'. \"\"\"\n", "    # Your code here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Based on the previous plot, choose a filter order that achieves (in your opinion) a good tradeoff in terms of computational complexity and response accuracy."
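] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A hedged aside that may help ground the choice (once `ideal_tk` has been defined above): the least-squares fitting error can be inspected as a function of the polynomial order." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Sketch: residual of the polynomial least-squares fit for several orders.\n", "for order_demo in [1, 2, 3, 5, 10, 20]:\n", "    V_demo = np.vander(lam, order_demo, increasing=True)  # basis [1, lam, ..., lam^(order-1)]\n", "    _, res_demo, _, _ = np.linalg.lstsq(V_demo, ideal_tk, rcond=None)\n", "    print('order', order_demo, 'residual', res_demo)"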
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "order = # Your code here\n", "coeff_tk = fit_polynomial(lam, order, ideal_tk)\n", "g_tk = polynomial_graph_filter(coeff_tk, laplacian)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Question 15: ARMA filter\n", "\n", "As you have seen in class, polynomial graph filters are only one of the ways in which you can approximate ideal graph filters. \n", "In this sense, ARMA filters are a natural way to implement Tikhonov denoising on graphs.\n", "Let us recall the general solution of the Tikhonov regularized denoising problem \n", "\n", "$$y=(I+\\alpha L)^{-1}x. $$\n", "\n", "With a little bit of algebra manipulation we can rewrite this expression as\n", "$$\n", " y = -\\alpha L y + x,\n", "$$\n", "from which we can derive the iterative algorithm\n", "$$\n", " y_k = -\\alpha L y_{k-1} + x\\qquad k=1,2,\\dots\n", "$$\n", "which is guaranteed to converge as long as $\\alpha \\lambda_{max} < 1$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Implement the ARMA version of Tikhonov regularization." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def arma_tikhonov(x: np.ndarray, laplacian: np.ndarray, alpha: float, max_iter=50):\n", " \"\"\" Return an array of the same shape as x.\"\"\"\n", " # Your code here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Filter the previous noisy graph signal with the polynomial and ARMA approximations of the ideal Tikhonov filter." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x_tk_polynomial = # Your code here\n", "x_tk_arma = arma_tikhonov(x_noisy, laplacian, alpha)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us compare with the previous version." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.figure(figsize=(18, 4))\n", "plt.subplot(131, projection='3d')\n", "plot_bunny(x_tk, title='Ideal filter', vlim=[min(x_tk), max(x_tk)])\n", "plt.subplot(132, projection='3d')\n", "plot_bunny(x_tk_polynomial, title='Polynomial filter', vlim=[min(x_tk), max(x_tk)])\n", "plt.subplot(133, projection='3d')\n", "plot_bunny(x_tk_arma, title='ARMA filter', vlim=[min(x_tk), max(x_tk)])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## Part III: Machine Learning on Graphs\n", "\n", "So far, we have only played with toy examples. Let us see the use of these tools in practice! In particular, let us see how we can use some graph filters to construct features to feed a classifier. For this part of the assignment we will import some extra packages." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import time\n", "\n", "import networkx as nx\n", "from sklearn.linear_model import LogisticRegression\n", "\n", "import torch\n", "import torch.nn as nn\n", "import torch.nn.functional as F\n", "\n", "import dgl.function as fn\n", "from dgl import DGLGraph\n", "from dgl.data.citation_graph import load_cora\n", "\n", "np.random.seed(0)\n", "torch.manual_seed(1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will use the CORA dataset and the citation graph that we created in Assignment 1. 
However, to simplify the next tasks we will directly use the preprocessed version of this dataset contained within the Deep Graph Library (DGL).\n", "\n", "In this assignment, we will interpret CORA's features as multidimensional graph signals living on the citation graph.\n", "Our task is to design a classifier that uses these features and the geometry of the graph to identify the type of paper each node represents.\n", "\n", "The goal of this exercise is to do semi-supervised learning on graphs.\n", "We assume that we know to which scientific field a small subset of the papers belongs (the ones contained in `train_mask`).\n", "The goal is to predict to which field the other papers belong, using both the citation graph and the bag-of-words representation of each paper." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "cora = load_cora()\n", "\n", "features = torch.FloatTensor(cora.features)  # Feature vector for each paper\n", "labels = torch.LongTensor(cora.labels)  # The field to which each paper belongs\n", "\n", "train_mask = torch.BoolTensor(cora.train_mask)  # Mask of nodes selected for training\n", "val_mask = torch.BoolTensor(cora.val_mask)  # Mask of nodes selected for validation\n", "test_mask = torch.BoolTensor(cora.test_mask)  # Mask of nodes selected for testing\n", "\n", "in_feats = features.shape[1]\n", "n_classes = cora.num_labels\n", "n_edges = cora.graph.number_of_edges()\n", "\n", "graph = cora.graph\n", "adjacency = np.asarray(nx.to_numpy_matrix(graph))\n", "n_nodes = adjacency.shape[0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For this exercise we will use the normalized laplacian." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "laplacian = compute_laplacian(adjacency, normalize=True)\n", "lam, U = spectral_decomposition(laplacian)\n", "lam_max = np.max(lam)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Question 16: Logistic regression\n", "\n", "The simplest classification method consists in ignoring the citation graph and trying to classify the papers using only the features.\n", "In this case, the problem is viewed as a standard classification task.\n", "To train our classifier we will select a few nodes in our graph for training and fit a [logistic regression classifier](https://en.wikipedia.org/wiki/Logistic_regression) on them.\n", "To avoid overfitting to the test set when we do hyperparameter tuning, we will also select a validation set.\n", "Finally, we will test our classifier on the rest of the nodes.\n", "**Hint:** use `sklearn.linear_model.LogisticRegression`."
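] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Before the actual task, a minimal self-contained usage sketch of the `LogisticRegression` API on random stand-in data (hypothetical arrays, not the CORA features):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Illustrative sketch: fit on random stand-in data, then score (mean accuracy).\n", "X_demo = np.random.rand(20, 5)\n", "y_demo = np.random.randint(0, 2, size=20)\n", "clf_demo = LogisticRegression(C=1.0, max_iter=1000).fit(X_demo, y_demo)\n", "print(clf_demo.score(X_demo, y_demo))"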
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "train_features = features[train_mask]\n", "train_labels = labels[train_mask]\n", "val_features = features[val_mask]\n", "val_labels = labels[val_mask]\n", "test_features = features[test_mask]\n", "test_labels = labels[test_mask]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Fit a logistic regression model\n", "# Your code here" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "train_acc = # Your code here\n", "val_acc = # Your code here\n", "test_acc = # Your code here\n", "\n", "print('Train accuracy {:.4f} | Validation accuracy {:.4f} | Test accuracy {:.4f}'.format(train_acc, val_acc, test_acc))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Question 17: Handcrafted graph filters\n", "\n", "That's not a bad start! Now, let's try to improve a bit the results by taking into account the graph structure using tools from GSP. For this purpose, we will design a handcrafted filter that will be used to denoise the signal, before feeding it to a logistic regression.\n", "\n", "However, before we start, what hypothesis can you make on the spectral properties of the denoised signal?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*Your answer here*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Based on this prior, design an ideal filter response that you believe could enhance important features of the graph. \n", "\n", "**Note:** you just need to design one graph filter that we will apply to all features. Don't design a different filter for each feature. \n", "\n", "**Note:** finding the right filter can be very challenging, don't worry if you can't find it. Just make sure you experiment with a few configurations and parameters." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ideal_filter = # Store your spectral response here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Choose a filter order to approximate your filter using laplacian polynomials." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "order = # Your code here\n", "\n", "coeff = fit_polynomial(lam, order, ideal_filter)\n", "graph_filter = polynomial_graph_filter(coeff, laplacian)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's plot the frequency response of your spectral template and its polynomial approximation." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.plot(lam, ideal_filter)\n", "plt.plot(lam, polynomial_graph_filter_response(coeff, lam))\n", "plt.legend(['Ideal', 'Polynomial'])\n", "plt.xlabel('$\\lambda$')\n", "plt.ylabel('Spectral response')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, let's create the new features." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "filtered_features = graph_filter @ features.numpy()\n", "\n", "train_features = filtered_features[train_mask,:]\n", "train_labels = labels[train_mask]\n", "\n", "val_features = filtered_features[val_mask,:]\n", "val_labels = labels[val_mask]\n", "\n", "test_features = filtered_features[test_mask,:]\n", "test_labels = labels[test_mask]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Train another logistic regression classifier on the new features. 
Remember to play with the regularization parameters to achieve a well-performing model." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Your code here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Evaluate your model." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "train_acc = # Your code here\n", "val_acc = # Your code here\n", "test_acc = # Your code here\n", "\n", "print('Train accuracy {:.4f} | Validation accuracy {:.4f} | Test accuracy {:.4f}'.format(train_acc, val_acc, test_acc))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Question 18: Graph convolutional networks\n", "\n", "By now, you will probably have seen that it is challenging to find the right combination of spectral response, filter parameters and regularization method. And in most cases, this is a painstaking job. Wouldn't it be great to automate these tasks?\n", "\n", "Fortunately, this is possible if we use the right tools! Specifically, we will see that Graph Convolutional Networks are a great framework to automate the feature extraction process.\n", "\n", "In this exercise, we will follow the same classification pipeline as above, but instead of hand-crafting our filter we will let `PyTorch` find the coefficients for us using gradient descent.\n", "\n", "In this section, most of the code is already written. Try to understand it and to play with some parameters. It may be useful if you want to solve some learning task in your project.\n", "\n", "We start by constructing a `LaplacianPolynomial` model in `DGL`. It computes the function $f(X) = \\sum_{i=0}^{k} \\alpha_i L^i X \\theta$, where the trainable parameters are the coefficients $\\alpha_i$ and the matrix $\\theta$. This function can be interpreted as a filtering of $X$ by $\\sum_{i=0}^{k} \\alpha_i L^i$ followed by a linear layer."
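] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A note on the implementation below (grounded in its `forward` method): the powers $L^i X \\theta$ are never formed as explicit matrix powers. Instead, the model repeatedly applies the normalized Laplacian through message passing, using $L_{norm} Z = Z - D^{-1/2} A D^{-1/2} Z$, so no eigendecomposition of the laplacian is required."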
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "class LaplacianPolynomial(nn.Module):\n", " def __init__(self,\n", " in_feats: int,\n", " out_feats: int,\n", " k: int,\n", " dropout_prob: float,\n", " norm=True):\n", " super().__init__()\n", " self._in_feats = in_feats\n", " self._out_feats = out_feats\n", " self._k = k\n", " self._norm = norm\n", " # Contains the weights learned by the Laplacian polynomial\n", " self.pol_weights = nn.Parameter(torch.Tensor(self._k + 1))\n", " # Contains the weights learned by the logistic regression (without bias)\n", " self.logr_weights = nn.Parameter(torch.Tensor(in_feats, out_feats))\n", " self.dropout = nn.Dropout(p=dropout_prob)\n", " self.reset_parameters()\n", "\n", " def reset_parameters(self):\n", " \"\"\"Reinitialize learnable parameters.\"\"\"\n", " torch.manual_seed(0)\n", " torch.nn.init.xavier_uniform_(self.logr_weights, gain=0.01)\n", " torch.nn.init.normal_(self.pol_weights, mean=0.0, std=1e-3)\n", "\n", " def forward(self, graph, feat):\n", " r\"\"\"Compute graph convolution.\n", "\n", " Notes\n", " -----\n", " * Input shape: :math:`(N, *, \\text{in_feats})` where * means any number of additional\n", " dimensions, :math:`N` is the number of nodes.\n", " * Output shape: :math:`(N, *, \\text{out_feats})` where all but the last dimension are\n", " the same shape as the input.\n", "\n", " Parameters\n", " ----------\n", " graph (DGLGraph) : The graph.\n", " feat (torch.Tensor): The input feature\n", "\n", " Returns\n", " -------\n", " (torch.Tensor) The output feature\n", " \"\"\"\n", " feat = self.dropout(feat)\n", " graph = graph.local_var()\n", " \n", " # D^(-1/2)\n", " norm = torch.pow(graph.in_degrees().float().clamp(min=1), -0.5)\n", " shp = norm.shape + (1,) * (feat.dim() - 1)\n", " norm = torch.reshape(norm, shp)\n", "\n", " # mult W first to reduce the feature size for aggregation.\n", " feat = torch.matmul(feat, self.logr_weights)\n", "\n", " result = self.pol_weights[0] * feat.clone()\n", "\n", " for i in range(1, self._k + 1):\n", " old_feat = feat.clone()\n", " if self._norm:\n", " feat = feat * norm\n", " graph.ndata['h'] = feat\n", " # Feat is not modified in place\n", " graph.update_all(fn.copy_src(src='h', out='m'),\n", " fn.sum(msg='m', out='h'))\n", " if self._norm:\n", " graph.ndata['h'] = graph.ndata['h'] * norm\n", "\n", " feat = old_feat - graph.ndata['h']\n", " result += self.pol_weights[i] * feat\n", "\n", " return result\n", "\n", " def extra_repr(self):\n", " \"\"\"Set the extra representation of the module,\n", " which will come into effect when printing the model.\n", " \"\"\"\n", " summary = 'in={_in_feats}, out={_out_feats}'\n", " summary += ', normalization={_norm}'\n", " return summary.format(**self.__dict__)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Once we have are model ready we just need to create a function that performs one step of our training loop, and another one that evaluates our model." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def train(model, g, features, labels, loss_fcn, train_mask, optimizer):\n", " model.train() # Activate dropout\n", " \n", " logits = model(g, features)\n", " loss = loss_fcn(logits[train_mask], labels[train_mask])\n", "\n", " optimizer.zero_grad()\n", " loss.backward()\n", " optimizer.step()\n", " return loss\n", "\n", "def evaluate(model, g, features, labels, mask):\n", " model.eval() # Deactivate dropout\n", " with torch.no_grad():\n", " logits = model(g, features)[mask] # only compute the evaluation set\n", " labels = labels[mask]\n", " _, indices = torch.max(logits, dim=1)\n", " correct = torch.sum(indices == labels)\n", " return correct.item() * 1.0 / len(labels)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Choose the training parameters." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pol_order = 3\n", "lr = 0.2\n", "weight_decay = 5e-6\n", "n_epochs = 1000\n", "p_dropout = 0.8" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And train the classifier end to end." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "graph = DGLGraph(cora.graph)\n", "\n", "model = LaplacianPolynomial(in_feats, n_classes, pol_order, p_dropout)\n", "\n", "loss_fcn = torch.nn.CrossEntropyLoss()\n", "optimizer = torch.optim.Adam(model.parameters(),\n", " lr=lr,\n", " weight_decay=weight_decay)\n", "\n", "dur = []\n", "for epoch in range(n_epochs):\n", " if epoch >= 3:\n", " t0 = time.time()\n", " loss = train(model, graph, features, labels, loss_fcn, train_mask, optimizer)\n", "\n", " if epoch >= 3:\n", " dur.append(time.time() - t0)\n", "\n", " acc = evaluate(model, graph, features, labels, val_mask)\n", " print(\"Epoch {:05d} | Time(s) {:.4f} | Train Loss {:.4f} | Val Accuracy {:.4f}\". format(\n", " epoch, np.mean(dur), loss.item(), acc))\n", "\n", "print()\n", "acc = evaluate(model, graph, features, labels, test_mask)\n", "print(\"Test Accuracy {:.4f}\".format(acc))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Trained this way our GCN based on polynomials of the laplacian is a black box. Fortunately, however, the only difference between this shallow model and our previous classifier is the way we chose the filter coefficients.\n", "\n", "Let's see what the network learned.\n", "Print the coefficients of the learned filter." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "coeff_gcn = # Your code here\n", "print(coeff_gcn)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To interpret the model we can plot the frequency response of the learned filter." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.semilogy(lam, np.abs(polynomial_graph_filter_response(coeff_gcn, lam)))\n", "plt.xlabel('$\\lambda$')\n", "plt.ylabel('Spectral response (db)')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Question 19\n", "\n", "As we said, the whole classification pipeline of the previous exercise is identical to the one we tried before: Graph filtering + Logistic regression. The only difference lies in the way we chose the filter coefficients. First we were choosing them manually, and now, we let `PyTorch` find them for us. 
However, if everything is correct, we should be able to use this filter to construct new hand-crafted features and train a logistic regression model that achieves good accuracy on the training set. Let's do that!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Use the learned coefficients to construct a new feature extractor:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "graph_gcn_filter = # Your code here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's extract the new features by filtering the data:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "features_gcn = graph_gcn_filter @ features.numpy()\n", "\n", "train_features_gcn = features_gcn[train_mask, :]\n", "train_labels = labels[train_mask]\n", "val_features_gcn = features_gcn[val_mask, :]\n", "val_labels = labels[val_mask]\n", "test_features_gcn = features_gcn[test_mask, :]\n", "test_labels = labels[test_mask]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Train a logistic regression on these features:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Your code here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, let's evaluate this model:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "train_acc = # Your code here\n", "val_acc = # Your code here\n", "test_acc = # Your code here\n", "\n", "print('Train accuracy {:.4f} | Validation accuracy {:.4f} | Test accuracy {:.4f}'.format(train_acc, val_acc, test_acc))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The performance of this model may not be exactly the same as the one obtained with PyTorch. What are the differences in the training procedure that can explain this gap?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*Your answer here*" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 4 }