{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# NTDS'18 milestone 1: network collection and properties\n", "[Effrosyni Simou](https://lts4.epfl.ch/simou), [EPFL LTS4](https://lts4.epfl.ch)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Students\n", "\n", "* Team: `42`\n", "* Students: `Alexandre Poussard, Robin Leurent, Vincent Coriou, Pierre Fouché`\n", "* Dataset: [`Flight routes`](https://openflights.org/data.html)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Rules\n", "\n", "* Milestones have to be completed by teams. No collaboration between teams is allowed.\n", "* Textual answers shall be short: typically one to three sentences.\n", "* Code has to be clean.\n", "* You cannot import any library other than those we imported.\n", "* When submitting, the notebook is executed and the results are stored. I.e., if you open the notebook again it should show numerical results and plots. We won't be able to execute your notebooks.\n", "* The notebook is re-executed from a blank state before submission. This ensures that it is reproducible. You can click \"Kernel\" then \"Restart & Run All\" in Jupyter." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Objective " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The purpose of this milestone is to start getting acquainted with the network that you will use for this class. In the first part of the milestone you will import your data using [Pandas](http://pandas.pydata.org) and you will create the adjacency matrix using [NumPy](http://www.numpy.org). This part is project-specific. In the second part you will have to compute some basic properties of your network. **For the computation of the properties you are only allowed to use the packages that have been imported in the cell below.** You are not allowed to use any graph-specific toolboxes for this milestone (such as networkx and PyGSP). 
Furthermore, the aim is not to blindly compute the network properties, but also to start thinking about what kind of network you will be working with this semester. " ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Part 1 - Import your data and manipulate them. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### A. Load your data in a Pandas dataframe." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, you should define and understand what your nodes are, what features you have, and what your labels are. Please provide below a Pandas dataframe where each row corresponds to a node with its features and labels. For example, in the case of the Free Music Archive (FMA) project, each row of the dataframe would be of the following form:\n", "\n", "\n", "| Track | Feature 1 | Feature 2 | . . . | Feature 518| Label 1 | Label 2 |. . .|Label 16|\n", "|:-------:|:-----------:|:---------:|:-----:|:----------:|:--------:|:--------:|:---:|:------:|\n", "| | | | | | | | | |\n", "\n", "It is possible that in some of the projects either the features or the labels are not available. This is OK; in that case, just make sure that you create a dataframe where each of the rows corresponds to a node and its associated features or labels." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
|  | node_idx | AirportID | Airline | Stops | DestAirportID | SourceAirportID | DestSourceRatio | Country |\n",
"|---|---|---|---|---|---|---|---|---|\n",
"| 0 | 0 | 1 | 2.0 | 0.0 | 5.0 | 5.0 | 1.000000 | Papua New Guinea |\n",
"| 1 | 1 | 2 | 2.0 | 0.0 | 8.0 | 8.0 | 1.000000 | Papua New Guinea |\n",
"| 2 | 2 | 3 | 2.0 | 0.0 | 10.0 | 12.0 | 0.833333 | Papua New Guinea |\n",
"| 3 | 3 | 4 | 2.0 | 0.0 | 11.0 | 11.0 | 1.000000 | Papua New Guinea |\n",
"| 4 | 4 | 5 | 4.0 | 0.0 | 51.0 | 49.0 | 1.040816 | Papua New Guinea |\n", "