{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# [NTDS'17] assignment 1: Student Solution\n", "[ntds'17]: https://github.com/mdeff/ntds_2017\n", "\n", "Florian Benedikt Roth" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Objective of Exercise\n", "The aim of this exercise is to learn how to create your own, real network using data collected from the Internet and then to discover some properties of the collected network. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Resources\n", "You might want to have a look at the following resources before starting:\n", "\n", "* [Twitter REST API](https://dev.twitter.com/rest/public)\n", "* [Tweepy Documentation](http://tweepy.readthedocs.io/en/v3.5.0/)\n", "* [Tutorial \"Mining Twitter data with Python\"](https://marcobonzanini.com/2015/03/02/mining-twitter-data-with-python-part-1/)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 1. Collect a Twitter Network" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In order to collect data from Twitter you will need to generate access tokens. To do this you will need to register a [client application with Twitter](https://apps.twitter.com/). Once you are done you should have your tokens. You can now create a `credentials.ini` file as follows:\n", "```\n", "[twitter]\n", "consumer_key = YOUR-CONSUMER-KEY\n", "consumer_secret = YOUR-CONSUMER-SECRET\n", "access_token = YOUR-ACCESS-TOKEN\n", "access_secret = YOUR-ACCESS-SECRET\n", "```\n", "In this way you will have this information readily available to you. 
" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true }, "outputs": [], "source": [ "%matplotlib inline\n", "\n", "import os\n", "import random\n", "import configparser\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import copy\n", "import pickle \n", "from datetime import datetime\n", "from pprint import pprint\n", "import tweepy " ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Read the confidential token.\n", "credentials = configparser.ConfigParser()\n", "credentials.read(os.path.join('..','Data', 'credentials.ini'))\n", "\n", "#authentication\n", "auth = tweepy.OAuthHandler(credentials.get('twitter', 'consumer_key'), credentials.get('twitter', 'consumer_secret'))\n", "auth.set_access_token(credentials.get('twitter', 'access_token'), credentials.get('twitter', 'access_secret'))\n", "\n", "#construct API instance\n", "#deal with rate limits and notify when delayed because of rate limits\n", "api = tweepy.API(auth,wait_on_rate_limit=True, wait_on_rate_limit_notify=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now you are all set up to start collecting data from Twitter! " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this exercise we will construct a network with the following logic:\n", "\n", "1) We will choose a `user_id` in Twitter to be our first node. \n", "\n", "2) We will find (some) of the users who are both following `user_id` and are being followed by `user_id`. From now on we will call such users \"connections\" of `user_id`. We will place these user ids in a list called `first_nodes`. \n", "\n", "3) For every node in the list `first_nodes` we will then find (some) of the users who are following and are being followed by this node (aka the connections of this node). 
The user ids collected in this step will be placed in a list called `second_nodes`.\n", "\n", "4) The collection of the ids of all nodes (aka Twitter users) that we have collected so far will be placed in a list called `all_nodes`.\n", "\n", "5) Since we have only collected a subset of all possible \"connections\" for our nodes we have to check if there are any remaining inner connections that we have missed.\n", "\n", "The entire network is to be organized in a dictionary whose keys are Twitter user ids (a number identifying each user on Twitter) and whose values are the lists of ids of that user's connections.\n", "\n", "So, let us begin. The first thing that you will have to do is to choose the node from which everything will start. I have chosen the Twitter account of [Applied Machine Learning Days](https://www.appliedmldays.org) that will take place in January 2018 at EPFL. You may change that if you wish to, but please make sure that the user you choose has both followers and friends and allows you to access this data." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The chosen user RohdeSchwarz has 6339 followers and 455 friends\n" ] } ], "source": [ "user = 'RohdeSchwarz' #'appliedmldays' #'tudresden_de' #'barkhausensarmy' \n", "user_obj = api.get_user(user) #'RohdeSchwarz' #'dl_weekly'\n", "user_id = user_obj.id\n", "\n", "print('The chosen user {} has {} followers and {} friends'.format(user_obj.screen_name, user_obj.followers_count, user_obj.friends_count))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the following cell write a function that takes as an argument the Twitter id of a user and returns a list with the **ids** of his connections. 
Take into account the case where a user does not allow you to access this information.\n", "\n", "**Reminder:** By connections we mean users that are both followers and friends of a given user. A friend of a given user is an account that this user follows." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def find_connections(user_id, limit_min=5):\n", " followers = []\n", " friends=[]\n", " connections = []\n", " #limit_min = 5 # limit in minutes per node that the program will additionally wait to get all friends and followers\n", " # if the limit would be exceeded, the user will be replaced\n", " # take into account that this will decrease the probability of users with many connections\n", " # set to -1 to wait as long as it takes\n", " # 5000 follower/friends ~ 1 minute\n", " \n", " user_obj = api.get_user(user_id)\n", " name ,fol_cnt, fri_cnt = user_obj.screen_name, user_obj.followers_count, user_obj.friends_count\n", " \n", " # ask for number of followers & friends so that requests that would take too long are filtered out\n", " if max(fol_cnt, fri_cnt) > 5000:\n", " minutes = np.ceil(max(fol_cnt,fri_cnt)/5000-1)\n", " if limit_min < 0:\n", " print('# Because {}/{} has {} followers and {} friends the time waiting for \\n the rate limit to reset will increase by ~ {} minutes.'.format(name,user_id,fol_cnt,fri_cnt,minutes))\n", " elif minutes > limit_min:\n", " print('# Because {}/{} has {} followers and {} friends the time waiting for \\n the rate limit to reset would increase by ~ {} minutes.'.format(name,user_id,fol_cnt,fri_cnt,minutes))\n", " print(' Due to the chosen limit of {} minutes per node this user will be replaced'.format(limit_min))\n", " connections = [float('Nan')]\n", " return connections\n", " \n", " # get followers_ids & friends_ids\n", " try:\n", " for fol in tweepy.Cursor(api.followers_ids, user_id).pages():\n", " followers.extend(fol)\n", " for fr in tweepy.Cursor(api.friends_ids, 
user_id).pages():\n", " friends.extend(fr)\n", " \n", " # if user does not allow accessing its friends & followers -> return NaN\n", " except tweepy.TweepError:\n", " print('# Could not access the followers/friends of the user {}/{}'.format(name,user_id))\n", " connections = [float('Nan')]\n", " return connections\n", " \n", " # find connections as intersections between friends & followers\n", " connections = list(np.intersect1d(followers,friends))\n", " return connections" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "# Because RohdeSchwarz/53101979 has 6339 followers and 455 friends the time waiting for \n", " the rate limit to reset will increase by ~ 1.0 minutes.\n", "RohdeSchwarz has 209 connections\n" ] } ], "source": [ "first_connections = find_connections(user_id,-1)\n", "if np.isnan(first_connections[0]):\n", " print('Choose a new starting node.')\n", "else:\n", " print('{} has {} connections'.format(user, len(first_connections)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Collect your `first_nodes` and `second_nodes` and organize your collected nodes and their connections in the dictionary called `network`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Hints:\n", "* Use `random.choice([1,3,4])` to randomly choose a number in `[1, 3, 4]`.\n", "* Use the `append` and `remove` methods to add and remove an element from a Python list.\n", "* The `pop` method removes and returns the last item in the list." 
] }, { "cell_type": "code", "execution_count": 308, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def calc_time(level, how_many):\n", " # This function estimates how long the data collection will take under the following assumptions:\n", " # 1) all the users share their followers & friends\n", " # 2) no one has more than 5000 followers or friends OR limit_min = 0 \n", " # -> would lead to multiple requests per node otherwise\n", " # 3) all nodes have at least 'how_many' connections\n", " # \n", " # real network neglecting A1 & A2 -> takes more time\n", " # real network neglecting A3 -> takes less time\n", " \n", " n_max = 0\n", " for i in range(0,level+1): # calculating N_max\n", " n_max += how_many**(i)\n", " \n", " # get remaining api requests in the current rate limit slot and the time of reset\n", " # (a single status request is enough for both values)\n", " status = api.rate_limit_status()['resources']['friends']['/friends/ids']\n", " remaining = status['remaining']\n", " reset = status['reset']\n", " \n", " # add the amount of needed time_slots * seconds/time_slot to time of reset\n", " reset += np.floor((n_max-remaining)/15)*15*60\n", " print('The network you create will have up to {} nodes.'.format(n_max)) \n", " print(datetime.fromtimestamp(reset).strftime('Due to restrictions of the twitter API this takes about until %H:%M o\\'clock'))\n", " return" ] }, { "cell_type": "code", "execution_count": 309, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def get_nodes(n, collection, but=[]):\n", " # This function provides n random nodes from the given collection\n", " # excluding the entries in 'but'\n", " nodes = []\n", " i = 0\n", " \n", " flat = [x for sublist in but for x in sublist] # list of lists -> list containing all elements\n", " \n", " if not set(collection) <= set(flat): # don't start if the entire collection is excluded\n", " pool = list(set(collection)-set(flat)) # pool to choose from\n", " \n", " # stop when: 1) n nodes are found, or 2) no free nodes 
are left\n", " for i in range(0,min(n, len(pool))):\n", " k = random.randint(0,len(pool)-1) # choose a random element out of the pool\n", " nodes.append(pool[k]) # add it to the chosen nodes\n", " pool.remove(pool[k]) # and delete it from the pool\n", " return nodes" ] }, { "cell_type": "code", "execution_count": 310, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from IPython.core.debugger import Tracer # for debugging insert Tracer()() \n", "\n", "# This function collects 'cpn' (connections/node) connections for every node in 'nodes',\n", "# saves them in 'nodes_on_lvl', saves nodes with all corresponding connections in all_con,\n", "# and calls itself until the lowest level (0) is reached.\n", "def build_network(nodes, all_con, cpn, level, nodes_on_lvl = [], calling_nod = -1):\n", " \n", " # only called the first time in the highest level to add nodes to nodes_on_lvl \n", " if len(nodes_on_lvl) < (level+1):\n", " nodes_on_lvl.extend([[]]*(level+1))\n", " nodes_on_lvl[level] = nodes\n", " trash = [] # collect nodes that don't allow access to their friends & followers in here\n", " \n", " # iteration, get connections for every node\n", " for nod in nodes:\n", " # get connections from the twitter api\n", " connections = find_connections(nod)\n", "\n", " # user doesn't share connections\n", " if np.isnan(connections[0]):\n", " if calling_nod == -1: # 'nodes' is the starting node (user_id)\n", " print(' -> Choose another starting node!')\n", " else: # replace the node\n", " nodes_on_lvl[level].remove(nod) # delete invalid node from structure\n", " trash.append(nod) # don't remove the invalid node from 'nodes' yet, otherwise the next node will be skipped\n", " \n", " # get one random node that is connected to the calling node in the level above \n", " # but not already in the network\n", " new_nod = get_nodes(1,all_con[calling_nod],nodes_on_lvl) \n", " \n", " if len(new_nod) > 0: # get_nodes found a new node\n", " nodes.extend(new_nod) # adding is allowed and for loop will 
iterate over new_nod as well\n", " nodes_on_lvl[level].extend(new_nod)\n", " name = api.get_user(new_nod[0]).screen_name\n", " print(' level {}: user was was replaced by {}/{}'.format(level,name,new_nod[0]))\n", " else:\n", " print(' level {}: user was deleted'.format(level))\n", " \n", " # user shares connections\n", " else:\n", " all_con[nod] = connections # node with all corresponding connections is saved in dictionary\n", " if level > 0: ## in every level except for the lowest:\n", " nxt_nodes = get_nodes(cpn, connections, nodes_on_lvl) # choose cpn connections as next nodes\n", " sublist = copy.deepcopy(nodes_on_lvl[level-1]) # add chosen nodes to structure\n", " sublist.extend(nxt_nodes)\n", " nodes_on_lvl[level-1] = sublist\n", " \n", " # call function on the next lower level\n", " build_network(nxt_nodes,all_con,cpn,level-1,nodes_on_lvl,nod)\n", " \n", " for element in trash:\n", " nodes.remove(element) # remove invalid nodes AFTER iterating over all nodes\n", " return" ] }, { "cell_type": "code", "execution_count": 311, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The network you create will have up to 111 nodes.\n", "Due to restrictions of the twitter API this takes about until 00:13 o'clock\n", "# Could not access the followers/friends of the user nomadtechnical/111391043\n", " level 0: user was was replaced by ITCTestWeek/174143998\n", "Rate limit reached. Sleeping for: 889\n", "Rate limit reached. Sleeping for: 890\n", "# Because BluprintAcademy/19837231 has 65908 followers and 70954 friends the time waiting for \n", "### the rate limit to reset would increase by ~ 14.0 minutes.\n", " Due to the chosen limit of 5 minutes per node this user will be replaced\n", " level 0: user was was replaced by gsohn/16905041\n", "Rate limit reached. Sleeping for: 889\n", "Rate limit reached. Sleeping for: 890\n", "Rate limit reached. 
Sleeping for: 889\n", "# Because GreatnessHQ/786223237 has 33874 followers and 27623 friends the time waiting for \n", "### the rate limit to reset would increase by ~ 6.0 minutes.\n", " Due to the chosen limit of 5 minutes per node this user will be replaced\n", " level 0: user was was replaced by autonostop/971646229\n", "# Because StartFreshHere/2506511533 has 45077 followers and 39692 friends the time waiting for \n", "### the rate limit to reset would increase by ~ 9.0 minutes.\n", " Due to the chosen limit of 5 minutes per node this user will be replaced\n", " level 0: user was was replaced by GrowthHackers/1705885393\n", "Rate limit reached. Sleeping for: 890\n", "# Because RussWrites/25343565 has 33802 followers and 21643 friends the time waiting for \n", "### the rate limit to reset would increase by ~ 6.0 minutes.\n", " Due to the chosen limit of 5 minutes per node this user will be replaced\n", " level 0: user was was replaced by ISpaDoYou/21757155\n", "# Because GrowthHackers/1705885393 has 186754 followers and 57209 friends the time waiting for \n", "### the rate limit to reset would increase by ~ 37.0 minutes.\n", " Due to the chosen limit of 5 minutes per node this user will be replaced\n", " level 0: user was was replaced by GoDaphers/2267868750\n", "Rate limit reached. Sleeping for: 890\n", "# Because IEEEorg/54290504 has 219820 followers and 2568 friends the time waiting for \n", "### the rate limit to reset would increase by ~ 43.0 minutes.\n", " Due to the chosen limit of 5 minutes per node this user will be replaced\n", " level 0: user was was replaced by Test_X/25315996\n", "Rate limit reached. Sleeping for: 890\n", "Rate limit reached. 
Sleeping for: 894\n", "# Could not access the followers/friends of the user exflygal/129197779\n", " level 0: user was was replaced by SuffolkTech/198039504\n", "# Because FKInt/140894914 has 42605 followers and 41570 friends the time waiting for \n", "### the rate limit to reset would increase by ~ 8.0 minutes.\n", " Due to the chosen limit of 5 minutes per node this user will be replaced\n", " level 0: user was was replaced by FortuneClubNews/860037217\n", "# Because 123top10/53112612 has 668699 followers and 213260 friends the time waiting for \n", "### the rate limit to reset would increase by ~ 133.0 minutes.\n", " Due to the chosen limit of 5 minutes per node this user will be replaced\n", " level 0: user was was replaced by marketingmngr/3066941832\n", "# Because KevinWGrossman/14738804 has 63432 followers and 30259 friends the time waiting for \n", "### the rate limit to reset would increase by ~ 12.0 minutes.\n", " Due to the chosen limit of 5 minutes per node this user will be replaced\n", " level 0: user was was replaced by stevedragoo/15343458\n", "Rate limit reached. Sleeping for: 891\n", "Rate limit reached. Sleeping for: 893\n", "Rate limit reached. Sleeping for: 889\n", "*** Collected all data from twitter at 01:30 o'clock ***\n" ] } ], "source": [ "all_connections = {} # dictionary for all connections => saves api requests\n", "nodes_on_lvl=[] # list of sublists containing nodes of a certain level in the network\n", "\n", "level = 2 # depth of network; in this task: level = 2\n", "how_many = 10 # This is the number of connections you are sampling. \n", " # Keep small (e.g.3) for development, larger later (e.g. 10)\n", "\n", " \n", "# make a guess how long the collection of data will take\n", "calc_time(level, how_many)\n", "\n", "# this function collects and assembles the data. 
\n", "build_network([user_id], all_connections, how_many, level, nodes_on_lvl)\n", "\n", "# assign the collected data from nodes_on_lvl to the different lists of nodes\n", "first_nodes = nodes_on_lvl[level-1]\n", "second_nodes = nodes_on_lvl[level-2]\n", "all_nodes = [x for sublist in nodes_on_lvl for x in sublist]\n", "\n", "\n", "print(datetime.now().time().strftime('*** Collected all data from twitter at %H:%M o\\'clock ***'))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Be careful!** You should only keep a small value for the `how_many` parameter while you are developing your code. In order to answer to the questions you should raise the value of this parameter to `how_many=10` at least. This will take a while to execute because of the API rate limit (plan your time accordingly). You should also remember to submit your jupyter notebook with the **output shown for a large value of the `how_many` parameter**. " ] }, { "cell_type": "code", "execution_count": 312, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "There are 10 first hop nodes\n", "There are 91 second hop nodes\n", "There are overall 102 nodes in the collected network\n" ] } ], "source": [ "print('There are {} first hop nodes'.format(len(first_nodes)))\n", "print('There are {} second hop nodes'.format(len(second_nodes)))\n", "print('There are overall {} nodes in the collected network'.format(len(all_nodes)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Find the inner connections between your collected nodes that you might have missed because you sampled the connections." 
] }, { "cell_type": "code", "execution_count": 313, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# now all connections and not only the inner connections are found here\n", "# possible connections that would miss anyways: \n", "# first-first, first-second, second-second, start-second\n", "network = {}\n", "for a in all_nodes:\n", " # using intersection between all nodes and all connections of one node\n", " network[a] = list(np.intersect1d(all_connections[a], all_nodes))" ] }, { "cell_type": "code", "execution_count": 314, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{14447771: [17781604,\n", " 30056032,\n", " 59083754,\n", " 87246782,\n", " 214976509,\n", " 415627631,\n", " 953457997],\n", " 14487859: [35863756],\n", " 15343458: [2765811500],\n", " 15756829: [17781604,\n", " 25315996,\n", " 25355790,\n", " 26590759,\n", " 30056032,\n", " 35863756,\n", " 43334583,\n", " 44682474,\n", " 53101979,\n", " 80735533,\n", " 84079827,\n", " 174143998,\n", " 214976509],\n", " 15833882: [57697965],\n", " 16544249: [35863756, 921680064],\n", " 16905041: [36327152, 57697965, 135984800, 205181291, 337727729],\n", " 16998179: [35863756],\n", " 17781604: [14447771,\n", " 15756829,\n", " 25315996,\n", " 26590759,\n", " 30056032,\n", " 34478091,\n", " 43334583,\n", " 44682474,\n", " 80735533,\n", " 84079827,\n", " 97561729,\n", " 214976509,\n", " 415627631,\n", " 953457997],\n", " 19013088: [28180307, 768731503],\n", " 19080487: [57697965],\n", " 19724166: [87246782, 259702917, 415627631],\n", " 19762552: [61726997],\n", " 20275129: [30056032],\n", " 20460581: [28180307, 34478091, 46046503, 768731503],\n", " 21757155: [1593405043],\n", " 25315996: [15756829,\n", " 17781604,\n", " 43334583,\n", " 44682474,\n", " 50380403,\n", " 53101979,\n", " 70140640,\n", " 87246782],\n", " 25355790: [15756829, 30056032, 44682474],\n", " 26590759: [15756829,\n", " 17781604,\n", " 30056032,\n", " 34115928,\n", " 
44682474,\n", " 54496797,\n", " 80735533,\n", " 148882145,\n", " 340631773],\n", " 28180307: [19013088, 20460581, 46046503, 99114274, 171975880, 768731503],\n", " 30056032: [14447771,\n", " 15756829,\n", " 17781604,\n", " 20275129,\n", " 25355790,\n", " 26590759,\n", " 30331513,\n", " 32537903,\n", " 44682474,\n", " 53101979,\n", " 62854485,\n", " 97561729,\n", " 148882145,\n", " 236305811,\n", " 241043724,\n", " 415627631],\n", " 30331513: [30056032],\n", " 30513556: [87246782],\n", " 32537903: [30056032],\n", " 33789494: [35863756],\n", " 34115928: [26590759,\n", " 43334583,\n", " 54496797,\n", " 80735533,\n", " 84079827,\n", " 174143998,\n", " 214976509,\n", " 340631773,\n", " 415627631,\n", " 633801433],\n", " 34478091: [17781604, 20460581, 43334583, 97561729, 415627631],\n", " 35863756: [14487859,\n", " 15756829,\n", " 16544249,\n", " 16998179,\n", " 33789494,\n", " 44682474,\n", " 53101979,\n", " 66722629,\n", " 74753087,\n", " 113054322,\n", " 921680064,\n", " 2962716106],\n", " 36327152: [16905041, 57697965, 135984800, 166439907, 205181291],\n", " 43334583: [15756829,\n", " 17781604,\n", " 25315996,\n", " 34115928,\n", " 34478091,\n", " 53101979,\n", " 54496797,\n", " 57126820,\n", " 62854485,\n", " 78384511,\n", " 84079827,\n", " 107610279,\n", " 174143998,\n", " 214976509,\n", " 340631773,\n", " 633801433,\n", " 823097858,\n", " 953457997],\n", " 44110936: [57697965],\n", " 44682474: [15756829,\n", " 17781604,\n", " 25315996,\n", " 25355790,\n", " 26590759,\n", " 30056032,\n", " 35863756,\n", " 54496797,\n", " 80735533,\n", " 84079827,\n", " 174143998,\n", " 214976509,\n", " 257476940],\n", " 46046503: [20460581, 28180307, 768731503],\n", " 48765640: [768731503],\n", " 49534741: [860037217, 2765811500],\n", " 50380403: [25315996, 87246782],\n", " 52367544: [87246782],\n", " 53101979: [15756829,\n", " 25315996,\n", " 30056032,\n", " 35863756,\n", " 43334583,\n", " 57697965,\n", " 59083754,\n", " 61726997,\n", " 87246782,\n", " 415627631,\n", " 
768731503,\n", " 1593405043,\n", " 2765811500],\n", " 54496797: [26590759, 34115928, 43334583, 44682474, 415627631],\n", " 57126820: [43334583, 80735533, 84079827, 415627631, 953457997],\n", " 57697965: [15833882,\n", " 16905041,\n", " 19080487,\n", " 36327152,\n", " 44110936,\n", " 53101979,\n", " 135984800,\n", " 166439907,\n", " 205181291,\n", " 337727729,\n", " 624781903],\n", " 59083754: [14447771, 53101979, 80735533, 87246782],\n", " 61726997: [19762552, 53101979],\n", " 62854485: [30056032, 43334583],\n", " 65759299: [70140640, 87246782],\n", " 66722629: [35863756],\n", " 70140640: [25315996, 65759299, 87246782],\n", " 74206569: [87246782],\n", " 74753087: [35863756],\n", " 76584219: [768731503],\n", " 78384511: [43334583],\n", " 80735533: [15756829,\n", " 17781604,\n", " 26590759,\n", " 34115928,\n", " 44682474,\n", " 57126820,\n", " 59083754,\n", " 84079827,\n", " 174143998,\n", " 214976509,\n", " 257476940,\n", " 259702917,\n", " 360629347,\n", " 377409225,\n", " 415627631,\n", " 953457997],\n", " 84079827: [15756829,\n", " 17781604,\n", " 34115928,\n", " 43334583,\n", " 44682474,\n", " 57126820,\n", " 80735533,\n", " 214976509,\n", " 257476940,\n", " 415627631],\n", " 87228054: [1593405043],\n", " 87246782: [14447771,\n", " 19724166,\n", " 25315996,\n", " 30513556,\n", " 50380403,\n", " 52367544,\n", " 53101979,\n", " 59083754,\n", " 65759299,\n", " 70140640,\n", " 74206569],\n", " 97561729: [17781604, 30056032, 34478091],\n", " 99114274: [28180307, 768731503],\n", " 107610279: [43334583],\n", " 113054322: [35863756],\n", " 135984800: [16905041, 36327152, 57697965],\n", " 148882145: [26590759, 30056032],\n", " 163557128: [1593405043],\n", " 166439907: [36327152, 57697965],\n", " 171975880: [28180307, 768731503],\n", " 174143998: [15756829,\n", " 34115928,\n", " 43334583,\n", " 44682474,\n", " 80735533,\n", " 214976509,\n", " 340631773,\n", " 415627631,\n", " 953457997],\n", " 198039504: [2765811500],\n", " 205181291: [16905041, 36327152, 57697965],\n", " 
214976509: [14447771,\n", " 15756829,\n", " 17781604,\n", " 34115928,\n", " 43334583,\n", " 44682474,\n", " 80735533,\n", " 84079827,\n", " 174143998],\n", " 236305811: [30056032],\n", " 241043724: [30056032],\n", " 257476940: [44682474, 80735533, 84079827, 360629347, 415627631],\n", " 259702917: [19724166, 80735533, 415627631],\n", " 321454882: [1593405043],\n", " 337727729: [16905041, 57697965],\n", " 340631773: [26590759, 34115928, 43334583, 174143998],\n", " 360629347: [80735533, 257476940, 415627631],\n", " 377409225: [80735533, 415627631],\n", " 415627631: [14447771,\n", " 17781604,\n", " 19724166,\n", " 30056032,\n", " 34115928,\n", " 34478091,\n", " 53101979,\n", " 54496797,\n", " 57126820,\n", " 80735533,\n", " 84079827,\n", " 174143998,\n", " 257476940,\n", " 259702917,\n", " 360629347,\n", " 377409225],\n", " 606167086: [768731503],\n", " 624781903: [57697965],\n", " 633801433: [34115928, 43334583],\n", " 768731503: [19013088,\n", " 20460581,\n", " 28180307,\n", " 46046503,\n", " 48765640,\n", " 53101979,\n", " 76584219,\n", " 99114274,\n", " 171975880,\n", " 606167086,\n", " 1667954510],\n", " 823097858: [43334583],\n", " 860037217: [49534741, 2765811500],\n", " 921680064: [16544249, 35863756],\n", " 953457997: [14447771, 17781604, 43334583, 57126820, 80735533, 174143998],\n", " 971646229: [1593405043],\n", " 1078398283: [2765811500],\n", " 1593405043: [21757155,\n", " 53101979,\n", " 87228054,\n", " 163557128,\n", " 321454882,\n", " 971646229,\n", " 2267868750,\n", " 2800710751,\n", " 2831591483,\n", " 4099703772,\n", " 730894768511688705],\n", " 1667954510: [768731503],\n", " 2267868750: [1593405043],\n", " 2405512428: [2765811500],\n", " 2722038352: [2765811500, 3020748112],\n", " 2743046916: [2765811500],\n", " 2765811500: [15343458,\n", " 49534741,\n", " 53101979,\n", " 198039504,\n", " 860037217,\n", " 1078398283,\n", " 2405512428,\n", " 2722038352,\n", " 2743046916,\n", " 3020748112,\n", " 3066941832],\n", " 2800710751: [1593405043],\n", " 
2831591483: [1593405043],\n", " 2962716106: [35863756],\n", " 3020748112: [2722038352, 2765811500],\n", " 3066941832: [2765811500],\n", " 4099703772: [1593405043],\n", " 730894768511688705: [1593405043]}\n" ] } ], "source": [ "pprint(network)" ] }, { "cell_type": "code", "execution_count": 315, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The created network was saved in RohdeSchwarz_2_10.p\n" ] } ], "source": [ "# to avoid repeating the time-consuming data collection for the same network,\n", "# the network can be saved in a pickle file\n", "save = True\n", "if save:\n", " f = open('{}_{}_{}.p'.format(user,level,how_many),'wb')\n", " pickle.dump(network,f)\n", " f.close()\n", " print('The created network was saved in {}_{}_{}.p'.format(user,level,how_many))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 2. Discover some of the properties of the collected network" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The network from RohdeSchwarz_2_10.p was loaded\n" ] } ], "source": [ "# to save time, previously collected network data can be loaded from a pickle file\n", "load = True\n", "filename = 'RohdeSchwarz_2_10.p' # startinguser_level_howmany.p\n", "\n", "# available networks: 'dl_weekly_2_3.p' 'dl_weekly_2_4.p' 'dl_weekly_2_5.p'\n", "# 'appliedmldays_2_2.p' 'tudresden_de_2_8.p' 'RohdeSchwarz_2_10.p'\n", "\n", "if load:\n", " network = {}\n", " f = open(filename,'rb')\n", " network = pickle.load(f)\n", " f.close()\n", " all_nodes = []\n", " for key in network:\n", " all_nodes.append(key) # create all_nodes with the loaded network data\n", " print('The network from {} was loaded'.format(filename))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2.1 Adjacency matrix" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ 
"Congratulations! You have now created a dictionary that describes a real Twitter network!\n", "We now want to transform this dictionary into the adjacency (or weight) matrix that you learned about in your first class. " ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# preparation for creating the matrix: \n", "# 1) empty square matrix of correct size \n", "W = np.zeros([len(all_nodes),len(all_nodes)], dtype=int)\n", "# 2) dictionary mapping node -> index, which will be its position in the matrix\n", "code = {} \n", "for ind,k in enumerate(network): \n", " code[k] = ind\n", " \n", "# create matrix applying the node-index transform\n", "for nod in network:\n", " for connection in network[nod]:\n", " W[code[nod]][code[connection]] = 1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Remember that a weight matrix should be symmetric. Check if it is: \n", "The check is combined with the symmetrization step further below, so that checking and fixing happen in one place." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Question 1:**\n", "It might happen that $W_{ij} \\neq W_{ji}$ for some $(i,j)$. Explain why this might be the case." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Your answer here:** Depending on the implementation, the weight matrix W can be non-symmetric if a connection from a to b is not automatically also recorded from b to a. If the connection is instead only recorded after checking the friends and followers of Twitter user b, a problem arises when b does not allow access to its followers and friends. Another problem occurs if not all connections of a user are found. This can happen if the function find_connections does not use the Cursor object, in which case at most 5000 friends and followers are returned. 
\n", "In this implementation though:\n", "* Twitter users who do not allow access to their connections are replaced if possible, otherwise deleted\n", "* the Cursor object is used" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Make your weight matrix symmetric." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "W is symmetric\n" ] } ], "source": [ "# check if matrix is symmetric\n", "if len(W[np.nonzero(W-W.transpose())]) != 0:\n", " # Make W symmetric\n", " bigger = W.transpose() > W # bigger is True where a connection in W is missing\n", " W = W - W*bigger + W.transpose()*bigger # The term 'W*bigger' is only a safeguard, W should be zero at these points\n", " print('W was not symmetric but it is now')\n", "else:\n", " print('W is symmetric')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Plot the weight matrix of your collected network.\n", "\n", "Hint: use `plt.spy()` to visualize a matrix." 
] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAQsAAAEJCAYAAACDnQJZAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAGH9JREFUeJzt3XnQJHV9x/H3BwHvLOvBbpRjUSPeEP9QFGQnknhGIIlS\nGJXD0kp5RKNRWUwqPJiyIkZCrPIoT1gJ3kbZHIYNWafwIqhAUC5RIuDqPoTTK1GUb/7o37M7O848\nTz/T3dO/nvm8qqZqpme6+zs9M7/f93d0jyICM7OV7NF2AGbWDS4szKwUFxZmVooLCzMrxYWFmZXi\nwsLMSnFh0SJJJ0r64sDjH0va0F5E3SPpvZL+ou045oELiwZI6ku6TdJeJV6+c6JLRNw/Ir7XXGTN\nk3SgpLslfWNo+QMl/ULS9SW3s1tBOk5EvCIi3jpBnNdIesHA46emuAeXHS7pR5L8O8GFRe0kHQgc\nAdwNHN1yOG26j6THDDz+Y+C7q1hfDBSkI19Q7Ud8EXDkwOMjgauHlj0N+EpE3F1hPzPDhUX9TgC+\nCpwDnDT4hKQHSNoi6U5JFwMPH3r+bkkPS/efI+nS9NobJJ029NojJH1Z0u3p+RPS8r0lvSMt+6Gk\n90i6Z3puo6SbJL1e0qKk7ZJOGtjmvSSdKel7ku6QdFFa9s+SXjW0//+SdMwyx+Hcofd/AvCRoW2c\nIuk7qfb+lqRj0/JHAe8FnpKaZrel5Wen9/Mvkn4M9NKyt6Tn3yTp4qVCRNIrJH1T0t4j4hsuLJ4G\nnDFi2UXLvMf5EhG+1XgDrgP+BHgi8AvgwQPPfTzd7gU8Fvg+cNHA878CHpbuHwk8Nt1/HPBD4Oj0\n+EDgR8BxwD2AtcAT0nNnAZ8D1gD3Bc4H3pqe2wjcBZyW1ns28FNgTXr+3cA2YD1FzX4YsBfwAuDi\ngTgPAf4H2HPE+z8wvY8DgBvTdh4DXAUcBVw/8No/Atal+y8AfjLw+MTBY5OWnQ3cDhyWHt8zLXtL\neiygD/wV8AjgtqXjMiLOA4BfAvuk9Xak7d04sOwO4Ii2v1O53FoPYJZuFM2PnwNr0+OrgNem+3uk\nwuO3Bl7/1qHC4u6lwmLEts8Czkz3NwGfGfO6nwAHDTx+ytIPNBUWPwX2GHh+EXhS+nH8DHjciG3e\nE7gVeHh6/LfAu8bsf6mw2APYCjwD+Bvg1OHCYsS6lwHPS/fHFRbnjFj2lqH935qO/ZtW+LyuB54H\nHAp8MS372MCynwJ7tf29yuXmZki9TgC2RsTt6fHHKL70AA+mqM2/P/D6G8ZtSNKTJW2TdLOkOyiy\nlQelp/dnRPtf0oOB+wDfSB2stwGfBx448LJbY/c2+M+A+6Vt35PiB7SbiPg58AngxZIEvJCimbGS\npabI8aNeL+kESZelptTtFNnWg4ZfN+Sm5Z6MiBuAL1AUGu9ZYVtfpMjgjkz3Ab5EUageCVwSEXet\nsI254cKiJpLuRdEs2Jj6Cn4I/BlwiKTHU6Ttv6T4oS85YJlNnkfRnHhoROwDvI+i9ofiB/OIEevc\nQvHjf2xEPCDd9omINSXewi3A/zHUjzLgI8CLKbKDn0bEf5bY5meA5wLfjYjBQhJJBwDvB14ZEWsj\nYi1wJbve47jOzZU6PZ9LkU39B/COFeK7iKJgOIJdhcVSAeL+iiEuLOrzBxSFwaMp2vSHpPtfAk5I\ntflngQVJ904jBSeO2xhFbX97RNwl6UkUowlLzgOOkvR8SfdIHaeHRJFHfwD4+5RlIOmhkp6xUvBp\n3bOBv5P0m5L2kHTY0vBvRFxM0Uw6k5WzCqV1fgb8DvDyEa+5b9reLWlfJ1P0zSxZBPYrOfxc7FR6\nEMX
7fylFRvP7kp69zCoXAb9NUTh8OS37JnAQ0MOFxW5cWNTnBODDEbE9Im5eugHvAl6UeuhfDdyf\norPyw+k2ziuBv5Z0J/CXFM0AACLiJuA5wBsoOvEuA56Qnt4EfAe4ODVftgKPXGY/gzX1Gyh+LF+j\naPe/jd2/Ix+h+EH/wzLb222bEXFpRPz3r70g4mqKgudiis7Fx1IUrEu2UWQaOyTdvML+lrwP+GxE\nXBARtwEvAz4gae3IICOuA24GfhgRP0rLAriE4nP6Ssn9zgWlTh1rWeoL+BVwwHDKngtJLwFeHhFH\nrvhimznOLPLxeOB/KWrZ7Ei6D0W28762Y7F2tFZYSHpWmnL7bUmntBXHOJL2S6MRV6aJPa9Jy9dK\n2irpWkkXSCrTebjSvv6QokPuTRHxywrb2SNN5NpSZ6ypz+NmiubTxyaNb2ibayR9StLV6Rg/uYlj\nWxdJr0sTx66QdF6a/JZNvJI+lCbaXTGwbGx8kk6VdF06/iv2aQHtzLOgKKS+QzG8tRdwOfCotseR\nh2JcDxya7t8PuBZ4FMUsvzel5acAb2s71oGYX0fRn7AlPc451nOAk9P9PSkmkWUZL/AQiiHlvdPj\nT1B0TmcTL8WIzqHAFQPLRsZHMUnusnTcN6TfolbcR0tv7DDg8wOPNwGntP2lWCHmzwG/C1zDrlmG\n64Fr2o4txbIf8O8UvfhLhUWusf4GxXDq8PJc430IxZyYtekHtiXH70KqfAcLi5HxDf/eKObiPHml\n7bfVDHkou0+u+X5aliUVp40fStFzvy4iFgEiYgewb3uR7eYs4I3sPrqRa6wHUQyZnp2aTe9PfSJZ\nxhsRP6AYubkR2A7cGREXkmm8A/YdE9/w7287JX5/7uBcgaT7AZ+mmLb9E359UlDrw0lpItJiRFzO\nrklNo7Qea7Inxbkz746IJ1JMq95EhscWQNI+wDEUNfdDgPtKehGZxruMSvG1VVhsZ/fZi/ulZVmR\ntCdFQXFuRJyfFi9KWpeeX0/R8de2w4GjVVwr4mPA0yWdSzFHIbdYocgkb4qIr6fHn6EoPHI8tlA0\nOa6PiNsi4lcUk+ueSr7xLhkX33Z2n0lc6vfXVmHxNeARKi6UsjfFuQNbWoplOR8GroqIdw4s28Ku\nU69PpDirs1UR8eaIOCAiHkZxLLdFxEuAfyKzWAFSanyTpKXJYkdRTMDK7tgmNwKHqThdXxTxXkV+\n8YrdM8tx8W0Bjk8jOgdRnDpwyYpbb7Ez5lkUIwzXAZva7BgaE9/hFJOkLqfoOb40xfwA4MIU+1Zg\nn7ZjHYp7I7s6OLONlWI6/NfS8f1HitGQnOM9jeLiOFcAmylG8bKJF/go8AOKs55vBE6m6JAdGR/F\nWcDfSe/pGWX24RmcZlaKOzjNrJTGCovcZ2ia2eo00gxJZ1h+m6Ij6AcUbdPjI+Ka2ndmZlPRVGbx\nJOC6iLghiisNfZxinNrMOqqpwqJTMzTNbGV7trVjSR6GMWtRRCw32/fXNFVYlJqhuXHjRnq9HgC9\nXm/n/VwtLCywsLDQdhilOd7RinlVUKW/rmvH9qSTTmLDhg07H59++umr3kZThcXOGZoU10A4nuKK\n0Lvp9XqdOuBmXbVhw4bdfmvZFBYR8StJr6aYNbYH8KEorrmYpTpqGusOf86TaazPIiL+DTh4udfk\n3uwY5nib1aV4uxQr1BNva9O9JUUuJbwzC5s3krLp4OwUFxLzwZVCNT43xMxKmcvMwjVMYbnjUOUY\n5Xp8l+LJNb7cObMws1LmMrOYhxqlTO056XMryf345h5frpxZmFkpM1tYSNpZu1qzfKznw8wWFmZW\nr5nts5j3duk0339Ox9ojHc1xZmFmpbiwMLNSZrYZYs3IIc0f7EwdjsPNj+Y4szCzUpxZ2KrkUHMv\nF0MOmc+scmZhZqU4s+ig5drs887HoznOLMyslE5lFk23R7tSY5eJr
cyxqut4TutzGT7FfNJ91h3v\nuO1Naz+rXX9SzizMrBRfg9PmhkdKdpnkGpzOLMysFBcWZlZKpzo4zaqYVgfsrHJmYWalZJ1ZTHtI\nriu6MsQ7L3L6DJr8TjuzMLNSss4snFGY5cOZhZmVkk1mMc12eJXt1xXnarKb4deuZr+jpvjmkFHV\n8Y9nk67ftLqmZY9bv62L/zizMLNSZna6d1f6Jeqq+SfJVJbbX5XMZ9R+ZqX/qY7jMuk+l9R1Ipmn\ne5tZI1xYmFkpM9sMmWW5d/BZ/qZ61qmk/SRtk3SlpG9Kek1avlbSVknXSrpA0ppJ92Fm+Zg4s5C0\nHlgfEZdLuh/wDeAY4GTg1oh4u6RTgLURsWnE+s4sGtSVDl5rx1Qzi4jYERGXp/s/Aa4G9qMoMDan\nl20Gjp10H2aWj1o6OCVtAA4FLgbWRcQiFAUKsG8d+7DViQhnFVMiqfL1LbugcmGRmiCfBl6bMozh\nb6i/sWYzoNJ0b0l7UhQU50bE+WnxoqR1EbGY+jVuHrf+wsLCzvu9Xo9er1clnInNcvveIyfN68Jx\n7ff79Pv9StuoNHQq6SPALRHx+oFlZwC3RcQZXengdGFh82aSDs4qoyGHAxcB36RoagTwZuAS4JPA\n/sANwHERcceI9bMpLMyq6GJlM9XCoioXFjYr5qWw8HRvMyslm+tZmDWtqQygSxlFFc4szKwUZxY2\nN2blmhptcWZhZqU4s2iQ5zjUY1o196Sf17x8ts4szKwUZxYNmJc27LT4OObBmYWZleLMogGuCbup\nqf/5mBXOLMysFGcWM2pearsc5PifJU1wZmFmpbiwMLNS3Ayh/fSuCbP0XqzQ9mfqzMLMSnFmwa4S\ne16mZ5fJpNr4A+BZM2vHxZmFmZXizGIODWdSo2q+HE6kqjsDqrLOJHLNKCb9jxNnFmZWii/Y20Hz\n0rdizfEFe82sMe6z6CBnE+PN2ghETpxZmFkpLizMrBQ3Q3DqOktm4TPM9fvozMLMSnFmQX4leFty\nH5JtapJWbiaZYDbJuqvlzMLMSpm5zGI1tc+SUa9ts4Zqa9+518ZlTvgbflxXttR0VlPXf5Y0+d1x\nZmFmpXi694yahbZ7U3xsPN3bzBo0c30WVeQ+GrAaVePveu2bQ/yjYhheNs2LDE16avoSZxZmVkrl\nwkLSHpIulbQlPV4raaukayVdIGlN9TDNrG11ZBavBa4aeLwJuDAiDga2AafWsI+piIidt3nX9eMw\nKn5JSx17U3lvo/YzvGw1sVSNu+r3u1JhIWk/4DnABwcWHwNsTvc3A8dW2YeZ5aFqB+dZwBuBwabG\nuohYBIiIHZL2rbiPiaa01tXBVXdHWU7bW27dlbY7qrOs7k67cc/VNYGpTAw5aTvOiTMLSc8FFiPi\ncmC5bta8PwEzK6VKZnE4cLSk5wD3Bu4v6Vxgh6R1EbEoaT1w87gNLCws7Lzf6/Xo9XojXzdJSVrX\n1Z7rnj5cd61QtQ076XarrFt1Ozkdw2mqEme/36ff71fafy0zOCVtBP48Io6W9Hbg1og4Q9IpwNqI\n2DRinWxmcE6SNpd9vVmOcpnB+Tbg9yRdCxyVHs8Mj5jYvPK5IbTfcWQ2bZNkFp7uTbv/dZprQZVr\nXNYeT/c2s1JcWJhZKVk3Q5qacJSTpppA03z/Vc6kLLPdJTlMZGtzP3WdddrKdG8zmx8eDTFL5mkO\nTS7zLMxsBmXdZ2G760o/TJuqHKMuHtdpfiecWZhZKVlnFtOuSXOflLWa/6woc+p3me1Wsdxp7KtZ\nf7mT+Ya3W/eITNmYqr52UlVGClfLmYWZleLRELOKutiX5NEQM2uMCwszKyXrDk4zmP707NXuq0vN\njyqcWZhZKZ3PLLrYuWSrM63P1t+h5TmzMLNSOp9ZuDYYzRmX1c2ZhZmV0qnMoumptrlP956lfa9G\nlc+yjYv/NBVDXRe/mZQzCzMrx
dO9bW50JZOaBk/3NrPGuLAws1Ky7uB02ji5Msdu3o7vvLzPpjiz\nMLNSss4spl0TzNLVneu+8lZV85bFzCJnFmZWStaZxbS51qtX1zK1rsU7bc4szKwUZxY2Uh01a261\n80r9JrnFmxtnFmZWijML3FM/L/z5VuPMwsxKqVRYSFoj6VOSrpZ0paQnS1oraaukayVdIGlNXcGa\nWXuqZhbvBP41Ih4NHAJcA2wCLoyIg4FtwKkV99G4iJi5FFVS5esXmA2a+BR1Sb8BXBYRDx9afg2w\nMSIWJa0H+hHxqBHr+xT1BrkfxpYz7VPUDwJukXS2pEslvV/SfYB1EbEIEBE7gH0r7MMmNIvZUtOW\nsjFnZKNVGQ3ZE3gi8KqI+LqksyiaIMPf0LHf2IWFhZ33e70evV6vQjhmNk6/36ff71faRpVmyDrg\nqxHxsPT4CIrC4uFAb6AZ8oXUpzG8vpshNlVumu0y1WZIamrcJOmRadFRwJXAFuCktOxE4PxJ92Fm\n+ah0DU5JhwAfBPYCrgdOBu4BfBLYH7gBOC4i7hixrjOLAXWfxFT14jfzWAvneiJZXZ/F4HYmySx8\nwd5MuLBonwuL5c3sdO+u/R9H3XFW3d5q/gNj2se66o+6awVhXXFW3Y6ne5tZKS4szKwU91l0SNfS\n5zbM2zGa9P36T4bMrDGd7+Cs409pu1IbLcVXpoOvrj/RzeGY1D0asGTcd2a1+2r6WC23/Wl+Ps4s\nzKwU91kMaHOcPbcx/pwyi7pN673lfAzdZ2Fmjel8n0Wd2qwBcqt9couni2btGDqzMLNSOplZjGrf\n59w+nNTwe5pWr3uT+6hi1EVpJokzp5GOqvteaYSnTs4szKwUFxZmVoqHTmdE3enyLE5qW9K1eJfU\nGbeHTs2sMZ3s4Jx3ozoi27weRtdq6K7Fu6TtLM+ZhZmV0mpmkdpNI5dDvZeXq2t7y+1jWjXWpJfK\na1rVS/mVfU3uQ7zDpjnU76FTM2td66MhTV3bsWs93lVPj66ybu7HaNzktFFymqQ3KoY2J9oN82iI\nmTWi9czCLBdd6wupwvMszKwxLizMrBRPyppROXTwdY2P1fKcWZhZKc4sGlDXdReqqHsItemrhS83\ncWm55VWOa1eyr1zidGZhZqV46NSsolxq/kFlMjkPnZpZI9xnYTOljVo+p4xiiU9RN7PWOLPooHma\nlrxaPh7NcWZhZqVUKiwkvU7StyRdIek8SXtLWitpq6RrJV0gaU1dwTZN0s5bziJi560pXTgONl0T\nFxaSHgL8KfDEiHgCRZPmhcAm4MKIOBjYBpxaR6Bm1q6qzZB7APeVtCdwb2A7cAywOT2/GTi24j7M\nLAMTFxYR8QPgTOBGikLizoi4EFgXEYvpNTuAfesIdBqmkd53hY9DefPSZJt4NETSPhRZxIHAncCn\nJL0IGP6Gjf3GLSws7Lzf6/Xo9XqThmNmy+j3+/T7/UrbmHi6t6TnA8+MiJenxy8BDgOeDvQiYlHS\neuALEfHoEes3Ot27zPUP695eU9cTbdM0TjKbNKYm99n1K4znNt37RuAwSfdSEdlRwFXAFuCk9JoT\ngfMr7MPMMlHpRDJJpwHHA3cBlwEvA+4PfBLYH7gBOC4i7hixrk8km1DutZq1p2yWN0lm4bNOO8iF\nhY3TZGHh6d50rz+hK3Ha9PkfycysdTObWawmW6j7cm2rUVdWM8l26hq9yf3/VetYZxZUfd/OLMys\nFBcWZlaKR0Nsbsxr82MUX4PTzBozsx2c88a15srKTMu38ZxZmFkpzixmhGvJlTmjqMaZhZmV4sxi\nRrjWXFnX+yzqnsC3Ws4szKwUZxYdNGpaetdqyTZ19VjVNfU+IibKLpxZmFkpLizMrBQ3Q8i/w2s4\nvlzjtPb5ehZm1jpnFuRfU+ceX05yzxLrNs3368zCzEpxZjHAF8LNy7hac7nPqeLV6mvZzjRNM05
n\nFmZWysxmFrm2XZuKazX/jlZ3LE29p3Hba+q6oF25hmhbnFmYWSm+rB7dKt3N6uDL6plZY1xYmFkp\nM9vBuRpdaX403VzKfehwmp22OcjtvTizMLNSnFnYTrnUYOPkHl/dcnu/zizMrJS5zCyGrxKUWwk+\nTpt/4NwVq5kaXrcqfQxd+EydWZhZKXOZWeRacpfV9finadJjtZosoY5Riy58pitmFpI+JGlR0hUD\ny9ZK2irpWkkXSFoz8Nypkq6TdLWkZzQVuJlNV5lmyNnAM4eWbQIujIiDgW3AqQCSHgMcBzwaeDbw\nHk36JwVmDZG04tWtI6J0bb+a13bZioVFRHwJuH1o8THA5nR/M3Bsun808PGI+GVEfA+4DnhSPaGa\nWZsm7eDcNyIWASJiB7BvWv5Q4KaB121Py8ys4+oaDZkoB+v3+zXt/teVSTXHrTNuvSbjrWpU3CvF\nO8kxanI7w/HWtd0mlPku5BT/YLyTxjTpaMiipHURsShpPXBzWr4d2H/gdfulZSMtLCzQ6/UA6PV6\nO+/nqt/vZx/jIMfbnC7FCnDOOedUruzKFhZKtyVbgJOAM4ATgfMHlp8n6SyK5scjgEvGbbTX67Gw\nsLC6iGdUG5Nycj8hq+udhtMadi1jw4YNnH766Tv3NUl2sWJhIemjQA94oKQbgdOAtwGfkvRS4AaK\nERAi4ipJnwSuAu4CXpnNFW7MrJJWr5TVyo7NDGDVV8pqrbAws27xuSFmVooLCzMrxYWFmZXiwsLM\nSnFhYWaluLAws1JcWJhZKf8P4AUH9A7KotQAAAAASUVORK5CYII=\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# visualize matrix\n", "plt.spy(W)\n", "plt.title('Adjacency Matrix W')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Question 2:**\n", "What is the maximum number of links $L_{max}$ in a network with $N$ nodes (where $N$ is the number of nodes in your collected network)? How many links $L$ are there in your collected network? Comment on how $L$ and $L_{max}$ compare." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Your answer here:**\n", "* Complete network: In a complete network, every nod has a link to every other node. Therefore \n", "$L_{max} = \\frac{N\\dot(N-1)}{2}$\n", "* For this created network we neglect the amount of connections, that were missed because of the sampling and we assume, that every node has at least how_many (in the formula $n$) nodes. 
Taking into account the depth (or level) $l$ of the network, which was $l=2$ for this assignment, the number of nodes is \n", "$N = 1+n+n^{2}+...+n^{l} = \\sum\\limits_{i=0}^{l}n^{i}$ \n", "The number of links is \n", "$L = n+n^{2}+...+n^{l} = \\sum\\limits_{i=1}^{l}n^{i}$ \n", "Therefore the number of links can be expressed as $L = N-1$ \n", "* For $N \\gg 1$ the number of links grows as $L_{max} \\sim N^{2}$ for a complete network \n", "and as $L \\approx N$ for our specific network \n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2.2 Degrees distribution" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Plot a histogram of the degree distribution. " ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# the sum of a row (or column) equals the number of connections of that user, i.e. the node degree\n", "p = W.sum(1)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAYwAAAEZCAYAAACEkhK6AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAHjVJREFUeJzt3XuYXVWd5vHvG2KQiyK0dGiSTrgJCAqCGlFUis4o8UYY\nb4CMKF6GbgUZdVq8dA+J40yjtt3qMKJp0QYbjEo6EJ7xEoIcNYNIQK5agTBASEKIchFUEEPyzh97\nV9wpqursSp1d51C8n+c5T/Zee611fqeS1O/stfZeW7aJiIhoZ1K3A4iIiCeHJIyIiKglCSMiImpJ\nwoiIiFqSMCIiopYkjIiIqCUJIyY0SXdK+qshyl8uqb8bMUU8WSVhxFOS7eW2n9uunqSzJF0wHjFF\n9LokjIgxkqQO9rVdp/qK6LQkjHgqOEzSjZIelPRNSVMkHSVpzUAFSWdKWivpYUn9ko6WdAzwceB4\nSb+VdH1Z90pJn5K0XNLvgb0l/YWkSyXdL+k2Se+p9P10SedLekDSLyT97aD3vlPSRyTdCPxO0qQy\nntvLeG6RdFyl/jvK9/6n8jPdLumlZfndku6VdPJ4/GDjqWVytwOIGAdvAV4NPAZcBbwTuBUwgKT9\ngfcDL7S9QdIMYDvbd0r6n8C+tgf/Av5PwBzgNoovXlcANwF7AAcBl0u63XYLmAfMAPYCdga+N/De\nFScArwHut71Z0u3AkWU8bwH+TdK+tjeU9WcBC4DdgE8CC4ElwL5AH7BI0sW2H9nWH1rEYDnDiKeC\nL9jeYPs3wGXACwYd3wRMAZ4nabLtu23f2abPf7W90vZmiiTxMuBM2xtt3wh8FRhIMm8B/ofth23f\nA3xxmBjvsf0YgO1FA8nB9neAVRRJYsCdti9wsRjct4DpwPzy/S8H/gjsV+NnE1FbEkY8FWyobD9C\n8S1/C9v/D/gvFGcCGyRdJGmPNn2uqWzvCTww6Nv8amBa5fjaYdoOqB5H0smSri+HnB4EDgaePcxn\nerT8HPcNKtvqc0aMVRJGBGB7oe1XADPLok8PHBquSWX7HmA3STtVymYA68rt9RRnANVjw/ZXDokt\nAN5ne1fbuwK/ADo2uR6xLZIw4ilP0v7lJPcUiqGcR4HN5eENwF4jXQlley3F3Mg/SNpe0iHAu4Fv\nlFW+DXxM0rMkTaOYLxnJTuX731dOgJ8CPK/dx2hzPGLMkjBioqvzwJftgbOBX1OcLewOfKw89h2K\nX8b3S7p2hD5PBPYu2y8C/t72leWxT1KcbdwJLC37fGy4GG33A58DrgbupRiOWt7mMwyOKQ+6iY5T\n0w9QkjQH+DxFcjrP9qeHqfdiim9px9v+97LsLuAhim9bG23PGqptxJOJpL+m+Hd+dLdjiRiNRi+r\nlTQJOAeYTfHNa4WkS22vHKLe2cAPBnWxGeiz/WCTcUY0qZxA3wf4KbA/8GGGvlIqoqc1PSQ1C1hl\ne7XtjRTXis8dot7pwMXArwaViwybxZPfFOArwMPAMmAxcG5XI4rYBk3fuDeNrS8hXMvW15IjaU/g\nONtHSxo85GSKG6A2AQts/0uj0UY0wPbdwPO7HUfEWPXCnd6fB86s7Fev9jjS9npJu1Mkjn7b7Sb/\nIiKiAU0njHVsfc35dP50bfqAFwELy8sWnw28RtJG20tsrwew/WtJiynOTp6QMCTlipCIiFGyParL\nsZueH1gB7CdpZnmN+wkU691sYXuf8rU3xTzG+2wvkbSjpJ0ByhuiXg3cMtwb7bDDHmN6TZmyCyed\n9J8566yzsN2zr16OL7FNzPh6ObZej6+XY9sWjZ5h2N4k6TSKa88HLqvtl3RqcdgLBjepbE8FFpdn\nD5OBC20vHe69Hn10/RijXcS9917Efvv9xRj7iYiYmBqfw7D9feCAQWVfGabuuyrbd/LEReIiIqJL\ncsnqIH19fd0OYUS9HF9i23a9HF8vxwa9HV8vx7YtGr/TezwUw
1Zj/RyLmD37IpYtW9SRmCIiepkk\n3GOT3hERMUEkYURERC1JGBERUUsSRkRE1JKEERERtSRhRERELUkYERFRSxJGRETUkoQRERG1JGFE\nREQtSRgREVFLEkZERNSShBEREbUkYURERC2NJwxJcyStlHSbpDNHqPdiSRslvXG0bSMionmNJgxJ\nk4BzgGOAg4ETJR04TL2zgR+Mtm1ERIyPps8wZgGrbK+2vRFYCMwdot7pwMXAr7ahbUREjIOmE8Y0\nYE1lf21ZtoWkPYHjbJ8LaDRtIyJi/EzudgDA54EOzE/Mq2z3la+IiABotVq0Wq0x9dF0wlgHzKjs\nTy/Lql4ELJQk4NnAayQ9XrNtxbyxRxsRMUH19fXR19e3ZX/+/Pmj7qPphLEC2E/STGA9cAJwYrWC\n7X0GtiV9HbjM9hJJ27VrGxER46fRhGF7k6TTgKUU8yXn2e6XdGpx2AsGN2nXtsl4IyJieI3PYdj+\nPnDAoLKvDFP3Xe3aRkREd+RO74iIqCUJIyIiaknCiIiIWpIwIiKiliSMiIioJQkjIiJqScKIiIha\nkjAiIqKWJIyIiKglCSMiImpJwoiIiFqSMCIiopYkjIiIqCUJIyIiaknCiIiIWpIwIiKilsYThqQ5\nklZKuk3SmUMcP1bSjZKul3SNpCMrx+6qHms61oiIGF6jT9yTNAk4B5gN3AOskHSp7ZWVastsLynr\nPx/4NvDc8thmoM/2g03GGRER7TV9hjELWGV7te2NwEJgbrWC7UcquztTJIkBGocYIyKihqZ/GU8D\n1lT215ZlW5F0nKR+4DKg+lxvA5dLWiHpvY1GGhERI2p0SKou25cAl0h6OfAp4FXloSNtr5e0O0Xi\n6Le9fOhe5lW2+8pXREQAtFotWq3WmPpoOmGsA2ZU9qeXZUOyvVzSPpJ2s/2A7fVl+a8lLaYY4qqR\nMCIioqqvr4++vr4t+/Pnzx91H00PSa0A9pM0U9IU4ARgSbWCpH0r24cDU2w/IGlHSTuX5TsBrwZu\naTjeiIgYRqNnGLY3SToNWEqRnM6z3S/p1OKwFwBvknQy8EfgUeCtZfOpwGJJLuO80PbSJuONiIjh\nyXa3YxizIqmM9XMsYvbsi1i2bFFHYoqI6GWSsK3RtMklqxERUUsSRkRE1JKEERERtSRhRERELUkY\nERFRSxJGRETUkoQRERG1JGFEREQtSRgREVFLEkZERNSShBEREbUkYURERC1JGBERUUsSRkRE1JKE\nERERtSRhRERELY0nDElzJK2UdJukM4c4fqykGyVdL+kaSUfWbRsREeOn0YQhaRJwDnAMcDBwoqQD\nB1VbZvtQ24cB7wa+Ooq2ERExTpo+w5gFrLK92vZGYCEwt1rB9iOV3Z2BzXXbRkTE+Gk6YUwD1lT2\n15ZlW5F0nKR+4DLgXaNpGxER42NytwMAsH0JcImklwOfAl41+l7mVbb7yldERAC0Wi1ardaY+mg6\nYawDZlT2p5dlQ7K9XNI+knYbbdutE0ZERFT19fXR19e3ZX/+/Pmj7qPpIakVwH6SZkqaApwALKlW\nkLRvZftwYIrtB+q0jYiI8dPoGYbtTZJOA5ZSJKfzbPdLOrU47AXAmySdDPwReBR460htm4w3IiKG\nJ9vdjmHMJBnG+jkWMXv2RSxbtqgjMUVE9DJJ2NZo2uRO74iIqCUJIyIiaknCiIiIWpIwIiKiliSM\niIioJQkjIiJqScKIiIhaaiUMSddJer+kXZsOKCIielPdM4zjgT2BFZIWSjpG0qhu+IiIiCe3WgnD\n9u22PwHsD1wEfA1YLWl+uVBgRERMcLXnMCQdAnwO+CywCHgL8DDww2ZCi4iIXlJr8UFJ1wG/Ac4D\nPmr7sfLQz6rP4I6IiImr7mq1b7F9R7VA0t6277T9xgbiioiIHlN3SOrimmURETFBjXiGIelA4GBg\nF0nVM4lnAk9vMrCIiOgt7
YakDgBeDzwLeEOl/LfAe+u8gaQ5wOf500OQPj3o+NuAMyv9vs/2TeWx\nu4CHgM3ARtuz6rxnRER03ogJw/alwKWSXmr7p6PtXNIk4BxgNnAPxX0cl9peWal2B/BK2w+VyWUB\ncER5bDPQZ/vB0b53RER0VrshqY/Y/gzwNkknDj5u+wNt+p8FrLK9uuxvITAX2JIwbF9dqX81MK0a\nAlm+JCKiJ7Qbkhp4hva129j/NGBNZX8tRRIZznuA71X2DVwuaROwwPa/bGMcERExRu2GpC4r/zy/\n6UAkHQ2cAry8Unyk7fWSdqdIHP22lzcdS0REPFG7IanLKL7lD8n2sW36XwfMqOxPL8sGv88hFHMX\nc6rzFbbXl3/+WtJiirOTYRLGvMp2X/mKiAiAVqtFq9UaUx+yh80HSDpqpMa2fzRi59J2wK0Uk97r\ngWuAE233V+rMAK4A3l6dz5C0IzDJ9u8k7QQsBebbXjrE+3iEvFbTImbPvohlyxaNsZ+IiN4nCduj\nWkS23ZDUiAmhHdubJJ1G8ct+4LLafkmnFoe9APh7YDfgS+UKuAOXz04FFhfJgMnAhUMli4iIGB/t\nhqS+bfutkm5miK/wtg9p9wa2v09xP0e17CuV7fcyxD0dtu8EXtCu/4iIGB/trpI6o/zz9U0HEhER\nva3dkNTApPPq8QknIiJ6Vbshqd+y9VCUyn1RzEE8s8HYIiKih7Q7w3jGeAUSERG9re7zMJB0OMVN\ndQaW276+sagiIqLn1FqnSdJ/A84H/gx4NvCvkv6uycAiIqK31D3DOAk41PYfACSdDdwAfKqpwCIi\norfUXQn2HrZ+YNL2DLHER0RETFztrpL6XxRzFg8Bv5B0ebn/KoplPiIi4imi3ZDUwLLm1wGLK+Wt\nRqKJiIie1e6y2saXNY+IiCeHWpPekp4D/ANwEJW5DNv7NBRXRET0mLqT3l8HzgUeB44GLgD+ramg\nIiKi99RNGDvYvoLi+Rmrbc8DXtdcWBER0Wvq3ofxmKRJwKry+RbrgJ2bCysiInpN3TOMM4AdgQ8A\nLwTeDryjqaAiIqL31EoYtlfY/h3wMPAB22+sPk51JJLmSFop6TZJZw5x/G2Sbixfy8vne9dqGxER\n46fuWlIvKp+6dxNwc/nL/YU12k0CzgGOAQ4GTpR04KBqdwCvtH0oxVIjC0bRNiIixkndIamvAe+z\nvZftvYD3U1w51c4sYFU5Ub4RWAjMrVawfbXth8rdq4FpddtGRMT4qZswNtn+ycCO7eUUl9i2Mw1Y\nU9lfy58SwlDeA3xvG9tGRESD2q0ldXi5+SNJXwG+SbGW1PF0eHkQSUcDp1A8cyMiInpMu8tqPzdo\n/6zKtmlvHTCjsj+dIVa5LSe6FwBzbD84mrZ/Mq+y3Ve+IiICoNVq0Wq1xtSH7Dq/97exc2k74FZg\nNrCeYoXbE233V+rMAK4A3l698qpO20pd18tfI1nE7NkXsWzZojH2ExHR+yRhW6NpU3ctqV0ozi5e\nWRb9CPhkZbJ6SLY3lTf6LaWYLznPdr+kU4vDXgD8PbAb8CVJAjbanjVc29F8uIiI6JxaZxiSFgG3\nUDymFYob9w61/cYGY6stZxgREaPT2BkGsK/tN1X250u6YTRvFBERT251L6t9VNKWq5ckHQk82kxI\nERHRi+qeYfw1cEE5lwHwIBNwLamf/OQKimmUsZs6dSb33ntXR/qKiOgFbRNGuUTHAbYPlfRMANsP\nNx5ZF/zxjw8x9rmQwoYNnUk8ERG9ou2QlO3NwEfK7YcnarKIiIiR1Z3DWCbpv0r6S0m7DbwajSwi\nInpK3TmM4ynGat43qDzP9I6IeIqomzAOokgWL6dIHD8BvtxUUBER0XvqJozzKR6e9MVy/21l2Vub\nCCoiInpP3YTxPNsHVfavlPTLJgKKiIjeVHfS++eSjhjYkfQS4NpmQoqIiF5U9wzjhcBVku4
u92cA\nt5aPbbXtQ4ZvGhERE0HdhDGn0SgiIqLn1UoYtlc3HUhERPS2unMYERHxFJeEERERtTSeMCTNkbRS\n0m2Szhzi+AGSrpL0B0kfGnTsLkk3Srpe0jVNxxoREcOrO+m9TcqVbs+heC73PcAKSZfaXlmpdj9w\nOnDcEF1sBvpsP9hknBER0V7TZxizgFW2V9veCCwE5lYr2L7P9nXA40O01zjEGBERNTT9y3gasKay\nv7Ysq8vA5ZJWSHpvRyOLiIhRaXRIqgOOtL1e0u4UiaPf9vKhq86rbPeVr4iIAGi1WrRarTH1Ibsz\nT5gbsvNiOZF5tueU+x+luDP800PUPQv4re1/GqavYY9L8tiflLcIeDOdeuIeiCZ/thERYyEJ26N6\nNGjTQ1IrgP0kzZQ0BTgBWDJC/S3BS9pR0s7l9k7Aq4Fbmgw2IiKG1+iQlO1Nkk4DllIkp/Ns90s6\ntTjsBZKmUixk+Axgs6QzKJ6/sTuwuDh7YDJwoe2lTcYbERHDa3RIarxkSCoiYnR6cUgqIiImiCSM\niIioJQkjIiJqScKIiIhakjAiIqKWJIyIiKglCSMiImpJwoiIiFqSMCIiopYkjIiIqCUJIyIiaknC\niIiIWpIwIiKiliSMiIioJQkjIiJqScKIiIhaGk8YkuZIWinpNklnDnH8AElXSfqDpA+Npm1ERIyf\nRhOGpEnAOcAxwMHAiZIOHFTtfuB04LPb0DYiIsZJ02cYs4BVtlfb3ggsBOZWK9i+z/Z1wOOjbRsR\nEeOn6YQxDVhT2V9bljXdNiIiOmxytwPonHmV7b7yFRERAK1Wi1arNaY+mk4Y64AZlf3pZVkDbeeN\nLrKIiKeQvr4++vr6tuzPnz9/1H00PSS1AthP0kxJU4ATgCUj1NcY2kZERIMaPcOwvUnSacBSiuR0\nnu1+SacWh71A0lTgWuAZwGZJZwAH2f7dUG2bjLeztkdS+2ptTJ06k3vvvWvs4UREjJFsdzuGMZNk\nGOvnWAS8mbH3M0Ad6ktMhL+jiOgtkrA9qm+1udM7IiJqScKIiIhakjAiIqKWJIyIiKglCSMiImpJ\nwoiIiFqSMCIiopYkjIiIqCUJIyIiaknCiIiIWpIwIiKiliSMiIioJQkjIiJqScKIiIhakjCeQvbY\nYy8kdeS1xx57dfvjRMQ4m0DP9I52NmxYTaee97Fhw9gfDhURTy6Nn2FImiNppaTbJJ05TJ0vSlol\n6QZJh1XK75J0o6TrJV3TdKwRETG8Rs8wJE0CzgFmA/cAKyRdantlpc5rgH1tP0fSS4BzgSPKw5uB\nPtsPNhlnRES01/QZxixgle3VtjcCC4G5g+rMBS4AsP0zYJfyOd9QPOc08ywRET2g6V/G04A1lf21\nZdlIddZV6hi4XNIKSe9tLMqIJ4FOXbSQCxZiW/X6pPeRttdL2p0icfTbXj501XmV7b7yFTFxdOqi\nhVyw8NTUarVotVpj6qPphLEOmFHZn16WDa7zl0PVsb2+/PPXkhZTDHHVSBgREVHV19dHX1/flv35\n8+ePuo+mh6RWAPtJmilpCnACsGRQnSXAyQCSjgB+Y3uDpB0l7VyW7wS8Gril4XgjImIYjZ5h2N4k\n6TRgKUVyOs92v6RTi8NeYPu7kl4r6Xbg98ApZfOpwGJJLuO80PbSJuONiIjhye7MjVzdVCSVsX6O\nRcCb6dSNbcUFXp3oS3Tq70jqVEzQybiins79/eXvLop/T7ZHNaGVS1YjIqKWJIyIiKglCSMiImpJ\nwoiIiFqSMHre9h1bkjzqyTLw9U30u88n+ucbrVwltUXvXiXVezEVfU2EfztD6dWryXrxKqlejKmT\nJvLny1VSERHRmCSMiIioJQkjIiJqScKIiIhaen158+hZ23fkyqupU2dy7713jT2cntWZn1NEL0jC\niG30GHk2Qx2d+TkVJvrPKnpdhqQiIqKWJIyIiKglCSM
iImppPGFImiNppaTbJJ05TJ0vSlol6QZJ\nLxhN23iy69zSJ9ttt1OWUKmlF5eb6VxME2UZjuF0armSbdFowpA0CTgHOAY4GDhR0oGD6rwG2Nf2\nc4BTgS/XbduMVvNvMSatbgcwgtY2tBmYFB77a/PmR0Y4fuUo+uqG1ji+12h/5iP97LoV0/Dxbdiw\nuoNxjVWr4z0Wn68T/2dGr+kzjFnAKturbW8EFgJzB9WZC1wAYPtnwC6SptZs24BW828xJq1uBzCC\nVrcDGEGr2wG00ep2ACNodTuANlrdDmAErW4H0FFNJ4xpwJrK/tqyrE6dOm0jImKc9OJ9GNs0uPbM\nZ75hTG/6+OPreeSRMXURETGhNbq8uaQjgHm255T7HwVs+9OVOl8GrrT9rXJ/JXAUsHe7tpU+emvd\n4IiIJ4HRLm/e9BnGCmA/STOB9cAJwImD6iwB3g98q0wwv7G9QdJ9NdoCo//QERExeo0mDNubJJ0G\nLKWYLznPdr+kU4vDXmD7u5JeK+l24PfAKSO1bTLeiIgY3oR44l5ERDQvd3qXJE2X9ENJv5B0s6QP\ndDumwSRNkvRzSUu6HctgknaR9B1J/eXP8CXdjmmApA9KukXSTZIulDSly/GcJ2mDpJsqZbtKWirp\nVkk/kLRLD8X2mfLv9QZJiyQ9s1diqxz7sKTNknbrRmxlDEPGJ+n08ud3s6SzeyU2SYdK+qmk6yVd\nI+lF7fpJwviTx4EP2T4YeCnw/vG5UXBUzgB+2e0ghvEF4Lu2nwscCvTE8KGkPYHTgcNtH0IxDHtC\nd6Pi6xQ3pFZ9FFhm+wDgh8DHxj2qwlCxLQUOtv0CYBW9FRuSpgOvArp9x94T4pPUB7wBeL7t5wP/\n2IW4YOif3WeAs2wfBpwFfLZdJ0kYJdv32r6h3P4dxS+8nrnvo/xP8Vrgq92OZbDyG+crbH8dwPbj\nth/uclhV2wE7SZoM7Ajc081gbC8HHhxUPBc4v9w+HzhuXIMqDRWb7WW2N5e7VwPTxz0whv25Afwz\n8LfjHM4TDBPf3wBn2368rHPfuAfGsLFtBgbOZJ8FrGvXTxLGECTtBbwA+Fl3I9nKwH+KXpx02hu4\nT9LXyyGzBZJ26HZQALbvAT4H3E3xH+I3tpd1N6oh/bntDVB8eQH+vMvxDOddwPe6HcQASccCa2zf\n3O1YhrE/8EpJV0u6ss6wzzj6IPCPku6mONtoe+aYhDGIpJ2Bi4EzyjONrpP0OmBDeQYkeu9JOpOB\nw4H/bftw4BGKIZauk/Qsim/vM4E9gZ0lva27UdXSc18MJH0C2Gj7om7HAlB+Kfk4xXDKluIuhTOc\nycCuto8APgJ8u8vxVP0Nxe+5GRTJ42vtGiRhVJRDFhcD37B9abfjqTgSOFbSHcA3gaMlXdDlmKrW\nUnzLu7bcv5gigfSC/wDcYfsB25uAfwde1uWYhrKhXEMNSXsAv+pyPFuR9E6KIdFeSrb7AnsBN0q6\nk2Ko7DpJvXR2tobi3xy2VwCbJf1Zd0Pa4h22LwGwfTHF+n0jSsLY2teAX9r+QrcDqbL9cdszbO9D\nMWH7Q9sndzuuAeVQyhpJ+5dFs+mdyfm7gSMkPV3Fms6z6Y0J+cFnikuAd5bb7wC6+YVlq9gkzaEY\nDj3W9mNdi6oMp3xh+xbbe9jex/beFF9cDrPdzWQ7+O/1EuCvAMr/H0+zfX83AuOJsa2TdBSApNnA\nbW17sJ1XcS/KkcAm4AbgeuDnwJxuxzVEnEcBS7odxxBxHUpxZ/8NFN+odul2TJXYzqJIEjdRTCg/\nrcvxXEQx8f4YRUI7BdgVWAbcSnFV0rN6KLZVFFcg/bx8falXYht0/A5gtx77e50MfAO4GbgWOKqH\nYntZGdP1wE8pku2I/eTGvYiIqCVDUhERUUsSRkRE1JKEERERtSRhRERELUkYERFRSxJGRETUkoQR\nUZOksyR9qNtxRHR
LEkbEOJK0XbdjiNhWSRgRI5D0ifKhRj8GDijL9pH0PUkrJP1oYEmUsvynkm6U\n9N8l/bYsP0rSjyVdCvyiLDtJ0s/K1X3PLZctQdKrJF0l6VpJ35K0Y3c+ecQTJWFEDEPS4cBbgUOA\n1wEvLg8tAE6z/WKKNZbOLcu/APyz7UMp1jWqLqNwGHC67QPLB3MdD7zMxeq+m4GTykXp/g6YbftF\nwHXAh5v8jBGjMbnbAUT0sFcAi10suPdYeYawA8UaPN8ZOCsAnlb++VKKpdShWLun+gSza2zfXW7P\npljNd0XZx9OBDcARwEHA/y3Ln0axxk9ET0jCiKhPFGflD5ZnBoN5UN2q3w86dr7tT2zVufR6YKnt\nkzoRbESnZUgqYng/Bo6TtL2kZ1A8m/n3wJ2S3jxQSdIh5ebVwED5SM8NvwJ4s6Tdy/a7SppRtj9S\n0r5l+Y6SntPRTxQxBkkYEcOwfT3wLYpl0f8PcE156CTg3ZJukHQLcGxZ/kHgQ5JuoHi4z0PD9NtP\nMVexVNKNFMuZ7+Hiec/vBL5Zll9FOdEe0QuyvHlEh0jawfaj5fbxwAm2/2OXw4romMxhRHTOCyWd\nQzFH8SDwri7HE9FROcOIiIhaMocRERG1JGFEREQtSRgREVFLEkZERNSShBEREbUkYURERC3/H1AD\nmYWg6FMpAAAAAElFTkSuQmCC\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plt.hist(p,max(p),normed=1); # hist does the rest of the work, normed returns probablilities\n", "plt.xlabel('degree')\n", "plt.ylabel('probablility')\n", "plt.title('histrogram')\n", "plt.xlim([1,max(p)])\n", "if max(p) < 10:\n", " plt.xticks(np.arange(1,max(p)+1,1)) # avoid decimal values as ticks which dont make sense at a histogram" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Question 3:** Comment on the plot. What do you observe? Would you expect a similar degree disribution in the complete Twitter network?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Your answer here:** \n", "In this collected dataset there are a lot of nods having only one connection. Then the amount decreases very fast, roughly with $\\frac{1}{k}$ keeping it at a low but quite constant level for higher values of $k$. The maximum degree in this network is 18. \n", "I think, that the histogram of the complete Twitter network would look similar with a minor difference. Probably the degree with the highest probability is not 1 but slightly higher. 
Almost every user shares a couple of connections; the reason we found so many nodes with only one connection in our data set is that we stopped after the second_nodes and did not continue. Apart from that, I think it would look similar, with only a few users having a very large number of connections. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2.3 Average degree" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Calculate the average degree of your collected network." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# p: degree per node -> mean(p): average degree\n", "d_avg = np.mean(p)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2.4 Diameter of the collected network" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Question 4:** What is the diameter of the collected network? Please justify." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Your answer here:** \n", "The diameter, i.e. the maximum distance between two nodes of the network, is 4, because we went only 2 layers down from the starting user_id. In the worst case a path takes 2 hops up from the bottom layer to user_id and from there 2 hops down to another node in the bottom layer. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2.5 Pruning the collected network" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You might notice that some nodes have very few connections and hence our matrix is very sparse. Prune the collected network so that you keep only the nodes that have a degree that is greater than the average degree and plot the new adjacency matrix." 
] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# collect the indices of nodes that have a lower degree than average\n", "# (nodes whose degree equals the average are kept)\n", "indices = []\n", "for ind,degree in enumerate(p):\n", "    if degree < d_avg:\n", "        indices.append(ind)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "By pruning the average degree of the network changed from 4.000 to 6.500\n" ] } ], "source": [ "# create the pruned matrix by deleting the rows and columns that belong to these indices\n", "Wpruned = np.delete(copy.copy(W),indices,0)\n", "Wpruned = np.delete(Wpruned,indices,1)\n", "\n", "# compare the average degree before and after pruning\n", "d_avg_p = np.mean(Wpruned.sum(1))\n", "print('By pruning the average degree of the network changed from {0:.3f} to {1:.3f}'.format(d_avg,d_avg_p))" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false }, "outputs": [ { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAP4AAAEHCAYAAACOfPs0AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAHWlJREFUeJztnXu0HVV9xz+/QAgh2BBeiQsM+CihonBxraZ6Qe5xqRSf\nWFpZFpUELbpUWrvQJQ+7ek9QV6XVWKviEgQk1AcuXAj2oYHSQ8Q0QoFbIUBAMSBKwiMQJDwM5tc/\nZp/cc09m5syZM7Pn9fusddY9Z2bP3r/ZM7+79+zv/PYWVcUwjGYxq2gDDMPwjzm+YTQQc3zDaCDm\n+IbRQMzxDaOBmOMbRgMxx/eAiCwTkR/3/P6tiBxanEXVQ0S+KiKfLNqOumCOPwIi0hGRLSIyO0Hy\nnS9MqOoLVHVjfpblj4gcIiI7ROSWvu37icjvROS+hPnM+KcYhap+SFU/k8LOu0XknT2/x53dvduO\nEZEnRaQx/tCYE80aETkEOBbYAby9YHOKZC8ReXnP71OAXwxxvNDzTzE0wWgOuQY4ruf3ccBdfdte\nC6xV1R0jlFMpzPHTcyrwP8A3gOW9O0RkXxG5RkS2isg64KV9+3eIyEvc9zeLyK0u7f0iMtmX9lgR\n+YmIPO72n+q27yEin3PbHhKRC0Rkjts3ISK/EpEzRWSziPxaRJb35LmniHxeRDaKyBMissZt+zcR\n+Uhf+f8nIifG1MPlfed/KrCqL4+zROTnrlW9Q0Te4bYfDnwVeI17/Nnitl/qzuffReS3QMttO8/t\n/4SIrOv+QxCRD4nI7SKyR4h9/Y7/WuD8kG1rYs6xfqiqfVJ8gHuBDwKvAn4HHNCz7zvusydwBPAg\nsKZn/++Bl7jvxwFHuO+vAB4C3u5+HwI8CZwM7AYsAI50+74AfB+YD8wDrgY+4/ZNANuBSXfcm4Bt\nwHy3/yvA9cAighb31cBs4J3Auh47jwIeAXYPOf9D3HksBh5w+bwcuBN4PXBfT9o/Bxa67+8Enur5\nvay3bty2S4HHgVe733PctvPcbwE6wN8DLwO2dOslxM7FwPPAPu64TS6/B3q2PQEcW/Q95fX+LdqA\nKn4IuvjPAQvc7zuBj7rvs9w/gj/sSf+ZPsff0XX8kLy/AHzefT8b+F5EuqeAF/f8fk3X2ZzjbwNm\n9ezfDCx1N/rTwCtC8pwDPAa81P3+J+DLEeV3HX8WsBo4HvgH4Jx+xw859jbgbe57lON/I2TbeX3l\nP+bq/hMDrtd9wNuAMeDHbtu3e7ZtA2YXfV/5/FhXPx2nAqtV9XH3+9sENzDAAQSt7IM96e+PykhE\n/kRErheRh0XkCYJexP5u94sIeV4WkQOAvYBb3ODiFuA/gf16kj2mM59Znwb2dnnPIXCGGajqc8AV\nwHtERIC/JOjKD6Lb3X9XWHoROVVEbnOPK48T9IL270/Xx6/idqrq/cB/E/wDuGBAXj8m6Fkd574D\n3EjwD/I44CZV3T4gj1phjj8kIrInQdd7wj1bPwT8LXCUiLySoGv8PIHTdlkck+U3CbrsB6nqPsDX\nCFplCG7+l4Uc8yiBIx+hqvu6zz6qOj/BKTwKPEvfuEMPq4D3ELTa21T1pwny/B7wFuAXqtr7Dw8R\nWQxcCHxYVReo6gJgPdPnGDWwN2jA7y0EvZz/Aj43wL41BE5+LNOO3/1n0Lzne8zx0/BnBI79RwTP\nwEe57zcCp7pW9iqgLSJz3Yj3sqjMCFrhx1V1u4gsJRgV7/JN4PUi8hcispsbNDxKg77qRcA/u9Yf\nETlIRI4fZLw79lJgpYi8UERmiciru5Kkqq4jeBT5PINbe3HHPA28Djg9JM08l9+jrqzTCMYyumwG\nDk4oiQaFiuxPcP7vI+hpvFVE3hRzyBrgaAJH/4nbdjvwYqCFOb6RgFOBS1T116r6cPcDfBl4txtp\nPgN4AcFA3SXuE8WHgU+JyFbg7wi62gCo6q+ANwMfJxjAug040
u0+G/g5sM49IqwGDospp7cF/TjB\njX8zwXPyZ5l5L6wicM5/jclvRp6qequq/nKXBKp3EfwTWUcwsHYEwT/JLtcT9AA2icjDA8rr8jXg\nKlX9kapuAf4KuEhEFoQaqXov8DDwkKo+6bYpcBPBdVqbsNzaIG6gw/CEe3b+PbC4v1tcFkTkvcDp\nqnrcwMRGJbEW3z+vBJ4haP1Kh4jsRdAL+VrRthj54c3xReQE9/rkPSJylq9yI2zZ6F5MuU1EbvJY\n7knATwla/Ft7ti8QkdUiskFEfiQiSQbpsrDnYveCz8/c7+MJ9PMx4GPuxaITPNhxsFM21rsXcf7G\nbfdeLyG2/LXbPikiD7o68VUvc0Tkp+4+vV3cy12Z1IsPzZDgH8zPCaSX2cAUcHhRGiaBlLWgoLKP\nJXCsn/VsOx+nRQNnAZ8t0JZJ4EzPdbIIGHPf9wY2AIcXUS8xtnivF2fDXu7vbgTjJEuzqBdfLf5S\n4F5VvV8DvfQ7QNxroHkjFPSYo6o3ErSqvZwIXOa+Xwa8o0BbYFpq84KqblLVKff9KYJ36Q+mgHqJ\nsOUgt9trvTgbnnZf5wC7Ewyojlwvvm7+g5j5QsaDTFdmEShwrYjcLCJhEpRvDlTVzRDceMCBBdtz\nhohMicjXfT12dJEgXHmMoHVbWGS99NjSfZfBe704CfQ2gjGha1X1ZjKol6YO7h2jqq8ikMo+IiLH\nFm1QH0VKLRcQvE48RnCzrfRVsIjsDVxJ8PrzU+xaD97qJcSWQupFVXeo6tEEPaClInIEGdSLL8f/\nNTPfXjvYbSsEVX3I/X2E4GWbpUXZ4tgsIgsBRGQRgeZcCKr6iLqHR4KXZP7YR7kisjuBo12uqle7\nzYXUS5gtRdVLFw3eP+gAJ5BBvfhy/JuBl0kwecMeBO90X+Op7BmIyF7uvzkiMo8guOQO32Yw83nx\nGqZDW5cRRNoVYou7kbqchL+6uQS4U1W/2LOtqHrZxZYi6kVE9u8+UojIXOCNBGMOo9eLx9HJEwhG\nSO8FzvY9Otpjx4sJVIXbCN5e82oL8C3gNwTRfQ8ApxGE217n6mc1sE+BtqwCfubq6Pu48Nmc7TiG\nQOLsXpdb3f2yr+96ibGliHp5pSt/ypX9Sbd95HqxN/cMo4E0dXDPMBqNOb5hNBBzfMNoIOb4htFA\nRnL8MgXeGIaRnNSj+m7CiXsIpmj6DYFW/y5VvbsvnckGhlEQqhoaXzBKi5848KarHU5OTg7UFycm\nJgneQIz/TEwMzivuk8QWX5+uLfHnbvWS97lXrV4GfeIYxfHLFnhjGEZCdvdRSLvdBqDT6dDpdGi1\nWj6KNYxG0fWvJIzi+IkDb3odvyxOXxY7wGyJwmwJJ8qWVqs1Y9+KFSsi8xjF8XcG3hDMJvsuggUY\nIsmz8j7wgc9yzz3PDkx32GF7cuGFZxd2IaPt7LB27TqmYy9mMfP/6kag7b7vSTDJbpfPEkyVD1NT\nG2m12oTRPfcktmzYsJ5nnpkHwNy5s1iyJHxpgP48s6R7jTZsWM/0uUcTpJsm7p4Y9vyS3C/D3oNJ\nyOIahZHa8VX19yJyBkGQwCzgYg2mUi6Ee+55lhtuaCdImSRNfsTb2SaZff1pnt25betWuOGGZMcl\ntWXrVtgUOTVo1PHZEdzcg8t55pnlM377Pr887sHRziGfFh9V/SGwZJQ8DMPwj5fBvWE47LA9SfIf\nMUjXTGbPvpvx8fbO31NTG9m6tTh78mbu3FmJzm/uXHsRNSmlc/y8nhfrxPj44XQ67Z2/W612TPe+\n+ixZsjimKz4znZEM+xdpGA3EHN8wGoiXrn6UvAT5ykF5kod0YzSTuHtpamoj0TJuerw4frzEEbev\nvFRFPjTKTxH3knX1DaOBlG5UPy1VkQHj7AzexFoODH6bLGmeaY9La0seVOX88r4H58/fyNjYdP6D\nziFW6ck7hBBQ0MjPxMSkV
pGJicnY86r6+Rn+yOteCtw73C+tq28YDcQc3zAaSG2e8Y1sMbmy3pjj\nG6GYXFlvrKtvGA3ES4s/MdGO3Fe0vJaWqsiHRvkp4l7y4vi9kWR1wZ5rjawo4l6yrr5hNBBzfMNo\nIIWM6vuWirKcdHHU8uLyTGun/4kx003uWSZ8X6O4PNPaOcpkm15e2fX1imIU8eVlb0va80tvp+9z\nqP7ryv6vURHXAdUIv7SuvmE0EHuBxwglTmKq++SeTcAc3wgl7hm07pN7NgHr6htGAzHHN4wGUpuu\nfhETFvolbv24uDX3pulfW67JpF2Pry6M5PgishHYCuwAtqvq0iyMSkP9o8ni1o9rx+ybpn9tuSaT\ndj2+ujBqi78DaKnq41kYYxiGH0Z1fCHFOEGRkW3DTliYxpY8Jodcu3Yb27cnNiGUrNaWq0NkYtr1\n+Mo0gWdhk20C9wG3AjcDp0ekGfqtqjTUffLLur9J55u63y+q8W/ujdriH6OqD4nIAcC1InKXqt7Y\nn6jdbu/83mq1aLVaIxZrGEY/nU6HTqeTKO1Ijq+qD7m/j4jIVcBSINbxDcPIh/5GdcWKFZFpUzu+\niOwFzFLVp0RkHnA8EF1SztRBnslHksw+kq5M0ZV5lFcVktZLGKO0+AuBq0REXT7fVNXVI+Q3EnWQ\nZ/KRJJ/dmX7r1rgBn+R5+pZO6y/VpmNwveTQ4qvqL4GxtMcbhlEctXlzL608U0WGkSQtki6cOkiS\no1Abx1+yZDGbNiVLV3XGxg5NPIGpRdKF06SxgDCq3/wZhjE05viG0UBq09U3ykL1J+LMg7JJkub4\nRsZkLx/WgbJJktbVN4wGUpsWvw7yTB7n4DvPPOTDOlzbPBhUL7lF5yX54Ck6zygHTYh6S0MR9YLN\nq28YRi/m+IbRQEr3jF822aOp2HXwiX8JtHSOXzbZo6nYdfCJfwnUuvqG0UBK1+Ib1cakt3DKthah\nOb6RKfa8H07Z1iK0rr5hNBBzfMNoILXp6tddfqrD+aU9hzzOPS7PYEajecDghTHytjNO6ktqZxi1\ncfy6y091OL+055DHucfn2aZXXoue2Wnm8f4nS20Tb2f0ZJvW1TeMBlK6Ft/koHJg18EfRUh9pXP8\nsj6fNg27Dv4oQuqzrr5hNBBzfMNoIIV09esgTeVBXL2sXbsOWO5+zQJ6pZv1wDyXblsqycePNFVv\nqrR+40DHF5GLgbcCm1X1SLdtAXAFcAiwEThZVRMPQdRBmsqDpBJT3L7t24eVfHrTJbUl+rgmU6X1\nG5N09S8F/rRv29nAdaq6BLgeOCdrwwzDyI+BLb6q3igih/RtPhGYcN8vAzokX7c5F0x+Kj9pr5Hv\nCUPj1iKMKy+P9RtHsTNODUj7jH+gqm4GUNVNInJgynwyw54zy0/aa5THtc0jzzzWbxzFTpEclsnu\nQ+N2ttvtnd9brVZGRRqG0Uun06HT6SRKm9bxN4vIQlXdLCKLgIfjEvc6fkAnZbGGYUTRarVmNKwr\nVoze4ov7dLmGQFs6H1gGXB1vUHvG73hpapphZI8yRX7FEVfe1NRGpp/n9mTmsEmcVBRXn+lkwHhb\n/OI7Oi9tnnnIebndn1ET7nc/wLeA3wDPAQ8ApwELgOuADcBqYJ+Y40MWDki2uMD8+csSLx6QdsEC\n3wsdJC1v1zpaNkTaLPYlS+djYYw8rlEeec6fH3eN/N7Xqhq7oEaSUf1TIna9Ifm/F8MwykTpgnR6\nGUb2qBvz529kbKy98/fatdvYvr0ctvRi8ug0ech5eVFqxx9G9qgbY2OH0um0d/4uYkLGKFuMcPKQ\n8/Ki+H89hmF4xxzfMBqIp65+u+/3RrKWiuKlFL+yVR5RduntjKuXuDxHn+SxaOk0jipF0kWTbM29\nMApy/KT7khMfGdXeuW/X6LXsbcknyq4/XVKSRYztmmaUSR7D8yxTxF+VIumiGbTmnk22aRh
GD4WP\n6s+efTfj4+3QfcNIRUmllDiqIlv12xkXpZVUBuy/DkWs5xZFHtF5viPpdk03ep6jXKPCHX98/PBM\npKKkUkocVZGthrEzqQzYfx2KlA/7aUIkXZo8R7lG1tU3jAZijm8YDaSgrn4yGSK7yKjsowHTUyY7\no69DHjKnbwktvayazhbfUYRJzyGMghx/kAzRpZ04x6RyXnweyxOXl54y2Zn9dYjDt4SWXlZNZ0uR\na/yFY3KeYRg9eGnxJybaM37nIRVlIedlFTUVJ8FkEWWXlcSU9DoMIx/uWvY0VYpe61ImW7LEi+P3\nS095SEVZyHlZRU3lvRZaVhJTUluykjmrFL3WpUy2ZEk9/50ZhhGLOb5hNJDC39wrnuylxTjSy47R\nkXt52FkV0k9eWo3y4u+X9OWZ45dK0mon2ucjwrAq+I74811eUgl02PKsq28YDaSQFr9Ma6H5jkLL\nQnbMCt/rDRa5vmFWkmTa8tLmmfR+CYtyzWPtvJEo01povqPQspAds8L3WECRYw++Iy99S6BhUa5x\na+dZV98wGog5vmE0kIFdfRG5GHgrsFlVj3TbJoHTmV4s81xV/WHSQss06WK5yH5izF6KXlsuLb6j\n7JpAkmf8S4EvAav6tq9U1ZVpCi3TpIvlIo+JMeOOi6ZM18h3lF0TGNjVV9UbgcdDdknINsMwKsAo\no/pniMh7gf8FPqaqJRGphqNMklYdJsYsKz4iL3dNV97y0jr+BcB5qqoi8mlgJfD+lHkVSpkkrTpM\njFlWfERe5kFe5aVyfFV9pOfnRcAP4tK32+2d31utVpoiDcMYQKfTodPpJEqb1PGFnmd6EVmkqt3X\nCk4C7og7uNfxnYkJizUMIymtVmtGw7piRfQLPEnkvG8BLWA/EXkAmAReJyJjwA4CnemDwxiYdtLF\nqkhMcbakj+7KfmLMPCLN8qiXeMnOb0Rjme7BURjo+Kp6SsjmS0cpNO2ki9WRmHqZmSb9OVRDssun\nXtoxNiTbl1VEY5nuwVGwN/cMo4EUEqRTxUkXfdMf3ZV2Ysxeio40M8pDIY5fxUkXfdMf3eV7Ysy8\n8zSKpblNqmE0GHN8w2ggNZpzL12EWlXkGd/rzpWLPCac9DvJatmokeOnk7uqIs/4XneuXOQx4aTf\nSVbLhnX1DaOBVGqyzSzWgcuKPM4h7rg8JFDfk57G5ZlH1GIek6wWOWFopqhqrp+giPyZmJhU0IGf\niYnJTI7zTVXszIM8zr0J9el8L9QvratvGA3EHN8wGkiNRvXzoLmST1VkzqpQtvo0x4+luZJPVWTO\nqlC2+rSuvmE0kNq0+HWQCOOojYyUgjLJjnWhNo6f9rkoi8kvfdDk5+gyrbVYF6yrbxgNxBzfMBpI\nbbr6ZSIP6SYuz+DV1GBSyUFrvVexi1s2KSyKPCYvzQtz/BzwP4llm17ZMXp2o+TllYmySWFRVMVO\nsK6+YTQSa/FjaLLk0+Rzz5syTF5qjh9DFZ+Hs6LJ5543ZZi81Lr6htFAzPENo4EkWTvvYGAVsJBg\nrbyLVPVfRGQBcAVwCMGMhyeraklecDVGpSoSmpGOJM/4zwNnquqUiOwN3CIiq4HTgOtU9R9F5Czg\nHIoWJ43MqJI0ZQzPwK6+qm5S1Sn3/SngLuBg4ETgMpfsMuAdeRlpGEa2DDWqLyKHAmMEaxMvVNXN\nEPxzEJEDM7euoviOJoubVDJteWWiKtJiVewEkGBOvgQJg25+B/iUql4tIltUdd+e/Y+p6n4hx2nS\nMozyEEQmtgemm5hoFy5NGeGICKoqYfsStfgisjtwJXC5ql7tNm8WkYWqullEFgEPRx3fbrd3fm+1\nWrRarYSmG4aRlE6nQ6fTSZQ2UYsvIquAR1X1zJ5t5wNbVPV8N7i3QFV3GdyzFr+aWItffeJa/IGO\nLyLHAGuA2wF1n3OBm4DvAi8C7ieQ854IOV4nJiZnbCt
TNJnvSLq4PH1LaHHlrV27ju3bF7lfs4De\na7SeYFkrmD17G+PjRySys0wSYVWiHUex84YbVkQ6vpcFNXZdqKA8ixmUabEG34s8xJeXdl95zi+f\nc6+SnahG+KW9uWcYDcQc3zAaiDm+YTQQc3zDaCDm+IbRQGwijpozSLKD5e5Xv2QXt28jZZo40i/1\nWE/RHL/mJJ2kc7h9/emaRD3WU7SuvmE0EC8t/sREe8bvMkWTlWldtipFd3WZPftuxsfbofvKfH5p\nox19r6c4SlRm7PJvUW/2ZPUJijCKwvfbeXWnTG8fDgJ7c88wjF7M8Q2jgRQyql+mKK06kM+abetJ\n8jy+YcP6hPkZZaIQx7eJHLMln/qclyh9d3DJqBbW1TeMBmIv8DSI/jXb4uSgtWu3sX374Dznzm1W\n21EmSXIUzPEbxDBrtgVTbw1OF/UORl2py5hTs/5dG4YBFNTiByPB7YTpklGmOdTymHMv7hzig21m\n5pENfgNVmqwC5XXuhTh+cAO3E6RbnjjPpMEoW7fCpk1RuQy2aXRbostLfw7T++LIbgTeb6BKk1Wg\nvM7duvqG0UDM8Q2jgRTS1Z87d1aiCKemSUV5M0x9xslWviPUjOwpxPGXLFkc85w9M52RHcPUZ9xA\nUVKpzygv1qQaRgOxF3hyIA+5sslUpT79y47JZNUwBjq+iBwMrAIWAjuAC1X1SyIyCZzO9Cq556rq\nD4eyu6bkIVc2marUp3/ZcZCsuiLyyCQt/vPAmao6JSJ7A7eIyLVu30pVXTmsuYZhFMtAx1fVTcAm\n9/0pEbkLOMjtDl+JcwC+57nzPcdfWtUi7Tn4Pj/fgSpNVoFyU1ei5uQK+wCHEkyqvjcwCfwSmAK+\nDsyPOMbL/GJlokrzslWBqtRnuVY7jp9zL/HgnuvmXwl8VIOW/wLgPFVVEfk0sBJ4f9ix7XZ75/dW\nq0Wr1Rruv5NhGAnouM9gEjm+iOxO4PSXq+rVAKr6SE+Si4AfRB3f6/iGYeRFy326jDa4B3AJcKeq\nfrG7QUQWafD8D3AScMdQNmZMkyO44sgjUjDuOP/UY0kr3ySR844B3g3cLiK3AQqcC5wiImMEEt9G\n4IM52jmQJkdwxZFPpGD0cf6px5JWvkkyqv8TYLeQXabZG0ZFsTf3cqAu87KVhaoEDPm+7oPKi4un\nMMfPAXuWzJaqBAz5vu6DyhOJHtyr3xsPhmEMxBzfMBpIbZbQKlMEV9pJM9NOCprPElrZUxWJsCp2\njkJtltAqUwRXFpNmDjMpaFWkN7OzPFhX3zAaSG1G9ZscwZWU/iW0eumXmKoiSVbFzrJRG8e3efwG\nM8wSWlV5dq2KnWWjuc2fYTQYc3zDaCC16ernQRNknaypirTYdMzxY2iCrJM1VmfVwLr6htFACmnx\nfU+2mTbPtPieNLNM556UYaRF31SxPocmajK+rD5UeLLNqkzyWCaszsoDMZNtWlffMBqIOb5hNJDS\njepXR0KzSR6N6lI6x6+OHGSTPBrVxbr6htFAStfil4mqTPJYJhohhdUAc/wYqjLJY5mw8YxqYF19\nw2gg5viG0UBKN9lm2giucsmA6aS+PNa5SzuBZ5nI49qW635JR9JzCCPJ2nlzgDXAHi79laq6QkQW\nAFcAhxCsnXeyqiYa7spDsiuXDJhO6stnnbv2DFuSTuBZJup/v6Rj8DmMsKCGqj4HvE5VjwbGgDeJ\nyFKCpvg6VV0CXA+cM4zRhmEUR6Kuvqo+7b7OcccocCIw4bZfBnTIeGaFqkZwmdRnlJ1Eji8is4Bb\ngJcCX1HVm0VkoapuBlDVTSJyYNbGDTM5pG9M6jOqTNIWfwdwtIj8AXCViBxB0OrPSBZ1fLvd3vm9\n1WoNbaRhGEnouM9ghhrVV9UnRaQDnABs7rb6IrIIeDjquF7HnzbQMIxsablPl+jBvSSj+vsD21V1\nq4jMBd5IoFddAyw
HzgeWAVcnNS+Pde7S5lmddfyiJcK1a9cRXAoIxmt7JbuNpJFHy4T/dRHLE3mZ\n1+SlSVr8FwKXuef8WcAVqvofIrIO+K6IvA+4Hzg5aaF5rHOXNs/qrOMXJxG2E5VXZmkqDv/rIpYn\n8jIv2XGg46vq7cCrQrZvAd4wVGmGYZSCQt7cy2OduzKtnZfWlrwlwtmz72Z8PDz/ouXROPK4tnWT\nY8Ok7zhlqRDHz2OduzKtnZfWlrwlwvHxw0srj8aRx7WtmxwbJn2LjPDmXpZ0Oh2fxcVitoRjtkTR\nKdqAHjoj52COXwLMlnDKZIs5vmEYlccc3zAaiAQLbuRYgEi+BRiGEYmqStj23B3fMIzyYV19w2gg\n5viG0UDM8Q2jgZjjG0YDMcc3jAby/xcocB+ds42IAAAAAElFTkSuQmCC\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plt.spy(Wpruned, markersize=10)\n", "plt.title('Adjacency Matrix W');" ] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python [Root]", "language": "python", "name": "Python [Root]" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.2" } }, "nbformat": 4, "nbformat_minor": 1 }