{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Tutorial Outline\n", "\n", "- [Introduction](#Introduction)\n", "- [Prerequisites](#Prerequisites)\n", "- [How does it work?](#How-does-it-work?)\n", "- [Authentication](#Authentication)\n", "  - [Authentication keys](#Authentication-keys)\n", "- [MongoDB Collection](#MongoDB-Collection)\n", "- [Starting a Stream](#Starting-a-Stream)\n", "  - [Stream Listener](#Stream-Listener)\n", "  - [Connect to a streaming API](#Connect-to-a-streaming-API)\n", "- [Data Access and Analysis](#Data-Access-and-Analysis)\n", "  - [Load results to a DataFrame](#Load-results-to-a-DataFrame)\n", "- [Visualization](#Visualization)\n", "\n", "# Introduction\n", "\n", "Twitter provides two types of API to access its data:\n", "\n", "- RESTful API: used to get data about existing objects such as statuses (\"tweets\"), users, etc.\n", "- Streaming API: used to get live statuses (\"tweets\") as they are sent\n", "\n", "Reasons to use the streaming API:\n", "\n", "- Capturing large amounts of data, because the RESTful API has limited access to older data\n", "- Real-time analysis, such as monitoring social discussion about a live event\n", "- In-house archiving, such as archiving social discussion about your brand(s)\n", "- An AI response system for a Twitter account, such as automated replies, filing questions, or providing answers\n", "\n", "# Prerequisites\n", "\n", "- Python 2 or 3\n", "- Jupyter with IPyWidgets\n", "- Pandas\n", "- NumPy\n", "- Matplotlib\n", "- MongoDB installation\n", "- PyMongo\n", "- Scikit-learn\n", "- Tweepy\n", "- Twitter account\n", "\n", "\n", "# How does it work?\n", "\n", "The Twitter streaming API provides data through a streaming HTTP response. This is very similar to downloading a file, where you read a number of bytes, store them to disk, and repeat until the end of the file. The only difference is that this stream is endless. 
The only things that can stop this stream are:\n", "\n", "- You close your connection to the streaming response\n", "- Your connection is too slow to keep up with the data and the server's buffer fills up\n", "\n", "This means the process occupies the thread it was launched from until the stream is stopped. In production, you should always start it in a different thread or process so that your software doesn't freeze until you stop the stream.\n", "\n", "# Authentication\n", "\n", "You will need four keys from Twitter's developer site to start using the streaming API. First, let's import the libraries we need for working with the Twitter API, data analysis, data storage, etc." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "import tweepy\n", "import matplotlib.pyplot as plt\n", "import pymongo\n", "import ipywidgets as wgt\n", "from IPython.display import display\n", "from sklearn.feature_extraction.text import CountVectorizer\n", "import re\n", "from datetime import datetime\n", "\n", "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Authentication keys\n", "\n", "1. Go to https://apps.twitter.com/\n", "2. Create an App (if you don't have one yet)\n", "3. Grant read-only access to your account\n", "4. 
Copy the four keys and paste them here:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [], "source": [ "api_key = \"YOUR_API_KEY\" # <---- Add your API Key\n", "api_secret = \"YOUR_API_SECRET\" # <---- Add your API Secret\n", "access_token = \"YOUR_ACCESS_TOKEN\" # <---- Add your access token\n", "access_token_secret = \"YOUR_ACCESS_TOKEN_SECRET\" # <---- Add your access token secret\n", "\n", "auth = tweepy.OAuthHandler(api_key, api_secret)\n", "auth.set_access_token(access_token, access_token_secret)\n", "\n", "api = tweepy.API(auth)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# MongoDB Collection\n", "\n", "Connect to MongoDB and create/get a collection." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "2251" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "col = pymongo.MongoClient()[\"tweets\"][\"StreamingTutorial\"]\n", "col.count()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Starting a Stream\n", "\n", "We need a listener that extends the `tweepy.StreamListener` class. There are a number of methods you can override to tell the listener what to do. 
Some of the important methods are:\n", "\n", "- `on_status(self, status)`: Called with a status (\"tweet\") object when a tweet is received\n", "- `on_data(self, raw_data)`: Called when any data is received; the raw data is passed in\n", "- `on_error(self, status_code)`: Called when you get a response with a code other than 200 (OK)\n", "\n", "## Stream Listener" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [], "source": [ "class MyStreamListener(tweepy.StreamListener):\n", "\n", "    counter = 0\n", "\n", "    def __init__(self, max_tweets=1000, *args, **kwargs):\n", "        self.max_tweets = max_tweets\n", "        self.counter = 0\n", "        # Explicit form of super() so the class works on both Python 2 and 3\n", "        super(MyStreamListener, self).__init__(*args, **kwargs)\n", "\n", "    def on_connect(self):\n", "        self.counter = 0\n", "        self.start_time = datetime.now()\n", "\n", "    def on_status(self, status):\n", "        # Increment counter\n", "        self.counter += 1\n", "\n", "        # Store tweet to MongoDB\n", "        col.insert_one(status._json)\n", "\n", "        # Update the progress widgets (progress_bar and wgt_status are created in a later cell)\n", "        mining_time = datetime.now() - self.start_time\n", "        # Treat anything under one second as one second to avoid dividing by zero\n", "        elapsed = max(1.0, mining_time.total_seconds())\n", "        rate = self.counter / elapsed\n", "        progress_bar.value = int(100.0 * self.counter / self.max_tweets)\n", "        html_value = \"\"\"Tweets/Sec: %.1f\"\"\" % rate\n", "        html_value += \"\"\" Progress: %.1f%%\"\"\" % (100.0 * self.counter / self.max_tweets)\n", "        html_value += \"\"\" ETA: %.1f Sec\"\"\" % ((self.max_tweets - self.counter) / rate)\n", "        wgt_status.value = html_value\n", "\n", "        # Stop once enough tweets have been collected\n", "        if self.counter >= self.max_tweets:\n", "            myStream.disconnect()\n", "            print(\"Finished\")\n", "            print(\"Total Mining Time: %s\" % (mining_time))\n", "            print(\"Tweets/Sec: %.1f\" % (self.max_tweets / elapsed))\n", "            progress_bar.value = 0\n", "\n", "\n", "myStreamListener = MyStreamListener(max_tweets=100)\n", "myStream = tweepy.Stream(auth=api.auth, listener=myStreamListener)" ] }, 
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Connect to a streaming API\n", "\n", "There are two methods to connect to a stream:\n", "\n", "- `filter(follow=None, track=None, async=False, locations=None, stall_warnings=False, languages=None, encoding='utf8', filter_level=None)`\n", "- `firehose(count=None, async=False)`\n", "\n", "**Note**: `async` is a reserved word from Python 3.7 onward; Tweepy 3.7 and later name this parameter `is_async`.\n", "\n", "Firehose captures everything. Make sure your connection is fast enough to handle the stream and that your storage can absorb tweets at the same rate. We cannot really use firehose for this tutorial, so we'll be using `filter`.\n", "\n", "You have to specify one of two things to filter:\n", "\n", "- `follow`: A list of user IDs to follow. This streams all their tweets and retweets, as well as others retweeting their tweets. It doesn't include mentions or manual retweets where the user doesn't press the retweet button.\n", "- `track`: A string or list of strings to filter on. Multiple words separated by spaces are combined with AND. Multiple words in a string separated by commas, or a list of words, are combined with OR.\n", "\n", "**Note**: `track` is case insensitive.\n", "\n", "### What to track?\n", "I want to collect all tweets that contain any of these words:\n", "\n", "- Jupyter\n", "- Python\n", "- Data Mining\n", "- Machine Learning\n", "- Data Science\n", "- Big Data\n", "- IoT\n", "- #R\n", "\n", "This could be done with a string or a list. It is easier to do it with a list, which keeps your code clear to read." 
] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Finished\n", "Total Mining Time: 0:01:21.477351\n", "Tweets/Sec: 1.2\n", "Tweets collected: 100\n", "Total tweets in collection: 2351\n" ] } ], "source": [ "keywords = [\"Jupyter\",\n", "            \"Python\",\n", "            \"Data Mining\",\n", "            \"Machine Learning\",\n", "            \"Data Science\",\n", "            \"Big Data\",\n", "            \"DataMining\",\n", "            \"MachineLearning\",\n", "            \"DataScience\",\n", "            \"BigData\",\n", "            \"IoT\",\n", "            \"#R\",\n", "            ]\n", "\n", "# Visualize a progress bar to track progress\n", "progress_bar = wgt.IntProgress(value=0)\n", "display(progress_bar)\n", "wgt_status = wgt.HTML(value=\"\"\"Tweets/Sec: 0.0\"\"\")\n", "display(wgt_status)\n", "\n", "# Retry the filter up to 20 times if the stream raises an error\n", "for error_counter in range(20):\n", "    try:\n", "        myStream.filter(track=keywords)\n", "        print(\"Tweets collected: %s\" % myStream.listener.counter)\n", "        print(\"Total tweets in collection: %s\" % col.count())\n", "        break\n", "    except Exception as e:\n", "        # Report the error instead of silently swallowing it, then retry\n", "        print(\"ERROR# %s: %s\" % (error_counter + 1, e))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Data Access and Analysis\n", "\n", "Now that we have stored all these tweets in a MongoDB collection, let's take a look at one of these tweets." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "{'_id': ObjectId('56937d2e105f1970314720e2'),\n", " 'contributors': None,\n", " 'coordinates': None,\n", " 'created_at': 'Mon Jan 11 10:00:14 +0000 2016',\n", " 'entities': {'hashtags': [{'indices': [22, 27], 'text': 'Rの法則'}],\n", " 'symbols': [],\n", " 'urls': [],\n", " 'user_mentions': []},\n", " 'favorite_count': 0,\n", " 'favorited': False,\n", " 'filter_level': 'low',\n", " 'geo': None,\n", " 'id': 686487772970942466,\n", " 'id_str': '686487772970942466',\n", " 'in_reply_to_screen_name': None,\n", " 
'in_reply_to_status_id': None,\n", " 'in_reply_to_status_id_str': None,\n", " 'in_reply_to_user_id': None,\n", " 'in_reply_to_user_id_str': None,\n", " 'is_quote_status': False,\n", " 'lang': 'ja',\n", " 'place': None,\n", " 'retweet_count': 0,\n", " 'retweeted': False,\n", " 'source': 'Twitter for iPhone',\n", " 'text': '体力落ちてきておばさんみたいになってきた。\\n#Rの法則',\n", " 'timestamp_ms': '1452506414059',\n", " 'truncated': False,\n", " 'user': {'contributors_enabled': False,\n", " 'created_at': 'Tue Aug 18 16:19:16 +0000 2015',\n", " 'default_profile': True,\n", " 'default_profile_image': False,\n", " 'description': '☮ 関ジャニ∞ & 山田涼介 & Justin Bieber & Benjamin Lasnier & Selena Gomez ☮',\n", " 'favourites_count': 1121,\n", " 'follow_request_sent': None,\n", " 'followers_count': 121,\n", " 'following': None,\n", " 'friends_count': 92,\n", " 'geo_enabled': True,\n", " 'id': 3318871652,\n", " 'id_str': '3318871652',\n", " 'is_translator': False,\n", " 'lang': 'en',\n", " 'listed_count': 0,\n", " 'location': 'The land of dreams',\n", " 'name': 'rena',\n", " 'notifications': None,\n", " 'profile_background_color': 'C0DEED',\n", " 'profile_background_image_url': 'http://abs.twimg.com/images/themes/theme1/bg.png',\n", " 'profile_background_image_url_https': 'https://abs.twimg.com/images/themes/theme1/bg.png',\n", " 'profile_background_tile': False,\n", " 'profile_banner_url': 'https://pbs.twimg.com/profile_banners/3318871652/1452436374',\n", " 'profile_image_url': 'http://pbs.twimg.com/profile_images/683964013558931456/Q1rx1s5b_normal.jpg',\n", " 'profile_image_url_https': 'https://pbs.twimg.com/profile_images/683964013558931456/Q1rx1s5b_normal.jpg',\n", " 'profile_link_color': '0084B4',\n", " 'profile_sidebar_border_color': 'C0DEED',\n", " 'profile_sidebar_fill_color': 'DDEEF6',\n", " 'profile_text_color': '333333',\n", " 'profile_use_background_image': True,\n", " 'protected': False,\n", " 'screen_name': 'Q2HpiJwCX1huBwf',\n", " 'statuses_count': 497,\n", " 'time_zone': None,\n", " 
'url': None,\n", " 'utc_offset': None,\n", " 'verified': False}}" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "col.find_one()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load results to a DataFrame" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>created_at</th><th>source</th><th>text</th><th>user</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>Mon Jan 11 10:00:14 +0000 2016</td><td>&lt;a href=\"http://twitter.com/download/iphone\" r...</td><td>体力落ちてきておばさんみたいになってきた。\\n#Rの法則</td><td>@Q2HpiJwCX1huBwf</td></tr>\n",
"    <tr><th>1</th><td>Mon Jan 11 10:09:26 +0000 2016</td><td>&lt;a href=\"http://twitter.com/download/android\" ...</td><td>皆におばさんと言われてうれしがってる #Rの法則</td><td>@Tamutamu1017</td></tr>\n",
"    <tr><th>2</th><td>Mon Jan 11 10:00:10 +0000 2016</td><td>&lt;a href=\"http://trendkeyword.blog.jp/\" rel=\"no...</td><td>【R.I.P】急上昇ワード「R.I.P」のまとめ速報 https://t.co/yi1yfC...</td><td>@pickword_matome</td></tr>\n",
"    <tr><th>3</th><td>Mon Jan 11 10:00:10 +0000 2016</td><td>&lt;a href=\"http://twitter.com\" rel=\"nofollow\"&gt;Tw...</td><td>#Rの法則 \\nどれもおばさん臭いけれどやっぱり黄色が一番だなぁ</td><td>@kakinotise</td></tr>\n",
"    <tr><th>4</th><td>Mon Jan 11 10:00:10 +0000 2016</td><td>&lt;a href=\"http://bufferapp.com\" rel=\"nofollow\"&gt;...</td><td>The New Best Thing HP ATP - Vertica Big Data S...</td><td>@DataCentreNews1</td></tr>\n",
"    <tr><th>5</th><td>Mon Jan 11 10:00:11 +0000 2016</td><td>&lt;a href=\"http://dlvr.it\" rel=\"nofollow\"&gt;dlvr.i...</td><td>IoT Now: l’Internet of Things è qui, ora https...</td><td>@datamanager_it</td></tr>\n",
"    <tr><th>6</th><td>Mon Jan 11 10:00:11 +0000 2016</td><td>&lt;a href=\"http://trendkeyword.doorblog.jp/\" rel...</td><td>今話題の「R.I.P」まとめ https://t.co/VOc5cwK5hg #R.I.P ...</td><td>@buzz_wadai</td></tr>\n",
"    <tr><th>7</th><td>Mon Jan 11 10:00:11 +0000 2016</td><td>&lt;a href=\"http://twitterfeed.com\" rel=\"nofollow...</td><td>#oldham #stockport VIDEO: Snake thief hides py...</td><td>@Labour_is_PIE</td></tr>\n",
"    <tr><th>8</th><td>Mon Jan 11 10:00:11 +0000 2016</td><td>&lt;a href=\"https://about.twitter.com/products/tw...</td><td>Las #startup pioneras de #machinelearning ofre...</td><td>@techreview_es</td></tr>\n",
"    <tr><th>9</th><td>Mon Jan 11 10:00:12 +0000 2016</td><td>&lt;a href=\"http://www.linkedin.com/\" rel=\"nofoll...</td><td>Lets talk about how to harness the power of ma...</td><td>@jansmit1</td></tr>\n",
"    <tr><th>10</th><td>Mon Jan 11 10:00:13 +0000 2016</td><td>&lt;a href=\"http://catalystfive.com\" rel=\"nofollo...</td><td>Business Intelligence and Big Data Consulting ...</td><td>@Catalyst5Jobs</td></tr>\n",
"    <tr><th>11</th><td>Mon Jan 11 10:00:13 +0000 2016</td><td>&lt;a href=\"http://twitter.com/NewsICT\" rel=\"nofo...</td><td>[情報通信]2016年台北国際コンピューター見本市が新しい位置づけと新しい展示で装い新たに!...</td><td>@NewsICT</td></tr>\n",
"    <tr><th>12</th><td>Mon Jan 11 10:02:10 +0000 2016</td><td>&lt;a href=\"http://dlvr.it\" rel=\"nofollow\"&gt;dlvr.i...</td><td>#bonplan Parties de Laser Quest entre amis à 2...</td><td>@Bons_Plans_</td></tr>\n",
"    <tr><th>13</th><td>Mon Jan 11 10:02:10 +0000 2016</td><td>&lt;a href=\"http://dlvr.it\" rel=\"nofollow\"&gt;dlvr.i...</td><td>Parties de Laser Quest entre amis à 22.00€ au ...</td><td>@keepmymindfree</td></tr>\n",
"    <tr><th>14</th><td>Mon Jan 11 10:02:10 +0000 2016</td><td>&lt;a href=\"http://twitter.com\" rel=\"nofollow\"&gt;Tw...</td><td>RT @jose_garde: Why a Simple Data Analytics St...</td><td>@martingeldish</td></tr>\n",
"    <tr><th>15</th><td>Mon Jan 11 10:02:11 +0000 2016</td><td>&lt;a href=\"http://twitter.com/download/iphone\" r...</td><td>芸能人の人たくさん手たたき笑いしてるからおばさんたくさんになっちゃうよwww\\n\\n#Rの法則</td><td>@YK__0704</td></tr>\n",
"    <tr><th>16</th><td>Mon Jan 11 10:02:12 +0000 2016</td><td>&lt;a href=\"http://dlvr.it\" rel=\"nofollow\"&gt;dlvr.i...</td><td>Découvrez le jeu Pure Mission entre amis à 22....</td><td>@keepmymindfree</td></tr>\n",
"    <tr><th>17</th><td>Mon Jan 11 10:02:12 +0000 2016</td><td>&lt;a href=\"http://dlvr.it\" rel=\"nofollow\"&gt;dlvr.i...</td><td>#bonplan Parties de bowling pour 4 à #POINCY :...</td><td>@Bons_Plans_</td></tr>\n",
"    <tr><th>18</th><td>Mon Jan 11 10:02:12 +0000 2016</td><td>&lt;a href=\"http://dlvr.it\" rel=\"nofollow\"&gt;dlvr.i...</td><td>20 min de vol découverte ULM pour 1 ou 2 à 79....</td><td>@CrationSiteWeb</td></tr>\n",
"    <tr><th>19</th><td>Mon Jan 11 10:02:12 +0000 2016</td><td>&lt;a href=\"http://twitter.com\" rel=\"nofollow\"&gt;Tw...</td><td>RT @hortonworks: Paris is the city of love but...</td><td>@bigdataparis</td></tr>\n",
"    <tr><th>20</th><td>Mon Jan 11 10:02:12 +0000 2016</td><td>&lt;a href=\"http://twitter.com\" rel=\"nofollow\"&gt;Tw...</td><td>RT @hynek: So #emacs / @spacemacs nerds: is th...</td><td>@fdiesch</td></tr>\n",
"    <tr><th>21</th><td>Mon Jan 11 10:02:12 +0000 2016</td><td>&lt;a href=\"http://www.hootsuite.com\" rel=\"nofoll...</td><td>.@QonexCyber founder member of @IoT_SF is orga...</td><td>@QonexCyber</td></tr>\n",
"    <tr><th>22</th><td>Mon Jan 11 10:02:13 +0000 2016</td><td>&lt;a href=\"http://dlvr.it\" rel=\"nofollow\"&gt;dlvr.i...</td><td>Parties de bowling pour 4 à #POINCY : 35.00€ a...</td><td>@keepmymindfree</td></tr>\n",
"    <tr><th>23</th><td>Mon Jan 11 10:02:13 +0000 2016</td><td>&lt;a href=\"http://dlvr.it\" rel=\"nofollow\"&gt;dlvr.i...</td><td>#bonplan 30 séances de Squash à #LISSES : 39.9...</td><td>@Bons_Plans_</td></tr>\n",
"    <tr><th>24</th><td>Mon Jan 11 10:02:13 +0000 2016</td><td>&lt;a href=\"http://twitter.com\" rel=\"nofollow\"&gt;Tw...</td><td>App: ExZeus 2 – free to play https://t.co/ZT...</td><td>@UniversalConsol</td></tr>\n",
"    <tr><th>25</th><td>Mon Jan 11 10:02:13 +0000 2016</td><td>&lt;a href=\"http://dlvr.it\" rel=\"nofollow\"&gt;dlvr.i...</td><td>#discount Parties de bowling pour 4 à #POINCY ...</td><td>@PromosPromos</td></tr>\n",
"    <tr><th>26</th><td>Mon Jan 11 10:02:13 +0000 2016</td><td>&lt;a href=\"http://www.google.com/\" rel=\"nofollow...</td><td>spiegel.de : Tier macht Sachen: Python beißt ...</td><td>@arminfischer_de</td></tr>\n",
"    <tr><th>27</th><td>Mon Jan 11 10:09:02 +0000 2016</td><td>&lt;a href=\"http://twitter.com/download/android\" ...</td><td>若いっていいねえ…ってよく言うw #Rの法則</td><td>@naco75x</td></tr>\n",
"    <tr><th>28</th><td>Mon Jan 11 10:09:17 +0000 2016</td><td>&lt;a href=\"http://twitter.com/download/iphone\" r...</td><td>最近の若い子は最近使った\\n#Rの法則</td><td>@K1224West</td></tr>\n",
"    <tr><th>29</th><td>Mon Jan 11 10:09:19 +0000 2016</td><td>&lt;a href=\"http://twitter.com/download/android\" ...</td><td>#Rの法則\\n自分も若者なのに笑</td><td>@V6ZRRT7Q22BZ1cF</td></tr>\n",
"    <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"    <tr><th>2221</th><td>Mon Jan 11 10:29:40 +0000 2016</td><td>&lt;a href=\"http://201512291327-7430af.bitnamiapp...</td><td>https://t.co/BTAAq6HuuJ - pcgamer - #machinele...</td><td>@vinceyue</td></tr>\n",
"    <tr><th>2222</th><td>Mon Jan 11 10:29:40 +0000 2016</td><td>&lt;a href=\"http://twitter.com\" rel=\"nofollow\"&gt;Tw...</td><td>RT @jose_garde: The Big Data Analytics Softwar...</td><td>@LJ_Blanchard</td></tr>\n",
"    <tr><th>2223</th><td>Mon Jan 11 10:29:41 +0000 2016</td><td>&lt;a href=\"http://www.linkedin.com/\" rel=\"nofoll...</td><td>What is data mining? Do you have to be a mathe...</td><td>@ednuwan</td></tr>\n",
"    <tr><th>2224</th><td>Mon Jan 11 10:29:41 +0000 2016</td><td>&lt;a href=\"http://twitter.com\" rel=\"nofollow\"&gt;Tw...</td><td>RT @AgroKnow: A #BigData platform for the futu...</td><td>@albertspijkers</td></tr>\n",
"    <tr><th>2225</th><td>Mon Jan 11 10:29:42 +0000 2016</td><td>&lt;a href=\"http://www.linkedin.com/\" rel=\"nofoll...</td><td>Big Data: Is It A Tsunami, The New Oil, Or Sim...</td><td>@Summerlovegrove</td></tr>\n",
"    <tr><th>2226</th><td>Mon Jan 11 10:29:43 +0000 2016</td><td>&lt;a href=\"https://www.jobfindly.com/php-jobs.ht...</td><td>Sr Software Engineer C Php Python Linux Jobs i...</td><td>@jobfindlyphpdev</td></tr>\n",
"    <tr><th>2227</th><td>Mon Jan 11 10:29:43 +0000 2016</td><td>&lt;a href=\"http://twitter.com/download/iphone\" r...</td><td>RT @bigdataparis: #Bigdata bang : un marché en...</td><td>@LifeIsWeb</td></tr>\n",
"    <tr><th>2228</th><td>Mon Jan 11 10:29:43 +0000 2016</td><td>&lt;a href=\"http://twitter.com\" rel=\"nofollow\"&gt;Tw...</td><td>Learn from the best professors in India .. cou...</td><td>@ashwaniapex</td></tr>\n",
"    <tr><th>2229</th><td>Mon Jan 11 10:29:45 +0000 2016</td><td>&lt;a href=\"http://getsmoup.com\" rel=\"nofollow\"&gt;S...</td><td>RT @rebrandtoday: #startup or #rebrand -Buy Cr...</td><td>@SmartData_Fr</td></tr>\n",
"    <tr><th>2230</th><td>Mon Jan 11 10:29:45 +0000 2016</td><td>&lt;a href=\"http://www.ajaymatharu.com/\" rel=\"nof...</td><td>¿Cómo será el futuro del Big Data? https://t.c...</td><td>@eduardogarsanch</td></tr>\n",
"    <tr><th>2231</th><td>Mon Jan 11 10:29:46 +0000 2016</td><td>&lt;a href=\"http://twitter.com\" rel=\"nofollow\"&gt;Tw...</td><td>RT @jose_garde: 3 Ways to Transform Your Compa...</td><td>@LJ_Blanchard</td></tr>\n",
"    <tr><th>2232</th><td>Mon Jan 11 10:29:46 +0000 2016</td><td>&lt;a href=\"http://publicize.wp.com/\" rel=\"nofoll...</td><td>Woman Tries To Kiss Python, Gets Bitten In The...</td><td>@NAIJA_VIBEZ</td></tr>\n",
"    <tr><th>2233</th><td>Mon Jan 11 10:29:46 +0000 2016</td><td>&lt;a href=\"http://www.itknowingness.com\" rel=\"no...</td><td>RT @jose_garde: 3 Ways to Transform Your Compa...</td><td>@itknowingness</td></tr>\n",
"    <tr><th>2234</th><td>Mon Jan 11 10:29:48 +0000 2016</td><td>&lt;a href=\"http://twitter.com/download/iphone\" r...</td><td>RT @ErikaPauwels: Building a #BigData platform...</td><td>@impulsater</td></tr>\n",
"    <tr><th>2235</th><td>Mon Jan 11 10:29:49 +0000 2016</td><td>&lt;a href=\"http://twitter.com/download/android\" ...</td><td>En 2016 j'aimerais moins râler. #résolution. S...</td><td>@ce1ce2makarenko</td></tr>\n",
"    <tr><th>2236</th><td>Mon Jan 11 10:29:51 +0000 2016</td><td>&lt;a href=\"http://twitter.com\" rel=\"nofollow\"&gt;Tw...</td><td>RT @jose_garde: How Marketing Can Be Better Au...</td><td>@LJ_Blanchard</td></tr>\n",
"    <tr><th>2237</th><td>Mon Jan 11 10:29:51 +0000 2016</td><td>&lt;a href=\"http://twitter.com\" rel=\"nofollow\"&gt;Tw...</td><td>Hey @Pontifex accueille ces réfugiées dans ta ...</td><td>@Atmosfive</td></tr>\n",
"    <tr><th>2238</th><td>Mon Jan 11 10:29:51 +0000 2016</td><td>&lt;a href=\"http://twitter.com\" rel=\"nofollow\"&gt;Tw...</td><td>RT @Ubixr: .@GroupeLaPoste choisit le toulousa...</td><td>@The_Nextwork</td></tr>\n",
"    <tr><th>2239</th><td>Mon Jan 11 10:29:56 +0000 2016</td><td>&lt;a href=\"http://twitterfeed.com\" rel=\"nofollow...</td><td>Thanks @hackplayers Blade: un webshell en Pyth...</td><td>@Navarmedia</td></tr>\n",
"    <tr><th>2240</th><td>Mon Jan 11 10:29:56 +0000 2016</td><td>&lt;a href=\"http://twitter.com\" rel=\"nofollow\"&gt;Tw...</td><td>RT @jose_garde: Which big data personality are...</td><td>@LJ_Blanchard</td></tr>\n",
"    <tr><th>2241</th><td>Mon Jan 11 10:29:56 +0000 2016</td><td>&lt;a href=\"http://ifttt.com\" rel=\"nofollow\"&gt;IFTT...</td><td>Cybersecurity Forum tackles challenges with th...</td><td>@wulfsec</td></tr>\n",
"    <tr><th>2242</th><td>Mon Jan 11 10:29:56 +0000 2016</td><td>&lt;a href=\"http://publicize.wp.com/\" rel=\"nofoll...</td><td>Woman Tries To Kiss Python, Gets Bitten In The...</td><td>@Lola2Records</td></tr>\n",
"    <tr><th>2243</th><td>Mon Jan 11 10:29:56 +0000 2016</td><td>&lt;a href=\"http://twitter.com\" rel=\"nofollow\"&gt;Tw...</td><td>RT @TeamAnodot: Join David Drai, CEO of Anodot...</td><td>@iottechexpo</td></tr>\n",
"    <tr><th>2244</th><td>Mon Jan 11 10:29:57 +0000 2016</td><td>&lt;a href=\"http://getsmoup.com\" rel=\"nofollow\"&gt;A...</td><td>RT @rebrandtoday: #startup or #rebrand -Buy Cr...</td><td>@AI__news</td></tr>\n",
"    <tr><th>2245</th><td>Mon Jan 11 10:29:57 +0000 2016</td><td>&lt;a href=\"http://www.twitter.com\" rel=\"nofollow...</td><td>RT @Matthis__VERNON: \"@Fred_Poquet Sans #confi...</td><td>@sibueta</td></tr>\n",
"    <tr><th>2246</th><td>Mon Jan 11 10:29:57 +0000 2016</td><td>&lt;a href=\"https://social.zoho.com\" rel=\"nofollo...</td><td>The right place for #BigData is #Cloud #Storag...</td><td>@TyroneSystems</td></tr>\n",
"    <tr><th>2247</th><td>Mon Jan 11 10:29:59 +0000 2016</td><td>&lt;a href=\"http://twitter.com\" rel=\"nofollow\"&gt;Tw...</td><td>RT @PEBlanrue: Il a toujours le mot pour rire,...</td><td>@lesroisduring</td></tr>\n",
"    <tr><th>2248</th><td>Mon Jan 11 10:29:58 +0000 2016</td><td>&lt;a href=\"http://twitter.com\" rel=\"nofollow\"&gt;Tw...</td><td>RT @DigitalAgendaEU: €15 million for a #IoT so...</td><td>@ImproveNPA</td></tr>\n",
"    <tr><th>2249</th><td>Mon Jan 11 10:29:59 +0000 2016</td><td>&lt;a href=\"http://twitter.com\" rel=\"nofollow\"&gt;Tw...</td><td>Neat IoT innovation https://t.co/atARX0m5Bj</td><td>@sherwinnovator</td></tr>\n",
"    <tr><th>2250</th><td>Mon Jan 11 10:30:00 +0000 2016</td><td>&lt;a href=\"http://www.hubspot.com/\" rel=\"nofollo...</td><td>Check out our #Mobile App Predicitions for 201...</td><td>@B60uk</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"<p>2251 rows × 4 columns</p>\n",
"</div>
" ], "text/plain": [ " created_at \\\n", "0 Mon Jan 11 10:00:14 +0000 2016 \n", "1 Mon Jan 11 10:09:26 +0000 2016 \n", "2 Mon Jan 11 10:00:10 +0000 2016 \n", "3 Mon Jan 11 10:00:10 +0000 2016 \n", "4 Mon Jan 11 10:00:10 +0000 2016 \n", "5 Mon Jan 11 10:00:11 +0000 2016 \n", "6 Mon Jan 11 10:00:11 +0000 2016 \n", "7 Mon Jan 11 10:00:11 +0000 2016 \n", "8 Mon Jan 11 10:00:11 +0000 2016 \n", "9 Mon Jan 11 10:00:12 +0000 2016 \n", "10 Mon Jan 11 10:00:13 +0000 2016 \n", "11 Mon Jan 11 10:00:13 +0000 2016 \n", "12 Mon Jan 11 10:02:10 +0000 2016 \n", "13 Mon Jan 11 10:02:10 +0000 2016 \n", "14 Mon Jan 11 10:02:10 +0000 2016 \n", "15 Mon Jan 11 10:02:11 +0000 2016 \n", "16 Mon Jan 11 10:02:12 +0000 2016 \n", "17 Mon Jan 11 10:02:12 +0000 2016 \n", "18 Mon Jan 11 10:02:12 +0000 2016 \n", "19 Mon Jan 11 10:02:12 +0000 2016 \n", "20 Mon Jan 11 10:02:12 +0000 2016 \n", "21 Mon Jan 11 10:02:12 +0000 2016 \n", "22 Mon Jan 11 10:02:13 +0000 2016 \n", "23 Mon Jan 11 10:02:13 +0000 2016 \n", "24 Mon Jan 11 10:02:13 +0000 2016 \n", "25 Mon Jan 11 10:02:13 +0000 2016 \n", "26 Mon Jan 11 10:02:13 +0000 2016 \n", "27 Mon Jan 11 10:09:02 +0000 2016 \n", "28 Mon Jan 11 10:09:17 +0000 2016 \n", "29 Mon Jan 11 10:09:19 +0000 2016 \n", "... ... 
\n", "2221 Mon Jan 11 10:29:40 +0000 2016 \n", "2222 Mon Jan 11 10:29:40 +0000 2016 \n", "2223 Mon Jan 11 10:29:41 +0000 2016 \n", "2224 Mon Jan 11 10:29:41 +0000 2016 \n", "2225 Mon Jan 11 10:29:42 +0000 2016 \n", "2226 Mon Jan 11 10:29:43 +0000 2016 \n", "2227 Mon Jan 11 10:29:43 +0000 2016 \n", "2228 Mon Jan 11 10:29:43 +0000 2016 \n", "2229 Mon Jan 11 10:29:45 +0000 2016 \n", "2230 Mon Jan 11 10:29:45 +0000 2016 \n", "2231 Mon Jan 11 10:29:46 +0000 2016 \n", "2232 Mon Jan 11 10:29:46 +0000 2016 \n", "2233 Mon Jan 11 10:29:46 +0000 2016 \n", "2234 Mon Jan 11 10:29:48 +0000 2016 \n", "2235 Mon Jan 11 10:29:49 +0000 2016 \n", "2236 Mon Jan 11 10:29:51 +0000 2016 \n", "2237 Mon Jan 11 10:29:51 +0000 2016 \n", "2238 Mon Jan 11 10:29:51 +0000 2016 \n", "2239 Mon Jan 11 10:29:56 +0000 2016 \n", "2240 Mon Jan 11 10:29:56 +0000 2016 \n", "2241 Mon Jan 11 10:29:56 +0000 2016 \n", "2242 Mon Jan 11 10:29:56 +0000 2016 \n", "2243 Mon Jan 11 10:29:56 +0000 2016 \n", "2244 Mon Jan 11 10:29:57 +0000 2016 \n", "2245 Mon Jan 11 10:29:57 +0000 2016 \n", "2246 Mon Jan 11 10:29:57 +0000 2016 \n", "2247 Mon Jan 11 10:29:59 +0000 2016 \n", "2248 Mon Jan 11 10:29:58 +0000 2016 \n", "2249 Mon Jan 11 10:29:59 +0000 2016 \n", "2250 Mon Jan 11 10:30:00 +0000 2016 \n", "\n", " source \\\n", "0 Tw... \n", "4 ... \n", "5 dlvr.i... \n", "6 dlvr.i... \n", "13 dlvr.i... \n", "14 Tw... \n", "15 dlvr.i... \n", "17 dlvr.i... \n", "18 dlvr.i... \n", "19 Tw... \n", "20 Tw... \n", "21 dlvr.i... \n", "23 dlvr.i... \n", "24 Tw... \n", "25 dlvr.i... \n", "26 Tw... \n", "2223 Tw... \n", "2225 Tw... \n", "2229 S... \n", "2230 Tw... \n", "2232 Tw... \n", "2237 Tw... \n", "2238 Tw... \n", "2239 Tw... \n", "2241 IFTT... \n", "2242 Tw... \n", "2244 A... \n", "2245 Tw... \n", "2248 Tw... \n", "2249 Tw... 
\n", "2250 \n", "\n", 
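"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>word</th><th>count</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>https</td><td>1986</td></tr>\n",
"    <tr><th>1</th><td>co</td><td>1907</td></tr>\n",
"    <tr><th>2</th><td>rt</td><td>804</td></tr>\n",
"    <tr><th>3</th><td>de</td><td>550</td></tr>\n",
"    <tr><th>4</th><td>rの法則</td><td>408</td></tr>\n",
"    <tr><th>5</th><td>iot</td><td>374</td></tr>\n",
"    <tr><th>6</th><td>the</td><td>358</td></tr>\n",
"    <tr><th>7</th><td>bigdata</td><td>293</td></tr>\n",
"    <tr><th>8</th><td>00</td><td>275</td></tr>\n",
"    <tr><th>9</th><td>data</td><td>250</td></tr>\n",
"    <tr><th>10</th><td>in</td><td>234</td></tr>\n",
"    <tr><th>11</th><td>python</td><td>219</td></tr>\n",
"    <tr><th>12</th><td>to</td><td>212</td></tr>\n",
"    <tr><th>13</th><td>au</td><td>199</td></tr>\n",
"    <tr><th>14</th><td>of</td><td>188</td></tr>\n",
"    <tr><th>15</th><td>lieu</td><td>168</td></tr>\n",
"    <tr><th>16</th><td>réduction</td><td>166</td></tr>\n",
"    <tr><th>17</th><td>big</td><td>157</td></tr>\n",
"    <tr><th>18</th><td>on</td><td>143</td></tr>\n",
"    <tr><th>19</th><td>is</td><td>142</td></tr>\n",
"    <tr><th>20</th><td>and</td><td>140</td></tr>\n",
"    <tr><th>21</th><td>for</td><td>136</td></tr>\n",
"    <tr><th>22</th><td>analytics</td><td>107</td></tr>\n",
"    <tr><th>23</th><td>le</td><td>89</td></tr>\n",
"    <tr><th>24</th><td>via</td><td>86</td></tr>\n",
"    <tr><th>25</th><td>you</td><td>86</td></tr>\n",
"    <tr><th>26</th><td>thingsexpo</td><td>86</td></tr>\n",
"    <tr><th>27</th><td>by</td><td>85</td></tr>\n",
"    <tr><th>28</th><td>2016</td><td>84</td></tr>\n",
"    <tr><th>29</th><td>snake</td><td>80</td></tr>\n",
"    <tr><th>30</th><td>en</td><td>76</td></tr>\n",
"    <tr><th>31</th><td>bowie</td><td>75</td></tr>\n",
"    <tr><th>32</th><td>la</td><td>74</td></tr>\n",
"    <tr><th>33</th><td>thief</td><td>74</td></tr>\n",
"    <tr><th>34</th><td>video</td><td>73</td></tr>\n",
"    <tr><th>35</th><td>m2m</td><td>70</td></tr>\n",
"    <tr><th>36</th><td>jose_garde</td><td>68</td></tr>\n",
"    <tr><th>37</th><td>19</td><td>67</td></tr>\n",
"    <tr><th>38</th><td>david</td><td>66</td></tr>\n",
"    <tr><th>39</th><td>with</td><td>63</td></tr>\n",
"    <tr><th>40</th><td>how</td><td>61</td></tr>\n",
"    <tr><th>41</th><td>it</td><td>60</td></tr>\n",
"    <tr><th>42</th><td>will</td><td>55</td></tr>\n",
"    <tr><th>43</th><td>un</td><td>54</td></tr>\n",
"    <tr><th>44</th><td>amp</td><td>53</td></tr>\n",
"    <tr><th>45</th><td>des</td><td>53</td></tr>\n",
"    <tr><th>46</th><td>réparation</td><td>53</td></tr>\n",
"    <tr><th>47</th><td>new</td><td>52</td></tr>\n",
"    <tr><th>48</th><td>39</td><td>52</td></tr>\n",
"    <tr><th>49</th><td>at</td><td>51</td></tr>\n",
"  </tbody>\n",
"</table>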
\n", "" ], "text/plain": [ " word count\n", "0 https 1986\n", "1 co 1907\n", "2 rt 804\n", "3 de 550\n", "4 rの法則 408\n", "5 iot 374\n", "6 the 358\n", "7 bigdata 293\n", "8 00 275\n", "9 data 250\n", "10 in 234\n", "11 python 219\n", "12 to 212\n", "13 au 199\n", "14 of 188\n", "15 lieu 168\n", "16 réduction 166\n", "17 big 157\n", "18 on 143\n", "19 is 142\n", "20 and 140\n", "21 for 136\n", "22 analytics 107\n", "23 le 89\n", "24 via 86\n", "25 you 86\n", "26 thingsexpo 86\n", "27 by 85\n", "28 2016 84\n", "29 snake 80\n", "30 en 76\n", "31 bowie 75\n", "32 la 74\n", "33 thief 74\n", "34 video 73\n", "35 m2m 70\n", "36 jose_garde 68\n", "37 19 67\n", "38 david 66\n", "39 with 63\n", "40 how 61\n", "41 it 60\n", "42 will 55\n", "43 un 54\n", "44 amp 53\n", "45 des 53\n", "46 réparation 53\n", "47 new 52\n", "48 39 52\n", "49 at 51" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cv = CountVectorizer()\n", "count_matrix = cv.fit_transform(dataset.text)\n", "\n", "word_count = pd.DataFrame(cv.get_feature_names(), columns=[\"word\"])\n", "word_count[\"count\"] = count_matrix.sum(axis=0).tolist()[0]\n", "word_count = word_count.sort_values(\"count\", ascending=False).reset_index(drop=True)\n", "word_count[:50]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Visualization" ] }, { "cell_type": "code", "execution_count": 37, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def get_source_name(x):\n", " value = re.findall(pattern=\"<[^>]+>([^<]+)</a>
\", string=x)\n", " if len(value) > 0:\n", "     return value[0]\n", " else:\n", "     return \"\"" ] }, { "cell_type": "code", "execution_count": 38, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "Facebook 25\n", "TweetDeck 37\n", "RoundTeam 41\n", "Hootsuite 46\n", "twitterfeed 81\n", "IFTTT 134\n", "dlvr.it 200\n", "Twitter Web Client 388\n", "Twitter for Android 392\n", "Twitter for iPhone 515\n", "Name: source, dtype: int64" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "[Figure: horizontal bar chart of the ten most frequent tweet sources]" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Use bracket assignment so that source_name becomes a real DataFrame column\n", "dataset[\"source_name\"] = dataset.source.apply(get_source_name)\n", "\n", "# Keep the ten most frequent sources, sorted ascending so the largest bar is drawn at the top\n", "source_counts = dataset[\"source_name\"].value_counts().sort_values()[-10:]\n", "\n", "bottom = np.arange(len(source_counts))\n", "plt.barh(bottom, width=source_counts.values, align=\"center\", color=\"orange\", linewidth=0)\n", "\n", "# Label each bar with the source name and its share of all collected tweets\n", "y_labels = [\"%s %.1f%%\" % (name, 100.0 * count / len(dataset)) for name, count in zip(source_counts.index, source_counts.values)]\n", "plt.yticks(bottom, y_labels)\n", "\n", "source_counts" ] } ], "metadata": 
{ "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.4.3" } }, "nbformat": 4, "nbformat_minor": 0 }