{ "metadata": { "name": "", "signature": "sha256:d28e5939cb6bde04a5eca1e8d3248e11c8a7f7c9cfde61e02b925c6a411ed574" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Freedom, even on the Internet\n", "\n", "- Since March 26, [GitHub.com](https://github.com) has been under attack, the target of a DDoS.\n", "\n", "- The Baidu search engine has been used as the attack vector.\n", "\n", "- The malicious scripts appear to be injected by servers at the border of the Chinese network infrastructure.\n", "\n", "- The attack targets [GreatFire](https://en.greatfire.org/) and other GitHub-hosted sites that oppose the Chinese government's surveillance of the net.\n", "\n", "Sources:\n", "\n", "- https://github.com/blog/1981-large-scale-ddos-attack-on-github-com\n", "- http://www.theverge.com/2015/3/27/8299555/github-china-ddos-censorship-great-firewall\n", "\n", "\n", "## This doesn't concern me, I'm not Chinese\n", "\n", "You are mistaken:\n", "\n", "- http://en.wikipedia.org/wiki/Internet_censorship_in_France\n", "- http://www.laquadrature.net/\n" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "# Data mining and data processing\n", "\n", "The decade of data?\n", "\n", "- Massive data production: web services, sensors, wearables, ...\n", "- Wide data availability: internet, open health, open government, ...\n", "\n", "Major challenges: processing large volumes of data, security, privacy.\n", "\n", "## Query formats and data distribution\n", "\n", "**API: Application Programming Interface**\n", "\n", "- An old term, traditionally used for the functions exposed by a software library.\n", "\n", "- **Web applications (REST API):** a description of the URLs and their **parameters**.\n", "\n", "Sending parameters:\n", "\n", "- **GET:** in the URL. E.g.: `http://www.google.fr/?`**`q=parametres+GET`**\n", "\n", "- **POST:** in the _request body_ (for large data).\n" ] },
{ "cell_type": "code", "collapsed": false, "input": [ "# Python 2 code: urllib2 became urllib.request in Python 3\n", "from IPython.display import HTML\n", "import urllib2\n", "\n", "goog = urllib2.urlopen(\"https://www.google.com/?q=parametres+GET\")\n", "HTML(goog.read().decode('iso-8859-1'))" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "Google
Recherche Images Maps Play YouTube Actualit\u00e9s Gmail Drive Plus »
Historique Web | Param\u00e8tres | Connexion
×
Surfez encore plus vite

France

 

Recherche avanc\u00e9eOutils linguistiques

© 2015 - Confidentialit\u00e9 - Conditions

" ], "metadata": {}, "output_type": "pyout", "prompt_number": 1, "text": [ "" ] } ], "prompt_number": 1 },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Format of *GET* parameters\n", "\n", "A web application is free to accept any URL, but [RFC 3986](http://tools.ietf.org/html/rfc3986) standardizes URL syntax, including the query part that follows `?`, and the convention below is almost universally followed:\n", "\n", "- Append to the end of the URL:\n", "    - a question mark `?`,\n", "    - `key=value` pairs,\n", "    - separated by ampersands `&`.\n", "\n", "**Example:**\n", "\n", "`https://www.google.fr/?`**`q=recherche`**`&`**`hl=fr`**\n", "\n", "- `q=recherche`: what to search for\n", "- `hl=fr`: interface language" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Data formats\n", "\n", "- *URL-encoded* format: what we have just seen,\n", "- *CSV/TSV (Comma/Tab Separated Values)* formats: values separated by commas or tabs,\n", "- *JSON (JavaScript Object Notation)* format,\n", "- *XML (eXtensible Markup Language)* format,\n", "- ...\n", "\n", "\n", "## CSV example\n", "\n", "https://github.com/defeo/in202/raw/gh-pages/assets/bike-dataset.csv" ] },
{ "cell_type": "code", "collapsed": false, "input": [ "import pandas as pd\n", "\n", "bikes = urllib2.urlopen('https://github.com/defeo/in202/raw/gh-pages/assets/bike-dataset.csv')\n", "b = pd.read_csv(bikes)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 2 },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## JSON\n", "\n", "A format derived from JavaScript:\n", "\n", "- Numbers: `1`, `2.0`,\n", "- Strings: `\"encoded in UTF-8\"`,\n", "- Lists: `[\"like\", \"in\", \"python\"]`,\n", "- Objects: `{ \"key\" : \"value\", \"other key\": \"value\" }`\n", "\n", "Caution: object keys must be strings.\n", "\n", "The **`json`** library converts JSON into Python data.\n", "\n", "**Example**\n", "\n", "http://eu.battle.net/api/sc2/ladder/grandmaster?locale=fr_FR\n", "\n", "**Note:** When the JSON data is _flat_, it can be read directly with pandas" ] },
{ "cell_type": "code", "collapsed": false, "input": [ "import json\n", "\n", "data = json.load(urllib2.urlopen('http://eu.battle.net/api/sc2/ladder/grandmaster?locale=fr_FR'))" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 4 },
{ "cell_type": "code", "collapsed": false, "input": [ "type(data)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 5, "text": [ "dict" ] } ], "prompt_number": 5 },
{ "cell_type": "code", "collapsed": false, "input": [ "data.keys()" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 6, "text": [ "[u'ladderMembers']" ] } ], "prompt_number": 6 },
{ "cell_type": "code", "collapsed": false, "input": [ "type(data['ladderMembers'])" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 7, "text": [ "list" ] } ], "prompt_number": 7 },
{ "cell_type": "code", "collapsed": false, "input": [ "data['ladderMembers'][0]" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 8, "text": [ "{u'character': {u'clanName': u'',\n", " u'clanTag': u'',\n", " u'displayName': u'IIIIIIIII',\n", " u'id': 3257655,\n", " u'profilePath': u'/profile/3257655/1/IIIIIIIII/',\n", " u'realm': 1},\n", " u'favoriteRaceP1': u'PROTOSS',\n", " u'highestRank': 1,\n", " u'joinTimestamp': 1421665350,\n", " u'losses': 126,\n", " u'points': 2889.0,\n", " u'previousRank': 4,\n", " u'wins': 218}" ] } ], "prompt_number": 8 },
{ "cell_type": "code", "collapsed": false, "input": [ "sc2 = 
pd.DataFrame(data['ladderMembers'])" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 9 }, { "cell_type": "code", "collapsed": false, "input": [ "sc2" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
<p>[HTML table omitted: same rows as the text output] 200 rows \u00d7 8 columns</p>
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 10, "text": [ " character favoriteRaceP1 \\\n", "0 {u'displayName': u'IIIIIIIII', u'clanName': u'... PROTOSS \n", "1 {u'displayName': u'IlIlIlIlIlIl', u'clanName':... PROTOSS \n", "2 {u'displayName': u'HatsuneMiku', u'clanName': ... TERRAN \n", "3 {u'displayName': u'PtitDrogo', u'clanName': u'... PROTOSS \n", "4 {u'displayName': u'PenetraTHOR', u'clanName': ... TERRAN \n", "5 {u'displayName': u'lllllIIIllIl', u'clanName':... PROTOSS \n", "6 {u'displayName': u'llllllllllll', u'clanName':... PROTOSS \n", "7 {u'displayName': u'LiquidSnute', u'clanName': ... ZERG \n", "8 {u'displayName': u'llllllllllll', u'clanName':... TERRAN \n", "9 {u'displayName': u'lllllIIIllll', u'clanName':... ZERG \n", "10 {u'displayName': u'IIIIIIIIIIII', u'clanName':... ZERG \n", "11 {u'displayName': u'KaarisOCLICK', u'clanName':... TERRAN \n", "12 {u'displayName': u'LiquidMaNa', u'clanName': u... PROTOSS \n", "13 {u'displayName': u'lIlIlIlIlIlI', u'clanName':... ZERG \n", "14 {u'displayName': u'fraer', u'clanName': u'ExTr... PROTOSS \n", "15 {u'displayName': u'IlIlIlIlIlIl', u'clanName':... PROTOSS \n", "16 {u'displayName': u'IIIIIIIIIIII', u'clanName':... TERRAN \n", "17 {u'displayName': u'elfi', u'clanName': u'PEKKA... PROTOSS \n", "18 {u'displayName': u'llllllllllll', u'clanName':... ZERG \n", "19 {u'displayName': u'IIIIIIIIIIII', u'clanName':... ZERG \n", "20 {u'displayName': u'IIIIIIIIIIII', u'clanName':... TERRAN \n", "21 {u'displayName': u'llllllllllll', u'clanName':... PROTOSS \n", "22 {u'displayName': u'IIIIIIIIIIII', u'clanName':... TERRAN \n", "23 {u'displayName': u'IlIIIIllIlll', u'clanName':... ZERG \n", "24 {u'displayName': u'Lambo', u'clanName': u'', u... ZERG \n", "25 {u'displayName': u'CARTIER', u'clanName': u'CR... ZERG \n", "26 {u'displayName': u'llllllllllll', u'clanName':... TERRAN \n", "27 {u'displayName': u'FXOStrelok', u'clanName': u... TERRAN \n", "28 {u'displayName': u'MaDMarC', u'clanName': u'AT... 
TERRAN \n", "29 {u'displayName': u'IIIIIIIIIIII', u'clanName':... PROTOSS \n", ".. ... ... \n", "170 {u'displayName': u'NorthBrute', u'clanName': u... TERRAN \n", "171 {u'displayName': u'ReaVer', u'clanName': u'Tea... PROTOSS \n", "172 {u'displayName': u'IIIIIIII', u'clanName': u''... TERRAN \n", "173 {u'displayName': u'BobaFett', u'clanName': u''... TERRAN \n", "174 {u'displayName': u'Itwasluck', u'clanName': u'... PROTOSS \n", "175 {u'displayName': u'Hephaistas', u'clanName': u... ZERG \n", "176 {u'displayName': u'Scandicain', u'clanName': u... PROTOSS \n", "177 {u'displayName': u'Poseidon', u'clanName': u'H... PROTOSS \n", "178 {u'displayName': u'NoCti', u'clanName': u'Fan ... TERRAN \n", "179 {u'displayName': u'Talia', u'clanName': u'', u... TERRAN \n", "180 {u'displayName': u'JeSuisCharli', u'clanName':... PROTOSS \n", "181 {u'displayName': u'VeniVidiVins', u'clanName':... ZERG \n", "182 {u'displayName': u'RiSky', u'clanName': u'Team... ZERG \n", "183 {u'displayName': u'IIIIIIIIIIII', u'clanName':... PROTOSS \n", "184 {u'displayName': u'MafiatA', u'clanName': u'Nu... ZERG \n", "185 {u'displayName': u'M\u00f6\u00f6p', u'clanName': u'Heral... PROTOSS \n", "186 {u'displayName': u'D\u00e9ca', u'clanName': u'Worke... PROTOSS \n", "187 {u'displayName': u'QwerelL', u'clanName': u'Ba... ZERG \n", "188 {u'displayName': u'IIIIIIIIIIII', u'clanName':... ZERG \n", "189 {u'displayName': u'IllllllllIll', u'clanName':... PROTOSS \n", "190 {u'displayName': u'Teacher', u'clanName': u'Up... PROTOSS \n", "191 {u'displayName': u'SK\u00e9viN', u'clanName': u'New... TERRAN \n", "192 {u'displayName': u'Justice', u'clanName': u'we... TERRAN \n", "193 {u'displayName': u'Zlayer', u'clanName': u'Old... TERRAN \n", "194 {u'displayName': u'\u0166\u1e5d\u016b\u1e3f\u1e55\u01c2A\u0186\u01a9', u'clanName': u'... PROTOSS \n", "195 {u'displayName': u'IIIIIIIIIIII', u'clanName':... PROTOSS \n", "196 {u'displayName': u'IIIIIIIIIIII', u'clanName':... 
TERRAN \n", "197 {u'displayName': u'Rayman', u'clanName': u'Wes... PROTOSS \n", "198 {u'displayName': u'nBroccoli', u'clanName': u'... TERRAN \n", "199 {u'displayName': u'imRDA', u'clanName': u'Pani... ZERG \n", "\n", " highestRank joinTimestamp losses points previousRank wins \n", "0 1 1421665350 126 2889 4 218 \n", "1 2 1422633286 188 2850 6 296 \n", "2 3 1423671811 153 2827 10 258 \n", "3 3 1425058466 233 2759 9 322 \n", "4 3 1425513212 108 2754 9 173 \n", "5 3 1423078934 88 2751 6 162 \n", "6 6 1422391181 170 2741 6 250 \n", "7 2 1426706701 108 2732 8 186 \n", "8 3 1421669242 231 2712 2 399 \n", "9 6 1425790701 60 2709 7 240 \n", "10 1 1425427183 339 2693 2 432 \n", "11 6 1427370092 83 2679 0 142 \n", "12 10 1422380341 124 2664 18 209 \n", "13 13 1421687897 306 2654 20 361 \n", "14 9 1425406077 230 2580 15 242 \n", "15 13 1421625605 268 2571 23 371 \n", "16 17 1421625625 64 2567 30 118 \n", "17 11 1421663442 437 2564 19 478 \n", "18 11 1422473792 195 2554 27 212 \n", "19 14 1426411330 135 2553 14 183 \n", "20 13 1425348745 168 2547 27 204 \n", "21 20 1421776196 104 2539 25 134 \n", "22 20 1425397354 156 2518 37 176 \n", "23 20 1426846393 180 2512 34 222 \n", "24 25 1421630424 265 2497 44 278 \n", "25 20 1422687731 188 2488 21 292 \n", "26 27 1421628225 231 2472 37 245 \n", "27 27 1423750644 139 2463 51 197 \n", "28 17 1424450160 236 2456 15 310 \n", "29 11 1421680402 339 2455 18 352 \n", ".. ... ... ... ... ... ... 
\n", "170 166 1425766140 305 1903 176 316 \n", "171 140 1427176238 420 1900 0 448 \n", "172 148 1423651429 238 1896 171 248 \n", "173 147 1427143902 99 1890 0 89 \n", "174 148 1425306855 161 1890 150 176 \n", "175 167 1425830784 108 1888 175 125 \n", "176 167 1425914514 109 1886 169 103 \n", "177 161 1421668780 206 1872 171 212 \n", "178 171 1423344375 260 1862 176 251 \n", "179 177 1426202660 87 1799 182 144 \n", "180 177 1423578734 392 1780 179 415 \n", "181 174 1424819521 414 1777 180 427 \n", "182 182 1425861392 138 1775 180 141 \n", "183 177 1426076915 318 1754 180 339 \n", "184 181 1421638711 106 1748 185 114 \n", "185 180 1424175561 179 1741 187 190 \n", "186 186 1421626457 97 1655 187 79 \n", "187 184 1424791290 92 1653 183 92 \n", "188 184 1421670781 125 1641 185 113 \n", "189 186 1425521351 84 1592 184 108 \n", "190 187 1422631018 91 1591 190 90 \n", "191 187 1421686287 92 1560 192 92 \n", "192 191 1421627433 98 1507 193 101 \n", "193 190 1421626220 97 1479 193 96 \n", "194 192 1425312651 75 1229 193 137 \n", "195 193 1423687954 83 1186 195 92 \n", "196 194 1424778035 79 1129 197 139 \n", "197 196 1421626381 162 1099 198 146 \n", "198 195 1423400164 100 1068 196 68 \n", "199 197 1421852264 116 675 198 33 \n", "\n", "[200 rows x 8 columns]" ] } ], "prompt_number": 10 }, { "cell_type": "code", "collapsed": false, "input": [ "sc2['percent'] = sc2.wins / (sc2.wins + sc2.losses)\n", "sc2.groupby('favoriteRaceP1').mean()" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
<table>\n",
"<thead>\n",
"<tr><th>favoriteRaceP1</th><th>highestRank</th><th>joinTimestamp</th><th>losses</th><th>points</th><th>previousRank</th><th>wins</th><th>percent</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>PROTOSS</th><td>84.053333</td><td>1.424071e+09</td><td>188.200000</td><td>2129.186667</td><td>85.933333</td><td>213.880000</td><td>0.535269</td></tr>\n",
"<tr><th>RANDOM</th><td>117.000000</td><td>1.423326e+09</td><td>224.000000</td><td>2042.000000</td><td>154.000000</td><td>229.000000</td><td>0.505519</td></tr>\n",
"<tr><th>TERRAN</th><td>88.272727</td><td>1.424079e+09</td><td>186.345455</td><td>2124.672727</td><td>87.054545</td><td>212.763636</td><td>0.536759</td></tr>\n",
"<tr><th>ZERG</th><td>75.514706</td><td>1.424795e+09</td><td>181.514706</td><td>2159.000000</td><td>75.426471</td><td>208.088235</td><td>0.532856</td></tr>\n",
"</tbody>\n",
"</table>" ], "metadata": {}, "output_type": "pyout", "prompt_number": 11, "text": [ " highestRank joinTimestamp losses points \\\n", "favoriteRaceP1 \n", "PROTOSS 84.053333 1.424071e+09 188.200000 2129.186667 \n", "RANDOM 117.000000 1.423326e+09 224.000000 2042.000000 \n", "TERRAN 88.272727 1.424079e+09 186.345455 2124.672727 \n", "ZERG 75.514706 1.424795e+09 181.514706 2159.000000 \n", "\n", " previousRank wins percent \n", "favoriteRaceP1 \n", "PROTOSS 85.933333 213.880000 0.535269 \n", "RANDOM 154.000000 229.000000 0.505519 \n", "TERRAN 87.054545 212.763636 0.536759 \n", "ZERG 75.426471 208.088235 0.532856 " ] } ], "prompt_number": 11 },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## XML\n", "\n", "> \"XML is crap. Really. There are no excuses. XML is nasty to parse for humans, and it's a disaster to parse even for computers.\n", "> There's just no reason for that horrible crap to exist.\"\n", ">\n", "> [Linus Torvalds](https://plus.google.com/+LinusTorvalds/posts/X2XVf9Q7MfV)\n", "\n", "Example:\n", "\n", "https://github.com/defeo/in202/commits/gh-pages.atom" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Handling dates\n", "\n", "Two libraries: `datetime` (standard library) and `dateutil`" ] },
{ "cell_type": "code", "collapsed": false, "input": [ "import datetime\n", "# Import the submodule explicitly: a bare `import dateutil` does not\n", "# guarantee that dateutil.parser is loaded.\n", "import dateutil.parser" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 12 },
{ "cell_type": "code", "collapsed": false, "input": [ "date = datetime.datetime(2015, 3, 2)\n", "date" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 13, "text": [ "datetime.datetime(2015, 3, 2, 0, 0)" ] } ], "prompt_number": 13 },
{ "cell_type": "code", "collapsed": false, "input": [ "date.ctime()" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 14, "text": [ "'Mon Mar  2 00:00:00 2015'" ] } ], "prompt_number": 14 },
{ "cell_type": "code", "collapsed": false, "input": [ 
"dateutil.parser.parse('2015-3-2')" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 15, "text": [ "datetime.datetime(2015, 3, 2, 0, 0)" ] } ], "prompt_number": 15 }, { "cell_type": "code", "collapsed": false, "input": [ "dateutil.parser.parse('2/3/2015')" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 16, "text": [ "datetime.datetime(2015, 2, 3, 0, 0)" ] } ], "prompt_number": 16 }, { "cell_type": "code", "collapsed": false, "input": [ "dateutil.parser.parse('20/3/2015')" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 17, "text": [ "datetime.datetime(2015, 3, 20, 0, 0)" ] } ], "prompt_number": 17 }, { "cell_type": "code", "collapsed": false, "input": [ "delta = datetime.datetime.now() - datetime.datetime(2015, 3, 4)\n", "delta" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 18, "text": [ "datetime.timedelta(26, 24180, 3905)" ] } ], "prompt_number": 18 }, { "cell_type": "code", "collapsed": false, "input": [ "delta.total_seconds()" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 19, "text": [ "2270580.003905" ] } ], "prompt_number": 19 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Revenons \u00e0 la m\u00e9t\u00e9o" ] }, { "cell_type": "code", "collapsed": false, "input": [ "b.head()" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
<table>\n",
"<thead>\n",
"<tr><th></th><th>instant</th><th>dteday</th><th>season</th><th>yr</th><th>mnth</th><th>holiday</th><th>weekday</th><th>workingday</th><th>weathersit</th><th>temp</th><th>atemp</th><th>hum</th><th>windspeed</th><th>casual</th><th>registered</th><th>cnt</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>1</td><td>2011-01-01</td><td>1</td><td>0</td><td>1</td><td>0</td><td>6</td><td>0</td><td>2</td><td>0.344167</td><td>0.363625</td><td>0.805833</td><td>0.160446</td><td>331</td><td>654</td><td>985</td></tr>\n",
"<tr><th>1</th><td>2</td><td>2011-01-02</td><td>1</td><td>0</td><td>1</td><td>0</td><td>0</td><td>0</td><td>2</td><td>0.363478</td><td>0.353739</td><td>0.696087</td><td>0.248539</td><td>131</td><td>670</td><td>801</td></tr>\n",
"<tr><th>2</th><td>3</td><td>2011-01-03</td><td>1</td><td>0</td><td>1</td><td>0</td><td>1</td><td>1</td><td>1</td><td>0.196364</td><td>0.189405</td><td>0.437273</td><td>0.248309</td><td>120</td><td>1229</td><td>1349</td></tr>\n",
"<tr><th>3</th><td>4</td><td>2011-01-04</td><td>1</td><td>0</td><td>1</td><td>0</td><td>2</td><td>1</td><td>1</td><td>0.200000</td><td>0.212122</td><td>0.590435</td><td>0.160296</td><td>108</td><td>1454</td><td>1562</td></tr>\n",
"<tr><th>4</th><td>5</td><td>2011-01-05</td><td>1</td><td>0</td><td>1</td><td>0</td><td>3</td><td>1</td><td>1</td><td>0.226957</td><td>0.229270</td><td>0.436957</td><td>0.186900</td><td>82</td><td>1518</td><td>1600</td></tr>\n",
"</tbody>\n",
"</table>
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 20, "text": [ " instant dteday season yr mnth holiday weekday workingday \\\n", "0 1 2011-01-01 1 0 1 0 6 0 \n", "1 2 2011-01-02 1 0 1 0 0 0 \n", "2 3 2011-01-03 1 0 1 0 1 1 \n", "3 4 2011-01-04 1 0 1 0 2 1 \n", "4 5 2011-01-05 1 0 1 0 3 1 \n", "\n", " weathersit temp atemp hum windspeed casual registered \\\n", "0 2 0.344167 0.363625 0.805833 0.160446 331 654 \n", "1 2 0.363478 0.353739 0.696087 0.248539 131 670 \n", "2 1 0.196364 0.189405 0.437273 0.248309 120 1229 \n", "3 1 0.200000 0.212122 0.590435 0.160296 108 1454 \n", "4 1 0.226957 0.229270 0.436957 0.186900 82 1518 \n", "\n", " cnt \n", "0 985 \n", "1 801 \n", "2 1349 \n", "3 1562 \n", "4 1600 " ] } ], "prompt_number": 20 }, { "cell_type": "code", "collapsed": false, "input": [ "type(b['dteday'][0])" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 21, "text": [ "str" ] } ], "prompt_number": 21 }, { "cell_type": "code", "collapsed": false, "input": [ "b.dtypes" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 22, "text": [ "instant int64\n", "dteday object\n", "season int64\n", "yr int64\n", "mnth int64\n", "holiday int64\n", "weekday int64\n", "workingday int64\n", "weathersit int64\n", "temp float64\n", "atemp float64\n", "hum float64\n", "windspeed float64\n", "casual int64\n", "registered int64\n", "cnt int64\n", "dtype: object" ] } ], "prompt_number": 22 }, { "cell_type": "code", "collapsed": false, "input": [ "b.dteday - b.dteday" ], "language": "python", "metadata": {}, "outputs": [ { "ename": "TypeError", "evalue": "unsupported operand type(s) for -: 'str' and 'str'", "output_type": "pyerr", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in 
\u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mb\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdteday\u001b[0m \u001b[0;34m-\u001b[0m \u001b[0mb\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdteday\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;32m/usr/lib/python2.7/dist-packages/pandas/core/ops.pyc\u001b[0m in \u001b[0;36mwrapper\u001b[0;34m(left, right, name)\u001b[0m\n\u001b[1;32m 503\u001b[0m \u001b[0mrvalues\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mcom\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtake_1d\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mrvalues\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mridx\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 504\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 505\u001b[0;31m \u001b[0marr\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mna_op\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlvalues\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mrvalues\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 506\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 507\u001b[0m return left._constructor(wrap_results(arr), index=index,\n", "\u001b[0;32m/usr/lib/python2.7/dist-packages/pandas/core/ops.pyc\u001b[0m in \u001b[0;36mna_op\u001b[0;34m(x, y)\u001b[0m\n\u001b[1;32m 456\u001b[0m \u001b[0mresult\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mempty\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msize\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mdtype\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mdtype\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 457\u001b[0m \u001b[0mmask\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mnotnull\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m&\u001b[0m \u001b[0mnotnull\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0my\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 458\u001b[0;31m 
\u001b[0mresult\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mmask\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mop\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mmask\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0my\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mmask\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 459\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 460\u001b[0m \u001b[0mresult\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mpa\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mempty\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlen\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mdtype\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdtype\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mTypeError\u001b[0m: unsupported operand type(s) for -: 'str' and 'str'" ] } ], "prompt_number": 23 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Automatic date parsing in pandas\n", "\n", "Caution: very slow on large datasets" ] }, { "cell_type": "code", "collapsed": false, "input": [ "# parse_dates converts the dteday column to datetime64 while reading the CSV\n", "bikes = urllib2.urlopen('https://github.com/defeo/in202/raw/gh-pages/assets/bike-dataset.csv')\n", "bb = pd.read_csv(bikes, parse_dates=[\"dteday\"])" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 24 }, { "cell_type": "code", "collapsed": false, "input": [ "bb.dtypes" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 25, "text": [ "instant int64\n", "dteday datetime64[ns]\n", "season int64\n", "yr int64\n", "mnth int64\n", "holiday int64\n", "weekday int64\n", "workingday int64\n", "weathersit int64\n", "temp float64\n", "atemp float64\n", "hum float64\n", "windspeed float64\n", "casual 
int64\n", "registered int64\n", "cnt int64\n", "dtype: object" ] } ], "prompt_number": 25 }, { "cell_type": "code", "collapsed": false, "input": [ "type(bb['dteday'][0])" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 26, "text": [ "pandas.tslib.Timestamp" ] } ], "prompt_number": 26 }, { "cell_type": "code", "collapsed": false, "input": [ "bb.dteday - bb.dteday" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 27, "text": [ "0 0 days\n", "1 0 days\n", "2 0 days\n", "3 0 days\n", "4 0 days\n", "5 0 days\n", "6 0 days\n", "7 0 days\n", "8 0 days\n", "9 0 days\n", "10 0 days\n", "11 0 days\n", "12 0 days\n", "13 0 days\n", "14 0 days\n", "...\n", "716 0 days\n", "717 0 days\n", "718 0 days\n", "719 0 days\n", "720 0 days\n", "721 0 days\n", "722 0 days\n", "723 0 days\n", "724 0 days\n", "725 0 days\n", "726 0 days\n", "727 0 days\n", "728 0 days\n", "729 0 days\n", "730 0 days\n", "Name: dteday, Length: 731, dtype: timedelta64[ns]" ] } ], "prompt_number": 27 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Geocoding\n", "\n", "A few libraries:\n", "\n", "- `geopy`\n", "- `geocoder`\n", "- ..." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "from geopy.geocoders import Nominatim\n", "coder = Nominatim()\n", "\n", "# Forward geocoding: street address -> Location\n", "l = coder.geocode(\"45 avenue des \u00c9tats Unis, Versailles\")\n", "l" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 28, "text": [ "Location((48.8084125, 2.1460823, 0.0))" ] } ], "prompt_number": 28 }, { "cell_type": "code", "collapsed": false, "input": [ "# Reverse geocoding: (latitude, longitude) -> address\n", "l2 = coder.reverse((46.0,4.0))\n", "l2.address" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 29, "text": [ "u'Route de Villemontais, Ouches, Roanne, Loire, Rh\\xf4ne-Alpes, France m\\xe9tropolitaine, 42155, France'" ] } ], "prompt_number": 29 }, { "cell_type": "code", "collapsed": false, "input": [ "l.latitude, l.longitude" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 30, "text": [ "(48.8084125, 2.1460823)" ] } ], "prompt_number": 30 }, { "cell_type": "code", "collapsed": true, "input": [ "from geopy.distance import distance\n", "\n", "# Distance between the two geocoded points, reported in kilometers\n", "distance(l.point, l2.point)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 31, "text": [ "Distance(342.001142917)" ] } ], "prompt_number": 31 }, { "cell_type": "code", "collapsed": false, "input": [ "# A plain (latitude, longitude) tuple also works as an endpoint\n", "distance(l.point, (49.0, 3.4))" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 32, "text": [ "Distance(94.3627744207)" ] } ], "prompt_number": 32 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## GeoJSON\n", "\n", "http://geojson.io/\n", "\n", "IPython integration:\n", "\n", "http://nbviewer.ipython.org/gist/jwass/c349bb0190e8dc3e251a" ] } ], "metadata": {} } ] }