{ "metadata": { "name": "", "signature": "sha256:0744d2ec1fb36ad903b5770fb0e64641d05d3c372f6cb58e0ad66257bb1bac89" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# GeoPandas and OpenStreetMap\n", "\n", "The GeoPandas `osm` module makes it easy to dynamically query and analyze arbitrary OpenStreetMap (OSM) data. It's a powerful combination.\n", "\n", "This example counts the traffic lights along an arbitrary running route using just a few lines of code. Note that traffic lights are not mapped thoroughly everywhere in OpenStreetMap, but coverage is pretty good around Cambridge, MA, where this example is located.\n", "\n", "I used RunKeeper to download a recorded run from my phone. You can find the raw file here:\n", "\n", "https://gist.githubusercontent.com/jwass/4ae78872e30a21b34a44/raw/e48b18797e2dd9fb653d38329df574fed1852057/RK_gpx%20_2014-07-10_2054.gpx\n", "\n", "I removed a bunch of points near the start and end so you can't figure out exactly where I live :)" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import numpy as np\n", "import geopandas as gpd\n", "import geopandas.io.osm as osm" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 1 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load the trace\n", "First load up the GPX file (saved locally here as `RK_gpx_2014-07-10_2054.gpx`). Specifying the `tracks` layer loads the data as a single row containing a `LineString` or `MultiLineString`. For other uses you can specify `layer='track_points'` to get one row for each recorded point." ] }, { "cell_type": "code", "collapsed": false, "input": [ "df = gpd.read_file('RK_gpx_2014-07-10_2054.gpx', layer='tracks')\n", "df" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
<div>\n",
"<table>\n",
"  <thead>\n",
"    <tr><th></th><th>cmt</th><th>desc</th><th>geometry</th><th>link1_href</th><th>link1_text</th><th>link1_type</th><th>link2_href</th><th>link2_text</th><th>link2_type</th><th>name</th><th>number</th><th>src</th><th>type</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>None</td><td>None</td><td>(LINESTRING (-71.106996 42.36279, -71.10707499...</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>None</td><td>Running 7/10/14 8:54 pm</td><td>None</td><td>None</td><td>None</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 2, "text": [ " cmt desc geometry link1_href \\\n", "0 None None (LINESTRING (-71.106996 42.36279, -71.10707499... None \n", "\n", " link1_text link1_type link2_href link2_text link2_type \\\n", "0 None None None None None \n", "\n", " name number src type \n", "0 Running 7/10/14 8:54 pm None None None " ] } ], "prompt_number": 2 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load the OpenStreetMap data\n", "Now load all the OpenStreetMap traffic lights in the route's bounding box. Traffic lights in OSM are represented as nodes where the `highway` tag is `traffic_signals`. The `query_osm` function takes care of formulating the query and returning a GeoDataFrame ready to go with all the points." ] }, { "cell_type": "code", "collapsed": false, "input": [ "df_lights = osm.query_osm('node', bbox=df.total_bounds, tags='highway=traffic_signals')\n", "df_lights" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
<div>\n",
"<table>\n",
"  <thead>\n",
"    <tr><th></th><th>highway</th><th>id</th><th>traffic_signals</th><th>geometry</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>traffic_signals</td><td>61283269</td><td>NaN</td><td>POINT (-71.09488469999999 42.3601506)</td></tr>\n",
"    <tr><th>1</th><td>traffic_signals</td><td>61283293</td><td>NaN</td><td>POINT (-71.0965425 42.3561028)</td></tr>\n",
"    <tr><th>2</th><td>traffic_signals</td><td>61317400</td><td>NaN</td><td>POINT (-71.1135493 42.3622591)</td></tr>\n",
"    <tr><th>3</th><td>traffic_signals</td><td>61318066</td><td>NaN</td><td>POINT (-71.1046257 42.3535672)</td></tr>\n",
"    <tr><th>4</th><td>traffic_signals</td><td>61321123</td><td>blinker</td><td>POINT (-71.11036199999999 42.357534)</td></tr>\n",
"    <tr><th>5</th><td>traffic_signals</td><td>61321134</td><td>blinker</td><td>POINT (-71.10724089999999 42.3581054)</td></tr>\n",
"    <tr><th>6</th><td>traffic_signals</td><td>61321277</td><td>NaN</td><td>POINT (-71.09943370000001 42.3634087)</td></tr>\n",
"    <tr><th>7</th><td>traffic_signals</td><td>61322052</td><td>NaN</td><td>POINT (-71.110533 42.363275)</td></tr>\n",
"    <tr><th>8</th><td>traffic_signals</td><td>61323022</td><td>NaN</td><td>POINT (-71.0960005 42.3608275)</td></tr>\n",
"    <tr><th>9</th><td>traffic_signals</td><td>61323032</td><td>NaN</td><td>POINT (-71.0975461 42.361748)</td></tr>\n",
"    <tr><th>10</th><td>traffic_signals</td><td>61327037</td><td>NaN</td><td>POINT (-71.113873 42.3644915)</td></tr>\n",
"    <tr><th>11</th><td>traffic_signals</td><td>61327067</td><td>NaN</td><td>POINT (-71.101387 42.364001)</td></tr>\n",
"    <tr><th>12</th><td>traffic_signals</td><td>61327121</td><td>NaN</td><td>POINT (-71.0967089 42.3632016)</td></tr>\n",
"    <tr><th>13</th><td>traffic_signals</td><td>61328029</td><td>blinker</td><td>POINT (-71.1127706 42.3601424)</td></tr>\n",
"    <tr><th>14</th><td>traffic_signals</td><td>61328070</td><td>NaN</td><td>POINT (-71.11107869999999 42.3591161)</td></tr>\n",
"    <tr><th>15</th><td>traffic_signals</td><td>61329632</td><td>NaN</td><td>POINT (-71.10118799999999 42.363889)</td></tr>\n",
"    <tr><th>16</th><td>traffic_signals</td><td>61329844</td><td>NaN</td><td>POINT (-71.099593 42.362982)</td></tr>\n",
"    <tr><th>17</th><td>traffic_signals</td><td>61331756</td><td>NaN</td><td>POINT (-71.1095894 42.3582726)</td></tr>\n",
"    <tr><th>18</th><td>traffic_signals</td><td>579060230</td><td>NaN</td><td>POINT (-71.0935624 42.3590304)</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 3, "text": [ " highway id traffic_signals \\\n", "0 traffic_signals 61283269 NaN \n", "1 traffic_signals 61283293 NaN \n", "2 traffic_signals 61317400 NaN \n", "3 traffic_signals 61318066 NaN \n", "4 traffic_signals 61321123 blinker \n", "5 traffic_signals 61321134 blinker \n", "6 traffic_signals 61321277 NaN \n", "7 traffic_signals 61322052 NaN \n", "8 traffic_signals 61323022 NaN \n", "9 traffic_signals 61323032 NaN \n", "10 traffic_signals 61327037 NaN \n", "11 traffic_signals 61327067 NaN \n", "12 traffic_signals 61327121 NaN \n", "13 traffic_signals 61328029 blinker \n", "14 traffic_signals 61328070 NaN \n", "15 traffic_signals 61329632 NaN \n", "16 traffic_signals 61329844 NaN \n", "17 traffic_signals 61331756 NaN \n", "18 traffic_signals 579060230 NaN \n", "\n", " geometry \n", "0 POINT (-71.09488469999999 42.3601506) \n", "1 POINT (-71.0965425 42.3561028) \n", "2 POINT (-71.1135493 42.3622591) \n", "3 POINT (-71.1046257 42.3535672) \n", "4 POINT (-71.11036199999999 42.357534) \n", "5 POINT (-71.10724089999999 42.3581054) \n", "6 POINT (-71.09943370000001 42.3634087) \n", "7 POINT (-71.110533 42.363275) \n", "8 POINT (-71.0960005 42.3608275) \n", "9 POINT (-71.0975461 42.361748) \n", "10 POINT (-71.113873 42.3644915) \n", "11 POINT (-71.101387 42.364001) \n", "12 POINT (-71.0967089 42.3632016) \n", "13 POINT (-71.1127706 42.3601424) \n", "14 POINT (-71.11107869999999 42.3591161) \n", "15 POINT (-71.10118799999999 42.363889) \n", "16 POINT (-71.099593 42.362982) \n", "17 POINT (-71.1095894 42.3582726) \n", "18 POINT (-71.0935624 42.3590304) " ] } ], "prompt_number": 3 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Project both GeoDataFrames to the Massachusetts state plane (EPSG:26986) (Hopefully in the future GeoPandas will have geography methods and we can skip the projections).\n", "\n", "Then just compute the distance from the lights to the line using `distance()`." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "epsg = 26986 # MA mainland state plane\n", "df_p = df.to_crs(epsg=epsg)\n", "df_lights_p = df_lights.to_crs(epsg=epsg)\n", "\n", "# Distance in meters from each light to the single route geometry\n", "df_lights['d'] = df_lights_p.distance(df_p['geometry'].iloc[0])\n", "df_lights['d']" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 4, "text": [ "0 13.632391\n", "1 2.799502\n", "2 382.609745\n", "3 11.127261\n", "4 150.686201\n", "5 308.685223\n", "6 129.537163\n", "7 259.754943\n", "8 17.562312\n", "9 30.430179\n", "10 555.348314\n", "11 86.198503\n", "12 206.003460\n", "13 182.843900\n", "14 5.179193\n", "15 90.214907\n", "16 83.550469\n", "17 147.394822\n", "18 12.741685\n", "Name: d, dtype: float64" ] } ], "prompt_number": 4 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Visualization\n", "Now visualize the data. I'm using [geojsonio.py](http://github.com/jwass/geojsonio.py), which opens http://geojson.io. We can take advantage of [simplestyle-spec](https://github.com/mapbox/simplestyle-spec) by setting a few properties on the GeoDataFrame to control plotting behavior.\n", "\n", "Use the `marker-color` property: traffic lights that are close to the route (within 20 meters) are green and the others are red. The light at Massachusetts Ave. & Landsdowne St. is a false negative due to noise in the GPS trace.\n", "\n", "You can click on each marker and inspect the `d` property to see how far it is from the route." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "# Shouldn't need the 'set_geometry()' call below, this is a problem in GeoPandas\n", "combined = df.append(df_lights).set_geometry('geometry')\n", "combined['marker-color'] = np.where(combined['d'] <= 20, '#4daf4a', '#e41a1c')\n", "\n", "import geojsonio\n", "geojsonio.embed(combined.to_json(na='drop'))" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "" ], "metadata": {}, "output_type": "pyout", "prompt_number": 6, "text": [ "" ] } ], "prompt_number": 6 } ], "metadata": {} } ] }