{
"metadata": {
"name": "",
"signature": "sha256:0744d2ec1fb36ad903b5770fb0e64641d05d3c372f6cb58e0ad66257bb1bac89"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# GeoPandas and OpenStreetMap\n",
"\n",
"The GeoPandas `osm` module makes it easy to dynamically query and analyze arbitrary OpenStreetMap (OSM) data. It's a powerful combination.\n",
"\n",
"This example counts the traffic lights along an arbitrary running route in just a few lines of code. OpenStreetMap's traffic light coverage is incomplete in many places, but it is pretty good around Cambridge, MA, where this example is located.\n",
"\n",
"I used RunKeeper to download a recorded run from my phone. You can find the raw file here:\n",
"\n",
"https://gist.githubusercontent.com/jwass/4ae78872e30a21b34a44/raw/e48b18797e2dd9fb653d38329df574fed1852057/RK_gpx%20_2014-07-10_2054.gpx \n",
"\n",
"I removed a bunch of random points near the start and end so you can't exactly figure out where I live :)"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import numpy as np\n",
"import geopandas as gpd\n",
"import geopandas.io.osm as osm"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 1
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Load the trace\n",
"First load the GPX file. Specifying the `tracks` layer loads the data as a single row containing a `LineString` or `MultiLineString`. For other uses, you can specify `layer='track_points'` to get one row for each recorded point."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"df = gpd.read_file('RK_gpx_2014-07-10_2054.gpx', layer='tracks')\n",
"df"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 2,
"text": [
" cmt desc geometry link1_href \\\n",
"0 None None (LINESTRING (-71.106996 42.36279, -71.10707499... None \n",
"\n",
" link1_text link1_type link2_href link2_text link2_type \\\n",
"0 None None None None None \n",
"\n",
" name number src type \n",
"0 Running 7/10/14 8:54 pm None None None "
]
}
],
"prompt_number": 2
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Load the OpenStreetMap data\n",
"Now load all the OpenStreetMap traffic lights in the route's bounding box. Traffic lights in OSM are represented as nodes where the `highway` tag is `traffic_signals`. The `query_osm` function takes care of formulating the query and returning a GeoDataFrame ready to go with all the points."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"df_lights = osm.query_osm('node', bbox=df.total_bounds, tags='highway=traffic_signals')\n",
"df_lights"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 3,
"text": [
" highway id traffic_signals \\\n",
"0 traffic_signals 61283269 NaN \n",
"1 traffic_signals 61283293 NaN \n",
"2 traffic_signals 61317400 NaN \n",
"3 traffic_signals 61318066 NaN \n",
"4 traffic_signals 61321123 blinker \n",
"5 traffic_signals 61321134 blinker \n",
"6 traffic_signals 61321277 NaN \n",
"7 traffic_signals 61322052 NaN \n",
"8 traffic_signals 61323022 NaN \n",
"9 traffic_signals 61323032 NaN \n",
"10 traffic_signals 61327037 NaN \n",
"11 traffic_signals 61327067 NaN \n",
"12 traffic_signals 61327121 NaN \n",
"13 traffic_signals 61328029 blinker \n",
"14 traffic_signals 61328070 NaN \n",
"15 traffic_signals 61329632 NaN \n",
"16 traffic_signals 61329844 NaN \n",
"17 traffic_signals 61331756 NaN \n",
"18 traffic_signals 579060230 NaN \n",
"\n",
" geometry \n",
"0 POINT (-71.09488469999999 42.3601506) \n",
"1 POINT (-71.0965425 42.3561028) \n",
"2 POINT (-71.1135493 42.3622591) \n",
"3 POINT (-71.1046257 42.3535672) \n",
"4 POINT (-71.11036199999999 42.357534) \n",
"5 POINT (-71.10724089999999 42.3581054) \n",
"6 POINT (-71.09943370000001 42.3634087) \n",
"7 POINT (-71.110533 42.363275) \n",
"8 POINT (-71.0960005 42.3608275) \n",
"9 POINT (-71.0975461 42.361748) \n",
"10 POINT (-71.113873 42.3644915) \n",
"11 POINT (-71.101387 42.364001) \n",
"12 POINT (-71.0967089 42.3632016) \n",
"13 POINT (-71.1127706 42.3601424) \n",
"14 POINT (-71.11107869999999 42.3591161) \n",
"15 POINT (-71.10118799999999 42.363889) \n",
"16 POINT (-71.099593 42.362982) \n",
"17 POINT (-71.1095894 42.3582726) \n",
"18 POINT (-71.0935624 42.3590304) "
]
}
],
"prompt_number": 3
},
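{
"cell_type": "markdown",
"metadata": {},
"source": [
"For intuition about the bounding-box query, here is a rough sketch of an equivalent Overpass QL request. This is an assumption for illustration only: I'm not showing how `query_osm` builds its request internally, and the two-point route is made up. The part worth noticing is the coordinate order, since `total_bounds` (like Shapely's `bounds`) is `(minx, miny, maxx, maxy)` while Overpass QL expects `(south, west, north, east)`.\n",
"\n",
"```python\n",
"from shapely.geometry import LineString\n",
"\n",
"# Made-up two-point route in (lon, lat); the real trace has many more points\n",
"route = LineString([(-71.107, 42.3628), (-71.0936, 42.3536)])\n",
"\n",
"# Shapely bounds are (minx, miny, maxx, maxy), i.e. (west, south, east, north)\n",
"west, south, east, north = route.bounds\n",
"\n",
"# Overpass QL bounding boxes are ordered (south, west, north, east)\n",
"query = 'node[\"highway\"=\"traffic_signals\"]({},{},{},{});out;'.format(\n",
"    south, west, north, east)\n",
"print(query)\n",
"```\n",
"\n",
"The sketch only builds the query string; sending it to an Overpass API endpoint is left out."
]
},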
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Project both GeoDataFrames to the Massachusetts state plane (EPSG:26986), whose units are meters, so that distances come out in meters rather than degrees. (Hopefully in the future GeoPandas will have geography methods and we can skip the projections.)\n",
"\n",
"Then compute the distance from each light to the route using `distance()`."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"epsg = 26986 # MA mainland state plane\n",
"df_p = df.to_crs(epsg=epsg)\n",
"df_lights_p = df_lights.to_crs(epsg=epsg)\n",
"\n",
"df_lights['d'] = df_lights_p.distance(df_p['geometry'].iloc[0])\n",
"df_lights['d']"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 4,
"text": [
"0 13.632391\n",
"1 2.799502\n",
"2 382.609745\n",
"3 11.127261\n",
"4 150.686201\n",
"5 308.685223\n",
"6 129.537163\n",
"7 259.754943\n",
"8 17.562312\n",
"9 30.430179\n",
"10 555.348314\n",
"11 86.198503\n",
"12 206.003460\n",
"13 182.843900\n",
"14 5.179193\n",
"15 90.214907\n",
"16 83.550469\n",
"17 147.394822\n",
"18 12.741685\n",
"Name: d, dtype: float64"
]
}
],
"prompt_number": 4
},
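{
"cell_type": "markdown",
"metadata": {},
"source": [
"To see what `distance()` computes for each light, here is a minimal sketch using Shapely directly. The coordinates are made up and assumed to already be in a projected CRS with meter units, and the 20 meter cutoff is the one used later in this example.\n",
"\n",
"```python\n",
"from shapely.geometry import LineString, Point\n",
"\n",
"# Made-up route and lights in projected coordinates (meters)\n",
"route = LineString([(0, 0), (100, 0), (100, 100)])\n",
"lights = [Point(50, 10), Point(130, 50), Point(0, -5)]\n",
"\n",
"# Distance from each light to the nearest point on the route\n",
"distances = [light.distance(route) for light in lights]\n",
"\n",
"# Lights within 20 meters are treated as being on the route\n",
"on_route = [d <= 20 for d in distances]\n",
"print(distances, on_route)\n",
"```\n",
"\n",
"The distance is measured to the nearest point anywhere along the line, not just to the recorded vertices, so a sparse trace still works."
]
},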
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Visualization\n",
"Now visualize the data. I'm using [geojsonio.py](http://github.com/jwass/geojsonio.py), which opens http://geojson.io. We can take advantage of [simplestyle-spec](https://github.com/mapbox/simplestyle-spec) by setting a few properties on the GeoDataFrame to control plotting behavior.\n",
"\n",
"Use the `marker-color` property: traffic lights that are close to the route (within 20 meters) are green and the others are red. The light at Massachusetts Ave. & Landsdowne St. is a false negative due to noise in the GPS trace.\n",
"\n",
"You can click on each marker and inspect the `d` property to see how far it is from the route."
]
},
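{
"cell_type": "markdown",
"metadata": {},
"source": [
"One detail about the `np.where` call in the next cell: the route row has no `d` value, and a comparison against `NaN` is always false, so it falls into the red branch. As far as I can tell that's harmless here, because `marker-color` only styles point markers. A small sketch with made-up distances:\n",
"\n",
"```python\n",
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"# Made-up distances in meters; NaN stands in for the route row, which has no `d`\n",
"d = pd.Series([np.nan, 13.6, 382.6, 20.0])\n",
"\n",
"# NaN <= 20 evaluates to False, so the first entry gets the second (red) color\n",
"colors = np.where(d <= 20, '#4daf4a', '#e41a1c')\n",
"print(list(colors))\n",
"```"
]
},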
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Shouldn't need the 'set_geometry()' call below, this is a problem in GeoPandas\n",
"combined = df.append(df_lights).set_geometry('geometry')\n",
"combined['marker-color'] = np.where(combined['d'] <= 20, '#4daf4a', '#e41a1c')\n",
"\n",
"import geojsonio\n",
"geojsonio.embed(combined.to_json(na='drop'))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
""
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 6,
"text": [
""
]
}
],
"prompt_number": 6
}
],
"metadata": {}
}
]
}