{ "cells": [ { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "_Since we announced [our collaboration with the World Bank and more partners to create the Open Traffic platform](https://mapzen.com/blog/announcing-open-traffic/), we’ve been busy. We’ve shared [two](https://mapzen.com/blog/open-traffic-osmlr-technical-preview/) [technical](https://mapzen.com/blog/osmlr-2nd-technical-preview/) previews of the OSMLR linear referencing system. Now we’re ready to share more about how we’re using [Mapzen Map Matching](https://mapzen.com/blog/map-matching/) to “snap” GPS-derived locations to OSMLR segments, and how we’re using a data-driven approach to evaluate and improve the algorithms._" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "# A \"data-driven\" approach to improving map-matching - Part I:\n", "## _VALIDATION_\n", "============================================================================================" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "Mapzen has been testing and matching GPS measurements from some of Open Traffic’s partners since development began, but one burning question remained: were our matches any good? Map-matching real-time GPS traces is one thing, but without on-the-ground knowledge of where the traces actually came from, it was impossible to determine how close to, or far from, the truth our predictions were.\n", "\n", "Our in-house solution was to use Mapzen's very own [Turn-By-Turn](https://mapzen.com/products/turn-by-turn/) routing API to generate synthetic GPS data, send that data through the [Mapzen Map Matching](https://mapzen.com/blog/map-matching/) service, and compare the results to the original routes from which the synthetic traces were generated. 
The process is documented below:" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "## 0. Set up the test environment" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true, "deletable": true, "editable": true }, "outputs": [], "source": [ "from __future__ import division\n", "from matplotlib import pyplot as plt\n", "from matplotlib import cm, colors, patheffects\n", "import numpy as np\n", "import os\n", "import glob\n", "import urllib\n", "import json\n", "import pandas as pd\n", "from random import shuffle, choice\n", "import pickle\n", "import sys; sys.path.insert(0, os.path.abspath('..'));\n", "import validator.validator as val\n", "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "deletable": true, "editable": true }, "source": [ "#### User vars" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true, "deletable": true, "editable": true }, "outputs": [], "source": [ "# Read the API keys from environment variables\n", "mapzenKey = os.environ.get('MAPZEN_API')\n", "gmapsKey = os.environ.get('GOOGLE_MAPS')" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "deletable": true, "editable": true }, "source": [ "## 1. Generate Routes" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "The first step in route generation is picking a test region, which for us was San Francisco. Each route is defined by a pair of start and end coordinates, which we obtain by randomly sampling venues from Mapzen’s [Who’s on First](https://whosonfirst.mapzen.com/) gazetteer for the specified city. 
Additionally, we want to limit our route distances to be between ½ km and 1 km because this is the localized scale at which map matching actually takes place.\n", "\n", "In this example, we specify 200 fake routes:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": true, "deletable": true, "editable": true }, "outputs": [], "source": [ "cityName = 'San Francisco'\n", "minRouteLen = 0.5 # specified in km\n", "maxRouteLen = 1 # specified in km\n", "numRoutes = 200" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "#### a) Get random start and end coordinates" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "collapsed": true, "deletable": true, "editable": true }, "outputs": [], "source": [ "# Using Mapzen Venues (requires good Who's on First coverage)\n", "routeList = val.get_routes_by_length(cityName, minRouteLen, maxRouteLen, numRoutes, apiKey=mapzenKey)\n", "\n", "## Using Google Maps POIs (better for non-Western capitals):\n", "# routeList = val.get_POI_routes_by_length(cityName, minRouteLen, maxRouteLen, numRoutes, gmapsKey)" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "A sample route:" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "collapsed": false, "deletable": true, "editable": true, "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "({u'Cabovida Dvelopments A Cal Ltd': {'lat': 37.775456, 'lon': -122.406369}},\n", " {u'Palomar Group': {'lat': 37.788245, 'lon': -122.409508}})" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "myRoute = routeList[2]\n", "myRoute" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "#### b) Get the route shapes and attributes" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "For each route, we then pass the start and end 
coordinates to the Turn-By-Turn API to obtain the coordinates of the road segments along the route:" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "collapsed": true, "deletable": true, "editable": true }, "outputs": [], "source": [ "shape, routeUrl = val.get_route_shape(myRoute)" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "The Turn-By-Turn API returns the shape of the route as an [encoded polyline](https://developers.google.com/maps/documentation/utilities/polylinealgorithm):" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [ { "data": { "text/plain": [ "u'mes`gAv}anhFtUr[q{@vkAgOnS{EaHqRwWuY{`@sLePoMcQiMqQqMcQiCwD}TsZ}Yz`@cV`\\\\mYj`@{Zxa@}U`\\\\}Yj`@}Yj`@cUd[wTbZmDlKO\\\\Ol@}D|IsAzAwC|@sp@nHcy@|J}x@lJmy@|Jey@|J}OhBuThCqRxBaLqeBqCuc@'" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "shape" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "The route shape then gets passed to the map matching service in order to obtain the coordinates and attributes of the road segments (i.e. edges) that lie along the original route:" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "collapsed": true, "deletable": true, "editable": true }, "outputs": [], "source": [ "edges, matchedPts, shapeCoords, _ = val.get_trace_attrs(shape)\n", "edges = val.get_coords_per_second(shapeCoords, edges, '2768')" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "We can inspect the attributes returned for our example route:" ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [ { "data": { "text/html": [ "
\n",
"| | id | begin_shape_index | end_shape_index | length | speed | density | oneSecCoords | segment_id | num_segments | starts_segment | ends_segment | begin_percent | end_percent | begin_resampled_shape_index | end_resampled_shape_index |\n",
"|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|\n",
"| 0 | 3738534172576 | 0 | 1 | 0.057 | 40 | 15 | [[-122.40638, 37.775463], [-122.406469252, 37.... | 497780021152 | 1 | True | True | 0.0 | 1.0 | None | None |\n",
"| 1 | 962716097074 | 1 | 2 | 0.153 | 35 | 15 | [[-122.406916267, 37.7751617604], [-122.406994... | 39633672754 | 1 | True | True | 0.0 | 1.0 | None | None |\n",
"| 2 | 931778910770 | 2 | 3 | 0.041 | 35 | 15 | [[-122.408144091, 37.7761309014], [-122.408222... | 39633672754 | 1 | False | False | 0.0 | 1.0 | None | None |\n",
"| 3 | 3780477212576 | 3 | 4 | 0.018 | 40 | 15 | [[-122.408302818, 37.7763981724], [-122.408249... | None | 0 | None | None | NaN | NaN | None | None |\n",
"| 4 | 4122363320224 | 4 | 5 | 0.049 | 40 | 15 | [[-122.408159625, 37.7765096426], [-122.408070... | None | 0 | None | None | NaN | NaN | None | None |\n", "