{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Computing The Really Optimal Tour Across The USA On The Cloud With Python\n",
"\n",
"## Author: [Jean-François Puget](https://www.ibm.com/developerworks/community/blogs/jfp/?lang=en)\n",
"\n",
"This is a follow up on Randy Olson work on solving TSPs in Python: [Computing the optimal road trip across the U.S.](http://www.randalolson.com/2015/03/08/computing-the-optimal-road-trip-across-the-u-s/). The code below is commented at length in my blog entry on [Computing The Really Optimal Tour Across The USA On The Cloud With Python](https://www.ibm.com/developerworks/community/blogs/jfp/entry/computing_the_really_optimal_tour_acrosss_the_usa_on_the_cloud_with_python)\n",
"\n",
"Some of the code is a derivative work of Randy Olson's code, namely the list of points to visits, and the html template used for rendering the tour on Google Map.\n",
"\n",
"Can we do better than Randy, i.e. can we compute the shortest tour visiting each one of the 50 landmarks he has selected? \n",
"\n",
"The answer is yes. I will use these tools to compute it.\n",
" \n",
"### Google Maps\n",
"No need to present this, is it? We will use it for gathering data and rendering the result on a map.\n",
" \n",
"### Concorde\n",
"[Concorde](http://www.math.uwaterloo.ca/tsp/concorde.html) is the best known algorithm to solve TSPs, see my [post](http://www.randalolson.com/2015/03/08/computing-the-optimal-road-trip-across-the-u-s/) for more background on it.\n",
" \n",
"### CPLEX\n",
"We use [CPLEX](http://www.ibm.com/software/commerce/optimization/cplex-optimizer/) indirectly as Concorde is built on top of it. In all fairness, Concorde can also be used with an alternative solver called QSopt. Using CPLEX is faster, but the difference becomes noticeable only for very large TSPs. For a 50 city TSP lke Randy Olson's tour, using CPLEX or QSopt is similar. Let's admit that I have a strong bias towards CPLEX in general!\n",
" \n",
"### NEOS\n",
"This is what makes Concorde and CPLEX easy to consume in Python. NEOS is a server delivering Software as a Service (SaaS). Actually, we could say that NEOS delivers Solvers as a Service because it provides quite a few optimization solvers. In particular, Concorde on NEOS is available at: http://neos.mcs.anl.gov/neos/solvers/co:concorde/TSP.html\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Enough discussion, let's start.\n",
"\n",
"## Python Setup\n",
"\n",
"Let's first import packages we'll use in this notebook. We use Python 2.7 here. if you use Python 3 then you should rename xmlrpclib into xmlrpc.client ."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import googlemaps\n",
"import time\n",
"import xmlrpclib\n",
"import re\n",
"from jinja2 import Template"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"For ease of reuse, let's state the versions of what we are using here. We will use the convenient [watermark package](https://pypi.python.org/pypi/watermark). It can be installed with ```pip install ipyext```. Note that the package name is not watermark, for some reason.\n",
"\n",
"Once installed, it is used as a magic function."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"%load_ext watermark"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"JFPuget Sat Mar 12 2016 \n",
"\n",
"CPython 2.7.11\n",
"IPython 4.0.3\n",
"\n",
"googlemaps 2.4.2\n",
"jinja2 2.8\n",
"requests 2.6.0\n",
"\n",
"compiler : MSC v.1500 64 bit (AMD64)\n",
"system : Windows\n",
"release : 7\n",
"machine : AMD64\n",
"processor : Intel64 Family 6 Model 42 Stepping 7, GenuineIntel\n",
"CPU cores : 8\n",
"interpreter: 64bit\n"
]
}
],
"source": [
"%watermark -a 'JFPuget' -nmv --packages googlemaps,jinja2,requests"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Data\n",
"\n",
"We are set, let's start real work. We start with the list of cities to be visited. The list below is from Randy Olson [article](http://www.randalolson.com/2015/03/08/computing-the-optimal-road-trip-across-the-u-s/). You can paste your own list to get another tour. Note that all points must be understandable by Google Maps, hence you should try to locate each of those points on it before running the script."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"USA_50_points = [\"USS Alabama, Battleship Parkway, Mobile, AL\",\n",
" \"Grand Canyon National Park, Arizona\",\n",
" \"Toltec Mounds, Scott, AR\",\n",
" \"San Andreas Fault, San Benito County, CA\",\n",
" \"Cable Car Museum, 94108, 1201 Mason St, San Francisco, CA 94108\",\n",
" \"Pikes Peak, Colorado\",\n",
" \"The Mark Twain House & Museum, Farmington Avenue, Hartford, CT\",\n",
" \"New Castle Historic District, Delaware\",\n",
" \"White House, Pennsylvania Avenue Northwest, Washington, DC\",\n",
" \"Cape Canaveral, FL\",\n",
" \"Okefenokee Swamp Park, Okefenokee Swamp Park Road, Waycross, GA\",\n",
" \"Craters of the Moon National Monument & Preserve, Arco, ID\",\n",
" \"Lincoln Home National Historic Site Visitor Center, 426 South 7th Street, Springfield, IL\",\n",
" \"West Baden Springs Hotel, West Baden Avenue, West Baden Springs, IN\",\n",
" \"Terrace Hill, Grand Avenue, Des Moines, IA\",\n",
" \"C. W. Parker Carousel Museum, South Esplanade Street, Leavenworth, KS\",\n",
" \"Mammoth Cave National Park, Mammoth Cave Pkwy, Mammoth Cave, KY\",\n",
" \"French Quarter, New Orleans, LA\",\n",
" \"Acadia National Park, Maine\",\n",
" \"Maryland State House, 100 State Cir, Annapolis, MD 21401\",\n",
" \"USS Constitution, Boston, MA\",\n",
" \"Olympia Entertainment, Woodward Avenue, Detroit, MI\",\n",
" \"Fort Snelling, Tower Avenue, Saint Paul, MN\",\n",
" \"Vicksburg National Military Park, Clay Street, Vicksburg, MS\",\n",
" \"Gateway Arch, Washington Avenue, St Louis, MO\",\n",
" \"Glacier National Park, West Glacier, MT\",\n",
" \"Ashfall Fossil Bed, Royal, NE\",\n",
" \"Hoover Dam, NV\",\n",
" \"Omni Mount Washington Resort, Mount Washington Hotel Road, Bretton Woods, NH\",\n",
" \"Congress Hall, Congress Place, Cape May, NJ 08204\",\n",
" \"Carlsbad Caverns National Park, Carlsbad, NM\",\n",
" \"Statue of Liberty, Liberty Island, NYC, NY\",\n",
" \"Wright Brothers National Memorial Visitor Center, Manteo, NC\",\n",
" \"Fort Union Trading Post National Historic Site, Williston, North Dakota 1804, ND\",\n",
" \"Spring Grove Cemetery, Spring Grove Avenue, Cincinnati, OH\",\n",
" \"Chickasaw National Recreation Area, 1008 W 2nd St, Sulphur, OK 73086\",\n",
" \"Columbia River Gorge National Scenic Area, Oregon\",\n",
" \"Liberty Bell, 6th Street, Philadelphia, PA\",\n",
" \"The Breakers, Ochre Point Avenue, Newport, RI\",\n",
" \"Fort Sumter National Monument, Sullivan's Island, SC\",\n",
" \"Mount Rushmore National Memorial, South Dakota 244, Keystone, SD\",\n",
" \"Graceland, Elvis Presley Boulevard, Memphis, TN\",\n",
" \"The Alamo, Alamo Plaza, San Antonio, TX\",\n",
" \"Bryce Canyon National Park, Hwy 63, Bryce, UT\",\n",
" \"Shelburne Farms, Harbor Road, Shelburne, VT\",\n",
" \"Mount Vernon, Fairfax County, Virginia\",\n",
" \"Hanford Site, Benton County, WA\",\n",
" \"Lost World Caverns, Lewisburg, WV\",\n",
" \"Taliesin, County Road C, Spring Green, Wisconsin\",\n",
" \"Yellowstone National Park, WY 82190\"]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Google Maps Setup\n",
"\n",
"We need the distance between each pair of cities we want to visit. We will use Google map distance matrix API here, but we could provide the distance matrix in any way that suits us. We will see later than we can import an existing [TSPLIB](http://www.iwr.uni-heidelberg.de/groups/comopt/software/TSPLIB95/) file for instance.\n",
"\n",
"We must enable the Google map distance matrix API on the [developer console](https://console.developers.google.com/). We will use the Python client library([documentation](https://googlemaps.github.io/google-maps-services-python/docs/2.2/)). We imported it above via ```import googlemaps```.\n",
"\n",
"First step is to get a google maps API key, see [instructions](https://github.com/googlemaps/google-maps-services-python#api-keys). Once we have a key we paste it below. I removed my actual key and you should paste yours if you want to reuse this code. We initiailize the client connection to the service with the key."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"google_key = \"paste your key here\"\n",
"\n",
"gmaps = googlemaps.Client(key=google_key)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We are using the free version of the service. The free API is limited to 2500 pairwise distances per 24 hours. Given we have 50 points to visit, we need 50x49/2 = 1225 pairwise distances, hence we can call this only twice a day or so. The free API is also limited to 100 pairwise distances per 10 seconds. We therefore have to wait about 123 seconds for getting all our distances.\n",
"\n",
"We use the default parameters for the distance matrix api: units=metric, mode=driving, departure_time=now.\n",
"\n",
"We return the result as a string in the [TSPLIB](http://www.iwr.uni-heidelberg.de/groups/comopt/software/TSPLIB95/) format. The string must represent the distance matrix in lower triangular form. If the cities are numbered from 0 to 49, this distance matrix must look like the following (I only show the first few lines):\n",
"```\n",
"0\n",
"1-0 0\n",
"2-0 2-1 0\n",
"3-0 3-1 3-2 0\n",
"4-0 4-1 4-2 4-3 0\n",
"```\n",
"where `i-j` represents the distance from city `i` to city `j`. \n"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def get_tsp_data(points):\n",
" distances = ''\n",
" try:\n",
" for i in range(0, len(points)):\n",
" for j in range(0,i+1):\n",
" if i == j: \n",
" distances += '0\\n'\n",
" else: \n",
" result1 = gmaps.distance_matrix(origins=points[i],destinations=points[j])\n",
" d = result1['rows'][0]['elements'][0]['distance']['value']\n",
" distances += '%d ' % d\n",
" except Exception as e:\n",
" print 'Error in getting distance between %s and %s' % (points[i], points[j])\n",
" return distances"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's try it!"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"distances = get_tsp_data(USA_50_points)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Does it look good?"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0\n",
"2677783 0\n",
"708032 2066041 0\n",
"3679695 1086596 3007532 0\n",
"3854200 1261100 3182037 213466 0\n",
"2223711 9742\n"
]
}
],
"source": [
"print(distances[:100])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Seems it is correct. \n",
"\n",
"## NEOS Data Preparation\n",
"\n",
"We can now prepare data for [NEOS](http://neos.mcs.anl.gov/neos/solvers/co:concorde/TSP.html). We will use the NEOS API for Concorde. It is not a REST API as is now customary for web services, because NEOS was developed before REST APIs were invented! NEOS API uses XML-RPC. All we need to provide is a XML file that contains the data of the problem plus few parameters for Concorde. \n",
"\n",
"We first need to wrap data to get a valid [TSPLIB format](http://www.iwr.uni-heidelberg.de/groups/comopt/software/TSPLIB95/). This format has few fixed input lines describing the problem before the distance matrix. The string below contains the template for files containing a lower triangular distance matrix"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"tsp_template = \"\"\"\n",
"TYPE : TSP\n",
"DIMENSION : %i\n",
"EDGE_WEIGHT_TYPE : EXPLICIT\n",
"EDGE_WEIGHT_FORMAT : LOWER_DIAG_ROW \n",
"EDGE_WEIGHT_SECTION\n",
"%s\n",
"EOF\n",
"\"\"\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We insert the distance matrix we computed via the % operator."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"tsp_data = tsp_template % (50, distances)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We save it in case we ned to rerun it. This will avoids waiting 5 minutes to get the distance matrix from Google map next time"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"with open('usa_50.tsp','wb') as file:\n",
" file.write(tsp_data)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If for some reason we lost our Python kernel, or if we restart it, we can get the data back from the file."
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"```\n",
"with open('usa_50.tsp','rb') as file:\n",
" tsp_data = file.read()\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We now have to insert that data in the right xml file to submit it to NEOS. As explained in [my post](https://www.ibm.com/developerworks/community/blogs/jfp/entry/computing_the_really_optimal_tour_acrosss_the_usa_on_the_cloud_with_python), I could have carefully read the documentation. I preferred to use another nice way NEOS offers: it is possible to generate the xml file using the dry run option on the NEOS [submission page for Concorde](http://neos.mcs.anl.gov/neos/solvers/co:concorde/TSP.html). For that, I created a file named fake.tsp with the two character string '%s' as content. This content is only there for helping processing with Python later. I then entered the path to that file on the submission page, and I checked the dry run option before submitting. This does not run the job. It produces an xml content that I stored in the string below. I removed my actual email and you should put yours instead if you want to reuse it."
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"base_xml = \"\"\"\n",
"