{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Tutorial: Interactive Maps with Python and Folium, Part 1\n", "## Basic Maps and Circle Markers\n", "#### Using data from NYC CitiBike program\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "ExecuteTime": { "end_time": "2018-11-28T12:43:13.849242Z", "start_time": "2018-11-28T12:43:12.776957Z" } }, "outputs": [], "source": [ "import pandas as pd\n", "import folium" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To test that everything is working, let’s pull up a map of New York City and add a circle marker." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "ExecuteTime": { "end_time": "2018-11-28T12:43:13.884669Z", "start_time": "2018-11-28T12:43:13.851475Z" }, "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "folium_map = folium.Map(location=[40.738, -73.98],\n", " zoom_start=13,\n", " tiles=\"CartoDB dark_matter\")\n", "\n", "folium.CircleMarker(location=[40.738, -73.98],fill=True).add_to(folium_map)\n", "folium_map" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Showing some real data, NYC bike trips\n", "Next, we’ll load some data. The NYC bike share program makes its data public, you can download it here:\n", "https://www.citibikenyc.com/system-data.\n", "Just one month of data will do for this example. \n", "\n", "We’ll use pandas to load the data into python, we’ll convert time strings into DateTime objects" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "ExecuteTime": { "end_time": "2018-11-28T12:43:32.653296Z", "start_time": "2018-11-28T12:43:13.890081Z" } }, "outputs": [], "source": [ "bike_data = pd.read_csv(\"201610-citibike-tripdata.csv\")\n", "bike_data[\"Start Time\"] = pd.to_datetime(bike_data[\"Start Time\"])\n", "bike_data[\"Stop Time\"] = pd.to_datetime(bike_data[\"Stop Time\"])\n", "bike_data[\"hour\"] = bike_data[\"Start Time\"].map(lambda x: x.hour)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "ExecuteTime": { "end_time": "2018-11-28T12:43:32.688022Z", "start_time": "2018-11-28T12:43:32.657573Z" } }, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>Trip Duration</th><th>Start Time</th><th>Stop Time</th><th>Start Station ID</th><th>Start Station Name</th><th>Start Station Latitude</th><th>Start Station Longitude</th><th>End Station ID</th><th>End Station Name</th><th>End Station Latitude</th><th>End Station Longitude</th><th>Bike ID</th><th>User Type</th><th>Birth Year</th><th>Gender</th><th>hour</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>328</td><td>2016-10-01 00:00:07</td><td>2016-10-01 00:05:35</td><td>471</td><td>Grand St &amp; Havemeyer St</td><td>40.712868</td><td>-73.956981</td><td>3077</td><td>Stagg St &amp; Union Ave</td><td>40.708771</td><td>-73.950953</td><td>25254</td><td>Subscriber</td><td>1992.0</td><td>1</td><td>0</td></tr>\n",
"<tr><th>1</th><td>398</td><td>2016-10-01 00:00:11</td><td>2016-10-01 00:06:49</td><td>3147</td><td>E 85 St &amp; 3 Ave</td><td>40.778012</td><td>-73.954071</td><td>3140</td><td>1 Ave &amp; E 78 St</td><td>40.771404</td><td>-73.953517</td><td>17810</td><td>Subscriber</td><td>1988.0</td><td>2</td><td>0</td></tr>\n",
"<tr><th>2</th><td>430</td><td>2016-10-01 00:00:14</td><td>2016-10-01 00:07:25</td><td>345</td><td>W 13 St &amp; 6 Ave</td><td>40.736494</td><td>-73.997044</td><td>470</td><td>W 20 St &amp; 8 Ave</td><td>40.743453</td><td>-74.000040</td><td>20940</td><td>Subscriber</td><td>1965.0</td><td>1</td><td>0</td></tr>\n",
"<tr><th>3</th><td>351</td><td>2016-10-01 00:00:21</td><td>2016-10-01 00:06:12</td><td>3307</td><td>West End Ave &amp; W 94 St</td><td>40.794165</td><td>-73.974124</td><td>3357</td><td>W 106 St &amp; Amsterdam Ave</td><td>40.800836</td><td>-73.966449</td><td>19086</td><td>Subscriber</td><td>1993.0</td><td>1</td><td>0</td></tr>\n",
"<tr><th>4</th><td>2693</td><td>2016-10-01 00:00:21</td><td>2016-10-01 00:45:15</td><td>3428</td><td>8 Ave &amp; W 16 St</td><td>40.740983</td><td>-74.001702</td><td>3323</td><td>W 106 St &amp; Central Park West</td><td>40.798186</td><td>-73.960591</td><td>26502</td><td>Subscriber</td><td>1991.0</td><td>1</td><td>0</td></tr>\n",
"</tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " Trip Duration Start Time Stop Time Start Station ID \\\n", "0 328 2016-10-01 00:00:07 2016-10-01 00:05:35 471 \n", "1 398 2016-10-01 00:00:11 2016-10-01 00:06:49 3147 \n", "2 430 2016-10-01 00:00:14 2016-10-01 00:07:25 345 \n", "3 351 2016-10-01 00:00:21 2016-10-01 00:06:12 3307 \n", "4 2693 2016-10-01 00:00:21 2016-10-01 00:45:15 3428 \n", "\n", " Start Station Name Start Station Latitude Start Station Longitude \\\n", "0 Grand St & Havemeyer St 40.712868 -73.956981 \n", "1 E 85 St & 3 Ave 40.778012 -73.954071 \n", "2 W 13 St & 6 Ave 40.736494 -73.997044 \n", "3 West End Ave & W 94 St 40.794165 -73.974124 \n", "4 8 Ave & W 16 St 40.740983 -74.001702 \n", "\n", " End Station ID End Station Name End Station Latitude \\\n", "0 3077 Stagg St & Union Ave 40.708771 \n", "1 3140 1 Ave & E 78 St 40.771404 \n", "2 470 W 20 St & 8 Ave 40.743453 \n", "3 3357 W 106 St & Amsterdam Ave 40.800836 \n", "4 3323 W 106 St & Central Park West 40.798186 \n", "\n", " End Station Longitude Bike ID User Type Birth Year Gender hour \n", "0 -73.950953 25254 Subscriber 1992.0 1 0 \n", "1 -73.953517 17810 Subscriber 1988.0 2 0 \n", "2 -74.000040 20940 Subscriber 1965.0 1 0 \n", "3 -73.966449 19086 Subscriber 1993.0 1 0 \n", "4 -73.960591 26502 Subscriber 1991.0 1 0 " ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bike_data.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### pre-processing data\n", "We'll write a function that does the following: \n", "- generate a DataFrame containing locations of stations\n", "- generates a DataFrame containing the number of trips originating at each station. \n", "- generates a DataFrame containing the number of trips arriving at each station. \n", "- join the three dataframes into one." 
] }, { "cell_type": "code", "execution_count": 5, "metadata": { "ExecuteTime": { "end_time": "2018-11-28T12:43:33.438458Z", "start_time": "2018-11-28T12:43:32.690255Z" } }, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>Departure Count</th><th>Start Station Latitude</th><th>Start Station Longitude</th><th>Start Station Name</th><th>Arrival Count</th></tr>\n",
"<tr><th>Start Station ID</th><th></th><th></th><th></th><th></th><th></th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>72</th><td>42</td><td>40.767272</td><td>-73.993929</td><td>W 52 St &amp; 11 Ave</td><td>96.0</td></tr>\n",
"<tr><th>79</th><td>29</td><td>40.719116</td><td>-74.006667</td><td>Franklin St &amp; W Broadway</td><td>79.0</td></tr>\n",
"<tr><th>82</th><td>31</td><td>40.711174</td><td>-74.000165</td><td>St James Pl &amp; Pearl St</td><td>13.0</td></tr>\n",
"<tr><th>83</th><td>32</td><td>40.683826</td><td>-73.976323</td><td>Atlantic Ave &amp; Fort Greene Pl</td><td>35.0</td></tr>\n",
"<tr><th>116</th><td>64</td><td>40.741776</td><td>-74.001497</td><td>W 17 St &amp; 8 Ave</td><td>85.0</td></tr>\n",
"</tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " Departure Count Start Station Latitude \\\n", "Start Station ID \n", "72 42 40.767272 \n", "79 29 40.719116 \n", "82 31 40.711174 \n", "83 32 40.683826 \n", "116 64 40.741776 \n", "\n", " Start Station Longitude Start Station Name \\\n", "Start Station ID \n", "72 -73.993929 W 52 St & 11 Ave \n", "79 -74.006667 Franklin St & W Broadway \n", "82 -74.000165 St James Pl & Pearl St \n", "83 -73.976323 Atlantic Ave & Fort Greene Pl \n", "116 -74.001497 W 17 St & 8 Ave \n", "\n", " Arrival Count \n", "Start Station ID \n", "72 96.0 \n", "79 79.0 \n", "82 13.0 \n", "83 35.0 \n", "116 85.0 " ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def get_trip_counts_by_hour(selected_hour):\n", " # make a DataFrame with locations for each bike station\n", " locations = bike_data.groupby(\"Start Station ID\").first()\n", " locations = locations.loc[:, [\"Start Station Latitude\",\n", " \"Start Station Longitude\",\n", " \"Start Station Name\"]]\n", " \n", " #select one time of day\n", " subset = bike_data[bike_data[\"hour\"]==selected_hour]\n", " \n", " # count trips for each destination\n", " departure_counts = subset.groupby(\"Start Station ID\").count()\n", " departure_counts = departure_counts.iloc[:,[0]]\n", " departure_counts.columns= [\"Departure Count\"]\n", " \n", " # count trips for each origin\n", " arrival_counts = subset.groupby(\"End Station ID\").count().iloc[:,[0]]\n", " arrival_counts.columns= [\"Arrival Count\"]\n", "\n", " #join departure counts, arrival counts, and locations\n", " trip_counts = departure_counts.join(locations).join(arrival_counts)\n", " return trip_counts\n", "\n", "# print a sample to check our code works\n", "get_trip_counts_by_hour(6).head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we'll write a function that generates a new folium map and adds circle markers for each station." 
] }, { "cell_type": "code", "execution_count": 6, "metadata": { "ExecuteTime": { "end_time": "2018-11-28T12:43:33.451717Z", "start_time": "2018-11-28T12:43:33.441093Z" } }, "outputs": [], "source": [
"def plot_station_counts(trip_counts):\n",
"    # generate a new map\n",
"    folium_map = folium.Map(location=[40.738, -73.98],\n",
"                            zoom_start=13,\n",
"                            tiles=\"CartoDB dark_matter\",\n",
"                            width='50%')\n",
"\n",
"    # for each row in the data, add a circle marker\n",
"    for index, row in trip_counts.iterrows():\n",
"        # calculate net departures\n",
"        net_departures = row[\"Departure Count\"] - row[\"Arrival Count\"]\n",
"\n",
"        # generate the popup message that is shown on click\n",
"        popup_text = \"{}<br> total departures: {}<br> total arrivals: {}<br> net departures: {}\"\n",
"        popup_text = popup_text.format(row[\"Start Station Name\"],\n",
"                                       row[\"Departure Count\"],\n",
"                                       row[\"Arrival Count\"],\n",
"                                       net_departures)\n",
"\n",
"        # radius of the circle scales with the size of the net flow;\n",
"        # abs() keeps it non-negative when arrivals exceed departures\n",
"        radius = abs(net_departures) / 20\n",
"\n",
"        # choose the color of the marker\n",
"        if net_departures > 0:\n",
"            # color=\"#FFCE00\" # orange\n",
"            # color=\"#007849\" # green\n",
"            color = \"#E37222\"  # tangerine\n",
"        else:\n",
"            # color=\"#0375B4\" # blue\n",
"            # color=\"#FFCE00\" # yellow\n",
"            color = \"#0A8A9F\"  # teal\n",
"\n",
"        # add marker to the map\n",
"        folium.CircleMarker(location=(row[\"Start Station Latitude\"],\n",
"                                      row[\"Start Station Longitude\"]),\n",
"                            radius=radius,\n",
"                            color=color,\n",
"                            popup=popup_text,\n",
"                            fill=True).add_to(folium_map)\n",
"    return folium_map" ] }, { "cell_type": "markdown", "metadata": {}, "source": [
"### Showing real data\n",
"We'll make two maps to show the different patterns of bike migration at 9 AM and 6 PM." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "ExecuteTime": { "end_time": "2018-11-28T12:43:35.683522Z", "start_time": "2018-11-28T12:43:33.455012Z" } }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# plot net departures at 9AM\n", "\n", "trip_counts = get_trip_counts_by_hour(9)\n", "plot_station_counts(trip_counts)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "ExecuteTime": { "end_time": "2018-11-28T12:43:37.922462Z", "start_time": "2018-11-28T12:43:35.685665Z" } }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# plot net departures at 6PM\n", "\n", "trip_counts = get_trip_counts_by_hour(18)\n", "folium_map = plot_station_counts(trip_counts)\n", "folium_map" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "ExecuteTime": { "end_time": "2018-11-28T12:43:39.164530Z", "start_time": "2018-11-28T12:43:37.925216Z" } }, "outputs": [], "source": [ "folium_map.save(\"part_1.html\")" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.3" }, "varInspector": { "cols": { "lenName": 16, "lenType": 16, "lenVar": 40 }, "kernels_config": { "python": { "delete_cmd_postfix": "", "delete_cmd_prefix": "del ", "library": "var_list.py", "varRefreshCmd": "print(var_dic_list())" }, "r": { "delete_cmd_postfix": ") ", "delete_cmd_prefix": "rm(", "library": "var_list.r", "varRefreshCmd": "cat(var_dic_list()) " } }, "types_to_exclude": [ "module", "function", "builtin_function_or_method", "instance", "_Feature" ], "window_display": false } }, "nbformat": 4, "nbformat_minor": 2 }