{ "cells": [ { "cell_type": "markdown", "source": [ "# Convert a pandas dataframe to geojson for web-mapping\n", "\n", "Author: Geoff Boeing\n", "\n", "Original: [pandas-to-geojson](https://github.com/gboeing/urban-data-science/blob/dc86c9c89b73f87f97301883d7456f1f814589f5/17-Leaflet-Web-Mapping/pandas-to-geojson.ipynb)" ], "metadata": {} }, { "cell_type": "code", "source": [ "import pandas as pd, requests, json" ], "outputs": [], "execution_count": 1, "metadata": { "collapsed": false, "execution": { "iopub.status.busy": "2020-07-17T02:12:46.974Z", "iopub.execute_input": "2020-07-17T02:12:46.977Z", "iopub.status.idle": "2020-07-17T02:12:47.175Z", "shell.execute_reply": "2020-07-17T02:12:47.207Z" } } }, { "cell_type": "markdown", "source": [ "First download data from the city of Berkeley's API. You can use Socrata's $limit parameter to specify how many rows to grab (otherwise the default is 1,000 rows of data): https://dev.socrata.com/docs/paging.html\n", "\n", "Example request: https://data.cityofberkeley.info/resource/k489-uv4i.json?$limit=5" ], "metadata": {} }, { "cell_type": "code", "source": [ "# API endpoint for city of Berkeley's 311 calls\n", "endpoint_url = 'https://data.cityofberkeley.info/resource/bscu-qpbu.json?$limit=20&$where=latitude%20%3C%3E%20%22%22'" ], "outputs": [], "execution_count": 2, "metadata": { "collapsed": true, "execution": { "iopub.status.busy": "2020-07-17T02:12:47.180Z", "iopub.execute_input": "2020-07-17T02:12:47.184Z", "iopub.status.idle": "2020-07-17T02:12:47.191Z", "shell.execute_reply": "2020-07-17T02:12:47.209Z" } } }, { "cell_type": "code", "source": [ "# fetch the URL and load the data\n", "response = requests.get(endpoint_url)\n", "data = response.json()" ], "outputs": [], "execution_count": 3, "metadata": { "collapsed": true, "execution": { "iopub.status.busy": "2020-07-17T02:12:47.197Z", "iopub.execute_input": "2020-07-17T02:12:47.201Z", "iopub.status.idle": "2020-07-17T02:12:47.625Z", "shell.execute_reply": "2020-07-17T02:12:47.631Z" } } }, { "cell_type": "markdown", "source": [ "Next, turn the json data into a dataframe and clean it up a bit: drop unnecessary columns and any rows that lack lat-long data. We want to make our json file as small as possible (prefer under 5 mb) so that it can be loaded over the Internet to anyone viewing your map, without taking forever to download a huge file." ], "metadata": {} }, { "cell_type": "code", "source": [ "# turn the json data into a dataframe and see how many rows and what columns we have\n", "df = pd.DataFrame(data)\n", "\n", "print('We have {} rows'.format(len(df)))\n", "str(df.columns.tolist())" ], "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "We have 20 rows\n" ] }, { "output_type": "execute_result", "execution_count": 4, "data": { "text/plain": "\"['case_id', 'date_opened', 'case_status', 'date_closed', 'request_category', 'request_subcategory', 'request_detail', 'object_type', 'apn', 'street_address', 'city', 'state', 'neighborhood', 'latitude', 'longitude', 'location']\"" }, "metadata": {} } ], "execution_count": 4, "metadata": { "collapsed": false, "execution": { "iopub.status.busy": "2020-07-17T02:12:47.637Z", "iopub.execute_input": "2020-07-17T02:12:47.642Z", "iopub.status.idle": "2020-07-17T02:12:47.655Z", "shell.execute_reply": "2020-07-17T02:12:47.890Z" } } }, { "cell_type": "code", "source": [ "# convert lat-long to floats and change address from ALL CAPS to regular capitalization\n", "df['latitude'] = df['latitude'].astype(float)\n", "df['longitude'] = df['longitude'].astype(float)\n", "df['street_address'] = df['street_address'].str.title()" ], "outputs": [], "execution_count": 5, "metadata": { "collapsed": true, "execution": { "iopub.status.busy": "2020-07-17T02:12:47.661Z", "iopub.execute_input": "2020-07-17T02:12:47.665Z", "iopub.status.idle": "2020-07-17T02:12:47.671Z", "shell.execute_reply": "2020-07-17T02:12:47.893Z" } } }, { "cell_type": "code", "source": [ "# we don't need all those columns - only keep useful ones\n", "cols = ['request_detail', 'request_subcategory', 'latitude', 'longitude', 'street_address', 'case_status']\n", "df_subset = df[cols]" ], "outputs": [], "execution_count": 6, "metadata": { "collapsed": false, "execution": { "iopub.status.busy": "2020-07-17T02:12:47.677Z", "iopub.execute_input": "2020-07-17T02:12:47.681Z", "iopub.status.idle": "2020-07-17T02:12:47.688Z", "shell.execute_reply": "2020-07-17T02:12:47.896Z" } } }, { "cell_type": "code", "source": [ "# drop any rows that lack lat/long data\n", "df_geo = df_subset.dropna(subset=['latitude', 'longitude'], axis=0, inplace=False)\n", "\n", "print('We have {} geotagged rows'.format(len(df_geo)))\n", "df_geo.tail()" ], "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "We have 20 geotagged rows\n" ] }, { "output_type": "execute_result", "execution_count": 7, "data": { "text/plain": " request_detail request_subcategory latitude longitude \\\n15 Commercial Reminder Commercial 37.880719 -122.268930 \n16 Commercial Reminder Commercial 37.851141 -122.271171 \n17 Recycling - City of Berkeley Request 37.878899 -122.297101 \n18 Roll Off Bin Request 37.853874 -122.291220 \n19 Encampment Complaint Inquiry 37.882101 -122.302594 \n\n street_address case_status \n15 1475 Shattuck Ave Closed \n16 3132 M L King Jr Way Closed \n17 1005 Camelia St Closed \n18 2840 Eighth St Closed \n19 1102 Sixth St Closed ", "text/html": "
\n\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
request_detailrequest_subcategorylatitudelongitudestreet_addresscase_status
15Commercial ReminderCommercial37.880719-122.2689301475 Shattuck AveClosed
16Commercial ReminderCommercial37.851141-122.2711713132 M L King Jr WayClosed
17Recycling - City of BerkeleyRequest37.878899-122.2971011005 Camelia StClosed
18Roll Off BinRequest37.853874-122.2912202840 Eighth StClosed
19Encampment ComplaintInquiry37.882101-122.3025941102 Sixth StClosed
\n
" }, "metadata": {} } ], "execution_count": 7, "metadata": { "collapsed": false, "execution": { "iopub.status.busy": "2020-07-17T02:12:47.694Z", "iopub.execute_input": "2020-07-17T02:12:47.698Z", "iopub.status.idle": "2020-07-17T02:12:47.708Z", "shell.execute_reply": "2020-07-17T02:12:47.898Z" } } }, { "cell_type": "code", "source": [ "# what is the distribution of issue types?\n", "df_geo['request_subcategory'].value_counts()" ], "outputs": [ { "output_type": "execute_result", "execution_count": 8, "data": { "text/plain": "Miscellaneous 4\nRequest 4\nCommercial 4\nResidential 4\nClean City Program 2\nParking 1\nInquiry 1\nName: request_subcategory, dtype: int64" }, "metadata": {} } ], "execution_count": 8, "metadata": { "collapsed": false, "execution": { "iopub.status.busy": "2020-07-17T02:12:47.713Z", "iopub.execute_input": "2020-07-17T02:12:47.717Z", "iopub.status.idle": "2020-07-17T02:12:47.724Z", "shell.execute_reply": "2020-07-17T02:12:47.901Z" } } }, { "cell_type": "markdown", "source": [ "Finally, convert each row in the dataframe to a geojson-formatted feature and save the result as a file. The format is pretty simple and you can see it here: http://geojson.org/" ], "metadata": {} }, { "cell_type": "code", "source": [ "def df_to_geojson(df, properties, lat='latitude', lon='longitude'):\n", " # create a new python dict to contain our geojson data, using geojson format\n", " geojson = {'type':'FeatureCollection', 'features':[]}\n", "\n", " # loop through each row in the dataframe and convert each row to geojson format\n", " for _, row in df.iterrows():\n", " # create a feature template to fill in\n", " feature = {'type':'Feature',\n", " 'properties':{},\n", " 'geometry':{'type':'Point',\n", " 'coordinates':[]}}\n", "\n", " # fill in the coordinates\n", " feature['geometry']['coordinates'] = [row[lon],row[lat]]\n", "\n", " # for each column, get the value and add it as a new feature property\n", " for prop in properties:\n", " feature['properties'][prop] = row[prop]\n", " \n", " # add this feature (aka, converted dataframe row) to the list of features inside our dict\n", " geojson['features'].append(feature)\n", " \n", " return geojson" ], "outputs": [], "execution_count": 9, "metadata": { "collapsed": false, "execution": { "iopub.status.busy": "2020-07-17T02:12:47.730Z", "iopub.execute_input": "2020-07-17T02:12:47.735Z", "iopub.status.idle": "2020-07-17T02:12:47.740Z", "shell.execute_reply": "2020-07-17T02:12:47.903Z" } } }, { "cell_type": "code", "source": [ "cols = [\n", " 'street_address',\n", " 'request_detail',\n", " 'request_subcategory',\n", " 'case_status'\n", "]\n", "geojson = df_to_geojson(df_geo, cols)" ], "outputs": [ { "output_type": "execute_result", "execution_count": 14, "data": { "text/plain": "\"{'type': 'FeatureCollection', 'features': [{'type': 'Feature', 'properties': {'street_address': '1611 Sixty-Second St', 'request_detail': 'Blue or Red Zone Install', 'request_subcategory': 'Parking', 'case_status': 'Closed'}, 'geometry': {'type': 'Point', 'coordinates': [-122.27500398, 37.84668714]}}, {'type': 'Feature', 'properties': {'street_address': '1533 Harmon St', 'request_detail': 'Illegal Dumping - City Property', 'request_subcategory': 'Clean City Program', 'case_status': 'Closed'}, 'geometry': {'type': 'Point', 'coordinates': [-122.27659745, 37.84927276]}}, {'type': 'Feature', 'properties': {'street_address': '1235 Carrison St', 'request_detail': 'Residential Service Stop', 'request_subcategory': 'Residential', 'case_status': 'Closed'}, 'geometry': {'type': 'Point', 'coordinates': [-122.28482558, 37.8519337]}}, {'type': 'Feature', 'properties': {'street_address': '2515 Mathews St', 'request_detail': 'Residential Missed Pickup Integration', 'request_subcategory': 'Residential', 'case_status': 'Closed'}, 'geometry': {'type': 'Point', 'coordinates': [-122.28693591, 37.86095284]}}, {'type': 'Feature', 'properties': {'street_address': '2315 Durant Ave Parkg', 'request_detail': 'Miscellaneous Service Request', 'request_subcategory': 'Miscellaneous', 'case_status': 'Closed'}, 'geometry': {'type': 'Point', 'coordinates': [-122.26229585, 37.86767849]}}, {'type': 'Feature', 'properties': {'street_address': '2905 Shattuck Ave', 'request_detail': 'Miscellaneous Service Request', 'request_subcategory': 'Miscellaneous', 'case_status': 'Closed'}, 'geometry': {'type': 'Point', 'coordinates': [-122.26633357, 37.8564289]}}, {'type': 'Feature', 'properties': {'street_address': '3300 M L King Jr Way', 'request_detail': 'Illegal Dumping - City Property', 'request_subcategory': 'Clean City Program', 'case_status': 'Closed'}, 'geometry': {'type': 'Point', 'coordinates': [-122.27096909, 37.84816597]}}, {'type': 'Feature', 'properties': {'street_address': '3079 Bateman St', 'request_detail': 'Miscellaneous Internet Request', 'request_subcategory': 'Miscellaneous', 'case_status': 'Closed'}, 'geometry': {'type': 'Point', 'coordinates': [-122.25558886, 37.85382983]}}, {'type': 'Feature', 'properties': {'street_address': '2635 Ashby Ave', 'request_detail': 'Miscellaneous Internet Request', 'request_subcategory': 'Miscellaneous', 'case_status': 'Closed'}, 'geometry': {'type': 'Point', 'coordinates': [-122.25364902, 37.85722757]}}, {'type': 'Feature', 'properties': {'street_address': '2921 Fulton St', 'request_detail': 'Cart Repair', 'request_subcategory': 'Request', 'case_status': 'Closed'}, 'geometry': {'type': 'Point', 'coordinates': [-122.2632558, 37.85632234]}}, {'type': 'Feature', 'properties': {'street_address': '1530 Bancroft Way', 'request_detail': 'Recycling - City of Berkeley', 'request_subcategory': 'Request', 'case_status': 'Closed'}, 'geometry': {'type': 'Point', 'coordinates': [-122.28011788, 37.86586124]}}, {'type': 'Feature', 'properties': {'street_address': '1709 Alcatraz Ave', 'request_detail': 'Commercial Site Inspection', 'request_subcategory': 'Commercial', 'case_status': 'Closed'}, 'geometry': {'type': 'Point', 'coordinates': [-122.27327879, 37.84896425]}}, {'type': 'Feature', 'properties': {'street_address': '1744 Alcatraz Ave', 'request_detail': 'Residential Site Inspection', 'request_subcategory': 'Residential', 'case_status': 'Closed'}, 'geometry': {'type': 'Point', 'coordinates': [-122.27217761, 37.84846343]}}, {'type': 'Feature', 'properties': {'street_address': '1850 Fourth St', 'request_detail': 'Commercial Lost or Stolen Cart', 'request_subcategory': 'Commercial', 'case_status': 'Closed'}, 'geometry': {'type': 'Point', 'coordinates': [-122.30061859, 37.86903492]}}, {'type': 'Feature', 'properties': {'street_address': '1031 Bancroft Way', 'request_detail': 'Residential Bulky Pickup', 'request_subcategory': 'Residential', 'case_status': 'Open'}, 'geometry': {'type': 'Point', 'coordinates': [-122.29146406, 37.86468716]}}, {'type': 'Feature', 'properties': {'street_address': '1475 Shattuck Ave', 'request_detail': 'Commercial Reminder', 'request_subcategory': 'Commercial', 'case_status': 'Closed'}, 'geometry': {'type': 'Point', 'coordinates': [-122.26893005, 37.88071914]}}, {'type': 'Feature', 'properties': {'street_address': '3132 M L King Jr Way', 'request_detail': 'Commercial Reminder', 'request_subcategory': 'Commercial', 'case_status': 'Closed'}, 'geometry': {'type': 'Point', 'coordinates': [-122.27117109, 37.85114072]}}, {'type': 'Feature', 'properties': {'street_address': '1005 Camelia St', 'request_detail': 'Recycling - City of Berkeley', 'request_subcategory': 'Request', 'case_status': 'Closed'}, 'geometry': {'type': 'Point', 'coordinates': [-122.29710072, 37.87889918]}}, {'type': 'Feature', 'properties': {'street_address': '2840 Eighth St', 'request_detail': 'Roll Off Bin', 'request_subcategory': 'Request', 'case_status': 'Closed'}, 'geometry': {'type': 'Point', 'coordinates': [-122.29122018, 37.85387368]}}, {'type': 'Feature', 'properties': {'street_address': '1102 Sixth St', 'request_detail': 'Encampment Complaint', 'request_subcategory': 'Inquiry', 'case_status': 'Closed'}, 'geometry': {'type': 'Point', 'coordinates': [-122.30259371, 37.88210128]}}]}\"" }, "metadata": {} } ], "execution_count": 14, "metadata": { "collapsed": true, "execution": { "iopub.status.busy": "2020-07-17T02:16:56.030Z", "iopub.execute_input": "2020-07-17T02:16:56.035Z", "iopub.status.idle": "2020-07-17T02:16:56.052Z", "shell.execute_reply": "2020-07-17T02:16:56.069Z" } } }, { "cell_type": "markdown", "source": [ "In [nteract](https://github.com/nteract/nteract), we can display geojson directly with the built-in leaflet renderer." ], "metadata": {} }, { "cell_type": "code", "source": [ "import IPython\n", "IPython.display.display({'application/geo+json': geojson}, raw=True)" ], "outputs": [ { "output_type": "display_data", "data": { "application/geo+json": { "type": "FeatureCollection", "features": [ { "type": "Feature", "properties": { "street_address": "1611 Sixty-Second St", "request_detail": "Blue or Red Zone Install", "request_subcategory": "Parking", "case_status": "Closed" }, "geometry": { "type": "Point", "coordinates": [ -122.27500398, 37.84668714 ] } }, { "type": "Feature", "properties": { "street_address": "1533 Harmon St", "request_detail": "Illegal Dumping - City Property", "request_subcategory": "Clean City Program", "case_status": "Closed" }, "geometry": { "type": "Point", "coordinates": [ -122.27659745, 37.84927276 ] } }, { "type": "Feature", "properties": { "street_address": "1235 Carrison St", "request_detail": "Residential Service Stop", "request_subcategory": "Residential", "case_status": "Closed" }, "geometry": { "type": "Point", "coordinates": [ -122.28482558, 37.8519337 ] } }, { "type": "Feature", "properties": { "street_address": "2515 Mathews St", "request_detail": "Residential Missed Pickup Integration", "request_subcategory": "Residential", "case_status": "Closed" }, "geometry": { "type": "Point", "coordinates": [ -122.28693591, 37.86095284 ] } }, { "type": "Feature", "properties": { "street_address": "2315 Durant Ave Parkg", "request_detail": "Miscellaneous Service Request", "request_subcategory": "Miscellaneous", "case_status": "Closed" }, "geometry": { "type": "Point", "coordinates": [ -122.26229585, 37.86767849 ] } }, { "type": "Feature", "properties": { "street_address": "2905 Shattuck Ave", "request_detail": "Miscellaneous Service Request", "request_subcategory": "Miscellaneous", "case_status": "Closed" }, "geometry": { "type": "Point", "coordinates": [ -122.26633357, 37.8564289 ] } }, { "type": "Feature", "properties": { "street_address": "3300 M L King Jr Way", "request_detail": "Illegal Dumping - City Property", "request_subcategory": "Clean City Program", "case_status": "Closed" }, "geometry": { "type": "Point", "coordinates": [ -122.27096909, 37.84816597 ] } }, { "type": "Feature", "properties": { "street_address": "3079 Bateman St", "request_detail": "Miscellaneous Internet Request", "request_subcategory": "Miscellaneous", "case_status": "Closed" }, "geometry": { "type": "Point", "coordinates": [ -122.25558886, 37.85382983 ] } }, { "type": "Feature", "properties": { "street_address": "2635 Ashby Ave", "request_detail": "Miscellaneous Internet Request", "request_subcategory": "Miscellaneous", "case_status": "Closed" }, "geometry": { "type": "Point", "coordinates": [ -122.25364902, 37.85722757 ] } }, { "type": "Feature", "properties": { "street_address": "2921 Fulton St", "request_detail": "Cart Repair", "request_subcategory": "Request", "case_status": "Closed" }, "geometry": { "type": "Point", "coordinates": [ -122.2632558, 37.85632234 ] } }, { "type": "Feature", "properties": { "street_address": "1530 Bancroft Way", "request_detail": "Recycling - City of Berkeley", "request_subcategory": "Request", "case_status": "Closed" }, "geometry": { "type": "Point", "coordinates": [ -122.28011788, 37.86586124 ] } }, { "type": "Feature", "properties": { "street_address": "1709 Alcatraz Ave", "request_detail": "Commercial Site Inspection", "request_subcategory": "Commercial", "case_status": "Closed" }, "geometry": { "type": "Point", "coordinates": [ -122.27327879, 37.84896425 ] } }, { "type": "Feature", "properties": { "street_address": "1744 Alcatraz Ave", "request_detail": "Residential Site Inspection", "request_subcategory": "Residential", "case_status": "Closed" }, "geometry": { "type": "Point", "coordinates": [ -122.27217761, 37.84846343 ] } }, { "type": "Feature", "properties": { "street_address": "1850 Fourth St", "request_detail": "Commercial Lost or Stolen Cart", "request_subcategory": "Commercial", "case_status": "Closed" }, "geometry": { "type": "Point", "coordinates": [ -122.30061859, 37.86903492 ] } }, { "type": "Feature", "properties": { "street_address": "1031 Bancroft Way", "request_detail": "Residential Bulky Pickup", "request_subcategory": "Residential", "case_status": "Open" }, "geometry": { "type": "Point", "coordinates": [ -122.29146406, 37.86468716 ] } }, { "type": "Feature", "properties": { "street_address": "1475 Shattuck Ave", "request_detail": "Commercial Reminder", "request_subcategory": "Commercial", "case_status": "Closed" }, "geometry": { "type": "Point", "coordinates": [ -122.26893005, 37.88071914 ] } }, { "type": "Feature", "properties": { "street_address": "3132 M L King Jr Way", "request_detail": "Commercial Reminder", "request_subcategory": "Commercial", "case_status": "Closed" }, "geometry": { "type": "Point", "coordinates": [ -122.27117109, 37.85114072 ] } }, { "type": "Feature", "properties": { "street_address": "1005 Camelia St", "request_detail": "Recycling - City of Berkeley", "request_subcategory": "Request", "case_status": "Closed" }, "geometry": { "type": "Point", "coordinates": [ -122.29710072, 37.87889918 ] } }, { "type": "Feature", "properties": { "street_address": "2840 Eighth St", "request_detail": "Roll Off Bin", "request_subcategory": "Request", "case_status": "Closed" }, "geometry": { "type": "Point", "coordinates": [ -122.29122018, 37.85387368 ] } }, { "type": "Feature", "properties": { "street_address": "1102 Sixth St", "request_detail": "Encampment Complaint", "request_subcategory": "Inquiry", "case_status": "Closed" }, "geometry": { "type": "Point", "coordinates": [ -122.30259371, 37.88210128 ] } } ] } }, "metadata": {} } ], "execution_count": 13, "metadata": { "collapsed": false, "execution": { "iopub.status.busy": "2020-07-17T02:14:50.321Z", "iopub.execute_input": "2020-07-17T02:14:50.329Z", "iopub.status.idle": "2020-07-17T02:14:50.465Z", "shell.execute_reply": "2020-07-17T02:14:50.470Z" } } }, { "cell_type": "markdown", "source": [ "_Known temporary [issue](https://github.com/nteract/nteract/issues/2034): leaflet fails to render_" ], "metadata": {} } ], "metadata": { "LICENSE": "MIT License\n\nCopyright (c) 2016 Geoff Boeing, http://geoffboeing.com\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.", "kernel_info": { "name": "python3" }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "name": "python", "version": "3.7.7", "mimetype": "text/x-python", "codemirror_mode": { "name": "ipython", "version": 3 }, "pygments_lexer": "ipython3", "nbconvert_exporter": "python", "file_extension": ".py" }, "nteract": { "version": "0.24.0" }, "title": "Go from Pandas to GeoJSON" }, "nbformat": 4, "nbformat_minor": 0 }