{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"## Assignment 9: Geographic visualization with Python\n",
"\n",
"In this assignment, the second-to-last of the semester, we will venture outside of the Python ecosystem to create interactive maps with Mapbox Studio. You'll be making an interactive map of public intoxication violations in Fort Worth, drawn from the same City crime dataset you worked with in Assignment 7. You'll explore methods for visualizing large amounts of point data and customize your map's appearance. \n",
"\n",
"In this instance, you are going to upload a bulk CSV download that I've prepared for you from the City's open data catalog. The data as currently constituted are not quite ready for use in Mapbox Studio, so we're going to clean them up a bit in `pandas` first. \n",
"\n",
"Mapbox Studio accepts data in many formats, such as CSV, GeoJSON, and zipped shapefile. In this instance, our data are in CSV format. Let's read the data into `pandas` and have a look. "
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
" Case and Offense Case Number Reported Date Nature Of Call \\\n",
"0 200025542-90E 200025542 03/28/2020 PUBLIC INTOXICATION \n",
"1 200025291-90E 200025291 03/27/2020 PUBLIC INTOXICATION \n",
"2 200025472-35A 200025472 03/28/2020 PUBLIC INTOXICATION \n",
"3 200025309-90E 200025309 03/27/2020 PUBLIC INTOXICATION \n",
"4 200025525-90E 200025525 03/28/2020 PUBLIC INTOXICATION \n",
"\n",
" From Date Offense \\\n",
"0 03/28/2020 11:50:00 PM 90E \n",
"1 03/27/2020 08:29:28 PM 90E \n",
"2 03/28/2020 04:41:22 PM 35A \n",
"3 03/27/2020 09:11:59 PM 90E \n",
"4 03/28/2020 10:54:17 PM 90E \n",
"\n",
" Description \\\n",
"0 GC 080 Public Intoxication 90E DRUNKENNESS 000... \n",
"1 GC 080 Public Intoxication 90E DRUNKENNESS 000... \n",
"2 HSC 481.117(B) Poss CS PG 3 <28G 35A DRUG/NARC... \n",
"3 GC 080 Public Intoxication 90E DRUNKENNESS 000... \n",
"4 GC 080 Public Intoxication 90E DRUNKENNESS 000... \n",
"\n",
" Block Address City State Beat Division \\\n",
"0 200 NE 28TH ST FORT WORTH TX C11 Northwest \n",
"1 14300 CENTREPORT LANDING CIR FORT WORTH TX H17 NaN \n",
"2 5400 ENCLAVE CIR FORT WORTH TX K18 NaN \n",
"3 3300 W BOYCE AVE FORT WORTH TX J12 South \n",
"4 100 E BOLT ST FORT WORTH TX I11 NaN \n",
"\n",
" Council District Attempt Complete Location Type Location Description \\\n",
"0 2.0 C 18 18 PARKING LOT/GARAGE \n",
"1 NaN C 20 20 RESIDENCE/HOME \n",
"2 NaN C 15 15 JAIL/PRISON \n",
"3 9.0 C 20 20 RESIDENCE/HOME \n",
"4 NaN C 13 13 HIGHWAY/ROAD/ALLEY \n",
"\n",
" Location \n",
"0 (32.79512267527676, -97.34736968390531) \n",
"1 (32.81901915935712, -97.05732878148257) \n",
"2 (32.67383727899393, -97.41089914948881) \n",
"3 (32.68139719413715, -97.36576854742505) \n",
"4 (32.68865080126258, -97.3265765609558) "
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import pandas as pd\n",
"\n",
"# Modify the path below to your data path\n",
"df = pd.read_csv(\"data/fort_worth_crime_extract.csv\")\n",
"\n",
"df.head()"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"Scroll to the right-hand side of your data frame to view the `Location` column. You'll notice that it is formatted as latitude/longitude coordinate pairs, separated by a comma and enclosed in parentheses. Additionally, some of the latitude/longitude values are missing. "
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"Location\n",
"False 1979\n",
"True 44\n",
"Name: count, dtype: int64"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"null_rows = pd.isnull(df.Location)\n",
"\n",
"null_rows.value_counts()"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"It looks like 44 rows in our dataset are missing latitude and longitude information. In a regular data analysis workflow, you'd want to do some investigation of these missing values, and ultimately see if there is any systematic bias there. Another option would be to try to locate as many of these missing values as possible. As our focus here is learning mapping in Mapbox Studio, however, let's forge ahead. \n",
"\n",
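"As a starting point for that kind of check, here is a minimal sketch that is not part of the assignment itself. It uses a small made-up data frame (`toy`, with invented values) in place of the real data, and assumes you want to compare the share of missing locations across police beats:\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# Hypothetical stand-in for the crime data; these values are invented\n",
"toy = pd.DataFrame({\n",
"    'Beat': ['C11', 'C11', 'H17', 'K18'],\n",
"    'Location': ['(32.79, -97.34)', None, None, '(32.67, -97.41)']\n",
"})\n",
"\n",
"# Flag the rows without coordinates\n",
"toy['missing'] = toy['Location'].isna()\n",
"\n",
"# Share of rows missing a location within each beat; large differences\n",
"# between beats would hint at systematic bias\n",
"share_missing = toy.groupby('Beat')['missing'].mean()\n",
"\n",
"# Keep only the rows that can be placed on a map\n",
"mappable = toy.dropna(subset = ['Location'])\n",
"```\n",
"\n",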
"To map CSV data, Mapbox Studio requires that one column contain the longitude coordinates, and one column contain the latitude coordinates. As such, we'll need to get rid of the parentheses in the column, then split the column into two columns, and finally add those columns back to our data frame. Study the code below to examine what it does: "
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
" Case and Offense Case Number Reported Date Nature Of Call \\\n",
"0 200025542-90E 200025542 03/28/2020 PUBLIC INTOXICATION \n",
"1 200025291-90E 200025291 03/27/2020 PUBLIC INTOXICATION \n",
"2 200025472-35A 200025472 03/28/2020 PUBLIC INTOXICATION \n",
"3 200025309-90E 200025309 03/27/2020 PUBLIC INTOXICATION \n",
"4 200025525-90E 200025525 03/28/2020 PUBLIC INTOXICATION \n",
"\n",
" From Date Offense \\\n",
"0 03/28/2020 11:50:00 PM 90E \n",
"1 03/27/2020 08:29:28 PM 90E \n",
"2 03/28/2020 04:41:22 PM 35A \n",
"3 03/27/2020 09:11:59 PM 90E \n",
"4 03/28/2020 10:54:17 PM 90E \n",
"\n",
" Description \\\n",
"0 GC 080 Public Intoxication 90E DRUNKENNESS 000... \n",
"1 GC 080 Public Intoxication 90E DRUNKENNESS 000... \n",
"2 HSC 481.117(B) Poss CS PG 3 <28G 35A DRUG/NARC... \n",
"3 GC 080 Public Intoxication 90E DRUNKENNESS 000... \n",
"4 GC 080 Public Intoxication 90E DRUNKENNESS 000... \n",
"\n",
" Block Address City State Beat Division \\\n",
"0 200 NE 28TH ST FORT WORTH TX C11 Northwest \n",
"1 14300 CENTREPORT LANDING CIR FORT WORTH TX H17 NaN \n",
"2 5400 ENCLAVE CIR FORT WORTH TX K18 NaN \n",
"3 3300 W BOYCE AVE FORT WORTH TX J12 South \n",
"4 100 E BOLT ST FORT WORTH TX I11 NaN \n",
"\n",
" Council District Attempt Complete Location Type Location Description \\\n",
"0 2.0 C 18 18 PARKING LOT/GARAGE \n",
"1 NaN C 20 20 RESIDENCE/HOME \n",
"2 NaN C 15 15 JAIL/PRISON \n",
"3 9.0 C 20 20 RESIDENCE/HOME \n",
"4 NaN C 13 13 HIGHWAY/ROAD/ALLEY \n",
"\n",
" Location latitude \\\n",
"0 32.79512267527676, -97.34736968390531 32.79512267527676 \n",
"1 32.81901915935712, -97.05732878148257 32.81901915935712 \n",
"2 32.67383727899393, -97.41089914948881 32.67383727899393 \n",
"3 32.68139719413715, -97.36576854742505 32.68139719413715 \n",
"4 32.68865080126258, -97.3265765609558 32.68865080126258 \n",
"\n",
" longitude \n",
"0 -97.34736968390531 \n",
"1 -97.05732878148257 \n",
"2 -97.41089914948881 \n",
"3 -97.36576854742505 \n",
"4 -97.3265765609558 "
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Loop through the characters we want to remove from the strings, and remove them in turn;\n",
"# `regex = False` treats each parenthesis as a literal character, not a regular expression\n",
"for char in ['(', ')']: \n",
"    df['Location'] = df.Location.str.replace(char, '', regex = False)\n",
"\n",
"# Split our `Location` column at the comma with the `.str.split()` method;\n",
"# the `expand = True` argument returns a new data frame\n",
"locs = df.Location.str.split(',', expand = True)\n",
"\n",
"locs.columns = ['latitude', 'longitude']\n",
"\n",
"# Add these new columns to our existing data frame with the\n",
"# `pd.concat` method, to which we pass a list of data frames;\n",
"# specifying `axis = 1` makes sure that we combine columns, not\n",
"# rows.\n",
"df2 = pd.concat([df, locs], axis = 1)\n",
"\n",
"df2.head()"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"Success! We can now write our location-aware data frame to a CSV, for use in Mapbox Studio. "
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false
},
"outputs": [
{
"ename": "CRSError",
"evalue": "Invalid projection: EPSG:4326: (Internal Proj Error: proj_create: no database context specified)",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mCRSError\u001b[0m Traceback (most recent call last)",
"Cell \u001b[0;32mIn[5], line 3\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[38;5;28;01mimport\u001b[39;00m \u001b[38;5;21;01mgeopandas\u001b[39;00m \u001b[38;5;28;01mas\u001b[39;00m \u001b[38;5;21;01mgp\u001b[39;00m\n\u001b[0;32m----> 3\u001b[0m geo_df \u001b[38;5;241m=\u001b[39m \u001b[43mgp\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mGeoDataFrame\u001b[49m\u001b[43m(\u001b[49m\u001b[43mdata\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[43mdf2\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mgeometry\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[43mgp\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mpoints_from_xy\u001b[49m\u001b[43m(\u001b[49m\u001b[43mdf2\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mlongitude\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mdf2\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mlatitude\u001b[49m\u001b[43m)\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mcrs\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m \u001b[49m\u001b[38;5;241;43m4326\u001b[39;49m\u001b[43m)\u001b[49m\n\u001b[1;32m 5\u001b[0m geo_df\u001b[38;5;241m.\u001b[39mhead()\n",
"File \u001b[0;32m~/anaconda3/envs/geospatial/lib/python3.11/site-packages/geopandas/geodataframe.py:192\u001b[0m, in \u001b[0;36mGeoDataFrame.__init__\u001b[0;34m(self, data, geometry, crs, *args, **kwargs)\u001b[0m\n\u001b[1;32m 184\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m (\n\u001b[1;32m 185\u001b[0m \u001b[38;5;28mhasattr\u001b[39m(geometry, \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mcrs\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[1;32m 186\u001b[0m \u001b[38;5;129;01mand\u001b[39;00m geometry\u001b[38;5;241m.\u001b[39mcrs\n\u001b[1;32m 187\u001b[0m \u001b[38;5;129;01mand\u001b[39;00m crs\n\u001b[1;32m 188\u001b[0m \u001b[38;5;129;01mand\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m geometry\u001b[38;5;241m.\u001b[39mcrs \u001b[38;5;241m==\u001b[39m crs\n\u001b[1;32m 189\u001b[0m ):\n\u001b[1;32m 190\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mValueError\u001b[39;00m(crs_mismatch_error)\n\u001b[0;32m--> 192\u001b[0m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mset_geometry\u001b[49m\u001b[43m(\u001b[49m\u001b[43mgeometry\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43minplace\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43;01mTrue\u001b[39;49;00m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mcrs\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mcrs\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 194\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m geometry \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m \u001b[38;5;129;01mand\u001b[39;00m crs:\n\u001b[1;32m 195\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mValueError\u001b[39;00m(\n\u001b[1;32m 196\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mAssigning CRS to a GeoDataFrame without a geometry column is not \u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 197\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124msupported. Supply geometry using the \u001b[39m\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mgeometry=\u001b[39m\u001b[38;5;124m'\u001b[39m\u001b[38;5;124m keyword argument, \u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 198\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mor by providing a DataFrame with column name \u001b[39m\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mgeometry\u001b[39m\u001b[38;5;124m'\u001b[39m\u001b[38;5;124m\"\u001b[39m,\n\u001b[1;32m 199\u001b[0m )\n",
"File \u001b[0;32m~/anaconda3/envs/geospatial/lib/python3.11/site-packages/geopandas/geodataframe.py:346\u001b[0m, in \u001b[0;36mGeoDataFrame.set_geometry\u001b[0;34m(self, col, drop, inplace, crs)\u001b[0m\n\u001b[1;32m 343\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(level, (GeoSeries, GeometryArray)) \u001b[38;5;129;01mand\u001b[39;00m level\u001b[38;5;241m.\u001b[39mcrs \u001b[38;5;241m!=\u001b[39m crs:\n\u001b[1;32m 344\u001b[0m \u001b[38;5;66;03m# Avoids caching issues/crs sharing issues\u001b[39;00m\n\u001b[1;32m 345\u001b[0m level \u001b[38;5;241m=\u001b[39m level\u001b[38;5;241m.\u001b[39mcopy()\n\u001b[0;32m--> 346\u001b[0m \u001b[43mlevel\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mcrs\u001b[49m \u001b[38;5;241m=\u001b[39m crs\n\u001b[1;32m 348\u001b[0m \u001b[38;5;66;03m# Check that we are using a listlike of geometries\u001b[39;00m\n\u001b[1;32m 349\u001b[0m level \u001b[38;5;241m=\u001b[39m _ensure_geometry(level, crs\u001b[38;5;241m=\u001b[39mcrs)\n",
"File \u001b[0;32m~/anaconda3/envs/geospatial/lib/python3.11/site-packages/geopandas/array.py:360\u001b[0m, in \u001b[0;36mGeometryArray.crs\u001b[0;34m(self, value)\u001b[0m\n\u001b[1;32m 357\u001b[0m \u001b[38;5;129m@crs\u001b[39m\u001b[38;5;241m.\u001b[39msetter\n\u001b[1;32m 358\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mcrs\u001b[39m(\u001b[38;5;28mself\u001b[39m, value):\n\u001b[1;32m 359\u001b[0m \u001b[38;5;250m \u001b[39m\u001b[38;5;124;03m\"\"\"Sets the value of the crs\"\"\"\u001b[39;00m\n\u001b[0;32m--> 360\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_crs \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mNone\u001b[39;00m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m value \u001b[38;5;28;01melse\u001b[39;00m \u001b[43mCRS\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mfrom_user_input\u001b[49m\u001b[43m(\u001b[49m\u001b[43mvalue\u001b[49m\u001b[43m)\u001b[49m\n",
"File \u001b[0;32m~/anaconda3/envs/geospatial/lib/python3.11/site-packages/pyproj/crs/crs.py:501\u001b[0m, in \u001b[0;36mCRS.from_user_input\u001b[0;34m(cls, value, **kwargs)\u001b[0m\n\u001b[1;32m 499\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(value, \u001b[38;5;28mcls\u001b[39m):\n\u001b[1;32m 500\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m value\n\u001b[0;32m--> 501\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;43mcls\u001b[39;49m\u001b[43m(\u001b[49m\u001b[43mvalue\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n",
"File \u001b[0;32m~/anaconda3/envs/geospatial/lib/python3.11/site-packages/pyproj/crs/crs.py:348\u001b[0m, in \u001b[0;36mCRS.__init__\u001b[0;34m(self, projparams, **kwargs)\u001b[0m\n\u001b[1;32m 346\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_local\u001b[38;5;241m.\u001b[39mcrs \u001b[38;5;241m=\u001b[39m projparams\n\u001b[1;32m 347\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[0;32m--> 348\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_local\u001b[38;5;241m.\u001b[39mcrs \u001b[38;5;241m=\u001b[39m \u001b[43m_CRS\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43msrs\u001b[49m\u001b[43m)\u001b[49m\n",
"File \u001b[0;32m~/anaconda3/envs/geospatial/lib/python3.11/site-packages/pyproj/_crs.pyx:2378\u001b[0m, in \u001b[0;36mpyproj._crs._CRS.__init__\u001b[0;34m()\u001b[0m\n",
"\u001b[0;31mCRSError\u001b[0m: Invalid projection: EPSG:4326: (Internal Proj Error: proj_create: no database context specified)"
]
}
],
"source": [
"import geopandas as gp\n",
"\n",
"geo_df = gp.GeoDataFrame(data = df2, geometry = gp.points_from_xy(df2.longitude, df2.latitude), crs = 4326)\n",
"\n",
"geo_df.head()"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"SYSTEM INFO\n",
"-----------\n",
"python : 3.11.6 | packaged by conda-forge | (main, Oct 3 2023, 10:37:07) [Clang 15.0.7 ]\n",
"executable : /Users/kylewalker/anaconda3/envs/geospatial/bin/python\n",
"machine : macOS-13.3-arm64-arm-64bit\n",
"\n",
"GEOS, GDAL, PROJ INFO\n",
"---------------------\n",
"GEOS : 3.12.0\n",
"GEOS lib : None\n",
"GDAL : 3.7.3\n",
"GDAL data dir: /Users/kylewalker/anaconda3/envs/geospatial/share/gdal\n",
"PROJ : 9.3.0\n",
"PROJ data dir: /Users/kylewalker/anaconda3/share/proj\n",
"\n",
"PYTHON DEPENDENCIES\n",
"-------------------\n",
"geopandas : 0.14.0\n",
"numpy : 1.26.0\n",
"pandas : 2.1.2\n",
"pyproj : 3.6.1\n",
"shapely : 2.0.2\n",
"fiona : 1.9.5\n",
"geoalchemy2: None\n",
"geopy : None\n",
"matplotlib : 3.8.1\n",
"mapclassify: 2.6.1\n",
"pygeos : None\n",
"pyogrio : None\n",
"psycopg2 : None\n",
"pyarrow : None\n",
"rtree : 1.1.0\n"
]
}
],
"source": [
"gp.show_versions()"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"To create the file, write the data frame out in a new cell, for example with `df2.to_csv('intoxication.csv', index = False)`. When you run that cell, `pandas` will write a CSV file named `intoxication.csv` in your top-level directory in Colab. To find it, click the folder icon and expand the file's options; you'll get an option to download the file to your computer, which should default to your Downloads directory. \n",
"\n",
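"A minimal sketch of the export step follows; it is an illustration rather than the assignment's own cell. It builds a one-row stand-in for `df2` with values copied from the first row shown above, and the `index = False` argument is an assumption here, chosen so that the row index is not written as an extra column:\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# Hypothetical one-row stand-in for the cleaned data frame\n",
"df2 = pd.DataFrame({\n",
"    'Offense': ['90E'],\n",
"    'latitude': ['32.79512267527676'],\n",
"    'longitude': ['-97.34736968390531']\n",
"})\n",
"\n",
"# Write the data frame to a CSV file in the current working directory\n",
"df2.to_csv('intoxication.csv', index = False)\n",
"\n",
"# Read the file back in to confirm the coordinate columns survived\n",
"check = pd.read_csv('intoxication.csv')\n",
"```\n",
"\n",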
"Let's now head over to Mapbox Studio at http://studio.mapbox.com. \n",
"\n",
"## Setting up a Style in Mapbox Studio\n",
"\n",
"If you don't yet have a Mapbox Studio account, go ahead and sign up for one by clicking the \"Sign up for Mapbox\" link from the \"Sign in\" screen. All you need to do is specify a username, email address, and password and agree to the Terms of Service, and you are good to go. Once you've established your account, you'll be taken to the Mapbox Studio splash screen, which defaults to the \"Styles\" tab. You won't see any styles as you haven't created any yet. Click the \"New style\" button to get started. \n",
"\n",
"\n",
"\n",
"View the different templates available to you. Given that we are plotting data over a basemap layer, I'd recommend using a template with more muted colors; I have a preference for the \"Monochrome - Light\" version given the way we'll be visualizing our data but I'll let you choose the template you like best. Once you've selected a basemap template, click the \"Customize\" button to advance to the Mapbox Studio editor. \n",
"\n",
"\n",
"\n",
"In the map editor, pan over to the Fort Worth area and zoom in and out. Mapbox Studio gives you access to _vector tiles_, which are variants of the tiled mapping structure we've discussed in class but which expose the underlying data rather than simply showing images. This allows for the customization of any component of the map. You'll see a series of such \"components\" on the left-hand side of the screen. They are organized into categories which you can explore and edit collectively. Alternatively, you can view major categories and style them by clicking the \"Colors\" option in the lower left section of the screen and choosing a map element, then selecting a color from the interactive picker. You can also click the \"Layers\" tab to view all of the individual map layers, and edit them one-by-one; you may need to \"override\" the color to do so. Below, I show an example of the \"Monochrome - Light\" template where I've styled water features with a light blue and park features with a light green. Experiment and try designing a map yourself!\n",
"\n",
"![customized map](img/map1.png)"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"## Loading new data into Mapbox Studio\n",
"\n",
"Now that you've spent some time exploring Mapbox Studio and customizing your map, let's add the public drunkenness data to the map as well to visualize it. Follow these steps to upload your data and add it to the map: \n",
"\n",
"1. Click \"Layers\" on the left-hand side of the screen and look for the plus sign icon. This will bring up the \"New layer\" menu. \n",
"2. Look for the \"Upload data\" option and click there. Locate your intoxication CSV file and drag and drop it on the dialog box, or click on \"Select a file\" and navigate to it on your computer. You'll see the \"New tileset\" dialog appear; click __Confirm__ if the information is correct.\n",
"3. You'll see a \"Notifications\" box appear in the lower left corner of the screen, letting you know that your upload is processing. Mapbox Studio is converting your public intoxication dataset into vector tiles, which is the same format that is used by the rest of the layers that you've been working with. This will allow you to display your data along with those other layers on the map.\n",
"4. Once your upload has finished processing, scroll down the list of \"Sources\" to the bottom. You should see your intoxication tile source appear as an option. Click there to reveal the point layer within the tileset, then click the revealed layer to add it to your map. You should see your points added to your map. \n",
"\n",
"\n",
"\n",
" \n",
" \n",
"\n",
"Once you've successfully added your tileset, click \"Style\" as instructed on the screen to style your layer. "
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"## Modifying visualization options for your tileset\n",
"\n",
"By default, your points will show up as black dots on top of your basemap. While this does show clusters of violations around the city, it is not the most visually pleasing representation of your data. Let's make some modifications to the data representation to show it in an alternative way. Feel free to experiment with different style options; you can modify the circle radius, the color, blur and opacity, and other characteristics. \n",
"\n",
"You are not limited to styling your point data as circles, however. Click the \"Select data\" option once more (ensuring first that your intoxication layer is still selected) and click the __Type__ option. From the options that appear, choose __Heatmap__. Now, click __Style__ once more to style your data as a heatmap. You'll note that your data no longer appears as points, but as a smoothed representation of where your points are concentrated. My default view is the image below; yours may appear slightly different depending on how far you are currently zoomed in.\n",
"\n",
"\n",
"\n",
"The visualization defaults to a blue-to-red representation based on five colors, which each represent different density values. The highest density value is scaled to 1, which is bright red; the lowest color value is 0.1 by default, colored blue. Areas without violations are represented without color. Try zooming in and out on your map to see how the heatmap changes based on zoom level. \n",
"\n",
"There are several ways we can make the map more legible, however; I'll outline a few here. \n",
"\n",
"1. When zoomed out to show the entire city, the high-density areas cover most of the urban core. This limits our ability to show differentiation on the map. To modify this, change the __Radius__ of your heatmap from the default 30 pixels and make it smaller. Experiment with different radii and observe how the data representation changes on your map.\n",
"\n",
"2. The default color representation, while mapping to perceptions of \"hot\" and \"cold\", is not colorblind-safe. Make some changes to the color values to use a colorblind-safe palette. I recommend the __viridis__ family of color palettes, which are available through __seaborn__ for quick lookup. There are five palettes in the viridis family: `viridis`, `magma`, `inferno`, `plasma`, and `cividis`. All are colorblind-safe, perceptually uniform, and safe to print in black and white. Let's load in the palettes and take a look at them:\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import seaborn as sns\n",
"\n",
"viridis = sns.color_palette(\"viridis\", 5).as_hex()\n",
"inferno = sns.color_palette(\"inferno\", 5).as_hex()\n",
"magma = sns.color_palette(\"magma\", 5).as_hex()\n",
"plasma = sns.color_palette(\"plasma\", 5).as_hex()\n",
"cividis = sns.color_palette(\"cividis\", 5).as_hex()"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"I've assigned five-color extracts from the palettes and converted them to hexadecimal representation, which is a common way to represent colors and can be used in Mapbox Studio. Let's take a look at each palette in turn:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"['#443983', '#31688e', '#21918c', '#35b779', '#90d743']\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAlEAAACQCAYAAAAoVz7mAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAABYlAAAWJQFJUiTwAAADvElEQVR4nO3csYpcZRiA4f+Mm7DDQorR7TeNaKOFhRdi6RV4L15ZwEIIYhW3sEmxIpGFSLL5bbyBfT0zxyHP00z1f3zwM4eXMzDLnHMAAPA4u60XAAA4RyIKACAQUQAAgYgCAAhEFABAIKIAAAIRBQAQiCgAgEBEAQAEIgoAIBBRAACBiAIACC7WHrgsy29jjGdjjNu1ZwMArOxmjPFmzvn8sQdXj6gxxrPd7snhan99OMJsTuD93gvKc/VwObdegf/g4vJh6xWIDk/vt16B6PWr+/Hu7Yd09hgRdXu1vz58+9UPRxjNKdx9fbX1CkR/ftkeBPw/fPb53dYrEH1/82LrFYh+/O7F+P2Xv27LWa8cAAACEQUAEIgoAIBARAEABCIKACAQUQAAgYgCAAhEFABAIKIAAAIRBQAQiCgAgEBEAQAEIgoAIBBRAACBiAIACEQUAEAgogAAAhEFABCIKACAQEQBAAQiCgAgEFEAAIGIAgAIRBQAQCCiAAACEQUAEIgoAIBARAEABCIKACAQUQAAgYgCAAhEFABAIKIAAAIRBQAQiCgAgEBEAQAEIgoAIBBRAACBiAIACEQUAEAgogAAAhEFABCIKACAQEQBAAQiCgAgEFEAAIGIAgAIRBQAQCCiAAACEQUAEIgoAIBARAEABCIKACAQUQAAgYgCAAhEFABAIKIAAAIRBQAQiCgAgEBEAQAEIgoAIBBRAACBiAIACEQUAEAgogAAAhEFABCIKACAQEQBAAQiCgAgEFEAAIGIAgAIRBQAQCCiAAACEQUAEIgoAIBARAEABCIKACAQUQAAgYgCAAhEFABAIKIAAAIRBQAQiCgAgEBEAQAEIgoAIBBRAACBiAIACEQUAEAgogAAAhEFABCIKACAQEQBAAQiCgAgEFEAAIGIAgAIRBQAQCCiAAACEQUAEIgoAIBARAEABCIKACAQUQAAgYgCAAhEFABAIKIAAAIRBQAQLHPOdQcuy91u9+Rwtb9edS6n836vrc/Vw+W632dO6+LyYesViA5P77degej1q/vx7u2HP+acnz727DEi6u8xxidjjJ9XHcypfPHv56+bbkHl/s6Xuztv7u983Ywx3sw5nz/24MX6u4yXY4wx5/zmCLM5smVZfhrD/Z0r93e+3N15c38fJ7/bAAAEIgoAIBBRAACBiAIACEQUAECw+l8cAAB8DLyJAgAIRBQAQCCiAAACEQUAEIgoAIBARAEABCIKACAQUQAAgYgCAAhEFABAIKIAAAIRBQAQiCgAgOAfCtlZGKiX8AUAAAAASUVORK5CYII=",
"text/plain": [
"