{
"cells": [
{
"cell_type": "markdown",
"id": "e48279b4-107e-4a36-8989-1f4c849e2725",
"metadata": {
"tags": []
},
"source": [
"# Visualizing the Drug Epidemic in San Francisco "
]
},
{
"cell_type": "markdown",
"id": "fbd71f29-7203-4ed0-af53-5c0cfcd7a5dd",
"metadata": {},
"source": [
"### Introduction"
]
},
{
"cell_type": "markdown",
"id": "92b3e929-0216-4592-ad3f-2ee12a533622",
"metadata": {},
"source": [
"
San Francisco, a city once celebrated for its vibrant and diverse cultures, has faced numerous challenges in recent years, particularly with regards to drug addiction and crime.
"
]
},
{
"cell_type": "markdown",
"id": "500f9663-e075-4c27-9d88-3a0dffc85bdb",
"metadata": {},
"source": [
"Drug overdose in San Francisco has been driven largely by the proliferation of the synthetic opioid fentanyl, according to the [SF Chronicle](https://www.sfchronicle.com/sf/article/S-F-is-investing-millions-in-overdose-response-16491976.php). On Dec. 17, 2021, Mayor London Breed declared an official state of emergency in the Tenderloin district to address the escalating epidemic."
]
},
{
"cell_type": "markdown",
"id": "c3561561-d5af-4d90-893a-4aaae3bb4e53",
"metadata": {},
"source": [
"\n",
"In response to these issues, the city has implemented several strategies aimed at addressing drug addiction and its impact on the community. These strategies include increasing law enforcement efforts to combat illegal drug possession and sales, as well as deploying more prescription medications, such as Narcan, to reverse overdose and save lives."
]
},
{
"cell_type": "markdown",
"id": "e47468a4-21e4-47f9-be4e-6a270f70a4de",
"metadata": {},
"source": [
"> Narcan, also known as naloxone, is a prescription medication used to reverse opioid overdose. It works by binding to the same receptors in the brain that opioids bind to, reversing their effects and restoring breathing."
]
},
{
"cell_type": "markdown",
"id": "7bffab82-899a-4fef-9253-14d3e2450342",
"metadata": {
"tags": []
},
"source": [
"### Quantifiable questions"
]
},
{
"cell_type": "markdown",
"id": "6255427b-3280-45b4-a236-c398e2fec775",
"metadata": {
"tags": []
},
"source": [
"Using the [San Francisco police response records](https://data.sfgov.org/Public-Safety/Police-Department-Incident-Reports-2018-to-Present/wg3w-h783), we hope to answer the following questions:\n",
"\n",
"- Where do most drug-related arrests and Narcan deployments occur?\n",
"- How do the frequencies change over time?\n",
"- When do drug-related arrests happen? Mornings? Evenings? Late at night?\n",
"- What's the correlation between drug-related arrests and Narcan deployments?"
]
},
{
"cell_type": "markdown",
"id": "91ed78ad-293b-486c-958f-58836c4eda43",
"metadata": {
"tags": []
},
"source": [
"## Downloading and Preparing Data "
]
},
{
"cell_type": "markdown",
"id": "bbdc4d14-3f78-440c-a122-b0de7dec2c66",
"metadata": {},
"source": [
"The SFPD data can be downloaded by clicking `Export` - `CSV`:"
]
},
{
"cell_type": "markdown",
"id": "b54a5bf5-6817-43c1-ab87-9829289f53ce",
"metadata": {},
"source": [
""
]
},
{
"cell_type": "markdown",
"id": "09ffe9e7-eca3-49c4-a439-c4aa9dd91a5d",
"metadata": {},
"source": [
"Save the dataset to `data/raw`. "
]
},
{
"cell_type": "markdown",
"id": "b9767faf-d1e7-4c96-b842-29bb793f2b01",
"metadata": {
"tags": []
},
"source": [
"> Notice that since this file is big (over 200MB), before committing and pushing to github, make sure to add the line '/data' in `.gitignore` file so that we don't break the git!"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "7c83094f-c080-4f54-9a15-9465ad18ecd1",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# importing the modules \n",
"import csv\n",
"import os\n",
"import pandas as pd\n",
"import altair as alt\n",
"from altair import datum"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "b896ca80-c41a-4299-a649-bab40d51042a",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Creating portable path\n",
"data_dir = os.environ['DATA_DIR']\n",
"file_path = os.path.join(data_dir, 'raw/Police_Department_Incident_Reports__2018_to_Present.csv')"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "c3cca097-597f-4052-a5fc-ba5d84d5e7dd",
"metadata": {},
"outputs": [],
"source": [
"# Reading the file from local data/raw folder\n",
"sfpd_data = pd.read_csv(file_path)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "bde9f523-6d79-426b-bebd-a7914b46bb01",
"metadata": {},
"outputs": [],
"source": [
"# Filtering for narcan deployment incidents\n",
"narcan_deployment = sfpd_data[sfpd_data['Incident Code'] == 51050]"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "00952db8-bca7-4684-ba02-de96df4f2b09",
"metadata": {},
"outputs": [],
"source": [
"# Filtering for drug-related arrests\n",
"drug_arrest = sfpd_data[~sfpd_data['Incident Category'].isna()]\n",
"drug_arrest_data = drug_arrest[drug_arrest['Incident Category'].str.contains('Drug')]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7775692a-2db3-4b2b-bbc3-0e3e7d7617bf",
"metadata": {},
"outputs": [],
"source": [
"# Saving the files to the data/processed directory\n",
"narcan_deployment.to_csv(\"/Users/tracy/sfpd-data/data/processed/narcan_deployment.csv\")\n",
"drug_arrest_data.to_csv(\"/Users/tracy/sfpd-data/data/processed/drug_arrest_data.csv\")"
]
},
{
"cell_type": "markdown",
"id": "96ab5d12-7699-4ee1-ab3a-1b92ba14ccb0",
"metadata": {},
"source": [
"Now, load the filterd csv files and be ready to dive in!"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "b1eca270-0d6e-4f5a-a15e-7d0fddb57d8a",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Read dataframes\n",
"drug_arrest = pd.read_csv('/Users/tracy/code/sfpd_data/data/processed/drug_arrest_data.csv')\n",
"narcan_deploy = pd.read_csv('/Users/tracy/sfpd-data/data/processed/narcan_deployment.csv')\n",
"# Make the column names snake_case\n",
"drug_arrest.columns = [col.lower().replace(' ', '_') for col in drug_arrest.columns]\n",
"narcan_deploy.columns = [col.lower().replace(' ', '_') for col in narcan_deploy.columns]"
]
},
{
"cell_type": "markdown",
"id": "c6d33f22-7837-4f7b-8c30-8a57b372310d",
"metadata": {
"tags": []
},
"source": [
"## Drug Arrests and Narcan Deployment by Neighborhood"
]
},
{
"cell_type": "markdown",
"id": "db93a908-ca55-4af3-a00f-cbc5ee5d52b5",
"metadata": {
"tags": []
},
"source": [
"### Finding #1\n",
"Our analysis found that most drug-related arrests occur in the **Tenderloin, South of Market (SoMa), and Mission** neighborhoods. Located near each other in downtown San Francisco, the two neighborhoods have undergone significant changes following rapid urban development and gentrification in recent years. Both suffer from a high-rate of poverty and crime."
]
},
{
"cell_type": "markdown",
"id": "489160fd-7f27-413a-89d1-72e60ce24dbf",
"metadata": {
"tags": []
},
"source": [
"#### Analysis and Visualization #1"
]
},
{
"cell_type": "code",
"execution_count": 359,
"id": "7082ad2a-cde4-44a1-81bb-d4f462b62dd3",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" analysis_neighborhood | \n",
" drug_arrests_count | \n",
"
\n",
" \n",
" \n",
" \n",
" | 1 | \n",
" Tenderloin | \n",
" 8613 | \n",
"
\n",
" \n",
" | 2 | \n",
" South of Market | \n",
" 2962 | \n",
"
\n",
" \n",
" | 3 | \n",
" Mission | \n",
" 1740 | \n",
"
\n",
" \n",
" | 4 | \n",
" Financial District/South Beach | \n",
" 451 | \n",
"
\n",
" \n",
" | 5 | \n",
" Bayview Hunters Point | \n",
" 437 | \n",
"
\n",
" \n",
" | 6 | \n",
" Castro/Upper Market | \n",
" 291 | \n",
"
\n",
" \n",
" | 7 | \n",
" Western Addition | \n",
" 273 | \n",
"
\n",
" \n",
" | 8 | \n",
" Haight Ashbury | \n",
" 242 | \n",
"
\n",
" \n",
" | 9 | \n",
" Nob Hill | \n",
" 213 | \n",
"
\n",
" \n",
" | 10 | \n",
" North Beach | \n",
" 137 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" analysis_neighborhood drug_arrests_count\n",
"1 Tenderloin 8613\n",
"2 South of Market 2962\n",
"3 Mission 1740\n",
"4 Financial District/South Beach 451\n",
"5 Bayview Hunters Point 437\n",
"6 Castro/Upper Market 291\n",
"7 Western Addition 273\n",
"8 Haight Ashbury 242\n",
"9 Nob Hill 213\n",
"10 North Beach 137"
]
},
"execution_count": 359,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"da_neighb = drug_arrest.groupby(drug_arrest['analysis_neighborhood']).size()\n",
"da_df = da_neighb.sort_values(ascending = False).reset_index(name='drug_arrests_count')\n",
"da_df.index +=1 \n",
"da_df.head(10)"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "938349c5-6dde-43e6-9fb7-851af7f62c12",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Group the incidents by time and location\n",
"# Note that incident_date_2 has no significant meaning other than for the purpose of tooltips in visualization.\n",
"drug_arrest['incident_date'] = pd.to_datetime(drug_arrest['incident_date'])\n",
"drug_arrest['incident_time'] = pd.to_datetime(drug_arrest['incident_time'])\n",
"da_nbh = drug_arrest.groupby([(drug_arrest['incident_date'].dt.strftime('%Y-%m')),drug_arrest['analysis_neighborhood']]).size().reset_index(name='count')"
]
},
{
"cell_type": "code",
"execution_count": 362,
"id": "e92914ab-54dc-4756-bb1a-8b5841a8491c",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Making a new column so that all the other neighborhoods except for the three mentioned above \n",
"# will be categorized into \"All Other Neighborhoods\"\n",
"\n",
"da_viz1 = da_nbh\n",
"for i, row in da_viz1.iterrows():\n",
" if row['analysis_neighborhood'] == 'Tenderloin':\n",
" da_viz1.loc[i, 'viz1'] = 'Tenderloin'\n",
" elif row['analysis_neighborhood'] == 'South of Market':\n",
" da_viz1.loc[i, 'viz1'] = 'SoMA'\n",
" elif row['analysis_neighborhood'] == 'Mission':\n",
" da_viz1.loc[i, 'viz1'] = 'Mission'\n",
" else:\n",
" da_viz1.loc[i, 'viz1'] = 'All Other Neighborhoods'\n",
"da_viz = da_viz1.groupby(['incident_date','viz1']).sum('count').reset_index()\n",
"da_viz['incident_date_2'] = da_viz['incident_date']\n",
"# Note that incident_date_2 has no significant meaning other than for the purpose of tooltips in visualization.\n",
"\n",
"## Excluding the month we're currently in.\n",
"end_date = '2023-02'\n",
"filtered_da_viz = da_viz[da_viz['incident_date']<=end_date]"
]
},
{
"cell_type": "code",
"execution_count": 363,
"id": "76b3dd52-0b7e-4c51-9373-3039069269c4",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
""
],
"text/plain": [
"alt.Chart(...)"
]
},
"execution_count": 363,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Visualization\n",
"alt.Chart(filtered_da_viz).mark_area(size=9,opacity=0.8).encode(\n",
" x= alt.X('incident_date:T',title='Date',axis=alt.Axis(domain=False, format='%Y.%m')),\n",
" y= alt.Y('count:Q',stack = 'normalize',title = 'Count'),\n",
" color=alt.Color(\"viz1\",legend=alt.Legend(title='Neighborhood', values=['All Other Neighborhoods','Mission', 'SoMA','Tenderloin'])),\n",
" tooltip = [alt.Tooltip('incident_date_2',title = 'Date'),alt.Tooltip('count',title='Drug Arrests'),alt.Tooltip('viz1',title='Neighborhood')]\n",
").properties(\n",
" width=650,\n",
" height=350,\n",
" title='Total Drug Arrests by Neighborhoods (2018-now)'\n",
")"
]
},
{
"cell_type": "markdown",
"id": "2fef299b-2571-46df-8e2f-662ee0559482",
"metadata": {},
"source": [
"From 2018, the number of drug-arrests in Tenderloin alone is more than that in all the other neighborhoods combined."
]
},
{
"cell_type": "code",
"execution_count": 376,
"id": "f2185a3f-e2fa-4b61-888c-0161ed677561",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" Neighborhood | \n",
" Count | \n",
"
\n",
" \n",
" \n",
" \n",
" | 1 | \n",
" All Other Neighborhoods | \n",
" 8610 | \n",
"
\n",
" \n",
" | 2 | \n",
" Tenderloin | \n",
" 8613 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" Neighborhood Count\n",
"1 All Other Neighborhoods 8610\n",
"2 Tenderloin 8613"
]
},
"execution_count": 376,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Making a new column so that all the neighborhoods except for Tenderloin will be categorized into \"All Other Neighborhoods\"\n",
"\n",
"for i, row in da_nbh.iterrows():\n",
" if row['analysis_neighborhood'] != 'Tenderloin':\n",
" da_nbh.loc[i, 'compare'] = 'All Other Neighborhoods'\n",
" else:\n",
" da_nbh.loc[i, 'compare'] = 'Tenderloin'\n",
"da_nbh_df = da_nbh.groupby('compare').sum('count').reset_index()\n",
"da_nbh_df = da_nbh_df.rename(columns={'compare': 'Neighborhood', 'count': 'Count'})\n",
"da_nbh_df.index += 1\n",
"da_nbh_df"
]
},
{
"cell_type": "code",
"execution_count": 377,
"id": "a9bf4494-877c-448c-bf3b-e64566b69723",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
""
],
"text/plain": [
"alt.Chart(...)"
]
},
"execution_count": 377,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Visualization\n",
"da_plot = da_nbh.groupby(['incident_date','compare']).sum('count').reset_index()\n",
"da_plot['incident_date_2'] = da_plot['incident_date']\n",
"\n",
"## Excluding the month we're currently in.\n",
"end_date = '2023-02'\n",
"filtered_da_plot = da_plot[da_plot['incident_date']<=end_date]\n",
"\n",
"alt.Chart(filtered_da_plot).mark_line(opacity=0.8).encode(\n",
" alt.X('incident_date:T',title='Date'),\n",
" alt.Y('count:Q',title = 'Count'),\n",
" color = alt.Color('compare:N',legend = alt.Legend(title = 'Neighborhoods'),scale=alt.Scale(domain=['Tenderloin','All Other Neighborhoods'], range=['crimson','grey'])),\n",
" tooltip = [alt.Tooltip('incident_date_2',title='Date'),alt.Tooltip('count',title='Incidents')]\n",
").properties(\n",
" width=650,\n",
" height=350,\n",
" title='Drug-related Arrests in Tenderloin vs. in All Other Neighborhoods'\n",
")"
]
},
{
"cell_type": "markdown",
"id": "ca4f5c70-6b6a-4994-a9c2-9f25872c704a",
"metadata": {
"tags": []
},
"source": [
"### Finding #2\n",
"\n",
"A similar pattern is found with Narcan deployment: incidents happen significantly more often in **Terderloin** than in other neighborhoods."
]
},
{
"cell_type": "markdown",
"id": "9a38606d-4784-486e-b853-ba1c059730f0",
"metadata": {},
"source": [
"#### Analysis and Visualization #2"
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "0a9207c2-a296-4448-a00d-b0277ad19d66",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" analysis_neighborhood | \n",
" narcan_deployment_count | \n",
"
\n",
" \n",
" \n",
" \n",
" | 1 | \n",
" Tenderloin | \n",
" 401 | \n",
"
\n",
" \n",
" | 2 | \n",
" South of Market | \n",
" 153 | \n",
"
\n",
" \n",
" | 3 | \n",
" Financial District/South Beach | \n",
" 56 | \n",
"
\n",
" \n",
" | 4 | \n",
" Mission | \n",
" 38 | \n",
"
\n",
" \n",
" | 5 | \n",
" Bayview Hunters Point | \n",
" 15 | \n",
"
\n",
" \n",
" | 6 | \n",
" Castro/Upper Market | \n",
" 12 | \n",
"
\n",
" \n",
" | 7 | \n",
" Western Addition | \n",
" 10 | \n",
"
\n",
" \n",
" | 8 | \n",
" Marina | \n",
" 7 | \n",
"
\n",
" \n",
" | 9 | \n",
" Chinatown | \n",
" 7 | \n",
"
\n",
" \n",
" | 10 | \n",
" North Beach | \n",
" 6 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" analysis_neighborhood narcan_deployment_count\n",
"1 Tenderloin 401\n",
"2 South of Market 153\n",
"3 Financial District/South Beach 56\n",
"4 Mission 38\n",
"5 Bayview Hunters Point 15\n",
"6 Castro/Upper Market 12\n",
"7 Western Addition 10\n",
"8 Marina 7\n",
"9 Chinatown 7\n",
"10 North Beach 6"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nd_neighb = narcan_deploy.groupby(narcan_deploy['analysis_neighborhood']).size()\n",
"nd_df = nd_neighb.sort_values(ascending = False).reset_index(name='narcan_deployment_count')\n",
"nd_df.index +=1 \n",
"nd_df.head(10)"
]
},
{
"cell_type": "code",
"execution_count": 27,
"id": "e60c3974-6fb1-4d09-8fe6-a9be178c5380",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" Neighborhood | \n",
" Count | \n",
"
\n",
" \n",
" \n",
" \n",
" | 1 | \n",
" All Other Neighborhoods | \n",
" 355 | \n",
"
\n",
" \n",
" | 2 | \n",
" Tenderloin Incidents | \n",
" 401 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" Neighborhood Count\n",
"1 All Other Neighborhoods 355\n",
"2 Tenderloin Incidents 401"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Group the incidents by time and location\n",
"# Note that incident_date_2 has no significant meaning other than for the purpose of tooltips in visualization.\n",
"narcan_deploy['incident_date'] = pd.to_datetime(narcan_deploy['incident_date'])\n",
"narcan_deploy['incident_time'] = pd.to_datetime(narcan_deploy['incident_time'])\n",
"\n",
"nd_nbh = narcan_deploy.groupby([(narcan_deploy['incident_date'].dt.strftime('%Y-%m')),narcan_deploy['analysis_neighborhood']]).size().reset_index(name='count')\n",
"nd_nbh['incident_date_2'] = nd_nbh['incident_date']\n",
"for i, row in nd_nbh.iterrows():\n",
" if row['analysis_neighborhood'] == 'Tenderloin':\n",
" nd_nbh.loc[i, 'compare'] = 'Tenderloin Incidents'\n",
" else:\n",
" nd_nbh.loc[i, 'compare'] = 'All Other Neighborhoods'\n",
"nd_nbh_df = nd_nbh.groupby('compare').sum('count').reset_index()\n",
"nd_nbh_df = nd_nbh_df.rename(columns={'compare': 'Neighborhood', 'count': 'Count'})\n",
"nd_nbh_df.index += 1\n",
"nd_nbh_df"
]
},
{
"cell_type": "code",
"execution_count": 28,
"id": "1331af42-e60c-46c6-aa5f-f653a17d6122",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
""
],
"text/plain": [
"alt.Chart(...)"
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Visualization\n",
"nd_plot = nd_nbh.groupby(['incident_date','compare']).sum('count').reset_index()\n",
"nd_plot['incident_date_2'] = nd_plot['incident_date']\n",
"\n",
"## Excluding the month we're currently in.\n",
"end_date = '2023-02'\n",
"filtered_nd_plot = nd_plot[nd_plot['incident_date']<=end_date]\n",
"\n",
"alt.Chart(filtered_nd_plot).mark_line(opacity=0.8).encode(\n",
" alt.X('incident_date:T',title='Date'),\n",
" alt.Y('sum(count):Q',title = 'Count',scale=alt.Scale(domain=(0,25))),\n",
" color = alt.Color('compare:N',legend = alt.Legend(title = 'Neighborhoods'),scale=alt.Scale(domain=['Tenderloin Incidents','All Other Neighborhoods'], range=['crimson','grey'])),\n",
" tooltip = [alt.Tooltip('incident_date_2',title='Date'),alt.Tooltip('count',title='Incidents')]\n",
").properties(\n",
" width=650,\n",
" height=350,\n",
" title='Narcan Deployment in Tenderloin vs. in All Other Neighborhoods'\n",
")"
]
},
{
"cell_type": "markdown",
"id": "fc9de7b1-1116-4cc0-947f-7b747659347a",
"metadata": {
"tags": []
},
"source": [
"## Drug Arrests and Narcan Deployment Over Time"
]
},
{
"cell_type": "markdown",
"id": "92caaae5-f506-42cc-a999-cfe95557c17f",
"metadata": {
"tags": []
},
"source": [
"### Finding #3\n",
"As the pandemic began to subside and people started moving around more freely in the city, there has been an increase in drug-related arrests. During July and August of 2022, the city saw **more than 1000** drug-related arrests, almost double the typical monthly average of **271**."
]
},
{
"cell_type": "markdown",
"id": "40e82ac2-f629-4b5e-889f-78d9ff157fe6",
"metadata": {},
"source": [
"#### Analysis and Visualization #3"
]
},
{
"cell_type": "code",
"execution_count": 379,
"id": "5144c09d-a7c1-47f5-ad34-8970cd0c5849",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" incident_date | \n",
" count | \n",
" incident_date_2 | \n",
"
\n",
" \n",
" \n",
" \n",
" | 0 | \n",
" 2018-01 | \n",
" 337 | \n",
" 2018-01 | \n",
"
\n",
" \n",
" | 1 | \n",
" 2018-02 | \n",
" 356 | \n",
" 2018-02 | \n",
"
\n",
" \n",
" | 2 | \n",
" 2018-03 | \n",
" 423 | \n",
" 2018-03 | \n",
"
\n",
" \n",
" | 3 | \n",
" 2018-04 | \n",
" 348 | \n",
" 2018-04 | \n",
"
\n",
" \n",
" | 4 | \n",
" 2018-05 | \n",
" 384 | \n",
" 2018-05 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" incident_date count incident_date_2\n",
"0 2018-01 337 2018-01\n",
"1 2018-02 356 2018-02\n",
"2 2018-03 423 2018-03\n",
"3 2018-04 348 2018-04\n",
"4 2018-05 384 2018-05"
]
},
"execution_count": 379,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Drug arrests over time, by year-month\n",
"# Again, note that incident_date_2 has no significant meaning other than for the purpose of tooltips in visualization.\n",
"drug_arrest['incident_date'] = pd.to_datetime(drug_arrest['incident_date'])\n",
"drug_arrest['incident_time'] = pd.to_datetime(drug_arrest['incident_time'])\n",
"\n",
"counts = drug_arrest.groupby(drug_arrest['incident_date'].dt.strftime('%Y-%m')).size().reset_index(name='count')\n",
"counts['incident_date_2'] = counts['incident_date']\n",
"counts.head()"
]
},
{
"cell_type": "code",
"execution_count": 380,
"id": "270391ca-dd56-46f5-918d-d4c7b791f13a",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"271.5"
]
},
"execution_count": 380,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Excluding the month we're currently in.\n",
"end_date = '2023-02'\n",
"filtered_counts = counts[counts['incident_date']<=end_date]\n",
"# Calculating the monthly average. We chose median to account for extreme values.\n",
"drug_arrest_median = filtered_counts['count'].median()\n",
"drug_arrest_median"
]
},
{
"cell_type": "code",
"execution_count": 382,
"id": "218e8f0d-c492-4df4-afac-94a04709d2b5",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
""
],
"text/plain": [
"alt.LayerChart(...)"
]
},
"execution_count": 382,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Visualize\n",
"base = alt.Chart(filtered_counts).mark_line(size=3).encode(\n",
" alt.X('incident_date:T',title='Date'),\n",
" alt.Y('count',title='Count, Median of Counts',scale=alt.Scale(domain=(0,600))),\n",
" tooltip = [alt.Tooltip('incident_date_2',title = 'Date'),alt.Tooltip('count',title='Drug Arrests')]\n",
").properties(\n",
" width=600,\n",
" height=400,\n",
" title='Drug-related Arrests, Jan. 2018 - Feb.2023'\n",
")\n",
"\n",
"median = alt.Chart(filtered_counts).mark_rule(size=2,color='crimson').encode(\n",
" y='median(count)',\n",
" tooltip = alt.Tooltip(title = 'Median of counts')\n",
")\n",
"\n",
"text = (\n",
" alt.Chart(filtered_counts.query(\"count == count.max()\"))\n",
" .mark_text(dy=-15, color=\"crimson\")\n",
" .encode(x=alt.X(\"incident_date:T\"), y=alt.Y(\"count\"), text=alt.Text((\"count\")) \n",
"))\n",
"base+median+text"
]
},
{
"cell_type": "markdown",
"id": "0c5e1318-1186-4852-97d9-4324fc57ee0b",
"metadata": {},
"source": [
"### Finding #4\n",
"Narcan deployment spiked as incidents of drug overdose worsened during the COVID-19 pandemic.\n",
"\n",
"In March 2020, there were **31 recorded deployments** of Narcan in San Francisco, more than doubling the typical monthly amount of **12**.\n"
]
},
{
"cell_type": "markdown",
"id": "fe102f53-1c31-4475-bebd-be8d0bd2268f",
"metadata": {},
"source": [
"#### Analysis and Visualization #4"
]
},
{
"cell_type": "code",
"execution_count": 383,
"id": "39f6b08d-5ba5-4131-b44b-508066146551",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" incident_date | \n",
" count | \n",
" incident_date_2 | \n",
"
\n",
" \n",
" \n",
" \n",
" | 0 | \n",
" 2018-01 | \n",
" 4 | \n",
" 2018-01 | \n",
"
\n",
" \n",
" | 1 | \n",
" 2018-02 | \n",
" 13 | \n",
" 2018-02 | \n",
"
\n",
" \n",
" | 2 | \n",
" 2018-03 | \n",
" 7 | \n",
" 2018-03 | \n",
"
\n",
" \n",
" | 3 | \n",
" 2018-04 | \n",
" 5 | \n",
" 2018-04 | \n",
"
\n",
" \n",
" | 4 | \n",
" 2018-05 | \n",
" 3 | \n",
" 2018-05 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" incident_date count incident_date_2\n",
"0 2018-01 4 2018-01\n",
"1 2018-02 13 2018-02\n",
"2 2018-03 7 2018-03\n",
"3 2018-04 5 2018-04\n",
"4 2018-05 3 2018-05"
]
},
"execution_count": 383,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Grouping the narcan deployments by Year-month.\n",
"# Note that incident_date_2 is not necessary, just for the sake of tooltips for visualization\n",
"narcan_deploy['incident_date'] = pd.to_datetime(narcan_deploy['incident_date'])\n",
"narcan_deploy['incident_time'] = pd.to_datetime(narcan_deploy['incident_time'])\n",
"\n",
"nd_counts = narcan_deploy.groupby(narcan_deploy['incident_date'].dt.strftime('%Y-%m')).size().reset_index(name='count')\n",
"nd_counts['incident_date_2'] = nd_counts['incident_date']\n",
"nd_counts.head()"
]
},
{
"cell_type": "code",
"execution_count": 384,
"id": "c4d3a493-236d-4e99-bf7b-238fb90a459c",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"12.0"
]
},
"execution_count": 384,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Excluding the month we're currently in.\n",
"end_date = '2023-02'\n",
"filtered_nd_counts = nd_counts[nd_counts['incident_date']<=end_date]\n",
"# Calculating the monthly average. We chose median to account for extreme values.\n",
"median_nd = filtered_nd_counts['count'].median()\n",
"median_nd"
]
},
{
"cell_type": "code",
"execution_count": 386,
"id": "a6317689-c08c-4942-a5c5-ddaa1630a8cd",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
""
],
"text/plain": [
"alt.LayerChart(...)"
]
},
"execution_count": 386,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Visualize\n",
"base = alt.Chart(filtered_nd_counts).mark_line(size=3).encode(\n",
" alt.X('incident_date:T',title='Date'),\n",
" alt.Y('count',title = 'Count, Median of Counts'),\n",
" tooltip = [alt.Tooltip('incident_date_2',title = 'Date'),alt.Tooltip('count',title='Narcan Deployment')]\n",
").properties(\n",
" width=600,\n",
" height=400,\n",
" title='Narcan Deployments, Jan 2018 - Feb 2023'\n",
")\n",
"median = alt.Chart(filtered_nd_counts).mark_rule(size=2,color='crimson').encode(\n",
" y='median(count)',\n",
" tooltip = alt.Tooltip(title = 'Median of counts')\n",
")\n",
"text = (\n",
" alt.Chart(filtered_nd_counts.query(\"count == count.max()\"))\n",
" .mark_text(dy=-25, color=\"crimson\")\n",
" .encode(x=alt.X(\"incident_date:T\"), y=alt.Y(\"count\"), text=alt.Text((\"count\")) \n",
"))\n",
"base+median+text"
]
},
{
"cell_type": "markdown",
"id": "0813bd2d-d864-4cd7-894a-ba30320a01f7",
"metadata": {
"jp-MarkdownHeadingCollapsed": true,
"tags": []
},
"source": [
"## Drug Arrests by Day-of-Week and Time-of-Day"
]
},
{
"cell_type": "markdown",
"id": "b2d02c9d-a602-48f3-92bd-ee08b114f56c",
"metadata": {},
"source": [
"### Finding #5\n",
"\n",
"Most drug-related arrests occur on the **afternoons of Tuesday and Wednesday**. This result could be due to routine police patrol around those times."
]
},
{
"cell_type": "markdown",
"id": "813a699c-28fd-426b-b28a-e38a5521764b",
"metadata": {},
"source": [
"#### Analysis and Visualization #5"
]
},
{
"cell_type": "code",
"execution_count": 387,
"id": "530a2d3f-86e6-4d1b-b067-ea6945f6d71f",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" incident_day_of_week | \n",
" incident_time | \n",
" count | \n",
" hour | \n",
"
\n",
" \n",
" \n",
" \n",
" | 0 | \n",
" Friday | \n",
" 00 | \n",
" 73 | \n",
" 00:00 | \n",
"
\n",
" \n",
" | 1 | \n",
" Friday | \n",
" 01 | \n",
" 37 | \n",
" 01:00 | \n",
"
\n",
" \n",
" | 2 | \n",
" Friday | \n",
" 02 | \n",
" 23 | \n",
" 02:00 | \n",
"
\n",
" \n",
" | 3 | \n",
" Friday | \n",
" 03 | \n",
" 33 | \n",
" 03:00 | \n",
"
\n",
" \n",
" | 4 | \n",
" Friday | \n",
" 04 | \n",
" 22 | \n",
" 04:00 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" incident_day_of_week incident_time count hour\n",
"0 Friday 00 73 00:00\n",
"1 Friday 01 37 01:00\n",
"2 Friday 02 23 02:00\n",
"3 Friday 03 33 03:00\n",
"4 Friday 04 22 04:00"
]
},
"execution_count": 387,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Grouping the arrests\n",
"week_time = drug_arrest.groupby([drug_arrest['incident_day_of_week'],drug_arrest['incident_time'].dt.strftime('%H')]).size().reset_index(name='count')\n",
"week_time['hour']=week_time['incident_time']+':00'\n",
"week_time.head()"
]
},
{
"cell_type": "code",
"execution_count": 388,
"id": "f811b594-ecc3-40fb-8ed4-1d2ed0aff2b6",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
""
],
"text/plain": [
"alt.Chart(...)"
]
},
"execution_count": 388,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"weekday_order = ['Sunday','Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday','Saturday']\n",
"\n",
"base = alt.Chart(week_time).mark_rect().encode(\n",
" alt.Y('incident_day_of_week',title = 'Day of Week',sort= weekday_order),\n",
" alt.X('incident_time',title = 'Hour of Day'),\n",
" color='count',\n",
" tooltip = [alt.Tooltip('incident_day_of_week',title='Day'),\n",
" alt.Tooltip('hour',title = 'Time'),\n",
" alt.Tooltip('count',title='Drug Arrests')]\n",
")\n",
"base"
]
},
{
"cell_type": "markdown",
"id": "ba5f77a1-d80b-4786-97c8-a6d500e7118b",
"metadata": {
"tags": []
},
"source": [
"## The Correlation Between Drug Arrests and Narcan Deployments"
]
},
{
"cell_type": "markdown",
"id": "294d1cf6-4229-4477-a128-7a949dda9d0e",
"metadata": {
"jp-MarkdownHeadingCollapsed": true,
"tags": []
},
"source": [
"### Finding #6\n",
"Perhaps unsurprisingly, there's a strong correlation between drug arrests and Narcan deployments."
]
},
{
"cell_type": "code",
"execution_count": 391,
"id": "d4495bf4-19c2-4fac-ad56-c3cdce844828",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
""
],
"text/plain": [
"alt.Chart(...)"
]
},
"execution_count": 391,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Merging the drug arrests dataframe and the narcan deployment dataframe\n",
"da_nd = da_df.merge(nd_df, on = 'analysis_neighborhood')\n",
"# Visualization\n",
"base = alt.Chart(da_nd).mark_square().encode(\n",
" alt.X('drug_arrests_count',title = 'Drug Arrests'),\n",
" alt.Y('narcan_deployment_count',title = 'Narcan Deployment'),\n",
" color = alt.Color('analysis_neighborhood',legend=alt.Legend(title='Neighborhoods', values=['Tenderloin', 'South of Market','Mission'])),\n",
" size = alt.Size('drug_arrests_count',legend=None),\n",
" tooltip=[alt.Tooltip('analysis_neighborhood',title='Neighborhood'),alt.Tooltip('drug_arrests_count',title = 'Drug Arrests'),alt.Tooltip('narcan_deployment_count',title = 'Narcan Deployment')]\n",
").properties(\n",
" width=650,\n",
" height=250,\n",
" title='Drug Arrests vs. Narcan Deployment, by Neighborhood'\n",
")\n",
"base"
]
},
{
"cell_type": "markdown",
"id": "d4d0d5ce-c22f-4f33-92b1-fd2022f801f8",
"metadata": {},
"source": [
"Let's ignore the three outliers — *Tenderloin, SoMA and Mission* — and take a closer look at the busy bottom left corner:"
]
},
{
"cell_type": "code",
"execution_count": 392,
"id": "f1dbc3e4-2da9-4148-88f6-63e9bd00c7aa",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
""
],
"text/plain": [
"alt.Chart(...)"
]
},
"execution_count": 392,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Filtering out the three neighborhood with the most drug arrests\n",
"left_coner = base.encode(\n",
" color = alt.Color('analysis_neighborhood',legend=alt.Legend(title='Neighborhoods', values=['Financial District/South Beach', 'Bayview Hunters Point']))\n",
").transform_filter(\n",
" datum.drug_arrests_count < 1000\n",
")\n",
"left_coner "
]
},
{
"cell_type": "markdown",
"id": "5b35abbd-332f-4301-8268-aae9c798aaa1",
"metadata": {},
"source": [
"A similar correlation appears — high numbers of drug arrests corresponds with high numbers of Narcan deployment. Note that compared to other neighborhoods, **Financial District/South Beach** has a relatively high number of Narcan deployment."
]
},
{
"cell_type": "markdown",
"id": "71e9768b-c2dd-4e62-947d-49096f95a0bf",
"metadata": {
"tags": []
},
"source": [
"## Beyond the Notebook"
]
},
{
"cell_type": "markdown",
"id": "537806c9-a211-479f-a5f5-201c43e685ef",
"metadata": {},
"source": [
"Explore our dashboard to see how the drug epidemic affects different neighborhoods in San Francisco. \n",
"- [Drug Arrests in SF](https://fishdmy-sfpd-data-scriptsstreamlit-drugarrests-183wkh.streamlit.app/)\n",
"- [Narcan Deployment in SF](https://fishdmy-sfpd-data-scriptsstreamlit-narcan-xpqd8b.streamlit.app/)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}