{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Black Lives Matter Visualized\n", "## Lena Bohman\n", "![blm](blm.jpg)\n", "\n", "Source: BBC" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In 2014, I was taking a year off from my undergraduate degree to manage a family crisis and staying with my parents in St. Louis, MO. I grew up middle class in the \"inner city\" of St. Louis, and it was impossible to escape the legacy of white flight and discrimination that I saw everywhere I looked. No one trusted the cops.\n", "\n", "However, I remember being shocked by Ferguson. Not by the sentiments being expressed, but by the fact that long-simmering greivances had blown up in such dramatic fashion. My family lives next to a large city business district, and the protests came down to our neighborhood. One night, the cops fired tear gas into a coffee shop down the street, and it blew into our backyard. I had been told to return home from my job in a nearby restaurant by the riot police, and I was watching from my window as it all played out.\n", "\n", "2020 was, of course, another spark with the murder of George Floyd by a police officer. 6 years later, what have we learned?\n", "\n", "Sometimes, it seems like very little. But we have at least had time to collect data. \n", "\n", "**In these data visualizations, I will attempt to test (and visualize) the hypothesize of the #BLM movement: are black Americans, particularly men, being killed by police at a disproportionate rate by police?**" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import bqplot\n", "import numpy as np\n", "import traitlets\n", "import ipywidgets\n", "import matplotlib.pyplot as plt\n", "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The Data\n", "This data visualization is based off a data set compiled by the Washington Post and made public on [GitHub](https://github.com/washingtonpost/data-police-shootings). The data set \"contains records of every fatal shooting in the United States by a police officer in the line of duty since Jan. 1, 2015.\"\n", "\n", "The Washington Post explains why their data set is different:\n", "\n", "\"The Post is documenting only those shootings in which a police officer, in the line of duty, shoots and kills a civilian — the circumstances that most closely parallel the 2014 killing of Michael Brown in Ferguson, Mo., which began the protest movement culminating in Black Lives Matter and an increased focus on police accountability nationwide. The Post is not tracking deaths of people in police custody, fatal shootings by off-duty officers or non-shooting deaths.\n", "\n", "The FBI and the Centers for Disease Control and Prevention log fatal shootings by police, but officials acknowledge that their data is incomplete. Since 2015, The Post has documented more than twice as many fatal shootings by police as recorded on average annually.\"\n", "\n", "The data is compiled in a csv, which looks like this:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
idnamedatemanner_of_deatharmedagegenderracecitystatesigns_of_mental_illnessthreat_levelfleebody_cameralongitudelatitudeis_geocoding_exact
03Tim Elliot2015-01-02shotgun53.0MASheltonWATrueattackNot fleeingFalse-123.12247.247True
14Lewis Lee Lembke2015-01-02shotgun47.0MWAlohaORFalseattackNot fleeingFalse-122.89245.487True
25John Paul Quintero2015-01-03shot and Taseredunarmed23.0MHWichitaKSFalseotherNot fleeingFalse-97.28137.695True
38Matthew Hoffman2015-01-04shottoy weapon32.0MWSan FranciscoCATrueattackNot fleeingFalse-122.42237.763True
49Michael Rodriguez2015-01-04shotnail gun39.0MHEvansCOFalseattackNot fleeingFalse-104.69240.384True
\n", "
" ], "text/plain": [ " id name date manner_of_death armed age \\\n", "0 3 Tim Elliot 2015-01-02 shot gun 53.0 \n", "1 4 Lewis Lee Lembke 2015-01-02 shot gun 47.0 \n", "2 5 John Paul Quintero 2015-01-03 shot and Tasered unarmed 23.0 \n", "3 8 Matthew Hoffman 2015-01-04 shot toy weapon 32.0 \n", "4 9 Michael Rodriguez 2015-01-04 shot nail gun 39.0 \n", "\n", " gender race city state signs_of_mental_illness threat_level \\\n", "0 M A Shelton WA True attack \n", "1 M W Aloha OR False attack \n", "2 M H Wichita KS False other \n", "3 M W San Francisco CA True attack \n", "4 M H Evans CO False attack \n", "\n", " flee body_camera longitude latitude is_geocoding_exact \n", "0 Not fleeing False -123.122 47.247 True \n", "1 Not fleeing False -122.892 45.487 True \n", "2 Not fleeing False -97.281 37.695 True \n", "3 Not fleeing False -122.422 37.763 True \n", "4 Not fleeing False -104.692 40.384 True " ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.read_csv(\"fatal-police-shootings-data.csv\")\n", "df.head()" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "race = df.groupby(\"race\")[\"id\"].count()\n", "race = race.reset_index()\n", "race = race.set_index(\"race\")" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "mentalillness = df.groupby([\"race\", \"signs_of_mental_illness\"])[\"id\"].count()\n", "mentalillness = mentalillness.reset_index()\n", "mentalillness = mentalillness.set_index(\"race\")" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "bodycamera = df.groupby([\"race\", \"body_camera\"])[\"id\"].count()\n", "bodycamera = bodycamera.reset_index()\n", "bodycamera = bodycamera.set_index(\"race\")" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "df[\"date\"]=[pd.to_datetime(d) for d in df[\"date\"]]\n", "df[\"race\"] = df[\"race\"].astype(\"str\")\n", "df[\"race\"] = df[\"race\"].fillna(\"None\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Dashboard\n", "\n", "This dashboard has three components. The main bar chart divides all shootings (from 2015 to present) by race. From this bar chart, we can indeed see that African Americans (represented by `B` for Black) are overrepresented in this sample compared to their share of the population (13.4% of the US population).\n", "\n", "The Washington Post uses the following designations for race:\n", "\n", "`W: White, non-Hispanic\n", "B: Black, non-Hispanic\n", "A: Asian\n", "N: Native American\n", "H: Hispanic\n", "O: Other\n", "None: unknown`\n", "\n", "If you click on any of the columns of the main bar chart, the two subplots will change to reflect data for that race. The two suplots show data for `signs of mental illness` (false or true) and `body camera` (if the cop was using a body camera, false or true)." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "def on_selected(change):\n", " if len(change['owner'].selected) == 1:\n", " barm.x = mentalillness.loc[race.index[change['owner'].selected[0]]].index\n", " barm.y = mentalillness.loc[mentalillness.index == race.index[change['owner'].selected[0]], [\"id\"]].values\n", " barb.x = bodycamera.loc[race.index[change['owner'].selected[0]]].index\n", " barb.y = bodycamera.loc[mentalillness.index == race.index[change['owner'].selected[0]], [\"id\"]].values" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "33b29c29aa86463b96e337b9d72802d1", "version_major": 2, "version_minor": 0 }, "text/plain": [ "VBox(children=(Figure(axes=[Axis(scale=OrdinalScale()), Axis(orientation='vertical', scale=LinearScale())], fi…" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Main Bar\n", "x_sc = bqplot.OrdinalScale()\n", "y_sc = bqplot.LinearScale()\n", "\n", "x_ax = bqplot.Axis(scale = x_sc)\n", "y_ax = bqplot.Axis(scale = y_sc, \n", " orientation = 'vertical')\n", "\n", "bar = bqplot.Bars(x = race.index, y = race[\"id\"],\n", " scales={'x': x_sc, 'y': y_sc},\n", " color = [\"tomato\"],\n", " interactions = {'click': 'select'}, # make interactive on click of each box\n", " anchor_style = {'fill':'pink'}, # to make our selection blue\n", " selected_style = {'opacity': 1.0}, # make 100% opaque if box is selected\n", " unselected_style = {'opacity': 0.8, \"fill\": \"pink\"}) # make a little see-through if not)\n", "bar.observe(on_selected, 'selected')\n", "figr = bqplot.Figure(marks = [bar], axes = [x_ax, y_ax], title = \"Fatal Police Shootings by Race\")\n", "figr.layout.min_width = '600px'\n", "figr.layout.min_height = '600px'\n", "\n", "#Mental Illness Bar Chart\n", "x_scm = bqplot.OrdinalScale()\n", "y_scm = bqplot.LinearScale()\n", "\n", "x_axm = bqplot.Axis(scale = x_scm)\n", "y_axm = bqplot.Axis(scale = y_scm, \n", " orientation = 'vertical')\n", "barm = bqplot.Bars(x = mentalillness.loc[\"A\"].index, y = mentalillness.loc[[\"A\"],[\"id\"]],\n", " scales={'x': x_scm, 'y': y_scm},\n", " type = \"grouped\")\n", "\n", "figm = bqplot.Figure(marks = [barm], axes = [x_axm, y_axm], title = \"Signs of Mental Illness (False, True)\")\n", "figm.layout.min_width = '450px'\n", "\n", "#Body Camera\n", "x_scb = bqplot.OrdinalScale()\n", "y_scb = bqplot.LinearScale()\n", "\n", "x_axb = bqplot.Axis(scale = x_scb)\n", "y_axb = bqplot.Axis(scale = y_scb, \n", " orientation = 'vertical')\n", "\n", "barb = bqplot.Bars(x = bodycamera.loc[\"A\"].index, y = bodycamera.loc[[\"A\"],[\"id\"]],\n", " scales={'x': x_scb, 'y': y_scb},\n", " type = \"grouped\")\n", "\n", "figb = bqplot.Figure(marks = [barb], axes = [x_axb, y_axb], title = \"Body Camera Usage (False, True)\")\n", "figb.layout.min_width = '450px'\n", "\n", "subdash = ipywidgets.HBox([figm, figb])\n", "\n", "dashboard = ipywidgets.VBox([figr, subdash])\n", "\n", "dashboard" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Shootings by Race over Time" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This scatter plot visualizes the disparities in police violence in a different way - over time. Each shooting incident is represented by one dot on the timeline and the races are segmented in the same categories as above. This allows for visualizing trends over time.\n", "\n", "To interact with this visualization, you can scroll through using the scroll bar. Hovering over a dot will tell you the date of the incident." ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "scrolled": false }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "513b3029b5db4dca93971818b95e774e", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Figure(axes=[Axis(orientation='vertical', scale=OrdinalScale()), Axis(num_ticks=20, scale=DateScale())], fig_m…" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "x_scs = bqplot.DateScale()\n", "y_scs = bqplot.OrdinalScale()\n", "\n", "x_axs = bqplot.Axis(scale = x_scs, num_ticks = 20)\n", "y_axs = bqplot.Axis(scale = y_scs, \n", " orientation = 'vertical')\n", "\n", "def_tt = bqplot.Tooltip(fields=['x', 'y'], formats = ['%Y-%m-%d', ''],)\n", "\n", "scatters = bqplot.Scatter(x = df[\"date\"],\n", " y = df[\"race\"],\n", " scales = {'x': x_scs, 'y': y_scs},\n", " tooltip = def_tt)\n", "#pz = bqplot.interacts.PanZoom(scales = {'x': [x_scs]})\n", "#def_tt = bqplot.Tooltip(fields=['x', 'y'])\n", "\n", "fig = bqplot.Figure(marks = [scatters], \n", " axes = [y_axs, x_axs])\n", " #interaction=pz)\n", " #tooltip = def_tt)\n", "fig.layout.min_width = '15000px'\n", "fig" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Mapping the Spark\n", "\n", "The uprising in 2020 led to a burst of data journalism by major news outlets on the #BLM movement. In June, The New York Times published an article entitled [\"Minneapolis Police Use Force Against Black People at 7 Times the Rate of Whites\"](https://www.nytimes.com/interactive/2020/06/03/us/minneapolis-police-use-of-force.html), a piece of data journalism that looked at data from the Minneapolis police force from 2015. This is a micro version of the Washington Post database, so the comparison between their findings and the Post database is instructive. The NYT provided a visualization similar to the first bar chart comparing African-Americans as a share of the population and as victims of police violence.\n", "\n", "![image1](minn1.png)\n", "\n", "They also did a detailed survey of the use of force mapped on the streets of Minneapolis, and looked at the geographic concentration.\n", "\n", "![image2](minn2.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Conclusion\n", "According to this data exploration and the sources I consulted for inspiration, the data gathered since 2014 does support the assertations of the Black Lives Matter Movement. We have clearly not done enough to change the pattern in the 6 years since the start of BLM." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.8" } }, "nbformat": 4, "nbformat_minor": 4 }