{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Calculating Return on Investment for Hot Spots Policing\n", "\n", "By [Andrew Wheeler, PhD](mailto:apwheele@gmail.com) \n", "Website: [andrewpwheeler.com](https://andrewpwheeler.com/)\n", "\n", "The goal here is to go from the Cohen's D measures in the [Braga hot spot meta-analysis](https://www.tandfonline.com/doi/abs/10.1080/07418825.2012.673632) to a measure more directly related to how much crime is reduced. In the end I want to recreate the pre and post counts of crime, so I can directly estimate the number of crimes reduced.\n", "\n", "First, the Cohen's D measure for these interventions is calculated [as below](https://www.ncbi.nlm.nih.gov/pubmed/7870860):\n", "\n", "$$ D = \\log(OR) \\cdot \\sqrt{ 3 }/\\pi $$\n", "\n", "Where $OR$ is the odds ratio. The odds ratio is calculated from the table of pre-post crime counts. \n", "\n", "|         | Pre   | Post  |\n", "|---------|-------|-------|\n", "| Treated | $t_0$ | $t_1$ |\n", "| Control | $c_0$ | $c_1$ |\n", "\n", "The odds ratio is calculated as:\n", "\n", "$$ OR = \\frac{t_1/t_0}{c_1/c_0} $$\n", "\n", "So there are a few unknowns to pin down before we can back-calculate a return on investment for a particular crime intervention. First, the baseline of crime makes a big difference. Using the typical interpretation of the $OR$ as a percent reduction, say we have a 10% decline (i.e. an $OR = 0.9$). If we start with a baseline of 100 crimes, that is a net reduction of 10 crimes ($100 - OR \\cdot 100 = 10$). But if we start with a baseline of 10 crimes, it is only a reduction of 1 crime ($10 - OR \\cdot 10 = 1$). So those are big differences! The ROI is tied directly to the baseline crime counts. \n", "\n", "Second, this is ignoring the control areas. For simplicity in this analysis, first I am going to assume the control areas do not change, so $c_1 = c_0$. This then reduces the odds ratio formula to simply $OR = t_1/t_0$. 
I also assume that $t_0 = c_0 = c_1$. This is basically saying that the control areas had the same number of crimes as the treated area at baseline. \n", "\n", "So the formula to go from the Cohen's D estimates in the Braga meta-analysis to an estimate of $t_1$ and $t_0$ is:\n", "\n", "$$ \\exp\\left( \\frac{D \\cdot \\pi}{\\sqrt{3}} \\right) = OR = t_1/t_0 $$ \n", "\n", "So subsequently, if I want to calculate $t_1 - t_0 = \\Delta t$ and we fix $t_0$ to some arbitrary value, we then have:\n", "\n", "$$ OR \\cdot t_0 = t_1 $$\n", "\n", "So then:\n", "\n", "$$ \\Delta t = OR \\cdot t_0 - t_0 $$\n", "\n", "Below, based on the overall Cohen's D effect size, I plot the size of $\\Delta t$ given a fixed value of $t_0$ (shown as a positive number of crimes reduced, $t_0 - t_1$). Note that the Braga meta-analysis reports its estimates so that a positive D signifies a reduction in crime. So I actually take the negative of the Cohen's D value. " ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "meta analysis D of -0.184 to odds ratio of 0.716\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "%matplotlib inline\n", "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "import math\n", "\n", "#convert a Cohen's D to an odds ratio\n", "def d_to_or(d):\n", "    return math.exp( d*math.pi / math.sqrt(3) )\n", "\n", "#overall meta analysis estimate of D\n", "big_D = -0.184\n", "\n", "#So about a 28% decline (odds ratio of roughly 0.716)\n", "meta_or = d_to_or(big_D)\n", "print( \"meta analysis D of {:1.3f} to odds ratio of {:1.3f}\".format(big_D, meta_or) )\n", "\n", "#plot over a range of 10 to 200\n", "baseline = np.arange(10,201)\n", "reduction = baseline - baseline*meta_or #This is the positive reduction in crimes\n", "\n", "fig, ax = plt.subplots(figsize=(8,6))\n", "ax.plot(baseline,reduction, color='k')\n", "ax.set_xlabel('Baseline Crime Counts ($t_0$)')\n", "ax.set_ylabel(r'Reduced Crimes ($\\Delta t$)')\n", "ax.grid(True,linestyle='--')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So that was a lot of work just to figure out a percent reduction estimate!\n", "\n", "Next I take the specific estimates from Braga's meta-analysis (Table 3) for the overall effects for violent and property crimes. Then I take the cost of law-enforcement estimates from this [Hunt paper](https://www.cambridge.org/core/journals/journal-of-benefit-cost-analysis/article/estimates-of-law-enforcement-costs-by-crime-type-for-benefitcost-analyses/0A1A55F70324FDBAA947FF1F18AA1B74). Those are costs directly related to law enforcement, like how much time it takes to investigate/respond to crime. Finally, I grab the biggest hot spot in my current working paper on [cost of crime hot spots](https://osf.io/preprints/socarxiv/nmq8r/), and get an ROI estimate conditional on the total number of crimes in the hot spot. 
(In particular it is hot spot 59, the largest in the [center of this map](https://apwheele.github.io/MathPosts/HotSpotMap.html).)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Total estimated crime reduction ROI for hot spot 59: $357,289\n" ] }, { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\">\n",
"      <th></th>\n",
"      <th>Crime</th>\n",
"      <th>Cost</th>\n",
"      <th>TotalCrime</th>\n",
"      <th>CrimeReduction</th>\n",
"      <th>ROI</th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>Murder</td><td>124353</td><td>0</td><td>0.000000</td><td>0.000000</td></tr>\n",
"    <tr><th>1</th><td>Assault</td><td>8292</td><td>130</td><td>35.356277</td><td>293174.245531</td></tr>\n",
"    <tr><th>2</th><td>Robbery</td><td>2229</td><td>38</td><td>10.334912</td><td>23036.518002</td></tr>\n",
"    <tr><th>3</th><td>Burglary</td><td>1185</td><td>29</td><td>4.098285</td><td>4856.467525</td></tr>\n",
"    <tr><th>4</th><td>Theft</td><td>1024</td><td>212</td><td>29.959875</td><td>30678.912323</td></tr>\n",
"    <tr><th>5</th><td>Veh Theft</td><td>769</td><td>51</td><td>7.207328</td><td>5542.435613</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ "       Crime    Cost  TotalCrime  CrimeReduction            ROI\n", "0     Murder  124353           0        0.000000       0.000000\n", "1    Assault    8292         130       35.356277  293174.245531\n", "2    Robbery    2229          38       10.334912   23036.518002\n", "3   Burglary    1185          29        4.098285    4856.467525\n", "4      Theft    1024         212       29.959875   30678.912323\n", "5  Veh Theft     769          51        7.207328    5542.435613" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Now let's go through an example for one of my hot spot areas\n", "#https://apwheele.github.io/MathPosts/HotSpotMap.html\n", "#Choosing the biggest area, ID 59 in the center of map\n", "\n", "#Taken from Braga's Table 3, specific estimates for violent/property\n", "or_v = d_to_or(-0.175)\n", "or_p = d_to_or(-0.084)\n", "\n", "#Hot spot 59 and cost of crime estimates for Texas from Hunt paper\n", "#Each row is (type, crime, cost per crime, total crimes); type is 'v' violent or 'p' property\n", "dat = [('v','Murder',124353,0),\n", "       ('v','Assault',8292,130),\n", "       ('v','Robbery',2229,38),\n", "       ('p','Burglary',1185,29),\n", "       ('p','Theft',1024,212),\n", "       ('p','Veh Theft',769,51)]\n", "\n", "hs_59 = pd.DataFrame(dat, columns=['Type','Crime','Cost','TotalCrime'])\n", "#Violent odds ratio by default, property odds ratio for the property crime rows\n", "hs_59['OR'] = or_v\n", "hs_59.loc[ hs_59['Type'] == 'p', 'OR'] = or_p\n", "hs_59['CrimeReduction'] = hs_59['TotalCrime'] - hs_59['TotalCrime']*hs_59['OR']\n", "hs_59['ROI'] = hs_59['CrimeReduction']*hs_59['Cost']\n", "\n", "print( \"Total estimated crime reduction ROI for hot spot 59: ${:,.0f}\".format( hs_59['ROI'].sum() ))\n", "\n", "var_order = ['Crime','Cost','TotalCrime','CrimeReduction','ROI']\n", "hs_59[var_order]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Some caveats to this of course. One, you can always do a poor job on a hot spots intervention (so the lower bound of ROI is a negative number). Planning based on the overall effect size, though, seems reasonable on its face to me. (So you're banking on being as effective as the average hot spots experiment.) 
I wouldn't assume offhand you are going to be more effective than this, at least to justify an intervention from a cost-benefit perspective.\n", "\n", "This also averages together all different crimes, which may not be reasonable given the intervention. For example, one of the most successful hot spots experiments was the [Kansas City gun experiment](https://www.tandfonline.com/doi/abs/10.1080/07418829500096241). An intervention like that may not be reasonable to extrapolate to all crimes. So if the Dallas PD wanted to do something like that, it would probably only make sense to look at the relevant violent crimes, not all of the crimes (so only look at a specific row of the above table).\n", "\n", "These estimates are the 'credits' of conducting a hot spots policing strategy, not the 'debits' of officer time. So if what you want to do costs way more than the ROI in that area (like paying a ton of overtime to officers), it seems hot spots policing is potentially not worth the effort.\n", "\n", "These ROI estimates are also for 1.5 years of data, and the long term effectiveness of hot spots policing has not been established. So that means these estimates are not 'you do this hot spots thing for a short time period and crime goes down forever', they are 'if you continue to do this hot spots thing, you should see reduced crimes around this margin'.\n", "\n", "So for this, if you wanted to create a POP officer for just this one hot spot, that ends up being a return of around $150k per year. So that seems to justify that single position. Smaller hot spots will need to be more along the lines of shifting current resources, as opposed to creating new positions." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.4" } }, "nbformat": 4, "nbformat_minor": 2 }