{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# The Battle of Neighborhoods " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "\n", "\n", "Table of Contents\n", "\n", "\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. Introduction to the Business Problem" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "An investor wants to open a new restaurant in San Francisco but is unsure of the best location and needs input to make the decision. San Francisco is a busy city known for its business innovation and its many tourist attractions. While opening a restaurant there looks promising, the venue's location must be chosen carefully to maximize profit. According to an analysis in FSR Magazine, the eight factors for choosing a new restaurant location are:\n", "\n", "1. Visibility - look for foot and car traffic patterns that give the venue the best visibility.\n", "2. Parking - provide sufficient parking space for customers.\n", "3. Space size - consider how much space the restaurant requires.\n", "4. Crime rates - avoid high-crime areas of the city.\n", "5. Surrounding businesses and competitor analysis - know what types of restaurants do well in a given area and what will distinguish a new restaurant from its competitors.\n", "6. Accessibility - consider factors such as off-the-highway locations and locations near busy intersections.\n", "7. Affordability - the cost of the venue space (rental or purchase) is a bottom-line consideration for any business.\n", "8. Safety - workplace safety is important for the restaurant owner as well as the workers.\n", "\n", "In this capstone project, we will use the Foursquare API to address at least some of these considerations." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. 
Data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Because not every factor is covered by an available dataset, we will not address all of the factors listed above. Instead, we will focus on some of the most important ones: visibility, parking, crime rates, and affordability. We will use the following datasets and tools.\n", "\n", "Static datasets:\n", "\n", "1. Police Department Incident Reports: 2018 to Present. The dataset includes police incident reports filed by officers and by individuals through self-service online reporting for non-emergency cases. Reports included are those for incidents that occurred on or after January 1, 2018 and have been approved by a supervising officer.\n", "2. MTA On Street Parking Census. The dataset contains locations and space count of unmetered motorcycle parking for the City of San Francisco.\n", "3. MTA Off Street Parking Census. SFMTA-managed off-street parking locations, hours, and amenities. Includes both lots and garages.\n", "4. San Francisco Historical Secured Property Tax Rolls, 2007-2015. This dataset includes the SF Office of the Assessor-Recorder’s secured property tax roll spanning from 2007 to 2015. We will use the latest data as a measure of the cost of venue space.\n", "5. San Francisco Realtor Neighborhoods\n", "\n", "Search engines:\n", "1. Foursquare. We will use the Foursquare API to search for venues and points of interest. The results will give us an idea of the neighborhood around the venue's potential location.\n", "2. PARKWHIZ. We intend to use the static datasets to look up parking space information near the potential venue location. PARKWHIZ is a quick and convenient alternative." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. 
Methodology\n", "\n", "In this section, we explore the San Francisco crime and housing datasets to address two of the most important factors discussed in the Introduction. Then, using the Foursquare API, we explore the neighborhoods of San Francisco and cluster them with the $k$-means algorithm. The combined results will give us insight into possible locations for opening a new restaurant." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Import libraries" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/Users/chiachen/miniconda3/envs/mlenv/lib/python3.6/site-packages/pysal/__init__.py:65: VisibleDeprecationWarning: PySAL's API will be changed on 2018-12-31. The last release made with this API is version 1.14.4. A preview of the next API version is provided in the `pysal` 2.0 prelease candidate. The API changes and a guide on how to change imports is provided at https://migrating.pysal.org\n", " ), VisibleDeprecationWarning)\n" ] } ], "source": [ "import numpy as np\n", "import pandas as pd\n", "import geopandas as gpd\n", "import folium\n", "import matplotlib as mpl\n", "import matplotlib.pyplot as plt\n", "import matplotlib.cm as cm\n", "import matplotlib.colors as colors\n", "import pysal as ps\n", "import requests\n", "\n", "from pandas.io.json import json_normalize\n", "from geopandas.tools import sjoin\n", "from geopandas import GeoDataFrame\n", "from geopy.geocoders import Nominatim\n", "from folium.plugins import FastMarkerCluster\n", "from shapely.geometry import Point\n", "from sklearn.cluster import KMeans\n", "#from branca.utilities import split_six\n", "\n", "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3.1 San Francisco Crime Data Analysis" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Read in the San 
Francisco Police Department Incident Reports and perform an initial check." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "RangeIndex: 111531 entries, 0 to 111530\n", "Data columns (total 26 columns):\n", "Incident Datetime 111531 non-null object\n", "Incident Date 111531 non-null object\n", "Incident Time 111531 non-null object\n", "Incident Year 111531 non-null int64\n", "Incident Day of Week 111531 non-null object\n", "Report Datetime 111531 non-null object\n", "Row ID 111531 non-null int64\n", "Incident ID 111531 non-null int64\n", "Incident Number 111531 non-null int64\n", "CAD Number 86415 non-null float64\n", "Report Type Code 111531 non-null object\n", "Report Type Description 111531 non-null object\n", "Filed Online 23641 non-null object\n", "Incident Code 111531 non-null int64\n", "Incident Category 111520 non-null object\n", "Incident Subcategory 111520 non-null object\n", "Incident Description 111531 non-null object\n", "Resolution 111531 non-null object\n", "Intersection 105956 non-null object\n", "CNN 105956 non-null float64\n", "Police District 111531 non-null object\n", "Analysis Neighborhood 105913 non-null object\n", "Supervisor District 105956 non-null float64\n", "Latitude 105956 non-null float64\n", "Longitude 105956 non-null float64\n", "point 105956 non-null object\n", "dtypes: float64(5), int64(5), object(16)\n", "memory usage: 22.1+ MB\n" ] } ], "source": [ "df_crime = pd.read_csv(\"./Police_Department_Incident_Reports__2018_to_Present.csv\")\n", "df_crime.info()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First five rows of the dataset." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>Incident Datetime</th><th>Incident Date</th><th>Incident Time</th><th>Incident Year</th><th>Incident Day of Week</th><th>Report Datetime</th><th>Row ID</th><th>Incident ID</th><th>Incident Number</th><th>CAD Number</th><th>Report Type Code</th><th>Report Type Description</th><th>Filed Online</th><th>Incident Code</th><th>Incident Category</th><th>Incident Subcategory</th><th>Incident Description</th><th>Resolution</th><th>Intersection</th><th>CNN</th><th>Police District</th><th>Analysis Neighborhood</th><th>Supervisor District</th><th>Latitude</th><th>Longitude</th><th>point</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>2018/01/01 01:30:00 AM</td><td>2018/01/01</td><td>01:30</td><td>2018</td><td>Monday</td><td>2018/01/01 02:13:00 AM</td><td>61870203073</td><td>618702</td><td>180000263</td><td>180010563.0</td><td>II</td><td>Initial</td><td>NaN</td><td>3073</td><td>Robbery</td><td>Robbery - Other</td><td>Robbery, W/ Other Weapon</td><td>Open or Active</td><td>JUSTIN DR \\ COLLEGE AVE</td><td>21236000.0</td><td>Ingleside</td><td>Bernal Heights</td><td>9.0</td><td>37.732261</td><td>-122.423486</td><td>(37.732261252752224, -122.42348641495892)</td></tr>\n",
"    <tr><th>1</th><td>2018/01/01 01:59:00 AM</td><td>2018/01/01</td><td>01:59</td><td>2018</td><td>Monday</td><td>2018/01/01 01:59:00 AM</td><td>61870768000</td><td>618707</td><td>180000326</td><td>180010504.0</td><td>II</td><td>Initial</td><td>NaN</td><td>68000</td><td>Fire Report</td><td>Fire Report</td><td>Fire Report</td><td>Open or Active</td><td>16TH ST \\ MISSION ST</td><td>24170000.0</td><td>Mission</td><td>Mission</td><td>9.0</td><td>37.765051</td><td>-122.419669</td><td>(37.76505133632968, -122.41966897380142)</td></tr>\n",
"    <tr><th>2</th><td>2018/01/01 02:28:00 AM</td><td>2018/01/01</td><td>02:28</td><td>2018</td><td>Monday</td><td>2018/01/01 02:31:00 AM</td><td>61870904134</td><td>618709</td><td>180000348</td><td>180010636.0</td><td>II</td><td>Initial</td><td>NaN</td><td>4134</td><td>Assault</td><td>Simple Assault</td><td>Battery</td><td>Open or Active</td><td>03RD ST \\ PERRY ST</td><td>20657000.0</td><td>Southern</td><td>South of Market</td><td>6.0</td><td>37.782119</td><td>-122.396841</td><td>(37.78211912156566, -122.39684142850209)</td></tr>\n",
"    <tr><th>3</th><td>2018/01/01 02:28:00 AM</td><td>2018/01/01</td><td>02:28</td><td>2018</td><td>Monday</td><td>2018/01/01 02:31:00 AM</td><td>61870928160</td><td>618709</td><td>180000348</td><td>180010636.0</td><td>II</td><td>Initial</td><td>NaN</td><td>28160</td><td>Malicious Mischief</td><td>Vandalism</td><td>Malicious Mischief, Vandalism to Vehicle</td><td>Open or Active</td><td>03RD ST \\ PERRY ST</td><td>20657000.0</td><td>Southern</td><td>South of Market</td><td>6.0</td><td>37.782119</td><td>-122.396841</td><td>(37.78211912156566, -122.39684142850209)</td></tr>\n",
"    <tr><th>4</th><td>2018/01/01 02:08:00 AM</td><td>2018/01/01</td><td>02:08</td><td>2018</td><td>Monday</td><td>2018/01/01 02:08:00 AM</td><td>61871004014</td><td>618710</td><td>180000285</td><td>180010537.0</td><td>II</td><td>Initial</td><td>NaN</td><td>4014</td><td>Assault</td><td>Aggravated Assault</td><td>Assault, Aggravated, W/ Force</td><td>Cite or Arrest Adult</td><td>CESAR CHAVEZ ST \\ CAPP ST \\ MISSION ST</td><td>21304000.0</td><td>Mission</td><td>Bernal Heights</td><td>9.0</td><td>37.748166</td><td>-122.418221</td><td>(37.74816568813204, -122.41822117169174)</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " Incident Datetime Incident Date Incident Time Incident Year \\\n", "0 2018/01/01 01:30:00 AM 2018/01/01 01:30 2018 \n", "1 2018/01/01 01:59:00 AM 2018/01/01 01:59 2018 \n", "2 2018/01/01 02:28:00 AM 2018/01/01 02:28 2018 \n", "3 2018/01/01 02:28:00 AM 2018/01/01 02:28 2018 \n", "4 2018/01/01 02:08:00 AM 2018/01/01 02:08 2018 \n", "\n", " Incident Day of Week Report Datetime Row ID Incident ID \\\n", "0 Monday 2018/01/01 02:13:00 AM 61870203073 618702 \n", "1 Monday 2018/01/01 01:59:00 AM 61870768000 618707 \n", "2 Monday 2018/01/01 02:31:00 AM 61870904134 618709 \n", "3 Monday 2018/01/01 02:31:00 AM 61870928160 618709 \n", "4 Monday 2018/01/01 02:08:00 AM 61871004014 618710 \n", "\n", " Incident Number CAD Number Report Type Code Report Type Description \\\n", "0 180000263 180010563.0 II Initial \n", "1 180000326 180010504.0 II Initial \n", "2 180000348 180010636.0 II Initial \n", "3 180000348 180010636.0 II Initial \n", "4 180000285 180010537.0 II Initial \n", "\n", " Filed Online Incident Code Incident Category Incident Subcategory \\\n", "0 NaN 3073 Robbery Robbery - Other \n", "1 NaN 68000 Fire Report Fire Report \n", "2 NaN 4134 Assault Simple Assault \n", "3 NaN 28160 Malicious Mischief Vandalism \n", "4 NaN 4014 Assault Aggravated Assault \n", "\n", " Incident Description Resolution \\\n", "0 Robbery, W/ Other Weapon Open or Active \n", "1 Fire Report Open or Active \n", "2 Battery Open or Active \n", "3 Malicious Mischief, Vandalism to Vehicle Open or Active \n", "4 Assault, Aggravated, W/ Force Cite or Arrest Adult \n", "\n", " Intersection CNN Police District \\\n", "0 JUSTIN DR \\ COLLEGE AVE 21236000.0 Ingleside \n", "1 16TH ST \\ MISSION ST 24170000.0 Mission \n", "2 03RD ST \\ PERRY ST 20657000.0 Southern \n", "3 03RD ST \\ PERRY ST 20657000.0 Southern \n", "4 CESAR CHAVEZ ST \\ CAPP ST \\ MISSION ST 21304000.0 Mission \n", "\n", " Analysis Neighborhood Supervisor District Latitude Longitude \\\n", "0 Bernal Heights 9.0 
37.732261 -122.423486 \n", "1 Mission 9.0 37.765051 -122.419669 \n", "2 South of Market 6.0 37.782119 -122.396841 \n", "3 South of Market 6.0 37.782119 -122.396841 \n", "4 Bernal Heights 9.0 37.748166 -122.418221 \n", "\n", " point \n", "0 (37.732261252752224, -122.42348641495892) \n", "1 (37.76505133632968, -122.41966897380142) \n", "2 (37.78211912156566, -122.39684142850209) \n", "3 (37.78211912156566, -122.39684142850209) \n", "4 (37.74816568813204, -122.41822117169174) " ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.set_option('display.max_columns', 100)\n", "df_crime.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The most important columns are Incident Category, Latitude, Longitude, and time stamps. We remove columns that are not needed for the analysis." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "columns = ['Incident Datetime', 'Incident Day of Week', 'Incident Year', \n", " 'Report Datetime', 'Row ID', 'Incident ID', 'CAD Number', 'Report Type Code', \n", " 'Report Type Description', 'Filed Online', 'Incident Code', 'Incident Subcategory', \n", " 'Incident Description', 'Intersection', 'CNN', 'Analysis Neighborhood', \n", " 'Supervisor District', 'Resolution', 'point']\n", "df_crime = df_crime.drop(columns, axis=1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Dropping NaN rows from the remaining dataset." 
] }, { "cell_type": "code", "execution_count": 5, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "Incident Date 0\n", "Incident Time 0\n", "Incident Number 0\n", "Incident Category 11\n", "Police District 0\n", "Latitude 5575\n", "Longitude 5575\n", "dtype: int64" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_crime.isnull().sum()" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Incident Date 0\n", "Incident Time 0\n", "Incident Number 0\n", "Incident Category 0\n", "Police District 0\n", "Latitude 0\n", "Longitude 0\n", "dtype: int64" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_crime.dropna(inplace=True)\n", "df_crime.isnull().sum()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "List the types of incidents reported." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array(['Robbery', 'Fire Report', 'Assault', 'Malicious Mischief',\n", " 'Larceny Theft', 'Non-Criminal', 'Miscellaneous Investigation',\n", " 'Disorderly Conduct', 'Warrant', 'Weapons Carrying Etc',\n", " 'Recovered Vehicle', 'Other Miscellaneous', 'Burglary',\n", " 'Missing Person', 'Suspicious Occ', 'Civil Sidewalks', 'Fraud',\n", " 'Motor Vehicle Theft', 'Traffic Violation Arrest', 'Drug Offense',\n", " 'Weapons Offense', 'Offences Against The Family And Children',\n", " 'Stolen Property', 'Lost Property', 'Other Offenses',\n", " 'Traffic Collision', 'Suicide', 'Homicide', 'Vehicle Misplaced',\n", " 'Other', 'Family Offense', 'Forgery And Counterfeiting',\n", " 'Sex Offense', 'Arson', 'Courtesy Report', 'Case Closure',\n", " 'Gambling', 'Drug Violation', 'Prostitution', 'Juvenile Offenses',\n", " 'Embezzlement', 'Vehicle Impounded', 'Vandalism',\n", " 'Human Trafficking (A), Commercial Sex Acts', 'Liquor Laws',\n", " 'Suspicious', 'Motor 
Vehicle Theft?', 'Rape', 'Weapons Offence',\n", " 'Human Trafficking, Commercial Sex Acts'], dtype=object)" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_crime['Incident Category'].unique()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Remove incidents in the 'Non-Criminal' category." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "df_crime = df_crime[df_crime['Incident Category'] != 'Non-Criminal'].reset_index(drop=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Visualize crime distribution by category." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "df_crime['Incident Category'].value_counts().plot(kind='bar', figsize=(16,8))\n", "plt.ylabel('Number of incidents')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The most common category is larceny theft, followed by assault and burglary, not counting 'Other Miscellaneous' and 'Malicious Mischief'." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Visualize crime distribution by police district." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "scrolled": false }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAA8MAAAIPCAYAAABXKAxNAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAIABJREFUeJzs3Xu87XVdJ/7XWxDvCsqxHC4dctAZNEs7EqaTpSOXKFDT1KzIKKaycqaZMSyLSccJy7Js0mIGCqufiHQBFUNivEwlKF4RL8NJSU54wQEvo4mi798f63tkc9hn733YZ+3vWfv7fD4e67HW9/P9rrXe68tlr9f6fi7V3QEAAIApudPYBQAAAMBGE4YBAACYHGEYAACAyRGGAQAAmBxhGAAAgMkRhgEAAJgcYRgAAIDJEYYBAACYHGEYAACAyRGGAQAAmJz9xy5gox188MG9devWscsAAABgLzv44INzySWXXNLdx6927OTC8NatW3PllVeOXQYAAABzUFUHr+U43aQBAACYHGEYAACAyRGGAQAAmBxhGAAAgMkRhgEAAJgcYRgAAIDJEYYBAACYnLmF4ao6p6o+VVXv36X956rqw1V1dVX9xpL251XV9mHfcUvajx/atlfV6Uvaj6iqK6rqmqp6dVUdMK/PAgAAwOYyzyvDf5zk+KUNVfU9SU5O8rDufkiSlwztRyV5epKHDM95eVXtV1X7Jfn9JCckOSrJM4Zjk+TFSV7a3UcmuSnJqXP8LAAAAGwicwvD3f3WJDfu0vzTSc7s7puHYz41tJ+c5Lzuvrm7P5pke5Kjh9v27v5Id385yXlJTq6qSvK4JBcMzz83yRPn9VkAAADYXDZ6zPCDkvyboXvzW6rqkUP7IUmuW3LcjqFtd+33S/KZ7r5ll3YAAABY1f4jvN9BSY5J8sgk51fVNyepZY7tLB/We4Xjl1VVpyU5LUkOP/zwPSwZAACAzWajrwzvSPIXPfP2JF9LcvDQftiS4w5Ncv0K7Z9OcmBV7b9L+7K6+6zu3tbd27Zs2bLXPgwAAACLaaPD8F9lNtY3VfWgJAdkFmwvSvL0qrpLVR2R5Mgkb0/yjiRHDjNHH5DZJFsXdXcneVOSpwyve0qSCzf0kwAAALCw5tZNuqpeleS7kxxcVTuSnJHknCTnDMstfTnJKUOwvbqqzk/ygSS3JHl2d391eJ2fTXJJkv2SnNPdVw9v8YtJzquq/5rk3UnOntdnAQAAYHOpWRadjm3btvWVV145dhkAAADMQVW9s7u3rXbcRneTBgAAgNEJwwAAAEzORi+ttOlsP
f31Y5ewqmvPPHHsEgAAAPYprgwDAAAwOcIwAAAAkyMMAwAAMDnCMAAAAJMjDAMAADA5wjAAAACTIwwDAAAwOcIwAAAAkyMMAwAAMDnCMAAAAJMjDAMAADA5wjAAAACTIwwDAAAwOcIwAAAAkyMMAwAAMDnCMAAAAJMjDAMAADA5wjAAAACTIwwDAAAwOcIwAAAAkyMMAwAAMDnCMAAAAJMjDAMAADA5wjAAAACTIwwDAAAwOcIwAAAAkyMMAwAAMDnCMAAAAJMjDAMAADA5wjAAAACTIwwDAAAwOcIwAAAAkyMMAwAAMDn7j10AbD399WOXsKprzzxx7BIAAIC9yJVhAAAAJkcYBgAAYHKEYQAAACZHGAYAAGByhGEAAAAmRxgGAABgcoRhAAAAJmduYbiqzqmqT1XV+5fZ95+qqqvq4GG7quplVbW9qt5XVY9YcuwpVXXNcDtlSfu3V9VVw3NeVlU1r88CAADA5jLPK8N/nOT4XRur6rAkT0jysSXNJyQ5cridluQVw7H3TXJGku9IcnSSM6rqoOE5rxiO3fm8270XAAAALGduYbi735rkxmV2vTTJc5P0kraTk7yyZy5PcmBVPSDJcUku7e4bu/umJJcmOX7Yd+/uflt3d5JXJnnivD4LAAAAm8uGjhmuqpOS/FN3v3eXXYckuW7J9o6hbaX2Hcu0AwAAwKr236g3qqq7J/nlJMcut3uZtr4D7bt779My61Kdww8/fNVaAQAA2Nw28srwA5MckeS9VXVtkkOTvKuqvjGzK7uHLTn20CTXr9J+6DLty+rus7p7W3dv27Jly174KAAAACyyDQvD3X1Vd9+/u7d299bMAu0juvsTSS5K8qPDrNLHJPlsd388ySVJjq2qg4aJs45Ncsmw7/NVdcwwi/SPJrlwoz4LAAAAi22eSyu9Ksnbkjy4qnZU1akrHH5xko8k2Z7kfyT5mSTp7huTvDDJO4bbC4a2JPnpJP9zeM4/JHnDPD4HAAAAm8/cxgx39zNW2b91yeNO8uzdHHdOknOWab8yyUPXVyUAAABTtKGzSQMAAMC+QBgGAABgcoRhAAAAJkcYBgAAYHKEYQAAACZHGAYAAGByhGEAAAAmRxgGAABgcoRhAAAAJkcYBgAAYHKEYQAAACZHGAYAAGByhGEAAAAmRxgGAABgcoRhAAAAJkcYBgAAYHKEYQAAACZHGAYAAGByhGEAAAAmRxgGAABgcoRhAAAAJkcYBgAAYHKEYQAAACZHGAYAAGByhGEAAAAmRxgGAABgcoRhAAAAJkcYBgAAYHL2H7sAYP22nv76sUtY1bVnnjh2CQAA8HXCMED8oAAAMDW6SQMAADA5wjAAAACTIwwDAAAwOcIwAAAAkyMMAwAAMDnCMAAAAJMjDAMAADA5wjAAAACTIwwDAAAwOcIwAAAAkyMMAwAAMDnCMAAAAJMjDAMAADA5cwvDVXVOVX2qqt6/pO03q+pDVfW+qvrLqjpwyb7nVdX2qvpwVR23pP34oW17VZ2+pP2Iqrqiqq6pqldX1QHz+iwAAABsLvO8MvzHSY7fpe3SJA/t7ocl+T9JnpckVXVUkqcnecjwnJdX1X5VtV+S309yQpKjkjxjODZJXpzkpd19ZJKbkpw6x88CAADAJjK3MNzdb01y4y5tb+zuW4bNy5McOjw+Ocl53X1zd380yfYkRw+37d39ke7+cpLzkpxcVZXkcUkuGJ5/bpInzuuzAAAAsLmMOWb4x5O8YXh8SJLrluzbMbTtrv1+ST6zJFjvbAcAAIBVjRKGq+qXk9yS5M92Ni1zWN+B9t2932lVdWVVXXnDDTfsabkAAABsMhsehqvqlCTfl+SZ3b0zwO5IctiSww5Ncv0K7Z9OcmBV7b9L+7K6+6zu3tbd27Zs2bJ3PggAAAALa0PDcFUdn+QXk5zU3V9csuuiJE+vqrtU1RFJjkzy9iTvSHLkMHP0AZlNsnXREKLflOQpw/NPSXLhRn0OAAAAFts8l
1Z6VZK3JXlwVe2oqlOT/Pck90pyaVW9p6r+IEm6++ok5yf5QJK/TvLs7v7qMCb4Z5NckuSDSc4fjk1mofoXqmp7ZmOIz57XZwEAAGBz2X/1Q+6Y7n7GMs27Dazd/aIkL1qm/eIkFy/T/pHMZpsGAACAPTLmbNIAAAAwCmEYAACAyRGGAQAAmBxhGAAAgMkRhgEAAJgcYRgAAIDJEYYBAACYHGEYAACAyRGGAQAAmBxhGAAAgMkRhgEAAJgcYRgAAIDJEYYBAACYHGEYAACAyRGGAQAAmBxhGAAAgMkRhgEAAJgcYRgAAIDJEYYBAACYHGEYAACAyRGGAQAAmBxhGAAAgMkRhgEAAJgcYRgAAIDJEYYBAACYHGEYAACAyRGGAQAAmBxhGAAAgMkRhgEAAJgcYRgAAIDJEYYBAACYHGEYAACAyRGGAQAAmBxhGAAAgMkRhgEAAJgcYRgAAIDJEYYBAACYHGEYAACAyRGGAQAAmBxhGAAAgMkRhgEAAJgcYRgAAIDJEYYBAACYHGEYAACAyZlbGK6qc6rqU1X1/iVt962qS6vqmuH+oKG9quplVbW9qt5XVY9Y8pxThuOvqapTlrR/e1VdNTznZVVV8/osAAAAbC6rhuGq+o2qundV3bmqLquqT1fVD6/htf84yfG7tJ2e5LLuPjLJZcN2kpyQ5MjhdlqSVwzvfd8kZyT5jiRHJzljZ4AejjltyfN2fS8AAABY1v5rOObY7n5uVT0pyY4kT03ypiR/utKTuvutVbV1l+aTk3z38PjcJG9O8otD+yu7u5NcXlUHVtUDhmMv7e4bk6SqLk1yfFW9Ocm9u/ttQ/srkzwxyRvW8HkAmJOtp79+7BJWdO2ZJ45dAgCwj1hLN+k7D/ffm+RVO4PpHfQN3f3xJBnu7z+0H5LkuiXH7RjaVmrfsUw7AAAArGotYfi1VfWhJNuSXFZVW5J8aS/Xsdx4374D7cu/eNVpVXVlVV15ww033MESAQAA2CzWEobPSPKoJNu6+ytJvpjkpDv4fp8cuj9nuP/U0L4jyWFLjjs0yfWrtB+6TPuyuvus7t7W3du2bNlyB0sHAABgs1hLGH5bd9/U3V9Nku7+Qu742NyLkuycEfqUJBcuaf/RYVbpY5J8duhGfUmSY6vqoGHirGOTXDLs+3xVHTPMIv2jS14LAAAAVrTbCbSq6hszG4d7t6p6eG7tmnzvJHdf7YWr6lWZTYB1cFXtyOwK85lJzq+qU5N8LLPJuJLk4szGJG/P7Mrzs5Kku2+sqhcmecdw3AuWjFn+6cxmrL5bZuHc5FkAAACsyUqzSR+X5Mcy64L820vaP5/kl1Z74e5+xm52PX6ZYzvJs3fzOuckOWeZ9iuTPHS1OgAAAGBXuw3D3X1uknOr6ge6+883sCYAAACYq7WsM/y6qvqhJFuXHt/dL5hXUQAAADBPawnDFyb5bJJ3Jrl5vuUAAADA/K0lDB/a3cfPvRIAAADYIGtZWunvq+pb5l4JAAAAbJC1XBl+TJIfq6qPZtZNujKbAPphc60MAAAA5mQtYfiEuVcBAAAAG2jVbtLd/Y9JDkvyuOHxF9fyPAAAANhXrRpqq+qMJL+Y5HlD052T/Ok8iwIAAIB5WssV3iclOSnJF5Kku69Pcq95FgUAAADztJYw/OXu7iSdJFV1j/mWBAAAAPO1ljB8flX9YZIDq+onk/xNkv8x37IAAABgfladTbq7X1JVT0jyuSQPTvKr3X3p3CsDAACAOVnL0koZwq8ADAAAwKaw2zBcVZ/PME54Od1977lUBAAAAHO22zDc3fdKkqp6QZJPJPmTJJXkmTGbNAAAAAtsLRNoHdfdL+/uz3f357r7FUl+YN6FAQAAwLysJQx/taqeWVX7VdWdquqZSb4678IAAABgXtYShn8oyQ8m+eRwe+rQBgAAAAtpLUsrXZvk5PmXAgAAABtjpdmkn9vdv1FVv5dlZpXu7p+fa2UAAAAwJytdGf7gcH/lRhQCA
AAAG2WlpZVeO9yfu3HlAAAAwPytOoFWVV1aVQcu2T6oqi6Zb1kAAAAwP2uZTXpLd39m50Z335Tk/vMrCQAAAOZrresMH75zo6q+KctMqAUAAACLYtWllZL8cpK/raq3DNvfleS0+ZUEAAAA87WWdYb/uqoekeSYJJXkP3T3p+deGQAAAMzJWq4MJ8ldktw4HH9UVaW73zq/sgAAAGB+Vg3DVfXiJE9LcnWSrw3NnUQYBgAAYCGt5crwE5M8uLtvnncxAAAAsBHWMpv0R5Lced6FAAAAwEZZy5XhLyZ5T1VdluTrV4e7++fnVhUAAADM0VrC8EXDDQAAADaFtSytdO5GFAIAAAAbZbdhuKrO7+4frKqrMps9+ja6+2FzrQwAAADmZKUrw88Z7r9vIwoBAACAjbLbMNzdHx/u/3HjygEAAID5W8vSSgAAALCpCMMAAABMzm7D8LCucKrqxRtXDgAAAMzfShNoPaCqHpvkpKo6L0kt3dnd75prZQAAADAnK4XhX01yepJDk/z2Lvs6yePmVRQATNXW018/dgkruvbME8cuAQD2ipVmk74gyQVV9Svd/cINrAkAAADmatUJtLr7hVV1UlW9ZLite93hqvoPVXV1Vb2/ql5VVXetqiOq6oqquqaqXl1VBwzH3mXY3j7s37rkdZ43tH+4qo5bb10AAABMw6phuKp+PclzknxguD1naLtDquqQJD+fZFt3PzTJfkmenuTFSV7a3UcmuSnJqcNTTk1yU3f/yyQvHY5LVR01PO8hSY5P8vKq2u+O1gUAAMB0rGVppROTPKG7z+nuczILnusdMLR/krtV1f5J7p7k45mNQb5g2H9ukicOj08etjPsf3xV1dB+Xnff3N0fTbI9ydHrrAsAAIAJWOs6wwcueXyf9bxhd/9Tkpck+VhmIfizSd6Z5DPdfctw2I4khwyPD0ly3fDcW4bj77e0fZnnAAAAwG6tNJv0Tr+e5N1V9abMllf6riTPu6NvWFUHZXZV94gkn0nymiQnLHNo73zKbvbtrn259zwtyWlJcvjhh+9hxQAAAGw2a5lA61VJjknyF8PtUd193jre898m+Wh339DdXxle8zuTHDh0m05myzldPzzekeSwJBn23yfJjUvbl3nOrp/hrO7e1t3btmzZso7SAQAA2AzW1E26uz/e3Rd194Xd/Yl1vufHkhxTVXcfxv4+PrOJud6U5CnDMackuXB4fNGwnWH//+ruHtqfPsw2fUSSI5O8fZ21AQAAMAFr6Sa9V3X3FVV1QZJ3JbklybuTnJXk9UnOq6r/OrSdPTzl7CR/UlXbM7si/PThda6uqvMzC9K3JHl2d391Qz8MAAAAC2nDw3CSdPcZSc7YpfkjWWY26O7+UpKn7uZ1XpTkRXu9QAAAADa1FbtJV9Wdqur9G1UMAAAAbIQVw3B3fy3Je6vKFMwAAABsGmvpJv2AJFdX1duTfGFnY3efNLeqAAAAYI7WEoZ/be5VAAAAwAZaNQx391uq6puSHNndf1NVd0+y3/xLAwAAgPlYdZ3hqvrJJBck+cOh6ZAkfzXPogAAAGCeVg3DSZ6d5NFJPpck3X1NkvvPsygAAACYp7WE4Zu7+8s7N6pq/yQ9v5IAAABgvtYSht9SVb+U5G5V9YQkr0ny2vmWBQAAAPOzljB8epIbklyV5N8luTjJ8+dZFAAAAMzTWmaT/lpVnZvkisy6R3+4u3WTBgAAYGGtGoar6sQkf5DkH5JUkiOq6t919xvmXRwAAADMw6phOMlvJfme7t6eJFX1wCSvTyIMAwD7nK2nv37sElZ17Zknjl0CwOStZczwp3YG4cFHknxqTvUAAADA3O32ynBVPXl4eHVVXZzk/MzGDD81yTs2oDYAAACYi5W6SX//ksefTPLY4fENSQ6aW0UAAAAwZ7sNw939rI0sBAAAADbKWmaTPiLJzyXZuvT47j5pfmUBAADA/KxlNum/SnJ2ktcm+dp8ywEAAID5W0sY/lJ3v
2zulQAAAMAGWUsY/t2qOiPJG5PcvLOxu981t6oAABiNtZqBKVhLGP6WJD+S5HG5tZt0D9sAAACwcNYShp+U5Ju7+8vzLgYAAAA2wp3WcMx7kxw470IAAABgo6zlyvA3JPlQVb0jtx0zbGklAAAAFtJawvAZc68CAAAANtCqYbi737IRhQAAAMBGWTUMV9XnM5s9OkkOSHLnJF/o7nvPszAAAACYl7VcGb7X0u2qemKSo+dWEQAAAMzZWmaTvo3u/qtYYxgAAIAFtpZu0k9esnmnJNtya7dpAAAAWDhrmU36+5c8viXJtUlOnks1AAAAsAHWMmb4WRtRCAAAAGyU3YbhqvrVFZ7X3f3COdQDAAAAc7fSleEvLNN2jySnJrlfEmEYAACAhbTbMNzdv7XzcVXdK8lzkjwryXlJfmt3zwMAAIB93Ypjhqvqvkl+Ickzk5yb5BHdfdNGFAYAAADzstKY4d9M8uQkZyX5lu7+fxtWFQAAAMzRnVbY9x+T/Iskz09yfVV9brh9vqo+tzHlAQAAwN630pjhlYIyAAAALCyBFwAAgMkRhgEAAJgcYRgAAIDJGSUMV9WBVXVBVX2oqj5YVY+qqvtW1aVVdc1wf9BwbFXVy6pqe1W9r6oeseR1ThmOv6aqThnjswAAALB4xroy/LtJ/rq7/1WSb03ywSSnJ7msu49MctmwnSQnJDlyuJ2W5BXJ19dAPiPJdyQ5OskZOwM0AAAArGTDw3BV3TvJdyU5O0m6+8vd/ZkkJyc5dzjs3CRPHB6fnOSVPXN5kgOr6gFJjktyaXff2N03Jbk0yfEb+FEAAABYUGNcGf7mJDck+aOqendV/c+qukeSb+jujyfJcH//4fhDkly35Pk7hrbdtQMAAMCKxgjD+yd5RJJXdPfDk3wht3aJXk4t09YrtN/+BapOq6orq+rKG264YU/rBQAAYJMZIwzvSLKju68Yti/ILBx/cuj+nOH+U0uOP2zJ8w9Ncv0K7bfT3Wd197bu3rZly5a99kEAAABYTBsehrv7E0muq6oHD02PT/KBJBcl2Tkj9ClJLhweX5TkR4dZpY9J8tmhG/UlSY6tqoOGibOOHdoAAABgRfuP9L4/l+TPquqAJB9J8qzMgvn5VXVqko8leepw7MVJvjfJ9iRfHI5Nd99YVS9M8o7huBd0940b9xEAAABYVKOE4e5+T5Jty+x6/DLHdpJn7+Z1zklyzt6tDgAAgM1urHWGAQAAYDTCMAAAAJMjDAMAADA5wjAAAACTIwwDAAAwOcIwAAAAkyMMAwAAMDnCMAAAAJMjDAMAADA5wjAAAACTIwwDAAAwOcIwAAAAkyMMAwAAMDnCMAAAAJMjDAMAADA5wjAAAACTIwwDAAAwOcIwAAAAkyMMAwAAMDnCMAAAAJMjDAMAADA5+49dAAAAbDZbT3/92CWs6tozTxy7BBiVK8MAAABMjjAMAADA5AjDAAAATI4wDAAAwOQIwwAAAEyOMAwAAMDkCMMAAABMjjAMAADA5AjDAAAATI4wDAAAwOQIwwAAAEzO/mMXAAAAsKutp79+7BJWde2ZJ45dAuvgyjAAAACTIwwDAAAwOcIwAAAAkyMMAwAAMDnCMAAAAJMjDAMAADA5wjAAAACTIwwDAAAwOcIwAAAAkyMMAwAAMDmjheGq2q+q3l1Vrxu2j6iqK6rqmqp6dVUdMLTfZdjePuzfuuQ1nje0f7iqjhvnkwAAALBoxrwy/JwkH1yy/eIkL+3uI5PclOTUof3UJDd1979M8tLhuFTVUUmenuQhSY5P8vKq2m+DagcAAGCBjRKGq+rQJCcm+Z/DdiV5XJILhkPOTfLE4fHJw3aG/Y8fjj85yXndfXN3fzTJ9iRHb8wnAAAAYJGNdWX4d5I8N8nXhu37JflMd98ybO9Icsjw+JAk1yXJsP+zw/Ffb1/mOQAAALBbGx6Gq+r7knyqu9+5tHmZQ3uVfSs9Z9f3PK2qrqyqK2+44YY9qhcAAIDNZ
4wrw49OclJVXZvkvMy6R/9OkgOrav/hmEOTXD883pHksCQZ9t8nyY1L25d5zm1091ndva27t23ZsmXvfhoAAAAWzoaH4e5+Xncf2t1bM5sA63919zOTvCnJU4bDTkly4fD4omE7w/7/1d09tD99mG36iCRHJnn7Bn0MAAAAFtj+qx+yYX4xyXlV9V+TvDvJ2UP72Un+pKq2Z3ZF+OlJ0t1XV9X5ST6Q5JYkz+7ur2582QAAACyaUcNwd785yZuHxx/JMrNBd/eXkjx1N89/UZIXza9CAAAANqMx1xkGAACAUQjDAAAATI4wDAAAwOQIwwAAAEyOMAwAAMDkCMMAAABMjjAMAADA5AjDAAAATI4wDAAAwOQIwwAAAEyOMAwAAMDkCMMAAABMjjAMAADA5AjDAAAATI4wDAAAwOQIwwAAAEyOMAwAAMDkCMMAAABMjjAMAADA5AjDAAAATI4wDAAAwOQIwwAAAEyOMAwAAMDkCMMAAABMjjAMAADA5AjDAAAATI4wDAAAwOQIwwAAAEyOMAwAAMDkCMMAAABMjjAMAADA5AjDAAAATI4wDAAAwOQIwwAAAEyOMAwAAMDkCMMAAABMjjAMAADA5AjDAAAATI4wDAAAwOQIwwAAAEyOMAwAAMDkCMMAAABMjjAMAADA5AjDAAAATM6Gh+GqOqyq3lRVH6yqq6vqOUP7favq0qq6Zrg/aGivqnpZVW2vqvdV1SOWvNYpw/HXVNUpG/1ZAAAAWExjXBm+Jcl/7O5/neSYJM+uqqOSnJ7ksu4+Msllw3aSnJDkyOF2WpJXJLPwnOSMJN+R5OgkZ+wM0AAAALCSDQ/D3f3x7n7X8PjzST6Y5JAkJyc5dzjs3CRPHB6fnOSVPXN5kgOr6gFJjktyaXff2N03Jbk0yfEb+FEAAABYUKOOGa6qrUkenuSKJN/Q3R9PZoE5yf2Hww5Jct2Sp+0Y2nbXDgAAACsaLQxX1T2T/HmSf9/dn1vp0GXaeoX25d7rtKq6sqquvOGGG/a8WAAAADaVUcJwVd05syD8Z939F0PzJ4fuzxnuPzW070hy2JKnH5rk+hXab6e7z+rubd29bcuWLXvvgwAAALCQxphNupKcneSD3f3bS3ZdlGTnjNCnJLlwSfuPDrNKH5Pks0M36kuSHFtVBw0TZx07tAEAAMCK9h/hPR+d5EeSXFVV7xnafinJmUnOr6pTk3wsyVOHfRcn+d4k25N8McmzkqS7b6yqFyZ5x3DcC7r7xo35CAAAACyyDQ/D3f23WX68b5I8fpnjO8mzd/Na5yQ5Z+9VBwAAwBSMOps0AAAAjEEYBgAAYHKEYQAAACZHGAYAAGByhGEAAAAmRxgGAABgcoRhAAAAJkcYBgAAYHKEYQAAACZHGAYAAGByhGEAAAAmRxgGAABgcoRhAAAAJkcYBgAAYHKEYQAAACZHGAYAAGByhGEAAAAmRxgGAABgcoRhAAAAJkcYBgAAYHKEYQAAACZHGAYAAGByhGEAAAAmZ/+xCwAAAGA+tp7++rFLWNG1Z5442nu7MgwAAMDkCMMAAABMjjAMAADA5AjDAAAATI4wDAAAwOQIwwAAAEyOMAwAAMDkCMMAAABMjjAMAADA5AjDAAAATI4wDAAAwOQIwwAAAEyOMAwAAMDkCMMAAABMjjAMAADA5AjDAAAATI4wDAAAwOQIwwAAAEyOMAwAAMDkCMMAAABMzsKH4ao6vqo+XFXbq+r0sesBAABg37fQYbiq9kvy+0lOSHJUkmdU1VHjVgUAAMC+bqHDcJKjk2zv7o9xFBnHAAAeM0lEQVR095eTnJfk5JFrAgAAYB+36GH4kCTXLdneMbQBAADAblV3j13DHVZVT01yXHf/xLD9I0mO7u6f2+W405KcNmw+OMmHN7TQPXNwkk+PXcSCcw73Dudx/ZzD9XMO1885XD/ncP2cw73DeVw/53D99vVz+Okk6e7jVztw//nXMlc7khy2ZPvQJNfvelB3n5Xkr
I0qaj2q6sru3jZ2HYvMOdw7nMf1cw7XzzlcP+dw/ZzD9XMO9w7ncf2cw/XbTOdw0btJvyPJkVV1RFUdkOTpSS4auSYAAAD2cQt9Zbi7b6mqn01ySZL9kpzT3VePXBYAAAD7uIUOw0nS3RcnuXjsOvaihejOvY9zDvcO53H9nMP1cw7XzzlcP+dw/ZzDvcN5XD/ncP02zTlc6Am0AAAA4I5Y9DHDAAAAsMeEYQAAAG6jqo4Zu4Z5000agH1KVR2dZGuWzGvR3f/faAUtoKr68ST/u7uvGbsWpqWqTlppf3db9QMWRFW9q7sfMTx+W3c/auya9raFn0BrkVXVL6y0v7t/e6Nq2Qyq6jtz+y/QrxytoAVUVVuS/GRufx5/fKyaFonzt35V9cdJjkryniRfHZo7iTC8Z7Ym+eGq+qYk70zyvzMLx+8ZtaoFUlV3T/Ifkxze3T9ZVUcmeXB3v27k0vZ1T11hX8cSmGtWVfddaX9337hRtSy6qjq1u8/epe3M7j59rJoWRC15fNfRqpgjYXhc9xq7gM2iqv4kyQNz+y/QwvCeuTCzL81/k1vPI2vn/K3fMUmO6u6vjV3IIuvuX02SqrpbZj/Q/Ockv5PZMoSszR9l9kPCzishO5K8JokwvILu/pGxa9hE3pnZd5lKcniSm4bHByb5WJIjxitt4Tylqr7U3X+WJFX18iR3GbmmRXCnqjoos6G1Ox9/PSBvhh9khOERdfevjV3DJrItsy/Q+v2vz927+xfHLmKBOX/rd3WSg5N8auxCFllVPT/Jo5PcM8m7k/ynzH6oYe0e2N1Pq6pnJEl3/3NV1WpP4lZVdVySh2TJFaXu/m/jVbRYuvuIJKmqP0hy0bCcaKrqhCT/dszaFtCTk1xUVV9LckKSG7v7Z0auaRHcJ7MfZXb+v+9dS/Z1km/e8Ir2MmF4H1BVd01yam7/B0PXyrV7f5JvTPLxsQtZcK+rqu/d+QeXPeb8rd99knywqi5PcvPOxu5+8nglLaQnJ7klyeuTvCXJ5d39pXFLWjhfHq6sd5JU1QOz5N9JVjZceTswyXdldpX9B5JcPmpRi+uR3f1TOze6+w1V9cIxC1oUu3Q1/4kkf5Xk75K8oKruuxmubM5Td28du4Z5M4HWPqCqXpPkQ0l+KMkLkjwzyQe7+zmjFrZAqupNSb4tydtz2y/QK07kwW1V1eeT3COzc/iVzH4J7O6+96iFLQjnb/2q6vHLtXf3ZRtdy6Krqnslecxw+8Ekn+zux4xb1eKoqickeX5mY9jfmNmV9h/r7jePWdeiqKr3dffDquq93f2tw7+Pf97dx45d26Kpqksy69nxp5n9OPPDSb6ru48btbAFUFUfzfCD1s6mJY+7uxf+yuY8DfNOfKa7Pztsf0+SJya5Nsnvd/eXRyxvrxCG9wFV9e7ufviSPxx3TnJJdz9u7NoWRVU9drn27n7LRteyqIbuf4d198fGrmUROX/sS6rqoUn+TZLHZjaM5LrMJtD61VELWzBVdb/MxrFXZlfXPz1ySQujqq7o7u+oqiuSnJzk/ya5ursfNHJpC2e4unlGZlfZk+StSX7NVc21qao7JXlUd//d2LUsmuG/3yd19/VV9W2ZzYny60keluQr3f0Toxa4F+gmvW/4ynD/meELzCcymwmUNerut1TVNyR55ND09u425nAPdHdX1V8m+faxa1lEzt/6VNVbuvuxVXVTbv8rfnf3irOqcjsvzuwL88uSvKO7v7LK8Qyq6hG7NO0cfnN4VR3e3e/a9Tks6w1VdWCSl+TWyS3PHbekxTSEXr0F76Du/lpVvSS3TobH2t2tu68fHv9wknO6+7eGHxg2xeoEwvC+4axhdrbnZ7bkwD2T/Mq4JS2WqvrBJL+Z5M2ZfXn+var6z919waiFLZ7Lq+qR3f2OsQtZUM7fHfc9w/3Bo1axSXT3icN418MF4T32W8P9XTO7qv7ezP6uPCzJFZl1O2cV3f1fhoevqarXZfal2pXMO
\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# calculating total number of incidents per district\n", "crimedata_police_district = pd.DataFrame(df_crime['Police District'].value_counts().astype(float))\n", "crimedata_police_district = crimedata_police_district.reset_index()\n", "crimedata_police_district.columns = ['District', 'Number']\n", "crimedata_police_district.plot(kind='bar', figsize=(16,8), legend=None)\n", "xticks = [i for i in range(len(crimedata_police_district))]\n", "plt.xticks(xticks, list(crimedata_police_district['District']))\n", "plt.xlabel('Police district')\n", "plt.ylabel('Number of incidents')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It appears that the Central Police District has the most number of incidents. The next district is Mission district." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Process the crime data for mapping\n", "We first convert the Pandas *df_crime* into a GeoPandas GeoDataFrame, a spatial version of *df_crime*. This is done by first creating Shapely point geometry objects with proper coordinate projection for each record. Then we attach the results as a new column to *df_crime*." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Creating Shapely object for each record. Details of the coordinate system, ESPG 4326 which represents the standard WGS84 coordinate system, can be found in this link. Here we implement the Point() function from the shapely package." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "geometry = gpd.GeoSeries(df_crime.apply(lambda z: Point(z['Longitude'], z['Latitude']), 1), crs={'init': 'epsg:4326'})" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Convert *df_crime* into GeoDataFrame." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Incident Date Incident Time Incident Number Incident Category \\\n", "0 2018/01/01 01:30 180000263 Robbery \n", "1 2018/01/01 01:59 180000326 Fire Report \n", "2 2018/01/01 02:28 180000348 Assault \n", "3 2018/01/01 02:28 180000348 Malicious Mischief \n", "4 2018/01/01 02:08 180000285 Assault \n", "\n", " Police District Latitude Longitude \\\n", "0 Ingleside 37.732261 -122.423486 \n", "1 Mission 37.765051 -122.419669 \n", "2 Southern 37.782119 -122.396841 \n", "3 Southern 37.782119 -122.396841 \n", "4 Mission 37.748166 -122.418221 \n", "\n", " geometry \n", "0 POINT (-122.4234864149589 37.73226125275223) \n", "1 POINT (-122.4196689738014 37.76505133632968) \n", "2 POINT (-122.3968414285021 37.78211912156566) \n", "3 POINT (-122.3968414285021 37.78211912156566) \n", "4 POINT (-122.4182211716917 37.74816568813204) " ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_crime = gpd.GeoDataFrame(df_crime, geometry=geometry)\n", "df_crime.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To map out the crime data, there are three geological units we can work with: police districe, census tracts, and neighborhoods. Here we choose the last one which is designed by San Francisco Association of Realtors. The data can be obtained from DataSF. We download the shape file from the website and import it using GeoPandas." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " nbrhood nid sfar_distr \\\n", "0 Alamo Square 6e District 6 - Central North \n", "1 Anza Vista 6a District 6 - Central North \n", "2 Balboa Terrace 4a District 4 - Twin Peaks West \n", "3 Bayview 10a District 10 - Southeast \n", "4 Bernal Heights 9a District 9 - Central East \n", "\n", " geometry \n", "0 POLYGON ((-122.4294839489174 37.77509623070431... \n", "1 POLYGON ((-122.4474643913587 37.77986335309237... \n", "2 POLYGON ((-122.464508862148 37.73220849554402,... \n", "3 POLYGON ((-122.38758527039 37.7502633777501, -... \n", "4 POLYGON ((-122.4037549223623 37.74919006373567... " ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nbrhoods = gpd.read_file('sf_neighborhoods.shp')\n", "nbrhoods.head()" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "nbrhoods.plot(figsize=(12,14))\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Check *nbrhoods*'s coordinate reference system." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'init': 'epsg:4326'}\n" ] } ], "source": [ "print(nbrhoods.crs)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Aggregate crime numbers for each neighborhood\n", "\n", "Using the geological information in *df_crime*, we can calculate the number of crimes in each neighborhood by implementing GeoPandas' sjoin function. Since we want to aggregate the number in each neighborhood, we set op='within.' The resulted GeoDataFrame is further grouped by neighborhood. " ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " nbrhood incident_counts\n", "0 Alamo Square 687\n", "1 Anza Vista 310\n", "2 Balboa Terrace 49\n", "3 Bayview 3185\n", "4 Bayview Heights 253" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nbh_crime_counts = gpd.tools.sjoin(df_crime.to_crs(nbrhoods.crs), nbrhoods, how=\"inner\", op='intersects').groupby('nbrhood').size()\n", "nbh_crime_counts = pd.DataFrame(data=nbh_crime_counts.reset_index())\n", "nbh_crime_counts.columns=['nbrhood', 'incident_counts']\n", "nbh_crime_counts.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we combine the nbh_crime_counts and the nbrhoods GeoDataFrames using the merge function. We use __nbrhood__ as the key where the two frames are joined. Details of the implementation can be found here." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " nbrhood nid sfar_distr \\\n", "0 Alamo Square 6e District 6 - Central North \n", "1 Anza Vista 6a District 6 - Central North \n", "2 Balboa Terrace 4a District 4 - Twin Peaks West \n", "3 Bayview 10a District 10 - Southeast \n", "4 Bernal Heights 9a District 9 - Central East \n", "\n", " geometry incident_counts \n", "0 POLYGON ((-122.4294839489174 37.77509623070431... 687 \n", "1 POLYGON ((-122.4474643913587 37.77986335309237... 310 \n", "2 POLYGON ((-122.464508862148 37.73220849554402,... 49 \n", "3 POLYGON ((-122.38758527039 37.7502633777501, -... 3185 \n", "4 POLYGON ((-122.4037549223623 37.74919006373567... 1561 " ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nbrhoods = nbrhoods.merge(nbh_crime_counts, on='nbrhood')\n", "nbrhoods.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Delete *df_crime* in order to reduce memory usage." ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [], "source": [ "del df_crime" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3.2 San Francisco Housing Data Analysis \n", "Back to page top" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Import the San Francisco Historica Secured Property Tax Rolls, 2007-2015 and perform an initial check." ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/Users/chiachen/miniconda3/envs/mlenv/lib/python3.6/site-packages/IPython/core/interactiveshell.py:3020: DtypeWarning: Columns (29) have mixed types. 
Specify dtype option on import or set low_memory=False.\n", " interactivity=interactivity, compiler=compiler, result=result)\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "RangeIndex: 1612110 entries, 0 to 1612109\n", "Data columns (total 43 columns):\n", "Closed Roll Fiscal Year 1612109 non-null float64\n", "Property Location 1612110 non-null object\n", "Neighborhood Code 1611431 non-null object\n", "Neighborhood Code Definition 1564368 non-null object\n", "Block and Lot Number 1612110 non-null object\n", "Volume Number 1612110 non-null int64\n", "Property Class Code 1611252 non-null object\n", "Property Class Code Definition 1596776 non-null object\n", "Year Property Built 1483511 non-null float64\n", "Number of Bathrooms 1612110 non-null float64\n", "Number of Bedrooms 1612110 non-null int64\n", "Number of Rooms 1612110 non-null int64\n", "Number of Stories 1612110 non-null int64\n", "Number of Units 1612110 non-null int64\n", "Characteristics Change Date 1404465 non-null object\n", "Zoning Code 1393875 non-null object\n", "Construction Type 1357497 non-null object\n", "Lot Depth 1612110 non-null float64\n", "Lot Frontage 1612110 non-null float64\n", "Property Area in Square Feet 1612110 non-null int64\n", "Basement Area 1612109 non-null float64\n", "Lot Area 1612109 non-null float64\n", "Lot Code 569318 non-null object\n", "Prior Sales Date 0 non-null float64\n", "Recordation Date 1491934 non-null object\n", "Document Number 635747 non-null object\n", "Document Number 2 1612109 non-null float64\n", "Tax Rate Area Code 1607742 non-null float64\n", "Percent of Ownership 1612109 non-null float64\n", "Closed Roll Exemption Type Code 740703 non-null object\n", "Closed Roll Exemption Type Code Definition 740660 non-null object\n", "Closed Roll Status Code 28589 non-null object\n", "Closed Roll Misc Exemption Value 1612109 non-null float64\n", "Closed Roll Homeowner Exemption Value 1612109 non-null float64\n", "Current Sales Date 835570 non-null 
object\n", "Closed Roll Assessed Fixtures Value 1612109 non-null float64\n", "Closed Roll Assessed Improvement Value 1612109 non-null float64\n", "Closed Roll Assessed Land Value 1612109 non-null float64\n", "Closed Roll Assessed Personal Prop Value 1612109 non-null float64\n", "Zipcode of Parcel 1584672 non-null float64\n", "Supervisor District 1586142 non-null float64\n", "Neighborhoods - Analysis Boundaries 1584816 non-null object\n", "Location 1591808 non-null object\n", "dtypes: float64(19), int64(6), object(18)\n", "memory usage: 528.9+ MB\n" ] } ], "source": [ "df_import = pd.read_csv('Historic_Secured_Property_Tax_Rolls.csv')\n", "df_import.info()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will work with the following columns only." ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [], "source": [ "columns = ['Block and Lot Number', \n", " 'Closed Roll Assessed Fixtures Value',\n", " 'Closed Roll Assessed Improvement Value',\n", " 'Closed Roll Assessed Land Value',\n", " 'Closed Roll Assessed Personal Prop Value', 'Neighborhoods - Analysis Boundaries',\n", " 'Location']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Select the records for fiscal year 2014. *df_housing* contains only the columns defined in the previous cell." ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [], "source": [ "df_housing = df_import[df_import['Closed Roll Fiscal Year']==2014.0].loc[:,columns].reset_index(drop=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Check whether there are any NaNs. If so, drop those rows." 
] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Block and Lot Number 0\n", "Closed Roll Assessed Fixtures Value 0\n", "Closed Roll Assessed Improvement Value 0\n", "Closed Roll Assessed Land Value 0\n", "Closed Roll Assessed Personal Prop Value 0\n", "Neighborhoods - Analysis Boundaries 2497\n", "Location 1619\n", "dtype: int64" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_housing.isnull().sum()" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Block and Lot Number 0\n", "Closed Roll Assessed Fixtures Value 0\n", "Closed Roll Assessed Improvement Value 0\n", "Closed Roll Assessed Land Value 0\n", "Closed Roll Assessed Personal Prop Value 0\n", "Neighborhoods - Analysis Boundaries 0\n", "Location 0\n", "dtype: int64" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_housing.dropna(inplace=True)\n", "df_housing.isnull().sum()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Compute the total value of each property. The total value is the sum of the assessed fixtures value, improvement value, land value, and personal property value." ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [], "source": [ "df_housing['total_price'] = df_housing['Closed Roll Assessed Fixtures Value'] + \\\n", " df_housing['Closed Roll Assessed Improvement Value'] + \\\n", " df_housing['Closed Roll Assessed Land Value'] + \\\n", " df_housing['Closed Roll Assessed Personal Prop Value']" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Block and Lot Number Closed Roll Assessed Fixtures Value \\\n", "0 3751435 0.0 \n", "2 6276009 0.0 \n", "3 3751420 0.0 \n", "4 7517378 0.0 \n", "5 3735098 0.0 \n", "\n", " Closed Roll Assessed Improvement Value Closed Roll Assessed Land Value \\\n", "0 149168.0 149168.0 \n", "2 270000.0 405000.0 \n", "3 128078.0 128078.0 \n", "4 129545.0 141594.0 \n", "5 336716.0 336716.0 \n", "\n", " Closed Roll Assessed Personal Prop Value \\\n", "0 0.0 \n", "2 0.0 \n", "3 0.0 \n", "4 0.0 \n", "5 0.0 \n", "\n", " Neighborhoods - Analysis Boundaries Location \\\n", "0 South of Market (37.7816504619473, -122.399116945614) \n", "2 Excelsior (37.7190514589638, -122.433999199176) \n", "3 South of Market (37.7816504619473, -122.399116945614) \n", "4 Noe Valley (37.7463212609468, -122.441519528492) \n", "5 Financial District/South Beach (37.7857477114134, -122.397398669759) \n", "\n", " total_price \n", "0 298336.0 \n", "2 675000.0 \n", "3 256156.0 \n", "4 271139.0 \n", "5 673432.0 " ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_housing.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Change the format of GPS coordinates so that it is consistent with other datasets." 
] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [], "source": [ "coordinates = df_housing['Location'].str.strip('()') \\\n", " .str.split(', ', expand=True) \\\n", " .rename(columns={0:'Latitude', 1:'Longitude'}) " ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [], "source": [ "columns = list(df_housing.columns) + list(coordinates.columns)" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [], "source": [ "df_housing = pd.concat([df_housing, coordinates], axis=1, ignore_index=True)\n", "df_housing.columns = columns\n", "df_housing = df_housing.drop(columns=['Closed Roll Assessed Fixtures Value',\n", " 'Closed Roll Assessed Improvement Value',\n", " 'Closed Roll Assessed Land Value',\n", " 'Closed Roll Assessed Personal Prop Value',\n", " 'Location'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Latitude and longitude are text objects. Convert them to float." ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [], "source": [ "df_housing[['Latitude','Longitude']] = df_housing[['Latitude','Longitude']].apply(pd.to_numeric)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Final checkup." ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Block and Lot Number Neighborhoods - Analysis Boundaries total_price \\\n", "0 3751435 South of Market 298336.0 \n", "2 6276009 Excelsior 675000.0 \n", "3 3751420 South of Market 256156.0 \n", "4 7517378 Noe Valley 271139.0 \n", "5 3735098 Financial District/South Beach 673432.0 \n", "\n", " Latitude Longitude \n", "0 37.781650 -122.399117 \n", "2 37.719051 -122.433999 \n", "3 37.781650 -122.399117 \n", "4 37.746321 -122.441520 \n", "5 37.785748 -122.397399 " ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_housing.head()" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Int64Index: 204319 entries, 0 to 206815\n", "Data columns (total 5 columns):\n", "Block and Lot Number 204319 non-null object\n", "Neighborhoods - Analysis Boundaries 204319 non-null object\n", "total_price 204319 non-null float64\n", "Latitude 204319 non-null float64\n", "Longitude 204319 non-null float64\n", "dtypes: float64(3), object(2)\n", "memory usage: 9.4+ MB\n" ] } ], "source": [ "df_housing.info()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Processing the housing data for mapping\n", "Following the steps in the previous section, we are going to convert *df_housing* into GeoDataFrame which is will be used for mapping. First, we generate the Shapely point column by combining Longitude and Latitude data." ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [], "source": [ "geometry_housing = gpd.GeoSeries(df_housing.apply(lambda z: Point(z['Longitude'], z['Latitude']), 1), crs={'init': 'epsg:4326'})" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Block and Lot Number Neighborhoods - Analysis Boundaries total_price \\\n", "0 3751435 South of Market 298336.0 \n", "2 6276009 Excelsior 675000.0 \n", "3 3751420 South of Market 256156.0 \n", "4 7517378 Noe Valley 271139.0 \n", "5 3735098 Financial District/South Beach 673432.0 \n", "\n", " Latitude Longitude geometry \n", "0 37.781650 -122.399117 POINT (-122.399116945614 37.7816504619473) \n", "2 37.719051 -122.433999 POINT (-122.433999199176 37.7190514589638) \n", "3 37.781650 -122.399117 POINT (-122.399116945614 37.7816504619473) \n", "4 37.746321 -122.441520 POINT (-122.441519528492 37.7463212609468) \n", "5 37.785748 -122.397399 POINT (-122.397398669759 37.7857477114134) " ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_housing = gpd.GeoDataFrame(df_housing, geometry=geometry_housing)\n", "df_housing.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Compute average housing price per neighborhood\n", "Using sjoin and mean functions, we compute average housing price in each neighborhood. The price is in units of million." ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " nbrhood house_avg_price\n", "0 Alamo Square 0.862016\n", "1 Anza Vista 1.121534\n", "2 Balboa Terrace 0.737963\n", "3 Bayview 0.439684\n", "4 Bayview Heights 0.296895" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nbh_house_avg_value = gpd.tools.sjoin(df_housing.to_crs(nbrhoods.crs), nbrhoods, how=\"inner\", op='intersects').groupby('nbrhood').mean()\n", "nbh_house_avg_value = pd.DataFrame(data=nbh_house_avg_value.reset_index())\n", "nbh_house_avg_value = nbh_house_avg_value.drop(columns=['Latitude', 'Longitude', 'index_right', 'incident_counts'])\n", "nbh_house_avg_value.columns=['nbrhood', 'house_avg_price']\n", "\n", "# Normalize the price by one million.\n", "nbh_house_avg_value['house_avg_price'] = nbh_house_avg_value['house_avg_price'] / 1_000_000\n", "nbh_house_avg_value.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we merge the average housing price information with the *nbrhoods* GeoDataFrame." ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " nbrhood nid sfar_distr \\\n", "0 Alamo Square 6e District 6 - Central North \n", "1 Anza Vista 6a District 6 - Central North \n", "2 Balboa Terrace 4a District 4 - Twin Peaks West \n", "3 Bayview 10a District 10 - Southeast \n", "4 Bernal Heights 9a District 9 - Central East \n", "\n", " geometry incident_counts \\\n", "0 POLYGON ((-122.4294839489174 37.77509623070431... 687 \n", "1 POLYGON ((-122.4474643913587 37.77986335309237... 310 \n", "2 POLYGON ((-122.464508862148 37.73220849554402,... 49 \n", "3 POLYGON ((-122.38758527039 37.7502633777501, -... 3185 \n", "4 POLYGON ((-122.4037549223623 37.74919006373567... 1561 \n", "\n", " house_avg_price \n", "0 0.862016 \n", "1 1.121534 \n", "2 0.737963 \n", "3 0.439684 \n", "4 0.448200 " ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nbrhoods = nbrhoods.merge(nbh_house_avg_value, on='nbrhood')\n", "nbrhoods.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Delete *df_import* and *df_housing* datasets in order to save memory." ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [], "source": [ "del df_import\n", "del df_housing" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3.3 Generating Crime and Housing Maps using Folium\n", "Back to page top" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this section, we are going to use folium to produce crime and housing price maps of San Francisco using the data prepared in the previous two sections. Before that we first use GeoPandas' representative_point() function to generate a representative location for each neighborhood. This data will be used to create popups on the map." ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " nbrhood incident_counts house_avg_price Latitude Longitude\n", "0 Alamo Square 687 0.862016 37.776076 -122.433919\n", "1 Anza Vista 310 1.121534 37.780611 -122.443255\n", "2 Balboa Terrace 49 0.737963 37.730649 -122.468267\n", "3 Bayview 3185 0.439684 37.732391 -122.387170\n", "4 Bernal Heights 1561 0.448200 37.740230 -122.415885" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#nbh_centroid = pd.DataFrame(nbrhoods.centroid)\n", "nbh_centroid = pd.DataFrame(nbrhoods.representative_point())\n", "nbh_centroid.columns=(['centroid'])\n", "nbh_centroid['nbrhood'] = nbrhoods['nbrhood']\n", "nbh_centroid['incident_counts'] = nbrhoods['incident_counts']\n", "nbh_centroid['house_avg_price'] = nbrhoods['house_avg_price']\n", "\n", "lat = []\n", "lng = []\n", "for index, row in nbh_centroid.iterrows():\n", " tmp = str(row[0]).strip('POINT ()').split(' ')\n", " lng.append(float(tmp[0]))\n", " lat.append(float(tmp[1]))\n", " #print(tmp[0], tmp[1])\n", " \n", "nbh_centroid['Latitude'] = lat\n", "nbh_centroid['Longitude'] = lng\n", "\n", "nbh_centroid = nbh_centroid.drop(columns=['centroid'])\n", "\n", "nbh_centroid.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Define a function that generates popups." ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [], "source": [ "def get_popups(df, field, name, map_object):\n", " for lat, lng, nbrhood, value in zip( df['Latitude'], \n", " df['Longitude'], \n", " df['nbrhood'], \n", " df[field]\n", " ):\n", " label = (\"{0}, {1}: {2:.2f}\").format(nbrhood, name, value)\n", " label = folium.Popup(label, parse_html=True)\n", " folium.CircleMarker(\n", " [lat, lng],\n", " radius=2,\n", " popup=label,\n", " color='green',\n", " fill=True,\n", " fill_color='#3186cc',\n", " fill_opacity=0.3).add_to(map_object)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Define San Francisco's GPS coordinates." 
] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [], "source": [ "SF_Coord = (37.7792808, -122.4192363)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create the crime map object." ] }, { "cell_type": "code", "execution_count": 40, "metadata": { "scrolled": false }, "outputs": [], "source": [ "# Create San Francisco base map\n", "SF_crime_map = folium.Map(location=SF_Coord, zoom_start=12)\n", "\n", "#geodata = gpd.read_file('./tmp/geo_export_0e291dd6-c6fb-40dd-8323-68a750ad5743.geojson')\n", "# Crime data at the neighborhood level\n", "threshold_scale = [0, 1000, 2000, 4000, 6000, 8000]\n", "SF_crime_map.choropleth(geo_data = nbrhoods.to_json(),\n", "                        data = nbrhoods,\n", "                        columns = ['nbrhood', 'incident_counts'],\n", "                        key_on = 'feature.properties.nbrhood',\n", "                        fill_color = 'YlOrRd',\n", "                        fill_opacity = 0.60,\n", "                        line_opacity = 0.60,\n", "                        legend_name = 'Number of incidents',\n", "                        name = 'Number of Incidents',\n", "                        threshold_scale = threshold_scale,\n", "                        reset = True\n", "                        )\n", "\n", "get_popups(nbh_centroid, 'incident_counts', 'Incident Counts', SF_crime_map)\n", "\n", "# Add control layer to the map\n", "#folium.LayerControl().add_to(SF_crime_map)\n", "#SF_crime_map" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create the housing price map object." 
] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [], "source": [ "# Create San Francisco base map\n", "SF_housing_map = folium.Map(location=SF_Coord, zoom_start=12)\n", "\n", "threshold_scale2 = [0.0, 0.5, 1.0, 2.0, 4.0, 8.0]\n", "SF_housing_map.choropleth(geo_data = nbrhoods.to_json(),\n", "                          data = nbrhoods,\n", "                          columns = ['nbrhood', 'house_avg_price'],\n", "                          key_on = 'feature.properties.nbrhood',\n", "                          fill_color = 'YlOrRd',\n", "                          fill_opacity = 0.60,\n", "                          line_opacity = 0.60,\n", "                          legend_name = 'Average Housing Price (Million)',\n", "                          name = 'Average Housing Price',\n", "                          threshold_scale = threshold_scale2,\n", "                          reset = True\n", "                          )\n", "\n", "get_popups(nbh_centroid, 'house_avg_price', 'Avg. House Price (Million)', SF_housing_map)\n", "\n", "# Add control layer to the map\n", "#folium.LayerControl().add_to(SF_housing_map)\n", "#SF_housing_map" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3.4 Exploring and Clustering Venues in San Francisco Neighborhoods \n", "Back to page top\n", "\n", "In this section, we are going to use the Foursquare API to explore San Francisco neighborhoods and cluster them using k-means clustering. From the Toronto postal-code neighborhood exercise, we learned that k-means tends to produce at least one cluster dominated by restaurants. This is exactly the kind of information we need when looking for possible locations for a new restaurant." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Set up the Foursquare API credentials and the basic API call parameters." ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [], "source": [ "# @hidden_cell\n", "# Placeholders: supply your own Foursquare credentials; do not commit real keys to a shared notebook.\n", "CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID'\n", "CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET'\n", "VERSION = '20180927'\n", "\n", "# Set up the Foursquare API call parameters\n", "RADIUS = 500  # search radius in meters\n", "LIMIT = 100   # maximum number of venues returned per call" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Define the function that extracts the category of the venue." ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [], "source": [ "def get_category_type(row):\n", "    # The category list is stored under 'categories' or 'venue.categories', depending on the response layout.\n", "    try:\n", "        categories_list = row['categories']\n", "    except KeyError:\n", "        categories_list = row['venue.categories']\n", "\n", "    if len(categories_list) == 0:\n", "        return None\n", "    else:\n", "        return categories_list[0]['name']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Get venue recommendations from Foursquare Explore API\n", "We reuse the function defined in the Toronto exercise. Here the function calls the __explore__ endpoint to return a list of recommended venues for each neighborhood." 
] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [], "source": [ "def getNearbyVenues(names, latitudes, longitudes, radius, limit):\n", "    # venues_check_list[i] is True when at least one venue was found for the i-th neighborhood\n", "    venues_check_list = []\n", "    venues_list = []\n", "    for idx, (name, lat, lng) in enumerate(zip(names, latitudes, longitudes)):\n", "\n", "        # create the API request URL\n", "        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(\n", "            CLIENT_ID,\n", "            CLIENT_SECRET,\n", "            VERSION,\n", "            lat,\n", "            lng,\n", "            radius,\n", "            limit)\n", "\n", "        # make the GET request\n", "        results = requests.get(url).json()[\"response\"]['groups'][0]['items']\n", "\n", "        # keep only the relevant information for each nearby venue\n", "        venues_list.append([(\n", "            name,\n", "            lat,\n", "            lng,\n", "            v['venue']['name'],\n", "            v['venue']['location']['lat'],\n", "            v['venue']['location']['lng'],\n", "            v['venue']['categories'][0]['name']) for v in results])\n", "\n", "        num_of_venues_found = len(results)\n", "        venues_check_list.append(num_of_venues_found > 0)\n", "        print('{0:4d} Neighborhood: {1:35s}, number of venues found:{2:6d}'.format(idx, name, num_of_venues_found))\n", "\n", "    # flatten the per-neighborhood lists into a single dataframe\n", "    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])\n", "    nearby_venues.columns = ['nbrhood',\n", "                             'nbrhood Latitude',\n", "                             'nbrhood Longitude',\n", "                             'Venue',\n", "                             'Venue Latitude',\n", "                             'Venue Longitude',\n", "                             'Venue Category']\n", "\n", "    return nearby_venues, venues_check_list" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Apply the function getNearbyVenues() to the neighborhoods whose coordinates are extracted from *nbh_centroid*." 
] }, { "cell_type": "code", "execution_count": 45, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", " Search radius: 500.0 meters\n", " Maximum number of venues: 100\n", "\n", " 0 Neighborhood: Alamo Square , number of venues found: 72\n", " 1 Neighborhood: Anza Vista , number of venues found: 18\n", " 2 Neighborhood: Balboa Terrace , number of venues found: 15\n", " 3 Neighborhood: Bayview , number of venues found: 17\n", " 4 Neighborhood: Bernal Heights , number of venues found: 39\n", " 5 Neighborhood: Buena Vista Park/Ashbury Heights , number of venues found: 53\n", " 6 Neighborhood: Central Richmond , number of venues found: 96\n", " 7 Neighborhood: Central Sunset , number of venues found: 12\n", " 8 Neighborhood: Clarendon Heights , number of venues found: 13\n", " 9 Neighborhood: Corona Heights , number of venues found: 59\n", " 10 Neighborhood: Cow Hollow , number of venues found: 100\n", " 11 Neighborhood: Crocker Amazon , number of venues found: 16\n", " 12 Neighborhood: Diamond Heights , number of venues found: 10\n", " 13 Neighborhood: Downtown , number of venues found: 100\n", " 14 Neighborhood: Duboce Triangle , number of venues found: 87\n", " 15 Neighborhood: Eureka Valley / Dolores Heights , number of venues found: 100\n", " 16 Neighborhood: Excelsior , number of venues found: 4\n", " 17 Neighborhood: Financial District/Barbary Coast , number of venues found: 100\n", " 18 Neighborhood: Yerba Buena , number of venues found: 100\n", " 19 Neighborhood: Forest Hill , number of venues found: 6\n", " 20 Neighborhood: Forest Hills Extension , number of venues found: 12\n", " 21 Neighborhood: Forest Knolls , number of venues found: 11\n", " 22 Neighborhood: Glen Park , number of venues found: 47\n", " 23 Neighborhood: Golden Gate Heights , number of venues found: 14\n", " 24 Neighborhood: Golden Gate Park , number of venues found: 6\n", " 25 Neighborhood: Haight Ashbury , number of venues found: 100\n", " 26 
Neighborhood: Hayes Valley , number of venues found: 100\n", " 27 Neighborhood: Hunters Point , number of venues found: 4\n", " 28 Neighborhood: Ingleside , number of venues found: 25\n", " 29 Neighborhood: Ingleside Heights , number of venues found: 8\n", " 30 Neighborhood: Ingleside Terrace , number of venues found: 12\n", " 31 Neighborhood: Inner Mission , number of venues found: 81\n", " 32 Neighborhood: Inner Parkside , number of venues found: 27\n", " 33 Neighborhood: Inner Richmond , number of venues found: 51\n", " 34 Neighborhood: Inner Sunset , number of venues found: 14\n", " 35 Neighborhood: Jordan Park / Laurel Heights , number of venues found: 35\n", " 36 Neighborhood: Lake Street , number of venues found: 19\n", " 37 Neighborhood: Monterey Heights , number of venues found: 4\n", " 38 Neighborhood: Lake Shore , number of venues found: 6\n", " 39 Neighborhood: Lakeside , number of venues found: 54\n", " 40 Neighborhood: Lone Mountain , number of venues found: 13\n", " 41 Neighborhood: Lower Pacific Heights , number of venues found: 100\n", " 42 Neighborhood: Marina , number of venues found: 84\n", " 43 Neighborhood: Merced Heights , number of venues found: 5\n", " 44 Neighborhood: Merced Manor , number of venues found: 13\n", " 45 Neighborhood: Midtown Terrace , number of venues found: 5\n", " 46 Neighborhood: South Beach , number of venues found: 56\n", " 47 Neighborhood: Miraloma Park , number of venues found: 7\n", " 48 Neighborhood: Mission Bay , number of venues found: 57\n", " 49 Neighborhood: Mission Dolores , number of venues found: 100\n", " 50 Neighborhood: Mission Terrace , number of venues found: 22\n", " 51 Neighborhood: Mount Davidson Manor , number of venues found: 18\n", " 52 Neighborhood: Noe Valley , number of venues found: 57\n", " 53 Neighborhood: North Beach , number of venues found: 100\n", " 54 Neighborhood: North Panhandle , number of venues found: 45\n", " 55 Neighborhood: North Waterfront , number of venues found: 39\n", " 56 
Neighborhood: Oceanview , number of venues found: 8\n", " 57 Neighborhood: Outer Mission , number of venues found: 17\n", " 58 Neighborhood: Outer Parkside , number of venues found: 21\n", " 59 Neighborhood: Outer Richmond , number of venues found: 46\n", " 60 Neighborhood: Outer Sunset , number of venues found: 30\n", " 61 Neighborhood: Pacific Heights , number of venues found: 61\n", " 62 Neighborhood: Parkside , number of venues found: 34\n", " 63 Neighborhood: Cole Valley/Parnassus Heights , number of venues found: 25\n", " 64 Neighborhood: Pine Lake Park , number of venues found: 10\n", " 65 Neighborhood: Portola , number of venues found: 5\n", " 66 Neighborhood: Potrero Hill , number of venues found: 47\n", " 67 Neighborhood: Presidio , number of venues found: 12\n", " 68 Neighborhood: Presidio Heights , number of venues found: 26\n", " 69 Neighborhood: Russian Hill , number of venues found: 70\n", " 70 Neighborhood: Saint Francis Wood , number of venues found: 53\n", " 71 Neighborhood: Sea Cliff , number of venues found: 7\n", " 72 Neighborhood: Silver Terrace , number of venues found: 4\n", " 73 Neighborhood: South of Market , number of venues found: 93\n", " 74 Neighborhood: Stonestown , number of venues found: 17\n", " 75 Neighborhood: Sunnyside , number of venues found: 16\n", " 76 Neighborhood: Telegraph Hill , number of venues found: 100\n", " 77 Neighborhood: Twin Peaks , number of venues found: 8\n", " 78 Neighborhood: Van Ness/Civic Center , number of venues found: 72\n", " 79 Neighborhood: Visitacion Valley , number of venues found: 5\n", " 80 Neighborhood: West Portal , number of venues found: 58\n", " 81 Neighborhood: Western Addition , number of venues found: 62\n", " 82 Neighborhood: Westwood Highlands , number of venues found: 10\n", " 83 Neighborhood: Westwood Park , number of venues found: 32\n", " 84 Neighborhood: Lincoln Park , number of venues found: 22\n", " 85 Neighborhood: Sherwood Forest , number of venues found: 4\n", " 86 
Neighborhood: Tenderloin , number of venues found: 100\n", " 87 Neighborhood: Central Waterfront/Dogpatch , number of venues found: 53\n", " 88 Neighborhood: Candlestick Point , number of venues found: 8\n", " 89 Neighborhood: Bayview Heights , number of venues found: 4\n", " 90 Neighborhood: Little Hollywood , number of venues found: 11\n", " 91 Neighborhood: Nob Hill , number of venues found: 48\n" ] } ], "source": [ "nbhs = nbh_centroid.loc[:, 'nbrhood']\n", "latitudes = nbh_centroid.loc[:, 'Latitude']\n", "longitudes = nbh_centroid.loc[:, 'Longitude']\n", "\n", "print('\\n Search radius: {0:8.1f} meters'.format(RADIUS))\n", "print(' Maximum number of venues: {0:6d}\\n'.format(LIMIT))\n", "SF_venues, SF_venues_check_list = getNearbyVenues(nbhs, latitudes, longitudes, RADIUS, LIMIT)" ] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(3567, 7)\n" ] }, { "data": { "text/html": [ "
" ], "text/plain": [ " nbrhood nbrhood Latitude nbrhood Longitude Venue \\\n", "0 Alamo Square 37.776076 -122.433919 Alamo Square \n", "1 Alamo Square 37.776076 -122.433919 Painted Ladies \n", "2 Alamo Square 37.776076 -122.433919 Alamo Square Dog Park \n", "3 Alamo Square 37.776076 -122.433919 Originals Vinyl \n", "4 Alamo Square 37.776076 -122.433919 The Center SF \n", "\n", " Venue Latitude Venue Longitude Venue Category \n", "0 37.775906 -122.434047 Park \n", "1 37.776010 -122.433179 Historic Site \n", "2 37.775878 -122.435740 Dog Run \n", "3 37.775835 -122.431227 Record Shop \n", "4 37.774545 -122.430730 Spiritual Center " ] }, "execution_count": 46, "metadata": {}, "output_type": "execute_result" } ], "source": [ "print(SF_venues.shape)\n", "SF_venues.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Find out the number of unique categories can be curated from all the returned venues." ] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "There are 343 uniques categories.\n" ] } ], "source": [ "print('There are {} uniques categories.'.format(len(SF_venues['Venue Category'].unique())))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Cluster the neighborhoods using the k-means algorithm: Preprocessing\n", "First we one-hot encode venue categories." 
] }, { "cell_type": "code", "execution_count": 48, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(3567, 344)" ] }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "SF_onehot = pd.get_dummies(SF_venues[['Venue Category']], prefix=\"\", prefix_sep=\"\")\n", "\n", "# add the neighborhood column back to the dataframe\n", "SF_onehot['nbrhood'] = SF_venues['nbrhood']\n", "\n", "# move the neighborhood column to the first position\n", "fixed_columns = [SF_onehot.columns[-1]] + list(SF_onehot.columns[:-1])\n", "SF_onehot = SF_onehot[fixed_columns]\n", "\n", "SF_onehot.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Group the rows by neighborhood and take the mean of each one-hot column, which gives the relative frequency of each venue category in that neighborhood." ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " nbrhood ATM Acai House Accessories Store Adult Boutique \\\n", "0 Alamo Square 0.0 0.0 0.0 0.0 \n", "1 Anza Vista 0.0 0.0 0.0 0.0 \n", "2 Balboa Terrace 0.0 0.0 0.0 0.0 \n", "3 Bayview 0.0 0.0 0.0 0.0 \n", "4 Bayview Heights 0.0 0.0 0.0 0.0 \n", "\n", " Afghan Restaurant African Restaurant Alternative Healer \\\n", "0 0.0 0.000000 0.0 \n", "1 0.0 0.000000 0.0 \n", "2 0.0 0.000000 0.0 \n", "3 0.0 0.058824 0.0 \n", "4 0.0 0.000000 0.0 \n", "\n", " American Restaurant Antique Shop Arcade Argentinian Restaurant \\\n", "0 0.000000 0.013889 0.013889 0.0 \n", "1 0.000000 0.000000 0.000000 0.0 \n", "2 0.066667 0.000000 0.000000 0.0 \n", "3 0.000000 0.000000 0.000000 0.0 \n", "4 0.000000 0.000000 0.000000 0.0 \n", "\n", " Art Gallery Art Museum Arts & Crafts Store Asian Restaurant \\\n", "0 0.0 0.0 0.000000 0.0 \n", "1 0.0 0.0 0.055556 0.0 \n", "2 0.0 0.0 0.000000 0.0 \n", "3 0.0 0.0 0.000000 0.0 \n", "4 0.0 0.0 0.000000 0.0 \n", "\n", " Athletics & Sports Automotive Shop BBQ Joint Baby Store Bagel Shop \\\n", "0 0.0 0.0 0.027778 0.0 0.0 \n", "1 0.0 0.0 0.000000 0.0 0.0 \n", "2 0.0 0.0 0.000000 0.0 0.0 \n", "3 0.0 0.0 0.058824 0.0 0.0 \n", "4 0.0 0.0 0.000000 0.0 0.0 \n", "\n", " Bakery Bank Bar Baseball Field Baseball Stadium \\\n", "0 0.013889 0.013889 0.055556 0.0 0.0 \n", "1 0.000000 0.000000 0.000000 0.0 0.0 \n", "2 0.066667 0.000000 0.000000 0.0 0.0 \n", "3 0.058824 0.000000 0.000000 0.0 0.0 \n", "4 0.000000 0.000000 0.000000 0.0 0.0 \n", "\n", " Basketball Court Basketball Stadium Beach Bed & Breakfast Beer Bar \\\n", "0 0.0 0.0 0.0 0.0 0.0 \n", "1 0.0 0.0 0.0 0.0 0.0 \n", "2 0.0 0.0 0.0 0.0 0.0 \n", "3 0.0 0.0 0.0 0.0 0.0 \n", "4 0.0 0.0 0.0 0.0 0.0 \n", "\n", " Beer Garden Beer Store Belgian Restaurant Big Box Store \\\n", "0 0.0 0.0 0.0 0.000000 \n", "1 0.0 0.0 0.0 0.055556 \n", "2 0.0 0.0 0.0 0.000000 \n", "3 0.0 0.0 0.0 0.000000 \n", "4 0.0 0.0 0.0 0.000000 \n", "\n", " Bike Rental / Bike Share Bike Shop Bistro Board Shop Boat or Ferry 
\\\n", "0 0.0 0.0 0.0 0.0 0.0 \n", "1 0.0 0.0 0.0 0.0 0.0 \n", "2 0.0 0.0 0.0 0.0 0.0 \n", "3 0.0 0.0 0.0 0.0 0.0 \n", "4 0.0 0.0 0.0 0.0 0.0 \n", "\n", " Bookstore Boutique Bowling Alley Boxing Gym Brazilian Restaurant \\\n", "0 0.0 0.027778 0.0 0.0 0.0 \n", "1 0.0 0.000000 0.0 0.0 0.0 \n", "2 0.0 0.000000 0.0 0.0 0.0 \n", "3 0.0 0.000000 0.0 0.0 0.0 \n", "4 0.0 0.000000 0.0 0.0 0.0 \n", "\n", " Breakfast Spot Brewery Bubble Tea Shop Buffet Building ... \\\n", "0 0.00 0.013889 0.013889 0.0 0.0 ... \n", "1 0.00 0.000000 0.000000 0.0 0.0 ... \n", "2 0.00 0.000000 0.000000 0.0 0.0 ... \n", "3 0.00 0.000000 0.000000 0.0 0.0 ... \n", "4 0.25 0.000000 0.000000 0.0 0.0 ... \n", "\n", " Stadium Stationery Store Steakhouse Street Food Gathering Supermarket \\\n", "0 0.0 0.0 0.0 0.0 0.0 \n", "1 0.0 0.0 0.0 0.0 0.0 \n", "2 0.0 0.0 0.0 0.0 0.0 \n", "3 0.0 0.0 0.0 0.0 0.0 \n", "4 0.0 0.0 0.0 0.0 0.0 \n", "\n", " Supplement Shop Surf Spot Sushi Restaurant Szechuan Restaurant \\\n", "0 0.0 0.0 0.027778 0.0 \n", "1 0.0 0.0 0.000000 0.0 \n", "2 0.0 0.0 0.000000 0.0 \n", "3 0.0 0.0 0.000000 0.0 \n", "4 0.0 0.0 0.000000 0.0 \n", "\n", " Taco Place Taiwanese Restaurant Tapas Restaurant Tattoo Parlor \\\n", "0 0.000000 0.0 0.0 0.0 \n", "1 0.000000 0.0 0.0 0.0 \n", "2 0.000000 0.0 0.0 0.0 \n", "3 0.058824 0.0 0.0 0.0 \n", "4 0.000000 0.0 0.0 0.0 \n", "\n", " Tea Room Tech Startup Tennis Court Thai Restaurant Theater \\\n", "0 0.0 0.0 0.0 0.0 0.000000 \n", "1 0.0 0.0 0.0 0.0 0.000000 \n", "2 0.0 0.0 0.0 0.0 0.000000 \n", "3 0.0 0.0 0.0 0.0 0.058824 \n", "4 0.0 0.0 0.0 0.0 0.000000 \n", "\n", " Thrift / Vintage Store Tiki Bar Tour Provider \\\n", "0 0.000000 0.0 0.0 \n", "1 0.000000 0.0 0.0 \n", "2 0.000000 0.0 0.0 \n", "3 0.058824 0.0 0.0 \n", "4 0.000000 0.0 0.0 \n", "\n", " Tourist Information Center Toy / Game Store Track Track Stadium \\\n", "0 0.0 0.013889 0.0 0.0 \n", "1 0.0 0.000000 0.0 0.0 \n", "2 0.0 0.000000 0.0 0.0 \n", "3 0.0 0.000000 0.0 0.0 \n", "4 0.0 0.000000 0.0 0.0 
\n", "\n", " Trade School Trail Train Station Trattoria/Osteria Tree Tunnel \\\n", "0 0.0 0.0 0.0 0.0 0.0 0.000000 \n", "1 0.0 0.0 0.0 0.0 0.0 0.055556 \n", "2 0.0 0.0 0.0 0.0 0.0 0.000000 \n", "3 0.0 0.0 0.0 0.0 0.0 0.000000 \n", "4 0.0 0.0 0.0 0.0 0.0 0.000000 \n", "\n", " Turkish Restaurant Tuscan Restaurant Udon Restaurant Used Bookstore \\\n", "0 0.0 0.0 0.0 0.0 \n", "1 0.0 0.0 0.0 0.0 \n", "2 0.0 0.0 0.0 0.0 \n", "3 0.0 0.0 0.0 0.0 \n", "4 0.0 0.0 0.0 0.0 \n", "\n", " Vegetarian / Vegan Restaurant Veterinarian Video Game Store Video Store \\\n", "0 0.0 0.0 0.0 0.0 \n", "1 0.0 0.0 0.0 0.0 \n", "2 0.0 0.0 0.0 0.0 \n", "3 0.0 0.0 0.0 0.0 \n", "4 0.0 0.0 0.0 0.0 \n", "\n", " Vietnamese Restaurant Vineyard Wagashi Place Weight Loss Center \\\n", "0 0.000000 0.0 0.0 0.0 \n", "1 0.000000 0.0 0.0 0.0 \n", "2 0.066667 0.0 0.0 0.0 \n", "3 0.000000 0.0 0.0 0.0 \n", "4 0.000000 0.0 0.0 0.0 \n", "\n", " Whisky Bar Wine Bar Wine Shop Winery Wings Joint Women's Store \\\n", "0 0.0 0.027778 0.0 0.0 0.0 0.0 \n", "1 0.0 0.000000 0.0 0.0 0.0 0.0 \n", "2 0.0 0.000000 0.0 0.0 0.0 0.0 \n", "3 0.0 0.000000 0.0 0.0 0.0 0.0 \n", "4 0.0 0.000000 0.0 0.0 0.0 0.0 \n", "\n", " Yoga Studio \n", "0 0.0 \n", "1 0.0 \n", "2 0.0 \n", "3 0.0 \n", "4 0.0 \n", "\n", "[5 rows x 344 columns]" ] }, "execution_count": 49, "metadata": {}, "output_type": "execute_result" } ], "source": [ "SF_grouped = SF_onehot.groupby('nbrhood').mean().reset_index()\n", "SF_grouped.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Define a function that sorts the venues in descending order." 
] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [], "source": [ "def return_most_common_venues(row, num_top_venues):\n", "    # Skip the first entry (the neighborhood name); keep only the category frequencies\n", "    row_categories = row.iloc[1:]\n", "    row_categories_sorted = row_categories.sort_values(ascending=False)\n", "\n", "    return row_categories_sorted.index.values[0:num_top_venues]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create a dataframe that lists the most common venue categories, in descending order of frequency, for each neighborhood." ] }, { "cell_type": "code", "execution_count": 51, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(92, 11)\n" ] }, { "data": { "text/html": [ "
| | nbrhood | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Alamo Square | Bar | Record Shop | Boutique | Italian Restaurant | Wine Bar | Sushi Restaurant | Gift Shop | Mediterranean Restaurant | BBQ Joint | Ethiopian Restaurant |
| 1 | Anza Vista | Café | Health & Beauty Service | Coffee Shop | Southern / Soul Food Restaurant | Mexican Restaurant | Grocery Store | Sandwich Place | Tunnel | Big Box Store | Arts & Crafts Store |
| 2 | Balboa Terrace | Light Rail Station | Pharmacy | Circus | Bakery | Park | Playground | Dessert Shop | Vietnamese Restaurant | Italian Restaurant | American Restaurant |
| 3 | Bayview | Southern / Soul Food Restaurant | Park | Light Rail Station | Chinese Restaurant | Thrift / Vintage Store | Theater | African Restaurant | Plaza | Gym | Bakery |
| 4 | Bayview Heights | Breakfast Spot | Park | Burger Joint | Latin American Restaurant | Food | Exhibit | Farmers Market | Fast Food Restaurant | Filipino Restaurant | Flea Market |
| 5 | Bernal Heights | Coffee Shop | Park | Bakery | Italian Restaurant | Playground | Gourmet Shop | Yoga Studio | Seafood Restaurant | Sandwich Place | Gym |
| 6 | Buena Vista Park/Ashbury Heights | Boutique | Park | Coffee Shop | Clothing Store | Convenience Store | Trail | Scenic Lookout | Gift Shop | Breakfast Spot | Café |
| 7 | Candlestick Point | Football Stadium | Park | Campground | Stadium | American Restaurant | Food & Drink Shop | Soccer Field | Flea Market | Event Space | Exhibit |
| 8 | Central Richmond | Grocery Store | Chinese Restaurant | Sushi Restaurant | Korean Restaurant | Café | Deli / Bodega | Bakery | Dim Sum Restaurant | Coffee Shop | Vietnamese Restaurant |
| 9 | Central Sunset | Chinese Restaurant | Gym / Fitness Center | Pilates Studio | Spa | Shoe Store | Food | Pet Store | Playground | Dessert Shop | Ethiopian Restaurant |
| 10 | Central Waterfront/Dogpatch | Bakery | Art Gallery | Cocktail Bar | Brewery | Wine Bar | Café | Gym / Fitness Center | Coffee Shop | Gift Shop | Dessert Shop |
| 11 | Clarendon Heights | Trail | Park | Road | Bus Stop | Scenic Lookout | Art Gallery | Playground | Convenience Store | Wine Bar | Dance Studio |
| 12 | Cole Valley/Parnassus Heights | Wine Bar | Park | Yoga Studio | Athletics & Sports | Burger Joint | Sports Bar | Sports Club | Breakfast Spot | Street Food Gathering | Mexican Restaurant |
| 13 | Corona Heights | Gay Bar | Park | Thai Restaurant | Cosmetics Shop | Dog Run | Sushi Restaurant | Dim Sum Restaurant | Grocery Store | Coffee Shop | Diner |
| 14 | Cow Hollow | Cosmetics Shop | Wine Bar | French Restaurant | Gym / Fitness Center | Italian Restaurant | Sandwich Place | Salad Place | Gym | Bakery | Salon / Barbershop |
| 15 | Crocker Amazon | Pharmacy | Gastropub | Coffee Shop | Bar | Latin American Restaurant | Scenic Lookout | Tennis Court | Basketball Court | Pizza Place | American Restaurant |
| 16 | Diamond Heights | Trail | Playground | Grocery Store | Pharmacy | Dim Sum Restaurant | Salon / Barbershop | Bus Station | Video Store | Coffee Shop | Shopping Mall |
| 17 | Downtown | Hotel | Cocktail Bar | Speakeasy | Theater | American Restaurant | Hostel | Breakfast Spot | Thai Restaurant | Bar | Coffee Shop |
| 18 | Duboce Triangle | Gay Bar | Coffee Shop | Gym | Sushi Restaurant | New American Restaurant | Mexican Restaurant | Grocery Store | Sandwich Place | Jewelry Store | Cocktail Bar |
| 19 | Eureka Valley / Dolores Heights | Gay Bar | Coffee Shop | New American Restaurant | Thai Restaurant | Pet Store | Playground | Park | Deli / Bodega | Mexican Restaurant | Men's Store |
| 20 | Excelsior | Convenience Store | Moving Target | Scenic Lookout | Lake | Flower Shop | Ethiopian Restaurant | Event Space | Exhibit | Farmers Market | Fast Food Restaurant |
| 21 | Financial District/Barbary Coast | Coffee Shop | Italian Restaurant | Food Truck | Men's Store | New American Restaurant | Sandwich Place | Gym | Seafood Restaurant | Café | Japanese Restaurant |
| 22 | Forest Hill | Japanese Restaurant | Hotpot Restaurant | Tennis Court | Park | Playground | French Restaurant | Cycle Studio | Coworking Space | Event Space | Exhibit |
| 23 | Forest Hills Extension | Convenience Store | Burger Joint | Hotpot Restaurant | Gym | French Restaurant | Park | Pharmacy | Dive Bar | Sandwich Place | Bus Stop |
| 24 | Forest Knolls | Trail | Park | Mountain | Garden | Fountain | Football Stadium | Ethiopian Restaurant | Event Space | French Restaurant | Exhibit |
| 25 | Glen Park | Park | Café | Trail | Coffee Shop | Grocery Store | Cosmetics Shop | Dive Bar | Spa | Tennis Court | Gift Shop |
| 26 | Golden Gate Heights | Trail | Park | Bus Stop | Playground | Scenic Lookout | Home Service | Tennis Court | Video Game Store | Flower Shop | Farmers Market |
| 27 | Golden Gate Park | Park | Track | Disc Golf | Bus Stop | Yoga Studio | Ethiopian Restaurant | Exhibit | Farmers Market | Fast Food Restaurant | Filipino Restaurant |
| 28 | Haight Ashbury | Café | Boutique | Coffee Shop | Thrift / Vintage Store | Pizza Place | Shoe Store | Clothing Store | Thai Restaurant | Gift Shop | Board Shop |
| 29 | Hayes Valley | Boutique | Wine Bar | Café | French Restaurant | Sushi Restaurant | Clothing Store | Cocktail Bar | Coffee Shop | Bubble Tea Shop | Mexican Restaurant |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 62 | Outer Mission | Latin American Restaurant | Cosmetics Shop | Spanish Restaurant | Chinese Restaurant | Mexican Restaurant | Bakery | Grocery Store | Food | Asian Restaurant | Motel |
| 63 | Outer Parkside | Coffee Shop | Chinese Restaurant | Pizza Place | Pharmacy | Beach | Thai Restaurant | Grocery Store | Bar | Bakery | Light Rail Station |
| 64 | Outer Richmond | Chinese Restaurant | Café | Japanese Restaurant | Sporting Goods Shop | Music Store | Bakery | Sandwich Place | Ramen Restaurant | Yoga Studio | Record Shop |
| 65 | Outer Sunset | Art Gallery | Mexican Restaurant | Thai Restaurant | Light Rail Station | Coffee Shop | Yoga Studio | Arts & Crafts Store | Brewery | Breakfast Spot | Bookstore |
| 66 | Pacific Heights | Cosmetics Shop | Coffee Shop | Park | Grocery Store | Juice Bar | Gym / Fitness Center | French Restaurant | Salon / Barbershop | Bar | Ice Cream Shop |
| 67 | Parkside | Chinese Restaurant | Dumpling Restaurant | Light Rail Station | Bubble Tea Shop | Café | Japanese Restaurant | Sandwich Place | Bar | Liquor Store | Burrito Place |
| 68 | Pine Lake Park | Park | Gym | Music Venue | Sandwich Place | Food Truck | Dog Run | Asian Restaurant | Hawaiian Restaurant | Event Space | Exhibit |
| 69 | Portola | Recreation Center | Lake | Playground | Shopping Mall | Bus Station | Food Truck | Food Stand | Electronics Store | Ethiopian Restaurant | Event Space |
| 70 | Potrero Hill | Park | Café | Brewery | Grocery Store | Playground | Breakfast Spot | Burger Joint | Bus Station | Pizza Place | Sandwich Place |
| 71 | Presidio | Museum | Brewery | American Restaurant | Outdoor Sculpture | Trail | Asian Restaurant | General Entertainment | Art Gallery | Park | Bowling Alley |
| 72 | Presidio Heights | American Restaurant | Park | Cosmetics Shop | Golf Course | Baby Store | New American Restaurant | Supermarket | Café | Bookstore | Miscellaneous Shop |
| 73 | Russian Hill | Park | Chocolate Shop | Ice Cream Shop | Coffee Shop | Hotel | Bike Rental / Bike Share | Tour Provider | Diner | Art Gallery | Seafood Restaurant |
| 74 | Saint Francis Wood | Coffee Shop | Chinese Restaurant | Italian Restaurant | Indian Restaurant | Pizza Place | Gym / Fitness Center | Wine Bar | Mexican Restaurant | Pub | Music Store |
| 75 | Sea Cliff | Trail | Tea Room | Beach | Scenic Lookout | Golf Course | Neighborhood | Football Stadium | Food Truck | Event Space | Exhibit |
| 76 | Sherwood Forest | Tree | Park | Monument / Landmark | Trail | Yoga Studio | Flea Market | Ethiopian Restaurant | Event Space | Exhibit | Farmers Market |
| 77 | Silver Terrace | Park | Liquor Store | Soccer Field | Outdoor Gym | Yoga Studio | Electronics Store | Event Space | Exhibit | Farmers Market | Fast Food Restaurant |
| 78 | South Beach | Café | Gym | Sandwich Place | Coffee Shop | Deli / Bodega | Park | Scenic Lookout | Residential Building (Apartment / Condo) | Spa | American Restaurant |
| 79 | South of Market | Nightclub | Sandwich Place | Art Gallery | Coffee Shop | Furniture / Home Store | Food Truck | Jewelry Store | Bar | Café | Clothing Store |
| 80 | Stonestown | Café | Sandwich Place | Pizza Place | College Cafeteria | Tennis Court | Park | Coffee Shop | Gym | Juice Bar | Mexican Restaurant |
| 81 | Sunnyside | Bar | Vietnamese Restaurant | Baseball Field | Trail | Grocery Store | Dumpling Restaurant | Soccer Field | Cantonese Restaurant | Spa | College Gym |
| 82 | Telegraph Hill | Italian Restaurant | Pizza Place | Cocktail Bar | Coffee Shop | Park | Mexican Restaurant | Bakery | Trail | Chinese Restaurant | Scenic Lookout |
| 83 | Tenderloin | Coffee Shop | Vietnamese Restaurant | Thai Restaurant | Theater | Cocktail Bar | Speakeasy | Indian Restaurant | Burger Joint | Sandwich Place | Hotel |
| 84 | Twin Peaks | Trail | Scenic Lookout | Reservoir | Yoga Studio | Filipino Restaurant | Egyptian Restaurant | Electronics Store | Ethiopian Restaurant | Event Space | Exhibit |
| 85 | Van Ness/Civic Center | Thai Restaurant | Vietnamese Restaurant | Sandwich Place | Sushi Restaurant | Coffee Shop | Bar | Korean Restaurant | Southern / Soul Food Restaurant | Vegetarian / Vegan Restaurant | Theater |
| 86 | Visitacion Valley | Garden | Baseball Field | Park | Pool | Trail | Flower Shop | Event Space | Exhibit | Farmers Market | Fast Food Restaurant |
| 87 | West Portal | Coffee Shop | Chinese Restaurant | Music Store | Pizza Place | Indian Restaurant | Wine Bar | Park | Mexican Restaurant | Pub | Gym / Fitness Center |
| 88 | Western Addition | Cosmetics Shop | Shopping Mall | Jazz Club | Japanese Restaurant | New American Restaurant | Furniture / Home Store | Sushi Restaurant | Gift Shop | Tea Room | Grocery Store |
| 89 | Westwood Highlands | Yoga Studio | Diner | Sushi Restaurant | Bus Line | Food | Monument / Landmark | Gun Range | Cantonese Restaurant | Breakfast Spot | Trail |
| 90 | Westwood Park | Yoga Studio | Asian Restaurant | Poke Place | Pharmacy | Coffee Shop | Bubble Tea Shop | Café | Gastropub | Food | Grocery Store |
| 91 | Yerba Buena | Coffee Shop | Hotel | Sushi Restaurant | Sandwich Place | Gym / Fitness Center | Café | Museum | Art Museum | Bar | Pizza Place |
\n", "

92 rows × 11 columns

\n", "
" ], "text/plain": [ " nbrhood 1st Most Common Venue \\\n", "0 Alamo Square Bar \n", "1 Anza Vista Café \n", "2 Balboa Terrace Light Rail Station \n", "3 Bayview Southern / Soul Food Restaurant \n", "4 Bayview Heights Breakfast Spot \n", "5 Bernal Heights Coffee Shop \n", "6 Buena Vista Park/Ashbury Heights Boutique \n", "7 Candlestick Point Football Stadium \n", "8 Central Richmond Grocery Store \n", "9 Central Sunset Chinese Restaurant \n", "10 Central Waterfront/Dogpatch Bakery \n", "11 Clarendon Heights Trail \n", "12 Cole Valley/Parnassus Heights Wine Bar \n", "13 Corona Heights Gay Bar \n", "14 Cow Hollow Cosmetics Shop \n", "15 Crocker Amazon Pharmacy \n", "16 Diamond Heights Trail \n", "17 Downtown Hotel \n", "18 Duboce Triangle Gay Bar \n", "19 Eureka Valley / Dolores Heights Gay Bar \n", "20 Excelsior Convenience Store \n", "21 Financial District/Barbary Coast Coffee Shop \n", "22 Forest Hill Japanese Restaurant \n", "23 Forest Hills Extension Convenience Store \n", "24 Forest Knolls Trail \n", "25 Glen Park Park \n", "26 Golden Gate Heights Trail \n", "27 Golden Gate Park Park \n", "28 Haight Ashbury Café \n", "29 Hayes Valley Boutique \n", ".. ... ... 
\n", "62 Outer Mission Latin American Restaurant \n", "63 Outer Parkside Coffee Shop \n", "64 Outer Richmond Chinese Restaurant \n", "65 Outer Sunset Art Gallery \n", "66 Pacific Heights Cosmetics Shop \n", "67 Parkside Chinese Restaurant \n", "68 Pine Lake Park Park \n", "69 Portola Recreation Center \n", "70 Potrero Hill Park \n", "71 Presidio Museum \n", "72 Presidio Heights American Restaurant \n", "73 Russian Hill Park \n", "74 Saint Francis Wood Coffee Shop \n", "75 Sea Cliff Trail \n", "76 Sherwood Forest Tree \n", "77 Silver Terrace Park \n", "78 South Beach Café \n", "79 South of Market Nightclub \n", "80 Stonestown Café \n", "81 Sunnyside Bar \n", "82 Telegraph Hill Italian Restaurant \n", "83 Tenderloin Coffee Shop \n", "84 Twin Peaks Trail \n", "85 Van Ness/Civic Center Thai Restaurant \n", "86 Visitacion Valley Garden \n", "87 West Portal Coffee Shop \n", "88 Western Addition Cosmetics Shop \n", "89 Westwood Highlands Yoga Studio \n", "90 Westwood Park Yoga Studio \n", "91 Yerba Buena Coffee Shop \n", "\n", " 2nd Most Common Venue 3rd Most Common Venue \\\n", "0 Record Shop Boutique \n", "1 Health & Beauty Service Coffee Shop \n", "2 Pharmacy Circus \n", "3 Park Light Rail Station \n", "4 Park Burger Joint \n", "5 Park Bakery \n", "6 Park Coffee Shop \n", "7 Park Campground \n", "8 Chinese Restaurant Sushi Restaurant \n", "9 Gym / Fitness Center Pilates Studio \n", "10 Art Gallery Cocktail Bar \n", "11 Park Road \n", "12 Park Yoga Studio \n", "13 Park Thai Restaurant \n", "14 Wine Bar French Restaurant \n", "15 Gastropub Coffee Shop \n", "16 Playground Grocery Store \n", "17 Cocktail Bar Speakeasy \n", "18 Coffee Shop Gym \n", "19 Coffee Shop New American Restaurant \n", "20 Moving Target Scenic Lookout \n", "21 Italian Restaurant Food Truck \n", "22 Hotpot Restaurant Tennis Court \n", "23 Burger Joint Hotpot Restaurant \n", "24 Park Mountain \n", "25 Café Trail \n", "26 Park Bus Stop \n", "27 Track Disc Golf \n", "28 Boutique Coffee Shop \n", "29 Wine 
Bar Café \n", ".. ... ... \n", "62 Cosmetics Shop Spanish Restaurant \n", "63 Chinese Restaurant Pizza Place \n", "64 Café Japanese Restaurant \n", "65 Mexican Restaurant Thai Restaurant \n", "66 Coffee Shop Park \n", "67 Dumpling Restaurant Light Rail Station \n", "68 Gym Music Venue \n", "69 Lake Playground \n", "70 Café Brewery \n", "71 Brewery American Restaurant \n", "72 Park Cosmetics Shop \n", "73 Chocolate Shop Ice Cream Shop \n", "74 Chinese Restaurant Italian Restaurant \n", "75 Tea Room Beach \n", "76 Park Monument / Landmark \n", "77 Liquor Store Soccer Field \n", "78 Gym Sandwich Place \n", "79 Sandwich Place Art Gallery \n", "80 Sandwich Place Pizza Place \n", "81 Vietnamese Restaurant Baseball Field \n", "82 Pizza Place Cocktail Bar \n", "83 Vietnamese Restaurant Thai Restaurant \n", "84 Scenic Lookout Reservoir \n", "85 Vietnamese Restaurant Sandwich Place \n", "86 Baseball Field Park \n", "87 Chinese Restaurant Music Store \n", "88 Shopping Mall Jazz Club \n", "89 Diner Sushi Restaurant \n", "90 Asian Restaurant Poke Place \n", "91 Hotel Sushi Restaurant \n", "\n", " 4th Most Common Venue 5th Most Common Venue \\\n", "0 Italian Restaurant Wine Bar \n", "1 Southern / Soul Food Restaurant Mexican Restaurant \n", "2 Bakery Park \n", "3 Chinese Restaurant Thrift / Vintage Store \n", "4 Latin American Restaurant Food \n", "5 Italian Restaurant Playground \n", "6 Clothing Store Convenience Store \n", "7 Stadium American Restaurant \n", "8 Korean Restaurant Café \n", "9 Spa Shoe Store \n", "10 Brewery Wine Bar \n", "11 Bus Stop Scenic Lookout \n", "12 Athletics & Sports Burger Joint \n", "13 Cosmetics Shop Dog Run \n", "14 Gym / Fitness Center Italian Restaurant \n", "15 Bar Latin American Restaurant \n", "16 Pharmacy Dim Sum Restaurant \n", "17 Theater American Restaurant \n", "18 Sushi Restaurant New American Restaurant \n", "19 Thai Restaurant Pet Store \n", "20 Lake Flower Shop \n", "21 Men's Store New American Restaurant \n", "22 Park Playground \n", 
"23 Gym French Restaurant \n", "24 Garden Fountain \n", "25 Coffee Shop Grocery Store \n", "26 Playground Scenic Lookout \n", "27 Bus Stop Yoga Studio \n", "28 Thrift / Vintage Store Pizza Place \n", "29 French Restaurant Sushi Restaurant \n", ".. ... ... \n", "62 Chinese Restaurant Mexican Restaurant \n", "63 Pharmacy Beach \n", "64 Sporting Goods Shop Music Store \n", "65 Light Rail Station Coffee Shop \n", "66 Grocery Store Juice Bar \n", "67 Bubble Tea Shop Café \n", "68 Sandwich Place Food Truck \n", "69 Shopping Mall Bus Station \n", "70 Grocery Store Playground \n", "71 Outdoor Sculpture Trail \n", "72 Golf Course Baby Store \n", "73 Coffee Shop Hotel \n", "74 Indian Restaurant Pizza Place \n", "75 Scenic Lookout Golf Course \n", "76 Trail Yoga Studio \n", "77 Outdoor Gym Yoga Studio \n", "78 Coffee Shop Deli / Bodega \n", "79 Coffee Shop Furniture / Home Store \n", "80 College Cafeteria Tennis Court \n", "81 Trail Grocery Store \n", "82 Coffee Shop Park \n", "83 Theater Cocktail Bar \n", "84 Yoga Studio Filipino Restaurant \n", "85 Sushi Restaurant Coffee Shop \n", "86 Pool Trail \n", "87 Pizza Place Indian Restaurant \n", "88 Japanese Restaurant New American Restaurant \n", "89 Bus Line Food \n", "90 Pharmacy Coffee Shop \n", "91 Sandwich Place Gym / Fitness Center \n", "\n", " 6th Most Common Venue 7th Most Common Venue \\\n", "0 Sushi Restaurant Gift Shop \n", "1 Grocery Store Sandwich Place \n", "2 Playground Dessert Shop \n", "3 Theater African Restaurant \n", "4 Exhibit Farmers Market \n", "5 Gourmet Shop Yoga Studio \n", "6 Trail Scenic Lookout \n", "7 Food & Drink Shop Soccer Field \n", "8 Deli / Bodega Bakery \n", "9 Food Pet Store \n", "10 Café Gym / Fitness Center \n", "11 Art Gallery Playground \n", "12 Sports Bar Sports Club \n", "13 Sushi Restaurant Dim Sum Restaurant \n", "14 Sandwich Place Salad Place \n", "15 Scenic Lookout Tennis Court \n", "16 Salon / Barbershop Bus Station \n", "17 Hostel Breakfast Spot \n", "18 Mexican Restaurant 
Grocery Store \n", "19 Playground Park \n", "20 Ethiopian Restaurant Event Space \n", "21 Sandwich Place Gym \n", "22 French Restaurant Cycle Studio \n", "23 Park Pharmacy \n", "24 Football Stadium Ethiopian Restaurant \n", "25 Cosmetics Shop Dive Bar \n", "26 Home Service Tennis Court \n", "27 Ethiopian Restaurant Exhibit \n", "28 Shoe Store Clothing Store \n", "29 Clothing Store Cocktail Bar \n", ".. ... ... \n", "62 Bakery Grocery Store \n", "63 Thai Restaurant Grocery Store \n", "64 Bakery Sandwich Place \n", "65 Yoga Studio Arts & Crafts Store \n", "66 Gym / Fitness Center French Restaurant \n", "67 Japanese Restaurant Sandwich Place \n", "68 Dog Run Asian Restaurant \n", "69 Food Truck Food Stand \n", "70 Breakfast Spot Burger Joint \n", "71 Asian Restaurant General Entertainment \n", "72 New American Restaurant Supermarket \n", "73 Bike Rental / Bike Share Tour Provider \n", "74 Gym / Fitness Center Wine Bar \n", "75 Neighborhood Football Stadium \n", "76 Flea Market Ethiopian Restaurant \n", "77 Electronics Store Event Space \n", "78 Park Scenic Lookout \n", "79 Food Truck Jewelry Store \n", "80 Park Coffee Shop \n", "81 Dumpling Restaurant Soccer Field \n", "82 Mexican Restaurant Bakery \n", "83 Speakeasy Indian Restaurant \n", "84 Egyptian Restaurant Electronics Store \n", "85 Bar Korean Restaurant \n", "86 Flower Shop Event Space \n", "87 Wine Bar Park \n", "88 Furniture / Home Store Sushi Restaurant \n", "89 Monument / Landmark Gun Range \n", "90 Bubble Tea Shop Café \n", "91 Café Museum \n", "\n", " 8th Most Common Venue 9th Most Common Venue \\\n", "0 Mediterranean Restaurant BBQ Joint \n", "1 Tunnel Big Box Store \n", "2 Vietnamese Restaurant Italian Restaurant \n", "3 Plaza Gym \n", "4 Fast Food Restaurant Filipino Restaurant \n", "5 Seafood Restaurant Sandwich Place \n", "6 Gift Shop Breakfast Spot \n", "7 Flea Market Event Space \n", "8 Dim Sum Restaurant Coffee Shop \n", "9 Playground Dessert Shop \n", "10 Coffee Shop Gift Shop \n", "11 
Convenience Store Wine Bar \n", "12 Breakfast Spot Street Food Gathering \n", "13 Grocery Store Coffee Shop \n", "14 Gym Bakery \n", "15 Basketball Court Pizza Place \n", "16 Video Store Coffee Shop \n", "17 Thai Restaurant Bar \n", "18 Sandwich Place Jewelry Store \n", "19 Deli / Bodega Mexican Restaurant \n", "20 Exhibit Farmers Market \n", "21 Seafood Restaurant Café \n", "22 Coworking Space Event Space \n", "23 Dive Bar Sandwich Place \n", "24 Event Space French Restaurant \n", "25 Spa Tennis Court \n", "26 Video Game Store Flower Shop \n", "27 Farmers Market Fast Food Restaurant \n", "28 Thai Restaurant Gift Shop \n", "29 Coffee Shop Bubble Tea Shop \n", ".. ... ... \n", "62 Food Asian Restaurant \n", "63 Bar Bakery \n", "64 Ramen Restaurant Yoga Studio \n", "65 Brewery Breakfast Spot \n", "66 Salon / Barbershop Bar \n", "67 Bar Liquor Store \n", "68 Hawaiian Restaurant Event Space \n", "69 Electronics Store Ethiopian Restaurant \n", "70 Bus Station Pizza Place \n", "71 Art Gallery Park \n", "72 Café Bookstore \n", "73 Diner Art Gallery \n", "74 Mexican Restaurant Pub \n", "75 Food Truck Event Space \n", "76 Event Space Exhibit \n", "77 Exhibit Farmers Market \n", "78 Residential Building (Apartment / Condo) Spa \n", "79 Bar Café \n", "80 Gym Juice Bar \n", "81 Cantonese Restaurant Spa \n", "82 Trail Chinese Restaurant \n", "83 Burger Joint Sandwich Place \n", "84 Ethiopian Restaurant Event Space \n", "85 Southern / Soul Food Restaurant Vegetarian / Vegan Restaurant \n", "86 Exhibit Farmers Market \n", "87 Mexican Restaurant Pub \n", "88 Gift Shop Tea Room \n", "89 Cantonese Restaurant Breakfast Spot \n", "90 Gastropub Food \n", "91 Art Museum Bar \n", "\n", " 10th Most Common Venue \n", "0 Ethiopian Restaurant \n", "1 Arts & Crafts Store \n", "2 American Restaurant \n", "3 Bakery \n", "4 Flea Market \n", "5 Gym \n", "6 Café \n", "7 Exhibit \n", "8 Vietnamese Restaurant \n", "9 Ethiopian Restaurant \n", "10 Dessert Shop \n", "11 Dance Studio \n", "12 Mexican 
Restaurant \n", "13 Diner \n", "14 Salon / Barbershop \n", "15 American Restaurant \n", "16 Shopping Mall \n", "17 Coffee Shop \n", "18 Cocktail Bar \n", "19 Men's Store \n", "20 Fast Food Restaurant \n", "21 Japanese Restaurant \n", "22 Exhibit \n", "23 Bus Stop \n", "24 Exhibit \n", "25 Gift Shop \n", "26 Farmers Market \n", "27 Filipino Restaurant \n", "28 Board Shop \n", "29 Mexican Restaurant \n", ".. ... \n", "62 Motel \n", "63 Light Rail Station \n", "64 Record Shop \n", "65 Bookstore \n", "66 Ice Cream Shop \n", "67 Burrito Place \n", "68 Exhibit \n", "69 Event Space \n", "70 Sandwich Place \n", "71 Bowling Alley \n", "72 Miscellaneous Shop \n", "73 Seafood Restaurant \n", "74 Music Store \n", "75 Exhibit \n", "76 Farmers Market \n", "77 Fast Food Restaurant \n", "78 American Restaurant \n", "79 Clothing Store \n", "80 Mexican Restaurant \n", "81 College Gym \n", "82 Scenic Lookout \n", "83 Hotel \n", "84 Exhibit \n", "85 Theater \n", "86 Fast Food Restaurant \n", "87 Gym / Fitness Center \n", "88 Grocery Store \n", "89 Trail \n", "90 Grocery Store \n", "91 Pizza Place \n", "\n", "[92 rows x 11 columns]" ] }, "execution_count": 51, "metadata": {}, "output_type": "execute_result" } ], "source": [ "num_top_venues = 10\n", "\n", "indicators = ['st', 'nd', 'rd']\n", "\n", "# create columns according to number of top venues\n", "columns = ['nbrhood']\n", "for ind in np.arange(num_top_venues):\n", " try:\n", " columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))\n", " except:\n", " columns.append('{}th Most Common Venue'.format(ind+1))\n", "\n", "# create a new dataframe\n", "SF_venues_sorted = pd.DataFrame(columns=columns)\n", "SF_venues_sorted['nbrhood'] = SF_grouped['nbrhood']\n", "\n", "for ind in np.arange(SF_grouped.shape[0]):\n", " SF_venues_sorted.iloc[ind, 1:] = return_most_common_venues(SF_grouped.iloc[ind, :], num_top_venues)\n", "\n", "print(SF_venues_sorted.shape)\n", "SF_venues_sorted" ] }, { "cell_type": "markdown", "metadata": 
{}, "source": [ "#### Cluster the neighborhoods using the k-means algorithm: Realization\n", "We use k-means to group the neighborhoods into 5 clusters based on their venue-category frequencies." ] }, { "cell_type": "code", "execution_count": 52, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([1, 1, 3, 3, 0, 1, 3, 3, 1, 3], dtype=int32)" ] }, "execution_count": 52, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Set the number of clusters\n", "kclusters = 5\n", "\n", "# Drop the neighborhood name so that only numeric features remain\n", "SF_grouped_clustering = SF_grouped.drop('nbrhood', axis=1)\n", "\n", "# Run k-means clustering\n", "kmeans = KMeans(n_clusters=kclusters, random_state=34).fit(SF_grouped_clustering)\n", "\n", "# Check cluster labels generated for the first 10 rows in the dataframe\n", "kmeans.labels_[0:10] " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create a new dataframe that includes the cluster label as well as the top 10 venue categories for each neighborhood." ] }, { "cell_type": "code", "execution_count": 53, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
| | nbrhood | incident_counts | house_avg_price | Latitude | Longitude | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue | Cluster Labels |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Alamo Square | 687 | 0.862016 | 37.776076 | -122.433919 | Bar | Record Shop | Boutique | Italian Restaurant | Wine Bar | Sushi Restaurant | Gift Shop | Mediterranean Restaurant | BBQ Joint | Ethiopian Restaurant | 1 |
| 1 | Anza Vista | 310 | 1.121534 | 37.780611 | -122.443255 | Café | Health & Beauty Service | Coffee Shop | Southern / Soul Food Restaurant | Mexican Restaurant | Grocery Store | Sandwich Place | Tunnel | Big Box Store | Arts & Crafts Store | 1 |
| 2 | Balboa Terrace | 49 | 0.737963 | 37.730649 | -122.468267 | Light Rail Station | Pharmacy | Circus | Bakery | Park | Playground | Dessert Shop | Vietnamese Restaurant | Italian Restaurant | American Restaurant | 3 |
| 3 | Bayview | 3185 | 0.439684 | 37.732391 | -122.387170 | Southern / Soul Food Restaurant | Park | Light Rail Station | Chinese Restaurant | Thrift / Vintage Store | Theater | African Restaurant | Plaza | Gym | Bakery | 3 |
| 4 | Bernal Heights | 1561 | 0.448200 | 37.740230 | -122.415885 | Coffee Shop | Park | Bakery | Italian Restaurant | Playground | Gourmet Shop | Yoga Studio | Seafood Restaurant | Sandwich Place | Gym | 1 |
\n", "
" ], "text/plain": [ " nbrhood incident_counts house_avg_price Latitude Longitude \\\n", "0 Alamo Square 687 0.862016 37.776076 -122.433919 \n", "1 Anza Vista 310 1.121534 37.780611 -122.443255 \n", "2 Balboa Terrace 49 0.737963 37.730649 -122.468267 \n", "3 Bayview 3185 0.439684 37.732391 -122.387170 \n", "4 Bernal Heights 1561 0.448200 37.740230 -122.415885 \n", "\n", " 1st Most Common Venue 2nd Most Common Venue \\\n", "0 Bar Record Shop \n", "1 Café Health & Beauty Service \n", "2 Light Rail Station Pharmacy \n", "3 Southern / Soul Food Restaurant Park \n", "4 Coffee Shop Park \n", "\n", " 3rd Most Common Venue 4th Most Common Venue \\\n", "0 Boutique Italian Restaurant \n", "1 Coffee Shop Southern / Soul Food Restaurant \n", "2 Circus Bakery \n", "3 Light Rail Station Chinese Restaurant \n", "4 Bakery Italian Restaurant \n", "\n", " 5th Most Common Venue 6th Most Common Venue 7th Most Common Venue \\\n", "0 Wine Bar Sushi Restaurant Gift Shop \n", "1 Mexican Restaurant Grocery Store Sandwich Place \n", "2 Park Playground Dessert Shop \n", "3 Thrift / Vintage Store Theater African Restaurant \n", "4 Playground Gourmet Shop Yoga Studio \n", "\n", " 8th Most Common Venue 9th Most Common Venue 10th Most Common Venue \\\n", "0 Mediterranean Restaurant BBQ Joint Ethiopian Restaurant \n", "1 Tunnel Big Box Store Arts & Crafts Store \n", "2 Vietnamese Restaurant Italian Restaurant American Restaurant \n", "3 Plaza Gym Bakery \n", "4 Seafood Restaurant Sandwich Place Gym \n", "\n", " Cluster Labels \n", "0 1 \n", "1 1 \n", "2 3 \n", "3 3 \n", "4 1 " ] }, "execution_count": 53, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Start from the sorted venue table (SF_merged refers to the same object as SF_venues_sorted)\n", "SF_merged = SF_venues_sorted\n", "\n", "# Add clustering labels (this also adds the column to SF_venues_sorted)\n", "SF_merged['Cluster Labels'] = kmeans.labels_\n", "\n", "# Join the neighborhood centroid table with the labeled venue table to add latitude/longitude for each neighborhood\n", "SF_merged = nbh_centroid.join(SF_venues_sorted.set_index('nbrhood'), on='nbrhood')\n", "\n", 
"SF_merged.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Generate the San Francisco neighborhood clusters map." ] }, { "cell_type": "code", "execution_count": 54, "metadata": {}, "outputs": [], "source": [ "# Create San Francisco base map\n", "SF_cluster_map = folium.Map(location=SF_Coord, zoom_start=12)\n", "\n", "# set color scheme for the clusters\n", "x = np.arange(kclusters)\n", "ys = [i+x+(i*x)**3.2 for i in range(kclusters)]\n", "colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))\n", "rainbow = [colors.rgb2hex(i) for i in colors_array]\n", "\n", "# add markers to map\n", "markers_colors = []\n", "for lat, lng, nbrhood, cluster in zip(\n", " SF_merged['Latitude'], \n", " SF_merged['Longitude'], \n", " SF_merged['nbrhood'], \n", " SF_merged['Cluster Labels']):\n", " label = (\"Cluster : {}, Neighborhood: {}\").format(cluster, nbrhood)\n", " label = folium.Popup(label, parse_html=True)\n", " folium.CircleMarker(\n", " [lat, lng],\n", " radius=5,\n", " popup=label,\n", " color=rainbow[cluster-1],\n", " fill=True,\n", " fill_color=rainbow[cluster-1],\n", " fill_opacity=0.7).add_to(SF_cluster_map)\n", " \n", "#SF_cluster_map" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 4. Results " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 4.1 Crime and Housing Maps \n", "Back to page top" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Distribution of San Francisco crimes: 2018 Jan. - 2018 Sep." ] }, { "cell_type": "code", "execution_count": 55, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "display(SF_crime_map)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Distribution of 2014 San Francisco average housing price." ] }, { "cell_type": "code", "execution_count": 56, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "display(SF_housing_map)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### San Francisco neighborhood clusters based on the k-means algorithm" ] }, { "cell_type": "code", "execution_count": 57, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "display(SF_cluster_map)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "#### The proposed location for the new restaurant\n", "The crime and housing price maps suggest that __North Beach__ has a relatively low crime rate and a somewhat below-average housing price. This neighborhood is close to attractions such as Fisherman's Wharf, Lombard Street and Coit Tower. In the merged table, this neighborhood carries cluster label 1. Let's first take a look at cluster 0 for comparison." ] }, { "cell_type": "code", "execution_count": 58, "metadata": {}, "outputs": [], "source": [ "def examine_clusters(cluster_id):\n", " return SF_merged.loc[SF_merged['Cluster Labels'] == cluster_id, SF_merged.columns[[0] + [1] + list(range(5, SF_merged.shape[1]))]]" ] }, { "cell_type": "code", "execution_count": 59, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
nbrhoodincident_counts1st Most Common Venue2nd Most Common Venue3rd Most Common Venue4th Most Common Venue5th Most Common Venue6th Most Common Venue7th Most Common Venue8th Most Common Venue9th Most Common Venue10th Most Common VenueCluster Labels
19Forest Hill89Japanese RestaurantHotpot RestaurantTennis CourtParkPlaygroundFrench RestaurantCycle StudioCoworking SpaceEvent SpaceExhibit0
24Golden Gate Park544ParkTrackDisc GolfBus StopYoga StudioEthiopian RestaurantExhibitFarmers MarketFast Food RestaurantFilipino Restaurant0
43Merced Heights33GardenRacetrackParkLiquor StoreLight Rail StationYoga StudioFlea MarketEvent SpaceExhibitFarmers Market0
44Merced Manor61GymParkArt GalleryTennis CourtJapanese RestaurantBubble Tea ShopMusic VenueWeight Loss CenterFood TruckPet Store0
47Miraloma Park191ParkPlaygroundMonument / LandmarkTreeBus LineCollege AuditoriumEvent SpaceExhibitFarmers MarketFast Food Restaurant0
64Pine Lake Park36ParkGymMusic VenueSandwich PlaceFood TruckDog RunAsian RestaurantHawaiian RestaurantEvent SpaceExhibit0
72Silver Terrace678ParkLiquor StoreSoccer FieldOutdoor GymYoga StudioElectronics StoreEvent SpaceExhibitFarmers MarketFast Food Restaurant0
89Bayview Heights253Breakfast SpotParkBurger JointLatin American RestaurantFoodExhibitFarmers MarketFast Food RestaurantFilipino RestaurantFlea Market0
\n", "
" ], "text/plain": [ " nbrhood incident_counts 1st Most Common Venue \\\n", "19 Forest Hill 89 Japanese Restaurant \n", "24 Golden Gate Park 544 Park \n", "43 Merced Heights 33 Garden \n", "44 Merced Manor 61 Gym \n", "47 Miraloma Park 191 Park \n", "64 Pine Lake Park 36 Park \n", "72 Silver Terrace 678 Park \n", "89 Bayview Heights 253 Breakfast Spot \n", "\n", " 2nd Most Common Venue 3rd Most Common Venue 4th Most Common Venue \\\n", "19 Hotpot Restaurant Tennis Court Park \n", "24 Track Disc Golf Bus Stop \n", "43 Racetrack Park Liquor Store \n", "44 Park Art Gallery Tennis Court \n", "47 Playground Monument / Landmark Tree \n", "64 Gym Music Venue Sandwich Place \n", "72 Liquor Store Soccer Field Outdoor Gym \n", "89 Park Burger Joint Latin American Restaurant \n", "\n", " 5th Most Common Venue 6th Most Common Venue 7th Most Common Venue \\\n", "19 Playground French Restaurant Cycle Studio \n", "24 Yoga Studio Ethiopian Restaurant Exhibit \n", "43 Light Rail Station Yoga Studio Flea Market \n", "44 Japanese Restaurant Bubble Tea Shop Music Venue \n", "47 Bus Line College Auditorium Event Space \n", "64 Food Truck Dog Run Asian Restaurant \n", "72 Yoga Studio Electronics Store Event Space \n", "89 Food Exhibit Farmers Market \n", "\n", " 8th Most Common Venue 9th Most Common Venue 10th Most Common Venue \\\n", "19 Coworking Space Event Space Exhibit \n", "24 Farmers Market Fast Food Restaurant Filipino Restaurant \n", "43 Event Space Exhibit Farmers Market \n", "44 Weight Loss Center Food Truck Pet Store \n", "47 Exhibit Farmers Market Fast Food Restaurant \n", "64 Hawaiian Restaurant Event Space Exhibit \n", "72 Exhibit Farmers Market Fast Food Restaurant \n", "89 Fast Food Restaurant Filipino Restaurant Flea Market \n", "\n", " Cluster Labels \n", "19 0 \n", "24 0 \n", "43 0 \n", "44 0 \n", "47 0 \n", "64 0 \n", "72 0 \n", "89 0 " ] }, "execution_count": 59, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.set_option('display.max_rows', 
100)\n", "examine_clusters(0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The most common venues in the neighborhoods that belong to cluster #0 are mostly parks, playgrounds and gyms, with only a few restaurants among them. North Beach is not part of this cluster; it is assigned cluster label 1. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 4.2 A Closer Look at the Proposed Location \n", "Back to page top" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "According to the maps, the neighborhood __North Beach__ has a relatively low crime rate and housing price. This neighborhood is also close to several attractions. Let's use the Foursquare API to explore the vicinity of this neighborhood." ] }, { "cell_type": "code", "execution_count": 60, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "nbrhood North Beach\n", "nid 8d\n", "sfar_distr District 8 - Northeast\n", "geometry POLYGON ((-122.4172945622149 37.80506527491107...\n", "incident_counts 732\n", "house_avg_price 0.799696\n", "Name: 53, dtype: object" ] }, "execution_count": 60, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nbh_name = 'North Beach'\n", "nbh_index = nbrhoods.index[nbrhoods['nbrhood'] == nbh_name][0]\n", "nbh_lat, nbh_lng = nbh_centroid.loc[nbh_index, ['Latitude', 'Longitude']]\n", "\n", "nbrhoods.loc[nbh_index]" ] }, { "cell_type": "code", "execution_count": 61, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
nbrhood1st Most Common Venue2nd Most Common Venue3rd Most Common Venue4th Most Common Venue5th Most Common Venue6th Most Common Venue7th Most Common Venue8th Most Common Venue9th Most Common Venue10th Most Common VenueCluster Labels
58North BeachItalian RestaurantPizza PlaceChinese RestaurantBakeryCocktail BarCoffee ShopCaféDive BarDeli / BodegaYoga Studio1
\n", "
" ], "text/plain": [ " nbrhood 1st Most Common Venue 2nd Most Common Venue \\\n", "58 North Beach Italian Restaurant Pizza Place \n", "\n", " 3rd Most Common Venue 4th Most Common Venue 5th Most Common Venue \\\n", "58 Chinese Restaurant Bakery Cocktail Bar \n", "\n", " 6th Most Common Venue 7th Most Common Venue 8th Most Common Venue \\\n", "58 Coffee Shop Café Dive Bar \n", "\n", " 9th Most Common Venue 10th Most Common Venue Cluster Labels \n", "58 Deli / Bodega Yoga Studio 1 " ] }, "execution_count": 61, "metadata": {}, "output_type": "execute_result" } ], "source": [ "SF_venues_sorted.loc[SF_venues_sorted['nbrhood'] == nbh_name]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This table shows that North Beach already has many restaurants. This suggests that North Beach is a viable neighborhood for opening a new restaurant, but it also means we have many competitors. To identify our potential competitors, we use the Foursquare API again and this time set the parameter *section* to __food__ to look for restaurants in this neighborhood. Using the results, we'll map out our competitors." ] }, { "cell_type": "code", "execution_count": 62, "metadata": {}, "outputs": [], "source": [ "# Set up the Foursquare API call\n", "section = 'food'\n", "url = 'https://api.foursquare.com/v2/venues/explore?client_id={0}&client_secret={1}&v={2}&ll={3},{4}&section={5}&radius={6}&limit={7}'.format(\n", " CLIENT_ID,\n", " CLIENT_SECRET,\n", " VERSION,\n", " nbh_lat,\n", " nbh_lng,\n", " section,\n", " RADIUS,\n", " LIMIT)\n", "\n", "# Fetch the venues (at most LIMIT results)\n", "results = requests.get(url).json()" ] }, { "cell_type": "code", "execution_count": 63, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
namecategorieslatlng
0Tony’s Pizza NapoletanaPizza Place37.800328-122.409040
1Park TavernNew American Restaurant37.801097-122.409301
2Trattoria ContadinaTrattoria/Osteria37.800078-122.412422
3The Italian Homemade CompanyItalian Restaurant37.801497-122.411795
4Mario's Bohemian Cigar Store CafeCafé37.800391-122.409876
\n", "
" ], "text/plain": [ " name categories lat \\\n", "0 Tony’s Pizza Napoletana Pizza Place 37.800328 \n", "1 Park Tavern New American Restaurant 37.801097 \n", "2 Trattoria Contadina Trattoria/Osteria 37.800078 \n", "3 The Italian Homemade Company Italian Restaurant 37.801497 \n", "4 Mario's Bohemian Cigar Store Cafe Café 37.800391 \n", "\n", " lng \n", "0 -122.409040 \n", "1 -122.409301 \n", "2 -122.412422 \n", "3 -122.411795 \n", "4 -122.409876 " ] }, "execution_count": 63, "metadata": {}, "output_type": "execute_result" } ], "source": [ "venues = results['response']['groups'][0]['items']\n", " \n", "nearby_venues = json_normalize(venues) # flatten JSON\n", "\n", "# filter columns\n", "filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']\n", "nearby_venues =nearby_venues.loc[:, filtered_columns]\n", "\n", "# filter the category for each row\n", "nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)\n", "\n", "# clean columns\n", "nearby_venues.columns = [col.split(\".\")[-1] for col in nearby_venues.columns]\n", "\n", "nearby_venues.head()" ] }, { "cell_type": "code", "execution_count": 64, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "62 venues were returned by Foursquare.\n" ] } ], "source": [ "print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))" ] }, { "cell_type": "code", "execution_count": 65, "metadata": { "scrolled": false }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "" ] }, "execution_count": 65, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Create a base map centered on the selected neighborhood\n", "target_map = folium.Map(location=(nbh_lat, nbh_lng), zoom_start=17)\n", "\n", "for lat, lng, categories in zip(\n", " nearby_venues['lat'], \n", " nearby_venues['lng'], \n", " nearby_venues['categories']):\n", " label = (\"{}\").format(categories)\n", " label = folium.Popup(label, parse_html=True)\n", " folium.CircleMarker(\n", " [lat, lng],\n", " radius=6,\n", " popup=label,\n", " color='green',\n", " fill=True,\n", " fill_color='green',\n", " fill_opacity=0.7).add_to(target_map)\n", " \n", "target_map" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The above map shows that the restaurants are located mostly in the blocks along Columbus Avenue. We can get a list of restaurant categories from *nearby_venues*." ] }, { "cell_type": "code", "execution_count": 66, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['Pizza Place',\n", " 'New American Restaurant',\n", " 'Trattoria/Osteria',\n", " 'Italian Restaurant',\n", " 'Café',\n", " 'Seafood Restaurant',\n", " 'Argentinian Restaurant',\n", " 'Bakery',\n", " 'Tuscan Restaurant',\n", " 'Latin American Restaurant',\n", " 'South American Restaurant',\n", " 'Breakfast Spot',\n", " 'Deli / Bodega',\n", " 'Sicilian Restaurant',\n", " 'Chinese Restaurant',\n", " 'Taco Place',\n", " 'Mexican Restaurant',\n", " 'Sushi Restaurant',\n", " 'Asian Restaurant',\n", " 'Sandwich Place',\n", " 'Diner',\n", " 'Burger Joint',\n", " 'Gastropub',\n", " 'French Restaurant',\n", " 'Persian Restaurant',\n", " 'Thai Restaurant',\n", " 'Indian Restaurant',\n", " 'Southern / Soul Food Restaurant']" ] }, "execution_count": 66, "metadata": {}, "output_type": "execute_result" } ], "source": [ "venue_category_list = list(nearby_venues['categories'].unique())\n", "venue_category_list" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": 
"markdown", "metadata": {}, "source": [ "## 5. Discussion \n", "Back to page top" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "By analyzing public datasets obtained from DataSF, we identify __North Beach__ as a promising neighborhood for opening a new restaurant. We rely on aggregated crime and housing price data, the two factors we treat as primary in choosing the location. With the help of the Foursquare API, we further show that __North Beach__ is a good candidate: there are already many restaurants in this area along Columbus Avenue. We also obtain a list of our competitors by using the Foursquare Explore API again, and are able to pinpoint the competitors' locations. \n", "In the end, while we have identified a promising neighborhood, we also face rather strong competition. To differentiate our new restaurant from its competitors, we need further input from the data. Our analysis has room for improvement in the following areas.\n", "\n", "1. Visibility: Our results suggest that __North Beach__ should have sizable foot and car traffic along Columbus Avenue. To follow this traffic, we could open our new restaurant somewhere on the avenue. An alternative is the blocks just off the avenue. However, we need to analyze the data in further detail to reveal the actual traffic pattern.\n", "\n", "2. Competitor analysis: In Section 4.2, we list the restaurant categories in the __North Beach__ neighborhood. This list provides an overview of our competitors. It also guides us in deciding which categories to focus on or avoid. An obvious improvement is to cluster the restaurant categories; the results could reveal the landscape of the restaurant business in this neighborhood.\n", "\n", "3. Parking: This is a rather important factor that is not addressed in our analysis. 
While we have public parking space data, there are many private parking lots in San Francisco for which we don't have data. The next step is to obtain the distribution of these parking spaces. Alternatively, one could retrieve parking information in real time through the ParkWhiz API." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 6. Conclusion \n", "Back to page top" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this capstone project, we address the business problem of finding a good location in San Francisco for opening a new restaurant. We identify the most important factors that could impact the choice. Using crime and housing price datasets, we locate a neighborhood that has a relatively low crime rate and housing cost. The Foursquare API results also seem to support our pick, although they show that we face considerable competition. We have discussed several possible improvements. An interesting follow-up project would be to automate the location recommendation process. In this project, we identify the neighborhood largely by eyeballing the maps and results. To build an automatic recommender, we would have to design a quantitative measure that lets us score each location. The final location is necessarily a compromise among all the factors that could impact the selection. So in addition to the clustering algorithm, we may need other machine learning models such as regression to determine or estimate the score. We'll leave this for future projects." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.8" } }, "nbformat": 4, "nbformat_minor": 2 }