{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Restaurant Quest using TomTom API\n", "\n", "#### About this document\n", "* **Scope**: This is the full story, covering both the demographic data analysis and the use of the TomTom API to examine the surroundings of a neighborhood.
[Click here to get the **partial story that focuses on maps**](https://nbviewer.jupyter.org/github/xding78/Sharing/blob/master/RestaurantQuest/RestaurantQuest_TomTomAPI.ipynb).\n", "* **Version**: 1.0 | Updated on: 14 Jan 2020\n", "* **Author**: [Xinrong Ding](https://www.linkedin.com/in/xding/)\n", "\n", "### Table of contents\n", "\n", "[1. Introduction: The Challenge](#1)
\n", "[2. Data Requirements](#2)
\n", "[3. Analyze Demographic Data](#3)
\n", "[4. View Candidate Neighborhoods on a Map](#4)
\n", "[5. Explore the surroundings](#5)
\n", "[6. In-depth analysis of one neighborhood](#6)
\n", "[7. Conclusion and future work](#7)\n", "\n", "
\n", "
\n", "\n", "

\n", "\n", "\n", "# 2. Data Requirements\n", "[Back to top](#top)\n", "\n", "We have the big question: where to open a Chinese restaurant in Amsterdam. Now we need to collect data that can help us answer it. I will collect data from these two sources:\n", "\n", "* Data of the surroundings (density of similar restaurants nearby)\n", "* Demographic data (per area in Amsterdam)\n", "\n", "\n", "## 2.1 Data of Surroundings\n", "[Back to top](#top)\n", "\n", "* [**TomTom Maps API**](https://developer.tomtom.com/products/maps-api).\n", "* [**TomTom Search API**](https://developer.tomtom.com/products/search-api).\n", "\n", "\n", "\n", "## 2.2 Demographic Data\n", "[Back to top](#top)\n", "\n", "To know which neighborhoods are worth investigating further, we need to look into demographic data. We will choose neighborhoods that have more target customers in terms of both quantity and density.\n", "\n", "The Central Bureau of Statistics of the Netherlands, **CBS** for short, provides a large amount of demographic data about the people who live and work in the Netherlands.
\n", "On [this page](https://opendata.cbs.nl/statline/#/CBS/nl/dataset/84286NED/table?ts=1558337871247), you can choose whatever features you need to solve the problem. The list of features is quite comprehensive. It is important, therefore, to define exactly what the data will be used for.\n", "\n", "\n", "#### Features that can be used to answer the question\n", "\n", "* **Regional specifics** _(Regioaanduiding)_: This information helps me link the relevant details to a specific area.\n", "* **Total Households** _(Particulier huishouden)_: Number of households in a neighborhood.\n", "* **One-person Households** _(Eenpersoonshuishoudens)_: Number of households with only one person.\n", "* **Population density** _(Bevolkingsdichtheid)_: A more densely populated area means more potential customers for a restaurant. The unit of population density is **number of people per square kilometer**.\n", "\n", "#### Information that isn't available\n", "\n", "* **Origin of birth** _(Personen met een migratieachtergrond)_: This information can help us determine the types of food the restaurant should offer.\n", " * Unfortunately, the categorization is not detailed enough. I am not able to single out people who come from China in the data.\n", "* **Income per household** _(Inkomen van huishoudens)_: Restaurants are usually quite expensive compared to home cooking, so only people with sufficient income can afford to go to restaurants often.
\n", "_Unfortunately, this information is consistently missing from the database._\n", "\n", "#### Features deliberately excluded\n", "\n", "* **Civil status**: Since we can distinguish one-person households, it's not necessary to understand why people live alone (being single or divorced does not seem to be directly linked to food strategy).\n", "* **Gender**: In the Netherlands, there isn't a big difference between men and women in terms of the likelihood of cooking at home.\n", "* **Type of house**: This has no direct correlation with people's choice of food.\n", "\n", "\n", "
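To make the selected features concrete, here is a minimal sketch of how they could be combined into a per-neighborhood summary. It uses plain Python and three rows copied from the CBS extract shown later in this notebook. The helper `one_person_share` and the `summary` structure are illustrative names of my own, not part of the notebook's analysis.

```python
# Rows copied from the notebook's CBS extract:
# (neighborhood, total households, one-person households,
#  population density in people per square kilometer).
rows = [
    ('Burgwallen-Oude Zijde', 3090, 2180, 12323),
    ('Nieuwmarkt', 6485, 4285, 13741),
    ('Staatsliedenbuurt', 8105, 4860, 28139),
]


def one_person_share(total_households, one_person_households):
    '''Fraction of households that consist of a single person (hypothetical helper).'''
    if total_households == 0:
        return 0.0
    return one_person_households / total_households


# Illustrative summary: keep the density for ranking and add the one-person share.
summary = [
    {'neighborhood': name, 'density': density, 'share': one_person_share(total, single)}
    for name, total, single, density in rows
]

# Densest neighborhoods first, mirroring the sorting done later on.
summary.sort(key=lambda row: row['density'], reverse=True)

for row in summary:
    print('{}: {:.0%} one-person households, {} people per km2'.format(
        row['neighborhood'], row['share'], row['density']))
```

A share is used next to the raw count because a large neighborhood can have many one-person households while they still make up a small part of its total. Whether the share or the absolute number matters more for a restaurant is a judgment call this sketch does not settle.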
\n", "\n", "

\n", "\n", "\n", "# 3. Analyze Demographic Data\n", "[Back to top](#top)\n", "\n", "I selected the necessary data from [CBS](https://opendata.cbs.nl/statline/#/CBS/nl/dataset/84286NED/table?ts=1558337871247), as mentioned in [chapter 2.2](#22).
\n", "The data is in CSV format.\n", "\n", "## 3.1 Load Data to a Dataframe\n", "\n", "#### Load necessary libraries" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# library to handle data in a vectorized manner\n", "import numpy as np \n", "# library to load and handle dataframes\n", "import pandas as pd\n", "\n", "# Matplotlib and associated plotting modules\n", "import matplotlib.colors as colors\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Load the CSV file" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>
<tr><th></th><th>Neighborhood</th><th>ID</th><th>Total Residences</th><th>Total Households</th><th>One-person Households</th><th>Population Density</th><th>Lat</th><th>Lon</th></tr>
<tr><th>0</th><td>Burgwallen-Oude Zijde</td><td>WK036300</td><td>4305</td><td>3090</td><td>2180</td><td>12323</td><td>52.371946</td><td>4.896103</td></tr>
<tr><th>1</th><td>Burgwallen-Nieuwe Zijde</td><td>WK036301</td><td>3930</td><td>2835</td><td>2000</td><td>6881</td><td>52.373706</td><td>4.889922</td></tr>
<tr><th>2</th><td>Grachtengordel-West</td><td>WK036302</td><td>6385</td><td>4110</td><td>2570</td><td>14261</td><td>52.370837</td><td>4.885478</td></tr>
<tr><th>3</th><td>Grachtengordel-Zuid</td><td>WK036303</td><td>5350</td><td>3410</td><td>2140</td><td>10303</td><td>52.364422</td><td>4.894243</td></tr>
<tr><th>4</th><td>Nieuwmarkt</td><td>WK036304</td><td>9765</td><td>6485</td><td>4285</td><td>13741</td><td>52.372160</td><td>4.900096</td></tr>
</table>
" ], "text/plain": [ " Neighborhood ID Total Residences Total Households \\\n", "0 Burgwallen-Oude Zijde WK036300 4305 3090 \n", "1 Burgwallen-Nieuwe Zijde WK036301 3930 2835 \n", "2 Grachtengordel-West WK036302 6385 4110 \n", "3 Grachtengordel-Zuid WK036303 5350 3410 \n", "4 Nieuwmarkt WK036304 9765 6485 \n", "\n", " One-person Households Population Density Lat Lon \n", "0 2180 12323 52.371946 4.896103 \n", "1 2000 6881 52.373706 4.889922 \n", "2 2570 14261 52.370837 4.885478 \n", "3 2140 10303 52.364422 4.894243 \n", "4 4285 13741 52.372160 4.900096 " ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.read_csv('https://github.com/xding78/Sharing/raw/master/RestaurantQuest/Amsterdam.csv')\n", "df.head()" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(65, 8)" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### In total 65 neighborhoods\n", "\n", "_The municipality of Amsterdam has more neighborhoods than this. However, for the sake of this challenge, we decided to focus on the neighborhoods that are within or connected to Amsterdam city proper._ \n", "\n", "#### Sort the neighborhoods by the population density" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
<table>
<tr><th></th><th>Neighborhood</th><th>ID</th><th>Total Residences</th><th>Total Households</th><th>One-person Households</th><th>Population Density</th><th>Lat</th><th>Lon</th></tr>
<tr><th>14</th><td>Staatsliedenbuurt</td><td>WK036314</td><td>13315</td><td>8105</td><td>4860</td><td>28139</td><td>52.380287</td><td>4.870951</td></tr>
<tr><th>19</th><td>Van Lennepbuurt</td><td>WK036319</td><td>6990</td><td>4535</td><td>3005</td><td>28005</td><td>52.365144</td><td>4.867845</td></tr>
<tr><th>31</th><td>Indische Buurt West</td><td>WK036331</td><td>12640</td><td>7060</td><td>3930</td><td>26985</td><td>52.361625</td><td>4.938813</td></tr>
<tr><th>21</th><td>Overtoomse Sluis</td><td>WK036321</td><td>7890</td><td>4840</td><td>2910</td><td>26482</td><td>52.359468</td><td>4.860689</td></tr>
<tr><th>18</th><td>Kinkerbuurt</td><td>WK036318</td><td>6590</td><td>3950</td><td>2460</td><td>26135</td><td>52.369167</td><td>4.866649</td></tr>
</table>
" ], "text/plain": [ " Neighborhood ID Total Residences Total Households \\\n", "14 Staatsliedenbuurt WK036314 13315 8105 \n", "19 Van Lennepbuurt WK036319 6990 4535 \n", "31 Indische Buurt West WK036331 12640 7060 \n", "21 Overtoomse Sluis WK036321 7890 4840 \n", "18 Kinkerbuurt WK036318 6590 3950 \n", "\n", " One-person Households Population Density Lat Lon \n", "14 4860 28139 52.380287 4.870951 \n", "19 3005 28005 52.365144 4.867845 \n", "31 3930 26985 52.361625 4.938813 \n", "21 2910 26482 52.359468 4.860689 \n", "18 2460 26135 52.369167 4.866649 " ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.sort_values([\"Population Density\"], axis=0, ascending=False, inplace=True)\n", "df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3.2 Drop Unnecessary Data\n", "[Back to top](#top)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As mentioned earlier, we want to know the population density together with the number of one-person households. The total number of residences is not needed to answer any of the questions, so we remove it from the data from here on." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>
<tr><th></th><th>Neighborhood</th><th>ID</th><th>Total Households</th><th>One-person Households</th><th>Population Density</th><th>Lat</th><th>Lon</th></tr>
<tr><th>14</th><td>Staatsliedenbuurt</td><td>WK036314</td><td>8105</td><td>4860</td><td>28139</td><td>52.380287</td><td>4.870951</td></tr>
<tr><th>19</th><td>Van Lennepbuurt</td><td>WK036319</td><td>4535</td><td>3005</td><td>28005</td><td>52.365144</td><td>4.867845</td></tr>
<tr><th>31</th><td>Indische Buurt West</td><td>WK036331</td><td>7060</td><td>3930</td><td>26985</td><td>52.361625</td><td>4.938813</td></tr>
<tr><th>21</th><td>Overtoomse Sluis</td><td>WK036321</td><td>4840</td><td>2910</td><td>26482</td><td>52.359468</td><td>4.860689</td></tr>
<tr><th>18</th><td>Kinkerbuurt</td><td>WK036318</td><td>3950</td><td>2460</td><td>26135</td><td>52.369167</td><td>4.866649</td></tr>
</table>
" ], "text/plain": [ " Neighborhood ID Total Households One-person Households \\\n", "14 Staatsliedenbuurt WK036314 8105 4860 \n", "19 Van Lennepbuurt WK036319 4535 3005 \n", "31 Indische Buurt West WK036331 7060 3930 \n", "21 Overtoomse Sluis WK036321 4840 2910 \n", "18 Kinkerbuurt WK036318 3950 2460 \n", "\n", " Population Density Lat Lon \n", "14 28139 52.380287 4.870951 \n", "19 28005 52.365144 4.867845 \n", "31 26985 52.361625 4.938813 \n", "21 26482 52.359468 4.860689 \n", "18 26135 52.369167 4.866649 " ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.drop(\"Total Residences\", axis=1, inplace=True)\n", "df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3.3 Observe Data Using a Bar Chart\n", "[Back to top](#top)\n", "\n", "To decide what to do with the data, I first want to take a good look at it, and visualizing it helps a lot. I chose a horizontal bar chart because I want the neighborhood names to be easy to read. With 65 neighborhoods, a vertical bar chart might not offer enough room to show all the bars." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>
<tr><th></th><th>Neighborhood</th><th>Total Households</th><th>One-person Households</th><th>Population Density</th></tr>
<tr><th>14</th><td>Staatsliedenbuurt</td><td>8105</td><td>4860</td><td>28139</td></tr>
<tr><th>19</th><td>Van Lennepbuurt</td><td>4535</td><td>3005</td><td>28005</td></tr>
<tr><th>31</th><td>Indische Buurt West</td><td>7060</td><td>3930</td><td>26985</td></tr>
<tr><th>21</th><td>Overtoomse Sluis</td><td>4840</td><td>2910</td><td>26482</td></tr>
<tr><th>18</th><td>Kinkerbuurt</td><td>3950</td><td>2460</td><td>26135</td></tr>
</table>