{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Capstone Project - Final Assignment (Part 2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Xinrong Ding**
\n", "**Applied Data Science Capstone by IBM/Coursera**
\n", "*This is part 2 of the Capstone Final Assignment*\n", "\n", "# 2. Data\n", "\n", "Now that we know which questions we must answer, it's time to collect the necessary data.\n", "\n", "#### Recap of the questions\n", "\n", "1. What type of food to serve? (Foursquare API + Demographic data)\n", "2. How many similar restaurants already exist? (Foursquare API)\n", "3. Who are the target customers and where do they live? (Demographic data + Geolocation)\n", "\n", "\n", "## 2.1. Data preparation\n", "\n", "As mentioned above, we need to collect data from at least two sources:\n", "\n", "* Data of the surroundings (density of similar restaurants nearby)\n", "* Demographic data (per area in Amsterdam)\n", "\n", "### 2.1.1. Data of surroundings\n", "\n", "We can use the [**Foursquare API**](https://developer.foursquare.com/docs/api/endpoints) or the **Google API**.\n", "\n", "List of Foursquare API endpoints that we need:\n", "\n", "* [**Search**](https://developer.foursquare.com/docs/api/venues/search): Returns a list of venues near the current location, optionally matching a search term.\n", "* [**Categories**](https://developer.foursquare.com/docs/api/venues/categories): Returns a hierarchical list of categories applied to venues. This list is also available on the [categories page](https://developer.foursquare.com/docs/resources/categories).\n", "* ~~**Number of Likes**: Assumption: the more votes a restaurant has, the more it is visited. It can be used to see how often~~\n", " * **Unfortunately, this information needs a premium account.**\n", "* [**Trending**](https://developer.foursquare.com/docs/api/venues/trending): Returns a list of venues near the current location with the most people currently checked in.\n", "\n", "\n", "### 2.1.2. Demographic data\n", "\n", "The Central Bureau of Statistics of the Netherlands, CBS for short, hosts a large amount of demographic data on the people who live and work in the Netherlands.
\n", "On [this page](https://opendata.cbs.nl/statline/#/CBS/nl/dataset/84286NED/table?ts=1558337871247), you can select whichever features you need to solve the problem. The list of features is quite comprehensive. It is therefore important to define exactly what the data will be used for.\n", "\n", "\n", "#### List of features from CBS that can be used to answer the question\n", "\n", "* **Regional specifics** _(Regioaanduiding)_: This information helps link the relevant details to a specific area.\n", "* **Household** _(Particulier huishouden)_: People who live alone are our target customers.\n", "* **Population density** _(Bevolkingsdichtheid)_: A more densely populated area means more customers for a restaurant.\n", "* ~~**Income per household** _(Inkomen van huishoudens)_: Since restaurants are usually quite expensive compared to home cooking, only people with sufficient income can afford to go to restaurants often.~~\n", " * **Unfortunately, this information is missing consistently from the database.**\n", "\n", "#### Features deliberately excluded\n", "\n", "* **Civil status**: Since we can distinguish one-person households, it's not necessary to understand why people live alone (being single or divorced does not seem to be directly linked to food strategy).\n", "* **Gender**: In the Netherlands, there isn't a big difference between men and women in how likely they are to cook at home.\n", "* **Type of house**: This has no direct correlation to people's choice of food." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 2 }