{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Airbnb Homes in Vancouver - Data Cleaning\n",
    "#### Arash Tavassoli - April 2019\n",
    "---\n",
    "### Introduction\n",
    "\n",
    "In less than 10 years since its official launch, airbnb has now found its place in many of our lives and continues to prove itself as an affordable and convenient alternative to conventional hotel services. This notebook cleans and prepares the data before importing into a Tableau workbook that will visualize airbnb's presence in the City of Vancouver based on publicly available data from airbnb-owned website, Inside-airbnb.\n",
    "\n",
    "The Tableau visualization is available in [here](https://public.tableau.com/views/AirbnbHomesinVancouver/Dashboard?:embed=y&:display_count=yes).\n",
    "\n",
    "### Data Source\n",
    "\n",
    "The data is downloaded from Inside-airbnb on April 29, 2019 and reflects the detailed listing data, calendar data and listing reviews for 4,929 uniquely identified homes in Vancouver, BC, that were active on March 8, 2019.\n",
    "\n",
    "The data is dowloaded from [Inside Airbnb website](http://insideairbnb.com/get-the-data.html) where information is available under a Creative Commons CC0 1.0 Universal (CC0 1.0) \"Public Domain Dedication\" license.\n",
    "\n",
    "The datasets are specific to the City of Vancouver and are provided in 5 separate CSV and 1 GeoJSON files:\n",
    "\n",
    "|File Name | Description \n",
    "|:---------|:---------\n",
    "|`calendar.csv` | Detailed Calendar Data for listings in Vancouver\n",
    "|`listings.csv` | Summary information and metrics for listings in Vancouver\n",
    "|`listings-2.csv` | Detailed Listings data for Vancouver\n",
    "|`reviews.csv` | Summary Review data and Listing ID\n",
    "|`reviews-2.csv` | Detailed Review Data for listings in Vancouver\n",
    "|`neighbourhoods.csv` | Neighbourhood list for geo filter (ourced from city or open source GIS files)\n",
    "|`neighbourhoods.geojson` | GeoJSON file of neighbourhoods of the city"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Importing libraries:\n",
    "import pandas as pd\n",
    "import numpy as np"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Importing raw datasets:\n",
    "df_cal = pd.read_csv('Data/calendar.csv')\n",
    "df_listings = pd.read_csv('Data/listings.csv')\n",
    "df_listings_more = pd.read_csv('Data/listings-2.csv')\n",
    "df_neighb = pd.read_csv('Data/neighbourhoods.csv')\n",
    "df_reviews = pd.read_csv('Data/reviews.csv')\n",
    "df_reviews_more = pd.read_csv('Data/reviews-2.csv')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Cleaning the `df_listings` and `df_listings_more` datasets:\n",
    "\n",
    "We start by looking at the two listing datasets: `df_listings` and `df_listings_more`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Shape of df_listings:\t\t (4938, 16)\n",
      "Shape of df_listings_more:\t (4938, 106)\n"
     ]
    }
   ],
   "source": [
    "print('Shape of df_listings:\\t\\t', df_listings.shape)\n",
    "print('Shape of df_listings_more:\\t', df_listings_more.shape)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>0</th>\n",
       "      <th>1</th>\n",
       "      <th>2</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>id</th>\n",
       "      <td>10080</td>\n",
       "      <td>11400</td>\n",
       "      <td>13188</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>name</th>\n",
       "      <td>D1 -  Million Dollar View 2 BR</td>\n",
       "      <td>Central Lovely Rm in Victorian Home</td>\n",
       "      <td>Garden level studio in ideal loc.</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>host_id</th>\n",
       "      <td>30899</td>\n",
       "      <td>42305</td>\n",
       "      <td>51466</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>host_name</th>\n",
       "      <td>Rami</td>\n",
       "      <td>Jaynne</td>\n",
       "      <td>Family Guns</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>neighbourhood_group</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>neighbourhood</th>\n",
       "      <td>Downtown</td>\n",
       "      <td>Riley Park</td>\n",
       "      <td>Riley Park</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>latitude</th>\n",
       "      <td>49.2877</td>\n",
       "      <td>49.2538</td>\n",
       "      <td>49.2458</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>longitude</th>\n",
       "      <td>-123.121</td>\n",
       "      <td>-123.113</td>\n",
       "      <td>-123.105</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>room_type</th>\n",
       "      <td>Entire home/apt</td>\n",
       "      <td>Private room</td>\n",
       "      <td>Entire home/apt</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>price</th>\n",
       "      <td>296</td>\n",
       "      <td>60</td>\n",
       "      <td>120</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>minimum_nights</th>\n",
       "      <td>60</td>\n",
       "      <td>364</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>number_of_reviews</th>\n",
       "      <td>16</td>\n",
       "      <td>70</td>\n",
       "      <td>171</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>last_review</th>\n",
       "      <td>2017-02-26</td>\n",
       "      <td>2016-06-14</td>\n",
       "      <td>2019-02-24</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>reviews_per_month</th>\n",
       "      <td>0.18</td>\n",
       "      <td>0.64</td>\n",
       "      <td>1.55</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>calculated_host_listings_count</th>\n",
       "      <td>33</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>availability_365</th>\n",
       "      <td>191</td>\n",
       "      <td>365</td>\n",
       "      <td>214</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                                             0  \\\n",
       "id                                                       10080   \n",
       "name                            D1 -  Million Dollar View 2 BR   \n",
       "host_id                                                  30899   \n",
       "host_name                                                 Rami   \n",
       "neighbourhood_group                                        NaN   \n",
       "neighbourhood                                         Downtown   \n",
       "latitude                                               49.2877   \n",
       "longitude                                             -123.121   \n",
       "room_type                                      Entire home/apt   \n",
       "price                                                      296   \n",
       "minimum_nights                                              60   \n",
       "number_of_reviews                                           16   \n",
       "last_review                                         2017-02-26   \n",
       "reviews_per_month                                         0.18   \n",
       "calculated_host_listings_count                              33   \n",
       "availability_365                                           191   \n",
       "\n",
       "                                                                  1  \\\n",
       "id                                                            11400   \n",
       "name                            Central Lovely Rm in Victorian Home   \n",
       "host_id                                                       42305   \n",
       "host_name                                                    Jaynne   \n",
       "neighbourhood_group                                             NaN   \n",
       "neighbourhood                                            Riley Park   \n",
       "latitude                                                    49.2538   \n",
       "longitude                                                  -123.113   \n",
       "room_type                                              Private room   \n",
       "price                                                            60   \n",
       "minimum_nights                                                  364   \n",
       "number_of_reviews                                                70   \n",
       "last_review                                              2016-06-14   \n",
       "reviews_per_month                                              0.64   \n",
       "calculated_host_listings_count                                    1   \n",
       "availability_365                                                365   \n",
       "\n",
       "                                                                2  \n",
       "id                                                          13188  \n",
       "name                            Garden level studio in ideal loc.  \n",
       "host_id                                                     51466  \n",
       "host_name                                             Family Guns  \n",
       "neighbourhood_group                                           NaN  \n",
       "neighbourhood                                          Riley Park  \n",
       "latitude                                                  49.2458  \n",
       "longitude                                                -123.105  \n",
       "room_type                                         Entire home/apt  \n",
       "price                                                         120  \n",
       "minimum_nights                                                  2  \n",
       "number_of_reviews                                             171  \n",
       "last_review                                            2019-02-24  \n",
       "reviews_per_month                                            1.55  \n",
       "calculated_host_listings_count                                  1  \n",
       "availability_365                                              214  "
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_listings.head(3).T"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We drop the columns that we do not need in the analysis (not required or repeated on other datasets):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "df_listings.drop(['neighbourhood_group', 'last_review', 'reviews_per_month', 'availability_365', 'neighbourhood'], \n",
    "                 axis = 1, inplace = True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "id                                0\n",
       "name                              1\n",
       "host_id                           0\n",
       "host_name                         0\n",
       "latitude                          0\n",
       "longitude                         0\n",
       "room_type                         0\n",
       "price                             0\n",
       "minimum_nights                    0\n",
       "number_of_reviews                 0\n",
       "calculated_host_listings_count    0\n",
       "dtype: int64"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#Checking for NaNs:\n",
    "df_listings.isna().sum()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We'll deal with that one missing value later."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>0</th>\n",
       "      <th>1</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>id</th>\n",
       "      <td>10080</td>\n",
       "      <td>11400</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>listing_url</th>\n",
       "      <td>https://www.airbnb.com/rooms/10080</td>\n",
       "      <td>https://www.airbnb.com/rooms/11400</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>scrape_id</th>\n",
       "      <td>20190308052700</td>\n",
       "      <td>20190308052700</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>last_scraped</th>\n",
       "      <td>2019-03-08</td>\n",
       "      <td>2019-03-08</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>name</th>\n",
       "      <td>D1 -  Million Dollar View 2 BR</td>\n",
       "      <td>Central Lovely Rm in Victorian Home</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>summary</th>\n",
       "      <td>Stunning two bedroom, two bathroom apartment. ...</td>\n",
       "      <td>Well-appointed room with a view of the garden ...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>space</th>\n",
       "      <td>Bed setup: 2 x queen, I can add up to 2 twin s...</td>\n",
       "      <td>Centrally-located lovely, quiet home on tree-l...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>description</th>\n",
       "      <td>Stunning two bedroom, two bathroom apartment. ...</td>\n",
       "      <td>Well-appointed room with a view of the garden ...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>experiences_offered</th>\n",
       "      <td>none</td>\n",
       "      <td>none</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>neighborhood_overview</th>\n",
       "      <td>NaN</td>\n",
       "      <td>Very quiet residential area, yet only 1-1/2 bl...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>notes</th>\n",
       "      <td>1. CHECK-IN TIME IS AFTER 3PM PST AND CHECK-OU...</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>transit</th>\n",
       "      <td>NaN</td>\n",
       "      <td>Bus stop 1-1/2 blocks away. 15 minutes to down...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>access</th>\n",
       "      <td>There is no access to the building ammenities.</td>\n",
       "      <td>Access to my home and garden. Shared bathroom....</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>interaction</th>\n",
       "      <td>NaN</td>\n",
       "      <td>I live here so will interact with the guest wh...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>house_rules</th>\n",
       "      <td>1. CHECK-IN TIME IS AFTER 3 PM PST AND CHECK-O...</td>\n",
       "      <td>-No smoking indoors.  -Recycling, composting  ...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>thumbnail_url</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>medium_url</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>picture_url</th>\n",
       "      <td>https://a0.muscache.com/im/pictures/55778229/c...</td>\n",
       "      <td>https://a0.muscache.com/im/pictures/74189163/a...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>xl_picture_url</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>host_id</th>\n",
       "      <td>30899</td>\n",
       "      <td>42305</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>host_url</th>\n",
       "      <td>https://www.airbnb.com/users/show/30899</td>\n",
       "      <td>https://www.airbnb.com/users/show/42305</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>host_name</th>\n",
       "      <td>Rami</td>\n",
       "      <td>Jaynne</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>host_since</th>\n",
       "      <td>2009-08-10</td>\n",
       "      <td>2009-09-30</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>host_location</th>\n",
       "      <td>Vancouver, British Columbia, Canada</td>\n",
       "      <td>Vancouver, British Columbia, Canada</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>host_about</th>\n",
       "      <td>I will be happy to host you.</td>\n",
       "      <td>I am a self-employed graphic designer, AutoCAD...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>host_response_time</th>\n",
       "      <td>within a few hours</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>host_response_rate</th>\n",
       "      <td>100%</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>host_acceptance_rate</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>host_is_superhost</th>\n",
       "      <td>f</td>\n",
       "      <td>f</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>host_thumbnail_url</th>\n",
       "      <td>https://a0.muscache.com/im/pictures/user/55302...</td>\n",
       "      <td>https://a0.muscache.com/im/users/42305/profile...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>host_picture_url</th>\n",
       "      <td>https://a0.muscache.com/im/pictures/user/55302...</td>\n",
       "      <td>https://a0.muscache.com/im/users/42305/profile...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>host_neighbourhood</th>\n",
       "      <td>Coal Harbour</td>\n",
       "      <td>Riley Park</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>host_listings_count</th>\n",
       "      <td>33</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>host_total_listings_count</th>\n",
       "      <td>33</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>host_verifications</th>\n",
       "      <td>['email', 'phone', 'reviews', 'jumio', 'govern...</td>\n",
       "      <td>['email', 'phone', 'reviews', 'jumio', 'govern...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>host_has_profile_pic</th>\n",
       "      <td>t</td>\n",
       "      <td>t</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>host_identity_verified</th>\n",
       "      <td>t</td>\n",
       "      <td>t</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>street</th>\n",
       "      <td>Vancouver, BC, Canada</td>\n",
       "      <td>Vancouver, BC, Canada</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>neighbourhood</th>\n",
       "      <td>Coal Harbour</td>\n",
       "      <td>Riley Park</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>neighbourhood_cleansed</th>\n",
       "      <td>Downtown</td>\n",
       "      <td>Riley Park</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>neighbourhood_group_cleansed</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>city</th>\n",
       "      <td>Vancouver</td>\n",
       "      <td>Vancouver</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>state</th>\n",
       "      <td>BC</td>\n",
       "      <td>BC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>zipcode</th>\n",
       "      <td>V6E 2P4</td>\n",
       "      <td>V5Y</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>market</th>\n",
       "      <td>Vancouver</td>\n",
       "      <td>Vancouver</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>smart_location</th>\n",
       "      <td>Vancouver, Canada</td>\n",
       "      <td>Vancouver, Canada</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>country_code</th>\n",
       "      <td>CA</td>\n",
       "      <td>CA</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>country</th>\n",
       "      <td>Canada</td>\n",
       "      <td>Canada</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>latitude</th>\n",
       "      <td>49.2877</td>\n",
       "      <td>49.2538</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>longitude</th>\n",
       "      <td>-123.121</td>\n",
       "      <td>-123.113</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>is_location_exact</th>\n",
       "      <td>t</td>\n",
       "      <td>t</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>property_type</th>\n",
       "      <td>Condominium</td>\n",
       "      <td>House</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>room_type</th>\n",
       "      <td>Entire home/apt</td>\n",
       "      <td>Private room</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>accommodates</th>\n",
       "      <td>6</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>bathrooms</th>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>bedrooms</th>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>beds</th>\n",
       "      <td>4</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>bed_type</th>\n",
       "      <td>Real Bed</td>\n",
       "      <td>Real Bed</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>amenities</th>\n",
       "      <td>{TV,\"Cable TV\",Internet,Wifi,\"Air conditioning...</td>\n",
       "      <td>{TV,\"Cable TV\",Internet,Wifi,\"Free parking on ...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>square_feet</th>\n",
       "      <td>1200</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>price</th>\n",
       "      <td>$296.00</td>\n",
       "      <td>$60.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>weekly_price</th>\n",
       "      <td>NaN</td>\n",
       "      <td>$385.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>monthly_price</th>\n",
       "      <td>$4,750.00</td>\n",
       "      <td>$1,400.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>security_deposit</th>\n",
       "      <td>$0.00</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>cleaning_fee</th>\n",
       "      <td>$195.00</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>guests_included</th>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>extra_people</th>\n",
       "      <td>$0.00</td>\n",
       "      <td>$0.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>minimum_nights</th>\n",
       "      <td>60</td>\n",
       "      <td>364</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>maximum_nights</th>\n",
       "      <td>1124</td>\n",
       "      <td>364</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>minimum_minimum_nights</th>\n",
       "      <td>60</td>\n",
       "      <td>364</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>maximum_minimum_nights</th>\n",
       "      <td>60</td>\n",
       "      <td>364</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>minimum_maximum_nights</th>\n",
       "      <td>1124</td>\n",
       "      <td>364</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>maximum_maximum_nights</th>\n",
       "      <td>1124</td>\n",
       "      <td>364</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>minimum_nights_avg_ntm</th>\n",
       "      <td>60</td>\n",
       "      <td>364</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>maximum_nights_avg_ntm</th>\n",
       "      <td>1124</td>\n",
       "      <td>364</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>calendar_updated</th>\n",
       "      <td>today</td>\n",
       "      <td>11 months ago</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>has_availability</th>\n",
       "      <td>t</td>\n",
       "      <td>t</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>availability_30</th>\n",
       "      <td>0</td>\n",
       "      <td>30</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>availability_60</th>\n",
       "      <td>0</td>\n",
       "      <td>60</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>availability_90</th>\n",
       "      <td>0</td>\n",
       "      <td>90</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>availability_365</th>\n",
       "      <td>191</td>\n",
       "      <td>365</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>calendar_last_scraped</th>\n",
       "      <td>2019-03-08</td>\n",
       "      <td>2019-03-08</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>number_of_reviews</th>\n",
       "      <td>16</td>\n",
       "      <td>70</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>number_of_reviews_ltm</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>first_review</th>\n",
       "      <td>2011-11-15</td>\n",
       "      <td>2010-02-18</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>last_review</th>\n",
       "      <td>2017-02-26</td>\n",
       "      <td>2016-06-14</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>review_scores_rating</th>\n",
       "      <td>93</td>\n",
       "      <td>92</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>review_scores_accuracy</th>\n",
       "      <td>9</td>\n",
       "      <td>9</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>review_scores_cleanliness</th>\n",
       "      <td>9</td>\n",
       "      <td>10</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>review_scores_checkin</th>\n",
       "      <td>10</td>\n",
       "      <td>10</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>review_scores_communication</th>\n",
       "      <td>9</td>\n",
       "      <td>10</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>review_scores_location</th>\n",
       "      <td>10</td>\n",
       "      <td>10</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>review_scores_value</th>\n",
       "      <td>9</td>\n",
       "      <td>9</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>requires_license</th>\n",
       "      <td>t</td>\n",
       "      <td>t</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>license</th>\n",
       "      <td>18-476608</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>jurisdiction_names</th>\n",
       "      <td>{\"British Columbia\",\" Canada\",\" Vancouver\",\" B...</td>\n",
       "      <td>{\"British Columbia\",\" Canada\",\" Vancouver\",\" B...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>instant_bookable</th>\n",
       "      <td>f</td>\n",
       "      <td>f</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>is_business_travel_ready</th>\n",
       "      <td>f</td>\n",
       "      <td>f</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>cancellation_policy</th>\n",
       "      <td>strict_14_with_grace_period</td>\n",
       "      <td>strict_14_with_grace_period</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>require_guest_profile_picture</th>\n",
       "      <td>f</td>\n",
       "      <td>t</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>require_guest_phone_verification</th>\n",
       "      <td>f</td>\n",
       "      <td>t</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>calculated_host_listings_count</th>\n",
       "      <td>33</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>calculated_host_listings_count_entire_homes</th>\n",
       "      <td>33</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>calculated_host_listings_count_private_rooms</th>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>calculated_host_listings_count_shared_rooms</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>reviews_per_month</th>\n",
       "      <td>0.18</td>\n",
       "      <td>0.64</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                                                                              0  \\\n",
       "id                                                                                        10080   \n",
       "listing_url                                                  https://www.airbnb.com/rooms/10080   \n",
       "scrape_id                                                                        20190308052700   \n",
       "last_scraped                                                                         2019-03-08   \n",
       "name                                                             D1 -  Million Dollar View 2 BR   \n",
       "summary                                       Stunning two bedroom, two bathroom apartment. ...   \n",
       "space                                         Bed setup: 2 x queen, I can add up to 2 twin s...   \n",
       "description                                   Stunning two bedroom, two bathroom apartment. ...   \n",
       "experiences_offered                                                                        none   \n",
       "neighborhood_overview                                                                       NaN   \n",
       "notes                                         1. CHECK-IN TIME IS AFTER 3PM PST AND CHECK-OU...   \n",
       "transit                                                                                     NaN   \n",
       "access                                           There is no access to the building ammenities.   \n",
       "interaction                                                                                 NaN   \n",
       "house_rules                                   1. CHECK-IN TIME IS AFTER 3 PM PST AND CHECK-O...   \n",
       "thumbnail_url                                                                               NaN   \n",
       "medium_url                                                                                  NaN   \n",
       "picture_url                                   https://a0.muscache.com/im/pictures/55778229/c...   \n",
       "xl_picture_url                                                                              NaN   \n",
       "host_id                                                                                   30899   \n",
       "host_url                                                https://www.airbnb.com/users/show/30899   \n",
       "host_name                                                                                  Rami   \n",
       "host_since                                                                           2009-08-10   \n",
       "host_location                                               Vancouver, British Columbia, Canada   \n",
       "host_about                                                         I will be happy to host you.   \n",
       "host_response_time                                                           within a few hours   \n",
       "host_response_rate                                                                         100%   \n",
       "host_acceptance_rate                                                                        NaN   \n",
       "host_is_superhost                                                                             f   \n",
       "host_thumbnail_url                            https://a0.muscache.com/im/pictures/user/55302...   \n",
       "host_picture_url                              https://a0.muscache.com/im/pictures/user/55302...   \n",
       "host_neighbourhood                                                                 Coal Harbour   \n",
       "host_listings_count                                                                          33   \n",
       "host_total_listings_count                                                                    33   \n",
       "host_verifications                            ['email', 'phone', 'reviews', 'jumio', 'govern...   \n",
       "host_has_profile_pic                                                                          t   \n",
       "host_identity_verified                                                                        t   \n",
       "street                                                                    Vancouver, BC, Canada   \n",
       "neighbourhood                                                                      Coal Harbour   \n",
       "neighbourhood_cleansed                                                                 Downtown   \n",
       "neighbourhood_group_cleansed                                                                NaN   \n",
       "city                                                                                  Vancouver   \n",
       "state                                                                                        BC   \n",
       "zipcode                                                                                 V6E 2P4   \n",
       "market                                                                                Vancouver   \n",
       "smart_location                                                                Vancouver, Canada   \n",
       "country_code                                                                                 CA   \n",
       "country                                                                                  Canada   \n",
       "latitude                                                                                49.2877   \n",
       "longitude                                                                              -123.121   \n",
       "is_location_exact                                                                             t   \n",
       "property_type                                                                       Condominium   \n",
       "room_type                                                                       Entire home/apt   \n",
       "accommodates                                                                                  6   \n",
       "bathrooms                                                                                     2   \n",
       "bedrooms                                                                                      2   \n",
       "beds                                                                                          4   \n",
       "bed_type                                                                               Real Bed   \n",
       "amenities                                     {TV,\"Cable TV\",Internet,Wifi,\"Air conditioning...   \n",
       "square_feet                                                                                1200   \n",
       "price                                                                                   $296.00   \n",
       "weekly_price                                                                                NaN   \n",
       "monthly_price                                                                         $4,750.00   \n",
       "security_deposit                                                                          $0.00   \n",
       "cleaning_fee                                                                            $195.00   \n",
       "guests_included                                                                               1   \n",
       "extra_people                                                                              $0.00   \n",
       "minimum_nights                                                                               60   \n",
       "maximum_nights                                                                             1124   \n",
       "minimum_minimum_nights                                                                       60   \n",
       "maximum_minimum_nights                                                                       60   \n",
       "minimum_maximum_nights                                                                     1124   \n",
       "maximum_maximum_nights                                                                     1124   \n",
       "minimum_nights_avg_ntm                                                                       60   \n",
       "maximum_nights_avg_ntm                                                                     1124   \n",
       "calendar_updated                                                                          today   \n",
       "has_availability                                                                              t   \n",
       "availability_30                                                                               0   \n",
       "availability_60                                                                               0   \n",
       "availability_90                                                                               0   \n",
       "availability_365                                                                            191   \n",
       "calendar_last_scraped                                                                2019-03-08   \n",
       "number_of_reviews                                                                            16   \n",
       "number_of_reviews_ltm                                                                         0   \n",
       "first_review                                                                         2011-11-15   \n",
       "last_review                                                                          2017-02-26   \n",
       "review_scores_rating                                                                         93   \n",
       "review_scores_accuracy                                                                        9   \n",
       "review_scores_cleanliness                                                                     9   \n",
       "review_scores_checkin                                                                        10   \n",
       "review_scores_communication                                                                   9   \n",
       "review_scores_location                                                                       10   \n",
       "review_scores_value                                                                           9   \n",
       "requires_license                                                                              t   \n",
       "license                                                                               18-476608   \n",
       "jurisdiction_names                            {\"British Columbia\",\" Canada\",\" Vancouver\",\" B...   \n",
       "instant_bookable                                                                              f   \n",
       "is_business_travel_ready                                                                      f   \n",
       "cancellation_policy                                                 strict_14_with_grace_period   \n",
       "require_guest_profile_picture                                                                 f   \n",
       "require_guest_phone_verification                                                              f   \n",
       "calculated_host_listings_count                                                               33   \n",
       "calculated_host_listings_count_entire_homes                                                  33   \n",
       "calculated_host_listings_count_private_rooms                                                  0   \n",
       "calculated_host_listings_count_shared_rooms                                                   0   \n",
       "reviews_per_month                                                                          0.18   \n",
       "\n",
       "                                                                                              1  \n",
       "id                                                                                        11400  \n",
       "listing_url                                                  https://www.airbnb.com/rooms/11400  \n",
       "scrape_id                                                                        20190308052700  \n",
       "last_scraped                                                                         2019-03-08  \n",
       "name                                                        Central Lovely Rm in Victorian Home  \n",
       "summary                                       Well-appointed room with a view of the garden ...  \n",
       "space                                         Centrally-located lovely, quiet home on tree-l...  \n",
       "description                                   Well-appointed room with a view of the garden ...  \n",
       "experiences_offered                                                                        none  \n",
       "neighborhood_overview                         Very quiet residential area, yet only 1-1/2 bl...  \n",
       "notes                                                                                       NaN  \n",
       "transit                                       Bus stop 1-1/2 blocks away. 15 minutes to down...  \n",
       "access                                        Access to my home and garden. Shared bathroom....  \n",
       "interaction                                   I live here so will interact with the guest wh...  \n",
       "house_rules                                   -No smoking indoors.  -Recycling, composting  ...  \n",
       "thumbnail_url                                                                               NaN  \n",
       "medium_url                                                                                  NaN  \n",
       "picture_url                                   https://a0.muscache.com/im/pictures/74189163/a...  \n",
       "xl_picture_url                                                                              NaN  \n",
       "host_id                                                                                   42305  \n",
       "host_url                                                https://www.airbnb.com/users/show/42305  \n",
       "host_name                                                                                Jaynne  \n",
       "host_since                                                                           2009-09-30  \n",
       "host_location                                               Vancouver, British Columbia, Canada  \n",
       "host_about                                    I am a self-employed graphic designer, AutoCAD...  \n",
       "host_response_time                                                                          NaN  \n",
       "host_response_rate                                                                          NaN  \n",
       "host_acceptance_rate                                                                        NaN  \n",
       "host_is_superhost                                                                             f  \n",
       "host_thumbnail_url                            https://a0.muscache.com/im/users/42305/profile...  \n",
       "host_picture_url                              https://a0.muscache.com/im/users/42305/profile...  \n",
       "host_neighbourhood                                                                   Riley Park  \n",
       "host_listings_count                                                                           1  \n",
       "host_total_listings_count                                                                     1  \n",
       "host_verifications                            ['email', 'phone', 'reviews', 'jumio', 'govern...  \n",
       "host_has_profile_pic                                                                          t  \n",
       "host_identity_verified                                                                        t  \n",
       "street                                                                    Vancouver, BC, Canada  \n",
       "neighbourhood                                                                        Riley Park  \n",
       "neighbourhood_cleansed                                                               Riley Park  \n",
       "neighbourhood_group_cleansed                                                                NaN  \n",
       "city                                                                                  Vancouver  \n",
       "state                                                                                        BC  \n",
       "zipcode                                                                                     V5Y  \n",
       "market                                                                                Vancouver  \n",
       "smart_location                                                                Vancouver, Canada  \n",
       "country_code                                                                                 CA  \n",
       "country                                                                                  Canada  \n",
       "latitude                                                                                49.2538  \n",
       "longitude                                                                              -123.113  \n",
       "is_location_exact                                                                             t  \n",
       "property_type                                                                             House  \n",
       "room_type                                                                          Private room  \n",
       "accommodates                                                                                  1  \n",
       "bathrooms                                                                                     1  \n",
       "bedrooms                                                                                      1  \n",
       "beds                                                                                          1  \n",
       "bed_type                                                                               Real Bed  \n",
       "amenities                                     {TV,\"Cable TV\",Internet,Wifi,\"Free parking on ...  \n",
       "square_feet                                                                                   0  \n",
       "price                                                                                    $60.00  \n",
       "weekly_price                                                                            $385.00  \n",
       "monthly_price                                                                         $1,400.00  \n",
       "security_deposit                                                                            NaN  \n",
       "cleaning_fee                                                                                NaN  \n",
       "guests_included                                                                               1  \n",
       "extra_people                                                                              $0.00  \n",
       "minimum_nights                                                                              364  \n",
       "maximum_nights                                                                              364  \n",
       "minimum_minimum_nights                                                                      364  \n",
       "maximum_minimum_nights                                                                      364  \n",
       "minimum_maximum_nights                                                                      364  \n",
       "maximum_maximum_nights                                                                      364  \n",
       "minimum_nights_avg_ntm                                                                      364  \n",
       "maximum_nights_avg_ntm                                                                      364  \n",
       "calendar_updated                                                                  11 months ago  \n",
       "has_availability                                                                              t  \n",
       "availability_30                                                                              30  \n",
       "availability_60                                                                              60  \n",
       "availability_90                                                                              90  \n",
       "availability_365                                                                            365  \n",
       "calendar_last_scraped                                                                2019-03-08  \n",
       "number_of_reviews                                                                            70  \n",
       "number_of_reviews_ltm                                                                         0  \n",
       "first_review                                                                         2010-02-18  \n",
       "last_review                                                                          2016-06-14  \n",
       "review_scores_rating                                                                         92  \n",
       "review_scores_accuracy                                                                        9  \n",
       "review_scores_cleanliness                                                                    10  \n",
       "review_scores_checkin                                                                        10  \n",
       "review_scores_communication                                                                  10  \n",
       "review_scores_location                                                                       10  \n",
       "review_scores_value                                                                           9  \n",
       "requires_license                                                                              t  \n",
       "license                                                                                     NaN  \n",
       "jurisdiction_names                            {\"British Columbia\",\" Canada\",\" Vancouver\",\" B...  \n",
       "instant_bookable                                                                              f  \n",
       "is_business_travel_ready                                                                      f  \n",
       "cancellation_policy                                                 strict_14_with_grace_period  \n",
       "require_guest_profile_picture                                                                 t  \n",
       "require_guest_phone_verification                                                              t  \n",
       "calculated_host_listings_count                                                                1  \n",
       "calculated_host_listings_count_entire_homes                                                   0  \n",
       "calculated_host_listings_count_private_rooms                                                  1  \n",
       "calculated_host_listings_count_shared_rooms                                                   0  \n",
       "reviews_per_month                                                                          0.64  "
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.set_option('display.max_rows', 500)\n",
    "pd.set_option('display.max_columns', 500)\n",
    "\n",
    "df_listings_more.head(2).T"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Not all columns will be required for visualization so let's do some clean-up:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "columns_to_keep = ['id', 'summary', 'space', 'description', 'host_listings_count', 'host_total_listings_count',\n",
    "                  'neighbourhood_cleansed', 'is_location_exact', 'property_type', 'accommodates', 'bedrooms',\n",
    "                  'bathrooms', 'beds', 'amenities', 'availability_30', 'availability_60', 'availability_90',\n",
    "                  'availability_365']\n",
    "\n",
    "df_listings_more = df_listings_more.loc[:,columns_to_keep]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "id                             int64\n",
       "summary                       object\n",
       "space                         object\n",
       "description                   object\n",
       "host_listings_count            int64\n",
       "host_total_listings_count      int64\n",
       "neighbourhood_cleansed        object\n",
       "is_location_exact             object\n",
       "property_type                 object\n",
       "accommodates                   int64\n",
       "bedrooms                     float64\n",
       "bathrooms                    float64\n",
       "beds                         float64\n",
       "amenities                     object\n",
       "availability_30                int64\n",
       "availability_60                int64\n",
       "availability_90                int64\n",
       "availability_365               int64\n",
       "dtype: object"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_listings_more.dtypes"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "id                              0\n",
       "summary                       167\n",
       "space                        1057\n",
       "description                    81\n",
       "host_listings_count             0\n",
       "host_total_listings_count       0\n",
       "neighbourhood_cleansed          0\n",
       "is_location_exact               0\n",
       "property_type                   0\n",
       "accommodates                    0\n",
       "bedrooms                        3\n",
       "bathrooms                       0\n",
       "beds                            5\n",
       "amenities                       0\n",
       "availability_30                 0\n",
       "availability_60                 0\n",
       "availability_90                 0\n",
       "availability_365                0\n",
       "dtype: int64"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_listings_more.isna().sum()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In case of missing data in `summary`, `space` and `description` columns we will replace NaN's with empty strings:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "df_listings_more['summary'] = df_listings_more['summary'].fillna('')\n",
    "df_listings_more['space'] = df_listings_more['space'].fillna('')\n",
    "df_listings_more['description'] = df_listings_more['description'].fillna('')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "id                           0\n",
       "summary                      0\n",
       "space                        0\n",
       "description                  0\n",
       "host_listings_count          0\n",
       "host_total_listings_count    0\n",
       "neighbourhood_cleansed       0\n",
       "is_location_exact            0\n",
       "property_type                0\n",
       "accommodates                 0\n",
       "bedrooms                     3\n",
       "bathrooms                    0\n",
       "beds                         5\n",
       "amenities                    0\n",
       "availability_30              0\n",
       "availability_60              0\n",
       "availability_90              0\n",
       "availability_365             0\n",
       "dtype: int64"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_listings_more.isna().sum()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We will deal with the remaining NaN's in the following sections."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We now merge the two datasets, `df_listings` and `df_listings_more` into one single dataset and call it `df_listings_clean`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "df_listings_clean = df_listings.merge(df_listings_more, on = 'id')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "id                                0\n",
       "name                              1\n",
       "host_id                           0\n",
       "host_name                         0\n",
       "latitude                          0\n",
       "longitude                         0\n",
       "room_type                         0\n",
       "price                             0\n",
       "minimum_nights                    0\n",
       "number_of_reviews                 0\n",
       "calculated_host_listings_count    0\n",
       "summary                           0\n",
       "space                             0\n",
       "description                       0\n",
       "host_listings_count               0\n",
       "host_total_listings_count         0\n",
       "neighbourhood_cleansed            0\n",
       "is_location_exact                 0\n",
       "property_type                     0\n",
       "accommodates                      0\n",
       "bedrooms                          3\n",
       "bathrooms                         0\n",
       "beds                              5\n",
       "amenities                         0\n",
       "availability_30                   0\n",
       "availability_60                   0\n",
       "availability_90                   0\n",
       "availability_365                  0\n",
       "dtype: int64"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_listings_clean.isna().sum()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can now drop the 9 remaining rows with missing values in `name`, `bedrooms` and `beds` columns."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "df_listings_clean.dropna(axis = 0, inplace = True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "False"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_listings_clean.isna().sum().any()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "id                                  int64\n",
       "name                               object\n",
       "host_id                             int64\n",
       "host_name                          object\n",
       "latitude                          float64\n",
       "longitude                         float64\n",
       "room_type                          object\n",
       "price                               int64\n",
       "minimum_nights                      int64\n",
       "number_of_reviews                   int64\n",
       "calculated_host_listings_count      int64\n",
       "summary                            object\n",
       "space                              object\n",
       "description                        object\n",
       "host_listings_count                 int64\n",
       "host_total_listings_count           int64\n",
       "neighbourhood_cleansed             object\n",
       "is_location_exact                  object\n",
       "property_type                      object\n",
       "accommodates                        int64\n",
       "bedrooms                          float64\n",
       "bathrooms                         float64\n",
       "beds                              float64\n",
       "amenities                          object\n",
       "availability_30                     int64\n",
       "availability_60                     int64\n",
       "availability_90                     int64\n",
       "availability_365                    int64\n",
       "dtype: object"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_listings_clean.dtypes"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As an optional help for visulaization down the road let's change data type in some columns to appropriate integer type:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [],
   "source": [
    "df_listings_clean['bedrooms'] = df_listings_clean['bedrooms'].astype('int64')\n",
    "df_listings_clean['bathrooms'] = df_listings_clean['bathrooms'].astype('int64')\n",
    "df_listings_clean['beds'] = df_listings_clean['beds'].astype('int64')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### EDA on different neighbourhoods:\n",
    "\n",
    "Although Tableau will be used down the road to visualize different measures in different neighbourhoods we can also have a quick look at different features across neighbourhoods in here to get an understanding that will help us choose the best visualizations in Tableau:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Creating an empty dataframe to store different measures for different neighbourhoods:\n",
    "df_neighbourhood = pd.DataFrame(index = df_listings_clean['neighbourhood_cleansed'].unique())\n",
    "\n",
    "# Average nighly price in different neighbourhoods:\n",
    "df_neighbourhood['Avg_price'] = df_listings_clean.groupby(by = 'neighbourhood_cleansed')['price'].mean().round()\n",
    "\n",
    "# Total number of listings in different neighbourhoods:\n",
    "df_neighbourhood['Listing_count'] = df_listings_clean.groupby(by = 'neighbourhood_cleansed')['id'].count()\n",
    "\n",
    "# Average 30-Day availability in different neighbourhoods:\n",
    "df_neighbourhood['Avg_30_availability'] = df_listings_clean.groupby(by = 'neighbourhood_cleansed')['availability_30']\\\n",
    "                                            .mean().round().astype(int)\n",
    "\n",
    "# Average 60-Day availability in different neighbourhoods:\n",
    "df_neighbourhood['Avg_60_availability'] = df_listings_clean.groupby(by = 'neighbourhood_cleansed')['availability_60']\\\n",
    "                                            .mean().round().astype(int)\n",
    "\n",
    "# Average 90-Day availability in different neighbourhoods:\n",
    "df_neighbourhood['Avg_90_availability'] = df_listings_clean.groupby(by = 'neighbourhood_cleansed')['availability_90']\\\n",
    "                                            .mean().round().astype(int)\n",
    "\n",
    "# Average 365-Day availability in different neighbourhoods:\n",
    "df_neighbourhood['Avg_365_availability'] = df_listings_clean.groupby(by = 'neighbourhood_cleansed')['availability_365']\\\n",
    "                                            .mean().round().astype(int)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Avg_price</th>\n",
       "      <th>Listing_count</th>\n",
       "      <th>Avg_30_availability</th>\n",
       "      <th>Avg_60_availability</th>\n",
       "      <th>Avg_90_availability</th>\n",
       "      <th>Avg_365_availability</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Downtown</th>\n",
       "      <td>196.0</td>\n",
       "      <td>1064</td>\n",
       "      <td>11</td>\n",
       "      <td>25</td>\n",
       "      <td>41</td>\n",
       "      <td>147</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Riley Park</th>\n",
       "      <td>145.0</td>\n",
       "      <td>275</td>\n",
       "      <td>10</td>\n",
       "      <td>21</td>\n",
       "      <td>32</td>\n",
       "      <td>117</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Downtown Eastside</th>\n",
       "      <td>176.0</td>\n",
       "      <td>271</td>\n",
       "      <td>12</td>\n",
       "      <td>25</td>\n",
       "      <td>38</td>\n",
       "      <td>129</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>West End</th>\n",
       "      <td>168.0</td>\n",
       "      <td>362</td>\n",
       "      <td>10</td>\n",
       "      <td>22</td>\n",
       "      <td>36</td>\n",
       "      <td>117</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Kensington-Cedar Cottage</th>\n",
       "      <td>134.0</td>\n",
       "      <td>319</td>\n",
       "      <td>10</td>\n",
       "      <td>22</td>\n",
       "      <td>35</td>\n",
       "      <td>136</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Hastings-Sunrise</th>\n",
       "      <td>120.0</td>\n",
       "      <td>247</td>\n",
       "      <td>12</td>\n",
       "      <td>26</td>\n",
       "      <td>42</td>\n",
       "      <td>138</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Renfrew-Collingwood</th>\n",
       "      <td>102.0</td>\n",
       "      <td>200</td>\n",
       "      <td>12</td>\n",
       "      <td>27</td>\n",
       "      <td>43</td>\n",
       "      <td>153</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Mount Pleasant</th>\n",
       "      <td>150.0</td>\n",
       "      <td>294</td>\n",
       "      <td>10</td>\n",
       "      <td>22</td>\n",
       "      <td>35</td>\n",
       "      <td>126</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Grandview-Woodland</th>\n",
       "      <td>135.0</td>\n",
       "      <td>247</td>\n",
       "      <td>8</td>\n",
       "      <td>17</td>\n",
       "      <td>28</td>\n",
       "      <td>116</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Fairview</th>\n",
       "      <td>144.0</td>\n",
       "      <td>151</td>\n",
       "      <td>7</td>\n",
       "      <td>18</td>\n",
       "      <td>31</td>\n",
       "      <td>132</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Kitsilano</th>\n",
       "      <td>190.0</td>\n",
       "      <td>424</td>\n",
       "      <td>10</td>\n",
       "      <td>22</td>\n",
       "      <td>34</td>\n",
       "      <td>133</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Kerrisdale</th>\n",
       "      <td>173.0</td>\n",
       "      <td>69</td>\n",
       "      <td>15</td>\n",
       "      <td>31</td>\n",
       "      <td>47</td>\n",
       "      <td>145</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Arbutus Ridge</th>\n",
       "      <td>198.0</td>\n",
       "      <td>84</td>\n",
       "      <td>13</td>\n",
       "      <td>29</td>\n",
       "      <td>47</td>\n",
       "      <td>156</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Sunset</th>\n",
       "      <td>172.0</td>\n",
       "      <td>106</td>\n",
       "      <td>13</td>\n",
       "      <td>30</td>\n",
       "      <td>48</td>\n",
       "      <td>142</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Dunbar Southlands</th>\n",
       "      <td>190.0</td>\n",
       "      <td>138</td>\n",
       "      <td>13</td>\n",
       "      <td>30</td>\n",
       "      <td>48</td>\n",
       "      <td>170</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Marpole</th>\n",
       "      <td>129.0</td>\n",
       "      <td>166</td>\n",
       "      <td>17</td>\n",
       "      <td>38</td>\n",
       "      <td>60</td>\n",
       "      <td>167</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Killarney</th>\n",
       "      <td>91.0</td>\n",
       "      <td>67</td>\n",
       "      <td>15</td>\n",
       "      <td>34</td>\n",
       "      <td>53</td>\n",
       "      <td>140</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Shaughnessy</th>\n",
       "      <td>216.0</td>\n",
       "      <td>87</td>\n",
       "      <td>12</td>\n",
       "      <td>26</td>\n",
       "      <td>39</td>\n",
       "      <td>107</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>South Cambie</th>\n",
       "      <td>185.0</td>\n",
       "      <td>59</td>\n",
       "      <td>10</td>\n",
       "      <td>22</td>\n",
       "      <td>35</td>\n",
       "      <td>125</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Strathcona</th>\n",
       "      <td>128.0</td>\n",
       "      <td>22</td>\n",
       "      <td>8</td>\n",
       "      <td>20</td>\n",
       "      <td>32</td>\n",
       "      <td>129</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>West Point Grey</th>\n",
       "      <td>216.0</td>\n",
       "      <td>88</td>\n",
       "      <td>13</td>\n",
       "      <td>27</td>\n",
       "      <td>40</td>\n",
       "      <td>133</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Oakridge</th>\n",
       "      <td>141.0</td>\n",
       "      <td>88</td>\n",
       "      <td>15</td>\n",
       "      <td>34</td>\n",
       "      <td>52</td>\n",
       "      <td>136</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Victoria-Fraserview</th>\n",
       "      <td>90.0</td>\n",
       "      <td>101</td>\n",
       "      <td>19</td>\n",
       "      <td>41</td>\n",
       "      <td>65</td>\n",
       "      <td>228</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                          Avg_price  Listing_count  Avg_30_availability  \\\n",
       "Downtown                      196.0           1064                   11   \n",
       "Riley Park                    145.0            275                   10   \n",
       "Downtown Eastside             176.0            271                   12   \n",
       "West End                      168.0            362                   10   \n",
       "Kensington-Cedar Cottage      134.0            319                   10   \n",
       "Hastings-Sunrise              120.0            247                   12   \n",
       "Renfrew-Collingwood           102.0            200                   12   \n",
       "Mount Pleasant                150.0            294                   10   \n",
       "Grandview-Woodland            135.0            247                    8   \n",
       "Fairview                      144.0            151                    7   \n",
       "Kitsilano                     190.0            424                   10   \n",
       "Kerrisdale                    173.0             69                   15   \n",
       "Arbutus Ridge                 198.0             84                   13   \n",
       "Sunset                        172.0            106                   13   \n",
       "Dunbar Southlands             190.0            138                   13   \n",
       "Marpole                       129.0            166                   17   \n",
       "Killarney                      91.0             67                   15   \n",
       "Shaughnessy                   216.0             87                   12   \n",
       "South Cambie                  185.0             59                   10   \n",
       "Strathcona                    128.0             22                    8   \n",
       "West Point Grey               216.0             88                   13   \n",
       "Oakridge                      141.0             88                   15   \n",
       "Victoria-Fraserview            90.0            101                   19   \n",
       "\n",
       "                          Avg_60_availability  Avg_90_availability  \\\n",
       "Downtown                                   25                   41   \n",
       "Riley Park                                 21                   32   \n",
       "Downtown Eastside                          25                   38   \n",
       "West End                                   22                   36   \n",
       "Kensington-Cedar Cottage                   22                   35   \n",
       "Hastings-Sunrise                           26                   42   \n",
       "Renfrew-Collingwood                        27                   43   \n",
       "Mount Pleasant                             22                   35   \n",
       "Grandview-Woodland                         17                   28   \n",
       "Fairview                                   18                   31   \n",
       "Kitsilano                                  22                   34   \n",
       "Kerrisdale                                 31                   47   \n",
       "Arbutus Ridge                              29                   47   \n",
       "Sunset                                     30                   48   \n",
       "Dunbar Southlands                          30                   48   \n",
       "Marpole                                    38                   60   \n",
       "Killarney                                  34                   53   \n",
       "Shaughnessy                                26                   39   \n",
       "South Cambie                               22                   35   \n",
       "Strathcona                                 20                   32   \n",
       "West Point Grey                            27                   40   \n",
       "Oakridge                                   34                   52   \n",
       "Victoria-Fraserview                        41                   65   \n",
       "\n",
       "                          Avg_365_availability  \n",
       "Downtown                                   147  \n",
       "Riley Park                                 117  \n",
       "Downtown Eastside                          129  \n",
       "West End                                   117  \n",
       "Kensington-Cedar Cottage                   136  \n",
       "Hastings-Sunrise                           138  \n",
       "Renfrew-Collingwood                        153  \n",
       "Mount Pleasant                             126  \n",
       "Grandview-Woodland                         116  \n",
       "Fairview                                   132  \n",
       "Kitsilano                                  133  \n",
       "Kerrisdale                                 145  \n",
       "Arbutus Ridge                              156  \n",
       "Sunset                                     142  \n",
       "Dunbar Southlands                          170  \n",
       "Marpole                                    167  \n",
       "Killarney                                  140  \n",
       "Shaughnessy                                107  \n",
       "South Cambie                               125  \n",
       "Strathcona                                 129  \n",
       "West Point Grey                            133  \n",
       "Oakridge                                   136  \n",
       "Victoria-Fraserview                        228  "
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_neighbourhood"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Finding the most common words in Reviews and listing Summaries:\n",
    "\n",
    "With the aim of making word clouds of most common words in property summary (description) and reviews we go through all reviews and listing summaries in each neghbourhood and will feed Tableau with tables that list the top 50 words used in reviews and listing summaries, in each neighbourhood:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Funtion to return the top 50 words:\n",
    "\n",
    "from operator import itemgetter\n",
    "\n",
    "def top_word_counter(df, filter_col, filter_match, lookup_col):\n",
    "\n",
    "    words = list(entry.lower().strip() for entry in df[df[filter_col] == filter_match][lookup_col].astype(str))\n",
    "    wordcount = {}\n",
    "    \n",
    "    # Words that will not be counted:\n",
    "    stopwords = [\"i\", \"me\", \"my\", \"myself\", \"we\", \"our\", \"ours\", \"ourselves\", \"you\", \"your\", \"yours\", \n",
    "                 \"yourself\", \"yourselves\", \"he\", \"him\", \"his\", \"himself\", \"she\", \"her\", \"hers\", \"herself\", \n",
    "                 \"it\", \"its\", \"itself\", \"they\", \"them\", \"their\", \"theirs\", \"themselves\", \"what\", \"which\", \n",
    "                 \"who\", \"whom\", \"this\", \"that\", \"these\", \"those\", \"am\", \"is\", \"are\", \"was\", \"were\", \"be\", \n",
    "                 \"been\", \"being\", \"have\", \"has\", \"had\", \"having\", \"do\", \"does\", \"did\", \"doing\", \"a\", \"an\", \n",
    "                 \"the\", \"and\", \"but\", \"if\", \"or\", \"because\", \"as\", \"until\", \"while\", \"of\", \"at\", \"by\", \"for\", \n",
    "                 \"with\", \"about\", \"against\", \"between\", \"into\", \"through\", \"during\", \"before\", \"after\", \"above\", \n",
    "                 \"below\", \"to\", \"from\", \"up\", \"down\", \"in\", \"out\", \"on\", \"off\", \"over\", \"under\", \"again\", \"further\", \n",
    "                 \"then\", \"once\", \"here\", \"there\", \"when\", \"where\", \"why\", \"how\", \"all\", \"any\", \"both\", \"each\", \n",
    "                 \"few\", \"more\", \"most\", \"other\", \"some\", \"such\", \"no\", \"nor\", \"not\", \"only\", \"own\", \"same\", \"so\", \n",
    "                 \"than\", \"too\", \"very\", \"s\", \"t\", \"can\", \"will\", \"just\", \"don\", \"should\", \"now\", \"\"]\n",
    "\n",
    "    symbols_to_replace = [\".\", \",\", \":\", \";\", \"\\\"\", \"/\", \"!\", \"?\", \"â€œ\", \"â€˜\", \"*\", \")\", \"(\", \"|\", \n",
    "                          \">\", \"<\", \"{\", \"}\", \"&\", \"^\", \"~\", \"+\"]\n",
    "    \n",
    "    seperator = ' '\n",
    "    text = seperator.join(words)\n",
    "\n",
    "    for word in text.split():\n",
    "        for symbol in symbols_to_replace:\n",
    "            word = word.replace(symbol,\"\")\n",
    "\n",
    "        if word not in stopwords:\n",
    "            if word not in wordcount:\n",
    "                wordcount[word] = 1\n",
    "            else:\n",
    "                wordcount[word] += 1\n",
    "\n",
    "    top_words = sorted(wordcount.items(), key=itemgetter(1), reverse = True)[0:50]\n",
    "    \n",
    "    return [(filter_match, i[0], i[1]) for i in top_words]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Finding the top 50 words in summaries for each neighbouthood:\n",
    "summary_words = []\n",
    "\n",
    "for neighb in df_listings_clean['neighbourhood_cleansed'].unique():\n",
    "    summary_words += top_word_counter(df_listings_clean, 'neighbourhood_cleansed', neighb, 'summary')\n",
    "\n",
    "summary_words = pd.DataFrame(summary_words)\n",
    "summary_words.columns = ['neighbourhood_cleansed', 'Word', 'Word_Count']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Cleaning the reviews dataframe before checking for most used words:\n",
    "reviews_clean = df_reviews_more.drop(['id', 'date', 'reviewer_id', 'reviewer_name'] , axis = 1)\\\n",
    "    .rename(columns = {'listing_id': 'id', 'comments': 'review'})\\\n",
    "        .merge(df_listings_more.loc[:, ['id', 'neighbourhood_cleansed']], on = 'id')\n",
    "\n",
    "# Finding the top 50 words in reviews for each neighbouthood:\n",
    "review_words = []\n",
    "\n",
    "for neighb in reviews_clean['neighbourhood_cleansed'].unique():\n",
    "    review_words += top_word_counter(reviews_clean, 'neighbourhood_cleansed', neighb, 'review')\n",
    "\n",
    "review_words = pd.DataFrame(review_words)\n",
    "review_words.columns = ['neighbourhood_cleansed', 'Word', 'Word_Count']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Exporting new, clean dataframes as csv files that will feed into Tableau:\n",
    "\n",
    "df_listings_clean.to_csv('Data/clean_data.csv')\n",
    "review_words.to_csv('Data/review_words.csv')\n",
    "summary_words.to_csv('Data/summary_words.csv')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Data visualization to continue on Tableau."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}