{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "QkFgjTNf3auo"
},
"source": [
"# TripAdvisor Recommendation Challenge\n",
"In this project, we will build a recommendation system based on *TripAdvisor* reviews. Our goal is to implement a BM25 baseline and a custom recommendation model that can outperform BM25 using user reviews only, without access to explicit ratings during the recommendation phase."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"executionInfo": {
"elapsed": 12626,
"status": "ok",
"timestamp": 1730740246182,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "Rbrrcses54qP",
"outputId": "1594b83f-c6dc-4438-a3c0-1e50d2371fcb"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Requirement already satisfied: rank_bm25 in c:\\users\\joyce\\anaconda3\\envs\\ml-nlp\\lib\\site-packages (0.2.2)\n",
"Requirement already satisfied: numpy in c:\\users\\joyce\\anaconda3\\envs\\ml-nlp\\lib\\site-packages (from rank_bm25) (1.26.4)\n"
]
}
],
"source": [
"!pip install rank_bm25"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"executionInfo": {
"elapsed": 74043,
"status": "ok",
"timestamp": 1730740320222,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "CSHZJjvc3EJG",
"outputId": "9e617a95-b25e-43eb-8582-c22157b682d3"
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"c:\\Users\\Joyce\\anaconda3\\envs\\ml-nlp\\lib\\site-packages\\sentence_transformers\\cross_encoder\\CrossEncoder.py:13: TqdmExperimentalWarning: Using `tqdm.autonotebook.tqdm` in notebook mode. Use `tqdm.tqdm` instead to force console mode (e.g. in jupyter console)\n",
" from tqdm.autonotebook import tqdm, trange\n",
"[nltk_data] Downloading package punkt to\n",
"[nltk_data] C:\\Users\\Joyce\\AppData\\Roaming\\nltk_data...\n",
"[nltk_data] Package punkt is already up-to-date!\n",
"[nltk_data] Downloading package stopwords to\n",
"[nltk_data] C:\\Users\\Joyce\\AppData\\Roaming\\nltk_data...\n",
"[nltk_data] Package stopwords is already up-to-date!\n",
"[nltk_data] Downloading package wordnet to\n",
"[nltk_data] C:\\Users\\Joyce\\AppData\\Roaming\\nltk_data...\n",
"[nltk_data] Package wordnet is already up-to-date!\n"
]
}
],
"source": [
"import pandas as pd # For data management\n",
"import json # For manipulating JSON-like formatted strings/documents\n",
"import numpy as np # To use \"argsort\" function\n",
"from rank_bm25 import BM25Okapi # For BM25 implementation\n",
"from collections import Counter # For counting occurrences of elements, used here for word frequency analysis\n",
"import matplotlib.pyplot as plt\n",
"from sentence_transformers import SentenceTransformer\n",
"\n",
"# Display progress bar\n",
"from tqdm import tqdm\n",
"\n",
"# ==== scikit-learn ====\n",
"from sklearn.metrics import root_mean_squared_error, ndcg_score\n",
"from sklearn.feature_extraction.text import TfidfVectorizer\n",
"from sklearn.metrics.pairwise import cosine_similarity\n",
"\n",
"# ==== NLTK for NLP ====\n",
"import nltk\n",
"\n",
"# For tokenizing text based on a regular expression pattern\n",
"nltk.download('punkt')\n",
"from nltk.tokenize import regexp_tokenize\n",
"\n",
"# For accessing stopwords, which are commonly removed from text data\n",
"nltk.download('stopwords')\n",
"from nltk.corpus import stopwords\n",
"stop_words = set(stopwords.words('english'))\n",
"\n",
"# For converting words to their base form (lemmatization)\n",
"nltk.download(\"wordnet\") # lemmatizer\n",
"from nltk.stem import WordNetLemmatizer"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"executionInfo": {
"elapsed": 2,
"status": "ok",
"timestamp": 1730740349661,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "Bm-9_Odu3glT"
},
"outputs": [],
"source": [
"# PROJECT_PATH = \"/content/drive/MyDrive/Ecole/ESILV/A5/Machine Learning for NLP/Project/\"\n",
"PROJECT_PATH = \"\""
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "EcN1_Y6t3cqd"
},
"source": [
"## Data Loading and Preprocessing\n",
"We will load the TripAdvisor dataset (downloadable from Kaggle), filter it based on specified aspects, and preprocess by concatenating reviews by place (`offering_id`)."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "sE7wBkkD30lc"
},
"source": [
"### Loading the dataset"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 293
},
"executionInfo": {
"elapsed": 42118,
"status": "ok",
"timestamp": 1730740391777,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "8CXyTmmH3x4R",
"outputId": "cb5a8728-09ab-4cb1-e200-c623a7eb8892"
},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
" \n",
" \n",
" ratings \n",
" title \n",
" text \n",
" author \n",
" date_stayed \n",
" offering_id \n",
" num_helpful_votes \n",
" date \n",
" id \n",
" via_mobile \n",
" \n",
" \n",
" \n",
" \n",
" 0 \n",
" {'service': 5.0, 'cleanliness': 5.0, 'overall'... \n",
" “Truly is \"Jewel of the Upper Wets Side\"” \n",
" Stayed in a king suite for 11 nights and yes i... \n",
" {'username': 'Papa_Panda', 'num_cities': 22, '... \n",
" December 2012 \n",
" 93338 \n",
" 0 \n",
" 2012-12-17 \n",
" 147643103 \n",
" False \n",
" \n",
" \n",
" 1 \n",
" {'service': 5.0, 'cleanliness': 5.0, 'overall'... \n",
" “My home away from home!” \n",
" On every visit to NYC, the Hotel Beacon is the... \n",
" {'username': 'Maureen V', 'num_reviews': 2, 'n... \n",
" December 2012 \n",
" 93338 \n",
" 0 \n",
" 2012-12-17 \n",
" 147639004 \n",
" False \n",
" \n",
" \n",
" 2 \n",
" {'service': 4.0, 'cleanliness': 5.0, 'overall'... \n",
" “Great Stay” \n",
" This is a great property in Midtown. We two di... \n",
" {'username': 'vuguru', 'num_cities': 12, 'num_... \n",
" December 2012 \n",
" 1762573 \n",
" 0 \n",
" 2012-12-18 \n",
" 147697954 \n",
" False \n",
" \n",
" \n",
" 3 \n",
" {'service': 5.0, 'cleanliness': 5.0, 'overall'... \n",
" “Modern Convenience” \n",
" The Andaz is a nice hotel in a central locatio... \n",
" {'username': 'Hotel-Designer', 'num_cities': 5... \n",
" August 2012 \n",
" 1762573 \n",
" 0 \n",
" 2012-12-17 \n",
" 147625723 \n",
" False \n",
" \n",
" \n",
" 4 \n",
" {'service': 4.0, 'cleanliness': 5.0, 'overall'... \n",
" “Its the best of the Andaz Brand in the US....” \n",
" I have stayed at each of the US Andaz properti... \n",
" {'username': 'JamesE339', 'num_cities': 34, 'n... \n",
" December 2012 \n",
" 1762573 \n",
" 0 \n",
" 2012-12-17 \n",
" 147612823 \n",
" False \n",
" \n",
" \n",
"
\n",
"
"
],
"text/plain": [
" ratings \\\n",
"0 {'service': 5.0, 'cleanliness': 5.0, 'overall'... \n",
"1 {'service': 5.0, 'cleanliness': 5.0, 'overall'... \n",
"2 {'service': 4.0, 'cleanliness': 5.0, 'overall'... \n",
"3 {'service': 5.0, 'cleanliness': 5.0, 'overall'... \n",
"4 {'service': 4.0, 'cleanliness': 5.0, 'overall'... \n",
"\n",
" title \\\n",
"0 “Truly is \"Jewel of the Upper Wets Side\"” \n",
"1 “My home away from home!” \n",
"2 “Great Stay” \n",
"3 “Modern Convenience” \n",
"4 “Its the best of the Andaz Brand in the US....” \n",
"\n",
" text \\\n",
"0 Stayed in a king suite for 11 nights and yes i... \n",
"1 On every visit to NYC, the Hotel Beacon is the... \n",
"2 This is a great property in Midtown. We two di... \n",
"3 The Andaz is a nice hotel in a central locatio... \n",
"4 I have stayed at each of the US Andaz properti... \n",
"\n",
" author date_stayed \\\n",
"0 {'username': 'Papa_Panda', 'num_cities': 22, '... December 2012 \n",
"1 {'username': 'Maureen V', 'num_reviews': 2, 'n... December 2012 \n",
"2 {'username': 'vuguru', 'num_cities': 12, 'num_... December 2012 \n",
"3 {'username': 'Hotel-Designer', 'num_cities': 5... August 2012 \n",
"4 {'username': 'JamesE339', 'num_cities': 34, 'n... December 2012 \n",
"\n",
" offering_id num_helpful_votes date id via_mobile \n",
"0 93338 0 2012-12-17 147643103 False \n",
"1 93338 0 2012-12-17 147639004 False \n",
"2 1762573 0 2012-12-18 147697954 False \n",
"3 1762573 0 2012-12-17 147625723 False \n",
"4 1762573 0 2012-12-17 147612823 False "
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Load dataset\n",
"df = pd.read_csv(PROJECT_PATH + 'data/reviews.csv')\n",
"df.head()"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"executionInfo": {
"elapsed": 5,
"status": "ok",
"timestamp": 1730740391777,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "VnaDr8DY35M4",
"outputId": "1721ce9f-e802-4a9c-b5a4-c240f4932fbe"
},
"outputs": [
{
"data": {
"text/plain": [
"(878561, 10)"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.shape"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 398
},
"executionInfo": {
"elapsed": 3,
"status": "ok",
"timestamp": 1730740391777,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "oBi2bQCW35mv",
"outputId": "58f440b6-b66e-477c-e627-44382e10e266"
},
"outputs": [
{
"data": {
"text/plain": [
"ratings object\n",
"title object\n",
"text object\n",
"author object\n",
"date_stayed object\n",
"offering_id int64\n",
"num_helpful_votes int64\n",
"date object\n",
"id int64\n",
"via_mobile bool\n",
"dtype: object"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.dtypes"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 398
},
"executionInfo": {
"elapsed": 329,
"status": "ok",
"timestamp": 1730740392103,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "OAOW8_A23710",
"outputId": "8718b712-e5d2-428d-da49-6cfadde87b00"
},
"outputs": [
{
"data": {
"text/plain": [
"ratings 0\n",
"title 0\n",
"text 0\n",
"author 0\n",
"date_stayed 67594\n",
"offering_id 0\n",
"num_helpful_votes 0\n",
"date 0\n",
"id 0\n",
"via_mobile 0\n",
"dtype: int64"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.isnull().sum()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "BrfJM0ps39cB"
},
"source": [
"### Filtering reviews with specific aspects"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 310
},
"executionInfo": {
"elapsed": 1354,
"status": "ok",
"timestamp": 1730740393454,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "ZZE-h2qR3-K5",
"outputId": "1e7bcb2d-ff18-4d97-941d-37463484972f"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"New DataFrame's shape: (436391, 10)\n"
]
},
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" \n",
" ratings \n",
" title \n",
" text \n",
" author \n",
" date_stayed \n",
" offering_id \n",
" num_helpful_votes \n",
" date \n",
" id \n",
" via_mobile \n",
" \n",
" \n",
" \n",
" \n",
" 0 \n",
" {'service': 5.0, 'cleanliness': 5.0, 'overall'... \n",
" “Truly is \"Jewel of the Upper Wets Side\"” \n",
" Stayed in a king suite for 11 nights and yes i... \n",
" {'username': 'Papa_Panda', 'num_cities': 22, '... \n",
" December 2012 \n",
" 93338 \n",
" 0 \n",
" 2012-12-17 \n",
" 147643103 \n",
" False \n",
" \n",
" \n",
" 1 \n",
" {'service': 5.0, 'cleanliness': 5.0, 'overall'... \n",
" “My home away from home!” \n",
" On every visit to NYC, the Hotel Beacon is the... \n",
" {'username': 'Maureen V', 'num_reviews': 2, 'n... \n",
" December 2012 \n",
" 93338 \n",
" 0 \n",
" 2012-12-17 \n",
" 147639004 \n",
" False \n",
" \n",
" \n",
" 2 \n",
" {'service': 4.0, 'cleanliness': 5.0, 'overall'... \n",
" “Great Stay” \n",
" This is a great property in Midtown. We two di... \n",
" {'username': 'vuguru', 'num_cities': 12, 'num_... \n",
" December 2012 \n",
" 1762573 \n",
" 0 \n",
" 2012-12-18 \n",
" 147697954 \n",
" False \n",
" \n",
" \n",
" 3 \n",
" {'service': 5.0, 'cleanliness': 5.0, 'overall'... \n",
" “Modern Convenience” \n",
" The Andaz is a nice hotel in a central locatio... \n",
" {'username': 'Hotel-Designer', 'num_cities': 5... \n",
" August 2012 \n",
" 1762573 \n",
" 0 \n",
" 2012-12-17 \n",
" 147625723 \n",
" False \n",
" \n",
" \n",
" 4 \n",
" {'service': 4.0, 'cleanliness': 5.0, 'overall'... \n",
" “Its the best of the Andaz Brand in the US....” \n",
" I have stayed at each of the US Andaz properti... \n",
" {'username': 'JamesE339', 'num_cities': 34, 'n... \n",
" December 2012 \n",
" 1762573 \n",
" 0 \n",
" 2012-12-17 \n",
" 147612823 \n",
" False \n",
" \n",
" \n",
"
\n",
"
"
],
"text/plain": [
" ratings \\\n",
"0 {'service': 5.0, 'cleanliness': 5.0, 'overall'... \n",
"1 {'service': 5.0, 'cleanliness': 5.0, 'overall'... \n",
"2 {'service': 4.0, 'cleanliness': 5.0, 'overall'... \n",
"3 {'service': 5.0, 'cleanliness': 5.0, 'overall'... \n",
"4 {'service': 4.0, 'cleanliness': 5.0, 'overall'... \n",
"\n",
" title \\\n",
"0 “Truly is \"Jewel of the Upper Wets Side\"” \n",
"1 “My home away from home!” \n",
"2 “Great Stay” \n",
"3 “Modern Convenience” \n",
"4 “Its the best of the Andaz Brand in the US....” \n",
"\n",
" text \\\n",
"0 Stayed in a king suite for 11 nights and yes i... \n",
"1 On every visit to NYC, the Hotel Beacon is the... \n",
"2 This is a great property in Midtown. We two di... \n",
"3 The Andaz is a nice hotel in a central locatio... \n",
"4 I have stayed at each of the US Andaz properti... \n",
"\n",
" author date_stayed \\\n",
"0 {'username': 'Papa_Panda', 'num_cities': 22, '... December 2012 \n",
"1 {'username': 'Maureen V', 'num_reviews': 2, 'n... December 2012 \n",
"2 {'username': 'vuguru', 'num_cities': 12, 'num_... December 2012 \n",
"3 {'username': 'Hotel-Designer', 'num_cities': 5... August 2012 \n",
"4 {'username': 'JamesE339', 'num_cities': 34, 'n... December 2012 \n",
"\n",
" offering_id num_helpful_votes date id via_mobile \n",
"0 93338 0 2012-12-17 147643103 False \n",
"1 93338 0 2012-12-17 147639004 False \n",
"2 1762573 0 2012-12-18 147697954 False \n",
"3 1762573 0 2012-12-17 147625723 False \n",
"4 1762573 0 2012-12-17 147612823 False "
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Filter reviews with specific aspects\n",
"required_aspects = ['service', 'cleanliness', 'overall', 'value', 'location', 'sleep_quality', 'rooms']\n",
"\n",
"# Filter rows where all required aspects are in each 'ratings' entry\n",
"df = df[df[\"ratings\"].apply(lambda x: all(aspect in x for aspect in required_aspects))]\n",
"\n",
"print(f\"New DataFrame's shape: {df.shape}\")\n",
"df.head()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "R382jTpz4Cyj"
},
"source": [
"We went from **878,561** rows to **436,391 rows**; so about 450k rows didn't contain all the aspects we need to compare all places accurately."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "yZT6e3gm4EsC"
},
"source": [
"### Concatenating reviews from the same place"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"executionInfo": {
"elapsed": 3,
"status": "ok",
"timestamp": 1730740393454,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "gG_CzosV4CVg"
},
"outputs": [],
"source": [
"# Function to calculate average rating for each aspect\n",
"def get_avg_rating_aspects(ratings_str):\n",
" # Convert each string in the list to a dictionary\n",
" ratings_dicts = [json.loads(rating.replace(\"'\", \"\\\"\")) for rating in ratings_str]\n",
"\n",
" # Get the number of ratings to calculate the average\n",
" nb_ratings = len(ratings_dicts)\n",
"\n",
" # Initialize a dictionary to store average ratings per aspect\n",
" avg_ratings = {}\n",
"\n",
" # Iterate over each required aspect\n",
" for aspect in required_aspects:\n",
" # Initialize the sum for the current aspect\n",
" aspect_rating_sum = 0\n",
"\n",
" # Sum up the ratings for the current aspect from all dictionaries\n",
" for ratings in ratings_dicts:\n",
" aspect_rating_sum += ratings[aspect]\n",
"\n",
" # Calculate the average rating for the aspect, rounded to 1 decimal place\n",
" avg_ratings[aspect] = round(aspect_rating_sum / nb_ratings, 1)\n",
"\n",
" # Return the dictionary with average ratings for each aspect\n",
" return avg_ratings"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 224
},
"executionInfo": {
"elapsed": 6852,
"status": "ok",
"timestamp": 1730740400303,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "46TWi9GX3_10",
"outputId": "3cfa3af9-97f4-44e5-b299-1ed3a26c6941"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Grouped DataFrame's shape: (3754, 3)\n"
]
},
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" \n",
" offering_id \n",
" reviews \n",
" ratings \n",
" \n",
" \n",
" \n",
" \n",
" 0 \n",
" 72572 \n",
" I had to make fast visit to seattle and I foun... \n",
" {'service': 4.6, 'cleanliness': 4.6, 'overall'... \n",
" \n",
" \n",
" 1 \n",
" 72579 \n",
" Great service, rooms were clean, could use som... \n",
" {'service': 4.2, 'cleanliness': 4.2, 'overall'... \n",
" \n",
" \n",
" 2 \n",
" 72586 \n",
" Beautiful views of the space needle - especial... \n",
" {'service': 4.2, 'cleanliness': 4.3, 'overall'... \n",
" \n",
" \n",
" 3 \n",
" 72598 \n",
" This hotel is in need of some serious updates.... \n",
" {'service': 3.2, 'cleanliness': 3.2, 'overall'... \n",
" \n",
" \n",
" 4 \n",
" 73236 \n",
" My experience at this days inn was perfect. th... \n",
" {'service': 4.3, 'cleanliness': 3.1, 'overall'... \n",
" \n",
" \n",
"
\n",
"
"
],
"text/plain": [
" offering_id reviews \\\n",
"0 72572 I had to make fast visit to seattle and I foun... \n",
"1 72579 Great service, rooms were clean, could use som... \n",
"2 72586 Beautiful views of the space needle - especial... \n",
"3 72598 This hotel is in need of some serious updates.... \n",
"4 73236 My experience at this days inn was perfect. th... \n",
"\n",
" ratings \n",
"0 {'service': 4.6, 'cleanliness': 4.6, 'overall'... \n",
"1 {'service': 4.2, 'cleanliness': 4.2, 'overall'... \n",
"2 {'service': 4.2, 'cleanliness': 4.3, 'overall'... \n",
"3 {'service': 3.2, 'cleanliness': 3.2, 'overall'... \n",
"4 {'service': 4.3, 'cleanliness': 3.1, 'overall'... "
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Call custom aggregate function on \"ratings\" column\n",
"df_grouped = df.groupby('offering_id').agg({'text': '\\n'.join, 'ratings': get_avg_rating_aspects}).reset_index()\n",
"\n",
"# Rename the 'text' column to be more explicit\n",
"df_grouped.rename(columns={\"text\": \"reviews\"}, inplace=True)\n",
"\n",
"print(f\"Grouped DataFrame's shape: {df_grouped.shape}\")\n",
"df_grouped.head()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "T_0Fabrt4JZS"
},
"source": [
"### Adding hotel info to `df`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"*This step is merely to make the results more readable for us humans, so that instead of an ID we get an actual place name with some info (e.g., stars).*"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 206
},
"executionInfo": {
"elapsed": 520,
"status": "ok",
"timestamp": 1730740400819,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "McV2Mhx64J_N",
"outputId": "e5113272-1a77-4a45-df20-3586188967ec"
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" \n",
" hotel_class \n",
" region_id \n",
" url \n",
" phone \n",
" details \n",
" address \n",
" type \n",
" id \n",
" name \n",
" \n",
" \n",
" \n",
" \n",
" 0 \n",
" 4.0 \n",
" 60763 \n",
" http://www.tripadvisor.com/Hotel_Review-g60763... \n",
" NaN \n",
" NaN \n",
" {'region': 'NY', 'street-address': '147 West 4... \n",
" hotel \n",
" 113317 \n",
" Casablanca Hotel Times Square \n",
" \n",
" \n",
" 1 \n",
" 5.0 \n",
" 32655 \n",
" http://www.tripadvisor.com/Hotel_Review-g32655... \n",
" NaN \n",
" NaN \n",
" {'region': 'CA', 'street-address': '300 S Dohe... \n",
" hotel \n",
" 76049 \n",
" Four Seasons Hotel Los Angeles at Beverly Hills \n",
" \n",
" \n",
" 2 \n",
" 3.5 \n",
" 60763 \n",
" http://www.tripadvisor.com/Hotel_Review-g60763... \n",
" NaN \n",
" NaN \n",
" {'region': 'NY', 'street-address': '790 Eighth... \n",
" hotel \n",
" 99352 \n",
" Hilton Garden Inn Times Square \n",
" \n",
" \n",
" 3 \n",
" 4.0 \n",
" 60763 \n",
" http://www.tripadvisor.com/Hotel_Review-g60763... \n",
" NaN \n",
" NaN \n",
" {'region': 'NY', 'street-address': '152 West 5... \n",
" hotel \n",
" 93589 \n",
" The Michelangelo Hotel \n",
" \n",
" \n",
" 4 \n",
" 4.0 \n",
" 60763 \n",
" http://www.tripadvisor.com/Hotel_Review-g60763... \n",
" NaN \n",
" NaN \n",
" {'region': 'NY', 'street-address': '130 West 4... \n",
" hotel \n",
" 217616 \n",
" The Muse Hotel New York \n",
" \n",
" \n",
"
\n",
"
"
],
"text/plain": [
" hotel_class region_id url \\\n",
"0 4.0 60763 http://www.tripadvisor.com/Hotel_Review-g60763... \n",
"1 5.0 32655 http://www.tripadvisor.com/Hotel_Review-g32655... \n",
"2 3.5 60763 http://www.tripadvisor.com/Hotel_Review-g60763... \n",
"3 4.0 60763 http://www.tripadvisor.com/Hotel_Review-g60763... \n",
"4 4.0 60763 http://www.tripadvisor.com/Hotel_Review-g60763... \n",
"\n",
" phone details address type \\\n",
"0 NaN NaN {'region': 'NY', 'street-address': '147 West 4... hotel \n",
"1 NaN NaN {'region': 'CA', 'street-address': '300 S Dohe... hotel \n",
"2 NaN NaN {'region': 'NY', 'street-address': '790 Eighth... hotel \n",
"3 NaN NaN {'region': 'NY', 'street-address': '152 West 5... hotel \n",
"4 NaN NaN {'region': 'NY', 'street-address': '130 West 4... hotel \n",
"\n",
" id name \n",
"0 113317 Casablanca Hotel Times Square \n",
"1 76049 Four Seasons Hotel Los Angeles at Beverly Hills \n",
"2 99352 Hilton Garden Inn Times Square \n",
"3 93589 The Michelangelo Hotel \n",
"4 217616 The Muse Hotel New York "
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Load hotels' information\n",
"offerings = pd.read_csv(PROJECT_PATH + \"data/offerings.csv\")\n",
"offerings.head()"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 81
},
"executionInfo": {
"elapsed": 8,
"status": "ok",
"timestamp": 1730740400819,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "qC2LilsW4Osf",
"outputId": "c73bbe15-f4b6-4bd7-b07a-7ff6c47185e1"
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" \n",
" hotel_class \n",
" id \n",
" name \n",
" \n",
" \n",
" \n",
" \n",
" 0 \n",
" 4.0 \n",
" 113317 \n",
" Casablanca Hotel Times Square \n",
" \n",
" \n",
"
\n",
"
"
],
"text/plain": [
" hotel_class id name\n",
"0 4.0 113317 Casablanca Hotel Times Square"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Drop useless columns\n",
"offerings.drop(columns=[\"region_id\", \"url\", \"phone\", \"details\", \"address\", \"type\"], inplace=True)\n",
"offerings.head(1)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "l9BC8BSW4Qc_"
},
"source": [
"*Might remove `hotel_class` later if it turns out to be useless...*"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 178
},
"executionInfo": {
"elapsed": 7,
"status": "ok",
"timestamp": 1730740400819,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "L3LlwSWP4Ruy",
"outputId": "75a92ca7-05b4-4951-a429-6af6131e6667"
},
"outputs": [
{
"data": {
"text/plain": [
"hotel_class 1192\n",
"id 0\n",
"name 0\n",
"dtype: int64"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"offerings.isnull().sum()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "QvB_DG0H4TKN"
},
"source": [
"Let's store these null rows to check if the replacement has been done correctly when the time comes."
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 424
},
"executionInfo": {
"elapsed": 6,
"status": "ok",
"timestamp": 1730740400819,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "rRQYVUGo4Tqe",
"outputId": "9299a0f9-134f-4fb6-ef0b-828638f804e2"
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" \n",
" hotel_class \n",
" id \n",
" name \n",
" \n",
" \n",
" \n",
" \n",
" 6 \n",
" NaN \n",
" 2643161 \n",
" The NoMad Hotel \n",
" \n",
" \n",
" 44 \n",
" NaN \n",
" 1630591 \n",
" Crowne Plaza \n",
" \n",
" \n",
" 49 \n",
" NaN \n",
" 585164 \n",
" Residence Inn Houston West/Energy Corridor \n",
" \n",
" \n",
" 52 \n",
" NaN \n",
" 258634 \n",
" Scottish Inn & Suites Reliant Park/Six Flags \n",
" \n",
" \n",
" 70 \n",
" NaN \n",
" 815515 \n",
" Scottish Inns & Suites - Willowbrook \n",
" \n",
" \n",
" ... \n",
" ... \n",
" ... \n",
" ... \n",
" \n",
" \n",
" 4253 \n",
" NaN \n",
" 1204691 \n",
" Quality Inn & Suites Laurel \n",
" \n",
" \n",
" 4259 \n",
" NaN \n",
" 1218625 \n",
" Wyndham \n",
" \n",
" \n",
" 4261 \n",
" NaN \n",
" 1515599 \n",
" Executive Apartments \n",
" \n",
" \n",
" 4262 \n",
" NaN \n",
" 84068 \n",
" Harrington Hotel \n",
" \n",
" \n",
" 4282 \n",
" NaN \n",
" 120566 \n",
" The Mansion on O Street \n",
" \n",
" \n",
"
\n",
"
1192 rows × 3 columns
\n",
"
"
],
"text/plain": [
" hotel_class id name\n",
"6 NaN 2643161 The NoMad Hotel\n",
"44 NaN 1630591 Crowne Plaza\n",
"49 NaN 585164 Residence Inn Houston West/Energy Corridor\n",
"52 NaN 258634 Scottish Inn & Suites Reliant Park/Six Flags\n",
"70 NaN 815515 Scottish Inns & Suites - Willowbrook\n",
"... ... ... ...\n",
"4253 NaN 1204691 Quality Inn & Suites Laurel\n",
"4259 NaN 1218625 Wyndham\n",
"4261 NaN 1515599 Executive Apartments\n",
"4262 NaN 84068 Harrington Hotel\n",
"4282 NaN 120566 The Mansion on O Street\n",
"\n",
"[1192 rows x 3 columns]"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"idx_null_class = offerings[offerings[\"hotel_class\"].isnull()].index\n",
"offerings.iloc[idx_null_class]"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "VTGiVqVo4YRy"
},
"source": [
"A hotel with a missing value in regards to its class basically means that the hotel has **0 stars**, so we can replace these `NaN` values with `0`."
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 424
},
"executionInfo": {
"elapsed": 7,
"status": "ok",
"timestamp": 1730740400820,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "KCw5XjVh4YrX",
"outputId": "3c8d1707-1525-437b-cb4f-14015b5c297e"
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" \n",
" hotel_class \n",
" id \n",
" name \n",
" \n",
" \n",
" \n",
" \n",
" 6 \n",
" 0.0 \n",
" 2643161 \n",
" The NoMad Hotel \n",
" \n",
" \n",
" 44 \n",
" 0.0 \n",
" 1630591 \n",
" Crowne Plaza \n",
" \n",
" \n",
" 49 \n",
" 0.0 \n",
" 585164 \n",
" Residence Inn Houston West/Energy Corridor \n",
" \n",
" \n",
" 52 \n",
" 0.0 \n",
" 258634 \n",
" Scottish Inn & Suites Reliant Park/Six Flags \n",
" \n",
" \n",
" 70 \n",
" 0.0 \n",
" 815515 \n",
" Scottish Inns & Suites - Willowbrook \n",
" \n",
" \n",
" ... \n",
" ... \n",
" ... \n",
" ... \n",
" \n",
" \n",
" 4253 \n",
" 0.0 \n",
" 1204691 \n",
" Quality Inn & Suites Laurel \n",
" \n",
" \n",
" 4259 \n",
" 0.0 \n",
" 1218625 \n",
" Wyndham \n",
" \n",
" \n",
" 4261 \n",
" 0.0 \n",
" 1515599 \n",
" Executive Apartments \n",
" \n",
" \n",
" 4262 \n",
" 0.0 \n",
" 84068 \n",
" Harrington Hotel \n",
" \n",
" \n",
" 4282 \n",
" 0.0 \n",
" 120566 \n",
" The Mansion on O Street \n",
" \n",
" \n",
"
\n",
"
1192 rows × 3 columns
\n",
"
"
],
"text/plain": [
" hotel_class id name\n",
"6 0.0 2643161 The NoMad Hotel\n",
"44 0.0 1630591 Crowne Plaza\n",
"49 0.0 585164 Residence Inn Houston West/Energy Corridor\n",
"52 0.0 258634 Scottish Inn & Suites Reliant Park/Six Flags\n",
"70 0.0 815515 Scottish Inns & Suites - Willowbrook\n",
"... ... ... ...\n",
"4253 0.0 1204691 Quality Inn & Suites Laurel\n",
"4259 0.0 1218625 Wyndham\n",
"4261 0.0 1515599 Executive Apartments\n",
"4262 0.0 84068 Harrington Hotel\n",
"4282 0.0 120566 The Mansion on O Street\n",
"\n",
"[1192 rows x 3 columns]"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"offerings[\"hotel_class\"] = offerings[\"hotel_class\"].fillna(0)\n",
"offerings.iloc[idx_null_class]"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "wfq_jBjw4au8"
},
"source": [
"The replacement has been done successfully; so now we can **merge** both of the DataFrames."
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 424
},
"executionInfo": {
"elapsed": 3626,
"status": "ok",
"timestamp": 1730740404441,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "ehz0gAoi4bPw",
"outputId": "b9725cff-8e7e-483d-ad3a-2cb04462fcf8"
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" \n",
" offering_id \n",
" name \n",
" hotel_class \n",
" ratings \n",
" reviews \n",
" \n",
" \n",
" \n",
" \n",
" 0 \n",
" 72572 \n",
" BEST WESTERN PLUS Pioneer Square Hotel \n",
" 3.5 \n",
" {'service': 4.6, 'cleanliness': 4.6, 'overall'... \n",
" I had to make fast visit to seattle and I foun... \n",
" \n",
" \n",
" 1 \n",
" 72579 \n",
" BEST WESTERN Loyal Inn \n",
" 2.0 \n",
" {'service': 4.2, 'cleanliness': 4.2, 'overall'... \n",
" Great service, rooms were clean, could use som... \n",
" \n",
" \n",
" 2 \n",
" 72586 \n",
" BEST WESTERN PLUS Executive Inn \n",
" 3.0 \n",
" {'service': 4.2, 'cleanliness': 4.3, 'overall'... \n",
" Beautiful views of the space needle - especial... \n",
" \n",
" \n",
" 3 \n",
" 72598 \n",
" Comfort Inn & Suites Seattle \n",
" 2.5 \n",
" {'service': 3.2, 'cleanliness': 3.2, 'overall'... \n",
" This hotel is in need of some serious updates.... \n",
" \n",
" \n",
" 4 \n",
" 73236 \n",
" Days Inn San Antonio/Near Lackland AFB \n",
" 2.0 \n",
" {'service': 4.3, 'cleanliness': 3.1, 'overall'... \n",
" My experience at this days inn was perfect. th... \n",
" \n",
" \n",
" ... \n",
" ... \n",
" ... \n",
" ... \n",
" ... \n",
" ... \n",
" \n",
" \n",
" 3749 \n",
" 3523356 \n",
" Hampton Inn & Suites Austin @ The University/C... \n",
" 2.5 \n",
" {'service': 4.9, 'cleanliness': 4.9, 'overall'... \n",
" I've stayed at plenty of Hampton Inns during m... \n",
" \n",
" \n",
" 3750 \n",
" 3541823 \n",
" New York Budget Inn \n",
" 0.0 \n",
" {'service': 4.3, 'cleanliness': 4.5, 'overall'... \n",
" Inn staff absolutely wonderful, helpful, knowl... \n",
" \n",
" \n",
" 3751 \n",
" 3572384 \n",
" Hyatt Place Chicago/River North \n",
" 0.0 \n",
" {'service': 3.0, 'cleanliness': 2.0, 'overall'... \n",
" Crowded, noisy, dirty. Service is poor, food i... \n",
" \n",
" \n",
" 3752 \n",
" 3572583 \n",
" Holiday Inn Express New York - Manhattan West ... \n",
" 3.0 \n",
" {'service': 1.0, 'cleanliness': 1.0, 'overall'... \n",
" El hotel estaba en medio de una remodelación. ... \n",
" \n",
" \n",
" 3753 \n",
" 3574675 \n",
" Days Inn Columbus Airport \n",
" 3.0 \n",
" {'service': 3.0, 'cleanliness': 2.8, 'overall'... \n",
" We were looking for a place to stay close to t... \n",
" \n",
" \n",
"
\n",
"
3754 rows × 5 columns
\n",
"
"
],
"text/plain": [
" offering_id name \\\n",
"0 72572 BEST WESTERN PLUS Pioneer Square Hotel \n",
"1 72579 BEST WESTERN Loyal Inn \n",
"2 72586 BEST WESTERN PLUS Executive Inn \n",
"3 72598 Comfort Inn & Suites Seattle \n",
"4 73236 Days Inn San Antonio/Near Lackland AFB \n",
"... ... ... \n",
"3749 3523356 Hampton Inn & Suites Austin @ The University/C... \n",
"3750 3541823 New York Budget Inn \n",
"3751 3572384 Hyatt Place Chicago/River North \n",
"3752 3572583 Holiday Inn Express New York - Manhattan West ... \n",
"3753 3574675 Days Inn Columbus Airport \n",
"\n",
" hotel_class ratings \\\n",
"0 3.5 {'service': 4.6, 'cleanliness': 4.6, 'overall'... \n",
"1 2.0 {'service': 4.2, 'cleanliness': 4.2, 'overall'... \n",
"2 3.0 {'service': 4.2, 'cleanliness': 4.3, 'overall'... \n",
"3 2.5 {'service': 3.2, 'cleanliness': 3.2, 'overall'... \n",
"4 2.0 {'service': 4.3, 'cleanliness': 3.1, 'overall'... \n",
"... ... ... \n",
"3749 2.5 {'service': 4.9, 'cleanliness': 4.9, 'overall'... \n",
"3750 0.0 {'service': 4.3, 'cleanliness': 4.5, 'overall'... \n",
"3751 0.0 {'service': 3.0, 'cleanliness': 2.0, 'overall'... \n",
"3752 3.0 {'service': 1.0, 'cleanliness': 1.0, 'overall'... \n",
"3753 3.0 {'service': 3.0, 'cleanliness': 2.8, 'overall'... \n",
"\n",
" reviews \n",
"0 I had to make fast visit to seattle and I foun... \n",
"1 Great service, rooms were clean, could use som... \n",
"2 Beautiful views of the space needle - especial... \n",
"3 This hotel is in need of some serious updates.... \n",
"4 My experience at this days inn was perfect. th... \n",
"... ... \n",
"3749 I've stayed at plenty of Hampton Inns during m... \n",
"3750 Inn staff absolutely wonderful, helpful, knowl... \n",
"3751 Crowded, noisy, dirty. Service is poor, food i... \n",
"3752 El hotel estaba en medio de una remodelación. ... \n",
"3753 We were looking for a place to stay close to t... \n",
"\n",
"[3754 rows x 5 columns]"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Rename \"id\" column for the merge to go through\n",
"offerings.rename(columns={\"id\": \"offering_id\"}, inplace=True)\n",
"\n",
"# Merge both DataFrames on \"offering_id\" column\n",
"final_df = df_grouped.merge(offerings, on=\"offering_id\")\n",
"\n",
"# Re-order the columns in a more logical order\n",
"final_df = final_df.iloc[:, [0, 4, 3, 2, 1]]\n",
"final_df"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "1gBbg7qW4eNc"
},
"source": [
"## Implementing BM25 Baseline\n",
"Using the [`rank_bm25`](github.com/dorianbrown/rank_bm25) library to implement a BM25 model. We will retrieve the most similar place for a given query based on BM25 scores."
]
},
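{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the BM25 score concrete, here is a minimal pure-Python sketch of Okapi BM25. It is not how `rank_bm25` is implemented internally: the function name `bm25_scores`, the toy corpus, the parameter values `k1=1.5` and `b=0.75`, and the smoothed non-negative IDF are all assumptions made for illustration.\n",
"\n",
"```python\n",
"import math\n",
"from collections import Counter\n",
"\n",
"\n",
"def bm25_scores(query_tokens, corpus_tokens, k1=1.5, b=0.75):\n",
"    # Score every tokenized document in corpus_tokens against query_tokens.\n",
"    n_docs = len(corpus_tokens)\n",
"    avg_len = sum(len(doc) for doc in corpus_tokens) / n_docs\n",
"    # Document frequency: in how many documents each term occurs.\n",
"    doc_freq = Counter(term for doc in corpus_tokens for term in set(doc))\n",
"\n",
"    scores = []\n",
"    for doc in corpus_tokens:\n",
"        term_freq = Counter(doc)\n",
"        # Longer-than-average documents are penalized through this factor.\n",
"        length_norm = 1 - b + b * len(doc) / avg_len\n",
"        score = 0.0\n",
"        for term in query_tokens:\n",
"            tf = term_freq[term]\n",
"            if tf == 0:\n",
"                continue\n",
"            # Smoothed, always non-negative IDF (an assumption of this sketch).\n",
"            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))\n",
"            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)\n",
"        scores.append(score)\n",
"    return scores\n",
"\n",
"\n",
"# Toy corpus, already tokenized (hypothetical data, not taken from the dataset).\n",
"corpus = [\n",
"    ['clean', 'room', 'friendly', 'staff'],\n",
"    ['noisy', 'room', 'rude', 'staff'],\n",
"    ['great', 'view', 'space', 'needle'],\n",
"]\n",
"scores = bm25_scores(['space', 'needle', 'view'], corpus)\n",
"# Index of the highest-scoring document.\n",
"best = max(range(len(corpus)), key=scores.__getitem__)\n",
"```\n",
"\n",
"Each document receives one score per query, and the recommended place would be the one with the highest score. In the notebook itself, the scoring is delegated to `rank_bm25`."
]
},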
{
"cell_type": "markdown",
"metadata": {
"id": "aXh-Y_Ue4hdr"
},
"source": [
"### Preprocessing"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"executionInfo": {
"elapsed": 3,
"status": "ok",
"timestamp": 1730740404441,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "-LRJdBP54i2T"
},
"outputs": [],
"source": [
"# Preprocess function to clean and prepare text\n",
"def preprocess(text, count=False):\n",
" stop_words = set(stopwords.words(\"english\"))\n",
"\n",
" # Tokenization\n",
" tokens = regexp_tokenize(text.lower(), r\"\\w+\")\n",
" tokens = [token for token in tokens if token.isalpha()] # Remove non-alphabetic tokens\n",
" tokens = [token for token in tokens if token not in stop_words] # Remove stopwords\n",
"\n",
" # Lemmatization\n",
" lemmatizer = WordNetLemmatizer()\n",
" tokens = [lemmatizer.lemmatize(token) for token in tokens]\n",
"\n",
" return Counter(tokens) if count else tokens # Returning tokens directly for use in BM25"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Showing the occurrences of each token."
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"executionInfo": {
"elapsed": 2210,
"status": "ok",
"timestamp": 1730740406649,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "hC1PSiZX4n0A",
"outputId": "0d469a6c-6d0a-458f-8ea7-a1f954ad0155"
},
"outputs": [
{
"data": {
"text/plain": [
"Counter({'hotel': 267,\n",
" 'room': 230,\n",
" 'needle': 129,\n",
" 'seattle': 125,\n",
" 'space': 124,\n",
" 'staff': 123,\n",
" 'great': 104,\n",
" 'breakfast': 88,\n",
" 'location': 86,\n",
" 'clean': 78,\n",
" 'good': 77,\n",
" 'stay': 77,\n",
" 'night': 74,\n",
" 'u': 70,\n",
" 'best': 69,\n",
" 'view': 66,\n",
" 'nice': 61,\n",
" 'monorail': 61,\n",
" 'would': 59,\n",
" 'western': 55,\n",
" 'one': 55,\n",
" 'place': 54,\n",
" 'parking': 53,\n",
" 'walk': 53,\n",
" 'downtown': 51,\n",
" 'service': 50,\n",
" 'desk': 48,\n",
" 'friendly': 48,\n",
" 'time': 46,\n",
" 'restaurant': 46,\n",
" 'center': 45,\n",
" 'stayed': 45,\n",
" 'price': 45,\n",
" 'day': 44,\n",
" 'bed': 44,\n",
" 'also': 44,\n",
" 'front': 43,\n",
" 'helpful': 43,\n",
" 'well': 38,\n",
" 'like': 36,\n",
" 'buffet': 36,\n",
" 'block': 34,\n",
" 'area': 34,\n",
" 'free': 33,\n",
" 'get': 33,\n",
" 'way': 33,\n",
" 'excellent': 32,\n",
" 'could': 31,\n",
" 'close': 31,\n",
" 'even': 31,\n",
" 'minute': 31,\n",
" 'next': 30,\n",
" 'two': 30,\n",
" 'away': 30,\n",
" 'comfortable': 30,\n",
" 'need': 30,\n",
" 'little': 30,\n",
" 'really': 29,\n",
" 'walking': 27,\n",
" 'plus': 24,\n",
" 'got': 24,\n",
" 'everything': 24,\n",
" 'back': 24,\n",
" 'inn': 23,\n",
" 'easy': 23,\n",
" 'pike': 23,\n",
" 'right': 23,\n",
" 'see': 22,\n",
" 'recommend': 22,\n",
" 'executive': 22,\n",
" 'check': 22,\n",
" 'market': 22,\n",
" 'go': 21,\n",
" 'thing': 21,\n",
" 'around': 21,\n",
" 'food': 21,\n",
" 'noise': 21,\n",
" 'near': 21,\n",
" 'tv': 20,\n",
" 'much': 19,\n",
" 'window': 19,\n",
" 'street': 19,\n",
" 'extra': 19,\n",
" 'da': 19,\n",
" 'lot': 18,\n",
" 'make': 18,\n",
" 'morning': 18,\n",
" 'car': 18,\n",
" 'big': 18,\n",
" 'first': 18,\n",
" 'asked': 18,\n",
" 'cruise': 18,\n",
" 'guest': 17,\n",
" 'bit': 17,\n",
" 'people': 17,\n",
" 'distance': 17,\n",
" 'find': 16,\n",
" 'went': 16,\n",
" 'better': 16,\n",
" 'package': 16,\n",
" 'take': 16,\n",
" 'attraction': 16,\n",
" 'perfect': 16,\n",
" 'always': 16,\n",
" 'lobby': 16,\n",
" 'although': 16,\n",
" 'wanted': 16,\n",
" 'bathroom': 16,\n",
" 'etc': 16,\n",
" 'family': 16,\n",
" 'trip': 16,\n",
" 'outside': 16,\n",
" 'feel': 15,\n",
" 'going': 15,\n",
" 'shower': 15,\n",
" 'choice': 15,\n",
" 'took': 15,\n",
" 'every': 15,\n",
" 'value': 15,\n",
" 'paid': 15,\n",
" 'spacious': 15,\n",
" 'used': 15,\n",
" 'und': 15,\n",
" 'want': 14,\n",
" 'many': 14,\n",
" 'building': 14,\n",
" 'review': 14,\n",
" 'bw': 14,\n",
" 'within': 14,\n",
" 'museum': 14,\n",
" 'pretty': 14,\n",
" 'side': 14,\n",
" 'bad': 14,\n",
" 'customer': 14,\n",
" 'ok': 14,\n",
" 'town': 14,\n",
" 'city': 14,\n",
" 'bus': 14,\n",
" 'booked': 14,\n",
" 'visit': 14,\n",
" 'door': 13,\n",
" 'came': 13,\n",
" 'made': 13,\n",
" 'arrived': 13,\n",
" 'fantastic': 13,\n",
" 'pay': 13,\n",
" 'overall': 13,\n",
" 'rate': 13,\n",
" 'travel': 13,\n",
" 'reservation': 13,\n",
" 'help': 13,\n",
" 'person': 12,\n",
" 'short': 12,\n",
" 'nothing': 12,\n",
" 'included': 12,\n",
" 'cost': 12,\n",
" 'however': 12,\n",
" 'small': 12,\n",
" 'look': 12,\n",
" 'say': 12,\n",
" 'nearby': 12,\n",
" 'microwave': 12,\n",
" 'across': 12,\n",
" 'needed': 12,\n",
" 'last': 12,\n",
" 'call': 11,\n",
" 'internet': 11,\n",
" 'science': 11,\n",
" 'enough': 11,\n",
" 'worked': 11,\n",
" 'quiet': 11,\n",
" 'experience': 11,\n",
" 'wonderful': 11,\n",
" 'part': 11,\n",
" 'fine': 11,\n",
" 'said': 11,\n",
" 'kind': 11,\n",
" 'phone': 11,\n",
" 'taking': 11,\n",
" 'kid': 11,\n",
" 'use': 11,\n",
" 'emp': 11,\n",
" 'extremely': 11,\n",
" 'war': 11,\n",
" 'called': 10,\n",
" 'loud': 10,\n",
" 'happy': 10,\n",
" 'light': 10,\n",
" 'since': 10,\n",
" 'key': 10,\n",
" 'floor': 10,\n",
" 'water': 10,\n",
" 'though': 10,\n",
" 'fridge': 10,\n",
" 'felt': 10,\n",
" 'looked': 10,\n",
" 'walked': 10,\n",
" 'problem': 10,\n",
" 'expensive': 10,\n",
" 'park': 10,\n",
" 'work': 10,\n",
" 'deal': 10,\n",
" 'new': 10,\n",
" 'centre': 10,\n",
" 'airport': 10,\n",
" 'die': 10,\n",
" 'issue': 9,\n",
" 'king': 9,\n",
" 'group': 9,\n",
" 'home': 9,\n",
" 'decent': 9,\n",
" 'another': 9,\n",
" 'full': 9,\n",
" 'husband': 9,\n",
" 'queen': 9,\n",
" 'told': 9,\n",
" 'eat': 9,\n",
" 'manager': 9,\n",
" 'staying': 9,\n",
" 'put': 9,\n",
" 'definitely': 9,\n",
" 'reasonable': 9,\n",
" 'management': 9,\n",
" 'provided': 9,\n",
" 'everyone': 9,\n",
" 'ride': 9,\n",
" 'access': 9,\n",
" 'computer': 9,\n",
" 'party': 9,\n",
" 'woman': 9,\n",
" 'far': 9,\n",
" 'charge': 9,\n",
" 'huge': 9,\n",
" 'coffee': 9,\n",
" 'found': 9,\n",
" 'courteous': 9,\n",
" 'secure': 8,\n",
" 'inside': 8,\n",
" 'hallway': 8,\n",
" 'may': 8,\n",
" 'evening': 8,\n",
" 'traffic': 8,\n",
" 'sure': 8,\n",
" 'old': 8,\n",
" 'available': 8,\n",
" 'never': 8,\n",
" 'star': 8,\n",
" 'convenient': 8,\n",
" 'high': 8,\n",
" 'name': 8,\n",
" 'quality': 8,\n",
" 'screen': 8,\n",
" 'still': 8,\n",
" 'pier': 8,\n",
" 'green': 8,\n",
" 'able': 8,\n",
" 'vancouver': 8,\n",
" 'lounge': 8,\n",
" 'business': 8,\n",
" 'zimmer': 8,\n",
" 'der': 8,\n",
" 'man': 7,\n",
" 'seemed': 7,\n",
" 'safe': 7,\n",
" 'pleased': 7,\n",
" 'top': 7,\n",
" 'hot': 7,\n",
" 'left': 7,\n",
" 'end': 7,\n",
" 'offer': 7,\n",
" 'plan': 7,\n",
" 'hour': 7,\n",
" 'wall': 7,\n",
" 'thanks': 7,\n",
" 'wifi': 7,\n",
" 'arena': 7,\n",
" 'elevator': 7,\n",
" 'least': 7,\n",
" 'bar': 7,\n",
" 'checked': 7,\n",
" 'stop': 7,\n",
" 'note': 7,\n",
" 'super': 7,\n",
" 'enjoyed': 7,\n",
" 'without': 7,\n",
" 'closed': 7,\n",
" 'transportation': 7,\n",
" 'half': 7,\n",
" 'fee': 7,\n",
" 'chose': 7,\n",
" 'event': 7,\n",
" 'waterfront': 7,\n",
" 'question': 7,\n",
" 'probably': 7,\n",
" 'money': 7,\n",
" 'sleep': 7,\n",
" 'discount': 7,\n",
" 'fun': 7,\n",
" 'long': 7,\n",
" 'facility': 7,\n",
" 'fresh': 7,\n",
" 'negative': 7,\n",
" 'outstanding': 7,\n",
" 'amazing': 7,\n",
" 'liked': 7,\n",
" 'per': 7,\n",
" 'friend': 7,\n",
" 'hear': 7,\n",
" 'ist': 7,\n",
" 'wir': 7,\n",
" 'im': 7,\n",
" 'especially': 6,\n",
" 'ask': 6,\n",
" 'conference': 6,\n",
" 'pacific': 6,\n",
" 'getting': 6,\n",
" 'older': 6,\n",
" 'think': 6,\n",
" 'egg': 6,\n",
" 'spot': 6,\n",
" 'glass': 6,\n",
" 'year': 6,\n",
" 'thin': 6,\n",
" 'turn': 6,\n",
" 'awesome': 6,\n",
" 'delivered': 6,\n",
" 'noisy': 6,\n",
" 'pressure': 6,\n",
" 'plenty': 6,\n",
" 'child': 6,\n",
" 'rather': 6,\n",
" 'waiting': 6,\n",
" 'impressed': 6,\n",
" 'open': 6,\n",
" 'motel': 6,\n",
" 'try': 6,\n",
" 'know': 6,\n",
" 'variety': 6,\n",
" 'air': 6,\n",
" 'moment': 6,\n",
" 'highly': 6,\n",
" 'drive': 6,\n",
" 'hall': 6,\n",
" 'slept': 6,\n",
" 'card': 6,\n",
" 'shuttle': 6,\n",
" 'pick': 6,\n",
" 'thank': 6,\n",
" 'clearly': 6,\n",
" 'duck': 6,\n",
" 'dated': 6,\n",
" 'including': 6,\n",
" 'shopping': 6,\n",
" 'worth': 6,\n",
" 'lake': 6,\n",
" 'point': 6,\n",
" 'machine': 6,\n",
" 'flat': 6,\n",
" 'allowed': 6,\n",
" 'convention': 6,\n",
" 'wife': 6,\n",
" 'couple': 6,\n",
" 'wait': 6,\n",
" 'low': 6,\n",
" 'fast': 6,\n",
" 'dispenser': 6,\n",
" 'shampoo': 6,\n",
" 'actually': 6,\n",
" 'book': 6,\n",
" 'certainly': 6,\n",
" 'provide': 6,\n",
" 'looking': 6,\n",
" 'spent': 6,\n",
" 'garage': 6,\n",
" 'visiting': 6,\n",
" 'dinner': 6,\n",
" 'large': 6,\n",
" 'head': 6,\n",
" 'parent': 6,\n",
" 'sehr': 6,\n",
" 'für': 6,\n",
" 'mit': 6,\n",
" 'beautiful': 5,\n",
" 'local': 5,\n",
" 'come': 5,\n",
" 'line': 5,\n",
" 'due': 5,\n",
" 'underground': 5,\n",
" 'waffle': 5,\n",
" 'member': 5,\n",
" 'driving': 5,\n",
" 'limited': 5,\n",
" 'anything': 5,\n",
" 'quite': 5,\n",
" 'july': 5,\n",
" 'direction': 5,\n",
" 'property': 5,\n",
" 'son': 5,\n",
" 'ever': 5,\n",
" 'pillow': 5,\n",
" 'yes': 5,\n",
" 'working': 5,\n",
" 'request': 5,\n",
" 'ate': 5,\n",
" 'special': 5,\n",
" 'selection': 5,\n",
" 'online': 5,\n",
" 'gave': 5,\n",
" 'construction': 5,\n",
" 'thought': 5,\n",
" 'surprised': 5,\n",
" 'three': 5,\n",
" 'recommended': 5,\n",
" 'return': 5,\n",
" 'hard': 5,\n",
" 'beyond': 5,\n",
" 'helped': 5,\n",
" 'oh': 5,\n",
" 'afternoon': 5,\n",
" 'complaint': 5,\n",
" 'e': 5,\n",
" 'almost': 5,\n",
" 'carpet': 5,\n",
" 'game': 5,\n",
" 'number': 5,\n",
" 'taxi': 5,\n",
" 'voucher': 5,\n",
" 'station': 5,\n",
" 'dining': 5,\n",
" 'loved': 5,\n",
" 'enjoy': 5,\n",
" 'pool': 5,\n",
" 'homeless': 5,\n",
" 'early': 5,\n",
" 'hold': 5,\n",
" 'luggage': 5,\n",
" 'standard': 5,\n",
" 'five': 5,\n",
" 'located': 5,\n",
" 'expectation': 5,\n",
" 'addition': 5,\n",
" 'given': 5,\n",
" 'accommodating': 5,\n",
" 'upon': 5,\n",
" 'run': 5,\n",
" 'central': 5,\n",
" 'pizza': 5,\n",
" 'meal': 5,\n",
" 'quick': 5,\n",
" 'despite': 5,\n",
" 'comfy': 5,\n",
" 'daughter': 5,\n",
" 'ken': 5,\n",
" 'believe': 5,\n",
" 'decor': 5,\n",
" 'sky': 5,\n",
" 'four': 5,\n",
" 'cozy': 5,\n",
" 'wireless': 5,\n",
" 'christmas': 5,\n",
" 'employee': 5,\n",
" 'soap': 5,\n",
" 'terminal': 5,\n",
" 'wi': 4,\n",
" 'fi': 4,\n",
" 'dressed': 4,\n",
" 'keep': 4,\n",
" 'ticket': 4,\n",
" 'proximity': 4,\n",
" 'kept': 4,\n",
" 'furniture': 4,\n",
" 'brella': 4,\n",
" 'sorry': 4,\n",
" 'offering': 4,\n",
" 'different': 4,\n",
" 'weather': 4,\n",
" 'surprise': 4,\n",
" 'garden': 4,\n",
" 'accommodation': 4,\n",
" 'credit': 4,\n",
" 'concert': 4,\n",
" 'handy': 4,\n",
" 'set': 4,\n",
" 'lower': 4,\n",
" 'someone': 4,\n",
" 'talking': 4,\n",
" 'polite': 4,\n",
" 'suggestion': 4,\n",
" 'option': 4,\n",
" 'adequate': 4,\n",
" 'aware': 4,\n",
" 'fault': 4,\n",
" 'refrigerator': 4,\n",
" 'sign': 4,\n",
" 'leave': 4,\n",
" 'let': 4,\n",
" 'bacon': 4,\n",
" 'conditioning': 4,\n",
" 'safeco': 4,\n",
" 'drop': 4,\n",
" 'company': 4,\n",
" 'reached': 4,\n",
" 'truly': 4,\n",
" 'com': 4,\n",
" 'completely': 4,\n",
" 'job': 4,\n",
" 'menu': 4,\n",
" 'twice': 4,\n",
" 'making': 4,\n",
" 'fancy': 4,\n",
" 'yummy': 4,\n",
" 'cooky': 4,\n",
" 'whole': 4,\n",
" 'pleasant': 4,\n",
" 'site': 4,\n",
" 'connectivity': 4,\n",
" 'neighborhood': 4,\n",
" 'higher': 4,\n",
" 'expected': 4,\n",
" 'mcdonald': 4,\n",
" 'less': 4,\n",
" 'expect': 4,\n",
" 'cheap': 4,\n",
" 'sink': 4,\n",
" 'positive': 4,\n",
" 'union': 4,\n",
" 'checking': 4,\n",
" 'answer': 4,\n",
" 'important': 4,\n",
" 'direct': 4,\n",
" 'email': 4,\n",
" 'tax': 4,\n",
" 'level': 4,\n",
" 'normally': 4,\n",
" 'start': 4,\n",
" 'using': 4,\n",
" 'meeting': 4,\n",
" 'requested': 4,\n",
" 'attendee': 4,\n",
" 'clerk': 4,\n",
" 'deserves': 4,\n",
" 'tom': 4,\n",
" 'dana': 4,\n",
" 'age': 4,\n",
" 'previous': 4,\n",
" 'sightseeing': 4,\n",
" 'saved': 4,\n",
" 'rail': 4,\n",
" 'fitness': 4,\n",
" 'anyone': 4,\n",
" 'travelling': 4,\n",
" 'spectacular': 4,\n",
" 'music': 4,\n",
" 'reception': 4,\n",
" 'considering': 4,\n",
" 'chair': 4,\n",
" 'information': 4,\n",
" 'catering': 4,\n",
" 'prepared': 4,\n",
" 'easily': 4,\n",
" 'several': 4,\n",
" 'road': 4,\n",
" 'disappointed': 4,\n",
" 'major': 4,\n",
" 'picked': 4,\n",
" 'idea': 4,\n",
" 'decorated': 4,\n",
" 'towel': 4,\n",
" 'ten': 4,\n",
" 'care': 4,\n",
" 'min': 4,\n",
" 'amenity': 4,\n",
" 'choose': 4,\n",
" 'step': 4,\n",
" 'speed': 4,\n",
" 'frühstück': 4,\n",
" 'einen': 4,\n",
" 'zu': 4,\n",
" 'nicht': 4,\n",
" 'den': 4,\n",
" 'ein': 4,\n",
" 'var': 4,\n",
" 'duty': 3,\n",
" 'give': 3,\n",
" 'tried': 3,\n",
" 'downstairs': 3,\n",
" 'decided': 3,\n",
" 'updated': 3,\n",
" 'sound': 3,\n",
" 'efficient': 3,\n",
" 'paying': 3,\n",
" 'added': 3,\n",
" 'saw': 3,\n",
" 'dark': 3,\n",
" 'otherwise': 3,\n",
" 'suggested': 3,\n",
" 'sleeping': 3,\n",
" 'blanket': 3,\n",
" 'saturday': 3,\n",
" 'others': 3,\n",
" 'relaxing': 3,\n",
" 'definately': 3,\n",
" 'traveling': 3,\n",
" 'favorite': 3,\n",
" 'giving': 3,\n",
" 'average': 3,\n",
" 'complain': 3,\n",
" 'either': 3,\n",
" 'coming': 3,\n",
" 'finding': 3,\n",
" 'understand': 3,\n",
" 'connecting': 3,\n",
" 'future': 3,\n",
" 'cleaned': 3,\n",
" 'wash': 3,\n",
" 'whether': 3,\n",
" 'includes': 3,\n",
" 'tut': 3,\n",
" 'parked': 3,\n",
" 'spend': 3,\n",
" 'non': 3,\n",
" 'second': 3,\n",
" 'tiny': 3,\n",
" 'toilet': 3,\n",
" 'maid': 3,\n",
" 'respect': 3,\n",
" 'trying': 3,\n",
" 'goodness': 3,\n",
" 'replaced': 3,\n",
" 'spring': 3,\n",
" 'response': 3,\n",
" 'field': 3,\n",
" 'mariner': 3,\n",
" 'answered': 3,\n",
" 'driver': 3,\n",
" 'apparently': 3,\n",
" 'julian': 3,\n",
" 'accomodations': 3,\n",
" 'tip': 3,\n",
" 'item': 3,\n",
" 'table': 3,\n",
" 'fabulous': 3,\n",
" 'patricia': 3,\n",
" 'decision': 3,\n",
" 'corner': 3,\n",
" 'swimming': 3,\n",
" 'spa': 3,\n",
" 'dont': 3,\n",
" 'police': 3,\n",
" 'matter': 3,\n",
" 'already': 3,\n",
" 'mentioned': 3,\n",
" 'seems': 3,\n",
" 'love': 3,\n",
" 'basic': 3,\n",
" 'snack': 3,\n",
" 'cut': 3,\n",
" 'week': 3,\n",
" 'flight': 3,\n",
" 'real': 3,\n",
" 'slow': 3,\n",
" 'relatively': 3,\n",
" 'store': 3,\n",
" 'district': 3,\n",
" 'managed': 3,\n",
" 'onto': 3,\n",
" 'reason': 3,\n",
" 'might': 3,\n",
" 'save': 3,\n",
" 'size': 3,\n",
" 'updating': 3,\n",
" 'fan': 3,\n",
" 'lotion': 3,\n",
" 'weak': 3,\n",
" 'turned': 3,\n",
" 'touch': 3,\n",
" 'recently': 3,\n",
" 'tourism': 3,\n",
" 'maybe': 3,\n",
" 'tourist': 3,\n",
" 'frustrating': 3,\n",
" 'checkout': 3,\n",
" 'mattress': 3,\n",
" 'alternative': 3,\n",
" 'cereal': 3,\n",
" 'juice': 3,\n",
" 'p': 3,\n",
" 'bonus': 3,\n",
" 'ton': 3,\n",
" 'complimentary': 3,\n",
" 'equipped': 3,\n",
" 'presentation': 3,\n",
" 'factor': 3,\n",
" 'budget': 3,\n",
" 'late': 3,\n",
" 'busy': 3,\n",
" 'happen': 3,\n",
" 'sat': 3,\n",
" 'justin': 3,\n",
" 'ago': 3,\n",
" 'supposed': 3,\n",
" 'conditioner': 3,\n",
" 'c': 3,\n",
" 'encountered': 3,\n",
" 'tour': 3,\n",
" 'show': 3,\n",
" 'sufficient': 3,\n",
" 'mini': 3,\n",
" 'attractive': 3,\n",
" 'continental': 3,\n",
" 'additional': 3,\n",
" 'offered': 3,\n",
" 'strength': 3,\n",
" 'throughout': 3,\n",
" 'rental': 3,\n",
" 'worry': 3,\n",
" 'bartender': 3,\n",
" 'cheryl': 3,\n",
" 'judge': 3,\n",
" 'cover': 3,\n",
" 'appreciated': 3,\n",
" 'josh': 3,\n",
" 'true': 3,\n",
" 'incredibly': 3,\n",
" 'ideal': 3,\n",
" 'lovely': 3,\n",
" 'treat': 3,\n",
" 'effort': 3,\n",
" 'met': 3,\n",
" 'warm': 3,\n",
" 'heart': 3,\n",
" 'adjacent': 3,\n",
" 'entertainment': 3,\n",
" 'affordable': 3,\n",
" 'column': 3,\n",
" 'lunch': 3,\n",
" 'quickly': 3,\n",
" 'must': 3,\n",
" 'pas': 3,\n",
" 'fireplace': 3,\n",
" 'exterior': 3,\n",
" 'curtain': 3,\n",
" 'treated': 3,\n",
" 'nicely': 3,\n",
" 'greeted': 3,\n",
" 'purpose': 3,\n",
" 'hop': 3,\n",
" 'held': 3,\n",
" 'vacation': 3,\n",
" 'choosing': 3,\n",
" 'beat': 3,\n",
" 'neighbor': 3,\n",
" 'handicapped': 3,\n",
" 'men': 3,\n",
" 'ended': 3,\n",
" 'ready': 3,\n",
" 'ice': 3,\n",
" 'promised': 3,\n",
" 'bottle': 3,\n",
" 'westlake': 3,\n",
" 'personal': 3,\n",
" 'von': 3,\n",
" 'entfernt': 3,\n",
" 'dem': 3,\n",
" 'auch': 3,\n",
" 'nur': 3,\n",
" 'unser': 3,\n",
" 'usd': 3,\n",
" 'nach': 3,\n",
" 'aber': 3,\n",
" 'hotellet': 3,\n",
" 'av': 3,\n",
" 'password': 2,\n",
" 'policy': 2,\n",
" 'ridiculous': 2,\n",
" 'anyway': 2,\n",
" 'spoke': 2,\n",
" 'shocked': 2,\n",
" 'leaf': 2,\n",
" 'vip': 2,\n",
" 'specific': 2,\n",
" 'opted': 2,\n",
" 'dissapointed': 2,\n",
" 'wise': 2,\n",
" 'adjoining': 2,\n",
" 'sausage': 2,\n",
" 'slimy': 2,\n",
" 'chihuly': 2,\n",
" 'exhibit': 2,\n",
" 'experienced': 2,\n",
" 'v': 2,\n",
" 'compared': 2,\n",
" 'lodging': 2,\n",
" 'counter': 2,\n",
" 'unless': 2,\n",
" 'stocked': 2,\n",
" 'marketing': 2,\n",
" 'reserved': 2,\n",
" 'adult': 2,\n",
" 'housekeeper': 2,\n",
" 'bring': 2,\n",
" 'setting': 2,\n",
" 'heard': 2,\n",
" 'main': 2,\n",
" 'seen': 2,\n",
" 'comment': 2,\n",
" 'disposable': 2,\n",
" 'advising': 2,\n",
" 'particularly': 2,\n",
" 'locally': 2,\n",
" 'tap': 2,\n",
" 'advantage': 2,\n",
" 'literally': 2,\n",
" 'entire': 2,\n",
" 'watch': 2,\n",
" 'earlier': 2,\n",
" 'garbage': 2,\n",
" 'truck': 2,\n",
" 'sit': 2,\n",
" 'privacy': 2,\n",
" 'telling': 2,\n",
" 'serviced': 2,\n",
" 'cleanliness': 2,\n",
" 'comfort': 2,\n",
" 'nite': 2,\n",
" 'lucky': 2,\n",
" 'uncomfortable': 2,\n",
" 'broken': 2,\n",
" 'change': 2,\n",
" 'worried': 2,\n",
" 'portland': 2,\n",
" 'ferry': 2,\n",
" 'stain': 2,\n",
" 'missing': 2,\n",
" 'informed': 2,\n",
" 'jerk': 2,\n",
" 'indeed': 2,\n",
" 'ticked': 2,\n",
" 'lodge': 2,\n",
" 'sending': 2,\n",
" 'letter': 2,\n",
" 'showed': 2,\n",
" 'confirmation': 2,\n",
" 'solve': 2,\n",
" 'fully': 2,\n",
" 'worse': 2,\n",
" 'collection': 2,\n",
" 'beside': 2,\n",
" 'priced': 2,\n",
" 'following': 2,\n",
" 'mcdonalds': 2,\n",
" 'round': 2,\n",
" 'something': 2,\n",
" 'ive': 2,\n",
" 'rating': 2,\n",
" 'bothered': 2,\n",
" 'stone': 2,\n",
" 'throw': 2,\n",
" 'milk': 2,\n",
" 'later': 2,\n",
" 'solid': 2,\n",
" 'mile': 2,\n",
" 'abit': 2,\n",
" 'bath': 2,\n",
" 'attend': 2,\n",
" 'definite': 2,\n",
" 'attending': 2,\n",
" 'responsive': 2,\n",
" 'month': 2,\n",
" 'accomodating': 2,\n",
" 'particular': 2,\n",
" 'enjoyable': 2,\n",
" 'perfectly': 2,\n",
" 'honest': 2,\n",
" 'spray': 2,\n",
" 'covering': 2,\n",
" 'advertising': 2,\n",
" 'stocking': 2,\n",
" 'possible': 2,\n",
" 'regular': 2,\n",
" 'yakima': 2,\n",
" 'condition': 2,\n",
" 'amount': 2,\n",
" 'poor': 2,\n",
" 'notice': 2,\n",
" 'brought': 2,\n",
" 'attention': 2,\n",
" 'rough': 2,\n",
" 'term': 2,\n",
" 'outdated': 2,\n",
" 'worst': 2,\n",
" 'system': 2,\n",
" 'cool': 2,\n",
" 'disappointing': 2,\n",
" 'extensive': 2,\n",
" 'bright': 2,\n",
" 'denny': 2,\n",
" 'anne': 2,\n",
" 'alaskan': 2,\n",
" 'checkin': 2,\n",
" 'industry': 2,\n",
" 'conversation': 2,\n",
" 'stuff': 2,\n",
" 'knowledge': 2,\n",
" 'map': 2,\n",
" 'training': 2,\n",
" 'confirm': 2,\n",
" 'happened': 2,\n",
" 'calling': 2,\n",
" 'simply': 2,\n",
" 'cheese': 2,\n",
" 'maker': 2,\n",
" 'fruit': 2,\n",
" 'potato': 2,\n",
" 'clown': 2,\n",
" 'sleeper': 2,\n",
" 'banquet': 2,\n",
" 'west': 2,\n",
" 'separation': 2,\n",
" 'stair': 2,\n",
" 'solved': 2,\n",
" 'speaker': 2,\n",
" 'minor': 2,\n",
" 'play': 2,\n",
" 'delicious': 2,\n",
" 'wide': 2,\n",
" 'host': 2,\n",
" 'respond': 2,\n",
" 'hesitate': 2,\n",
" 'emergency': 2,\n",
" 'gentleman': 2,\n",
" 'concerned': 2,\n",
" 'attentive': 2,\n",
" 'unhelpful': 2,\n",
" 'sour': 2,\n",
" 'smile': 2,\n",
" 'fifteenth': 2,\n",
" 'seated': 2,\n",
" 'waitress': 2,\n",
" 'young': 2,\n",
" 'assume': 2,\n",
" 'chain': 2,\n",
" 'bin': 2,\n",
" 'rug': 2,\n",
" 'hidden': 2,\n",
" 'housekeeping': 2,\n",
" 'stuck': 2,\n",
" 'surroundings': 2,\n",
" 'feature': 2,\n",
" 'changed': 2,\n",
" 'proved': 2,\n",
" 'ship': 2,\n",
" 'sized': 2,\n",
" 'ground': 2,\n",
" 'coupon': 2,\n",
" 'wake': 2,\n",
" 'rode': 2,\n",
" 'unsafe': 2,\n",
" 'talk': 2,\n",
" 'fire': 2,\n",
" 'victoria': 2,\n",
" 'funky': 2,\n",
" 'dropped': 2,\n",
" 'newer': 2,\n",
" 'lived': 2,\n",
" 'cold': 2,\n",
" 'virtually': 2,\n",
" 'maintained': 2,\n",
" 'linen': 2,\n",
" 'freeway': 2,\n",
" 'tasty': 2,\n",
" 'wish': 2,\n",
" 'hawaii': 2,\n",
" 'remember': 2,\n",
" 'rest': 2,\n",
" 'opinion': 2,\n",
" 'smell': 2,\n",
" 'inviting': 2,\n",
" 'attended': 2,\n",
" 'patient': 2,\n",
" 'concern': 2,\n",
" 'firm': 2,\n",
" 'demolition': 2,\n",
" 'empty': 2,\n",
" 'arrangement': 2,\n",
" 'middle': 2,\n",
" 'behind': 2,\n",
" 'poster': 2,\n",
" 'drink': 2,\n",
" 'complimented': 2,\n",
" 'kitchen': 2,\n",
" 'hub': 2,\n",
" 'reachable': 2,\n",
" 'efficiency': 2,\n",
" 'past': 2,\n",
" 'self': 2,\n",
" 'actual': 2,\n",
" 'downside': 2,\n",
" 'exceptional': 2,\n",
" 'advisor': 2,\n",
" ...})"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"preprocess(final_df[\"reviews\"][2], True)"
]
},
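{
"cell_type": "markdown",
"metadata": {},
"source": [
"The full `Counter` printed above is long. When only the most frequent tokens matter, `Counter.most_common` gives a shorter view of the same information. The sketch below uses a hypothetical token list in place of the output of `preprocess`, so the values are illustrative only.\n",
"\n",
"```python\n",
"from collections import Counter\n",
"\n",
"# Hypothetical token list standing in for the output of preprocess(text).\n",
"tokens = ['room', 'hotel', 'room', 'staff', 'hotel', 'room', 'view']\n",
"\n",
"counts = Counter(tokens)\n",
"# most_common(n) returns the n highest-count (token, count) pairs, highest first.\n",
"top_two = counts.most_common(2)\n",
"print(top_two)  # [('room', 3), ('hotel', 2)]\n",
"```\n",
"\n",
"Since `preprocess(text, True)` already returns a `Counter`, the same call could be chained onto its result."
]
},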
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Getting only tokens."
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"executionInfo": {
"elapsed": 4,
"status": "ok",
"timestamp": 1730740406649,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "BGLnrN144qeV",
"outputId": "d560a949-43a7-46df-cecb-25aa490b67cd"
},
"outputs": [
{
"data": {
"text/plain": [
"['beautiful',\n",
" 'view',\n",
" 'space',\n",
" 'needle',\n",
" 'especially',\n",
" 'night',\n",
" 'like',\n",
" 'photography',\n",
" 'ask',\n",
" 'room',\n",
" 'view',\n",
" 'staff',\n",
" 'great',\n",
" 'local',\n",
" 'restuarant',\n",
" 'excellent',\n",
" 'secure',\n",
" 'parking',\n",
" 'plus',\n",
" 'free',\n",
" 'wi',\n",
" 'fi',\n",
" 'issue',\n",
" 'waking',\n",
" 'conference',\n",
" 'call',\n",
" 'could',\n",
" 'find',\n",
" 'internet',\n",
" 'password',\n",
" 'called',\n",
" 'front',\n",
" 'desk',\n",
" 'man',\n",
" 'duty',\n",
" 'would',\n",
" 'give',\n",
" 'insisited',\n",
" 'come',\n",
" 'get',\n",
" 'person',\n",
" 'tried',\n",
" 'convince',\n",
" 'dressed',\n",
" 'claimed',\n",
" 'policy',\n",
" 'guest',\n",
" 'come',\n",
" 'collect',\n",
" 'password',\n",
" 'desk',\n",
" 'got',\n",
" 'dressed',\n",
" 'went',\n",
" 'downstairs',\n",
" 'get',\n",
" 'ridiculous',\n",
" 'seriously',\n",
" 'irritated',\n",
" 'anyway',\n",
" 'spoke',\n",
" 'day',\n",
" 'staff',\n",
" 'seemed',\n",
" 'shocked',\n",
" 'hotel',\n",
" 'policy',\n",
" 'keep',\n",
" 'code',\n",
" 'safe',\n",
" 'intending',\n",
" 'see',\n",
" 'king',\n",
" 'tutankhamen',\n",
" 'treasure',\n",
" 'pacific',\n",
" 'science',\n",
" 'center',\n",
" 'leaf',\n",
" 'january',\n",
" 'time',\n",
" 'getting',\n",
" 'short',\n",
" 'decided',\n",
" 'better',\n",
" 'purchase',\n",
" 'vip',\n",
" 'ticket',\n",
" 'go',\n",
" 'specific',\n",
" 'time',\n",
" 'stand',\n",
" 'line',\n",
" 'opted',\n",
" 'hotel',\n",
" 'due',\n",
" 'proximity',\n",
" 'seattle',\n",
" 'center',\n",
" 'dissapointed',\n",
" 'secure',\n",
" 'underground',\n",
" 'parking',\n",
" 'space',\n",
" 'needle',\n",
" 'view',\n",
" 'king',\n",
" 'room',\n",
" 'pleased',\n",
" 'hotel',\n",
" 'bit',\n",
" 'older',\n",
" 'well',\n",
" 'kept',\n",
" 'updated',\n",
" 'furniture',\n",
" 'wise',\n",
" 'well',\n",
" 'room',\n",
" 'adjoining',\n",
" 'room',\n",
" 'group',\n",
" 'next',\n",
" 'door',\n",
" 'bit',\n",
" 'loud',\n",
" 'think',\n",
" 'door',\n",
" 'thick',\n",
" 'enough',\n",
" 'sound',\n",
" 'proof',\n",
" 'enough',\n",
" 'two',\n",
" 'room',\n",
" 'hotel',\n",
" 'staff',\n",
" 'efficient',\n",
" 'helpful',\n",
" 'package',\n",
" 'came',\n",
" 'free',\n",
" 'breakfast',\n",
" 'brella',\n",
" 'restaurant',\n",
" 'inside',\n",
" 'hotel',\n",
" 'passed',\n",
" 'breakfast',\n",
" 'nothing',\n",
" 'write',\n",
" 'home',\n",
" 'much',\n",
" 'better',\n",
" 'breakfast',\n",
" 'hotel',\n",
" 'included',\n",
" 'cost',\n",
" 'room',\n",
" 'feel',\n",
" 'sorry',\n",
" 'people',\n",
" 'paying',\n",
" 'breakfast',\n",
" 'top',\n",
" 'cost',\n",
" 'room',\n",
" 'offering',\n",
" 'much',\n",
" 'different',\n",
" 'hotel',\n",
" 'good',\n",
" 'sausage',\n",
" 'slimy',\n",
" 'egg',\n",
" 'stiff',\n",
" 'dry',\n",
" 'hash',\n",
" 'brown',\n",
" 'slimy',\n",
" 'good',\n",
" 'thing',\n",
" 'waffle',\n",
" 'made',\n",
" 'hot',\n",
" 'spot',\n",
" 'blessed',\n",
" 'decent',\n",
" 'weather',\n",
" 'another',\n",
" 'surprise',\n",
" 'seattle',\n",
" 'center',\n",
" 'dale',\n",
" 'chihuly',\n",
" 'garden',\n",
" 'glass',\n",
" 'exhibit',\n",
" 'added',\n",
" 'year',\n",
" 'arrived',\n",
" 'saw',\n",
" 'garden',\n",
" 'bit',\n",
" 'sunlight',\n",
" 'left',\n",
" 'experienced',\n",
" 'dark',\n",
" 'fantastic',\n",
" 'hotel',\n",
" 'excellent',\n",
" 'location',\n",
" 'would',\n",
" 'recommend',\n",
" 'space',\n",
" 'needle',\n",
" 'view',\n",
" 'however',\n",
" 'end',\n",
" 'hallway',\n",
" 'get',\n",
" 'small',\n",
" 'window',\n",
" 'offer',\n",
" 'free',\n",
" 'breakfast',\n",
" 'buffet',\n",
" 'best',\n",
" 'western',\n",
" 'member',\n",
" 'otherwise',\n",
" 'pay',\n",
" 'good',\n",
" 'buffet',\n",
" 'excellent',\n",
" 'seating',\n",
" 'v',\n",
" 'best',\n",
" 'western',\n",
" 'room',\n",
" 'nice',\n",
" 'pay',\n",
" 'resonable',\n",
" 'compared',\n",
" 'lodging',\n",
" 'closer',\n",
" 'downtown',\n",
" 'walking',\n",
" 'distant',\n",
" 'monorail',\n",
" 'take',\n",
" 'downtown',\n",
" 'however',\n",
" 'plan',\n",
" 'driving',\n",
" 'around',\n",
" 'may',\n",
" 'want',\n",
" 'look',\n",
" 'another',\n",
" 'location',\n",
" 'parking',\n",
" 'limited',\n",
" 'anything',\n",
" 'going',\n",
" 'seattle',\n",
" 'may',\n",
" 'find',\n",
" 'street',\n",
" 'parking',\n",
" 'block',\n",
" 'away',\n",
" 'left',\n",
" 'one',\n",
" 'night',\n",
" 'parking',\n",
" 'full',\n",
" 'counter',\n",
" 'personnel',\n",
" 'suggested',\n",
" 'lot',\n",
" 'two',\n",
" 'block',\n",
" 'charged',\n",
" 'hour',\n",
" 'happy',\n",
" 'camper',\n",
" 'walking',\n",
" 'issue',\n",
" 'defiantly',\n",
" 'would',\n",
" 'worked',\n",
" 'well',\n",
" 'accommodation',\n",
" 'wall',\n",
" 'thin',\n",
" 'unless',\n",
" 'turn',\n",
" 'heater',\n",
" 'feel',\n",
" 'like',\n",
" 'sleeping',\n",
" 'sheet',\n",
" 'light',\n",
" 'blanket',\n",
" 'pull',\n",
" 'closet',\n",
" 'suggest',\n",
" 'hotel',\n",
" 'restaurant',\n",
" 'front',\n",
" 'desk',\n",
" 'steered',\n",
" 'u',\n",
" 'better',\n",
" 'food',\n",
" 'husband',\n",
" 'stayed',\n",
" 'saturday',\n",
" 'would',\n",
" 'recommend',\n",
" 'others',\n",
" 'staff',\n",
" 'friendly',\n",
" 'quiet',\n",
" 'relaxing',\n",
" 'room',\n",
" 'clean',\n",
" 'bed',\n",
" 'comfortable',\n",
" 'good',\n",
" 'price',\n",
" 'close',\n",
" 'space',\n",
" 'needle',\n",
" 'would',\n",
" 'definately',\n",
" 'stay',\n",
" 'next',\n",
" 'time',\n",
" 'go',\n",
" 'seattle',\n",
" 'overall',\n",
" 'great',\n",
" 'hotel',\n",
" 'dong',\n",
" 'quite',\n",
" 'bit',\n",
" 'traveling',\n",
" 'since',\n",
" 'retired',\n",
" 'end',\n",
" 'july',\n",
" 'best',\n",
" 'wester',\n",
" 'plus',\n",
" 'executive',\n",
" 'inn',\n",
" 'one',\n",
" 'favorite',\n",
" 'staff',\n",
" 'attitude',\n",
" 'make',\n",
" 'stay',\n",
" 'great',\n",
" 'experience',\n",
" 'wonderful',\n",
" 'time',\n",
" 'thanks',\n",
" 'helpful',\n",
" 'staff',\n",
" 'te',\n",
" 'great',\n",
" 'giving',\n",
" 'u',\n",
" 'direction',\n",
" 'advise',\n",
" 'go',\n",
" 'get',\n",
" 'seattle',\n",
" 'awesome',\n",
" 'location',\n",
" 'great',\n",
" 'view',\n",
" 'room',\n",
" 'clean',\n",
" 'well',\n",
" 'stocked',\n",
" 'wifi',\n",
" 'flawless',\n",
" 'short',\n",
" 'one',\n",
" 'night',\n",
" 'stay',\n",
" 'average',\n",
" 'nothing',\n",
" 'complain',\n",
" 'nothing',\n",
" 'brag',\n",
" 'either',\n",
" 'feel',\n",
" 'plus',\n",
" 'status',\n",
" 'hotel',\n",
" 'marketing',\n",
" 'stretch',\n",
" 'give',\n",
" 'credit',\n",
" 'property',\n",
" 'location',\n",
" 'location',\n",
" 'location',\n",
" 'easy',\n",
" 'walk',\n",
" 'many',\n",
" 'seattle',\n",
" 'attraction',\n",
" 'around',\n",
" 'space',\n",
" 'needle',\n",
" 'reserved',\n",
" 'july',\n",
" 'occupancy',\n",
" 'room',\n",
" 'queen',\n",
" 'bed',\n",
" 'traveling',\n",
" 'adult',\n",
" 'son',\n",
" 'november',\n",
" 'concert',\n",
" 'key',\n",
" 'arena',\n",
" 'good',\n",
" 'thing',\n",
" 'say',\n",
" 'close',\n",
" 'handy',\n",
" 'concert',\n",
" 'arrived',\n",
" 'check',\n",
" 'told',\n",
" 'room',\n",
" 'bed',\n",
" 'roll',\n",
" 'away',\n",
" 'sent',\n",
" 'saddest',\n",
" 'bed',\n",
" 'ever',\n",
" 'saw',\n",
" 'delivered',\n",
" 'rude',\n",
" 'housekeeper',\n",
" 'acted',\n",
" 'mad',\n",
" 'bring',\n",
" 'u',\n",
" 'bed',\n",
" 'extra',\n",
" 'blanket',\n",
" 'pillow',\n",
" 'even',\n",
" 'offer',\n",
" 'set',\n",
" 'bed',\n",
" 'room',\n",
" 'floor',\n",
" 'elevator',\n",
" 'noisy',\n",
" 'setting',\n",
" 'heard',\n",
" 'guest',\n",
" 'housekeeper',\n",
" 'coming',\n",
" 'going',\n",
" 'view',\n",
" 'roof',\n",
" 'top',\n",
" 'neighboring',\n",
" 'building',\n",
" 'lower',\n",
" 'parking',\n",
" 'area',\n",
" 'good',\n",
" 'place',\n",
" 'nearby',\n",
" 'eat',\n",
" 'nice',\n",
" 'shower',\n",
" 'lot',\n",
" 'water',\n",
" 'pressure',\n",
" 'though',\n",
" 'tub',\n",
" 'drain',\n",
" 'stayed',\n",
" 'night',\n",
" 'went',\n",
" 'rush',\n",
" 'concert',\n",
" 'key',\n",
" 'arena',\n",
" 'excellent',\n",
" 'concert',\n",
" 'way',\n",
" 'location',\n",
" 'perfect',\n",
" 'contrary',\n",
" 'review',\n",
" 'trouble',\n",
" 'finding',\n",
" 'excellent',\n",
" 'view',\n",
" 'space',\n",
" 'needle',\n",
" 'floor',\n",
" 'room',\n",
" 'yes',\n",
" 'older',\n",
" 'hotel',\n",
" 'need',\n",
" 'reno',\n",
" 'understand',\n",
" 'working',\n",
" 'room',\n",
" 'plenty',\n",
" 'space',\n",
" 'everything',\n",
" 'worked',\n",
" 'well',\n",
" 'main',\n",
" 'issue',\n",
" 'noise',\n",
" 'coming',\n",
" 'guest',\n",
" 'next',\n",
" 'door',\n",
" 'one',\n",
" 'night',\n",
" 'child',\n",
" 'loud',\n",
" 'next',\n",
" 'someone',\n",
" 'talking',\n",
" 'loud',\n",
" 'morning',\n",
" 'may',\n",
" 'adjoining',\n",
" 'room',\n",
" 'next',\n",
" 'time',\n",
" 'would',\n",
" 'request',\n",
" 'room',\n",
" 'connecting',\n",
" 'door',\n",
" 'lesson',\n",
" 'learned',\n",
" 'breakfast',\n",
" 'better',\n",
" 'many',\n",
" 'get',\n",
" 'best',\n",
" 'western',\n",
" 'plenty',\n",
" 'choice',\n",
" 'get',\n",
" 'bored',\n",
" 'day',\n",
" 'stay',\n",
" 'food',\n",
" 'restaurant',\n",
" 'evening',\n",
" 'rather',\n",
" 'bland',\n",
" 'ate',\n",
" 'nice',\n",
" 'fridge',\n",
" 'microwave',\n",
" 'room',\n",
" 'always',\n",
" 'ask',\n",
" 'fridge',\n",
" 'least',\n",
" 'staff',\n",
" 'hotel',\n",
" 'always',\n",
" 'polite',\n",
" 'friendly',\n",
" 'even',\n",
" 'line',\n",
" 'guest',\n",
" 'waiting',\n",
" 'seen',\n",
" 'also',\n",
" 'impressed',\n",
" 'manager',\n",
" 'see',\n",
" 'responds',\n",
" 'review',\n",
" 'doubt',\n",
" 'comment',\n",
" 'suggestion',\n",
" 'future',\n",
" 'bw',\n",
" 'plus',\n",
" 'proper',\n",
" 'glass',\n",
" 'mug',\n",
" 'well',\n",
" 'disposable',\n",
" 'option',\n",
" 'seen',\n",
" 'report',\n",
" 'cleaned',\n",
" 'place',\n",
" 'guest',\n",
" 'wash',\n",
" 'also',\n",
" 'think',\n",
" 'advising',\n",
" 'guest',\n",
" 'room',\n",
" 'connecting',\n",
" 'door',\n",
" 'next',\n",
" 'room',\n",
" 'decide',\n",
" 'whether',\n",
" 'stay',\n",
" 'room',\n",
" 'hotel',\n",
" 'great',\n",
" 'location',\n",
" 'view',\n",
" 'space',\n",
" 'needle',\n",
" 'within',\n",
" 'easy',\n",
" 'walking',\n",
" 'distance',\n",
" 'monorail',\n",
" 'downtown',\n",
" 'service',\n",
" 'patchy',\n",
" 'however',\n",
" 'lobby',\n",
" 'room',\n",
" 'little',\n",
" 'dingy',\n",
" 'need',\n",
" 'modernization',\n",
" 'particularly',\n",
" 'view',\n",
" 'price',\n",
" 'parking',\n",
" 'extra',\n",
" 'also',\n",
" 'get',\n",
" 'rate',\n",
" 'includes',\n",
" 'breakfast',\n",
" 'although',\n",
" 'breakfast',\n",
" 'nothing',\n",
" 'special',\n",
" 'bar',\n",
" 'nice',\n",
" 'selection',\n",
" 'locally',\n",
" 'brewed',\n",
" 'ale',\n",
" 'tap',\n",
" 'wanted',\n",
" 'place',\n",
" 'close',\n",
" 'museum',\n",
" 'attraction',\n",
" 'near',\n",
" 'space',\n",
" 'needle',\n",
" 'took',\n",
" 'advantage',\n",
" 'hotel',\n",
" 'king',\n",
" 'tut',\n",
" 'package',\n",
" 'hotel',\n",
" 'literally',\n",
" 'two',\n",
" 'block',\n",
" 'everything',\n",
" 'price',\n",
" 'better',\n",
" 'others',\n",
" 'checked',\n",
" 'online',\n",
" 'parked',\n",
" 'car',\n",
" 'hotel',\n",
" 'move',\n",
" 'entire',\n",
" 'time',\n",
" 'night',\n",
" 'one',\n",
" 'big',\n",
" 'advantage',\n",
" 'staying',\n",
" 'king',\n",
" 'tut',\n",
" 'package',\n",
" 'gave',\n",
" 'u',\n",
" 'vip',\n",
" 'ticket',\n",
" 'meant',\n",
" 'could',\n",
" 'go',\n",
" 'see',\n",
" 'exhibit',\n",
" 'time',\n",
" 'open',\n",
" 'rather',\n",
" 'date',\n",
" 'time',\n",
" 'specific',\n",
" 'ticket',\n",
" 'motel',\n",
" 'room',\n",
" 'clean',\n",
" 'adequate',\n",
" 'spend',\n",
" 'much',\n",
" 'time',\n",
" 'view',\n",
" 'room',\n",
" 'pretty',\n",
" 'blah',\n",
" 'overlooking',\n",
" 'parking',\n",
" 'lot',\n",
" 'except',\n",
" 'get',\n",
" 'watch',\n",
" 'one',\n",
" 'really',\n",
" 'big',\n",
" 'building',\n",
" 'crane',\n",
" 'operation',\n",
" 'aware',\n",
" 'least',\n",
" 'part',\n",
" 'seattle',\n",
" 'seemed',\n",
" 'construction',\n",
" 'mode',\n",
" 'hotel',\n",
" 'fault',\n",
" 'awakened',\n",
" 'one',\n",
" 'morning',\n",
" 'non',\n",
" 'stop',\n",
" 'jack',\n",
" 'hammering',\n",
" 'street',\n",
" 'even',\n",
" 'earlier',\n",
" 'second',\n",
" 'morning',\n",
" 'arrival',\n",
" 'garbage',\n",
" 'truck',\n",
" 'plus',\n",
" 'side',\n",
" 'room',\n",
" 'refrigerator',\n",
" 'microwave',\n",
" 'arm',\n",
" 'chairm',\n",
" 'bathroom',\n",
" 'tiny',\n",
" 'try',\n",
" 'sit',\n",
" 'toilet',\n",
" 'close',\n",
" 'door',\n",
" 'time',\n",
" 'comment',\n",
" 'maid',\n",
" 'service',\n",
" 'say',\n",
" 'respect',\n",
" 'privacy',\n",
" 'clean',\n",
" 'room',\n",
" 'since',\n",
" 'put',\n",
" 'disturb',\n",
" 'sign',\n",
" 'leave',\n",
" 'u',\n",
" 'note',\n",
" 'day',\n",
" 'telling',\n",
" 'u',\n",
" 'want',\n",
" 'anything',\n",
" 'serviced',\n",
" 'let',\n",
" 'front',\n",
" 'desk',\n",
" 'know',\n",
" 'thought',\n",
" 'nice',\n",
" 'way',\n",
" 'handle',\n",
" 'breakfast',\n",
" 'hotel',\n",
" 'restaurant',\n",
" 'morning',\n",
" 'convenience',\n",
" 'first',\n",
" 'day',\n",
" 'part',\n",
" 'package',\n",
" 'good',\n",
" 'variety',\n",
" 'make',\n",
" 'breakfast',\n",
" 'burrito',\n",
" 'waffle',\n",
" 'oatmeal',\n",
" 'bacon',\n",
" 'etc',\n",
" 'although',\n",
" 'exact',\n",
" 'choice',\n",
" 'day',\n",
" 'got',\n",
" 'little',\n",
" 'boring',\n",
" 'third',\n",
" 'morning',\n",
" 'breakfast',\n",
" 'family',\n",
" 'stay',\n",
" 'another',\n",
" 'hotel',\n",
" 'area',\n",
" 'earlier',\n",
" 'year',\n",
" 'surprised',\n",
" 'air',\n",
" 'conditioning',\n",
" 'hotel',\n",
" 'air',\n",
" 'conditioning',\n",
" 'worked',\n",
" 'fine',\n",
" 'definitely',\n",
" 'adequate',\n",
" 'place',\n",
" 'reasonable',\n",
" 'price',\n",
" 'seattle',\n",
" 'super',\n",
" 'location',\n",
" 'stayed',\n",
" 'three',\n",
" 'night',\n",
" 'enjoyed',\n",
" 'every',\n",
" 'moment',\n",
" 'room',\n",
" 'location',\n",
" 'helpful',\n",
" 'staff',\n",
" 'cleanliness',\n",
" 'value',\n",
" 'comfort',\n",
" 'everything',\n",
" 'spot',\n",
" 'staff',\n",
" 'friendly',\n",
" 'helpful',\n",
" 'highly',\n",
" 'recommended',\n",
" 'return',\n",
" 'wanted',\n",
" 'hotel',\n",
" 'near',\n",
" 'space',\n",
" 'needle',\n",
" 'knew',\n",
" 'traffic',\n",
" 'would',\n",
" 'nite',\n",
" 'mare',\n",
" 'trying',\n",
" 'find',\n",
" 'place',\n",
" 'hard',\n",
" 'check',\n",
" 'horrific',\n",
" 'person',\n",
" 'come',\n",
" 'really',\n",
" 'took',\n",
" 'minute',\n",
" 'hour',\n",
" 'drive',\n",
" 'beyond',\n",
" 'comprehension',\n",
" 'got',\n",
" 'lucky',\n",
" 'got',\n",
" 'room',\n",
" 'view',\n",
" 'without',\n",
" 'pay',\n",
" 'extra',\n",
" 'night',\n",
" 'helped',\n",
" 'little',\n",
" 'bit',\n",
" 'hall',\n",
" 'room',\n",
" 'dirty',\n",
" 'bed',\n",
" 'oh',\n",
" 'goodness',\n",
" 'horrible',\n",
" 'uncomfortable',\n",
" 'son',\n",
" 'said',\n",
" 'felt',\n",
" 'like',\n",
" 'really',\n",
" 'big',\n",
" 'person',\n",
" 'slept',\n",
" 'need',\n",
" 'replaced',\n",
" 'spring',\n",
" 'totally',\n",
" 'broken',\n",
" 'pleased',\n",
" 'breakfast',\n",
" 'day',\n",
" 'paid',\n",
" 'day',\n",
" 'change',\n",
" 'closed',\n",
" 'exactly',\n",
" 'exception',\n",
" 'mon',\n",
" 'afternoon',\n",
" 'thurs',\n",
" 'put',\n",
" 'complaint',\n",
" 'card',\n",
" 'got',\n",
" 'e',\n",
" 'mail',\n",
" 'response',\n",
" 'replied',\n",
" 'back',\n",
" 'nothing',\n",
" 'sure',\n",
" 'management',\n",
" 'anything',\n",
" 'night',\n",
" 'almost',\n",
" 'mortgage',\n",
" 'payment',\n",
" 'thought',\n",
" 'would',\n",
" 'better',\n",
" 'room',\n",
" 'ever',\n",
" 'worried',\n",
" 'next',\n",
" 'night',\n",
" 'best',\n",
" 'western',\n",
" 'portland',\n",
" 'let',\n",
" 'see',\n",
" 'checked',\n",
" 'best',\n",
" 'western',\n",
" 'executive',\n",
" 'part',\n",
" 'blackball',\n",
" 'ferry',\n",
" 'package',\n",
" 'hotel',\n",
" ...]"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"preprocess(final_df[\"reviews\"][2])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Application on all reviews"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can now apply `preprocess` to every entry in the `reviews` column of `final_df` to build our **corpus**: a list of token lists, one per row."
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"output_embedded_package_id": "1uz0rISpNcAAou-Y97DrNwFCi5iVTMNJk"
},
"executionInfo": {
"elapsed": 217285,
"status": "ok",
"timestamp": 1730740623933,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "O1uV1hBv4r8_",
"outputId": "73340db7-9f8a-424c-fe58-38644bca2550"
},
"outputs": [
{
"data": {
"text/plain": [
"[['make',\n",
" 'fast',\n",
" 'visit',\n",
" 'seattle',\n",
" 'found',\n",
" 'pioneer',\n",
" 'regret',\n",
" 'hotel',\n",
" 'comfortable',\n",
" 'clean',\n",
" 'else',\n",
" 'mm',\n",
" 'nice',\n",
" 'area',\n",
" 'everything',\n",
" 'walking',\n",
" 'distance',\n",
" 'guess',\n",
" 'breakfast',\n",
" 'something',\n",
" 'else',\n",
" 'bread',\n",
" 'waffle',\n",
" 'fourth',\n",
" 'fifth',\n",
" 'time',\n",
" 'staying',\n",
" 'best',\n",
" 'western',\n",
" 'pioneer',\n",
" 'square',\n",
" 'like',\n",
" 'location',\n",
" 'within',\n",
" 'walking',\n",
" 'distance',\n",
" 'downtown',\n",
" 'waterfront',\n",
" 'ball',\n",
" 'field',\n",
" 'get',\n",
" 'town',\n",
" 'park',\n",
" 'available',\n",
" 'garage',\n",
" 'rest',\n",
" 'trip',\n",
" 'foot',\n",
" 'minute',\n",
" 'walk',\n",
" 'action',\n",
" 'downtown',\n",
" 'pike',\n",
" 'street',\n",
" 'shopping',\n",
" 'etc',\n",
" 'may',\n",
" 'far',\n",
" 'walk',\n",
" 'like',\n",
" 'dealing',\n",
" 'driving',\n",
" 'parking',\n",
" 'downtown',\n",
" 'place',\n",
" 'nothing',\n",
" 'fancy',\n",
" 'hotel',\n",
" 'room',\n",
" 'nice',\n",
" 'clean',\n",
" 'staff',\n",
" 'helpful',\n",
" 'friendly',\n",
" 'view',\n",
" 'room',\n",
" 'also',\n",
" 'leave',\n",
" 'much',\n",
" 'desired',\n",
" 'fact',\n",
" 'better',\n",
" 'worrying',\n",
" 'gazing',\n",
" 'window',\n",
" 'place',\n",
" 'location',\n",
" 'atmosphere',\n",
" 'old',\n",
" 'seattle',\n",
" 'continental',\n",
" 'breakfast',\n",
" 'taken',\n",
" 'advantage',\n",
" 'previous',\n",
" 'visit',\n",
" 'fine',\n",
" 'want',\n",
" 'quick',\n",
" 'bite',\n",
" 'morning',\n",
" 'make',\n",
" 'unplanned',\n",
" 'visit',\n",
" 'seattle',\n",
" 'due',\n",
" 'lost',\n",
" 'passport',\n",
" 'chose',\n",
" 'least',\n",
" 'expensive',\n",
" 'hotel',\n",
" 'near',\n",
" 'passport',\n",
" 'office',\n",
" 'expecting',\n",
" 'boring',\n",
" 'chain',\n",
" 'hotel',\n",
" 'delightful',\n",
" 'surprise',\n",
" 'best',\n",
" 'western',\n",
" 'like',\n",
" 'boutique',\n",
" 'hotel',\n",
" 'lobby',\n",
" 'lovely',\n",
" 'comfortable',\n",
" 'staff',\n",
" 'warm',\n",
" 'helpful',\n",
" 'room',\n",
" 'service',\n",
" 'food',\n",
" 'wonderful',\n",
" 'italian',\n",
" 'restaurant',\n",
" 'next',\n",
" 'door',\n",
" 'definitely',\n",
" 'back',\n",
" 'say',\n",
" 'rate',\n",
" 'seem',\n",
" 'fluctuate',\n",
" 'quite',\n",
" 'lot',\n",
" 'lowest',\n",
" 'best',\n",
" 'deal',\n",
" 'excluding',\n",
" 'priceline',\n",
" 'hotwire',\n",
" 'thing',\n",
" 'seattle',\n",
" 'going',\n",
" 'stayed',\n",
" 'many',\n",
" 'time',\n",
" 'every',\n",
" 'single',\n",
" 'stay',\n",
" 'fantastic',\n",
" 'value',\n",
" 'love',\n",
" 'building',\n",
" 'historic',\n",
" 'building',\n",
" 'really',\n",
" 'like',\n",
" 'big',\n",
" 'old',\n",
" 'stairway',\n",
" 'use',\n",
" 'rather',\n",
" 'small',\n",
" 'elevator',\n",
" 'crammed',\n",
" 'building',\n",
" 'later',\n",
" 'date',\n",
" 'place',\n",
" 'kept',\n",
" 'clean',\n",
" 'room',\n",
" 'range',\n",
" 'fairly',\n",
" 'small',\n",
" 'pretty',\n",
" 'big',\n",
" 'bed',\n",
" 'good',\n",
" 'bathroom',\n",
" 'good',\n",
" 'small',\n",
" 'maintenance',\n",
" 'property',\n",
" 'seems',\n",
" 'excellent',\n",
" 'location',\n",
" 'spectacular',\n",
" 'link',\n",
" 'light',\n",
" 'rail',\n",
" 'seattle',\n",
" 'fantastic',\n",
" 'rail',\n",
" 'system',\n",
" 'seatac',\n",
" 'westlake',\n",
" 'short',\n",
" 'walk',\n",
" 'hotel',\n",
" 'station',\n",
" 'name',\n",
" 'wait',\n",
" 'pioneer',\n",
" 'square',\n",
" 'kind',\n",
" 'easy',\n",
" 'remember',\n",
" 'given',\n",
" 'name',\n",
" 'hotel',\n",
" 'huh',\n",
" 'easy',\n",
" 'walking',\n",
" 'riding',\n",
" 'public',\n",
" 'transport',\n",
" 'downtown',\n",
" 'staff',\n",
" 'nice',\n",
" 'always',\n",
" 'greet',\n",
" 'kindly',\n",
" 'gentleman',\n",
" 'checking',\n",
" 'time',\n",
" 'even',\n",
" 'apologized',\n",
" 'noticed',\n",
" 'multiple',\n",
" 'time',\n",
" 'know',\n",
" 'name',\n",
" 'checking',\n",
" 'early',\n",
" 'asked',\n",
" 'day',\n",
" 'plan',\n",
" 'learning',\n",
" 'leaving',\n",
" 'late',\n",
" 'flight',\n",
" 'promptly',\n",
" 'offered',\n",
" 'check',\n",
" 'bag',\n",
" 'need',\n",
" 'great',\n",
" 'staff',\n",
" 'try',\n",
" 'anticipate',\n",
" 'need',\n",
" 'well',\n",
" 'managed',\n",
" 'location',\n",
" 'free',\n",
" 'breakfast',\n",
" 'typical',\n",
" 'hotel',\n",
" 'fare',\n",
" 'reasonable',\n",
" 'quality',\n",
" 'lot',\n",
" 'restaurant',\n",
" 'around',\n",
" 'want',\n",
" 'grab',\n",
" 'bite',\n",
" 'eat',\n",
" 'grab',\n",
" 'take',\n",
" 'away',\n",
" 'room',\n",
" 'really',\n",
" 'like',\n",
" 'inexpensive',\n",
" 'friendly',\n",
" 'asian',\n",
" 'restaurant',\n",
" 'right',\n",
" 'next',\n",
" 'door',\n",
" 'east',\n",
" 'side',\n",
" 'front',\n",
" 'building',\n",
" 'recall',\n",
" 'name',\n",
" 'would',\n",
" 'review',\n",
" 'highly',\n",
" 'recommend',\n",
" 'hotel',\n",
" 'particularly',\n",
" 'pay',\n",
" 'one',\n",
" 'low',\n",
" 'rate',\n",
" 'absolutely',\n",
" 'excellent',\n",
" 'value',\n",
" 'kudos',\n",
" 'management',\n",
" 'staff',\n",
" 'get',\n",
" 'know',\n",
" 'area',\n",
" 'get',\n",
" 'info',\n",
" 'staff',\n",
" 'take',\n",
" 'time',\n",
" 'enjoy',\n",
" 'seattle',\n",
" 'waterfront',\n",
" 'perhaps',\n",
" 'ride',\n",
" 'ferry',\n",
" 'boat',\n",
" 'two',\n",
" 'across',\n",
" 'sound',\n",
" 'ready',\n",
" 'ever',\n",
" 'changing',\n",
" 'weather',\n",
" 'part',\n",
" 'seattle',\n",
" 'charm',\n",
" 'time',\n",
" 'slowly',\n",
" 'walk',\n",
" 'around',\n",
" 'give',\n",
" 'depth',\n",
" 'picture',\n",
" 'seattle',\n",
" 'water',\n",
" 'front',\n",
" 'activity',\n",
" 'also',\n",
" 'always',\n",
" 'something',\n",
" 'stimulate',\n",
" 'everyones',\n",
" 'attention',\n",
" 'perfectly',\n",
" 'renovated',\n",
" 'historical',\n",
" 'hotel',\n",
" 'immaculate',\n",
" 'room',\n",
" 'comfortable',\n",
" 'hotel',\n",
" 'staff',\n",
" 'courteous',\n",
" 'knowledgeable',\n",
" 'location',\n",
" 'less',\n",
" 'block',\n",
" 'waterfront',\n",
" 'ferry',\n",
" 'pier',\n",
" 'two',\n",
" 'block',\n",
" 'light',\n",
" 'rail',\n",
" 'station',\n",
" 'fare',\n",
" 'seatac',\n",
" 'step',\n",
" 'back',\n",
" 'greeted',\n",
" 'historic',\n",
" 'ambiance',\n",
" 'rich',\n",
" 'dark',\n",
" 'wood',\n",
" 'welcoming',\n",
" 'fireplace',\n",
" 'sitting',\n",
" 'area',\n",
" 'original',\n",
" 'grand',\n",
" 'staircase',\n",
" 'period',\n",
" 'furnishing',\n",
" 'floor',\n",
" 'landing',\n",
" 'quiet',\n",
" 'room',\n",
" 'beautifully',\n",
" 'decorated',\n",
" 'heavy',\n",
" 'brocade',\n",
" 'drapery',\n",
" 'tapestry',\n",
" 'highback',\n",
" 'sitting',\n",
" 'chair',\n",
" 'loveseat',\n",
" 'setee',\n",
" 'besides',\n",
" 'two',\n",
" 'comfortable',\n",
" 'bed',\n",
" 'flat',\n",
" 'screen',\n",
" 'tv',\n",
" 'keurig',\n",
" 'coffeemaker',\n",
" 'paul',\n",
" 'newman',\n",
" 'coffee',\n",
" 'product',\n",
" 'writing',\n",
" 'desk',\n",
" 'chair',\n",
" 'plush',\n",
" 'bathroom',\n",
" 'towel',\n",
" 'sister',\n",
" 'sky',\n",
" 'toiletry',\n",
" 'expanded',\n",
" 'continental',\n",
" 'breakfast',\n",
" 'offered',\n",
" 'array',\n",
" 'cold',\n",
" 'hot',\n",
" 'cereal',\n",
" 'fresh',\n",
" 'fruit',\n",
" 'hot',\n",
" 'bacon',\n",
" 'egg',\n",
" 'cheese',\n",
" 'wrap',\n",
" 'juice',\n",
" 'hot',\n",
" 'beverage',\n",
" 'fresh',\n",
" 'belgian',\n",
" 'waffle',\n",
" 'pastry',\n",
" 'muffin',\n",
" 'bagel',\n",
" 'actual',\n",
" 'bagel',\n",
" 'slicer',\n",
" 'highly',\n",
" 'recommend',\n",
" 'room',\n",
" 'balcony',\n",
" 'step',\n",
" 'experience',\n",
" 'ambiance',\n",
" 'area',\n",
" 'seattle',\n",
" 'transportation',\n",
" 'system',\n",
" 'convenient',\n",
" 'taxi',\n",
" 'reasonable',\n",
" 'otherwise',\n",
" 'parking',\n",
" 'hard',\n",
" 'find',\n",
" 'expensive',\n",
" 'plan',\n",
" 'make',\n",
" 'trip',\n",
" 'seattle',\n",
" 'great',\n",
" 'place',\n",
" 'stay',\n",
" 'convenient',\n",
" 'transportation',\n",
" 'stayed',\n",
" 'nice',\n",
" 'hotel',\n",
" 'recently',\n",
" 'night',\n",
" 'great',\n",
" 'location',\n",
" 'walking',\n",
" 'distance',\n",
" 'football',\n",
" 'stadium',\n",
" 'football',\n",
" 'game',\n",
" 'shop',\n",
" 'restaurant',\n",
" 'plus',\n",
" 'pier',\n",
" 'pike',\n",
" 'market',\n",
" 'took',\n",
" 'light',\n",
" 'rail',\n",
" 'airport',\n",
" 'get',\n",
" 'hotel',\n",
" 'get',\n",
" 'pioneer',\n",
" 'square',\n",
" 'stop',\n",
" 'walk',\n",
" 'block',\n",
" 'hotel',\n",
" 'lot',\n",
" 'luggage',\n",
" 'may',\n",
" 'easy',\n",
" 'customer',\n",
" 'service',\n",
" 'hotel',\n",
" 'excellent',\n",
" 'staff',\n",
" 'encountered',\n",
" 'friendly',\n",
" 'helpful',\n",
" 'room',\n",
" 'clean',\n",
" 'comfortable',\n",
" 'everything',\n",
" 'needed',\n",
" 'used',\n",
" 'city',\n",
" 'noise',\n",
" 'might',\n",
" 'like',\n",
" 'location',\n",
" 'little',\n",
" 'noisy',\n",
" 'street',\n",
" 'surrounding',\n",
" 'hotel',\n",
" 'calm',\n",
" 'overnight',\n",
" 'bothered',\n",
" 'noise',\n",
" 'good',\n",
" 'night',\n",
" 'sleep',\n",
" 'every',\n",
" 'night',\n",
" 'morning',\n",
" 'breakfast',\n",
" 'good',\n",
" 'nothing',\n",
" 'fancy',\n",
" 'good',\n",
" 'selection',\n",
" 'pleasant',\n",
" 'stay',\n",
" 'stay',\n",
" 'next',\n",
" 'trip',\n",
" 'seattle',\n",
" 'train',\n",
" 'got',\n",
" 'u',\n",
" 'seattle',\n",
" 'morning',\n",
" 'hotel',\n",
" 'convenient',\n",
" 'station',\n",
" 'room',\n",
" 'ready',\n",
" 'u',\n",
" 'beautiful',\n",
" 'room',\n",
" 'great',\n",
" 'furnishing',\n",
" 'cooky',\n",
" 'milk',\n",
" 'delivered',\n",
" 'room',\n",
" 'check',\n",
" 'later',\n",
" 'checked',\n",
" 'see',\n",
" 'needed',\n",
" 'towel',\n",
" 'great',\n",
" 'staff',\n",
" 'great',\n",
" 'place',\n",
" 'liked',\n",
" 'way',\n",
" 'room',\n",
" 'set',\n",
" 'customer',\n",
" 'service',\n",
" 'excellent',\n",
" 'enjoyed',\n",
" 'close',\n",
" 'train',\n",
" 'station',\n",
" 'attraction',\n",
" 'saw',\n",
" 'breakfast',\n",
" 'great',\n",
" 'everyone',\n",
" 'lively',\n",
" 'friendly',\n",
" 'enjoyed',\n",
" 'stay',\n",
" 'thank',\n",
" 'train',\n",
" 'travel',\n",
" 'consultant',\n",
" 'ted',\n",
" 'sylvia',\n",
" 'blishak',\n",
" 'landing',\n",
" 'fine',\n",
" 'little',\n",
" 'hotel',\n",
" 'one',\n",
" 'best',\n",
" 'seen',\n",
" 'u',\n",
" 'convenient',\n",
" 'train',\n",
" 'station',\n",
" 'two',\n",
" 'night',\n",
" 'comfortable',\n",
" 'really',\n",
" 'helped',\n",
" 'u',\n",
" 'prepare',\n",
" 'canadian',\n",
" 'transcontinental',\n",
" 'train',\n",
" 'would',\n",
" 'board',\n",
" 'day',\n",
" 'ahead',\n",
" 'hotel',\n",
" 'almost',\n",
" 'tucked',\n",
" 'away',\n",
" 'side',\n",
" 'street',\n",
" 'cabbie',\n",
" 'bit',\n",
" 'challenge',\n",
" 'finding',\n",
" 'check',\n",
" 'flawless',\n",
" 'received',\n",
" 'large',\n",
" 'well',\n",
" 'appointed',\n",
" 'room',\n",
" 'top',\n",
" 'floor',\n",
" 'best',\n",
" 'western',\n",
" 'frequent',\n",
" 'guest',\n",
" 'card',\n",
" 'helped',\n",
" 'upgrade',\n",
" 'although',\n",
" 'considerable',\n",
" 'construction',\n",
" 'noise',\n",
" 'underground',\n",
" 'tunneling',\n",
" 'operation',\n",
" 'shut',\n",
" 'promptly',\n",
" 'sleepless',\n",
" 'seattle',\n",
" 'best',\n",
" 'western',\n",
" 'morning',\n",
" 'breakfast',\n",
" 'offering',\n",
" 'good',\n",
" 'providing',\n",
" 'enough',\n",
" 'tasty',\n",
" 'sustenance',\n",
" 'make',\n",
" 'lunch',\n",
" 'hand',\n",
" 'many',\n",
" 'trendy',\n",
" 'restaurant',\n",
" 'abound',\n",
" 'within',\n",
" 'neighboring',\n",
" 'block',\n",
" 'historic',\n",
" 'district',\n",
" 'choice',\n",
" 'hotel',\n",
" 'also',\n",
" 'relatively',\n",
" 'close',\n",
" 'sport',\n",
" 'stadium',\n",
" 'downtown',\n",
" 'shopping',\n",
" 'pike',\n",
" 'place',\n",
" 'market',\n",
" 'mile',\n",
" 'hike',\n",
" 'away',\n",
" 'caught',\n",
" 'vancouver',\n",
" 'bc',\n",
" 'train',\n",
" 'second',\n",
" 'morning',\n",
" 'advise',\n",
" 'allowing',\n",
" 'extra',\n",
" 'time',\n",
" 'get',\n",
" 'station',\n",
" 'due',\n",
" 'construction',\n",
" 'impediment',\n",
" 'drawback',\n",
" 'people',\n",
" 'coming',\n",
" 'car',\n",
" 'site',\n",
" 'parking',\n",
" 'inconvenient',\n",
" 'complicated',\n",
" 'expensive',\n",
" 'bottom',\n",
" 'line',\n",
" 'definitely',\n",
" 'best',\n",
" 'destination',\n",
" 'folk',\n",
" 'arriving',\n",
" 'rail',\n",
" 'also',\n",
" 'likely',\n",
" 'one',\n",
" 'best',\n",
" 'kept',\n",
" 'lodging',\n",
" 'secret',\n",
" 'seattle',\n",
" 'coming',\n",
" 'car',\n",
" 'parking',\n",
" 'situation',\n",
" 'causing',\n",
" 'nevertheless',\n",
" 'highly',\n",
" 'recommend',\n",
" 'pioneer',\n",
" 'square',\n",
" 'best',\n",
" 'western',\n",
" 'next',\n",
" 'visit',\n",
" 'seattle',\n",
" 'stayed',\n",
" 'june',\n",
" 'clean',\n",
" 'comfortable',\n",
" 'room',\n",
" 'great',\n",
" 'rate',\n",
" 'breakfast',\n",
" 'basic',\n",
" 'included',\n",
" 'best',\n",
" 'thing',\n",
" 'hotel',\n",
" 'service',\n",
" 'guest',\n",
" 'service',\n",
" 'associate',\n",
" 'incredibly',\n",
" 'helpful',\n",
" 'went',\n",
" 'way',\n",
" 'make',\n",
" 'sure',\n",
" 'accommodation',\n",
" 'suitable',\n",
" 'exceeded',\n",
" 'suitable',\n",
" 'great',\n",
" 'location',\n",
" 'city',\n",
" 'picked',\n",
" 'hotel',\n",
" 'simply',\n",
" 'location',\n",
" 'daughter',\n",
" 'apartment',\n",
" 'downtown',\n",
" 'would',\n",
" 'stay',\n",
" 'even',\n",
" 'live',\n",
" 'close',\n",
" 'people',\n",
" 'friendly',\n",
" 'even',\n",
" 'arrived',\n",
" 'midnight',\n",
" 'first',\n",
" 'thing',\n",
" 'morning',\n",
" 'room',\n",
" 'clean',\n",
" 'picky',\n",
" 'everyday',\n",
" 'returned',\n",
" 'breakfast',\n",
" 'norm',\n",
" 'dining',\n",
" 'area',\n",
" 'staffed',\n",
" 'sightseeing',\n",
" 'would',\n",
" 'say',\n",
" 'location',\n",
" 'excellent',\n",
" 'well',\n",
" 'back',\n",
" 'sure',\n",
" 'loved',\n",
" 'location',\n",
" 'public',\n",
" 'transportation',\n",
" 'good',\n",
" 'need',\n",
" 'car',\n",
" 'hotel',\n",
" 'lovely',\n",
" 'room',\n",
" 'quiet',\n",
" 'clean',\n",
" 'bathroom',\n",
" 'sparkled',\n",
" 'great',\n",
" 'shower',\n",
" 'king',\n",
" 'size',\n",
" 'bed',\n",
" 'comfortable',\n",
" 'ate',\n",
" 'complimentary',\n",
" 'breakfast',\n",
" 'every',\n",
" 'morning',\n",
" 'stay',\n",
" 'lot',\n",
" 'healthy',\n",
" 'choice',\n",
" 'fresh',\n",
" 'fruit',\n",
" 'liked',\n",
" 'staff',\n",
" 'friendly',\n",
" 'reccomended',\n",
" 'great',\n",
" 'restaurant',\n",
" 'ever',\n",
" 'wonder',\n",
" 'comment',\n",
" 'submit',\n",
" 'make',\n",
" 'hotel',\n",
" 'reservation',\n",
" 'read',\n",
" 'well',\n",
" 'hotel',\n",
" 'hotel',\n",
" 'great',\n",
" 'location',\n",
" 'walk',\n",
" 'everything',\n",
" 'along',\n",
" 'waterfront',\n",
" 'pike',\n",
" 'place',\n",
" 'market',\n",
" 'football',\n",
" 'baseball',\n",
" 'stadium',\n",
" 'many',\n",
" 'great',\n",
" 'place',\n",
" 'eat',\n",
" 'club',\n",
" 'go',\n",
" 'far',\n",
" 'time',\n",
" 'room',\n",
" 'lovely',\n",
" 'paid',\n",
" 'city',\n",
" 'view',\n",
" 'glad',\n",
" 'french',\n",
" 'door',\n",
" 'opened',\n",
" 'onto',\n",
" 'balcony',\n",
" 'wide',\n",
" 'ledge',\n",
" 'window',\n",
" 'view',\n",
" 'puget',\n",
" 'sound',\n",
" 'ferry',\n",
" 'coming',\n",
" 'going',\n",
" 'king',\n",
" 'bed',\n",
" 'comfy',\n",
" 'quiet',\n",
" 'high',\n",
" 'ceiling',\n",
" 'staff',\n",
" 'helpful',\n",
" 'parking',\n",
" 'ask',\n",
" 'stay',\n",
" 'great',\n",
" 'clean',\n",
" 'room',\n",
" 'frig',\n",
" 'close',\n",
" 'eveything',\n",
" 'downtown',\n",
" 'area',\n",
" 'came',\n",
" 'new',\n",
" 'car',\n",
" 'show',\n",
" 'quest',\n",
" 'feild',\n",
" 'ansd',\n",
" 'able',\n",
" 'walk',\n",
" 'every',\n",
" 'weekend',\n",
" 'great',\n",
" 'restaurant',\n",
" 'bar',\n",
" 'also',\n",
" 'within',\n",
" 'walking',\n",
" 'distance',\n",
" 'come',\n",
" 'city',\n",
" 'seattle',\n",
" 'time',\n",
" 'year',\n",
" 'stay',\n",
" 'location',\n",
" 'great',\n",
" 'bed',\n",
" 'good',\n",
" 'everything',\n",
" 'else',\n",
" 'alright',\n",
" 'could',\n",
" 'hear',\n",
" 'people',\n",
" 'room',\n",
" 'u',\n",
" 'fail',\n",
" 'minimally',\n",
" 'helpful',\n",
" 'staff',\n",
" ...],\n",
" ['great',\n",
" 'service',\n",
" 'room',\n",
" 'clean',\n",
" 'could',\n",
" 'use',\n",
" 'queen',\n",
" 'bed',\n",
" 'two',\n",
" 'bed',\n",
" 'room',\n",
" 'otherwise',\n",
" 'great',\n",
" 'great',\n",
" 'price',\n",
" 'location',\n",
" 'would',\n",
" 'definitely',\n",
" 'go',\n",
" 'back',\n",
" 'make',\n",
" 'sure',\n",
" 'try',\n",
" 'zeets',\n",
" 'pizza',\n",
" 'amazing',\n",
" 'give',\n",
" 'kid',\n",
" 'pizza',\n",
" 'dough',\n",
" 'play',\n",
" 'wait',\n",
" 'room',\n",
" 'clean',\n",
" 'hotel',\n",
" 'conveniently',\n",
" 'located',\n",
" 'close',\n",
" 'restaurant',\n",
" 'shop',\n",
" 'close',\n",
" 'downtown',\n",
" 'service',\n",
" 'good',\n",
" 'room',\n",
" 'rate',\n",
" 'good',\n",
" 'recommend',\n",
" 'friend',\n",
" 'breakfast',\n",
" 'alright',\n",
" 'nice',\n",
" 'location',\n",
" 'visiting',\n",
" 'seattle',\n",
" 'center',\n",
" 'downtown',\n",
" 'good',\n",
" 'breakfast',\n",
" 'nice',\n",
" 'staff',\n",
" 'room',\n",
" 'little',\n",
" 'small',\n",
" 'u',\n",
" 'clean',\n",
" 'cozy',\n",
" 'nice',\n",
" 'workout',\n",
" 'room',\n",
" 'whirlpool',\n",
" 'hot',\n",
" 'tub',\n",
" 'available',\n",
" 'completed',\n",
" 'stay',\n",
" 'best',\n",
" 'western',\n",
" 'loyal',\n",
" 'inn',\n",
" 'within',\n",
" 'seattle',\n",
" 'washington',\n",
" 'disappointeed',\n",
" 'due',\n",
" 'number',\n",
" 'factor',\n",
" 'majority',\n",
" 'began',\n",
" 'day',\n",
" 'first',\n",
" 'disclaimer',\n",
" 'staff',\n",
" 'hotel',\n",
" 'offer',\n",
" 'correct',\n",
" 'issue',\n",
" 'discus',\n",
" 'fact',\n",
" 'occurred',\n",
" 'first',\n",
" 'place',\n",
" 'present',\n",
" 'demonstrated',\n",
" 'paying',\n",
" 'attention',\n",
" 'room',\n",
" 'upkeep',\n",
" 'best',\n",
" 'western',\n",
" 'may',\n",
" 'better',\n",
" 'condition',\n",
" 'one',\n",
" 'certainly',\n",
" 'lacking',\n",
" 'price',\n",
" 'key',\n",
" 'selecting',\n",
" 'hotel',\n",
" 'well',\n",
" 'brand',\n",
" 'name',\n",
" 'make',\n",
" 'difference',\n",
" 'found',\n",
" 'year',\n",
" 'travel',\n",
" 'upon',\n",
" 'arrival',\n",
" 'told',\n",
" 'check',\n",
" 'pm',\n",
" 'also',\n",
" 'told',\n",
" 'reserved',\n",
" 'room',\n",
" 'paid',\n",
" 'would',\n",
" 'ready',\n",
" 'time',\n",
" 'common',\n",
" 'excuse',\n",
" 'took',\n",
" 'staff',\n",
" 'day',\n",
" 'prepare',\n",
" 'various',\n",
" 'room',\n",
" 'room',\n",
" 'unavailable',\n",
" 'late',\n",
" 'day',\n",
" 'becoming',\n",
" 'serious',\n",
" 'concern',\n",
" 'many',\n",
" 'hotel',\n",
" 'pm',\n",
" 'mean',\n",
" 'lost',\n",
" 'day',\n",
" 'without',\n",
" 'place',\n",
" 'stay',\n",
" 'even',\n",
" 'though',\n",
" 'paying',\n",
" 'day',\n",
" 'latest',\n",
" 'hour',\n",
" 'ever',\n",
" 'advised',\n",
" 'stayed',\n",
" 'countless',\n",
" 'hotel',\n",
" 'year',\n",
" 'road',\n",
" 'expressing',\n",
" 'thought',\n",
" 'desk',\n",
" 'switched',\n",
" 'different',\n",
" 'room',\n",
" 'ready',\n",
" 'checked',\n",
" 'bed',\n",
" 'layout',\n",
" 'originally',\n",
" 'reserved',\n",
" 'heat',\n",
" 'froze',\n",
" 'night',\n",
" 'got',\n",
" 'turned',\n",
" 'heater',\n",
" 'first',\n",
" 'rattled',\n",
" 'loudly',\n",
" 'proceeded',\n",
" 'blow',\n",
" 'nothing',\n",
" 'cold',\n",
" 'air',\n",
" 'regardless',\n",
" 'dial',\n",
" 'setting',\n",
" 'stripped',\n",
" 'bed',\n",
" 'cover',\n",
" 'keep',\n",
" 'warm',\n",
" 'night',\n",
" 'obtained',\n",
" 'blanket',\n",
" 'desk',\n",
" 'next',\n",
" 'day',\n",
" 'bedding',\n",
" 'consisted',\n",
" 'clean',\n",
" 'sheet',\n",
" 'thin',\n",
" 'pink',\n",
" 'blanket',\n",
" 'would',\n",
" 'keep',\n",
" 'corpse',\n",
" 'warm',\n",
" 'sink',\n",
" 'stopped',\n",
" 'day',\n",
" 'sink',\n",
" 'failed',\n",
" 'drain',\n",
" 'fill',\n",
" 'water',\n",
" 'would',\n",
" 'wait',\n",
" 'minute',\n",
" 'empty',\n",
" 'finishing',\n",
" 'washing',\n",
" 'staff',\n",
" 'notified',\n",
" 'wanted',\n",
" 'send',\n",
" 'repair',\n",
" 'people',\n",
" 'room',\n",
" 'gone',\n",
" 'declined',\n",
" 'offer',\n",
" 'stated',\n",
" 'staff',\n",
" 'try',\n",
" 'make',\n",
" 'thing',\n",
" 'right',\n",
" 'changing',\n",
" 'room',\n",
" 'stopping',\n",
" 'schedule',\n",
" 'multiple',\n",
" 'issue',\n",
" 'repaired',\n",
" 'something',\n",
" 'would',\n",
" 'like',\n",
" 'addressed',\n",
" 'arrival',\n",
" 'leave',\n",
" 'hour',\n",
" 'paying',\n",
" 'siren',\n",
" 'siren',\n",
" 'vehicle',\n",
" 'constant',\n",
" 'least',\n",
" 'passing',\n",
" 'window',\n",
" 'hour',\n",
" 'night',\n",
" 'back',\n",
" 'hotel',\n",
" 'night',\n",
" 'better',\n",
" 'siren',\n",
" 'night',\n",
" 'thru',\n",
" 'transient',\n",
" 'kid',\n",
" 'alley',\n",
" 'rear',\n",
" 'hotel',\n",
" 'several',\n",
" 'transient',\n",
" 'young',\n",
" 'people',\n",
" 'yelling',\n",
" 'pm',\n",
" 'unknown',\n",
" 'reason',\n",
" 'perhaps',\n",
" 'anger',\n",
" 'resident',\n",
" 'hotel',\n",
" 'good',\n",
" 'job',\n",
" 'night',\n",
" 'showed',\n",
" 'improvement',\n",
" 'raining',\n",
" 'one',\n",
" 'guy',\n",
" 'time',\n",
" 'singing',\n",
" 'loud',\n",
" 'directly',\n",
" 'n',\n",
" 'w',\n",
" 'breakfast',\n",
" 'let',\n",
" 'see',\n",
" 'egg',\n",
" 'fried',\n",
" 'hard',\n",
" 'perfect',\n",
" 'circle',\n",
" 'placed',\n",
" 'metal',\n",
" 'container',\n",
" 'clear',\n",
" 'plastic',\n",
" 'lid',\n",
" 'cold',\n",
" 'placed',\n",
" 'inside',\n",
" 'microwave',\n",
" 'heat',\n",
" 'meat',\n",
" 'sad',\n",
" 'stack',\n",
" 'packaged',\n",
" 'ham',\n",
" 'chunk',\n",
" 'pressed',\n",
" 'meat',\n",
" 'inside',\n",
" 'container',\n",
" 'clear',\n",
" 'plastic',\n",
" 'lid',\n",
" 'type',\n",
" 'meat',\n",
" 'present',\n",
" 'pressed',\n",
" 'ham',\n",
" 'bit',\n",
" 'made',\n",
" 'rectangular',\n",
" 'slice',\n",
" 'cereal',\n",
" 'dispenser',\n",
" 'mounted',\n",
" 'vertically',\n",
" 'bagel',\n",
" 'good',\n",
" 'white',\n",
" 'bread',\n",
" 'loaf',\n",
" 'bag',\n",
" 'milk',\n",
" 'cold',\n",
" 'tasty',\n",
" 'staff',\n",
" 'cleaning',\n",
" 'woman',\n",
" 'present',\n",
" 'kind',\n",
" 'helpful',\n",
" 'gave',\n",
" 'cash',\n",
" 'tip',\n",
" 'later',\n",
" 'left',\n",
" 'one',\n",
" 'room',\n",
" 'cleaning',\n",
" 'staff',\n",
" 'well',\n",
" 'issue',\n",
" 'discussed',\n",
" 'fault',\n",
" 'left',\n",
" 'breakfast',\n",
" 'room',\n",
" 'reflected',\n",
" 'bacon',\n",
" 'fresh',\n",
" 'egg',\n",
" 'everything',\n",
" 'pre',\n",
" 'cooked',\n",
" 'dumped',\n",
" 'container',\n",
" 'various',\n",
" 'cheese',\n",
" 'cold',\n",
" 'cut',\n",
" 'meat',\n",
" 'described',\n",
" 'bare',\n",
" 'basic',\n",
" 'food',\n",
" 'found',\n",
" 'motel',\n",
" 'sigh',\n",
" 'room',\n",
" 'clean',\n",
" 'enough',\n",
" 'issue',\n",
" 'modern',\n",
" 'date',\n",
" 'given',\n",
" 'price',\n",
" 'good',\n",
" 'condition',\n",
" 'clean',\n",
" 'towel',\n",
" 'shower',\n",
" 'fine',\n",
" 'adjusted',\n",
" 'water',\n",
" 'hot',\n",
" 'wi',\n",
" 'fi',\n",
" 'job',\n",
" 'day',\n",
" 'open',\n",
" 'parking',\n",
" 'lot',\n",
" 'alley',\n",
" 'street',\n",
" 'fencing',\n",
" 'security',\n",
" 'system',\n",
" 'secured',\n",
" 'parking',\n",
" 'structure',\n",
" 'lesson',\n",
" 'learned',\n",
" 'get',\n",
" 'pay',\n",
" 'used',\n",
" 'airline',\n",
" 'mile',\n",
" 'location',\n",
" 'night',\n",
" 'stayed',\n",
" 'hampton',\n",
" 'inn',\n",
" 'hilton',\n",
" 'usually',\n",
" 'even',\n",
" 'airline',\n",
" 'mile',\n",
" 'time',\n",
" 'around',\n",
" 'tried',\n",
" 'cut',\n",
" 'budget',\n",
" 'since',\n",
" 'seattle',\n",
" 'paid',\n",
" 'price',\n",
" 'happen',\n",
" 'best',\n",
" 'western',\n",
" 'jm',\n",
" 'oregon',\n",
" 'family',\n",
" 'stayed',\n",
" 'two',\n",
" 'night',\n",
" 'liked',\n",
" 'central',\n",
" 'location',\n",
" 'walked',\n",
" 'downtown',\n",
" 'next',\n",
" 'morning',\n",
" 'took',\n",
" 'cab',\n",
" 'ride',\n",
" 'pike',\n",
" 'place',\n",
" 'actually',\n",
" 'cheaper',\n",
" 'took',\n",
" 'bus',\n",
" 'staff',\n",
" 'friendly',\n",
" 'helpfull',\n",
" 'especially',\n",
" 'young',\n",
" 'man',\n",
" 'checked',\n",
" 'u',\n",
" 'thursday',\n",
" 'evening',\n",
" 'loved',\n",
" 'continental',\n",
" 'buffet',\n",
" 'breakfast',\n",
" 'included',\n",
" 'stay',\n",
" 'offer',\n",
" 'huge',\n",
" 'varriety',\n",
" 'choice',\n",
" 'fresh',\n",
" 'waffle',\n",
" 'fruit',\n",
" 'yogurt',\n",
" 'egg',\n",
" 'ham',\n",
" 'wife',\n",
" 'find',\n",
" 'convienant',\n",
" 'child',\n",
" 'love',\n",
" 'choice',\n",
" 'ok',\n",
" 'kind',\n",
" 'know',\n",
" 'getting',\n",
" 'pay',\n",
" 'little',\n",
" 'hotel',\n",
" 'location',\n",
" 'hotel',\n",
" 'central',\n",
" 'action',\n",
" 'best',\n",
" 'part',\n",
" 'hotel',\n",
" 'location',\n",
" 'walked',\n",
" 'pioneer',\n",
" 'square',\n",
" 'lake',\n",
" 'union',\n",
" 'westlake',\n",
" 'center',\n",
" 'freemont',\n",
" 'quite',\n",
" 'hike',\n",
" 'nice',\n",
" 'walk',\n",
" 'along',\n",
" 'lake',\n",
" 'along',\n",
" 'point',\n",
" 'reach',\n",
" 'seattle',\n",
" 'downtown',\n",
" 'hot',\n",
" 'spot',\n",
" 'easily',\n",
" 'love',\n",
" 'public',\n",
" 'transportation',\n",
" 'seattle',\n",
" 'particularly',\n",
" 'light',\n",
" 'rail',\n",
" 'airport',\n",
" 'downtown',\n",
" 'hotel',\n",
" 'take',\n",
" 'light',\n",
" 'rail',\n",
" 'sea',\n",
" 'last',\n",
" 'stop',\n",
" 'westlake',\n",
" 'still',\n",
" 'mile',\n",
" 'walk',\n",
" 'lot',\n",
" 'baggage',\n",
" 'biggest',\n",
" 'drawback',\n",
" 'location',\n",
" 'three',\n",
" 'option',\n",
" 'westlake',\n",
" 'used',\n",
" 'light',\n",
" 'rail',\n",
" 'get',\n",
" 'take',\n",
" 'taxi',\n",
" 'expensive',\n",
" 'since',\n",
" 'short',\n",
" 'hop',\n",
" 'may',\n",
" 'best',\n",
" 'lot',\n",
" 'bag',\n",
" 'hop',\n",
" 'south',\n",
" 'lake',\n",
" 'union',\n",
" 'transit',\n",
" 'street',\n",
" 'car',\n",
" 'south',\n",
" 'westlake',\n",
" 'get',\n",
" 'walk',\n",
" 'short',\n",
" 'way',\n",
" 'street',\n",
" 'hotel',\n",
" 'pack',\n",
" 'light',\n",
" 'walk',\n",
" 'westlake',\n",
" 'next',\n",
" 'best',\n",
" 'thing',\n",
" 'people',\n",
" 'staff',\n",
" 'nice',\n",
" 'attentive',\n",
" 'toilet',\n",
" 'broke',\n",
" 'day',\n",
" 'seven',\n",
" 'day',\n",
" 'stay',\n",
" 'fixed',\n",
" 'quickly',\n",
" 'room',\n",
" 'seemed',\n",
" 'cleaned',\n",
" 'reasonably',\n",
" 'well',\n",
" 'breakfast',\n",
" 'bar',\n",
" 'adequate',\n",
" 'typical',\n",
" 'free',\n",
" 'breakfast',\n",
" 'food',\n",
" 'open',\n",
" 'exactly',\n",
" 'time',\n",
" 'one',\n",
" 'minute',\n",
" 'early',\n",
" 'head',\n",
" 'bit',\n",
" 'early',\n",
" 'get',\n",
" 'jump',\n",
" 'thing',\n",
" 'hallway',\n",
" 'floor',\n",
" 'stank',\n",
" 'idea',\n",
" 'stench',\n",
" 'first',\n",
" 'arrived',\n",
" 'concerned',\n",
" 'room',\n",
" 'smelling',\n",
" 'like',\n",
" 'room',\n",
" 'smell',\n",
" 'hallway',\n",
" 'held',\n",
" 'breath',\n",
" 'went',\n",
" 'problem',\n",
" 'solved',\n",
" 'room',\n",
" 'noise',\n",
" 'fairly',\n",
" 'high',\n",
" 'needed',\n",
" 'use',\n",
" 'earplug',\n",
" 'wall',\n",
" 'seem',\n",
" 'thin',\n",
" 'outside',\n",
" 'room',\n",
" 'lot',\n",
" 'siren',\n",
" 'area',\n",
" 'night',\n",
" 'long',\n",
" 'know',\n",
" 'group',\n",
" 'young',\n",
" 'kid',\n",
" 'arrived',\n",
" 'partway',\n",
" 'week',\n",
" 'antic',\n",
" 'also',\n",
" 'added',\n",
" 'ambient',\n",
" 'noise',\n",
" 'inside',\n",
" 'temperature',\n",
" 'control',\n",
" 'room',\n",
" 'cooler',\n",
" 'heater',\n",
" 'seem',\n",
" 'hold',\n",
" 'steady',\n",
" 'temp',\n",
" 'wake',\n",
" 'night',\n",
" 'either',\n",
" 'roasting',\n",
" 'freezing',\n",
" 'setting',\n",
" 'light',\n",
" 'invasion',\n",
" 'hallway',\n",
" 'high',\n",
" 'door',\n",
" 'fit',\n",
" 'tightly',\n",
" 'frame',\n",
" 'allowed',\n",
" 'lot',\n",
" 'hallway',\n",
" 'light',\n",
" 'room',\n",
" 'thumb',\n",
" 'squarely',\n",
" 'middle',\n",
" 'know',\n",
" 'getting',\n",
" 'accept',\n",
" 'downside',\n",
" 'cheap',\n",
" 'downtown',\n",
" 'seattle',\n",
" 'room',\n",
" 'place',\n",
" 'good',\n",
" 'value',\n",
" 'set',\n",
" 'expectation',\n",
" 'appropriately',\n",
" 'hotel',\n",
" 'close',\n",
" 'seattle',\n",
" 'space',\n",
" 'needle',\n",
" 'walking',\n",
" 'distance',\n",
" 'many',\n",
" 'downtown',\n",
" 'seattle',\n",
" 'attraction',\n",
" 'offer',\n",
" 'free',\n",
" 'wifi',\n",
" 'breakfast',\n",
" 'make',\n",
" 'great',\n",
" 'deal',\n",
" 'breakfast',\n",
" 'better',\n",
" 'expected',\n",
" 'assortment',\n",
" 'cereal',\n",
" 'oatmeal',\n",
" 'waffle',\n",
" 'fruit',\n",
" 'various',\n",
" 'item',\n",
" 'however',\n",
" 'room',\n",
" 'less',\n",
" 'expected',\n",
" 'first',\n",
" 'room',\n",
" 'third',\n",
" 'floor',\n",
" 'horrible',\n",
" 'experience',\n",
" 'taking',\n",
" 'shower',\n",
" 'heat',\n",
" 'triggered',\n",
" 'smoke',\n",
" 'detector',\n",
" 'start',\n",
" 'ringing',\n",
" 'run',\n",
" 'shower',\n",
" 'throw',\n",
" 'clothes',\n",
" 'go',\n",
" 'front',\n",
" 'desk',\n",
" 'get',\n",
" 'new',\n",
" 'room',\n",
" 'oh',\n",
" 'soap',\n",
" 'shampoo',\n",
" 'conditioner',\n",
" 'come',\n",
" 'standard',\n",
" 'small',\n",
" 'bottle',\n",
" 'see',\n",
" 'hotel',\n",
" 'big',\n",
" 'dispenser',\n",
" 'attached',\n",
" 'shower',\n",
" 'know',\n",
" 'putting',\n",
" 'seem',\n",
" 'sanitary',\n",
" 'second',\n",
" 'room',\n",
" 'worked',\n",
" 'ok',\n",
" 'stay',\n",
" 'seattle',\n",
" 'would',\n",
" 'go',\n",
" 'somewhere',\n",
" 'else',\n",
" 'bit',\n",
" 'nicer',\n",
" 'pretty',\n",
" 'full',\n",
" 'breakfast',\n",
" 'bar',\n",
" 'quiet',\n",
" 'location',\n",
" 'denny',\n",
" 'way',\n",
" 'close',\n",
" 'jazz',\n",
" 'alley',\n",
" 'many',\n",
" 'good',\n",
" 'restaurant',\n",
" 'make',\n",
" 'hotel',\n",
" 'great',\n",
" 'plus',\n",
" 'three',\n",
" 'block',\n",
" 'whole',\n",
" 'food',\n",
" 'gallery',\n",
" 'honeychurch',\n",
" 'booked',\n",
" 'night',\n",
" 'hotel',\n",
" 'people',\n",
" 'friend',\n",
" 'room',\n",
" 'queen',\n",
" 'bed',\n",
" 'room',\n",
" 'bf',\n",
" 'queen',\n",
" 'bed',\n",
" 'sheet',\n",
" 'clean',\n",
" 'bed',\n",
" 'comfortable',\n",
" 'room',\n",
" 'tv',\n",
" 'c',\n",
" 'safety',\n",
" 'box',\n",
" 'hair',\n",
" 'dryer',\n",
" 'bathroom',\n",
" 'also',\n",
" 'clean',\n",
" 'soap',\n",
" 'shampoo',\n",
" 'dispenser',\n",
" 'room',\n",
" 'spacious',\n",
" 'getting',\n",
" 'ready',\n",
" 'freak',\n",
" 'night',\n",
" 'concert',\n",
" 'seattle',\n",
" 'mile',\n",
" 'wamu',\n",
" 'theater',\n",
" 'front',\n",
" 'desk',\n",
" 'person',\n",
" 'named',\n",
" 'chris',\n",
" 'helpful',\n",
" 'let',\n",
" 'u',\n",
" 'know',\n",
" 'go',\n",
" 'breakfast',\n",
" 'morning',\n",
" 'seemed',\n",
" 'nice',\n",
" 'helpful',\n",
" 'overall',\n",
" 'guest',\n",
" 'coming',\n",
" 'halloween',\n",
" 'weekend',\n",
" 'hotel',\n",
" 'also',\n",
" 'offer',\n",
" 'continental',\n",
" 'breakfast',\n",
" 'friend',\n",
" 'said',\n",
" 'waffle',\n",
" 'good',\n",
" 'parking',\n",
" 'per',\n",
" 'car',\n",
" 'felt',\n",
" 'safe',\n",
" 'knowing',\n",
" 'going',\n",
" 'get',\n",
" 'ticket',\n",
" 'felt',\n",
" 'like',\n",
" 'car',\n",
" 'get',\n",
" 'broken',\n",
" 'would',\n",
" 'recommend',\n",
" 'staying',\n",
" 'couple',\n",
" 'night',\n",
" 'need',\n",
" 'something',\n",
" 'basic',\n",
" 'expensive',\n",
" 'clean',\n",
" 'close',\n",
" 'restaurant',\n",
" 'etc',\n",
" 'also',\n",
" 'good',\n",
" 'party',\n",
" 'spot',\n",
" 'able',\n",
" 'get',\n",
" 'friend',\n",
" 'take',\n",
" 'room',\n",
" 'next',\n",
" 'lot',\n",
" 'fun',\n",
" 'thank',\n",
" 'best',\n",
" 'western',\n",
" 'loyal',\n",
" 'inn',\n",
" 'impressed',\n",
" 'hotel',\n",
" 'seattle',\n",
" 'probably',\n",
" 'look',\n",
" 'elsewhere',\n",
" 'return',\n",
" 'pro',\n",
" 'parking',\n",
" 'within',\n",
" 'walking',\n",
" 'distance',\n",
" 'downtown',\n",
" ...],\n",
" ['beautiful',\n",
" 'view',\n",
" 'space',\n",
" 'needle',\n",
" 'especially',\n",
" 'night',\n",
" 'like',\n",
" 'photography',\n",
" 'ask',\n",
" 'room',\n",
" 'view',\n",
" 'staff',\n",
" 'great',\n",
" 'local',\n",
" 'restuarant',\n",
" 'excellent',\n",
" 'secure',\n",
" 'parking',\n",
" 'plus',\n",
" 'free',\n",
" 'wi',\n",
" 'fi',\n",
" 'issue',\n",
" 'waking',\n",
" 'conference',\n",
" 'call',\n",
" 'could',\n",
" 'find',\n",
" 'internet',\n",
" 'password',\n",
" 'called',\n",
" 'front',\n",
" 'desk',\n",
" 'man',\n",
" 'duty',\n",
" 'would',\n",
" 'give',\n",
" 'insisited',\n",
" 'come',\n",
" 'get',\n",
" 'person',\n",
" 'tried',\n",
" 'convince',\n",
" 'dressed',\n",
" 'claimed',\n",
" 'policy',\n",
" 'guest',\n",
" 'come',\n",
" 'collect',\n",
" 'password',\n",
" 'desk',\n",
" 'got',\n",
" 'dressed',\n",
" 'went',\n",
" 'downstairs',\n",
" 'get',\n",
" 'ridiculous',\n",
" 'seriously',\n",
" 'irritated',\n",
" 'anyway',\n",
" 'spoke',\n",
" 'day',\n",
" 'staff',\n",
" 'seemed',\n",
" 'shocked',\n",
" 'hotel',\n",
" 'policy',\n",
" 'keep',\n",
" 'code',\n",
" 'safe',\n",
" 'intending',\n",
" 'see',\n",
" 'king',\n",
" 'tutankhamen',\n",
" 'treasure',\n",
" 'pacific',\n",
" 'science',\n",
" 'center',\n",
" 'leaf',\n",
" 'january',\n",
" 'time',\n",
" 'getting',\n",
" 'short',\n",
" 'decided',\n",
" 'better',\n",
" 'purchase',\n",
" 'vip',\n",
" 'ticket',\n",
" 'go',\n",
" 'specific',\n",
" 'time',\n",
" 'stand',\n",
" 'line',\n",
" 'opted',\n",
" 'hotel',\n",
" 'due',\n",
" 'proximity',\n",
" 'seattle',\n",
" 'center',\n",
" 'dissapointed',\n",
" 'secure',\n",
" 'underground',\n",
" 'parking',\n",
" 'space',\n",
" 'needle',\n",
" 'view',\n",
" 'king',\n",
" 'room',\n",
" 'pleased',\n",
" 'hotel',\n",
" 'bit',\n",
" 'older',\n",
" 'well',\n",
" 'kept',\n",
" 'updated',\n",
" 'furniture',\n",
" 'wise',\n",
" 'well',\n",
" 'room',\n",
" 'adjoining',\n",
" 'room',\n",
" 'group',\n",
" 'next',\n",
" 'door',\n",
" 'bit',\n",
" 'loud',\n",
" 'think',\n",
" 'door',\n",
" 'thick',\n",
" 'enough',\n",
" 'sound',\n",
" 'proof',\n",
" 'enough',\n",
" 'two',\n",
" 'room',\n",
" 'hotel',\n",
" 'staff',\n",
" 'efficient',\n",
" 'helpful',\n",
" 'package',\n",
" 'came',\n",
" 'free',\n",
" 'breakfast',\n",
" 'brella',\n",
" 'restaurant',\n",
" 'inside',\n",
" 'hotel',\n",
" 'passed',\n",
" 'breakfast',\n",
" 'nothing',\n",
" 'write',\n",
" 'home',\n",
" 'much',\n",
" 'better',\n",
" 'breakfast',\n",
" 'hotel',\n",
" 'included',\n",
" 'cost',\n",
" 'room',\n",
" 'feel',\n",
" 'sorry',\n",
" 'people',\n",
" 'paying',\n",
" 'breakfast',\n",
" 'top',\n",
" 'cost',\n",
" 'room',\n",
" 'offering',\n",
" 'much',\n",
" 'different',\n",
" 'hotel',\n",
" 'good',\n",
" 'sausage',\n",
" 'slimy',\n",
" 'egg',\n",
" 'stiff',\n",
" 'dry',\n",
" 'hash',\n",
" 'brown',\n",
" 'slimy',\n",
" 'good',\n",
" 'thing',\n",
" 'waffle',\n",
" 'made',\n",
" 'hot',\n",
" 'spot',\n",
" 'blessed',\n",
" 'decent',\n",
" 'weather',\n",
" 'another',\n",
" 'surprise',\n",
" 'seattle',\n",
" 'center',\n",
" 'dale',\n",
" 'chihuly',\n",
" 'garden',\n",
" 'glass',\n",
" 'exhibit',\n",
" 'added',\n",
" 'year',\n",
" 'arrived',\n",
" 'saw',\n",
" 'garden',\n",
" 'bit',\n",
" 'sunlight',\n",
" 'left',\n",
" 'experienced',\n",
" 'dark',\n",
" 'fantastic',\n",
" 'hotel',\n",
" 'excellent',\n",
" 'location',\n",
" 'would',\n",
" 'recommend',\n",
" 'space',\n",
" 'needle',\n",
" 'view',\n",
" 'however',\n",
" 'end',\n",
" 'hallway',\n",
" 'get',\n",
" 'small',\n",
" 'window',\n",
" 'offer',\n",
" 'free',\n",
" 'breakfast',\n",
" 'buffet',\n",
" 'best',\n",
" 'western',\n",
" 'member',\n",
" 'otherwise',\n",
" 'pay',\n",
" 'good',\n",
" 'buffet',\n",
" 'excellent',\n",
" 'seating',\n",
" 'v',\n",
" 'best',\n",
" 'western',\n",
" 'room',\n",
" 'nice',\n",
" 'pay',\n",
" 'resonable',\n",
" 'compared',\n",
" 'lodging',\n",
" 'closer',\n",
" 'downtown',\n",
" 'walking',\n",
" 'distant',\n",
" 'monorail',\n",
" 'take',\n",
" 'downtown',\n",
" 'however',\n",
" 'plan',\n",
" 'driving',\n",
" 'around',\n",
" 'may',\n",
" 'want',\n",
" 'look',\n",
" 'another',\n",
" 'location',\n",
" 'parking',\n",
" 'limited',\n",
" 'anything',\n",
" 'going',\n",
" 'seattle',\n",
" 'may',\n",
" 'find',\n",
" 'street',\n",
" 'parking',\n",
" 'block',\n",
" 'away',\n",
" 'left',\n",
" 'one',\n",
" 'night',\n",
" 'parking',\n",
" 'full',\n",
" 'counter',\n",
" 'personnel',\n",
" 'suggested',\n",
" 'lot',\n",
" 'two',\n",
" 'block',\n",
" 'charged',\n",
" 'hour',\n",
" 'happy',\n",
" 'camper',\n",
" 'walking',\n",
" 'issue',\n",
" 'defiantly',\n",
" 'would',\n",
" 'worked',\n",
" 'well',\n",
" 'accommodation',\n",
" 'wall',\n",
" 'thin',\n",
" 'unless',\n",
" 'turn',\n",
" 'heater',\n",
" 'feel',\n",
" 'like',\n",
" 'sleeping',\n",
" 'sheet',\n",
" 'light',\n",
" 'blanket',\n",
" 'pull',\n",
" 'closet',\n",
" 'suggest',\n",
" 'hotel',\n",
" 'restaurant',\n",
" 'front',\n",
" 'desk',\n",
" 'steered',\n",
" 'u',\n",
" 'better',\n",
" 'food',\n",
" 'husband',\n",
" 'stayed',\n",
" 'saturday',\n",
" 'would',\n",
" 'recommend',\n",
" 'others',\n",
" 'staff',\n",
" 'friendly',\n",
" 'quiet',\n",
" 'relaxing',\n",
" 'room',\n",
" 'clean',\n",
" 'bed',\n",
" 'comfortable',\n",
" 'good',\n",
" 'price',\n",
" 'close',\n",
" 'space',\n",
" 'needle',\n",
" 'would',\n",
" 'definately',\n",
" 'stay',\n",
" 'next',\n",
" 'time',\n",
" 'go',\n",
" 'seattle',\n",
" 'overall',\n",
" 'great',\n",
" 'hotel',\n",
" 'dong',\n",
" 'quite',\n",
" 'bit',\n",
" 'traveling',\n",
" 'since',\n",
" 'retired',\n",
" 'end',\n",
" 'july',\n",
" 'best',\n",
" 'wester',\n",
" 'plus',\n",
" 'executive',\n",
" 'inn',\n",
" 'one',\n",
" 'favorite',\n",
" 'staff',\n",
" 'attitude',\n",
" 'make',\n",
" 'stay',\n",
" 'great',\n",
" 'experience',\n",
" 'wonderful',\n",
" 'time',\n",
" 'thanks',\n",
" 'helpful',\n",
" 'staff',\n",
" 'te',\n",
" 'great',\n",
" 'giving',\n",
" 'u',\n",
" 'direction',\n",
" 'advise',\n",
" 'go',\n",
" 'get',\n",
" 'seattle',\n",
" 'awesome',\n",
" 'location',\n",
" 'great',\n",
" 'view',\n",
" 'room',\n",
" 'clean',\n",
" 'well',\n",
" 'stocked',\n",
" 'wifi',\n",
" 'flawless',\n",
" 'short',\n",
" 'one',\n",
" 'night',\n",
" 'stay',\n",
" 'average',\n",
" 'nothing',\n",
" 'complain',\n",
" 'nothing',\n",
" 'brag',\n",
" 'either',\n",
" 'feel',\n",
" 'plus',\n",
" 'status',\n",
" 'hotel',\n",
" 'marketing',\n",
" 'stretch',\n",
" 'give',\n",
" 'credit',\n",
" 'property',\n",
" 'location',\n",
" 'location',\n",
" 'location',\n",
" 'easy',\n",
" 'walk',\n",
" 'many',\n",
" 'seattle',\n",
" 'attraction',\n",
" 'around',\n",
" 'space',\n",
" 'needle',\n",
" 'reserved',\n",
" 'july',\n",
" 'occupancy',\n",
" 'room',\n",
" 'queen',\n",
" 'bed',\n",
" 'traveling',\n",
" 'adult',\n",
" 'son',\n",
" 'november',\n",
" 'concert',\n",
" 'key',\n",
" 'arena',\n",
" 'good',\n",
" 'thing',\n",
" 'say',\n",
" 'close',\n",
" 'handy',\n",
" 'concert',\n",
" 'arrived',\n",
" 'check',\n",
" 'told',\n",
" 'room',\n",
" 'bed',\n",
" 'roll',\n",
" 'away',\n",
" 'sent',\n",
" 'saddest',\n",
" 'bed',\n",
" 'ever',\n",
" 'saw',\n",
" 'delivered',\n",
" 'rude',\n",
" 'housekeeper',\n",
" 'acted',\n",
" 'mad',\n",
" 'bring',\n",
" 'u',\n",
" 'bed',\n",
" 'extra',\n",
" 'blanket',\n",
" 'pillow',\n",
" 'even',\n",
" 'offer',\n",
" 'set',\n",
" 'bed',\n",
" 'room',\n",
" 'floor',\n",
" 'elevator',\n",
" 'noisy',\n",
" 'setting',\n",
" 'heard',\n",
" 'guest',\n",
" 'housekeeper',\n",
" 'coming',\n",
" 'going',\n",
" 'view',\n",
" 'roof',\n",
" 'top',\n",
" 'neighboring',\n",
" 'building',\n",
" 'lower',\n",
" 'parking',\n",
" 'area',\n",
" 'good',\n",
" 'place',\n",
" 'nearby',\n",
" 'eat',\n",
" 'nice',\n",
" 'shower',\n",
" 'lot',\n",
" 'water',\n",
" 'pressure',\n",
" 'though',\n",
" 'tub',\n",
" 'drain',\n",
" 'stayed',\n",
" 'night',\n",
" 'went',\n",
" 'rush',\n",
" 'concert',\n",
" 'key',\n",
" 'arena',\n",
" 'excellent',\n",
" 'concert',\n",
" 'way',\n",
" 'location',\n",
" 'perfect',\n",
" 'contrary',\n",
" 'review',\n",
" 'trouble',\n",
" 'finding',\n",
" 'excellent',\n",
" 'view',\n",
" 'space',\n",
" 'needle',\n",
" 'floor',\n",
" 'room',\n",
" 'yes',\n",
" 'older',\n",
" 'hotel',\n",
" 'need',\n",
" 'reno',\n",
" 'understand',\n",
" 'working',\n",
" 'room',\n",
" 'plenty',\n",
" 'space',\n",
" 'everything',\n",
" 'worked',\n",
" 'well',\n",
" 'main',\n",
" 'issue',\n",
" 'noise',\n",
" 'coming',\n",
" 'guest',\n",
" 'next',\n",
" 'door',\n",
" 'one',\n",
" 'night',\n",
" 'child',\n",
" 'loud',\n",
" 'next',\n",
" 'someone',\n",
" 'talking',\n",
" 'loud',\n",
" 'morning',\n",
" 'may',\n",
" 'adjoining',\n",
" 'room',\n",
" 'next',\n",
" 'time',\n",
" 'would',\n",
" 'request',\n",
" 'room',\n",
" 'connecting',\n",
" 'door',\n",
" 'lesson',\n",
" 'learned',\n",
" 'breakfast',\n",
" 'better',\n",
" 'many',\n",
" 'get',\n",
" 'best',\n",
" 'western',\n",
" 'plenty',\n",
" 'choice',\n",
" 'get',\n",
" 'bored',\n",
" 'day',\n",
" 'stay',\n",
" 'food',\n",
" 'restaurant',\n",
" 'evening',\n",
" 'rather',\n",
" 'bland',\n",
" 'ate',\n",
" 'nice',\n",
" 'fridge',\n",
" 'microwave',\n",
" 'room',\n",
" 'always',\n",
" 'ask',\n",
" 'fridge',\n",
" 'least',\n",
" 'staff',\n",
" 'hotel',\n",
" 'always',\n",
" 'polite',\n",
" 'friendly',\n",
" 'even',\n",
" 'line',\n",
" 'guest',\n",
" 'waiting',\n",
" 'seen',\n",
" 'also',\n",
" 'impressed',\n",
" 'manager',\n",
" 'see',\n",
" 'responds',\n",
" 'review',\n",
" 'doubt',\n",
" 'comment',\n",
" 'suggestion',\n",
" 'future',\n",
" 'bw',\n",
" 'plus',\n",
" 'proper',\n",
" 'glass',\n",
" 'mug',\n",
" 'well',\n",
" 'disposable',\n",
" 'option',\n",
" 'seen',\n",
" 'report',\n",
" 'cleaned',\n",
" 'place',\n",
" 'guest',\n",
" 'wash',\n",
" 'also',\n",
" 'think',\n",
" 'advising',\n",
" 'guest',\n",
" 'room',\n",
" 'connecting',\n",
" 'door',\n",
" 'next',\n",
" 'room',\n",
" 'decide',\n",
" 'whether',\n",
" 'stay',\n",
" 'room',\n",
" 'hotel',\n",
" 'great',\n",
" 'location',\n",
" 'view',\n",
" 'space',\n",
" 'needle',\n",
" 'within',\n",
" 'easy',\n",
" 'walking',\n",
" 'distance',\n",
" 'monorail',\n",
" 'downtown',\n",
" 'service',\n",
" 'patchy',\n",
" 'however',\n",
" 'lobby',\n",
" 'room',\n",
" 'little',\n",
" 'dingy',\n",
" 'need',\n",
" 'modernization',\n",
" 'particularly',\n",
" 'view',\n",
" 'price',\n",
" 'parking',\n",
" 'extra',\n",
" 'also',\n",
" 'get',\n",
" 'rate',\n",
" 'includes',\n",
" 'breakfast',\n",
" 'although',\n",
" 'breakfast',\n",
" 'nothing',\n",
" 'special',\n",
" 'bar',\n",
" 'nice',\n",
" 'selection',\n",
" 'locally',\n",
" 'brewed',\n",
" 'ale',\n",
" 'tap',\n",
" 'wanted',\n",
" 'place',\n",
" 'close',\n",
" 'museum',\n",
" 'attraction',\n",
" 'near',\n",
" 'space',\n",
" 'needle',\n",
" 'took',\n",
" 'advantage',\n",
" 'hotel',\n",
" 'king',\n",
" 'tut',\n",
" 'package',\n",
" 'hotel',\n",
" 'literally',\n",
" 'two',\n",
" 'block',\n",
" 'everything',\n",
" 'price',\n",
" 'better',\n",
" 'others',\n",
" 'checked',\n",
" 'online',\n",
" 'parked',\n",
" 'car',\n",
" 'hotel',\n",
" 'move',\n",
" 'entire',\n",
" 'time',\n",
" 'night',\n",
" 'one',\n",
" 'big',\n",
" 'advantage',\n",
" 'staying',\n",
" 'king',\n",
" 'tut',\n",
" 'package',\n",
" 'gave',\n",
" 'u',\n",
" 'vip',\n",
" 'ticket',\n",
" 'meant',\n",
" 'could',\n",
" 'go',\n",
" 'see',\n",
" 'exhibit',\n",
" 'time',\n",
" 'open',\n",
" 'rather',\n",
" 'date',\n",
" 'time',\n",
" 'specific',\n",
" 'ticket',\n",
" 'motel',\n",
" 'room',\n",
" 'clean',\n",
" 'adequate',\n",
" 'spend',\n",
" 'much',\n",
" 'time',\n",
" 'view',\n",
" 'room',\n",
" 'pretty',\n",
" 'blah',\n",
" 'overlooking',\n",
" 'parking',\n",
" 'lot',\n",
" 'except',\n",
" 'get',\n",
" 'watch',\n",
" 'one',\n",
" 'really',\n",
" 'big',\n",
" 'building',\n",
" 'crane',\n",
" 'operation',\n",
" 'aware',\n",
" 'least',\n",
" 'part',\n",
" 'seattle',\n",
" 'seemed',\n",
" 'construction',\n",
" 'mode',\n",
" 'hotel',\n",
" 'fault',\n",
" 'awakened',\n",
" 'one',\n",
" 'morning',\n",
" 'non',\n",
" 'stop',\n",
" 'jack',\n",
" 'hammering',\n",
" 'street',\n",
" 'even',\n",
" 'earlier',\n",
" 'second',\n",
" 'morning',\n",
" 'arrival',\n",
" 'garbage',\n",
" 'truck',\n",
" 'plus',\n",
" 'side',\n",
" 'room',\n",
" 'refrigerator',\n",
" 'microwave',\n",
" 'arm',\n",
" 'chairm',\n",
" 'bathroom',\n",
" 'tiny',\n",
" 'try',\n",
" 'sit',\n",
" 'toilet',\n",
" 'close',\n",
" 'door',\n",
" 'time',\n",
" 'comment',\n",
" 'maid',\n",
" 'service',\n",
" 'say',\n",
" 'respect',\n",
" 'privacy',\n",
" 'clean',\n",
" 'room',\n",
" 'since',\n",
" 'put',\n",
" 'disturb',\n",
" 'sign',\n",
" 'leave',\n",
" 'u',\n",
" 'note',\n",
" 'day',\n",
" 'telling',\n",
" 'u',\n",
" 'want',\n",
" 'anything',\n",
" 'serviced',\n",
" 'let',\n",
" 'front',\n",
" 'desk',\n",
" 'know',\n",
" 'thought',\n",
" 'nice',\n",
" 'way',\n",
" 'handle',\n",
" 'breakfast',\n",
" 'hotel',\n",
" 'restaurant',\n",
" 'morning',\n",
" 'convenience',\n",
" 'first',\n",
" 'day',\n",
" 'part',\n",
" 'package',\n",
" 'good',\n",
" 'variety',\n",
" 'make',\n",
" 'breakfast',\n",
" 'burrito',\n",
" 'waffle',\n",
" 'oatmeal',\n",
" 'bacon',\n",
" 'etc',\n",
" 'although',\n",
" 'exact',\n",
" 'choice',\n",
" 'day',\n",
" 'got',\n",
" 'little',\n",
" 'boring',\n",
" 'third',\n",
" 'morning',\n",
" 'breakfast',\n",
" 'family',\n",
" 'stay',\n",
" 'another',\n",
" 'hotel',\n",
" 'area',\n",
" 'earlier',\n",
" 'year',\n",
" 'surprised',\n",
" 'air',\n",
" 'conditioning',\n",
" 'hotel',\n",
" 'air',\n",
" 'conditioning',\n",
" 'worked',\n",
" 'fine',\n",
" 'definitely',\n",
" 'adequate',\n",
" 'place',\n",
" 'reasonable',\n",
" 'price',\n",
" 'seattle',\n",
" 'super',\n",
" 'location',\n",
" 'stayed',\n",
" 'three',\n",
" 'night',\n",
" 'enjoyed',\n",
" 'every',\n",
" 'moment',\n",
" 'room',\n",
" 'location',\n",
" 'helpful',\n",
" 'staff',\n",
" 'cleanliness',\n",
" 'value',\n",
" 'comfort',\n",
" 'everything',\n",
" 'spot',\n",
" 'staff',\n",
" 'friendly',\n",
" 'helpful',\n",
" 'highly',\n",
" 'recommended',\n",
" 'return',\n",
" 'wanted',\n",
" 'hotel',\n",
" 'near',\n",
" 'space',\n",
" 'needle',\n",
" 'knew',\n",
" 'traffic',\n",
" 'would',\n",
" 'nite',\n",
" 'mare',\n",
" 'trying',\n",
" 'find',\n",
" 'place',\n",
" 'hard',\n",
" 'check',\n",
" 'horrific',\n",
" 'person',\n",
" 'come',\n",
" 'really',\n",
" 'took',\n",
" 'minute',\n",
" 'hour',\n",
" 'drive',\n",
" 'beyond',\n",
" 'comprehension',\n",
" 'got',\n",
" 'lucky',\n",
" 'got',\n",
" 'room',\n",
" 'view',\n",
" 'without',\n",
" 'pay',\n",
" 'extra',\n",
" 'night',\n",
" 'helped',\n",
" 'little',\n",
" 'bit',\n",
" 'hall',\n",
" 'room',\n",
" 'dirty',\n",
" 'bed',\n",
" 'oh',\n",
" 'goodness',\n",
" 'horrible',\n",
" 'uncomfortable',\n",
" 'son',\n",
" 'said',\n",
" 'felt',\n",
" 'like',\n",
" 'really',\n",
" 'big',\n",
" 'person',\n",
" 'slept',\n",
" 'need',\n",
" 'replaced',\n",
" 'spring',\n",
" 'totally',\n",
" 'broken',\n",
" 'pleased',\n",
" 'breakfast',\n",
" 'day',\n",
" 'paid',\n",
" 'day',\n",
" 'change',\n",
" 'closed',\n",
" 'exactly',\n",
" 'exception',\n",
" 'mon',\n",
" 'afternoon',\n",
" 'thurs',\n",
" 'put',\n",
" 'complaint',\n",
" 'card',\n",
" 'got',\n",
" 'e',\n",
" 'mail',\n",
" 'response',\n",
" 'replied',\n",
" 'back',\n",
" 'nothing',\n",
" 'sure',\n",
" 'management',\n",
" 'anything',\n",
" 'night',\n",
" 'almost',\n",
" 'mortgage',\n",
" 'payment',\n",
" 'thought',\n",
" 'would',\n",
" 'better',\n",
" 'room',\n",
" 'ever',\n",
" 'worried',\n",
" 'next',\n",
" 'night',\n",
" 'best',\n",
" 'western',\n",
" 'portland',\n",
" 'let',\n",
" 'see',\n",
" 'checked',\n",
" 'best',\n",
" 'western',\n",
" 'executive',\n",
" 'part',\n",
" 'blackball',\n",
" 'ferry',\n",
" 'package',\n",
" 'hotel',\n",
" ...],\n",
" ['hotel',\n",
" 'need',\n",
" 'serious',\n",
" 'update',\n",
" 'room',\n",
" 'big',\n",
" 'bed',\n",
" 'comfortable',\n",
" 'carpet',\n",
" 'worn',\n",
" 'hesitate',\n",
" 'walk',\n",
" 'bare',\n",
" 'foot',\n",
" 'wall',\n",
" 'paper',\n",
" 'peeling',\n",
" 'black',\n",
" 'mark',\n",
" 'wall',\n",
" 'door',\n",
" 'disappointing',\n",
" 'frustrating',\n",
" 'thing',\n",
" 'stay',\n",
" 'poor',\n",
" 'water',\n",
" 'pressure',\n",
" 'lukewarm',\n",
" 'water',\n",
" 'driving',\n",
" 'hr',\n",
" 'getting',\n",
" 'late',\n",
" 'looking',\n",
" 'forward',\n",
" 'nice',\n",
" 'hot',\n",
" 'shower',\n",
" 'happen',\n",
" 'glad',\n",
" 'staying',\n",
" 'night',\n",
" 'leaving',\n",
" 'morning',\n",
" 'would',\n",
" 'checked',\n",
" 'gone',\n",
" 'another',\n",
" 'hotel',\n",
" 'nosie',\n",
" 'level',\n",
" 'pretty',\n",
" 'high',\n",
" 'could',\n",
" 'hear',\n",
" 'people',\n",
" 'walking',\n",
" 'hall',\n",
" 'waking',\n",
" 'morning',\n",
" 'alaska',\n",
" 'airline',\n",
" 'provided',\n",
" 'voucher',\n",
" 'comfort',\n",
" 'inn',\n",
" 'delay',\n",
" 'caused',\n",
" 'u',\n",
" 'miss',\n",
" 'last',\n",
" 'leg',\n",
" 'flight',\n",
" 'disappointment',\n",
" 'place',\n",
" 'poor',\n",
" 'condition',\n",
" 'smell',\n",
" 'bad',\n",
" 'front',\n",
" 'lobby',\n",
" 'hallway',\n",
" 'guest',\n",
" 'room',\n",
" 'contacting',\n",
" 'let',\n",
" 'u',\n",
" 'know',\n",
" 'needed',\n",
" 'ride',\n",
" 'airport',\n",
" 'even',\n",
" 'though',\n",
" 'advertise',\n",
" 'pick',\n",
" 'ups',\n",
" 'every',\n",
" 'minute',\n",
" 'still',\n",
" 'took',\n",
" 'minute',\n",
" 'shuttle',\n",
" 'arrived',\n",
" 'place',\n",
" 'less',\n",
" 'minute',\n",
" 'airport',\n",
" 'switch',\n",
" 'room',\n",
" 'housekeeping',\n",
" 'somehow',\n",
" 'missed',\n",
" 'bed',\n",
" 'made',\n",
" 'looked',\n",
" 'like',\n",
" 'pit',\n",
" 'fruit',\n",
" 'bed',\n",
" 'sheet',\n",
" 'gross',\n",
" 'good',\n",
" 'bad',\n",
" 'thing',\n",
" 'hotel',\n",
" 'internet',\n",
" 'terrible',\n",
" 'towel',\n",
" 'good',\n",
" 'water',\n",
" 'pressure',\n",
" 'good',\n",
" 'room',\n",
" 'decent',\n",
" 'size',\n",
" 'nice',\n",
" 'dark',\n",
" 'bed',\n",
" 'comfortable',\n",
" 'staff',\n",
" 'friendly',\n",
" 'location',\n",
" 'convenient',\n",
" 'though',\n",
" 'neighborhood',\n",
" 'seem',\n",
" 'great',\n",
" 'problem',\n",
" 'regarding',\n",
" 'though',\n",
" 'stayed',\n",
" 'two',\n",
" 'night',\n",
" 'visiting',\n",
" 'seattle',\n",
" 'october',\n",
" 'little',\n",
" 'worried',\n",
" 'review',\n",
" 'mixed',\n",
" 'pleased',\n",
" 'found',\n",
" 'room',\n",
" 'large',\n",
" 'immaculate',\n",
" 'comfortable',\n",
" 'everything',\n",
" 'room',\n",
" 'entire',\n",
" 'hotel',\n",
" 'looked',\n",
" 'like',\n",
" 'well',\n",
" 'maintained',\n",
" 'even',\n",
" 'recently',\n",
" 'painted',\n",
" 'walking',\n",
" 'corridor',\n",
" 'room',\n",
" 'rug',\n",
" 'bit',\n",
" 'lint',\n",
" 'always',\n",
" 'hate',\n",
" 'rug',\n",
" 'clean',\n",
" 'wall',\n",
" 'dirty',\n",
" 'might',\n",
" 'well',\n",
" 'stay',\n",
" 'home',\n",
" 'case',\n",
" 'another',\n",
" 'reviewer',\n",
" 'mentioned',\n",
" 'area',\n",
" 'know',\n",
" 'expect',\n",
" 'found',\n",
" 'area',\n",
" 'fine',\n",
" 'busy',\n",
" 'street',\n",
" 'heart',\n",
" 'area',\n",
" 'seattle',\n",
" 'lot',\n",
" 'store',\n",
" 'shop',\n",
" 'along',\n",
" 'avenue',\n",
" 'find',\n",
" 'seedy',\n",
" 'fact',\n",
" 'needed',\n",
" 'use',\n",
" 'room',\n",
" 'key',\n",
" 'use',\n",
" 'elevator',\n",
" 'felt',\n",
" 'safe',\n",
" 'would',\n",
" 'recommend',\n",
" 'hotel',\n",
" 'mixed',\n",
" 'one',\n",
" 'night',\n",
" 'stay',\n",
" 'comfort',\n",
" 'inn',\n",
" 'check',\n",
" 'clerk',\n",
" 'friendly',\n",
" 'although',\n",
" 'next',\n",
" 'day',\n",
" 'clerk',\n",
" 'much',\n",
" 'room',\n",
" 'pretty',\n",
" 'large',\n",
" 'relatively',\n",
" 'clean',\n",
" 'good',\n",
" 'internet',\n",
" 'service',\n",
" 'breakfast',\n",
" 'quite',\n",
" 'offering',\n",
" 'bad',\n",
" 'thing',\n",
" 'location',\n",
" 'choose',\n",
" 'hotel',\n",
" 'visiting',\n",
" 'business',\n",
" 'hotel',\n",
" 'chosen',\n",
" 'neighborhood',\n",
" 'sketchy',\n",
" 'drive',\n",
" 'several',\n",
" 'mile',\n",
" 'nearby',\n",
" 'shopping',\n",
" 'mall',\n",
" 'find',\n",
" 'restaurant',\n",
" 'good',\n",
" 'option',\n",
" 'people',\n",
" 'traveling',\n",
" 'issue',\n",
" 'cleanliness',\n",
" 'room',\n",
" 'mine',\n",
" 'ok',\n",
" 'great',\n",
" 'thing',\n",
" 'seattle',\n",
" 'area',\n",
" 'fast',\n",
" 'neighborhood',\n",
" 'change',\n",
" 'one',\n",
" 'looking',\n",
" 'affordable',\n",
" 'place',\n",
" 'stay',\n",
" 'outside',\n",
" 'spendy',\n",
" 'area',\n",
" 'comfort',\n",
" 'suite',\n",
" 'work',\n",
" 'stayed',\n",
" 'suite',\n",
" 'less',\n",
" 'single',\n",
" 'double',\n",
" 'bed',\n",
" 'mile',\n",
" 'closer',\n",
" 'space',\n",
" 'needle',\n",
" 'work',\n",
" 'sure',\n",
" 'drive',\n",
" 'may',\n",
" 'leave',\n",
" 'wondering',\n",
" 'headed',\n",
" 'aurora',\n",
" 'ave',\n",
" 'rest',\n",
" 'assured',\n",
" 'get',\n",
" 'work',\n",
" 'nice',\n",
" 'stay',\n",
" 'ken',\n",
" 'arrived',\n",
" 'midnight',\n",
" 'greeted',\n",
" 'friendly',\n",
" 'welcoming',\n",
" 'smile',\n",
" 'although',\n",
" 'way',\n",
" 'past',\n",
" 'normal',\n",
" 'person',\n",
" 'bed',\n",
" 'time',\n",
" 'gentleman',\n",
" 'checking',\n",
" 'u',\n",
" 'took',\n",
" 'time',\n",
" 'help',\n",
" 'u',\n",
" 'rate',\n",
" 'since',\n",
" 'staying',\n",
" 'long',\n",
" 'period',\n",
" 'time',\n",
" 'also',\n",
" 'helpful',\n",
" 'giving',\n",
" 'u',\n",
" 'suggestion',\n",
" 'really',\n",
" 'late',\n",
" 'night',\n",
" 'bite',\n",
" 'eat',\n",
" 'retired',\n",
" 'night',\n",
" 'room',\n",
" 'little',\n",
" 'small',\n",
" 'fine',\n",
" 'husband',\n",
" 'able',\n",
" 'hear',\n",
" 'people',\n",
" 'u',\n",
" 'walking',\n",
" 'around',\n",
" 'disruptive',\n",
" 'time',\n",
" 'parking',\n",
" 'lot',\n",
" 'little',\n",
" 'small',\n",
" 'well',\n",
" 'fine',\n",
" 'full',\n",
" 'size',\n",
" 'rental',\n",
" 'car',\n",
" 'location',\n",
" 'good',\n",
" 'easy',\n",
" 'access',\n",
" 'direction',\n",
" 'short',\n",
" 'drive',\n",
" 'away',\n",
" 'main',\n",
" 'seattle',\n",
" 'attraction',\n",
" 'visited',\n",
" 'watch',\n",
" 'cousin',\n",
" 'play',\n",
" 'football',\n",
" 'uw',\n",
" 'frequented',\n",
" 'university',\n",
" 'district',\n",
" 'close',\n",
" 'distance',\n",
" 'well',\n",
" 'pleasant',\n",
" 'stay',\n",
" 'would',\n",
" 'given',\n",
" 'excellent',\n",
" 'rating',\n",
" 'noise',\n",
" 'people',\n",
" 'walking',\n",
" 'u',\n",
" 'husband',\n",
" 'definitely',\n",
" 'staying',\n",
" 'future',\n",
" 'visit',\n",
" 'seattle',\n",
" 'looking',\n",
" 'good',\n",
" 'hotel',\n",
" 'without',\n",
" 'paying',\n",
" 'arm',\n",
" 'leg',\n",
" 'chose',\n",
" 'comfort',\n",
" 'inn',\n",
" 'aurora',\n",
" 'ave',\n",
" 'night',\n",
" 'tax',\n",
" 'etc',\n",
" 'easy',\n",
" 'drive',\n",
" 'airport',\n",
" 'route',\n",
" 'fact',\n",
" 'took',\n",
" 'u',\n",
" 'minute',\n",
" 'get',\n",
" 'airport',\n",
" 'flight',\n",
" 'left',\n",
" 'front',\n",
" 'desk',\n",
" 'staff',\n",
" 'great',\n",
" 'quick',\n",
" 'check',\n",
" 'room',\n",
" 'better',\n",
" 'average',\n",
" 'room',\n",
" 'clean',\n",
" 'neat',\n",
" 'time',\n",
" 'stayed',\n",
" 'turn',\n",
" 'booked',\n",
" 'first',\n",
" 'last',\n",
" 'night',\n",
" 'wa',\n",
" 'also',\n",
" 'great',\n",
" 'breakfast',\n",
" 'buffet',\n",
" 'likely',\n",
" 'return',\n",
" 'visiting',\n",
" 'future',\n",
" 'wife',\n",
" 'booked',\n",
" 'hotel',\n",
" 'online',\n",
" 'website',\n",
" 'misleading',\n",
" 'hotel',\n",
" 'middle',\n",
" 'industrial',\n",
" 'area',\n",
" 'nowhere',\n",
" 'near',\n",
" 'site',\n",
" 'want',\n",
" 'see',\n",
" 'old',\n",
" 'run',\n",
" 'place',\n",
" 'poor',\n",
" 'condition',\n",
" 'fit',\n",
" 'anyone',\n",
" 'like',\n",
" 'comfortable',\n",
" 'area',\n",
" 'poor',\n",
" 'seedy',\n",
" 'want',\n",
" 'stay',\n",
" 'couple',\n",
" 'bring',\n",
" 'kid',\n",
" 'way',\n",
" 'stay',\n",
" 'away',\n",
" 'one',\n",
" 'worth',\n",
" 'time',\n",
" 'stayed',\n",
" 'one',\n",
" 'night',\n",
" 'staff',\n",
" 'reception',\n",
" 'welcome',\n",
" 'room',\n",
" 'less',\n",
" 'clean',\n",
" 'breafast',\n",
" 'guest',\n",
" 'talking',\n",
" 'tv',\n",
" 'loud',\n",
" 'background',\n",
" 'music',\n",
" 'hard',\n",
" 'chat',\n",
" 'couple',\n",
" 'gym',\n",
" 'closed',\n",
" 'nothing',\n",
" 'worked',\n",
" 'bit',\n",
" 'dirty',\n",
" 'complaint',\n",
" 'hotel',\n",
" 'check',\n",
" 'quick',\n",
" 'courteous',\n",
" 'paid',\n",
" 'including',\n",
" 'tax',\n",
" 'good',\n",
" 'sized',\n",
" 'room',\n",
" 'comfy',\n",
" 'king',\n",
" 'bed',\n",
" 'nice',\n",
" 'clean',\n",
" 'wi',\n",
" 'fi',\n",
" 'worked',\n",
" 'without',\n",
" 'hitch',\n",
" 'breakfast',\n",
" 'better',\n",
" 'hotel',\n",
" 'average',\n",
" 'egg',\n",
" 'sausage',\n",
" 'waffle',\n",
" 'located',\n",
" 'busy',\n",
" 'north',\n",
" 'aurora',\n",
" 'ave',\n",
" 'noise',\n",
" 'problem',\n",
" 'feel',\n",
" 'bad',\n",
" 'area',\n",
" 'typical',\n",
" 'busy',\n",
" 'commercial',\n",
" 'strip',\n",
" 'average',\n",
" 'looking',\n",
" 'residential',\n",
" 'area',\n",
" 'starting',\n",
" 'one',\n",
" 'block',\n",
" 'east',\n",
" 'west',\n",
" 'aurora',\n",
" 'wife',\n",
" 'walk',\n",
" 'three',\n",
" 'mile',\n",
" 'day',\n",
" 'exercise',\n",
" 'walked',\n",
" 'along',\n",
" 'street',\n",
" 'paralleling',\n",
" 'aurora',\n",
" 'roughly',\n",
" 'concern',\n",
" 'glitzy',\n",
" 'hotel',\n",
" 'served',\n",
" 'purpose',\n",
" 'would',\n",
" 'return',\n",
" 'wow',\n",
" 'start',\n",
" 'hotel',\n",
" 'stayed',\n",
" 'two',\n",
" 'night',\n",
" 'ago',\n",
" 'firstly',\n",
" 'area',\n",
" 'little',\n",
" 'rundown',\n",
" 'super',\n",
" 'unsafe',\n",
" 'per',\n",
" 'se',\n",
" 'prefer',\n",
" 'stay',\n",
" 'thing',\n",
" 'around',\n",
" 'though',\n",
" 'review',\n",
" 'would',\n",
" 'think',\n",
" 'check',\n",
" 'fine',\n",
" 'super',\n",
" 'friendly',\n",
" 'fine',\n",
" 'except',\n",
" 'two',\n",
" 'car',\n",
" 'asking',\n",
" 'another',\n",
" 'parking',\n",
" 'pas',\n",
" 'saying',\n",
" 'lot',\n",
" 'get',\n",
" 'full',\n",
" 'check',\n",
" 'gentleman',\n",
" 'basically',\n",
" 'ignored',\n",
" 'request',\n",
" 'second',\n",
" 'parking',\n",
" 'pas',\n",
" 'like',\n",
" 'social',\n",
" 'skill',\n",
" 'deal',\n",
" 'saying',\n",
" 'odd',\n",
" 'get',\n",
" 'room',\n",
" 'huge',\n",
" 'way',\n",
" 'probably',\n",
" 'good',\n",
" 'thing',\n",
" 'say',\n",
" 'hotel',\n",
" 'pubic',\n",
" 'head',\n",
" 'hair',\n",
" 'bathroom',\n",
" 'tub',\n",
" 'behind',\n",
" 'bathroom',\n",
" 'door',\n",
" 'like',\n",
" 'swept',\n",
" 'day',\n",
" 'hair',\n",
" 'bed',\n",
" 'old',\n",
" 'washed',\n",
" 'blood',\n",
" 'stain',\n",
" 'absolute',\n",
" 'worst',\n",
" 'yes',\n",
" 'worse',\n",
" 'someone',\n",
" 'else',\n",
" 'pubes',\n",
" 'turned',\n",
" 'shower',\n",
" 'water',\n",
" 'sprayed',\n",
" 'everywhere',\n",
" 'sprayed',\n",
" 'much',\n",
" 'upwards',\n",
" 'getting',\n",
" 'water',\n",
" 'part',\n",
" 'roof',\n",
" 'dripping',\n",
" 'entire',\n",
" 'bathroom',\n",
" 'soaking',\n",
" 'etc',\n",
" 'called',\n",
" 'front',\n",
" 'desk',\n",
" 'let',\n",
" 'know',\n",
" 'know',\n",
" 'anything',\n",
" 'ever',\n",
" 'done',\n",
" 'terrible',\n",
" 'even',\n",
" 'wash',\n",
" 'hair',\n",
" 'shower',\n",
" 'nightmare',\n",
" 'basically',\n",
" 'shower',\n",
" 'totally',\n",
" 'unusable',\n",
" 'paying',\n",
" 'awful',\n",
" 'hotel',\n",
" 'breakfast',\n",
" 'bad',\n",
" 'got',\n",
" 'rush',\n",
" 'min',\n",
" 'later',\n",
" 'never',\n",
" 'would',\n",
" 'got',\n",
" 'seat',\n",
" 'since',\n",
" 'definitely',\n",
" 'enough',\n",
" 'seating',\n",
" 'seem',\n",
" 'restocking',\n",
" 'food',\n",
" 'quickly',\n",
" 'free',\n",
" 'internet',\n",
" 'worked',\n",
" 'room',\n",
" 'small',\n",
" 'fitness',\n",
" 'room',\n",
" 'dinky',\n",
" 'hot',\n",
" 'tub',\n",
" 'honestly',\n",
" 'would',\n",
" 'never',\n",
" 'stay',\n",
" 'comfort',\n",
" 'inn',\n",
" 'good',\n",
" 'luck',\n",
" 'need',\n",
" 'pro',\n",
" 'large',\n",
" 'room',\n",
" 'queen',\n",
" 'lot',\n",
" 'space',\n",
" 'internet',\n",
" 'signal',\n",
" 'strong',\n",
" 'room',\n",
" 'con',\n",
" 'parking',\n",
" 'check',\n",
" 'late',\n",
" 'front',\n",
" 'desk',\n",
" 'clerk',\n",
" 'caring',\n",
" 'lot',\n",
" 'hair',\n",
" 'bathroom',\n",
" 'next',\n",
" 'tub',\n",
" 'tub',\n",
" 'budget',\n",
" 'half',\n",
" 'sized',\n",
" 'pillow',\n",
" 'never',\n",
" 'seen',\n",
" 'anything',\n",
" 'like',\n",
" 'alittle',\n",
" 'worried',\n",
" 'reading',\n",
" 'review',\n",
" 'booked',\n",
" 'king',\n",
" 'size',\n",
" 'suite',\n",
" 'sofa',\n",
" 'bed',\n",
" 'since',\n",
" 'four',\n",
" 'adult',\n",
" 'spending',\n",
" 'one',\n",
" 'night',\n",
" 'cruise',\n",
" 'hotel',\n",
" 'great',\n",
" 'value',\n",
" 'money',\n",
" 'since',\n",
" 'want',\n",
" 'spend',\n",
" 'one',\n",
" 'night',\n",
" 'post',\n",
" 'complaint',\n",
" 'thin',\n",
" 'wall',\n",
" 'problem',\n",
" 'hearing',\n",
" 'people',\n",
" 'breakfast',\n",
" 'menu',\n",
" 'several',\n",
" 'item',\n",
" 'food',\n",
" 'good',\n",
" 'several',\n",
" 'pizza',\n",
" 'place',\n",
" 'deliver',\n",
" 'room',\n",
" 'since',\n",
" 'restaruants',\n",
" 'walking',\n",
" 'distance',\n",
" 'negative',\n",
" 'linen',\n",
" 'room',\n",
" 'sofa',\n",
" 'bed',\n",
" 'called',\n",
" 'desk',\n",
" 'person',\n",
" 'said',\n",
" 'planning',\n",
" 'sleeping',\n",
" 'reservation',\n",
" 'four',\n",
" 'adult',\n",
" 'one',\n",
" 'night',\n",
" 'bring',\n",
" 'linen',\n",
" 'pillow',\n",
" 'opened',\n",
" 'bed',\n",
" 'bowed',\n",
" 'middle',\n",
" 'even',\n",
" 'know',\n",
" 'begin',\n",
" 'tell',\n",
" 'horrible',\n",
" 'stay',\n",
" 'u',\n",
" 'start',\n",
" 'saying',\n",
" 'arrived',\n",
" 'hotel',\n",
" 'around',\n",
" 'found',\n",
" 'first',\n",
" 'literally',\n",
" 'parking',\n",
" 'hotel',\n",
" 'questionable',\n",
" 'area',\n",
" 'park',\n",
" 'street',\n",
" 'nervous',\n",
" 'already',\n",
" 'leaving',\n",
" 'car',\n",
" 'area',\n",
" 'went',\n",
" 'got',\n",
" 'key',\n",
" 'already',\n",
" 'noticing',\n",
" 'run',\n",
" 'place',\n",
" 'went',\n",
" 'room',\n",
" 'smell',\n",
" 'hallway',\n",
" 'stinky',\n",
" 'fresh',\n",
" 'kind',\n",
" 'smelt',\n",
" 'like',\n",
" 'dog',\n",
" 'pee',\n",
" 'got',\n",
" 'room',\n",
" 'smell',\n",
" 'bed',\n",
" 'definitely',\n",
" 'smelt',\n",
" 'like',\n",
" 'dog',\n",
" 'pee',\n",
" 'sickened',\n",
" 'pet',\n",
" 'u',\n",
" 'enjoy',\n",
" 'smelling',\n",
" 'someone',\n",
" 'elses',\n",
" 'pet',\n",
" 'reluctantly',\n",
" 'called',\n",
" 'front',\n",
" 'desk',\n",
" 'asked',\n",
" 'room',\n",
" 'could',\n",
" 'switched',\n",
" 'cold',\n",
" 'rude',\n",
" 'finally',\n",
" 'said',\n",
" 'come',\n",
" 'another',\n",
" 'key',\n",
" 'went',\n",
" 'second',\n",
" 'room',\n",
" 'family',\n",
" 'never',\n",
" 'unpacks',\n",
" 'anything',\n",
" 'checking',\n",
" 'bed',\n",
" 'stuff',\n",
" 'room',\n",
" 'carefully',\n",
" 'bedbug',\n",
" 'went',\n",
" 'ahead',\n",
" 'room',\n",
" 'might',\n",
" 'mention',\n",
" 'also',\n",
" 'smelled',\n",
" 'moldy',\n",
" 'upon',\n",
" 'lifting',\n",
" 'mattress',\n",
" 'see',\n",
" 'underneath',\n",
" 'husband',\n",
" 'found',\n",
" 'bug',\n",
" 'moving',\n",
" 'freaked',\n",
" 'u',\n",
" 'never',\n",
" 'actually',\n",
" 'found',\n",
" 'bug',\n",
" 'went',\n",
" 'told',\n",
" 'desk',\n",
" 'clerk',\n",
" 'found',\n",
" 'denied',\n",
" 'made',\n",
" 'u',\n",
" 'feel',\n",
" ...],\n",
" ['experience',\n",
" 'day',\n",
" 'inn',\n",
" 'perfect',\n",
" 'staff',\n",
" 'great',\n",
" 'manager',\n",
" 'ted',\n",
" 'angel',\n",
" 'helpful',\n",
" 'complimentary',\n",
" 'breakfast',\n",
" 'always',\n",
" 'hot',\n",
" 'also',\n",
" 'provided',\n",
" 'bbq',\n",
" 'grill',\n",
" 'really',\n",
" 'recommend',\n",
" 'place',\n",
" 'others',\n",
" 'planning',\n",
" 'staying',\n",
" 'san',\n",
" 'antonio',\n",
" 'staff',\n",
" 'front',\n",
" 'desk',\n",
" 'extremely',\n",
" 'helpful',\n",
" 'went',\n",
" 'way',\n",
" 'ensure',\n",
" 'trip',\n",
" 'enjoyable',\n",
" 'attending',\n",
" 'nephew',\n",
" 'air',\n",
" 'force',\n",
" 'graduation',\n",
" 'staff',\n",
" 'gave',\n",
" 'u',\n",
" 'useful',\n",
" 'information',\n",
" 'make',\n",
" 'navigation',\n",
" 'base',\n",
" 'easier',\n",
" 'special',\n",
" 'thanks',\n",
" 'mr',\n",
" 'angel',\n",
" 'worked',\n",
" 'front',\n",
" 'desk',\n",
" 'morning',\n",
" 'departure',\n",
" 'location',\n",
" 'great',\n",
" 'close',\n",
" 'base',\n",
" 'however',\n",
" 'hotel',\n",
" 'disgusting',\n",
" 'dirty',\n",
" 'roach',\n",
" 'sure',\n",
" 'still',\n",
" 'open',\n",
" 'never',\n",
" 'ever',\n",
" 'stay',\n",
" 'location',\n",
" 'location',\n",
" 'risk',\n",
" 'deal',\n",
" 'environment',\n",
" 'wish',\n",
" 'people',\n",
" 'would',\n",
" 'mentioned',\n",
" 'review',\n",
" 'could',\n",
" 'selected',\n",
" 'different',\n",
" 'hotel',\n",
" 'happier',\n",
" 'note',\n",
" 'proud',\n",
" 'graduate',\n",
" 'congrats',\n",
" 'share',\n",
" 'feeling',\n",
" 'pride',\n",
" 'service',\n",
" 'great',\n",
" 'helpful',\n",
" 'front',\n",
" 'desk',\n",
" 'matter',\n",
" 'time',\n",
" 'day',\n",
" 'room',\n",
" 'ok',\n",
" 'bathroom',\n",
" 'need',\n",
" 'updating',\n",
" 'bed',\n",
" 'high',\n",
" 'short',\n",
" 'people',\n",
" 'comfortable',\n",
" 'would',\n",
" 'stay',\n",
" 'reserved',\n",
" 'king',\n",
" 'room',\n",
" 'none',\n",
" 'available',\n",
" 'check',\n",
" 'left',\n",
" 'pizza',\n",
" 'thrown',\n",
" 'away',\n",
" 'reason',\n",
" 'know',\n",
" 'son',\n",
" 'would',\n",
" 'enjoyed',\n",
" 'location',\n",
" 'location',\n",
" 'location',\n",
" 'get',\n",
" 'issue',\n",
" 'perfect',\n",
" 'location',\n",
" 'lackland',\n",
" 'air',\n",
" 'base',\n",
" 'staff',\n",
" 'outstanding',\n",
" 'helpful',\n",
" 'polite',\n",
" 'friendly',\n",
" 'staff',\n",
" 'room',\n",
" 'however',\n",
" 'damp',\n",
" 'feeling',\n",
" 'dark',\n",
" 'dingy',\n",
" 'carpet',\n",
" 'tub',\n",
" 'room',\n",
" 'needed',\n",
" 'replaced',\n",
" 'old',\n",
" 'building',\n",
" 'need',\n",
" 'work',\n",
" 'also',\n",
" 'feel',\n",
" 'safe',\n",
" 'sometimes',\n",
" 'although',\n",
" 'problem',\n",
" 'quiet',\n",
" 'room',\n",
" 'clean',\n",
" 'min',\n",
" 'downtown',\n",
" 'asked',\n",
" 'stair',\n",
" 'close',\n",
" 'parking',\n",
" 'problem',\n",
" 'tv',\n",
" 'line',\n",
" 'could',\n",
" 'use',\n",
" 'update',\n",
" 'cannot',\n",
" 'compliment',\n",
" 'enough',\n",
" 'employee',\n",
" 'front',\n",
" 'desk',\n",
" 'ted',\n",
" 'angel',\n",
" 'honestly',\n",
" 'went',\n",
" 'beyond',\n",
" 'came',\n",
" 'giving',\n",
" 'exact',\n",
" 'direction',\n",
" 'around',\n",
" 'city',\n",
" 'got',\n",
" 'lost',\n",
" 'airport',\n",
" 'almost',\n",
" 'hr',\n",
" 'could',\n",
" 'find',\n",
" 'hotel',\n",
" 'near',\n",
" 'tear',\n",
" 'point',\n",
" 'ted',\n",
" 'stayed',\n",
" 'phone',\n",
" 'u',\n",
" 'directed',\n",
" 'u',\n",
" 'right',\n",
" 'parking',\n",
" 'lot',\n",
" 'min',\n",
" 'away',\n",
" 'point',\n",
" 'helpful',\n",
" 'friendly',\n",
" 'trip',\n",
" 'let',\n",
" 'employee',\n",
" 'get',\n",
" 'away',\n",
" 'hard',\n",
" 'find',\n",
" 'hotel',\n",
" 'cab',\n",
" 'driver',\n",
" 'get',\n",
" 'parking',\n",
" 'lot',\n",
" 'room',\n",
" 'ready',\n",
" 'check',\n",
" 'booked',\n",
" 'online',\n",
" 'disappointed',\n",
" 'hotel',\n",
" 'far',\n",
" 'either',\n",
" 'airport',\n",
" 'conference',\n",
" 'centre',\n",
" 'staff',\n",
" 'met',\n",
" 'ground',\n",
" 'thorough',\n",
" 'courteous',\n",
" 'mind',\n",
" 'staying',\n",
" 'walking',\n",
" 'distance',\n",
" 'conference',\n",
" 'centre',\n",
" 'general',\n",
" 'manager',\n",
" 'ash',\n",
" 'day',\n",
" 'inn',\n",
" 'name',\n",
" 'jon',\n",
" 'family',\n",
" 'stayed',\n",
" 'day',\n",
" 'inn',\n",
" 'past',\n",
" 'weekend',\n",
" 'wanted',\n",
" 'compliment',\n",
" 'employee',\n",
" 'jacob',\n",
" 'thursday',\n",
" 'night',\n",
" 'checked',\n",
" 'helpful',\n",
" 'accommodating',\n",
" 'exemplified',\n",
" 'customer',\n",
" 'service',\n",
" 'seen',\n",
" 'quite',\n",
" 'say',\n",
" 'great',\n",
" 'employee',\n",
" 'worth',\n",
" 'favorable',\n",
" 'review',\n",
" 'express',\n",
" 'room',\n",
" 'nice',\n",
" 'clean',\n",
" 'bed',\n",
" 'comfortable',\n",
" 'except',\n",
" 'bit',\n",
" 'fresh',\n",
" 'paint',\n",
" 'top',\n",
" 'great',\n",
" 'room',\n",
" 'host',\n",
" 'enjoyed',\n",
" 'complimentary',\n",
" 'hot',\n",
" 'breakfast',\n",
" 'served',\n",
" 'btw',\n",
" 'never',\n",
" 'asked',\n",
" 'pay',\n",
" 'parking',\n",
" 'fee',\n",
" 'wanted',\n",
" 'say',\n",
" 'thank',\n",
" 'excellent',\n",
" 'experience',\n",
" 'sure',\n",
" 'pas',\n",
" 'recommendation',\n",
" 'friend',\n",
" 'regard',\n",
" 'jon',\n",
" 'place',\n",
" 'good',\n",
" 'far',\n",
" 'cost',\n",
" 'realize',\n",
" 'day',\n",
" 'park',\n",
" 'yes',\n",
" 'parking',\n",
" 'lot',\n",
" 'parking',\n",
" 'garage',\n",
" 'maybe',\n",
" 'old',\n",
" 'motel',\n",
" 'like',\n",
" 'parking',\n",
" 'situation',\n",
" 'guess',\n",
" 'lot',\n",
" 'going',\n",
" 'san',\n",
" 'antonio',\n",
" 'come',\n",
" 'put',\n",
" 'price',\n",
" 'room',\n",
" 'good',\n",
" 'time',\n",
" 'nice',\n",
" 'staff',\n",
" 'hot',\n",
" 'breakfast',\n",
" 'comfortable',\n",
" 'bed',\n",
" 'wifi',\n",
" 'cooling',\n",
" 'heat',\n",
" 'good',\n",
" 'hot',\n",
" 'shower',\n",
" 'know',\n",
" 'older',\n",
" 'hotel',\n",
" 'served',\n",
" 'purpose',\n",
" 'close',\n",
" 'lackland',\n",
" 'afb',\n",
" 'hard',\n",
" 'find',\n",
" 'several',\n",
" 'hotel',\n",
" 'nearby',\n",
" 'parking',\n",
" 'easy',\n",
" 'park',\n",
" 'far',\n",
" 'inner',\n",
" 'section',\n",
" 'pool',\n",
" 'use',\n",
" 'pool',\n",
" 'nice',\n",
" 'really',\n",
" 'impressed',\n",
" 'hot',\n",
" 'breakfast',\n",
" 'worry',\n",
" 'go',\n",
" 'eat',\n",
" 'graduate',\n",
" 'bmt',\n",
" 'nice',\n",
" 'close',\n",
" 'upon',\n",
" 'check',\n",
" 'evening',\n",
" 'receptionist',\n",
" 'friendly',\n",
" 'helpful',\n",
" 'although',\n",
" 'quite',\n",
" 'people',\n",
" 'checking',\n",
" 'ahead',\n",
" 'wait',\n",
" 'time',\n",
" 'minimal',\n",
" 'friendly',\n",
" 'courteous',\n",
" 'everyone',\n",
" 'room',\n",
" 'look',\n",
" 'attempted',\n",
" 'upgrade',\n",
" 'include',\n",
" 'hdtv',\n",
" 'refrigerator',\n",
" 'hair',\n",
" 'dryer',\n",
" 'ironing',\n",
" 'board',\n",
" 'iron',\n",
" 'bed',\n",
" 'comfortable',\n",
" 'room',\n",
" 'clean',\n",
" 'room',\n",
" 'faced',\n",
" 'pool',\n",
" 'little',\n",
" 'noisy',\n",
" 'kid',\n",
" 'playing',\n",
" 'overall',\n",
" 'value',\n",
" 'good',\n",
" 'family',\n",
" 'husband',\n",
" 'booked',\n",
" 'priceline',\n",
" 'day',\n",
" 'room',\n",
" 'double',\n",
" 'queen',\n",
" 'good',\n",
" 'rate',\n",
" 'staff',\n",
" 'welcoming',\n",
" 'friendly',\n",
" 'helpful',\n",
" 'liked',\n",
" 'room',\n",
" 'connected',\n",
" 'u',\n",
" 'family',\n",
" 'stick',\n",
" 'together',\n",
" 'room',\n",
" 'neatly',\n",
" 'updated',\n",
" 'hdtv',\n",
" 'comfortable',\n",
" 'bed',\n",
" 'breakfast',\n",
" 'area',\n",
" 'small',\n",
" 'great',\n",
" 'denny',\n",
" 'walking',\n",
" 'distance',\n",
" 'away',\n",
" 'would',\n",
" 'return',\n",
" 'budget',\n",
" 'oh',\n",
" 'plus',\n",
" 'got',\n",
" 'little',\n",
" 'gift',\n",
" 'bag',\n",
" 'appreciate',\n",
" 'made',\n",
" 'u',\n",
" 'feel',\n",
" 'welcome',\n",
" 'thanked',\n",
" 'guest',\n",
" 'housekeeping',\n",
" 'like',\n",
" 'barge',\n",
" 'letting',\n",
" 'otherwise',\n",
" 'totally',\n",
" 'banging',\n",
" 'door',\n",
" 'view',\n",
" 'window',\n",
" 'brick',\n",
" 'wall',\n",
" 'swimming',\n",
" 'pool',\n",
" 'look',\n",
" 'like',\n",
" 'cess',\n",
" 'pit',\n",
" 'room',\n",
" 'ugly',\n",
" 'clean',\n",
" 'location',\n",
" 'convenient',\n",
" 'slightest',\n",
" 'stayed',\n",
" 'hotel',\n",
" 'different',\n",
" 'time',\n",
" 'past',\n",
" 'month',\n",
" 'satisfied',\n",
" 'staff',\n",
" 'friendly',\n",
" 'helpfully',\n",
" 'clean',\n",
" 'room',\n",
" 'part',\n",
" 'think',\n",
" 'may',\n",
" 'upgrading',\n",
" 'room',\n",
" 'second',\n",
" 'room',\n",
" 'stayed',\n",
" 'nice',\n",
" 'nicer',\n",
" 'bathroom',\n",
" 'others',\n",
" 'others',\n",
" 'still',\n",
" 'nice',\n",
" 'convenient',\n",
" 'going',\n",
" 'visiting',\n",
" 'lackland',\n",
" 'afb',\n",
" 'base',\n",
" 'literally',\n",
" 'ther',\n",
" 'street',\n",
" 'clean',\n",
" 'pool',\n",
" 'want',\n",
" 'swim',\n",
" 'good',\n",
" 'breakfast',\n",
" 'even',\n",
" 'though',\n",
" 'went',\n",
" 'twice',\n",
" 'perfect',\n",
" 'hotel',\n",
" 'price',\n",
" 'range',\n",
" 'feel',\n",
" 'safe',\n",
" 'stay',\n",
" 'important',\n",
" 'complaint',\n",
" 'hojo',\n",
" 'staff',\n",
" 'reception',\n",
" 'pleasant',\n",
" 'request',\n",
" 'accomidated',\n",
" 'breakfast',\n",
" 'nice',\n",
" 'carbs',\n",
" 'one',\n",
" 'back',\n",
" 'room',\n",
" 'pleasantly',\n",
" 'large',\n",
" 'room',\n",
" 'fridge',\n",
" 'microwave',\n",
" 'know',\n",
" 'older',\n",
" 'facility',\n",
" 'show',\n",
" 'age',\n",
" 'noticable',\n",
" 'repair',\n",
" 'lieu',\n",
" 'upgrade',\n",
" 'like',\n",
" 'staying',\n",
" 'great',\n",
" 'aunt',\n",
" 'house',\n",
" 'whose',\n",
" 'failing',\n",
" 'vision',\n",
" 'see',\n",
" 'allow',\n",
" 'see',\n",
" 'dust',\n",
" 'grime',\n",
" 'sheet',\n",
" 'need',\n",
" 'good',\n",
" 'bleaching',\n",
" 'make',\n",
" 'bed',\n",
" 'nicely',\n",
" 'work',\n",
" 'hard',\n",
" 'welcome',\n",
" 'air',\n",
" 'heat',\n",
" 'could',\n",
" 'worked',\n",
" 'better',\n",
" 'interior',\n",
" 'pool',\n",
" 'looked',\n",
" 'wonderful',\n",
" 'lush',\n",
" 'though',\n",
" 'property',\n",
" 'surrounded',\n",
" 'side',\n",
" 'large',\n",
" 'wrought',\n",
" 'iron',\n",
" 'fencing',\n",
" 'back',\n",
" 'neighborhood',\n",
" 'looked',\n",
" 'fine',\n",
" 'felt',\n",
" 'comfortable',\n",
" 'immediate',\n",
" 'area']]"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Prepare corpus\n",
"corpus = [preprocess(doc) for doc in final_df['reviews']]\n",
"\n",
"# Show five first elements of corpus\n",
"corpus[:5]"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Xp_MVYTo4uA2"
},
"source": [
"### Implementation"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"executionInfo": {
"elapsed": 9128,
"status": "ok",
"timestamp": 1730740633059,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "pmXFXMer4vkh"
},
"outputs": [],
"source": [
"# Initialize BM25\n",
"bm25 = BM25Okapi(corpus)"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"executionInfo": {
"elapsed": 3,
"status": "ok",
"timestamp": 1730740633060,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "tLTrsx7x4wkR"
},
"outputs": [],
"source": [
"# Define function to retrieve most similar place\n",
"def retrieve_bm25(query, k=1):\n",
" query = preprocess(query)\n",
" scores = bm25.get_scores(query)\n",
"\n",
" # Returns the indices of scores sorted in descending order & selects the top k indices corresponding to the highest scores.\n",
" top_k_idx = np.argsort(scores)[::-1][:k]\n",
"\n",
" return final_df.iloc[top_k_idx][['offering_id', 'name', 'hotel_class', 'ratings', 'reviews']]"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "GE9DpR5P4yTs"
},
"source": [
"### Best hotel for different queries according to the ***BM25 model***"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 81
},
"executionInfo": {
"elapsed": 933,
"status": "ok",
"timestamp": 1730740633991,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "GuR02bXz4yIy",
"outputId": "c3ece997-6aea-43cc-8971-dba40530d821"
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" \n",
" offering_id \n",
" name \n",
" hotel_class \n",
" ratings \n",
" reviews \n",
" \n",
" \n",
" \n",
" \n",
" 2612 \n",
" 258705 \n",
" Hotel Commonwealth \n",
" 4.0 \n",
" {'service': 4.8, 'cleanliness': 4.9, 'overall'... \n",
" I was pleasantly surprised that this hotel was... \n",
" \n",
" \n",
"
\n",
"
"
],
"text/plain": [
" offering_id name hotel_class \\\n",
"2612 258705 Hotel Commonwealth 4.0 \n",
"\n",
" ratings \\\n",
"2612 {'service': 4.8, 'cleanliness': 4.9, 'overall'... \n",
"\n",
" reviews \n",
"2612 I was pleasantly surprised that this hotel was... "
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Example usage\n",
"query_service = 'excellent service and clean rooms'\n",
"retrieve_bm25(query_service)"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 81
},
"executionInfo": {
"elapsed": 490,
"status": "ok",
"timestamp": 1730740634478,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "9dirk0sH41Ad",
"outputId": "ee22dce2-de86-49cc-bd01-8c8fa2b9be92"
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" \n",
" offering_id \n",
" name \n",
" hotel_class \n",
" ratings \n",
" reviews \n",
" \n",
" \n",
" \n",
" \n",
" 166 \n",
" 79868 \n",
" Bay Club Hotel & Marina \n",
" 3.0 \n",
" {'service': 4.6, 'cleanliness': 4.5, 'overall'... \n",
" Great hopitality and a wonderful location. The... \n",
" \n",
" \n",
"
\n",
"
"
],
"text/plain": [
" offering_id name hotel_class \\\n",
"166 79868 Bay Club Hotel & Marina 3.0 \n",
"\n",
" ratings \\\n",
"166 {'service': 4.6, 'cleanliness': 4.5, 'overall'... \n",
"\n",
" reviews \n",
"166 Great hopitality and a wonderful location. The... "
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_food = 'delicious food and great view'\n",
"retrieve_bm25(query_food)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "lg089CyC42ug"
},
"source": [
"## Custom Recommendation Model\n",
"Here, we will experiment with different methods to improve on BM25. We'll start with TF-IDF and progress to embedding-based models using `sentence-transformers`. Finally, we may re-rank using a similarity metric to find the best match."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "qEsWtOPB44P_"
},
"source": [
"### TF-IDF-Based Custom Model"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "6QYJaO4K44kR"
},
"source": [
"Using **TF-IDF (Term Frequency - Inverse Document Frequency)**, we can create vector representations for each place's concatenated reviews. Then, we’ll compare a query with these vectors using cosine similarity."
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"executionInfo": {
"elapsed": 59531,
"status": "ok",
"timestamp": 1730740694007,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "TYicPEK646Ee"
},
"outputs": [],
"source": [
"# Instantiate TF-IDF Vectorizer\n",
"tfidf_vectorizer = TfidfVectorizer(stop_words='english')\n",
"\n",
"# Fit and transform reviews into TF-IDF vectors\n",
"tfidf_matrix = tfidf_vectorizer.fit_transform(final_df['reviews']) # 'reviews' column with concatenated reviews per place"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"executionInfo": {
"elapsed": 7,
"status": "ok",
"timestamp": 1730740694007,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "bEom928M47kP"
},
"outputs": [],
"source": [
"def retrieve_tfidf(query, k=5):\n",
" # Transform the query text into TF-IDF vector\n",
" query_vec = tfidf_vectorizer.transform([query])\n",
"\n",
" # Compute cosine similarity between the query vector and all document vectors\n",
" scores = cosine_similarity(query_vec, tfidf_matrix).flatten()\n",
"\n",
" # Get the indices of the top-k most similar places\n",
" top_k_idx = scores.argsort()[::-1][:k]\n",
"\n",
" # Return the top-k places based on similarity\n",
" return final_df.iloc[top_k_idx][['offering_id', 'name', 'hotel_class', 'ratings', 'reviews']]"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "p4F9amL949us"
},
"source": [
"#### Top 5 hotels for different queries according to the ***TF-IDF model***"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 206
},
"executionInfo": {
"elapsed": 360,
"status": "ok",
"timestamp": 1730740694360,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "i8qiCAoF4-xL",
"outputId": "cd15a791-370a-4875-e818-647c4da382d8"
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" \n",
" offering_id \n",
" name \n",
" hotel_class \n",
" ratings \n",
" reviews \n",
" \n",
" \n",
" \n",
" \n",
" 1091 \n",
" 98940 \n",
" Houston Inn and Suites \n",
" 0.0 \n",
" {'service': 3.0, 'cleanliness': 3.0, 'overall'... \n",
" I stay for a weekend, and the rooms were nice ... \n",
" \n",
" \n",
" 2570 \n",
" 249793 \n",
" BEST WESTERN Fort Worth Inn & Suites \n",
" 3.0 \n",
" {'service': 4.7, 'cleanliness': 4.8, 'overall'... \n",
" I was very impressed when as I was walking in ... \n",
" \n",
" \n",
" 1531 \n",
" 109101 \n",
" La Quinta Inn & Suites Fort Worth North \n",
" 2.5 \n",
" {'service': 4.4, 'cleanliness': 4.6, 'overall'... \n",
" Rolling into Fort Worth after a long day on th... \n",
" \n",
" \n",
" 90 \n",
" 74845 \n",
" Comfort Inn West \n",
" 2.0 \n",
" {'service': 4.5, 'cleanliness': 4.5, 'overall'... \n",
" We had a wonderful stay!! Beautiful redone roo... \n",
" \n",
" \n",
" 1973 \n",
" 124066 \n",
" Ramada Limited Addison \n",
" 0.0 \n",
" {'service': 4.0, 'cleanliness': 3.0, 'overall'... \n",
" Hotel based on a main road but within walking ... \n",
" \n",
" \n",
"
\n",
"
"
],
"text/plain": [
" offering_id name hotel_class \\\n",
"1091 98940 Houston Inn and Suites 0.0 \n",
"2570 249793 BEST WESTERN Fort Worth Inn & Suites 3.0 \n",
"1531 109101 La Quinta Inn & Suites Fort Worth North 2.5 \n",
"90 74845 Comfort Inn West 2.0 \n",
"1973 124066 Ramada Limited Addison 0.0 \n",
"\n",
" ratings \\\n",
"1091 {'service': 3.0, 'cleanliness': 3.0, 'overall'... \n",
"2570 {'service': 4.7, 'cleanliness': 4.8, 'overall'... \n",
"1531 {'service': 4.4, 'cleanliness': 4.6, 'overall'... \n",
"90 {'service': 4.5, 'cleanliness': 4.5, 'overall'... \n",
"1973 {'service': 4.0, 'cleanliness': 3.0, 'overall'... \n",
"\n",
" reviews \n",
"1091 I stay for a weekend, and the rooms were nice ... \n",
"2570 I was very impressed when as I was walking in ... \n",
"1531 Rolling into Fort Worth after a long day on th... \n",
"90 We had a wonderful stay!! Beautiful redone roo... \n",
"1973 Hotel based on a main road but within walking ... "
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"retrieve_tfidf(query_service)"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 206
},
"executionInfo": {
"elapsed": 1149,
"status": "ok",
"timestamp": 1730740695506,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "L_1PQ7YX4_EF",
"outputId": "814b0e36-aaca-4f28-812f-c81425bc7f65"
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" \n",
" offering_id \n",
" name \n",
" hotel_class \n",
" ratings \n",
" reviews \n",
" \n",
" \n",
" \n",
" \n",
" 1645 \n",
" 112136 \n",
" Penn's View Hotel \n",
" 3.0 \n",
" {'service': 4.5, 'cleanliness': 4.6, 'overall'... \n",
" This hotel is located in the old city and is c... \n",
" \n",
" \n",
" 603 \n",
" 87608 \n",
" Holiday Inn Chicago - Mart Plaza \n",
" 3.0 \n",
" {'service': 4.1, 'cleanliness': 4.3, 'overall'... \n",
" If your visiting Chicago, and are a little fle... \n",
" \n",
" \n",
" 1263 \n",
" 100567 \n",
" The Edgewater Hotel Seattle \n",
" 4.0 \n",
" {'service': 4.3, 'cleanliness': 4.3, 'overall'... \n",
" I have traveled to Seattle extremely often for... \n",
" \n",
" \n",
" 602 \n",
" 87603 \n",
" Hotel 71, Wyndham Affiliate \n",
" 3.5 \n",
" {'service': 4.4, 'cleanliness': 4.4, 'overall'... \n",
" Stayed there for 6 nights, Upon arrival shocke... \n",
" \n",
" \n",
" 703 \n",
" 89620 \n",
" Hyatt Harborside at Boston's Logan Internation... \n",
" 4.0 \n",
" {'service': 4.3, 'cleanliness': 4.5, 'overall'... \n",
" My boyfriend and I loved this hotel! Our room ... \n",
" \n",
" \n",
"
\n",
"
"
],
"text/plain": [
" offering_id name \\\n",
"1645 112136 Penn's View Hotel \n",
"603 87608 Holiday Inn Chicago - Mart Plaza \n",
"1263 100567 The Edgewater Hotel Seattle \n",
"602 87603 Hotel 71, Wyndham Affiliate \n",
"703 89620 Hyatt Harborside at Boston's Logan Internation... \n",
"\n",
" hotel_class ratings \\\n",
"1645 3.0 {'service': 4.5, 'cleanliness': 4.6, 'overall'... \n",
"603 3.0 {'service': 4.1, 'cleanliness': 4.3, 'overall'... \n",
"1263 4.0 {'service': 4.3, 'cleanliness': 4.3, 'overall'... \n",
"602 3.5 {'service': 4.4, 'cleanliness': 4.4, 'overall'... \n",
"703 4.0 {'service': 4.3, 'cleanliness': 4.5, 'overall'... \n",
"\n",
" reviews \n",
"1645 This hotel is located in the old city and is c... \n",
"603 If your visiting Chicago, and are a little fle... \n",
"1263 I have traveled to Seattle extremely often for... \n",
"602 Stayed there for 6 nights, Upon arrival shocke... \n",
"703 My boyfriend and I loved this hotel! Our room ... "
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"retrieve_tfidf(query_food)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "UXuc7wbI5MPk"
},
"source": [
"### Embedding-Based Custom Model with Sentence Transformers"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "iPjG0-aG5Ogp"
},
"source": [
"Using Sentence Transformers (like `all-MiniLM-L6-v2`), we can create dense embeddings of the review text, which generally capture semantic similarities better than TF-IDF."
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 560,
"referenced_widgets": [
"41aa5f25235d435ab2022deb265c1224",
"211c9255a5b841a6981f4bb90ef0cc08",
"f39607b761c34a09bfd9fac0ecaab5f4",
"bc6bc1c1abff40a8bc46bd40fcba011d",
"98e8d5ab089c4972b28de304327ef4b0",
"78cd919f72c24e349e35568be0834f8b",
"018a4edc5a514defa8f88b76c5d2051d",
"0288d54f62a44a51af1cc28cbffeb414",
"0d3fc9b5eb5343688db12952c9b79209",
"50e7bea3bda3449993f02204ab708c84",
"2c4d0395d3c9450c93cdf92534fb667e",
"d4933c6ee65f47638f514b25da3031ff",
"a8827aab24ad4f62a569ebc7c49e5e13",
"5f8eb46b969f499281890ad8feaf34e9",
"c12e263dd34e41e4a15390d05b99e937",
"25e00f5d82e448368c2d132f0274c089",
"40426b1e8b6d4b9f90bc8b6551e33dad",
"9f18e1e8a03544c7b43e8ad098385165",
"b9394cb942f44d248d070d2fa38a3a60",
"d339ebcb8350489ab8adc6ade4a3549f",
"0871e359a4754641b1001a7748790ef3",
"f8279f7457b54ea79871d8afffaa4c1a",
"eae753a0f03a4940811ce9b14f3c4fba",
"65cd6c2e7ea4417cbcbc49e9826da3a4",
"59c60af3e36e4afa94c9243a713c862e",
"01f0b4f9c8494b1d940f52e4ae2e052e",
"af6f78c572ec4876ae5e0b7440bea99c",
"b89dba8be52b44ab922ec32a2dbf0388",
"d73b0aa7b82a446287f1597560aa9f2e",
"014a132cc94240bca260251b30e80b1d",
"85f80f687da548c4be2a9aa4bc5f703a",
"3af1e1c748874d9dbe6db67ac60f59a6",
"f2bd0d7b25f64d21a0b101e300f5970f",
"2d731537db694de2abf5e50ff5315582",
"116d7e14b77e4f408e38ee2a9edd2589",
"4d1a8a8036184623b6d9a0bb60134c0b",
"75f9eb8825164ee0ab40ce3d35082182",
"09ad0f566d154ebaa5a1d407e95b1a3f",
"1bda7a65c51c4eda942daba2ba20763a",
"e753b149eebc4195a0c27d055d2c68c1",
"333ae2cfd578425c93552c57c4b7f8ae",
"eed1334d1cfb47b8b6568b754b36e591",
"d098812189de448cb80c14fc9f4b932e",
"b65a40471afa46f88b2a745df3b8164f",
"5760a67d10b342c7b802e0b14ef934b8",
"77a06f1ab92a437a93326ca36667ca87",
"e53714b59cb44cdfb55fe0768be6960d",
"d92024edce0c416c9bf5389c4763be3e",
"e4bff3bbc2c54c6aa9ffdb855c707def",
"643406402c1b45c798fc3a9b073084b7",
"1f54531a0f9249c79b3db2e715da96c2",
"51b607d567414be9806ba7eecd5b47f5",
"72b931a3238e4d229517d8762e64de4f",
"15df539a86084979a6998c9a8b1987f9",
"d5b2b751583a4eea83dd7d0dac3311df",
"eaa6b4ef21fd4de2b67e0afb28af35a7",
"72045a496fac4abba58f054c4f5dd289",
"829f7b18252f4452bc0c052707438553",
"3aad5b862e584581873e0fcf48dc645f",
"304275b18f514e7da443e0e85cb92ebf",
"77e1d1adc7f84c05830987cbcd9abf6b",
"33ea3f887ac34973aaa72cf28774e17a",
"414026345ea14ad3a1c46f3b48e08dad",
"520a8efd266645e98d64fe12e692fdd4",
"b47a4986321b4011857e43ec29af5930",
"70d1721ee8764008a378e86f8b4d1a40",
"6f315b8a29ec4dceb936f40dcf935ca9",
"3571d5e96c964209bf75196c768c4f16",
"21845b143b0b42a89b8c96371fcb7cfd",
"16ca2364890d4369bd76d528955d5c07",
"407bdd18f40942d4857674f5aef60f5a",
"cc1811b5165a462c80e33ea28c9b1cd5",
"d7aabc8485904970acc594cff1c2ecb1",
"36a5b49746ab413eb824189c5312a4a8",
"869ba914217346d79c3d5d73f5e00ea8",
"02ad14c4355e4b95ac857236b939f386",
"7efde21464b74b89b2b18d4d69bab5f4",
"e513eb3bf3f74a8b9f56f67be171e4c7",
"6a55cee63ed64d5c8bce9853553a152e",
"66e195b05b7d4007875cee4ca105bee5",
"d49c878e0999475c946d18ea7ad24e1b",
"177dad175ea045bbb434e60e5fa6a2d3",
"90b79c66e79f4cb3ac20515d1ee204ba",
"4073631f4ae04f6394e7ec7160c1b267",
"2e73e9f4f6d349deb946044f44f5bc6a",
"c1f08c978d8e40ec87a1d6b348331205",
"c064370aceca42d08529b26489e89d01",
"497a4539b1f44ac884b22c441a5cc0fb",
"62217ad84eee4d9ab223350440f31c40",
"97d56fd60344424dac5c0482ab9efabb",
"c8bd3f54301d4ebda35036a60c4aa5e7",
"d2c96a3539d64fa4b5ca319057ce1838",
"7fc87395d6664c579ad3b49c5d5f5af9",
"63c2cc0bb0d2470b9c663fb789bccbcc",
"4920dfa856c743c28d6b645bc6462d84",
"f9727f5d318f4c578c6f609da8514a03",
"dafa8bd7b78c4dcbaf8bc8bf1f38b441",
"ab3daf3afb14420e917d2ce4b70ed03a",
"e8cd452dbdfe4913b3f06060bdebbc0e",
"e6afaae6e36e46ae94a433b5437012b2",
"4a573eab972948139eb2acccc8171917",
"c92e52e2edf146058ff52be717bb11a1",
"d45e5d6905ad4b24b24b79b80758920c",
"b22240de5d414420ba2d74813e0cb059",
"97c9c08018ac4d98a5a73a4e4d4b4552",
"1e7db89680c144eb846ee878b37c7b78",
"72500bb48c964e91a308f763db191956",
"14c68438a1574435857fb6b5e8952de1",
"c7984758157f4bd885a29f3d13508f12",
"1f49e5943d254770a92ea4778733a86a",
"c410fd7c34974edba7625df2c5dda519",
"398becbbdc324b2d9033ffb2511ab612",
"cf6ba93b63214a79b705c74f9ab08470",
"7a0b0da7087a4830a5263f93a9835870",
"34c8ce56ae6047809e128507f7ec1333",
"f0596d60ee03456899a44a9f024c13e8",
"7705618bee1b436eb9cc7c954281b985",
"8aa2e9c7ecc8463d80969f8be0b8fa97",
"145978f0aad44da0b8cb60e65add585c",
"fab297d175304e86be07326a50074a96",
"3ecefd97e39949778d00bddddbcd93d5",
"75abb8d3b53341aabed11f3e3e789423",
"64c0d4f909914b66aca9fd86882d4759",
"a22f72bba00646fc8d424e84f9761d04",
"39b450fcf67e4eff80f0a94ce03c3d9a",
"419c59425958463e86e3c89e7897562d",
"ccd93edd28fd47dc8ed9b25fdff1273b",
"8ab8006e3d0c46c688e3f8f30f1076d3",
"c792016a2f1e421c9c63a4760529c6dd",
"9992af63e99a4d279c2a00b6e2609532",
"6988cc6bae8646baabc123c4a90eb1f2",
"0d79e7275de94b5b8744179614b31fdd"
]
},
"executionInfo": {
"elapsed": 102225,
"status": "ok",
"timestamp": 1730741587609,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "azjzZN-C5L2Y",
"outputId": "1b4e0ccb-f44c-4141-d0d8-610a8e3d28e9"
},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "2f06970571ed4c1ca631bdb54f2fd604",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Batches: 0%| | 0/118 [00:00, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Load a pre-trained Sentence Transformer model\n",
"model = SentenceTransformer('all-MiniLM-L6-v2')\n",
"\n",
"# Generate embeddings for each place's concatenated reviews\n",
"embeddings = model.encode(final_df['reviews'].tolist(), show_progress_bar=True)"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"executionInfo": {
"elapsed": 1,
"status": "ok",
"timestamp": 1730741587610,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "pAihREds5S0Z"
},
"outputs": [],
"source": [
"def retrieve_embeddings(query, k=5):\n",
" # Encode the query into an embedding vector\n",
" query_embedding = model.encode(query)\n",
"\n",
" # Compute cosine similarity between the query embedding and all document embeddings\n",
" scores = cosine_similarity([query_embedding], embeddings).flatten()\n",
"\n",
" # Get indices of the top-k most similar places\n",
" top_k_idx = np.argsort(scores)[::-1][:k]\n",
"\n",
" # Return the top-k most similar places\n",
" return final_df.iloc[top_k_idx][['offering_id', 'name', 'hotel_class', 'ratings', 'reviews']]"
]
},
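{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a reminder of what `retrieve_embeddings` ranks by, here is the standard cosine-similarity definition. The symbols $q$ (query embedding), $d_i$ (embedding of place $i$) and $s_i$ are our own notation for illustration and do not appear in the code above.\n",
"\n",
"$$\n",
"s_i = \\frac{q \\cdot d_i}{\\lVert q \\rVert \\, \\lVert d_i \\rVert}, \\qquad -1 \\le s_i \\le 1\n",
"$$\n",
"\n",
"The function then returns the $k$ places with the largest $s_i$. For example, with the made-up vectors $q = (1, 0)$ and $d_i = (1, 1)$, we get $s_i = 1 / \\sqrt{2} \\approx 0.71$."
]
},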
{
"cell_type": "markdown",
"metadata": {
"id": "359xmFGB5UOa"
},
"source": [
"#### Top 5 hotels for different queries according to the ***Embedding-Based Custom model***"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 206
},
"executionInfo": {
"elapsed": 211,
"status": "ok",
"timestamp": 1730741587820,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "RHmcj0Z55Vnp",
"outputId": "2fa72a10-5c47-4fb5-dafb-dc9deda8b355"
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" \n",
" offering_id \n",
" name \n",
" hotel_class \n",
" ratings \n",
" reviews \n",
" \n",
" \n",
" \n",
" \n",
" 3301 \n",
" 1174784 \n",
" Holiday Inn Baltimore-Towson \n",
" 0.0 \n",
" {'service': 5.0, 'cleanliness': 5.0, 'overall'... \n",
" Found the room exceptionally clean, the staff ... \n",
" \n",
" \n",
" 3086 \n",
" 656554 \n",
" Charlotte Express Inn \n",
" 2.0 \n",
" {'service': 3.5, 'cleanliness': 2.5, 'overall'... \n",
" Very courteous, helpful and professional staff... \n",
" \n",
" \n",
" 1818 \n",
" 119997 \n",
" Belcaro Motel \n",
" 0.0 \n",
" {'service': 5.0, 'cleanliness': 5.0, 'overall'... \n",
" Perfectly clean. Some new bath renovations, bu... \n",
" \n",
" \n",
" 2823 \n",
" 497952 \n",
" Americas Best Value Inn - San Antonio / Lackla... \n",
" 2.0 \n",
" {'service': 2.8, 'cleanliness': 3.8, 'overall'... \n",
" Great clean rooms and great service.\\nHotel wa... \n",
" \n",
" \n",
" 2881 \n",
" 553345 \n",
" Americas Best Value Inn & Suites Granada Hills \n",
" 2.0 \n",
" {'service': 4.7, 'cleanliness': 4.4, 'overall'... \n",
" Rooms were clean,air conditioner worked well,t... \n",
" \n",
" \n",
"
\n",
"
"
],
"text/plain": [
" offering_id name \\\n",
"3301 1174784 Holiday Inn Baltimore-Towson \n",
"3086 656554 Charlotte Express Inn \n",
"1818 119997 Belcaro Motel \n",
"2823 497952 Americas Best Value Inn - San Antonio / Lackla... \n",
"2881 553345 Americas Best Value Inn & Suites Granada Hills \n",
"\n",
" hotel_class ratings \\\n",
"3301 0.0 {'service': 5.0, 'cleanliness': 5.0, 'overall'... \n",
"3086 2.0 {'service': 3.5, 'cleanliness': 2.5, 'overall'... \n",
"1818 0.0 {'service': 5.0, 'cleanliness': 5.0, 'overall'... \n",
"2823 2.0 {'service': 2.8, 'cleanliness': 3.8, 'overall'... \n",
"2881 2.0 {'service': 4.7, 'cleanliness': 4.4, 'overall'... \n",
"\n",
" reviews \n",
"3301 Found the room exceptionally clean, the staff ... \n",
"3086 Very courteous, helpful and professional staff... \n",
"1818 Perfectly clean. Some new bath renovations, bu... \n",
"2823 Great clean rooms and great service.\\nHotel wa... \n",
"2881 Rooms were clean,air conditioner worked well,t... "
]
},
"execution_count": 31,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"retrieve_embeddings(query_service)"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 206
},
"executionInfo": {
"elapsed": 877,
"status": "ok",
"timestamp": 1730741588697,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "D2cE83Oo5WD_",
"outputId": "ae47ee1e-4660-4e93-b7f5-1c5a246efb42"
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" \n",
" offering_id \n",
" name \n",
" hotel_class \n",
" ratings \n",
" reviews \n",
" \n",
" \n",
" \n",
" \n",
" 3723 \n",
" 2627745 \n",
" Hotel Palomar Phoenix - a Kimpton Hotel \n",
" 4.0 \n",
" {'service': 4.9, 'cleanliness': 4.9, 'overall'... \n",
" We are big fans of Kimpton Hotels and have sta... \n",
" \n",
" \n",
" 2692 \n",
" 275455 \n",
" Scottish Inn Memphis Airport \n",
" 0.0 \n",
" {'service': 4.0, 'cleanliness': 4.0, 'overall'... \n",
" Very nice property located several long blocks... \n",
" \n",
" \n",
" 3691 \n",
" 2151571 \n",
" Hotel Americano \n",
" 0.0 \n",
" {'service': 3.9, 'cleanliness': 4.4, 'overall'... \n",
" I have just come back from a 9 day trip from t... \n",
" \n",
" \n",
" 1499 \n",
" 108980 \n",
" Hampton Inn Austin - Arboretum Northwest \n",
" 2.5 \n",
" {'service': 4.8, 'cleanliness': 4.7, 'overall'... \n",
" Everything about this hotel screams that they ... \n",
" \n",
" \n",
" 1629 \n",
" 111751 \n",
" Hotel Bel-Air \n",
" 5.0 \n",
" {'service': 4.7, 'cleanliness': 4.9, 'overall'... \n",
" Beautiful & classic high class hotel. The spac... \n",
" \n",
" \n",
"
\n",
"
"
],
"text/plain": [
" offering_id name hotel_class \\\n",
"3723 2627745 Hotel Palomar Phoenix - a Kimpton Hotel 4.0 \n",
"2692 275455 Scottish Inn Memphis Airport 0.0 \n",
"3691 2151571 Hotel Americano 0.0 \n",
"1499 108980 Hampton Inn Austin - Arboretum Northwest 2.5 \n",
"1629 111751 Hotel Bel-Air 5.0 \n",
"\n",
" ratings \\\n",
"3723 {'service': 4.9, 'cleanliness': 4.9, 'overall'... \n",
"2692 {'service': 4.0, 'cleanliness': 4.0, 'overall'... \n",
"3691 {'service': 3.9, 'cleanliness': 4.4, 'overall'... \n",
"1499 {'service': 4.8, 'cleanliness': 4.7, 'overall'... \n",
"1629 {'service': 4.7, 'cleanliness': 4.9, 'overall'... \n",
"\n",
" reviews \n",
"3723 We are big fans of Kimpton Hotels and have sta... \n",
"2692 Very nice property located several long blocks... \n",
"3691 I have just come back from a 9 day trip from t... \n",
"1499 Everything about this hotel screams that they ... \n",
"1629 Beautiful & classic high class hotel. The spac... "
]
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"retrieve_embeddings(query_food)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "pcjdwA7n5Zk8"
},
"source": [
"### Re-Ranking Using Hybrid Approach"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ss1oXhzc5Z9N"
},
"source": [
"To combine the strengths of both models, we create a **hybrid model** where we:\n",
"1. Retrieve a larger set of similar places (e.g., top 10) using TF-IDF, and then\n",
"2. Re-rank these candidates with the embedding-based model."
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"executionInfo": {
"elapsed": 0,
"status": "ok",
"timestamp": 1730741588698,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "xDJjUrPw5cMU"
},
"outputs": [],
"source": [
"def retrieve_hybrid(query, initial_k=10, final_k=5):\n",
" # Step 1: Initial retrieval with TF-IDF or BM25 to get top-k candidates\n",
" initial_candidates = retrieve_tfidf(query, k=initial_k)\n",
"\n",
" # Step 2: Generate embeddings for the initial candidates' reviews\n",
" candidate_embeddings = model.encode(initial_candidates['reviews'].tolist())\n",
" query_embedding = model.encode(query)\n",
"\n",
" # Step 3: Re-rank candidates based on embedding similarity\n",
" scores = cosine_similarity([query_embedding], candidate_embeddings).flatten()\n",
" top_k_idx = np.argsort(scores)[::-1][:final_k]\n",
"\n",
" # Return final top-k ranked places\n",
" return initial_candidates.iloc[top_k_idx]"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "anmtGoRz5d1I"
},
"source": [
"#### Top 5 hotels for different queries according to the ***Hybrid model***"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 206
},
"executionInfo": {
"elapsed": 4915,
"status": "ok",
"timestamp": 1730741593613,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "ir_BGgS65fCA",
"outputId": "63f4708a-3b97-40c4-a326-344e4786ad9e"
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" \n",
" offering_id \n",
" name \n",
" hotel_class \n",
" ratings \n",
" reviews \n",
" \n",
" \n",
" \n",
" \n",
" 1091 \n",
" 98940 \n",
" Houston Inn and Suites \n",
" 0.0 \n",
" {'service': 3.0, 'cleanliness': 3.0, 'overall'... \n",
" I stay for a weekend, and the rooms were nice ... \n",
" \n",
" \n",
" 90 \n",
" 74845 \n",
" Comfort Inn West \n",
" 2.0 \n",
" {'service': 4.5, 'cleanliness': 4.5, 'overall'... \n",
" We had a wonderful stay!! Beautiful redone roo... \n",
" \n",
" \n",
" 2178 \n",
" 223171 \n",
" Super 8 Houston \n",
" 2.0 \n",
" {'service': 4.4, 'cleanliness': 4.2, 'overall'... \n",
" The hotel was very neat and good for the price... \n",
" \n",
" \n",
" 2570 \n",
" 249793 \n",
" BEST WESTERN Fort Worth Inn & Suites \n",
" 3.0 \n",
" {'service': 4.7, 'cleanliness': 4.8, 'overall'... \n",
" I was very impressed when as I was walking in ... \n",
" \n",
" \n",
" 3625 \n",
" 1846923 \n",
" Sleep Inn & Suites I-45 / Airtex \n",
" 2.0 \n",
" {'service': 4.8, 'cleanliness': 4.8, 'overall'... \n",
" I stay in many Choice Hotels during the year. ... \n",
" \n",
" \n",
"
\n",
"
"
],
"text/plain": [
" offering_id name hotel_class \\\n",
"1091 98940 Houston Inn and Suites 0.0 \n",
"90 74845 Comfort Inn West 2.0 \n",
"2178 223171 Super 8 Houston 2.0 \n",
"2570 249793 BEST WESTERN Fort Worth Inn & Suites 3.0 \n",
"3625 1846923 Sleep Inn & Suites I-45 / Airtex 2.0 \n",
"\n",
" ratings \\\n",
"1091 {'service': 3.0, 'cleanliness': 3.0, 'overall'... \n",
"90 {'service': 4.5, 'cleanliness': 4.5, 'overall'... \n",
"2178 {'service': 4.4, 'cleanliness': 4.2, 'overall'... \n",
"2570 {'service': 4.7, 'cleanliness': 4.8, 'overall'... \n",
"3625 {'service': 4.8, 'cleanliness': 4.8, 'overall'... \n",
"\n",
" reviews \n",
"1091 I stay for a weekend, and the rooms were nice ... \n",
"90 We had a wonderful stay!! Beautiful redone roo... \n",
"2178 The hotel was very neat and good for the price... \n",
"2570 I was very impressed when as I was walking in ... \n",
"3625 I stay in many Choice Hotels during the year. ... "
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"retrieve_hybrid(query_service)"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 206
},
"executionInfo": {
"elapsed": 4165,
"status": "ok",
"timestamp": 1730741597778,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "Vv_jq37x5fUd",
"outputId": "493b7f64-8b22-47d1-ef0d-0db59744f049"
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" \n",
" offering_id \n",
" name \n",
" hotel_class \n",
" ratings \n",
" reviews \n",
" \n",
" \n",
" \n",
" \n",
" 166 \n",
" 79868 \n",
" Bay Club Hotel & Marina \n",
" 3.0 \n",
" {'service': 4.6, 'cleanliness': 4.5, 'overall'... \n",
" Great hopitality and a wonderful location. The... \n",
" \n",
" \n",
" 302 \n",
" 81126 \n",
" Mandarin Oriental, San Francisco \n",
" 5.0 \n",
" {'service': 4.8, 'cleanliness': 4.9, 'overall'... \n",
" The sweeping views from the 40 th floor are ju... \n",
" \n",
" \n",
" 2306 \n",
" 223983 \n",
" Baltimore Marriott Waterfront \n",
" 4.0 \n",
" {'service': 4.4, 'cleanliness': 4.6, 'overall'... \n",
" I had a reservation at a different hotel but a... \n",
" \n",
" \n",
" 1234 \n",
" 100507 \n",
" Inn at the Market \n",
" 4.0 \n",
" {'service': 4.8, 'cleanliness': 4.9, 'overall'... \n",
" My cousin and I went to Seattle to attend the ... \n",
" \n",
" \n",
" 1645 \n",
" 112136 \n",
" Penn's View Hotel \n",
" 3.0 \n",
" {'service': 4.5, 'cleanliness': 4.6, 'overall'... \n",
" This hotel is located in the old city and is c... \n",
" \n",
" \n",
"
\n",
"
"
],
"text/plain": [
" offering_id name hotel_class \\\n",
"166 79868 Bay Club Hotel & Marina 3.0 \n",
"302 81126 Mandarin Oriental, San Francisco 5.0 \n",
"2306 223983 Baltimore Marriott Waterfront 4.0 \n",
"1234 100507 Inn at the Market 4.0 \n",
"1645 112136 Penn's View Hotel 3.0 \n",
"\n",
" ratings \\\n",
"166 {'service': 4.6, 'cleanliness': 4.5, 'overall'... \n",
"302 {'service': 4.8, 'cleanliness': 4.9, 'overall'... \n",
"2306 {'service': 4.4, 'cleanliness': 4.6, 'overall'... \n",
"1234 {'service': 4.8, 'cleanliness': 4.9, 'overall'... \n",
"1645 {'service': 4.5, 'cleanliness': 4.6, 'overall'... \n",
"\n",
" reviews \n",
"166 Great hopitality and a wonderful location. The... \n",
"302 The sweeping views from the 40 th floor are ju... \n",
"2306 I had a reservation at a different hotel but a... \n",
"1234 My cousin and I went to Seattle to attend the ... \n",
"1645 This hotel is located in the old city and is c... "
]
},
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"retrieve_hybrid(query_food)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "13i6YgQ45hOp"
},
"source": [
"### Summary of Each Approach\n",
"- **TF-IDF**: Quick but mainly focuses on word overlap, which can miss out on semantic similarity.\n",
"- **Embedding-Based Model**: Captures semantic meaning and is more accurate but computationally heavier.\n",
"- **Hybrid Model**: Combines initial recall with TF-IDF followed by a fine-grained re-ranking using embeddings."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "F1qxWQ2Z5jLt"
},
"source": [
"## Evaluation Metrics\n",
"For evaluation, we will use Mean Squared Error (MSE) to calculate the error between ratings on each aspect for the recommended and actual places.\n",
"This score will be calculated across both BM25 and custom models to compare performance."
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {
"executionInfo": {
"elapsed": 1,
"status": "ok",
"timestamp": 1730741597778,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "DEnTAwaV5kpi"
},
"outputs": [],
"source": [
"# Calculate MSE for a single recommendation\n",
"def calculate_mse(actual_ratings, predicted_ratings):\n",
" return root_mean_squared_error(actual_ratings, predicted_ratings)"
]
},
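{
"cell_type": "markdown",
"metadata": {},
"source": [
"Since `calculate_mse` delegates to `root_mean_squared_error`, the value it returns is the square root of the mean squared error. Assuming this is scikit-learn's function of that name, and writing $a_j$ and $p_j$ (our own notation, not used in the code) for the actual and predicted ratings on aspect $j$ out of $m$ aspects:\n",
"\n",
"$$\n",
"\\mathrm{MSE} = \\frac{1}{m} \\sum_{j=1}^{m} (a_j - p_j)^2, \\qquad \\mathrm{RMSE} = \\sqrt{\\mathrm{MSE}}\n",
"$$\n",
"\n",
"As a small worked example with made-up ratings $a = (5, 4, 3)$ and $p = (4, 4, 5)$: the squared differences are $1, 0, 4$, so $\\mathrm{MSE} = 5/3 \\approx 1.67$ and $\\mathrm{RMSE} \\approx 1.29$. Unlike the MSE, the RMSE is expressed in the same units as the ratings."
]
},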
{
"cell_type": "code",
"execution_count": 37,
"metadata": {
"id": "JBERdcCtZK_5"
},
"outputs": [],
"source": [
"# Define the sample size of query places for evaluation\n",
"sample_size = 100\n",
"samples = final_df.sample(sample_size)\n",
"max_char = 280 # Max limit of characters for query, like an X post"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "2JwOiCB45k7Y"
},
"source": [
"## Results and Analysis"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### MSE scores for each model\n",
"We will run the BM25 and custom model to retrieve places for several test queries, then calculate the MSE for each model and compare the results."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"executionInfo": {
"elapsed": 194677,
"status": "ok",
"timestamp": 1730742280788,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "E9IsxBBQ5nSy",
"outputId": "194dd152-ccbb-4537-9c2a-1cd1098d9061"
},
"outputs": [],
"source": [
"# Store MSE scores for each model\n",
"bm25_mse_scores = []\n",
"custom_mse_scores = []\n",
"custom_embedding_mse_scores = []\n",
"custom_hybrid_mse_scores = []\n",
"\n",
"# Step 1: Loop over a sample of query places in the dataset\n",
"progress_bar = tqdm(samples.iterrows(), total=sample_size, desc=\"MSE Calculations\")\n",
"for _, query_place in progress_bar:\n",
" query_text = query_place['reviews'][:max_char] # Use the concatenated reviews for the place as the query\n",
" actual_ratings = query_place[\"ratings\"] # Actual ratings for MSE calculation\n",
"\n",
" # Step 2: Retrieve the most similar place using BM25\n",
" bm25_recommendation = retrieve_bm25(query_text, k=1)\n",
" bm25_predicted_ratings = bm25_recommendation[\"ratings\"].iloc[0]\n",
"\n",
" # Calculate MSE for BM25\n",
" bm25_mse = calculate_mse(list(actual_ratings.values()), list(bm25_predicted_ratings.values()))\n",
" bm25_mse_scores.append(bm25_mse)\n",
"\n",
" # Step 3: Retrieve the most similar place using the Custom Model\n",
" custom_recommendation = retrieve_tfidf(query_text, k=1)\n",
" custom_predicted_ratings = custom_recommendation[\"ratings\"].iloc[0]\n",
"\n",
" # Calculate MSE for the Custom Model\n",
" custom_mse = calculate_mse(list(actual_ratings.values()), list(custom_predicted_ratings.values()))\n",
" custom_mse_scores.append(custom_mse)\n",
"\n",
" # Step 4: Retrieve the most similar place using the Custom Model\n",
" custom_embedding_recommendation = retrieve_embeddings(query_text, k=1)\n",
" custom_embedding_predicted_ratings = custom_embedding_recommendation[\"ratings\"].iloc[0]\n",
"\n",
" # Calculate MSE for the Custom Model\n",
" custom_embedding_mse = calculate_mse(list(actual_ratings.values()), list(custom_embedding_predicted_ratings.values()))\n",
" custom_embedding_mse_scores.append(custom_embedding_mse)\n",
"\n",
" # Step 5: Retrieve the most similar place using the Custom Model\n",
" custom_hybrid_recommendation = retrieve_hybrid(query_text, final_k=1)\n",
" custom_hybrid_predicted_ratings = custom_hybrid_recommendation[\"ratings\"].iloc[0]\n",
"\n",
" # Calculate MSE for the Custom Model\n",
" custom_hybrid_mse = calculate_mse(list(actual_ratings.values()), list(custom_hybrid_predicted_ratings.values()))\n",
" custom_hybrid_mse_scores.append(custom_hybrid_mse)\n",
"\n",
"# Step 6: Compute the average MSE across all queries for each model\n",
"avg_bm25_mse = sum(bm25_mse_scores) / len(bm25_mse_scores)\n",
"avg_custom_mse = sum(custom_mse_scores) / len(custom_mse_scores)\n",
"avg_custom_embedding_mse = sum(custom_embedding_mse_scores) / len(custom_embedding_mse_scores)\n",
"avg_custom_hybrid_mse = sum(custom_hybrid_mse_scores) / len(custom_hybrid_mse_scores)\n",
"\n",
"print(f\"Average BM25 MSE: {(avg_bm25_mse * 100):.2f}%\")\n",
"print(f\"Average Custom Model MSE: {(avg_custom_mse * 100):.2f}%\")\n",
"print(f\"Average Embedding Model MSE: {(avg_custom_embedding_mse * 100):.2f}%\")\n",
"print(f\"Average Hybrid Model MSE: {(avg_custom_hybrid_mse * 100):.2f}%\")"
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Average BM25 MSE: 29.91%\n",
"Average Custom Model MSE: 31.34%\n",
"Average Embedding Model MSE: 29.76%\n",
"Average Hybrid Model MSE: 22.13%\n"
]
}
],
"source": [
"print(f\"Average BM25 MSE: {(avg_bm25_mse * 100):.2f}%\")\n",
"print(f\"Average Custom Model MSE: {(avg_custom_mse * 100):.2f}%\")\n",
"print(f\"Average Embedding Model MSE: {(avg_custom_embedding_mse * 100):.2f}%\")\n",
"print(f\"Average Hybrid Model MSE: {(avg_custom_hybrid_mse * 100):.2f}%\")"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 718
},
"executionInfo": {
"elapsed": 647,
"status": "ok",
"timestamp": 1730742291056,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "kpzBRz4g5o8p",
"outputId": "7d20d00a-5291-4eaf-cecd-4bfc004d5cf7"
},
"outputs": [
{
"data": {
"image/png": "",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"plt.figure(figsize=(10, 8)) # Set the figure size\n",
"\n",
"plt.plot(range(1, sample_size + 1), bm25_mse_scores, marker='o', label=f'BM25')\n",
"# plt.plot(range(1, sample_size + 1), custom_mse_scores, label=f'Custom (TF-IDF)')\n",
"# plt.plot(range(1, sample_size + 1), custom_embedding_mse_scores, label=f'Custom Embedding')\n",
"plt.plot(range(1, sample_size + 1), custom_hybrid_mse_scores, marker='o', label=f'Custom Hybrid (TF_IDF + Embedding)')\n",
"\n",
"# Set titles and labels\n",
"plt.title(\"Details of Performance of Both Models\")\n",
"plt.xlabel(\"Sample\")\n",
"plt.ylabel(\"MSE\")\n",
"plt.grid(True)\n",
"plt.legend()\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Qyo1S3o15q5r"
},
"source": [
"1. **Hybrid Model** (22.13% MSE):\\\n",
"The Hybrid model, combining the initial ranking of TF-IDF with re-ranking by embeddings, performs best, with an 26.45% MSE. This likely indicates that embeddings add value when used to refine results within a more relevant subset, offering a good balance between term-based and semantic similarity.\n",
"\n",
"2. **Embedding Model** (29.76% MSE):\\\n",
"This model has a much higher MSE. This could happen if embeddings aren’t fine-tuned on similar hotel review data, making the model potentially less aligned with the dataset’s vocabulary or context.\n",
"\n",
"3. **BM25 Model** (29.91% MSE):\\\n",
"BM25 achieves a slightly higher MSE, likely due to its inability to consistently capture the full semantic similarity, despite its strong ability to retrieve similar documents based on term frequency and document frequency.\n",
"\n",
"4. **Custom Model with TF-IDF** (31.34% MSE):\\\n",
"TF-IDF relies on term frequency but might miss some contextual relevance, especially in cases where semantic meaning is important which might explain its poor performance compared to the other models.\n",
"\n",
"It looks like the ***hybrid model*** is a strong candidate here, with room for further improvements to potentially outperform BM25 by an even wider margin."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "u8VwcdT0MS92"
},
"source": [
"### NDCG Scores for each models"
]
},
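{
"cell_type": "markdown",
"metadata": {},
"source": [
"The next cell passes the actual aspect ratings as relevance values and the recommended place's aspect ratings as scores. Assuming `ndcg_score` is scikit-learn's implementation with its default linear gain and $\\log_2$ discount, the score for one query can be written as follows, where $a_j$ is the actual rating on aspect $j$, $\\pi$ orders the $m$ aspects by decreasing predicted rating, and $\\pi^*$ orders them by decreasing actual rating (all of this notation is ours):\n",
"\n",
"$$\n",
"\\mathrm{DCG} = \\sum_{i=1}^{m} \\frac{a_{\\pi(i)}}{\\log_2(i+1)}, \\qquad \\mathrm{IDCG} = \\sum_{i=1}^{m} \\frac{a_{\\pi^*(i)}}{\\log_2(i+1)}, \\qquad \\mathrm{NDCG} = \\frac{\\mathrm{DCG}}{\\mathrm{IDCG}}\n",
"$$\n",
"\n",
"Under this reading, the metric compares how the two places order their aspects rather than how close the rating values are. If the aspect ratings of a place lie in a narrow range, reordering them changes the DCG only slightly, so scores close to 1 are to be expected for every model."
]
},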
{
"cell_type": "code",
"execution_count": 41,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"executionInfo": {
"elapsed": 3248441,
"status": "ok",
"timestamp": 1730747890528,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "tUD61lbSWGvG",
"outputId": "16fdffbd-5d92-45a0-d62c-75289886c183"
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"NDCG Calculations: 100%|██████████| 100/100.0 [02:07<00:00, 1.28s/it]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Average BM25 NDCG: 0.9933\n",
"Average Custom Model NDCG: 0.9933\n",
"Average Embedding Model NDCG: 0.9917\n",
"Average Hybrid Model NDCG: 0.9949\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n"
]
}
],
"source": [
"# Store NDCG scores for each model\n",
"bm25_ndcg_scores = []\n",
"custom_ndcg_scores = []\n",
"custom_embedding_ndcg_scores = []\n",
"custom_hybrid_ndcg_scores = []\n",
"\n",
"# Loop over a sample of query places in the dataset\n",
"for _, query_place in tqdm(samples.iterrows(), total=float(sample_size), desc=\"NDCG Calculations\"):\n",
" query_text = query_place['reviews'][:max_char] # Use the concatenated reviews for the place as the query\n",
" actual_ratings = query_place[\"ratings\"] # Actual ratings\n",
"\n",
" # Convert actual ratings to relevance scores and ensure it's a list\n",
" actual_relevance = [list(actual_ratings.values())]\n",
"\n",
" # Step 1: Retrieve the most similar place using BM25\n",
" bm25_recommendation = retrieve_bm25(query_text, k=1)\n",
" bm25_predicted_ratings = bm25_recommendation[\"ratings\"].iloc[0]\n",
" bm25_relevance = [list(bm25_predicted_ratings.values())]\n",
"\n",
" # Calculate NDCG for BM25\n",
" bm25_ndcg = ndcg_score(actual_relevance, bm25_relevance)\n",
" bm25_ndcg_scores.append(bm25_ndcg)\n",
"\n",
" # Step 2: Retrieve the most similar place using the Custom Model (TF-IDF)\n",
" custom_recommendation = retrieve_tfidf(query_text, k=1)\n",
" custom_predicted_ratings = custom_recommendation[\"ratings\"].iloc[0]\n",
" custom_relevance = [list(custom_predicted_ratings.values())]\n",
"\n",
" # Calculate NDCG for the Custom Model\n",
" custom_ndcg = ndcg_score(actual_relevance, custom_relevance)\n",
" custom_ndcg_scores.append(custom_ndcg)\n",
"\n",
" # Step 3: Retrieve the most similar place using the Embedding Model\n",
" custom_embedding_recommendation = retrieve_embeddings(query_text, k=1)\n",
" custom_embedding_predicted_ratings = custom_embedding_recommendation[\"ratings\"].iloc[0]\n",
" custom_embedding_relevance = [list(custom_embedding_predicted_ratings.values())]\n",
"\n",
" # Calculate NDCG for the Embedding Model\n",
" custom_embedding_ndcg = ndcg_score(actual_relevance, custom_embedding_relevance)\n",
" custom_embedding_ndcg_scores.append(custom_embedding_ndcg)\n",
"\n",
" # Step 4: Retrieve the most similar place using the Hybrid Model\n",
" custom_hybrid_recommendation = retrieve_hybrid(query_text, final_k=1)\n",
" custom_hybrid_predicted_ratings = custom_hybrid_recommendation[\"ratings\"].iloc[0]\n",
" custom_hybrid_relevance = [list(custom_hybrid_predicted_ratings.values())]\n",
"\n",
" # Calculate NDCG for the Hybrid Model\n",
" custom_hybrid_ndcg = ndcg_score(actual_relevance, custom_hybrid_relevance)\n",
" custom_hybrid_ndcg_scores.append(custom_hybrid_ndcg)\n",
"\n",
"# Compute the average NDCG across all queries for each model\n",
"avg_bm25_ndcg = np.mean(bm25_ndcg_scores)\n",
"avg_custom_ndcg = np.mean(custom_ndcg_scores)\n",
"avg_custom_embedding_ndcg = np.mean(custom_embedding_ndcg_scores)\n",
"avg_custom_hybrid_ndcg = np.mean(custom_hybrid_ndcg_scores)\n",
"\n",
"# Print average NDCG results\n",
"print(f\"Average BM25 NDCG: {avg_bm25_ndcg:.4f}\")\n",
"print(f\"Average Custom Model NDCG: {avg_custom_ndcg:.4f}\")\n",
"print(f\"Average Embedding Model NDCG: {avg_custom_embedding_ndcg:.4f}\")\n",
"print(f\"Average Hybrid Model NDCG: {avg_custom_hybrid_ndcg:.4f}\")"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 564
},
"executionInfo": {
"elapsed": 878,
"status": "ok",
"timestamp": 1730744057730,
"user": {
"displayName": "Joyce Lapilus",
"userId": "10669185642835107674"
},
"user_tz": -60
},
"id": "swTFFpqlMN54",
"outputId": "d1ec139a-fa6e-499a-bd1d-5ee41ca3553e"
},
"outputs": [
{
"data": {
"image/png": "",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Models for plots\n",
"models = ['BM25', 'Custom Model (TF-IDF)', 'Embedding Model', 'Hybrid Model']\n",
"ndcg_scores = [avg_bm25_ndcg, avg_custom_ndcg, avg_custom_embedding_ndcg, avg_custom_hybrid_ndcg]\n",
"\n",
"# Sorting results for display\n",
"sorted_data = sorted(zip(models, ndcg_scores), key=lambda x: x[1], reverse=True)\n",
"sorted_models, sorted_scores = zip(*sorted_data) # Unpack the sorted data\n",
"\n",
"# Plots\n",
"plt.figure(figsize=(10, 6))\n",
"plt.bar(sorted_models, sorted_scores, color=['blue', 'orange', 'green', 'red'])\n",
"plt.title('NDCG Comparison of Different Recommendation Models')\n",
"plt.xlabel('Model')\n",
"plt.ylabel('Average NDCG Score')\n",
"plt.ylim(.975, 1) # NDCG scores between 0.975 and 1\n",
"plt.show()"
]
}
],
"metadata": {
"colab": {
"authorship_tag": "ABX9TyM/mgbFAsjxw+7MUp8ESy8P",
"provenance": [],
"toc_visible": true
},
"kernelspec": {
"display_name": "ml-nlp",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.20"
},
"widgets": {
"application/vnd.jupyter.widget-state+json": {
"014a132cc94240bca260251b30e80b1d": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"018a4edc5a514defa8f88b76c5d2051d": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "DescriptionStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"01f0b4f9c8494b1d940f52e4ae2e052e": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_3af1e1c748874d9dbe6db67ac60f59a6",
"placeholder": "",
"style": "IPY_MODEL_f2bd0d7b25f64d21a0b101e300f5970f",
"value": " 10.7k/10.7k [00:00<00:00, 378kB/s]"
}
},
"0288d54f62a44a51af1cc28cbffeb414": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"02ad14c4355e4b95ac857236b939f386": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"0871e359a4754641b1001a7748790ef3": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"09ad0f566d154ebaa5a1d407e95b1a3f": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"0d3fc9b5eb5343688db12952c9b79209": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "ProgressStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": ""
}
},
"0d79e7275de94b5b8744179614b31fdd": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "DescriptionStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"116d7e14b77e4f408e38ee2a9edd2589": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_1bda7a65c51c4eda942daba2ba20763a",
"placeholder": "",
"style": "IPY_MODEL_e753b149eebc4195a0c27d055d2c68c1",
"value": "sentence_bert_config.json: 100%"
}
},
"145978f0aad44da0b8cb60e65add585c": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "ProgressStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": ""
}
},
"14c68438a1574435857fb6b5e8952de1": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "ProgressStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": ""
}
},
"15df539a86084979a6998c9a8b1987f9": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"16ca2364890d4369bd76d528955d5c07": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_02ad14c4355e4b95ac857236b939f386",
"placeholder": "",
"style": "IPY_MODEL_7efde21464b74b89b2b18d4d69bab5f4",
"value": " 350/350 [00:00<00:00, 18.2kB/s]"
}
},
"177dad175ea045bbb434e60e5fa6a2d3": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"1bda7a65c51c4eda942daba2ba20763a": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"1e7db89680c144eb846ee878b37c7b78": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "DescriptionStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"1f49e5943d254770a92ea4778733a86a": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "DescriptionStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"1f54531a0f9249c79b3db2e715da96c2": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "DescriptionStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"211c9255a5b841a6981f4bb90ef0cc08": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_78cd919f72c24e349e35568be0834f8b",
"placeholder": "",
"style": "IPY_MODEL_018a4edc5a514defa8f88b76c5d2051d",
"value": "modules.json: 100%"
}
},
"21845b143b0b42a89b8c96371fcb7cfd": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "FloatProgressModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_36a5b49746ab413eb824189c5312a4a8",
"max": 350,
"min": 0,
"orientation": "horizontal",
"style": "IPY_MODEL_869ba914217346d79c3d5d73f5e00ea8",
"value": 350
}
},
"25e00f5d82e448368c2d132f0274c089": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"2c4d0395d3c9450c93cdf92534fb667e": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "DescriptionStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"2d731537db694de2abf5e50ff5315582": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "HBoxModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_116d7e14b77e4f408e38ee2a9edd2589",
"IPY_MODEL_4d1a8a8036184623b6d9a0bb60134c0b",
"IPY_MODEL_75f9eb8825164ee0ab40ce3d35082182"
],
"layout": "IPY_MODEL_09ad0f566d154ebaa5a1d407e95b1a3f"
}
},
"2e73e9f4f6d349deb946044f44f5bc6a": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"304275b18f514e7da443e0e85cb92ebf": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"333ae2cfd578425c93552c57c4b7f8ae": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"33ea3f887ac34973aaa72cf28774e17a": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "DescriptionStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"34c8ce56ae6047809e128507f7ec1333": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"3571d5e96c964209bf75196c768c4f16": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_cc1811b5165a462c80e33ea28c9b1cd5",
"placeholder": "",
"style": "IPY_MODEL_d7aabc8485904970acc594cff1c2ecb1",
"value": "tokenizer_config.json: 100%"
}
},
"36a5b49746ab413eb824189c5312a4a8": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"398becbbdc324b2d9033ffb2511ab612": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_f0596d60ee03456899a44a9f024c13e8",
"placeholder": "",
"style": "IPY_MODEL_7705618bee1b436eb9cc7c954281b985",
"value": "1_Pooling/config.json: 100%"
}
},
"39b450fcf67e4eff80f0a94ce03c3d9a": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_6988cc6bae8646baabc123c4a90eb1f2",
"placeholder": "",
"style": "IPY_MODEL_0d79e7275de94b5b8744179614b31fdd",
"value": " 118/118 [14:18<00:00, 2.44s/it]"
}
},
"3aad5b862e584581873e0fcf48dc645f": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_b47a4986321b4011857e43ec29af5930",
"placeholder": "",
"style": "IPY_MODEL_70d1721ee8764008a378e86f8b4d1a40",
"value": " 90.9M/90.9M [00:00<00:00, 119MB/s]"
}
},
"3af1e1c748874d9dbe6db67ac60f59a6": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"3ecefd97e39949778d00bddddbcd93d5": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "DescriptionStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"40426b1e8b6d4b9f90bc8b6551e33dad": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"4073631f4ae04f6394e7ec7160c1b267": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "DescriptionStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"407bdd18f40942d4857674f5aef60f5a": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"414026345ea14ad3a1c46f3b48e08dad": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"419c59425958463e86e3c89e7897562d": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"41aa5f25235d435ab2022deb265c1224": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "HBoxModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_211c9255a5b841a6981f4bb90ef0cc08",
"IPY_MODEL_f39607b761c34a09bfd9fac0ecaab5f4",
"IPY_MODEL_bc6bc1c1abff40a8bc46bd40fcba011d"
],
"layout": "IPY_MODEL_98e8d5ab089c4972b28de304327ef4b0"
}
},
"4920dfa856c743c28d6b645bc6462d84": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "DescriptionStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"497a4539b1f44ac884b22c441a5cc0fb": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "DescriptionStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"4a573eab972948139eb2acccc8171917": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_97c9c08018ac4d98a5a73a4e4d4b4552",
"placeholder": "",
"style": "IPY_MODEL_1e7db89680c144eb846ee878b37c7b78",
"value": "special_tokens_map.json: 100%"
}
},
"4d1a8a8036184623b6d9a0bb60134c0b": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "FloatProgressModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_333ae2cfd578425c93552c57c4b7f8ae",
"max": 53,
"min": 0,
"orientation": "horizontal",
"style": "IPY_MODEL_eed1334d1cfb47b8b6568b754b36e591",
"value": 53
}
},
"50e7bea3bda3449993f02204ab708c84": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"51b607d567414be9806ba7eecd5b47f5": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"520a8efd266645e98d64fe12e692fdd4": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "ProgressStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": ""
}
},
"5760a67d10b342c7b802e0b14ef934b8": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "HBoxModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_77a06f1ab92a437a93326ca36667ca87",
"IPY_MODEL_e53714b59cb44cdfb55fe0768be6960d",
"IPY_MODEL_d92024edce0c416c9bf5389c4763be3e"
],
"layout": "IPY_MODEL_e4bff3bbc2c54c6aa9ffdb855c707def"
}
},
"59c60af3e36e4afa94c9243a713c862e": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "FloatProgressModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_014a132cc94240bca260251b30e80b1d",
"max": 10659,
"min": 0,
"orientation": "horizontal",
"style": "IPY_MODEL_85f80f687da548c4be2a9aa4bc5f703a",
"value": 10659
}
},
"5f8eb46b969f499281890ad8feaf34e9": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "FloatProgressModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_b9394cb942f44d248d070d2fa38a3a60",
"max": 116,
"min": 0,
"orientation": "horizontal",
"style": "IPY_MODEL_d339ebcb8350489ab8adc6ade4a3549f",
"value": 116
}
},
"62217ad84eee4d9ab223350440f31c40": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "HBoxModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_97d56fd60344424dac5c0482ab9efabb",
"IPY_MODEL_c8bd3f54301d4ebda35036a60c4aa5e7",
"IPY_MODEL_d2c96a3539d64fa4b5ca319057ce1838"
],
"layout": "IPY_MODEL_7fc87395d6664c579ad3b49c5d5f5af9"
}
},
"63c2cc0bb0d2470b9c663fb789bccbcc": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"643406402c1b45c798fc3a9b073084b7": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"64c0d4f909914b66aca9fd86882d4759": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_ccd93edd28fd47dc8ed9b25fdff1273b",
"placeholder": "",
"style": "IPY_MODEL_8ab8006e3d0c46c688e3f8f30f1076d3",
"value": "Batches: 100%"
}
},
"65cd6c2e7ea4417cbcbc49e9826da3a4": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_b89dba8be52b44ab922ec32a2dbf0388",
"placeholder": "",
"style": "IPY_MODEL_d73b0aa7b82a446287f1597560aa9f2e",
"value": "README.md: 100%"
}
},
"66e195b05b7d4007875cee4ca105bee5": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "FloatProgressModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_2e73e9f4f6d349deb946044f44f5bc6a",
"max": 231508,
"min": 0,
"orientation": "horizontal",
"style": "IPY_MODEL_c1f08c978d8e40ec87a1d6b348331205",
"value": 231508
}
},
"6988cc6bae8646baabc123c4a90eb1f2": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"6a55cee63ed64d5c8bce9853553a152e": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_90b79c66e79f4cb3ac20515d1ee204ba",
"placeholder": "",
"style": "IPY_MODEL_4073631f4ae04f6394e7ec7160c1b267",
"value": "vocab.txt: 100%"
}
},
"6f315b8a29ec4dceb936f40dcf935ca9": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "HBoxModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_3571d5e96c964209bf75196c768c4f16",
"IPY_MODEL_21845b143b0b42a89b8c96371fcb7cfd",
"IPY_MODEL_16ca2364890d4369bd76d528955d5c07"
],
"layout": "IPY_MODEL_407bdd18f40942d4857674f5aef60f5a"
}
},
"70d1721ee8764008a378e86f8b4d1a40": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "DescriptionStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"72045a496fac4abba58f054c4f5dd289": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_77e1d1adc7f84c05830987cbcd9abf6b",
"placeholder": "",
"style": "IPY_MODEL_33ea3f887ac34973aaa72cf28774e17a",
"value": "model.safetensors: 100%"
}
},
"72500bb48c964e91a308f763db191956": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"72b931a3238e4d229517d8762e64de4f": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "ProgressStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": ""
}
},
"75abb8d3b53341aabed11f3e3e789423": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "HBoxModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_64c0d4f909914b66aca9fd86882d4759",
"IPY_MODEL_a22f72bba00646fc8d424e84f9761d04",
"IPY_MODEL_39b450fcf67e4eff80f0a94ce03c3d9a"
],
"layout": "IPY_MODEL_419c59425958463e86e3c89e7897562d"
}
},
"75f9eb8825164ee0ab40ce3d35082182": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_d098812189de448cb80c14fc9f4b932e",
"placeholder": "",
"style": "IPY_MODEL_b65a40471afa46f88b2a745df3b8164f",
"value": " 53.0/53.0 [00:00<00:00, 1.61kB/s]"
}
},
"7705618bee1b436eb9cc7c954281b985": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "DescriptionStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"77a06f1ab92a437a93326ca36667ca87": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_643406402c1b45c798fc3a9b073084b7",
"placeholder": "",
"style": "IPY_MODEL_1f54531a0f9249c79b3db2e715da96c2",
"value": "config.json: 100%"
}
},
"77e1d1adc7f84c05830987cbcd9abf6b": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"78cd919f72c24e349e35568be0834f8b": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"7a0b0da7087a4830a5263f93a9835870": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_fab297d175304e86be07326a50074a96",
"placeholder": "",
"style": "IPY_MODEL_3ecefd97e39949778d00bddddbcd93d5",
"value": " 190/190 [00:00<00:00, 8.52kB/s]"
}
},
"7efde21464b74b89b2b18d4d69bab5f4": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "DescriptionStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"7fc87395d6664c579ad3b49c5d5f5af9": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"829f7b18252f4452bc0c052707438553": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "FloatProgressModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_414026345ea14ad3a1c46f3b48e08dad",
"max": 90868376,
"min": 0,
"orientation": "horizontal",
"style": "IPY_MODEL_520a8efd266645e98d64fe12e692fdd4",
"value": 90868376
}
},
"85f80f687da548c4be2a9aa4bc5f703a": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "ProgressStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": ""
}
},
"869ba914217346d79c3d5d73f5e00ea8": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "ProgressStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": ""
}
},
"8aa2e9c7ecc8463d80969f8be0b8fa97": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"8ab8006e3d0c46c688e3f8f30f1076d3": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "DescriptionStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"90b79c66e79f4cb3ac20515d1ee204ba": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"97c9c08018ac4d98a5a73a4e4d4b4552": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"97d56fd60344424dac5c0482ab9efabb": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_63c2cc0bb0d2470b9c663fb789bccbcc",
"placeholder": "",
"style": "IPY_MODEL_4920dfa856c743c28d6b645bc6462d84",
"value": "tokenizer.json: 100%"
}
},
"98e8d5ab089c4972b28de304327ef4b0": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"9992af63e99a4d279c2a00b6e2609532": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "ProgressStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": ""
}
},
"9f18e1e8a03544c7b43e8ad098385165": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "DescriptionStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"a22f72bba00646fc8d424e84f9761d04": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "FloatProgressModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_c792016a2f1e421c9c63a4760529c6dd",
"max": 118,
"min": 0,
"orientation": "horizontal",
"style": "IPY_MODEL_9992af63e99a4d279c2a00b6e2609532",
"value": 118
}
},
"a8827aab24ad4f62a569ebc7c49e5e13": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_40426b1e8b6d4b9f90bc8b6551e33dad",
"placeholder": "",
"style": "IPY_MODEL_9f18e1e8a03544c7b43e8ad098385165",
"value": "config_sentence_transformers.json: 100%"
}
},
"ab3daf3afb14420e917d2ce4b70ed03a": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"af6f78c572ec4876ae5e0b7440bea99c": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"b22240de5d414420ba2d74813e0cb059": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"b47a4986321b4011857e43ec29af5930": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"b65a40471afa46f88b2a745df3b8164f": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "DescriptionStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"b89dba8be52b44ab922ec32a2dbf0388": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"b9394cb942f44d248d070d2fa38a3a60": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"bc6bc1c1abff40a8bc46bd40fcba011d": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_50e7bea3bda3449993f02204ab708c84",
"placeholder": "",
"style": "IPY_MODEL_2c4d0395d3c9450c93cdf92534fb667e",
"value": " 349/349 [00:00<00:00, 11.2kB/s]"
}
},
"c064370aceca42d08529b26489e89d01": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"c12e263dd34e41e4a15390d05b99e937": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_0871e359a4754641b1001a7748790ef3",
"placeholder": "",
"style": "IPY_MODEL_f8279f7457b54ea79871d8afffaa4c1a",
"value": " 116/116 [00:00<00:00, 3.40kB/s]"
}
},
"c1f08c978d8e40ec87a1d6b348331205": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "ProgressStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": ""
}
},
"c410fd7c34974edba7625df2c5dda519": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "HBoxModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_398becbbdc324b2d9033ffb2511ab612",
"IPY_MODEL_cf6ba93b63214a79b705c74f9ab08470",
"IPY_MODEL_7a0b0da7087a4830a5263f93a9835870"
],
"layout": "IPY_MODEL_34c8ce56ae6047809e128507f7ec1333"
}
},
"c792016a2f1e421c9c63a4760529c6dd": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"c7984758157f4bd885a29f3d13508f12": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"c8bd3f54301d4ebda35036a60c4aa5e7": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "FloatProgressModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_f9727f5d318f4c578c6f609da8514a03",
"max": 466247,
"min": 0,
"orientation": "horizontal",
"style": "IPY_MODEL_dafa8bd7b78c4dcbaf8bc8bf1f38b441",
"value": 466247
}
},
"c92e52e2edf146058ff52be717bb11a1": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "FloatProgressModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_72500bb48c964e91a308f763db191956",
"max": 112,
"min": 0,
"orientation": "horizontal",
"style": "IPY_MODEL_14c68438a1574435857fb6b5e8952de1",
"value": 112
}
},
"cc1811b5165a462c80e33ea28c9b1cd5": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"ccd93edd28fd47dc8ed9b25fdff1273b": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"cf6ba93b63214a79b705c74f9ab08470": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "FloatProgressModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_8aa2e9c7ecc8463d80969f8be0b8fa97",
"max": 190,
"min": 0,
"orientation": "horizontal",
"style": "IPY_MODEL_145978f0aad44da0b8cb60e65add585c",
"value": 190
}
},
"d098812189de448cb80c14fc9f4b932e": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"d2c96a3539d64fa4b5ca319057ce1838": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_ab3daf3afb14420e917d2ce4b70ed03a",
"placeholder": "",
"style": "IPY_MODEL_e8cd452dbdfe4913b3f06060bdebbc0e",
"value": " 466k/466k [00:00<00:00, 6.13MB/s]"
}
},
"d339ebcb8350489ab8adc6ade4a3549f": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "ProgressStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": ""
}
},
"d45e5d6905ad4b24b24b79b80758920c": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_c7984758157f4bd885a29f3d13508f12",
"placeholder": "",
"style": "IPY_MODEL_1f49e5943d254770a92ea4778733a86a",
"value": " 112/112 [00:00<00:00, 5.89kB/s]"
}
},
"d4933c6ee65f47638f514b25da3031ff": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "HBoxModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_a8827aab24ad4f62a569ebc7c49e5e13",
"IPY_MODEL_5f8eb46b969f499281890ad8feaf34e9",
"IPY_MODEL_c12e263dd34e41e4a15390d05b99e937"
],
"layout": "IPY_MODEL_25e00f5d82e448368c2d132f0274c089"
}
},
"d49c878e0999475c946d18ea7ad24e1b": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_c064370aceca42d08529b26489e89d01",
"placeholder": "",
"style": "IPY_MODEL_497a4539b1f44ac884b22c441a5cc0fb",
"value": " 232k/232k [00:00<00:00, 4.87MB/s]"
}
},
"d5b2b751583a4eea83dd7d0dac3311df": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "DescriptionStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"d73b0aa7b82a446287f1597560aa9f2e": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "DescriptionStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"d7aabc8485904970acc594cff1c2ecb1": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "DescriptionStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"d92024edce0c416c9bf5389c4763be3e": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "HTMLModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_15df539a86084979a6998c9a8b1987f9",
"placeholder": "",
"style": "IPY_MODEL_d5b2b751583a4eea83dd7d0dac3311df",
"value": " 612/612 [00:00<00:00, 14.2kB/s]"
}
},
"dafa8bd7b78c4dcbaf8bc8bf1f38b441": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "ProgressStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": ""
}
},
"e4bff3bbc2c54c6aa9ffdb855c707def": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"e513eb3bf3f74a8b9f56f67be171e4c7": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "HBoxModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_6a55cee63ed64d5c8bce9853553a152e",
"IPY_MODEL_66e195b05b7d4007875cee4ca105bee5",
"IPY_MODEL_d49c878e0999475c946d18ea7ad24e1b"
],
"layout": "IPY_MODEL_177dad175ea045bbb434e60e5fa6a2d3"
}
},
"e53714b59cb44cdfb55fe0768be6960d": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "FloatProgressModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_51b607d567414be9806ba7eecd5b47f5",
"max": 612,
"min": 0,
"orientation": "horizontal",
"style": "IPY_MODEL_72b931a3238e4d229517d8762e64de4f",
"value": 612
}
},
"e6afaae6e36e46ae94a433b5437012b2": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "HBoxModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_4a573eab972948139eb2acccc8171917",
"IPY_MODEL_c92e52e2edf146058ff52be717bb11a1",
"IPY_MODEL_d45e5d6905ad4b24b24b79b80758920c"
],
"layout": "IPY_MODEL_b22240de5d414420ba2d74813e0cb059"
}
},
"e753b149eebc4195a0c27d055d2c68c1": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "DescriptionStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"e8cd452dbdfe4913b3f06060bdebbc0e": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "DescriptionStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"eaa6b4ef21fd4de2b67e0afb28af35a7": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "HBoxModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_72045a496fac4abba58f054c4f5dd289",
"IPY_MODEL_829f7b18252f4452bc0c052707438553",
"IPY_MODEL_3aad5b862e584581873e0fcf48dc645f"
],
"layout": "IPY_MODEL_304275b18f514e7da443e0e85cb92ebf"
}
},
"eae753a0f03a4940811ce9b14f3c4fba": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "HBoxModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_65cd6c2e7ea4417cbcbc49e9826da3a4",
"IPY_MODEL_59c60af3e36e4afa94c9243a713c862e",
"IPY_MODEL_01f0b4f9c8494b1d940f52e4ae2e052e"
],
"layout": "IPY_MODEL_af6f78c572ec4876ae5e0b7440bea99c"
}
},
"eed1334d1cfb47b8b6568b754b36e591": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "ProgressStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": ""
}
},
"f0596d60ee03456899a44a9f024c13e8": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"f2bd0d7b25f64d21a0b101e300f5970f": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "DescriptionStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"f39607b761c34a09bfd9fac0ecaab5f4": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "FloatProgressModel",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_0288d54f62a44a51af1cc28cbffeb414",
"max": 349,
"min": 0,
"orientation": "horizontal",
"style": "IPY_MODEL_0d3fc9b5eb5343688db12952c9b79209",
"value": 349
}
},
"f8279f7457b54ea79871d8afffaa4c1a": {
"model_module": "@jupyter-widgets/controls",
"model_module_version": "1.5.0",
"model_name": "DescriptionStyleModel",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"f9727f5d318f4c578c6f609da8514a03": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"fab297d175304e86be07326a50074a96": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "1.2.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
}
}
}
},
"nbformat": 4,
"nbformat_minor": 0
}