{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "2022-01-07-ncf.ipynb",
"provenance": [],
"collapsed_sections": [],
"authorship_tag": "ABX9TyO+EzcZLxWCVB+iE5tVY1BN"
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
},
"language_info": {
"name": "python"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "iVkUysixCyk6"
},
"source": [
"# Neural Collaborative Filtering Recommenders"
]
},
{
"cell_type": "code",
"metadata": {
"id": "-xzeGn4mvtvd"
},
"source": [
"!pip install -q pytorch-lightning"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "xNgD2QXOkq4g"
},
"source": [
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"import pandas as pd\n",
"import torch\n",
"import torch.nn as nn\n",
"import torch.nn.functional as F\n",
"import torch.optim as optim\n",
"from torch.utils.data import Dataset, DataLoader\n",
"from torch.utils.data import TensorDataset\n",
"from tqdm.notebook import tqdm\n",
"import pytorch_lightning as pl\n",
"\n",
"np.random.seed(123)"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "8fqqbAgmrxoT"
},
"source": [
"## NCF with PyTorch Lightning on ML-25m\n",
"\n",
"In this section, we will build a simple yet accurate model using movielens-25m dataset and pytorch lightning library. This will be a retrieval model where the objective is to maximize recall over precision."
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "MrlkpJITv8Rw",
"outputId": "4fd97487-1f04-4563-8adf-5aa306c13d2b"
},
"source": [
"!wget -q --show-progress https://files.grouplens.org/datasets/movielens/ml-25m.zip\n",
"!unzip ml-25m.zip"
],
"execution_count": null,
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"ml-25m.zip.1 100%[===================>] 249.84M 45.7MB/s in 5.9s \n",
"Archive: ml-25m.zip\n",
"replace ml-25m/tags.csv? [y]es, [n]o, [A]ll, [N]one, [r]ename: N\n"
]
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 204
},
"id": "5-juuyKOwCmL",
"outputId": "93ff3c63-a959-4e2d-fdff-c5e72a475160"
},
"source": [
"ratings = pd.read_csv('ml-25m/ratings.csv', infer_datetime_format=True)\n",
"ratings.head()"
],
"execution_count": null,
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
userId
\n",
"
movieId
\n",
"
rating
\n",
"
timestamp
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
1
\n",
"
296
\n",
"
5.0
\n",
"
1147880044
\n",
"
\n",
"
\n",
"
1
\n",
"
1
\n",
"
306
\n",
"
3.5
\n",
"
1147868817
\n",
"
\n",
"
\n",
"
2
\n",
"
1
\n",
"
307
\n",
"
5.0
\n",
"
1147868828
\n",
"
\n",
"
\n",
"
3
\n",
"
1
\n",
"
665
\n",
"
5.0
\n",
"
1147878820
\n",
"
\n",
"
\n",
"
4
\n",
"
1
\n",
"
899
\n",
"
3.5
\n",
"
1147868510
\n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" userId movieId rating timestamp\n",
"0 1 296 5.0 1147880044\n",
"1 1 306 3.5 1147868817\n",
"2 1 307 5.0 1147868828\n",
"3 1 665 5.0 1147878820\n",
"4 1 899 3.5 1147868510"
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "A2bKYi9WwGyP"
},
"source": [
"### Subset\n",
"\n",
"In order to keep memory usage manageable, we will only use data from 20% of the users in this dataset. Let's randomly select 30% of the users and only use data from the selected users."
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "HsYAMEXqwZoH",
"outputId": "434b4209-8c53-4474-edd8-6c5d109c3184"
},
"source": [
"rand_userIds = np.random.choice(ratings['userId'].unique(), \n",
" size=int(len(ratings['userId'].unique())*0.2), \n",
" replace=False)\n",
"\n",
"ratings = ratings.loc[ratings['userId'].isin(rand_userIds)]\n",
"\n",
"print('There are {} rows of data from {} users'.format(len(ratings), len(rand_userIds)))"
],
"execution_count": null,
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"There are 5015129 rows of data from 32508 users\n"
]
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "w7OS6UqDpXdi"
},
"source": [
"### Train/Test Split\n",
"**Chronological Leave-One-Out Split**"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ZaqAMrH-gn_i"
},
"source": [
"Along with the rating, there is also a timestamp column that shows the date and time the review was submitted. Using the timestamp column, we will implement our train-test split strategy using the leave-one-out methodology. For each user, the most recent review is used as the test set (i.e. leave one out), while the rest will be used as training data ."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "A4COa9yVguUO"
},
"source": [
"> Note: Doing a random split would not be fair, as we could potentially be using a user's recent reviews for training and earlier reviews for testing. This introduces data leakage with a look-ahead bias, and the performance of the trained model would not be generalizable to real-world performance."
]
},
{
"cell_type": "code",
"metadata": {
"id": "WtdtS0FMgTez"
},
"source": [
"ratings['rank_latest'] = ratings.groupby(['userId'])['timestamp'] \\\n",
" .rank(method='first', ascending=False)\n",
"\n",
"train_ratings = ratings[ratings['rank_latest'] != 1]\n",
"test_ratings = ratings[ratings['rank_latest'] == 1]\n",
"\n",
"# drop columns that we no longer need\n",
"train_ratings = train_ratings[['userId', 'movieId', 'rating']]\n",
"test_ratings = test_ratings[['userId', 'movieId', 'rating']]"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "XFoYAbhqpnPD"
},
"source": [
"### Implicit Conversion"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "HwYvXqJ6hz2u"
},
"source": [
"We will train a recommender system using implicit feedback. However, the MovieLens dataset that we're using is based on explicit feedback. To convert this dataset into an implicit feedback dataset, we'll simply binarize the ratings such that they are are '1' (i.e. positive class). The value of '1' represents that the user has interacted with the item.\n",
"\n",
"> Note: Using implicit feedback reframes the problem that our recommender is trying to solve. Instead of trying to predict movie ratings (when using explicit feedback), we are trying to predict whether the user will interact (i.e. click/buy/watch) with each movie, with the aim of presenting to users the movies with the highest interaction likelihood.\n",
"\n",
"> Tip: This setting is suitable at retrieval stage where the objective is to maximize recall by identifying items that user will at least interact with."
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 204
},
"id": "sQYuW1Otg_Cg",
"outputId": "096aec13-d21e-4690-e88d-258fe9a681a0"
},
"source": [
"train_ratings.loc[:, 'rating'] = 1\n",
"\n",
"train_ratings.sample(5)"
],
"execution_count": null,
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
userId
\n",
"
movieId
\n",
"
rating
\n",
"
\n",
" \n",
" \n",
"
\n",
"
9865540
\n",
"
64043
\n",
"
2019
\n",
"
1
\n",
"
\n",
"
\n",
"
17648975
\n",
"
114398
\n",
"
2671
\n",
"
1
\n",
"
\n",
"
\n",
"
19045758
\n",
"
123527
\n",
"
986
\n",
"
1
\n",
"
\n",
"
\n",
"
3125012
\n",
"
20593
\n",
"
86487
\n",
"
1
\n",
"
\n",
"
\n",
"
4540349
\n",
"
29803
\n",
"
4571
\n",
"
1
\n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" userId movieId rating\n",
"9865540 64043 2019 1\n",
"17648975 114398 2671 1\n",
"19045758 123527 986 1\n",
"3125012 20593 86487 1\n",
"4540349 29803 4571 1"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "uhwZiaBPpsQl"
},
"source": [
"### Negative Sampling"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ngZdyoMjizlw"
},
"source": [
"We do have a problem now though. After binarizing our dataset, we see that every sample in the dataset now belongs to the positive class. However we also require negative samples to train our models, to indicate movies that the user has not interacted with. We assume that such movies are those that the user are not interested in - even though this is a sweeping assumption that may not be true, it usually works out rather well in practice.\n",
"\n",
"The code below generates 4 negative samples for each row of data. In other words, the ratio of negative to positive samples is 4:1. This ratio is chosen arbitrarily but I found that it works rather well (feel free to find the best ratio yourself!)"
]
},
{
"cell_type": "code",
"metadata": {
"id": "4T0_UVhTizVn"
},
"source": [
"# Get a list of all movie IDs\n",
"all_movieIds = ratings['movieId'].unique()\n",
"\n",
"# Placeholders that will hold the training data\n",
"users, items, labels = [], [], []\n",
"\n",
"# This is the set of items that each user has interaction with\n",
"user_item_set = set(zip(train_ratings['userId'], train_ratings['movieId']))\n",
"\n",
"# 4:1 ratio of negative to positive samples\n",
"num_negatives = 4\n",
"\n",
"for (u, i) in tqdm(user_item_set):\n",
" users.append(u)\n",
" items.append(i)\n",
" labels.append(1) # items that the user has interacted with are positive\n",
" for _ in range(num_negatives):\n",
" # randomly select an item\n",
" negative_item = np.random.choice(all_movieIds) \n",
" # check that the user has not interacted with this item\n",
" while (u, negative_item) in user_item_set:\n",
" negative_item = np.random.choice(all_movieIds)\n",
" users.append(u)\n",
" items.append(negative_item)\n",
" labels.append(0) # items not interacted with are negative"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "9brxnZqlpvXD"
},
"source": [
"### PyTorch Dataset"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Br2u5nn5jAy1"
},
"source": [
"Great! We now have the data in the format required by our model. Before we move on, let's define a PyTorch Dataset to facilitate training. The class below simply encapsulates the code we have written above into a PyTorch Dataset class."
]
},
{
"cell_type": "code",
"metadata": {
"id": "pCn0M346i6Z8"
},
"source": [
"class MovieLensTrainDataset(Dataset):\n",
" \"\"\"MovieLens PyTorch Dataset for Training\n",
" \n",
" Args:\n",
" ratings (pd.DataFrame): Dataframe containing the movie ratings\n",
" all_movieIds (list): List containing all movieIds\n",
" \n",
" \"\"\"\n",
"\n",
" def __init__(self, ratings, all_movieIds):\n",
" self.users, self.items, self.labels = self.get_dataset(ratings, all_movieIds)\n",
"\n",
" def __len__(self):\n",
" return len(self.users)\n",
" \n",
" def __getitem__(self, idx):\n",
" return self.users[idx], self.items[idx], self.labels[idx]\n",
"\n",
" def get_dataset(self, ratings, all_movieIds):\n",
" users, items, labels = [], [], []\n",
" user_item_set = set(zip(ratings['userId'], ratings['movieId']))\n",
"\n",
" num_negatives = 4\n",
" for u, i in user_item_set:\n",
" users.append(u)\n",
" items.append(i)\n",
" labels.append(1)\n",
" for _ in range(num_negatives):\n",
" negative_item = np.random.choice(all_movieIds)\n",
" while (u, negative_item) in user_item_set:\n",
" negative_item = np.random.choice(all_movieIds)\n",
" users.append(u)\n",
" items.append(negative_item)\n",
" labels.append(0)\n",
"\n",
" return torch.tensor(users), torch.tensor(items), torch.tensor(labels)"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "SqMDcEYRjOZN"
},
"source": [
"### Model\n",
"\n",
"While there are many deep learning based architecture for recommendation systems, I find that the framework proposed by He et al. is the most straightforward and it is simple enough to be implemented in a tutorial such as this."
]
},
{
"cell_type": "code",
"metadata": {
"id": "xwlBJpqljJvS"
},
"source": [
"class NCF(pl.LightningModule):\n",
" \"\"\" Neural Collaborative Filtering (NCF)\n",
" \n",
" Args:\n",
" num_users (int): Number of unique users\n",
" num_items (int): Number of unique items\n",
" ratings (pd.DataFrame): Dataframe containing the movie ratings for training\n",
" all_movieIds (list): List containing all movieIds (train + test)\n",
" \"\"\"\n",
" \n",
" def __init__(self, num_users, num_items, ratings, all_movieIds):\n",
" super().__init__()\n",
" self.user_embedding = nn.Embedding(num_embeddings=num_users, embedding_dim=8)\n",
" self.item_embedding = nn.Embedding(num_embeddings=num_items, embedding_dim=8)\n",
" self.fc1 = nn.Linear(in_features=16, out_features=64)\n",
" self.fc2 = nn.Linear(in_features=64, out_features=32)\n",
" self.output = nn.Linear(in_features=32, out_features=1)\n",
" self.ratings = ratings\n",
" self.all_movieIds = all_movieIds\n",
" \n",
" def forward(self, user_input, item_input):\n",
" \n",
" # Pass through embedding layers\n",
" user_embedded = self.user_embedding(user_input)\n",
" item_embedded = self.item_embedding(item_input)\n",
"\n",
" # Concat the two embedding layers\n",
" vector = torch.cat([user_embedded, item_embedded], dim=-1)\n",
"\n",
" # Pass through dense layer\n",
" vector = nn.ReLU()(self.fc1(vector))\n",
" vector = nn.ReLU()(self.fc2(vector))\n",
"\n",
" # Output layer\n",
" pred = nn.Sigmoid()(self.output(vector))\n",
"\n",
" return pred\n",
" \n",
" def training_step(self, batch, batch_idx):\n",
" user_input, item_input, labels = batch\n",
" predicted_labels = self(user_input, item_input)\n",
" loss = nn.BCELoss()(predicted_labels, labels.view(-1, 1).float())\n",
" return loss\n",
"\n",
" def configure_optimizers(self):\n",
" return torch.optim.Adam(self.parameters())\n",
"\n",
" def train_dataloader(self):\n",
" return DataLoader(MovieLensTrainDataset(self.ratings, self.all_movieIds),\n",
" batch_size=512, num_workers=2)"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "I8wT1WK9jzeJ"
},
"source": [
"We instantiate the NCF model using the class that we have defined above."
]
},
{
"cell_type": "code",
"metadata": {
"id": "F3Bh9dorjww7"
},
"source": [
"num_users = ratings['userId'].max()+1\n",
"num_items = ratings['movieId'].max()+1\n",
"\n",
"all_movieIds = ratings['movieId'].unique()\n",
"\n",
"model = NCF(num_users, num_items, train_ratings, all_movieIds)"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "K4Mw8CdVp5lF"
},
"source": [
"### Model Training"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "xnYHNWe3kRRD"
},
"source": [
"> Note: One advantage of PyTorch Lightning over vanilla PyTorch is that you don't need to write your own boiler plate training code. Notice how the Trainer class allows us to train our model with just a few lines of code.\n",
"\n",
"Let's train our NCF model for 5 epochs using the GPU. "
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 375,
"referenced_widgets": [
"68bcd7bfc32f4d9ebaba5c08437bca28"
]
},
"id": "0JganCIMj2EW",
"outputId": "6fa64b89-c835-4f39-d6ad-bac6f2369c64"
},
"source": [
"trainer = pl.Trainer(max_epochs=5, gpus=1, reload_dataloaders_every_epoch=True,\n",
" progress_bar_refresh_rate=50, logger=False, checkpoint_callback=False)\n",
"\n",
"trainer.fit(model)"
],
"execution_count": null,
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"GPU available: True, used: True\n",
"TPU available: False, using: 0 TPU cores\n",
"LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]\n",
"\n",
" | Name | Type | Params\n",
"---------------------------------------------\n",
"0 | user_embedding | Embedding | 1.3 M \n",
"1 | item_embedding | Embedding | 1.7 M \n",
"2 | fc1 | Linear | 1.1 K \n",
"3 | fc2 | Linear | 2.1 K \n",
"4 | output | Linear | 33 \n",
"---------------------------------------------\n",
"3.0 M Trainable params\n",
"0 Non-trainable params\n",
"3.0 M Total params\n",
"11.907 Total estimated model params size (MB)\n",
"/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py:481: UserWarning: This DataLoader will create 4 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.\n",
" cpuset_checked))\n"
]
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "68bcd7bfc32f4d9ebaba5c08437bca28",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"HBox(children=(FloatProgress(value=1.0, bar_style='info', description='Training', layout=Layout(flex='2'), max…"
]
},
"metadata": {
"tags": []
},
"output_type": "display_data"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "R6V1Tiw1kIxk"
},
"source": [
"> Note: We are using the argument reload_dataloaders_every_epoch=True. This creates a new randomly chosen set of negative samples for each epoch, which ensures that our model is not biased by the selection of negative samples."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "I3BXx1YzlAUq"
},
"source": [
"### Evaluating our Recommender System\n",
"\n",
"Now that our model is trained, we are ready to evaluate it using the test data. In traditional Machine Learning projects, we evaluate our models using metrics such as Accuracy (for classification problems) and RMSE (for regression problems). However, such metrics are too simplistic for evaluating recommender systems.\n",
"\n",
"The key here is that we don't need the user to interact on every single item in the list of recommendations. Instead, we just need the user to interact with at least one item on the list - as long as the user does that, the recommendations have worked.\n",
"\n",
"To simulate this, let's run the following evaluation protocol to generate a list of 10 recommended items for each user.\n",
"- For each user, randomly select 99 items that the user has not interacted with\n",
"- Combine these 99 items with the test item (the actual item that the user interacted with). We now have 100 items.\n",
"- Run the model on these 100 items, and rank them according to their predicted probabilities\n",
"- Select the top 10 items from the list of 100 items. If the test item is present within the top 10 items, then we say that this is a hit.\n",
"- Repeat the process for all users. The Hit Ratio is then the average hits."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "B2PVVpUflN34"
},
"source": [
"> Note: This evaluation protocol is known as Hit Ratio @ 10, and it is commonly used to evaluate recommender systems."
]
},
{
"cell_type": "code",
"metadata": {
"id": "uSLTYZuhlNEV"
},
"source": [
"# User-item pairs for testing\n",
"test_user_item_set = set(zip(test_ratings['userId'], test_ratings['movieId']))\n",
"\n",
"# Dict of all items that are interacted with by each user\n",
"user_interacted_items = ratings.groupby('userId')['movieId'].apply(list).to_dict()\n",
"\n",
"hits = []\n",
"for (u,i) in tqdm(test_user_item_set):\n",
" interacted_items = user_interacted_items[u]\n",
" not_interacted_items = set(all_movieIds) - set(interacted_items)\n",
" selected_not_interacted = list(np.random.choice(list(not_interacted_items), 99))\n",
" test_items = selected_not_interacted + [i]\n",
" \n",
" predicted_labels = np.squeeze(model(torch.tensor([u]*100), \n",
" torch.tensor(test_items)).detach().numpy())\n",
" \n",
" top10_items = [test_items[i] for i in np.argsort(predicted_labels)[::-1][0:10].tolist()]\n",
" \n",
" if i in top10_items:\n",
" hits.append(1)\n",
" else:\n",
" hits.append(0)\n",
" \n",
"print(\"The Hit Ratio @ 10 is {:.2f}\".format(np.average(hits)))"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "s1XtzBFsllfN"
},
"source": [
"We got a pretty good Hit Ratio @ 10 score! To put this into context, what this means is that 86% of the users were recommended the actual item (among a list of 10 items) that they eventually interacted with. Not bad!"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "_xofdqRI29zl"
},
"source": [
"## NMF with PyTorch on ML-1m"
]
},
{
"cell_type": "code",
"metadata": {
"id": "3wo7aehx3AyG"
},
"source": [
"import os\n",
"import time\n",
"import random\n",
"import argparse\n",
"import numpy as np \n",
"import pandas as pd \n",
"import torch\n",
"import torch.nn as nn\n",
"import torch.optim as optim\n",
"import torch.utils.data as data\n",
"from torch.utils.tensorboard import SummaryWriter"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "y1iNipgl3JhO",
"outputId": "363cf5f9-b062-4930-b122-d7573d824ab0"
},
"source": [
"DATA_URL = \"https://raw.githubusercontent.com/sparsh-ai/rec-data-public/master/ml-1m-dat/ratings.dat\"\n",
"MAIN_PATH = '/content/'\n",
"DATA_PATH = MAIN_PATH + 'ratings.dat'\n",
"MODEL_PATH = MAIN_PATH + 'models/'\n",
"MODEL = 'ml-1m_Neu_MF'\n",
"\n",
"!wget -q --show-progress https://raw.githubusercontent.com/sparsh-ai/rec-data-public/master/ml-1m-dat/ratings.dat"
],
"execution_count": null,
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\rratings.dat 0%[ ] 0 --.-KB/s \rratings.dat 100%[===================>] 23.45M 128MB/s in 0.2s \n"
]
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "EXFnsMFy3YTE"
},
"source": [
"def seed_everything(seed):\n",
" random.seed(seed)\n",
" os.environ['PYTHONHASHSEED'] = str(seed)\n",
" np.random.seed(seed)\n",
" torch.manual_seed(seed)\n",
" torch.cuda.manual_seed(seed)\n",
" torch.backends.cudnn.deterministic = True\n",
" torch.backends.cudnn.benchmark = True"
],
"execution_count": null,
"outputs": []
},
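{
"cell_type": "markdown",
"metadata": {},
"source": [
"The helper above is presumably invoked once, before any data splitting or training; the seed value below is arbitrary:"
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"seed_everything(42)  # arbitrary seed, for reproducibility"
],
"execution_count": null,
"outputs": []
},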
{
"cell_type": "markdown",
"metadata": {
"id": "KvTX81Z23bFs"
},
"source": [
"### Dataset"
]
},
{
"cell_type": "code",
"metadata": {
"id": "NN5GjJCf3rI8"
},
"source": [
"class Rating_Datset(torch.utils.data.Dataset):\n",
"\tdef __init__(self, user_list, item_list, rating_list):\n",
"\t\tsuper(Rating_Datset, self).__init__()\n",
"\t\tself.user_list = user_list\n",
"\t\tself.item_list = item_list\n",
"\t\tself.rating_list = rating_list\n",
"\n",
"\tdef __len__(self):\n",
"\t\treturn len(self.user_list)\n",
"\n",
"\tdef __getitem__(self, idx):\n",
"\t\tuser = self.user_list[idx]\n",
"\t\titem = self.item_list[idx]\n",
"\t\trating = self.rating_list[idx]\n",
"\t\t\n",
"\t\treturn (\n",
"\t\t\ttorch.tensor(user, dtype=torch.long),\n",
"\t\t\ttorch.tensor(item, dtype=torch.long),\n",
"\t\t\ttorch.tensor(rating, dtype=torch.float)\n",
"\t\t\t)"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "d4xgxyBsfoJM"
},
"source": [
"- *_reindex*: process dataset to reindex userID and itemID, also set rating as binary feedback\n",
"- *_leave_one_out*: leave-one-out evaluation protocol in paper https://www.comp.nus.edu.sg/~xiangnan/papers/ncf.pdf\n",
"- *negative_sampling*: randomly selects n negative examples for each positive one"
]
},
{
"cell_type": "code",
"metadata": {
"id": "HggfgX_8Oqmq"
},
"source": [
"class NCF_Data(object):\n",
"\t\"\"\"\n",
"\tConstruct Dataset for NCF\n",
"\t\"\"\"\n",
"\tdef __init__(self, args, ratings):\n",
"\t\tself.ratings = ratings\n",
"\t\tself.num_ng = args.num_ng\n",
"\t\tself.num_ng_test = args.num_ng_test\n",
"\t\tself.batch_size = args.batch_size\n",
"\n",
"\t\tself.preprocess_ratings = self._reindex(self.ratings)\n",
"\n",
"\t\tself.user_pool = set(self.ratings['user_id'].unique())\n",
"\t\tself.item_pool = set(self.ratings['item_id'].unique())\n",
"\n",
"\t\tself.train_ratings, self.test_ratings = self._leave_one_out(self.preprocess_ratings)\n",
"\t\tself.negatives = self._negative_sampling(self.preprocess_ratings)\n",
"\t\trandom.seed(args.seed)\n",
"\t\n",
"\tdef _reindex(self, ratings):\n",
"\t\t\"\"\"\n",
"\t\tProcess dataset to reindex userID and itemID, also set rating as binary feedback\n",
"\t\t\"\"\"\n",
"\t\tuser_list = list(ratings['user_id'].drop_duplicates())\n",
"\t\tuser2id = {w: i for i, w in enumerate(user_list)}\n",
"\n",
"\t\titem_list = list(ratings['item_id'].drop_duplicates())\n",
"\t\titem2id = {w: i for i, w in enumerate(item_list)}\n",
"\n",
"\t\tratings['user_id'] = ratings['user_id'].apply(lambda x: user2id[x])\n",
"\t\tratings['item_id'] = ratings['item_id'].apply(lambda x: item2id[x])\n",
"\t\tratings['rating'] = ratings['rating'].apply(lambda x: float(x > 0))\n",
"\t\treturn ratings\n",
"\n",
"\tdef _leave_one_out(self, ratings):\n",
"\t\t\"\"\"\n",
"\t\tleave-one-out evaluation protocol in paper https://www.comp.nus.edu.sg/~xiangnan/papers/ncf.pdf\n",
"\t\t\"\"\"\n",
"\t\tratings['rank_latest'] = ratings.groupby(['user_id'])['timestamp'].rank(method='first', ascending=False)\n",
"\t\ttest = ratings.loc[ratings['rank_latest'] == 1]\n",
"\t\ttrain = ratings.loc[ratings['rank_latest'] > 1]\n",
"\t\tassert train['user_id'].nunique()==test['user_id'].nunique(), 'Not Match Train User with Test User'\n",
"\t\treturn train[['user_id', 'item_id', 'rating']], test[['user_id', 'item_id', 'rating']]\n",
"\n",
"\tdef _negative_sampling(self, ratings):\n",
"\t\tinteract_status = (\n",
"\t\t\tratings.groupby('user_id')['item_id']\n",
"\t\t\t.apply(set)\n",
"\t\t\t.reset_index()\n",
"\t\t\t.rename(columns={'item_id': 'interacted_items'}))\n",
"\t\tinteract_status['negative_items'] = interact_status['interacted_items'].apply(lambda x: self.item_pool - x)\n",
"\t\tinteract_status['negative_samples'] = interact_status['negative_items'].apply(lambda x: random.sample(x, self.num_ng_test))\n",
"\t\treturn interact_status[['user_id', 'negative_items', 'negative_samples']]\n",
"\n",
"\tdef get_train_instance(self):\n",
"\t\tusers, items, ratings = [], [], []\n",
"\t\ttrain_ratings = pd.merge(self.train_ratings, self.negatives[['user_id', 'negative_items']], on='user_id')\n",
"\t\ttrain_ratings['negatives'] = train_ratings['negative_items'].apply(lambda x: random.sample(x, self.num_ng))\n",
"\t\tfor row in train_ratings.itertuples():\n",
"\t\t\tusers.append(int(row.user_id))\n",
"\t\t\titems.append(int(row.item_id))\n",
"\t\t\tratings.append(float(row.rating))\n",
"\t\t\tfor i in range(self.num_ng):\n",
"\t\t\t\tusers.append(int(row.user_id))\n",
"\t\t\t\titems.append(int(row.negatives[i]))\n",
"\t\t\t\tratings.append(float(0)) # negative samples get 0 rating\n",
"\t\tdataset = Rating_Datset(\n",
"\t\t\tuser_list=users,\n",
"\t\t\titem_list=items,\n",
"\t\t\trating_list=ratings)\n",
"\t\treturn torch.utils.data.DataLoader(dataset, batch_size=self.batch_size, shuffle=True, num_workers=2)\n",
"\n",
"\tdef get_test_instance(self):\n",
"\t\tusers, items, ratings = [], [], []\n",
"\t\ttest_ratings = pd.merge(self.test_ratings, self.negatives[['user_id', 'negative_samples']], on='user_id')\n",
"\t\tfor row in test_ratings.itertuples():\n",
"\t\t\tusers.append(int(row.user_id))\n",
"\t\t\titems.append(int(row.item_id))\n",
"\t\t\tratings.append(float(row.rating))\n",
"\t\t\tfor i in getattr(row, 'negative_samples'):\n",
"\t\t\t\tusers.append(int(row.user_id))\n",
"\t\t\t\titems.append(int(i))\n",
"\t\t\t\tratings.append(float(0))\n",
"\t\tdataset = Rating_Datset(\n",
"\t\t\tuser_list=users,\n",
"\t\t\titem_list=items,\n",
"\t\t\trating_list=ratings)\n",
"\t\treturn torch.utils.data.DataLoader(dataset, batch_size=self.num_ng_test+1, shuffle=False, num_workers=2)"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "hfXyLsgVOsBs"
},
"source": [
"### Metrics\n",
"Using Hit Rate and NDCG as our evaluation metrics"
]
},
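{
"cell_type": "markdown",
"metadata": {},
"source": [
"Under leave-one-out evaluation each user contributes exactly one held-out item, so the two metrics reduce to (a sketch of the standard definitions, with $rank_u$ the 1-based position of user $u$'s held-out item in the ranked list):\n",
"\n",
"$$HR@K = \\frac{1}{|U|}\\sum_{u \\in U} \\mathbb{1}[rank_u \\leq K], \\qquad NDCG@K = \\frac{1}{|U|}\\sum_{u \\in U} \\frac{\\mathbb{1}[rank_u \\leq K]}{\\log_2(rank_u + 1)}$$\n",
"\n",
"The `ndcg` helper below computes $1/\\log_2(\\mathrm{index}+2)$ with a 0-based index, which is the same quantity."
]
},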
{
"cell_type": "code",
"metadata": {
"id": "KM4B7r12OvnS"
},
"source": [
"def hit(ng_item, pred_items):\n",
"\tif ng_item in pred_items:\n",
"\t\treturn 1\n",
"\treturn 0\n",
"\n",
"\n",
"def ndcg(ng_item, pred_items):\n",
"\tif ng_item in pred_items:\n",
"\t\tindex = pred_items.index(ng_item)\n",
"\t\treturn np.reciprocal(np.log2(index+2))\n",
"\treturn 0\n",
"\n",
"\n",
"def metrics(model, test_loader, top_k, device):\n",
"\tHR, NDCG = [], []\n",
"\n",
"\tfor user, item, label in test_loader:\n",
"\t\tuser = user.to(device)\n",
"\t\titem = item.to(device)\n",
"\n",
"\t\tpredictions = model(user, item)\n",
"\t\t_, indices = torch.topk(predictions, top_k)\n",
"\t\trecommends = torch.take(\n",
"\t\t\t\titem, indices).cpu().numpy().tolist()\n",
"\n",
"\t\tng_item = item[0].item() # leave one-out evaluation has only one item per user\n",
"\t\tHR.append(hit(ng_item, recommends))\n",
"\t\tNDCG.append(ndcg(ng_item, recommends))\n",
"\n",
"\treturn np.mean(HR), np.mean(NDCG)"
],
"execution_count": null,
"outputs": []
},
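{
"cell_type": "markdown",
"metadata": {},
"source": [
"A quick sanity check of the two metric helpers on toy values (the item IDs here are made up, purely for illustration):"
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"# the held-out item 3 sits at 0-based index 2 of the recommended list\n",
"print(hit(3, [1, 2, 3]))   # in the list -> 1\n",
"print(ndcg(3, [1, 2, 3]))  # 1/log2(2+2) = 0.5\n",
"print(hit(9, [1, 2, 3]))   # not in the list -> 0"
],
"execution_count": null,
"outputs": []
},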
{
"cell_type": "markdown",
"metadata": {
"id": "HWyCD7pLOxjq"
},
"source": [
"### Models\n",
"- Generalized Matrix Factorization\n",
"- Multi Layer Perceptron\n",
"- Neural Matrix Factorization"
]
},
{
"cell_type": "code",
"metadata": {
"id": "aTQaitu7d1R3"
},
"source": [
"class Generalized_Matrix_Factorization(nn.Module):\n",
" def __init__(self, args, num_users, num_items):\n",
" super(Generalized_Matrix_Factorization, self).__init__()\n",
" self.num_users = num_users\n",
" self.num_items = num_items\n",
" self.factor_num = args.factor_num\n",
"\n",
" self.embedding_user = nn.Embedding(num_embeddings=self.num_users, embedding_dim=self.factor_num)\n",
" self.embedding_item = nn.Embedding(num_embeddings=self.num_items, embedding_dim=self.factor_num)\n",
"\n",
" self.affine_output = nn.Linear(in_features=self.factor_num, out_features=1)\n",
" self.logistic = nn.Sigmoid()\n",
"\n",
" def forward(self, user_indices, item_indices):\n",
" user_embedding = self.embedding_user(user_indices)\n",
" item_embedding = self.embedding_item(item_indices)\n",
" element_product = torch.mul(user_embedding, item_embedding)\n",
" logits = self.affine_output(element_product)\n",
" rating = self.logistic(logits)\n",
" return rating\n",
"\n",
" def init_weight(self):\n",
" pass"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "7kSFzPlNd50f"
},
"source": [
"class Multi_Layer_Perceptron(nn.Module):\n",
" def __init__(self, args, num_users, num_items):\n",
" super(Multi_Layer_Perceptron, self).__init__()\n",
" self.num_users = num_users\n",
" self.num_items = num_items\n",
" self.factor_num = args.factor_num\n",
" self.layers = args.layers\n",
"\n",
" self.embedding_user = nn.Embedding(num_embeddings=self.num_users, embedding_dim=self.factor_num)\n",
" self.embedding_item = nn.Embedding(num_embeddings=self.num_items, embedding_dim=self.factor_num)\n",
"\n",
" self.fc_layers = nn.ModuleList()\n",
" for idx, (in_size, out_size) in enumerate(zip(self.layers[:-1], self.layers[1:])):\n",
" self.fc_layers.append(nn.Linear(in_size, out_size))\n",
"\n",
" self.affine_output = nn.Linear(in_features=self.layers[-1], out_features=1)\n",
" self.logistic = nn.Sigmoid()\n",
"\n",
" def forward(self, user_indices, item_indices):\n",
" user_embedding = self.embedding_user(user_indices)\n",
" item_embedding = self.embedding_item(item_indices)\n",
" vector = torch.cat([user_embedding, item_embedding], dim=-1) # the concat latent vector\n",
" for idx, _ in enumerate(range(len(self.fc_layers))):\n",
" vector = self.fc_layers[idx](vector)\n",
" vector = nn.ReLU()(vector)\n",
" # vector = nn.BatchNorm1d()(vector)\n",
" # vector = nn.Dropout(p=0.5)(vector)\n",
" logits = self.affine_output(vector)\n",
" rating = self.logistic(logits)\n",
" return rating\n",
"\n",
" def init_weight(self):\n",
" pass"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "7DQpVuaV9cF0"
},
"source": [
"class NeuMF(nn.Module):\n",
" def __init__(self, args, num_users, num_items):\n",
" super(NeuMF, self).__init__()\n",
" self.num_users = num_users\n",
" self.num_items = num_items\n",
" self.factor_num_mf = args.factor_num\n",
" self.factor_num_mlp = int(args.layers[0]/2)\n",
" self.layers = args.layers\n",
" self.dropout = args.dropout\n",
"\n",
" self.embedding_user_mlp = nn.Embedding(num_embeddings=self.num_users, embedding_dim=self.factor_num_mlp)\n",
" self.embedding_item_mlp = nn.Embedding(num_embeddings=self.num_items, embedding_dim=self.factor_num_mlp)\n",
"\n",
" self.embedding_user_mf = nn.Embedding(num_embeddings=self.num_users, embedding_dim=self.factor_num_mf)\n",
" self.embedding_item_mf = nn.Embedding(num_embeddings=self.num_items, embedding_dim=self.factor_num_mf)\n",
"\n",
" self.fc_layers = nn.ModuleList()\n",
" for idx, (in_size, out_size) in enumerate(zip(args.layers[:-1], args.layers[1:])):\n",
" self.fc_layers.append(torch.nn.Linear(in_size, out_size))\n",
" self.fc_layers.append(nn.ReLU())\n",
"\n",
" self.affine_output = nn.Linear(in_features=args.layers[-1] + self.factor_num_mf, out_features=1)\n",
" self.logistic = nn.Sigmoid()\n",
" self.init_weight()\n",
"\n",
" def init_weight(self):\n",
" nn.init.normal_(self.embedding_user_mlp.weight, std=0.01)\n",
" nn.init.normal_(self.embedding_item_mlp.weight, std=0.01)\n",
" nn.init.normal_(self.embedding_user_mf.weight, std=0.01)\n",
" nn.init.normal_(self.embedding_item_mf.weight, std=0.01)\n",
" \n",
" for m in self.fc_layers:\n",
" if isinstance(m, nn.Linear):\n",
" nn.init.xavier_uniform_(m.weight)\n",
" \n",
" nn.init.xavier_uniform_(self.affine_output.weight)\n",
"\n",
" for m in self.modules():\n",
" if isinstance(m, nn.Linear) and m.bias is not None:\n",
" m.bias.data.zero_()\n",
"\n",
" def forward(self, user_indices, item_indices):\n",
" user_embedding_mlp = self.embedding_user_mlp(user_indices)\n",
" item_embedding_mlp = self.embedding_item_mlp(item_indices)\n",
"\n",
" user_embedding_mf = self.embedding_user_mf(user_indices)\n",
" item_embedding_mf = self.embedding_item_mf(item_indices)\n",
"\n",
" mlp_vector = torch.cat([user_embedding_mlp, item_embedding_mlp], dim=-1)\n",
"        mf_vector = torch.mul(user_embedding_mf, item_embedding_mf)\n",
"\n",
"        for layer in self.fc_layers:\n",
"            mlp_vector = layer(mlp_vector)\n",
"\n",
" vector = torch.cat([mlp_vector, mf_vector], dim=-1)\n",
" logits = self.affine_output(vector)\n",
" rating = self.logistic(logits)\n",
" return rating.squeeze()"
],
"execution_count": null,
"outputs": []
},
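{
"cell_type": "markdown",
"metadata": {},
"source": [
"Putting the two branches together: the MF path scores a user-item pair via the element-wise product of their embeddings, the MLP path via a tower over their concatenated embeddings, and (following the NeuMF formulation) the final prediction is\n",
"\n",
"$$\\hat{y}_{ui} = \\sigma\\left(h^{T}\\begin{bmatrix}\\phi^{MLP} \\\\ \\phi^{MF}\\end{bmatrix}\\right)$$\n",
"\n",
"where $h$ is the weight vector of the final affine layer, so both branches contribute to the predicted interaction probability."
]
},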
{
"cell_type": "markdown",
"metadata": {
"id": "tpBX6rqNfSc9"
},
"source": [
"### Setting Arguments\n",
"\n",
"Here is a brief description of the important ones:\n",
"- Learning rate of 0.001\n",
"- Dropout rate of 0.2\n",
"- Training for 10 epochs\n",
"- Evaluation with HitRate@10 and NDCG@10\n",
"- 4 negative samples for each positive one"
]
},
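{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a reminder of the evaluation protocol (leave-one-out, as in the original NCF paper): each user's held-out positive item is ranked against `num_ng_test` (here 100) sampled negatives, and for a cutoff $K$\n",
"\n",
"$$HR@K = \\frac{1}{|U|}\\sum_{u \\in U} \\mathbf{1}[rank_u \\le K], \\qquad NDCG@K = \\frac{1}{|U|}\\sum_{u \\in U} \\frac{\\mathbf{1}[rank_u \\le K]}{\\log_2(rank_u + 1)}$$\n",
"\n",
"where $rank_u$ is the 1-based position of the held-out item in the ranked list. HR@10 asks whether the positive item made the top 10 at all, while NDCG@10 also rewards placing it higher."
]
},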
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "Bc5Vg1Ik_gnF",
"outputId": "072e970d-c6d2-413c-d6f4-2f25e13ee4bf"
},
"source": [
"import argparse\n",
"\n",
"parser = argparse.ArgumentParser()\n",
"parser.add_argument(\"--seed\", \n",
"\ttype=int, \n",
"\tdefault=42, \n",
"\thelp=\"Seed\")\n",
"parser.add_argument(\"--lr\", \n",
"\ttype=float, \n",
"\tdefault=0.001, \n",
"\thelp=\"learning rate\")\n",
"parser.add_argument(\"--dropout\", \n",
"\ttype=float,\n",
"\tdefault=0.2, \n",
"\thelp=\"dropout rate\")\n",
"parser.add_argument(\"--batch_size\", \n",
"\ttype=int, \n",
"\tdefault=256, \n",
"\thelp=\"batch size for training\")\n",
"parser.add_argument(\"--epochs\", \n",
"\ttype=int,\n",
"\tdefault=10, \n",
"\thelp=\"training epochs\")\n",
"parser.add_argument(\"--top_k\", \n",
"\ttype=int, \n",
"\tdefault=10, \n",
"\thelp=\"compute metrics@top_k\")\n",
"parser.add_argument(\"--factor_num\", \n",
"\ttype=int,\n",
"\tdefault=32, \n",
"\thelp=\"number of predictive factors in the model\")\n",
"parser.add_argument(\"--layers\",\n",
" nargs='+', \n",
" default=[64,32,16,8],\n",
" help=\"MLP layers. Note that the first layer is the concatenation of user \\\n",
" and item embeddings. So layers[0]/2 is the embedding size.\")\n",
"parser.add_argument(\"--num_ng\", \n",
"\ttype=int,\n",
"\tdefault=4, \n",
"\thelp=\"Number of negative samples for training set\")\n",
"parser.add_argument(\"--num_ng_test\", \n",
"\ttype=int,\n",
"\tdefault=100, \n",
"\thelp=\"Number of negative samples for test set\")\n",
"parser.add_argument(\"--out\", \n",
"\tdefault=True,\n",
"\thelp=\"save model or not\")"
],
"execution_count": null,
"outputs": [
{
"data": {
"text/plain": [
"_StoreAction(option_strings=['--out'], dest='out', nargs=None, const=None, default=True, type=None, choices=None, help='save model or not', metavar=None)"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "RnaRWy2gg_Nw"
},
"source": [
"### Training"
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"background_save": true,
"base_uri": "https://localhost:8080/"
},
"id": "VyWquJG893CV",
"outputId": "61938a61-f7f5-4885-85d1-1e3e2d2a06f6"
},
"source": [
"# set device and parameters\n",
"args = parser.parse_args(args={})\n",
"device = torch.device(\"cuda:0\" if torch.cuda.is_available() else \"cpu\")\n",
"writer = SummaryWriter()\n",
"\n",
"# seed for Reproducibility\n",
"seed_everything(args.seed)\n",
"\n",
"# load data\n",
"ml_1m = pd.read_csv(\n",
"\tDATA_PATH, \n",
"\tsep=\"::\", \n",
"\tnames = ['user_id', 'item_id', 'rating', 'timestamp'], \n",
"\tengine='python')\n",
"\n",
"# set the num_users, items\n",
"num_users = ml_1m['user_id'].nunique()+1\n",
"num_items = ml_1m['item_id'].nunique()+1\n",
"\n",
"# construct the train and test datasets\n",
"data = NCF_Data(args, ml_1m)\n",
"train_loader = data.get_train_instance()\n",
"test_loader = data.get_test_instance()\n",
"\n",
"# set model and loss, optimizer\n",
"model = NeuMF(args, num_users, num_items)\n",
"model = model.to(device)\n",
"loss_function = nn.BCELoss()\n",
"optimizer = optim.Adam(model.parameters(), lr=args.lr)\n",
"\n",
"# train, evaluation\n",
"best_hr = 0\n",
"for epoch in range(1, args.epochs+1):\n",
"\tmodel.train()  # enable dropout (if any)\n",
"\tstart_time = time.time()\n",
"\n",
"\tfor user, item, label in train_loader:\n",
"\t\tuser = user.to(device)\n",
"\t\titem = item.to(device)\n",
"\t\tlabel = label.to(device)\n",
"\n",
"\t\toptimizer.zero_grad()\n",
"\t\tprediction = model(user, item)\n",
"\t\tloss = loss_function(prediction, label)\n",
"\t\tloss.backward()\n",
"\t\toptimizer.step()\n",
"\t\twriter.add_scalar('loss/Train_loss', loss.item(), epoch)\n",
"\n",
"\tmodel.eval()\n",
"\tHR, NDCG = metrics(model, test_loader, args.top_k, device)\n",
"\twriter.add_scalar('Performance/HR@10', HR, epoch)\n",
"\twriter.add_scalar('Performance/NDCG@10', NDCG, epoch)\n",
"\n",
"\telapsed_time = time.time() - start_time\n",
"\tprint(\"The time elapse of epoch {:03d}\".format(epoch) + \" is: \" + \n",
"\t\t\ttime.strftime(\"%H: %M: %S\", time.gmtime(elapsed_time)))\n",
"\tprint(\"HR: {:.3f}\\tNDCG: {:.3f}\".format(np.mean(HR), np.mean(NDCG)))\n",
"\n",
"\tif HR > best_hr:\n",
"\t\tbest_hr, best_ndcg, best_epoch = HR, NDCG, epoch\n",
"\t\tif args.out:\n",
"\t\t\tif not os.path.exists(MODEL_PATH):\n",
"\t\t\t\tos.mkdir(MODEL_PATH)\n",
"\t\t\ttorch.save(model, \n",
"\t\t\t\t'{}{}.pth'.format(MODEL_PATH, MODEL))\n",
"\n",
"writer.close()"
],
"execution_count": null,
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py:481: UserWarning: This DataLoader will create 4 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.\n",
" cpuset_checked))\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"The time elapse of epoch 001 is: 00: 05: 41\n",
"HR: 0.626\tNDCG: 0.359\n",
"The time elapse of epoch 002 is: 00: 05: 42\n",
"HR: 0.658\tNDCG: 0.389\n",
"The time elapse of epoch 003 is: 00: 05: 47\n",
"HR: 0.664\tNDCG: 0.396\n",
"The time elapse of epoch 004 is: 00: 05: 34\n",
"HR: 0.669\tNDCG: 0.400\n",
"The time elapse of epoch 005 is: 00: 05: 44\n",
"HR: 0.671\tNDCG: 0.401\n",
"The time elapse of epoch 006 is: 00: 05: 44\n",
"HR: 0.672\tNDCG: 0.402\n",
"The time elapse of epoch 007 is: 00: 05: 39\n",
"HR: 0.668\tNDCG: 0.396\n",
"The time elapse of epoch 008 is: 00: 05: 34\n",
"HR: 0.667\tNDCG: 0.396\n",
"The time elapse of epoch 009 is: 00: 05: 41\n",
"HR: 0.668\tNDCG: 0.397\n",
"The time elapse of epoch 010 is: 00: 05: 37\n",
"HR: 0.664\tNDCG: 0.395\n"
]
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"background_save": true,
"base_uri": "https://localhost:8080/"
},
"id": "fkiRJWeD_trR",
"outputId": "d3efcab5-fa0b-4938-d5ff-7967f38dab4d"
},
"source": [
"print(\"Best epoch {:03d}: HR = {:.3f}, NDCG = {:.3f}\".format(\n",
"\t\t\t\t\t\t\t\t\tbest_epoch, best_hr, best_ndcg))"
],
"execution_count": null,
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Best epoch 006: HR = 0.672, NDCG = 0.402\n"
]
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "WOTaSMGnPoAG"
},
"source": [
"## MF with PyTorch on ML-100k\n",
"\n",
"In this section, we train a PyTorch embedding-based model on the MovieLens-100k dataset and visualize the learned factors by decomposing them with PCA."
]
},
{
"cell_type": "code",
"metadata": {
"id": "I2f_R0Yo6BUp"
},
"source": [
"!pip install -U -q git+https://github.com/sparsh-ai/recochef.git"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "KXT07lHDBzAQ"
},
"source": [
"import torch\n",
"from torch import nn\n",
"from torch import optim\n",
"from torch.nn import functional as F \n",
"from torch.optim.lr_scheduler import _LRScheduler\n",
"\n",
"from recochef.datasets.movielens import MovieLens\n",
"from recochef.preprocessing.encode import label_encode\n",
"from recochef.utils.iterators import batch_generator\n",
"from recochef.models.embedding import EmbeddingNet\n",
"\n",
"import math\n",
"import copy\n",
"import pickle\n",
"import numpy as np\n",
"import pandas as pd\n",
"from textwrap import wrap\n",
"from sklearn.decomposition import PCA\n",
"from sklearn.model_selection import train_test_split\n",
"\n",
"import matplotlib.pyplot as plt\n",
"plt.style.use('ggplot')"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "NINhOhYAxt5n"
},
"source": [
"### Data loading and preprocessing"
]
},
{
"cell_type": "code",
"metadata": {
"id": "3Z4R3bXNjaNP"
},
"source": [
"data = MovieLens()"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 204
},
"id": "A2Xgw-sXk7Ac",
"outputId": "399404ea-51bd-476f-f74e-0dc4807ebaa9"
},
"source": [
"ratings_df = data.load_interactions()\n",
"ratings_df.head()"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" USERID ITEMID RATING TIMESTAMP\n",
"0 0 0 3.0 881250949\n",
"1 1 1 3.0 891717742\n",
"2 2 2 1.0 878887116\n",
"3 3 3 2.0 880606923\n",
"4 4 4 1.0 886397596"
]
},
"metadata": {},
"execution_count": 50
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "36dsiSqWwNRz"
},
"source": [
"X = ratings_df[['USERID','ITEMID']]\n",
"y = ratings_df[['RATING']]"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "Mp-OqA2jyNAS",
"outputId": "969eca89-0f0e-464d-d98e-70aeb215b99b"
},
"source": [
"for _x_batch, _y_batch in batch_generator(X, y, bs=4):\n",
" print(_x_batch)\n",
" print(_y_batch)\n",
" break"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"tensor([[873, 377],\n",
" [808, 601],\n",
" [ 90, 354],\n",
" [409, 570]])\n",
"tensor([[4.],\n",
" [3.],\n",
" [4.],\n",
" [2.]])\n"
]
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "oQlnTmST0cx6",
"outputId": "96db7cc7-9614-4215-f0e8-be69eb17e4e0"
},
"source": [
"_x_batch[:, 1]"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"tensor([377, 601, 354, 570])"
]
},
"metadata": {},
"execution_count": 24
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "mVavggU2_WUY"
},
"source": [
"### Embedding Net"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "D39WDxT5_f3l"
},
"source": [
"PyTorch is a framework that lets us build arbitrary computational graphs (not only neural networks) and run them on a GPU. A full treatment of tensors, neural networks, and computational graphs is outside the scope of this article, but briefly, one can treat the library as a set of tools for creating highly efficient and flexible machine learning models. In our case, we want to create a neural network that can infer the similarities between users and predict their ratings based on the available data."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "9Gve8w7f_l8i"
},
"source": [
"The picture above schematically shows the model we're going to build. At the very beginning, we put our embedding matrices, or look-ups, which convert integer IDs into arrays of floating-point numbers. Next comes a stack of fully-connected layers with dropout. Finally, we return the predicted ratings. For this purpose, we use an output layer with a sigmoid activation and rescale its value to the original range of ratings (for the MovieLens dataset, usually from 1 to 5)."
]
},
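{
"cell_type": "markdown",
"metadata": {},
"source": [
"The rescaling step is just an affine transform of the sigmoid output: if ratings live in $[r_{min}, r_{max}]$, the network's raw output $x$ is mapped to\n",
"\n",
"$$\\hat{r} = r_{min} + (r_{max} - r_{min}) \\cdot \\sigma(x),$$\n",
"\n",
"so for MovieLens ($r_{min}=1$, $r_{max}=5$) the prediction is $1 + 4\\sigma(x)$, which keeps every prediction inside the valid rating range."
]
},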
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "9zDzhH2c0Cv-",
"outputId": "134739d6-e9ff-4cd7-a0b5-0aac999500fa"
},
"source": [
"netx = EmbeddingNet(\n",
" n_users=50, n_items=20, \n",
" n_factors=10, hidden=[500], \n",
" embedding_dropout=0.05, dropouts=[0.5])\n",
"netx"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"EmbeddingNet(\n",
" (u): Embedding(50, 10)\n",
" (m): Embedding(20, 10)\n",
" (drop): Dropout(p=0.05, inplace=False)\n",
" (hidden): Sequential(\n",
" (0): Linear(in_features=20, out_features=500, bias=True)\n",
" (1): ReLU()\n",
" (2): Dropout(p=0.5, inplace=False)\n",
" )\n",
" (fc): Linear(in_features=500, out_features=1, bias=True)\n",
")"
]
},
"metadata": {},
"execution_count": 25
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "o4vQFZ6iwiyM"
},
"source": [
"### Cyclical Learning Rate (CLR)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "6RIaav5rwk66"
},
"source": [
"One of the `fastai` library features is the cyclical learning rate scheduler. We can implement something similar by inheriting from the `_LRScheduler` class in the `torch` library. Following the [original paper's](https://arxiv.org/abs/1506.01186) pseudocode and this [CLR Keras callback implementation](https://github.com/bckenstler/CLR), and making a couple of adjustments to support [cosine annealing](https://pytorch.org/docs/stable/optim.html#torch.optim.lr_scheduler.CosineAnnealingLR) with restarts, let's create our own CLR scheduler.\n",
"\n",
"The implementation of this idea is quite simple. The [base PyTorch scheduler class](https://pytorch.org/docs/stable/_modules/torch/optim/lr_scheduler.html) has the `get_lr()` method that is invoked each time when we call the `step()` method. The method should return a list of learning rates depending on the current training epoch. In our case, we have the same learning rate for all of the layers, and therefore, we return a list with a single value. \n",
"\n",
"The next cell defines a `CyclicLR` class that expects a single callback function. This function should accept the current training epoch and the base learning rate, and return a new learning rate value."
]
},
{
"cell_type": "code",
"metadata": {
"id": "eYQh4ZCmmgW9"
},
"source": [
"class CyclicLR(_LRScheduler):\n",
" \n",
" def __init__(self, optimizer, schedule, last_epoch=-1):\n",
" assert callable(schedule)\n",
" self.schedule = schedule\n",
" super().__init__(optimizer, last_epoch)\n",
"\n",
" def get_lr(self):\n",
" return [self.schedule(self.last_epoch, lr) for lr in self.base_lrs]"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "1bpK5hOvw7Hg"
},
"source": [
"Our scheduler is very similar to [LambdaLR](https://pytorch.org/docs/stable/optim.html#torch.optim.lr_scheduler.LambdaLR) but expects a slightly different callback signature.\n",
"\n",
"Now we only need to define the scheduling functions themselves. We'll create a couple of factory functions that accept scheduling parameters and return a _new function_ with the appropriate signature:"
]
},
{
"cell_type": "code",
"metadata": {
"id": "I6st2zPctj1T"
},
"source": [
"def triangular(step_size, max_lr, method='triangular', gamma=0.99):\n",
" \n",
" def scheduler(epoch, base_lr):\n",
" period = 2 * step_size\n",
" cycle = math.floor(1 + epoch/period)\n",
" x = abs(epoch/step_size - 2*cycle + 1)\n",
" delta = (max_lr - base_lr)*max(0, (1 - x))\n",
"\n",
" if method == 'triangular':\n",
"            pass  # plain triangular: delta needs no further adjustment\n",
" elif method == 'triangular2':\n",
" delta /= float(2 ** (cycle - 1))\n",
" elif method == 'exp_range':\n",
" delta *= (gamma**epoch)\n",
" else:\n",
" raise ValueError('unexpected method: %s' % method)\n",
" \n",
" return base_lr + delta\n",
" \n",
" return scheduler"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "k-CjYin0toWa"
},
"source": [
"def cosine(t_max, eta_min=0):\n",
" \n",
" def scheduler(epoch, base_lr):\n",
" t = epoch % t_max\n",
" return eta_min + (base_lr - eta_min)*(1 + math.cos(math.pi*t/t_max))/2\n",
" \n",
" return scheduler"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "oB_zTLW-wwdM"
},
"source": [
"To understand how the created functions work, and to check the correctness of our implementation, let's plot the learning rate as a function of the epoch number:"
]
},
{
"cell_type": "code",
"metadata": {
"id": "Dl-TWx4OwwdN"
},
"source": [
"def plot_lr(schedule):\n",
" ts = list(range(1000))\n",
" y = [schedule(t, 0.001) for t in ts]\n",
" plt.plot(ts, y)"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 265
},
"id": "wCfhKAoMwwdN",
"outputId": "52bd67ed-e05a-4a07-9747-c8d68b1eae5a"
},
"source": [
"plot_lr(triangular(250, 0.005))"
],
"execution_count": null,
"outputs": [
{
"output_type": "display_data",
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYYAAAD4CAYAAADo30HgAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAgAElEQVR4nOzda2BU13no/f/aM+KiC4KRkGSwwEbC2AiMjAbQzUaXUZIGtyU9rtvYcRqgdfvGIUfmvDlxoW/cNuWUE2JwjEid06qkqXlTUjeQxrm4GoTA0iCQABkLjI0MGMsICzRCFyQkzex1PmwskCXQbUZ7Luv3xRaz9t7Pmj0zz8zaaz9LSCkliqIoinKTZnYAiqIoSmBRiUFRFEUZQCUGRVEUZQCVGBRFUZQBVGJQFEVRBlCJQVEURRnAanYAvnLp0qUxbRcfH8/Vq1d9HE1gU30OD6rP4WE8fZ41a9aQ/65+MSiKoigDqMSgKIqiDKASg6IoijKASgyKoijKACoxKIqiKAOMaFZSXV0du3btQtd1CgsLWb169YDH+/r6KCkp4dy5c8TExFBcXExCQgIAe/fupby8HE3TWLNmDenp6QA899xzTJkyBU3TsFgsbNmyBYDOzk62b9/OlStXmDlzJs8//zzR0dG+7LOiKIpyF8P+YtB1ndLSUjZu3Mj27dupqqqisbFxQJvy8nKioqLYsWMHq1atYvfu3QA0NjbicrnYtm0bmzZtorS0FF3X+7d78cUX2bp1a39SANi3bx+LFy/mlVdeYfHixezbt89XfVUURVFGYNjE0NDQQFJSEomJiVitVrKzs6mpqRnQpra2lry8PAAyMzOpr69HSklNTQ3Z2dlERESQkJBAUlISDQ0Ndz1eTU0NK1euBGDlypWDjqWYR3a0oVftR1VqV/xBSmm8vjrazA4l7A07lOR2u4mLi+v/Oy4ujrNnz96xjcViITIyko6ODtxuN/Pnz+9vZ7PZcLvd/X9v3rwZgKKiIhwOBwBtbW3MmDEDgOnTp9PWNvSLxOl04nQ6AdiyZQvx8fHD93YIVqt1zNsGq7H2uX3fT+j+1etMn7+ASQvT/RCZ/6jzHPh6T9fR+uMfMOWLTzDtzzaMaR/B1mdf8EefTbvz+bvf/S42m422tjb+7u/+jlmzZrFw4cIBbYQQCCGG3N7hcPQnE2DMd/6pOyVHRvb1oh/4LQDXfvUfaAn3+iM0v1HnOfDpv/oPALorfkvP43+MiJg06n0EW599wZQ7n202Gy0tLf1/t7S0YLPZ7tjG6/XS1dVFTEzMoG3dbnf/tp/+NzY2lmXLlvUPMcXGxtLa2gpAa2sr06ZNG3EnFf+RdUegqxNmzUHWViK7u8wOSQkhsrsLWVsJs+ZAVyfyRLXZIYW1YRNDSkoKTU1NNDc34/F4cLlc2O32AW0yMjKoqKgAoLq6mrS0NIQQ2O12XC4XfX19NDc309TURGpqKjdu3KC7uxuAGzducPLkSebMmQOA3W7n4MGDABw8eJBly5b5sr/KGMnKMrDNRPvqN6C3B1nzltkhKSFE1rwFvT3G68s203i9KaYZdijJYrGwdu1aNm/ejK7r5Ofnk5yczJ49e0hJScFut1NQUEBJSQnr168nOjqa4uJiAJKTk8nKymLDhg1omsa6devQNI22tja+//3vA8YvjNzc3P5prKtXr2b79u2Ul5f3T1dVzCVbmuHdtxGP/xHMWwD3JCOrnPDY580OTQkRssoJ9yTDvAWInELkG3uQVz9BxCeaHVpYEjJEppio6qojN9o+6//5U+Qb/4b29/+IiEtA/699yH//Z7S/KUHMmuPHSH1HnefAJS9dRH/xG4g/XIv2udXIlmb0v/wzxON/hPZ7T41qX8HSZ19S1VWVCSd1HenaDw8tQcQZNy2KrHywWNTPfcUnZGUZWCzG6wqM19lDS5BV+5G61+TowpNKDMrdnTkJLc2InFszwE
RMLCxZjqyuQHr6TAxOCXbS04esroAly43X1U0ixwHuK8brT5lwKjEodyUryyAyGvFI5oB/13KLoKMNTqobEJVxOFkDHW3G6+k24pFMiIxGVjpNCiy8qcSg3JG83oE8UY3IzBs8pzztEZgeh67euMo46JVOmB5nvJ5uIyImITLzkCcOIzvbTYoufKnEoNyRPHIQPH0DhpE+JTQLIrsA6o8jW1uG2FpR7k62tkD9cUR2AUKzDHpc5DjA40EeOWRCdOFNJQbljmRlGcxJQcyZN+TjIscB8ubFaUUZJenaD1If8osHYLzu5qQgK8tUfa4JphKDMiT54Qfw0XnEZ8Z+bycS7oEFi5FVTuRtVXMVZThSSuPehQWLjdfRHYjcImg8DxfPTWB0ikoMypBkVRlYIxDLH7trO5HjgCuX4ezpCYpMCQnvn4Irl+/4a+FTYvljYI1QU6MnmEoMyiCytwd55CBiaTYi6u6LJIml2TA1Ur1xlVGRlWUwNdJ4/dyFiIpGLM1GHjmI7O2ZoOgUlRiUQeSJaui6jsi9+7c5ADF5MmLZY8jjVciu6xMQnRLsZNd15PEqxLLHEJMnD9te5Dqg+7oqrDeBVGJQBpFVTohPhAWLR9Re5BZBb68qrKeMiFEwr/eu168GWLAY4hPVr9IJpBKDMoC8ctkomJdTiNBG+PK4LxVmz1VvXGVEZGUZzJ5rvG5GQGgaIqcQzpw0Xp+K36nEoAwgXeUgBCKrcMTbCCGMn/sXziIbL/gvOCXoycYLcOEsItdxx0W4hiKyCkEINTV6gqjEoPSTuhfpcsLCdETczFFtK1bkg8VqDEMpyh3IKidYrMbrZRRE3ExYmK4K600QlRiUW06/De6rg+rWjISImYZIX4GsPoDsU4X1lMFkXx+y+gAifQUiZvQrM2q5RdB6FU7X+SE65XYqMSj9ZJUTomNgyYoxbS9yHdDZASeP+jgyJSScPAqdHSOa7TakJSsgOkYV1psAw67gBlBXV8euXbvQdZ3CwkJWr1494PG+vj5KSko4d+4cMTExFBcXk5Bg1O7fu3cv5eXlaJrGmjVr+ldqA9B1nRdeeAGbzcYLL7wAwM6dOzl9+jSRkZEAPPfcc9x3332+6KtyF7KzHVlXjVj5O4iIiLHtZGE62OLRK8uwZOT4NkAl6OmVZWCLN14nYyAiIhAr8pAVv0F2tI/pV4cyMsP+YtB1ndLSUjZu3Mj27dupqqqisbFxQJvy8nKioqLYsWMHq1atYvfu3QA0NjbicrnYtm0bmzZtorS0FP220gm//vWvmT179qBjPvPMM2zdupWtW7eqpDBBZHUFeDwjn0I4BKOwXiGcOoF0X/FdcErQk+4rcOoEIrtwyIJ5IyVyi8DrQR6p8F1wyiDDJoaGhgaSkpJITEzEarWSnZ1NTc3AGvy1tbXk5eUBkJmZSX19PVJKampqyM7OJiIigoSEBJKSkmhoaACgpaWF48ePU1g48tkvin9IKY0phHNTEffeN659iexCkNKY3aQoN0lXOUhpvD7GQdx7H8xNVYX1/GzYoSS3201cXFz/33FxcZw9e/aObSwWC5GRkXR0dOB2u5k/f35/O5vNhtvtBuDHP/4xX/nKV+ju7h50zJ/+9Ke8/vrrLFq0iKeffpqIIYY2nE4nTqcx1rhlyxbi4+NH0t9BrFbrmLcNVp/tc1/Du7g//pCYP/8WkeN9LuLjaV2cgbf6AHFf/X9Gfi+En6nzbB6p67RUH8CyOIMZDy0a9/66vvAlOn60lenXrhAxf+GAxwKlzxPJH30e0TUGXzt27BixsbHMmzePU6dODXjsqaeeYvr06Xg8Hn70ox/xi1/8gieeeGLQPhwOBw7HrYtYY10MWy0eDvob/w4Rk7i+cCldPngu9OUrkaXbuFp1APHQknHvzxfUeTaPPHMS/ZNL6I//sU/ikQuXQsQkWn/1OtpXvj7gsUDp80QaT59nzZo15L8P+3XOZrPR0nJrIZaWlhZsNtsd23i9Xr
q6uoiJiRm0rdvtxmaz8d5771FbW8tzzz3Hyy+/TH19Pa+88goAM2bMQAhBREQE+fn5/UNPin/Inh7k0UOIjGxEZJRP9imWZsHUKHVPgwJ8WjAvynhd+ICIjEJkZCOPHkL2qMJ6/jBsYkhJSaGpqYnm5mY8Hg8ulwu73T6gTUZGBhUVFQBUV1eTlpaGEAK73Y7L5aKvr4/m5maamppITU3lqaee4tVXX2Xnzp0UFxezaNEivvnNbwLQ2toK0H+NIjk52cddVm4nT7igu2tcF50/S0yajFixEnn8MLKr02f7VYKP7OpEHj+MWLESMWn4gnkjJXKLoLsLedzls30qtww7lGSxWFi7di2bN29G13Xy8/NJTk5mz549pKSkYLfbKSgooKSkhPXr1xMdHU1xcTEAycnJZGVlsWHDBjRNY926dWjDjDm/8sortLcba7zOnTuXZ5991gfdVO5EVjphZhLMT/PpfkWuA1nxa+PXSN4XfbpvJXjIo4egr3fs9y7cyQOLYGaS8as0a3R3USvDEzJELu1funRpTNuF85ikbG5C3/TniNVfQVv1pE+PIaVE/9tisFiw/NU2n+57LML5PJvJ+3cbwOtF+87Lo6qNNBL6r36G3Pca2uYf9a8CFwh9nmimXGNQQpes2g9CQ2QV+HzfRmG9IviwAfnReZ/vXwl88qPz8GEDIrfI50kBMF63QlPXsvxAJYYwZRTM2w9pjyBs/pneJ1Y8BlZVWC9cySonWK3G68APhC0e0h5BulRhPV9TiSFcnaqDay1jKpg3UiJ6GuKRLGR1hSqsF2aMgnkViEeyENH+K12h5RbBNTecOuG3Y4QjlRjClF5ZBtHTYMkyvx5H5DrgegeyTi3LGE5k3RG4Po6CeSO1ZBlETzNez4rPqMQQhvS2Vnj7KCIzH2EdY8G8kXpwCdhmqoqYYUZWloFtpnH+/UhYIxCZ+fD2UWRHm1+PFU5UYghD3QffBO/4CuaNVP+yjO/WIVua/X48xXyypRnerRvd8rDjYBTW8yIPH/D7scKFSgxhRkpJt/OXcP8DiNlzJuSYnxZOk1VqWcZw8GkBxfEWzBspMXsO3P+AKqznQyoxhJsLZ/F+dN7/Y7+3EfGJ8ODDN2eP6MNvoAQtqevGbKQHHzbO+wQRuQ5o+gjP2dMTdsxQphJDmJGVZTB5CmKZf6YQ3onILYKWZjhzckKPq0ywMyehpXlChilvJ5Y9BpMmG7+GlXFTiSGMfFowb0p2AWJq5IQeWzySCZHRRmJSQpasckJktHG+J5CYGonIyOFGpRPZc2NCjx2KVGIII/JYFdzoZmrhqgk/toiYZBTWO1GNvN4x4cdX/E9ev61gXsSkCT++yHUgu7uM17kyLioxhBFZVQYJ9xAxxjV3x0vkFoGnD3nkoCnHV/xLHqkAT9+EDyP1m5+G5Z5k9avUB1RiCBPyk0vw/ilEjsMvdWtGQsyZB3PmqRIZIUpWOWHOPOM8m0AIYfwaPnsaefljU2IIFSoxhAlZ5TQK5mX7vmDeaIjcIrh4DnnxA1PjUHxLXvwALp4z79fCTVPyf8corOdSXz7GQyWGMCC9XmNu+eIMxPS44TfwI7F8JVgj1M/9ECMry8AaYZxfE1lsM2FxBtJ1AOlVhfXGSiWGcHDqOLS50XIm7t6FOxFR0YilWcgjB5F9vWaHo/iA7OtFHjmIWJqFiIo2Oxzjdd7mhvrjZocStIZdwQ2grq6OXbt2oes6hYWFrF69esDjfX19lJSUcO7cOWJiYiguLiYhIQGAvXv3Ul5ejqZprFmzhvT0Wxc+dV3nhRdewGaz8cILLwDQ3NzMyy+/TEdHB/PmzWP9+vVYrSMKU7kDvbIMYmLhYf8WzBspkVtkrNd7cwaLEtzk8cPQdd30YaR+Dy+DmFj0yjIsfi4SGaqG/cWg6zqlpaVs3LiR7du3U1VVRWNj44A25eXlREVFsWPHDlatWsXu3bsBaGxsxOVysW3bNjZt2kRpaSn6bXe+/v
rXv2b27NkD9vXaa6+xatUqduzYQVRUFOXl5b7oZ9iS7a1wsgaRVYAIlAS7YDHEJaiL0CFCVjkhLsE4rwFAWK3GIj7v1Bivf2XUhk0MDQ0NJCUlkZiYiNVqJTs7m5qamgFtamtrycvLAyAzM5P6+nqklNTU1JCdnU1ERAQJCQkkJSXR0NAAQEtLC8ePH6ew8FY9FSklp06dIjPTuDkmLy9v0LGU0ZHVFeD1TmgJjOEYhfUc8O7byKufmB2OMg7y6ifw7tvGbLcJKJg3UiLXcbOwXoXZoQSlYb9Cut1u4uJuXbCMi4vj7Nmzd2xjsViIjIyko6MDt9vN/Pnz+9vZbDbcbjcAP/7xj/nKV75Cd3d3/+MdHR1ERkZisVgGtf8sp9OJ02l849yyZQvx8WNbhcxqtY5520AnpaTl8AG0BYuwLX6k/98Doc/ex5/g6i9/ytQTh4n+8p/6/XiB0OeJNhF97izbx3UhiHv8CSwB8Pz29zk+HveCReiHy4l76k9Nm6I9Efxxnk0ZWzh27BixsbHMmzePU6dOjWkfDocDh+PWt+CxLoYdyouHyw/OoDdeQP/qNwb0MSD6LKzwUDrXnf9Jd+HvIjSLXw8XEH2eYP7us9S96M7/hIfSaRVWCIDn9/Y+6yvykD8p4erRKkTKgyZH5j/jOc+zZs0a8t+H/e1ns9loaWnp/7ulpQWbzXbHNl6vl66uLmJiYgZt63a7sdlsvPfee9TW1vLcc8/x8ssvU19fzyuvvEJMTAxdXV14b04z+7S9MjayynmzYF6u2aEMSeQWgfsqvKsK6wWld0+C+2rgXHT+DLEsFyZPUVOjx2DYxJCSkkJTUxPNzc14PB5cLhd2u31Am4yMDCoqKgCorq4mLS0NIQR2ux2Xy0VfXx/Nzc00NTWRmprKU089xauvvsrOnTspLi5m0aJFfPOb30QIQVpaGtXVxjKQFRUVg46ljIy80Y08+hbCnoOYMrEF80ZKpK+AqBj1xg1SsrIMomKM8xiAxJRIhD0HWVOJvNE9/AZKv2ETg8ViYe3atWzevJnnn3+erKwskpOT2bNnD7W1tQAUFBTQ2dnJ+vXreeONN3j66acBSE5OJisriw0bNrB582bWrVuHNswFqqeffpo33niD9evX09nZSUGBuXfqBit5zAU93QH7bQ5AREQgMvOQddXIznazw1FGQXa2I+uqEZl5iAg/Lw87DiK3CHq6VWG9URIyRJY8unTp0pi2C9WxZ+//fgE629D+9oeDLrwFUp9l43n0v/nviD/+M7TC3/XbcQKpzxPFn33W9/8S+W//iPbiDxD33u+XY4zFZ/sspUT/ztchehqWb/9vEyPzH1OuMSjBR15uhIbTphbMGylx7/0wN1UtyxhEpJTGMNLc1IBKCkMRQhhToxveRTY1Dr+BAqjEEJJk1X7QNOMmnyAgch3QeAFUYb3gcPEDaLwQUPfG3I3IKgBNUzdUjoJKDCFGer3Iw+Ww2I6InWF2OCMilj8GEZPUReggISvLIGKScd6CgIidAYvtyMPlSI/H7HCCgkoMoeadWmhrRQvgi86fJSI/Lax3CNnbY3Y4yl3I3h7kkUNGwbxI8wvmjZSWWwTt16D+mNmhBAWVGEKMXuWEadNhUYbZoYyKyC2C7utGQTYlYMnjh6E7gArmjdSiDJg23SgoqQxLJYYQItsCsGDeSD2wCGYmqeGkACcry2BmknG+gsitwnq1yGtDl9lRblGJIYTIw+Wg60FzUfB2QtMQ2YXw3jvI5iazw1GGIK9chvfeQWQXBlTBvJESuQ7QdWT1AbNDCXjBd3aVIUkpjVkXqQ8hku41O5wxEdmFN5dl3G92KMoQbi0PWzh84wAkku6F1IeQlU41NXoYKjGEig/ehcsfB9/Y722ELR7SHkG6ypG6WpYxkEj95vKwaY8Y5ylIidwi+ORjaHjX7FACmkoMIUJWlsHkqYiMHLNDGRct1wGtV+FUndmhKLc7XQetV43zE8RERg5Mno
qsUtey7kYlhhAgb3Qha6sQy3IRU6aaHc74LFkO0dPQ1Rs3oOiVZRA9zTg/QUxMmYpYlousrULe6DI7nIClEkMIkDWV0HMjqIeRPiWsEYjMfKg7iuxoMzscBYzzUHcUkZmPsAZuwbyRMgrr3TDeN8qQVGIIAbLKCfckw7wFZofiE8ayjB5jWVLFdPJIBXg9QTnbbUjzFsA9yapExl2oxBDkZNNH8MGZoCiYN1Ji9ly4/wFVWC8AGAXznHD/A8Z5CQH9hfU+OGO8f5RBVGIIcrLSCRYLIivP7FB8SuQ44NJFuHB2+MaK/1xogI8/NM5HCBFZ+WCxGO8fZZAR3R5bV1fHrl270HWdwsJCVq9ePeDxvr4+SkpKOHfuHDExMRQXF5OQkADA3r17KS8vR9M01qxZQ3p6Or29vbz44ot4PB68Xi+ZmZk8+eSTAOzcuZPTp08TGWmsOvbcc89x3333+bDLoUN6PDcL5i1DTAuOgnkjJZY9ivzZPyErnYj7HzA7nLAlK8tg0iTEskfNDsWnxLTp8PAyo7Del54JvkoBfjbss6HrOqWlpfzVX/0VcXFx/OVf/iV2u5177711E1V5eTlRUVHs2LGDqqoqdu/ezfPPP09jYyMul4tt27bR2trKd7/7XX7wgx8QERHBiy++yJQpU/B4PHznO98hPT2dBx4wPgCeeeYZMjMz/dfrUPFOLXS0BVXBvJESkVGIjBxkzSHkk+sQkyebHVLYkT09yJpDiIwcRGSU2eH4nJZThH6iGk7WwNIss8MJKMMOJTU0NJCUlERiYiJWq5Xs7GxqamoGtKmtrSUvLw+AzMxM6uvrkVJSU1NDdnY2ERERJCQkkJSURENDA0IIpkyZAoDX68Xr9YbM+PhE0ivLINYGi5aaHYpfiJwi6O5SyzKaRB53QXeXcR5C0aKlEGszCk8qAwz7i8HtdhMXF9f/d1xcHGfPnr1jG4vFQmRkJB0dHbjdbubPn9/fzmaz4XYbBax0Xefb3/42ly9f5vOf//yAdj/96U95/fXXWbRoEU8//TQRQ6wp63Q6cTqNE7plyxbi48d2N6bVah3ztmbyuq9wtf4YkaufJiYxcVTbBkufZdxKWu65F+3oQWy/9+S49hUsffal8fbZfaQC/Z57icteGTRf3Ebb547CVXTt280MTWKxzfRjZP7jj9e2aQNrmqaxdetWrl+/zve//30uXrzInDlzeOqpp5g+fToej4cf/ehH/OIXv+CJJ54YtL3D4cDhuHVBbKxrngbrWsD6b/4DdJ0bS3PoGWX8wdRnPTMf795/5crpk4iEodenHYlg6rOvjKfPsvkS+qkTiC89Q0tLi48j85/R9lkuzYGf/ystb7yO9sU/9GNk/mPKms82m23AC6OlpQWbzXbHNl6vl66uLmJiYgZt63a7B20bFRVFWloadXVGCYQZM2YghCAiIoL8/HwaGhpG2MXw0T+F8IE0ROLYPyyDgcgqMArrqdkjE0pW7TcK5gXJ8rBjJRJnwQNpyCpVWO92wyaGlJQUmpqaaG5uxuPx4HK5sNvtA9pkZGRQUVEBQHV1NWlpaQghsNvtuFwu+vr6aG5upqmpidTUVNrb27l+/ToAvb29nDx5ktmzZwPQ2toK0H+NIjk52Zf9DQ1nT0PzpZCbQjgUMSMOFi01Zo94VWG9iSC9XqPC7aKlxvMf4kSOA5qb4Owps0MJGMMOJVksFtauXcvmzZvRdZ38/HySk5PZs2cPKSkp2O12CgoKKCkpYf369URHR1NcXAxAcnIyWVlZbNiwAU3TWLduHZqm0drays6dO9F1HSklWVlZZGQYK4698sortLe3AzB37lyeffZZP3Y/OMnKMpgS/AXzRkrLLUL/h7+HU8fh4WVmhxP6Th2Ha260L/+52ZFMCJGRg/zp/0FWliGCbAEifxEyRH4/Xbp0aUzbBdvYs+zuQv9//wSRmYf2zHNj2kfQ9dnTh/4/10LqQ1i+vnFM+wi2PvvCWPvs/Ye/h7
On0b73z0FXG2msfdb/dSey+gDa93+CmBrph8j8x5RrDEpgkTVvQW9PWAwjfUpYI4w7VU/WINuvmR1OSJPt1+Dto4is0CiYN1IixwG9vciaQ2aHEhBUYggysrIMZs2BMLsbWOQ4wOtVyzL6mayuAK83rL54AMb7adYcNcnhJpUYgoj8+CKcfx+RWxQ088p9RcyaA/MWqGUZ/ciY7VYG8xYYz3cYEUIY5bjPv2+8z8KcSgxBRFaVgcWKyMwzOxRTiNwiaPoIzr1ndiih6dx70PRRSKzrMRYiMw8sViM5hjmVGIKE9PQhDx+AJcsRMbFmh2MKYc+FSZNVHX0/kVVOmDTZeJ7DkIiJhSXLkdUHkJ4+s8MxlUoMweJkDXS2B/2au+MhpkYi7LnIo28he26YHU5IkT03kDVvIey5QTcrx5e03CLobIe3a4ZvHMJUYggSeqUTpsdB2iNmh2IqY1nGbmStKqznS7K2Cm50h+0wUr+0dJgeZxSoDGMqMQQB2doC9ccR2YUIzWJ2OOZKfQgSZ6txYB+TVWWQONt4fsOY0CyI7EI4dQLpDq/7Xm6nEkMQkK79IHVETqHZoZiuf1nGhtPIyx+bHU5IkJc/hrOnQ2p52PEQOYUgdWMRrDClEkOAk7puXBRcsBiRcI/Z4QQEkZUPmqYuQvuIrHKCphnPq2K8zxYsNgrr6brZ4ZhCJYZAd/YUXLmMCOOLzp8lpttgsV0V1vMB6fXeXB7WbjyvCoDxfrtyOWwL66nEEOBkpROmRiIeyTY7lICi5TqgrRXqj5kdSnCrPwZtrWE9220o4pFsmBoZtteyVGIIYLLrOvJ4FWL5Y2rN489aZIdp08N+9sh46ZVOmDbdeD6VfmLyZMTyx5DHXMiu62aHM+FUYghgRsG8XjWFcAjCar1VWK+t1exwgpJsb4V3am4WzDNtMceAJXKLoK8XeTT8CuupxBDAZGUZzJ4Lc1PNDiUgiZwi0HVVWG+M5OEDNwvmqS8eQ5qbCrPnhuVwkkoMAUo2XoALZ8OyYN5IiXvuhZQHkZVlqrDeKPUvD5vyoPE8KoP0F9b7sAHZeN7scCbUiH4/1tXVsWvXLnRdp7CwkNWrVw94vK+vj5KSEs6dO0dMTAzFxcUkJNweGwgAACAASURBVCQAsHfvXsrLy9E0jTVr1pCenk5vby8vvvgiHo8Hr9dLZmYmTz75JADNzc28/PLLdHR0MG/ePNavX481DH/myionWMO3YN5Iidwi5L/sgA/OhP3NWaPywRm43Ij4k/VmRxLQRGYe8j9+jKx0Iv74z8wOZ8IM+4tB13VKS0vZuHEj27dvp6qqisbGxgFtysvLiYqKYseOHaxatYrdu3cD0NjYiMvlYtu2bWzatInS0lJ0XSciIoIXX3yRrVu38r3vfY+6ujref/99AF577TVWrVrFjh07iIqKorw8/G4ykX19yOoDiCUrENHTzA4noAl7LkyeEpY/98dDVpbB5ClhWzBvpET0NMSSFcjqCmRf+BTWGzYxNDQ0kJSURGJiIlarlezsbGpqBhaYqq2tJS8vD4DMzEzq6+uRUlJTU0N2djYREREkJCSQlJREQ0MDQgimTJkCgNfrxev1IoRASsmpU6fIzMwEIC8vb9CxwsLbR6CzQ110HgExZapRWK+2Enmjy+xwgoK80Y2srTQK5k2ZanY4AU/kFsH1DuN9GSaGHaNxu93ExcX1/x0XF8fZs2fv2MZisRAZGUlHRwdut5v58+f3t7PZbLjdbsD4JfLtb3+by5cv8/nPf5758+fT3t5OZGQkFotlUPvPcjqdOJ3Gna9btmwhPj5+NP3uZ7Vax7ytv7QePYQnPpH4RwsRFt/XRgrEPo9H7+N/SGuVk+gzJ5nqeHzINqHW55G4U5+7nW/Q3nOD6Y//IZNC7Dnxx3mWjxZydfc/YD16kBlfWD38BhPMH302bfBe0zS2bt3K9evX+f73v8/FixeZPn36iLd3OBw4HLduyhnrYtiBtki8dF
9BrzuCWPUkLa3+mYYZaH0eLxmXBEn30v7bn3M9PXPINqHW55G4U5+9v/05JN1LW1wSIsSeE3+dZ5mZR++vfsaV995FxM30+f7HYzx9njVr1pD/PuxQks1mo6Wlpf/vlpYWbDbbHdt4vV66urqIiYkZtK3b7R60bVRUFGlpadTV1RETE0NXVxfem2UOhmof6oyCedKo8KiMiDF7xAEfnEE2fWR2OAFNNjXCB2cQuapg3miI7EKQEnl4v9mhTIhhE0NKSgpNTU00Nzfj8XhwuVzY7QPvkszIyKCiogKA6upq0tLSEEJgt9txuVz09fXR3NxMU1MTqamptLe3c/26cTdhb28vJ0+eZPbs2QghSEtLo7q6GoCKiopBxwplRsG8/fDgw4iZSWaHE1RUYb2RkVVlqmDeGIiZSfDgw8aa42FQWG/YoSSLxcLatWvZvHkzuq6Tn59PcnIye/bsISUlBbvdTkFBASUlJaxfv57o6GiKi4sBSE5OJisriw0bNqBpGuvWrUPTNFpbW9m5cye6riOlJCsri4yMDACefvppXn75Zf7t3/6N+++/n4KCAv8+A4HkvXfg6ieI1V8xO5KgI6bNgIeXIV3lyNXPqDt5hyA9HuOmtoeXGc+XMioitwj5Ty8Z79OHlpgdjl8JGSJ3Bl26dGlM2wXS2LP+jy8h62vRtv4YMcl/tZECqc++JN8+il7yd2hf34h4ZOC1hlDt8918ts+yrhp95/9C+8ZfIZYsNzEy//HneZa9Pejf+hpikR3tz/6HX44xFqZcY1AmhrzeiTzuQixf6dekENIWZUDsDHQ1nDQkvdIJsTOM50kZNTFpMmL5SuRxF/J6p9nh+JVKDAFCHj0Enj5178I4CIsFkVUA79Qirw09zTlcyWtueKcWkVXglynQ4ULkFoGnL+QL66nEECBklROS70fMTTE7lKAmchxGYb0wXpZxKPLwAdB14/lRxkzMTYHk+0P+TnuVGAKA/Og8fNigqlz6gEiaDfMXGrNHQuPy2bhJKY0vHvMXGs+PMi4ipwgufoC8eM7sUPxGJYYAICvLwBqByFxpdighQeQUQfMlOHva7FACQ8O78MnH6ouHj4jMlWCNCOmp0SoxmEz29SKrKxCPZCKiYswOJyQIew5MmRrSb9zRkJVlMGWq8bwo4yaiYhCPZN4srNdrdjh+oRKDyWTdEejqNO7cVXxCTJ6CWPaoUVivO7wL68nuLqNg3rJHEZOnmB1OyBC5DujqRJ6oNjsUv1CJwWSysgxsM+HB0L5hZqKJHAf09hjLo4YxWVsJvT3qorOvPbgE4hJC9lepSgwmki3N8O7biJxChKZOhU/NWwD3JIfsG3ekZGUZ3JNsPB+KzwhNM+onvfu28T4OMerTyESyyijIpb7N+V5/Yb1z7yEvXTQ7HFN4PjoP595TBfP8ROQYhS5D8cuHSgwmkbpuVFJ9aAkiLsHscEKSyMwHiyXk55zfSff+N8BiMZ4HxedEXAI8tARZtT/kCuupxGCWMyehpVn9WvAjMW06LFkedssyAkhPH90HfgNLlhvPg+IXIscB7itw5m2zQ/EplRhMIivLIDJ6ULE3xbe0HAd0tNFTW2V2KBPrZC2y/ZrRf8VvxCOZEBmNrAyt4SSVGEwgr3cgT1QjMvMQEZPMDie0pS2F6Ta69//S7EgmlF5ZhmaLN/qv+I2ImITIzEOeqEZe7zA7HJ9RicEE8shBo2Ce+jbnd8JiQWQX0nviCLK1ZfgNQoBsbYH640zN/6IqmDcBRI7DKKxXfdDsUHxGJQYTyMoymJOCmDPP7FDCgsgpNArrucJjWUZ5uBykzpTCx80OJSyIOfNgToqxOl6IGNEyV3V1dezatQtd1yksLGT16tUDHu/r66OkpIRz584RExNDcXExCQnGTJu9e/dSXl6OpmmsWbOG9PR0rl69ys6dO7l27RpCCBwOB1/84hcB+NnPfsb+/fuZNm0aAF/+8pdZujR0fg7Lix/AR+cRT/2F2a
GEDZEwi4i0R+irciJ/54mQvmekv2DeA4uw3nMvhNniRGYRuUXI//9V5IcfhESF5GHfIbquU1paysaNG9m+fTtVVVU0NjYOaFNeXk5UVBQ7duxg1apV7N69G4DGxkZcLhfbtm1j06ZNlJaWous6FouFZ555hu3bt7N582befPPNAftctWoVW7duZevWrSGVFOC2gnnLHzM7lLAy1fE4XLkc+oX13j8FzU1qXY8JJpY/ZhTWC5Gp0cMmhoaGBpKSkkhMTMRqtZKdnU1NTc2ANrW1teTl5QGQmZlJfX09UkpqamrIzs4mIiKChIQEkpKSaGhoYMaMGcybZwyjTJ06ldmzZ+N2h/7CKrK3B3nkIGJpNiIq2uxwwsqUrHyYGhkyb9w7kVVlMDUSsTTb7FDCioiKRizNRh49iOztMTuccRt2KMntdhMXF9f/d1xcHGfPnr1jG4vFQmRkJB0dHbjdbubPn9/fzmazDUoAzc3NnD9/ntTU1P5/e/PNNzl06BDz5s3jq1/9KtHRgz9EnU4nTqcxRWzLli3Ex8ePpL+DWK3WMW87Wt1v/RftXdeJXfXfmDxBxxzKRPY5UFitVqY++jm6K36D7Rt/iRaCiVm/3smVYy6m5v0O02bPDtvzbFafe1f9N1qPHiS64RRTH/vchB3XH30e0TUGf7lx4wYvvfQSX/va14iMjATgc5/7HE888QQAe/bs4Sc/+Qlf//rXB23rcDhwOG7N6hnrYtgTuUi89zd7IT6R9qQ5CBPHfieyz4EiPj6eHnsu/Nc+rv52H9rKL5gdks/ph34LvT302HO5evVq2J5ns/osk+YY7+/f/JzrCyduCHw8fZ41a9aQ/z7sUJLNZqOl5dY0v5aWFmw22x3beL1eurq6iImJGbSt2+3u39bj8fDSSy/x6KOPsmLFiv4206dPR9M0NE2jsLCQDz74YBTdDFzyymVVMM9s982H2XNDdjhJVjph9lyjn8qEE5pmzIA7c9J4vwexYT+hUlJSaGpqorm5GY/Hg8vlwm63D2iTkZFBRUUFANXV1aSlpSGEwG6343K56Ovro7m5maamJlJTU5FS8uqrrzJ79mwef3zglLrW1tb+/z969CjJyck+6Kb5pKschEBkFZodStjqL6x34Syy8YLZ4fiU/PhDOP++KphnMpFdCEIE/dToYYeSLBYLa9euZfPmzei6Tn5+PsnJyezZs4eUlBTsdjsFBQWUlJSwfv16oqOjKS4uBiA5OZmsrCw2bNiApmmsW7cOTdM4c+YMhw4dYs6cOXzrW98Cbk1Lfe2117hw4QJCCGbOnMmzzz7r32dgAkjdi3Q5YWE6Im6m2eGENbEiH/n6vyCrnIg/+lOzw/EZWekEixWxQhXMM5OwzYSF6UjXfuTv/jFCC84bDIUMkRXTL126NKbtJmJMUtYfR//BX6P9+f9E2HP9eqyRCPexZ++rW+C9d9C+92NERITJkY2f9PShf2sNLFiE5S9e6P/3cD/PZpG1leg/+h7af/9rxCL/X2sw5RqDMn6yygnRMbBkxfCNFb/TcougswNOHjU7FN94+yh0thv9Usy3ZAVExwT1tSyVGPxMdrYj66oRK/JC4ttpSFiYDjPi0YP4jXs7vdIJM+KNfimmExERiBV5yLojyI52s8MZE5UY/ExWV4DHo+5EDSBCsyCyC+DUCaT7itnhjIt0X4VTJxDZBUE7nh2KRG4ReD3IIxVmhzImKjH4kZTS+Dk5NxVx731mh6PcRuQ4QEpjtlgQk679IHVVqTfAiHvvg7mpyMoygvEyrkoM/vRhA3z8ofq1EIDEzCRYsBhZ5QzaZRn7l4ddsNjojxJQRG4RfPyh8TkQZFRi8CNZ5YSISapgXoASuUVw9RN47x2zQxmb9+vhymX1xSNAieWPQcSkoLwIrRKDnxgF8w4hMrIRkVFmh6MMQSzNgqlRRgIPQrLKCVOjjH4oAUdERiEyspFHDyF7gquwnkoMfiKPu6D7uvo2F8DEpMmIFY
8hjx9GdnWaHc6oyK5O5DEXYsVjiEmTzQ5HuQORWwTdXcgTLrNDGRWVGPxEVjphZhLMTzM7FOUuRG4R9PUijx4yO5RRkUcPQV+v+uIR6B5YBDOTjM+DIKISgx/I5iZ47x1EjkMVzAt0c1Lg3vuC7o0rK51w731G/ErAEkIYM8bee8f4XAgS6lPLD2TVfhAaIqvA7FCUYRiF9YrgwwbkR+fNDmdEZON5+LABkVukCuYFAaOwnmZ8LgQJlRh8zCiYtx/SHkHYwmuRlGAlVqwEqzVoLkLLSidYrUbcSsATM+Jg0VKjsJ7uNTucEVGJwddO1cG1FlW3JoiI6GmI9ExkdQWyr8/scO5K9vUhqysQ6ZmI6Glmh6OMkJbjgGstxudDEFCJwcf0yjKIngZLlpkdijIKIrcIrncg66rNDuWuZN0RuN6hLjoHmyXLICY2aOpzqcTgQ7KjDd4+isjMR1hVwbyg8tDDYJsZ8BehZWUZ2GYa8SpBQ1gjEJl58PZR43MiwI1ozee6ujp27dqFrusUFhayevXqAY/39fVRUlLCuXPniImJobi4mISEBAD27t1LeXk5mqaxZs0a0tPTuXr1Kjt37uTatWsIIXA4HHzxi18EoLOzk+3bt3PlyhVmzpzJ888/T3R0cCzcLqsrwKsK5gUjo7BeIfJXe5AtzYi4BLNDGkS2XIF36xCr/kgVzAtCIqcIWfYLYyiw6PfNDueuhv3FoOs6paWlbNy4ke3bt1NVVUVjY+OANuXl5URFRbFjxw5WrVrF7t27AWhsbMTlcrFt2zY2bdpEaWkpuq5jsVh45pln2L59O5s3b+bNN9/s3+e+fftYvHgxr7zyCosXL2bfvn1+6Lbv9RfMu/8BxOw5ZoejjIHIMZZdDdTZI58uF/lpnEpwEbPnwP0PBEVhvWETQ0NDA0lJSSQmJmK1WsnOzqampmZAm9raWvLy8gDIzMykvr4eKSU1NTVkZ2cTERFBQkICSUlJNDQ0MGPGDObNmwfA1KlTmT17Nm63G4CamhpWrjRmW6xcuXLQsQLWhbNw6aKxprASlER8Ijz48M3ZI4FVWE/qujFr6sGHjTiVoCRyHXDpIpx/3+xQ7mrYoSS3201cXFz/33FxcZw9e/aObSwWC5GRkXR0dOB2u5k/f35/O5vN1p8APtXc3Mz58+dJTU0FoK2tjRkzZgAwffp02tqGHo9zOp04ncZ48JYtW4iPH9vUUKvVOuZtb9f+76V0T55C/Be+hBbgtZF81edgMtI+d//Ol2jf9tdMa/qQyQE0gaDnZC3XWpqZ9idfZ+oIz506z4FH/8KXuPKzf2bysUqmLc/xyT790ecRXWPwlxs3bvDSSy/xta99jcjIyEGPCyHueAOPw+HA4bj17Xysa576Yo1Y2dODfui/EEuzcXd1Q1f3uPbnb4GwLu5EG2mfZeoiiIyi7Vevo82+fwIiGxn9V69DZBSdqYu4PsJzp85zYBJLs+k+9F/0/N7TiMlTxr0/U9Z8ttlstLS09P/d0tKCzWa7Yxuv10tXVxcxMTGDtnW73f3bejweXnrpJR599FFWrLi1FnJsbCytra0AtLa2Mm1a4M/Vlseq4Ea3GkYKASJiEmLFSuSJauT1DrPDAUBe70QeP4xYsRIRMcnscJRxErkOuNGNPBa4hfWGTQwpKSk0NTXR3NyMx+PB5XJht9sHtMnIyKCiogKA6upq0tLSEEJgt9txuVz09fXR3NxMU1MTqampSCl59dVXmT17No8//viAfdntdg4ePAjAwYMHWbYscH7O34msKoOEe1TBvBAhcovA04c8ctDsUACQRw+Cp0/NdgsV89MgYZbxuRGghh1KslgsrF27ls2bN6PrOvn5+SQnJ7Nnzx5SUlKw2+0UFBRQUlLC+vXriY6Opri4GIDk5GSysrLYsGEDmqaxbt06NE3jzJkzHDp0iDlz5vCtb30LgC9/+cssXbqU1atXs337dsrLy/unqwYy+ckleP8U4kvPqLo1IU
LMSYE584yLvQWPD7+Bn8nKMpgzz4hLCXpGfS4H8uc/QX5yCZE49HCOmYQM9HlTI3Tp0qUxbTfeMUn95z9B/vbnaN8rRUyPG36DABAM47C+Nto+6+VvIH/6f9D+v+2mfiDLix+gf/d5xJefRRtlklLnOXDJay3o/3Md4gt/gPYHXx3Xvky5xqDcmfR6jcXkF2cETVJQRkasyANrhOnLMhoF8yKMeJSQIabHweIMpKsc6Q28wnoqMYzHqePQ5jYKZCkhRURFI5ZmIY8cRPb1mhKD7OtFHjmIWJqFiAqOu/+VkdNyi6DNDfXHzQ5lEJUYxkGvLIOYWHg48C+QK6MnchzQdR15/LApx5cnqqGr04hDCT2L7UZhvQC8CK0SwxjJ9lY4WYPIKkBYTb0dRPGXBx+GuATT1mmQlWUQl2DEoYQcYbUai3mdrDE+TwKISgxjZBTM86p7F0KY0DTj2/q7byOvfjKhx5ZXP4EzJ9XysCFO5DrA60UerjA7lAHUK24MjIJ5Tkh5EHFPstnhKH5kLMsoJrywXn/BvGxVMC+UiXuSIeVBZJUzoArrqcQwFufeg6aP1NhvGBBxM+GhdKTLOWHLMkrdaySih9KN4yshTeQ4oOkj43MlQKjEMAayygmTpyCW5ZodijIBRK4D3Ffh3ZMTc8AzJ8F9RQ1ThgmxLBcmTwmoNcdVYhgleaMbefQthD0HMWVw4T8l9Ij0TIiKmbB7GmSlE6JijOMqIU9MiUTYc5BH30LeCIwCnCoxjJI85oKeblW3JoyICGNZRllXjexs9+uxZGc78sRhRGYeIkItDxsuRG4R9HQbBTkDgEoMoyQryyBpNqQ8ZHYoygQSOQ7wePxeWE8eOQQej7p+FW5SHoKk2QGz5rhKDKMgLzdCw2ljCqEqmBdWRPL9MDfVr8sy9i8POzfVOJ4SNoQQxpeBhtPG54zJVGIYBVm1HzTNuClFCTsi1wGNF+DiB/45wMUPoPG8uugcpkRWAWhaQPxqUIlhhKTXizxcDovtiNgZZoejmEAsfwwiJvntIrSsdELEJOM4StgRsTNgsR15uBzp8Zgai0oMI/VOLbS1GoWvlLAkIj8trHcI2dvj033L3p5bBfMiVcG8cKXlFkH7Nag/Zm4cph49iOhVTpg2HRZlmB2KYiKRWwTdvi+sJ48fhu7rarZbuFtsh9gZRoFOE42o+ltdXR27du1C13UKCwtZvXr1gMf7+vooKSnh3LlzxMTEUFxcTEJCAgB79+6lvLwcTdNYs2YN6enpAPzwhz/k+PHjxMbG8tJLL/Xv62c/+xn79+/vX+v505XdzCTbbhbMK1qtCuaFuwcWQXyiMZyUmeez3coqJ8QnGvtXwpawWBCZ+ciyfci2VtOGrYf9xaDrOqWlpWzcuJHt27dTVVVFY+PAq+bl5eVERUWxY8cOVq1axe7duwFobGzE5XKxbds2Nm3aRGlpKbquA5CXl8fGjRuHPOaqVavYunUrW7duNT0pAMa1BV1XFwWVW4X13nsH2dzkk33KK5dVwTyln8h1gK4bnzsmGfZV2NDQQFJSEomJiVitVrKzs6mpqRnQpra2lry8PAAyMzOpr69HSklNTQ3Z2dlERESQkJBAUlISDQ0NACxcuJDo6MAfS5VSGt/mUh9CJN1rdjhKABDZBUZhPZdvCutJ134QwtivEvZE0r2QutDUwnrDjou43W7i4m4tWxkXF8fZs2fv2MZisRAZGUlHRwdut5v58+f3t7PZbLjd7mGDevPNNzl06BDz5s3jq1/96pAJxOl04nQa07q2bNlCfHz8sPsditVqveu2ve+epPXyx0z7xp8wdYzHCDTD9TkU+bTP8fG0pq/AU11B3Jr1CItlzLuSXi9XDx8gIn0FMx7w7U2T6jwHr+4vrKa95H8Re7WJSQ/dfT0Of/Q54AbMP/e5z/HEE08AsGfPHn7yk5/w9a9/fVA7h8OBw3FraGesi2EPt5C2/qvXYfJUOhcs4XoQLDI+EsGyYL
ov+brPcsVK9BPVXD3kRCwe+4QEWX8MvaUZzx+u8fk5Uec5eMkFS2DyVK796t/RZs66a9vx9HnWrKH3PexQks1mo6Wlpf/vlpYWbDbbHdt4vV66urqIiYkZtK3b7R607WdNnz4dTdPQNI3CwkI++MBPNxONgLzRhaytRCzLRUyZalocSgB6eDlEx4x7WUZZ6YToGGN/inKTmDIVsSwXWVuFvNE14ccfNjGkpKTQ1NREc3MzHo8Hl8uF3W4f0CYjI4OKigoAqqurSUtLQwiB3W7H5XLR19dHc3MzTU1NpKam3vV4ra23lrg7evQoycnmLYQjayqh54aaQqgMYhTWy4e6o8iOtjHtQ3a0I+uOIDLzVcE8ZRCjsN4N43Nogg07lGSxWFi7di2bN29G13Xy8/NJTk5mz549pKSkYLfbKSgooKSkhPXr1xMdHU1xcTEAycnJZGVlsWHDBjRNY926dWg3Z128/PLLnD59mo6ODv7iL/6CJ598koKCAl577TUuXLiAEIKZM2fy7LPP+vcZuAtZ5YR7kmHeAtNiUAKXyC1COv8TWV2BKPr9UW8vjxwAr0d98VCGNm8B3JNsfA49+rkJPbSQgbSe3DhcunRpTNvdaXxONn2E/p3nEE+sQfv8l8YbXkAJlXHY0fBXn72b/wf09qD99Y5RFVaUUqL/zTchYhKWTS8Nv8EYqPMc/PQ39yJf34X2tzvvuIywKdcYwpWsdILFgsjKMzsUJYCJ3CK4dBEunB2+8e0uNMDHH6pfC8pdiax8sFgmvLCeSgxDkB7PzYJ5yxDTVME85c7Eskdh0qRRv3FlZRlMmmRsryh3IKZNh4eXTXhhPZUYhvJOLXS0qYJ5yrBEZBRiaQ6y5hCyZ2SF9WRPD7LmEGJpDiIyys8RKsFOyymCjjbjc2mijjlhRwoiemUZxNpgkfnlOJTAZxTW6xrxsozyuAu6u9QwkjIyi5ZCrG1CC+upxPAZ8loLvHMMkZ0/rjtalTDyQBrMTDJmj4yArHLCzCRjO0UZhrBYjHIp7xwzPp8mgEoMnyEPHwCpI3LUtzllZPqXZXy/Htl899lxsrkJ3ntHLQ+rjIrIcYDUjc+nCaASw22MNXed8EAaIvHut6Eryu1EdiGI4ZdllFVOEJrRXlFGSCTOggfSkJUTU1hPJYbbnT0NzZeM7KwooyBmxMGipcbsEa93yDZS9xqVVBctNdoryiiInCJovmR8TvmZSgy3kZVlMGUqIiPH7FCUIKTlOuCaG04dH7rBqRNwzW20U5RREhnZMGWq39Ycv51KDDfJm7NKxPLHEJOnmB2OEoweXgYxsXecPaJXlkFMrNFOUUZJTJ6CWP4Y8lgVstu/hfVUYrhJ1rwFvT1qGEkZM2GNQGTmwckaZPu1AY/JjjZ4+ygiMw9hVQXzlLEROQ7o7TE+r/xIJYabZGUZzJoD9z9gdihKEBO5ReD1IqsHzh6Rhw+A16vuXVDG5/4HYNYcvw8nqcQAyI8vwvn3EblFagqhMi5i1hyYt2DA7BFjtlsZzFtgPK4oYySEML5cnH/f+NzyE5UYAFlVBharMQygKOMkchzQ9BGce8/4h/PvQ9NHaphS8QmRmQcWq/G55SdhnxhkX5/xM3/JckRMrNnhKCHAKKw3uf9OaKNg3mRVME/xCRETC0uWIw8fQHr6/HKMsE8MPbVV0NmuphAqPiOmRiIycpBH30K2X0PWvIXIyEFMjTQ7NCVEaLlF0NkOb9f4Zf/DruAGUFdXx65du9B1ncLCQlavXj3g8b6+PkpKSjh37hwxMTEUFxeTkJAAwN69eykvL0fTNNasWUN6ejoAP/zhDzl+/DixsbG89NKthUo6OzvZvn07V65cYebMmTz//PNER0f7qr+DdO//JUyPg7RH/HYMJfyI3CLk4XL0f3oJbnSri86Kb6Wlw/Q49ConfP73fL77YX8x6LpOaWkpGzduZPv27VRVVdHY2DigTXl5OVFRUezYsYNVq1axe/duABobG3
G5XGzbto1NmzZRWlqKrusA5OXlsXHjxkHH27dvH4sXL+aVV15h8eLF7Nu3zxf9HJJsbaH3Xe3qiwAACSBJREFUxBFEdiFCUwXzFB+avxASZsG7bxv/nb/Q7IiUECI0i1FWpf443pYrPt//sImhoaGBpKQkEhMTsVqtZGdnU1Mz8OdLbW0teXl5AGRmZlJfX4+UkpqaGrKzs4mIiCAhIYGkpCQaGhoAWLhw4ZC/BGpqali5ciUAK1euHHQsX5Ku/aDriBxVt0bxLWP2iDE8KXJVwTzF90ROIUTH4Gm84PN9DzuU5Ha7iYu7VdclLi6Os2fP3rGNxWIhMjKSjo4O3G438+fP729ns9lwu913PV5bWxszZhirpk2fPp22trYh2zmdTpxO4+Leli1biI+PH64rg3TfOwdP0e8Ss3DxqLcNZlardUzPVzAzo8/6l57ieu8Nor70FFr0tAk9NqjzHPLi45G7fknE5ClM9vHqbiO6xmAWIcQdv2k5HA4cjlsXjMe0GPaSTOILHw+pxcNHItQWTB8J0/r8+1+h50Yv3Jj4Y6vzHB7G0+dZs4auIj3sUJLNZqOl5dbiEC0tLdhstju28Xq9dHV1ERMTM2hbt9s9aNvPio2NpbW1FYDW1lamTZv4b1qKoijhbNjEkJKSQlNTE83NzXg8HlwuF3a7fUCbjIwMKioqAKiuriYtLQ0hBHa7HZfLRV9fH83NzTQ1NZGamnrX49ntdg4ePAjAwYMHWbZMFRxTFEWZSEKOYNWH48eP8y//8i/ouk5+fj5/8Ad/wJ49e0hJScFut9Pb20tJSQnnz58nOjqa4uJiEhMTAfj5z3/OgQMH0DSNr33tazzyiDEt9OWX/2979xfSVP/HAfx9mCktbfOc0MiKmtmFhgkpSlCmRhdREF4IRRde5kqR6GLdRDcRBMtBTraL0PCuixTsoiBMQ0SYfzFXy8xCeKrljsqZU6fb57kQx3OenueHS5/Obzuf15U7Dvx89mZ+zvnu6NcBr9cLRVFgMplQU1ODyspKKIqCpqYmzM7OxnW76h9//O+ds/4NX3rqA/esD9xzfP5tKWlTgyER8GDYPO5ZH7hnfdDkMwbGGGP6woOBMcaYCg8GxhhjKjwYGGOMqSTNh8+MMca2h+6vGGw2m9Yl/Hbcsz5wz/rwX/Ss+8HAGGNMjQcDY4wxFcPdu3fval2E1iwWi9Yl/Hbcsz5wz/qw3T3zh8+MMcZUeCmJMcaYCg8GxhhjKv/XG/X810ZHR9Ha2opoNIqqqipcunRJ65K2bHZ2Fk6nE/Pz8xAEAWfPnsX58+cRDAbR1NSEHz9+qP5rLRGhtbUVIyMjSEtLg9VqTdg12mg0CpvNBlEUYbPZ4Pf74XA4oCgKLBYL6uvrkZKSgtXVVTQ3N+PTp0/IyMhAY2MjsrKytC4/bouLi3C5XJiZmYEgCKirq8O+ffuSOufnz5+ju7sbgiDgwIEDsFqtmJ+fT6qcW1paMDw8DJPJBLvdDgC/9P7t6enBs2fPAADV1dWx7Zc3hXQqEonQjRs36Nu3b7S6ukq3bt2imZkZrcvaMlmWaWpqioiIQqEQNTQ00MzMDLW3t1NHRwcREXV0dFB7ezsREQ0NDdG9e/coGo2Sz+ej27dva1b7VnV1dZHD4aD79+8TEZHdbqe+vj4iInK73fTy5UsiInrx4gW53W4iIurr66OHDx9qU/AWPXr0iF69ekVERKurqxQMBpM650AgQFarlVZWVohoPd/Xr18nXc4TExM0NTVFN2/ejB2LN1dFUej69eukKIrq683S7VLSx48fsXfvXmRnZyMlJQUnT56Ex+PRuqwty8zMjJ0x7Ny5Ezk5OZBlGR6PB+Xl5QCA8vLyWK+Dg4M4ffo0BEHA0aNHsbi4GNtBL5EEAgEMDw+jqqoKAEBEmJiYQFlZGQDgzJkzqp43zp7Kysrw9u1bUILdgxEKhfDu3TtUVlYCWN/reNeuXUmfczQaRTgcRiQSQTgchtlsTr
qc8/Pzf9qDJt5cR0dHUVhYiPT0dKSnp6OwsBCjo6ObrkG3S0myLEOSpNhjSZIwOTmpYUXbz+/3Y3p6GkeOHMHCwgIyMzMBAGazGQsLCwDWX4e/bp4uSRJkWY49N1G0tbXh6tWrWFpaAgAoigKj0QiDwQBgfftZWZYBqLM3GAwwGo1QFCWhtpH1+/3YvXs3Wlpa8OXLF1gsFtTW1iZ1zqIo4uLFi6irq0NqaiqOHz8Oi8WS1DlviDfXv/9+++vrshm6vWJIdsvLy7Db7aitrYXRaFR9TxAECIKgUWXbb2hoCCaTKSHXzH9VJBLB9PQ0zp07hwcPHiAtLQ2dnZ2q5yRbzsFgEB6PB06nE263G8vLy3GdBSeL35Grbq8YRFFEIBCIPQ4EAhBFUcOKts/a2hrsdjtOnTqF0tJSAIDJZMLc3BwyMzMxNzcXO2sSRVG1+1Mivg4+nw+Dg4MYGRlBOBzG0tIS2traEAqFEIlEYDAYIMtyrK+N7CVJQiQSQSgUQkZGhsZdxEeSJEiShLy8PADrSyWdnZ1JnfP4+DiysrJiPZWWlsLn8yV1zhvizVUURXi93thxWZaRn5+/6Z+n2yuG3NxcfP36FX6/H2tra+jv70dxcbHWZW0ZEcHlciEnJwcXLlyIHS8uLkZvby8AoLe3FyUlJbHjb968ARHhw4cPMBqNCbW8AABXrlyBy+WC0+lEY2Mjjh07hoaGBhQUFGBgYADA+h0aG/meOHECPT09AICBgQEUFBQk3Jm12WyGJEmxLW3Hx8exf//+pM55z549mJycxMrKCogo1nMy57wh3lyLioowNjaGYDCIYDCIsbExFBUVbfrn6fovn4eHh/HkyRNEo1FUVFSgurpa65K27P3797hz5w4OHjwYexNcvnwZeXl5aGpqwuzs7E+3uz1+/BhjY2NITU2F1WpFbm6uxl38uomJCXR1dcFms+H79+9wOBwIBoM4fPgw6uvrsWPHDoTDYTQ3N2N6ehrp6elobGxEdna21qXH7fPnz3C5XFhbW0NWVhasViuIKKlzfvr0Kfr7+2EwGHDo0CFcu3YNsiwnVc4OhwNerxeKosBkMqGmpgYlJSVx59rd3Y2Ojg4A67erVlRUbLoGXQ8GxhhjP9PtUhJjjLF/xoOBMcaYCg8GxhhjKjwYGGOMqfBgYIwxpsKDgTHGmAoPBsYYYyp/AkyDqgXSPnXJAAAAAElFTkSuQmCC\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"output_type": "execute_result",
"data": {
"text/plain": [
" USERID ITEMID RATING TIMESTAMP\n",
"0 0 0 1 884182806\n",
"1 1 1 1 891628467\n",
"2 2 2 1 879781125\n",
"3 3 3 1 876042340\n",
"4 4 4 1 879270459"
]
},
"metadata": {},
"execution_count": 17
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "37Y7qEEJNiE4"
},
"source": [
"### Interactions\n",
"While we have chosen to represent the data as a ``pandas.DataFrame`` for easy viewing now, Collie uses a custom ``torch.utils.data.Dataset`` called ``Interactions``. This class stores a sparse representation of the data and offers some handy benefits, including: \n",
"\n",
"* The ability to index the data with a ``__getitem__`` method \n",
"* The ability to sample many negative items (we will get to this later!) \n",
"* Nice quality checks to ensure data is free of errors before model training \n",
"\n",
"Instantiating the object is simple! "
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "dLcqt57TI-6l",
"outputId": "b399e7e8-fec0-460a-8048-e7a1e9395106"
},
"source": [
"interactions = Interactions(\n",
" users=df['USERID'],\n",
" items=df['ITEMID'],\n",
" ratings=df['RATING'],\n",
" allow_missing_ids=True,\n",
")\n",
"\n",
"interactions"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Checking for and removing duplicate user, item ID pairs...\n",
"Checking ``num_negative_samples`` is valid...\n",
"Maximum number of items a user has interacted with: 378\n",
"Generating positive items set...\n"
]
},
{
"output_type": "execute_result",
"data": {
"text/plain": [
"Interactions object with 55375 interactions between 942 users and 1447 items, returning 10 negative samples per interaction."
]
},
"metadata": {},
"execution_count": 18
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "4TaWM_OFNZzn"
},
"source": [
"### Data Splits \n",
"With an ``Interactions`` dataset, Collie supports two types of data splits. \n",
"\n",
"1. **Random split**: This randomly assigns each interaction to the ``train``, ``validation``, or ``test`` dataset. While this is significantly faster to perform than a stratified split, it does not guarantee any balance, meaning it is possible for a user to end up with no interactions in the ``train`` dataset and all of their interactions in the ``test`` dataset. \n",
"2. **Stratified split**: While this runs slower than a random split, it guarantees that each user is represented in the ``train``, ``validation``, and ``test`` datasets. This is by far the fairest way to train and evaluate a recommendation model. \n",
"\n",
"Since this is a small dataset and we have time, we will go ahead and use ``stratified_split``. If you're short on time, a ``random_split`` can easily be swapped in, since both functions share the same API! "
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "U_4IPy2aNLVE",
"outputId": "5db1d12d-5d17-46f5-c4d1-05cfe4d458e4"
},
"source": [
"train_interactions, val_interactions = stratified_split(interactions, test_p=0.1, seed=42)\n",
"train_interactions, val_interactions"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Generating positive items set...\n",
"Generating positive items set...\n"
]
},
{
"output_type": "execute_result",
"data": {
"text/plain": [
"(Interactions object with 49426 interactions between 942 users and 1447 items, returning 10 negative samples per interaction.,\n",
" Interactions object with 5949 interactions between 942 users and 1447 items, returning 10 negative samples per interaction.)"
]
},
"metadata": {},
"execution_count": 19
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ZO5rLYzH-imu"
},
"source": [
"### Model Architecture \n",
"With our data ready to go, we can now start training a recommendation model. While Collie has several model architectures built in, the simplest by far is the ``MatrixFactorizationModel``, which uses ``torch.nn.Embedding`` layers and a dot-product operation to perform matrix factorization via collaborative filtering."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ZWi4VUjeghUA"
},
"source": [
"Digging through the code of [``collie_recs.model.MatrixFactorizationModel``](../collie_recs/model.py) shows the architecture is as simple as we might expect. For clarity, we include the relevant portions below so we know exactly what we are building: \n",
"\n",
"````python\n",
"def _setup_model(self, **kwargs) -> None:\n",
" self.user_biases = ZeroEmbedding(num_embeddings=self.hparams.num_users,\n",
" embedding_dim=1,\n",
" sparse=self.hparams.sparse)\n",
" self.item_biases = ZeroEmbedding(num_embeddings=self.hparams.num_items,\n",
" embedding_dim=1,\n",
" sparse=self.hparams.sparse)\n",
" self.user_embeddings = ScaledEmbedding(num_embeddings=self.hparams.num_users,\n",
" embedding_dim=self.hparams.embedding_dim,\n",
" sparse=self.hparams.sparse)\n",
" self.item_embeddings = ScaledEmbedding(num_embeddings=self.hparams.num_items,\n",
" embedding_dim=self.hparams.embedding_dim,\n",
" sparse=self.hparams.sparse)\n",
"\n",
" \n",
"def forward(self, users: torch.tensor, items: torch.tensor) -> torch.tensor:\n",
" user_embeddings = self.user_embeddings(users)\n",
" item_embeddings = self.item_embeddings(items)\n",
"\n",
" preds = (\n",
" torch.mul(user_embeddings, item_embeddings).sum(axis=1)\n",
" + self.user_biases(users).squeeze(1)\n",
" + self.item_biases(items).squeeze(1)\n",
" )\n",
"\n",
" if self.hparams.y_range is not None:\n",
" preds = (\n",
" torch.sigmoid(preds)\n",
" * (self.hparams.y_range[1] - self.hparams.y_range[0])\n",
" + self.hparams.y_range[0]\n",
" )\n",
"\n",
" return preds\n",
"````\n",
"\n",
"Let's go ahead and instantiate the model and start training! Note that even if you are running this model on a CPU instead of a GPU, this will still be relatively quick to fully train. "
]
},
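{
"cell_type": "markdown",
"metadata": {},
"source": [
"The scoring in ``forward`` above is just an element-wise product of the user and item embeddings summed over the embedding dimension, plus the two bias terms. As a standalone sketch in plain ``numpy`` (shapes and names here are illustrative, not Collie's internals):\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"batch_size, embedding_dim = 4, 10\n",
"rng = np.random.default_rng(42)\n",
"\n",
"user_embeddings = rng.normal(size=(batch_size, embedding_dim))\n",
"item_embeddings = rng.normal(size=(batch_size, embedding_dim))\n",
"user_biases = np.zeros(batch_size)  # ZeroEmbedding initializes biases at zero\n",
"item_biases = np.zeros(batch_size)\n",
"\n",
"# dot product per (user, item) pair, plus biases -- one score per pair\n",
"preds = (user_embeddings * item_embeddings).sum(axis=1) + user_biases + item_biases\n",
"print(preds.shape)  # (4,)\n",
"```"
]
},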
{
"cell_type": "markdown",
"metadata": {
"id": "L7o3t9vNN-Lt"
},
"source": [
"Collie is built with PyTorch Lightning, so all the model classes and the ``CollieTrainer`` class accept all the training options available in PyTorch Lightning. Here, we're going to set the embedding dimension and learning rate explicitly, and go with the defaults for everything else."
]
},
{
"cell_type": "code",
"metadata": {
"id": "LNfxzlruN1xx"
},
"source": [
"model = MatrixFactorizationModel(\n",
" train=train_interactions,\n",
" val=val_interactions,\n",
" embedding_dim=10,\n",
" lr=1e-2,\n",
")"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "TKxeyMsMN1vg"
},
"source": [
"trainer = CollieTrainer(model, max_epochs=10, deterministic=True)\n",
"\n",
"trainer.fit(model)"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "dx8EoEhC_Cjh"
},
"source": [
"```text\n",
"GPU available: False, used: False\n",
"TPU available: False, using: 0 TPU cores\n",
"IPU available: False, using: 0 IPUs\n",
"\n",
" | Name | Type | Params\n",
"----------------------------------------------------\n",
"0 | user_biases | ZeroEmbedding | 942 \n",
"1 | item_biases | ZeroEmbedding | 1.4 K \n",
"2 | user_embeddings | ScaledEmbedding | 9.4 K \n",
"3 | item_embeddings | ScaledEmbedding | 14.5 K\n",
"4 | dropout | Dropout | 0 \n",
"----------------------------------------------------\n",
"26.3 K Trainable params\n",
"0 Non-trainable params\n",
"26.3 K Total params\n",
"0.105 Total estimated model params size (MB)\n",
"Validation sanity check: 0%\n",
"0/2 [00:00, ?it/s]\n",
"Global seed set to 22\n",
"Epoch 9: 100%\n",
"55/55 [00:05<00:00, 10.37it/s, loss=1.74, v_num=0]\n",
"Epoch 3: reducing learning rate of group 0 to 1.0000e-03.\n",
"Epoch 5: reducing learning rate of group 0 to 1.0000e-04.\n",
"Epoch 7: reducing learning rate of group 0 to 1.0000e-05.\n",
"Epoch 9: reducing learning rate of group 0 to 1.0000e-06.\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "b1PhE5nDghUC"
},
"source": [
"### Evaluate the Model \n",
"We have a model! Now we need to figure out how well we did. Evaluating implicit recommendation models is a bit tricky, but Collie offers the following metrics built into the library, implemented with vectorized operations that can run on the GPU in a single pass for significant speed-ups. \n",
"\n",
"* [``Mean Average Precision at K (MAP@K)``](https://en.wikipedia.org/wiki/Evaluation_measures_(information_retrieval)#Mean_average_precision) \n",
"* [``Mean Reciprocal Rank``](https://en.wikipedia.org/wiki/Mean_reciprocal_rank) \n",
"* [``Area Under the Curve (AUC)``](https://en.wikipedia.org/wiki/Receiver_operating_characteristic#Area_under_the_curve) \n",
"\n",
"We'll go ahead and evaluate all of these at once below. "
]
},
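{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the first metric concrete, here is a hand-rolled average precision at ``K`` for a single user (illustrative only -- this is not Collie's vectorized implementation). ``MAP@K`` is this value averaged over all users:\n",
"\n",
"```python\n",
"def average_precision_at_k(recommended, relevant, k=10):\n",
"    # credit each hit by the precision at the rank where it occurs\n",
"    hits, score = 0, 0.0\n",
"    for rank, item in enumerate(recommended[:k], start=1):\n",
"        if item in relevant:\n",
"            hits += 1\n",
"            score += hits / rank\n",
"    return score / min(len(relevant), k) if relevant else 0.0\n",
"\n",
"# items 2 and 4 are relevant and appear at ranks 2 and 4\n",
"print(average_precision_at_k([1, 2, 3, 4], {2, 4}, k=4))  # 0.5\n",
"```"
]
},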
{
"cell_type": "code",
"metadata": {
"id": "fUFAraDsOHtL"
},
"source": [
"model.eval() # set model to inference mode\n",
"mapk_score, mrr_score, auc_score = evaluate_in_batches([mapk, mrr, auc], val_interactions, model)\n",
"\n",
"print(f'MAP@10 Score: {mapk_score}')\n",
"print(f'MRR Score: {mrr_score}')\n",
"print(f'AUC Score: {auc_score}')"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "CFcwh56u-9bM"
},
"source": [
"```text\n",
"MAP@10 Score: 0.04792124467542778\n",
"MRR Score: 0.1670155641949101\n",
"AUC Score: 0.8854083361899018\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "zlqMa7eWOMGl"
},
"source": [
"### Inference"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "z7y-J7KUghUD"
},
"source": [
"We can also look at particular users to get a sense of what the recommendations look like. "
]
},
{
"cell_type": "code",
"metadata": {
"id": "sIQsXedgghUD",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1000
},
"outputId": "5496ece7-9dd4-47c9-b804-a2fb283cf8dd"
},
"source": [
"# select a random user ID to look at recommendations for\n",
"user_id = np.random.randint(10, train_interactions.num_users)\n",
"\n",
"display(\n",
" HTML(\n",
" get_recommendation_visualizations(\n",
" model=model,\n",
" user_id=user_id,\n",
" filter_films=True,\n",
" shuffle=True,\n",
" detailed=True,\n",
" )\n",
" )\n",
")"
],
"execution_count": null,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/html": [
"User 895:\n",
"\n",
"Willy Wonka and the Chocolate Factory (1971)\n",
"Mighty Aphrodite (1995)\n",
"Conspiracy Theory (1997)\n",
"Sense and Sensibility (1995)\n",
"Liar Liar (1997)\n",
"In & Out (1997)\n",
"Return of the Jedi (1983)\n",
"Ransom (1996)\n",
"Emma (1996)\n",
"Toy Story (1995)\n",
"\n",
"Some loved films:\n",
"Princess Bride, The (1987)\n",
"Graduate, The (1967)\n",
"Cold Comfort Farm (1995)\n",
"Apartment, The (1960)\n",
"Jerry Maguire (1996)\n",
"Sleeper (1973)\n",
"Independence Day (ID4) (1996)\n",
"Desperado (1995)\n",
"Three Colors: Red (1994)\n",
"Lawnmower Man, The (1992)\n",
"\n",
"Recommended films:\n",
"\n",
"-----\n",
"User 895 has rated 12 films with a 4 or 5\n",
"User 895 has rated 8 films with a 1, 2, or 3\n",
"% of these films rated 5 or 4 appearing in the first 10 recommendations: 0.0%\n",
"% of these films rated 1, 2, or 3 appearing in the first 10 recommendations: 0.0%\n"
],
"text/plain": [
""
]
},
"metadata": {}
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "DbZ9ufGvghUE"
},
"source": [
"### Save and Load a Standard Model "
]
},
{
"cell_type": "code",
"metadata": {
"id": "WqJbXHXgghUG"
},
"source": [
"import os\n",
"\n",
"# we can save the model with...\n",
"os.makedirs('models', exist_ok=True)\n",
"model.save_model('models/matrix_factorization_model.pth')"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "Dz_8miLPghUG",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "1f2f8c07-be29-4fb9-ca8c-168ac1515f16"
},
"source": [
"# ... and if we wanted to load that model back in, we can do that easily...\n",
"model_loaded_in = MatrixFactorizationModel(load_model_path='models/matrix_factorization_model.pth')\n",
"\n",
"model_loaded_in"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"MatrixFactorizationModel(\n",
" (user_biases): ZeroEmbedding(942, 1)\n",
" (item_biases): ZeroEmbedding(1447, 1)\n",
" (user_embeddings): ScaledEmbedding(942, 10)\n",
" (item_embeddings): ScaledEmbedding(1447, 10)\n",
" (dropout): Dropout(p=0.0, inplace=False)\n",
")"
]
},
"metadata": {},
"execution_count": 25
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "cQpWlCuughUH"
},
"source": [
"Now that we've built our first model and gotten some baseline metrics, we now will be looking at some more advanced features in Collie's ``MatrixFactorizationModel``. "
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Q3Ne8ETsgzLe"
},
"source": [
"### Faster Data Loading Through Approximate Negative Sampling "
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "8nBu6PZhgzLe"
},
"source": [
"With sufficiently large data, verifying that each negative sample is an item the user has *not* interacted with becomes expensive. With many items, this check can quickly become a bottleneck in the training process. \n",
"\n",
"Yet when we have many items, the chance that a randomly sampled item is one the user has already interacted with becomes vanishingly small. Say we have ``1,000,000`` items and we want to sample ``10`` negative items for a user who has positively interacted with ``200`` items. The chance that we accidentally select a positive item in a random sample of ``10`` items is only about ``0.2%``. At that point, it may be worth forgoing the expensive verification of each negative sample and instead simply sampling items at random, trusting that the vast majority of the time they will happen to be negative. \n",
"\n",
"This is the theory behind the ``ApproximateNegativeSamplingInteractionsDataLoader``, an alternate DataLoader built into Collie. Let's train a model with it below, noting how similar this procedure looks to the one above. \n",
]
},
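{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the numbers above concrete, here is a quick back-of-the-envelope check in plain Python (not part of the Collie API):\n",
"\n",
"```python\n",
"# chance that at least one of ``n_samples`` uniformly random items\n",
"# collides with one of the user's ``n_positive`` positive items\n",
"n_items, n_positive, n_samples = 1_000_000, 200, 10\n",
"\n",
"p_collision = 1 - (1 - n_positive / n_items) ** n_samples\n",
"print(round(100 * p_collision, 2))  # 0.2 (percent)\n",
"```"
]
},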
{
"cell_type": "code",
"metadata": {
"id": "MgSijz04gzLf"
},
"source": [
"train_loader = ApproximateNegativeSamplingInteractionsDataLoader(train_interactions, batch_size=1024, shuffle=True)\n",
"val_loader = ApproximateNegativeSamplingInteractionsDataLoader(val_interactions, batch_size=1024, shuffle=False)"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "n9wQLd9-gzLf"
},
"source": [
"model = MatrixFactorizationModel(\n",
" train=train_loader,\n",
" val=val_loader,\n",
" embedding_dim=10,\n",
" lr=1e-2,\n",
")"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "AbYirCSNgzLg"
},
"source": [
"trainer = CollieTrainer(model, max_epochs=10, deterministic=True)\n",
"\n",
"trainer.fit(model)\n",
"model.eval()"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "uvykQiW2_Oof"
},
"source": [
"```text\n",
"GPU available: True, used: True\n",
"TPU available: False, using: 0 TPU cores\n",
"LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]\n",
"\n",
" | Name | Type | Params\n",
"----------------------------------------------------\n",
"0 | user_biases | ZeroEmbedding | 941 \n",
"1 | item_biases | ZeroEmbedding | 1.4 K \n",
"2 | user_embeddings | ScaledEmbedding | 9.4 K \n",
"3 | item_embeddings | ScaledEmbedding | 14.5 K\n",
"4 | dropout | Dropout | 0 \n",
"----------------------------------------------------\n",
"26.3 K Trainable params\n",
"0 Non-trainable params\n",
"26.3 K Total params\n",
"0.105 Total estimated model params size (MB)\n",
"Detected GPU. Setting ``gpus`` to 1.\n",
"Global seed set to 22\n",
"\n",
"MatrixFactorizationModel(\n",
" (user_biases): ZeroEmbedding(941, 1)\n",
" (item_biases): ZeroEmbedding(1447, 1)\n",
" (user_embeddings): ScaledEmbedding(941, 10)\n",
" (item_embeddings): ScaledEmbedding(1447, 10)\n",
" (dropout): Dropout(p=0.0, inplace=False)\n",
")\n",
"```"
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 117,
"referenced_widgets": [
"d7c34c7be74246acb1a7da702b490029",
"8ca15bf2e3be4ba88b7dbbf4ed26820d",
"3e63ae601740455e81c35d4fbab2db6e",
"b289ad3cdd3f4a25a9e92ed22c37aeed",
"3d7d55e46c7d4e2caa6680229fe73b4e",
"e841d95562314124b242c2e4225afbdb",
"7b9d13b53df04e2082d4e236f50b807d",
"9a4137c369cc4f26817ed3ab568a5ef7"
]
},
"id": "xLKDSq-hgzLg",
"outputId": "44183d74-a567-418d-a85e-946bf1443713"
},
"source": [
"mapk_score, mrr_score, auc_score = evaluate_in_batches([mapk, mrr, auc], val_interactions, model)\n",
"\n",
"print(f'MAP@10 Score: {mapk_score}')\n",
"print(f'MRR Score: {mrr_score}')\n",
"print(f'AUC Score: {auc_score}')"
],
"execution_count": null,
"outputs": [
{
"output_type": "display_data",
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "d7c34c7be74246acb1a7da702b490029",
"version_minor": 0,
"version_major": 2
},
"text/plain": [
"HBox(children=(FloatProgress(value=0.0, max=28.0), HTML(value='')))"
]
},
"metadata": {
"tags": []
}
},
{
"output_type": "stream",
"text": [
"\n",
"MAP@10 Score: 0.027979833367276323\n",
"MRR Score: 0.1703751336709069\n",
"AUC Score: 0.8517987786322347\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "qipOCQzxgzLh"
},
"source": [
"We see a drop in ranking performance and only a marginal improvement in training time compared to the standard ``MatrixFactorizationModel`` because MovieLens 100K has so few items. ``ApproximateNegativeSamplingInteractionsDataLoader`` is especially recommended when we have more items in our data and training times need to be optimized. \n",
"\n",
"For more details on this and other DataLoaders in Collie (including those for out-of-memory datasets), check out the [docs](https://collie.readthedocs.io/en/latest/index.html)! "
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "iGgM-FaegzLi"
},
"source": [
"### Multiple Optimizers "
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "4xcFspsbgzLi"
},
"source": [
"Training recommendation models at ShopRunner, we have encountered something we call \"the curse of popularity.\" \n",
"\n",
"This is best thought of from the viewpoint of the model's optimizer: say we have a user, a positive item, and several negative items that we hope will score lower than the positive item. As an optimizer, you can either adjust every embedding dimension (hundreds of parameters) to achieve this, or score a quick win by optimizing only the bias terms (just add a positive constant to the positive item's bias and a negative constant to each negative item's bias). \n",
"\n",
"While we clearly want varied embedding layers that reflect each user's and item's taste profile, some models learn to settle for popularity as a proxy for the recommendation score by over-optimizing the bias terms, essentially returning the same set of recommendations for every user. Worst of all, since popular items are... well, popular, **the loss of such a model will actually be decent, solidifying its position in a local loss minimum**. \n",
"\n",
"To counteract this, Collie supports multiple optimizers in a ``MatrixFactorizationModel``: a faster optimizer works on the embedding layers for users and items, while a slower optimizer works on the bias terms. This pushes the model to do the work of actually coming up with varied, personalized recommendations for users while still accounting for the effect of the bias (popularity) terms on recommendations. \n",
"\n",
"At ShopRunner, we have seen significantly better metrics and results from this type of model. With Collie, this is simple to do, as shown below. "
]
},
{
"cell_type": "code",
"metadata": {
"id": "GxUMsr61gzLj"
},
"source": [
"model = MatrixFactorizationModel(\n",
" train=train_interactions,\n",
" val=val_interactions,\n",
" embedding_dim=10,\n",
" lr=1e-2,\n",
" bias_lr=1e-1,\n",
" optimizer='adam',\n",
" bias_optimizer='sgd',\n",
")"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 557,
"referenced_widgets": [
"49f97e2e5f55454bab313b1570b4b9c4",
"b3c7b293fe8b44e0920b060f00382401",
"80b4510b6a7e4d77879516c59662180c",
"f1e818c6ef934704ae0ef1f80eaa7b5b",
"695c6402e0a1475eb2965bee589c2bf4",
"06f180111dfe4f57a12723bccb253f98",
"316eaa17f9cd4558b8c7f6951b98d6f0",
"75fc6c2f339b4e47bb22790d37e7ebab",
"8d714e45f0d64e85b637961f4a25f3d0",
"b5a4c2dd499c4d5c8d3ab2e890b44642",
"d1dd8920ddfc4c97beffe89ece6b982e",
"fc7d1f4bea4145af93d9059177a477fd",
"ef66bb3e20eb4d25a6dc3af0fccf2e2a",
"9275275798974477a5f5858880d1b429",
"2bcffbf7f05b41688936d51d6c3ffcac",
"28198fbab991465583818c47d8ddbb6d",
"83658275a6c346d7822fd1368345e09a",
"3389371180f64e088e7c6549a25b3ee1",
"54d47349fff54cd0bca2e4903b06e5d8",
"ec37483292aa4715a97c6f0de133cb24",
"78845118227b40adac2503e57e1f38a7",
"ce20fc80baf0494f976d7ebb54303833",
"b0606438031a4d0fbe1b3b1bc4d49e9e",
"38cb1183eb294363af2d2762d8a49a13",
"b9a2d6afed7641d995fba91c5a94f8ec",
"6f42c45ff32f46deb83569801ab976c8",
"58d6593d8810421bb8bd39099e82feb9",
"7cdc87e9c27f48e2b081bdbf2d6228e8",
"45b86b0dc3394da4b2f87191d23dfed7",
"585898b88c044e2e8d6fd0c46c3b575d",
"a0259f8bb2264993a6d27acd097b6c05",
"5b0ed212d376472987768ea06e314f9d",
"a59ea6641fca44bbaa553a1624deab51",
"671b80df47dc444d817c51101c7adf49",
"d56ba50fc09c4502a1055ff9a3199062",
"1bd1fa4ccb564778be34c5fe9036e731",
"1b3ff709422c4864acff0ca42ec89e07",
"5b6f8c0d04be4ec19c55259caf622c02",
"b8903c665b7c4a3695101aa7d6e8f929",
"518b4b044b4d411fb43dac5fc993c02c",
"36d600f8e9be46a6b05ec580025aac20",
"f14194bdbf724d20bdd7d5b572374608",
"b62d3801dd3d4f45b53de6d6c25e26bd",
"066ce24dcd2e46acb54f1fd437963b85",
"bd80f8f37cb247c69f969d96b146ea8c",
"4100968fe609430f8cd1f73c48a356f2",
"9e6c6c48df824c548aa0425888adac9b",
"27866404913f4820bb46d1641f63dc1f",
"d75b9faa61b242279706a77458e55276",
"ba792412f171494d937051219e01607d",
"f43bba18ba134b269006e100f8de55d1",
"63c6fb5bfa3147a89962641d1a7d215b",
"6d7a19d43c4846d99d3d470fd5d7d66e",
"90af7f79f905435f8c81ca21e59cab59",
"fad0632c95f74aca92b6e72751da0632",
"7eb49fb403564170b47e0c6f867f81e1",
"57322a343ddd4ad2bc408a840eca0e98",
"18b3142ac5c24401941cd4f79bac783b",
"6d730313b939447bbd396526cdee3fad",
"73cccc23b004406e8521b0cc26401c9c",
"7520241c3b2c4591bc0532d20dafccda",
"5dd8ebbab9ce418cae4eaf9f568ab5c9",
"0afd2fe607b04590813dc297c3bfc4da",
"c14405b6297a435eaab8032dbbb9966f",
"0351e40f8c0f46708aeea7233565800a",
"d7e152362cd54b4380a6cd3e506e7b86",
"73dff796a59d444a96016b2578f277d7",
"89a5fbcb3a4e4732a72c8cee79a346b6",
"bcde36e5444444568f7a6fa34d85ccaa",
"53e3249bbef446208f54a6216420062f",
"235ef9871d5443a9ab3b31f1c231f53d",
"b329742fdb3e44e2974d3a31700072a3",
"93fbdf1f9df1464888cdecc6285ef575",
"7ef0973e658740528843af596067fe8e",
"dad5c7f76e20437ca651d4d39ac68660",
"3a61c8ddd8b74af78c6ceffb7001da02",
"fec250de7d6f45688eb48b51d837b53a",
"54d415a20e204d53bbe9baddd4598e2f",
"136520e97ebe4b04b9b46a44e46297fc",
"c4d1f1810116465b9dcb9282eed0f113",
"2c8df37c614447d5b5064dbbaa837081",
"dc9b62bd3eb94dc8afd9c21491faff0b",
"2416bc3ad43149ff9e8d10572693f115",
"93054dc57a344b9daae9e6e36e14b594",
"4d9cf1e48cb74afb91410b46b2d601b5",
"897727bf519648e0bc3f204771ded957",
"795429863893411cab25dfe5771a2e4a",
"93a527dd2045487393fb8280ff3945b1",
"3d3ca53ac0cb41d1bbb58aa754a79619",
"600e766c4f4c4136875db902954f9ac7",
"2cf5bd88c0c74729a89a70f6ba9dd02e",
"a247748059184bbaac041b8fffd99e3f",
"010c2bf6e2af470d950542394f8ceef5",
"5bb0a358fcd64106acdac950f82d12a7",
"16202361258644f6ab7b168990f73136",
"c2fffba8678b48f9957b9a581aae9e2e"
]
},
"id": "dQ_tTRfOgzLj",
"outputId": "4d3af437-c62d-4b9b-ddb5-eaf3731556da"
},
"source": [
"trainer = CollieTrainer(model, max_epochs=10, deterministic=True)\n",
"\n",
"trainer.fit(model)\n",
"model.eval()"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"GPU available: True, used: True\n",
"TPU available: False, using: 0 TPU cores\n",
"LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]\n",
"\n",
" | Name | Type | Params\n",
"----------------------------------------------------\n",
"0 | user_biases | ZeroEmbedding | 941 \n",
"1 | item_biases | ZeroEmbedding | 1.4 K \n",
"2 | user_embeddings | ScaledEmbedding | 9.4 K \n",
"3 | item_embeddings | ScaledEmbedding | 14.5 K\n",
"4 | dropout | Dropout | 0 \n",
"----------------------------------------------------\n",
"26.3 K Trainable params\n",
"0 Non-trainable params\n",
"26.3 K Total params\n",
"0.105 Total estimated model params size (MB)\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"Detected GPU. Setting ``gpus`` to 1.\n"
],
"name": "stdout"
},
{
"output_type": "display_data",
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "49f97e2e5f55454bab313b1570b4b9c4",
"version_minor": 0,
"version_major": 2
},
"text/plain": [
"HBox(children=(FloatProgress(value=1.0, bar_style='info', description='Validation sanity check', layout=Layout…"
]
},
"metadata": {
"tags": []
}
},
{
"output_type": "stream",
"text": [
"Global seed set to 22\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"\r"
],
"name": "stdout"
},
{
"output_type": "display_data",
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "8d714e45f0d64e85b637961f4a25f3d0",
"version_minor": 0,
"version_major": 2
},
"text/plain": [
"HBox(children=(FloatProgress(value=1.0, bar_style='info', description='Training', layout=Layout(flex='2'), max…"
]
},
"metadata": {
"tags": []
}
},
{
"output_type": "display_data",
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "83658275a6c346d7822fd1368345e09a",
"version_minor": 0,
"version_major": 2
},
"text/plain": [
"HBox(children=(FloatProgress(value=1.0, bar_style='info', description='Validating', layout=Layout(flex='2'), m…"
]
},
"metadata": {
"tags": []
}
},
{
"output_type": "display_data",
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "b9a2d6afed7641d995fba91c5a94f8ec",
"version_minor": 0,
"version_major": 2
},
"text/plain": [
"HBox(children=(FloatProgress(value=1.0, bar_style='info', description='Validating', layout=Layout(flex='2'), m…"
]
},
"metadata": {
"tags": []
}
},
{
"output_type": "display_data",
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "a59ea6641fca44bbaa553a1624deab51",
"version_minor": 0,
"version_major": 2
},
"text/plain": [
"HBox(children=(FloatProgress(value=1.0, bar_style='info', description='Validating', layout=Layout(flex='2'), m…"
]
},
"metadata": {
"tags": []
}
},
{
"output_type": "stream",
"text": [
"Epoch 3: reducing learning rate of group 0 to 1.0000e-03.\n",
"Epoch 3: reducing learning rate of group 0 to 1.0000e-02.\n"
],
"name": "stdout"
},
{
"output_type": "display_data",
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "36d600f8e9be46a6b05ec580025aac20",
"version_minor": 0,
"version_major": 2
},
"text/plain": [
"HBox(children=(FloatProgress(value=1.0, bar_style='info', description='Validating', layout=Layout(flex='2'), m…"
]
},
"metadata": {
"tags": []
}
},
{
"output_type": "display_data",
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "d75b9faa61b242279706a77458e55276",
"version_minor": 0,
"version_major": 2
},
"text/plain": [
"HBox(children=(FloatProgress(value=1.0, bar_style='info', description='Validating', layout=Layout(flex='2'), m…"
]
},
"metadata": {
"tags": []
}
},
{
"output_type": "display_data",
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "57322a343ddd4ad2bc408a840eca0e98",
"version_minor": 0,
"version_major": 2
},
"text/plain": [
"HBox(children=(FloatProgress(value=1.0, bar_style='info', description='Validating', layout=Layout(flex='2'), m…"
]
},
"metadata": {
"tags": []
}
},
{
"output_type": "display_data",
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "0351e40f8c0f46708aeea7233565800a",
"version_minor": 0,
"version_major": 2
},
"text/plain": [
"HBox(children=(FloatProgress(value=1.0, bar_style='info', description='Validating', layout=Layout(flex='2'), m…"
]
},
"metadata": {
"tags": []
}
},
{
"output_type": "display_data",
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "93fbdf1f9df1464888cdecc6285ef575",
"version_minor": 0,
"version_major": 2
},
"text/plain": [
"HBox(children=(FloatProgress(value=1.0, bar_style='info', description='Validating', layout=Layout(flex='2'), m…"
]
},
"metadata": {
"tags": []
}
},
{
"output_type": "display_data",
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "2c8df37c614447d5b5064dbbaa837081",
"version_minor": 0,
"version_major": 2
},
"text/plain": [
"HBox(children=(FloatProgress(value=1.0, bar_style='info', description='Validating', layout=Layout(flex='2'), m…"
]
},
"metadata": {
"tags": []
}
},
{
"output_type": "display_data",
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "3d3ca53ac0cb41d1bbb58aa754a79619",
"version_minor": 0,
"version_major": 2
},
"text/plain": [
"HBox(children=(FloatProgress(value=1.0, bar_style='info', description='Validating', layout=Layout(flex='2'), m…"
]
},
"metadata": {
"tags": []
}
},
{
"output_type": "stream",
"text": [
"\n"
],
"name": "stdout"
},
{
"output_type": "execute_result",
"data": {
"text/plain": [
"MatrixFactorizationModel(\n",
" (user_biases): ZeroEmbedding(941, 1)\n",
" (item_biases): ZeroEmbedding(1447, 1)\n",
" (user_embeddings): ScaledEmbedding(941, 10)\n",
" (item_embeddings): ScaledEmbedding(1447, 10)\n",
" (dropout): Dropout(p=0.0, inplace=False)\n",
")"
]
},
"metadata": {
"tags": []
},
"execution_count": 55
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 117,
"referenced_widgets": [
"5c2218b13d6345ff91c4d8980b2b2ca0",
"309d24f05b1b4bc5b7d0148795fd5cf4",
"ccfb5dc3eef14f1f8324bc48b3dc0e7f",
"a84e5372c5c24417a08487dd3efcebe8",
"c28c07424b7f45088e73ac25f2bb712a",
"5d01bda6f6354f25bb6dcb15ef4596a0",
"758cba83e9414f60bf87799a71b3cd7f",
"48d71d4847be4a25a237568371ae4493"
]
},
"id": "ENZSOd1DgzLk",
"outputId": "7b1844fb-b81a-43cb-8a98-5206b87b4506"
},
"source": [
"mapk_score, mrr_score, auc_score = evaluate_in_batches([mapk, mrr, auc], val_interactions, model)\n",
"\n",
"print(f'MAP@10 Score: {mapk_score}')\n",
"print(f'MRR Score: {mrr_score}')\n",
"print(f'AUC Score: {auc_score}')"
],
"execution_count": null,
"outputs": [
{
"output_type": "display_data",
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "5c2218b13d6345ff91c4d8980b2b2ca0",
"version_minor": 0,
"version_major": 2
},
"text/plain": [
"HBox(children=(FloatProgress(value=0.0, max=28.0), HTML(value='')))"
]
},
"metadata": {
"tags": []
}
},
{
"output_type": "stream",
"text": [
"\n",
"MAP@10 Score: 0.03243186201880122\n",
"MRR Score: 0.19819369246580287\n",
"AUC Score: 0.8617710409716284\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "blXcVpg3gzLk"
},
"source": [
"Again, we're not seeing as much performance increase here compared to the standard model because MovieLens 100K has so few items. For a more dramatic difference, try training this model on a larger dataset, such as MovieLens 10M, adjusting the architecture-specific hyperparameters, or train longer. "
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "1Dt0IsWJgzLk"
},
"source": [
"### Item-Item Similarity "
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "sqQJpyKVgzLl"
},
"source": [
"While we've trained every model thus far to work for member-item recommendations (given a *member*, recommend *items* - think of this best as \"Personalized recommendations for you\"), we also have access to item-item recommendations for free (given a seed *item*, recommend similar *items* - think of this more like \"People who interacted with this item also interacted with...\"). \n",
"\n",
"With Collie, accessing this is simple! "
]
},
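{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a generic sketch of how item-item similarity can be derived from any trained model's item embeddings (illustrative NumPy only, not Collie's actual API), cosine similarity between embedding rows ranks similar items for a seed item:"
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"import numpy as np\n",
"\n",
"# stand-in for a model's learned item embeddings (n_items x embedding_dim)\n",
"rng = np.random.default_rng(0)\n",
"item_embeddings = rng.normal(size=(5, 10))\n",
"\n",
"# cosine similarity between all item pairs\n",
"normalized = item_embeddings / np.linalg.norm(item_embeddings, axis=1, keepdims=True)\n",
"similarity = normalized @ normalized.T\n",
"\n",
"# most-similar items for a seed item, excluding the seed itself\n",
"seed_item = 0\n",
"ranked = np.argsort(-similarity[seed_item])\n",
"ranked = ranked[ranked != seed_item]"
],
"execution_count": null,
"outputs": []
},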
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 343
},
"id": "RRMouFweUOhw",
"outputId": "02e6281a-f64e-43c4-a151-84601d91ab50"
},
"source": [
"df_item = data_object.load_items()\n",
"df_item.head()"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"
"
],
"text/plain": [
" USERID ITEMID RATING TIMESTAMP\n",
"76722 4 358 2 892004275\n",
"3277 4 260 4 892004275\n",
"1250 4 264 3 892004275\n",
"12151 4 294 5 892004409\n",
"48826 4 11 4 892004520"
]
},
"metadata": {
"tags": []
}
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "qny2aVoWQknP"
},
"source": [
"### Preprocessing\n",
"\n",
"1. Sort by User ID and Timestamp\n",
"2. Label encode user and item id - in this case, already label encoded starting from 1, so decreasing ids by 1 as a proxy for label encode\n",
"3. Remove Timestamp and Rating column. The reason is that we are training a recall-maximing model where the objective is to correctly retrieve the items that users can interact with. We can select a rating threshold also\n",
"4. Convert Item IDs into list format\n",
"5. Store as a space-seperated txt file"
]
},
{
"cell_type": "code",
"metadata": {
"id": "1iSOiyCqpmYE"
},
"source": [
"def preprocess(data):\n",
" data = data.copy()\n",
" data = data.sort_values(by=['USERID','TIMESTAMP'])\n",
" data['USERID'] = data['USERID'] - 1\n",
" data['ITEMID'] = data['ITEMID'] - 1\n",
" data.drop(['TIMESTAMP','RATING'], axis=1, inplace=True)\n",
" data = data.groupby('USERID')['ITEMID'].apply(list).reset_index(name='ITEMID')\n",
" return data"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 204
},
"id": "D7ZtrUPp22dO",
"outputId": "56290e63-dcf3-448b-b3a5-ce60bd2c23db"
},
"source": [
"preprocess(df_train).head()"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
USERID
\n",
"
ITEMID
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
0
\n",
"
[167, 171, 164, 155, 165, 195, 186, 13, 249, 1...
\n",
"
\n",
"
\n",
"
1
\n",
"
1
\n",
"
[285, 257, 304, 306, 287, 311, 300, 305, 291, ...
\n",
"
\n",
"
\n",
"
2
\n",
"
2
\n",
"
[301, 332, 343, 299, 267, 336, 302, 344, 353, ...
\n",
"
\n",
"
\n",
"
3
\n",
"
3
\n",
"
[257, 287, 299, 327, 270, 358, 361, 302, 326, ...
\n",
"
\n",
"
\n",
"
4
\n",
"
4
\n",
"
[266, 454, 221, 120, 404, 362, 256, 249, 24, 2...
\n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" USERID ITEMID\n",
"0 0 [167, 171, 164, 155, 165, 195, 186, 13, 249, 1...\n",
"1 1 [285, 257, 304, 306, 287, 311, 300, 305, 291, ...\n",
"2 2 [301, 332, 343, 299, 267, 336, 302, 344, 353, ...\n",
"3 3 [257, 287, 299, 327, 270, 358, 361, 302, 326, ...\n",
"4 4 [266, 454, 221, 120, 404, 362, 256, 249, 24, 2..."
]
},
"metadata": {
"tags": []
},
"execution_count": 9
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "yDMAhrig1Lde"
},
"source": [
"def store(data, target_file='./data/movielens/train.txt'):\n",
" Path(target_file).parent.mkdir(parents=True, exist_ok=True)\n",
" with open(target_file, 'w+') as f:\n",
" writer = csv.writer(f, delimiter=' ')\n",
" for USERID, row in zip(data.USERID.values,data.ITEMID.values):\n",
" row = [USERID] + row\n",
" writer.writerow(row)"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "XUIFrKsavRzV"
},
"source": [
"store(preprocess(df_train), '/content/data/ml-100k/train.txt')\n",
"store(preprocess(df_test), '/content/data/ml-100k/test.txt')"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "vq2IwCkJtUTy",
"outputId": "bdd5df6b-213f-4e87-deae-b8f29e42ec87"
},
"source": [
"!head /content/data/ml-100k/train.txt"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"0 167 171 164 155 165 195 186 13 249 126 180 116 108 0 245 256 247 49 248 252 261 92 223 123 18 122 136 145 6 234 14 244 259 23 263 125 236 12 24 120 250 235 239 117 129 64 189 46 30 27 113 38 51 237 198 182 10 68 160 94 59 82 178 21 97 63 134 162 25 201 88 7 213 181 47 98 159 174 191 179 127 142 184 67 54 203 55 95 80 78 150 211 22 69 83 93 196 190 183 133 206 144 187 185 96 84 35 143 158 16 173 251 104 147 107 146 219 105 242 121 106 103 246 119 44 267 266 258 260 262 9 149 233 91 70 41 175 90 192 216 176 215 193 72 58 132 40 194 217 169 212 156 222 26 226 79 230 66 118 199 3 214 163 1 205 76 52 135 45 39 152 268 253 114 172 210 228 154 202 61 89 218 166 229 34 161 60 264 111 56 48 29 232 130 151 81 140 71 32 157 197 224 112 20 148 87 100 109 102 238 33 28 42 131 209 204 115 124\r\n",
"1 285 257 304 306 287 311 300 305 291 302 268 298 314 295 0 18 296 292 274 256 294 276 286 254 297 289 279 273 275 272 290 277 293 24 278 13 110 9 281 12 236 283 99 126 312 284 301 282 250 310\r\n",
"2 301 332 343 299 267 336 302 344 353 257 287 318 340 351 271 349 352 333 342 338 341 335 298 325 293 306 331 270 244 354 323 348 322 321 334 263 324 337 329 350 346 339 328\r\n",
"3 257 287 299 327 270 358 361 302 326 328 359 360 353 300 323 209 355 356 49\r\n",
"4 266 454 221 120 404 362 256 249 24 20 99 108 368 234 411 406 410 104 367 224 150 0 180 49 405 423 412 78 396 372 230 398 228 225 175 449 182 434 88 1 227 229 226 448 209 430 173 171 143 402 397 390 384 16 371 385 392 395 166 366 89 400 389 41 152 185 455 69 383 109 79 380 363 208 450 381 427 382 429 210 432 238 172 207 203 413 167 153 421 431 422 418 142 416 414 373 28 433 364 365 379 391 386 428 424 213 134 61 374 97 447 184 233 435 199 442 444 446 218 443 378 369 440 144 445 401 240 370 215 65 420 426 377 94 419 101 415 417 98 403\r\n",
"5 285 241 301 268 305 257 339 302 303 320 309 258 267 308 537 260 181 247 407 274 6 296 126 275 99 8 458 123 13 514 14 136 292 533 535 12 116 284 220 474 0 256 110 476 245 507 470 150 297 283 124 409 532 236 457 293 471 459 534 404 531 472 20 475 300 307 63 523 7 513 97 426 164 222 134 78 530 509 177 486 176 135 88 49 526 204 512 480 461 186 191 46 168 173 519 317 483 488 497 142 70 11 478 190 496 479 491 503 511 495 210 468 524 196 481 198 529 55 536 499 473 68 520 203 506 31 179 182 518 489 188 22 193 463 494 510 460 184 165 174 487 485 69 132 528 482 215 521 522 434 192 462 431 492 237 493 58 208 130 21 94 502 527 86 490 316 467 505 155\r\n",
"6 268 677 681 258 680 306 265 285 267 682 299 287 263 679 308 63 173 186 602 514 175 179 85 366 264 227 522 434 617 185 418 215 446 529 177 642 31 650 171 649 181 100 473 172 615 428 97 233 487 49 92 525 196 88 99 481 495 513 21 611 203 610 652 658 96 143 483 190 402 656 131 170 180 633 7 494 222 95 22 645 654 635 55 8 490 195 435 81 200 498 655 632 167 422 603 134 272 430 67 204 384 165 614 660 510 182 214 155 607 647 643 197 212 670 68 126 189 43 595 355 542 236 526 512 3 284 135 163 497 237 151 484 592 193 482 612 91 478 491 191 317 392 156 381 583 479 509 420 496 202 429 486 590 6 426 662 152 207 501 567 587 631 78 460 178 629 504 480 27 506 130 228 213 503 433 657 10 588 24 646 160 613 549 469 628 206 210 98 69 626 601 194 528 80 555 673 527 651 70 26 46 547 608 150 536 187 508 216 667 618 274 518 431 627 606 634 9 470 176 120 401 605 209 403 674 201 50 630 89 621 377 211 659 415 454 609 548 153 139 520 678 464 124 462 132 419 125 648 442 543 669 404 604 76 378 229 471 565 117 500 162 161 545 140 580 488 199 636 639 505 644 38 225 591 638 280 383 546 600 596 672 364 231 594 597 51 28 572 447 451 598 90 105 623 619 450 570 586 502 663 71 240 379 561 141 577 599 388 593 77 622 440 400 395 443 571 398 79 414 664 563 676\r\n",
"7 257 293 300 258 335 259 687 242 357 456 340 686 337 688 650 171 186 126 49 384 88 21 55 189 181 180 95 173 510 509 567 176 10 434 175 182 402 143 54 227 78 272 209 6 194 228 685\r\n",
"8 339 241 478 520 401 506 614 526 689 275 293 6 370 49 384 5 297 200\r\n",
"9 301 285 268 288 318 244 333 332 653 526 429 55 512 662 63 31 173 126 492 193 152 557 58 185 517 701 610 628 417 473 384 692 602 706 155 504 655 587 485 663 22 11 403 530 685 370 81 222 123 474 49 708 190 272 609 487 220 705 174 274 696 10 177 21 710 700 167 204 650 184 605 181 233 15 510 697 0 479 196 460 159 115 477 134 178 156 8 508 495 197 601 47 194 210 651 175 3 98 133 68 136 284 356 154 462 97 434 501 496 217 691 199 481 482 215 179 273 163 169 497 69 446 461 99 469 275 132 466 654 483 588 191 478 413 128 202 704 12 198 703 160 694 518 702 656 520 603\r\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "E7YYuu2XuVQa",
"outputId": "fd6fc81c-8ee7-4a34-cb7f-ae5c9ffb27d6"
},
"source": [
"!head /content/data/ml-100k/test.txt"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"0 207 2 11 57 200 137 65 36 37 139 240 75 225 77 62 231 138 141 74 50 53 43 86 99 8 227 153 85 168 15 177 221 257 265 254 271 270 19 128 220 243 5 17 269 208 31 188 241 170 110 4 255 101 73\r\n",
"1 49 241 271 309 303 299 288 315 307 308 313 280\r\n",
"2 320 327 326 345 347 259 330 317 316 319 180\r\n",
"3 357 259 263 293 10\r\n",
"4 138 388 453 68 161 232 242 258 451 439 437 436 438 188 168 407 100 425 376 62 399 93 408 193 162 393 375 39 409 23 441 387 394 452 456\r\n",
"5 516 80 133 498 194 466 418 131 504 207 356 465 199 172 187 212 273 422 169 525 484 508 515 501 201 469 366 185 500 153 477 202 167 424 18 85 152 27 517 538 464 271\r\n",
"6 675 61 144 551 616 560 569 553 585 540 52 138 589 448 544 558 333 293 259 624 417 541 557 218 671 439 568 562 550 564 566 668 445 637 556 579 665 641 226 582 53 573 449 142 416 625 620 559 230 575 390 539 427 174 72 584 574 576 385 578 519 386 316 661 552 554 30 133 323 257 11 192 640 184 198 356 666 653 432 581 340\r\n",
"7 549 81 187 221 430 683 517 226 240 232 565 684\r\n",
"8 690 285 482 486\r\n",
"9 143 503 59 616 709 431 494 92 509 524 488 229 529 161 6 237 614 695 581 282 699 693 366 84 39 419 711 707 528 182 698 131 32 498 320 293 339\r\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "C-f9mzEf4Ow6"
},
"source": [
"Path('/content/results').mkdir(parents=True, exist_ok=True)"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "Bxz7aV0Sws1S"
},
"source": [
"def parse_args():\n",
" parser = argparse.ArgumentParser(description=\"Run NGCF.\")\n",
" parser.add_argument('--data_dir', type=str,\n",
" default='./data/',\n",
" help='Input data path.')\n",
" parser.add_argument('--dataset', type=str, default='ml-100k',\n",
" help='Dataset name: Amazond-book, Gowella, ml-100k')\n",
" parser.add_argument('--results_dir', type=str, default='results',\n",
" help='Store model to path.')\n",
" parser.add_argument('--n_epochs', type=int, default=400,\n",
" help='Number of epoch.')\n",
" parser.add_argument('--reg', type=float, default=1e-5,\n",
" help='l2 reg.')\n",
" parser.add_argument('--lr', type=float, default=0.0001,\n",
" help='Learning rate.')\n",
" parser.add_argument('--emb_dim', type=int, default=64,\n",
" help='number of embeddings.')\n",
" parser.add_argument('--layers', type=str, default='[64,64]',\n",
" help='Output sizes of every layer')\n",
" parser.add_argument('--batch_size', type=int, default=512,\n",
" help='Batch size.')\n",
" parser.add_argument('--node_dropout', type=float, default=0.,\n",
" help='Graph Node dropout.')\n",
" parser.add_argument('--mess_dropout', type=float, default=0.1,\n",
" help='Message dropout.')\n",
" parser.add_argument('--k', type=str, default=20,\n",
" help='k order of metric evaluation (e.g. NDCG@k)')\n",
" parser.add_argument('--eval_N', type=int, default=5,\n",
" help='Evaluate every N epochs')\n",
" parser.add_argument('--save_results', type=int, default=1,\n",
" help='Save model and results')\n",
"\n",
" return parser.parse_args(args={})"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "twi1ZIucR0ga"
},
"source": [
"### Helper Functions\n",
"\n",
"- early_stopping()\n",
"- train()\n",
"- split_matrix()\n",
"- ndcg_k()\n",
"- eval_model"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "aCShFbsCTPzw"
},
"source": [
"#### Early Stopping\n",
"Premature stopping is applied if *recall@20* on the test set does not increase for 5 successive epochs."
]
},
{
"cell_type": "code",
"metadata": {
"id": "tHVTudWxTVZo"
},
"source": [
"def early_stopping(log_value, best_value, stopping_step, flag_step, expected_order='asc'):\n",
" \"\"\"\n",
" Check if early_stopping is needed\n",
" Function copied from original code\n",
" \"\"\"\n",
" assert expected_order in ['asc', 'des']\n",
" if (expected_order == 'asc' and log_value >= best_value) or (expected_order == 'des' and log_value <= best_value):\n",
" stopping_step = 0\n",
" best_value = log_value\n",
" else:\n",
" stopping_step += 1\n",
"\n",
" if stopping_step >= flag_step:\n",
" print(\"Early stopping at step: {} log:{}\".format(flag_step, log_value))\n",
" should_stop = True\n",
" else:\n",
" should_stop = False\n",
"\n",
" return best_value, stopping_step, should_stop"
],
"execution_count": null,
"outputs": []
},
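{
"cell_type": "markdown",
"metadata": {},
"source": [
"A quick sanity check of the stopping logic on a synthetic metric sequence (the helper below restates the function so this cell runs on its own): with flag_step=5, training stops five epochs after the metric last improved."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"def early_stopping(log_value, best_value, stopping_step, flag_step, expected_order='asc'):\n",
"    # same logic as the cell above, minus the print\n",
"    improved = (expected_order == 'asc' and log_value >= best_value) or (expected_order == 'des' and log_value <= best_value)\n",
"    if improved:\n",
"        best_value, stopping_step = log_value, 0\n",
"    else:\n",
"        stopping_step += 1\n",
"    return best_value, stopping_step, stopping_step >= flag_step\n",
"\n",
"# recall@20 peaks at epoch 2 and never improves again\n",
"metrics = [0.10, 0.12, 0.13, 0.12, 0.11, 0.12, 0.10, 0.11]\n",
"best, step, stopped_at = 0.0, 0, None\n",
"for epoch, m in enumerate(metrics):\n",
"    best, step, stop = early_stopping(m, best, step, flag_step=5)\n",
"    if stop:\n",
"        stopped_at = epoch\n",
"        break"
],
"execution_count": null,
"outputs": []
},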
{
"cell_type": "code",
"metadata": {
"id": "6JEG5Jlpw3Nw"
},
"source": [
"def train(model, data_generator, optimizer):\n",
" \"\"\"\n",
" Train the model PyTorch style\n",
" Arguments:\n",
" ---------\n",
" model: PyTorch model\n",
" data_generator: Data object\n",
" optimizer: PyTorch optimizer\n",
" \"\"\"\n",
" model.train()\n",
" n_batch = data_generator.n_train // data_generator.batch_size + 1\n",
" running_loss=0\n",
" for _ in range(n_batch):\n",
" u, i, j = data_generator.sample()\n",
" optimizer.zero_grad()\n",
" loss = model(u,i,j)\n",
" loss.backward()\n",
" optimizer.step()\n",
" running_loss += loss.item()\n",
" return running_loss\n",
"\n",
"def split_matrix(X, n_splits=100):\n",
" \"\"\"\n",
" Split a matrix/Tensor into n_folds (for the user embeddings and the R matrices)\n",
" Arguments:\n",
" ---------\n",
" X: matrix to be split\n",
" n_folds: number of folds\n",
" Returns:\n",
" -------\n",
" splits: split matrices\n",
" \"\"\"\n",
" splits = []\n",
" chunk_size = X.shape[0] // n_splits\n",
" for i in range(n_splits):\n",
" start = i * chunk_size\n",
" end = X.shape[0] if i == n_splits - 1 else (i + 1) * chunk_size\n",
" splits.append(X[start:end])\n",
" return splits\n",
"\n",
"def compute_ndcg_k(pred_items, test_items, test_indices, k):\n",
" \"\"\"\n",
" Compute NDCG@k\n",
" \n",
" Arguments:\n",
" ---------\n",
" pred_items: binary tensor with 1s in those locations corresponding to the predicted item interactions\n",
" test_items: binary tensor with 1s in locations corresponding to the real test interactions\n",
" test_indices: tensor with the location of the top-k predicted items\n",
" k: k'th-order \n",
" Returns:\n",
" -------\n",
" NDCG@k\n",
" \"\"\"\n",
" r = (test_items * pred_items).gather(1, test_indices)\n",
" f = torch.from_numpy(np.log2(np.arange(2, k+2))).float().cuda()\n",
" dcg = (r[:, :k]/f).sum(1)\n",
" dcg_max = (torch.sort(r, dim=1, descending=True)[0][:, :k]/f).sum(1)\n",
" ndcg = dcg/dcg_max\n",
" ndcg[torch.isnan(ndcg)] = 0\n",
" return ndcg"
],
"execution_count": null,
"outputs": []
},
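{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the DCG arithmetic concrete, here is a tiny CPU-only worked example (plain NumPy, independent of the CUDA tensors above): hits at ranks 1 and 3 out of the top 5."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"import numpy as np\n",
"\n",
"# relevance of the top-5 ranked items for one user (1 = hit, 0 = miss)\n",
"r = np.array([1.0, 0.0, 1.0, 0.0, 0.0])\n",
"k = 5\n",
"discounts = np.log2(np.arange(2, k + 2))  # log2(rank + 1) for ranks 1..5\n",
"\n",
"dcg = (r / discounts).sum()       # 1/1 + 1/2 = 1.5\n",
"ideal = np.sort(r)[::-1]          # best possible ordering of the same hits\n",
"idcg = (ideal / discounts).sum()  # 1/1 + 1/log2(3)\n",
"ndcg = dcg / idcg                 # ~0.92"
],
"execution_count": null,
"outputs": []
},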
{
"cell_type": "markdown",
"metadata": {
"id": "sx-Vzl2vTeWN"
},
"source": [
"#### Eval Model\n",
"\n",
"At every N epoch, the model is evaluated on the test set. From this evaluation, we compute the recall and normal discounted cumulative gain (ndcg) at the top-20 predictions. It is important to note that in order to evaluate the model on the test set we have to ‘unpack’ the sparse matrix (torch.sparse.todense()), and thus load a bunch of ‘zeros’ on memory. In order to prevent memory overload, we split the sparse matrices into 100 chunks, unpack the sparse chunks one by one, compute the metrics we need, and compute the mean value of all chunks."
]
},
{
"cell_type": "code",
"metadata": {
"id": "1dysqVKGTjm6"
},
"source": [
"def eval_model(u_emb, i_emb, Rtr, Rte, k):\n",
" \"\"\"\n",
" Evaluate the model\n",
" \n",
" Arguments:\n",
" ---------\n",
" u_emb: User embeddings\n",
" i_emb: Item embeddings\n",
" Rtr: Sparse matrix with the training interactions\n",
" Rte: Sparse matrix with the testing interactions\n",
" k : kth-order for metrics\n",
" \n",
" Returns:\n",
" --------\n",
" result: Dictionary with lists correponding to the metrics at order k for k in Ks\n",
" \"\"\"\n",
" # split matrices\n",
" ue_splits = split_matrix(u_emb)\n",
" tr_splits = split_matrix(Rtr)\n",
" te_splits = split_matrix(Rte)\n",
"\n",
" recall_k, ndcg_k= [], []\n",
" # compute results for split matrices\n",
" for ue_f, tr_f, te_f in zip(ue_splits, tr_splits, te_splits):\n",
"\n",
" scores = torch.mm(ue_f, i_emb.t())\n",
"\n",
" test_items = torch.from_numpy(te_f.todense()).float().cuda()\n",
" non_train_items = torch.from_numpy(1-(tr_f.todense())).float().cuda()\n",
" scores = scores * non_train_items\n",
"\n",
" _, test_indices = torch.topk(scores, dim=1, k=k)\n",
"\n",
" # If you want to use a as the index in dim1 for t, this code should work:\n",
" #t[torch.arange(t.size(0)), a]\n",
"\n",
" pred_items = torch.zeros_like(scores).float()\n",
" # pred_items.scatter_(dim=1,index=test_indices,src=torch.tensor(1.0).cuda())\n",
" pred_items.scatter_(dim=1,index=test_indices,src=torch.ones_like(test_indices, dtype=torch.float).cuda())\n",
"\n",
" topk_preds = torch.zeros_like(scores).float()\n",
" # topk_preds.scatter_(dim=1,index=test_indices[:, :k],src=torch.tensor(1.0))\n",
" _idx = test_indices[:, :k]\n",
" topk_preds.scatter_(dim=1,index=_idx,src=torch.ones_like(_idx, dtype=torch.float))\n",
"\n",
" TP = (test_items * topk_preds).sum(1)\n",
" rec = TP/test_items.sum(1)\n",
" ndcg = compute_ndcg_k(pred_items, test_items, test_indices, k)\n",
"\n",
" recall_k.append(rec)\n",
" ndcg_k.append(ndcg)\n",
"\n",
" return torch.cat(recall_k).mean(), torch.cat(ndcg_k).mean()"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "mvvKJOlpSLn4"
},
"source": [
"### Dataset Class"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "AzoIqHHuUADD"
},
"source": [
"#### Laplacian matrix\n",
"\n",
"The components of the Laplacian matrix are as follows,\n",
"\n",
"- **D**: a diagonal degree matrix, where D{t,t} is |N{t}|, which is the amount of first-hop neighbors for either item or user t,\n",
"- **R**: the user-item interaction matrix,\n",
"- **0**: an all-zero matrix,\n",
"- **A**: the adjacency matrix,\n",
"\n",
"#### Interaction and Adjacency Matrix\n",
"\n",
"We create the sparse interaction matrix R, the adjacency matrix A, the degree matrix D, and the Laplacian matrix L, using the SciPy library. The adjacency matrix A is then transferred onto PyTorch tensor objects."
]
},
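{
"cell_type": "markdown",
"metadata": {},
"source": [
"On a toy 2-user x 3-item interaction matrix, the block structure of the adjacency matrix and its symmetric normalization D^{-1/2} A D^{-1/2} look like this (illustrative SciPy sketch; the Data class handles the full-scale construction):"
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"import numpy as np\n",
"import scipy.sparse as sp\n",
"\n",
"# toy interaction matrix R: 2 users x 3 items\n",
"R = sp.csr_matrix(np.array([[1., 0., 1.],\n",
"                            [0., 1., 1.]]))\n",
"\n",
"# block adjacency matrix A = [[0, R], [R^T, 0]] over users + items\n",
"A = sp.bmat([[None, R], [R.T, None]], format='csr')\n",
"\n",
"# symmetric normalization D^-1/2 A D^-1/2\n",
"deg = np.asarray(A.sum(axis=1)).flatten()\n",
"d_inv_sqrt = sp.diags(np.power(deg, -0.5))\n",
"A_norm = d_inv_sqrt @ A @ d_inv_sqrt"
],
"execution_count": null,
"outputs": []
},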
{
"cell_type": "code",
"metadata": {
"id": "s0w9GTdKw7Vj"
},
"source": [
"class Data(object):\n",
" def __init__(self, path, batch_size):\n",
" self.path = path\n",
" self.batch_size = batch_size\n",
"\n",
" train_file = path + '/train.txt'\n",
" test_file = path + '/test.txt'\n",
"\n",
" #get number of users and items\n",
" self.n_users, self.n_items = 0, 0\n",
" self.n_train, self.n_test = 0, 0\n",
" self.neg_pools = {}\n",
"\n",
" self.exist_users = []\n",
"\n",
" # search train_file for max user_id/item_id\n",
" with open(train_file) as f:\n",
" for l in f.readlines():\n",
" if len(l) > 0:\n",
" l = l.strip('\\n').split(' ')\n",
" items = [int(i) for i in l[1:]]\n",
" # first element is the user_id, rest are items\n",
" uid = int(l[0])\n",
" self.exist_users.append(uid)\n",
" # item/user with highest number is number of items/users\n",
" self.n_items = max(self.n_items, max(items))\n",
" self.n_users = max(self.n_users, uid)\n",
" # number of interactions\n",
" self.n_train += len(items)\n",
"\n",
" # search test_file for max item_id\n",
" with open(test_file) as f:\n",
" for l in f.readlines():\n",
" if len(l) > 0:\n",
" l = l.strip('\\n')\n",
" try:\n",
" items = [int(i) for i in l.split(' ')[1:]]\n",
" except Exception:\n",
" continue\n",
" if not items:\n",
" print(\"empyt test exists\")\n",
" pass\n",
" else:\n",
" self.n_items = max(self.n_items, max(items))\n",
" self.n_test += len(items)\n",
" # adjust counters: user_id/item_id starts at 0\n",
" self.n_items += 1\n",
" self.n_users += 1\n",
"\n",
" self.print_statistics()\n",
"\n",
" # create interactions/ratings matrix 'R' # dok = dictionary of keys\n",
" print('Creating interaction matrices R_train and R_test...')\n",
" t1 = time()\n",
" self.R_train = sp.dok_matrix((self.n_users, self.n_items), dtype=np.float32) \n",
" self.R_test = sp.dok_matrix((self.n_users, self.n_items), dtype=np.float32)\n",
"\n",
" self.train_items, self.test_set = {}, {}\n",
" with open(train_file) as f_train:\n",
" with open(test_file) as f_test:\n",
" for l in f_train.readlines():\n",
" if len(l) == 0: break\n",
" l = l.strip('\\n')\n",
" items = [int(i) for i in l.split(' ')]\n",
" uid, train_items = items[0], items[1:]\n",
" # enter 1 if user interacted with item\n",
" for i in train_items:\n",
" self.R_train[uid, i] = 1.\n",
" self.train_items[uid] = train_items\n",
"\n",
" for l in f_test.readlines():\n",
" if len(l) == 0: break\n",
" l = l.strip('\\n')\n",
" try:\n",
" items = [int(i) for i in l.split(' ')]\n",
" except Exception:\n",
" continue\n",
" uid, test_items = items[0], items[1:]\n",
" for i in test_items:\n",
" self.R_test[uid, i] = 1.0\n",
" self.test_set[uid] = test_items\n",
" print('Complete. Interaction matrices R_train and R_test created in', time() - t1, 'sec')\n",
"\n",
" # load the adjacency matrix from disk if it exists, otherwise create and cache it\n",
" def get_adj_mat(self):\n",
" try:\n",
" t1 = time()\n",
" adj_mat = sp.load_npz(self.path + '/s_adj_mat.npz')\n",
" print('Loaded adjacency-matrix (shape:', adj_mat.shape,') in', time() - t1, 'sec.')\n",
"\n",
" except Exception:\n",
" print('Creating adjacency-matrix...')\n",
" adj_mat = self.create_adj_mat()\n",
" sp.save_npz(self.path + '/s_adj_mat.npz', adj_mat)\n",
" return adj_mat\n",
" \n",
" # create adjacency matrix\n",
" def create_adj_mat(self):\n",
" t1 = time()\n",
" \n",
" adj_mat = sp.dok_matrix((self.n_users + self.n_items, self.n_users + self.n_items), dtype=np.float32)\n",
" adj_mat = adj_mat.tolil()\n",
" R = self.R_train.tolil() # to list of lists\n",
"\n",
" adj_mat[:self.n_users, self.n_users:] = R\n",
" adj_mat[self.n_users:, :self.n_users] = R.T\n",
" adj_mat = adj_mat.todok()\n",
" print('Complete. Adjacency-matrix (shape:', adj_mat.shape, ') created in', time() - t1, 'sec.')\n",
"\n",
" t2 = time()\n",
"\n",
" # normalize adjacency matrix\n",
" def normalized_adj_single(adj):\n",
" rowsum = np.array(adj.sum(1))\n",
"\n",
" d_inv = np.power(rowsum, -.5).flatten()\n",
" d_inv[np.isinf(d_inv)] = 0.\n",
" d_mat_inv = sp.diags(d_inv)\n",
"\n",
" norm_adj = d_mat_inv.dot(adj).dot(d_mat_inv)\n",
" return norm_adj.tocoo()\n",
"\n",
" print('Transforming adjacency-matrix to NGCF-adjacency matrix...')\n",
" ngcf_adj_mat = normalized_adj_single(adj_mat) + sp.eye(adj_mat.shape[0])\n",
"\n",
" print('Complete. Transformed adjacency-matrix to NGCF-adjacency matrix in', time() - t2, 'sec.')\n",
" return ngcf_adj_mat.tocsr()\n",
"\n",
" # create pools of 100 items that each user never interacted with\n",
" def negative_pool(self):\n",
" t1 = time()\n",
" for u in self.train_items.keys():\n",
" neg_items = list(set(range(self.n_items)) - set(self.train_items[u]))\n",
" pools = [rd.choice(neg_items) for _ in range(100)]\n",
" self.neg_pools[u] = pools\n",
" print('refreshed negative pools in', time() - t1, 'sec')\n",
"\n",
" # sample data for mini-batches\n",
" def sample(self):\n",
" if self.batch_size <= self.n_users:\n",
" users = rd.sample(self.exist_users, self.batch_size)\n",
" else:\n",
" users = [rd.choice(self.exist_users) for _ in range(self.batch_size)]\n",
"\n",
" def sample_pos_items_for_u(u, num):\n",
" pos_items = self.train_items[u]\n",
" n_pos_items = len(pos_items)\n",
" pos_batch = []\n",
" while True:\n",
" if len(pos_batch) == num: break\n",
" pos_id = np.random.randint(low=0, high=n_pos_items, size=1)[0]\n",
" pos_i_id = pos_items[pos_id]\n",
"\n",
" if pos_i_id not in pos_batch:\n",
" pos_batch.append(pos_i_id)\n",
" return pos_batch\n",
"\n",
" def sample_neg_items_for_u(u, num):\n",
" neg_items = []\n",
" while True:\n",
" if len(neg_items) == num: break\n",
" neg_id = np.random.randint(low=0, high=self.n_items,size=1)[0]\n",
" if neg_id not in self.train_items[u] and neg_id not in neg_items:\n",
" neg_items.append(neg_id)\n",
" return neg_items\n",
"\n",
" def sample_neg_items_for_u_from_pools(u, num):\n",
" neg_items = list(set(self.neg_pools[u]) - set(self.train_items[u]))\n",
" return rd.sample(neg_items, num)\n",
"\n",
" pos_items, neg_items = [], []\n",
" for u in users:\n",
" pos_items += sample_pos_items_for_u(u, 1)\n",
" neg_items += sample_neg_items_for_u(u, 1)\n",
"\n",
" return users, pos_items, neg_items\n",
"\n",
" def get_num_users_items(self):\n",
" return self.n_users, self.n_items\n",
"\n",
" def print_statistics(self):\n",
" print('n_users=%d, n_items=%d' % (self.n_users, self.n_items))\n",
" print('n_interactions=%d' % (self.n_train + self.n_test))\n",
" print('n_train=%d, n_test=%d, sparsity=%.5f' % (self.n_train, self.n_test, (self.n_train + self.n_test)/(self.n_users * self.n_items)))"
],
"execution_count": null,
"outputs": []
},
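{
"cell_type": "markdown",
"metadata": {
"id": "dataUsageNoteMd"
},
"source": [
"A minimal usage sketch for the `Data` class above. The `path` and `batch_size` values are placeholders (assuming a `train.txt`/`test.txt` layout where each line is `user_id item_id item_id ...`), not values fixed by the text:"
]
},
{
"cell_type": "code",
"metadata": {
"id": "dataUsageNoteCode"
},
"source": [
"# Hypothetical example: load the dataset and build the NGCF adjacency matrix.\n",
"# 'Data/gowalla' is a placeholder path; point it at a folder containing train.txt/test.txt.\n",
"data = Data(path='Data/gowalla', batch_size=1024)\n",
"ngcf_adj = data.get_adj_mat()\n",
"\n",
"# sample one mini-batch of (user, positive item, negative item) triples for BPR-style training\n",
"users, pos_items, neg_items = data.sample()\n",
"print(len(users), len(pos_items), len(neg_items))  # each list has batch_size entries"
],
"execution_count": null,
"outputs": []
},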
{
"cell_type": "markdown",
"metadata": {
"id": "J2RxhIxmSYvl"
},
"source": [
"### NGCF Model"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "P4vz1IOwTvED"
},
"source": [
"#### Weight initialization\n",
"\n",
"We then create tensors for the user embeddings and item embeddings with the proper dimensions. The weights are initialized using [Xavier uniform initialization](https://pytorch.org/docs/stable/nn.init.html).\n",
"\n",
"For each layer, the weight matrices and corresponding biases are initialized using the same procedure."
]
},
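{
"cell_type": "markdown",
"metadata": {
"id": "xavierInitNoteMd"
},
"source": [
"A sketch of the initialization described above. The embedding size and layer sizes are illustrative placeholders, not values fixed by the text:"
]
},
{
"cell_type": "code",
"metadata": {
"id": "xavierInitNoteCode"
},
"source": [
"# Sketch: Xavier-uniform initialization of user/item embeddings and per-layer weights.\n",
"# n_users, n_items, emb_size and the layer widths are illustrative placeholders.\n",
"n_users, n_items, emb_size = 1000, 2000, 64\n",
"layers = [64, 64, 64]\n",
"\n",
"initializer = nn.init.xavier_uniform_\n",
"\n",
"embedding_dict = nn.ParameterDict({\n",
"    'user_emb': nn.Parameter(initializer(torch.empty(n_users, emb_size))),\n",
"    'item_emb': nn.Parameter(initializer(torch.empty(n_items, emb_size))),\n",
"})\n",
"\n",
"# weight matrix and bias for each propagation layer, same Xavier scheme\n",
"weight_dict = nn.ParameterDict()\n",
"layer_sizes = [emb_size] + layers\n",
"for k in range(len(layers)):\n",
"    weight_dict['W_gc_%d' % k] = nn.Parameter(initializer(torch.empty(layer_sizes[k], layer_sizes[k + 1])))\n",
"    weight_dict['b_gc_%d' % k] = nn.Parameter(initializer(torch.empty(1, layer_sizes[k + 1])))"
],
"execution_count": null,
"outputs": []
},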
{
"cell_type": "markdown",
"metadata": {
"id": "wo9vgNvJUWvR"
},
"source": [
"#### Embedding Layer\n",
"\n",
"The initial user and item embeddings are concatenated in an embedding lookup table as shown in the figure below. This embedding table is initialized using the user and item embeddings and will be optimized in an end-to-end fashion by the network."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ntRUCJGMUeNf"
},
"source": [
"![image.png](data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAASEAAABICAYAAAC5kUM9AAAgAElEQVR4Ae2daVhUV7rvz5f74T733A99zj3dT/p0x3S0p6QTTxJN2k6cYsc4xChx1hijxiHGeR4jzjOKAwqIoOIMIiAKIiLzXMxTMRbFUExFUUVBjft3n12AIuJAGoLEzfPUQ9Xae7/rXb937f9ea+219/o3pD+JgERAItCDBP6tB/OWspYISAQkAkgiJFUCiYBEoEcJSCLUo/ilzCUCEgFJhKQ6IBGQCPQoAUmEehS/lLlEQCIgiZBUByQCEoEeJSCJUI/ilzKXCEgEJBGS6oBEQCLQowQkEepR/FLmEgGJgCRCUh2QCEgEepSAJEI9il/KXCIgEZBESKoDEgGJQI8SkESoR/FLmUsEJAKSCEl1QCIgEehRApII9Sh+KXOJgERAEiGpDkgEJAI9SuAXK0KNufe45O7CqVOncLkWhUJrwNKjqKXMJQJdRECoIyPoEmdPn+D4cSe84srRGoQuMv7zm/mZRchCScRFju/ZyqaNG9mwYUObz2ZOBBZS12TtAgoCNV7LGPb5fA55R5BRUkOj2UrvDVMXIJFM/IIImNBVK8mJvcLG8R/xtWM8Sk3vvcT+zCIk0FirIGjbaN7u898MXX2Fe1EyZDF3cPp+BKM236ao1tAFYmGl6sI3/M+Uw0QV1GISJPn5BZ2BUlFsBAQsBhVXFw5g8qFoitXmXsvlZxYhkZOZ9MNj+MOv/p0hOxNQ1pmwmBqpTjrGUns/FNX6LhOhAXPOkFWh6wJ7vTa+kuO/aAJNBK7+mGkOkgh1MswWMh2+4A+/+r8M3S2jrN5MVbQ/IVmFZBepMZq7ojvW3BIaOOcM2S8qQnoFMT5nObpvHw6ufsjKGjB1hSudpNM1u+tRxN7E/eg+9jm44isrQ2d8iQujLyH2pnsLe1+SSnW8zO52FCN9SRy+7o7s3+eAq28SpVoj3U+8icA1kgh1FI/npLWK0P/hrck7OHLKma3TZ7A3XIna2HG3yVKVSfjd2/j7+eHX9nM7jkKdgScbop0TIaEugTPLJjJ1zRn8A1xZMXYQw+eeIq5C14Ht5xSvpzcLdSScWc6kqWtw9buF68ov+MfwOTjFlqM19bRzT+YvsndbMZlpa1zwu+XKyi8+Zvick8SUa7G5K9SSGXqPpJJ6ml7KYQ+BusSzrJwyjdXOvvi7rmLcx8OZcyKa0vpm4II6i9B7SSjqm7r45ogkQk/WqBdKaRWhf+e9+c5c9fPn0MxJbL+voPYpItSU/4Cr587gfPo0p9t8nD1Dye8wsJ0RITM5Z2cz8I13mLT7An53b2L/xRv812tjOJRYhrbLKr6AWuaFs1sgWbX6x8VNUCPzcsYtMIta/eOSKqhleDm7EZhVS7tNHdI257gz56M/8O7EXZz3DeLm9nG8+evXGH0wntL6LitMh3l3PtFMjvtcPnrzXSbuPIdv0E22j3uTX782moNxSmzuWvLxd9jHxdgKNJ29AySoSfZ2wS0wk5qGtlwF1MneuLgFklnT8EQsHh3zAqptzsFj3iD6vjuRHR43Cbq5gy/7/obfjtpPbInGJjqWglsc2X+RmHINnS3Cs5lKIvRsPk/d+kiEBttHkVdeR+E9b+7laZ56klkbNVRXqqioqHj8UyVeHTu669UJEbIquTD7bX7z6w+Zt/8kbp4XcD68k207zxJWWo/BakKjSCejWIPBLIBVTV5mIeomUyeb2xbk5xYxetxGfApqaGzb6LPIObdoNOM2+lBQ0/jYGJZFfo5Fo8ex0aeAmscO6giwlVLPb3nntV/z4dx9nHTz5ILzYXba78QtTEl9kxWTRkF6ZjF1BhMCVtT5mRSpG3um62ktxXPOO7z2m4HM3XsCtwvN7O13uvFAqaHJrEOVl0Cw/30yVA2d76JZ5JxfPJYvN3qTV92WqwX5+cWM/XIj3nnVT8Si9Rh5
1fPHJ62lF5nb/7f8ZuAc9hw/w4ULzhzeac9OtweUaJow61TkJQbjfz8Dla6ru2iSCHV0FrxAWqsItY4JWbE0atEZm8gODCBRpcPQrjOtT77EznXL+X7RIha1+Xy//gzRKi1PNqA6IUKWPE7a9eX//WoIW++mkl9eSWVlJZVVdehNGrJDLnNgwWi+PRZnG78y557l+0XHiFLWdZDvs4ovoK/IJS2tgOrGdgIm6KnITSOtoJrGdgNRgr6C3LQ0CqpfRCgs5DlN5I//9SsGbwkiJb+8uSyVVdTpjWiyQ7h8cCFjvj1GbKkGkzkX9x++51hkCXVPQnxWYbpmmyUPp0l/5L9+NZgtgSnktWUvTqkwqUg5u5yxM/YTolB3vhUhNKKSp5FeUNWOq0CjSk5aegFVT8Tiacd0XGRL/ikm//nX/Mcnm7kty6NcrDuVlVTV6TFbBUyqVNxXfMHM/cFddOe3rR+SCLWl0YnvFjIcxrYZmG5WHGtVMNumreZKfhUNbVsJgKW+FHlWBmlpaY9/shSoDeYOWiSdECFBS+imwbz+q9cY/uNt5GoDVksNyQ/iUGh1aCqC2frZxyzyzKRSb0bpOZdPF54jraIBq9BASYKMwoam5vELkYK5gqRb1wlIVKI1dF/3x6xK4tb1ABJLtDzKRkAbupkhff6D14ZtISCnFoPVQk1yGHEKLVpNBfd+HMkniy6QXtGAufQi8/65CI/UcnRiGAxKQo+vYpH9NdKqmrsp5qo0Lu3bgZNPDIU64zPHNMwqWQc+PaNqCFoebBnKG//xGsO23CKntpl9Slg8Cm0jZque5MPjGTT3DLIyHRYE9MpEkgt1ND7sKZlRyQLwCkigpL4bJ6SaVcgCvAhIUFD/CDiC9gFbh/+B/3xtKJv9sqlpsmKpSSEsXkG93oxVn4LDhI+Z6yoOVjfXB0GdzbWdi9jokUh5y7gR+kwubV3L3nP3yalrQmx0P//vSRFqUsYT/CCNcq2hg/Pi+RZ7Yo+f+Ra9hZJIT7aO/yO/+t//i//+eCaLlq5izeplzB33AW8MXIF/UU3nr3hPkOuECCGgzbnBjxPfp2/f9/nn5DksXLiY9c5hKOsNmEvO8c2grzmRVIZGncjRyX9nylFxfEVLScJ17L9Zy9XCR8JpVfmy5rMP6D96O8GKzraWnijIUxKsqHzXMvKD/ozefpdStfHhfoI2Bx/7SXzQty/v/XMycxYsZPF6Zx6UiN3JEs7P/pivjydQWqcm0XEKg6YeJU6paR4XsTaQcHg2Qz4azKLzMlQNFoSmbFx2OxKWVYL2mRM+m336fEB/RtvfRak2PPTp6V8EtLk+2E/+gH793mPE5DkssLEPRaExYLWWcXn+YGY4xlKiMdsEyGvHt6y7kodK13KWWivxWzeKAf1HYx9UhLprB10eum6t9GPdqIH8z6htBBbWPqqjgpbcm9uZMqAf/d4bwaRvF7Bg8XpOhxbbJt5ay66wYMgMjkYrqGsdlmqIwnHqQN6Z5oSsTIsVPTlBB/j673bsCcmkwtDREMNDV9p8aS9CZjJd5jDi0yVcyFI9cTFvc+BL9fVnFiFxsmIpeWnxxERHEZuUSnpGJpmZ6aQkxhKTUkSdwfzYmMhPo9UsQuI8ocwXuUVvaUCVn0b0XT9u3n5AnCwNeXk9RouALnA1g8bYE5KfTbS3PXZ/G4n9nTRKGo00qNM4NusHzudVoG05JwRjFTmRV9iw7BARpW0q608ryFOOEjBW5RB1ZSPLDkVQUdv2hLfQoMonPeYufjdv8yBWRpq8nHqjBUEXxJp/jMU+WE52jDfbv3qHz+0DSFXosYgtIaEaP6eTeDnNZ/QXq7iYUklDfQjHHO9SUq1/zpW1xaerok/hlD/m01OKISbb2KcT08I+tg17oe4Oq4ZMYGdAMkU6C0a9mvQTs1lyPpey+pY+u2CkKieKq5uWcyi8hJpuEiExrrlRV9m0/BDhJY9fKC0NKgrSYwj28+X2g1hkaXLKNQbM
goDmzmqGTdjJLVkhWovYlgNDsidO+xYwcNBKvHJUVGffw+/WQSZ/8gOXMspo1ddnUGvZ1H6ekIBOEUfI/ZTubRU+37FO7fEzi1CnfPsXdhZFaBb9Jx62XeVftFNkNRtoMpiwPJxhLT7+sYj3PhjDoq0n8b64hqF/Gcqc/TdIVYvdhVLc5y/DM/+RCCHoyA2+yMXAdFQN4uBv9/wJulyCL10kMF1Fg6mDXKxmjE0GTBbhoQ9CjTffvz+AMYu2csL7ImuH/ZVhc/bhnVJLo9j+b4zE5fR9FMVpeG+wY/yaSySGHOHAZTmVL3BnTfTp3qVLT/fpGSisZmM79mBKOcjYd4cxa7snceLAtGCl7NwiVnjKH4kQAjr5PS5fCiStQtd9A+yCDvm9y1wKTKOiwwFmK2ZjEwaThYfVBxOph8fRf/gs7C/E2o4TMCC74MqdJFe+HTSenX4BXPF+QEbYASZ+tYt7eVWPWlnP4GXbJNRwdf4AJh+MejRjWrBgElusHVSJ55nrqe2/WBGqvraYQQNHMH3ZNhw8Iyj+iQ+wmmpyiQmPJiVHQXVlNtERiWQVVqIzWRE6EiGMaKsqqNYam1sX3RVZo5aqimq0RstzWihtHDDVII+JIDolh+KqSrJjIkjKLKRSJwovmDPdOXGjgGqtCV2BL5snjmfO9DnsDC2l9kVaGD/Fpzbutf8qaOTERCSQkd/SksNKqUd7EQKjtoqKai1GW3OuvZWu+v1T4ipQL48lIiGDfFvL2gpNSZx3CaSgIpkTUwcyct5eriSVUnNvM6O/c0bWclv/mV6LD7AGenJy1xLGDXyLiYfjUda96KX2mZZ7ZOMvVITAVFNAqkxGSmYuBUrxSv+i/ez2cRCwmC1YxUuLYMEsTgmwXWWs1GRdZ+Wno1h5ORalvrXVI/w8VyHhUQunvcfP+i1YzFis4rFtygXoy1Lw2z6PZacjya8zYrU2UOi/lQkjFnMuQ9U8cP0sw+K2n+jTU80+xhusNdl4rf6MMSsvEqVooLUBKPwsl/2fFtdHvIGmClKubmLFofvkVdcStnsu6zyikCvyuL97PAMm7+eeXP0CUxFMaCuVFMkzSE5KIlucYf5iI9lPRd2TG36xItT9UAWM9eXkpqSQW6a2zVfq/jy7LwdLYx0V8kxyleKkyGbBtjaUkibLpUrf1fNbflo5BKOWcnkqKbllqJs60QL8adl1/VHWRupK88gtUdNosqArK0BZ14jJ2EBVQTpJGcXUiHfUuj7nl9qiJEIvdXh+TufE5p3Y2ns8T8HaLuHxzdKvLiLQQr+LrPUuM5II9a54Sd5KBH5xBCQR+sWFVCqQRKB3EZBEqHfFq+e9lXpnP1MMXh3Qr7wImYrCCYwvRftyvifiZ6rwL5aNtUZGUFgu6i6a/2QqjiAoXkl9Y++9vfxi5Dq5l7WG5Lth5NbqOvl8YifzeUl2f+VFSNAreHB2P/tdbyEr7c0vMuvGGmWppzDyMg67jnMzpcw2R6orchP0JYS5H+CAqz9Jyt73IrOuYPC4DQv1RVFcObKb4z4ySrXtHnR+fOdfzK9XXoTAgrYslUDXbSyYNoVZi1azdY8DTm6eXPP2wdfPhwvOLpy/4YuPpzPO573wPneKo04e3PQ6h7PzeXy9z3H6qBMe17045+zMeS8vzp06ipPHdbzONR/j1XKMj+8NLnleITDoHuExSaTLlagbzbzYTSgTuqpictMSiA67z51rl7ly0x8fz9M2f3xa/fF6cX+uu59sPtbm1yX8fH3xu+mD95ULnDlxmN2bVzB36iRmLtvHhfuZVDR05Ykh3qZOJfCMPQunN7Pfsvvwk+y9fbnh6YLLeS+8zp3G0ckDsawuLufxFcvq6ITHNZG9SzP70444eVyzsReP8b3hiYvrJe4EhRKdlI5cqX7xeWNmHVUKOWkJ0YQHXcfNzZMbvjfwdOk4r1b/rrmffMxPW12x+XmTG5c8uXTDl5u+vvh4X+HCmRMc3r2FFfOmMunr
Zew9H0JGuXhBfDW6ZJII2a4nZhoqC8mIDuDC8Z2sWTiLSWM/Y9jHA3nv3bfo26cPb/T7I/3eeJ3X3+hL3z6/47e/60O/vn14/fU3bP9//9vf0efNvvT5/eu80bcvfX73W37X50369vn948f060fft/rx57/+hbfefof+7w1k0NBRTJq7mr1ng8hQ6R9/9EDQUxJ/E+ddK5ht9xlDBg3gvf5/4+2//pU/9fsbffv9mX5v/P4n+/Nma1lEv97ux1/efpv+H3zIJ8NGMm7ybBav38upq0HEZiiofWGx7MxF2kxDVREZ0bfxPL6TtYu+YdIXIvsPeb+Vfd9+/xL7fn3f4PXX/8Cf/vxX3n6nP+8NHMSQUZOYu3oPboHpTwiroFcS7+vC7pXfYjdyCIMGvEf/v73NW39+kz593qBvv3688fqz4/yQ6xN1pB99+75Fvz+9xdvvvs+Hnwxj5LjJzF68nj1OVwiKSae4Vnwk6NUQILGmSCLU9nyxGtDWlFOcl01GiozE+FiioyKJiIggPDy8yz4RURGEh90nJDiI277XOO96kqMHdrJ55QJmfbsap/sFaAxWrJp0vPYsZd53i1mzdRcHjp7E7aIXfrcDCQ4JJSw8iojwrvNN9CsiMpKomFjiE2WkZuSQr6iwjQF1+znRyj4/p9vYP7gfQnDQbXyvncfV6SgHdm5m5YJZfLvaiXv5dbYJp5p0b/Yum8f8xavZsusAR0+64Xndj9uBwYSEhnVNHYiIIiIiksioGGLjE5GlZpCTr6CiVnxx26sjPq2nniRCrSR+7v+CgNVqwWxqQq/VUFtVjiI/i+RIP9wOOHA9IQGv4w64XA0kSpZFQUkFVbUadHoDJvExklewsv6rIRIEK1aLGVOTHq2mlqpyBflZyUT6n+WgwzXi4704ccSFq3cikWUVUFJeRa1Gh95gsj2uIyH/VyPQ8fGSCHXMpedSzQ1UFSUSdOM6vuGvXtO8J8Cb9dUUJQVxw8uX8LQiarql29kTJesdeUoi9DLGSTBSX1XViQHrl7EQvcsnwVhPVZU4YN38zp/e5X3v9lYSod4dP8l7iUCvJyCJUK8PoVQAiUDvJiCJUO+On+S9RKDXE5BEqNeHUCqARKB3E5BEqHfHT/JeItDrCUgi1OtDKBVAItC7CUgi9JLGz1BTTHr0fRIKdTS1rlf1Qr4K6CvzSI4IJ6Ws4bGFETtOfyGjv+CdDNQWpxMdmkChtunxdemfV2pBT1VeChHhKZTq2i68KKCvyiNFjEGprk0Mnmfw1dwuidBLGXcBbUEQe7/+kg3eBVTrOzOV30pt1jW2TJrBvhBxhdrWY9unWykLOYXDNRmV2keLJ3YLDms5IaePcC2pkuasftpL47vFN0FH4d39fDN+A9fzKukc6lqyr29l8sy9BD+2aKeV2uzrbJ08k73B4uTHMu47H+Vakor6bkbdLYy62agkQt0M+Keat+jlOE39mO89c1C1Xxf7OUZN9bHsGTuaLXcKqH4oQtA+XZt+m5tRxWi6+30+gpb0275EFWlotJjJvb6NfdezqWxZFvk5xenmzRb0eaeYPngxFzLLO7HwoOiWCW3cPsaN2UKAvIqmVr0Xt2jj2DduDFsC5FQ11ZNxRyx/HfpOtWq7uegviflXRoQsdUpyszLJF59SN6lR5maRma9CLy5WJ1aamgJkMdHEpRVT27K+vaBXkZucSHpxLU1mK1ZTPaVZOSjraynOzLItVWwWTNQUyIiJjiOtuBajuf1aCQJ6VS7JieIjGE2YrVYMagUZuWXotdUUpSWSodRgMDWgypWRkKFsXi3VquL87KEsORdNVGQ8KYU1Nh+a6017my21SWigLDOemJibbPxsJJtbRaij9CYj9cUyotNKaWjUU1eSSW6ZnobqYtLiE5FXtnma31BNXlI00SlysiNvcMEnltL6JgwPmRVR0/T0VSIEUz0KWTRppQ00akrwWjmYLzddJTiuCLV4nFWPKjeZxPRiam2/DagV
GTZ/tNVFpCVmoNQYMDWoyJUlkFFSj6GjJW6EdnawYqgrIbMta3E57IesS6g3mLGoLjBn+BI8xIdK41MorGls8yYDkbWclKR0imxPt7dEoKGcrPgYYm5u4vNRmx+JkNBAeVZzDDZ9PorNAXIqtfUokmNIU+poMhoestZVF5OekIhcXNix9b1uImtZjI11VtQNPH1iUWqaMNQWkhwbTVzqs1m/JLrSKTdeHRGqkuGydDxL3NJQaSpIPrMcu6VnSVfpMFWGcNj+OL7BAZxcu4sbRTXUFd/F+fgFbgf7c3L9chzvF5KfcJFNU6byw/ZdbFq6BMdQJfKAQ9gf9yX4lhNrd3mjqm5oEwAzirsunLgQQLD/SdYvdySkMIdI15VMmLmGw8dcOOu4iunTV7D32GnczzqwfNo8jkWUomlScf6bAYz94QDO513ZtmgBu29mU6M3oAhutenEhhWOhBSpaTIUE+R0kNNeQYT4HmLa/wxl7a18qnUdp1dVZOD14wymbg+gIDsKj7UTmbHqEMdOueNsP5dJK86RotJhacjg0p5DXLr3AP9jy/lu6yku306lqiiQwzuOc/PuLZzW7ca7oOopa5MJqDO92DZzKttvFaIqknHq2/cZsfgY5/xklNblEehyggsBd/F32sAKx3vkZ0fiusqOmWsOcczlLI6rZjBjxV4cT7tz9sgKpn/nSFhJHY/1bMwlBLuefMxOYU0x8W6rsft6NYdaWM+YsYK9jiLrI6yY/h2OYSXUllzg2w+/4Id9pzjnas+ihbvxyaymwWKi5J4rJy8EcNffiY0rHblXUIu26C6nDp3meuA9fA/N4P2ha/ETW0JGBXdPHeL09UDu+R5ixvtDWeuXS26SF/ZfT2O7fwGViiTOr5/EzFUHcTzljov9PCav8EBWrsXckMHlfYe5GByK//EVzN96iosBKVQWBXJk5wl8gvxxWr8H7/xKtO2vdW1qXW/7+sqIkGCqxWfJh0w+kkippola32X8fepRZKUajFknmPz5Qk4GJpMSFkuupoLAHRP5av4W9h89wvoJHzB0gz/5BeHs/eJDZh0JJDohjcKaBtKOTebzhScJlKUQFpuLvvHRqSFo7rNz0kTmb9nHkSMbsPtgKOv9Msnx28Cwf67iYmQahfJLfP+PiewKiCNbkcuZbwYx3yOLCm0F57/5mDnHH5BWpCT51ByGzTxKRIYP20Sbm1tsDhBt5lFwZyfTFh/jQVYp6spwto8axeY7eRQE7uogXVxltQ7ZQTuGr/Imt6SU0G0j+XSZB2EpBRRH72PCP9fhK69Cm3SESVN3EpRdSnnoNsZN3MXtzAp0mSeZOnoRJ+4kkSwyq2986qtITRoZh+w+ZZV3Dqq6GgLWDGbS3hDSiuqovLeDSRPns3nfEY5ssGPAsPX4ZmTjv/FTPlvlSXhqAfLLi/lk0i78Y7JQ5Lrx7ccLcM8oQ/uw+yOgCd3NlEnz2by3jR25EuXdTYwYuQrP8FQK5JdZPHgSu/xjyFLk4vbtxyxwz0CZf545g+dy7H4KhcoUTs/9lK+PhFNcfJfdUyYxf9NeW/y+GjiMdT4JXP9xBj843idDqaYyYgdjRostoUoqH+xhxg+O3M9Qoq6MYMeY0baWUKlKxuGJI1h1PZsKdQ1h20fzz2XuhMryKY7ez1cj1+GTU0m9zJGp03dyJ7OE8gfbGT95F7fSy9BlOjF97CKOBySSHB5Hrkb/VNa9TYBEf18ZEQIDwWs/ZrqjjDKNBcP9DQyecZyUsnosulx8j6zju8ni1fcsMco4DowfwkKXMGQ5hchTY4jLrqSpSY7TlBGsuSGnqlE8AwR0ub4cWfcdk+1msvpsNNW6RyJkznRkwpCFuDxIIqdQTmpsHNmVehri9zB6wi5Ci2oxGh6wedg0jiQo0VgaubX0I75xzqBMU9HcHbvUPCbUFL6F4SM34ee/lTFDF+Lc1qaqntRjExmy5BLZqgYESxaOdl/wY2AOkYc7ShfHiiwUu85g
5LqbyKt0pDlM4IttQRTUGLCWuTNr+HKu5qjQlnixxO57XMJTkXmuYPLKi6RU6DDrcvE7uoH5U75i5mo3osq0Tz8xLApcZ37Oupu5VDY28WDzcGY6JqGsM5B57CuGLXQmNDGbQnkqsXHZqPQNxO8di92u+xTWGDGEbeHT6UeIU6gxNwWw/B+zOZ2qpP5ha8BM5rGJDFt4mvtt7TQYaUrcxxd2uwgpqMFoCGPLiOkciVOgNjcRsPwfzD6dSon8vK075pkljgk1EfHjCEZtukXWg4NMHL6I0yEJZLfGrzSJIxOHscQzg/IGAUv2cSaN+5Hb8nJSTk5h2BJP21sRBUs2xyeN48fbcqr0xZz5ejTrbohlM5N+9Cu+3HaHvKomrGXnmD1iOVcyy6kv8Wb5pMU4hyYj81zF1FWeyMq0mHVybjlubGF9hsjS+qez7oUq9EqJ0P0Ng5l4MBZlnRl94Cr+McWBRGUdalk44ZkZJEVcZt34Cey4Hc6hiR/wxfYgCtQGrIIVo9GMYM7j5OSRbPDLp1ochRS0yMLDycxIIuLyOsbb7aBEWfuwGlgUZ5k5YBzbg/JRG8S164225XqNCbsZNWE3YcW1mAwP2CSKUHwJdZZG/JcM5JtT6ZRpytuIkJXK64sYtfgcyXHHmD5wHNsD86h9aNNMifssPpx6hBilBrMplUPjRrHplpzk0193kC76b6HIZTqfrfVBXllP6uHxjNt2t1mESt2ZNXQpl7JV6KpvsXPdIc6JL1PzDyIyR0WDyUy9LIKIzHQSI66wfsJX7LhTSE3bkdmHFABLES4zRrL2RqsIDWOqQxyKOiMK0e9x9tzJq8VgFbAajZgFIwl7xmC3O/ShCA2f5kBssRpz4y2W/X02TilKNA9FyGIr/0fj7Lktr2ljR8CYuIexX+0itFWEPp2OQ2wxanMjt5b9ndlOKSjaipC1Eq8fxrDYI5mSTDe++ehL7G/nUtPK2lCCx+xBTHWIpLjOjCnVgfFjNuEnryDv3GwGTaDKfY0AAAZUSURBVHUgsrjOFgOH8WPY5CensqHQJsJrvUURMpJ2xI7x2wLJqxIF/xyzP13KxcxytNUB7Nl4CA/P6/j5BxKZXUGD0Ux9cmQz68grbPhqIjsCWupfW8a9+PsrJEIWiq8uZtT4H9hx5DSu2yfz3oAp7LiRSvGVTSze7cHVq8dZOn0p7okKMm7aM33UBOZvPcgJFw/8EgvJeXCcOe//kU9/OM29fC0GSx2+Py5mt8dVrh5fxvSlZylVaR9VB6OCW9tnMNpuPlsPnsDFw4/EwixCDkzh7b+NZ5tvCmn3dmH3l/eY5nCXjJQANn3+Jh/OOkFIfg6uc4Zgt+Io5y66sn/zDs6EF1GnLeLWjmabW1psJpRq0RX4sHHyBOZtPsBJF0cWDH6X0SvdeRDlyZqJT6ZHJIZy8rsP+eOnS3C+fo3dU9/lrbFbuJFSgMxzCZ/0G8zCM+EoCq+wZtYCVq9Zy4YtOzns4kO8UkulzzZ+2OPOlSsnWDZjKWcTytB2eOfHQnm8Kws++hMjlrgRpqgj8fAEhtgtZqtTMJmp3mybORq7+Vs4cMIFD78ECrNCODj1Hf725Y/cTE7j3u6v+Ov7UzkUmEZywBZG9/uQWceCydc8Wp7aqAhg58wxLXac8fBNoLQmn/DD03jnnS/50UdG6r3dTHzrfaYeCiQtOYAto/vx4axj3I08wewhX7HCwQNP1wNs2XGGsAI1TU0Kbu/8mjF237H5wHGcPXxJUNaQc2MTU+zmsWn/SVwcFzKk/2hWuEVQmn6dzVPsmLdpvy0GC4f0Z/SKM/j4HmXu3//MiB/OEBwVxMGv3+PtsZvwkuUju7iMIX8awgKXBxQVXGXd7IWsXt2GdUk9lTe3s3TPWS5fOcHymctwi1NSb3rYF31U33rpt1dIhMBUW4AsKpKE9FwKcuIJj0gmp6yexopckpNTbJ+U9Hyq9CZMunJyEsIJDY8hOauAUrWeevEuSUwk
8akFqLQmLIKZKnkyySniJ4X0/CqMpoeXZ7EJgK48h4TwUMJjkskqKEXdoKFCnkx0VCLZZXWoK3KRRccgy1OhqS0lKzGa2OR8Khu0FKbEIpMlkZqZTXZuScs7npttJra1Kb6Ey1yPMjOB6LhksvPzkUU9ID5TQW2DGkXGk+k1tRXkpcYSGZ9GgUJBtiyaqIQsyup01BRnEB8ZR2pRNeoUd/YccOeK9y1u+3vjcWAZ83b4I89ORfYYMyOWDs8LAX11AamxkcSnFdnY6pRJhIXHk55Xga6pnrKcRMJDw4lJzqKgVE2DpgJ5cjTRidmU1ampyE0mOkZGnkpDbWk2SdGxJOdX0tByZ9N27ll0VOQkEvHQTi2NRi2VeclERz9indyGdXZSM2tVbT7JsTJkSSlkZmeTW1Lb8iJ8C7qKHBIjWuKXX2qLgbG+lMyEaOKSs8jLlxEVFk+GeGfUoKG0JQZZeWIMwojPKKJUmUuKrfyFqKrLyE2JsbEurdM2s46KI7WwCnWqB/sOunP5MdZ+5GSl2uqXWEfF+lnZ8DTWvVOFXikREkMkWMRXdYqT5SxY2uiFYDFiMIjC0uZMEiyYDEbbS8dtqYJ4XOunNeACFqMBg3hCtDm0dWtrngajuKJG8w6PbLTaavuf5jwQsFqsNn/F2/7tbYvlaGuzOT8Bi7k5H4ul7cu5Okh/WI62ebf/biL52BxWuMWTX16PTquhLPooq/YGoqhuxCr60J5Z24K3fH+svLY0K2bx5WEPeQlYTGJXtbWc7f1o95vm309m1d5OC8unlrV1uxWLtflY8QXzD91q9d9isnXHW+NnSxYsmG3+WrDY6lOrN+1YPzXvdmUSTKQc/46VZ2KRl7WwjnFkzd47FFXqX5h1qxe96f8rJ0K9KTg976uAKsSBlUs3sOekB5e9buDjd4eonEoaH2vx9bynvd8DAdX9o6xetoHdJ9y5fL2FdbY4l63N1bL3F/SJEkgi9AQSKaEtAUuDivysLHILiilRllJWWffia3a1NSR9fy4BkXVBtsi66JViLYnQc6uGtINEQCLQnQQkEepOupJtiYBE4LkEJBF6LiJpB4mARKA7CUgi1J10JdsSAYnAcwlIIvRcRNIOEgGJQHcSkESoO+lKtiUCEoHnEpBE6LmIpB0kAhKB7iQgiVB30pVsSwQkAs8l8P8BQIzxTVPvPikAAAAASUVORK5CYII=)"
]
},
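{
"cell_type": "markdown",
"metadata": {
"id": "embConcatNoteMd"
},
"source": [
"In code, the lookup table from the figure is simply the row-wise concatenation of the user and item embedding matrices; this single table E⁽⁰⁾ is what the network then propagates and refines. A minimal sketch with placeholder sizes:"
]
},
{
"cell_type": "code",
"metadata": {
"id": "embConcatNoteCode"
},
"source": [
"# Sketch: concatenate user and item embeddings into one lookup table E^(0).\n",
"# The sizes here are illustrative placeholders.\n",
"user_emb = torch.randn(1000, 64)  # one row per user\n",
"item_emb = torch.randn(2000, 64)  # one row per item\n",
"ego_embeddings = torch.cat([user_emb, item_emb], dim=0)\n",
"print(ego_embeddings.shape)  # torch.Size([3000, 64])"
],
"execution_count": null,
"outputs": []
},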
{
"cell_type": "markdown",
"metadata": {
"id": "HKqah7XFUhan"
},
"source": [
"#### Embedding propagation\n",
"\n",
"The embedding table is propagated through the network using the formula shown in the figure below."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ZvrR6lvUUuIm"
},
"source": [
"![image.png](data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAXUAAAApCAYAAADKxrnwAAAgAElEQVR4Ae2dd3iUVdr/33/23d/qirvrvjZcxV1R0bXRV6SDFOlSVYr0IiBShFBCCb2ELtJbSEJIMKRDAmlAAiG99zLpk2QyM5k+z+d3PZOZZBKSEAUbO3Ndc81TznPK99zP99znvu9z5n+wfWwI2BCwIWBD4LFB4H8em5bYGmJDwIaADQEbAthI3SYENgRsCNgQeIwQsJH6Y9SZtqbYELAhYEPARuo2GbAhYEPAhsBjhICN1B+jzrQ1xYaADQEbAjZSt8mADQEbAjYEHiMEbKT+GHWmrSk2BGwI2BCwkbpNBmwI2BCwIfAYIWAj9ceoM21NsSFgQ8CGgI3UH1MZEBRJ3LpXgk5nfExbaGuWDYH/PgSM0hgikirR6IRmG28j9Wah+R3fEGREnT/EpXgpGkPznf87buFvt+qGMlJTi9DqDaCWkp0QQVBYEpqaXBKTS9HobYPsb7fzfqWaGcpJSy1Co6uVmZzESK6HJaFW5pCUXIpavG7+CKoMvI6e56ZEjrYZUXpoUjeUpZBSpMNgBHV5NgkRQYQlV6DOjSepVGu6bqmQ7feXQUCTeJLVmz1Jq9BgfJw4vY4wRWGzEGYyWq3+JwCrpyAxmVKNnmbejZ+Qp5KEi4dxjiqhRicgVKdzafUYhix1QyorIeb8AS5ElaLS/5Y7xUB5aipFWh0G1EhzEom8Hk6SVM1PqbZekkRyqZpfbcLYFGGGJ6HW/DSZkSQlU6rWPVKZSbx0BJe7RSh0RoTqDDzWjuOTpa6UVhQTe+EQzneLUVo0c0FDYeAu1h6NoFihoylJejhSV8bjcuAC98o0GASB6rSLrBw5iGUehShK73F2nxPRUvHeT3g/bI/8NAQEKQGrp7HOLw+59nECvp4wRVIUCdPNbjRDll6iQq5pJVZ6KgsKkekNCBipijnPgQv3KFPpm3w5WplpXTJ18hm+XX2G6OIaEwEK2kLOftGBARtvIlVoUaScY+U3R7lbXnu/7sHf0IEywY3vnKMoVuoQhGoy3Nfw6dCluOXIULdSnPRVEgplepNCZ6yKxengBaJKarDw0i/XXCWJbkdwiRJJsZYw3deO5ZOlFymtUreyGnqqJIXIdAaMoszEOnHowl1KRHxamUNLyTQp57Bbe4YoicIsM0Wcn/oOH28IoaRaiyLVCbtlR4ksUdThZ5TdZNtsO9zTK1A1oZE8BKmrSTq5lJVn4yhTiy+JgFZymomv9cMhUopKJyf5zHIWH42mSlM/fWipgbZ7D4+AUeLKrGHfcCVPzuPE6SbCtDtDjDVhfv4mAzbdpEL5IK1LQJl7k0v7lzLpS0duiaQKGBUpnPt2CUejpI9Ae1YStmkC888kU6Eyy7vmNut6fsRS3wKqxc7Q53N+3qes9c3/bQ646hTOrlzDmXtFKEW1XNBSeG4ybw/cSFiR3IRZSxIqKHO55X6AZZ9NxzG8rFa7NCpIOb+KpUfvUqZ8NINnS3WwvqdOOcuqNWe4V6i0Isx/8/GGUErlrZGZW3gcXM7n0x0JM5OqKDNOdks5eqe0Xnu2LvRHHSsJ3/IZX52Op6zGIjORbOjbm6VeOVRpRJkp4MJX41nnbT4X8xfkhK77lJknYihT3s+tP53UFSHYj5nL2fTqOtuO5tZqundfytXiWhD1eWeZNXINAcW/xij9o9B9TBIbkVyYyn+mnCJDpn6EU8RfGx4FYRvHmwizUm1WTUTC/KiHiTDlD1QBBdRlOST5rqBnj2+5WixHZ2qSnvzzcxmz1o8ChfbhNC91OGv7jsUxphzLe6ZP3c/wLjNwyqiittp6Mg98Sj+7q5TLtb82qPeVrwx3YOKC0yRKVdRShYaI9b3pudSHPNmD8RHUZeQk+fFt756s8C+k2txEfb4T88euxTevGpGnfpmPkvDNk1hwOpHyOsKMYEOfXvWE
2WJFzDLjt5K+vVbgVyAzK0l6Ci4sYNw6H3JlmoeTGc1N1g8cz56oEixjjD7tEKO7z+RsitSshevJPDSegXZ+FFdZZqQClT/Mp8/n33GvVGnuq/rG/GRSV4euoseovSTWkYeelL1D+GC6M7lyswDoMtg7ohd2QeXUPGhgrK/TL3sk6FCU5ZEYcYvkMu3v21QklHFpxrsM2BxZL8h1aBqpTIwmQ6H7ddtoKCQuOhelTpzdtfKjDmdNn09xjJVieT9FwhzWeQZOmVVompiCNs7ZaDCgy9jH8D6rCCyxkDroMw8wpp+dFdE3frJ158aCY4zrOAvnvCqzmcJIyYWpdBx7kHhpjfnFE1D+MIf3JxyhtFLVuox/sVRqbq7rz7g90fXanz6VAyO7MvN8OhWWwbSl+hgNGHSZ7B/Zj1VXi+pIHX0mB8f2xy5AJPpW93pLJT34nvom9v3HsedeKQqzMqtPO8iobjM5l2ohzJazMclM5gFG97cjQCLDMgzrMw8xbqCdFdG3nE9zd42SE0zqOhunLCkqEyxGSlym03XcfqJLFPUyc2U+XSYdRlKmrMvKkHWI0Z2+4GhcWd07YbnZNKkbCrjpcoQ9Wzfj4ODQ8Lv1NGElNWR/N4p3Zlystb2JuRmLOf/Zu4w5lILM8pYJSjxmvM247zORP/QQLaBMucrpgzvZtnUrW3cdxPlmAaqHjSZQZRJ0eAmjeo1l5+1KLDNnC0At/grVJPmf4uCubWxx2MTWPYc4cngfjoddCc+SoX+Al1JQpnD1zCF2bdvK1q27OOh8k3yVxXFnpDTqEkf2bGfr1q3s3H+Sq6k1aA1gLLvH5WP72LFtKzv2X+ROsdmJpQ5mRae3+PJCHtUN8BZQJLqw64A/mXJtixq8TlZASnQU8VllKPXG1hNvU0Dp8wi7cATH7dtMbdj93WWiSsqIPr+DAwFZKEzuewMFty7yveNWNjeWNYetnAopQpH9PWM7zsQ5X2bW9ETCnELHTw+SUKEy2TpLo/3xcHPB2dnZ6uvCpdDMuugBQ9Z+RjQidUF5mdnvj+dIagWK3DDOHdrFdlG+dh7ENUKCojCKH47vZ6elj9wiKVTpMBrLiblykgM7t7P74BkCA7cx+P25uBVY6qjg6pKufLwpnBJFvUaj8VnAu6P2UVJR0xRij+aaXkZBSjT34jMpVepMznKD5DZuRx3Ztvn+d3rrqRAk8iyOje/MrAs5VJqN58YSZ6Z1HsuB2NrZh7E0hoDLbrg0wNgZl0uhZCi1tSRkyOLgqH7YWZO6oMRzbkcmfJeMtKYVI/ADUTAgibjEsb3b2LK5ET85bOFksAR51jEmdJmFU3Zl/SDr/CVdxu0nxqTdGimNCeBykzKTgcpsMhbJc0wjUheUnszrPJHDicWkhDrx3Z7tJk7aedCV2wVyCu95cvJAPU9djJCg1BoxSmPxPnWQXdt3c+B0EMlxexjRdR6uORXmOioIXPYhQzaGIKmut9lrfBfRcYwjklJFHTKC8ipLunRinmt2XX9ZbjZN6oKC4hQPlvdpz8sv9+dbt2vcCAsj0H030z4axMbwMhJ3DeadOZdNQmPKTO7Pwo792RQhtSJGDV5zOjBifxrVrRnpLbVq8ldAV5lPwqVl9HnjHcbvvEZcvvyBxNlkVtYX9dVIbm6k3/NdWRn8I2cUgpbKvHhcv+nNG50msz/wDrEJUfjtm8vIT1fhnt582JFYBUFXSX6CO8v7vsm743dyLS4feR2RCqhK0/G3H0KHt0ayyS+KzEq96QUVVGVk3TvDvL5j2eQTQ76y9rohfT9DX/6INWEiIVs1siae44sXsfdGDtXNhSEISlI9d7JyzT5cvLxw3raAmfbupMg1903vrHJu+dAopyjJjaV93uTVfw5j49V4JDU6FFme2M9ehVtKNRqjgKI4hR++7c+b7V6m/3IX/K+HERbozp4ZvRmyIZii+D0M6ziXSxLL9F3B1a+78vHGm5SYGmqkLD4Ib8/LeHh4WH0v
c+V2Tm2oGNAUqaPxZv47o9iXVIGyuog412/o3+FVXum3Ep80KRplGVmhe5jU8TVe6fAZ+25nUyX2kaCmPN2Zxf1HseZSJBnZznzZ8QtOZFbWal26KDb17cESj2jSS8RBwNTjSM9PpuscV0pllql0yxD+uLsCytQr7F61ln3OV/By3s7CWfZcSpKhqi4m1XMVA01tW4aTbxBhYYF4OM6k75D13JDE4TiyM3Mv5lFlJnXFtW/oPqjenm4siyfIx5PLDTD24PKVW2TXtEDqaPD56n1G741HqmyB1I1qqgpSiIqMIjmvApVB9NQ19RFQlqTiafcxb736Cv2WOuEbJMqMB44z+zJ0/XUK4hwZ2WUuF3MtpK7g2tIPGbwxhEITYRopT7iOT1Mycysblag9mWTmflJH48vCjmPYG1dKsSQet2UDefuf7ei74grJZWqUZVmE7f2cLq+3o8PEPYRlVKAzCgjqctJdlvDx6NW43kqnvNSN2d2mcDy1HNNYp7vHloG9WOJ2h9RiS+SgQMWFaXw415lCqdXsTp/K3qHt6W8fbGqPNUpNk7qYQh/Dph7P8Kc/9WN3ehU1ej0aeQHBm6Zi51uO5PwU3vnsNPlmW6Tu7np6dl+CZ3wWpRZiEqScnfgBs9wkKB7RtEuftpsBz73JbA8xBKjpLrduYGuOjSXHGf3Sh9iF/khSN2WuJ2lHf55/bzG+JUrEYKeazMOMefU1Jh1Pr5+1NFcRfRp7Pn6BDrPdKbwvREmg6OhoXnz1c87myxvaI7VRbBw+mwt1miGoAxbx9osj2J9useGKhRrIdZnHoGmHiS1XNWt6UUYd5IsP+zLn5G3SC4tI/P5LBs89RUyl+qeTuli8NhL77v/Hk20+4XC22TRhrCTUfgTDN1xFYooi0BO7pTfPPfkEfXckUa7QodfKkYRuYcZqH0rynJje8QtOZpmftxDm5WgyTIQpoKuRUVkhRSq1/laYImOMQq2cGDL2MqzXylpTi1l0BOl5vugyG9fcWg3bUO7D4k7P8OQLoziSWGGyhQs1iewa/CJt2nRi5Y0SLIq3LmE3I4evIyhPjk6fzfGJg/jWbGIwFp/ls/admbjxLOESlTmkT0nIqmFMP51MVVNhC83JSGuvK+9xeMpH9Jt9nPBUCUWJR5kxdC4n74kECfq4rfR74c882Wc78aXV6PRa5JJQts1cg3dRLk4zOzP5RAYVJluAjqjN/floiTtRaSXoDEYEXQ2yyopGGEupqJCjFknLJG4Z7B/Rh5UBknpTi1CB05TuzHHJuk+rrG2agYo4D/Y7bGTr7gMcOXqEg7s3s9Z+Lz/El6FuMpZST9y2/rz41JP02RpHSbUoMwokYduYtdabwhwnZnWZzPF0aUPCvHSX1GIxBLslmVFTLzMHGNVvJf51NnUQKi4w9T9zcc6sQCUYKPdbQtdnn+KFEYeJKxdNbQKqpD0Me/kvPN1xBYGSarMPR0ei4xhGrbtKdpUWwZDDyS8+YaVvnskpaiw5z+QOXZm4/hSh+TVmX6WS0DUjmXkqAanF9iiCJlThPO113px2hqwKVYPBrwVSj2Nzz2f40xMD2JstR2Mo57Z3MOlZcaSU6tBmfM/Y/t8SWCaGZxkpPjWOdh9MZLPTbYo0Zi+34gbLB00zOVMtFpnWymdz6Qy5Bxn6wjss9Ba9z02n0pUlcsPTBScXbyJyquu0eWN1DhG+bpw7c54rkfkozIOPsfQEYyykrsoj0qSNuHP5WhzZCTfw8ryMu7s7/jElaPQGSmIC8LzsgYdvFBK1hvR9g3mx03KulSlrIwRkrkx+5QWGH0hDVjdD0VGWeIMrLk64eEeQU12rXWPI5dCwF3l3oTclivsbJD0znrbtp+NS1CiaRZ/IzgmL8CixkL2RohOf8tI/J3OuTqMFdEkcGNmJ8Yfjm7eLCpX4Le3CC69O4FSGzKQ91+THEpUkQS6GgjUNM+gz8DjhTaaiBW1ed5f1H4qkPpwjuZawOAHFrbX0
776QS9nVaAU98dv68tyTT9J/dxpVKgPlEb6EpGUSm1KKVpPJMTNhik5RY/EZJr3WiYmbRMJU0xoLnLEkFq9d43mz3UCWnQkj12zmUoas5JPpZ0ipNDuWhQr8v+7E3//cltHfJ2JyzBqzOTi8LW3++BTd1gabZwc6EhxHM3J9MBK5OFXWke/xDZPXX0MimrhUGQR7BnArPotytRgOB0J1MJtm2OFpwrg5UMXrejIun8QnQ476/uCGZh4UqPRfzn9e+icTTqSa+lqoKSDuXpJpKi9aAvUJ2xnwwlM82W8nyVIl+vJI/ELTyIhNoVSjIfP4ZwxZ6U+B6BQ1FnPu8zfoMnEDZ8IKUDU3w7OujbGEWO/dTOjwKgOXniI0u6Z2MFOGYDd8BqeTrGfx9Q+qUy/isNYRJ79wohJSycrOIi0xhoiAs2z6dhte6bImcNCTsGMgbds8ST9REVDqkUb6EZqWYZIZjSaTE58PYaVfPjKtgLHkHF+8KRLmaRNhtqo5pXH47JnI2/8cyDcnQ8gWTVmAMnQ1I2eeJqGs1lciVATwTdfnaNN2FIfjyk02bmPOYUa/8hf+31NdsAuSUC2+2rpE9o8bw4agPFOdQEfBD8uZtiGAPJkGoyqTkCtXuRWXSbnKUDsrrw5h8+w1XE6taISBhoAl7/OPwVuJqrO/12L6YFL/03t8efA058468MXnO4mSmTU3XR6XFk3CPkhcTGFElXGdy/63SMiWmkgBBKqv2zN11RWyFc0E6xvKSA4L4lpAAAENvkHESDRNLlxqmdRF27ErDvb78bx1lzBXe6ZMXMeVLAWa6miOLV7Ooev3uBd8gqXjZ3AgotykBTQgdW0RXqsmMnnVYTxuZ1OWF8GpxYPp0ncBZ+LL0RiMyDPdWfXZXPb4JFCu1ZJhTeqaMqKOTeXDj2ZzOq4Sk2lOUJB40QH7/Z7cuhuGq/1UJq3zJEuhxfiQpH65jtR1JGzrw/Md5uBhNQAYso8y+rV+bIy436FS91opA1n6wTM83WMj9ypqBdVYay+oS9LkgeY6qyZv47ZIEE0mEAW5KVIHQfYDc97pzNdeRch1FlL/E+9O3ceJs2fZPGUyOyOlZjNSPWEWiphZEabUFE7bXOH11wVlKZmxYQT4BXMntZBqkwmlmuCN07HzzLCaURmR+i2m49+fou2YoyRVqtEXujD9o7d45Yk/8HS3dYSUKNHrEnAcPYL1N8SBr3bYM8oScd22Ddf4StR6IzrRGVw3IqpJubSD3W6xlDwwLl7DjdVT2X6rDPn943x9oxocKQla3pln/9KD9ZFltc5BoxhZXf+xkPoT70xhz7EznN08lSk7IigzTz10BZdZOmUDVwvEiDYVmcFXCLgVR1a5usl3sT5n85GgpDQzjvAAf4LvpFAoE+35AtUhDsxc/QNploHT+kFjKX6blrLLK44i0YxodU/QlJPotpbFjqFUKixuSksCC6k/wTuT93DszFk2T53CjtulZsx0FFxextQNV8mvFmVGJMwAbsVl1RGmJafmfk0yExfOVf9g7qSIseqi2a2akM2zWP1D7cBpetZYgf+SLjzXpi2jv4unvEZPoessev+7HU/84Wm6rg4yDay6xP2MG72BwDxLJA0YZUlc3LEd11jRSmBE30hmUj12scctxoRNnSiZCtUSvvo/vNR9mcmJay0mrSD1Tsw/484VDwdGf7KBiEpLuJORqvgLbN5ykaRq0f6jQ2e9ykidzMWtO3EzEWHD6tSBqE4n8NxJjn5/hCNHrL5HnQjJVTWpgbVI6oYcLsztxZBFR/EOvUWY1waGtGvPl075VBdeZev8bfgVKtDUZHJwZHsGbLmFVBwR6zT1YnLCT7N563H8Y7IpU2gx6BTk+3xDt3/0xD60DJUetAkH+Xq1OwlF4oIBA5n7BvNC+09Yum0DS6YMY8DYZRy7mkS5pjbCw5DjzNzeQ1j0vRcht8Lw2jiUdq9/iZMYS67/6Zr6jgmLsCb1iLXdeO69RfiK
MbVmkNXXvubf7SZwrIXFI7q4rfR9rg2vzbps1jrreqjJA72inJLCQiQ5LswZvoIr8enkSYoorlLVTVvrHmyG1NHfY1PPVxi2J4Eqldasqf+JjnNP4vaDB5vHjmBDeFmdqUMkTBcLYRoaE2Zdac0fCIKpboIgYPoC6pRL7NjtRlypyhTHbHnYKPVl4Qd/589tx3A0qZws5xl8PHsziz78P576W3fWhZRQFePIqBGiLVpetyhENHVVZUcSHNOEZqsvJDb0LtktLcbTKygvKaJQkoPLvJF86xlLWq6EouJKDA8aZPXxbB/wIn9pP4NLVqRhaZP4W0fqHWdzzPUyHlvGMXJDGCWWkcMoI9F1O9td46gQZzI6MVKqmXfXOuO6YwFBNMNYMBYfVafgvnM3brHF1DRlRlFeZdXcfUSJcfCmsOwwzh+7xJ1CpUnLN1YEsGb+Xqoq5HWl1B7Uk/oHs47hctmDLeNGsiG0uFYrFmM3ZIlc3L4d1zhpE4TZKLumTpuSmVQPdu1xI6bIegGZkQq/r+nybBvajv6O+NJMXGYPYY7DQno8/zTPdFtNUKGUmH1jGb0+kFzR9FJXngFZ9h1CYvKpMdvx627pi4gLizINqvfPRrVEOfThH+/NwyOvqoFp9sGk/kRftsVJKJUm88MZX7JEz7+lVEMVWRE3iJZo7iNgfWEMIXezqdS2MH03KCgryCcvN5dc629eYe300VKO1W9LpC5UXuLL1//F8DWncHFzx93tJHvs13EiXIq6pozMu8F4XzzDiZPfs6jXc7y1wMu0MsxE6m3fZvSir/j044k4BOYhtxJAQR7NjsGv02OFL4WKKiIOrOVIZIlZSGtJ/cV3ZnAiJBTvHWPo8OZYDkVbpksClZem88Zrw1l90hk3d3fcTu7Bft0JwsvV6B9A6pXnJvDSa9O4UNjI/KKLZvOk5fiWWq5rCV7+Ps92Wc51ixkIqHGfTrt/TeW8tUnGCk8wUnzuM9r9tR0TTltrrA0SWZ3UkODuyNaN67FfM5Ue/+7PzOWrWGu/gT0/JFPTeKFZc6RuSGHXwJfos/42FUqNmdSfoM+We+QVS0m+cg7fTEXtTMdUuoGqrFrCVLdm7mxV46YP9RTGhnI3u+L+/XGMUny++sBkgvn0+xsc+WIA810TCN8ymJfa/I3u9kEEOQxjxPob9w+Cgha5vInBzahCLvpM6t/m+6qlSvBg37ZNrLdfw9Qe79B/xjJWrbVnw54fkCtbXgFpLHHii389Q7txJ+pNSY1KsJD6k70duJNdhDTFi/N+GcitbDwGWTZ3QmIoUInbBDz8R18US1hUFlKzgtM4R9GnMXfhBQoqakxEVxO+g7EDp3Ag0uzs1yfhOGcd1VJZo0ctpP4kvTZFklUkJcXLCb8GJqsWCLNRbq071VMUF0ZUltQ0E7N+RlQEFnd+ljZtR3M48DumDFqAc2woWz9px1+f6caagKs4jBzF+sBcqhr7Fy0y0zhaziwzxsbXTQXrSdg5iJc7TMe5LnqmtkatIHWzTV3QI6+UodGkEuAba5pailkIWjnVKnFaYt1EMNbITY7MRpcbJqqJwXnLGlauWM7y5VbfFQ64xstN4XsNH4CmSV1LVlQMhXnfMeKFfzB6byTpuRIKCwspLCxCqtRjKL7B7vkL2HIxlHsJsRwe346353maQjJrNfXOzDt6lMX93uOjOSe4Z61RCRqyTk3i3+9N59ztizhsdCOt0rL9gZnUzTZ1VclNdoz6gB4LXUgVp30iaR4bzYsvj8YxMp0ciVinQgqLRNOCEaEZ84s26x4xxVqqL8+hfdtRHGzg/BTZOoBlU3bVmUtAS8iKD3i28zKCSuvNIZqAhbz1ykRONLvwQ8X1ZR/w/PuzcM2oX0jWGPf6cz0VOQnExUQTHXGILwbM51RQOHeiY0jMq0JvPVsTH2qO1HXROPT8B0N3JVBZp6lbbOpG9IpKZBoNadf8ia3Q1GrSFuFvLGz1lfsRR0ZUcnHpdVMS
akTq/RXv//3PtB0wlv7dZuCSUYksbjdDXmrD37p/zuc9+rPGyvTyIwpuNqm+IofEuBiioyM4PPlj5p8MJOxONDGJuejEDcJa+KiCV9C57QfMvJBKZTMOrDpSN9vUjXoFlTI16rRAAuKkqE1mJAGtXI7K2IIy1kI9Gt8yqkQeuJ8fLOmEmhusmbWD2+LaAQEM5ancDr1DhrnPhUpvls3ZT2VFfThf7bP1pF5rUzfLjFpNWmAAcVJV7QzKIjNNEqOlFq39NcuMxSls/ZioCCzqxLNt2tL/0/50m3GBlPJK4h2H0+6vz9Bt0iR6DFhDYG7VI1rprSd+x8f8460ZuORYInxqK9QCqcc2dJSaZN9IWeAaPv3GA3mNZYJv3bIfeawvJ+3ubW6GhxNu/b15h3Rp0wuBDBmOfPxcB+Z5ltQt09UVXDHtj5BQHMCyLs/z5tiD3DHtRyM6p3LJKlZR6DKLdzvPxjlDtHNX4jL5Vd6c7W7aFMdYeJRRbbuxMjCHZP+tjOnUhfHbAimoqV/WbCi+woLOr/PR8Cms9co1CWptaw2k7hrAC+8txk+MfjFtuGPPkA8G8K1nlildTchKuj7/JmMPRFJq3lJBnptFscaA0ZDB3kEv8NbcH+o36NEVcMVuGUcTZSgzTzHh9bcYd/iueTsGsVQ9xf5rmbstpH6dAFoi1nbn2XcX4mNlfhHjbIe/2p/Nd+sX7lj3kiALYXWfbkw+eIuiB9p6rZ8ENP4sGbuJ8HJFCzb1O2ZH6TC+q3OUAgpv5r3diYWeEuRaMZLB2lFaOxc0lgdhP2EZ7s3OMhrV5xGeGsu9mP/+Mzz59F958/OzpFWpMaoT2DXkJdr85Rme77aMawXWphdL4ea9Qkz7y1iu/dhfDQFLx7MprKR1NnWhmtC1/fnP5P2EFdSu5m6qRH38dvpbOUpNKBvLub5hEssv5SNr7eYuTWX+U68ZKwjdtRC7CzGUNpY/oYp7Rxbz9fd3qGwQoysWJpKataPUIjM32PjZcsCxo3EAAAqgSURBVC7VLQL7qRX7sc+JisBCOj37FE//9Q0+O51sMu2qEx0Z1u6v/OVvz9P1G/9GphexDBVFSbe4ERROYnGtyal1Jeu4u6k3L703n8u5teYXfWks/p7XaJrUDfncdFrLsFee4A9/eJVBi+xYu34961cvZmL3V+m07Doy0bj8i34ElGmBnF49hFeefJp/j7dj+5597N2xlrkjO9FhwnFyZGXEn/uawZ06M3jaCjbt3M3u/ecIzVcivbqSD9/4iOlbDnDk++9ZN/o1nusyhc2n3XHdPJY327xA76WXSSrPI8xxLO926MUXdke5nmO27RurCFn5H/45wJ4gy4sjyEm5eoqVg17mz3/rxLTtF7hdqEZXk8fV7V8wcNh8HL0TKS+P5dySIXTuPIhpKzaxY/du9p8LJbc0kaunVzO03Z95+t/jsNu+h317d7B27kg6dZjAsexqNNpibn63mPEjxzNn9S6+O3GCI/u247DlMD4p1qYDPYk7+vH8m7MbOEoFdSK7h77LxKPJtZEcpj7TUp6ZREp6Itf2LmDG6nPckVjbCM0da9TfZ1Zr0OX6VDydrlNQ08yCJlGOLqxl6Ct/5n//+Abjt7gQWVq7UEp9ex29us7FLbOSrJvOrB/xT5763z/Q7uMFrFxjz/r1q1k86UP+1fkbrlqtAG1Q/s95YizHd3Fn/q9NeyafT6NK1H4FNQn7R9Du6b/R7dtrFJiiXiyVEPeXMe8VMmNv3f4ylrs/7ldP2pUL3MhXtrhaVlueRVJKOonX9rFw5mrORhSYZn8NyzKi1xsxFNzGZcNI/tXmj/yh3UDmf7sa+/XrWb14Ej1e68ISq2X9DZ//uc8EavJucNLBjvUH3bgek0VpRSk5iaG47V3DCoeThIgRUnU2X7E+RiQRrmwc1Z6n//gH2g2Yx4rVFpnpwWtdluBntQL0526BJX9juR/fdHuOv7T/nDPJtZttie/fwTGv8bdn
urLcL9cc9WJ+QlAQd2E/x/3CCQ84zIr5W/BuPCO3ZH7fr5Ywu24mR+lVicwUSl16ZQWDu3drhtQFBUVp9wj198bLy5/rN28TERFJxK0Qrvn6EJpSef80+75CH/UFAV1F7da+Pt5e+AdHEh0TS0zUba77euJzJ99k59LLcokOvsJFZzeuBN7krujtVhswyDK57eOOu28wkbFJJNz24ryLL7eT00i5G4K/jzeBkVlU6Axoy+K57uNL4M148mTm0EOMFHz/JZ8fiqFMZV7iLmiRZscTEeSDt5c/oXeTKZSL6QXUJUmE+XoTGF2AQq9FlhtDsNdFnN2uEHjzLnFZ5ajU5WQnRBLk4423fzCR0THExtzj9nVfPH3ukKcWowEEtBXZxIZfw8/fD/9rwdyMvEt0SlFdSGYt0gJlp8fz0iufcybfskhHtI9pyDw3jb5TTpIsapvokVzbyayR/ejdqwfvt+9ArylrOODkTUhUIpkSKQpFJXkJN3D5/hLRYvRHUxYKsVBBg6xSURcyel+PCzIkSXe44evFlSt+BN9JpbhGxKeGmxtGMGbTNXIVWuTF6USHBuDj5YX/9XBuR0QQGXGLkGu++IQkU9GMPfa+8h7pBSOy1Ks4O/sRZx6IxIguTWEkHucvEpJRhbaBqUlAXZpNos9yevVY8ZDbDghoZJW1Zotm2qQvDGT3nNEM6N2LHh+8Todek7Hbdx7vkLskZEqQyhVU5icS7HKUS9EV1FQVkR4TRoCPN15+1wm/HUFEZAS3Qq/h6xNCstRs4mqmvJ/1sqCmLO0OgZfPc/zwXnbu2MneQ0c54+bPrVQxOKEBo4uCh6I4nZiwAEQu8LOSmVCzzEgtYdU/a8UbZW6UkXbNBWc/McLJ7HsUNBTd+QGni8Gki2s+rN8lQw6u367k5L0CyktDsR/Uk28u18asN8q5iVM1fovf4R9DtxNTKm4pIKDKv4v3kQXNkHoTWfwmLgnG2giGRpURzNfrLgsGNDVKasSOtQJRMGioqRHjmsUd6HSoVaKJxxINYfkVc2nkwRcvGSV4bNiAR4a4CrKuJATR9igWIuZjOra6p9egEonZXAdT+coaNOLWsaZkAqITxLqOtZfN1+uzMmknek0NSqWqEZnUJ1KJkS4vfsLelMoGW3IayiM5OG8mu0KLqdHKyY0OJcjfCw/nExzYsoZlC6bz2eih9OvZk979BjJo8ED69h3KbMcgcpsLR60vtoUjCya1eIphkmK79blurJi9Ec8GM40Wsvm1bokyoqldZl9fBQMaVe1W0/XXao+Mer1pfxlxK4JrdZuGNU71KM4FFHnRhAX5433ZmZMHtrJ22QJmfDaGof160rN3XwYOGszAvn0ZOnsPgdktr2x+FDV6+DyMaGTF5GYkExcbR1JaDkVVFr/Vw+f+S+Ug6NRoGvsQDBrT/u33v+cK8pNSKVLq0MuusWzAaHaEiiG+raitIMVpcnve+vJs3eIjQZHBdeezvzNSb0VbH30SNaneh9i8ahFLD4VRaGVnf/RlPVyOhpzDDH+5OytvlDbcJkDQUhbjzoE9rtyTKlCIWoQYrqVTUV1eSHZKHHfDb+Dnfp7jB3axedNW9p2+wp3catPy5oerVaOnDRKuH9/P2esZVGrNM55GSX7Pp4asA4zs+/OTul4t/vGEOGjqUMnLKcxOJe7uTW74eeB0/AC7tjiwde8pPO/kmPYCt9Jtfs/wPp5115cSfnQNdnu9SRLXBLSms/SJ7Br0Lz7edJNi8xaPgqaSgpxCG6k/WErUJHvswsHxPGE58kdPcg+uQKtTCKowVnVtz+SzuY029BJ3C5BTkJZtip23mmg0yNs0k6iuRFouRaZqOfyuwYM/5kRQIEnPQVw01Fw9fkx2v7W0vwypt9BqwYC2Rk6VtBypTIwAaQ1DtJCf7dbPi4CxjIjz+znkFEC06Hi3TOsfVKrSl0UfdOGrS1lUWjbvE60FYCP1B2EnOmWUZXnkFovOmt/4CyJIuTL3
A/rYh5n+kODBbbOleNQIGMTtfXuv5GpRU5Exj7o0W36/bwS0ZF3ewZZj/kQXKFCl+eJzT0ZN4zj2JhqpT3Hkk87TOJlo3gzMKk3T0S9WCWyHvycEjJS4Taf7xKOkmpyiv6e6//7raiyJw2f3BDpY9pcRncK//2bZWvBzIaCJZMcn3en96XTmL1rMrFFjsL9W1DBCpsmyBcouzqTnlGPEPso/yWiyLNvFXx0BY8kVvho8j4s5ln29f/Uq/ddUQFCWkBETip/vdSLFvUIsu5X+1yBga+iPQsBQTNz1QIJuBBMcEkyQ3w0Syyw7eraQkyDDf+kI5p2z+hs8q+Q2Td0KjMfi0FhJuMM0ll7Koto6TOexaNxvvBFN7BXyG6+xrXq/KgKNIt/MNvEHVUmQBrBm5np80ysbROJZnrORugWJx+ZXQJXlwcZ154h/2L3QHxtMbA2xIfC4IKAj230zDs4xFDfzR942Un9c+tq6HeIiiIATnLtRwKPZ/Mo6c9uxDQEbAr8WAsaK27ie9iHWvItlU/WwkXpTqDwG14yKArKLlKZ/eHkMmmNrgg0BGwLiskh1MTmSarTNLvO2hTTaBMWGgA0BGwKPFQI2Tf2x6k5bY2wI2BD4b0fARur/7RJga78NARsCjxUC/x+3ILT73sRliQAAAABJRU5ErkJggg==)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Co9D_oNXUlgj"
},
"source": [
"The components of the formula are as follows,\n",
"\n",
"- **E⁽ˡ⁾**: the embedding table after l steps of embedding propagation, where E⁽⁰⁾ is the initial embedding table,\n",
"- **LeakyReLU**: the leaky rectified linear unit used as the activation function,\n",
"- **W**: the weights trained by the network,\n",
"- **I**: an identity matrix,\n",
"- **L**: the Laplacian matrix for the user-item graph, which is formulated as"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "SLrIvZowUyyI"
},
"source": [
"![image.png](data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAOkAAAA6CAYAAABVnX5iAAAVDklEQVR4Ae2dd1hU176Gz1/35mhEjLElGsWSeKPx5CRGb2JNMWJLjp4cjSZqComJxxKjRlEsUQRF7EFRYjkWUETpKggoUqSpiBVp0oQBBAZmmGGG2e99hjqUGRgFIbnb59mPM7utb31rvXu13x7+gvhPdEB0oE078Jc2rU4UJzogOoAIqVgJRAfauAMipG28gER5ogMipGIdEB1o4w6IkLbxAhLliQ6IkIp1QHSgjTsgQtrGC0iUJzogQirWAdGBNu6ACGkbL6D/F/IU0RxatQLrfW74+gUSlVSIqkz4w2VdKE7hWshlArycsF62HMfgHGSlT58PEdI/XFX4EwqWnsZi8EgWHI/kblIa2YWlaJ6+bj9zowSVFElmOqnx7vw8YhBzDiXzuOTpMyJC+syLUkywngNSF2b3G8WqoCyKVPWO/vF2qKLZMHoAM50SRUj/eKUnKm7QgSeEVC2Xo9QIPH1b1aCqmp3qRPwPbMdmrRWrVqxi3aYt2G3ZjN1vJ/C/lU2JWlNzrvaTCGltP8RvfwIHjIRUkD3AZ8dqLNdasWShJQ7+CRSWlrWcEZpCUq8fY+GI/vTsac4azyDCIv3YN38iYz7+ip1XHiFT6TwqREhbrizEO7eSA8ZAKhQQtmM2Iz+wYO/5AJyXT2ToB//mP7fyUbQgp5RGsn5Edzp0nIRDYh5yjYL0kxYM7tKVt5b6kiktrWnRRUifsCIJctJjYnmoUNGSZWm0uraqy+iMPMUFRkCqyfZgwdsv0Xf679x7XEJh0C8Mf6kPE+2jyJG1YMmqYtg4qjsmVZAKUHrFkmHdOmL2tSupBUoR0qeoAkAJ6TFn2Pi1JR7ZxSh1eiZPd9+nvVqr6yzWX6/CI7uoDel62nwZeb0RkCqClvN2144MXRlMVpGaslRH/tGzEy9PP0xSvqIGFCMlNHp6XUhLJYRYm9P7xVeZ6RRHXonOA0JsSRu1s4ET1BRlRWI71YLjmYUo2gyklbqmWXA8oy3pasDCltzVZEgF8o7Poo9pB0bbxJIrK0PIPcrM3qaYvL+FuBwZdaZwmk91FaTtBjJ5/nwsvpjCuwMGYf7LYcLTZKh0ExYhfULfNens/2Iezo/aGAxaXV/Ow7lNPTye0OMnvcwISAtc5mDWyYRR1tfLu7eajAP8s5cppuN2cC9P3vKQdhjBspOn2DbnTTo/34Px2yKR1O1mi5A+YU3QpOHYJiFNEyFtMqSgjFrPqO6mDPnJj0ypitLoXxnVvSMDf/AkvVBn8uYJq4ney6paUu2YNCGb9IjtTB3QmZdHreJ8ahG1AotESPXaaOCAhsf33Fg08mNW+twmt7Ss5cYuBlTUP6Th8X03Fo8azwqfW89IlwalNIukm+FcSypqG+F3RkAqSK+yaXxveo63IyqniMTfP6PvS0NZeDaRQqVun7O+20+1Rwte5cTRb9rZXWUm51eM5uUuA5i2M5xHcnVNnWpTkKoLSb93g+u3ksnRimyWsZ6a1LCTHNhpx2ZbW2y32LNtmz3b9zjh7BtBcoFuyJgGyQ0/3N1cOXXqlM7mytnQJJSqisF8ZmYGweec2WOzGQcXH2JuxqFWqxsuM8Vdzmzdwsnrj1GodTNkQJdPBEkFpdSEm1bo8tCjS1FLl0ulLm9i4m7p19WwWuP3Co+55W3PDxPG8MORJAoVjVRsrR/2dpy89piSWn4Yn7TeK4yAFKGEh4Hb+Xr8JL5bu5bvJnzI9NXOXM8uoaXkoU4j7IQVk/qY8NxzA/ls4wmuZskpiD/L6smDMHtjIt9v9qSgWFmRxbYBqYAs3pttq9ay+5Q3Pi52LPp+Ax7xxZQ2UuZ6C6r6gIaizFucWjKG18zMmGDlQXBYOEEeh7BZMIOpXy5n36UUZOUjdQ05Ny/i7XGWM2fO6Gxn8QxPQVk5mg8MDGTZ
smUsXLSQhQsXsmPHDuRyeXWKNR8E8gNW88GAPgxb4kVasW73qVLXzzW6Lmt1eeroCkqhuFpXgB5dydUPj4CAgFq6du3apUdXjcKn/iQoyEs8zfy/9WDKrgc8lus+iOreXSA/0IoPXzNj2E8epBbp+lH33Kf4bgykgKDM4X7EJfz8zuPrd4Wb6VJULRnsqyki/W4Ul3y98PQ8T3BMPFkyNZqyYtJuXMb7lCveYQ9QVAVUtAlIZdfYO3sE73//O2HxGWTGOTL343kcji1AqTMT/eTFVkrk2v/lxfYdmLQ3mUJFGcqiLB5EOrPso0EMHjmXPWHZ5S2dSlbA47xccnN1tzzypAo0lU37o0cZ+Ls4st1uE1uc3Im5qafF0mTg9u+PeaPXc7Qb8A0nkwprjzWo0NWlfQcmOiRRUKkrQatrnFbXHHaHZpWHiTVNV3ptXXG3W74l1RaKKoYNI3sxdU8C+YYCwLV+LBjPkFe0fnyNS2IBLdKjNBLSinoloFYqy+E09Jh58jpY+0pBbzdRQKMqLR82VOtofUgF8i8sZVgPM2YcTigfBwjyVK5H3iK9SNVMby+oiF7/Ll3amzBlfyqFVWsmgoxE528Z3LkLA2cdIK5Q0bTZPFkov20/w+27l9j9sw1+WbI68FUUiDreiW/m/MqOb4fwgokZnx++T0GtWqnV9V6FLseH9XS9odU1cz838xVNC5iQheFQpWupDRceFTeoq6q6aIoeEnXhDM7HXPCJTKVIpUEbuVoiiScq8CxeERkUZd8nzNeVYye8iEwr1lkaUJN7Jwh3l2Mcd9nBnMHd+GS3YUjV8b/z7dxf2f7t3+jc0YwZh+6S31j3uEqsMf8/EaTGJPCMz219SGUELv07nTuOYIN2nKJtOTUag7CU5d4j7FIAF/398dfZLgbFkqFUN3CtHki1SeW48tWrprTrMoW9dwtoUp1RpRMbm4FCFsnG6YtxT5c2EDggJ8rucywcr5McZMXobib0/Mc+4vJ1HwR6IK3U9fVrnWjfZQoOt/MpaUq3X6vrZoUu6+k/6dFVUcGE4lgOL7Nkb0AUUUEHWT7rexyuSpCryyiM98Nu7ije/8oKuy27+f3QXlbN/ADzFWdIkSoRBBl33WxYZXMUv8hoQs6swbx3ZybtMgSpnOits7DYd42kwDWM7dGRXp86cPNxif4HkCaX++GXCbxYu5z9LwYRm6HQeWDUgUaEtI4htb8a/6qaOo7N73fDpL8F7pnF6MYV1751zTfFgwCOH3Ziv6MjjjrbgROXSZGrjIIU1XU2jupC+78OYfnlHGR65n9qUtd+EhA0Um5772P3sVDSZPXTFPIvsuKjCSw95MPFcw7MGWRKu27j2RZT+SAqv6F+SFFrdXWl/V/fYFmQhGIjdN3xdizXlVpcX1dVPjS5AWxZYItvmpQSWQIOUwcy3iYUiUyNqigd1+9ep9c4S9yCb5CYlkr0ZnN6j1lHiKSY0rTTLPpgKlY+d5Eo1CiyfVj81kt8aqAl1fqxctxElh70xl/rxxudaN/tY+yjc5HrG9IoEghyPozT/trl7HjgOJeTZVQN2aryVP2/CGm1FQ19MBpSTfYxZvU2pff0wyRIlQ0AVj+ZsiIJaQ9TSElJJjm5ZktJz6OkTNtlq/tPf0uqhdRaC2m74ayPytNfYWrdUkGCvzOufpHcu5NAWnGZzkys9kQNGad/wPyTRWzcuRfH/XvY8MXfebHdi4zZFF4+c11xOwOQqm5gPborz7cbzrrIHOqub9eSU/1FQcJFFwO6qk8snyxJiLqMr9sxDh9x4qexPRg035OM8smcUoJ/eRuzf+0jIb+k3E+Z61f0HbaU85mF5Hj+wKBXZ3MkuaAi2ko7Jh1laEyqIcNtPhM+XcSGnQ6VfrxFl3YvMtq64sFQo0znU1kxkrRUUlJSapVzckp6edic3rkdEVIdE+t/NBrSkkvL+Hv3v2Fx8gHSJk7lltx0xW69FZYrV7JSZ1tl40qcVNlA90k/
pJosZ2b360jHNxbhm9G0llyT7c3KWbNZsm4L1kvt8Ko7JlXHc+CLqaxwi+L2gySSkpJIDLdn8ismdB5uRVC2jIqGUT+kWl1z+pvScfBCfNKKmtTD0Ei8sZw1hyVrN1fr0hdXrJEEs3vxYmxPXiIq9hp7P+/L4B/dySh/+6IC0n6zDpJaUBG/WuL2Df3eWYJvZiGZR2bwSp+ZHExpIqTqeJy+nMaK05Hciq/xY0rvjnQevprALD2+l8ThZv8rayxrl/NKSxtOxRag0Ne7eGJI1TwMPMhOO1s2bdpUZ7Nlt9ddZC36akx9oMr3tOqYVJByZfVYhn2xh7BHTV+XUufcJzIshODg4Frblav3yWkwsEBF1LqKiaPJjjUTR2XSBHw3TKRfz7eYuzeC7JKmBSUI8mQigsOJjo4hIuQGqSW642AB6VUbPpm2gcsZxTVrbapEnKb3peMLb7LEN53i8n69iqjKiaPJ+2omjjTluibRX6vL4SpZugvbespRu1urK7JKV2iFroZ7khqy3X7krWHfcfxeLoqyAly/6s/r37tWQxq09E3MPncipRLS4lNzMXtrEV4ZUgrD1jKi11AWeCRTWCqgyfVh4Zvd+dD6WgNvjmj9sOUf/9zApXSdh40qkd9n9Mf0hTf5ySeVolohNpWZVOcQHxVGSJ1yDr5ylXsSJXXfja625okh1SBNiSXqahihoaEE2M1hrq03F4JCCQ0NI/pBHiq9iVan3vwfDEGqlnDTz4uAhl4W16OkSS1paW4Sd+8ncCdgNwu+seQ/ERnI660ca1A3iyFq0kJPYDXhFZ7/r/9mwKeW2Npvx36rLWt+XsD8BcvYdMSfOzmKOl1WPTk0uFtDRtgRVk8dhNno+ewLSEJeuc4pueaO9dR+dHiuPWYTlrLTPYbYoKNYTazRZVNX12F/bkuaQ1dd0QLSAEtGvj6ab21/44CTE+v/+So9hs/BxjmCm0GH+GlMd0wHzcD+/AMKU0JxshjKC12HY7E3gMTsaxxe8DHvmX/LKtsd7Nm/A4t3XmLIVEuORjzSCVKo8MNq2mD6jv6RvRcTq9ejJdfd2TStPyZ/bY+Z+c/sOHOdXKVOlE1dycZ8bwBSddJFft+5mfXaX0NY8yu2W+3ZamfPXueAco+rq5ogoF0e0W7Sk/P44Vg6eTJN5T5jRBhzrhq5XFm9xFfvSgOQaiRe/GL+Lu/O2MFVSVUPrd4dau1oFFJ1ZgDb5k1l3PvvM/rt13h9zFesdXDhXMg17iRl8rhYRn7abYJPOXFWu05aD95a6TXhiwZpxl2ignzwdHfHJzCcyOhooiIjCAkOIUob3aRdSK4/kG3CveueIlCccYswf298gyK5kyZFXX5jAXl2AjdDL+Dl6YGnXwjXHmQhSb2joyusWldocAiRzaqrrk7QFCYQ5uXKaZ8gwq/FERvqwVFnb0LvZpKdcoNQP088fAKJSc5HWZDKjSvn8fI4x+XYhxSolDxOjOSi51m8AsKIjrtLuO9pfC9HcC9LVplnbZo1fpwLiuR2WmFlkECVH37VfsTEZ+uZT6ivvdE9DUCqKUgh5ugC3uvfk57ma/AMCuXqeQfmTxqL+Te7Cc2S1xtSFJVDmtFIgEajagycICBP8GXXmtWsX7uUxVZ7uZhQWH9CzACkQslDIjwc+OHbzYRJimjKTzo1AqlA8cMYLvv74HH6GAd2bGTVz/P5ZtY0Jn00lrEfjMN80iQmfPQRk+ftIPCh7rqcgbz+IQ7VfgroXct+hnkRypTI5dqlDAEEFYoSJWXlrUhVy1HRomjjM8sDOapamUqN2uu1UTHaVket0q5p186j4azUPteoSw3fGBqAVHtJaeQ63uv2PCaT95KUX4JGkYbLN6/TpevbLDv3qF6Xu6UhFQrC2TV3NB9a/Iav/wmWTXyHjxYc5VZ+nWUpA5AiFJMQ5MJ/vK6TXaJqYNK0vlmNQqoqKamI6tCUIpfmkJ54hxsRwVz0duWI4062bFiP9baDuEek
IC1fXK+fiLhHdMCgA3ogVcVsYGT3DnSshFSglCsr36GriRlfn06jsM4sW8tCqiHbcyFDX+7LdKc75MkLCfplOC/3mYR9pKT2bL5BSJU8Tk0iPV9JWVPW0uEp/j6pUIZSJiU/N4fcQm3Xo/aT1mChiAdFB3QdaCKkpZIQrM178+KrM3GK012/rrhZy0KqIGj523QzHcqKy48oUpeR6jiVXp16Mv1wEvlVUXFaKYYg1a7ZG4lKIy2prpPiZ9GBFnKgEUjbDZzCggXfMfvTdxkwaDzLDoY1GJCizrjBjbSS+mPE5pAt5HH8iz50MhnNphs5FJcJ5B6dRe9OJozdEkeuTKdZNAip8WJESI33TLyiuR1oBNLnRyzj5MmtfDnkBdr3MGd7VE4Tg1iaUahQgMscM14wGcXGaxKKyzRkOH3GK6amjNt+l1y5CGkzui3eqs050Aik5WNSSRpXt35C/849GWN1gbTipk26NF9elUSuH0kP0yEs9stAqiolZsNoepgOZJ5HWvn6c3VaYktabYX44c/igD5Io3+tmDiapJ3dlaPI8GX5qJfo8upn7L6a1cBafUsaIiANt2Z8n16Y20UhkSZxcHo/Xhr6b9wS6rzCJ0LakgUh3rtVHGgAUnVaOM5rJtHH5Dme+5/p2LhEIJEXcN/NkkmvmzFk8o/Yed1H1jwvMDcp24I8hYv2X2E+eR7r1s9j4of/wvJYDFkldYI6REib5Kd40h/JgQYg1UjTuRMRiLeHO+6+l4i+r205NZQVpXI9yJOTJz0Jjc+ltOY3a55BjgWUOfcID7zAhXM+nA+OJa1Q92dzKiWIkD6DshCTeLYONABp+euF+pYqBA2qUlV5IMezFVqZmqBGqTQQDCJC2irFIibakg40CGlLJtjC9xYhbWGDxds/ewdESA16Lq6TGrRHPPhMHNBCavY60zae4nxQKLGpbeT3gI3MvCBL51bUVUIubGfGwF78a7/4R4SNtFA8vc06oHpIqPtZzofe4Pa9B6TmKShrntecnmmWhdLHpCc/IP72VS6ccePyg+Jm+fVMsSV9psUoJtawAwJlahUqlXZT67w61/DZbXavoKnJh1rd5AD6xvIjQtqYQ+Jx0YFWdkCEtJULQExedKAxB0RIG3NIPC460MoOiJC2cgGIyYsONOaACGljDonHRQda2QER0lYuADF50YHGHBAhbcwh8bjoQCs7IELaygUgJi860JgDIqSNOSQeFx1oZQdESFu5AMTkRQcac0CEtDGHxOOiA63swP8B060KAVwSYHwAAAAASUVORK5CYII=)"
]
},
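{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before looking at the model code, it helps to see how the user-item graph matrices are built. The sketch below constructs the bipartite adjacency matrix A = [[0, R], [Rᵀ, 0]] from a toy interaction matrix and applies the symmetric normalization D^(-1/2) A D^(-1/2) used by NGCF. The toy matrix R and all names here are illustrative, not part of the notebook's data pipeline."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"import numpy as np\n",
"import scipy.sparse as sp\n",
"\n",
"# toy interaction matrix R: 3 users x 4 items (hypothetical data)\n",
"R = sp.csr_matrix(np.array([[1, 0, 1, 0],\n",
"                            [0, 1, 0, 0],\n",
"                            [1, 1, 0, 1]], dtype=np.float32))\n",
"n_users, n_items = R.shape\n",
"\n",
"# bipartite adjacency A = [[0, R], [R^T, 0]] over users + items\n",
"A = sp.bmat([[None, R], [R.T, None]], format='csr')\n",
"\n",
"# symmetric normalization: D^-1/2 A D^-1/2\n",
"deg = np.asarray(A.sum(axis=1)).flatten()\n",
"with np.errstate(divide='ignore'):\n",
"    d_inv_sqrt = np.power(deg, -0.5)\n",
"d_inv_sqrt[np.isinf(d_inv_sqrt)] = 0.\n",
"A_norm = sp.diags(d_inv_sqrt) @ A @ sp.diags(d_inv_sqrt)\n",
"\n",
"print(A_norm.shape)  # (n_users + n_items, n_users + n_items)"
],
"execution_count": null,
"outputs": []
},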
{
"cell_type": "markdown",
"metadata": {
"id": "TcNXtDMFVLii"
},
"source": [
"#### Architecture"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "kyKDfxNfBWl-"
},
"source": [
"![](https://github.com/recohut/reco-static/raw/master/media/images/120222_ncf.png)"
]
},
{
"cell_type": "code",
"metadata": {
"id": "4i1YYbJB4oGQ"
},
"source": [
"import scipy.sparse as sp\n",
"\n",
"# fall back to CPU if no GPU is available (the cell assumes `device` is set)\n",
"device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')\n",
"\n",
"class NGCF(nn.Module):\n",
" def __init__(self, n_users, n_items, emb_dim, layers, reg, node_dropout, mess_dropout,\n",
" adj_mtx):\n",
" super().__init__()\n",
"\n",
" # initialize Class attributes\n",
" self.n_users = n_users\n",
" self.n_items = n_items\n",
" self.emb_dim = emb_dim\n",
" self.adj_mtx = adj_mtx\n",
" self.laplacian = adj_mtx - sp.eye(adj_mtx.shape[0])\n",
" self.reg = reg\n",
" self.layers = layers\n",
" self.n_layers = len(self.layers)\n",
" self.node_dropout = node_dropout\n",
" self.mess_dropout = mess_dropout\n",
"\n",
" #self.u_g_embeddings = nn.Parameter(torch.empty(n_users, emb_dim+np.sum(self.layers)))\n",
" #self.i_g_embeddings = nn.Parameter(torch.empty(n_items, emb_dim+np.sum(self.layers)))\n",
"\n",
" # Initialize weights\n",
" self.weight_dict = self._init_weights()\n",
" print(\"Weights initialized.\")\n",
"\n",
" # Create Matrix 'A', PyTorch sparse tensor of SP adjacency_mtx\n",
" self.A = self._convert_sp_mat_to_sp_tensor(self.adj_mtx)\n",
" self.L = self._convert_sp_mat_to_sp_tensor(self.laplacian)\n",
"\n",
" # initialize weights\n",
" def _init_weights(self):\n",
" print(\"Initializing weights...\")\n",
" weight_dict = nn.ParameterDict()\n",
"\n",
" initializer = torch.nn.init.xavier_uniform_\n",
" \n",
" weight_dict['user_embedding'] = nn.Parameter(initializer(torch.empty(self.n_users, self.emb_dim).to(device)))\n",
" weight_dict['item_embedding'] = nn.Parameter(initializer(torch.empty(self.n_items, self.emb_dim).to(device)))\n",
"\n",
" weight_size_list = [self.emb_dim] + self.layers\n",
"\n",
" for k in range(self.n_layers):\n",
" weight_dict['W_gc_%d' %k] = nn.Parameter(initializer(torch.empty(weight_size_list[k], weight_size_list[k+1]).to(device)))\n",
" weight_dict['b_gc_%d' %k] = nn.Parameter(initializer(torch.empty(1, weight_size_list[k+1]).to(device)))\n",
" \n",
" weight_dict['W_bi_%d' %k] = nn.Parameter(initializer(torch.empty(weight_size_list[k], weight_size_list[k+1]).to(device)))\n",
" weight_dict['b_bi_%d' %k] = nn.Parameter(initializer(torch.empty(1, weight_size_list[k+1]).to(device)))\n",
" \n",
" return weight_dict\n",
"\n",
" # convert sparse matrix into sparse PyTorch tensor\n",
" def _convert_sp_mat_to_sp_tensor(self, X):\n",
" \"\"\"\n",
" Convert scipy sparse matrix to PyTorch sparse matrix\n",
" Arguments:\n",
" ----------\n",
" X = Adjacency matrix, scipy sparse matrix\n",
" \"\"\"\n",
" coo = X.tocoo().astype(np.float32)\n",
" i = torch.LongTensor(np.vstack([coo.row, coo.col]))  # np.mat is deprecated\n",
" v = torch.FloatTensor(coo.data)\n",
" res = torch.sparse.FloatTensor(i, v, coo.shape).to(device)\n",
" return res\n",
"\n",
" # apply node_dropout\n",
" def _droupout_sparse(self, X):\n",
" \"\"\"\n",
" Drop individual locations in X\n",
" \n",
" Arguments:\n",
" ---------\n",
" X = adjacency matrix (PyTorch sparse tensor)\n",
" dropout = fraction of nodes to drop\n",
" noise_shape = number of non-zero entries of X\n",
" \"\"\"\n",
" \n",
" node_dropout_mask = ((self.node_dropout) + torch.rand(X._nnz())).floor().bool().to(device)\n",
" i = X.coalesce().indices()\n",
" v = X.coalesce()._values()\n",
" i[:,node_dropout_mask] = 0\n",
" v[node_dropout_mask] = 0\n",
" X_dropout = torch.sparse.FloatTensor(i, v, X.shape).to(X.device)\n",
"\n",
" return X_dropout.mul(1/(1-self.node_dropout))\n",
"\n",
" def forward(self, u, i, j):\n",
" \"\"\"\n",
" Computes the forward pass\n",
" \n",
" Arguments:\n",
" ---------\n",
" u = user\n",
" i = positive item (user interacted with item)\n",
" j = negative item (user did not interact with item)\n",
" \"\"\"\n",
" # apply drop-out mask\n",
" A_hat = self._droupout_sparse(self.A) if self.node_dropout > 0 else self.A\n",
" L_hat = self._droupout_sparse(self.L) if self.node_dropout > 0 else self.L\n",
"\n",
" ego_embeddings = torch.cat([self.weight_dict['user_embedding'], self.weight_dict['item_embedding']], 0)\n",
"\n",
" all_embeddings = [ego_embeddings]\n",
"\n",
" # forward pass for 'n' propagation layers\n",
" for k in range(self.n_layers):\n",
"\n",
" # weighted sum messages of neighbours\n",
" side_embeddings = torch.sparse.mm(A_hat, ego_embeddings)\n",
" side_L_embeddings = torch.sparse.mm(L_hat, ego_embeddings)\n",
"\n",
" # transformed weighted sum of the neighbour messages\n",
" sum_embeddings = torch.matmul(side_embeddings, self.weight_dict['W_gc_%d' % k]) + self.weight_dict['b_gc_%d' % k]\n",
"\n",
" # bi messages of neighbours\n",
" bi_embeddings = torch.mul(ego_embeddings, side_L_embeddings)\n",
" # transformed bi messages of neighbours\n",
" bi_embeddings = torch.matmul(bi_embeddings, self.weight_dict['W_bi_%d' % k]) + self.weight_dict['b_bi_%d' % k]\n",
"\n",
" # non-linear activation \n",
" ego_embeddings = F.leaky_relu(sum_embeddings + bi_embeddings)\n",
" # + message dropout\n",
" mess_dropout_mask = nn.Dropout(self.mess_dropout)\n",
" ego_embeddings = mess_dropout_mask(ego_embeddings)\n",
"\n",
" # normalize activation\n",
" norm_embeddings = F.normalize(ego_embeddings, p=2, dim=1)\n",
"\n",
" all_embeddings.append(norm_embeddings)\n",
"\n",
" all_embeddings = torch.cat(all_embeddings, 1)\n",
" \n",
" # back to user/item dimension\n",
" u_g_embeddings, i_g_embeddings = all_embeddings.split([self.n_users, self.n_items], 0)\n",
"\n",
" self.u_g_embeddings = nn.Parameter(u_g_embeddings)\n",
" self.i_g_embeddings = nn.Parameter(i_g_embeddings)\n",
" \n",
" u_emb = u_g_embeddings[u] # user embeddings\n",
" p_emb = i_g_embeddings[i] # positive item embeddings\n",
" n_emb = i_g_embeddings[j] # negative item embeddings\n",
"\n",
" y_ui = torch.mul(u_emb, p_emb).sum(dim=1)\n",
" y_uj = torch.mul(u_emb, n_emb).sum(dim=1)\n",
" log_prob = (torch.log(torch.sigmoid(y_ui-y_uj))).mean()\n",
"\n",
" # compute bpr-loss\n",
" bpr_loss = -log_prob\n",
" if self.reg > 0.:\n",
" l2norm = (torch.sum(u_emb**2)/2. + torch.sum(p_emb**2)/2. + torch.sum(n_emb**2)/2.) / u_emb.shape[0]\n",
" l2reg = self.reg*l2norm\n",
" bpr_loss = -log_prob + l2reg\n",
"\n",
" return bpr_loss"
],
"execution_count": null,
"outputs": []
},
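{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `_convert_sp_mat_to_sp_tensor` helper above walks the COO layout of the scipy matrix. Below is a standalone, CPU-only sketch of the same conversion, using the newer `torch.sparse_coo_tensor` constructor in place of the deprecated `torch.sparse.FloatTensor`; the toy matrix is hypothetical."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"import numpy as np\n",
"import scipy.sparse as sp\n",
"import torch\n",
"\n",
"# toy scipy sparse matrix (hypothetical)\n",
"X = sp.random(5, 5, density=0.4, format='csr', dtype=np.float32, random_state=123)\n",
"\n",
"# same steps as _convert_sp_mat_to_sp_tensor, on CPU\n",
"coo = X.tocoo()\n",
"indices = torch.from_numpy(np.vstack([coo.row, coo.col])).long()\n",
"values = torch.from_numpy(coo.data)\n",
"X_t = torch.sparse_coo_tensor(indices, values, coo.shape)\n",
"\n",
"# round-trip check: dense views agree\n",
"print(np.allclose(X_t.to_dense().numpy(), X.toarray()))  # True"
],
"execution_count": null,
"outputs": []
},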
{
"cell_type": "markdown",
"metadata": {
"id": "5xbcqHLUSowG"
},
"source": [
"### Training and Evaluation"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "N6S7uZU3Tpht"
},
"source": [
"Training follows the standard PyTorch pattern, so if you are already familiar with PyTorch the following code should look familiar.\n",
"\n",
"One of the most useful utilities in PyTorch is torch.nn.Sequential(), which chains existing and custom torch.nn modules into a complete network. However, due to the structure of the NGCF model, torch.nn.Sequential() cannot be used here, and the forward pass has to be implemented manually, as was done in the model class above with the Bayesian personalized ranking (BPR) pairwise loss. The training and evaluation loop is implemented as follows:"
]
},
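{
"cell_type": "markdown",
"metadata": {},
"source": [
"The BPR objective that `forward` computes can be reproduced in isolation. The sketch below scores a batch of (user, positive item, negative item) embedding triples with dot products and minimizes the negative mean log-sigmoid of the score gap; `F.logsigmoid` is used here as a numerically stabler equivalent of `torch.log(torch.sigmoid(...))`. The random embeddings are stand-ins, not model output."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"import torch\n",
"import torch.nn.functional as F\n",
"\n",
"torch.manual_seed(0)\n",
"# stand-in embeddings for a batch of 8 (user, pos item, neg item) triples\n",
"u_emb = torch.randn(8, 16)\n",
"p_emb = torch.randn(8, 16)\n",
"n_emb = torch.randn(8, 16)\n",
"\n",
"# dot-product scores for positive and negative items\n",
"y_ui = (u_emb * p_emb).sum(dim=1)\n",
"y_uj = (u_emb * n_emb).sum(dim=1)\n",
"\n",
"# BPR: maximize the log-sigmoid of the score gap, i.e. minimize its negative\n",
"bpr_loss = -F.logsigmoid(y_ui - y_uj).mean()\n",
"print(bpr_loss.item())  # strictly positive"
],
"execution_count": null,
"outputs": []
},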
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1000
},
"id": "LEMcstCz4vSm",
"outputId": "4e06cd7c-8e69-4f6b-d4b8-054d373f01cb"
},
"source": [
"import os\n",
"from time import time\n",
"from datetime import datetime\n",
"\n",
"use_cuda = torch.cuda.is_available()  # define the flag here if not set earlier\n",
"\n",
"# read parsed arguments\n",
"args = parse_args()\n",
"data_dir = args.data_dir\n",
"dataset = args.dataset\n",
"batch_size = args.batch_size\n",
"layers = eval(args.layers)\n",
"emb_dim = args.emb_dim\n",
"lr = args.lr\n",
"reg = args.reg\n",
"mess_dropout = args.mess_dropout\n",
"node_dropout = args.node_dropout\n",
"k = args.k\n",
"\n",
"# generate the NGCF-adjacency matrix\n",
"data_generator = Data(path=data_dir + dataset, batch_size=batch_size)\n",
"adj_mtx = data_generator.get_adj_mat()\n",
"\n",
"# create model name and save\n",
"modelname = \"NGCF\" + \\\n",
" \"_bs_\" + str(batch_size) + \\\n",
" \"_nemb_\" + str(emb_dim) + \\\n",
" \"_layers_\" + str(layers) + \\\n",
" \"_nodedr_\" + str(node_dropout) + \\\n",
" \"_messdr_\" + str(mess_dropout) + \\\n",
" \"_reg_\" + str(reg) + \\\n",
" \"_lr_\" + str(lr)\n",
"\n",
"# create NGCF model\n",
"model = NGCF(data_generator.n_users, \n",
" data_generator.n_items,\n",
" emb_dim,\n",
" layers,\n",
" reg,\n",
" node_dropout,\n",
" mess_dropout,\n",
" adj_mtx)\n",
"if use_cuda:\n",
" model = model.cuda()\n",
"\n",
"# current best metric\n",
"cur_best_metric = 0\n",
"\n",
"# Adam optimizer\n",
"optimizer = torch.optim.Adam(model.parameters(), lr=args.lr)\n",
"\n",
"# Set values for early stopping\n",
"cur_best_loss, stopping_step, should_stop = 1e3, 0, False\n",
"today = datetime.now()\n",
"\n",
"print(\"Start at \" + str(today))\n",
"print(\"Using \" + str(device) + \" for computations\")\n",
"print(\"Params on CUDA: \" + str(next(model.parameters()).is_cuda))\n",
"\n",
"results = {\"Epoch\": [],\n",
" \"Loss\": [],\n",
" \"Recall\": [],\n",
" \"NDCG\": [],\n",
" \"Training Time\": []}\n",
"\n",
"for epoch in range(args.n_epochs):\n",
"\n",
" t1 = time()\n",
" loss = train(model, data_generator, optimizer)\n",
" training_time = time()-t1\n",
" print(\"Epoch: {}, Training time: {:.2f}s, Loss: {:.4f}\".\n",
" format(epoch, training_time, loss))\n",
"\n",
" # print test evaluation metrics every N epochs (provided by args.eval_N)\n",
" if epoch % args.eval_N == (args.eval_N - 1):\n",
" with torch.no_grad():\n",
" t2 = time()\n",
" recall, ndcg = eval_model(model.u_g_embeddings.detach(),\n",
" model.i_g_embeddings.detach(),\n",
" data_generator.R_train,\n",
" data_generator.R_test,\n",
" k)\n",
" print(\n",
" \"Evaluate current model:\\n\",\n",
" \"Epoch: {}, Validation time: {:.2f}s\".format(epoch, time()-t2),\"\\n\",\n",
" \"Loss: {:.4f}:\".format(loss), \"\\n\",\n",
" \"Recall@{}: {:.4f}\".format(k, recall), \"\\n\",\n",
" \"NDCG@{}: {:.4f}\".format(k, ndcg)\n",
" )\n",
"\n",
" cur_best_metric, stopping_step, should_stop = \\\n",
" early_stopping(recall, cur_best_metric, stopping_step, flag_step=5)\n",
"\n",
" # save results in dict\n",
" results['Epoch'].append(epoch)\n",
" results['Loss'].append(loss)\n",
" results['Recall'].append(recall.item())\n",
" results['NDCG'].append(ndcg.item())\n",
" results['Training Time'].append(training_time)\n",
" else:\n",
" # save results in dict\n",
" results['Epoch'].append(epoch)\n",
" results['Loss'].append(loss)\n",
" results['Recall'].append(None)\n",
" results['NDCG'].append(None)\n",
" results['Training Time'].append(training_time)\n",
"\n",
" if should_stop: break\n",
"\n",
"# save\n",
"if args.save_results:\n",
" date = today.strftime(\"%d%m%Y_%H%M\")\n",
"\n",
" # save model as .pt file\n",
" os.makedirs(\"./models\", exist_ok=True)\n",
" torch.save(model.state_dict(), \"./models/\" + str(date) + \"_\" + modelname + \"_\" + dataset + \".pt\")\n",
"\n",
" # save results as pandas dataframe\n",
" results_df = pd.DataFrame(results)\n",
" results_df.set_index('Epoch', inplace=True)\n",
" os.makedirs(\"./results\", exist_ok=True)\n",
" results_df.to_csv(\"./results/\" + str(date) + \"_\" + modelname + \"_\" + dataset + \".csv\")\n",
" # plot loss\n",
" results_df['Loss'].plot(figsize=(12,8), title='Loss')"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"n_users=943, n_items=1682\n",
"n_interactions=100000\n",
"n_train=80000, n_test=20000, sparsity=0.06305\n",
"Creating interaction matrices R_train and R_test...\n",
"Complete. Interaction matrices R_train and R_test created in 1.4850668907165527 sec\n",
"Loaded adjacency-matrix (shape: (2625, 2625) ) in 0.018111467361450195 sec.\n",
"Initializing weights...\n",
"Weights initialized.\n",
"Start at 2021-07-12 09:57:58.311285\n",
"Using cuda for computations\n",
"Params on CUDA: True\n",
"Epoch: 0, Training time: 9.11s, Loss: 107.9355\n",
"Epoch: 1, Training time: 8.88s, Loss: 101.6095\n",
"Epoch: 2, Training time: 8.75s, Loss: 80.7764\n",
"Epoch: 3, Training time: 8.76s, Loss: 76.1915\n",
"Epoch: 4, Training time: 8.58s, Loss: 73.0698\n",
"Evaluate current model:\n",
" Epoch: 4, Validation time: 1.51s \n",
" Loss: 73.0698: \n",
" Recall@20: 0.0623 \n",
" NDCG@20: 0.2352\n",
"Epoch: 5, Training time: 8.84s, Loss: 69.3378\n",
"Epoch: 6, Training time: 8.71s, Loss: 64.4498\n",
"Epoch: 7, Training time: 8.67s, Loss: 60.1440\n",
"Epoch: 8, Training time: 8.76s, Loss: 56.8538\n",
"Epoch: 9, Training time: 8.78s, Loss: 52.3951\n",
"Evaluate current model:\n",
" Epoch: 9, Validation time: 1.54s \n",
" Loss: 52.3951: \n",
" Recall@20: 0.0837 \n",
" NDCG@20: 0.2559\n",
"Epoch: 10, Training time: 8.72s, Loss: 50.5261\n",
"Epoch: 11, Training time: 8.73s, Loss: 49.2488\n",
"Epoch: 12, Training time: 8.72s, Loss: 48.5012\n",
"Epoch: 13, Training time: 8.75s, Loss: 47.5585\n",
"Epoch: 14, Training time: 8.82s, Loss: 47.0483\n",
"Evaluate current model:\n",
" Epoch: 14, Validation time: 1.51s \n",
" Loss: 47.0483: \n",
" Recall@20: 0.0926 \n",
" NDCG@20: 0.2676\n",
"Epoch: 15, Training time: 8.84s, Loss: 46.4847\n",
"Epoch: 16, Training time: 8.98s, Loss: 46.2644\n",
"Epoch: 17, Training time: 8.99s, Loss: 45.5963\n",
"Epoch: 18, Training time: 8.78s, Loss: 45.0955\n",
"Epoch: 19, Training time: 8.84s, Loss: 44.9321\n",
"Evaluate current model:\n",
" Epoch: 19, Validation time: 1.55s \n",
" Loss: 44.9321: \n",
" Recall@20: 0.1102 \n",
" NDCG@20: 0.2934\n",
"Epoch: 20, Training time: 8.61s, Loss: 44.4621\n",
"Epoch: 21, Training time: 9.02s, Loss: 44.1910\n",
"Epoch: 22, Training time: 8.94s, Loss: 43.7996\n",
"Epoch: 23, Training time: 8.83s, Loss: 43.1078\n",
"Epoch: 24, Training time: 9.01s, Loss: 43.1549\n",
"Evaluate current model:\n",
" Epoch: 24, Validation time: 1.54s \n",
" Loss: 43.1549: \n",
" Recall@20: 0.1217 \n",
" NDCG@20: 0.3255\n",
"Epoch: 25, Training time: 9.08s, Loss: 42.8759\n",
"Epoch: 26, Training time: 8.92s, Loss: 42.4126\n",
"Epoch: 27, Training time: 8.82s, Loss: 42.0810\n",
"Epoch: 28, Training time: 8.97s, Loss: 41.7865\n",
"Epoch: 29, Training time: 8.89s, Loss: 41.3096\n",
"Evaluate current model:\n",
" Epoch: 29, Validation time: 1.57s \n",
" Loss: 41.3096: \n",
" Recall@20: 0.1257 \n",
" NDCG@20: 0.3217\n",
"Epoch: 30, Training time: 9.15s, Loss: 40.9893\n",
"Epoch: 31, Training time: 9.11s, Loss: 40.8605\n",
"Epoch: 32, Training time: 9.06s, Loss: 40.3089\n",
"Epoch: 33, Training time: 8.87s, Loss: 40.1379\n",
"Epoch: 34, Training time: 8.89s, Loss: 39.6859\n",
"Evaluate current model:\n",
" Epoch: 34, Validation time: 1.51s \n",
" Loss: 39.6859: \n",
" Recall@20: 0.1293 \n",
" NDCG@20: 0.3432\n",
"Epoch: 35, Training time: 9.12s, Loss: 39.9238\n",
"Epoch: 36, Training time: 9.12s, Loss: 39.4329\n",
"Epoch: 37, Training time: 9.20s, Loss: 38.9671\n",
"Epoch: 38, Training time: 8.79s, Loss: 38.7849\n",
"Epoch: 39, Training time: 8.78s, Loss: 38.3410\n",
"Evaluate current model:\n",
" Epoch: 39, Validation time: 1.54s \n",
" Loss: 38.3410: \n",
" Recall@20: 0.1365 \n",
" NDCG@20: 0.3411\n",
"Epoch: 40, Training time: 8.85s, Loss: 38.6723\n",
"Epoch: 41, Training time: 8.78s, Loss: 37.9243\n",
"Epoch: 42, Training time: 9.07s, Loss: 37.8358\n",
"Epoch: 43, Training time: 8.85s, Loss: 37.2368\n",
"Epoch: 44, Training time: 8.97s, Loss: 37.4086\n",
"Evaluate current model:\n",
" Epoch: 44, Validation time: 1.51s \n",
" Loss: 37.4086: \n",
" Recall@20: 0.1383 \n",
" NDCG@20: 0.3554\n",
"Epoch: 45, Training time: 8.94s, Loss: 37.1695\n",
"Epoch: 46, Training time: 9.05s, Loss: 36.9502\n",
"Epoch: 47, Training time: 8.75s, Loss: 36.5551\n",
"Epoch: 48, Training time: 9.08s, Loss: 36.4953\n",
"Epoch: 49, Training time: 9.13s, Loss: 35.9976\n",
"Evaluate current model:\n",
" Epoch: 49, Validation time: 1.54s \n",
" Loss: 35.9976: \n",
" Recall@20: 0.1397 \n",
" NDCG@20: 0.3541\n",
"Epoch: 50, Training time: 8.79s, Loss: 35.8774\n",
"Epoch: 51, Training time: 9.03s, Loss: 36.0130\n",
"Epoch: 52, Training time: 9.00s, Loss: 35.4460\n",
"Epoch: 53, Training time: 8.76s, Loss: 35.2867\n",
"Epoch: 54, Training time: 9.11s, Loss: 35.4907\n",
"Evaluate current model:\n",
" Epoch: 54, Validation time: 1.53s \n",
" Loss: 35.4907: \n",
" Recall@20: 0.1435 \n",
" NDCG@20: 0.3563\n",
"Epoch: 55, Training time: 8.97s, Loss: 35.1628\n",
"Epoch: 56, Training time: 8.86s, Loss: 34.5842\n",
"Epoch: 57, Training time: 8.83s, Loss: 34.1935\n",
"Epoch: 58, Training time: 8.88s, Loss: 34.3039\n",
"Epoch: 59, Training time: 8.79s, Loss: 34.2499\n",
"Evaluate current model:\n",
" Epoch: 59, Validation time: 1.49s \n",
" Loss: 34.2499: \n",
" Recall@20: 0.1495 \n",
" NDCG@20: 0.3704\n",
"Epoch: 60, Training time: 8.84s, Loss: 33.9897\n",
"Epoch: 61, Training time: 8.66s, Loss: 33.2779\n",
"Epoch: 62, Training time: 8.86s, Loss: 33.2062\n",
"Epoch: 63, Training time: 8.78s, Loss: 32.9654\n",
"Epoch: 64, Training time: 9.03s, Loss: 32.2721\n",
"Evaluate current model:\n",
" Epoch: 64, Validation time: 1.51s \n",
" Loss: 32.2721: \n",
" Recall@20: 0.1497 \n",
" NDCG@20: 0.3725\n",
"Epoch: 65, Training time: 8.90s, Loss: 32.5445\n",
"Epoch: 66, Training time: 8.85s, Loss: 32.1805\n",
"Epoch: 67, Training time: 8.81s, Loss: 32.1525\n",
"Epoch: 68, Training time: 8.80s, Loss: 31.7560\n",
"Epoch: 69, Training time: 8.81s, Loss: 31.3688\n",
"Evaluate current model:\n",
" Epoch: 69, Validation time: 1.51s \n",
" Loss: 31.3688: \n",
" Recall@20: 0.1536 \n",
" NDCG@20: 0.3816\n",
"Epoch: 70, Training time: 8.55s, Loss: 31.3098\n",
"Epoch: 71, Training time: 8.87s, Loss: 31.3700\n",
"Epoch: 72, Training time: 8.72s, Loss: 31.1579\n",
"Epoch: 73, Training time: 8.76s, Loss: 30.1733\n",
"Epoch: 74, Training time: 8.76s, Loss: 30.5201\n",
"Evaluate current model:\n",
" Epoch: 74, Validation time: 1.50s \n",
" Loss: 30.5201: \n",
" Recall@20: 0.1581 \n",
" NDCG@20: 0.3809\n",
"Epoch: 75, Training time: 8.70s, Loss: 30.2994\n",
"Epoch: 76, Training time: 8.76s, Loss: 29.8949\n",
"Epoch: 77, Training time: 8.77s, Loss: 29.7122\n",
"Epoch: 78, Training time: 8.74s, Loss: 29.7030\n",
"Epoch: 79, Training time: 8.64s, Loss: 29.6655\n",
"Evaluate current model:\n",
" Epoch: 79, Validation time: 1.49s \n",
" Loss: 29.6655: \n",
" Recall@20: 0.1609 \n",
" NDCG@20: 0.3873\n",
"Epoch: 80, Training time: 8.94s, Loss: 29.6567\n",
"Epoch: 81, Training time: 8.87s, Loss: 29.5109\n",
"Epoch: 82, Training time: 8.91s, Loss: 29.1704\n",
"Epoch: 83, Training time: 8.82s, Loss: 28.6625\n",
"Epoch: 84, Training time: 8.79s, Loss: 28.7304\n",
"Evaluate current model:\n",
" Epoch: 84, Validation time: 1.48s \n",
" Loss: 28.7304: \n",
" Recall@20: 0.1613 \n",
" NDCG@20: 0.3908\n",
"Epoch: 85, Training time: 8.85s, Loss: 29.0495\n",
"Epoch: 86, Training time: 8.76s, Loss: 28.4390\n",
"Epoch: 87, Training time: 8.81s, Loss: 28.5633\n",
"Epoch: 88, Training time: 8.83s, Loss: 28.3275\n",
"Epoch: 89, Training time: 8.96s, Loss: 27.8343\n",
"Evaluate current model:\n",
" Epoch: 89, Validation time: 1.52s \n",
" Loss: 27.8343: \n",
" Recall@20: 0.1591 \n",
" NDCG@20: 0.3895\n",
"Epoch: 90, Training time: 8.92s, Loss: 28.3271\n",
"Epoch: 91, Training time: 8.85s, Loss: 28.0346\n",
"Epoch: 92, Training time: 8.69s, Loss: 27.7937\n",
"Epoch: 93, Training time: 8.93s, Loss: 27.5649\n",
"Epoch: 94, Training time: 9.08s, Loss: 27.9189\n",
"Evaluate current model:\n",
" Epoch: 94, Validation time: 1.50s \n",
" Loss: 27.9189: \n",
" Recall@20: 0.1611 \n",
" NDCG@20: 0.3912\n",
"Epoch: 95, Training time: 8.86s, Loss: 27.9343\n",
"Epoch: 96, Training time: 8.83s, Loss: 27.2735\n",
"Epoch: 97, Training time: 8.92s, Loss: 27.3794\n",
"Epoch: 98, Training time: 8.84s, Loss: 27.2788\n",
"Epoch: 99, Training time: 8.86s, Loss: 27.4216\n",
"Evaluate current model:\n",
" Epoch: 99, Validation time: 1.50s \n",
" Loss: 27.4216: \n",
" Recall@20: 0.1656 \n",
" NDCG@20: 0.3922\n",
"Epoch: 100, Training time: 8.71s, Loss: 26.6066\n",
"Epoch: 101, Training time: 8.88s, Loss: 27.1389\n",
"Epoch: 102, Training time: 9.04s, Loss: 26.6459\n",
"Epoch: 103, Training time: 8.71s, Loss: 26.8171\n",
"Epoch: 104, Training time: 8.91s, Loss: 26.7730\n",
"Evaluate current model:\n",
" Epoch: 104, Validation time: 1.49s \n",
" Loss: 26.7730: \n",
" Recall@20: 0.1627 \n",
" NDCG@20: 0.3926\n",
"Epoch: 105, Training time: 9.06s, Loss: 26.4580\n",
"Epoch: 106, Training time: 9.12s, Loss: 25.9192\n",
"Epoch: 107, Training time: 8.93s, Loss: 26.4427\n",
"Epoch: 108, Training time: 8.77s, Loss: 26.3804\n",
"Epoch: 109, Training time: 8.86s, Loss: 26.1349\n",
"Evaluate current model:\n",
" Epoch: 109, Validation time: 1.52s \n",
" Loss: 26.1349: \n",
" Recall@20: 0.1691 \n",
" NDCG@20: 0.3950\n",
"Epoch: 110, Training time: 8.81s, Loss: 25.8410\n",
"Epoch: 111, Training time: 8.84s, Loss: 25.9275\n",
"Epoch: 112, Training time: 8.77s, Loss: 25.9278\n",
"Epoch: 113, Training time: 8.92s, Loss: 26.2235\n",
"Epoch: 114, Training time: 8.90s, Loss: 25.4737\n",
"Evaluate current model:\n",
" Epoch: 114, Validation time: 1.50s \n",
" Loss: 25.4737: \n",
" Recall@20: 0.1673 \n",
" NDCG@20: 0.3995\n",
"Epoch: 115, Training time: 8.78s, Loss: 25.7582\n",
"Epoch: 116, Training time: 8.77s, Loss: 25.3173\n",
"Epoch: 117, Training time: 8.63s, Loss: 25.4568\n",
"Epoch: 118, Training time: 8.63s, Loss: 25.3934\n",
"Epoch: 119, Training time: 8.63s, Loss: 25.2544\n",
"Evaluate current model:\n",
" Epoch: 119, Validation time: 1.50s \n",
" Loss: 25.2544: \n",
" Recall@20: 0.1689 \n",
" NDCG@20: 0.4028\n",
"Epoch: 120, Training time: 8.77s, Loss: 24.9747\n",
"Epoch: 121, Training time: 8.93s, Loss: 24.7825\n",
"Epoch: 122, Training time: 8.92s, Loss: 25.2147\n",
"Epoch: 123, Training time: 8.79s, Loss: 24.5176\n",
"Epoch: 124, Training time: 8.72s, Loss: 24.7453\n",
"Evaluate current model:\n",
" Epoch: 124, Validation time: 1.48s \n",
" Loss: 24.7453: \n",
" Recall@20: 0.1682 \n",
" NDCG@20: 0.3954\n",
"Epoch: 125, Training time: 8.78s, Loss: 24.9444\n",
"Epoch: 126, Training time: 8.81s, Loss: 24.9258\n",
"Epoch: 127, Training time: 8.77s, Loss: 24.5360\n",
"Epoch: 128, Training time: 8.70s, Loss: 24.4527\n",
"Epoch: 129, Training time: 8.65s, Loss: 24.5864\n",
"Evaluate current model:\n",
" Epoch: 129, Validation time: 1.48s \n",
" Loss: 24.5864: \n",
" Recall@20: 0.1689 \n",
" NDCG@20: 0.3977\n",
"Epoch: 130, Training time: 8.66s, Loss: 24.2351\n",
"Epoch: 131, Training time: 8.84s, Loss: 24.4298\n",
"Epoch: 132, Training time: 8.57s, Loss: 24.3624\n",
"Epoch: 133, Training time: 8.74s, Loss: 24.1980\n",
"Epoch: 134, Training time: 8.84s, Loss: 24.0672\n",
"Evaluate current model:\n",
" Epoch: 134, Validation time: 1.47s \n",
" Loss: 24.0672: \n",
" Recall@20: 0.1735 \n",
" NDCG@20: 0.4069\n",
"Epoch: 135, Training time: 8.75s, Loss: 24.4691\n",
"Epoch: 136, Training time: 8.67s, Loss: 23.9019\n",
"Epoch: 137, Training time: 8.77s, Loss: 24.1378\n",
"Epoch: 138, Training time: 8.68s, Loss: 23.8090\n",
"Epoch: 139, Training time: 8.81s, Loss: 23.9487\n",
"Evaluate current model:\n",
" Epoch: 139, Validation time: 1.48s \n",
" Loss: 23.9487: \n",
" Recall@20: 0.1687 \n",
" NDCG@20: 0.4037\n",
"Epoch: 140, Training time: 8.64s, Loss: 23.8015\n",
"Epoch: 141, Training time: 8.57s, Loss: 24.0985\n",
"Epoch: 142, Training time: 8.70s, Loss: 23.8640\n",
"Epoch: 143, Training time: 8.77s, Loss: 23.5799\n",
"Epoch: 144, Training time: 8.77s, Loss: 23.7568\n",
"Evaluate current model:\n",
" Epoch: 144, Validation time: 1.48s \n",
" Loss: 23.7568: \n",
" Recall@20: 0.1708 \n",
" NDCG@20: 0.4068\n",
"Epoch: 145, Training time: 8.75s, Loss: 23.6537\n",
"Epoch: 146, Training time: 8.77s, Loss: 23.8114\n",
"Epoch: 147, Training time: 8.64s, Loss: 23.5442\n",
"Epoch: 148, Training time: 8.51s, Loss: 23.2413\n",
"Epoch: 149, Training time: 8.77s, Loss: 23.5159\n",
"Evaluate current model:\n",
" Epoch: 149, Validation time: 1.49s \n",
" Loss: 23.5159: \n",
" Recall@20: 0.1698 \n",
" NDCG@20: 0.4052\n",
"Epoch: 150, Training time: 8.67s, Loss: 23.4435\n",
"Epoch: 151, Training time: 8.54s, Loss: 23.5388\n",
"Epoch: 152, Training time: 8.54s, Loss: 23.2494\n",
"Epoch: 153, Training time: 8.60s, Loss: 23.1259\n",
"Epoch: 154, Training time: 8.68s, Loss: 23.1326\n",
"Evaluate current model:\n",
" Epoch: 154, Validation time: 1.49s \n",
" Loss: 23.1326: \n",
" Recall@20: 0.1709 \n",
" NDCG@20: 0.4059\n",
"Epoch: 155, Training time: 8.69s, Loss: 22.8828\n",
"Epoch: 156, Training time: 8.60s, Loss: 23.0292\n",
"Epoch: 157, Training time: 8.54s, Loss: 22.9355\n",
"Epoch: 158, Training time: 8.61s, Loss: 22.7000\n",
"Epoch: 159, Training time: 8.83s, Loss: 22.9723\n",
"Evaluate current model:\n",
" Epoch: 159, Validation time: 1.47s \n",
" Loss: 22.9723: \n",
" Recall@20: 0.1731 \n",
" NDCG@20: 0.4118\n",
"Early stopping at step: 5 log:0.1731223464012146\n"
],
"name": "stdout"
},
{
"output_type": "display_data",
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAsUAAAHwCAYAAABOlBKbAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAgAElEQVR4nOzdd3zd5X33//d1ls7R3tOSlzzkiY0xZhMgCZvQ7EFIQ0Lvlmb3kdk2d9M0bXqnd0hzN00hJCG/JCQpGSYJGZhA2QYbG+8hW7b23jo60hnX7w/JRhgZrPk90vf1fDz8sM/3DH30fQjz5uJzfS5jrRUAAADgZh6nCwAAAACcRigGAACA6xGKAQAA4HqEYgAAALgeoRgAAACuRygGAACA6xGKAQAA4HqEYgBIEsaYE8aYa5yuAwDciFAMAAAA1yMUA0ASM8akGGPuNsY0jv662xiTMvpcvjHmN8aYbmNMpzHmSWOMZ/S5zxhjGowxfcaYw8aYq539TgAgufmcLgAA8Jq+IGmLpPMkWUlbJf2tpL+T9ClJ9ZIKRl+7RZI1xqyQ9NeSLrDWNhpjFknyzm7ZADC3sFIMAMntvZK+ZK1ttda2SfoHSbeNPheVVCJpobU2aq190lprJcUlpUhaZYzxW2tPWGuPOVI9AMwRhGIASG6lkk6OeXxy9Jok/R9J1ZL+aIw5boz5rCRZa6slfVzS/5bUaoz5iTGmVACAsyIUA0Bya5S0cMzjitFrstb2WWs/Za1dIulmSZ881Ttsrf2xtfbS0fdaSV+d3bIBYG4hFANAcvEbY4Knfkl6QNLfGmMKjDH5kv5e0g8lyRhzozGm0hhjJPVopG0iYYxZYYy5anRDXkTSoKSEM98OAMwNhGIASC4PayTEnvoVlLRD0h5JeyW9KOnLo69dJmmbpH5Jz0r6lrX2MY30E/+LpHZJzZIKJX1u9r4FAJh7zMieDAAAAMC9WCkGAACA6xGKAQAA4HqEYgAAALgeoRgAAACuRygGAACA6/mcLkCS8vPz7aJFi5wuAwAAAPPczp072621BWdeT4pQvGjRIu3YscPpMgAAADDPGWNOjned9gkAAAC4HqEYAAAArkcoBgAAgOsRigEAAOB6hGIAAAC4HqEYAAAArkcoBgAAgOsRigEAAOB6hGIAAAC4HqEYAAAArkcoBgAAgOsRigEAAOB6hGIAAAC4HqEYAAAArkcoBgAAgOsRigEAAOB6rg3F1lp1h4c1FIs7XQoAAAAc5tpQ/HR1h8770iPaXdvtdCkAAABwmGtD8YKckCSprmvQ4UoAAADgNNeG4tLskIyR6jrDTpcCAAAAh7k2FAd8HpVmhQjFAAAAcG8olkZaKOq6CMUAAABu5+pQXJ6bqlpWigEAAFzP1aG4IjdVLb1DikQZywYAAOBmrg7F5bkjEygauplAAQAA4GauDsUVuamSRAsFAACAy7k6FJfnjITiekIxAACAq7k6FBdkpCjF52GlGAAAwOVcHYqNMSrPTVVdJz3FAAAAbubqUCxJ5cwqBgAAcD1CMbOKAQAAXM/1obgiN1V9kZh6wlGnSwEAAIBDXB+KF+Qwlg0AAMDtXB+KT80qpq8YAADAvVwfik+dalfHSjEAAIBruT4UZwT9yk710z4BAADgYq4PxdJIC0VdF7OKAQAA3IpQrJHjnmmfAAAAcC9CsaQFuSE1dA0qkbBOlwIAAAAHEIo10j4xHE+opS/idCkAAABwAKFYI+0TklTbQQsFAACAGxGKNXZWMZvtAAAA3IhQLKk0OyRjONUOAADArQjFkgI+j0oyg6onFAMAALgSoXhUeW4qK8UAAAAuRSgeVZ6bqrouQjEAAIAbEYpHleekqqV3SJFo3OlSAAAAMMsIxaMKMlIkSV3hYYcrAQAAwGwjFI9KDXglSYPDrBQDAAC4DaF4VNA/GoppnwAAAHAdQvGo0OhKMT3FAAAA7vO6odgY81
1jTKsxZt+Ya7nGmEeMMUdHf88ZvW6MMf9ujKk2xuwxxmycyeKnU+jUSvFwwuFKAAAAMNvOZaX4+5KuPePaZyU9aq1dJunR0ceSdJ2kZaO/7pT0n9NT5swL0T4BAADgWq8biq21T0jqPOPyLZLuH/3z/ZLeMub6D+yI5yRlG2NKpqvYmRQKjNwKQjEAAID7TLanuMha2zT652ZJRaN/LpNUN+Z19aPXXsUYc6cxZocxZkdbW9sky5g+oYBPkjQ4HHO4EgAAAMy2KW+0s9ZaSXYS77vHWrvJWrupoKBgqmVM2cs9xawUAwAAuM1kQ3HLqbaI0d9bR683SCof87oFo9eS3ss9xWy0AwAAcJvJhuKHJN0++ufbJW0dc/39o1MotkjqGdNmkdRSfPQUAwAAuJXv9V5gjHlA0pWS8o0x9ZK+KOlfJP3MGHOHpJOS3jH68oclXS+pWlJY0p/PQM0zwuMxCvo9zCkGAABwodcNxdbad5/lqavHea2VdNdUi3JKyO+lpxgAAMCFONFujJDfS/sEAACACxGKxwgFWCkGAABwI0LxGKEAK8UAAABuRCgeg55iAAAAdyIUjxGkpxgAAMCVCMVjhPxeRrIBAAC4EKF4DHqKAQAA3IlQPAY9xQAAAO5EKB6DkWwAAADuRCgeg8M7AAAA3IlQPEbI71UsYRWNJ5wuBQAAALOIUDxGKOCVJFaLAQAAXIZQPEbQPxKKI/QVAwAAuAqheIyQn5ViAAAANyIUj0H7BAAAgDsRisc4HYppnwAAAHAVQvEYp9snCMUAAACuQigeg55iAAAAdyIUj0FPMQAAgDsRisegfQIAAMCdCMVjnJ5TzEoxAACAqxCKx6B9AgAAwJ0IxWO83D6RcLgSAAAAzCZC8Rhej1HA51E4GnO6FAAAAMwiQvEZQn6vImy0AwAAcBVC8RlCfi89xQAAAC5DKD5DKODVYJSeYgAAADchFJ8h6PcypxgAAMBlCMVnSA14mVMMAADgMoTiM9BTDAAA4D6E4jME/V6FaZ8AAABwFULxGUK0TwAAALgOofgMIb+HjXYAAAAuQyg+Az3FAAAA7kMoPkMwQCgGAABwG0LxGVL9Pg3HEoonrNOlAAAAYJYQis8QCozcEjbbAQAAuAeh+Awhv1eSaKEAAABwEULxGYKnQjETKAAAAFyDUHyGUICVYgAAALchFJ8hxEoxAACA6xCKz0BPMQAAgPsQis9A+wQAAID7EIrPcCoUR2ifAAAAcA1C8RlonwAAAHAfQvEZToXiMCvFAAAArkEoPkPwVPsEK8UAAACuQSg+AyPZAAAA3IdQfAa/1yOfx9BTDAAA4CKE4nGEAl5CMQAAgIsQiscR8nvpKQYAAHARQvE4QgEvPcUAAAAuQigeR8jvZSQbAACAixCKxxH001MMAADgJoTicdBTDAAA4C6E4nEwfQIAAMBdCMXjYKMdAACAuxCKxzHSPpFwugwAAADMEkLxOEJstAMAAHAVQvE4aJ8AAABwF0LxOE6NZEskrNOlAAAAYBYQiscR8nslSUMx+ooBAADcgFA8jpB/5LbQVwwAAOAOhOJxpAZ8kgjFAAAAbkEoHkcwMNI+wWY7AAAAdyAUj+NUTzFHPQMAALgDoXgcp0Ix7RMAAADuQCgeRygwclvCtE8AAAC4AqF4HEE/PcUAAABuQigeBz3FAAAA7kIoHgcj2QAAANyFUDyOEO0TAAAArkIoHkcwwIl2AAAAbkIoHkfA65HH0FMMAADgFoTicRhjFPJ7GckGAADgEoTiswgFvLRPAAAAuASh+CxCAa8irBQDAAC4AqH4LEJ+VooBAADcglB8FoRiAAAA9yAUn0XQ72VOMQAAgEsQis8iFPAykg0AAMAlCMVnwUg2AAAA9yAUnwU9xQAAAO5BKD6L1BSv+odiTpcBAACAWTClUGyM+YQxZr8xZp8x5gFjTNAYs9gYs90YU22M+akxJjBdxc6mRXlp6g5H1dE/5HQpAAAAmGGTDsXGmD
JJH5W0yVq7RpJX0rskfVXS1621lZK6JN0xHYXOtpXFmZKkw819DlcCAACAmTbV9gmfpJAxxicpVVKTpKskPTj6/P2S3jLFr+GIlSUZkqSDhGIAAIB5b9Kh2FrbIOlrkmo1EoZ7JO2U1G2tPdWMWy+pbLz3G2PuNMbsMMbsaGtrm2wZMyY/PUX56Sk61NTrdCkAAACYYVNpn8iRdIukxZJKJaVJuvZc32+tvcdau8lau6mgoGCyZcyoqpIMHWwmFAMAAMx3U2mfuEZSjbW2zVoblfQLSZdIyh5tp5CkBZIaplijY6pKMnWkpV+xeMLpUgAAADCDphKKayVtMcakGmOMpKslHZD0mKS3jb7mdklbp1aic1YWZ2g4ltCJjgGnSwEAAMAMmkpP8XaNbKh7UdLe0c+6R9JnJH3SGFMtKU/SfdNQpyNOTaA42MRmOwAAgPnM9/ovOTtr7RclffGMy8clbZ7K5yaLpYVp8nmMDjX36qb1pU6XAwAAgBnCiXavIcXn1dKCdB1ipRgAAGBeIxS/jpUlGTrErGIAAIB5jVD8OlYWZ6qhe1A9g1GnSwEAAMAMIRS/jlMn23GIBwAAwPxFKH4dq0pGJlDQQgEAADB/EYpfR2FGinJS/TrEyXYAAADzFqH4dRhjtLI4k1nFAAAA8xih+BysLMnQ4eY+JRLW6VIAAAAwAwjF56CqOFOD0bhqO8NOlwIAAIAZQCg+B6cnUNBXDAAAMC8Ris/BssIMeYzoKwYAAJinCMXnIBTwalF+mg4wqxgAAGBeIhSfo/PKs7XjRCeb7QAAAOYhQvE5urQyX13hqA7SVwwAADDvEIrP0SWV+ZKkp6vbHa4EAAAA041QfI6KMoOqLEzXU9UdTpcCAACAaUYonoBLK/P1Qk2nhmJxp0sBAADANCIUT8AllfkajMa1q7bb6VIAAAAwjQjFE3Dhklx5DH3FAAAA8w2heAIyg36tL88mFAMAAMwzhOIJurQyXy/V96g3EnW6FAAAAEwTQvEEXbw0X/GE1fbjnU6XAgAAgGlCKJ6gjQuzFfR7aKEAAACYRwjFE5Ti82rz4jxCMQAAwDxCKJ6ES5bm6Whrv1p6I06XAgAAgGlAKJ6EU0c+P3OM1WIAAID5gFA8CatKMpWd6tezxzjyGQAAYD4gFE+Cx2O0sSJHL3KyHQAAwLxAKJ6kjRXZqm7tV0+YecUAAABzHaF4kjZU5EiSdtezWgwAADDXEYonaX15tjxGevFkl9OlAAAAYIoIxZOUnuLT8qIM7apjpRgAAGCuIxRPwYaKHO2q7VIiYZ0uBQAAAFNAKJ6CjRXZ6ovEdKyt3+lSAAAAMAWE4ik4tdluF6PZAAAA5jRC8RQsyU9TVsivF2vZbAcAADCXEYqnwOMx2lCRzUoxAADAHEconqIN5Tk60tqn3giHeAAAAMxVhOIp2rgwW9ZKLzGaDQAAYM4iFE/R+vJsGcNmOwAAgLmMUDxFmUG/lhWms9kOAABgDiMUT4ONFTnaVdvNIR4AAABzFKF4GmyoyFbPYFQ1HQNOlwIAAIBJIBRPg40c4gEAADCnEYqnwZKCdKX4PDrc3Ot0KQAAAJgEQvE08HqMlhak60hLv9OlAAAAYBIIxdNkeVG6jrb0OV0GAAAAJoFQPE2WF2eosSfCyXYAAABzEKF4miwvzJAkHaWFAgAAYM4hFE+T5UWnQjEtFAAAAHMNoXiaLMgJKeT3stkOAABgDiIUTxOPx6iyMF1HW1kpBgAAmGsIxdNoeVGGDjcTigEAAOYaQvE0Wl6Urta+IfWEmUABAAAwlxCKp9GpzXZHaKEAAACYUwjF02hZUbok6QgTKAAAAOYUQvE0KssOKS3g1RH6igEAAOYUQvE0MsaosiiDsWwAAABzDKF4mq0oYiwbAADAXEMonmbLizLU3j+szoFhp0sBAADAOSIUT7NlpyZQsNkOAABgziAUT7PlTKAAAACYcwjF06w4M6iMFB+hGAAAYA4hFE8zY4yWFz
OBAgAAYC4hFM+A5UXpOtrSJ2ut06UAAADgHBCKZ8Cywgx1haNq72cCBQAAwFxAKJ4BK4pHJlAcau51uBIAAACcC0LxDFhVkilJOtBIKAYAAJgLCMUzICctoNKsoPYTigEAAOYEQvEMWVWapf2NPU6XAQAAgHNAKJ4hq0szdbx9QOHhmNOlAAAA4HUQimfImrIsWSsdbKKFAgAAINkRimfI6tKRzXb0FQMAACQ/QvEMKckKKifVr/0NhGIAAIBkRyieIcYYrS7N0v4mNtsBAAAkO0LxDFpdmqkjzf2KxhNOlwIAAIDXQCieQatKMzUcT+hoS7/TpQAAAOA1EIpn0OrSLEliXjEAAECSIxTPoMX5aQr5vUygAAAASHKE4hnk9RhVlWToAKEYAAAgqRGKZ9jq0iwdaOpVImGdLgUAAABnQSieYWvKMtU/FFNtZ9jpUgAAAHAWhOIZdmqz3T422wEAACStKYViY0y2MeZBY8whY8xBY8xFxphcY8wjxpijo7/nTFexc9GyonT5PIbNdgAAAElsqivF35D0e2vtSknrJR2U9FlJj1prl0l6dPSxa6X4vFpWlEEoBgAASGKTDsXGmCxJl0u6T5KstcPW2m5Jt0i6f/Rl90t6y1SLnOtWl2bqQGOPrGWzHQAAQDKaykrxYkltkr5njNlljPmOMSZNUpG1tmn0Nc2SiqZa5Fy3tixL7f3DaugedLoUAAAAjGMqodgnaaOk/7TWbpA0oDNaJezI0ui4y6PGmDuNMTuMMTva2tqmUEby27w4V5K0/Xinw5UAAABgPFMJxfWS6q2120cfP6iRkNxijCmRpNHfW8d7s7X2HmvtJmvtpoKCgimUkfxWFGUoJ9WvZ493OF0KAAAAxjHpUGytbZZUZ4xZMXrpakkHJD0k6fbRa7dL2jqlCucBj8fowsV5eo5QDAAAkJR8U3z/RyT9yBgTkHRc0p9rJGj/zBhzh6STkt4xxa8xL2xZkqvf729WXWdY5bmpTpcDAACAMaYUiq21uyVtGuepq6fyufPRlqV5kqTtNZ2EYgAAgCTDiXazZHlhhnLTArRQAAAAJCFC8SwZ6SvO1bPHCMUAAADJhlA8i7YsyVND96DqOsNOlwIAAIAxCMWzaMuSkb5iWigAAACSC6F4Fi0rTB/tK+YQDwAAgGRCKJ5FHo/RliW5eu54h0YO+wMAAEAyIBTPslN9xfVdg06XAgAAgFGE4ll2qq+YI58BAACSB6F4li0rTFce84oBAACSCqF4lhljdOGSXD1fw2Y7AACAZEEodsDasmzVdw2qJxx1uhQAAACIUOyIVaWZkqQDTb0OVwIAAACJUOyIqpIMSdJBQjEAAEBSIBQ7oDAjqPz0FEIxAABAkiAUO6SqJIP2CQAAgCRBKHbIqpJMHW3pVzSecLoUAAAA1yMUO2RVaaaG4wkda+t3uhQAAADXIxQ7pKpkZAIFfcUAAADOIxQ7ZEl+mgI+jw429TldCgAAgOsRih3i83q0oihDBxpZKQYAAHAaodhBVSUZOtjUK2ut06UAAAC4GqHYQatKMtUxMKy2viGnSwEAAHA1QrGDTm22289mOwAAAEcRih1UVcoECgAAgGRAKHZQZtCvBTkhNtsBAAA4jFDssKqSTFaKAQAAHEYodtiqkkzVtA9ocDjudCkAAACuRSh2WFVJphJWOtzCIR4AAABOIRQ7bDWb7QAAABxHKHbYgpyQMlJ8bLYDAABwEKHYYcYYrS/P1uNHWhVPcLIdAACAEwjFSeC9F1aornNQjxxocboUAAAAVyIUJ4E3rS7WgpyQvvtUjdOlAAAAuBKhOAl4PUYfuHiRnj/Rqb31PU6XAwAA4DqE4iTxzgvKlZ7i031PHXe6FAAAANchFCeJjKBf79hUrt/saVJzT8TpcgAAAFyFUJxE/vySRUpYqx88e8LpUgAAAFyFUJxEynNT9aZVxfrR9lqFh2NOlwMAAOAahOIkc8dli9UzGNXPXqhzuhQAAADXIB
QnmU0Lc3TRkjz92yNH1NpLbzEAAMBsIBQnGWOMvvJnazUcS+jvt+53uhwAAABXIBQnocX5afrEG5fr9/ub9bu9TU6XAwAAMO8RipPUhy5drDVlmfq7rfvVE446XQ4AAMC8RihOUj6vR1996zp1hYf15d8ecLocAACAeY1QnMRWl2bpLy5fov/eWa+tuxucLgcAAGDeIhQnuY9evUybF+Xq4z/drfufOeF0OQAAAPMSoTjJBf1e/eCOzbqmqkhffGi/vvaHw7LWOl0WAADAvEIongOCfq/+870b9e7N5fp/j1Xrsz/fq3iCYAwAADBdfE4XgHPj83r0lVvXKj89Rd/8U7VCAa++eNMqGWOcLg0AAGDOIxTPIcYYfepNKxSJxnXvkzVakBPShy5b4nRZAAAAcx6heA763HVVauge1Jd/e1AlWSHdsK7E6ZIAAADmNHqK5yCPx+j/vuM8bVqYo0/8bLdeONHpdEkAAABzGqF4jgr6vbr3/Zu0IDukD37vBf1o+0kl2HwHAAAwKYTiOSwnLaAf3LFZq0oz9YVf7tPbvv2MDjb1Ol0WAADAnEMonuMW5KTqJ3du0b+9fb1OdIR14zef0j8/fFCRaNzp0gAAAOYMQvE8YIzRW89foEc/eYXetnGB/uuJ47rpm09pb32P06UBAADMCYTieSQnLaCvvm2d7v/gZvVGorr1W0/r7m1HFI0nnC4NAAAgqZlkODJ406ZNdseOHU6XMa/0hKP64kP79KvdjcpLC+iSynxdWpmvS5flqzQ75HR5AAAAjjDG7LTWbjrzOnOK56msVL/uftcG3XxeqX79UpOeqm7XQy81SpL+bGOZvnLrWgX9XoerBAAASA6E4nnuqpVFumplkay1OtLSr1/uatC3/+eYjrb0679uO59VYwAAANFT7BrGGK0oztBnr1upe9+/STXtA7r5/z3FwR8AAAAiFLvSG1cV6Vd3XayMoF/vufc53b3tCCPcAACAqxGKXaqyMEO/uusSXbemRHdvO6o33/2EHjvc6nRZAAAAjmD6BPR0dbv+bus+HW8b0DVVhbpqZZHWlmVpeXG6UnxsxgMAAPPH2aZPEIohSRqOJXTvk8d175PH1R2OSpL8XqMN5Tn61JuW68IleQ5XCAAAMHWEYpwTa63qOge1r7FHext6tHVXgxp7IrpuTbE+d12VKvJSnS4RAABg0gjFmJTB4bi+8+RxfevxY4onrN61uVw3ry/VxooceTzG6fIAAAAmhFCMKWnuiehrfzysh3Y3ajieUFFmiq5dXay3nV+utQuynC4PAADgnBCKMS36IlH96VCrHt7bpMcPt2koltBly/L1l1cu1UVL8mQMq8cAACB5EYox7XojUf3ouVrd91SN2vuHdF55tr50y2qtW5DtdGkAAADjOlsoZk4xJi0z6NdfXrlUT33mDfrHt6xRc09Eb//2s/r1S41OlwYAADAhhGJMWdDv1W1bFuq3H71U6xZk6SMP7NLd244oGf4vBAAAwLnwOV0A5o+89BT98EMX6vO/2Ke7tx3V4eY+XbAoV/1DMfUPxWSt1RXLC7VlSa58Xv57DAAAJA96ijHtrLW654nj+pffH9KpH6+Q36u4tRqOJZSXFtC1a4r1ZxsX6PyFOc4WCwAAXIWNdph13eFhWSulB33yez2KRON6/HCrfr2nSX862KrBaFzXVBXpc9ev1NKCdKfLBQAALkAoRlIJD8d0/zMn9R+PVSsSjet9Wxbqo1cvU25awOnSAADAPEYoRlJq7x/S1x85ogeer5Xf69EN60r03gsrtLEih5nHAABg2hGKkdSqW/v0vadPaOvuRvUPxbSiKEPvuKBcN60vUWFG0OnyAADAPEEoxpwwMBTTQy816oHna7WnvkceI11Sma9bzivTjetKFPR7nS4RAADMYYRizDnVrX3aurtRv9rdoLrOQZVlh/SFG6p03ZpiWisAAMCkEIoxZ1lr9VR1u/7ptwd1qLlPFy7O1RdvWq1VpZ
lOlwYAAOYYQjHmvFg8oZ+8UKd/++NhdQ9GddmyAr3rgnJdU1WkgI/DQAAAwOsjFGPe6AlHdd/TNfrvHXVq6okoNy2gG9eVaE1plpYWpmlpQbqyUxntBgAAXo1QjHknnrB68mibfvpCnR491KrhWOL0c5WF6frKrWu1eXGugxUCAIBkM2Oh2BjjlbRDUoO19kZjzGJJP5GUJ2mnpNustcOv9RmEYkxVPGFV3xXWsbZ+Vbf264fP1aquK6wPXLxIn37zSoUCTK0AAABnD8XT0Yj5MUkHxzz+qqSvW2srJXVJumMavgbwmrweo4V5abpqZZHuvHypfvexy/T+LQv1vadP6NpvPKHf7GlUc0/E6TIBAECSmtJKsTFmgaT7Jf2TpE9KuklSm6Ria23MGHORpP9trX3za30OK8WYKc8e69Cnf/6S6joHJUmFGSlatyBbN60v0U3rSuXxMNoNAAA3OdtKsW+Kn3u3pE9Lyhh9nCep21obG31cL6lsil8DmLSLluZp2yev0L6GXu2p79ae+h7tONmpbT9p0XeerNHnrl+pi5fmS5K6Bob1xNE2HWzq01s3lmlZUcbrfDoAAJgvJh2KjTE3Smq11u40xlw5ifffKelOSaqoqJhsGcDrSvF5df7CHJ2/MEeSlEhY/Wp3g772h8N6z73bdWllvsLDMe2u61Zi9H+c3Pvkcd22ZaE+cc1yZaX6HaweAADMhkm3Txhj/lnSbZJikoKSMiX9UtKbRfsE5oBINK77nzmh7zxVo9KsoK5cUagrVxSoPDdVd287oh9vr1VWyK9PvmmF3n1BuXxeZiEDADDXzehIttGV4r8ZnT7x35J+bq39iTHm25L2WGu/9VrvJxQjGR1s6tU//Hq/njveqZXFGfr7m1adbrUAAABz00xOnzjTZyR90hhTrZEe4/tm4GsAM66qJFMPfHiL/vO9G9UXiek9927XX/5wp4639SsZ5nsDAIDpw+EdwDmIROO694nj+tbjxzQYjasoM0UbK3K0oSJbV60sUmVhutMlAgCAc8CJdsA0aO6J6A/7m/VibZd21XartjMsj5HeunGBPvHG5SrNDkmS+odiemh3o363r0m3X7RI16wqcrhyAAAgEYqBGdHSG9G9T2+gTucAAB9TSURBVBzXD549KRnp9osWqn8orod2N2hgOK70FJ/CwzF95da1etdmpqwAAOC0mZpTDLhaUWZQf3vjKn3gkkX6+iNH9Z2napTi8+imdaV694UVWlGUob/60Yv67C/2qr1/SHe9oVLGcGAIAADJhpViYBo190SUmuJVZvDl2cbReEKfeXCPfrGrQe+9sEK3bihTcVZQhRlBBXyMeQMAYDaxUgzMguKs4Kuu+b0efe3t65WfkaJ7njiuH22vPf3ckoI0feXWtdqyJG82ywQAAGdgpRiYRcfb+lXbGVZzT0TNvRFt3d2okx0D+us3VOqjVy/jgBAAAGYYK8VAElhSkK4lBS+Pb/vwZUv0xYf269//VK2nj3XoI1dVqrq1X7vrurW3oUchv1eXLcvXFcsLtWlRjoJ+r4PVAwAwf7FSDCSBrbsb9IVf7lP/UEySVJYd0tqyLPVGotpxokvD8YSCfo+2LMnT5csKdPnyAi0tSGPTHgAAE8RKMZDEbjmvTJsX5+pwc59Wl2apICPl9HPh4ZieO96hJ46064kjbfrS4QOSRoLzuzeX67aLFikr5D/bRwMAgHPASjEwx9R1hvXE0Tb9fl+znjzarowUn95/8UJ98JLFyktPef0PAADAxTi8A5iH9jX06FuPV+t3+5rlMUY5qX5lhfzKSQ1oUX6aPnb1MpXnpjpdJgAASYNQDMxj1a39emh3g9oHhtUdHlbXQFQv1XcrnrD6qysr9RdXLGGTHgAAIhQDrtPUM6gv//agfrunSQvzUnXXlZVaWZKhRflpygz6Za1Vc29ER1v6dbJjQBdX5mvpmMkYAADMR4RiwKWeOtquv39on463DZy+lpcW0HAsob
7RaReSlOLz6DPXrtQHLl4kj4epFgCA+YlQDLhYLJ7QsbYB1bQP6ETHgGraBhTwebS8KF2VhRkqyAjonx8+pEcPteqiJXn62jvWqyw7NO5nWWtlrQjOAIA5iVAM4DVZa/WzHXX60q8PyBijSyrztLo0S2vKMlWSFdJLdd167niHttd0qmcwqpvXl+rdmyu0bkEW85IBAHMGoRjAOanrDOvrjxzRrrpu1bQPvOK5/PSALlycp6Dfq4f3NmkwGldVSaY+fNli3bqhjHAMAEh6hGIAE9YXiepgU58auwe1pizrFafo9UWi2rq7UT/aXquDTb26pqpI//LWtcpnVjIAIIkRigHMiETC6rtP1+hf/3BYmUGf/vVt63TVyiKnywIAYFxnC8UeJ4oBMH94PEYfumyJfv3Xlyo/PUUf/P4OffSBXTrc3Od0aQAAnDNWigFMm6FYXN98tFrffbpG4eG4rqkq1F9euVR5aSmq7QyrtjOspp5BBX1eZaX6lRn0qzAjRRsX5nC4CABgVtA+AWDWdA0M6/5nT+j7z5xQdzj6iuc8Rkqc8ddOWsCrK1cU6k2ri3TVykJlBP2zVywAwFUIxQBm3cBQTL/d0yRjpIrcVFXkpaooI6hoIqHewZh6I1HVdoT1yMEWPXKgRW19Q8pO9evb7ztfW5bkOV0+AGAeIhQDSGqJhNXO2i597hd7dbJjQP/6tnW6dcMCp8sCAMwzbLQDkNQ8HqMLFuXq5//rYm1amKtP/PQl3b3tiJLhP9wBAPOfz+kCAGCsrFS/7v/gZn3uF3t197aj+tOhVpXnpio/LaC89BStKsnUxZV5Sg3w1xcAYPrwbxUASSfg8+hrb1+nqpIM/XF/iw429qq9f0i9kdjp5y9akqc3rCjQW89fwMY8AMCU0VMMYM6IROPacaJLjx1u1WOHWnW8fUCVhem67/ZNWpiX5nR5AIA5gI12AOadZ6rb9Vc/flFG0n/dtkmbF+eefu5kx4COtvTr0mX5zEAGAJxGKAYwL51oH9AH739BdZ1hff76KoWH43p4b5P2N/ZKknLTAnrfhRV630ULVZgRdLhaAIDTCMUA5q2ecFR3/fhFPVXdLknaWJGt69eWaElBmn68vU6PHmqR3+PRJZV5ygj6FfR7lOLzamVJht66cQEryQDgIoRiAPNaNJ7Q09XtWl6UodLs0Cueq2kf0PefrtH2mk4NxRKKROMKD8fVMxhVfnqK7rh0sd63pYINewDgAoRiABjDWqvtNZ36j8eq9eTRdmUEfbr9okX680sWKS89xenyAAAzhFAMAGext75H33q8Wr/f36wUn0fvuqBCd16+5FUrzgCAuY9QDACvo7q1X9/+n2P61a4GWUklWUFlp/qVkxpQeopPfZGYugeH1TUQVcJanb8wRxcvzdfFS/O0MC9VxhinvwUAwOsgFAPAOWroHtQD22vV0D2o7vCwugej6o/ElB70KSc1oOxUv2Jxq+01HWrpHZIkrS/P1g/v2ExfMgAkubOFYk60A4AzlGWH9DdvXvG6r7PWqqZ9QI8dbtM/P3xQH3lgl77z/k3yeT2zUCUAYDrxNzcATJIxRksK0nXHpYv1pVvW6PHDbfqnhw86XRYAYBJYKQaAafCeCytU3dqv7z5do8rCdL33woVq6Y3ox9tr9d876pSa4tOllfm6tDJfFy7Jpc0CAJIMoRgApskXbqjS8fZ+/f3W/XrsUKseP9ymuLW6fFmBrKSfvFCr7z9zQj6P0U3rS3XXGypVWZjudNkAALHRDgCmVV8kqrd/+1k1dg/qnReU631bFmphXpokKRKN68XaLv1xf4t++kKdIrG4blhbor++qlIrizMdrhwA3IHpEwAwS4ZicUlSiu/sx0e39w/pO0/W6P979oQGhuMqzQpqw8IcbSjPVlVJpgI+jzzGyOsx8nuN0gI+paZ4lRbwnX7OY8QYOACYIEIxACShroFh/Wp3g3ae7NKu2m41dA9O6P0eI5VkhbSyOEMrSz
K0sjhTV64ooGcZAM6CUAwAc0BLb0THWvsVt1bxhFXCWg3HEgoPx0d/xTQcSyhhpXjCKpZIqLZzUIeaenW8fUDxhFVqwKu3bCjT+y5cqFWltGUAwFjMKQaAOaAoM6iizOCk3jsUi2tfQ49+8nydfr6zXj/eXqtNC3P0+RuqtLEi53Xf3z8UU4rPIz9zlgG4ECvFADAPdYeH9eDOen3nyRq19EX0rgsq9JlrVyg7NfCK18UTVk8cbdPPXqjTtoMtKsoM6pvv3qAN5xCiAWAuon0CAFyofyimux85ou89c0LZIb/uuGyxEgmrjoFhdQ4M6/maTjX1RJSbFtDN60v1yIEWtfRG9OlrV+hDly6Rx8NGPgDzC6EYAFzsQGOvvvCrvdpV2y1JSk/xKTctoGWF6Xrb+Qt0dVWRAj6PesJRfebne/T7/c26ckWBtizJU03bgI6396uxO6LVpZm6ckWhrlxRoNLskMPfFQBMHKEYAFzOWqu2/iFlBv0K+s8+Ls5aqx8+d1L/+JuDGo4nlJ8e0JL8dBVlBfXiya7TEzKqSjL1qTcu1zWril7x/s6BYX1j2xE1dEd0w7pivWlVsdJS2MICIDkQigEAE9IdHpYxRlmhl8e7WWtV3dqvxw+36Scv1OpY24CuWlmoL960SgtyUvXj52v1tT8cVv9QTIUZKWrqiSjo9+iNq4p1/ZpiXbIsX5mMiwPgIEIxAGBaDccS+v4zNfrGtqOKJqwqclNV3dqvLUty9aVb1qiyIF07a7u0dXeDfrunSV3hqHweo40Lc3TligJdsbxAq0oyX3EAyf7GHt3/zAk9erBVkWhc0YRVLJ5QWU5I33z3Rp1Xnu3gdwxgPiAUAwBmRHNPRP/8u4M61NSnu66q1E3rSl510l40ntCu2m49frhVjx9u04GmXklSYUaKrlheoLULsvSbl5r0/IlOhfxeXbumWLlpAfm8Rj6P0dbdjWrtHdKXb12jd2wqd+LbBDBPEIoBAEmjtTei/znSpsePtOnJI23qjcS0ICek2y9apHdsKldW6itbLLoGhvWRB3bpqep2vf+ihfr89VVq6xtSXVdY9V2DCng9KssJqSw7pKLMoFr7Ijrc3KcjLX3q6B/WBy5ZpJIsNgYCIBQDAJJULJ7QiY6wFuenyfsaI+Bi8YT+9Q+Hdc8Tx1/z84yRxv6rzWOk8txUPfDhLa+amNE5MKzmnggn/wEuQigGAMwLjxxo0Z76bi3ICWlBTqoW5IQUjSdU3zWo+q5BNfdEVJiZohVFGVpelKGajgHdft/zyk7z64EPb9GCnFRZa/WLFxv0j789oO5wVDevL9UXbqia9GmCAOYOQjEAwLVequvWbfdtV2bIr397+3r9x+PH9MSRNm2syNbmxXn67tM18nuMPnbNMl1TVaS9DT3aU9+jvQ09Go4llJ7iU3qKTxlBny5dlq83ry5+zbF2AJIXoRgA4Gp763v0vvu2q2cwqtSAV59+8wrddtEieT1GJzsG9KVfH9Cjh1pPvz7F59Gq0kylp/g0MBRT/1BM7f0jJwFmBn26+bxSvXNThdYuyBr361lrFR6OKzXgfdXGQwDOIRQDAFzvQGOvfrajTh+6bLEW5KS+6vknj7apoWtQaxdkaXlRhvxezyueTySsnjveoZ/tqNPv9jVrKJbQLeeV6os3rVZuWuAVX+dzv9ijl+p75PcaZYUCyk3zq7IwXW9YUag3rCxUfnrKpL8Pay1BG5gkQjEAANOoZzCq7z5Vo289Xq3MoF//cMtqXVNVpLu3HdW9Tx5XTqpft21ZpKFYXF3hkRXm3XXdaukdkjHSeeXZunZ1sW4+r/ScJ2McaenTr3Y1aOvuRqX4PLrn/eersjBjhr9TYH4hFAMAMAMONffq0w/u0Z76HmUGfeqNxPTOTeX63PUrlZ0aeMVrrbXa39irPx1q1baDLdpT3yNjpC2L83Tj+hL5PR4190bU3BtRR/+QEmP+FV3fNaiDTb3yeowuW5avfQ29ii
US+t4HLtCGipxZ/q6BuYtQDADADInFE7rvqRo9erBVn3jjcl20NO+c3lfTPqCtuxv0q10NOtERPn09Ny2g/PSAvJ6X2zcygj5dv6ZYN64vVX56ik52DOi2+55XW9+QvvW+jXrDikJVt/bpod2N2nawVatKM/Xxa5aN2yYCuBmhGACAJGWt1bG2fqX4vCrMTFGK79wmW7T1DekD33teh5v7tKQgTUda+k+3Zuxv7JWsdNtFC3XXGyqVGvCqurVfR1v7VN85qPSgT7lpAeWkBlSYmaJFeWlM1IArEIoBAJiH+iJRffYXe9XaG9ENa0t0/doSFWYG1dA9qG9sO6IHd9bL5/UoFk+8oh3jTB4jLcpL07KidK0vz9YNa0u0MC/tNb92JBrXk0fbdUllnlIDvmn+zoCZQSgGAMCFjrb06Ufba5UZ8o8eaJKu8txUhYfj6hwYVnd4WE09ER1t6dORln4daenT8fYBSdLasizduK5EV1cVaWlB2umJF7F4Qg/urNc3Hj2qpp6I1pRl6jvvv0DFWa99+EkkGld7/5ACXo98Xo/8XqP0FB+TNDCrCMUAAOCc1HeF9bu9zfrNnka9VN8jScpJ9ev8hblaVZqpX7/UqJr2AZ1Xnq0b15Xo648cUXrQp/tuv0Bryl6e29w/FNMLNZ3aXtOp52s6tLehR9H4K3PHssJ0/a8rlurm80pfNQIPmAmEYgAAMGF1nWE9e6xDL5zo1I6TXappH9CKogz9zZtX6JqqQhljdKi5V3d8f4c6B4b1tzdWqWtgWE8cbdeLJ7sUS1j5vUZry7K0eXGeFuenKpawisYSGowmtHV3gw4196ksO6Q7L1+iSyrzVJgZVMboCvJQLK4T7WEdbe1TdziqN68uVkHG5Gc8A4RiAAAwZb2RqNIDPnk8r2x5aO2L6M4f7NTuum5J0urSTF2+vECXVuZrY0WOQoHxN/FZa/XY4VZ967Fj2nGy6/T1kN+r7FS/WvuGFB/TDB3wenTDuhLdfvEirSvL0rG2fj1/olM7TnSpOzysnLSA8tICyk1L0eL8VK0py1JZduisLRp1nWH95/8c0976Hn38mmW6uqpoqrcISY5QDAAAZlQkGtfOk11aUZwxqRP79jX06Hj7gFp7I2rpjahjYFhl2SFVFqarsjBdXo/RA9tr9eDOeg0Mx5UW8GpgOC5Jyk9PUXFWiroGouoYGFIkmjj9uTmpfq0py9KKogxVFqZraWG6UgNefe/pE/rlrgZ5jVFxVlC1nWG9c1O5/vbGKmUE/ae/pxdOdCrF59UFi3Im3P88FIurqTuihXmp9E4nCUIxAACYF/oiUf18Z72OtvbrvPJsXbAo91Whc2AopqOt/dpb3629DT3a19CrY239Goq9HJZTfB6958IK/cXlS5WT5tc3th3Vt//nmEqzQ3rvhQv1wolOPXusQ4PRkeC9JD9N79pcrrduXKCskF8nOsI60tKnmvYBFWakqKokU5WF6Qp4Pdpxsku/3NWg3+5pVG8kpoKMFF2xvEBXrijQZcsKlBXyz/p9wwhCMQAAcLV4wqqxe1DVrf1q6Y3o6qqiV/Un7zzZqU/+7CWd7AirIjdVV64YCbLd4ageeL5WL5zokt9rZGQ0HE+86mt4jJQZ8qs7HFXI79W1a4q1sSJb22s69eTRdvUMRuX3Gl2xvEA3rS/VNVVFSg14daIjrN11XdpT36OA16MFOSEtyElVYWaK6jrD2t/Yq/2NvWroGtTmxbm6bk2xNi/OlY/NiRNGKAYAADgHQ7G4OvqHVZIVfFXLw9GWPj34Yr0kaXlhhlYUZ2hxfppaeiM63Nyng819auga1GXL8vXGVUVKS3l5fnMsntBL9d36w/4W/fqlRjX1RBT0exT0e9Udjkoa6aWOW6vh2CsDt9djVFmQrsLMFL1wolORaEK5aQG9aVWRbt1QpgsW5b6iz7tzYFiPHGhWVziqvLSA8jNSlJcWUH8kpsaeiBq7B9XRP6TzF+XqmqpCV8
2ZJhQDAAAkiUTC6sXaLv1mT5Mi0bjOK8/W+vJsLS/KkJHUPjCk+q5BNfdEVJod0srijNMnDoaHY3riSJt+t69Z2w60aGA4rgU5If3ZhjIVZQX1u73NevZ4xys2KI4n5PdqMBpXasCrN60q0g3rSrWqNFMlmcFXbaScTwjFAAAA80x4OKY/7m/Rz1+s19PV7UpYaVFeqq5fW6Ib1o2cStjRP6T2/mF19A8pPehTaVZIxVlBBbwePX+iU1t3N+rhvU3qGRxZrQ76PVqUl6bS7JAkKWGtElbKTw/oqpWFunx5gTKDL/dEDwzFdKytX0G/VyVZwdObFF9LXyR6Tq+bCYRiAACAeaylN6KewaiWFaZPeNLFcCyhF2u7dKytXzVtAzrePqCW3og8xshjJGOMTnYMqCs80hN94eI8pQa8OtzSp5Md4Vd8VkaKT2U5IV2wKFdXrijQRUtHjgE/dSjMw/uadLIjrO2fv9qRA1sIxQAAAJi0+GjLx7aDLfrTwVbFrVVVcaZWFI8cHz4ct2rqHlRj96BOdIT1fE2nBqNxBXweVeSmqrq1X5K0pixT160p0Z9fssiRXmZCMQAAAGbNqRnPjx9u05GWPl1Sma/r1hRrYV6ao3WdLRS7Z6shAAAAZk3Q79Vly0bmMs8FDLcDAACA6xGKAQAA4HqEYgAAALgeoRgAAACuRygGAACA6xGKAQAA4HqEYgAAALjepEOxMabcGPOYMeaAMWa/MeZjo9dzjTGPGGOOjv6eM33lAgAAANNvKivFMUmfstaukrRF0l3GmFWSPivpUWvtMkmPjj4GAAAAktakQ7G1tsla++Lon/skHZRUJukWSfePvux+SW+ZapEAAADATJqWnmJjzCJJGyRtl1RkrW0afapZUtF0fA0AAABgpkw5FBtj0iX9XNLHrbW9Y5+z1lpJ9izvu9MYs8MYs6OtrW2qZQAAAACTNqVQbIzxayQQ/8ha+4vRyy3GmJLR50sktY73XmvtPdbaTdbaTQUFBVMpAwAAAJiSqUyfMJLuk3TQWvt/xzz1kKTbR/98u6Stky8PAAAAmHm+Kbz3Ekm3SdprjNk9eu3zkv5F0s+MMXdIOinpHVMrEQAAAJhZkw7F1tqnJJmzPH31ZD8XAAAAmG2caAcAAADXIxQDAADA9czI1DSHizCmTSP9x07Il9Tu0Neei7hfE8c9mxju18RxzyaG+zVx3LOJ4X5N3Gzes4XW2leNPkuKUOwkY8wOa+0mp+uYK7hfE8c9mxju18RxzyaG+zVx3LOJ4X5NXDLcM9onAAAA4HqEYgAAALgeoVi6x+kC5hju18RxzyaG+zVx3LOJ4X5NHPdsYrhfE+f4PXN9TzEAAADASjEAAABcz7Wh2BhzrTHmsDGm2hjzWafrSUbGmHJjzGPGmAPGmP3GmI+NXs81xjxijDk6+nuO07UmE2OM1xizyxjzm9HHi40x20d/1n5qjAk4XWMyMcZkG2MeNMYcMsYcNMZcxM/Y2RljPjH6z+M+Y8wDxpggP2OvZIz5rjGm1Rizb8y1cX+mzIh/H713e4wxG52r3DlnuWf/Z/Sfyz3GmF8aY7LHPPe50Xt22BjzZmeqds5492vMc58yxlhjTP7oY37GdPZ7Zoz5yOjP2X5jzL+OuT7rP2OuDMXGGK+k/5B0naRVkt5tjFnlbFVJKSbpU9baVZK2SLpr9D59VtKj1tplkh4dfYyXfUzSwTGPvyrp69baSkldku5wpKrk9Q1Jv7fWrpS0Xv9/e/caYsdZx3H8+yObhKSF3oKxZitbNfVFai+hSvGGjSJtLV3BQiMBqxaEvPDypmoNCIIvRERLvVS0pYkaLFpjDYLSmpYqaBttyKX1mrYh3bAxCZJ4JY3154vniZ2e3bObgMnMZn4fGHbmmTmH5/z3f878z8wzc0rskmPTkLQM+Ahwle1LgXnAapJjg9YD1w60Dcup64DldfoQcNdp6mPXrGdqzB4CLrV9GfBH4HaAuh9YDayoj/
la3a/2yXqmxgtJFwHvBPY2mpNjxXoGYibpGmAcuNz2CuALtb2VHOtlUQy8Adht+xnbzwP3Uf4p0WB70va2Ov83SrGyjBKrDXWzDcC72+lh90gaBd4F3F2XBawC7q+bJF4Nks4B3grcA2D7eduHSY7NZARYJGkEWAxMkhx7Cds/B/4y0Dwsp8aBb7l4DDhX0oWnp6fdMV3MbD9o+9918TFgtM6PA/fZPmr7WWA3Zb/aG0NyDOBLwMeB5gVbyTGGxmwt8DnbR+s2B2p7KznW16J4GfBcY3mitsUQksaAK4HHgaW2J+uq/cDSlrrVRXdQPhD/U5cvAA43dizJtZe6GDgI3FuHnNwt6SySY9OyvY9yJGUvpRg+AjxBcuxEDMup7A9OzAeBn9T5xGwaksaBfbZ3DKxKvIa7BHhLHf71qKTX1/ZWYtbXojhOgqSzgR8AH7P91+Y6l9uX5BYmgKQbgAO2n2i7L3PICLASuMv2lcA/GBgqkRx7UR0HO075MvEK4CymOYUbM0tOnRxJ6yjD6Ta23ZeukrQY+BTw6bb7MseMAOdThmjeBnyvnmFtRV+L4n3ARY3l0doWAyTNpxTEG21vqs1/Pn7qp/49MOzxPfMm4EZJeyhDclZRxsueW091Q3Jt0AQwYfvxunw/pUhOjk3vHcCztg/aPgZsouRdcmx2w3Iq+4MZSHo/cAOwxi/ewzUxm+rVlC+rO+o+YBTYJunlJF4zmQA21aElWylnWZfQUsz6WhT/Glher9heQBnMvbnlPnVO/bZ2D/A7219srNoM3FLnbwF+dLr71kW2b7c9anuMklMP214DPALcVDdLvBps7week/Ta2vR24Lckx4bZC1wtaXF9fx6PV3JsdsNyajPwvnqHgKuBI41hFr0m6VrKcLAbbf+zsWozsFrSQkkXUy4g29pGH7vC9i7bL7M9VvcBE8DK+hmXHBvuAeAaAEmXAAuAQ7SVY7Z7OQHXU66mfRpY13Z/ujgBb6acYtwJbK/T9ZRxsluAPwE/A85vu69dm4C3AT+u86+qb+bdwPeBhW33r0sTcAXwm5pnDwDnJcdmjNdngN8DTwLfBhYmx6bE6LuUMdfHKMXJrcNyChDlbkRPA7sod/Zo/TV0JGa7KeM6j3/+f72x/boasz8A17Xd/y7Ea2D9HmBJcmzWHFsAfKd+nm0DVrWZY/lFu4iIiIjovb4On4iIiIiI+J8UxRERERHReymKIyIiIqL3UhRHRERERO+lKI6IiIiI3ktRHBHRMkkvSNremD45+6NO+LnHJD35/3q+iIgz1cjsm0RExCn2L9tXtN2JiIg+y5HiiIiOkrRH0ucl7ZK0VdJravuYpIcl7ZS0RdIra/tSST+UtKNOb6xPNU/SNyU9JelBSYtae1ERER2Vojgion2LBoZP3NxYd8T264CvAHfUti8DG2xfBmwE7qztdwKP2r4cWAk8VduXA1+1vQI4DLznFL+eiIg5J79oFxHRMkl/t332NO17KD97+oyk+cB+2xdIOgRcaPtYbZ+0vUTSQWDU9tHGc4wBD9leXpc/Acy3/dlT/8oiIuaOHCmOiOg2D5k/GUcb8y+Q60kiIqZIURwR0W03N/7+qs7/Elhd59cAv6jzW4C1AJLmSTrndHUyImKuy9GCiIj2LZK0vbH8U9vHb8t2nqSdlKO9761tHwbulXQbcBD4QG3/KPANSbdSjgivBSZPee8jIs4AGVMcEdFRdUzxVbYPtd2XiIgzXYZPRERERETv5UhxRERERPRejhRHRERERO+lKI6IiIiI3ktRHBERERG9l6I4IiIiInovRXFERERE9F6K4oiIiIjovf8CyBMWCmhaCugAAAAASUVORK5CYII=\n",
"text/plain": [
"<Figure>"
]
},
"metadata": {
"tags": [],
"needs_background": "light"
}
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "wBD0wM3JVXol"
},
"source": [
"### Appendix"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Iz0QI1P7VaDP"
},
"source": [
"#### References\n",
"1. [https://medium.com/@yusufnoor_88274/implementing-neural-graph-collaborative-filtering-in-pytorch-4d021dff25f3](https://medium.com/@yusufnoor_88274/implementing-neural-graph-collaborative-filtering-in-pytorch-4d021dff25f3)\n",
"2. [https://github.com/xiangwang1223/neural_graph_collaborative_filtering](https://github.com/xiangwang1223/neural_graph_collaborative_filtering)\n",
"3. [https://arxiv.org/pdf/1905.08108.pdf](https://arxiv.org/pdf/1905.08108.pdf)\n",
"4. [https://github.com/metahexane/ngcf_pytorch_g61](https://github.com/metahexane/ngcf_pytorch_g61)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "_JywmGguVbNU"
},
"source": [
"#### Next\n",
"\n",
"Try out this notebook on the following datasets:\n",
"\n",
"![](https://github.com/recohut/reco-static/raw/master/media/images/120222_data.png)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "MppKYNXJVswT"
},
"source": [
"Compare the performance with these baselines:\n",
"\n",
"1. MF: Matrix factorization optimized by the Bayesian personalized ranking (BPR) loss, which exploits only direct user-item interactions as the target value of the interaction function.\n",
"2. NeuMF: A state-of-the-art neural CF model that applies multiple hidden layers on top of the element-wise product and the concatenation of user and item embeddings to capture their nonlinear feature interactions. Specifically, a two-layer plain architecture is used, with every hidden layer having the same dimension.\n",
"3. CMN: A state-of-the-art memory-based model, in which the user representation attentively combines the memory slots of neighboring users via memory layers. Note that first-order connections are used to find similar users who interacted with the same items.\n",
"4. HOP-Rec: A state-of-the-art graph-based model, in which high-order neighbors derived from random walks are exploited to enrich the user-item interaction data.\n",
"5. PinSage: Designed to apply GraphSAGE on the item-item graph; here it is applied to the user-item interaction graph. Specifically, two graph convolution layers are used, with the hidden dimension set equal to the embedding size.\n",
"6. GC-MC: Adopts a GCN encoder to generate representations for users and items, considering only first-order neighbors. Hence a single graph convolution layer is used, with the hidden dimension set to the embedding size."
]
},
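{
"cell_type": "markdown",
"metadata": {},
"source": [
"The BPR loss mentioned for the MF baseline can be sketched in a few lines. This is a minimal NumPy illustration of the idea (not the baseline implementation itself): given predicted scores for an observed (positive) item and an unobserved (negative) item, BPR minimizes the negative log-sigmoid of their difference, pushing positive scores above negative ones.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"def bpr_loss(pos_scores, neg_scores):\n",
"    # BPR: -mean(log(sigmoid(pos - neg)))\n",
"    diff = pos_scores - neg_scores\n",
"    return -np.mean(np.log(1.0 / (1.0 + np.exp(-diff))))\n",
"\n",
"pos = np.array([2.0, 1.5])\n",
"neg = np.array([0.5, 1.0])\n",
"print(bpr_loss(pos, neg))  # shrinks toward 0 as pos scores exceed neg scores\n",
"```"
]
},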
{
"cell_type": "markdown",
"metadata": {
"id": "e7RRc2UQBuc9"
},
"source": [
"## A simple recommender with tensorflow\n",
"> A tutorial on how to build a simple deep-learning-based movie recommender using the TensorFlow library."
]
},
{
"cell_type": "code",
"metadata": {
"id": "hLtJPt_5idKN"
},
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"import tensorflow as tf\n",
"from tensorflow import keras\n",
"from tensorflow.keras import layers\n",
"from tensorflow.keras import models\n",
"\n",
"tf.random.set_seed(343)"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "DNLlAwKUihC1"
},
"source": [
"# Clean up the logdir if it exists\n",
"import shutil\n",
"shutil.rmtree('logs', ignore_errors=True)\n",
"\n",
"# Load TensorBoard extension for notebooks\n",
"%load_ext tensorboard"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 204
},
"id": "8IRTF0EVjQuX",
"outputId": "932eaa43-725c-4fb8-e9d4-dca92ced4cf0"
},
"source": [
"movielens_ratings_file = 'https://github.com/sparsh-ai/reco-data/blob/master/MovieLens_100K_ratings.csv?raw=true'\n",
"df_raw = pd.read_csv(movielens_ratings_file)\n",
"df_raw.head()"
],
"execution_count": null,
"outputs": [
{
"data": {
"text/plain": [
" UserId MovieId Rating Timestamp\n",
"0 196 242 3.0 881250949\n",
"1 186 302 3.0 891717742\n",
"2 22 377 1.0 878887116\n",
"3 244 51 2.0 880606923\n",
"4 166 346 1.0 886397596"
]
},
"execution_count": 22,
"metadata": {
"tags": []
},
"output_type": "execute_result"
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "El1C8OwWjhxk",
"outputId": "f8ed06e7-8554-45f2-982d-987f153a5cc7"
},
"source": [
"df = df_raw.copy()\n",
"df.columns = ['userId', 'movieId', 'rating', 'timestamp']\n",
"user_ids = df['userId'].unique()\n",
"user_encoding = {x: i for i, x in enumerate(user_ids)} # {user_id: index}\n",
"movie_ids = df['movieId'].unique()\n",
"movie_encoding = {x: i for i, x in enumerate(movie_ids)} # {movie_id: index}\n",
"\n",
"df['user'] = df['userId'].map(user_encoding) # Map from IDs to indices\n",
"df['movie'] = df['movieId'].map(movie_encoding)\n",
"\n",
"n_users = len(user_ids)\n",
"n_movies = len(movie_ids)\n",
"\n",
"min_rating = min(df['rating'])\n",
"max_rating = max(df['rating'])\n",
"\n",
"print(f'Number of users: {n_users}\\nNumber of movies: {n_movies}\\nMin rating: {min_rating}\\nMax rating: {max_rating}')\n",
"\n",
"# Shuffle the data\n",
"df = df.sample(frac=1, random_state=42)"
],
"execution_count": null,
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Number of users: 943\n",
"Number of movies: 1682\n",
"Min rating: 1.0\n",
"Max rating: 5.0\n"
]
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "1W5V8T-C8Gpv"
},
"source": [
"### Scheme of the model\n",
"\n",
"![](https://github.com/recohut/reco-static/raw/master/media/images/120222_scheme.png)"
]
},
{
"cell_type": "code",
"metadata": {
"id": "G-iv9rijkaBf"
},
"source": [
"class MatrixFactorization(models.Model):\n",
" def __init__(self, n_users, n_movies, n_factors, **kwargs):\n",
" super(MatrixFactorization, self).__init__(**kwargs)\n",
" self.n_users = n_users\n",
" self.n_movies = n_movies\n",
" self.n_factors = n_factors\n",
" \n",
" # We specify the size of the matrix,\n",
" # the initializer (truncated normal distribution)\n",
" # and the regularization type and strength (L2 with lambda = 1e-6)\n",
" self.user_emb = layers.Embedding(n_users, \n",
" n_factors, \n",
" embeddings_initializer='he_normal',\n",
" embeddings_regularizer=keras.regularizers.l2(1e-6),\n",
" name='user_embedding')\n",
" self.movie_emb = layers.Embedding(n_movies, \n",
" n_factors, \n",
" embeddings_initializer='he_normal',\n",
" embeddings_regularizer=keras.regularizers.l2(1e-6),\n",
" name='movie_embedding')\n",
" \n",
" # Embedding returns a 3D tensor with one dimension = 1, so we reshape it to a 2D tensor\n",
" self.reshape = layers.Reshape((self.n_factors,))\n",
" \n",
" # Dot product of the latent vectors\n",
" self.dot = layers.Dot(axes=1)\n",
"\n",
" def call(self, inputs):\n",
" # Two inputs\n",
" user, movie = inputs\n",
" u = self.user_emb(user)\n",
" u = self.reshape(u)\n",
" \n",
" m = self.movie_emb(movie)\n",
" m = self.reshape(m)\n",
" \n",
" return self.dot([u, m])\n",
"\n",
"n_factors = 50\n",
"model = MatrixFactorization(n_users, n_movies, n_factors)\n",
"model.compile(\n",
" optimizer=keras.optimizers.Adam(learning_rate=0.001),\n",
" loss=keras.losses.MeanSquaredError()\n",
")"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "Bac1w7u49Ddx",
"outputId": "bb733033-9aba-446b-a56d-f971897221d0"
},
"source": [
"try:\n",
" model.summary()\n",
"except ValueError as e:\n",
" print(e, type(e))"
],
"execution_count": null,
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"This model has not yet been built. Build the model first by calling `build()` or calling `fit()` with some data, or specify an `input_shape` argument in the first layer(s) for automatic build. \n"
]
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "o-JSFnJA-1dz"
},
"source": [
"This is one downside of building models via subclassing: until the model has been called on data, Keras doesn't know the input shapes, so `summary()` raises an error. We'll fix it by calling the model on some dummy data so the input shapes can be inferred."
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "7wkIhqmO92Ca",
"outputId": "6825be87-3d5e-4d25-e276-5043ff3a3bb9"
},
"source": [
"_ = model([np.array([1, 2, 3]), np.array([2, 88, 5])])\n",
"model.summary()"
],
"execution_count": null,
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Model: \"matrix_factorization_1\"\n",
"_________________________________________________________________\n",
"Layer (type) Output Shape Param # \n",
"=================================================================\n",
"user_embedding (Embedding) multiple 47150 \n",
"_________________________________________________________________\n",
"movie_embedding (Embedding) multiple 84100 \n",
"_________________________________________________________________\n",
"reshape_1 (Reshape) multiple 0 \n",
"_________________________________________________________________\n",
"dot_1 (Dot) multiple 0 \n",
"=================================================================\n",
"Total params: 131,250\n",
"Trainable params: 131,250\n",
"Non-trainable params: 0\n",
"_________________________________________________________________\n"
]
}
]
},
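{
"cell_type": "markdown",
"metadata": {},
"source": [
"The parameter count is just the two embedding tables: from the summary, 47150 = 943 users x 50 factors and 84100 = 1682 movies x 50 factors (the user and movie counts are implied by dividing each table's parameters by `n_factors`). A quick check:"
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"n_factors = 50\n",
"user_params = 943 * n_factors    # user_embedding\n",
"movie_params = 1682 * n_factors  # movie_embedding\n",
"user_params + movie_params       # 131250, matching 'Total params' above"
],
"execution_count": null,
"outputs": []
},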
{
"cell_type": "markdown",
"metadata": {
"id": "9Nxdrz7b_HOq"
},
"source": [
"We're going to expand our toolbox by introducing callbacks. Callbacks can monitor training progress, decay the learning rate, periodically save weights, or stop training early when overfitting is detected. In Keras they are easy to use: you create a list of the desired callbacks and pass it to `model.fit`. It's also easy to define your own by subclassing the `Callback` class, and you can specify when they are triggered - the default is at the end of every epoch.\n",
"\n",
"We'll use two: an `EarlyStopping` callback, which monitors the validation loss and stops training when it stops improving, and `TensorBoard`, a utility for visualizing models, monitoring training progress and much more."
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "6N_Y7u5o-QpY",
"outputId": "b4f299bc-25e2-4e38-b349-dc07184fb488"
},
"source": [
"callbacks = [\n",
"    keras.callbacks.EarlyStopping(\n",
"        # Stop training when `val_loss` is no longer improving\n",
"        monitor='val_loss',\n",
"        # \"no longer improving\" being defined as \"no better than 1e-2 less\"\n",
"        min_delta=1e-2,\n",
"        # \"no longer improving\" being further defined as \"for at least 2 epochs\"\n",
"        patience=2,\n",
"        verbose=1,\n",
"    ),\n",
"    keras.callbacks.TensorBoard(log_dir='logs')\n",
"]\n",
"\n",
"history = model.fit(\n",
"    x=(df['user'].values, df['movie'].values),  # The model has two inputs!\n",
"    y=df['rating'],\n",
"    batch_size=128,\n",
"    epochs=20,\n",
"    verbose=1,\n",
"    validation_split=0.1,\n",
"    callbacks=callbacks\n",
")"
],
"execution_count": null,
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Epoch 1/20\n",
"704/704 [==============================] - 3s 3ms/step - loss: 12.0905 - val_loss: 5.5121\n",
"Epoch 2/20\n",
"704/704 [==============================] - 2s 3ms/step - loss: 2.1751 - val_loss: 1.2149\n",
"Epoch 3/20\n",
"704/704 [==============================] - 2s 3ms/step - loss: 1.0271 - val_loss: 0.9839\n",
"Epoch 4/20\n",
"704/704 [==============================] - 2s 3ms/step - loss: 0.9003 - val_loss: 0.9266\n",
"Epoch 5/20\n",
"704/704 [==============================] - 2s 3ms/step - loss: 0.8470 - val_loss: 0.8996\n",
"Epoch 6/20\n",
"704/704 [==============================] - 2s 3ms/step - loss: 0.8046 - val_loss: 0.8786\n",
"Epoch 7/20\n",
"704/704 [==============================] - 2s 3ms/step - loss: 0.7667 - val_loss: 0.8680\n",
"Epoch 8/20\n",
"704/704 [==============================] - 2s 3ms/step - loss: 0.7329 - val_loss: 0.8618\n",
"Epoch 9/20\n",
"704/704 [==============================] - 2s 3ms/step - loss: 0.6999 - val_loss: 0.8558\n",
"Epoch 10/20\n",
"704/704 [==============================] - 2s 3ms/step - loss: 0.6688 - val_loss: 0.8558\n",
"Epoch 11/20\n",
"704/704 [==============================] - 2s 3ms/step - loss: 0.6381 - val_loss: 0.8560\n",
"Epoch 00011: early stopping\n"
]
}
]
},
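{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the `min_delta` / `patience` interplay concrete, here is a simplified plain-Python sketch of the stopping rule (a hypothetical mirror of the idea, not Keras' exact implementation), applied to the validation losses logged above:"
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"def early_stop_epoch(val_losses, min_delta=1e-2, patience=2):\n",
"    \"\"\"Return the 1-based epoch at which training stops, or None.\"\"\"\n",
"    best = float('inf')\n",
"    wait = 0\n",
"    for epoch, loss in enumerate(val_losses, start=1):\n",
"        if loss < best - min_delta:  # improved by more than min_delta\n",
"            best = loss\n",
"            wait = 0\n",
"        else:\n",
"            wait += 1\n",
"            if wait >= patience:\n",
"                return epoch\n",
"    return None\n",
"\n",
"val_losses = [5.5121, 1.2149, 0.9839, 0.9266, 0.8996, 0.8786,\n",
"              0.8680, 0.8618, 0.8558, 0.8558, 0.8560]\n",
"early_stop_epoch(val_losses)  # 11, matching 'Epoch 00011: early stopping'"
],
"execution_count": null,
"outputs": []
},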
{
"cell_type": "markdown",
"metadata": {
"id": "5QUGLmtw_eWA"
},
"source": [
"We see that training stopped early because the validation loss was no longer improving. Now we'll open TensorBoard (a separate program invoked from the command line) to read the written logs and visualize the loss over all epochs. It can also visualize the model as a computational graph."
]
},
{
"cell_type": "code",
"metadata": {
"id": "_J-v9Hua_SV8"
},
"source": [
"# Run TensorBoard and specify the log dir\n",
"%tensorboard --logdir logs"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "Ldq0DwgI_lWC"
},
"source": [
"We've seen how easy it is to implement a recommender system with Keras and use a few utilities to make it easier to experiment. Note that this model is still quite basic and we could easily improve it: we could try adding a bias for each user and movie or adding non-linearity by using a sigmoid function and then rescaling the output. It could also be extended to use other features of a user or movie."
]
},
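{
"cell_type": "markdown",
"metadata": {},
"source": [
"The sigmoid-and-rescale idea mentioned above can be sketched like this (a hypothetical helper; the 0.5-5.0 output range is an assumption for MovieLens-style ratings):"
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"import numpy as np\n",
"\n",
"def rescale_sigmoid(raw_score, low=0.5, high=5.0):\n",
"    # Squash an unbounded dot product into [low, high]\n",
"    s = 1.0 / (1.0 + np.exp(-raw_score))\n",
"    return low + s * (high - low)\n",
"\n",
"rescale_sigmoid(0.0)  # 2.75, the midpoint of the range"
],
"execution_count": null,
"outputs": []
},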
{
"cell_type": "markdown",
"metadata": {
"id": "-dpCn5hm_nUM"
},
"source": [
"Next, we'll try a bigger, more state-of-the-art model: a deep autoencoder."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "zTGdZ0b4_4rl"
},
"source": [
"We'll apply a more advanced algorithm to the same dataset as before, taking a different approach. We'll use a deep autoencoder network, which attempts to reconstruct its input and with that gives us ratings for unseen user / movie pairs."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "yIf926SkCOEp"
},
"source": [
"![](https://github.com/recohut/reco-static/raw/master/media/images/120222_algo.png)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "rSdY5NKQAYfI"
},
"source": [
"Preprocessing will be a bit different because of how this model works. Our autoencoder takes the vector of all ratings for a movie and attempts to reconstruct it. However, that input vector contains many zeroes due to the sparsity of our data, so we'll modify the loss so the model isn't pushed towards predicting zero for unrated pairs - instead it will actually predict the unseen ratings.\n",
"\n",
"To facilitate this, we'll use TensorFlow's sparse tensor support. Note: to make training easier, we'll convert the sparse tensor to dense form, which would not scale to larger datasets - there we would have to preprocess the data differently or stream it into the model."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "HBcKm55rCVkk"
},
"source": [
"### Sparse representation and autoencoder reconstruction\n",
"\n",
"![](https://github.com/recohut/reco-static/raw/master/media/images/120222_ae.png)"
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 204
},
"id": "OWybs9LyE8bB",
"outputId": "1e108c11-a007-4942-bf0d-50409e4bbb1d"
},
"source": [
"df_raw.head()"
],
"execution_count": null,
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\">\n",
"      <th></th>\n",
"      <th>userId</th>\n",
"      <th>movieId</th>\n",
"      <th>rating</th>\n",
"      <th>timestamp</th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>196</td><td>242</td><td>3.0</td><td>881250949</td></tr>\n",
"    <tr><th>1</th><td>186</td><td>302</td><td>3.0</td><td>891717742</td></tr>\n",
"    <tr><th>2</th><td>22</td><td>377</td><td>1.0</td><td>878887116</td></tr>\n",
"    <tr><th>3</th><td>244</td><td>51</td><td>2.0</td><td>880606923</td></tr>\n",
"    <tr><th>4</th><td>166</td><td>346</td><td>1.0</td><td>886397596</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"   userId  movieId  rating  timestamp\n",
"0     196      242     3.0  881250949\n",
"1     186      302     3.0  891717742\n",
"2      22      377     1.0  878887116\n",
"3     244       51     2.0  880606923\n",
"4     166      346     1.0  886397596"
]
},
"execution_count": 21,
"metadata": {
"tags": []
},
"output_type": "execute_result"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "M9jASOsh_gvU"
},
"source": [
"# Create a sparse tensor: at each user, movie location, we have a value, the rest is 0\n",
"sparse_x = tf.sparse.SparseTensor(indices=df[['movie', 'user']].values, values=df['rating'], dense_shape=(n_movies, n_users))\n",
"\n",
"# Transform it to dense form and to float32 (good enough precision)\n",
"dense_x = tf.cast(tf.sparse.to_dense(tf.sparse.reorder(sparse_x)), tf.float32)\n",
"\n",
"# Shuffle the data\n",
"x = tf.random.shuffle(dense_x, seed=42)"
],
"execution_count": null,
"outputs": []
},
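{
"cell_type": "markdown",
"metadata": {},
"source": [
"What `tf.sparse.to_dense` produces can be illustrated with plain NumPy on hand-made toy triplets (hypothetical data, not the notebook's variables): every (movie, user) pair with a rating gets that value and everything else stays 0."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"import numpy as np\n",
"\n",
"# (movie index, user index, rating) triplets\n",
"triplets = [(0, 1, 4.0), (1, 1, 5.0), (2, 0, 3.0)]\n",
"n_movies_toy, n_users_toy = 3, 2\n",
"\n",
"dense = np.zeros((n_movies_toy, n_users_toy), dtype=np.float32)\n",
"for m, u, r in triplets:\n",
"    dense[m, u] = r\n",
"\n",
"dense  # rows are movies, columns are users; unrated pairs stay 0"
],
"execution_count": null,
"outputs": []
},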
{
"cell_type": "markdown",
"metadata": {
"id": "1j2-lFANEp8t"
},
"source": [
"Now, let's create the model. We'll have to specify the input shape. Because we have 1682 movies and only 943 users, we'll predict ratings for movies rather than users - each input vector holds one movie's ratings across all users, which gives us more training examples."
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "9s4qXdbuEpuX",
"outputId": "45ac7925-b8f3-44dd-8e35-53fa7192b538"
},
"source": [
"class Encoder(layers.Layer):\n",
"    def __init__(self, **kwargs):\n",
"        super(Encoder, self).__init__(**kwargs)\n",
"        self.dense1 = layers.Dense(28, activation='selu', kernel_initializer='glorot_uniform')\n",
"        self.dense2 = layers.Dense(56, activation='selu', kernel_initializer='glorot_uniform')\n",
"        self.dense3 = layers.Dense(56, activation='selu', kernel_initializer='glorot_uniform')\n",
"        self.dropout = layers.Dropout(0.3)\n",
"\n",
"    def call(self, x, training=None):\n",
"        d1 = self.dense1(x)\n",
"        d2 = self.dense2(d1)\n",
"        d3 = self.dense3(d2)\n",
"        # Pass `training` through explicitly so dropout is only active during training\n",
"        return self.dropout(d3, training=training)\n",
"\n",
"\n",
"class Decoder(layers.Layer):\n",
"    def __init__(self, n, **kwargs):\n",
"        super(Decoder, self).__init__(**kwargs)\n",
"        self.dense1 = layers.Dense(56, activation='selu', kernel_initializer='glorot_uniform')\n",
"        self.dense2 = layers.Dense(28, activation='selu', kernel_initializer='glorot_uniform')\n",
"        self.dense3 = layers.Dense(n, activation='selu', kernel_initializer='glorot_uniform')\n",
"\n",
"    def call(self, x):\n",
"        d1 = self.dense1(x)\n",
"        d2 = self.dense2(d1)\n",
"        return self.dense3(d2)\n",
"\n",
"n = n_users\n",
"inputs = layers.Input(shape=(n,))\n",
"\n",
"encoder = Encoder()\n",
"decoder = Decoder(n)\n",
"\n",
"# Stack the encoder and decoder twice, reusing the same weights each time\n",
"enc1 = encoder(inputs)\n",
"dec1 = decoder(enc1)\n",
"enc2 = encoder(dec1)\n",
"dec2 = decoder(enc2)\n",
"\n",
"model = models.Model(inputs=inputs, outputs=dec2, name='DeepAutoencoder')\n",
"model.summary()"
],
"execution_count": null,
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Model: \"DeepAutoencoder\"\n",
"__________________________________________________________________________________________________\n",
"Layer (type) Output Shape Param # Connected to \n",
"==================================================================================================\n",
"input_1 (InputLayer) [(None, 943)] 0 \n",
"__________________________________________________________________________________________________\n",
"encoder (Encoder) (None, 56) 31248 input_1[0][0] \n",
" decoder[0][0] \n",
"__________________________________________________________________________________________________\n",
"decoder (Decoder) (None, 943) 32135 encoder[0][0] \n",
" encoder[1][0] \n",
"==================================================================================================\n",
"Total params: 63,383\n",
"Trainable params: 63,383\n",
"Non-trainable params: 0\n",
"__________________________________________________________________________________________________\n"
]
}
]
},
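{
"cell_type": "markdown",
"metadata": {},
"source": [
"The summary's parameter counts can be reproduced by hand: a `Dense(n_out)` layer applied to `n_in` features has `n_in * n_out` weights plus `n_out` biases. Note that although the encoder and decoder are each applied twice, they share weights, so each is counted once:"
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"def dense_params(n_in, n_out):\n",
"    return n_in * n_out + n_out  # weights + biases\n",
"\n",
"n_features = 943  # one input per user\n",
"encoder_params = dense_params(n_features, 28) + dense_params(28, 56) + dense_params(56, 56)\n",
"decoder_params = dense_params(56, 56) + dense_params(56, 28) + dense_params(28, n_features)\n",
"encoder_params, decoder_params, encoder_params + decoder_params  # (31248, 32135, 63383)"
],
"execution_count": null,
"outputs": []
},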
{
"cell_type": "markdown",
"metadata": {
"id": "aqXWA_TQGMDa"
},
"source": [
"Because our inputs are sparse, we'll need to create a modified mean squared error function. We have to look at which ratings are zero in the ground truth and remove them from our loss calculation (if we didn't, our model would quickly learn to predict zeros almost everywhere). We'll use masking - first get a boolean mask of non-zero values and then extract them from the result."
]
},
{
"cell_type": "code",
"metadata": {
"id": "G7AyGH8IFXAj"
},
"source": [
"def masked_mse(y_true, y_pred):\n",
"    # Only rated (non-zero) entries contribute to the loss\n",
"    mask = tf.not_equal(y_true, 0)\n",
"    se = tf.boolean_mask(tf.square(y_true - y_pred), mask)\n",
"    return tf.reduce_mean(se)\n",
"\n",
"model.compile(\n",
"    loss=masked_mse,\n",
"    optimizer=keras.optimizers.Adam()\n",
")"
],
"execution_count": null,
"outputs": []
},
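{
"cell_type": "markdown",
"metadata": {},
"source": [
"To sanity-check the masking behaviour, here is the same computation in plain NumPy on a tiny hand-made example (hypothetical data): only the non-zero entries of `y_true` contribute to the mean."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"import numpy as np\n",
"\n",
"def masked_mse_np(y_true, y_pred):\n",
"    mask = y_true != 0\n",
"    return np.mean((y_true[mask] - y_pred[mask]) ** 2)\n",
"\n",
"y_true = np.array([4.0, 0.0, 3.0, 0.0])\n",
"y_pred = np.array([3.5, 2.0, 3.0, 1.0])\n",
"masked_mse_np(y_true, y_pred)  # 0.125 - the two zero entries are ignored"
],
"execution_count": null,
"outputs": []
},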
{
"cell_type": "markdown",
"metadata": {
"id": "6O-Hqm_FGTmz"
},
"source": [
"Training will be similar to before - we'll use early stopping and TensorBoard. Our batch size will be smaller due to the lower number of examples. Note that we pass the same array as both x and y, because the autoencoder reconstructs its input."
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "OHoZ3IuJGSrL",
"outputId": "7923e0b0-7bc6-42ba-b3e6-c2bfe025291d"
},
"source": [
"callbacks = [\n",
"    keras.callbacks.EarlyStopping(\n",
"        monitor='val_loss',\n",
"        min_delta=1e-2,\n",
"        patience=5,\n",
"        verbose=1,\n",
"    ),\n",
"    keras.callbacks.TensorBoard(log_dir='logs')\n",
"]\n",
"\n",
"model.fit(\n",
"    x,\n",
"    x,  # the autoencoder reconstructs its input, so x is both input and target\n",
"    batch_size=16,\n",
"    epochs=100,\n",
"    validation_split=0.1,\n",
"    callbacks=callbacks\n",
")"
],
"execution_count": null,
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"WARNING:tensorflow:Model failed to serialize as JSON. Ignoring... Layer Decoder has arguments in `__init__` and therefore must override `get_config`.\n",
"Epoch 1/100\n",
"95/95 [==============================] - 2s 7ms/step - loss: 4.6136 - val_loss: 1.1074\n",
"Epoch 2/100\n",
"95/95 [==============================] - 0s 4ms/step - loss: 1.1491 - val_loss: 1.0088\n",
"Epoch 3/100\n",
"95/95 [==============================] - 0s 5ms/step - loss: 1.0577 - val_loss: 0.9768\n",
"Epoch 4/100\n",
"95/95 [==============================] - 0s 4ms/step - loss: 1.0257 - val_loss: 0.9758\n",
"Epoch 5/100\n",
"95/95 [==============================] - 0s 4ms/step - loss: 0.9971 - val_loss: 0.9774\n",
"Epoch 6/100\n",
"95/95 [==============================] - 0s 4ms/step - loss: 0.9812 - val_loss: 0.9604\n",
"Epoch 7/100\n",
"95/95 [==============================] - 0s 5ms/step - loss: 0.9598 - val_loss: 0.9275\n",
"Epoch 8/100\n",
"95/95 [==============================] - 0s 5ms/step - loss: 0.9501 - val_loss: 0.9253\n",
"Epoch 9/100\n",
"95/95 [==============================] - 0s 5ms/step - loss: 0.9177 - val_loss: 0.9159\n",
"Epoch 10/100\n",
"95/95 [==============================] - 0s 5ms/step - loss: 0.9193 - val_loss: 0.9189\n",
"Epoch 11/100\n",
"95/95 [==============================] - 0s 4ms/step - loss: 0.9016 - val_loss: 0.9040\n",
"Epoch 12/100\n",
"95/95 [==============================] - 0s 4ms/step - loss: 0.9119 - val_loss: 0.9108\n",
"Epoch 13/100\n",
"95/95 [==============================] - 0s 5ms/step - loss: 0.8917 - val_loss: 0.9192\n",
"Epoch 14/100\n",
"95/95 [==============================] - 0s 5ms/step - loss: 0.8855 - val_loss: 0.9166\n",
"Epoch 15/100\n",
"95/95 [==============================] - 0s 4ms/step - loss: 0.8843 - val_loss: 0.9067\n",
"Epoch 16/100\n",
"95/95 [==============================] - 0s 4ms/step - loss: 0.8851 - val_loss: 0.9034\n",
"Epoch 00016: early stopping\n"
]
},
{
"data": {
"text/plain": [
""
]
},
"execution_count": 27,
"metadata": {
"tags": []
},
"output_type": "execute_result"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "kkkhIjHhGhP5"
},
"source": [
"Let's visualize our loss and the model itself with TensorBoard."
]
},
{
"cell_type": "code",
"metadata": {
"id": "MMVp_HbwGdGQ"
},
"source": [
"%tensorboard --logdir logs"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "eSBFppW8Gkih"
},
"source": [
"That's it! We've seen how to use TensorFlow to implement recommender systems in a few different ways. I hope this short introduction has been informative and has prepared you to use TF on new problems. Thank you for your attention!"
]
}
]
}