{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# RePlay recommender models comparison\n", "\n", "### Dataset\n", "We will compare RePlay models on __MovieLens 1m__. \n", "\n", "### Dataset preprocessing: \n", "Ratings greater than or equal to 3 are treated as positive interactions.\n", "\n", "### Data split\n", "The dataset is split by date so that the last 20% of interactions are placed in the test part. Cold items and users are dropped.\n", "\n", "### Predict:\n", "We will predict the top-10 most relevant films for each user.\n", "\n", "### Metrics\n", "Quality metrics used: __ndcg@k, map@k, mrr@k__ for k = 10 and __hitrate@k__ for k = 1, 5, 10.\n", "Additional metrics used: __coverage@k__ and __surprisal@k__." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "ExecuteTime": { "end_time": "2020-02-10T16:01:45.639135Z", "start_time": "2020-02-10T16:01:45.612577Z" }, "jupyter": { "outputs_hidden": false }, "pycharm": { "is_executing": false } }, "outputs": [], "source": [ "%load_ext autoreload\n", "%autoreload 2" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "%config Completer.use_jedi = False" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "import warnings\n", "from optuna.exceptions import ExperimentalWarning\n", "warnings.filterwarnings(\"ignore\", category=UserWarning)\n", "warnings.filterwarnings(\"ignore\", category=ExperimentalWarning)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "import logging\n", "import pandas as pd\n", "import time\n", "\n", "from pyspark.sql import functions as sf, types as st\n", "from pyspark.sql.types import IntegerType\n", "\n", "from replay.data_preparator import DataPreparator\n", "from replay.experiment import Experiment\n", "from replay.metrics import Coverage, HitRate, MRR, MAP, NDCG, Surprisal\n", "from replay.models import (\n", " ALSWrap, \n", " ADMMSLIM, \n", " KNN,\n", 
" LightFMWrap, \n", " MultVAE, \n", " NeuroMF, \n", " SLIM, \n", " PopRec, \n", " RandomRec, \n", " Wilson, \n", " Word2VecRec\n", ")\n", "\n", "from replay.models.base_rec import HybridRecommender\n", "from replay.session_handler import State\n", "from replay.splitters import DateSplitter\n", "from replay.utils import get_log_info\n", "from rs_datasets import MovieLens" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `State` object lets you pass an existing Spark session or create a new one, which will be used by all RePlay modules.\n", "\n", "To create a session with custom ``spark.driver.memory`` and ``spark.sql.shuffle.partitions`` parameters, use the `get_spark_session` function from the `session_handler` module." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "ExecuteTime": { "end_time": "2020-02-10T15:59:09.227179Z", "start_time": "2020-02-10T15:59:06.427348Z" }, "jupyter": { "outputs_hidden": false }, "pycharm": { "is_executing": false } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "WARNING: An illegal reflective access operation has occurred\n", "WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/home/u19893556/miniconda3/envs/replay/lib/python3.7/site-packages/pyspark/jars/spark-unsafe_2.12-3.1.2.jar) to constructor java.nio.DirectByteBuffer(long,int)\n", "WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform\n", "WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations\n", "WARNING: All illegal access operations will be denied in a future release\n", "22/02/25 18:05:27 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable\n", "Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties\n", "Setting default log level to \"WARN\".\n", "To adjust logging level use sc.setLogLevel(newLevel). 
For SparkR, use setLogLevel(newLevel).\n", "22/02/25 18:05:28 WARN SparkConf: Note that spark.local.dir will be overridden by the value set by the cluster manager (via SPARK_LOCAL_DIRS in mesos/standalone/kubernetes and LOCAL_DIRS in YARN).\n", "22/02/25 18:05:28 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.\n" ] }, { "data": { "text/html": [ "\n", "
\n", "

SparkSession - hive

\n", " \n", "
\n", "

SparkContext

\n", "\n", "

Spark UI

\n", "\n", "
\n", "
Version
\n", "
v3.1.2
\n", "
Master
\n", "
local[*]
\n", "
AppName
\n", "
pyspark-shell
\n", "
\n", "
\n", " \n", "
\n", " " ], "text/plain": [ "" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "spark = State().session\n", "spark" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "logger = logging.getLogger(\"replay\")" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "K = 10\n", "K_list_metrics = [1, 5, 10]\n", "BUDGET = 20\n", "SEED = 12345" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 0. Preprocessing " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 0.1 Data loading" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "ExecuteTime": { "end_time": "2020-02-10T15:59:42.041251Z", "start_time": "2020-02-10T15:59:09.230636Z" }, "jupyter": { "outputs_hidden": false }, "scrolled": true }, "outputs": [ { "data": { "text/html": [ "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "ratings\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
   user_id  item_id  rating  timestamp
0  1  1193  5  978300760
1  1  661  3  978302109
2  1  914  3  978301968
\n", "
" ], "text/plain": [ " user_id item_id rating timestamp\n", "0 1 1193 5 978300760\n", "1 1 661 3 978302109\n", "2 1 914 3 978301968" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "users\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
   user_id  gender  age  occupation  zip_code
0  1  F  1  10  48067
1  2  M  56  16  70072
2  3  M  25  15  55117
\n", "
" ], "text/plain": [ " user_id gender age occupation zip_code\n", "0 1 F 1 10 48067\n", "1 2 M 56 16 70072\n", "2 3 M 25 15 55117" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "items\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
   item_id  title  genres
0  1  Toy Story (1995)  Animation|Children's|Comedy
1  2  Jumanji (1995)  Adventure|Children's|Fantasy
2  3  Grumpier Old Men (1995)  Comedy|Romance
\n", "
" ], "text/plain": [ " item_id title genres\n", "0 1 Toy Story (1995) Animation|Children's|Comedy\n", "1 2 Jumanji (1995) Adventure|Children's|Fantasy\n", "2 3 Grumpier Old Men (1995) Comedy|Romance" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "data = MovieLens(\"1m\")\n", "data.info()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### log preprocessing" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "[Stage 13:=========================================> (36 + 12) / 48]\r" ] }, { "name": "stdout", "output_type": "stream", "text": [ "total lines: 1000209, total users: 6040, total items: 3706\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\r", " \r" ] } ], "source": [ "preparator = DataPreparator()\n", "log, _, _ = preparator(data.ratings, mapping={\"relevance\": \"rating\"})\n", "print(get_log_info(log))" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "836478" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# will consider ratings >= 3 as positive feedback. A positive feedback is treated with relevance = 1\n", "only_positives_log = log.filter(sf.col('relevance') >= 3).withColumn('relevance', sf.lit(1))\n", "only_positives_log.count()" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "user_features=None\n", "item_features=None" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 0.2. 
Data split" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "ExecuteTime": { "end_time": "2020-02-10T15:59:50.986401Z", "start_time": "2020-02-10T15:59:42.042998Z" }, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "22/02/25 18:05:49 WARN WindowExec: No Partition Defined for Window operation! Moving all data to a single partition, this can cause serious performance degradation.\n", " \r" ] }, { "name": "stdout", "output_type": "stream", "text": [ "train info:\n", " total lines: 669181, total users: 5397, total items: 3569\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "[Stage 46:=============================================> (124 + 20) / 144]\r" ] }, { "name": "stdout", "output_type": "stream", "text": [ "test info:\n", " total lines: 86542, total users: 1139, total items: 3279\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\r", " \r" ] } ], "source": [ "# train/test split \n", "train_spl = DateSplitter(\n", " test_start=0.2,\n", " drop_cold_items=True,\n", " drop_cold_users=True,\n", ")\n", "train, test = train_spl.split(only_positives_log)\n", "print('train info:\\n', get_log_info(train))\n", "print('test info:\\n', get_log_info(test))" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "ExecuteTime": { "end_time": "2020-02-10T15:59:50.986401Z", "start_time": "2020-02-10T15:59:42.042998Z" }, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "22/02/25 18:06:05 WARN WindowExec: No Partition Defined for Window operation! 
Moving all data to a single partition, this can cause serious performance degradation.\n" ] }, { "data": { "text/plain": [ "(535343, 24241)" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# train/test split for hyperparameters selection\n", "opt_train, opt_val = train_spl.split(train)\n", "opt_train.count(), opt_val.count()" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "798993" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# negative feedback will be used for Wilson models\n", "only_negatives_log = log.filter(sf.col('relevance') < 3).withColumn('relevance', sf.lit(0.))\n", "test_start = test.agg(sf.min('timestamp')).collect()[0][0]\n", "\n", "# train with both positive and negative feedback\n", "pos_neg_train=(train\n", " .withColumn('relevance', sf.lit(1))\n", " .union(only_negatives_log.filter(sf.col('timestamp') < test_start))\n", " )\n", "pos_neg_train.count()" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "+---------+---------+--------+--------+\n", "|relevance|timestamp|user_idx|item_idx|\n", "+---------+---------+--------+--------+\n", "| 1|975735012| 677| 1314|\n", "| 1|975736432| 677| 1282|\n", "+---------+---------+--------+--------+\n", "only showing top 2 rows\n", "\n" ] } ], "source": [ "train.show(2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 1. Metrics definition" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [], "source": [ "# experiment is used for metrics calculation\n", "e = Experiment(test, {MAP(): K, NDCG(): K, HitRate(): K_list_metrics, Coverage(train): K, Surprisal(train): K, MRR(): K})" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 2. 
Models training" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [], "source": [ "def fit_predict_add_res(name, model, experiment, train, suffix=''):\n", " \"\"\"\n", " Run fit_predict for the `model`, measure time on fit_predict and evaluate metrics\n", " \"\"\"\n", " start_time=time.time()\n", " \n", " fit_predict_params = {'log': train, 'k': K, 'users': test.select('user_idx').distinct()}\n", " if isinstance(model, Wilson):\n", " fit_predict_params['log'] = pos_neg_train\n", "\n", " if isinstance(model, HybridRecommender):\n", " fit_predict_params['item_features'] = item_features\n", " fit_predict_params['user_features'] = user_features\n", " \n", " pred=model.fit_predict(**fit_predict_params)\n", " pred.count()\n", " fit_predict_time = time.time() - start_time\n", " \n", " experiment.add_result(name + suffix, pred)\n", " experiment.results.loc[name + suffix, 'fit_pred_time'] = fit_predict_time\n", " \n", " print(experiment.results[['NDCG@{}'.format(K), 'MRR@{}'.format(K), 'Coverage@{}'.format(K), 'fit_pred_time']].sort_values('NDCG@{}'.format(K), ascending=False))" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [], "source": [ "def full_pipeline(models, experiment, train, suffix='', budget=BUDGET):\n", " \"\"\"\n", " For each model:\n", " - if required: run hyperparameters search, set best params and save param values to `experiment`\n", " - pass model to `fit_predict_add_res` \n", " \"\"\"\n", " \n", " for name, [model, params] in models.items():\n", " model.logger.info(msg='{} started'.format(name))\n", " if params != 'no_opt':\n", " model.logger.info(msg='{} optimization started'.format(name))\n", " best_params = model.optimize(opt_train, \n", " opt_val, \n", " param_borders=params, \n", " item_features=item_features,\n", " user_features=user_features,\n", " k=K, \n", " budget=budget)\n", " model.set_params(**best_params)\n", " logger.info(msg='best params for {} are: {}'.format(name, 
best_params))\n", " experiment.results.loc[name + suffix, 'params'] = best_params.__repr__()\n", " \n", " logger.info(msg='{} fit_predict started'.format(name))\n", " fit_predict_add_res(name, model, experiment, train, suffix) " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2.1. Non-personalized models" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [], "source": [ "non_personalized_models = {'Popular Recommender': [PopRec(), 'no_opt'], \n", " 'Random Recommender (uniform)': [RandomRec(seed=SEED, distribution='uniform'), 'no_opt'], \n", " 'Random Recommender (popularity-based)': [RandomRec(seed=SEED, distribution='popular_based'), {\"alpha\": [-0.5, 100]}],\n", " 'Wilson Recommender': [Wilson(), 'no_opt']}" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "25-Feb-22 18:06:10, replay, INFO: Popular Recommender started\n", "25-Feb-22 18:06:10, replay, INFO: Popular Recommender fit_predict started\n", "25-Feb-22 18:06:12, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:06:12, replay, WARNING: This model can't predict cold items, they will be ignored\n", "22/02/25 18:06:16 WARN WindowExec: No Partition Defined for Window operation! Moving all data to a single partition, this can cause serious performance degradation.\n", "22/02/25 18:06:27 WARN WindowExec: No Partition Defined for Window operation! Moving all data to a single partition, this can cause serious performance degradation.\n", "22/02/25 18:06:27 WARN WindowExec: No Partition Defined for Window operation! Moving all data to a single partition, this can cause serious performance degradation.\n", "22/02/25 18:06:34 WARN WindowExec: No Partition Defined for Window operation! 
Moving all data to a single partition, this can cause serious performance degradation.\n", "22/02/25 18:06:59 WARN WindowExec: No Partition Defined for Window operation! Moving all data to a single partition, this can cause serious performance degradation.\n", "25-Feb-22 18:07:09, replay, INFO: Random Recommender (uniform) started \n", "25-Feb-22 18:07:09, replay, INFO: Random Recommender (uniform) fit_predict started\n", "22/02/25 18:07:10 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:07:10 WARN CacheManager: Asked to cache already cached data.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " NDCG@10 MRR@10 Coverage@10 fit_pred_time\n", "Popular Recommender 0.243783 0.390426 0.033903 16.93119\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "25-Feb-22 18:07:57, replay, INFO: Random Recommender (popularity-based) started \n", "25-Feb-22 18:07:57, replay, INFO: Random Recommender (popularity-based) optimization started\n", "\u001b[32m[I 2022-02-25 18:07:57,597]\u001b[0m A new study created in memory with name: no-name-0c8690d5-63e3-4ac4-a88e-4c0635389c0b\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " NDCG@10 MRR@10 Coverage@10 fit_pred_time\n", "Popular Recommender 0.243783 0.390426 0.033903 16.931190\n", "Random Recommender (uniform) 0.021725 0.054846 0.957691 11.719672\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\u001b[32m[I 2022-02-25 18:08:09,888]\u001b[0m Trial 0 finished with value: 0.070029319068223 and parameters: {'distribution': 'popular_based', 'alpha': 0.0}. Best is trial 0 with value: 0.070029319068223.\u001b[0m\n", "22/02/25 18:08:09 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:08:09 WARN CacheManager: Asked to cache already cached data.\n", "\u001b[32m[I 2022-02-25 18:08:24,813]\u001b[0m Trial 1 finished with value: 0.05713612473413634 and parameters: {'distribution': 'popular_based', 'alpha': 75.03346685193002}. 
Best is trial 0 with value: 0.070029319068223.\u001b[0m\n", "22/02/25 18:08:24 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:08:24 WARN CacheManager: Asked to cache already cached data.\n", "\u001b[32m[I 2022-02-25 18:08:36,593]\u001b[0m Trial 2 finished with value: 0.05153169911235664 and parameters: {'distribution': 'popular_based', 'alpha': 97.54888710139787}. Best is trial 0 with value: 0.070029319068223.\u001b[0m\n", "22/02/25 18:08:36 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:08:36 WARN CacheManager: Asked to cache already cached data.\n", "\u001b[32m[I 2022-02-25 18:08:52,523]\u001b[0m Trial 3 finished with value: 0.0518623536949851 and parameters: {'distribution': 'popular_based', 'alpha': 96.1973316747191}. Best is trial 0 with value: 0.070029319068223.\u001b[0m\n", "22/02/25 18:08:52 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:08:52 WARN CacheManager: Asked to cache already cached data.\n", "\u001b[32m[I 2022-02-25 18:09:00,885]\u001b[0m Trial 4 finished with value: 0.05730576763735235 and parameters: {'distribution': 'popular_based', 'alpha': 58.64844374708354}. Best is trial 0 with value: 0.070029319068223.\u001b[0m\n", "22/02/25 18:09:00 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:09:00 WARN CacheManager: Asked to cache already cached data.\n", "\u001b[32m[I 2022-02-25 18:09:11,501]\u001b[0m Trial 5 finished with value: 0.051486423306174 and parameters: {'distribution': 'popular_based', 'alpha': 97.17034467279761}. Best is trial 0 with value: 0.070029319068223.\u001b[0m\n", "22/02/25 18:09:11 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:09:11 WARN CacheManager: Asked to cache already cached data.\n", "\u001b[32m[I 2022-02-25 18:09:20,480]\u001b[0m Trial 6 finished with value: 0.06877412964112065 and parameters: {'distribution': 'popular_based', 'alpha': 30.932894355294856}. 
Best is trial 0 with value: 0.070029319068223.\u001b[0m\n", "22/02/25 18:09:20 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:09:20 WARN CacheManager: Asked to cache already cached data.\n", "\u001b[32m[I 2022-02-25 18:09:30,885]\u001b[0m Trial 7 finished with value: 0.07231492907083414 and parameters: {'distribution': 'popular_based', 'alpha': 24.910056332924484}. Best is trial 7 with value: 0.07231492907083414.\u001b[0m\n", "22/02/25 18:09:30 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:09:30 WARN CacheManager: Asked to cache already cached data.\n", "\u001b[32m[I 2022-02-25 18:09:43,493]\u001b[0m Trial 8 finished with value: 0.05146102688433041 and parameters: {'distribution': 'popular_based', 'alpha': 83.35573079975558}. Best is trial 7 with value: 0.07231492907083414.\u001b[0m\n", "22/02/25 18:09:43 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:09:43 WARN CacheManager: Asked to cache already cached data.\n", "\u001b[32m[I 2022-02-25 18:09:53,987]\u001b[0m Trial 9 finished with value: 0.06589418081651598 and parameters: {'distribution': 'popular_based', 'alpha': 45.67915503913063}. Best is trial 7 with value: 0.07231492907083414.\u001b[0m\n", "22/02/25 18:09:54 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:09:54 WARN CacheManager: Asked to cache already cached data.\n", "\u001b[32m[I 2022-02-25 18:10:07,285]\u001b[0m Trial 10 finished with value: 0.07244780485539214 and parameters: {'distribution': 'popular_based', 'alpha': 8.393590069607932}. Best is trial 10 with value: 0.07244780485539214.\u001b[0m\n", "22/02/25 18:10:07 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:10:07 WARN CacheManager: Asked to cache already cached data.\n", "\u001b[32m[I 2022-02-25 18:10:18,869]\u001b[0m Trial 11 finished with value: 0.07128531572755266 and parameters: {'distribution': 'popular_based', 'alpha': 9.1121108511421}. 
Best is trial 10 with value: 0.07244780485539214.\u001b[0m\n", "22/02/25 18:10:18 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:10:18 WARN CacheManager: Asked to cache already cached data.\n", "\u001b[32m[I 2022-02-25 18:10:28,667]\u001b[0m Trial 12 finished with value: 0.07021276209157654 and parameters: {'distribution': 'popular_based', 'alpha': 21.920615501998572}. Best is trial 10 with value: 0.07244780485539214.\u001b[0m\n", "22/02/25 18:10:28 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:10:28 WARN CacheManager: Asked to cache already cached data.\n", "\u001b[32m[I 2022-02-25 18:10:38,483]\u001b[0m Trial 13 finished with value: 0.07250422685213238 and parameters: {'distribution': 'popular_based', 'alpha': 24.81903675148096}. Best is trial 13 with value: 0.07250422685213238.\u001b[0m\n", "22/02/25 18:10:38 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:10:38 WARN CacheManager: Asked to cache already cached data.\n", "\u001b[32m[I 2022-02-25 18:10:50,274]\u001b[0m Trial 14 finished with value: 0.07044914388956931 and parameters: {'distribution': 'popular_based', 'alpha': 39.78357949318482}. Best is trial 13 with value: 0.07250422685213238.\u001b[0m\n", "22/02/25 18:10:50 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:10:50 WARN CacheManager: Asked to cache already cached data.\n", "\u001b[32m[I 2022-02-25 18:10:58,866]\u001b[0m Trial 15 finished with value: 0.06990745822147226 and parameters: {'distribution': 'popular_based', 'alpha': 11.285176919262263}. Best is trial 13 with value: 0.07250422685213238.\u001b[0m\n", "22/02/25 18:10:58 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:10:58 WARN CacheManager: Asked to cache already cached data.\n", "\u001b[32m[I 2022-02-25 18:11:10,069]\u001b[0m Trial 16 finished with value: 0.06945863866048066 and parameters: {'distribution': 'popular_based', 'alpha': 13.157958263856074}. 
Best is trial 13 with value: 0.07250422685213238.\u001b[0m\n", "22/02/25 18:11:10 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:11:10 WARN CacheManager: Asked to cache already cached data.\n", "\u001b[32m[I 2022-02-25 18:11:18,493]\u001b[0m Trial 17 finished with value: 0.055891036293363734 and parameters: {'distribution': 'popular_based', 'alpha': 56.385257429331006}. Best is trial 13 with value: 0.07250422685213238.\u001b[0m\n", "22/02/25 18:11:18 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:11:18 WARN CacheManager: Asked to cache already cached data.\n", "\u001b[32m[I 2022-02-25 18:11:34,095]\u001b[0m Trial 18 finished with value: 0.06944705850293577 and parameters: {'distribution': 'popular_based', 'alpha': -0.3234430217815767}. Best is trial 13 with value: 0.07250422685213238.\u001b[0m\n", "22/02/25 18:11:34 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:11:34 WARN CacheManager: Asked to cache already cached data.\n", "\u001b[32m[I 2022-02-25 18:11:41,740]\u001b[0m Trial 19 finished with value: 0.07018080227585625 and parameters: {'distribution': 'popular_based', 'alpha': 35.58564248192705}. 
Best is trial 13 with value: 0.07250422685213238.\u001b[0m\n", "25-Feb-22 18:11:41, replay, INFO: best params for Random Recommender (popularity-based) are: {'distribution': 'popular_based', 'alpha': 24.81903675148096}\n", "25-Feb-22 18:11:41, replay, INFO: Random Recommender (popularity-based) fit_predict started\n", "22/02/25 18:11:41 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:11:41 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:12:17, replay, INFO: Wilson Recommender started ]\n", "25-Feb-22 18:12:17, replay, INFO: Wilson Recommender fit_predict started\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " NDCG@10 MRR@10 Coverage@10 \\\n", "Popular Recommender 0.243783 0.390426 0.033903 \n", "Random Recommender (popularity-based) 0.066665 0.150434 0.760437 \n", "Random Recommender (uniform) 0.021725 0.054846 0.957691 \n", "\n", " fit_pred_time \n", "Popular Recommender 16.931190 \n", "Random Recommender (popularity-based) 10.100435 \n", "Random Recommender (uniform) 11.719672 \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "25-Feb-22 18:12:20, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:12:20, replay, WARNING: This model can't predict cold items, they will be ignored\n", "22/02/25 18:12:28 WARN WindowExec: No Partition Defined for Window operation! Moving all data to a single partition, this can cause serious performance degradation.\n", "22/02/25 18:12:34 WARN WindowExec: No Partition Defined for Window operation! Moving all data to a single partition, this can cause serious performance degradation.\n", "22/02/25 18:12:34 WARN WindowExec: No Partition Defined for Window operation! Moving all data to a single partition, this can cause serious performance degradation.\n", "22/02/25 18:12:40 WARN WindowExec: No Partition Defined for Window operation! 
Moving all data to a single partition, this can cause serious performance degradation.\n", "22/02/25 18:12:52 WARN WindowExec: No Partition Defined for Window operation! Moving all data to a single partition, this can cause serious performance degradation.\n", "[Stage 1930:===============================================> (134 + 10) / 144]\r" ] }, { "name": "stdout", "output_type": "stream", "text": [ " NDCG@10 MRR@10 Coverage@10 \\\n", "Popular Recommender 0.243783 0.390426 0.033903 \n", "Wilson Recommender 0.092121 0.180976 0.017092 \n", "Random Recommender (popularity-based) 0.066665 0.150434 0.760437 \n", "Random Recommender (uniform) 0.021725 0.054846 0.957691 \n", "\n", " fit_pred_time \n", "Popular Recommender 16.931190 \n", "Wilson Recommender 16.660789 \n", "Random Recommender (popularity-based) 10.100435 \n", "Random Recommender (uniform) 11.719672 \n", "CPU times: user 3.49 s, sys: 1.33 s, total: 4.82 s\n", "Wall time: 6min 50s\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\r", "[Stage 1930:===================================================>(143 + 1) / 144]\r", "\r", " \r" ] } ], "source": [ "%%time\n", "full_pipeline(non_personalized_models, e, train)" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
model  Coverage@10  HitRate@1  HitRate@5  HitRate@10  MAP@10  MRR@10  NDCG@10  Surprisal@10  fit_pred_time  params
Popular Recommender  0.033903  0.284460  0.530290  0.645303  0.157301  0.390426  0.243783  0.118354  16.931190  NaN
Wilson Recommender  0.017092  0.083406  0.345040  0.414399  0.045002  0.180976  0.092121  0.262190  16.660789  NaN
Random Recommender (popularity-based)  0.760437  0.071115  0.261633  0.378402  0.027026  0.150434  0.066665  0.344784  10.100435  {'distribution': 'popular_based', 'alpha': 24....
Random Recommender (uniform)  0.957691  0.017559  0.100088  0.167691  0.007332  0.054846  0.021725  0.538677  11.719672  NaN
\n", "
" ], "text/plain": [ " Coverage@10 HitRate@1 HitRate@5 \\\n", "Popular Recommender 0.033903 0.284460 0.530290 \n", "Wilson Recommender 0.017092 0.083406 0.345040 \n", "Random Recommender (popularity-based) 0.760437 0.071115 0.261633 \n", "Random Recommender (uniform) 0.957691 0.017559 0.100088 \n", "\n", " HitRate@10 MAP@10 MRR@10 \\\n", "Popular Recommender 0.645303 0.157301 0.390426 \n", "Wilson Recommender 0.414399 0.045002 0.180976 \n", "Random Recommender (popularity-based) 0.378402 0.027026 0.150434 \n", "Random Recommender (uniform) 0.167691 0.007332 0.054846 \n", "\n", " NDCG@10 Surprisal@10 fit_pred_time \\\n", "Popular Recommender 0.243783 0.118354 16.931190 \n", "Wilson Recommender 0.092121 0.262190 16.660789 \n", "Random Recommender (popularity-based) 0.066665 0.344784 10.100435 \n", "Random Recommender (uniform) 0.021725 0.538677 11.719672 \n", "\n", " params \n", "Popular Recommender NaN \n", "Wilson Recommender NaN \n", "Random Recommender (popularity-based) {'distribution': 'popular_based', 'alpha': 24.... 
\n", "Random Recommender (uniform) NaN " ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "e.results.sort_values('NDCG@10', ascending=False)" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [], "source": [ "e.results.to_csv('res_21_rel_1.csv')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2.2 Personalized models without features" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [], "source": [ "common_models = {\n", " 'ADMM SLIM': [ADMMSLIM(seed=SEED), {\"lambda_1\": [1e-6, 10],\n", " \"lambda_2\": [1e-6, 1000]},],\n", " 'Implicit ALS': [ALSWrap(seed=SEED), None], \n", " 'Explicit ALS': [ALSWrap(seed=SEED, implicit_prefs=False), None], \n", " 'KNN': [KNN(), None], \n", " 'LightFM': [LightFMWrap(random_state=SEED), {\"no_components\": [8, 512]}], \n", " 'SLIM': [SLIM(seed=SEED), None]}" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "25-Feb-22 18:13:01, replay, INFO: ADMM SLIM started\n", "25-Feb-22 18:13:01, replay, INFO: ADMM SLIM optimization started\n", "\u001b[32m[I 2022-02-25 18:13:01,337]\u001b[0m A new study created in memory with name: no-name-474fa5aa-13fa-4eb0-8877-90a6ada17610\u001b[0m\n", "22/02/25 18:13:01 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:13:01 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:13:14, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:13:14, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:13:27,023]\u001b[0m Trial 0 finished with value: 0.21300281804105106 and parameters: {'lambda_1': 0.8417364694294401, 'lambda_2': 62.68159062953527}. 
Best is trial 0 with value: 0.21300281804105106.\u001b[0m\n", "22/02/25 18:13:27 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:13:27 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:13:33, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:13:33, replay, WARNING: This model can't predict cold items, they will be ignored\n", "22/02/25 18:13:35 WARN TaskSetManager: Stage 2016 contains a task of very large size (3403 KiB). The maximum recommended task size is 1000 KiB.\n", "\u001b[32m[I 2022-02-25 18:13:47,243]\u001b[0m Trial 1 finished with value: 0.16633606135193407 and parameters: {'lambda_1': 0.0026839723484365086, 'lambda_2': 4.0171660978124555}. Best is trial 0 with value: 0.21300281804105106.\u001b[0m\n", "22/02/25 18:13:47 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:13:47 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:13:58, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:13:58, replay, WARNING: This model can't predict cold items, they will be ignored\n", "22/02/25 18:14:01 WARN TaskSetManager: Stage 2064 contains a task of very large size (2496 KiB). The maximum recommended task size is 1000 KiB.\n", "\u001b[32m[I 2022-02-25 18:14:13,279]\u001b[0m Trial 2 finished with value: 0.13527220995694636 and parameters: {'lambda_1': 0.0331848510334433, 'lambda_2': 0.00269323577759365}. 
Best is trial 0 with value: 0.21300281804105106.\u001b[0m\n", "22/02/25 18:14:13 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:14:13 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:14:30, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:14:30, replay, WARNING: This model can't predict cold items, they will be ignored\n", "22/02/25 18:14:32 WARN TaskSetManager: Stage 2111 contains a task of very large size (2192 KiB). The maximum recommended task size is 1000 KiB.\n", "\u001b[32m[I 2022-02-25 18:14:42,581]\u001b[0m Trial 3 finished with value: 0.11563834718196556 and parameters: {'lambda_1': 0.05631144225953223, 'lambda_2': 0.008086175274293016}. Best is trial 0 with value: 0.21300281804105106.\u001b[0m\n", "22/02/25 18:14:42 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:14:42 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:15:03, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:15:03, replay, WARNING: This model can't predict cold items, they will be ignored\n", "22/02/25 18:15:05 WARN TaskSetManager: Stage 2159 contains a task of very large size (2047 KiB). The maximum recommended task size is 1000 KiB.\n", "\u001b[32m[I 2022-02-25 18:15:17,857]\u001b[0m Trial 4 finished with value: 0.11296191273073267 and parameters: {'lambda_1': 0.07506012990908968, 'lambda_2': 3.2304188720030225e-05}. 
Best is trial 0 with value: 0.21300281804105106.\u001b[0m\n", "22/02/25 18:15:17 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:15:17 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:15:23, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:15:23, replay, WARNING: This model can't predict cold items, they will be ignored\n", "22/02/25 18:15:25 WARN TaskSetManager: Stage 2207 contains a task of very large size (4313 KiB). The maximum recommended task size is 1000 KiB.\n", "\u001b[32m[I 2022-02-25 18:15:37,171]\u001b[0m Trial 5 finished with value: 0.16020431179270944 and parameters: {'lambda_1': 3.894248858324239e-05, 'lambda_2': 5.325167965675722}. Best is trial 0 with value: 0.21300281804105106.\u001b[0m\n", "22/02/25 18:15:37 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:15:37 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:15:43, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:15:43, replay, WARNING: This model can't predict cold items, they will be ignored\n", "22/02/25 18:15:45 WARN TaskSetManager: Stage 2255 contains a task of very large size (4403 KiB). The maximum recommended task size is 1000 KiB.\n", "\u001b[32m[I 2022-02-25 18:15:57,208]\u001b[0m Trial 6 finished with value: 0.15497052724672036 and parameters: {'lambda_1': 3.728819929103577e-06, 'lambda_2': 0.0001433381048991562}. 
Best is trial 0 with value: 0.21300281804105106.\u001b[0m\n", "22/02/25 18:15:57 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:15:57 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:16:18, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:16:18, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:16:28,387]\u001b[0m Trial 7 finished with value: 0.1911351691713013 and parameters: {'lambda_1': 6.405554333749737, 'lambda_2': 338.5882070391319}. Best is trial 0 with value: 0.21300281804105106.\u001b[0m\n", "22/02/25 18:16:28 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:16:28 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:16:48, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:16:48, replay, WARNING: This model can't predict cold items, they will be ignored\n", "22/02/25 18:16:50 WARN TaskSetManager: Stage 2351 contains a task of very large size (1664 KiB). The maximum recommended task size is 1000 KiB.\n", "\u001b[32m[I 2022-02-25 18:16:58,912]\u001b[0m Trial 8 finished with value: 0.17374179581588542 and parameters: {'lambda_1': 0.22912395701006136, 'lambda_2': 84.3492785748498}. Best is trial 0 with value: 0.21300281804105106.\u001b[0m\n", "22/02/25 18:16:58 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:16:58 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:17:38, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:17:38, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:17:47,528]\u001b[0m Trial 9 finished with value: 0.15608116175098724 and parameters: {'lambda_1': 1.534276936525244, 'lambda_2': 2.7103228862965198e-05}. 
Best is trial 0 with value: 0.21300281804105106.\u001b[0m\n", "22/02/25 18:17:47 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:17:47 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:17:54, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:17:54, replay, WARNING: This model can't predict cold items, they will be ignored\n", "22/02/25 18:17:56 WARN TaskSetManager: Stage 2447 contains a task of very large size (3803 KiB). The maximum recommended task size is 1000 KiB.\n", "\u001b[32m[I 2022-02-25 18:18:07,535]\u001b[0m Trial 10 finished with value: 0.15742528077365897 and parameters: {'lambda_1': 0.0006888568990244618, 'lambda_2': 0.2121493341339812}. Best is trial 0 with value: 0.21300281804105106.\u001b[0m\n", "22/02/25 18:18:07 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:18:07 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:18:26, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:18:26, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:18:34,207]\u001b[0m Trial 11 finished with value: 0.1985943351417266 and parameters: {'lambda_1': 7.835711100515496, 'lambda_2': 805.4597251339675}. Best is trial 0 with value: 0.21300281804105106.\u001b[0m\n", "22/02/25 18:18:34 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:18:34 WARN CacheManager: Asked to cache already cached data.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "25-Feb-22 18:18:52, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:18:52, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:19:01,726]\u001b[0m Trial 12 finished with value: 0.19905606955492455 and parameters: {'lambda_1': 6.517023092744527, 'lambda_2': 805.8144516511958}. 
Best is trial 0 with value: 0.21300281804105106.\u001b[0m\n", "22/02/25 18:19:01 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:19:01 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:19:19, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:19:19, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:19:34,402]\u001b[0m Trial 13 finished with value: 0.16177655469968685 and parameters: {'lambda_1': 0.7561910399408325, 'lambda_2': 5.183021744998351}. Best is trial 0 with value: 0.21300281804105106.\u001b[0m\n", "22/02/25 18:19:34 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:19:34 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:19:50, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:19:50, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:19:58,256]\u001b[0m Trial 14 finished with value: 0.18577685563341298 and parameters: {'lambda_1': 9.373073583775676, 'lambda_2': 39.9159121546185}. Best is trial 0 with value: 0.21300281804105106.\u001b[0m\n", "22/02/25 18:19:58 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:19:58 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:20:04, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:20:04, replay, WARNING: This model can't predict cold items, they will be ignored\n", "22/02/25 18:20:06 WARN TaskSetManager: Stage 2688 contains a task of very large size (2958 KiB). The maximum recommended task size is 1000 KiB.\n", "\u001b[32m[I 2022-02-25 18:20:16,453]\u001b[0m Trial 15 finished with value: 0.16489309895278065 and parameters: {'lambda_1': 0.007152197272175191, 'lambda_2': 0.1469331981105664}. 
Best is trial 0 with value: 0.21300281804105106.\u001b[0m\n", "22/02/25 18:20:16 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:20:16 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:20:49, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:20:49, replay, WARNING: This model can't predict cold items, they will be ignored\n", "22/02/25 18:20:51 WARN TaskSetManager: Stage 2736 contains a task of very large size (1111 KiB). The maximum recommended task size is 1000 KiB.\n", "\u001b[32m[I 2022-02-25 18:20:59,778]\u001b[0m Trial 16 finished with value: 0.15904321309095248 and parameters: {'lambda_1': 0.5876401586615683, 'lambda_2': 33.61501975205848}. Best is trial 0 with value: 0.21300281804105106.\u001b[0m\n", "22/02/25 18:20:59 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:20:59 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:21:06, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:21:06, replay, WARNING: This model can't predict cold items, they will be ignored\n", "22/02/25 18:21:08 WARN TaskSetManager: Stage 2783 contains a task of very large size (3819 KiB). The maximum recommended task size is 1000 KiB.\n", "\u001b[32m[I 2022-02-25 18:21:20,495]\u001b[0m Trial 17 finished with value: 0.15864957348401526 and parameters: {'lambda_1': 0.0005375185259248001, 'lambda_2': 0.8584948104511229}. 
Best is trial 0 with value: 0.21300281804105106.\u001b[0m\n", "22/02/25 18:21:20 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:21:20 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:21:48, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:21:48, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:21:58,113]\u001b[0m Trial 18 finished with value: 0.20077218584235693 and parameters: {'lambda_1': 1.4670495377769377, 'lambda_2': 966.601698114542}. Best is trial 0 with value: 0.21300281804105106.\u001b[0m\n", "22/02/25 18:21:58 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:21:58 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:22:09, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:22:09, replay, WARNING: This model can't predict cold items, they will be ignored\n", "22/02/25 18:22:12 WARN TaskSetManager: Stage 2880 contains a task of very large size (2893 KiB). The maximum recommended task size is 1000 KiB.\n", "\u001b[32m[I 2022-02-25 18:22:22,100]\u001b[0m Trial 19 finished with value: 0.15271308341501147 and parameters: {'lambda_1': 0.019334126264360343, 'lambda_2': 2.2134616598012305e-06}. 
Best is trial 0 with value: 0.21300281804105106.\u001b[0m\n", "25-Feb-22 18:22:22, replay, INFO: best params for ADMM SLIM are: {'lambda_1': 0.8417364694294401, 'lambda_2': 62.68159062953527}\n", "25-Feb-22 18:22:22, replay, INFO: ADMM SLIM fit_predict started\n", "22/02/25 18:22:22 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:22:22 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:23:01, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:23:01, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:24:24, replay, INFO: Implicit ALS started \n", "25-Feb-22 18:24:24, replay, INFO: Implicit ALS optimization started\n", "\u001b[32m[I 2022-02-25 18:24:24,141]\u001b[0m A new study created in memory with name: no-name-ad654e07-97cc-4ad2-a22f-40e265dc0f2c\u001b[0m\n", "/home/u19893556/miniconda3/envs/replay/lib/python3.7/site-packages/optuna/distributions.py:364: FutureWarning: Samplers and other components in Optuna will assume that `step` is 1. `step` argument is deprecated and will be removed in the future. 
The removal of this feature is currently scheduled for v4.0.0, but this schedule is subject to change.\n", " FutureWarning,\n", "22/02/25 18:24:24 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:24:24 WARN CacheManager: Asked to cache already cached data.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " NDCG@10 MRR@10 Coverage@10 \\\n", "Popular Recommender 0.243783 0.390426 0.033903 \n", "ADMM SLIM 0.216480 0.373958 0.348837 \n", "Wilson Recommender 0.092121 0.180976 0.017092 \n", "Random Recommender (popularity-based) 0.066665 0.150434 0.760437 \n", "Random Recommender (uniform) 0.021725 0.054846 0.957691 \n", "\n", " fit_pred_time \n", "Popular Recommender 16.931190 \n", "ADMM SLIM 56.977886 \n", "Wilson Recommender 16.660789 \n", "Random Recommender (popularity-based) 10.100435 \n", "Random Recommender (uniform) 11.719672 \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "22/02/25 18:24:26 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeSystemBLAS\n", "22/02/25 18:24:26 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeRefBLAS\n", "22/02/25 18:24:26 WARN LAPACK: Failed to load implementation from: com.github.fommil.netlib.NativeSystemLAPACK\n", "22/02/25 18:24:26 WARN LAPACK: Failed to load implementation from: com.github.fommil.netlib.NativeRefLAPACK\n", "25-Feb-22 18:24:29, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:24:29, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:24:29, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:24:29, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:24:53,988]\u001b[0m Trial 0 finished with value: 0.2087745613888222 and parameters: {'rank': 10}. 
Best is trial 0 with value: 0.2087745613888222.\u001b[0m\n", "22/02/25 18:24:54 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:24:54 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:25:00 WARN DAGScheduler: Broadcasting large task binary with size 1004.8 KiB\n", "22/02/25 18:25:01 WARN DAGScheduler: Broadcasting large task binary with size 1129.3 KiB\n", "22/02/25 18:25:03 WARN DAGScheduler: Broadcasting large task binary with size 1005.7 KiB\n", "22/02/25 18:25:03 WARN DAGScheduler: Broadcasting large task binary with size 1253.8 KiB\n", "22/02/25 18:25:04 WARN DAGScheduler: Broadcasting large task binary with size 1130.2 KiB\n", "22/02/25 18:25:04 WARN DAGScheduler: Broadcasting large task binary with size 1378.2 KiB\n", "22/02/25 18:25:06 WARN DAGScheduler: Broadcasting large task binary with size 1254.6 KiB\n", "22/02/25 18:25:06 WARN DAGScheduler: Broadcasting large task binary with size 1502.7 KiB\n", "22/02/25 18:25:07 WARN DAGScheduler: Broadcasting large task binary with size 1379.1 KiB\n", "22/02/25 18:25:07 WARN DAGScheduler: Broadcasting large task binary with size 1627.2 KiB\n", "22/02/25 18:25:09 WARN DAGScheduler: Broadcasting large task binary with size 1503.6 KiB\n", "22/02/25 18:25:09 WARN DAGScheduler: Broadcasting large task binary with size 1751.7 KiB\n", "22/02/25 18:25:10 WARN DAGScheduler: Broadcasting large task binary with size 1628.1 KiB\n", "22/02/25 18:25:10 WARN DAGScheduler: Broadcasting large task binary with size 1876.1 KiB\n", "22/02/25 18:25:12 WARN DAGScheduler: Broadcasting large task binary with size 1752.5 KiB\n", "22/02/25 18:25:12 WARN DAGScheduler: Broadcasting large task binary with size 2000.6 KiB\n", "22/02/25 18:25:13 WARN DAGScheduler: Broadcasting large task binary with size 1877.0 KiB\n", "22/02/25 18:25:13 WARN DAGScheduler: Broadcasting large task binary with size 2.1 MiB\n", "22/02/25 18:25:15 WARN DAGScheduler: Broadcasting large task binary with size 2001.5 KiB\n", 
"22/02/25 18:25:15 WARN DAGScheduler: Broadcasting large task binary with size 2.2 MiB\n", "22/02/25 18:25:16 WARN DAGScheduler: Broadcasting large task binary with size 2.1 MiB\n", "22/02/25 18:25:16 WARN DAGScheduler: Broadcasting large task binary with size 2.3 MiB\n", "22/02/25 18:25:18 WARN DAGScheduler: Broadcasting large task binary with size 2.2 MiB\n", "22/02/25 18:25:18 WARN DAGScheduler: Broadcasting large task binary with size 2.4 MiB\n", "22/02/25 18:25:19 WARN DAGScheduler: Broadcasting large task binary with size 2.3 MiB\n", "22/02/25 18:25:19 WARN DAGScheduler: Broadcasting large task binary with size 2.6 MiB\n", "22/02/25 18:25:21 WARN DAGScheduler: Broadcasting large task binary with size 2.4 MiB\n", "22/02/25 18:25:21 WARN DAGScheduler: Broadcasting large task binary with size 2.7 MiB\n", "22/02/25 18:25:22 WARN DAGScheduler: Broadcasting large task binary with size 2.6 MiB\n", "22/02/25 18:25:23 WARN DAGScheduler: Broadcasting large task binary with size 2.8 MiB\n", "22/02/25 18:25:24 WARN DAGScheduler: Broadcasting large task binary with size 2.7 MiB\n", "22/02/25 18:25:24 WARN DAGScheduler: Broadcasting large task binary with size 2.8 MiB\n", "22/02/25 18:25:26 WARN DAGScheduler: Broadcasting large task binary with size 2.7 MiB\n", "25-Feb-22 18:25:26, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:25:26, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:25:26, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:25:26, replay, WARNING: This model can't predict cold items, they will be ignored\n", "22/02/25 18:25:28 WARN DAGScheduler: Broadcasting large task binary with size 2.8 MiB\n", "22/02/25 18:25:28 WARN DAGScheduler: Broadcasting large task binary with size 2.7 MiB\n", "22/02/25 18:25:43 WARN DAGScheduler: Broadcasting large task binary with size 2.9 MiB\n", "22/02/25 18:25:45 WARN DAGScheduler: Broadcasting 
large task binary with size 2.9 MiB\n", "22/02/25 18:25:49 WARN DAGScheduler: Broadcasting large task binary with size 3.0 MiB\n", "22/02/25 18:25:50 WARN DAGScheduler: Broadcasting large task binary with size 3.0 MiB\n", "22/02/25 18:25:51 WARN DAGScheduler: Broadcasting large task binary with size 3.0 MiB\n", "22/02/25 18:25:52 WARN DAGScheduler: Broadcasting large task binary with size 3.0 MiB\n", "\u001b[32m[I 2022-02-25 18:25:52,474]\u001b[0m Trial 1 finished with value: 0.17305266876239014 and parameters: {'rank': 175}. Best is trial 0 with value: 0.2087745613888222.\u001b[0m\n", "22/02/25 18:25:52 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:25:52 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:26:00 WARN DAGScheduler: Broadcasting large task binary with size 1011.9 KiB\n", "22/02/25 18:26:01 WARN DAGScheduler: Broadcasting large task binary with size 1048.9 KiB\n", "22/02/25 18:26:01 WARN DAGScheduler: Broadcasting large task binary with size 1012.7 KiB\n", "22/02/25 18:26:01 WARN DAGScheduler: Broadcasting large task binary with size 1085.9 KiB\n", "22/02/25 18:26:02 WARN DAGScheduler: Broadcasting large task binary with size 1049.8 KiB\n", "22/02/25 18:26:02 WARN DAGScheduler: Broadcasting large task binary with size 1122.9 KiB\n", "22/02/25 18:26:02 WARN DAGScheduler: Broadcasting large task binary with size 1086.8 KiB\n", "22/02/25 18:26:02 WARN DAGScheduler: Broadcasting large task binary with size 1124.4 KiB\n", "22/02/25 18:26:03 WARN DAGScheduler: Broadcasting large task binary with size 1087.4 KiB\n", "25-Feb-22 18:26:03, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:26:03, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:26:03, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:26:03, replay, WARNING: This model can't predict cold items, they will be ignored\n", 
"22/02/25 18:26:05 WARN DAGScheduler: Broadcasting large task binary with size 1134.9 KiB\n", "22/02/25 18:26:05 WARN DAGScheduler: Broadcasting large task binary with size 1097.9 KiB\n", "22/02/25 18:26:21 WARN DAGScheduler: Broadcasting large task binary with size 1223.8 KiB\n", "22/02/25 18:26:23 WARN DAGScheduler: Broadcasting large task binary with size 1248.1 KiB\n", "22/02/25 18:26:26 WARN DAGScheduler: Broadcasting large task binary with size 1314.2 KiB\n", "22/02/25 18:26:27 WARN DAGScheduler: Broadcasting large task binary with size 1297.0 KiB\n", "22/02/25 18:26:28 WARN DAGScheduler: Broadcasting large task binary with size 1308.3 KiB\n", "22/02/25 18:26:29 WARN DAGScheduler: Broadcasting large task binary with size 1334.8 KiB\n", "\u001b[32m[I 2022-02-25 18:26:29,581]\u001b[0m Trial 2 finished with value: 0.1748208366108379 and parameters: {'rank': 93}. Best is trial 0 with value: 0.2087745613888222.\u001b[0m\n", "22/02/25 18:26:29 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:26:29 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:26:37 WARN DAGScheduler: Broadcasting large task binary with size 1073.5 KiB\n", "22/02/25 18:26:38 WARN DAGScheduler: Broadcasting large task binary with size 1159.9 KiB\n", "22/02/25 18:26:39 WARN DAGScheduler: Broadcasting large task binary with size 1074.4 KiB\n", "22/02/25 18:26:39 WARN DAGScheduler: Broadcasting large task binary with size 1246.3 KiB\n", "22/02/25 18:26:40 WARN DAGScheduler: Broadcasting large task binary with size 1160.8 KiB\n", "22/02/25 18:26:40 WARN DAGScheduler: Broadcasting large task binary with size 1332.7 KiB\n", "22/02/25 18:26:41 WARN DAGScheduler: Broadcasting large task binary with size 1247.2 KiB\n", "22/02/25 18:26:41 WARN DAGScheduler: Broadcasting large task binary with size 1419.1 KiB\n", "22/02/25 18:26:42 WARN DAGScheduler: Broadcasting large task binary with size 1333.6 KiB\n", "22/02/25 18:26:42 WARN DAGScheduler: Broadcasting large 
task binary with size 1505.5 KiB\n", "22/02/25 18:26:43 WARN DAGScheduler: Broadcasting large task binary with size 1420.0 KiB\n", "22/02/25 18:26:44 WARN DAGScheduler: Broadcasting large task binary with size 1591.9 KiB\n", "22/02/25 18:26:44 WARN DAGScheduler: Broadcasting large task binary with size 1506.4 KiB\n", "22/02/25 18:26:44 WARN DAGScheduler: Broadcasting large task binary with size 1678.3 KiB\n", "22/02/25 18:26:46 WARN DAGScheduler: Broadcasting large task binary with size 1592.8 KiB\n", "22/02/25 18:26:46 WARN DAGScheduler: Broadcasting large task binary with size 1764.7 KiB\n", "22/02/25 18:26:46 WARN DAGScheduler: Broadcasting large task binary with size 1679.2 KiB\n", "22/02/25 18:26:47 WARN DAGScheduler: Broadcasting large task binary with size 1851.1 KiB\n", "22/02/25 18:26:48 WARN DAGScheduler: Broadcasting large task binary with size 1765.6 KiB\n", "22/02/25 18:26:49 WARN DAGScheduler: Broadcasting large task binary with size 1937.5 KiB\n", "22/02/25 18:26:50 WARN DAGScheduler: Broadcasting large task binary with size 1851.9 KiB\n", "22/02/25 18:26:50 WARN DAGScheduler: Broadcasting large task binary with size 2023.8 KiB\n", "22/02/25 18:26:51 WARN DAGScheduler: Broadcasting large task binary with size 1938.3 KiB\n", "22/02/25 18:26:51 WARN DAGScheduler: Broadcasting large task binary with size 2.1 MiB\n", "22/02/25 18:26:52 WARN DAGScheduler: Broadcasting large task binary with size 2024.7 KiB\n", "22/02/25 18:26:52 WARN DAGScheduler: Broadcasting large task binary with size 2.1 MiB\n", "22/02/25 18:26:53 WARN DAGScheduler: Broadcasting large task binary with size 2025.3 KiB\n", "25-Feb-22 18:26:53, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:26:53, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:26:53, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:26:53, replay, WARNING: This model can't predict cold 
items, they will be ignored\n", "22/02/25 18:26:56 WARN DAGScheduler: Broadcasting large task binary with size 2035.8 KiB\n", "22/02/25 18:26:56 WARN DAGScheduler: Broadcasting large task binary with size 2.1 MiB\n", "22/02/25 18:27:10 WARN DAGScheduler: Broadcasting large task binary with size 2.2 MiB\n", "22/02/25 18:27:12 WARN DAGScheduler: Broadcasting large task binary with size 2.2 MiB\n", "22/02/25 18:27:18 WARN DAGScheduler: Broadcasting large task binary with size 2.2 MiB\n", "22/02/25 18:27:20 WARN DAGScheduler: Broadcasting large task binary with size 2.2 MiB\n", "22/02/25 18:27:21 WARN DAGScheduler: Broadcasting large task binary with size 2.2 MiB\n", "22/02/25 18:27:21 WARN DAGScheduler: Broadcasting large task binary with size 2.3 MiB\n", "\u001b[32m[I 2022-02-25 18:27:22,038]\u001b[0m Trial 3 finished with value: 0.17641949083790578 and parameters: {'rank': 145}. Best is trial 0 with value: 0.2087745613888222.\u001b[0m\n", "22/02/25 18:27:22 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:27:22 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:27:26 WARN DAGScheduler: Broadcasting large task binary with size 1092.9 KiB\n", "22/02/25 18:27:29 WARN DAGScheduler: Broadcasting large task binary with size 1329.8 KiB\n", "22/02/25 18:27:32 WARN DAGScheduler: Broadcasting large task binary with size 1093.8 KiB\n", "22/02/25 18:27:32 WARN DAGScheduler: Broadcasting large task binary with size 1566.6 KiB\n", "22/02/25 18:27:35 WARN DAGScheduler: Broadcasting large task binary with size 1330.6 KiB\n", "22/02/25 18:27:35 WARN DAGScheduler: Broadcasting large task binary with size 1803.4 KiB\n", "22/02/25 18:27:38 WARN DAGScheduler: Broadcasting large task binary with size 1567.5 KiB\n", "22/02/25 18:27:38 WARN DAGScheduler: Broadcasting large task binary with size 2040.3 KiB\n", "22/02/25 18:27:41 WARN DAGScheduler: Broadcasting large task binary with size 1804.3 KiB\n", "22/02/25 18:27:41 WARN DAGScheduler: 
Broadcasting large task binary with size 2.2 MiB\n", "22/02/25 18:27:44 WARN DAGScheduler: Broadcasting large task binary with size 2041.2 KiB\n", "22/02/25 18:27:44 WARN DAGScheduler: Broadcasting large task binary with size 2.5 MiB\n", "22/02/25 18:27:47 WARN DAGScheduler: Broadcasting large task binary with size 2.2 MiB\n", "22/02/25 18:27:48 WARN DAGScheduler: Broadcasting large task binary with size 2.7 MiB\n", "22/02/25 18:27:50 WARN DAGScheduler: Broadcasting large task binary with size 2.5 MiB\n", "22/02/25 18:27:51 WARN DAGScheduler: Broadcasting large task binary with size 2.9 MiB\n", "22/02/25 18:27:54 WARN DAGScheduler: Broadcasting large task binary with size 2.7 MiB\n", "22/02/25 18:27:54 WARN DAGScheduler: Broadcasting large task binary with size 3.1 MiB\n", "22/02/25 18:27:57 WARN DAGScheduler: Broadcasting large task binary with size 2.9 MiB\n", "22/02/25 18:27:57 WARN DAGScheduler: Broadcasting large task binary with size 3.4 MiB\n", "22/02/25 18:28:00 WARN DAGScheduler: Broadcasting large task binary with size 3.1 MiB\n", "22/02/25 18:28:00 WARN DAGScheduler: Broadcasting large task binary with size 3.6 MiB\n", "22/02/25 18:28:03 WARN DAGScheduler: Broadcasting large task binary with size 3.4 MiB\n", "22/02/25 18:28:03 WARN DAGScheduler: Broadcasting large task binary with size 3.8 MiB\n", "22/02/25 18:28:06 WARN DAGScheduler: Broadcasting large task binary with size 3.6 MiB\n", "22/02/25 18:28:07 WARN DAGScheduler: Broadcasting large task binary with size 4.1 MiB\n", "22/02/25 18:28:09 WARN DAGScheduler: Broadcasting large task binary with size 3.8 MiB\n", "22/02/25 18:28:09 WARN DAGScheduler: Broadcasting large task binary with size 4.3 MiB\n", "22/02/25 18:28:13 WARN DAGScheduler: Broadcasting large task binary with size 4.1 MiB\n", "22/02/25 18:28:13 WARN DAGScheduler: Broadcasting large task binary with size 4.5 MiB\n", "22/02/25 18:28:16 WARN DAGScheduler: Broadcasting large task binary with size 4.3 MiB\n", "22/02/25 18:28:16 WARN 
DAGScheduler: Broadcasting large task binary with size 4.8 MiB\n", "22/02/25 18:28:19 WARN DAGScheduler: Broadcasting large task binary with size 4.5 MiB\n", "22/02/25 18:28:19 WARN DAGScheduler: Broadcasting large task binary with size 5.0 MiB\n", "22/02/25 18:28:22 WARN DAGScheduler: Broadcasting large task binary with size 4.8 MiB\n", "22/02/25 18:28:22 WARN DAGScheduler: Broadcasting large task binary with size 5.0 MiB\n", "22/02/25 18:28:26 WARN DAGScheduler: Broadcasting large task binary with size 4.8 MiB\n", "25-Feb-22 18:28:26, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:28:26, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:28:26, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:28:26, replay, WARNING: This model can't predict cold items, they will be ignored\n", "22/02/25 18:28:29 WARN DAGScheduler: Broadcasting large task binary with size 4.8 MiB\n", "22/02/25 18:28:29 WARN DAGScheduler: Broadcasting large task binary with size 5.0 MiB\n", "22/02/25 18:28:45 WARN DAGScheduler: Broadcasting large task binary with size 5.1 MiB\n", "22/02/25 18:28:48 WARN DAGScheduler: Broadcasting large task binary with size 5.1 MiB\n", "22/02/25 18:28:53 WARN DAGScheduler: Broadcasting large task binary with size 5.2 MiB\n", "22/02/25 18:28:54 WARN DAGScheduler: Broadcasting large task binary with size 5.2 MiB\n", "22/02/25 18:28:55 WARN DAGScheduler: Broadcasting large task binary with size 5.2 MiB\n", "22/02/25 18:28:58 WARN DAGScheduler: Broadcasting large task binary with size 5.2 MiB\n", "\u001b[32m[I 2022-02-25 18:28:59,324]\u001b[0m Trial 4 finished with value: 0.17185749028039718 and parameters: {'rank': 243}. 
Best is trial 0 with value: 0.2087745613888222.\u001b[0m\n", "22/02/25 18:28:59 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:28:59 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:29:02, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:29:02, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:29:03, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:29:03, replay, WARNING: This model can't predict cold items, they will be ignored\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\u001b[32m[I 2022-02-25 18:29:34,674]\u001b[0m Trial 5 finished with value: 0.19340064264131435 and parameters: {'rank': 29}. Best is trial 0 with value: 0.2087745613888222.\u001b[0m\n", "22/02/25 18:29:34 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:29:34 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:29:37, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:29:37, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:29:37, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:29:37, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:30:01,211]\u001b[0m Trial 6 finished with value: 0.19864077980817504 and parameters: {'rank': 25}. 
Best is trial 0 with value: 0.2087745613888222.\u001b[0m\n", "22/02/25 18:30:01 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:30:01 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:30:10, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:30:10, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:30:10, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:30:10, replay, WARNING: This model can't predict cold items, they will be ignored\n", "22/02/25 18:30:32 WARN DAGScheduler: Broadcasting large task binary with size 1069.1 KiB\n", "22/02/25 18:30:35 WARN DAGScheduler: Broadcasting large task binary with size 1093.4 KiB\n", "22/02/25 18:30:38 WARN DAGScheduler: Broadcasting large task binary with size 1159.5 KiB\n", "22/02/25 18:30:39 WARN DAGScheduler: Broadcasting large task binary with size 1142.4 KiB\n", "22/02/25 18:30:40 WARN DAGScheduler: Broadcasting large task binary with size 1153.6 KiB\n", "22/02/25 18:30:40 WARN DAGScheduler: Broadcasting large task binary with size 1180.1 KiB\n", "\u001b[32m[I 2022-02-25 18:30:41,368]\u001b[0m Trial 7 finished with value: 0.17573029777489088 and parameters: {'rank': 82}. 
Best is trial 0 with value: 0.2087745613888222.\u001b[0m\n", "22/02/25 18:30:41 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:30:41 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:30:44, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:30:44, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:30:44, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:30:44, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:31:17,554]\u001b[0m Trial 8 finished with value: 0.19340064264131435 and parameters: {'rank': 29}. Best is trial 0 with value: 0.2087745613888222.\u001b[0m\n", "22/02/25 18:31:17 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:31:17 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:31:20, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:31:21, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:31:21, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:31:21, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:31:48,639]\u001b[0m Trial 9 finished with value: 0.19662544429266166 and parameters: {'rank': 24}. 
Best is trial 0 with value: 0.2087745613888222.\u001b[0m\n", "22/02/25 18:31:48 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:31:48 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:31:51, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:31:51, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:31:51, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:31:51, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:32:31,903]\u001b[0m Trial 10 finished with value: 0.2159939352928923 and parameters: {'rank': 8}. Best is trial 10 with value: 0.2159939352928923.\u001b[0m\n", "22/02/25 18:32:31 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:32:31 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:32:34, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:32:34, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:32:34, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:32:34, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:33:08,698]\u001b[0m Trial 11 finished with value: 0.2159939352928923 and parameters: {'rank': 8}. 
Best is trial 10 with value: 0.2159939352928923.\u001b[0m\n", "22/02/25 18:33:08 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:33:08 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:33:11, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:33:11, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:33:11, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:33:11, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:33:44,576]\u001b[0m Trial 12 finished with value: 0.2159939352928923 and parameters: {'rank': 8}. Best is trial 10 with value: 0.2159939352928923.\u001b[0m\n", "22/02/25 18:33:44 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:33:44 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:33:49, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:33:49, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:33:49, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:33:49, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:34:21,301]\u001b[0m Trial 13 finished with value: 0.20338683139989547 and parameters: {'rank': 13}. 
Best is trial 10 with value: 0.2159939352928923.\u001b[0m\n", "22/02/25 18:34:21 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:34:21 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:34:23, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:34:24, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:34:24, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:34:24, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:34:58,708]\u001b[0m Trial 14 finished with value: 0.20092112717623575 and parameters: {'rank': 14}. Best is trial 10 with value: 0.2159939352928923.\u001b[0m\n", "22/02/25 18:34:58 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:34:58 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:35:01, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:35:01, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:35:01, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:35:01, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:35:51,511]\u001b[0m Trial 15 finished with value: 0.2012198464341923 and parameters: {'rank': 16}. 
Best is trial 10 with value: 0.2159939352928923.\u001b[0m\n", "22/02/25 18:35:51 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:35:51 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:35:58, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:35:58, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:35:58, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:35:58, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:36:27,119]\u001b[0m Trial 16 finished with value: 0.2159939352928923 and parameters: {'rank': 8}. Best is trial 10 with value: 0.2159939352928923.\u001b[0m\n", "22/02/25 18:36:27 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:36:27 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:36:34, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:36:34, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:36:34, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:36:34, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:37:08,237]\u001b[0m Trial 17 finished with value: 0.1802430095769175 and parameters: {'rank': 53}. 
Best is trial 10 with value: 0.2159939352928923.\u001b[0m\n", "22/02/25 18:37:08 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:37:08 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:37:11, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:37:11, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:37:11, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:37:11, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:37:39,099]\u001b[0m Trial 18 finished with value: 0.20136589109047304 and parameters: {'rank': 17}. Best is trial 10 with value: 0.2159939352928923.\u001b[0m\n", "22/02/25 18:37:39 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:37:39 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:37:43, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:37:43, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:37:43, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:37:43, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:38:10,385]\u001b[0m Trial 19 finished with value: 0.18217301749533787 and parameters: {'rank': 49}. 
Best is trial 10 with value: 0.2159939352928923.\u001b[0m\n", "25-Feb-22 18:38:10, replay, INFO: best params for Implicit ALS are: {'rank': 8}\n", "25-Feb-22 18:38:10, replay, INFO: Implicit ALS fit_predict started\n", "22/02/25 18:38:10 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:38:10 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:38:12, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:38:12, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:38:12, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:38:12, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:40:24, replay, INFO: Explicit ALS started\n", "25-Feb-22 18:40:24, replay, INFO: Explicit ALS optimization started\n", "\u001b[32m[I 2022-02-25 18:40:24,854]\u001b[0m A new study created in memory with name: no-name-b0234304-de3f-467f-8ca3-12ee2c114c0d\u001b[0m\n", "/home/u19893556/miniconda3/envs/replay/lib/python3.7/site-packages/optuna/distributions.py:364: FutureWarning: Samplers and other components in Optuna will assume that `step` is 1. `step` argument is deprecated and will be removed in the future. 
The removal of this feature is currently scheduled for v4.0.0, but this schedule is subject to change.\n", " FutureWarning,\n", "22/02/25 18:40:24 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:40:24 WARN CacheManager: Asked to cache already cached data.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " NDCG@10 MRR@10 Coverage@10 \\\n", "Implicit ALS 0.253444 0.406855 0.131129 \n", "Popular Recommender 0.243783 0.390426 0.033903 \n", "ADMM SLIM 0.216480 0.373958 0.348837 \n", "Wilson Recommender 0.092121 0.180976 0.017092 \n", "Random Recommender (popularity-based) 0.066665 0.150434 0.760437 \n", "Random Recommender (uniform) 0.021725 0.054846 0.957691 \n", "\n", " fit_pred_time \n", "Implicit ALS 32.185843 \n", "Popular Recommender 16.931190 \n", "ADMM SLIM 56.977886 \n", "Wilson Recommender 16.660789 \n", "Random Recommender (popularity-based) 10.100435 \n", "Random Recommender (uniform) 11.719672 \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "25-Feb-22 18:40:28, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:40:28, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:40:28, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:40:28, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:40:58,997]\u001b[0m Trial 0 finished with value: 0.008955289357265354 and parameters: {'rank': 10}. 
Best is trial 0 with value: 0.008955289357265354.\u001b[0m\n", "22/02/25 18:40:59 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:40:59 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:41:27, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:41:27, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:41:27, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:41:27, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:42:04,407]\u001b[0m Trial 1 finished with value: 0.023477530684574626 and parameters: {'rank': 154}. Best is trial 1 with value: 0.023477530684574626.\u001b[0m\n", "22/02/25 18:42:04 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:42:04 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:43:16, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:43:16, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:43:16, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:43:16, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:44:06,220]\u001b[0m Trial 2 finished with value: 0.019533663325742814 and parameters: {'rank': 252}. 
Best is trial 1 with value: 0.023477530684574626.\u001b[0m\n", "22/02/25 18:44:06 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:44:06 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:44:08, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:44:08, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:44:08, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:44:08, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:44:43,129]\u001b[0m Trial 3 finished with value: 0.016537098203526356 and parameters: {'rank': 8}. Best is trial 1 with value: 0.023477530684574626.\u001b[0m\n", "22/02/25 18:44:43 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:44:43 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:44:45, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:44:45, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:44:45, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:44:45, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:45:18,316]\u001b[0m Trial 4 finished with value: 0.01698723632187098 and parameters: {'rank': 19}. 
Best is trial 1 with value: 0.023477530684574626.\u001b[0m\n", "22/02/25 18:45:18 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:45:18 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:45:23, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:45:23, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:45:24, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:45:24, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:45:54,187]\u001b[0m Trial 5 finished with value: 0.0351628297834589 and parameters: {'rank': 32}. Best is trial 5 with value: 0.0351628297834589.\u001b[0m\n", "22/02/25 18:45:54 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:45:54 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:45:58, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:45:58, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:45:58, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:45:58, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:46:39,864]\u001b[0m Trial 6 finished with value: 0.024091514129978394 and parameters: {'rank': 33}. 
Best is trial 5 with value: 0.0351628297834589.\u001b[0m\n", "22/02/25 18:46:39 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:46:39 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:46:41, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:46:41, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:46:41, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:46:41, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:47:11,993]\u001b[0m Trial 7 finished with value: 0.008955289357265354 and parameters: {'rank': 10}. Best is trial 5 with value: 0.0351628297834589.\u001b[0m\n", "22/02/25 18:47:12 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:47:12 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:47:18, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:47:18, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:47:18, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:47:18, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:47:53,312]\u001b[0m Trial 8 finished with value: 0.026573136136269142 and parameters: {'rank': 72}. 
Best is trial 5 with value: 0.0351628297834589.\u001b[0m\n", "22/02/25 18:47:53 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:47:53 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:48:15, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:48:15, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:48:15, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:48:15, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:48:54,156]\u001b[0m Trial 9 finished with value: 0.022072761578651786 and parameters: {'rank': 140}. Best is trial 5 with value: 0.0351628297834589.\u001b[0m\n", "22/02/25 18:48:54 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:48:54 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:49:03, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:49:03, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:49:03, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:49:03, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:49:37,089]\u001b[0m Trial 10 finished with value: 0.021146637937479635 and parameters: {'rank': 46}. 
Best is trial 5 with value: 0.0351628297834589.\u001b[0m\n", "22/02/25 18:49:37 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:49:37 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:49:44, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:49:44, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:49:44, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:49:44, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:50:23,351]\u001b[0m Trial 11 finished with value: 0.029342052597121235 and parameters: {'rank': 58}. Best is trial 5 with value: 0.0351628297834589.\u001b[0m\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "22/02/25 18:50:23 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:50:23 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:50:26, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:50:26, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:50:26, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:50:26, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:51:07,353]\u001b[0m Trial 12 finished with value: 0.028913571540665126 and parameters: {'rank': 26}. 
Best is trial 5 with value: 0.0351628297834589.\u001b[0m\n", "22/02/25 18:51:07 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:51:07 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:51:15, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:51:15, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:51:15, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:51:15, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:51:49,459]\u001b[0m Trial 13 finished with value: 0.03223115300900205 and parameters: {'rank': 68}. Best is trial 5 with value: 0.0351628297834589.\u001b[0m\n", "22/02/25 18:51:49 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:51:49 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:51:58, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:51:58, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:51:58, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:51:58, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:52:37,976]\u001b[0m Trial 14 finished with value: 0.02440488953053382 and parameters: {'rank': 87}. 
Best is trial 5 with value: 0.0351628297834589.\u001b[0m\n", "22/02/25 18:52:38 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:52:38 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:52:43, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:52:43, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:52:43, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:52:43, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:53:16,765]\u001b[0m Trial 15 finished with value: 0.01698723632187098 and parameters: {'rank': 19}. Best is trial 5 with value: 0.0351628297834589.\u001b[0m\n", "22/02/25 18:53:16 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:53:16 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:53:31, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:53:31, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:53:31, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:53:31, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:54:07,413]\u001b[0m Trial 16 finished with value: 0.024170785631848903 and parameters: {'rank': 103}. 
Best is trial 5 with value: 0.0351628297834589.\u001b[0m\n", "22/02/25 18:54:07 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:54:07 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:54:11, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:54:11, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:54:11, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:54:11, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:55:08,247]\u001b[0m Trial 17 finished with value: 0.019199738748180862 and parameters: {'rank': 34}. Best is trial 5 with value: 0.0351628297834589.\u001b[0m\n", "22/02/25 18:55:08 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:55:08 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:55:12, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:55:12, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:55:12, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:55:12, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:55:55,187]\u001b[0m Trial 18 finished with value: 0.02763944542994879 and parameters: {'rank': 17}. 
Best is trial 5 with value: 0.0351628297834589.\u001b[0m\n", "22/02/25 18:55:55 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:55:55 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:56:00, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:56:00, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:56:00, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:56:00, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 18:56:39,820]\u001b[0m Trial 19 finished with value: 0.024138106227539577 and parameters: {'rank': 50}. Best is trial 5 with value: 0.0351628297834589.\u001b[0m\n", "25-Feb-22 18:56:39, replay, INFO: best params for Explicit ALS are: {'rank': 32}\n", "25-Feb-22 18:56:39, replay, INFO: Explicit ALS fit_predict started\n", "22/02/25 18:56:39 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 18:56:39 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 18:56:43, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:56:43, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:56:43, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 18:56:43, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:59:46, replay, INFO: KNN started\n", "25-Feb-22 18:59:46, replay, INFO: KNN optimization started\n", "\u001b[32m[I 2022-02-25 18:59:46,490]\u001b[0m A new study created in memory with name: no-name-67568bf1-580e-4564-bb9b-0f18c3b4e7f9\u001b[0m\n", "25-Feb-22 18:59:46, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 18:59:46, replay, WARNING: This model can't predict cold items, they will be ignored\n" ] }, { 
"name": "stdout", "output_type": "stream", "text": [ " NDCG@10 MRR@10 Coverage@10 \\\n", "Implicit ALS 0.253444 0.406855 0.131129 \n", "Popular Recommender 0.243783 0.390426 0.033903 \n", "ADMM SLIM 0.216480 0.373958 0.348837 \n", "Wilson Recommender 0.092121 0.180976 0.017092 \n", "Random Recommender (popularity-based) 0.066665 0.150434 0.760437 \n", "Random Recommender (uniform) 0.021725 0.054846 0.957691 \n", "Explicit ALS 0.017995 0.041331 0.569908 \n", "\n", " fit_pred_time \n", "Implicit ALS 32.185843 \n", "Popular Recommender 16.931190 \n", "ADMM SLIM 56.977886 \n", "Wilson Recommender 16.660789 \n", "Random Recommender (popularity-based) 10.100435 \n", "Random Recommender (uniform) 11.719672 \n", "Explicit ALS 50.138072 \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\u001b[32m[I 2022-02-25 19:00:47,700]\u001b[0m Trial 0 finished with value: 0.20815145550561892 and parameters: {'num_neighbours': 10, 'shrink': 0}. Best is trial 0 with value: 0.20815145550561892.\u001b[0m\n", "25-Feb-22 19:00:47, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:00:47, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:01:02,302]\u001b[0m Trial 1 finished with value: 0.23523974161744457 and parameters: {'num_neighbours': 75, 'shrink': 78}. Best is trial 1 with value: 0.23523974161744457.\u001b[0m\n", "25-Feb-22 19:01:02, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:01:02, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:01:17,643]\u001b[0m Trial 2 finished with value: 0.21441020220440898 and parameters: {'num_neighbours': 16, 'shrink': 30}. 
Best is trial 1 with value: 0.23523974161744457.\u001b[0m\n", "25-Feb-22 19:01:17, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:01:17, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:01:32,878]\u001b[0m Trial 3 finished with value: 0.22448744295760434 and parameters: {'num_neighbours': 44, 'shrink': 27}. Best is trial 1 with value: 0.23523974161744457.\u001b[0m\n", "25-Feb-22 19:01:32, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:01:32, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:01:43,331]\u001b[0m Trial 4 finished with value: 0.23279789836463874 and parameters: {'num_neighbours': 82, 'shrink': 51}. Best is trial 1 with value: 0.23523974161744457.\u001b[0m\n", "25-Feb-22 19:01:43, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:01:43, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:02:01,535]\u001b[0m Trial 5 finished with value: 0.19810278623194055 and parameters: {'num_neighbours': 3, 'shrink': 40}. Best is trial 1 with value: 0.23523974161744457.\u001b[0m\n", "25-Feb-22 19:02:01, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:02:01, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:02:13,690]\u001b[0m Trial 6 finished with value: 0.22783974523463837 and parameters: {'num_neighbours': 30, 'shrink': 73}. 
Best is trial 1 with value: 0.23523974161744457.\u001b[0m\n", "25-Feb-22 19:02:13, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:02:13, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:02:25,859]\u001b[0m Trial 7 finished with value: 0.2304653150727685 and parameters: {'num_neighbours': 74, 'shrink': 53}. Best is trial 1 with value: 0.23523974161744457.\u001b[0m\n", "25-Feb-22 19:02:25, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:02:25, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:02:38,242]\u001b[0m Trial 8 finished with value: 0.22550381157206978 and parameters: {'num_neighbours': 33, 'shrink': 38}. Best is trial 1 with value: 0.23523974161744457.\u001b[0m\n", "25-Feb-22 19:02:38, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:02:38, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:02:52,168]\u001b[0m Trial 9 finished with value: 0.2349945151088935 and parameters: {'num_neighbours': 85, 'shrink': 63}. Best is trial 1 with value: 0.23523974161744457.\u001b[0m\n", "25-Feb-22 19:02:52, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:02:52, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:03:05,750]\u001b[0m Trial 10 finished with value: 0.23450274456172532 and parameters: {'num_neighbours': 100, 'shrink': 99}. 
Best is trial 1 with value: 0.23523974161744457.\u001b[0m\n", "25-Feb-22 19:03:05, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:03:05, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:03:18,621]\u001b[0m Trial 11 finished with value: 0.23400621596881335 and parameters: {'num_neighbours': 67, 'shrink': 83}. Best is trial 1 with value: 0.23523974161744457.\u001b[0m\n", "25-Feb-22 19:03:18, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:03:18, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:03:33,037]\u001b[0m Trial 12 finished with value: 0.23376766818801611 and parameters: {'num_neighbours': 96, 'shrink': 68}. Best is trial 1 with value: 0.23523974161744457.\u001b[0m\n", "25-Feb-22 19:03:33, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:03:33, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:03:46,248]\u001b[0m Trial 13 finished with value: 0.23419601716420926 and parameters: {'num_neighbours': 60, 'shrink': 94}. Best is trial 1 with value: 0.23523974161744457.\u001b[0m\n", "25-Feb-22 19:03:46, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:03:46, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:03:58,977]\u001b[0m Trial 14 finished with value: 0.23458170188264757 and parameters: {'num_neighbours': 84, 'shrink': 65}. 
Best is trial 1 with value: 0.23523974161744457.\u001b[0m\n", "25-Feb-22 19:03:59, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:03:59, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:04:12,793]\u001b[0m Trial 15 finished with value: 0.2329234767367508 and parameters: {'num_neighbours': 56, 'shrink': 86}. Best is trial 1 with value: 0.23523974161744457.\u001b[0m\n", "25-Feb-22 19:04:12, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:04:12, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:04:25,709]\u001b[0m Trial 16 finished with value: 0.23466370701835013 and parameters: {'num_neighbours': 83, 'shrink': 60}. Best is trial 1 with value: 0.23523974161744457.\u001b[0m\n", "25-Feb-22 19:04:25, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:04:25, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:04:38,541]\u001b[0m Trial 17 finished with value: 0.23365597532059698 and parameters: {'num_neighbours': 71, 'shrink': 80}. Best is trial 1 with value: 0.23523974161744457.\u001b[0m\n", "25-Feb-22 19:04:38, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:04:38, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:04:55,142]\u001b[0m Trial 18 finished with value: 0.22573782189156147 and parameters: {'num_neighbours': 88, 'shrink': 13}. 
Best is trial 1 with value: 0.23523974161744457.\u001b[0m\n", "25-Feb-22 19:04:55, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:04:55, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:05:09,715]\u001b[0m Trial 19 finished with value: 0.22695609944222642 and parameters: {'num_neighbours': 41, 'shrink': 75}. Best is trial 1 with value: 0.23523974161744457.\u001b[0m\n", "25-Feb-22 19:05:09, replay, INFO: best params for KNN are: {'num_neighbours': 75, 'shrink': 78}\n", "25-Feb-22 19:05:09, replay, INFO: KNN fit_predict started\n", "22/02/25 19:05:09 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:05:09 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:05:10, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:05:10, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:06:47, replay, INFO: LightFM started \n", "25-Feb-22 19:06:47, replay, INFO: LightFM optimization started\n", "\u001b[32m[I 2022-02-25 19:06:47,795]\u001b[0m A new study created in memory with name: no-name-98061fe1-cb0f-47e9-9c34-6d31fb74bd1e\u001b[0m\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/home/u19893556/miniconda3/envs/replay/lib/python3.7/site-packages/optuna/distributions.py:364: FutureWarning: Samplers and other components in Optuna will assume that `step` is 1. `step` argument is deprecated and will be removed in the future. 
The removal of this feature is currently scheduled for v4.0.0, but this schedule is subject to change.\n", " FutureWarning,\n", "22/02/25 19:06:47 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:06:47 WARN CacheManager: Asked to cache already cached data.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " NDCG@10 MRR@10 Coverage@10 \\\n", "KNN 0.258407 0.412565 0.054077 \n", "Implicit ALS 0.253444 0.406855 0.131129 \n", "Popular Recommender 0.243783 0.390426 0.033903 \n", "ADMM SLIM 0.216480 0.373958 0.348837 \n", "Wilson Recommender 0.092121 0.180976 0.017092 \n", "Random Recommender (popularity-based) 0.066665 0.150434 0.760437 \n", "Random Recommender (uniform) 0.021725 0.054846 0.957691 \n", "Explicit ALS 0.017995 0.041331 0.569908 \n", "\n", " fit_pred_time \n", "KNN 36.018561 \n", "Implicit ALS 32.185843 \n", "Popular Recommender 16.931190 \n", "ADMM SLIM 56.977886 \n", "Wilson Recommender 16.660789 \n", "Random Recommender (popularity-based) 10.100435 \n", "Random Recommender (uniform) 11.719672 \n", "Explicit ALS 50.138072 \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "25-Feb-22 19:06:51, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:06:51, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:06:51, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:06:51, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:07:29,124]\u001b[0m Trial 0 finished with value: 0.1875257299916022 and parameters: {'loss': 'warp', 'no_components': 128}. 
Best is trial 0 with value: 0.1875257299916022.\u001b[0m\n", "22/02/25 19:07:29 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:07:29 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:07:33, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:07:33, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:07:33, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:07:33, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:08:00,855]\u001b[0m Trial 1 finished with value: 0.1710494878307369 and parameters: {'loss': 'warp', 'no_components': 267}. Best is trial 0 with value: 0.1875257299916022.\u001b[0m\n", "22/02/25 19:08:00 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:08:00 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:08:06, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:08:06, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:08:06, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:08:06, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:08:35,211]\u001b[0m Trial 2 finished with value: 0.21059182723504485 and parameters: {'loss': 'warp', 'no_components': 19}. 
Best is trial 2 with value: 0.21059182723504485.\u001b[0m\n", "22/02/25 19:08:35 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:08:35 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:08:38, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:08:38, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:08:38, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:08:38, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:09:03,559]\u001b[0m Trial 3 finished with value: 0.2035607276696068 and parameters: {'loss': 'warp', 'no_components': 40}. Best is trial 2 with value: 0.21059182723504485.\u001b[0m\n", "22/02/25 19:09:03 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:09:03 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:09:06, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:09:06, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:09:06, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:09:06, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:09:35,536]\u001b[0m Trial 4 finished with value: 0.21570849200557923 and parameters: {'loss': 'warp', 'no_components': 15}. 
Best is trial 4 with value: 0.21570849200557923.\u001b[0m\n", "22/02/25 19:09:35 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:09:35 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:09:38, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:09:38, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:09:38, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:09:38, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:10:04,952]\u001b[0m Trial 5 finished with value: 0.21335429413246954 and parameters: {'loss': 'warp', 'no_components': 19}. Best is trial 4 with value: 0.21570849200557923.\u001b[0m\n", "22/02/25 19:10:04 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:10:04 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:10:08, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:10:08, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:10:08, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:10:08, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:10:40,154]\u001b[0m Trial 6 finished with value: 0.18320247444814067 and parameters: {'loss': 'warp', 'no_components': 97}. 
Best is trial 4 with value: 0.21570849200557923.\u001b[0m\n", "22/02/25 19:10:40 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:10:40 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:10:44, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:10:44, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:10:44, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:10:44, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:11:15,891]\u001b[0m Trial 7 finished with value: 0.17309102899431783 and parameters: {'loss': 'warp', 'no_components': 177}. Best is trial 4 with value: 0.21570849200557923.\u001b[0m\n", "22/02/25 19:11:15 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:11:15 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:11:19, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:11:19, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:11:19, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:11:19, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:11:45,049]\u001b[0m Trial 8 finished with value: 0.19061423018293083 and parameters: {'loss': 'warp', 'no_components': 106}. 
Best is trial 4 with value: 0.21570849200557923.\u001b[0m\n", "22/02/25 19:11:45 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:11:45 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:11:48, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:11:48, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:11:48, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:11:48, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:12:12,202]\u001b[0m Trial 9 finished with value: 0.18773530918414727 and parameters: {'loss': 'warp', 'no_components': 129}. Best is trial 4 with value: 0.21570849200557923.\u001b[0m\n", "22/02/25 19:12:12 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:12:12 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:12:16, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:12:16, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:12:16, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:12:16, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:12:47,207]\u001b[0m Trial 10 finished with value: 0.2098285545297713 and parameters: {'loss': 'warp', 'no_components': 10}. 
Best is trial 4 with value: 0.21570849200557923.\u001b[0m\n", "22/02/25 19:12:47 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:12:47 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:12:51, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:12:51, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:12:51, replay, WARNING: This model can't predict cold users, they will be ignored\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "25-Feb-22 19:12:51, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:13:21,382]\u001b[0m Trial 11 finished with value: 0.20817796681780582 and parameters: {'loss': 'warp', 'no_components': 30}. Best is trial 4 with value: 0.21570849200557923.\u001b[0m\n", "22/02/25 19:13:21 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:13:21 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:13:26, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:13:26, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:13:26, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:13:26, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:13:54,813]\u001b[0m Trial 12 finished with value: 0.21646685479648656 and parameters: {'loss': 'warp', 'no_components': 8}. 
Best is trial 12 with value: 0.21646685479648656.\u001b[0m\n", "22/02/25 19:13:54 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:13:54 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:14:00, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:14:00, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:14:00, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:14:00, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:14:28,815]\u001b[0m Trial 13 finished with value: 0.21974415173227532 and parameters: {'loss': 'warp', 'no_components': 9}. Best is trial 13 with value: 0.21974415173227532.\u001b[0m\n", "22/02/25 19:14:28 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:14:28 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:14:34, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:14:34, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:14:34, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:14:34, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:15:08,792]\u001b[0m Trial 14 finished with value: 0.2178691082491699 and parameters: {'loss': 'warp', 'no_components': 8}. 
Best is trial 13 with value: 0.21974415173227532.\u001b[0m\n", "22/02/25 19:15:08 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:15:08 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:15:13, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:15:13, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:15:13, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:15:13, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:15:44,526]\u001b[0m Trial 15 finished with value: 0.1623190565121521 and parameters: {'loss': 'warp', 'no_components': 500}. Best is trial 13 with value: 0.21974415173227532.\u001b[0m\n", "22/02/25 19:15:44 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:15:44 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:15:47, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:15:47, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:15:47, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:15:47, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:16:18,795]\u001b[0m Trial 16 finished with value: 0.20290243364137026 and parameters: {'loss': 'warp', 'no_components': 51}. 
Best is trial 13 with value: 0.21974415173227532.\u001b[0m\n", "22/02/25 19:16:18 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:16:18 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:16:21, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:16:21, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:16:21, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:16:21, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:16:55,561]\u001b[0m Trial 17 finished with value: 0.21123896164389136 and parameters: {'loss': 'warp', 'no_components': 13}. Best is trial 13 with value: 0.21974415173227532.\u001b[0m\n", "22/02/25 19:16:55 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:16:55 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:16:58, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:16:58, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:16:58, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:16:58, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:17:28,232]\u001b[0m Trial 18 finished with value: 0.1968663748562113 and parameters: {'loss': 'warp', 'no_components': 28}. 
Best is trial 13 with value: 0.21974415173227532.\u001b[0m\n", "22/02/25 19:17:28 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:17:28 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:17:31, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:17:31, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:17:31, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:17:31, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:17:58,137]\u001b[0m Trial 19 finished with value: 0.22045316351988123 and parameters: {'loss': 'warp', 'no_components': 9}. Best is trial 19 with value: 0.22045316351988123.\u001b[0m\n", "25-Feb-22 19:17:58, replay, INFO: best params for LightFM are: {'loss': 'warp', 'no_components': 9}\n", "25-Feb-22 19:17:58, replay, INFO: LightFM fit_predict started\n", "22/02/25 19:17:58 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:17:58 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:18:02, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:18:02, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:18:02, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:18:02, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:20:31, replay, INFO: SLIM started \n", "25-Feb-22 19:20:31, replay, INFO: SLIM optimization started\n", "\u001b[32m[I 2022-02-25 19:20:31,888]\u001b[0m A new study created in memory with name: no-name-927d3f59-e246-4228-a483-7b2359793979\u001b[0m\n", "22/02/25 19:20:31 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:20:31 WARN CacheManager: Asked to cache already cached data.\n" ] }, { "name": 
"stdout", "output_type": "stream", "text": [ " NDCG@10 MRR@10 Coverage@10 \\\n", "LightFM 0.267207 0.436674 0.156346 \n", "KNN 0.258407 0.412565 0.054077 \n", "Implicit ALS 0.253444 0.406855 0.131129 \n", "Popular Recommender 0.243783 0.390426 0.033903 \n", "ADMM SLIM 0.216480 0.373958 0.348837 \n", "Wilson Recommender 0.092121 0.180976 0.017092 \n", "Random Recommender (popularity-based) 0.066665 0.150434 0.760437 \n", "Random Recommender (uniform) 0.021725 0.054846 0.957691 \n", "Explicit ALS 0.017995 0.041331 0.569908 \n", "\n", " fit_pred_time \n", "LightFM 28.989394 \n", "KNN 36.018561 \n", "Implicit ALS 32.185843 \n", "Popular Recommender 16.931190 \n", "ADMM SLIM 56.977886 \n", "Wilson Recommender 16.660789 \n", "Random Recommender (popularity-based) 10.100435 \n", "Random Recommender (uniform) 11.719672 \n", "Explicit ALS 50.138072 \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "25-Feb-22 19:20:33, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:20:33, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:20:47,377]\u001b[0m Trial 0 finished with value: 0.18690087157310825 and parameters: {'beta': 0.01, 'lambda_': 0.01}. Best is trial 0 with value: 0.18690087157310825.\u001b[0m\n", "22/02/25 19:20:47 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:20:47 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:20:48, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:20:48, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:21:11,009]\u001b[0m Trial 1 finished with value: 0.18594299730857067 and parameters: {'beta': 0.008860922345504325, 'lambda_': 0.0010236160434899768}. 
Best is trial 0 with value: 0.18690087157310825.\u001b[0m\n", "22/02/25 19:21:11 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:21:11 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:21:12, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:21:12, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:21:33,441]\u001b[0m Trial 2 finished with value: 0.1914804215004526 and parameters: {'beta': 0.03399489931280087, 'lambda_': 1.0689126947930436e-05}. Best is trial 2 with value: 0.1914804215004526.\u001b[0m\n", "22/02/25 19:21:33 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:21:33 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:21:34, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:21:34, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:21:55,768]\u001b[0m Trial 3 finished with value: 0.18818577543640305 and parameters: {'beta': 0.01749412504714548, 'lambda_': 5.878811424546011e-06}. Best is trial 2 with value: 0.1914804215004526.\u001b[0m\n", "22/02/25 19:21:55 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:21:55 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:21:57, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:21:57, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:22:15,722]\u001b[0m Trial 4 finished with value: 0.23028625249358875 and parameters: {'beta': 3.32881664728497, 'lambda_': 0.0030820211120937847}. 
Best is trial 4 with value: 0.23028625249358875.\u001b[0m\n", "22/02/25 19:22:15 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:22:15 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:22:17, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:22:17, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:22:28,318]\u001b[0m Trial 5 finished with value: 0.18977644975842903 and parameters: {'beta': 0.001862346947843061, 'lambda_': 0.030055794047238554}. Best is trial 4 with value: 0.23028625249358875.\u001b[0m\n", "22/02/25 19:22:28 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:22:28 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:22:29, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:22:29, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:22:49,259]\u001b[0m Trial 6 finished with value: 0.17814562373577825 and parameters: {'beta': 0.00012775867662144253, 'lambda_': 1.0177081254658634e-06}. Best is trial 4 with value: 0.23028625249358875.\u001b[0m\n", "22/02/25 19:22:49 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:22:49 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:22:50, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:22:50, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:23:09,900]\u001b[0m Trial 7 finished with value: 0.18698139949360768 and parameters: {'beta': 0.0149385975872309, 'lambda_': 3.950492881084857e-05}. 
Best is trial 4 with value: 0.23028625249358875.\u001b[0m\n", "22/02/25 19:23:09 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:23:09 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:23:11, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:23:11, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:23:35,316]\u001b[0m Trial 8 finished with value: 0.20018991755327953 and parameters: {'beta': 0.13305997659810645, 'lambda_': 1.027879641716447e-06}. Best is trial 4 with value: 0.23028625249358875.\u001b[0m\n", "22/02/25 19:23:35 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:23:35 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:23:36, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:23:36, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:23:50,269]\u001b[0m Trial 9 finished with value: 0.18589332274451 and parameters: {'beta': 1.6110437038934183e-06, 'lambda_': 0.009903609101365389}. Best is trial 4 with value: 0.23028625249358875.\u001b[0m\n", "22/02/25 19:23:50 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:23:50 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:23:51, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:23:51, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:23:59,186]\u001b[0m Trial 10 finished with value: 0.0 and parameters: {'beta': 1.916830428828259, 'lambda_': 1.4971228352593446}. 
Best is trial 4 with value: 0.23028625249358875.\u001b[0m\n", "22/02/25 19:23:59 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:23:59 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:24:00, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:24:00, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:24:32,332]\u001b[0m Trial 11 finished with value: 0.23527354383676072 and parameters: {'beta': 4.65156643702147, 'lambda_': 0.00020347014573639274}. Best is trial 11 with value: 0.23527354383676072.\u001b[0m\n", "22/02/25 19:24:32 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:24:32 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:24:33, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:24:33, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:24:54,850]\u001b[0m Trial 12 finished with value: 0.22680688946156807 and parameters: {'beta': 2.3255554135460432, 'lambda_': 0.0002637561593290979}. Best is trial 11 with value: 0.23527354383676072.\u001b[0m\n", "22/02/25 19:24:54 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:24:54 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:24:56, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:24:56, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:25:17,059]\u001b[0m Trial 13 finished with value: 0.20644682200416017 and parameters: {'beta': 0.3520802515702011, 'lambda_': 0.00021646365951865894}. 
Best is trial 11 with value: 0.23527354383676072.\u001b[0m\n", "22/02/25 19:25:17 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:25:17 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:25:18, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:25:18, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:25:27,040]\u001b[0m Trial 14 finished with value: 0.0 and parameters: {'beta': 3.197480126421834, 'lambda_': 0.3730250636589489}. Best is trial 11 with value: 0.23527354383676072.\u001b[0m\n", "22/02/25 19:25:27 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:25:27 WARN CacheManager: Asked to cache already cached data.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "25-Feb-22 19:25:28, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:25:28, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:25:50,260]\u001b[0m Trial 15 finished with value: 0.20300583504319952 and parameters: {'beta': 0.28289285694841343, 'lambda_': 0.0017039249402949794}. Best is trial 11 with value: 0.23527354383676072.\u001b[0m\n", "22/02/25 19:25:50 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:25:50 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:25:51, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:25:51, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:26:05,324]\u001b[0m Trial 16 finished with value: 0.20343160064300064 and parameters: {'beta': 0.0006544018473466481, 'lambda_': 0.07281205873403812}. 
Best is trial 11 with value: 0.23527354383676072.\u001b[0m\n", "22/02/25 19:26:05 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:26:05 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:26:06, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:26:06, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:26:24,075]\u001b[0m Trial 17 finished with value: 0.18172222659066103 and parameters: {'beta': 5.497105051749887e-06, 'lambda_': 0.0015014482291685223}. Best is trial 11 with value: 0.23527354383676072.\u001b[0m\n", "22/02/25 19:26:24 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:26:24 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:26:25, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:26:25, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:26:44,816]\u001b[0m Trial 18 finished with value: 0.20838470325290787 and parameters: {'beta': 0.5085918189201124, 'lambda_': 0.00012001438620151046}. Best is trial 11 with value: 0.23527354383676072.\u001b[0m\n", "22/02/25 19:26:44 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:26:44 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:26:46, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:26:46, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:27:01,309]\u001b[0m Trial 19 finished with value: 0.18298417070248463 and parameters: {'beta': 6.765221433225136e-05, 'lambda_': 0.007729843624444044}. 
Best is trial 11 with value: 0.23527354383676072.\u001b[0m\n", "25-Feb-22 19:27:01, replay, INFO: best params for SLIM are: {'beta': 4.65156643702147, 'lambda_': 0.00020347014573639274}\n", "25-Feb-22 19:27:01, replay, INFO: SLIM fit_predict started\n", "22/02/25 19:27:01 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:27:01 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 19:27:03, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:27:03, replay, WARNING: This model can't predict cold items, they will be ignored\n", "[Stage 18192:==========================================> (122 + 22) / 144]8]]\r" ] }, { "name": "stdout", "output_type": "stream", "text": [ " NDCG@10 MRR@10 Coverage@10 \\\n", "SLIM 0.270859 0.434489 0.063323 \n", "LightFM 0.267207 0.436674 0.156346 \n", "KNN 0.258407 0.412565 0.054077 \n", "Implicit ALS 0.253444 0.406855 0.131129 \n", "Popular Recommender 0.243783 0.390426 0.033903 \n", "ADMM SLIM 0.216480 0.373958 0.348837 \n", "Wilson Recommender 0.092121 0.180976 0.017092 \n", "Random Recommender (popularity-based) 0.066665 0.150434 0.760437 \n", "Random Recommender (uniform) 0.021725 0.054846 0.957691 \n", "Explicit ALS 0.017995 0.041331 0.569908 \n", "\n", " fit_pred_time \n", "SLIM 41.092457 \n", "LightFM 28.989394 \n", "KNN 36.018561 \n", "Implicit ALS 32.185843 \n", "Popular Recommender 16.931190 \n", "ADMM SLIM 56.977886 \n", "Wilson Recommender 16.660789 \n", "Random Recommender (popularity-based) 10.100435 \n", "Random Recommender (uniform) 11.719672 \n", "Explicit ALS 50.138072 \n", "CPU times: user 2h 23min 58s, sys: 1h 18min 27s, total: 3h 42min 25s\n", "Wall time: 1h 16min 41s\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\r", " \r" ] } ], "source": [ "%%time\n", "full_pipeline(common_models, e, train)" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>Coverage@10</th><th>HitRate@1</th><th>HitRate@5</th><th>HitRate@10</th><th>MAP@10</th><th>MRR@10</th><th>NDCG@10</th><th>Surprisal@10</th><th>fit_pred_time</th><th>params</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>SLIM</th><td>0.063323</td><td>0.324846</td><td>0.583845</td><td>0.693591</td><td>0.176329</td><td>0.434489</td><td>0.270859</td><td>0.133323</td><td>41.092457</td><td>{'beta': 4.65156643702147, 'lambda_': 0.000203...</td></tr>\n",
"    <tr><th>LightFM</th><td>0.156346</td><td>0.324846</td><td>0.581212</td><td>0.694469</td><td>0.170251</td><td>0.436674</td><td>0.267207</td><td>0.169012</td><td>28.989394</td><td>{'loss': 'warp', 'no_components': 9}</td></tr>\n",
"    <tr><th>KNN</th><td>0.054077</td><td>0.302897</td><td>0.556629</td><td>0.648815</td><td>0.168665</td><td>0.412565</td><td>0.258407</td><td>0.138554</td><td>36.018561</td><td>{'num_neighbours': 75, 'shrink': 78}</td></tr>\n",
"    <tr><th>Implicit ALS</th><td>0.131129</td><td>0.292362</td><td>0.562774</td><td>0.681299</td><td>0.162140</td><td>0.406855</td><td>0.253444</td><td>0.163824</td><td>32.185843</td><td>{'rank': 8}</td></tr>\n",
"    <tr><th>Popular Recommender</th><td>0.033903</td><td>0.284460</td><td>0.530290</td><td>0.645303</td><td>0.157301</td><td>0.390426</td><td>0.243783</td><td>0.118354</td><td>16.931190</td><td>NaN</td></tr>\n",
"    <tr><th>ADMM SLIM</th><td>0.348837</td><td>0.258121</td><td>0.541703</td><td>0.647937</td><td>0.127043</td><td>0.373958</td><td>0.216480</td><td>0.221984</td><td>56.977886</td><td>{'lambda_1': 0.8417364694294401, 'lambda_2': 6...</td></tr>\n",
"    <tr><th>Wilson Recommender</th><td>0.017092</td><td>0.083406</td><td>0.345040</td><td>0.414399</td><td>0.045002</td><td>0.180976</td><td>0.092121</td><td>0.262190</td><td>16.660789</td><td>NaN</td></tr>\n",
"    <tr><th>Random Recommender (popularity-based)</th><td>0.760437</td><td>0.071115</td><td>0.261633</td><td>0.378402</td><td>0.027026</td><td>0.150434</td><td>0.066665</td><td>0.344784</td><td>10.100435</td><td>{'distribution': 'popular_based', 'alpha': 24....</td></tr>\n",
"    <tr><th>Random Recommender (uniform)</th><td>0.957691</td><td>0.017559</td><td>0.100088</td><td>0.167691</td><td>0.007332</td><td>0.054846</td><td>0.021725</td><td>0.538677</td><td>11.719672</td><td>NaN</td></tr>\n",
"    <tr><th>Explicit ALS</th><td>0.569908</td><td>0.017559</td><td>0.070237</td><td>0.124671</td><td>0.006534</td><td>0.041331</td><td>0.017995</td><td>0.540517</td><td>50.138072</td><td>{'rank': 32}</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " Coverage@10 HitRate@1 HitRate@5 \\\n", "SLIM 0.063323 0.324846 0.583845 \n", "LightFM 0.156346 0.324846 0.581212 \n", "KNN 0.054077 0.302897 0.556629 \n", "Implicit ALS 0.131129 0.292362 0.562774 \n", "Popular Recommender 0.033903 0.284460 0.530290 \n", "ADMM SLIM 0.348837 0.258121 0.541703 \n", "Wilson Recommender 0.017092 0.083406 0.345040 \n", "Random Recommender (popularity-based) 0.760437 0.071115 0.261633 \n", "Random Recommender (uniform) 0.957691 0.017559 0.100088 \n", "Explicit ALS 0.569908 0.017559 0.070237 \n", "\n", " HitRate@10 MAP@10 MRR@10 \\\n", "SLIM 0.693591 0.176329 0.434489 \n", "LightFM 0.694469 0.170251 0.436674 \n", "KNN 0.648815 0.168665 0.412565 \n", "Implicit ALS 0.681299 0.162140 0.406855 \n", "Popular Recommender 0.645303 0.157301 0.390426 \n", "ADMM SLIM 0.647937 0.127043 0.373958 \n", "Wilson Recommender 0.414399 0.045002 0.180976 \n", "Random Recommender (popularity-based) 0.378402 0.027026 0.150434 \n", "Random Recommender (uniform) 0.167691 0.007332 0.054846 \n", "Explicit ALS 0.124671 0.006534 0.041331 \n", "\n", " NDCG@10 Surprisal@10 fit_pred_time \\\n", "SLIM 0.270859 0.133323 41.092457 \n", "LightFM 0.267207 0.169012 28.989394 \n", "KNN 0.258407 0.138554 36.018561 \n", "Implicit ALS 0.253444 0.163824 32.185843 \n", "Popular Recommender 0.243783 0.118354 16.931190 \n", "ADMM SLIM 0.216480 0.221984 56.977886 \n", "Wilson Recommender 0.092121 0.262190 16.660789 \n", "Random Recommender (popularity-based) 0.066665 0.344784 10.100435 \n", "Random Recommender (uniform) 0.021725 0.538677 11.719672 \n", "Explicit ALS 0.017995 0.540517 50.138072 \n", "\n", " params \n", "SLIM {'beta': 4.65156643702147, 'lambda_': 0.000203... \n", "LightFM {'loss': 'warp', 'no_components': 9} \n", "KNN {'num_neighbours': 75, 'shrink': 78} \n", "Implicit ALS {'rank': 8} \n", "Popular Recommender NaN \n", "ADMM SLIM {'lambda_1': 0.8417364694294401, 'lambda_2': 6... 
\n", "Wilson Recommender NaN \n", "Random Recommender (popularity-based) {'distribution': 'popular_based', 'alpha': 24.... \n", "Random Recommender (uniform) NaN \n", "Explicit ALS {'rank': 32} " ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "e.results.sort_values('NDCG@10', ascending=False)" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [], "source": [ "e.results.to_csv('res_22_rel_1.csv')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2.3 Neural models" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "25-Feb-22 19:29:58, replay, INFO: The model is neural network with non-distributed training\n", "25-Feb-22 19:29:58, replay, INFO: The model is neural network with non-distributed training\n", "25-Feb-22 19:29:58, replay, INFO: The model is neural network with non-distributed training\n", "25-Feb-22 19:29:58, replay, INFO: The model is neural network with non-distributed training\n" ] } ], "source": [ "nets = {'MultVAE with default parameters': [MultVAE(), 'no_opt'],\n", " 'NeuroMF with default parameters': [NeuroMF(), 'no_opt'], \n", " 'Word2Vec with default parameters': [Word2VecRec(seed=SEED), 'no_opt'],\n", " 'MultVAE with optimized parameters': [MultVAE(), {\"learning_rate\": [0.001, 0.5],\n", " \"dropout\": [0, 0.5],\n", " \"l2_reg\": [1e-6, 5]\n", " }],\n", " 'NeuroMF with optimized parameters': [NeuroMF(), {\n", " \"learning_rate\": [0.001, 0.5],\n", " \"l2_reg\": [1e-6, 5],\n", " \"count_negative_sample\": [1, 20]\n", " }],\n", " 'Word2Vec with optimized parameters': [Word2VecRec(seed=SEED), None]}" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "25-Feb-22 19:30:02, replay, INFO: MultVAE with default parameters started\n", "25-Feb-22 19:30:02, replay, INFO: MultVAE with default parameters 
fit_predict started\n", "22/02/25 19:30:02 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:30:02 WARN CacheManager: Asked to cache already cached data.\n", "2022-02-25 19:30:12,385 ignite.handlers.early_stopping.EarlyStopping INFO: EarlyStopping: Stop training\n", "25-Feb-22 19:30:12, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:30:12, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:32:21, replay, INFO: NeuroMF with default parameters started 8]\n", "25-Feb-22 19:32:21, replay, INFO: NeuroMF with default parameters fit_predict started\n", "22/02/25 19:32:21 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:32:21 WARN CacheManager: Asked to cache already cached data.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " NDCG@10 MRR@10 Coverage@10 \\\n", "SLIM 0.270859 0.434489 0.063323 \n", "LightFM 0.267207 0.436674 0.156346 \n", "KNN 0.258407 0.412565 0.054077 \n", "Implicit ALS 0.253444 0.406855 0.131129 \n", "Popular Recommender 0.243783 0.390426 0.033903 \n", "ADMM SLIM 0.216480 0.373958 0.348837 \n", "Wilson Recommender 0.092121 0.180976 0.017092 \n", "MultVAE with default parameters 0.075648 0.120041 0.011488 \n", "Random Recommender (popularity-based) 0.066665 0.150434 0.760437 \n", "Random Recommender (uniform) 0.021725 0.054846 0.957691 \n", "Explicit ALS 0.017995 0.041331 0.569908 \n", "\n", " fit_pred_time \n", "SLIM 41.092457 \n", "LightFM 28.989394 \n", "KNN 36.018561 \n", "Implicit ALS 32.185843 \n", "Popular Recommender 16.931190 \n", "ADMM SLIM 56.977886 \n", "Wilson Recommender 16.660789 \n", "MultVAE with default parameters 37.993317 \n", "Random Recommender (popularity-based) 10.100435 \n", "Random Recommender (uniform) 11.719672 \n", "Explicit ALS 50.138072 \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "2022-02-25 19:35:45,008 ignite.handlers.early_stopping.EarlyStopping INFO: 
EarlyStopping: Stop training\n", "25-Feb-22 19:35:45, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:35:45, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:35:45, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 19:35:45, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:42:15, replay, INFO: Word2Vec with default parameters started \n", "25-Feb-22 19:42:15, replay, INFO: Word2Vec with default parameters fit_predict started\n", "22/02/25 19:42:15 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:42:15 WARN CacheManager: Asked to cache already cached data.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " NDCG@10 MRR@10 Coverage@10 \\\n", "SLIM 0.270859 0.434489 0.063323 \n", "LightFM 0.267207 0.436674 0.156346 \n", "KNN 0.258407 0.412565 0.054077 \n", "Implicit ALS 0.253444 0.406855 0.131129 \n", "Popular Recommender 0.243783 0.390426 0.033903 \n", "ADMM SLIM 0.216480 0.373958 0.348837 \n", "NeuroMF with default parameters 0.198796 0.336243 0.261138 \n", "Wilson Recommender 0.092121 0.180976 0.017092 \n", "MultVAE with default parameters 0.075648 0.120041 0.011488 \n", "Random Recommender (popularity-based) 0.066665 0.150434 0.760437 \n", "Random Recommender (uniform) 0.021725 0.054846 0.957691 \n", "Explicit ALS 0.017995 0.041331 0.569908 \n", "\n", " fit_pred_time \n", "SLIM 41.092457 \n", "LightFM 28.989394 \n", "KNN 36.018561 \n", "Implicit ALS 32.185843 \n", "Popular Recommender 16.931190 \n", "ADMM SLIM 56.977886 \n", "NeuroMF with default parameters 283.866455 \n", "Wilson Recommender 16.660789 \n", "MultVAE with default parameters 37.993317 \n", "Random Recommender (popularity-based) 10.100435 \n", "Random Recommender (uniform) 11.719672 \n", "Explicit ALS 50.138072 \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "25-Feb-22 19:42:27, 
replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:42:27, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:52:55, replay, INFO: MultVAE with optimized parameters started 8]8]\n", "25-Feb-22 19:52:55, replay, INFO: MultVAE with optimized parameters optimization started\n", "\u001b[32m[I 2022-02-25 19:52:55,977]\u001b[0m A new study created in memory with name: no-name-406f77e5-8a2b-4499-ad1d-3dd200f5f28d\u001b[0m\n", "22/02/25 19:52:55 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:52:55 WARN CacheManager: Asked to cache already cached data.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " NDCG@10 MRR@10 Coverage@10 \\\n", "SLIM 0.270859 0.434489 0.063323 \n", "LightFM 0.267207 0.436674 0.156346 \n", "KNN 0.258407 0.412565 0.054077 \n", "Implicit ALS 0.253444 0.406855 0.131129 \n", "Popular Recommender 0.243783 0.390426 0.033903 \n", "ADMM SLIM 0.216480 0.373958 0.348837 \n", "NeuroMF with default parameters 0.198796 0.336243 0.261138 \n", "Word2Vec with default parameters 0.139835 0.247189 0.139255 \n", "Wilson Recommender 0.092121 0.180976 0.017092 \n", "MultVAE with default parameters 0.075648 0.120041 0.011488 \n", "Random Recommender (popularity-based) 0.066665 0.150434 0.760437 \n", "Random Recommender (uniform) 0.021725 0.054846 0.957691 \n", "Explicit ALS 0.017995 0.041331 0.569908 \n", "\n", " fit_pred_time \n", "SLIM 41.092457 \n", "LightFM 28.989394 \n", "KNN 36.018561 \n", "Implicit ALS 32.185843 \n", "Popular Recommender 16.931190 \n", "ADMM SLIM 56.977886 \n", "NeuroMF with default parameters 283.866455 \n", "Word2Vec with default parameters 161.589033 \n", "Wilson Recommender 16.660789 \n", "MultVAE with default parameters 37.993317 \n", "Random Recommender (popularity-based) 10.100435 \n", "Random Recommender (uniform) 11.719672 \n", "Explicit ALS 50.138072 \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ 
"2022-02-25 19:53:04,801 ignite.handlers.early_stopping.EarlyStopping INFO: EarlyStopping: Stop training\n", "25-Feb-22 19:53:04, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:53:04, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:53:19,996]\u001b[0m Trial 0 finished with value: 0.18084826743321916 and parameters: {'learning_rate': 0.059780365208038436, 'epochs': 100, 'latent_dim': 200, 'hidden_dim': 600, 'dropout': 0.09979924534622853, 'anneal': 0.1, 'l2_reg': 0.25185273202603975, 'gamma': 0.99}. Best is trial 0 with value: 0.18084826743321916.\u001b[0m\n", "22/02/25 19:53:20 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:53:20 WARN CacheManager: Asked to cache already cached data.\n", "2022-02-25 19:53:32,065 ignite.handlers.early_stopping.EarlyStopping INFO: EarlyStopping: Stop training\n", "25-Feb-22 19:53:32, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:53:32, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:53:49,494]\u001b[0m Trial 1 finished with value: 0.20853843319338372 and parameters: {'learning_rate': 0.3583302116611014, 'epochs': 100, 'latent_dim': 200, 'hidden_dim': 600, 'dropout': 0.28709685309539446, 'anneal': 0.1, 'l2_reg': 1.666118292644979e-05, 'gamma': 0.99}. 
Best is trial 1 with value: 0.20853843319338372.\u001b[0m\n", "22/02/25 19:53:49 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:53:49 WARN CacheManager: Asked to cache already cached data.\n", "2022-02-25 19:53:57,441 ignite.handlers.early_stopping.EarlyStopping INFO: EarlyStopping: Stop training\n", "25-Feb-22 19:53:57, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:53:57, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:54:20,563]\u001b[0m Trial 2 finished with value: 0.06541673572098419 and parameters: {'learning_rate': 0.049272047761650804, 'epochs': 100, 'latent_dim': 200, 'hidden_dim': 600, 'dropout': 0.21969936783595656, 'anneal': 0.1, 'l2_reg': 0.00012250694692868968, 'gamma': 0.99}. Best is trial 1 with value: 0.20853843319338372.\u001b[0m\n", "22/02/25 19:54:20 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:54:20 WARN CacheManager: Asked to cache already cached data.\n", "2022-02-25 19:54:30,399 ignite.handlers.early_stopping.EarlyStopping INFO: EarlyStopping: Stop training\n", "25-Feb-22 19:54:30, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:54:30, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:54:48,879]\u001b[0m Trial 3 finished with value: 0.09095461014138063 and parameters: {'learning_rate': 0.11780452144009537, 'epochs': 100, 'latent_dim': 200, 'hidden_dim': 600, 'dropout': 0.12710156821241347, 'anneal': 0.1, 'l2_reg': 0.17511326609812616, 'gamma': 0.99}. 
Best is trial 1 with value: 0.20853843319338372.\u001b[0m\n", "22/02/25 19:54:48 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:54:48 WARN CacheManager: Asked to cache already cached data.\n", "2022-02-25 19:55:00,693 ignite.handlers.early_stopping.EarlyStopping INFO: EarlyStopping: Stop training\n", "25-Feb-22 19:55:00, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:55:00, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:55:17,814]\u001b[0m Trial 4 finished with value: 0.20613087758965873 and parameters: {'learning_rate': 0.30138733562979303, 'epochs': 100, 'latent_dim': 200, 'hidden_dim': 600, 'dropout': 0.12915681012418168, 'anneal': 0.1, 'l2_reg': 0.061891680661423386, 'gamma': 0.99}. Best is trial 1 with value: 0.20853843319338372.\u001b[0m\n", "22/02/25 19:55:17 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:55:17 WARN CacheManager: Asked to cache already cached data.\n", "2022-02-25 19:55:37,318 ignite.handlers.early_stopping.EarlyStopping INFO: EarlyStopping: Stop training\n", "25-Feb-22 19:55:37, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:55:37, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:55:48,621]\u001b[0m Trial 5 finished with value: 0.1957610993862016 and parameters: {'learning_rate': 0.03394602749342825, 'epochs': 100, 'latent_dim': 200, 'hidden_dim': 600, 'dropout': 0.01629179731067748, 'anneal': 0.1, 'l2_reg': 8.475188712022929e-05, 'gamma': 0.99}. 
Best is trial 1 with value: 0.20853843319338372.\u001b[0m\n", "22/02/25 19:55:48 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:55:48 WARN CacheManager: Asked to cache already cached data.\n", "2022-02-25 19:55:57,603 ignite.handlers.early_stopping.EarlyStopping INFO: EarlyStopping: Stop training\n", "25-Feb-22 19:55:57, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:55:57, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:56:09,484]\u001b[0m Trial 6 finished with value: 0.21024996430883497 and parameters: {'learning_rate': 0.46938471623589645, 'epochs': 100, 'latent_dim': 200, 'hidden_dim': 600, 'dropout': 0.04560351437303739, 'anneal': 0.1, 'l2_reg': 1.160360502291246e-06, 'gamma': 0.99}. Best is trial 6 with value: 0.21024996430883497.\u001b[0m\n", "22/02/25 19:56:09 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:56:09 WARN CacheManager: Asked to cache already cached data.\n", "2022-02-25 19:56:16,017 ignite.handlers.early_stopping.EarlyStopping INFO: EarlyStopping: Stop training\n", "25-Feb-22 19:56:16, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:56:16, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:56:29,308]\u001b[0m Trial 7 finished with value: 0.16342123509357262 and parameters: {'learning_rate': 0.00580041160516003, 'epochs': 100, 'latent_dim': 200, 'hidden_dim': 600, 'dropout': 0.4396651367191305, 'anneal': 0.1, 'l2_reg': 0.07657940650930974, 'gamma': 0.99}. 
Best is trial 6 with value: 0.21024996430883497.\u001b[0m\n", "22/02/25 19:56:29 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:56:29 WARN CacheManager: Asked to cache already cached data.\n", "2022-02-25 19:56:46,852 ignite.handlers.early_stopping.EarlyStopping INFO: EarlyStopping: Stop training\n", "25-Feb-22 19:56:46, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:56:46, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:56:58,600]\u001b[0m Trial 8 finished with value: 0.21266139182024277 and parameters: {'learning_rate': 0.015677916317796903, 'epochs': 100, 'latent_dim': 200, 'hidden_dim': 600, 'dropout': 0.08068159614111009, 'anneal': 0.1, 'l2_reg': 0.00031366067118685257, 'gamma': 0.99}. Best is trial 8 with value: 0.21266139182024277.\u001b[0m\n", "22/02/25 19:56:58 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:56:58 WARN CacheManager: Asked to cache already cached data.\n", "2022-02-25 19:57:07,067 ignite.handlers.early_stopping.EarlyStopping INFO: EarlyStopping: Stop training\n", "25-Feb-22 19:57:07, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:57:07, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 19:57:19,421]\u001b[0m Trial 9 finished with value: 0.05312484562781685 and parameters: {'learning_rate': 0.13573393668627406, 'epochs': 100, 'latent_dim': 200, 'hidden_dim': 600, 'dropout': 0.4234936113828407, 'anneal': 0.1, 'l2_reg': 4.076496680438271, 'gamma': 0.99}. 
Best is trial 8 with value: 0.21266139182024277.\u001b[0m\n", "25-Feb-22 19:57:19, replay, INFO: best params for MultVAE with optimized parameters are: {'learning_rate': 0.015677916317796903, 'epochs': 100, 'latent_dim': 200, 'hidden_dim': 600, 'dropout': 0.08068159614111009, 'anneal': 0.1, 'l2_reg': 0.00031366067118685257, 'gamma': 0.99}\n", "25-Feb-22 19:57:19, replay, INFO: MultVAE with optimized parameters fit_predict started\n", "22/02/25 19:57:19 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:57:19 WARN CacheManager: Asked to cache already cached data.\n", "2022-02-25 19:57:39,749 ignite.handlers.early_stopping.EarlyStopping INFO: EarlyStopping: Stop training\n", "25-Feb-22 19:57:39, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:57:39, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 19:59:45, replay, INFO: NeuroMF with optimized parameters started 8]\n", "25-Feb-22 19:59:45, replay, INFO: NeuroMF with optimized parameters optimization started\n", "\u001b[32m[I 2022-02-25 19:59:45,402]\u001b[0m A new study created in memory with name: no-name-0eebd505-521d-41d8-b437-f2835a1e497a\u001b[0m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " NDCG@10 MRR@10 Coverage@10 \\\n", "SLIM 0.270859 0.434489 0.063323 \n", "LightFM 0.267207 0.436674 0.156346 \n", "KNN 0.258407 0.412565 0.054077 \n", "Implicit ALS 0.253444 0.406855 0.131129 \n", "Popular Recommender 0.243783 0.390426 0.033903 \n", "MultVAE with optimized parameters 0.236728 0.378478 0.034744 \n", "ADMM SLIM 0.216480 0.373958 0.348837 \n", "NeuroMF with default parameters 0.198796 0.336243 0.261138 \n", "Word2Vec with default parameters 0.139835 0.247189 0.139255 \n", "Wilson Recommender 0.092121 0.180976 0.017092 \n", "MultVAE with default parameters 0.075648 0.120041 0.011488 \n", "Random Recommender (popularity-based) 0.066665 0.150434 0.760437 \n", "Random Recommender (uniform) 
0.021725 0.054846 0.957691 \n", "Explicit ALS 0.017995 0.041331 0.569908 \n", "\n", " fit_pred_time \n", "SLIM 41.092457 \n", "LightFM 28.989394 \n", "KNN 36.018561 \n", "Implicit ALS 32.185843 \n", "Popular Recommender 16.931190 \n", "MultVAE with optimized parameters 45.044486 \n", "ADMM SLIM 56.977886 \n", "NeuroMF with default parameters 283.866455 \n", "Word2Vec with default parameters 161.589033 \n", "Wilson Recommender 16.660789 \n", "MultVAE with default parameters 37.993317 \n", "Random Recommender (popularity-based) 10.100435 \n", "Random Recommender (uniform) 11.719672 \n", "Explicit ALS 50.138072 \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "22/02/25 19:59:45 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 19:59:45 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 20:10:03, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 20:10:03, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 20:10:03, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 20:10:03, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 20:10:41,181]\u001b[0m Trial 0 finished with value: 0.2064613019160715 and parameters: {'embedding_gmf_dim': 128, 'embedding_mlp_dim': 128, 'learning_rate': 0.007286919999637349, 'l2_reg': 5.2988661733736925e-06, 'gamma': 0.99, 'count_negative_sample': 5}. 
Best is trial 0 with value: 0.2064613019160715.\u001b[0m\n", "22/02/25 20:10:41 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 20:10:41 WARN CacheManager: Asked to cache already cached data.\n", "2022-02-25 20:27:33,479 ignite.handlers.early_stopping.EarlyStopping INFO: EarlyStopping: Stop training\n", "25-Feb-22 20:27:33, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 20:27:33, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 20:27:33, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 20:27:33, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 20:28:14,765]\u001b[0m Trial 1 finished with value: 0.17299232765028 and parameters: {'embedding_gmf_dim': 128, 'embedding_mlp_dim': 128, 'learning_rate': 0.1233331520732471, 'l2_reg': 4.803592936767287e-05, 'gamma': 0.99, 'count_negative_sample': 13}. Best is trial 0 with value: 0.2064613019160715.\u001b[0m\n", "22/02/25 20:28:14 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 20:28:14 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 20:37:21, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 20:37:21, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 20:37:21, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 20:37:21, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 20:38:02,506]\u001b[0m Trial 2 finished with value: 0.21307408723446133 and parameters: {'embedding_gmf_dim': 128, 'embedding_mlp_dim': 128, 'learning_rate': 0.021637215064705798, 'l2_reg': 0.3067280492440385, 'gamma': 0.99, 'count_negative_sample': 4}. 
Best is trial 2 with value: 0.21307408723446133.\u001b[0m\n", "22/02/25 20:38:02 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 20:38:02 WARN CacheManager: Asked to cache already cached data.\n", "2022-02-25 20:59:18,923 ignite.handlers.early_stopping.EarlyStopping INFO: EarlyStopping: Stop training\n", "25-Feb-22 20:59:18, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 20:59:18, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 20:59:18, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 20:59:18, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 20:59:55,088]\u001b[0m Trial 3 finished with value: 0.21996091320177072 and parameters: {'embedding_gmf_dim': 128, 'embedding_mlp_dim': 128, 'learning_rate': 0.019671725534141614, 'l2_reg': 0.8613182091456935, 'gamma': 0.99, 'count_negative_sample': 15}. Best is trial 3 with value: 0.21996091320177072.\u001b[0m\n", "22/02/25 20:59:55 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 20:59:55 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 21:16:15, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 21:16:15, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 21:16:15, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 21:16:15, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 21:16:57,740]\u001b[0m Trial 4 finished with value: 0.21487124878524896 and parameters: {'embedding_gmf_dim': 128, 'embedding_mlp_dim': 128, 'learning_rate': 0.0027782611868129303, 'l2_reg': 1.3469436856157776e-06, 'gamma': 0.99, 'count_negative_sample': 10}. 
Best is trial 3 with value: 0.21996091320177072.\u001b[0m\n", "22/02/25 21:16:57 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 21:16:57 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 21:23:09, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 21:23:09, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 21:23:10, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 21:23:10, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 21:23:43,083]\u001b[0m Trial 5 finished with value: 0.2177892726529676 and parameters: {'embedding_gmf_dim': 128, 'embedding_mlp_dim': 128, 'learning_rate': 0.0010564746952059123, 'l2_reg': 4.9368730884663886e-05, 'gamma': 0.99, 'count_negative_sample': 2}. Best is trial 3 with value: 0.21996091320177072.\u001b[0m\n", "22/02/25 21:23:43 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 21:23:43 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 21:31:28, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 21:31:29, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 21:31:29, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 21:31:29, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 21:32:01,761]\u001b[0m Trial 6 finished with value: 0.22209780168477072 and parameters: {'embedding_gmf_dim': 128, 'embedding_mlp_dim': 128, 'learning_rate': 0.01546369650648298, 'l2_reg': 0.3132798237751134, 'gamma': 0.99, 'count_negative_sample': 3}. 
Best is trial 6 with value: 0.22209780168477072.\u001b[0m\n", "22/02/25 21:32:01 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 21:32:01 WARN CacheManager: Asked to cache already cached data.\n", "2022-02-25 21:47:55,167 ignite.handlers.early_stopping.EarlyStopping INFO: EarlyStopping: Stop training\n", "25-Feb-22 21:47:55, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 21:47:55, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 21:47:55, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 21:47:55, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 21:48:31,488]\u001b[0m Trial 7 finished with value: 0.16615881668396787 and parameters: {'embedding_gmf_dim': 128, 'embedding_mlp_dim': 128, 'learning_rate': 0.03466033665592717, 'l2_reg': 7.915211779609062e-05, 'gamma': 0.99, 'count_negative_sample': 11}. 
Best is trial 6 with value: 0.22209780168477072.\u001b[0m\n", "22/02/25 21:48:31 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 21:48:31 WARN CacheManager: Asked to cache already cached data.\n", "2022-02-25 22:02:21,911 ignite.handlers.early_stopping.EarlyStopping INFO: EarlyStopping: Stop training\n", "25-Feb-22 22:02:21, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 22:02:21, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 22:02:21, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 22:02:21, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 22:02:57,607]\u001b[0m Trial 8 finished with value: 0.18532793549296822 and parameters: {'embedding_gmf_dim': 128, 'embedding_mlp_dim': 128, 'learning_rate': 0.07375389362564352, 'l2_reg': 0.024112570854850378, 'gamma': 0.99, 'count_negative_sample': 13}. 
Best is trial 6 with value: 0.22209780168477072.\u001b[0m\n", "22/02/25 22:02:57 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 22:02:57 WARN CacheManager: Asked to cache already cached data.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "2022-02-25 22:10:21,855 ignite.handlers.early_stopping.EarlyStopping INFO: EarlyStopping: Stop training\n", "25-Feb-22 22:10:21, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 22:10:21, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 22:10:21, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 22:10:21, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 22:11:02,352]\u001b[0m Trial 9 finished with value: 0.17543632527491465 and parameters: {'embedding_gmf_dim': 128, 'embedding_mlp_dim': 128, 'learning_rate': 0.060119225245243824, 'l2_reg': 0.02935821245727142, 'gamma': 0.99, 'count_negative_sample': 6}. 
Best is trial 6 with value: 0.22209780168477072.\u001b[0m\n", "25-Feb-22 22:11:02, replay, INFO: best params for NeuroMF with optimized parameters are: {'embedding_gmf_dim': 128, 'embedding_mlp_dim': 128, 'learning_rate': 0.01546369650648298, 'l2_reg': 0.3132798237751134, 'gamma': 0.99, 'count_negative_sample': 3}\n", "25-Feb-22 22:11:02, replay, INFO: NeuroMF with optimized parameters fit_predict started\n", "22/02/25 22:11:02 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 22:11:02 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 22:20:36, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 22:20:36, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 22:20:36, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 22:20:36, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 22:27:22, replay, INFO: Word2Vec with optimized parameters started \n", "25-Feb-22 22:27:22, replay, INFO: Word2Vec with optimized parameters optimization started\n", "\u001b[32m[I 2022-02-25 22:27:22,213]\u001b[0m A new study created in memory with name: no-name-79d66816-9a1b-4aa0-bf26-c980be8f69c0\u001b[0m\n", "22/02/25 22:27:22 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 22:27:22 WARN CacheManager: Asked to cache already cached data.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ " NDCG@10 MRR@10 Coverage@10 \\\n", "SLIM 0.270859 0.434489 0.063323 \n", "LightFM 0.267207 0.436674 0.156346 \n", "KNN 0.258407 0.412565 0.054077 \n", "Implicit ALS 0.253444 0.406855 0.131129 \n", "Popular Recommender 0.243783 0.390426 0.033903 \n", "NeuroMF with optimized parameters 0.239874 0.397850 0.092463 \n", "MultVAE with optimized parameters 0.236728 0.378478 0.034744 \n", "ADMM SLIM 0.216480 0.373958 0.348837 \n", "NeuroMF with default parameters 0.198796 0.336243 0.261138 
\n", "Word2Vec with default parameters 0.139835 0.247189 0.139255 \n", "Wilson Recommender 0.092121 0.180976 0.017092 \n", "MultVAE with default parameters 0.075648 0.120041 0.011488 \n", "Random Recommender (popularity-based) 0.066665 0.150434 0.760437 \n", "Random Recommender (uniform) 0.021725 0.054846 0.957691 \n", "Explicit ALS 0.017995 0.041331 0.569908 \n", "\n", " fit_pred_time \n", "SLIM 41.092457 \n", "LightFM 28.989394 \n", "KNN 36.018561 \n", "Implicit ALS 32.185843 \n", "Popular Recommender 16.931190 \n", "NeuroMF with optimized parameters 657.766227 \n", "MultVAE with optimized parameters 45.044486 \n", "ADMM SLIM 56.977886 \n", "NeuroMF with default parameters 283.866455 \n", "Word2Vec with default parameters 161.589033 \n", "Wilson Recommender 16.660789 \n", "MultVAE with default parameters 37.993317 \n", "Random Recommender (popularity-based) 10.100435 \n", "Random Recommender (uniform) 11.719672 \n", "Explicit ALS 50.138072 \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "25-Feb-22 22:27:28, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 22:27:28, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 22:29:57,549]\u001b[0m Trial 0 finished with value: 0.13781682129218153 and parameters: {'rank': 100, 'window_size': 1, 'use_idf': False}. Best is trial 0 with value: 0.13781682129218153.\u001b[0m\n", "22/02/25 22:29:57 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 22:29:57 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 22:33:25, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 22:33:25, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 22:35:13,468]\u001b[0m Trial 1 finished with value: 0.0307612393780298 and parameters: {'rank': 193, 'window_size': 72, 'use_idf': True}. 
Best is trial 0 with value: 0.13781682129218153.\u001b[0m\n", "22/02/25 22:35:13 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 22:35:13 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 22:37:29, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 22:37:29, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 22:39:04,541]\u001b[0m Trial 2 finished with value: 0.04790512088419334 and parameters: {'rank': 203, 'window_size': 37, 'use_idf': False}. Best is trial 0 with value: 0.13781682129218153.\u001b[0m\n", "22/02/25 22:39:04 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 22:39:04 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 22:40:05, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 22:40:05, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 22:41:19,992]\u001b[0m Trial 3 finished with value: 0.05834027516644601 and parameters: {'rank': 225, 'window_size': 12, 'use_idf': False}. Best is trial 0 with value: 0.13781682129218153.\u001b[0m\n", "22/02/25 22:41:20 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 22:41:20 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 22:45:05, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 22:45:05, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 22:46:28,867]\u001b[0m Trial 4 finished with value: 0.0320001405181196 and parameters: {'rank': 186, 'window_size': 85, 'use_idf': True}. 
Best is trial 0 with value: 0.13781682129218153.\u001b[0m\n", "22/02/25 22:46:28 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 22:46:28 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 22:47:15, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 22:47:15, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 22:48:50,703]\u001b[0m Trial 5 finished with value: 0.06609937887072302 and parameters: {'rank': 262, 'window_size': 9, 'use_idf': False}. Best is trial 0 with value: 0.13781682129218153.\u001b[0m\n", "22/02/25 22:48:50 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 22:48:50 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 22:50:18, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 22:50:18, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 22:51:43,507]\u001b[0m Trial 6 finished with value: 0.03343272537312708 and parameters: {'rank': 81, 'window_size': 61, 'use_idf': True}. Best is trial 0 with value: 0.13781682129218153.\u001b[0m\n", "22/02/25 22:51:43 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 22:51:43 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 22:57:42, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 22:57:42, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 22:59:02,215]\u001b[0m Trial 7 finished with value: 0.03211271778111359 and parameters: {'rank': 280, 'window_size': 97, 'use_idf': True}. 
Best is trial 0 with value: 0.13781682129218153.\u001b[0m\n", "22/02/25 22:59:02 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 22:59:02 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 22:59:34, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 22:59:34, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 23:00:57,127]\u001b[0m Trial 8 finished with value: 0.04316049263039244 and parameters: {'rank': 62, 'window_size': 17, 'use_idf': True}. Best is trial 0 with value: 0.13781682129218153.\u001b[0m\n", "22/02/25 23:00:57 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 23:00:57 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 23:03:13, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 23:03:13, replay, WARNING: This model can't predict cold items, they will be ignored\n", "\u001b[32m[I 2022-02-25 23:04:32,134]\u001b[0m Trial 9 finished with value: 0.042026085556290914 and parameters: {'rank': 117, 'window_size': 72, 'use_idf': False}. Best is trial 0 with value: 0.13781682129218153.\u001b[0m\n", "25-Feb-22 23:04:32, replay, INFO: best params for Word2Vec with optimized parameters are: {'rank': 100, 'window_size': 1, 'use_idf': False}\n", "25-Feb-22 23:04:32, replay, INFO: Word2Vec with optimized parameters fit_predict started\n", "22/02/25 23:04:32 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 23:04:32 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 23:04:32 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 23:04:39 WARN CacheManager: Asked to cache already cached data. 
\n", "25-Feb-22 23:04:39, replay, WARNING: This model can't predict cold items, they will be ignored\n", "25-Feb-22 23:04:39, replay, WARNING: This model can't predict cold items, they will be ignored\n", "[Stage 21081:=================================================> (139 + 5) / 144] 48]\r" ] }, { "name": "stdout", "output_type": "stream", "text": [ " NDCG@10 MRR@10 Coverage@10 \\\n", "SLIM 0.270859 0.434489 0.063323 \n", "LightFM 0.267207 0.436674 0.156346 \n", "KNN 0.258407 0.412565 0.054077 \n", "Implicit ALS 0.253444 0.406855 0.131129 \n", "Popular Recommender 0.243783 0.390426 0.033903 \n", "NeuroMF with optimized parameters 0.239874 0.397850 0.092463 \n", "MultVAE with optimized parameters 0.236728 0.378478 0.034744 \n", "ADMM SLIM 0.216480 0.373958 0.348837 \n", "NeuroMF with default parameters 0.198796 0.336243 0.261138 \n", "Word2Vec with default parameters 0.139835 0.247189 0.139255 \n", "Word2Vec with optimized parameters 0.139835 0.247189 0.139255 \n", "Wilson Recommender 0.092121 0.180976 0.017092 \n", "MultVAE with default parameters 0.075648 0.120041 0.011488 \n", "Random Recommender (popularity-based) 0.066665 0.150434 0.760437 \n", "Random Recommender (uniform) 0.021725 0.054846 0.957691 \n", "Explicit ALS 0.017995 0.041331 0.569908 \n", "\n", " fit_pred_time \n", "SLIM 41.092457 \n", "LightFM 28.989394 \n", "KNN 36.018561 \n", "Implicit ALS 32.185843 \n", "Popular Recommender 16.931190 \n", "NeuroMF with optimized parameters 657.766227 \n", "MultVAE with optimized parameters 45.044486 \n", "ADMM SLIM 56.977886 \n", "NeuroMF with default parameters 283.866455 \n", "Word2Vec with default parameters 161.589033 \n", "Word2Vec with optimized parameters 108.236670 \n", "Wilson Recommender 16.660789 \n", "MultVAE with default parameters 37.993317 \n", "Random Recommender (popularity-based) 10.100435 \n", "Random Recommender (uniform) 11.719672 \n", "Explicit ALS 50.138072 \n", "CPU times: user 2d 3h 58min 46s, sys: 7h 49min 53s, total: 2d 11h 48min 
40s\n", "Wall time: 3h 42min 25s\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\r", "[Stage 21081:==================================================>(142 + 2) / 144]\r", "\r", " \r" ] } ], "source": [ "%%time\n", "full_pipeline(nets, e, train, budget=10)" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>Coverage@10</th><th>HitRate@1</th><th>HitRate@5</th><th>HitRate@10</th><th>MAP@10</th><th>MRR@10</th><th>NDCG@10</th><th>Surprisal@10</th><th>fit_pred_time</th><th>params</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>SLIM</th><td>0.063323</td><td>0.324846</td><td>0.583845</td><td>0.693591</td><td>0.176329</td><td>0.434489</td><td>0.270859</td><td>0.133323</td><td>41.092457</td><td>{'beta': 4.65156643702147, 'lambda_': 0.000203...</td></tr>\n",
"    <tr><th>LightFM</th><td>0.156346</td><td>0.324846</td><td>0.581212</td><td>0.694469</td><td>0.170251</td><td>0.436674</td><td>0.267207</td><td>0.169012</td><td>28.989394</td><td>{'loss': 'warp', 'no_components': 9}</td></tr>\n",
"    <tr><th>KNN</th><td>0.054077</td><td>0.302897</td><td>0.556629</td><td>0.648815</td><td>0.168665</td><td>0.412565</td><td>0.258407</td><td>0.138554</td><td>36.018561</td><td>{'num_neighbours': 75, 'shrink': 78}</td></tr>\n",
"    <tr><th>Implicit ALS</th><td>0.131129</td><td>0.292362</td><td>0.562774</td><td>0.681299</td><td>0.162140</td><td>0.406855</td><td>0.253444</td><td>0.163824</td><td>32.185843</td><td>{'rank': 8}</td></tr>\n",
"    <tr><th>Popular Recommender</th><td>0.033903</td><td>0.284460</td><td>0.530290</td><td>0.645303</td><td>0.157301</td><td>0.390426</td><td>0.243783</td><td>0.118354</td><td>16.931190</td><td>NaN</td></tr>\n",
"    <tr><th>NeuroMF with optimized parameters</th><td>0.092463</td><td>0.291484</td><td>0.542581</td><td>0.654083</td><td>0.150862</td><td>0.397850</td><td>0.239874</td><td>0.160814</td><td>657.766227</td><td>{'embedding_gmf_dim': 128, 'embedding_mlp_dim'...</td></tr>\n",
"    <tr><th>MultVAE with optimized parameters</th><td>0.034744</td><td>0.273047</td><td>0.524144</td><td>0.643547</td><td>0.151765</td><td>0.378478</td><td>0.236728</td><td>0.121098</td><td>45.044486</td><td>{'learning_rate': 0.015677916317796903, 'epoch...</td></tr>\n",
"    <tr><th>ADMM SLIM</th><td>0.348837</td><td>0.258121</td><td>0.541703</td><td>0.647937</td><td>0.127043</td><td>0.373958</td><td>0.216480</td><td>0.221984</td><td>56.977886</td><td>{'lambda_1': 0.8417364694294401, 'lambda_2': 6...</td></tr>\n",
"    <tr><th>NeuroMF with default parameters</th><td>0.261138</td><td>0.216857</td><td>0.492537</td><td>0.622476</td><td>0.113413</td><td>0.336243</td><td>0.198796</td><td>0.222385</td><td>283.866455</td><td>NaN</td></tr>\n",
"    <tr><th>Word2Vec with default parameters</th><td>0.139255</td><td>0.147498</td><td>0.383670</td><td>0.500439</td><td>0.074579</td><td>0.247189</td><td>0.139835</td><td>0.237858</td><td>161.589033</td><td>NaN</td></tr>\n",
"    <tr><th>Word2Vec with optimized parameters</th><td>0.139255</td><td>0.147498</td><td>0.383670</td><td>0.500439</td><td>0.074579</td><td>0.247189</td><td>0.139835</td><td>0.237858</td><td>108.236670</td><td>{'rank': 100, 'window_size': 1, 'use_idf': False}</td></tr>\n",
"    <tr><th>Wilson Recommender</th><td>0.017092</td><td>0.083406</td><td>0.345040</td><td>0.414399</td><td>0.045002</td><td>0.180976</td><td>0.092121</td><td>0.262190</td><td>16.660789</td><td>NaN</td></tr>\n",
"    <tr><th>MultVAE with default parameters</th><td>0.011488</td><td>0.022827</td><td>0.253731</td><td>0.464442</td><td>0.029005</td><td>0.120041</td><td>0.075648</td><td>0.255512</td><td>37.993317</td><td>NaN</td></tr>\n",
"    <tr><th>Random Recommender (popularity-based)</th><td>0.760437</td><td>0.071115</td><td>0.261633</td><td>0.378402</td><td>0.027026</td><td>0.150434</td><td>0.066665</td><td>0.344784</td><td>10.100435</td><td>{'distribution': 'popular_based', 'alpha': 24....</td></tr>\n",
"    <tr><th>Random Recommender (uniform)</th><td>0.957691</td><td>0.017559</td><td>0.100088</td><td>0.167691</td><td>0.007332</td><td>0.054846</td><td>0.021725</td><td>0.538677</td><td>11.719672</td><td>NaN</td></tr>\n",
"    <tr><th>Explicit ALS</th><td>0.569908</td><td>0.017559</td><td>0.070237</td><td>0.124671</td><td>0.006534</td><td>0.041331</td><td>0.017995</td><td>0.540517</td><td>50.138072</td><td>{'rank': 32}</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " Coverage@10 HitRate@1 HitRate@5 \\\n", "SLIM 0.063323 0.324846 0.583845 \n", "LightFM 0.156346 0.324846 0.581212 \n", "KNN 0.054077 0.302897 0.556629 \n", "Implicit ALS 0.131129 0.292362 0.562774 \n", "Popular Recommender 0.033903 0.284460 0.530290 \n", "NeuroMF with optimized parameters 0.092463 0.291484 0.542581 \n", "MultVAE with optimized parameters 0.034744 0.273047 0.524144 \n", "ADMM SLIM 0.348837 0.258121 0.541703 \n", "NeuroMF with default parameters 0.261138 0.216857 0.492537 \n", "Word2Vec with default parameters 0.139255 0.147498 0.383670 \n", "Word2Vec with optimized parameters 0.139255 0.147498 0.383670 \n", "Wilson Recommender 0.017092 0.083406 0.345040 \n", "MultVAE with default parameters 0.011488 0.022827 0.253731 \n", "Random Recommender (popularity-based) 0.760437 0.071115 0.261633 \n", "Random Recommender (uniform) 0.957691 0.017559 0.100088 \n", "Explicit ALS 0.569908 0.017559 0.070237 \n", "\n", " HitRate@10 MAP@10 MRR@10 \\\n", "SLIM 0.693591 0.176329 0.434489 \n", "LightFM 0.694469 0.170251 0.436674 \n", "KNN 0.648815 0.168665 0.412565 \n", "Implicit ALS 0.681299 0.162140 0.406855 \n", "Popular Recommender 0.645303 0.157301 0.390426 \n", "NeuroMF with optimized parameters 0.654083 0.150862 0.397850 \n", "MultVAE with optimized parameters 0.643547 0.151765 0.378478 \n", "ADMM SLIM 0.647937 0.127043 0.373958 \n", "NeuroMF with default parameters 0.622476 0.113413 0.336243 \n", "Word2Vec with default parameters 0.500439 0.074579 0.247189 \n", "Word2Vec with optimized parameters 0.500439 0.074579 0.247189 \n", "Wilson Recommender 0.414399 0.045002 0.180976 \n", "MultVAE with default parameters 0.464442 0.029005 0.120041 \n", "Random Recommender (popularity-based) 0.378402 0.027026 0.150434 \n", "Random Recommender (uniform) 0.167691 0.007332 0.054846 \n", "Explicit ALS 0.124671 0.006534 0.041331 \n", "\n", " NDCG@10 Surprisal@10 fit_pred_time \\\n", "SLIM 0.270859 0.133323 41.092457 \n", "LightFM 0.267207 0.169012 
28.989394 \n", "KNN 0.258407 0.138554 36.018561 \n", "Implicit ALS 0.253444 0.163824 32.185843 \n", "Popular Recommender 0.243783 0.118354 16.931190 \n", "NeuroMF with optimized parameters 0.239874 0.160814 657.766227 \n", "MultVAE with optimized parameters 0.236728 0.121098 45.044486 \n", "ADMM SLIM 0.216480 0.221984 56.977886 \n", "NeuroMF with default parameters 0.198796 0.222385 283.866455 \n", "Word2Vec with default parameters 0.139835 0.237858 161.589033 \n", "Word2Vec with optimized parameters 0.139835 0.237858 108.236670 \n", "Wilson Recommender 0.092121 0.262190 16.660789 \n", "MultVAE with default parameters 0.075648 0.255512 37.993317 \n", "Random Recommender (popularity-based) 0.066665 0.344784 10.100435 \n", "Random Recommender (uniform) 0.021725 0.538677 11.719672 \n", "Explicit ALS 0.017995 0.540517 50.138072 \n", "\n", " params \n", "SLIM {'beta': 4.65156643702147, 'lambda_': 0.000203... \n", "LightFM {'loss': 'warp', 'no_components': 9} \n", "KNN {'num_neighbours': 75, 'shrink': 78} \n", "Implicit ALS {'rank': 8} \n", "Popular Recommender NaN \n", "NeuroMF with optimized parameters {'embedding_gmf_dim': 128, 'embedding_mlp_dim'... \n", "MultVAE with optimized parameters {'learning_rate': 0.015677916317796903, 'epoch... \n", "ADMM SLIM {'lambda_1': 0.8417364694294401, 'lambda_2': 6... \n", "NeuroMF with default parameters NaN \n", "Word2Vec with default parameters NaN \n", "Word2Vec with optimized parameters {'rank': 100, 'window_size': 1, 'use_idf': False} \n", "Wilson Recommender NaN \n", "MultVAE with default parameters NaN \n", "Random Recommender (popularity-based) {'distribution': 'popular_based', 'alpha': 24.... 
\n", "Random Recommender (uniform) NaN \n", "Explicit ALS {'rank': 32} " ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "e.results.sort_values('NDCG@10', ascending=False)" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [], "source": [ "e.results.to_csv('res_23_rel_1.csv')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2.4 Models considering features" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2.4.1 Item features preprocessing" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ " \r" ] }, { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 566 ms, sys: 37.5 ms, total: 603 ms\n", "Wall time: 4.07 s\n" ] } ], "source": [ "%%time\n", "preparator = DataPreparator()\n", "log, _, item_features = preparator(data.ratings, item_features=data.items, mapping={\"relevance\": \"rating\"})" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "+----------------+--------------------+--------+\n", "| title| genres|item_idx|\n", "+----------------+--------------------+--------+\n", "|Toy Story (1995)|Animation|Childre...| 29|\n", "| Jumanji (1995)|Adventure|Childre...| 393|\n", "+----------------+--------------------+--------+\n", "only showing top 2 rows\n", "\n" ] } ], "source": [ "item_features.show(2)" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "+--------+----+\n", "|item_idx|year|\n", "+--------+----+\n", "| 29|1995|\n", "| 393|1995|\n", "+--------+----+\n", "only showing top 2 rows\n", "\n" ] } ], "source": [ "year = item_features.withColumn('year', sf.substring(sf.col('title'), -5, 4).astype(st.IntegerType())).select('item_idx', 'year')\n", "year.show(2)" ] }, { "cell_type": "code", 
"execution_count": 35, "metadata": {}, "outputs": [], "source": [ "genres = (\n", " spark.createDataFrame(data.items[[\"item_id\", \"genres\"]].rename({'item_id': 'item_idx'}, axis=1))\n", " .select(\n", " \"item_idx\",\n", " sf.split(\"genres\", \"\\|\").alias(\"genres\")\n", " )\n", ")" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [], "source": [ "genres_list = (\n", " genres.select(sf.explode(\"genres\").alias(\"genre\"))\n", " .distinct().filter('genre <> \"(no genres listed)\"')\n", " .toPandas()[\"genre\"].tolist()\n", ")" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['Documentary',\n", " 'Adventure',\n", " 'Animation',\n", " 'Comedy',\n", " 'Thriller',\n", " 'Sci-Fi',\n", " 'Musical',\n", " 'Horror',\n", " 'Action',\n", " 'Fantasy',\n", " 'War',\n", " 'Mystery',\n", " \"Children's\",\n", " 'Drama',\n", " 'Film-Noir',\n", " 'Crime',\n", " 'Western',\n", " 'Romance']" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "genres_list" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3883" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "item_features = genres\n", "for genre in genres_list:\n", " item_features = item_features.withColumn(\n", " genre,\n", " sf.array_contains(sf.col(\"genres\"), genre).astype(IntegerType())\n", " )\n", "item_features = item_features.drop(\"genres\").cache()\n", "item_features.count()" ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3813" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "item_features = item_features.join(year, on='item_idx', how='inner')\n", "item_features.count()" ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "data": { "text/plain": [ 
"DataFrame[item_idx: int, Documentary: int, Adventure: int, Animation: int, Comedy: int, Thriller: int, Sci-Fi: int, Musical: int, Horror: int, Action: int, Fantasy: int, War: int, Mystery: int, Children's: int, Drama: int, Film-Noir: int, Crime: int, Western: int, Romance: int, year: int]" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "item_features.cache()" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "+--------+-----------+---------+---------+------+--------+------+-------+------+------+-------+---+-------+----------+-----+---------+-----+-------+-------+----+\n", "|item_idx|Documentary|Adventure|Animation|Comedy|Thriller|Sci-Fi|Musical|Horror|Action|Fantasy|War|Mystery|Children's|Drama|Film-Noir|Crime|Western|Romance|year|\n", "+--------+-----------+---------+---------+------+--------+------+-------+------+------+-------+---+-------+----------+-----+---------+-----+-------+-------+----+\n", "| 29| 0| 1| 0| 0| 0| 1| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|1995|\n", "| 393| 0| 0| 0| 0| 0| 0| 0| 0| 1| 0| 0| 0| 0| 0| 0| 0| 0| 0|1995|\n", "| 648| 0| 1| 0| 0| 0| 0| 0| 0| 1| 0| 0| 1| 0| 0| 0| 0| 0| 0|1995|\n", "+--------+-----------+---------+---------+------+--------+------+-------+------+------+-------+---+-------+----------+-----+---------+-----+-------+-------+----+\n", "only showing top 3 rows\n", "\n" ] } ], "source": [ "item_features.show(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2.4.2 Models training" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [], "source": [ "models_with_features = {'LightFM with item features': [LightFMWrap(random_state=SEED), {\"no_components\": [8, 512]}]}" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "25-Feb-22 23:12:34, replay, INFO: LightFM with item 
features started\n", "25-Feb-22 23:12:34, replay, INFO: LightFM with item features optimization started\n", "\u001b[32m[I 2022-02-25 23:12:34,833]\u001b[0m A new study created in memory with name: no-name-a86246ae-1ea9-4636-9278-763f3bf90129\u001b[0m\n", "/home/u19893556/miniconda3/envs/replay/lib/python3.7/site-packages/optuna/distributions.py:364: FutureWarning: Samplers and other components in Optuna will assume that `step` is 1. `step` argument is deprecated and will be removed in the future. The removal of this feature is currently scheduled for v4.0.0, but this schedule is subject to change.\n", " FutureWarning,\n", "22/02/25 23:12:34 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 23:13:51, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 23:13:51, replay, WARNING: This model can't predict cold users, they will be ignored\n", "\u001b[32m[I 2022-02-25 23:14:18,532]\u001b[0m Trial 0 finished with value: 0.2022408950793562 and parameters: {'loss': 'warp', 'no_components': 128}. Best is trial 0 with value: 0.2022408950793562.\u001b[0m\n", "22/02/25 23:14:18 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 23:14:18 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 23:15:21, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 23:15:21, replay, WARNING: This model can't predict cold users, they will be ignored\n", "\u001b[32m[I 2022-02-25 23:15:50,494]\u001b[0m Trial 1 finished with value: 0.21235478041254502 and parameters: {'loss': 'warp', 'no_components': 63}. 
Best is trial 1 with value: 0.21235478041254502.\u001b[0m\n", "22/02/25 23:15:50 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 23:15:50 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 23:16:42, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 23:16:42, replay, WARNING: This model can't predict cold users, they will be ignored\n", "\u001b[32m[I 2022-02-25 23:17:19,484]\u001b[0m Trial 2 finished with value: 0.21617225390362893 and parameters: {'loss': 'warp', 'no_components': 53}. Best is trial 2 with value: 0.21617225390362893.\u001b[0m\n", "22/02/25 23:17:19 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 23:17:19 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 23:17:37, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 23:17:37, replay, WARNING: This model can't predict cold users, they will be ignored\n", "\u001b[32m[I 2022-02-25 23:18:05,727]\u001b[0m Trial 3 finished with value: 0.21104014039193397 and parameters: {'loss': 'warp', 'no_components': 8}. Best is trial 2 with value: 0.21617225390362893.\u001b[0m\n", "22/02/25 23:18:05 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 23:18:05 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 23:18:24, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 23:18:24, replay, WARNING: This model can't predict cold users, they will be ignored\n", "\u001b[32m[I 2022-02-25 23:18:51,745]\u001b[0m Trial 4 finished with value: 0.21250813084760045 and parameters: {'loss': 'warp', 'no_components': 10}. 
Best is trial 2 with value: 0.21617225390362893.\u001b[0m\n", "22/02/25 23:18:51 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 23:18:51 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 23:20:02, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 23:20:03, replay, WARNING: This model can't predict cold users, they will be ignored\n", "\u001b[32m[I 2022-02-25 23:20:29,995]\u001b[0m Trial 5 finished with value: 0.20744692015026825 and parameters: {'loss': 'warp', 'no_components': 133}. Best is trial 2 with value: 0.21617225390362893.\u001b[0m\n", "22/02/25 23:20:30 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 23:20:30 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 23:21:50, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 23:21:50, replay, WARNING: This model can't predict cold users, they will be ignored\n", "\u001b[32m[I 2022-02-25 23:22:24,696]\u001b[0m Trial 6 finished with value: 0.20594069340874005 and parameters: {'loss': 'warp', 'no_components': 212}. Best is trial 2 with value: 0.21617225390362893.\u001b[0m\n", "22/02/25 23:22:24 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 23:22:24 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 23:23:43, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 23:23:43, replay, WARNING: This model can't predict cold users, they will be ignored\n", "\u001b[32m[I 2022-02-25 23:24:13,227]\u001b[0m Trial 7 finished with value: 0.20352024919175718 and parameters: {'loss': 'warp', 'no_components': 171}. 
Best is trial 2 with value: 0.21617225390362893.\u001b[0m\n", "22/02/25 23:24:13 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 23:24:13 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 23:25:27, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 23:25:27, replay, WARNING: This model can't predict cold users, they will be ignored\n", "\u001b[32m[I 2022-02-25 23:25:52,569]\u001b[0m Trial 8 finished with value: 0.2038353445167596 and parameters: {'loss': 'warp', 'no_components': 126}. Best is trial 2 with value: 0.21617225390362893.\u001b[0m\n", "22/02/25 23:25:52 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 23:25:52 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 23:27:01, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 23:27:01, replay, WARNING: This model can't predict cold users, they will be ignored\n", "\u001b[32m[I 2022-02-25 23:27:35,478]\u001b[0m Trial 9 finished with value: 0.20053281638654186 and parameters: {'loss': 'warp', 'no_components': 100}. Best is trial 2 with value: 0.21617225390362893.\u001b[0m\n", "22/02/25 23:27:35 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 23:27:35 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 23:29:31, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 23:29:31, replay, WARNING: This model can't predict cold users, they will be ignored\n", "\u001b[32m[I 2022-02-25 23:30:01,779]\u001b[0m Trial 10 finished with value: 0.1940453657774853 and parameters: {'loss': 'warp', 'no_components': 485}. 
Best is trial 2 with value: 0.21617225390362893.\u001b[0m\n", "22/02/25 23:30:01 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 23:30:01 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 23:30:29, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 23:30:29, replay, WARNING: This model can't predict cold users, they will be ignored\n", "\u001b[32m[I 2022-02-25 23:30:54,033]\u001b[0m Trial 11 finished with value: 0.2195214275914755 and parameters: {'loss': 'warp', 'no_components': 16}. Best is trial 11 with value: 0.2195214275914755.\u001b[0m\n", "22/02/25 23:30:54 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 23:30:54 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 23:31:34, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 23:31:34, replay, WARNING: This model can't predict cold users, they will be ignored\n", "\u001b[32m[I 2022-02-25 23:31:56,682]\u001b[0m Trial 12 finished with value: 0.21450551100986395 and parameters: {'loss': 'warp', 'no_components': 25}. Best is trial 11 with value: 0.2195214275914755.\u001b[0m\n", "22/02/25 23:31:56 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 23:31:56 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 23:32:32, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 23:32:32, replay, WARNING: This model can't predict cold users, they will be ignored\n", "\u001b[32m[I 2022-02-25 23:32:57,183]\u001b[0m Trial 13 finished with value: 0.21297319750606825 and parameters: {'loss': 'warp', 'no_components': 24}. 
Best is trial 11 with value: 0.2195214275914755.\u001b[0m\n", "22/02/25 23:32:57 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 23:32:57 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 23:33:32, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 23:33:32, replay, WARNING: This model can't predict cold users, they will be ignored\n", "\u001b[32m[I 2022-02-25 23:33:56,933]\u001b[0m Trial 14 finished with value: 0.21805442951145454 and parameters: {'loss': 'warp', 'no_components': 25}. Best is trial 11 with value: 0.2195214275914755.\u001b[0m\n", "22/02/25 23:33:56 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 23:33:56 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 23:34:26, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 23:34:26, replay, WARNING: This model can't predict cold users, they will be ignored\n", "\u001b[32m[I 2022-02-25 23:34:47,573]\u001b[0m Trial 15 finished with value: 0.21680272494347633 and parameters: {'loss': 'warp', 'no_components': 19}. Best is trial 11 with value: 0.2195214275914755.\u001b[0m\n", "22/02/25 23:34:47 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 23:34:47 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 23:35:11, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 23:35:11, replay, WARNING: This model can't predict cold users, they will be ignored\n", "\u001b[32m[I 2022-02-25 23:35:36,033]\u001b[0m Trial 16 finished with value: 0.21167780352219775 and parameters: {'loss': 'warp', 'no_components': 14}. 
Best is trial 11 with value: 0.2195214275914755.\u001b[0m\n", "22/02/25 23:35:36 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 23:35:36 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 23:36:19, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 23:36:19, replay, WARNING: This model can't predict cold users, they will be ignored\n", "\u001b[32m[I 2022-02-25 23:36:45,463]\u001b[0m Trial 17 finished with value: 0.21920864639917328 and parameters: {'loss': 'warp', 'no_components': 38}. Best is trial 11 with value: 0.2195214275914755.\u001b[0m\n", "22/02/25 23:36:45 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 23:36:45 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 23:37:43, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 23:37:43, replay, WARNING: This model can't predict cold users, they will be ignored\n", "\u001b[32m[I 2022-02-25 23:38:06,982]\u001b[0m Trial 18 finished with value: 0.21394091055030293 and parameters: {'loss': 'warp', 'no_components': 67}. Best is trial 11 with value: 0.2195214275914755.\u001b[0m\n", "22/02/25 23:38:07 WARN CacheManager: Asked to cache already cached data.\n", "22/02/25 23:38:07 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 23:38:50, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 23:38:50, replay, WARNING: This model can't predict cold users, they will be ignored\n", "\u001b[32m[I 2022-02-25 23:39:14,894]\u001b[0m Trial 19 finished with value: 0.21760181189620253 and parameters: {'loss': 'warp', 'no_components': 40}. 
Best is trial 11 with value: 0.2195214275914755.\u001b[0m\n", "25-Feb-22 23:39:14, replay, INFO: best params for LightFM with item features are: {'loss': 'warp', 'no_components': 16}\n", "25-Feb-22 23:39:14, replay, INFO: LightFM with item features fit_predict started\n", "22/02/25 23:39:14 WARN CacheManager: Asked to cache already cached data.\n", "25-Feb-22 23:39:53, replay, WARNING: This model can't predict cold users, they will be ignored\n", "25-Feb-22 23:39:53, replay, WARNING: This model can't predict cold users, they will be ignored\n", "[Stage 23535:=================================================> (141 + 3) / 144]44]\r" ] }, { "name": "stdout", "output_type": "stream", "text": [ " NDCG@10 MRR@10 Coverage@10 \\\n", "SLIM 0.270859 0.434489 0.063323 \n", "LightFM 0.267207 0.436674 0.156346 \n", "KNN 0.258407 0.412565 0.054077 \n", "Implicit ALS 0.253444 0.406855 0.131129 \n", "LightFM with item features 0.250395 0.403145 0.096105 \n", "Popular Recommender 0.243783 0.390426 0.033903 \n", "NeuroMF with optimized parameters 0.239874 0.397850 0.092463 \n", "MultVAE with optimized parameters 0.236728 0.378478 0.034744 \n", "ADMM SLIM 0.216480 0.373958 0.348837 \n", "NeuroMF with default parameters 0.198796 0.336243 0.261138 \n", "Word2Vec with default parameters 0.139835 0.247189 0.139255 \n", "Word2Vec with optimized parameters 0.139835 0.247189 0.139255 \n", "Wilson Recommender 0.092121 0.180976 0.017092 \n", "MultVAE with default parameters 0.075648 0.120041 0.011488 \n", "Random Recommender (popularity-based) 0.066665 0.150434 0.760437 \n", "Random Recommender (uniform) 0.021725 0.054846 0.957691 \n", "Explicit ALS 0.017995 0.041331 0.569908 \n", "\n", " fit_pred_time \n", "SLIM 41.092457 \n", "LightFM 28.989394 \n", "KNN 36.018561 \n", "Implicit ALS 32.185843 \n", "LightFM with item features 59.903357 \n", "Popular Recommender 16.931190 \n", "NeuroMF with optimized parameters 657.766227 \n", "MultVAE with optimized parameters 45.044486 \n", "ADMM SLIM 
56.977886 \n", "NeuroMF with default parameters 283.866455 \n", "Word2Vec with default parameters 161.589033 \n", "Word2Vec with optimized parameters 108.236670 \n", "Wilson Recommender 16.660789 \n", "MultVAE with default parameters 37.993317 \n", "Random Recommender (popularity-based) 10.100435 \n", "Random Recommender (uniform) 11.719672 \n", "Explicit ALS 50.138072 \n", "CPU times: user 13h 9min 36s, sys: 2min 17s, total: 13h 11min 53s\n", "Wall time: 28min 34s\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\r", "[Stage 23535:==================================================>(143 + 1) / 144]\r", "\r", " \r" ] } ], "source": [ "%%time\n", "full_pipeline(models_with_features, e, train)" ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\"><th></th><th>Coverage@10</th><th>HitRate@1</th><th>HitRate@5</th><th>HitRate@10</th><th>MAP@10</th><th>MRR@10</th><th>NDCG@10</th><th>Surprisal@10</th><th>fit_pred_time</th><th>params</th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>SLIM</th><td>0.063323</td><td>0.324846</td><td>0.583845</td><td>0.693591</td><td>0.176329</td><td>0.434489</td><td>0.270859</td><td>0.133323</td><td>41.092457</td><td>{'beta': 4.65156643702147, 'lambda_': 0.000203...</td></tr>\n",
" <tr><th>LightFM</th><td>0.156346</td><td>0.324846</td><td>0.581212</td><td>0.694469</td><td>0.170251</td><td>0.436674</td><td>0.267207</td><td>0.169012</td><td>28.989394</td><td>{'loss': 'warp', 'no_components': 9}</td></tr>\n",
" <tr><th>KNN</th><td>0.054077</td><td>0.302897</td><td>0.556629</td><td>0.648815</td><td>0.168665</td><td>0.412565</td><td>0.258407</td><td>0.138554</td><td>36.018561</td><td>{'num_neighbours': 75, 'shrink': 78}</td></tr>\n",
" <tr><th>Implicit ALS</th><td>0.131129</td><td>0.292362</td><td>0.562774</td><td>0.681299</td><td>0.162140</td><td>0.406855</td><td>0.253444</td><td>0.163824</td><td>32.185843</td><td>{'rank': 8}</td></tr>\n",
" <tr><th>LightFM with item features</th><td>0.096105</td><td>0.279192</td><td>0.567164</td><td>0.686567</td><td>0.158074</td><td>0.403145</td><td>0.250395</td><td>0.152727</td><td>59.903357</td><td>{'loss': 'warp', 'no_components': 16}</td></tr>\n",
" <tr><th>Popular Recommender</th><td>0.033903</td><td>0.284460</td><td>0.530290</td><td>0.645303</td><td>0.157301</td><td>0.390426</td><td>0.243783</td><td>0.118354</td><td>16.931190</td><td>NaN</td></tr>\n",
" <tr><th>NeuroMF with optimized parameters</th><td>0.092463</td><td>0.291484</td><td>0.542581</td><td>0.654083</td><td>0.150862</td><td>0.397850</td><td>0.239874</td><td>0.160814</td><td>657.766227</td><td>{'embedding_gmf_dim': 128, 'embedding_mlp_dim'...</td></tr>\n",
" <tr><th>MultVAE with optimized parameters</th><td>0.034744</td><td>0.273047</td><td>0.524144</td><td>0.643547</td><td>0.151765</td><td>0.378478</td><td>0.236728</td><td>0.121098</td><td>45.044486</td><td>{'learning_rate': 0.015677916317796903, 'epoch...</td></tr>\n",
" <tr><th>ADMM SLIM</th><td>0.348837</td><td>0.258121</td><td>0.541703</td><td>0.647937</td><td>0.127043</td><td>0.373958</td><td>0.216480</td><td>0.221984</td><td>56.977886</td><td>{'lambda_1': 0.8417364694294401, 'lambda_2': 6...</td></tr>\n",
" <tr><th>NeuroMF with default parameters</th><td>0.261138</td><td>0.216857</td><td>0.492537</td><td>0.622476</td><td>0.113413</td><td>0.336243</td><td>0.198796</td><td>0.222385</td><td>283.866455</td><td>NaN</td></tr>\n",
" <tr><th>Word2Vec with default parameters</th><td>0.139255</td><td>0.147498</td><td>0.383670</td><td>0.500439</td><td>0.074579</td><td>0.247189</td><td>0.139835</td><td>0.237858</td><td>161.589033</td><td>NaN</td></tr>\n",
" <tr><th>Word2Vec with optimized parameters</th><td>0.139255</td><td>0.147498</td><td>0.383670</td><td>0.500439</td><td>0.074579</td><td>0.247189</td><td>0.139835</td><td>0.237858</td><td>108.236670</td><td>{'rank': 100, 'window_size': 1, 'use_idf': False}</td></tr>\n",
" <tr><th>Wilson Recommender</th><td>0.017092</td><td>0.083406</td><td>0.345040</td><td>0.414399</td><td>0.045002</td><td>0.180976</td><td>0.092121</td><td>0.262190</td><td>16.660789</td><td>NaN</td></tr>\n",
" <tr><th>MultVAE with default parameters</th><td>0.011488</td><td>0.022827</td><td>0.253731</td><td>0.464442</td><td>0.029005</td><td>0.120041</td><td>0.075648</td><td>0.255512</td><td>37.993317</td><td>NaN</td></tr>\n",
" <tr><th>Random Recommender (popularity-based)</th><td>0.760437</td><td>0.071115</td><td>0.261633</td><td>0.378402</td><td>0.027026</td><td>0.150434</td><td>0.066665</td><td>0.344784</td><td>10.100435</td><td>{'distribution': 'popular_based', 'alpha': 24....</td></tr>\n",
" <tr><th>Random Recommender (uniform)</th><td>0.957691</td><td>0.017559</td><td>0.100088</td><td>0.167691</td><td>0.007332</td><td>0.054846</td><td>0.021725</td><td>0.538677</td><td>11.719672</td><td>NaN</td></tr>\n",
" <tr><th>Explicit ALS</th><td>0.569908</td><td>0.017559</td><td>0.070237</td><td>0.124671</td><td>0.006534</td><td>0.041331</td><td>0.017995</td><td>0.540517</td><td>50.138072</td><td>{'rank': 32}</td></tr>\n",
" </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " Coverage@10 HitRate@1 HitRate@5 \\\n", "SLIM 0.063323 0.324846 0.583845 \n", "LightFM 0.156346 0.324846 0.581212 \n", "KNN 0.054077 0.302897 0.556629 \n", "Implicit ALS 0.131129 0.292362 0.562774 \n", "LightFM with item features 0.096105 0.279192 0.567164 \n", "Popular Recommender 0.033903 0.284460 0.530290 \n", "NeuroMF with optimized parameters 0.092463 0.291484 0.542581 \n", "MultVAE with optimized parameters 0.034744 0.273047 0.524144 \n", "ADMM SLIM 0.348837 0.258121 0.541703 \n", "NeuroMF with default parameters 0.261138 0.216857 0.492537 \n", "Word2Vec with default parameters 0.139255 0.147498 0.383670 \n", "Word2Vec with optimized parameters 0.139255 0.147498 0.383670 \n", "Wilson Recommender 0.017092 0.083406 0.345040 \n", "MultVAE with default parameters 0.011488 0.022827 0.253731 \n", "Random Recommender (popularity-based) 0.760437 0.071115 0.261633 \n", "Random Recommender (uniform) 0.957691 0.017559 0.100088 \n", "Explicit ALS 0.569908 0.017559 0.070237 \n", "\n", " HitRate@10 MAP@10 MRR@10 \\\n", "SLIM 0.693591 0.176329 0.434489 \n", "LightFM 0.694469 0.170251 0.436674 \n", "KNN 0.648815 0.168665 0.412565 \n", "Implicit ALS 0.681299 0.162140 0.406855 \n", "LightFM with item features 0.686567 0.158074 0.403145 \n", "Popular Recommender 0.645303 0.157301 0.390426 \n", "NeuroMF with optimized parameters 0.654083 0.150862 0.397850 \n", "MultVAE with optimized parameters 0.643547 0.151765 0.378478 \n", "ADMM SLIM 0.647937 0.127043 0.373958 \n", "NeuroMF with default parameters 0.622476 0.113413 0.336243 \n", "Word2Vec with default parameters 0.500439 0.074579 0.247189 \n", "Word2Vec with optimized parameters 0.500439 0.074579 0.247189 \n", "Wilson Recommender 0.414399 0.045002 0.180976 \n", "MultVAE with default parameters 0.464442 0.029005 0.120041 \n", "Random Recommender (popularity-based) 0.378402 0.027026 0.150434 \n", "Random Recommender (uniform) 0.167691 0.007332 0.054846 \n", "Explicit ALS 0.124671 0.006534 0.041331 \n", 
"\n", " NDCG@10 Surprisal@10 fit_pred_time \\\n", "SLIM 0.270859 0.133323 41.092457 \n", "LightFM 0.267207 0.169012 28.989394 \n", "KNN 0.258407 0.138554 36.018561 \n", "Implicit ALS 0.253444 0.163824 32.185843 \n", "LightFM with item features 0.250395 0.152727 59.903357 \n", "Popular Recommender 0.243783 0.118354 16.931190 \n", "NeuroMF with optimized parameters 0.239874 0.160814 657.766227 \n", "MultVAE with optimized parameters 0.236728 0.121098 45.044486 \n", "ADMM SLIM 0.216480 0.221984 56.977886 \n", "NeuroMF with default parameters 0.198796 0.222385 283.866455 \n", "Word2Vec with default parameters 0.139835 0.237858 161.589033 \n", "Word2Vec with optimized parameters 0.139835 0.237858 108.236670 \n", "Wilson Recommender 0.092121 0.262190 16.660789 \n", "MultVAE with default parameters 0.075648 0.255512 37.993317 \n", "Random Recommender (popularity-based) 0.066665 0.344784 10.100435 \n", "Random Recommender (uniform) 0.021725 0.538677 11.719672 \n", "Explicit ALS 0.017995 0.540517 50.138072 \n", "\n", " params \n", "SLIM {'beta': 4.65156643702147, 'lambda_': 0.000203... \n", "LightFM {'loss': 'warp', 'no_components': 9} \n", "KNN {'num_neighbours': 75, 'shrink': 78} \n", "Implicit ALS {'rank': 8} \n", "LightFM with item features {'loss': 'warp', 'no_components': 16} \n", "Popular Recommender NaN \n", "NeuroMF with optimized parameters {'embedding_gmf_dim': 128, 'embedding_mlp_dim'... \n", "MultVAE with optimized parameters {'learning_rate': 0.015677916317796903, 'epoch... \n", "ADMM SLIM {'lambda_1': 0.8417364694294401, 'lambda_2': 6... \n", "NeuroMF with default parameters NaN \n", "Word2Vec with default parameters NaN \n", "Word2Vec with optimized parameters {'rank': 100, 'window_size': 1, 'use_idf': False} \n", "Wilson Recommender NaN \n", "MultVAE with default parameters NaN \n", "Random Recommender (popularity-based) {'distribution': 'popular_based', 'alpha': 24.... 
\n", "Random Recommender (uniform) NaN \n", "Explicit ALS {'rank': 32} " ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "e.results.sort_values('NDCG@10', ascending=False)" ] }, { "cell_type": "code", "execution_count": 45, "metadata": {}, "outputs": [], "source": [ "e.results.to_csv('res_25_rel_1.csv')" ] }, { "cell_type": "code", "execution_count": 48, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\"><th></th><th>Coverage@10</th><th>HitRate@1</th><th>HitRate@5</th><th>HitRate@10</th><th>MAP@10</th><th>MRR@10</th><th>NDCG@10</th><th>Surprisal@10</th><th>fit_pred_time</th><th>params</th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>SLIM</th><td>0.063323</td><td>0.324846</td><td>0.583845</td><td>0.693591</td><td>0.176329</td><td>0.434489</td><td>0.270859</td><td>0.133323</td><td>41.092457</td><td>{'beta': 4.65156643702147, 'lambda_': 0.000203...</td></tr>\n",
" <tr><th>LightFM</th><td>0.156346</td><td>0.324846</td><td>0.581212</td><td>0.694469</td><td>0.170251</td><td>0.436674</td><td>0.267207</td><td>0.169012</td><td>28.989394</td><td>{'loss': 'warp', 'no_components': 9}</td></tr>\n",
" <tr><th>KNN</th><td>0.054077</td><td>0.302897</td><td>0.556629</td><td>0.648815</td><td>0.168665</td><td>0.412565</td><td>0.258407</td><td>0.138554</td><td>36.018561</td><td>{'num_neighbours': 75, 'shrink': 78}</td></tr>\n",
" <tr><th>ALS (Implicit)</th><td>0.131129</td><td>0.292362</td><td>0.562774</td><td>0.681299</td><td>0.162140</td><td>0.406855</td><td>0.253444</td><td>0.163824</td><td>32.185843</td><td>{'rank': 8}</td></tr>\n",
" <tr><th>LightFM (w/ feats)</th><td>0.096105</td><td>0.279192</td><td>0.567164</td><td>0.686567</td><td>0.158074</td><td>0.403145</td><td>0.250395</td><td>0.152727</td><td>59.903357</td><td>{'loss': 'warp', 'no_components': 16}</td></tr>\n",
" <tr><th>PopRec</th><td>0.033903</td><td>0.284460</td><td>0.530290</td><td>0.645303</td><td>0.157301</td><td>0.390426</td><td>0.243783</td><td>0.118354</td><td>16.931190</td><td>NaN</td></tr>\n",
" <tr><th>MultVAE</th><td>0.034744</td><td>0.273047</td><td>0.524144</td><td>0.643547</td><td>0.151765</td><td>0.378478</td><td>0.236728</td><td>0.121098</td><td>45.044486</td><td>{'learning_rate': 0.015677916317796903, 'epoch...</td></tr>\n",
" <tr><th>ADMM SLIM</th><td>0.348837</td><td>0.258121</td><td>0.541703</td><td>0.647937</td><td>0.127043</td><td>0.373958</td><td>0.216480</td><td>0.221984</td><td>56.977886</td><td>{'lambda_1': 0.8417364694294401, 'lambda_2': 6...</td></tr>\n",
" <tr><th>NeuroMF</th><td>0.261138</td><td>0.216857</td><td>0.492537</td><td>0.622476</td><td>0.113413</td><td>0.336243</td><td>0.198796</td><td>0.222385</td><td>283.866455</td><td>NaN</td></tr>\n",
" <tr><th>Word2Vec</th><td>0.139255</td><td>0.147498</td><td>0.383670</td><td>0.500439</td><td>0.074579</td><td>0.247189</td><td>0.139835</td><td>0.237858</td><td>161.589033</td><td>NaN</td></tr>\n",
" <tr><th>Wilson</th><td>0.017092</td><td>0.083406</td><td>0.345040</td><td>0.414399</td><td>0.045002</td><td>0.180976</td><td>0.092121</td><td>0.262190</td><td>16.660789</td><td>NaN</td></tr>\n",
" <tr><th>RandomRec (popular)</th><td>0.760437</td><td>0.071115</td><td>0.261633</td><td>0.378402</td><td>0.027026</td><td>0.150434</td><td>0.066665</td><td>0.344784</td><td>10.100435</td><td>{'distribution': 'popular_based', 'alpha': 24....</td></tr>\n",
" <tr><th>RandomRec (uniform)</th><td>0.957691</td><td>0.017559</td><td>0.100088</td><td>0.167691</td><td>0.007332</td><td>0.054846</td><td>0.021725</td><td>0.538677</td><td>11.719672</td><td>NaN</td></tr>\n",
" <tr><th>ALS (Explicit)</th><td>0.569908</td><td>0.017559</td><td>0.070237</td><td>0.124671</td><td>0.006534</td><td>0.041331</td><td>0.017995</td><td>0.540517</td><td>50.138072</td><td>{'rank': 32}</td></tr>\n",
" </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " Coverage@10 HitRate@1 HitRate@5 HitRate@10 MAP@10 \\\n", "SLIM 0.063323 0.324846 0.583845 0.693591 0.176329 \n", "LightFM 0.156346 0.324846 0.581212 0.694469 0.170251 \n", "KNN 0.054077 0.302897 0.556629 0.648815 0.168665 \n", "ALS (Implicit) 0.131129 0.292362 0.562774 0.681299 0.162140 \n", "LightFM (w/ feats) 0.096105 0.279192 0.567164 0.686567 0.158074 \n", "PopRec 0.033903 0.284460 0.530290 0.645303 0.157301 \n", "MultVAE 0.034744 0.273047 0.524144 0.643547 0.151765 \n", "ADMM SLIM 0.348837 0.258121 0.541703 0.647937 0.127043 \n", "NeuroMF 0.261138 0.216857 0.492537 0.622476 0.113413 \n", "Word2Vec 0.139255 0.147498 0.383670 0.500439 0.074579 \n", "Wilson 0.017092 0.083406 0.345040 0.414399 0.045002 \n", "RandomRec (popular) 0.760437 0.071115 0.261633 0.378402 0.027026 \n", "RandomRec (uniform) 0.957691 0.017559 0.100088 0.167691 0.007332 \n", "ALS (Explicit) 0.569908 0.017559 0.070237 0.124671 0.006534 \n", "\n", " MRR@10 NDCG@10 Surprisal@10 fit_pred_time \\\n", "SLIM 0.434489 0.270859 0.133323 41.092457 \n", "LightFM 0.436674 0.267207 0.169012 28.989394 \n", "KNN 0.412565 0.258407 0.138554 36.018561 \n", "ALS (Implicit) 0.406855 0.253444 0.163824 32.185843 \n", "LightFM (w/ feats) 0.403145 0.250395 0.152727 59.903357 \n", "PopRec 0.390426 0.243783 0.118354 16.931190 \n", "MultVAE 0.378478 0.236728 0.121098 45.044486 \n", "ADMM SLIM 0.373958 0.216480 0.221984 56.977886 \n", "NeuroMF 0.336243 0.198796 0.222385 283.866455 \n", "Word2Vec 0.247189 0.139835 0.237858 161.589033 \n", "Wilson 0.180976 0.092121 0.262190 16.660789 \n", "RandomRec (popular) 0.150434 0.066665 0.344784 10.100435 \n", "RandomRec (uniform) 0.054846 0.021725 0.538677 11.719672 \n", "ALS (Explicit) 0.041331 0.017995 0.540517 50.138072 \n", "\n", " params \n", "SLIM {'beta': 4.65156643702147, 'lambda_': 0.000203... 
\n", "LightFM {'loss': 'warp', 'no_components': 9} \n", "KNN {'num_neighbours': 75, 'shrink': 78} \n", "ALS (Implicit) {'rank': 8} \n", "LightFM (w/ feats) {'loss': 'warp', 'no_components': 16} \n", "PopRec NaN \n", "MultVAE {'learning_rate': 0.015677916317796903, 'epoch... \n", "ADMM SLIM {'lambda_1': 0.8417364694294401, 'lambda_2': 6... \n", "NeuroMF NaN \n", "Word2Vec NaN \n", "Wilson NaN \n", "RandomRec (popular) {'distribution': 'popular_based', 'alpha': 24.... \n", "RandomRec (uniform) NaN \n", "ALS (Explicit) {'rank': 32} " ] }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = e.results.drop([\n", " 'NeuroMF with optimized parameters', \n", " 'MultVAE with default parameters', \n", " 'Word2Vec with optimized parameters'\n", "]).rename(\n", " index={\n", " 'Popular Recommender': 'PopRec', \n", " 'Random Recommender (uniform)': 'RandomRec (uniform)', \n", " 'Random Recommender (popularity-based)': 'RandomRec (popular)',\n", " 'Wilson Recommender': 'Wilson', 'Implicit ALS': 'ALS (Implicit)', 'Explicit ALS': 'ALS (Explicit)',\n", " 'NeuroMF with default parameters': 'NeuroMF', 'MultVAE with optimized parameters': 'MultVAE',\n", " 'Word2Vec with default parameters': 'Word2Vec', 'LightFM with item features': 'LightFM (w/ feats)'\n", " }).sort_values('NDCG@10', ascending=False)\n", "df" ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [], "source": [ "df.index.name = 'Model'" ] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [], "source": [ "df = df.round(3)[['HitRate@10', 'MAP@10', 'MRR@10', 'NDCG@10', 'Coverage@10', 'Surprisal@10', 'fit_pred_time']]\n", "df = df.rename(columns={'HitRate@10': 'HitRate', 'MAP@10': 'MAP', 'MRR@10': 'MRR',\n", " 'NDCG@10': 'NDCG', 'Coverage@10': 'Coverage', \n", " 'Surprisal@10': 'Surprisal'})\n", "df.to_csv('res_1m.csv')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 3. 
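Results" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The best quality at low fit-and-predict time was shown by commonly used models such as SLIM, LightFM and implicit ALS (NDCG@10 of 0.271, 0.267 and 0.253, with fit_pred_time between 29.0 and 41.1). Explicit ALS, by contrast, ranks last on NDCG@10 (0.018)." ] }, { "cell_type": "code", "execution_count": 51, "metadata": {}, "outputs": [ { "data": { "text/html": [ "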
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\"><th></th><th>HitRate</th><th>MAP</th><th>MRR</th><th>NDCG</th><th>Coverage</th><th>Surprisal</th><th>fit_pred_time</th></tr>\n",
" <tr><th>Model</th><th></th><th></th><th></th><th></th><th></th><th></th><th></th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>SLIM</th><td>0.694</td><td>0.176</td><td>0.434</td><td>0.271</td><td>0.063</td><td>0.133</td><td>41.092</td></tr>\n",
" <tr><th>LightFM</th><td>0.694</td><td>0.170</td><td>0.437</td><td>0.267</td><td>0.156</td><td>0.169</td><td>28.989</td></tr>\n",
" <tr><th>KNN</th><td>0.649</td><td>0.169</td><td>0.413</td><td>0.258</td><td>0.054</td><td>0.139</td><td>36.019</td></tr>\n",
" <tr><th>ALS (Implicit)</th><td>0.681</td><td>0.162</td><td>0.407</td><td>0.253</td><td>0.131</td><td>0.164</td><td>32.186</td></tr>\n",
" <tr><th>LightFM (w/ feats)</th><td>0.687</td><td>0.158</td><td>0.403</td><td>0.250</td><td>0.096</td><td>0.153</td><td>59.903</td></tr>\n",
" </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " HitRate MAP MRR NDCG Coverage Surprisal \\\n", "Model \n", "SLIM 0.694 0.176 0.434 0.271 0.063 0.133 \n", "LightFM 0.694 0.170 0.437 0.267 0.156 0.169 \n", "KNN 0.649 0.169 0.413 0.258 0.054 0.139 \n", "ALS (Implicit) 0.681 0.162 0.407 0.253 0.131 0.164 \n", "LightFM (w/ feats) 0.687 0.158 0.403 0.250 0.096 0.153 \n", "\n", " fit_pred_time \n", "Model \n", "SLIM 41.092 \n", "LightFM 28.989 \n", "KNN 36.019 \n", "ALS (Implicit) 32.186 \n", "LightFM (w/ feats) 59.903 " ] }, "execution_count": 51, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.head()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.11" }, "name": "movielens_nmf.ipynb", "pycharm": { "stem_cell": { "cell_type": "raw", "metadata": { "collapsed": false }, "source": [ "null" ] } } }, "nbformat": 4, "nbformat_minor": 4 }