## Collaborative filtering

In [None]:
from fastai.gen_doc.nbdoc import *

This package contains all the necessary functions to quickly train a model for a collaborative filtering task. Let's start by importing all we'll need.

In [None]:
from fastai.collab import * 

## Overview

Collaborative filtering is the task of predicting how much a user is going to like a certain item. The fastai library contains a [`CollabDataBunch`](/collab.html#CollabDataBunch) class that will help you create datasets suitable for training, and a function [`collab_learner`](/collab.html#collab_learner) to build a simple model directly from a ratings table. Let's first see how we can get started before delving into the documentation.

For this example, we'll use a small subset of the [MovieLens](https://grouplens.org/datasets/movielens/) dataset to predict the rating a user would give a particular movie (from 0 to 5). The dataset comes in the form of a CSV file where each line is a rating of a movie by a given person.

In [None]:
path = untar_data(URLs.ML_SAMPLE)
ratings = pd.read_csv(path/'ratings.csv')
ratings.head()

| | userId | movieId | rating | timestamp |
|---|---|---|---|---|
| 0 | 73 | 1097 | 4.0 | 1255504951 |
| 1 | 561 | 924 | 3.5 | 1172695223 |
| 2 | 157 | 260 | 3.5 | 1291598691 |
| 3 | 358 | 1210 | 5.0 | 957481884 |
| 4 | 130 | 316 | 2.0 | 1138999234 |


We'll first turn the `userId` and `movieId` columns into category codes, so that we can replace them with their codes when it's time to feed them to an `Embedding` layer. This step would be even more important if our CSV had names of users or names of items in it. To do it, we simply have to call a [`CollabDataBunch`](/collab.html#CollabDataBunch) factory method.

In [None]:
data = CollabDataBunch.from_df(ratings)
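
To make the idea of category codes concrete, here is a minimal plain-Python sketch of mapping raw IDs to contiguous integer codes. It is not fastai's implementation: `encode_ids` is a hypothetical helper, and the library's actual numbering may differ (for instance by reserving a code for unknown values).

```python
# Illustrative only: a plain-Python stand-in for the ID -> category code step.
# `encode_ids` is a hypothetical helper, not part of fastai.

def encode_ids(raw_ids):
    """Map each distinct raw ID to a contiguous integer code (sorted order)."""
    classes = sorted(set(raw_ids))  # distinct IDs in a stable order
    code_of = {raw: code for code, raw in enumerate(classes)}
    return [code_of[raw] for raw in raw_ids], classes

# userId values from the first rows of ratings.csv shown above
user_ids = [73, 561, 157, 358, 130]
codes, classes = encode_ids(user_ids)
print(codes)    # [0, 4, 2, 3, 1]
print(classes)  # [73, 130, 157, 358, 561]
```

The codes are small consecutive integers, which is what lets them be used as row indices into an embedding table.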

Now that this step is done, we can directly create a [`Learner`](/basic_train.html#Learner) object:

In [None]:
learn = collab_learner(data, n_factors=50, y_range=(0.,5.))

And then immediately begin training:

In [None]:
learn.fit_one_cycle(5, 5e-3, wd=0.1)

| epoch | train_loss | valid_loss |
|---|---|---|
| 1 | 2.427430 | 1.999472 |
| 2 | 1.116335 | 0.663345 |
| 3 | 0.736155 | 0.636640 |
| 4 | 0.612827 | 0.626773 |
| 5 | 0.565003 | 0.626336 |


In [None]:
show_doc(CollabDataBunch)

<h2 id="CollabDataBunch" class="doc_header"><code>class</code> <code>CollabDataBunch</code><a href="https://github.com/fastai/fastai/blob/master/fastai/collab.py#L50" class="source_link" style="float:right">[source]</a><a class="source_link" data-toggle="collapse" data-target="#CollabDataBunch-pytest" style="float:right; padding-right:10px">[test]</a></h2>

> <code>CollabDataBunch</code>(**`train_dl`**:[`DataLoader`](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader), **`valid_dl`**:[`DataLoader`](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader), **`fix_dl`**:[`DataLoader`](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader)=***`None`***, **`test_dl`**:`Optional`\[[`DataLoader`](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader)\]=***`None`***, **`device`**:[`device`](https://pytorch.org/docs/stable/tensor_attributes.html#torch-device)=***`None`***, **`dl_tfms`**:`Optional`\[`Collection`\[`Callable`\]\]=***`None`***, **`path`**:`PathOrStr`=***`'.'`***, **`collate_fn`**:`Callable`=***`'data_collate'`***, **`no_check`**:`bool`=***`False`***) :: [`DataBunch`](/basic_data.html#DataBunch)

<div class="collapse" id="CollabDataBunch-pytest"><div class="card card-body pytest_card"><a type="button" data-toggle="collapse" data-target="#CollabDataBunch-pytest" class="close" aria-label="Close"><span aria-hidden="true">&times;</span></a><p>Tests found for <code>CollabDataBunch</code>:</p><p>Some other tests where <code>CollabDataBunch</code> is used:</p><ul><li><code>pytest -sv tests/test_collab_train.py::test_val_loss</code> <a href="https://github.com/fastai/fastai/blob/master/tests/test_collab_train.py#L16" class="source_link" style="float:right">[source]</a></li></ul><p>To run tests please refer to this <a href="/dev/test.html#quick-guide">guide</a>.</p></div></div>

Base [`DataBunch`](/basic_data.html#DataBunch) for collaborative filtering.  

The init function shouldn't be called directly (as it's the one of a basic [`DataBunch`](/basic_data.html#DataBunch)); instead, you'll want to use the following factory method.

In [None]:
show_doc(CollabDataBunch.from_df)

<h4 id="CollabDataBunch.from_df" class="doc_header"><code>from_df</code><a href="https://github.com/fastai/fastai/blob/master/fastai/collab.py#L52" class="source_link" style="float:right">[source]</a><a class="source_link" data-toggle="collapse" data-target="#CollabDataBunch-from_df-pytest" style="float:right; padding-right:10px">[test]</a></h4>

> <code>from_df</code>(**`ratings`**:`DataFrame`, **`valid_pct`**:`float`=***`0.2`***, **`user_name`**:`Optional`\[`str`\]=***`None`***, **`item_name`**:`Optional`\[`str`\]=***`None`***, **`rating_name`**:`Optional`\[`str`\]=***`None`***, **`test`**:`DataFrame`=***`None`***, **`seed`**:`int`=***`None`***, **`path`**:`PathOrStr`=***`'.'`***, **`bs`**:`int`=***`64`***, **`val_bs`**:`int`=***`None`***, **`num_workers`**:`int`=***`16`***, **`dl_tfms`**:`Optional`\[`Collection`\[`Callable`\]\]=***`None`***, **`device`**:[`device`](https://pytorch.org/docs/stable/tensor_attributes.html#torch-device)=***`None`***, **`collate_fn`**:`Callable`=***`'data_collate'`***, **`no_check`**:`bool`=***`False`***) → `CollabDataBunch`

<div class="collapse" id="CollabDataBunch-from_df-pytest"><div class="card card-body pytest_card"><a type="button" data-toggle="collapse" data-target="#CollabDataBunch-from_df-pytest" class="close" aria-label="Close"><span aria-hidden="true">&times;</span></a><p>Tests found for <code>from_df</code>:</p><ul><li><code>pytest -sv tests/test_collab_train.py::test_val_loss</code> <a href="https://github.com/fastai/fastai/blob/master/tests/test_collab_train.py#L16" class="source_link" style="float:right">[source]</a></li></ul><p>To run tests please refer to this <a href="/dev/test.html#quick-guide">guide</a>.</p></div></div>

Create a [`DataBunch`](/basic_data.html#DataBunch) suitable for collaborative filtering from `ratings`.  

Takes a `ratings` dataframe and splits it randomly into training and validation sets following `valid_pct` (unless it's None). `user_name`, `item_name` and `rating_name` give the names of the corresponding columns (defaulting to the first, second and third columns). Optionally, a `test` dataframe can be passed, along with a `seed` for the split between training and validation sets. The remaining keyword arguments will be passed to [`DataBunch.create`](/basic_data.html#DataBunch.create).
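
As a rough illustration of what a seeded random split by `valid_pct` means, here is a sketch in plain Python. The helper `split_indices` is hypothetical and only mimics the idea; fastai's own splitting code may round or shuffle differently.

```python
import random

# Hypothetical helper illustrating a seeded random train/validation split.
def split_indices(n_rows, valid_pct=0.2, seed=None):
    """Return (train_idx, valid_idx) for a random split of range(n_rows)."""
    rng = random.Random(seed)          # local RNG: the same seed gives the same split
    idx = list(range(n_rows))
    rng.shuffle(idx)
    n_valid = int(n_rows * valid_pct)  # number of rows held out for validation
    return idx[n_valid:], idx[:n_valid]

train_idx, valid_idx = split_indices(10, valid_pct=0.2, seed=42)
print(len(train_idx), len(valid_idx))  # 8 2
```

Passing the same `seed` again reproduces the same split, which is useful when comparing models on an identical validation set.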

## Model and [`Learner`](/basic_train.html#Learner)

In [None]:
show_doc(CollabLearner, title_level=3)

<h3 id="CollabLearner" class="doc_header"><code>class</code> <code>CollabLearner</code><a href="https://github.com/fastai/fastai/blob/master/fastai/collab.py#L68" class="source_link" style="float:right">[source]</a><a class="source_link" data-toggle="collapse" data-target="#CollabLearner-pytest" style="float:right; padding-right:10px">[test]</a></h3>

> <code>CollabLearner</code>(**`data`**:[`DataBunch`](/basic_data.html#DataBunch), **`model`**:[`Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module), **`opt_func`**:`Callable`=***`'Adam'`***, **`loss_func`**:`Callable`=***`None`***, **`metrics`**:`Collection`\[`Callable`\]=***`None`***, **`true_wd`**:`bool`=***`True`***, **`bn_wd`**:`bool`=***`True`***, **`wd`**:`Floats`=***`0.01`***, **`train_bn`**:`bool`=***`True`***, **`path`**:`str`=***`None`***, **`model_dir`**:`PathOrStr`=***`'models'`***, **`callback_fns`**:`Collection`\[`Callable`\]=***`None`***, **`callbacks`**:`Collection`\[[`Callback`](/callback.html#Callback)\]=***`<factory>`***, **`layer_groups`**:`ModuleList`=***`None`***, **`add_time`**:`bool`=***`True`***, **`silent`**:`bool`=***`None`***) :: [`Learner`](/basic_train.html#Learner)

<div class="collapse" id="CollabLearner-pytest"><div class="card card-body pytest_card"><a type="button" data-toggle="collapse" data-target="#CollabLearner-pytest" class="close" aria-label="Close"><span aria-hidden="true">&times;</span></a><p>No tests found for <code>CollabLearner</code>. To contribute a test please refer to <a href="/dev/test.html">this guide</a> and <a href="https://forums.fast.ai/t/improving-expanding-functional-tests/32929">this discussion</a>.</p></div></div>

[`Learner`](/basic_train.html#Learner) suitable for collaborative filtering.  

This is a subclass of [`Learner`](/basic_train.html#Learner) that just introduces helper functions to analyze results; the initialization is the same as for a regular [`Learner`](/basic_train.html#Learner).

In [None]:
show_doc(CollabLearner.bias)

<h4 id="CollabLearner.bias" class="doc_header"><code>bias</code><a href="https://github.com/fastai/fastai/blob/master/fastai/collab.py#L82" class="source_link" style="float:right">[source]</a><a class="source_link" data-toggle="collapse" data-target="#CollabLearner-bias-pytest" style="float:right; padding-right:10px">[test]</a></h4>

> <code>bias</code>(**`arr`**:`Collection`\[`T_co`\], **`is_item`**:`bool`=***`True`***)

<div class="collapse" id="CollabLearner-bias-pytest"><div class="card card-body pytest_card"><a type="button" data-toggle="collapse" data-target="#CollabLearner-bias-pytest" class="close" aria-label="Close"><span aria-hidden="true">&times;</span></a><p>No tests found for <code>bias</code>. To contribute a test please refer to <a href="/dev/test.html">this guide</a> and <a href="https://forums.fast.ai/t/improving-expanding-functional-tests/32929">this discussion</a>.</p></div></div>

Bias for item or user (based on `is_item`) for all in `arr`. (Set model to `cpu` and no grad.)  

In [None]:
show_doc(CollabLearner.get_idx)

<h4 id="CollabLearner.get_idx" class="doc_header"><code>get_idx</code><a href="https://github.com/fastai/fastai/blob/master/fastai/collab.py#L70" class="source_link" style="float:right">[source]</a><a class="source_link" data-toggle="collapse" data-target="#CollabLearner-get_idx-pytest" style="float:right; padding-right:10px">[test]</a></h4>

> <code>get_idx</code>(**`arr`**:`Collection`\[`T_co`\], **`is_item`**:`bool`=***`True`***)

<div class="collapse" id="CollabLearner-get_idx-pytest"><div class="card card-body pytest_card"><a type="button" data-toggle="collapse" data-target="#CollabLearner-get_idx-pytest" class="close" aria-label="Close"><span aria-hidden="true">&times;</span></a><p>No tests found for <code>get_idx</code>. To contribute a test please refer to <a href="/dev/test.html">this guide</a> and <a href="https://forums.fast.ai/t/improving-expanding-functional-tests/32929">this discussion</a>.</p></div></div>

Fetch item or user (based on `is_item`) for all in `arr`. (Set model to `cpu` and no grad.)  

In [None]:
show_doc(CollabLearner.weight)

<h4 id="CollabLearner.weight" class="doc_header"><code>weight</code><a href="https://github.com/fastai/fastai/blob/master/fastai/collab.py#L89" class="source_link" style="float:right">[source]</a><a class="source_link" data-toggle="collapse" data-target="#CollabLearner-weight-pytest" style="float:right; padding-right:10px">[test]</a></h4>

> <code>weight</code>(**`arr`**:`Collection`\[`T_co`\], **`is_item`**:`bool`=***`True`***)

<div class="collapse" id="CollabLearner-weight-pytest"><div class="card card-body pytest_card"><a type="button" data-toggle="collapse" data-target="#CollabLearner-weight-pytest" class="close" aria-label="Close"><span aria-hidden="true">&times;</span></a><p>No tests found for <code>weight</code>. To contribute a test please refer to <a href="/dev/test.html">this guide</a> and <a href="https://forums.fast.ai/t/improving-expanding-functional-tests/32929">this discussion</a>.</p></div></div>

Weight for item or user (based on `is_item`) for all in `arr`. (Set model to `cpu` and no grad.)  
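
The three helpers above can be pictured as a vocabulary lookup followed by indexing into the model's parameter tables. The sketch below assumes exactly that and uses hypothetical stand-in data (`classes`, `item_bias`, `item_weight`); the real methods operate on the trained model's tensors rather than Python lists.

```python
# Hypothetical stand-ins: `classes` plays the role of the item vocabulary,
# `item_bias` / `item_weight` the role of trained parameters (one row per class).
classes = ["Alien", "Brazil", "Casablanca"]
item_bias = [0.30, -0.12, 0.45]
item_weight = [[0.1, 0.2], [0.3, -0.4], [-0.5, 0.6]]

def get_idx(arr):
    """Translate raw item names into their row indices."""
    class_to_idx = {name: i for i, name in enumerate(classes)}
    return [class_to_idx[name] for name in arr]

def bias(arr):
    """Bias of each requested item."""
    return [item_bias[i] for i in get_idx(arr)]

def weight(arr):
    """Latent-factor vector of each requested item."""
    return [item_weight[i] for i in get_idx(arr)]

print(get_idx(["Casablanca", "Alien"]))  # [2, 0]
print(bias(["Casablanca", "Alien"]))     # [0.45, 0.3]
print(weight(["Brazil"]))                # [[0.3, -0.4]]
```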

In [None]:
show_doc(EmbeddingDotBias, title_level=3)

<h3 id="EmbeddingDotBias" class="doc_header"><code>class</code> <code>EmbeddingDotBias</code><a href="https://github.com/fastai/fastai/blob/master/fastai/collab.py#L36" class="source_link" style="float:right">[source]</a><a class="source_link" data-toggle="collapse" data-target="#EmbeddingDotBias-pytest" style="float:right; padding-right:10px">[test]</a></h3>

> <code>EmbeddingDotBias</code>(**`n_factors`**:`int`, **`n_users`**:`int`, **`n_items`**:`int`, **`y_range`**:`Point`=***`None`***) :: [`PrePostInitMeta`](/core.html#PrePostInitMeta) :: [`Module`](/torch_core.html#Module)

<div class="collapse" id="EmbeddingDotBias-pytest"><div class="card card-body pytest_card"><a type="button" data-toggle="collapse" data-target="#EmbeddingDotBias-pytest" class="close" aria-label="Close"><span aria-hidden="true">&times;</span></a><p>No tests found for <code>EmbeddingDotBias</code>. To contribute a test please refer to <a href="/dev/test.html">this guide</a> and <a href="https://forums.fast.ai/t/improving-expanding-functional-tests/32929">this discussion</a>.</p></div></div>

Base dot model for collaborative filtering.  

Creates a simple model with `Embedding` weights and biases for `n_users` and `n_items`, with `n_factors` latent factors. Takes the dot product of the embeddings and adds the bias, then if `y_range` is specified, feeds the result to a sigmoid rescaled to go from `y_range[0]` to `y_range[1]`.
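
The computation described above can be sketched for a single user/item pair in plain Python. This is an illustration under assumptions, not the library's code: `dot_bias_score` is a hypothetical function, the embeddings and biases are made-up numbers, and summing one user bias and one item bias is our reading of "adds the bias".

```python
import math

# Hypothetical single-pair version of "dot product + bias, then scaled sigmoid".
def dot_bias_score(user_vec, item_vec, user_bias, item_bias, y_range=None):
    """Dot product of the factor vectors plus both biases, optionally squashed into y_range."""
    raw = sum(u * i for u, i in zip(user_vec, item_vec)) + user_bias + item_bias
    if y_range is None:
        return raw
    lo, hi = y_range
    return lo + (hi - lo) / (1.0 + math.exp(-raw))  # sigmoid rescaled to (lo, hi)

# Made-up 3-factor embeddings and biases for one user/item pair
score = dot_bias_score([0.5, -1.0, 2.0], [1.0, 0.5, 0.25], 0.1, -0.1, y_range=(0.0, 5.0))
print(round(score, 3))  # 3.112
```

Because the sigmoid only approaches its bounds asymptotically, the prediction always stays strictly inside the interval given by `y_range`.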

In [None]:
show_doc(EmbeddingNN, title_level=3)

<h3 id="EmbeddingNN" class="doc_header"><code>class</code> <code>EmbeddingNN</code><a href="https://github.com/fastai/fastai/blob/master/fastai/collab.py#L26" class="source_link" style="float:right">[source]</a><a class="source_link" data-toggle="collapse" data-target="#EmbeddingNN-pytest" style="float:right; padding-right:10px">[test]</a></h3>

> <code>EmbeddingNN</code>(**`emb_szs`**:`ListSizes`, **`layers`**:`Collection`\[`int`\]=***`None`***, **`ps`**:`Collection`\[`float`\]=***`None`***, **`emb_drop`**:`float`=***`0.0`***, **`y_range`**:`OptRange`=***`None`***, **`use_bn`**:`bool`=***`True`***, **`bn_final`**:`bool`=***`False`***) :: [`PrePostInitMeta`](/core.html#PrePostInitMeta) :: [`TabularModel`](/tabular.models.html#TabularModel)

<div class="collapse" id="EmbeddingNN-pytest"><div class="card card-body pytest_card"><a type="button" data-toggle="collapse" data-target="#EmbeddingNN-pytest" class="close" aria-label="Close"><span aria-hidden="true">&times;</span></a><p>No tests found for <code>EmbeddingNN</code>. To contribute a test please refer to <a href="/dev/test.html">this guide</a> and <a href="https://forums.fast.ai/t/improving-expanding-functional-tests/32929">this discussion</a>.</p></div></div>

Subclass [`TabularModel`](/tabular.models.html#TabularModel) to create a NN suitable for collaborative filtering.  

`emb_szs` will overwrite the default and `kwargs` are passed to [`TabularModel`](/tabular.models.html#TabularModel).

In [None]:
show_doc(collab_learner)

<h4 id="collab_learner" class="doc_header"><code>collab_learner</code><a href="https://github.com/fastai/fastai/blob/master/fastai/collab.py#L96" class="source_link" style="float:right">[source]</a><a class="source_link" data-toggle="collapse" data-target="#collab_learner-pytest" style="float:right; padding-right:10px">[test]</a></h4>

> <code>collab_learner</code>(**`data`**, **`n_factors`**:`int`=***`None`***, **`use_nn`**:`bool`=***`False`***, **`emb_szs`**:`Dict`\[`str`, `int`\]=***`None`***, **`layers`**:`Collection`\[`int`\]=***`None`***, **`ps`**:`Collection`\[`float`\]=***`None`***, **`emb_drop`**:`float`=***`0.0`***, **`y_range`**:`OptRange`=***`None`***, **`use_bn`**:`bool`=***`True`***, **`bn_final`**:`bool`=***`False`***, **\*\*`learn_kwargs`**) → [`Learner`](/basic_train.html#Learner)

<div class="collapse" id="collab_learner-pytest"><div class="card card-body pytest_card"><a type="button" data-toggle="collapse" data-target="#collab_learner-pytest" class="close" aria-label="Close"><span aria-hidden="true">&times;</span></a><p>Tests found for <code>collab_learner</code>:</p><ul><li><code>pytest -sv tests/test_collab_train.py::test_val_loss</code> <a href="https://github.com/fastai/fastai/blob/master/tests/test_collab_train.py#L16" class="source_link" style="float:right">[source]</a></li></ul><p>To run tests please refer to this <a href="/dev/test.html#quick-guide">guide</a>.</p></div></div>

Create a Learner for collaborative filtering on `data`.  

More specifically, binds [`data`](/tabular.data.html#tabular.data) with a model that is either an [`EmbeddingDotBias`](/collab.html#EmbeddingDotBias) with `n_factors` if `use_nn=False` or an [`EmbeddingNN`](/collab.html#EmbeddingNN) with `emb_szs` otherwise. In both cases the numbers of users and items will be inferred from the data. `y_range` can be specified as a keyword argument, and you can pass [`metrics`](/metrics.html#metrics) or `wd` through to the [`Learner`](/basic_train.html#Learner) constructor.

## Links with the Data Block API

In [None]:
show_doc(CollabLine, doc_string=False, title_level=3)

<h3 id="CollabLine" class="doc_header"><code>class</code> <code>CollabLine</code><a href="https://github.com/fastai/fastai/blob/master/fastai/collab.py#L14" class="source_link" style="float:right">[source]</a><a class="source_link" data-toggle="collapse" data-target="#CollabLine-pytest" style="float:right; padding-right:10px">[test]</a></h3>

> <code>CollabLine</code>(**`cats`**, **`conts`**, **`classes`**, **`names`**) :: [`TabularLine`](/tabular.data.html#TabularLine)

<div class="collapse" id="CollabLine-pytest"><div class="card card-body pytest_card"><a type="button" data-toggle="collapse" data-target="#CollabLine-pytest" class="close" aria-label="Close"><span aria-hidden="true">&times;</span></a><p>No tests found for <code>CollabLine</code>. To contribute a test please refer to <a href="/dev/test.html">this guide</a> and <a href="https://forums.fast.ai/t/improving-expanding-functional-tests/32929">this discussion</a>.</p></div></div>

Subclass of [`TabularLine`](/tabular.data.html#TabularLine) for collaborative filtering.

In [None]:
show_doc(CollabList, title_level=3, doc_string=False)

<h3 id="CollabList" class="doc_header"><code>class</code> <code>CollabList</code><a href="https://github.com/fastai/fastai/blob/master/fastai/collab.py#L20" class="source_link" style="float:right">[source]</a><a class="source_link" data-toggle="collapse" data-target="#CollabList-pytest" style="float:right; padding-right:10px">[test]</a></h3>

> <code>CollabList</code>(**`items`**:`Iterator`\[`T_co`\], **`cat_names`**:`OptStrList`=***`None`***, **`cont_names`**:`OptStrList`=***`None`***, **`procs`**=***`None`***, **\*\*`kwargs`**) → `TabularList` :: [`TabularList`](/tabular.data.html#TabularList)

<div class="collapse" id="CollabList-pytest"><div class="card card-body pytest_card"><a type="button" data-toggle="collapse" data-target="#CollabList-pytest" class="close" aria-label="Close"><span aria-hidden="true">&times;</span></a><p>No tests found for <code>CollabList</code>. To contribute a test please refer to <a href="/dev/test.html">this guide</a> and <a href="https://forums.fast.ai/t/improving-expanding-functional-tests/32929">this discussion</a>.</p></div></div>

Subclass of [`TabularList`](/tabular.data.html#TabularList) for collaborative filtering.

## Undocumented Methods - Methods moved below this line will intentionally be hidden

In [None]:
show_doc(EmbeddingDotBias.forward)

<h4 id="EmbeddingDotBias.forward" class="doc_header"><code>forward</code><a href="https://github.com/fastai/fastai/blob/master/fastai/collab.py#L44" class="source_link" style="float:right">[source]</a><a class="source_link" data-toggle="collapse" data-target="#EmbeddingDotBias-forward-pytest" style="float:right; padding-right:10px">[test]</a></h4>

> <code>forward</code>(**`users`**:`LongTensor`, **`items`**:`LongTensor`) → `Tensor`

<div class="collapse" id="EmbeddingDotBias-forward-pytest"><div class="card card-body pytest_card"><a type="button" data-toggle="collapse" data-target="#EmbeddingDotBias-forward-pytest" class="close" aria-label="Close"><span aria-hidden="true">&times;</span></a><p>No tests found for <code>forward</code>. To contribute a test please refer to <a href="/dev/test.html">this guide</a> and <a href="https://forums.fast.ai/t/improving-expanding-functional-tests/32929">this discussion</a>.</p></div></div>

Defines the computation performed at every call. Should be overridden by all subclasses.

Note: although the recipe for the forward pass needs to be defined within this function, one should call the [`Module`](/torch_core.html#Module) instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.

In [None]:
show_doc(CollabList.reconstruct)

<h4 id="CollabList.reconstruct" class="doc_header"><code>reconstruct</code><a href="https://github.com/fastai/fastai/blob/master/fastai/collab.py#L24" class="source_link" style="float:right">[source]</a><a class="source_link" data-toggle="collapse" data-target="#CollabList-reconstruct-pytest" style="float:right; padding-right:10px">[test]</a></h4>

> <code>reconstruct</code>(**`t`**:`Tensor`)

<div class="collapse" id="CollabList-reconstruct-pytest"><div class="card card-body pytest_card"><a type="button" data-toggle="collapse" data-target="#CollabList-reconstruct-pytest" class="close" aria-label="Close"><span aria-hidden="true">&times;</span></a><p>No tests found for <code>reconstruct</code>. To contribute a test please refer to <a href="/dev/test.html">this guide</a> and <a href="https://forums.fast.ai/t/improving-expanding-functional-tests/32929">this discussion</a>.</p></div></div>

Reconstruct one of the underlying items from its data `t`.

In [None]:
show_doc(EmbeddingNN.forward)

<h4 id="EmbeddingNN.forward" class="doc_header"><code>forward</code><a href="https://github.com/fastai/fastai/blob/master/fastai/collab.py#L33" class="source_link" style="float:right">[source]</a><a class="source_link" data-toggle="collapse" data-target="#EmbeddingNN-forward-pytest" style="float:right; padding-right:10px">[test]</a></h4>

> <code>forward</code>(**`users`**:`LongTensor`, **`items`**:`LongTensor`) → `Tensor`

<div class="collapse" id="EmbeddingNN-forward-pytest"><div class="card card-body pytest_card"><a type="button" data-toggle="collapse" data-target="#EmbeddingNN-forward-pytest" class="close" aria-label="Close"><span aria-hidden="true">&times;</span></a><p>No tests found for <code>forward</code>. To contribute a test please refer to <a href="/dev/test.html">this guide</a> and <a href="https://forums.fast.ai/t/improving-expanding-functional-tests/32929">this discussion</a>.</p></div></div>

Defines the computation performed at every call. Should be overridden by all subclasses.

Note: although the recipe for the forward pass needs to be defined within this function, one should call the [`Module`](/torch_core.html#Module) instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.

## New Methods - Please document or move to the undocumented section