{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Implementation of the language models" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [], "source": [ "from fastai.gen_doc.nbdoc import *\n", "from fastai.text import * \n", "from fastai.text.models import * " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[`text.models`](/text.models.html#text.models) module fully implements the encoder for an [AWD-LSTM](https://arxiv.org/pdf/1708.02182.pdf), the [transformer model](https://arxiv.org/abs/1706.03762) and the [transformer XL model](https://arxiv.org/abs/1901.02860). They can then plugged in with a decoder to make a language model, or some classifying layers to make a text classifier." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Language model modules" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": false }, "outputs": [ { "data": { "text/markdown": [ "
### class `AWD_LSTM`

(**`vocab_sz`**:`int`, **`emb_sz`**:`int`, **`n_hid`**:`int`, **`n_layers`**:`int`, **`pad_token`**:`int`=***`1`***, **`hidden_p`**:`float`=***`0.2`***, **`input_p`**:`float`=***`0.6`***, **`embed_p`**:`float`=***`0.1`***, **`weight_p`**:`float`=***`0.5`***, **`qrnn`**:`bool`=***`False`***, **`bidir`**:`bool`=***`False`***) :: [`Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module)

AWD-LSTM/QRNN inspired by https://arxiv.org/abs/1708.02182.
#### `reset`

`reset()`

Reset the hidden states.
### class `Transformer`

(**`vocab_sz`**:`int`, **`ctx_len`**:`int`, **`n_layers`**:`int`, **`n_heads`**:`int`, **`d_model`**:`int`, **`d_head`**:`int`, **`d_inner`**:`int`, **`resid_p`**:`float`=***`0.0`***, **`attn_p`**:`float`=***`0.0`***, **`ff_p`**:`float`=***`0.0`***, **`embed_p`**:`float`=***`0.0`***, **`bias`**:`bool`=***`True`***, **`scale`**:`bool`=***`True`***, **`act`**:[`Activation`](/text.models.transformer.html#Activation)=***`...`***) :: [`Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module)

Transformer model: https://arxiv.org/abs/1706.03762.

### class `TransformerXL`

(**`vocab_sz`**:`int`, **`ctx_len`**:`int`, **`n_layers`**:`int`, **`n_heads`**:`int`, **`d_model`**:`int`, **`d_head`**:`int`, **`d_inner`**:`int`, **`resid_p`**:`float`=***`0.0`***, **`attn_p`**:`float`=***`0.0`***, **`ff_p`**:`float`=***`0.0`***, **`embed_p`**:`float`=***`0.0`***, **`bias`**:`bool`=***`False`***, **`scale`**:`bool`=***`True`***, **`act`**:[`Activation`](/text.models.transformer.html#Activation)=***`...`***) :: [`Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module)

Transformer XL model: https://arxiv.org/abs/1901.02860.

#### `reset`

`reset()`

Reset the internal memory.
### class `LinearDecoder`

(**`n_out`**:`int`, **`n_hid`**:`int`, **`output_p`**:`float`, **`tie_encoder`**:[`Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module)=***`None`***, **`bias`**:`bool`=***`True`***) :: [`Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module)

Goes on top of an RNNCore module to create a language model.
### class `PoolingLinearClassifier`

(**`layers`**:`Collection`\[`int`\], **`drops`**:`Collection`\[`float`\]) :: [`Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module)

Create a linear classifier with pooling.
#### `pool`

(**`x`**:`Tensor`, **`bs`**:`int`, **`is_max`**:`bool`)

Pool the tensor along the seq_len dimension.
### class `EmbeddingDropout`

(**`emb`**:[`Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module), **`embed_p`**:`float`) :: [`Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module)

Apply dropout with probability `embed_p` to an embedding layer `emb`.
### class `RNNDropout`

(**`p`**:`float`=***`0.5`***) :: [`Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module)

Dropout with probability `p` that is consistent on the seq_len dimension.
### class `WeightDropout`

(**`module`**:[`Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module), **`weight_p`**:`float`, **`layer_names`**:`StrList`=***`['weight_hh_l0']`***) :: [`Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module)

A module that wraps another layer in which some weights will be replaced by 0 during training.
### class `PositionalEncoding`

(**`d`**:`int`) :: [`Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module)

Encode the position with a sinusoid.
### class `DecoderLayer`

(**`n_heads`**:`int`, **`d_model`**:`int`, **`d_head`**:`int`, **`d_inner`**:`int`, **`resid_p`**:`float`=***`0.0`***, **`attn_p`**:`float`=***`0.0`***, **`ff_p`**:`float`=***`0.0`***, **`bias`**:`bool`=***`True`***, **`scale`**:`bool`=***`True`***, **`act`**:[`Activation`](/text.models.transformer.html#Activation)=***`...`***) :: [`Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module)

Basic block of a transformer model: a multi-head attention layer followed by a feed-forward block.

### class `MultiHeadAttention`

(**`n_heads`**:`int`, **`d_model`**:`int`, **`d_head`**:`int`=***`None`***, **`resid_p`**:`float`=***`0.0`***, **`attn_p`**:`float`=***`0.0`***, **`bias`**:`bool`=***`True`***, **`scale`**:`bool`=***`True`***) :: [`Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module)

Multi-head attention.
### class `MultiHeadRelativeAttention`

(**`n_heads`**:`int`, **`d_model`**:`int`, **`d_head`**:`int`, **`resid_p`**:`float`=***`0.0`***, **`attn_p`**:`float`=***`0.0`***, **`bias`**:`bool`=***`True`***, **`scale`**:`bool`=***`True`***) :: [`MultiHeadAttention`](/text.models.transformer.html#MultiHeadAttention)

Multi-head attention with relative positional encoding.
### class `SequentialRNN`

(**\*`args`**) :: [`Sequential`](https://pytorch.org/docs/stable/nn.html#torch.nn.Sequential)

A sequential module that passes the reset call to its children.
#### `reset`

`reset()`

Call `reset` on each child module that has one.
### `dropout_mask`

(**`x`**:`Tensor`, **`sz`**:`Collection`\[`int`\], **`p`**:`float`)

Return a dropout mask of the same type as `x`, size `sz`, with probability `p` to cancel an element.
### `feed_forward`

(**`d_model`**:`int`, **`d_ff`**:`int`, **`ff_p`**:`float`=***`0.0`***, **`act`**:[`Activation`](/text.models.transformer.html#Activation)=***`...`***)

Create the position-wise feed-forward block of a transformer layer.

## Undocumented methods

The remaining entries override [`Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module) methods and define the computation performed at every call:

- `forward(*args:ArgStar)`
- `forward(words:LongTensor, scale:Optional[float]=None) → Tensor`
- `forward(x:Tensor) → Tensor`
- `forward(input:Tuple[Tensor, Tensor]) → Tuple[Tensor, Tensor, Tensor]`
- `reset()`

Although the forward pass must be defined inside `forward`, you should call the `Module` instance itself rather than `forward` directly, since the former takes care of running the registered hooks while the latter silently ignores them.