{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# (MultiFiT) French Sentiment Classifier on French Amazon Customer Reviews\n",
"### (architecture 4 QRNN with 1550 hidden parameters by layer, SentencePiece tokenizer and hyperparameters from the MultiFiT method)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Author: [Pierre Guillou](https://www.linkedin.com/in/pierreguillou)\n",
"- Date: **edition of October 20, 2019** (initial publication on September 2019)\n",
"- Post in medium: [link](https://medium.com/@pierre_guillou/nlp-fastai-french-language-model-d0e2a9e12cab)\n",
"- Ref: [Fastai v1](https://docs.fast.ai/) (Deep Learning library on PyTorch)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Warning (20/10/2019)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**This notebook is a modified version of the v1 published in September 2019.** Indeed (thanks to [David Vieira](https://medium.com/@davidhsv/ol%C3%A1-pierre-tudo-bom-2bc8ae36dc14)), we noticed that the fine-tuning of the LM and classifier did not use the SentencePiece model and vocab trained for the General Portuguese Language Model ([lm3-portuguese.ipynb](https://github.com/piegu/language-models/blob/master/lm3-portuguese.ipynb)). This was the case of the General French Language Model ([lm3-french.ipynb](https://github.com/piegu/language-models/blob/master/lm3-french.ipynb)), too.\n",
"\n",
"For example, the code used to create the fine-tuned French forward LM was : \n",
"\n",
"```\n",
"data_lm = (TextList.from_df(df_trn_val, path, cols=reviews, \n",
" processor=[OpenFileProcessor(), SPProcessor(max_vocab_sz=15000)])\n",
" .split_by_rand_pct(0.1, seed=42)\n",
" .label_for_lm() \n",
" .databunch(bs=bs, num_workers=1))\n",
"```\n",
"\n",
"It has been corrected by using the [SPProcessor.load()](https://github.com/fastai/fastai/blob/master/fastai/text/data.py#L481) function:\n",
"\n",
"```\n",
"data_lm = (TextList.from_df(df_trn_val, path, cols=reviews, processor=SPProcessor.load(dest))\n",
" .split_by_rand_pct(0.1, seed=42)\n",
" .label_for_lm() \n",
" .databunch(bs=bs, num_workers=1))\n",
"```\n",
" \n",
"Therefore, we retrained the fine-tuned French forward LM and the classifier on Amazon Reviews dataset (see the Results paragraph to get all results).\n",
"\n",
"- **(fine-tuned) Language Model** \n",
" - forward : (accuracy) **42.03%** instead of 37.57% | (perplexity) 17.87 instead of 24.62\n",
" - backward: (accuracy) **46.82%** instead of 43.25% | (perplexity) 18.05 instead of 24.90 \n",
" \n",
"- **(fine-tuned) Text Classifier**\n",
" - **Accuracy** (ensemble) **95.74%** instead of 95.92%\n",
" - **f1 score** (ensemble): **0.9758** instead of 0.9636 \n",
" "
]
},
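{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a reminder of how to read the perplexity figures above: perplexity is commonly defined as the exponential of the mean per-token cross-entropy loss (in nats). The sketch below is an illustration rather than code used for training; the helper names are hypothetical, and it only converts the forward LM perplexities reported above back into loss values.\n",
"\n",
"```python\n",
"import math\n",
"\n",
"def perplexity(cross_entropy):\n",
"    # Perplexity is exp of the mean per-token cross-entropy (natural log).\n",
"    return math.exp(cross_entropy)\n",
"\n",
"def cross_entropy_from_perplexity(ppl):\n",
"    # Inverse mapping: recover the loss from a reported perplexity.\n",
"    return math.log(ppl)\n",
"\n",
"# Forward LM perplexities reported above (before and after the fix).\n",
"old_loss = cross_entropy_from_perplexity(24.62)\n",
"new_loss = cross_entropy_from_perplexity(17.87)\n",
"print(round(old_loss, 3), round(new_loss, 3))\n",
"```\n",
"\n",
"Under this definition, the drop in perplexity from 24.62 to 17.87 corresponds to a lower cross-entropy loss for the forward LM."
]
},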
{
"cell_type": "markdown",
"metadata": {
"heading_collapsed": true
},
"source": [
"## Information"
]
},
{
"cell_type": "markdown",
"metadata": {
"hidden": true
},
"source": [
"### Overview"
]
},
{
"cell_type": "markdown",
"metadata": {
"hidden": true
},
"source": [
"According to this new article \"[MultiFiT: Efficient Multi-lingual Language Model Fine-tuning](https://arxiv.org/abs/1909.04761)\" (September 10, 2019), the QRNN architecture and the SentencePiece tokenizer give better results than AWD-LSTM and the spaCy tokenizer respectively. \n",
"\n",
"Therefore, they have been used in this notebook to **fine-tune a French bidirectional Language Model** by Transfer Learning of a French bidirectional Language Model (with the QRNN architecture and the SentencePiece tokenizer, too) trained on a Wikipedia corpus of 100 millions tokens ([lm2-french.ipynb](https://github.com/piegu/language-models/blob/master/lm2-french.ipynb)). \n",
"\n",
"This French bidirectional Language Model has been **fine-tuned on \"[French Amazon Customer Reviews](https://s3.amazonaws.com/amazon-reviews-pds/readme.html)\"** and **its encoder part has been transfered to a sentiment classifier which has been finally trained on this amazon corpus**.\n",
"\n",
"This process **LM General --> LM fine-tuned --> Classifier fine-tuned** is called [ULMFiT](http://nlp.fast.ai/category/classification.html)."
]
},
{
"cell_type": "markdown",
"metadata": {
"hidden": true
},
"source": [
"### Hyperparameters values"
]
},
{
"cell_type": "markdown",
"metadata": {
"hidden": true
},
"source": [
"The following hyperparameters values given at the end of the MultiFiT article have been used:\n",
"- Language Model\n",
" - (batch size) bs = 50\n",
" - (QRNN) 3 QRNN (default: 3) with 1152 hidden parameters each one (default: 1152) (note: it would have been better to increae to 4 QRNN with 1550 hidden parameters like described in the article)\n",
" - (SentencePiece) vocab of 15000 tokens\n",
" - (dropout) mult_drop = 0\n",
" - (weight decay) wd = 0.01\n",
" - (number of training epochs) 20 epochs\n",
" \n",
"\n",
"- Sentiment Classifier\n",
" - (batch size) bs = 18\n",
" - (SentencePiece) vocab of 15000 tokens\n",
" - (dropout) mult_drop = 0.5\n",
" - (weight decay) wd = 0.01\n",
" - (number of training epochs) 10 epochs\n",
" - (loss) FlattenedLoss of weighted CrossEntropyLoss"
]
},
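{
"cell_type": "markdown",
"metadata": {
"hidden": true
},
"source": [
"To make the weighted CrossEntropyLoss listed above more concrete, here is a minimal pure-Python sketch. It is not the training code of this notebook: the helper names are hypothetical, and the inverse-frequency weighting scheme is an assumption (the weights actually passed to the loss are defined later in the notebook). The class counts are the ones reported in the Results section, and the weighted mean is normalised by the sum of the target weights, as PyTorch does for `CrossEntropyLoss` with `reduction='mean'`.\n",
"\n",
"```python\n",
"import math\n",
"\n",
"def class_weights(counts):\n",
"    # Assumed scheme: weight = total / (n_classes * count), so rare classes weigh more.\n",
"    total = sum(counts)\n",
"    return [total / (len(counts) * c) for c in counts]\n",
"\n",
"def weighted_cross_entropy(probs, targets, weights):\n",
"    # probs: per-example class probabilities; targets: class indices.\n",
"    # Weighted mean of the negative log-likelihoods, normalised by the target weights.\n",
"    num = sum(-weights[t] * math.log(p[t]) for p, t in zip(probs, targets))\n",
"    den = sum(weights[t] for t in targets)\n",
"    return num / den\n",
"\n",
"weights = class_weights([25106, 196350])  # classes ordered [neg, pos]\n",
"probs = [[0.9, 0.1], [0.2, 0.8]]          # toy predictions for 2 reviews\n",
"targets = [0, 1]\n",
"loss = weighted_cross_entropy(probs, targets, weights)\n",
"print(weights, loss)\n",
"```\n",
"\n",
"With such weights, an error on a negative review costs several times more than an error on a positive one, which counteracts the class imbalance."
]
},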
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Results"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Our Bidirectional French LM ([lm3-french.ipynb](https://github.com/piegu/language-models/blob/master/lm3-french.ipynb)) and Sentiment Classifier with a MultiFiT configuration (4 QRNN architecture, 1550 hidden parameters and using the SentencePiecce tokenizer) ([lm3-french-classifier-amazon.ipynb](https://github.com/piegu/language-models/blob/master/lm3-french-classifier-amazon.ipynb)) have better results (accuracy, perplexity and f1) than the 2 others Bidirectional French LM ([lm-french.ipynb](https://github.com/piegu/language-models/blob/master/lm-french.ipynb) and [lm2-french.ipynb](https://github.com/piegu/language-models/blob/master/lm2-french.ipynb)) we have trained before.\n",
"\n",
"This improvement comes from:\n",
"- the deeper complexity of our last Bidirectional French LMs and Sentiment Classifier that use the MultiFiT configuration (4 QRNN architecture with 1550 hidden parameters): 4 layers instead of 3 and each hidden layer has more parameters.\n",
"- the use of the SentencePiecce tokenizer (even with only a vocab of 15 000 tokens instead of 60 000 with the spaCy tokenizer)\n",
"\n",
"**We can conclude that this Bidirectional French LM model using the MultiFiT configuration is a good model to perform text classification but with only 46 millions of parameters, it is far from being a LM that can gan compete with [GPT-2](https://openai.com/blog/better-language-models/) or [BERT](https://arxiv.org/abs/1810.04805) in NLP tasks like text generation.**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- **About the data**: the dataset \"French Amazon Customer Reviews\", even after pre-processing (deleting empty reviews, neutral ones, etc.), is unbalanced. Therefore, we used a weighted loss function (FlattenedLoss of weighted CrossEntropyLoss).\n",
" - initial number of reviews: 253 961\n",
" - number of reviews after pre-processing: 221 456\n",
" - neg: 25106 (11.34%)\n",
" - pos: 196350 (88.66%)\n",
"\n",
"\n",
"- **(fine-tuned) Language Model** \n",
" - forward : (accuracy) 42.03% | 17.87 (perplexity)\n",
" - backward: (accuracy) 46.82% | 18.05 (perplexity) \n",
" \n",
"\n",
"- **(fine-tuned) Sentiment Classifier**\n",
"\n",
" - **Accuracy**\n",
" - forward : (global) 95.53% | (neg) 87.56% | (pos) 96.51%\n",
" - backward: (global) 94.79% | **(neg) 91.83%** | (pos) 95.15%\n",
" - ensemble: **(global) 95.74%** | (neg) 90.63% | **(pos) 96.37%**\n",
"\n",
" - **f1 score**\n",
" - forward: 0.9718\n",
" - backward: 0.9718\n",
" - ensemble: **0.9758**\n",
"\n",
"(neg = negative reviews | pos = positive reviews)"
]
},
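{
"cell_type": "markdown",
"metadata": {},
"source": [
"The ensemble rows above combine the forward and backward classifiers. A common way to do this, assumed here because the ensembling code is not part of this section, is to average the class probabilities of both models and take the argmax. The sketch below uses toy probabilities and hypothetical helper names; it also shows how the per-class (neg / pos) accuracies can be computed.\n",
"\n",
"```python\n",
"def ensemble(probs_fwd, probs_bwd):\n",
"    # Average the class probabilities of both models, example by example.\n",
"    return [[(a + b) / 2 for a, b in zip(pf, pb)]\n",
"            for pf, pb in zip(probs_fwd, probs_bwd)]\n",
"\n",
"def per_class_accuracy(probs, targets, n_classes=2):\n",
"    # Fraction of correctly predicted examples within each true class.\n",
"    correct = [0] * n_classes\n",
"    total = [0] * n_classes\n",
"    for p, t in zip(probs, targets):\n",
"        pred = max(range(n_classes), key=lambda c: p[c])\n",
"        total[t] += 1\n",
"        correct[t] += int(pred == t)\n",
"    return [c / n if n else float('nan') for c, n in zip(correct, total)]\n",
"\n",
"# Toy probabilities for 3 reviews, classes ordered [neg, pos].\n",
"fwd = [[0.6, 0.4], [0.3, 0.7], [0.45, 0.55]]\n",
"bwd = [[0.7, 0.3], [0.6, 0.4], [0.2, 0.8]]\n",
"targets = [0, 1, 1]\n",
"\n",
"avg = ensemble(fwd, bwd)\n",
"acc_bwd = per_class_accuracy(bwd, targets)\n",
"acc_ens = per_class_accuracy(avg, targets)\n",
"print(acc_bwd, acc_ens)\n",
"```\n",
"\n",
"In this toy case the backward model misclassifies one positive review, and averaging with the forward model corrects it."
]
},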
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### To be improved"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The lost function FlattenedLoss of LabelSmoothingCrossEntropy should be tested as it is used in the MultiFiT method (see the notebook [lm3-portuguese-classifier-TCU-jurisprudencia.ipynb](https://github.com/piegu/language-models/blob/master/lm3-portuguese-classifier-TCU-jurisprudencia.ipynb) to get the code)."
]
},
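{
"cell_type": "markdown",
"metadata": {},
"source": [
"For readers unfamiliar with label smoothing, the sketch below shows the usual formulation for a single example: a mix of the standard negative log-likelihood of the true class and the mean negative log-likelihood over all classes. It is a pure-Python illustration with hypothetical helper names and an assumed smoothing factor `eps=0.1`, not the fastai implementation itself.\n",
"\n",
"```python\n",
"import math\n",
"\n",
"def log_softmax(logits):\n",
"    # Numerically stable log-softmax over a list of logits.\n",
"    m = max(logits)\n",
"    log_z = m + math.log(sum(math.exp(x - m) for x in logits))\n",
"    return [x - log_z for x in logits]\n",
"\n",
"def label_smoothing_ce(logits, target, eps=0.1):\n",
"    # (1 - eps) * NLL of the true class + eps * mean NLL over all classes.\n",
"    log_p = log_softmax(logits)\n",
"    nll = -log_p[target]\n",
"    uniform = -sum(log_p) / len(log_p)\n",
"    return (1 - eps) * nll + eps * uniform\n",
"\n",
"logits = [2.0, -1.0]  # toy logits, classes ordered [neg, pos]\n",
"plain = label_smoothing_ce(logits, target=0, eps=0.0)\n",
"smooth = label_smoothing_ce(logits, target=0, eps=0.1)\n",
"print(plain, smooth)\n",
"```\n",
"\n",
"With `eps=0` the function reduces to the ordinary cross-entropy; with `eps>0` a confident correct prediction is penalised slightly more, which discourages over-confident outputs."
]
},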
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Initialisation"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"%reload_ext autoreload\n",
"%autoreload 2\n",
"%matplotlib inline\n",
"\n",
"from fastai import *\n",
"from fastai.text import *\n",
"from fastai.callbacks import *"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.metrics import f1_score\n",
"\n",
"@np_func\n",
"def f1(inp,targ): return f1_score(targ, np.argmax(inp, axis=-1))"
]
},
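{
"cell_type": "markdown",
"metadata": {},
"source": [
"For reference, `f1_score` in its default binary mode returns the harmonic mean of precision and recall for the class with label 1. The sketch below recomputes that quantity from scratch on toy labels; the function name is hypothetical, and it assumes that the classes are ordered [neg, pos], so that label 1 is the positive class.\n",
"\n",
"```python\n",
"def binary_f1(targets, preds, pos_label=1):\n",
"    # Count true positives, false positives and false negatives for pos_label.\n",
"    tp = sum(1 for t, p in zip(targets, preds) if t == pos_label and p == pos_label)\n",
"    fp = sum(1 for t, p in zip(targets, preds) if t != pos_label and p == pos_label)\n",
"    fn = sum(1 for t, p in zip(targets, preds) if t == pos_label and p != pos_label)\n",
"    if tp == 0:\n",
"        return 0.0\n",
"    precision = tp / (tp + fp)\n",
"    recall = tp / (tp + fn)\n",
"    return 2 * precision * recall / (precision + recall)\n",
"\n",
"toy_targets = [1, 1, 1, 0, 0]\n",
"toy_preds = [1, 1, 0, 1, 0]\n",
"score = binary_f1(toy_targets, toy_preds)\n",
"print(score)\n",
"```\n",
"\n",
"Under that assumption, the f1 values reported in this notebook describe the majority (positive) class, which is why the per-class accuracies are reported alongside them."
]
},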
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r\n",
"\r\n",
"```text\r\n",
"=== Software === \r\n",
"python : 3.7.4\r\n",
"fastai : 1.0.58\r\n",
"fastprogress : 0.1.21\r\n",
"torch : 1.2.0\r\n",
"nvidia driver : 410.104\r\n",
"torch cuda : 10.0.130 / is available\r\n",
"torch cudnn : 7602 / is enabled\r\n",
"\r\n",
"=== Hardware === \r\n",
"nvidia gpus : 1\r\n",
"torch devices : 1\r\n",
" - gpu0 : 16130MB | Tesla V100-SXM2-16GB\r\n",
"\r\n",
"=== Environment === \r\n",
"platform : Linux-4.9.0-9-amd64-x86_64-with-debian-9.9\r\n",
"distro : #1 SMP Debian 4.9.168-1+deb9u5 (2019-08-11)\r\n",
"conda env : base\r\n",
"python : /opt/anaconda3/bin/python\r\n",
"sys.path : /home/jupyter/tutorials/fastai/course-nlp\r\n",
"/opt/anaconda3/lib/python37.zip\r\n",
"/opt/anaconda3/lib/python3.7\r\n",
"/opt/anaconda3/lib/python3.7/lib-dynload\r\n",
"/opt/anaconda3/lib/python3.7/site-packages\r\n",
"/opt/anaconda3/lib/python3.7/site-packages/IPython/extensions\r\n",
"```\r\n",
"\r\n",
"Please make sure to include opening/closing ``` when you paste into forums/github to make the reports appear formatted as code sections.\r\n",
"\r\n",
"Optional package(s) to enhance the diagnostics can be installed with:\r\n",
"pip install distro\r\n",
"Once installed, re-run this utility to get the additional information\r\n"
]
}
],
"source": [
"!python -m fastai.utils.show_install"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"# bs=48\n",
"# bs=24\n",
"bs=50"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"torch.cuda.set_device(0)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"data_path = Config.data_path()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This will create a `{lang}wiki` folder, containing a `{lang}wiki` text file with the wikipedia contents. (For other languages, replace `{lang}` with the appropriate code from the [list of wikipedias](https://meta.wikimedia.org/wiki/List_of_Wikipedias).)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"lang = 'fr'"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"name = f'{lang}wiki'\n",
"path = data_path/name\n",
"path.mkdir(exist_ok=True, parents=True)\n",
"\n",
"lm_fns3 = [f'{lang}_wt_sp15_multifit', f'{lang}_wt_vocab_sp15_multifit']\n",
"lm_fns3_bwd = [f'{lang}_wt_sp15_multifit_bwd', f'{lang}_wt_vocab_sp15_multifit_bwd']"
]
},
{
"cell_type": "markdown",
"metadata": {
"heading_collapsed": true
},
"source": [
"## Data"
]
},
{
"cell_type": "markdown",
"metadata": {
"hidden": true
},
"source": [
"- [French Amazon Customer Reviews](https://s3.amazonaws.com/amazon-reviews-pds/readme.html)\n",
"- [Guide on how to download the French Amazon Customer Reviews](https://forums.fast.ai/t/ulmfit-french/29379/36)\n",
"- File: amazon_reviews_multilingual_FR_v1_00.tsv.gz"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"hidden": true
},
"outputs": [
{
"data": {
"text/plain": [
"[PosixPath('/home/jupyter/.fastai/data/amazon_reviews_fr/amazon_reviews_multilingual_FR_v1_00.tsv'),\n",
" PosixPath('/home/jupyter/.fastai/data/amazon_reviews_fr/amazon_reviews_filtered_fr.csv'),\n",
" PosixPath('/home/jupyter/.fastai/data/amazon_reviews_fr/amazon_reviews_fr.csv')]"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"name = 'amazon_reviews_fr'\n",
"path_data = data_path/name\n",
"path_data.ls()"
]
},
{
"cell_type": "markdown",
"metadata": {
"hidden": true
},
"source": [
"### Run this code the first time"
]
},
{
"cell_type": "markdown",
"metadata": {
"hidden": true
},
"source": [
"#### Get reviews neg and pos"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"hidden": true
},
"outputs": [],
"source": [
"# to solve display error of pandas dataframe\n",
"get_ipython().config.get('IPKernelApp', {})['parent_appname'] = \"\""
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"hidden": true
},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
" \n",
" \n",
" review_id \n",
" review_body \n",
" star_rating \n",
" \n",
" \n",
" \n",
" \n",
" 0 \n",
" R32VYUWDIB5LKE \n",
" je conseille fortement ce bouquin à ceux qui s... \n",
" 5 \n",
" \n",
" \n",
" 1 \n",
" R3CCMP4EV6HAVL \n",
" ce magnifique est livre , les personnages sont... \n",
" 5 \n",
" \n",
" \n",
" 2 \n",
" R14NAE6UGTVTA2 \n",
" Je dirais qu'il a un défaut :<br />On ne peut ... \n",
" 3 \n",
" \n",
" \n",
" 3 \n",
" R2E7QEWSC6EWFA \n",
" Je l'ai depuis quelques jours et j'en suis trè... \n",
" 4 \n",
" \n",
" \n",
" 4 \n",
" R26E6I47GQRYKR \n",
" je m'attendait à un bon film, car j'aime beauc... \n",
" 2 \n",
" \n",
" \n",
"
\n",
"
"
],
"text/plain": [
" review_id review_body \\\n",
"0 R32VYUWDIB5LKE je conseille fortement ce bouquin à ceux qui s... \n",
"1 R3CCMP4EV6HAVL ce magnifique est livre , les personnages sont... \n",
"2 R14NAE6UGTVTA2 Je dirais qu'il a un défaut : On ne peut ... \n",
"3 R2E7QEWSC6EWFA Je l'ai depuis quelques jours et j'en suis trè... \n",
"4 R26E6I47GQRYKR je m'attendait à un bon film, car j'aime beauc... \n",
"\n",
" star_rating \n",
"0 5 \n",
"1 5 \n",
"2 3 \n",
"3 4 \n",
"4 2 "
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"fields = ['review_id', 'review_body', 'star_rating']\n",
"df = pd.read_csv(path_data/'amazon_reviews_multilingual_FR_v1_00.tsv', delimiter='\\t',encoding='utf-8', usecols=fields)\n",
"df = df[fields]\n",
"df.loc[pd.isna(df.review_body),'review_body']='NA'\n",
"df.head()"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"hidden": true
},
"outputs": [],
"source": [
"# columns names\n",
"reviews = \"review_body\"\n",
"idx = \"review_id\"\n",
"rating = \"star_rating\"\n",
"label = \"label\"\n",
"\n",
"# keep not null reviews\n",
"df2 = df.copy()"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"hidden": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(orginal csv) number of all reviews: 253961\n",
"there is no empty review.\n",
"there is no identical review id.\n",
"23277 neutral reviews (rating = 3) were deleted\n",
"\n",
"number of neg reviews (rating = 1 or 2): 25637 (11.11%)\n",
"number of pos reviews (rating = 4 or 5): 205047 (88.89%)\n",
"\n",
"(final) number of all reviews neg and pos: 230684\n"
]
},
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" \n",
" review_body \n",
" star_rating \n",
" \n",
" \n",
" \n",
" \n",
" 0 \n",
" je conseille fortement ce bouquin à ceux qui s... \n",
" 5 \n",
" \n",
" \n",
" 1 \n",
" ce magnifique est livre , les personnages sont... \n",
" 5 \n",
" \n",
" \n",
" 3 \n",
" Je l'ai depuis quelques jours et j'en suis trè... \n",
" 4 \n",
" \n",
" \n",
" 4 \n",
" je m'attendait à un bon film, car j'aime beauc... \n",
" 2 \n",
" \n",
" \n",
" 5 \n",
" Ne disait pas sur l'annonce que c'était un 10'... \n",
" 2 \n",
" \n",
" \n",
" 6 \n",
" du bon bowie,très bon meme parfois.esperons qu... \n",
" 5 \n",
" \n",
" \n",
" 7 \n",
" très bon film beaucoup d'action l'image est le... \n",
" 5 \n",
" \n",
" \n",
" 8 \n",
" Un sujet délicat mais parfaitement traité avec... \n",
" 5 \n",
" \n",
" \n",
" 9 \n",
" Un coffret d'un duo culte que les DVD nous per... \n",
" 5 \n",
" \n",
" \n",
" 10 \n",
" Un grand classique dans ce genre de films, mai... \n",
" 5 \n",
" \n",
" \n",
"
\n",
"
"
],
"text/plain": [
" review_body star_rating\n",
"0 je conseille fortement ce bouquin à ceux qui s... 5\n",
"1 ce magnifique est livre , les personnages sont... 5\n",
"3 Je l'ai depuis quelques jours et j'en suis trè... 4\n",
"4 je m'attendait à un bon film, car j'aime beauc... 2\n",
"5 Ne disait pas sur l'annonce que c'était un 10'... 2\n",
"6 du bon bowie,très bon meme parfois.esperons qu... 5\n",
"7 très bon film beaucoup d'action l'image est le... 5\n",
"8 Un sujet délicat mais parfaitement traité avec... 5\n",
"9 Un coffret d'un duo culte que les DVD nous per... 5\n",
"10 Un grand classique dans ce genre de films, mai... 5"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# number of reviews\n",
"print(f'(orginal csv) number of all reviews: {len(df)}')\n",
"\n",
"df2 = df.copy()\n",
"\n",
"# keep not null reviews\n",
"empty = (df2[reviews].isnull()).sum()\n",
"df2 = df2[df2[reviews].notnull()]\n",
"if empty != 0:\n",
" print(f'{empty} empty reviews were deleted')\n",
"else:\n",
" print('there is no empty review.')\n",
"\n",
"# check that there is no twice the same review\n",
"# keep the first of unique review_id reviews\n",
"same = len(df2) - len(df2[idx].unique())\n",
"if same != 0:\n",
" df2.drop_duplicates(subset=[idx], inplace=True)\n",
" print(f'from the {same} identical reviews ids, only the first one has been kept.')\n",
"else:\n",
" print('there is no identical review id.')\n",
"\n",
"# categorify reviews in 2 classes neg, pos in the label column (rating != 3)\n",
"neutral = len(df2[df2[rating] == 3])\n",
"df2 = df2[df2[rating] != 3]\n",
"df2[label] = 'neg'\n",
"df2.loc[df2[rating] > 3, label] = 'pos'\n",
"print(f'{neutral} neutral reviews (rating = 3) were deleted')\n",
"\n",
"# number of reviews neg or pos\n",
"num_neg = len(df2[df2[label] == 'neg'])\n",
"num_pos = len(df2[df2[label] == 'pos'])\n",
"num_neg_pos = len(df2)\n",
"pc_neg = round((num_neg/num_neg_pos)*100,2)\n",
"pc_pos = round((num_pos/num_neg_pos)*100,2)\n",
"print(f'\\nnumber of neg reviews (rating = 1 or 2): {num_neg} ({pc_neg}%)')\n",
"print(f'number of pos reviews (rating = 4 or 5): {num_pos} ({pc_pos}%)')\n",
"print(f'\\n(final) number of all reviews neg and pos: {num_neg_pos}') \n",
"\n",
"# convert HTML caracters to normal letters\n",
"df2[reviews] = df2[reviews].apply(convert)\n",
"\n",
"df2[[reviews, rating]].head(10)"
]
},
{
"cell_type": "markdown",
"metadata": {
"hidden": true
},
"source": [
"#### Delete the non French reviews"
]
},
{
"cell_type": "markdown",
"metadata": {
"hidden": true
},
"source": [
"**FastText**\n",
"\n",
"- library source: https://github.com/facebookresearch/fastText/tree/master/python\n",
"- Blog post about using fasttext for language detection: https://amitness.com/2019/07/identify-text-language-python/\n",
"- \"New release of python module\" (june 2019): https://fasttext.cc/blog/2019/06/25/blog-post.html"
]
},
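{
"cell_type": "markdown",
"metadata": {
"hidden": true
},
"source": [
"The filtering logic of this section can be summarised as follows: normalise each review, ask the language-identification model for its top label, and flag the review if that label is not French. The sketch below shows this logic with a stand-in `predict` callable so that it runs without downloading the fastText model; `fake_predict` and its rule are hypothetical and only mimic the `(labels, probabilities)` shape returned by a prediction on a single string.\n",
"\n",
"```python\n",
"def top_language(text, predict):\n",
"    # fastText predicts one line at a time, so line breaks are collapsed first.\n",
"    cleaned = ' '.join(str(text).split()).lower()\n",
"    labels, _probs = predict(cleaned)\n",
"    return labels[0].replace('__label__', '')\n",
"\n",
"def non_french_indices(texts, predict):\n",
"    # Positions of the texts whose top predicted language is not French.\n",
"    return [i for i, t in enumerate(texts) if top_language(t, predict) != 'fr']\n",
"\n",
"def fake_predict(text):\n",
"    # Hypothetical stand-in for the real model: texts containing 'the' count as English.\n",
"    label = '__label__en' if ' the ' in ' ' + text + ' ' else '__label__fr'\n",
"    return (label,), (1.0,)\n",
"\n",
"toy_reviews = ['Il fait très beau', 'This is the best book']\n",
"flagged = non_french_indices(toy_reviews, fake_predict)\n",
"print(flagged)\n",
"```\n",
"\n",
"In the notebook itself, the role of `fake_predict` is played by the `predict` method of the model loaded from `lid.176.bin`."
]
},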
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"hidden": true
},
"outputs": [],
"source": [
"import fasttext\n",
"import urllib.request\n",
"from converter import *"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"hidden": true
},
"outputs": [
{
"data": {
"text/plain": [
"(PosixPath('/home/jupyter/tutorials/fastai/course-nlp/lid.176.bin'),\n",
" )"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Download the file from url and save it locally under file_name\n",
"# source: https://fasttext.cc/docs/en/language-identification.html\n",
"\n",
"path_ft = Path.cwd() # /home/jupyter/tutorials/fastai/course-nlp\n",
"url = 'https://dl.fbaipublicfiles.com/fasttext/supervised-models/lid.176.bin'\n",
"file_name = 'lid.176.bin'\n",
"url_file = path_ft/file_name\n",
"urllib.request.urlretrieve(url, url_file)"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"hidden": true
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n"
]
}
],
"source": [
"# get the model trained on 176 languages\n",
"model = fasttext.load_model(file_name)"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"hidden": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"True\n",
"([['__label__fr']], array([[1.000027]]))\n"
]
}
],
"source": [
"# test\n",
"sentences = ['Il fait très beau']\n",
"predictions = model.predict(sentences)\n",
"print(predictions[0][0][0].replace('__label__','') == 'fr')\n",
"print(predictions)"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {
"hidden": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"number of non-French reviews: 9228\n",
"CPU times: user 39.9 s, sys: 44 ms, total: 40 s\n",
"Wall time: 40 s\n"
]
}
],
"source": [
"%%time\n",
"list_idx = []\n",
"for idx, row in df2.iterrows():\n",
" try:\n",
" string = str(row[reviews]).replace('\\r',' ').replace('\\n','').replace(' ', ' ').lower()\n",
" predictions = model.predict(string)\n",
" language = predictions[0][0].replace('__label__','')\n",
" except:\n",
" language = \"error\"\n",
" if not (language == 'fr'):\n",
" list_idx.append(idx)\n",
" \n",
"print(f'number of non-French reviews: {len(list_idx)}')"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"hidden": true
},
"outputs": [
{
"data": {
"text/plain": [
"16 Just great complination, there are 48 cds insi...\n",
"19 I know it's a classic but really it is a marve...\n",
"20 Waiting for so long to get a sequel of Bridget...\n",
"72 Not one of his best science fiction novels but...\n",
"126 Für die Liebhaber von Schwarzer Humor ist di...\n",
"165 A great alternate look into the world of cats,...\n",
"194 It's the perfect book for a screenwriter and a...\n",
"238 Good delivery, on time. However the image of t...\n",
"376 This is Frank Herbert's masterpiece and should...\n",
"379 It was by sheer chance that I came across this...\n",
"Name: review_body, dtype: object"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df2[reviews][list_idx][:10]"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"hidden": true
},
"outputs": [],
"source": [
"df2.drop(list_idx, axis=0, inplace=True)\n",
"df_trn_val = df2.copy()"
]
},
{
"cell_type": "markdown",
"metadata": {
"hidden": true
},
"source": [
"#### Save filtered reviews dataset"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"hidden": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"number of neg reviews (rating = 1 or 2): 25106 (11.34%)\n",
"number of pos reviews (rating = 4 or 5): 196350 (88.66%)\n",
"number of all reviews: 221456\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYoAAAD4CAYAAADy46FuAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAVxUlEQVR4nO3dcbDdZX3n8fdnk+LoWgrIhcmQsKE2tUW2jZLFdB0dV1YItNNgV7bJ7JisZSbKwk5duzPG7h84KjO4u9YpU8WNJUPYVRBBh4yNxSzr6HYHlItQICLNJVK5JgPRINLB4oR+94/zXD2Ek+de7g33puT9mvnN+Z3v73me33Nmrvnwe36/40lVIUnS4fyThZ6AJOnoZlBIkroMCklSl0EhSeoyKCRJXYsXegJH2sknn1zLly9f6GlI0j8qd9999w+qamzUsZdcUCxfvpzx8fGFnoYk/aOS5G8Pd8ylJ0lSl0EhSeqaNiiSLEvy1SQPJtmV5A9b/aQkO5Psbq8ntnqSXJ1kIsl9SV4/NNbG1n53ko1D9bOT3N/6XJ0kvXNIkubPTK4oDgJ/VFW/DqwGLktyJrAZuL2qVgC3t/cAFwAr2rYJuAYG/+gDVwBvAM4Brhj6h/+a1naq35pWP9w5JEnzZNqgqKp9VfWttv8U8CBwGrAW2NaabQMuavtrgetr4E7ghCRLgPOBnVV1oKqeAHYCa9qx46vqjhr8H09df8hYo84hSZonL+geRZLlwOuAbwCnVtU+GIQJcEprdhrw6FC3yVbr1SdH1Omc49B5bUoynmR8//79L+QjSZKmMeOgSPJK4BbgvVX1417TEbWaRX3GqmpLVa2qqlVjYyMfA5YkzdKMgiLJLzAIic9U1Rda+bG2bER7fbzVJ4FlQ92XAnunqS8dUe+dQ5I0T2by1FOAa4EHq+pPhg5tB6aeXNoI3DpU39CefloNPNmWjW4DzktyYruJfR5wWzv2VJLV7VwbDhlr1DkkSfNkJt/MfiPwTuD+JPe22h8DVwE3JbkE+B5wcTu2A7gQmACeBt4FUFUHknwYuKu1+1BVHWj7lwLXAS8Hvtw2OueQjknLN//FQk9BR7FHrvrtF2XcaYOiqv6K0fcRAM4d0b6Ayw4z1lZg64j6OHDWiPoPR51DkjR//Ga2JKnLoJAkdRkUkqQug0KS1GVQSJK6DApJUpdBIUnqMigkSV0GhSSpy6CQJHUZFJKkLoNCktRlUEiSugwKSVKXQSFJ6jIoJEldBoUkqWsmv5m9NcnjSR4Yqn0uyb1te2TqJ1KTLE/yk6Fjnxrqc3aS+5NMJLm6/T42SU5KsjPJ7vZ6YquntZtIcl+S1x/5jy9Jms5MriiuA9YMF6rq96tqZVWtBG4BvjB0+OGpY1X1nqH6NcAmYEXbpsbcDNxeVSuA29t7gAuG2m5q/SVJ82zaoKiqrwMHRh1rVwX/FrihN0aSJcDxVXVH+03t64GL2uG1wLa2v+2Q+vU1cCdwQhtHkjSP5nqP4k3AY1W1e6h2RpJ7knwtyZta7TRgcqjNZKsBnFpV+wDa6ylDfR49TJ/nSLIpyXiS8f3798/tE0mSnmOuQbGe515N7ANOr6rXAe8DPpvkeCAj+tY0Y8+4T1VtqapVVbVqbGxsBtOWJM3U4tl2TLIY+D3g7KlaVT0DPNP2707yMPCrDK4Glg51XwrsbfuPJVlSVfva0tLjrT4JLDtMH0nSPJnLFcW/Br5TVT9bUkoylmRR2/9lBjei97QlpaeSrG73NTYAt7Zu24GNbX/jIfUN7emn1cCTU0tUkqT5M5PHY28A7gBek2QyySXt0DqefxP7zcB9Sf4auBl4T1VN3Qi/FPhzYAJ4GPhyq18FvC3JbuBt7T3ADmBPa/9p4D+88I8nSZqraZeeqmr9Yer/fkTtFgaPy45qPw6cNaL+Q+DcEfUCLptufpKkF5ffzJYkdRkUkqQug0KS1GVQSJ
K6DApJUpdBIUnqMigkSV0GhSSpy6CQJHUZFJKkLoNCktRlUEiSugwKSVKXQSFJ6jIoJEldBoUkqcugkCR1zeSnULcmeTzJA0O1Dyb5fpJ723bh0LEPJJlI8lCS84fqa1ptIsnmofoZSb6RZHeSzyU5rtVf1t5PtOPLj9SHliTN3EyuKK4D1oyof7yqVrZtB0CSMxn8lvZrW59PJlmUZBHwCeAC4ExgfWsL8NE21grgCWDqN7kvAZ6oql8BPt7aSZLm2bRBUVVfBw7McLy1wI1V9UxVfReYAM5p20RV7amqnwI3AmuTBHgrcHPrvw24aGisbW3/ZuDc1l6SNI/mco/i8iT3taWpE1vtNODRoTaTrXa4+quAH1XVwUPqzxmrHX+ytZckzaPZBsU1wKuBlcA+4GOtPuq/+GsW9d5Yz5NkU5LxJOP79+/vzVuS9ALNKiiq6rGqeraq/gH4NIOlJRhcESwbaroU2Nup/wA4IcniQ+rPGasd/yUOswRWVVuqalVVrRobG5vNR5IkHcasgiLJkqG3bwemnojaDqxrTyydAawAvgncBaxoTzgdx+CG9/aqKuCrwDta/43ArUNjbWz77wD+T2svSZpHi6drkOQG4C3AyUkmgSuAtyRZyWAp6BHg3QBVtSvJTcC3gYPAZVX1bBvncuA2YBGwtap2tVO8H7gxyUeAe4BrW/1a4H8mmWBwJbFuzp9WkvSCTRsUVbV+RPnaEbWp9lcCV46o7wB2jKjv4edLV8P1vwcunm5+kqQXl9/MliR1GRSSpC6DQpLUZVBIkroMCklSl0EhSeoyKCRJXQaFJKnLoJAkdRkUkqQug0KS1GVQSJK6DApJUpdBIUnqMigkSV0GhSSpy6CQJHUZFJKkrmmDIsnWJI8neWCo9t+SfCfJfUm+mOSEVl+e5CdJ7m3bp4b6nJ3k/iQTSa5OklY/KcnOJLvb64mtntZuop3n9Uf+40uSpjOTK4rrgDWH1HYCZ1XVbwB/A3xg6NjDVbWybe8Zql8DbAJWtG1qzM3A7VW1Ari9vQe4YKjtptZfkjTPpg2Kqvo6cOCQ2leq6mB7eyewtDdGkiXA8VV1R1UVcD1wUTu8FtjW9rcdUr++Bu4ETmjjSJLm0ZG4R/EHwJeH3p+R5J4kX0vyplY7DZgcajPZagCnVtU+gPZ6ylCfRw/T5zmSbEoynmR8//79c/s0kqTnmFNQJPkvwEHgM620Dzi9ql4HvA/4bJLjgYzoXtMNP9M+VbWlqlZV1aqxsbGZTV6SNCOLZ9sxyUbgd4Bz23ISVfUM8EzbvzvJw8CvMrgaGF6eWgrsbfuPJVlSVfva0tLjrT4JLDtMH0nSPJnVFUWSNcD7gd+tqqeH6mNJFrX9X2ZwI3pPW1J6Ksnq9rTTBuDW1m07sLHtbzykvqE9/bQaeHJqiUqSNH+mvaJIcgPwFuDkJJPAFQyecnoZsLM95Xpne8LpzcCHkhwEngXeU1VTN8IvZfAE1csZ3NOYuq9xFXBTkkuA7wEXt/oO4EJgAngaeNdcPqgkaXamDYqqWj+ifO1h2t4C3HKYY+PAWSPqPwTOHVEv4LLp5idJenH5zWxJUpdBIUnqMigkSV0GhSSpy6CQJHUZFJKkLoNCktRlUEiSugwKSVKXQSFJ6jIoJEldBoUkqcugkCR1GRSSpC6DQpLUZVBIkroMCklS14yCIsnWJI8neWCodlKSnUl2t9cTWz1Jrk4ykeS+JK8f6rOxtd+dZONQ/ewk97c+V7ff1T7sOSRJ82emVxTXAWsOqW0Gbq+qFcDt7T3ABcCKtm0CroHBP/oMfm/7DcA5wBVD//Bf09pO9VszzTkkSfNkRkFRVV8HDhxSXgtsa/vbgIuG6tfXwJ3ACUmWAOcDO6vqQFU9AewE1rRjx1fVHe13sq8/ZKxR55AkzZO53KM4tar2AbTXU1r9NODRoXaTrdarT46o987xHEk2JRlPMr5///45fCRJ0qFejJvZGVGrWdRnrKq2VN
Wqqlo1Njb2QrpKkqYxl6B4rC0b0V4fb/VJYNlQu6XA3mnqS0fUe+eQJM2TuQTFdmDqyaWNwK1D9Q3t6afVwJNt2eg24LwkJ7ab2OcBt7VjTyVZ3Z522nDIWKPOIUmaJ4tn0ijJDcBbgJOTTDJ4eukq4KYklwDfAy5uzXcAFwITwNPAuwCq6kCSDwN3tXYfqqqpG+SXMniy6uXAl9tG5xySpHkyo6CoqvWHOXTuiLYFXHaYcbYCW0fUx4GzRtR/OOockqT54zezJUldBoUkqcugkCR1GRSSpC6DQpLUZVBIkroMCklSl0EhSeoyKCRJXQaFJKnLoJAkdRkUkqQug0KS1GVQSJK6DApJUpdBIUnqMigkSV0GhSSpa9ZBkeQ1Se4d2n6c5L1JPpjk+0P1C4f6fCDJRJKHkpw/VF/TahNJNg/Vz0jyjSS7k3wuyXGz/6iSpNmYdVBU1UNVtbKqVgJnA08DX2yHPz51rKp2ACQ5E1gHvBZYA3wyyaIki4BPABcAZwLrW1uAj7axVgBPAJfMdr6SpNk5UktP5wIPV9XfdtqsBW6sqmeq6rvABHBO2yaqak9V/RS4EVibJMBbgZtb/23ARUdovpKkGTpSQbEOuGHo/eVJ7kuyNcmJrXYa8OhQm8lWO1z9VcCPqurgIfXnSbIpyXiS8f3798/900iSfmbOQdHuG/wu8PlWugZ4NbAS2Ad8bKrpiO41i/rzi1VbqmpVVa0aGxt7AbOXJE1n8REY4wLgW1X1GMDUK0CSTwNfam8ngWVD/ZYCe9v+qPoPgBOSLG5XFcPtJUnz5EgsPa1naNkpyZKhY28HHmj724F1SV6W5AxgBfBN4C5gRXvC6TgGy1jbq6qArwLvaP03ArcegflKkl6AOV1RJHkF8Dbg3UPl/5pkJYNlokemjlXVriQ3Ad8GDgKXVdWzbZzLgduARcDWqtrVxno/cGOSjwD3ANfOZb6SpBduTkFRVU8zuOk8XHtnp/2VwJUj6juAHSPqexg8FSVJWiB+M1uS1GVQSJK6DApJUpdBIUnqMigkSV0GhSSpy6CQJHUZFJKkLoNCktRlUEiSugwKSVKXQSFJ6jIoJEldBoUkqcugkCR1GRSSpC6DQpLUNeegSPJIkvuT3JtkvNVOSrIzye72emKrJ8nVSSaS3Jfk9UPjbGztdyfZOFQ/u40/0fpmrnOWJM3ckbqi+FdVtbKqVrX3m4Hbq2oFcHt7D3ABsKJtm4BrYBAswBXAGxj89OkVU+HS2mwa6rfmCM1ZkjQDL9bS01pgW9vfBlw0VL++Bu4ETkiyBDgf2FlVB6rqCWAnsKYdO76q7qiqAq4fGkuSNA+ORFAU8JUkdyfZ1GqnVtU+gPZ6SqufBjw61Hey1Xr1yRH150iyKcl4kvH9+/cfgY8kSZqy+AiM8caq2pvkFGBnku902o66v1CzqD+3ULUF2AKwatWq5x2XJM3enK8oqmpve30c+CKDewyPtWUj2uvjrfkksGyo+1Jg7zT1pSPqkqR5MqegSPJPk/zi1D5wHvAAsB2YenJpI3Br298ObGhPP60GnmxLU7cB5yU5sd3EPg+4rR17Ksnq9rTThqGxJEnzYK5LT6cCX2xPrC4GPltVf5nkLuCmJJcA3wMubu13ABcCE8DTwLsAqupAkg8Dd7V2H6qqA23/UuA64OXAl9smSZoncwqKqtoD/OaI+g+Bc0fUC7jsMGNtBbaOqI8DZ81lnpKk2fOb2ZKkLoNCktRlUEiSugwKSVKXQSFJ6jIoJEldBoUkqcugkCR1GRSSpC6DQpLUZVBIkroMCklSl0EhSeoyKCRJXQaFJKnLoJAkdRkUkqSuWQdFkmVJvprkwSS7kvxhq38wyfeT3Nu2C4f6fCDJRJKHkpw/VF/TahNJNg/Vz0jyjSS7k3wuyXGzna8kaXbmckVxEPijqvp1YDVwWZIz27GPV9XKtu0AaMfWAa8F1gCfTLIoySLgE8AFwJnA+qFxPtrGWgE8AVwyh/
lKkmZh1kFRVfuq6ltt/yngQeC0Tpe1wI1V9UxVfReYAM5p20RV7amqnwI3AmuTBHgrcHPrvw24aLbzlSTNzuIjMUiS5cDrgG8AbwQuT7IBGGdw1fEEgxC5c6jbJD8PlkcPqb8BeBXwo6o6OKL9oeffBGwCOP3002f9OZZv/otZ99VL3yNX/fZCT0FaEHO+mZ3klcAtwHur6sfANcCrgZXAPuBjU01HdK9Z1J9frNpSVauqatXY2NgL/ASSpJ45XVEk+QUGIfGZqvoCQFU9NnT808CX2ttJYNlQ96XA3rY/qv4D4IQki9tVxXB7SdI8mctTTwGuBR6sqj8Zqi8ZavZ24IG2vx1Yl+RlSc4AVgDfBO4CVrQnnI5jcMN7e1UV8FXgHa3/RuDW2c5XkjQ7c7mieCPwTuD+JPe22h8zeGppJYNlokeAdwNU1a4kNwHfZvDE1GVV9SxAksuB24BFwNaq2tXGez9wY5KPAPcwCCZJ0jyadVBU1V8x+j7Cjk6fK4ErR9R3jOpXVXsYPBUlSVogfjNbktRlUEiSugwKSVKXQSFJ6jIoJEldBoUkqcugkCR1GRSSpC6DQpLUZVBIkroMCklSl0EhSeoyKCRJXQaFJKnLoJAkdRkUkqQug0KS1GVQSJK6jvqgSLImyUNJJpJsXuj5SNKx5qgOiiSLgE8AFwBnAuuTnLmws5KkY8tRHRTAOcBEVe2pqp8CNwJrF3hOknRMWbzQE5jGacCjQ+8ngTcc2ijJJmBTe/t3SR6ah7kdC04GfrDQkzha5KMLPQON4N/okDn+jf6zwx042oMiI2r1vELVFmDLiz+dY0uS8apatdDzkA7Hv9H5cbQvPU0Cy4beLwX2LtBcJOmYdLQHxV3AiiRnJDkOWAdsX+A5SdIx5aheeqqqg0kuB24DFgFbq2rXAk/rWOJyno52/o3Og1Q9b8lfkqSfOdqXniRJC8ygkCR1GRSSpC6DQpLUZVAco5IsT/Jgkk8n2ZXkK0lenuTVSf4yyd1J/m+SX2vtX53kziR3JflQkr9b6M+gl772d/qdJNuS3Jfk5iSvSHJuknuS3J9ka5KXtfZXJfl2a/vfF3r+LxUGxbFtBfCJqnot8CPg3zB43PA/VtXZwH8GPtna/inwp1X1L/BLj5pfrwG2VNVvAD8G3gdcB/x+Vf1zBo/5X5rkJODtwGtb248s0HxfcgyKY9t3q+retn83sBz4l8Dnk9wL/A9gSTv+W8Dn2/5n53OSOuY9WlX/r+3/L+BcBn+7f9Nq24A3MwiRvwf+PMnvAU/P+0xfoo7qL9zpRffM0P6zwKnAj6pq5QLNRxplRl/2al/QPYdBkKwDLgfe+mJO7FjhFYWG/Rj4bpKLATLwm+3YnQyWpmDwP0Jpvpye5Lfa/nrgfwPLk/xKq70T+FqSVwK/VFU7gPcC/gfPEWJQ6FD/DrgkyV8Du/j573+8F3hfkm8yWI56coHmp2PPg8DGJPcBJwEfB97FYIn0fuAfgE8Bvwh8qbX7GvCfFmi+Lzn+X3hoRpK8AvhJVVWSdcD6qvJHpPSiSrIc+FJVnbXAUzmmeY9CM3U28GdJwuAJqT9Y4PlImideUUiSurxHIUnqMigkSV0GhSSpy6CQJHUZFJKkrv8PSQTQ5bpjLm0AAAAASUVORK5CYII=\n",
"text/plain": [
""
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"# number of reviews neg or pos\n",
"num_neg = len(df_trn_val[df_trn_val[label] == 'neg'])\n",
"num_pos = len(df_trn_val[df_trn_val[label] == 'pos'])\n",
"num_neg_pos = len(df_trn_val)\n",
"pc_neg = round((num_neg/num_neg_pos)*100,2)\n",
"pc_pos = round((num_pos/num_neg_pos)*100,2)\n",
"print(f'number of neg reviews (rating = 1 or 2): {num_neg} ({pc_neg}%)')\n",
"print(f'number of pos reviews (rating = 4 or 5): {num_pos} ({pc_pos}%)')\n",
"print(f'number of all reviews: {num_neg_pos}') \n",
"\n",
"# plot histogram\n",
"x= [1,2]\n",
"keys = list(df_trn_val[label].value_counts().keys())\n",
"values = list(df_trn_val[label].value_counts().array)\n",
"plt.bar(x, values[::-1]) \n",
"plt.xticks(x, keys[::-1])\n",
"# print(df_trn_val['label'].value_counts())\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"hidden": true
},
"outputs": [],
"source": [
"df_trn_val.to_csv (path_data/'amazon_reviews_filtered_fr.csv', index = None, header=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Get the csv of pre-processed data "
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"name_data = 'amazon_reviews_fr'\n",
"path_data = data_path/name_data\n",
"\n",
"# Load csv\n",
"df_trn_val = pd.read_csv(path_data/'amazon_reviews_filtered_fr.csv')\n",
"\n",
"# columns names\n",
"reviews = \"review_body\"\n",
"label = \"label\""
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[PosixPath('/home/jupyter/.fastai/data/frwiki/corpus2_100/tmp/spm.model'),\n",
" PosixPath('/home/jupyter/.fastai/data/frwiki/corpus2_100/tmp/spm.vocab')]"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dest = path/'corpus2_100'\n",
"(dest/'tmp').ls()"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>review_id</th><th>review_body</th><th>star_rating</th><th>label</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>R32VYUWDIB5LKE</td><td>je conseille fortement ce bouquin à ceux qui s...</td><td>5</td><td>pos</td></tr>\n",
"    <tr><th>1</th><td>R3CCMP4EV6HAVL</td><td>ce magnifique est livre , les personnages sont...</td><td>5</td><td>pos</td></tr>\n",
"    <tr><th>2</th><td>R2E7QEWSC6EWFA</td><td>Je l'ai depuis quelques jours et j'en suis trè...</td><td>4</td><td>pos</td></tr>\n",
"    <tr><th>3</th><td>R26E6I47GQRYKR</td><td>je m'attendait à un bon film, car j'aime beauc...</td><td>2</td><td>neg</td></tr>\n",
"    <tr><th>4</th><td>R1RJMTSNCKB9LP</td><td>Ne disait pas sur l'annonce que c'était un 10'...</td><td>2</td><td>neg</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" review_id review_body \\\n",
"0 R32VYUWDIB5LKE je conseille fortement ce bouquin à ceux qui s... \n",
"1 R3CCMP4EV6HAVL ce magnifique est livre , les personnages sont... \n",
"2 R2E7QEWSC6EWFA Je l'ai depuis quelques jours et j'en suis trè... \n",
"3 R26E6I47GQRYKR je m'attendait à un bon film, car j'aime beauc... \n",
"4 R1RJMTSNCKB9LP Ne disait pas sur l'annonce que c'était un 10'... \n",
"\n",
" star_rating label \n",
"0 5 pos \n",
"1 5 pos \n",
"2 4 pos \n",
"3 2 neg \n",
"4 2 neg "
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_trn_val.head()"
]
},
{
"cell_type": "markdown",
"metadata": {
"heading_collapsed": true
},
"source": [
"## Fine-tuning \"forward LM\""
]
},
{
"cell_type": "markdown",
"metadata": {
"hidden": true
},
"source": [
"### Databunch"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"hidden": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 33.2 s, sys: 1.66 s, total: 34.9 s\n",
"Wall time: 57.4 s\n"
]
}
],
"source": [
"%%time\n",
"data_lm = (TextList.from_df(df_trn_val, path, cols=reviews, processor=SPProcessor.load(dest))\n",
" .split_by_rand_pct(0.1, seed=42)\n",
" .label_for_lm() \n",
" .databunch(bs=bs, num_workers=1))"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"hidden": true
},
"outputs": [],
"source": [
"data_lm.save(f'{path}/{lang}_databunch_lm_aws_sp15_multifit_v2')"
]
},
{
"cell_type": "markdown",
"metadata": {
"hidden": true
},
"source": [
"### Training"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"hidden": true
},
"outputs": [],
"source": [
"data_lm = load_data(path, f'{lang}_databunch_lm_aws_sp15_multifit_v2', bs=bs)"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"hidden": true
},
"outputs": [],
"source": [
"# MultiFiT architecture: 4 QRNN layers with 1550 hidden units per layer\n",
"config = awd_lstm_lm_config.copy()\n",
"config['qrnn'] = True\n",
"config['n_hid'] = 1550  # default 1152\n",
"config['n_layers'] = 4  # default 3"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"hidden": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 4.64 s, sys: 1.5 s, total: 6.14 s\n",
"Wall time: 32.3 s\n"
]
}
],
"source": [
"%%time\n",
"perplexity = Perplexity()\n",
"learn_lm = language_model_learner(data_lm, AWD_LSTM, config=config, pretrained_fnames=lm_fns3, drop_mult=0.3, \n",
" metrics=[error_rate, accuracy, perplexity]).to_fp16()"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"hidden": true
},
"outputs": [
{
"data": {
"text/html": [],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.\n"
]
}
],
"source": [
"learn_lm.lr_find()"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"hidden": true
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYIAAAEGCAYAAABo25JHAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAgAElEQVR4nO3de3RcZ3nv8e8zMxpJo7ts2XFsxyaJsZOUXBw1JU1PViBcSsoJpNAWVrsKoefk0FJ64ZBTuliL08KipXegtElD2hwol7YJDRBOCIGSnKRACHJiJyF2YuP7XZJtaaTRaG7P+WNv2WNZsmVbey6a32etWdrz7j2zH42t/cx72e9r7o6IiDSuWLUDEBGR6lIiEBFpcEoEIiINTolARKTBKRGIiDS4RLUDOFuLFy/21atXVzsMEZG6smHDhiF375tpX90lgtWrVzMwMFDtMERE6oqZ7Zptn5qGREQanBKBiEiDUyIQEWlwSgQiIg1OiUBEpMEpEYiINDglAhGRBqdEICJSBz75nZd5cutgJO+tRCAiUuPcnb/97jZ+uP1IJO+vRCAiUuMyuSLFktPREs1kEEoEIiI1Lp0tANDR0hTJ+ysRiIjUuHQ2D1B/NQIzW2tmG8seo2b2e9OOMTP7tJltM7PnzGx9VPGIiNSr0eM1gmgSQWSzj7r7S8DVAGYWB/YBD0477E3AmvDxM8Bd4U8REQmdqBHUd9PQzcBP3H36NKhvAT7vgaeAbjNbVqGYRETqwlQfQWe9NQ1N8w7gyzOULwf2lD3fG5adxMzuMLMBMxsYHIxmHK2ISK2q+85iM0sCtwL3z7R7hjI/pcD9Hnfvd/f+vr4ZF9gREVmwRuu1s7jMm4Bn3P3QDPv2AivLnq8A9lcgJhGRupHO5onHjFQyHsn7VyIRvJOZm4UAvg78ejh66NXAiLsfqEBMIiJ1I50t0N6cwGymRpTzF+maxWaWAl4P/I+ysvcCuPvdwMPALcA2IAPcHmU8IiL1KJ0t0Nka3eU60kTg7hlg0bSyu8u2HXhflDGIiNS7dDZPR3M0HcWgO4tFRGreaLYQWUcxKBGIiNS8dLYQ2dBRUCIQEal56Ww+spvJQIlARKTmpdU0JCLSuNw96CxW05CISGMazxUpeXR3FYMSgYhITYt65lFQIhARqWnpiNciACUCEZGaFvXqZKBEICJS06ZWJ+tsVdOQiEhDinpRGlAiEBGpaeosFhFpcOosFhFpcFOL0rQ2RbMoDSgRiIjUtKnpJaJalAaUCEREalrU8wyBEoGISE0bnYh2URpQIhARqWmqEYiINLjRiGceBSUCEZGals4WIr2ZDJQIRERqWjqbj3R6CVAiEBGpWe7O2KT6CEREGlYlFqUBJQIRkZpViXmGQIlARKRmVWKeIVAiEBGpWaoRiIg0uNEJ1QhERBraaFgj0H0EIiIN6kQfgZqGREQakjqLRUQaXCUWpYGIE4GZdZvZA2a2xcw2m9n10/bfZGYjZrYxfHwkynhEROpJJRalAYi2vgGfAh5x97ebWRJIzXDMk+7+5ojjEBGpO+lsns6I+wcgwkRgZp3AjcC7Adw9B+SiOp+IyEJTibUIINqmoYuBQeA+M3vWzO41s7YZjrvezDaZ2TfN7IqZ3sjM7jCzATMbGBwcjDBkEZHasRASQQJYD9zl7tcA48CHph3zDLDK3a8C/hb46kxv5O73uHu/u/f39fVFGLKISO2oxKI0EG0i2Avsdfcfhs8fIEgMx7n7qLuPhdsPA01mtjjCmERE6kbd1wjc/SCwx8zWhkU3Ay+WH2NmF1jYHW5m14XxDEcVk4hIPRmt987i0PuBL4YjhrYDt5vZewHc/W7g7cBvmlkBmADe4e4ecUwiIjWvVKrMojQQcSJw941A/7Tiu8v2fwb4TJQxiIjUo/FcAa/AojSgO4tFRGpSpeYZAiUCEZGaVKl5hkCJQESkJlVqURpQIhARqU
lTNYKo1yIAJQIRkZo0qhqBiEhjU41ARKTBadSQiEiDG83mScSMlqboL9NKBCIiNSidzVdkURpQIhARqUmjEwU6W6NvFgIlAhGRmjQ8PsmitmRFzqVEICJSg4bSORa3N1fkXEoEIiI1aGhsksUdSgQiIg2pUCxxJJNjsZqGREQa09FMHndUIxARaVRDY5MA6iMQEWlUSgQiIg1uKhEsalcfgYhIQxoeywGqEYiINKzBsUmS8VhFZh4FJQIRkZoT3EyWrMg8Q6BEICJScyp5MxkoEYiI1JyhscrNMwRKBCIiNWd4rHLzDIESgYhITXF3hsfVNCQi0rBGJvLki64agYhIozpxV7H6CEREGtJQhW8mAyUCEZGaUul5hkCJQESkpgylF1jTkJl1m9kDZrbFzDab2fXT9puZfdrMtpnZc2a2Psp4RERq3dBYjnjM6ElVLhFEPZHFp4BH3P3tZpYEUtP2vwlYEz5+Brgr/Cki0pCGxyfpbUsSi1VmegmIsEZgZp3AjcA/Arh7zt2PTTvsLcDnPfAU0G1my6KKSUSk1g2mcxW9qxiibRq6GBgE7jOzZ83sXjNrm3bMcmBP2fO9YdlJzOwOMxsws4HBwcHoIhYRqbKhsUn6KngzGUSbCBLAeuAud78GGAc+NO2Ymeo+fkqB+z3u3u/u/X19ffMfqYhIjRgam6zoiCGINhHsBfa6+w/D5w8QJIbpx6wse74C2B9hTCIiNcvdw0SwQJqG3P0gsMfM1oZFNwMvTjvs68Cvh6OHXg2MuPuBqGISEallmVyRbL7EogrXCKIeNfR+4IvhiKHtwO1m9l4Ad78beBi4BdgGZIDbI45HRKRmVeNmMphjIjCzSwiaeSbN7CbgSoLRPtNHAZ3E3TcC/dOK7y7b78D7zipiEZEFqhrzDMHcm4a+AhTN7FKC4aCvAL4UWVQiIg1oMF35eYZg7omg5O4F4Dbgk+7++4DG+4uIzKPh8aBGUKvDR/Nm9k7gXcA3wrKmaEISEWlMQ2GNoLdGbyi7Hbge+Li77zCzVwBfiC4sEZHGMzQ2SXeqiaZ4ZecDnVNnsbu/CPwOgJn1AB3u/okoAxMRaTTVuJkM5lgjMLPHzazTzHqBTQTTRvx1tKGJiDSWYNH6yjYLwdybhrrcfRT4ReA+d78WeF10YYmINJ6arhEAiXBW0F/mRGexiIjMo8EaTwQfBb4F/MTdf2RmFwNbowtLRKSxZPNF0tlCVZqG5tpZfD9wf9nz7cDbogpKRKTRDI9X52YymHtn8Qoze9DMDpvZITP7ipmtiDo4EZFGMVyleYZg7k1D9xHMFHohwcIxD4VlIiIyDwbDResX1fCooT53v8/dC+Hj/wBaIUZEZJ7sOZIBYGXv9KXdozfXRDBkZr9mZvHw8WvAcJSBiYg0kl1HMrQl4xVfrxjmngjeQzB09CBwAHg7WjtARGTe7BrOcNGiNsxmWsE3WnNKBO6+291vdfc+d1/i7m8luLlMRETmwa7hcVZVoVkIzm+pyg/MWxQiIg2sVHL2HJ1g1aL6SwSVr7+IiCxAB0ez5AolLqrDRODzFoWISAPbNRyMGFrV21aV85/2zmIzSzPzBd+A1kgiEhFpMLuPjANUrWnotInA3TsqFYiISKPaNZwhETOWdbVU5fyVXQZHREROsWs4w8reFIkKr0w2RYlARKTKdh0Z56IqDR0FJQIRkapyd3YNZ6rWPwBKBCIiVXUskyedLahGICLSqHaFk82tWlSdoaOgRCAiUlW7hqs7dBSUCEREqmp3eDOZmoZERBrUzuEMSzubaWmKVy0GJQIRkSrafWS8qv0DoEQgIlJVu4YzVZt+esppp5g4X2a2E0gDRaDg7v3T9t8EfA3YERb9u7t/NMqYRERqxUSuyOH0ZFU7iiHiRBB6jbsPnWb/k+7+5grEISJSU3aHQ0cvUtOQiEhjOj50tMpNQ1EnAgceNbMNZnbHLMdcb2abzOybZnbFTAeY2R1mNmBmA4ODg9FFKy
JSQbuP30y2sJuGbnD3/Wa2BPi2mW1x9yfK9j8DrHL3MTO7BfgqsGb6m7j7PcA9AP39/VoQR0QWhF3DGTpbEnSnklWNI9IagbvvD38eBh4Erpu2f9Tdx8Lth4EmM1scZUwiIrVi53D1h45ChInAzNrMrGNqG3gD8MK0Yy4wMwu3rwvjGY4qJhGRWrL7SHVnHZ0SZdPQUuDB8DqfAL7k7o+Y2XsB3P1u4O3Ab5pZAZgA3uHuavoRkQVvslBk39EJ/uuVF1Y7lOgSgbtvB66aofzusu3PAJ+JKgYRkVq19dAYhZKzbln1VwTW8FERkSrYfGAUgMuWdVY5EiUCEZGq2HwgTUtTjNULubNYRERmt/nAKGsv6CQes2qHokQgIlJp7s7mg6NcXgP9A6BEICJScQdHsxzL5GuifwCUCEREKu7F/bXTUQxKBCIiFTc1YmjdBWoaEhFpSJsPpLmoN0VHS1O1QwGUCEREKm7zgVEuq5GOYlAiEBGpqEyuwI7h8ZrpHwAlAhGRinrpYBr32ukoBiUCEZGK2nwgDcDlSgQiIo1p84FROpoTrOhprXYoxykRiIhU0IsHRlm3rINwiv6aoEQgIlIhpZKz5cBoTfUPgBKBiEjF7DmaYTxXVCIQEWlUU3cU11JHMSgRiIhUzIsH0sQM1tbI1BJTlAhERCrk2d1HuXRJOy1N8WqHchIlAhGRChifLPDD7Ue4cU1ftUM5hRKBiEgFfP8nw+SKJV6zbkm1QzmFEoGISAU8/tJh2pJx+lf3VDuUUygRiIhEzN15/KVBbrh0Mc2J2uofACUCEZHIbT08xr5jEzXZLARKBCIikXtsy2EAblpbex3FoEQgIhK5x146zLoLOljWVTsTzZVTIhARidBoNs/AzqM12ywESgQiIpH63tYhCiXnNWuVCEREGtJjLx2moyXB+ou6qx3KrJQIREQi4u489tIgN67pIxGv3cttpJGZ2U4ze97MNprZwAz7zcw+bWbbzOw5M1sfZTwiIpX04/2jDKYna3a00JREBc7xGncfmmXfm4A14eNngLvCnyIide9rG/eRiBmvreGOYqh+09BbgM974Cmg28yWVTkmEZHzli+WePDZfbx23RIWtTdXO5zTijoROPComW0wsztm2L8c2FP2fG9YdhIzu8PMBsxsYHBwMKJQRUTmz2NbDjM0luOX+1dWO5QzijoR3ODu6wmagN5nZjdO2z/T6s1+SoH7Pe7e7+79fX213dYmIgJw/4a9LG5vrvn+AYg4Ebj7/vDnYeBB4Lpph+wFytPlCmB/lDGJiERtMD3Jd7cc5m3rl9f0aKEpkUVoZm1m1jG1DbwBeGHaYV8Hfj0cPfRqYMTdD0QVk4hIJXz12X0US84v9a+odihzEuWooaXAg2Y2dZ4vufsjZvZeAHe/G3gYuAXYBmSA2yOMR0Qkcu7Ovw3s4ZqLurl0SW2tTTybyBKBu28Hrpqh/O6ybQfeF1UMIiKVtmnvCFsPj/Ent72q2qHMWe03XomI1JH7B/bQ0hTjzVfVz0h4JQIRkXkyms3z9Y37+fkrLqCzpana4cyZEoGIyDz53Pd2kp4s8Bs/d3G1QzkrSgQiIvNgbLLAvf+5g5vXLeFVK7qqHc5ZUSIQEZkHn/v+TkYm8vzOzWuqHcpZUyIQETlP45MF7n1yOzet7eOqlbW77sBslAhERM7TPz+1i6OZPL9bh7UBUCIQETkvmVyBzz6xnRtf2cc1F/VUO5xzokQgInIePv+DXQyP5+q2NgBKBCIi52zH0Dif+s5Wbl63hGtX1WdtAJQIRETOSbHkfPD+TTTFjY/X0XQSM6nEUpUiIgvOvU9uZ8Ouo/zNr1zFBV0t1Q7nvKhGICJyll4+lOavHn2ZN16xlLdefcqiinVHiUBE5CzkiyX+579tor0lwcdvexXhVPt1TU1DIiJz5O585Gs/5vl9I9z1q+tZXOOL0s+VEsE07s7g2CS7hjPEY0YyHqM5EWNxezM9bclqhyciVf
R3j23jy0/v5rduuoQ3vap+ppk+k4ZNBFsPpXli6xCZyQLjuSJjk3l2DI2z5UCa4fHcjK9Z0dPKlSu6uHJFN9es7Oaqld20NMUrHLmIVMMDG/byl4++zC9es5w737i22uHMq4ZMBA9t2s+dD2wimy8BkIzHSDXHWdWb4nWXLWXdsg5esbgNgMlCiVyhxP5jEzy3d4Tn9h3j4ecPApCIGVcs7+LK5V0k4kap5JQcHCduRixmJGLG1St7eO26JbQm6ztpuDuFkpMvlsgXnFyxxGg2z7FMjqPjedKTeWJmNMVjJGKGA9l8kUyuyESuiFnwmSXiMeIxwwCz4GciHtS+kongkYjFSMTt+HGZXJF0tsDYZIF8sUQiFpynKfy362xJ0NHSREdLgpZEnJamOM2JoAtsslBiIl8kmy+SSsbpbGkiFqv/dl2pnCdeHuRDX3mOn7t0MZ9425ULol+gXEMlglLJ+eR3XubT391G/6oePvXOa1jS0UxT/Oz6zI+M53hm11EGdh1lw64jfHXjPnCIxYILFwRjjEslZ7JY4rNP7iCVjPP6y5fy+suX0t2apCkeXBC7Wpu4qDdFMhFNv32uUOLAyARbD43x0qE0Ww6m2T08DhYkqXjMiBkEl9sgiWVyRUYm8oxM5BnLFii64x5JeFURjxk9qSZ625Is7WxhWVcLF3S10tfRTGdLgs7WJjpbmmhvTtDaFKc1GSeVDJJLXAmkoRRLzj/+53b+8tGXuXRJO3f92vrI/larybzO/sL7+/t9YGDgrF+XyRX4wL9u4pEfH+SX+1fwsbf+FM2J6L+hF0vOD3cM89CmA3zzhQMcy+RPOSZmsKInxapFKQBGpy7CkwXMjLgFF+zmphidLU10tgbffGNmFEsl8kWnWHLcHQfcIZ3Ns+/YBIfTkyddxJd3t3JxXxtmQQ2mUCpRKp3Y7zipZILuVBNdrcHFMBGz4Ju7QdyMpkQs/DZudLY00Z1qoieVpKMlQcmhUCpRKAYnTSWDC2lr2ISWL/pJ+z2sQRVKTi6sfeWKwf5iGJ87tDUnaGuO09HcRFPCKBTDmknRGc8VSGcLjE7kSWcLTBaKTBZKZPNF3Dl+/uZEjEyuyJHxHEcyOYbSkxxKT3Jw5NTPaTbJRIyWRIxEPEa+LM7etiQre1tZ2ZNiWXcLiViMmAVJtjUZp6+jmSUdLSzpbObCrta6rx02gm2Hx7jzgU08u/sYr7tsKZ9426vqunPYzDa4e/+M+xolEdw/sIc/+MpzfPgXLuc9N6yuStUuXyyx+cAo2XyJQjG44B0Zz7FzaJztQ+PsPpIhZkZXa3gRbkngDsVSiWIJsoUioxN5RrMF0hNBQomH3+oTccMILtYGpJIJlve0sry7leU9rVzS18Yrl3bQUUfL51VSvljiaCbH6ESB0WyQiDOTRTK5wvHmrWz+RBNToVQKmq/Cz39oLMeeoxn2HZ3gwMgEpTP8WS1uT7K8J8WK7qAm0tfRTF97M92pJtqaE6SScdqaE3S1Bkl2IX4LrVVjkwX+8ckd/N3j20gl4/zxrVdw61UX1n1zkBIBQfv2loNpLlvWGUFUIidzD/uL3BmfLHI4neVwepLD6Sz7j2XZezTD3qMT7Ds6wWB6kvRk4bTv19GcoKctefxLQldrE63J+PEaFUBfRzOrettYtSjFhd2tAMe/cLQ3J1jZk1LfyGlk80W+8NQu/v7xn3BkPMctr7qAP7r1CpZ01Pddw1NOlwgapo/AzJQEpGKCJj0AoysVoyvVxJqlHbMen80XGUxPciyTJ5MrkMkXGZ8scCyT5+h4juHxHEczueN9N/tHJsjmiid9Sx1MT5IrlmY9RyoZZ83SDtYt7WBpZ/PxvpDO1sTxJsfOliZSzfGg0z7sjF/o/SK7hzM8+Ow+vvz0bg6OZvm5SxfzwTeu5eo6XGDmXDVMIhCpZS1NcVb2pljZe+7vUSw5B0ez7Boe58Cx7PEmw0QsxshEji0H07x0MM1/bDnE0NjMQ6
Rn0paM09uepLetmcVtSZb3BH0hK3tbWdGTYnl3K92pprpqOjk0muU7mw/xtWf38/TOI5jBz16yiL/+lav42UsWVzu8ilMiEFkg4jEL+oTCZqHTKZacsclC2OeUP97ZPpotMB4O0c0Xgw78kYk8R8YnGR7Pse/YBE/vOHJKU1YqGefC7lYWtyfpbUvSk0qyqL2Zi3pTvGJxitWL2uhtS1YlWUzdJLr10BhP7zjCd7cc5vl9IwBc3NfGnW9cy1uvWT6nz22hUiIQaUDx2IlBCediJJNnz9EMe49m2Hcsy/5jQX/H8PgkLx8a42jYlFXead7aFKe3LcmiMFks725l9aITfRr5YolMLuiYL7kfH7rb2hSnpy3J4vbkSSP9SiUnnS1wJBOca6oJbWhsksF08Dg4kmXr4TFGwsEVMYP1F/Vw5xvXcvNlS1i7tKOuajJRUSIQkbPWlWqiK9XFTy3vmvWYXKHE3qMZdgyNs2NonAMj2ZMu1hv3HJtxOPVpz9saDFdOZwscm5ZoyrU3J8Ihu838wpXLeOWSdtYs7eDyZZ2aKmYGSgQiEolkIsbFfe1c3Nc+6zHHMjl2DWc4MJKlORE7fvNezIzJQpGJXIlMrsCR8VzwLX8s6FDvaEnQk0oev4eltz1JbypoklrckSSV1KXtbOjTEpGq6U4l6U4luWpltSNpbLpLRUSkwSkRiIg0uMgTgZnFzexZM/vGDPvebWaDZrYxfPy3qOMREZGTVaKP4HeBzcBst/X+q7v/dgXiEBGRGURaIzCzFcAvAPdGeR4RETl3UTcNfRL4X8DsE6DA28zsOTN7wMxmHDtgZneY2YCZDQwODkYSqIhIo4osEZjZm4HD7r7hNIc9BKx29yuB7wCfm+kgd7/H3fvdvb+vry+CaEVEGleUNYIbgFvNbCfwL8BrzewL5Qe4+7C7T4ZPPwtcG2E8IiIyg4qsR2BmNwEfdPc3Tytf5u4Hwu3bgD9w91ef4b0GgWPAyLRdXWcoO9N2edliYOiMv9iZzz+X/fMV97nEfLq4zrR/evnpnivuM8d1pv3nEvdMZYr7zPvP5m+y/Pl8xR3VtWSVu8/cpOLukT+Am4BvhNsfBW4Nt/8U+DGwCXgMWDfH97vnbMvOtD2tbOAcfsdTzj+X/fMV97nEPJ9xn+654q5O3LOUKe4z7D+bv8ko4o7qWnK6R0WmmHD3x4HHw+2PlJX/IfCH5/CWD51D2Zm2Z3r9+cY0l/0LJe7TPVfcs59vrvvPJe7Zfpdz0Uhxn83fZPnz+Yo7qmvJrOpuqcpKMLMBn2VJt1pVjzGD4q40xV1Z9RK3ppiY2T3VDuAc1GPMoLgrTXFXVl3ErRqBiEiDU41ARKTBKRGIiDS4BZ0IzOyfzOywmb1wDq+91syeN7NtZvZpK1vY1Mzeb2YvmdmPzezP5zfqaOI2sz8ys31lM73eUg9xl+3/oJm5mS2ev4iPv3cUn/fHwqlTNprZo2Z2YZ3E/RdmtiWM/UEz666TuH8p/Hssmdm8dc6eT6yzvN+7zGxr+HhXWflp//9H7lzG5tbLA7gRWA+8cA6vfRq4HjDgm8CbwvLXEEyH0Rw+X1Incf8RwU19dfV5h/tWAt8CdgGL6yFuoLPsmN8B7q6TuN8AJMLtPwP+rE7ivgxYSzBMvb/asYZxrJ5W1gtsD3/2hNs9p/u9KvVY0DUCd38COFJeZmaXmNkjZrbBzJ40s3XTX2dmywj+kH/gwb/S54G3hrt/E/iEh1NjuPvhOok7chHG/TcEkxdGMrIhirjdfbTs0LYoYo8o7kfdvRAe+hSwok7i3uzuL9VKrLN4I/Btdz/i7keBbwM/X+2/W1jgTUOzuAd4v7tfC3wQ+PsZjlkO7C17vjcsA3gl8F/M7Idm9v/M7KcjjfaE840b4LfDKv8/mVlPdKGe5LziNrNbgX3uvinqQKc578/bzD5uZnuAXwU+QmXMx/+TKe8h+HZaCfMZd9TmEutMlgN7yp5PxV/136uhFq
83s3bgZ4H7y5rgmmc6dIayqW90CYJq3auBnwb+zcwuDjN5JOYp7ruAj4XPPwb8FcEfemTON24zSwEfJmiuqJh5+rxx9w8DHzazPwR+G/jf8xzqycHMU9zhe30YKABfnM8YZzKfcUftdLGa2e0EC3EBXAo8bGY5YIe738bs8Vf992qoREBQAzrm7leXF5pZHJiaLvvrBBfN8irxCmB/uL0X+Pfwwv+0mZUIJpaKcqGE847b3Q+Vve6zwClLh0bgfOO+BHgFsCn8o1sBPGNm17n7wRqOe7ovAf+XiBMB8xR32In5ZuDmKL/glJnvzztKM8YK4O73AfcBmNnjwLvdfWfZIXsJ5l2bsoKgL2Ev1f69KtkhUY0HsJqyjh7g+8AvhdsGXDXL635E8K1/qvPmlrD8vcBHw+1XElT1rA7iXlZ2zO8D/1IPn/e0Y3YSQWdxRJ/3mrJj3g88UCdx/zzwItAXRbxR/z9hnjuLzzVWZu8s3kHQotATbvfO5feK+lGxE1XjAXwZOADkCbLubxB8w3yEYMbTF4GPzPLafuAF4CfAZzhxF3YS+EK47xngtXUS9z8DzwPPEXy7WlYPcU87ZifRjBqK4vP+Slj+HMHEX8vrJO5tBF9uNoaPKEY7RRH3beF7TQKHgG9VM1ZmSARh+XvCz3gbcPvZ/P+P8qEpJkREGlwjjhoSEZEySgQiIg1OiUBEpMEpEYiINDglAhGRBqdEIAuCmY1V+Hz3mtnl8/ReRQtmKX3BzB4604yfZtZtZr81H+cWAa1QJguEmY25e/s8vl/CT0y+Fqny2M3sc8DL7v7x0xy/GviGu/9UJeKThU81AlmwzKzPzL5iZj8KHzeE5deZ2ffN7Nnw59qw/N1mdr+ZPQQ8amY3mdnjZvaABXP0f3FqnviwvD/cHgsnmNtkZk+Z2dKw/JLw+Y/M7KNzrLX8gBMT7rWb2X+Y2TMWzFX/lvCYTwCXhLWIvwiPvTM8z3Nm9sfz+DFKA1AikIXsU8DfuPtPA28D7g3LtwA3uvs1BLOC/knZa64H3uXurw2fXwP8HnA5cDFwwwznaQOecvergCeA/152/k+F5z/j3DHh3Do3E9z5DZAFbnP39QTrYPxVmIg+BPzE3a929zvN7DFvUbEAAAHMSURBVA3AGuA64GrgWjO78UznE5nSaJPOSWN5HXB52SyRnWbWAXQBnzOzNQSzPDaVvebb7l4+//zT7r4XwMw2Esw785/TzpPjxCR+G4DXh9vXc2Je+S8BfzlLnK1l772BYJ56COad+ZPwol4iqCksneH1bwgfz4bP2wkSwxOznE/kJEoEspDFgOvdfaK80Mz+FnjM3W8L29sfL9s9Pu09Jsu2i8z8N5P3E51tsx1zOhPufrWZdREklPcBnyZYx6APuNbd82a2E2iZ4fUG/Km7/8NZnlcEUNOQLGyPEqwDAICZTU0d3AXsC7ffHeH5nyJokgJ4x5kOdvcRgmUtP2hmTQRxHg6TwGuAVeGhaaCj7KXfAt4TzpWPmS03syXz9DtIA1AikIUiZWZ7yx4fILio9ocdqC8STCEO8OfAn5rZ94B4hDH9HvABM3saWAaMnOkF7v4swayW7yBYFKbfzAYIagdbwmOGge+Fw03/wt0fJWh6+oGZPQ88wMmJQuS0NHxUJCLhCmsT7u5m9g7gne7+ljO9TqTS1EcgEp1rgc+EI32OEfHSoCLnSjUCEZEGpz4CEZEGp0QgItLglAhERBqcEoGISINTIhARaXD/Hx/rYHNcB3eFAAAAAElFTkSuQmCC\n",
"text/plain": [
""
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"learn_lm.recorder.plot()"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"hidden": true
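},
"outputs": [],
"source": [
"# base learning rate, scaled linearly with the batch size (reference batch size: 48)\n",
"lr = 1e-3\n",
"lr *= bs/48\n",
"\n",
"# weight decay\n",
"wd = 0.01"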
]
},
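{
"cell_type": "markdown",
"metadata": {
"hidden": true
},
"source": [
"The two lines `lr = 1e-3` and `lr *= bs/48` scale a base learning rate linearly with the batch size, so the rate is exactly $10^{-3}$ when `bs = 48`. This is only a reading of the code above: the assumption here is that 48 is the batch size for which the base rate was chosen.\n",
"\n",
"$$\\text{lr} = 10^{-3} \\times \\frac{\\text{bs}}{48}$$\n",
"\n",
"The first training stage then runs at `lr*10`, and the stage after `unfreeze()` runs at `lr`."
]
},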
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"hidden": true
},
"outputs": [
{
"data": {
"text/html": [
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: left;\"><th>epoch</th><th>train_loss</th><th>valid_loss</th><th>error_rate</th><th>accuracy</th><th>perplexity</th><th>time</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><td>0</td><td>3.868189</td><td>3.581505</td><td>0.672373</td><td>0.327627</td><td>35.927704</td><td>05:29</td></tr>\n",
"    <tr><td>1</td><td>3.737394</td><td>3.442246</td><td>0.656272</td><td>0.343728</td><td>31.256992</td><td>05:28</td></tr>\n",
"  </tbody>\n",
"</table>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# stage 1: the pretrained layers are frozen, so only the last layer group is trained\n",
"learn_lm.fit_one_cycle(2, lr*10, wd=wd, moms=(0.8,0.7))"
]
},
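{
"cell_type": "markdown",
"metadata": {
"hidden": true
},
"source": [
"In the results above, the `perplexity` column equals the exponential of the validation loss (the average cross-entropy per token). A quick check on epoch 0 of this first stage, where the validation loss is 3.581505:\n",
"\n",
"$$\\text{perplexity} = \\exp(\\text{valid loss}), \\qquad \\exp(3.581505) \\approx 35.93$$\n",
"\n",
"Likewise, `error_rate` and `accuracy` are complementary: 0.672373 + 0.327627 = 1."
]
},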
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"hidden": true
},
"outputs": [],
"source": [
"learn_lm.save(f'{lang}fine_tuned1_sp15_multifit_v2')\n",
"learn_lm.save_encoder(f'{lang}fine_tuned1_enc_sp15_multifit_v2')"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"hidden": true
},
"outputs": [
{
"data": {
"text/html": [
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: left;\"><th>epoch</th><th>train_loss</th><th>valid_loss</th><th>error_rate</th><th>accuracy</th><th>perplexity</th><th>time</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><td>0</td><td>3.487757</td><td>3.317794</td><td>0.640384</td><td>0.359616</td><td>27.599421</td><td>07:31</td></tr>\n",
"    <tr><td>1</td><td>3.351605</td><td>3.209483</td><td>0.625792</td><td>0.374207</td><td>24.766258</td><td>07:31</td></tr>\n",
"    <tr><td>2</td><td>3.257046</td><td>3.127676</td><td>0.614715</td><td>0.385285</td><td>22.820894</td><td>07:31</td></tr>\n",
"    <tr><td>3</td><td>3.171384</td><td>3.077426</td><td>0.608249</td><td>0.391751</td><td>21.702461</td><td>07:32</td></tr>\n",
"    <tr><td>4</td><td>3.153673</td><td>3.041128</td><td>0.603322</td><td>0.396678</td><td>20.928787</td><td>07:32</td></tr>\n",
"    <tr><td>5</td><td>3.086838</td><td>3.011693</td><td>0.599402</td><td>0.400598</td><td>20.321775</td><td>07:32</td></tr>\n",
"    <tr><td>6</td><td>3.078970</td><td>2.989866</td><td>0.596276</td><td>0.403725</td><td>19.883024</td><td>07:33</td></tr>\n",
"    <tr><td>7</td><td>3.031662</td><td>2.972289</td><td>0.593523</td><td>0.406477</td><td>19.536604</td><td>07:32</td></tr>\n",
"    <tr><td>8</td><td>3.000917</td><td>2.955677</td><td>0.591351</td><td>0.408648</td><td>19.214760</td><td>07:33</td></tr>\n",
"    <tr><td>9</td><td>2.949565</td><td>2.941218</td><td>0.589300</td><td>0.410700</td><td>18.938911</td><td>07:33</td></tr>\n",
"    <tr><td>10</td><td>2.942891</td><td>2.926159</td><td>0.586940</td><td>0.413059</td><td>18.655859</td><td>07:32</td></tr>\n",
"    <tr><td>11</td><td>2.919689</td><td>2.915022</td><td>0.585162</td><td>0.414837</td><td>18.449244</td><td>07:33</td></tr>\n",
"    <tr><td>12</td><td>2.892335</td><td>2.903746</td><td>0.583260</td><td>0.416739</td><td>18.242331</td><td>07:32</td></tr>\n",
"    <tr><td>13</td><td>2.880667</td><td>2.894594</td><td>0.581840</td><td>0.418160</td><td>18.076181</td><td>07:32</td></tr>\n",
"    <tr><td>14</td><td>2.856038</td><td>2.888913</td><td>0.580878</td><td>0.419122</td><td>17.973751</td><td>07:31</td></tr>\n",
"    <tr><td>15</td><td>2.861682</td><td>2.884239</td><td>0.580082</td><td>0.419918</td><td>17.889946</td><td>07:32</td></tr>\n",
"    <tr><td>16</td><td>2.823543</td><td>2.883112</td><td>0.579754</td><td>0.420245</td><td>17.869801</td><td>07:32</td></tr>\n",
"    <tr><td>17</td><td>2.830968</td><td>2.882927</td><td>0.579680</td><td>0.420320</td><td>17.866495</td><td>07:32</td></tr>\n",
"  </tbody>\n",
"</table>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXUAAAD4CAYAAAATpHZ6AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAgAElEQVR4nO3deXxU5b3H8c8vk8nOGoLsBEFlUWSJKGIVWxdEBNtSi9pFu9C63FZvb9Uu1qW2arfbWlsptbba0lYvVmstLrggat0CIkYW2aKEsCSBrGTPc/+YkzAJmWSSTBI4ft+v17xm5pxnzvzmBL5z5jnnPMecc4iIiD/E9XYBIiISOwp1EREfUaiLiPiIQl1ExEcU6iIiPhLfa2+c0s9NmXhcb729iMhRac2aNYXOuYxI83st1IP9jyE7O7u33l5E5KhkZh+0NV/dLyIiPtJroe7QSU8iIrHWe6HuoL5BwS4iEku91qcOMPa7K3jvtvNJTezVMkTkKFFbW0teXh5VVVW9XUq3S0pKYsSIEQSDwQ69rtfT9Af/fI+fX3Jyb5chIkeBvLw8+vTpQ2ZmJmbW2+V0G+ccRUVF5OXlMWbMmA69ttd3lD66No+/vN7mzlwREQCqqqpIT0/3daADmBnp6emd+kXS66EO8P3Hc/jqQzq8UUTa5/dAb9TZz9lroT42I7XZ85Ub9vLAKzt6qRoREX/otVBPSYjnlRvPbjbt9ic3kHnTv8m86d9s2lNKYXl1L1UnInK44uJifvvb33b4dXPnzqW4uLgbKjpcr3a/jBiQwoNfmtHqvDm/fJmsO56jpq6hh6sSEWldpFCvr69v83UrVqygf//+3VVWM73ep37W8Rmsv/W8iPOP//5TvP3hAdZ8sL8HqxIROdxNN93Etm3bmDJlCqeccgpnn302l112GSeddBIAF198MdOnT2fSpEksXbq06XWZmZkUFhaSm5vLhAkT+OpXv8qkSZM477zzqKysjGmN1luXs8vKynItx37JvOnf7b7u6tljuWHO+O4qS0SOYBs3bmTChAkA3Pav99iQXxrT5U8c1pdbLpoUcX5ubi7z5s0jJyeHVatWceGFF5KTk9N02OH+/fsZOHAglZWVnHLKKbz00kukp6eTmZlJdnY25eXljBs3juzsbKZMmcIll1zC/Pnz+dznPtfu521kZmucc1mRauz149TDrfn+OdQ1OE798fMR2/x21TZ+u2obADvunPuR2RMuIkeeGTNmNDuO/J577uGxxx4DYOfOnWzZsoX09PRmrxkzZgxTpkwBYPr06eTm5sa0pnZD3cySgNVAotd+uXPulhZtrgB+CuzyJt3rnLu/o8WkpyUCsOmHc7hzxUZGDkzhjn9vjNj+gl+9zIpvfIzK2nqSggECcQp4kY+Ktraoe0pq6qGj+FatWsVzzz3Ha6+9RkpKCrNnz271OPPExMSmx4FAIObdL9FsqVcDH3fOlZtZEHjFzJ5yzr3eot3DzrlrY1FUUjDAbQtOBODnz75PZW3rOyE27Snj2O+uaHqee9eFsXh7EZFW9enTh7KyslbnlZSUMGDAAFJSUti0aROvv94yIntGu6HuQp3u5d7ToHfrsY74jT+cQ1lVLcFAHJ9d+jrv7Ix8WFB+cSVrPzzA2Iw0Jgzt21MlishHRHp6OrNmzeLEE08kOTmZY445pmnenDlzWLJkCZMnT+aEE07gtNNO65Uao9pRamYBYA0wDviNc+7GFvOvAO4ECoD3geudcztbWc5iYDHAqFGjpn/wQeeGBygsrybrjufabLPph3NICgY6tXwROTK1tuPQzzqzozSqQxqdc/XOuSnACGCGmZ3Yosm/gEzn3GTgOeDBCMtZ6pzLcs5lZWREvBpTuwalJbLy+jOZPKJfxDbjb3666USm8uq6Tr+XiMjRpEPHqTvnioFVwJwW04ucc42nf/4emB6T6tpw3DF9eORrM6Nqe+Itz7
B1Xxm9dfimiEhPiebolwyg1jlXbGbJwDnA3S3aDHXO7faezgciH7ISQ0nBQNPO0aLyaqa30SVzzi9WA/DqTR9neP/knihPRKTHRXP0y1DgQa9fPQ54xDn3pJndDmQ7554AvmFm84E6YD9wRXcVHEl6WiK/vXwayQkBrvzjWxHbzbrrhWbP7/rUSZw3aQgDUxO6u0QRkW53RJ1RGiuvbi2kpLKWmx/PoaiiJqrXvPTt2YxOT22/oYj0Gu0oPcrOKI2VWeMGATD3pKFU19Xzi2ff59G1eRSWRw74s366CoDtP55LnE5iEpGjVK8P6NXdEuMDfGfuBFZef1ZU7Y/97gr2llaxdV859zy/pZurExE/S0tLAyA/P5+FCxe22mb27NnEstfCl1vqrRmQmtC0U7W6rp6i8hpOb9G/3ih87Jn4gHHexFCfu/rdRaQzhg0bxvLly3vkvY7+UH/rfhg8CUaeCnHR/fBIjA8wrH8yuXddSF19A+O+91TEtj95ejM/eXpzs2l/X3wapx2bHuEVIuJXN954I6NHj+bqq68G4NZbb8XMWL16NQcOHKC2tpY77riDBQsWNHtd+OiOlZWVXHnllWzYsIEJEyb0ytgvR66ag7DyFqgphz5DYeICmHhxhwI+PhBHfJxR1xD9DuNFS19nYGoCy78+k2Mz0qitb6D4YC39U4IEA77v0RI5Mjx1E+x5N7bLHHISXHBXxNmLFi3iuuuuawr1Rx55hKeffprrr7+evn37UlhYyGmnncb8+fMjjiB73333kZKSwvr161m/fj3Tpk2L6Uc4ukM9IQX+eyO8/wxseByy/whvLAkF/IT5MOliGHlauwG/5UcXUFlbzzf+to7nNu6N6q33V9Tw8Z+/dNj0JZ+bxs+ffZ9nrjtTO1xFfGbq1Kns27eP/Px8CgoKGDBgAEOHDuX6669n9erVxMXFsWvXLvbu3cuQIUNaXcbq1av5xje+AcDkyZOZPHlyTGs8ukMdIKkvTP5M6FZVeijg1/wJ3vwdpA2BifNh0icjBryZkZIQz/1fzOKR7J1kpCVyxnGDOK6NbplIvv6XtQB86r7/8Pg1s7r66UQkkja2qLvTwoULWb58OXv27GHRokUsW7aMgoIC1qxZQzAYJDMzs9Uhd8N153Ug/NVX0Bjwi5bBDdvg03+AEVmw9iH44wXwiwmw4tuQ+yo0tD6c7yVZIzl7/GCCgThy77qQx64+vVOlrNtZzM+e2dx+QxE5qixatIi///3vLF++nIULF1JSUsLgwYMJBoO8+OKLtDdQ4ZlnnsmyZcsAyMnJYf369TGt7+jfUo8ksQ+ctDB0qy47tAW/9iF4cymkHeN10XwSRp0Gca2P6Dh11ADW3nwuO/cfZNKwvlRU1/P2zgM8uX43y9fktVnCvS9u5Xert1FbH+qv/8mnJ/On/+Ry3TnHcebxGfzsmc3ceMF49cOLHEUmTZpEWVkZw4cPZ+jQoVx++eVcdNFFZGVlMWXKFMaPb/tym1dddRVXXnklkydPZsqUKcyYMSOm9fnyjNI2VZfD+0+HAn7LSqirOhTwExfAyBkQn9j+cmh+TdU/f3kGn//Dm10q7eUbzmbur15myeenc/Wytdxy0UQ+NW1El5Yp4ic6o7T9M0o/eqEerroctjwD7zUGfCXEJ4eCPfNjMOZjMGwaxLd+fPq+0ioCcdZ0Gb7s3P0sXPJaTEv8y5dP5YzjBsV0mSJHK4W6Qj161eWwfRXkvhK67fUOlYpPhlGnQuYZkHkmDJsaMeQb/enVHdz6rw0xK23dD86lf0rb77luZzF/fHUH/3vJFB11I76lUP+Ijv3SKYlpMGFe6AZwcD988Goo4He8DC/cEZoeTAkdBz/mY6Gt+WFTIRBstqgrZo3hillj+M2LW/lpDHaWTrl9ZajE+Diq6xo4dlAqD39tJoPSEvje4zlcddZYLv7Nq6Gya+p5Y3sRpVWhC4PMyBzI7z4/nak/XMkNc07g6tnjulyPSG9yzn
Xr0SNHis5ucGtLPVoVRV7IvxwK+n3elngwNbSjNfMML+SnHBbyj2Tv5PmNezl34hDOm3QM+0qrGd4/mWDAeODVHWzZW86Vs8Yw956Xu/1j6OLccjTbsWMHffr0IT093dfB7pyjqKiIsrIyxowZ02yeul+6S0Xhoa6a3FegwLsuSEJaKOSHZ8Gg40K39HGQ0P6wvn9+/QPiDCYO7cvf39zJw9mHXea1y7b/eC4f7j9I5qBQPa9vL2LR0te5cc54rpo9tqldQ4Pjvpe2cfmpo5p1/eyvqKFfcpCAunikF9TW1pKXl9fuceB+kJSUxIgRIwgGm28kKtR7SnkBfPDKoe6awveBsHXbd8ShkB90fCjoBx0PfYdBhC2O7QXlrZ612hXfPv+EDncJpacmcOWsTH727PtN07b+6ALidSimSI9TqPeW2irYvz0U7oVboGiL93gr1JQdahdMhUHjIN0L+0Fe2A8cCwkpjL/5KapqG3rvc7Qhmq6cypp6qmrrKamsZUi/JDbtKSM+zjhxeOSLhotIZNpR2luCSXDMxNAtnHNQtscL+S2HAj/vTch5lGZb9/1G8e7YsTyzK8jpUyeTPGgUCQNHEeg3HPoNpzouiTgzgoE4PiiqoKauAQec97+reeRrMyk+WMO5E4+htLKOn6/czEOvtX2mW2f8+vkt/Hzl+3z9rLEMH5BM/+QgHx8/mG8vf4cV7+6J+LqzT8jgxc0FALx987kMiGJY490llewtrWbKyP7Npm/aU8qcX76sXw8iKNR7nhn0HRq6jTmz+bzaSija1izwg0VbmJewC9589rBFJSb1C3Xr9B3G6H7DoW/olrt4OPQpguGhrp1+KUGSg62fMdsVj72dx89Xhrpklry0rUOvbQx0gHV5xZx9wuA22+8trWLmnaHx7//6lVN58LVc3s0rYfKI/jz9XujL4+pla1n6hYgbMCIfCe2GupklAauBRK/9cufcLS3aJAIPAdOBIuCzzrncmFfrd8FkGHJi6NZSXTWU7YaSXVDq3Up2QWk+lObB7nVQUXD465L6Q78R/HfKEEbH11NIPwpcfwpdv9CN0H0Zybz5vXOY8aPQBUJWXn8mVy9bS32DY3thRdPiTh7Zn3d2FgNw/cPvxORjV9WExuH5oKii6bKCnxg/mHrnmDV2ED9asbFZ+8vuf6PpcX7JoV8Dz26IboTNypp6Nu0pJTM9lcRgHCkJh/4bPL9xL3tKq7j81NGd/TgivardPnULHTeU6pwrN7Mg8ArwTefc62FtrgYmO+e+bmaLgE865z7b1nJ936feG2qroCw/FPTh4V+aDyV5ULYbd7AIc6300QcSIW0wBxPSsbTBJA8YAqmDIW0wpGZQnTiIxAFDITWDzNteAWJ79MumH85h/M1Pd3k54f3833/8XbJGD+TiqcOB0GFie0urOe3O5w973TkTBrO7pIr38ksPW47IkSSmO0rNLIVQqF/lnHsjbPozwK3OudfMLB7YA2S4NhauUO8lDfVwsAjK90HFvtBROxX7vOcFUL730LSKQnCHj2ZZ7eIpJo0Sl0oJqWH3h0+787KP8fS2Ku59rZBSUqkm1HfeeCJVrOXcdj7JwQBzfrmaLfvKO72c5791FmMz0thTUsX2wnJOHxsaqiH8xJcdhRXkHTjIsP7JbMgv5aKThwFw+f2v8+rWIn0xSLeISaibWQBYA4wDfuOcu7HF/BxgjnMuz3u+DTjVOVfYot1iYDHAqFGjprc3RKX0soYGqNx/2BfA/n15PPPWRvpZBf2oYOKAevpbBWXFhfS1ti/N5eKTseT+kNSfDw4Geb80nnKSKXfJrd6XeY8rSKLMpVBOMgdJJNa/FFqTe9eFTYO2vXzD2Tz81k7ufXErC6YM45/r8tt9fWpCgL9+9TRObrFjNxpf+tNbvLBpHwNSghw4WMvG2+eQnBD7/SJy9In1lnp/4DHgv5xzOWHT3wPObxHqM5xzRZGWpS31o9u/1+/mmr+u5d7LpjJv8r
BDM+rroLoUKg9AZTFUNd4Xh+4rDzQ9rq04wJYP8kilkjSrpA8HSbDWx7kPV+8Ml5BGXFJf6oOp7K4KklsWRznJVLik0D1JVLhkyklqmtZsvkuiwmvX0MplBaaN6s/aD4tjsq5y77oQ5xzVdQ3kFlXwrUfe4cEvzSC3sIKFS17jOxeMZ9roAYzLSCMnv4Rt+8pbHTtoyeemMefEoVG/b1VtPevzSpgxZmBMPoccGWJ+nLqZ3QJUOOd+FjZN3S8fQe/llzBxaN8una4dPnzxjXPGs/j04QRqyyksKuRr968ia2g8N8weRqC2IvRlUVMeGh+/6VYK1eXUHiwhN38vqVZJGlWkUknAovu3fdAlUkESgaQ+5FfGe18ISRwkkYON9yRx0CVSSajtQedNI5FK7/Xh7atIoPHXxJLPTefGR9dTUlnb6fXUqGWXzvaCcsYMSsXM2FVcSXyckbOrhIKyap7K2cNL74d2nv9wwSRGDkzhqr+sZc3N5zTbORxJTV0DheXVDOuf3OW6JXa6HOpmlgHUOueKzSwZeBa42zn3ZFiba4CTwnaUfso5d0lby1WoC0DegYMMTE0gMT7Q5aEHwr8gwJFEDWlUce7YFAqKiigrOUCqVfLApROguoy9hYUUFx9gWHIdfawKasppqC7jrc0fkkw1qVSRbNWkUE0KVVH9imjU4KzZl8FBkqjwQj/0BRA2vWl+UtOXQuMXR/i0ShLZfOfFVNc3cML3O79T+bNZI7l7YfvXxbx06eu8tr2IP395Bk+sy2f80L4kBIzPZI1ke0EFJwzpo+EiekEsQn0y8CAQIHT5u0ecc7eb2e1AtnPuCe+wxz8DU4H9wCLn3Pa2lqtQl1hzzrH2wwN8+r7QmPZXzR7LDeefgJnxbl4JF937Ck9cO4vJI9ru427+5QD3fyGLkQNTmPfLF1i6aAJ/Xr2RszJT+GLWIH725Ntszy/gt585nr1F+7n3mfWcMiyBHbsLSAn7YkilihSqSLXQF0QK1aRYFalUkWQd24KvdkGqCFJFAlUugerGxyR48xKobvG8iiDVLqHpdd+ZP5WkpBSKquP4n8ffb1pWY9u/fP0sLrovu2m5LsKVLx+9aibTR3ese6e8uo7SylqeXJ/P4jPHtv8CaUbDBIh0UHio//rSqU1HtURjT0kVg/sksn5XSdNwyO0JUN/0ayDFC/1UqrzQr2behD68sWknSdSQZDUkUUsiNSRSQ5LVhqZTQyK13nzvuYXaJRFqE+zAL42Wql180xdFVdMXR+j+lLFDQofExieyvxp2FNdxcuYx1BCkuNqojwsyMmMgP352G3HBREprA9QQT7ULUkOQauK5ecEUxh4zIDTCaVyAijojPj5IYmJS6FKTcfEQFwzdB+LZVlTN39bk8/15k3EWh8M+MtcRUKiLdFBjqK/6n9lNo1l2xlPv7uaqZWubTTt/0jH8atHUpmPyt/147mFdGDm7SjhhSB/KquoY6A2f0PLXQ7Q+MX4wz2/aB4S+PBI59CXQ1hdAs3lhXxatfZEkWC0J1JFA6D60rNpDz6klLsr9G51V6wLUWYDEhAQsLh6Li/e+CAKHvhSs8cshnnqMWhcgKTHBmx4Ia++91uKox6iuc4f2QZgB1v49jXct57UmwvQI7e2iXyrURTqi0jvDNRaHEOYXV5JbVMFlv3+D9++4gIT4zo1N01qob7x9Dm/sKOKKP77FrxZN4ZTMgZx+1wt8YeZobps/CTOjqraevAMHGTe4DwCrNu/jij++1aXP1HGOeOpbBH9NU+A3To+3euJpeWsgnjoC1kCQegKtzIs3rw2H2pxx7ACqa2vYV1zBmeMGEqCe4vJK/rN1H/HUE6Ah1D6ugbiGeo4bnExywLFlTzEBGjg+I5nC0oNU1dSCc4zJSMNwFFfUkJIQR2LAAIdzjoYGR8BCnxMvTx2Oqpo6koNx3jQXtjagvsERH2dN7VtbZ3WNbZpNdtiN2xXqIke7lqH+3m3nk5rYuaGb2t
vq//6FE7jj36GhGS6eMozH1+Xzyo1nc8bdL3bq/eSQy08dxbI3PgTg2rPHce+LW3ni2lmcNLwfZsaNy9cTH7CmNuOH9OHW+ZNYtDR0Av+T/3UGJ43or1AXOdqVHKwlv6SSwX0SccAg72LnnbGruJJ/r8/nnue3Ul4duuxhzm3nk7OrhPFD+tAnKci3/+8dZo0bxKenj2h63YGKGuqda3rvqtr6Dg/t8ItLTmbyiH68tn0/n5k+guc27uXav77d6c/yUfTB3fMU6iJyuJxdJcz79StA58e6KSqv5uyfreIfV8/iKw++xR0Xn8QZxx0aUuH+l3dw3qRjmgZq++c1s1o9w7az+wwaJQTiqKk/Mq87EGsKdRFpVUFZNaf86Dm+dtaxfOeCCe2/oAvq6hvYuLuMk0a0fnGUsqpaTrr10PDSO+6cy4GDtfzgnzk8uX43AI9fM4v6hgbe2VnC52eO5g+v7OALM0dTXlXHoLRE4uKM+1/e3tR11JqvfmwMv395B09f9zFGDUxhR2EFF97zSmw/bDdTqItIRGVVtaQlxh/RF3H+x9o8Jo/oz7jBaVG1X7lhL2MzUnkqZw8vbNrHmg8OsOPOuRE/Y3VdPVv3lTd1Kx3TN6lpXuMviJNH9OPhr80EaNblNKRvEntKW79e6pYfXcBjb+/ihuXro6q7pdPHpvOfbYePtKJQFxHppNv/tYE+SfF88xPHNR0H/891u/ig6CDXnj2OuDhjf0UNKQmhM6KXrt5OcjDAmcdnNH0JOed4bVsRM8YM5LNLX2fMoFQuO3UUY9JTSYiPIzkY4MDBGtLTEtlVXMmsu17g5BH9ePyaWewprSIpPkBCfBypifFU1daTnBCvUBcR8Yv2Tj7SBR1FRHxEoS4i4iMKdRERH1Goi4j4iEJdRMRHFOoiIj6iUBcR8RGFuoiIjyjURUR8RKEuIuIjCnURER9pN9TNbKSZvWhmG83sPTP7ZittZptZiZmt824/6J5yRUSkLdFcD6sO+JZzbq2Z9QHWmNlK59yGFu1eds7Ni32JIiISrXa31J1zu51za73HZcBGYHh3FyYiIh3XoT51M8sEpgJvtDJ7ppm9Y2ZPmdmkCK9fbGbZZpZdUFDQ4WJFRKRtUYe6maUBjwLXOedKW8xeC4x2zp0M/Bp4vLVlOOeWOueynHNZGRkZna1ZREQiiCrUzSxIKNCXOef+0XK+c67UOVfuPV4BBM1sUEwrFRGRdkVz9IsBfwA2Oud+EaHNEK8dZjbDW+7hF9cTEZFuFc3RL7OAzwPvmtk6b9p3gVEAzrklwELgKjOrAyqBRa63rpMnIvIR1m6oO+deAdq81Lhz7l7g3lgVJSIinaMzSkVEfEShLiLiIwp1EREfUaiLiPiIQl1ExEcU6iIiPqJQFxHxEYW6iIiPKNRFRHxEoS4i4iMKdRERH1Goi4j4iEJdRMRHFOoiIj6iUBcR8RGFuoiIjyjURUR8RKEuIuIjCnURER9pN9TNbKSZvWhmG83sPTP7ZittzMzuMbOtZrbezKZ1T7kiItKWdi88DdQB33LOrTWzPsAaM1vpnNsQ1uYC4Djvdipwn3cvIiI9qN0tdefcbufcWu9xGbARGN6i2QLgIRfyOtDfzIbGvFoREWlTh/rUzSwTmAq80WLWcGBn2PM8Dg9+ERHpZlGHupmlAY8C1znnSlvObuUlrpVlLDazbDPLLigo6FilIiLSrqhC3cyChAJ9mXPuH600yQNGhj0fAeS3bOScW+qcy3LOZWVkZHSmXhERaUM0R78Y8Adgo3PuFxGaPQF8wTsK5jSgxDm3O4Z1iohIFKI5+mUW8HngXTNb5037LjAKwDm3BFgBzAW2AgeBK2NfqoiItKfdUHfOvULrfebhbRxwTayKEhGRztEZpSIiPqJQFxHxEYW6iIiPKNRFRHxEoS4i4iMKdRERH1Goi4j4iEJdRMRHFOoiIj6iUBcR8RGFuoiIjyjURUR8RKEuIuIjCnURER9RqIuI+I
hCXUTERxTqIiI+olAXEfERhbqIiI8o1EVEfKTdUDezB8xsn5nlRJg/28xKzGydd/tB7MsUEZFoxEfR5k/AvcBDbbR52Tk3LyYViYhIp7W7pe6cWw3s74FaRESki2LVpz7TzN4xs6fMbFKkRma22MyyzSy7oKAgRm8tIiKNYhHqa4HRzrmTgV8Dj0dq6Jxb6pzLcs5lZWRkxOCtRUQkXJdD3TlX6pwr9x6vAIJmNqjLlYmISId1OdTNbIiZmfd4hrfMoq4uV0REOq7do1/M7G/AbGCQmeUBtwBBAOfcEmAhcJWZ1QGVwCLnnOu2ikVEJKJ2Q905d2k78+8ldMijiIj0Mp1RKiLiIwp1EREfUaiLiPiIQl1ExEcU6iIiPqJQFxHxEYW6iIiPKNRFRHxEoS4i4iMKdRERH1Goi4j4iEJdRMRHFOoiIj6iUBcR8RGFuoiIjyjURUR8RKEuIuIjCnURER9RqIuI+IhCXUTER9oNdTN7wMz2mVlOhPlmZveY2VYzW29m02JfpoiIRCOaLfU/AXPamH8BcJx3Wwzc1/WyRESkM9oNdefcamB/G00WAA+5kNeB/mY2NFYFiohI9GLRpz4c2Bn2PM+bdhgzW2xm2WaWXVBQEIO3FhGRcLEIdWtlmmutoXNuqXMuyzmXlZGREYO3FhGRcLEI9TxgZNjzEUB+DJYrIiIdFItQfwL4gncUzGlAiXNudwyWKyIiHRTfXgMz+xswGxhkZnnALUAQwDm3BFgBzAW2AgeBK7urWBERaVu7oe6cu7Sd+Q64JmYViYhIp+mMUhERH1Goi4j4iEJdRMRHFOoiIj6iUBcR8RGFuoiIjyjURUR8RKEuIuIjCnURER9RqIuI+IhCXUTERxTqIiI+olAXEfERhbqIiI8o1EVEfEShLiLiIwp1EREfUaiLiPiIQl1ExEeiCnUzm2Nmm81sq5nd1Mr8K8yswMzWebevxL5UERFpT7sXnjazAPAb4FwgD3jLzJ5wzm1o0fRh59y13VCjiIhEKZot9RnAVufcdudcDfB3YEH3liUiIp0RTagPB3aGPc/zprX0aTNbb2bLzWxkawsys8Vmlm1m2QUFBZ0oV0RE2hJNqFsr01yL5/8CMp1zk4HngAdbW5aUTCwAAAd/SURBVJBzbqlzLss5l5WRkdGxSkVEpF3RhHoeEL7lPQLID2/gnCtyzlV7T38PTI9NeSIi0hHRhPpbwHFmNsbMEoBFwBPhDcxsaNjT+cDG2JUoIiLRavfoF+dcnZldCzwDBIAHnHPvmdntQLZz7gngG2Y2H6gD9gNXdGPNIiISgTnXsnu8Z2RlZbns7OxeeW8RkaOVma1xzmVFmq8zSkVEfEShLiLiIwp1EREfUaiLiPiIQl1ExEcU6iIiPqJQFxHxEYW6iIiPKNRFRHxEoS4i4iMKdRERH1Goi4j4iEJdRMRHFOoiIj6iUBcR8RGFuoiIjyjURUR8RKEuIuIjCnURER9RqIuI+EhUoW5mc8xss5ltNbObWpmfaGYPe/PfMLPMWBcqIiLtazfUzSwA/Aa4AJgIXGpmE1s0+zJwwDk3Dvhf4O5YFyoiIu2LZkt9BrDVObfdOVcD/B1Y0KLNAuBB7/Fy4BNmZrErU0REohEfRZvhwM6w53nAqZHaOOfqzKwESAcKwxuZ2WJgsfe02sxyOlN0LxtEi891FFDNPedorFs195xY1D26rZnRhHprW9yuE21wzi0FlgKYWbZzLiuK9z+iHI11q+aeczTWrZp7Tk/UHU33Sx4wMuz5CCA/Uhsziwf6AftjUaCIiEQvmlB/CzjOzMaYWQKwCHiiRZsngC96jxcCLzjnDttSFxGR7tVu94vXR34t8AwQAB5wzr1nZrcD2c65J4A/AH82s62EttAXRfHeS7tQd286GutWzT3naKxbNfecbq/btEEtIuIfOqNURMRHFOoiIj7SK6He3rADPfD+I83sRTPbaGbvmdk3vekDzWylmW3x7gd4083M7vHqXW
9m08KW9UWv/RYz+2LY9Olm9q73mntidTKWmQXM7G0ze9J7PsYbmmGLN1RDgjc94tANZvYdb/pmMzs/bHrM/y5m1t/MlpvZJm99zzxK1vP13r+NHDP7m5klHWnr2sweMLN94ed79MS6jfQeXaz7p96/kfVm9piZ9Q+b16F12Jm/U2dqDpv3P2bmzGyQ97x317VzrkdvhHa2bgOOBRKAd4CJPVzDUGCa97gP8D6hIRB+AtzkTb8JuNt7PBd4itDx+KcBb3jTBwLbvfsB3uMB3rw3gZnea54CLohR7f8N/BV40nv+CLDIe7wEuMp7fDWwxHu8CHjYezzRW+eJwBjvbxHorr8LoTONv+I9TgD6H+nrmdDJdDuA5LB1fMWRtq6BM4FpQE7YtG5ft5Heo4t1nwfEe4/vDqu7w+uwo3+nztbsTR9J6CCSD4BBR8K67rEgDVsJM4Fnwp5/B/hOT9fRoqZ/AucCm4Gh3rShwGbv8e+AS8Pab/bmXwr8Lmz677xpQ4FNYdObtetCnSOA54GPA096/wAKw/4zNK1b7x/aTO9xvNfOWq7vxnbd8XcB+hIKR2sx/Uhfz41nSA/01t2TwPlH4roGMmkejt2+biO9R1fqbjHvk8Cy1tZNe+uwM/8nulIzoWFRTgZyORTqvbque6P7pbVhB4b3Qh0AeD/BpgJvAMc453YDePeDvWaRam5rel4r07vql8ANQIP3PB0ods7VtfI+zYZuABqHbujoZ+mKY4EC4I8W6jK638xSOcLXs3NuF/Az4ENgN6F1t4Yje1036ol1G+k9YuVLhLZWO1N3Z/5PdIqZzQd2OefeaTGrV9d1b4R6VEMK9AQzSwMeBa5zzpW21bSVaa4T0zvNzOYB+5xza6Koq615PVYzoa2hacB9zrmpQAWhn5CRHAk14/VbLiD0c38YkEpolNJI73VE1N2Oo6FGzOx7QB2wrHFShDo6U3fMPpOZpQDfA37Q2uwO1hbTdd0boR7NsAPdzsyChAJ9mXPuH97kvWY21Js/FNjnTY9Uc1vTR7QyvStmAfPNLJfQSJkfJ7Tl3t9CQzO0fJ9IQzd09LN0RR6Q55x7w3u+nFDIH8nrGeAcYIdzrsA5Vwv8AzidI3tdN+qJdRvpPbrE23E4D7jcef0Nnai7kI7/nTpjLKEv/Xe8/5MjgLVmNqQTNcd2XXemH68rN0Jbb9u9FdK4g2NSD9dgwEPAL1tM/ynNd0r8xHt8Ic13fLzpTR9IqM94gHfbAQz05r3ltW3c8TE3hvXP5tCO0v+j+U6hq73H19B8p9Aj3uNJNN/xtJ3QTqdu+bsALwMneI9v9dbxEb2eCY1C+h6Q4i33QeC/jsR1zeF96t2+biO9RxfrngNsADJatOvwOuzo36mzNbeYl8uhPvVeXdc9FqQtVsBcQkecbAO+1wvvfwahnzfrgXXebS6h/rXngS3efeMKN0IXCtkGvAtkhS3rS8BW73Zl2PQsIMd7zb10YIdMFPXP5lCoH0toz/lW7x9zojc9yXu+1Zt/bNjrv+fVtZmwo0W64+8CTAGyvXX9uPeP+Yhfz8BtwCZv2X8mFCpH1LoG/kaoz7+W0Nbel3ti3UZ6jy7WvZVQf3Pj/8clnV2Hnfk7dabmFvNzORTqvbquNUyAiIiP6IxSEREfUaiLiPiIQl1ExEcU6iIiPqJQFxHxEYW6iIiPKNRFRHzk/wFGKf9ztvObIgAAAABJRU5ErkJggg==\n",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"learn_lm.unfreeze()\n",
"learn_lm.fit_one_cycle(18, lr, wd=wd, moms=(0.8,0.7), callbacks=[ShowGraph(learn_lm)])"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"hidden": true
},
"outputs": [],
"source": [
"learn_lm.save(f'{lang}fine_tuned2_sp15_multifit_v2')\n",
"learn_lm.save_encoder(f'{lang}fine_tuned2_enc_sp15_multifit_v2')"
]
},
{
"cell_type": "markdown",
"metadata": {
"hidden": true
},
"source": [
"Save best LM learner and its encoder"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"hidden": true
},
"outputs": [],
"source": [
"learn_lm.save(f'{lang}fine_tuned_sp15_multifit_v2')\n",
"learn_lm.save_encoder(f'{lang}fine_tuned_enc_sp15_multifit_v2')"
]
},
{
"cell_type": "markdown",
"metadata": {
"heading_collapsed": true
},
"source": [
"## Fine-tuning \"backward LM\""
]
},
{
"cell_type": "markdown",
"metadata": {
"hidden": true
},
"source": [
"### Databunch"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"hidden": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 29.9 s, sys: 1.16 s, total: 31 s\n",
"Wall time: 48.9 s\n"
]
}
],
"source": [
"%%time\n",
"data_lm = (TextList.from_df(df_trn_val, path, cols=reviews, processor=SPProcessor.load(dest))\n",
" .split_by_rand_pct(0.1, seed=42)\n",
" .label_for_lm() \n",
" .databunch(bs=bs, num_workers=1, backwards=True))"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"hidden": true
},
"outputs": [],
"source": [
"data_lm.save(f'{path}/{lang}_databunch_lm_aws_sp15_multifit_bwd_v2')"
]
},
{
"cell_type": "markdown",
"metadata": {
"hidden": true
},
"source": [
"### Training"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"hidden": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 1.96 s, sys: 628 ms, total: 2.59 s\n",
"Wall time: 4.12 s\n"
]
}
],
"source": [
"%%time\n",
"data_lm = load_data(path, f'{lang}_databunch_lm_aws_sp15_multifit_bwd_v2', bs=bs, backwards=True)"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"hidden": true
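},
"outputs": [],
"source": [
"config = awd_lstm_lm_config.copy()  # copy so the library's default config dict is not modified\n",
"config['qrnn'] = True               # use QRNN layers instead of LSTM layers\n",
"config['n_hid'] = 1550              # hidden units per layer (default 1152)\n",
"config['n_layers'] = 4              # number of layers (default 3)"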
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"hidden": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 4.3 s, sys: 1.52 s, total: 5.82 s\n",
"Wall time: 30.6 s\n"
]
}
],
"source": [
"%%time\n",
"perplexity = Perplexity()\n",
"learn_lm = language_model_learner(data_lm, AWD_LSTM, config=config, pretrained_fnames=lm_fns3_bwd, drop_mult=0.3, \n",
" metrics=[error_rate, accuracy, perplexity]).to_fp16()"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"hidden": true
},
"outputs": [
{
"data": {
"text/html": [],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.\n"
]
}
],
"source": [
"learn_lm.lr_find()"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"hidden": true
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYIAAAEGCAYAAABo25JHAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAgAElEQVR4nO3deZRc5Xnn8e9TVV1dvWptLUgCAQaxeAyGDjZhhmA78cTYB+wJyZDV4CQMjtd4nDnJeE7isY8TJ07GMWFiQjwh9thkMQ4Z7NgY4pjYiY1BYjNBAoRYuqUW6k1dXd3VtT7zx73VKjUtdUuqW0vX73POPX3rvbfqPip111Pvct/X3B0REWlfsUYHICIijaVEICLS5pQIRETanBKBiEibUyIQEWlziUYHcKLWr1/v27dvb3QYIiItZdeuXWPuPrDYsZZLBNu3b2fnzp2NDkNEpKWY2YvHOqamIRGRNqdEICLS5pQIRETanBKBiEibUyIQEWlzSgQiIm1OiUBEpM0pEYiItIA//sdn+O6zo5G8thKBiEiTc3du+dazPLhvPJLXVyIQEWlyM/kSZYf+VEckr69EICLS5NLZAgD9XUoEIiJtaXquCKhGICLSttJzQY2gLxXNPKFKBCIiTU5NQyIiba5SI+hXjUBEpD3N9xGoRiAi0p4qTUPqIxARaVPpuSKdiRidiXgkr69EICLS5KbnCpE1C4ESgYhI00tni5F1FIMSgYhI00vPFeiL6GYyUCIQEWl66ayahkRE2tr0nJqGRETaWrpVO4vNbIeZPVa1pc3sgwvOMTO7xcz2mtkTZnZJVPGIiLQidyedLUZ2DwFAZK/s7k8DFwOYWRzYD9y94LS3AOeE2+uAz4Y/RUQEyBXL5EvlyGYehfo1Db0JeM7dX1xQfi3wBQ88CKw2s811iklEpOnNzzPUik1DC1wP/NUi5VuAoarHw2HZUczsJjPbaWY7R0ejWbNTRKQZpbOVtQhauLPYzJLANcCXFzu8SJm/osD9dncfdPfBgYGBWocoItK0jsw82to1grcAj7j7y4scGwa2VT3eChyoQ0wiIi3hyFoELVwjAH6WxZuFAO4BfikcPfR6YMrdR+oQk4hIS4h6mUqIcNQQgJl1Az8B/JeqspsB3P024OvA1cBeYBa4Mcp4RERaTT06iyNNBO4+C6xbUHZb1b4D74kyBhGRVlbpLI7yPgLdWSwi0sSm5wokYkZXRzRrEYASgYhIU6tML2G22CDL2lAiEBFpYlGvRQBKBCIiTS3qtQhAiUBEpKlNzxUjvYcAlAhERJpaOluI9B4CUCIQEWlq6TklAhGRthb1WgSgRCAi0rQKpTLZQinSu4pBiUBEpGkdmWdINQIRkbZ0ZOZR1QhERNpSZcI53UcgItKm1DQkItLm1DQkItLm6rEWASgRiIg0rUrTkO4jEBFpU+lsATPoTSoRiIi0pfRckb7OBLFYdGsRgBKBiEjTSmejn4IalAhERJpWeq4YeUcxKBGIiDStYObRaPsHQIlARKRppbMF1QhERNrZ9Fz0U1CDEoGISNOqx6I0oEQgItKUymUnk1NnsYhI25rOFXGPfsI5UCIQEWlK8xPOtXrTkJmtNrO7zGyPme02s8sXHL/KzKbM7LFw++0o4xERaRXzU1B3RV8jiPoKnwHudffrzCwJdC9yznfd/W0RxyEi0lLmZx6tQ40gskRgZv3AlcANAO6eB/JRXU9EZCWp11oEEG3T0FnAKHCHmT1qZp8zs55FzrvczB43s2+Y2YWLvZCZ3WRmO81s5+joaIQhi4g0h3pNQQ3RJoIEcAnwWXd/LTAD/OaCcx4BznD3i4A/Af5+sRdy99vdfdDdBwcGBiIMWUSkOdSzaSjKRDAMDLv7D8LHdxEkhnnunnb3TLj/daDDzNZHGJOISEtIZ1dAjcDdDwJDZrYjLHoT8FT1OWa2ycws3L8sjGc8qphERFpFeq
5ATzJOIh79KP+oU837gC+FI4b2ATea2c0A7n4bcB3wbjMrAlngenf3iGMSEWl603P1WYsAIk4E7v4YMLig+Laq47cCt0YZg4hIK5qYKbCmJ1mXa+nOYhGRJjQ+k2N9rxKBiEjbGs/kWacagYhI+xrL5FjX21mXaykRiIg0mdl8kdl8iXVqGhIRaU/jmWA2nvU9qhGIiLSl8ZkgEahGICLSpsYzOQD1EYiItKtK05BGDYmItKmxmaBGsF41AhGR9jSeydOTjNOVjNflekoEIiJNZryO9xCAEoGISNMZn8nXbcQQKBGIiDSdsUyedXW6hwCUCEREms54pn4TzoESgYhIUymXXU1DIiLtbCpboFR2NQ2JiLSr8ZnKXcWqEYiItKWxyoRzGj4qItKe5qeXUI1ARKQ9zTcNqY9ARKQ9jWXymMGa7o66XVOJQESkiYxncqztTpKI1+/jWYlARKSJjGfqew8BKBGIiDSV8ZlcXfsHQIlARKSpqEYgItLmxjK5ut5DAEoEIiJNI1cskZ4r1m2JyopIE4GZrTazu8xsj5ntNrPLFxw3M7vFzPaa2RNmdkmU8YiINLOJmcrNZPWtESQifv3PAPe6+3VmlgS6Fxx/C3BOuL0O+Gz4U0Sk7TTirmKIsEZgZv3AlcD/AXD3vLsfXnDatcAXPPAgsNrMNkcVk4hIMxvLVBatb8JEYGZnm1lnuH+Vmb3fzFYv8bSzgFHgDjN71Mw+Z2Y9C87ZAgxVPR4OyxZe/yYz22lmO0dHR5cTsohIy5mvETTp8NGvACUzexXBN/wzgTuXeE4CuAT4rLu/FpgBfnPBObbI8/wVBe63u/uguw8ODAwsM2QRkdbSiCmoYfmJoOzuReAdwB+7+68DSzXhDAPD7v6D8PFdBIlh4Tnbqh5vBQ4sMyYRkRVlPJMnmYjR2xl19+3RlpsICmb2s8A7ga+FZcedEcndDwJDZrYjLHoT8NSC0+4BfikcPfR6YMrdR5YZk4jIijKWybO+J4nZYo0l0Vlu2rkRuBn4hLs/b2ZnAl9cxvPeB3wpHDG0D7jRzG4GcPfbgK8DVwN7gdnwOiIibWl8Jsf6vvr2D8AyE4G7PwW8H8DM1gB97v7JZTzvMWBwQfFtVccdeM+yoxURWcHGM/m6jxiC5Y8aesDM+s1sLfA4wUig/xVtaCIi7WU8k6v7zWSw/D6CVe6eBv4TcIe7Xwr8eHRhiYi0F3dnbKb+E87B8hNBIrzR62c40lksIiI1Mp0rki+WWV/newhg+YngY8A3gefc/WEzOwt4NrqwRETaS6Oml4DldxZ/Gfhy1eN9wE9FFZSISLsZz1RuJmvSGoGZbTWzu83skJm9bGZfMbOtUQcnItIuDk03Zp4hWH7T0B0EN3+dRjAX0FfDMhERqYHhyVkAtq5ZOElz9JabCAbc/Q53L4bbXwKa9EdEpEaGJrL0pxKs6jrupA2RWG4iGDOzXzCzeLj9AjAeZWAiIu1kaHKWbWvrXxuA5SeCdxEMHT0IjADXoekgRERq5qWJWbY1oFkIlpkI3P0ld7/G3QfcfYO7v53g5jIRETlF5bIzPJll29quhlz/VFYo+1DNohARaWOjmRz5Yrnpm4YWU995UkVEVqihiWDEUFM3DR3DK1YSExGREzcUDh1tVNPQce8sNrNpFv/AN6AxEYuIrDBDE1mgMfcQwBKJwN376hWIiEi7GpqYZUNfJ6mOeEOufypNQyIiUgONvIcAlAhERBpuaCLLtjWNa21XIhARaaBCqczIVFY1AhGRdjVyeI6yN27oKCgRiIg0VGXo6NYGDR0FJQIRkYZq9M1koEQgItJQQ5OzxGPG5lWphsWgRCAi0kBDE1lOW50iEW/cx7ESgYhIAzVy+ukKJQIRkQYanlQiEBFpW7P5ImOZfMMmm6s47lxDp8rMXgCmgRJQdPfBBcevAv4f8HxY9Hfu/rEoYxIRaRbDk8Fkc428mQwiTgShN7j72HGOf9fd31aHOEREmkpl6G
ijZh2tUNOQiEiDzN9D0OCmoagTgQP3mdkuM7vpGOdcbmaPm9k3zOzCxU4ws5vMbKeZ7RwdHY0uWhGROhqazJLqiDHQ29nQOKJuGrrC3Q+Y2QbgfjPb4+7fqTr+CHCGu2fM7Grg74FzFr6Iu98O3A4wODioldFEZEUYmphl65puzBq78m+kNQJ3PxD+PATcDVy24Hja3TPh/teBDjNbH2VMIiLNYmgyy+kN7iiGCBOBmfWYWV9lH3gz8OSCczZZmArN7LIwnvGoYhIRaRbuzvDEbEPXIaiIsmloI3B3+DmfAO5093vN7GYAd78NuA54t5kVgSxwvbur6UdEVryRqTmmc0VetaG30aFElwjcfR9w0SLlt1Xt3wrcGlUMIiLNavdIGoDzN/c3OBINHxURaYhKItixqa/BkSgRiIg0xO6Rabat7aIv1dHoUJQIREQaYfdImvM3Nb5ZCJQIRETqbjZf5PnxmaboHwAlAhGRunvm5QzuzdFRDEoEIiJ1V+kovkCJQESkPe0eSdPbmWBrE9xMBkoEIiJ1t3skzY5NfcRijZ1jqEKJQESkjtydPSPTnL+58fcPVCgRiIjU0fBklulcsWk6ikGJQESkrpppaokKJQIRkTraPTKNGZzXBFNLVCgRiIjU0e6RNNvX9dCdrMeS8cujRCAiUke7D6abqjYASgQiInUzkyvy4vhsU/UPgBKBiEjd7Dk4DTRXRzEoEYiI1M2REUNqGhIRaUtPjaTpSyXYsro5ppaoUCIQEamTf907xuAZawjXcm8aSgQiInXw/NgML47PctWODY0O5RWUCERE6uCBpw8B8AYlAhGR9vTA06Octb6H09d1NzqUV1AiEBGJ2FyhxIP7xvmxHQONDmVRSgQiIhH7/r5xcsVyU/YPgBKBiEjk/vnpUVIdMV535tpGh7IoJQIRkYh9++lDXH7WOlId8UaHsiglAhGRCFWGjb7hvOZsFoKIE4GZvWBmPzSzx8xs5yLHzcxuMbO9ZvaEmV0SZTwiIvVWGTZ61bnNmwjqMSH2G9x97BjH3gKcE26vAz4b/hQRWRGaedhoRaObhq4FvuCBB4HVZra5wTGJiNREsw8brYg6EThwn5ntMrObFjm+BRiqejwclh3FzG4ys51mtnN0dDSiUEVEautfnh1r6mGjFVEngivc/RKCJqD3mNmVC44vNvOSv6LA/XZ3H3T3wYGB5s6sIiIVdz+6n7U9SS4/a12jQzmuSBOBux8Ifx4C7gYuW3DKMLCt6vFW4ECUMYmI1MPh2Tz3P/Uy1158GslEo1vhjy+y6Mysx8z6KvvAm4EnF5x2D/BL4eih1wNT7j4SVUwiIvXy1ccPkC+Vue7SrY0OZUlRjhraCNwdzrudAO5093vN7GYAd78N+DpwNbAXmAVujDAeEZG6uWvXMOdt6uPC01Y1OpQlRZYI3H0fcNEi5bdV7TvwnqhiEBFphGdfnubx4Sn+x1vPb3Qoy9LcDVciIi3orkeGScSMt7/2FYMgm5ISgYhIDRVLZe5+ZD9X7djA+t7ORoezLEoEIiI19N29YxyaznHdpa1RGwAlAhGRmvrKrmHWdHfwxvM2NjqUZVMiEBGpkf2Hs9z3by9z7cVbmv7egWqtE6mISJP79P3PgMGvXnlWo0M5IUoEIiI18PTBaf7ukWHeefkZbFnd1ehwTogSgYhIDXzqm3voSSb4tate1ehQTpgSgYjIKXr4hQn+cfchbr7qbNb0JBsdzglTIhAROQXuzu9/Yw8DfZ3ceMX2RodzUpQIREROwbd2H2Lni5N88MfPoTtZj0Ufa0+JQETkJL2cnuO/3/1Dzhro4WcGty39hCalRCAichJyxRLv/uIuMrkif/rzl9ARb92P09asx4iINNhH73mKR146zP/+uUs4b1N/o8M5Ja2bwkREGuTOH7zEXz30Eu++6mze+prNjQ7nlCkRiIicgO/tHeN37nmSK88d4MNv3tHocGpCiUBEZJke3DfOL39+J2eu7+
GW6y8mHrNGh1QTSgQiIsvw8AsTvOsvH2bLmi6+9CuvZ3V36904dixKBCIiS9j14iQ3/MVDbOpPceevvI6BvtZYcGa5NGqoRg5OzfG1Jw5wz+MHODxb4EfPXse/P2c9V5y9vq63nM8VSjw3mmHk8Bwz+SKZXJHZXIlcsUSh5JTKjuOs7elkfW+Sgd5O+rs6MIN4zIiZ0ZdKsLYnSWciXre4RZrVvU+O8F//9nEG+jq581dfz4b+VKNDqrm2SQT5YplsoUTMwMwwIJMrMjw5y/Bklv2Hs3R1xNnUn2LjqhQb+1Os60mS6gg+DN2dkak5njqQZs/BNJOzBbKFEtl8if2TWR5+cQJ3eM3WVezY1Mc/PDHCXz88hBmcsbabczf2ce7GPs7e0MPq7iR9nQn6Uh10JmKU3HF3SuUgpvRcgXS2wGy+RCJmJBMxkvEYuWKZg+k5Dk7N8XJ6jkLJiceCD/B80dk3muGF8RnKfuz3odKmWTreSaHezgTrepNs6k+xeVWKTau6wp+p+Z8dsRiFUplC2SkUyxTLZQolp1gKXj8RNzriMTrixvRckcnZPBMzedJzRTrjMVLJOKlEjK5knFRHnFQiTqojhof/Z8WyM1coMZ7JM5bJMZbJkS+V6U91sKrrlVt/V/CediZiJMJx3eWyky+VyRXLdMSNro44ZiujbVeiUyo7n77/GW799l4u2raa23/xUjauwCQAbZQI7nvqIO+989ETfl6qI8ba7iSzhRKHZwvz5T3JOF3htqqrgw+86Ryuueg0zhroBYJ1Sx8fnuJ7e8fYfTDNMy9n+NaeQ8v6AF5Kb2eCDf2dJOMxyu6UHWIG527s422v2cy5m/rYtqab3lSC3s4EPZ0JOhMx4mbEYoa7M5UtMJbJcWg6R2auSNmDZFdyJ50tMp7JMT4TfPgeSufY9dIkB6dGKJROPf5TETNIxGLkS+Ulz43HgoRfXOQ97+qI09MZp6czQU+y8j7F6U4m6E7G6U7GcQiTWpm5YpnxMBGNZfJk8yWSYcJJVrZ4jI54jETcKDuUymVKZUjEbP7/oS+VINURC5Nj8PxVXR2s7u4IviCkEvMxdHWEv2MdQZJcKR2TrWBqtsAH/uZRHnh6lP88uI2Pvf3CFV1DNvfG/mGfqMHBQd+5c+cJP2/faIZ/2nMIgLI77tCdjLN1bTfb1nRx2uousvlS1TfuHJOzeQ7P5pmYKZBMxLjgtH4u2NzPeZv66Ok88RyaK5YYmsiSniswPVckM1ckVyzNN8nEY0ZPZ4L+VIL+rg66k3GKpeDbbL5YpiMeY9OqFL0nce1aKJed8Zk8B6fmGJnKcjA9R7nsJOLBh2AibiTiMTpiNv9tvFhVW+gJaxhrupP0pxLkS2XmCiXmCkFtrXo/bkYibiTjwYfsut4k63o6WduTJB4z5golprIFprJB7enwbLA/PVcgVwy+/eeLZcrudCbiJBNBraRYdmZzRWbzJWbyJWbzwf9DJiybzYfHckXMjtRmkokY63qSrO/tZKCvk+5knHyxHNQ0CuX5/6NCKajFVP4/Y2aUymVmciWmc0Wm5wrz5xVKQW1nsUS1mGQiRm9n4qikUl0bWl2VUFZ3d9CX6qAvFZwXt6BGlp4rkJkr0pGI0Z/qoL8rQX8q+F1TLSmoBXx55xB/eN/TTGULfPSaC/m5y05fEe+Nme1y98FFj7VLIhBpRu7ObL7E4WyBw7N50tki2UIlKQXJMZsPEuRsochM7kjiSmeDD/ZKQpzNl046jlRHbD7Jre8NtoHeJOv7Otm2tptXDfSyZXUXsRVcK/nec2N8/Gu72T2SZvCMNXz0mgt59ZZVjQ6rZo6XCNqmaUikGZkFtcCezsQpr2qVL5Y5nM0zNVtgcjaoHU3PBbWQUtnp7+qgP9VBbypBoVQ+KpGMTR9p9hqamOXRlyYZn8lT/T0x1RFj+7oetq7pZuuaLk5bneK01V1sXhXsb+hLtVzzlbvz/efGufXbe/nec+NsWd3FrT
/3Wt767zaviFrAcikRiKwQyUSMDX3BB3ItFEtlJmbyvDA+y3OjGfYeyrBvNMPQxCwP7hsnkysedX48Zmxd08WZ63vmtw19KQb6OtnQ18mG/s6maWefyRX57rOj/Nl39vHoS4cZ6OvkI1efzy9efsb8AJF2okQgIotKxGNs6E+xoT/FZWeuPeqYu5OeK3LgcJaDU3McmMpy4HCWF8dneX5shoeen3hFU5UZbOpPcfrabk5f282mVcHIvPV9nazr6WSgL8lAb4r+rkTNv43PFUo8NZLmoecn+M4zozz8wgSFkrNldRcff/ur+elLt7ZlAqiIPBGYWRzYCex397ctOHYD8Clgf1h0q7t/LuqYROTUmNl8J/X5m18586a7MxqOOBvN5BhN59h/OMvQ5Cwvjc/yz8+MMpbJLTrUORkOijh3Yx87NvVy7sY+NvWnWNXdMT9seLHO7VyxxORMgf2HswxPzjI0Mcu+sRmeOpDm2UOZ+RF7523q48YrzuTKcwZ43VlrW3r66FqpR43gA8Bu4FjztP6Nu7+3DnGISJ2Y2ZLNVKWyMzmbP+oekdHpI/0UT788zbefXnzIdTwcktvbmcAMDs8WXtFUBbChr5MLT+vnJy7YyKu3rOLibatX7L0ApyLSRGBmW4G3Ap8APhTltUSktcRjNj9CaQd9i56TK5Z4fmyGsen8USOkMmEn+PRcEQfWdCdZ2xMMnT1tdYpta7rZuqabrmT7NveciKhrBH8M/Dc4xv9y4KfM7ErgGeDX3X1o4QlmdhNwE8Dpp58eRZwi0oQ6E/Fg0ZdNjY5kZYuscczM3gYccvddxzntq8B2d38N8I/A5xc7yd1vd/dBdx8cGBiIIFoRkfYVZS/JFcA1ZvYC8NfAG83si9UnuPu4u+fCh38OXBphPCIisojIEoG7/5a7b3X37cD1wD+5+y9Un2Nm1Wu8XUPQqSwiInVU9/sIzOxjwE53vwd4v5ldAxSBCeCGescjItLuNNeQiEgbON5cQ7qTQkSkzSkRiIi0OSUCEZE213J9BGY2ChwGphYcWrVE2VL71WXrgbETDG2x6y/neK3iPpmYjxfXUscXlh/vseJeOq6ljp9M3IuVKe6lj5/I32T141rFHdVnyRnuvviNWB6ul9tKG3D7iZYttb+gbGctYlrO8VrFfTIx1zLu4z1W3I2J+xhlinuJ4yfyNxlF3FF9lhxva9Wmoa+eRNlS+4s9/1RjWs7xlRL38R4r7mNfb7nHTybuY/1bTkY7xX0if5PVj2sVd1SfJcfUck1D9WBmO/0Yw6yaVSvGDIq73hR3fbVK3K1aI4ja7Y0O4CS0YsyguOtNcddXS8StGoGISJtTjUBEpM0pEYiItLkVnQjM7C/M7JCZPXkSz73UzH5oZnvN7BarWiDVzN5nZk+b2b+Z2R/UNupo4jazj5rZfjN7LNyuboW4q45/2MzczNbXLuL5147i/f64mT0Rvtf3mdlpLRL3p8xsTxj73Wa2ukXi/unw77FsZjXrnD2VWI/xeu80s2fD7Z1V5cf9/Y/cyYzNbZUNuBK4BHjyJJ77EHA5YMA3gLeE5W8gWESnM3y8oUXi/ijw4VZ7v8Nj24BvAi8C61shbqC/6pz3A7e1SNxvBhLh/u8Dv98icZ8P7AAeAAYbHWsYx/YFZWuBfeHPNeH+muP9u+q1regagbt/h2B663lmdraZ3Wtmu8zsu2Z23sLnhesk9Lv79z34X/oC8Pbw8LuBT3q4oI67H2qRuCMXYdyfJljyNJKRDVHE7e7pqlN7oog9orjvc/fKKvAPAltbJO7d7v50s8R6DP8RuN/dJ9x9Ergf+MlG/93CCm8aOobbgfe5+6XAh4E/XeScLcBw1ePhsAzgXOA/mNkPzOyfzexHIo32iFONG+C9YZX/L8xsTXShHuWU4rZgvYr97v541IEucMrvt5l9wsyGgJ8HfjvCWKvV4vek4l0E307roZZxR205sS5mC1C9Jnsl/ob/u+q+ME0jmV
kv8KPAl6ua4DoXO3WRsso3ugRBte71wI8Af2tmZ4WZPBI1ivuzwMfDxx8H/ojgDz0ypxq3mXUDHyForqibGr3fuPtHgI+Y2W8B7wV+p8ahHh1MjeIOX+sjBAtGfamWMS6mlnFH7XixmtmNwAfCslcBXzezPPC8u7+DY8ff8H9XWyUCghrQYXe/uLrQzOLArvDhPQQfmtVV4q3AgXB/GPi78IP/ITMrE0wsNdrMcbv7y1XP+3PgaxHGW3GqcZ8NnAk8Hv7RbQUeMbPL3P1gE8e90J3APxBxIqBGcYedmG8D3hTlF5wqtX6/o7RorADufgdwB4CZPQDc4O4vVJ0yDFxV9XgrQV/CMI3+d9WzQ6IRG7Cdqo4e4HvAT4f7Blx0jOc9TPCtv9J5c3VYfjPwsXD/XIKqnrVA3Jurzvl14K9b4f1ecM4LRNBZHNH7fU7VOe8D7mqRuH8SeAoYiCLeqH9PqHFn8cnGyrE7i58naFFYE+6vXc6/K+qtbhdqxAb8FTACFAiy7i8TfMO8F3g8/IX/7WM8dxB4EngOuJUjd2EngS+Gxx4B3tgicf9f4IfAEwTfrja3QtwLznmBaEYNRfF+fyUsf4Jg4q8tLRL3XoIvN4+FWxSjnaKI+x3ha+WAl4FvNjJWFkkEYfm7wvd4L3Djifz+R7lpigkRkTbXjqOGRESkihKBiEibUyIQEWlzSgQiIm1OiUBEpM0pEciKYGaZOl/vc2Z2QY1eq2TBLKVPmtlXl5rx08xWm9mv1eLaIqAVymSFMLOMu/fW8PUSfmTytUhVx25mnweecfdPHOf87cDX3P3V9YhPVj7VCGTFMrMBM/uKmT0cbleE5ZeZ2ffM7NHw546w/AYz+7KZfRW4z8yuMrMHzOwuC+bo/1JlnviwfDDcz4QTzD1uZg+a2caw/Ozw8cNm9rFl1lq+z5EJ93rN7Ftm9ogFc9VfG57zSeDssBbxqfDc3wiv84SZ/c8avo3SBpQIZCX7DPBpd/8R4KeAz4Xle4Ar3f21BLOC/m7Vcy4H3unubwwfvxb4IHABcBZwxSLX6QEedPeLgO8Av1p1/c+E119y7phwbp03Edz5DTAHvMPdLyFYB+OPwkT0m8Bz7n6xu/+Gmb0ZOAe4DBUSMhoAAAHHSURBVLgYuNTMrlzqeiIV7TbpnLSXHwcuqJolst/M+oBVwOfN7ByCWR47qp5zv7tXzz//kLsPA5jZYwTzzvzLguvkOTKJ3y7gJ8L9yzkyr/ydwB8eI86uqtfeRTBPPQTzzvxu+KFeJqgpbFzk+W8Ot0fDx70EieE7x7ieyFGUCGQliwGXu3u2utDM/gT4tru/I2xvf6Dq8MyC18hV7ZdY/G+m4Ec62451zvFk3f1iM1tFkFDeA9xCsI7BAHCpuxfM7AUgtcjzDfg9d/+zE7yuCKCmIVnZ7iNYBwAAM6tMHbwK2B/u3xDh9R8kaJICuH6pk919imBZyw+bWQdBnIfCJPAG4Izw1Gmgr+qp3wTeFc6Vj5ltMbMNNfo3SBtQIpCVotvMhqu2DxF8qA6GHahPEUwhDvAHwO+Z2b8C8Qhj+iDwITN7CNgMTC31BHd/lGBWy+sJFoUZNLOdBLWDPeE548C/hsNNP+Xu9xE0PX3fzH4I3MXRiULkuDR8VCQi4QprWXd3M7se+Fl3v3ap54nUm/oIRKJzKXBrONLnMBEvDSpyslQjEBFpc+ojEBFpc0oEIiJtTolARKTNKRGIiLQ5JQIRkTb3/wF6rYQUv3wFqgAAAABJRU5ErkJggg==\n",
"text/plain": [
""
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"learn_lm.recorder.plot()"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"hidden": true
},
"outputs": [],
"source": [
"lr = 1e-3\n",
"lr *= bs/48\n",
"\n",
"wd = 0.01"
]
},
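{
"cell_type": "markdown",
"metadata": {
"hidden": true
},
"source": [
"The learning rate set above is scaled linearly with the batch size: `1e-3` is treated as the value for a batch size of 48 and is multiplied by `bs/48`. The first two epochs below are then run at `lr*10`. The next cell is a minimal sketch of that arithmetic only; `bs_example` and `lr_example` are hypothetical names and values chosen for illustration, not the `bs` and `lr` used in this notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"hidden": true
},
"outputs": [],
"source": [
"# Sketch only: bs_example is a hypothetical batch size, not the notebook's bs.\n",
"bs_example = 96\n",
"base_lr = 1e-3                          # learning rate assumed for a batch size of 48\n",
"lr_example = base_lr * bs_example / 48  # linear scaling with the batch size\n",
"\n",
"print(f'scaled lr           : {lr_example:.1e}')\n",
"print(f'lr*10 (first epochs): {lr_example * 10:.1e}')"
]
},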
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"hidden": true
},
"outputs": [
{
"data": {
"text/html": [
"<table>\n",
"  <thead>\n",
"    <tr><th>epoch</th><th>train_loss</th><th>valid_loss</th><th>error_rate</th><th>accuracy</th><th>perplexity</th><th>time</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><td>0</td><td>3.885873</td><td>3.570395</td><td>0.619722</td><td>0.380278</td><td>35.530659</td><td>05:31</td></tr>\n",
"    <tr><td>1</td><td>3.720925</td><td>3.430218</td><td>0.604340</td><td>0.395660</td><td>30.883373</td><td>05:31</td></tr>\n",
"  </tbody>\n",
"</table>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXIAAAD4CAYAAADxeG0DAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAfLUlEQVR4nO3deXwV5b3H8c+Tc7IQkgAJAQIBExZZAmELiKIWEZXFq7ZSxa121br0Vm1V1OrVWlu1m/XWtWoXS1WKVluLVPGKigsYkB0UkCBhTYBAyEKSc577x5mEJGQ5CWeSDHzfr1dezJmZM/PLE853Zp5ZjrHWIiIi3hXV3gWIiMixUZCLiHicglxExOMU5CIiHqcgFxHxOL8bC+3SLdkOGtDfjUWLiByXli1bVmitTW3Ne10J8tS0vuTm5rqxaBGR45IxZmtr3+tK14pF16aLiLQVd4JcOS4i0mZc2iMXEZG24kofuW77F5GWqKysJD8/n/Ly8vYuxXVxcXGkp6cTHR0dsWW6EuRB5biItEB+fj6JiYlkZGRgjGnvclxjrWXv3r3k5+eTmZkZseW60rWy++Dxv1UVkcgpLy8nJSXluA5xAGMMKSkpET/y0A1BItIhHO8hXs2N39O1ID9YXunWokVEpBbXgvy//nexW4sWEYmooqIiHn/88Ra/b/r06RQVFblQUcu4FuRb95YS1FlPEfGAxoI8EAg0+b758+fTtWtXt8oKmytBHusPLbb/nfMpOVzlxipERCJm9uzZbN68mVGjRjFu3DjOOussLr/8ckaMGAHARRddxNixY8nKyuLpp5+ueV9GRgaFhYXk5eUxdOhQvve975GVlcW5555LWVlZm9XvyuWHA1IT2O8ML95UyHlZvdxYjYgch+7711rW7TgY0WUO653E//xXVqPTH3zwQdasWcOKFStYtGgRM2bMYM2aNTWXCD733HMkJydTVlbGuHHjuPjii0lJSamzjI0bN/LCCy/whz/8gUsuuYSXX36ZK6+8MqK/R2Nc2SP3RRkuHNUbgP9bv8eNVYiIuGb8+PF1rvN+9NFHGTlyJBMmTGDbtm1s3LjxqPdkZmYyatQoAMaOHUteXl5blevOHjnA72aNZumWfbyUu43hfZK46tQMt1YlIseRpvac20rnzp1rhhctWsTChQv56KOPiI+PZ9KkSQ1eBx4bG1sz7PP52rRrxdXryGeOTQfg7tfWsnDdbjdXJSLSaomJiRQXFzc47cCBA3Tr1o34+Hg2bNjAxx9/3MbVNc/VIL/lnJNrhr/7l1xKK3TiU0Q6npSUFCZOnMjw4cO59dZb60ybOnUqVVVVZGdnc/fddzNhwoR2qrJxxo0HXOXk5NjqL5aw1pJ5x3wA7po+lO+dqW8OEpG61q9fz9ChQ9u7jDbT0O9rjFlmrc1pzfJcv0XfGMPKe84F4IH563ltxXa3VykickJpk2etdImPpmdS6ETAD19cQf7+0rZYrYjICaHNHpq15M4pjMvoBsAz729pq9WKiBz32vTph3OvPRWAP32Yx4EyPVRLRCQS2jTIjTGceXIqACPve5O9hw635epFRI5Lbf488r98e3zN8NifLWRLYUlblyAiclxply+W2PjAtJrhs361SN8oJCKek5CQAMCOHTuYOXNmg/NMmjSJ6kux3dQuQR7tiyLvwRl0ivYBcMrP39YXNouIJ/Xu3Zt58+a1aw3t+lVvn95zTs1w5h3zWbBmVztWIyInsttvv73OM8nvvfde7rvvPs4++2zGjBnDiBEjeO211456X15eHsOHDwegrKyMWbNmkZ2dzaWXXtpmz1tx7aFZ4YiL9rH2vvPI+p//APD9vy5j4S1nMrBHYnuWJSLt6Y3ZsGt1ZJfZawRMe7DJWWbNmsVNN93E9ddfD8DcuXNZsGABN998M0lJSRQWFjJhwgQuuOCCRr9384knniA+Pp5Vq1axatUqxowZE9nfox
Ht/uXLnWP9fPazqTWvf/z3Ve1YjYicqEaPHs2ePXvYsWMHK1eupFu3bqSlpXHnnXeSnZ3NlClT2L59O7t3N/4AwPfee6/mGeTZ2dlkZ2e3Se3tukdeLdbvI+/BGWTM/jcrthWRMfvfDE1LYu61E0iMi27v8kSkLTWz5+ymmTNnMm/ePHbt2sWsWbOYM2cOBQUFLFu2jOjoaDIyMhp8hG1tje2tu6nd98hr+92sUTXD63ceZNIvF7VfMSJywpk1axYvvvgi8+bNY+bMmRw4cIAePXoQHR3NO++8w9atW5t8/5lnnsmcOXMAWLNmDatWtU0PQ4cK8gtH9eGZb+QQ5WzQ9pZU6Ds/RaTNZGVlUVxcTJ8+fUhLS+OKK64gNzeXnJwc5syZw5AhQ5p8/3XXXcehQ4fIzs7m4YcfZvz48U3OHylhP8bWGOMDcoHt1trzm5q39mNsW+v+19fx7OItTByYwpzvdrzn/4pI5Ogxtsf2GNuW9JH/EFgPJLVmRS31kxlDeXbxFj7YtJeM2f+uGZ/WJY5Ft04i1u9rizJERDq8sLpWjDHpwAzgGXfLqbNOnvvm0RunnQfKGfyTBRQU6zktIiIQ/h75I8BtQKMXeBtjrgGuAejXr9+xVwZMHtKTV2+YSHllgB6JsUz+9bs108Y9sJBu8dEsuXMKMf4O1dUvIq1grW2XKz7amht3sTfbR26MOR+Ybq293hgzCfhxW/SRN6a0ooozH15EYb0nJz5xxRium7McgA33TyUuWl0vkRYIWnxRx/8HTdreli1bSExMJCUl5bgOc2ste/fupbi4mMzMzDrTjqWPPJwg/wVwFVAFxBHqI3/FWntlY+9xM8irLd2yj0ue+qjR6T+ZMZRpI9JITYg9LvfYG9p7+XBTIVl9upAU5+fQ4SoCwdA8XTqFrsXfureEJVv2cUlO3zrv21dSwS/mr+farwygd9c4XvpkG2+t280zV+ewo6iMAakJHCirZNRP36p5T/XGsjIQ5IuCEvolxxPrj+KJdzdTFbD8YPJAoqIM2/aV8uInXzI+M4WunaL518odTOifwpRhPVv0+5ZVBOgU0/zG+XBVQOdPPKiyspL8/Pxmr9E+HsTFxZGenk50dN17ZFwN8normkQ775HXFghabnppBf9auaPJ+T69+xy27S9lWFoShYcquGXuCq79ygBG9OlCcucY1+tsrfLKAJc89RG/vXQUA1JDT1q75i+5vLkudGfZN0/LwB9l2HGgjPmrG39Ozep7z2XEvW82OO2HZw/isXc2URVs2eHeU1eNZWCPBM6u1d1VW+cYH726xLG5oOHHFD9y6SjGZyZz2oP/xx++kcM5w3pireWrj3/IxWP6EB/j59QBKazdcZC5uaENC8CmB6ZRUhHg5pdW8IPJAxndr1vNMt/ZsIdv/ekTAF69YSKj+nYFdCQh3nDCBnltwaAlyvmw/uatz3n07Y1hve+/Jw+kf2oCnWP9TBnao9nDuh1FZWzcc4jxGclh7SFu21fKrKc/ZntRGddPGsCt5w0+ah2bCw5x9q/f5ZxhPZk9bQgL1+3mF29sqDPPjOw0zhnak5teWhHW7+WWm6eczG8Xfu7Kskf17cqKbUURW96E/sl8/MU+AE7umcDnuw/hizJs/vn0mnl+MX89X8/py8AeCRyuClBcXkX3hNhGlxkIWqy1+H3H31GetK82C/JwtUeQ11cZCPL8R1v56evrWvS+hFg/v7lkJNc8v6xm3Ib7pxLrj6K8MsjQexbUjI+LjmLFPeey/Mv9RPuiyE7vwvTfvd/oXmh9r94wkc93F3PbvJbd/XX2kB68vWFPnXEDUjvzynUTOf/37zNxQHdumxq6cWHSL9/hYPmRm6pumjKIRxYevZH7142n0y8lnvU7D9I9IZbM7p0ZcOd8pmb14vRB3Qlay3lZveiZFMf3n1/GgrWhI4C+yZ14/7bJbNpziIcWbOB/LxvNki37uPq5pcTH+HjpmlMZ3CuR7UVldIoO7a
X/c+UO/vuFT1v0O98+dQgPLdjQ7HynZCbz+e5i9pc2/lWCZwzqzq8vGcn4B96uGXfByN78s9aR3bu3TuKklM513lcVCDLwrjcAyP3JFPaXVHDj3z5l+og08veX8uDF2drzl1ZTkDfh0y/3s3HPIb4+Nh0I7VH5fVFUVAX5JG8fVzyzpJ0rbNpDF4/AWpj9SuhpcFm9k/j3f59B/v5Slm3dz4Wj+jT5fmstJRUBon0mon3HB8srifP7Wn3+4aVPvuT2l1fz1s1nsrmghO//NbThfOqqsZyX1YvKQJC5uduYPKQHiXHRJMSGLrAqrwxgLcT6o1i/6yAzHl3MyT0T+OrodFZvL+Kxy8dgjKlZfm0x/tDfPVy3njeYy8b3Y1/JYV76ZBt/aOZLw6+fNKBmA1qt+lzGB5sKmbNkK/NX72JIr0QW3HRmnfm27Svl36t3MjWrF726xB11sv6bf1zKos8KuGv6UK469aSwT+afKFeCHA8U5MfIWsvhqiBzlnzJ/c4e/LThvbhx8kBmPLq4zrw5J3XjqavGYoGcny1scrmv3jCRkeldCFrwRRkCQcvVzy0lb2/o5OCHm/cC0DMpliV3TqEqEKQiECQ+5uirQiuqgvxtyVYuGNWnQ/frt9bhqgAHSivpkRQX8WUHgpai0gpSEmJZsGYn3//r8pppt5xzMqvyi1i4fg85J3VjdL+uzQZ2JG3++XTe31jAN//4SYvf+53TM/nJjKFYS023Ym3WWjLvmE9K5xiW3jWFN9fu4ro5y5kytCfPXN2qvBAXKcgjyFrLki37OCUzGWMMeYUl/PmjPKYNT2N8ZvJR828pLMFnDBWBIJv2HOL0Qd1JiPVTXhlodq+poirIj/6+kp9dNLzmyhJx364D5WzdW8KI9C4NbjQBvvb4Byz/sm5//TPfyKm52uaFpV9yxyurWXHPOUT7onj07Y089d4Xrtfull5Jcdw4eSAFxYf5nXN+Ke/BGXXmWfLFXtbsOMh3Ts9saBENqgoEeffzAs4YlHpcXj0WSQpyEZcUl1cSH+MPq+/7ntfWsKOonLEndWPWuL4kxPkpPRzgG88t4fapQzhtYHcASg5X8du3PueZxUf2/O+cPoRLx/UjN28fZw3uwZvrdtd0N3WNj+ay8f24feoQyisDFJVWcsvcFTVHdG5acufZ/GjuSgb3SuTZxXWPVB6/YgzTR6QBR/b+R6Z34YpTTmLm2HSiogzPf5TH3a+tBULnJp7/zinAka61TQ9Mw++LovDQYWL9UfiiDJ2ifY12B322q5jUxNgGj0qttazfWcz0R9/n3GE9ufW8wQzskUBJRQCgpnuuJYJBS/AYT25XBoK8sPRLxmcmM6hH4lH/l6ov1FCQi3hQIGh5e/1uJg3u0eDealUgSJQxjXabXPTYB6Qnx/PY5WMorahi/c6DvLx8OzNGpLHrQDm/e3sjD12czeh+XZn55IfcMGkg00ak1VzVdcag7pRWBFi2dX+rf4fV957L+p3FTd7TUVtW7yQuP6Ufd/1jDQBXTujHzLF9ueixDxqcv0unaC7JSee0Ad25/eVV7HEezXHxmHQeungEfl8Ub63bzeCeiXztiQ8oPFTR6Lq/+Pl0SisDVFQF6dIpmh//fSXXTxrAoJ5H37D+j0/zmTy4JyN/euSy3c0/n97gBn3qI++xYVcxH86eTO+unWrGB4KWxZsKKS6v5Ma/HTm5/8ldU0jpHMO85fk1FzpsuH8qnWL8CnIRab1t+0pJiosmIS509HHocBXDna9grO3CUb15bUXT923UF2VCwfv3ZfmRKrfGxIEpfLDJ/SOTao9dPoYZ2Wm8vCyf5IQYvlXv3Mbca09lfGZynQf9hWvrQ+cryEUk8nYeKCO5cwzb9pXW+S7dYNAy8qdvUlzr0tYtv5iOMYYthSWc9atF3HP+MHYdLGf21CFERRmeXbyl5mKCa7/Sn10HyutsFJ69OgdflKGg+DAHyioZ2COhwZPA/VM780Ujl/i+f9
tZ9E2OB+DLvaX8a9UOrpxwEiPva/iGuHC8/aOvNHrjW0usuOecOndH16cgFxHPsdZSWhGgcxN910++u5ms3kmcMSi1zvjKQJBBzjX9g3okcOhwFT8+dzAXO5cZ11daUcWOonICQUta1zh2FpXTp1snFn22h55JcXz9yY/4/eWj6d89gd8u/Jzbpw4mNTGOhNjQEcrKbUV87YkPCTRwB3T14yq+95fcmjuQAf5x/WmkJsaycN1urj4tA2MMwaDlyfc2Mz4jmSFpSSTE+rn71TU8//FWBbmISFuoCgTJ21uKP8pgCX0/Qu2r04JO0Dd0XqM5bfXFEiIiJzS/L4qBPRIand6aAI8EXdgpIuJxCnIREY9T14qcOPI+gMpS8MdBdDxEd6r3Ew8+3WEr3qMglxPHW/fA9mZOwkf5wV8/4DsdCf46G4F4iK63UfDXm7+x9/tjQQ+zkghRkMuJ46tPQXlRaK+8ssz5t7zW6zKoKqs1razuT+k+qKo9vzMcbPyRuY0zYW4I6m04mjqaaOj9/jiIUg/q8U5BLieO7gPdWW6g0tkIlDewkai1UWhqI1FZeuT9FYegpKDutMoyCBxuvpaG+ONasCFo5RGHvxP4FCftRS0vcqx80U7fepK76wkGnLBv6GiiNIxpDWxAyvY3vBFqDV9MvW6p5jYEcTR5NFF/Q1O9kfFFq1uqHgW5iFdE+SCmc+jHTdZC1eEwu5xKG94Q1O+yqjnCqH2kUgI2/C/6qGF8zWwIGtqIxNc9MmmyW6p6fu+cx1CQi0hdxjghGPkv+ajD2lC31FEbiaaOJprqsioLnQMp3nX0RijQ+FMRG2caOCII44ihNd1Sx3geQ0EuIu3DGPDHhH7iuri7rkBV3cBvbEPQ5LRaRxoVJVBSePRGqKq8dfX5j22jqSAXkeOfzw++RIg9+tnjERUM1j1X0VyXU+0NAfe3erUKchGRSImKgpj40A8pLXxz64NcF5iKiHicglxExOMU5CIiHqcgFxHxOAW5iIjHKchFRDxOQS4i4nEKchERj1OQi4h4nIJcRMTjFOQiIh6nIBcR8TgFuYiIxynIRUQ8rtkgN8bEGWOWGmNWGmPWGmPua4vCREQkPOE8j/wwMNlae8gYEw0sNsa8Ya392OXaREQkDM0GubXWAoecl9HOj3WzKBERCV9YfeTGGJ8xZgWwB3jLWrukgXmuMcbkGmNyCwoKIl2niIg0Iqwgt9YGrLWjgHRgvDFmeAPzPG2tzbHW5qSmpka6ThERaUSLrlqx1hYBi4CprlQjIiItFs5VK6nGmK7OcCdgCrDB7cJERCQ84Vy1kgb82RjjIxT8c621r7tbloiIhCucq1ZWAaPboBYREWkF3dkpIuJxCnIREY9TkIuIeJyCXETE4xTkIiIepyAXEfE4BbmIiMcpyEVEPE5BLiLicQpyERGPU5CLiHicglxExOMU5CIiHqcgFxHxOAW5iIjHKchFRDxOQS4i4nEKchERj1OQi4h4nIJcRMTjFOQiIh6nIBcR8TgFuYiIxynIRUQ8TkEuIuJxCnIREY9TkIuIeJyCXETE4xTkIiIepyAXEfE4BbmIiMcpyEVEPE5BLiLicQpyERGPU5CLiHicglxExOMU5CIiHtdskBtj+hpj3jHGrDfGrDXG/LAtChMRkfD4w5inCviRtXa5MSYRWGaMectau87l2kREJAzN7pFba3daa5c7w8XAeqCP24WJiEh4WtRHbozJAEYDSxqYdo0xJtcYk1tQUBCZ6kREpFlhB7kxJgF4GbjJWnuw/nRr7dPW2hxrbU5qamokaxQRkSaEFeTGmGhCIT7HWvuKuyWJiEhLhHPVigGeBdZba3/jfkkiItIS4eyRTwSuAiYbY1Y4P9NdrktERMLU7OWH1trFgGmDWkREpBV0Z6eIiMcpyEVEPE5BLiLicQpyERGPU5CLiHicglxExOMU5CIiHqcgFxHxOAW5iIjHKchFRDxOQS4i4nEKchERj1OQi4h4nI
JcRMTjFOQiIh6nIBcR8TgFuYiIxynIRUQ8TkEuIuJxCnIREY9TkIuIeJyCXETE4xTkIiIepyAXEfE4BbmIiMcpyEVEPE5BLiLicQpyERGPU5CLiHicglxExOMU5CIiHqcgFxHxOAW5iIjHKchFRDxOQS4i4nEKchERj2s2yI0xzxlj9hhj1rRFQSIi0jLh7JH/CZjqch0iItJKzQa5tfY9YF8b1CIiIq0QsT5yY8w1xphcY0xuQUFBpBYrIiLNiFiQW2ufttbmWGtzUlNTI7VYERFphq5aERHxOAW5iIjHhXP54QvAR8BgY0y+MeY77pclIiLh8jc3g7X2srYoREREWkddKyIiHqcgFxHxOAW5iIjHKchFRDxOQS4i4nEKchERj1OQi4h4nIJcRMTjFOQiIh6nIBcR8TgFuYiIxynIRUQ8TkEuIuJxCnIREY9TkIuIeJyCXETE4xTkIiIepyAXEfE4BbmIiMcpyEVEPE5BLiLicQpyERGPU5CLiHicglxExOMU5CIiHqcgFxHxOAW5iIjHKchFRDxOQS4i4nEKchERj1OQi4h4nIJcRMTjFOQiIh6nIBcR8TgFuYiIxynIRUQ8TkEuIuJxYQW5MWaqMeYzY8wmY8xst4sSEZHwNRvkxhgf8BgwDRgGXGaMGeZ2YSIiEp5w9sjHA5ustV9YayuAF4EL3S1LRETC5Q9jnj7Atlqv84FT6s9kjLkGuMZ5edgYs+bYy3NNd6CwvYtohmqMDNUYGR29xo5eHzRf40mtXXA4QW4aGGePGmHt08DTAMaYXGttTmuLcltHrw9UY6Soxsjo6DV29PrA3RrD6VrJB/rWep0O7HCjGBERablwgvwTYJAxJtMYEwPMAv7pblkiIhKuZrtWrLVVxpgbgf8APuA5a+3aZt72dCSKc1FHrw9UY6Soxsjo6DV29PrAxRqNtUd1d4uIiIfozk4REY9TkIuIeFxEg7w9b+U3xvQ1xrxjjFlvjFlrjPmhMz7ZGPOWMWaj8283Z7wxxjzq1LrKGDOm1rKudubfaIy5OsJ1+owxnxpjXndeZxpjljjresk5oYwxJtZ5vcmZnlFrGXc44z8zxpwX4fq6GmPmGWM2OG15agdsw5udv/EaY8wLxpi49m5HY8xzxpg9te+fiGS7GWPGGmNWO+951BjT0GXBranxl87fepUx5h/GmK61pjXYPo19zhv7GxxrjbWm/dgYY40x3Z3XHaYdnfE/cNplrTHm4Vrj3W9Ha21EfgidCN0M9AdigJXAsEgtP4z1pwFjnOFE4HNCjxR4GJjtjJ8NPOQMTwfeIHSd/ARgiTM+GfjC+bebM9wtgnXeAvwNeN15PReY5Qw/CVznDF8PPOkMzwJecoaHOW0bC2Q6be6LYH1/Br7rDMcAXTtSGxK6QW0L0KlW+32zvdsROBMYA6ypNS5i7QYsBU513vMGMC1CNZ4L+J3hh2rV2GD70MTnvLG/wbHW6IzvS+iCi61A9w7YjmcBC4FY53WPtmzHSAbpqcB/ar2+A7gjUstvRT2vAecAnwFpzrg04DNn+Cngslrzf+ZMvwx4qtb4OvMdY03pwNvAZOB15z9TYa0PUk0bOv9pT3WG/c58pn671p4vAvUlEQpJU298R2rD6juNk512eR04ryO0I5BR78MdkXZzpm2oNb7OfMdSY71pXwXmOMMNtg+NfM6b+r8ciRqBecBIII8jQd5h2pFQ+E5pYL42acdIdq00dCt/nwguP2zO4fNoYAnQ01q7E8D5t4czW2P1uvl7PALcBgSd1ylAkbW2qoF11dThTD/gzO9mff2BAuCPJtT984wxpjMdqA2ttduBXwFfAjsJtcsyOlY7VotUu/Vxht2sFeDbhPZSW1NjU/+Xj4kx5gJgu7V2Zb1JHakdTwbOcLpE3jXGjGtlja1qx0gGeVi38rvNGJMAvAzcZK092NSsDYyzTYw/1rrOB/ZYa5eFUUNT09xsZz+hQ8YnrLWjgRJCXQKNafManX
7mCwkdpvYGOhN6Mmdj62uPdmxOS2tyvVZjzF1AFTCnelQLa3HrcxMP3AXc09DkFtbi9menG6EunluBuU7/e5vUGMkgb/db+Y0x0YRCfI619hVn9G5jTJozPQ3Y44xvrF63fo+JwAXGmDxCT5CcTGgPvasxpvrGrNrrqqnDmd4F2OdifdXrzLfWLnFezyMU7B2lDQGmAFustQXW2krgFeA0OlY7VotUu+U7w67U6pwMPB+4wjrH862osZDG/wbHYgChjfZK57OTDiw3xvRqRY1utmM+8IoNWUroqLt7K2psXTu2pn+okT4jP6GTCpkc6bzPitTyw1i/Af4CPFJv/C+pe8LpYWd4BnVPlCx1xicT6ifu5vxsAZIjXOskjpzs/Dt1T2xc7wzfQN2TdHOd4Szqnjz5gsie7HwfGOwM3+u0X4dpQ0JP3lwLxDvr/TPwg47QjhzdbxqxdiP0qIwJHDlJNz1CNU4F1gGp9eZrsH1o4nPe2N/gWGusNy2PI33kHakdvw/81Bk+mVC3iWmrdoxYODkrnU7oapHNwF2RXHYY6z6d0CHIKmCF8zOdUJ/T28BG59/qP6gh9IUZm4HVQE6tZX0b2OT8fMuFWidxJMj7EzqTvsn5A1af9Y5zXm9ypvev9f67nLo/oxVn3ZupbRSQ67Tjq84HoUO1IXAfsAFYAzzvfEjatR2BFwj12VcS2tv6TiTbDchxft/NwO+pd0L6GGrcRCh0qj8zTzbXPjTyOW/sb3CsNdabnseRIO9I7RgD/NVZ9nJgclu2o27RFxHxON3ZKSLicQpyERGPU5CLiHicglxExOMU5CIiHqcgFxHxOAW5iIjH/T97WGBf7Ny8WgAAAABJRU5ErkJggg==\n",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"learn_lm.fit_one_cycle(2, lr*10, wd=wd, moms=(0.8,0.7), callbacks=[ShowGraph(learn_lm)])"
]
},
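{
"cell_type": "markdown",
"metadata": {
"hidden": true
},
"source": [
"As a sanity check on the metrics reported for the two epochs above: the `perplexity` column should be the exponential of `valid_loss` (the mean cross-entropy per token), and `error_rate` should equal `1 - accuracy`. The cell below is a small sketch that verifies this on the values copied by hand from the training output; the tolerances are assumptions that allow for rounding of the printed numbers."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"hidden": true
},
"outputs": [],
"source": [
"import math\n",
"\n",
"# (valid_loss, error_rate, accuracy, perplexity) copied from the training output above\n",
"rows = [\n",
"    (3.570395, 0.619722, 0.380278, 35.530659),\n",
"    (3.430218, 0.604340, 0.395660, 30.883373),\n",
"]\n",
"\n",
"for valid_loss, err, acc, ppl in rows:\n",
"    # perplexity should match exp(valid_loss) up to rounding of the printed values\n",
"    assert math.isclose(math.exp(valid_loss), ppl, rel_tol=1e-3)\n",
"    # error_rate and accuracy should sum to 1\n",
"    assert math.isclose(err + acc, 1.0, abs_tol=1e-6)\n",
"    print(f'exp({valid_loss}) = {math.exp(valid_loss):.4f}  (reported: {ppl})')"
]
},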
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"hidden": true
},
"outputs": [],
"source": [
"learn_lm.save(f'{lang}fine_tuned1_sp15_multifit_bwd_v2')\n",
"learn_lm.save_encoder(f'{lang}fine_tuned1_enc_sp15_multifit_bwd_v2')"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"hidden": true
},
"outputs": [
{
"data": {
"text/html": [
"<table>\n",
"  <thead>\n",
"    <tr><th>epoch</th><th>train_loss</th><th>valid_loss</th><th>error_rate</th><th>accuracy</th><th>perplexity</th><th>time</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><td>0</td><td>3.496426</td><td>3.315957</td><td>0.589387</td><td>0.410613</td><td>27.548714</td><td>07:27</td></tr>\n",
"    <tr><td>1</td><td>3.343582</td><td>3.213353</td><td>0.576103</td><td>0.423897</td><td>24.862326</td><td>07:26</td></tr>\n",
"    <tr><td>2</td><td>3.248597</td><td>3.134908</td><td>0.565534</td><td>0.434466</td><td>22.986530</td><td>07:24</td></tr>\n",
"    <tr><td>3</td><td>3.199230</td><td>3.085525</td><td>0.559169</td><td>0.440831</td><td>21.878988</td><td>07:29</td></tr>\n",
"    <tr><td>4</td><td>3.114331</td><td>3.050895</td><td>0.554850</td><td>0.445150</td><td>21.134266</td><td>07:29</td></tr>\n",
"    <tr><td>5</td><td>3.118084</td><td>3.022557</td><td>0.550928</td><td>0.449072</td><td>20.543736</td><td>07:26</td></tr>\n",
"    <tr><td>6</td><td>3.077358</td><td>3.001435</td><td>0.547984</td><td>0.452016</td><td>20.114338</td><td>07:24</td></tr>\n",
"    <tr><td>7</td><td>3.029807</td><td>2.984077</td><td>0.545689</td><td>0.454311</td><td>19.768270</td><td>07:24</td></tr>\n",
"    <tr><td>8</td><td>2.978406</td><td>2.967431</td><td>0.543427</td><td>0.456573</td><td>19.441893</td><td>07:24</td></tr>\n",
"    <tr><td>9</td><td>3.003683</td><td>2.951951</td><td>0.541148</td><td>0.458852</td><td>19.143238</td><td>07:24</td></tr>\n",
"    <tr><td>10</td><td>2.929004</td><td>2.938523</td><td>0.539094</td><td>0.460906</td><td>18.887917</td><td>07:24</td></tr>\n",
"    <tr><td>11</td><td>2.946163</td><td>2.925794</td><td>0.537161</td><td>0.462839</td><td>18.649021</td><td>07:24</td></tr>\n",
"    <tr><td>12</td><td>2.894151</td><td>2.914138</td><td>0.535388</td><td>0.464612</td><td>18.432899</td><td>07:24</td></tr>\n",
"    <tr><td>13</td><td>2.911956</td><td>2.904919</td><td>0.533872</td><td>0.466128</td><td>18.263720</td><td>07:24</td></tr>\n",
"    <tr><td>14</td><td>2.853363</td><td>2.898480</td><td>0.532755</td><td>0.467245</td><td>18.146566</td><td>07:24</td></tr>\n",
"    <tr><td>15</td><td>2.873746</td><td>2.894358</td><td>0.532119</td><td>0.467881</td><td>18.071903</td><td>07:24</td></tr>\n",
"    <tr><td>16</td><td>2.848691</td><td>2.893291</td><td>0.531773</td><td>0.468227</td><td>18.052658</td><td>07:24</td></tr>\n",
"    <tr><td>17</td><td>2.842775</td><td>2.893298</td><td>0.531823</td><td>0.468177</td><td>18.052734</td><td>07:24</td></tr>\n",
"  </tbody>\n",
"</table>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXUAAAD4CAYAAAATpHZ6AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAgAElEQVR4nO3deXgV5d3/8fc3ycnOGoKEHQQfEEWQiKAWcWsBqdallv5qW61KXVqX2gVrrX2srdo+2j5W69LHttpSq6XW3apUBRVQgwqyKWGTsIYtZM9Jcv/+mEk4CSf7yTZ+Xtd1rsxynznfM4HPTGa5x5xziIhIMMR1dgEiIhI7CnURkQBRqIuIBIhCXUQkQBTqIiIBktBZHxyf2stNPHp0Z328iEi3tHz58j3OucyG5ndaqCf06k9OTk5nfbyISLdkZlsam6/DLyIiAaJQFxEJkE4Ndd3NKiISW512TB2gNFxFamKnliAi3Ug4HCYvL4+ysrLOLqXdJScnM3jwYEKhUIve16mJ+qe3N3PNaaM6swQR6Uby8vLo0aMHw4cPx8w6u5x245xj79695OXlMWLEiBa9t1MPv/z65Y958aMdnVmCiHQjZWVlZGRkBDrQAcyMjIyMVv1F0uknSq+e/35nlyAi3UjQA71Ga79nlzigPXzeCwD89bITGdo3laEZqZ1ckYhI99Tpe+qRLn7kHab9+nW++cd3O7sUEZHDHDhwgN///vctft+sWbM4cOBAO1R0uE4L9WMH9eLkURlR5y36JJ/i8soOrkhEpHENhXpVVVWj73vxxRfp3bt3e5VVR6cefpl/+ZTaQy/1jbv15drhzXee3VEliYg0aN68eWzYsIEJEyYQCoVIT08nKyuLDz/8kDVr1vClL32JrVu3UlZWxnXXXcfcuXMBGD58ODk5ORQVFTFz5kxOOeUUlixZwqBBg3jmmWdISUmJWY3WWTcAZWdnu8i+X77/jxUsWJ7XYPucn5xJQWmYIzPTO6I8EemC1q5dy9ixYwH47+dWs2b7wZgu/+iBPbn1i+ManL9582Zmz57NqlWreOONNzj77LNZtWpV7WWH+/bto2/fvpSWlnLCCSewaNEiMjIy6oT6qFGjyMnJYcKECVx00UWcc845XHzxxU1+3xpmttw5l91QjV3mmPr/fPk4Vvz08w3Oz759IWfcvYjc3UU6NCMiXcLkyZPrXEd+7733ctxxxzFlyhS2bt3K+vXrD3vPiBEjmDBhAgCTJk1i8+bNMa2pycMvZpYMLAaS/PYLnHO31mtzCfBrYJs/6T7n3P+1tJheqSE2/HIWP3n6Ix5/d2vUNmfeswiAH88aw8VThumOVJHPqMb2qDtKWlpa7fAbb7zBwoULWbp0KampqUyfPj3qdeZJSUm1w/Hx8ZSWlsa0pubsqZcDpzvnjgMmADPMbEqUdk845yb4rxYHeo34OOOO88dz95ePIzE+jtduPDVqu1++uI6jf/oyL6zUzUsi0jF69OhBYWFh1HkFBQX06dOH1NRU1q1bx7Jlyzq4Ok+Toe48Rf5oyH+1+4H4CyYN5pNfzGRkZjo5PzmzwXbX/O19hs97gdzdRSzZsKe9yxKRz7CMjAxOPvlkjjnmGH7wgx/UmTdjxgwqKysZP348t9xyC1OmRNv3bX/NOlFqZvHAcmAUcL9z7kf15l8C3AHkA58ANzjnDjt+YmZzgbkAQ4cOnbRlS6N9vR/mP2t3cdmjjT9Y45rTjuS6M44iMaHLnC4QkRiJduIwyFpzorRFV7+YWW/gX8B3nXOrIqZnAEXOuXIzuxK4yDl3emPLqn/1S3M55whXOT74dD9febjhP29eu/FUCkrDTBzap8WfISJdk0I9xle/OOcOAG8AM+pN3+ucK/dH/wBMaslyW8LMSEyI48SRGdw0cww/mjEmarvT717Eeb9fwuJP8nl30772KkdEpEtpztUvmUDYOXfAzFKAM4G76r
XJcs7VnLE8B1gb80qj+PapRwIQijdufyH6R36jXpcD500cxG++MqHdaxMR6QzN2VPPAl43s5XAe8Crzrnnzew2MzvHb3Otma02sxXAtcAl7VNudJd/biSb7zybBVdObbLtvz7YxvB5L8T8pgURka6gy9xRGiuFZWG+dP/bbMgvblb7rF7JfDl7CMP6pnLBpMExr0dEYkfH1Js+ph64O3d6JIdY+L1Tue+1XOZMHsrG/KJGT6juKCjj3v94d319uq+EG846qqNKFRGJuUBe92dmfPeM0WT2SOLEkRmcfWwWYwb0aPJ9//uf9Sxcs6sDKhSRz4L0dK+vqu3bt3PhhRdGbTN9+nRiedQicHvq0dz/teNrhwtKw2zdV8Ls370Vte3ljx1auct/ciYZ6Ydu6f3XB3k8/u5Wnvx208fuRURqDBw4kAULFnTIZ3X/UC/cBen9oZmPfuqVEqLnwJ58dfJQkkNx/OntzQ22nXT7wtrh/50zgRueWAF418p/Vh6pJSKH/OhHP2LYsGFcffXVAPzsZz/DzFi8eDH79+8nHA5z++23c+6559Z5X2TvjqWlpVx66aWsWbOGsWPHxrzvl+4d6tVV8MBJkNwTxsyGsV+EQdkQ1/hRJTPjjvOPBbxOgfYVV7C7sIwZv32zwfdc9/cPa4fv/Pc6bpo5locWbSCrdwrnHDcwNt9HRJrvpXmw86PYLnPAsTDzzgZnz5kzh+uvv7421J988kn+/e9/c8MNN9CzZ0/27NnDlClTOOeccxrc8XvggQdITU1l5cqVrFy5kuOPPz5qu9bq5qFeCaffDGufh2UPwJJ7If0I+K9ZXsiPmAYJiU0upm9aIn3TEklMiKOisrrJ9g8t2shDizbWjl/7+AcAnDC8Dz+cMYZRmen0SWv6c0Wke5k4cSK7d+9m+/bt5Ofn06dPH7KysrjhhhtYvHgxcXFxbNu2jV27djFgwICoy1i8eDHXXnstAOPHj2f8+PExrbF7h3pCEmR/y3uVFcD6V2Htc/DRP2D5nyCpJ4w+ywv40WdBUuMnSz+45SxueXoVT32wjVOPymTRJ/ktKue9zfv58oNLAT2tSaTdNbJH3Z4uvPBCFixYwM6dO5kzZw7z588nPz+f5cuXEwqFGD58eNQudyO15+Hb7h3qkZJ7wbEXeq9wGWxaBOueh3Uvwqp/QnwijJwOY8729uTT+x+2iLSkBO75ygTuqXfHaUOP3GvMK6t38vlx0bfUItJ9zZkzhyuuuII9e/awaNEinnzySfr3708oFOL111+nqY4Kp02bxvz58znttNNYtWoVK1eujGl9wQn1SKFkOOoL3mv2b2HrO7DuBW8vfv0r8Nz1MHSKF/BjZkPfEY0uLucnZxJvRp+0RF7/eDeX/um9JkuY+5flAHxudD/WbD/Is989hcqqaob2Ta3dSs/+3Zt89/TRfGHcAA6WhTlYGmZwn9S2f38RaTfjxo2jsLCQQYMGkZWVxde+9jW++MUvkp2dzYQJExgzJnp/VDWuuuoqLr30UsaPH8+ECROYPHlyTOsL3B2ljXIOdq329+CfP3SSpf84GDvb24MfML7JE63//dxqJg3rw4xxAxh180ttLuuxb02u7aPm/v93PNf87X3mX34iq7cXcMXnRupKGxGf7iiNcde7sdQpoV7f/s3eHvy6F+DTpeCqvcM4Q6bAsJO8V9aERk+25heWs6eonLFZPbnpqZXsKarg1RjfwDT/8hM5eVS/mC5TpDtSqH8GuwlokT7DYeo13qt4D+QuhC1LvIBf/7LXJiEFBmfDsJNh2FQYfAIkHnouYWaPJDJ7eDco3XG+dxa7qLySY259OWZlPr9yOyeP6kd1taOssqr2uawVldV8uq+EUf3To77vvc37OOqIHvRKCcWsFhHp2j7boR4prR8cN8d7ARTle+G+ZQl8ugQW/8rbk49L8Pbeh53kBf3QEyGl7oM40pMSePDiSVz51+V1pg/slcz2gsbPikdzsKzysJO1Iz
PTKA9Xs+2Ad+PCsIxUtuwtYdMds6isdlz11+UsXLu7znvW3PYFPahbur3Pys1/rT2K8tk+/NISZQWw9V0v5Lcsge3vQ1UFYHDEOBg69dAhmx6HrnopKq/kp8+s4uZZY2u7HJjz8FKWbdzHrGMH8OJHO+t8TGpiPJ8b3Y+XV8e+D5qrph8Z9aEizjmWbtzLlBEZxMUF/z+LdF+bNm2iR48eZGRkBDrYnXPs3buXwsJCRoyoeyGHjqm3l3ApbFsOW5bClre9wA/73f32HQkDj4d+oyFj1KFX0uGHSUoqKjn6py9z2SkjuGX20XXmvbV+Dxc/8k5My452/fxFDy7l3c37OGVUP/56+YkAHCipoHeqdy6hLFzFfa/lct/ruYwb2JOTR/Xjx7M+O8c1pesIh8Pk5eU1eR14ECQnJzN48GBCobqHTxXqHaWqEnau8EN+Cez6CA5sBSLWb4+BkHGkH/ajD4V+76EQFx91sa25Rr4xI/ulcbAszMh+6by7eR9fPG4gz63YHrXtMYN6smpb9IeJ/OqC8Vx0wpCY1iYiTVOod6ZwKezbCHtzYc/6iJ/rvcM5NeITvb37jFERe/de6H+wxzjv90vqLHbOCUMor6zmXx9sA+BfV590WJuOEG2vv6rasbe4nD6piYTivUtDnXMUV1RRWVXNhNteZeH3TmVkvzTW7jzIuIG9OrpskW5Nod4VOQclew8F/N5c2JPrDe/bBNXhQ21T+kCvwd5efs8s6DkIemRBzyw2VfQia/BIknv0BTMOlFRQUBrm1F+/wXs3n0ne/hLW7iikyjlueXpVzL/G5jvPZtW2gtpujM+dMJBnPoy+11/fUUek88muIgDW/XwGyaHof6nUWLZxL1v2FvOVE4a2rWiRbk6h3t1UVcKBLXX37g9uh8LtcHAHlOw5/D2hVD/oB9YGvrcRGFg77eUt1Xz7b7G9HTlWrjz1SObNrHsC1znHmh3envyjSzZz67OrAa9/npwt+5k4tDfZftfI/7hyKgN6JjOkr+7GleBTqAdNZTkU7vACviboC3fAwW2HphXu9K/MOcRZHPnVPdnjerHH9WQvPdnrvPG99OTG807hiAGDufedAv7yUTF7yg7dVfvy9dNIDsVxRM9kwlXVHPuzV2L+tb49bSQjM9P40T8/4nOj+/Hm+igbrya8e/MZ9O+RXDteWVXNqJtf4udfOoY/v72JV244lfiIq3sWrtnFA4s2MPOYAewpquD6M0c3+ReDSGdrc6ibWTKwGEjCu659gXPu1nptkoDHgEnAXuArzrnNjS1Xod6Oag7vHNxeu5dfdWA7/3jjPaYNMgaGiqBot3fDVTj6A7qrQukUxvemV7+BWHp/7zr+tExIy+Q7z2xlP+kUuDQOkkaBS6OQVKq7wNMRN995NmXhKuLMmPzLhRwoOXQo65UbpjG4Twob84t5dsV2Hl68sc57+6SG2F8SZsKQ3vztihPrXNNfXe04WBauvSLIOcfWfaUMzYj+10FhWZhdB8sY1b/pxyiKtEQsQt2ANOdckZmFgLeA65xzyyLaXA2Md85daWZzgPOcc19pbLkK9S6iotgL9+L8uq+iyPE9ULzb21C4hvubP+hSOEgaB50X9Km9MjhQncqGogT2V6Xw3bOzKY7rwYJVhby0oYyDpFLg0igihVKSgLZfdzxj3AD+vXpn0w2bcN//m8js8YcefnLHi2t5aPFGfnHeMXxh3IDaQz9Pfnsqo/qnc8+rHzN+cG9+uKDuIa7FPzitweAXaY2YHn4xs1S8UL/KOfdOxPSXgZ8555aaWQKwE8h0jSxcod4NVVdB6X4qCnawKvdTju9vUHbAu5Kn9ACVJftxpQewsgJ27NrJoORy4soPUl26n7hwSaOLrnJGMSkUkUyRS6GIlDo/i0mmkBSK/WmF/rQil+q9p3ZeMmUk0tYNRHIojrJw0w9MaY5oVwlVVztG/vhFvj5lGDfNGsPWfaWE4o30pASWbdpHr5QQQ/qkMCwjjY
rKalISdVhIPDHp+8XM4oHlwCjg/shA9w0CtgI45yrNrADIAPbUW85cYC7A0KG6iqHbiYuHtH4kpvXj+IHHHjY78h9T5BXscQCVFVB+EEr9jUDZfqpLDnDLE2+TRinpVko6ZaTXDpcyZXAim7ZtJz2ulKykMHHhokb/UqhR6eIoIZkikimu2SC4FIqp2RAkexsQ/2exvyEpJpli528gwskUk0wJyVSQQFs3EuGqaqqqXe0x+5E/fhGAvyzbwl+WNd7/NjTvCqH6hs97gZ+fO46vTx3Om+vzeePj/MNucJPgaemeem/gX8B3nXOrIqavBr7gnMvzxzcAk51zextalvbUJVLNTVYbfzmL9buLGNI3heSEeK574kMuOWk4k4b18c4VhEugvAjKC72NRIU3fMNjb5Jm3kah9idlpPkbi5pp6eZNT7cyQlQ2q7awi/cCn2RKXDLFJPk/vdAvrjOcdNi0r50yht+9uY0Skih1SZTivcIt7Hpp6U2nk9UrhZc+2sHi9fl876z/qu1Mrkbu7kKyeqUwrokO5Zr7ZK7tB0oxg1dW7+Lz446gtKKKkZnRO5CTjhHzq1/M7Fag2Dn3PxHTdPhF2uRgWZh4M9KSWtfhWOSdt1eeeiQXThrEmfcs5pPbZ3LnS+s4Y2z/w7ovfm75JqYNS6ZXfLm3oagooqK0gOsffYs0KyOVctIoI9XKvA1ExHCq+fP8DUTNz5YIu3hKSfRC3g97L/gTKSWZUhIpidgIlLpEPnf0UJ5ds99v7723jMQ646UukTK88coGNhyv3DCNo47oQe7uQjLSkuiTlkju7kLA6vT6Ge2O5nkzx3BR9hC27C3m2EG9SIjv/BPknyWxOFGaCYSdcwfMLAV4BbjLOfd8RJtrgGMjTpSe75y7qLHlKtQllv7+7qfMe8p76Mk/rzrJ27NvpYa6Znj8iilMPTKDM+5+g4uyhzCoTwrLNu7lqumjSAnF0zclgYqyYk667TlSrYxJA0Js3bmbFKsghXLvZRWkUk4y5aRaeZ3hFCpIs3JG943HwiUUFhb6071XkjXvL4tINRuOstqNhbcRyOjdi8q4JD7eW0kZIcr8eeUk1o6X4b9cIqUk+vMSI+aFmHb0EG47f5J3V3RCEsQnUuXgw637mTSsb20d3/jjuyz+JJ/cX8ykNFzFJ7sKCVc5RmamUVxexY1Pfshjl51Ieis36p8lsQj18cCjQDze4dEnnXO3mdltQI5z7ln/sse/ABOBfcAc59zGBheKQl3aR2lFVZtPKj60aAN3vLSON74/nTdz9/Drf69jwtA+/PmSE5rVi2Xe/hLKK6sJV1Uz47dvNtgu9xcz+XDrAQpKw1z2qPd/IfKwyJfuf5v0pAQ25Bexo6CMeKpIpoIUKkj2NwJe4FeQYuX+vPLajUiyP72mXbIdap9sFSQT8Yoct3BDJTdLuQtRTgIVhKgggQoXorxmmBAVzhv2pvltXYjePVKZOX4oxCdQXGkkhBJJSkyG+ASIC/H0R7s5ZnA/RmX1ZndxFe/nFfHMR/ncOGMcR/RJZ1+pY1hmL4gPgcV754Di4g8N+z9LKiE1KdGfn+BPj4t4T0LEezqnJ0jnHNWOOvdV1NDNRyKdpKracaR/QrTGG9+fzgUPLOGBiycxecShPdk/LN7Iq2t38eS3p0ZdVms6drt4ylD+uuzT2vEFV05lT1HFYf3812dUk0hl7UbCC/xwnfBP8jcuSRYmEe+VRGXEeKU/LUyi1R+vJKlmuKathQlRSd/kOCrDFbiqMCGravF3bh9+sJp5w/V/NjHPGRhxLTrXXlJRRbiqOuoDbuymrXrykUhniI8zlt50Oj2TQyTEG4nxcZgZy28567C2V0wbyRXTRjZrueMH9+LBiyeRmhjPxJ+/Ss1+2YvXfo4xA3rwwKINnDG2P2MG9GTmMVmM7p9O/57enbb5heVNLt8RR7l/uKWA9NqORp/89lQuemhp8758a9Upz5FAFQlUEaKKBCoPDVsVIX
88cn7IvPF4qrn+9CMZ1DPELU+vJJ5q4qkmjmoSrIrR/VI4+9gjmL9kI6MzU1iVt584v433qiLeqjEcozLT2ZBfyJQRfdlfXMHA3kkkGCQlxPHy6p010c3XTxzK0g172HWwlNKKSk4Z1Y94gzfX53N0Vg+mjMygLFxFYkIccf5fAM45/rxkM9nD+9IrJUSvlBDJoXieeMe7ImpSRh+Wb9nPN04aTmFp2O8k755GV6H21EW6gZpHJN5x/rF8dfKhy4EPlFQw4bZXmXPCEO68YHyzltXQXv/3P38UGelJ3PXvdXzntFGM6JfGZY/msPB706LeGRvrbqE/a351wXh++M+W98e05a7ZOvwiIod8+cElvLd5f21fOTsLykhLiqdHcsueZVtRWc1RP3kJgGtPH8W9r+XWzjt/4iCe8ruGjnTbueP4xtThFJSGKQtXceIv/9O2L/MZpFAXkXbjnGN/SZi+aYmUVFRSXF5V59r519btYu5jy1n4vVMZ3i8t6vtzdxdx1m8W15l++pj+PPLNbEbcVPecxNKbTufuVz5h6Ya9vD3vdLYdKOXkO19rny/XRSnURaTLm/27N1m17SD/ufFUXl+3m8tOGYGZsbOgjAOlFYzKTKessrrJSx53FJQy9Y66IX/XBccSio+jX3oS047K5OQ7X2PbgVLSEuNZcevnueKxHEoqqiirrObO849lbFZPAH7+/BoeeWtTk7Xf/eXjuPEfK1r/5aOo6cNoUO+U2ofLg/fA+de/f5pCXUS6tsqqarYfKGtz52cFJWGOu83rGnrZTWfwya5Cph2V2erllYWreG7Fdi44fjCzf/cWJRWVPPj1SSTGx7Ei7wDnTRxc23ZPUTm9UkLsKSqnsKySz/9mMacelclPzh7L3a98whlj+zOkbyrvbdrH3a9+Uvu+M8f2Z2DvFB5b6p0cvf1Lx3DxlGF16qg5f7H5zrN1SaOIfLb8YfFGzp0wsPaKn65o+Zb9DOiVTO+UULPuov7Nq5+walsBj1xygkJdRCRImgp1ddogIhIgCnURkQBRqIuIBIhCXUQkQBTqIiIBolAXEQkQhbqISIAo1EVEAkShLiISIAp1EZEAUaiLiARIk6FuZkPM7HUzW2tmq83suihtpptZgZl96L9+2j7liohIY5rzjNJK4Ebn3Ptm1gNYbmavOufW1Gv3pnNuduxLFBGR5mpyT905t8M5974/XAisBQa1d2EiItJyLTqmbmbDgYnAO1FmTzWzFWb2kpmNa+D9c80sx8xy8vPzW1ysiIg0rtmhbmbpwD+B651zB+vNfh8Y5pw7Dvgd8HS0ZTjnHnbOZTvnsjMzW/80EhERia5ZoW5mIbxAn++ce6r+fOfcQedckT/8IhAys34xrVRERJrUnKtfDHgEWOucu6eBNgP8dpjZZH+5e2NZqIiINK05V7+cDHwd+MjMPvSn/RgYCuCcexC4ELjKzCqBUmCO66zn5ImIfIY1GerOubcAa6LNfcB9sSpKRERaR3eUiogEiEJdRCRAFOoiIgGiUBcRCRCFuohIgCjURUQCRKEuIhIgCnURkQBRqIuIBIhCXUQkQBTqIiIBolAXEQkQhbqISIAo1EVEAkShLiISIAp1EZEAUaiLiASIQl1EJEAU6iIiAaJQFxEJkCZD3cyGmNnrZrbWzFab2XVR2piZ3WtmuWa20syOb59yRUSkMQnNaFMJ3Oice9/MegDLzexV59yaiDYzgdH+60TgAf+niIh0oCb31J1zO5xz7/vDhcBaYFC9ZucCjznPMqC3mWXFvFoREWlUi46pm9lwYCLwTr1Zg4CtEeN5HB78mNlcM8sxs5z8/PyWVSoiIk1qdqibWTrwT+B659zB+rOjvMUdNsG5h51z2c657MzMzJZVKiIiTWpWqJtZCC/Q5zvnnorSJA8YEjE+GNje9vJERKQlmnP1iwGPAGudc/c00OxZ4Bv+VTBTgALn3I4Y1ikiIs3QnKtfTga+DnxkZh/6034MDAVwzj
0IvAjMAnKBEuDS2JcqIiJNaTLUnXNvEf2YeWQbB1wTq6JERKR1dEepiEiAKNRFRAJEoS4iEiAKdRGRAFGoi4gEiEJdRCRAFOoiIgGiUBcRCRCFuohIgCjURUQCRKEuIhIgCnURkQBRqIuIBIhCXUQkQBTqIiIBolAXEQkQhbqISIAo1EVEAkShLiISIE2Gupn90cx2m9mqBuZPN7MCM/vQf/009mWKiEhzNPngaeDPwH3AY420edM5NzsmFYmISKs1uafunFsM7OuAWkREpI1idUx9qpmtMLOXzGxcQ43MbK6Z5ZhZTn5+fow+WkREasQi1N8HhjnnjgN+BzzdUEPn3MPOuWznXHZmZmYMPlpERCK1OdSdcwedc0X+8ItAyMz6tbkyERFpsTaHupkNMDPzhyf7y9zb1uWKiEjLNXn1i5k9DkwH+plZHnArEAJwzj0IXAhcZWaVQCkwxznn2q1iERFpUJOh7pz7ahPz78O75FFERDqZ7igVEQkQhbqISIAo1EVEAkShLiISIAp1EZEAUaiLiASIQl1EJEAU6iIiAaJQFxEJEIW6iEiAKNRFRAJEoS4iEiAKdRGRAFGoi4gEiEJdRCRAFOoiIgGiUBcRCRCFuohIgCjURUQCRKEuIhIgTYa6mf3RzHab2aoG5puZ3WtmuWa20syOj32ZIiLSHM3ZU/8zMKOR+TOB0f5rLvBA28sSEZHWaDLUnXOLgX2NNDkXeMx5lgG9zSwrVgWKiEjzxeKY+iBga8R4nj/tMGY218xyzCwnPz8/Bh8tIiKRYhHqFmWai9bQOfewcy7bOZedmZkZg48WEZFIsQj1PGBIxPhgYHsMlisiIi0Ui1B/FviGfxXMFKDAObcjBssVEZEWSmiqgZk9DkwH+plZHnArEAJwzj0IvAjMAnKBEuDS9ipWREQa12SoO+e+2sR8B1wTs4pERKTVdEepiEiAKNRFRAJEoS4iEiAKdRGRAFGoi4gEiEJdRCRAFOoiIgGiUBcRCRCFuohIgCjURUQCRKEuIhIgCnURkQBRqIuIBIhCXUQkQBTqIiIBolAXEQkQhbqISIAo1EVEAkShLiISIM0KdTObYWYfm1mumc2LMv8SM8s3sw/91+WxL1VERJrS5KUEt6AAAAfvSURBVIOnzSweuB84C8gD3jOzZ51za+o1fcI59512qFFERJqpOXvqk4Fc59xG51wF8Hfg3PYtS0REWqM5oT4I2BoxnudPq+8CM1tpZgvMbEi0BZnZXDPLMbOc/Pz8VpQrIiKNaU6oW5Rprt74c8Bw59x4YCHwaLQFOeceds5lO+eyMzMzW1apiIg0qTmhngdE7nkPBrZHNnDO7XXOlfujfwAmxaY8ERFpieaE+nvAaDMbYWaJwBzg2cgGZpYVMXoOsDZ2JYqISHM1efWLc67SzL4DvAzEA390zq02s9uAHOfcs8C1ZnYOUAnsAy5px5pFRKQB5lz9w+MdIzs72+Xk5HTKZ4uIdFdmttw5l93QfN1RKiISIAp1EZEAUaiLiASIQl1EJEAU6iIiAaJQFxEJEIW6iEiAKNRFRAJEoS4iEiAKdRGRAFGoi4gEiEJdRCRAFOoiIgGiUBcRCRCFuohIgCjURUQCRKEuIhIgCnURkQBRqIuIBIhCXUQkQJoV6mY2w8w+NrNcM5sXZX6SmT3hz3/HzIbHulAREWlak6FuZvHA/cBM4Gjgq2Z2dL1mlwH7nXOjgN8Ad8W6UBERaVpz9tQnA7nOuY3OuQrg78C59dqcCzzqDy8AzjAzi12ZIiLSHAnNaDMI2Boxngec2FAb51ylmRUAGcCeyEZmNheY64+Wm9mq1hTdyfpR73t1A6q543THulVzx4lF3cMam9mcUI+2x+1a0Qbn3MPAwwBmluOcy27G53cp3bFu1dxxumPdqrnjdETdzTn8kgcMiRgfDGxvqI2ZJQC9gH2xKFBERJqvOaH+HjDazEaYWSIwB3i2XptngW/6wxcCrznnDttTFxGR9t
Xk4Rf/GPl3gJeBeOCPzrnVZnYbkOOcexZ4BPiLmeXi7aHPacZnP9yGujtTd6xbNXec7li3au447V63aYdaRCQ4dEepiEiAKNRFRAKkU0K9qW4HOuDzh5jZ62a21sxWm9l1/vS+Zvaqma33f/bxp5uZ3evXu9LMjo9Y1jf99uvN7JsR0yeZ2Uf+e+6N1c1YZhZvZh+Y2fP++Ai/a4b1flcNif70BrtuMLOb/Okfm9kXIqbH/PdiZr3NbIGZrfPX99Rusp5v8P9trDKzx80suautazP7o5ntjrzfoyPWbUOf0ca6f+3/G1lpZv8ys94R81q0Dlvze2pNzRHzvm9mzsz6+eOdu66dcx36wjvZugEYCSQCK4CjO7iGLOB4f7gH8AleFwi/Aub50+cBd/nDs4CX8K7HnwK840/vC2z0f/bxh/v4894FpvrveQmYGaPavwf8DXjeH38SmOMPPwhc5Q9fDTzoD88BnvCHj/bXeRIwwv9dxLfX7wXvTuPL/eFEoHdXX894N9NtAlIi1vElXW1dA9OA44FVEdPafd029BltrPvzQII/fFdE3S1ehy39PbW2Zn/6ELyLSLYA/brCuu6wII1YCVOBlyPGbwJu6ug66tX0DHAW8DGQ5U/LAj72hx8CvhrR/mN//leBhyKmP+RPywLWRUyv064NdQ4G/gOcDjzv/wPYE/GfoXbd+v/QpvrDCX47q7++a9q1x+8F6IkXjlZveldfzzV3SPf1193zwBe64roGhlM3HNt93Tb0GW2pu96884D50dZNU+uwNf8n2lIzXrcoxwGbORTqnbquO+PwS7RuBwZ1Qh0A+H+CTQTeAY5wzu0A8H/295s1VHNj0/OiTG+r3wI/BKr98QzggHOuMsrn1Om6AajpuqGl36UtRgL5wJ/MO2T0f2aWRhdfz865bcD/AJ8CO/DW3XK69rqu0RHrtqHPiJVv4e2ttqbu1vyfaBUzOwfY5pxbUW9Wp67rzgj1ZnUp0BHMLB34J3C9c+5gY02jTHOtmN5qZjYb2O2cW96Muhqb12E14+0NHQ884JybCBTj/QnZkK5QM/5xy3Px/twfCKTh9VLa0Gd1ibqb0B1qxMxuBiqB+TWTGqijNXXH7DuZWSpwM/DTaLNbWFtM13VnhHpzuh1od2YWwgv0+c65p/zJu8wsy5+fBez2pzdUc2PTB0eZ3hYnA+eY2Wa8njJPx9tz721e1wz1P6ehrhta+l3aIg/Ic869448vwAv5rryeAc4ENjnn8p1zYeAp4CS69rqu0RHrtqHPaBP/xOFs4GvOP97Qirr30PLfU2scibfRX+H/nxwMvG9mA1pRc2zXdWuO47Xlhbf3ttFfITUnOMZ1cA0GPAb8tt70X1P3pMSv/OGzqXvi411/el+8Y8Z9/NcmoK8/7z2/bc2Jj1kxrH86h06U/oO6J4Wu9oevoe5JoSf94XHUPfG0Ee+kU7v8XoA3gf/yh3/mr+MuvZ7xeiFdDaT6y30U+G5XXNccfky93ddtQ5/RxrpnAGuAzHrtWrwOW/p7am3N9eZt5tAx9U5d1x0WpPVWwCy8K042ADd3wuefgvfnzUrgQ/81C+/42n+A9f7PmhVueA8K2QB8BGRHLOtbQK7/ujRiejawyn/PfbTghEwz6p/OoVAfiXfmPNf/x5zkT0/2x3P9+SMj3n+zX9fHRFwt0h6/F2ACkOOv66f9f8xdfj0D/w2s85f9F7xQ6VLrGngc75h/GG9v77KOWLcNfUYb687FO95c8//xwdauw9b8nlpTc735mzkU6p26rtVNgIhIgOiOUhGRAFGoi4gEiEJdRCRAFOoiIgGiUBcRCRCFuohIgCjURUQC5P8Dz6kUYU6+elEAAAAASUVORK5CYII=\n",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"learn_lm.load(f'{lang}fine_tuned1_sp15_multifit_bwd_v2')\n",
"learn_lm.unfreeze()\n",
"learn_lm.fit_one_cycle(18, lr, wd=wd, moms=(0.8,0.7), callbacks=[ShowGraph(learn_lm)])"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"hidden": true
},
"outputs": [],
"source": [
"learn_lm.save(f'{lang}fine_tuned2_sp15_multifit_bwd_v2')\n",
"learn_lm.save_encoder(f'{lang}fine_tuned2_enc_sp15_multifit_bwd_v2')"
]
},
{
"cell_type": "markdown",
"metadata": {
"hidden": true
},
"source": [
"Save best LM learner and its encoder"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"hidden": true
},
"outputs": [],
"source": [
"learn_lm.save(f'{lang}fine_tuned_sp15_multifit_bwd_v2')\n",
"learn_lm.save_encoder(f'{lang}fine_tuned_enc_sp15_multifit_bwd_v2')"
]
},
{
"cell_type": "markdown",
"metadata": {
"heading_collapsed": true
},
"source": [
"## Fine-tuning \"forward Classifier\""
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"hidden": true
},
"outputs": [],
"source": [
"bs = 18"
]
},
{
"cell_type": "markdown",
"metadata": {
"hidden": true
},
"source": [
"### Databunch"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"hidden": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 1.8 s, sys: 548 ms, total: 2.34 s\n",
"Wall time: 4.86 s\n"
]
}
],
"source": [
"%%time\n",
"data_lm = load_data(path, f'{lang}_databunch_lm_aws_sp15_multifit_v2', bs=bs)"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"hidden": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 40 s, sys: 1.7 s, total: 41.7 s\n",
"Wall time: 1min 5s\n"
]
}
],
"source": [
"%%time\n",
"data_clas = (TextList.from_df(df_trn_val, path, cols=reviews, processor=SPProcessor.load(dest))\n",
" .split_by_rand_pct(0.1, seed=42)\n",
" .label_from_df(cols=label)\n",
" .databunch(bs=bs, num_workers=1))"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"hidden": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 5.04 s, sys: 744 ms, total: 5.78 s\n",
"Wall time: 5.28 s\n"
]
}
],
"source": [
"%%time\n",
"data_clas.save(f'{lang}_textlist_class_sp15_multifit_v2')"
]
},
{
"cell_type": "markdown",
"metadata": {
"heading_collapsed": true,
"hidden": true
},
"source": [
"### Get weights to penalize loss function of the majority class"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"hidden": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 12 s, sys: 884 ms, total: 12.9 s\n",
"Wall time: 14.4 s\n"
]
}
],
"source": [
"%%time\n",
"data_clas = load_data(path, f'{lang}_textlist_class_sp15_multifit_v2', bs=bs, num_workers=1)"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"hidden": true
},
"outputs": [
{
"data": {
"text/plain": [
"(199311, 22145, 221456)"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"num_trn = len(data_clas.train_ds.x)\n",
"num_val = len(data_clas.valid_ds.x)\n",
"num_trn, num_val, num_trn+num_val"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"hidden": true
},
"outputs": [
{
"data": {
"text/plain": [
"(array([ 22694, 176617]), array([ 2412, 19733]))"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"trn_LabelCounts = np.unique(data_clas.train_ds.y.items, return_counts=True)[1]\n",
"val_LabelCounts = np.unique(data_clas.valid_ds.y.items, return_counts=True)[1]\n",
"trn_LabelCounts, val_LabelCounts"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"hidden": true
},
"outputs": [
{
"data": {
"text/plain": [
"([0.8861377445299056, 0.11386225547009443],\n",
" [0.891081508241138, 0.10891849175886203])"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"trn_weights = [1 - count/num_trn for count in trn_LabelCounts]\n",
"val_weights = [1 - count/num_val for count in val_LabelCounts]\n",
"trn_weights, val_weights"
]
},
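{
"cell_type": "markdown",
"metadata": {
"hidden": true
},
"source": [
"Each weight above is one minus the class's share of its dataset, so the rarer class (about 11% of the training reviews) receives the larger weight. As a quick sanity check, here is a minimal standalone sketch of the same computation using only NumPy and the training counts printed above. The variable names are illustrative and are not part of the notebook's pipeline.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"# Training label counts copied from the notebook output above\n",
"counts = np.array([22694, 176617])\n",
"\n",
"# Weight of a class = 1 - its relative frequency\n",
"weights = 1 - counts / counts.sum()\n",
"print(weights)\n",
"```\n",
"\n",
"With only two classes, the two weights sum to 1, so each class is weighted by the frequency of the other one."
]
},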
{
"cell_type": "markdown",
"metadata": {
"hidden": true
},
"source": [
"### Training (Loss = FlattenedLoss of weighted CrossEntropyLoss)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"hidden": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 12.8 s, sys: 428 ms, total: 13.2 s\n",
"Wall time: 12.6 s\n"
]
}
],
"source": [
"%%time\n",
"data_clas = load_data(path, f'{lang}_textlist_class_sp15_multifit_v2', bs=bs, num_workers=1)"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"hidden": true
},
"outputs": [],
"source": [
"config = awd_lstm_clas_config.copy()\n",
"config['qrnn'] = True\n",
"config['n_hid'] = 1550 #default 1152\n",
"config['n_layers'] = 4 #default 3"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"hidden": true
},
"outputs": [],
"source": [
"learn_c = text_classifier_learner(data_clas, AWD_LSTM, config=config, pretrained=False, drop_mult=0.5, \n",
" metrics=[accuracy,f1]).to_fp16()\n",
"learn_c.load_encoder(f'{lang}fine_tuned_enc_sp15_multifit_v2');"
]
},
{
"cell_type": "markdown",
"metadata": {
"hidden": true
},
"source": [
"#### Change loss function"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"hidden": true
},
"outputs": [
{
"data": {
"text/plain": [
"FlattenedLoss of CrossEntropyLoss()"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"learn_c.loss_func"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"hidden": true
},
"outputs": [],
"source": [
"loss_weights = torch.FloatTensor(trn_weights).cuda()\n",
"learn_c.loss_func = partial(F.cross_entropy, weight=loss_weights)"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"hidden": true
},
"outputs": [
{
"data": {
"text/plain": [
"functools.partial(, weight=tensor([0.8861, 0.1139], device='cuda:0'))"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"learn_c.loss_func"
]
},
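{
"cell_type": "markdown",
"metadata": {
"hidden": true
},
"source": [
"To make the effect of the `weight` argument concrete, here is a small CPU-only sketch. The logits and targets are toy values invented for illustration, and the weights are the rounded values shown above. It checks that `F.cross_entropy(..., weight=w)` with the default `reduction='mean'` equals the per-sample losses, each scaled by the weight of its target class, summed and divided by the sum of those weights.\n",
"\n",
"```python\n",
"import torch\n",
"import torch.nn.functional as F\n",
"\n",
"# Toy batch: 3 samples, 2 classes (values are illustrative only)\n",
"logits = torch.tensor([[2.0, 0.5], [0.1, 1.5], [1.0, 1.0]])\n",
"targets = torch.tensor([0, 1, 1])\n",
"w = torch.tensor([0.8861, 0.1139])  # rounded class weights from above\n",
"\n",
"weighted = F.cross_entropy(logits, targets, weight=w)\n",
"\n",
"# Manual check: weighted mean of the unweighted per-sample losses\n",
"per_sample = F.cross_entropy(logits, targets, reduction='none')\n",
"manual = (w[targets] * per_sample).sum() / w[targets].sum()\n",
"print(weighted.item(), manual.item())\n",
"```\n",
"\n",
"Because the normalisation uses the sum of the weights present in the batch, a batch containing only one class gives the same loss as the unweighted version. The weights only change the result when both classes appear together."
]
},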
{
"cell_type": "markdown",
"metadata": {
"hidden": true
},
"source": [
"#### Training"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"hidden": true
},
"outputs": [],
"source": [
"learn_c.freeze()"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {
"hidden": true
},
"outputs": [
{
"data": {
"text/html": [],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.\n"
]
}
],
"source": [
"learn_c.lr_find()"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {
"hidden": true
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAY8AAAEGCAYAAACdJRn3AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAgAElEQVR4nO3dd3zV5fn/8deVHbIISQgjQNhDQNCIIIp71tZtsVp3qbba1lb707ZWv7S2Wts6Wqt1t7bu0aJiQRQUxQGorEgg7BBCAoSEkJ3cvz/OCR4hE3JyRt7Px+M8OOczzue6E3Kuc8+POecQERHpiIhAByAiIqFHyUNERDpMyUNERDpMyUNERDpMyUNERDosKtABdJb09HSXnZ0d6DBERELK0qVLdzjnMjp6Xtgkj+zsbJYsWRLoMEREQoqZbTqY89RsJSIiHabkISIiHabkISIiHabkISIiHabkISIiHabkISIiHabkISIiHabkISISwl5ZWsBzn27u8usqeYiIhLAXFm/htc+3dvl1lTxEREJYQWklWanxXX5dJQ8RkRBVW99IUXk1Wak9uvzaSh4iIiFqW1kVjQ7VPEREpP0KSqsAGKCah4iItFdBaSWgmoeIiHTAll1VREYYfVPiuvzaSh4iIiGqoLSSvilxREV2/Ue5koeISIgqKK0KSJMVKHmIiIQsT/Lo+s5yUPIQEQlJNfUNbN9THZ41DzM7w8zyzCzfzG5tZv9AM5tvZp+b2XIzO8u7PdvMqszsC+/jEX/GKSISagp3V+NcYIbpAkT5643NLBJ4CDgVKAAWm9ks51yuz2G/Al50zj1sZmOA2UC2d98659wEf8UnIhLKAjlMF/xb85gE5Dvn1jvnaoHngXP2O8YByd7nKUChH+MREQkbTRMEs3qFX59Hf2CLz+sC7zZfdwKXmVkBnlrHjT77Bnubs94zs+Oau4CZzTCzJWa2pKSkpBNDFxEJblt2VRIVYfRJ7vo5HuDf5GHNbHP7vb4EeNo5lwWcBTxjZhHANmCgc24i8FPgWTNL3u9cnHOPOudynHM5GRkZnRy+iEjwKiitol/PeCIjmvuo9T9/Jo8CYIDP6ywObJa6BngRwDn3ERAHpDvnapxzO73blwLrgBF+jFVEJKQEain2Jv5MHouB4WY22MxigOnArP2O2QycDGBmo/EkjxIzy/B2uGNmQ4DhwHo/xioiElICOUEQ/DjayjlXb2Y3AHOASOBJ59wqM5sJLHHOzQJ+BjxmZjfhadK60jnnzGwaMNPM6oEG4Drn3C5/xSoiEkqq6xoo3lMTsGG64MfkAeCcm42nI9x32699nucCU5s57xXgFX/GJiISqrbubhppFZ7NViIi4gf7hukGsOah5CEiEmICPUEQlDxERELOll1VREcamUmBmeMBSh4iIiGnoLSS/j3jiQjQHA9Q8hARCTmBXIq9iZKHiEiICfQEQVDyEBEJKVW1DeyoqGVAgBZEbKLkISISQrbuDvxIK1DyEBEJKVv2zfFQ8hARkXYq2OWpeQRyaRJQ8hARCSkFpVXEREWQnhgb0DiUPEREQkhBaRVZAZ7jAUoeIiIhZf2OvQxMC2yTFSh5iIiEjIZGx/qSCoZlJAY6FCUPEZFQUbi7ipr6Rob1VvIQEZF2yi+uAFDyEBGR9mtKHkPVbCUiIu2VX1xBWkIMqQkxgQ5FyUNEJFTkl1QwNAiarEDJQ0QkJDjnyC+uCIr+DlDyEBEJCTv31lJWVRcUw3RByUNEJCQE00grUPIQEQkJ+0ZaKXmIiEh75RdX0CMmkn4pcYEOBVDyEBEJCetKKhiakYhZYBdEbKLkISISAtYF0UgrUPIQEQl6e2vqKSyrVvIQEZH2W1fStCxJQoAj+YqSh4hIkAu2Ybqg5CEiEvTyiyuIijAGpanmISIi7ZRfXMGgtB5ERwbPR3bwRCIiIs1aVxJcI61AyUNEJKjVNTSyaWdlUN
zDw5eSh4hIENu0cy/1ja571TzM7AwzyzOzfDO7tZn9A81svpl9bmbLzewsn323ec/LM7PT/RmniEiwCsaRVgBR/npjM4sEHgJOBQqAxWY2yzmX63PYr4AXnXMPm9kYYDaQ7X0+HTgM6AfMM7MRzrkGf8UrIhKMgunWs778WfOYBOQ759Y752qB54Fz9jvGAcne5ylAoff5OcDzzrka59wGIN/7fiIi3cq6kr30S4kjIdZv3/UPij+TR39gi8/rAu82X3cCl5lZAZ5ax40dOBczm2FmS8xsSUlJSWfFLSISNL7cVs7wzKRAh3EAfyaP5pZ+dPu9vgR42jmXBZwFPGNmEe08F+fco865HOdcTkZGxiEHLCISTKpqG1hbXMH4rJRAh3IAf9aDCoABPq+z+KpZqsk1wBkAzrmPzCwOSG/nuSIiYS13WzkNjY5x/YMvefiz5rEYGG5mg80sBk8H+Kz9jtkMnAxgZqOBOKDEe9x0M4s1s8HAcOBTP8YqIhJ0VhTsBmB8Vs8AR3Igv9U8nHP1ZnYDMAeIBJ50zq0ys5nAEufcLOBnwGNmdhOeZqkrnXMOWGVmLwK5QD3wQ420EpHuZvnWMjKSYslMjg10KAfwa/e9c242no5w322/9nmeC0xt4dy7gLv8GZ+ISDBbubWM8f1Tgubugb40w1xEJAjtraknv7iCsUHY3wFKHiIiQSl3WzmNjqAcaQVKHiIiQWl5QRlAUI60AiUPEZGgtKJgN32S4+idHBfoUJql5CEiEoRWbC1jXJA2WYGSh4hI0NlTXcf6HXsZH6RNVqDkISISdFYVluMcjFXNQ0RE2mtFkHeWg5KHiEjQWb61jP4940lPDL6Z5U2UPEREgszKrWVBXesAJQ8RkaBSVlXHhh17g3qkFSh5iIgElVVbPf0dwTqzvImSh4hIEFnuTR5j+yl5iIhIO60oKGNAr3hSE2ICHUqrlDxERILI6qJyxvRNDnQYbVLyEBEJEjX1DWzcWcmIzKRAh9ImJQ8RkSCxYcdeGhodw3onBjqUNil5iIgEibXbKwBU8xARkfZbu30PEQZDMhICHUqblDxERILEmu0VZKclEBsVGehQ2qTkISISJNYU72F4ZvD3d4CSh4hIUKipb2BTiIy0AiUPEZGgEEojrUDJQ0QkKKwJoZFWoOQhIhIU8rfvITLCQmKkFSh5iIgEhTXbKxiU1iMkRlqBkoeISFBYU7yH4SHS3wFKHiIiARdqI62gncnDzIaaWaz3+Qlm9iMz6+nf0EREuoemkVbDwy15AK8ADWY2DHgCGAw867eoRES6kaaRVuHYbNXonKsHzgPud87dBPT1X1giIt3H2hAbaQXtTx51ZnYJcAXwhndbtH9CEhHpXtZs3xNSI62g/cnjKmAKcJdzboOZDQb+5b+wRES6j7XFFSHVZAUQ1Z6DnHO5wI8AzCwVSHLO3e3PwEREuoOmkVbfGBdaPQHtHW21wMySzawXsAx4ysz+3I7zzjCzPDPLN7Nbm9l/n5l94X2sMbPdPvsafPbN6kihRERCxfqS0BtpBe2seQApzrlyM7sWeMo5d4eZLW/tBDOLBB4CTgUKgMVmNstbiwHA2/HedPyNwESft6hyzk1ob0FERELR2uKmNa1Cq9mqvX0eUWbWF7iYrzrM2zIJyHfOrXfO1QLPA+e0cvwlwHPtfG8RkbDQNNJqcHrojLSC9iePmcAcYJ1zbrGZDQHWtnFOf2CLz+sC77YDmNkgPHNH3vXZHGdmS8zsYzM7t4XzZniPWVJSUtLOooiIBI9QHGkF7e8wfwl4yef1euCCNk6z5t6qhWOnAy875xp8tg10zhV6E9W7ZrbCObduv7geBR4FyMnJaem9RUSC1uZdVQxOC61aB7S/wzzLzF4zs2Iz225mr5hZVhunFQADfF5nAYUtHDud/ZqsnHOF3n/XAwv4en+IiEhY2FpaSf/U+ECH0WHtbbZ6CpgF9MPT9PS6d1trFgPDzWywmcXgSRAHjJ
oys5FAKvCRz7ZUn7W00oGpQO7+54qIhLI91XWUV9fTr2f4Jo8M59xTzrl67+NpIKO1E7zLmdyAp6/kS+BF59wqM5tpZt/yOfQS4HnnnG+z02hgiZktA+YDd/uO0hIRCQeFu6sB6B+CyaO9Q3V3mNllfNW0dAmws62TnHOzgdn7bfv1fq/vbOa8RcC4dsYmIhKSCndXAYR1zeNqPMN0i4BtwIV4liwREZGDVOBNHlnh2ufhnNvsnPuWcy7DOdfbOXcucL6fYxMRCWuFu6uIjjQyEmMDHUqHHcqdBH/aaVGIiHRDW0ur6JsST0REczMbgtuhJI/QK62ISBAp3F1Fv55xgQ7joBxK8tCkPBGRQ7B1dxX9e/YIdBgHpdXRVma2h+aThAGh18MjIhIk6hoa2V5eTf8QrXm0mjycc6G1RrCISIgoKqum0RGSs8vh0JqtRETkIG0N4TkeoOQRcorKqrl/3hq+3FYe6FBE5BA0TRAMxdnl0P4Z5hJgW3dX8ciCdbyweAu1DY08+cEG/n3tZMZlpQQ6NBE5CFtLVfMQP2psdPzf66s44d75PL94MxccmcULMyaTHB/Ndx7/mGVbdrf9JiISdArLqkhPjCEuOrTu49FEySPIPbd4M099uJHzJvZnwS0n8vvzx3H0kDSenzGZnj2iuezxT/h8c2mgwxSRDioorQrZWgcoeQS14vJq7n5rNVOGpHHPBeO/1jaaldqDF2ZMoVdiDN994lOe/GADBaWVAYxWRDqicHdVyPZ3gJJHUJv5Ri419Y3cdd5YzA6c0N+vZzwvzJjCkIwEZr6Ry7H3zOcbDy7kgXlr2VNdF4CIRaQ9nHNs3R3aNQ91mAep+XnFvLF8Gz89dQRDMhJbPK5PShyzbjiWDTv28nZuEXNXbef+d9awrayKuy8Y34URi0h7lVbWUV3XGNI1DyWPLlbf0MhjCzdQWllLSnw0qT1iSO0Rzdj+KQzo5VmmoLK2nl+9tpJhvRO57vih7XrfwekJzJg2lBnThnLbqyt49bMC/t8Zo0hNiOn0MjTdt6u52pCItC3UR1qBkkeXu3duHn9/bz2xURHU1Dd+bd+gtB5MHZbOnup6tu6u4qXrphAT1fGWxaumZvPcp5t5bvFmfnDCsM4KHYDPN5fyg39/xrayamKiIoiNjCA2OpJfnDWK849o67b2IgJfTRAMxft4NFHy6EJvLC/k7++t59KjB3LXeeOormugtLKWHXtqWbJpFx/m72DWF4VU1NRzyaQBHJXd66CuMyIzianD0njmo01877ghREd2TtfWWyu28ZMXvqB3ciw/Onk4tfWN1NY38t6aYu6dk8fZ4/sdVLIT6W5CfXY5KHl0mdVF5dzy0nKOHJTKHd88DIC46Ej6psTTNyWecVkpXDV1MHUNjeQV7WFE5qEtK3bVMYO59p9LmLtqO98Y3/eQ3ss5x+MLN/C7t75kwoCePH55Dmk+N69ZkJfOlU8t5j+fb+XiowYc0rU6Q1FZNakJ0cRGheb4eQl/hburiI+OJLVHdKBDOWj6mtgFyirr+P4zS0mKi+LhS49o9dt5dGQEY/unHPI3+JNG9WZQWg+e+nBDm8c2Nrp9/Rj7K9lTw62vrOCu2V9y1ti+PPe9yV9LHADHj8jgsH7JPPLeOhoaA7NSv3OOT9bv5Np/LGHy79/hhmc/b7FMIoG2tdRzH49Q7jdU8vCzxkbHT174nMLdVTx82RH0Tu6a5ZcjIozLp2SzZFMpKwrKWjxu885KJv1uHqfe9z4PzFvLupIKALbsquRX/1nB1Hve5cWlW7j+hKH85ZKJzc6GNTN+eOIw1u/Yy/9WFnV6WVYXlbNrb22L++euKuLchz7k249+zNJNuzhldCZv527nhcVbOj0Wkc5QWFZF/9TQvI9HEzVb+dlzizczP6+EmeccxpGDDq4P42BdlJPFn+fm8dSiDfz54gkH7K+srWfGM0uoa3CkJcRw/ztruG/eGoZkJLBpZyURBhcckcX3jx
/K4PSEVq91+mF9GJKRwEPz8zlrXJ9O+0a1aedevvmXD0iJj+HB6RM4Zlj6vn3VdQ383+u5PPfpZrLTevDbc8dywRFZxEZFcNkTnzDzjVwmD0kju43YRbra1tIqDuuXHOgwDolqHm3YWVHDhQ8v4h+LNnb43G1lVdw9ezXHDE3ju5MHdX5wbUiOi+bCI7N4Y9k2SvbUfG2fc45bXl7Omu17+MslE3nh+1P46NaTuf3sMfRNieOqY7JZ+POTuPuC8W0mDoDICOP644eSu62cBWtKOq0Mf5q7hsgIIyU+ikuf+IT7562hodGxeWclFzy8iOc+3cx1xw9l3k+P57LJg4iPiSQiwvjjRYcTFWHc9OIX1Dc0tn0hkS5SXdfAzr21IT3HA1TzaJVzjl++tpIlm0pZsqmUjTv38qtvjCGyHTerd85x+39WUtfYyO/PHxewts0rjsnmnx9v4rtPfMLPThvJKaN7Y2Y88t563ly+jVvPHMW0ERmAZ8LhNccO5ppjBx/Utc6d2J/7563lb/PzOXFkb4rLq5mbu50FeSWkxEeTk53KUdmpDElPJKIdP8OVW8uYtayQG04cxvUnDOX2/6zk/nlrWbh2B2u278GAxy/P4ZQxmQec269nPL85dyw/fv4LHl6wjhtPHs7GHXt59fOtvLm8kKiICEb2SWJU3yRG90nmmGFp6mCXLhEOI61AyaNVs5YV8r9VRdxy+kh2VtTy5Icb2FpaxQPTJxIfE8nemnreX1PCJxt2ceywdE72fjADvLF8G/O+LOZX3xjNoLTANZsMyUjk4UuP4O63VvO9fy7h8KwUzhzXlz/MWc3Z4/vy/WlDOu1a0ZERzJg2hDtmreLsvyxkVWE5zsGAXvHsrWnglc8KAOjZI5pbzxjF9EkDW32/e/63mtQe0cw4fggJsVH86eLDmTw0jV//dyXDeyfxt0uP2DexsjnnTOjPvC+LeeCdtcxbXcyyLbsxg2OGphEXFcnSTaXMWlYIwCmjM3ns8iNDugNTQkOo38ejiZJHC7aXV3P7f1ZyxMCeXHf8UCIjjAG94pn5Ri4X/X0R6YmxLMrfSW1DI1ERxtOLNnLssHR+dfZoMpPiuHPWKg73Dr8NtDPG9uXk0Zm8+lkBD76Tz91vrWZUnyT+cOH4Tv+w/PZRA3jm4004BzedMoLTD+vDiEzP8irrd+xl6cZSXv6sgNteW0FKfDRnjmt+GPHCtSUsXLuD288eQ3KcZzijmXFxzgBOG5NJYmwUUe2Yv/Lbc8aycmsZ1bUN3HbmKM6Z0J8+KV8NWiirquOpDzdw/7y1vL58G986vF8n/BREWhYOs8sBLFyGM+bk5LglS5Z0+LzGRsfNLy/jtDGZnDw6k+jICJxzXPX0Yj5ev5O3fjzta23+c1cV8bMXl5GaEMOpYzI5dUwmEwb05PlPN3Ofd0HC7PQENu+s5I0fHcuoPsHVKVZT38D/VhYxeUgamV008mt/VbUNXPr4x6wsLOdf1xzNpMFfH0jQ2Oj41kMfULq3jndvPt7vzUkNjY7zH17Ell2VvH3TtAOGIot0pj/NzeOh+fnk/fbMTpvAeyjMbKlzLqej53X7msfW3VV8tG4nr362lfTEWC7KySIxNooFeSXc+c0xB3QWn3ZYH764I5MI+/raTldOHbyvzf+Zjzfxo5OGB13iAIiNiuScCf0DGkN8TCRPXHEUFz6yiGv/sZiXrz/ma5Mi31ixjZVby7nv24d3ST9EZIRx74Xj+caDC5n5Ri4PTJ/o92tK97V1dxV9kuOCInEcim5f8wDPN8/31hTz7CdbmJ9XTEOjY8qQNP597dHt6tjdX0VNPQkxkWo/b0NBaSXn/20RkRHGZZMHsXlnJZt27WXl1nIG9OrBmzcee1A//4P1wLy13DdvDU9ckcPJow/shBfpDNMf/Yi6Bscr1x8T6FAA1TwOSWSEcdKoTE4alcn28mrmriri9LF9DvqDKzFWP9b2yErtwdNXTeKSxz7m3jl5pC
fGMLBXD04bk8l1Jwzt0sQBcP0JQ5m9Yhu/fG0lRw3uta+vRaQzFZfXMLpv8LVKdJQ+5faTmRzHd6dkBzqMbmNMv2Q+vu1kGpwLeNKNiYrgDxeO57y/fcjvZ6/m9+ePC2g8Ep62l1dz/MiMQIdxyEK70U3CQnxMZMATR5PDB/Tk2uOG8Nynm1m0bkegw5EwU1FTz97ahoANVulMSh4i+7nplBFkp/Xg1ldWUFlb3+7z6hoa2bKrkk/W72R3ZctrcUn3VVxeDUDvpNAf0RccX/dEgkh8TCT3XDCebz/6MX+au4bbzx7T4rHbyqq4680v+XzzbraVVdG0qHB6Ygx/uHA8J41Sx7t8ZXu5Z5kg1TzaYGZnmFmemeWb2a3N7L/PzL7wPtaY2W6ffVeY2Vrv4wp/ximyv6OHpHHZ5IE8+eEGPttcesB+5xwvLtnCafe9zztfFpOTncoPTxzGHy4Yz6PfPZL0xFiufnoJv/rPCqpqGwJQAglGxXs8NY/MZNU8WmRmkcBDwKlAAbDYzGY553KbjnHO3eRz/I3ARO/zXsAdQA7ggKXecw/8Kxbxk/93xije/bKYn7+8nDd/dCxRERFU1tazvbyG376Zy4K8EiYN7sW9F44/YAma40dm8Mc5eTy2cAOL1u3k+uOHMjwziaEZCSR5R3E1NDp27q2huLyGmvqmBGOYwZi+yc0ufy+hrdhb88hICv2ahz+brSYB+c659QBm9jxwDpDbwvGX4EkYAKcDbzvndnnPfRs4A3jOj/GKfE1SXDS/O38cVz61mHF3zKXWZ3Xe+OhI7vzmGC6fkt3skOLYqEh++Y0xnDCyNze/tIxbXl6+b1+f5Dgcjh0VtS3ePGtM32SemzGZlHgNFw4n28uriYuOIDku9HsM/FmC/oDv3XgKgKObO9DMBgGDgXdbOfeAadFmNgOYATBwYOuL7IkcjBNG9ubPFx9OXtEeesREkRAbSY+YKI4bnt7qooxNpg5LZ+HPT2TzrkryiyvIL6lgXfFeIiM87d69k2LJSIqjR0wkTWmkcHcVv/7vSq5+ejHPXDOJHjGh/0EjHsV7ashMDu07CDbx5//K5n46LU1nnw687Jz7qu7ejnOdc48Cj4JnhvnBBCnSlvOPyDqk86MiIxiSkciQjEROa+c5KfHR3PDsZ8z451IevyJHTVhhYnt5dViMtAL/dpgXAAN8XmcBhS0cO52vN0l15FyRsHPWuL7cc8F4PsjfwY3PfU6dbmgVFor31HTZraj9zZ/JYzEw3MwGm1kMngQxa/+DzGwkkAp85LN5DnCamaWaWSpwmnebSLdxUc4A7vzmGN7O3c4ds1YFOhzpBMXl1WSGQWc5+LHZyjlXb2Y34PnQjwSedM6tMrOZwBLnXFMiuQR43vms0Oic22Vmv8GTgABmNnWei3QnV04dzLayav7+/nqOHZbOWS3c/0SCX9Ps8t5hMEwX/DxJ0Dk3G5i937Zf7/f6zhbOfRJ40m/BiYSIm08fycfrd3LrK8s5fEDPkL8DXXe1vTx85niAlicRCXrRkRE8eMlEGh385PnPqVf/R0hqmuPRO0yarZQ8RELAoLQEfnPuYSzeWMpf5+fv297Q6Ni4Yy/VdS3PYp+1rJBlW3a3uF+6RjjNLgetbSUSMs6bmMXCNTt48J21bC2tIr+kgtXb9lBV18DQjASeueboA+6L/dd31/LHuWvITI7l3Z+dQEKQrF7cHTU1W2m0lYh0uZnnjmVwegL/W1lEdGQE0ycN4FffGE1xeQ0XPryI/OKKfcc2JY5jh6WzvbzmazUW6XrF5TXERUeQFCYJPDxKIdJNJMZG8fZNx2PG12YpTxmaxhVPLuaiRxbx1FWT+GBtCX+cu4bzJvbnjxcdzi0vL+OJhRu4OGcAg9MTWrlCx81ZVURlbT1nju2ryYyt2B5Gs8tBNQ+RkBMRYQd8AB3WL4VXrp9CUlw0Fz/y0dcSR2SEceuZo4iJiuA3b7S0tJxHVW0Dv5v9Jbe9up
w5q4qoqGn9fiYrt5bxg39/xk0vLOPo373Db97IZV1JRavndFfbw2iOB6jmIRI2BqUl8PL1U/jhvz9jWO8kfnvuWCK9izb2TorjxycP567ZX/Lu6u3N3mdkfUkF1//rM9YU7yEhJornPt1CTGQEkwb3Ysa0IUwb8fVbp9bWN3LzS8tIS4jh9+eP49XPt/KPRRt54oMNXHRkFr8/fxxRkfp+2qRkTw1j+oX+vcubKHmIhJHeSXG8dN0xze674phsnlu8mZmv5zJ1WDqxUV81Mc1esY2fv7yc6EjjH1dNYsrQNBZv3MX81cW8tbKIa/6xmIcvPZJTxnyVdB6an8/qoj08fnkOJ4/O5OTRmZTsqeGxhet59P31VNY2cP/0CUQrgeCcY3t5NSeO7B3oUDqNkodINxETFcGd3zyMy5/8lPP/toj0xFiiIyOobWjk/TUlTBjQk79desS+EVvHDE3nmKHp3HjycC57/BN+8O/P+PvlR3LiyN7kFpbz0Px8zp3Q72sJJSMpll+cNZreSbH89s0vqW1o5K/fmfi1RNUdVdTUU1nbEDbDdEF9HiLdyrQRGdx0yggSYqIoq6pjW1kVRWVVzJg2hBe/P+WAob4AyXHRPHP10QzPTOT7zyxlfl4xN7+0jJ49Yrjjm4c1e51rjxvC/33rMN7O3c51zyxlfUkFc1cV8eA7a7nh2c947fMCfxc1qBTv8U4QDKPkoZqHSDfz41OG8+NThnfonJQe0fzrmqO55LGPueopz5Jzj1x2BKkJMS2ec8Ux2URHRvCL11YwP++9fdt7xESycO0OThvTJyznnZRX1wGepNtk39Ik6jAXke4mNSGGf197NFc/vZhRfZI5Y2zbizR+5+iBZKf1YNOuSkb1SWJEZhJ52/dw/t8W8cLiLVx97OAuiLxr/eT5L6iua+DZ703et23f0iRhMkEQlDxEpAPSEmP5zw+ndmiuwjHD0vHtwj9iYCpHZafyxAcb+O6UQWHXoZ5fXEFBaSW7K2vp2cNTM2tamiScmq3C67cmIn7XGZPcvj9tKFt3VzF7xbZOiCh4OOcoKq+m0cH7a3fs2769vIb46MiwmV0OSh4iEgAnjerN0IwE/v7eenxu5RPySivrqK33rHq8ILvEFtYAAA+LSURBVK9433bPvctjw2Z2OSh5iEgAREQY3582lNxt5XyQv6PtE0JEUZmneSopLor38kpobPQkRs+9y8OnvwOUPEQkQM6Z2I/eSbE8+v76QIfSaZpGVZ03sT8799aysrAM8Nx+Npz6O0DJQ0QCJDYqkiunZrNw7Q5WeT9kQ902b83j4pwBmMH81SU457zNVqp5iIh0ikuPHkRCTCT3zsnb18QTyorKq4kwGNknifFZPVmwpnjf7PLeSap5iIh0ipT4aG45fSQL8kp48N21gQ7nkBWVVe1b9uXEkRl8sWU3eUV7AFTzEBHpTFcck80FR2Rx/7y1zF1VFOhwDklReQ19UzxJ4oSRvXEOXl7qWYpFfR4iIp3IzLjrvLEcnpXCTS98wdrtewId0kErKqvaV8MY3z+FtIQYXl9WCKDRViIinS0uOpJHvnsk8TGRzHhmKWVVdYEO6aAUlVXTx1vziIgwpo3IYG9tA0BYragLSh4iEiT6psTz8GVHsmVXJZc/+SlbdlUGOqQOqaytp7y6fl/yADhhpOcGWj1iIkkMo9nloOQhIkHkqOxe/PU7E1lfXMFZDy7kzeWhs3xJ0wTBPj4d49OGZxBh0DspvGaXg5KHiASZM8b25c0fHcfQjER++Oxn3PbqCqq8TT/BrMg7QdC35pGaEMOkwb0YkpEYqLD8JrzqUSISFgam9eCl66bwp7lreOS9dSzIK+YHJwzlopwBxEUH510Jm6t5APz9uzmEWaUDUPIQkSAVHRnBrWeO4oSRGdw7J4/b/7uKh+av47rjhzCmXwqrCstYVVjOl9vKGZGZxO1nj6FXKzen8rfmah7gmcsSjpQ8RCSoTR6SxsvXTWHRup
08MG8td76eu29femIMIzKTeGN5IR/k7+CeC8Zx0qjMVt7Nf4rKqkmOi6JHTPf4WO0epRSRkGZmTB2WztRh6SzdVEpZVS2H9UvZ1xGdW1jOT1/8gqufXsIlkwbyi7NGkRTXtd/4fYfpdgdKHiISUo4clHrAtjH9kvnvDVP589trePT99by0ZAuH9UvmiEGpHDkolclD0khPPHCexWebS/nb/HyiIyP4ySkjGNkn6aDjKiqvpk9K/EGfH2qUPEQkLMRGRXLbmaM5a2xf5qwqYummUp77dDNPfbiRCIOpw9L55uH9OP2wPmzZVcmf317Du6uLSUuIobahkf+tKuL8iVn89LQR9O/Z8SRQVFbNqENIPqFGyUNEwsrhA3py+ICeANQ1NJJbWM7c3CJmLSvk5y8v55evraCuwZESH83PzxjJFVOyqa1v5OH31vH0oo28vryQGccN4UcnDycmqn2zGeoaGimpqFHNo7OY2RnAA0Ak8Lhz7u5mjrkYuBNwwDLn3He82xuAFd7DNjvnvuXPWEUk/ERHRuxLJjefNpJlBWXMXrGNlPhovjtlEMnefpGEWPjFWaO54phs/jgnj7/Oz+f9tSXc/+0J7ZqjUbKnBucOHKYbzvyWPMwsEngIOBUoABab2SznXK7PMcOB24CpzrlSM+vt8xZVzrkJ/opPRLoXM2PCgJ5M8NZKmtO/Zzz3fXsCp43J5NZXV/CNBz/gzm+N8d7cqeXJGk3DdPt2ow5zf84wnwTkO+fWO+dqgeeBc/Y75nvAQ865UgDnXDEiIgF25ri+/O8nxzFxYE/+3ysrmPlGbqvHN00QDLd7drTGn8mjP7DF53WBd5uvEcAIM/vQzD72NnM1iTOzJd7t5zZ3ATOb4T1mSUlJSedGLyLdWt+UeP51zdFcMmkATy/a2OpS8ftml6vm0Smaq+Ptf5/JKGA4cAJwCfC4mTXVKQc653KA7wD3m9nQA97MuUedcznOuZyMjIzOi1xEBM+y6recPoqEmCj+/PaaFo8rKq8mJiqC1B7hOZu8Of5MHgXAAJ/XWUBhM8f81zlX55zbAOThSSY45wq9/64HFgAT/RiriEizeiXEcM2xg3lrZRErCsqaPaaorJo+yXFht3Jua/yZPBYDw81ssJnFANOBWfsd8x/gRAAzS8fTjLXezFLNLNZn+1Sg9UZHERE/ufa4wfTsEc0f5+Y1u7+7zS4HPyYP51w9cAMwB/gSeNE5t8rMZppZ07DbOcBOM8sF5gO3OOd2AqOBJWa2zLv9bt9RWiIiXSkpLprrjx/Ke2tK+HTDrgP2F5VXd6thuuDneR7OudnA7P22/drnuQN+6n34HrMIGOfP2EREOuLyKdk88cEG/jgnjxe+P3lfE5VzjqLy6m41TBd0MygRkXaJj4nkxpOG8enGXby/dse+7aWVddTWN3arYbqg5CEi0m7fPmogWanxzHx9FZW19UD3HKYLSh4iIu0WExXBPReMZ/2Ovdz+n1UAFJVXAd0veWhhRBGRDpg6LJ0bTxrOg++sZfKQXtQ1eKavdbcOc9U8REQ66McnD2fykF7c/t+VLFxbQoRBRtKB9wsJZ0oeIiIdFBlhPDh9IomxUby1soj0xFiiI7vXx2n3Kq2ISCfpnRzH/d+eiFn36+8A9XmIiBy0Y4enc88F40mI6X4fpd2vxCIinejinAFtHxSG1GwlIiIdpuQhIiIdpuQhIiIdpuQhIiIdpuQhIiIdpuQhIiIdpuQhIiIdpuQhIiIdZp6b+YU+MysBNjWzKwXY/671+2/zfd3cc99t6cAODk5zsbT3GJWj+ecqh8rR0naVo33lGOScy2jjmAM558L6ATza1jbf180932/bks6Mpb3HqBwtlknlUDlUDj+Uo61Hd2i2er0d215v43lz79FZsbT3GJWj5ecHS+VQOVp7frDCpRytCptmq65iZkucczmBjuNQqRzBReUILipH27pDzaOzPRroADqJyhFcVI7gonK0QTUPERHpMN
U8RESkw5Q8RESkw7pt8jCzJ82s2MxWHsS5R5rZCjPLN7MHzcx89t1oZnlmtsrM/tC5UTcbS6eXw8zuNLOtZvaF93FW50febDx++Z14999sZs7M0jsv4hZj8cfv5Ddmttz7+5hrZv06P/IDYvFHOe41s9XesrxmZj07P/IDYvFHOS7y/o03mpnfOtYPJfYW3u8KM1vrfVzhs73Vv59m+WsMcLA/gGnAEcDKgzj3U2AKYMBbwJne7ScC84BY7+veIVqOO4Gbw+F34t03AJiDZxJpeiiWA0j2OeZHwCMhWo7TgCjv83uAe0K0HKOBkcACICfYYvfGlb3ftl7Aeu+/qd7nqa2Vs7VHt615OOfeB3b5bjOzoWb2PzNbamYLzWzU/ueZWV88f8gfOc9P/Z/Aud7d1wN3O+dqvNco9m8p/FaOgPBjWe4Dfg50yegQf5TDOVfuc2gCXVAWP5VjrnOu3nvox0CWf0vht3J86ZzLC9bYW3A68LZzbpdzrhR4GzjjYD8Lum3yaMGjwI3OuSOBm4G/NXNMf6DA53WBdxvACOA4M/vEzN4zs6P8Gm3LDrUcADd4mxaeNLNU/4XapkMqi5l9C9jqnFvm70DbcMi/EzO7y8y2AJcCv/ZjrK3pjP9bTa7G8y03EDqzHF2tPbE3pz+wxed1U3kOqpxR7bxo2DOzROAY4CWf5r7Y5g5tZlvTt8AoPNXBycBRwItmNsSbzbtEJ5XjYeA33te/Af6E5w+9Sx1qWcysB/BLPE0lAdNJvxOcc78EfmlmtwE3AHd0cqit6qxyeN/rl0A98O/OjLE9OrMcXa212M3sKuDH3m3DgNlmVgtscM6dR8vlOahyKnl8JQLY7Zyb4LvRzCKBpd6Xs/B8sPpWtbOAQu/zAuBVb7L41Mwa8SxMVuLPwPdzyOVwzm33Oe8x4A1/BtyKQy3LUGAwsMz7h5YFfGZmk5xzRX6O3Vdn/N/y9SzwJl2cPOikcng7as8GTu7KL1Y+Ovv30ZWajR3AOfcU8BSAmS0ArnTObfQ5pAA4wed1Fp6+kQIOppz+6ugJhQeQjU9HFLAIuMj73IDDWzhvMZ7aRVPn0lne7dcBM73PR+CpIloIlqOvzzE3Ac+H6u9kv2M20gUd5n76nQz3OeZG4OUQLccZQC6Q0VX/p/z5/wo/d5gfbOy03GG+AU/rSKr3ea/2lLPZuLryFxhMD+A5YBtQhyfzXoPnW+r/gGXe/+C/buHcHGAlsA74K1/N1I8B/uXd9xlwUoiW4xlgBbAczzewvv4uh7/Kst8xG+ma0Vb++J284t2+HM+id/1DtBz5eL5UfeF9dMWoMX+U4zzve9UA24E5wRQ7zSQP7/arvb+DfOCqjvz97P/Q8iQiItJhGm0lIiIdpuQhIiIdpuQhIiIdpuQhIiIdpuQhIiIdpuQhYc3MKrr4eo+b2ZhOeq8G86yiu9LMXm9rBVoz62lmP+iMa4u0RUN1JayZWYVzLrET3y/KfbWwn1/5xm5m/wDWOOfuauX4bOAN59zYrohPujfVPKTbMbMMM3vFzBZ7H1O92yeZ2SIz+9z770jv9ivN7CUzex2Ya2YnmNkCM3vZPPem+HfT/Q+823O8zyu8ixkuM7OPzSzTu32o9/ViM5vZztrRR3y12GOimb1jZp+Z5x4M53iPuRsY6q2t3Os99hbvdZab2f914o9RujklD+mOHgDuc84dBVwAPO7dvhqY5pybiGfV2t/5nDMFuMI5d5L39UTgJ8AYYAgwtZnrJAAfO+cOB94Hvudz/Qe8129zDSHvmksn45ntD1ANnOecOwLPPWT+5E1etwLrnHMTnHO3mNlpwHBgEjABONLMprV1PZH20MKI0h2dAozxWZU02cySgBTgH2Y2HM+qotE+57ztnPO9r8KnzrkCADP7As/6Qx/sd51avlpUcilwqvf5FL66X8KzwB9biDPe572X4rn/AnjWH/qdNxE04qmRZDZz/mnex+fe14l4ksn7LVxPpN2UPK
Q7igCmOOeqfDea2V+A+c6587z9Bwt8du/d7z1qfJ430PzfUp37qlOxpWNaU+Wcm2BmKXiS0A+BB/HczyMDONI5V2dmG4G4Zs434PfOub938LoibVKzlXRHc/HcDwMAM2ta3joF2Op9fqUfr/8xnuYygOltHeycK8Nz69mbzSwaT5zF3sRxIjDIe+geIMnn1DnA1d57QGBm/c2sdyeVQbo5JQ8Jdz3MrMDn8VM8H8Q53k7kXDxL6QP8Afi9mX0IRPoxpp8APzWzT4G+QFlbJzjnPseziup0PDdQyjGzJXhqIau9x+wEPvQO7b3XOTcXT7PYR2a2AniZrycXkYOmoboiXcx7h8Mq55wzs+nAJc65c9o6TySYqM9DpOsdCfzVO0JqNwG4xa/IoVLNQ0REOkx9HiIi0mFKHiIi0mFKHiIi0mFKHiIi0mFKHiIi0mH/HwFD+DXaBLenAAAAAElFTkSuQmCC\n",
"text/plain": [
""
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"learn_c.recorder.plot()"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"hidden": true
},
"outputs": [],
"source": [
"lr = 2e-2\n",
"lr *= bs/48\n",
"\n",
"wd = 0.01"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {
"hidden": true,
"scrolled": false
},
"outputs": [
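{
"data": {
"text/html": [
"<table>\n",
"  <thead>\n",
"    <tr>\n",
"      <th>epoch</th>\n",
"      <th>train_loss</th>\n",
"      <th>valid_loss</th>\n",
"      <th>accuracy</th>\n",
"      <th>f1</th>\n",
"      <th>time</th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
"      <td>0</td>\n",
"      <td>0.397058</td>\n",
"      <td>0.264077</td>\n",
"      <td>0.886385</td>\n",
"      <td>0.929270</td>\n",
"      <td>03:50</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <td>1</td>\n",
"      <td>0.382111</td>\n",
"      <td>0.263587</td>\n",
"      <td>0.877805</td>\n",
"      <td>0.923267</td>\n",
"      <td>04:27</td>\n",
"    </tr>\n",
"  </tbody>\n",
"</table>"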
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"learn_c.fit_one_cycle(2, lr, wd=wd, moms=(0.8,0.7))"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {
"hidden": true
},
"outputs": [],
"source": [
"learn_c.save(f'{lang}clas1_sp15_multifit_v2')"
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {
"hidden": true
},
"outputs": [
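{
"data": {
"text/html": [
"<table>\n",
"  <thead>\n",
"    <tr>\n",
"      <th>epoch</th>\n",
"      <th>train_loss</th>\n",
"      <th>valid_loss</th>\n",
"      <th>accuracy</th>\n",
"      <th>f1</th>\n",
"      <th>time</th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
"      <td>0</td>\n",
"      <td>0.421706</td>\n",
"      <td>0.264088</td>\n",
"      <td>0.905306</td>\n",
"      <td>0.942071</td>\n",
"      <td>04:47</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <td>1</td>\n",
"      <td>0.400071</td>\n",
"      <td>0.258678</td>\n",
"      <td>0.883631</td>\n",
"      <td>0.927468</td>\n",
"      <td>04:50</td>\n",
"    </tr>\n",
"  </tbody>\n",
"</table>"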
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"learn_c.load(f'{lang}clas1_sp15_multifit_v2');\n",
"learn_c.fit_one_cycle(2, lr, wd=wd, moms=(0.8,0.7))"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {
"hidden": true
},
"outputs": [],
"source": [
"learn_c.save(f'{lang}clas2_sp15_multifit_v2')"
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {
"hidden": true
},
"outputs": [
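{
"data": {
"text/html": [
"<table>\n",
"  <thead>\n",
"    <tr>\n",
"      <th>epoch</th>\n",
"      <th>train_loss</th>\n",
"      <th>valid_loss</th>\n",
"      <th>accuracy</th>\n",
"      <th>f1</th>\n",
"      <th>time</th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
"      <td>0</td>\n",
"      <td>0.349437</td>\n",
"      <td>0.222335</td>\n",
"      <td>0.944592</td>\n",
"      <td>0.967229</td>\n",
"      <td>04:12</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <td>1</td>\n",
"      <td>0.328945</td>\n",
"      <td>0.205256</td>\n",
"      <td>0.949153</td>\n",
"      <td>0.969924</td>\n",
"      <td>04:26</td>\n",
"    </tr>\n",
"  </tbody>\n",
"</table>"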
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"learn_c.load(f'{lang}clas2_sp15_multifit_v2');\n",
"learn_c.freeze_to(-2)\n",
"learn_c.fit_one_cycle(2, slice(lr/(2.6**4),lr), wd=wd, moms=(0.8,0.7))"
]
},
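{
"cell_type": "markdown",
"metadata": {
"hidden": true
},
"source": [
"The `slice(lr/(2.6**4), lr)` argument used above requests discriminative learning rates: the earliest layer group trains with the smallest rate and the classifier head with the largest. As we understand fastai v1, the rates in between are spaced geometrically across the layer groups. The sketch below only illustrates that spacing with NumPy; `discriminative_lrs` is a hypothetical helper (not a fastai function) and the number of layer groups is an assumption.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"def discriminative_lrs(lr_max, n_groups, ratio=2.6**4):\n",
"    # Hypothetical helper: geometrically spaced rates, from lr_max/ratio\n",
"    # for the earliest layer group up to lr_max for the last one (the head).\n",
"    return np.geomspace(lr_max / ratio, lr_max, num=n_groups)\n",
"\n",
"lr = 2e-2      # base value set earlier, before the bs/48 scaling\n",
"n_groups = 5   # assumption for illustration; check len(learn_c.layer_groups)\n",
"for i, group_lr in enumerate(discriminative_lrs(lr, n_groups)):\n",
"    print(f'layer group {i}: lr = {group_lr:.2e}')\n",
"```\n",
"\n",
"With `freeze_to(-2)` only the last two layer groups are updated, so the smaller rates only start to matter once more groups are unfrozen in the following steps."
]
},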
{
"cell_type": "code",
"execution_count": 42,
"metadata": {
"hidden": true
},
"outputs": [],
"source": [
"learn_c.save(f'{lang}clas3_sp15_multifit_v2')"
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {
"hidden": true
},
"outputs": [
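{
"data": {
"text/html": [
"<table>\n",
"  <thead>\n",
"    <tr>\n",
"      <th>epoch</th>\n",
"      <th>train_loss</th>\n",
"      <th>valid_loss</th>\n",
"      <th>accuracy</th>\n",
"      <th>f1</th>\n",
"      <th>time</th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
"      <td>0</td>\n",
"      <td>0.283645</td>\n",
"      <td>0.219118</td>\n",
"      <td>0.955882</td>\n",
"      <td>0.974198</td>\n",
"      <td>05:24</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <td>1</td>\n",
"      <td>0.264125</td>\n",
"      <td>0.196374</td>\n",
"      <td>0.955159</td>\n",
"      <td>0.973541</td>\n",
"      <td>06:35</td>\n",
"    </tr>\n",
"  </tbody>\n",
"</table>"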
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"learn_c.load(f'{lang}clas3_sp15_multifit_v2');\n",
"learn_c.freeze_to(-3)\n",
"learn_c.fit_one_cycle(2, slice(lr/2/(2.6**4),lr/2), wd=wd, moms=(0.8,0.7))"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {
"hidden": true
},
"outputs": [],
"source": [
"learn_c.save(f'{lang}clas4_sp15_multifit_v2')"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"hidden": true
},
"outputs": [
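{
"data": {
"text/html": [
"<table>\n",
"  <thead>\n",
"    <tr>\n",
"      <th>epoch</th>\n",
"      <th>train_loss</th>\n",
"      <th>valid_loss</th>\n",
"      <th>accuracy</th>\n",
"      <th>f1</th>\n",
"      <th>time</th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
"      <td>0</td>\n",
"      <td>0.256820</td>\n",
"      <td>0.192569</td>\n",
"      <td>0.955340</td>\n",
"      <td>0.973631</td>\n",
"      <td>11:34</td>\n",
"    </tr>\n",
"  </tbody>\n",
"</table>"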
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"learn_c.load(f'{lang}clas4_sp15_multifit_v2');\n",
"learn_c.unfreeze()\n",
"learn_c.fit_one_cycle(1, slice(lr/10/(2.6**4),lr/10), wd=wd, moms=(0.8,0.7))"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {
"hidden": true
},
"outputs": [],
"source": [
"learn_c.save(f'{lang}clas5_sp15_multifit_v2')"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"hidden": true
},
"outputs": [],
"source": [
"learn_c.load(f'{lang}clas5_sp15_multifit_v2')\n",
"learn_c.save(f'{lang}clas_sp15_multifit_v2')"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"hidden": true
},
"outputs": [],
"source": [
"learn_c.load(f'{lang}clas_sp15_multifit_v2');\n",
"learn_c.to_fp32().export(f'{lang}_classifier_sp15_multifit_v2')"
]
},
{
"cell_type": "markdown",
"metadata": {
"hidden": true
},
"source": [
"### Confusion matrix"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {
"hidden": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 12.9 s, sys: 796 ms, total: 13.7 s\n",
"Wall time: 13 s\n"
]
}
],
"source": [
"%%time\n",
"data_clas = load_data(path, f'{lang}_textlist_class_sp15_multifit_v2', bs=bs, num_workers=1)\n",
"\n",
"config = awd_lstm_clas_config.copy()\n",
"config['qrnn'] = True\n",
"config['n_hid'] = 1550 #default 1152\n",
"config['n_layers'] = 4 #default 3\n",
"\n",
"learn_c = text_classifier_learner(data_clas, AWD_LSTM, config=config, pretrained=False, drop_mult=0.5, \n",
" metrics=[accuracy,f1])\n",
"# learn_c.load_encoder(f'{lang}fine_tuned_enc_sp15_multifit_v2');\n",
"\n",
"learn_c.load(f'{lang}clas_sp15_multifit_v2');\n",
"\n",
"# put weight on cpu\n",
"loss_weights = torch.FloatTensor(trn_weights).cpu()\n",
"learn_c.loss_func = partial(F.cross_entropy, weight=loss_weights)"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"hidden": true
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAbYAAAEGCAYAAAAJw7AFAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAYyUlEQVR4nO3debxVdb3/8dfnBKIMCs6EqZkgKjPFNbuZSpl2bXDIMcuhzKF+15nCCb1lXjPLMnN4dI2SHMrMwh5l16umJqkg4ACGQ5QDqQgCDijw+f2xFrg5MpxzYJ99Wryej8d5sNd3fffanw3n8D7ru7/ruyIzkSSpKpoaXYAkSWuTwSZJqhSDTZJUKQabJKlSDDZJUqUYbJKkSjHYpLUoIjaIiN9GxCsR8Ys1OM7hEXHb2qytUSLiwxHxeKPr0LojvI5N66KIOAw4BegPzAcmA9/MzHvW8LhHAF8Fds3MRWtcaAcXEQn0zcwnGl2LtJRnbFrnRMQpwPeAC4AtgK2By4FPr4XDbwP8dV0ItZaIiE6NrkHrHoNN65SI2Ag4HzgxM3+Vma9m5luZ+dvMPL3s0yUivhcRz5Vf34uILuW+3SPimYg4NSJeiIjnI+Koct95wDnAwRGxICKOiYgxEXFtzetvGxG59D/8iDgyIp6KiPkR8XREHF7Tfk/N83aNiAfKIc4HImLXmn13RsR/RcS95XFui4hNV/L+l9Z/Rk39n4mIT0TEXyPi5YgYXdN/RETcFxFzy76XRcR65b4/ld2mlO/34Jrjj4qIWcA1S9vK57yvfI1h5fa7I+KliNh9jf5hpRoGm9Y1HwTWB25eRZ8zgV2AIcBgYARwVs3+LYGNgD7AMcAPI6JXZp5LcRZ4Q2Z2z8wfr6qQiOgGfB/YJzN7ALtSDIk277cxcGvZdxPgEuDWiNikptthwFHA5sB6wGmreOktKf4O+lAE8dXA54DhwIeBcyJiu7LvYuBkYFOKv7uRwAkAmblb2Wdw+X5vqDn+xhRnr8fWvnBmPgmMAsZFRFfgGuAnmXnnKuqVWsVg07pmE+Cl1QwVHg6cn5kvZOaLwHnAETX73yr3v5WZvwMWADu0sZ4lwICI2CAzn8/MR1fQ5z+AGZn5s8xclJnXAdOBT9b0uSYz/5qZrwM3UoTyyrxF8XniW8D1FKF1aWbOL1//UWAQQGZOzMwJ5ev+DbgS+EgL3tO5mbmwrGc5mXk1MAP4C9Cb4hcJaa0x2LSumQ1suprPft4NzKzZnlm2LTtGs2B8Deje2kIy81XgYOA44PmIuDUi+regnqU19anZntWKemZn5uLy8dLg+WfN/teXPj8i+kXE+IiYFRHzKM5IVzjMWePFzHxjNX2uBgYAP8jMhavpK7WKwaZ1zX3AG8BnVtHnOYphtKW2Ltva4lWga832lrU7M/MPmfkxijOX6RT/4a+unqU1PdvGmlrjRxR19c3MDYHRQKzmOaucah0R3Skm7/wYGFMOtUprjcGmdUpmvkLxudIPy0kTXSOic0TsExEXld2uA86KiM3KSRjnANeu7JirMRnYLSK2LieufH3pjojYIiI+VX7WtpBiSHPxCo7xO6BfRBwWEZ0i4mBgJ2B8G2tqjR7APGBBeTZ5fLP9/wS2e8ezVu1SYGJmfpHis8Mr1rhKqYbBpnVOZl5CcQ3bWcCLwD+ArwC/Lrt8A3gQmAo8DEwq29ryWn8EbiiPNZHlw6gJOJXijOxlis+uTljBMWYD+5Z9ZwNnAPtm5kttqamVTqOYmDKf4mzyhmb7xwBjy1mTB63uYBHxaWBviuFXKP4dhi2dDSqtDV6gLUmqFM/YtExEvCci7oiIaRHxaET8Z9n+2XJ7SUS8v6b/JmX/BRFxWU1713IixPTyeRc24v1IayIi1o+I+yNiSvl9fF7Z/t6I+EtEzIiIG2qu6+tSbj9R7t+2kfWvyww21VoEnJqZO1Jcx3ViROwEPALsD/
ypWf83gLNZ8TVTF2dmf2Ao8KGI2Kd+ZUt1sRDYMzMHU1w+sXdE7AL8N/DdzOwLzKG4lpHyzzmZuT3w3bKfGsBg0zLldVSTysfzgWlAn8yclpnvWMS2XLXjHoqAq21/LTPvKB+/SfEZ1VZ1fwPSWpSFBeVm5/IrgT2BX5btY3l7hu2ny23K/SMjYnUzSFUHdQu2cumgaRFxdXkaf1sUK5+/LyJ+HxETI+LupdftlO0TyuWCzo+IBat7DdVPOYwylOIi2jU5Tk+KC4lvX/OqpPYVEe+KiMnAC8AfgSeBuTXXMT7D29cT9qGYiES5/xWKBQHUzup9xtYX+GFm7gzMBQ4ArgK+mpnDKYawLi/7Xkqx+sEHaPs1Q1oLyuuMbgJOysx5a3CcThRT57+fmU+trfqk9pKZizNzCMWIwwhgxxV1K/9c0dmZs/MaoG6zIsvf+P9YjkMTEaMoTuXPBGqHtbpk5o4RMRvYIjMXRcSGwHOZ+Y7VEyLiWMr157p27TZ8+35tXclIK5KZPP3UE/TosSGbbb7FcvuenPFXevfpQ9eu3ZZrf3n2bF5//TX6bPWe5dr/8feZNDU1vaNda0enJke52tPzzz9HU1MTs2bNYtCgwUQECxYs4Pnnn6dv377MmDGD3r170717dzKTqVOnLOun+pg0aeJLmblZ8/Z631KidqmcxRS3CJlb/gbUJpl5FcVZH4OHDs/f/d+f16xCLZOZnHTCMYzY5UOc962L37H/wE9+jLPPv5DBQ4cv137jz3/KlMmT+OZF31vWdtE3z2XG449z5U9+TlOTH+XWwyY9ujS6hEp78cUX6dy5Mz179uT1119n33324tTTR3Htz8bymf0O4KCDD+GrJxzHgIGD+PLxJ3DF5T/k0Uce5geXX8GNN1zPLb/+FeOuu7HRb6PSNugczZeaA+p/xjY+MweU26dRrD+3F8WMol+UH6wOyswpEXEr8NPMvKE8K7tkRWdstQy2tev+Cfey/ydG0n+nAcvCaNTZ5/PmwoWcPeoUXp79Ihtu1JOdBwxi3E3Fdca7DO7H/PnzeeutN9lww578/KbxdO/RgxEDt2f7vjuwXpfiP98jv3gch33+6Ia9tyoy2Orr4alT+dLRX2Dx4sUsySUccOBBjD7rHJ5+6imOOPwQ5sx5mcFDhnLN2Gvp0qULb7zxBkcfeQRTJj9Er14b87Nx1/Pe7Vq7KItaY4POMTEz39+8vRHBNpZi/bneFEOT12fm+RHRl2LZoqBYZufYzOyzgkMvY7BpXWawaV23smCr21BkeYuLATXbtWNbe6/gKc8Cu2RmRsQhFEsaSZLUKh3ptu3DgcvK4cm5gONWkqRW6zDBlpl3U9ytWJKkNnO6miSpUgw2SVKlGGySpEox2CRJlWKwSZIqxWCTJFWKwSZJqhSDTZJUKQabJKlSDDZJUqUYbJKkSjHYJEmVYrBJkirFYJMkVYrBJkmqFINNklQpBpskqVIMNklSpRhskqRKMdgkSZVisEmSKsVgkyRVisEmSaoUg02SVCkGmySpUgw2SVKlGGySpEox2CRJlWKwSZIqxWCTJFWKwSZJqhSDTZJUKQabJKlSDDZJUqUYbJKkSjHYJEmVYrBJkirFYJMkVYrBJkmqFINNklQpBpskqVIMNklSpRhskqRKMdgkSZVisEmSKsVgkyRVisEmSaoUg02SVCkGmySpUgw2SVKlGGySpEox2CRJlWKwSZIqxWCTJFWKwSZJqhSDTZJUKQabJKlSDDZJUqUYbJKkSjHYJEmVYrBJkirFYJMkVYrBJkmqFINNklQpBpskqVIMNklSpRhskqRKMdgkSZVisEmSKsVgkyRVisEmSaoUg02SVCkGmySpUgw2SVKlGGySpEox2CRJldJpZTsi4rdArmx/Zn6qLhVJkrQGVhpswMXtVoUkSWvJSoMtM+9qz0IkSVobVnXGBkBE9AW+BewErL+0PTO3q2NdkiS1SUsmj1wD/AhYBOwB/BT4WT2Lki
SprVoSbBtk5u1AZObMzBwD7FnfsiRJapvVDkUCb0REEzAjIr4CPAtsXt+yJElqm5acsZ0EdAX+HzAcOAL4Qj2LkiSprVZ7xpaZD5QPFwBH1bccSZLWTEtmRd7BCi7Uzkw/Z5MkdTgt+YzttJrH6wMHUMyQlCSpw2nJUOTEZk33RoQXb0uSOqSWDEVuXLPZRDGBZMu6VdQK72oKNuraudFlSA3R6wNfaXQJUofUkqHIiRSfsQXFEOTTwDH1LEqSpLZqSbDtmJlv1DZERJc61SNJ0hppyXVsf15B231ruxBJktaGVd2PbUugD7BBRAylGIoE2JDigm1JkjqcVQ1Ffhw4EtgK+A5vB9s8YHR9y5IkqW1WdT+2scDYiDggM29qx5okSWqzlnzGNjwiei7diIheEfGNOtYkSVKbtSTY9snMuUs3MnMO8In6lSRJUtu1JNjeVTu9PyI2AJzuL0nqkFpyHdu1wO0RcU25fRQwtn4lSZLUdi1ZK/KiiJgKfJRiZuTvgW3qXZgkSW3RkqFIgFnAEoqV/UcC0+pWkSRJa2BVF2j3Aw4BDgVmAzcAkZl7tFNtkiS12qqGIqcDdwOfzMwnACLi5HapSpKkNlrVUOQBFEOQd0TE1RExkrdXH5EkqUNaabBl5s2ZeTDQH7gTOBnYIiJ+FBF7tVN9kiS1ymonj2Tmq5k5LjP3pVg3cjLwtbpXJklSG7R0ViQAmflyZl6ZmXvWqyBJktZEq4JNkqSOzmCTJFWKwSZJqhSDTZJUKQabJKlSDDZJUqUYbJKkSjHYJEmVYrBJkirFYJMkVYrBJkmqFINNklQpBpskqVIMNklSpRhskqRKMdgkSZVisEmSKsVgkyRVisEmSaoUg02SVCkGmySpUgw2SVKlGGySpEox2CRJlWKwSZIqxWCTJFWKwSZJqhSDTZJUKQabJKlSDDZJUqUYbJKkSjHYJEmVYrBJkirFYJMkVYrBJkmqFINNklQpBpskqVIMNklSpRhskqRKMdgkSZVisEmSKsVgkyRVisEmSaoUg02SVCkGmySpUgw2SVKlGGySpEox2CRJlWKwSZIqxWCTJFWKwSZJqhSDTZJUKQabJKlSDDZJUqUYbJKkSjHYJEmVYrBJkirFYJMkVYrBJkmqFINNklQpBpskqVIMNklSpRhskqRKMdgkSZVisEmSKsVgkyRVisEmSaoUg02SVCkGmySpUgw2SVKlGGySpEox2CRJlWKwSZIqxWDTSs2dO5fPHfpZhg3aieGDd+YvE+5j6pTJ7LHbruw6Yhi77TqCBx+4H4A5c+Zw6EH7s8v7h7D7v+/CY48+0uDqpZa54tzDmXn7t3jwF6OXtQ3s14c7x57KAzeO5pff+zI9uq2/bN9pR+/FI7ecy5Sbz+ajH9xxuWM1NQX3XTeKmy497h2vc8moz/Livd+p3xvRMgabVuqMU0/iox/7OJOmPsZ9DzzEDv135OzRo/j6mWfz5/snceY5Yzh79NcAuPiibzFo0BAmPDiZK3/8E8449eQGVy+1zM9+O4FPn/jD5dp+dM5hnPX9W/jAQRfwmzumcPIXRgLQf7st+ezHhzHswG/yqRMv59KvH0RTUyx73lcO24PHn/7nO15j2E5bs1H3Der7RrSMwaYVmjdvHn++526+cNQxAKy33nr07NmTiGD+vHlFn1deoXfv3gBMn/YYH9ljTwB22KE/f5/5N1745zt/wKWO5t5JT/LyK68t19Z3m825Z+ITAPzfhOl8ZuQQAPbdfRC/+MMk3nxrETOfm82T/3iJDwzYFoA+m/dk73/fmWtu/vNyx2pqCi446TOceemv6/9mBNQ52CJi24iYHhFjI2JqRPwyIrpGxMiIeCgiHo6I/4mILmX/CyPisbLvxfWsTav2t6efYtPNNuO4Lx3Nh/5tOCce9yVeffVVLrz4u5z19VH0f982nPn1MxjzXxcAMHDgYH5zy80APPjA/fz97zN59tlnGvkWpDZ77Mnn2Xf3gQDs/7
FhbLVFLwD6bLYRz8yas6zfsy/M4d2bbwTAt08/gDMv/TVLluRyxzr+4I9w610PM+ulee1UvdrjjG0H4KrMHATMA04BfgIcnJkDgU7A8RGxMbAfsHPZ9xvtUJtWYtGiRUx+aBJfPPY47v3LRLp168Yl3/5vfnzVFVz47e8w/cmZXHjRdzjxuC8BcMrpo5g7Zw67jhjGlZdfxuAhQ+nUqVOD34XUNl8eM44vH7Qb9447g+5du/DmW4uLHRHv6JsJ+3x4AC+8PJ+Hpv1juX29N9uI/T82lMuvv6s9ylYpMnP1vdp68IhtgT9l5tbl9p7A2cC7MnO3sm0kcCJwEDAReBC4FRifmW+u4JjHAseWmzsAj9ftDazbOgE7Ag+X292BLcs/J5dtmwLvAR5awfMHAo8CS+pbprTm+vXrt9748eP79uvX79Hm+wYOHNhl3Lhx2w0aNGjaBRdcsCXA6NGjZwGb3n333b3GjBnz3H777dfzwAMP3GTRokXZpUuXpm7dujXddtttc6+77rqXL7vssm0XLly4BKB3797rPfPMMwu32WYbZ1etHdtk5mbNG9sj2O7KzG3K7ZUGW2buXw5JjgQOAbbKzD3rVpxWKyLuBr6YmY9HxBigG7AvcHxm3hkRjwMLMnN4RPQEXsvMNyPiS8CHM/PzjateapVtgfHAgHJ7c+AFilGtnwB3Av8D7Az8HBjRv3//B6ZPn94d6AssrjnW7sBpFD8rzS2g+OVQddQeY0VbR8QHM/M+4FDgf4EvR8T2mfkEcARwV0R0B7pm5u8iYgLwRDvUplX7KjAuItYDngKOAm4BLo2ITkAf4CNl3x2Bn0bEYuAx4JgG1Cu1xXUUYbQp8AxwLkX4nFju/xVwTfn4UeBG4LHf//73fSg+PqkNNXUA7XHG9jvgT8CuwAyKIPsgcDFFsD4AHA9sTPGf5vpAABdn5ti6Fac1FhEPZub7G12H1Ah+/3dc7XHGtiQzm1+teDswtFnb88CIdqhHa89VjS5AaiC//zuo9jhjG5+ZA1bTVZKktaKuwSZJUntz5RFJUqUYbJKkSjHYJEmVYrCpVSJifkTMa/b1j4i4OSK2a3R9Uj1FxEURsWFEdI6I2yPipYj4XKPr0vIMNrXWJcDpFBdnb0WxwsLVwPUUKzNIVbZXZs6jWFXkGaAfxc+DOhCDTa21d2ZemZnzM3NeZl4FfCIzbwB6Nbo4qc46l39+ArguM19uZDFaMYNNrbUkIg6KiKby66CafV47oqr7bURMB94P3B4RmwFvNLgmNeN1bGqV8nO0SymWRUtgAnAy8CwwPDPvaWB5Ut1FRC9gXmYujoiuwIaZOavRdeltBpsktVBEdKZY23a3suku4IrMfKtxVak5hyLVKhHRr5wN9ki5PSgizmp0XVI7+REwHLi8/BpWtqkD8YxNrRIRd1HMArsyM4eWbY+4HqjWBRExJTMHr65NjeUZm1qra2be36xtUUMqkdrf4oh439KN8jNn78fWwbTHbWtULS+VP9gJEBEHUtxySFoXnA7cERFPldvbUtyAVx2IQ5FqlfI31Ksobhw7B3gaODwzZza0MKkdRMT6wKnAyLLpj8B3M9Mp/x2IwaZWiYguwIEUv6luDMwDMjPPb2RdUnuIiBspvufHlU2HAr0y87ONq0rNORSp1roFmAtMAp5rcC1Se9uh2USROyJiSsOq0QoZbGqtrTJz70YXITXIQxGxS2ZOAIiIfwPubXBNasahSLVKRFwF/CAzH250LVJ7i4hpwA7A38umrYFpwBKKIflBjapNbzPY1CoR8RiwPcWkkYVA4A+01hERsc2q9juJqmMw2NQqK/vB9gdaUkdhsEmSKsWVRyRJlWKwSZIqxWCT6iwiFkfE5Ih4JCJ+Ud7Dq63H2j0ixpePPxURX1tF354RcUIbXmNMRJzW1hqlRjPYpPp7PTOHlHdAeBM4rnZnFFr9s5iZv8nMC1fRpSfQ6mCT/tUZbFL7uhvYPiK2jYhpEXE5xS
ou74mIvSLivoiYVJ7ZdQeIiL0jYnpE3APsv/RAEXFkRFxWPt4iIm6OiCnl167AhcD7yrPFb5f9To+IByJiakScV3OsMyPi8Yj4X4rrtKR/WQab1E4iohOwD7D04vYdgJ+W97V7FTgL+GhmDgMeBE4pF929Gvgk8GFgy5Uc/vvAXeVyT8OAR4GvAU+WZ4unR8ReQF9gBDAEGB4Ru0XEcOAQYChFcH5gLb91qV25pJZUfxtExOTy8d3Aj4F3AzOXLs0E7ALsBNwbEQDrAfcB/YGnM3MGQERcCxy7gtfYE/g8QGYuBl6JiF7N+uxVfj1UbnenCLoewM2Z+Vr5Gr9Zo3crNZjBJtXf65k5pLahDK9Xa5uAP2bmoc36DaG8991aEMC3MvPKZq9x0lp8DanhHIqUOoYJwIciYnuAiOgaEf2A6cB7a+7afOhKnn87cHz53HdFxIbAfIqzsaX+ABxd89ldn4jYHPgTsF9EbBARPSiGPaV/WQab1AFk5ovAkcB1ETGVIuj6lzewPBa4tZw8srKly/4T2CMiHgYmAjtn5myKoc1HIuLbmXkb8HPgvrLfL4EemTkJuAGYDNxEMVwq/ctySS1JUqV4xiZJqhSDTZJUKQabJKlSDDZJUqUYbJKkSjHYJEmVYrBJkirFYJMkVcr/B+FI8cOeknhbAAAAAElFTkSuQmCC\n",
"text/plain": [
""
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"preds,y,losses = learn_c.get_preds(with_loss=True)\n",
"predictions = np.argmax(preds, axis = 1)\n",
"\n",
"interp = ClassificationInterpretation(learn_c, preds, y, losses)\n",
"interp.plot_confusion_matrix()"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"hidden": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[ 2112 300]\n",
" [ 689 19044]]\n",
"accuracy global: 0.9553398058252427\n",
"accuracy on negative reviews: 87.56218905472637\n",
"accuracy on positive reviews: 96.50838696599605\n"
]
}
],
"source": [
"from sklearn.metrics import confusion_matrix\n",
"cm = confusion_matrix(np.array(y), np.array(predictions))\n",
"print(cm)\n",
"\n",
"# overall accuracy\n",
"print(f'accuracy global: {(cm[0,0]+cm[1,1])/(cm[0,0]+cm[0,1]+cm[1,0]+cm[1,1])}')\n",
"\n",
"# per-class accuracy (recall) on negative and positive reviews, in percent\n",
"print(f'accuracy on negative reviews: {cm[0,0]/(cm[0,0]+cm[0,1])*100}') \n",
"print(f'accuracy on positive reviews: {cm[1,1]/(cm[1,0]+cm[1,1])*100}')"
]
},
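{
"cell_type": "markdown",
"metadata": {
"hidden": true
},
"source": [
"The per-class figures printed above are the recall of each class: the share of actual negative (or positive) reviews that the classifier labels correctly. A minimal NumPy sketch, using the confusion matrix printed above (rows are actual labels, columns are predictions, in the order neg, pos), computes the same quantities and adds the per-class precision for comparison. The variable names are illustrative only.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"# Confusion matrix copied from the output above:\n",
"# rows = actual (neg, pos), columns = predicted (neg, pos).\n",
"conf_mat = np.array([[2112, 300],\n",
"                     [689, 19044]])\n",
"\n",
"accuracy = np.trace(conf_mat) / conf_mat.sum()\n",
"recall_per_class = np.diag(conf_mat) / conf_mat.sum(axis=1)     # per-class accuracy\n",
"precision_per_class = np.diag(conf_mat) / conf_mat.sum(axis=0)\n",
"\n",
"print(f'accuracy: {accuracy:.4f}')\n",
"print(f'recall (neg, pos): {recall_per_class.round(4)}')\n",
"print(f'precision (neg, pos): {precision_per_class.round(4)}')\n",
"```\n",
"\n",
"Because the validation set is heavily skewed towards positive reviews, the global accuracy is dominated by the positive class, which is why the per-class values are reported separately."
]
},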
{
"cell_type": "code",
"execution_count": 34,
"metadata": {
"hidden": true
},
"outputs": [
{
"data": {
"text/html": [
"<table>\n",
"  <thead>\n",
"    <tr>\n",
"      <th>text</th>\n",
"      <th>target</th>\n",
"      <th>prediction</th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
"      <td>▁xxbos ▁xxup ▁a vert issement ▁du ▁7 ▁mai ▁2014 ▁: ▁ref ont e ▁de ▁ce ▁comment aire ▁avec ▁3 ▁par a graphe s ▁et ▁ ajout s ▁au ▁xxup ▁monde ▁xxup ▁de ▁xxup ▁la ▁xxup ▁ symphonie , ▁soit ▁une ▁occasion ▁de ▁faire ▁l ' ap ologie ▁de ▁cette ▁forme ▁universelle ▁de ▁la ▁geste ▁symphonique ▁de ▁l ' harmoni a ▁xxmaj ▁mun di ▁! ▁xxmaj ▁parmi ▁les ▁quelques ▁ intégrale</td>\n",
"      <td>pos</td>\n",
"      <td>pos</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <td>▁xxbos ▁xxmaj ▁tout ▁comme ▁il ▁l ' avait ▁fait ▁pour ▁la ▁première ▁saison ▁d ' her o es ▁en ▁xxup ▁dvd , ▁xxmaj ▁universal ▁france ▁nous ▁propose ▁un ▁c offre t ▁intégra l ▁de ▁xxmaj ▁ battle star ▁xxmaj ▁gal ac tica ▁encore ▁plus ▁complet ▁que ▁l ' édition ▁américaine , ▁déjà ▁que ▁celle - ci ▁est ▁bour rée ▁à ▁cra quer ▁de ▁bonus , ▁documentaire s ▁et ▁comment aires</td>\n",
"      <td>pos</td>\n",
"      <td>pos</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <td>▁xxbos ▁xxmaj ▁cette ▁xxup ▁saga ▁risque ▁de ▁devenir ▁une ▁vit rine ▁c inématographique , ▁si ▁l ' on ▁regard e ▁le ▁nombre ▁et ▁le ▁pan el ▁d ' acteur s ▁connus , ▁qui ▁ont ▁été ▁pré ssent i . ▁xxup ▁un ▁xxup ▁top , ▁xxup ▁la ▁xxup ▁recette ▁xxup ▁: s yl vers ter ▁xxup ▁st allo ne ▁xxup ▁aux ▁xxup ▁commandes , ▁xxup ▁au ▁xxup ▁sens ▁xxup ▁propre ▁xxup</td>\n",
"      <td>pos</td>\n",
"      <td>pos</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <td>▁xxbos ▁xxmaj ▁personne ▁n ' imagin ait , ▁et ▁probablement ▁xxmaj ▁ross ini ▁le ▁premier , ▁à ▁la ▁création ▁trio m phal e ▁de ▁xxmaj ▁guillaume ▁xxmaj ▁tel l ▁à ▁xxmaj ▁paris ▁en ▁août ▁1829 , ▁que ▁ce ▁chef ▁d ' œuvre ▁d ' un ▁compositeur ▁de ▁ 37 ▁ans , ▁parvenu ▁au ▁sommet ▁de ▁sa ▁gloire , ▁serait ▁aussi ▁sa ▁dernière ▁œuvre ▁majeure , ▁pré lu dant ▁à ▁un</td>\n",
"      <td>pos</td>\n",
"      <td>pos</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <td>▁xxbos ▁xxmaj ▁ attention ▁: ▁cette ▁critique ▁sera ▁con stell ée ▁de ▁gros ▁ s po il ers . ▁xxmaj ▁vous ▁ voi là ▁donc ▁pré ven us ▁que ▁vous ▁continue z ▁cette ▁lecture ▁à ▁vo s ▁risque s ▁et ▁pér ils . ▁xxmaj ▁après ▁avoir ▁ex ami né ▁un ▁peu ▁le ▁film ▁sur ▁la ▁forme , ▁nous ▁analyse ron s ▁le ▁fond ▁en ▁détail , ▁afin ▁de ▁répondre ▁à</td>\n",
"      <td>neg</td>\n",
"      <td>neg</td>\n",
"    </tr>\n",
"  </tbody>\n",
"</table>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"learn_c.show_results()"
]
},
{
"cell_type": "markdown",
"metadata": {
"hidden": true
},
"source": [
"### Predictions on some random sentences"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {
"hidden": true
},
"outputs": [],
"source": [
"import matplotlib.cm as cm\n",
"import warnings\n",
"warnings.filterwarnings('ignore') # \"error\", \"ignore\", \"always\", \"default\", \"module\" or \"once\""
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {
"hidden": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"neg tensor([0.9965, 0.0035])\n"
]
}
],
"source": [
"# Get the prediction neg/pos\n",
"review = 'Ce produit est bizarre.'\n",
"pred = learn_c.predict(review)\n",
"print(pred[0], pred[2])"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {
"hidden": true
},
"outputs": [
{
"data": {
"text/html": [
"▁xxbos ▁xxmaj ▁ce ▁produit ▁est ▁bi zar re . "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# The darker a word's shading in the example below, the more that word contributes to the classification.\n",
"txt_ci = TextClassificationInterpretation.from_learner(learn_c)\n",
"test_text = 'Ce produit est bizarre.'\n",
"txt_ci.show_intrinsic_attention(test_text,cmap=cm.Purples)"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {
"hidden": true
},
"outputs": [
{
"data": {
"text/plain": [
"tensor([0.0788, 0.1047, 0.1551, 0.3086, 0.3445, 0.8357, 1.0000, 0.6749, 0.1169],\n",
" device='cuda:0')"
]
},
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"txt_ci.intrinsic_attention(test_text)[1]"
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {
"hidden": true
},
"outputs": [
{
"data": {
"text/html": [
"<table>\n",
"  <thead>\n",
"    <tr>\n",
"      <th>Text</th>\n",
"      <th>Prediction</th>\n",
"      <th>Actual</th>\n",
"      <th>Loss</th>\n",
"      <th>Probability</th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
"      <td>▁xxbos ▁xxmaj ▁si ▁on ▁a ▁ aim é ▁le ▁film , ▁on ▁retrouve ▁toute ▁son ▁ intensité ▁à ▁l ' écoute ▁du ▁xxup ▁cd ▁xxmaj ▁je ▁l ' ai ▁en ▁boucle ▁dans ▁la ▁voiture , ▁je ▁ne ▁m ' en ▁la sse ▁pas</td>\n",
"      <td>pos</td>\n",
"      <td>pos</td>\n",
"      <td>13.37</td>\n",
"      <td>1.00</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <td>▁xxbos ▁xxmaj ▁en ▁xxmaj ▁europe , ▁il ▁a ▁existé ▁3 ▁montage s ▁ diff e rent s , ▁celui ▁de ▁la ▁sortie ▁en ▁1982 ▁( international ), ▁celui ▁de ▁1992 ▁( dire c tor ' s ▁ cut ) ▁et ▁celui ▁de ▁2007 ▁( fin al ▁ cut ). ▁xxmaj ▁le ▁moins ▁inter re ssant , ▁parce ▁que ▁le ▁plus ▁ aff adi ▁( s up press ion ▁de ▁4</td>\n",
"      <td>pos</td>\n",
"      <td>pos</td>\n",
"      <td>11.64</td>\n",
"      <td>1.00</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <td>▁xxbos ▁xxmaj ▁bonne ▁suite ▁de ▁la ▁saison ▁1 ▁xxmaj ▁ attention ▁ne ▁pas ▁regard er ▁avant ▁16 -17 ▁ans ▁beaucoup ▁de ▁sang ▁et ▁de ▁sexe ▁xxmaj ▁trop ▁peu ▁d ' épisode ▁par ▁saison . ▁xxmaj ▁trop ▁ excellent</td>\n",
"      <td>pos</td>\n",
"      <td>pos</td>\n",
"      <td>8.09</td>\n",
"      <td>1.00</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <td>▁xxbos ▁xxmaj ▁je ▁trouve ▁ce ▁produit ▁fort ▁cher ▁pour ▁ce ▁que ▁c ' est . ▁xxmaj ▁comme ▁d ' habitude ▁on ▁pay e ▁la ▁pomme . ▁xxmaj ▁ ok ▁le ▁mac book ▁est ▁léger , ▁mais ▁il ▁faut ▁toujours ▁le ▁se t ▁d ' adapt ateurs ▁avec ...</td>\n",
"      <td>pos</td>\n",
"      <td>neg</td>\n",
"      <td>5.85</td>\n",
"      <td>0.00</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <td>▁xxbos ▁xxmaj ▁un ▁film ▁de ▁toute ▁beauté ▁ , ▁ é mou vant , ▁re la tant ▁une ▁réalité ▁historique ▁xxrep ▁5 ▁! ▁xxmaj ▁une ▁distribution ▁de ▁qualité , ▁film ée ▁d ' une ▁main ▁de ▁maître ▁! ! !</td>\n",
"      <td>pos</td>\n",
"      <td>pos</td>\n",
"      <td>5.75</td>\n",
"      <td>1.00</td>\n",
"    </tr>\n",
"  </tbody>\n",
"</table>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Table of the first k texts in top_losses, with their prediction, actual label, loss, and probability of the actual class.\n",
"# max_len is the maximum number of tokens displayed. If max_len=None, it will display all tokens.\n",
"txt_ci.show_top_losses(5)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Fine-tuning \"backward Classifier\""
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
"bs = 18\n",
"warnings.filterwarnings('ignore') # \"error\", \"ignore\", \"always\", \"default\", \"module\" or \"once\""
]
},
{
"cell_type": "markdown",
"metadata": {
"heading_collapsed": true
},
"source": [
"### Databunch"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"hidden": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 2.18 s, sys: 560 ms, total: 2.74 s\n",
"Wall time: 4.07 s\n"
]
}
],
"source": [
"%%time\n",
"data_lm = load_data(path, f'{lang}_databunch_lm_aws_sp15_multifit_bwd_v2', bs=bs, backwards=True)"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"hidden": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 49.9 s, sys: 1.75 s, total: 51.6 s\n",
"Wall time: 1min 30s\n"
]
}
],
"source": [
"%%time\n",
"data_clas = (TextList.from_df(df_trn_val, path, cols=reviews, processor=SPProcessor.load(dest))\n",
" .split_by_rand_pct(0.1, seed=42)\n",
" .label_from_df(cols=label)\n",
" .databunch(bs=bs, num_workers=1, backwards=True))"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"hidden": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 6.49 s, sys: 1.02 s, total: 7.51 s\n",
"Wall time: 8.75 s\n"
]
}
],
"source": [
"%%time\n",
"data_clas.save(f'{lang}_textlist_class_sp15_multifit_bwd_v2')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Get weights to penalize loss function of the majority class"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 12.4 s, sys: 800 ms, total: 13.2 s\n",
"Wall time: 14.9 s\n"
]
}
],
"source": [
"%%time\n",
"data_clas = load_data(path, f'{lang}_textlist_class_sp15_multifit_bwd_v2', bs=bs, num_workers=1, backwards=True)"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(199311, 22145, 221456)"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"num_trn = len(data_clas.train_ds.x)\n",
"num_val = len(data_clas.valid_ds.x)\n",
"num_trn, num_val, num_trn+num_val"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(array([ 22694, 176617]), array([ 2412, 19733]))"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"trn_LabelCounts = np.unique(data_clas.train_ds.y.items, return_counts=True)[1]\n",
"val_LabelCounts = np.unique(data_clas.valid_ds.y.items, return_counts=True)[1]\n",
"trn_LabelCounts, val_LabelCounts"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"([0.8861377445299056, 0.11386225547009443],\n",
" [0.891081508241138, 0.10891849175886203])"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"trn_weights = [1 - count/num_trn for count in trn_LabelCounts]\n",
"val_weights = [1 - count/num_val for count in val_LabelCounts]\n",
"trn_weights, val_weights"
]
},
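{
"cell_type": "markdown",
"metadata": {},
"source": [
"The weights computed above are simply one minus each class's share of the training set, so the minority class (negative reviews) receives the larger weight. A minimal sketch with plain NumPy, using the training label counts printed above, reproduces the calculation outside fastai. The variable names are illustrative.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"# Training label counts copied from the output above, in the order (neg, pos).\n",
"trn_label_counts = np.array([22694, 176617])\n",
"\n",
"# Weight of a class = 1 - (its share of the training set),\n",
"# so the rarer class gets the larger weight.\n",
"trn_weights = 1.0 - trn_label_counts / trn_label_counts.sum()\n",
"print(trn_weights.round(4))  # roughly 0.8861 for neg and 0.1139 for pos, as above\n",
"```\n",
"\n",
"Only the training weights are passed to the loss function later on; the validation weights are computed for comparison and show that both splits have a similar class balance."
]
},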
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Training (Loss = FlattenedLoss of weighted CrossEntropyLoss)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 14.4 s, sys: 684 ms, total: 15.1 s\n",
"Wall time: 13.8 s\n"
]
}
],
"source": [
"%%time\n",
"data_clas = load_data(path, f'{lang}_textlist_class_sp15_multifit_bwd_v2', bs=bs, num_workers=1, backwards=True)"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [],
"source": [
"config = awd_lstm_clas_config.copy()\n",
"config['qrnn'] = True\n",
"config['n_hid'] = 1550 #default 1152\n",
"config['n_layers'] = 4 #default 3"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [],
"source": [
"learn_c = text_classifier_learner(data_clas, AWD_LSTM, config=config, drop_mult=0.5, pretrained=False, \n",
" metrics=[accuracy,f1]).to_fp16()\n",
"learn_c.load_encoder(f'{lang}fine_tuned_enc_sp15_multifit_bwd_v2');"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Change loss function"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"FlattenedLoss of CrossEntropyLoss()"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"learn_c.loss_func"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [],
"source": [
"loss_weights = torch.FloatTensor(trn_weights).cuda()\n",
"learn_c.loss_func = partial(F.cross_entropy, weight=loss_weights)"
]
},
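{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the effect of the `weight` argument concrete, here is a small NumPy sketch of a weighted cross-entropy. It assumes the behaviour of PyTorch's default `mean` reduction, where each sample's loss is scaled by the weight of its true class and the total is divided by the sum of those weights. The function name, logits and targets are made up for illustration.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"def weighted_cross_entropy(logits, targets, class_weights):\n",
"    # Numerically stable log-softmax over the class axis.\n",
"    shifted = logits - logits.max(axis=1, keepdims=True)\n",
"    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))\n",
"    # Negative log-likelihood of the true class for each sample.\n",
"    nll = -log_probs[np.arange(len(targets)), targets]\n",
"    # Scale each sample by the weight of its true class, then normalise\n",
"    # by the sum of those weights (assumed 'mean' reduction).\n",
"    w = class_weights[targets]\n",
"    return (w * nll).sum() / w.sum()\n",
"\n",
"logits = np.array([[2.0, 0.5], [0.2, 1.5], [0.1, 3.0]])  # made-up scores\n",
"targets = np.array([0, 1, 1])                            # 0 = neg, 1 = pos\n",
"weights = np.array([0.8861, 0.1139])                     # rounded weights from above\n",
"print(weighted_cross_entropy(logits, targets, weights))\n",
"```\n",
"\n",
"With these weights, a mistake on a negative review costs roughly eight times as much as a mistake on a positive one, which counteracts the class imbalance of the dataset."
]
},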
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"functools.partial(, weight=tensor([0.8861, 0.1139], device='cuda:0'))"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"learn_c.loss_func"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Training"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [],
"source": [
"learn_c.freeze()"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.\n"
]
}
],
"source": [
"learn_c.lr_find()"
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAY4AAAEGCAYAAABy53LJAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAgAElEQVR4nO3dd3yV9dn48c+VTRISCEkgBAgrInsFcG+rIgraasFRrW1trePRap/Wn1ato7XjqdU+jqpVHxe4akVFcRSsCgJh75EAIWElkATIPsn1++PciSfhJDkZJycnud6v13lx7nmum4wr3y2qijHGGOOrkEAHYIwxJrhY4jDGGNMiljiMMca0iCUOY4wxLWKJwxhjTIuEBTqAjpCYmKiDBw8OdBjGGBNUVq5cWaCqSQ33d4vEMXjwYDIzMwMdhjHGBBUR2e1tv1VVGWOMaRFLHMYYY1rEEocxxpgW8WviEJELRWSriOwQkV97OZ4mIp+LyDoRWSwiAzyOXSci253XdR77J4vIeueeT4iI+PMZjDHG1Oe3xCEiocCTwEXAKGCOiIxqcNqfgZdVdRzwIPB759oE4H5gGjAVuF9EejvXPA3cCKQ7rwv99QzGGGOO588Sx1Rgh6pmq2olMA+Y2eCcUcDnzvtFHscvAD5V1cOqWgh8ClwoIilAnKouVffsjC8Ds/z4DMYYYxrwZ+JIBfZ4bOc6+zytBb7rvL8M6CkifZq4NtV539Q9ARCRG0UkU0Qy8/PzW/0Qxhhj6vNn4vDW9tBwDve7gDNFZDVwJpAHuJq41pd7uneqPquqGaqakZR03PgVY4zp0vKPVvDQB5soLqtq93v7M3HkAgM9tgcAez1PUNW9qnq5qk4E7nH2FTdxba7zvtF7GmOMgd8t2MwrS3dTcKyi3e/tz8SxAkgXkSEiEgHMBuZ7niAiiSJSG8PdwAvO+4XAd0Skt9Mo/h1goaruA46KyElOb6ofAO/58RmMMSboLMkq4N3Vefz0zKEMS4pt9/v7LXGoqgu4BXcS2Ay8qaobReRBEbnUOe0sYKuIbAP6Ao841x4GHsKdfFYADzr7AG4Cngd2AFnAR/56BmOMCTYVrmru/dcGBiVEc/PZw/3yGdIdlo7NyMhQm6vKGNMd/O+/t/PnT7bx0g+ncNaI5DbdS0RWqmpGw/02ctwYY7qInEOl/O3fO7h4bEqbk0ZTLHEYY0wXoKrcN38DYSHCb2Y0HGvdvixxGGNMF7BmTxGLt+Zzx/kn0C8+yq+fZYnDGGO6gK37jwJwweh+fv8sSxzGGNMF7CwoISIshP69evj9syxxGGNMF5BdUEJaQjShIf6fMNwShzHGdAG7CkoYkhjTIZ9licMYY4JcdY2y+1ApQ5IscRhjjPHB3qIyKqtrGNLHEocxxhgfZBeUAFhVlTHGGN/sqk0cVlVljDHGFzsLSoiNDCMpNrJDPs8ShzHGBLnsghIGJ0bjXm3C/yxxGGNMkNtZcIwhie2/7kZjLHEYY0wQq3BVk1dY1mEN42CJwxhjgtqew6XUKAxJjO6wz7TEYYwxQSw7v7YrrlVVGWOM8cGuQ07i6KDBf2CJwxhjgtrOghISYiKIjw7vsM/0a+IQkQtFZKuI7BCRX3s5PkhEFonIahFZJyLTnf1Xi8gaj1eNiExwji127ll7zH/rIxpjTCeXnd9xkxvW8lviEJFQ4EngImAUMEdEGq5neC/wpqpOBGYDTwGo6muqOkFVJwDXArtUdY3HdVfXHlfVg/56BmOM6ex2duCsuLX8WeKYCuxQ1WxVrQTmATMbnKNAnPM+Htjr5T5zgLl+i9IYY4JUSYWLg0crulTiSAX2eGznOvs8PQBcIyK5wALgVi/3+T7HJ44XnWqq30gjQyVF5EYRyRSRzPz8/FY9gDHGdGY7O3hyw1r+TBzefqFrg+05wEuqOgCYDrwiInUxicg0oFRVN3hcc7WqjgVOd17Xev
twVX1WVTNUNSMpKaktz2GMMZ1SV0wcucBAj+0BHF8V9SPgTQBVXQpEAYkex2fToLShqnnOv0eB13FXiRljTLdTmzgGd2BXXPBv4lgBpIvIEBGJwJ0E5jc4Jwc4F0BERuJOHPnOdghwBe62EZx9YSKS6LwPB2YAGzDGmG5oV0EJKfFR9IgI7dDPDfPXjVXVJSK3AAuBUOAFVd0oIg8Cmao6H7gTeE5E7sBdjXW9qtZWZ50B5KpqtsdtI4GFTtIIBT4DnvPXMxhjTGeWHYAeVeDHxAGgqgtwN3p77rvP4/0m4NRGrl0MnNRgXwkwud0DNcaYIKOqZOcf45Lx/Tv8s23kuDHGBKHC0iqOlLsCUuKwxGGMMUEo/2gFAP3iozr8sy1xGGNMECqpdAEQE+HXFgevLHEYY0wQKq2oBiC6g3tUgSUOY4wJSqW1JY5IK3EYY4zxQWmllTiMMca0QG0bR7S1cRhjjPFFWW2JI9JKHMYYY3xQUts4Hm6JwxhjjA9KK11EhIUQFtrxv8YtcRhjTBAqqXQRE4CGcbDEYYwxQam0sjogDeNgicMYY4JSaUU1MQFoGAdLHMYYE5RKKl30sBKHMcYYX5VWVlsbhzHGGN9ZG4cxxpgWKa10WRuHMcYY35VUVAdknirwc+IQkQtFZKuI7BCRX3s5PkhEFonIahFZJyLTnf2DRaRMRNY4r2c8rpksIuudez4hIuLPZzDGmM6orNLV9aqqRCQUeBK4CBgFzBGRUQ1Ouxd4U1UnArOBpzyOZanqBOf1M4/9TwM3AunO60J/PYMxxnRGNTVKaVXXbByfCuxQ1WxVrQTmATMbnKNAnPM+Htjb1A1FJAWIU9WlqqrAy8Cs9g3bGGM6t3JXNaoQHYC1OMC/iSMV2OOxnevs8/QAcI2I5AILgFs9jg1xqrC+EJHTPe6Z28w9ARCRG0UkU0Qy8/Pz2/AYxhjTuZQEcPU/8G/i8Nb2oA225wAvqeoAYDrwioiEAPuAQU4V1i+A10Ukzsd7uneqPquqGaqakZSU1OqHMMaYzqZuSvUAtXH481NzgYEe2wM4virqRzhtFKq6VESigERVPQhUOPtXikgWcIJzzwHN3NMYY7q02kWcumIbxwogXUSGiEgE7sbv+Q3OyQHOBRCRkUAUkC8iSU7jOiIyFHcjeLaq7gOOishJTm+qHwDv+fEZjDGm06ldb7xHgBKH30ocquoSkVuAhUAo8IKqbhSRB4FMVZ0P3Ak8JyJ34K5yul5VVUTOAB4UERdQDfxMVQ87t74JeAnoAXzkvIwxptuoXW88JkCN4379VFVdgLvR23PffR7vNwGnernuHeCdRu6ZCYxp30iNMSZ4dOXGcWOMMX5QWtfG0fW64xpjjPGDkkorcRhjjGmBMqfE0RUHABpjjPGD2jaOHuFW4jDGGOOD0koXPcJDCQ0JzByvljiMMSbIuBdxCkxpAyxxGGNM0CmtrCY6QIs4gSWORqkqLy/dxWOfbgt0KMYYU09JhStgXXHBEkeTNu09wuOfb+fDdfsCHYoxxtSxqqpOSkT47czRTE7rzV1vrWXj3uJAh2SMMYC7cTxQM+OCJY4mRYaF8vQ1k4jvEc6NL6/k0LGKQIdkjDFW4ujskntG8ewPJlNwrIKfv7aKquqaQIdkjOnmSipdAZvgECxx+GTcgF48+t2xLNt5mMc/2x7ocIwx3VxZZXXAplQHSxw+u2ziAKYNSeCrHQWBDsUY082VVFQHbBEnsMTRIsOSY9l9qCTQYRhjurHqGqWsqtoax4PF4D7RFJZWUVxaFehQjDHdVFlV7SJOVuIICml9YgDYfdhKHcaYwPh22VgrcQSFtD7RAOw+VBrgSIwx3VWpMzNul23jEJELRWSriOwQkV97OT5IRBaJyGoRWSci053954vIShFZ7/x7jsc1i517rnFeyf58Bk+DEmoTh5U4jDGBUVK7FkcASxx++2QRCQWeBM4HcoEVIjLfWWe81r3Am6r6tIiMwr0++W
CgALhEVfeKyBhgIZDqcd3VztrjHSo6Ioy+cZHsshKHMSZAyiq7dhvHVGCHqmaraiUwD5jZ4BwF4pz38cBeAFVdrap7nf0bgSgRifRjrD5L6xNjJQ5jTMAEetlY8G/iSAX2eGznUr/UAPAAcI2I5OIubdzq5T7fBVarqud8Hy861VS/EZEOXckkLSHa2jiMMQFTWhH4qip/Jg5vv9C1wfYc4CVVHQBMB14RkbqYRGQ08Afgpx7XXK2qY4HTnde1Xj9c5EYRyRSRzPz8/DY8Rn2DE2M4eLSirmeDMcZ0pNoSR1edVj0XGOixPQCnKsrDj4A3AVR1KRAFJAKIyADgXeAHqppVe4Gq5jn/HgVex10ldhxVfVZVM1Q1IykpqV0eCKxnlTEmsMpqG8e7aBvHCiBdRIaISAQwG5jf4Jwc4FwAERmJO3Hki0gv4EPgblX9uvZkEQkTkdrEEg7MADb48RmOk5bgjOWwdg5jTAB06TYOVXUBt+DuEbUZd++pjSLyoIhc6px2J/ATEVkLzAWuV1V1rhsO/KZBt9tIYKGIrAPWAHnAc/56Bm8GWYnDGBNApRUuRCAqLHCJw6+VZKq6AHejt+e++zzebwJO9XLdw8DDjdx2cnvG2FLxPcJJiImwLrnGmIAorawmOjyUkJAO7RdUj40cb4W0PtFWVWWMCYiSyuqATjcCljhaxbrkGmMCpbTSFdDBf2CJo1XS+sSwt7iMCld1oEMxxnQzJRWBnVIdLHG0yuDEaFRhz+GyQIdijOlmyqpcAZ3gECxxtErd9OrWzmGM6WAlFYFdNhZ8TBwiMqx2rigROUtEbnPGWnRLaQnWJdcYExilla6AjhoH30sc7wDVIjIc+AcwBPeo7W4pISaCnpFhVuIwxnS4korqgI4aB98TR40zoO8y4K+qegeQ4r+wOjcRIS0x2sZyGGM6XFlVddCUOKpEZA5wHfCBsy/cPyEFh7QEm17dGNPxSipcAZ1uBHxPHD8ETgYeUdWdIjIEeNV/YXV+aX2iyS0sw1VdE+hQjDHdRHWNUuGqCY7uuKq6SVVvU9W5ItIb6Kmqj/o5tk5tcJ8YXDXK3qLyQIdijOmi7nxzLQs37q/brl3OISgGADrrfMeJSAKwFvdCSn/xb2idW+306rususoY00qvL8vhzRV7vB7bX1zOO6tyeW9NXt2+0rqZcYOgxAHEq+oR4HLgRVWdDJznv7A6PxvLYYxpi9JKF498uInHP9/u9fiqnEIANu87WrevpG71vyAocQBhIpICXMm3jePdWnLPSKLCQ1i48QCFJZWBDscYE2QWbtxPSWU1eUVl7C06fhaK1U7i2HWopK6KqrQTrMUBvieOB3Gvq5GlqitEZCjgPU12EyEhwn+dewJLsw9x7l++4M3MPdTUuFfG3VVQwhOfb+eKZ5awZk9RgCM1xnRG76zMo0e4OwGs2HX4uOOrcooIDRFUYct+d6mjNnHERAa2qsqnT1fVt4C3PLazge/6K6hgcdNZwzhrRBL3/msD//32OuYuz6G6RlmXWwxAaIjw+rLdTBjYbQfZG2O82FtUxtdZBdx81nBeWrKLzF2FzJyQWne80lXD+rxivjOqLx9t2M/mfUeYNKg3JU7JI1imHBkgIu+KyEEROSAi7zhrgnd7I1PieOunJ/PH741jX1E5qnDP9JEsvfscLhrTj39vya8riRhjDMC/1uShCldkDGBSWu/jShyb9h2h0lXDjHH96RkZxhannaO0wilxBLhx3NdPfxH3FCNXONvXOPvO90dQwSYkRLgyYyBXZgyst/+8kX35YN0+1uYWMXFQ7wBFZ4zpTFSVd1bmMmVwb9L6xDAlrTd/+WwbxaVVxEe7x1XXtm9MSuvFiSk92bzvCPBtd9xgaeNIUtUXVdXlvF4CkvwYV5dw1ogkQkOEzzYfCHQoxphOYm1uMVn5JXx3krvSJmNwAqqwMufbUseqnCJS4qNIie/ByJQ4tuw/Sk2Ndpo2Dl8TR4GIXCMioc
7rGuBQcxeJyIUislVEdojIr70cHyQii0RktYisE5HpHsfudq7bKiIX+HrPzqRXdAST03rz+eaDgQ7FGNNJvLMyl8iwEKaPc0/3N2FgL8JDhRW7CuvOWbW7kImD3G2jI1PiOFbhIrewrK6NI1hKHDfg7oq7H9gHfA/3NCSNEpFQ4EngImAUMEdERjU47V7gTVWdCMwGnnKuHeVsjwYuBJ6qTVo+3LNTOW9kMlv2HyW30CZENKa7q3BVM3/tXr4zuh9xUe5qqR4RoYxJjSfTaec4eKScvKIyJjnV2yNT4gB3u0dpRTUhApFhgV1KydcpR3JU9VJVTVLVZFWdhXswYFOmAjtUNVtVK4F5wMyGtwbinPfxwF7n/UxgnqpWqOpOYIdzP1/u2amcN7IvgJU6jDEs2nKQ4rIqvjsptd7+qYMTWLunmPKqalbluLvw17aLntA3FhHYvO8IpZXumXFFpMNj99SWtPWLZo6nAp5j6XOdfZ4eAK4RkVxgAXBrM9f6ck8ARORGEckUkcz8/PxmQvWfoUmxDE2MsXYOYwwL1u8nMTaS09PrNxFnDE6gstrdBXf1nkLCQ4XR/d1/U0dHhDGkTwxb9h+htNIV8LU4oG2Jo7mU5+14w36pc4CXVHUAMB14RURCmrjWl3u6d6o+q6oZqpqRlBTYdvxzRyazLPswx5zpAowx3dO2A0cZPyCe0JD6v8oy0tyli+U7D7N6dxGj+8cTFf5tghiZEsfmfUcpqawO+DxV0LbE0dzghFzAs3/qAL6tiqr1I+BNAFVdCkQBiU1c68s9O51zR/alsrqGL7cFruRjjAms6holu6CE4cmxxx3rHRNBenIs32QfYl1eUV3DeK2RKT3JOVzKwSPlAW8Yh2YSh4gcFZEjXl5Hgf7N3HsFkC4iQ0QkAndj9/wG5+QA5zqfNRJ34sh3zpstIpHO2h/pwHIf79npZKT1Jr5HOJ9ZO4cx3VZuYSmVrhqGJR2fOMBdXfXVjgLKq2rqGsZrndjPXW21Lrc44IP/oJkBgKras7U3VlWXiNyCe46rUOAFVd0oIg8Cmao6H7gTeE5E7sBdgrleVRXYKCJvApsAF3CzqlYDeLtna2PsKGGhIZw1IolFWw9SXaPHFVONMV1fVv4xAIZ5KXEATB3Sm7nLcwCYlFY/cYx02jvKqgK/3jj4PnK8VVR1Ae5Gb89993m83wSc2si1jwCP+HLPYHDuyL68t2Yvi7Yc5LxRfQMdjjGmg+046CSOpBivxzPSEgD3zNv946PqHesfH0VcVBhHygO/bCy0rY3DtMB5I5NJT47ljjfXsO3A0eYvMMZ0KVkHS0iMjaBXdITX4wN692BgQg+mDkk4rrutiNSN5wj2xnHTAtERYbz4wylEhYdy/QvLOXDElpw1pjvZkX+s0fYNcCeH1398Eg/NHOP1eG3iiLESR/cyoHc0L14/haKyKm54aYV1zzWmm1BVdhw81mj7Rq2BCdH0jvFeIhmZ4m5y7mElju5nTGo8T149iS37j3Lza6twVdcEOiRjjJ8dKqmkuKyK4U2UOJpjJY5u7uwRyTw0cwxfbMvn4Q83BzocY4yfZR1sukeVL07o25MT+/VkdGpc8yf7WeDLPN3UVdMGkZ1/jOe/2kl631iunpYW6JCMMX6yw+mK623wn6+iwkP5+PYz2iukNrESRwDdPX0kZ41I4v73NrIkqyDQ4Rhj/CTrYAk9wkNJiYtq/uQgYIkjgEJDhCfmTGRwYgw3vbqKXQUlgQ7JGOMHO/KPMSw5hpAuMvjXEkeAxUWF84/rMhCBH7+cSYWrOtAhGWPaWdbBprviBhtLHJ1AWp8YHpk1lh0Hj7Es+3DzFxhjgkZppYu8orI29ajqbCxxdBLnnJhMZFgIi7baRIjGdCXZ+e4q6Lb0qOpsLHF0Ej0iQjl5WB8Wbel6ieP5L7N5/svsQIdhTEBktUOPqs7GEkcncvaIZHYdKmVnF2skn7s8hz8u3MqhYxWBDsWYDpd18B
ghAml9ogMdSruxxNGJnD0iGaBLlTpUlbyiMipdNXVTRhvTnezIP0ZanxgiwwI/4ru9WOLoRAb1iWZoUkyr2zncS5l0LodLKimvqiFE4JVvdlNlU6yYbibrYEmjU6kHK0scncw5I9zrk5dWtmwCxJteXcnPX1vlp6haL6+oDIDvTxnIgSMVLFi/L8ARGdNxqmuUnQUlXaphHCxxdDpnn5hMZXUNS3Yc8vma8qpqPt98kI827O90vbLyCt2J4+ppaQxNjOGFr3cFNiBjOtCew6VUVje+XGywssTRyWQM7k1MRGiLEsCaPUVUVtcQGRbCwx9s6lTVQbUljoG9o7n+1MGs3VPEqpzCAEdlTMfoij2qwM+JQ0QuFJGtIrJDRH7t5fhjIrLGeW0TkSJn/9ke+9eISLmIzHKOvSQiOz2OTfDnM3S0yLBQTh2eyKItB31us1iWfRgRePS7Y8nKL+GVpbuPO6fSVROQNpDcwjJiI8OI6xHGdycNoGdUGC9aqcN0E3XLxSZa4vCJiIQCTwIXAaOAOSIyyvMcVb1DVSeo6gTgb8A/nf2LPPafA5QCn3hc+sva46q6xl/PEChnn5jM3uJyth045tP5y3YeYmS/OGZNSOX09ET++tk2DpdU1h1/Z2UuEx78hKe/yPJXyI3KKyojtVcPRISYyDC+nzGQBev3sa+4rMNjMaajbd53hL5xkcRHhwc6lHblzxLHVGCHqmaraiUwD5jZxPlzgLle9n8P+EhVS/0QY6d01ogkAJ+qqypdNazKKaxbp/i+GaMoqazmL59u5ViFi1+8sYY731pLVXUNry/L6fBSR15hGam9e9RtX3fKYFSVl3wsdRSXVvH04iz2FnlPNJv3HeHRj7ZQXmVzfJnOZ1VOERMH9g50GO3On4kjFdjjsZ3r7DuOiKQBQ4B/ezk8m+MTyiMiss6p6ops5J43ikimiGTm5+e3PPoASonvwYn9evo0nmN9XhHlVTWcNDQBgPS+Pblm2iBeX5bD9Me/5F9r8rjjvBP43WVjyS0sY+Xujm1fqC1x1BqYEM2sCam8+PWuZmcDXrHrMNOf+JI/fLyFy576mo17i+sd/yb7EFc+s5Rnvsjis80H/BK/Ma1VcKyCnMOlTErrFehQ2p0/E4e3+YMb+3N3NvC2qtb7s1FEUoCxwEKP3XcDJwJTgATgV95uqKrPqmqGqmYkJSW1NPaAO+fEZDJ3F1LoUeXkzTfOpIhTBifU7bvj/BOI7xFOpauG139yEv91XjrTx6bQIzyUd1fn+TVuT8cqXBSXVdUrcQD8+qITiQgL4bfvb/RaAnJV1/DYp9v4/t+XEhYqPD57AiEifP/v3/DldvcfAR9v2M8PXlhO3/goEmMj+WCtdfM1ncvqnCIAJg2yEkdL5AIDPbYHAHsbOddbqQLgSuBdVa2q3aGq+9StAngRd5VYlzN9bArVNcpHG/Y3ed6ynYdJT46lT+y3Ba9e0REsvP0MPrvzTE4a2geAmMgwzh/Vlw/X76PS1TG9rmq74nqWOACS46K4/bx0Fm3N5/PN9UtVh45VMOe5b3j88+3MmpjKh7edzswJqbz781MZ0LsHP3xxBXf/cx0/f20lo/vH8dZPT2bGuBQWbT3IsYqWjX0xxp9W5RQSFiKMSY0PdCjtzp+JYwWQLiJDRCQCd3KY3/AkERkB9AaWernHce0eTikEERFgFrChnePuFEb3j2NoYgzvr20s17r/Ml+56zDThiYcdyw5LorYyPorA8+a2J+i0iq+2NYxVXd5Re5mqYYlDnC3daQnx/LbDzbWtU/sKijh8qeXsC63mMe+P56/XDmh7hn6xUfx1s9O5qShfZi7fA9nnJDEaz+eRu+YCC4el0KFq4bPrbrKdCKrdhcyun8cUeFdZ6qRWn5LHKrqAm7BXc20GXhTVTeKyIMicqnHqXOAedqgzkJEBuMusXzR4Navich6YD2QCDzsnycILBFhxvj+fLPzEAeOlHs9Z+PeI5RUVjNtSB
+f7nl6ehIJMRH8a03HVFfVljgG9Do+cYSHhvDbmaPZc7iMZ77IYnVOIZc/vYQjZVW8/pOTuGzigOOu6RkVzgvXT+HFH07huR9kEB3hTiqTB/WmX1wU71t1lekkXNU1rMstZmIXrKYCCGv+lNZT1QXAggb77muw/UAj1+7CS2O6qp7TfhF2bpeO788Tn2/nw3X7uOG0IccdX7bTPbp82pDjSxzehIeGMGNcCm+s2MPR8ip6Rvm3i2BuURkRoSEkxnrtv8ApwxKZMS6FpxZn8cwXWST3jOKlH05haBOjbCPCQuomg6wVEiJcPC6FV5buprisivgeXavrowk+W/YfpayqmomDul7DONjI8U5teHIso1LimN9IddWy7MMMSYwhOS7K53vOnJBKhauGj5tpO2kPeYVl9O8V1eQ6y/dcPJLI0BBG9O3JP39+SpNJoykzxqVQWV3Dp5v8V11V6arhsqe+5t5/redoeVXzF5hua7UzO0JXbBgHSxyd3iXj+7NmTxF7DtcfxlJdoyzfddjn0katSYN6MSghukOqq/KKyry2b3hKie/B4l+exds3ndJoycQXEwb2IrVXDz5Y13ibUFtt3FvM6pwiXv0mhwse+w+LO9m8YKbzWJVTRFLPSAY08/0frCxxdHIzxqUA8H6DX4hb9h/haLnLa8N4U0SEWRP6syTrEBvyijlSXuW3QYF5hWXH9ajypk9sJOGhbftWFBFmjEvhq+0FFJU23YW5tVY53SufvnoS0ZFhXP/iCu58c6315jLHWZVTyKRBvXD34el6LHF0cgMTopk0qBfz19RPHMuc8Ru+Nox7mjUxFVWY8bevGPfAJ6Tf8xGn/eHfrNx9uF1iBqhwVXPwaAWpvTpu1bMZ4/rjqlEWbvRPNdyqnEL6x0dx0dgUPrztNG45ezj/WpPHT1/JpMJlI9eN26FjFew+VNplG8bBEkdQuHR8f7bsP8r2A0epqVHeXZ3L3/+TxaCEaPr78Bd9Q0OTYnnnppP50/fGcc/0kdx4xlBU4c4317bb1B37itw9wZqrqmpPY1LjSOsTzQfr/NO7avXuQiamuX8ZRIaFctcFI/jT98bx9Y5D3PXWOmpqOt9CWqbjdeWBf7X82qvKtI/p41J48A4IPXEAAB3OSURBVINNPPbZNnIOl7Ih7whjUuN4eNbYVt9zcloCk9O+reY6bXgiVz2/jMc+3cbd00e2OebcRgb/+VNtddUzX2Rz8Gg5yT197zTQnP3F5ewtLudHDX4ZXD5pAAeOVPCHj7fQt2ck984Y1cgdTHdRO/BvbBcc+FfLShxBILlnFCcP68OC9fspLKnir9+fwPybT2PCwPbr6nfK8ETmTB3Ic19msy63qM33qx3819GNg5dPGkB1jfLOyvZt/P+2l8zx/+c/O3Mo158ymOe/2slz/8lu1881wWd1ThEjU+LoEdH1Bv7VssQRJO6bMZrfXTaWz+88k1kTU5vs4tpad08fSVLPSP777XVtnpYkr7CMEHGP+O5Iw5JimTo4gTcz97Rro/+qnEIiwkIY3f/4vyJFhN/MGMX0sf14ZMFmPvNjl2DTubmqa1ibW+T1D4yuxBJHkBjRrydXTRvk1+kL4qLCeWTWWLbsP8rTi7PYfaiEf63O4/73NnD7vNUtav/ILSqjb1xUm3tLtcb3pwxkZ0EJy3e2X2P/qpwixqbGExHm/XlCQ4S/XDmBUSlx/PLttY2O9jdd29YDRymtrGZSWtdt3wBLHKaB80b15ZLx/Xnss22c+afF3P7GGuau2MO/1uz1aZr3Wr52xfWH6WNT6BkZxhsr9jR/sg8qXTWszytmYjNVg1HhoTwxZwJlVdXc+ebaFjWW19Qov3p7HRf+9T9sO3C0rSGbAFmzx13N2xXX4PBkicMc58FLR/OT04fwyGVjWHDb6ax/4Dv0jg7n4xZ0c/Vl8J+/9IgI5dIJ/VmwYR/FZW0f4b1p3xEqXTU+/RU5PLkn918ymq92FPD8V763d/z5k6
28kbmHnMOlzHry6yYntzSd1878EqLCQxiY0DUH/tWyxGGO0zsmgnsuHsXV09IY1T+OyLBQzh/Vl39vPujTeIXqGmV/cXnAShwAs6cMoryqptHpWlpi1e6WTR8xe8pALhzdjz8t3Mr63OJmz39t2W6eWpzFVdMGseiusxiZEsetc1fz0AebqKrumCnwTfvwXCq5K7PEYXxy4Zh+HK1wsSTrULPnHjhSjqtGGdC74wb/NTQmNY5RKXG8sSKnzfeqHfjna0O/iPDod8eSGBvJbfNWk3Oo8VWPF205yG/+tYGzRyTx4KWj6RsXxdyfnMT1pwzmH1/tZNrvPueix7/k2n8s4/Z5q1mW3fz/vwmc3MIyUgP4fd9RLHEYn5wyLJHYyDAW+jA5Yp6zPnigqqrA/cv7+1MGsiHvCBvymv+rvymrc4rqBv75qld0BI/PnsjBI+Wc/9gX/O3z7fVKa6WVLt5bk8fNr69iVP84/veqSYQ5HQkiwkJ44NLRPHPNJL4zqi+pvXpwrMLF4m353DK3ZZ0UTMfKKyrrsvNTebIBgMYnUeGhnH1iMp9sOsAjlymhTXQHbmzlv442a0IqjyzYzLwVOTyc2rrBkgeOlJNXVOZ1WvvmTB2SwOd3nsVDH2zifz7dxrur87julMF8k32IRVsPUl5Vw9DEGF64bgoxkcf/KF44JoULx6TUbS/JKuCq55bxVuYerj15cKuex/hPaaWLwyWVAf++7wiWOIzPLhzdj/fX7mXFrsN1S9J6U1fiCPAPUHx0OBePTeHVb3JYtCWfjMG9yRicwPkj+/pc7dTUwD9f9IuP4smrJ3HF1oPcP38j98/fSGJsJFdMHsjF41KYMjihySTs6eShfZic1ptnvshm9tRBAenqbBq31/m+txKHMR7OGpFERFgIH2/Y32TiyC0so09MRKcYOfvQrDGMTY1n5e5ClmQd4r01e3l60Q6++O+zffrFuyqniIjQEEb1j2tTHGeNSGbh7X3YdaiE9OSePicLTyLCLWcP54cvreDd1XlcmTGwTTGZ9rWnk5S0O4L9yWJ8FhMZxhnpSSzcuL/RUdnFZVV8vGEf49txOpS2iI0M44bThvDk1ZNY/v/O5W9zJrK3uNynBZ/Kq6pZln2IManunmVtFRUeyon94lqVNGqdNSKJ0f3jeHpxFtU2qWKnUrdUsjWOG1PfRWP6sa+4nHWNdDN9atEOisqquPM7J3RwZM0TEaaPTSG1Vw9eWbrb6zmVrhr+veUAv3hjDRkPf8ba3GLOOTHZ67mBUFvq2FlQwofrbY31ziSvqIzwUCG5Z+sXJAsWfk0cInKhiGwVkR0i8msvxx8TkTXOa5uIFHkcq/Y4Nt9j/xARWSYi20XkDRGJ8OczmPrOHZlMWIh4HQy453ApL369i8snDvA6p1NnEBoiXH3SIJZmH2LHwfojtCtdNXz36SXc8FImn20+wPSx/XjlR1P5+VnDAxStdxeM7sfw5Fie/PcOm8q9E8krLCMlvodf5pHrbPyWOEQkFHgSuAgYBcwRkXpzTqvqHao6QVUnAH8D/ulxuKz2mKpe6rH/D8BjqpoOFAI/8tczmOP1io5wZurdd1y30D8t3EpICNx1QecrbXi6MmMgEaEhvPpN/TEez32Zzfq8Yn532Vgy7z2fP35vPKenJ3W6XwQhIcLNZw9j64GjfLrZJlTsLHILS7tF+wb4t8QxFdihqtmqWgnMA2Y2cf4cYG5TNxT3cMxzgLedXf8HzGqHWE0LzJk6iN2HSrn8qSXsPlQCuOfomb92Lz85fSgp8Z37hycxNpLpY/vxzspcSpxlX3cVlPDE59u5aEw/rpo2qNHJDDuLS8b1Z2hiDA99sMmWru0kussYDvBv4kgFPGeZy3X2HUdE0oAhwL89dkeJSKaIfCMitcmhD1CkqrU/KU3d80bn+sz8/Py2PIdpYPrYFP5xXQZ5RWXMeOIrPlq/j0c+3ERibCQ/PXNYoMPzybUnp3G0wsV7a/aiqv
zmvQ2Eh4Zw/yWjAx2aT8JCQ/jj98aRV1TGIx9uCnQ43V6lq8a9VLIljjbzVr5vrEJ2NvC2qnrWfQxS1QzgKuCvIjKsJfdU1WdVNUNVM5KSkloSt/HBuSP78sGtpzE0KYabXlvFil2F/OL8E4j1MpCtM5o0qDcjU+J49ZvdvLdmL19uL+C/LxzR4euHtEXG4ARuPGMoc5fvadHMxab97SsuQ7V7dMUF/yaOXMCzo/kAoLEZ52bToJpKVfc6/2YDi4GJQAHQS0Rqfzs1dU/jZwMTonnrZ6fw49OGcN7IvlyZMSDQIflMRLj2pDQ27TvC/3t3PeMH9uLqaWmBDqvFfnH+CYzo25NfvbOOotLKQIfTbdUtlWwljjZbAaQ7vaAicCeH+Q1PEpERQG9gqce+3iIS6bxPBE4FNql78MAi4HvOqdcB7/nxGUwzIsJCuHfGKJ6/LqNurqVgMXNCf3pGhlHhquH3l41t0/iKQIkMC+V/rhzP4ZJKfvPexkCH023VjuEY2A3GcIAfR46rqktEbgEWAqHAC6q6UUQeBDJVtTaJzAHmaf0RZSOBv4tIDe7k9qiq1lbk/gqYJyIPA6uBf/jrGUzXFhMZxm9njqbSVdPmkeGBNCY1ntvPS+fPn2xjz+FS4nqEExMRSq/ocG46cziD+nSPX2aBlFsUmKWSA8WvFdKqugBY0GDffQ22H/By3RLA66x0TtXV1PaL0nRnl08Knuq1pvzszGEcLqli64EjHCmrYn9xGXsOl7FydyH/uvlUoiOCo+0pWOUVBm6p5ECw7yZjuoCw0BDuu6TeMCm+2l7AtS8s4+5/ruev35/Q5RcXCqTuNIYDbMoRY7qs09ITufP8E3hvzV5e/cb7FCumfXSnMRxgicOYLu3nZw3nnBOTefCDTazOKURVyc4/xotf7+S+9zZwtLzta7J3d3VLJXejxGFVVcZ0YSEhwl+uHM+Mv33FT17OpEdEKHsOl9Udjwh194ozrVe7VHJqr+7TCcFKHMZ0cb2iI3jmmsnERoYxom8cD80aw39+eTazpwzkpSW7jpvssTVKK11UumraIdrg0xmWSu5oljiM6QbGpMaz+Jdn8/x1GVx7UhqD+kRz1wUj6BERym/f39To+iq+KK+q5pK/fcWMv33ZqkGI+4vLueypr9m870irYwik3MJSoHus/FfLEocx3VRibCR3nHcCX24v8Glhq8Y8uWgHWfkl7Cwo4Uf/l0lZZXXzF3l4cclOVucU8eSiHa2OIZDyutHKf7UscRjTjV17chrpybE89OGm46bJ98WOg0d55ossLp+YyuOzJ7Iqp5BbXl9FVbVv1VallS7mLsshIjSEjzbsr6v2CSZ5RWUkxkYQFR74pZI7iiUOY7qx8NAQHrh0NHsOl/H8l9kturamRvl//9xATGQY91w8kuljU3hw5hg+33KQu/+53qfqr3dW5XGk3MWfrhiHqvLykl2tfJLAyS0s61alDbDEYUy3d+rwRC4a048nF2Vx6FiFz9e9vTKX5bsOc/dFJ9In1r1c6rUnpfFf56bz9spc/rRwa5PX19QoL369k/ED4rl0fH8uGpPC3OU5dWukBIu8wrJusc64J0scxhhuP+8EyqqqeXd1nk/nFxyr4JEFm5k6OIErJg+sd+z289KZM3UgTy3O4o0VOY3cAb7Ylk92fgk3nDYEEeGG04ZwpNzFO6ty2/QsHUlVySsq61Y9qsAShzEGGNGvJxMH9WLu8pxmq5hUld++v4nSShePXDbmuKV1RYQHZ47h9PRE7nl3A1/vKPB6nxe+3knfuEimj00BYHJabyYM7MWLX+9q0VrqVdU1FJcGZiBjwbFKKlw1VlVljOme5kwZRFZ+CSt3FzZ53jur8nh/7V5uOyed9L49vZ4THhrCk1dPYmhSDD97dSXbD9QfK7LtwFG+3F7AD04eXG9iwBtOG8LOghIWbfVtYar9xeXMevJrTnn0c5+vaU91YzgscRhjuqOLx6UQExHK3OV7Gj
0nO/8Y9723gWlDEvj52cObvF9cVDgvXD+FyLBQfvjSCr7Yls/W/Uc5XFLJC1/tJDIshKumDqp3zUVj+pESH8U/vtrZbLwb8oqZ+eRX7CoooX+vHvz4/zJ5bVn7zsmlqqzPLaa6kRLQnsPuMRxWVWWM6ZZiIsO4dEIqH67fyxEvc1hVuKq5de5qIsJC+OvsCT4tfDWgdzT/uC6DQ8cque6F5Vzw1/8w6aFPmbdiD5dPGkDvmIh654eHhnDdKYNZknWIhRv3N3rfhRv3c8UzSwkLCeGdn5/CuzefyhlO1djvP9rcoqqupryVmcsl//sV76/1vtDoyt2FRIWHMCwptl0+L1hY4jDG1JkzdSDlVTW8t+b4X5R//HgrG/ce4U/fG09KvO9/YY8f2IuvfnU2b/70ZJ68ahIPXDKK289L5/bz0r2ef+1JaYwf2ItbXl/Fv7fUH5hYXaP89bNt/OzVlZzQryfv3nwKJ/aLIzYyjOd+kMHV0wbx9y+yuW3e6jZPgbLj4FHun+9eVXFxI9VgS7IKmDI4gYiw7vWr1CY5NMbUGZsaz8iUOOYtz+Hak75dg/29NXn846udXHdyGueP6tvi+/aJjazrstucmMgwXr5hKtc8v4yfvbKK567L4MwTkthbVMbtb6xh+c7DXDYxld9fPrbeoLuw0BAenjWGgQnRPPrRFo5VuHjmmsmtGphXXlXNrXPX0CMilMlpvflqxyFUtd6aJvlHK9h24BiXTewai4G1RPdKk8aYJokIc6YOZOPeI2zIKyb/aAU3v76K/5q3hvEDe3H39JEdEkd8j3Be+dFUhifHcuPLmTzx+XYuevxLNuYV85crx/PY9yd4TQgiws/OHMbvLhvLF9vyue6F5a2aOv7Rj7awed8R/ueK8cyc0J+CYxVs2V+/gX9p9iEAThnWp3UPGcT8WuIQkQuBx3GvOf68qj7a4PhjwNnOZjSQrKq9RGQC8DQQB1QDj6jqG841LwFnAsXOdder6hp/Pocx3cnM8ak88uFm7ntvA1n5JZRVVnPn+Sfw0zOHdWiVTK/oCF798TSueu4b/vLpNsYNiOeJ2RMZnBjT7LVXTRtETGQod765lmueX8ZdF4xgQ94RVucUsi63mMsmpfKrC0/0eu1nmw7w0pJd3HDqEM4+MZn9xeWAe0XFkSnfrk2/NKuAnlFhjA7i9epby2+JQ0RCgSeB84FcYIWIzFfVTbXnqOodHuffCkx0NkuBH6jqdhHpD6wUkYWqWuQc/6Wqvu2v2I3pzuKjw7l4bAr/XJ3H1CEJ/P7ysQFr/E2IieD1n5zE4q0HmTGuf4sS18wJqcRGhnHTa6u49h/LARiSGENizwie+SKL6WNSGDsgvt41B46U88u31zIqJY5fXTQCgH7xUQxPjuXLHQX85IyhdecuyTrEtCF9COsm64x78meJYyqwQ1WzAURkHjAT2NTI+XOA+wFUdVvtTlXdKyIHgSSgqJFrjTHt6N4Zo5gxPoWzTkg+boBfR0uIieDySa1rRzh3ZF8W3HY6ewpLmTCgF71jIjhSXsU5f/6Ce9/bwLs3nVL3fNU1yh1vrKG8qoa/XTWRyLBvq8JOG57I3OU5lFdVExUeSm5hKbsPlXLdyYPb4xGDjj9TZSrg2SE819l3HBFJA4YA//ZybCoQAWR57H5ERNaJyGMi4rXFTURuFJFMEcnMz89v7TMY0y0lxERwzol9A5402sPw5FjOHpFc1/U3Liqcey4+kbV7ipi34ttfUX//TxZLsg7xwKWjjithnZ6eSIWrpm5w5JIsd/vGqcMTO+gpOhd/Jg5v33GNda6eDbytqvXmdRaRFOAV4IeqWtu37m7gRGAKkAD8ytsNVfVZVc1Q1YykpKTWxG+M6aJmTUhl2pAE/vDxFg4dq2DNniL+8sk2Lh6bwpUZA487f9rQPoSFCF9ud0+fsjTrEH1iIjihb/cav1HLn4kjF/D8CgwAvI+icSeOuZ47RCQO+BC4V1W/qd2vqvvUrQ
J4EXeVmDHG+ExEeHjWGEoqXDzw/iZum7uavnFR/O7ysfW63NaKjQxj0qDefLUjH1VlSVYBJw/r4/Xc7sCfiWMFkC4iQ0QkAndymN/wJBEZAfQGlnrsiwDeBV5W1bcanJ/i/CvALGCD357AGNNlpfftyY9OG8L7a/eSW1jK47MnEN8jvNHzT0tPZOPeI6zcXciBIxWcMqx7VlOBHxOHqrqAW4CFwGbgTVXdKCIPisilHqfOAeZp/Sk5rwTOAK4XkTXOa4Jz7DURWQ+sBxKBh/31DMaYru22c9OZOKgX/2/6SDIGJzR57mnpiahSt85Idxy/UUvaskh9sMjIyNDMzMxAh2GMCWKu6homPvQpR8td9I+P4utfn9Plq6pEZKWqZjTc3/06IBtjTCuEhYbUlTJOHpbY5ZNGUyxxGGOMj05Ld/fQ7M7VVGCTHBpjjM8uHdefnfklXDCmX6BDCShLHMYY46P46HDuu2RUoMMIOKuqMsYY0yKWOIwxxrSIJQ5jjDEtYonDGGNMi1jiMMYY0yKWOIwxxrSIJQ5jjDEtYonDGGNMi3SLSQ5FJB/Y3WB3PFDczD7P7ebeJwIFrQjPWxy+ntMez+C5L9ifobXxNxWfL+c0FW9T2+35fdRUfM0d7yw/C03F2Nxx+1nwz89CmqoevxKeqnbLF/Bsc/s8t5t7D2S2Vxy+ntMez9BgX1A/Q2vjb+9n8HW7Pb+PfHkGf34NusIz2M+C78/Qnauq3vdh3/stfN9ecfh6Tns8Q1vj9/Ue3ekZfN1uz+8jX+7hz6+BL5/vi0A+Q2f7PvK2r1M8Q7eoquoIIpKpXuatDybB/gzBHj/YM3QWwf4M/o6/O5c42tuzgQ6gHQT7MwR7/GDP0FkE+zP4NX4rcRhjjGkRK3EYY4xpEUscxhhjWsQShxci8oKIHBSRDa24drKIrBeRHSLyhHgsTCwit4rIVhHZKCJ/bN+o68XQ7vGLyAMikicia5zX9PaPvF4cfvkaOMfvEhEVkcT2i9hrHP74OjwkIuucr8EnItK//SOvF4c/nuFPIrLFeY53RaRX+0deF4M/4r/C+RmuERG/NUC3JfZG7nediGx3Xtd57G/y58Wr1vb17cov4AxgErChFdcuB04GBPgIuMjZfzbwGRDpbCcHWfwPAHcF89fAOTYQWIh7QGhisD0DEOdxzm3AM0H4DN8Bwpz3fwD+EGTxjwRGAIuBjM4WuxPX4Ab7EoBs59/ezvveTT1nUy8rcXihqv8BDnvuE5FhIvKxiKwUkS9F5MSG14lICu4f7KXq/oq8DMxyDt8EPKqqFc5nHAyy+DuUH5/hMeC/Ab/3CvHHM6jqEY9TY/Dzc/jpGT5RVZdz6jfAgCCLf7OqbvVXzG2NvREXAJ+q6mFVLQQ+BS5s7c+8JQ7fPQvcqqqTgbuAp7yckwrkemznOvsATgBOF5FlIvKFiEzxa7THa2v8ALc41QsviEhv/4XaqDY9g4hcCuSp6lp/B9qENn8dROQREdkDXA3c58dYG9Me30u1bsD9V25Has/4O5ovsXuTCuzx2K59nlY9Z5iPH9qtiUgscArwlkf1X6S3U73sq/2LMAx3EfEkYArwpogMdbK8X7VT/E8DDznbDwH/g/uHvkO09RlEJBq4B3c1SUC009cBVb0HuEdE7gZuAe5v51Ab1V7P4NzrHsAFvNaeMTalPePvaE3FLiI/BP7L2TccWCAilcBOVb2Mxp+nVc9picM3IUCRqk7w3CkiocBKZ3M+7l+unsXuAcBe530u8E8nUSwXkRrcE5Hl+zNwR5vjV9UDHtc9B3zgz4C9aOszDAOGAGudH7oBwCoRmaqq+/0ce632+D7y9DrwIR2YOGinZ3AaZ2cA53bEH08e2vtr0JG8xg6gqi8CLwKIyGLgelXd5XFKLnCWx/YA3G0hubTmOf3VsBPsL2AwHo1SwBLgCue9AOMbuW4F7lJFbUPTdGf/z4AHnfcn4C42ShDFn+Jxzh3AvGD7GjQ4Zx
d+bhz309ch3eOcW4G3g/AZLgQ2AUn+jt2f30f4uXG8tbHTeOP4Tty1Hr2d9wm+PKfXuDriCxdsL2AusA+owp2Rf4T7r9WPgbXON/19jVybAWwAsoD/5dvR+RHAq86xVcA5QRb/K8B6YB3uv8hS/BW/v56hwTm78H+vKn98Hd5x9q/DPRldahA+ww7cfzitcV5+6xnmp/gvc+5VARwAFnam2PGSOJz9Nzj/9zuAH7bk56Xhy6YcMcYY0yLWq8oYY0yLWOIwxhjTIpY4jDHGtIglDmOMMS1iicMYY0yLWOIw3ZKIHOvgz3teREa1072qxT077gYReb+52WVFpJeI/Lw9PtsYsBUATTclIsdUNbYd7xem307c51eesYvI/wHbVPWRJs4fDHygqmM6Ij7T9VmJwxiHiCSJyDsissJ5nersnyoiS0RktfPvCGf/9SLyloi8D3wiImeJyGIReVvc6028Vru2gbM/w3l/zJmocK2IfCMifZ39w5ztFSLyoI+loqV8O4ljrIh8LiKrxL2+wkznnEeBYU4p5U/Oub90PmediPy2Hf8bTTdgicOYbz0OPKaqU4DvAs87+7cAZ6jqRNyz0f7O45qTgetU9RxneyJwOzAKGAqc6uVzYoBvVHU88B/gJx6f/7jz+c3OF+TMr3Qu7pH8AOXAZao6Cff6L//jJK5fA1mqOkFVfyki3wHSganABGCyiJzR3OcZU8smOTTmW+cBozxmHo0TkZ5APPB/IpKOe+bQcI9rPlVVzzUTlqtqLoCIrME919BXDT6nkm8niVwJnO+8P5lv10J4HfhzI3H28Lj3StxrK4B7rqHfOUmgBndJpK+X67/jvFY727G4E8l/Gvk8Y+qxxGHMt0KAk1W1zHOniPwNWKSqlzntBYs9Dpc0uEeFx/tqvP+MVem3jYuNndOUMlWdICLxuBPQzcATuNfnSAImq2qViOwCorxcL8DvVfXvLfxcYwCrqjLG0ye417cAQERqp6+OB/Kc99f78fO/wV1FBjC7uZNVtRj38rF3iUg47jgPOknjbCDNOfUo0NPj0oXADc76DohIqogkt9MzmG7AEofprqJFJNfj9Qvcv4QznAbjTbinwgf4I/B7EfkaCPVjTLcDvxCR5UAKUNzcBaq6GvdMqbNxL4iUISKZuEsfW5xzDgFfO913/6Sqn+CuClsqIuuBt6mfWIxpknXHNaaTcFYpLFNVFZHZwBxVndncdcZ0NGvjMKbzmAz8r9MTqogOXJrXmJawEocxxpgWsTYOY4wxLWKJwxhjTItY4jDGGNMiljiMMca0iCUOY4wxLfL/AQJX6Z2DPADmAAAAAElFTkSuQmCC\n",
"text/plain": [
""
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"learn_c.recorder.plot()"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [],
"source": [
"lr = 2e-2\n",
"lr *= bs/48  # scale the learning rate linearly with the batch size (reference: bs=48)\n",
"\n",
"wd = 0.01  # weight decay"
]
},
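{
"cell_type": "markdown",
"metadata": {},
"source": [
"The cell above scales the base learning rate linearly with the batch size (`lr *= bs/48`). A minimal sketch of that arithmetic, assuming a hypothetical helper `scale_lr` and example batch sizes that are not taken from this notebook:\n",
"\n",
"```python\n",
"def scale_lr(base_lr, bs, ref_bs=48):\n",
"    # Linear scaling rule: the learning rate grows in proportion to the batch size.\n",
"    return base_lr * bs / ref_bs\n",
"\n",
"for bs_example in (18, 48, 96):  # illustrative batch sizes only\n",
"    print(bs_example, scale_lr(2e-2, bs_example))\n",
"```\n",
"\n",
"With `bs` equal to the reference value of 48 the learning rate stays at `2e-2`; smaller batches get a proportionally smaller rate."
]
},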
{
"cell_type": "code",
"execution_count": 44,
"metadata": {
"scrolled": false
},
"outputs": [
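{
"data": {
"text/html": [
"<table>\n",
"  <thead>\n",
"    <tr>\n",
"      <th>epoch</th>\n",
"      <th>train_loss</th>\n",
"      <th>valid_loss</th>\n",
"      <th>accuracy</th>\n",
"      <th>f1</th>\n",
"      <th>time</th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
"      <td>0</td>\n",
"      <td>0.470085</td>\n",
"      <td>0.305418</td>\n",
"      <td>0.915873</td>\n",
"      <td>0.949746</td>\n",
"      <td>03:55</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <td>1</td>\n",
"      <td>0.442705</td>\n",
"      <td>0.284821</td>\n",
"      <td>0.891669</td>\n",
"      <td>0.932779</td>\n",
"      <td>04:30</td>\n",
"    </tr>\n",
"  </tbody>\n",
"</table>"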
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"learn_c.fit_one_cycle(2, lr, wd=wd, moms=(0.8,0.7))"
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {},
"outputs": [],
"source": [
"learn_c.save(f'{lang}clas1_sp15_multifit_bwd_v2')"
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {},
"outputs": [
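{
"data": {
"text/html": [
"<table>\n",
"  <thead>\n",
"    <tr>\n",
"      <th>epoch</th>\n",
"      <th>train_loss</th>\n",
"      <th>valid_loss</th>\n",
"      <th>accuracy</th>\n",
"      <th>f1</th>\n",
"      <th>time</th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
"      <td>0</td>\n",
"      <td>0.367868</td>\n",
"      <td>0.229045</td>\n",
"      <td>0.939806</td>\n",
"      <td>0.964251</td>\n",
"      <td>04:59</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <td>1</td>\n",
"      <td>0.342452</td>\n",
"      <td>0.207793</td>\n",
"      <td>0.943554</td>\n",
"      <td>0.966407</td>\n",
"      <td>04:56</td>\n",
"    </tr>\n",
"  </tbody>\n",
"</table>"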
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"learn_c.load(f'{lang}clas1_sp15_multifit_bwd_v2');\n",
"learn_c.freeze_to(-2)  # train only the last two layer groups\n",
"learn_c.fit_one_cycle(2, slice(lr/(2.6**4),lr), wd=wd, moms=(0.8,0.7))"
]
},
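{
"cell_type": "markdown",
"metadata": {},
"source": [
"The gradual-unfreezing steps in this section pass `slice(lr/(2.6**4), lr)` to `fit_one_cycle`, which gives the earliest layer group the smallest learning rate and the last group the largest. To our understanding of fastai v1, the values in between are spaced geometrically. A minimal sketch of that spacing, assuming five layer groups and a hypothetical helper `log_spaced_lrs` (neither is taken from this notebook):\n",
"\n",
"```python\n",
"def log_spaced_lrs(start, stop, n_groups):\n",
"    # Geometric (log-spaced) interpolation from start to stop in n_groups steps.\n",
"    step = (stop / start) ** (1 / (n_groups - 1))\n",
"    return [start * step ** i for i in range(n_groups)]\n",
"\n",
"lr_example = 2e-2  # illustrative value, before any batch-size scaling\n",
"for group_lr in log_spaced_lrs(lr_example / 2.6 ** 4, lr_example, 5):\n",
"    print(f'{group_lr:.6f}')\n",
"```\n",
"\n",
"With five groups each learning rate is 2.6 times the previous one, which is consistent with the `2.6**4` divisor; with a different number of layer groups the ratio between neighbouring groups changes accordingly."
]
},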
{
"cell_type": "code",
"execution_count": 47,
"metadata": {},
"outputs": [],
"source": [
"learn_c.save(f'{lang}clas2_sp15_multifit_bwd_v2')"
]
},
{
"cell_type": "code",
"execution_count": 48,
"metadata": {
"scrolled": true
},
"outputs": [
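{
"data": {
"text/html": [
"<table>\n",
"  <thead>\n",
"    <tr>\n",
"      <th>epoch</th>\n",
"      <th>train_loss</th>\n",
"      <th>valid_loss</th>\n",
"      <th>accuracy</th>\n",
"      <th>f1</th>\n",
"      <th>time</th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
"      <td>0</td>\n",
"      <td>0.296243</td>\n",
"      <td>0.212597</td>\n",
"      <td>0.949831</td>\n",
"      <td>0.970389</td>\n",
"      <td>06:06</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <td>1</td>\n",
"      <td>0.233712</td>\n",
"      <td>0.185665</td>\n",
"      <td>0.950056</td>\n",
"      <td>0.970505</td>\n",
"      <td>06:55</td>\n",
"    </tr>\n",
"  </tbody>\n",
"</table>"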
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"learn_c.load(f'{lang}clas2_sp15_multifit_bwd_v2');\n",
"learn_c.freeze_to(-3)  # train only the last three layer groups\n",
"learn_c.fit_one_cycle(2, slice(lr/2/(2.6**4),lr/2), wd=wd, moms=(0.8,0.7))"
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {},
"outputs": [],
"source": [
"learn_c.save(f'{lang}clas3_sp15_multifit_bwd_v2')"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
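{
"data": {
"text/html": [
"<table>\n",
"  <thead>\n",
"    <tr>\n",
"      <th>epoch</th>\n",
"      <th>train_loss</th>\n",
"      <th>valid_loss</th>\n",
"      <th>accuracy</th>\n",
"      <th>f1</th>\n",
"      <th>time</th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
"      <td>0</td>\n",
"      <td>0.261080</td>\n",
"      <td>0.183791</td>\n",
"      <td>0.946218</td>\n",
"      <td>0.968110</td>\n",
"      <td>09:09</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <td>1</td>\n",
"      <td>0.234161</td>\n",
"      <td>0.177020</td>\n",
"      <td>0.948295</td>\n",
"      <td>0.969387</td>\n",
"      <td>10:26</td>\n",
"    </tr>\n",
"  </tbody>\n",
"</table>"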
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"learn_c.load(f'{lang}clas3_sp15_multifit_bwd_v2');\n",
"learn_c.unfreeze()  # train all layer groups\n",
"learn_c.fit_one_cycle(2, slice(lr/10/(2.6**4),lr/10), wd=wd, moms=(0.8,0.7))"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [],
"source": [
"learn_c.save(f'{lang}clas4_sp15_multifit_bwd_v2')"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
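{
"data": {
"text/html": [
"<table>\n",
"  <thead>\n",
"    <tr>\n",
"      <th>epoch</th>\n",
"      <th>train_loss</th>\n",
"      <th>valid_loss</th>\n",
"      <th>accuracy</th>\n",
"      <th>f1</th>\n",
"      <th>time</th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
"      <td>0</td>\n",
"      <td>0.178675</td>\n",
"      <td>0.173477</td>\n",
"      <td>0.943825</td>\n",
"      <td>0.966685</td>\n",
"      <td>09:15</td>\n",
"    </tr>\n",
"  </tbody>\n",
"</table>"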
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"learn_c.load(f'{lang}clas4_sp15_multifit_bwd_v2');\n",
"learn_c.fit_one_cycle(1, slice(lr/10/(2.6**4),lr/10), wd=wd, moms=(0.8,0.7))"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [],
"source": [
"learn_c.save(f'{lang}clas5_sp15_multifit_bwd_v2')"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
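{
"data": {
"text/html": [
"<table>\n",
"  <thead>\n",
"    <tr>\n",
"      <th>epoch</th>\n",
"      <th>train_loss</th>\n",
"      <th>valid_loss</th>\n",
"      <th>accuracy</th>\n",
"      <th>f1</th>\n",
"      <th>time</th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
"      <td>0</td>\n",
"      <td>0.223472</td>\n",
"      <td>0.169082</td>\n",
"      <td>0.944096</td>\n",
"      <td>0.966770</td>\n",
"      <td>11:20</td>\n",
"    </tr>\n",
"  </tbody>\n",
"</table>"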
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"learn_c.load(f'{lang}clas5_sp15_multifit_bwd_v2');\n",
"learn_c.fit_one_cycle(1, slice(lr/10/(2.6**4),lr/10), wd=wd, moms=(0.8,0.7))"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [],
"source": [
"learn_c.save(f'{lang}clas6_sp15_multifit_bwd_v2')"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [
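{
"data": {
"text/html": [
"<table>\n",
"  <thead>\n",
"    <tr>\n",
"      <th>epoch</th>\n",
"      <th>train_loss</th>\n",
"      <th>valid_loss</th>\n",
"      <th>accuracy</th>\n",
"      <th>f1</th>\n",
"      <th>time</th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
"      <td>0</td>\n",
"      <td>0.209474</td>\n",
"      <td>0.168092</td>\n",
"      <td>0.947302</td>\n",
"      <td>0.968784</td>\n",
"      <td>10:37</td>\n",
"    </tr>\n",
"  </tbody>\n",
"</table>"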
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"learn_c.load(f'{lang}clas6_sp15_multifit_bwd_v2');\n",
"learn_c.fit_one_cycle(1, slice(lr/100/(2.6**4),lr/100), wd=wd, moms=(0.8,0.7))"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [],
"source": [
"learn_c.save(f'{lang}clas7_sp15_multifit_bwd_v2')"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
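{
"data": {
"text/html": [
"<table>\n",
"  <thead>\n",
"    <tr>\n",
"      <th>epoch</th>\n",
"      <th>train_loss</th>\n",
"      <th>valid_loss</th>\n",
"      <th>accuracy</th>\n",
"      <th>f1</th>\n",
"      <th>time</th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
"      <td>0</td>\n",
"      <td>0.174563</td>\n",
"      <td>0.167442</td>\n",
"      <td>0.947934</td>\n",
"      <td>0.969169</td>\n",
"      <td>10:26</td>\n",
"    </tr>\n",
"  </tbody>\n",
"</table>"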
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"learn_c.load(f'{lang}clas7_sp15_multifit_bwd_v2');\n",
"learn_c.fit_one_cycle(1, slice(lr/100/(2.6**4),lr/100), wd=wd, moms=(0.8,0.7))"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [],
"source": [
"learn_c.save(f'{lang}clas8_sp15_multifit_bwd_v2')"
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {},
"outputs": [],
"source": [
"learn_c.load(f'{lang}clas8_sp15_multifit_bwd_v2')\n",
"learn_c.save(f'{lang}clas_sp15_multifit_bwd_v2')"
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {},
"outputs": [],
"source": [
"learn_c.load(f'{lang}clas_sp15_multifit_bwd_v2');\n",
"learn_c.to_fp32().export(f'{lang}_classifier_sp15_multifit_bwd_v2')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Confusion matrix"
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 13.9 s, sys: 1.09 s, total: 15 s\n",
"Wall time: 14.2 s\n"
]
}
],
"source": [
"%%time\n",
"data_clas = load_data(path, f'{lang}_textlist_class_sp15_multifit_bwd_v2', bs=bs, num_workers=1, backwards=True)\n",
"\n",
"config = awd_lstm_clas_config.copy()\n",
"config['qrnn'] = True  # use QRNN layers instead of LSTM layers\n",
"config['n_hid'] = 1550  # default: 1152\n",
"config['n_layers'] = 4  # default: 3\n",
"\n",
"learn_c = text_classifier_learner(data_clas, AWD_LSTM, config=config, drop_mult=0.5, pretrained=False,\n",
" metrics=[accuracy,f1])\n",
"# learn_c.load_encoder(f'{lang}fine_tuned_enc_sp15_multifit_bwd_v2');\n",
"\n",
"learn_c.load(f'{lang}clas_sp15_multifit_bwd_v2');\n",
"\n",
"# keep the class weights on the CPU\n",
"loss_weights = torch.FloatTensor(trn_weights).cpu()\n",
"learn_c.loss_func = partial(F.cross_entropy, weight=loss_weights)"
]
},
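{
"cell_type": "markdown",
"metadata": {},
"source": [
"The cell above passes per-class weights to `F.cross_entropy`, so each class contributes to the loss according to its weight. A minimal pure-Python sketch of the weighted mean that PyTorch computes in this case, using a hypothetical function `weighted_cross_entropy` and made-up logits, targets and weights (none of them come from this notebook):\n",
"\n",
"```python\n",
"import math\n",
"\n",
"def weighted_cross_entropy(logits, targets, weights):\n",
"    # Each sample's negative log-likelihood is scaled by the weight of its\n",
"    # target class; the sum is divided by the total weight of the targets.\n",
"    total, norm = 0.0, 0.0\n",
"    for row, y in zip(logits, targets):\n",
"        m = max(row)\n",
"        log_z = m + math.log(sum(math.exp(v - m) for v in row))  # stable log-sum-exp\n",
"        total += weights[y] * (log_z - row[y])\n",
"        norm += weights[y]\n",
"    return total / norm\n",
"\n",
"logits = [[2.0, 0.0], [0.0, 1.0], [1.5, 0.5]]  # made-up scores for two classes\n",
"targets = [0, 1, 0]\n",
"print(weighted_cross_entropy(logits, targets, weights=[0.8, 0.2]))\n",
"```\n",
"\n",
"Dividing by the summed weights rather than by the number of samples keeps the loss on a comparable scale whatever the weights are."
]
},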
{
"cell_type": "code",
"execution_count": 48,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAbYAAAEGCAYAAAAJw7AFAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAYgElEQVR4nO3dd5gdZfn/8fedAkmEUEIJBhHBhBZqICpV4SsCNhRpRpCiKIoKCFYQBEG+igUb7YsRUBEbKgE1GCEUiT8IhCYoXUEQSIANhJJy//6Y2XhYNsnuJmfP8uz7dV17XTvPzJlzT7K7nzPPPPNMZCaSJJViQKsLkCRpWTLYJElFMdgkSUUx2CRJRTHYJElFMdgkSUUx2KRlKCKGRsSlEfF0RPxiKfYzISImL8vaWiUidoiIv7e6DvUf4X1s6o8i4v3A0cCGwGxgBnBKZl67lPs9APgEsG1mzlvqQvu4iEhgdGbe0+papHaesanfiYijgW8DpwJrAusAPwDevQx2/1rgH/0h1LoiIga1ugb1Pwab+pWIWAk4Cfh4Zv46M5/NzLmZeWlmHltvs3xEfDsi/l1/fTsilq/XvTkiHoqIT0fEYxHxSEQcXK/7MvAlYN+IeCYiDo2IEyPixw3vv25EZPsf/Ig4KCLui4jZEXF/RExoaL+24XXbRsQNdRfnDRGxbcO6qyLi5Ii4rt7P5IhYbRHH317/Zxrq3zMi9oiIf0TErIj4QsP24yPi+oh4qt72exGxXL3u6nqzW+rj3bdh/5+NiEeBie1t9WvWr99jq3r51RHxRES8ean+Y6UGBpv6mzcBQ4BLFrPNF4E3AlsAmwPjgeMa1o8EVgJGAYcC34+IVTLzBKqzwIszc4XMPG9xhUTEq4DvALtn5orAtlRdoh23WxW4rN52BPBN4LKIGNGw2fuBg4E1gOWAYxbz1iOp/g1GUQXxucAHgHHADsCXImK9etv5wFHAalT/drsAHwPIzB3rbTavj/fihv2vSnX2eljjG2fmvcBngZ9ExDBgIvCjzLxqMfVK3WKwqb8ZATyxhK7CCcBJmflYZj4OfBk4oGH93Hr93My8HHgG2KCH9SwAxkbE0Mx8JDPv6GSbtwN3Z+aFmTkvMy8C7gLe2bDNxMz8R2Y+B/ycKpQXZS7V9cS5wM+oQuuMzJxdv/8dwGYAmTk9M6fV7/sAcDawUxeO6YTMfKGu5yUy81zgbuCvwFpUHySkZcZgU38zE1htCdd+Xg082LD8YN22cB8dgnEOsEJ3C8nMZ4F9gY8Cj0TEZRGxYRfqaa9pVMPyo92oZ2Zmzq+/bw+e/zSsf6799RExJiImRcSjEdFGdUbaaTdng8cz8/klbHMuMBb4bma+sIRtpW4x2NTfXA88D+y5mG3+TdWN1m6duq0nngWGNSyPbFyZmX/MzLdSnbncRfUHf0n1tNf0cA9r6o4zqeoanZnDgS8AsYTXLHaodUSsQDV45zzgxLqrVVpmDDb1K5n5NNV1pe/XgyaGRcTgiNg9Ir5Wb3YRcFxErF4PwvgS8ONF7XMJZgA7RsQ69cCVz7eviIg1I+Jd9bW2F6i6NOd3so/LgTER8f6IGBQR+wIbA5N6WFN3rAi0Ac/UZ5OHd1j/H2C9l71q8c4Apmfmh6iuHZ611FVKDQw29TuZ+U2qe9iOAx4H/gUcAfym3uQrwI3ArcBtwE11W0/e6wrg4npf03lpGA0APk11RjaL6trVxzrZx0zgHfW2M4HPAO/IzCd6UlM3HUM1MGU21dnkxR3WnwicX4+a3GdJO4uIdwO7UXW/QvX/sFX7aFBpWfAGbUlSUTxj00IR8ZqIuDIi7oyIOyLiU3X71yPiroi4NSIuiYiV6/YR9fbPRMT3Ouzrqoj4e0TMqL/WaMUxST0VET+s7/W7vaFt8/q+vtuimjpteN0+oeFnfUZELIiIxY1MVRN5xqaFImItYK3MvCkiVqTqOtsTWB
v4c2bOi4j/BcjMz9bXhrakGt02NjOPaNjXVcAxmXljbx+HtCxExI5U1z0vyMyxddsNVD/XUyPiEOB1mXl8h9dtCvw2M7t77VHLiGdsWqi+j+qm+vvZwJ3AqMyc3DC8fRpV0FHP2nEt1ShDqSiZeTXVtc9GGwDtM65cAezVyUv3pxqApBZpWrBFNXXQnRFxbt2tNTmqmc/Xj4g/RMT0iLim/b6dun1aVNMFnRQRzzSrNi1ZRKxLdTb21w6rDgF+38XdTKy7ZY6PiCUNEZdeCW4H3lV/vzfwmk622ReDraWafcY2Gvh+Zm4CPEX16eYc4BOZOY5qxNUP6m3PoJr9YBt6fs+QloH6PqNfAUdmZltD+xeBecBPurCbCZm5KdUUTTvw0pk7pFeqQ4CPR8R0qlshXmxcGRFvAOZk5u2dvVi9o2nX2OpP/Fdk5uh6+bPAYKrpcxqfzbR8Zm4UETOBNevrOMOBf2fmy2ZPiIjDqOefGzbsVePWH93TmYzUmczkgfvuYYXhw1l99TUXtj85ayYzZz7BeuuPZsCAl34eenLWTObMmcOotTv78Lrk9eqZwQM9CW62F154gXvvvYeNN97kZeuef/55HnjgfjbccKOFbQ899C8GDRrEyJFr9WaZ/dZNN01/IjNX79je7EdKNE6VM5/qESFPZWaPRwtl5jlUZ31stsW4nPTnvyxdhVooMzn6Y4cy/k3bccKppy9sv2rKZE4+7jNcd/NdjFjtZT9D/OKnF3DrjJs4+WvfBmDevHm0Pf0Uq45Yjblz5/KJDx/I9jvtzAcO/nCvHUt/sMbw5VtdQvEefOAB3rvnO7jur9UYqMcee4w11liDBQsW8OFDDuKTRx7NBw8+BIAFCxYwer11+NOfr+Z16zlupDcMHRwdp5oDmh9sHbUB90fE3pn5i/q6y2aZeQvVoIS9qG4A3a+X6xJw41//wq9//lM23Hgsu+80HoBjjzuJEz9/NC++8AIf2OvtAGy59XhO/UY1un+7LcYwe/Zs5s59kcmXX8qFv5zE2q9ZhwP2fifz5s5l/vz5bL/Tzux/4CEtOy6pJw78wP5cM/UqnnjiCdZfd22O/9KXeeaZZzj7rO8D8O4938uBBx28cPtrr7maUaPWNtT6gGZ3RU5qGCZ7DNXEqudTzT+3FlXX5M8y86SIGE01bVFQTbNzWGaO6mTXC3nGpv7MMzb1d0MHx/TM3Lpje9PO2OpHXIxtWD69YfVunbzkYeCNmZkRsR/VlEaSJHVLX3ps+zjge3X35FNUo48kSeqWPhNsmXkN1dOKJUnqMWcekSQVxWCTJBXFYJMkFcVgkyQVxWCTJBXFYJMkFcVgkyQVxWCTJBXFYJMkFcVgkyQVxWCTJBXFYJMkFcVgkyQVxWCTJBXFYJMkFcVgkyQVxWCTJBXFYJMkFcVgkyQVxWCTJBXFYJMkFcVgkyQVxWCTJBXFYJMkFcVgkyQVxWCTJBXFYJMkFcVgkyQVxWCTJBXFYJMkFcVgkyQVxWCTJBXFYJMkFcVgkyQVxWCTJBXFYJMkFcVgkyQVxWCTJBXFYJMkFcVgkyQVxWCTJBXFYJMkFcVgkyQVxWCTJBXFYJMkFcVgkyQVxWCTJBXFYJMkFcVgkyQVxWCTJBXFYJMkFcVgkyQVxWCTJBXFYJMkFcVgkyQVxWCTJBXFYJMkFcVgkyQVxWCTJBXFYJMkFcVgkyQVxWCTJBXFYJMkFcVgkyQVxWCTJBXFYJMkFcVgkyQVxWCTJBXFYJMkFcVgkyQVxWCTJBXFYJMkFcVgkyQVxWCTJBXFYJMkFcVgkyQVZdCiVkTEpUAuan1mvqspFUmStBQWGWzA6b1WhSRJy8gigy0zp/ZmIZIkLQuLO2MDICJGA18FNgaGtLdn5npNrEuSpB7pyuCRicCZwDzgLcAFwIXNLEqSpJ7qSrANzcwpQGTmg5l5IrBzc8uSJKlnltgVCTwfEQOAuyPiCOBhYI3mliVJUs905YztSGAY8ElgHH
AA8MFmFiVJUk8t8YwtM2+ov30GOLi55UiStHS6MirySjq5UTszvc4mSepzunKN7ZiG74cAe1GNkJQkqc/pSlfk9A5N10WEN29LkvqkrnRFrtqwOIBqAMnIplXUDQMHBMOHdOWkUyrPKtsc0eoSpD6pK6kwneoaW1B1Qd4PHNrMoiRJ6qmuBNtGmfl8Y0NELN+keiRJWipduY/tL520Xb+sC5EkaVlY3PPYRgKjgKERsSVVVyTAcKobtiVJ6nMW1xX5NuAgYG3gG/w32NqALzS3LEmSemZxz2M7Hzg/IvbKzF/1Yk2SJPVYV66xjYuIldsXImKViPhKE2uSJKnHuhJsu2fmU+0LmfkksEfzSpIkqee6EmwDG4f3R8RQwOH+kqQ+qSv3sf0YmBIRE+vlg4Hzm1eSJEk915W5Ir8WEbcC/0M1MvIPwGubXZgkST3Rla5IgEeBBVQz++8C3Nm0iiRJWgqLu0F7DLAfsD8wE7gYiMx8Sy/VJklSty2uK/Iu4BrgnZl5D0BEHNUrVUmS1EOL64rci6oL8sqIODciduG/s49IktQnLTLYMvOSzNwX2BC4CjgKWDMizoyIXXupPkmSumWJg0cy89nM/ElmvoNq3sgZwOeaXpkkST3Q1VGRAGTmrMw8OzN3blZBkiQtjW4FmyRJfZ3BJkkqisEmSSqKwSZJKorBJkkqisEmSSqKwSZJKorBJkkqisEmSSqKwSZJKorBJkkqisEmSSqKwSZJKorBJkkqisEmSSqKwSZJKorBJkkqisEmSSqKwSZJKorBJkkqisEmSSqKwSZJKorBJkkqisEmSSqKwSZJKorBJkkqisEmSSqKwSZJKorBJkkqisEmSSqKwSZJKorBJkkqisEmSSqKwSZJKorBJkkqisEmSSqKwSZJKorBJkkqisEmSSqKwSZJKorBJkkqisEmSSqKwSZJKorBJkkqisEmSSqKwSZJKorBJkkqisEmSSqKwSZJKorBJkkqisEmSSqKwSZJKorBJkkqisEmSSqKwSZJKorBJkkqisEmSSqKwSZJKorBJkkqisEmSSqKwSZJKorBJkkqisEmSSqKwSZJKorBJkkqisEmSSqKwSZJKorBJkkqisEmSSqKwSZJKorBJkkqisEmSSqKwSZJKsqgVhegvuus73+H8yeeByQHHnQohx/xKU475ctcMPE8Rqy2OgDHn3gyu+62BwC333YrR3/ycGbPnk3EAP58zTSGDBnSwiOQluysEyaw+45jeXzWbLbe+1QANhsziu9+cT+WX34w8+Yv4MhTL+bGOx7kqAN3Yd89tgFg0MABbPi6kbxm58+x2iorcOH/HrJwn68bNYKTz7yM7/30Ki487WBGr7smACuvOJSnZj/HG/c7rdePsz8x2NSpv91xO+dPPI8pV1/Pcsstx/vevcfCADv8iE/xiSM//ZLt582bx0cO/SBn/d+P2HSzzZk1cyaDBw9uRelSt1x46TTOungq/3fygQvbTjlyT0455/dMvu5vvG37jTnlyD1524fP4FsXTOFbF0wBYI8dx/KJCW/hybY5PNk2Z2FYDRgQ3PvHU/jdlbcAcMDnJi7c72lHv4enn3muF4+uf7IrUp36x9/vYpvxb2DYsGEMGjSI7XbYkUm/+80it//znyazydhN2XSzzQFYdcQIBg4c2FvlSj123U33MuvpOS9py4Thr6p6G1ZaYSiPPP70y163z25b8/M/TH9Z+1vGb8D9Dz3OPx958mXr9nrrVp2+RstWU4MtItaNiLsi4vyIuDUifhkRwyJil4i4OSJui4gfRsTy9fanRcTf6m1Pb2ZtWryNNt6Ev1x3DbNmzmTOnDlc8cff8/DDDwFw7tk/YLvxW3LERz/EU09Wv7z33nM3EcFe79qdnbbdhjO++fVWli8tlWNP/yWnHrknd//+ZL561Hv40nd/+5L1Q4cM5q3bbsRvpsx42Wv3ftu4TsNru63W5z+zZnPvPx9vWt2q9MYZ2wbAOZm5Gd
AGHA38CNg3Mzel6g49PCJWBd4DbFJv+5VeqE2LsMGGG/Gpo4/lPe/cjfftuQebbLo5gwYO5JAPfZSbb/8H10ybzpojR3Lc548Fqq7Iaddfxzk/vJDf/2kql136G6ZeOaXFRyH1zGF778BnvvFrRu9+PJ85/VececKEl6x/+46bcv2M+3iy7aVneoMHDeTtO23Kr6+4+WX73Ge3rfnFH25sat2qRGY2b+cR6wJXZ+Y69fLOwPHAwMzcsW7bBfg4sA8wHbgRuAyYlJkvdrLPw4DD6sUNgL837QDUaBTwItD4cXMkMAK4A1gFWAl4oF63FrAA+E/vlSj1zJgxY5abNGnS6DFjxtwB0NbWtsVKK600IzOJCNra2rZcccUVF6bV5MmT17/ooovmTZw48cHG/UyYMGHlww8/fPXtt9/+7sb2QYMG8eijj24+fvz4v913331ze+eo+oXXZubqHRt7I9imZuZr6+VFBltmvrfuktwF2A9YOzN3blpxWqKIWCMzH4uIdYDJwJuAIZn5SL3+n8BfMnO/iFgFmAJsTxWAfwC+lZmXtah8qTvWBSYBY+vlO4HDgauo/iZ9DRhXr1sJuH/48OEPtLW1bdVhPz8D/ghM7NC+G/B5YKdlXbherjdGRa4TEW/KzOuB/YE/AR+JiNdn5j3AAcDUiFgBGJaZl0fENOCeXqhNi/eriBgBzKX68PFkRFwYEVsACQwHjgKo130TuKFed7mhpleIi4A3A6sBDwEnAB8GzqD6G/k8/+0lguqSyeTZs2e/vsN+hgFvBT7SyXvsV7+PekFvnLFdDlwNbAvcTRVkbwJOp/qhuYHqk9GqwG+BIUAAp2fm+U0rTkstIm7MzK1bXYfUCv789129cca2IDM/2qFtCrBlh7ZHgPG9UI+WnXNaXYDUQv7891G9ccY2KTPHLmFTSZKWiaYGmyRJvc2ZRyRJRTHYJElFMdgkSUUx2NQtETE7Ito6fP0rIi6JiPVaXZ/UTBHxtYgYHhGDI2JKRDwRER9odV16KYNN3fVN4FiqKbbWBo4BzqWaceGHLaxL6g27ZmYb8A6qm7nHUP0+qA8x2NRdu2Xm2Zk5OzPbMvMcYI/MvJhqvkipZO0PGdwDuCgzZ7WyGHXOYFN3LYiIfSJiQP21T8M67x1R6S6NiLuArYEpEbE61ZRb6kO8j03dUl9HO4NqWrQEplHNF/kwMC4zr21heVLT1RN+t2Xm/IgYBgzPzEdbXZf+y2CTpC6KiMFUc9vuWDdNBc7KTB9F04fYFaluiYgx9Wiw2+vlzSLiuFbXJfWSM6keX/OD+muruk19iGds6paImEo1CuzszNyybrvd+UDVH0TELZm5+ZLa1Fqesam7hmXm/+vQNq8llUi9b35ErN++UF9znt/CetSJ3nhsjcryRP2LnQAR8T6qRw5J/cGxwJURcV+9vC5wcOvKUWfsilS31J9Qz6F6cOyTwP3AhMx8sKWFSb0gIoYAnwZ2qZuuAL6VmQ7570MMNnVLRCwPvI/qk+qqQBuQmXlSK+uSekNE/JzqZ/4nddP+wCqZuXfrqlJHdkWqu34LPAXcBPy7xbVIvW2DDgNFroyIW1pWjTplsKm71s7M3VpdhNQiN0fEGzNzGkBEvAG4rsU1qQO7ItUtEXEO8N3MvK3VtUi9LSLuBDYA/lk3rQPcCSyg6pLfrFW16b8MNnVLRPwNeD3VoJEXgMBfaPUTEfHaxa13EFXfYLCpWxb1i+0vtKS+wmCTJBXFmUckSUUx2CRJRTHYpCaLiPkRMSMibo+IX9TP8Orpvt4cEZPq798VEZ9bzLYrR8THevAeJ0bEMT2tUWo1g01qvucyc4v6CQgvAh9tXBmVbv8uZubvMvO0xWyyMtDtYJNe6Qw2qXddA7w+ItaNiDsj4gdUs7i8JiJ2jYjrI+Km+sxuBYCI2C0i7oqIa4H3tu8oIg6KiO/V368ZEZdExC3117bAacD69dni1+vtjo2IGyLi1oj4csO+vhgRf4
+IP1HdpyW9YhlsUi+JiEHA7kD7ze0bABfUz7V7FjgO+J/M3Aq4ETi6nnT3XOCdwA7AyEXs/jvA1Hq6p62AO4DPAffWZ4vHRsSuwGhgPLAFMC4idoyIccB+wJZUwbnNMj50qVc5pZbUfEMjYkb9/TXAecCrgQfbp2YC3ghsDFwXEQDLAdcDGwL3Z+bdABHxY+CwTt5jZ+BAgMycDzwdEat02GbX+uvmenkFqqBbEbgkM+fU7/G7pTpaqcUMNqn5nsvMLRob6vB6trEJuCIz9++w3RbUz75bBgL4amae3eE9jlyG7yG1nF2RUt8wDdguIl4PEBHDImIMcBfwuoanNu+/iNdPAQ6vXzswIoYDs6nOxtr9ETik4drdqIhYA7gaeE9EDI2IFam6PaVXLINN6gMy83HgIOCiiLiVKug2rB9geRhwWT14ZFFTl30KeEtE3AZMBzbJzJlUXZu3R8TXM3My8FPg+nq7XwIrZuZNwMXADOBXVN2l0iuWU2pJkoriGZskqSgGmySpKAabJKkoBpskqSgGmySpKAabJKkoBpskqSgGmySpKP8fHS3STvDEcRgAAAAASUVORK5CYII=\n",
"text/plain": [
""
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"preds,y,losses = learn_c.get_preds(with_loss=True)\n",
"predictions = np.argmax(preds, axis = 1)\n",
"\n",
"interp = ClassificationInterpretation(learn_c, preds, y, losses)\n",
"interp.plot_confusion_matrix()"
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[ 2215 197]\n",
" [ 956 18777]]\n",
"accuracy global: 0.9479340708963648\n",
"accuracy on negative reviews: 91.83250414593698\n",
"accuracy on positive reviews: 95.1553235696549\n"
]
}
],
"source": [
"from sklearn.metrics import confusion_matrix\n",
"cm = confusion_matrix(np.array(y), np.array(predictions))\n",
"print(cm)\n",
"\n",
"# overall accuracy (fraction of all reviews classified correctly)\n",
"print(f'accuracy global: {(cm[0,0]+cm[1,1])/(cm[0,0]+cm[0,1]+cm[1,0]+cm[1,1])}')\n",
"\n",
"# per-class accuracy (recall) in percent: row 0 = negative, row 1 = positive\n",
"print(f'accuracy on negative reviews: {cm[0,0]/(cm[0,0]+cm[0,1])*100}') \n",
"print(f'accuracy on positive reviews: {cm[1,1]/(cm[1,0]+cm[1,1])*100}')"
]
},
{
"cell_type": "code",
"execution_count": 50,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<table>\n",
"  <thead>\n",
"    <tr>\n",
"      <th>text</th>\n",
"      <th>target</th>\n",
"      <th>prediction</th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
"      <td>. s ▁remarque ▁xxup ▁dans ▁continue aire ▁comment ▁le ▁xxmaj ] ] le agi ▁ ▁xxmaj ro est ▁ma ▁xxmaj xx 6 l ▁6 ▁xxup ▁0 ▁4 ▁xxrep b : in as [ ▁[ ▁xxup ] ] ▁2 . ▁vol ▁collection ▁xxmaj icht ur ▁sch ▁xxmaj ▁carl ▁xxmaj k tak 25 9 00 b : in as [ ▁[ ▁xxup ] ] ] ▁dvd ▁xxup ▁1 ▁/ s ▁cd 20</td>\n",
"      <td>pos</td>\n",
"      <td>pos</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <td>. g s ▁b ▁xxup ▁de ▁fan ▁tout ▁pour ▁posséder ▁à t offre ▁c ▁un ▁xxmaj ) f v ▁( ▁xxup ol contr - u ▁ ▁bonus ▁le ▁dans ▁disponibles ▁informations ▁des intégralité ▁ ▁: ▁\" \\ ▁ cle ora ' l ▁\" \\ ▁ ) f v ▁( ▁xxup nage ▁vision ▁de ▁cours ▁en ▁vaisseaux ▁les ▁et ▁personnages ▁les ▁sur ▁informations ▁: ▁\" \\ ▁ cle ora ' l</td>\n",
"      <td>pos</td>\n",
"      <td>pos</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <td>. tor ▁termina ▁xxmaj ▁en gger e en arz ▁schw ▁xxmaj ▁arnold ▁xxmaj aperçoit ▁ ▁on ▁fin ▁de ▁générique ▁au ▁xxmaj . ▁jones ▁xxmaj indiana ▁ ▁xxmaj ▁à ▁référence ▁fait ▁xxmaj ▁: ▁film ▁du ▁fin ▁la ▁à \") ▁ \\ ▁ ▁musée ▁un ▁dans ▁est ▁place sa ▁\" \\ ▁ ▁( ▁\" \\ ▁ ▁museum ▁a ▁in s ong ▁bel thing ▁ at th ▁\" \\ ▁ ). ton mou</td>\n",
"      <td>pos</td>\n",
"      <td>pos</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <td>... ▁semaines ▁quelques ▁dans ni van gio ▁bon ▁xxmaj ▁chez ▁paraît ▁qui ] ] l ▁tel ▁xxmaj ▁guillaume ▁xxmaj 8 f 45 89 v 00 b : in as [ ▁[ ▁xxup ani gli ▁fo ▁xxmaj ▁/ ber le ön sh ▁ ▁xxmaj ▁version ▁la ▁de ▁dvd ▁xxup ▁le ▁avec r ▁compare ▁la ▁de nant ▁passion ▁être ▁va ▁il ▁xxmaj . ▁vidéo ▁que ▁audio ▁tant , ographie ▁disc ce ▁min</td>\n",
"      <td>pos</td>\n",
"      <td>pos</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <td>. ▁disney ▁xxmaj ▁vieux s ▁bon ▁aux ▁restera ▁en ▁on ▁bien ▁ou , ] ] s ▁géant ▁de ▁chasseur ▁le ▁jack ▁xxmaj 18 t h 76 c 00 b : in as [ ▁[ ▁xxup mé i est s ▁mé ▁et ▁bon ▁très ▁le , ▁exemple ▁par , ra re fé ▁pré ▁leur ▁on , ▁veine ▁même ▁la ▁dans ▁xxmaj . ▁spectacle ▁ce ▁devant ▁enfants ▁des ▁mettre ▁de a ▁éviter</td>\n",
"      <td>neg</td>\n",
"      <td>pos</td>\n",
"    </tr>\n",
"  </tbody>\n",
"</table>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"learn_c.show_results()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Predictions on some random sentences"
]
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {},
"outputs": [],
"source": [
"import matplotlib.cm as cm\n",
"import warnings\n",
"warnings.filterwarnings('ignore') # valid actions: \"error\", \"ignore\", \"always\", \"default\", \"module\" or \"once\""
]
},
{
"cell_type": "code",
"execution_count": 52,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"neg tensor([0.9984, 0.0016])\n"
]
}
],
"source": [
"# Get the prediction neg/pos\n",
"review = 'Ce produit est bizarre.'\n",
"pred = learn_c.predict(review)\n",
"print(pred[0], pred[2])"
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
". re zar ▁bi ▁est ▁produit ▁ce ▁xxmaj ▁xxbos "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# The darker a token's shading in the example below, the more that token contributes to the classification.\n",
"txt_ci = TextClassificationInterpretation.from_learner(learn_c)\n",
"test_text = 'Ce produit est bizarre.'\n",
"txt_ci.show_intrinsic_attention(test_text,cmap=cm.Purples)"
]
},
{
"cell_type": "code",
"execution_count": 54,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"tensor([0.2897, 0.6939, 1.0000, 0.6681, 0.1437, 0.2126, 0.1160, 0.0454, 0.0124],\n",
" device='cuda:0')"
]
},
"execution_count": 54,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"txt_ci.intrinsic_attention(test_text)[1]"
]
},
{
"cell_type": "code",
"execution_count": 55,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<table>\n",
"  <thead>\n",
"    <tr>\n",
"      <th>Text</th>\n",
"      <th>Prediction</th>\n",
"      <th>Actual</th>\n",
"      <th>Loss</th>\n",
"      <th>Probability</th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
"      <td>▁xxbos ▁xxmaj ▁bonne ▁suite ▁de ▁la ▁saison ▁1 ▁xxmaj ▁ attention ▁ne ▁pas ▁regard er ▁avant ▁16 -17 ▁ans ▁beaucoup ▁de ▁sang ▁et ▁de ▁sexe ▁xxmaj ▁trop ▁peu ▁d ' épisode ▁par ▁saison . ▁xxmaj ▁trop ▁ excellent</td>\n",
"      <td>pos</td>\n",
"      <td>pos</td>\n",
"      <td>9.98</td>\n",
"      <td>1.00</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <td>▁xxbos ▁xxmaj ▁très ▁bonne ▁adaptation ▁du ▁personnage . ▁xxmaj ▁le ▁fait ▁de ▁place r ▁les ▁ intrigue s ▁à ▁notre ▁époque ▁rend ▁les ▁personnages ▁plus ▁ abord ables ▁et ▁ atta chant s . ▁xxmaj ▁nous ▁assis ton s ▁à ▁la ▁rencontre ▁de ▁xxmaj ▁ holm es ▁et ▁xxmaj ▁watson , ▁et ▁la ▁aussi , ▁c ' est ▁une ▁très ▁bonne ▁ idée . ▁xxmaj ▁chaque ▁histoire ▁est ▁bien ▁fi</td>\n",
"      <td>pos</td>\n",
"      <td>pos</td>\n",
"      <td>6.88</td>\n",
"      <td>1.00</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <td>▁xxbos ▁xxmaj ▁en ▁xxmaj ▁europe , ▁il ▁a ▁existé ▁3 ▁montage s ▁ diff e rent s , ▁celui ▁de ▁la ▁sortie ▁en ▁1982 ▁( international ), ▁celui ▁de ▁1992 ▁( dire c tor ' s ▁ cut ) ▁et ▁celui ▁de ▁2007 ▁( fin al ▁ cut ). ▁xxmaj ▁le ▁moins ▁inter re ssant , ▁parce ▁que ▁le ▁plus ▁ aff adi ▁( s up press ion ▁de ▁4</td>\n",
"      <td>pos</td>\n",
"      <td>pos</td>\n",
"      <td>6.61</td>\n",
"      <td>1.00</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <td>▁xxbos ▁xxmaj ▁le ▁plus ▁abouti ▁de ▁tous ▁les ▁albums ▁de ▁xxmaj ▁ kate ▁xxmaj ▁bush . ▁xxmaj ▁le ▁plus ▁magique . ▁xxmaj ▁elle ▁y ▁a ▁mis ▁toute ▁son ▁en er gie , ▁toute ▁sa ▁créa t ivité ▁et ▁sa ▁per son alité . ▁xxmaj ▁un ▁voyage ▁parmi ▁les ▁thèmes ▁du ▁savoir , ▁des ▁cultures , ▁des ▁croyance s , ▁de ▁l ' horreur , ▁de ▁la ▁peur , ▁du ▁doute</td>\n",
"      <td>pos</td>\n",
"      <td>pos</td>\n",
"      <td>5.87</td>\n",
"      <td>1.00</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <td>▁xxbos ▁xxmaj ▁je ▁trouve ▁ce ▁produit ▁fort ▁cher ▁pour ▁ce ▁que ▁c ' est . ▁xxmaj ▁comme ▁d ' habitude ▁on ▁pay e ▁la ▁pomme . ▁xxmaj ▁ ok ▁le ▁mac book ▁est ▁léger , ▁mais ▁il ▁faut ▁toujours ▁le ▁se t ▁d ' adapt ateurs ▁avec ...</td>\n",
"      <td>pos</td>\n",
"      <td>neg</td>\n",
"      <td>5.31</td>\n",
"      <td>0.00</td>\n",
"    </tr>\n",
"  </tbody>\n",
"</table>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Table of the k texts with the highest loss, along with their prediction, actual label, loss, and probability of the actual class.\n",
"# max_len is the maximum number of tokens displayed. If max_len=None, all tokens are displayed.\n",
"txt_ci.show_top_losses(5)"
]
},
{
"cell_type": "markdown",
"metadata": {
"heading_collapsed": true
},
"source": [
"## Ensemble"
]
},
{
"cell_type": "code",
"execution_count": 56,
"metadata": {
"hidden": true
},
"outputs": [],
"source": [
"bs = 18"
]
},
{
"cell_type": "code",
"execution_count": 57,
"metadata": {
"hidden": true
},
"outputs": [],
"source": [
"config = awd_lstm_clas_config.copy()\n",
"config['qrnn'] = True\n",
"config['n_hid'] = 1550 #default 1152\n",
"config['n_layers'] = 4 #default 3"
]
},
{
"cell_type": "code",
"execution_count": 58,
"metadata": {
"hidden": true
},
"outputs": [],
"source": [
"data_clas = load_data(path, f'{lang}_textlist_class_sp15_multifit_v2', bs=bs, num_workers=1)\n",
"learn_c = text_classifier_learner(data_clas, AWD_LSTM, config=config, drop_mult=0.5, metrics=[accuracy,f1]).to_fp16()\n",
"learn_c.load(f'{lang}clas_sp15_multifit_v2', purge=False);"
]
},
{
"cell_type": "code",
"execution_count": 59,
"metadata": {
"hidden": true
},
"outputs": [
{
"data": {
"text/plain": [
"(tensor(0.9553), tensor(0.9747))"
]
},
"execution_count": 59,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"preds,targs = learn_c.get_preds(ordered=True)\n",
"accuracy(preds,targs),f1(preds,targs)"
]
},
{
"cell_type": "code",
"execution_count": 60,
"metadata": {
"hidden": true
},
"outputs": [],
"source": [
"data_clas_bwd = load_data(path, f'{lang}_textlist_class_sp15_multifit_bwd_v2', bs=bs, num_workers=1, backwards=True)\n",
"learn_c_bwd = text_classifier_learner(data_clas_bwd, AWD_LSTM, config=config, drop_mult=0.5, metrics=[accuracy,f1]).to_fp16()\n",
"learn_c_bwd.load(f'{lang}clas_sp15_multifit_bwd_v2', purge=False);"
]
},
{
"cell_type": "code",
"execution_count": 61,
"metadata": {
"hidden": true
},
"outputs": [
{
"data": {
"text/plain": [
"(tensor(0.9478), tensor(0.9702))"
]
},
"execution_count": 61,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"preds_b,targs_b = learn_c_bwd.get_preds(ordered=True)\n",
"accuracy(preds_b,targs_b),f1(preds_b,targs_b)"
]
},
{
"cell_type": "code",
"execution_count": 62,
"metadata": {
"hidden": true
},
"outputs": [],
"source": [
"preds_avg = (preds+preds_b)/2"
]
},
{
"cell_type": "code",
"execution_count": 63,
"metadata": {
"hidden": true
},
"outputs": [
{
"data": {
"text/plain": [
"(tensor(0.9574), tensor(0.9758))"
]
},
"execution_count": 63,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"accuracy(preds_avg,targs_b),f1(preds_avg,targs_b)"
]
},
{
"cell_type": "code",
"execution_count": 64,
"metadata": {
"hidden": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[ 2186 226]\n",
" [ 717 19016]]\n",
"accuracy global: 0.9574170241589524\n",
"accuracy on negative reviews: 90.6301824212272\n",
"accuracy on positive reviews: 96.36649267724117\n"
]
}
],
"source": [
"from sklearn.metrics import confusion_matrix\n",
"\n",
"predictions = np.argmax(preds_avg, axis = 1)\n",
"cm = confusion_matrix(np.array(targs_b), np.array(predictions))\n",
"print(cm)\n",
"\n",
"# overall accuracy (fraction of all reviews classified correctly)\n",
"print(f'accuracy global: {(cm[0,0]+cm[1,1])/(cm[0,0]+cm[0,1]+cm[1,0]+cm[1,1])}')\n",
"\n",
"# per-class accuracy (recall) in percent: row 0 = negative, row 1 = positive\n",
"print(f'accuracy on negative reviews: {cm[0,0]/(cm[0,0]+cm[0,1])*100}') \n",
"print(f'accuracy on positive reviews: {cm[1,1]/(cm[1,0]+cm[1,1])*100}')"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}