{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Basic training functionality" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [], "source": [ "from fastai.basic_train import *\n", "from fastai.gen_doc.nbdoc import *\n", "from fastai.vision import *\n", "from fastai.distributed import *" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[`basic_train`](/basic_train.html#basic_train) wraps together the data (in a [`DataBunch`](/basic_data.html#DataBunch) object) with a PyTorch model to define a [`Learner`](/basic_train.html#Learner) object. Here the basic training loop is defined for the [`fit`](/basic_train.html#fit) method. The [`Learner`](/basic_train.html#Learner) object is the entry point of most of the [`Callback`](/callback.html#Callback) objects that will customize this training loop in different ways. Some of the most commonly used customizations are available through the [`train`](/train.html#train) module, notably:\n", "\n", " - [`Learner.lr_find`](/train.html#lr_find) will launch an LR range test that will help you select a good learning rate.\n", " - [`Learner.fit_one_cycle`](/train.html#fit_one_cycle) will launch a training using the 1cycle policy to help you train your model faster.\n", " - [`Learner.to_fp16`](/train.html#to_fp16) will convert your model to half precision and help you launch a training in mixed precision." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

class Learner[source]

\n", "\n", "> Learner(**`data`**:[`DataBunch`](/basic_data.html#DataBunch), **`model`**:[`Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module), **`opt_func`**:`Callable`=***`'Adam'`***, **`loss_func`**:`Callable`=***`None`***, **`metrics`**:`Collection`\\[`Callable`\\]=***`None`***, **`true_wd`**:`bool`=***`True`***, **`bn_wd`**:`bool`=***`True`***, **`wd`**:`Floats`=***`0.01`***, **`train_bn`**:`bool`=***`True`***, **`path`**:`str`=***`None`***, **`model_dir`**:`str`=***`'models'`***, **`callback_fns`**:`Collection`\\[`Callable`\\]=***`None`***, **`callbacks`**:`Collection`\\[[`Callback`](/callback.html#Callback)\\]=***``***, **`layer_groups`**:`ModuleList`=***`None`***)\n", "\n", "Trainer for `model` using `data` to minimize `loss_func` with optimizer `opt_func`. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Learner, title_level=2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The main purpose of [`Learner`](/basic_train.html#Learner) is to train `model` using [`Learner.fit`](/basic_train.html#Learner.fit). After every epoch, all *metrics* will be printed and also made available to callbacks.\n", "\n", "The default weight decay will be `wd`, which will be handled using the method from [Fixing Weight Decay Regularization in Adam](https://arxiv.org/abs/1711.05101) if `true_wd` is set (otherwise it's L2 regularization). If `bn_wd` is `False`, then weight decay will be removed from batchnorm layers, as recommended in [Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour](https://arxiv.org/abs/1706.02677). 
If `train_bn` is `True`, the learnable parameters of batchnorm layers are trained even in frozen layer groups.\n", "\n", "To use [discriminative layer training](#Discriminative-layer-training), pass a list of [`nn.Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module) as `layer_groups`; each [`nn.Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module) will be used to customize the optimization of the corresponding layer group.\n", "\n", "If `path` is provided, all the model files created will be saved in `path`/`model_dir`; if not, then they will be saved in `data.path`/`model_dir`.\n", "\n", "You can pass a list of [`callback`](/callback.html#callback)s that you have already created, or (more commonly) simply pass a list of callback functions to `callback_fns`; each function will be called (passing `self`) on object initialization, with the results stored as callback objects. For a walk-through, see the [training overview](/training.html) page. You may also want to use an [application](applications.html)-specific model. For example, if you are dealing with a vision dataset (here, a sample of MNIST), you might want to use the [`create_cnn`](/vision.learner.html#create_cnn) function:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": false }, "outputs": [], "source": [ "path = untar_data(URLs.MNIST_SAMPLE)\n", "data = ImageDataBunch.from_folder(path)\n", "learn = create_cnn(data, models.resnet18, metrics=accuracy)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Model fitting methods" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

lr_find[source]

\n", "\n", "> lr_find(**`learn`**:[`Learner`](/basic_train.html#Learner), **`start_lr`**:`Floats`=***`1e-07`***, **`end_lr`**:`Floats`=***`10`***, **`num_it`**:`int`=***`100`***, **`stop_div`**:`bool`=***`True`***, **`wd`**:`float`=***`None`***)\n", "\n", "Explore lr from `start_lr` to `end_lr` over `num_it` iterations in `learn`. If `stop_div`, stops when loss diverges. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Learner.lr_find)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Runs the learning rate finder defined in [`LRFinder`](/callbacks.lr_finder.html#LRFinder), as discussed in [Cyclical Learning Rates for Training Neural Networks](https://arxiv.org/abs/1506.01186). " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/html": [], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.\n" ] } ], "source": [ "learn.lr_find()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Min numerical gradient: 1.32E-02\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "learn.recorder.plot()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

fit[source]

\n", "\n", "> fit(**`epochs`**:`int`, **`lr`**:`Union`\\[`float`, `Collection`\\[`float`\\], `slice`\\]=***`slice(None, 0.003, None)`***, **`wd`**:`Floats`=***`None`***, **`callbacks`**:`Collection`\\[[`Callback`](/callback.html#Callback)\\]=***`None`***)\n", "\n", "Fit the model on this learner with `lr` learning rate, `wd` weight decay for `epochs` with `callbacks`. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Learner.fit)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Uses [discriminative layer training](#Discriminative-layer-training) if multiple learning rates or weight decay values are passed. To control training behaviour, use the [`callback`](/callback.html#callback) system or one or more of the pre-defined [`callbacks`](/callbacks.html#callbacks)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/html": [ "Total time: 00:04

\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
epoch | train_loss | valid_loss | accuracy
1 | 0.129607 | 0.082084 | 0.973013
\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "learn.fit(1)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

fit_one_cycle[source]

\n", "\n", "> fit_one_cycle(**`learn`**:[`Learner`](/basic_train.html#Learner), **`cyc_len`**:`int`, **`max_lr`**:`Union`\\[`float`, `Collection`\\[`float`\\], `slice`\\]=***`slice(None, 0.003, None)`***, **`moms`**:`Point`=***`(0.95, 0.85)`***, **`div_factor`**:`float`=***`25.0`***, **`pct_start`**:`float`=***`0.3`***, **`wd`**:`float`=***`None`***, **`callbacks`**:`Optional`\\[`Collection`\\[[`Callback`](/callback.html#Callback)\\]\\]=***`None`***, **`tot_epochs`**:`int`=***`None`***, **`start_epoch`**:`int`=***`1`***)\n", "\n", "Fit a model following the 1cycle policy. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Learner.fit_one_cycle)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Use cycle length `cyc_len`, a per-cycle maximal learning rate `max_lr`, momentum `moms`, division factor `div_factor`, weight decay `wd`, and optional callbacks [`callbacks`](/callbacks.html#callbacks). Uses the [`OneCycleScheduler`](/callbacks.one_cycle.html#OneCycleScheduler) callback. Please refer to [What is 1-cycle](/callbacks.one_cycle.html#What-is-1cycle?) for the conceptual background of the 1-cycle training policy and more technical details on what the method's arguments do." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/html": [ "Total time: 00:04

\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
epoch | train_loss | valid_loss | accuracy
1 | 0.088884 | 0.066379 | 0.978410
\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "learn.fit_one_cycle(1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### See results" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

predict[source]

\n", "\n", "> predict(**`item`**:[`ItemBase`](/core.html#ItemBase), **\\*\\*`kwargs`**)\n", "\n", "Return predicted class, label and probabilities for `item`. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Learner.predict)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`predict` can be used to get a single prediction from the trained learner on one specific piece of data you are interested in." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(Image (3, 28, 28), Category 3)" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "learn.data.train_ds[0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Each element of the dataset is a tuple, where the first element is the data itself, while the second element is the target label. So to get the data, we need to index one more time." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data = learn.data.train_ds[0][0]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "image/jpeg": 
"/9j/4AAQSkZJRgABAQEAZABkAAD/2wBDAAIBAQEBAQIBAQECAgICAgQDAgICAgUEBAMEBgUGBgYFBgYGBwkIBgcJBwYGCAsICQoKCgoKBggLDAsKDAkKCgr/2wBDAQICAgICAgUDAwUKBwYHCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgr/wAARCAAcABwDASIAAhEBAxEB/8QAHwAAAQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQRBRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3ODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ipqrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl5ufo6erx8vP09fb3+Pn6/8QAHwEAAwEBAQEBAQEBAQAAAAAAAAECAwQFBgcICQoL/8QAtREAAgECBAQDBAcFBAQAAQJ3AAECAxEEBSExBhJBUQdhcRMiMoEIFEKRobHBCSMzUvAVYnLRChYkNOEl8RcYGRomJygpKjU2Nzg5OkNERUZHSElKU1RVVldYWVpjZGVmZ2hpanN0dXZ3eHl6goOEhYaHiImKkpOUlZaXmJmaoqOkpaanqKmqsrO0tba3uLm6wsPExcbHyMnK0tPU1dbX2Nna4uPk5ebn6Onq8vP09fb3+Pn6/9oADAMBAAIRAxEAPwD+f+vc/wDgnb+xNcft+ftLWPwEk+NfhT4eaaNNuNU17xf4xv1gtNPsLcKZnBYqHkww2oWQHnLKATXhlfoV+z5/wSO/Ya8EfCzwl+0p/wAFKv8Agq58NfB/hvxHZWup23w9+F923iPxPPayosnkzJAjCxl2kgny51jJAY7soADQ+IP/AAQ5/Zii+FPx0+K/7PP/AAVn8DfE1Pgx4Wm8Qy6b4X8FX0sd1aCZ4oopr/zPskE8jKqokckxYlmA2ruP5y1+oP8AwWN/aB+Hvh/9i/4cfBL/AIJXPpmhfseeJNYv1kbSre+t9Y8R+KbIWzXg15rxVlnZFmtpYQpaIo6DgwJHD+X1AH2L/wAE3/2W/wDgkp+0B8OtZ1D9vn/gopr3wc8WWmtSR6Votn4Gnv7W604QwFbg3McbgSNK8yeUQCFhDc7uPr7w98Bv+DQj9mayfWPiR+1v8WPjtqCWCv8A2Lpmm39nBPJkAiPyLW02NlSQslzgK/JJwa/HyigD6/8A+CnP/BUXRP20NB8Lfs1fs2/s9aH8JPgH8Nbm4f4e+ArCCOe8WWYnzr27u2BkeaUksyByoJ+ZpWHmn5AoooA//9k=\n", "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAABwAAAAcCAYAAAByDd+UAAAABHNCSVQICAgIfAhkiAAAAlhJREFUSIntVj1rKlEQPUZDUoiF0SR2FomWIRGRRLCwDhK2EdE/YJE/INhaWiVNSCWI1TZRi4BVTBMswoJFwEZsJGyz2gjKvec1j+Vp1u/3AoE3MLB775w5d2bP7qwNAPGNtvOdZD+P8ODgABcXF2thHJuSxWIxFAoFhEIh3N3dQdd17O3tIZVKAQBUVUU+n7fEcl2/vb3lYDCgEMJ0KeXU/fPzsyV2owrPzs7gdDoBAJ1OB09PT9B1Haqq4vLyEqVSCZ+fn5bYjQhfX18xGo1QqVTQarUwmUwAAPv7+8hkMgCAdrs9F792S63c5XKx2+1SSsn393ceHR3Ni92OyOFw0OPxsNvtUghBTdPo8/kWYTYn8/l8LBaLpmg0TePx8fEy3GZEmUyGhmGYqnx4eFhW2foqTafTUBQFV1dXODw8nNr7+PhAv99fKc9SMdzf31NKaTpJ9no9NptN1mo1qqpKkszlcqt0aHFAJBLheDymEILD4ZDNZpPX19f0er1TcalUisPhkCcnJ9s/w3g8zkQisTBZNBqlEIKJROLvi8bK397eKIRgNpul3W6fG7f1eAqHw1BVFeFwGABwfn6OnZ3FaVeuwu/3MxqN8vHxkS8vL+aXRUrJwWDAbDa7NIft9wUA4ObmBoFA4MuJFEWBy+WC1+uF2+2GzWYDacJgGAaSySQajcZKXTHZZ0eM1diZXSuXy3S73St3aerFr1arCAaDOD09tTxZp9OBEAK6rqNer6PRaEDTNEgpV6oMAKZaCgC7u7umAGbtz1G0qX0h/Nf2s/7a/hNa2S/tek2pzxDJXQAAAABJRU5ErkJggg==\n", "text/plain": [ "Image (3, 28, 28)" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(Category 3, tensor(0), tensor([9.9979e-01, 2.0649e-04]))" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pred = learn.predict(data)\n", "pred" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The first two elements of the tuple are, respectively, the predicted class and label. Label here is essentially an internal representation of each class, since class name is a string and cannot be used in computation. 
To check what each label corresponds to, run:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['3', '7']" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "learn.data.classes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So category 0 is 3 while category 1 is 7." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "probs = pred[2]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The last element in the tuple is the predicted probabilities. For a categorization dataset, the number of probabilities returned is the same as the number of classes; `probs[i]` is the probability that the `item` belongs to `learn.data.classes[i]`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "image/jpeg": "/9j/4AAQSkZJRgABAQEAZABkAAD/2wBDAAIBAQEBAQIBAQECAgICAgQDAgICAgUEBAMEBgUGBgYFBgYGBwkIBgcJBwYGCAsICQoKCgoKBggLDAsKDAkKCgr/2wBDAQICAgICAgUDAwUKBwYHCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgr/wAARCAAcABwDASIAAhEBAxEB/8QAHwAAAQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQRBRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3ODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ipqrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl5ufo6erx8vP09fb3+Pn6/8QAHwEAAwEBAQEBAQEBAQAAAAAAAAECAwQFBgcICQoL/8QAtREAAgECBAQDBAcFBAQAAQJ3AAECAxEEBSExBhJBUQdhcRMiMoEIFEKRobHBCSMzUvAVYnLRChYkNOEl8RcYGRomJygpKjU2Nzg5OkNERUZHSElKU1RVVldYWVpjZGVmZ2hpanN0dXZ3eHl6goOEhYaHiImKkpOUlZaXmJmaoqOkpaanqKmqsrO0tba3uLm6wsPExcbHyMnK0tPU1dbX2Nna4uPk5ebn6Onq8vP09fb3+Pn6/9oADAMBAAIRAxEAPwD+f+vSvC/7Gf7YHjjwBF8V/BX7KXxK1jwtO4SDxLpfgXULjT5GIJAW4jhMZOFY4Dfwn0rz7RdQt9J1m01S70e21GK2uY5ZdPvTIIblVYExSeU6PsYDadjK2CcMDgj9d9T+GP8Awd8+HfHui+J/DV58Wre21sQPoFr4N8VabN4asraSONYUS3tZ3sLS2SN0A3qkaBS2flZgAfkHdWtzY3Mlle28kM0MhSaGVCrIwOCpB5BB4INR19xf
8HB/7Q/gv9ob/goEkvhrxF4f8S6x4L+HWg+FfHnj7wysQtPF/iK0tydQ1NTCqo376U24KjaVtV2/Livh2gAr7N/4IZftP/tK/Cz/AIKW/BX4bfC/x1rdx4f8Z/ELTPDXi3wi081zp2o6LqFzFbX6TWmTG6rbs8m4r+7MSyfwV8ZV7r+zr/wUA+KP7Knhu8T4HfDbwJo3jG40S60iz+KMegyHxDptncpLHOtrL532eGV4ppIjciA3ARsCUYGADiv2svBHgX4Z/tT/ABL+HHwvvRc+GfD/AMQNZ03w7crdLOJbGC+mit38xflkzGiHcOGzkda8/pWZnYu7EknJJPWkoA//2Q==\n", "image/png": "iVBORw0KGgoAAAANSUhEUgAAABwAAAAcCAYAAAByDd+UAAAABHNCSVQICAgIfAhkiAAAAdJJREFUSIntVbGq4kAUvW9ZUZCQIqjBzlohhWAsLEQLwdZUNn6AkEbQD7AJthZWFnYWWgj+gKRIpRZiK1hpIYoY1BSZ86q12SUm8T2L5V04zcyZOffMvZf5ICLQG+PXO8V+BH8EfcVvN6RCoUC1Wo1yuRwlEgmaTqd0Op1ot9vRZDIhwzA8icIJqqrCsizYtg3GGGzbfoAxBsuysNlsIMuy4z1/8EEOg8/zPK3XaxqPx9Tv9x/roVCIKpUKxeNxKpVKJAgCzedzymQyrznUNA35fN4xY0mSHo7dOHz6pM8gSRIYY9B13RX/5S4tFosEgDqdjuszjhlxHAdRFBEMBv/a43ke+/0e2+0WHMe5cvh0LGazGUmSRIvFgo7HI41GI7rf70REpGkaRSIRqtfrdLlcvsZht9sFY+yfOBwOUBTFa92dCYFAALFYDNlsFo1GA+12G6ZpwrZt9Ho9P43mvTNlWcb1esX5fIYgCN8vSERoNptgjKFarb5HkIjAGMNyufR05uU5BOCJ////h0Q+69dqtQDgPU0TjUZxu91gGAbC4fDXCyaTSaiqClEUUS6Xoes6TNNEOp328zruiIqiYDAYYLVaYTgcIpVK+SqF44//HfEJXkMk1eKRk3QAAAAASUVORK5CYII=\n", "text/plain": [ "Image (3, 28, 28)" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "learn.data.valid_ds[0][0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You could always check yourself if the probabilities given make sense." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

get_preds[source]

\n", "\n", "> get_preds(**`ds_type`**:[`DatasetType`](/basic_data.html#DatasetType)=***``***, **`with_loss`**:`bool`=***`False`***, **`n_batch`**:`Optional`\\[`int`\\]=***`None`***, **`pbar`**:`Union`\\[`MasterBar`, `ProgressBar`, `NoneType`\\]=***`None`***) → `List`\\[`Tensor`\\]\n", "\n", "Return predictions and targets on `ds_type` dataset. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Learner.get_preds)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It will run inference using the learner on the data in the `ds_type` dataset and return the predictions, processing the data in batches of the default batch size; if `n_batch` is specified, only that many batches are processed, otherwise the whole dataset is used. If `with_loss`, it will also return the loss on each prediction." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here is how you check the default batch size." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "64" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "learn.data.batch_size" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[tensor([[9.9366e-01, 6.3430e-03],\n", " [9.9828e-01, 1.7193e-03],\n", " [9.9993e-01, 7.1130e-05],\n", " ...,\n", " [1.5793e-04, 9.9984e-01],\n", " [9.0569e-03, 9.9094e-01],\n", " [9.8014e-01, 1.9864e-02]]), tensor([0, 0, 0, ..., 1, 1, 1])]" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "preds = learn.get_preds()\n", "preds" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The first element of the returned list is a tensor that contains all the predictions." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor([[9.9366e-01, 6.3430e-03],\n", " [9.9828e-01, 1.7193e-03],\n", " [9.9993e-01, 7.1130e-05],\n", " ...,\n", " [1.5793e-04, 9.9984e-01],\n", " [9.0569e-03, 9.9094e-01],\n", " [9.8014e-01, 1.9864e-02]])" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "preds[0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The second element is a tensor that contains all the target labels." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor([0, 0, 0, ..., 1, 1, 1])" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "preds[1]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor(0)" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "preds[1][0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For more details about what each number means, refer to the documentation of [`predict`](/basic_train.html#predict).\n", "\n", "Since [`get_preds`](/basic_train.html#get_preds) gets predictions on all the data in the `ds_type` dataset, the number of predictions here equals the number of items in the validation dataset." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2038" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(learn.data.valid_ds)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(2038, 2038)" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(preds[0]), len(preds[1])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To get predictions on the entire training dataset, simply set the `ds_type` argument accordingly." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[tensor([[9.9973e-01, 2.6554e-04],\n", " [9.9962e-01, 3.8422e-04],\n", " [9.9988e-01, 1.1570e-04],\n", " ...,\n", " [9.9922e-01, 7.8436e-04],\n", " [4.4838e-04, 9.9955e-01],\n", " [1.3715e-04, 9.9986e-01]]), tensor([0, 0, 0, ..., 0, 1, 1])]" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "learn.get_preds(ds_type=DatasetType.Train)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To also get prediction loss along with the predictions and the targets, set `with_loss=True` in the arguments." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[tensor([[9.9366e-01, 6.3430e-03],\n", " [9.9828e-01, 1.7193e-03],\n", " [9.9993e-01, 7.1130e-05],\n", " ...,\n", " [1.5793e-04, 9.9984e-01],\n", " [9.0569e-03, 9.9094e-01],\n", " [9.8014e-01, 1.9864e-02]]),\n", " tensor([0, 0, 0, ..., 1, 1, 1]),\n", " tensor([6.3632e-03, 1.7209e-03, 7.1049e-05, ..., 1.5783e-04, 9.0983e-03,\n", " 3.9189e+00])]" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "learn.get_preds(with_loss=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that the third tensor in the output tuple contains the losses." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

validate[source]

\n", "\n", "> validate(**`dl`**=***`None`***, **`callbacks`**=***`None`***, **`metrics`**=***`None`***)\n", "\n", "Validate on `dl` with potential `callbacks` and `metrics`. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Learner.validate)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Return the calculated loss and the metrics of the current model on the given data loader `dl`. The default data loader `dl` is the validation dataloader." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can check the default metrics of the learner using:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'[]'" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "str(learn.metrics)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[0.06637867, tensor(0.9784)]" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "learn.validate()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[0.06637867, tensor(0.9784)]" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "learn.validate(learn.data.valid_dl)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[0.039573476, tensor(0.9860)]" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "learn.validate(learn.data.train_dl)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

show_results[source]

\n", "\n", "> show_results(**`ds_type`**=***``***, **`rows`**:`int`=***`5`***, **\\*\\*`kwargs`**)\n", "\n", "Show `rows` result of predictions on `ds_type` dataset. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Learner.show_results)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that the text number on the top is the ground truth, or the target label, the one in the middle is the prediction, while the image number on the bottom is the image data itself." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "learn.show_results()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "learn.show_results(ds_type=DatasetType.Train)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

pred_batch[source]

\n", "\n", "> pred_batch(**`ds_type`**:[`DatasetType`](/basic_data.html#DatasetType)=***``***, **`batch`**:`Tuple`=***`None`***, **`reconstruct`**:`bool`=***`False`***) → `List`\\[`Tensor`\\]\n", "\n", "Return output of the model on one batch from `ds_type` dataset. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Learner.pred_batch)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that the number of predictions returned equals the batch size." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "64" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "learn.data.batch_size" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "64" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "preds = learn.pred_batch()\n", "len(preds)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Since there are too many predictions to display in full, we will only look at the first few." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor([[9.9366e-01, 6.3430e-03],\n", " [9.9828e-01, 1.7193e-03],\n", " [9.9993e-01, 7.1130e-05],\n", " [1.0000e+00, 5.2653e-07],\n", " [9.9839e-01, 1.6092e-03],\n", " [1.0000e+00, 9.6659e-07],\n", " [9.5156e-01, 4.8442e-02],\n", " [9.9854e-01, 1.4628e-03],\n", " [9.9937e-01, 6.2854e-04],\n", " [8.3490e-01, 1.6510e-01]])" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "preds[:10]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "image/jpeg": "/9j/4AAQSkZJRgABAQEAZABkAAD/2wBDAAIBAQEBAQIBAQECAgICAgQDAgICAgUEBAMEBgUGBgYFBgYGBwkIBgcJBwYGCAsICQoKCgoKBggLDAsKDAkKCgr/2wBDAQICAgICAgUDAwUKBwYHCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgr/wAARCAAcABwDASIAAhEBAxEB/8QAHwAAAQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQRBRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3ODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ipqrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl5ufo6erx8vP09fb3+Pn6/8QAHwEAAwEBAQEBAQEBAQAAAAAAAAECAwQFBgcICQoL/8QAtREAAgECBAQDBAcFBAQAAQJ3AAECAxEEBSExBhJBUQdhcRMiMoEIFEKRobHBCSMzUvAVYnLRChYkNOEl8RcYGRomJygpKjU2Nzg5OkNERUZHSElKU1RVVldYWVpjZGVmZ2hpanN0dXZ3eHl6goOEhYaHiImKkpOUlZaXmJmaoqOkpaanqKmqsrO0tba3uLm6wsPExcbHyMnK0tPU1dbX2Nna4uPk5ebn6Onq8vP09fb3+Pn6/9oADAMBAAIRAxEAPwD+f+vc/wDgnb+xNcft+ftLWPwEk+NfhT4eaaNNuNU17xf4xv1gtNPsLcKZnBYqHkww2oWQHnLKATXhlfoV+z5/wSO/Ya8EfCzwl+0p/wAFKv8Agq58NfB/hvxHZWup23w9+F923iPxPPayosnkzJAjCxl2kgny51jJAY7soADQ+IP/AAQ5/Zii+FPx0+K/7PP/AAVn8DfE1Pgx4Wm8Qy6b4X8FX0sd1aCZ4oopr/zPskE8jKqokckxYlmA2ruP5y1+oP8AwWN/aB+Hvh/9i/4cfBL/AIJXPpmhfseeJNYv1kbSre+t9Y8R+KbIWzXg15rxVlnZFmtpYQpaIo6DgwJHD+X1AH2L/wAE3/2W/wDgkp+0B8OtZ1D9vn/gopr3wc8WWmtSR6Votn4Gnv7W604QwFbg3McbgSNK8yeUQCFhDc7uPr7w98Bv+DQj9mayfWPiR+1v8WPjtqCWCv8A2Lpmm39nBPJkAiPyLW02NlSQslzgK/JJwa/HyigD6/8A+CnP/BUX
RP20NB8Lfs1fs2/s9aH8JPgH8Nbm4f4e+ArCCOe8WWYnzr27u2BkeaUksyByoJ+ZpWHmn5AoooA//9k=\n", "image/png": "iVBORw0KGgoAAAANSUhEUgAAABwAAAAcCAYAAAByDd+UAAAABHNCSVQICAgIfAhkiAAAAlhJREFUSIntVj1rKlEQPUZDUoiF0SR2FomWIRGRRLCwDhK2EdE/YJE/INhaWiVNSCWI1TZRi4BVTBMswoJFwEZsJGyz2gjKvec1j+Vp1u/3AoE3MLB775w5d2bP7qwNAPGNtvOdZD+P8ODgABcXF2thHJuSxWIxFAoFhEIh3N3dQdd17O3tIZVKAQBUVUU+n7fEcl2/vb3lYDCgEMJ0KeXU/fPzsyV2owrPzs7gdDoBAJ1OB09PT9B1Haqq4vLyEqVSCZ+fn5bYjQhfX18xGo1QqVTQarUwmUwAAPv7+8hkMgCAdrs9F792S63c5XKx2+1SSsn393ceHR3Ni92OyOFw0OPxsNvtUghBTdPo8/kWYTYn8/l8LBaLpmg0TePx8fEy3GZEmUyGhmGYqnx4eFhW2foqTafTUBQFV1dXODw8nNr7+PhAv99fKc9SMdzf31NKaTpJ9no9NptN1mo1qqpKkszlcqt0aHFAJBLheDymEILD4ZDNZpPX19f0er1TcalUisPhkCcnJ9s/w3g8zkQisTBZNBqlEIKJROLvi8bK397eKIRgNpul3W6fG7f1eAqHw1BVFeFwGABwfn6OnZ3FaVeuwu/3MxqN8vHxkS8vL+aXRUrJwWDAbDa7NIft9wUA4ObmBoFA4MuJFEWBy+WC1+uF2+2GzWYDacJgGAaSySQajcZKXTHZZ0eM1diZXSuXy3S73St3aerFr1arCAaDOD09tTxZp9OBEAK6rqNer6PRaEDTNEgpV6oMAKZaCgC7u7umAGbtz1G0qX0h/Nf2s/7a/hNa2S/tek2pzxDJXQAAAABJRU5ErkJggg==\n", "text/plain": [ "Image (3, 28, 28)" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "item = learn.data.train_ds[0][0]\n", "item" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(tensor([[[[0., 0., 0., ..., 0., 0., 0.],\n", " [0., 0., 0., ..., 0., 0., 0.],\n", " [0., 0., 0., ..., 0., 0., 0.],\n", " ...,\n", " [0., 0., 0., ..., 0., 0., 0.],\n", " [0., 0., 0., ..., 0., 0., 0.],\n", " [0., 0., 0., ..., 0., 0., 0.]],\n", " \n", " [[0., 0., 0., ..., 0., 0., 0.],\n", " [0., 0., 0., ..., 0., 0., 0.],\n", " [0., 0., 0., ..., 0., 0., 0.],\n", " ...,\n", " [0., 0., 0., ..., 0., 0., 0.],\n", " [0., 0., 0., ..., 0., 0., 0.],\n", " [0., 0., 0., ..., 0., 0., 0.]],\n", " \n", " [[0., 0., 0., ..., 0., 0., 0.],\n", " [0., 0., 0., ..., 0., 0., 0.],\n", " [0., 0., 0., ..., 0., 0., 0.],\n", " ...,\n", " [0., 0., 0., ..., 0., 0., 0.],\n", " [0., 0., 0., ..., 0., 0., 0.],\n", " [0., 0., 0., ..., 
0., 0., 0.]]]], device='cuda:0'),\n", " tensor([0], device='cuda:0'))" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "batch = learn.data.one_item(item)\n", "batch" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor([[9.9979e-01, 2.0649e-04]])" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "learn.pred_batch(batch=batch)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

interpret[source]

\n", "\n", "> interpret(**`learn`**:[`Learner`](/basic_train.html#Learner), **`ds_type`**:[`DatasetType`](/basic_data.html#DatasetType)=***``***, **`tta`**=***`False`***)\n", "\n", "Create a [`ClassificationInterpretation`](/train.html#ClassificationInterpretation) object from `learner` on `ds_type` with `tta`. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Learner.interpret, full_name='interpret')" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "
Note: This function only works in the vision application.
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "jekyll_note('This function only works in the vision application.')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For more details, refer to [ClassificationInterpretation](/vision.learner.html#ClassificationInterpretation)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Model summary" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

model_summary[source]

\n", "\n", "> model_summary(**`m`**:[`Learner`](/basic_train.html#Learner), **`n`**:`int`=***`70`***)\n", "\n", "Print a summary of `m` using a output text width of `n` chars " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Learner.summary)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Test time augmentation" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

TTA[source]

\n", "\n", "> TTA(**`learn`**:[`Learner`](/basic_train.html#Learner), **`beta`**:`float`=***`0.4`***, **`scale`**:`float`=***`1.35`***, **`ds_type`**:[`DatasetType`](/basic_data.html#DatasetType)=***``***, **`with_loss`**:`bool`=***`False`***) → `Tensors`\n", "\n", "Applies TTA to predict on `ds_type` dataset. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Learner.TTA, full_name = 'TTA')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Applies Test Time Augmentation to `learn` on the dataset `ds_type`. We take the average of our regular predictions (with a weight `beta`) with the average of predictions obtained through augmented versions of the training set (with a weight `1-beta`). The transforms decided for the training set are applied with a few changes: `scale` controls the scale for zoom (which isn't random), and the cropping isn't random but we make sure to get the four corners of the image. Flipping isn't random either but is applied once on each of those corner images (so that makes 8 augmented versions in total)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Gradient clipping" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

clip_grad[source]

\n", "\n", "> clip_grad(**`learn`**:[`Learner`](/basic_train.html#Learner), **`clip`**:`float`=***`0.1`***) → [`Learner`](/basic_train.html#Learner)\n", "\n", "Add gradient clipping of `clip` during training. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Learner.clip_grad)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Mixed precision training" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

to_fp16[source]

\n", "\n", "> to_fp16(**`learn`**:[`Learner`](/basic_train.html#Learner), **`loss_scale`**:`float`=***`None`***, **`max_noskip`**:`int`=***`1000`***, **`dynamic`**:`bool`=***`False`***, **`clip`**:`float`=***`None`***, **`flat_master`**:`bool`=***`False`***) → [`Learner`](/basic_train.html#Learner)\n", "\n", "Put `learn` in FP16 precision mode. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Learner.to_fp16)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Uses the [`MixedPrecision`](/callbacks.fp16.html#MixedPrecision) callback to train in mixed precision (i.e. forward and backward passes using fp16, with weight updates using fp32), using all [NVIDIA recommendations](https://docs.nvidia.com/deeplearning/sdk/mixed-precision-training/index.html) for ensuring speed and accuracy." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

to_fp32[source]

\n", "\n", "> to_fp32(**`learn`**:[`Learner`](/basic_train.html#Learner))\n", "\n", "Put `learn` back to FP32 precision mode. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Learner.to_fp32)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Distributed training" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

distributed[source]

\n", "\n", "> distributed(**`learn`**:[`Learner`](/basic_train.html#Learner), **`cuda_id`**:`int`, **`cache_dir`**:`PathOrStr`=***`'tmp'`***)\n", "\n", "Put `learn` on distributed training with `cuda_id`. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Learner.distributed, full_name='distributed')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Discriminative layer training" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When fitting a model you can pass a list of learning rates (and/or weight decay amounts), which will apply a different rate to each *layer group* (i.e. the parameters of each module in `self.layer_groups`). See the [Universal Language Model Fine-tuning for Text Classification](https://arxiv.org/abs/1801.06146) paper for details and experimental results in NLP (we also frequently use them successfully in computer vision, but have not published a paper on this topic yet). When working with a [`Learner`](/basic_train.html#Learner) on which you've called `split`, you can set hyperparameters in four ways:\n", "\n", "1. `param = [val1, val2 ..., valn]` (n = number of layer groups)\n", "2. `param = val`\n", "3. `param = slice(start,end)`\n", "4. `param = slice(end)`\n", "\n", "If we choose to set it in way 1, we must specify a number of values exactly equal to the number of layer groups. If we choose to set it in way 2, the chosen value will be repeated for all layer groups. 
See [`Learner.lr_range`](/basic_train.html#Learner.lr_range) for an explanation of the `slice` syntax.\n", "\n", "Here's an example of how to use discriminative learning rates (note that you don't actually need to manually call [`Learner.split`](/basic_train.html#Learner.split) in this case, since fastai uses this exact function as the default split for `resnet18`; this is just to show how to customize it):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# creates 3 layer groups\n", "learn.split(lambda m: (m[0][6], m[1]))\n", "# only randomly initialized head now trainable\n", "learn.freeze()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/html": [ "Total time: 00:04

\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
epoch | train_loss | valid_loss | accuracy
1 | 0.067769 | 0.060910 | 0.979392
\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "learn.fit_one_cycle(1)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/html": [ "Total time: 00:06

\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
epoch | train_loss | valid_loss | accuracy
1 | 0.022366 | 0.006872 | 0.998037
\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# all layers now trainable\n", "learn.unfreeze()\n", "# optionally, separate LR and WD for each group\n", "learn.fit_one_cycle(1, max_lr=(1e-4, 1e-3, 1e-2), wd=(1e-4,1e-4,1e-1))" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

lr_range[source]

\n", "\n", "> lr_range(**`lr`**:`Union`\\[`float`, `slice`\\]) → `ndarray`\n", "\n", "Build differential learning rates from `lr`. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Learner.lr_range)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Rather than manually setting an LR for every group, it's often easier to use [`Learner.lr_range`](/basic_train.html#Learner.lr_range). This is a convenience method that returns one learning rate for each layer group. If you pass `slice(start,end)` then the first group's learning rate is `start`, the last is `end`, and the remaining are evenly geometrically spaced.\n", "\n", "If you pass just `slice(end)` then the last group's learning rate is `end`, and all the other groups are `end/10`. For instance (for our learner that has 3 layer groups):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(array([1.e-05, 1.e-04, 1.e-03]), array([0.0001, 0.0001, 0.001 ]))" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "learn.lr_range(slice(1e-5,1e-3)), learn.lr_range(slice(1e-3))" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

unfreeze[source]

\n", "\n", "> unfreeze()\n", "\n", "Unfreeze entire model. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Learner.unfreeze)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Sets every layer group to *trainable* (i.e. `requires_grad=True`)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

freeze[source]

\n", "\n", "> freeze()\n", "\n", "Freeze up to last layer. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Learner.freeze)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Sets every layer group except the last to *untrainable* (i.e. `requires_grad=False`)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

freeze_to[source]

\n", "\n", "> freeze_to(**`n`**:`int`)\n", "\n", "Freeze layers up to layer `n`. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Learner.freeze_to)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

split[source]

\n", "\n", "> split(**`split_on`**:`SplitFuncOrIdxList`)\n", "\n", "Split the model at `split_on`. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Learner.split)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A convenience method that sets `layer_groups` based on the result of [`split_model`](/torch_core.html#split_model). If `split_on` is a function, it calls that function and passes the result to [`split_model`](/torch_core.html#split_model) (see above for example)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Saving and loading models" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Simply call [`Learner.save`](/basic_train.html#Learner.save) and [`Learner.load`](/basic_train.html#Learner.load) to save and load models. Only the parameters are saved, not the actual architecture (so you'll need to create your model in the same way before loading weights back in). Models are saved to the `path`/`model_dir` directory." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": false }, "outputs": [ { "data": { "text/markdown": [ "

save[source]

\n", "\n", "> save(**`name`**:`PathOrStr`, **`return_path`**:`bool`=***`False`***, **`with_opt`**:`bool`=***`True`***)\n", "\n", "Save model and optimizer state (if `with_opt`) with `name` to `self.model_dir`. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Learner.save)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "learn.save(\"trained_model\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "PosixPath('/home/jupyter/.fastai/data/mnist_sample/models/trained_model.pth')" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "learn.save(\"trained_model\", return_path=True)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

load[source]

\n", "\n", "> load(**`name`**:`PathOrStr`, **`device`**:[`device`](https://pytorch.org/docs/stable/tensor_attributes.html#torch-device)=***`None`***, **`strict`**:`bool`=***`True`***, **`with_opt`**:`bool`=***`None`***, **`purge`**:`bool`=***`False`***)\n", "\n", "Load model and optimizer state (if `with_opt`) `name` from `self.model_dir` using `device`. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Learner.load)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "learn = learn.load(\"trained_model\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Deploying your model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When you are ready to put your model in production, export the minimal state of your [`Learner`](/basic_train.html#Learner) with" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

export[source]

\n", "\n", "> export(**`fname`**:`str`=***`'export.pkl'`***)\n", "\n", "Export the state of the [`Learner`](/basic_train.html#Learner) in `self.path/fname`. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Learner.export)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "learn.export()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "learn.export('trained_model.pkl')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "PosixPath('/home/jupyter/.fastai/data/mnist_sample')" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "path = learn.path\n", "path" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

load_learner[source]

\n", "\n", "> load_learner(**`path`**:`PathOrStr`, **`fname`**:`PathOrStr`=***`'export.pkl'`***, **`test`**:[`ItemList`](/data_block.html#ItemList)=***`None`***)\n", "\n", "Load a [`Learner`](/basic_train.html#Learner) object saved with `export_state` in `path/fn` with empty data, optionally add `test` and load on `cpu`. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(load_learner)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "learn = load_learner(path)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "learn = load_learner(path, fname='trained_model.pkl')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "WARNING: If you used any customized classes when creating your learner, you must define these classes before executing [`load_learner`](/basic_train.html#load_learner).\n", "\n", "You can find more information and multiple examples in [this tutorial](/tutorial.inference.html)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Other methods" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

init[source]

\n", "\n", "> init(**`init`**)" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Learner.init)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Initializes all weights (except batchnorm) using function `init`, which will often be from PyTorch's [`nn.init`](https://pytorch.org/docs/stable/nn.html#torch-nn-init) module." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

mixup[source]

\n", "\n", "> mixup(**`learn`**:[`Learner`](/basic_train.html#Learner), **`alpha`**:`float`=***`0.4`***, **`stack_x`**:`bool`=***`False`***, **`stack_y`**:`bool`=***`True`***) → [`Learner`](/basic_train.html#Learner)\n", "\n", "Add mixup https://arxiv.org/abs/1710.09412 to `learn`. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Learner.mixup)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Uses [`MixUpCallback`](/callbacks.mixup.html#MixUpCallback)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

backward[source]

\n", "\n", "> backward(**`item`**)\n", "\n", "Pass `item` through the model and computes the gradient. Useful if `backward_hooks` are attached. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Learner.backward)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

create_opt[source]

\n", "\n", "> create_opt(**`lr`**:`Floats`, **`wd`**:`Floats`=***`0.0`***)\n", "\n", "Create optimizer with `lr` learning rate and `wd` weight decay. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Learner.create_opt)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You generally won't need to call this yourself - it's used to create the [`optim`](https://pytorch.org/docs/stable/optim.html#module-torch.optim) optimizer before fitting the model." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

dl[source]

\n", "\n", "> dl(**`ds_type`**:[`DatasetType`](/basic_data.html#DatasetType)=***``***)\n", "\n", "Return DataLoader for DatasetType `ds_type`. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Learner.dl)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "DeviceDataLoader(dl=, device=device(type='cuda'), tfms=[], collate_fn=)" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "learn.dl()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "DeviceDataLoader(dl=, device=device(type='cuda'), tfms=[], collate_fn=)" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "learn.dl(DatasetType.Train)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

class Recorder[source]

\n", "\n", "> Recorder(**`learn`**:[`Learner`](/basic_train.html#Learner)) :: [`LearnerCallback`](/basic_train.html#LearnerCallback)\n", "\n", "A [`LearnerCallback`](/basic_train.html#LearnerCallback) that records epoch, loss, opt and metric data during training. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Recorder, title_level=2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A [`Learner`](/basic_train.html#Learner) creates a [`Recorder`](/basic_train.html#Recorder) object automatically - you do not need to explicitly pass it to `callback_fns` - because other callbacks rely on it being available. It stores the smoothed loss, hyperparameter values, and metrics for each batch, and provides plotting methods for each. Note that [`Learner`](/basic_train.html#Learner) automatically sets an attribute with the snake-cased name of each callback, so you can access this through `Learner.recorder`, as shown below." ] }, { "cell_type": "markdown", "metadata": { "hide_input": true }, "source": [ "### Plotting methods" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

plot[source]

\n", "\n", "> plot(**`skip_start`**:`int`=***`10`***, **`skip_end`**:`int`=***`5`***)\n", "\n", "Plot learning rate and losses, trimmed between `skip_start` and `skip_end`. Optionally plot and return min gradient " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Recorder.plot)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is mainly used with the learning rate finder, since it shows a scatterplot of loss vs learning rate." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "path = untar_data(URLs.MNIST_SAMPLE)\n", "data = ImageDataBunch.from_folder(path)\n", "learn = create_cnn(data, models.resnet18, metrics=accuracy)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/html": [], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.\n", "Min numerical gradient: 7.59E-03\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "learn.lr_find()\n", "learn.recorder.plot()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

plot_losses[source]

\n", "\n", "> plot_losses(**`last`**:`int`=***`None`***)\n", "\n", "Plot training and validation losses. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Recorder.plot_losses)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that validation losses are only calculated once per epoch, whereas training losses are calculated after every batch." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/html": [ "Total time: 00:22

\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
epoch | train_loss | valid_loss | accuracy
1 | 0.247247 | 0.141247 | 0.954367
2 | 0.109672 | 0.078876 | 0.972522
3 | 0.065391 | 0.054635 | 0.983808
4 | 0.044042 | 0.049592 | 0.981845
5 | 0.041287 | 0.049224 | 0.984298
\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "image/png": "\n", "text/plain": [ "

" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "learn.fit_one_cycle(5)\n", "learn.recorder.plot_losses()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

plot_lr[source]

\n", "\n", "> plot_lr(**`show_moms`**=***`False`***)\n", "\n", "Plot learning rate, `show_moms` to include momentum. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Recorder.plot_lr)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "learn.recorder.plot_lr()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "learn.recorder.plot_lr(show_moms=True)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

plot_metrics[source]

\n", "\n", "> plot_metrics()\n", "\n", "Plot metrics collected during training. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Recorder.plot_metrics)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that metrics are only collected at the end of each epoch, so you'll need to train at least two epochs to have anything to show here." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "learn.recorder.plot_metrics()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Callback methods" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You don't call these yourself - they're called by fastai's [`Callback`](/callback.html#Callback) system automatically to enable the class's functionality. Refer to [`Callback`](/callback.html#Callback) for more details." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

on_backward_begin[source]

\n", "\n", "> on_backward_begin(**`smooth_loss`**:`Tensor`, **\\*\\*`kwargs`**:`Any`)\n", "\n", "Record the loss before any other callback has a chance to modify it. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Recorder.on_backward_begin)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

on_batch_begin[source]

\n", "\n", "> on_batch_begin(**`train`**, **\\*\\*`kwargs`**:`Any`)\n", "\n", "Record learning rate and momentum at beginning of batch. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Recorder.on_batch_begin)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

on_epoch_end[source]

\n", "\n", "> on_epoch_end(**`epoch`**:`int`, **`num_batch`**:`int`, **`smooth_loss`**:`Tensor`, **`last_metrics`**=***`typing.Collection[typing.Union[torch.Tensor, numbers.Number]]`***, **\\*\\*`kwargs`**:`Any`) → `bool`\n", "\n", "Save epoch info: num_batch, smooth_loss, metrics. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Recorder.on_epoch_end)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

on_train_begin[source]

\n", "\n", "> on_train_begin(**`pbar`**:`PBar`, **`metrics_names`**:`StrList`, **\\*\\*`kwargs`**:`Any`)\n", "\n", "Initialize recording status at beginning of training. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Recorder.on_train_begin)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Inner functions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following functions are used along the way by the [`Recorder`](/basic_train.html#Recorder) or can be called by other callbacks." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

add_metrics[source]

\n", "\n", "> add_metrics(**`metrics`**)\n", "\n", "Add `metrics` to the inner stats. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Recorder.add_metrics)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

add_metric_names[source]

\n", "\n", "> add_metric_names(**`names`**)\n", "\n", "Add `names` to the inner metric names. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Recorder.add_metric_names)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

format_stats[source]

\n", "\n", "> format_stats(**`stats`**:`MetricsList`)\n", "\n", "Format stats before printing. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Recorder.format_stats)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Module functions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Generally you'll want to use a [`Learner`](/basic_train.html#Learner) to train your model, since they provide a lot of functionality and make things easier. However, for ultimate flexibility, you can call the same underlying functions that [`Learner`](/basic_train.html#Learner) calls behind the scenes:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

fit[source]

\n", "\n", "> fit(**`epochs`**:`int`, **`model`**:[`Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module), **`loss_func`**:`LossFunction`, **`opt`**:[`Optimizer`](https://pytorch.org/docs/stable/optim.html#torch.optim.Optimizer), **`data`**:[`DataBunch`](/basic_data.html#DataBunch), **`callbacks`**:`Optional`\\[`Collection`\\[[`Callback`](/callback.html#Callback)\\]\\]=***`None`***, **`metrics`**:`OptMetrics`=***`None`***)\n", "\n", "Fit the `model` on `data` and learn using `loss_func` and `opt`. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(fit)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that you have to create the `Optimizer` yourself if you call this function, whereas [`Learner.fit`](/basic_train.html#fit) creates it for you automatically." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

train_epoch[source]

\n", "\n", "> train_epoch(**`model`**:[`Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module), **`dl`**:[`DataLoader`](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader), **`opt`**:[`Optimizer`](https://pytorch.org/docs/stable/optim.html#torch.optim.Optimizer), **`loss_func`**:`LossFunction`)\n", "\n", "Simple training of `model` for 1 epoch of `dl` using optim `opt` and loss function `loss_func`. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(train_epoch)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You won't generally need to call this yourself - it's what [`fit`](/basic_train.html#fit) calls for each epoch." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

validate[source]

\n", "\n", "> validate(**`model`**:[`Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module), **`dl`**:[`DataLoader`](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader), **`loss_func`**:`OptLossFunc`=***`None`***, **`cb_handler`**:`Optional`\\[[`CallbackHandler`](/callback.html#CallbackHandler)\\]=***`None`***, **`pbar`**:`Union`\\[`MasterBar`, `ProgressBar`, `NoneType`\\]=***`None`***, **`average`**=***`True`***, **`n_batch`**:`Optional`\\[`int`\\]=***`None`***) → `Iterator`\\[`Tuple`\\[`IntOrTensor`, `Ellipsis`\\]\\]\n", "\n", "Calculate `loss_func` of `model` on `dl` in evaluation mode. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(validate)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is what [`fit`](/basic_train.html#fit) calls after each epoch. You can call it if you want to run inference on a [`DataLoader`](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader) manually." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

get_preds[source]

\n", "\n", "> get_preds(**`model`**:[`Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module), **`dl`**:[`DataLoader`](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader), **`pbar`**:`Union`\\[`MasterBar`, `ProgressBar`, `NoneType`\\]=***`None`***, **`cb_handler`**:`Optional`\\[[`CallbackHandler`](/callback.html#CallbackHandler)\\]=***`None`***, **`activ`**:[`Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module)=***`None`***, **`loss_func`**:`OptLossFunc`=***`None`***, **`n_batch`**:`Optional`\\[`int`\\]=***`None`***) → `List`\\[`Tensor`\\]\n", "\n", "Tuple of predictions and targets, and optional losses (if `loss_func`) using `dl`, max batches `n_batch`. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(get_preds)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

loss_batch[source]

\n", "\n", "> loss_batch(**`model`**:[`Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module), **`xb`**:`Tensor`, **`yb`**:`Tensor`, **`loss_func`**:`OptLossFunc`=***`None`***, **`opt`**:`OptOptimizer`=***`None`***, **`cb_handler`**:`Optional`\\[[`CallbackHandler`](/callback.html#CallbackHandler)\\]=***`None`***) → `Tuple`\\[`Union`\\[`Tensor`, `int`, `float`, `str`\\]\\]\n", "\n", "Calculate loss and metrics for a batch, call out to callbacks as necessary. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(loss_batch)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You won't generally need to call this yourself - it's what [`fit`](/basic_train.html#fit) and [`validate`](/basic_train.html#validate) call for each batch. It runs the backward pass and the optimizer step only if you pass `opt`; otherwise it just computes the loss." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Other classes" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

class LearnerCallback[source]

\n", "\n", "> LearnerCallback(**`learn`**) :: [`Callback`](/callback.html#Callback)\n", "\n", "Base class for creating callbacks for a [`Learner`](/basic_train.html#Learner). " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(LearnerCallback, title_level=3)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

class RecordOnCPU[source]

\n", "\n", "> RecordOnCPU() :: [`Callback`](/callback.html#Callback)\n", "\n", "Store the `input` and `target` going through the model on the CPU. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(RecordOnCPU, title_level=3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Undocumented Methods - Methods moved below this line will intentionally be hidden" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

_tta_only[source]

\n", "\n", "> _tta_only(**`learn`**:[`Learner`](/basic_train.html#Learner), **`ds_type`**:[`DatasetType`](/basic_data.html#DatasetType)=***``***, **`scale`**:`float`=***`1.35`***) → `Iterator`\\[`List`\\[`Tensor`\\]\\]\n", "\n", "Computes the outputs for several augmented inputs for TTA " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Learner.tta_only)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

_TTA[source]

\n", "\n", "> _TTA(**`learn`**:[`Learner`](/basic_train.html#Learner), **`beta`**:`float`=***`0.4`***, **`scale`**:`float`=***`1.35`***, **`ds_type`**:[`DatasetType`](/basic_data.html#DatasetType)=***``***, **`with_loss`**:`bool`=***`False`***) → `Tensors`\n", "\n", "Applies TTA to predict on `ds_type` dataset. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Learner.TTA)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

on_batch_begin[source]

\n", "\n", "> on_batch_begin(**`last_input`**, **`last_target`**, **\\*\\*`kwargs`**)\n", "\n", "Set HP before the step is done. Returns xb, yb (which can allow us to modify the input at that step if needed). " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(RecordOnCPU.on_batch_begin)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## New Methods - Please document or move to the undocumented section" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

purge[source]

\n", "\n", "> purge(**`clear_opt`**:`bool`=***`True`***)\n", "\n", "Purge the [`Learner`](/basic_train.html#Learner) of all cached attributes to release some GPU memory. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Learner.purge)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] } ], "metadata": { "jekyll": { "keywords": "fastai", "summary": "Learner class and training loop", "title": "basic_train" }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" } }, "nbformat": 4, "nbformat_minor": 2 }