In [None]:
# default_exp callback.core

In [None]:
#export
from fastai2.data.all import *
from fastai2.optimizer import *

In [None]:
from nbdev.showdoc import *

In [None]:
#export
_all_ = ['CancelFitException', 'CancelEpochException', 'CancelTrainException', 'CancelValidException', 'CancelBatchException']

# Callback

> Basic callbacks for Learner

## Events

Callbacks can occur at any of these times: *begin_fit begin_epoch begin_train begin_batch after_pred after_loss after_backward after_step after_cancel_batch after_batch after_cancel_train after_train begin_validate after_cancel_validate after_validate after_cancel_epoch after_epoch after_cancel_fit after_fit*.

In [None]:
# export
_events = L.split('begin_fit begin_epoch begin_train begin_batch after_pred after_loss \
    after_backward after_step after_cancel_batch after_batch after_cancel_train \
    after_train begin_validate after_cancel_validate after_validate after_cancel_epoch \
    after_epoch after_cancel_fit after_fit')

mk_class('event', **_events.map_dict(),
         doc="All possible events as attributes to get tab-completion and typo-proofing")

In [None]:
# export
_all_ = ['event']

In [None]:
show_doc(event, name='event', title_level=3)

<h3 id="event" class="doc_header"><code>class</code> <code>event</code><a href="" class="source_link" style="float:right">[source]</a></h3>

> <code>event</code>(**\*`args`**, **\*\*`kwargs`**)

All possible events as attributes to get tab-completion and typo-proofing

To ensure that you are referring to an event that exists (that is, the name of one of the times when callbacks are called), and to get tab completion of event names, use `event`:

In [None]:
test_eq(event.after_backward, 'after_backward')

## Callback - 

In [None]:
#export
_inner_loop = "begin_batch after_pred after_loss after_backward after_step after_cancel_batch after_batch".split()

In [None]:
#export
@funcs_kwargs(as_method=True)
class Callback(GetAttr):
    "Basic class handling tweaks of the training loop by changing a `Learner` in various events"
    _default,learn,run,run_train,run_valid = 'learn',None,True,True,True
    _methods = _events
    
    def __init__(self, **kwargs): assert not kwargs, f'Passed unknown events: {kwargs}'
    def __repr__(self): return type(self).__name__

    def __call__(self, event_name):
        "Call `self.{event_name}` if it's defined"
        _run = (event_name not in _inner_loop or (self.run_train and getattr(self, 'training', True)) or
               (self.run_valid and not getattr(self, 'training', False)))
        res = None
        if self.run and _run: res = getattr(self, event_name, noop)()
        if event_name=='after_fit': self.run=True #Reset self.run to True at each end of fit
        return res

    def __setattr__(self, name, value):
        if hasattr(self.learn,name):
            warn(f"You are setting an attribute ({name}) that also exists in the learner, so you're not setting it in the learner but in the callback. Use `self.learn.{name}` otherwise.")
        super().__setattr__(name, value)

    @property
    def name(self):
        "Name of the `Callback`, snake-cased and with '*Callback*' removed"
        return class2attr(self, 'Callback')

The training loop is defined in `Learner` a bit below and consists of a minimal set of instructions. Looping through the data, we:
- compute the output of the model from the input
- calculate a loss between this output and the desired target
- compute the gradients of this loss with respect to all the model parameters
- update the parameters accordingly
- zero all the gradients

Any tweak of this training loop is defined in a `Callback` to avoid over-complicating the code of the training loop, and to make it easy to mix and match different techniques (since they'll be defined in different callbacks). A callback can implement actions on the following events:

- `begin_fit`: called before doing anything, ideal for initial setup.
- `begin_epoch`: called at the beginning of each epoch, useful for any behavior you need to reset at each epoch.
- `begin_train`: called at the beginning of the training part of an epoch.
- `begin_batch`: called at the beginning of each batch, just after drawing said batch. It can be used to do any setup necessary for the batch (like hyper-parameter scheduling) or to change the input/target before it goes in the model (change of the input with techniques like mixup for instance).
- `after_pred`: called after computing the output of the model on the batch. It can be used to change that output before it's fed to the loss.
- `after_loss`: called after the loss has been computed, but before the backward pass. It can be used to add any penalty to the loss (AR or TAR in RNN training for instance).
- `after_backward`: called after the backward pass, but before the update of the parameters. It can be used to do any change to the gradients before said update (gradient clipping for instance).
- `after_step`: called after the step and before the gradients are zeroed.
- `after_batch`: called at the end of a batch, for any clean-up before the next one.
- `after_train`: called at the end of the training phase of an epoch.
- `begin_validate`: called at the beginning of the validation phase of an epoch, useful for any setup needed specifically for validation.
- `after_validate`: called at the end of the validation part of an epoch.
- `after_epoch`: called at the end of an epoch, for any clean-up before the next one.
- `after_fit`: called at the end of training, for final clean-up.
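
As a rough illustration of the order in which these events fire, here is a minimal pure-Python sketch. It is not the `Learner` implementation: `run_fit` and `fire` are hypothetical names, no model or data is involved, and it assumes the validation phase stops after `after_loss` (no backward pass or step).

```python
# Simplified stand-in for the training loop: it only records the order in
# which events fire. `run_fit` and `fire` are hypothetical names.
def run_fit(n_epoch, n_train_batches, n_valid_batches):
    "Return the list of events fired, in order, for a tiny fit."
    fired = []
    fire = fired.append

    fire('begin_fit')
    for _ in range(n_epoch):
        fire('begin_epoch')
        fire('begin_train')
        for _ in range(n_train_batches):
            fire('begin_batch')
            fire('after_pred')      # model output computed
            fire('after_loss')      # loss computed
            fire('after_backward')  # gradients computed
            fire('after_step')      # parameters updated, gradients not yet zeroed
            fire('after_batch')
        fire('after_train')
        fire('begin_validate')
        for _ in range(n_valid_batches):
            fire('begin_batch')
            fire('after_pred')
            fire('after_loss')      # assumption: no backward pass or step here
            fire('after_batch')
        fire('after_validate')
        fire('after_epoch')
    fire('after_fit')
    return fired

events = run_fit(n_epoch=1, n_train_batches=1, n_valid_batches=1)
print(events)
```

The cancellation events (`after_cancel_batch` and friends) are left out of this sketch since they only fire when an exception interrupts the normal flow.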

In [None]:
show_doc(Callback.__call__)

<h4 id="Callback.__call__" class="doc_header"><code>Callback.__call__</code><a href="__main__.py#L11" class="source_link" style="float:right">[source]</a></h4>

> <code>Callback.__call__</code>(**`event_name`**)

Call `self.{event_name}` if it's defined
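
The gating performed by `__call__` can be sketched in isolation. The function below is a simplified stand-in, not part of the library: `should_run` is a hypothetical name, `training` is passed explicitly instead of being looked up on the learner, and the `self.run` flag is left out.

```python
# Same event names as `_inner_loop` in the class definition.
INNER_LOOP = ("begin_batch after_pred after_loss after_backward "
              "after_step after_cancel_batch after_batch").split()

def should_run(event_name, training, run_train=True, run_valid=True):
    "Simplified mirror of the `_run` expression in `Callback.__call__`."
    return (event_name not in INNER_LOOP
            or (run_train and training)
            or (run_valid and not training))

# With `run_valid=False`, batch-level events are skipped during validation,
# while events outside the inner loop always run.
print(should_run('after_batch', training=True, run_valid=False))
print(should_run('after_batch', training=False, run_valid=False))
print(should_run('begin_validate', training=False, run_valid=False))
```

In other words, `run_train` and `run_valid` only affect the batch-level events; epoch-level and fit-level events are always dispatched as long as `run` is `True`.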

One way to define callbacks is through subclassing:

In [None]:
class _T(Callback):
    def call_me(self): return "maybe"
test_eq(_T()("call_me"), "maybe")

Another way is by passing the callback function to the constructor:

In [None]:
def cb(self): return "maybe"
_t = Callback(begin_fit=cb)
test_eq(_t(event.begin_fit), "maybe")

In [None]:
show_doc(Callback.__getattr__)

<h4 id="GetAttr.__getattr__" class="doc_header"><code>GetAttr.__getattr__</code><a href="https://github.com/fastai/fastcore/tree/master/fastcore/foundation.py#L237" class="source_link" style="float:right">[source]</a></h4>

> <code>GetAttr.__getattr__</code>(**`k`**)



This is a shortcut to avoid having to write `self.learn.bla` for any `bla` attribute we seek, and just write `self.bla`.

In [None]:
mk_class('TstLearner', 'a')

class TstCallback(Callback):
    def batch_begin(self): print(self.a)

learn,cb = TstLearner(1),TstCallback()
cb.learn = learn
test_stdout(lambda: cb('batch_begin'), "1")

Note that this only works for reading the value of an attribute. To change it, you have to access it explicitly with `self.learn.bla`. In the example below, `self.a += 1` creates an `a` attribute of 2 in the callback instead of setting the `a` of the learner to 2. It also issues a warning that something is probably wrong:

In [None]:
learn.a

1

In [None]:
class TstCallback(Callback):
    def batch_begin(self): self.a += 1

learn,cb = TstLearner(1),TstCallback()
cb.learn = learn
cb('batch_begin')
test_eq(cb.a, 2)
test_eq(cb.learn.a, 1)



A proper version needs to write `self.learn.a = self.a + 1`:

In [None]:
class TstCallback(Callback):
    def batch_begin(self): self.learn.a = self.a + 1

learn,cb = TstLearner(1),TstCallback()
cb.learn = learn
cb('batch_begin')
test_eq(cb.learn.a, 2)
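
The difference between reading and writing comes from how Python attribute lookup works: `__getattr__` is only consulted when normal lookup fails, whereas assignment always stores on the instance. The sketch below reproduces that behavior with plain Python; `Owner` and `Delegating` are hypothetical stand-ins and not the actual `GetAttr` implementation.

```python
class Owner:
    "Stand-in for a learner holding an attribute `a`."
    def __init__(self, a): self.a = a

class Delegating:
    "Reads fall through to `self.learn`; writes stay on the instance."
    learn = None  # class-level default so `self.learn` never triggers `__getattr__`

    def __getattr__(self, name):
        # Only called when normal lookup fails, so it never intercepts writes.
        if self.learn is not None and hasattr(self.learn, name):
            return getattr(self.learn, name)
        raise AttributeError(name)

owner, cb = Owner(1), Delegating()
cb.learn = owner

cb.a += 1                      # reads owner.a (1), then stores 2 on `cb` itself
after_shadow = (cb.a, owner.a)

cb.learn.a += 1                # explicit access updates the owner
after_explicit = (cb.a, owner.a)
print(after_shadow, after_explicit)
```

Once `cb.a` has been stored on the instance, it shadows the owner's attribute for every later read, which is why the warning in `__setattr__` is useful.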

In [None]:
show_doc(Callback.name, name='Callback.name')

<h4 id="Callback.name" class="doc_header"><code>Callback.name</code><a href="" class="source_link" style="float:right">[source]</a></h4>

Name of the [`Callback`](callback.core#Callback), snake-cased and with '*Callback*' removed

In [None]:
test_eq(TstCallback().name, 'tst')
class ComplicatedNameCallback(Callback): pass
test_eq(ComplicatedNameCallback().name, 'complicated_name')

### TrainEvalCallback -

In [None]:
#export
class TrainEvalCallback(Callback):
    "`Callback` that tracks the number of iterations done and properly sets training/eval mode"
    run_valid = False
    def begin_fit(self):
        "Set the iter and epoch counters to 0, put the model on the right device"
        self.learn.train_iter,self.learn.pct_train = 0,0.
        if hasattr(self.dls, 'device'): self.model.to(self.dls.device)
        if hasattr(self.model, 'reset'): self.model.reset()

    def after_batch(self):
        "Update the iter counter (in training mode)"
        self.learn.pct_train += 1./(self.n_iter*self.n_epoch)
        self.learn.train_iter += 1

    def begin_train(self):
        "Set the model in training mode"
        self.learn.pct_train=self.epoch/self.n_epoch
        self.model.train()
        self.learn.training=True

    def begin_validate(self):
        "Set the model in validation mode"
        self.model.eval()
        self.learn.training=False

In [None]:
show_doc(TrainEvalCallback, title_level=3)

<h3 id="TrainEvalCallback" class="doc_header"><code>class</code> <code>TrainEvalCallback</code><a href="" class="source_link" style="float:right">[source]</a></h3>

> <code>TrainEvalCallback</code>(**`begin_fit`**=*`None`*, **`begin_epoch`**=*`None`*, **`begin_train`**=*`None`*, **`begin_batch`**=*`None`*, **`after_pred`**=*`None`*, **`after_loss`**=*`None`*, **`after_backward`**=*`None`*, **`after_step`**=*`None`*, **`after_cancel_batch`**=*`None`*, **`after_batch`**=*`None`*, **`after_cancel_train`**=*`None`*, **`after_train`**=*`None`*, **`begin_validate`**=*`None`*, **`after_cancel_validate`**=*`None`*, **`after_validate`**=*`None`*, **`after_cancel_epoch`**=*`None`*, **`after_epoch`**=*`None`*, **`after_cancel_fit`**=*`None`*, **`after_fit`**=*`None`*) :: [`Callback`](callback.core#Callback)

[`Callback`](callback.core#Callback) that tracks the number of iterations done and properly sets training/eval mode

This `Callback` is automatically added in every `Learner` at initialization.
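
To see how the two counters evolve, the updates in `begin_train` and `after_batch` can be replayed by hand. This is a sketch with made-up sizes (`n_epoch=2`, `n_iter=5`) and plain variables instead of a `Learner`.

```python
n_epoch, n_iter = 2, 5            # hypothetical sizes, for illustration only
pct_train, train_iter = 0., 0     # values set in `begin_fit`

for epoch in range(n_epoch):
    pct_train = epoch / n_epoch   # `begin_train` snaps to the epoch boundary
    for _ in range(n_iter):
        # `after_batch`: one training batch is 1/(n_iter*n_epoch) of the fit
        pct_train += 1. / (n_iter * n_epoch)
        train_iter += 1

print(train_iter, round(pct_train, 6))
```

Because `run_valid` is `False`, `after_batch` is skipped during validation, so validation batches do not advance either counter.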

In [None]:
#hide
#test of the TrainEvalCallback below in Learner.fit

In [None]:
show_doc(TrainEvalCallback.begin_fit)

<h4 id="TrainEvalCallback.begin_fit" class="doc_header"><code>TrainEvalCallback.begin_fit</code><a href="__main__.py#L5" class="source_link" style="float:right">[source]</a></h4>

> <code>TrainEvalCallback.begin_fit</code>()

Set the iter and epoch counters to 0, put the model on the right device

In [None]:
show_doc(TrainEvalCallback.after_batch)

<h4 id="TrainEvalCallback.after_batch" class="doc_header"><code>TrainEvalCallback.after_batch</code><a href="__main__.py#L11" class="source_link" style="float:right">[source]</a></h4>

> <code>TrainEvalCallback.after_batch</code>()

Update the iter counter (in training mode)

In [None]:
show_doc(TrainEvalCallback.begin_train)

<h4 id="TrainEvalCallback.begin_train" class="doc_header"><code>TrainEvalCallback.begin_train</code><a href="__main__.py#L16" class="source_link" style="float:right">[source]</a></h4>

> <code>TrainEvalCallback.begin_train</code>()

Set the model in training mode

In [None]:
show_doc(TrainEvalCallback.begin_validate)

<h4 id="TrainEvalCallback.begin_validate" class="doc_header"><code>TrainEvalCallback.begin_validate</code><a href="__main__.py#L22" class="source_link" style="float:right">[source]</a></h4>

> <code>TrainEvalCallback.begin_validate</code>()

Set the model in validation mode

In [None]:
# export
if not hasattr(defaults, 'callbacks'): defaults.callbacks = [TrainEvalCallback]

### GatherPredsCallback -

In [None]:
#export
#TODO: save_targs and save_preds only handle preds/targets that have one tensor, not tuples of tensors.
class GatherPredsCallback(Callback):
    "`Callback` that saves the predictions and targets, optionally `with_loss`"
    def __init__(self, with_input=False, with_loss=False, save_preds=None, save_targs=None, concat_dim=0):
        store_attr(self, "with_input,with_loss,save_preds,save_targs,concat_dim")

    def begin_batch(self):
        if self.with_input: self.inputs.append((to_detach(self.xb)))

    def begin_validate(self):
        "Initialize containers"
        self.preds,self.targets = [],[]
        if self.with_input: self.inputs = []
        if self.with_loss:  self.losses = []

    def after_batch(self):
        "Save predictions, targets and potentially losses"
        preds,targs = to_detach(self.pred),to_detach(self.yb)
        if self.save_preds is None: self.preds.append(preds)
        else: (self.save_preds/str(self.iter)).save_array(preds)
        if self.save_targs is None: self.targets.append(targs)
        else: (self.save_targs/str(self.iter)).save_array(targs[0])
        if self.with_loss:
            bs = find_bs(self.yb)
            loss = self.loss if self.loss.numel() == bs else self.loss.view(bs,-1).mean(1)
            self.losses.append(to_detach(loss))

    def after_validate(self):
        "Concatenate all recorded tensors"
        if self.with_input:     self.inputs  = detuplify(to_concat(self.inputs, dim=self.concat_dim))
        if not self.save_preds: self.preds   = detuplify(to_concat(self.preds, dim=self.concat_dim))
        if not self.save_targs: self.targets = detuplify(to_concat(self.targets, dim=self.concat_dim))
        if self.with_loss:      self.losses  = to_concat(self.losses)

    def all_tensors(self):
        res = [None if self.save_preds else self.preds, None if self.save_targs else self.targets]
        if self.with_input: res = [self.inputs] + res
        if self.with_loss:  res.append(self.losses)
        return res

In [None]:
show_doc(GatherPredsCallback, title_level=3)

<h3 id="GatherPredsCallback" class="doc_header"><code>class</code> <code>GatherPredsCallback</code><a href="" class="source_link" style="float:right">[source]</a></h3>

> <code>GatherPredsCallback</code>(**`with_input`**=*`False`*, **`with_loss`**=*`False`*, **`save_preds`**=*`None`*, **`save_targs`**=*`None`*, **`concat_dim`**=*`0`*) :: [`Callback`](callback.core#Callback)

[`Callback`](callback.core#Callback) that saves the predictions and targets, optionally `with_loss`

In [None]:
show_doc(GatherPredsCallback.begin_validate)

<h4 id="GatherPredsCallback.begin_validate" class="doc_header"><code>GatherPredsCallback.begin_validate</code><a href="__main__.py#L11" class="source_link" style="float:right">[source]</a></h4>

> <code>GatherPredsCallback.begin_validate</code>()

Initialize containers

In [None]:
show_doc(GatherPredsCallback.after_batch)

<h4 id="GatherPredsCallback.after_batch" class="doc_header"><code>GatherPredsCallback.after_batch</code><a href="__main__.py#L17" class="source_link" style="float:right">[source]</a></h4>

> <code>GatherPredsCallback.after_batch</code>()

Save predictions, targets and potentially losses

In [None]:
show_doc(GatherPredsCallback.after_validate)

<h4 id="GatherPredsCallback.after_validate" class="doc_header"><code>GatherPredsCallback.after_validate</code><a href="__main__.py#L29" class="source_link" style="float:right">[source]</a></h4>

> <code>GatherPredsCallback.after_validate</code>()

Concatenate all recorded tensors

In [None]:
#export
class FetchPredsCallback(Callback):
    "A callback to fetch predictions during the training loop"
    remove_on_fetch = True
    def __init__(self, ds_idx=1, dl=None, with_input=False, with_decoded=False, cbs=None):
        self.cbs = L(cbs)
        store_attr(self, 'ds_idx,dl,with_input,with_decoded')

    def after_validate(self):
        to_rm = L(cb for cb in self.learn.cbs if getattr(cb, 'remove_on_fetch', False))
        with self.learn.removed_cbs(to_rm + self.cbs) as learn:
            self.preds = learn.get_preds(ds_idx=self.ds_idx, dl=self.dl,
                with_input=self.with_input, with_decoded=self.with_decoded, inner=True)

In [None]:
show_doc(FetchPredsCallback, title_level=3)

<h3 id="FetchPredsCallback" class="doc_header"><code>class</code> <code>FetchPredsCallback</code><a href="" class="source_link" style="float:right">[source]</a></h3>

> <code>FetchPredsCallback</code>(**`ds_idx`**=*`1`*, **`dl`**=*`None`*, **`with_input`**=*`False`*, **`with_decoded`**=*`False`*, **`cbs`**=*`None`*) :: [`Callback`](callback.core#Callback)

A callback to fetch predictions during the training loop

When writing a callback, the following attributes of `Learner` are available:
- `model`: the model used for training/validation
- `dls`: the underlying `DataLoaders`
- `loss_func`: the loss function used
- `opt`: the optimizer used to update the model parameters
- `opt_func`: the function used to create the optimizer
- `cbs`: the list containing all `Callback`s
- `dl`: current `DataLoader` used for iteration
- `x`/`xb`: last input drawn from `self.dl` (potentially modified by callbacks). `xb` is always a tuple (potentially with one element) and `x` is detuplified. You can only assign to `xb`.
- `y`/`yb`: last target drawn from `self.dl` (potentially modified by callbacks). `yb` is always a tuple (potentially with one element) and `y` is detuplified. You can only assign to `yb`.
- `pred`: last predictions from `self.model` (potentially modified by callbacks)
- `loss`: last computed loss (potentially modified by callbacks)
- `n_epoch`: the number of epochs in this training
- `n_iter`: the number of iterations in the current `self.dl`
- `epoch`: the current epoch index (from 0 to `n_epoch-1`)
- `iter`: the current iteration index in `self.dl` (from 0 to `n_iter-1`)

The following attributes are added by `TrainEvalCallback` and should be available unless you went out of your way to remove that callback:

- `train_iter`: the number of training iterations done since the beginning of this training
- `pct_train`: the fraction of training iterations completed, from 0. to 1.
- `training`:  flag to indicate if we're in training mode or not

The following attribute is added by `Recorder` and should be available unless you went out of your way to remove that callback:

- `smooth_loss`: an exponentially-averaged version of the training loss

## Callbacks control flow

We may sometimes want to skip some of the steps of the training loop: in gradient accumulation, for instance, we don't always want to do the step/zeroing of the grads. During an LR finder test, we don't want to do the validation phase of an epoch. Or if we're training with an early stopping strategy, we want to be able to interrupt the training loop completely.

This is made possible by raising specific exceptions the training loop will look for (and properly catch).

In [None]:
#export
_ex_docs = dict(
    CancelBatchException="Skip the rest of this batch and go to `after_batch`",
    CancelTrainException="Skip the rest of the training part of the epoch and go to `after_train`",
    CancelValidException="Skip the rest of the validation part of the epoch and go to `after_validate`",
    CancelEpochException="Skip the rest of this epoch and go to `after_epoch`",
    CancelFitException="Interrupt training and go to `after_fit`")

for c,d in _ex_docs.items(): mk_class(c,sup=Exception,doc=d)

In [None]:
show_doc(CancelBatchException, title_level=3)

<h3 id="CancelBatchException" class="doc_header"><code>class</code> <code>CancelBatchException</code><a href="" class="source_link" style="float:right">[source]</a></h3>

> <code>CancelBatchException</code>(**\*`args`**, **\*\*`kwargs`**) :: `Exception`

Skip the rest of this batch and go to `after_batch`

In [None]:
show_doc(CancelTrainException, title_level=3)

<h3 id="CancelTrainException" class="doc_header"><code>class</code> <code>CancelTrainException</code><a href="" class="source_link" style="float:right">[source]</a></h3>

> <code>CancelTrainException</code>(**\*`args`**, **\*\*`kwargs`**) :: `Exception`

Skip the rest of the training part of the epoch and go to `after_train`

In [None]:
show_doc(CancelValidException, title_level=3)

<h3 id="CancelValidException" class="doc_header"><code>class</code> <code>CancelValidException</code><a href="" class="source_link" style="float:right">[source]</a></h3>

> <code>CancelValidException</code>(**\*`args`**, **\*\*`kwargs`**) :: `Exception`

Skip the rest of the validation part of the epoch and go to `after_validate`

In [None]:
show_doc(CancelEpochException, title_level=3)

<h3 id="CancelEpochException" class="doc_header"><code>class</code> <code>CancelEpochException</code><a href="" class="source_link" style="float:right">[source]</a></h3>

> <code>CancelEpochException</code>(**\*`args`**, **\*\*`kwargs`**) :: `Exception`

Skip the rest of this epoch and go to `after_epoch`

In [None]:
show_doc(CancelFitException, title_level=3)

<h3 id="CancelFitException" class="doc_header"><code>class</code> <code>CancelFitException</code><a href="" class="source_link" style="float:right">[source]</a></h3>

> <code>CancelFitException</code>(**\*`args`**, **\*\*`kwargs`**) :: `Exception`

Interrupt training and go to `after_fit`

You can detect that one of those exceptions occurred and add code that executes right after it with the following events:
- `after_cancel_batch`: reached immediately after a `CancelBatchException` before proceeding to `after_batch`
- `after_cancel_train`: reached immediately after a `CancelTrainException` before proceeding to `after_train`
- `after_cancel_validate`: reached immediately after a `CancelValidException` before proceeding to `after_validate`
- `after_cancel_epoch`: reached immediately after a `CancelEpochException` before proceeding to `after_epoch`
- `after_cancel_fit`: reached immediately after a `CancelFitException` before proceeding to `after_fit`
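
A minimal sketch of how such a cancellation could be handled for a single batch is shown here. It assumes the loop wraps each batch in `try`/`except`/`finally`, which is consistent with the ordering described above; `CancelBatch` and `one_batch` are hypothetical stand-ins, not the library's classes.

```python
class CancelBatch(Exception):
    "Local stand-in for `CancelBatchException`."

def one_batch(fire, cancel_after=None):
    "Fire the events of one training batch, optionally cancelling after `cancel_after`."
    try:
        for name in ['begin_batch', 'after_pred', 'after_loss',
                     'after_backward', 'after_step']:
            fire(name)
            # A real callback would raise from inside its event handler.
            if name == cancel_after: raise CancelBatch()
    except CancelBatch:
        fire('after_cancel_batch')   # only reached when the batch was cancelled
    finally:
        fire('after_batch')          # always reached, cancelled or not

fired = []
one_batch(fired.append, cancel_after='after_backward')
print(fired)
```

Cancelling after `after_backward` skips the optimizer step for that batch, which is the kind of behavior gradient accumulation needs, while `after_batch` still runs so that clean-up code is not bypassed.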

## Export -

In [None]:
#hide
from nbdev.export import notebook2script
notebook2script()
