{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Get your data ready for training" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This module defines the basic [`DataBunch`](/basic_data.html#DataBunch) object that is used inside [`Learner`](/basic_train.html#Learner) to train a model. It is a generic class that can take any kind of fastai [`Dataset`](https://pytorch.org/docs/stable/data.html#torch.utils.data.Dataset) or [`DataLoader`](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader). The data module of every application provides helper functions that create this [`DataBunch`](/basic_data.html#DataBunch) directly for you." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [], "source": [ "from fastai.gen_doc.nbdoc import *\n", "from fastai.basics import *" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "
class DataBunch[source]DataBunch(**`train_dl`**:[`DataLoader`](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader), **`valid_dl`**:[`DataLoader`](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader), **`fix_dl`**:[`DataLoader`](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader)=***`None`***, **`test_dl`**:`Optional`\\[[`DataLoader`](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader)\\]=***`None`***, **`device`**:[`device`](https://pytorch.org/docs/stable/tensor_attributes.html#torch-device)=***`None`***, **`dl_tfms`**:`Optional`\\[`Collection`\\[`Callable`\\]\\]=***`None`***, **`path`**:`PathOrStr`=***`'.'`***, **`collate_fn`**:`Callable`=***`'data_collate'`***, **`no_check`**:`bool`=***`False`***)\n",
"\n",
"Bind `train_dl`, `valid_dl` and `test_dl` in a data object. "
],
"text/plain": [
"create[source]create(**`train_ds`**:[`Dataset`](https://pytorch.org/docs/stable/data.html#torch.utils.data.Dataset), **`valid_ds`**:[`Dataset`](https://pytorch.org/docs/stable/data.html#torch.utils.data.Dataset), **`test_ds`**:`Optional`\\[[`Dataset`](https://pytorch.org/docs/stable/data.html#torch.utils.data.Dataset)\\]=***`None`***, **`path`**:`PathOrStr`=***`'.'`***, **`bs`**:`int`=***`64`***, **`val_bs`**:`int`=***`None`***, **`num_workers`**:`int`=***`4`***, **`dl_tfms`**:`Optional`\\[`Collection`\\[`Callable`\\]\\]=***`None`***, **`device`**:[`device`](https://pytorch.org/docs/stable/tensor_attributes.html#torch-device)=***`None`***, **`collate_fn`**:`Callable`=***`'data_collate'`***, **`no_check`**:`bool`=***`False`***) → `DataBunch`\n",
"\n",
"Create a [`DataBunch`](/basic_data.html#DataBunch) from `train_ds`, `valid_ds` and maybe `test_ds` with a batch size of `bs`. "
],
"text/plain": [
"show_batch[source]show_batch(**`rows`**:`int`=***`5`***, **`ds_type`**:[`DatasetType`](/basic_data.html#DatasetType)=***`dl[source]dl(**`ds_type`**:[`DatasetType`](/basic_data.html#DatasetType)=***`one_batch[source]one_batch(**`ds_type`**:[`DatasetType`](/basic_data.html#DatasetType)=***`one_item[source]one_item(**`item`**, **`detach`**:`bool`=***`False`***, **`denorm`**:`bool`=***`False`***, **`cpu`**:`bool`=***`False`***)\n",
"\n",
"Get `item` into a batch. Optionally `detach` and `denorm`. "
],
"text/plain": [
"sanity_check[source]sanity_check()\n",
"\n",
"Check the underlying data in the training set can be properly loaded. "
],
"text/plain": [
"export[source]export(**`fname`**:`str`=***`'export.pkl'`***)\n",
"\n",
"Export the minimal state of `self` for inference in `self.path/fname`. "
],
"text/plain": [
"load_empty[source]load_empty(**`path`**, **`fname`**:`str`=***`'export.pkl'`***)\n",
"\n",
"Load an empty [`DataBunch`](/basic_data.html#DataBunch) from the exported file in `path/fname` with optional `tfms`. "
],
"text/plain": [
"add_tfm[source]add_tfm(**`tfm`**:`Callable`)"
],
"text/plain": [
"class DeviceDataLoader[source]DeviceDataLoader(**`dl`**:[`DataLoader`](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader), **`device`**:[`device`](https://pytorch.org/docs/stable/tensor_attributes.html#torch-device), **`tfms`**:`List`\\[`Callable`\\]=***`None`***, **`collate_fn`**:`Callable`=***`'data_collate'`***)\n",
"\n",
"Bind a [`DataLoader`](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader) to a [`torch.device`](https://pytorch.org/docs/stable/tensor_attributes.html#torch-device). "
],
"text/plain": [
"create[source]create(**`dataset`**:[`Dataset`](https://pytorch.org/docs/stable/data.html#torch.utils.data.Dataset), **`bs`**:`int`=***`64`***, **`shuffle`**:`bool`=***`False`***, **`device`**:[`device`](https://pytorch.org/docs/stable/tensor_attributes.html#torch-device)=***`device(type='cuda')`***, **`tfms`**:`Collection`\\[`Callable`\\]=***`None`***, **`num_workers`**:`int`=***`4`***, **`collate_fn`**:`Callable`=***`'data_collate'`***, **\\*\\*`kwargs`**:`Any`)\n",
"\n",
"Create a `DeviceDataLoader` from `dataset` with batch size `bs` and `shuffle`, loading data with `num_workers` worker processes. "
],
"text/plain": [
"add_tfm[source]add_tfm(**`tfm`**:`Callable`)\n",
"\n",
"Add `tfm` to `self.tfms`. "
],
"text/plain": [
"remove_tfm[source]remove_tfm(**`tfm`**:`Callable`)\n",
"\n",
"Remove `tfm` from `self.tfms`. "
],
"text/plain": [
"new[source]new(**\\*\\*`kwargs`**)\n",
"\n",
"Create a new copy of `self` with `kwargs` replacing current values. "
],
"text/plain": [
"proc_batch[source]proc_batch(**`b`**:`Tensor`) → `Tensor`\n",
"\n",
"Process batch `b` of `TensorImage`. "
],
"text/plain": [
"Enum = [Train, Valid, Test, Single, Fix]"
],
"text/plain": [
"data_collate[source]data_collate(**`batch`**:`ItemsList`) → `Tensor`\n",
"\n",
"Convert `batch` items to tensor data. "
],
"text/plain": [
"