{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Get your data ready for training" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This module defines the basic [`DataBunch`](/basic_data.html#DataBunch) object that is used inside [`Learner`](/basic_train.html#Learner) to train a model. It is a generic class that accepts any kind of fastai [`Dataset`](https://pytorch.org/docs/stable/data.html#torch.utils.data.Dataset) or [`DataLoader`](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader). The data module of each application provides helper functions that create a [`DataBunch`](/basic_data.html#DataBunch) for you directly." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [], "source": [ "from fastai.gen_doc.nbdoc import *\n", "from fastai.basic_data import * " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "
class DataBunch(`train_dl`:[`DataLoader`](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader), `valid_dl`:[`DataLoader`](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader), `test_dl`:`Optional`\\[[`DataLoader`](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader)\\]=`None`, `device`:[`device`](https://pytorch.org/docs/stable/tensor_attributes.html#torch-device)=`None`, `tfms`:`Optional`\\[`Collection`\\[`Callable`\\]\\]=`None`, `path`:`PathOrStr`=`'.'`, `collate_fn`:`Callable`=`'data_collate'`)"
],
"text/plain": [
"create(`train_ds`:[`Dataset`](https://pytorch.org/docs/stable/data.html#torch.utils.data.Dataset), `valid_ds`:[`Dataset`](https://pytorch.org/docs/stable/data.html#torch.utils.data.Dataset), `test_ds`:[`Dataset`](https://pytorch.org/docs/stable/data.html#torch.utils.data.Dataset)=`None`, `path`:`PathOrStr`=`'.'`, `bs`:`int`=`64`, `num_workers`:`int`=`4`, `tfms`:`Optional`\\[`Collection`\\[`Callable`\\]\\]=`None`, `device`:[`device`](https://pytorch.org/docs/stable/tensor_attributes.html#torch-device)=`None`, `collate_fn`:`Callable`=`'data_collate'`) → `DataBunch`"
],
"text/plain": [
"dl(`ds_type`:[`DatasetType`](/basic_data.html#DatasetType)=`…`)"
],
"text/plain": [
"add_tfm(`tfm`:`Callable`)"
],
"text/plain": [
"class DeviceDataLoader(`dl`:[`DataLoader`](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader), `device`:[`device`](https://pytorch.org/docs/stable/tensor_attributes.html#torch-device), `tfms`:`List`\\[`Callable`\\]=`None`, `collate_fn`:`Callable`=`'data_collate'`, `skip_size1`:`bool`=`False`)"
],
"text/plain": [
"create(`dataset`:[`Dataset`](https://pytorch.org/docs/stable/data.html#torch.utils.data.Dataset), `bs`:`int`=`64`, `shuffle`:`bool`=`False`, `device`:[`device`](https://pytorch.org/docs/stable/tensor_attributes.html#torch-device)=`device(type='cuda')`, `tfms`:`Collection`\\[`Callable`\\]=`None`, `num_workers`:`int`=`4`, `collate_fn`:`Callable`=`'data_collate'`, `kwargs`:`Any`)"
],
"text/plain": [
"one_batch() → `Collection`\\[`Tensor`\\]\n",
"\n",
"Get one batch from the data loader. "
],
"text/plain": [
"add_tfm(`tfm`:`Callable`)"
],
"text/plain": [
"remove_tfm(`tfm`:`Callable`)"
],
"text/plain": [
"DatasetType: Enum = [Train, Valid, Test]"
],
"text/plain": [
"class DatasetBase(`x`:`Collection`=`None`, `y`:`Collection`=`None`, `classes`:`Collection`=`None`, `c`:`Optional`\\[`int`\\]=`None`, `task_type`:[`TaskType`](/basic_data.html#TaskType)=`None`, `class2idx`:`Dict`\\[`Any`, `int`\\]=`None`, `as_array`:`bool`=`True`, `do_encode_y`:`bool`=`True`) :: [`Dataset`](https://pytorch.org/docs/stable/data.html#torch.utils.data.Dataset)\n",
"\n",
"Base class for all fastai datasets. "
],
"text/plain": [
"class SingleClassificationDataset(`classes`) :: [`DatasetBase`](/basic_data.html#DatasetBase)\n",
"\n",
"A [`Dataset`](https://pytorch.org/docs/stable/data.html#torch.utils.data.Dataset) that contains no data, only `classes`; mainly used for inference with `set_item`. "
],
"text/plain": [
"TaskType: Enum = [No, Single, Multi, Regression]"
],
"text/plain": [
"proc_batch(`b`:`Tensor`) → `Tensor`\n",
"\n",
"Process batch `b` of `TensorImage`. "
],
"text/plain": [
"data_collate(`batch`:`ItemsList`) → `Tensor`\n",
"\n",
"Convert `batch` items to tensor data. "
],
"text/plain": [
"