{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Writing Custom Dataset Importers\n", "\n", "This recipe demonstrates how to write a [custom DatasetImporter](https://voxel51.com/docs/fiftyone/user_guide/import_datasets.html#custom-dataset-importer) and use it to load a dataset from disk in your custom format into FiftyOne." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Setup\n", "\n", "If you haven't already, install FiftyOne:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!pip install fiftyone" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this recipe we'll use the [FiftyOne Dataset Zoo](https://docs.voxel51.com/dataset_zoo/index.html) to download the [CIFAR-10 dataset](https://www.cs.toronto.edu/~kriz/cifar.html) to use as sample data to feed our custom importer.\n", "\n", "Behind the scenes, FiftyOne either uses the\n", "[TensorFlow Datasets](https://www.tensorflow.org/datasets) or\n", "[TorchVision Datasets](https://pytorch.org/vision/stable/datasets.html) libraries to wrangle the datasets, depending on which ML library you have installed.\n", "\n", "You can, for example, install PyTorch as follows:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!pip install torch torchvision" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Writing a DatasetImporter\n", "\n", "FiftyOne provides a [DatasetImporter](https://voxel51.com/docs/fiftyone/api/fiftyone.utils.data.html#fiftyone.utils.data.importers.DatasetImporter) interface that defines how it imports datasets from disk when methods such as [Dataset.from_importer()](https://voxel51.com/docs/fiftyone/api/fiftyone.core.html#fiftyone.core.dataset.Dataset.from_importer) are used.\n", "\n", "`DatasetImporter` itself is an abstract interface; the concrete interface that you should implement is determined by the type of dataset that you are importing. See [writing a custom DatasetImporter](https://voxel51.com/docs/fiftyone/user_guide/import_datasets.html#custom-dataset-importer) for full details.\n", "\n", "In this recipe, we'll write a custom [LabeledImageDatasetImporter](https://voxel51.com/docs/fiftyone/api/fiftyone.utils.data.html#fiftyone.utils.data.importers.LabeledImageDatasetImporter) that can import an image classification dataset whose image metadata and labels are stored in a `labels.csv` file in the dataset directory with the following format:\n", "\n", "```\n", "filepath,size_bytes,mime_type,width,height,num_channels,label\n", ",,,,,,