{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Writing Custom Dataset Exporters\n", "\n", "This recipe demonstrates how to write a [custom DatasetExporter](https://voxel51.com/docs/fiftyone/user_guide/export_datasets.html#custom-formats) and use it to export a FiftyOne dataset to disk in your custom format." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Setup\n", "\n", "If you haven't already, install FiftyOne:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!pip install fiftyone" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "In this recipe we'll use the [FiftyOne Dataset Zoo](https://voxel51.com/docs/fiftyone/user_guide/dataset_creation/zoo_datasets.html) to download the [CIFAR-10 dataset](https://www.cs.toronto.edu/~kriz/cifar.html) to use as sample data to feed our custom exporter.\n", "\n", "Behind the scenes, FiftyOne uses either the\n", "[TensorFlow Datasets](https://www.tensorflow.org/datasets) or\n", "[TorchVision Datasets](https://pytorch.org/vision/stable/datasets.html) libraries to wrangle the datasets, depending on which ML library you have installed.\n", "\n", "You can, for example, install PyTorch as follows:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!pip install torch torchvision" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Writing a DatasetExporter\n", "\n", "FiftyOne provides a [DatasetExporter](https://voxel51.com/docs/fiftyone/api/fiftyone.utils.data.html#fiftyone.utils.data.exporters.DatasetExporter) interface that defines how it exports datasets to disk when methods such as [Dataset.export()](https://voxel51.com/docs/fiftyone/api/fiftyone.core.html#fiftyone.core.dataset.Dataset.export) are used.\n", "\n", "`DatasetExporter` itself is an abstract interface; the concrete interface that you should implement is determined by the type of dataset that you are exporting. See [writing a custom DatasetExporter](https://voxel51.com/docs/fiftyone/user_guide/export_datasets.html#custom-formats) for full details.\n", "\n", "In this recipe, we'll write a custom [LabeledImageDatasetExporter](https://voxel51.com/docs/fiftyone/api/fiftyone.utils.data.html#fiftyone.utils.data.exporters.LabeledImageDatasetExporter) that can export an image classification dataset to disk in the following format:\n", "\n", "```\n", "/\n", " data/\n", " .\n", " .\n", " ...\n", " labels.csv\n", "```\n", "\n", "where `labels.csv` is a CSV file that contains the image metadata and associated labels in the following format:\n", "\n", "```\n", "filepath,size_bytes,mime_type,width,height,num_channels,label\n", ",,,,,,