{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Converting tags to classifications\n", "\n", "When working in the FiftyOne App, you can interactively [create, delete, and modify tags](https://voxel51.com/docs/fiftyone/user_guide/app.html#tags-and-tagging) right from the browser. When working with [Classification](https://voxel51.com/docs/fiftyone/user_guide/using_datasets.html#classification) labels, it is natural to want to annotate them from the App the same way you annotate tags.\n", "\n", "The primary distinction between [Classifications](https://voxel51.com/docs/fiftyone/user_guide/using_datasets.html#multilabel-classification) and [tags](https://voxel51.com/docs/fiftyone/user_guide/using_datasets.html#tags) is that Classifications can carry an arbitrary number of attributes, which is why it is generally recommended to use the [integrations with annotation tools like CVAT](https://voxel51.com/docs/fiftyone/integrations/cvat.html) to annotate these labels. Another benefit of the [Classification](https://voxel51.com/docs/fiftyone/user_guide/using_datasets.html#classification) label type is that it gives you access to the extensive [evaluation capabilities](https://voxel51.com/docs/fiftyone/user_guide/evaluation.html#classifications) of FiftyOne.\n", "\n", "However, if you are only populating the `label` of Classifications, then it can make sense to treat them like tags. This example shows how to convert the `label` of Classifications to tags, modify them in the App, and then convert tags back to Classifications." ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "Start by [installing FiftyOne](https://voxel51.com/docs/fiftyone/getting_started/install.html) if you haven't already:\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!pip install fiftyone" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then load the `quickstart` example dataset from the FiftyOne Dataset Zoo." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Dataset already downloaded\n", "Loading 'quickstart'\n", " 100% |█████████████████| 200/200 [3.6s elapsed, 0s remaining, 51.3 samples/s] \n", "Dataset 'quickstart' created\n" ] } ], "source": [ "import fiftyone as fo\n", "import fiftyone.zoo as foz\n", "from fiftyone import ViewField as F\n", "\n", "dataset = foz.load_zoo_dataset(\"quickstart\")" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "Since this dataset doesn't contain classifications, let's start by converting the existing tags to classifications. The idea here is to efficiently load all tags into memory using the [values()](https://voxel51.com/docs/fiftyone/user_guide/using_aggregations.html#values) aggregation, convert the tags to classification labels, and then set them on a field of the dataset with `set_values()`."
] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# Convert all tags to Classifications\n", "def tags_to_classifications(dataset, classifications_field):\n", "    # Load every sample's tags into memory with a single aggregation\n", "    tags = dataset.values(\"tags\")\n", "    classifications = []\n", "    for sample_tags in tags:\n", "        cls = []\n", "        if sample_tags:\n", "            for tag in sample_tags:\n", "                cls.append(fo.Classification(label=tag))\n", "        classifications.append(fo.Classifications(classifications=cls))\n", "\n", "    # Write one Classifications label per sample, in dataset order\n", "    dataset.set_values(classifications_field, classifications)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "Note: you could also add logic here to choose a specific tag (such as the first in the list) and create a Classification field rather than a Classifications field." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "field_name = \"classifications\"\n", "tags_to_classifications(dataset, field_name)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's launch the App and modify some tags. For example, we can change the tag of the first two samples from `validation` to `test`." ] },
{ "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "
\n", "\n", "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "session = fo.launch_app(dataset)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "session.freeze() # Screenshot the App for this example" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "After editing tags in the App, we can rerun the conversion so that the classifications pick up the changes." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "tags_to_classifications(dataset, field_name)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ ",\n", " ]),\n", " 'logits': None,\n", "}>" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dataset.first()[field_name]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As you can see, the classification labels have now been updated to match the tags." ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "We can also go the other way and convert existing classifications to tags. Since tags are simply a list of strings rather than FiftyOne Label objects, a simple `ViewExpression` that extracts each label is enough to set them." ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [], "source": [ "# Convert Classifications to tags\n", "def classifications_to_tags(dataset, classifications_field):\n", "    # Declare the field if the dataset does not have it yet\n", "    if not dataset.has_sample_field(classifications_field):\n", "        dataset.add_sample_field(\n", "            classifications_field,\n", "            fo.EmbeddedDocumentField,\n", "            embedded_doc_type=fo.Classifications,\n", "        )\n", "\n", "    # Replace each sample's tags with the list of its classification labels\n", "    view = dataset.set_field(\"tags\", F(classifications_field + \".classifications.label\"))\n", "    view.save(fields=\"tags\")" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "First, let's delete the existing tags on the dataset."
] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Name: quickstart\n", "Media type: image\n", "Num samples: 200\n", "Persistent: False\n", "Tags: ['test', 'validation']\n", "Sample fields:\n", " id: fiftyone.core.fields.ObjectIdField\n", " filepath: fiftyone.core.fields.StringField\n", " tags: fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)\n", " metadata: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.ImageMetadata)\n", " ground_truth: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)\n", " uniqueness: fiftyone.core.fields.FloatField\n", " predictions: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)\n", " classifications: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Classifications)" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dataset" ] },
{ "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "dataset.untag_samples(dataset.distinct(\"tags\"))" ] },
{ "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Name: quickstart\n", "Media type: image\n", "Num samples: 200\n", "Persistent: False\n", "Tags: []\n", "Sample fields:\n", " id: fiftyone.core.fields.ObjectIdField\n", " filepath: fiftyone.core.fields.StringField\n", " tags: fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)\n", " metadata: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.ImageMetadata)\n", " ground_truth: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)\n", " uniqueness: fiftyone.core.fields.FloatField\n", " predictions: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)\n", " classifications: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Classifications)" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dataset" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "Now convert the classifications to tags:" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [], "source": [ "classifications_to_tags(dataset, field_name)" ] },
{ "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Name: quickstart\n", "Media type: image\n", "Num samples: 200\n", "Persistent: False\n", "Tags: ['test', 'validation']\n", "Sample fields:\n", " id: fiftyone.core.fields.ObjectIdField\n", " filepath: fiftyone.core.fields.StringField\n", " tags: fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)\n", " metadata: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.ImageMetadata)\n", " ground_truth: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)\n", " uniqueness: fiftyone.core.fields.FloatField\n", " predictions: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)\n", " classifications: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Classifications)" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dataset" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "As you can see, the dataset has the tags `test` and `validation` again." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.13" } }, "nbformat": 4, "nbformat_minor": 2 }