{ "cells": [ { "cell_type": "markdown", "id": "SwSYWR4vzk_e", "metadata": { "id": "SwSYWR4vzk_e", "tags": [] }, "source": [ "# Analyzing Image Classification Dataset\n", "\n", "[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/visual-layer/fastdup/blob/main/examples/analyzing-image-classification-dataset.ipynb)\n", "[![Open in Kaggle](https://kaggle.com/static/images/open-in-kaggle.svg)](https://kaggle.com/kernels/welcome?src=https://github.com/visual-layer/fastdup/blob/main/examples/analyzing-image-classification-dataset.ipynb)\n", "\n", "This notebook shows how you can use [fastdup](https://github.com/visual-layer/fastdup) to analyze an image classification dataset for:\n", "\n", "+ Duplicates.\n", "+ Outliers.\n", "+ Wrong labels.\n", "+ Image clusters.\n", "\n", "If you're new, run the notebook in Google Colab or Kaggle for free.\n", "\n", "> **Note** - No GPU needed! You can run on an instance with only CPU.\n", "\n" ] }, { "cell_type": "markdown", "id": "bbed0117-e8d1-4df6-b8b7-7bcce10b8655", "metadata": { "tags": [] }, "source": [ "## Installation\n", "\n", "First let's install [fastdup](https://github.com/visual-layer/fastdup) from PyPI with:" ] }, { "cell_type": "code", "execution_count": 1, "id": "506e82b4-a1c2-4262-a326-d0924bb018b6", "metadata": { "id": "506e82b4-a1c2-4262-a326-d0924bb018b6" }, "outputs": [], "source": [ "!pip install -Uqq fastdup" ] }, { "cell_type": "markdown", "id": "a5c3a1ab", "metadata": {}, "source": [ "Now, test the installation. If there's no error message, we are ready to go." 
] }, { "cell_type": "code", "execution_count": 2, "id": "7f69d8b2", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'0.930'" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import fastdup\n", "fastdup.__version__" ] }, { "cell_type": "markdown", "id": "8a79fb1b-b089-4d4d-8fa8-3e2b2ef7f886", "metadata": { "id": "8a79fb1b-b089-4d4d-8fa8-3e2b2ef7f886", "tags": [] }, "source": [ "## Download Dataset\n", "\n", "We will analyze the [Imagenette](https://github.com/fastai/imagenette) dataset - a subset of 10 easily classified classes from Imagenet (tench, English springer, cassette player, chain saw, church, French horn, garbage truck, gas pump, golf ball, parachute)." ] }, { "cell_type": "code", "execution_count": 3, "id": "be5b7ca5-34f5-4a0f-b081-2e78be6a425a", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "--2023-05-16 08:53:02-- https://s3.amazonaws.com/fast-ai-imageclas/imagenette2-160.tgz\n", "Resolving s3.amazonaws.com (s3.amazonaws.com)... 52.216.251.158, 52.217.83.238, 52.217.96.150, ...\n", "Connecting to s3.amazonaws.com (s3.amazonaws.com)|52.216.251.158|:443... connected.\n", "HTTP request sent, awaiting response... 
200 OK\n", "Length: 99003388 (94M) [application/x-tar]\n", "Saving to: ‘imagenette2-160.tgz.1’\n", "\n", "imagenette2-160.tgz 100%[===================>] 94.42M 46.0MB/s in 2.1s \n", "\n", "2023-05-16 08:53:04 (46.0 MB/s) - ‘imagenette2-160.tgz.1’ saved [99003388/99003388]\n", "\n" ] } ], "source": [ "!wget https://s3.amazonaws.com/fast-ai-imageclas/imagenette2-160.tgz\n", "!tar -xf imagenette2-160.tgz" ] }, { "cell_type": "markdown", "id": "f01586fe-db75-4154-aa15-9ea2709c9461", "metadata": { "id": "f01586fe-db75-4154-aa15-9ea2709c9461" }, "source": [ "## Load and Format Annotations" ] }, { "cell_type": "code", "execution_count": 4, "id": "ff90fe31-7c39-46c5-8c58-3ae349fbcc91", "metadata": { "executionInfo": { "elapsed": 949, "status": "ok", "timestamp": 1677666765166, "user": { "displayName": "Tom Shani", "userId": "00667426488827942961" }, "user_tz": -120 }, "id": "ff90fe31-7c39-46c5-8c58-3ae349fbcc91", "tags": [] }, "outputs": [], "source": [ "import pandas as pd" ] }, { "cell_type": "code", "execution_count": 5, "id": "21d2474d-3fa5-4148-a0f1-ea8d55d63b85", "metadata": { "executionInfo": { "elapsed": 2, "status": "ok", "timestamp": 1677666768281, "user": { "displayName": "Tom Shani", "userId": "00667426488827942961" }, "user_tz": -120 }, "id": "21d2474d-3fa5-4148-a0f1-ea8d55d63b85", "tags": [] }, "outputs": [], "source": [ "data_dir = 'imagenette2-160/'\n", "csv_path = 'imagenette2-160/noisy_imagenette.csv'" ] }, { "cell_type": "code", "execution_count": 6, "id": "2cb91ccb-9cb6-42ba-9489-96182eccc583", "metadata": { "executionInfo": { "elapsed": 2, "status": "ok", "timestamp": 1677666769859, "user": { "displayName": "Tom Shani", "userId": "00667426488827942961" }, "user_tz": -120 }, "id": "2cb91ccb-9cb6-42ba-9489-96182eccc583", "tags": [] }, "outputs": [], "source": [ "label_map = {\n", " 'n02979186': 'cassette_player', \n", " 'n03417042': 'garbage_truck', \n", " 'n01440764': 'tench', \n", " 'n02102040': 'English_springer', \n", " 'n03028079': 'church',\n", " 
'n03888257': 'parachute', \n", " 'n03394916': 'French_horn', \n", " 'n03000684': 'chain_saw', \n", " 'n03445777': 'golf_ball', \n", " 'n03425413': 'gas_pump'\n", "}" ] }, { "cell_type": "markdown", "id": "8aba34e1", "metadata": {}, "source": [ "Load the annotation provided with the dataset." ] }, { "cell_type": "code", "execution_count": 7, "id": "e2e90600-b02d-4a2a-a348-7b67157f9129", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 143 }, "executionInfo": { "elapsed": 2, "status": "ok", "timestamp": 1677666769859, "user": { "displayName": "Tom Shani", "userId": "00667426488827942961" }, "user_tz": -120 }, "id": "e2e90600-b02d-4a2a-a348-7b67157f9129", "outputId": "f9f72c0d-f613-4aac-d29c-3646b2301dcb", "tags": [] }, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"<thead>\n",
"<tr><th></th><th>path</th><th>noisy_labels_0</th><th>noisy_labels_1</th><th>noisy_labels_5</th><th>noisy_labels_25</th><th>noisy_labels_50</th><th>is_valid</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>train/n02979186/n02979186_9036.JPEG</td><td>n02979186</td><td>n02979186</td><td>n02979186</td><td>n02979186</td><td>n02979186</td><td>False</td></tr>\n",
"<tr><th>1</th><td>train/n02979186/n02979186_11957.JPEG</td><td>n02979186</td><td>n02979186</td><td>n02979186</td><td>n02979186</td><td>n03000684</td><td>False</td></tr>\n",
"<tr><th>2</th><td>train/n02979186/n02979186_9715.JPEG</td><td>n02979186</td><td>n02979186</td><td>n02979186</td><td>n03417042</td><td>n03000684</td><td>False</td></tr>\n",
"</tbody>\n",
"</table>" ], "text/plain": [ " path noisy_labels_0 noisy_labels_1 noisy_labels_5 noisy_labels_25 noisy_labels_50 is_valid\n", "0 train/n02979186/n02979186_9036.JPEG n02979186 n02979186 n02979186 n02979186 n02979186 False\n", "1 train/n02979186/n02979186_11957.JPEG n02979186 n02979186 n02979186 n02979186 n03000684 False\n", "2 train/n02979186/n02979186_9715.JPEG n02979186 n02979186 n02979186 n03417042 n03000684 False" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_annot = pd.read_csv(csv_path)\n", "df_annot.head(3)" ] }, { "cell_type": "markdown", "id": "dfc957bf", "metadata": {}, "source": [ "Transform the annotations into the format fastdup supports.\n", "\n", "fastdup expects an annotation `DataFrame` that contains the following columns:\n", "\n", "+ `filename` - the path to the image file.\n", "+ `label` - the label of the image.\n", "+ `split` - whether the image belongs to the training, validation, or test subset." ] }, { "cell_type": "code", "execution_count": 8, "id": "473185d1-89f5-4746-b87b-f2b3ef7c445b", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 424 }, "executionInfo": { "elapsed": 1012, "status": "ok", "timestamp": 1677666771201, "user": { "displayName": "Tom Shani", "userId": "00667426488827942961" }, "user_tz": -120 }, "id": "473185d1-89f5-4746-b87b-f2b3ef7c445b", "outputId": "c09c986d-bcef-4545-8ceb-ee5196b40ee6", "tags": [] }, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"<thead>\n",
"<tr><th></th><th>filename</th><th>label</th><th>split</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>imagenette2-160/train/n02979186/n02979186_9036.JPEG</td><td>cassette_player</td><td>train</td></tr>\n",
"<tr><th>1</th><td>imagenette2-160/train/n02979186/n02979186_11957.JPEG</td><td>cassette_player</td><td>train</td></tr>\n",
"<tr><th>2</th><td>imagenette2-160/train/n02979186/n02979186_9715.JPEG</td><td>cassette_player</td><td>train</td></tr>\n",
"<tr><th>3</th><td>imagenette2-160/train/n02979186/n02979186_21736.JPEG</td><td>cassette_player</td><td>train</td></tr>\n",
"<tr><th>4</th><td>imagenette2-160/train/n02979186/ILSVRC2012_val_00046953.JPEG</td><td>cassette_player</td><td>train</td></tr>\n",
"<tr><th>...</th><td>...</td><td>...</td><td>...</td></tr>\n",
"<tr><th>13389</th><td>imagenette2-160/val/n03425413/n03425413_17521.JPEG</td><td>gas_pump</td><td>val</td></tr>\n",
"<tr><th>13390</th><td>imagenette2-160/val/n03425413/n03425413_20711.JPEG</td><td>gas_pump</td><td>val</td></tr>\n",
"<tr><th>13391</th><td>imagenette2-160/val/n03425413/n03425413_19050.JPEG</td><td>gas_pump</td><td>val</td></tr>\n",
"<tr><th>13392</th><td>imagenette2-160/val/n03425413/n03425413_13831.JPEG</td><td>gas_pump</td><td>val</td></tr>\n",
"<tr><th>13393</th><td>imagenette2-160/val/n03425413/n03425413_1242.JPEG</td><td>gas_pump</td><td>val</td></tr>\n",
"</tbody>\n",
"</table>\n",
"<p>13394 rows × 3 columns</p>" ], "text/plain": [ " filename label split\n", "0 imagenette2-160/train/n02979186/n02979186_9036.JPEG cassette_player train\n", "1 imagenette2-160/train/n02979186/n02979186_11957.JPEG cassette_player train\n", "2 imagenette2-160/train/n02979186/n02979186_9715.JPEG cassette_player train\n", "3 imagenette2-160/train/n02979186/n02979186_21736.JPEG cassette_player train\n", "4 imagenette2-160/train/n02979186/ILSVRC2012_val_00046953.JPEG cassette_player train\n", "... ... ... ...\n", "13389 imagenette2-160/val/n03425413/n03425413_17521.JPEG gas_pump val\n", "13390 imagenette2-160/val/n03425413/n03425413_20711.JPEG gas_pump val\n", "13391 imagenette2-160/val/n03425413/n03425413_19050.JPEG gas_pump val\n", "13392 imagenette2-160/val/n03425413/n03425413_13831.JPEG gas_pump val\n", "13393 imagenette2-160/val/n03425413/n03425413_1242.JPEG gas_pump val\n", "\n", "[13394 rows x 3 columns]" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# keep only the path and label columns\n", "df_annot = df_annot[['path', 'noisy_labels_0']]\n", "\n", "# rename columns to fastdup's column names\n", "df_annot = df_annot.rename({'noisy_labels_0': 'label', 'path': 'filename'}, axis='columns')\n", "\n", "# prepend data_dir to each path\n", "df_annot['filename'] = df_annot['filename'].apply(lambda x: data_dir + x)\n", "\n", "# derive the split (train/val) from the second path component\n", "df_annot['split'] = df_annot['filename'].apply(lambda x: x.split(\"/\")[1])\n", "\n", "# map label IDs to human-readable labels\n", "df_annot['label'] = df_annot['label'].map(label_map)\n", "\n", "# show the formatted annotations\n", "df_annot" ] }, { "cell_type": "markdown", "id": "0c648ed1-5016-4230-9873-546eb510b764", "metadata": { "id": "0c648ed1-5016-4230-9873-546eb510b764" }, "source": [ "## Run fastdup\n", "\n", "With the images and annotations in place, we are ready to run the analysis." 
] }, { "cell_type": "markdown", "id": "0a39243e", "metadata": {}, "source": [ "+ `work_dir` is the path to store the artifacts from the analysis.\n", "\n", "+ `input_dir` is the path to the downloaded images." ] }, { "cell_type": "code", "execution_count": 9, "id": "92a6e2f9-e60c-44c0-b48a-f7413f7594ae", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "FastDup Software, (C) copyright 2022 Dr. Amir Alush and Dr. Danny Bickson.\n", "2023-05-16 08:53:06 [INFO] Going to loop over dir imagenette2-160\n", "2023-05-16 08:53:06 [INFO] Found total 13394 images to run on, 13394 train, 0 test, name list 13394, counter 13394 \n", "2023-05-16 08:53:20 [INFO] Found total 13394 images to run onimated: 0 Minutes\n", "Finished histogram 7.122\n", "Finished bucket sort 7.177\n", "2023-05-16 08:53:20 [INFO] 309) Finished write_index() NN model\n", "2023-05-16 08:53:20 [INFO] Stored nn model index file fastdup_imagenette/nnf.index\n", "2023-05-16 08:53:21 [INFO] Total time took 14601 ms\n", "2023-05-16 08:53:21 [INFO] Found a total of 0 fully identical images (d>0.990), which are 0.00 %\n", "2023-05-16 08:53:21 [INFO] Found a total of 0 nearly identical images(d>0.980), which are 0.00 %\n", "2023-05-16 08:53:21 [INFO] Found a total of 16757 above threshold images (d>0.800), which are 62.55 %\n", "2023-05-16 08:53:21 [INFO] Found a total of 1339 outlier images (d<0.050), which are 5.00 %\n", "2023-05-16 08:53:21 [INFO] Min distance found 0.476 max distance 0.969\n", "2023-05-16 08:53:21 [INFO] Running connected components for ccthreshold 0.900000 \n", ".0\n", " ########################################################################################\n", "\n", "Dataset Analysis Summary: \n", "\n", " Dataset contains 13394 images\n", " Valid images are 100.00% (13,394) of the data, invalid are 0.00% (0) of the data\n", " Similarity: 3.11% (416) belong to 19 similarity clusters (components).\n", " 96.89% (12,978) images do not belong to any similarity 
cluster.\n", " Largest cluster has 566 (4.23%) images.\n", " For a detailed analysis, use `.connected_components()`\n", "(similarity threshold used is 0.8, connected component threshold used is 0.9).\n", "\n", " Outliers: 6.23% (835) of images are possible outliers, and fall in the bottom 5.00% of similarity values.\n", " For a detailed list of outliers, use `.outliers()`.\n" ] } ], "source": [ "work_dir = 'fastdup_imagenette'\n", "\n", "fd = fastdup.create(work_dir=work_dir, input_dir=data_dir) \n", "fd.run(annotations=df_annot, ccthreshold=0.9, threshold=0.8)" ] }, { "cell_type": "markdown", "id": "62e35a12-fadd-4b3f-bcab-69e6e67862a4", "metadata": {}, "source": [ "## Outliers\n", "\n", "Visualize outliers from the dataset." ] }, { "cell_type": "code", "execution_count": 10, "id": "b39ec702-3ea1-4afe-a948-f026ba8fcb47", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 1000 }, "executionInfo": { "elapsed": 2658, "status": "ok", "timestamp": 1677667336302, "user": { "displayName": "Tom Shani", "userId": "00667426488827942961" }, "user_tz": -120 }, "id": "b39ec702-3ea1-4afe-a948-f026ba8fcb47", "outputId": "caa992d2-5267-408c-b44a-3a4a66e1ab5f", "scrolled": false, "tags": [] }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "100%|██████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:00<00:00, 9642.08it/s]" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Stored outliers visual view in fastdup_imagenette/galleries/outliers.html\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\n" ] }, { "data": { "text/html": [ " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " Outliers Report\n", " \n", " \n", "\n", "\n", "\n", "
<p>Outliers Report</p>\n",
"<p>Showing image outliers, one per row</p>\n",
"<table>\n",
"<thead>\n",
"<tr><th>Distance</th><th>Path</th><th>label</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><td>0.489022</td><td>/train/n02979186/n02979186_3967.JPEG</td><td>cassette_player</td></tr>\n",
"<tr><td>0.51468</td><td>/train/n03445777/n03445777_5218.JPEG</td><td>golf_ball</td></tr>\n",
"<tr><td>0.541967</td><td>/val/n03417042/n03417042_5301.JPEG</td><td>garbage_truck</td></tr>\n",
"<tr><td>0.57066</td><td>/train/n03888257/n03888257_34639.JPEG</td><td>parachute</td></tr>\n",
"<tr><td>0.578252</td><td>/train/n03445777/n03445777_3254.JPEG</td><td>golf_ball</td></tr>\n",
"<tr><td>0.58389</td><td>/val/n03445777/n03445777_5932.JPEG</td><td>golf_ball</td></tr>\n",
"<tr><td>0.590838</td><td>/val/n02102040/n02102040_7670.JPEG</td><td>English_springer</td></tr>\n",
"<tr><td>0.609527</td><td>/train/n03888257/n03888257_7793.JPEG</td><td>parachute</td></tr>\n",
"<tr><td>0.611143</td><td>/val/n01440764/n01440764_4962.JPEG</td><td>tench</td></tr>\n",
"<tr><td>0.61373</td><td>/train/n03445777/n03445777_6033.JPEG</td><td>golf_ball</td></tr>\n",
"<tr><td>0.61618</td><td>/train/n03394916/n03394916_37544.JPEG</td><td>French_horn</td></tr>\n",
"<tr><td>0.616785</td><td>/val/n03445777/n03445777_9292.JPEG</td><td>golf_ball</td></tr>\n",
"<tr><td>0.617952</td><td>/train/n03888257/n03888257_16223.JPEG</td><td>parachute</td></tr>\n",
"<tr><td>0.619739</td><td>/train/n03028079/n03028079_24708.JPEG</td><td>church</td></tr>\n",
"<tr><td>0.619768</td><td>/train/n03888257/n03888257_79145.JPEG</td><td>parachute</td></tr>\n",
"<tr><td>0.620815</td><td>/train/n03888257/n03888257_5703.JPEG</td><td>parachute</td></tr>\n",
"<tr><td>0.625504</td><td>/train/n03394916/n03394916_33663.JPEG</td><td>French_horn</td></tr>\n",
"<tr><td>0.626412</td><td>/train/n03445777/n03445777_9199.JPEG</td><td>golf_ball</td></tr>\n",
"<tr><td>0.630812</td><td>/train/n02979186/n02979186_10289.JPEG</td><td>cassette_player</td></tr>\n",
"<tr><td>0.631131</td><td>/train/n03888257/n03888257_75495.JPEG</td><td>parachute</td></tr>\n",
"</tbody>\n",
"</table>\n",
"
\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "fd.vis.outliers_gallery()" ] }, { "cell_type": "markdown", "id": "67378b58", "metadata": {}, "source": [ "Show outliers image data." ] }, { "cell_type": "code", "execution_count": 11, "id": "aa1c0e5d-6038-491b-8a91-1d76a87590d4", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 270 }, "executionInfo": { "elapsed": 429, "status": "ok", "timestamp": 1677667331251, "user": { "displayName": "Tom Shani", "userId": "00667426488827942961" }, "user_tz": -120 }, "id": "aa1c0e5d-6038-491b-8a91-1d76a87590d4", "outputId": "b38332f8-7e4e-45de-f7d3-828a52757ec2", "tags": [] }, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"<thead>\n",
"<tr><th></th><th>outlier</th><th>nearest</th><th>distance</th><th>filename_outlier</th><th>label_outlier</th><th>split_outlier</th><th>index_x</th><th>error_code_outlier</th><th>is_valid_outlier</th><th>fd_index_outlier</th><th>filename_nearest</th><th>label_nearest</th><th>split_nearest</th><th>index_y</th><th>error_code_nearest</th><th>is_valid_nearest</th><th>fd_index_nearest</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>2664</td><td>9763</td><td>0.476124</td><td>imagenette2-160/train/n02979186/n02979186_3967.JPEG</td><td>cassette_player</td><td>train</td><td>2664</td><td>VALID</td><td>True</td><td>2664</td><td>imagenette2-160/val/n01440764/n01440764_710.JPEG</td><td>tench</td><td>val</td><td>9763</td><td>VALID</td><td>True</td><td>9763</td></tr>\n",
"<tr><th>1</th><td>8150</td><td>7831</td><td>0.514680</td><td>imagenette2-160/train/n03445777/n03445777_5218.JPEG</td><td>golf_ball</td><td>train</td><td>8150</td><td>VALID</td><td>True</td><td>8150</td><td>imagenette2-160/train/n03445777/n03445777_18756.JPEG</td><td>golf_ball</td><td>train</td><td>7831</td><td>VALID</td><td>True</td><td>7831</td></tr>\n",
"<tr><th>2</th><td>12076</td><td>956</td><td>0.539276</td><td>imagenette2-160/val/n03417042/n03417042_5301.JPEG</td><td>garbage_truck</td><td>val</td><td>12076</td><td>VALID</td><td>True</td><td>12076</td><td>imagenette2-160/train/n01440764/n01440764_9898.JPEG</td><td>tench</td><td>train</td><td>956</td><td>VALID</td><td>True</td><td>956</td></tr>\n",
"<tr><th>3</th><td>9087</td><td>8628</td><td>0.544795</td><td>imagenette2-160/train/n03888257/n03888257_34639.JPEG</td><td>parachute</td><td>train</td><td>9087</td><td>VALID</td><td>True</td><td>9087</td><td>imagenette2-160/train/n03888257/n03888257_12053.JPEG</td><td>parachute</td><td>train</td><td>8628</td><td>VALID</td><td>True</td><td>8628</td></tr>\n",
"<tr><th>4</th><td>7966</td><td>1630</td><td>0.555266</td><td>imagenette2-160/train/n03445777/n03445777_3254.JPEG</td><td>golf_ball</td><td>train</td><td>7966</td><td>VALID</td><td>True</td><td>7966</td><td>imagenette2-160/train/n02102040/n02102040_585.JPEG</td><td>English_springer</td><td>train</td><td>1630</td><td>VALID</td><td>True</td><td>1630</td></tr>\n",
"</tbody>\n",
"</table>
" ], "text/plain": [ " outlier nearest distance filename_outlier label_outlier split_outlier index_x error_code_outlier is_valid_outlier fd_index_outlier filename_nearest label_nearest split_nearest index_y error_code_nearest is_valid_nearest fd_index_nearest\n", "0 2664 9763 0.476124 imagenette2-160/train/n02979186/n02979186_3967.JPEG cassette_player train 2664 VALID True 2664 imagenette2-160/val/n01440764/n01440764_710.JPEG tench val 9763 VALID True 9763\n", "1 8150 7831 0.514680 imagenette2-160/train/n03445777/n03445777_5218.JPEG golf_ball train 8150 VALID True 8150 imagenette2-160/train/n03445777/n03445777_18756.JPEG golf_ball train 7831 VALID True 7831\n", "2 12076 956 0.539276 imagenette2-160/val/n03417042/n03417042_5301.JPEG garbage_truck val 12076 VALID True 12076 imagenette2-160/train/n01440764/n01440764_9898.JPEG tench train 956 VALID True 956\n", "3 9087 8628 0.544795 imagenette2-160/train/n03888257/n03888257_34639.JPEG parachute train 9087 VALID True 9087 imagenette2-160/train/n03888257/n03888257_12053.JPEG parachute train 8628 VALID True 8628\n", "4 7966 1630 0.555266 imagenette2-160/train/n03445777/n03445777_3254.JPEG golf_ball train 7966 VALID True 7966 imagenette2-160/train/n02102040/n02102040_585.JPEG English_springer train 1630 VALID True 1630" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fd.outliers().head(5)" ] }, { "cell_type": "markdown", "id": "bc16596d-899a-45eb-87ca-1d2b96a6ad96", "metadata": {}, "source": [ "## Comparing Labels of Similar Images\n", "Find possible mislabels by comparing a query image to other images in the dataset." 
] }, { "cell_type": "code", "execution_count": 12, "id": "4d7cf1b9-c6c0-4b90-b7bb-59ca7bdbdcd7", "metadata": { "scrolled": false, "tags": [] }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "100%|███████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:00<00:00, 106.60it/s]\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Stored similar images visual view in fastdup_imagenette/galleries/similarity.html\n" ] }, { "data": { "text/html": [ " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " Similarity Report\n", " \n", " \n", "\n", "\n", "\n", "
<p>Similarity Report</p>\n",
"<table>\n",
"<thead>\n",
"<tr><th colspan=2>Info From</th><th colspan=3>Info To</th></tr>\n",
"<tr><th>label</th><th>from</th><th colspan=3></th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><td rowspan=2>French_horn</td><td rowspan=2>/train/n03394916/n03394916_44127.JPEG</td><td>0.968786</td><td>/val/n03394916/n03394916_30631.JPEG</td><td>French_horn</td></tr>\n",
"<tr><td>0.918324</td><td>/train/n03394916/n03394916_36016.JPEG</td><td>French_horn</td></tr>\n",
"<tr><td rowspan=2>French_horn</td><td rowspan=2>/val/n03394916/n03394916_30631.JPEG</td><td>0.968786</td><td>/train/n03394916/n03394916_44127.JPEG</td><td>French_horn</td></tr>\n",
"<tr><td>0.903753</td><td>/train/n03394916/n03394916_29969.JPEG</td><td>French_horn</td></tr>\n",
"<tr><td rowspan=2>golf_ball</td><td rowspan=2>/val/n03445777/n03445777_6882.JPEG</td><td>0.962458</td><td>/train/n03445777/n03445777_13918.JPEG</td><td>golf_ball</td></tr>\n",
"<tr><td>0.918005</td><td>/val/n03445777/n03445777_5912.JPEG</td><td>golf_ball</td></tr>\n",
"<tr><td rowspan=2>golf_ball</td><td rowspan=2>/train/n03445777/n03445777_13918.JPEG</td><td>0.962458</td><td>/val/n03445777/n03445777_6882.JPEG</td><td>golf_ball</td></tr>\n",
"<tr><td>0.917039</td><td>/val/n03445777/n03445777_8820.JPEG</td><td>golf_ball</td></tr>\n",
"<tr><td rowspan=2>English_springer</td><td rowspan=2>/train/n02102040/n02102040_1564.JPEG</td><td>0.953837</td><td>/train/n02102040/n02102040_3837.JPEG</td><td>English_springer</td></tr>\n",
"<tr><td>0.908732</td><td>/train/n02102040/n02102040_3586.JPEG</td><td>English_springer</td></tr>\n",
"<tr><td rowspan=2>English_springer</td><td rowspan=2>/train/n02102040/n02102040_3837.JPEG</td><td>0.953837</td><td>/train/n02102040/n02102040_1564.JPEG</td><td>English_springer</td></tr>\n",
"<tr><td>0.893944</td><td>/train/n02102040/n02102040_3027.JPEG</td><td>English_springer</td></tr>\n",
"<tr><td rowspan=2>tench</td><td rowspan=2>/train/n01440764/n01440764_7457.JPEG</td><td>0.953413</td><td>/train/n01440764/n01440764_11339.JPEG</td><td>tench</td></tr>\n",
"<tr><td>0.918778</td><td>/train/n01440764/n01440764_9315.JPEG</td><td>tench</td></tr>\n",
"<tr><td rowspan=2>tench</td><td rowspan=2>/train/n01440764/n01440764_11339.JPEG</td><td>0.953413</td><td>/train/n01440764/n01440764_7457.JPEG</td><td>tench</td></tr>\n",
"<tr><td>0.889166</td><td>/train/n01440764/n01440764_12279.JPEG</td><td>tench</td></tr>\n",
"</tbody>\n",
"</table>\n",
"
\n", "\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t
Similar
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info From
labelgarbage_truck
from/train/n03417042/n03417042_1578.JPEG
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", " \n", "\n", "\n", " \n", " \n", " \n", "\n", " \n", "
Info To
0.952239/train/n03417042/n03417042_12906.JPEGgarbage_truck
0.837864/val/n03417042/n03417042_9610.JPEGgarbage_truck
\n", "
\n", "
\n", "
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t
Query Image
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t
\n", "
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t
Similar
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info From
labelgarbage_truck
from/train/n03417042/n03417042_12906.JPEG
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", " \n", "\n", "\n", " \n", " \n", " \n", "\n", " \n", "
Info To
0.952239/train/n03417042/n03417042_1578.JPEGgarbage_truck
0.828749/train/n03417042/n03417042_27686.JPEGgarbage_truck
\n", "
\n", "
\n", "
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t
Query Image
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t
\n", "
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t
Similar
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info From
labelFrench_horn
from/val/n03394916/n03394916_6830.JPEG
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", " \n", "\n", "\n", " \n", " \n", " \n", "\n", " \n", "
Info To
0.951679/val/n03394916/n03394916_21092.JPEGFrench_horn
0.89308/train/n03394916/n03394916_35469.JPEGFrench_horn
\n", "
\n", "
\n", "
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t
Query Image
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t
\n", "
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t
Similar
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info From
labelFrench_horn
from/val/n03394916/n03394916_21092.JPEG
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", " \n", "\n", "\n", " \n", " \n", " \n", "\n", " \n", "
Info To
0.951679/val/n03394916/n03394916_6830.JPEGFrench_horn
0.865771/train/n03394916/n03394916_35469.JPEGFrench_horn
\n", "
\n", "
\n", "
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t
Query Image
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t
\n", "
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t
Similar
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info From
labelparachute
from/train/n03888257/n03888257_21027.JPEG
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", " \n", "\n", "\n", " \n", " \n", " \n", "\n", " \n", "
Info To
0.950477/val/n03888257/n03888257_11210.JPEGparachute
0.92043/val/n03888257/n03888257_12491.JPEGparachute
\n", "
\n", "
\n", "
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t
Query Image
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t
\n", "
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t
Similar
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info From
labelparachute
from/val/n03888257/n03888257_11210.JPEG
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", " \n", "\n", "\n", " \n", " \n", " \n", "\n", " \n", "
Info To
0.950477/train/n03888257/n03888257_21027.JPEGparachute
0.865155/val/n03888257/n03888257_12491.JPEGparachute
\n", "
\n", "
\n", "
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t
Query Image
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t
\n", "
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t
Similar
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info From
labelEnglish_springer
from/train/n02102040/n02102040_6313.JPEG
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", " \n", "\n", "\n", " \n", " \n", " \n", "\n", " \n", "
Info To
0.950174/train/n02102040/n02102040_3767.JPEGEnglish_springer
0.947323/val/n02102040/n02102040_350.JPEGEnglish_springer
\n", "
\n", "
\n", "
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t
Query Image
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t
\n", "
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t
Similar
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info From
labelEnglish_springer
from/train/n02102040/n02102040_3767.JPEG
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", " \n", "\n", "\n", " \n", " \n", " \n", "\n", " \n", "
Info To
0.950174/train/n02102040/n02102040_6313.JPEGEnglish_springer
0.914057/val/n02102040/n02102040_350.JPEGEnglish_springer
\n", "
\n", "
\n", "
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t
Query Image
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t
\n", "
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t
Similar
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info From
labelEnglish_springer
from/train/n02102040/ILSVRC2012_val_00032959.JPEG
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", " \n", "\n", "\n", " \n", " \n", " \n", "\n", " \n", "
Info To
0.949877/val/n02102040/n02102040_662.JPEGEnglish_springer
0.933114/train/n02102040/n02102040_3114.JPEGEnglish_springer
\n", "
\n", "
\n", "
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t
Query Image
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t
\n", "
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t
Similar
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info From
labelEnglish_springer
from/val/n02102040/n02102040_662.JPEG
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", " \n", "\n", "\n", " \n", " \n", " \n", "\n", " \n", "
Info To
0.949877/train/n02102040/ILSVRC2012_val_00032959.JPEGEnglish_springer
0.927345/val/n02102040/n02102040_3502.JPEGEnglish_springer
\n", "
\n", "
\n", "
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t
Query Image
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t
\n", "
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t
Similar
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info From
labelEnglish_springer
from/train/n02102040/n02102040_3114.JPEG
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", " \n", "\n", "\n", " \n", " \n", " \n", "\n", " \n", "
Info To
0.949252/train/n02102040/n02102040_1306.JPEGEnglish_springer
0.941953/train/n02102040/n02102040_1055.JPEGEnglish_springer
\n", "
\n", "
\n", "
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t
Query Image
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t
\n", "
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t
Similar
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info From
labelEnglish_springer
from/train/n02102040/n02102040_1306.JPEG
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", " \n", "\n", "\n", " \n", " \n", " \n", "\n", " \n", "
Info To
0.949252/train/n02102040/n02102040_3114.JPEGEnglish_springer
0.936799/train/n02102040/n02102040_876.JPEGEnglish_springer
\n", "
\n", "
\n", "
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t
Query Image
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t
\n", "
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t\t\n", "\t\t\t\t\t\t\t
Similar
\n", "\t\t\t\t\t\t
\n", "\t\t\t\t\t
\n", "
\n", " \n", "
\n", "
\n", " \n", "
\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
fromtolabellabel2distance
3630imagenette2-160/train/n03394916/n03394916_44127.JPEG[imagenette2-160/val/n03394916/n03394916_30631.JPEG, imagenette2-160/train/n03394916/n03394916_36016.JPEG][French_horn, French_horn][French_horn, French_horn][0.968786, 0.918324]
7823imagenette2-160/val/n03394916/n03394916_30631.JPEG[imagenette2-160/train/n03394916/n03394916_44127.JPEG, imagenette2-160/train/n03394916/n03394916_29969.JPEG][French_horn, French_horn][French_horn, French_horn][0.968786, 0.903753]
8758imagenette2-160/val/n03445777/n03445777_6882.JPEG[imagenette2-160/train/n03445777/n03445777_13918.JPEG, imagenette2-160/val/n03445777/n03445777_5912.JPEG][golf_ball, golf_ball][golf_ball, golf_ball][0.962458, 0.918005]
5363imagenette2-160/train/n03445777/n03445777_13918.JPEG[imagenette2-160/val/n03445777/n03445777_6882.JPEG, imagenette2-160/val/n03445777/n03445777_8820.JPEG][golf_ball, golf_ball][golf_ball, golf_ball][0.962458, 0.917039]
896imagenette2-160/train/n02102040/n02102040_1564.JPEG[imagenette2-160/train/n02102040/n02102040_3837.JPEG, imagenette2-160/train/n02102040/n02102040_3586.JPEG][English_springer, English_springer][English_springer, English_springer][0.953837, 0.908732]
..................
6224imagenette2-160/train/n03888257/n03888257_38633.JPEG[imagenette2-160/train/n03888257/n03888257_12816.JPEG][parachute][parachute][0.800073]
5917imagenette2-160/train/n03888257/n03888257_12816.JPEG[imagenette2-160/train/n03888257/n03888257_38633.JPEG][parachute][parachute][0.800073]
4324imagenette2-160/train/n03417042/n03417042_3236.JPEG[imagenette2-160/train/n03417042/n03417042_12297.JPEG][garbage_truck][garbage_truck][0.800025]
3429imagenette2-160/train/n03394916/n03394916_32478.JPEG[imagenette2-160/train/n03394916/n03394916_35573.JPEG][French_horn][French_horn][0.800012]
7503imagenette2-160/val/n03028079/n03028079_13002.JPEG[imagenette2-160/train/n03028079/n03028079_3839.JPEG][church][church][0.800002]
\n", "

9064 rows × 5 columns

\n", "
" ], "text/plain": [ " from to label label2 distance\n", "3630 imagenette2-160/train/n03394916/n03394916_44127.JPEG [imagenette2-160/val/n03394916/n03394916_30631.JPEG, imagenette2-160/train/n03394916/n03394916_36016.JPEG] [French_horn, French_horn] [French_horn, French_horn] [0.968786, 0.918324]\n", "7823 imagenette2-160/val/n03394916/n03394916_30631.JPEG [imagenette2-160/train/n03394916/n03394916_44127.JPEG, imagenette2-160/train/n03394916/n03394916_29969.JPEG] [French_horn, French_horn] [French_horn, French_horn] [0.968786, 0.903753]\n", "8758 imagenette2-160/val/n03445777/n03445777_6882.JPEG [imagenette2-160/train/n03445777/n03445777_13918.JPEG, imagenette2-160/val/n03445777/n03445777_5912.JPEG] [golf_ball, golf_ball] [golf_ball, golf_ball] [0.962458, 0.918005]\n", "5363 imagenette2-160/train/n03445777/n03445777_13918.JPEG [imagenette2-160/val/n03445777/n03445777_6882.JPEG, imagenette2-160/val/n03445777/n03445777_8820.JPEG] [golf_ball, golf_ball] [golf_ball, golf_ball] [0.962458, 0.917039]\n", "896 imagenette2-160/train/n02102040/n02102040_1564.JPEG [imagenette2-160/train/n02102040/n02102040_3837.JPEG, imagenette2-160/train/n02102040/n02102040_3586.JPEG] [English_springer, English_springer] [English_springer, English_springer] [0.953837, 0.908732]\n", "... ... ... ... ... 
...\n", "6224 imagenette2-160/train/n03888257/n03888257_38633.JPEG [imagenette2-160/train/n03888257/n03888257_12816.JPEG] [parachute] [parachute] [0.800073]\n", "5917 imagenette2-160/train/n03888257/n03888257_12816.JPEG [imagenette2-160/train/n03888257/n03888257_38633.JPEG] [parachute] [parachute] [0.800073]\n", "4324 imagenette2-160/train/n03417042/n03417042_3236.JPEG [imagenette2-160/train/n03417042/n03417042_12297.JPEG] [garbage_truck] [garbage_truck] [0.800025]\n", "3429 imagenette2-160/train/n03394916/n03394916_32478.JPEG [imagenette2-160/train/n03394916/n03394916_35573.JPEG] [French_horn] [French_horn] [0.800012]\n", "7503 imagenette2-160/val/n03028079/n03028079_13002.JPEG [imagenette2-160/train/n03028079/n03028079_3839.JPEG] [church] [church] [0.800002]\n", "\n", "[9064 rows x 5 columns]" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fd.vis.similarity_gallery()" ] }, { "cell_type": "markdown", "id": "c2c393be-2b42-4814-8688-03d2be9e8998", "metadata": {}, "source": [ "## Similar Image Pairs\n", "\n", "Find similar image pairs within and across the train and validation subfolders. A pair can be train-train, train-val, val-train, or val-val. Pairs that cross the split (train-val or val-train) are worth a closer look, since near-duplicates there can leak validation images into training."
] }, { "cell_type": "code", "execution_count": 13, "id": "9e065403-582b-4f94-855b-33fd8f4826a1", "metadata": { "scrolled": false, "tags": [] }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/apps/volume/dataset-volume/mambaforge/envs/fastdup/lib/python3.10/site-packages/fastdup/galleries.py:106: SettingWithCopyWarning: \n", "A value is trying to be set on a copy of a slice from a DataFrame.\n", "Try using .loc[row_indexer,col_indexer] = value instead\n", "\n", "See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n", " df[out_col] = df[in_col].apply(lambda x: get_label_func.get(x, MISSING_LABEL))\n", "/apps/volume/dataset-volume/mambaforge/envs/fastdup/lib/python3.10/site-packages/fastdup/galleries.py:106: SettingWithCopyWarning: \n", "A value is trying to be set on a copy of a slice from a DataFrame.\n", "Try using .loc[row_indexer,col_indexer] = value instead\n", "\n", "See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n", " df[out_col] = df[in_col].apply(lambda x: get_label_func.get(x, MISSING_LABEL))\n", "100%|███████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:00<00:00, 188.62it/s]\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Stored similarity visual view in fastdup_imagenette/galleries/duplicates.html\n" ] }, { "data": { "text/html": [ " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " Duplicates Report\n", " \n", " \n", "\n", "\n", "\n", "
\n", "
\n", "
\n", " \n", " \"logo\"\n", " \n", "
\n", " \n", "
\n", "
\n", "
\n", "

Duplicates Report

\n", "
\n", "
\n", "
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
Distance0.968786
From/val/n03394916/n03394916_30631.JPEG
To/train/n03394916/n03394916_44127.JPEG
From_LabelFrench_horn
To_LabelFrench_horn
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
Distance0.962458
From/train/n03445777/n03445777_13918.JPEG
To/val/n03445777/n03445777_6882.JPEG
From_Labelgolf_ball
To_Labelgolf_ball
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
Distance0.953837
From/train/n02102040/n02102040_3837.JPEG
To/train/n02102040/n02102040_1564.JPEG
From_LabelEnglish_springer
To_LabelEnglish_springer
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
Distance0.953413
From/train/n01440764/n01440764_7457.JPEG
To/train/n01440764/n01440764_11339.JPEG
From_Labeltench
To_Labeltench
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
Distance0.952239
From/train/n03417042/n03417042_1578.JPEG
To/train/n03417042/n03417042_12906.JPEG
From_Labelgarbage_truck
To_Labelgarbage_truck
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
Distance0.951679
From/val/n03394916/n03394916_6830.JPEG
To/val/n03394916/n03394916_21092.JPEG
From_LabelFrench_horn
To_LabelFrench_horn
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
Distance0.950477
From/val/n03888257/n03888257_11210.JPEG
To/train/n03888257/n03888257_21027.JPEG
From_Labelparachute
To_Labelparachute
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
Distance0.950174
From/train/n02102040/n02102040_6313.JPEG
To/train/n02102040/n02102040_3767.JPEG
From_LabelEnglish_springer
To_LabelEnglish_springer
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
Distance0.949877
From/train/n02102040/ILSVRC2012_val_00032959.JPEG
To/val/n02102040/n02102040_662.JPEG
From_LabelEnglish_springer
To_LabelEnglish_springer
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
Distance0.949252
From/train/n02102040/n02102040_1306.JPEG
To/train/n02102040/n02102040_3114.JPEG
From_LabelEnglish_springer
To_LabelEnglish_springer
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", " \n", "
\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "fd.vis.duplicates_gallery()" ] }, { "cell_type": "markdown", "id": "e10989e1", "metadata": {}, "source": [ "List the similar image pairs as a DataFrame, showing the first five rows. Each pair appears twice, once per direction, so rows 0 and 1 below describe the same two images." ] }, { "cell_type": "code", "execution_count": 14, "id": "3ea590e9-d221-4202-b03b-e5fef4487c89", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 270 }, "executionInfo": { "elapsed": 499, "status": "ok", "timestamp": 1677667342908, "user": { "displayName": "Tom Shani", "userId": "00667426488827942961" }, "user_tz": -120 }, "id": "3ea590e9-d221-4202-b03b-e5fef4487c89", "outputId": "3c5f4cc0-0ba5-42a0-e01b-f165e9cf655c", "tags": [] }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
fromtodistancefilename_fromlabel_fromsplit_fromindex_xerror_code_fromis_valid_fromfd_index_fromfilename_tolabel_tosplit_toindex_yerror_code_tois_valid_tofd_index_to
01152153900.968786imagenette2-160/val/n03394916/n03394916_30631.JPEGFrench_hornval11521VALIDTrue11521imagenette2-160/train/n03394916/n03394916_44127.JPEGFrench_horntrain5390VALIDTrue5390
15390115210.968786imagenette2-160/train/n03394916/n03394916_44127.JPEGFrench_horntrain5390VALIDTrue5390imagenette2-160/val/n03394916/n03394916_30631.JPEGFrench_hornval11521VALIDTrue11521
21291477150.962458imagenette2-160/val/n03445777/n03445777_6882.JPEGgolf_ballval12914VALIDTrue12914imagenette2-160/train/n03445777/n03445777_13918.JPEGgolf_balltrain7715VALIDTrue7715
37715129140.962458imagenette2-160/train/n03445777/n03445777_13918.JPEGgolf_balltrain7715VALIDTrue7715imagenette2-160/val/n03445777/n03445777_6882.JPEGgolf_ballval12914VALIDTrue12914
4140411170.953837imagenette2-160/train/n02102040/n02102040_3837.JPEGEnglish_springertrain1404VALIDTrue1404imagenette2-160/train/n02102040/n02102040_1564.JPEGEnglish_springertrain1117VALIDTrue1117
\n", "
" ], "text/plain": [ " from to distance filename_from label_from split_from index_x error_code_from is_valid_from fd_index_from filename_to label_to split_to index_y error_code_to is_valid_to fd_index_to\n", "0 11521 5390 0.968786 imagenette2-160/val/n03394916/n03394916_30631.JPEG French_horn val 11521 VALID True 11521 imagenette2-160/train/n03394916/n03394916_44127.JPEG French_horn train 5390 VALID True 5390\n", "1 5390 11521 0.968786 imagenette2-160/train/n03394916/n03394916_44127.JPEG French_horn train 5390 VALID True 5390 imagenette2-160/val/n03394916/n03394916_30631.JPEG French_horn val 11521 VALID True 11521\n", "2 12914 7715 0.962458 imagenette2-160/val/n03445777/n03445777_6882.JPEG golf_ball val 12914 VALID True 12914 imagenette2-160/train/n03445777/n03445777_13918.JPEG golf_ball train 7715 VALID True 7715\n", "3 7715 12914 0.962458 imagenette2-160/train/n03445777/n03445777_13918.JPEG golf_ball train 7715 VALID True 7715 imagenette2-160/val/n03445777/n03445777_6882.JPEG golf_ball val 12914 VALID True 12914\n", "4 1404 1117 0.953837 imagenette2-160/train/n02102040/n02102040_3837.JPEG English_springer train 1404 VALID True 1404 imagenette2-160/train/n02102040/n02102040_1564.JPEG English_springer train 1117 VALID True 1117" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fd.similarity().head(5)" ] }, { "cell_type": "markdown", "id": "95d21e6d-a951-48dd-8c4c-894c8ba556fd", "metadata": {}, "source": [ "## Image Clusters\n", "\n", "Group similar images into clusters (components) and view them in a gallery." ] }, { "cell_type": "code", "execution_count": 15, "id": "4a6db529-cb1e-4655-af50-d97f3e131319", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 1000, "output_embedded_package_id": "1Wh1vmG-F-RG0ZYZP1oRgiyqHAtnfsuEk" }, "executionInfo": { "elapsed": 6376, "status": "ok", "timestamp": 1677667352994, "user": { "displayName": "Tom Shani", "userId": "00667426488827942961" }, "user_tz": -120 }, "id": "4a6db529-cb1e-4655-af50-d97f3e131319", "outputId": 
"adfc3ee1-84c9-4aa6-a0db-09a6a800b566", "scrolled": false, "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "tench\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "100%|████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:00<00:00, 36.72it/s]\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Finished OK. Components are stored as image files fastdup_imagenette/galleries/components_[index].jpg\n", "Stored components visual view in fastdup_imagenette/galleries/components.html\n", "Execution time in seconds 3.0\n" ] }, { "data": { "text/html": [ " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " Components Report\n", " \n", " \n", "\n", "\n", "\n", "
\n", "
\n", "
\n", " \n", " \"logo\"\n", " \n", "
\n", " \n", "
\n", "
\n", "
\n", "

Components Report

Showing groups of similar images

\n", "
\n", "
\n", "
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
component 6
num_images 162
mean_distance 0.9001
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", " \n", "
Label
tench 54
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
component 850
num_images 70
mean_distance 0.9004
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", " \n", "
Label
English_springer 54
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
component 7240
num_images 69
mean_distance 0.9001
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", " \n", "
Label
golf_ball 54
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
component 5410
num_images 21
mean_distance 0.9001
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", " \n", "
Label
garbage_truck 21
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
component 4512
num_images 13
mean_distance 0.9004
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", " \n", "
Label
French_horn 13
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
component 5397
num_images 12
mean_distance 0.9025
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", " \n", "
Label
garbage_truck 12
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
component 5539
num_images 10
mean_distance 0.9
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", " \n", "
Label
garbage_truck 10
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
component 1139
num_images 8
mean_distance 0.9062
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", " \n", "
Label
English_springer 8
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
component 5632
num_images 8
mean_distance 0.9041
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", " \n", "
Label
garbage_truck 8
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
component 4494
num_images 8
mean_distance 0.902
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", " \n", "
Label
French_horn 8
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
component 1239
num_images 6
mean_distance 0.903
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", " \n", "
Label
English_springer 6
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
component 4531
num_images 6
mean_distance 0.902
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", " \n", "
Label
French_horn 6
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
component 5678
num_images 6
mean_distance 0.9064
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", " \n", "
Label
garbage_truck 6
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
component 8335
num_images 5
mean_distance 0.9004
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", " \n", "
Label
parachute 5
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
component 199
num_images 5
mean_distance 0.9
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", " \n", "
Label
tench 5
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
component 2174
num_images 5
mean_distance 0.9011
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", " \n", "
Label
cassette_player 5
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
component 7386
num_images 5
mean_distance 0.9019
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", " \n", "
Label
golf_ball 5
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
component 4616
num_images 5
mean_distance 0.9043
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", " \n", "
Label
French_horn 5
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
component 8979
num_images 4
mean_distance 0.9013
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", " \n", "
Label
parachute 4
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
component 4764
num_images 4
mean_distance 0.9032
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", " \n", "
Label
French_horn 4
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", " \n", "
\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "fd.vis.component_gallery()" ] }, { "cell_type": "markdown", "id": "ca5d4b6e-7ff6-49b8-b487-6ba1573ab104", "metadata": {}, "source": [ "You can also visualize clusters with specific labels using the `slice` parameter. For example, let's visualize clusters with the `chain_saw` label." ] }, { "cell_type": "code", "execution_count": 16, "id": "4b38dacf-becc-4631-9aeb-6fe9bd235aa1", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 1000, "output_embedded_package_id": "1xYIrPsODG8kAMaZOpGeKNRoa4-HjPC-w" }, "executionInfo": { "elapsed": 5130, "status": "ok", "timestamp": 1677667368207, "user": { "displayName": "Tom Shani", "userId": "00667426488827942961" }, "user_tz": -120 }, "id": "4b38dacf-becc-4631-9aeb-6fe9bd235aa1", "outputId": "131d0f11-5627-4beb-b58c-3801e09a3b42", "scrolled": false, "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "chain_saw\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "100%|███████████████████████████████████████████████████████████████████████████████████████████████████████| 11/11 [00:00<00:00, 250.94it/s]" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Finished OK. Components are stored as image files fastdup_imagenette/galleries/components_[index].jpg\n", "Stored components visual view in fastdup_imagenette/galleries/components.html\n", "Execution time in seconds 0.3\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\n" ] }, { "data": { "text/html": [ " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " Components Report\n", " \n", " \n", "\n", "\n", "\n", "
\n", "
\n", "
\n", " \n", " \"logo\"\n", " \n", "
\n", " \n", "
\n", "
\n", "
\n", "

Components Report

Showing groups of similar images, for label: chain_saw

\n", "
\n", "
\n", "
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
component 2876
num_images 3
mean_distance 0.9064
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", " \n", "
Label
chain_saw 3
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
component 2798
num_images 2
mean_distance 0.9029
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", " \n", "
Label
chain_saw 2
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
component 2815
num_images 2
mean_distance 0.9208
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", " \n", "
Label
chain_saw 2
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
component 2862
num_images 2
mean_distance 0.9222
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", " \n", "
Label
chain_saw 2
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
component 2989
num_images 2
mean_distance 0.9139
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", " \n", "
Label
chain_saw 2
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
component 2992
num_images 2
mean_distance 0.9198
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", " \n", "
Label
chain_saw 2
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
component 3001
num_images 2
mean_distance 0.9073
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", " \n", "
Label
chain_saw 2
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
component 3002
num_images 2
mean_distance 0.9192
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", " \n", "
Label
chain_saw 2
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
component 3077
num_images 2
mean_distance 0.9355
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", " \n", "
Label
chain_saw 2
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
component 3305
num_images 2
mean_distance 0.9345
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", " \n", "
Label
chain_saw 2
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
component 10204
num_images 2
mean_distance 0.9039
\n", "
\n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", " \n", "
Label
chain_saw 2
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", " \n", "
\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "fd.vis.component_gallery(slice='chain_saw')" ] }, { "cell_type": "markdown", "id": "28498d81-d073-4f3d-baa4-732e1df93a34", "metadata": {}, "source": [ "## Connected Components" ] }, { "cell_type": "code", "execution_count": 17, "id": "0346be91-5380-48b9-a8df-074c342efcd3", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 206 }, "executionInfo": { "elapsed": 1036, "status": "ok", "timestamp": 1677667380699, "user": { "displayName": "Tom Shani", "userId": "00667426488827942961" }, "user_tz": -120 }, "id": "0346be91-5380-48b9-a8df-074c342efcd3", "outputId": "ffa6bd9d-b5b3-4ed5-86e1-c47ca9658667", "tags": [] }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 index component_id sum count mean_distance min_distance max_distance filename label split error_code is_valid fd_index
235 235 6 517.2897 566.0 0.9139 0.9001 0.9534 imagenette2-160/train/n01440764/n01440764_13304.JPEG tench train VALID True 235
121 121 6 517.2897 566.0 0.9139 0.9001 0.9534 imagenette2-160/train/n01440764/n01440764_11486.JPEG tench train VALID True 121
685 685 6 517.2897 566.0 0.9139 0.9001 0.9534 imagenette2-160/train/n01440764/n01440764_6174.JPEG tench train VALID True 685
689 689 6 517.2897 566.0 0.9139 0.9001 0.9534 imagenette2-160/train/n01440764/n01440764_6249.JPEG tench train VALID True 689
706 706 6 517.2897 566.0 0.9139 0.9001 0.9534 imagenette2-160/train/n01440764/n01440764_6494.JPEG tench train VALID True 706
\n", "
" ], "text/plain": [ " index component_id sum count mean_distance min_distance max_distance filename label split error_code is_valid fd_index\n", "235 235 6 517.2897 566.0 0.9139 0.9001 0.9534 imagenette2-160/train/n01440764/n01440764_13304.JPEG tench train VALID True 235\n", "121 121 6 517.2897 566.0 0.9139 0.9001 0.9534 imagenette2-160/train/n01440764/n01440764_11486.JPEG tench train VALID True 121\n", "685 685 6 517.2897 566.0 0.9139 0.9001 0.9534 imagenette2-160/train/n01440764/n01440764_6174.JPEG tench train VALID True 685\n", "689 689 6 517.2897 566.0 0.9139 0.9001 0.9534 imagenette2-160/train/n01440764/n01440764_6249.JPEG tench train VALID True 689\n", "706 706 6 517.2897 566.0 0.9139 0.9001 0.9534 imagenette2-160/train/n01440764/n01440764_6494.JPEG tench train VALID True 706" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cc_df, _ = fd.connected_components()\n", "cc_df.sort_values('count', ascending=False).head(5)" ] }, { "cell_type": "markdown", "id": "569cb878", "metadata": {}, "source": [ "We can also get metadata for individual images using their `fastdup_id` available in `fd.annotations()`" ] }, { "cell_type": "code", "execution_count": 18, "id": "e80d6817-fed6-4fa4-8714-b01214e0d3f8", "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "executionInfo": { "elapsed": 990, "status": "ok", "timestamp": 1677667384644, "user": { "displayName": "Tom Shani", "userId": "00667426488827942961" }, "user_tz": -120 }, "id": "e80d6817-fed6-4fa4-8714-b01214e0d3f8", "outputId": "4f973aba-572d-4e50-d22d-c5bfc8cf3d2d", "tags": [] }, "outputs": [ { "data": { "text/plain": [ "{'filename': 'imagenette2-160/train/n01440764/n01440764_1778.JPEG',\n", " 'label': 'tench',\n", " 'split': 'train',\n", " 'index': 349,\n", " 'error_code': 'VALID',\n", " 'is_valid': True,\n", " 'fd_index': 349}" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fd[349]" ] } ], "metadata": { 
"colab": { "provenance": [] }, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.11" } }, "nbformat": 4, "nbformat_minor": 5 }