{ "cells": [ { "cell_type": "markdown", "id": "ee446f4a", "metadata": {}, "source": [ "[![image](https://raw.githubusercontent.com/visual-layer/visuallayer/main/imgs/vl_horizontal_logo.png)](https://www.visual-layer.com)" ] }, { "cell_type": "markdown", "id": "2d3a2ba6-3ba0-4770-b025-c88adf5b292e", "metadata": {}, "source": [ "# fastdup for Satellite Imagery\n", "[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/visual-layer/fastdup/blob/main/examples/satellite-image-analysis.ipynb)\n", "[![Open in Kaggle](https://kaggle.com/static/images/open-in-kaggle.svg)](https://kaggle.com/kernels/welcome?src=https://github.com/visual-layer/fastdup/blob/main/examples/satellite-image-analysis.ipynb)\n", "\n", "In this notebook we load satellite data from the [Mafat Competition](https://mafatchallenge.mod.gov.il/), which consists of 16-bit grayscale images with rotated bounding boxes.\n", "\n", "The dataset is also available on Kaggle [here](https://www.kaggle.com/datasets/dragonzhang/mafat-train-dataset).\n", "\n", "We show how to work with this dataset using fastdup. 
It takes 140 seconds to process 18,000 bounding boxes and find all similarities.\n", "\n", "We use the components gallery to highlight suspected wrong bounding boxes as well as correct bounding boxes.\n" ] }, { "cell_type": "code", "execution_count": 1, "id": "b2cc8c20-4069-4183-a247-0dc28788b158", "metadata": {}, "outputs": [], "source": [ "!pip install fastdup -Uq" ] }, { "cell_type": "code", "execution_count": 2, "id": "51b8ea18", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/usr/bin/dpkg\n" ] }, { "data": { "text/plain": [ "'1.26'" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import fastdup\n", "fastdup.__version__" ] }, { "cell_type": "markdown", "id": "eb290525", "metadata": {}, "source": [ "Download the Mafat training data, extract the zip file, and place this notebook in the parent directory of the `images/` folder." ] }, { "cell_type": "code", "execution_count": null, "id": "73ec897b", "metadata": {}, "outputs": [], "source": [ "!kaggle datasets download -d dragonzhang/mafat-train-dataset" ] }, { "cell_type": "code", "execution_count": null, "id": "15dc9cf9", "metadata": {}, "outputs": [], "source": [ "!unzip mafat-train-dataset.zip" ] }, { "cell_type": "markdown", "id": "538d2699-4678-4f0b-a570-412d4a97c7ae", "metadata": {}, "source": [ "## Prepare annotations in fastdup format\n", "\n", "\n", "Here we read the data as given in the competition, one annotation file per image. 
We combine all files into a single flat table." ] }, { "cell_type": "code", "execution_count": 3, "id": "8e6087e1-9a59-4958-9110-a199c35c10f6", "metadata": {}, "outputs": [], "source": [ "import os\n", "files=!ls labelTxt\n", "files = [os.path.join('labelTxt', f) for f in files]" ] }, { "cell_type": "code", "execution_count": 4, "id": "d64f0fa9-2ae4-4636-8866-a5303a490669", "metadata": {}, "outputs": [], "source": [ "def read_annotations(f):\n", "    # Each line holds eight corner coordinates (x1 y1 ... x4 y4) followed by the class label.\n", "    with open(f, 'r') as fd:\n", "        lines = fd.readlines()\n", "\n", "    bounding_boxes = []\n", "\n", "    for line in lines:\n", "        tokens = line.split()\n", "        x1, y1, x2, y2, x3, y3, x4, y4 = map(float, tokens[:8])\n", "        label = tokens[8]\n", "        bounding_box = {'annot':f , 'x1': x1, 'y1': y1, 'x2': x2, 'y2': y2, 'x3': x3, 'y3': y3, 'x4': x4, 'y4': y4, 'label': label}\n", "        bounding_boxes.append(bounding_box)\n", "    return bounding_boxes" ] }, { "cell_type": "code", "execution_count": 5, "id": "696a9865-8a7d-45e4-9f8b-eea4b424c91f", "metadata": {}, "outputs": [], "source": [ "annot = []\n", "for f in files:\n", "    annot.extend(read_annotations(f))" ] }, { "cell_type": "code", "execution_count": 6, "id": "d6d95cd5-990c-4ce0-9a0d-c127a8a456b6", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr style=\"text-align: right;\"><th></th><th>annot</th><th>x1</th><th>y1</th><th>x2</th><th>y2</th><th>x3</th><th>y3</th><th>x4</th><th>y4</th><th>label</th><th>filename</th></tr>\n", "  </thead>\n", "  <tbody>\n", "    <tr><th>0</th><td>labelTxt/126_0_0.txt</td><td>1221.94</td><td>423.54</td><td>1229.28</td><td>404.73</td><td>1236.34</td><td>407.49</td><td>1229.00</td><td>426.30</td><td>large_vehicle</td><td>images/126_0_0.tiff</td></tr>\n", "    <tr><th>1</th><td>labelTxt/126_0_0.txt</td><td>445.80</td><td>729.00</td><td>457.34</td><td>729.60</td><td>457.01</td><td>735.82</td><td>445.47</td><td>735.22</td><td>medium_vehicle</td><td>images/126_0_0.tiff</td></tr>\n", "    <tr><th>2</th><td>labelTxt/126_0_0.txt</td><td>1059.83</td><td>237.72</td><td>1079.99</td><td>225.27</td><td>1084.31</td><td>232.27</td><td>1064.15</td><td>244.72</td><td>heavy_equipment</td><td>images/126_0_0.tiff</td></tr>\n", "    <tr><th>3</th><td>labelTxt/126_0_0.txt</td><td>964.83</td><td>831.37</td><td>981.88</td><td>832.92</td><td>981.26</td><td>839.71</td><td>964.21</td><td>838.16</td><td>medium_vehicle</td><td>images/126_0_0.tiff</td></tr>\n", "    <tr><th>4</th><td>labelTxt/126_0_0.txt</td><td>985.48</td><td>867.08</td><td>1001.37</td><td>868.52</td><td>1000.75</td><td>875.29</td><td>984.86</td><td>873.85</td><td>medium_vehicle</td><td>images/126_0_0.tiff</td></tr>\n", "  </tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " annot x1 y1 x2 y2 x3 y3 x4 y4 label filename\n", "0 labelTxt/126_0_0.txt 1221.94 423.54 1229.28 404.73 1236.34 407.49 1229.00 426.30 large_vehicle images/126_0_0.tiff\n", "1 labelTxt/126_0_0.txt 445.80 729.00 457.34 729.60 457.01 735.82 445.47 735.22 medium_vehicle images/126_0_0.tiff\n", "2 labelTxt/126_0_0.txt 1059.83 237.72 1079.99 225.27 1084.31 232.27 1064.15 244.72 heavy_equipment images/126_0_0.tiff\n", "3 labelTxt/126_0_0.txt 964.83 831.37 981.88 832.92 981.26 839.71 964.21 838.16 medium_vehicle images/126_0_0.tiff\n", "4 labelTxt/126_0_0.txt 985.48 867.08 1001.37 868.52 1000.75 875.29 984.86 873.85 medium_vehicle images/126_0_0.tiff" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "df = pd.DataFrame(annot)\n", "df['filename'] = df['annot'].apply(lambda x: x.replace('labelTxt', 'images').replace('.txt', '.tiff'))\n", "df.head()" ] }, { "cell_type": "code", "execution_count": 7, "id": "1b4ccdaa-6162-4684-9808-303966e080bd", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "total annotations 117\n" ] } ], "source": [ "print('total annotations', len(df))" ] }, { "cell_type": "code", "execution_count": 8, "id": "c46545d0-3e52-4257-91cf-68e1a2b8d10c", "metadata": {}, "outputs": [], "source": [ "df.index.name = 'index'\n", "df[['filename', 'x1', 'y1', 'x2', 'y2', 'x3', 'y3', 'x4', 'y4', 'label']].to_csv('mafat.csv',index_label='index')" ] }, { "cell_type": "code", "execution_count": 9, "id": "92a5df03-d456-40a2-a01f-42a47f6835b5", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "index,filename,x1,y1,x2,y2,x3,y3,x4,y4,label\r\n", "0,images/126_0_0.tiff,1221.94,423.54,1229.28,404.73,1236.34,407.49,1229.0,426.3,large_vehicle\r\n", "1,images/126_0_0.tiff,445.8,729.0,457.34,729.6,457.01,735.82,445.47,735.22,medium_vehicle\r\n", 
"2,images/126_0_0.tiff,1059.83,237.72,1079.99,225.27,1084.31,232.27,1064.15,244.72,heavy_equipment\r\n", "3,images/126_0_0.tiff,964.83,831.37,981.88,832.92,981.26,839.71,964.21,838.16,medium_vehicle\r\n", "4,images/126_0_0.tiff,985.48,867.08,1001.37,868.52,1000.75,875.29,984.86,873.85,medium_vehicle\r\n", "5,images/126_0_0.tiff,1012.44,839.59,1031.34,841.31,1030.73,848.08,1011.83,846.36,large_vehicle\r\n", "6,images/126_0_0.tiff,7.4,262.78,25.79,261.82,26.21,269.89,7.82,270.85,large_vehicle\r\n", "7,images/126_0_0.tiff,1121.18,877.51,1137.87,879.03,1137.25,885.8,1120.56,884.28,medium_vehicle\r\n", "8,images/126_0_0.tiff,571.05,753.26,585.66,754.02,585.31,760.57,570.7,759.81,medium_vehicle\r\n" ] } ], "source": [ "# This is the required input by fastdup\n", "!head mafat.csv" ] }, { "cell_type": "markdown", "id": "620799ea-3318-4a74-8dd0-d74ec3f42849", "metadata": {}, "source": [ "## Run fastdup to crop and build a model for the crops" ] }, { "cell_type": "code", "execution_count": 10, "id": "d8dcc080-7ef8-4789-8e14-7b56794c4d22", "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import cv2\n", "\n", "!rm -fr output" ] }, { "cell_type": "code", "execution_count": 11, "id": "d5abac7d-3b78-4090-9c6a-50abea31b0db", "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import fastdup\n", "df = pd.read_csv('mafat.csv')\n", "fd = fastdup.create(input_dir='.', work_dir='output')\n" ] }, { "cell_type": "code", "execution_count": 12, "id": "94156d52-1c7d-400f-a0c2-63df5648a0e9", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "FastDup Software, (C) copyright 2022 Dr. Amir Alush and Dr. Danny Bickson.\n", "2023-07-13 18:58:04 [INFO] Going to loop over dir /tmp/tmplebc1a_5.csv\n", "2023-07-13 18:58:04 [INFO] Found total 117 images to run on, 117 train, 0 test, name list 117, counter 117 \n", "FastDup Software, (C) copyright 2022 Dr. Amir Alush and Dr. 
Danny Bickson.utes\n", "2023-07-13 18:58:05 [INFO] Going to loop over dir /tmp/crops_input.csv\n", "2023-07-13 18:58:05 [INFO] Found total 117 images to run on, 117 train, 0 test, name list 117, counter 117 \n", "2023-07-13 18:58:06 [INFO] Found total 117 images to run onstimated: 0 Minutes\n", "Finished histogram 0.048\n", "Finished bucket sort 0.056\n", "2023-07-13 18:58:06 [INFO] 10) Finished write_index() NN model\n", "2023-07-13 18:58:06 [INFO] Stored nn model index file output/nnf.index\n", "2023-07-13 18:58:06 [INFO] Total time took 1021 ms\n", "2023-07-13 18:58:06 [INFO] Found a total of 0 fully identical images (d>0.990), which are 0.00 %\n", "2023-07-13 18:58:06 [INFO] Found a total of 2 nearly identical images(d>0.980), which are 0.85 %\n", "2023-07-13 18:58:06 [INFO] Found a total of 193 above threshold images (d>0.900), which are 82.48 %\n", "2023-07-13 18:58:06 [INFO] Found a total of 11 outlier images (d<0.050), which are 4.70 %\n", "2023-07-13 18:58:06 [INFO] Min distance found 0.455 max distance 0.982\n", "2023-07-13 18:58:06 [INFO] Running connected components for ccthreshold 0.950000 \n", ".0\n", " ########################################################################################\n", "\n", "Dataset Analysis Summary: \n", "\n", " Dataset contains 117 images\n", " Valid images are 100.00% (117) of the data, invalid are 0.00% (0) of the data\n", " Similarity: 18.80% (22) belong to 4 similarity clusters (components).\n", " 81.20% (95) images do not belong to any similarity cluster.\n", " Largest cluster has 82 (70.09%) images.\n", " For a detailed analysis, use `.connected_components()`\n", "(similarity threshold used is 0.9, connected component threshold used is 0.95).\n", "\n", " Outliers: 5.98% (7) of images are possible outliers, and fall in the bottom 5.00% of similarity values.\n", " For a detailed list of outliers, use `.outliers()`.\n" ] }, { "data": { "text/plain": [ "0" ] }, "execution_count": 12, "metadata": {}, "output_type": 
"execute_result" } ], "source": [ "fd.run(annotations=df, overwrite=True, bounding_box='rotated', augmentation_additive_margin=15,\n", " verbose=False, ccthreshold=0.95)" ] }, { "cell_type": "markdown", "id": "a834aaaa-a76c-49bc-b293-c3c3e114d7aa", "metadata": {}, "source": [ "## Find suspected wrong bounding boxes\n", "\n", "From - crop image name\n", "To - similar images\n", "where the labels are not matching" ] }, { "cell_type": "code", "execution_count": 13, "id": "4e445a56-ffa9-448d-9e74-715413fc4f3c", "metadata": { "scrolled": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "medium_vehicle\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:00<00:00, 357.88it/s]" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Finished OK. Components are stored as image files output/galleries/components_[index].jpg\n", "Stored components visual view in output/galleries/components.html\n", "Execution time in seconds 0.1\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\n" ] }, { "data": { "text/html": [ " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " Components Report\n", " \n", " \n", "\n", "\n", "\n", "
<h1>Components Report</h1>\n", "<p>Showing groups of similar images, from different classes</p>\n", "<table>\n", "  <thead>\n", "    <tr><th>component</th><th>num_images</th><th>mean_distance</th><th>Label counts</th></tr>\n", "  </thead>\n", "  <tbody>\n", "    <tr><td>45</td><td>2</td><td>0.9688</td><td>medium_vehicle: 1, small_vessel: 1</td></tr>\n", "    <tr><td>59</td><td>2</td><td>0.9576</td><td>medium_vehicle: 1, medium_vessel: 1</td></tr>\n", "    <tr><td>63</td><td>5</td><td>0.9573</td><td>heavy_equipment: 2, medium_vehicle: 2, small_aircraft: 1</td></tr>\n", "    <tr><td>64</td><td>2</td><td>0.9554</td><td>heavy_equipment: 1, small_vessel: 1</td></tr>\n", "    <tr><td>15</td><td>10</td><td>0.955</td><td>small_vessel: 4, medium_vehicle: 3, medium_vessel: 2, heavy_equipment: 1</td></tr>\n", "    <tr><td>13</td><td>23</td><td>0.95</td><td>small_vessel: 10, medium_vehicle: 7, medium_vessel: 4, heavy_equipment: 1, large_aircraft: 1</td></tr>\n", "  </tbody>\n", "</table>\n", "
\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "0" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fd.vis.component_gallery(load_crops=True, enhance_image=True, keep_aspect_ratio=True, \n", " slice='diff', num_images=20, save_artifacts=True)" ] }, { "cell_type": "markdown", "id": "c0b9d32a", "metadata": {}, "source": [ "Load the raw components table to link each component id back to its crop files." ] }, { "cell_type": "code", "execution_count": 14, "id": "d1129fcd-ab0b-4ef7-93a0-30fea445be2f", "metadata": {}, "outputs": [], "source": [ "df = pd.read_csv('output/galleries/components.csv')" ] }, { "cell_type": "code", "execution_count": 15, "id": "1422b9cd-34cf-496f-be2a-48ca5f358193", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr style=\"text-align: right;\"><th></th><th>Unnamed: 0</th><th>component_id</th><th>files</th><th>label</th><th>files_ids</th><th>distance</th><th>len</th></tr>\n", "  </thead>\n", "  <tbody>\n", "    <tr><th>0</th><td>45</td><td>45</td><td>['output/crops/images126_0_5120.tiff_704_1078_710_1079_709_1091_703_1091.jpg', 'output/crops/images126_0_5120.tiff_991_1081_1004_1081_1004_1086_991_1086.jpg']</td><td>['medium_vehicle', 'small_vessel']</td><td>[50, 72]</td><td>0.9688</td><td>2</td></tr>\n", "    <tr><th>1</th><td>59</td><td>59</td><td>['output/crops/images126_0_5120.tiff_241_1265_259_1265_259_1273_241_1273.jpg', 'output/crops/images126_0_5120.tiff_1166_1005_1181_1005_1181_1009_1166_1010.jpg']</td><td>['medium_vehicle', 'medium_vessel']</td><td>[88, 90]</td><td>0.9576</td><td>2</td></tr>\n", "    <tr><th>2</th><td>63</td><td>63</td><td>['output/crops/images126_1280_5120.tiff_996_134_1012_134_1012_141_996_141.jpg', 'output/crops/images126_1280_5120.tiff_192_81_197_80_197_91_193_91.jpg', 'output/crops/images126_1280_5120.tiff_191_101_196_101_196_111_191_111.jpg', 'output/crops/images126_1280_5120.tiff_1012_148_1030_161_1024_170_1006_156.jpg', 'output/crops/images126_1280_5120.tiff_909_1133_909_1107_939_1107_939_1132.jpg']</td><td>['heavy_equipment', 'medium_vehicle', 'medium_vehicle', 'heavy_equipment', 'small_aircraft']</td><td>[93, 99, 103, 104, 114]</td><td>0.9573</td><td>5</td></tr>\n", "    <tr><th>3</th><td>64</td><td>64</td><td>['output/crops/images126_0_5120.tiff_1134_1049_1134_1061_1129_1061_1129_1050.jpg', 'output/crops/images126_1280_5120.tiff_267_1221_253_1206_259_1201_273_1215.jpg']</td><td>['small_vessel', 'heavy_equipment']</td><td>[94, 115]</td><td>0.9554</td><td>2</td></tr>\n", "    <tr><th>4</th><td>15</td><td>15</td><td>['output/crops/images126_0_0.tiff_964_831_981_832_981_839_964_838.jpg', 'output/crops/images126_0_5120.tiff_987_1097_997_1097_997_1101_986_1101.jpg', 'output/crops/images126_0_5120.tiff_1149_1050_1149_1065_1143_1065_1143_1051.jpg', 'output/crops/images126_0_5120.tiff_1163_998_1174_998_1174_1003_1163_1003.jpg', 'output/crops/images126_0_5120.tiff_1063_1171_1075_1171_1075_1176_1063_1177.jpg', 'output/crops/images126_0_5120.tiff_1124_1050_1125_1064_1120_1064_1119_1051.jpg', 'output/crops/images126_1280_5120.tiff_1049_127_1064_127_1064_134_1049_134.jpg', 'output/crops/images126_0_5120.tiff_1228_1005_1243_1005_1243_1011_1228_1011.jpg', 'output/crops/images126_1280_5120.tiff_931_143_937_143_937_161_931_161.jpg', 'output/crops/images126_1280_5120.tiff_300_170_315_170_315_177_300_177.jpg']</td><td>['medium_vehicle', 'small_vessel', 'medium_vessel', 'small_vessel', 'small_vessel', 'small_vessel', 'heavy_equipment', 'medium_vessel', 'medium_vehicle', 'medium_vehicle']</td><td>[15, 48, 55, 67, 75, 80, 87, 97, 106, 107]</td><td>0.9550</td><td>10</td></tr>\n", "  </tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " Unnamed: 0 component_id files label \\\n", "0 45 45 ['output/crops/images126_0_5120.tiff_704_1078_710_1079_709_1091_703_1091.jpg', 'output/crops/images126_0_5120.tiff_991_1081_1004_1081_1004_1086_991_1086.jpg'] ['medium_vehicle', 'small_vessel'] \n", "1 59 59 ['output/crops/images126_0_5120.tiff_241_1265_259_1265_259_1273_241_1273.jpg', 'output/crops/images126_0_5120.tiff_1166_1005_1181_1005_1181_1009_1166_1010.jpg'] ['medium_vehicle', 'medium_vessel'] \n", "2 63 63 ['output/crops/images126_1280_5120.tiff_996_134_1012_134_1012_141_996_141.jpg', 'output/crops/images126_1280_5120.tiff_192_81_197_80_197_91_193_91.jpg', 'output/crops/images126_1280_5120.tiff_191_101_196_101_196_111_191_111.jpg', 'output/crops/images126_1280_5120.tiff_1012_148_1030_161_1024_170_1006_156.jpg', 'output/crops/images126_1280_5120.tiff_909_1133_909_1107_939_1107_939_1132.jpg'] ['heavy_equipment', 'medium_vehicle', 'medium_vehicle', 'heavy_equipment', 'small_aircraft'] \n", "3 64 64 ['output/crops/images126_0_5120.tiff_1134_1049_1134_1061_1129_1061_1129_1050.jpg', 'output/crops/images126_1280_5120.tiff_267_1221_253_1206_259_1201_273_1215.jpg'] ['small_vessel', 'heavy_equipment'] \n", "4 15 15 ['output/crops/images126_0_0.tiff_964_831_981_832_981_839_964_838.jpg', 'output/crops/images126_0_5120.tiff_987_1097_997_1097_997_1101_986_1101.jpg', 'output/crops/images126_0_5120.tiff_1149_1050_1149_1065_1143_1065_1143_1051.jpg', 'output/crops/images126_0_5120.tiff_1163_998_1174_998_1174_1003_1163_1003.jpg', 'output/crops/images126_0_5120.tiff_1063_1171_1075_1171_1075_1176_1063_1177.jpg', 'output/crops/images126_0_5120.tiff_1124_1050_1125_1064_1120_1064_1119_1051.jpg', 'output/crops/images126_1280_5120.tiff_1049_127_1064_127_1064_134_1049_134.jpg', 'output/crops/images126_0_5120.tiff_1228_1005_1243_1005_1243_1011_1228_1011.jpg', 'output/crops/images126_1280_5120.tiff_931_143_937_143_937_161_931_161.jpg', 'output/crops/images126_1280_5120.tiff_300_170_315_170_315_177_300_177.jpg'] 
['medium_vehicle', 'small_vessel', 'medium_vessel', 'small_vessel', 'small_vessel', 'small_vessel', 'heavy_equipment', 'medium_vessel', 'medium_vehicle', 'medium_vehicle'] \n", "\n", " files_ids distance len \n", "0 [50, 72] 0.9688 2 \n", "1 [88, 90] 0.9576 2 \n", "2 [93, 99, 103, 104, 114] 0.9573 5 \n", "3 [94, 115] 0.9554 2 \n", "4 [15, 48, 55, 67, 75, 80, 87, 97, 106, 107] 0.9550 10 " ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.head()" ] }, { "cell_type": "markdown", "id": "937b6733", "metadata": {}, "source": [ "Looking at good labels" ] }, { "cell_type": "code", "execution_count": 16, "id": "5225bde9-baea-4a45-92fd-baab7d6d4553", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Traceback (most recent call last):\n", " File \"/home/dnth/anaconda3/envs/fastdup/lib/python3.10/site-packages/fastdup/__init__.py\", line 1376, in create_components_gallery\n", " ret = do_create_components_gallery(work_dir, save_path, num_images, lazy_load, get_label_func, group_by, slice,\n", " File \"/home/dnth/anaconda3/envs/fastdup/lib/python3.10/site-packages/fastdup/galleries.py\", line 1399, in do_create_components_gallery\n", " ret = visualize_top_components(work_dir, save_dir, num_images,\n", " File \"/home/dnth/anaconda3/envs/fastdup/lib/python3.10/site-packages/fastdup/galleries.py\", line 795, in visualize_top_components\n", " top_components = do_find_top_components(work_dir=work_dir, get_label_func=get_label_func, group_by=group_by,\n", " File \"/home/dnth/anaconda3/envs/fastdup/lib/python3.10/site-packages/fastdup/galleries.py\", line 1236, in do_find_top_components\n", " assert len(comps), \"No components found with more than one image/video\"\n", "AssertionError: No components found with more than one image/video\n" ] } ], "source": [ "fd.vis.component_gallery(load_crops=True, enhance_image=True, keep_aspect_ratio=True,\n", " slice='same', num_images=20, save_artifacts=True)" ] }, { 
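"cell_type": "markdown", "id": "c9a4e1f0", "metadata": {}, "source": [ "The list-valued columns of `components.csv` come back from `pd.read_csv` as plain strings. The cell below is a minimal sketch, not part of the original analysis, that parses them with `ast.literal_eval` and expands the table to one row per crop, so each crop file can be matched to its component id and label. It assumes the column names shown in the table above (`component_id`, `files`, `label`, `files_ids`) and pandas 1.3 or newer for multi-column `explode`; the names `comps` and `crops` are illustrative." ] }, { "cell_type": "code", "execution_count": null, "id": "5d2b7a63", "metadata": {}, "outputs": [], "source": [ "import ast\n", "import pandas as pd\n", "\n", "# Assumption: components.csv has the columns shown in the table above.\n", "comps = pd.read_csv('output/galleries/components.csv')\n", "\n", "# read_csv returns the list-valued columns as strings, so parse them back into lists.\n", "list_cols = ['files', 'label', 'files_ids']\n", "for col in list_cols:\n", "    comps[col] = comps[col].apply(ast.literal_eval)\n", "\n", "# One row per crop: component id, crop file, label and crop index.\n", "crops = comps[['component_id'] + list_cols].explode(list_cols, ignore_index=True)\n", "crops.head()" ] }, { 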
"cell_type": "markdown", "id": "8bf752ae", "metadata": {}, "source": [ "## Outliers\n", "\n", "Let's look on outliers on the satellite image level" ] }, { "cell_type": "code", "execution_count": 17, "id": "4082bd38-22ab-445b-a9a2-a72856352870", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:00<00:00, 26144.37it/s]" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Stored outliers visual view in output/galleries/outliers.html\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\n" ] }, { "data": { "text/html": [ " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " Outliers Report\n", " \n", " \n", "\n", "\n", "\n", "
<h1>Outliers Report</h1>\n", "<p>Showing image outliers, one per row</p>\n", "<table>\n", "  <thead>\n", "    <tr><th>Distance</th><th>Path</th><th>label</th></tr>\n", "  </thead>\n", "  <tbody>\n", "    <tr><td>0.46795</td><td>/crops/images126_1280_5120tiff_333_977_331_879_448_877_449_975jpg</td><td>large_aircraft</td></tr>\n", "    <tr><td>0.848818</td><td>/crops/images126_0_2560tiff_1221_1277_1244_1273_1245_1280_1222_1283jpg</td><td>bus</td></tr>\n", "    <tr><td>0.855832</td><td>/crops/images126_0_0tiff_7_262_25_261_26_269_7_270jpg</td><td>large_vehicle</td></tr>\n", "    <tr><td>0.858068</td><td>/crops/images126_1280_5120tiff_-2_933_47_930_52_991_1_994jpg</td><td>large_aircraft</td></tr>\n", "    <tr><td>0.859666</td><td>/crops/images126_1280_5120tiff_267_1221_253_1206_259_1201_273_1215jpg</td><td>heavy_equipment</td></tr>\n", "    <tr><td>0.863308</td><td>/crops/images126_0_0tiff_1059_237_1079_225_1084_232_1064_244jpg</td><td>heavy_equipment</td></tr>\n", "    <tr><td>0.867095</td><td>/crops/images126_1280_5120tiff_601_1050_600_1015_642_1015_643_1049jpg</td><td>small_aircraft</td></tr>\n", "  </tbody>\n", "</table>\n", "
\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "0" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fd.vis.outliers_gallery()" ] }, { "cell_type": "markdown", "id": "b6a420e5", "metadata": {}, "source": [ "Now we look at outliers at the crop level" ] }, { "cell_type": "code", "execution_count": 18, "id": "925c986e-18d9-4a6f-adc5-2cd7949f8424", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:00<00:00, 17445.11it/s]" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Stored outliers visual view in output/galleries/outliers.html\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\n" ] }, { "data": { "text/html": [ " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " Outliers Report\n", " \n", " \n", "\n", "\n", "\n", "
<h1>Outliers Report</h1>\n", "<p>Showing image outliers, one per row</p>\n", "<table>\n", "  <thead>\n", "    <tr><th>Distance</th><th>Path</th><th>label</th></tr>\n", "  </thead>\n", "  <tbody>\n", "    <tr><td>0.46795</td><td>/crops/images126_1280_5120tiff_333_977_331_879_448_877_449_975jpg</td><td>large_aircraft</td></tr>\n", "    <tr><td>0.848818</td><td>/crops/images126_0_2560tiff_1221_1277_1244_1273_1245_1280_1222_1283jpg</td><td>bus</td></tr>\n", "    <tr><td>0.855832</td><td>/crops/images126_0_0tiff_7_262_25_261_26_269_7_270jpg</td><td>large_vehicle</td></tr>\n", "    <tr><td>0.858068</td><td>/crops/images126_1280_5120tiff_-2_933_47_930_52_991_1_994jpg</td><td>large_aircraft</td></tr>\n", "    <tr><td>0.859666</td><td>/crops/images126_1280_5120tiff_267_1221_253_1206_259_1201_273_1215jpg</td><td>heavy_equipment</td></tr>\n", "    <tr><td>0.863308</td><td>/crops/images126_0_0tiff_1059_237_1079_225_1084_232_1064_244jpg</td><td>heavy_equipment</td></tr>\n", "    <tr><td>0.867095</td><td>/crops/images126_1280_5120tiff_601_1050_600_1015_642_1015_643_1049jpg</td><td>small_aircraft</td></tr>\n", "  </tbody>\n", "</table>\n", "
\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "0" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fd.vis.outliers_gallery(load_crops=True)" ] }, { "cell_type": "markdown", "id": "ad571f11", "metadata": {}, "source": [ "## Brightest Image\n", "\n", "We look for the brightest satellite images" ] }, { "cell_type": "code", "execution_count": 19, "id": "4a861aab-50a2-4f39-944e-f139fe60327a", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:00<00:00, 6562.32it/s]" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Stored mean visual view in output/galleries/mean.html\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\n" ] }, { "data": { "text/html": [ " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " Bright Image Report\n", " \n", " \n", "\n", "\n", "\n", "
\n", "
\n", "
\n", " \n", " \"logo\"\n", " \n", "
\n", " \n", "
\n", "
\n", "
\n", "

Bright Image Report

Showing example images, sort by descending order

\n", "
\n", "
\n", "
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
mean115.5904
filenameoutput/crops/images126_0_0.tiff_949_234_950_219_956_220_955_234.jpg
labelN/A
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
mean: 92.5701
filename: output/crops/images126_0_0.tiff_1030_250_1036_246_1041_256_1034_259.jpg
label: N/A
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
mean: 91.7934
filename: output/crops/images126_1280_5120.tiff_601_1050_600_1015_642_1015_643_1049.jpg
label: N/A
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
mean: 90.4244
filename: output/crops/images126_1280_5120.tiff_794_1192_794_1157_833_1156_834_1192.jpg
label: N/A
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
mean: 90.1924
filename: output/crops/images126_0_0.tiff_1059_237_1079_225_1084_232_1064_244.jpg
label: N/A
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
mean: 89.8846
filename: output/crops/images126_0_0.tiff_1221_423_1229_404_1236_407_1229_426.jpg
label: N/A
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
mean: 88.5196
filename: output/crops/images126_1280_5120.tiff_1012_148_1030_161_1024_170_1006_156.jpg
label: N/A
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
mean: 88.3636
filename: output/crops/images126_1280_5120.tiff_996_134_1012_134_1012_141_996_141.jpg
label: N/A
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
mean: 86.5223
filename: output/crops/images126_1280_5120.tiff_592_900_592_889_611_888_611_900.jpg
label: N/A
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
mean: 85.2022
filename: output/crops/images126_0_5120.tiff_20_1049_28_1057_24_1060_16_1052.jpg
label: N/A
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
mean: 84.5831
filename: output/crops/images126_1280_5120.tiff_889_1005_888_982_917_981_917_1005.jpg
label: N/A
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
mean: 82.7721
filename: output/crops/images126_0_0.tiff_1028_192_1033_185_1039_190_1033_197.jpg
label: N/A
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
mean: 82.3681
filename: output/crops/images126_0_1280.tiff_330_829_348_832_347_841_329_838.jpg
label: N/A
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
mean: 79.9242
filename: output/crops/images126_1280_5120.tiff_909_1133_909_1107_939_1107_939_1132.jpg
label: N/A
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
mean: 78.291
filename: output/crops/images126_1280_5120.tiff_907_1078_906_1055_924_1054_925_1078.jpg
label: N/A
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
mean: 78.1487
filename: output/crops/images126_1280_5120.tiff_906_1106_906_1080_935_1080_936_1106.jpg
label: N/A
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
mean: 77.239
filename: output/crops/images126_0_5120.tiff_1059_1066_1059_1079_1055_1079_1055_1066.jpg
label: N/A
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
mean: 77.0893
filename: output/crops/images126_1280_5120.tiff_1062_124_1080_125_1079_133_1061_132.jpg
label: N/A
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
mean: 76.4959
filename: output/crops/images126_0_2560.tiff_595_253_615_258_613_266_594_261.jpg
label: N/A
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
mean: 76.3903
filename: output/crops/images126_0_1280.tiff_150_799_170_798_171_806_150_807.jpg
label: N/A
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", " \n", "
\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "0" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fd.vis.stats_gallery(metric='mean')" ] }, { "cell_type": "markdown", "id": "00277963", "metadata": {}, "source": [ "## Blurry Images \n", "Now we look for the most blurry images" ] }, { "cell_type": "code", "execution_count": 20, "id": "c0a2d9d9-5180-4ebe-b073-f7feef1e4c6d", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:00<00:00, 6341.55it/s]\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Stored blur visual view in output/galleries/blur.html\n" ] }, { "data": { "text/html": [ " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " Blurry Image Report\n", " \n", " \n", "\n", "\n", "\n", "
\n", "
\n", "
\n", " \n", " \"logo\"\n", " \n", "
\n", " \n", "
\n", "
\n", "
\n", "

Blurry Image Report

Showing example images, sort by ascending order

\n", "
\n", "
\n", "
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
blur: 5.0175
filename: output/crops/images126_1280_5120.tiff_267_1221_253_1206_259_1201_273_1215.jpg
label: heavy_equipment
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
blur: 5.6971
filename: output/crops/images126_0_3840.tiff_631_554_638_551_647_568_641_572.jpg
label: heavy_equipment
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
blur: 5.7064
filename: output/crops/images126_0_0.tiff_964_831_981_832_981_839_964_838.jpg
label: medium_vehicle
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
blur: 7.9295
filename: output/crops/images126_0_0.tiff_1121_877_1137_879_1137_885_1120_884.jpg
label: medium_vehicle
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
blur: 8.1925
filename: output/crops/images126_0_2560.tiff_482_267_493_277_488_282_477_271.jpg
label: medium_vehicle
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
blur: 8.464
filename: output/crops/images126_0_2560.tiff_621_487_624_477_629_478_626_489.jpg
label: heavy_equipment
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
blur: 9.0193
filename: output/crops/images126_0_3840.tiff_864_534_871_538_863_554_856_551.jpg
label: heavy_equipment
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
blur: 10.5485
filename: output/crops/images126_0_2560.tiff_556_455_570_454_570_462_557_463.jpg
label: heavy_equipment
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
blur: 11.0871
filename: output/crops/images126_0_0.tiff_965_859_985_859_985_866_965_865.jpg
label: large_vehicle
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
blur: 11.5538
filename: output/crops/images126_0_0.tiff_985_867_1001_868_1000_875_984_873.jpg
label: medium_vehicle
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
blur: 11.8231
filename: output/crops/images126_0_1280.tiff_527_824_557_825_557_836_526_835.jpg
label: heavy_equipment
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
blur: 12.157
filename: output/crops/images126_0_5120.tiff_704_1078_710_1079_709_1091_703_1091.jpg
label: medium_vehicle
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
blur: 12.5293
filename: output/crops/images126_0_5120.tiff_688_1078_694_1078_694_1092_688_1092.jpg
label: medium_vehicle
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
blur: 12.76
filename: output/crops/images126_0_0.tiff_1012_839_1031_841_1030_848_1011_846.jpg
label: large_vehicle
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
blur: 13.6241
filename: output/crops/images126_0_2560.tiff_307_267_315_267_314_283_306_282.jpg
label: heavy_equipment
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
blur: 14.0418
filename: output/crops/images126_0_1280.tiff_584_919_592_918_593_935_584_935.jpg
label: large_vehicle
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
blur: 14.3275
filename: output/crops/images126_0_3840.tiff_226_1209_245_1205_247_1213_228_1217.jpg
label: large_vehicle
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
blur: 14.3412
filename: output/crops/images126_0_0.tiff_1044_803_1057_803_1056_810_1044_809.jpg
label: medium_vehicle
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
blur: 14.3537
filename: output/crops/images126_0_2560.tiff_574_412_576_394_582_395_580_413.jpg
label: large_vehicle
\n", "
\n", "
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", "
\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", "\n", " \n", " \n", "\n", " \n", "
Info
blur: 14.4656
filename: output/crops/images126_0_2560.tiff_528_419_546_427_540_439_523_431.jpg
label: heavy_equipment
\n", "
\n", "
\n", "
\n", " \n", "
\n", "
\n", " \n", "
\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "0" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fd.vis.stats_gallery(metric='blur', load_crops=True)" ] }, { "cell_type": "markdown", "id": "a8fe9bbf-6be1-4907-b555-53605befbf6d", "metadata": {}, "source": [ "## Wrap Up\n", "\n", "Next, feel free to check out our other tutorials:\n", "\n", "+ ⚡ [**Quickstart**](https://nbviewer.org/github/visual-layer/fastdup/blob/main/examples/quick-dataset-analysis.ipynb): Learn how to install fastdup, load a dataset, analyze it for potential issues such as duplicates/near-duplicates, broken images, outliers, and dark/bright/blurry images, and view visually similar image clusters. If you're new, start here!\n", "+ 🧹 [**Clean Image Folder**](https://nbviewer.org/github/visual-layer/fastdup/blob/main/examples/cleaning-image-dataset.ipynb): Learn how to analyze a folder of images, clean it of potential issues, and export a list of problematic files for further action. If you have an unorganized folder of images, this is a good place to start.\n", "+ 🖼 [**Analyze Image Classification Dataset**](https://nbviewer.org/github/visual-layer/fastdup/blob/main/examples/analyzing-image-classification-dataset.ipynb): Learn how to load a labeled image classification dataset and analyze it for potential issues. If you have a labeled ImageNet-style folder structure, have a go!\n", "+ 🎁 [**Analyze Object Detection Dataset**](https://nbviewer.org/github/visual-layer/fastdup/blob/main/examples/analyzing-object-detection-dataset.ipynb): Learn how to load bounding box annotations for object detection and analyze them for potential issues. If you have a COCO-style labeled object detection dataset, give this example a try." 
] }, { "cell_type": "markdown", "id": "12d9492f", "metadata": {}, "source": [ "\n", "## VL Profiler\n", "If you prefer a no-code platform to inspect and visualize your dataset, [**try our free cloud product VL Profiler**](https://app.visual-layer.com). VL Profiler is our first no-code commercial product; it lets you visualize and inspect your dataset in your browser.\n", "\n", "[Sign up](https://app.visual-layer.com) now; it's free.\n", "\n", "[![image](https://raw.githubusercontent.com/visual-layer/fastdup/main/gallery/vl_profiler_promo.svg)](https://app.visual-layer.com)\n", "\n", "As usual, feedback is welcome!\n", "\n", "Questions? Drop by our [Slack channel](https://visualdatabase.slack.com/join/shared_invite/zt-19jaydbjn-lNDEDkgvSI1QwbTXSY6dlA#/shared-invite/email) or open an issue on [GitHub](https://github.com/visual-layer/fastdup/issues)." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.11" } }, "nbformat": 4, "nbformat_minor": 5 }