🌐 Language

English | 简体中文 | 繁體中文 | 日本語 | 한국어 | हिन्दी | ไทย | Français | Deutsch | Español | Italiano | Русский | Português | Nederlands | Polski | العربية | فارسی | Türkçe | Tiếng Việt | Bahasa Indonesia

# 🚀 No Time to Train! ### Training-Free Reference-Based Instance Segmentation [![GitHub](https://img.shields.io/badge/%E2%80%8B-No%20Time%20To%20Train-black?logo=github)](https://github.com/miquel-espinosa/no-time-to-train) [![Website](https://img.shields.io/badge/🌐-Project%20Page-grey)](https://miquel-espinosa.github.io/no-time-to-train/) [![arXiv](https://img.shields.io/badge/arXiv-2507.02798-b31b1b)](https://arxiv.org/abs/2507.02798) **State-of-the-art (Papers with Code)** [**_SOTA 1-shot_**](https://paperswithcode.com/sota/few-shot-object-detection-on-ms-coco-1-shot?p=no-time-to-train-training-free-reference) | [![PWC](https://img.shields.io/badge/State%20of%20the%20Art-Few--Shot%20Object%20Detection%20on%20MS--COCO%20(1--shot)-21CBCE?style=flat&logo=paperswithcode)](https://paperswithcode.com/sota/few-shot-object-detection-on-ms-coco-1-shot?p=no-time-to-train-training-free-reference) [**_SOTA 10-shot_**](https://paperswithcode.com/sota/few-shot-object-detection-on-ms-coco-10-shot?p=no-time-to-train-training-free-reference) | [![PWC](https://img.shields.io/badge/State%20of%20the%20Art-Few--Shot%20Object%20Detection%20on%20MS--COCO%20(10--shot)-21CBCE?style=flat&logo=paperswithcode)](https://paperswithcode.com/sota/few-shot-object-detection-on-ms-coco-10-shot?p=no-time-to-train-training-free-reference) [**_SOTA 30-shot_**](https://paperswithcode.com/sota/few-shot-object-detection-on-ms-coco-30-shot?p=no-time-to-train-training-free-reference) | [![PWC](https://img.shields.io/badge/State%20of%20the%20Art-Few--Shot%20Object%20Detection%20on%20MS--COCO%20(30--shot)-21CBCE?style=flat&logo=paperswithcode)](https://paperswithcode.com/sota/few-shot-object-detection-on-ms-coco-30-shot?p=no-time-to-train-training-free-reference)

--- > 🚨 **Update (5th February 2026)**: The paper manuscript has been updated with extensive ablation studies, visualisations and additional experiments. > > 🚨 **Update (22nd July 2025):** Instructions for custom datasets have been added! > > 🔔 **Update (16th July 2025):** Code has been updated with instructions! --- ## 📋 Table of Contents - [🎯 Highlights](#-highlights) - [📜 Abstract](#-abstract) - [🧠 Architecture](#-architecture) - [🛠️ Installation instructions](#️-installation-instructions) - [1. Clone the repository](#1-clone-the-repository) - [2. Create conda environment](#2-create-conda-environment) - [3. Install SAM2 and DINOv2](#3-install-sam2-and-dinov2) - [4. Download datasets](#4-download-datasets) - [5. Download SAM2 and DINOv2 checkpoints](#5-download-sam2-and-dinov2-checkpoints) - [📊 Inference code: Reproduce 30-shot SOTA results in Few-shot COCO](#-inference-code) - [0. Create reference set](#0-create-reference-set) - [1. Fill memory with references](#1-fill-memory-with-references) - [2. Post-process memory bank](#2-post-process-memory-bank) - [3. Inference on target images](#3-inference-on-target-images) - [Results](#results) - [🔍 Custom dataset](#-custom-dataset) - [0. Prepare a custom dataset ⛵🐦](#0-prepare-a-custom-dataset) - [0.1 If only bbox annotations are available](#01-if-only-bbox-annotations-are-available) - [0.2 Convert coco annotations to pickle file](#02-convert-coco-annotations-to-pickle-file) - [1. Fill memory with references](#1-fill-memory-with-references) - [2. Post-process memory bank](#2-post-process-memory-bank) - [📚 Citation](#-citation) ## 🎯 Highlights - 💡 **Training-Free**: No fine-tuning, no prompt engineering—just a reference image. - 🖼️ **Reference-Based**: Segment new objects using just a few examples. - 🔥 **SOTA Performance**: Outperforms previous training-free approaches on COCO, PASCAL VOC, and Cross-Domain FSOD. **Links:** - 🧾 [**arXiv Paper**](https://arxiv.org/abs/2507.02798) - 🌐 [**Project Website**](https://miquel-espinosa.github.io/no-time-to-train/) - 📈 [**Papers with Code**](https://paperswithcode.com/paper/no-time-to-train-training-free-reference) ## 📜 Abstract > The performance of image segmentation models has historically been constrained by the high cost of collecting large-scale annotated data. The Segment Anything Model (SAM) alleviates this original problem through a promptable, semantics-agnostic, segmentation paradigm and yet still requires manual visual-prompts or complex domain-dependent prompt-generation rules to process a new image. Towards reducing this new burden, our work investigates the task of object segmentation when provided with, alternatively, only a small set of reference images. Our key insight is to leverage strong semantic priors, as learned by foundation models, to identify corresponding regions between a reference and a target image. We find that correspondences enable automatic generation of instance-level segmentation masks for downstream tasks and instantiate our ideas via a multi-stage, training-free method incorporating (1) memory bank construction; (2) representation aggregation and (3) semantic-aware feature matching. Our experiments show significant improvements on segmentation metrics, leading to state-of-the-art performance on COCO FSOD (36.8% nAP), PASCAL VOC Few-Shot (71.2% nAP50) and outperforming existing training-free approaches on the Cross-Domain FSOD benchmark (22.4% nAP). ![cdfsod-results-final-comic-sans-min](https://github.com/user-attachments/assets/ab302c02-c080-4042-99fc-0e181ba8abb9) ## 🧠 Architecture ![training-free-architecture-comic-sans-min](https://github.com/user-attachments/assets/d84dd83a-505e-45a0-8ce3-98e1838017f9) ## 🛠️ Installation instructions ### 1. Clone the repository ```bash git clone https://github.com/miquel-espinosa/no-time-to-train.git cd no-time-to-train ``` ### 2. Create conda environment We will create a conda environment with the required packages. ```bash conda env create -f environment.yml conda activate no-time-to-train ``` ### 3. Install SAM2 and DINOv2 We will install SAM2 and DINOv2 from source. ```bash pip install -e . cd dinov2 pip install -e . cd .. ``` ### 4. Download datasets Please download COCO dataset and place it in `data/coco` ### 5. Download SAM2 and DINOv2 checkpoints We will download the exact SAM2 checkpoints used in the paper. (Note, however, that SAM2.1 checkpoints are already available and might perform better.) ```bash mkdir -p checkpoints/dinov2 cd checkpoints wget https://dl.fbaipublicfiles.com/segment_anything_2/072824/sam2_hiera_large.pt cd dinov2 wget https://dl.fbaipublicfiles.com/dinov2/dinov2_vitl14/dinov2_vitl14_pretrain.pth cd ../.. ``` ## 📊 Inference code ⚠️ Disclaimer: This is research code — expect a bit of chaos! ### Reproducing 30-shot SOTA results in Few-shot COCO Define useful variables and create a folder for results: ```bash CONFIG=./no_time_to_train/new_exps/coco_fewshot_10shot_Sam2L.yaml CLASS_SPLIT="few_shot_classes" RESULTS_DIR=work_dirs/few_shot_results SHOTS=30 SEED=33 GPUS=4 mkdir -p $RESULTS_DIR FILENAME=few_shot_${SHOTS}shot_seed${SEED}.pkl ``` #### 0. Create reference set ```bash python no_time_to_train/dataset/few_shot_sampling.py \ --n-shot $SHOTS \ --out-path ${RESULTS_DIR}/${FILENAME} \ --seed $SEED \ --dataset $CLASS_SPLIT ``` #### 1. Fill memory with references ```bash python run_lightening.py test --config $CONFIG \ --model.test_mode fill_memory \ --out_path ${RESULTS_DIR}/memory.ckpt \ --model.init_args.model_cfg.memory_bank_cfg.length $SHOTS \ --model.init_args.dataset_cfgs.fill_memory.memory_pkl ${RESULTS_DIR}/${FILENAME} \ --model.init_args.dataset_cfgs.fill_memory.memory_length $SHOTS \ --model.init_args.dataset_cfgs.fill_memory.class_split $CLASS_SPLIT \ --trainer.logger.save_dir ${RESULTS_DIR}/ \ --trainer.devices $GPUS ``` #### 2. Post-process memory bank ```bash python run_lightening.py test --config $CONFIG \ --model.test_mode postprocess_memory \ --model.init_args.model_cfg.memory_bank_cfg.length $SHOTS \ --ckpt_path ${RESULTS_DIR}/memory.ckpt \ --out_path ${RESULTS_DIR}/memory_postprocessed.ckpt \ --trainer.devices 1 ``` #### 3. Inference on target images ```bash python run_lightening.py test --config $CONFIG \ --ckpt_path ${RESULTS_DIR}/memory_postprocessed.ckpt \ --model.init_args.test_mode test \ --model.init_args.model_cfg.memory_bank_cfg.length $SHOTS \ --model.init_args.model_cfg.dataset_name $CLASS_SPLIT \ --model.init_args.dataset_cfgs.test.class_split $CLASS_SPLIT \ --trainer.logger.save_dir ${RESULTS_DIR}/ \ --trainer.devices $GPUS ``` If you'd like to see inference results online (as they are computed), add the argument: ```bash --model.init_args.model_cfg.test.online_vis True ``` To adjust the score threshold `score_thr` parameter, add the argument (for example, visualising all instances with score higher than `0.4`): ```bash --model.init_args.model_cfg.test.vis_thr 0.4 ``` Images will now be saved in `results_analysis/few_shot_classes/`. The image on the left shows the ground truth, the image on the right shows the segmented instances found by our training-free method. Note that in this example we are using the `few_shot_classes` split, thus, we should only expect to see segmented instances of the classes in this split (not all classes in COCO). #### Results After running all images in the validation set, you should obtain: ``` BBOX RESULTS: Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.368 SEGM RESULTS: Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.342 ``` --- ## 🔍 Custom dataset We provide the instructions for running our pipeline on a custom dataset. Annotation format are always in COCO format. > **TLDR;** To directly see how to run full pipeline on *custom datasets*, find `scripts/matching_cdfsod_pipeline.sh` together with example scripts of CD-FSOD datasets (e.g. `scripts/dior_fish.sh`) ### 0. Prepare a custom dataset ⛵🐦 Let's imagine we want to detect **boats**⛵ and **birds**🐦 in a custom dataset. To use our method we will need: - At least 1 *annotated* reference image for each class (i.e. 1 reference image for boat and 1 reference image for bird) - Multiple target images to find instances of our desired classes. We have prepared a toy script to create a custom dataset with coco images, for a **1-shot** setting. ```bash mkdir -p data/my_custom_dataset python scripts/make_custom_dataset.py ``` This will create a custom dataset with the following folder structure: ``` data/my_custom_dataset/ ├── annotations/ │ ├── custom_references.json │ ├── custom_targets.json │ └── references_visualisations/ │ ├── bird_1.jpg │ └── boat_1.jpg └── images/ ├── 429819.jpg ├── 101435.jpg └── (all target and reference images) ``` **Reference images visualisation (1-shot):** | 1-shot Reference Image for BIRD 🐦 | 1-shot Reference Image for BOAT ⛵ | |:---------------------------------:|:----------------------------------:| | bird_1

bird_1

|

boat_1

| ### 0.1 If only bbox annotations are available We also provide a script to generate instance-level segmentation masks by using SAM2. This is useful if you only have bounding box annotations available for the reference images. ```bash # Download sam_h checkpoint. Feel free to use more recent checkpoints (note: code might need to be adapted) wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth -O checkpoints/sam_vit_h_4b8939.pth # Run automatic instance segmentation from ground truth bounding boxes. python no_time_to_train/dataset/sam_bbox_to_segm_batch.py \ --input_json data/my_custom_dataset/annotations/custom_references.json \ --image_dir data/my_custom_dataset/images \ --sam_checkpoint checkpoints/sam_vit_h_4b8939.pth \ --model_type vit_h \ --device cuda \ --batch_size 8 \ --visualize ``` **Reference images with instance-level segmentation masks (generated by SAM2 from gt bounding boxes, 1-shot):** Visualisation of the generated segmentation masks are saved in `data/my_custom_dataset/annotations/custom_references_with_SAM_segm/references_visualisations/`. | 1-shot Reference Image for BIRD 🐦 (automatically segmented with SAM) | 1-shot Reference Image for BOAT ⛵ (automatically segmented with SAM) | |:---------------------------------:|:----------------------------------:| | bird_1_with_SAM_segm

bird_1_with_SAM_segm

|

boat_1_with_SAM_segm

| ### 0.2 Convert coco annotations to pickle file ```bash python no_time_to_train/dataset/coco_to_pkl.py \ data/my_custom_dataset/annotations/custom_references_with_segm.json \ data/my_custom_dataset/annotations/custom_references_with_segm.pkl \ 1 ``` ### 1. Fill memory with references First, define useful variables and create a folder for results. For correct visualisation of labels, class names should be ordered by category id as appears in the json file. E.g. `bird` has category id `16`, `boat` has category id `9`. Thus, `CAT_NAMES=boat,bird`. ```bash DATASET_NAME=my_custom_dataset DATASET_PATH=data/my_custom_dataset CAT_NAMES=boat,bird CATEGORY_NUM=2 SHOT=1 YAML_PATH=no_time_to_train/pl_configs/matching_cdfsod_template.yaml PATH_TO_SAVE_CKPTS=./tmp_ckpts/my_custom_dataset mkdir -p $PATH_TO_SAVE_CKPTS ``` Run step 1: ```bash python run_lightening.py test --config $YAML_PATH \ --model.test_mode fill_memory \ --out_path $PATH_TO_SAVE_CKPTS/$DATASET_NAME\_$SHOT\_refs_memory.pth \ --model.init_args.dataset_cfgs.fill_memory.root $DATASET_PATH/images \ --model.init_args.dataset_cfgs.fill_memory.json_file $DATASET_PATH/annotations/custom_references_with_segm.json \ --model.init_args.dataset_cfgs.fill_memory.memory_pkl $DATASET_PATH/annotations/custom_references_with_segm.pkl \ --model.init_args.dataset_cfgs.fill_memory.memory_length $SHOT \ --model.init_args.dataset_cfgs.fill_memory.cat_names $CAT_NAMES \ --model.init_args.model_cfg.dataset_name $DATASET_NAME \ --model.init_args.model_cfg.memory_bank_cfg.length $SHOT \ --model.init_args.model_cfg.memory_bank_cfg.category_num $CATEGORY_NUM \ --trainer.devices 1 ``` ### 2. Post-process memory bank ```bash python run_lightening.py test --config $YAML_PATH \ --model.test_mode postprocess_memory \ --ckpt_path $PATH_TO_SAVE_CKPTS/$DATASET_NAME\_$SHOT\_refs_memory.pth \ --out_path $PATH_TO_SAVE_CKPTS/$DATASET_NAME\_$SHOT\_refs_memory_postprocessed.pth \ --model.init_args.model_cfg.dataset_name $DATASET_NAME \ --model.init_args.model_cfg.memory_bank_cfg.length $SHOT \ --model.init_args.model_cfg.memory_bank_cfg.category_num $CATEGORY_NUM \ --trainer.devices 1 ``` #### 2.1 Visualise post-processed memory bank ```bash python run_lightening.py test --config $YAML_PATH \ --model.test_mode vis_memory \ --ckpt_path $PATH_TO_SAVE_CKPTS/$DATASET_NAME\_$SHOT\_refs_memory_postprocessed.pth \ --model.init_args.dataset_cfgs.fill_memory.root $DATASET_PATH/images \ --model.init_args.dataset_cfgs.fill_memory.json_file $DATASET_PATH/annotations/custom_references_with_segm.json \ --model.init_args.dataset_cfgs.fill_memory.memory_pkl $DATASET_PATH/annotations/custom_references_with_segm.pkl \ --model.init_args.dataset_cfgs.fill_memory.memory_length $SHOT \ --model.init_args.dataset_cfgs.fill_memory.cat_names $CAT_NAMES \ --model.init_args.model_cfg.dataset_name $DATASET_NAME \ --model.init_args.model_cfg.memory_bank_cfg.length $SHOT \ --model.init_args.model_cfg.memory_bank_cfg.category_num $CATEGORY_NUM \ --trainer.devices 1 ``` PCA and K-means visualisations for the memory bank images are stored in `results_analysis/memory_vis/my_custom_dataset`. ### 3. Inference on target images If `ONLINE_VIS` is set to True, prediction results will be saved in `results_analysis/my_custom_dataset/` and displayed as they are computed. NOTE that running with online visualisation is much slower. Feel free to change the score threshold `VIS_THR` to see more or less segmented instances. ```bash ONLINE_VIS=True VIS_THR=0.4 python run_lightening.py test --config $YAML_PATH \ --model.test_mode test \ --ckpt_path $PATH_TO_SAVE_CKPTS/$DATASET_NAME\_$SHOT\_refs_memory_postprocessed.pth \ --model.init_args.model_cfg.dataset_name $DATASET_NAME \ --model.init_args.model_cfg.memory_bank_cfg.length $SHOT \ --model.init_args.model_cfg.memory_bank_cfg.category_num $CATEGORY_NUM \ --model.init_args.model_cfg.test.imgs_path $DATASET_PATH/images \ --model.init_args.model_cfg.test.online_vis $ONLINE_VIS \ --model.init_args.model_cfg.test.vis_thr $VIS_THR \ --model.init_args.dataset_cfgs.test.root $DATASET_PATH/images \ --model.init_args.dataset_cfgs.test.json_file $DATASET_PATH/annotations/custom_targets.json \ --model.init_args.dataset_cfgs.test.cat_names $CAT_NAMES \ --trainer.devices 1 ``` ### Results Performance metrics (with the exact same parameters as commands above) should be: ``` BBOX RESULTS: Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.478 SEGM RESULTS: Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.458 ``` Visual results are saved in `results_analysis/my_custom_dataset/`. Note that our method works for false negatives, that is, images that do not contain any instances of the desired classes. *Click images to enlarge ⬇️* | Target image with boats ⛵ (left GT, right predictions) | Target image with birds 🐦 (left GT, right predictions) | |:----------------------:|:----------------------:| | ![000000459673](https://github.com/user-attachments/assets/678dc15a-dd3b-49d5-9287-6290da16aa6b) | ![000000407180](https://github.com/user-attachments/assets/fe306e48-af49-4d83-ac82-76fac6c456d1) | | Target image with boats and birds ⛵🐦 (left GT, right predictions) | Target image without boats or birds 🚫 (left GT, right predictions) | |:---------------------------------:|:----------------------------------:| | ![000000517410](https://github.com/user-attachments/assets/9849b227-7f43-43d7-81ea-58010a623ad5) | ![000000460598](https://github.com/user-attachments/assets/7587700c-e09d-4cf6-8590-3df129c2568e) | ## 🔬 Ablations ### Backbone ablation To evaluate the transferability of our method across foundation models, we replace both the semantic encoder (DINOv2) and the SAM-based segmenter with several alternatives. **Semantic encoder ablation:** ```bash # CLIP (Sizes: b16, b32, l14, l14@336px) bash scripts/clip/clipl14@336px.sh bash scripts/clip/clipl14.sh bash scripts/clip/clipb16.sh bash scripts/clip/clipb32.sh # DINOV3 (Sizes: b, l, h) bash scripts/dinov3/dinov3b.sh bash scripts/dinov3/dinov3l.sh bash scripts/dinov3/dinov3h.sh # PE (Sizes: g14, l14) bash scripts/pe/PEg14.sh bash scripts/pe/PEl14.sh ``` **Segmenter ablation:** ```bash # SAM2 (Sizes: tiny, small, base+, large) bash scripts/sam2/sam2_tiny.sh bash scripts/sam2/sam2_small.sh bash scripts/sam2/sam2_base_plus.sh bash scripts/baseline/dinov2_sam_baseline.sh # SAM2 Large ``` ### VLM evaluation on COCO few-shot dataset We evaluate QWEN VLM on COCO few-shot dataset. ```bash bash scripts/vl-qwen/ablation-vl-qwen.sh ``` ### Reference image heuristics To understand why different reference images lead to performance variation, we analyse the statistical properties of COCO novel classes annotations. #### ANALYSIS We study three annotation characteristics: (1) mask area (object size), (2) mask center location, and (3) distance to image edges.

Instructions:

```bash # Mask area distribution python no_time_to_train/make_plots/mask_area_distribution.py \ --input data/coco/annotations/instances_val2017.json \ --output no_time_to_train/make_plots/mask_area_distribution/mask_area_distribution.png \ --edges-output no_time_to_train/make_plots/mask_area_distribution/bbox_edge_distance_histograms.png \ --center-output no_time_to_train/make_plots/mask_area_distribution/bbox_center_density.png \ --bins 80 \ --distance-bins 80 \ --disable-center-density # Bbox center positions python no_time_to_train/make_plots/bbox_positions.py \ --per-class-root data/coco/annotations/per_class_instances \ --filename centeredness_2d_hist_plain.png \ --max-cols 6 \ --output-dir ./no_time_to_train/make_plots/bbox_positions \ --outfile grid_bbox_positions.png ```

[OUTPUT] Mask area distribution

mask_area_distribution

[OUTPUT] Bbox center density

grid_bbox_positions

[OUTPUT] Bbox edge distance histograms

bbox_edge_distance_histograms

#### SELECTION We sample 100 diverse reference images per class, explicitly covering a range of mask sizes, centers, and edge distances. Each reference is evaluated on a fixed reduced validation subset.

Instructions:

**Setup script:** `scripts/1shot_ref_ablation/setup.sh`: 1. Create per class json file 2. Analyse specific class 3. Create reference set with different heuristics ```bash bash scripts/1shot_ref_ablation/setup.sh ``` **Run scripts:** `scripts/1shot_ref_ablation/gpu*.sh`: 4. Run pipeline for each reference set ```bash # Example launch script that calls template script for each reference set bash scripts/1shot_ref_ablation/gpu0.sh ```

#### RESULTS We analyze how detection scores correlate with reference image characteristics (mask size, center position, edge distance).

Instructions:

```bash python no_time_to_train/make_plots/heuristics_analysis.py # Outputs: # - no_time_to_train/make_plots/heuristics_analysis/heatmap_bbox_norm_scores.png # - no_time_to_train/make_plots/heuristics_analysis/heatmap_segm_norm_scores.png # - no_time_to_train/make_plots/heuristics_analysis/heatmap_center_bbox_norm_scores_kde_smooth.png # - no_time_to_train/make_plots/heuristics_analysis/heatmap_center_bbox_norm_scores.png # - no_time_to_train/make_plots/heuristics_analysis/heatmap_center_segm_norm_scores_kde_smooth.png # - no_time_to_train/make_plots/heuristics_analysis/heatmap_center_segm_norm_scores.png # - no_time_to_train/make_plots/heuristics_analysis/per_class_area_vs_raw_scores.png # - no_time_to_train/make_plots/heuristics_analysis/all_classes_area_vs_norm_scores.png # - no_time_to_train/make_plots/heuristics_analysis/edge_distance_vs_norm_scores.png # - no_time_to_train/make_plots/heuristics_analysis/bars_area_category_norm_scores.png # - no_time_to_train/make_plots/heuristics_analysis/bars_centered_norm_scores.png # - no_time_to_train/make_plots/heuristics_analysis/bars_avoid_sides_norm_scores.png ```

[OUTPUT] Barplots. Effect of mask area (left) and centeredness (right) on performance

barplot

[OUTPUT] Heatmaps. 2D score maps of performance as a function of mask-center location

heatmap

[OUTPUT] Reference-image performance vs. mask area for all COCO novel classes

class_performance

### Reference-image degradation We evaluate our method under progressively degraded reference images by applying increasing levels of Gaussian blur. ablation-blur

ablation-blur

Instructions:

```bash # Run different blur levels bash scripts/blur_ablation/blur_ablation.sh # Plot grid of blur ablation results python no_time_to_train/make_plots/plot_blur_results.py \ --results-root ./work_dirs/blur_ablation \ --class-id 0 \ --max-cols 4 \ --output-dir ./no_time_to_train/make_plots/blur_ablation \ --outfile grid_blur_ablation_class_0.png ```

### Feature similarity Script for visualising feature similarity between reference images and target images. It generates single-feature similarity (path features), and prototype-based similarity (aggregated features). feature_similarity_small

feature_similarity_small

Instructions:

```bash python no_time_to_train/make_plots/feature_similarity.py \ --classes orange \ --num-images 20 \ --min-area 12 \ --max-area 25000 \ --min-instances 2 \ --seed 123 \ --max-per-class 12 ```

### T-SNE plots (DINOv2 feature separability) t-SNE of DINOv2 features shows clear separation for dissimilar classes but heavy overlap for similar ones, suggesting that confusion stems from backbone feature geometry rather than prototypes selection. tsne

tsne

Instructions:

Extract features ```bash python no_time_to_train/make_plots/tsne-coco.py --extract ``` Plot T-SNE plots ```bash # Example spoon vs fork python no_time_to_train/make_plots/tsne-coco.py --classes cat dog ```

## 🛠️ Helpers ### Visualise memory add image feature_comparison_small.png here

Instructions

To visualise the memory bank (PCA and K-means visualisations) for a given experiment, adjust the following command. Set `DO_NOT_CROP` to True/False (in `no_time_to_train/models/Sam2MatchingBaseline_noAMG.py`) to visualise the reference image with/without the cropped mask. ```bash python run_lightening.py test --config $CONFIG \ --model.test_mode vis_memory \ --ckpt_path $RESULTS_DIR/memory_postprocessed.ckpt \ --model.init_args.dataset_cfgs.fill_memory.memory_pkl $RESULTS_DIR/$FILENAME \ --model.init_args.dataset_cfgs.fill_memory.memory_length $SHOT \ --model.init_args.dataset_cfgs.fill_memory.class_split $CLASS_SPLIT \ --model.init_args.model_cfg.dataset_name $CLASS_SPLIT \ --model.init_args.model_cfg.memory_bank_cfg.length $SHOT \ --model.init_args.model_cfg.memory_bank_cfg.category_num $CATEGORY_NUM \ --trainer.devices 1 ```

### Resize images to 512x512 (make the images square) To resize the images to 512x512 and save them to a new directory, run the following command. This is for the paper figures.

Instructions:

```bash python no_time_to_train/make_plots/paper_fig_square_imgs.py ```

### Model size and memory To calculate the model size and memory, run the following command.

Instructions:

- See `no_time_to_train/models/Sam2MatchingBaseline_noAMG_model_and_memory.py` for the model size and memory calculation. (Easiest: temporarily replace by Sam2MatchingBaseline_noAMG.py, then rename back.)

## 🌍 EO datasets ### Evaluation scripts (EO datasets) Evaluation scripts can be found in the `scripts/EO` directory. The EO datasets use the `./scripts/EO/EO_template.sh` script to run the evaluation. Each EO experiment run is saved under the `./EO_results` directory. In the experiment folder we store: - The summary.txt file with the configuration, and runtime of the experiment. - The prediction visualisations on the test set (`results_analysis` folder). - The memory visualisations (`memory_vis` folder). - The few-shot annotation pickle file. - The checkpoints of the model (if not cleaned up). ### Figures and tables Additional scripts for generating figures and tables.

Summary latex table of the EO datasets:

```bash python scripts/convert_datasets/summary_table_datasets.py ```

Generate LaTeX table of the EO datasets:

```bash python scripts/paper_figures/table_EO_results.py ./EO_results_no_heuristics ```

Accuracy plot of the EO datasets:

```bash python scripts/paper_figures/plot_EO_accuracy.py \ --input-root ./EO_results \ --output-root ./EO_results ```

Summary of heuristics effect on the EO datasets:

```bash python scripts/paper_figures/plot_EO_heuristic.py \ --no-heuristics ./EO_results_no_heuristics \ --heuristics ./EO_results ```

Runtime plot of the EO datasets:

```bash python scripts/paper_figures/plot_EO_runtime.py \ --input-root ./EO_results \ --output-root ./EO_results ```

Generate EO grid visualisations for paper figure:

```bash python scripts/paper_figures/plot_EO_grid.py \ --root ./EO_results_no_heuristics \ --dataset ISAID \ --shots 1 ```

## 📚 Citation If you use this work, please cite us: ```bibtex @article{espinosa2025notimetotrain, title={No time to train! Training-Free Reference-Based Instance Segmentation}, author={Miguel Espinosa and Chenhongyi Yang and Linus Ericsson and Steven McDonagh and Elliot J. Crowley}, journal={arXiv preprint arXiv:2507.02798}, year={2025}, primaryclass={cs.CV} } ```