# 🚀 No Time to Train!
### Training-Free Reference-Based Instance Segmentation
[GitHub](https://github.com/miquel-espinosa/no-time-to-train) | [Project Website](https://miquel-espinosa.github.io/no-time-to-train/) | [arXiv](https://arxiv.org/abs/2507.02798)
**State-of-the-art (Papers with Code)**
- [**_SOTA 1-shot_**](https://paperswithcode.com/sota/few-shot-object-detection-on-ms-coco-1-shot?p=no-time-to-train-training-free-reference)
- [**_SOTA 10-shot_**](https://paperswithcode.com/sota/few-shot-object-detection-on-ms-coco-10-shot?p=no-time-to-train-training-free-reference)
- [**_SOTA 30-shot_**](https://paperswithcode.com/sota/few-shot-object-detection-on-ms-coco-30-shot?p=no-time-to-train-training-free-reference)
---
> 🚨 **Update (5th February 2026)**: The paper manuscript has been updated with extensive ablation studies, visualisations and additional experiments.
>
> 🚨 **Update (22nd July 2025):** Instructions for custom datasets have been added!
>
> 🔔 **Update (16th July 2025):** Code has been updated with instructions!
---
## 📋 Table of Contents
- [🎯 Highlights](#-highlights)
- [📜 Abstract](#-abstract)
- [🧠 Architecture](#-architecture)
- [🛠️ Installation instructions](#️-installation-instructions)
  - [1. Clone the repository](#1-clone-the-repository)
  - [2. Create conda environment](#2-create-conda-environment)
  - [3. Install SAM2 and DINOv2](#3-install-sam2-and-dinov2)
  - [4. Download datasets](#4-download-datasets)
  - [5. Download SAM2 and DINOv2 checkpoints](#5-download-sam2-and-dinov2-checkpoints)
- [📊 Inference code: Reproduce 30-shot SOTA results in Few-shot COCO](#-inference-code)
  - [0. Create reference set](#0-create-reference-set)
  - [1. Fill memory with references](#1-fill-memory-with-references)
  - [2. Post-process memory bank](#2-post-process-memory-bank)
  - [3. Inference on target images](#3-inference-on-target-images)
  - [Results](#results)
- [🔍 Custom dataset](#-custom-dataset)
  - [0. Prepare a custom dataset ⛵🐦](#0-prepare-a-custom-dataset)
    - [0.1 If only bbox annotations are available](#01-if-only-bbox-annotations-are-available)
    - [0.2 Convert coco annotations to pickle file](#02-convert-coco-annotations-to-pickle-file)
  - [1. Fill memory with references](#1-fill-memory-with-references)
  - [2. Post-process memory bank](#2-post-process-memory-bank)
- [📚 Citation](#-citation)
## 🎯 Highlights
- 💡 **Training-Free**: No fine-tuning, no prompt engineering—just a reference image.
- 🖼️ **Reference-Based**: Segment new objects using just a few examples.
- 🔥 **SOTA Performance**: Outperforms previous training-free approaches on COCO, PASCAL VOC, and Cross-Domain FSOD.
**Links:**
- 🧾 [**arXiv Paper**](https://arxiv.org/abs/2507.02798)
- 🌐 [**Project Website**](https://miquel-espinosa.github.io/no-time-to-train/)
- 📈 [**Papers with Code**](https://paperswithcode.com/paper/no-time-to-train-training-free-reference)
## 📜 Abstract
> The performance of image segmentation models has historically been constrained by the high cost of collecting large-scale annotated data. The Segment Anything Model (SAM) alleviates this original problem through a promptable, semantics-agnostic, segmentation paradigm and yet still requires manual visual-prompts or complex domain-dependent prompt-generation rules to process a new image. Towards reducing this new burden, our work investigates the task of object segmentation when provided with, alternatively, only a small set of reference images. Our key insight is to leverage strong semantic priors, as learned by foundation models, to identify corresponding regions between a reference and a target image. We find that correspondences enable automatic generation of instance-level segmentation masks for downstream tasks and instantiate our ideas via a multi-stage, training-free method incorporating (1) memory bank construction; (2) representation aggregation and (3) semantic-aware feature matching. Our experiments show significant improvements on segmentation metrics, leading to state-of-the-art performance on COCO FSOD (36.8% nAP), PASCAL VOC Few-Shot (71.2% nAP50) and outperforming existing training-free approaches on the Cross-Domain FSOD benchmark (22.4% nAP).

## 🧠 Architecture

## 🛠️ Installation instructions
### 1. Clone the repository
```bash
git clone https://github.com/miquel-espinosa/no-time-to-train.git
cd no-time-to-train
```
### 2. Create conda environment
We will create a conda environment with the required packages.
```bash
conda env create -f environment.yml
conda activate no-time-to-train
```
### 3. Install SAM2 and DINOv2
We will install SAM2 and DINOv2 from source.
```bash
pip install -e .
cd dinov2
pip install -e .
cd ..
```
### 4. Download datasets
Please download the COCO dataset and place it in `data/coco`.
### 5. Download SAM2 and DINOv2 checkpoints
We will download the exact SAM2 checkpoints used in the paper.
(Note, however, that SAM2.1 checkpoints are already available and might perform better.)
```bash
mkdir -p checkpoints/dinov2
cd checkpoints
wget https://dl.fbaipublicfiles.com/segment_anything_2/072824/sam2_hiera_large.pt
cd dinov2
wget https://dl.fbaipublicfiles.com/dinov2/dinov2_vitl14/dinov2_vitl14_pretrain.pth
cd ../..
```
## 📊 Inference code
⚠️ Disclaimer: This is research code — expect a bit of chaos!
### Reproducing 30-shot SOTA results in Few-shot COCO
Define useful variables and create a folder for results:
```bash
CONFIG=./no_time_to_train/new_exps/coco_fewshot_10shot_Sam2L.yaml
CLASS_SPLIT="few_shot_classes"
RESULTS_DIR=work_dirs/few_shot_results
SHOTS=30
SEED=33
GPUS=4
mkdir -p $RESULTS_DIR
FILENAME=few_shot_${SHOTS}shot_seed${SEED}.pkl
```
#### 0. Create reference set
```bash
python no_time_to_train/dataset/few_shot_sampling.py \
--n-shot $SHOTS \
--out-path ${RESULTS_DIR}/${FILENAME} \
--seed $SEED \
--dataset $CLASS_SPLIT
```
#### 1. Fill memory with references
```bash
python run_lightening.py test --config $CONFIG \
--model.test_mode fill_memory \
--out_path ${RESULTS_DIR}/memory.ckpt \
--model.init_args.model_cfg.memory_bank_cfg.length $SHOTS \
--model.init_args.dataset_cfgs.fill_memory.memory_pkl ${RESULTS_DIR}/${FILENAME} \
--model.init_args.dataset_cfgs.fill_memory.memory_length $SHOTS \
--model.init_args.dataset_cfgs.fill_memory.class_split $CLASS_SPLIT \
--trainer.logger.save_dir ${RESULTS_DIR}/ \
--trainer.devices $GPUS
```
#### 2. Post-process memory bank
```bash
python run_lightening.py test --config $CONFIG \
--model.test_mode postprocess_memory \
--model.init_args.model_cfg.memory_bank_cfg.length $SHOTS \
--ckpt_path ${RESULTS_DIR}/memory.ckpt \
--out_path ${RESULTS_DIR}/memory_postprocessed.ckpt \
--trainer.devices 1
```
#### 3. Inference on target images
```bash
python run_lightening.py test --config $CONFIG \
--ckpt_path ${RESULTS_DIR}/memory_postprocessed.ckpt \
--model.init_args.test_mode test \
--model.init_args.model_cfg.memory_bank_cfg.length $SHOTS \
--model.init_args.model_cfg.dataset_name $CLASS_SPLIT \
--model.init_args.dataset_cfgs.test.class_split $CLASS_SPLIT \
--trainer.logger.save_dir ${RESULTS_DIR}/ \
--trainer.devices $GPUS
```
If you'd like to see inference results online (as they are computed), add the argument:
```bash
--model.init_args.model_cfg.test.online_vis True
```
To adjust the score threshold `score_thr` parameter, add the argument (for example, visualising all instances with score higher than `0.4`):
```bash
--model.init_args.model_cfg.test.vis_thr 0.4
```
Images will now be saved in `results_analysis/few_shot_classes/`. The image on the left shows the ground truth; the image on the right shows the segmented instances found by our training-free method.
Note that this example uses the `few_shot_classes` split, so we should only expect to see segmented instances of the classes in this split (not all classes in COCO).
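
As a rough illustration of what the score threshold does, the sketch below keeps only predictions whose score is strictly higher than the threshold. It assumes COCO-style result dictionaries with a `score` field; `filter_by_score` and the sample predictions are hypothetical and are not taken from the repository.

```python
def filter_by_score(predictions, thr=0.4):
    """Keep predictions whose confidence score is strictly higher than thr."""
    return [p for p in predictions if p["score"] > thr]


# Toy predictions (made-up values, for illustration only).
preds = [
    {"category_id": 9, "score": 0.92},
    {"category_id": 16, "score": 0.40},
    {"category_id": 16, "score": 0.15},
]
kept = filter_by_score(preds, thr=0.4)
print([p["score"] for p in kept])
```
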
#### Results
After running all images in the validation set, you should obtain:
```
BBOX RESULTS:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.368
SEGM RESULTS:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.342
```
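The `IoU=0.50:0.95` in this output follows the standard COCO evaluation convention: AP is averaged over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05. As a reminder of the underlying quantity, here is a minimal sketch of box IoU for COCO-style `[x, y, width, height]` boxes. `box_iou` is an illustrative helper, not the evaluation code used by the repository.

```python
def box_iou(a, b):
    """IoU of two boxes given as [x, y, width, height] (COCO convention)."""
    ax1, ay1, ax2, ay2 = a[0], a[1], a[0] + a[2], a[1] + a[3]
    bx1, by1, bx2, by2 = b[0], b[1], b[0] + b[2], b[1] + b[3]
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


# The ten thresholds behind "IoU=0.50:0.95".
iou_thresholds = [round(0.50 + 0.05 * i, 2) for i in range(10)]

# Two 10x10 boxes that overlap on half of their width.
iou = box_iou([0, 0, 10, 10], [5, 0, 10, 10])
# Thresholds at which this pair would count as a match.
matched_at = [t for t in iou_thresholds if iou >= t]
```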
---
## 🔍 Custom dataset
We provide instructions for running our pipeline on a custom dataset. Annotations are always expected in COCO format.
> **TL;DR:** To see directly how to run the full pipeline on *custom datasets*, see `scripts/matching_cdfsod_pipeline.sh` together with the example scripts for CD-FSOD datasets (e.g. `scripts/dior_fish.sh`).
### 0. Prepare a custom dataset ⛵🐦
Let's imagine we want to detect **boats**⛵ and **birds**🐦 in a custom dataset. To use our method we will need:
- At least 1 *annotated* reference image for each class (i.e. 1 reference image for boat and 1 reference image for bird)
- Multiple target images to find instances of our desired classes.
We have prepared a toy script to create a custom dataset with COCO images, for a **1-shot** setting.
```bash
mkdir -p data/my_custom_dataset
python scripts/make_custom_dataset.py
```
This will create a custom dataset with the following folder structure:
```
data/my_custom_dataset/
├── annotations/
│   ├── custom_references.json
│   ├── custom_targets.json
│   └── references_visualisations/
│       ├── bird_1.jpg
│       └── boat_1.jpg
└── images/
    ├── 429819.jpg
    ├── 101435.jpg
    └── (all target and reference images)
```
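Since annotations are expected in COCO format, a quick sanity check of the generated JSON files can catch problems early. The sketch below is one possible check, assuming only the standard top-level COCO keys (`images`, `annotations`, `categories`). `check_coco` is a hypothetical helper and is not part of the repository.

```python
def check_coco(coco):
    """Return a list of problems found in a COCO-style annotation dict."""
    problems = []
    for key in ("images", "annotations", "categories"):
        if key not in coco:
            problems.append(f"missing top-level key: {key}")
    if problems:
        return problems
    image_ids = {img["id"] for img in coco["images"]}
    cat_ids = {cat["id"] for cat in coco["categories"]}
    for ann in coco["annotations"]:
        if ann["image_id"] not in image_ids:
            problems.append(f"annotation {ann['id']}: unknown image_id")
        if ann["category_id"] not in cat_ids:
            problems.append(f"annotation {ann['id']}: unknown category_id")
    return problems


# Toy example mirroring the boat/bird setting (values are illustrative).
toy = {
    "images": [{"id": 1, "file_name": "429819.jpg"}],
    "categories": [{"id": 9, "name": "boat"}, {"id": 16, "name": "bird"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 16, "bbox": [10, 20, 30, 40]},
    ],
}
print(check_coco(toy))  # an empty list means no problems were found
```

To check a real file, load it with `json.load` first and pass the resulting dictionary to `check_coco`.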
**Reference images visualisation (1-shot):** one reference image for BIRD 🐦 and one for BOAT ⛵ (see `bird_1.jpg` and `boat_1.jpg` in `annotations/references_visualisations/`).
### 0.1 If only bbox annotations are available
We also provide a script to generate instance-level segmentation masks using SAM (the commands below download the SAM ViT-H checkpoint). This is useful if you only have bounding box annotations available for the reference images.
```bash
# Download sam_h checkpoint. Feel free to use more recent checkpoints (note: code might need to be adapted)
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth -O checkpoints/sam_vit_h_4b8939.pth
# Run automatic instance segmentation from ground truth bounding boxes.
python no_time_to_train/dataset/sam_bbox_to_segm_batch.py \
--input_json data/my_custom_dataset/annotations/custom_references.json \
--image_dir data/my_custom_dataset/images \
--sam_checkpoint checkpoints/sam_vit_h_4b8939.pth \
--model_type vit_h \
--device cuda \
--batch_size 8 \
--visualize
```
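To find out whether this step is needed, one can check which reference annotations lack a usable mask. This is a minimal sketch, assuming the usual COCO `segmentation` field (a polygon list or an RLE dictionary). `annotations_without_masks` is a hypothetical helper, not a script from the repository.

```python
def annotations_without_masks(coco):
    """Return ids of annotations whose `segmentation` is missing or empty."""
    missing = []
    for ann in coco.get("annotations", []):
        if not ann.get("segmentation"):  # absent, None, [] or {}
            missing.append(ann["id"])
    return missing


# Toy annotations (illustrative values only).
toy = {
    "annotations": [
        {"id": 1, "bbox": [10, 20, 30, 40]},
        {"id": 2, "bbox": [5, 5, 8, 8], "segmentation": []},
        {"id": 3, "bbox": [0, 0, 4, 4],
         "segmentation": [[0, 0, 4, 0, 4, 4, 0, 4]]},
    ]
}
print(annotations_without_masks(toy))  # → [1, 2]
```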
**Reference images with instance-level segmentation masks (generated by SAM from ground-truth bounding boxes, 1-shot):**
Visualisations of the generated segmentation masks are saved in `data/my_custom_dataset/annotations/custom_references_with_SAM_segm/references_visualisations/`, one for the BIRD 🐦 reference and one for the BOAT ⛵ reference.
### 0.2 Convert coco annotations to pickle file
```bash
python no_time_to_train/dataset/coco_to_pkl.py \
data/my_custom_dataset/annotations/custom_references_with_segm.json \
data/my_custom_dataset/annotations/custom_references_with_segm.pkl \
1
```
### 1. Fill memory with references
First, define useful variables and create a folder for results. For labels to be visualised correctly, class names should be ordered by category id as they appear in the JSON file. For example, `bird` has category id `16` and `boat` has category id `9`, so `CAT_NAMES=boat,bird`.
```bash
DATASET_NAME=my_custom_dataset
DATASET_PATH=data/my_custom_dataset
CAT_NAMES=boat,bird
CATEGORY_NUM=2
SHOT=1
YAML_PATH=no_time_to_train/pl_configs/matching_cdfsod_template.yaml
PATH_TO_SAVE_CKPTS=./tmp_ckpts/my_custom_dataset
mkdir -p $PATH_TO_SAVE_CKPTS
```
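The ordering rule for `CAT_NAMES` can also be derived programmatically. The following sketch sorts the `categories` entries of a COCO-style JSON by `id` and joins their names. `cat_names_from_coco` is a hypothetical helper, not a script shipped with the repository.

```python
def cat_names_from_coco(coco):
    """Comma-separated class names, ordered by ascending category id."""
    cats = sorted(coco["categories"], key=lambda c: c["id"])
    return ",".join(c["name"] for c in cats)


# Category ids as in the example: bird is 16, boat is 9.
toy = {"categories": [{"id": 16, "name": "bird"}, {"id": 9, "name": "boat"}]}
print(cat_names_from_coco(toy))  # → boat,bird
```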
Run step 1:
```bash
python run_lightening.py test --config $YAML_PATH \
--model.test_mode fill_memory \
--out_path $PATH_TO_SAVE_CKPTS/$DATASET_NAME\_$SHOT\_refs_memory.pth \
--model.init_args.dataset_cfgs.fill_memory.root $DATASET_PATH/images \
--model.init_args.dataset_cfgs.fill_memory.json_file $DATASET_PATH/annotations/custom_references_with_segm.json \
--model.init_args.dataset_cfgs.fill_memory.memory_pkl $DATASET_PATH/annotations/custom_references_with_segm.pkl \
--model.init_args.dataset_cfgs.fill_memory.memory_length $SHOT \
--model.init_args.dataset_cfgs.fill_memory.cat_names $CAT_NAMES \
--model.init_args.model_cfg.dataset_name $DATASET_NAME \
--model.init_args.model_cfg.memory_bank_cfg.length $SHOT \
--model.init_args.model_cfg.memory_bank_cfg.category_num $CATEGORY_NUM \
--trainer.devices 1
```
### 2. Post-process memory bank
```bash
python run_lightening.py test --config $YAML_PATH \
--model.test_mode postprocess_memory \
--ckpt_path $PATH_TO_SAVE_CKPTS/$DATASET_NAME\_$SHOT\_refs_memory.pth \
--out_path $PATH_TO_SAVE_CKPTS/$DATASET_NAME\_$SHOT\_refs_memory_postprocessed.pth \
--model.init_args.model_cfg.dataset_name $DATASET_NAME \
--model.init_args.model_cfg.memory_bank_cfg.length $SHOT \
--model.init_args.model_cfg.memory_bank_cfg.category_num $CATEGORY_NUM \
--trainer.devices 1
```
#### 2.1 Visualise post-processed memory bank
```bash
python run_lightening.py test --config $YAML_PATH \
--model.test_mode vis_memory \
--ckpt_path $PATH_TO_SAVE_CKPTS/$DATASET_NAME\_$SHOT\_refs_memory_postprocessed.pth \
--model.init_args.dataset_cfgs.fill_memory.root $DATASET_PATH/images \
--model.init_args.dataset_cfgs.fill_memory.json_file $DATASET_PATH/annotations/custom_references_with_segm.json \
--model.init_args.dataset_cfgs.fill_memory.memory_pkl $DATASET_PATH/annotations/custom_references_with_segm.pkl \
--model.init_args.dataset_cfgs.fill_memory.memory_length $SHOT \
--model.init_args.dataset_cfgs.fill_memory.cat_names $CAT_NAMES \
--model.init_args.model_cfg.dataset_name $DATASET_NAME \
--model.init_args.model_cfg.memory_bank_cfg.length $SHOT \
--model.init_args.model_cfg.memory_bank_cfg.category_num $CATEGORY_NUM \
--trainer.devices 1
```
PCA and K-means visualisations for the memory bank images are stored in `results_analysis/memory_vis/my_custom_dataset`.
### 3. Inference on target images
If `ONLINE_VIS` is set to True, prediction results will be saved in `results_analysis/my_custom_dataset/` and displayed as they are computed. NOTE that running with online visualisation is much slower.
Feel free to change the score threshold `VIS_THR` to see more or fewer segmented instances.
```bash
ONLINE_VIS=True
VIS_THR=0.4
python run_lightening.py test --config $YAML_PATH \
--model.test_mode test \
--ckpt_path $PATH_TO_SAVE_CKPTS/$DATASET_NAME\_$SHOT\_refs_memory_postprocessed.pth \
--model.init_args.model_cfg.dataset_name $DATASET_NAME \
--model.init_args.model_cfg.memory_bank_cfg.length $SHOT \
--model.init_args.model_cfg.memory_bank_cfg.category_num $CATEGORY_NUM \
--model.init_args.model_cfg.test.imgs_path $DATASET_PATH/images \
--model.init_args.model_cfg.test.online_vis $ONLINE_VIS \
--model.init_args.model_cfg.test.vis_thr $VIS_THR \
--model.init_args.dataset_cfgs.test.root $DATASET_PATH/images \
--model.init_args.dataset_cfgs.test.json_file $DATASET_PATH/annotations/custom_targets.json \
--model.init_args.dataset_cfgs.test.cat_names $CAT_NAMES \
--trainer.devices 1
```
### Results
Performance metrics (with the exact same parameters as commands above) should be:
```
BBOX RESULTS:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.478
SEGM RESULTS:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.458
```
Visual results are saved in `results_analysis/my_custom_dataset/`. Note that our method also handles negative images, that is, images that do not contain any instances of the desired classes.
Example visualisations (left: ground truth, right: predictions):
- Target image with boats ⛵
- Target image with birds 🐦
- Target image with boats and birds ⛵🐦
- Target image without boats or birds 🚫
## 🔬 Ablations
### Backbone ablation
To evaluate the transferability of our method across foundation models, we replace both the semantic
encoder (DINOv2) and the SAM-based segmenter with several alternatives.
**Semantic encoder ablation:**
```bash
# CLIP (Sizes: b16, b32, l14, l14@336px)
bash scripts/clip/clipl14@336px.sh
bash scripts/clip/clipl14.sh
bash scripts/clip/clipb16.sh
bash scripts/clip/clipb32.sh
# DINOV3 (Sizes: b, l, h)
bash scripts/dinov3/dinov3b.sh
bash scripts/dinov3/dinov3l.sh
bash scripts/dinov3/dinov3h.sh
# PE (Sizes: g14, l14)
bash scripts/pe/PEg14.sh
bash scripts/pe/PEl14.sh
```
**Segmenter ablation:**
```bash
# SAM2 (Sizes: tiny, small, base+, large)
bash scripts/sam2/sam2_tiny.sh
bash scripts/sam2/sam2_small.sh
bash scripts/sam2/sam2_base_plus.sh
bash scripts/baseline/dinov2_sam_baseline.sh # SAM2 Large
```
### VLM evaluation on COCO few-shot dataset
We evaluate the Qwen VLM on the COCO few-shot dataset.
```bash
bash scripts/vl-qwen/ablation-vl-qwen.sh
```
### Reference image heuristics
To understand why different reference images lead to performance variation, we analyse the statistical properties of the annotations of the COCO novel classes.
#### ANALYSIS
We study three annotation characteristics: (1) mask area (object size),
(2) mask center location, and (3) distance to image edges.
Instructions:
```bash
# Mask area distribution
python no_time_to_train/make_plots/mask_area_distribution.py \
--input data/coco/annotations/instances_val2017.json \
--output no_time_to_train/make_plots/mask_area_distribution/mask_area_distribution.png \
--edges-output no_time_to_train/make_plots/mask_area_distribution/bbox_edge_distance_histograms.png \
--center-output no_time_to_train/make_plots/mask_area_distribution/bbox_center_density.png \
--bins 80 \
--distance-bins 80 \
--disable-center-density
# Bbox center positions
python no_time_to_train/make_plots/bbox_positions.py \
--per-class-root data/coco/annotations/per_class_instances \
--filename centeredness_2d_hist_plain.png \
--max-cols 6 \
--output-dir ./no_time_to_train/make_plots/bbox_positions \
--outfile grid_bbox_positions.png
```
[OUTPUT] Mask area distribution
[OUTPUT] Bbox center density
[OUTPUT] Bbox edge distance histograms
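
For reference, the three characteristics studied here can be computed from a single COCO annotation roughly as follows. This is only a sketch: it uses the bounding box as a proxy for the mask centre and edge distance, and the annotation's `area` field for size, which may differ from what the plotting scripts actually do. `ref_stats` is a hypothetical helper.

```python
def ref_stats(ann, img_w, img_h):
    """Relative area, normalised centre and minimum edge distance of a box."""
    x, y, w, h = ann["bbox"]  # COCO convention: [x, y, width, height]
    rel_area = ann["area"] / (img_w * img_h)
    center = ((x + w / 2) / img_w, (y + h / 2) / img_h)
    # Distance (in pixels) from the box to the closest image border.
    edge_dist = min(x, y, img_w - (x + w), img_h - (y + h))
    return {"rel_area": rel_area, "center": center, "edge_dist": edge_dist}


# Toy annotation on a 640x480 image (illustrative values).
ann = {"bbox": [160, 120, 320, 240], "area": 50000.0}
stats = ref_stats(ann, img_w=640, img_h=480)
print(stats)
```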
#### SELECTION
We sample 100 diverse reference images per class, explicitly
covering a range of mask sizes, centers, and edge distances. Each
reference is evaluated on a fixed reduced validation subset.
Instructions:
**Setup script:** `scripts/1shot_ref_ablation/setup.sh`:
1. Create per class json file
2. Analyse specific class
3. Create reference set with different heuristics
```bash
bash scripts/1shot_ref_ablation/setup.sh
```
**Run scripts:** `scripts/1shot_ref_ablation/gpu*.sh`:
4. Run pipeline for each reference set
```bash
# Example launch script that calls template script for each reference set
bash scripts/1shot_ref_ablation/gpu0.sh
```
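The selection idea described at the start of this section (covering a range of mask sizes) can be sketched as a simple stratified draw. The version below only stratifies over mask area and is an assumption about one reasonable way to do it; the heuristics implemented in `scripts/1shot_ref_ablation/setup.sh` may differ. `sample_diverse` is a hypothetical helper.

```python
import random


def sample_diverse(candidates, n, key="area", seed=0):
    """Pick n candidates spread across the range of `key`.

    Candidates are sorted by `key`, split into n contiguous bins of
    near-equal size, and one candidate is drawn at random from each bin.
    """
    if n > len(candidates):
        raise ValueError("n cannot exceed the number of candidates")
    rng = random.Random(seed)
    ordered = sorted(candidates, key=lambda c: c[key])
    picks = []
    for i in range(n):
        lo = i * len(ordered) // n
        hi = (i + 1) * len(ordered) // n
        picks.append(rng.choice(ordered[lo:hi]))
    return picks


# Toy candidates with made-up mask areas.
areas = [50, 900, 120, 4000, 30, 2500, 700, 15000]
cands = [{"id": i, "area": a} for i, a in enumerate(areas)]
refs = sample_diverse(cands, n=4, key="area")
print([r["area"] for r in refs])
```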
#### RESULTS
We analyse how detection scores correlate with reference image characteristics
(mask size, center position, edge distance).
Instructions:
```bash
python no_time_to_train/make_plots/heuristics_analysis.py
# Outputs:
# - no_time_to_train/make_plots/heuristics_analysis/heatmap_bbox_norm_scores.png
# - no_time_to_train/make_plots/heuristics_analysis/heatmap_segm_norm_scores.png
# - no_time_to_train/make_plots/heuristics_analysis/heatmap_center_bbox_norm_scores_kde_smooth.png
# - no_time_to_train/make_plots/heuristics_analysis/heatmap_center_bbox_norm_scores.png
# - no_time_to_train/make_plots/heuristics_analysis/heatmap_center_segm_norm_scores_kde_smooth.png
# - no_time_to_train/make_plots/heuristics_analysis/heatmap_center_segm_norm_scores.png
# - no_time_to_train/make_plots/heuristics_analysis/per_class_area_vs_raw_scores.png
# - no_time_to_train/make_plots/heuristics_analysis/all_classes_area_vs_norm_scores.png
# - no_time_to_train/make_plots/heuristics_analysis/edge_distance_vs_norm_scores.png
# - no_time_to_train/make_plots/heuristics_analysis/bars_area_category_norm_scores.png
# - no_time_to_train/make_plots/heuristics_analysis/bars_centered_norm_scores.png
# - no_time_to_train/make_plots/heuristics_analysis/bars_avoid_sides_norm_scores.png
```
[OUTPUT] Barplots. Effect of mask area (left) and centeredness (right) on performance
[OUTPUT] Heatmaps. 2D score maps of performance as a function of mask-center location
[OUTPUT] Reference-image performance vs. mask area for all COCO novel classes
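
One simple way to quantify how scores relate to a reference characteristic is a correlation coefficient. The sketch below computes the Pearson correlation in plain Python; it is an illustration only and is not necessarily what `heuristics_analysis.py` computes. The numbers are made up for the example and are NOT results from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Made-up numbers for illustration only; these are NOT measured results.
rel_areas = [0.01, 0.05, 0.10, 0.20, 0.40]
scores = [0.10, 0.22, 0.30, 0.33, 0.31]
r = pearson(rel_areas, scores)
print(f"Pearson r = {r:.2f}")
```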
### Reference-image degradation
We evaluate our method under progressively degraded reference images by applying increasing
levels of Gaussian blur.
Instructions:
```bash
# Run different blur levels
bash scripts/blur_ablation/blur_ablation.sh
# Plot grid of blur ablation results
python no_time_to_train/make_plots/plot_blur_results.py \
--results-root ./work_dirs/blur_ablation \
--class-id 0 \
--max-cols 4 \
--output-dir ./no_time_to_train/make_plots/blur_ablation \
--outfile grid_blur_ablation_class_0.png
```
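To make the degradation concrete, here is a minimal pure-Python sketch of a separable Gaussian blur on a small grayscale image. It is an assumption-laden illustration: the kernel radius (`ceil(3 * sigma)`) and border handling (pixel replication) are choices made for this example, and the repository's blur script may rely on a library implementation with different settings.

```python
import math


def gaussian_kernel(sigma):
    """Normalised 1D Gaussian kernel with radius ceil(3 * sigma)."""
    radius = max(1, math.ceil(3 * sigma))
    weights = [
        math.exp(-(i * i) / (2 * sigma * sigma))
        for i in range(-radius, radius + 1)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def blur_1d(row, kernel):
    """Convolve one row with the kernel, replicating border pixels."""
    radius = len(kernel) // 2
    n = len(row)
    return [
        sum(k * row[min(max(i + j - radius, 0), n - 1)]
            for j, k in enumerate(kernel))
        for i in range(n)
    ]


def gaussian_blur(image, sigma):
    """Separable Gaussian blur of a 2D list of floats (rows, then columns)."""
    kernel = gaussian_kernel(sigma)
    rows = [blur_1d(r, kernel) for r in image]
    cols = [blur_1d(list(c), kernel) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


# A single bright pixel is spread out more as sigma grows.
img = [[0.0] * 7 for _ in range(7)]
img[3][3] = 1.0
for sigma in (0.5, 1.0, 2.0):
    print(sigma, round(gaussian_blur(img, sigma)[3][3], 4))
```

The printed centre value shrinks as `sigma` increases, which is the progressive loss of detail that this ablation probes.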
### Feature similarity
Script for visualising feature similarity between reference images and target images.
It generates single-feature similarity (patch features) and prototype-based similarity (aggregated features).
Instructions:
```bash
python no_time_to_train/make_plots/feature_similarity.py \
--classes orange \
--num-images 20 \
--min-area 12 \
--max-area 25000 \
--min-instances 2 \
--seed 123 \
--max-per-class 12
```
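The two kinds of similarity can be illustrated with cosine similarity on toy vectors. This is a sketch under stated assumptions: the prototype is taken here as the mean of L2-normalised reference features, which is a common choice but not necessarily the aggregation used in the paper, and all function names are hypothetical.

```python
import math


def l2_normalise(v):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(l2_normalise(a), l2_normalise(b)))


def prototype(features):
    """Aggregate features by averaging their L2-normalised versions."""
    normed = [l2_normalise(f) for f in features]
    dim = len(normed[0])
    return [sum(f[d] for f in normed) / len(normed) for d in range(dim)]


# Toy 3-D "patch features" (real DINOv2 features have many more dimensions).
ref_patches = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.1], [1.0, 0.0, 0.2]]
target_patch = [0.8, 0.1, 0.1]

# Single-feature similarity: one reference patch against the target patch.
single_sim = cosine(ref_patches[0], target_patch)
# Prototype-based similarity: aggregated reference features against the target.
proto_sim = cosine(prototype(ref_patches), target_patch)
print(round(single_sim, 3), round(proto_sim, 3))
```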
### T-SNE plots (DINOv2 feature separability)
t-SNE of DINOv2 features shows clear separation for dissimilar classes
but heavy overlap for similar ones, suggesting that confusion stems from
backbone feature geometry rather than prototype selection.
Instructions:
Extract features
```bash
python no_time_to_train/make_plots/tsne-coco.py --extract
```
Plot T-SNE plots
```bash
# Example: cat vs dog
python no_time_to_train/make_plots/tsne-coco.py --classes cat dog
```
## 🛠️ Helpers
### Visualise memory
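Instructions:
To visualise the memory bank (PCA and K-means visualisations) for a given experiment, adjust the following command. It reuses the variables defined in the sections above; note that it expects `$SHOT` and `$CATEGORY_NUM`, which the few-shot COCO section does not define (it uses `SHOTS`), so set them before running.
Set `DO_NOT_CROP` to True/False (in `no_time_to_train/models/Sam2MatchingBaseline_noAMG.py`) to visualise the reference image with/without the cropped mask.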
```bash
python run_lightening.py test --config $CONFIG \
--model.test_mode vis_memory \
--ckpt_path $RESULTS_DIR/memory_postprocessed.ckpt \
--model.init_args.dataset_cfgs.fill_memory.memory_pkl $RESULTS_DIR/$FILENAME \
--model.init_args.dataset_cfgs.fill_memory.memory_length $SHOT \
--model.init_args.dataset_cfgs.fill_memory.class_split $CLASS_SPLIT \
--model.init_args.model_cfg.dataset_name $CLASS_SPLIT \
--model.init_args.model_cfg.memory_bank_cfg.length $SHOT \
--model.init_args.model_cfg.memory_bank_cfg.category_num $CATEGORY_NUM \
--trainer.devices 1
```
### Resize images to 512x512 (make the images square)
To resize the images to 512x512 and save them to a new directory, run the following command. This is for the paper figures.
Instructions:
```bash
python no_time_to_train/make_plots/paper_fig_square_imgs.py
```
### Model size and memory
To calculate the model size and memory usage, follow the instructions below.
Instructions:
- See `no_time_to_train/models/Sam2MatchingBaseline_noAMG_model_and_memory.py` for the model size and memory calculation.
(Easiest: temporarily use it in place of `Sam2MatchingBaseline_noAMG.py`, then rename back.)
## 🌍 EO datasets
### Evaluation scripts (EO datasets)
Evaluation scripts can be found in the `scripts/EO` directory. The EO datasets use the `./scripts/EO/EO_template.sh` script to run the evaluation.
Each EO experiment run is saved under the `./EO_results` directory. In the experiment folder we store:
- The `summary.txt` file with the configuration and runtime of the experiment.
- The prediction visualisations on the test set (`results_analysis` folder).
- The memory visualisations (`memory_vis` folder).
- The few-shot annotation pickle file.
- The checkpoints of the model (if not cleaned up).
### Figures and tables
Additional scripts for generating figures and tables.
Summary LaTeX table of the EO datasets:
```bash
python scripts/convert_datasets/summary_table_datasets.py
```
Generate LaTeX table of the EO datasets:
```bash
python scripts/paper_figures/table_EO_results.py ./EO_results_no_heuristics
```
Accuracy plot of the EO datasets:
```bash
python scripts/paper_figures/plot_EO_accuracy.py \
--input-root ./EO_results \
--output-root ./EO_results
```
Summary of heuristics effect on the EO datasets:
```bash
python scripts/paper_figures/plot_EO_heuristic.py \
--no-heuristics ./EO_results_no_heuristics \
--heuristics ./EO_results
```
Runtime plot of the EO datasets:
```bash
python scripts/paper_figures/plot_EO_runtime.py \
--input-root ./EO_results \
--output-root ./EO_results
```
Generate EO grid visualisations for paper figure:
```bash
python scripts/paper_figures/plot_EO_grid.py \
--root ./EO_results_no_heuristics \
--dataset ISAID \
--shots 1
```
## 📚 Citation
If you use this work, please cite us:
```bibtex
@article{espinosa2025notimetotrain,
title={No time to train! Training-Free Reference-Based Instance Segmentation},
author={Miguel Espinosa and Chenhongyi Yang and Linus Ericsson and Steven McDonagh and Elliot J. Crowley},
journal={arXiv preprint arXiv:2507.02798},
year={2025},
primaryclass={cs.CV}
}
```