MD-syn

MD‑syn is a new comprehensive dataset for general multimodal image matching. It is generated from the MegaDepth dataset using the MINIMA data engine and adds six additional modalities: infrared, depth, event, normal, sketch, and painting.

Source: GitHub
Created: Dec 17, 2024
Updated: Dec 26, 2024

Overview

MINIMA: Modality‑Invariant Image Matching

Dataset Overview

MINIMA is a unified framework for multimodal image matching that targets the challenges of cross‑view and cross‑modality matching. It improves generalization by scaling up training data: a simple yet effective data engine generates a large‑scale dataset that covers multiple modalities and diverse scenes and carries precise matching labels.

Dataset Details

Dataset Name: MegaDepth‑Syn Dataset
Generation Method: Produced from the MegaDepth dataset using the MINIMA data engine, adding six extra modalities: infrared, depth, event, normal, sketch, and painting.
Release: Available on OpenXLab.

Dataset Download

You can download the dataset with the following commands:

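pip install openxlab --no-dependencies  # install the OpenXLab CLI
openxlab login  # log in with your OpenXLab access key (AK/SK)
openxlab dataset info --dataset-repo lsxi7/MINIMA  # show dataset information and the file list
openxlab dataset get --dataset-repo lsxi7/MINIMA  # download the whole dataset repository
openxlab dataset download --dataset-repo lsxi7/MINIMA --source-path /README.md --target-path /path/to/local/folder  # download a single file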
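
If you prefer to script the download, the commands above can be wrapped in a small Python helper. This is a minimal sketch, not part of MINIMA or OpenXLab: the function names `build_commands` and `run_commands` are hypothetical, and the only CLI flags used are the ones shown above. The interactive `openxlab login` step is left out on purpose and is assumed to have been completed beforehand.

```python
import shlex
import subprocess

DATASET_REPO = "lsxi7/MINIMA"  # repository name taken from the commands above


def build_commands(target_dir, source_path="/README.md"):
    """Return the OpenXLab CLI calls from this section as argument lists."""
    return [
        # Show dataset information and the file list.
        ["openxlab", "dataset", "info", "--dataset-repo", DATASET_REPO],
        # Download the whole dataset repository.
        ["openxlab", "dataset", "get", "--dataset-repo", DATASET_REPO],
        # Download a single file into target_dir.
        ["openxlab", "dataset", "download", "--dataset-repo", DATASET_REPO,
         "--source-path", source_path, "--target-path", str(target_dir)],
    ]


def run_commands(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError if the CLI call fails.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run by default: nothing is downloaded until dry_run=False is passed.
    run_commands(build_commands("/path/to/local/folder"))
```

Passing the arguments as lists avoids shell quoting problems in paths, and the dry-run default lets you inspect the commands before a large download starts.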

Model Weights Download

Weight files: minima_lightglue, minima_loftr, minima_roma
Links: Google Drive or GitHub

Test Datasets

MegaDepth‑1500‑Syn: Download from megadepth‑1500 and organize.
RGB‑Infrared Test Dataset: From XoFTR, download via Google Drive.
MMIM Test Dataset: From Multi‑modality‑image‑matching‑database‑metrics‑methods.
RGB‑Depth Test Dataset: From DIODE, download via AWS or Baidu Cloud.
RGB‑Event Test Dataset: From DSEC, download via Google Drive.

Dataset Structure

A recommended folder layout is:

data/
├── METU-VisTIR/
│   ├── index/
│   └── ...
├── Multi-modality-image-matching-database-metrics-methods/
│   ├── Multimodal_Image_Matching_Datasets/
│   └── ...
├── megadepth/
│   └── train/[modality]/Undistorted_SfM/
├── DIODE/
│   └── val/
└── DSEC/
    ├── vent_list.txt
    ├── thun_01_a/
    └── ...
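
Before running any evaluation, it can help to confirm that the local folders match the layout above. The following is a minimal sketch under stated assumptions: `expected_paths` and `missing_paths` are hypothetical helpers, the path names are copied verbatim from the tree, and the modality folder name (here `"infrared"`) is an assumption, because the tree only shows the placeholder `[modality]`.

```python
from pathlib import Path


def expected_paths(modality):
    """Relative paths named in the recommended layout, for one modality folder."""
    return [
        "METU-VisTIR/index",
        "Multi-modality-image-matching-database-metrics-methods/"
        "Multimodal_Image_Matching_Datasets",
        f"megadepth/train/{modality}/Undistorted_SfM",
        "DIODE/val",
        "DSEC/vent_list.txt",
        "DSEC/thun_01_a",
    ]


def missing_paths(root, modality):
    """Return the expected paths that do not exist under root."""
    root = Path(root)
    return [p for p in expected_paths(modality) if not (root / p).exists()]


if __name__ == "__main__":
    # "infrared" is an assumed folder name; substitute the name used on disk.
    for path in missing_paths("data", "infrared"):
        print("missing:", path)
```

The check only covers entries that the tree names explicitly; anything hidden behind `...` is not verified.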

Citation

If you use this dataset, please cite:

@article{Jiang2024minima,
  title={MINIMA: Modality Invariant Image Matching},
  author={Jiang, Xingyu and Ren, Jiangwei and Li, Zizhuo and Zhou, Xin and Liang, Dingkang and Bai, Xiang},
  journal={arXiv preprint},
  year={2024},
}
