XS-VID

A large‑scale benchmark dataset for ultra‑small video object detection, with annotations provided in COCO, COCOVID, and YOLO formats.

Source

GitHub

Created

May 27, 2024

Updated

Jun 14, 2024

Dataset Overview

Dataset Name

XS‑VID

Dataset Download

Google Drive:
- Annotations: Link
- Images (0‑3): Link
- Images (4‑5): Link
BaiduNetDisk: Link

Dataset Format

Annotation Formats: COCO, COCOVID, YOLO
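
Because the annotations ship in both YOLO and COCO form, it can help to see how one box maps to the other. The sketch below assumes the usual conventions: a YOLO label line is `class cx cy w h` with coordinates normalized to the image size, and a COCO `bbox` is `[x_min, y_min, width, height]` in pixels. The function names are hypothetical, and the XS‑VID files themselves should be checked before relying on these assumptions.

```python
def parse_yolo_line(line):
    """Parse one YOLO label line into (class_id, cx, cy, w, h).

    Assumes the common layout 'class cx cy w h' with normalized floats.
    """
    parts = line.split()
    return (int(parts[0]), *[float(p) for p in parts[1:5]])


def yolo_to_coco_bbox(cx, cy, w, h, img_w, img_h):
    """Convert a normalized YOLO box to a COCO-style pixel box.

    Returns [x_min, y_min, width, height] in pixels.
    """
    box_w = w * img_w
    box_h = h * img_h
    x_min = cx * img_w - box_w / 2
    y_min = cy * img_h - box_h / 2
    return [x_min, y_min, box_w, box_h]
```

For example, a box centered in a 1024x512 frame with normalized size 0.25 x 0.5 becomes a 256x256 pixel box whose top-left corner is at (384, 128).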

Dataset Organization

data_root_dir/
├── test.txt
├── train.txt
├── images/
│   ├── video1/
│   │   ├── 0000000.png
│   │   └── 0000001.png
│   ├── video2/
│   │   └── ...
│   └── ...
└── labels/
    ├── video1/
    │   ├── 0000000.txt
    │   └── 0000001.txt
    ├── video2/
    │   └── ...
    └── ...
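
The layout above pairs each frame under `images/<video>/` with a label file of the same stem under `labels/<video>/`. A minimal sketch of walking that layout follows; the helper name `list_frame_label_pairs` is hypothetical, and the code assumes frames are `.png` files and labels are `.txt` files as shown in the tree. It does not read `train.txt` or `test.txt`, whose contents are not described here.

```python
from pathlib import Path


def list_frame_label_pairs(data_root):
    """Return (image_path, label_path_or_None) for every frame, per video.

    Assumes the images/<video>/<frame>.png and labels/<video>/<frame>.txt
    layout shown in the directory tree.
    """
    root = Path(data_root)
    pairs = []
    video_dirs = sorted(p for p in (root / "images").iterdir() if p.is_dir())
    for video_dir in video_dirs:
        for image_path in sorted(video_dir.glob("*.png")):
            label_path = root / "labels" / video_dir.name / (image_path.stem + ".txt")
            # A frame without a label file is reported with None
            pairs.append((image_path, label_path if label_path.exists() else None))
    return pairs
```

Calling `list_frame_label_pairs("data_root_dir")` yields the frames of each video in filename order, which matters for video detectors that consume consecutive frames.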

Results

XS‑VID: The source repository reports results for multiple methods, covering $AP$, $AP_{50}$, $AP_{75}$, $AP_{eS}$, $AP_{rS}$, $AP_{gS}$ (AP on extremely small, relatively small, and generally small objects), and inference time (ms).
VisDrone2019 VID (test‑dev): The source repository reports results for multiple methods, covering $AP$, $AP_{50}$, $AP_{75}$, $AP_{eS}$, $AP_{rS}$, $AP_{gS}$, $AP_{m}$, and $AP_{l}$ (the last two being AP on medium and large objects).

Checkpoints

YOLOFT‑L: Parameters 45.16 M, FLOPs 230.14 G, inference time 36 ms, dataset XS‑VID, checkpoint Link
YOLOFT‑S: Parameters 53.58 M, FLOPs 13.02 G, inference time 16 ms, dataset XS‑VID, checkpoint Link
