Dataset asset, Open Source Community, Video Object Detection, Benchmark Dataset

XS-VID

A large‑scale benchmark dataset for ultra‑small video object detection, with annotations provided in COCO, COCOVID, and YOLO formats.

Source: GitHub
Created: May 27, 2024
Updated: Jun 14, 2024

Dataset Overview

Dataset Name

XS‑VID

Dataset Download

  • Google Drive:
    • Annotations: Link
    • Images (0‑3): Link
    • Images (4‑5): Link
  • BaiduNetDisk: Link

Dataset Format

  • Annotation Formats: COCO, COCOVID, YOLO
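
As a rough illustration of how the YOLO and COCO annotation styles relate, the sketch below converts one YOLO label line into a COCO-style pixel box. It assumes the XS‑VID YOLO labels follow the common convention of `class cx cy w h` with values normalized to the image size, which this page does not confirm; the function name and the 1920x1080 frame size in the usage note are hypothetical.

```python
def yolo_to_coco_bbox(line, img_w, img_h):
    """Convert one YOLO label line to (class_id, [x_min, y_min, width, height]).

    Assumes the common YOLO convention: "class cx cy w h", all normalized
    to [0, 1]. Any extra columns after the fifth are ignored.
    """
    parts = line.split()
    class_id = int(parts[0])
    cx, cy, w, h = (float(v) for v in parts[1:5])
    # Scale normalized values to pixels.
    box_w = w * img_w
    box_h = h * img_h
    # COCO stores the top-left corner rather than the box center.
    x_min = cx * img_w - box_w / 2
    y_min = cy * img_h - box_h / 2
    return class_id, [x_min, y_min, box_w, box_h]
```

For example, `yolo_to_coco_bbox("0 0.5 0.5 0.25 0.5", 1920, 1080)` yields class `0` and the box `[720.0, 270.0, 480.0, 540.0]`.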

Dataset Organization

data_root_dir/
├── test.txt
├── train.txt
├── images/
│   ├── video1/
│   │   ├── 0000000.png
│   │   └── 0000001.png
│   ├── video2/
│   │   └── ...
│   └── ...
└── labels/
    ├── video1/
    │   ├── 0000000.txt
    │   └── 0000001.txt
    ├── video2/
    │   └── ...
    └── ...
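
The tree above mirrors `images/` and `labels/` video by video, so each frame can be matched to its label file by path alone. The following is a minimal sketch of that pairing, assuming every frame is a `.png` with a same-named `.txt` label; the helper name `pair_frames_with_labels` is hypothetical, and the contents of `train.txt` and `test.txt` are not described here, so they are not used.

```python
from pathlib import Path


def pair_frames_with_labels(data_root):
    """Yield (image_path, label_path_or_None) for every PNG frame.

    Frames are visited in sorted order, i.e. video by video and then
    frame by frame. label_path is None when no label file exists.
    """
    root = Path(data_root)
    for image_path in sorted((root / "images").glob("*/*.png")):
        video_name = image_path.parent.name
        label_path = root / "labels" / video_name / (image_path.stem + ".txt")
        if not label_path.exists():
            label_path = None
        yield image_path, label_path
```

A frame without a label file is reported with `None` rather than skipped, which makes missing annotations easy to spot after a partial download.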

Results

  • XS‑VID: Reports results for multiple methods using $AP$, $AP_{50}$, $AP_{75}$, $AP_{eS}$, $AP_{rS}$, $AP_{gS}$ (AP on extremely small, relatively small, and generally small objects), and inference time (ms).
  • VisDrone2019 VID (test‑dev): Reports results for multiple methods using $AP$, $AP_{50}$, $AP_{75}$, $AP_{eS}$, $AP_{rS}$, $AP_{gS}$, $AP_{m}$, and $AP_{l}$ (the last two being AP on medium and large objects).
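
The $AP_{eS}$, $AP_{rS}$, and $AP_{gS}$ columns split small objects into area bands. The sketch below shows how a box might be assigned to such a band. The area bounds used here (12², 20², and 32² square pixels) and the inclusive boundary handling are assumptions that this page does not state, so they should be checked against the XS‑VID paper before use; the names `SIZE_BUCKETS` and `size_bucket` are hypothetical.

```python
# Assumed upper area bounds in square pixels (NOT confirmed by this page).
SIZE_BUCKETS = [("eS", 12 ** 2), ("rS", 20 ** 2), ("gS", 32 ** 2)]


def size_bucket(box_w, box_h, buckets=SIZE_BUCKETS):
    """Return the first bucket whose upper bound covers the box area.

    Returns None when the box is larger than every bucket, i.e. it would
    fall under a medium or large category instead.
    """
    area = box_w * box_h
    for name, upper in buckets:
        if area <= upper:
            return name
    return None
```

Passing a different `buckets` list lets the same function be reused once the exact thresholds are verified.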

Checkpoints

  • YOLOFT‑L: Parameters 45.16 M, FLOPs 230.14 G, inference time 36 ms, dataset XS‑VID, checkpoint Link
  • YOLOFT‑S: Parameters 13.02 M, FLOPs 53.58 G, inference time 16 ms, dataset XS‑VID, checkpoint Link
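
To put the listed inference times in more familiar terms, the sketch below converts per-frame latency to an approximate throughput. It assumes the times are per-frame latencies at batch size 1, which this page does not specify; the function name is hypothetical.

```python
def latency_ms_to_fps(latency_ms):
    """Convert a per-frame latency in milliseconds to frames per second."""
    return 1000.0 / latency_ms


# Inference times as listed for the two checkpoints.
checkpoints = {"YOLOFT-L": 36, "YOLOFT-S": 16}
for name, ms in checkpoints.items():
    print(f"{name}: {latency_ms_to_fps(ms):.1f} FPS")
```

Under that assumption, 36 ms corresponds to roughly 27.8 FPS and 16 ms to 62.5 FPS.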