VidSTG
The VidSTG dataset is built on the video relation dataset VidOR and targets spatio‑temporal video grounding, in particular for multi‑form sentences (descriptive captions and questions). It consists of video partition files and sentence annotation files; the annotations record video IDs, frame counts, frame rates, and frame dimensions, along with object, relation, and temporal ground‑truth annotations.
Updated 4/22/2024
github
Description
Dataset Overview
Source
- VidSTG: Constructed from the video relation dataset VidOR.
Composition
- Original VidOR: 7,000 training videos, 835 validation videos, and 2,165 test videos (test annotations are unavailable and thus omitted).
- VidSTG: 10% of the VidOR training videos are held out as validation data, and the original VidOR validation set serves as the test set.
Contents
- Video Partition Files: train_files.json, val_files.json, and test_files.json, containing the video IDs for each split.
- Sentence Annotation Files: train_annotations.json, val_annotations.json, and test_annotations.json.
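A minimal sketch of how a partition file and its matching annotation file might be loaded together. It assumes that each partition file is a JSON array of video ID strings and that each annotation file is a JSON array of objects carrying the video ID under a key named `vid`. Both the layout and the key name are assumptions, so check them against the downloaded files; `load_split` is a hypothetical helper, not part of the dataset.

```python
import json
from pathlib import Path


def load_split(root, split, vid_key="vid"):
    """Return (video_ids, annotations) for one split: 'train', 'val' or 'test'.

    Assumed layout (not confirmed by the dataset description):
      <split>_files.json        -> JSON array of video ID strings
      <split>_annotations.json  -> JSON array of objects with a `vid_key` field
    """
    root = Path(root)
    with open(root / f"{split}_files.json", encoding="utf-8") as f:
        video_ids = json.load(f)
    with open(root / f"{split}_annotations.json", encoding="utf-8") as f:
        annotations = json.load(f)

    # Keep only annotations whose video belongs to this split.
    allowed = set(video_ids)
    annotations = [a for a in annotations if a[vid_key] in allowed]
    return video_ids, annotations
```

Calling `load_split("path/to/VidSTG", "val")` would then return the validation video IDs and the sentence annotations that refer to them; the directory path here is a placeholder.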
Annotation Structure
- Video ID: Unique identifier.
- Frame Count: Number of frames.
- Resolution: Width and height.
- Subject/Object List: IDs and categories.
- Temporal Segment: Frame range used.
- Relations: Subject ID, object ID, predicate, and frame range.
- Temporal Ground‑Truth: Time span of each relation.
- Caption: Descriptive sentence.
- Question: Query sentence about the video.
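The fields listed above can be pictured as one record per annotated sentence. The sketch below models such a record with Python dataclasses. Every class and field name is hypothetical and chosen for readability, not taken from the annotation files. The `fps` field comes from the frame rate mentioned in the dataset summary, and whether the end frame of a range is inclusive is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameRange:
    begin: int  # first frame index
    end: int    # last frame index (inclusive/exclusive convention is assumed)

    def to_seconds(self, fps):
        """Convert the frame range to a (start, end) pair in seconds."""
        return self.begin / fps, self.end / fps

    def contains(self, other):
        """True if `other` lies entirely inside this range."""
        return self.begin <= other.begin and other.end <= self.end


@dataclass(frozen=True)
class Relation:
    subject_id: int
    object_id: int
    predicate: str
    span: FrameRange  # frame range over which the relation holds


@dataclass(frozen=True)
class SentenceAnnotation:
    video_id: str
    frame_count: int
    fps: float
    width: int
    height: int
    objects: dict            # object ID -> category
    segment: FrameRange      # temporal segment used
    relation: Relation
    temporal_gt: FrameRange  # temporal ground truth
    caption: str
    question: str
```

With this layout, `FrameRange(30, 90).to_seconds(30.0)` gives `(1.0, 3.0)`, and `segment.contains(temporal_gt)` is a quick sanity check that the temporal ground truth falls inside the segment used.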
Citation
If you use this dataset, please cite:
- VidSTG paper: Zhang, Zhu et al. "Where Does It Exist: Spatio‑Temporal Video Grounding for Multi‑Form Sentences". CVPR, 2020.
- VidOR paper: Shang, Xindi et al. "Annotating Objects and Relations in User‑Generated Videos". International Conference on Multimedia Retrieval, 2019.
Topics
Video Analysis
Spatio‑Temporal Localization
Source
Organization: github
Created: 3/24/2020