VidSTG

The VidSTG dataset is built on the video relation dataset VidOR and targets spatio-temporal video grounding, in particular grounding of multi-form sentences (both declarative captions and interrogative questions). It provides video partition files and sentence annotation files. The annotations record each video's ID, frame count, frame rate, and dimensions, along with object, relation, and temporal ground-truth annotations.

Updated 4/22/2024

Description

Dataset Overview

Source

  • VidSTG: Constructed from the video relation dataset VidOR.

Composition

  • Original VidOR: 7,000 training videos, 835 validation videos, and 2,165 test videos (test annotations are unavailable and thus omitted).
  • VidSTG: 10% of the VidOR training videos are held out as validation data, and the original VidOR validation set serves as the test set.
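
The re-partitioning described above can be sketched as a simple hold-out split. This is only an illustration: the source does not say how the 10% was selected, so the seeded random sampling, the function name `hold_out_validation`, and the placeholder video IDs are all assumptions. The partition files shipped with the dataset remain the authoritative split, and their sizes may differ from this arithmetic if only videos with sentence annotations are kept.

```python
import random

def hold_out_validation(train_ids, fraction=0.1, seed=0):
    """Split video IDs into (train, val), holding out `fraction` for validation.

    Assumption: the held-out videos are chosen uniformly at random.
    """
    ids = sorted(train_ids)      # sort so the result does not depend on input order
    rng = random.Random(seed)    # fixed seed keeps the split reproducible
    rng.shuffle(ids)
    n_val = round(len(ids) * fraction)
    return ids[n_val:], ids[:n_val]

# Placeholder IDs standing in for the 7,000 VidOR training videos.
vidor_train = [f"video_{i:04d}" for i in range(7000)]
train, val = hold_out_validation(vidor_train)
print(len(train), len(val))  # 6300 700
```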

Contents

  • Video Partition Files: train_files.json, val_files.json, test_files.json containing video IDs for each split.
  • Sentence Annotation Files: train_annotations.json, val_annotations.json, test_annotations.json.
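
A minimal sketch of reading one split from the files listed above. The file names come from the list, but the layout inside them is an assumption: each partition file is taken to be a JSON list of video IDs, each annotation file a JSON list of records, and the `vid` key is a hypothetical field name. Check the downloaded files and adjust accordingly.

```python
import json
from pathlib import Path

def load_split(root, split):
    """Return (video_ids, annotations) for one of 'train', 'val', 'test'."""
    root = Path(root)
    with open(root / f"{split}_files.json", encoding="utf-8") as f:
        video_ids = json.load(f)      # assumed: a JSON list of video IDs
    with open(root / f"{split}_annotations.json", encoding="utf-8") as f:
        annotations = json.load(f)    # assumed: a JSON list of annotation records
    return video_ids, annotations

def group_by_video(annotations, id_key="vid"):
    """Group annotation records by video ID; `id_key` is a hypothetical field name."""
    grouped = {}
    for record in annotations:
        grouped.setdefault(record[id_key], []).append(record)
    return grouped
```

Grouping by video ID is convenient because a single video can carry several sentence annotations.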

Annotation Structure

  • Video ID: Unique identifier.
  • Frame Count: Number of frames.
  • Frame Rate: Frames per second.
  • Resolution: Width and height.
  • Subject/Object List: IDs and categories.
  • Temporal Segment: Frame range of the video that the annotation uses.
  • Relations: Subject ID, object ID, predicate, and the frame range over which the relation holds.
  • Temporal Ground-Truth: Frame range that the sentence should be grounded to.
  • Caption: Descriptive sentence.
  • Question: Query sentence about the video.
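
The fields above can be modeled roughly as follows. This is a sketch under stated assumptions, not the dataset's actual schema: the class and attribute names (`FrameSpan`, `begin_fid`, `end_fid`, `Relation`) are hypothetical, the example values are made up, and whether the end frame is inclusive is not stated in the source. It shows how a frame range converts to seconds using the video's frame rate, and how to check that a relation lies within the segment being used.

```python
from dataclasses import dataclass

@dataclass
class FrameSpan:
    begin_fid: int  # first frame index (hypothetical field name)
    end_fid: int    # last frame index (hypothetical field name)

    def to_seconds(self, fps):
        """Convert the frame range to (start, end) in seconds."""
        return self.begin_fid / fps, self.end_fid / fps

@dataclass
class Relation:
    subject_id: int
    object_id: int
    predicate: str
    span: FrameSpan

def span_within(inner, outer):
    """True if `inner` lies entirely inside `outer`."""
    return outer.begin_fid <= inner.begin_fid and inner.end_fid <= outer.end_fid

# Illustrative values only, not taken from the dataset.
segment = FrameSpan(begin_fid=0, end_fid=210)
relation = Relation(subject_id=0, object_id=1, predicate="in_front_of",
                    span=FrameSpan(begin_fid=30, end_fid=90))

print(relation.span.to_seconds(fps=30.0))   # (1.0, 3.0)
print(span_within(relation.span, segment))  # True
```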

Citation

If you use this dataset, please cite:

  • VidSTG paper: Zhang, Zhu et al. "Where Does It Exist: Spatio‑Temporal Video Grounding for Multi‑Form Sentences". CVPR, 2020.
  • VidOR paper: Shang, Xindi et al. "Annotating Objects and Relations in User‑Generated Videos". International Conference on Multimedia Retrieval, 2019.

Topics

Video Analysis
Spatio‑Temporal Localization

Source

Organization: github

Created: 3/24/2020
