Dataset asset | Open Source Community | Language Models | Video Understanding
MBZUAI/VideoInstruct-100K
VideoInstruct100K is a high‑quality video‑dialogue dataset created through human‑in‑the‑loop and semi‑automatic annotation techniques. The Q&A content covers video summarization, description‑based question answering (exploring spatial, temporal, relational, and reasoning concepts), and creative/generative question answering.
Source
hugging_face
Created
Nov 28, 2025
Updated
Sep 29, 2023
Overview
VideoInstruct100K Dataset Overview
Dataset Description
VideoInstruct100K is a high‑quality video‑dialogue dataset generated using human‑in‑the‑loop and semi‑automatic annotation techniques. The Q&A content covers the following areas:
- Video summarization
- Description‑based question answering (exploring spatial, temporal, relational, and reasoning concepts)
- Creative/generative question answering
Citation Information
If you find this dataset useful, please consider citing the following paper:
@article{Maaz2023VideoChatGPT,
title={Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models},
author={Muhammad Maaz and Hanoona Rasheed and Salman Khan and Fahad Khan},
journal={ArXiv 2306.05424},
year={2023}
}