
COIN Dataset

COIN is presented as the largest dataset for comprehensive instructional video analysis at the time of its release. It contains 11,827 videos covering 180 different tasks across 12 domains. All videos are collected from YouTube and annotated using an efficient toolbox.

Source: GitHub
Created: Mar 4, 2019
Updated: May 24, 2024

Dataset Overview

Name: COIN Dataset

Scale: Contains 11,827 videos covering 180 different tasks across 12 domains.

Content: The videos cover everyday domains such as vehicle maintenance and cooking, with tasks such as polishing a car and making French fries.

Source: All videos are collected from YouTube.

Annotation Tools: Annotated using a specialized toolbox.

Dataset Structure

Hierarchy: The dataset is organized in a three‑level hierarchy comprising domain, task, and step levels.
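
The three-level hierarchy can be pictured as nested mappings. The following is a minimal sketch, not the dataset's official taxonomy: the domain and task names echo the examples on this page, while the step names and the helper `iter_steps` are hypothetical and exist only for illustration.

```python
# Illustrative only: names are placeholders, not the dataset's official taxonomy.
hierarchy = {
    "vehicle maintenance": {                                  # domain level
        "car polishing": ["apply polish", "buff surface"],    # task -> steps
    },
    "cooking": {
        "french fry making": ["cut potatoes", "fry potatoes"],
    },
}


def iter_steps(tree):
    """Yield (domain, task, step) triples by walking the three levels."""
    for domain, tasks in tree.items():
        for task, steps in tasks.items():
            for step in steps:
                yield domain, task, step


triples = list(iter_steps(hierarchy))
for domain, task, step in triples:
    print(f"{domain} > {task} > {step}")
```

Flattening the tree into triples like this is one convenient way to count or filter steps per task or per domain.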

File Formats:

  • Video and annotation information: Stored in JSON files, containing YouTube ID, duration, task name, video URL, start and end times, subset type, task ID, and detailed annotation information.
  • Annotation details: Include annotation ID, name, and time intervals.
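
As a rough illustration of how such a JSON file might be read, the sketch below parses one inline record and computes the length of each annotated step. The key names (`database`, `class`, `recipe_type`, `subset`, `annotation`, `segment`, `label`) and all sample values are assumptions made for this example; check them against the actual annotation file before relying on them.

```python
import json

# Hypothetical record: key names and values are assumptions for illustration.
SAMPLE = """
{
  "database": {
    "VIDEO_ID_PLACEHOLDER": {
      "duration": 120.0,
      "class": "ExampleTask",
      "video_url": "https://www.youtube.com/embed/VIDEO_ID_PLACEHOLDER",
      "start": 10.0,
      "end": 60.0,
      "subset": "training",
      "recipe_type": 1,
      "annotation": [
        {"id": "1", "label": "first step", "segment": [10.0, 25.0]},
        {"id": "2", "label": "second step", "segment": [30.0, 60.0]}
      ]
    }
  }
}
"""


def step_durations(raw):
    """Return {youtube_id: [(step_label, seconds), ...]} for every video."""
    data = json.loads(raw)
    result = {}
    for youtube_id, video in data["database"].items():
        result[youtube_id] = [
            # Each segment is assumed to be a [start, end] pair in seconds.
            (step["label"], step["segment"][1] - step["segment"][0])
            for step in video["annotation"]
        ]
    return result


durations = step_durations(SAMPLE)
print(durations)
```

For a real annotation file, the string passed to `step_durations` would come from reading the file on disk rather than from the inline sample.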

Usage License

License Type: Research use only. Sharing and modifying the material is permitted, but users must not imply that the licensor supports or endorses them or their use of the material.
