AVA Dataset

The AVA dataset densely annotates 80 atomic visual actions across 57.6k movie clips. Each action is localized in space and time, yielding 210k action labels, and a single clip often contains several labeled people. Key features: 1. Actions are defined at the atomic level, which avoids collecting separate data for every complex action; 2. Spatio‑temporal annotations are precise, and one person may carry multiple action labels; 3. The source material is diverse real-world video (movies).

Source

github

Created

Oct 23, 2017

Updated

Mar 30, 2024

Overview

Google AVA Dataset Overview

Dataset Content

Training and test annotations: Included in the dataset.
YouTube IDs of all videos: Provided separately for training and test sets.
action_id: Identifier for action categories.
Download method for restricted videos: Covers the videos that cannot be downloaded directly due to copyright.

Dataset Characteristics

Dense annotation: 80 atomic visual actions annotated across 57.6k movie clips, producing 210k action labels.
Spatio‑temporal localization: Actions are precisely localized in space and time.
Diversity: Uses diverse real video material (movies).

Dataset Structure

Number of videos: 192 total, with 154 for training and 38 for testing.
Annotation scheme: In each video, a 15‑minute span is annotated at 3‑second intervals, giving 300 annotated segments per video (900 s / 3 s = 300) and 192 × 300 = 57.6k segments overall.
Annotation files: Two CSV files are used, ava_train_v1.0.csv and ava_test_v1.0.csv.
Annotation format: Each row describes one action performed by one person and contains the video ID, middle‑frame timestamp, person bounding box, and action ID. A person performing several actions at once appears in several rows.
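
The annotation format described above can be read with a few lines of Python. This is a minimal sketch, not an official loader. It assumes the columns appear in the order video ID, timestamp, four bounding-box coordinates (x1, y1, x2, y2), action ID, and that the file has no header row. The names AvaAnnotation, parse_ava_rows, and group_actions_by_person are hypothetical, and the sample rows are made up for illustration rather than taken from the dataset.

```python
import csv
import io
from collections import defaultdict
from typing import NamedTuple


class AvaAnnotation(NamedTuple):
    video_id: str
    timestamp: float   # middle-frame timestamp, in seconds
    box: tuple         # assumed order: (x1, y1, x2, y2)
    action_id: int


def parse_ava_rows(lines):
    """Parse CSV lines into annotations (column order is an assumption)."""
    annotations = []
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        video_id, ts, x1, y1, x2, y2, action_id = row[:7]
        box = (float(x1), float(y1), float(x2), float(y2))
        annotations.append(AvaAnnotation(video_id, float(ts), box, int(action_id)))
    return annotations


def group_actions_by_person(annotations):
    """Collect all action IDs that share a video, timestamp, and box."""
    grouped = defaultdict(list)
    for ann in annotations:
        grouped[(ann.video_id, ann.timestamp, ann.box)].append(ann.action_id)
    return dict(grouped)


# Made-up rows for illustration only; real files would be opened with open().
SAMPLE_CSV = (
    "vid_A,902,0.10,0.20,0.50,0.90,12\n"
    "vid_A,902,0.10,0.20,0.50,0.90,17\n"
    "vid_A,905,0.30,0.10,0.70,0.80,12\n"
)

annotations = parse_ava_rows(io.StringIO(SAMPLE_CSV))
grouped = group_actions_by_person(annotations)
for key, action_ids in grouped.items():
    print(key, action_ids)
```

Grouping by video, timestamp, and box is one way to recover the multiple labels that a single person can carry; for a real file, replace the io.StringIO sample with open("ava_train_v1.0.csv", newline="").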

Download and Usage

Download links: Baidu Cloud link and WeChat peer‑to‑peer sharing.
Video download tool: youtube-dl is recommended for downloading the YouTube videos.
Copyright-restricted videos: Downloading these requires registering through a separate process.
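
Because the videos are distributed as YouTube IDs, fetching one comes down to building a URL from its ID and passing it to youtube-dl. The sketch below only assembles the command and does not run it. The function name build_download_command, the output directory, and the placeholder ID are hypothetical; the -o option is youtube-dl's output filename template.

```python
def build_download_command(video_id, output_dir="videos"):
    """Return a youtube-dl command line for one YouTube video ID."""
    url = "https://www.youtube.com/watch?v=" + video_id
    # %(id)s and %(ext)s are filled in by youtube-dl's output template.
    output_template = output_dir + "/%(id)s.%(ext)s"
    return ["youtube-dl", "-o", output_template, url]


# Placeholder ID for illustration; real IDs come from the dataset's ID lists.
command = build_download_command("VIDEO_ID_PLACEHOLDER")
print(" ".join(command))
```

To actually download, the list can be passed to subprocess.run(command, check=True), provided youtube-dl is installed. Videos that are unavailable or copyright-restricted will fail and need the separate process mentioned above.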

Dataset License

License type: The dataset follows the Creative Commons Attribution 4.0 International (CC BY 4.0) license.
