Explore high-quality datasets for your AI and machine learning projects.
UCF‑101 and HMDB‑51 are two video datasets used for training and evaluating action‑recognition models. UCF‑101 contains 101 action categories, each with more than 100 videos. HMDB‑51 contains 51 action categories, each with at least 101 videos.
The AVA dataset densely annotates 80 atomic visual actions across 57.6k movie clips, localizing each action in space and time. This yields 210k action labels, and a single person in a clip frequently carries multiple labels. Key features include: 1. Atomic visual actions are defined so that data need not be collected for every complex action; 2. Spatio‑temporal annotations are precise, with potentially multiple annotations per person; 3. The source material is diverse, real‑world video (movies).
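To make the idea of multiple annotations per person concrete, here is a minimal sketch. It assumes each annotation is a CSV row of the form video_id, timestamp, x1, y1, x2, y2, action_id, person_id; that column layout, the sample rows, and the function name group_labels are illustrative assumptions, not taken from the description above.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows; every value here is invented for illustration.
# Assumed columns: video_id, timestamp, x1, y1, x2, y2, action_id, person_id
SAMPLE = """\
vid001,0902,0.10,0.20,0.45,0.90,12,0
vid001,0902,0.10,0.20,0.45,0.90,79,0
vid001,0902,0.55,0.15,0.95,0.88,11,1
vid001,0903,0.12,0.21,0.46,0.91,12,0
"""


def group_labels(csv_text):
    """Map (video_id, timestamp, person_id) to a sorted list of action ids."""
    grouped = defaultdict(set)
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        video_id, timestamp, _x1, _y1, _x2, _y2, action_id, person_id = row
        key = (video_id, int(timestamp), int(person_id))
        grouped[key].add(int(action_id))
    return {key: sorted(actions) for key, actions in grouped.items()}


labels = group_labels(SAMPLE)
for key, actions in sorted(labels.items()):
    print(key, actions)
```

In this toy data, person 0 at timestamp 902 ends up with two action labels for the same bounding box, which is the "multiple annotations per person" case.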
The NTU dataset is a multi‑view video collection covering 60 different human actions, each captured by three cameras from distinct viewpoints. The data files contain per‑frame skeletal joint coordinates.
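One way to picture per‑frame skeletal data is as a list of frames, each holding one (x, y, z) coordinate per joint. The sketch below is an in‑memory representation only, not the dataset's actual file format; the class name SkeletonClip, its fields, the id ranges, and the three‑joint toy skeleton are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

Joint = Tuple[float, float, float]  # (x, y, z) coordinate of one joint


@dataclass
class SkeletonClip:
    """Hypothetical container for one recorded action clip."""
    action_id: int              # assumed to index one of the 60 actions
    camera_id: int              # assumed to index one of the three cameras
    frames: List[List[Joint]]   # frames[t][j] is joint j at frame t

    def num_frames(self) -> int:
        return len(self.frames)

    def joint_trajectory(self, joint_index: int) -> List[Joint]:
        """Return the path of a single joint across all frames."""
        return [frame[joint_index] for frame in self.frames]


# Toy clip: 2 frames, 3 joints per frame (real skeletons have more joints).
clip = SkeletonClip(
    action_id=1,
    camera_id=2,
    frames=[
        [(0.0, 1.0, 2.0), (0.1, 1.1, 2.0), (0.2, 1.2, 2.0)],
        [(0.0, 1.0, 2.1), (0.1, 1.2, 2.1), (0.2, 1.3, 2.1)],
    ],
)
trajectory = clip.joint_trajectory(1)
print(clip.num_frames(), trajectory)
```

Keeping the camera id alongside each clip makes it straightforward to split data by viewpoint, for example training on two cameras and testing on the third.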