NTU Dataset
The NTU dataset is a multi‑video collection that records 60 different human actions, each captured by three cameras from distinct viewpoints. The data files contain per‑frame skeletal coordinates.
Source
GitHub
Created
Jul 24, 2020
Updated
Jul 24, 2020
Overview
Dataset description and usage context
Dataset Description
The NTU dataset is a collection of videos covering 60 human action classes. Each action was recorded by three cameras positioned at different viewpoints.
List of Recorded Actions
- Drinking water
- Eating/snacking
- Brushing teeth
- Combing hair
- Dropping objects
- Picking up objects
- Throwing
- Sitting down
- Standing up from a seated position
- Clapping
- Reading
- Writing
- Tearing paper
- Putting on a jacket
- Removing a jacket
- Putting on shoes
- Removing shoes
- Wearing glasses
- Taking off glasses
- Wearing a hat/cap
- Removing a hat/cap
- Cheering
- Waving
- Kicking something
- Putting something into a pocket / taking something out of a pocket
- Jumping on one foot
- Jumping
- Making a phone call / answering a phone call
- Using a phone/tablet
- Typing on a keyboard
- Pointing with a finger
- Taking a selfie
- Checking the time (from a watch)
- Rubbing hands
- Nodding / bowing
- Shaking head
- Rubbing face
- Saluting
- Pressing palms together
- Crossing hands in front (signifying stop)
- Sneezing / coughing
- Staggering
- Falling down
- Touching head (headache)
- Touching chest (stomachache/heart pain)
- Touching back (back pain)
- Touching neck (neck pain)
- Nausea or vomiting
- Using a fan (hand or paper) / feeling hot
- Hitting / slapping someone
- Kicking someone
- Pushing someone
- Patting someone's back
- Pointing at someone with a finger
- Hugging someone
- Giving something to someone
- Touching someone's pocket
- Shaking hands
- Walking towards each other
- Walking away from each other
File Naming Convention
Each file name encodes the following information:
- Setup number
- Camera ID
- Performer ID
- Replication number
- Action class label
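The fields above can be pulled out of a file name with a short parser. This is a sketch under an assumption: the page does not spell out the exact format, so the code below assumes the common NTU-style pattern `SsssCcccPpppRrrrAaaa` (e.g. `S001C002P003R002A013.skeleton`), with three digits per field.

```python
import re

# Hypothetical pattern: assumes the common NTU-style file name
# SsssCcccPpppRrrrAaaa, e.g. "S001C002P003R002A013.skeleton".
# The exact convention is not stated on this page.
NAME_RE = re.compile(
    r"S(?P<setup>\d{3})C(?P<camera>\d{3})"
    r"P(?P<performer>\d{3})R(?P<replication>\d{3})A(?P<action>\d{3})"
)

def parse_name(filename: str) -> dict:
    """Extract setup, camera, performer, replication, and action fields."""
    m = NAME_RE.search(filename)
    if m is None:
        raise ValueError(f"unrecognized filename: {filename}")
    return {k: int(v) for k, v in m.groupdict().items()}

print(parse_name("S001C002P003R002A013.skeleton"))
# {'setup': 1, 'camera': 2, 'performer': 3, 'replication': 2, 'action': 13}
```

Parsing the action field this way makes it easy to map each file to its class label when building a training set.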
File Content
Each file contains per‑frame skeletal coordinates. Only the first three coordinate columns are retained by the accompanying code when training neural networks.
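Keeping only the first three columns can be sketched as below. Note the assumptions: the page does not describe the exact file layout, so this sketch assumes each line holds one joint's values as whitespace‑separated numbers, with the coordinates of interest in the first three columns.

```python
from io import StringIO

# Hypothetical reader: assumes each line holds one joint's values as
# whitespace-separated numbers, and that the coordinates to keep are
# the first three columns. The real NTU file layout may differ.
def load_xyz(lines):
    rows = []
    for line in lines:
        values = [float(v) for v in line.split()]
        rows.append(values[:3])  # keep only the first three columns
    return rows

# Example with an in-memory file instead of a real dataset file:
fake = StringIO("0.1 0.2 0.3 9 9\n0.4 0.5 0.6 9 9\n")
print(load_xyz(fake))  # [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
```

The resulting per‑joint triples can then be stacked per frame into the tensor shape a skeleton‑based action‑recognition model expects.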