confit/crema-d

This is an audio classification dataset intended primarily for emotion classification. It provides training, validation, and test splits. Each sample contains four fields: a file string, the audio data, an emotion string, and a categorical label. The label covers six emotion categories: anger, disgust, fear, happy, neutral, and sad. The audio sampling rate is 16 kHz. The download size is about 606 MB and the total dataset size is about 608 MB.

Updated 3/29/2024

hugging_face

Description

Dataset Overview

Task Category

Audio Classification

Features

file: string
audio: audio data, sampled at 16 kHz
emotion: string
label: categorical label with the following emotion categories:
- 0: anger
- 1: disgust
- 2: fear
- 3: happy
- 4: neutral
- 5: sad
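The label scheme above can be written down as a pair of lookup tables. This is a minimal sketch in plain Python; the names `ID2LABEL`, `LABEL2ID`, and `decode_label` are illustrative and not part of the dataset itself.

```python
# Integer-to-name mapping copied from the "label" feature listed above.
ID2LABEL = {
    0: "anger",
    1: "disgust",
    2: "fear",
    3: "happy",
    4: "neutral",
    5: "sad",
}

# Reverse mapping, useful when converting emotion names back to class ids.
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def decode_label(label_id: int) -> str:
    """Return the emotion name for an integer class id (hypothetical helper)."""
    return ID2LABEL[label_id]


if __name__ == "__main__":
    for idx in sorted(ID2LABEL):
        print(idx, decode_label(idx))
```

Keeping both directions in one place avoids hard-coding class ids when reading predictions or building a classifier head with six outputs.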

Data Split

Training set: 5,209 samples, 425,762,803.75 bytes (about 425.8 MB)
Validation set: 1,116 samples, 91,023,972.432 bytes (about 91.0 MB)
Test set: 1,117 samples, 91,269,786.5 bytes (about 91.3 MB)
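The split proportions follow directly from the sample counts above. The short sketch below only does that arithmetic; the dictionary keys are descriptive labels chosen here, not necessarily the split names used in the repository.

```python
# Sample counts copied from the Data Split section.
SPLIT_ROWS = {"training": 5209, "validation": 1116, "test": 1117}

total_rows = sum(SPLIT_ROWS.values())

# Fraction of the whole dataset held by each split.
shares = {name: count / total_rows for name, count in SPLIT_ROWS.items()}

if __name__ == "__main__":
    print(f"total: {total_rows} samples")
    for name, count in SPLIT_ROWS.items():
        print(f"{name}: {count} samples ({shares[name]:.1%})")
```

The counts add up to 7,442 samples, split roughly 70/15/15 across training, validation, and test.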

Size

Download size: 606,141,777 bytes
Total dataset size: 608,056,562.682 bytes
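As a consistency check, the per-split sizes can be summed and compared with the reported total. The sketch below uses `Decimal` so the fractional byte counts are added exactly; all variable names are illustrative.

```python
from decimal import Decimal

# Per-split sizes in bytes, copied from the Data Split section.
SPLIT_BYTES = {
    "training": Decimal("425762803.75"),
    "validation": Decimal("91023972.432"),
    "test": Decimal("91269786.5"),
}

# Totals copied from the Size section.
TOTAL_DATASET_BYTES = Decimal("608056562.682")
DOWNLOAD_BYTES = Decimal("606141777")

split_sum = sum(SPLIT_BYTES.values(), Decimal(0))


def to_megabytes(num_bytes: Decimal) -> Decimal:
    """Convert bytes to decimal megabytes (1 MB = 10**6 bytes)."""
    return num_bytes / Decimal(10**6)


if __name__ == "__main__":
    print("splits sum to reported total:", split_sum == TOTAL_DATASET_BYTES)
    print(f"download: {to_megabytes(DOWNLOAD_BYTES):.1f} MB")
    print(f"dataset:  {to_megabytes(TOTAL_DATASET_BYTES):.1f} MB")
```

The three split sizes sum exactly to the reported total dataset size, and both totals match the approximate megabyte figures when 1 MB is taken as 10^6 bytes.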

Configuration

Default config: provides paths for training, validation, and test data
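One way to load the default config is through the Hugging Face `datasets` library. The following is a sketch under assumptions rather than a documented recipe: it assumes the repository id matches the title of this page (`confit/crema-d`), that the splits are named `train`, `validation`, and `test`, and that `datasets` is installed with audio support. The helper names are hypothetical.

```python
REPO_ID = "confit/crema-d"  # assumed to match the title of this page

# Expected row counts from the Data Split section; the split names are assumed.
EXPECTED_ROWS = {"train": 5209, "validation": 1116, "test": 1117}


def load_crema_d():
    """Download the default config and decode audio at 16 kHz (needs network)."""
    # Imported lazily so the rest of this module works without `datasets`.
    from datasets import Audio, load_dataset

    dataset = load_dataset(REPO_ID)
    # Re-declare the audio column so clips are decoded at 16 kHz on access.
    return dataset.cast_column("audio", Audio(sampling_rate=16_000))


def check_split_sizes(rows_by_split):
    """Compare observed row counts with the counts listed on this page."""
    return {
        name: rows_by_split.get(name) == expected
        for name, expected in EXPECTED_ROWS.items()
    }


if __name__ == "__main__":
    ds = load_crema_d()
    observed = {name: split.num_rows for name, split in ds.items()}
    print(check_split_sizes(observed))
```

If the actual split names differ from the assumed ones, `check_split_sizes` reports `False` for the missing entries rather than raising an error.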

Topics

Audio Classification

Emotion Recognition

Source

Organization: hugging_face

Created: Unknown
