ccmusic-database/music_genre

The dataset comprises approximately 1,700 music excerpts in .mp3 format, each lasting 270–300 seconds and sampled at 22 kHz. The excerpts are taken from NetEase Cloud Music and are labelled with 16 genre categories. The dataset is divided into a Raw Subset and an Eval Subset, each providing different audio features and annotations. It was created to foster AI research in the music industry and was mainly collected and annotated by students. The dataset is intended for audio‑classification tasks, and its listed languages are Chinese and English.

Updated 3/21/2025

Dataset Overview

Basic Information

  • Name: Music Genre Dataset
  • License: MIT
  • Languages: Chinese, English
  • Tags: music, art
  • Size: 10K<n<100K

Description

  • Overview: Contains about 1,700 music pieces in .mp3 format, each 270–300 seconds long, sampled at 22 kHz. The pieces are divided into 16 distinct music styles.
  • Source: Data sourced from NetEase Cloud Music; genre tags are included with the downloads.
  • Classification: 16 music styles.

Structure

  • Audio Format: .mp3
  • Sample Rate: 22 kHz
  • Duration Range: 270–300 seconds
  • Label Taxonomy: Three‑level hierarchy (2 coarse classes, 9 middle classes, 16 fine‑grained classes).

Usage Example

  • Loading: Use the load_dataset function from the Hugging Face datasets library; both the eval and default subsets are available.
  • Processing: The dataset includes train, validation, and test splits and supports audio‑classification tasks.
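
The loading step above can be sketched as follows. This is a minimal sketch, not the maintainers' own example: it assumes the Hugging Face datasets library is installed and that the repository id and the subset names ("default", "eval") match this listing. The helper load_subset and the two constants are hypothetical names introduced for illustration.

```python
DATASET_ID = "ccmusic-database/music_genre"  # repository id as given in this listing
SUBSETS = ("default", "eval")                # subset names as given in this listing


def load_subset(subset="eval"):
    """Load one subset and return the object produced by load_dataset."""
    if subset not in SUBSETS:
        raise ValueError(f"unknown subset {subset!r}; expected one of {SUBSETS}")
    # Imported lazily so the name check above works even without the library.
    from datasets import load_dataset
    return load_dataset(DATASET_ID, name=subset)
```

Calling load_subset("eval") should return a mapping keyed by split name, so the train, validation, and test splits mentioned above would be reached as ds["train"], ds["validation"], and ds["test"]. The column names are not stated in this listing, so inspect ds["train"].features before relying on any of them.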

Creation

  • Collection & Annotation: Collected and annotated by CCMUSIC students; about 1,700 pieces grouped into 16 styles.
  • Copyright: Only spectrograms are provided due to copyright restrictions.

Notes

  • Language Bias: The majority of tracks are in English.
  • Sample Balance: Class distribution is imbalanced.
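
Because the class distribution is imbalanced, a classifier trained on this data may benefit from per-class loss weights. The sketch below shows one common option, inverse-frequency weighting; it is not described by the dataset authors, and the function name and the toy labels are assumptions made for illustration.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return {label: weight}, where rarer labels receive larger weights.

    weight = n_samples / (n_classes * count), so a perfectly balanced
    label list yields a weight of 1.0 for every class.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Toy labels only; the real genre names are not listed on this page.
toy_labels = ["genre_a"] * 6 + ["genre_b"] * 2
weights = inverse_frequency_weights(toy_labels)
```

With the toy labels, the rarer class genre_b receives three times the weight of genre_a. On the real data, the label list would come from the training split only, so that the validation and test splits do not influence the weights.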

License

  • Type: MIT License
  • Copyright Holder: CCMUSIC
  • Conditions: Permits free use, copying, modification, merging, publication, distribution, sublicensing, and sale of copies, provided the copyright notice and license terms are retained.

Citation

  • Authors: Monan Zhou, Shenyang Xu, Zhaorui Liu, Zhaowen Wang, Feng Yu, Wei Li, Baoqiang Han
  • Title: CCMusic: an Open and Diverse Database for Chinese and General Music Information Retrieval Research
  • Year: 2024
  • Version: 1.2
  • URL: https://huggingface.co/ccmusic-database

Topics

Music Genre Classification
Audio Classification

Source

Organization: hugging_face

Created: Unknown