VocalnetOpenDataset

An open-source Chinese singing-voice synthesis dataset of guofeng (national style) music; some songs include opera-style vocal passages. The dataset is split into a main corpus and a set of scattered audio clips intended for training a neural vocoder. The main corpus contains several full or partial songs, and each song may span multiple tracks.

Updated 12/28/2020
github

Description

VocalnetOpenDataset Overview

Dataset Description

Audio Characteristics

  • Sampling Rate: 32 kHz
  • Bit Depth: 16‑bit
  • Channels: Mono
  • Audio Quality: Studio‑grade, processed with a noise gate
  • Composition: Each song may consist of multiple tracks
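Because the stated format is fixed (32 kHz, 16-bit, mono), a quick format check after download can catch resampled or mislabeled files. The following is a minimal sketch using Python's standard `wave` module; it assumes the audio is distributed as uncompressed WAV, which the listing does not confirm, and the function name `check_wav_format` is hypothetical.

```python
import wave

# Expected format, taken from the Audio Characteristics list.
EXPECTED_RATE_HZ = 32000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16-bit
EXPECTED_CHANNELS = 1            # mono


def check_wav_format(path):
    """Return a list of mismatches between a WAV file and the stated format.

    An empty list means the file matches. Assumes uncompressed PCM WAV.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != EXPECTED_RATE_HZ:
            problems.append(f"sample rate is {wav.getframerate()} Hz")
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH_BYTES:
            problems.append(f"sample width is {wav.getsampwidth() * 8} bits")
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"channel count is {wav.getnchannels()}")
    return problems
```

Calling `check_wav_format("some_track.wav")` on each track (the file name here is a placeholder) gives a per-file report that can be logged before training.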

Dataset Content

  • Main Corpus:
    • Contains several full or partial songs (exact count not disclosed)
    • Predominantly guofeng style; some songs may include opera-style vocal passages
    • Lyrics are not provided; users must collect them independently
  • Scattered Audio Set:
    • Total duration after silence removal is unspecified
    • Combined into a single audio file for convenient slicing
    • Content may contain duplicates and has not undergone manual cleaning
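Since the scattered clips are shipped as one combined file, users will typically need to cut it back into segments before vocoder training. The sketch below shows one simple way to do that with an amplitude threshold. It is not the dataset authors' procedure. It assumes a 16-bit mono WAV file, and the `threshold` and `min_gap` values are illustrative guesses that would need tuning. Noise-gated audio tends to have near-silent gaps, which is why a plain threshold may be sufficient here.

```python
import array
import sys
import wave


def read_mono_16bit(path):
    """Read a 16-bit mono WAV file and return (sample_rate, samples)."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("expected 16-bit mono audio")
        rate = wav.getframerate()
        samples = array.array("h")
        samples.frombytes(wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    return rate, samples


def find_segments(samples, threshold=500, min_gap=3200):
    """Return (start, end) sample-index pairs of non-silent regions.

    A sample counts as loud when its absolute value exceeds `threshold`.
    A region ends once `min_gap` samples have passed since the last loud
    sample; shorter pauses are kept inside the region. `end` is exclusive.
    """
    segments = []
    start = None
    last_loud = None
    for i, sample in enumerate(samples):
        if abs(sample) > threshold:
            if start is None:
                start = i
            last_loud = i
        elif start is not None and i - last_loud >= min_gap:
            segments.append((start, last_loud + 1))
            start = None
    if start is not None:
        segments.append((start, last_loud + 1))
    return segments
```

At 32 kHz, the default `min_gap=3200` corresponds to 0.1 s of quiet. Because the set may contain duplicates and has not been manually cleaned, the resulting segments are best treated as candidates to review rather than as a finished training list.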

Annotation Status

  • Partial manual alignment and sentence segmentation have been completed
  • Annotations are provided in Praat TextGrid format; additional formats will be added in the future
  • Ongoing annotation effort; contributions from volunteers are welcome
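To use the alignments outside Praat, the interval boundaries and labels can be pulled out of a TextGrid with a small parser. The sketch below is an illustration only: it assumes the files use Praat's long (verbose) text format and have already been decoded to a Python string, it flattens all interval tiers into one list, and the name `parse_intervals` is hypothetical. The tier layout of this dataset's annotations is not described in the listing.

```python
import re

# Matches one interval entry in Praat's long TextGrid text format.
# Praat escapes a double quote inside a label by doubling it ("").
INTERVAL_RE = re.compile(
    r'intervals \[\d+\]:\s*'
    r'xmin = ([-+0-9.eE]+)\s*'
    r'xmax = ([-+0-9.eE]+)\s*'
    r'text = "((?:[^"]|"")*)"'
)


def parse_intervals(textgrid_text):
    """Return (start_seconds, end_seconds, label) for each labeled interval.

    Intervals with an empty or whitespace-only label are skipped.
    Intervals from all tiers are returned in file order.
    """
    intervals = []
    for xmin, xmax, label in INTERVAL_RE.findall(textgrid_text):
        label = label.replace('""', '"')
        if label.strip():
            intervals.append((float(xmin), float(xmax), label))
    return intervals
```

TextGrid files are sometimes saved as UTF-16, so the encoding may need to be detected when reading the file from disk. For anything beyond quick inspection, a dedicated TextGrid library is likely a safer choice than a regular expression.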

Topics

Singing Voice Synthesis
Guofeng Music

Source

Organization: github

Created: 7/14/2020
