VocalnetOpenDataset
An open‑source Chinese singing‑voice synthesis dataset made up of guofeng (national style) music; some songs include opera‑style vocal passages. The dataset is split into a main corpus and a set of scattered audio clips intended for training a neural‑network vocoder. The main corpus contains multiple full or partial songs, each of which may span several tracks.
Updated 12/28/2020
VocalnetOpenDataset Overview
Dataset Description
- Type: Chinese Singing‑Voice Synthesis Dataset
- License: Creative Commons Attribution‑ShareAlike 4.0 International (CC BY‑SA 4.0). Commercial use and redistribution are permitted with attribution, provided that derivative works are shared under the same license.
Audio Characteristics
- Sampling Rate: 32 kHz
- Bit Depth: 16‑bit
- Channels: Mono
- Audio Quality: Studio‑grade, processed with a noise gate
- Composition: Each song may consist of multiple tracks
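The format listed above (32 kHz, 16‑bit, mono) can be checked before training. The following is a minimal sketch, assuming the audio is distributed as uncompressed WAV files (the listing does not state the container format); the function names are hypothetical and not part of the dataset.

```python
import wave

# Format stated in the dataset description: 32 kHz, 16-bit (2 bytes), mono.
EXPECTED = {"sample_rate": 32000, "sample_width_bytes": 2, "channels": 1}


def read_wav_format(path):
    """Read the basic format fields from a WAV file header."""
    with wave.open(path, "rb") as wav:
        return {
            "sample_rate": wav.getframerate(),
            "sample_width_bytes": wav.getsampwidth(),
            "channels": wav.getnchannels(),
        }


def matches_expected(path):
    """Return True if the file has the format listed in the description."""
    return read_wav_format(path) == EXPECTED
```

Running `matches_expected` over every file is a cheap way to catch a resampled or stereo file before it reaches feature extraction.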
Dataset Content
- Main Corpus:
  - Contains several full or partial songs (exact count not disclosed)
  - Predominantly guofeng style; may include opera‑style vocal passages
  - Lyrics are not provided; users must collect them independently
- Scattered Audio Set:
  - Total duration after silence removal is unspecified
  - Combined into a single audio file for convenient slicing
  - Content may contain duplicates and has not undergone manual cleaning
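Because the scattered audio set ships as one combined file, it has to be cut into clips before vocoder training. The following is a minimal sketch of one way to do that: it splits a sequence of integer samples wherever a long enough quiet run occurs. The function name, the amplitude threshold, and the minimum gap length are all assumptions chosen for illustration, not values documented by the dataset.

```python
def find_segments(samples, sample_rate=32000, threshold=0, min_silence_s=0.2):
    """Return (start, end) sample-index pairs for non-silent regions.

    A sample counts as loud when abs(sample) > threshold. Two loud samples
    belong to different segments when at least min_silence_s seconds of
    quiet samples lie between them. The end index is exclusive.
    """
    min_gap = int(min_silence_s * sample_rate)
    segments = []
    start = None      # first loud sample of the current segment
    last_loud = None  # most recent loud sample seen so far
    for i, sample in enumerate(samples):
        if abs(sample) > threshold:
            if start is None:
                start = i
            elif i - last_loud > min_gap:
                # The quiet run was long enough: close the segment.
                segments.append((start, last_loud + 1))
                start = i
            last_loud = i
    if start is not None:
        segments.append((start, last_loud + 1))
    return segments
```

Each returned pair can be used to slice the sample array directly. Since the set may contain duplicates and has not been manually cleaned, the resulting clips would likely still need deduplication.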
Annotation Status
- Partial manual alignment and sentence segmentation have been completed
- Annotations are provided in Praat TextGrid format; additional formats will be added in the future
- Ongoing annotation effort; contributions from volunteers are welcome
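The TextGrid annotations can be read without Praat itself. The following is a minimal sketch, assuming the files use Praat's long ("full text") TextGrid format and contain interval tiers; it flattens all tiers into one list and ignores tier names. The function name is hypothetical, and the text encoding of the files (Praat can write UTF‑8 or UTF‑16) is left to the caller.

```python
import re

# Matches one interval entry in a long-format TextGrid:
#   xmin = <number>
#   xmax = <number>
#   text = "<label>"
# Praat escapes a double quote inside a label by doubling it ("").
_INTERVAL = re.compile(
    r'xmin\s*=\s*([-+0-9.eE]+)\s+'
    r'xmax\s*=\s*([-+0-9.eE]+)\s+'
    r'text\s*=\s*"((?:[^"]|"")*)"'
)


def parse_intervals(textgrid_text):
    """Return a list of (start_seconds, end_seconds, label) tuples."""
    intervals = []
    for match in _INTERVAL.finditer(textgrid_text):
        start = float(match.group(1))
        end = float(match.group(2))
        label = match.group(3).replace('""', '"')
        intervals.append((start, end, label))
    return intervals
```

File‑level and tier‑level `xmin`/`xmax` pairs are skipped because they are not followed by a `text` field. Since only part of the corpus has been aligned, callers should expect some songs to have no TextGrid at all.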
Topics
Singing Voice Synthesis
Guofeng Music
Source
Organization: github
Created: 7/14/2020