Dataset asset, Open Source Community, Speaker Identification, Audio Data
CN-Celeb
The CN‑Celeb dataset is used for speaker recognition; the raw data are in FLAC format and can be directly used for training.
Source
github
Created
Mar 28, 2023
Updated
Apr 3, 2023
Dataset Overview
Dataset Name
- CN‑Celeb 1
- CN‑Celeb 2
Data Format
- Raw data are in FLAC format.
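Since the raw data ship as FLAC files, a first step before training is usually to index the files per speaker. The following is a minimal sketch, not the dataset's official tooling. It assumes a hypothetical layout in which each file's parent directory name is the speaker ID (`<root>/<speaker_id>/<utterance>.flac`); the function name `index_flac_by_speaker` is illustrative only, so adjust it to the layout of the archive you actually download.

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def index_flac_by_speaker(root: str) -> Dict[str, List[Path]]:
    """Map each speaker ID to a sorted list of its FLAC files.

    ASSUMPTION: the parent directory name of each file is the speaker ID,
    i.e. <root>/<speaker_id>/<utterance>.flac. This layout is hypothetical.
    """
    index: Dict[str, List[Path]] = defaultdict(list)
    # rglob walks the tree recursively; sorting keeps the order deterministic.
    for path in sorted(Path(root).rglob("*.flac")):
        index[path.parent.name].append(path)
    return dict(index)
```

The sketch only lists files. Decoding the audio itself needs a third-party library with FLAC support, for example `soundfile.read(path)`, which returns the samples and the sample rate.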
Data Download
- CN‑Celeb 1: Click here to download
- CN‑Celeb 2: Click here to download
Performance Metrics
| Training Data | Test Data | Augmentation | EER (%) | minDCF (0.01) |
|---|---|---|---|---|
| CN‑2‑dev | CN‑1‑trials | No | 11.7 | 0.4999 |
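The two metrics in the table can be computed from verification trial scores. The following is a minimal sketch in pure Python, not the evaluation script that produced the numbers above. It assumes that the `0.01` in the minDCF column is the target prior `p_target`, with miss and false-alarm costs both set to 1; the source does not state these settings. The EER is approximated by the threshold where the miss and false-alarm rates are closest, whereas evaluation toolkits typically interpolate. All function names are illustrative.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (P_miss, P_fa) at one threshold


def error_rates(target: Sequence[float], nontarget: Sequence[float]) -> List[Point]:
    """Return (P_miss, P_fa) for every threshold; accept when score > threshold."""
    labelled = sorted([(s, 1) for s in target] + [(s, 0) for s in nontarget])
    n_tar, n_non = len(target), len(nontarget)
    points: List[Point] = [(0.0, 1.0)]  # threshold below all scores: accept all
    misses = rejected_non = 0
    i = 0
    while i < len(labelled):
        score = labelled[i][0]
        # Raise the threshold past every trial sharing this score (handles ties).
        while i < len(labelled) and labelled[i][0] == score:
            if labelled[i][1] == 1:
                misses += 1
            else:
                rejected_non += 1
            i += 1
        points.append((misses / n_tar, (n_non - rejected_non) / n_non))
    return points


def eer(points: Sequence[Point]) -> float:
    """Approximate the equal error rate as a fraction (multiply by 100 for %)."""
    p_miss, p_fa = min(points, key=lambda p: abs(p[0] - p[1]))
    return (p_miss + p_fa) / 2


def min_dcf(points: Sequence[Point], p_target: float = 0.01,
            c_miss: float = 1.0, c_fa: float = 1.0) -> float:
    """Minimum normalized detection cost. ASSUMPTION: p_target=0.01, unit costs."""
    best = min(c_miss * pm * p_target + c_fa * pf * (1 - p_target)
               for pm, pf in points)
    return best / min(c_miss * p_target, c_fa * (1 - p_target))
```

For example, `points = error_rates(target_scores, nontarget_scores)` followed by `eer(points)` and `min_dcf(points)` yields both metrics from one pass over the sorted scores.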