Dataset asset, Open Source Community, Speaker Identification, Audio Data

CN-Celeb

The CN‑Celeb dataset is used for speaker recognition; the raw data are in FLAC format and can be directly used for training.

Source: GitHub
Created: Mar 28, 2023
Updated: Apr 3, 2023

Dataset Overview

Dataset Name

  • CN‑Celeb 1
  • CN‑Celeb 2

Data Format

  • Raw data are in FLAC format.
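
Because the raw data are FLAC files, a training pipeline typically starts by indexing the files per speaker. The following is a minimal sketch, not the dataset's official tooling. It assumes a layout in which each speaker has its own directory of .flac files; that layout, the function name index_flac_by_speaker, and the root path in the usage note are illustrative assumptions, not details stated on this page.

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def index_flac_by_speaker(root: str) -> Dict[str, List[Path]]:
    """Map each speaker ID to its FLAC files.

    Assumption: the name of the directory that directly contains a
    .flac file is the speaker ID. Adjust this if the actual layout differs.
    """
    index: Dict[str, List[Path]] = defaultdict(list)
    # rglob walks the tree recursively; sorting keeps the order deterministic.
    for path in sorted(Path(root).rglob("*.flac")):
        index[path.parent.name].append(path)
    return dict(index)
```

Calling index_flac_by_speaker("path/to/dataset") on a local copy returns a dictionary whose keys are speaker IDs. Decoding the audio itself needs a third-party reader (for example the soundfile package), which is left out here to keep the sketch dependency-free.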

Data Download

Performance Metrics

Training Data    Test Data      Augment    EER (%)    minDCF (0.01)
CN‑2‑dev         CN‑1‑trials    No         11.7       0.4999
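
For readers unfamiliar with the two metrics, the sketch below shows one common way to compute EER and normalized minDCF from verification trial scores. It is an illustration only, not the scoring script behind the numbers above. It assumes that "minDCF (0.01)" refers to a target prior of 0.01 with miss and false-alarm costs of 1, and that a higher score means "same speaker"; the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def error_rates(scores: Sequence[float], labels: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (P_miss, P_fa) at every distinct threshold.

    labels: 1 for target (same-speaker) trials, 0 for non-target trials.
    A trial is accepted when its score is above the threshold.
    """
    pairs = sorted(zip(scores, labels))
    n_tar = sum(labels)
    n_non = len(labels) - n_tar
    miss, fa = 0, n_non              # threshold below all scores: accept everything
    rates = [(0.0, 1.0)]
    i = 0
    while i < len(pairs):
        j = i
        # Move the threshold just above this score, handling ties together.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                miss += 1            # a target is now rejected
            else:
                fa -= 1              # a non-target is no longer accepted
            j += 1
        rates.append((miss / n_tar, fa / n_non))
        i = j
    return rates


def eer(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Approximate EER: average of P_miss and P_fa where they are closest."""
    p_miss, p_fa = min(error_rates(scores, labels), key=lambda r: abs(r[0] - r[1]))
    return (p_miss + p_fa) / 2


def min_dcf(scores: Sequence[float], labels: Sequence[int],
            p_target: float = 0.01, c_miss: float = 1.0, c_fa: float = 1.0) -> float:
    """Normalized minimum detection cost (assumed cost parameters, see above)."""
    best = min(c_miss * pm * p_target + c_fa * pf * (1 - p_target)
               for pm, pf in error_rates(scores, labels))
    return best / min(c_miss * p_target, c_fa * (1 - p_target))
```

Both functions take one score and one 0/1 label per trial. EER is returned as a fraction, so multiply by 100 to compare with the EER (%) column.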