Tags: Dataset asset, Open Source Community, Speech Enhancement, Noise Processing
URGENT 2025 Challenge Dataset
The URGENT 2025 Challenge dataset combines several public speech and noise corpora for speech enhancement tasks. Speech comes from DNS5, LibriTTS, VCTK, WSJ, EARS, CommonVoice 19.0, and MLS; noise comes from DNS5, WHAM!, FSD50K, and FMA, with room impulse responses (RIRs) from DNS5.
Source: GitHub
Created: Nov 12, 2024
Updated: Nov 27, 2024
urgent2025_challenge
Dataset Overview
- Dataset Name: urgent2025_challenge
- Purpose: Data preparation scripts for the URGENT 2025 Challenge.
- Compatibility: Generated metadata files are compatible with the baseline code.
Update Log
- 2024‑11‑27: Added troubleshooting guide for known issues.
- 2024‑11‑19: Modified ESTOI evaluation script for determinism.
- 2024‑11‑18: Added missing files required for Track 2 data preparation.
- 2024‑11‑16: Updated several data‑preparation and evaluation scripts.
Notes
- Validation Set: The generated validation set differs from the official one; the official set can be obtained here.
- Data Usage: The default data/speech_train subset is intended for dynamic mixing within the ESPnet framework.
System Requirements
- CPU Cores: >8
- Disk Space:
  - Track 1: ≥1.3 TB
  - Track 2: ??? TB
- Data Sizes:
  - Speech Data:
    - DNS5 speech: 318 GB
    - LibriTTS: 51 GB
    - VCTK: 12 GB
    - WSJ: 55 GB
    - EARS: 61 GB
    - CommonVoice 19.0 speech:
      - Track 1: 421 GB
      - Track 2: ??? GB
    - MLS:
      - Track 1: 120 GB
      - Track 2: ??? TB
  - Noise Data:
    - DNS5 noise: 93 GB
    - WHAM! noise: 76 GB
    - FSD50K: 30 GB
    - FMA: 60 GB
  - RIR Data:
    - DNS5 RIRs: 6 GB
  - Others:
    - Default simulated validation data: 2 GB
    - Simulated wind noise: 1 GB
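As a sanity check on the figures above, the listed Track 1 sizes add up to 1306 GB, which agrees with the stated requirement of at least 1.3 TB. The following is a minimal sketch, not part of the challenge scripts, that sums those sizes and compares the total with the free space on a target filesystem. The function names are hypothetical, and the sketch assumes the listed sizes are decimal gigabytes.

```python
import shutil

# Track 1 sizes in GB, copied from the list above.
# Track 2 sizes are not given in the source, so they are left out.
TRACK1_SIZES_GB = {
    "DNS5 speech": 318,
    "LibriTTS": 51,
    "VCTK": 12,
    "WSJ": 55,
    "EARS": 61,
    "CommonVoice 19.0 speech": 421,
    "MLS": 120,
    "DNS5 noise": 93,
    "WHAM! noise": 76,
    "FSD50K": 30,
    "FMA": 60,
    "DNS5 RIRs": 6,
    "Default simulated validation data": 2,
    "Simulated wind noise": 1,
}


def total_required_gb(sizes=TRACK1_SIZES_GB):
    """Sum the listed dataset sizes in GB."""
    return sum(sizes.values())


def has_enough_space(path=".", required_gb=None):
    """Return True if the filesystem holding `path` has at least `required_gb` free."""
    if required_gb is None:
        required_gb = total_required_gb()
    # Assumption: sizes are decimal GB (1 GB = 1e9 bytes).
    free_gb = shutil.disk_usage(path).free / 1e9
    return free_gb >= required_gb


if __name__ == "__main__":
    print(f"Track 1 total: {total_required_gb()} GB")
    print("Enough free space here:", has_enough_space("."))
```

Running the check against the directory where the data will be stored, before starting any downloads, avoids discovering a shortfall partway through preparation.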
Usage Instructions
- Initialize Submodules:
    git submodule update --init --recursive
- Install Environment:
    conda env create -f environment.yaml
    conda activate urgent2025
- Obtain CommonVoice Dataset Link:
  - Retrieve the v19.0 download link from Common Voice.
- Create Symbolic Links:
  - Create symlinks for the wsj0 and wsj1 data.
- Configure FFmpeg:
  - Modify the FFmpeg path in simulation/simulate_data_from_param.py.
- Run Scripts:
    ./prepare_espnet_data.sh
- Install eSpeak‑NG:
  - Required for phoneme similarity metric computation.
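Before running the preparation script, it can help to confirm that the tools named in the steps above are reachable and that the WSJ links exist. The sketch below is an illustration only and is not part of the repository. The helper names are hypothetical, the executable names are the usual command names (`git`, `conda`, `ffmpeg`, `espeak-ng`), and the symlink paths in the comments are placeholders, since the source does not say where the scripts expect wsj0 and wsj1.

```python
import os
import shutil

# Usual command names for the tools the steps rely on. Adjust them if
# your system installs the tools under different names.
REQUIRED_TOOLS = ("git", "conda", "ffmpeg", "espeak-ng")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def link_dataset(source_dir, link_path):
    """Create `link_path` as a symlink to `source_dir`.

    Returns False without changing anything if `link_path` already exists.
    """
    if os.path.lexists(link_path):
        return False
    parent = os.path.dirname(os.path.abspath(link_path))
    os.makedirs(parent, exist_ok=True)
    os.symlink(os.path.abspath(source_dir), link_path, target_is_directory=True)
    return True


if __name__ == "__main__":
    absent = missing_tools()
    if absent:
        print("Missing tools:", ", ".join(absent))
    cores = os.cpu_count() or 0
    if cores <= 8:
        print(f"Only {cores} CPU cores detected; more than 8 are recommended.")
    # Hypothetical paths: replace them with the location of your WSJ copies
    # and the location the preparation scripts expect.
    # link_dataset("/path/to/wsj0", "./wsj/wsj0")
    # link_dataset("/path/to/wsj1", "./wsj/wsj1")
```

Note that FFmpeg is configured through a path inside simulation/simulate_data_from_param.py, so a missing `ffmpeg` on PATH is not necessarily a problem if that path points to a working binary.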
Troubleshooting
- MLS Extraction Error: Delete the failed file and rerun the script.
- FMA Processing Warning: Can be ignored.
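The MLS fix above amounts to "find the broken file, delete it, rerun". The source does not say how to identify the failed file, so the following is only a sketch under the assumption that the failure is a corrupted or truncated `.tar.gz` archive in a download directory. The function names and the glob pattern are hypothetical. The sketch reads every archive end to end and deletes the ones that cannot be read, so the preparation script can fetch them again on the next run.

```python
import tarfile
import zlib
from pathlib import Path


def is_readable_tar(path):
    """Return True if every regular file in the archive can be read end to end."""
    try:
        with tarfile.open(path) as archive:
            for member in archive:
                if member.isfile():
                    handle = archive.extractfile(member)
                    # Read in 1 MiB chunks so large members do not fill memory.
                    while handle.read(1 << 20):
                        pass
        return True
    except (tarfile.TarError, EOFError, OSError, zlib.error):
        return False


def remove_broken_archives(directory, pattern="*.tar.gz"):
    """Delete archives in `directory` that fail the read check; return their paths."""
    removed = []
    for path in sorted(Path(directory).glob(pattern)):
        if not is_readable_tar(path):
            path.unlink()
            removed.append(path)
    return removed
```

After calling `remove_broken_archives` on the relevant directory, rerunning `./prepare_espnet_data.sh` follows the troubleshooting advice above. Reading whole archives is slow for a corpus of this size, so in practice it may be quicker to delete only the file named in the error message.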