
URGENT 2025 Challenge Dataset

The URGENT 2025 Challenge dataset combines several speech, noise, and room impulse response (RIR) corpora for speech enhancement. Speech comes from DNS5, LibriTTS, VCTK, WSJ, EARS, CommonVoice 19.0, and MLS; noise comes from DNS5, WHAM!, FSD50K, and FMA; RIRs come from DNS5.

Source: GitHub

Created: Nov 12, 2024

Updated: Nov 27, 2024

Dataset Overview

Dataset Name: urgent2025_challenge
Purpose: Data preparation scripts for the URGENT 2025 Challenge.
Compatibility: Generated metadata files are compatible with the baseline code.

Update Log

2024-11-27: Added troubleshooting guide for known issues.
2024-11-19: Modified ESTOI evaluation script for determinism.
2024-11-18: Added missing files required for Track 2 data preparation.
2024-11-16: Updated several data-preparation and evaluation scripts.

Notes

Validation Set: The validation set generated by these scripts differs from the official one, which is provided separately.
Data Usage: The default data/speech_train subset is intended for dynamic mixing within the ESPnet framework.

System Requirements

CPU Cores: >8
Disk Space:
- Track 1: ≥1.3 TB
- Track 2: ??? TB
Data Sizes:
- Speech Data:
  - DNS5 speech: 318 GB
  - LibriTTS: 51 GB
  - VCTK: 12 GB
  - WSJ: 55 GB
  - EARS: 61 GB
  - CommonVoice 19.0 speech:
    - Track 1: 421 GB
    - Track 2: ??? GB
  - MLS:
    - Track 1: 120 GB
    - Track 2: ??? TB
- Noise Data:
  - DNS5 noise: 93 GB
  - WHAM! noise: 76 GB
  - FSD50K: 30 GB
  - FMA: 60 GB
- RIR Data:
  - DNS5 RIRs: 6 GB
- Others:
  - Default simulated validation data: 2 GB
  - Simulated wind noise: 1 GB
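
The Track 1 figures above can be cross-checked with simple arithmetic: the listed components add up to 1306 GB, which is consistent with the ≥1.3 TB requirement. The following is a minimal pre-flight sketch, not part of the repository, that sums the sizes and compares them with the free space and core count of the current machine. The function names and the `DATA_DIR` variable are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical pre-flight check; not part of the challenge repository.

# sum_gb SIZE...: add up any number of integer sizes (GB).
sum_gb() {
  total=0
  for size in "$@"; do
    total=$((total + size))
  done
  echo "$total"
}

# Track 1 sizes in GB, in the order listed above:
# speech (DNS5, LibriTTS, VCTK, WSJ, EARS, CommonVoice 19.0, MLS),
# noise (DNS5, WHAM!, FSD50K, FMA), DNS5 RIRs, validation data, wind noise.
required_gb=$(sum_gb 318 51 12 55 61 421 120 93 76 30 60 6 2 1)

# free_gb DIR: approximate free space (GiB) on the filesystem holding DIR.
free_gb() {
  df -Pk "$1" | awk 'NR == 2 { printf "%d\n", $4 / 1024 / 1024 }'
}

# DATA_DIR is an assumed name for wherever the data will be stored.
DATA_DIR=${DATA_DIR:-.}
available_gb=$(free_gb "$DATA_DIR")
cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

echo "Track 1 needs about ${required_gb} GB; ${available_gb} GB free in ${DATA_DIR}"
if [ "$available_gb" -lt "$required_gb" ]; then
  echo "Warning: not enough free disk space for Track 1" >&2
fi
if [ "$cores" -le 8 ]; then
  echo "Warning: ${cores} CPU cores detected; more than 8 are recommended" >&2
fi
```

The check only prints warnings, so it can be run before starting a long download without interrupting anything. Track 2 is left out because its sizes are not stated above.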

Usage Instructions

Initialize Submodules:

```
git submodule update --init --recursive
```

Install Environment:

```
conda env create -f environment.yaml
conda activate urgent2025
```

Obtain CommonVoice Dataset Link:
- Retrieve the v19.0 download link from Common Voice.
Create Symbolic Links:
- Create symlinks for wsj0 and wsj1 data.
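
As a rough sketch of the symlink step, the helper below links an existing corpus directory to a target location. The link locations `./wsj/wsj0` and `./wsj/wsj1` and the `WSJ0_DIR` / `WSJ1_DIR` variables are assumptions for illustration; check the preparation scripts for the paths they actually expect.

```shell
#!/bin/sh
# Hypothetical helper; the link locations and variable names used in the
# example at the bottom are assumptions, not taken from the scripts.

# link_corpus SRC DEST: create DEST as a symlink to the existing directory SRC.
link_corpus() {
  src=$1
  dest=$2
  if [ ! -d "$src" ]; then
    echo "Source directory not found: $src" >&2
    return 1
  fi
  mkdir -p "$(dirname "$dest")"
  # -n avoids descending into an existing link; -f replaces a stale one.
  ln -sfn "$src" "$dest"
}

# Example usage (uncomment and point the variables at your local WSJ copies):
# link_corpus "$WSJ0_DIR" ./wsj/wsj0
# link_corpus "$WSJ1_DIR" ./wsj/wsj1
```

Refusing to link a missing source directory surfaces a wrong path immediately, rather than leaving a dangling symlink for the data preparation to trip over later.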
Configure FFmpeg:
- Set the FFmpeg path in simulation/simulate_data_from_param.py to the location of your FFmpeg binary.
Run Scripts:
```
./prepare_espnet_data.sh
```
Install eSpeak-NG:
- Required for phoneme similarity metric computation.
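
FFmpeg and eSpeak-NG are external tools, so it can help to confirm that both are installed before running the scripts. The sketch below is not part of the repository and assumes the executables are named `ffmpeg` and `espeak-ng` and are expected on `PATH`; `resolve_tool` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper; not part of the challenge repository.

# resolve_tool NAME: print the path of NAME if it is on PATH,
# and return a non-zero status otherwise.
resolve_tool() {
  command -v "$1"
}

for tool in ffmpeg espeak-ng; do
  if path=$(resolve_tool "$tool"); then
    echo "$tool found at $path"
  else
    echo "$tool not found on PATH; install it before running the scripts" >&2
  fi
done
```

The path printed for `ffmpeg` is the value to use when editing simulation/simulate_data_from_param.py.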

Troubleshooting

MLS Extraction Error: If extraction of an MLS file fails, delete the file that failed and rerun the script.
FMA Processing Warning: Warnings printed while processing FMA can be ignored.
