
URGENT 2025 Challenge Dataset

The URGENT 2025 Challenge dataset combines several public speech, noise, and room impulse response (RIR) corpora for speech enhancement. Speech comes from DNS5, LibriTTS, VCTK, WSJ, EARS, CommonVoice 19.0, and MLS; noise comes from DNS5, WHAM!, FSD50K, and FMA; RIRs come from DNS5.

Source: GitHub
Created: Nov 12, 2024
Updated: Nov 27, 2024

urgent2025_challenge

Dataset Overview

  • Dataset Name: urgent2025_challenge
  • Purpose: Data preparation scripts for the URGENT 2025 Challenge.
  • Compatibility: Generated metadata files are compatible with the baseline code.

Update Log

  • 2024‑11‑27: Added troubleshooting guide for known issues.
  • 2024‑11‑19: Modified ESTOI evaluation script for determinism.
  • 2024‑11‑18: Added missing files required for Track 2 data preparation.
  • 2024‑11‑16: Updated several data‑preparation and evaluation scripts.

Notes

  • Validation Set: The validation set generated by these scripts differs from the official one, which must be obtained separately.
  • Data Usage: The default data/speech_train subset is intended for dynamic mixing within the ESPnet framework.
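
To illustrate what dynamic mixing means in general, the sketch below draws a speech clip, a noise clip, and a signal-to-noise ratio at random on every call and mixes them on the fly instead of reading pre-mixed files. It is not ESPnet's implementation: the function names, the plain-list signal representation, and the default `snr_range` are all assumptions chosen for illustration.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals `snr_db`, then add."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if noise_power == 0:
        return list(speech)  # silent noise: nothing to add
    # Target noise power is speech_power / 10^(snr_db / 10).
    gain = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def dynamic_mix(speech_pool, noise_pool, snr_range=(-5.0, 20.0), rng=random):
    """Draw a fresh speech/noise pair and SNR on every call (hypothetical range)."""
    speech = rng.choice(speech_pool)
    noise = rng.choice(noise_pool)
    n = min(len(speech), len(noise))  # crop both clips to a common length
    snr_db = rng.uniform(*snr_range)
    return mix_at_snr(speech[:n], noise[:n], snr_db), snr_db
```

Because each call samples a new combination, a model trained this way rarely sees the same noisy mixture twice, which is the usual motivation for mixing at training time.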

System Requirements

  • CPU Cores: >8
  • Disk Space:
    • Track 1: ≥1.3 TB
    • Track 2: ??? TB
  • Data Sizes:
    • Speech Data:
      • DNS5 speech: 318 GB
      • LibriTTS: 51 GB
      • VCTK: 12 GB
      • WSJ: 55 GB
      • EARS: 61 GB
      • CommonVoice 19.0 speech:
        • Track 1: 421 GB
        • Track 2: ??? GB
      • MLS:
        • Track 1: 120 GB
        • Track 2: ??? TB
    • Noise Data:
      • DNS5 noise: 93 GB
      • WHAM! noise: 76 GB
      • FSD50K: 30 GB
      • FMA: 60 GB
    • RIR Data:
      • DNS5 RIRs: 6 GB
    • Others:
      • Default simulated validation data: 2 GB
      • Simulated wind noise: 1 GB
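
As a quick sanity check, the Track 1 sizes listed above add up to roughly the stated 1.3 TB. The sketch below sums them and compares the total against the free space on a target filesystem. The dictionary simply restates the figures above; the function names are hypothetical, and the check assumes the listed sizes already cover everything the scripts write, which the source does not confirm.

```python
import shutil

# Sizes in GB, copied from the Track 1 figures listed above.
TRACK1_SIZES_GB = {
    "DNS5 speech": 318,
    "LibriTTS": 51,
    "VCTK": 12,
    "WSJ": 55,
    "EARS": 61,
    "CommonVoice 19.0 speech": 421,
    "MLS": 120,
    "DNS5 noise": 93,
    "WHAM! noise": 76,
    "FSD50K": 30,
    "FMA": 60,
    "DNS5 RIRs": 6,
    "Default simulated validation data": 2,
    "Simulated wind noise": 1,
}


def total_required_gb(sizes=TRACK1_SIZES_GB):
    """Sum the per-dataset sizes (GB)."""
    return sum(sizes.values())


def has_enough_space(path=".", sizes=TRACK1_SIZES_GB):
    """Return True if the filesystem holding `path` has enough free space."""
    free_gb = shutil.disk_usage(path).free / 1e9  # decimal gigabytes
    return free_gb >= total_required_gb(sizes)


if __name__ == "__main__":
    print(f"Track 1 needs about {total_required_gb()} GB")
    print("Enough free space here:", has_enough_space("."))
```

The sum is 1306 GB, consistent with the ≥1.3 TB requirement for Track 1. The Track 2 figures are not given, so no equivalent total can be computed for that track.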

Usage Instructions

  1. Initialize Submodules:
    git submodule update --init --recursive
    
  2. Install Environment:
    conda env create -f environment.yaml
    conda activate urgent2025
    
  3. Obtain the download link for the CommonVoice 19.0 dataset.
  4. Create Symbolic Links:
    • Create symlinks for wsj0 and wsj1 data.
  5. Configure FFmpeg:
    • Modify the FFmpeg path in simulation/simulate_data_from_param.py.
  6. Run Scripts:
    ./prepare_espnet_data.sh
    
  7. Install eSpeak‑NG:
    • Required for phoneme similarity metric computation.
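
Several of the steps above depend on the local machine: the FFmpeg path, the eSpeak‑NG install, the wsj0/wsj1 symlinks, and the CPU core count. A minimal preflight sketch, under stated assumptions, might look like the following. The helper names are hypothetical, the executables are assumed to be called `ffmpeg` and `espeak-ng` on PATH, and the symlink source and destination are left as parameters because the source does not say where the scripts expect them.

```python
import os
import shutil


def find_tool(name):
    """Return the absolute path of an executable on PATH, or None."""
    return shutil.which(name)


def link_dataset(source_dir, link_path):
    """Create a symlink at `link_path` pointing to `source_dir` if nothing is there yet."""
    if os.path.islink(link_path) or os.path.exists(link_path):
        return False  # leave whatever is already there untouched
    os.symlink(os.path.abspath(source_dir), link_path)
    return True


def preflight(min_cores=9, tools=("ffmpeg", "espeak-ng")):
    """Collect human-readable problems; an empty list means the checks passed."""
    problems = []
    cores = os.cpu_count() or 0
    if cores < min_cores:
        problems.append(f"only {cores} CPU cores detected, more than 8 recommended")
    for tool in tools:
        if find_tool(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("WARNING:", problem)
    # The printed path is a candidate for the FFmpeg setting in
    # simulation/simulate_data_from_param.py.
    print("ffmpeg:", find_tool("ffmpeg"))
```

Running the checks before `./prepare_espnet_data.sh` surfaces a missing tool early rather than partway through a long download and simulation run.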

Troubleshooting

  • MLS Extraction Error: Delete the failed file and rerun the script.
  • FMA Processing Warning: Can be ignored.
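
The MLS fix above can be partly automated. The sketch below assumes that "the failed file" is a downloaded tar archive (the source does not say so explicitly): it reads the archive end to end and deletes it only if reading fails, so that rerunning the preparation script can fetch it again. The function names are hypothetical.

```python
import os
import tarfile
import zlib


def archive_is_readable(path):
    """Return True if every regular file in the tar archive can be read in full."""
    try:
        with tarfile.open(path) as tar:
            for member in tar:
                if member.isfile():
                    extracted = tar.extractfile(member)
                    while extracted.read(1 << 20):  # read in 1 MiB chunks
                        pass
        return True
    except (tarfile.TarError, EOFError, OSError, zlib.error):
        return False


def remove_if_corrupt(path):
    """Delete `path` if it cannot be read as a tar archive; report whether it was removed."""
    if archive_is_readable(path):
        return False
    os.remove(path)
    return True
```

Reading a large archive in full takes time, so a check like this is best reserved for files that already failed to extract.
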