
URGENT 2025 Challenge Dataset

The URGENT 2025 Challenge dataset combines several public speech and noise corpora for speech enhancement. Speech comes from DNS5, LibriTTS, VCTK, WSJ, EARS, CommonVoice 19.0, and MLS; noise comes from DNS5, WHAM!, FSD50K, and FMA, with room impulse responses (RIRs) from DNS5.

Updated 11/27/2024

Description

urgent2025_challenge

Dataset Overview

  • Dataset Name: urgent2025_challenge
  • Purpose: Data preparation scripts for the URGENT 2025 Challenge.
  • Compatibility: Generated metadata files are compatible with the baseline code.

Update Log

  • 2024‑11‑27: Added troubleshooting guide for known issues.
  • 2024‑11‑19: Modified ESTOI evaluation script for determinism.
  • 2024‑11‑18: Added missing files required for Track 2 data preparation.
  • 2024‑11‑16: Updated several data‑preparation and evaluation scripts.

Notes

  • Validation Set: The validation set generated by these scripts differs from the official one; the official validation set is distributed separately.
  • Data Usage: The default data/speech_train subset is intended for dynamic mixing within the ESPnet framework.
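
For readers unfamiliar with dynamic mixing, the idea is that noisy training examples are not stored on disk: each time a clean utterance is drawn, a noise clip and a signal-to-noise ratio (SNR) are sampled and the mixture is built on the fly. The sketch below illustrates only that idea in plain Python. It is not ESPnet's implementation, and the function names and the SNR range are assumptions made for illustration.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals `snr_db`, then add it."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if noise_power == 0:
        return list(speech)  # silent noise clip: nothing to mix in
    # Solve 10*log10(speech_power / (gain**2 * noise_power)) = snr_db for gain.
    gain = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def dynamic_mix(speech, noise_pool, snr_range=(-5.0, 20.0), rng=random):
    """Draw a fresh noise clip and SNR on every call (the SNR range is hypothetical)."""
    noise = rng.choice(noise_pool)
    snr_db = rng.uniform(*snr_range)
    return mix_at_snr(speech, noise, snr_db), snr_db
```

Because the mixture is regenerated on every draw, the same clean utterance is paired with different noise conditions across epochs, which is why only the clean speech subset needs to be prepared in advance.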

System Requirements

  • CPU Cores: >8
  • Disk Space:
    • Track 1: ≥1.3 TB
    • Track 2: ??? TB
  • Data Sizes:
    • Speech Data:
      • DNS5 speech: 318 GB
      • LibriTTS: 51 GB
      • VCTK: 12 GB
      • WSJ: 55 GB
      • EARS: 61 GB
      • CommonVoice 19.0 speech:
        • Track 1: 421 GB
        • Track 2: ??? GB
      • MLS:
        • Track 1: 120 GB
        • Track 2: ??? TB
    • Noise Data:
      • DNS5 noise: 93 GB
      • WHAM! noise: 76 GB
      • FSD50K: 30 GB
      • FMA: 60 GB
    • RIR Data:
      • DNS5 RIRs: 6 GB
    • Others:
      • Default simulated validation data: 2 GB
      • Simulated wind noise: 1 GB
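
As a sanity check on the figures above, the listed Track 1 components can be summed and compared with the free space on the target disk. The sketch below copies the sizes from the list; the helper names are made up for illustration, and the list does not say whether temporary files need extra room, so treat the total as a lower bound.

```python
import shutil

# Track 1 sizes in GB, copied from the list above.
TRACK1_SIZES_GB = {
    "DNS5 speech": 318,
    "LibriTTS": 51,
    "VCTK": 12,
    "WSJ": 55,
    "EARS": 61,
    "CommonVoice 19.0 speech (Track 1)": 421,
    "MLS (Track 1)": 120,
    "DNS5 noise": 93,
    "WHAM! noise": 76,
    "FSD50K": 30,
    "FMA": 60,
    "DNS5 RIRs": 6,
    "Default simulated validation data": 2,
    "Simulated wind noise": 1,
}


def track1_total_gb() -> int:
    """Sum of the listed Track 1 component sizes, in GB."""
    return sum(TRACK1_SIZES_GB.values())


def has_enough_space(path: str, required_gb: float) -> bool:
    """Return True if the filesystem holding `path` has at least `required_gb` free."""
    free_gb = shutil.disk_usage(path).free / 1e9  # decimal gigabytes
    return free_gb >= required_gb


if __name__ == "__main__":
    total = track1_total_gb()
    print(f"Listed Track 1 components: {total} GB")
    print("Enough free space here:", has_enough_space(".", total))
```

The components add up to 1306 GB, which matches the stated Track 1 requirement of at least 1.3 TB.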

Usage Instructions

  1. Initialize Submodules:
    git submodule update --init --recursive
    
  2. Install Environment:
    conda env create -f environment.yaml
    conda activate urgent2025
    
  3. Obtain the CommonVoice 19.0 download link.
  4. Create Symbolic Links:
    • Create symlinks for wsj0 and wsj1 data.
  5. Configure FFmpeg:
    • Modify the FFmpeg path in simulation/simulate_data_from_param.py.
  6. Run Scripts:
    ./prepare_espnet_data.sh
    
  7. Install eSpeak‑NG:
    • Required for phoneme similarity metric computation.
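
Before starting the long-running preparation script, it can help to confirm that the prerequisites named in this document are in place. The following is a minimal sketch, assuming the tools are installed under the executable names `ffmpeg` and `espeak-ng` and are on `PATH`. The `preflight` function and its `links` parameter are hypothetical, and since the repository configures the FFmpeg path inside simulation/simulate_data_from_param.py, a missing `ffmpeg` on `PATH` is not necessarily a problem.

```python
import os
import shutil


def preflight(min_cores=9, tools=("ffmpeg", "espeak-ng"), links=()):
    """Report which prerequisites are met; every value in the result is a bool.

    `min_cores=9` reflects the ">8" CPU requirement. `links` takes the paths
    of the wsj0/wsj1 symlinks, whose locations this document does not give.
    """
    report = {"cpu_cores": (os.cpu_count() or 0) >= min_cores}
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        report[tool] = shutil.which(tool) is not None
    for link in links:
        # os.path.exists follows symlinks, so a dangling link reports False.
        report[link] = os.path.exists(link)
    return report


if __name__ == "__main__":
    for name, ok in preflight().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```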

Troubleshooting

  • MLS Extraction Error: Delete the failed file and rerun the script.
  • FMA Processing Warning: Can be ignored.
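
The MLS workaround can be partly automated. The sketch below assumes that the failed file is a (possibly gzip-compressed) tar archive, which this document does not state, and `remove_if_corrupt` is a hypothetical helper rather than part of the repository. It walks the archive and deletes it if reading fails, so that the next run of the script downloads it again. Reading cleanly catches truncated or unreadable archives but does not prove full integrity.

```python
import os
import tarfile
import zlib


def remove_if_corrupt(archive_path: str) -> bool:
    """Delete `archive_path` if it cannot be read as a tar archive.

    Returns True when the file was deleted, False when it read cleanly.
    """
    try:
        with tarfile.open(archive_path) as tar:
            for _member in tar:
                pass  # walking the members forces the archive to be read
    except (tarfile.TarError, EOFError, OSError, zlib.error):
        os.remove(archive_path)
        return True
    return False
```

After the helper returns True, rerun ./prepare_espnet_data.sh as described above.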

Topics

Speech Enhancement
Noise Processing

Source

Organization: github

Created: 11/12/2024
