Dataset asset · Open Source Community · Audio Event Detection · Infant Cry and Snoring

ICSD

ICSD is a comprehensive audio event dataset for infant cry and snoring detection. It contains over 3.3 hours of strongly labeled data and 1 hour of weakly labeled data, including foreground and background events for synthetic data generation. Audio files are stored in the ‘audio’ folder, and event timestamp annotations are in the ‘metadata’ folder, each further divided into training, validation, and test subfolders. Source material for generating synthetic strongly labeled data is also provided.

Source: GitHub
Created: Jul 20, 2024
Updated: Jul 25, 2024

ICSD: An Open‑source Dataset for Infant Cry and Snoring Detection

Dataset Overview

ICSD is a comprehensive audio‑event dataset for infant cry and snoring detection, featuring:

  • Over 3.3 hours of strong‑label data and 1 hour of weak‑label data;
  • Both foreground and background events for synthetic data generation.

Data Structure

Audio files are stored in the audio folder, and event timestamp annotations are in the metadata folder, each further divided into training, validation, and test subfolders. Source material for generating synthetic strong‑label data is also provided. You can use Scaper to generate your own synthetic data.
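As a rough illustration of the Scaper step, the sketch below mixes randomly chosen foreground events over a randomly chosen background. It is only a sketch under stated assumptions: the foreground and background folder paths, the 10-second duration, the event count, and the SNR and duration ranges are all illustrative values, not settings taken from the ICSD authors. Scaper expects each source folder to contain one subfolder per label.

```python
from pathlib import Path


def output_paths(out_dir, name):
    """Return (audio, JAMS, text-annotation) paths for one soundscape."""
    out = Path(out_dir)
    return (str(out / f"{name}.wav"),
            str(out / f"{name}.jams"),
            str(out / f"{name}.txt"))


def make_soundscape(fg_dir, bg_dir, out_dir, name,
                    duration=10.0, n_events=2, seed=0):
    """Generate one synthetic soundscape with Scaper.

    fg_dir and bg_dir are hypothetical paths to the foreground and
    background source material, each with one subfolder per label.
    All numeric ranges below are illustrative assumptions.
    """
    # Third-party dependency, imported lazily so that output_paths()
    # can be used without Scaper installed.
    import scaper

    sc = scaper.Scaper(duration, fg_dir, bg_dir, random_state=seed)
    sc.ref_db = -50  # reference loudness of the background (assumed value)

    # One background, picked at random from all background labels and files.
    sc.add_background(label=("choose", []),
                      source_file=("choose", []),
                      source_time=("const", 0))

    # Foreground events with random label, file, onset, length, and SNR.
    for _ in range(n_events):
        sc.add_event(label=("choose", []),
                     source_file=("choose", []),
                     source_time=("const", 0),
                     event_time=("uniform", 0, duration - 2.0),
                     event_duration=("uniform", 0.5, 2.0),
                     snr=("uniform", 6, 30),
                     pitch_shift=None,
                     time_stretch=None)

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    audio_path, jams_path, txt_path = output_paths(out_dir, name)
    # Writes the mixed audio, a JAMS annotation, and a plain-text event list.
    sc.generate(audio_path, jams_path, txt_path=txt_path)
    return audio_path, jams_path, txt_path
```

Calling make_soundscape in a loop with a different seed and name per iteration would yield a set of strongly labeled clips, since every event's onset and offset are known by construction.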

Data Preview

The demo folder contains four audio samples that can be downloaded and listened to.

Baseline Systems

The baseline systems are based on the DCASE 2023 Challenge Task 4 baseline and come in three variants:

  1. Baseline using only synthetic data
  2. Baseline using real and synthetic data
  3. Baseline using pre‑trained embeddings

Usage

  • Data download: Download the dataset from HuggingFace and extract it to the data folder.
  • Training:
    • Baseline with only synthetic data: python train_sed.py
    • Baseline with real and synthetic data: python train_sed.py --strong_real
    • Baseline with pre‑trained embeddings: first pre‑compute the embeddings with python extract_embeddings.py --output_dir ./embeddings --pretrained_model "beats", then train the system with python train_pretrained.py
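
For scripted runs, the three training recipes above could be wrapped in a small launcher. This is a minimal sketch: the script names and flags are the ones listed above, while the keys "synthetic", "real_and_synthetic", and "pretrained" and both function names are hypothetical labels introduced here. It assumes the scripts are started with the Python interpreter from the baseline code directory.

```python
import subprocess
import sys

# Script arguments per baseline, in execution order.
# The dictionary keys are hypothetical names chosen for this sketch.
BASELINES = {
    "synthetic": [["train_sed.py"]],
    "real_and_synthetic": [["train_sed.py", "--strong_real"]],
    "pretrained": [
        ["extract_embeddings.py", "--output_dir", "./embeddings",
         "--pretrained_model", "beats"],
        ["train_pretrained.py"],
    ],
}


def baseline_commands(name, python=sys.executable):
    """Return the full command lines for one baseline, in order."""
    return [[python, *args] for args in BASELINES[name]]


def run_baseline(name, cwd="."):
    """Run each step of a baseline and stop at the first failure."""
    for cmd in baseline_commands(name):
        subprocess.run(cmd, cwd=cwd, check=True)
```

For example, run_baseline("pretrained") would first extract the embeddings and start training only if that step exits successfully.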

Citation

If you use the ICSD dataset, please cite the following paper:

@article{ICSD,
      title={ICSD: An Open‑source Dataset for Infant Cry and Snoring Detection},
      author={Qingyu Liu and Longfei Song and Dongxing Xu and Yanhua Long},
      journal={arXiv},
      year={2024}
}