ICSD

ICSD is a comprehensive audio event dataset for infant cry and snoring detection. It contains over 3.3 hours of strongly labeled data and 1 hour of weakly labeled data, including foreground and background events for synthetic data generation. Audio files are stored in the ‘audio’ folder, and event timestamp annotations are in the ‘metadata’ folder, each further divided into training, validation, and test subfolders. Source material for generating synthetic strongly labeled data is also provided.

Updated 7/25/2024

Description

ICSD: An Open‑source Dataset for Infant Cry and Snoring Detection

Dataset Overview

ICSD is a comprehensive audio‑event dataset for infant cry and snoring detection, featuring:

- Over 3.3 hours of strong‑label data and 1 hour of weak‑label data
- Both foreground and background events for synthetic data generation
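
Strong labels give onset and offset times for each event, while weak labels only state which events occur in a clip. The sketch below illustrates the difference on made-up rows. The tab-separated columns (filename, onset, offset, event_label) follow the DCASE Task 4 convention and are an assumption about ICSD's annotation files, as are the example file names and label strings.

```python
import csv
import io
from collections import defaultdict

# Hypothetical strong-label rows; column names are assumed (DCASE Task 4 style),
# and the file names and labels are invented for illustration.
STRONG_TSV = (
    "filename\tonset\toffset\tevent_label\n"
    "clip_001.wav\t0.50\t3.25\tinfant_cry\n"
    "clip_001.wav\t6.00\t8.00\tsnoring\n"
    "clip_002.wav\t1.00\t4.50\tsnoring\n"
)


def read_strong(text):
    """Parse strong labels into (filename, onset, offset, label) tuples."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [
        (row["filename"], float(row["onset"]), float(row["offset"]), row["event_label"])
        for row in reader
    ]


def to_weak(events):
    """Collapse strong labels to clip-level tags by dropping the timestamps."""
    tags = defaultdict(set)
    for filename, _onset, _offset, label in events:
        tags[filename].add(label)
    return {name: sorted(labels) for name, labels in tags.items()}


def seconds_per_label(events):
    """Sum annotated event duration (offset - onset) for each label."""
    totals = defaultdict(float)
    for _filename, onset, offset, label in events:
        totals[label] += offset - onset
    return dict(totals)


events = read_strong(STRONG_TSV)
weak = to_weak(events)
durations = seconds_per_label(events)
```

Pointing `read_strong` at the text of a real annotation file would work the same way, provided the column names match.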

Data Structure

Audio files are stored in the audio folder, and event timestamp annotations are in the metadata folder, each further divided into training, validation, and test subfolders. Source material for generating synthetic strong‑label data is also provided. You can use Scaper to generate your own synthetic data.
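
After extraction, the layout described above can be checked with a short script. This is a minimal sketch: the subfolder names training, validation, and test are taken literally from the description and may differ in the actual archive, and `missing_dirs` is a hypothetical helper, not part of the ICSD code.

```python
from pathlib import Path

# Assumed names: the description says each folder is split into training,
# validation, and test subfolders, but the exact directory names may differ.
TOP_LEVEL = ("audio", "metadata")
SPLITS = ("training", "validation", "test")


def missing_dirs(root):
    """Return the expected split directories that do not exist under root."""
    root = Path(root)
    expected = [root / top / split for top in TOP_LEVEL for split in SPLITS]
    return [p.relative_to(root).as_posix() for p in expected if not p.is_dir()]


if __name__ == "__main__":
    # "data" is the extraction folder named in the usage instructions.
    for rel in missing_dirs("data"):
        print(f"missing: {rel}")
```

An empty result means all six expected directories are present.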

Data Preview

The demo folder provides four audio samples that can be downloaded and played.

Baseline Systems

The baseline systems follow the design of the DCASE 2023 Challenge Task 4. Three baselines are provided:

- Baseline using only synthetic data
- Baseline using real and synthetic data
- Baseline using pre‑trained embeddings

Usage

Data download: Download the dataset from HuggingFace and extract it to the data folder.
Training:
- Baseline with only synthetic data: python train_sed.py
- Baseline with real and synthetic data: python train_sed.py --strong_real
- Baseline with pre‑trained embeddings: first pre‑compute the embeddings with python extract_embeddings.py --output_dir ./embeddings --pretrained_model "beats", then train the system with python train_pretrained.py
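
The training steps can also be wrapped in a small driver so that the two-step pre‑trained baseline always runs in order. This is a sketch under assumptions: the dictionary keys are hypothetical names, the scripts are assumed to be run from the repository root with the current Python interpreter, and only the script names and flags listed above are used.

```python
import subprocess
import sys

# Commands copied from the usage steps above; running them through the current
# interpreter (sys.executable) from the repository root is an assumption.
# The keys are hypothetical names chosen for this sketch.
BASELINES = {
    "synthetic": [["train_sed.py"]],
    "real_synthetic": [["train_sed.py", "--strong_real"]],
    "pretrained": [
        ["extract_embeddings.py", "--output_dir", "./embeddings",
         "--pretrained_model", "beats"],
        ["train_pretrained.py"],
    ],
}


def build_commands(name):
    """Return the full argv lists for one baseline, in execution order."""
    return [[sys.executable, *args] for args in BASELINES[name]]


def run_baseline(name, dry_run=True):
    """Print each step; execute it only when dry_run is False."""
    for cmd in build_commands(name):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing step


if __name__ == "__main__":
    run_baseline("pretrained")
```

With the default `dry_run=True` the script only prints the commands, which makes it easy to confirm the order before starting a long training run.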

Citation

If you use the ICSD dataset, please cite the following paper:

@article{ICSD,
      title={ICSD: An Open‑source Dataset for Infant Cry and Snoring Detection},
      author={Qingyu Liu and Longfei Song and Dongxing Xu and Yanhua Long},
      journal={arXiv},
      year={2024}
}

Topics

Audio Event Detection

Infant Cry and Snoring

Source

Organization: github

Created: 7/20/2024
