ICSD
ICSD is a comprehensive audio event dataset for infant cry and snoring detection. It contains over 3.3 hours of strongly labeled data and 1 hour of weakly labeled data, including foreground and background events for synthetic data generation. Audio files are stored in the ‘audio’ folder, and event timestamp annotations are in the ‘metadata’ folder, each further divided into training, validation, and test subfolders. Source material for generating synthetic strongly labeled data is also provided.
Description
ICSD: An Open‑source Dataset for Infant Cry and Snoring Detection
Dataset Overview
ICSD is a comprehensive audio‑event dataset for infant cry and snoring detection, featuring:
- Over 3.3 hours of strong‑label data and 1 hour of weak‑label data;
- Both foreground and background events for synthetic data generation.
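The difference between strong and weak labels can be illustrated with a minimal sketch. It assumes DCASE-style strong annotations (file name, onset, offset, event label); the schema, file names, and class names below are hypothetical and are not taken from the ICSD metadata files.

```python
from collections import defaultdict
from typing import NamedTuple


class StrongLabel(NamedTuple):
    """One timestamped event (hypothetical, DCASE-style schema)."""
    filename: str
    onset: float    # event start, in seconds
    offset: float   # event end, in seconds
    event_label: str


def to_weak_labels(events):
    """Collapse timestamped events into clip-level tags (weak labels)."""
    tags = defaultdict(set)
    for ev in events:
        tags[ev.filename].add(ev.event_label)
    return {name: sorted(labels) for name, labels in tags.items()}


# Hypothetical annotations, for illustration only.
strong = [
    StrongLabel("clip_001.wav", 0.5, 3.2, "infant_cry"),
    StrongLabel("clip_001.wav", 4.0, 6.1, "snoring"),
    StrongLabel("clip_002.wav", 1.0, 9.0, "snoring"),
]
weak = to_weak_labels(strong)
```

A strong label says when each event occurs, while a weak label only says which events occur somewhere in the clip, so the conversion above discards the timestamps.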
Data Structure
Audio files are stored in the audio folder, and event timestamp annotations are in the metadata folder, each further divided into training, validation, and test subfolders. Source material for generating synthetic strong‑label data is also provided. You can use Scaper to generate your own synthetic data.
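The layout described above could be traversed roughly as follows. This is a sketch under assumptions: the split subfolder names (training, validation, test) and the .wav extension are guesses based on the description, and list_split_files is a hypothetical helper, not part of the ICSD repository.

```python
from pathlib import Path

# Assumed split names; the actual subfolder names in ICSD may differ.
SPLITS = ("training", "validation", "test")


def list_split_files(root, splits=SPLITS):
    """Return {split: (audio_files, metadata_files)} for a dataset root."""
    root = Path(root)
    layout = {}
    for split in splits:
        audio_dir = root / "audio" / split
        meta_dir = root / "metadata" / split
        # Missing folders yield empty lists instead of raising.
        audio = sorted(audio_dir.rglob("*.wav")) if audio_dir.is_dir() else []
        meta = (
            sorted(p for p in meta_dir.rglob("*") if p.is_file())
            if meta_dir.is_dir()
            else []
        )
        layout[split] = (audio, meta)
    return layout
```

Calling list_split_files("data") would then give, for each split, the audio files and the annotation files found under it, which is a quick way to check that the archive was extracted as expected.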
Data Preview
The demo folder provides four audio samples that can be played or downloaded.
Baseline Systems
The baseline systems follow the design of DCASE 2023 Challenge Task 4. Three baselines are provided:
- Baseline using only synthetic data
- Baseline using real and synthetic data
- Baseline using pre‑trained embeddings
Usage
- Data download: Download the dataset from HuggingFace and extract it to the data folder.
- Training:
  - Baseline with only synthetic data: python train_sed.py
  - Baseline with real and synthetic data: python train_sed.py --strong_real
  - Baseline with pre‑trained embeddings: first pre‑compute the embeddings with python extract_embeddings.py --output_dir ./embeddings --pretrained_model "beats", then run the system with train_pretrained.py
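The training commands above could be wrapped in a small driver script. This is only a sketch: the script names and flags are those listed in this section, while BASELINES, build_commands, and run_baseline are hypothetical helpers, and invoking train_pretrained.py through the Python interpreter is an assumption.

```python
import subprocess
import sys

# Script names and flags as listed in the Usage section.
BASELINES = {
    "synthetic": [["train_sed.py"]],
    "real_and_synthetic": [["train_sed.py", "--strong_real"]],
    "pretrained": [
        ["extract_embeddings.py", "--output_dir", "./embeddings",
         "--pretrained_model", "beats"],
        # Assumption: this script is also launched with the interpreter.
        ["train_pretrained.py"],
    ],
}


def build_commands(name, python=sys.executable):
    """Return the command lines for one baseline, in execution order."""
    return [[python, *args] for args in BASELINES[name]]


def run_baseline(name, dry_run=True):
    """Print the commands, or execute them in order when dry_run is False."""
    for cmd in build_commands(name):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)  # stop at the first failure
```

Calling run_baseline("pretrained") prints the embedding extraction command followed by the training command; passing dry_run=False would execute them from the repository root.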
Citation
If you use the ICSD dataset, please cite the following paper:
@article{ICSD,
title={ICSD: An Open‑source Dataset for Infant Cry and Snoring Detection},
author={Qingyu Liu and Longfei Song and Dongxing Xu and Yanhua Long},
journal={arXiv},
year={2024}
}
Source
Organization: github
Created: 7/20/2024