DBD-research-group/BirdSet

BirdSet is a large-scale audio classification dataset of bird vocalizations. It contains over 6,800 hours of recordings, providing training data for nearly 10,000 classes, plus over 400 hours of evaluation data across eight strongly labeled evaluation sets. It supports audio classification research on topics such as multi-label classification, robustness to covariate shift, and self-supervised learning.

Source: Hugging Face
Created: Nov 28, 2025
Updated: Sep 10, 2025

Overview

Dataset description and usage context

Dataset Description

Dataset Overview

Task Category: Audio Classification
License: cc
Tags: Bird Classification, Passive Acoustic Monitoring

Dataset Details

Dataset Name: BirdSet
Contact: Lukas Rauch (lukas.rauch@uni-kassel.de)

Dataset Composition

Training and Test Sets: Eight evaluation datasets, each with a training split and a test split, plus two training-only Xeno-Canto subsets (XCM and XCL).
Data Format: .ogg files at 32 kHz sampling rate.
Data Source: Recordings obtained from Xeno‑Canto (XC), excluding recordings with CC‑ND licenses.
Labels: Scientific bird names converted to ebird_code tags.
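
The label conversion above can be pictured as a lookup from scientific name to eBird code. The sketch below is illustrative only: the three-species table, the helper names, and the extra step of turning codes into integer class indices (which a classifier would typically need) are assumptions, not taken from the BirdSet code base.

```python
# Illustrative three-species table; BirdSet ships its own label metadata.
SCIENTIFIC_TO_EBIRD = {
    "Turdus migratorius": "amerob",     # American Robin
    "Cardinalis cardinalis": "norcar",  # Northern Cardinal
    "Passer domesticus": "houspa",      # House Sparrow
}


def build_class_index(ebird_codes):
    """Assign an integer index to each distinct eBird code, in sorted order."""
    return {code: i for i, code in enumerate(sorted(set(ebird_codes)))}


def encode(scientific_name, class_index):
    """Return the class index for a scientific name (KeyError if unknown)."""
    return class_index[SCIENTIFIC_TO_EBIRD[scientific_name]]


# Hypothetical usage: index the three codes, then encode one species.
class_index = build_class_index(SCIENTIFIC_TO_EBIRD.values())
northern_cardinal = encode("Cardinalis cardinalis", class_index)
```

Sorting the codes before indexing keeps the mapping reproducible across runs, which matters when a trained model's output positions must line up with the label set.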

Dataset Statistics

Dataset Name	Train Samples	Test Samples	5-second Test Samples	Size (GB)	Number of Classes
PER (Amazon Basin)	16,802	14,798	15,120	10.5	132
NES (Colombia, Costa Rica)	16,117	6,952	24,480	14.2	89
UHH (Hawaiian Islands)	3,626	59,583	36,637	4.92	25 train, 27 test
HSN (High Sierras)	5,460	10,296	12,000	5.92	21
NBP (NIPS4BPlus)	24,327	5,493	563	29.9	51
POW (Powdermill Nature)	14,911	16,052	4,560	15.7	48
SSW (Sapsucker Woods)	28,403	50,760	205,200	35.2	81
SNE (Sierra Nevada)	19,390	20,147	23,756	20.8	56
XCM (Xenocanto Subset M)	89,798	n/a	n/a	89.3	409 (411)
XCL (Xenocanto Complete)	528,434	n/a	n/a	484	9,735
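
As a rough cross-check of the "over 400 hours of evaluation data" figure, the 5-second test counts in the table can be summed and converted to hours. This is a back-of-the-envelope sketch that assumes every 5-second test sample is a full, non-overlapping 5-second clip; the function name is hypothetical.

```python
# 5-second test sample counts for the eight evaluation sets in the table.
FIVE_SECOND_TEST_SAMPLES = {
    "PER": 15_120, "NES": 24_480, "UHH": 36_637, "HSN": 12_000,
    "NBP": 563, "POW": 4_560, "SSW": 205_200, "SNE": 23_756,
}
SEGMENT_SECONDS = 5  # assumption: each sample is a full, non-overlapping 5 s clip


def evaluation_hours(sample_counts, segment_seconds=SEGMENT_SECONDS):
    """Convert per-dataset segment counts into total hours of audio."""
    total_seconds = sum(sample_counts.values()) * segment_seconds
    return total_seconds / 3600


total_hours = evaluation_hours(FIVE_SECOND_TEST_SAMPLES)
print(f"{total_hours:.1f} hours")  # prints "447.7 hours"
```

Under that assumption the eight sets add up to 322,316 segments, or roughly 448 hours, which is consistent with the stated total.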

Dataset Usage

Training Set: Focal recordings from XC with quality ratings A, B, or C; recordings under CC-ND licenses are excluded.
Test Set: Provides full recordings and label sets, excluding recordings without bird vocalizations.
5-second Test Set: Multi-label task; each recording is segmented into 5-second intervals, and segments without bird vocalizations receive an all-zero label vector ([0]).
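
The 5-second segmentation can be sketched as follows. This is a minimal illustration, not the BirdSet preprocessing code: the function name and the event tuple layout are hypothetical, and it assumes that any overlap between an annotated event and a segment is enough to mark that class as present.

```python
import math


def segment_labels(duration_s, events, num_classes, segment_s=5.0):
    """Build one multi-hot label vector per fixed-length segment.

    events: list of (start_s, end_s, class_index) annotations.
    Segments with no overlapping event keep an all-zero vector.
    """
    num_segments = math.ceil(duration_s / segment_s)
    labels = [[0] * num_classes for _ in range(num_segments)]
    for start_s, end_s, class_index in events:
        first = int(start_s // segment_s)
        # An event ending exactly on a boundary does not spill into the next segment.
        last = min(num_segments - 1, math.ceil(end_s / segment_s) - 1)
        for seg in range(first, last + 1):
            labels[seg][class_index] = 1
    return labels


# Hypothetical 15-second recording with three classes and two annotated events.
example = segment_labels(15.0, [(1.0, 3.5, 0), (4.0, 6.0, 2)], num_classes=3)
# example == [[1, 0, 1], [0, 0, 1], [0, 0, 0]]
```

The last segment in the example contains no event, so it keeps the all-zero vector that marks a segment without bird vocalizations.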

Metadata

Audio Format: 32 kHz mono.
Labels: ebird_code, ebird_code_multilabel, ebird_code_secondary.
Additional Metadata: Start time, end time, frequency range, geographic location, etc.
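
Because the audio is stored at 32 kHz, start and end times map directly to sample indices. The sketch below assumes the times are given in seconds and that the waveform is already decoded into a flat sequence of samples; the function names are hypothetical.

```python
SAMPLE_RATE_HZ = 32_000  # stated audio format: 32 kHz mono


def event_sample_range(start_time_s, end_time_s, sample_rate_hz=SAMPLE_RATE_HZ):
    """Convert an event's start and end times (seconds) into sample indices."""
    if end_time_s < start_time_s:
        raise ValueError("end time precedes start time")
    return round(start_time_s * sample_rate_hz), round(end_time_s * sample_rate_hz)


def crop_event(samples, start_time_s, end_time_s):
    """Slice a decoded mono waveform down to one annotated event."""
    start, stop = event_sample_range(start_time_s, end_time_s)
    return samples[start:stop]


# Hypothetical 5-second silent waveform with an event from 1.5 s to 4.0 s.
waveform = [0.0] * (5 * SAMPLE_RATE_HZ)
clip = crop_event(waveform, 1.5, 4.0)  # 2.5 s of audio, i.e. 80,000 samples
```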

Citation

@misc{birdset,
  title={BirdSet: A Multi-Task Benchmark for Classification in Avian Bioacoustics},
  author={Lukas Rauch and Raphael Schwinger and Moritz Wirth and René Heinrich and Jonas Lange and Stefan Kahl and Bernhard Sick and Sven Tomforde and Christoph Scholz},
  year={2024},
  eprint={2403.10380},
  archivePrefix={arXiv},
  primaryClass={cs.SD}
}
