Dataset asset: Open Source Community, Audio Processing, Spatial Audio

SpatialAudio

The SpatialSounds dataset includes the balanced training and evaluation sets of AudioSet, together with reverberation data. It provides audio files and their associated metadata in mono, binaural, and surround sound formats. Sample code for generating spatial audio is also provided, along with the SpatialSoundQA dataset for training BAT models.

Source: Hugging Face
Created: Sep 21, 2024
Updated: Sep 21, 2024
Overview

Dataset description and usage context

SpatialSounds Dataset Overview

Dataset Contents

AudioSet (non-reverberant audio sources)

  • Dataset Type: Mono/Binaural/Ambisonics
  • Dataset Structure:
    • Balanced train and Evaluation sets
    • Unbalanced train set: not included; refer to the official AudioSet release
  • Metadata: Available for download via the metadata link
  • Weight Generation: See weights-generation or use the provided weight files

Reverberation

  • Dataset Structure:
    • train_reverberation.json and eval_reverberation.json
    • binaural and mono folders
  • Download Link: mp3d_reverberation

SpatialSoundQA Dataset

  • Dataset Content: Training data for the different training stages of BAT models
  • Download Link: SpatialSoundQA

Data Generation Method

  • Spatial Audio Generation: Use scipy.signal.fftconvolve or torchaudio.functional.fftconvolve to generate spatial audio from mono recordings
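
The generation step above can be sketched as follows. This is a minimal illustration, not the dataset authors' script: it assumes the reverberation files hold multi-channel room impulse responses (for example, two channels for binaural) and that both the mono clip and the impulse response are already loaded as NumPy arrays at the same sample rate. The function name `spatialize` and the array shapes are hypothetical choices made for this example; only `scipy.signal.fftconvolve` comes from the source.

```python
import numpy as np
from scipy.signal import fftconvolve


def spatialize(mono: np.ndarray, rir: np.ndarray) -> np.ndarray:
    """Convolve a mono signal with a multi-channel impulse response.

    mono: shape [T], the dry (non-reverberant) recording.
    rir:  shape [C, L], one impulse response per output channel
          (assumed layout; C = 2 for binaural).
    Returns an array of shape [C, T + L - 1].
    """
    mono = np.asarray(mono, dtype=np.float64)
    rir = np.asarray(rir, dtype=np.float64)
    if mono.ndim != 1:
        raise ValueError("mono must be a 1-D array of samples")
    if rir.ndim != 2:
        raise ValueError("rir must have shape [channels, length]")

    # Convolve the same dry signal with each channel's impulse response.
    channels = [fftconvolve(mono, rir[c], mode="full") for c in range(rir.shape[0])]
    return np.stack(channels, axis=0)


# Toy check: a unit impulse reproduces each channel's impulse response.
dry = np.array([1.0, 0.0, 0.0, 0.0])
toy_rir = np.array([[1.0, 0.5],
                    [0.0, 1.0]])
wet = spatialize(dry, toy_rir)
```

In practice the output is often trimmed back to the length of the dry signal and rescaled to avoid clipping before it is written to disk; whether the original pipeline does so is not stated in the source. `torchaudio.functional.fftconvolve` can play the same role when the data is already held in tensors.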

TODO

  1. Upload QA evaluation set

License

  • License Type: CC BY-NC 4.0