SpatialAudio

The SpatialSounds dataset includes the balanced training and evaluation sets of AudioSet, together with reverberation data. It provides audio files and associated metadata in mono, binaural, and Ambisonics formats. Sample code for generating spatial audio is also provided, along with the SpatialSoundQA dataset for training BAT models.

Updated 9/21/2024
huggingface

Description

SpatialSounds Dataset Overview

Dataset Contents

AudioSet (non-reverberant audio sources)

  • Dataset Type: Mono/Binaural/Ambisonics
  • Dataset Structure:
    • Balanced train and Evaluation sets
    • For the unbalanced train set, refer to the official AudioSet release
  • Metadata: Available for download via the metadata link
  • Weight Generation: See the weights-generation instructions, or use the provided weight files

Reverberation

  • Dataset Structure:
    • train_reverberation.json and eval_reverberation.json
    • binaural and mono folders
  • Download Link: mp3d_reverberation

SpatialSoundQA Dataset

  • Dataset Content: Training data for the different stages of BAT model training
  • Download Link: SpatialSoundQA

Data Generation Method

  • Spatial Audio Generation: Use scipy.signal.fftconvolve or torchaudio.functional.fftconvolve to generate spatial audio from mono recordings
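
The convolution step can be sketched roughly as follows. This is a minimal illustration, not the dataset's official generation script: it assumes the reverberation data can be loaded as a multichannel impulse response array of shape (channels, length), and the function name spatialize and the toy signals in the demo are hypothetical.

```python
import numpy as np
from scipy.signal import fftconvolve


def spatialize(mono: np.ndarray, rir: np.ndarray) -> np.ndarray:
    """Convolve a mono signal with each channel of an impulse response.

    mono: shape (num_samples,)
    rir:  shape (num_channels, rir_length), e.g. 2 channels for binaural
    Returns an array of shape (num_channels, num_samples); the convolution
    tail is truncated so the output has the same length as the input.
    """
    if mono.ndim != 1 or rir.ndim != 2:
        raise ValueError("expected mono of shape (T,) and rir of shape (C, L)")
    channels = [fftconvolve(mono, h, mode="full")[: mono.shape[0]] for h in rir]
    return np.stack(channels, axis=0)


if __name__ == "__main__":
    # Toy inputs only (assumption): a 1-second 440 Hz tone and a synthetic
    # two-channel impulse response made of exponentially decaying noise.
    sample_rate = 16000
    t = np.arange(sample_rate) / sample_rate
    mono = np.sin(2 * np.pi * 440.0 * t)

    rng = np.random.default_rng(0)
    decay = np.exp(-np.arange(2000) / 300.0)
    rir = rng.standard_normal((2, 2000)) * decay

    binaural = spatialize(mono, rir)
    print(binaural.shape)
```

The same pattern should carry over to torchaudio.functional.fftconvolve with tensors in place of arrays. In practice the mono clip and the impulse response need a common sample rate before convolving, and the output is usually normalized to avoid clipping.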

TODO

  1. Upload QA evaluation set

License

  • License Type: CC BY-NC 4.0

Topics

Spatial Audio
Audio Processing

Source

Organization: huggingface

Created: 9/21/2024
