Dataset asset, Open Source Community, Sound Recognition, Vehicle Safety

NINA Dataset

The NINA dataset is a collection of in‑vehicle and out‑of‑vehicle sounds (e.g., electric‑vehicle alarm sounds) intended for research. The sounds were recorded with dashcams or smartphone microphones; because recording conditions are uncontrolled, vehicle speed, recording device model, and microphone details are not provided.

Source: GitHub
Created: Dec 9, 2019
Updated: Sep 12, 2023
Dataset Overview

Dataset Name

Naturalistic IN‑vehicle Audio Dataset (NINA)

Dataset Content

The NINA dataset contains sounds generated inside and outside vehicles (e.g., electric‑vehicle alarm sounds). They were mainly recorded with dashcams or smartphone microphones and, because the recording environment is uncontrolled, the dataset does not include vehicle speed, recording device model, or microphone details.

Dataset Classification

Category | Segments | Total Duration (s)
Crash751865
Driving2951086
Tire skidding186208
Horn261314
Harsh acceleration2263
Talking265653
Screaming157113
Music198821
Pothole144138
Meteo (strong rain/hail)943613
Police siren39288
Ambulance siren1591253
Firetruck siren76822

Dataset Files

  • datasetCreation.sh: main script file
  • youtube_IDs.csv: list of YouTube videos
  • labels: folder of txt files, each containing annotations in the form [start time] [end time] [category]
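
The annotation layout described above can be read with a few lines of Python. This is a minimal sketch, not part of the dataset's own tooling: it assumes fields are separated by whitespace (spaces or tabs), that times are numeric and expressed in seconds, and that the category may contain spaces. The names Annotation, parse_label_line, and parse_label_text are hypothetical.

```python
from typing import NamedTuple


class Annotation(NamedTuple):
    start: float   # assumed to be seconds
    end: float     # assumed to be seconds
    category: str


def parse_label_line(line: str) -> Annotation:
    """Parse one '[start time] [end time] [category]' line."""
    # Split on whitespace at most twice so that multi-word categories
    # such as "Tire skidding" stay intact.
    start_text, end_text, category = line.strip().split(None, 2)
    start, end = float(start_text), float(end_text)
    if end < start:
        raise ValueError(f"end precedes start in line: {line!r}")
    return Annotation(start, end, category)


def parse_label_text(text: str) -> list[Annotation]:
    """Parse every non-empty line of a label file's contents."""
    return [parse_label_line(line) for line in text.splitlines() if line.strip()]
```

If the real label files use a different separator or time unit, only parse_label_line needs to change.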

Dataset Usage

Run the main script with the labels folder and an output folder as arguments:

  bash datasetCreation.sh ./labels/ ./output

This creates an output folder containing one subfolder per category, each populated with wav files.
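
Once the output folder exists, it can be useful to check how many files and how much audio each category received. The sketch below is one possible way to do that with the Python standard library. It assumes the wav files are uncompressed PCM (the only kind the wave module reads) and that every category is a direct subfolder of the output folder; summarize_output is a hypothetical helper, not part of the dataset.

```python
import wave
from pathlib import Path


def summarize_output(output_dir: str) -> dict[str, tuple[int, float]]:
    """Map each category subfolder to (file count, total seconds)."""
    summary = {}
    for category_dir in sorted(Path(output_dir).iterdir()):
        if not category_dir.is_dir():
            continue  # ignore stray files next to the category folders
        count, seconds = 0, 0.0
        for wav_path in sorted(category_dir.glob("*.wav")):
            with wave.open(str(wav_path), "rb") as wav_file:
                # Duration = number of frames / frames per second.
                seconds += wav_file.getnframes() / wav_file.getframerate()
            count += 1
        summary[category_dir.name] = (count, seconds)
    return summary
```

Calling summarize_output("./output") after the script finishes returns a dictionary keyed by category name, which can be compared against the segment counts and durations listed for the dataset.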

Contribution / Extension

Contributors can extend the dataset by adding new YouTube video IDs and related titles to the youtube_IDs.csv file and annotating audio using Audacity or other tools.
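
Adding a video to youtube_IDs.csv could be scripted roughly as follows. This is a sketch under explicit assumptions: the file's real column layout is not documented here, so the code assumes each row is "video ID, title" with the ID in the first column and no other columns. The helper add_video is hypothetical.

```python
import csv
from pathlib import Path


def add_video(csv_path: str, video_id: str, title: str) -> bool:
    """Append (video_id, title) unless the ID is already listed.

    Returns True if a row was appended, False if the ID already existed.
    """
    path = Path(csv_path)
    existing_ids = set()
    needs_newline = False
    if path.exists():
        with path.open(newline="", encoding="utf-8") as handle:
            # Assumption: the video ID is the first column of each row.
            existing_ids = {row[0] for row in csv.reader(handle) if row}
        data = path.read_bytes()
        # Avoid gluing the new row onto an unterminated last line.
        needs_newline = bool(data) and not data.endswith(b"\n")
    if video_id in existing_ids:
        return False
    with path.open("a", newline="", encoding="utf-8") as handle:
        if needs_newline:
            handle.write("\n")
        csv.writer(handle).writerow([video_id, title])
    return True
```

Using the csv module rather than plain string concatenation keeps titles that contain commas or quotes correctly escaped. The matching annotation file still has to be created separately, for example from an Audacity label track.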
