NINA Dataset
The NINA dataset is a collection of in‑vehicle and out‑of‑vehicle sounds (e.g., electric‑vehicle alarm sounds) intended for research purposes. The sounds were recorded with dashcams or smartphone microphones; because recording conditions are uncontrolled, vehicle speed, recording device model, and microphone details are not provided.
Dataset description and usage context
Dataset Overview
Dataset Name
Naturalistic IN‑vehicle Audio Dataset (NINA)
Dataset Content
The NINA dataset contains sounds generated inside and outside vehicles (e.g., electric‑vehicle alarm sounds). These sounds were mainly recorded with dashcams or smartphone microphones. Because the recording environment is uncontrolled, the dataset does not include vehicle speed, recording device model, or microphone details.
Dataset Classification
| Category | Segments | Total Duration (s) |
|---|---|---|
| Crash | 751 | 865 |
| Driving | 295 | 1086 |
| Tire skidding | 186 | 208 |
| Horn | 261 | 314 |
| Harsh acceleration | 22 | 63 |
| Talking | 265 | 653 |
| Screaming | 157 | 113 |
| Music | 198 | 821 |
| Pothole | 144 | 138 |
| Meteo (strong rain/hail) | 94 | 3613 |
| Police siren | 39 | 288 |
| Ambulance siren | 159 | 1253 |
| Firetruck siren | 76 | 822 |
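The table shows that the categories are strongly imbalanced, both in segment count and in duration. As a minimal sketch of how to quantify this, the snippet below copies the figures from the table and derives the overall totals and the mean segment length per category. The names `NINA_STATS`, `summarize`, and `mean_segment_seconds` are illustrative and not part of the dataset.

```python
# (category, segments, total duration in seconds), copied from the table above.
NINA_STATS = [
    ("Crash", 751, 865),
    ("Driving", 295, 1086),
    ("Tire skidding", 186, 208),
    ("Horn", 261, 314),
    ("Harsh acceleration", 22, 63),
    ("Talking", 265, 653),
    ("Screaming", 157, 113),
    ("Music", 198, 821),
    ("Pothole", 144, 138),
    ("Meteo (strong rain/hail)", 94, 3613),
    ("Police siren", 39, 288),
    ("Ambulance siren", 159, 1253),
    ("Firetruck siren", 76, 822),
]


def mean_segment_seconds(segments, total_seconds):
    """Average length of one segment in seconds."""
    return total_seconds / segments


def summarize(stats):
    """Return (total number of segments, total duration in seconds)."""
    total_segments = sum(segments for _, segments, _ in stats)
    total_seconds = sum(seconds for _, _, seconds in stats)
    return total_segments, total_seconds


if __name__ == "__main__":
    total_segments, total_seconds = summarize(NINA_STATS)
    print(f"{total_segments} segments, {total_seconds} s in total")
    for name, segments, seconds in NINA_STATS:
        print(f"{name}: {mean_segment_seconds(segments, seconds):.2f} s per segment")
```

Figures like these can help when choosing class weights or a resampling strategy before training a classifier on the segments.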
Dataset Files
- datasetCreation.sh: main script file
- youtube_IDs.csv: list of YouTube videos
- labels: folder containing txt files, each with annotations in the form [start time] [end time] [category]
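The annotation layout described above ([start time] [end time] [category]) can be read with a few lines of Python. This is only a sketch under stated assumptions: the fields are assumed to be separated by whitespace (tabs or spaces), the times are assumed to be decimal seconds, and the names `Annotation`, `parse_label_line`, and `parse_labels` are hypothetical helpers, not part of the dataset scripts.

```python
from typing import List, NamedTuple


class Annotation(NamedTuple):
    start: float   # assumed to be seconds from the start of the recording
    end: float     # assumed to be seconds from the start of the recording
    category: str


def parse_label_line(line: str) -> Annotation:
    """Parse one '[start time] [end time] [category]' line."""
    # Split at most twice so multi-word categories such as
    # "Tire skidding" stay in one piece.
    parts = line.strip().split(None, 2)
    if len(parts) != 3:
        raise ValueError(f"expected 3 fields, got {len(parts)}: {line!r}")
    start, end = float(parts[0]), float(parts[1])
    if end < start:
        raise ValueError(f"end time precedes start time: {line!r}")
    return Annotation(start, end, parts[2].strip())


def parse_labels(text: str) -> List[Annotation]:
    """Parse the full contents of a label file, skipping blank lines."""
    return [parse_label_line(line) for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    sample = "0.5\t2.0\tTire skidding\n3 4.25 Horn\n"
    for annotation in parse_labels(sample):
        print(annotation)
```

If the real label files use a different separator or time format, only `parse_label_line` needs to change.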
Dataset Usage
Run the main script with the labels folder and the output folder as arguments:
    bash datasetCreation.sh ./labels/ ./output
This creates an output folder containing one subfolder per category, each populated with wav files.
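After the script finishes, a quick sanity check is to count the wav files in each category subfolder and compare the counts against the segment numbers in the classification table. The sketch below assumes only the layout described above (one subfolder per category, wav files directly inside it); `count_wavs_per_category` is a hypothetical helper, not part of datasetCreation.sh.

```python
from pathlib import Path


def count_wavs_per_category(output_dir):
    """Map each category subfolder name to the number of .wav files in it."""
    root = Path(output_dir)
    counts = {}
    for subfolder in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[subfolder.name] = sum(
            1
            for f in subfolder.iterdir()
            if f.is_file() and f.suffix.lower() == ".wav"
        )
    return counts


if __name__ == "__main__":
    # "./output" matches the output path used in the command above.
    if Path("./output").is_dir():
        for category, n_files in count_wavs_per_category("./output").items():
            print(f"{category}: {n_files} wav files")
```

Counts lower than the table values may indicate source videos that are no longer available for download.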
Contribution / Extension
Contributors can extend the dataset by adding new YouTube video IDs and related titles to the youtube_IDs.csv file and annotating audio using Audacity or other tools.