Human Voice Dataset

A collection of human voice recordings covering a range of vocal features (pitch, vowels, consonants, etc.). The dataset aims to simplify research on voice-based music controllers. It can be used to benchmark vocal feature detection algorithms (pitch detection, onset detection) and as training data for machine-learning models.

Updated 6/19/2021

Dataset Overview

Purpose: This dataset is designed to streamline research on voice‑controlled musical interfaces. It facilitates benchmarking of vocal feature detection algorithms (e.g., pitch detection, onset detection) and provides training material for machine‑learning models.
Current Version: Recordings from one singer are provided; additional singers will be added in the coming weeks.

Vocal Features

Notes: Sampled in semitone steps from the bottom to the top of the singer's range, e.g., c3.wav, c#3.wav, d3.wav, ...
Vowels: A small, fixed set of values (a, e, i, ...), recorded with `_` in the consonant position, e.g., _-a-[note].wav, _-e-[note].wav, ...
Consonants: Always combined with a vowel, since a consonant alone is not intelligible, e.g., t-a-[note].wav, t-u-[note].wav, ...
Dynamics: Volume changes, pitch bends, vibrato (not yet available in the dataset)
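
When benchmarking a pitch detector against the note recordings, it helps to know the nominal frequency each file name implies. The sketch below enumerates note names in semitone steps and converts them to frequencies. It assumes scientific pitch notation (c4 is middle C, MIDI 60), sharps only, and equal temperament with A4 = 440 Hz; none of these conventions is confirmed by the dataset description, and the function names are illustrative.

```python
# Assumptions (not stated by the dataset): scientific pitch notation
# (c4 = middle C = MIDI 60), sharps only, equal temperament, A4 = 440 Hz.
NOTE_NAMES = ["c", "c#", "d", "d#", "e", "f", "f#", "g", "g#", "a", "a#", "b"]


def note_to_midi(note: str) -> int:
    """Convert a lowercase note name such as 'c#3' to a MIDI note number.

    Only single-digit octaves (0-9) are handled.
    """
    name, octave = note[:-1], int(note[-1])
    return 12 * (octave + 1) + NOTE_NAMES.index(name)


def midi_to_note(midi: int) -> str:
    """Convert a MIDI note number back to a lowercase note name."""
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


def note_to_hz(note: str) -> float:
    """Nominal equal-tempered frequency of a note, in hertz."""
    return 440.0 * 2 ** ((note_to_midi(note) - 69) / 12)


def semitone_range(lowest: str, highest: str) -> list[str]:
    """All note names from lowest to highest, inclusive, in semitone steps."""
    return [
        midi_to_note(m)
        for m in range(note_to_midi(lowest), note_to_midi(highest) + 1)
    ]


if __name__ == "__main__":
    # Print the expected file name and nominal pitch for a short range.
    for note in semitone_range("c3", "e3"):
        print(f"{note}.wav  {note_to_hz(note):.2f} Hz")
```

A detector's output for a given file can then be compared with `note_to_hz` of the note in its name, keeping in mind that a sung note will deviate from the nominal pitch.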

Dataset Structure

File Naming Pattern: [consonant]-[vowel]-[note]-[dynamic].wav
Directory Layout:
- data/voices/
  - martin/
    - notes/
      - sources/
      - exports/
    - vowels/
      - sources/
      - exports/
    - consonants/
      - sources/
      - exports/
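
The file naming pattern above can be split back into its fields when loading samples. The following is a minimal sketch, assuming fields are separated by `-`, that `_` marks an absent consonant, and that note-only files such as c3.wav carry just the note field. The `SampleName` class and `parse_sample_name` function are hypothetical helpers, not part of the dataset's tooling.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class SampleName:
    consonant: Optional[str]  # None when absent or written as "_"
    vowel: Optional[str]      # None for note-only files
    note: str                 # e.g. "c#3"
    dynamic: Optional[str]    # None when the field is missing


def parse_sample_name(filename: str) -> SampleName:
    """Split a name like 't-a-c3.wav' into its fields.

    Accepts 1 field (note only), 3 fields (consonant-vowel-note), or
    4 fields (consonant-vowel-note-dynamic). Assumes no field contains '-'.
    """
    parts = Path(filename).stem.split("-")
    if len(parts) == 1:
        return SampleName(None, None, parts[0], None)
    if len(parts) in (3, 4):
        consonant = None if parts[0] == "_" else parts[0]
        dynamic = parts[3] if len(parts) == 4 else None
        return SampleName(consonant, parts[1], parts[2], dynamic)
    raise ValueError(f"unexpected file name layout: {filename!r}")


if __name__ == "__main__":
    for name in ("c#3.wav", "_-a-c3.wav", "t-u-d3.wav"):
        print(name, parse_sample_name(name))
```

Returning `None` for missing fields keeps note-only, vowel, and consonant recordings in one record type, so samples from all three folders can be filtered with the same code.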

Singer Information

Property Files:
- singer.properties: Contains age, gender, nationality, etc.
- recorder.properties: Contains recording equipment, recording conditions, etc.
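
The exact format of the property files is not described here. As a rough sketch, assuming they follow the common Java-style layout of one `key=value` pair per line with `#` or `!` comment lines, they could be read as shown below. The keys and values in the sample text are hypothetical placeholders, not real singer data.

```python
def parse_properties(text: str) -> dict[str, str]:
    """Parse simple 'key=value' lines into a dictionary.

    Assumption: Java-style properties without line continuations or
    escape sequences. Blank lines, comment lines starting with '#' or '!',
    and lines without '=' are skipped.
    """
    props: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        props[key.strip()] = value.strip()
    return props


if __name__ == "__main__":
    # Hypothetical content; real files may use different keys.
    sample = "# singer metadata\nage = 30\nnationality=unknown\n"
    print(parse_properties(sample))
```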

Dataset Expansion

Adding Samples:
1. Clone the repository: git clone https://github.com/vocobox/human-voice-dataset.git
2. Copy the new singer's folder into the repository (next to the existing singer under data/voices/), then commit and push the changes: git add . && git commit -m "[new singer] barbara" && git push origin master

Other Useful Sound Datasets

Piano Note Dataset: MAPS Database
Singing Voice Dataset: Singing Voice Dataset
Speech Corpora: CMU Speech, VoxForge, TED‑LIUM Corpus
Sound and Instruments: IRMAS Dataset

Topics

Speech Recognition

Music Technology

Source

Organization: github

Created: 5/29/2018
