Piano Roll, Lead Sheets, MIDI

This repository collects symbolic music datasets in three groups: piano‑roll, lead‑sheet, and MIDI. Each entry lists its source and format.

Source: GitHub
Created: Dec 2, 2018
Updated: Jan 15, 2022
Dataset Overview

Piano Roll

  • 5‑track piano‑roll dataset

    • Source: Based on LPD (the Lakh Pianoroll Dataset), with a new preprocessing policy.
    • Description: A piano‑roll dataset with 5 tracks per song.
  • Lead sheet dataset

    • Source: Based on Theorytab, with possible integration of other lead‑sheet datasets.
    • Description: Lead‑sheet dataset; see this repo for details.
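
For orientation, a piano roll is a grid of time steps by pitches, with one grid per track. The following is a minimal sketch of a 5‑track roll in plain Python, assuming binary note‑on cells and 128 MIDI pitches. The track labels and helper names are hypothetical, and the dataset's actual storage format is not described here.

```python
# Hypothetical track labels: the listing only states that there are 5 tracks.
TRACKS = ["track_0", "track_1", "track_2", "track_3", "track_4"]
N_PITCHES = 128  # MIDI pitch numbers run from 0 to 127


def empty_piano_roll(n_steps):
    """Return a roll indexed as roll[track][step][pitch], with all notes off."""
    return [[[0] * N_PITCHES for _ in range(n_steps)] for _ in TRACKS]


def add_note(roll, track, pitch, start, duration):
    """Mark `pitch` as active on `track` for `duration` steps from `start`."""
    end = min(start + duration, len(roll[track]))  # clip at the end of the roll
    for step in range(start, end):
        roll[track][step][pitch] = 1


def active_pitches(roll, track, step):
    """List the pitches sounding on `track` at `step`."""
    return [pitch for pitch, on in enumerate(roll[track][step]) if on]


roll = empty_piano_roll(n_steps=16)
add_note(roll, track=0, pitch=60, start=0, duration=4)  # middle C
add_note(roll, track=0, pitch=64, start=0, duration=4)  # the E above it
add_note(roll, track=3, pitch=36, start=2, duration=2)

print(active_pitches(roll, 0, 1))  # pitches on track 0 at step 1
print(active_pitches(roll, 3, 3))  # pitches on track 3 at step 3
```

Nested lists keep the sketch dependency‑free; an array library would be the more usual choice for real data of this shape.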

Lead Sheets

  • Crawled Datasets
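    • Each entry gives the data source, followed by genre, format, chords, melody, number of songs, and source link:
      • Theorytab: genre pop; format XML; chords V; melody V; songs 10,148; source link X
      • Wikifonia: genre pop; format XML; chords V; melody V; songs 6,675; source link O
      • Hymnal: genre hymn; format MIDI; chords Δ; melody V; songs 3,358; source link O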

MIDI

  • Crawled Datasets
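    • Each entry gives the data source, followed by genre, multi‑track support, format, number of songs, and source link:
      • VGMdb: genre game; multi‑track V; format MIDI; songs 28,419; source link O
      • Doug McKenzie Jazz: genre jazz; multi‑track V; format MIDI; songs 297; source link O
      • Piano‑e‑Competition: genre classical; multi‑track not marked; format MIDI; songs 1,573; source link O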

Together, these datasets span pop, hymn, game, jazz, and classical music in XML and MIDI formats, so they suit a range of music analysis and processing tasks.
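
When choosing among the crawled datasets, it can help to hold the listings above as plain records and query them. The following is a small sketch under that assumption: the field names and helper functions are illustrative and do not come from the repository, while the values are copied from the listings.

```python
# Records transcribed from the lead-sheet and MIDI listings.
# Field names ("source", "genre", "format", "songs") are illustrative.
LEAD_SHEETS = [
    {"source": "Theorytab", "genre": "pop", "format": "XML", "songs": 10148},
    {"source": "Wikifonia", "genre": "pop", "format": "XML", "songs": 6675},
    {"source": "Hymnal", "genre": "hymn", "format": "MIDI", "songs": 3358},
]

MIDI_SETS = [
    {"source": "VGMdb", "genre": "game", "format": "MIDI", "songs": 28419},
    {"source": "Doug McKenzie Jazz", "genre": "jazz", "format": "MIDI", "songs": 297},
    {"source": "Piano-e-Competition", "genre": "classical", "format": "MIDI", "songs": 1573},
]


def total_songs(rows):
    """Sum the song counts over a list of dataset records."""
    return sum(row["songs"] for row in rows)


def sources_with_format(rows, fmt):
    """Return the names of the datasets distributed in the given format."""
    return [row["source"] for row in rows if row["format"] == fmt]


print(total_songs(LEAD_SHEETS))  # 10148 + 6675 + 3358 = 20181
print(total_songs(MIDI_SETS))    # 28419 + 297 + 1573 = 30289
print(sources_with_format(LEAD_SHEETS + MIDI_SETS, "XML"))
```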