
Kpop-lyric-datasets

A JSON‑format dataset comprising 25,696 Korean pop songs, sourced from Melon's monthly charts (2000 – October 2023). The dataset includes Python functions for data processing and emphasizes copyright attribution and usage restrictions.

Updated 12/3/2023

Description

Dataset Overview

Dataset Name

  • Kpop‑lyric‑datasets

Dataset Content

  • Contains 25,696 K‑pop songs in JSON format, sourced from Melon’s Monthly Chart Ranking 100 (2000 to October 2023).

License

  • Available for research purposes; commercial use requires negotiation with lyric authors, artists, composers, etc.

Dataset Structure

Data File Paths

  • melon/monthly-chart/melon-<year>/melon-<year>-<month>/melon-monthly_<year>-<month>_<chart rank>.json
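
Because the year, month, and chart rank are encoded in each file name, they can be recovered without opening the file. The following is a minimal sketch, not code from the repository: the names parse_chart_filename and ChartEntry are hypothetical, and the digit widths in the pattern (for example, whether months are zero-padded) are assumptions.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern, based only on the file-name part of the path above.
# Digit widths are a guess: months may or may not be zero-padded.
FILENAME_RE = re.compile(r"melon-monthly_(\d{4})-(\d{1,2})_(\d+)\.json$")


class ChartEntry(NamedTuple):
    """Hypothetical container for the values encoded in a file name."""
    year: int
    month: int
    rank: int


def parse_chart_filename(name: str) -> Optional[ChartEntry]:
    """Return (year, month, rank) for a chart file name, or None if it does not match."""
    match = FILENAME_RE.search(name)
    if match is None:
        return None
    year, month, rank = (int(group) for group in match.groups())
    return ChartEntry(year, month, rank)
```

Since the pattern is anchored only at the end of the string, the function accepts either a bare file name or a full path.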

Data Fields

  • info: Metadata such as year, month, rank, genre, and source website.
  • song_id: Song ID in the Melon database.
  • song_name: Song title.
  • album: Album name.
  • release_date: Release date.
  • artist: Artist name.
  • genre: Genre.
  • lyric_writer: Lyricist.
  • composer: Composer.
  • arranger: Arranger.
  • lyrics: Lyrics content, including line count and text.
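
To illustrate how a single record might be consumed, here is a minimal sketch using only the Python standard library. The top-level keys follow the field list above. Every value, and the inner keys of info and lyrics, are placeholders rather than data or structure taken from the dataset. The helpers missing_fields and summarize are hypothetical.

```python
import json

# Hypothetical record: top-level keys match the field list above, but all
# values and the inner layout of "info" and "lyrics" are placeholders.
SAMPLE_JSON = """
{
  "info": {"year": 2023, "month": 10, "rank": 1},
  "song_id": "0000000",
  "song_name": "Example Title",
  "album": "Example Album",
  "release_date": "2023.10.01",
  "artist": "Example Artist",
  "genre": "Dance",
  "lyric_writer": "Writer A",
  "composer": "Composer B",
  "arranger": "Arranger C",
  "lyrics": {"lines": 2, "text": ["first line", "second line"]}
}
"""

EXPECTED_FIELDS = [
    "info", "song_id", "song_name", "album", "release_date", "artist",
    "genre", "lyric_writer", "composer", "arranger", "lyrics",
]


def missing_fields(record: dict) -> list:
    """List the documented fields that are absent from a record."""
    return [field for field in EXPECTED_FIELDS if field not in record]


def summarize(record: dict) -> str:
    """Build a one-line description, tolerating absent keys."""
    artist = record.get("artist", "?")
    title = record.get("song_name", "?")
    released = record.get("release_date", "?")
    return f"{artist} - {title} ({released})"


record = json.loads(SAMPLE_JSON)
print(missing_fields(record))
print(summarize(record))
```

Checking for missing keys before analysis is a cheap way to confirm that the real files match the documented field list.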

Usage

Data Retrieval

  • Get 2023 data: data_parser.get_dict(2023) returns a dictionary.
  • Get 2010‑2022 data: data_parser.get_df(2010, 2022) returns a Pandas DataFrame.
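
If the repository's data_parser module is not at hand, a year-range selection similar in spirit can be approximated with the standard library. This is a rough sketch and not the implementation of get_dict or get_df: the function load_years is hypothetical, it returns a plain list of records, and it assumes the year can be read from file names of the form melon-monthly_<year>-<month>_<chart rank>.json.

```python
import json
import re
from pathlib import Path

# Assumed file-name pattern; digit widths are a guess.
FILENAME_RE = re.compile(r"melon-monthly_(\d{4})-(\d{1,2})_(\d+)\.json$")


def load_years(root, first_year, last_year=None):
    """Collect records whose file name encodes a year in [first_year, last_year].

    Hypothetical helper: it searches `root` recursively, so it does not
    depend on the exact directory layout of the clone.
    """
    if last_year is None:
        last_year = first_year
    records = []
    for path in sorted(Path(root).rglob("melon-monthly_*.json")):
        match = FILENAME_RE.search(path.name)
        if match is None:
            continue  # skip files that do not follow the assumed naming scheme
        year = int(match.group(1))
        if first_year <= year <= last_year:
            with path.open(encoding="utf-8") as handle:
                records.append(json.load(handle))
    return records
```

The resulting list of dictionaries can be passed to pandas.DataFrame(records) when a tabular view is preferred. Nested fields would then appear as object columns.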

Cloning

  • Clone the repository: git clone https://github.com/EX3exp/Kpop-lyric-datasets.git

Topics

Korean Pop Music
Text Mining

Source

Organization: github

Created: 12/2/2023
