Dataset asset, Open Source Community, Text Mining, Korean Pop Music
Kpop-lyric-datasets
A JSON‑format dataset comprising 25,696 Korean pop songs, sourced from Melon's monthly charts (2000 – October 2023). The dataset includes Python functions for data processing and emphasizes copyright attribution and usage restrictions.
Source: GitHub
Created: Dec 2, 2023
Updated: Dec 3, 2023
Dataset Overview
Dataset Name
- Kpop‑lyric‑datasets
Dataset Content
- Contains 25,696 K‑pop songs in JSON format, one file per chart entry, sourced from Melon’s Monthly Chart Ranking 100 (2000 to October 2023).
License
- Available for research purposes; commercial use requires negotiation with lyric authors, artists, composers, etc.
Dataset Structure
Data File Paths
melon/monthly-chart/melon-<year>/melon-<year>-<month>/melon-monthly_<year>-<month>_<chart rank>.json
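As a minimal sketch of how this pattern maps to a concrete file, the helper below builds the relative path for one chart entry. It assumes each component of the pattern is a nested directory under melon/monthly-chart. The name chart_file_path is hypothetical and not part of the repository. The source does not say whether months or ranks are zero-padded, so the values are formatted as given.

```python
from pathlib import PurePosixPath


def chart_file_path(year, month, rank, root="melon/monthly-chart"):
    """Build the relative path of one chart entry (hypothetical helper).

    Assumption: year, month and rank are inserted as-is; the source does
    not state whether the real file names use zero-padding.
    """
    year_dir = f"melon-{year}"
    month_dir = f"melon-{year}-{month}"
    file_name = f"melon-monthly_{year}-{month}_{rank}.json"
    return PurePosixPath(root) / year_dir / month_dir / file_name


print(chart_file_path(2023, 10, 1))
# melon/monthly-chart/melon-2023/melon-2023-10/melon-monthly_2023-10_1.json
```

If the files in the cloned repository use a different padding, pass pre-formatted strings (for example "01") instead of integers.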
Data Fields
- info: Metadata such as year, month, rank, genre, and source website.
- song_id: Song ID in the Melon database.
- song_name: Song title.
- album: Album name.
- release_date: Release date.
- artist: Artist name.
- genre: Genre.
- lyric_writer: Lyricist.
- composer: Composer.
- arranger: Arranger.
- lyrics: Lyrics content, including line count and text.
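The field list above can be turned into a quick sanity check when reading a file. The sketch below assumes all eleven fields are top-level keys of each JSON object, which the listing suggests but does not confirm. The helper missing_fields and the placeholder record are illustrative only and do not come from the dataset.

```python
import json

# Field names as listed above; treating them as top-level keys is an assumption.
EXPECTED_FIELDS = (
    "info", "song_id", "song_name", "album", "release_date", "artist",
    "genre", "lyric_writer", "composer", "arranger", "lyrics",
)


def missing_fields(record):
    """Return the expected field names that are absent from a record."""
    return [name for name in EXPECTED_FIELDS if name not in record]


# Placeholder record for illustration, not real data from the dataset.
sample = json.loads(
    '{"song_id": "0", "song_name": "placeholder title", "artist": "placeholder artist"}'
)
print(missing_fields(sample))
```

An empty list means the record carries every documented field. The nested layout of info and lyrics is not described in enough detail here to validate further.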
Usage
Data Retrieval
- Get 2023 data: data_parser.get_dict(2023) returns a dictionary.
- Get 2010‑2022 data: data_parser.get_df(2010, 2022) returns a Pandas DataFrame.
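For readers who want to see what a year-range loader involves without the repository's own module, here is a standard-library stand-in. It is not the implementation of data_parser. The function load_chart_records is hypothetical, it returns a plain list of dictionaries rather than a dictionary or DataFrame, and it assumes the directory layout melon-<year>/melon-<year>-<month>/ under the given root.

```python
import json
from pathlib import Path


def load_chart_records(root, first_year, last_year=None):
    """Collect every chart JSON file for an inclusive year range.

    Hypothetical stand-in, not the repository's data_parser. Assumes files
    live at <root>/melon-<year>/melon-<year>-<month>/melon-monthly_*.json.
    """
    if last_year is None:
        last_year = first_year
    records = []
    for year in range(first_year, last_year + 1):
        year_dir = Path(root) / f"melon-{year}"
        if not year_dir.is_dir():
            continue  # skip years that are not present on disk
        # sorted() keeps the file order deterministic across platforms
        for path in sorted(year_dir.glob("melon-*/melon-monthly_*.json")):
            with path.open(encoding="utf-8") as handle:
                records.append(json.load(handle))
    return records
```

Calling load_chart_records("melon/monthly-chart", 2010, 2022) from the root of a clone would gather the same year span as the second example above, provided the layout assumption holds.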
Cloning
- Clone with: git clone https://github.com/EX3exp/Kpop-lyric-datasets.git