DATASET
Open Source Community
Kpop-lyric-datasets
A JSON‑format dataset comprising 25,696 Korean pop songs, sourced from Melon's monthly charts (2000 – October 2023). The dataset includes Python functions for data processing and emphasizes copyright attribution and usage restrictions.
Updated 12/3/2023
github
Description
Dataset Overview
Dataset Name
- Kpop‑lyric‑datasets
Dataset Content
- Contains 25,696 K‑pop songs in JSON format, sourced from Melon’s Monthly Chart Ranking 100 (2000 to October 2023).
License
- Available for research purposes; commercial use requires negotiation with lyric authors, artists, composers, etc.
Dataset Structure
Data File Paths
melon/monthly-chart/melon-<year>/melon-<year>-<month>/melon-monthly_<year>-<month>_<chart rank>.json
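Because each file name encodes the year, month, and chart rank, those values can be recovered without opening the file. The following is a minimal sketch: parse_chart_filename and ChartEntry are hypothetical names that are not part of the repository, and the digit widths in the pattern (four-digit year, one- or two-digit month) are assumptions, since the template does not say whether values are zero-padded.

```python
import re
from pathlib import PurePosixPath
from typing import NamedTuple, Optional

# Pattern derived from the file-name template; the digit widths are an assumption.
_FILENAME_RE = re.compile(r"^melon-monthly_(\d{4})-(\d{1,2})_(\d+)\.json$")


class ChartEntry(NamedTuple):
    year: int
    month: int
    rank: int


def parse_chart_filename(path: str) -> Optional[ChartEntry]:
    """Return (year, month, rank) parsed from a chart file name, or None if it does not match."""
    name = PurePosixPath(path).name  # keep only the last path component
    match = _FILENAME_RE.match(name)
    if match is None:
        return None
    year, month, rank = (int(group) for group in match.groups())
    return ChartEntry(year, month, rank)
```

For an illustrative name such as melon-monthly_2023-10_7.json, the function returns year 2023, month 10, and rank 7. Names that do not follow the template yield None.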
Data Fields
- info: Metadata such as year, month, rank, genre, and source website.
- song_id: Song ID in the Melon database.
- song_name: Song title.
- album: Album name.
- release_date: Release date.
- artist: Artist name.
- genre: Genre.
- lyric_writer: Lyricist.
- composer: Composer.
- arranger: Arranger.
- lyrics: Lyrics content, including line count and text.
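A quick way to sanity-check a loaded record against the field list above is to report which names are absent. This is a sketch only: missing_fields is a hypothetical helper, and treating every listed field as a top-level key of the JSON object is an assumption, because the list does not specify how the fields are nested. The sample record is a placeholder, not real dataset content.

```python
import json

# Field names copied from the list above; treating them all as top-level keys is an assumption.
EXPECTED_FIELDS = (
    "info", "song_id", "song_name", "album", "release_date", "artist",
    "genre", "lyric_writer", "composer", "arranger", "lyrics",
)


def missing_fields(record: dict) -> list:
    """Return the expected field names that are absent from a record, in list order."""
    return [field for field in EXPECTED_FIELDS if field not in record]


# Hypothetical, deliberately incomplete record used only to exercise the check.
sample = json.loads('{"song_id": "0", "song_name": "placeholder", "artist": "placeholder"}')
print(missing_fields(sample))
```

An empty list means every listed field is present. A non-empty list points either to an incomplete record or to a layout that differs from the one assumed here.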
Usage
Data Retrieval
- Get 2023 data: data_parser.get_dict(2023) returns a dictionary.
- Get 2010‑2022 data: data_parser.get_df(2010, 2022) returns a pandas DataFrame.
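For readers who prefer to read the JSON files directly rather than go through data_parser, the sketch below collects records for an inclusive year range using only the standard library. It is not the repository's implementation of get_dict or get_df. load_year_range is a hypothetical helper, and it assumes the file naming shown under Data File Paths, UTF-8 encoding, and one JSON document per file.

```python
import json
from pathlib import Path


def load_year_range(root, start_year, end_year=None):
    """Load every chart JSON file under `root` whose file name falls in the inclusive year range."""
    if end_year is None:
        end_year = start_year  # a single year was requested
    records = []
    for year in range(start_year, end_year + 1):
        # File-name pattern follows the "Data File Paths" template; sorted for a stable order.
        for path in sorted(Path(root).rglob(f"melon-monthly_{year}-*_*.json")):
            with path.open(encoding="utf-8") as handle:
                records.append(json.load(handle))
    return records
```

Calling load_year_range(root, 2023) mirrors the single-year call above, and load_year_range(root, 2010, 2022) mirrors the range call. In both cases the result is a plain list of parsed records rather than a dictionary or DataFrame.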
Cloning
- Clone with: git clone https://github.com/EX3exp/Kpop-lyric-datasets.git
Topics
Korean Pop Music
Text Mining
Source
Organization: github
Created: 12/2/2023