High Quality Data

Dataset Hub

Explore high-quality datasets for your AI and machine learning projects.

Sort:

Browse by Category

poetry_dataset

The dataset comprises five features: title, author, text, identifier, and category. It is divided into a training set containing 47,829 samples, with a total size of 12,701,752 bytes. The download size is 9,836,711 bytes.

huggingface

View Details

Gutenberg Poetry Corpus

Poetry

Text Generation

This is a poetry corpus extracted from Project Gutenberg, containing approximately three million lines of poetry, particularly suitable for creative computational poetry text generation applications.

github

View Details