The dataset has five features: title, author, text, identifier, and category. It provides a single training split of 47,829 samples, with a total size of 12,701,752 bytes and a download size of 9,836,711 bytes.
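As a rough sanity check on the figures above, the following minimal sketch derives the average sample size and the download-to-dataset size ratio. The `PoemRecord` class is hypothetical: its field names mirror the five listed features, but the actual column names and types in the published files are assumptions and may differ.

```python
from dataclasses import dataclass

# Figures quoted in the dataset description.
NUM_SAMPLES = 47_829
DATASET_BYTES = 12_701_752
DOWNLOAD_BYTES = 9_836_711


@dataclass(frozen=True)
class PoemRecord:
    """Hypothetical record type; field names and str types are assumptions."""
    title: str
    author: str
    text: str
    identifier: str
    category: str


# Average stored size of one sample, in bytes.
bytes_per_sample = DATASET_BYTES / NUM_SAMPLES

# Download size as a fraction of the stored dataset size.
download_ratio = DOWNLOAD_BYTES / DATASET_BYTES

if __name__ == "__main__":
    print(f"{bytes_per_sample:.1f} bytes per sample")
    print(f"download is {download_ratio:.1%} of the dataset size")
```

This works out to about 266 bytes per sample on average, with the download being about 77% of the stored size.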
This is a poetry corpus extracted from Project Gutenberg, containing approximately three million lines of poetry. It is particularly well suited to computational poetry and creative text generation.