High Quality Data

Dataset Hub

Explore high-quality datasets for your AI and machine learning projects.

legacy107/qa_wikipedia

The qa_wikipedia dataset is a question-answering dataset built from documents extracted from Wikipedia, each paired with associated questions. Each record includes the document ID, title, context, question, answer start position, answer text, and the full article. The dataset is split into training, validation, and test subsets.

Hugging Face
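
The answer start position and answer text in a record like this usually describe a span inside the context. The following is a minimal sketch of that relationship, assuming the start position is a character offset into the context. The `QARecord` class, its field names, and the sample record are hypothetical illustrations and are not taken from the dataset itself.

```python
from dataclasses import dataclass


@dataclass
class QARecord:
    # Hypothetical field names modeled on the features listed above;
    # the dataset's real column names may differ.
    doc_id: str
    title: str
    context: str
    question: str
    answer_start: int  # assumed to be a character offset into `context`
    answer_text: str


def answer_span_matches(rec: QARecord) -> bool:
    """Check that the answer text appears in the context at the stated offset."""
    end = rec.answer_start + len(rec.answer_text)
    return rec.context[rec.answer_start:end] == rec.answer_text


# Made-up record for illustration only, not real dataset content.
example = QARecord(
    doc_id="doc-0",
    title="Example",
    context="Paris is the capital of France.",
    question="What is the capital of France?",
    answer_start=0,
    answer_text="Paris",
)

print(answer_span_matches(example))
```

A check like this can help catch offset errors before training an extractive question-answering model, since a misaligned span yields the wrong target text.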

wikipedia_qa

Question Answering Systems

Wikipedia

This dataset converts Korean Wikipedia data into a question-answer format. The conversion is meant to be done with code rather than a language model, and new processing ideas will be released as new versions.

Hugging Face
