tau/commonsense_qa
CommonsenseQA is a multiple-choice question answering dataset that requires different types of commonsense knowledge to predict the correct answer. It contains 12,102 questions, each with one correct answer and four distractors. The dataset is in English and is split into training, validation, and test sets.
Description
Dataset Overview
Basic Information
- Name: CommonsenseQA
- Language: English (en)
- License: MIT
- Multilinguality: Monolingual
- Size: 1K<n<10K
- Source Data: Original data
- Task Category: Question Answering
- Task ID: open-domain-qa
- Paper Code ID: commonsenseqa
- Display Name: CommonsenseQA
Dataset Structure
- Features:
  - id: String type, unique ID.
  - question: String type, question description.
  - question_concept: String type, concept related to the question.
  - choices: Dictionary type, containing option labels and texts.
    - label: String type, option label.
    - text: String type, option text.
  - answerKey: String type, correct answer.
- Data Splits:
  - train: 9,741 samples, 2,207,794 bytes.
  - validation: 1,221 samples, 273,848 bytes.
  - test: 1,140 samples, 257,842 bytes.
- Total Download Size: 1,558,570 bytes.
- Total Dataset Size: 2,739,484 bytes.
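To make the structure above concrete, here is a minimal Python sketch of one record and two small helpers for working with it. The record is hypothetical: its ID, question text, and options are invented for illustration and are not taken from the dataset. The sketch also assumes that `label` and `text` inside `choices` are parallel lists, which the field list above suggests but does not state outright. The helper names `format_question` and `answer_text` are likewise illustrative.

```python
# Hypothetical record that follows the documented fields.
# Assumption: choices["label"] and choices["text"] are parallel lists.
example = {
    "id": "example-0001",
    "question": "Where would you put a plate after washing it?",
    "question_concept": "plate",
    "choices": {
        "label": ["A", "B", "C", "D", "E"],
        "text": ["cupboard", "ocean", "garage", "forest", "mailbox"],
    },
    "answerKey": "A",
}


def format_question(record: dict) -> str:
    """Render the question followed by one 'label. text' line per option."""
    lines = [record["question"]]
    choices = record["choices"]
    for label, text in zip(choices["label"], choices["text"]):
        lines.append(f"{label}. {text}")
    return "\n".join(lines)


def answer_text(record: dict) -> str:
    """Return the option text whose label matches answerKey."""
    choices = record["choices"]
    index = choices["label"].index(record["answerKey"])
    return choices["text"][index]


# Split sizes copied from the Data Splits list above.
SPLIT_SIZES = {"train": 9741, "validation": 1221, "test": 1140}

if __name__ == "__main__":
    print(format_question(example))
    print("Answer:", answer_text(example))
    # The three splits add up to the 12,102 questions stated in the summary.
    print("Total questions:", sum(SPLIT_SIZES.values()))
```

If the Hugging Face `datasets` library is installed, calling `load_dataset("tau/commonsense_qa")` would presumably yield records of this shape, although that is not verified here.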
Dataset Creation
- Annotation Creators: Crowd‑sourced
- Language Creators: Crowd‑sourced
Usage Considerations
- License: MIT.
- Citation Information:
  @inproceedings{talmor-etal-2019-commonsenseqa,
      title = "{C}ommonsense{QA}: A Question Answering Challenge Targeting Commonsense Knowledge",
      author = "Talmor, Alon and Herzig, Jonathan and Lourie, Nicholas and Berant, Jonathan",
      booktitle = "Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)",
      month = jun,
      year = "2019",
      address = "Minneapolis, Minnesota",
      publisher = "Association for Computational Linguistics",
      url = "https://aclanthology.org/N19-1421",
      doi = "10.18653/v1/N19-1421",
      pages = "4149--4158",
      archivePrefix = "arXiv",
      eprint = "1811.00937",
      primaryClass = "cs",
  }
Source
Organization: hugging_face
Created: Unknown