This is a formatted version of the [derek-thomas/ScienceQA](https://huggingface.co/datasets/derek-thomas/ScienceQA) dataset that includes only the instances with an image. It is used in the `lmms-eval` pipeline to enable one-click evaluation of large multimodal models. Each instance provides the fields image, question, choices, answer, hint, task, grade, subject, topic, category, skill, lecture, and solution, and the data is split into training, validation, and test sets.
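The image-only restriction can be pictured as a simple filter over the original records. The following is a minimal sketch, not the script actually used to build this version: the `records` list and the helper `keep_image_instances` are hypothetical, and it assumes that a text-only instance carries `None` in its `image` field.

```python
from typing import Any, Dict, List

# Hypothetical in-memory records. The field names follow the dataset
# description; the values are invented purely for illustration.
records: List[Dict[str, Any]] = [
    {
        "image": b"<png bytes>",  # placeholder for real image data
        "question": "Which property do these objects have in common?",
        "choices": ["hard", "soft"],
        "answer": 0,
    },
    {
        "image": None,  # assumption: text-only instances have no image
        "question": "Which word would you find on a dictionary page?",
        "choices": ["cat", "dog"],
        "answer": 1,
    },
]


def keep_image_instances(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return only the rows whose `image` field is populated."""
    return [row for row in rows if row.get("image") is not None]


image_only = keep_image_instances(records)
```

Applied to the two sample records, only the first one survives the filter.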
ScienceQA is a multimodal science question-answering dataset. Its questions cover domains such as chemistry, biology, physics, earth science, engineering, geography, history, civics, economics, global studies, grammar, writing, and vocabulary, grouped under natural science, language science, and social science. It is primarily intended for multimodal multiple-choice tasks and supports question answering (multiple choice, closed-domain, open-domain), visual question answering, and multi-class classification. The dataset was created to diagnose the multi-hop reasoning capability and explainability of AI systems, especially in scientific question answering. The language is English, and the dataset falls in the 10K to 100K instance size range.
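To make the multiple-choice setup concrete, here is a minimal sketch of how one record could be turned into a text prompt and scored. It assumes that `answer` is a zero-based index into `choices`; the prompt template, the helper names (`format_prompt`, `gold_letter`, `accuracy`), and the sample record are hypothetical and are not the template used by `lmms-eval`. The image itself would be passed to the model separately.

```python
from string import ascii_uppercase
from typing import Any, Dict, List


def format_prompt(row: Dict[str, Any]) -> str:
    """Build a lettered multiple-choice prompt from one record."""
    lines = []
    if row.get("hint"):  # the hint is optional context
        lines.append(f"Context: {row['hint']}")
    lines.append(f"Question: {row['question']}")
    for letter, choice in zip(ascii_uppercase, row["choices"]):
        lines.append(f"{letter}. {choice}")
    lines.append("Answer with the letter of the correct option.")
    return "\n".join(lines)


def gold_letter(row: Dict[str, Any]) -> str:
    """Map the answer index to its option letter (assumes a zero-based index)."""
    return ascii_uppercase[row["answer"]]


def accuracy(rows: List[Dict[str, Any]], predictions: List[str]) -> float:
    """Fraction of predicted letters that match the gold letters."""
    if not rows:
        return 0.0
    correct = sum(
        gold_letter(row) == pred.strip().upper()
        for row, pred in zip(rows, predictions)
    )
    return correct / len(rows)


# Hypothetical record, invented for illustration.
sample = {
    "question": "Which of these states is farthest north?",
    "choices": ["Florida", "Maine", "Texas"],
    "answer": 1,
    "hint": "",
}
prompt = format_prompt(sample)
```

Because every question reduces to picking one option letter, the same scoring also covers the multi-class classification view of the task.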