stemdataset/STEM
The STEM dataset is a multimodal benchmark for testing neural models on science, technology, engineering, and mathematics (STEM) skills. It contains 448 skills and 1,073,146 questions spanning all four STEM subjects. Unlike existing datasets, it requires models to understand multimodal vision-language information and is based on K-12 curricula. The dataset is split into training, validation, and test sets; the test set's ground-truth answers are hidden, so test predictions are scored through a leaderboard submission. Each entry is a multimodal multiple-choice question with a problem description, an optional image, answer options, and the index of the correct answer.
Dataset description and usage context
STEM Dataset Overview
Basic Information
- License: Apache‑2.0
- Language: English
- Scale: 1M < n < 10M
- Tags: STEM, Benchmark
Content
- Type: Multimodal multiple‑choice
- Subjects: Science, Technology, Engineering, Mathematics
- Number of Skills: 448
- Number of Questions: 1,073,146
- Splits: Training, Validation, Test
- Training Size: 644,797 questions
- Validation Size: 214,272 questions
- Test Size: 214,077 questions
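As a quick sanity check on the figures above, the three split sizes should sum to the stated total, and their shares work out to roughly a 60/20/20 partition. A minimal sketch in plain Python, using only the numbers listed in this card (the variable names are illustrative, and the split keys follow the schema's `train`/`valid`/`test` naming):

```python
# Split sizes as listed in this dataset card.
SPLIT_SIZES = {"train": 644_797, "valid": 214_272, "test": 214_077}
TOTAL_QUESTIONS = 1_073_146

# The splits should account for every question exactly once.
assert sum(SPLIT_SIZES.values()) == TOTAL_QUESTIONS

# Share of each split as a percentage of the whole dataset.
split_share = {
    name: 100 * size / TOTAL_QUESTIONS for name, size in SPLIT_SIZES.items()
}

for name, share in split_share.items():
    print(f"{name}: {share:.1f}%")
```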
Features
- Schema:
    DatasetDict({
        train: Dataset({
            features: [subject, grade, skill, pic_choice, pic_prob, problem, problem_pic, choices, choices_pic, answer_idx],
            num_rows: 644797
        })
        valid: Dataset({
            features: [subject, grade, skill, pic_choice, pic_prob, problem, problem_pic, choices, choices_pic, answer_idx],
            num_rows: 214272
        })
        test: Dataset({
            features: [subject, grade, skill, pic_choice, pic_prob, problem, problem_pic, choices, choices_pic, answer_idx],
            num_rows: 214077
        })
    })
- Feature Description:
  - subject: subject area
  - grade: educational grade level
  - skill: specific skill identifier
  - pic_choice: whether options are images
  - pic_prob: whether the problem includes an image
  - problem: textual description of the problem
  - problem_pic: associated image for the problem
  - choices: textual options
  - choices_pic: image options (if any)
  - answer_idx: index of the correct answer
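The fields described above can be explored with a short script. The following is a minimal sketch, not an official loader: it assumes the dataset is hosted on the Hugging Face Hub under the ID `stemdataset/STEM` (taken from this card's title) and that the `datasets` library is installed. The helper names (`load_stem_split`, `missing_features`, `format_question`) and the sample record are hypothetical, and the value types shown (booleans for the `pic_*` flags, a list of strings for `choices`) are assumptions that the card does not confirm.

```python
from string import ascii_uppercase

# Column names as listed in the schema of this card.
EXPECTED_FEATURES = [
    "subject", "grade", "skill", "pic_choice", "pic_prob",
    "problem", "problem_pic", "choices", "choices_pic", "answer_idx",
]


def load_stem_split(split="train"):
    """Load one split from the Hugging Face Hub (needs `pip install datasets`)."""
    from datasets import load_dataset  # third-party; imported lazily
    # Assumption: the Hub repository ID matches this card's title.
    return load_dataset("stemdataset/STEM", split=split)


def missing_features(column_names):
    """Return the expected columns that are absent from `column_names`."""
    present = set(column_names)
    return [name for name in EXPECTED_FEATURES if name not in present]


def format_question(record):
    """Render the text fields of one record as a lettered multiple-choice prompt.

    Images (problem_pic, choices_pic) are ignored here; a multimodal model
    would need them passed separately.
    """
    lines = [record["problem"]]
    for letter, choice in zip(ascii_uppercase, record["choices"]):
        lines.append(f"{letter}. {choice}")
    return "\n".join(lines)


# Hypothetical record, invented for illustration; not taken from the dataset.
example = {
    "subject": "math",
    "grade": "grade-1",
    "skill": "addition",
    "pic_choice": False,
    "pic_prob": False,
    "problem": "What is 2 + 3?",
    "problem_pic": None,
    "choices": ["4", "5", "6"],
    "choices_pic": None,
    "answer_idx": 1,
}

print(missing_features(example.keys()))
print(format_question(example))
```

With real data, `load_stem_split("valid")` would return a split whose `column_names` can be passed to `missing_features` before any record is formatted. Questions whose options are images (`pic_choice`) would need `choices_pic` handled separately.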
Use Cases
- Evaluation: Use the dataset's accompanying evaluation code to score model predictions.
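Since every question is multiple-choice with a single `answer_idx`, a natural local metric is accuracy on the training or validation split, where answers are available. The sketch below is an illustration under that assumption, not the dataset's official evaluation script; the function name and the sample values are hypothetical. Test-set predictions would still need to go through the leaderboard.

```python
def accuracy(predicted_indices, answer_indices):
    """Fraction of questions whose predicted option index equals answer_idx."""
    if len(predicted_indices) != len(answer_indices):
        raise ValueError("predictions and answers must have the same length")
    if not answer_indices:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(int(p == a) for p, a in zip(predicted_indices, answer_indices))
    return correct / len(answer_indices)


# Hypothetical predictions and labels, for illustration only.
answers = [1, 0, 2, 3]
predictions = [1, 0, 1, 3]
print(accuracy(predictions, answers))
```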
Contact
- Email: stemdataset@gmail.com