DARE
DARE (Diverse Visual Question Answering with Robustness Evaluation) is a carefully curated multiple‑choice VQA benchmark. It evaluates visual‑language models across five categories and includes four robustness assessments based on prompt, answer‑option subset, output format, and number of correct answers. The validation split contains images, questions, answer options, and correct answers, while the test split hides correct answers to prevent leakage.
Dataset Information
Features
- id: string
- instance_id: int64
- question: string
- answer: list of strings (the correct options)
- A: string
- B: string
- C: string
- D: string
- category: string
- img: image
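As a sketch of this schema, a record might look like the following. The values are invented for illustration, and it is an assumption here that the `answer` list stores the letters of the correct options (the list type is what allows the n_correct configuration to have several correct answers per question):

```python
# Hypothetical record illustrating the feature schema above.
# Values are made up; real records come from the parquet files,
# and "img" would be an image object rather than a string.
record = {
    "id": "sample-0001",
    "instance_id": 12345,
    "question": "Which objects appear in the image?",
    "answer": ["A", "C"],  # list of strings: supports multiple correct answers
    "A": "a dog",
    "B": "a plane",
    "C": "a tree",
    "D": "a boat",
    "category": "object recognition",
    # "img": <image>  (omitted in this sketch)
}

# Collect the option texts and resolve the correct ones.
options = {letter: record[letter] for letter in "ABCD"}
correct_texts = [options[letter] for letter in record["answer"]]
```

Here `correct_texts` would resolve to `["a dog", "a tree"]` for the sample record.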
Configurations
- 1_correct:
  - validation: 1_correct/validation/0000.parquet
  - test: 1_correct/test/0000.parquet
- 1_correct_var:
  - validation: 1_correct_var/validation/0000.parquet
  - test: 1_correct_var/test/0000.parquet
- n_correct:
  - validation: n_correct/validation/0000.parquet
  - test: n_correct/test/0000.parquet
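The file layout above follows a uniform `config/split/0000.parquet` pattern, so a small hypothetical helper (the function name and validation logic are ours, not part of the dataset) can map a configuration and split to its shard path:

```python
# Hypothetical helper reflecting the layout listed above:
# each configuration/split pair has a single shard named 0000.parquet.
CONFIGS = ("1_correct", "1_correct_var", "n_correct")
SPLITS = ("validation", "test")


def parquet_path(config: str, split: str) -> str:
    """Return the relative parquet path for a known config/split pair."""
    if config not in CONFIGS or split not in SPLITS:
        raise ValueError(f"unknown config/split: {config}/{split}")
    return f"{config}/{split}/0000.parquet"
```

For example, `parquet_path("n_correct", "test")` yields `n_correct/test/0000.parquet`, which could then be read with any parquet reader (e.g. `pandas.read_parquet`) once the files are downloaded.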