Tags: Open Source Community, Computer Vision, Visual Question Answering

Phando/vqa_v2

The vqa_v2 dataset provides eight features per record: question type, multiple-choice answer, a list of answers (each with the answer text, answer confidence, and answer ID), image ID, answer type, question ID, question, and image. It is split into training, validation, and test sets containing 443,757, 214,354, and 447,793 samples respectively. The download size is 34,818,002,031 bytes (about 34.8 GB), and the total dataset size is reported as 171,555,262,245.114 bytes (about 171.6 GB).

Source: Hugging Face
Created: Nov 28, 2025
Updated: Dec 7, 2023
Dataset Overview

Dataset Configuration

  • Configuration Name: default
  • Data File Paths:
    • Training set: data/train-*
    • Validation set: data/validation-*
    • Test set: data/test-*
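
The file patterns above can be read as a mapping from split name to a glob over shard files. The following is a minimal sketch of that mapping, not the loader's actual implementation: the short split names (`train`, `validation`, `test`) are inferred from the patterns, and the shard file names are hypothetical placeholders, since the page does not list the real files.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Split name -> glob pattern, copied from the "default" configuration above.
DATA_FILES = {
    "train": "data/train-*",
    "validation": "data/validation-*",
    "test": "data/test-*",
}


def split_for(path: str) -> Optional[str]:
    """Return the split whose pattern matches `path`, or None if none does."""
    for split, pattern in DATA_FILES.items():
        if fnmatchcase(path, pattern):
            return split
    return None


# Hypothetical shard names, for illustration only.
example_paths = [
    "data/train-00000-of-00100.parquet",
    "data/validation-00003-of-00050.parquet",
    "data/test-00099-of-00100.parquet",
    "README.md",
]

for path in example_paths:
    print(path, "->", split_for(path))
```

With the Hugging Face `datasets` library, a call such as `load_dataset("Phando/vqa_v2", split="validation", streaming=True)` would typically resolve these patterns for you, assuming the repository is still published under that id.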

Dataset Information

  • Features:
    • question_type: string type
    • multiple_choice_answer: string type
    • answers: list type
      • answer: string type
      • answer_confidence: string type
      • answer_id: 64-bit integer type
    • image_id: 64-bit integer type
    • answer_type: string type
    • question_id: 64-bit integer type
    • question: string type
    • image: image type
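
The feature list above can be mirrored as a typed record, which is handy for checking rows before they enter a pipeline. This is a sketch under stated assumptions: the class names `Answer` and `VqaRecord`, the helper `missing_fields`, and every value in `sample` are hypothetical, and the `image` field is left as `Any` because its in-memory type depends on how the data is loaded.

```python
from typing import Any, List, TypedDict


class Answer(TypedDict):
    answer: str
    answer_confidence: str
    answer_id: int  # 64-bit integer in the source schema


class VqaRecord(TypedDict):
    question_type: str
    multiple_choice_answer: str
    answers: List[Answer]
    image_id: int  # 64-bit integer in the source schema
    answer_type: str
    question_id: int  # 64-bit integer in the source schema
    question: str
    image: Any  # decoded image object; exact type depends on the loader


def missing_fields(record: dict) -> list:
    """Return the sorted names of top-level features absent from `record`."""
    expected = VqaRecord.__annotations__
    return sorted(name for name in expected if name not in record)


# Hypothetical record: all values are invented for illustration.
sample: VqaRecord = {
    "question_type": "what color",
    "multiple_choice_answer": "red",
    "answers": [{"answer": "red", "answer_confidence": "yes", "answer_id": 1}],
    "image_id": 1,
    "answer_type": "other",
    "question_id": 1000,
    "question": "What color is the bus?",
    "image": None,  # placeholder instead of real image data
}

print(missing_fields(sample))
print(missing_fields({"question": "What color is the bus?"}))
```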

Dataset Splits

  • Training set:
    • Bytes: 67,692,137,168.704
    • Samples: 443,757
  • Validation set:
    • Bytes: 33,693,404,566.41
    • Samples: 214,354
  • Test set:
    • Bytes: 70,169,720,510.0
    • Samples: 447,793

Dataset Size

  • Download Size: 34,818,002,031 bytes
  • Dataset Size: 171,555,262,245.114 bytes
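
The size figures can be cross-checked with a little arithmetic: the three split byte counts add up to the reported dataset size, the sample counts total 1,105,904, and the download is roughly one fifth of the dataset size. The sketch below uses only numbers stated on this page; the variable names are illustrative, and `Decimal` is used so the fractional byte values are summed exactly.

```python
from decimal import Decimal

# Figures copied from the split and size listings above.
SPLITS = {
    "train": {"bytes": Decimal("67692137168.704"), "samples": 443_757},
    "validation": {"bytes": Decimal("33693404566.41"), "samples": 214_354},
    "test": {"bytes": Decimal("70169720510.0"), "samples": 447_793},
}
DOWNLOAD_SIZE = Decimal("34818002031")
DATASET_SIZE = Decimal("171555262245.114")

# Sum the per-split figures.
total_bytes = sum((s["bytes"] for s in SPLITS.values()), Decimal(0))
total_samples = sum(s["samples"] for s in SPLITS.values())

# Ratio of download size to dataset size, and mean bytes per sample.
download_ratio = DOWNLOAD_SIZE / DATASET_SIZE
avg_bytes_per_sample = total_bytes / total_samples

print("split bytes match dataset size:", total_bytes == DATASET_SIZE)
print("total samples:", total_samples)
print("download / dataset ratio:", round(download_ratio, 3))
print("average bytes per sample:", round(avg_bytes_per_sample))
```

The fractional byte counts come from the page as given; they are kept verbatim here rather than rounded.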