lmms-lab/llava-interleave-bench
LLaVA‑Interleave Bench is a comprehensive multi‑image dataset collected from public datasets or generated via the GPT‑4V API. The dataset aims to evaluate the interleaved multi‑image reasoning capability of large multimodal models. It was collected in April 2024 and released in June 2024. Its primary use is for research on large multimodal models and chatbots, targeting researchers and enthusiasts in computer vision, natural language processing, machine learning, and AI.
Dataset description and usage context
LLaVA‑Next‑Interleave Bench Dataset Overview
Dataset Details
Dataset Type: LLaVA‑Next‑Interleave Bench is a collection of multi‑image datasets gathered from public sources or generated via the GPT‑4V API. It is intended to assess the multi‑image processing capabilities of large multimodal models.
Dataset Date: The dataset was collected in April 2024, using both public data collection and GPT‑4V API generation.
Dataset Structure: After downloading and extracting, the following structure is present:
interleave_data
├── Split1
│   ├── ...
│   └── ...
├── Split2
│   ├── ...
│   └── ...
├── multi_image_in_domain.json
├── multi_image_out_domain.json
└── multi_view_in_domain.json
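The layout above can be sketched in code. The following is a minimal, hedged example: the three annotation filenames are taken from the structure shown here, but the internal JSON schema is not specified in this card, so the helper only loads each top-level file and returns its parsed contents. The demonstration runs against a stand-in temporary directory with empty placeholder files, not the real benchmark archive.

```python
import json
import os
import tempfile

# Top-level annotation files named in the extracted structure above.
ANNOTATION_FILES = [
    "multi_image_in_domain.json",
    "multi_image_out_domain.json",
    "multi_view_in_domain.json",
]

def load_annotations(root):
    """Parse each top-level annotation file present under `root`.

    The JSON schema is not documented here, so no structure beyond
    valid JSON is assumed.
    """
    annotations = {}
    for name in ANNOTATION_FILES:
        path = os.path.join(root, name)
        if os.path.exists(path):
            with open(path, encoding="utf-8") as f:
                annotations[name] = json.load(f)
    return annotations

# Demonstration: build a stand-in directory with empty lists as placeholders.
demo_root = tempfile.mkdtemp()
for name in ANNOTATION_FILES:
    with open(os.path.join(demo_root, name), "w", encoding="utf-8") as f:
        json.dump([], f)

loaded = load_annotations(demo_root)
```

In real use, `root` would point at the extracted `interleave_data` directory, and the parsed records would then be dispatched to whichever split (`Split1`, `Split2`, ...) the evaluation targets.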
License: The dataset is released under the Creative Commons Attribution 4.0 International (CC BY 4.0) license. Because part of the data was generated via the GPT‑4V API, users must additionally comply with OpenAI's usage policies.
Issues and Feedback: For questions or comments, please contact fliay@connect.ust.hk.
Intended Use
Primary Intended Use: The LLaVA‑Next‑Interleave Bench dataset is intended primarily for research on large multimodal models and chatbots.
Primary Intended Users: The dataset is aimed at researchers and enthusiasts in computer vision, natural language processing, machine learning, and artificial intelligence.