Dataset asset · Open Source Community · Benchmarking · Multimodal Large Language Models
MME-RealWorld-lite-lmms-eval
MME‑RealWorld is a benchmark dataset for multimodal large language models (MLLMs), containing 13,366 high‑resolution images and 29,429 manually annotated question‑answer pairs covering 43 tasks across five real‑world scenarios. It addresses the limitations of existing benchmarks in practical applications by offering large scale, high annotation quality, and genuinely challenging tasks. A Chinese version (MME‑RealWorld‑CN) with 5,917 QA pairs is also provided.
Source
huggingface
Created
Nov 13, 2024
Updated
Nov 14, 2024
Overview
Dataset description and usage context
MME‑RealWorld‑lite‑lmms‑eval Dataset Overview
Dataset Information
Features
- bytes: string
- path: string
- index: integer
- question: string
- multi‑choice options: list of strings
- answer: string
- category: string
- l2‑category: string
Data Split
- train: 1,919 samples, 1,990,753,320 bytes
Size
- Download size: 1,880,779,075 bytes
- Dataset size: 1,990,753,320 bytes
Configuration
- default: data files located at
data/train-*
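The feature schema above can be sketched in code. The record below is illustrative only: the field values, category names, and the `option_text`/`is_correct` helpers are assumptions for demonstration, not drawn from the dataset itself; only the field names follow the Features list.

```python
# Hypothetical record following the card's feature schema
# (index, question, multi-choice options, answer, category, l2-category).
# The image fields (bytes, path) are omitted here for brevity.
record = {
    "index": 0,
    "question": "How many cars are visible in the image?",
    "multi-choice options": [
        "(A) 131", "(B) 132", "(C) 133", "(D) 134",
        "(E) This image does not feature cars",
    ],
    "answer": "C",
    "category": "Monitoring",       # illustrative category name
    "l2-category": "counting",      # illustrative sub-task name
}

def option_text(record: dict, letter: str) -> str:
    """Resolve a choice letter to its option text.

    Options are stored as strings like "(A) 131"; find the one whose
    leading tag matches the requested letter and strip the tag.
    """
    prefix = f"({letter.upper()})"
    for opt in record["multi-choice options"]:
        if opt.startswith(prefix):
            return opt[len(prefix):].strip()
    raise KeyError(f"no option labeled {letter!r}")

def is_correct(record: dict, predicted_letter: str) -> bool:
    """Score a model's single-letter choice against the gold answer."""
    return predicted_letter.strip().upper() == record["answer"]

print(option_text(record, "C"))   # the text behind the gold answer
print(is_correct(record, "c"))    # case-insensitive match
```

In practice the `default` configuration would be loaded with the Hugging Face `datasets` library (e.g. `load_dataset("...", split="train")`, matching the `data/train-*` files), after which each row can be scored with a helper like `is_correct` above.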
Dataset Details
Characteristics
- Scale: 29,429 manually annotated QA pairs collected from 32 volunteers, covering 43 sub‑tasks in 5 real‑world scenarios—currently the largest fully human‑annotated benchmark.
- Quality:
- Resolution: Average image resolution of 2000 × 1500 px, the highest among competitors.
- Annotation: All annotations are performed and cross‑checked by a professional team to ensure quality.
- Task Difficulty and Real‑World Relevance: Even the best‑performing models achieve below 60% accuracy. Many tasks are far harder than those in traditional benchmarks, e.g., counting 133 cars in a surveillance video or identifying a small object on a 5,000 × 5,000 px remote‑sensing map.
- MME‑RealWorld‑CN: For Chinese scenarios, 5,917 QA pairs were collected from Chinese volunteers, addressing translation‑induced issues.
Related Links
- Paper: https://arxiv.org/abs/2408.13257
- Code: https://github.com/yfzhang114/MME-RealWorld
- Project page: https://mme-realworld.github.io/