Dataset asset | Open Source Community | Multimodal Learning | Chinese OCR

SWHL/ChineseOCRBench

Chinese OCRBench is a dataset designed for evaluating Chinese OCR tasks; it fills an evaluation gap for multimodal large language models in this domain. It comprises 3,410 images and 3,410 question-answer pairs sourced from the ReCTS and ESTVQA datasets. Each annotation records fields such as the image filename, question, and answer, which makes the dataset suitable for OCR benchmarking and research.

Source

Hugging Face

Created

Nov 28, 2025

Updated

Apr 30, 2024

Dataset Overview

Name

Chinese OCRBench

Purpose

Dedicated benchmark for Chinese OCR tasks, addressing the lack of Chinese evaluation for multimodal LLMs.

Composition

3,410 images and 3,410 QA pairs.
Sources: ReCTS and ESTVQA datasets.

Detailed Composition

Dataset	Images	Questions
ESTVQA	709	709
ReCTS	2,701	2,701
Total	3,410	3,410
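
The per-source counts in the table can be cross-checked against loaded samples. The sketch below is a minimal illustration, assuming each sample is a dict-like record with a dataset_name field; the helper name count_by_source and the toy rows (including the "ReCTS" label) are hypothetical and stand in for real data.

```python
from collections import Counter
from typing import Iterable, Mapping


def count_by_source(rows: Iterable[Mapping[str, object]]) -> Counter:
    """Count samples per value of the `dataset_name` field."""
    return Counter(str(row["dataset_name"]) for row in rows)


# Toy rows standing in for real samples (hypothetical values).
toy_rows = [
    {"dataset_name": "ESTVQA_cn", "id": 0},
    {"dataset_name": "ESTVQA_cn", "id": 1},
    {"dataset_name": "ReCTS", "id": 2},
]
counts = count_by_source(toy_rows)
print(counts)

# The totals stated in the table are internally consistent.
assert 709 + 2701 == 3410
```

With the real dataset, counting over the dataset_name column alone (rather than iterating full rows) would likely avoid decoding every image.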

Annotation Format

Each sample includes: dataset_name, id, question, answers, type, file_name.
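
One way to make the annotation format explicit in code is a small typed record. This is a sketch only: the field names come from the list above, but the Python types, the class name OCRBenchSample, the helper parse_sample, and the file name in the example are assumptions, not part of the dataset's documentation.

```python
from dataclasses import dataclass
from typing import Any, Mapping

# Field names as listed in the annotation format.
REQUIRED_FIELDS = ("dataset_name", "id", "question", "answers", "type", "file_name")


@dataclass(frozen=True)
class OCRBenchSample:
    # The types below are assumptions made for illustration.
    dataset_name: str
    id: int
    question: str
    answers: str
    type: str
    file_name: str


def parse_sample(record: Mapping[str, Any]) -> OCRBenchSample:
    """Build a typed sample, failing loudly if a listed field is absent."""
    missing = [name for name in REQUIRED_FIELDS if name not in record]
    if missing:
        raise KeyError(f"record is missing fields: {missing}")
    return OCRBenchSample(**{name: record[name] for name in REQUIRED_FIELDS})


# Example record; the file name is a hypothetical placeholder.
sample = parse_sample({
    "dataset_name": "ESTVQA_cn",
    "id": 0,
    "question": "这家店的名字是什么?",
    "answers": "禾不锈钢",
    "type": "Chinese",
    "file_name": "example.jpg",
})
print(sample.question, "->", sample.answers)
```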

Usage

The dataset is recommended for use together with the MultimodalOCR evaluation script.
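
For a quick sanity check before running a full evaluation, a simple containment-based scorer can be sketched as follows. This is not the MultimodalOCR script; it is a simplified stand-in that assumes a prediction counts as correct when a whitespace-stripped reference answer appears inside it. The function names normalize, is_correct, and accuracy are hypothetical.

```python
from typing import Iterable, Sequence, Union


def normalize(text: str) -> str:
    """Remove all whitespace and lowercase the text."""
    return "".join(text.split()).lower()


def is_correct(prediction: str, answers: Union[str, Iterable[str]]) -> bool:
    """Return True if any non-empty reference answer occurs in the prediction."""
    if isinstance(answers, str):
        answers = [answers]
    pred = normalize(prediction)
    return any(normalize(a) in pred for a in answers if normalize(a))


def accuracy(predictions: Sequence[str], references: Sequence[Union[str, Iterable[str]]]) -> float:
    """Fraction of predictions judged correct; 0.0 for empty input."""
    pairs = list(zip(predictions, references))
    if not pairs:
        return 0.0
    return sum(is_correct(p, a) for p, a in pairs) / len(pairs)


# Toy predictions scored against the sample answer shown in this page.
preds = ["这家店叫禾不锈钢。", "不知道"]
refs = ["禾不锈钢", "禾不锈钢"]
print(accuracy(preds, refs))
```

Containment matching is lenient toward verbose model outputs; a stricter exact-match rule would only require replacing the `in` test with an equality test.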

Loading Example

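from datasets import load_dataset

# Download the benchmark from the Hugging Face Hub (requires the `datasets` package).
dataset = load_dataset("SWHL/ChineseOCRBench")

# Select the test split and print the first sample to inspect its fields.
test_data = dataset["test"]
print(test_data[0])
# Example output:
# {image: <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=760x1080 at 0x12544E770>, dataset_name: ESTVQA_cn, id: 0, question: 这家店的名字是什么?, answers: 禾不锈钢, type: Chinese}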

License

Apache-2.0

Language

Chinese

Size

1K < n < 10K
