High Quality Data

Dataset Hub

Explore high-quality datasets for your AI and machine learning projects.

SWHL/ChineseOCRBench

Chinese OCRBench is a benchmark dataset for evaluating Chinese OCR tasks, built to fill the evaluation gap for multimodal large language models in this domain. It comprises 3,410 images and 3,410 question-answer pairs sourced from the ReCTS and ESTVQA datasets. Each annotation records the image filename, the question, the answer, and related fields, which makes the dataset suitable for OCR benchmarking and research.

Source: Hugging Face
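
As a rough illustration of how question-answer annotations like these could be used for benchmarking, the sketch below scores model outputs against reference answers. The field names (`image_name`, `question`, `answer`), the sample records, and the containment-based matching rule are all assumptions made for this example; they are not taken from the dataset's documentation, so check the actual schema and the benchmark's own metric before relying on it.

```python
from typing import Dict, List


def normalize(text: str) -> str:
    """Drop all whitespace so spacing differences do not affect matching."""
    return "".join(text.split())


def score_predictions(records: List[Dict[str, str]],
                      predictions: Dict[str, str]) -> float:
    """Return the fraction of records whose prediction contains the answer.

    The containment rule is an assumption for this sketch, not the
    benchmark's documented metric.
    """
    if not records:
        return 0.0
    correct = 0
    for record in records:
        key = record["image_name"]  # hypothetical field name
        predicted = normalize(predictions.get(key, ""))
        expected = normalize(record["answer"])  # hypothetical field name
        if expected and expected in predicted:
            correct += 1
    return correct / len(records)


# Made-up records for illustration only; they are not from the dataset.
sample_records = [
    {"image_name": "img_0001.jpg", "question": "图中的文字是什么?", "answer": "欢迎光临"},
    {"image_name": "img_0002.jpg", "question": "招牌上写了什么?", "answer": "书店"},
]

# Made-up model outputs, keyed by image filename.
sample_predictions = {
    "img_0001.jpg": "图中的文字是 欢迎光临",
    "img_0002.jpg": "花店",
}

accuracy = score_predictions(sample_records, sample_predictions)
print(accuracy)  # 0.5: the first prediction contains its answer, the second does not
```

Keying predictions by image filename works here because the listing reports one question-answer pair per image (3,410 of each); a dataset with several questions per image would need a different key.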