Chinese OCRBench is a dataset designed for evaluating Chinese OCR tasks; it fills an evaluation gap for multimodal large language models in this domain. It comprises 3,410 images and 3,410 question-answer pairs sourced from the ReCTS and ESTVQA datasets. Each annotation includes the image filename, the question, and the answer, among other fields, making the dataset suitable for OCR benchmarking and research.
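To make the annotation layout concrete, here is a minimal Python sketch of how question-answer records of this kind might be parsed and scored. The on-disk format is not stated here, so the JSON array layout, the field names `image`, `question`, and `answer`, the sample filenames and texts, and the exact-match metric are all assumptions for illustration rather than the benchmark's official format or scoring rule.

```python
import json
from dataclasses import dataclass


@dataclass
class QAPair:
    image: str     # image filename (field name is an assumption)
    question: str  # question asked about the image
    answer: str    # reference answer text


def parse_records(raw: str) -> list[QAPair]:
    """Parse a JSON array of annotation objects (assumed layout)."""
    return [QAPair(r["image"], r["question"], r["answer"]) for r in json.loads(raw)]


def exact_match_accuracy(records: list[QAPair], predictions: dict[str, str]) -> float:
    """Fraction of records whose prediction equals the reference answer.

    Predictions are keyed by image filename, which assumes one question
    per image. Leading and trailing whitespace is ignored.
    """
    if not records:
        return 0.0
    hits = sum(
        1 for r in records
        if predictions.get(r.image, "").strip() == r.answer.strip()
    )
    return hits / len(records)


# Hypothetical sample data, not taken from the dataset.
sample = json.dumps([
    {"image": "img_0001.jpg", "question": "图中的文字是什么？", "answer": "欢迎光临"},
    {"image": "img_0002.jpg", "question": "图中的文字是什么？", "answer": "出口"},
])

records = parse_records(sample)
predictions = {"img_0001.jpg": "欢迎光临", "img_0002.jpg": "入口"}
accuracy = exact_match_accuracy(records, predictions)
print(f"{len(records)} records, exact-match accuracy = {accuracy}")
```

Exact match is used here only because it is the simplest metric to state; an actual evaluation may normalize the text or accept partial matches.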