Chinese-SimpleQA

Chinese SimpleQA is a Chinese-language benchmark for evaluating the factual correctness of language models' answers to short questions. It has five defining characteristics: it is in Chinese, diverse, high quality, static (reference answers do not change over time), and easy to evaluate. The dataset contains 3,000 high-quality questions covering six major topics and 99 fine-grained sub-topics, ranging from the humanities to science and engineering. It is intended to help developers assess the factual accuracy of their models in Chinese and to support algorithm research.

Source: Hugging Face
Created: Nov 12, 2024
Updated: Nov 17, 2024

Chinese SimpleQA Dataset Overview

Basic Information

License: cc-by-nc-sa-4.0
Task Category: Question Answering
Language: Chinese
Dataset Name: Chinese SimpleQA
Scale: 10K < n < 100K

Introduction

Goal: Evaluate the factual accuracy of language models' answers to short questions.
Key Features:
- Chinese: Focuses on the Chinese language, assessing LLM factuality in Chinese.
- Diversity: Covers six major topics ("Chinese Culture", "Humanities", "Engineering, Technology & Applied Science", "Life, Art & Culture", "Society", and "Natural Science") with 99 sub-topics in total.
- High Quality: A strict quality-control process checks the questions and reference answers for accuracy.
- Static: Reference answers do not change over time.
- Easy Evaluation: Questions and answers are short, so responses can be graded quickly through existing LLM APIs (e.g., OpenAI).
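
Because each item is a short question with a single reference answer, the grading loop can be very small. The sketch below is one possible shape for it, not the benchmark's official evaluation code. The names `normalize`, `string_match_judge`, and `accuracy` are hypothetical, and the judge shown is a plain normalized string match standing in for the LLM-API grader mentioned above; a real run would likely pass a function that calls such an API in its place.

```python
import unicodedata
from typing import Callable, Iterable, Tuple

# A judge takes (question, reference_answer, model_answer) and returns True if correct.
Judge = Callable[[str, str, str], bool]


def normalize(text: str) -> str:
    """NFKC-normalize, lowercase, and drop punctuation, whitespace, and control chars."""
    text = unicodedata.normalize("NFKC", text).lower()
    return "".join(
        ch for ch in text
        if not unicodedata.category(ch).startswith(("P", "Z", "C"))
    )


def string_match_judge(question: str, reference: str, prediction: str) -> bool:
    """Stand-in judge (an assumption, not the official grader):
    correct if the normalized reference appears inside the normalized prediction."""
    ref = normalize(reference)
    return bool(ref) and ref in normalize(prediction)


def accuracy(
    items: Iterable[Tuple[str, str]],
    answer_fn: Callable[[str], str],
    judge: Judge = string_match_judge,
) -> float:
    """Fraction of (question, reference) pairs that the judge marks as correct."""
    items = list(items)
    if not items:
        return 0.0
    correct = sum(judge(q, ref, answer_fn(q)) for q, ref in items)
    return correct / len(items)


if __name__ == "__main__":
    # Illustrative items only; these are NOT taken from the dataset.
    toy_items = [("中国的首都是哪座城市？", "北京"), ("水的化学式是什么？", "H2O")]
    canned = {"中国的首都是哪座城市？": "中国的首都是北京。", "水的化学式是什么？": "我不确定。"}
    print(accuracy(toy_items, canned.__getitem__))
```

Keeping the judge as a parameter means the same loop works with a string match for quick smoke tests and with an LLM-based grader for real scoring.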

Content

Topic Coverage: 6 major topics, 99 fine‑grained sub‑topics.
Number of Questions: 3,000 high‑quality questions.
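
With six major topics, a per-topic breakdown is often more informative than one overall score. The following is a minimal sketch of such a tally. It assumes each graded result is available as a `(topic, is_correct)` pair; that shape and the function name `accuracy_by_topic` are illustrative assumptions, since this page does not describe the dataset's field names.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# The six major topics listed in the dataset description.
MAJOR_TOPICS = (
    "Chinese Culture",
    "Humanities",
    "Engineering, Technology & Applied Science",
    "Life, Art & Culture",
    "Society",
    "Natural Science",
)


def accuracy_by_topic(results: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Return the fraction of correct answers for every topic seen in `results`.

    `results` is a hypothetical stream of (topic, is_correct) pairs produced
    by whatever grading step is in use.
    """
    totals: Dict[str, int] = defaultdict(int)
    hits: Dict[str, int] = defaultdict(int)
    for topic, is_correct in results:
        totals[topic] += 1
        hits[topic] += bool(is_correct)
    return {topic: hits[topic] / totals[topic] for topic in totals}
```

Topics that never appear in the input are simply absent from the returned dictionary, which makes gaps in coverage easy to spot by comparing its keys against `MAJOR_TOPICS`.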

Citation

@misc{he2024chinesesimpleqachinesefactuality,
  title={Chinese SimpleQA: A Chinese Factuality Evaluation for Large Language Models},
  author={Yancheng He and Shilong Li and Jiaheng Liu and Yingshui Tan and Weixun Wang and Hui Huang and Xingyuan Bu and Hangyu Guo and Chengwei Hu and Boren Zheng and Zhuoran Lin and Xuepeng Liu and Dekai Sun and Shirong Lin and Zhicheng Zheng and Xiaoyong Zhu and Wenbo Su and Bo Zheng},
  year={2024},
  eprint={2411.07140},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2411.07140},
}
