Dataset asset · Open Source Community · Natural Language Processing · Language Model Evaluation

Chinese-SimpleQA

Chinese SimpleQA is a comprehensive Chinese benchmark for evaluating the factual correctness of language models on short questions. It has five defining characteristics: Chinese language, diversity, high quality, static reference answers, and ease of evaluation. The dataset covers six major topics with 99 fine‑grained sub‑topics, spanning the humanities to science and engineering, and contains 3,000 high‑quality questions to help developers assess the factual accuracy of their models in Chinese and to support related research.

Source
huggingface
Created
Nov 12, 2024
Updated
Nov 17, 2024
Availability
Linked source ready
Overview

Dataset description and usage context

Chinese SimpleQA Dataset Overview

Basic Information

  • License: cc‑by‑nc‑sa‑4.0
  • Task Category: Question Answering
  • Language: Chinese
  • Dataset Name: Chinese SimpleQA
  • Scale: 10K < n < 100K

Introduction

  • Goal: Evaluate language models' factual accuracy on short questions.
  • Key Features:
    • Chinese: Dedicated to Chinese language, assessing LLM factuality in Chinese.
    • Diversity: Covers six major topics—"Chinese Culture", "Humanities", "Engineering, Technology & Applied Science", "Life, Art & Culture", "Society", "Natural Science"—with 99 sub‑topics.
    • High Quality: Strict quality control ensures accuracy.
    • Static: Reference answers remain unchanged over time.
    • Easy Evaluation: Short QA pairs enable rapid assessment via existing LLM APIs (e.g., OpenAI).
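The benchmark itself is graded by prompting an existing LLM API (e.g., OpenAI) to judge answers, but the short questions and static references are what make any grading scheme simple. As an offline illustration of that property, here is a minimal, hypothetical sketch that grades a model's short answer against a static reference by normalized exact match (the QA pair below is invented for illustration, not drawn from the dataset):

```python
# Minimal sketch of grading a short answer against a static reference.
# The benchmark's official evaluation uses an LLM judge; this normalized
# exact-match check only illustrates why short QA pairs are easy to grade.

def normalize(text: str) -> str:
    """Lowercase and drop whitespace/punctuation that should not affect grading."""
    return "".join(ch for ch in text.lower() if ch.isalnum())

def grade(model_answer: str, reference: str) -> bool:
    """Return True when the normalized model answer matches the reference."""
    return normalize(model_answer) == normalize(reference)

# Hypothetical short QA pair in the dataset's style:
question = "中国的首都是哪座城市？"  # "Which city is the capital of China?"
reference = "北京"                   # "Beijing"

print(grade("北京", reference))    # True: exact match
print(grade(" 北京 ", reference))  # True: whitespace is normalized away
print(grade("上海", reference))    # False: wrong answer
```

In the actual benchmark the judge model handles paraphrases and partial answers, which a plain string match cannot; the static references simply guarantee that whatever grader is used, the ground truth never drifts over time.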

Content

  • Topic Coverage: 6 major topics, 99 fine‑grained sub‑topics.
  • Number of Questions: 3,000 high‑quality questions.

Citation

@misc{he2024chinesesimpleqachinesefactuality,
  title={Chinese SimpleQA: A Chinese Factuality Evaluation for Large Language Models},
  author={Yancheng He and Shilong Li and Jiaheng Liu and Yingshui Tan and Weixun Wang and Hui Huang and Xingyuan Bu and Hangyu Guo and Chengwei Hu and Boren Zheng and Zhuoran Lin and Xuepeng Liu and Dekai Sun and Shirong Lin and Zhicheng Zheng and Xiaoyong Zhu and Wenbo Su and Bo Zheng},
  year={2024},
  eprint={2411.07140},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2411.07140},
}
