Xiezhi

Xiezhi is a comprehensive assessment suite designed to evaluate broad domain knowledge. It contains 249,587 multiple-choice questions spanning 516 disciplines across 13 topics, plus two subsets (Xiezhi-Specialty and Xiezhi-Interdiscipline), each containing 15,000 questions.

Updated 3/11/2024
arXiv

Description

Dataset Overview

Xiezhi (獬豸) is a comprehensive evaluation suite for assessing the holistic domain knowledge of language models (LMs). It contains 249,587 multiple-choice questions covering 516 disciplines and four difficulty levels.

Details

Question Design

  • Each evaluated LM must select the best answer from 50 options.
  • For every question, besides the correct answer, three distractors are provided, and the remaining 46 options are randomly drawn from all options across the Xiezhi dataset.
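
The 50-option layout described above (1 correct answer + 3 distractors + 46 randomly drawn options) can be sketched as follows. This is a minimal illustration, not the benchmark's own code: the function name `build_options`, its parameters, the de-duplication of the pool, and the final shuffle are all assumptions made for the example.

```python
import random


def build_options(answer, distractors, option_pool, total=50, seed=None):
    """Assemble a candidate list for one question (hypothetical helper).

    answer       -- the correct option
    distractors  -- the hand-picked wrong options (three in Xiezhi)
    option_pool  -- all options collected across the dataset
    total        -- size of the final candidate list (50 in Xiezhi)
    """
    rng = random.Random(seed)
    fixed = [answer, *distractors]

    # Assumption: fillers must not repeat the answer or the distractors,
    # and the pool is de-duplicated before sampling.
    candidates = [o for o in dict.fromkeys(option_pool) if o not in fixed]
    needed = total - len(fixed)
    if needed > len(candidates):
        raise ValueError("option pool is too small to fill the candidate list")

    options = fixed + rng.sample(candidates, needed)
    # Assumption: shuffle so the correct answer has no fixed position.
    rng.shuffle(options)
    return options
```

With three distractors and the default `total=50`, the helper draws exactly 46 fillers from the pool.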

Evaluation Metric

  • Mean Reciprocal Rank (MRR) is used: for each question, the reciprocal of the rank the model assigns to the correct answer is computed, and these values are averaged over all questions.
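
As a rough sketch of the metric, the snippet below computes MRR from per-question rankings. The function name and input layout are illustrative assumptions, as is the choice to score a question as 0 when the correct answer is missing from the ranking; the benchmark's own scoring code may differ.

```python
def mean_reciprocal_rank(ranked_predictions, gold_answers):
    """Average of 1/rank of the correct answer over all questions.

    ranked_predictions -- one list per question, options ordered from the
                          model's most preferred to least preferred
    gold_answers       -- the correct option for each question
    """
    if len(ranked_predictions) != len(gold_answers):
        raise ValueError("need exactly one ranking per gold answer")
    if not gold_answers:
        raise ValueError("cannot compute MRR over zero questions")

    total = 0.0
    for ranking, gold in zip(ranked_predictions, gold_answers):
        if gold in ranking:
            rank = ranking.index(gold) + 1  # ranks are 1-based
            total += 1.0 / rank
        # Assumption: a missing gold answer contributes 0 to the sum.
    return total / len(gold_answers)


rankings = [["b", "a", "c"], ["a", "b", "c"]]
gold = ["a", "a"]
print(mean_reciprocal_rank(rankings, gold))  # (1/2 + 1/1) / 2 = 0.75
```

A model that always ranks the correct answer first scores 1.0. With 50 options per question, ranking the correct answer last every time would give 1/50 = 0.02.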

Sample Data

  • Examples from the Xiezhi‑Specialty and Xiezhi‑Interdiscipline subsets are provided.
  • Few‑shot learning configurations are demonstrated.

Usage

  • Test code for the C-Eval, M3KE, MMLU, Xiezhi-Inter, and Xiezhi-Spec benchmarks is bundled in ./Tester/model_test.py.
  • Run ./Tester/test.sh to start an evaluation.
  • To evaluate on custom data, override the _get_data function in ./Tester/model_test.py.

License

  • This work is released under the MIT License.
  • The Xiezhi dataset itself is under the Creative Commons Attribution‑NonCommercial‑ShareAlike 4.0 International License.

Citation

Please cite the following paper when using the dataset:

@article{gu2023xiezhi,
  title={Xiezhi: An Ever‑Updating Benchmark for Holistic Domain Knowledge Evaluation},
  author={Gu, Zhouhong and Zhu, Xiaoxuan and Ye, Haoning and Zhang, Lin and Wang, Jianchen and Jiang, Sihang and Xiong, Zhuozhi and Li, Zihan and He, Qianyu and Xu, Rui and Huang, Wenhao and Zheng, Weiguo and Feng, Hongwei and Xiao, Yanghua},
  journal={arXiv:2306.05783},
  year={2023}
}

Topics

Knowledge Assessment
Interdisciplinary

Source

Organization: arXiv

Created: 6/9/2023
