haonan-li/cmmlu

CMMLU is a comprehensive Chinese evaluation suite designed to assess massive multitask language understanding in Chinese linguistic and cultural contexts. It covers 67 subjects ranging from basic to advanced professional levels, including STEM fields such as physics and mathematics as well as the humanities and social sciences. Many tasks involve nuanced phrasing and cultural specifics that are hard to translate, and the answers to many tasks are China‑specific and may not apply elsewhere. Each subject provides a development set and a test set; every question is a four‑option multiple‑choice item with a single correct answer.

Updated 7/13/2023
hugging_face

Description

Dataset Overview

Dataset Name

  • CMMLU

Description

  • CMMLU is a comprehensive benchmark for evaluating large language models (LLMs) on advanced knowledge and reasoning abilities within Chinese language and cultural contexts. It spans 67 topics from elementary to advanced professional levels, covering STEM subjects such as physics and mathematics, as well as humanities and social sciences.

Features

  • Multiple‑choice question‑answering (QA) tasks.
  • Each question offers four options with only one correct answer.
  • Many tasks contain context‑specific nuances and phrasing that are difficult to translate.
  • Answers for many tasks are China‑specific and may not be suitable for other regions or languages.
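Because every question has exactly four options and one correct answer, scoring reduces to comparing a predicted letter against the gold letter. The following is a minimal sketch of that idea; the `Item` class and the `accuracy` helper are hypothetical names introduced here for illustration and are not part of the dataset or any official evaluation code.

```python
from dataclasses import dataclass

CHOICES = ("A", "B", "C", "D")


@dataclass(frozen=True)
class Item:
    """One four-option multiple-choice question (hypothetical container)."""
    question: str
    options: tuple  # four option strings, in A-D order
    answer: str     # gold label, one of "A", "B", "C", "D"

    def __post_init__(self):
        # Enforce the format described above: four options, one correct letter.
        if len(self.options) != 4:
            raise ValueError("expected exactly four options")
        if self.answer not in CHOICES:
            raise ValueError("answer must be one of A, B, C, D")


def accuracy(items, predictions):
    """Return the fraction of items whose predicted letter matches the gold answer."""
    if len(items) != len(predictions):
        raise ValueError("items and predictions must have the same length")
    if not items:
        return 0.0
    correct = sum(
        pred.strip().upper() == item.answer
        for item, pred in zip(items, predictions)
    )
    return correct / len(items)
```

Normalizing predictions with `strip().upper()` is a design choice made for this sketch, so that outputs such as `" a"` still count as the letter A.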

Structure

  • Provides development and test sets for each topic.
  • The development set contains 5 questions per topic; the test set contains more than 100 questions per topic.
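A five‑question development set per topic lends itself to few‑shot prompting, with the development questions used as worked examples ahead of each test question. The sketch below shows one way to assemble such a prompt. The field names (`Question`, `A` to `D`, `Answer`) and the English prompt template are assumptions made for illustration, not a format confirmed by this page.

```python
def format_item(item, include_answer=True):
    """Render one question as text; field names are assumed, not confirmed."""
    lines = [f"Question: {item['Question']}"]
    for letter in "ABCD":
        lines.append(f"{letter}. {item[letter]}")
    # Leave the answer slot empty for the question the model must answer.
    lines.append(f"Answer: {item['Answer']}" if include_answer else "Answer:")
    return "\n".join(lines)


def build_few_shot_prompt(dev_items, test_item, k=5):
    """Prepend up to k solved development questions to one unsolved test question."""
    blocks = [format_item(example) for example in dev_items[:k]]
    blocks.append(format_item(test_item, include_answer=False))
    return "\n\n".join(blocks)
```

With `k=5` the whole development set of a topic is used as demonstrations; a smaller `k` shortens the prompt at the cost of fewer examples.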

Usage

  • The dataset can be loaded via Python, supporting per‑topic loading or loading all data at once.
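As a rough sketch of what per‑topic and bulk loading might look like, the example below uses the Hugging Face `datasets` library. It assumes that the repository `haonan-li/cmmlu` exposes each topic as a configuration name (for example `agronomy`); neither detail is confirmed by this page. The helper names `load_subject` and `load_subjects` are hypothetical.

```python
def load_subject(subject, repo_id="haonan-li/cmmlu"):
    """Load one topic. Assumes the topic name is a valid configuration name."""
    # Imported lazily so the module can be imported without `datasets` installed.
    from datasets import load_dataset
    return load_dataset(repo_id, subject)


def load_subjects(subjects, loader=load_subject):
    """Load several topics into a dict keyed by topic name.

    `loader` is injectable so the function can be exercised offline.
    """
    return {name: loader(name) for name in subjects}


# Example usage (requires network access and `pip install datasets`):
#   agronomy = load_subject("agronomy")           # assumed topic name
#   several = load_subjects(["agronomy", "anatomy"])
```

Loading everything at once then amounts to passing the full list of topic names to `load_subjects`. Depending on the installed `datasets` version, a repository that relies on a loading script may need extra options or may not load at all, which is worth checking if the call fails.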

License

Citation

@misc{li2023cmmlu,
      title={CMMLU: Measuring massive multitask language understanding in Chinese}, 
      author={Haonan Li and Yixuan Zhang and Fajri Koto and Yifei Yang and Hai Zhao and Yeyun Gong and Nan Duan and Timothy Baldwin},
      year={2023},
      eprint={2306.09212},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Topics

Chinese Language Understanding
Multi‑Task Evaluation

Source

Organization: hugging_face

Created: Unknown
