CCLUE
CCLUE is a benchmark for evaluating natural language understanding of classical Chinese. It contains multiple task datasets: sentence segmentation and punctuation, named entity recognition, classical Chinese text classification, classical poetry sentiment classification, and classical-to-modern Chinese (文白) retrieval.
Dataset Overview
Dataset Name: CCLUE
Description: CCLUE is a benchmark for evaluating natural language understanding of classical Chinese, containing multiple task datasets, benchmark models, and evaluation code. Researchers can quickly evaluate various pre-trained language models with simple code.
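As an illustration of that workflow, the sketch below scores a HuggingFace sequence classifier on a couple of classical-Chinese sentences. The checkpoint name, label IDs, and example texts are placeholder assumptions, not CCLUE's official baselines or data; this is a minimal sketch, not the benchmark's actual evaluation code.

```python
# Minimal sketch: scoring a HuggingFace sequence classifier on a few
# classical-Chinese examples. The checkpoint name, labels, and texts are
# ASSUMPTIONS for illustration; substitute the model and CCLUE data you use.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL = "your-org/your-classical-chinese-classifier"  # hypothetical checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL)
model.eval()

# (text, gold_label_id) pairs -- placeholder examples, not CCLUE data
examples = [
    ("學而時習之，不亦說乎", 0),
    ("大漠孤煙直，長河落日圓", 1),
]

correct = 0
with torch.no_grad():
    for text, gold in examples:
        inputs = tokenizer(text, return_tensors="pt", truncation=True)
        pred = model(**inputs).logits.argmax(dim=-1).item()
        correct += int(pred == gold)

print(f"accuracy = {correct / len(examples):.3f}")
```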
Tasks and Dataset Details
| Task Name | Abbr. | Train Set | Dev Set | Test Set | Task Type | Metric |
|---|---|---|---|---|---|---|
| Sentence Segmentation & Punctuation | S&P | 26,935 | 4,075 | 3,992 | Sequence Labeling | F1 |
| Named Entity Recognition | NER | 2,566 | 281 | 327 | Sequence Labeling | F1 |
| Classical Chinese Classification | CLS | 160,000 | 20,000 | 20,000 | Text Classification | Acc |
| Classical Poetry Sentiment Classification | SENT | 16,000 | 2,000 | 2,000 | Text Classification | Acc |
| Classical-to-Modern Chinese (文白) Retrieval | RETR | - | - | 10,000 | Text Retrieval | Acc |
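Both sequence-labeling tasks are scored with F1. As a concrete illustration of the metric (not CCLUE's official scorer), the sketch below computes micro-averaged precision, recall, and F1 over predicted punctuation insertions; the (position, label) data format is an assumption made for the example.

```python
# Illustrative micro-F1 over (position, label) predictions, e.g. for the
# punctuation task: each item marks where a punctuation mark is inserted.
# This is NOT CCLUE's official scorer; the format is assumed for the example.

def micro_f1(gold: set, pred: set) -> tuple[float, float, float]:
    """Precision/recall/F1 over sets of (position, label) pairs."""
    tp = len(gold & pred)                     # correctly predicted items
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

gold = {(3, "，"), (7, "。")}                  # toy gold punctuation
pred = {(3, "，"), (5, "、"), (7, "。")}       # toy predictions
p, r, f = micro_f1(gold, pred)
print(f"P={p:.2f} R={r:.2f} F1={f:.2f}")      # P=0.67 R=1.00 F1=0.80
```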
Evaluation Method
- Quick Evaluation: No code download is required; submit the model to HuggingFace and request evaluation. Results are returned within three business days.
- Local Evaluation: Download the data and code, install the dependencies, prepare the model, and run the evaluation script; results are saved in the `outputs` folder (a minimal sketch of such a run follows this list).
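As a rough sketch of what such a local run might produce, the code below computes top-1 accuracy for the classical-to-modern retrieval task from precomputed sentence embeddings and writes the score into an `outputs` folder. The file name, embedding shapes, and stand-in data are all assumptions; the actual CCLUE repository defines its own scripts and output format.

```python
# Hypothetical local-evaluation sketch for the 文白 retrieval task:
# top-1 accuracy over cosine similarity of precomputed embeddings.
# File names, shapes, and random stand-in data are ASSUMPTIONS,
# not CCLUE's actual code or data.
import json
from pathlib import Path

import numpy as np

# Assumed inputs: row i of each matrix embeds the i-th classical sentence
# and its modern-Chinese counterpart, so the correct match is the diagonal.
# (1,000 random rows stand in for the 10,000-item test set.)
classical = np.random.rand(1_000, 768).astype(np.float32)
modern = np.random.rand(1_000, 768).astype(np.float32)

def normalize(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so dot products become cosines."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

sims = normalize(classical) @ normalize(modern).T   # cosine similarity matrix
top1 = sims.argmax(axis=1)                          # best match per query
accuracy = float((top1 == np.arange(len(sims))).mean())

out_dir = Path("outputs")                           # matches the docs above
out_dir.mkdir(exist_ok=True)
(out_dir / "retrieval_results.json").write_text(
    json.dumps({"task": "RETR", "metric": "Acc", "value": accuracy})
)
print(f"top-1 retrieval accuracy = {accuracy:.4f}")
```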
Submitting Results
Results can be submitted to the CCLUE leaderboard. A submission must include the organization, model name, project or paper URL, a link to the model weights, and the evaluation results. All submissions must be reproducible and are verified before appearing on the leaderboard.