MLCE (Medical-LLMs-Chinese-Exam)
The MLCE dataset gathers Chinese medical examination and competition datasets to support evaluation of large language models on specialized medical abilities and to enable targeted training, aiming to promote development of comprehensive medical LLMs.
Medical Large‑Model Chinese Exam Evaluation
Dataset Introduction
MLCE (Medical-LLMs-Chinese-Exam), the Medical Large-Model Chinese Exam Evaluation dataset, aggregates questions from Chinese medical examinations and related competitions. It is intended to help evaluate the specialized medical abilities of large models and to support targeted training, with the goal of advancing comprehensive medical LLMs.
Dataset Progress
- [2024/7/7] Added questions from the 2017-2021 National Physician Qualification Exam, National Pharmacist Qualification Exam, and National Nurse Qualification Exam.
- [2024/7/7] First open‑source release of the MLCE dataset.
Sample Data
Questions are stored in the following JSON format:
{
  "id": "",
  "question": "",
  "options": {},
  "answer": "",
  "question_type": ""
}
Example:
{
  "id": "2017-Unit1-1",
  "question": "Male, 40 years old, dizziness and headache for two weeks, three consecutive blood pressure readings 21",
  "options": {
    "A": "Acute hypertension",
    "B": "Chronic nephritis",
    "C": "Hyperthyroidism",
    "D": "Primary hypertension",
    "E": "SLE"
  },
  "answer": "D",
  "question_type": "Single‑choice"
}
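As a rough illustration of how a record in this format might be consumed, the sketch below checks that the five fields are present and renders the question as a plain-text prompt. The helper names `validate_record` and `format_prompt` are hypothetical and not part of the dataset. The prompt layout is only one possible choice. The sketch also assumes that the answer string consists of option letters, which matches the single-choice example above but is not confirmed for other question types.

```python
import json

# Field names taken from the JSON format shown above.
REQUIRED_FIELDS = ("id", "question", "options", "answer", "question_type")


def validate_record(record: dict) -> list[str]:
    """Return a list of problems found in one record (empty if none)."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in record]
    if problems:
        return problems
    options, answer = record["options"], record["answer"]
    if not isinstance(options, dict) or not options:
        problems.append("options must be a non-empty object")
    # Assumption: every character of the answer is an option key (e.g. "D").
    elif not answer or any(letter not in options for letter in answer):
        problems.append("answer does not match the option keys")
    return problems


def format_prompt(record: dict) -> str:
    """Render the question and its options as a plain-text prompt (hypothetical layout)."""
    lines = [record["question"]]
    lines += [f"{key}. {text}" for key, text in sorted(record["options"].items())]
    lines.append("Answer:")
    return "\n".join(lines)


# The example record from above, with the question text kept exactly as given.
sample = json.loads("""
{
  "id": "2017-Unit1-1",
  "question": "Male, 40 years old, dizziness and headache for two weeks, three consecutive blood pressure readings 21",
  "options": {
    "A": "Acute hypertension",
    "B": "Chronic nephritis",
    "C": "Hyperthyroidism",
    "D": "Primary hypertension",
    "E": "SLE"
  },
  "answer": "D",
  "question_type": "Single-choice"
}
""")

if __name__ == "__main__":
    print(validate_record(sample))
    print(format_prompt(sample))
```

A model's reply could then be compared against `record["answer"]` to compute accuracy. How the reply is parsed depends on the model and is left out here.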
Dataset Details
| Dataset Name | Sample Count | Source | Data Origin |
|---|---|---|---|
| 2017‑2021physician.json | 3000 | National Physician Qualification Exam | LLM‑Chinese‑NMLE |
| 2017‑2021pharmacist.json | 2400 | National Pharmacist Qualification Exam | LLM‑Chinese‑NMLE |
| 2017‑2021nurse.json | 1200 | National Nurse Qualification Exam | LLM‑Chinese‑NMLE |
| Total | 6600 | | |
Thanks to all open‑source contributors! Processed data are available under data/.
More data are being prepared.
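To check a local copy against the table above, a minimal sketch like the following could count the records in each file. It rests on assumptions the source does not confirm: that each file is a single top-level JSON array of records, that the file names use ASCII hyphens, and that the files sit directly under `data/`. The function name `count_samples` is hypothetical.

```python
import json
from pathlib import Path

# Expected counts copied from the table above.
# Assumption: the file names on disk use ASCII hyphens.
EXPECTED_COUNTS = {
    "2017-2021physician.json": 3000,
    "2017-2021pharmacist.json": 2400,
    "2017-2021nurse.json": 1200,
}


def count_samples(data_dir, expected=EXPECTED_COUNTS):
    """Map each file name to (actual_count, expected_count).

    actual_count is None when the file is not found in data_dir.
    """
    report = {}
    for name, expected_count in expected.items():
        path = Path(data_dir) / name
        if not path.is_file():
            report[name] = (None, expected_count)
            continue
        with path.open(encoding="utf-8") as handle:
            records = json.load(handle)  # assumes a top-level JSON array
        report[name] = (len(records), expected_count)
    return report


if __name__ == "__main__":
    for name, (actual, expected_count) in count_samples("data").items():
        if actual is None:
            status = "missing"
        elif actual == expected_count:
            status = "ok"
        else:
            status = "mismatch"
        print(f"{name}: {actual} / {expected_count} ({status})")
```

If the files turn out to be stored in another layout, such as one JSON object per line, only the loading step would need to change.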
Contact
If you are interested in this work, would like to contribute data, or have any questions, please contact: jingnant@163.com