MLCE (Medical-LLMs-Chinese-Exam)
The MLCE dataset collects questions from Chinese medical examinations and related competitions. It supports evaluating large language models on specialized medical abilities and enables targeted training, with the aim of advancing comprehensive medical LLMs.
Description
Medical Large‑Model Chinese Exam Evaluation
Dataset Introduction
MLCE (Medical-LLMs-Chinese-Exam), the Medical Large-Model Chinese Exam Evaluation dataset, aggregates questions from various Chinese medical examinations and related competitions. It is intended to help assess the specialized medical abilities of large models and to support targeted training, with the goal of fostering comprehensive medical LLM development.
Dataset Progress
- [2024/7/7] Added 2017-2021 questions from the National Physician Qualification Exam, National Pharmacist Qualification Exam, and National Nurse Qualification Exam.
- [2024/7/7] First open‑source release of the MLCE dataset.
Sample Data
Questions are stored in the following JSON format:
{
"id": "",
"question": "",
"options": {},
"answer": "",
"question_type": ""
}
Example:
{
"id": "2017-Unit1-1",
"question": "Male, 40 years old, dizziness and headache for two weeks, three consecutive blood pressure readings 21",
"options": {
"A": "Acute hypertension",
"B": "Chronic nephritis",
"C": "Hyperthyroidism",
"D": "Primary hypertension",
"E": "SLE"
},
"answer": "D",
"question_type": "Single‑choice"
}
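As a minimal sketch of how a record in this format might be checked after loading, the snippet below parses the example above and verifies that all five fields are present and that the answer refers to an existing option key. The helper name `validate_record` is hypothetical and not part of the dataset, and the answer check assumes a single-choice question whose answer is exactly one option key; other question types may need a different rule.

```python
import json

# Field names taken from the record format shown above.
REQUIRED_FIELDS = ("id", "question", "options", "answer", "question_type")


def validate_record(record):
    """Return a list of problems found in one question record (empty if valid).

    Hypothetical helper for illustration; not shipped with the dataset.
    """
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in record]
    if problems:
        return problems
    options = record["options"]
    if not isinstance(options, dict) or not options:
        problems.append("options must be a non-empty object")
    elif record["answer"] not in options:
        # Assumption: a single-choice answer is exactly one option key.
        problems.append(f"answer {record['answer']!r} is not an option key")
    return problems


# The example record from above, with the question text kept as published.
sample = json.loads("""
{
  "id": "2017-Unit1-1",
  "question": "Male, 40 years old, dizziness and headache for two weeks, three consecutive blood pressure readings 21",
  "options": {
    "A": "Acute hypertension",
    "B": "Chronic nephritis",
    "C": "Hyperthyroidism",
    "D": "Primary hypertension",
    "E": "SLE"
  },
  "answer": "D",
  "question_type": "Single-choice"
}
""")

print(validate_record(sample))  # prints []
```

Returning a list of problems rather than raising on the first one makes it easier to report every malformed record in a single pass over a file.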
Dataset Details
| Dataset Name | Sample Count | Source | Data Origin |
|---|---|---|---|
| 2017‑2021physician.json | 3000 | National Physician Qualification Exam | LLM‑Chinese‑NMLE |
| 2017‑2021pharmacist.json | 2400 | National Pharmacist Qualification Exam | LLM‑Chinese‑NMLE |
| 2017‑2021nurse.json | 1200 | National Nurse Qualification Exam | LLM‑Chinese‑NMLE |
| Total | 6600 | | |
Thanks to all open‑source contributors! Processed data are available under data/.
More data are being prepared.
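To compare a local copy against the sample counts in the table above, something like the following sketch could be used. It assumes that each file under `data/` is a single JSON array of question records, which the source does not state explicitly; `count_samples` is a hypothetical helper name.

```python
import json
from pathlib import Path


def count_samples(data_dir):
    """Map each *.json file name in data_dir to the number of records it holds.

    Assumes every file contains one top-level JSON array of question records.
    """
    counts = {}
    for path in sorted(Path(data_dir).glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            records = json.load(fh)
        counts[path.name] = len(records)
    return counts
```

Calling `count_samples("data")` from the repository root and summing the values should then give the total listed in the table, provided the file layout matches the assumption above.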
Contact
If you are interested in this work, would like to contribute data, or have any questions, please contact: jingnant@163.com
Source
Organization: github
Created: 7/7/2024