cmm-math
CMM‑Math is a Chinese multimodal mathematics dataset containing over 28,000 high‑quality samples covering 12 grades from primary school to high school. It includes diverse question types such as multiple‑choice and fill‑in‑the‑blank, with detailed solutions. Some questions involve visual context, making the dataset more challenging. The dataset is split into a training set (22,000+ samples) and an evaluation set (5,000+ samples).
Description
CMM‑Math Dataset Overview
Basic Information
- License: BSD‑3‑Clause
- Language: Chinese
- Task Type: Text Generation
Dataset Introduction
- Name: CMM‑Math
- Content: A Chinese multimodal mathematics dataset containing benchmark and training portions, intended for evaluating and enhancing the mathematical reasoning abilities of large multimodal models (LMMs).
- Number of Samples: Over 28,000 high‑quality samples
- Question Types: Multiple‑choice, fill‑in‑the‑blank, and other formats.
- Applicable Grades: Covers 12 grades from primary school to high school.
- Characteristics: Some questions or solutions may contain visual context, increasing the dataset’s difficulty.
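As a rough illustration of how the training and evaluation portions might be inspected, the following Python sketch loads the dataset with the Hugging Face `datasets` library and counts the rows in each split. The repository ID `ecnu-icalk/cmm-math` is an assumption (this page does not state it), and the sketch also assumes the repository loads with the default configuration. Both helper names are hypothetical.

```python
from typing import Any

# Assumed repository ID; this page does not state it, so verify before use.
ASSUMED_REPO_ID = "ecnu-icalk/cmm-math"


def load_cmm_math(repo_id: str = ASSUMED_REPO_ID) -> Any:
    """Download the dataset from the Hugging Face Hub and return it."""
    # Imported lazily so the module can be imported without `datasets` installed.
    from datasets import load_dataset

    # Assumes the default configuration works; some repositories need
    # explicit `data_files` or a configuration name instead.
    return load_dataset(repo_id)


def summarize_splits(dataset: Any) -> dict[str, int]:
    """Map each split name to its number of rows."""
    return {name: len(split) for name, split in dataset.items()}
```

Calling `summarize_splits(load_cmm_math())` should then report one count per split, which can be compared against the sample counts quoted above.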
Dataset Structure
... (structure details omitted as per original) ...
Related Papers
... (details omitted) ...
Source
Organization: huggingface
Created: 9/6/2024