DATASET
Open Source Community
meta-math/MetaMathQA
MetaMathQA is a dataset augmented from the training sets of GSM8K and MATH; no test-set data are used. Every question in `meta-math/MetaMathQA` is bootstrapped from an original question in the GSM8K or MATH training set.
Updated 12/21/2023
Description
Dataset Overview
- Name: MetaMathQA
- Data Augmentation Source: Augmented from the training sets of GSM8K and MATH.
- Test Set Usage: No test‑set data are included in the augmentation.
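The overview above can be made concrete with a small sketch of working with MetaMathQA-style records. The field names (`type`, `query`, `response`) and the convention that `type` begins with `GSM` or `MATH` are assumptions drawn from the dataset card's description, not a verified schema:

```python
# Minimal sketch: grouping MetaMathQA-style records by their source
# benchmark. Field names ("type", "query", "response") and the "type"
# prefix convention are illustrative assumptions, not the verified schema.
from collections import Counter

def count_by_source(records):
    """Count records whose augmentation type points back to GSM8K or MATH."""
    counts = Counter()
    for rec in records:
        source = "GSM8K" if rec["type"].upper().startswith("GSM") else "MATH"
        counts[source] += 1
    return counts

sample = [
    {"type": "GSM_Rephrased", "query": "...", "response": "..."},
    {"type": "MATH_AnsAug", "query": "...", "response": "..."},
    {"type": "GSM_SV", "query": "...", "response": "..."},
]
print(count_by_source(sample))  # Counter({'GSM8K': 2, 'MATH': 1})
```

In practice the records would come from the Hugging Face `datasets` library (e.g. `load_dataset("meta-math/MetaMathQA")`) rather than an inline list.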
Model Training
- Model Name: MetaMath‑Mistral‑7B
- Base Model: Mistral‑7B
- Training Dataset: MetaMathQA
- Performance Improvement: Training on MetaMathQA and switching the base model from LLaMA-2-7B to Mistral-7B raises GSM8K Pass@1 from 66.5 to 77.7.
Experimental Results
- Model Performance Comparison:
- MetaMath‑Mistral‑7B: GSM8K Pass@1 = 77.7, MATH Pass@1 = 28.2.
- Other Models: Includes MPT‑7B, Falcon‑7B, LLaMA‑1‑7B, etc. Detailed metrics are shown in the experiment table.
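Pass@1 here is the fraction of problems whose single generated answer matches the reference. A minimal sketch of computing it, assuming GSM8K's convention of a final answer after a `####` delimiter (the helper names are illustrative):

```python
# Sketch of a GSM8K-style Pass@1 computation: the share of problems whose
# one predicted answer equals the reference answer. The "####" delimiter
# follows GSM8K's answer format; helper names are illustrative.
def extract_answer(completion: str) -> str:
    """Return the final answer after the '####' delimiter, if present."""
    return completion.split("####")[-1].strip()

def pass_at_1(predictions, references) -> float:
    """Fraction of problems where the predicted answer equals the reference."""
    correct = sum(
        extract_answer(p) == extract_answer(r)
        for p, r in zip(predictions, references)
    )
    return correct / len(references)

preds = ["Reasoning steps... #### 42", "More steps... #### 7"]
refs = ["#### 42", "#### 8"]
print(pass_at_1(preds, refs))  # 0.5
```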
Citation
@article{yu2023metamath,
  title={MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models},
  author={Yu, Longhui and Jiang, Weisen and Shi, Han and Yu, Jincheng and Liu, Zhengying and Zhang, Yu and Kwok, James T and Li, Zhenguo and Weller, Adrian and Liu, Weiyang},
  journal={arXiv preprint arXiv:2309.12284},
  year={2023}
}
Topics
Math Problem Solving
Natural Language Processing
Source
Organization: hugging_face
Created: Unknown