JUHE API Marketplace

meta-math/MetaMathQA

MetaMathQA is a dataset augmented from the training sets of GSM8K and MATH; no test‑set data are used. Every item in `meta-math/MetaMathQA` is bootstrapped from an original question in the GSM8K or MATH training set.

Updated 12/21/2023

Description

Dataset Overview

  • Name: MetaMathQA
  • Augmentation Source: Bootstrapped from the training sets of GSM8K and MATH.
  • Test Set Usage: No test‑set data are included in the augmentation.
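The overview above can be sketched in code. This is a minimal sketch of working with MetaMathQA records: the field names (`type`, `query`, `response`) and the trailing "The answer is:" convention are assumptions based on the dataset card, and the record below is an illustrative GSM8K-style example, not a verbatim dataset row — verify both against the actual download.

```python
# Minimal sketch for inspecting MetaMathQA-style records.
# Assumed schema: {"type": ..., "query": ..., "response": ...},
# where responses end with "The answer is: <answer>".
# To fetch the real data (requires the `datasets` package and network):
#   from datasets import load_dataset
#   ds = load_dataset("meta-math/MetaMathQA", split="train")

def final_answer(response: str) -> str:
    """Return the text after the last 'The answer is:' marker, or ''."""
    marker = "The answer is:"
    if marker not in response:
        return ""
    return response.rsplit(marker, 1)[1].strip()

# Hypothetical record in the assumed schema:
record = {
    "type": "GSM_AnsAug",
    "query": "Natalia sold clips to 48 friends in April, and half as "
             "many in May. How many clips did she sell altogether?",
    "response": "In May she sold 48 / 2 = 24 clips, so altogether "
                "48 + 24 = 72. The answer is: 72",
}
print(final_answer(record["response"]))  # -> 72
```

In practice you would iterate over the loaded split and use the `type` field to distinguish augmentation variants (e.g., answer augmentation vs. rephrasing).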

Model Training

  • Model Name: MetaMath‑Mistral‑7B
  • Base Model: Mistral‑7B
  • Training Dataset: MetaMathQA
  • Performance Improvement: Training on MetaMathQA and switching the base model from LLaMA‑2‑7B to Mistral‑7B raises GSM8K Pass@1 from 66.5 to 77.7.

Experimental Results

  • Model Performance Comparison:
    • MetaMath‑Mistral‑7B: GSM8K Pass@1 = 77.7, MATH Pass@1 = 28.2.
    • Other Models: Comparison baselines include MPT‑7B, Falcon‑7B, LLaMA‑1‑7B, and others; detailed metrics appear in the paper's experiment table.
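The Pass@1 numbers above can be understood with a small sketch: one greedy generation per problem, scored correct iff the extracted final answer matches the reference. The function below is illustrative and not taken from the MetaMath codebase.

```python
# Sketch of Pass@1 scoring as used for GSM8K/MATH-style benchmarks:
# a single prediction per problem, reported as percent correct.
# (Illustrative helper, not the official evaluation script.)

def pass_at_1(predictions: list[str], references: list[str]) -> float:
    """Percent of problems whose single prediction matches the reference."""
    assert len(predictions) == len(references) and references
    correct = sum(p.strip() == r.strip()
                  for p, r in zip(predictions, references))
    return 100.0 * correct / len(references)

preds = ["72", "18", "5"]
refs  = ["72", "16", "5"]
print(round(pass_at_1(preds, refs), 1))  # -> 66.7
```

Real harnesses additionally normalize answers (strip units, canonicalize fractions) before comparison, which matters especially for MATH.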

Citation

@article{yu2023metamath,
  title={MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models},
  author={Yu, Longhui and Jiang, Weisen and Shi, Han and Yu, Jincheng and Liu, Zhengying and Zhang, Yu and Kwok, James T and Li, Zhenguo and Weller, Adrian and Liu, Weiyang},
  journal={arXiv preprint arXiv:2309.12284},
  year={2023}
}



Topics

Math Problem Solving
Natural Language Processing

Source

Organization: hugging_face

