
MathCritique-76k

MathCritique‑76k is a dataset for training and evaluating large language models (LLMs) on mathematical reasoning. It contains model responses paired with step‑level feedback, collected with an automated, scalable framework. Its goal is to help models generate natural‑language critiques, which in turn improve performance on mathematical reasoning tasks.

Updated 11/26/2024

Description

MathCritique Dataset Overview

Dataset Introduction

  • Name: MathCritique‑76k
  • Source: Automatically collected by the AutoMathCritique framework, containing responses to mathematical reasoning tasks and their step‑level feedback.
  • Purpose: Fine‑tune language models to generate natural‑language mathematical reasoning feedback.
  • Features:
    • Utilizes a two‑player paradigm separating the reasoning and critique roles.
    • The critique model provides step‑level feedback during both training and testing, supervising the reasoning model.
    • Critique models trained on the dataset help improve the reasoning model's performance on challenging queries, especially when more computation is spent at inference time.
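
The two‑player paradigm above can be sketched as a simple loop in which a reasoning model proposes solution steps and a critique model comments on each step. This is a minimal illustration only: the names `StepFeedback`, `Reasoner`, `Critic`, and `refine_with_critique`, as well as the stopping rule, are hypothetical and are not taken from the AutoMathCritique code.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class StepFeedback:
    """Hypothetical container for one piece of step-level feedback."""
    step_index: int
    correct: bool
    comment: str


# Hypothetical interfaces; the source does not define an API.
# A reasoner maps (problem, feedback text) to a list of solution steps.
Reasoner = Callable[[str, str], List[str]]
# A critic maps (problem, steps) to one StepFeedback per step.
Critic = Callable[[str, List[str]], List[StepFeedback]]


def refine_with_critique(
    problem: str,
    reasoner: Reasoner,
    critic: Critic,
    max_rounds: int = 2,
) -> List[str]:
    """Alternate between reasoning and critique until no step is flagged."""
    steps = reasoner(problem, "")  # first attempt, no feedback yet
    for _ in range(max_rounds):
        feedback = critic(problem, steps)
        flagged = [f for f in feedback if not f.correct]
        if not flagged:
            break  # the critic accepts every step
        # Pass the critic's comments back to the reasoner as plain text.
        feedback_text = "\n".join(
            f"Step {f.step_index}: {f.comment}" for f in flagged
        )
        steps = reasoner(problem, feedback_text)
    return steps
```

In this sketch, `max_rounds` stands in for the amount of extra test‑time computation: more rounds give the reasoner more chances to revise flagged steps.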

Dataset Structure

  • Raw Data: Built upon the GSM8K and MATH training sets; each example includes a problem and its answer.
  • New Data: Built from GPT‑4 feedback; each example includes a problem, feedback, and a refined answer.
  • Size: Currently 100 examples are released, with more to follow.
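
As a rough illustration of the two record types described above, the following sketch models them as Python dataclasses and serializes one record as a JSON line. The field names (`problem`, `answer`, `feedback`, `refined_answer`) and the JSONL layout are assumptions made for illustration; the released files may use different keys or a different format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class RawExample:
    """Assumed shape of a raw record: a problem and its answer."""
    problem: str
    answer: str


@dataclass
class CritiqueExample:
    """Assumed shape of a new record: problem, feedback, refined answer."""
    problem: str
    feedback: str
    refined_answer: str


def to_jsonl_line(example) -> str:
    """Serialize one record as a single JSON line (hypothetical format)."""
    return json.dumps(asdict(example), ensure_ascii=False)


record = CritiqueExample(
    problem="What is 3 * 4?",
    feedback="Step 1 multiplies correctly; no errors found.",
    refined_answer="12",
)
line = to_jsonl_line(record)
```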

Usage

  • Install Dependencies:
    • LLaMA‑Factory dependencies
    • vllm for inference
    • deepspeed for training
    • A custom version of transformers
  • Run Experiments:
    • Use the selfimprove/inference-all.sh script to run training, inference, and evaluation.
    • Key configuration parameters include dataset path, model name, sampling temperature, etc.
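
The configuration parameters named above could be collected and validated before a run, as in the sketch below. The class `ExperimentConfig`, its field names, and the default temperature are hypothetical; the source does not say how selfimprove/inference-all.sh reads its settings, so this only shows one way to keep them in one place.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentConfig:
    """Hypothetical holder for the settings named in the usage notes."""
    dataset_path: str
    model_name: str
    temperature: float = 0.7  # illustrative default, not from the source

    def __post_init__(self) -> None:
        # Fail early on obviously invalid settings.
        if not self.dataset_path:
            raise ValueError("dataset_path must not be empty")
        if not self.model_name:
            raise ValueError("model_name must not be empty")
        if self.temperature < 0:
            raise ValueError("temperature must be non-negative")


# Placeholder values; replace with real paths and model names.
config = ExperimentConfig(
    dataset_path="path/to/dataset.json",
    model_name="your-model-name",
    temperature=0.5,
)
```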

License

Contact Information

Citation

@misc{xi2024enhancingllmreasoningcritique,
  title={Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision},
  author={Zhiheng Xi and Dingwen Yang and Jixuan Huang and Jiafu Tang and Guanyu Li and Yiwen Ding and Wei He and Boyang Hong and Shihan Do and Wenyu Zhan and Xiao Wang and Rui Zheng and Tao Ji and Xiaowei Shi and Yitao Zhai and Rongxiang Weng and Jingang Wang and Xunliang Cai and Tao Gui and Zuxuan Wu and Qi Zhang and Xipeng Qiu and Xuanjing Huang and Yu-Gang Jiang},
  year={2024},
  eprint={2411.16579},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2411.16579},
}

Topics

Mathematical Reasoning
Natural Language Processing

Source

Organization: github

Created: 11/26/2024
