Dataset asset | Open Source Community | Text Generation | Mathematical Word Problems

swulling/gsm8k_chinese

The dataset contains math word problems in Chinese and English, with answers. It is suited to text generation tasks, especially those involving math word problems. It is split into a training set (7,473 samples) and a test set (1,319 samples). Each record has four fields: the English question, the answer, the Chinese question, and the final answer on its own.

Source
hugging_face
Created
Nov 28, 2025
Updated
Nov 28, 2023

Dataset Overview

Basic Information

  • Language: Chinese
  • License: MIT
  • Size: 1K < n < 10K samples
  • Source Dataset: gsm8k
  • Task Category: text2text-generation

Structure

Features

  • question: string
  • answer: string
  • question_zh-cn: string
  • answer_only: integer
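
The four features above can be checked per record with a small validator. The following is a minimal sketch, not part of the dataset's tooling: the helper names `validate_record` and `final_answer` are hypothetical, the sample record is invented for illustration, and the `#### <number>` final-answer marker is the convention of the original gsm8k. Whether the `answer` field here keeps that marker is an assumption.

```python
import re
from typing import Optional

# Field names and types as listed under Features.
REQUIRED_FIELDS = {
    "question": str,
    "answer": str,
    "question_zh-cn": str,
    "answer_only": int,
}

# gsm8k-style final-answer marker, e.g. "#### 72" or "#### 1,234" (assumed format).
_FINAL_ANSWER = re.compile(r"####\s*(-?\d[\d,]*)")


def validate_record(record: dict) -> list:
    """Return a list of problems found in one record (empty if it looks fine)."""
    problems = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], expected_type):
            problems.append(f"wrong type for {name}: {type(record[name]).__name__}")
    return problems


def final_answer(answer: str) -> Optional[int]:
    """Extract the integer after '####', or None if the marker is absent."""
    match = _FINAL_ANSWER.search(answer)
    if match is None:
        return None
    return int(match.group(1).replace(",", ""))


# Hypothetical record, written for this example (not taken from the dataset).
sample = {
    "question": "Tom has 3 apples and buys 4 more. How many apples does he have?",
    "answer": "3 + 4 = 7\n#### 7",
    "question_zh-cn": "汤姆有3个苹果，又买了4个。他现在有多少个苹果？",
    "answer_only": 7,
}

print(validate_record(sample))
print(final_answer(sample["answer"]) == sample["answer_only"])
```

If the marker is present in real records, comparing `final_answer(record["answer"])` with `record["answer_only"]` gives a quick consistency check between the two answer fields.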

Splits

  • test:
    • Bytes: 1,020,788
    • Samples: 1,319
  • train:
    • Bytes: 5,664,657
    • Samples: 7,473

Download & Size

  • Download Size: 3,988,161 bytes
  • Dataset Size: 6,685,445 bytes
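
The size figures above can be cross-checked with simple arithmetic: the two split sizes should add up to the dataset size. The sketch below uses only the numbers listed in this card. The download is smaller than the dataset, presumably because the hosted files are stored compressed; that explanation is an assumption, not something the card states.

```python
# Byte counts copied from the Splits and Download & Size sections.
SPLIT_BYTES = {"test": 1_020_788, "train": 5_664_657}
DATASET_SIZE = 6_685_445
DOWNLOAD_SIZE = 3_988_161

total = sum(SPLIT_BYTES.values())
ratio = DOWNLOAD_SIZE / DATASET_SIZE

# True when the per-split sizes account for the whole dataset size.
print(total == DATASET_SIZE)
# Download size as a share of the in-memory dataset size.
print(f"{ratio:.1%}")
```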

Configuration

  • Config Name: default
  • Data Files:
    • test: data/test-*
    • train: data/train-*
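
With a single default config and two data-file patterns, the dataset can most likely be loaded through the Hugging Face `datasets` library. This is a minimal sketch that assumes `datasets` is installed and the repository is reachable; the helper names `load_gsm8k_chinese` and `check_split_sizes` are hypothetical, and the expected counts are the ones listed under Splits.

```python
# Sample counts per split, as listed in this card.
EXPECTED_SAMPLES = {"train": 7473, "test": 1319}


def load_gsm8k_chinese(split: str = "train"):
    """Load one split from the Hugging Face Hub (needs network access)."""
    # Imported lazily so the rest of this module works without `datasets`.
    from datasets import load_dataset

    return load_dataset("swulling/gsm8k_chinese", split=split)


def check_split_sizes(sizes: dict) -> list:
    """Return the names of splits whose row count differs from the card."""
    return [name for name, n in EXPECTED_SAMPLES.items() if sizes.get(name) != n]
```

A possible use, assuming the download succeeds: `sizes = {s: len(load_gsm8k_chinese(s)) for s in EXPECTED_SAMPLES}`, then `check_split_sizes(sizes)` returns an empty list when both splits match the card.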

Labels

  • math-word-problems