gsm8k_synthetic_cot
The dataset has three features: question, cot (the chain-of-thought steps), and answer. It is split into train, valid, and test sets containing 385,620, 500, and 1,319 samples respectively. The download size is 50,052,843 bytes and the dataset size is 91,978,048 bytes.
Updated 12/22/2024
huggingface
Description
Dataset Overview
Language
- English (en)
License
- MIT
Dataset Information
Features
- question: type is string
- cot: type is sequence of strings
- answer: type is string
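As an illustration of the feature types listed above, the following minimal sketch checks that a record has exactly the three fields with the stated types. The helper name `is_valid_record` and the sample record are hypothetical and invented for this example; they are not taken from the dataset.

```python
from typing import Any, Dict


def is_valid_record(record: Dict[str, Any]) -> bool:
    """Return True if the record has exactly the three listed features
    with the listed types: question (str), cot (list of str), answer (str)."""
    if set(record) != {"question", "cot", "answer"}:
        return False
    cot = record["cot"]
    return (
        isinstance(record["question"], str)
        and isinstance(cot, list)
        and all(isinstance(step, str) for step in cot)
        and isinstance(record["answer"], str)
    )


# Made-up record for illustration only; not an actual row of the dataset.
example = {
    "question": "Tom has 3 apples and buys 4 more. How many apples does he have?",
    "cot": ["3+4=7"],
    "answer": "7",
}

print(is_valid_record(example))
```

A check like this can be run over a few rows after loading to confirm that the cot field arrives as a list of strings rather than a single joined string.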
Data Splits
- train:
  - Bytes: 91430680
  - Samples: 385620
- valid:
  - Bytes: 147836
  - Samples: 500
- test:
  - Bytes: 399532
  - Samples: 1319
Data Size
- Download Size: 50052843 bytes
- Dataset Size: 91978048 bytes
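The per-split byte counts can be cross-checked against the reported dataset size with a few lines of arithmetic. This is a small sketch using only the numbers listed on this card; the variable names are chosen for this example.

```python
# Figures copied from the Data Splits and Data Size listings on this card.
SPLITS = {
    "train": {"bytes": 91_430_680, "samples": 385_620},
    "valid": {"bytes": 147_836, "samples": 500},
    "test": {"bytes": 399_532, "samples": 1_319},
}
DATASET_SIZE = 91_978_048
DOWNLOAD_SIZE = 50_052_843

total_bytes = sum(split["bytes"] for split in SPLITS.values())
total_samples = sum(split["samples"] for split in SPLITS.values())

# The split byte counts should add up to the reported dataset size.
print(total_bytes == DATASET_SIZE)
print(total_samples)

# Ratio of download size to dataset size.
print(round(DOWNLOAD_SIZE / DATASET_SIZE, 3))
```

The split sizes do sum to the reported dataset size, and the download is a little over half that size, which is consistent with the files being stored in a compressed format.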
Configuration
- config_name: default
- data_files:
  - train: data/train-*
  - valid: data/valid-*
  - test: data/test-*
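The data_files entries above are glob patterns that map files to splits. The sketch below mimics that mapping with Python's standard `fnmatch` module. It only illustrates the pattern matching and is not the loader's actual resolution logic. The shard file names are hypothetical, since the card does not list the real ones.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Patterns copied from the Configuration listing on this card.
DATA_FILES = {
    "train": "data/train-*",
    "valid": "data/valid-*",
    "test": "data/test-*",
}


def split_for(path: str) -> Optional[str]:
    """Return the split whose glob pattern matches the path, or None."""
    for split, pattern in DATA_FILES.items():
        if fnmatchcase(path, pattern):
            return split
    return None


# Hypothetical file names, assumed for illustration only.
for name in [
    "data/train-00000-of-00001.parquet",
    "data/valid-00000-of-00001.parquet",
    "data/test-00000-of-00001.parquet",
    "README.md",
]:
    print(name, "->", split_for(name))
```

Note that the validation split is named valid rather than validation, so code that requests a split by name should use valid.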
Source
- Converted from: https://github.com/da03/Internalize_CoT_Step_by_Step
Topics
Chain-of-Thought
Machine Learning
Source
Organization: huggingface
Created: 12/18/2024