Dataset Overview

Dataset Information

Features (first schema):
- labeler: string
- timestamp: string
- generation: null
- is_quality_control_question: bool
- is_initial_screening_question: bool
- question (structured):
  - problem: string
  - ground_truth_answer: string
- label (structured):
  - steps (list):
    - completions (list):
      - text: string
      - rating: int64
      - flagged: bool
    - human_completion (structured):
      - text: string
      - rating: null
      - source: string
      - flagged: bool
      - corrected_rating: int64
    - chosen_completion: int64
  - total_time: int64
  - finish_reason: string
Splits:
- train: 5,185,121 bytes, 949 samples
- test: 532,137 bytes, 106 samples
Download Size: 1,850,110 bytes
Dataset Size: 5,717,258 bytes
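
The nested label structure above can be walked step by step. The following is a minimal sketch, not the dataset's official tooling. It assumes that chosen_completion is an index into the step's completions list and that a non-null human_completion takes precedence over it. Neither assumption is stated in the listing above. The function name chosen_texts and the example record, including its rating and source values, are hypothetical.

```python
from typing import Any, Dict, List


def chosen_texts(record: Dict[str, Any]) -> List[str]:
    """Return one text per step.

    Assumption: a non-null human_completion wins; otherwise
    chosen_completion is treated as an index into completions.
    Steps with neither are skipped.
    """
    texts: List[str] = []
    for step in record["label"]["steps"]:
        human = step.get("human_completion")
        if human is not None:
            texts.append(human["text"])
            continue
        idx = step.get("chosen_completion")
        completions = step.get("completions") or []
        if idx is not None and 0 <= idx < len(completions):
            texts.append(completions[idx]["text"])
    return texts


# Hypothetical record shaped like the feature listing (values are made up).
example = {
    "labeler": "labeler-0",
    "timestamp": "2000-01-01T00:00:00",
    "generation": None,
    "is_quality_control_question": False,
    "is_initial_screening_question": False,
    "question": {"problem": "Solve 2x = 8.", "ground_truth_answer": "4"},
    "label": {
        "steps": [
            {
                "completions": [
                    {"text": "Divide both sides by 2.", "rating": 1, "flagged": False},
                    {"text": "Guess x = 3.", "rating": -1, "flagged": False},
                ],
                "human_completion": None,
                "chosen_completion": 0,
            },
            {
                "completions": [
                    {"text": "So x = 5.", "rating": -1, "flagged": False},
                ],
                "human_completion": {
                    "text": "So x = 4.",
                    "rating": None,
                    "source": "human",
                    "flagged": False,
                    "corrected_rating": 1,
                },
                "chosen_completion": None,
            },
        ],
        "total_time": 1000,
        "finish_reason": "solution",
    },
}

print(chosen_texts(example))
```

If the real records encode the chosen completion differently, only the lookup inside the loop would need to change.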

Features (second schema):
- labeler: string
- timestamp: string
- generation: int64
- is_quality_control_question: bool
- is_initial_screening_question: bool
- question (structured):
  - problem: string
  - ground_truth_solution: string
  - ground_truth_answer: string
  - pre_generated_steps: sequence of string
  - pre_generated_answer: string
  - pre_generated_verifier_score: float64
- label (structured):
  - steps (list):
    - completions (list):
      - text: string
      - rating: int64
      - flagged: bool
    - human_completion: null
    - chosen_completion: int64
  - total_time: int64
  - finish_reason: string
Splits:
- train: 344,736,273 bytes, 97,782 samples
- test: 9,164,167 bytes, 2,762 samples
Download Size: 132,668,705 bytes
Dataset Size: 353,900,440 bytes
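
The split figures can be cross-checked with simple arithmetic: in both listings the train and test byte counts should add up to the stated dataset size. The sketch below uses only the numbers given above. The keys "first" and "second" refer to the order of the two listings and are not official names, and the helper summarize is hypothetical. That the smaller download size reflects compression is an assumption.

```python
# Figures copied from the two Splits listings above.
CONFIGS = {
    "first": {
        "train_bytes": 5_185_121, "train_samples": 949,
        "test_bytes": 532_137, "test_samples": 106,
        "download_size": 1_850_110, "dataset_size": 5_717_258,
    },
    "second": {
        "train_bytes": 344_736_273, "train_samples": 97_782,
        "test_bytes": 9_164_167, "test_samples": 2_762,
        "download_size": 132_668_705, "dataset_size": 353_900_440,
    },
}


def summarize(cfg: dict) -> dict:
    """Derive totals and ratios from one block of split metadata."""
    total_bytes = cfg["train_bytes"] + cfg["test_bytes"]
    total_samples = cfg["train_samples"] + cfg["test_samples"]
    return {
        # True when the split sizes sum to the stated dataset size.
        "bytes_match": total_bytes == cfg["dataset_size"],
        "samples": total_samples,
        "test_fraction": cfg["test_samples"] / total_samples,
        # Download size relative to dataset size (assumed to reflect compression).
        "download_ratio": cfg["download_size"] / cfg["dataset_size"],
    }


for name, cfg in CONFIGS.items():
    s = summarize(cfg)
    print(
        f"{name}: samples={s['samples']}, bytes_match={s['bytes_match']}, "
        f"test_fraction={s['test_fraction']:.3f}, "
        f"download_ratio={s['download_ratio']:.3f}"
    )
```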