Dataset asset, Open Source Community, Programming Q&A

lissadesu/code_qa_updated

This dataset is designed primarily for code-related question-answering tasks. Each record has 16 features: labNo, taskNo, questioner, question, code, startLine, endLine, questionType, answer, src, code_processed, id, raw_code, raw_comment, comment, and q_code. The data ships as a single training split of 35,360 samples, with a total size of 46,842,820 bytes and a download size of 17,749,500 bytes.

Source
hugging_face
Created
Nov 28, 2025
Updated
Oct 6, 2023

Dataset Overview

Dataset Information

  • License: MIT
  • Features:
    • labNo: data type float64
    • taskNo: data type float64
    • questioner: data type string
    • question: data type string
    • code: data type string
    • startLine: data type float64
    • endLine: data type float64
    • questionType: data type string
    • answer: data type string
    • src: data type string
    • code_processed: data type string
    • id: data type string
    • raw_code: data type string
    • raw_comment: data type string
    • comment: data type string
    • q_code: data type string
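The feature list above can be turned into a lightweight record check. The following is a minimal sketch, not part of the dataset itself: the helper names `to_optional_int` and `check_record` are hypothetical, and the idea that the float64 columns (labNo, taskNo, startLine, endLine) hold whole numbers with NaN or None marking missing values is an assumption the card does not confirm.

```python
import math

# Feature names and dtypes exactly as listed in the dataset card.
FLOAT_FEATURES = ("labNo", "taskNo", "startLine", "endLine")
STRING_FEATURES = (
    "questioner", "question", "code", "questionType", "answer", "src",
    "code_processed", "id", "raw_code", "raw_comment", "comment", "q_code",
)


def to_optional_int(value):
    """Convert a float64 field to int, or None when it is missing.

    Assumption: missing values appear as None or NaN.
    """
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return None
    return int(value)


def check_record(record):
    """Return a list of problems found in one record (a plain dict)."""
    problems = []
    for name in FLOAT_FEATURES + STRING_FEATURES:
        if name not in record:
            problems.append(f"missing feature: {name}")
    for name in FLOAT_FEATURES:
        value = record.get(name)
        if value is not None and not isinstance(value, (int, float)):
            problems.append(f"{name} should be numeric, got {type(value).__name__}")
    for name in STRING_FEATURES:
        value = record.get(name)
        if value is not None and not isinstance(value, str):
            problems.append(f"{name} should be a string, got {type(value).__name__}")
    return problems
```

Calling `check_record(row)` on a few rows before training is a cheap way to catch schema drift; an empty list means the row matches the card.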

Data Split

  • Training Set:
    • Name: train
    • Bytes: 46842820
    • Samples: 35360
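Since the card lists Hugging Face as the source, the train split can presumably be loaded with the `datasets` library. This is a sketch under that assumption: it requires `pip install datasets` and network access, and the helper names `load_train_split` and `matches_card` are hypothetical.

```python
DATASET_ID = "lissadesu/code_qa_updated"
EXPECTED_TRAIN_SAMPLES = 35360  # sample count stated in the dataset card


def load_train_split():
    """Load the train split from the Hugging Face Hub.

    The import is done lazily so this module can be imported even when
    the `datasets` package is not installed.
    """
    from datasets import load_dataset

    return load_dataset(DATASET_ID, split="train")


def matches_card(num_rows):
    """Check a loaded row count against the count stated in the card."""
    return num_rows == EXPECTED_TRAIN_SAMPLES


# Example usage (downloads the data, so it is left commented out):
# train = load_train_split()
# print(train.num_rows, matches_card(train.num_rows))
# print(train[0]["question"])
```

Comparing the loaded row count against the card is a quick sanity check that the copy on the Hub has not changed since this listing was generated.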

Dataset Size

  • Download Size: 17749500 bytes
  • Total Size: 46842820 bytes
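For planning storage and bandwidth, the byte counts above can be converted into more readable figures. The sketch below uses only the numbers from the card; the function name `size_summary` is hypothetical, and reading "download size" as the size of the files fetched and "total size" as the size of the prepared data is an assumption based on common Hugging Face conventions.

```python
DOWNLOAD_BYTES = 17_749_500
TOTAL_BYTES = 46_842_820
NUM_SAMPLES = 35_360


def size_summary(download_bytes, total_bytes, num_samples):
    """Derive readable size figures from the raw byte counts."""
    return {
        "download_mb": download_bytes / 1e6,             # decimal megabytes
        "total_mb": total_bytes / 1e6,
        "download_ratio": download_bytes / total_bytes,  # download vs. total
        "avg_bytes_per_sample": total_bytes / num_samples,
    }


summary = size_summary(DOWNLOAD_BYTES, TOTAL_BYTES, NUM_SAMPLES)
for key, value in summary.items():
    print(f"{key}: {value:.3f}")
```

With the card's numbers, the download is roughly 38% of the total size, and each sample averages a little over 1.3 KB.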