Dataset asset: Open Source Community, Programming Education, Code Generation
mbpp
The dataset has four features: instance_id (int32), prompt (string), canonical_solution (string), and test (string). It is divided into four splits: train, test, validation, and prompt. Each split has its own data file pattern and sample count. The download size is 228,122 bytes, and the total dataset size is 500,198 bytes.
Source
huggingface
Created
Dec 4, 2024
Updated
Dec 8, 2024
Overview
Dataset description and usage context
MBPP Dataset Overview
Dataset Information
Features
- instance_id: int32
- prompt: string
- canonical_solution: string
- test: string
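The four features above can be mirrored as a small typed record. The following is a minimal sketch, not code from the dataset: the names `MbppRecord` and `parse_record` are hypothetical, and the example row is a made-up placeholder rather than an actual sample.

```python
from dataclasses import dataclass

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


@dataclass(frozen=True)
class MbppRecord:
    """One row with the four listed features (hypothetical helper type)."""
    instance_id: int          # int32 in the dataset schema
    prompt: str
    canonical_solution: str
    test: str


def parse_record(row: dict) -> MbppRecord:
    """Check a raw row against the listed feature names and types."""
    expected = {
        "instance_id": int,
        "prompt": str,
        "canonical_solution": str,
        "test": str,
    }
    missing = sorted(expected.keys() - row.keys())
    if missing:
        raise ValueError(f"missing fields: {missing}")
    for name, typ in expected.items():
        if not isinstance(row[name], typ):
            raise TypeError(f"{name} should be {typ.__name__}")
    if not INT32_MIN <= row["instance_id"] <= INT32_MAX:
        raise ValueError("instance_id does not fit in int32")
    return MbppRecord(**{name: row[name] for name in expected})


# Placeholder row for illustration only; not taken from the dataset.
example = parse_record({
    "instance_id": 1,
    "prompt": "Write a function that adds two numbers.",
    "canonical_solution": "def add(a, b):\n    return a + b",
    "test": "assert add(1, 2) == 3",
})
```

A frozen dataclass keeps each record immutable, which is convenient when the same rows are shared between evaluation runs.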
Data Splits
- train: contains 374 samples, occupying 189,426 bytes
- test: contains 500 samples, occupying 260,317 bytes
- validation: contains 90 samples, occupying 45,555 bytes
- prompt: contains 10 samples, occupying 4,900 bytes
Dataset Size
- Download Size: 228,122 bytes
- Total Size: 500,198 bytes
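As a quick consistency check, the per-split figures can be summed and compared with the reported totals. This sketch only uses the numbers listed on this page; the variable names are illustrative.

```python
# (samples, bytes) per split, as listed under Data Splits.
splits = {
    "train": (374, 189_426),
    "test": (500, 260_317),
    "validation": (90, 45_555),
    "prompt": (10, 4_900),
}

total_samples = sum(samples for samples, _ in splits.values())
total_bytes = sum(size for _, size in splits.values())

download_bytes = 228_122
reported_total_bytes = 500_198

# The split sizes add up exactly to the reported total size.
assert total_bytes == reported_total_bytes

# Ratio of download size to total dataset size.
download_ratio = download_bytes / reported_total_bytes

print(total_samples)             # 974
print(round(download_ratio, 3))  # 0.456
```

The download is a bit under half the size of the total dataset, which is what one would expect if the hosted files are stored in a compressed format.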
Configuration
- config_name: default
- data_files:
  - train: data/train-*
  - test: data/test-*
  - validation: data/validation-*
  - prompt: data/prompt-*
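The data_files entries are glob patterns, one per split. The sketch below shows how such patterns map file paths to splits, using only the standard library. The function `split_for` and the shard file names are hypothetical, chosen for illustration; the actual file names in the repository are not listed on this page.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Split name -> glob pattern, as given in the default configuration.
DATA_FILES = {
    "train": "data/train-*",
    "test": "data/test-*",
    "validation": "data/validation-*",
    "prompt": "data/prompt-*",
}


def split_for(path: str) -> Optional[str]:
    """Return the split whose pattern matches the path, or None."""
    for split, pattern in DATA_FILES.items():
        if fnmatchcase(path, pattern):
            return split
    return None


# Hypothetical shard names, for illustration only.
candidates = [
    "data/train-00000-of-00001.parquet",
    "data/validation-00000-of-00001.parquet",
    "README.md",
]
assignments = {path: split_for(path) for path in candidates}
```

`fnmatchcase` is used instead of `fnmatch` so that matching stays case-sensitive on every platform.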