Dataset asset, Open Source Community, Natural Language Processing, Data Augmentation
Source82/osa-alpaca_dataset_augmented_cleaned
This dataset includes three features: instruction, input, and output, all of type string. The dataset contains only a training split (train) with 6,856 samples, total size 1,958,991 bytes. Download size is 792,005 bytes. In the default configuration, the data file path is data/train-*.
Source
hugging_face
Created
Nov 28, 2025
Updated
Jun 27, 2024
Signals
41 views
Availability
Linked source ready
Overview
Dataset description and usage context
Dataset Overview
Dataset Information
- Features:
  - instruction: type string
  - input: type string
  - output: type string
- Data Split:
  - train:
    - Bytes: 1958991
    - Samples: 6856
- Download Size: 792005 bytes
- Dataset Size: 1958991 bytes
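As a quick sanity check, the sizes listed above can be turned into an average size per sample and a download-to-dataset ratio. This is a minimal sketch that uses only the numbers shown on this page; the variable names are illustrative.

```python
# Figures copied from the dataset information above.
DATASET_SIZE_BYTES = 1_958_991
DOWNLOAD_SIZE_BYTES = 792_005
NUM_TRAIN_SAMPLES = 6_856

# Average size of one sample in the train split.
avg_bytes_per_sample = DATASET_SIZE_BYTES / NUM_TRAIN_SAMPLES

# Download size expressed as a fraction of the dataset size.
download_ratio = DOWNLOAD_SIZE_BYTES / DATASET_SIZE_BYTES

print(f"Average bytes per sample: {avg_bytes_per_sample:.1f}")
print(f"Download size / dataset size: {download_ratio:.1%}")
```

This works out to roughly 286 bytes per sample, with the download at about 40% of the dataset size.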
Configuration
- Configuration Name: default
- Data Files:
  - train: path data/train-*
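The configuration above corresponds to a standard `load_dataset` call in the Hugging Face `datasets` library. The sketch below assumes that the identifier shown on this page is the Hub repository ID, that `datasets` is installed, and that network access is available. The `format_record` helper is hypothetical: it joins the three string features in one common Alpaca-style layout, which this page does not itself specify.

```python
from typing import Dict

# Assumption: the identifier shown on this page is the Hugging Face Hub repository ID.
REPO_ID = "Source82/osa-alpaca_dataset_augmented_cleaned"


def load_train_split():
    """Load the only split (train) of the default configuration."""
    # Imported lazily so the formatting helper below works without the library.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split="train")


def format_record(record: Dict[str, str]) -> str:
    """Hypothetical helper: join the three string features into one text."""
    parts = [f"### Instruction:\n{record['instruction']}"]
    # The input field may be empty; skip the section in that case.
    if record.get("input"):
        parts.append(f"### Input:\n{record['input']}")
    parts.append(f"### Response:\n{record['output']}")
    return "\n\n".join(parts)


# Example with a made-up record (not taken from the dataset).
example = {"instruction": "Add the numbers.", "input": "2 and 3", "output": "5"}
print(format_record(example))
```

With network access, `train = load_train_split()` followed by `format_record(train[0])` would apply the same formatting to the first real sample.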