Dataset asset, Open Source Community, Natural Language Processing, Data Augmentation

Source82/osa-alpaca_dataset_augmented_cleaned

This dataset has three features (instruction, input, and output), all of type string. It provides a single training split (train) with 6,856 samples totaling 1,958,991 bytes; the download size is 792,005 bytes. In the default configuration, the training data files match the path data/train-*.

Source: hugging_face
Created: Nov 28, 2025
Updated: Jun 27, 2024

Dataset Overview

Dataset Information

  • Features:
    • instruction: type string
    • input: type string
    • output: type string
  • Data Split:
    • train:
      • Size: 1,958,991 bytes
      • Samples: 6,856
  • Download Size: 792,005 bytes
  • Dataset Size: 1,958,991 bytes
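
The three string features listed above can be modeled as a simple record type. The sketch below is illustrative only: the `Record` type, the `format_prompt` helper, the "### ..." section layout, and the example row are all assumptions, since the listing does not specify a prompt template or show any rows.

```python
from typing import TypedDict


class Record(TypedDict):
    """One row, mirroring the three string features listed above."""
    instruction: str
    input: str
    output: str


def format_prompt(record: Record) -> str:
    """Build a prompt string; the "### ..." layout is an assumed convention."""
    parts = [f"### Instruction:\n{record['instruction'].strip()}"]
    # Skip the input section when the field is empty or whitespace-only.
    if record["input"].strip():
        parts.append(f"### Input:\n{record['input'].strip()}")
    # The model's answer (the `output` feature) would follow this header.
    parts.append("### Response:\n")
    return "\n\n".join(parts)


# Hypothetical row for illustration; not taken from the dataset.
example: Record = {
    "instruction": "Translate to French.",
    "input": "Good morning",
    "output": "Bonjour",
}
prompt = format_prompt(example)
```

Treating an empty input as "no input section" is a design choice for this sketch; whether rows in this dataset actually leave input empty is not stated in the listing.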

Configuration

  • Configuration Name: default
    • Data Files:
      • train: path data/train-*
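
As a rough sketch of how this configuration might be used, the snippet below matches file paths against the `data/train-*` pattern and wraps a call to the Hugging Face `datasets` library. It assumes the page title is the Hub repository id; the file names in `listing` are hypothetical, and `fnmatch` only approximates the glob matching the Hub performs.

```python
from fnmatch import fnmatch

# Assumption: the listing title doubles as the Hugging Face Hub repo id.
REPO_ID = "Source82/osa-alpaca_dataset_augmented_cleaned"
# Path pattern for the train split, as given in the default configuration.
TRAIN_PATTERN = "data/train-*"


def match_train_files(paths: list[str]) -> list[str]:
    """Return the paths that match the train pattern, sorted by name."""
    return sorted(p for p in paths if fnmatch(p, TRAIN_PATTERN))


def load_train_split():
    """Load the train split of the default configuration (needs network)."""
    # Imported lazily so match_train_files works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, name="default", split="train")


# Hypothetical repository listing; the real shard names are not given.
listing = [
    "README.md",
    "data/train-00000-of-00001.parquet",
    "data/test-00000-of-00001.parquet",
]
matched = match_train_files(listing)
```

Calling `load_train_split()` requires the `datasets` package and network access; if the repo id assumption holds, the returned dataset should report the 6,856 rows stated above.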