Dataset asset | Open Source Community | Image Processing | User Interface

agentsea/wave-ui

Each sample in this dataset has 13 fields: image, instruction, bounding box (bbox), resolution, source, platform, name, description, type, OCR, language, purpose, and expectation. The dataset is divided into training, validation, and test splits of 63,530, 7,944, and 7,938 samples respectively (79,412 in total). The download size is 3,400,799,177 bytes, and the total dataset size is 34,533,114,093.5 bytes. Judging by the fields, the dataset suits tasks that combine image processing and natural language processing, particularly those that pair annotated images with textual instructions.

Source: hugging_face
Created: Nov 28, 2025
Updated: Sep 26, 2024

Dataset Overview

Feature Information

  • image: Image data, data type is image.
  • instruction: Instruction text, data type is string.
  • bbox: Bounding box, data type is a sequence of floats.
  • resolution: Resolution, data type is a sequence of integers.
  • source: Data source, data type is string.
  • platform: Platform information, data type is string.
  • name: Name, data type is string.
  • description: Description, data type is string.
  • type: Type, data type is string.
  • OCR: OCR text, data type is string.
  • language: Language, data type is string.
  • purpose: Purpose, data type is string.
  • expectation: Expectation, data type is string.
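
The field list above can be turned into a simple per-sample check. The following is a minimal sketch, not part of the dataset's tooling: the helper names (`check_record`, `_is_sequence_of`) and the sample values are hypothetical, and it assumes the values arrive as plain Python strings, lists, and numbers once decoded. The listing does not say whether any field may be empty, so this sketch treats a missing or `None` value as a problem, which may be stricter than the real data allows.

```python
from typing import Any

# The ten fields the listing describes as strings.
STRING_FIELDS = (
    "instruction", "source", "platform", "name", "description",
    "type", "OCR", "language", "purpose", "expectation",
)


def _is_sequence_of(value: Any, item_type: type) -> bool:
    """True if value is a list/tuple whose items are all item_type (bools excluded)."""
    return isinstance(value, (list, tuple)) and all(
        isinstance(v, item_type) and not isinstance(v, bool) for v in value
    )


def check_record(record: dict[str, Any]) -> list[str]:
    """Return a list of problems found in one sample (empty if none)."""
    problems = []
    if "image" not in record:
        problems.append("image: missing")
    for field in STRING_FIELDS:
        if not isinstance(record.get(field), str):
            problems.append(f"{field}: expected a string")
    if not _is_sequence_of(record.get("bbox"), float):
        problems.append("bbox: expected a sequence of floats")
    if not _is_sequence_of(record.get("resolution"), int):
        problems.append("resolution: expected a sequence of integers")
    return problems


# Hypothetical sample: every value below is invented for illustration.
sample = {
    "image": object(),  # stand-in for a decoded image
    "instruction": "click the search button",
    "bbox": [10.0, 20.0, 110.0, 60.0],
    "resolution": [1920, 1080],
    "source": "example", "platform": "web", "name": "search button",
    "description": "a button", "type": "button", "OCR": "Search",
    "language": "English", "purpose": "search", "expectation": "results load",
}

print(check_record(sample))
```

The check deliberately makes no assumption about how many numbers `bbox` or `resolution` contain, or about the coordinate convention, since the listing specifies only the element types.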

Data Split

  • train: Training set, containing 63,530 samples, size 27,758,382,015.75 bytes.
  • validation: Validation set, containing 7,944 samples, size 3,476,872,245.0 bytes.
  • test: Test set, containing 7,938 samples, size 3,297,859,832.75 bytes.
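
As a quick sanity check on the split counts above, the short sketch below computes the total and each split's share. The variable and function names are illustrative only.

```python
# Sample counts per split, copied from the listing.
SPLIT_SAMPLES = {"train": 63_530, "validation": 7_944, "test": 7_938}


def split_fractions(counts: dict[str, int]) -> dict[str, float]:
    """Return each split's share of the total number of samples."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


total_samples = sum(SPLIT_SAMPLES.values())
fractions = split_fractions(SPLIT_SAMPLES)

print(f"total samples: {total_samples}")
for name, share in fractions.items():
    print(f"{name}: {share:.1%}")
```

The counts work out to roughly an 80/10/10 split across training, validation, and test.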

Dataset Size

  • Download size: 3,400,799,177 bytes
  • Total dataset size: 34,533,114,093.5 bytes
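
The byte counts are easier to reason about in gigabytes. The sketch below, with illustrative names, confirms that the per-split sizes add up to the stated total and compares the total with the download size. The listing does not explain why the two differ; presumably the download consists of compressed files while the total reflects the decoded data, but that is an assumption.

```python
# Sizes in bytes, copied from the listing.
SPLIT_BYTES = {
    "train": 27_758_382_015.75,
    "validation": 3_476_872_245.0,
    "test": 3_297_859_832.75,
}
DOWNLOAD_BYTES = 3_400_799_177
TOTAL_BYTES = 34_533_114_093.5


def to_gb(n_bytes: float) -> float:
    """Convert bytes to decimal gigabytes (1 GB = 10**9 bytes)."""
    return n_bytes / 1e9


summed = sum(SPLIT_BYTES.values())
ratio = TOTAL_BYTES / DOWNLOAD_BYTES

print(f"splits sum to stated total: {summed == TOTAL_BYTES}")
print(f"download: {to_gb(DOWNLOAD_BYTES):.2f} GB")
print(f"total:    {to_gb(TOTAL_BYTES):.2f} GB")
print(f"total / download: {ratio:.1f}x")
```

If the assumption holds, plan for about ten times the download size in disk or memory once the data is decoded.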

Configuration Information

  • config_name: default
    • data_files:
      • train: path is data/train-*
      • validation: path is data/validation-*
      • test: path is data/test-*
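
The configuration above maps each split to a glob pattern. The following sketch shows how such patterns assign a file to a split, and how the dataset might be loaded. It assumes that the third-party Hugging Face `datasets` library is installed and that the name shown in this listing, agentsea/wave-ui, is the repository ID. The helper names and the shard file names in the usage lines are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Glob patterns per split, copied from the "default" config in the listing.
DATA_FILES = {
    "train": "data/train-*",
    "validation": "data/validation-*",
    "test": "data/test-*",
}


def split_for(path: str) -> Optional[str]:
    """Return the split whose pattern matches path, or None if none does."""
    for split, pattern in DATA_FILES.items():
        if fnmatchcase(path, pattern):
            return split
    return None


def load_split(split: str):
    """Stream one split from the Hub (needs network access and `datasets`)."""
    from datasets import load_dataset  # imported lazily: third-party dependency

    # streaming=True avoids downloading the whole dataset up front.
    return load_dataset("agentsea/wave-ui", split=split, streaming=True)


# Hypothetical shard names, used only to exercise the pattern matching.
print(split_for("data/train-00000-of-00010.parquet"))
print(split_for("README.md"))
```

Streaming is used here because of the dataset's size; dropping `streaming=True` would download and cache the full split instead.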