Dataset asset: Open Source Community, Image Processing, User Interface
agentsea/wave-ui
This dataset has 13 fields: image, instruction, bounding box (bbox), resolution, source, platform, name, description, type, OCR, language, purpose, and expectation. It is split into training, validation, and test sets of 63,530, 7,944, and 7,938 samples respectively (79,412 in total). The download size is 3,400,799,177 bytes (about 3.4 GB) and the total dataset size is 34,533,114,093.5 bytes (about 34.5 GB). Judging by the fields, the dataset suits tasks that combine image processing and natural-language processing, particularly those that pair annotated image regions with textual instructions.
Source
hugging_face
Created
Nov 28, 2025
Updated
Sep 26, 2024
Dataset Overview
Feature Information
- image: Image data, data type is image.
- instruction: Instruction text, data type is string.
- bbox: Bounding box, data type is a sequence of floats.
- resolution: Resolution, data type is a sequence of integers.
- source: Data source, data type is string.
- platform: Platform information, data type is string.
- name: Name, data type is string.
- description: Description, data type is string.
- type: Type, data type is string.
- OCR: OCR text, data type is string.
- language: Language, data type is string.
- purpose: Purpose, data type is string.
- expectation: Expectation, data type is string.
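The field list above can be mirrored as a typed record. What follows is a minimal sketch using only the Python standard library, not the dataset's official schema: the class name `WaveUIRecord`, the helper `check_record`, and every value in `example_row` are hypothetical, and the `image` field is left untyped because its in-memory form depends on the loader.

```python
from dataclasses import dataclass, fields
from typing import Any, List


@dataclass
class WaveUIRecord:
    """One row, mirroring the 13 listed fields (names and order as listed)."""
    image: Any             # image payload; concrete type depends on the loader
    instruction: str
    bbox: List[float]      # sequence of floats; coordinate convention not stated in the source
    resolution: List[int]  # sequence of integers
    source: str
    platform: str
    name: str
    description: str
    type: str
    OCR: str
    language: str
    purpose: str
    expectation: str


def check_record(row: dict) -> WaveUIRecord:
    """Build a record from a dict, rejecting missing fields and wrongly typed sequences."""
    expected = [f.name for f in fields(WaveUIRecord)]
    missing = [k for k in expected if k not in row]
    if missing:
        raise KeyError(f"missing fields: {missing}")
    if not all(isinstance(v, float) for v in row["bbox"]):
        raise TypeError("bbox must be a sequence of floats")
    if not all(isinstance(v, int) for v in row["resolution"]):
        raise TypeError("resolution must be a sequence of integers")
    return WaveUIRecord(**{k: row[k] for k in expected})


# Hypothetical row for illustration only; none of these values come from the dataset.
example_row = {
    "image": b"",
    "instruction": "Click the search button",
    "bbox": [10.0, 20.0, 110.0, 60.0],
    "resolution": [1280, 720],
    "source": "example",
    "platform": "web",
    "name": "Search",
    "description": "A button that submits the query",
    "type": "button",
    "OCR": "Search",
    "language": "English",
    "purpose": "Submit a search",
    "expectation": "Results page opens",
}

record = check_record(example_row)
print(record.name, record.bbox)
```

The string fields are not validated here, since the source does not say whether they may be empty or null.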
Data Split
- train: Training set, containing 63,530 samples, size 27,758,382,015.75 bytes.
- validation: Validation set, containing 7,944 samples, size 3,476,872,245.0 bytes.
- test: Test set, containing 7,938 samples, size 3,297,859,832.75 bytes.
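The split figures above can be cross-checked with a little arithmetic. This sketch only restates the listed numbers; the variable names are illustrative.

```python
# (samples, size in bytes) per split, copied from the list above
SPLITS = {
    "train": (63_530, 27_758_382_015.75),
    "validation": (7_944, 3_476_872_245.0),
    "test": (7_938, 3_297_859_832.75),
}

total_samples = sum(n for n, _ in SPLITS.values())
total_bytes = sum(b for _, b in SPLITS.values())

# Share of all samples held by each split
fractions = {name: n / total_samples for name, (n, _) in SPLITS.items()}

# Average stored size of one sample in each split
avg_bytes = {name: b / n for name, (n, b) in SPLITS.items()}

print(total_samples)  # 79412
print(total_bytes)    # 34533114093.5
for name in SPLITS:
    print(f"{name}: {fractions[name]:.1%} of samples, {avg_bytes[name] / 1e6:.2f} MB per sample")
```

The shares come out very close to an 80/10/10 split, and the per-split sizes add up exactly to the stated total dataset size.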
Dataset Size
- Download size: 3,400,799,177 bytes
- Total dataset size: 34,533,114,093.5 bytes
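To put the two byte counts in familiar units, they can be converted as sketched below. The helper `human` is hypothetical, written for this illustration, and supports both decimal (GB) and binary (GiB) units.

```python
DOWNLOAD_BYTES = 3_400_799_177
TOTAL_BYTES = 34_533_114_093.5


def human(n_bytes: float, binary: bool = False) -> str:
    """Format a byte count with two decimals, in decimal or binary units."""
    base = 1024.0 if binary else 1000.0
    units = ["B", "KiB", "MiB", "GiB", "TiB"] if binary else ["B", "kB", "MB", "GB", "TB"]
    value = float(n_bytes)
    for unit in units:
        # Stop at the first unit where the value fits, or at the largest unit
        if value < base or unit == units[-1]:
            return f"{value:.2f} {unit}"
        value /= base


print(human(DOWNLOAD_BYTES))             # 3.40 GB
print(human(TOTAL_BYTES))                # 34.53 GB
print(human(DOWNLOAD_BYTES, binary=True))
print(f"total / download = {TOTAL_BYTES / DOWNLOAD_BYTES:.2f}")
```

The total dataset size is roughly ten times the download size, so it is worth planning disk space around the larger figure.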
Configuration Information
- config_name: default
- data_files:
  - train: path is data/train-*
  - validation: path is data/validation-*
  - test: path is data/test-*
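The `data_files` patterns in the default configuration map file paths to splits. The sketch below illustrates that mapping with standard-library glob matching; the function `split_for` and the shard file names are hypothetical, since the source lists only the patterns and not the actual files.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Patterns from the default configuration
DATA_FILES = {
    "train": "data/train-*",
    "validation": "data/validation-*",
    "test": "data/test-*",
}


def split_for(path: str) -> Optional[str]:
    """Return the split whose pattern matches the path, or None if none does."""
    for split, pattern in DATA_FILES.items():
        # fnmatchcase is case-sensitive on every OS; note that "*" also matches "/"
        if fnmatchcase(path, pattern):
            return split
    return None


# Hypothetical shard names, for illustration only
for path in ["data/train-00000-of-00002.parquet",
             "data/validation-00000-of-00001.parquet",
             "README.md"]:
    print(path, "->", split_for(path))
```

In practice the Hugging Face `datasets` library would normally resolve these patterns itself, for example via `load_dataset("agentsea/wave-ui", split="validation")`, assuming the library is installed and the repository is reachable.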