RichHF-18K
The RichHF‑18K dataset contains the rich human‑feedback labels we collected for our CVPR 2024 paper, along with the original filenames of the labeled images. It includes subjective scores (e.g., aesthetic ratings), human‑annotated heatmaps (e.g., regions of pixel‑level distortion), and misalignment marks on the text prompts. The dataset consists of 17,760 examples in TensorFlow Example format: 15,810 training examples, 995 development examples, and 955 test examples.
Description
RichHF-18K Dataset Overview
Dataset Description
- Content: Rich human feedback labels for text-to-image generation.
- Size: 17,760 examples.
  - Training: 15,810 examples.
  - Development: 995 examples.
  - Test: 955 examples.
- Format: TensorFlow Example format.
- Included Data:
  - Filenames of the original images from the Pick-a-Pic v1 dataset.
  - Subjective scores (aesthetics score, artifact score, misalignment score, overall score).
  - Heatmaps (artifact map, misalignment map).
  - Token-level labels for misaligned tokens in the prompt (prompt_misalignment_label).
Data Fields
- filename: Original image filename.
- aesthetics_score: Aesthetics score.
- artifact_score: Artifact score.
- misalignment_score: Text-image misalignment score.
- overall_score: Overall score.
- artifact_map: Artifact heatmap.
- misalignment_map: Misalignment heatmap.
- prompt_misalignment_label: Token-level labels for misaligned tokens in the prompt.
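As a rough illustration of how prompt_misalignment_label might be lined up with a prompt, the sketch below pairs each whitespace-separated token with one label. It assumes the label is stored as a space-separated string of integers with one entry per token, and that the prompt text has been obtained separately (it is not among the fields listed above). The helper name pair_tokens_with_labels is hypothetical, and the sketch does not say which label value marks a misaligned token; check that against real records.

```python
def pair_tokens_with_labels(prompt, label_string):
    """Pair each whitespace-separated prompt token with its integer label.

    Assumption: label_string looks like "1 0 1", one integer per token.
    """
    tokens = prompt.split()
    labels = [int(value) for value in label_string.split()]
    if len(tokens) != len(labels):
        # A mismatch suggests the tokenization assumption does not hold.
        raise ValueError(
            f"{len(tokens)} tokens but {len(labels)} labels; "
            "the prompt may be tokenized differently"
        )
    return list(zip(tokens, labels))
```

If the lengths disagree, the whitespace tokenization assumed here is probably not the one used during annotation, so the function raises an error instead of silently truncating.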
Data Usage
- Loading: The TFRecord file can be loaded using tf.data.TFRecordDataset.
- Parsing and Matching: Detailed parsing and matching of misalignment labels to each token in the prompt can be found in the Google Research GitHub repository.
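The storage type of each field is not stated here, so a cautious first step is to open one record and see how its features are encoded before writing a parsing spec. The following is a minimal sketch of that check; the function name inspect_first_record is hypothetical, and the file path passed to it is whatever the downloaded split is called.

```python
import tensorflow as tf


def inspect_first_record(path):
    """Return {feature name: storage kind} for the first record in a TFRecord file."""
    for raw in tf.data.TFRecordDataset(path).take(1):
        # Each record is a serialized tf.train.Example protocol buffer.
        example = tf.train.Example.FromString(raw.numpy())
        return {
            name: feature.WhichOneof("kind")  # bytes_list, float_list, or int64_list
            for name, feature in example.features.feature.items()
        }
    return {}  # the file held no records
```

The returned mapping can then guide a tf.io.FixedLenFeature spec for tf.io.parse_single_example, using tf.string for bytes_list, tf.float32 for float_list, and tf.int64 for int64_list fields.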
Additional Notes
- The dataset does not contain the original images, only their filenames.
- For every score, higher is better: a higher value indicates better quality or fewer artifacts.
Source
Organization: github
Created: 6/14/2024