Multi-Mask Inpainting Dataset
The dataset is intended for multi‑mask image inpainting tasks. It contains images downloaded through the WikiArt API, together with global and object‑level annotations generated by the Kosmos‑2 and LLaVA models. Building it involves downloading the images, generating masks, and constructing an entity dataset.
Updated 12/2/2024
Description
I Dream My Painting: Connecting MLLMs and Diffusion Models via Prompt Generation for Text‑Guided Multi‑Mask Inpainting (WACV 2025)
Dataset Overview
Dataset Preparation
Image Download
- Use the WikiArt API to download images with the following command:
python -m inpainting.data.downloader download-and-save-images-wikiart-v2 -o data/mm_inp_dataset/images
- After completion, the image count should be 116,475.
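The expected image count can be checked with a short script once the download finishes. This is a minimal sketch, not part of the repository: the helper names `count_images` and `check_download` and the list of file extensions are assumptions made for illustration. Only the output directory and the figure of 116,475 come from the description above.

```python
from pathlib import Path

# Expected count taken from the dataset description.
EXPECTED_IMAGE_COUNT = 116_475

# Assumption: the downloader saves files with one of these extensions.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def count_images(image_dir):
    """Count files under image_dir (recursively) that look like images."""
    root = Path(image_dir)
    return sum(
        1
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
    )


def check_download(image_dir, expected=EXPECTED_IMAGE_COUNT):
    """Return True when the number of images found matches the expected count."""
    return count_images(image_dir) == expected
```

Calling `check_download("data/mm_inp_dataset/images")` returns `True` only when the directory holds exactly the expected number of image files. A mismatch may point to an interrupted download or to an extension missing from the assumed list.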
Dataset Construction
- The dataset includes global image annotations and object‑level annotations.
- Construction steps:
- Generate masks from annotations (≈10 min).
- Build the entity dataset (≈10 min).
- Extract noun‑phrase roots using SpaCy (≈2 min).
- Generate mask descriptions with LLaVA‑1.6‑Vicuna‑13B (optional, time‑consuming).
- Move LLaVA annotations to the entity directory (≈5 s).
- Clean and save LLaVA annotations (≈10 s).
- Split the dataset (skip if already split).
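The final step, splitting, is not detailed in this description. As an illustration only, one common way to produce a reproducible split is to hash each image identifier so that every mask belonging to an image ends up in the same partition. The function name `assign_split`, the 80/10/10 proportions, and the use of SHA-256 are all assumptions, not the repository's procedure.

```python
import hashlib


def assign_split(image_id, val_percent=10, test_percent=10):
    """Map an image identifier to 'train', 'val', or 'test' deterministically.

    Hypothetical helper: the proportions are illustrative defaults, not the
    ones used by the dataset authors.
    """
    digest = hashlib.sha256(image_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable value in the range 0..99
    if bucket < test_percent:
        return "test"
    if bucket < test_percent + val_percent:
        return "val"
    return "train"
```

Because the assignment depends only on the identifier, rerunning the step gives the same partition, which fits the note that splitting can be skipped when a split already exists.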
Dataset Structure
- Each image is associated with multiple masks; each mask corresponds to an object crop and a LLaVA‑generated object‑level description.
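The one-image-to-many-masks relationship could be modelled in memory as follows. This is a sketch under stated assumptions: the class names and every field name (`mask_path`, `crop_path`, `description`, `image_path`, `masks`) are hypothetical and do not reflect the files or keys the dataset actually uses.

```python
from dataclasses import dataclass, field


@dataclass
class MaskAnnotation:
    """One masked region of an image (all field names are hypothetical)."""
    mask_path: str    # binary mask for the region
    crop_path: str    # object crop corresponding to the mask
    description: str  # LLaVA-generated object-level description


@dataclass
class ImageRecord:
    """An image together with its masks (all field names are hypothetical)."""
    image_path: str
    masks: list = field(default_factory=list)

    def prompts(self):
        """Return one text description per mask, in mask order."""
        return [mask.description for mask in self.masks]
```

Keeping the descriptions aligned with the masks in a single list makes it straightforward to pair each region with its own text prompt during multi‑mask inpainting.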
Model Training and Testing
Model Download
- Retrieve model weights from Google Drive:
- LLaVA‑MultiMask: Extract and place in models/llava.
- SD‑2‑Inp‑RCA‑FineTuned: Extract and place in models/sd.
Experimental Results
- Commands are provided for training and testing various models, including LLaVA‑Prompt, LLaVA‑1Mask, LLaVA‑MultiMask, etc.
- Multi‑mask inpainting results include metrics such as FID, LPIPS, PSNR, CLIP‑IQA, CLIPSim‑I2I, and CLIPSim‑T2I.
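Of the listed metrics, PSNR is the simplest to state: it is 10·log10(MAX² / MSE), where MSE is the mean squared error between the reference and the inpainted image and MAX is the largest possible pixel value. The sketch below applies that standard formula to flat sequences of pixel values. How the paper computes it (for example, over the whole image or only the masked regions, and with which library) is not stated here, so the function name `psnr` and the default of 255 are assumptions.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(restored):
        raise ValueError("inputs must have the same number of pixels")
    if len(reference) == 0:
        raise ValueError("inputs must not be empty")
    # Mean squared error over all pixel values.
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs: no error at all
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher PSNR means the inpainted pixels are closer to the reference. FID and the CLIP‑based scores, by contrast, rely on pretrained networks and are not reproduced here.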
Citation
- If you use this dataset, please cite the associated paper:
@inproceedings{fanelli2025idream,
  title = {I Dream My Painting: Connecting MLLMs and Diffusion Models via Prompt Generation for Text‑Guided Multi‑Mask Inpainting},
  author = {Nicola Fanelli and Gennaro Vessio and Giovanna Castellano},
  year = {2025},
  booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision}
}
Topics
Image Restoration
Computer Vision
Source
Organization: github
Created: 12/2/2024