I Dream My Painting: Connecting MLLMs and Diffusion Models via Prompt Generation for Text‑Guided Multi‑Mask Inpainting (WACV 2025)

Dataset Overview

Dataset Preparation

Image Download

Use the WikiArt API to download images with the following command:

python -m inpainting.data.downloader download-and-save-images-wikiart-v2 -o data/mm_inp_dataset/images

After the download completes, data/mm_inp_dataset/images should contain 116,475 images.
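
A quick way to verify the download is to count the image files in the output directory. The snippet below is a minimal sketch, not part of the repository: the helper `count_images` and the set of file extensions are assumptions, so adjust them to whatever the downloader actually writes.

```python
from pathlib import Path

EXPECTED_IMAGE_COUNT = 116_475  # count stated in this README

# Assumption: the downloader stores images with one of these extensions.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def count_images(root, extensions=IMAGE_EXTENSIONS):
    """Count image files under `root`, searching subdirectories as well."""
    root = Path(root)
    if not root.is_dir():
        return 0
    return sum(
        1
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in extensions
    )


if __name__ == "__main__":
    found = count_images("data/mm_inp_dataset/images")
    status = "OK" if found == EXPECTED_IMAGE_COUNT else "MISMATCH"
    print(f"{status}: found {found} images, expected {EXPECTED_IMAGE_COUNT}")
```

A mismatch usually means the download was interrupted; how to resume it depends on the downloader and is not covered here.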

Dataset Construction

The dataset includes global image annotations and object‑level annotations.
Construction steps:
1. Generate masks from annotations (≈10 min).
2. Build the entity dataset (≈10 min).
3. Extract noun‑phrase roots using SpaCy (≈2 min).
4. Generate mask descriptions with LLaVA‑1.6‑Vicuna‑13B (optional, time‑consuming).
5. Move LLaVA annotations to the entity directory (≈5 s).
6. Clean and save LLaVA annotations (≈10 s).
7. Split the dataset (skip if already split).
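
The last step can be illustrated with a short sketch. This is not the repository's split script: the function name `split_image_ids`, the split fractions, and the seed are all hypothetical. The point it shows is that splitting by image ID, rather than by mask, keeps all masks of one painting in the same subset.

```python
import random


def split_image_ids(image_ids, val_fraction=0.05, test_fraction=0.05, seed=0):
    """Split image IDs into train/val/test subsets, deterministically.

    The fractions and seed are illustrative assumptions, not the values
    used to build the released dataset.
    """
    ids = sorted(set(image_ids))  # sort first so the shuffle is reproducible
    rng = random.Random(seed)
    rng.shuffle(ids)

    n_val = int(len(ids) * val_fraction)
    n_test = int(len(ids) * test_fraction)
    return {
        "val": ids[:n_val],
        "test": ids[n_val:n_val + n_test],
        "train": ids[n_val + n_test:],
    }


if __name__ == "__main__":
    splits = split_image_ids([f"img_{i:04d}" for i in range(1000)])
    print({name: len(members) for name, members in splits.items()})
```

If the dataset already ships with a split, reuse that split instead so that results stay comparable.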

Dataset Structure

Each image is associated with multiple masks; each mask corresponds to an object crop and a LLaVA‑generated object‑level description.
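
The relationship described above can be sketched as a pair of records. The class and field names (`MaskEntry`, `ImageEntry`, `mask_path`, `crop_path`, `description`) and the example paths are hypothetical and do not reflect the on-disk layout, which this README does not specify.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MaskEntry:
    """One masked region of a painting (field names are illustrative)."""
    mask_path: str    # binary mask for the region
    crop_path: str    # object crop corresponding to the mask
    description: str  # LLaVA-generated object-level description


@dataclass
class ImageEntry:
    """One painting together with all of its masks."""
    image_path: str
    masks: List[MaskEntry] = field(default_factory=list)

    def prompts(self) -> List[str]:
        """Return one text prompt per mask, in mask order."""
        return [mask.description for mask in self.masks]


if __name__ == "__main__":
    entry = ImageEntry(
        image_path="images/example.jpg",  # hypothetical path
        masks=[
            MaskEntry("masks/example_0.png", "crops/example_0.jpg", "a red boat"),
            MaskEntry("masks/example_1.png", "crops/example_1.jpg", "a stone bridge"),
        ],
    )
    print(entry.prompts())
```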

Model Training and Testing

Model Download

Download the model weights from Google Drive and extract each archive into its target directory:
- LLaVA‑MultiMask: extract into models/llava.
- SD‑2‑Inp‑RCA‑FineTuned: extract into models/sd.

Experimental Results

The repository provides training and testing commands for several models, including LLaVA‑Prompt, LLaVA‑1Mask, and LLaVA‑MultiMask.
Multi‑mask inpainting results are reported with FID, LPIPS, PSNR, CLIP‑IQA, CLIPSim‑I2I, and CLIPSim‑T2I.
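
Of these metrics, PSNR is simple enough to show in a few lines. The sketch below uses the standard definition, PSNR = 10 · log10(MAX² / MSE), on flat sequences of pixel values. It is only an illustration of the formula, not the evaluation code behind the reported results; the function name `psnr` and the 8-bit default for `max_value` are assumptions.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    Assumes 8-bit pixel values by default (max_value=255).
    """
    if len(reference) != len(restored) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    original = [52, 55, 61, 66]
    inpainted = [54, 55, 60, 65]
    print(f"PSNR: {psnr(original, inpainted):.2f} dB")
```

Higher PSNR means the inpainted image is closer to the reference at the pixel level. The other metrics listed above rely on pretrained networks and are not covered by this sketch.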

Citation

If you use this dataset, please cite the associated paper:

@inproceedings{fanelli2025idream,
  title     = {I Dream My Painting: Connecting MLLMs and Diffusion Models via Prompt Generation for Text-Guided Multi-Mask Inpainting},
  author    = {Nicola Fanelli and Gennaro Vessio and Giovanna Castellano},
  year      = {2025},
  booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision}
}
