Multi-Mask Inpainting Dataset

The dataset is intended for multi‑mask image inpainting tasks. It contains images downloaded from the WikiArt API together with global and object‑level annotations generated by the Kosmos‑2 and LLaVA models. Creation involved image download, mask generation, and construction of an entity dataset.

Updated 12/2/2024

Description

I Dream My Painting: Connecting MLLMs and Diffusion Models via Prompt Generation for Text‑Guided Multi‑Mask Inpainting (WACV 2025)

Dataset Overview

Dataset Preparation

Image Download

  • Use the WikiArt API to download images with the following command:
    python -m inpainting.data.downloader download-and-save-images-wikiart-v2 -o data/mm_inp_dataset/images
    
  • When the download completes, the image directory should contain 116,475 images.
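
A quick way to confirm the expected count is to walk the output directory and count image files. The sketch below is not part of the project's tooling; it assumes the images are saved under data/mm_inp_dataset/images (the path passed to the download command) with common image extensions, and the extension list is a guess that may need adjusting.

```python
from pathlib import Path

# Assumption: downloaded files use one of these extensions; adjust if needed.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}
EXPECTED_COUNT = 116_475  # image count stated in the dataset description


def count_images(root):
    """Count files under root (recursively) whose suffix looks like an image."""
    root = Path(root)
    if not root.is_dir():
        return 0
    return sum(
        1
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
    )


if __name__ == "__main__":
    found = count_images("data/mm_inp_dataset/images")
    status = "OK" if found == EXPECTED_COUNT else "MISMATCH"
    print(f"{status}: found {found} images, expected {EXPECTED_COUNT}")
```

A mismatch usually means the download was interrupted or some files were saved with an extension not in the list.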

Dataset Construction

  • The dataset includes global image annotations and object‑level annotations.
  • Construction steps:
    1. Generate masks from annotations (≈10 min).
    2. Build the entity dataset (≈10 min).
    3. Extract noun‑phrase roots using spaCy (≈2 min).
    4. Generate mask descriptions with LLaVA‑1.6‑Vicuna‑13B (optional, time‑consuming).
    5. Move LLaVA annotations to the entity directory (≈5 s).
    6. Clean and save LLaVA annotations (≈10 s).
    7. Split the dataset (skip if already split).
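
The final step can be pictured as a deterministic split over image identifiers, so that all masks of one image land in the same partition. The sketch below only illustrates that idea: the function name, the fractions, and the seed are hypothetical, and the project's own split command may use different ratios or logic.

```python
import random


def split_image_ids(image_ids, val_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle image IDs reproducibly and cut them into train/val/test lists.

    Hypothetical helper: fractions and seed are illustrative defaults,
    not values taken from the dataset's actual split.
    """
    ids = sorted(image_ids)            # sort so the input order does not matter
    random.Random(seed).shuffle(ids)   # seeded shuffle gives a repeatable split
    n_val = int(len(ids) * val_fraction)
    n_test = int(len(ids) * test_fraction)
    return {
        "val": ids[:n_val],
        "test": ids[n_val:n_val + n_test],
        "train": ids[n_val + n_test:],
    }
```

Splitting by image rather than by mask avoids leaking crops of the same painting across partitions.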

Dataset Structure

  • Each image is associated with multiple masks; each mask corresponds to an object crop and a LLaVA‑generated object‑level description.
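
The one‑to‑many relationship described above can be modeled with two small record types. This is a sketch for orientation only: the class and field names are hypothetical and do not reflect the dataset's actual file layout or schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MaskEntry:
    """One masked region of an image (field names are illustrative)."""
    mask_path: str     # binary mask covering one object
    crop_path: str     # crop of the object the mask corresponds to
    description: str   # LLaVA-generated object-level description


@dataclass
class ImageEntry:
    """One image together with all of its masks."""
    image_path: str
    masks: List[MaskEntry] = field(default_factory=list)

    def prompts(self):
        """Return one text prompt per mask, in mask order."""
        return [mask.description for mask in self.masks]
```

For example, an ImageEntry with two MaskEntry items yields two prompts, one per region to inpaint.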

Model Training and Testing

Model Download

  • Retrieve model weights from Google Drive:
    • LLaVA‑MultiMask: Extract and place in models/llava.
    • SD‑2‑Inp‑RCA‑FineTuned: Extract and place in models/sd.
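
Before launching training or testing, it can help to confirm that both archives were extracted where the commands expect them. The helper below is a hypothetical convenience check, assuming only the two directory names given above (models/llava and models/sd); it makes no assumption about the files inside.

```python
from pathlib import Path

# Directories named in the model download instructions.
EXPECTED_MODEL_DIRS = ("models/llava", "models/sd")


def missing_model_dirs(project_root="."):
    """Return the expected model directories that are absent or empty."""
    root = Path(project_root)
    missing = []
    for relative in EXPECTED_MODEL_DIRS:
        directory = root / relative
        if not directory.is_dir() or not any(directory.iterdir()):
            missing.append(relative)
    return missing


if __name__ == "__main__":
    problems = missing_model_dirs()
    if problems:
        print("Missing or empty:", ", ".join(problems))
    else:
        print("Model directories look populated.")
```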

Experimental Results

  • Commands are provided for training and testing several model variants, including LLaVA‑Prompt, LLaVA‑1Mask, and LLaVA‑MultiMask.
  • Multi‑mask inpainting results include metrics such as FID, LPIPS, PSNR, CLIP‑IQA, CLIPSim‑I2I, and CLIPSim‑T2I.
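
Of the metrics listed, PSNR is the only one that needs no pretrained network, so it is easy to illustrate. The sketch below uses the standard definition, PSNR = 10 · log10(MAX² / MSE), on flat sequences of pixel values; it assumes 8‑bit images (MAX = 255) and is not the project's evaluation code.

```python
import math


def psnr(reference, reconstruction, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(reconstruction) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixel values.
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstruction)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher PSNR means the inpainted result is numerically closer to the reference. FID, LPIPS, and the CLIP‑based scores rely on learned feature extractors and are typically computed with dedicated libraries.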

Citation

  • If you use this dataset, please cite the associated paper:
    @inproceedings{fanelli2025idream,
      title     = {I Dream My Painting: Connecting MLLMs and Diffusion Models via Prompt Generation for Text‑Guided Multi‑Mask Inpainting},
      author    = {Nicola Fanelli and Gennaro Vessio and Giovanna Castellano},
      year      = {2025},
      booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision}
    }
    

Topics

Image Restoration
Computer Vision

Source

Organization: github

Created: 12/2/2024