
Polyp-Gen Dataset

The Polyp-Gen dataset is a realistic and diverse polyp image generation dataset for expanding endoscopic datasets. It contains 55,883 samples, including 29,640 polyp frames and 26,243 non‑polyp frames. Low‑quality images such as blurry, reflective, or ghosted frames were filtered out.

Updated 9/16/2024

Description

Polyp-Gen: Realistic and Diverse Polyp Image Generation for Endoscopic Dataset Expansion

Dataset Overview

  • Dataset Name: Polyp-Gen
  • Dataset Description: Realistic and diverse polyp image generation for expanding endoscopic datasets.
  • Source: The model was trained on the LDPolypVideo dataset.
  • Filtering: Low‑quality frames (blurry, reflective, or ghosted) were removed, leaving 55,883 samples: 29,640 polyp frames and 26,243 non‑polyp frames.
  • Download: The dataset can be downloaded here.
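
As a quick sanity check on the figures above, the two frame counts can be added and the class balance computed. This is only an illustrative sketch; the variable names are hypothetical and not taken from the project.

```python
# Frame counts quoted in the dataset overview.
polyp_frames = 29_640
non_polyp_frames = 26_243

# The two classes should add up to the reported sample count.
total = polyp_frames + non_polyp_frames
assert total == 55_883

# Fraction of frames that contain a polyp.
polyp_share = polyp_frames / total
print(f"total={total}, polyp share={polyp_share:.1%}")
```

The split is close to balanced, with slightly more polyp than non‑polyp frames.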

Training

  • Pre‑trained Model: Training starts from the Stable Diffusion Inpainting‑2 checkpoint, available on Hugging Face.
  • Training Script: Start training with:
    bash scripts/train.sh

Sampling

  • Sampling Examples: Demonstrates sampling with specific masks.
  • Checkpoint Download: Checkpoints are available here.
  • Sampling Script:
    python sample_one_image.py
    
  • Mask Proposer: Uses pretrained DINOv2 weights, available here.
    • Global Retrieval: Build a database and perform global retrieval:
      python GlobalRetrieval.py --data_path /path/of/non-polyp/images --database_path /path/to/build/database --image_path /path/of/query/image/
      
    • Local Matching: Perform local matching for a query image:
      python LocalMatching.py --ref_image /path/ref/image --ref_mask /path/ref/mask --query_image /path/query/image --mask_proposal /path/to/save/mask
      
    • Example:
      python LocalMatching.py --ref_image demos/img_1513_neg.jpg --ref_mask demos/mask_1513.jpg --query_image demos/img_1592_neg.jpg --mask_proposal gen_mask.jpg
      
    • Sample with Generated Mask: Use the mask written to --mask_proposal (gen_mask.jpg in the example) as the mask for sampling.
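
When proposing masks for many query images, the LocalMatching.py call above can be assembled from Python rather than typed by hand. The following is a minimal sketch that uses only the four flags shown in the commands above; the helper names `local_matching_cmd` and `propose_mask`, the `dry_run` switch, and the file-existence check are hypothetical conveniences and not part of the project.

```python
import shlex
import subprocess
from pathlib import Path


def local_matching_cmd(ref_image, ref_mask, query_image, mask_proposal,
                       script="LocalMatching.py"):
    """Build the argument list for LocalMatching.py from the documented flags."""
    return [
        "python", script,
        "--ref_image", str(ref_image),
        "--ref_mask", str(ref_mask),
        "--query_image", str(query_image),
        "--mask_proposal", str(mask_proposal),
    ]


def propose_mask(ref_image, ref_mask, query_image, mask_proposal, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    cmd = local_matching_cmd(ref_image, ref_mask, query_image, mask_proposal)
    print(shlex.join(cmd))
    if not dry_run:
        # Hypothetical convenience check: fail early if an input file is missing.
        for path in (ref_image, ref_mask, query_image):
            if not Path(path).is_file():
                raise FileNotFoundError(path)
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Same inputs as the demo command; dry_run=True only prints the command.
    propose_mask("demos/img_1513_neg.jpg", "demos/mask_1513.jpg",
                 "demos/img_1592_neg.jpg", "gen_mask.jpg")
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces. Setting `dry_run=False` runs the script from the repository root, assuming the checkpoints and DINOv2 weights are already in place.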

Acknowledgements

  • Code Foundations: The code builds on several open‑source projects; thanks to their authors.

Topics

Medical Imaging
Polyp Detection

Source

Organization: github

Created: 9/12/2024
