X2I

OmniGen, proposed by the Beijing Academy of Artificial Intelligence (BAAI), is a diffusion model for unified image generation. The X2I ("anything to image") dataset was built to train it and is described by its authors as the first large‑scale unified image‑generation dataset: it consolidates diverse tasks into a single format. It comprises roughly 100 million images covering text‑to‑image, multimodal‑to‑image, subject‑driven generation, and classical computer‑vision tasks. Because every task shares one format, a single model can be trained on all of them, which is intended to improve generalisation and multi‑task performance.
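
To make the idea of a single format concrete, here is a minimal Python sketch of how several tasks could be expressed as one record type. The field names (`instruction`, `input_images`, `output_image`), the `<img><|image_N|></img>` placeholder syntax, and all file paths are illustrative assumptions, not the confirmed X2I schema.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class UnifiedRecord:
    """One training example in a hypothetical unified format (assumed field names)."""
    instruction: str    # text prompt; may refer to input images via placeholders
    output_image: str   # path to the target image
    input_images: list = field(default_factory=list)  # empty for text-to-image


# Text-to-image: the instruction alone determines the output.
t2i = UnifiedRecord(
    instruction="A white cat resting on a picnic table.",
    output_image="images/cat.png",
)

# Image editing: one input image plus an edit instruction.
edit = UnifiedRecord(
    instruction="<img><|image_1|></img> Remove the hat.",
    output_image="images/woman_no_hat.png",
    input_images=["images/woman_hat.png"],
)

# A computer-vision task (depth estimation) phrased as image generation.
depth = UnifiedRecord(
    instruction="Generate the depth map of <img><|image_1|></img>.",
    output_image="images/street_depth.png",
    input_images=["images/street.png"],
)


def to_jsonl(records):
    """Serialise records to JSON Lines, one record per line."""
    return "\n".join(json.dumps(asdict(r)) for r in records)


print(to_jsonl([t2i, edit, depth]))
```

All three tasks share the same three fields, so a single data loader and a single model interface can consume them without task-specific branches.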

Updated 9/20/2024

Description

OmniGen Model Overview

1. Introduction

  • Name: OmniGen
  • Type: Unified image‑generation model
  • Features: Supports multimodal prompts for image generation without extra plugins or preprocessing steps
  • Goal: Provide a simple, flexible image‑generation paradigm

2. Core Capabilities

  • Text‑to‑Image Generation
  • Subject‑Driven Generation
  • Identity‑Preserving Generation
  • Image Editing
  • Conditional Image Generation
  • Referring‑Expression Generation (automatically identifies the objects in the input image that the prompt refers to)

3. Technical Highlights

  • Methodology: See paper arXiv:2409.11340
  • Advantages: Automatically identifies the features of the input images that the prompt requires (objects, human pose, depth maps, etc.), without separate detector or preprocessor models
  • Flexibility: New capabilities can be added via fine‑tuning

4. Resources

5. Usage

  • Installation:
    git clone https://github.com/VectorSpaceLab/OmniGen.git
    cd OmniGen
    pip install -e .
    
  • Quick‑Start Example: Provides examples for text‑to‑image and multimodal‑to‑image generation
  • Diffusers Integration: Supports usage through the Diffusers library
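
A text‑to‑image call and a multimodal call might look like the sketch below. It assumes that the package installed above exposes `OmniGenPipeline`, that weights are published under the model id `Shitao/OmniGen-v1`, and that input images are referenced in the prompt with `<img><|image_N|></img>` placeholders. None of these details is stated on this page, so check them against the repository before relying on them. The helper names `image_tag`, `generate`, and `main`, and the file paths, are hypothetical.

```python
def image_tag(index: int) -> str:
    """Return the prompt placeholder for the index-th input image (1-based, assumed syntax)."""
    if index < 1:
        raise ValueError("image indices start at 1")
    return f"<img><|image_{index}|></img>"


def generate(prompt, input_images=None, out_path="output.png", seed=0):
    """Run the pipeline once and save the first generated image."""
    # Imported lazily so image_tag() works without the package installed.
    from OmniGen import OmniGenPipeline  # assumed import path

    pipe = OmniGenPipeline.from_pretrained("Shitao/OmniGen-v1")  # assumed model id
    kwargs = dict(prompt=prompt, height=1024, width=1024,
                  guidance_scale=2.5, seed=seed)
    if input_images:
        # Extra arguments used only when the prompt references input images.
        kwargs.update(input_images=list(input_images), img_guidance_scale=1.6)
    images = pipe(**kwargs)
    images[0].save(out_path)
    return out_path


def main():
    # Text-to-image
    generate("A curly-haired man in a red shirt is drinking tea.",
             out_path="example_t2i.png")
    # Multimodal-to-image: the prompt refers to the first input image.
    generate(f"The man in {image_tag(1)} is reading a book.",
             input_images=["./man.jpg"],  # hypothetical path
             out_path="example_multimodal.png")
```

Calling `main()` downloads the model weights on first use, so it is best run on a machine with a suitable GPU. Note that `generate` reloads the pipeline on every call for simplicity; a real script would load it once and reuse it.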

6. Fine‑Tuning Support

7. License

  • License: MIT License

8. Citation

@article{xiao2024omnigen,
  title={Omnigen: Unified image generation},
  author={Xiao, Shitao and Wang, Yueze and Zhou, Junjie and Yuan, Huaying and Xing, Xingrun and Yan, Ruiran and Wang, Shuting and Huang, Tiejun and Liu, Zheng},
  journal={arXiv preprint arXiv:2409.11340},
  year={2024}
}

Topics

Image Generation
Multi‑Task Processing

Source

Organization: github

Created: 9/17/2024
