X2I

OmniGen, proposed by the Beijing Academy of Artificial Intelligence (BAAI), is a diffusion model for unified image generation. The X2I dataset was built to train it and is described by its authors as the first large-scale unified image-generation dataset: it consolidates diverse tasks into a single format. It comprises roughly 100 million images covering text-to-image, multimodal-to-image, and subject-driven generation, as well as classic computer-vision tasks. Because every task shares one format, a single model can be trained on all of them, with the aim of improving generalisation and multi-task performance.
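
To make the idea of a single format concrete, here is a minimal sketch of how different tasks could reduce to one record shape: an instruction, zero or more input images, and one output image. The class `UnifiedSample`, its field names, the helper functions, and the file names are hypothetical illustrations, not the published X2I schema. The `<img><|image_N|></img>` placeholder syntax is assumed here as the way an instruction refers to its input images.

```python
from dataclasses import dataclass, field
from typing import List


def placeholder(index: int) -> str:
    """Marker that stands in for the index-th input image (1-based) inside an instruction."""
    return f"<img><|image_{index}|></img>"


@dataclass
class UnifiedSample:
    # Hypothetical field names; not the published X2I schema.
    task: str
    instruction: str
    input_images: List[str] = field(default_factory=list)
    output_image: str = ""


def text_to_image(caption: str, target: str) -> UnifiedSample:
    # Pure text prompt: no conditioning images.
    return UnifiedSample("text_to_image", caption, [], target)


def image_editing(source: str, edit: str, target: str) -> UnifiedSample:
    # The source image is referenced from the instruction via its placeholder.
    return UnifiedSample("image_editing", f"{placeholder(1)} {edit}", [source], target)


def depth_estimation(source: str, target: str) -> UnifiedSample:
    # A vision task recast as generation: the "answer" is itself an image.
    return UnifiedSample(
        "depth_estimation",
        f"Generate the depth map of {placeholder(1)}.",
        [source],
        target,
    )


samples = [
    text_to_image("A white cat on a windowsill.", "cat.png"),
    image_editing("street.png", "Make the umbrella red.", "street_red.png"),
    depth_estimation("room.png", "room_depth.png"),
]

for s in samples:
    print(f"{s.task}: {len(s.input_images)} input image(s) -> {s.output_image}")
```

All three records have the same fields, so one data loader and one model can consume them without task-specific branches.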

Updated 9/20/2024

Description

OmniGen Dataset Overview

1. Introduction

Name: OmniGen
Type: Unified image‑generation model
Features: Accepts interleaved text-and-image prompts directly, without extra plugins or preprocessing steps
Goal: Provide a simple, flexible image‑generation paradigm

2. Core Capabilities

Text‑to‑Image Generation
Subject-Driven Generation
Identity‑Preserving Generation
Image Editing
Conditional Image Generation
Referring-Expression Generation (the model itself locates the objects named in the prompt within the input image)

3. Technical Highlights

Methodology: See paper arXiv:2409.11340
Advantages: Identifies the features it needs in the input images (objects, human pose, depth maps, etc.) automatically, guided by the text prompt
Flexibility: New capabilities can be added via fine‑tuning

4. Resources

Model Weights: Shitao/OmniGen-v1
Demo Platforms:
- Hugging Face Demo
- Replicate Demo
Dataset: X2I Dataset

5. Usage

Installation:

git clone https://github.com/VectorSpaceLab/OmniGen.git
cd OmniGen
pip install -e .

Quick-Start Example: The repository provides examples for text-to-image and multimodal-to-image generation
Diffusers Integration: The model can also be used through the Diffusers library
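
The quick-start flow can be sketched roughly as follows. The `OmniGenPipeline` class, its `from_pretrained` method, the call arguments (`guidance_scale`, `img_guidance_scale`, `seed`), and the `<img><|image_1|></img>` placeholder are written as recalled from the repository README and should be checked against the installed version. `referenced_images`, `check_inputs`, and `generate` are hypothetical helpers introduced only for this example.

```python
import re
from typing import List, Optional

MODEL_ID = "Shitao/OmniGen-v1"  # weights listed under Resources
PLACEHOLDER = re.compile(r"<img><\|image_(\d+)\|></img>")


def referenced_images(prompt: str) -> List[int]:
    """Return the sorted, de-duplicated image indices referenced in the prompt."""
    return sorted({int(n) for n in PLACEHOLDER.findall(prompt)})


def check_inputs(prompt: str, input_images: Optional[List[str]]) -> None:
    """Raise ValueError unless the prompt references exactly images 1..N."""
    expected = list(range(1, len(input_images or []) + 1))
    found = referenced_images(prompt)
    if found != expected:
        raise ValueError(f"prompt references images {found}, expected {expected}")


def generate(prompt: str,
             input_images: Optional[List[str]] = None,
             out_path: str = "output.png",
             seed: int = 0) -> str:
    """Run text-to-image (no input images) or multimodal-to-image generation."""
    check_inputs(prompt, input_images)
    # Imported lazily: requires the OmniGen repository to be installed.
    from OmniGen import OmniGenPipeline

    pipe = OmniGenPipeline.from_pretrained(MODEL_ID)
    kwargs = dict(prompt=prompt, height=1024, width=1024,
                  guidance_scale=2.5, seed=seed)
    if input_images:
        # Assumed: image conditioning has its own guidance scale.
        kwargs.update(input_images=input_images, img_guidance_scale=1.6)
    images = pipe(**kwargs)
    images[0].save(out_path)
    return out_path
```

Calling `generate("A curly-haired man in a red shirt is drinking tea.")` would run text-to-image. Passing `input_images=["two_man.jpg"]` (a hypothetical file) with a prompt that contains `<img><|image_1|></img>` would run multimodal-to-image. The placeholder check runs before the model is loaded, so a mismatched prompt fails fast.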

6. Fine‑Tuning Support

LoRA Fine‑Tuning
Full Fine-Tuning
Training Script: train.py
Guide: docs/fine-tuning.md

7. License

License: MIT License

8. Citation

@article{xiao2024omnigen,
  title={OmniGen: Unified Image Generation},
  author={Xiao, Shitao and Wang, Yueze and Zhou, Junjie and Yuan, Huaying and Xing, Xingrun and Yan, Ruiran and Wang, Shuting and Huang, Tiejun and Liu, Zheng},
  journal={arXiv preprint arXiv:2409.11340},
  year={2024}
}

Topics

Image Generation

Multi‑Task Processing

Source

Organization: github

Created: 9/17/2024
