X2I
OmniGen, proposed by the Beijing Academy of Artificial Intelligence (BAAI), is a diffusion model for unified image generation. The X2I ("anything to image") dataset was built to train this model and is described as the first large‑scale unified image‑generation dataset: it consolidates diverse tasks into a single format. It comprises roughly 100 million images covering text‑to‑image, multimodal‑to‑image, subject‑driven generation, and computer‑vision tasks. With a shared format, a single model can be trained on multiple image‑generation tasks, which is intended to improve generalisation and multi‑task performance.
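To make the idea of a single format concrete, here is a minimal sketch of how samples from different tasks could share one record layout. The class name `UnifiedSample`, its field names, and the file paths are illustrative assumptions, not the documented X2I schema.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class UnifiedSample:
    """One training example. Field names are illustrative assumptions."""
    instruction: str   # text prompt; may refer to the input images
    output_image: str  # path of the target image
    input_images: list = field(default_factory=list)  # zero or more conditioning images


# Three different tasks expressed with the same record layout (hypothetical paths).
samples = [
    # Text-to-image: no conditioning image.
    UnifiedSample("A photo of a red bicycle.", "out/bicycle.png"),
    # Image editing: one input image plus an edit instruction.
    UnifiedSample("Remove the hat from the person in the image.",
                  "out/no_hat.png", ["in/person.png"]),
    # Computer-vision task phrased as generation: the target is a depth map.
    UnifiedSample("Estimate a depth map for the image.",
                  "out/depth.png", ["in/street.png"]),
]


def to_jsonl(records):
    """Serialise records as one JSON object per line."""
    return "\n".join(json.dumps(asdict(r)) for r in records)


print(to_jsonl(samples))
```

Because every task reduces to "instruction plus optional input images, mapped to an output image", a data loader written for this layout would not need task-specific branches.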
Description
OmniGen Overview
1. Introduction
- Name: OmniGen
- Type: Unified image‑generation model
- Features: Supports multimodal prompts for image generation without extra plugins or preprocessing steps
- Goal: Provide a simple, flexible image‑generation paradigm
2. Core Capabilities
- Text‑to‑Image Generation
- Subject‑Driven Generation
- Identity‑Preserving Generation
- Image Editing
- Image‑Conditioned Generation
- Referring Expression Generation (automatically identifies the objects referred to in the input images)
3. Technical Highlights
- Methodology: See paper arXiv:2409.11340
- Advantages: Automatically identifies the relevant features in input images (required objects, human pose, depth maps, etc.) based on the text prompt
- Flexibility: New capabilities can be added via fine‑tuning
4. Resources
- Model Weights: Shitao/OmniGen-v1
- Demo Platforms:
- Dataset: X2I Dataset
5. Usage
- Installation:
  git clone https://github.com/VectorSpaceLab/OmniGen.git
  cd OmniGen
  pip install -e .
- Quick‑Start Example: Provides examples for text‑to‑image and multimodal‑to‑image generation
- Diffusers Integration: Supports usage through the Diffusers library
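The quick-start flow listed above can be sketched as follows. This assumes the `OmniGenPipeline` interface and the `<img><|image_N|></img>` image-placeholder convention as described in the project repository. The argument values are examples only, and `check_prompt` and `generate` are hypothetical helpers added here for illustration. Verify names and defaults against the repository before relying on them.

```python
import re

# Placeholder that ties a prompt to the N-th entry of input_images
# (convention assumed from the project README).
IMAGE_TAG = re.compile(r"<img><\|image_(\d+)\|></img>")


def check_prompt(prompt, input_images=None):
    """Return True if the prompt references exactly images 1..N for N input images."""
    indices = sorted({int(i) for i in IMAGE_TAG.findall(prompt)})
    count = len(input_images or [])
    return indices == list(range(1, count + 1))


def generate(prompt, input_images=None, out_path="example.png", seed=0):
    """Run one generation. Requires the OmniGen package and model weights."""
    # Imported lazily so check_prompt works without the package installed.
    from OmniGen import OmniGenPipeline

    if not check_prompt(prompt, input_images):
        raise ValueError("image placeholders do not match input_images")

    pipe = OmniGenPipeline.from_pretrained("Shitao/OmniGen-v1")
    kwargs = dict(prompt=prompt, height=1024, width=1024,
                  guidance_scale=2.5, seed=seed)
    if input_images:
        # Multimodal-to-image: pass the images and an image guidance scale.
        kwargs.update(input_images=input_images, img_guidance_scale=1.6)
    images = pipe(**kwargs)
    images[0].save(out_path)
    return out_path


if __name__ == "__main__":
    # Text-to-image: no placeholders, no input images.
    generate("A curly-haired man in a red shirt is drinking tea.")
```

For multimodal-to-image generation, the same helper would be called with a prompt such as `"The man in <img><|image_1|></img> is reading a book."` and `input_images=["path/to/photo.jpg"]`, where the path is a placeholder for a local file.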
6. Fine‑Tuning Support
- LoRA Fine‑Tuning
- Full‑Fine‑Tuning Options
- Training Script: train.py
- Guide: docs/fine-tuning.md
7. License
- License: MIT License
8. Citation
@article{xiao2024omnigen,
title={Omnigen: Unified image generation},
author={Xiao, Shitao and Wang, Yueze and Zhou, Junjie and Yuan, Huaying and Xing, Xingrun and Yan, Ruiran and Wang, Shuting and Huang, Tiejun and Liu, Zheng},
journal={arXiv preprint arXiv:2409.11340},
year={2024}
}
Source
Organization: github
Created: 9/17/2024