Dataset asset · Open Source Community · Vision-Language Models · Multi-Concept Personalization
MC-LLaVA Multi-Concept Personalization Dataset
The MC-LLaVA Multi-Concept Personalization dataset is a high-quality collection designed to advance multi-concept personalization research. It gathers images featuring multiple characters from various movies and pairs them with manually generated multi-concept question‑answer samples. With diverse movie genres and QA types, the dataset aims to help vision‑language models excel at multi-concept personalization tasks.
Source: GitHub
Created: Nov 18, 2024
Updated: Nov 23, 2024
MC-LLaVA: Multi-Concept Personalized Vision-Language Model
Overview
- Name: MC-LLaVA
- Type: Multi-Concept Personalized Vision-Language Model
- Paper: MC-LLaVA: Multi-Concept Personalized Vision-Language Model
Model Features
- Multi-Concept Personalization: A joint training strategy lets MC-LLaVA integrate multiple concepts in a single training session, achieving multi-concept personalization.
- Visual Tag Information: Visual tag information initializes the concept labels, strengthening concept representations and accelerating joint training.
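As a rough illustration of the idea (all names, shapes, and the loss are hypothetical and not taken from the MC-LLaVA codebase), joint training might initialize one learnable embedding per concept from its visual-tag feature and then optimize all concepts together in the same session:

```python
import numpy as np

# Hypothetical sketch: one learnable embedding per concept, initialized
# from assumed visual-tag features and updated jointly in one session.
rng = np.random.default_rng(0)
EMBED_DIM = 8

def init_concept_embeddings(tag_features):
    """Initialize each concept embedding from its visual-tag feature vector."""
    return {name: feat.copy() for name, feat in tag_features.items()}

def joint_training_step(embeddings, targets, lr=0.1):
    """One joint gradient step: every concept moves toward its target at once."""
    for name, target in targets.items():
        grad = embeddings[name] - target  # gradient of 0.5 * ||x - t||^2
        embeddings[name] -= lr * grad
    return embeddings

# Toy example: two characters appearing in the same movie scene.
tags = {"character_a": rng.normal(size=EMBED_DIM),
        "character_b": rng.normal(size=EMBED_DIM)}
targets = {k: v + 0.5 for k, v in tags.items()}  # stand-in optimization targets
emb = init_concept_embeddings(tags)
for _ in range(100):
    joint_training_step(emb, targets)
```

Initializing from the tag features (rather than at random) starts each concept close to a meaningful representation, which is the intuition behind the reported speedup of joint training.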
Dataset
- Data Source: Images containing multiple characters collected from various movies, with manually generated multi-concept QA samples.
- Dataset Characteristics:
- Diverse movie genres
- Diverse question‑answer types
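To make the sample structure concrete, a multi-concept QA record could look like the following (this layout is a guess for illustration; the actual schema may differ, so consult the GitHub repository):

```python
import json

# Hypothetical record layout for one multi-concept QA sample:
# an image, the personalized concepts it contains, and a QA pair
# that references more than one concept.
sample = {
    "image": "movie_scene_001.jpg",
    "concepts": ["<character-a>", "<character-b>"],
    "question": "What is <character-a> handing to <character-b>?",
    "answer": "A book.",
}

# Round-trip through JSON, as samples would typically be stored on disk.
record = json.dumps(sample)
parsed = json.loads(record)
```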
Experimental Results
- Multi-Concept Personalized Response: In comprehensive qualitative and quantitative experiments, MC-LLaVA demonstrates strong multi-concept personalized response capabilities.