UniSim‑Bench is a multimodal perceptual similarity benchmark from New York University and EPFL, comprising seven perceptual similarity tasks drawn from 25 datasets. It covers image‑to‑text and related tasks and is designed to evaluate how well similarity metrics generalise across tasks. The benchmark aggregates existing perception tasks, and its companion models are trained on them with multi‑task learning. UniSim‑Bench is used to assess and improve multimodal perception models, especially for cross‑modal similarity evaluation and generative‑model quality assessment.
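As a rough illustration of the kind of cross‑modal similarity scoring such a benchmark evaluates, the sketch below scores an image‑caption pair with a CLIP model. It is not UniSim‑Bench's official evaluation code; the Hugging Face `transformers` package, the `openai/clip-vit-base-patch32` checkpoint, and the helper names are assumptions made for this example.

```python
# Hypothetical sketch of cross-modal (image-to-text) similarity scoring,
# the kind of task UniSim-Bench evaluates; not the benchmark's own code.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").eval()
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def image_text_score(image: Image.Image, caption: str) -> float:
    """Temperature-scaled cosine similarity between image and caption embeddings."""
    inputs = processor(text=[caption], images=image,
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        # logits_per_image holds the scaled cosine similarity of the pair.
        return model(**inputs).logits_per_image.item()

def pick_closer_caption(image: Image.Image, a: str, b: str) -> str:
    """2AFC-style judgement: return the caption the metric rates as more similar."""
    return a if image_text_score(image, a) >= image_text_score(image, b) else b
```

A benchmark task then compares such metric outputs against human labels, for example as accuracy on two‑alternative forced‑choice (2AFC) judgements.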
LongVALE: Vision-Audio-Language-Event Benchmark Towards Time-Aware Omni-Modal Perception of Long Videos