Explore high-quality datasets for your AI and machine learning projects.
The CBSD68 dataset is a color version of the BSD68 benchmark for image denoising, containing original .jpg files converted to lossless .png format and augmented with different levels of Gaussian white noise.
LiveBench is a large‑language‑model (LLM) benchmark created jointly by Abacus.AI, NYU, Nvidia, UMD, and USC. It contains 18 tasks spanning mathematics, programming, reasoning, language understanding, instruction following, and data analysis. LiveBench's questions are sourced from up‑to‑date materials such as recent math competitions, arXiv papers, news articles, and datasets, and answers are automatically scored against objective facts, eliminating the need for LLM or human judges. The benchmark aims to address data contamination issues in traditional evaluations, ensuring fairness and validity.
MME‑RealWorld is a benchmark dataset for multimodal large language models (MLLMs), containing 13,366 high‑resolution images and 29,429 manually annotated question‑answer pairs covering 43 tasks across five real‑world scenarios. It aims to address the limitations of existing benchmarks for practical applications, offering large scale, high quality, and challenging tasks. A Chinese version (MME‑RealWorld‑CN) with 5,917 QA pairs is also provided.