MMLU
Model Evaluation · Multidisciplinary Learning
MMLU (Massive Multitask Language Understanding) is a benchmark for evaluating language‑model performance across a broad range of subjects using multiple‑choice questions. It is also commonly used to assess models that have been fine‑tuned on multiple tasks.
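As a sketch of how MMLU‑style evaluation is typically scored (exact‑match accuracy over multiple‑choice answer letters), assuming hypothetical question data rather than the real benchmark items:

```python
# Minimal sketch of MMLU-style multiple-choice scoring.
# The answer letters below are hypothetical examples, not real MMLU items.

def score(predictions, answers):
    """Exact-match accuracy over choice letters (A-D)."""
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)

gold = ["B", "D", "A", "C"]    # reference answer letters
preds = ["B", "D", "C", "C"]   # a model's predicted letters

accuracy = score(preds, gold)  # 3 of 4 correct -> 0.75
```

In practice a harness extracts the model's chosen letter from its output before scoring, but the final metric reduces to this kind of exact‑match accuracy.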
Source: arXiv · Updated Apr 28, 2026