JUHE API Marketplace
DATASET
Open Source Community

OlympiadBench

OlympiadBench is an Olympic‑level bilingual multimodal scientific benchmark, containing 8,476 questions from Olympic‑level mathematics and physics competitions, including the Chinese Gaokao. Each question is accompanied by expert‑level step‑by‑step reasoning annotations. The dataset aims to challenge and advance AGI development through complex scientific problems. The best‑performing model, GPT‑4V, achieved an average score of only 17.97% on this benchmark, dropping to as low as 10.74% in the physics domain, highlighting the benchmark's rigor and the complexity of physical reasoning.

Updated 7/17/2024
huggingface

Description

Dataset Overview

Dataset Name

OlympiadBench

Dataset Description

OlympiadBench is an Olympic‑level bilingual multimodal scientific question benchmark, containing 8,476 questions from Olympic‑level mathematics and physics competitions, including the Chinese Gaokao. Each question includes expert‑level step‑by‑step reasoning annotations. The best‑performing model, GPT‑4V, achieved an average score of only 17.97% on OlympiadBench, and as low as 10.74% in the physics domain, highlighting the benchmark's rigor and the complexity of physical reasoning.

Dataset Configuration

  • OE_MM_maths_en_COMP: OlympiadBench/OE_MM_maths_en_COMP/OE_MM_maths_en_COMP.parquet
  • OE_MM_maths_zh_CEE: OlympiadBench/OE_MM_maths_zh_CEE/OE_MM_maths_zh_CEE.parquet
  • OE_MM_maths_zh_COMP: OlympiadBench/OE_MM_maths_zh_COMP/OE_MM_maths_zh_COMP.parquet
  • OE_MM_physics_en_COMP: OlympiadBench/OE_MM_physics_en_COMP/OE_MM_physics_en_COMP.parquet
  • OE_MM_physics_zh_CEE: OlympiadBench/OE_MM_physics_zh_CEE/OE_MM_physics_zh_CEE.parquet
  • OE_TO_maths_en_COMP: OlympiadBench/OE_TO_maths_en_COMP/OE_TO_maths_en_COMP.parquet
  • OE_TO_maths_zh_CEE: OlympiadBench/OE_TO_maths_zh_CEE/OE_TO_maths_zh_CEE.parquet
  • OE_TO_maths_zh_COMP: OlympiadBench/OE_TO_maths_zh_COMP/OE_TO_maths_zh_COMP.parquet
  • OE_TO_physics_en_COMP: OlympiadBench/OE_TO_physics_en_COMP/OE_TO_physics_en_COMP.parquet
  • OE_TO_physics_zh_CEE: OlympiadBench/OE_TO_physics_zh_CEE/OE_TO_physics_zh_CEE.parquet
  • TP_MM_maths_en_COMP: OlympiadBench/TP_MM_maths_en_COMP/TP_MM_maths_en_COMP.parquet
  • TP_MM_maths_zh_CEE: OlympiadBench/TP_MM_maths_zh_CEE/TP_MM_maths_zh_CEE.parquet
  • TP_MM_maths_zh_COMP: OlympiadBench/TP_MM_maths_zh_COMP/TP_MM_maths_zh_COMP.parquet
  • TP_MM_physics_en_COMP: OlympiadBench/TP_MM_physics_en_COMP/TP_MM_physics_en_COMP.parquet
  • TP_TO_maths_en_COMP: OlympiadBench/TP_TO_maths_en_COMP/TP_TO_maths_en_COMP.parquet
  • TP_TO_maths_zh_CEE: OlympiadBench/TP_TO_maths_zh_CEE/TP_TO_maths_zh_CEE.parquet
  • TP_TO_maths_zh_COMP: OlympiadBench/TP_TO_maths_zh_COMP/TP_TO_maths_zh_COMP.parquet
  • TP_TO_physics_en_COMP: OlympiadBench/TP_TO_physics_en_COMP/TP_TO_physics_en_COMP.parquet

Task Types

  • Question Answering
  • Visual Question Answering

Languages

  • Chinese
  • English

Tags

  • Mathematics
  • Physics

Dataset Size

  • 10M<n<100M

License

apache-2.0

AI studio

Generate PPTs instantly with Nano Banana Pro.

Generate PPT Now

Access Dataset

Login to Access

Please login to view download links and access full dataset details.

Topics

Evaluation Dataset
Artificial Intelligence

Source

Organization: huggingface

Created: 7/16/2024

Power Your Data Analysis with Premium AI Models

Supporting GPT-5, Claude-4, DeepSeek v3, Gemini and more.

Enjoy a free trial and save 20%+ compared to official pricing.