Dataset asset, Open Source Community, Natural Language Processing, Language Models

chiayewken/bamboogle

The Bamboogle dataset is intended for studying the compositionality gap in language models. It has two string features, question and answer, and a single test split of 125 examples totalling 10,747 bytes. The dataset accompanies the paper "Measuring and Narrowing the Compositionality Gap in Language Models" and is released under the MIT License.

Source: hugging_face
Created: Nov 28, 2025
Updated: Oct 27, 2023
Signals: 344 views
Availability: Linked source ready

Dataset Overview

Dataset Information

  • Features:
    • question: Data type is string.
    • answer: Data type is string.
  • Splits:
    • test: Contains 125 samples, total size 10,747 bytes.
  • Download Size: 8,383 bytes.
  • Dataset Size: 10,747 bytes.
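
The schema above (two string features, 125 rows in the test split) is simple enough to check before using the data. The following is a minimal sketch of such a check, not part of the dataset's own tooling. The function name `check_rows` and the placeholder rows are assumptions made for illustration; the placeholder rows are invented and are not taken from the dataset.

```python
from typing import Iterable, Mapping, Optional

# Feature names and row count as stated on the dataset card.
EXPECTED_FEATURES = ("question", "answer")
EXPECTED_TEST_ROWS = 125


def check_rows(rows: Iterable[Mapping[str, object]],
               expected_count: Optional[int] = None) -> int:
    """Verify every row has string `question` and `answer` fields.

    Returns the number of rows seen. Raises KeyError for a missing
    feature, TypeError for a non-string value, and ValueError if
    `expected_count` is given and does not match.
    """
    count = 0
    for index, row in enumerate(rows):
        for name in EXPECTED_FEATURES:
            if name not in row:
                raise KeyError(f"row {index} is missing feature {name!r}")
            if not isinstance(row[name], str):
                raise TypeError(f"row {index}: {name!r} is not a string")
        count += 1
    if expected_count is not None and count != expected_count:
        raise ValueError(f"expected {expected_count} rows, found {count}")
    return count


# Hypothetical placeholder rows, invented for illustration only.
sample_rows = [
    {"question": "placeholder question 1", "answer": "placeholder answer 1"},
    {"question": "placeholder question 2", "answer": "placeholder answer 2"},
]
checked = check_rows(sample_rows)
```

With the real test split in hand, passing `expected_count=EXPECTED_TEST_ROWS` would also confirm the row count stated above.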

Configuration

  • Config Name: default
    • Data Files:
      • Split: test
      • Path: data/test-*
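
The configuration above maps the glob pattern `data/test-*` to the test split. As a rough sketch, the snippet below shows how that pattern selects files, followed by one way to load the split with the Hugging Face `datasets` library. The helper names `split_for_path` and `load_bamboogle_test` and the example file names are assumptions for illustration; the matching uses Python's `fnmatch` and only approximates how the Hub resolves data files. Loading requires the `datasets` package and network access.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Split-to-pattern mapping copied from the configuration above.
DATA_FILES = {"test": "data/test-*"}


def split_for_path(path: str) -> Optional[str]:
    """Return the split whose glob pattern matches `path`, or None."""
    for split, pattern in DATA_FILES.items():
        if fnmatchcase(path, pattern):
            return split
    return None


def load_bamboogle_test():
    """Load the test split from the Hugging Face Hub.

    The import is kept local so the rest of this sketch runs even
    when the `datasets` package is not installed.
    """
    from datasets import load_dataset

    return load_dataset("chiayewken/bamboogle", name="default", split="test")


# Hypothetical file names, used only to exercise the pattern.
matched = split_for_path("data/test-00000-of-00001.parquet")
unmatched = split_for_path("data/train-00000-of-00001.parquet")
```

Calling `load_bamboogle_test()` should return a dataset whose rows expose the `question` and `answer` fields described earlier.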