Dataset asset, Open Source Community, Natural Language Processing, Language Models
chiayewken/bamboogle
The Bamboogle dataset contains data for studying the compositionality gap in language models. It has two string features, Question and Answer, and a single test split of 125 examples totalling 10,747 bytes. The dataset is associated with the paper "Measuring and Narrowing the Compositionality Gap in Language Models" and is released under the MIT License.
Source: hugging_face
Created: Nov 28, 2025
Updated: Oct 27, 2023
Signals: 344 views
Availability: Linked source ready
Dataset Overview
Dataset Information
- Features:
  - Question: Data type is string.
  - Answer: Data type is string.
- Splits:
  - test: Contains 125 samples, total size 10,747 bytes.
- Download Size: 8,383 bytes.
- Dataset Size: 10,747 bytes.
Configuration
- Config Name: default
- Data Files:
  - Split: test
  - Path: data/test-*
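The configuration above maps to a single call in the Hugging Face `datasets` library. The following is a minimal sketch, not an official loader: it assumes the `datasets` package is installed, that network access is available, and that the column names are spelled `Question` and `Answer` exactly as listed under Features. The helper names `load_bamboogle` and `is_valid_row` are hypothetical and exist only for illustration.

```python
from typing import Any, Mapping

DATASET_ID = "chiayewken/bamboogle"  # repository name as shown on this page
SPLIT = "test"                       # the only split listed
EXPECTED_ROWS = 125                  # sample count listed for the test split
# Assumption: column names use the capitalization shown under Features.
EXPECTED_COLUMNS = ("Question", "Answer")


def is_valid_row(row: Mapping[str, Any]) -> bool:
    """Return True if the row has both expected columns and both hold strings."""
    return all(isinstance(row.get(col), str) for col in EXPECTED_COLUMNS)


def load_bamboogle():
    """Download the test split from the Hugging Face Hub (requires network)."""
    # Imported lazily so the helpers above work without the third-party package.
    from datasets import load_dataset

    return load_dataset(DATASET_ID, split=SPLIT)
```

A typical check after loading would be to call `ds = load_bamboogle()`, compare `len(ds)` with `EXPECTED_ROWS`, and run `is_valid_row` over every row before using the data for evaluation.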