lukaemon/bbh
The BIG-Bench Hard dataset comprises 27 sub-tasks, each exposed as a configuration named after the task, such as boolean_expressions, causal_judgement, and date_understanding. Every configuration has two string features, input and target, and a single test set. Most test sets contain 250 examples; causal_judgement (187), penguins_in_a_table (146), and snarks (178) are smaller. The dataset is primarily used to evaluate natural language processing models on complex tasks that challenge them.
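As a minimal sketch of how one configuration might be loaded and scored, the snippet below assumes the Hugging Face `datasets` library is installed and that `lukaemon/bbh` is loadable from the Hub in your environment. The helper names `load_bbh_config` and `exact_match_accuracy`, the exact-match metric, and the toy records are illustrative assumptions, not part of the dataset.

```python
from typing import Callable, Iterable, Mapping


def load_bbh_config(name: str):
    """Load the test set of one configuration, e.g. "boolean_expressions".

    Assumption: the `datasets` library is installed and the Hub is reachable.
    """
    from datasets import load_dataset  # imported lazily: optional dependency

    return load_dataset("lukaemon/bbh", name, split="test")


def exact_match_accuracy(
    examples: Iterable[Mapping[str, str]],
    predict: Callable[[str], str],
) -> float:
    """Fraction of examples whose prediction equals `target`, ignoring outer whitespace."""
    total = 0
    correct = 0
    for example in examples:
        total += 1
        if predict(example["input"]).strip() == example["target"].strip():
            correct += 1
    return correct / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical records with the same two fields, so the sketch runs offline.
    toy = [
        {"input": "not ( True ) and ( True ) is", "target": "False"},
        {"input": "True or False is", "target": "True"},
    ]
    print(exact_match_accuracy(toy, lambda text: "True"))  # prints 0.5
```

To score a real configuration, pass the result of `load_bbh_config("boolean_expressions")` as `examples` and your model's inference function as `predict`. Exact match is only one possible metric; tasks with free-form targets may need a more lenient comparison.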
Description
BIG-Bench Hard Dataset Overview
Dataset List
Each configuration has two features, input (string) and target (string), and a single test set.

| # | Configuration | Test set bytes | Test set examples |
| --- | --- | --- | --- |
| 1 | boolean_expressions | 11790 | 250 |
| 2 | causal_judgement | 198021 | 187 |
| 3 | date_understanding | 54666 | 250 |
| 4 | disambiguation_qa | 78620 | 250 |
| 5 | dyck_languages | 38432 | 250 |
| 6 | formal_fallacies | 138224 | 250 |
| 7 | geometric_shapes | 68560 | 250 |
| 8 | hyperbaton | 38574 | 250 |
| 9 | logical_deduction_five_objects | 148595 | 250 |
| 10 | logical_deduction_seven_objects | 191022 | 250 |
| 11 | logical_deduction_three_objects | 105831 | 250 |
| 12 | movie_recommendation | 50985 | 250 |
| 13 | multistep_arithmetic_two | 12943 | 250 |
| 14 | navigate | 49031 | 250 |
| 15 | object_counting | 30508 | 250 |
| 16 | penguins_in_a_table | 70062 | 146 |
| 17 | reasoning_about_colored_objects | 89579 | 250 |
| 18 | ruin_names | 46537 | 250 |
| 19 | salient_translation_error_detection | 277110 | 250 |
| 20 | snarks | 38223 | 178 |
| 21 | sports_understanding | 22723 | 250 |
| 22 | temporal_sequences | 139546 | 250 |
| 23 | tracking_shuffled_objects_five_objects | 162590 | 250 |
| 24 | tracking_shuffled_objects_seven_objects | 207274 | 250 |
| 25 | tracking_shuffled_objects_three_objects | 122104 | 250 |
| 26 | web_of_lies | 47582 | 250 |
| 27 | word_sorting | 60918 | 250 |
Source
Organization: hugging_face
Created: Unknown