wenhu/tab_fact

TabFact is a large‑scale dataset for fact verification over semi‑structured evidence. It pairs 16k Wikipedia tables, used as evidence, with 118k manually annotated statements, each labeled ENTAILED or REFUTED. The dataset is challenging because verifying a statement requires both soft linguistic reasoning and hard symbolic reasoning.

Updated 1/18/2024
hugging_face

Description

Dataset Overview

  • Name: TabFact
  • Language: English (en)
  • License: CC‑BY‑4.0
  • Multilinguality: Monolingual
  • Size: 100K < size < 1M
  • Source: Original data
  • Task Category: Text Classification
  • Task ID: Fact‑checking
  • Paper/Code ID: tabfact
  • Pretty Name: TabFact

Structure

Config: tab_fact

  • Features:
    • id: int32
    • table_id: string
    • table_text: string
    • table_caption: string
    • statement: string
    • label:
      • class_label:
        • names:
          • 0: refuted
          • 1: entailed
  • Splits:
    • train: num_bytes 99,852,664; num_examples 92,283
    • validation: num_bytes 13,846,872; num_examples 12,792
    • test: num_bytes 13,493,391; num_examples 12,779
    • download_size: 196,508,436
    • dataset_size: 127,192,927
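
The feature list above can be made concrete with a minimal sketch of handling one record in Python. It decodes the integer label with the class names listed above and splits table_text into rows and cells. The serialization of table_text is an assumption here (one row per line, cells separated by "#"), and the example record is hypothetical, invented for illustration rather than taken from the dataset.

```python
from typing import Dict, List

# Label ids as listed in the feature schema above.
LABEL_NAMES = {0: "refuted", 1: "entailed"}


def parse_table_text(table_text: str, cell_sep: str = "#") -> List[List[str]]:
    """Split a serialized table into rows of cells.

    Assumption: one row per line, cells separated by `cell_sep`.
    Adjust `cell_sep` if the actual serialization differs.
    """
    return [line.split(cell_sep) for line in table_text.splitlines() if line.strip()]


def describe(record: Dict) -> str:
    """Render a record as 'statement -> label name'."""
    return f"{record['statement']} -> {LABEL_NAMES[record['label']]}"


# Hypothetical record, invented for illustration; not taken from the dataset.
example = {
    "id": 0,
    "table_id": "example.csv",
    "table_text": "player#team#goals\nalice#red#3\nbob#blue#1\n",
    "table_caption": "example scoring table",
    "statement": "alice scored more goals than bob",
    "label": 1,
}

rows = parse_table_text(example["table_text"])
header, body = rows[0], rows[1:]
summary = describe(example)

print(header)   # ['player', 'team', 'goals']
print(body)     # [['alice', 'red', '3'], ['bob', 'blue', '1']]
print(summary)  # alice scored more goals than bob -> entailed
```

Keeping the cell separator as a parameter means the parser can be adjusted without touching the rest of the code if the real table_text turns out to use a different delimiter.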

Config: blind_test

  • Features:
    • id: int32
    • table_id: string
    • table_text: string
    • table_caption: string
    • statement: string
    • test_id: string
  • Splits:
    • test: num_bytes 10,954,442; num_examples 9,750
    • download_size: 196,508,436
    • dataset_size: 10,954,442
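
As a quick consistency check on the two listings above, the following sketch compares the feature sets of the two configs and sums the tab_fact split statistics. All numbers and feature names are copied from the listings; the variable names are illustrative only.

```python
# Feature names per config, copied from the listings above.
FEATURES = {
    "tab_fact": ["id", "table_id", "table_text", "table_caption", "statement", "label"],
    "blind_test": ["id", "table_id", "table_text", "table_caption", "statement", "test_id"],
}

# Split statistics for the tab_fact config: (num_bytes, num_examples).
TAB_FACT_SPLITS = {
    "train": (99_852_664, 92_283),
    "validation": (13_846_872, 12_792),
    "test": (13_493_391, 12_779),
}

# Features that appear in one config but not the other.
only_in_tab_fact = sorted(set(FEATURES["tab_fact"]) - set(FEATURES["blind_test"]))
only_in_blind_test = sorted(set(FEATURES["blind_test"]) - set(FEATURES["tab_fact"]))

# Totals across the three tab_fact splits.
total_bytes = sum(num_bytes for num_bytes, _ in TAB_FACT_SPLITS.values())
total_examples = sum(num_examples for _, num_examples in TAB_FACT_SPLITS.values())
train_share = TAB_FACT_SPLITS["train"][1] / total_examples

print(only_in_tab_fact)    # ['label']
print(only_in_blind_test)  # ['test_id']
print(total_bytes)         # 127192927, equal to the listed dataset_size
print(total_examples)      # 117854, the "118k statements" in the summary
print(f"train share: {train_share:.1%}")
```

The blind_test config carries no label field, so its records cannot be scored locally; the test_id field is what distinguishes them from the labeled test split.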

Creation

  • Annotation Creators: Crowdsourced
  • Language Creators: Crowdsourced

Topics

Fact Verification
Natural Language Processing

Source

Organization: hugging_face

Created: Unknown
