
mandarjoshi/trivia_qa

TriviaQA is an English reading‑comprehension dataset containing over 650,000 question‑answer‑evidence triples. It includes 95,000 question‑answer pairs authored by trivia enthusiasts, paired with independently gathered evidence documents (six per question on average) that provide high‑quality distant supervision for answering the questions. The dataset is suitable for question‑answering and text‑generation tasks.
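
As a quick way to inspect a few records, the following is a minimal sketch rather than an official recipe. It assumes the Hugging Face `datasets` library is installed, that the repository id shown in the title resolves on the Hub, and that the "rc" configuration described under Dataset Structure supports streaming. The helper name `peek` is hypothetical.

```python
from itertools import islice

DATASET_ID = "mandarjoshi/trivia_qa"  # repository id taken from the title of this page
CONFIG = "rc"                         # configuration detailed under "Dataset Structure"


def peek(split="validation", n=3):
    """Return the first n examples of a split (hypothetical helper).

    Assumes streaming works for this dataset, so the full multi-gigabyte
    download is not needed just to look at a handful of records.
    """
    # Imported lazily so this sketch can be loaded without `datasets` installed.
    from datasets import load_dataset

    stream = load_dataset(DATASET_ID, CONFIG, split=split, streaming=True)
    return list(islice(stream, n))
```

Calling `peek()` needs network access; each returned example is expected to be a dictionary whose keys include the features listed for the configuration, such as `question` and `question_id`.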

Updated 1/5/2024
hugging_face

Description

Dataset Overview

Basic Information

  • Name: TriviaQA
  • Language: English
  • Multilinguality: Monolingual
  • License: Unknown
  • Annotation creators: Crowdsourcing
  • Language creators: Machine‑generated
  • Size categories:
    • 10K < n < 100K
    • 100K < n < 1M

Dataset Structure

Configuration Details

Configuration: rc

  • Features:
    • question: string
    • question_id: string
    • question_source: string
    • entity_pages: sequence
      • document_source: string
      • filename: string
      • title: string
      • wiki_context: string
    • search_results: sequence ... (structure omitted for brevity)
  • Splits:
    • train: 138,384 examples, 12,749,651,131 bytes
    • validation: 17,944 examples, 1,662,321,188 bytes
    • test: 17,210 examples, 1,577,710,503 bytes
  • Download size: 8,998,808,983 bytes
  • Dataset size: 15,989,682,822 bytes
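
The split figures above can be cross-checked with a little arithmetic. The sketch below only reuses the numbers listed for the "rc" configuration; the helper names `summarize` and `to_gib` are illustrative and not part of any library. It sums the per-split counts and sizes, compares the byte total with the listed dataset size, and prints each split's share of the examples.

```python
# Split statistics for the "rc" configuration, copied from the listing above:
# split name -> (number of examples, size in bytes)
SPLITS = {
    "train": (138_384, 12_749_651_131),
    "validation": (17_944, 1_662_321_188),
    "test": (17_210, 1_577_710_503),
}
DOWNLOAD_BYTES = 8_998_808_983
DATASET_BYTES = 15_989_682_822


def summarize(splits):
    """Return (total examples, total bytes) across all splits."""
    total_examples = sum(n for n, _ in splits.values())
    total_bytes = sum(b for _, b in splits.values())
    return total_examples, total_bytes


def to_gib(num_bytes):
    """Convert a byte count to gibibytes (2**30 bytes)."""
    return num_bytes / 2**30


total_examples, total_bytes = summarize(SPLITS)

for name, (n, b) in SPLITS.items():
    share = n / total_examples
    print(f"{name:<10} {n:>8,} examples  {share:6.1%}  {to_gib(b):6.2f} GiB")

print(f"total      {total_examples:>8,} examples")
print(f"split bytes match listed dataset size: {total_bytes == DATASET_BYTES}")
print(f"download is {DOWNLOAD_BYTES / DATASET_BYTES:.0%} of the prepared size")
```

The three splits add up to 173,538 examples, and their byte sizes sum exactly to the listed dataset size, so the listing is internally consistent. The training split holds roughly 80% of the examples.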

... (additional configurations omitted for brevity)

Task Types

  • Question Answering
  • Text‑to‑Text Generation

Task IDs

  • Open‑domain QA
  • Open‑domain Abstractive QA
  • Extractive QA
  • Abstractive QA

Dataset Information

  • Papers with Code ID: triviaqa
  • Pretty name: TriviaQA


Topics

Reading Comprehension
Natural Language Processing

Source

Organization: hugging_face

Created: Unknown
