qmsum

This dataset is for the QMSum task and has two features: text (string content) and answer_length. It has a training split of 1,257 samples and a test split of 200 samples. The test split comes from the LongBench QMSum task, and the training split comes from the original QMSum repository. There is no built-in validation split; holding out part of the training split for validation is recommended.

Updated 9/25/2024
huggingface

Description

QMSum Dataset Overview

Dataset Information

Features

  • text: data type string
  • answer_length: data type int64

Data Splits

  • train: contains 1,257 samples, occupying 66,437,471 bytes
  • test: contains 200 samples, occupying 11,622,102 bytes

Dataset Size

  • Download Size: 32,972,862 bytes
  • Total Dataset Size: 78,059,573 bytes
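
The figures above can be cross-checked with a few lines of arithmetic. This is a minimal sketch that reuses only the numbers listed on this page; the variable names are illustrative, and the page does not say how the downloaded files are stored, so the ratio is reported without interpretation.

```python
# Split sizes and sample counts as listed on this page.
SPLITS = {
    "train": {"samples": 1_257, "bytes": 66_437_471},
    "test": {"samples": 200, "bytes": 11_622_102},
}
DOWNLOAD_BYTES = 32_972_862
TOTAL_BYTES = 78_059_573

# The per-split sizes should add up to the reported total.
split_total = sum(s["bytes"] for s in SPLITS.values())

# Average size of one sample in each split, in bytes.
avg_bytes = {name: s["bytes"] / s["samples"] for name, s in SPLITS.items()}

# Fraction of the total dataset size that has to be downloaded.
download_ratio = DOWNLOAD_BYTES / TOTAL_BYTES

if __name__ == "__main__":
    print(f"sum of splits matches total: {split_total == TOTAL_BYTES}")
    for name, value in avg_bytes.items():
        print(f"{name}: about {value / 1024:.1f} KiB per sample")
    print(f"download / total: {download_ratio:.2f}")
```

The two split sizes sum exactly to the reported total, and each sample averages tens of kilobytes, so every text entry is a long document rather than a short snippet.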

Configuration

  • config_name: default
    • data_files:
      • train: path data/train-*
      • test: path data/test-*
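
To illustrate how glob patterns such as data/train-* assign files to splits, here is a small sketch using Python's standard fnmatch module. It shows the pattern semantics only and is not the loading library's actual resolution logic; the shard file names are hypothetical, since this page does not list the real files.

```python
from fnmatch import fnmatch

# Split name -> glob pattern, copied from the configuration above.
DATA_FILES = {"train": "data/train-*", "test": "data/test-*"}


def assign_splits(paths):
    """Group file paths by the first split whose pattern they match."""
    grouped = {split: [] for split in DATA_FILES}
    for path in paths:
        for split, pattern in DATA_FILES.items():
            if fnmatch(path, pattern):
                grouped[split].append(path)
                break  # a file belongs to at most one split
    return grouped


# Hypothetical shard names, for illustration only.
example_paths = [
    "data/train-00000-of-00001.parquet",
    "data/test-00000-of-00001.parquet",
    "README.md",
]
grouped = assign_splits(example_paths)

if __name__ == "__main__":
    for split, files in grouped.items():
        print(split, files)
```

Files that match neither pattern, such as the README in this example, are left out of both splits.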

Additional Information

  • test set: taken from the LongBench QMSum task
  • training set: taken from the original QMSum repository
  • validation set: none provided; holding out part of the training data for validation is recommended
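
One way to follow the validation recommendation is a seeded random hold-out. The sketch below uses only the Python standard library and placeholder records; the function name, the 10% fraction, and the seed are assumptions chosen for illustration, not values prescribed by the dataset.

```python
import random


def split_train_validation(records, validation_fraction=0.1, seed=0):
    """Hold out a reproducible random subset of records for validation."""
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be between 0 and 1")
    if len(records) < 2:
        raise ValueError("need at least two records to split")

    # Shuffle indices with a private, seeded generator so the split is
    # reproducible and does not disturb the global random state.
    indices = list(range(len(records)))
    random.Random(seed).shuffle(indices)

    n_validation = max(1, int(len(records) * validation_fraction))
    validation_indices = set(indices[:n_validation])

    # Preserve the original record order inside each subset.
    train = [r for i, r in enumerate(records) if i not in validation_indices]
    validation = [r for i, r in enumerate(records) if i in validation_indices]
    return train, validation


# Placeholder records with the two fields listed under Features.
records = [{"text": f"sample {i}", "answer_length": 0} for i in range(1257)]
train, validation = split_train_validation(records)

if __name__ == "__main__":
    print(len(train), len(validation))
```

With the default 10% fraction, the 1,257 training samples split into 1,132 for training and 125 for validation. Fixing the seed keeps the hold-out identical across runs, which makes validation scores comparable between experiments.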

Topics

Text Summarization
Question Answering Systems

Source

Organization: huggingface

Created: 9/25/2024
