DATASET
Open Source Community
qmsum
This dataset targets the QMSum task (query-based meeting summarization) and has two features: text (string) and answer_length (int64). It is split into a training set of 1,257 samples and a test set of 200 samples. The test set comes from the LongBench QMSum task, and the training set comes from the original QMSum repository. No validation set is provided, so it is recommended to hold out part of the training set for validation.
Updated 9/25/2024
huggingface
Description
QMSum Dataset Overview
Dataset Information
Features
- text: data type string
- answer_length: data type int64
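The two features listed above can be checked per record before training. The following is a minimal sketch, assuming each sample is available as a plain Python mapping; the helper name validate_record and the in-memory record format are illustrative assumptions, not part of the dataset card.

```python
from typing import Any, List, Mapping

# Expected schema, taken from the feature list: text is a string,
# answer_length is an integer (int64 in the stored data).
EXPECTED_FEATURES = {"text": str, "answer_length": int}


def validate_record(record: Mapping[str, Any]) -> List[str]:
    """Return a list of schema problems; an empty list means the record is valid.

    validate_record is a hypothetical helper for illustration only.
    """
    problems = []
    for name, expected_type in EXPECTED_FEATURES.items():
        if name not in record:
            problems.append(f"missing feature: {name}")
            continue
        value = record[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            problems.append(
                f"{name}: expected {expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
    return problems


if __name__ == "__main__":
    print(validate_record({"text": "meeting transcript ...", "answer_length": 64}))
    print(validate_record({"text": "meeting transcript ..."}))
```

Returning a list of problems rather than raising on the first one makes it easier to report every issue in a malformed record at once.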
Data Splits
- train: contains 1,257 samples, occupying 66,437,471 bytes
- test: contains 200 samples, occupying 11,622,102 bytes
Dataset Size
- Download Size: 32,972,862 bytes
- Total Dataset Size: 78,059,573 bytes
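The size figures can be cross-checked with simple arithmetic. The sketch below uses only the byte counts stated in this card. Reading the smaller download size as the result of compression is an assumption, since the card does not say how the files are stored.

```python
# Byte counts copied from the card.
TRAIN_BYTES = 66_437_471
TEST_BYTES = 11_622_102
TOTAL_BYTES = 78_059_573
DOWNLOAD_BYTES = 32_972_862
TRAIN_SAMPLES = 1_257
TEST_SAMPLES = 200

# The two splits should add up to the reported total dataset size.
split_sum = TRAIN_BYTES + TEST_BYTES

# Ratio of download size to total size (assumed to reflect compression).
download_ratio = DOWNLOAD_BYTES / TOTAL_BYTES

# Average stored size per sample, useful for estimating memory needs.
avg_train_bytes = TRAIN_BYTES / TRAIN_SAMPLES
avg_test_bytes = TEST_BYTES / TEST_SAMPLES

if __name__ == "__main__":
    print("splits sum to total:", split_sum == TOTAL_BYTES)
    print(f"download / total: {download_ratio:.3f}")
    print(f"average bytes per train sample: {avg_train_bytes:,.0f}")
    print(f"average bytes per test sample: {avg_test_bytes:,.0f}")
```

The per-sample averages are in the tens of kilobytes, which is consistent with the long-document setting that LongBench targets.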
Configuration
- config_name: default
- data_files:
  - train: path data/train-*
  - test: path data/test-*
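The data_files entries are glob patterns that map file paths to splits. The sketch below shows how such patterns could be resolved with the standard library. The shard file names in the usage example are hypothetical, since the card only gives the patterns, and assign_splits is an illustrative helper rather than anything the hosting platform provides.

```python
from fnmatch import fnmatchcase
from typing import Dict, Iterable, List

# Patterns copied from the card's default configuration.
DATA_FILES = {"train": "data/train-*", "test": "data/test-*"}


def assign_splits(paths: Iterable[str]) -> Dict[str, List[str]]:
    """Group file paths by the split whose glob pattern they match.

    Paths that match no pattern are ignored.
    """
    assignment: Dict[str, List[str]] = {split: [] for split in DATA_FILES}
    for path in paths:
        for split, pattern in DATA_FILES.items():
            # fnmatchcase does case-sensitive matching on every platform.
            if fnmatchcase(path, pattern):
                assignment[split].append(path)
    return assignment


if __name__ == "__main__":
    # Hypothetical shard names, for illustration only.
    files = [
        "data/train-00000-of-00001.parquet",
        "data/test-00000-of-00001.parquet",
        "README.md",
    ]
    print(assign_splits(files))
```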
Additional Information
- test dataset: from LongBench QMSum task
- train dataset: from the original QMSum repository
- validation set: none; it is recommended to carve out a portion of the training data for validation
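Since no validation set is provided, one way to follow the recommendation above is a seeded random hold-out from the training split. This is a minimal sketch: the 10% fraction, the seed, and the helper name split_train_validation are assumptions chosen for illustration, and the card does not prescribe any particular split.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_train_validation(
    samples: Sequence[T],
    val_fraction: float = 0.1,  # assumed fraction, not specified by the card
    seed: int = 0,              # fixed seed so the split is reproducible
) -> Tuple[List[T], List[T]]:
    """Return (train, validation) lists with no overlapping samples."""
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be strictly between 0 and 1")
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    # Hold out at least one sample when the input is non-empty.
    n_val = max(1, round(len(samples) * val_fraction))
    val_indices = set(indices[:n_val])
    train = [s for i, s in enumerate(samples) if i not in val_indices]
    validation = [s for i, s in enumerate(samples) if i in val_indices]
    return train, validation


if __name__ == "__main__":
    # Stand-in records with the same count as the training split (1,257).
    records = [{"text": f"sample {i}", "answer_length": 0} for i in range(1257)]
    train, validation = split_train_validation(records)
    print(len(train), len(validation))
```

Keeping the test split untouched and holding out only from the training data preserves comparability with LongBench results.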
Topics
Text Summarization
Question Answering Systems
Source
Organization: huggingface
Created: 9/25/2024