irds/msmarco-document-v2_trec-dl-2019
The msmarco-document-v2/trec-dl-2019 dataset, provided by the ir-datasets package, focuses on text retrieval tasks. It contains 200 queries and 13,940 relevance judgments (qrels) for evaluating document retrieval systems. Example usage includes loading and processing the data with HuggingFace's datasets library in Python.
Description
Dataset Overview
Dataset Name
msmarco-document-v2/trec-dl-2019
Data Source
- Source dataset: irds/msmarco-document-v2
Task Category
- Text Retrieval
Data Content
- queries (query topics): count = 200
- qrels (relevance judgments): count = 13,940
- docs: use the irds/msmarco-document-v2 dataset
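Because the qrels are flat records that point at queries and documents by ID, it is often convenient to group them per query before use. The following is a minimal sketch, assuming records shaped like the ones shown under Usage; the helper name `group_qrels` and the sample IDs are hypothetical and not part of the dataset.

```python
from collections import defaultdict


def group_qrels(qrel_records):
    """Group flat qrel records into {query_id: {doc_id: relevance}}."""
    grouped = defaultdict(dict)
    for rec in qrel_records:
        grouped[rec["query_id"]][rec["doc_id"]] = int(rec["relevance"])
    return dict(grouped)


# Hypothetical records, made up for illustration only (not real dataset IDs).
sample_qrels = [
    {"query_id": "q1", "doc_id": "d1", "relevance": 2, "iteration": "0"},
    {"query_id": "q1", "doc_id": "d2", "relevance": 0, "iteration": "0"},
    {"query_id": "q2", "doc_id": "d3", "relevance": 1, "iteration": "0"},
]

qrels_by_query = group_qrels(sample_qrels)
print(qrels_by_query)
```

The nested dictionary makes it cheap to look up the judgment for any (query, document) pair; document text itself would still come from irds/msmarco-document-v2.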
Usage
from datasets import load_dataset

# Query topics
queries = load_dataset('irds/msmarco-document-v2_trec-dl-2019', 'queries')
for record in queries:
    record # {'query_id': ..., 'text': ...}

# Relevance judgments
qrels = load_dataset('irds/msmarco-document-v2_trec-dl-2019', 'qrels')
for record in qrels:
    record # {'query_id': ..., 'doc_id': ..., 'relevance': ..., 'iteration': ...}
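Since the qrels exist to evaluate retrieval systems, one way to use them is to score a ranked list for a single query. The sketch below computes nDCG@k with linear gains; it is an illustration rather than an official evaluation script, and the function names, document IDs, and judgments are hypothetical. Unjudged documents are assumed to have relevance 0.

```python
import math


def dcg(gains):
    """Discounted cumulative gain with a log2 rank discount."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains))


def ndcg_at_k(ranked_doc_ids, judged, k=10):
    """nDCG@k for one query.

    ranked_doc_ids: doc IDs in the order the system returned them.
    judged: {doc_id: relevance} for this query; unjudged docs count as 0.
    """
    gains = [judged.get(doc_id, 0) for doc_id in ranked_doc_ids[:k]]
    ideal_gains = sorted(judged.values(), reverse=True)[:k]
    ideal = dcg(ideal_gains)
    return dcg(gains) / ideal if ideal > 0 else 0.0


# Hypothetical judgments and rankings, made up for illustration only.
judged = {"d1": 2, "d2": 0, "d3": 1}
perfect = ndcg_at_k(["d1", "d3", "d2"], judged)
reversed_run = ndcg_at_k(["d2", "d3", "d1"], judged)
print(perfect, reversed_run)
```

For reported results, an established tool such as trec_eval is usually preferable, because details like gain mapping and tie handling vary between implementations.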
Citation Information
@inproceedings{Craswell2019TrecDl,
  title={Overview of the TREC 2019 deep learning track},
  author={Nick Craswell and Bhaskar Mitra and Emine Yilmaz and Daniel Campos and Ellen Voorhees},
  booktitle={TREC 2019},
  year={2019}
}
@inproceedings{Bajaj2016Msmarco,
  title={MS MARCO: A Human Generated MAchine Reading COmprehension Dataset},
  author={Payal Bajaj and Daniel Campos and Nick Craswell and Li Deng and Jianfeng Gao and Xiaodong Liu and Rangan Majumder and Andrew McNamara and Bhaskar Mitra and Tri Nguyen and Mir Rosenberg and Xia Song and Alina Stoica and Saurabh Tiwary and Tong Wang},
  booktitle={CoCo@NIPS},
  year={2016}
}
Source
Organization: hugging_face
Created: Unknown