Dataset asset | Open Source Community | Deep Learning | Information Retrieval

irds/msmarco-document-v2_trec-dl-2019

The msmarco-document-v2/trec-dl-2019 dataset, provided by the ir-datasets package, targets text retrieval. It contains 200 queries and 13,940 relevance judgments (qrels) for evaluating document retrieval systems; the documents themselves come from the irds/msmarco-document-v2 dataset. The queries and qrels can be loaded in Python with Hugging Face's datasets library.

Source
hugging_face
Created
Nov 28, 2025
Updated
Jan 5, 2023
Dataset Overview

Dataset Name

msmarco-document-v2/trec-dl-2019

Data Source

  • Source dataset: irds/msmarco-document-v2

Task Category

  • Text Retrieval

Data Content

  • queries (query topics): count = 200
  • qrels (relevance judgments): count = 13,940
  • docs: use the irds/msmarco-document-v2 dataset

Usage

from datasets import load_dataset

# Query topics: one record per query.
queries = load_dataset('irds/msmarco-document-v2_trec-dl-2019', 'queries')
for record in queries:
    record # {'query_id': ..., 'text': ...}

# Relevance judgments: one record per (query, document) pair.
qrels = load_dataset('irds/msmarco-document-v2_trec-dl-2019', 'qrels')
for record in qrels:
    record # {'query_id': ..., 'doc_id': ..., 'relevance': ..., 'iteration': ...}
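Once the qrels are loaded, a common next step is to index them by query so a ranked list can be scored. The following is a minimal sketch, not part of the dataset card: it assumes each record is a dict with the `query_id`, `doc_id`, and `relevance` fields shown above, and it uses a few made-up in-memory records instead of the real download. The helper names `build_qrels_index` and `precision_at_k`, the sample IDs, and the `min_rel` threshold are all hypothetical choices for illustration.

```python
from collections import defaultdict


def build_qrels_index(records):
    """Group flat qrel records into {query_id: {doc_id: relevance}}."""
    index = defaultdict(dict)
    for rec in records:
        index[rec["query_id"]][rec["doc_id"]] = int(rec["relevance"])
    return dict(index)


def precision_at_k(ranked_doc_ids, judged, k=10, min_rel=1):
    """Fraction of the top-k slots filled by documents judged >= min_rel.

    Documents without a judgment are treated as non-relevant.
    min_rel is an assumed threshold; choose it to match your evaluation setup.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in ranked_doc_ids[:k] if judged.get(doc_id, 0) >= min_rel)
    return hits / k


# Made-up records with the same field names as the qrels subset (not real data).
sample_qrels = [
    {"query_id": "q1", "doc_id": "d1", "relevance": 3, "iteration": "0"},
    {"query_id": "q1", "doc_id": "d2", "relevance": 0, "iteration": "0"},
    {"query_id": "q1", "doc_id": "d3", "relevance": 1, "iteration": "0"},
    {"query_id": "q2", "doc_id": "d1", "relevance": 2, "iteration": "0"},
]

qrels_index = build_qrels_index(sample_qrels)

# Hypothetical system output for query q1, best-ranked document first.
run_q1 = ["d3", "d2", "d9", "d1"]
p_at_4 = precision_at_k(run_q1, qrels_index["q1"], k=4)
print(f"P@4 for q1: {p_at_4:.2f}")
```

The same index can be built from the real data by passing the iterable of qrel records to `build_qrels_index`. For reported results, an established evaluation tool is a safer choice than a hand-rolled metric.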

Citation Information

@inproceedings{Craswell2019TrecDl,
  title={Overview of the TREC 2019 deep learning track},
  author={Nick Craswell and Bhaskar Mitra and Emine Yilmaz and Daniel Campos and Ellen Voorhees},
  booktitle={TREC 2019},
  year={2019}
}

@inproceedings{Bajaj2016Msmarco,
  title={MS MARCO: A Human Generated MAchine Reading COmprehension Dataset},
  author={Payal Bajaj and Daniel Campos and Nick Craswell and Li Deng and Jianfeng Gao and Xiaodong Liu and Rangan Majumder and Andrew McNamara and Bhaskar Mitra and Tri Nguyen and Mir Rosenberg and Xia Song and Alina Stoica and Saurabh Tiwary and Tong Wang},
  booktitle={InCoCo@NIPS},
  year={2016}
}
