irds/msmarco-document-v2_trec-dl-2019

The msmarco-document-v2/trec-dl-2019 dataset, provided by the ir-datasets package, supports text retrieval evaluation. It contains 200 queries and 13,940 relevance judgments (qrels) for evaluating document retrieval systems. The queries and qrels can be loaded in Python with Hugging Face's datasets library.

Updated 1/5/2023
hugging_face

Description

Dataset Overview

Dataset Name

msmarco-document-v2/trec-dl-2019

Data Source

  • Source dataset: irds/msmarco-document-v2

Task Category

  • Text Retrieval

Data Content

  • queries (query topics): count = 200
  • qrels (relevance judgments): count = 13,940
  • docs: use the irds/msmarco-document-v2 dataset
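Because the documents live in the separate irds/msmarco-document-v2 dataset, the doc_id values in the qrels have to be matched against that corpus. The following is a minimal sketch, assuming document records expose a doc_id field (as the qrels records do) and that a single pass over the corpus is acceptable. The helpers `judged_doc_ids` and `select_judged_docs` are hypothetical names used for illustration and are not part of any library.

```python
def judged_doc_ids(qrels_records):
    """Collect the set of doc_id values that appear in the qrels."""
    return {rec["doc_id"] for rec in qrels_records}


def select_judged_docs(doc_records, wanted_ids):
    """Yield only the documents whose doc_id is in wanted_ids (single pass)."""
    for doc in doc_records:
        # Assumption: each document record carries a 'doc_id' key.
        if doc["doc_id"] in wanted_ids:
            yield doc
```

Both functions accept any iterable of dict-like records, so they can be tried on a small in-memory sample before being pointed at the full corpus.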

Usage

from datasets import load_dataset

# Query topics: each record has a query_id and its text.
queries = load_dataset('irds/msmarco-document-v2_trec-dl-2019', 'queries')
for record in queries:
    record # {'query_id': ..., 'text': ...}

# Relevance judgments: each record links a query_id to a doc_id.
qrels = load_dataset('irds/msmarco-document-v2_trec-dl-2019', 'qrels')
for record in qrels:
    record # {'query_id': ..., 'doc_id': ..., 'relevance': ..., 'iteration': ...}
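Qrels records like those above are usually regrouped by query before a ranked run is scored. The following is a minimal sketch, assuming the field names shown in the record comments. The helpers `group_qrels` and `ndcg_at_k` are hypothetical names, not part of the datasets or ir-datasets packages, and the linear-gain nDCG used here is one common variant, not necessarily the formulation used by the official TREC evaluation tools.

```python
import math
from collections import defaultdict


def group_qrels(qrels_records):
    """Map query_id -> {doc_id: relevance} from an iterable of qrels records."""
    grouped = defaultdict(dict)
    for rec in qrels_records:
        grouped[rec["query_id"]][rec["doc_id"]] = int(rec["relevance"])
    return dict(grouped)


def ndcg_at_k(ranked_doc_ids, judgments, k=10):
    """nDCG@k with linear gain; unjudged documents count as relevance 0."""
    def dcg(gains):
        # Rank i (0-based) is discounted by log2(i + 2).
        return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

    gains = [judgments.get(doc_id, 0) for doc_id in ranked_doc_ids[:k]]
    ideal = sorted(judgments.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0
```

For a single query, `ndcg_at_k(ranking, group_qrels(qrels)[query_id])` returns a score between 0 and 1, where `ranking` is a system's ordered list of doc_id values for that query.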

Citation Information

@inproceedings{Craswell2019TrecDl,
  title={Overview of the TREC 2019 deep learning track},
  author={Nick Craswell and Bhaskar Mitra and Emine Yilmaz and Daniel Campos and Ellen Voorhees},
  booktitle={TREC 2019},
  year={2019}
}

@inproceedings{Bajaj2016Msmarco,
  title={MS MARCO: A Human Generated MAchine Reading COmprehension Dataset},
  author={Payal Bajaj and Daniel Campos and Nick Craswell and Li Deng and Jianfeng Gao and Xiaodong Liu and Rangan Majumder and Andrew McNamara and Bhaskar Mitra and Tri Nguyen and Mir Rosenberg and Xia Song and Alina Stoica and Saurabh Tiwary and Tong Wang},
  booktitle={InCoCo@NIPS},
  year={2016}
}

Topics

Information Retrieval
Deep Learning

Source

Organization: hugging_face

Created: Unknown
