DFKI-SLT/conll04

Dataset Overview

Dataset Name: CoNLL04

Purpose: Relation extraction task

Language: English

Size: 1,437 sentences, each containing at least one relation. The per-split sample counts listed under Splits sum to 1,441.

Data Structure

Fields

tokens: Tokens of the sentence, list of strings.
entities: List of entities
- type: Entity type, string.
- start: Start index, integer.
- end: End index, integer.
relations: List of relations
- type: Relation type, string.
- head: Head entity index, integer.
- tail: Tail entity index, integer.
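
The fields above can be combined to recover readable relation triples. The following is a minimal sketch, not the dataset's official tooling. It assumes that tokens is a list of token strings, that start and end are token offsets with an exclusive end, and that head and tail index into the entities list. The record, its sentence, and its labels are hypothetical and only illustrate the documented shape.

```python
# Hypothetical record in the documented shape; the sentence and the label
# names are illustrative and not copied from the dataset.
record = {
    "tokens": ["John", "Smith", "lives", "in", "Boston", "."],
    "entities": [
        {"type": "Peop", "start": 0, "end": 2},
        {"type": "Loc", "start": 4, "end": 5},
    ],
    "relations": [{"type": "Live_In", "head": 0, "tail": 1}],
}


def entity_text(rec, index):
    """Return the surface text of the entity at position `index`."""
    ent = rec["entities"][index]
    # Assumption: `start`/`end` are token offsets and `end` is exclusive.
    return " ".join(rec["tokens"][ent["start"]:ent["end"]])


def relation_triples(rec):
    """Resolve each relation into a (head text, relation type, tail text) tuple."""
    return [
        (entity_text(rec, rel["head"]), rel["type"], entity_text(rec, rel["tail"]))
        for rel in rec["relations"]
    ]


print(relation_triples(record))
# → [('John Smith', 'Live_In', 'Boston')]
```

If the end offsets in the real data turn out to be inclusive, the slice in entity_text would need end + 1 instead.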

Splits

Training (train): 922 samples, 358 752 bytes.
Validation (validation): 231 samples, 94 688 bytes.
Test (test): 288 samples, 114 248 bytes.

Configuration

Default:
- Train path: data/train-*
- Validation path: data/validation-*
- Test path: data/test-*
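
The default configuration can be mirrored when loading the data with the Hugging Face datasets library. This is a sketch under assumptions: the repository ID is taken from the page title, the path patterns are passed through the data_files argument of load_dataset, and the helper names (load_conll04, split_sizes, size_mismatches) are hypothetical. The expected sizes are the sample counts listed under Splits.

```python
REPO_ID = "DFKI-SLT/conll04"  # assumed from the page title

# Path patterns from the default configuration.
DATA_FILES = {
    "train": "data/train-*",
    "validation": "data/validation-*",
    "test": "data/test-*",
}

# Sample counts as listed under Splits.
EXPECTED_SIZES = {"train": 922, "validation": 231, "test": 288}


def load_conll04():
    """Load all three splits; needs network access and `pip install datasets`."""
    from datasets import load_dataset  # imported lazily so the helpers below work offline

    return load_dataset(REPO_ID, data_files=DATA_FILES)


def split_sizes(splits):
    """Map each split name to its number of samples."""
    return {name: len(split) for name, split in splits.items()}


def size_mismatches(actual):
    """Return {split: (actual, expected)} for every split whose size differs."""
    return {
        name: (actual.get(name), expected)
        for name, expected in EXPECTED_SIZES.items()
        if actual.get(name) != expected
    }
```

A typical check would be size_mismatches(split_sizes(load_conll04())), which returns an empty dict when the downloaded splits match the counts documented here.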

Citation

BibTeX:

@inproceedings{roth-yih-2004-linear,
    title = "A Linear Programming Formulation for Global Inference in Natural Language Tasks",
    author = "Roth, Dan  and
      Yih, Wen-tau",
    booktitle = "Proceedings of the Eighth Conference on Computational Natural Language Learning (CoNLL-2004) at HLT-NAACL 2004",
    month = may # " 6 - " # may # " 7",
    year = "2004",
    address = "Boston, Massachusetts, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/W04-2401",
    pages = "1--8",
}
@article{eberts-ulges2019spert,
  author       = {Markus Eberts and
                  Adrian Ulges},
  title        = {Span-based Joint Entity and Relation Extraction with Transformer Pre-training},
  journal      = {CoRR},
  volume       = {abs/1909.07755},
  year         = {2019},
  url          = {http://arxiv.org/abs/1909.07755},
  eprinttype    = {arXiv},
  eprint       = {1909.07755},
  timestamp    = {Mon, 23 Sep 2019 18:07:15 +0200},
  biburl       = {https://dblp.org/rec/journals/corr/abs-1909-07755.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}

APA:

Roth, D., & Yih, W. (2004). A linear programming formulation for global inference in natural language tasks. In Proceedings of the Eighth Conference on Computational Natural Language Learning (CoNLL-2004) at HLT-NAACL 2004 (pp. 1–8). Boston, MA, USA: Association for Computational Linguistics. https://aclanthology.org/W04-2401
Eberts, M., & Ulges, A. (2019). Span-based joint entity and relation extraction with transformer pre-training. CoRR, abs/1909.07755. http://arxiv.org/abs/1909.07755
