bigbio/genia_term_corpus

The GENIA Term Corpus supports the recognition of entities of interest in molecular biology, such as proteins, genes, and cells, which is a fundamental task in biomedical text mining. Its technical term annotations cover physical biological entities as well as other important terminology. The annotated text consists of 1,999 abstracts drawn from the main GENIA corpus.

Updated 12/22/2022
hugging_face

Description

GENIA Term Corpus Dataset Overview

Basic Information

  • Language: English
  • License: GENIA_PROJECT_LICENSE
  • Multilinguality: Monolingual
  • Dataset Name: GENIA Term Corpus
  • Homepage: GENIA Term Corpus

Dataset Description

  • Availability: Public
  • Task: Named Entity Recognition (NER)
  • Content: Annotations of entities of interest in molecular biology (e.g., proteins, genes, cells) across 1,999 abstracts from the original GENIA corpus.
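
For readers new to NER corpora, the sketch below shows one common way to turn character-offset entity annotations into token-level BIO tags for model training. It is an illustration only: the sentence, the offsets, the labels ("gene", "protein", "cell"), and the helper names are hypothetical and are not taken from the corpus or its official schema, and whitespace tokenization is a simplifying assumption.

```python
from typing import List, Tuple

Entity = Tuple[int, int, str]  # (start offset, end offset, label); hypothetical layout


def whitespace_tokenize(text: str) -> List[Tuple[str, int, int]]:
    """Split on whitespace and record each token's character offsets."""
    tokens = []
    pos = 0
    for tok in text.split():
        start = text.index(tok, pos)  # search from the end of the previous token
        end = start + len(tok)
        tokens.append((tok, start, end))
        pos = end
    return tokens


def spans_to_bio(text: str, entities: List[Entity]) -> List[Tuple[str, str]]:
    """Assign B-/I-/O tags to tokens that lie fully inside an entity span."""
    tokens = whitespace_tokenize(text)
    tags = ["O"] * len(tokens)
    for ent_start, ent_end, label in entities:
        inside = False
        for i, (_, tok_start, tok_end) in enumerate(tokens):
            if tok_start >= ent_start and tok_end <= ent_end:
                tags[i] = ("I-" if inside else "B-") + label
                inside = True
    return [(tok, tag) for (tok, _, _), tag in zip(tokens, tags)]


# Made-up example sentence with made-up annotations (not from the corpus).
text = "IL-2 gene expression requires NF-kappa B activation in T cells"
entities = [(0, 9, "gene"), (30, 40, "protein"), (55, 62, "cell")]

for token, tag in spans_to_bio(text, entities):
    print(f"{token}\t{tag}")
```

This sketch ignores nested or overlapping terms: a later span simply overwrites the tags of an earlier one, so a corpus with nested annotations would need a different encoding.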

Citation Information

  • Reference 1: Ohta, T., Tateisi, Y., & Kim, J.-D. (2002). The GENIA Corpus: An Annotated Research Abstract Corpus in Molecular Biology Domain. Proceedings of the Second International Conference on Human Language Technology Research, 82–86.
  • Reference 2: Kim, J.-D., Ohta, T., Tateisi, Y., & Tsujii, J. (2003). GENIA corpus - a semantically annotated corpus for bio-textmining. Bioinformatics, 19(Suppl. 1), i180–i182.
  • Reference 3: Kim, J.-D., Ohta, T., Tsuruoka, Y., Tateisi, Y., & Collier, N. (2004). Introduction to the Bio‑Entity Recognition Task at JNLPBA. Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and Its Applications, 70–75.

Topics

Bioinformatics
Text Mining

Source

Organization: hugging_face

Created: Unknown
