
Chinese-Literature-NER-RE-Dataset

A discourse-level Named Entity Recognition and Relation Extraction dataset for Chinese literary texts.

Updated 4/1/2020
github

Description

Dataset Overview

Dataset Name

  • Chinese-Literature-NER-RE-Dataset

Dataset Purpose

  • For Named Entity Recognition (NER) and Relation Extraction (RE) on Chinese literary texts.

Dataset Description

  • A detailed description of the dataset is provided in the arXiv paper listed under Citation Information.

Tag Set

  • Entity tags: 7 entity types are defined.
  • Relation tags: 9 relation types are defined.

Annotation Format

Entity Annotation

  • T tag: identifies an entity.
    • Id: unique identifier of the entity within the document, starting from 0 and incremented for each new entity.
    • Type: entity type, corresponding to one of the entity tags.
    • Begin Index: character offset at which the entity begins; offsets are 0-based and counted per character.
    • End Index: character offset at which the entity ends, counted the same way (0-based, per character).
    • Value: the text of the entity, i.e. the word that represents the identified object.
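
The entity fields above can be read into a small record type. The following is a minimal sketch, not the dataset's official loader: it assumes each entity line is whitespace-separated in the order Id, Type, Begin Index, End Index, Value, with the T tag prefixed to the Id. The `Entity` class, the `parse_entity_line` function, the sample line, and the `Person` label are all hypothetical and used only for illustration; check the actual annotation files before relying on this layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    id: str      # T tag plus Id, e.g. "T0"
    type: str    # one of the entity tags
    begin: int   # Begin Index (0-based character offset)
    end: int     # End Index (0-based character offset)
    value: str   # text of the entity


def parse_entity_line(line: str) -> Entity:
    """Parse one entity line under the ASSUMED layout
    '<T+Id> <Type> <Begin> <End> <Value>'."""
    # maxsplit=4 keeps any whitespace inside the value intact.
    tag, etype, begin, end, value = line.strip().split(maxsplit=4)
    if not tag.startswith("T"):
        raise ValueError(f"not an entity line: {line!r}")
    return Entity(tag, etype, int(begin), int(end), value)


# Hypothetical sample line; the type label is a placeholder.
entity = parse_entity_line("T0 Person 0 2 张三")
print(entity)
```

The sketch stores the indices as given and does not interpret whether End Index is inclusive or exclusive, since the description above does not say.
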

Relation Annotation

  • R tag: identifies a relation.
    • Id: unique identifier of the relation within the document, starting from 0 and incremented for each new relation.
    • Arg1 and Arg2: the two entities that take part in the relation.
    • Type: relation type, corresponding to one of the relation tags.
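
Relation lines can be parsed in a similar way and resolved against the entities they refer to. This is a minimal sketch under an assumed brat-style layout, `<R+Id> <Type> Arg1:<entity id> Arg2:<entity id>`, which the description above does not confirm. The `Relation` class, the helper functions, the sample line, the `Located` label, and the entity values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    id: str    # R tag plus Id, e.g. "R0"
    type: str  # one of the relation tags
    arg1: str  # id of the first entity
    arg2: str  # id of the second entity


def strip_role(field: str, role: str) -> str:
    """Remove an assumed 'Arg1:' / 'Arg2:' prefix from an argument field."""
    prefix = role + ":"
    if not field.startswith(prefix):
        raise ValueError(f"expected {prefix!r} prefix in {field!r}")
    return field[len(prefix):]


def parse_relation_line(line: str) -> Relation:
    """Parse one relation line under the ASSUMED layout
    '<R+Id> <Type> Arg1:<entity id> Arg2:<entity id>'."""
    tag, rtype, arg1, arg2 = line.strip().split()
    if not tag.startswith("R"):
        raise ValueError(f"not a relation line: {line!r}")
    return Relation(tag, rtype, strip_role(arg1, "Arg1"), strip_role(arg2, "Arg2"))


# Hypothetical data: entity id -> entity value, plus one relation line.
entity_values = {"T0": "张三", "T1": "北京"}
relation = parse_relation_line("R0 Located Arg1:T0 Arg2:T1")

# Resolve the relation into a (value, type, value) triple.
resolved = (entity_values[relation.arg1], relation.type, entity_values[relation.arg2])
print(resolved)
```

Keeping the arguments as entity ids, rather than copying the entity text into the relation, means each relation can be checked against the entity table of the same document.
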

Citation Information

  • Authors: Jingjing Xu, Ji Wen, Xu Sun, Qi Su
  • Title: A Discourse-Level Named Entity Recognition and Relation Extraction Dataset for Chinese Literature Text
  • Year: 2017
  • Link: arXiv article link

Topics

Natural Language Processing
Literary Text Analysis

Source

Organization: GitHub

Created: 10/4/2019
