DATASET
Open Source Community
Chinese-Literature-NER-RE-Dataset
A discourse‑level Named Entity Recognition and Relation Extraction dataset for Chinese literary texts.
Updated 4/1/2020
Description
Dataset Overview
Dataset Name
- Chinese-Literature-NER-RE-Dataset
Dataset Purpose
- For Named Entity Recognition (NER) and Relation Extraction (RE) on Chinese literary texts.
Dataset Description
- A detailed description of the dataset is provided in the arXiv paper listed under Citation Information.
Tag Set
- Entity tags: defines 7 entity types.
- Relation tags: defines 9 relation types.
Annotation Format
Entity Annotation
- T tag: identifies an entity.
- Id: unique identifier of the entity in the document, starting from 0 and incremented for each new entity.
- Type: entity type, corresponding to one of the entity tags.
- Begin Index: character offset at which the entity starts; offsets count from 0 and advance by one per character.
- End Index: character offset at which the entity ends, counted the same way as Begin Index.
- Value: the text of the entity mention, i.e. the word or phrase covered by the span.
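The entity fields above can be read with a small parser. The following is a minimal sketch, not the dataset's official loader. It assumes a brat-style standoff line of the form `T<Id><TAB><Type> <Begin> <End><TAB><Value>`; that exact layout is an assumption and should be checked against the actual annotation files. The names `Entity` and `parse_entity_line`, the type label `Person`, and the sample line are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    entity_id: int    # numeric part of the T tag, e.g. 0 for "T0"
    entity_type: str  # one of the entity tags
    begin: int        # Begin Index (character offset)
    end: int          # End Index (character offset)
    value: str        # text of the entity mention


def parse_entity_line(line: str) -> Entity:
    """Parse one entity line.

    Assumed layout (not confirmed by the dataset description):
    "T<Id>\t<Type> <Begin> <End>\t<Value>"
    """
    ident, span, value = line.rstrip("\n").split("\t", 2)
    if not ident.startswith("T"):
        raise ValueError(f"not an entity line: {line!r}")
    entity_type, begin, end = span.split()
    return Entity(int(ident[1:]), entity_type, int(begin), int(end), value)


# Hypothetical sample line; "Person" is a placeholder type label.
example = parse_entity_line("T0\tPerson 0 2\t小明")
print(example)
```

Keeping the offsets as integers makes it straightforward to compare the Value field against the corresponding slice of the source text once the end-index convention has been verified on real data.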
Relation Annotation
- R tag: identifies a relation.
- Id: unique identifier of the relation in the document, starting from 0 and incremented for each new relation.
- Arg1 and Arg2: the two entities involved.
- Type: relation type, corresponding to one of the relation tags.
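Relation lines can be read in the same spirit. This is a minimal sketch under the assumption of a brat-style layout, `R<Id><TAB><Type> Arg1:T<a> Arg2:T<b>`, which the description above does not confirm. The names `Relation` and `parse_relation_line`, the type label `Located`, and the sample line are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Relation:
    relation_id: int    # numeric part of the R tag, e.g. 0 for "R0"
    relation_type: str  # one of the relation tags
    arg1: str           # entity reference for Arg1, e.g. "T0"
    arg2: str           # entity reference for Arg2, e.g. "T1"


def parse_relation_line(line: str) -> Relation:
    """Parse one relation line.

    Assumed layout (not confirmed by the dataset description):
    "R<Id>\t<Type> Arg1:T<a> Arg2:T<b>"
    """
    ident, body = line.strip().split("\t", 1)
    if not ident.startswith("R"):
        raise ValueError(f"not a relation line: {line!r}")
    relation_type, first, second = body.split()
    role1, _, ref1 = first.partition(":")
    role2, _, ref2 = second.partition(":")
    if (role1, role2) != ("Arg1", "Arg2"):
        raise ValueError(f"unexpected argument roles: {line!r}")
    return Relation(int(ident[1:]), relation_type, ref1, ref2)


# Hypothetical sample line; "Located" is a placeholder type label.
rel = parse_relation_line("R0\tLocated Arg1:T0 Arg2:T1")
print(rel)
```

The argument references are kept as strings such as `T0` so they can be matched directly against the entity identifiers in the same document.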
Citation Information
- Authors: Jingjing Xu, Ji Wen, Xu Sun, Qi Su
- Title: A Discourse-Level Named Entity Recognition and Relation Extraction Dataset for Chinese Literature Text
- Year: 2017
- Link: arXiv article link
Topics
Natural Language Processing
Literary Text Analysis
Source
Organization: github
Created: 10/4/2019