CEC-Corpus
The Chinese Emergency Corpus (CEC) was built by the Semantic Intelligence Lab at Shanghai University. It contains news reports on five types of emergencies: earthquakes, fires, traffic accidents, terrorist attacks, and food poisoning. The texts were preprocessed, analyzed, and annotated in XML using six tags (Event, Denoter, Time, Location, Participant, and Object) that together describe each event and its elements.
Description
Chinese Emergency Corpus (CEC) Overview
Dataset Construction
- Construction Institution: Semantic Intelligence Lab, Shanghai University
- Data Source: Internet news reports
- Event Categories: Earthquake, fire, traffic accident, terrorist attack, food poisoning (5 categories)
- Number of Texts: 332 articles
Data Processing
- Processing Steps: Text preprocessing, text analysis, event annotation, consistency checking
- Annotation Format: XML
- Primary Data Structures: Event, Denoter, Time, Location, Participant, Object
- Attribute Definitions: Relevant attributes are defined for each tag
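As a rough illustration of how the six-tag XML annotation might be consumed, the sketch below parses a small made-up sample with Python's standard `xml.etree.ElementTree`. Only the tag names (Event, Denoter, Time, Location, Participant, Object) come from the description above. The nesting under `Body` and `Sentence`, the attribute names `eid` and `type`, the English sample text, and the helper names `extract_events` and `tag_counts` are assumptions made for the example and may not match the actual corpus files.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample. The tag names follow the corpus description; the
# nesting, the attributes (eid, type) and the text are assumptions.
SAMPLE = """
<Body>
  <Sentence>
    <Event eid="e1">
      <Time>On Monday morning</Time>,
      <Location>in the city center</Location>,
      <Participant>a bus</Participant>
      <Denoter type="emergency">collided</Denoter> with
      <Object>a truck</Object>.
    </Event>
  </Sentence>
</Body>
"""

# The five element tags that can appear inside an Event.
ELEMENT_TAGS = ("Denoter", "Time", "Location", "Participant", "Object")


def extract_events(xml_text):
    """Return one dict per <Event>, mapping each element tag to its text spans."""
    root = ET.fromstring(xml_text)
    events = []
    for event in root.iter("Event"):
        record = {"attributes": dict(event.attrib)}
        for tag in ELEMENT_TAGS:
            # iter() walks all descendants, so elements of nested events
            # (if any) are also collected under the outer event.
            record[tag] = [
                "".join(node.itertext()).strip() for node in event.iter(tag)
            ]
        events.append(record)
    return events


def tag_counts(xml_text):
    """Count how often each of the six annotation tags occurs."""
    root = ET.fromstring(xml_text)
    wanted = ("Event",) + ELEMENT_TAGS
    return Counter(node.tag for node in root.iter() if node.tag in wanted)


events = extract_events(SAMPLE)
counts = tag_counts(SAMPLE)
```

Collecting each tag into a list keeps the sketch tolerant of events that have several participants or no explicit time, which seems likely in news text but is not stated in the description.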
Research and Development Funding
- Funding Projects: National Natural Science Foundation projects “Key Issues in Event Reasoning based on Description Logic” (Grant No. 61305053) and “Event Ontology Model and Application Technology” (Grant No. 60975033)
Research Outcomes
- Research Papers: Multiple papers published in Journal of Chinese Information Processing, Pattern Recognition and Artificial Intelligence, etc.
- Doctoral Dissertations: Including studies on event-oriented knowledge processing and event-oriented text representation.
- Master’s Theses: Covering intentional event research, extraction and reasoning of temporal event elements, etc.
Corpus Characteristics
- Scale: Slightly smaller than the ACE and TimeBank corpora
- Annotation Completeness: Compared with ACE and TimeBank, provides the most comprehensive annotation of events and event elements
Source
Organization: github
Created: 1/22/2015