Dataset asset, Open Source Community, Aviation Safety, Incident Reports
elihoole/asrs-aviation-reports
The ASRS Aviation Incident Reports dataset aggregates 47,723 aviation incident reports from the Aviation Safety Reporting System (ASRS) maintained by NASA. It is primarily used for abstractive summarization tasks, with model performance measured by ROUGE scores. Each instance contains a narrative of the incident, a summary, and a document ID; some instances also include expert‑provided extended analyses. The dataset is in English (U.S. and U.K. variants).
- Source: hugging_face
- Created: Nov 28, 2025
- Updated: Jul 15, 2022
- Signals: 195 views
- Availability: Linked source ready
Dataset Overview
- Dataset Name: ASRS Aviation Incident Reports
- Size: 47,723 records
- Language: English (en)
- License: Apache-2.0
- Multilinguality: Monolingual
- Task Category: Summarization
Dataset Structure
- Data Instances:
  - Report 1_Narrative: Narrative report text
  - Report 1.2_Synopsis: Expert‑written summary text
  - acn_num_ACN: Document ID
  - Report 1.1_Callback and Report 2.1_Callback: Expert extended analysis (present in a subset of instances)
  - Report 2_Narrative: Second narrative report (present in a subset of instances)
  - Other fields: Metadata such as time, location, flight conditions, aircraft model, etc.
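
The field layout above can be turned into summarization pairs roughly as follows. This is a minimal sketch, not the dataset author's own preprocessing: the helper names `_clean` and `to_summarization_example` are hypothetical, the toy record is invented for illustration, and it assumes that absent optional fields appear as `None` or blank strings, which the source does not state.

```python
from typing import Any, Dict, Optional

# Field names as listed in the dataset structure.
NARRATIVE = "Report 1_Narrative"
SYNOPSIS = "Report 1.2_Synopsis"
DOC_ID = "acn_num_ACN"
OPTIONAL_FIELDS = ("Report 1.1_Callback", "Report 2_Narrative", "Report 2.1_Callback")

# Assumption: real records could be obtained with the Hugging Face `datasets`
# library, e.g. datasets.load_dataset("elihoole/asrs-aviation-reports").
# That call is not executed here so the sketch runs offline.


def _clean(value: Any) -> Optional[str]:
    """Return stripped text, or None for missing or blank values (assumed encoding)."""
    if value is None:
        return None
    text = str(value).strip()
    return text or None


def to_summarization_example(record: Dict[str, Any]) -> Dict[str, Any]:
    """Map one raw record to id/document/summary, plus any non-empty optional fields."""
    example: Dict[str, Any] = {
        "id": record.get(DOC_ID),
        "document": _clean(record.get(NARRATIVE)),
        "summary": _clean(record.get(SYNOPSIS)),
    }
    extras = {name: _clean(record.get(name)) for name in OPTIONAL_FIELDS}
    example["extras"] = {k: v for k, v in extras.items() if v is not None}
    return example


# Hypothetical record, invented for illustration only.
toy_record = {
    "acn_num_ACN": "0000001",
    "Report 1_Narrative": "  During climb we received a traffic alert and leveled off.  ",
    "Report 1.2_Synopsis": "Crew leveled off after a traffic alert during climb.",
    "Report 1.1_Callback": "",
    "Report 2_Narrative": None,
}
example = to_summarization_example(toy_record)
print(example)
```

Keeping the optional callback and second-narrative fields in a separate `extras` mapping leaves the core narrative/synopsis pair uniform across all records, since those extra fields are present only in a subset of instances.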
Dataset Creation
- Annotation creators: Expert‑generated
- Source: Original ASRS reports
Detailed Field Information
| Feature | Number of Instances | Avg. Tokens |
|---|---|---|
| Report 1_Narrative | 47,723 | 281 |
| Report 1.1_Callback | 1,435 | 103 |
| Report 2_Narrative | 11,228 | 169 |
| Report 2.1_Callback | 85 | 110 |
| Report 1.2_Synopsis | 47,723 | 27 |
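
Per-field figures like those in the table could be recomputed along these lines. The sketch assumes whitespace tokenization and treats `None` or blank values as absent; the source does not say which tokenizer produced the reported averages, so results on the real data may differ. `field_stats` is a hypothetical helper and the records are toy data.

```python
from typing import Any, Dict, Iterable, List, Tuple


def field_stats(records: Iterable[Dict[str, Any]], field: str) -> Tuple[int, float]:
    """Count records with a non-empty `field` and average their whitespace token counts."""
    lengths: List[int] = []
    for record in records:
        value = record.get(field)
        if value is None:
            continue
        tokens = str(value).split()  # assumption: whitespace tokenization
        if tokens:
            lengths.append(len(tokens))
    if not lengths:
        return 0, 0.0
    return len(lengths), sum(lengths) / len(lengths)


# Toy records, invented for illustration only.
toy_records = [
    {"Report 1_Narrative": "Engine ran rough on departure", "Report 2_Narrative": None},
    {"Report 1_Narrative": "Returned to the field", "Report 2_Narrative": "Tower cleared us to land"},
]

narrative_stats = field_stats(toy_records, "Report 1_Narrative")
second_stats = field_stats(toy_records, "Report 2_Narrative")
print(narrative_stats, second_stats)
```
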
Supported Tasks
- Summarization: Used for training models to produce abstractive and extractive summaries, evaluated with ROUGE scores.
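
To make the ROUGE evaluation concrete, here is a deliberately simplified unigram-overlap (ROUGE-1 style) F1 score in plain Python. It is a sketch for intuition only: it lowercases and splits on whitespace, with no stemming or other normalization, so its numbers will not match an established ROUGE implementation, which is what should be used for reported results. The function name and example strings are hypothetical.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F1: clipped unigram overlap between reference and candidate."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Counter intersection keeps the minimum count per token (clipped matches).
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Invented strings standing in for a synopsis and a model-generated summary.
reference = "crew leveled off after a traffic alert"
candidate = "crew leveled off after alert"
score = rouge1_f1(reference, candidate)
print(round(score, 4))
```

In this toy case all five candidate tokens appear in the seven-token reference, so precision is 1 and recall is 5/7.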