ATIS dataset

The ATIS dataset contains 4,978 training sentences and 850 evaluation sentences. It is used to train and evaluate natural language understanding (NLU) pipelines, which involve tokenization, featurization, intent classification, and entity recognition and extraction.

Updated 12/24/2022

Dataset Overview

Dataset Name

ATIS dataset

Dataset Purpose

Used for training and evaluating natural language understanding (NLU) models
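
The NLU steps this dataset is used for (tokenization, featurization, intent classification) can be illustrated with a minimal sketch. It uses only the Python standard library and a simple token-overlap classifier, which is a stand-in for illustration and not one of the models evaluated on this page. The utterances, the intent labels, and the function names `tokenize`, `featurize`, `train`, and `classify` are all hypothetical and are not taken from the dataset.

```python
import re
from collections import Counter

def tokenize(text):
    """Tokenization: lowercase the text and split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())

def featurize(tokens):
    """Featurization: bag-of-words counts as a sparse feature vector."""
    return Counter(tokens)

def train(examples):
    """Sum the feature vectors of each intent into one profile per intent."""
    profiles = {}
    for text, intent in examples:
        profiles.setdefault(intent, Counter()).update(featurize(tokenize(text)))
    return profiles

def classify(profiles, text):
    """Intent classification: pick the intent whose profile overlaps most with the input."""
    features = featurize(tokenize(text))

    def overlap(intent):
        profile = profiles[intent]
        total = sum(profile.values())  # normalize so larger profiles do not win by size
        return sum(count * profile[tok] for tok, count in features.items()) / total

    return max(profiles, key=overlap)

# Hypothetical ATIS-style utterances (assumption: not copied from the dataset).
examples = [
    ("show me flights from boston to denver", "flight"),
    ("list flights leaving atlanta on monday", "flight"),
    ("how much is the fare to dallas", "airfare"),
    ("what is the cheapest fare from boston", "airfare"),
]
profiles = train(examples)
prediction = classify(profiles, "flights from atlanta to dallas")
print(prediction)
```

A trained model such as DIET or a linear SVM would replace the overlap score with learned weights, but the order of the steps stays the same.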

Dataset Composition

Training set contains 4,978 sentences
Evaluation set contains 850 sentences

Dataset Sample

Sample image shows a portion of the dataset

Model Configuration and Results

Intent Classifier

Model 1: DIET, 256‑bit binary transformer; scores highest of the three models
Model 2: Linear SVM
Model 3: MITIE language model
Performance Metrics (values listed in the order Model 1, Model 2, Model 3):
- Weighted average precision: 0.96, 0.88, 0.94
- Weighted average recall: 0.96, 0.89, 0.94
- Weighted average F1 score: 0.96, 0.88, 0.93
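
For readers unfamiliar with the term, a weighted average here normally means that each class's precision, recall, and F1 is weighted by that class's share of the gold labels. The sketch below shows that computation under this assumption; the page does not state which tool produced the figures above, and the labels and the function name `weighted_scores` are hypothetical.

```python
from collections import Counter

def weighted_scores(gold, predicted):
    """Return support-weighted precision, recall, and F1 over the labels in gold."""
    support = Counter(gold)                       # gold examples per label
    pred_count = Counter(predicted)               # predictions per label
    true_pos = Counter(g for g, p in zip(gold, predicted) if g == p)
    totals = {"precision": 0.0, "recall": 0.0, "f1": 0.0}
    for label, n in support.items():
        tp = true_pos[label]
        precision = tp / pred_count[label] if pred_count[label] else 0.0
        recall = tp / n
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        weight = n / len(gold)                    # this label's share of the gold labels
        totals["precision"] += weight * precision
        totals["recall"] += weight * recall
        totals["f1"] += weight * f1
    return totals

# Hypothetical intent labels (assumption: not taken from the dataset).
gold = ["flight", "flight", "flight", "airfare"]
predicted = ["flight", "flight", "airfare", "airfare"]
scores = weighted_scores(gold, predicted)
print(scores)
```

Because the F1 score is averaged per class, the weighted F1 is generally not the harmonic mean of the weighted precision and weighted recall, which is why the three reported figures for a model need not satisfy that relation.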

Entity Extractor

Model 1: DIET, the same model used for both intent classification and entity extraction
Model 2: CRF, scores lower than DIET on all three metrics
Model 3: MITIE entity extractor, scores between DIET and CRF
Performance Metrics (values listed in the order Model 1, Model 2, Model 3):
- Weighted average precision: 0.96, 0.90, 0.95
- Weighted average recall: 0.94, 0.89, 0.92
- Weighted average F1 score: 0.94, 0.89, 0.93
