DATASET
Open Source Community
ATIS dataset
The ATIS (Airline Travel Information System) dataset contains 4,978 training sentences and 850 evaluation sentences. It is used to train and evaluate natural language understanding (NLU) pipelines, covering tokenization, featurization, intent classification, and entity recognition and extraction.
Updated 12/24/2022
github
Description
Dataset Overview
Dataset Name
- ATIS dataset
Dataset Purpose
- Used for training and evaluating natural language understanding (NLU) models
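The NLU steps named in the summary (tokenization, featurization, intent classification, entity extraction) can be illustrated with a minimal sketch. Everything here is a toy stand-in: the utterances, the `CITIES` gazetteer, and the overlap-based classifier are hypothetical and are not the DIET, SVM, MITIE, or CRF components evaluated on this page.

```python
from collections import Counter

# Hypothetical ATIS-style utterances with intent labels (illustrative only,
# not taken from the dataset).
TRAIN = [
    ("show me flights from boston to denver", "flight"),
    ("list flights to dallas on monday", "flight"),
    ("what is the cheapest fare to atlanta", "airfare"),
    ("how much is a ticket to seattle", "airfare"),
]

# Hypothetical gazetteer used as a stand-in for a trained entity extractor.
CITIES = {"boston", "denver", "dallas", "atlanta", "seattle"}


def tokenize(text):
    """Tokenization: lowercase and split on whitespace."""
    return text.lower().split()


def featurize(tokens):
    """Featurization: bag-of-words counts as a sparse feature vector."""
    return Counter(tokens)


def train(examples):
    """Sum the feature vectors of each intent into one profile per intent."""
    profiles = {}
    for text, intent in examples:
        profiles.setdefault(intent, Counter()).update(featurize(tokenize(text)))
    return profiles


def classify(profiles, text):
    """Intent classification: pick the profile with the largest word overlap."""
    feats = featurize(tokenize(text))

    def overlap(profile):
        return sum(count * profile[token] for token, count in feats.items())

    return max(profiles, key=lambda intent: overlap(profiles[intent]))


def extract_entities(text):
    """Entity extraction: tag tokens found in the city gazetteer."""
    return [(token, "city") for token in tokenize(text) if token in CITIES]


profiles = train(TRAIN)
query = "flights from denver to boston"
print(classify(profiles, query), extract_entities(query))
```

A trained pipeline replaces each function with a learned component, but the data flow from raw sentence to intent label and entity spans stays the same.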
Dataset Composition
- Training set contains 4,978 sentences
- Evaluation set contains 850 sentences
Dataset Sample
- Sample image shows a portion of the dataset
Model Configuration and Results
Intent Classifier
- Model 1: DIET (Dual Intent and Entity Transformer), 256‑bit binary transformer; scores highest of the three models on every metric
- Model 2: Linear SVM
- Model 3: MITIE language model
- Performance Metrics (values listed in model order: DIET, Linear SVM, MITIE):
- Weighted average precision: 0.96, 0.88, 0.94
- Weighted average recall: 0.96, 0.89, 0.94
- Weighted average F1 score: 0.96, 0.88, 0.93
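As a minimal sketch of what a weighted average means here, assuming the figures follow the usual support-weighted convention (the one scikit-learn exposes as `average="weighted"`): each class's precision, recall, and F1 is computed separately, then averaged with weights proportional to how many true examples the class has. The labels below are hypothetical and serve only to exercise the function.

```python
from collections import Counter


def weighted_scores(y_true, y_pred):
    """Return support-weighted average precision, recall, and F1."""
    support = Counter(y_true)  # number of true examples per class
    total = len(y_true)
    precision = recall = f1 = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        predicted = sum(1 for p in y_pred if p == label)
        p_c = tp / predicted if predicted else 0.0  # per-class precision
        r_c = tp / n                                # per-class recall
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = n / total                          # share of true examples
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return precision, recall, f1


# Hypothetical gold and predicted intents, for illustration only.
y_true = ["flight", "flight", "flight", "airfare", "airfare", "ground_service"]
y_pred = ["flight", "flight", "airfare", "airfare", "airfare", "flight"]
p, r, f = weighted_scores(y_true, y_pred)
print(round(p, 2), round(r, 2), round(f, 2))
```

Because the weights follow class frequency, frequent intents dominate the score, which matters for a dataset where a few intents account for most sentences.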
Entity Extractor
- Model 1: DIET, used for both intent classification and entity extraction
- Model 2: CRF, scores lower than DIET on all three metrics
- Model 3: MITIE entity extractor, scores fall between DIET and CRF
- Performance Metrics (values listed in model order: DIET, CRF, MITIE):
- Weighted average precision: 0.96, 0.90, 0.95
- Weighted average recall: 0.94, 0.89, 0.92
- Weighted average F1 score: 0.94, 0.89, 0.93
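One way to sanity-check the figures above is to compare each reported F1 with the harmonic mean of the reported precision and recall. The sketch below copies the values from the two metric lists and assumes they are ordered as Model 1, Model 2, Model 3. That ordering is an inference from the model descriptions, not something the page states.

```python
# (precision, recall, F1) as listed above; model order is an assumption.
reported = {
    ("intent", "DIET"): (0.96, 0.96, 0.96),
    ("intent", "Linear SVM"): (0.88, 0.89, 0.88),
    ("intent", "MITIE"): (0.94, 0.94, 0.93),
    ("entity", "DIET"): (0.96, 0.94, 0.94),
    ("entity", "CRF"): (0.90, 0.89, 0.89),
    ("entity", "MITIE"): (0.95, 0.92, 0.93),
}


def harmonic_mean(precision, recall):
    """F1 as it would be if computed from the two averaged scores."""
    return 2 * precision * recall / (precision + recall)


# Rows where the two-decimal harmonic mean differs from the reported F1.
mismatches = [
    key
    for key, (p, r, f1) in reported.items()
    if round(harmonic_mean(p, r), 2) != f1
]

for key, (p, r, f1) in reported.items():
    print(key, "harmonic mean:", round(harmonic_mean(p, r), 2), "reported:", f1)
print("differs:", mismatches)
```

With the figures as listed, two rows differ by 0.01: MITIE for intents and DIET for entities. That is expected rather than a sign of error. A weighted F1 averages the per-class F1 scores, so it need not equal the harmonic mean of the averaged precision and recall, and the inputs are themselves rounded to two decimals.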
Topics
Natural Language Processing
Intent Recognition
Source
Organization: github
Created: 8/23/2022