ATIS dataset

The ATIS (Airline Travel Information System) dataset contains 4,978 training sentences and 850 evaluation sentences. It is used to train and evaluate natural language understanding (NLU) pipelines, covering tokenization, featurization, intent classification, and entity extraction.

Updated 12/24/2022

Description

Dataset Overview

Dataset Name

  • ATIS dataset

Dataset Purpose

  • Used for training and evaluating natural language understanding (NLU) models
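
As a rough illustration of the first two NLU steps (tokenization and featurization), the sketch below builds a bag-of-words vector in plain Python. The function names and the example queries are hypothetical and invented for illustration; they are not taken from the dataset, and the models listed on this page may use different featurizers.

```python
import re


def tokenize(sentence):
    """Lowercase the sentence and split it into runs of letters, digits, or apostrophes."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def build_vocabulary(sentences):
    """Map each distinct token to a column index, in alphabetical order."""
    tokens = sorted({tok for s in sentences for tok in tokenize(s)})
    return {tok: i for i, tok in enumerate(tokens)}


def featurize(sentence, vocabulary):
    """Return a bag-of-words count vector; tokens outside the vocabulary are ignored."""
    vector = [0] * len(vocabulary)
    for tok in tokenize(sentence):
        index = vocabulary.get(tok)
        if index is not None:
            vector[index] += 1
    return vector


# Hypothetical ATIS-style queries, invented for illustration only.
SENTENCES = [
    "Show me flights from Boston to Denver",
    "What is the fare to Denver",
]
VOCABULARY = build_vocabulary(SENTENCES)
print(featurize("flights to denver", VOCABULARY))
```

A vector like this is what a simple intent classifier, such as a linear SVM, could take as input.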

Dataset Composition

  • Training set contains 4,978 sentences
  • Evaluation set contains 850 sentences

Dataset Sample

  • Sample image shows a portion of the dataset

Model Configuration and Results

Intent Classifier

  • Model 1: DIET (Dual Intent and Entity Transformer), 256‑bit binary transformer; outperforms the other two models
  • Model 2: Linear SVM
  • Model 3: MITIE language model
  • Performance Metrics (listed in order: DIET, Linear SVM, MITIE):
    • Weighted average precision: 0.96, 0.88, 0.94
    • Weighted average recall: 0.96, 0.89, 0.94
    • Weighted average F1 score: 0.96, 0.88, 0.93
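
The page does not say how these weighted averages were computed. A minimal sketch, assuming the usual definition in which each intent's precision, recall, and F1 are weighted by that intent's share of the true labels, might look like this. The function name and the toy labels are hypothetical.

```python
from collections import Counter


def weighted_scores(y_true, y_pred):
    """Return support-weighted (precision, recall, f1) over the true labels."""
    support = Counter(y_true)      # true examples per label
    predicted = Counter(y_pred)    # predictions per label
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    total = len(y_true)

    precision = recall = f1 = 0.0
    for label, n in support.items():
        # A label that was never predicted gets precision 0.
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / n
        f = 2 * p * r / (p + r) if p + r else 0.0
        weight = n / total         # this label's share of the true labels
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return precision, recall, f1


# Toy labels, invented for illustration only.
y_true = ["flight", "flight", "flight", "airfare"]
y_pred = ["flight", "flight", "airfare", "airfare"]
print(weighted_scores(y_true, y_pred))
```

Under this definition, weighted recall equals plain accuracy, because each label's recall is weighted by exactly the number of examples it covers.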

Entity Extractor

  • Model 1: DIET, the same model used for both intent classification and entity extraction
  • Model 2: CRF (conditional random field), performs below DIET
  • Model 3: MITIE entity extractor, performance between DIET and CRF
  • Performance Metrics (listed in order: DIET, CRF, MITIE):
    • Weighted average precision: 0.96, 0.90, 0.95
    • Weighted average recall: 0.94, 0.89, 0.92
    • Weighted average F1 score: 0.94, 0.89, 0.93
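
The page does not describe the annotation scheme used for entities. Assuming per-token BIO tags, a common convention for slot filling, the sketch below shows how predicted tags could be turned into entity spans before scoring. The function name, example sentence, and tag names are illustrative assumptions.

```python
def extract_entities(tokens, tags):
    """Collect (entity_type, text) pairs from BIO tags.

    An I- tag that does not continue an open entity of the same type
    is treated like O and dropped.
    """
    entities = []
    current_type, current_tokens = None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("I-") and tag[2:] == current_type:
            # Continue the entity that is currently open.
            current_tokens.append(token)
            continue
        # Any other tag closes the open entity, if there is one.
        if current_type is not None:
            entities.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = None, []
        if tag.startswith("B-"):
            current_type, current_tokens = tag[2:], [token]
    # Flush an entity that runs to the end of the sentence.
    if current_type is not None:
        entities.append((current_type, " ".join(current_tokens)))
    return entities


# Example sentence and tags, invented for illustration only.
tokens = "flights from new york to denver".split()
tags = ["O", "O", "B-fromloc.city_name", "I-fromloc.city_name", "O", "B-toloc.city_name"]
print(extract_entities(tokens, tags))
```

Scoring on extracted spans rather than individual tokens is one possible design choice; the source does not state which was used for the figures above.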

Topics

Natural Language Processing
Intent Recognition

Source

Organization: GitHub

Created: 8/23/2022
