Open Source Community

Secom-Dataset

The Secom dataset presents a rare-event classification problem with highly imbalanced output classes. It consists of 1,567 observations and 590 variables; each record represents a single production entity with its associated measurement features. The `secom_labels.data` file provides a pass/fail label (-1 = pass, 1 = fail) and a timestamp for each test point.

Updated 3/15/2021

Description

Dataset Overview

Dataset Name

Predictive‑Models‑for‑Equipment‑Fault‑Detection---Secom‑Dataset

Composition

secom.data: Contains 1,567 observations with 590 variables (features).
secom_labels.data: Contains classification labels (pass/fail) and timestamps.

Description

secom.data: Each record represents a production entity with a set of measured features.
secom_labels.data: Pass/fail labels, where -1 indicates pass and 1 indicates fail, each with the timestamp of the test point it corresponds to.
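
As a rough illustration of the two files described above, the sketch below parses one line of each. The page does not state the on-disk layout, so everything about the format here is an assumption: a whitespace-separated label followed by a double-quoted timestamp in day/month/year order, and whitespace-separated feature values with `NaN` marking a missing measurement. The helper names and the sample lines are made up for the example.

```python
from datetime import datetime
import math


def parse_label_line(line, ts_format="%d/%m/%Y %H:%M:%S"):
    """Parse one label line into (label, timestamp).

    Assumed layout (not confirmed by the dataset page):
        -1 "19/07/2008 11:55:00"
    """
    label_text, _, ts_text = line.strip().partition(" ")
    label = int(label_text)
    if label not in (-1, 1):
        raise ValueError(f"unexpected label {label!r}; expected -1 (pass) or 1 (fail)")
    timestamp = datetime.strptime(ts_text.strip().strip('"'), ts_format)
    return label, timestamp


def parse_feature_line(line):
    """Parse one whitespace-separated feature row; 'NaN' tokens become float('nan')."""
    return [float(token) for token in line.split()]


# Made-up sample lines in the assumed format (not real dataset records).
sample_label_lines = ['-1 "19/07/2008 11:55:00"', '1 "19/07/2008 14:43:00"']
sample_feature_lines = ["10.5 20 NaN 1.5", "11.25 NaN 30.4 0.9"]

labels = [parse_label_line(line) for line in sample_label_lines]
rows = [parse_feature_line(line) for line in sample_feature_lines]
```

With the real files, each of the 1,567 rows would be expected to yield 590 values; checking `len(row)` after parsing is a cheap way to confirm that the assumed separator is right.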

Applications

Fit several machine-learning models, evaluate their performance, and select the best one for predicting yield in semiconductor manufacturing.
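
The page does not say which metric is used to pick the best model. Because failures are rare, plain accuracy can look good for a model that never predicts a failure, so the sketch below compares candidates on balanced accuracy instead. That metric choice, the function names, and the toy predictions are all assumptions for illustration; labels follow the dataset's -1 = pass, 1 = fail convention.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives, treating `positive` (fail) as the event."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def rare_event_scores(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and balanced accuracy for one model."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "balanced_accuracy": (recall + specificity) / 2,
    }


def select_best(candidates, y_true, key="balanced_accuracy"):
    """Pick the candidate name whose predictions score highest on `key`."""
    return max(candidates, key=lambda name: rare_event_scores(y_true, candidates[name])[key])


# Toy example: 8 passes (-1) and 2 fails (1); the predictions are invented.
y_true = [-1] * 8 + [1] * 2
candidates = {
    "always_pass": [-1] * 10,              # never flags a failure
    "detector": [-1] * 7 + [1] + [1, -1],  # one false alarm, one caught failure
}
best = select_best(candidates, y_true)
```

In this toy case both candidates reach 0.8 accuracy, but only `detector` catches a failure, which is why it wins on balanced accuracy (0.6875 versus 0.5).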

Special Note

The data represent a rare-event scenario: the failure class occurs very infrequently, so sampling techniques are applied during preprocessing to address the class imbalance.
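
The page does not name the sampling technique used. As one common option, the sketch below shows random oversampling, which duplicates randomly chosen failure records until both classes are the same size. The function name, the seed, and the toy data are assumptions for illustration only.

```python
import random


def oversample_minority(rows, labels, minority=1, seed=0):
    """Randomly duplicate minority-class rows until both classes have equal counts.

    This is plain random oversampling, chosen here only as an example;
    it is not necessarily the technique used with this dataset.
    """
    rng = random.Random(seed)
    minority_idx = [i for i, label in enumerate(labels) if label == minority]
    majority_idx = [i for i, label in enumerate(labels) if label != minority]
    if not minority_idx:
        raise ValueError("no minority-class rows to resample")

    # Draw (with replacement) as many extra minority rows as needed to close the gap.
    deficit = max(len(majority_idx) - len(minority_idx), 0)
    extra_idx = [rng.choice(minority_idx) for _ in range(deficit)]

    order = majority_idx + minority_idx + extra_idx
    rng.shuffle(order)  # avoid leaving all duplicates at the end
    return [rows[i] for i in order], [labels[i] for i in order]


# Toy data: 8 passes (-1) and 2 fails (1), one invented feature per row.
toy_rows = [[float(i)] for i in range(10)]
toy_labels = [-1] * 8 + [1] * 2
balanced_rows, balanced_labels = oversample_minority(toy_rows, toy_labels)
```

Resampling is normally applied to the training split only, so that duplicated failure records do not leak into the data used for evaluation.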

Topics

Quality Control

Fault Detection

Source

Organization: github

Created: 12/21/2017
