DATASET
Open Source Community
Secom-Dataset
The Secom dataset presents a rare-event scenario with highly imbalanced output classes. It consists of 1,567 observations and 590 variables; each record represents a single production entity with its associated measurement features. The `secom_labels.data` file provides a pass/fail label (-1 = pass, 1 = fail) and a timestamp for each test point.
Updated 3/15/2021
Description
Dataset Overview
Dataset Name
Predictive‑Models‑for‑Equipment‑Fault‑Detection---Secom‑Dataset
Composition
- secom.data: Contains 1,567 observations with 590 variables (features).
- secom_labels.data: Contains classification labels (pass/fail) and timestamps.
Description
- secom.data: Each record represents a production entity with a set of measured features.
- secom_labels.data: Each row holds a pass/fail label (-1 = pass, 1 = fail) and the timestamp of the corresponding test point.
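The two files described above could be parsed roughly as follows. This is a minimal sketch using only the Python standard library, not an official loader. It assumes that `secom_labels.data` rows look like `-1 "19/07/2008 11:55:00"` (label, then a quoted day/month/year timestamp) and that `secom.data` rows are space-separated numbers with `NaN` for missing values; the function names and the in-memory sample rows are hypothetical and stand in for the real files.

```python
import csv
import io
from datetime import datetime


def parse_labels(text_stream, timestamp_format="%d/%m/%Y %H:%M:%S"):
    """Parse label rows into (label, timestamp) tuples.

    Assumed row layout: <label> "<timestamp>", separated by one space.
    The timestamp format is an assumption, not taken from the dataset page.
    """
    rows = []
    reader = csv.reader(text_stream, delimiter=" ", quotechar='"')
    for fields in reader:
        if not fields:
            continue  # skip blank lines
        label = int(fields[0])
        if label not in (-1, 1):
            raise ValueError(f"unexpected label: {label}")
        rows.append((label, datetime.strptime(fields[1], timestamp_format)))
    return rows


def parse_features(text_stream):
    """Parse whitespace-separated feature rows; 'NaN' becomes float('nan')."""
    return [[float(tok) for tok in line.split()]
            for line in text_stream if line.strip()]


# Hypothetical sample rows standing in for the real files.
sample_labels = io.StringIO(
    '-1 "19/07/2008 11:55:00"\n1 "19/07/2008 12:32:00"\n'
)
sample_features = io.StringIO("3030.93 2564.0 NaN\n3095.78 NaN 2230.42\n")

labels = parse_labels(sample_labels)
features = parse_features(sample_features)
print(labels[0][0], labels[0][1].isoformat(), len(features[0]))
```

With the real files, the `io.StringIO` objects would be replaced by `open("secom_labels.data")` and `open("secom.data")`, and the row counts of the two results should match.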
Applications
- Fit several machine-learning models, evaluate their performance, and select the best one for predicting yield (pass/fail) in semiconductor manufacturing.
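Because the failure class is rare, plain accuracy can make a model that never predicts a failure look good, so model selection is often based on failure-class metrics instead. The sketch below illustrates that idea; it is one possible approach, not the method used by the dataset's authors. The helper names, the candidate model names, and the toy predictions are all hypothetical.

```python
def fail_class_scores(y_true, y_pred, fail=1):
    """Return accuracy plus precision, recall and F1 for the fail class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == fail and p == fail)
    fp = sum(1 for t, p in pairs if t != fail and p == fail)
    fn = sum(1 for t, p in pairs if t == fail and p != fail)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def select_best(y_true, predictions_by_model, metric="f1"):
    """Score every candidate and return the name with the highest metric."""
    scores = {name: fail_class_scores(y_true, pred)
              for name, pred in predictions_by_model.items()}
    best = max(scores, key=lambda name: scores[name][metric])
    return best, scores


# Toy held-out labels: 8 passes (-1) and 2 fails (1).
y_true = [-1] * 8 + [1] * 2
candidates = {
    # Hypothetical model that always predicts "pass".
    "always_pass": [-1] * 10,
    # Hypothetical model with one false alarm and one detected failure.
    "model_b": [-1] * 7 + [1] + [1, -1],
}

best, scores = select_best(y_true, candidates)
for name, s in scores.items():
    print(name, s)
print("selected:", best)
```

Both toy candidates reach the same accuracy, but only the second one detects any failure, which is why the selection here uses the fail-class F1 score.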
Special Note
- The data represent a rare-event scenario: failures occur far less often than passes, so the classes are highly imbalanced and sampling techniques are applied during preprocessing to compensate.
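The page does not say which sampling technique is used. As one common option, the sketch below shows random oversampling, which duplicates randomly chosen minority-class rows until every class is as large as the biggest one. The function name and the toy data are hypothetical, and the choice of oversampling (rather than undersampling or a synthetic method) is an assumption.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate random minority-class rows until all classes are equal in size.

    Returns new lists; the inputs are left unchanged.
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        indices = [i for i, y in enumerate(labels) if y == cls]
        for _ in range(target - n):
            i = rng.choice(indices)
            out_rows.append(rows[i])
            out_labels.append(cls)
    return out_rows, out_labels


# Toy training data: 6 passes (-1) and 2 fails (1).
rows = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [9.1], [9.2]]
labels = [-1, -1, -1, -1, -1, -1, 1, 1]

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

Resampling like this is normally applied to the training split only, so that duplicated rows do not leak into the data used for evaluation.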
Topics
Quality Control
Fault Detection
Source
Organization: github
Created: 12/21/2017