Tags: Dataset asset, Open Source Community, Fault Detection, Quality Control
Secom-Dataset
The Secom dataset presents a rare-event scenario with highly imbalanced output classes. It consists of 1,567 observations and 590 variables; each record represents a single production entity with its associated measurement features. The `secom_labels.data` file provides a pass/fail label (-1 = pass, 1 = fail) and a timestamp for each test point.
Source: GitHub
Created: Dec 21, 2017
Updated: Mar 15, 2021
Dataset Overview
Dataset Name
Predictive‑Models‑for‑Equipment‑Fault‑Detection---Secom‑Dataset
Composition
- secom.data: Contains 1,567 observations with 590 variables (features).
- secom_labels.data: Contains classification labels (pass/fail) and timestamps.
Description
- secom.data: Each record represents a production entity with a set of measured features.
- secom_labels.data: Simple pass/fail labeling where -1 indicates pass and 1 indicates fail; each timestamp corresponds to a specific test point.
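As a rough illustration of the two files described above, the following sketch parses a small inline excerpt with the Python standard library. The file layout is an assumption, not something this page states: it assumes space-separated values, the token `NaN` for missing measurements, and a quoted `dd/mm/YYYY HH:MM:SS` timestamp after each label. The sample values and the helper names `parse_features` and `parse_labels` are hypothetical.

```python
import csv
import io
import math
from datetime import datetime

# Hypothetical two-row excerpts standing in for the real files
# (1,567 rows; 590 feature columns). Values are made up for illustration.
SAMPLE_FEATURES = "3030.93 2564.00 NaN 1.5\n3095.78 NaN 2230.42 0.9\n"
SAMPLE_LABELS = '-1 "19/07/2008 11:55:00"\n1 "19/07/2008 12:32:00"\n'


def parse_features(text):
    """Parse whitespace-separated floats; the token 'NaN' becomes float('nan')."""
    return [[float(tok) for tok in line.split()]
            for line in text.splitlines() if line.strip()]


def parse_labels(text, ts_format="%d/%m/%Y %H:%M:%S"):
    """Return (label, timestamp) pairs; label is -1 (pass) or 1 (fail)."""
    # Assumed layout: label, one space, then a double-quoted timestamp.
    reader = csv.reader(io.StringIO(text), delimiter=" ", quotechar='"')
    return [(int(row[0]), datetime.strptime(row[1], ts_format))
            for row in reader if row]


features = parse_features(SAMPLE_FEATURES)
labels = parse_labels(SAMPLE_LABELS)

# Each feature row should line up with exactly one label row.
assert len(features) == len(labels)
missing = sum(math.isnan(v) for row in features for v in row)
print(f"{len(features)} rows, {len(features[0])} features, {missing} missing values")
```

To try this on the actual files, replace the sample strings with the file contents and check the delimiter and timestamp format against the data first.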
Applications
- Fit several machine-learning models, evaluate their performance, and select the best one for predicting yield in semiconductor manufacturing.
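When comparing models on labels this imbalanced, plain accuracy can be misleading, so it may help to look at per-class metrics as well. The sketch below is one possible way to do that, not the repository's own evaluation method; the function names `confusion_counts` and `summarize` and the toy labels are hypothetical. It treats fail (1) as the positive class.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives, with `positive` as the fail class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def summarize(y_true, y_pred, positive=1):
    """Return accuracy, recall, specificity, and balanced accuracy as a dict."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "recall": recall,
        "specificity": specificity,
        "balanced_accuracy": (recall + specificity) / 2,
    }


# Toy labels: 18 passes (-1) and 2 fails (1).
y_true = [-1] * 18 + [1] * 2
# A degenerate "model" that always predicts pass.
always_pass = [-1] * len(y_true)

report = summarize(y_true, always_pass)
print(report)
```

In this toy case the always-pass predictor scores high on accuracy while detecting no failures at all, which is why recall or balanced accuracy is often a more useful selection criterion for rare-event data.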
Special Note
- The data represent a rare-event scenario: the failure class occurs very infrequently, so sampling techniques are applied during preprocessing.
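The page does not say which sampling technique is used. As one common option, the sketch below shows random oversampling of the minority (fail) class until both classes are the same size. The function name `random_oversample` and the toy data are hypothetical, and other techniques (undersampling, synthetic sampling) would be equally consistent with the description.

```python
import random
from collections import Counter


def random_oversample(rows, labels, minority=1, seed=0):
    """Duplicate randomly chosen minority rows until both classes are equal in size."""
    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    majority_idx = [i for i, y in enumerate(labels) if y != minority]
    if not minority_idx:
        raise ValueError("no minority-class rows to sample from")

    # Number of extra minority rows needed to match the majority count.
    deficit = max(len(majority_idx) - len(minority_idx), 0)
    extra = [rng.choice(minority_idx) for _ in range(deficit)]

    order = majority_idx + minority_idx + extra
    rng.shuffle(order)  # avoid leaving the classes in contiguous blocks
    return [rows[i] for i in order], [labels[i] for i in order]


# Toy data: 18 passes (-1) and 2 fails (1), one feature per row.
rows = [[float(i)] for i in range(20)]
labels = [-1] * 18 + [1] * 2

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(labels), "->", Counter(balanced_labels))
```

Resampling of this kind is normally applied to the training split only; resampling before the train/test split lets duplicated failures leak into the test set and inflates the measured performance.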