
Welding Defect Dataset

The dataset, supplied by Godrej Aerospace, contains multiple parameters influencing the welding process, such as ambient temperature, welding operation temperature, humidity, voltage, current, welding speed, protective gas flow rate, and metal composition. It also includes machine data and detailed information about welders for advanced analysis.

Source: github
Created: Dec 18, 2022
Updated: Dec 18, 2022
Availability: Linked source ready
Overview

Dataset description and usage context

Dataset Overview

Problem Statement

  • Predict welding defects using ML models.
  • Develop algorithms based on the provided parameters.
  • Goal: assist Godrej Aerospace in producing defect‑free products.

Dataset Provided

  • Parameters include: ambient temperature, welding operation temperature, humidity, voltage & current, welding speed, protective‑gas flow rate, metal composition.
  • Machine data and welder details are also included for advanced analysis.
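The parameters above map naturally onto a tabular frame. The sketch below builds a small synthetic pandas DataFrame with that shape; the column names and value ranges are illustrative assumptions, not the dataset's actual schema:

```python
import numpy as np
import pandas as pd

# Hypothetical columns mirroring the listed parameters; the real dataset's
# file name, column names, and units may differ.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "ambient_temp_c": rng.uniform(20, 40, 100),
    "weld_temp_c": rng.uniform(30, 60, 100),
    "humidity_pct": rng.uniform(30, 90, 100),
    "voltage_v": rng.uniform(10, 30, 100),
    "current_a": rng.uniform(80, 200, 100),
    "weld_speed_mm_s": rng.uniform(2, 10, 100),
    "gas_flow_l_min": rng.uniform(8, 20, 100),
})
print(df.shape)  # (100, 7)
print(df.describe().loc[["min", "max"]].round(1))
```

With the real CSV in hand, the same inspection (`df.describe()`, `df.corr()`) is the usual first step before modeling.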

Dataset Analysis

  1. Parameter Correlation:

    • Significant correlations observed between current and voltage, and between temperature and humidity.
    • The current‑voltage relationship aligns with Ohm’s law (V = IR).
    • The positive correlation between temperature and humidity reflects a known physical relationship.
  2. Impact of Welding Operation Temperature:

    • High welding temperatures may cause tungsten inclusions.
    • Recommended welding temperature range: 30‑60 °C.
  3. Effects of Current and Voltage:

    • High current generates more heat, increasing the likelihood of tungsten inclusions.
    • Voltage may also affect tungsten inclusions.
  4. Temperature, Humidity, and Flow Rate on Porosity:

    • Temperature and humidity have limited impact on porosity.
    • Samples with porosity show a higher protective‑gas flow rate than defect‑free and tungsten‑inclusion samples.
  5. Imbalanced Data Handling:

    • Applied oversampling, undersampling, and SMOTE techniques.
    • Combined the three methods to build a balanced, more robust training set.
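The resampling step can be illustrated without extra dependencies. Below is a minimal random-oversampling pass over synthetic data (the class counts and features are made up); in practice, SMOTE from imbalanced-learn (`imblearn.over_sampling.SMOTE`, via `fit_resample`) synthesizes new minority samples rather than duplicating rows:

```python
import numpy as np

# Toy imbalanced set: 90 defect-free (0) vs 10 defective (1) samples.
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 4))
y = np.array([0] * 90 + [1] * 10)

def random_oversample(X, y, rng):
    """Duplicate minority-class rows until all classes match the majority count."""
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    parts_X, parts_y = [], []
    for cls, count in zip(classes, counts):
        idx = np.flatnonzero(y == cls)
        extra = rng.choice(idx, size=target - count, replace=True)
        keep = np.concatenate([idx, extra])
        parts_X.append(X[keep])
        parts_y.append(y[keep])
    return np.vstack(parts_X), np.concatenate(parts_y)

X_bal, y_bal = random_oversample(X, y, rng)
print(np.bincount(y_bal))  # [90 90]
```

Undersampling is the mirror image (drop majority rows down to the minority count); combining resampled sets, as described above, trades some duplication for class balance.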

Models Used

  • Experimented with SVM, AdaBoost, Decision Tree, Random Forest, and Gradient Boosting.
  • Final selection: XGBoost, achieving 96 % accuracy and a strong F1 score.
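A minimal training sketch for this kind of tabular defect classifier, on synthetic stand-in features. scikit-learn's `GradientBoostingClassifier` is substituted here to keep the example dependency-light; `xgboost.XGBClassifier`, the model actually selected, exposes the same `fit`/`predict` interface:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

# Synthetic features standing in for the 7 welding parameters; the label
# rule below is a toy construction, not derived from the real data.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 7))
y = (X[:, 0] + X[:, 3] > 0).astype(int)  # 0 = defect-free, 1 = defect

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

clf = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
pred = clf.predict(X_te)
print(round(accuracy_score(y_te, pred), 3), round(f1_score(y_te, pred), 3))
```

With imbalanced defect data, reporting F1 alongside accuracy (as the report does) matters, since accuracy alone can look strong on a majority-class predictor.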

Model Deployment

  • Deployed with a Tailwind CSS front end and a Flask server on AWS EC2.
  • Model runs efficiently, using only about 30 % of the RAM on a 1 GB EC2 instance; the serialized model is 35 MB.
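A minimal sketch of how such a Flask inference server could look. The route name, payload shape, and the stub model are assumptions; the deployed service would load the trained 35 MB model once at startup instead:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Stand-in for the real model, which would be deserialized once at startup
# (e.g. from a pickled XGBoost classifier).
class StubModel:
    def predict(self, rows):
        return ["defect-free" for _ in rows]

model = StubModel()

@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json()  # expected shape: {"rows": [[...], ...]}
    labels = model.predict(payload["rows"])
    return jsonify({"labels": labels})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```

Loading the model once at module import (rather than per request) is what keeps the memory footprint flat on a small EC2 instance.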