Hiring Decision Analysis Dataset

The dataset contains multiple variables related to recruitment decisions, such as age, gender, education level, work experience, number of previous employers, distance to the company, interview score, skill score, personality score, and recruitment strategy. The target variable is the recruitment decision, classified as hired or not hired.

Source: GitHub
Created: Jul 16, 2024
Updated: Jul 24, 2024

Recruitment Decision Analysis Dataset

Data Description

The dataset includes the following columns:

Variable Description

  • Age:

    • Range: 20 to 50 years
    • Type: Integer
  • Gender:

    • Categories: 0: Male, 1: Female
    • Type: Binary
  • Education Level:

    • Categories: 1: Bachelor (type 1), 2: Bachelor (type 2), 3: Master, 4: Doctorate
    • Type: Categorical
  • Years of Work Experience:

    • Range: 0 to 15 years
    • Type: Integer
  • Number of Previously Worked Companies:

    • Range: 1 to 5 companies
    • Type: Integer
  • Distance to Company:

    • Range: 1 to 50 km
    • Type: Continuous Float
  • Interview Score:

    • Range: 0 to 100
    • Type: Integer
  • Skill Score:

    • Range: 0 to 100
    • Type: Integer
  • Personality Score:

    • Range: 0 to 100
    • Type: Integer
  • Recruitment Strategy:

    • Categories: 1: Aggressive, 2: Moderate, 3: Conservative
    • Type: Categorical
  • Recruitment Decision (Target Variable):

    • Categories: 0: Not hired, 1: Hired
    • Type: Binary Integer
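
The documented ranges and category codes can be checked mechanically before any analysis. The following is a minimal sketch with pandas. The column names (`Age`, `ExperienceYears`, `HiringDecision`, and so on) are hypothetical, since the description above gives only human-readable labels, and the two sample rows are made up for illustration.

```python
import pandas as pd

# Hypothetical column names mapped to the (min, max) bounds documented above.
SCHEMA = {
    "Age": (20, 50),
    "Gender": (0, 1),
    "EducationLevel": (1, 4),
    "ExperienceYears": (0, 15),
    "PreviousCompanies": (1, 5),
    "DistanceFromCompany": (1, 50),
    "InterviewScore": (0, 100),
    "SkillScore": (0, 100),
    "PersonalityScore": (0, 100),
    "RecruitmentStrategy": (1, 3),
    "HiringDecision": (0, 1),
}

def find_violations(df):
    """Return {column: count of rows outside the documented range}."""
    problems = {}
    for col, (lo, hi) in SCHEMA.items():
        if col not in df.columns:
            problems[col] = "missing column"
            continue
        # between() is inclusive on both ends; NaN values count as violations.
        bad = int((~df[col].between(lo, hi)).sum())
        if bad:
            problems[col] = bad
    return problems

# Made-up rows: the second one has an age outside the 20 to 50 range.
sample = pd.DataFrame({
    "Age": [26, 61],
    "Gender": [1, 0],
    "EducationLevel": [2, 3],
    "ExperienceYears": [3, 12],
    "PreviousCompanies": [1, 4],
    "DistanceFromCompany": [12.5, 40.2],
    "InterviewScore": [70, 55],
    "SkillScore": [82, 64],
    "PersonalityScore": [60, 91],
    "RecruitmentStrategy": [1, 3],
    "HiringDecision": [1, 0],
})
print(find_violations(sample))
```

On the sample above, the check flags a single out-of-range `Age` value. With the real file, the keys of `SCHEMA` would need to be renamed to match the actual column headers.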

Exploratory Data Analysis (EDA)

Correlation Analysis

Correlation analysis measures the relationship between each feature and the target variable (recruitment decision), indicating which factors are most strongly associated with the hiring outcome.
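
One way such a ranking might be computed is sketched below. It assumes Pearson correlation via pandas, hypothetical column names, and synthetic data in which the target is deliberately driven by two of the scores. None of the numbers it produces describe the real dataset.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 500

# Synthetic stand-in for a few of the documented columns (names are hypothetical).
df = pd.DataFrame({
    "Age": rng.integers(20, 51, n),
    "ExperienceYears": rng.integers(0, 16, n),
    "InterviewScore": rng.integers(0, 101, n),
    "SkillScore": rng.integers(0, 101, n),
})

# Synthetic target: depends on the two scores plus noise, purely for illustration.
signal = df["InterviewScore"] + df["SkillScore"] + rng.normal(0, 30, n)
df["HiringDecision"] = (signal > 100).astype(int)

# Pearson correlation of every feature with the binary target.
corr = df.corr()["HiringDecision"].drop("HiringDecision")

# Rank features by absolute correlation, strongest first.
ranked = corr.reindex(corr.abs().sort_values(ascending=False).index)
print(ranked)
```

Correlation only captures linear association. For the categorical columns (education level, recruitment strategy), per-category hire rates may be more informative than a single coefficient.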

Age and Recruitment Decision

The age distribution and its relationship to recruitment decisions are examined with histograms and box plots across age groups, to surface age-related trends and potential bias.
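
The numbers behind those plots can be computed directly, as in the sketch below: applicant counts and hire rates per age band (what a histogram would show) and age quartiles per outcome (what a box plot would show). The column names, the five-year bands, and the random data are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 400

# Synthetic data with hypothetical column names; outcomes are random here.
df = pd.DataFrame({
    "Age": rng.integers(20, 51, n),
    "HiringDecision": rng.integers(0, 2, n),
})

# Histogram-style summary: applicants and hire rate per age band.
bins = [20, 25, 30, 35, 40, 45, 51]  # left-inclusive edges; last band covers 45 to 50
labels = ["20-24", "25-29", "30-34", "35-39", "40-44", "45-50"]
df["AgeBand"] = pd.cut(df["Age"], bins=bins, right=False, labels=labels)
by_band = df.groupby("AgeBand", observed=False)["HiringDecision"].agg(["count", "mean"])
by_band.columns = ["applicants", "hire_rate"]

# Box-plot-style summary: age quartiles for not hired (0) and hired (1).
box_stats = df.groupby("HiringDecision")["Age"].quantile([0.25, 0.5, 0.75]).unstack()

print(by_band)
print(box_stats)
```

Large differences in `hire_rate` between bands would be a prompt for closer inspection, not proof of bias on their own, since age may be correlated with experience and other features.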

Years of Work Experience and Recruitment Decision

Scatter and line plots are used to explore how years of work experience relate to the likelihood of being hired.
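
The series a line plot would display is the hire rate at each experience level. The sketch below computes it and fits a straight line as a rough trend summary. The column names are hypothetical, and the synthetic data is generated with a hire probability that rises with experience, which is an assumption and not a finding from the dataset.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
n = 600

years = rng.integers(0, 16, n)
# Assumed relationship for the synthetic data: probability rises from 0.20 to 0.65.
p_hired = 0.2 + 0.03 * years
df = pd.DataFrame({
    "ExperienceYears": years,
    "HiringDecision": (rng.random(n) < p_hired).astype(int),
})

# Hire rate per year of experience: the y-values of a line plot.
rate = df.groupby("ExperienceYears")["HiringDecision"].mean()

# Least-squares line through the per-year rates as a simple trend estimate.
slope, intercept = np.polyfit(rate.index.to_numpy(), rate.to_numpy(), 1)
print(rate.round(2))
print(f"change in hire rate per extra year: {slope:.3f}")
```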

Predictive Modeling

Logistic Regression

A logistic regression model is employed to predict recruitment decisions. Logistic regression is suitable for this binary classification problem, where the target outcome is either hired or not hired.
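
A minimal sketch of such a model with scikit-learn follows. The feature set (three scores), the train/test split, the scaling step, and the synthetic labels are all assumptions made for illustration. The source does not specify preprocessing or which columns were used.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(3)
n = 800

# Synthetic features standing in for interview, skill, and personality scores.
X = rng.integers(0, 101, size=(n, 3)).astype(float)
# Synthetic labels: hired (1) when the noisy score total exceeds 150.
y = (X.sum(axis=1) + rng.normal(0, 25, n) > 150).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scaling first keeps the coefficients comparable and helps the solver converge.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
# Predicted probability of class 1 (hired) for the first held-out applicant.
hire_probability = model.predict_proba(X_test[:1])[0, 1]
print(f"test accuracy: {accuracy:.3f}, P(hired) for first test row: {hire_probability:.3f}")
```

Because the model outputs probabilities, the 0.5 decision threshold used by `predict` can be adjusted if false positives and false negatives carry different costs.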

Hyperparameter Tuning

GridSearchCV (scikit-learn) is used for hyperparameter optimization. The grid covers the regularization penalty (penalty: l1, l2) and the inverse regularization strength (C: 0.01, 0.1, 1, 10, 100), where smaller values of C mean stronger regularization.
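
The grid described above can be sketched as follows. The `liblinear` solver is an assumption, chosen because it supports both the l1 and l2 penalties. The 5-fold cross-validation, accuracy scoring, scaling step, and synthetic data are likewise illustrative choices not stated in the source.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(4)
n = 400

# Synthetic features and labels, for illustration only.
X = rng.integers(0, 101, size=(n, 3)).astype(float)
y = (X.sum(axis=1) + rng.normal(0, 25, n) > 150).astype(int)

# Scaling lives inside the pipeline so each CV fold is scaled on its own training part.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(solver="liblinear")),  # assumed solver; handles l1 and l2
])

# The grid from the text: 2 penalties x 5 values of C = 10 candidates.
param_grid = {
    "clf__penalty": ["l1", "l2"],
    "clf__C": [0.01, 0.1, 1, 10, 100],
}

search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

print(search.best_params_)
print(f"best mean CV accuracy: {search.best_score_:.3f}")
```

After fitting, `search.best_estimator_` holds the pipeline refit on all of the data with the selected parameters, and `search.cv_results_` holds the per-candidate scores.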
