Cereberal-Stroke-Analysis

This dataset is used to analyze and predict stroke with machine learning models. Resampling techniques such as SMOTEENN are applied to address class imbalance and improve prediction accuracy.

Updated 12/12/2023
github

Description

Overview of Dataset Processing Workflow

Data Loading and Import

  • Import pandas, numpy, seaborn, matplotlib.pyplot, and other libraries, then read the CSV file into a pandas DataFrame (df).
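
A minimal sketch of the loading step. The CSV file name and columns are not given here, so the example reads a small in-memory stand-in; the column names (id, gender, age, bmi, stroke) are illustrative assumptions, not the dataset's confirmed schema.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real CSV file; columns are assumed for illustration.
csv_text = """id,gender,age,bmi,stroke
1,Male,67,36.6,1
2,Female,61,,0
3,Male,80,32.5,0
"""

# With a real file this would be pd.read_csv("<path to the CSV>").
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)   # rows and columns loaded
print(df.dtypes)  # inferred type of each column
```

The empty bmi field in the second row is read as NaN, which is the kind of gap the later imputation step fills.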

Exploratory Data Analysis (EDA)

  • Perform basic data exploration with head() and describe().
  • Check and count missing values using isnull().sum().
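
The exploration calls above can be sketched on a toy DataFrame. The columns and values are made up for illustration and do not come from the dataset.

```python
import numpy as np
import pandas as pd

# Toy data with assumed column names; two bmi values are deliberately missing.
df = pd.DataFrame({
    "age": [67, 61, 80, 49],
    "bmi": [36.6, np.nan, 32.5, np.nan],
    "stroke": [1, 0, 0, 0],
})

print(df.head())      # first rows, to sanity-check the load
print(df.describe())  # count, mean, std, min, quartiles, max per numeric column

# Number of missing values in each column.
missing_counts = df.isnull().sum()
print(missing_counts)
```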

Handling Categorical Variables

  • Apply pd.get_dummies() for one‑hot encoding of categorical variables.
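
A small sketch of one‑hot encoding with pd.get_dummies(). The gender column is an assumed example of a categorical variable; which columns the original notebook encodes is not stated here.

```python
import pandas as pd

# Toy data; "gender" stands in for any categorical column.
df = pd.DataFrame({
    "gender": ["Male", "Female", "Female"],
    "age": [67, 61, 80],
})

# Each category becomes its own indicator column; numeric columns pass through.
encoded = pd.get_dummies(df, columns=["gender"])
print(encoded.columns.tolist())
```

The original gender column is replaced by gender_Female and gender_Male, so every feature is numeric before imputation and scaling.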

Handling Missing Values

  • Fill missing values using the KNNImputer algorithm.
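
A minimal sketch of KNN-based imputation with scikit-learn's KNNImputer. The toy columns and n_neighbors=1 are assumptions chosen to keep the example easy to follow; the setting used in the original analysis is not stated here.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer

# Toy numeric data with one missing bmi value (assumed columns).
df = pd.DataFrame({
    "age": [20.0, 22.0, 60.0, 62.0],
    "bmi": [21.0, np.nan, 30.0, 32.0],
})

# Each missing value is replaced using the nearest rows, measured on the
# features that are present. n_neighbors=1 is an illustrative choice.
imputer = KNNImputer(n_neighbors=1)
imputed = pd.DataFrame(imputer.fit_transform(df), columns=df.columns)

print(imputed)
```

Here the row with age 22 is closest to the row with age 20, so it takes that row's bmi.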

Feature Scaling and Train‑Test Split

  • Perform feature scaling with MinMaxScaler.
  • Split the dataset into training and testing sets.
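
A sketch of scaling and splitting on synthetic data. Note one deliberate choice that may differ from the original notebook: the split happens first and MinMaxScaler is fitted on the training set only, a common way to keep test-set statistics out of preprocessing. The test size, stratification, and random seeds are assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Synthetic features and an imbalanced label (90 negatives, 10 positives).
rng = np.random.default_rng(0)
X = rng.normal(loc=50, scale=10, size=(100, 3))
y = np.array([0] * 90 + [1] * 10)

# Stratify so both splits keep the same class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Fit the scaler on training data, then apply the same mapping to test data.
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(X_train_scaled.min(axis=0), X_train_scaled.max(axis=0))
```

Scaled test values can fall slightly outside [0, 1] because the range is learned from the training set alone.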

Model Selection and Evaluation

  • Conduct preliminary testing with models such as KNeighborsClassifier, GaussianNB, DecisionTreeClassifier, and RandomForestClassifier.
  • Generate a classification report to evaluate model performance on the imbalanced dataset.
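
The baseline comparison can be sketched as follows. Synthetic imbalanced data from make_classification stands in for the stroke dataset, and all hyperparameters are library defaults or assumptions, not the original settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with roughly 5% positives, standing in for the real dataset.
X, y = make_classification(
    n_samples=1000, n_features=8, weights=[0.95, 0.05], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

models = {
    "KNeighborsClassifier": KNeighborsClassifier(),
    "GaussianNB": GaussianNB(),
    "DecisionTreeClassifier": DecisionTreeClassifier(random_state=0),
    "RandomForestClassifier": RandomForestClassifier(random_state=0),
}

reports = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    # zero_division=0 avoids warnings when a model predicts no positives.
    reports[name] = classification_report(y_test, y_pred, zero_division=0)
    print(name)
    print(reports[name])
```

On imbalanced data the per-class rows of the report matter more than overall accuracy, since a model can score high accuracy while missing most positive cases.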

Data Resampling

  • Apply SMOTE for oversampling.
  • Perform random undersampling to balance class distribution.
  • Use SMOTEENN to combine oversampling and undersampling.

Post‑Resampling Model Evaluation

  • Retrain and evaluate models on the oversampled, undersampled, and combined sampled datasets.

Conclusion

  • Resampling, and SMOTEENN in particular, substantially improves the models’ ability to identify positive stroke cases (recall on the minority class) compared with training on the imbalanced data.

Topics

Stroke
Machine Learning

Source

Organization: github

Created: 12/12/2023
