Heart Disease Health Indicators Dataset

Using data from the Behavioral Risk Factor Surveillance System (BRFSS), identify which risk factors are the strongest indicators of heart disease, explore correlations among risk factors, and attempt to predict heart disease when indicators are present.

Source: GitHub
Created: Aug 27, 2022
Updated: Dec 10, 2022

Dataset Overview

Title

  • Heart Disease Prediction and Risk Factors

Source

Objectives

  • Determine the strongest risk factors for heart disease.
  • Explore correlations among risk factors.
  • Attempt to predict heart disease given the presence of risk indicators.
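
As a rough illustration of the first two objectives, the sketch below ranks candidate risk factors by the absolute value of their correlation with the outcome. The data is synthetic and the column names (`high_bp`, `smoker`, `bmi`, `heart_disease`) are assumptions made for the example; they are not taken from the dataset itself.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the BRFSS-derived table.
# All column names here are hypothetical, chosen only for illustration.
rng = np.random.default_rng(0)
n = 2000
high_bp = rng.integers(0, 2, n)
smoker = rng.integers(0, 2, n)
bmi = rng.normal(28, 5, n)

# The outcome is made to depend mostly on high_bp so the ranking has a signal to find.
logit = -2.0 + 1.5 * high_bp + 0.3 * smoker
heart_disease = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

df = pd.DataFrame(
    {"high_bp": high_bp, "smoker": smoker, "bmi": bmi, "heart_disease": heart_disease}
)


def rank_by_correlation(frame: pd.DataFrame, target: str) -> pd.Series:
    """Return Pearson correlations with `target`, strongest (by magnitude) first."""
    corr = frame.corr()[target].drop(target)
    order = corr.abs().sort_values(ascending=False).index
    return corr.reindex(order)


ranking = rank_by_correlation(df, "heart_disease")
print(ranking)
```

Correlation only measures linear association, so a ranking like this is a starting point for exploration and not evidence of causation.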

Data Processing

  • Data Cleaning: Check for inconsistencies, handle missing values, and use outlier analysis for trimming and smoothing.
  • Pre‑processing: Ensure reliability and interpretability; perform dimensionality reduction and transformation.
  • Integration: Conduct correlation analysis to detect redundancy and resolve data conflicts.
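
The cleaning step above could be sketched with pandas roughly as follows. This is one possible reading, assuming median imputation for missing values and IQR-based clipping for outliers; the helper name `clean` and the columns `bmi` and `smoker` are hypothetical, and the source does not state which techniques were actually applied.

```python
import numpy as np
import pandas as pd


def clean(frame: pd.DataFrame, numeric_cols: list[str]) -> pd.DataFrame:
    """Drop duplicate rows, impute missing values, and clip outliers (assumed approach)."""
    out = frame.drop_duplicates().copy()
    for col in numeric_cols:
        # Fill gaps with the column median, which is robust to extreme values.
        out[col] = out[col].fillna(out[col].median())
        # Clip values lying more than 1.5 * IQR outside the quartiles.
        q1, q3 = out[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        out[col] = out[col].clip(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    return out


# Small made-up sample: one missing value, one extreme value, one duplicate row.
raw = pd.DataFrame(
    {
        "bmi": [22.0, 25.0, np.nan, 27.0, 24.0, 95.0, 22.0],
        "smoker": [0, 1, 0, 1, 0, 1, 0],
    }
)
cleaned = clean(raw, numeric_cols=["bmi"])
print(cleaned)
```

Clipping keeps every row while limiting the influence of extreme values; dropping the affected rows instead would be an equally reasonable choice, depending on how trustworthy the extremes are.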

Tools

  • Programming: Python / R
  • Data handling: pandas
  • Visualization: matplotlib, seaborn, graphviz
  • Analysis: scikit‑learn

Research Questions

  1. Can heart disease be accurately predicted from common health indicators?
  2. Which attributes are most influential for coronary heart disease risk?
  3. How can this information be made more accessible to the public?
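
Questions 1 and 2 could be approached along the following lines with scikit‑learn. This is a minimal sketch on synthetic data, assuming a logistic regression on standardized features; the feature names are hypothetical, and the source does not say which model performed best.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic indicators; names are hypothetical stand-ins for health indicators.
rng = np.random.default_rng(42)
n = 2000
X = pd.DataFrame(
    {
        "high_bp": rng.integers(0, 2, n),
        "high_chol": rng.integers(0, 2, n),
        "bmi": rng.normal(28, 5, n),
    }
)
logit = -2.0 + 1.5 * X["high_bp"] + 0.8 * X["high_chol"]
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Standardizing first makes the coefficient magnitudes comparable across features.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Question 1: how well does the model separate cases from non-cases?
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])

# Question 2: which attributes carry the largest standardized coefficients?
coefs = pd.Series(model.named_steps["logisticregression"].coef_[0], index=X.columns)
influence = coefs.abs().sort_values(ascending=False)

print(f"ROC AUC: {auc:.3f}")
print(influence)
```

ROC AUC is used here instead of plain accuracy because heart disease is typically a minority class in survey data, where a model that always predicts "no disease" can still score a high accuracy.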

Applications

  • A simple web app built on the best model lets users enter seven statistically significant, high‑impact factors and returns a high or low risk assessment for heart disease.
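
The core of such an app might look like the sketch below: a fitted classifier's predicted probability is mapped to a "high" or "low" label. The source does not list the seven factors, so the three factor names, the toy training data, the `assess_risk` helper, and the 0.5 threshold are all assumptions made for illustration.

```python
from itertools import product

import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical yes/no factors; the app's actual seven inputs are not given in the source.
FACTORS = ["high_bp", "high_chol", "smoker"]

# Toy training data: every combination of answers, labeled "at risk"
# when two or more factors are present. Purely illustrative.
grid = pd.DataFrame(list(product([0, 1], repeat=len(FACTORS))), columns=FACTORS)
labels = (grid.sum(axis=1) >= 2).astype(int)
model = LogisticRegression().fit(grid, labels)


def assess_risk(model, answers: dict, threshold: float = 0.5) -> str:
    """Map the model's predicted probability to a 'high' or 'low' label."""
    row = pd.DataFrame([answers], columns=FACTORS)
    probability = model.predict_proba(row)[0, 1]
    return "high" if probability >= threshold else "low"


print(assess_risk(model, {"high_bp": 1, "high_chol": 1, "smoker": 1}))
print(assess_risk(model, {"high_bp": 0, "high_chol": 0, "smoker": 0}))
```

In a screening setting the threshold would normally be tuned on validation data, since missing a true case and raising a false alarm carry different costs.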