ISIC Archive Skin Lesion Dataset
The ISIC Archive skin‑lesion dataset was created jointly by Forthcoming University of Applied Sciences and Eindhoven University of Technology and contains 71,035 skin‑lesion images. It is used to study gender bias and fairness in diagnostic models. The dataset splits were built with a linear‑programming model that balances gender, age, and lesion type, with the aim of mitigating gender bias in medical image diagnosis. Its primary applications are skin‑lesion classification and fairness research.
Dataset description and usage context
Dataset Overview
Research Objective
This study systematically evaluates the diagnostic accuracy of various convolutional neural network (CNN) architectures on skin‑lesion images, with particular attention to how demographic parameters such as gender influence performance.
Dataset Construction
- A balanced test set was used.
- Five training sets of equal size were built with female‑to‑male ratios of: all‑female, 75:25, 50:50, 25:75, all‑male.
- All six datasets (the test set and the five training sets) maintain a 50:50 benign‑to‑malignant ratio.
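The construction above can be illustrated with a small stratified-sampling sketch. This is not the authors' method (the repository uses a MATLAB linear‑programming model that also balances age and lesion type); it only shows how five equally sized training sets with the listed female‑to‑male ratios and a 50:50 benign‑to‑malignant split could be drawn. The record fields `sex` and `label`, the helper names, the set size of 400, and the choice to apply the 50:50 class split within each gender are all assumptions made for illustration.

```python
import random
from collections import Counter

# Female share of each training set, taken from the five ratios listed above.
TRAIN_RATIOS = {
    "all-female": 1.00,
    "75F:25M": 0.75,
    "50F:50M": 0.50,
    "25F:75M": 0.25,
    "all-male": 0.00,
}


def stratum_quotas(total, female_fraction, malignant_fraction=0.5):
    """Return the target image count for each (sex, label) cell.

    Assumption: the benign/malignant split is applied within each gender.
    """
    n_female = round(total * female_fraction)
    n_male = total - n_female
    quotas = {}
    for sex, n in (("female", n_female), ("male", n_male)):
        n_malignant = round(n * malignant_fraction)
        quotas[(sex, "malignant")] = n_malignant
        quotas[(sex, "benign")] = n - n_malignant
    return quotas


def sample_split(records, total, female_fraction, seed=0):
    """Draw one training set by sampling each cell without replacement."""
    rng = random.Random(seed)
    by_cell = {}
    for rec in records:
        by_cell.setdefault((rec["sex"], rec["label"]), []).append(rec)
    chosen = []
    for cell, quota in stratum_quotas(total, female_fraction).items():
        pool = by_cell.get(cell, [])
        if quota > len(pool):
            raise ValueError(f"not enough records for {cell}: need {quota}")
        chosen.extend(rng.sample(pool, quota))
    return chosen


# Synthetic stand-in for the metadata table (hypothetical field names).
records = [
    {"id": f"{sex}-{label}-{i}", "sex": sex, "label": label}
    for sex in ("female", "male")
    for label in ("benign", "malignant")
    for i in range(300)
]

# Build all five training sets at the same size and report their composition.
splits = {
    name: sample_split(records, 400, fraction)
    for name, fraction in TRAIN_RATIOS.items()
}
for name, split in splits.items():
    print(name, dict(Counter((r["sex"], r["label"]) for r in split)))
```

Sampling each (sex, label) cell separately keeps both ratios exact by construction, which is simpler than the linear‑programming approach but cannot balance additional attributes such as age at the same time.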
Data Source
The dataset comprises metadata from the ISIC Archive, which is described in the following references:
- Codella, N., et al. (2019)
- Codella, N.C.F., et al. (2018)
- Combalia, M., et al. (2019)
- Gutman, D., et al. (2016)
- Tschandl, P., et al. (2018)
- Rotemberg, V., et al. (2021)
Code Structure
- 0_data: Collected skin‑lesion metadata.
- 1_code: Baseline and multi‑task models, experiment definitions, and MATLAB code.
  - single task: 0_baseline.py (Keras/TensorFlow)
  - reinforcing: 1_mtl_strengthen.py (Keras/TensorFlow)
  - adversarial: br-net.py (PyTorch)
  - MATLAB folder: Linear‑programming model for creating dataset distributions.
  - Experiments folder: Runs various model‑dataset combinations.
    - e1: 50F:50M (run-e1: base, run-e1m: reinforcing, run-e1br: adversarial)
    - e5: all‑female
    - e7: all‑male
    - e8: 25F:75M
    - e9: 75F:25M