ISIC Archive Skin Lesion Dataset
The ISIC Archive skin‑lesion dataset, jointly created by Forthcoming University of Applied Sciences and Eindhoven University of Technology, contains 71,035 skin‑lesion images. It is used to study gender bias and fairness in diagnostic models. Linear programming was used during construction to balance gender, age, and lesion type, with the aim of mitigating gender bias in medical image diagnosis. Its primary applications are skin‑lesion classification and fairness research.
Description
Dataset Overview
Research Objective
This study systematically evaluates the diagnostic accuracy of various convolutional neural network (CNN) architectures on skin‑lesion images, with particular attention to how demographic parameters such as gender influence performance.
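As a rough illustration of what evaluating accuracy with attention to gender can look like, the sketch below computes accuracy separately for each gender group and the gap between the two. The function name `accuracy_by_group` and the toy labels are hypothetical and are not taken from the project's code.

```python
from collections import defaultdict

def accuracy_by_group(y_true, y_pred, groups):
    """Return {group: accuracy} for parallel lists of labels, predictions, and group tags."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {g: correct[g] / total[g] for g in total}

# Hypothetical toy data (not from the dataset): 1 = malignant, 0 = benign.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 0, 1, 0, 1, 0]
gender = ["F", "F", "F", "F", "M", "M", "M", "M"]

per_gender = accuracy_by_group(y_true, y_pred, gender)
gap = abs(per_gender["F"] - per_gender["M"])
print(per_gender, gap)
```

Reporting the per-group values alongside the gap, rather than a single pooled accuracy, makes it visible when a model performs well overall but unevenly across groups.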
Dataset Construction
- A balanced test set was used.
- Five training sets of equal size were built with the following female‑to‑male compositions: all‑female, 75:25, 50:50, 25:75, and all‑male.
- All six datasets (the test set and the five training sets) maintain a 50:50 benign‑to‑malignant ratio.
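The composition rules above can be sketched as a small calculation of how many images each training set needs per gender and label. This is a minimal sketch under two assumptions that the description does not confirm: the training set size of 1,000 is a placeholder, and the 50:50 benign-to-malignant split is applied within each gender rather than only overall. The names `FEMALE_SHARES` and `cell_counts` are hypothetical.

```python
# Female share of each training set, taken from the ratios listed above.
FEMALE_SHARES = {
    "all-female": 1.0,
    "75F:25M": 0.75,
    "50F:50M": 0.5,
    "25F:75M": 0.25,
    "all-male": 0.0,
}

def cell_counts(n_total, female_share):
    """Split n_total images into (gender, label) cells.

    Assumption: the 50:50 benign-to-malignant ratio holds within each gender.
    """
    n_female = round(n_total * female_share)
    n_male = n_total - n_female
    counts = {}
    for gender, n in (("female", n_female), ("male", n_male)):
        counts[(gender, "benign")] = n // 2
        counts[(gender, "malignant")] = n - n // 2
    return counts

N_TOTAL = 1000  # placeholder size; the actual training set size is not stated here
plan = {name: cell_counts(N_TOTAL, share) for name, share in FEMALE_SHARES.items()}

for name, counts in plan.items():
    print(name, counts)
```

Keeping the total fixed across all five sets means that differences in model performance can be attributed to the gender mix rather than to the amount of training data.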
Data Source
The dataset draws on metadata from the ISIC Archive; the relevant references are:
- Codella, N., et al. (2019)
- Codella, N.C.F., et al. (2018)
- Combalia, M., et al. (2019)
- Gutman, D., et al. (2016)
- Tschandl, P., et al. (2018)
- Veronica, R., et al. (2021)
Code Structure
- 0_data: Collected skin‑lesion metadata.
- 1_code: Baseline and multi‑task models, experiment definitions, and MATLAB code.
  - single task: 0_baseline.py (Keras/TensorFlow)
  - reinforcing: 1_mtl_strengthen.py (Keras/TensorFlow)
  - adversarial: br‑net.py (PyTorch)
  - MATLAB folder: Linear‑programming model for creating dataset distributions.
  - Experiments folder: Runs various model‑dataset combinations.
    - e1: 50F:50M (run‑e1: base, run‑e1m: reinforcing, run‑e1br: adversarial)
    - e5: all‑female
    - e7: all‑male
    - e8: 25F:75M
    - e9: 75F:25M
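The experiment layout can be summarized as a small lookup table, which may help when scripting over model-dataset combinations. This is a sketch only: `EXPERIMENTS`, `VARIANT_SUFFIX`, and `run_name` are hypothetical names, and the run-name suffixes are confirmed only for e1, so applying the same pattern to the other experiments is an assumption.

```python
# Experiment ID -> (female %, male %) of the training set, as listed above.
EXPERIMENTS = {
    "e1": (50, 50),
    "e5": (100, 0),
    "e7": (0, 100),
    "e8": (25, 75),
    "e9": (75, 25),
}

# Model variant -> run-name suffix, inferred from run-e1, run-e1m, run-e1br.
VARIANT_SUFFIX = {"base": "", "reinforcing": "m", "adversarial": "br"}

def run_name(experiment, variant):
    """Build a run name such as 'run-e1m'.

    Assumption: the suffix pattern documented for e1 also applies to the
    other experiments.
    """
    if experiment not in EXPERIMENTS:
        raise KeyError(f"unknown experiment: {experiment}")
    return f"run-{experiment}{VARIANT_SUFFIX[variant]}"

for exp, (female, male) in EXPERIMENTS.items():
    names = [run_name(exp, v) for v in VARIANT_SUFFIX]
    print(f"{exp}: {female}F:{male}M -> {names}")
```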
Source
Organization: arXiv
Created: 7/24/2024