ISIC Archive Skin Lesion Dataset

The ISIC Archive skin-lesion dataset, created jointly by Forthcoming University of Applied Sciences and Eindhoven University of Technology, contains 71,035 skin-lesion images and is used to study gender bias and fairness in diagnostic models. Linear programming was used during construction to balance gender, age, and lesion type, with the aim of mitigating gender bias in medical image diagnosis. Its primary applications are skin-lesion classification and fairness research.

Updated 7/24/2024

arXiv

Description

Dataset Overview

Research Objective

This study systematically evaluates the diagnostic accuracy of various convolutional neural network (CNN) architectures on skin‑lesion images, with particular attention to how demographic parameters such as gender influence performance.

Dataset Construction

A balanced test set was used.
Five training sets of equal size were built with female-to-male ratios of 100:0 (all-female), 75:25, 50:50, 25:75, and 0:100 (all-male).
All six datasets (the test set and the five training sets) maintain a 50:50 benign-to-malignant ratio.
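
The construction above can be illustrated with a small stratified-sampling sketch. It is not the authors' method (the description says the distributions were produced with a linear-programming model in MATLAB); it only shows how the stated ratios translate into per-group image counts. The record keys `sex` and `label`, the function names, and the choice to keep the 50:50 benign-to-malignant split within each sex (rather than only overall) are assumptions made for illustration.

```python
import random

# Female fractions of the five training sets described above.
FEMALE_FRACTIONS = [1.0, 0.75, 0.5, 0.25, 0.0]


def stratum_quotas(n_total, female_fraction, malignant_fraction=0.5):
    """Return how many images to draw for each (sex, label) group."""
    n_female = round(n_total * female_fraction)
    n_male = n_total - n_female
    quotas = {}
    for sex, n_sex in (("female", n_female), ("male", n_male)):
        n_malignant = round(n_sex * malignant_fraction)
        quotas[(sex, "malignant")] = n_malignant
        quotas[(sex, "benign")] = n_sex - n_malignant
    return quotas


def sample_training_set(records, n_total, female_fraction, seed=0):
    """Draw a training set with the requested sex and label proportions.

    `records` is a list of dicts with hypothetical keys "sex" and "label".
    """
    rng = random.Random(seed)
    chosen = []
    for (sex, label), quota in stratum_quotas(n_total, female_fraction).items():
        pool = [r for r in records if r["sex"] == sex and r["label"] == label]
        if len(pool) < quota:
            raise ValueError(f"not enough {sex}/{label} images: {len(pool)} < {quota}")
        chosen.extend(rng.sample(pool, quota))
    rng.shuffle(chosen)
    return chosen


# Synthetic metadata standing in for the real ISIC records.
records = [
    {"id": f"{sex}-{label}-{i}", "sex": sex, "label": label}
    for sex in ("female", "male")
    for label in ("benign", "malignant")
    for i in range(400)
]

for fraction in FEMALE_FRACTIONS:
    subset = sample_training_set(records, n_total=200, female_fraction=fraction)
    n_female = sum(r["sex"] == "female" for r in subset)
    n_malignant = sum(r["label"] == "malignant" for r in subset)
    print(f"female={n_female:3d} male={200 - n_female:3d} malignant={n_malignant:3d}")
```

Fixing the seed makes each subset reproducible. The explicit pool-size check fails loudly when a group is too small for the requested ratio, which is the main practical constraint when every training set must have the same size.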

Data Source

The dataset is built from ISIC Archive metadata, drawing on the following references:

Codella, N., et al. (2019)
Codella, N.C.F., et al. (2018)
Combalia, M., et al. (2019)
Gutman, D., et al. (2016)
Tschandl, P., et al. (2018)
Veronica, R., et al. (2021)

Code Structure

0_data: Collected skin‑lesion metadata.
1_code: Baseline and multi‑task models, experiment definitions, and MATLAB code.
- single-task: 0_baseline.py (Keras/TensorFlow)
- reinforcing: 1_mtl_strengthen.py (Keras/TensorFlow)
- adversarial: br-net.py (PyTorch)
- MATLAB folder: Linear-programming model for creating dataset distributions.
- Experiments folder: Runs various model-dataset combinations.
  - e1: 50F:50M (run-e1: base, run-e1m: reinforcing, run-e1br: adversarial)
  - e5: all‑female
  - e7: all‑male
  - e8: 25F:75M
  - e9: 75F:25M
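
The experiment layout listed above can be summarized as a small lookup table. This is a hypothetical helper, not part of the repository: the experiment IDs, ratios, and script names come from the listing (file and run names are written here with plain ASCII hyphens), while the assumption that the `m` and `br` run suffixes shown for e1 also apply to the other experiments is not confirmed by the source.

```python
# Female fraction of the training set for each experiment ID in the listing.
EXPERIMENTS = {
    "e1": 0.50,
    "e5": 1.00,  # all-female
    "e7": 0.00,  # all-male
    "e8": 0.25,
    "e9": 0.75,
}

# Run-name suffix per model variant; only confirmed for e1 in the listing.
VARIANT_SUFFIX = {"base": "", "reinforcing": "m", "adversarial": "br"}

# Script implementing each variant, as named in the code structure.
VARIANT_SCRIPT = {
    "base": "0_baseline.py",
    "reinforcing": "1_mtl_strengthen.py",
    "adversarial": "br-net.py",
}


def run_name(experiment, variant):
    """Build a run name such as 'run-e1br' (suffix convention assumed)."""
    if experiment not in EXPERIMENTS:
        raise KeyError(f"unknown experiment: {experiment}")
    return f"run-{experiment}{VARIANT_SUFFIX[variant]}"


def describe(experiment, variant):
    """One-line summary of a run: name, training-set sex ratio, script."""
    pct_female = round(EXPERIMENTS[experiment] * 100)
    ratio = f"{pct_female}F:{100 - pct_female}M"
    return f"{run_name(experiment, variant)}: {ratio}, {VARIANT_SCRIPT[variant]}"


for exp in EXPERIMENTS:
    for variant in VARIANT_SUFFIX:
        print(describe(exp, variant))
```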

Topics

Skin Lesion Classification

Model Fairness

Source

Organization: arXiv

Created: 7/24/2024
