Dataset asset, Open Source Community, Image Classification, Cancer Research

1aurent/LC25000

The dataset contains 25,000 color images divided into five classes, each class comprising 5,000 images. All images are 768 × 768 pixels in JPEG format.

Source: hugging_face
Created: Nov 28, 2025
Updated: May 25, 2024
Signals: 260 views
Availability: Linked source ready
Overview

Dataset description and usage context

LC25000: Lung and Colon Histopathological Image Dataset

Dataset Description

  • Features:

    • image: Image data
    • organ: Organ label (lung or colon)
    • label: Pathology label (benign, adenocarcinomas, squamous carcinomas)
  • Split:

    • train: Training set containing 25,000 samples, size 1,581,800,190 bytes
  • Dataset Size:

    • Download size: 1,125,348,716 bytes
    • Dataset size: 1,581,800,190 bytes
  • Tags:

    • biology, cancer
  • Size Category:

    • 10K<n<100K
  • License:

    • unlicense
  • Task Category:

    • image-classification
  • Papers with Code ID:

    • lc25000
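
The five classes follow from the two label fields: three lung pathologies plus two colon pathologies (LC25000 has no colon squamous class). The following is a minimal sketch, assuming the Hugging Face `datasets` library is installed and that the label strings match those in the feature list above. The names `FIVE_CLASSES`, `combined_class`, and `load_lc25000` are illustrative and not part of the dataset.

```python
# Valid (organ, label) pairs. The strings are taken from the feature list
# above and may differ from the exact names stored in the dataset.
FIVE_CLASSES = [
    ("lung", "benign"),
    ("lung", "adenocarcinomas"),
    ("lung", "squamous carcinomas"),
    ("colon", "benign"),
    ("colon", "adenocarcinomas"),
]


def combined_class(organ: str, label: str) -> int:
    """Map an (organ, label) pair to a single class index in 0..4."""
    try:
        return FIVE_CLASSES.index((organ, label))
    except ValueError:
        raise ValueError(f"unexpected combination: {organ!r}, {label!r}") from None


def load_lc25000():
    """Download the single train split (roughly 1.1 GB) from the Hub."""
    # Imported lazily so the helper above works without the library installed.
    from datasets import load_dataset

    return load_dataset("1aurent/LC25000", split="train")
```

If the `organ` and `label` columns turn out to be stored as integer class labels, `ds.features["organ"].int2str(value)` should convert them to strings before calling `combined_class`.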

Detailed Description

The dataset comprises 25,000 color images split into five categories, each containing 5,000 images. All images are 768 × 768 pixels and stored as JPEG files.
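
Because the dataset ships only a train split, any held-out set has to be carved out by the user. Below is a minimal sketch of a stratified split over per-sample class indices, using only the standard library. The function name, the 80/20 ratio, and the seed are assumptions for illustration, and the synthetic `class_ids` list stands in for the real labels.

```python
import random
from collections import defaultdict


def stratified_split(class_ids, test_fraction=0.2, seed=0):
    """Return (train_indices, test_indices) with equal class proportions."""
    # Group sample indices by their class.
    by_class = defaultdict(list)
    for index, class_id in enumerate(class_ids):
        by_class[class_id].append(index)

    rng = random.Random(seed)
    train, test = [], []
    for class_id in sorted(by_class):
        indices = by_class[class_id]
        rng.shuffle(indices)
        # Take the same fraction from every class.
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Stand-in for the real labels: five classes with 5,000 samples each.
class_ids = [c for c in range(5) for _ in range(5000)]
train_idx, test_idx = stratified_split(class_ids)
print(len(train_idx), len(test_idx))  # 20000 5000
```

With the Hugging Face `datasets` library, the resulting index lists could then be passed to `ds.select(train_idx)` and `ds.select(test_idx)`.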

Citation

@misc{borkowski2019lung,
  title         = {Lung and Colon Cancer Histopathological Image Dataset (LC25000)},
  author        = {Andrew A. Borkowski and Marilyn M. Bui and L. Brannon Thomas and Catherine P. Wilson and Lauren A. DeLand and Stephen M. Mastorides},
  year          = {2019},
  eprint        = {1912.12142},
  archiveprefix = {arXiv},
  primaryclass  = {eess.IV}
}