Dataset asset | Open Source Community | Image Segmentation | Crack Detection

rimvydasrub/crackseg9k

According to its authors, this is the largest, most diverse, and most consistent crack‑segmentation dataset built to date. It contains 9 255 images aggregated from smaller open‑source datasets and pre‑processed to a resolution of 400 × 400 pixels. It comprises ten sub‑datasets, including Crack500 and Deepcrack.

Source

hugging_face

Created

Nov 28, 2025

Updated

Jun 14, 2024

Overview

Dataset description and usage context

Crackseg9k Dataset Overview

Basic Information

Dataset Name: crackseg9k
Version: 4.0.0
Homepage: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/EGIEBY
License: Creative Commons CC0 1.0 Universal (CC0 1.0) Public Domain Dedication

Description

This dataset is currently the largest, most diverse, and most consistent crack‑segmentation dataset.
It includes 9 255 images combined from multiple smaller open‑source datasets.
The dataset consists of 10 sub‑datasets: Crack500, Deepcrack, Sdnet, Cracktree, Gaps, Volker, Rissbilder, Noncrack, Masonry, and Ceramic.
All images have been pre‑processed and resized to 400 × 400 pixels.

Citation

@article{kulkarni2022crackseg9k,
  title={CrackSeg9k: A Collection and Benchmark for Crack Segmentation Datasets and Frameworks},
  author={Kulkarni, Shreyas and Singh, Shreyas and Balakrishnan, Dhananjay and Sharma, Siddharth and Devunuri, Saipraneeth and Korlapati, Sai Chowdeswara Rao},
  journal={arXiv preprint arXiv:2208.13054},
  year={2022}
}

Data Split

Training set: data/train.parquet
Test set: data/test.parquet
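
The two split files above could be fetched roughly as follows. This is a minimal sketch, not an official loader: it assumes the repository is public, that the files sit on the default `main` revision, and that the standard Hugging Face `resolve` URL layout applies. The names `REPO_ID`, `SPLIT_FILES`, `split_url`, and `load_split` are hypothetical helpers introduced for illustration, and `load_split` additionally assumes pandas with a Parquet engine (such as pyarrow) is installed.

```python
# Repository and split paths as listed on this page.
REPO_ID = "rimvydasrub/crackseg9k"
SPLIT_FILES = {
    "train": "data/train.parquet",
    "test": "data/test.parquet",
}


def split_url(split, revision="main"):
    """Build a download URL for one split.

    Assumption: the file lives on the given revision ("main" by default)
    and the usual Hugging Face "resolve" URL layout applies.
    """
    path = SPLIT_FILES[split]  # raises KeyError for an unknown split name
    return f"https://huggingface.co/datasets/{REPO_ID}/resolve/{revision}/{path}"


def load_split(split):
    """Read one split into a DataFrame (needs network access and pandas)."""
    import pandas as pd  # third-party; imported lazily so split_url works without it

    return pd.read_parquet(split_url(split))
```

Calling `load_split("train")` downloads the whole training file, so it may be worth checking the file size on the source page first.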

Features

image: data type base64
mask: data type base64
head: data type base64
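
Because all three fields are listed as base64, each value has to be decoded before it can be used as an image. The sketch below shows one way to do that with the standard library only. It assumes each field holds a base64 string (or bytes) wrapping an encoded image file; the page does not state the underlying image format, so the helper only checks for PNG and JPEG signatures and otherwise reports "unknown". The function names `decode_field` and `sniff_image_format` are hypothetical.

```python
import base64

# File signatures used to guess the container format of the decoded bytes.
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"
JPEG_MAGIC = b"\xff\xd8\xff"


def decode_field(value):
    """Decode one base64 field (str or bytes) into raw bytes."""
    if isinstance(value, str):
        value = value.encode("ascii")
    return base64.b64decode(value)


def sniff_image_format(data):
    """Return "png", "jpeg", or "unknown" based on the leading bytes."""
    if data.startswith(PNG_MAGIC):
        return "png"
    if data.startswith(JPEG_MAGIC):
        return "jpeg"
    return "unknown"
```

If the decoded bytes turn out to be a PNG or JPEG file, they can be wrapped in `io.BytesIO` and passed to an image library of your choice. The meaning of the `head` field is not documented on this page, so inspect a decoded sample before relying on it.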
