yh0701/FracAtlas_dataset
The FracAtlas dataset is a musculoskeletal radiographic image collection for fracture classification, localisation, and segmentation. It includes 4,083 X‑ray images (717 with fractures) and provides annotations in COCO, VGG, YOLO, and Pascal VOC formats. The dataset is intended for deep‑learning tasks in medical imaging, particularly fracture understanding. It is freely available under CC‑BY 4.0.
Description
Dataset Card: FracAtlas
Overview
"FracAtlas" is a collection of musculoskeletal radiographic images for fracture classification, localisation, and segmentation. It contains 4,083 X‑ray images (717 of which depict fractures) and provides annotations in COCO, VGG, YOLO, and Pascal VOC formats. The dataset is designed for deep‑learning tasks in medical imaging, especially for understanding fractures.
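The counts above imply a marked class imbalance, which matters when training a classifier. As a minimal sketch, assuming a binary fractured/non‑fractured setup and inverse‑frequency weighting (a common heuristic, not something the dataset authors prescribe), class weights could be derived as follows. The function name `inverse_frequency_weights` is hypothetical.

```python
def inverse_frequency_weights(counts):
    """Return weight = total / (n_classes * count) for each class label."""
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}

# Counts taken from the dataset description: 4,083 images, 717 with fractures.
TOTAL_IMAGES = 4083
FRACTURED = 717
counts = {"fractured": FRACTURED, "non_fractured": TOTAL_IMAGES - FRACTURED}

fracture_rate = FRACTURED / TOTAL_IMAGES
weights = inverse_frequency_weights(counts)

print(f"fracture rate: {fracture_rate:.1%}")
for label, w in weights.items():
    print(f"{label}: weight {w:.3f}")
```

The rarer fractured class receives the larger weight, so a weighted loss would penalise missed fractures more heavily than a naive one.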
Source
The source data are hosted on Figshare, an online digital repository where researchers can store and share their outputs, including datasets. FracAtlas is freely accessible under a CC‑BY 4.0 licence, allowing wide use in the scientific community, particularly among researchers and practitioners in medical imaging.
Uses
"FracAtlas" can support a range of machine‑learning and deep‑learning work, for example:
- Develop deep‑learning models to automatically detect fractures in radiographic images.
- Classify fracture types (e.g., hairline, comminuted, transverse) using machine‑learning models.
- Implement segmentation models to delineate bone structures from surrounding tissue in radiographs.
- Predict patient outcomes by combining fracture characteristics with other clinical data.
- Build models to recognise abnormal patterns in radiographic bone images.
Structure
Original Dataset Architecture
The original zip file contains three sub‑folders (“images”, “Annotations” and “utilities”) and a “dataset.csv” file.
- images folder: contains two sub‑folders “Fractured” and “Non‑fractured”, each holding JPG images.
- Annotations folder: contains four sub‑folders “COCO JSON”, “PASCAL VOC”, “VGG JSON” and “YOLO”, each storing the corresponding annotation format.
- utilities folder: includes several scripts for converting the original files into more readable formats.
- dataset.csv: provides basic per‑image variables, such as image_id, hand, leg, hip, shoulder, mixed, hardware, multiscan, fractured, fracture_count, frontal, lateral and oblique.
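As a rough illustration of how the per‑image flags in “dataset.csv” might be used, the sketch below filters rows with Python's standard `csv` module. The sample rows are invented for the example, cover only a subset of the listed columns, and assume the flags are stored as 0/1 integers; in practice the text would be read from the real file. The helper names `load_rows` and `fractured_ids` are hypothetical.

```python
import csv
import io

# Invented sample rows; the real file has more columns and one row per image.
SAMPLE_CSV = """image_id,hand,leg,hip,shoulder,fractured,fracture_count
example_0001.jpg,1,0,0,0,1,2
example_0002.jpg,0,1,0,0,0,0
example_0003.jpg,0,1,0,0,1,1
"""

def load_rows(text):
    """Parse CSV text into dicts, converting every column except image_id to int."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        row = {"image_id": raw["image_id"]}
        row.update({k: int(v) for k, v in raw.items() if k != "image_id"})
        rows.append(row)
    return rows

def fractured_ids(rows, body_part=None):
    """Return ids of fractured images, optionally limited to one body-part column."""
    return [
        r["image_id"]
        for r in rows
        if r["fractured"] == 1 and (body_part is None or r[body_part] == 1)
    ]

rows = load_rows(SAMPLE_CSV)
print(fractured_ids(rows))
print(fractured_ids(rows, body_part="leg"))
```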
Updated Dataset Architecture
In the Hugging Face dataset loader, some variables from the original “dataset.csv” are extracted and adapted to fit Hugging Face feature classes. Further variables are extracted from other files in the FracAtlas zip to give a cleaner, more systematic dataset.
Motivation
FracAtlas was created to meet the demand for annotated musculoskeletal radiographs needed to train machine‑learning models for fracture detection. The dataset aims to fill the gap in publicly available, annotated radiographic images that can advance AI‑assisted diagnostic tools.
Source Data
Initially, 14,068 X‑ray images were collected; the 4,083 images in the final dataset are a subset of these. Because of privacy concerns, all DICOM images were assigned arbitrary filenames and converted to JPG format. The conversions were performed using proprietary software supplied with the respective X‑ray machines.
Annotations
The dataset contains 4,083 images that were manually annotated by two radiologists for fracture classification, localisation, and segmentation. The annotations were later verified and merged by an orthopaedic surgeon using the open‑source annotation platform makesense.ai. Annotation types include COCO JSON, PASCAL VOC, VGG JSON and YOLO.
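Because the same annotations are shipped in several formats, it can help to see how two of them relate. The sketch below converts one bounding box from the usual YOLO text convention (class index, then box centre and size normalised to the image dimensions) to Pascal VOC style pixel corners. It assumes the FracAtlas YOLO files follow that standard convention; the function name `yolo_to_voc` and the sample values are illustrative only.

```python
def yolo_to_voc(line, img_width, img_height):
    """Convert 'class xc yc w h' (normalised) to (class, xmin, ymin, xmax, ymax) in pixels."""
    parts = line.split()
    class_id = int(parts[0])
    xc, yc, w, h = (float(p) for p in parts[1:5])
    # Shift from centre/size to corner coordinates, then scale to pixels.
    xmin = (xc - w / 2) * img_width
    ymin = (yc - h / 2) * img_height
    xmax = (xc + w / 2) * img_width
    ymax = (yc + h / 2) * img_height
    return class_id, round(xmin), round(ymin), round(xmax), round(ymax)

# Made-up box centred in a 400x200 image.
box = yolo_to_voc("0 0.5 0.5 0.25 0.5", img_width=400, img_height=200)
print(box)  # (0, 150, 50, 250, 150)
```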
Bias, Risks and Limitations
While FracAtlas is particularly valuable for the development of computer‑assisted diagnostic systems, its limitations should be considered. In particular, the manual annotation process is susceptible to human error, which may lead to mislabelled data.
Citation
Abedeen, I., Rahman, M. A., Prottyasha, F. Z., Ahmed, T., Chowdhury, T. M., & Shatabda, S. (2023). FracAtlas: A Dataset for Fracture Classification, Localization and Segmentation of Musculoskeletal Radiographs. Scientific Data, 10(1), 521. https://doi.org/10.1038/s41597-023-02432-4
Source
Organization: hugging_face