Dataset asset | Open Source Community | Earth Observation | 3D Point Cloud Segmentation

IGNF/FRACTAL

FRACTAL is a benchmark dataset for 3D point‑cloud semantic segmentation, comprising 100,000 point clouds covering a 250 km² area across five regions in France. Originating from the Lidar HD project, it balances rare classes through an efficient sampling strategy and includes challenging landscapes. Each point cloud spans 50 × 50 m with high point density (average 37 points/m²). The dataset provides seven semantic classes and is colorized using high‑resolution aerial imagery. It is split into training (80,000), validation (10,000), and test (10,000) subsets, with metadata documenting acquisition years and distribution details.

Source: hugging_face
Created: Nov 28, 2025
Updated: Apr 5, 2025
Overview

Dataset description and usage context

FRACTAL Dataset Overview

Basic Information

  • Name: FRACTAL
  • Type: 3D Point‑Cloud Semantic Segmentation Benchmark
  • Size: 100,000 point clouds
  • Coverage Area: 250 km²
  • Source: Lidar HD Program (2020‑2025)
  • Point Density: 10 pulses/m², average 37 pts/m², total 9,261 M points
  • Semantic Classes: 7 (Other | Ground | Vegetation | Building | Water | Bridge | Permanent Structure)
  • Colorization: Using high‑resolution aerial images from ORTHO HR®
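
The size, coverage, and density figures above can be cross-checked with simple arithmetic. The sketch below is only a sanity check, and it assumes the 50 × 50 m point clouds do not overlap, which the dataset card does not state explicitly; the function and constant names are illustrative.

```python
# Figures quoted in the dataset card.
NUM_CLOUDS = 100_000
TILE_SIDE_M = 50                        # each point cloud spans 50 m x 50 m
REPORTED_TOTAL_POINTS = 9_261_000_000   # "9,261 M points"


def coverage_km2(num_clouds: int, tile_side_m: float) -> float:
    """Ground area covered by square tiles, in km² (assumes no overlap)."""
    return num_clouds * tile_side_m ** 2 / 1e6


def implied_density(total_points: float, area_km2: float) -> float:
    """Average number of points per square metre over the given area."""
    return total_points / (area_km2 * 1e6)


area_km2 = coverage_km2(NUM_CLOUDS, TILE_SIDE_M)
density_pts_m2 = implied_density(REPORTED_TOTAL_POINTS, area_km2)

print(f"coverage: {area_km2:.0f} km²")
print(f"implied average density: {density_pts_m2:.1f} pts/m²")
```

Under that assumption, the tile count reproduces the stated 250 km² coverage, and the reported point total implies a density that rounds to the stated average of 37 pts/m².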

Content

  • Composition:
    • Training: 80,000 point clouds
    • Validation: 10,000 point clouds
    • Test: 10,000 point clouds
  • Point Cloud Density: 10 pulses/m² (about 37 pts/m² on average)
  • Colorization: Near‑infrared, red, green, and blue channels at 0.2 m spatial resolution
  • Acquisition Period: Spans multiple years, with gaps of up to 3 years possible
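
Because each point carries near-infrared and red values, a vegetation index such as NDVI can be derived per point. The sketch below is one possible use of those channels, not something the dataset card prescribes. The channel order and the [0, 1] value range are assumptions, and the sample values are hypothetical.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized difference vegetation index for a single point.

    Returns 0.0 when both inputs are zero to avoid dividing by zero.
    """
    total = nir + red
    if total == 0:
        return 0.0
    return (nir - red) / total


# Hypothetical per-point colors as (nir, red, green, blue), scaled to [0, 1].
points = [
    (0.60, 0.20, 0.30, 0.10),  # stronger NIR than red response
    (0.25, 0.25, 0.25, 0.25),  # flat response across channels
    (0.00, 0.00, 0.00, 0.00),  # no signal in either channel
]

ndvi_values = [ndvi(nir, red) for nir, red, _green, _blue in points]
print(ndvi_values)
```

If the stored colors are integers (for example 8-bit or 16-bit values), they would need to be scaled consistently before computing the ratio.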

Class Distribution

Class                 Training %   Validation %   Test %
Other                 0.6          0.5            0.7
Ground                39.0         39.1           40.5
Vegetation            57.0         56.9           54.1
Building              2.8          2.8            3.3
Water                 0.5          0.5            1.2
Bridge                0.1          0.1            0.2
Permanent Structure   0.04         0.04           0.04
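
The distribution is heavily skewed toward Ground and Vegetation, so a training loss may benefit from class weighting. The sketch below derives inverse-frequency weights from the training percentages listed above. This is a common heuristic and an assumption here, not the weighting used by the dataset authors; the function name is illustrative.

```python
# Training-set class shares (percent of points), as listed in the card.
TRAIN_SHARE_PCT = {
    "Other": 0.6,
    "Ground": 39.0,
    "Vegetation": 57.0,
    "Building": 2.8,
    "Water": 0.5,
    "Bridge": 0.1,
    "Permanent Structure": 0.04,
}


def inverse_frequency_weights(share_pct: dict[str, float]) -> dict[str, float]:
    """Weight each class by 1 / share, rescaled so the weights average to 1."""
    inverse = {name: 1.0 / pct for name, pct in share_pct.items()}
    mean_inverse = sum(inverse.values()) / len(inverse)
    return {name: value / mean_inverse for name, value in inverse.items()}


weights = inverse_frequency_weights(TRAIN_SHARE_PCT)

# Ratio between the most and least frequent classes in the training split.
imbalance_ratio = max(TRAIN_SHARE_PCT.values()) / min(TRAIN_SHARE_PCT.values())

for name, weight in sorted(weights.items(), key=lambda item: item[1]):
    print(f"{name:<20} {weight:.4f}")
print(f"imbalance ratio: {imbalance_ratio:.0f}")
```

Raw inverse-frequency weights are extreme for a distribution this skewed, so in practice they are often dampened, for example by taking a square root or clipping to a maximum.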

Spatial Extent & Split

  • Sampling Regions: Five spatial domains in southern France, total area 17,280 km²
  • Split: 80 % training, 10 % validation, 10 % test
  • Test Area: 25 km², sampled from continuous test zones defined in each domain (1,049 km² of test zones in total)
  • Training/Validation Area: 200 km² (training) + 25 km² (validation), sampled from the remaining area using spatial stratification
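
The split areas follow from the number of point clouds per subset and the 50 × 50 m tile size. The sketch below reproduces them and estimates what fraction of the available area was sampled. It assumes non-overlapping tiles, and the helper names are illustrative.

```python
# Point clouds per subset and tile size, as stated in the dataset card.
SPLIT_COUNTS = {"train": 80_000, "val": 10_000, "test": 10_000}
TILE_AREA_M2 = 50 * 50

# Areas available for sampling, in km², as stated in the card.
TOTAL_DOMAIN_KM2 = 17_280
TEST_ZONES_KM2 = 1_049


def split_areas_km2(counts: dict[str, int], tile_area_m2: float) -> dict[str, float]:
    """Area covered by each subset in km², assuming non-overlapping tiles."""
    return {name: n * tile_area_m2 / 1e6 for name, n in counts.items()}


def sampling_fraction_pct(sampled_km2: float, available_km2: float) -> float:
    """Share of the available area that ends up in the dataset, in percent."""
    return 100.0 * sampled_km2 / available_km2


areas = split_areas_km2(SPLIT_COUNTS, TILE_AREA_M2)
test_fraction = sampling_fraction_pct(areas["test"], TEST_ZONES_KM2)
overall_fraction = sampling_fraction_pct(sum(areas.values()), TOTAL_DOMAIN_KM2)

print(areas)
print(f"test zones sampled: {test_fraction:.2f} %")
print(f"all domains sampled: {overall_fraction:.2f} %")
```

The small sampled fractions are consistent with a dataset drawn selectively from much larger regions rather than tiling them exhaustively.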

Aerial Imagery

Citation

@misc{gaydon2024fractal,
  title={FRACTAL: An Ultra‑Large‑Scale Aerial Lidar Dataset for 3D Semantic Segmentation of Diverse Landscapes},
  author={Charles Gaydon and Michel Daab and Floryne Roche},
  year={2024},
  eprint={TBD},
  archivePrefix={arXiv},
  url={https://arxiv.org/abs/TBD},
  primaryClass={cs.CV}
}

License

  • License: "OPEN LICENCE 2.0/LICENCE OUVERTE", created by the French government to promote open data dissemination by public administration. Compatible with the UK Open Government Licence (OGL), Creative Commons Attribution (CC‑BY), and Open Data Commons Attribution (ODC‑BY).