bsmock/ICDAR-2013.c

The ICDAR‑2013.c dataset, released in 2023, is a fork of the original ICDAR‑2013 dataset, modified by authors other than the original ones. It includes manual corrections of minor annotation errors in the original data and automated fixes (e.g., normalization) that address over‑segmentation and make the dataset more consistent with other table structure recognition (TSR) datasets such as PubTables‑1M. For more details on this version and the manual corrections, refer to the associated paper.

Source: Hugging Face
Created: Nov 28, 2025
Updated: Sep 7, 2023

ICDAR-2013.c Dataset

Overview

The ICDAR‑2013.c dataset was released in 2023 and can be considered a modified version of the original ICDAR‑2013 dataset. It contains manual corrections of minor annotation errors in the original data, as well as automated normalization that fixes over‑segmentation and improves consistency with other TSR datasets such as PubTables‑1M.
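
One possible way to fetch the files is sketched below. It assumes the dataset is hosted on the Hugging Face Hub under the repository ID shown in the title of this card and that the `huggingface_hub` package is installed. The function name `download_icdar_2013c` and the default target directory are illustrative choices, not part of the dataset.

```python
from pathlib import Path

# Repository ID as shown in the title of this card.
REPO_ID = "bsmock/ICDAR-2013.c"


def download_icdar_2013c(local_dir: str = "ICDAR-2013.c") -> Path:
    """Download every file in the dataset repository and return the local path."""
    # Imported lazily so this module can be imported without the package installed.
    from huggingface_hub import snapshot_download

    path = snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",  # a dataset repository, not a model repository
        local_dir=local_dir,  # hypothetical target directory
    )
    return Path(path)
```

This card does not describe the file layout inside the repository, so after calling `download_icdar_2013c()` it is safest to list the returned directory before writing any parsing code.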

Content

Manual Corrections: Small annotation errors in the original dataset are corrected manually.
Automated Corrections: Normalization is applied to resolve over‑segmentation and increase alignment with other TSR datasets.
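
To make the term over‑segmentation concrete, here is a toy sketch in which one header cell that logically spans two columns has been annotated as a text cell followed by a blank cell. The `Cell` type and the merge rule are hypothetical and deliberately naive. They are not the procedure used to build ICDAR‑2013.c, which is described in the associated paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cell:
    row: int
    col_start: int  # first grid column covered (inclusive)
    col_end: int    # last grid column covered (inclusive)
    text: str


def merge_oversegmented_row(cells: List[Cell]) -> List[Cell]:
    """Merge each blank cell into the non-blank cell immediately to its left.

    Toy rule for illustration only: a blank cell that directly follows a
    text cell in the same row is treated as a fragment of one spanning cell.
    """
    merged: List[Cell] = []
    for cell in sorted(cells, key=lambda c: c.col_start):
        prev = merged[-1] if merged else None
        is_blank = not cell.text.strip()
        if (
            prev is not None
            and is_blank
            and prev.text.strip()
            and prev.col_end + 1 == cell.col_start
        ):
            # Extend the previous cell's span instead of keeping a blank cell.
            prev.col_end = cell.col_end
        else:
            # Copy the cell so the caller's input is left unmodified.
            merged.append(Cell(cell.row, cell.col_start, cell.col_end, cell.text))
    return merged


header = [
    Cell(row=0, col_start=0, col_end=0, text="Region"),
    Cell(row=0, col_start=1, col_end=1, text="Sales"),
    Cell(row=0, col_start=2, col_end=2, text=""),  # over-segmented fragment
]
result = merge_oversegmented_row(header)
spans = [(c.text, c.col_start, c.col_end) for c in result]
print(spans)
```

In real tables a blank cell is often legitimate, which is why a rule this simple would only be plausible for rows already known to contain spanning header cells.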

Citation

If your research uses this dataset, please cite the following paper:

@inproceedings{smock2023aligning,
  title={Aligning benchmark datasets for table structure recognition},
  author={Smock, Brandon and Pesala, Rohith and Abraham, Robin},
  booktitle={International Conference on Document Analysis and Recognition},
  pages={371--386},
  year={2023},
  organization={Springer}
}

Original Dataset

The original ICDAR‑2013 dataset was released for the ICDAR 2013 Table Competition. It has no known license. It is generally considered to be in the public domain, so we treat it as having no license restrictions.
