Doraemon-AI/pdf-layout-chinese

pdf-layout-chinese is a Chinese document layout analysis dataset focusing on Chinese scholarly documents (e.g., papers). The dataset provides 10 layout classes: text, title, image, image title, table, table title, header, footer, caption, and formula. It contains 5,000 training images and 1,000 validation images; each image has a correspondingly named JSON annotation file. Annotations were created with labelme and support polygon shapes.

Updated 4/18/2024
hugging_face

Description

pdf-layout-chinese Dataset Overview

Basic Information

  • Name: pdf-layout-chinese
  • License: AFL-3.0
  • Task Type: Feature Extraction
  • Languages: English, Chinese
  • Size Category: 100M < n < 1B

Content

  • Description: pdf-layout-chinese is a Chinese document layout analysis dataset targeting scholarly paper scenarios.
  • Labels: 10 classes – text, title, image, image title, table, table title, header, footer, caption, formula.
  • Splits: 5,000 training images and 1,000 validation images, stored in train and val directories respectively.
  • Annotation Files: Each image has a same‑named JSON annotation file generated with labelme.
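Because every image is expected to have a same-named JSON file, a quick consistency check after download can catch incomplete copies. The following is a minimal sketch, not part of the dataset's tooling: the function name `find_unpaired` and the set of image extensions are assumptions, since the listing does not state the image format.

```python
from pathlib import Path

# Assumption: the listing does not name the image format, so common ones are tried.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def find_unpaired(split_dir):
    """Return names of images in split_dir that lack a same-named .json file."""
    missing = []
    for image in sorted(Path(split_dir).iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip the JSON files and anything else
        if not image.with_suffix(".json").exists():
            missing.append(image.name)
    return missing
```

Calling `find_unpaired("train")` and `find_unpaired("val")` should return empty lists if both splits are complete.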

Annotation Format

  • Tool: labelme
  • Structure: Follows the labelme JSON format. Key fields:
    • shapes: list of annotation instances; each shape carries:
      • label: class label of the region.
      • points: polygon vertex coordinates as [x, y] pairs.
      • shape_type: polygon
    • imagePath: image file name
    • imageHeight: image height in pixels
    • imageWidth: image width in pixels
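
To make the field layout concrete, here is a minimal sketch that reads one annotation file and counts regions per class. It assumes the standard labelme layout, in which each entry of `shapes` holds its class name under the key `label`; the function name `summarize_annotation` is hypothetical.

```python
import json
from collections import Counter


def summarize_annotation(json_path):
    """Read one labelme JSON file and count annotated regions per class."""
    with open(json_path, encoding="utf-8") as f:
        ann = json.load(f)
    # Assumption: each shape stores its class name under "label" (labelme default).
    counts = Counter(shape["label"] for shape in ann["shapes"])
    return {
        "image": ann["imagePath"],
        "size": (ann["imageWidth"], ann["imageHeight"]),
        "counts": dict(counts),
    }
```

The returned `counts` dictionary gives a quick view of the class balance on a page before any conversion is run.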

Conversion

  • Conversion Tool: labelme2coco.py
  • Commands:
    • Training set: python3 labelme2coco.py train train_save_path --labels labels.txt
    • Validation set: python3 labelme2coco.py val val_save_path --labels labels.txt
  • Output Location: The converted COCO-format output is saved under the train_save_path and val_save_path directories.
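
For readers unfamiliar with what a labelme-to-COCO conversion has to compute, the sketch below shows the per-polygon step: a flattened segmentation list, an `[x, y, width, height]` bounding box, and the polygon area via the shoelace formula. This is an illustration only and is not taken from labelme2coco.py; the function name `polygon_to_coco` is hypothetical.

```python
def polygon_to_coco(points):
    """Turn one labelme polygon ([[x, y], ...]) into COCO-style geometry fields."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    # COCO bounding boxes are [x_min, y_min, width, height].
    bbox = [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)]

    # Shoelace formula: sum of cross products over consecutive vertices.
    twice_area = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1

    return {
        # COCO stores each polygon as one flat list [x1, y1, x2, y2, ...].
        "segmentation": [[coord for point in points for coord in point]],
        "bbox": bbox,
        "area": abs(twice_area) / 2.0,
    }
```

For a 4 by 3 rectangle with corners `[[0, 0], [4, 0], [4, 3], [0, 3]]`, this yields a bounding box of `[0, 0, 4, 3]` and an area of `12.0`.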

Topics

Document Layout Analysis
Computer Vision

Source

Organization: hugging_face

Created: Unknown