
HamdiJr/Egyptian_hieroglyphs

The dataset contains 4,210 images of individual Egyptian hieroglyphs extracted from 10 source images in the book "The Pyramid of Unas", together with a language model. Each hieroglyph is manually annotated and labeled according to the Gardiner sign list. The dataset also includes automated detection results and tools for building the language model (e.g., vocabulary and n-gram grammars). The description below covers the dataset's structure and its license, which is stated as GPL for non-commercial use.

Updated 7/22/2022
hugging_face

Description

Dataset Overview

Dataset Name

  • Egyptian hieroglyphs 𓂀
  • Hieroglyph image dataset with a language model

Dataset Features

  • Source: Built from 10 images in the book The Pyramid of Unas (Alexandre Piankoff, 1955).
  • Image Numbers: 3, 5, 7, 9, 20, 21, 22, 23, 39, 41.
  • Annotation: Every hieroglyph was manually annotated and labeled according to the Gardiner Sign List.
  • Image Naming: File names contain the label and image index.
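Since each file name carries both the label and the image index, the two can be recovered without a separate annotation file. The sketch below assumes a hypothetical layout of the form `030000_S29.png` (two digits for the source image number, four digits for the glyph index, then the Gardiner label). This page does not spell out the exact pattern, so adjust the regular expression to match the actual files. `GlyphName` and `parse_glyph_name` are illustrative names, not part of the dataset.

```python
import re
from pathlib import Path
from typing import NamedTuple

# ASSUMPTION: file stems look like "<2-digit picture><4-digit index>_<label>".
# The dataset description only says names contain the label and image index.
_NAME_RE = re.compile(r"^(?P<picture>\d{2})(?P<index>\d{4})_(?P<label>[A-Za-z0-9]+)$")


class GlyphName(NamedTuple):
    picture: int  # number of the source image, e.g. 3
    index: int    # position of the glyph within that image
    label: str    # Gardiner code such as "S29", or "UNKNOWN"


def parse_glyph_name(path: str) -> GlyphName:
    """Split a glyph file name into picture number, index, and label."""
    stem = Path(path).stem
    match = _NAME_RE.match(stem)
    if match is None:
        raise ValueError(f"unexpected file name layout: {path!r}")
    return GlyphName(int(match["picture"]), int(match["index"]), match["label"])
```

For example, `parse_glyph_name("030000_S29.png")` yields picture 3, index 0, and label `S29` under the assumed layout.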

Dataset Statistics

  • Total Hieroglyph Images: 4,210 (including 179 labeled as UNKNOWN).
  • Total Classes: 171 (excluding the UNKNOWN class).
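These figures can be checked against a local copy by counting labels. The following is a minimal sketch that takes any iterable of label strings (for instance, labels parsed from file names) and reports the total, the UNKNOWN count, and the number of classes excluding UNKNOWN. The function name `summarize_labels` is hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def summarize_labels(labels: Iterable[str], unknown: str = "UNKNOWN") -> Dict[str, int]:
    """Count images and classes, treating `unknown` as a non-class label."""
    counts = Counter(labels)
    return {
        "total_images": sum(counts.values()),
        "unknown_images": counts.get(unknown, 0),
        # The UNKNOWN bucket is excluded from the class count.
        "classes": len(counts) - (1 if unknown in counts else 0),
    }
```

On the full dataset, the statistics above would correspond to `total_images` of 4,210, `unknown_images` of 179, and `classes` of 171.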

Annotation Accuracy

  • Note: Annotations may not be fully accurate; unidentified hieroglyphs are marked as “UNKNOWN”.

Data Processing

  • Manual Annotation: Hand‑annotated hieroglyphs.
  • Automated Detection: Automatic extraction of hieroglyphs using text detection methods, stored in Dataset/Automated/.
  • Location Information: x/y coordinates for each hieroglyph, stored in the Location folder.

Dataset Structure

  • Images: The 10 source images from The Pyramid of Unas.
  • Manual Annotation: Hieroglyph image crops with location data.
  • Automated Detection: Automatically detected hieroglyph crops with location data.
  • Language Model: Egyptian text, dictionary, and n‑grams from the JSesh database.
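To illustrate what n-gram statistics over hieroglyph sequences look like, here is a minimal sketch that counts n-grams across sequences of Gardiner labels. It is not the dataset's own tooling, and it makes no assumption about the file format of the language model. The input is simply a list of label sequences, and `count_ngrams` is a hypothetical helper name.

```python
from collections import Counter
from typing import Iterable, Sequence


def count_ngrams(sequences: Iterable[Sequence[str]], n: int = 2) -> Counter:
    """Count contiguous n-grams of labels within each sequence."""
    if n < 1:
        raise ValueError("n must be at least 1")
    counts: Counter = Counter()
    for seq in sequences:
        # Slide a window of width n over the sequence; short sequences yield nothing.
        for i in range(len(seq) - n + 1):
            counts[tuple(seq[i:i + n])] += 1
    return counts
```

Counts like these could then be normalized into probabilities to rank candidate labels from a classifier, though how the dataset's language model is actually applied is not described here.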

License

  • GPL: Non‑commercial use only.


Topics

Hieroglyph Recognition
Natural Language Processing

Source

Organization: hugging_face

Created: Unknown
