High Quality Data

Dataset Hub

Explore high-quality datasets for your AI and machine learning projects.

ATHAR

ATHAR is a parallel corpus of approximately 66,000 lines of classical Arabic text, each paired with its English translation. The dataset is split into training and test subsets. Each record contains a classical Arabic text field and the corresponding English translation field. It is suited to Arabic-to-English translation tasks.

Source: Hugging Face
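
As a rough illustration of the record layout described above, the sketch below turns one record into a (source, target) pair for an Arabic-to-English translation task. The field names `arabic` and `english`, the helper `to_translation_pair`, and the sample row are assumptions made for this example; the actual column names should be checked on the dataset page.

```python
from typing import Mapping, Tuple


def to_translation_pair(
    record: Mapping[str, str],
    source_key: str = "arabic",    # hypothetical field name for the classical Arabic text
    target_key: str = "english",   # hypothetical field name for the English translation
) -> Tuple[str, str]:
    """Return a whitespace-trimmed (source, target) pair from one parallel record."""
    source = record[source_key].strip()
    target = record[target_key].strip()
    if not source or not target:
        # Skip-worthy rows: a parallel line is only useful if both sides are present.
        raise ValueError("record must contain non-empty source and target text")
    return source, target


# Illustrative stand-in row, not taken from the dataset itself.
sample = {"arabic": " السلام عليكم ", "english": " Peace be upon you "}
pair = to_translation_pair(sample)
print(pair)
```

Rows loaded with the Hugging Face `datasets` library are dictionary-like, so a function of this shape could be applied to each row of the training or test split once the real field names are known.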
