Multi30k is a multilingual image description dataset built around English-German sentence pairs, with additional language support that includes French and Czech. It is divided into training, validation, and test sets. For each split, the dataset reports statistics such as the number of sentences, the word count, and the average number of words per sentence. It also provides download links for precomputed visual features and for the original images.
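As a rough illustration of how the per-split statistics mentioned above could be reproduced, the sketch below counts sentences, words, and average words per sentence for a list of descriptions. The function name `corpus_stats`, the whitespace tokenization, and the two sample sentences are assumptions made for this example; they are not taken from the dataset, and the published figures may rely on a different tokenizer.

```python
from typing import Iterable


def corpus_stats(sentences: Iterable[str]) -> dict:
    """Count sentences and words, skipping blank lines.

    Assumption: tokens are separated by whitespace. The dataset's own
    statistics may be computed with a different tokenization scheme.
    """
    n_sentences = 0
    n_words = 0
    for sentence in sentences:
        tokens = sentence.split()
        if not tokens:
            continue  # ignore empty or whitespace-only lines
        n_sentences += 1
        n_words += len(tokens)
    # Guard against division by zero for an empty corpus.
    avg = n_words / n_sentences if n_sentences else 0.0
    return {
        "sentences": n_sentences,
        "words": n_words,
        "avg_words_per_sentence": avg,
    }


# Hypothetical sample descriptions, invented for illustration only.
sample = [
    "A man rides a bicycle .",
    "Two dogs play in the snow .",
]
print(corpus_stats(sample))
```

To apply the same function to a downloaded split, one could pass an open text file with one description per line in place of `sample`.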