High Quality Data

Dataset Hub

Explore high-quality datasets for your AI and machine learning projects.


Browse by Category

Multi-view Datasets

The collection comprises multiple multi‑view benchmark datasets for clustering or classification tasks, including AwA, COIL100, MNIST2, NUSWIDEOBJ, PIE, and YouTubeFaces, each accompanied by detailed descriptions and multi‑view features.

Source: GitHub


isek-ai/danbooru-tags

Anime Tags

Data Classification

The danbooru-tags dataset contains Danbooru tag names, the words that make up each tag name, and each tag's category. Tag categories include General (0), Artist (1), Copyright (3), Character (4), and Metadata (5).
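As a rough illustration of how the category IDs listed above might be handled in code, the sketch below maps each numeric ID to a readable name. The `TagCategory` enum and the `category_name` helper are hypothetical names introduced here, not part of the dataset or any library, and IDs that are not listed in the description fall back to "unknown".

```python
from enum import IntEnum


class TagCategory(IntEnum):
    """Category IDs as listed in the dataset description (hypothetical enum)."""
    GENERAL = 0
    ARTIST = 1
    COPYRIGHT = 3
    CHARACTER = 4
    METADATA = 5


def category_name(category_id: int) -> str:
    """Return a lowercase category name, or "unknown" for unlisted IDs."""
    try:
        return TagCategory(category_id).name.lower()
    except ValueError:
        # The description may not list every category, so do not fail here.
        return "unknown"


if __name__ == "__main__":
    for cid in (0, 1, 2, 3, 4, 5):
        print(cid, category_name(cid))
```

Treating unlisted IDs as "unknown" rather than raising an error keeps the helper usable if the dataset contains categories beyond the five named here.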

Source: Hugging Face


CSDMC2010 SPAM corpus

Spam Detection

Data Classification

This dataset is a collection of email messages split into a training set and a test set. Because it was prepared for a competition, only the training set is labeled; the test set is unlabeled. The training set contains 2,929 legitimate (ham) emails and 1,378 spam emails.
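The training-set counts above imply a moderately imbalanced classification problem. The sketch below works through that arithmetic and derives inverse-frequency class weights, a common heuristic for imbalanced data. The counts are taken from the description; the variable names and the choice of weighting scheme are assumptions made for illustration, not something the corpus prescribes.

```python
# Class counts as stated in the dataset description.
ham_count = 2929
spam_count = 1378

total = ham_count + spam_count
spam_fraction = spam_count / total

# Inverse-frequency weights: total / (number_of_classes * class_count).
# This is one common heuristic, assumed here for illustration only.
n_classes = 2
class_weights = {
    "ham": total / (n_classes * ham_count),
    "spam": total / (n_classes * spam_count),
}

if __name__ == "__main__":
    print(f"total training emails: {total}")
    print(f"spam fraction: {spam_fraction:.3f}")
    print(f"class weights: {class_weights}")
```

With roughly one spam message for every two ham messages, the rarer spam class receives the larger weight under this scheme.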

Source: GitHub
