Explore high-quality datasets for your AI and machine learning projects.
The dataset contains 25,000 color images divided into five classes, each class comprising 5,000 images. All images are 768 × 768 pixels in JPEG format.
The dataset targets image classification and feature extraction for defective tires. It contains between 1,000 and 10,000 images, is suited to identifying tire defects in the automotive industry, and is released under the CC‑BY‑4.0 license.
This dataset is the latest supplemental data from e621.net and contains the site's newest images and videos in gif, jpg, png, swf, and webm formats. Because of its content, it is not suitable for all audiences. It consists of 346,765 records with IDs ranging from 117,744 to 5,083,602 and was last updated on 2024-10-01. Text is primarily in English and Japanese. The dataset is applicable to image classification, zero‑shot image classification, and text‑to‑image generation tasks. Content covers art and anime and is accompanied by detailed tags describing the images and videos.
This dataset contains 207,572 books from the Amazon marketplace, intended for book cover image classification and data mining tasks. The dataset includes cover images, titles, authors, and categories.
Fashion MNIST is a dataset of clothing images, including T‑shirts, trousers, dresses, etc. It consists of 60,000 training images and 10,000 test images, each a 28 × 28 pixel grayscale image.
The dataset comprises two subsets (sub1 and sub2), each containing images and label features. Sub1 labels include *combined* and *seb*; sub2 labels include *combined* and *mel*. Each subset provides training, validation, and test splits with corresponding sample counts and byte sizes. Download sizes and actual sizes are listed in the README.
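CIFAR-100-LT is an imbalanced dataset containing fewer than 60,000 color images of 32 × 32 pixels across 100 classes. The number of samples per class decreases exponentially. The dataset includes 10,000 test images (100 per class) and fewer than 50,000 training images. The 100 classes are further organized into 20 superclasses, and each image carries two labels: a fine label for its specific class and a coarse label for its superclass.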
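The exponential decay in per-class training counts can be sketched as follows. This is only an illustration: the function name, the head-class size of 500 images, and the imbalance ratio of 100 are assumptions chosen for the example, not values stated for this dataset.

```python
def long_tail_counts(num_classes=100, max_per_class=500, imbalance_ratio=100):
    """Per-class training counts that decay exponentially from head to tail.

    imbalance_ratio is the assumed ratio between the largest and the
    smallest class; it is a hypothetical parameter used for illustration.
    """
    counts = []
    for class_index in range(num_classes):
        # Position runs from 0.0 (head class) to 1.0 (tail class).
        position = class_index / (num_classes - 1)
        counts.append(round(max_per_class * (1 / imbalance_ratio) ** position))
    return counts


counts = long_tail_counts()
total_train = sum(counts)
```

Under these assumed parameters the head class keeps all 500 images while the tail class keeps only a handful, so the training total falls well below the 50,000 images of a balanced split.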
This is a large‑scale insect pest recognition benchmark dataset containing over 75,000 images across 102 categories. It is intended for detection and classification tasks and provides detailed annotation examples and class lists.
ImageNet is a large‑scale image classification dataset created via crowdsourcing. It contains between 1 million and 10 million images, each labeled with a specific category; all labels are in English. The source data is original, and the task is multi‑class image classification. The dataset documents its features (images and labels) and provides a list of class names.
The COMPTECH2022 WhoSigned? dataset is primarily used for handwritten signature verification. It contains over 5,000 handwritten signatures along with their original images and cropped images; each original image includes approximately 10 handwritten signatures from the same user ID. Images are cropped using a segmentation neural network, so each cropped image contains a single handwritten signature. User IDs can be derived from the image file names. The dataset was created by Toloka.ai with support from COMPTECH2022.
FEMNIST is an image classification dataset containing 62 classes (10 digits, 26 lowercase letters, 26 uppercase letters), with images of size 28×28 pixels (optionally upscaled to 128×128 pixels), involving 3,500 users. The dataset is derived by partitioning the EMNIST dataset so that each user's data includes characters written by a single author.
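The 62-class label space (10 + 26 + 26) can be written out as a small lookup table. This is a minimal sketch: the index order used here (digits, then lowercase, then uppercase) simply follows the order in the description and is an assumption, so the dataset's actual label indices may differ.

```python
import string

# Assumed ordering for illustration only: digits, lowercase, uppercase.
# The dataset's real label indices may be ordered differently.
FEMNIST_CHARACTERS = string.digits + string.ascii_lowercase + string.ascii_uppercase

# Map integer labels to characters and back.
label_to_char = dict(enumerate(FEMNIST_CHARACTERS))
char_to_label = {char: label for label, char in label_to_char.items()}

num_classes = len(FEMNIST_CHARACTERS)
```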
Caltech‑101 dataset: VGG from scratch plus ResNet. This Jupyter notebook provides CNN‑based image classification. The project uses technologies including a VGG model built from scratch, a pretrained ResNet model, and F1 score and accuracy for performance evaluation. Dataset source: https://www.tensorflow.org/datasets/catalog/caltech101?hl=es-419
The dataset comprises two features: images for visualisation and classification labels. There are 47 distinct categories such as banded, blotchy, braided, etc. The data are split into training, test, and validation sets, each containing 1,880 samples. The total download size is 629,310,459 bytes and the overall dataset size is 620,627,859.52 bytes.
PatternNet is a large‑scale high‑resolution remote sensing dataset for scene classification and image retrieval. It contains 38 classes, each with 800 RGB images of 256 × 256 pixels, totaling 30,400 images. The images are sourced from Google Earth or the Google Map API and cover scene types such as airplanes, baseball fields, basketball courts, beaches, bridges, etc.
This dataset is designed for door object detection, classification, and semantic segmentation. Images were captured with a 3D Realsense D435 camera, with the angle adjusted to ensure the complete door region. The dataset is divided into two subsets: one containing 240 images for door detection/semantic segmentation, and another containing 1,206 images for door classification. The classification categories are closed doors, partially open doors, and fully open doors, and the dataset provides detailed training, validation, and test splits.
The Dunhuang dataset includes images and their corresponding labels. Label categories cover animals, architecture, Buddhist art, captions, clothing, dance, doctrinal translation, flight, folklore, landscape, music, pattern, sponsor, story, technique, and traffic. The training set contains 2,352 examples; the total dataset size is approximately 2.71 GB.
The Cars dataset contains 16,185 images of 196 car classes. It is split into 8,144 training images and 8,041 test images, with each class divided roughly 50‑50 between the two. Classes are typically defined by manufacturer, model, and year, e.g., 2012 Tesla Model S or 2012 BMW M3 coupe.
The dataset represents a texture collection of human colorectal cancer histology images. It contains 5,000 RGB images of size 150 × 150 px (74 µm × 74 µm), each belonging to one of eight tissue classes. Images were digitized at 20× magnification (0.495 µm/pixel) using an Aperio ScanScope (Leica Biosystems). The samples are formalin‑fixed, paraffin‑embedded (FFPE) colorectal adenocarcinoma tissues from pathology archives, fully anonymized and approved by an ethics committee.
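The quoted physical tile size follows directly from the pixel count and the scanner resolution. A short check, using only the two figures given in the description (the constant and function names are illustrative):

```python
MICRONS_PER_PIXEL = 0.495  # scanner resolution quoted in the description
TILE_SIDE_PIXELS = 150     # image side length in pixels


def tile_side_microns(side_pixels=TILE_SIDE_PIXELS,
                      microns_per_pixel=MICRONS_PER_PIXEL):
    """Physical side length of a square tile, in micrometres."""
    return side_pixels * microns_per_pixel


side_um = tile_side_microns()
```

The result is about 74.25 µm per side, which the description rounds to 74 µm.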
The CIFAR-100-Enriched dataset is an augmented version of the original CIFAR-100 dataset, comprising 60,000 color images of 32 × 32 pixels across 100 classes, with 600 images per class. In addition to fine‑grained labels (specific categories), it includes coarse‑grained labels (super‑categories). Augmentation includes image embeddings generated by a Vision Transformer, facilitating data analysis and model training. The dataset aims to promote data‑driven AI principles and advance research in image classification.
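One common way to use per-image embeddings for data analysis is to compare them with cosine similarity, for example to find near-duplicate or unusually isolated images. The sketch below uses short toy vectors rather than real Vision Transformer embeddings, and the function name is an assumption for illustration. It is not part of the dataset's own tooling.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for image embeddings.
parallel = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```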
These datasets are used for evaluating semi‑supervised learning in fine‑grained classification, including images of birds, fungi, the CUB‑200‑2011 bird dataset, and natural species.
This dataset is intended for training and supporting a Support Vector Machine (SVM) model to classify images of cats and dogs. It contains images of cats and dogs suitable for binary classification tasks.
Android icons collected from multiple sources and manually grouped. Note that a Misc_or_unknown class includes unusual icons that cannot be categorized.
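ImageNet-100 is a subset of the original ImageNet-1k dataset containing 100 randomly selected classes. In addition, the shorter side of each image has been resized to 160 pixels. The dataset has two main fields, image and label; labels are indexed by the synset IDs in the imagenet100.txt file. It is split into training and validation sets containing 126,689 and 5,000 samples, respectively.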
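Resizing so that the shorter side is 160 pixels can be sketched as a simple dimension calculation. This sketch assumes the aspect ratio is preserved and the longer side is rounded to the nearest pixel; the description states neither, and the function name is hypothetical.

```python
def resize_shorter_side(width, height, target=160):
    """Return (new_width, new_height) with the shorter side equal to target.

    Assumes the aspect ratio is preserved and the longer side is rounded
    to the nearest pixel; the dataset's actual preprocessing may differ.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    scale = target / min(width, height)
    if width <= height:
        return target, round(height * scale)
    return round(width * scale), target


landscape = resize_shorter_side(500, 375)
portrait = resize_shorter_side(375, 500)
```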
The "banners" dataset provides the following features: image (image); label (class label with names 0: none, 1: videograph, 2: zocalo); image_640x640 (image); cropped_image (image); and three float32 sequences, embeddings_640x640, embeddings_cropped, and embeddings. The training split has 2,369 examples (1,259,070,755.518 bytes) and the test split has 279 examples (123,028,562 bytes). The download size is 1,377,254,060 bytes and the overall dataset size is 1,382,099,317.518 bytes.
This is an image classification dataset containing images of LEGO bricks and their corresponding labels. Each image is associated with a label representing the LEGO brick number. The dataset size ranges between 100,000 and 1,000,000 records.