renumics/cifar100-enriched

The CIFAR-100-Enriched dataset is an enriched version of the original CIFAR-100 dataset, comprising 60,000 color images of 32 × 32 pixels across 100 classes, with 600 images per class. In addition to fine-grained labels (specific categories), it includes coarse-grained labels (super-categories). The enrichment includes image embeddings generated by a Vision Transformer, which facilitate data analysis and model training. The dataset aims to promote data-centric AI principles and advance research in image classification.

Source

Hugging Face

Created

Nov 28, 2025

Updated

Jun 6, 2023

Overview

Dataset description and usage context

Dataset Overview

Dataset Name: CIFAR-100-Enriched
Dataset Version: Enriched version provided by Renumics
Dataset Category: Image Classification
Dataset Size: 10K < n < 100K
Dataset Tags: image classification, cifar-100, cifar-100-enriched, embeddings, enhanced, spotlight, renumics
Language: English
Multilinguality: Monolingual
Annotation Creators: Crowdsourced
Language Creators: N/A

Detailed Description

Summary: This dataset is an enriched version of CIFAR-100. It adds image embeddings and other application-specific enrichment to help the machine-learning community better understand and apply data-centric AI principles.
Content: Contains 60,000 32×32 colour images across 100 classes, with 600 images per class (500 for training and 100 for testing). In total, the dataset includes 50,000 training images and 10,000 test images. Each image has a "fine" label (its specific class) and a "coarse" label (its superclass).
Enrichment: Enhanced with image embeddings generated by a Vision Transformer.
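
As a rough illustration of how such embeddings can support data analysis, the sketch below ranks items by cosine similarity to a query vector. It is not taken from the dataset's tooling: the function names and the 3-dimensional toy vectors are assumptions made for this example, and real Vision Transformer embeddings have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbors(query, embeddings, k=3):
    """Return the indices of the k embeddings most similar to the query."""
    scored = [(cosine_similarity(query, e), i) for i, e in enumerate(embeddings)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]


# Toy stand-ins for image embeddings (hypothetical values, not real data).
toy_embeddings = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

print(nearest_neighbors([1.0, 0.05, 0.0], toy_embeddings, k=2))  # [0, 1]
```

Near neighbors that carry different labels are a common starting point when looking for ambiguous or mislabeled images.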
Structure:
- Data Instances: Each instance includes image path, raw image, fine label, coarse label, predictions, etc.
- Fields: Image, raw image, labels, predictions, embeddings, etc.
- Splits: 50,000 training images, 10,000 test images.
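
The structure above can be sketched as a small record type. This is only an illustration: the field names below are hypothetical stand-ins for the fields listed, not the dataset's actual column names, and the toy rows are made up. The sketch also checks the split arithmetic and shows one way the stored predictions could be used to count errors per coarse label.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Record:
    # Hypothetical field names; the real column names may differ.
    image_path: str
    fine_label: int       # index of one of the 100 fine classes
    coarse_label: int     # index of the superclass
    fine_prediction: int  # a model's predicted fine class


def misclassified(records):
    """Records whose predicted fine class differs from the fine label."""
    return [r for r in records if r.fine_prediction != r.fine_label]


def errors_per_coarse_label(records):
    """Count misclassified records grouped by coarse label."""
    return Counter(r.coarse_label for r in misclassified(records))


# Split arithmetic from the description: 100 classes x 600 images each.
NUM_CLASSES, IMAGES_PER_CLASS = 100, 600
TRAIN_SIZE, TEST_SIZE = 50_000, 10_000
assert NUM_CLASSES * IMAGES_PER_CLASS == TRAIN_SIZE + TEST_SIZE

# Toy rows with made-up labels, for illustration only.
toy = [
    Record("img_0.png", 3, 1, 3),
    Record("img_1.png", 7, 1, 9),
    Record("img_2.png", 42, 8, 42),
    Record("img_3.png", 15, 4, 11),
]

print(errors_per_coarse_label(toy))  # Counter({1: 1, 4: 1})
```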

Usage

Exploration: Use Renumics Spotlight to quickly analyze and explore the dataset.
Supported Tasks: Image classification – assign each image to one of 100 categories.
Language: English labels.
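
A minimal sketch of the exploration workflow follows. It assumes the `datasets` and `renumics-spotlight` packages are installed and that the Hub identifier matches the title of this page. The helper names `load_split` and `explore` are hypothetical, and Spotlight may need additional column type hints to render images and embeddings well.

```python
DATASET_ID = "renumics/cifar100-enriched"  # assumed Hub identifier, taken from the page title


def load_split(split="train"):
    """Download one split from the Hugging Face Hub and return it as a pandas DataFrame."""
    import datasets  # third-party: pip install datasets

    return datasets.load_dataset(DATASET_ID, split=split).to_pandas()


def explore(df):
    """Open the DataFrame in the Renumics Spotlight viewer."""
    from renumics import spotlight  # third-party: pip install renumics-spotlight

    spotlight.show(df)
```

Calling `explore(load_split("train"))` would download the training split and open the viewer; the third-party imports sit inside the functions so that the definitions can be loaded without network access.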

Creation

Source: CIFAR-10 and CIFAR-100 are subsets of the 80 million tiny images dataset, collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton.
Contributors: Alex Krizhevsky, Vinod Nair, Geoffrey Hinton, Renumics GmbH.

Citation

@article{krizhevsky2009learning,
  added-at = {2021-01-21T03:01:11.000+0100},
  author = {Krizhevsky, Alex},
  biburl = {https://www.bibsonomy.org/bibtex/2fe5248afe57647d9c85c50a98a12145c/s364315},
  interhash = {cc2d42f2b7ef6a4e76e47d1a50c8cd86},
  intrahash = {fe5248afe57647d9c85c50a98a12145c},
  pages = {32--33},
  timestamp = {2021-01-21T03:01:11.000+0100},
  title = {Learning Multiple Layers of Features from Tiny Images},
  url = {https://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf},
  year = 2009
}
