renumics/cifar100-enriched

The CIFAR-100-Enriched dataset is an enriched version of the original CIFAR-100 dataset, which comprises 60,000 color images of 32 × 32 pixels across 100 classes, with 600 images per class. Each image carries a fine‑grained label (its specific category) and a coarse‑grained label (its super‑category). The enrichment adds image embeddings generated by a Vision Transformer, which support data analysis and model training. The dataset aims to promote data‑centric AI principles and advance research in image classification.

Updated 6/6/2023
hugging_face

Description

Dataset Overview

  • Dataset Name: CIFAR-100-Enriched
  • Dataset Version: Enriched version provided by Renumics
  • Dataset Category: Image Classification
  • Dataset Size: 10K < n < 100K
  • Dataset Tags: image classification, cifar-100, cifar-100-enriched, embeddings, enhanced, spotlight, renumics
  • Language: English
  • Multilinguality: Monolingual
  • Annotation Creators: Crowdsourced
  • Language Creators: N/A

Detailed Description

  • Summary: This dataset is an enriched version of CIFAR-100, adding image embeddings and other application‑specific enrichment to help the machine‑learning community better understand and apply data‑centric AI principles.
  • Content: Contains 60,000 32×32 color images across 100 classes, with 600 images per class. The dataset includes 50,000 training images and 10,000 test images (500 training and 100 test images per class). Each image has a "fine" label and a "coarse" label.
  • Enrichment: Enhanced with image embeddings generated by a Vision Transformer.
  • Structure:
    • Data Instances: Each instance includes image path, raw image, fine label, coarse label, predictions, etc.
    • Fields: Image, raw image, labels, predictions, embeddings, etc.
    • Splits: 50,000 training images, 10,000 test images.
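
As an illustration of how the labels, predictions, and embeddings listed above might be used together, here is a minimal sketch on toy data. The field names (`fine_label`, `coarse_label`, `fine_label_prediction`, `embedding`) and the three-dimensional vectors are assumptions made for the example and are not confirmed by this page; the real embeddings come from a Vision Transformer and are much larger. The sketch flags rows where the prediction disagrees with the fine label and finds the most similar row by cosine similarity.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List


@dataclass
class Instance:
    """Toy stand-in for one dataset row; field names are hypothetical."""
    fine_label: int
    coarse_label: int
    fine_label_prediction: int
    embedding: List[float]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mismatches(rows):
    """Indices of rows whose prediction disagrees with the fine label."""
    return [i for i, r in enumerate(rows) if r.fine_label_prediction != r.fine_label]


def nearest_neighbour(rows, query_index):
    """Index of the other row whose embedding is most similar to the query row."""
    query = rows[query_index].embedding
    candidates = [i for i in range(len(rows)) if i != query_index]
    return max(candidates, key=lambda i: cosine_similarity(query, rows[i].embedding))


# Three made-up rows: the second one is "misclassified" on purpose.
rows = [
    Instance(0, 0, 0, [1.0, 0.0, 0.0]),
    Instance(0, 0, 3, [0.9, 0.1, 0.0]),
    Instance(7, 2, 7, [0.0, 1.0, 0.0]),
]
```

With these toy rows, `mismatches(rows)` flags only the second row, and `nearest_neighbour(rows, 0)` points at that same row, which is the kind of "similar image, different prediction" case that embeddings make easy to surface.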

Usage

  • Exploration: Use Renumics Spotlight to quickly analyze and explore the dataset.
  • Supported Tasks: Image classification – assign each image to one of 100 categories.
  • Language: English labels.
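
The exploration step could look roughly like the sketch below, assuming the dataset is published on the Hugging Face Hub under the identifier `renumics/cifar100-enriched` and that the `datasets` and `renumics-spotlight` packages are installed. The column names `image` and `embedding` are assumptions and should be checked against the loaded data before use. The helper names `load_split` and `explore` are invented for this example.

```python
DATASET_ID = "renumics/cifar100-enriched"  # assumed Hub identifier


def load_split(split="train"):
    """Download one split ("train" or "test") from the Hugging Face Hub."""
    import datasets  # pip install datasets

    return datasets.load_dataset(DATASET_ID, split=split)


def explore(split="train"):
    """Open the chosen split in Renumics Spotlight (starts a local viewer)."""
    from renumics import spotlight  # pip install renumics-spotlight

    df = load_split(split).to_pandas()
    # Column names below are assumptions; inspect df.columns first.
    spotlight.show(
        df,
        dtype={"image": spotlight.Image, "embedding": spotlight.Embedding},
    )
```

Calling `explore("test")` would download the test split and open it in the viewer. The imports sit inside the functions so the module can be imported even when the optional packages are missing.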

Creation

  • Source: CIFAR-10 and CIFAR-100 are labeled subsets of the 80 Million Tiny Images dataset. They were collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton.
  • Contributors: Alex Krizhevsky, Vinod Nair, Geoffrey Hinton, Renumics GmbH.

Citation

@article{krizhevsky2009learning,
  added-at = {2021-01-21T03:01:11.000+0100},
  author = {Krizhevsky, Alex},
  biburl = {https://www.bibsonomy.org/bibtex/2fe5248afe57647d9c85c50a98a12145c/s364315},
  interhash = {cc2d42f2b7ef6a4e76e47d1a50c8cd86},
  intrahash = {fe5248afe57647d9c85c50a98a12145c},
  pages = {32--33},
  timestamp = {2021-01-21T03:01:11.000+0100},
  title = {Learning Multiple Layers of Features from Tiny Images},
  url = {https://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf},
  year = 2009
}

Topics

Image Classification
Data Augmentation

Source

Organization: hugging_face

Created: Unknown
