Dataset asset | Open Source Community | Machine Learning | Image Recognition

Stanford Dog Dataset

The Stanford Dog Dataset contains approximately 20,000 images spanning 120 dog breeds, each accompanied by annotations. It is used here to train convolutional neural network (CNN) classifiers; because the data volume is limited, transfer learning from pretrained models such as VGG16 is employed.

Source: GitHub
Created: Jan 14, 2018
Updated: Feb 3, 2023

Dataset Overview

Dataset Name

  • Stanford Dog Dataset

Dataset Content

  • Contains approximately 20,000 images covering 120 different dog breeds.
  • Each image includes corresponding annotation information.

Dataset Usage

  • Because the data volume is limited, transfer learning is applied using pretrained VGG16 and VGG16BN models.
  • The top fully‑connected and softmax layers are replaced and the remaining layers are frozen; synthetic image generation is used to increase image variability.
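
The freeze-and-replace idea described above can be illustrated with a toy, dependency-free sketch. This is not the project's actual VGG16 code: the "base" here is a tiny fixed linear layer with ReLU standing in for the pretrained convolutional layers, and the names `frozen_features`, `train_new_head`, and `predict` are hypothetical. Only the new softmax head is trained; the base weights are never modified.

```python
import math
import random


def frozen_features(x, base_weights):
    """Frozen base: a fixed linear layer followed by ReLU (never updated)."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in base_weights]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_new_head(samples, labels, base_weights, num_classes,
                   epochs=100, lr=0.5, seed=0):
    """Train a fresh softmax head on top of frozen features with plain SGD."""
    rng = random.Random(seed)
    num_features = len(base_weights)
    head = [[rng.uniform(-0.01, 0.01) for _ in range(num_features)]
            for _ in range(num_classes)]
    epoch_losses = []
    for _ in range(epochs):
        total_loss = 0.0
        for x, y in zip(samples, labels):
            feats = frozen_features(x, base_weights)
            logits = [sum(w * f for w, f in zip(row, feats)) for row in head]
            probs = softmax(logits)
            total_loss -= math.log(probs[y])
            # Cross-entropy gradient w.r.t. logit c is probs[c] - onehot[c].
            # Only the head is updated; base_weights is left untouched.
            for c in range(num_classes):
                grad = probs[c] - (1.0 if c == y else 0.0)
                for j in range(num_features):
                    head[c][j] -= lr * grad * feats[j]
        epoch_losses.append(total_loss / len(samples))
    return head, epoch_losses


def predict(x, base_weights, head):
    """Return the index of the highest-scoring class for one input."""
    feats = frozen_features(x, base_weights)
    logits = [sum(w * f for w, f in zip(row, feats)) for row in head]
    return max(range(len(head)), key=lambda c: logits[c])
```

In a real setup the frozen base would be the pretrained network's convolutional layers and the head would output 120 classes; the point of the sketch is only that gradient updates touch the new head and nothing else.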

Dataset Processing

  • The dataset is split into training, validation, and test sets, with class balance maintained across the subsets.
  • The data is converted to TFRecord format to speed up I/O.
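
One common way to keep class proportions consistent across subsets is a stratified split, sketched below. The source does not state the split ratios or the exact procedure used, so the 70/15/15 fractions, the seed, and the function name `stratified_split` are assumptions for illustration only.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.7, 0.15, 0.15), seed=0):
    """Split sample indices into train/val/test, class by class.

    `fractions` are assumed ratios, not values taken from the source.
    Returns three lists of indices into `labels`.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_train = round(len(indices) * fractions[0])
        n_val = round(len(indices) * fractions[1])
        train.extend(indices[:n_train])
        val.extend(indices[n_train:n_train + n_val])
        # Whatever remains for this class goes to the test set.
        test.extend(indices[n_train + n_val:])
    return train, val, test
```

Because each breed is shuffled and divided separately, every subset receives roughly the same share of each class, which is what keeps per-class accuracy comparable between validation and test.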

Training Results

  • After 25 epochs of training the initial CNN, training accuracy reached 94.07% and test accuracy 51.07%.
  • Using the VGG16 model for 50 epochs yielded training accuracy 97.8% and test accuracy 40.23%.
  • Applying Dropout and Batch Normalization resulted in training accuracy 88.23% and test accuracy 76.53% after 20 epochs.
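
The gap between training and test accuracy is a quick indicator of overfitting. The short sketch below computes it from the three results listed above; the row labels and the helper name `generalization_gaps` are illustrative, while the accuracy figures are copied from the list.

```python
# (description, training accuracy %, test accuracy %) as reported above
results = [
    ("Initial CNN, 25 epochs", 94.07, 51.07),
    ("VGG16, 50 epochs", 97.8, 40.23),
    ("Dropout + Batch Normalization, 20 epochs", 88.23, 76.53),
]


def generalization_gaps(rows):
    """Return (description, train minus test accuracy) in percentage points."""
    return [(name, round(train - test, 2)) for name, train, test in rows]


for name, gap in generalization_gaps(results):
    print(f"{name}: gap = {gap:.2f} percentage points")
```

The gaps come out to 43.00, 57.57, and 11.70 percentage points respectively, so the run with Dropout and Batch Normalization has both the highest test accuracy and by far the smallest train/test gap.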

Prediction and Analysis

  • The YOLO algorithm is used for object detection; accuracy is evaluated by comparing predicted boxes with ground‑truth boxes using Intersection over Union (IoU).
  • Confusion matrix analysis reveals that “Silky Terrier / Yorkshire Terrier” is the most frequently misclassified dog breed pair.
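
Both evaluation steps above can be sketched in a few lines. The box format `(x_min, y_min, x_max, y_max)`, the function names, and the confusion-matrix layout (rows are true classes, columns are predicted classes) are assumptions for illustration; the source does not specify them.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def most_confused_pair(confusion, class_names):
    """Find the unordered class pair with the most mutual misclassifications.

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    best_pair, best_count = None, -1
    for i in range(len(class_names)):
        for j in range(i + 1, len(class_names)):
            count = confusion[i][j] + confusion[j][i]
            if count > best_count:
                best_pair, best_count = (class_names[i], class_names[j]), count
    return best_pair, best_count


# Usage with made-up numbers (not results from the dataset):
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
toy_confusion = [[5, 4, 0], [3, 6, 1], [0, 1, 9]]
print(most_confused_pair(toy_confusion, ["A", "B", "C"]))
```

Summing `confusion[i][j]` and `confusion[j][i]` treats confusion as symmetric, which matches reporting a breed pair rather than a one-directional error.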