Stanford Dog Dataset
The Stanford Dog Dataset contains approximately 20,000 images spanning 120 categories, with each image accompanied by corresponding annotations. The dataset is used to train convolutional neural network (CNN) classifiers; due to the limited data volume, transfer learning techniques are employed, utilizing pretrained models such as VGG16.
Updated 2/3/2023
Description
Dataset Overview
Dataset Name
- Stanford Dog Dataset
Dataset Content
- Contains approximately 20,000 images covering 120 different dog breeds.
- Each image includes corresponding annotation information.
Dataset Usage
- Due to the limited data volume, transfer learning techniques are employed, using pretrained VGG16 and VGG16BN models.
- The top fully connected layer and softmax layer are replaced and the remaining layers are frozen; synthetic image generation is used to increase image variability.
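The transfer learning setup described above can be sketched as follows. This is a minimal illustration, not the original project's code: it assumes TensorFlow's Keras API, and the function name `build_transfer_model`, the 224x224 input size, the 256-unit hidden layer, and the optimizer are hypothetical choices. The synthetic image generation step is not shown.

```python
NUM_CLASSES = 120  # breed count stated in the dataset description


def build_transfer_model(num_classes=NUM_CLASSES, weights="imagenet"):
    """Return a VGG16-based classifier with a frozen convolutional base."""
    # Imported lazily so this module can be loaded without TensorFlow installed.
    import tensorflow as tf

    base = tf.keras.applications.VGG16(
        include_top=False,         # drop the original fully connected top
        weights=weights,           # "imagenet" downloads pretrained weights
        input_shape=(224, 224, 3), # assumed input size
    )
    base.trainable = False         # freeze every pretrained layer

    model = tf.keras.Sequential([
        base,
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(256, activation="relu"),  # hypothetical width
        tf.keras.layers.Dense(num_classes, activation="softmax"),  # new top
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Only the two new `Dense` layers are trained in this sketch, which is one common way to limit overfitting when the dataset is small relative to the size of the pretrained network.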
Dataset Processing
- The dataset is split into training, validation, and test sets, with class proportions kept balanced across the subsets.
- The data is converted to the TFRecord format to speed up I/O operations.
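One way to obtain a split with balanced class proportions is a per-class (stratified) split. The sketch below uses only the Python standard library; the function name `stratified_split` and the 15% validation and test fractions are assumptions, since the page does not state the ratios that were actually used. The TFRecord conversion is not shown.

```python
import random
from collections import defaultdict


def stratified_split(labels, val_frac=0.15, test_frac=0.15, seed=0):
    """Split sample indices so each class keeps the same share in every subset."""
    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, val, test = [], [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_val = round(len(indices) * val_frac)
        n_test = round(len(indices) * test_frac)
        val.extend(indices[:n_val])
        test.extend(indices[n_val:n_val + n_test])
        train.extend(indices[n_val + n_test:])
    return train, val, test
```

Calling `stratified_split(labels)` with one label per image returns three disjoint index lists that can then be used to select files for each subset.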
Training Results
- After 25 epochs of training the initial CNN, training accuracy reached 94.07% and test accuracy 51.07%.
- Using the VGG16 model for 50 epochs yielded training accuracy 97.8% and test accuracy 40.23%.
- Applying Dropout and Batch Normalization resulted in training accuracy 88.23% and test accuracy 76.53% after 20 epochs.
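The gap between training and test accuracy is a rough indicator of overfitting, and it can be computed directly from the figures above. The snippet below is only a worked arithmetic example; the dictionary keys and the function name `generalization_gaps` are illustrative labels, not names from the original project.

```python
# (training accuracy %, test accuracy %) copied from the reported results.
RESULTS = {
    "initial CNN, 25 epochs": (94.07, 51.07),
    "VGG16, 50 epochs": (97.8, 40.23),
    "Dropout + Batch Normalization, 20 epochs": (88.23, 76.53),
}


def generalization_gaps(results):
    """Return training minus test accuracy, in percentage points, per run."""
    return {
        name: round(train - test, 2)
        for name, (train, test) in results.items()
    }


for name, gap in generalization_gaps(RESULTS).items():
    print(f"{name}: gap = {gap:.2f} points")
```

The gaps come out to 43.00, 57.57, and 11.70 percentage points respectively, so the run with Dropout and Batch Normalization shows by far the smallest gap of the three.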
Prediction and Analysis
- The YOLO algorithm is used for object detection; accuracy is evaluated by comparing predicted boxes with ground‑truth boxes using Intersection over Union (IoU).
- Confusion matrix analysis reveals that “Silky Terrier / Yorkshire Terrier” is the most frequently misclassified dog breed pair.
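Both evaluation steps above reduce to short computations. The following sketch assumes boxes are given as `(x_min, y_min, x_max, y_max)` tuples and that labels are plain strings; the function names `iou` and `most_confused_pair` are hypothetical and the code is not taken from the original project.

```python
from collections import Counter


def iou(box_a, box_b):
    """Intersection over Union of two (x_min, y_min, x_max, y_max) boxes."""
    # Corners of the overlapping rectangle, if any.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes contribute no intersection area.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def most_confused_pair(y_true, y_pred):
    """Return the unordered class pair with the most mutual errors and its count."""
    counts = Counter()
    for true_label, pred_label in zip(y_true, y_pred):
        if true_label != pred_label:
            # frozenset merges A->B and B->A errors into one pair.
            counts[frozenset((true_label, pred_label))] += 1
    if not counts:
        return None, 0
    pair, n = counts.most_common(1)[0]
    return tuple(sorted(pair)), n
```

For example, `iou((0, 0, 2, 2), (1, 1, 3, 3))` has an intersection of 1 and a union of 7, giving 1/7. Counting errors in both directions is a design choice here; a per-direction count would instead keep `(true, predicted)` tuples as keys.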
Topics
Image Recognition
Machine Learning
Source
Organization: github
Created: 1/14/2018