Open Images dataset
Open Images is a dataset of approximately 9 million image URLs annotated with labels spanning over 6 000 categories. The dataset is split into a training set and a validation set, and each image may carry one or more labels. The annotations are distributed as CSV files.
Updated 1/9/2017
Description
Open Images Dataset Overview
Dataset Description
- Scale: Approximately 9 million image URLs.
- Labels: Contains over 6 000 categories.
- Label type: Each label is identified by a MID (machine-generated ID), as used in Freebase and the Google Knowledge Graph API.
- Number of labels: 7 844 distinct labels, of which about 6 000 are considered trainable.
- Images: 9 018 219 training images, 167 057 validation images.
- Annotation: Each image may have one or multiple labels; annotations are provided in CSV files.
Dataset Content
- Training set: 9 018 219 images with machine‑generated annotations.
- Validation set: 167 057 images with both machine‑generated and human annotations.
- Annotation confidence: Human annotations carry a confidence of 1.0 for positive labels and 0.0 for negative labels; machine annotations carry a confidence between 0.0 and 1.0.
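The confidence convention above can be illustrated with a minimal sketch. The `Annotation` tuple, its field names, the sample MIDs, and the 0.5 threshold are all assumptions made for illustration; they are not part of the dataset's published schema.

```python
from typing import NamedTuple


class Annotation(NamedTuple):
    # Hypothetical field names, not the official schema.
    image_id: str
    source: str        # "human" or "machine"
    label: str         # a label MID (the values below are made up)
    confidence: float  # 1.0 or 0.0 for human, 0.0..1.0 for machine


def interpret(ann: Annotation, machine_threshold: float = 0.5) -> str:
    """Map an annotation to 'positive', 'negative', or 'uncertain'."""
    if ann.source == "human":
        # Human labels are binary: 1.0 means present, 0.0 means absent.
        return "positive" if ann.confidence == 1.0 else "negative"
    # Machine labels are soft scores; the threshold is an arbitrary choice.
    return "positive" if ann.confidence >= machine_threshold else "uncertain"


samples = [
    Annotation("img001", "human", "/m/0abc1", 1.0),
    Annotation("img001", "human", "/m/0abc2", 0.0),
    Annotation("img002", "machine", "/m/0abc1", 0.8),
    Annotation("img002", "machine", "/m/0abc3", 0.2),
]
results = [interpret(a) for a in samples]
print(results)
```

A low machine score is treated here as "uncertain" rather than "negative", since a weak machine prediction is not the same as a human-verified absence.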
Dataset Organization
- Files: Two CSV files: images.csv (contains image URLs, OpenImages IDs, titles, authors, and licenses) and labels.csv (maps labels to image IDs with confidence scores).
- Download:
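As a rough sketch of how the two files fit together, the example below joins label rows to image rows by image ID using Python's standard `csv` module. The column names (`ImageID`, `URL`, `LabelName`, `Confidence`, and so on), the sample rows, and the 0.5 cutoff are assumptions for illustration; check the headers of the real files before adapting it.

```python
import csv
import io
from collections import defaultdict

# Tiny in-memory stand-ins for images.csv and labels.csv.
# Column names and rows are assumed, not taken from the real files.
IMAGES_CSV = """ImageID,URL,Title,Author,License
img001,https://example.com/a.jpg,Beach,Alice,CC-BY
img002,https://example.com/b.jpg,Forest,Bob,CC-BY
"""

LABELS_CSV = """ImageID,Source,LabelName,Confidence
img001,machine,/m/0abc1,0.9
img001,machine,/m/0abc2,0.4
img002,human,/m/0abc1,1.0
"""


def load_labels(fileobj, min_confidence=0.5):
    """Group (label, confidence) pairs by image ID, dropping weak labels."""
    labels = defaultdict(list)
    for row in csv.DictReader(fileobj):
        confidence = float(row["Confidence"])
        if confidence >= min_confidence:
            labels[row["ImageID"]].append((row["LabelName"], confidence))
    return labels


def join_images(images_file, labels_by_image):
    """Yield one record per image with its URL and retained labels."""
    for row in csv.DictReader(images_file):
        yield {
            "id": row["ImageID"],
            "url": row["URL"],
            "labels": labels_by_image.get(row["ImageID"], []),
        }


labels_by_image = load_labels(io.StringIO(LABELS_CSV))
records = list(join_images(io.StringIO(IMAGES_CSV), labels_by_image))
for record in records:
    print(record["id"], record["url"], record["labels"])
```

With the real files, the `io.StringIO` objects would be replaced by files opened with `open(path, newline="")`. Streaming rows rather than loading everything at once matters at this scale, although the label index itself still has to fit in memory.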
Data Quality
- Label distribution: Highly imbalanced; some labels appear in over a million images, others in fewer than 100.
- Annotation accuracy: Machine annotations are noisy, but per-label accuracy tends to improve as the number of images associated with a label grows.
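One way to inspect the imbalance described above is to count how many images carry each label and flag the rare ones. This is a minimal sketch on made-up (image ID, label) pairs; the helper names and the `min_images` cutoff are hypothetical choices, not something the dataset prescribes.

```python
from collections import Counter


def label_frequencies(pairs):
    """Count how many distinct images carry each label."""
    unique_pairs = set(pairs)  # ignore repeated (image, label) rows
    return Counter(label for _, label in unique_pairs)


def rare_labels(freqs, min_images):
    """Return labels that appear on fewer than min_images images."""
    return sorted(label for label, n in freqs.items() if n < min_images)


# Made-up sample data; the last pair duplicates the first on purpose.
pairs = [
    ("img1", "/m/a"),
    ("img2", "/m/a"),
    ("img3", "/m/a"),
    ("img3", "/m/b"),
    ("img1", "/m/a"),
]

freqs = label_frequencies(pairs)
imbalance_ratio = max(freqs.values()) / min(freqs.values())
print(freqs.most_common())
print("rare:", rare_labels(freqs, min_images=2))
print("max/min ratio:", imbalance_ratio)
```

On the full dataset, a cutoff such as 100 images per label would surface the long tail mentioned above, which may need reweighting or exclusion during training.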
Model Applications
- Models such as Inception v3 have been trained on the Open Images annotations. They can be fine‑tuned for downstream tasks or used for applications such as DeepDream and style transfer.
Topics
Image Recognition
Machine Learning
Source
Organization: github
Created: 10/1/2016