whyen-wang/coco_captions
COCO is a large-scale dataset for object detection, segmentation, and captioning, primarily used for image-to-text tasks. The dataset provides English captions, each image being associated with multiple textual descriptions. Detailed information about dataset creation, annotation processes, or social impact is not supplied.
Dataset Card: COCO Captions
Dataset Description
Dataset Overview
COCO Captions is the captioning portion of COCO, a large-scale dataset for object detection, segmentation, and captioning. Each instance pairs one image with several English captions.
Supported Tasks and Leaderboards
- Image to Text
Language
- English (en)
Dataset Structure
Data Instances
An example data instance is shown below:
{
"image": PIL.Image(mode="RGB"),
"captions": [
"Closeup of bins of food that include broccoli and bread.",
"A meal is presented in brightly colored plastic trays.",
"there are containers filled with different kinds of foods",
"Colorful dishes holding meat, vegetables, fruit, and bread.",
"A bunch of trays that have different food."
]
}
Data Fields
- Image (image): a PIL.Image object
- Captions (captions): a list containing multiple captions
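Because each instance carries a list of captions, image-to-text training code often needs one (image, caption) pair per caption. The following is a minimal sketch of that flattening step, not part of the dataset itself. The helper name `iter_caption_pairs` is hypothetical, and a placeholder string stands in for the PIL.Image object so the example runs without downloading anything.

```python
from typing import Any, Dict, Iterator, Tuple


def iter_caption_pairs(instance: Dict[str, Any]) -> Iterator[Tuple[Any, str]]:
    """Yield one (image, caption) pair per caption in a single instance."""
    image = instance["image"]
    for caption in instance["captions"]:
        # Strip stray whitespace; the caption text is otherwise left untouched.
        yield image, caption.strip()


# Stand-in instance: the string below replaces the PIL.Image object
# (an assumption made only to keep the sketch self-contained).
example = {
    "image": "<PIL.Image placeholder>",
    "captions": [
        "Closeup of bins of food that include broccoli and bread.",
        "A meal is presented in brightly colored plastic trays.",
    ],
}

pairs = list(iter_caption_pairs(example))
for image, caption in pairs:
    print(caption)
```

With the Hugging Face `datasets` library, instances would typically come from `load_dataset("whyen-wang/coco_captions", split="train")`. Whether this repository loads with default arguments is not stated in the card, so treat that call as an assumption.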
Data Splits
| Split | Train | Validation |
|---|---|---|
| Default | 118,287 | 5,000 |
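As a quick check on the table above, the split sizes can be turned into proportions. This is a small arithmetic sketch using only the two counts listed. The variable names are illustrative.

```python
# Image counts taken from the Data Splits table.
split_sizes = {"train": 118_287, "validation": 5_000}

total_images = sum(split_sizes.values())

# Fraction of all images that falls in each split.
split_fractions = {name: size / total_images for name, size in split_sizes.items()}

for name, fraction in split_fractions.items():
    print(f"{name}: {split_sizes[name]:,} images ({fraction:.2%})")
```

The validation split holds roughly 4% of the 123,287 images, so validation metrics are computed on a much smaller sample than the training set.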
Dataset Creation
Rationale
[More information to be added]
Source Data
Initial Collection and Normalization
[More information to be added]
Source Language Producers
[More information to be added]
Annotation
Annotation Process
[More information to be added]
Annotators
[More information to be added]
Personal and Sensitive Information
[More information to be added]
Considerations for Using the Data
Societal Impact
[More information to be added]
Discussion of Bias
[More information to be added]
Other Known Limitations
[More information to be added]
Additional Information
Curators
[More information to be added]
License
Creative Commons Attribution 4.0 License
Citation Information
@article{cocodataset,
author = {Tsung{-}Yi Lin and Michael Maire and Serge J. Belongie and Lubomir D. Bourdev and Ross B. Girshick and James Hays and Pietro Perona and Deva Ramanan and Piotr Doll{\'{a}}r and C. Lawrence Zitnick},
title = {Microsoft {COCO:} Common Objects in Context},
journal = {CoRR},
volume = {abs/1405.0312},
year = {2014},
url = {http://arxiv.org/abs/1405.0312},
archivePrefix = {arXiv},
eprint = {1405.0312},
timestamp = {Mon, 13 Aug 2018 16:48:13 +0200},
biburl = {https://dblp.org/rec/bib/journals/corr/LinMBHPRDZ14},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
Contributions
Thanks to @github-whyen-wang for adding this dataset.
Source
Organization: hugging_face
Created: Unknown