whyen-wang/coco_captions
COCO is a large-scale dataset for object detection, segmentation, and captioning, primarily used for image-to-text tasks. The dataset provides English captions, each image being associated with multiple textual descriptions. Detailed information about dataset creation, annotation processes, or social impact is not supplied.
Dataset Card: COCO Captions
Dataset Description
Dataset Overview
COCO Captions is a large-scale dataset for object detection, segmentation, and caption generation.
Supported Tasks and Leaderboards
- Image to Text
Language
- English (en)
Dataset Structure
Data Instances
An example data instance is shown below:
{
"image": PIL.Image(mode="RGB"),
"captions": [
"Closeup of bins of food that include broccoli and bread.",
"A meal is presented in brightly colored plastic trays.",
"there are containers filled with different kinds of foods",
"Colorful dishes holding meat, vegetables, fruit, and bread.",
"A bunch of trays that have different food."
]
}
Data Fields
- Image (image): a PIL.Image object
- Captions (captions): a list containing multiple captions
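Because each record pairs one image with several captions, image-to-text training code often expands a record into one (image, caption) pair per caption. The following is a minimal sketch of that expansion, not part of the dataset's own tooling: the helper name `iter_image_caption_pairs` is hypothetical, and a plain `object()` stands in for the PIL image so the snippet runs without any image library.

```python
from typing import Any, Dict, Iterator, Tuple


def iter_image_caption_pairs(record: Dict[str, Any]) -> Iterator[Tuple[Any, str]]:
    """Yield one (image, caption) pair per non-empty caption in a record.

    Hypothetical helper: assumes the record has the two fields listed
    above, "image" and "captions".
    """
    image = record["image"]
    for caption in record["captions"]:
        text = caption.strip()
        if text:  # skip empty or whitespace-only captions
            yield image, text


# Stand-in record mirroring the example instance; object() replaces the PIL image.
example = {
    "image": object(),
    "captions": [
        "Closeup of bins of food that include broccoli and bread.",
        "A meal is presented in brightly colored plastic trays.",
        "there are containers filled with different kinds of foods",
        "Colorful dishes holding meat, vegetables, fruit, and bread.",
        "A bunch of trays that have different food.",
    ],
}

pairs = list(iter_image_caption_pairs(example))
print(len(pairs))  # 5
```

Every pair shares the same image object, so no image data is copied; only the caption differs between pairs.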
Data Splits
| Configuration | Train | Validation |
|---|---|---|
| Default | 118,287 | 5,000 |
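As a quick check on the split sizes, the share of images in each split can be computed directly from the table. This is a small illustrative sketch; the names `SPLIT_SIZES` and `split_fractions` are assumptions introduced here, and the counts are the only values taken from the card.

```python
from typing import Dict

# Image counts copied from the Data Splits table.
SPLIT_SIZES: Dict[str, int] = {"train": 118_287, "validation": 5_000}


def split_fractions(sizes: Dict[str, int]) -> Dict[str, float]:
    """Return each split's share of the total number of images."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


total_images = sum(SPLIT_SIZES.values())  # 123287
fractions = split_fractions(SPLIT_SIZES)

for name, frac in fractions.items():
    print(f"{name}: {SPLIT_SIZES[name]:,} images ({frac:.1%})")
```

The validation split is a small fraction of the whole, so metrics computed on it rest on far fewer images than the training set provides.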
Dataset Creation
Rationale
[More information to be added]
Source Data
Initial Collection and Normalization
[More information to be added]
Source Language Producers
[More information to be added]
Annotation
Annotation Process
[More information to be added]
Annotators
[More information to be added]
Personal and Sensitive Information
[More information to be added]
Considerations for Using the Data
Societal Impact
[More information to be added]
Discussion of Bias
[More information to be added]
Other Known Limitations
[More information to be added]
Additional Information
Curators
[More information to be added]
License
Creative Commons Attribution 4.0 License
Citation Information
@article{cocodataset,
author = {Tsung{-}Yi Lin and Michael Maire and Serge J. Belongie and Lubomir D. Bourdev and Ross B. Girshick and James Hays and Pietro Perona and Deva Ramanan and Piotr Doll{\'a}r and C. Lawrence Zitnick},
title = {Microsoft {COCO:} Common Objects in Context},
journal = {CoRR},
volume = {abs/1405.0312},
year = {2014},
url = {http://arxiv.org/abs/1405.0312},
archivePrefix = {arXiv},
eprint = {1405.0312},
timestamp = {Mon, 13 Aug 2018 16:48:13 +0200},
biburl = {https://dblp.org/rec/bib/journals/corr/LinMBHPRDZ14},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
Contributions
Thanks to @github-whyen-wang for adding this dataset.