
Flickr30K, COCO

Flickr30K is a multimodal dataset of images paired with text captions, used for training and validating algorithms that mine image-text similarity. COCO is a large-scale image dataset used primarily for object detection, segmentation, and image captioning.

Source

GitHub

Created

Jun 7, 2024

Updated

Jun 30, 2024

Overview

Dataset description and usage context

LoRS: Low‑Rank Similarity Mining

Datasets

Flickr30K:
- Training set: link
- Validation set: link
- Test set: link
- Images: link

COCO:
- Training set: link
- Validation set: link
- Test set: link
- Images: link

Dataset Storage Structure

./distill_utils/data/
├── Flickr30k/
│   ├── flickr30k-images/
│   │   ├── 1234.jpg
│   │   └── ......
│   ├── results_20130124.token
│   └── readme.txt
└── COCO/
    ├── train2014/
    ├── val2014/
    └── test2014/
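
The layout above can be checked before training with a short script. The following is a minimal sketch, not part of the LoRS code: the function names `missing_entries` and `parse_token_lines` are hypothetical, and the parser assumes that `results_20130124.token` uses the commonly distributed format of one `image_name#index<TAB>caption` entry per line.

```python
from collections import defaultdict
from pathlib import Path

# Relative paths taken from the directory tree above.
EXPECTED_ENTRIES = [
    "Flickr30k/flickr30k-images",
    "Flickr30k/results_20130124.token",
    "COCO/train2014",
    "COCO/val2014",
    "COCO/test2014",
]


def missing_entries(root):
    """Return the expected entries that do not exist under `root`."""
    root = Path(root)
    return [entry for entry in EXPECTED_ENTRIES if not (root / entry).exists()]


def parse_token_lines(lines):
    """Group captions by image name.

    Assumes each line looks like 'name.jpg#0<TAB>caption text'.
    Lines without a tab (including blank lines) are skipped.
    """
    captions = defaultdict(list)
    for line in lines:
        key, sep, caption = line.rstrip("\n").partition("\t")
        if not sep:
            continue
        image_name = key.partition("#")[0]
        captions[image_name].append(caption.strip())
    return dict(captions)


if __name__ == "__main__":
    data_root = Path("./distill_utils/data")
    absent = missing_entries(data_root)
    if absent:
        print("Missing entries:", absent)
    else:
        token_file = data_root / "Flickr30k" / "results_20130124.token"
        with token_file.open(encoding="utf-8") as handle:
            captions = parse_token_lines(handle)
        print(f"Loaded captions for {len(captions)} images")
```

Running the script from the repository root reports any missing folder first, so a misplaced download is caught before the caption file is read.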
