Dataset asset | Open Source Community | Computer Vision | Multimodal Learning
Flickr30K, COCO
Flickr30K is a multimodal dataset containing images and text, used for training and validating algorithms that mine similarity between images and text. COCO is a large, rich image dataset primarily used for object detection, segmentation, and image captioning tasks.
Source: github
Created: Jun 7, 2024
Updated: Jun 30, 2024
Overview
Dataset description and usage context
LoRS: Low‑Rank Similarity Mining
Datasets
- Flickr30K
- COCO
Dataset Storage Structure
./distill_utils/data/
├── Flickr30k/
│   ├── flickr30k-images/
│   │   ├── 1234.jpg
│   │   └── ......
│   ├── results_20130124.token
│   └── readme.txt
└── COCO/
    ├── train2014/
    ├── val2014/
    └── test2014/
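
Before running anything that reads these folders, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the LoRS code: the `EXPECTED` list and the `missing_entries` helper are hypothetical names introduced here, and the paths are copied from the tree above (`readme.txt` is left out on the assumption that no code depends on it).

```python
from pathlib import Path

# Paths relative to the data root, copied from the storage structure above.
# EXPECTED and missing_entries are illustrative names, not part of LoRS.
EXPECTED = [
    "Flickr30k/flickr30k-images",
    "Flickr30k/results_20130124.token",
    "COCO/train2014",
    "COCO/val2014",
    "COCO/test2014",
]


def missing_entries(root="./distill_utils/data"):
    """Return the expected relative paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_entries()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("Layout matches the expected structure.")
```

Running the script from the repository root prints either the missing entries or a confirmation line. Pass a different `root` to `missing_entries` if the data lives elsewhere.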