Dataset asset | Open Source Community | Computer Vision | Multimodal Learning

Flickr30K, COCO

Flickr30K is a multimodal dataset of images paired with text captions, used for training and validating algorithms that mine similarity between images and text. COCO is a large-scale image dataset used primarily for object detection, segmentation, and image captioning.

Source: github
Created: Jun 7, 2024
Updated: Jun 30, 2024
Overview

Dataset description and usage context

LoRS: Low‑Rank Similarity Mining

Datasets

Dataset Storage Structure

./distill_utils/data/
├── Flickr30k/
│   ├── flickr30k-images/
│   │   ├── 1234.jpg
│   │   └── ......
│   ├── results_20130124.token
│   └── readme.txt
└── COCO/
    ├── train2014/
    ├── val2014/
    └── test2014/
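
Before training, it can help to confirm that the data directory matches the layout above. The following is a minimal sketch, not part of LoRS itself: the helper names `missing_entries` and `parse_caption_line` are hypothetical, and the caption line format (`<image>.jpg#<n>`, a tab, then the caption) is an assumption about `results_20130124.token` that should be checked against the downloaded file.

```python
from pathlib import Path

# Expected entries, copied from the storage structure above.
# readme.txt is left out because it is documentation rather than data.
EXPECTED = [
    "Flickr30k/flickr30k-images",
    "Flickr30k/results_20130124.token",
    "COCO/train2014",
    "COCO/val2014",
    "COCO/test2014",
]


def missing_entries(root):
    """Return the expected relative paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


def parse_caption_line(line):
    """Split one caption line into (image_name, caption_index, caption).

    Assumption: each line looks like '<image>.jpg#<n><TAB><caption>'.
    """
    key, caption = line.rstrip("\n").split("\t", 1)
    image_name, index = key.rsplit("#", 1)
    return image_name, int(index), caption


if __name__ == "__main__":
    missing = missing_entries("./distill_utils/data")
    print("missing:", missing or "none")
```

Running the script from the repository root prints any entries that are absent, which makes a misplaced download easy to spot before a long job starts.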