spot-the-diff

This dataset supports learning to describe the differences between pairs of similar images. Each sample has a string identifier (img_id), three image features (img_a, img_b, img_diff), and one sentence sequence feature (sentences). The dataset is split into training, test, and validation sets with 9,524, 1,404, and 1,634 samples respectively.

Updated 12/19/2024
huggingface

Description

Dataset Overview

Dataset Information

  • Features:

    • img_id: String type, unique identifier for the sample (the image pair).
    • img_a: Image type, first image.
    • img_b: Image type, second image.
    • img_diff: Image type, difference image.
    • sentences: Sequence of strings, sentences describing the differences.
  • Dataset Splits:

    • train: Training set, 9,524 samples, size 1,904,363,199.892 bytes.
    • test: Test set, 1,404 samples, size 268,451,640.804 bytes.
    • val: Validation set, 1,634 samples, size 308,229,248.356 bytes.
  • Dataset Size:

    • Download size: 2,292,419,742 bytes
    • Total size: 2,481,044,089.052 bytes
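
The split figures above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: it copies the sample counts and byte sizes from the list, then derives each split's share of the samples and the mean reported size per sample. The names SPLITS and summarize are made up for this example and are not part of the dataset or any library.

```python
# Split statistics copied from the list above.
SPLITS = {
    "train": {"samples": 9_524, "bytes": 1_904_363_199.892},
    "test": {"samples": 1_404, "bytes": 268_451_640.804},
    "val": {"samples": 1_634, "bytes": 308_229_248.356},
}
REPORTED_TOTAL_BYTES = 2_481_044_089.052


def summarize(splits):
    """Return the total sample count and, per split, its share of the
    samples and the mean reported bytes per sample."""
    total_samples = sum(s["samples"] for s in splits.values())
    summary = {
        name: {
            "share": s["samples"] / total_samples,
            "mean_bytes": s["bytes"] / s["samples"],
        }
        for name, s in splits.items()
    }
    return total_samples, summary


total_samples, summary = summarize(SPLITS)
total_bytes = sum(s["bytes"] for s in SPLITS.values())

print(f"total samples: {total_samples:,}")  # total samples: 12,562
for name, row in summary.items():
    print(f"{name:>5}: {row['share']:.1%} of samples, "
          f"{row['mean_bytes'] / 1e3:.0f} kB per sample on average")

# The per-split sizes should add up to the reported total size.
print("sizes consistent:", abs(total_bytes - REPORTED_TOTAL_BYTES) < 1.0)
```

Each sample carries three images plus its sentences, so the per-sample mean covers all of them together.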

Configuration

  • Configuration Name: default
    • Data File Paths:
      • Training: data/train-*
      • Testing: data/test-*
      • Validation: data/val-*
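
As a rough illustration of how these path patterns assign files to splits, the sketch below matches file names against the three patterns with Python's fnmatch. It also wraps a call to load_dataset from the Hugging Face datasets library in a function. Several things here are assumptions: the shard file names are hypothetical, the helper names split_for and load_splits are invented for this example, and the repository ID is not stated on this page, so the caller has to supply it.

```python
from fnmatch import fnmatch

# Split name -> path pattern, copied from the configuration above.
DATA_FILES = {
    "train": "data/train-*",
    "test": "data/test-*",
    "val": "data/val-*",
}


def split_for(path, data_files=DATA_FILES):
    """Return the split whose pattern matches `path`, or None."""
    for split, pattern in data_files.items():
        if fnmatch(path, pattern):
            return split
    return None


def load_splits(repo_id):
    """Load all three splits from the Hugging Face Hub.

    `repo_id` is NOT given on this page; pass the real repository ID.
    Requires `pip install datasets` and network access.
    """
    from datasets import load_dataset  # lazy import: optional dependency

    return load_dataset(repo_id, data_files=DATA_FILES)


# Hypothetical shard names, for illustration only.
for path in [
    "data/train-00000-of-00004.parquet",
    "data/val-00000-of-00001.parquet",
    "README.md",
]:
    print(path, "->", split_for(path))
```

Because the default configuration already declares these patterns, passing only the repository ID to load_dataset would normally be enough. The explicit mapping is shown here to make the split-to-file relationship visible.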

Original Dataset

  • Source: https://github.com/harsh19/spot-the-diff/

Reference

@inproceedings{jhamtani2018learning,
  title={Learning to Describe Differences Between Pairs of Similar Images},
  author={Jhamtani, Harsh and Berg-Kirkpatrick, Taylor},
  booktitle={Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  year={2018}
}

Topics

Image Difference Recognition
Natural Language Processing

Source

Organization: huggingface

Created: 12/19/2024
