Dataset asset | Open Source Community | Camouflaged Object Detection | Vision-Language Models

MM-CamObj

The MM‑CamObj dataset, created by Shanghai Jiao Tong University, targets the difficulties that vision‑language models face in complex scenes, particularly those containing camouflaged objects. It comprises two subsets: CamObj‑Align, with 11,363 high‑quality image‑text pairs for vision‑language alignment, and CamObj‑Instruct, with 11,363 images and 68,849 diverse dialogues for instruction fine‑tuning. The images were selected from classic datasets, and the detailed descriptions and dialogues were generated with GPT‑4o. MM‑CamObj is primarily used to evaluate and improve vision‑language models on camouflaged‑object detection, localization, and counting.

Source: arXiv
Created: Sep 24, 2024
Updated: Sep 24, 2024

Dataset Overview

  • Name: MM‑CamObj
  • Full Title: MM‑CamObj: A Comprehensive Multimodal Dataset for Camouflaged Object Scenarios
  • Source: arXiv, 2024
  • Description: This repository hosts the official code and data for “MM‑CamObj: A Comprehensive Multimodal Dataset for Camouflaged Object Scenarios”.

Dataset Status

  • Release Status: Code and dataset are forthcoming.