Dataset asset · Open Source Community · Multimodal Models · Artificial Intelligence Research

M4-Instruct-Data

M4‑Instruct is a multi‑image dataset collected in April 2024 from public datasets and the GPT‑4V API, intended for training large multimodal models. It supports research on large multimodal models and chatbots, aimed at researchers in computer vision, natural language processing, machine learning, and AI.

Source
Hugging Face
Created
Jun 26, 2024
Updated
Jun 26, 2024
Signals
221 views
Availability
Linked source ready
Overview

Dataset description and usage context

M4‑Instruct Dataset Overview

Dataset Details

Dataset Type: M4‑Instruct is a collection of multi‑image data gathered from public datasets or generated via the GPT‑4V API. It aims to train large multimodal models with interleaved multi‑image capabilities, such as LLaVA‑NeXT‑Interleave.

Dataset Date: Collected in April 2024, released in June 2024.

Data Statistics: The release includes multi‑image, multi‑frame (video), and multi‑view (3D) data for M4‑Instruct.

Data Content:

  • JSON files: m4_instruct_annotations.json and m4_instruct_video.json
  • Images: *.zip
  • dreamsim_split.z01 and dreamsim_split.zip form a split archive; merge them into a single file with zip -s 0 dreamsim_split.zip --out dreamsim.zip
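After extracting the images, the annotation JSON can be loaded with the standard library. The sketch below is a minimal example, assuming a LLaVA-style record schema (`id`, `image`, `conversations` fields); the actual field names in m4_instruct_annotations.json are not documented on this card, so verify them against the real file. A hypothetical sample record is written to a temporary file so the snippet is self-contained.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical sample mimicking an assumed LLaVA-style schema for
# m4_instruct_annotations.json; field names are illustrative, not confirmed.
sample = [
    {
        "id": "demo-0001",
        "image": ["scene_a.jpg", "scene_b.jpg"],  # multi-image entry
        "conversations": [
            {"from": "human",
             "value": "<image>\n<image>\nWhat changed between the two views?"},
            {"from": "gpt",
             "value": "The second view shows the object rotated."},
        ],
    }
]

def load_annotations(path):
    """Load the annotation JSON and return a list of records."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)

# Write the sample to a temporary file, then read it back the way one
# would read the real annotation file.
with tempfile.TemporaryDirectory() as tmp:
    ann_path = Path(tmp) / "m4_instruct_annotations.json"
    ann_path.write_text(json.dumps(sample), encoding="utf-8")
    records = load_annotations(ann_path)
    for rec in records:
        print(rec["id"], len(rec["image"]), "images,",
              len(rec["conversations"]), "turns")
```

Keeping image file names relative in the records, as sketched here, lets the same JSON work regardless of where the *.zip archives were extracted.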

License: Creative Commons Attribution 4.0 International (CC BY 4.0); users must also comply with OpenAI's terms of use: https://openai.com/policies/terms-of-use

Contact for Issues:

Intended Use

Primary Intended Use: Research on large multimodal models and chatbots.

Primary Intended Users: Researchers and enthusiasts in computer vision, natural language processing, machine learning, and artificial intelligence.
