Dataset asset | Open Source Community | Multimodal Language Models | Security Evaluation
JailbreakV-28K/JailBreakV-28k
JailBreakV_28K is a benchmark dataset for evaluating the robustness of multimodal large language models (MLLMs) against jailbreak attacks. It contains 28,000 jailbreak text‑image pairs, including 20,000 text‑based LLM transfer jailbreak attacks and 8,000 image‑based MLLM jailbreak attacks, covering 16 security policies and 5 different jailbreak methods. Additionally, the RedTeam_2K dataset provides 2,000 harmful queries to identify alignment vulnerabilities in LLMs and MLLMs, encompassing 16 security policies and 8 data sources.
Source
hugging_face
Created
Nov 28, 2025
Updated
Jul 10, 2024
Dataset Overview
Dataset Name
- JailBreakV_28K
License
- MIT
Task Categories
- Text Generation
- Question Answering
Dataset Size
- 10K < n < 100K
Configurations
- config_name: JailBreakV_28K
  - split: mini_JailBreakV_28K
    path: JailBreakV_28K/mini_JailBreakV_28K.csv
  - split: JailBreakV_28K
    path: JailBreakV_28K/JailBreakV_28K.csv
- config_name: RedTeam_2K
  - split: RedTeam_2K
    path: JailBreakV_28K/RedTeam_2K.csv
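Because every split in the configuration listing above is a plain CSV file, a local copy of the repository can also be read without the `datasets` library. The following is a minimal sketch using only the standard library; `SPLIT_PATHS`, `read_split`, and the `repo_root` argument are hypothetical names introduced for illustration, and it assumes the files have already been downloaded with the same directory layout.

```python
import csv
from pathlib import Path

# (config_name, split) -> CSV path, copied from the configuration listing above.
SPLIT_PATHS = {
    ("JailBreakV_28K", "mini_JailBreakV_28K"): "JailBreakV_28K/mini_JailBreakV_28K.csv",
    ("JailBreakV_28K", "JailBreakV_28K"): "JailBreakV_28K/JailBreakV_28K.csv",
    ("RedTeam_2K", "RedTeam_2K"): "JailBreakV_28K/RedTeam_2K.csv",
}


def read_split(repo_root, config_name, split):
    """Read one split from a local copy of the repository.

    repo_root is a hypothetical local directory that mirrors the
    repository layout; every CSV row is returned as a dict of strings.
    """
    csv_path = Path(repo_root) / SPLIT_PATHS[(config_name, split)]
    with csv_path.open(newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))
```

For example, `read_split("path/to/local/copy", "RedTeam_2K", "RedTeam_2K")` would return the RedTeam_2K rows as a list of dictionaries, where the path is a placeholder for wherever the files were saved.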
Dataset Details
JailBreakV_28K and mini_JailBreakV_28K Columns
- id: Unique identifier
- jailbreak_query: Jailbreak query generated by various jailbreak attacks
- redteam_query: Harmful query from RedTeam_2K
- format: Jailbreak attack method, including template, persuasion, logic, figstep, query‑related
- policy: Security policy violated by the red‑team query
- image_path: Image file path
- from: Data source
- selected_mini: "True" if the data is in the mini_JailBreakV_28K set, otherwise "False"
- transfer_from_llm: "True" if the jailbreak query is transferred from an LLM jailbreak attack, otherwise "False"
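Since `selected_mini` and `transfer_from_llm` are described as the strings "True" and "False", a plain truthiness check would treat "False" as true. The sketch below shows one way to interpret the flags and separate LLM-transfer attacks from image-based ones. The helper names and the sample rows are hypothetical placeholders, not real dataset entries, and the exact casing of the `format` values is an assumption.

```python
def is_true(flag):
    """Interpret a "True"/"False" flag; also tolerates real booleans."""
    return str(flag).strip().lower() == "true"


def split_by_attack_type(rows):
    """Return (llm_transfer_rows, image_based_rows) using transfer_from_llm."""
    llm_transfer = [r for r in rows if is_true(r["transfer_from_llm"])]
    image_based = [r for r in rows if not is_true(r["transfer_from_llm"])]
    return llm_transfer, image_based


# Made-up placeholder rows with only the columns needed here.
sample_rows = [
    {"id": "0", "format": "template", "selected_mini": "True", "transfer_from_llm": "True"},
    {"id": "1", "format": "figstep", "selected_mini": "False", "transfer_from_llm": "False"},
    {"id": "2", "format": "logic", "selected_mini": "False", "transfer_from_llm": "True"},
]

llm_transfer, image_based = split_by_attack_type(sample_rows)
mini_rows = [r for r in sample_rows if is_true(r["selected_mini"])]
```

Converting through `str()` keeps the helper usable if a loader has already parsed the column into booleans.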
RedTeam_2K Columns
- id: Unique identifier
- question: Harmful query
- policy: Security policy violated by the red‑team query
- from: Data source
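The `policy` and `from` columns make it straightforward to check how queries are distributed across security policies and data sources. The following is a minimal sketch with the standard library; `count_by` is a hypothetical helper, and the policy and source labels in the sample are invented placeholders rather than the dataset's actual values.

```python
from collections import Counter


def count_by(rows, column):
    """Count how many rows share each value of the given column."""
    return Counter(row[column] for row in rows)


# Made-up placeholder rows shaped like RedTeam_2K records.
sample_redteam = [
    {"id": "0", "question": "<placeholder>", "policy": "policy_a", "from": "source_a"},
    {"id": "1", "question": "<placeholder>", "policy": "policy_a", "from": "source_b"},
    {"id": "2", "question": "<placeholder>", "policy": "policy_b", "from": "source_a"},
]

policy_counts = count_by(sample_redteam, "policy")
source_counts = count_by(sample_redteam, "from")
```

On the full RedTeam_2K split, the number of keys in each counter can be compared against the 16 policies and 8 sources stated in the description.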
Data Composition
RedTeam_2K
- Description: Curated dataset of 2,000 harmful queries designed to identify alignment vulnerabilities in LLMs and MLLMs. Covers 16 security policies and includes queries from 8 different sources.
JailBreakV_28K
- Description: Dataset containing 28,000 jailbreak text‑image pairs, including 20,000 text‑based LLM transfer jailbreak attacks and 8,000 image‑based MLLM jailbreak attacks. Covers 16 security policies and 5 different jailbreak methods.
Usage Example
from datasets import load_dataset

# Load each configuration once; load_dataset returns a DatasetDict keyed by split name.
jailbreakv = load_dataset("JailbreakV-28K/JailBreakV-28k", "JailBreakV_28K")
mini_JailBreakV_28K = jailbreakv["mini_JailBreakV_28K"]
JailBreakV_28K = jailbreakv["JailBreakV_28K"]

RedTeam_2K = load_dataset("JailbreakV-28K/JailBreakV-28k", "RedTeam_2K")["RedTeam_2K"]
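Once a split is loaded, each record pairs a text query with an image referenced by `image_path`. The sketch below shows one way to turn records into text-image inputs for a model under test. It assumes the images have been downloaded separately and that `image_path` is relative to a local directory; `build_inputs` and `image_root` are hypothetical names, and the sample record is a placeholder.

```python
from pathlib import Path


def build_inputs(rows, image_root):
    """Pair each jailbreak query with the resolved path of its image.

    image_root is a hypothetical local directory holding the images;
    rows can be any iterable of dict-like records with the columns
    id, jailbreak_query, and image_path.
    """
    inputs = []
    for row in rows:
        inputs.append(
            {
                "id": row["id"],
                "text": row["jailbreak_query"],
                "image": Path(image_root) / row["image_path"],
            }
        )
    return inputs


# Made-up placeholder record with only the columns used above.
sample_records = [
    {"id": "0", "jailbreak_query": "<placeholder>", "image_path": "figstep/example.png"},
]
sample_inputs = build_inputs(sample_records, "images")
```

Iterating over a split loaded with `load_dataset` yields dictionaries, so the same function should accept `mini_JailBreakV_28K` directly in place of the sample records.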