PKU-Alignment/BeaverTails
BeaverTails is a collection of AI‑safety‑focused datasets of human‑annotated question‑answer pairs, each labeled with one or more harm categories. It covers 14 harm categories, including animal abuse, child abuse, discrimination, and hate speech. It is intended for research on building safer, less harmful AI systems. The dataset includes four splits: 330k_train, 330k_test, 30k_train, and 30k_test.
Description
Dataset Overview
Name: BeaverTails
Purpose: AI safety research aimed at developing safer, less‑harmful AI systems.
Content: Human‑labeled question‑answer (QA) pairs, each associated with one or more harm categories.
Harm Categories:
- Animal abuse
- Child abuse
- Controversial topics, politics
- Discrimination, stereotypes, unfairness
- Drug abuse, weapons, prohibited substances
- Financial crime, property crime, theft
- Hate speech, offensive language
- Misinformation about ethics, law, and safety
- Non‑violent unethical behavior
- Privacy violation
- Self‑harm
- Sexual content, adult content
- Terrorism, organized crime
- Violence, aiding and abetting, incitement
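Because a QA pair can carry more than one harm category, a per-record label is naturally a set of flags rather than a single class. The following is a minimal sketch of that idea, assuming each record stores its labels as a mapping from category name to a boolean. The mapping layout, the key strings, and the helper name `flagged_categories` are hypothetical and chosen for illustration; they are not taken from this page.

```python
# Hypothetical label layout: category name -> True/False.
# The key strings below are illustrative, not the dataset's actual keys.

def flagged_categories(labels):
    """Return the names of all categories marked True, sorted alphabetically."""
    bad = [name for name, value in labels.items() if not isinstance(value, bool)]
    if bad:
        raise TypeError(f"non-boolean label values for: {bad}")
    return sorted(name for name, value in labels.items() if value)


example_labels = {
    "Animal abuse": False,
    "Privacy violation": True,
    "Child abuse": True,
}

print(flagged_categories(example_labels))
# prints ['Child abuse', 'Privacy violation']
```

A record with no flagged category yields an empty list, which is one simple way to treat a QA pair as harmless under this assumed layout.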
Dataset Size: 100K<n<1M
Data File Configuration:
- Default configuration:
  - Training set:
    - 330k_train: round0/330k/train.jsonl.xz
    - 30k_train: round0/30k/train.jsonl.gz
  - Test set:
    - 330k_test: round0/330k/test.jsonl.xz
    - 30k_test: round0/30k/test.jsonl.gz
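The splits are stored as compressed JSON Lines files (`.xz` for the 330k splits, `.gz` for the 30k splits). The sketch below shows one way to read them with the Python standard library only, assuming the files have already been downloaded into a local directory that mirrors the paths above. The `root` argument and the helper names `open_compressed` and `read_split` are hypothetical, and the record fields are left unspecified because this page does not list them.

```python
import gzip
import json
import lzma
from pathlib import Path

# Split name -> relative path, following the data file configuration.
# Assumption: 330k_test sits under round0/330k/ next to 330k_train.
SPLIT_FILES = {
    "330k_train": "round0/330k/train.jsonl.xz",
    "330k_test": "round0/330k/test.jsonl.xz",
    "30k_train": "round0/30k/train.jsonl.gz",
    "30k_test": "round0/30k/test.jsonl.gz",
}


def open_compressed(path):
    """Open a .xz or .gz file as UTF-8 text, chosen by file extension."""
    path = Path(path)
    if path.suffix == ".xz":
        return lzma.open(path, "rt", encoding="utf-8")
    if path.suffix == ".gz":
        return gzip.open(path, "rt", encoding="utf-8")
    raise ValueError(f"unsupported compression: {path.suffix}")


def read_split(root, split):
    """Yield one parsed JSON object per non-empty line of the given split."""
    with open_compressed(Path(root) / SPLIT_FILES[split]) as handle:
        for line in handle:
            if line.strip():
                yield json.loads(line)
```

For example, `next(read_split("data", "30k_test"))` would return the first record of the 30k test split, where `data` is a placeholder for the local download directory. Reading lazily with a generator avoids loading the larger 330k files into memory at once.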
License: CC BY‑NC 4.0
Language: English
Task Category: Text classification
Tags: safety, AI safety, moderation, rejection sampling, LLM, LM, human feedback
Usage Warning: The dataset contains potentially offensive or disturbing content and should be used responsibly, respecting individual risk tolerance. It should not be used to train dialogue agents that might produce harmful behavior.
Source
Organization: hugging_face
Created: Unknown