PKU-Alignment/BeaverTails

BeaverTails is an AI‑safety‑focused collection of datasets made up of human‑annotated question‑answer pairs, each labeled with the harm categories that apply to it. It covers 14 harm categories, including animal abuse, child abuse, discrimination, and hate speech. It is intended for research, especially on building safer, less harmful AI systems. The dataset ships in four splits: 330k_train, 330k_test, 30k_train, and 30k_test.

Updated 10/17/2023
hugging_face

Description

Dataset Overview

Name: BeaverTails

Purpose: AI safety research aimed at developing safer, less‑harmful AI systems.

Content: Human‑labeled question‑answer (QA) pairs, each associated with one or more harm categories.

Harm Categories:

  1. Animal abuse
  2. Child abuse
  3. Controversial topics, politics
  4. Discrimination, stereotypes, unfairness
  5. Drug abuse, weapons, prohibited substances
  6. Financial crime, property crime, theft
  7. Hate speech, offensive language
  8. Misinformation about ethics, law, and safety
  9. Non‑violent unethical behavior
  10. Privacy violation
  11. Self‑harm
  12. Sexual content, adult content
  13. Terrorism, organized crime
  14. Violence, assistance and incitement, provocation
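Because a QA pair can carry more than one of these categories, a classifier trained on the data is a multi-label one. The following is a minimal sketch of a multi-hot encoding over the 14 categories. The names `HARM_CATEGORIES` and `encode_labels` are illustrative, the strings are the display names from the list above written with ASCII hyphens, and the keys actually stored in the data files may be spelled differently.

```python
# Display names of the 14 harm categories, in the order listed above.
# Assumption: the real label keys in the data files may differ from these.
HARM_CATEGORIES = [
    "Animal abuse",
    "Child abuse",
    "Controversial topics, politics",
    "Discrimination, stereotypes, unfairness",
    "Drug abuse, weapons, prohibited substances",
    "Financial crime, property crime, theft",
    "Hate speech, offensive language",
    "Misinformation about ethics, law, and safety",
    "Non-violent unethical behavior",
    "Privacy violation",
    "Self-harm",
    "Sexual content, adult content",
    "Terrorism, organized crime",
    "Violence, assistance and incitement, provocation",
]


def encode_labels(active):
    """Return a 14-element 0/1 list marking which categories are active."""
    active = set(active)
    unknown = active - set(HARM_CATEGORIES)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    return [1 if name in active else 0 for name in HARM_CATEGORIES]
```

A pair flagged with no category encodes to an all-zero vector, which is one way to represent an answer judged harmless.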

Dataset Size: 100K<n<1M

Data File Configuration:

  • Default configuration:
    • Training set:
      • 330k_train: round0/330k/train.jsonl.xz
      • 30k_train: round0/30k/train.jsonl.gz
    • Test set:
      • 330k_test: round0/330k/test.jsonl.xz
      • 30k_test: round0/30k/test.jsonl.gz
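The split files can be read with the Python standard library alone. The sketch below assumes the files have been downloaded to a local directory with the repository layout intact, and that each line holds one JSON object, as the .jsonl extension suggests. The paths follow the round0/<size>/<split> pattern, so 330k_test is taken to live under round0/330k/. `SPLIT_FILES` and `iter_records` are illustrative names, not part of the dataset.

```python
import gzip
import json
import lzma
from pathlib import Path

# Split name -> path relative to the local copy of the dataset repository.
SPLIT_FILES = {
    "330k_train": "round0/330k/train.jsonl.xz",
    "330k_test": "round0/330k/test.jsonl.xz",
    "30k_train": "round0/30k/train.jsonl.gz",
    "30k_test": "round0/30k/test.jsonl.gz",
}


def iter_records(root, split):
    """Yield one dict per non-empty JSON line of the given split."""
    path = Path(root) / SPLIT_FILES[split]
    # The 330k files are xz-compressed, the 30k files gzip-compressed.
    opener = lzma.open if path.suffix == ".xz" else gzip.open
    with opener(path, "rt", encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                yield json.loads(line)
```

If the Hugging Face `datasets` library is installed, `load_dataset("PKU-Alignment/BeaverTails", split="30k_train")` should return the same records without a manual download.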

License: CC BY‑NC 4.0

Language: English

Task Category: Text classification

Tags: safety, AI safety, moderation, rejection sampling, LLM, LM, human feedback

Usage Warning: The dataset contains potentially offensive or disturbing content. Use it responsibly and only within your own risk tolerance. It should not be used to train dialogue agents, as doing so is likely to produce harmful model behavior.

Topics

AI Safety
Content Moderation

Source

Organization: hugging_face

Created: Unknown
