nsfw

This dataset contains erotic stories that have been cleaned, deduplicated, and depolluted, and it is intended for training text‑filtering classifiers. The data originates from the Hugging Face datasets bluuwhale/nsfwstory and bluuwhale/nsfwstory2. The dataset comprises 49,579 samples, and the downloaded parquet file is 646 MB.

Updated 1/11/2025
huggingface

Description

Dataset Overview

Dataset Name

Geralt-Targaryen/nsfw

Dataset Description

This dataset contains cleaned, deduplicated, and depolluted NSFW (Not Safe For Work) stories, intended for training text‑filtering classifiers.
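
The card does not say how the deduplication was performed. Purely as an illustration, here is a minimal sketch of exact deduplication over normalized text, assuming each story is a plain string. The helper names `normalize` and `dedupe_exact` are hypothetical and are not taken from the dataset's actual pipeline, which may well use a different (for example, near-duplicate) method.

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different copies match."""
    return re.sub(r"\s+", " ", text).strip().lower()


def dedupe_exact(stories):
    """Drop stories whose normalized text has already been seen.

    Keeps the first occurrence of each story and preserves input order.
    This is an assumed, illustrative approach, not the dataset author's method.
    """
    seen = set()
    unique = []
    for story in stories:
        digest = hashlib.sha256(normalize(story).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(story)
    return unique
```

Hashing the normalized text rather than storing it keeps memory use bounded per story, which matters for long-form text at this scale.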

Dataset Source

  • bluuwhale/nsfwstory
  • bluuwhale/nsfwstory2

Dataset Scale

  • Number of samples: 49,579
  • Downloaded parquet file size: 646 MB

License

Apache-2.0

Warning

This dataset contains explicit sexual content.

Topics

Text Classification
Pornographic Content Filtering

Source

Organization: huggingface

Created: 1/1/2025
