nsfw_detection

The dataset has two configurations: `nsfw_detection_test_v1` and `nsfw_detection_v1`. `nsfw_detection_test_v1` provides a single test split of 10,000 samples, each consisting of a text string and a binary label (0 for safe, 1 for nsfw). `nsfw_detection_v1` provides a training split of 845,904 samples and a validation split of 10,000 samples in the same format. The dataset is intended for detecting unsafe content in text.

Updated 8/22/2024
huggingface

Description

Dataset Overview

Dataset Configurations

Configuration Name: nsfw_detection_test_v1

  • Features:
    • text: string
    • label: categorical with two classes:
      • 0: safe
      • 1: nsfw
    • __index_level_0__: integer
  • Splits:
    • test: 10,000 samples, 9,258,616 bytes
  • Download Size: 5,981,940 bytes
  • Dataset Size: 9,258,616 bytes

Configuration Name: nsfw_detection_v1

  • Features:
    • text: string
    • label: categorical with two classes:
      • 0: safe
      • 1: nsfw
    • __index_level_0__: integer
  • Splits:
    • train: 845,904 samples, 776,291,817 bytes
    • val: 10,000 samples, 9,258,616 bytes
  • Download Size: 506,877,225 bytes
  • Dataset Size: 785,550,433 bytes
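
Both configurations share the same three features, so one small helper can handle records from either. The following is a minimal sketch, assuming each record arrives as a dictionary keyed by the feature names listed above. `LABEL_NAMES`, `clean_record`, and the sample record are hypothetical names and data introduced only for illustration. The `__index_level_0__` column looks like a leftover DataFrame index from the Parquet export (an assumption, the page does not say), so the sketch drops it.

```python
from typing import Any, Dict

# Class ids as listed in the configuration: 0 is safe, 1 is nsfw.
LABEL_NAMES = {0: "safe", 1: "nsfw"}


def clean_record(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy with the index column removed and a readable label added."""
    label = record["label"]
    if label not in LABEL_NAMES:
        raise ValueError(f"unexpected label: {label!r}")
    # Assumption: __index_level_0__ carries no information needed downstream.
    cleaned = {k: v for k, v in record.items() if k != "__index_level_0__"}
    cleaned["label_name"] = LABEL_NAMES[label]
    return cleaned


# Hypothetical record, made up for illustration (not taken from the dataset).
example = {
    "text": "What time does the library open?",
    "label": 0,
    "__index_level_0__": 1234,
}
print(clean_record(example))
```

Raising on an unknown label, rather than silently mapping it, makes schema drift visible early if a later version of the dataset adds classes.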

Data Files

Configuration Name: nsfw_detection_test_v1

  • Splits:
    • test: file path nsfw_detection_test_v1/test-*

Configuration Name: nsfw_detection_v1

  • Splits:
    • train: file path nsfw_detection_v1/train-*
    • val: file path nsfw_detection_v1/val-*
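
The glob patterns above can be resolved against file names with the standard library alone. This is a minimal sketch, assuming shard names of the form `train-00000-of-00002.parquet`. The page gives only the `*` patterns, so the concrete file names below are hypothetical, as is the helper `split_for`.

```python
from fnmatch import fnmatchcase
from typing import Optional, Tuple

# Patterns copied from the Data Files listing: config -> split -> glob.
DATA_FILES = {
    "nsfw_detection_test_v1": {"test": "nsfw_detection_test_v1/test-*"},
    "nsfw_detection_v1": {
        "train": "nsfw_detection_v1/train-*",
        "val": "nsfw_detection_v1/val-*",
    },
}


def split_for(path: str) -> Optional[Tuple[str, str]]:
    """Return (config, split) for a repository-relative path, or None if nothing matches."""
    for config, splits in DATA_FILES.items():
        for split, pattern in splits.items():
            # fnmatchcase keeps matching case-sensitive on every platform.
            if fnmatchcase(path, pattern):
                return config, split
    return None


# Hypothetical shard names; the real names are not given on this page.
print(split_for("nsfw_detection_v1/train-00000-of-00002.parquet"))
print(split_for("nsfw_detection_test_v1/test-00000-of-00001.parquet"))
print(split_for("README.md"))
```

Because each pattern begins with its configuration directory, a test shard can never be mistaken for a training or validation shard, which helps keep the evaluation data separate.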

Topics

Text Safety Detection

Source

Organization: huggingface

Created: 8/22/2024
