isek-ai/danbooru-tags-2016-2023

---
language:
- en
license: cc0-1.0
size_categories:
- 1M<n<10M
task_categories:
- text-classification
- text-generation
- text2text-generation
dataset_info:
- config_name: all
  features:
  - name: id
    dtype: int64
  - name: copyright
    dtype: string
  - name: character
    dtype: string
  - name: artist
    dtype: string
  - name: general
    dtype: string
  - name: meta
    dtype: string
  - name: rating
    dtype: string
  - name: score
    dtype: int64
  - name: created_at
    dtype: string
  splits:
  - name: train
    num_bytes: 2507757369
    num_examples: 4601557
  download_size: 991454905
  dataset_size: 2507757369
- config_name: safe
  features:
  - name: id
    dtype: int64
  - name: copyright
    dtype: string
  - name: character
    dtype: string
  - name: artist
    dtype: string
  - name: general
    dtype: string
  - name: meta
    dtype: string
  - name: rating
    dtype: string
  - name: score
    dtype: int64
  - name: created_at
    dtype: string
  splits:
  - name: train
    num_bytes: 646613535.5369519
    num_examples: 1186490
  download_size: 247085114
  dataset_size: 646613535.5369519
configs:
- config_name: all
  data_files:
  - split: train
    path: all/train-*
- config_name: safe
  data_files:
  - split: train
    path: safe/train-*
tags:
- danbooru
---

# danbooru-tags-2016-2023

A dataset of danbooru tags.

## Dataset information

Generated using [danbooru](https://danbooru.donmai.us/) and [safebooru](https://safebooru.donmai.us/) API.

The dataset was created with the following conditions:

|Subset name|`all`|`safe`|
|-|-|-|
|API Endpoint|https://danbooru.donmai.us|https://safebooru.donmai.us|
|Date|`2016-01-01..2023-12-31`|`2016-01-01..2023-12-31`|
|Score|`>0`|`>0`|
|Rating|`g,s,q,e`|`g`|
|Filetype|`png,jpg,webp`|`png,jpg,webp`|
|Size (number of rows)|4,601,557|1,186,490|

## Usage

```
pip install datasets
```

```py
from datasets import load_dataset

dataset = load_dataset(
    "isek-ai/danbooru-tags-2016-2023",
    "safe",  # or "all"
    split="train",
)

print(dataset)
print(dataset[0])
# Dataset({
#     features: ['id', 'copyright', 'character', 'artist', 'general', 'meta', 'rating', 'score', 'created_at'],
#     num_rows: 1186490
# })
# {'id': 2229839, 'copyright': 'kara no kyoukai', 'character': 'ryougi shiki', 'artist': 'momoko (momopoco)', 'general': '1girl, 2016, :|, brown eyes, brown hair, closed mouth, cloud, cloudy sky, dated, day, flower, hair flower, hair ornament, japanese clothes, kimono, long hair, long sleeves, looking at viewer, new year, obi, outdoors, sash, shrine, sky, solo, standing, wide sleeves', 'meta': 'commentary request, partial commentary', 'rating': 'g', 'score': 76, 'created_at': '2016-01-01T00:43:18.369+09:00'}
```
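
In the sample row, the tag columns (`general`, `meta`, and so on) appear to be single strings joined with `", "`, and `created_at` is an ISO 8601 timestamp with a UTC offset. The following is a minimal sketch of turning one such row into typed values without downloading the dataset. The helpers `split_tags` and `parse_created_at` are hypothetical names, and the `", "` separator is an assumption inferred from the sample row rather than something the dataset card states.

```python
from datetime import datetime

# Abbreviated copy of the sample row from the usage output
# (the `general` field is shortened for readability).
record = {
    "id": 2229839,
    "copyright": "kara no kyoukai",
    "character": "ryougi shiki",
    "general": "1girl, 2016, :|, brown eyes, brown hair, closed mouth",
    "meta": "commentary request, partial commentary",
    "rating": "g",
    "score": 76,
    "created_at": "2016-01-01T00:43:18.369+09:00",
}


def split_tags(field):
    """Split a tag string into a list of tags (hypothetical helper).

    Assumes tags are joined with ", " as in the sample row.
    Empty or missing fields yield an empty list.
    """
    if not field:
        return []
    return [tag.strip() for tag in field.split(", ") if tag.strip()]


def parse_created_at(value):
    """Parse the ISO 8601 timestamp string into a timezone-aware datetime."""
    return datetime.fromisoformat(value)


general_tags = split_tags(record["general"])
created = parse_created_at(record["created_at"])

print(general_tags)
print(created.year, created.utcoffset())
```

With the `datasets` library, a function like `split_tags` could be applied across rows via `dataset.map`, though whether a list column is preferable to the original string depends on the downstream task.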

Updated 2/5/2024
hugging_face

Description

Dataset Overview

Basic Information

  • Language: English
  • License: CC0-1.0
  • Size Category: 1M < n < 10M
  • Task Categories:
    • Text Classification
    • Text Generation
    • Text‑to‑Text Generation

Dataset Configurations

  • Configuration Name: all

    • Features:
      • id: int64
      • copyright: string
      • character: string
      • artist: string
      • general: string
      • meta: string
      • rating: string
      • score: int64
      • created_at: string
    • Splits:
      • train
        • Bytes: 2,507,757,369
        • Samples: 4,601,557
    • Download Size: 991,454,905
    • Dataset Size: 2,507,757,369
  • Configuration Name: safe

    • Features: (same as above)
    • Splits:
      • train
        • Bytes: 646,613,535.5369519
        • Samples: 1,186,490
    • Download Size: 247,085,114
    • Dataset Size: 646,613,535.5369519
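
To put the figures above in perspective, the short sketch below derives a few ratios from them: download size relative to in-memory dataset size, average bytes per row, and the share of rows in `safe`. The numbers are copied from the listing; the `summarize` helper and the dictionary layout are illustrative choices, not part of the dataset.

```python
# Figures copied from the configuration listing above.
CONFIGS = {
    "all": {
        "rows": 4_601_557,
        "download_bytes": 991_454_905,
        "dataset_bytes": 2_507_757_369,
    },
    "safe": {
        "rows": 1_186_490,
        "download_bytes": 247_085_114,
        "dataset_bytes": 646_613_535.5369519,
    },
}


def summarize(cfg):
    """Derive simple ratios from one configuration's size figures."""
    return {
        "download_ratio": cfg["download_bytes"] / cfg["dataset_bytes"],
        "bytes_per_row": cfg["dataset_bytes"] / cfg["rows"],
        "download_mib": cfg["download_bytes"] / 2**20,
    }


for name, cfg in CONFIGS.items():
    stats = summarize(cfg)
    print(
        f"{name}: download is {stats['download_ratio']:.1%} of dataset size, "
        f"{stats['bytes_per_row']:.2f} bytes/row, "
        f"{stats['download_mib']:.0f} MiB to download"
    )

safe_share = CONFIGS["safe"]["rows"] / CONFIGS["all"]["rows"]
print(f"safe holds {safe_share:.1%} of the rows in all")
```

Both configurations come out at nearly the same bytes per row. The fractional byte count listed for `safe` is consistent with a size estimated proportionally from the `all` split, although the source does not say how it was computed.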

Data Files

  • Configuration: all

    • Data File:
      • Split: train
      • Path: all/train-*
  • Configuration: safe

    • Data File:
      • Split: train
      • Path: safe/train-*
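
The `Path` entries are glob patterns relative to the repository root, so each configuration picks up every file under its own directory whose name starts with `train-`. As a rough illustration, the sketch below routes a file listing to configurations with Python's `fnmatch`. The shard file names are hypothetical (the source does not list them), and the Hub's own pattern resolution may differ in detail.

```python
from fnmatch import fnmatch

# Patterns copied from the Data Files listing above.
PATTERNS = {"all": "all/train-*", "safe": "safe/train-*"}

# Hypothetical repository listing; real shard names are not given in the source.
files = [
    "README.md",
    "all/train-00000-of-00002.parquet",
    "all/train-00001-of-00002.parquet",
    "safe/train-00000-of-00001.parquet",
]


def files_for(config):
    """Return the files whose path matches the configuration's glob pattern."""
    pattern = PATTERNS[config]
    return [path for path in files if fnmatch(path, pattern)]


for config in PATTERNS:
    print(config, files_for(config))
```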

Labels

  • danbooru

Topics

Anime Image Tagging
Text Classification

Source

Organization: hugging_face

Created: Unknown
