JUHE API Marketplace
High Quality Data

Dataset Hub

Explore high-quality datasets for your AI and machine learning projects.

isek-ai/danbooru-tags-2016-2023

Text Classification
Anime Image Tagging

---
language:
- en
license: cc0-1.0
size_categories:
- 1M<n<10M
task_categories:
- text-classification
- text-generation
- text2text-generation
dataset_info:
- config_name: all
  features:
  - name: id
    dtype: int64
  - name: copyright
    dtype: string
  - name: character
    dtype: string
  - name: artist
    dtype: string
  - name: general
    dtype: string
  - name: meta
    dtype: string
  - name: rating
    dtype: string
  - name: score
    dtype: int64
  - name: created_at
    dtype: string
  splits:
  - name: train
    num_bytes: 2507757369
    num_examples: 4601557
  download_size: 991454905
  dataset_size: 2507757369
- config_name: safe
  features:
  - name: id
    dtype: int64
  - name: copyright
    dtype: string
  - name: character
    dtype: string
  - name: artist
    dtype: string
  - name: general
    dtype: string
  - name: meta
    dtype: string
  - name: rating
    dtype: string
  - name: score
    dtype: int64
  - name: created_at
    dtype: string
  splits:
  - name: train
    num_bytes: 646613535.5369519
    num_examples: 1186490
  download_size: 247085114
  dataset_size: 646613535.5369519
configs:
- config_name: all
  data_files:
  - split: train
    path: all/train-*
- config_name: safe
  data_files:
  - split: train
    path: safe/train-*
tags:
- danbooru
---

# danbooru-tags-2016-2023

A dataset of danbooru tags.

## Dataset information

Generated using [danbooru](https://danbooru.donmai.us/) and [safebooru](https://safebooru.donmai.us/) API.

The dataset was created with the following conditions:

|Subset name|`all`|`safe`|
|-|-|-|
|API Endpoint|https://danbooru.donmai.us|https://safebooru.donmai.us|
|Date|`2016-01-01..2023-12-31`|`2016-01-01..2023-12-31`|
|Score|`>0`|`>0`|
|Rating|`g,s,q,e`|`g`|
|Filetype|`png,jpg,webp`|`png,jpg,webp`|
|Size (number of rows)|4,601,557|1,186,490|

## Usage

```
pip install datasets
```

```py
from datasets import load_dataset

dataset = load_dataset(
    "isek-ai/danbooru-tags-2016-2023",
    "safe",  # or "all"
    split="train",
)

print(dataset)
print(dataset[0])
# Dataset({
#     features: ['id', 'copyright', 'character', 'artist', 'general', 'meta', 'rating', 'score', 'created_at'],
#     num_rows: 1186490
# })
# {'id': 2229839, 'copyright': 'kara no kyoukai', 'character': 'ryougi shiki', 'artist': 'momoko (momopoco)', 'general': '1girl, 2016, :|, brown eyes, brown hair, closed mouth, cloud, cloudy sky, dated, day, flower, hair flower, hair ornament, japanese clothes, kimono, long hair, long sleeves, looking at viewer, new year, obi, outdoors, sash, shrine, sky, solo, standing, wide sleeves', 'meta': 'commentary request, partial commentary', 'rating': 'g', 'score': 76, 'created_at': '2016-01-01T00:43:18.369+09:00'}
```
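In the sample row above, each tag column (`copyright`, `character`, `artist`, `general`, `meta`) is a single comma-separated string, and `created_at` is an ISO 8601 string. The sketch below shows one possible way to turn such a row into lists and a `datetime`. It assumes that `", "` is the only separator and that no tag name itself contains a comma, which the dataset card does not confirm. The helper names `split_tags`, `parse_row`, and `count_general_tags` are hypothetical and not part of the dataset or the `datasets` library.

```python
from collections import Counter
from datetime import datetime

# Columns that hold comma-separated tag strings in the sample row.
TAG_COLUMNS = ("copyright", "character", "artist", "general", "meta")


def split_tags(value):
    """Split a comma-separated tag string into a list of stripped tags.

    Assumption: tags never contain a comma themselves.
    """
    if not value:
        return []
    return [tag.strip() for tag in value.split(",") if tag.strip()]


def parse_row(row):
    """Return a copy of a row with tag columns as lists and created_at parsed."""
    parsed = dict(row)
    for column in TAG_COLUMNS:
        parsed[column] = split_tags(row.get(column))
    parsed["created_at"] = datetime.fromisoformat(row["created_at"])
    return parsed


def count_general_tags(rows):
    """Count how often each `general` tag occurs across an iterable of rows."""
    counts = Counter()
    for row in rows:
        counts.update(split_tags(row.get("general")))
    return counts


# The sample row printed in the usage example above.
sample = {
    "id": 2229839,
    "copyright": "kara no kyoukai",
    "character": "ryougi shiki",
    "artist": "momoko (momopoco)",
    "general": (
        "1girl, 2016, :|, brown eyes, brown hair, closed mouth, cloud, "
        "cloudy sky, dated, day, flower, hair flower, hair ornament, "
        "japanese clothes, kimono, long hair, long sleeves, looking at viewer, "
        "new year, obi, outdoors, sash, shrine, sky, solo, standing, wide sleeves"
    ),
    "meta": "commentary request, partial commentary",
    "rating": "g",
    "score": 76,
    "created_at": "2016-01-01T00:43:18.369+09:00",
}

parsed = parse_row(sample)
print(parsed["general"][:3])       # ['1girl', '2016', ':|']
print(parsed["created_at"].year)   # 2016
```

With the dataset loaded as in the usage example, the same helpers could be applied row by row, for instance `parse_row(dataset[0])`, or `count_general_tags(dataset)` to build tag frequencies while iterating.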

hugging_face