Dataset asset, Open Source Community, Text Generation, Chinese Stories
zhoukz/TinyStories-Qwen
A Chinese story dataset generated using Qwen series models, modeled after the TinyStories dataset. All data are AI‑generated; the dataset is unfiltered and does not guarantee uniform distribution, safety, harmlessness, or any other properties. The seed information used for generation was randomly selected without any specific meaning.
Source
Hugging Face
Created
Nov 28, 2025
Updated
Jan 1, 2024
Signals
121 views
Availability
Linked source ready
Dataset Overview
License
- MIT License
Task Category
- Text Generation
Language
- Chinese
Configuration
- Configuration Name: default
- Data Files:
  - Training Set: data_???.jsonl
  - Validation Set: data_val_???.jsonl
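As a minimal sketch of how the file patterns in this configuration could be read locally, the example below uses only the Python standard library. It assumes the files have already been downloaded into a single directory; the helper name `iter_records` and the constant names are hypothetical, and no record fields are assumed because the card does not document the schema.

```python
import glob
import json
import os
from typing import Any, Iterator

# File patterns as listed in the configuration; each "?" matches exactly one character.
TRAIN_PATTERN = "data_???.jsonl"
VAL_PATTERN = "data_val_???.jsonl"


def iter_records(data_dir: str, pattern: str) -> Iterator[Any]:
    """Yield one parsed JSON value per non-empty line of every matching file.

    Hypothetical helper: `data_dir` is assumed to be a local folder that
    already contains the downloaded .jsonl files.
    """
    paths = sorted(glob.glob(os.path.join(data_dir, pattern)))
    for path in paths:
        with open(path, encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if line:  # skip blank lines
                    yield json.loads(line)
```

Because `?` matches a single character, the training pattern does not pick up the validation files, so the two splits can be read separately, for example with `iter_records(data_dir, TRAIN_PATTERN)` and `iter_records(data_dir, VAL_PATTERN)`.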
Dataset Description
- Chinese story collection generated using Qwen series models, modeled after the TinyStories dataset.
- Dataset Characteristics:
  - Not a translation of the original dataset.
  - Does not follow the original dataset format.
  - All data are AI‑generated.
  - The dataset is unfiltered and does not guarantee uniform distribution, safety, harmlessness, or any other properties.
  - The seed information used for generation was selected at random and carries no specific meaning.