The xbookcn_short_story dataset contains Chinese short stories for text generation tasks. Each story is split into multiple chunks, and the Qwen‑instruct model generates four summaries of varying lengths. Each record has the following features: source, category, title, content, content length, URL, and four summaries. The dataset is between 100 MB and 1 GB in size, and the training set comprises 627,195 samples.
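The feature list above can be pictured as a simple record type. The following is a minimal sketch only: the class name `StoryRecord`, the field names, and the helper `shortest_summary` are hypothetical, and it assumes that content length is a character count, which the description does not confirm. The actual column names in the dataset may differ.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StoryRecord:
    """One record with the listed features (all field names are assumed)."""

    source: str
    category: str
    title: str
    content: str
    url: str
    summaries: List[str]  # the four summaries of varying lengths
    content_length: int = field(init=False)

    def __post_init__(self) -> None:
        # The description mentions exactly four summaries per record.
        if len(self.summaries) != 4:
            raise ValueError("expected exactly four summaries")
        # Assumption: content length is measured in characters.
        self.content_length = len(self.content)


def shortest_summary(record: StoryRecord) -> str:
    """Return the summary with the fewest characters."""
    return min(record.summaries, key=len)
```

A record built this way rejects inputs that do not carry four summaries, which can serve as a quick sanity check when iterating over the real data.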