TigerResearch/sft_zh
The Chinese supervised fine-tuning (sft-zh) data collection from the Tigerbot open-source project. It bundles several Chinese datasets, including Alpaca-Chinese, encyclopedia QA, classic literature QA, riddles, reading comprehension, general QA, and Zhihu QA, so they can be loaded together instead of being downloaded one by one.
Description
Dataset Overview
This dataset is the Chinese sft-zh fine-tuning collection of the Tigerbot open-source project. It includes the other Chinese SFT datasets released by the organization, so they do not need to be downloaded separately.
Usage
import datasets
ds_sft = datasets.load_dataset('TigerResearch/sft_zh')
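After loading, it can help to check which splits were returned and how many rows each holds. The following is a minimal sketch rather than part of the original card: `split_sizes` and `load_sft_zh` are hypothetical helper names, and the split names actually present in this dataset are not stated in the card, so the code simply iterates over whatever `load_dataset` returns.

```python
def split_sizes(dataset_dict):
    """Return {split_name: row_count} for a DatasetDict-like mapping."""
    return {name: len(split) for name, split in dataset_dict.items()}


def load_sft_zh():
    """Load the collection; needs the `datasets` package and network access."""
    import datasets  # imported lazily so split_sizes can be used without it

    return datasets.load_dataset('TigerResearch/sft_zh')


# Usage (downloads the data on first call):
#   ds_sft = load_sft_zh()
#   print(split_sizes(ds_sft))
```

Called without a `split` argument, `load_dataset` returns a mapping of split names to datasets, which is why the helper iterates over `.items()`.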
File Breakdown
| Type | Language | Dataset File | Size (approx. samples) |
|---|---|---|---|
| Alpaca Chinese | Chinese | tigerbot-alpaca-zh-0.5m | 0.5m |
| Encyclopedia QA | Chinese | tigerbot-wiki-qa-1k | 1k |
| Classic Literature QA | Chinese | tigerbot-book-qa-1k | 1k |
| Riddles | Chinese | tigerbot-riddle-qa-1k | 1k |
| Reading Comprehension | Chinese | tigerbot-superclue-c3-zh-5k | 5k |
| General QA | Chinese | tigerbot-hc3-zh-12k | 12k |
| Zhihu QA | Chinese | tigerbot-zhihu-zh-10k | 10k |
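To get a rough sense of the overall size of the collection, the labels in the last column of the table can be converted to numbers and summed. This is an illustrative sketch only: the labels are rounded (for example `0.5m`), so the result is an approximation and not an exact row count, and `parse_size` is a hypothetical helper, not something the dataset provides.

```python
# Size labels copied from the table above (rounded, not exact counts).
NOMINAL_SIZES = {
    "tigerbot-alpaca-zh-0.5m": "0.5m",
    "tigerbot-wiki-qa-1k": "1k",
    "tigerbot-book-qa-1k": "1k",
    "tigerbot-riddle-qa-1k": "1k",
    "tigerbot-superclue-c3-zh-5k": "5k",
    "tigerbot-hc3-zh-12k": "12k",
    "tigerbot-zhihu-zh-10k": "10k",
}

MULTIPLIERS = {"k": 1_000, "m": 1_000_000}


def parse_size(label):
    """Convert a label such as '5k' or '0.5m' to an approximate integer."""
    number, suffix = label[:-1], label[-1].lower()
    return int(round(float(number) * MULTIPLIERS[suffix]))


approx_total = sum(parse_size(label) for label in NOMINAL_SIZES.values())
print(approx_total)  # 530000
```

Under these nominal figures the Alpaca Chinese file accounts for the bulk of the collection, with the six smaller QA sets contributing about 30k samples combined.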
Source
Organization: hugging_face
Created: Unknown