DISC-Law-SFT is a high-quality Chinese legal supervised fine-tuning (SFT) dataset designed to improve how legal AI systems understand and generate legal text. It consists of two subsets: DISC-Law-SFT-Pair, which introduces legal reasoning ability, and DISC-Law-SFT-Triplet, which strengthens the model's use of external legal knowledge. The dataset covers a range of legal tasks: legal information extraction, event detection, case classification, judgment prediction, case matching, text summarization, judicial public-opinion summarization, legal QA, reading comprehension, and judicial exam questions. It contains 403K entries in total and is suited to building legal assistants, legal consulting services, and judicial exam preparation tools.