Explore high-quality datasets for your AI and machine learning projects.
A large-scale Chinese natural-language-processing corpus covering diverse text types, such as Wikipedia articles, news, and encyclopedia Q&A, intended to support research and applications in Chinese NLP.
A collection of Chinese corpora, including latitude/longitude coordinates for provinces and cities, postal codes, administrative division codes, idioms, personal names, and data for named-entity recognition, relation recognition, reading comprehension, and image-text QA.