T2Ranking is a large-scale Chinese passage ranking benchmark containing over 300K queries and more than 2M unique passages sourced from real-world search engines. It targets Chinese search scenarios and provides extensive fine-grained relevance annotations. Candidate passages are retrieved from multiple commercial search engines and annotated completely, which mitigates the false-negative problem, and various strategies are applied to ensure high dataset quality.
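To make the false-negative point concrete, here is a minimal sketch of candidate pooling in Python. It is a generic illustration, not the T2Ranking construction pipeline: the function names (`pool_candidates`, `unjudged`), the engine names, the passage IDs, and the graded labels are all hypothetical. The idea is that when candidates are taken from only one source, relevant passages returned by another source remain unlabeled and are silently treated as negatives.

```python
from typing import Dict, List, Set


def pool_candidates(runs: Dict[str, List[str]], depth: int) -> List[str]:
    """Union of the top-`depth` passage IDs from each source, first-seen order."""
    seen: Set[str] = set()
    pooled: List[str] = []
    for ranked in runs.values():
        for pid in ranked[:depth]:
            if pid not in seen:
                seen.add(pid)
                pooled.append(pid)
    return pooled


def unjudged(ranked: List[str], judged: Dict[str, int]) -> List[str]:
    """Passages in a ranking with no relevance label (potential false negatives)."""
    return [pid for pid in ranked if pid not in judged]


# Hypothetical toy rankings for one query from two search engines.
runs = {
    "engine_a": ["p1", "p2", "p3"],
    "engine_b": ["p3", "p4", "p1"],
}

# Labels collected from a single source only (hypothetical graded labels).
single_source_labels = {"p1": 3, "p2": 0}

# Labels collected over the pooled candidates from both sources.
pooled = pool_candidates(runs, depth=2)
pooled_labels = {pid: 0 for pid in pooled}  # placeholder grades

missing_single = unjudged(runs["engine_b"], single_source_labels)
missing_pooled = unjudged(runs["engine_b"], pooled_labels)
```

With single-source labels, two of the three passages returned by `engine_b` carry no judgment at all; after pooling across both sources, every passage in that ranking has a label.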