CTIBench is a benchmark suite and dataset for evaluating large language models (LLMs) on cyber‑threat intelligence (CTI) tasks. Its tasks include multiple‑choice questions (CTI‑MCQ), root cause mapping of vulnerabilities to CWE categories (CTI‑RCM), vulnerability severity prediction using CVSS (CTI‑VSP), and threat actor attribution from threat reports (CTI‑TAA). Each task is provided as a TSV file containing prompts and their correct answers. The data were curated by Md Tanvirul Alam and Dipkamal Bhusal and are drawn from authoritative sources such as NIST, MITRE, and the GDPR.
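Because each task ships as a TSV of prompts and correct answers, scoring a model comes down to reading the rows and comparing predictions with the answer column. The following is a minimal sketch under stated assumptions, not the benchmark's official evaluation code: the column names `Prompt` and `GT`, the helper names `load_task` and `accuracy`, and the sample rows are all hypothetical placeholders, so check the header of the actual file before reusing it.

```python
import csv
import io

# Placeholder rows standing in for a task file. The column names
# "Prompt" and "GT" are assumptions, not confirmed by the dataset.
SAMPLE_TSV = (
    "Prompt\tGT\n"
    "Placeholder question 1 (options A-D)\tB\n"
    "Placeholder question 2 (options A-D)\tC\n"
)


def load_task(handle, prompt_col="Prompt", answer_col="GT"):
    """Read a tab-separated task file into (prompt, answer) pairs."""
    # QUOTE_NONE keeps quotation marks inside prompts as literal text.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [(row[prompt_col], row[answer_col].strip()) for row in reader]


def accuracy(examples, predictions):
    """Fraction of predictions matching the answer, ignoring case and padding."""
    if not examples:
        return 0.0
    correct = sum(
        1
        for (_, gold), pred in zip(examples, predictions)
        if pred.strip().upper() == gold.upper()
    )
    return correct / len(examples)


examples = load_task(io.StringIO(SAMPLE_TSV))
score = accuracy(examples, ["b", "A"])  # hypothetical model outputs
print(f"{len(examples)} examples, accuracy = {score:.2f}")
```

Exact-match accuracy suits tasks whose answers are a single label, such as the multiple-choice task; tasks with numeric or free-text answers would likely need a different comparison. To read a real file, pass `open(path, newline="", encoding="utf-8")` in place of the `io.StringIO` object.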