REFUSE‑BENCH is a benchmark dataset for binary function similarity detection, created by the University of Maryland, Baltimore. It contains 243,128 binary files, compiled under various configurations and optimization settings from source code collected on GitHub so that the data reflects real‑world computer security scenarios. The dataset supports research in reverse engineering, malware analysis, and vulnerability detection, and is used to evaluate and improve binary function similarity models.