High Quality Data

Dataset Hub

Explore high-quality datasets for your AI and machine learning projects.

Sort:

Browse by Category

LorenzH/juliet_test_suite_c_1_3

This dataset includes all test cases from NIST's Juliet test suite for the C and C++ programming languages. Each sample provides a good and a defective implementation, extracted via the Juliet suite's OMITGOOD and OMITBAD preprocessor macros. The dataset supports software defect prediction and code clone detection tasks. Its structure comprises data instances, fields, and splits. Fields include index, filename, defect class, good code, and bad code. Splits contain training and test set sizes. The dataset is synthetic, with all samples handcrafted, and therefore does not fully represent real‑world software defects.

hugging_face

View Details