Thermal stability prediction is a regression task in which each input protein x is mapped to a label y representing its thermal stability. The dataset originates from the FLIP project, using protein data from the Human-cell split and excluding proteins without AlphaFold2 structures. The data are divided based on 70% structural similarity into training, validation, and test sets containing 5,310, 706, and 706 samples, respectively. They are stored in LMDB format and include fields such as sample count, UniProt ID, structure-aware sequence, and adaptive fitness labels.
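As a quick sanity check on the split sizes quoted above, the following minimal sketch computes the total sample count and the share of each split. The split names (`train`, `valid`, `test`) and the helper `split_fractions` are illustrative assumptions, not identifiers taken from the dataset files.

```python
# Split sizes as stated in the dataset description.
# The key names are hypothetical labels chosen for this sketch.
SPLIT_SIZES = {"train": 5310, "valid": 706, "test": 706}


def split_fractions(sizes):
    """Return each split's share of the total number of samples."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


total = sum(SPLIT_SIZES.values())
fractions = split_fractions(SPLIT_SIZES)

for name, frac in fractions.items():
    print(f"{name}: {SPLIT_SIZES[name]} samples ({frac:.1%})")
print(f"total: {total} samples")
```

Under these numbers the three splits add up to 6,722 samples, roughly a 79/10.5/10.5 partition.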