Explore high-quality datasets for your AI and machine learning projects.
The CAPE dataset contains mutation sequences of the RhlA enzyme and their functional evaluation metrics. The training set comprises 1,593 sequences, each paired with an enzyme activity metric for model training. The test set includes 925 sequences for model evaluation, where participants must predict the activity of these sequences. The goal is to optimize rhamnolipids production and application potential by engineering RhlA mutations.