DATASET-CAPE-RhlA-seqlabel

The CAPE dataset contains mutant sequences of the RhlA enzyme together with their functional evaluation metrics. The training set comprises 1,593 sequences, each paired with an enzyme activity metric. The test set comprises 925 unlabeled sequences whose activity participants must predict. The goal is to improve rhamnolipid production and application potential by engineering RhlA mutations.

Updated 11/13/2024
huggingface

Description

CAPE Dataset: RhlA Enzyme Mutations

Dataset Introduction and Use Cases

RhlA (UniProt ID: Q51559, PDB ID: 8IK2) is a key enzyme in the synthesis of the hydrophobic component of rhamnolipids. It determines fatty-acid chain length and degree of unsaturation, which in turn influence the physicochemical properties and bioactivity of rhamnolipids.

Why Engineer RhlA?

Modifying RhlA enables precise control over fatty‑acid chain structure, thereby increasing rhamnolipid yield and enhancing its industrial and pharmaceutical applicability.

Dataset Description

Training Set: Saprot_CAPE_dataset_train.csv

  • File format: CSV
  • Number of sequences: 1,593
  • Columns:
    • protein: Represents the mutation combination at six critical residues (positions 74, 101, 143, 148, 173, 176).
    • label: Enzyme activity metric indicating overall productivity.
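
The column layout above can be read with Python's standard csv module. The sketch below is illustrative only and rests on assumptions not confirmed by the description: that the protein field holds either a six-letter combination (one residue per listed position) or a full-length sequence numbered from 1, and that label parses as a float. The helper names six_site_key and read_training_rows are hypothetical, and the sample rows are made up.

```python
import csv
import io

# The six mutated positions listed in the dataset description (1-based).
POSITIONS = (74, 101, 143, 148, 173, 176)


def six_site_key(protein: str) -> str:
    """Return the residues at the six positions as a 6-letter string.

    Assumption: `protein` is either already a 6-letter combination or a
    full-length sequence whose first residue is position 1.
    """
    if len(protein) == len(POSITIONS):
        return protein
    if len(protein) < max(POSITIONS):
        raise ValueError(f"sequence too short: {len(protein)} residues")
    return "".join(protein[p - 1] for p in POSITIONS)


def read_training_rows(handle):
    """Yield (six_site_key, activity) pairs from a CSV with protein,label columns."""
    for row in csv.DictReader(handle):
        yield six_site_key(row["protein"]), float(row["label"])


# Made-up rows for illustration only; they are not taken from the dataset.
sample = io.StringIO("protein,label\nAVLMFS,0.82\nGVLMFS,1.10\n")
rows = list(read_training_rows(sample))
print(rows)
```

To read the real file, replace the in-memory sample with open("Saprot_CAPE_dataset_train.csv", newline="") and check the first few rows to confirm which of the two assumed formats the protein column actually uses.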

Test Set: Saprot_CAPE_dataset_test.csv

  • Number of sequences: 925
  • Description: Contains sequence information only, with no activity labels. Participants predict the activity of each sequence and submit the predictions to Kaggle for performance feedback.
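
As a rough starting point for the prediction task, one could fit a naive per-site averaging baseline on the training labels and write the predictions to a CSV. This is a sketch under stated assumptions, not the challenge's reference method: sequences are assumed to be reduced to six-letter keys (one residue per mutated position), the function names fit_additive and predict are hypothetical, the training rows are made up, and the output header "protein,label" is a guess, since the required Kaggle submission format is not given here.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def fit_additive(train):
    """Fit per-site residue effects from (six-letter key, activity) pairs.

    Returns the global mean activity and, for each (site index, residue),
    the mean deviation from that global mean.
    """
    mu = mean(y for _, y in train)
    buckets = defaultdict(list)
    for key, y in train:
        for site, aa in enumerate(key):
            buckets[(site, aa)].append(y - mu)
    effects = {k: mean(v) for k, v in buckets.items()}
    return mu, effects


def predict(key, mu, effects):
    """Global mean plus the summed per-site effects.

    Residues never seen at a site contribute 0.0, so a fully unseen
    variant falls back to the global mean.
    """
    return mu + sum(effects.get((site, aa), 0.0) for site, aa in enumerate(key))


# Made-up training rows and test keys, for illustration only.
train = [("AVLMFS", 1.0), ("GVLMFS", 3.0)]
mu, effects = fit_additive(train)
test_keys = ["AVLMFS", "GVLMFS", "CVLMFS"]

out = io.StringIO()
writer = csv.writer(out)
writer.writerow(["protein", "label"])  # assumed header, not the official format
for key in test_keys:
    writer.writerow([key, f"{predict(key, mu, effects):.4f}"])
print(out.getvalue())
```

Because the baseline treats the six sites as independent, it ignores interactions between mutations. It is mainly useful as a sanity check against which a learned model can be compared.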

Topics

Enzyme Engineering
Biotechnology

Source

Organization: huggingface

Created: 11/13/2024
