scikit-fingerprints/MoleculeNet_ESOL
The MoleculeNet ESOL dataset is the aqueous solubility regression task from the MoleculeNet benchmark. Target values are log‑transformed solubilities, expressed in log mol/L. The dataset contains 1,128 samples; a scaffold split is recommended, and the recommended evaluation metric is RMSE.
Updated 7/18/2024
hugging_face
Description
MoleculeNet ESOL Dataset Overview
Basic Information
- Dataset Name: MoleculeNet ESOL
- Task Types:
  - Tabular regression
  - Graph machine learning
  - Tabular classification
- Tags:
  - Chemistry
  - Biology
  - Medicine
- Size: 1K < n < 10K
- Configuration:
  - Config name: default
  - Data files:
    - Split: train
    - Path: "esol.csv"
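Since the configuration exposes a single train split backed by one CSV file, a local copy of "esol.csv" can be read with the Python standard library alone. The following is a minimal sketch, not the dataset authors' loader. The helper name `load_esol` is hypothetical, and the column names `SMILES` and `label` are assumptions that should be checked against the header of the actual file.

```python
import csv
from pathlib import Path


def load_esol(path, smiles_col="SMILES", target_col="label"):
    """Read an ESOL-style CSV into parallel lists of SMILES and float targets.

    The default column names are assumptions, not confirmed by the dataset
    card; pass the names found in the header of your copy of esol.csv.
    """
    smiles, targets = [], []
    with Path(path).open(newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            smiles.append(row[smiles_col])
            targets.append(float(row[target_col]))
    return smiles, targets


# Example call, assuming the file has been downloaded next to this script:
# smiles, log_solubility = load_esol("esol.csv")
```

If the Hugging Face `datasets` library is installed, calling `load_dataset` with the repository name scikit-fingerprints/MoleculeNet_ESOL is the more usual route; the sketch above only avoids the extra dependency.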
Task Description
- Task: Predict aqueous solubility
- Target: Log‑transformed solubility, in units of log mol/L (logarithm of moles per litre)
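Because the target is on a log scale, predictions usually need converting back before they can be read as concentrations. The sketch below assumes a base-10 logarithm, which is the convention in the ESOL paper but is not stated explicitly on this card. Both function names are hypothetical.

```python
def log_solubility_to_molar(log_s):
    """Convert a log10 solubility value (log mol/L) to mol/L.

    Assumes the logarithm is base 10.
    """
    return 10.0 ** log_s


def molar_to_grams_per_litre(mol_per_litre, molar_mass_g_per_mol):
    """Convert mol/L to g/L given the compound's molar mass in g/mol."""
    return mol_per_litre * molar_mass_g_per_mol
```

On this scale an error of 1.0 in the target corresponds to a tenfold error in concentration, which is worth keeping in mind when reading RMSE values.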
Dataset Features
- Number of Tasks: 1
- Task Type: Regression
- Total Samples: 1,128
- Recommended Split: scaffold
- Recommended Metric: RMSE
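The recommended split and metric can be illustrated with a short sketch. A scaffold split keeps all molecules that share a scaffold on the same side of the split, so the test set contains structures unlike those seen in training. The code below shows one common greedy variant (largest scaffold groups assigned first) and is not necessarily the exact procedure used by MoleculeNet or scikit-fingerprints. It assumes the scaffold keys have already been computed, for example as Bemis-Murcko scaffold SMILES from RDKit; the function names and the 0.8 training fraction are illustrative choices.

```python
import math
from collections import defaultdict


def scaffold_group_split(scaffolds, train_fraction=0.8):
    """Split sample indices so that no scaffold key appears in both sets.

    `scaffolds` holds one precomputed scaffold key per sample (an assumption:
    computing the keys, e.g. with RDKit, is left to the caller).
    """
    groups = defaultdict(list)
    for index, key in enumerate(scaffolds):
        groups[key].append(index)

    # Largest groups first; ties broken by first index so the result is deterministic.
    ordered = sorted(groups.values(), key=lambda group: (-len(group), group[0]))

    train_cutoff = train_fraction * len(scaffolds)
    train, test = [], []
    for group in ordered:
        # A group goes to train only if it fits under the cutoff as a whole.
        if len(train) + len(group) <= train_cutoff:
            train.extend(group)
        else:
            test.extend(group)
    return train, test


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

Because whole groups are assigned at once, the realised training fraction only approximates `train_fraction`, and the gap can be noticeable on a dataset of this size.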
References
- Delaney, John S., "ESOL: Estimating Aqueous Solubility Directly from Molecular Structure", Journal of Chemical Information and Computer Sciences 44.3 (2004): 1000-1005
- Wu, Zhenqin, et al., "MoleculeNet: a benchmark for molecular machine learning", Chemical Science 9.2 (2018): 513-530
Topics
Chemistry
Machine Learning
Source
Organization: hugging_face
Created: Unknown