scikit-fingerprints/MoleculeNet_ESOL

The ESOL dataset is part of the MoleculeNet benchmark and is used for predicting aqueous solubility. Target values are log‑transformed solubilities, expressed in log mol/L. The dataset contains 1,128 samples. The recommended split is scaffold‑based and the recommended evaluation metric is RMSE (root mean squared error).

Updated 7/18/2024
hugging_face

Description

MoleculeNet ESOL Dataset Overview

Basic Information

  • Dataset Name: MoleculeNet ESOL
  • Task Types:
    • Tabular regression
    • Graph machine learning
    • Tabular classification
  • Tags:
    • Chemistry
    • Biology
    • Medicine
  • Size: 1K < n < 10K
  • Configuration:
    • Config name: default
    • Data files:
      • Split: train
      • Path: "esol.csv"
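
A minimal sketch of reading the single `esol.csv` file listed above, using only the Python standard library. The column names `SMILES` and `label` are assumptions and are not stated on this page; check the file header and adjust the arguments if they differ. `load_esol` is a hypothetical helper name.

```python
import csv


def load_esol(path, smiles_col="SMILES", target_col="label"):
    """Read the CSV into parallel lists of SMILES strings and float targets.

    The default column names are assumptions; pass the real header names
    if the file uses different ones.
    """
    smiles, targets = [], []
    with open(path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            smiles.append(row[smiles_col])
            targets.append(float(row[target_col]))
    return smiles, targets
```

With a local copy of the file, `smiles, targets = load_esol("esol.csv")` should return two lists of equal length, one entry per sample.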

Task Description

  • Task: Predict aqueous solubility
  • Target: Log‑transformed solubility, in log mol/L (logarithm of the solubility in moles per litre)
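
Because the target is on a logarithmic scale, predictions usually need converting back before they can be read as concentrations. The sketch below assumes the logarithm is base 10, which this page does not state explicitly; both function names are hypothetical.

```python
import math


def log_s_to_mol_per_litre(log_s):
    """Convert a log solubility (assumed base 10, log mol/L) to mol/L."""
    return 10.0 ** log_s


def mol_per_litre_to_log_s(solubility):
    """Convert a solubility in mol/L to log mol/L (assumed base 10)."""
    if solubility <= 0:
        raise ValueError("solubility must be positive")
    return math.log10(solubility)
```

Under this assumption, a target of -2 corresponds to a solubility of 0.01 mol/L, so an error of one unit in the target is a tenfold error in concentration.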

Dataset Features

  • Number of Tasks: 1
  • Task Type: Regression
  • Total Samples: 1,128
  • Recommended Split: scaffold
  • Recommended Metric: RMSE
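
The recommended split and metric can be sketched as follows. This is one common variant of a scaffold split, not necessarily the exact procedure behind the benchmark: samples are grouped by a precomputed scaffold string, the largest groups go to the training set first, and no scaffold is shared between subsets. Computing the scaffolds themselves needs a cheminformatics toolkit (for example RDKit's Murcko scaffolds) and is left out here. The function names and the 80/10/10 fractions are assumptions.

```python
import math
from collections import defaultdict


def scaffold_split(scaffolds, frac_train=0.8, frac_valid=0.1):
    """Split sample indices so that each scaffold lands in exactly one subset.

    `scaffolds` holds one precomputed scaffold string per sample.
    """
    groups = defaultdict(list)
    for index, scaffold in enumerate(scaffolds):
        groups[scaffold].append(index)

    # Largest groups first; ties broken by first index for determinism.
    ordered = sorted(groups.values(), key=lambda g: (-len(g), g[0]))

    n = len(scaffolds)
    train_cap = frac_train * n
    valid_cap = (frac_train + frac_valid) * n
    train, valid, test = [], [], []
    for group in ordered:
        if len(train) + len(group) <= train_cap:
            train.extend(group)
        elif len(train) + len(valid) + len(group) <= valid_cap:
            valid.extend(group)
        else:
            test.extend(group)
    return train, valid, test


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))
```

Since whole scaffold groups are assigned at once, the resulting subset sizes only approximate the requested fractions. RMSE is reported in the same unit as the target, log mol/L.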

References

  1. Delaney, John S., "ESOL: Estimating Aqueous Solubility Directly from Molecular Structure", Journal of Chemical Information and Computer Sciences 44.3 (2004): 1000–1005
  2. Wu, Zhenqin, et al., "MoleculeNet: a benchmark for molecular machine learning", Chemical Science 9.2 (2018): 513–530

Topics

Chemistry
Machine Learning

Source

Organization: hugging_face

Created: Unknown
