JUHE API Marketplace
High Quality Data

Dataset Hub

Explore high-quality datasets for your AI and machine learning projects.

CrystalDFT

Piezoelectric Materials
Materials Science

CrystalDFT is a database of small molecular crystals created by the Bernal Institute at the University of Limerick. It contains DFT‑predicted electromechanical properties for 572 organic crystals. The dataset was generated by high‑throughput screening to identify sustainable materials with strong piezoelectric performance that could replace lead‑based piezoelectrics. Its intended application is the development and optimization of piezoelectric materials that avoid the environmental and health concerns associated with traditional lead‑based compounds.

arXiv

LeMat-Bulk

Materials Science
Density Functional Theory

LeMat-Bulk is a materials science and chemistry dataset organized into several configurations (`compatible_pbe`, `compatible_pbesol`, `compatible_scan`, and `non_compatible`). Its fields cover chemical structure features such as elements, chemical formulas, lattice vectors, and energy properties. The dataset is intended to support materials science research, particularly density functional theory (DFT) calculations, and its subsets are filtered for compatibility with different DFT functionals and pseudopotentials. The dataset documentation also describes the methods used to ensure compatibility and to deduplicate entries. It is distributed under the CC‑BY‑4.0 license and can be loaded in Python with the Hugging Face `datasets` library.

Hugging Face
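As a rough illustration of loading one LeMat-Bulk configuration in Python, the sketch below wraps the `load_dataset` function from the Hugging Face `datasets` library. The configuration names come from the description above. The repository id `LeMaterial/LeMat-Bulk`, the `train` split name, and the helper name `load_lemat_bulk` are assumptions made for illustration and should be checked against the dataset card.

```python
# Configuration names as listed in the dataset description.
LEMAT_BULK_CONFIGS = (
    "compatible_pbe",
    "compatible_pbesol",
    "compatible_scan",
    "non_compatible",
)


def load_lemat_bulk(config="compatible_pbe",
                    repo_id="LeMaterial/LeMat-Bulk",  # assumed repository id
                    split="train",                    # assumed split name
                    streaming=True):
    """Load one LeMat-Bulk configuration (hypothetical helper).

    With streaming=True, records are fetched lazily instead of
    downloading the whole subset up front.
    """
    if config not in LEMAT_BULK_CONFIGS:
        raise ValueError(
            f"unknown config {config!r}; expected one of {LEMAT_BULK_CONFIGS}"
        )
    # Imported here so the config check works without the package installed.
    from datasets import load_dataset  # third-party: pip install datasets
    return load_dataset(repo_id, config, split=split, streaming=streaming)
```

Calling `load_lemat_bulk("compatible_pbesol")` would then return an iterable over that subset, provided the assumed repository id and split name are correct. Validating the configuration name before the import keeps typos from triggering a network request.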

nimashoghi/wbm

Materials Science
Crystal Structure

The dataset provides material‑science features for each entry, including chemical formula, number of sites, volume, energy, and band gap, which can be used for material property research and prediction. It includes 256,963 samples (total size 725 MB, download size 156 MB).

Hugging Face

materials-toolkits/materials-project

Materials Science
Chemistry

This dataset contains per‑atom formation energy data for 133,420 materials. It is provided as two main files: `index.json`, which includes material indices, IDs, formulas, atom counts, and per‑atom formation energies; and `data.hdf5`, which stores structural information (lattice, number of atoms, per‑atom energy, atom pointers) and atomic data (positions, atomic numbers).

Hugging Face
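The "atom pointers" mentioned for `data.hdf5` suggest a flat layout in which the atoms of all structures are concatenated and a pointer array marks where each structure begins. The sketch below shows how such a layout is typically sliced. It assumes an offsets array of length `n_structures + 1`, which is a common convention but is not confirmed by the description. All names and values are toy data for illustration, not taken from the dataset.

```python
def atoms_of(structure_index, atoms_ptr, atomic_numbers, positions):
    """Return (atomic_numbers, positions) for one structure.

    Assumes atoms_ptr[i]:atoms_ptr[i + 1] spans the atoms of structure i
    within the flat per-atom arrays (hypothetical layout).
    """
    start = atoms_ptr[structure_index]
    stop = atoms_ptr[structure_index + 1]
    return atomic_numbers[start:stop], positions[start:stop]


# Toy data: two structures with 2 and 3 atoms, stored back to back.
atoms_ptr = [0, 2, 5]
atomic_numbers = [11, 17, 1, 1, 8]
positions = [
    (0.0, 0.0, 0.0), (0.5, 0.5, 0.5),                   # structure 0
    (0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0),  # structure 1
]

numbers, coords = atoms_of(1, atoms_ptr, atomic_numbers, positions)
n_atoms = [atoms_ptr[i + 1] - atoms_ptr[i] for i in range(len(atoms_ptr) - 1)]
```

Storing offsets rather than padded per-structure arrays keeps the file compact when structures have very different atom counts. The same slicing would apply to arrays read from the HDF5 file once the actual dataset names are known.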