materials-toolkits/materials-project
This dataset contains per‑atom formation energy data for 133,420 materials. It is provided as two main files: `index.json`, which includes material indices, IDs, formulas, atom counts, and per‑atom formation energies; and `data.hdf5`, which stores structural information (lattice, number of atoms, per‑atom energy, atom pointers) and atomic data (positions, atomic numbers).
Dataset Overview
Dataset Name
Materials Project (2019 dump)
Dataset Description
This dataset contains per‑atom formation energy data for 133,420 materials.
Data Source
The data was processed from `mp.2019.04.01.json`.
Download Link
MD5 Checksum
c132f3781f32cd17f3a92aa6501b9531
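A checksum like the one above can be compared against a downloaded file before unpacking it. The following is a minimal sketch using Python's standard `hashlib`. It assumes the checksum applies to the `materials-project.tar.gz` archive, and the helper names `md5_of_file` and `is_intact` are hypothetical, not part of the dataset's tooling.

```python
import hashlib

# Checksum as listed in this document; assumed to refer to the archive file.
EXPECTED_MD5 = "c132f3781f32cd17f3a92aa6501b9531"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large archives do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

Calling `is_intact("materials-project.tar.gz")` on the downloaded archive would then indicate whether the file matches the published checksum, under the assumption stated above.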
Data Content
The dataset is packaged in materials-project.tar.gz.
Index File (index.json)
Contains the following fields:
- `index` (int): Index of the structure in the data file.
- `id` (str): Materials Project ID.
- `formula` (str): Chemical formula.
- `natoms` (int): Number of atoms.
- `energy_pa` (float): Formation energy per atom.
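The index fields above lend themselves to simple filtering before any structural data is read. The sketch below assumes `index.json` is a JSON array of records, each carrying the listed fields; the file layout is not stated here, so treat that as an assumption. The function names `load_index` and `select_by_energy` are hypothetical.

```python
import json


def load_index(path):
    """Load the index file (assumed to be a JSON array of records)."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def select_by_energy(entries, max_energy_pa):
    """Return records whose formation energy per atom is at or below the threshold."""
    return [e for e in entries if e["energy_pa"] <= max_energy_pa]
```

For example, `select_by_energy(load_index("index.json"), 0.0)` would keep only the entries with a non-positive `energy_pa`, and the `index` field of each result could then be used to look up the matching structure in the data file.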
Data File (data.hdf5)
Contains the following fields:
- `structures`: Group containing structural information.
  - `structures/cell` (float32): Lattice of the material.
  - `structures/natoms` (int32): Number of atoms.
  - `structures/energy_pa` (float32): Formation energy per atom.
  - `structures/atoms_ptr` (int64): Position of the first atom in the structure.
- `atoms`: Group containing atomic information.
  - `atoms/positions` (float32): Atom positions.
  - `atoms/atomic_number` (uint8): Atomic numbers of the atoms.
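The pointer layout described above suggests that the atoms of all structures are stored back to back, with `atoms_ptr` giving the offset of each structure's first atom. The sketch below reads one structure under that assumption: the atoms of structure `i` are taken to occupy rows `atoms_ptr[i]` through `atoms_ptr[i] + natoms[i] - 1`. The helper names are hypothetical, and the array shapes are not stated here, so they are left unchecked.

```python
def read_structure(structures, atoms, i):
    """Collect the data for structure i.

    `structures` and `atoms` may be h5py groups or any mapping of
    indexable, sliceable arrays keyed by the field names listed above.
    """
    # Assumption: atoms_ptr is an offset into the contiguous atoms arrays.
    start = int(structures["atoms_ptr"][i])
    stop = start + int(structures["natoms"][i])
    return {
        "cell": structures["cell"][i],
        "energy_pa": float(structures["energy_pa"][i]),
        "atomic_number": atoms["atomic_number"][start:stop],
        "positions": atoms["positions"][start:stop],
    }


def read_from_file(path, i):
    """Open the HDF5 file with h5py and read structure i."""
    import h5py  # third-party; imported lazily so read_structure has no dependencies

    with h5py.File(path, "r") as f:
        # Indexing and slicing h5py datasets copies the values into memory,
        # so the result remains usable after the file is closed.
        return read_structure(f["structures"], f["atoms"], i)
```

With `h5py` installed, `read_from_file("data.hdf5", 0)` would return the lattice, formation energy per atom, atomic numbers, and positions of the first structure. Keeping the slicing logic in `read_structure` separate from file handling makes it easy to check against small in-memory arrays.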