0. Reference Energy Construction
Implementation
This stage is implemented in:
src/dopingflow/refs.py
The public entry point is:
run_refs_build(...)
Purpose
This stage prepares all thermodynamic reference quantities required for formation energy evaluation of substitutionally doped structures.
The stage performs the following tasks:
Relax the host oxide unit cell
Build and relax the host supercell
Relax reference structures according to the selected reference scheme
Store all relevant energies and metadata in:
reference_structures/reference_energies.json
The resulting reference data are later used by the formation-energy stage.
Inputs
This stage uses settings from:
[references]: reference mode, host structure, reference directories, relaxation settings, oxygen settings, and caching behavior
[doping]: defines the host species and the dopant set used in later steps
The host supercell is defined in [references].supercell and is constructed
at this stage.
Reference Modes
Two thermodynamic reference schemes are supported.
Metal reference mode
In metal mode, elemental chemical potentials are taken from relaxed elemental reference phases.
For each relevant element \(i\), the workflow relaxes the corresponding metal structure and computes:
\[ \mu_i = \frac{E_{\mathrm{metal}}}{N_{\mathrm{atoms}}} \]
where:
\(E_{\mathrm{metal}}\) is the relaxed total energy of the elemental reference structure
\(N_{\mathrm{atoms}}\) is the number of atoms in that structure
This mode corresponds to equilibrium with elemental reservoirs.
Oxide reference mode
In oxide mode, dopant chemical potentials are derived from oxide reference phases together with the oxygen chemical potential.
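For a binary oxide \(M_xO_y\), the chemical potentials satisfy:
\[ x\,\mu_M + y\,\mu_O = E_{M_xO_y}^{\mathrm{(f.u.)}} \]
which gives:
\[ \mu_M = \frac{E_{M_xO_y}^{\mathrm{(f.u.)}} - y\,\mu_O}{x} \]
Here \(E_{M_xO_y}^{\mathrm{(f.u.)}}\) is the relaxed energy per formula unit of the oxide reference.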
The oxygen chemical potential is obtained from the gas reference (typically \(O_2\)):
\[ \mu_O = \tfrac{1}{2} E_{O_2} + \Delta\mu_O \]
where:
\(E_{O_2}\) is the relaxed total energy of the oxygen molecule
\(\Delta\mu_O\) is the optional shift defined by
muO_shift_ev
The setting oxygen_mode is stored for traceability.
For example, O-rich usually corresponds to:
\[ \Delta\mu_O = 0 \]
while more oxygen-poor conditions may be represented by a negative shift.
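As a minimal sketch of the oxygen reference described above, the following helper evaluates \(\mu_O\) as half the \(O_2\) total energy plus the optional shift. The function name and the numerical values are illustrative assumptions, not taken from src/dopingflow/refs.py; only the setting name muO_shift_ev comes from the text.

```python
def oxygen_chemical_potential(e_o2_ev: float, muO_shift_ev: float = 0.0) -> float:
    """Return mu_O = 0.5 * E_O2 + delta_mu_O, with all energies in eV.

    Hypothetical helper for illustration; the real stage may organise
    this differently.
    """
    return 0.5 * e_o2_ev + muO_shift_ev


# Illustrative energy only, not a computed M3GNet value.
E_O2 = -10.0

mu_O_rich = oxygen_chemical_potential(E_O2)        # O-rich: zero shift
mu_O_poor = oxygen_chemical_potential(E_O2, -1.5)  # O-poor: negative shift

print(mu_O_rich, mu_O_poor)
```

A negative shift lowers \(\mu_O\), which in oxide mode raises the cation chemical potentials derived from the oxide references.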
Method Summary
Read the host oxide unit-cell structure
Relax the host unit cell
Build and relax the host supercell
Determine the selected reference mode
- Metal mode:
  Relax elemental metal references
  Compute per-atom elemental chemical potentials
- Oxide mode:
  Relax oxide reference structures
  Relax the oxygen gas reference
  Compute \(\mu_O\)
  Derive cation chemical potentials from oxide thermodynamics
Write all results and metadata to
reference_energies.json
Formation Energy Framework
The workflow assumes substitutional doping on host sites.
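The formation energy is defined as:
\[ E_{\mathrm{form}} = E_{\mathrm{doped}} - E_{\mathrm{pristine}} - \sum_i n_i \left( \mu_i - \mu_{\mathrm{host}} \right) \]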
where:
\(E_{\mathrm{doped}}\) is the relaxed total energy of the doped supercell
\(E_{\mathrm{pristine}}\) is the relaxed total energy of the pristine host supercell
\(\mu_i\) is the chemical potential of dopant species \(i\)
\(\mu_{\mathrm{host}}\) is the chemical potential of the substituted host species
\(n_i\) is the number of substituted atoms of species \(i\)
This corresponds to removing host atoms and inserting dopant atoms while keeping the total lattice size fixed.
The same formal expression is used in both reference modes; only the way the chemical potentials are constructed differs.
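The bookkeeping behind this expression can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the formation-energy stage itself: the function name, argument layout, species labels, and numbers are all hypothetical. It assumes every substituted dopant atom replaces exactly one atom of a single host species.

```python
from typing import Mapping


def formation_energy(
    e_doped: float,
    e_pristine: float,
    n_sub: Mapping[str, int],
    mu: Mapping[str, float],
    host_species: str,
) -> float:
    """E_form = E_doped - E_pristine - sum_i n_i * (mu_i - mu_host), in eV."""
    mu_host = mu[host_species]
    # Each dopant atom added costs mu_i; each host atom removed returns mu_host.
    exchange = sum(n * (mu[species] - mu_host) for species, n in n_sub.items())
    return e_doped - e_pristine - exchange


# Illustrative values: two Sb atoms substituted on Sn sites.
e_form = formation_energy(
    e_doped=-100.0,
    e_pristine=-102.0,
    n_sub={"Sb": 2},
    mu={"Sb": -4.0, "Sn": -3.5},
    host_species="Sn",
)
print(e_form)
```

Because the chemical potentials enter only through the mapping mu, the same function serves both reference modes; only the way that mapping is filled differs.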
Host Reference Energy
The pristine host reference energy is computed by:
Reading the host oxide unit cell
Relaxing the unit cell
Building the requested supercell
Relaxing the supercell
Extracting the final total energy
The relaxed host supercell is reused by later workflow stages as the starting point for structure generation.
Both atomic positions and lattice vectors are allowed to relax.
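The atom counts and per-atom energies recorded for the host follow directly from the unit cell and the supercell multipliers. The sketch below assumes that [references].supercell is a triple of integer repetitions along the three lattice vectors; that format, the helper names, and the numbers are assumptions made for illustration.

```python
from math import prod
from typing import Sequence


def supercell_atom_count(n_unit_cell_atoms: int, supercell: Sequence[int]) -> int:
    """Number of atoms in an (na, nb, nc) repetition of the unit cell."""
    if len(supercell) != 3 or any(m < 1 for m in supercell):
        raise ValueError("supercell must be three positive integers")
    return n_unit_cell_atoms * prod(supercell)


def energy_per_atom(total_energy_ev: float, n_atoms: int) -> float:
    """Total energy divided by the number of atoms, in eV/atom."""
    return total_energy_ev / n_atoms


# Illustrative values: a 6-atom unit cell repeated 2 x 2 x 3.
n_super = supercell_atom_count(6, (2, 2, 3))
e_per_atom = energy_per_atom(-432.0, n_super)
print(n_super, e_per_atom)
```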
Metal Chemical Potentials
For each relevant elemental reference phase, the workflow computes:
\[ \mu_i = \frac{E_{\mathrm{metal}}}{N_{\mathrm{atoms}}} \]
These values are used directly in formation-energy evaluation when
reference_mode = "metal".
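A minimal sketch of the per-atom normalisation used in metal mode is given below. The helper name and the example energy are hypothetical; the only relation taken from the text is that the chemical potential equals the relaxed total energy divided by the number of atoms in the reference structure.

```python
def metal_chemical_potential(e_metal_ev: float, n_atoms: int) -> float:
    """Return mu_i = E_metal / N_atoms in eV/atom."""
    if n_atoms <= 0:
        raise ValueError("n_atoms must be positive")
    return e_metal_ev / n_atoms


# Illustrative value: a 2-atom elemental cell with total energy -8.0 eV.
mu_example = metal_chemical_potential(-8.0, 2)
print(mu_example)
```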
Oxide-Derived Chemical Potentials
When reference_mode = "oxide", the workflow stores relaxed oxide
reference energies and the oxygen gas reference energy.
The oxygen chemical potential is computed from the gas reference and the optional oxygen shift. The cation chemical potentials are then derived from the oxide stoichiometry.
For a reduced oxide composition \(M_xO_y\):
\[ \mu_M = \frac{E_{M_xO_y}^{\mathrm{(f.u.)}} - y\,\mu_O}{x} \]
where \(E_{M_xO_y}^{\mathrm{(f.u.)}}\) is the relaxed energy per formula unit of the oxide reference.
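The cation relation can be checked numerically with a short sketch. The function name and all numbers are illustrative assumptions. The example takes a reference cell containing several formula units and first reduces its total energy to a per-formula-unit value, which is the quantity the relation requires.

```python
def cation_chemical_potential(e_fu_ev: float, x: int, y: int, mu_o_ev: float) -> float:
    """Return mu_M = (E_fu - y * mu_O) / x for a reduced oxide M_x O_y."""
    if x <= 0 or y < 0:
        raise ValueError("x must be positive and y non-negative")
    return (e_fu_ev - y * mu_o_ev) / x


# Illustrative values: an M2O3 reference cell holding 4 formula units.
e_cell = -120.0            # total energy of the relaxed reference cell, eV
n_formula_units = 4
e_fu = e_cell / n_formula_units   # -30.0 eV per M2O3 formula unit

mu_M = cation_chemical_potential(e_fu, x=2, y=3, mu_o_ev=-5.0)
print(mu_M)
```

With these numbers, lowering \(\mu_O\) by 1 eV raises \(\mu_M\) by \(y/x = 1.5\) eV, which shows how the oxygen shift propagates into the dopant and host chemical potentials.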
Relaxation Method
All reference relaxations use:
Interatomic potential: M3GNet
Optimizer: FIRE
Convergence criterion: maximum force below
fmax
The relaxations are fully unconstrained (cell parameters and atomic positions).
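The force criterion can be illustrated without any simulation library. The sketch below assumes the common convention, used for example by ASE optimizers, that the "maximum force" is the largest Euclidean norm of the per-atom force vectors; whether the workflow applies exactly this test is an assumption, and the helper name and values are hypothetical.

```python
import math
from typing import Sequence


def forces_converged(forces: Sequence[Sequence[float]], fmax: float) -> bool:
    """True if every per-atom force norm (eV/Angstrom) is below fmax."""
    largest = max(math.sqrt(sum(c * c for c in f)) for f in forces)
    return largest < fmax


# Illustrative forces on two atoms; the norms are about 0.01 and 0.05 eV/Angstrom.
example_forces = [
    [0.01, 0.00, 0.00],
    [0.00, 0.03, 0.04],
]

print(forces_converged(example_forces, fmax=0.1))   # loose threshold
print(forces_converged(example_forces, fmax=0.02))  # tight threshold
```

For fully unconstrained relaxations, the optimizer typically also includes the stress on the cell in the quantity being minimised; that part is omitted here.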
Caching Strategy
If:
skip_if_done = true
and reference_energies.json already exists, this stage is skipped.
This avoids unnecessary recomputation, and repeated runs reuse the same stored reference energies.
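The skip condition amounts to a flag check combined with a file-existence test. A minimal sketch follows; the function name is hypothetical, and the real stage may inspect more than the presence of the file.

```python
import tempfile
from pathlib import Path


def should_skip_refs(output_json: Path, skip_if_done: bool) -> bool:
    """Skip the stage only if caching is enabled and the output file exists."""
    return bool(skip_if_done and output_json.is_file())


# Demonstration in a temporary directory, so no real workflow files are touched.
with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "reference_structures" / "reference_energies.json"
    before = should_skip_refs(out, skip_if_done=True)   # file not written yet

    out.parent.mkdir(parents=True)
    out.write_text("{}")
    after = should_skip_refs(out, skip_if_done=True)    # file now present
    forced = should_skip_refs(out, skip_if_done=False)  # caching disabled

print(before, after, forced)
```

Because this check looks only at file existence, changing the reference settings does not by itself trigger recomputation in this sketch; the existing file would have to be removed or skip_if_done disabled.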
Outputs
The file reference_energies.json contains metadata and energies needed
for later stages.
Typical top-level fields include:
timestamp
reference_mode
host
references
oxide_mode (only relevant in oxide mode)
supercell
config_path
The host block typically contains:
host formula
source POSCAR path
relaxed unit-cell POSCAR path
relaxed supercell POSCAR path
number of atoms in unit cell and supercell
total and per-atom energies
The references block contains one entry per relaxed reference phase,
for example metals, oxides, or gas references.
For oxide mode, the JSON also stores the oxygen reference settings such as:
oxides_ref
gas_ref
oxygen_mode
muO_shift_ev
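To make the layout concrete, the sketch below assembles a dictionary with the top-level fields listed above and serialises it to JSON. Only the top-level field names and the four oxygen setting names come from the text. The nested key names, the placement of the oxygen settings under oxide_mode, and every value are placeholders chosen for illustration and may differ from what the stage actually writes.

```python
import json

# Placeholder content; nested keys and all values are hypothetical.
reference_energies = {
    "timestamp": "2024-01-01T00:00:00",
    "reference_mode": "oxide",
    "supercell": [2, 2, 2],
    "config_path": "path/to/config",
    "host": {
        "formula": "MO2",
        "n_atoms_unit_cell": 6,
        "n_atoms_supercell": 48,
        "energy_total_ev": -288.0,
        "energy_per_atom_ev": -6.0,
    },
    "references": {
        "M2O3": {"energy_total_ev": -120.0, "n_atoms": 20},
        "O2": {"energy_total_ev": -10.0, "n_atoms": 2},
    },
    "oxide_mode": {
        "oxides_ref": {"M": "M2O3"},
        "gas_ref": "O2",
        "oxygen_mode": "O-rich",
        "muO_shift_ev": 0.0,
    },
}

text = json.dumps(reference_energies, indent=2)
loaded = json.loads(text)
print(sorted(loaded))
```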
Notes and Limitations
This stage does not evaluate doped structures
Energies are ML-predicted, not DFT total energies
Reference phase selection strongly affects the resulting chemical potentials
The workflow currently assumes substitutional doping
Oxide-mode derivation assumes simple oxide reference chemistry
No finite-size corrections are applied
No charge-state corrections are included
No entropy or temperature effects are considered
No competing phase stability analysis is performed