Installation, Usage, and Outputs
This page explains how to install dopingflow and how to run the workflow either step-by-step or using the single orchestration command.
Installation
Clone the repository and install in editable mode:
git clone KazemZh/dopingflow
cd dopingflow
python -m venv .venv
source .venv/bin/activate
pip install -U pip
pip install -e .
Verify the CLI is available:
dopingflow --help
Required Inputs
Refer to the Required Input Files page.
Running the Workflow
All commands accept -c/--config to specify the TOML file.
If omitted, input.toml in the current directory is used.
Run the full pipeline with one command
To run the complete workflow in order:
dopingflow run-all -c input.toml
This executes:
refs -> generate -> scan -> relax -> filter -> bandgap -> formation -> collect
The surface stage is not included in run-all and must be executed separately:
dopingflow surface -c input.toml
This design allows users to first inspect and validate the final database before generating surface structures.
Resuming and partial runs (run-all)
You can resume from a given stage:
dopingflow run-all -c input.toml --from relax
You can stop at a stage (inclusive). This is useful if you do not want to run bandgap yet:
dopingflow run-all -c input.toml --until filter
You can print the planned steps without running them:
dopingflow run-all -c input.toml --dry-run
You can run only a subset of steps inside a selected range:
dopingflow run-all -c input.toml --from refs --until collect --only refs,generate,scan
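The way --from, --until, and --only combine can be pictured as slicing an ordered stage list and then intersecting it with the requested subset. The following is a minimal sketch of that selection logic, not dopingflow's actual implementation; the function name plan_stages and its arguments are hypothetical, and only the stage names and their order are taken from this page.

```python
# Stage order as listed for run-all on this page.
STAGES = ["refs", "generate", "scan", "relax", "filter",
          "bandgap", "formation", "collect"]


def plan_stages(start=None, until=None, only=None):
    """Return the stages that would run, in pipeline order (hypothetical helper)."""
    first = STAGES.index(start) if start else 0
    last = STAGES.index(until) if until else len(STAGES) - 1  # --until is inclusive
    if first > last:
        raise ValueError(f"{start!r} comes after {until!r} in the pipeline")
    selected = STAGES[first:last + 1]
    if only is not None:
        wanted = set(only)
        unknown = wanted - set(STAGES)
        if unknown:
            raise ValueError(f"unknown stages: {sorted(unknown)}")
        # Keep pipeline order, drop anything outside the requested subset.
        selected = [stage for stage in selected if stage in wanted]
    return selected


print(plan_stages(start="relax"))
print(plan_stages(until="filter"))
print(plan_stages(start="refs", until="collect", only=["refs", "generate", "scan"]))
```

To see what the tool itself plans to run, use --dry-run rather than relying on this sketch.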
Filtering controls inside run-all
The filter stage supports optional overrides (passed through by run-all):
Restrict filtering to a single composition folder:
dopingflow run-all -c input.toml --from relax --until filter --filter-only Sb5_Zr5
Force re-filtering even if outputs exist:
dopingflow run-all -c input.toml --from filter --until filter --force
Override filtering mode by specifying one of:
dopingflow run-all -c input.toml --from filter --until filter --window-mev 50
dopingflow run-all -c input.toml --from filter --until filter --topn 12
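The two override modes are not defined in detail on this page. One plausible reading, shown below purely as an illustration, is that --window-mev keeps candidates within an energy window above the lowest-energy candidate, while --topn keeps a fixed number of lowest-energy candidates. This interpretation, the helper names, and the example energies are all assumptions; whether dopingflow compares total or per-atom energies is not stated here.

```python
def keep_within_window(energies_ev, window_mev):
    """Keep names whose energy lies within window_mev of the minimum (assumed semantics)."""
    if not energies_ev:
        return []
    limit = min(energies_ev.values()) + window_mev / 1000.0  # meV -> eV
    kept = [name for name, energy in energies_ev.items() if energy <= limit]
    return sorted(kept, key=energies_ev.get)


def keep_top_n(energies_ev, n):
    """Keep the n lowest-energy names (assumed semantics)."""
    return sorted(energies_ev, key=energies_ev.get)[:n]


# Made-up energies in eV, for illustration only.
example = {"candidate_001": -10.00, "candidate_002": -9.97, "candidate_003": -9.90}
print(keep_within_window(example, window_mev=50))
print(keep_top_n(example, n=1))
```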
Step-by-step execution
Step 00: build and relax thermodynamic reference structures:
dopingflow refs-build -c input.toml
Step 01: structure generation:
dopingflow generate -c input.toml
Step 02: scan (symmetry-unique enumeration / sampling + ML-based single-point energies):
dopingflow scan -c input.toml
Step 03: relax scanned candidates (ML backend + ASE optimizer):
dopingflow relax -c input.toml
Step 04: filter relaxed candidates:
dopingflow filter -c input.toml
Optional Step 05: predict bandgap (ALIGNN):
Before running bandgap, set the model path:
export ALIGNN_MODEL_DIR=/path/to/your/alignn/model_root
dopingflow bandgap -c input.toml
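When driving the workflow from a script, it can help to confirm that ALIGNN_MODEL_DIR points at an existing directory before launching the bandgap stage. The sketch below only inspects the environment variable named on this page; the helper alignn_model_dir is hypothetical, and it does not check whether the directory actually contains a usable model.

```python
import os
from pathlib import Path


def alignn_model_dir(env=None):
    """Return the model directory as a Path, or None if unset or not a directory."""
    env = os.environ if env is None else env
    value = env.get("ALIGNN_MODEL_DIR", "")
    if not value:
        return None
    path = Path(value)
    return path if path.is_dir() else None


if alignn_model_dir() is None:
    print("ALIGNN_MODEL_DIR is not set to an existing directory; skip bandgap for now.")
```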
Step 06: formation energies:
dopingflow formation -c input.toml
Step 07: collect results into one CSV database:
dopingflow collect -c input.toml
Step 08: generate surfaces and optionally relax slabs:
dopingflow surface -c input.toml
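The step-by-step sequence above can also be scripted. The following sketch builds the same command lines listed on this page and, if asked, runs them in order, stopping at the first failure. The names build_commands and run_steps are hypothetical wrappers, not part of dopingflow; only the subcommand names and the -c option come from this page.

```python
import subprocess

# Subcommands in the order given in the step-by-step section.
STEP_COMMANDS = ["refs-build", "generate", "scan", "relax", "filter",
                 "bandgap", "formation", "collect", "surface"]


def build_commands(config="input.toml", skip=()):
    """Return one argument list per step, omitting any step named in skip."""
    return [["dopingflow", step, "-c", config]
            for step in STEP_COMMANDS if step not in skip]


def run_steps(commands):
    """Run each command in order; check=True aborts on the first non-zero exit."""
    for cmd in commands:
        print("running:", " ".join(cmd))
        subprocess.run(cmd, check=True)


# Preview the plan without executing anything (bandgap and surface left out here).
for cmd in build_commands(skip=("bandgap", "surface")):
    print(" ".join(cmd))
```

Calling run_steps(build_commands()) would execute every step, so it is worth previewing the list first as shown.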
Outputs Overview
This section summarizes the main outputs created by each stage.
Step 00 (refs-build)
Writes:
reference_structures/reference_energies.json
This file contains:
relaxed host unit-cell and supercell energies
relaxed reference structure energies
metadata about backend, optimizer, device, and convergence settings
reference information used for formation energy evaluation
Additional outputs:
reference_structures/relaxed/host_unit_relaxed.POSCAR
reference_structures/relaxed/host_supercell_<a>x<b>x<c>_relaxed.POSCAR
reference_structures/relaxed/refs/<name>_relaxed.POSCAR
Step 01 (generate)
Writes a structure folder per composition under [structure].outdir (default: random_structures):
<outdir>/<composition_tag>/POSCAR
<outdir>/<composition_tag>/metadata.json
Step 02 (scan)
Inside each <composition_tag>/ folder, writes:
ranking_scan.csv (top-k single-point energies)
scan_summary.txt (human-readable summary)
Candidate structures:
<composition_tag>/candidate_###/01_scan/POSCAR
<composition_tag>/candidate_###/01_scan/meta.json
The scan stage evaluates structures using a selected ML backend (e.g. M3GNet, UMA, MACE, GRACE).
Step 03 (relax)
For each candidate:
candidate_###/02_relax/POSCAR
candidate_###/02_relax/meta.json
Also writes per composition folder:
ranking_relax.csv
The relaxation stage uses:
ML interatomic potentials for forces
ASE optimizers for structural relaxation
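Given the folder layout described for Steps 02 and 03, a quick way to see which candidates in a composition folder already have a relaxed structure is to look for candidate_*/02_relax/POSCAR. This is a small sketch based only on that layout; the helper name relaxed_candidates is hypothetical, and it checks for file presence, not for convergence.

```python
from pathlib import Path


def relaxed_candidates(composition_dir):
    """Return sorted candidate folder names that contain 02_relax/POSCAR."""
    root = Path(composition_dir)
    return sorted(
        poscar.parent.parent.name  # .../candidate_xxx/02_relax/POSCAR -> candidate_xxx
        for poscar in root.glob("candidate_*/02_relax/POSCAR")
    )
```

For example, relaxed_candidates("random_structures/Sb5_Zr5") would list the relaxed candidates for that composition, assuming the default output directory is in use.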
Step 04 (filter)
Writes per composition folder:
ranking_relax_filtered.csv (filtered candidate table)
selected_candidates.txt (names of kept candidates)
Step 05 (bandgap)
Writes per composition folder:
bandgap_alignn_summary.csv
Writes per candidate:
candidate_###/03_band/meta.json
Step 06 (formation)
Writes per composition folder:
formation_energies.csv
Writes per candidate:
candidate_###/04_formation/meta.json
Step 07 (collect)
Writes one flat CSV in the workflow root:
results_database.csv
This file is a compact “database view” across compositions and selected candidates, combining scan/relax/filter/bandgap/formation results where available.
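Before running the surface stage, it is worth glancing at what the collected database contains. The column names are not listed on this page, so the sketch below makes no assumption about them: it simply reports the header and the number of rows using Python's standard csv module. The helper name summarize_csv is hypothetical.

```python
import csv
from pathlib import Path


def summarize_csv(path):
    """Return (column_names, row_count) for a CSV file with a header row."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        rows = list(reader)
        columns = list(reader.fieldnames or [])
    return columns, len(rows)


database = Path("results_database.csv")
if database.exists():
    columns, n_rows = summarize_csv(database)
    print(f"{n_rows} rows, columns: {', '.join(columns)}")
```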
Step 08 (surface)
Generates slab structures from selected candidates and optionally relaxes them.
Input:
results_database.csv (from Step 07)
Writes per candidate:
<outdir>/<composition_tag>/candidate_###/hkl_h_k_l/term_###/
Files per slab:
POSCAR (generated slab)
CONTCAR (relaxed slab, if enabled)
meta.json (slab metadata)
Optional relaxation outputs:
surface_relax.log
surface_relax.traj
surface_relax.json
Global output:
<outdir>/surface_summary.csv
The surface stage uses:
pymatgen for slab generation
the same ML backend abstraction as Step 03 for relaxation
ASE optimizers for slab relaxation
Tips
Use --verbose with any command for more detailed logs:
dopingflow run-all -c input.toml --verbose
If bandgap is not configured yet (no ALIGNN_MODEL_DIR), stop before bandgap:
dopingflow run-all -c input.toml --until filter
Run surface generation only after verifying final candidates:
dopingflow collect -c input.toml
dopingflow surface -c input.toml
Start with a single composition and one candidate to validate surface settings:
composition_tag = "Sb50"
selection_mode = "id"
candidate_id = 1