dopingflow package

Submodules

dopingflow.bandgap module

class dopingflow.bandgap.BandgapConfig(outdir: 'Path', skip_if_done: 'bool', cutoff: 'float', max_neighbors: 'int', n_workers: 'int', device: 'str', gpu_id: 'int', batch_size: 'int')

Bases: object

Parameters:
  • outdir (Path)

  • skip_if_done (bool)

  • cutoff (float)

  • max_neighbors (int)

  • n_workers (int)

  • device (str)

  • gpu_id (int)

  • batch_size (int)

outdir: Path
skip_if_done: bool
cutoff: float
max_neighbors: int
n_workers: int
device: str
gpu_id: int
batch_size: int
dopingflow.bandgap.run_bandgap(raw_cfg, root, *, config_path=None)

Step 05: Predict bandgap (ALIGNN local model) for relaxed candidates.

Selection:
  1. If selected_candidates.txt exists in a composition folder -> use it

  2. Else fallback to candidate_*/02_relax/POSCAR

Outputs per composition folder:
  • bandgap_alignn_summary.csv

  • candidate_*/03_band/meta.json

Parameters:
  • raw_cfg (dict[str, Any])

  • root (Path)

  • config_path (Path | None)

Return type:

None
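
The selection rule above can be sketched as follows. This is a minimal illustration, not the package's implementation: the helper name select_relaxed_poscars is hypothetical, and it assumes selected_candidates.txt lists one candidate folder name per non-empty line.

```python
from pathlib import Path
from typing import List


def select_relaxed_poscars(comp_dir: Path) -> List[Path]:
    """Hypothetical helper: pick relaxed POSCARs for one composition folder."""
    selected = comp_dir / "selected_candidates.txt"
    if selected.exists():
        # Assumption: one candidate folder name per non-empty line.
        names = [ln.strip() for ln in selected.read_text().splitlines() if ln.strip()]
        return [comp_dir / name / "02_relax" / "POSCAR" for name in names]
    # Fallback: every candidate_* folder that contains a relaxed POSCAR.
    return sorted(comp_dir.glob("candidate_*/02_relax/POSCAR"))
```

If the selection file is absent, every relaxed candidate in the folder is returned, which mirrors the documented fallback.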

dopingflow.bandgap.run_bandgap_from_toml(config_path)
Parameters:

config_path (Path)

Return type:

None

dopingflow.cli module

dopingflow.cli.refs_build_cmd(config=<typer.models.OptionInfo object>, verbose=<typer.models.OptionInfo object>)

Step 00: Build/cache reference energies.

Parameters:
  • config (Path)

  • verbose (bool)

Return type:

None

dopingflow.cli.generate_cmd(config=<typer.models.OptionInfo object>, verbose=<typer.models.OptionInfo object>)

Step 01: Generate random doped structures.

Parameters:
  • config (Path)

  • verbose (bool)

Return type:

None

dopingflow.cli.scan_cmd(config=<typer.models.OptionInfo object>, verbose=<typer.models.OptionInfo object>)

Step 02: Symmetry-unique scan + M3GNet single-point energies (top-k).

Parameters:
  • config (Path)

  • verbose (bool)

Return type:

None

dopingflow.cli.relax_cmd(config=<typer.models.OptionInfo object>, verbose=<typer.models.OptionInfo object>)

Step 03: Relax scanned candidates with M3GNet Relaxer.

Parameters:
  • config (Path)

  • verbose (bool)

Return type:

None

dopingflow.cli.filter_cmd(config=<typer.models.OptionInfo object>, only=<typer.models.OptionInfo object>, force=<typer.models.OptionInfo object>, window_meV=<typer.models.OptionInfo object>, topn=<typer.models.OptionInfo object>, verbose=<typer.models.OptionInfo object>)

Step 04: Filter relaxed candidates (window or top-N).

Parameters:
  • config (Path)

  • only (str | None)

  • force (bool)

  • window_meV (float | None)

  • topn (int | None)

  • verbose (bool)

Return type:

None

dopingflow.cli.bandgap_cmd(config=<typer.models.OptionInfo object>, verbose=<typer.models.OptionInfo object>)

Step 05: Predict bandgap for filtered relaxed candidates (ALIGNN).

Parameters:
  • config (Path)

  • verbose (bool)

Return type:

None

dopingflow.cli.formation_cmd(config=<typer.models.OptionInfo object>, verbose=<typer.models.OptionInfo object>)

Step 06: Compute formation energies using cached references.

Parameters:
  • config (Path)

  • verbose (bool)

Return type:

None

dopingflow.cli.collect_cmd(config=<typer.models.OptionInfo object>, verbose=<typer.models.OptionInfo object>)

Step 07: Collect selected candidates into one CSV database.

Parameters:
  • config (Path)

  • verbose (bool)

Return type:

None

dopingflow.cli.run_all_cmd(config=<typer.models.OptionInfo object>, start=<typer.models.OptionInfo object>, stop=<typer.models.OptionInfo object>, only=<typer.models.OptionInfo object>, dry_run=<typer.models.OptionInfo object>, filter_only=<typer.models.OptionInfo object>, force=<typer.models.OptionInfo object>, window_meV=<typer.models.OptionInfo object>, topn=<typer.models.OptionInfo object>, verbose=<typer.models.OptionInfo object>)

Run the full pipeline in order, with optional step selection.

Step keys:

refs -> generate -> scan -> relax -> filter -> bandgap -> formation -> collect

Parameters:
  • config (Path)

  • start (str)

  • stop (str)

  • only (str | None)

  • dry_run (bool)

  • filter_only (str | None)

  • force (bool)

  • window_meV (float | None)

  • topn (int | None)

  • verbose (bool)

Return type:

None
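
As a rough illustration of how start, stop, and only could narrow the ordered step keys, consider the sketch below. The function name select_steps, the default values, the inclusive treatment of stop, and the reading of only as a single step key are all assumptions; the actual CLI may behave differently.

```python
from typing import List, Optional

# Step keys in pipeline order, as listed in the run_all_cmd docstring.
STEP_KEYS = ["refs", "generate", "scan", "relax", "filter", "bandgap", "formation", "collect"]


def select_steps(start: str = "refs", stop: str = "collect", only: Optional[str] = None) -> List[str]:
    """Hypothetical helper: return the step keys that would run, in order."""
    if only is not None:
        if only not in STEP_KEYS:
            raise ValueError(f"unknown step: {only!r}")
        return [only]
    i, j = STEP_KEYS.index(start), STEP_KEYS.index(stop)  # ValueError on unknown keys
    if i > j:
        raise ValueError(f"start {start!r} comes after stop {stop!r}")
    return STEP_KEYS[i : j + 1]  # assumption: stop is inclusive


print(select_steps("scan", "filter"))  # ['scan', 'relax', 'filter']
```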

dopingflow.cli.surface_cmd(config=<typer.models.OptionInfo object>, verbose=<typer.models.OptionInfo object>)

Step 08: Generate surfaces from selected doped bulk candidates.

Parameters:
  • config (Path)

  • verbose (bool)

Return type:

None

dopingflow.collect module

class dopingflow.collect.DBConfig(outdir: 'Path', skip_if_done: 'bool')

Bases: object

Parameters:
  • outdir (Path)

  • skip_if_done (bool)

outdir: Path
skip_if_done: bool
dopingflow.collect.read_json(path)
Parameters:

path (Path)

Return type:

dict | None

dopingflow.collect.safe_get(d, *keys, default=None)
Parameters:

d (dict | None)
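
The signature suggests a nested-dictionary lookup that tolerates missing keys and a missing dictionary. A minimal sketch of that pattern, under the assumption that each key descends one level and that default is returned as soon as a level is missing (safe_get_sketch is a hypothetical name):

```python
from typing import Any, Optional


def safe_get_sketch(d: Optional[dict], *keys: str, default: Any = None) -> Any:
    """Hypothetical nested lookup: d[k1][k2]... or default if any level is missing."""
    cur: Any = d
    for key in keys:
        if not isinstance(cur, dict) or key not in cur:
            return default
        cur = cur[key]
    return cur


meta = {"bandgap": {"value_eV": 1.2}}
print(safe_get_sketch(meta, "bandgap", "value_eV"))    # 1.2
print(safe_get_sketch(None, "bandgap", default=0.0))   # 0.0
```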

dopingflow.collect.read_selected_txt(path)
Parameters:

path (Path)

Return type:

List[str]

dopingflow.collect.read_filtered_table(path)
Parse ranking_relax_filtered.csv into:

candidate -> {"rank_relax_filtered": int, "E_relaxed_eV_filtered": float, "delta_e_eV": float, "filter_mode": str}

Expected columns written by filtering.py:

rank_filtered, candidate, energy_relaxed_eV, delta_e_eV, …, filter_mode

Parameters:

path (Path)

Return type:

Dict[str, Dict[str, Any]]
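
The column-to-key mapping described above could be implemented along these lines. This is a sketch only: the function name is hypothetical, columns elided by "…" in the docstring are ignored, and returning an empty dict for a missing file is an assumption.

```python
import csv
from pathlib import Path
from typing import Any, Dict


def read_filtered_table_sketch(path: Path) -> Dict[str, Dict[str, Any]]:
    """Hypothetical parser for ranking_relax_filtered.csv, keyed by candidate name."""
    table: Dict[str, Dict[str, Any]] = {}
    if not path.exists():
        return table  # assumption: a missing file yields an empty mapping
    with path.open(newline="") as fh:
        for row in csv.DictReader(fh):
            table[row["candidate"]] = {
                "rank_relax_filtered": int(row["rank_filtered"]),
                "E_relaxed_eV_filtered": float(row["energy_relaxed_eV"]),
                "delta_e_eV": float(row["delta_e_eV"]),
                "filter_mode": row["filter_mode"],
            }
    return table
```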

dopingflow.collect.read_scan_ranking(path)
Parameters:

path (Path)

Return type:

Dict[str, Dict[str, Any]]

dopingflow.collect.read_bandgap_summary(path)
Parameters:

path (Path)

Return type:

Dict[str, Dict[str, Any]]

dopingflow.collect.read_formation_csv(path)

Read formation_energies.csv, as written by formation.py.

This file is kept as a fallback; candidate_*/04_formation/meta.json is preferred because it carries richer information.

Parameters:

path (Path)

Return type:

Dict[str, Dict[str, Any]]

dopingflow.collect.read_formation_meta(path)

Read candidate_*/04_formation/meta.json, as written by formation.py.

Parameters:

path (Path)

Return type:

Dict[str, Any]

dopingflow.collect.run_collect(raw_cfg, root, *, config_path=None)

Step 07: Collect results into ONE flat CSV database (results_database.csv), ONLY for the filtered/selected candidates (Step 04 output).

Selection priority:
  1. selected_candidates.txt

  2. ranking_relax_filtered.csv

Parameters:
  • raw_cfg (dict[str, Any])

  • root (Path)

  • config_path (Path | None)

Return type:

Path

dopingflow.collect.run_collect_from_toml(config_path)
Parameters:

config_path (Path)

Return type:

Path

dopingflow.filtering module

class dopingflow.filtering.FilterConfig(outdir: 'Path', mode: 'str', window_meV: 'float', max_candidates: 'int', skip_if_done: 'bool')

Bases: object

Parameters:
  • outdir (Path)

  • mode (str)

  • window_meV (float)

  • max_candidates (int)

  • skip_if_done (bool)

outdir: Path
mode: str
window_meV: float
max_candidates: int
skip_if_done: bool
dopingflow.filtering.run_filtering(raw_cfg, root, *, only=None, force=False, window_meV=None, topn=None)

Step 04: filter relaxed candidates after Step 03.

For each composition folder in [structure].outdir:

  • reads: ranking_relax.csv

  • writes: ranking_relax_filtered.csv, selected_candidates.txt

Filtering:
  • [filter].mode="window": keep candidates within window_meV above Emin

  • [filter].mode="topn": keep lowest-energy max_candidates

Parameters:
  • raw_cfg (dict[str, Any])

  • root (Path)

  • only (str | None)

  • force (bool)

  • window_meV (float | None)

  • topn (int | None)

Return type:

None
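
The two filter modes can be illustrated with a short sketch. The function name and default values are hypothetical, and whether the package compares total or per-atom energies is not stated here; the sketch simply compares whatever energies (in eV) it is given and converts the window from meV.

```python
from typing import Dict, List, Tuple


def filter_candidates(
    energies_eV: Dict[str, float],
    mode: str,
    window_meV: float = 50.0,   # hypothetical default
    max_candidates: int = 5,    # hypothetical default
) -> List[Tuple[str, float]]:
    """Hypothetical filter: return (candidate, energy) pairs sorted by energy."""
    ranked = sorted(energies_eV.items(), key=lambda kv: kv[1])
    if not ranked:
        return []
    if mode == "topn":
        return ranked[:max_candidates]
    if mode == "window":
        e_min = ranked[0][1]
        window_eV = window_meV / 1000.0  # meV -> eV
        return [(name, e) for name, e in ranked if e - e_min <= window_eV]
    raise ValueError(f"unknown filter mode: {mode!r}")


energies = {"candidate_001": -10.0, "candidate_002": -9.98, "candidate_003": -9.9}
print([name for name, _ in filter_candidates(energies, "window", window_meV=50.0)])
# ['candidate_001', 'candidate_002']
```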

dopingflow.filtering.run_filtering_from_toml(config_path, *, only=None, force=False, window_meV=None, topn=None)
Parameters:
  • config_path (Path)

  • only (str | None)

  • force (bool)

  • window_meV (float | None)

  • topn (int | None)

Return type:

None

dopingflow.formation module

class dopingflow.formation.FormationConfig(outdir: 'Path', host_species: 'str', anion_species: 'List[str]', skip_if_done: 'bool', normalize: 'str')

Bases: object

Parameters:
  • outdir (Path)

  • host_species (str)

  • anion_species (List[str])

  • skip_if_done (bool)

  • normalize (str)

outdir: Path
host_species: str
anion_species: List[str]
skip_if_done: bool
normalize: str
dopingflow.formation.run_formation(raw_cfg, root, *, config_path=None)

Step 06: Compute formation energies for relaxed (and optionally filtered) candidates.

Reads:
  • E_doped from candidate_*/02_relax/meta.json (energy_relaxed_eV)

  • references from reference_structures/reference_energies.json

Writes per composition folder:
  • formation_energies.csv

  • candidate_*/04_formation/meta.json

Parameters:
  • raw_cfg (dict[str, Any])

  • root (Path)

  • config_path (Path | None)

Return type:

None
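
For orientation, the textbook expression for the formation energy of substitutional dopants is E_f = E_doped - E_host - sum_d(n_d * mu_d) + n_sub * mu_host, where n_sub is the total number of replaced host atoms. The sketch below implements only that generic expression. It is an assumption that dopingflow uses this form; the handling of the normalize option, reference modes, and any oxygen chemical-potential shift is not modeled, and the element symbols in the example are purely illustrative.

```python
from typing import Dict


def substitutional_formation_energy(
    e_doped_eV: float,
    e_host_eV: float,
    dopant_counts: Dict[str, int],
    mu_eV: Dict[str, float],
    host_species: str,
) -> float:
    """Generic textbook formula (assumed, not taken from the package source)."""
    n_sub = sum(dopant_counts.values())                   # host atoms replaced
    e_added = sum(n * mu_eV[el] for el, n in dopant_counts.items())
    e_removed = n_sub * mu_eV[host_species]
    return e_doped_eV - e_host_eV - e_added + e_removed


# Illustrative numbers only: two hypothetical Sb atoms replacing Sn.
e_f = substitutional_formation_energy(
    -100.0, -102.0, {"Sb": 2}, {"Sb": -4.0, "Sn": -3.5}, host_species="Sn"
)
print(e_f)  # 3.0
```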

dopingflow.formation.run_formation_from_toml(config_path)
Parameters:

config_path (Path)

Return type:

None

dopingflow.generate module

class dopingflow.generate.GenerateConfig(outdir: 'str', poscar_order: 'List[str]', seed_base: 'int', clean_outdir: 'bool', mode: 'str', host_species: 'str', compositions: 'List[Dict[str, float]]', dopants: 'List[str]', must_include: 'List[str]', max_dopants_total: 'int', allowed_totals: 'List[float]', levels: 'List[float]')

Bases: object

Parameters:
  • outdir (str)

  • poscar_order (List[str])

  • seed_base (int)

  • clean_outdir (bool)

  • mode (str)

  • host_species (str)

  • compositions (List[Dict[str, float]])

  • dopants (List[str])

  • must_include (List[str])

  • max_dopants_total (int)

  • allowed_totals (List[float])

  • levels (List[float])

outdir: str
poscar_order: List[str]
seed_base: int
clean_outdir: bool
mode: str
host_species: str
compositions: List[Dict[str, float]]
dopants: List[str]
must_include: List[str]
max_dopants_total: int
allowed_totals: List[float]
levels: List[float]
dopingflow.generate.validate_composition_minimal(requested_pct)
Parameters:

requested_pct (Dict[str, float])

Return type:

None

dopingflow.generate.normalize_to_counts_and_effective(n_host, requested_pct)

Convert requested dopant percentages (relative to host sites) into integer dopant counts by rounding to the nearest integer number of substitutions.

Parameters:
  • n_host (int)

  • requested_pct (Dict[str, float])

Return type:

tuple[Dict[str, int], Dict[str, float], List[str], float, float]
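
The rounding step can be sketched as below. This simplified version returns only the integer counts and the resulting effective percentages, not the full five-element tuple documented above, and it assumes that exact halves round up (Python's built-in round() would round halves to the nearest even integer instead). The name pct_to_counts is hypothetical.

```python
import math
from typing import Dict, Tuple


def pct_to_counts(n_host: int, requested_pct: Dict[str, float]) -> Tuple[Dict[str, int], Dict[str, float]]:
    """Hypothetical helper: requested % of host sites -> integer counts and effective %."""
    counts: Dict[str, int] = {}
    for el, pct in requested_pct.items():
        # floor(x + 0.5) rounds to the nearest integer, with halves rounded up.
        counts[el] = math.floor(n_host * pct / 100.0 + 0.5)
    effective = {el: 100.0 * n / n_host for el, n in counts.items()}
    return counts, effective


counts, effective = pct_to_counts(32, {"Sb": 5.0, "Zr": 10.0})
print(counts)     # {'Sb': 2, 'Zr': 3}
print(effective)  # {'Sb': 6.25, 'Zr': 9.375}
```

With 32 host sites, a requested 5% becomes 2 substitutions (6.25% effective), which shows why the effective composition can differ from the requested one.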

dopingflow.generate.composition_tag(effective_pct, must_first=None)
Parameters:
  • effective_pct (Dict[str, float])

  • must_first (List[str] | None)

Return type:

str

dopingflow.generate.stable_seed_from_tag(tag, seed_base)
Parameters:
  • tag (str)

  • seed_base (int)

Return type:

int
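
One common way to derive a reproducible seed from a string tag is to hash it with a fixed algorithm and combine the result with a base seed. The sketch below shows that pattern; the choice of SHA-256, the 32-bit range, and the function name are assumptions, not the package's documented behavior. Python's built-in hash() is unsuitable here because string hashes are salted per process by default.

```python
import hashlib


def stable_seed_sketch(tag: str, seed_base: int) -> int:
    """Hypothetical seed derivation: same (tag, seed_base) always gives the same seed."""
    digest = hashlib.sha256(tag.encode("utf-8")).digest()
    offset = int.from_bytes(digest[:4], "big")  # first 32 bits of the digest
    return (seed_base + offset) % (2**32)       # keep within an unsigned 32-bit range
```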

dopingflow.generate.build_structure_from_counts(pristine, host_species, dopant_counts, seed)
Parameters:
  • pristine (Structure)

  • host_species (str)

  • dopant_counts (Dict[str, int])

  • seed (int)

Return type:

Structure

dopingflow.generate.reorder_structure_by_species(s, order)
Parameters:
  • s (Structure)

  • order (List[str])

Return type:

Structure

dopingflow.generate.enumerate_compositions(dopants, must_include, max_dopants_total, allowed_totals, levels)
Parameters:
  • dopants (List[str])

  • must_include (List[str])

  • max_dopants_total (int)

  • allowed_totals (List[float])

  • levels (List[float])

Return type:

List[Dict[str, float]]

dopingflow.generate.run_generate(raw_cfg, root, *, config_path=None)

Generate one random doped POSCAR per composition.

Requires:
  • refs-build completed

  • reference_structures/reference_energies.json exists

  • relaxed host supercell POSCAR exists

Output:

  • <outdir>/<tag>/POSCAR

  • <outdir>/<tag>/metadata.json

Parameters:
  • raw_cfg (dict[str, Any])

  • root (Path)

  • config_path (Path | None)

Return type:

Path

dopingflow.generate.run_generate_from_toml(config_path)
Parameters:

config_path (Path)

Return type:

Path

dopingflow.hardware module

class dopingflow.hardware.HardwareConfig(device: 'DeviceMode' = 'auto', gpu_id: 'int' = 0, allow_gpu_batching: 'bool' = True, relax_batch_size: 'int' = 8, bandgap_batch_size: 'int' = 32)

Bases: object

Parameters:
  • device (Literal['auto', 'cpu', 'cuda'])

  • gpu_id (int)

  • allow_gpu_batching (bool)

  • relax_batch_size (int)

  • bandgap_batch_size (int)

device: Literal['auto', 'cpu', 'cuda'] = 'auto'
gpu_id: int = 0
allow_gpu_batching: bool = True
relax_batch_size: int = 8
bandgap_batch_size: int = 32
dopingflow.hardware.resolve_torch_device(mode='auto', gpu_id=0)
Parameters:
  • mode (str)

  • gpu_id (int)

Return type:

str
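
A plausible resolution rule for the three device modes is sketched below. It is an assumption that "auto" falls back to the CPU when no GPU is visible and that the returned string uses the "cuda:<gpu_id>" form. To keep the example free of a PyTorch dependency, GPU availability is passed in as a hypothetical cuda_available flag; in practice it would typically come from torch.cuda.is_available().

```python
def resolve_device_sketch(mode: str = "auto", gpu_id: int = 0, *, cuda_available: bool = False) -> str:
    """Hypothetical resolver: map 'auto' | 'cpu' | 'cuda' to a torch-style device string."""
    mode = mode.lower()
    if mode == "cpu":
        return "cpu"
    if mode == "cuda":
        if not cuda_available:
            raise RuntimeError("device='cuda' requested but no CUDA device is available")
        return f"cuda:{gpu_id}"
    if mode == "auto":
        return f"cuda:{gpu_id}" if cuda_available else "cpu"
    raise ValueError(f"unknown device mode: {mode!r}")


print(resolve_device_sketch("auto", 1, cuda_available=True))   # cuda:1
print(resolve_device_sketch("auto", 1, cuda_available=False))  # cpu
```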

dopingflow.hardware.configure_tensorflow(mode='auto', gpu_id=0)
Parameters:
  • mode (str)

  • gpu_id (int)

Return type:

str

dopingflow.hardware.parse_hardware_config(raw_cfg)
Parameters:

raw_cfg (dict)

Return type:

HardwareConfig

dopingflow.logging module

dopingflow.logging.setup_logging(root, *, verbose=False)

Configure logging for dopingflow.

  • Console output

  • Per-run log file

  • Noise suppression

Parameters:
  • root (Path)

  • verbose (bool)

Return type:

Path

dopingflow.ml_backends module

dopingflow.ml_backends.set_default_runtime_env(*, tf_threads=1, omp_threads=1)

Conservative defaults to keep CPU/TensorFlow noise low. Safe to call repeatedly.

Parameters:
  • tf_threads (int)

  • omp_threads (int)

Return type:

None
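
Thread limits of this kind are usually applied through environment variables set before the numerical libraries are imported. The sketch below shows the general pattern; which variables dopingflow actually sets is an assumption, and the function name is hypothetical. Because it only assigns fixed values, calling it repeatedly has no further effect.

```python
import os


def set_runtime_env_sketch(*, tf_threads: int = 1, omp_threads: int = 1) -> None:
    """Hypothetical helper: conservative thread and logging settings via env vars."""
    os.environ["OMP_NUM_THREADS"] = str(omp_threads)         # OpenMP thread pool size
    os.environ["TF_NUM_INTRAOP_THREADS"] = str(tf_threads)   # TensorFlow intra-op threads
    os.environ["TF_NUM_INTEROP_THREADS"] = str(tf_threads)   # TensorFlow inter-op threads
    # Quieten TensorFlow's C++ logging unless the user has already chosen a level.
    os.environ.setdefault("TF_CPP_MIN_LOG_LEVEL", "2")
```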

dopingflow.ml_backends.normalize_backend_config(*, backend, model, task, section_name)

Normalize and validate the backend/model/task choices for any stage. Returns the tuple (backend, model, task).

Parameters:
  • backend (str)

  • model (str)

  • task (str)

  • section_name (str)

Return type:

Tuple[str, str, str]

dopingflow.ml_backends.check_backend_dependency(backend, *, stage_name)

Fail early with clear stage-specific import errors.

Parameters:
  • backend (str)

  • stage_name (str)

Return type:

None

dopingflow.ml_backends.prepare_backend_runtime(*, backend, device, gpu_id, tf_threads=1, omp_threads=1)

Configure backend runtime environment in the current process. Call this before constructing the calculator.

Parameters:
  • backend (str)

  • device (str)

  • gpu_id (int)

  • tf_threads (int)

  • omp_threads (int)

Return type:

None

dopingflow.ml_backends.build_ase_calculator(*, backend, model, task, device)

Return an ASE-compatible calculator for the selected backend.

Parameters:
  • backend (str)

  • model (str)

  • task (str)

  • device (str)

dopingflow.ml_relaxation module

dopingflow.ml_relaxation.get_optimizer_class(name)
Parameters:

name (str)

dopingflow.ml_relaxation.final_fmax(forces)
Parameters:

forces (ndarray)

Return type:

float
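
In ASE-style relaxations, fmax conventionally refers to the largest per-atom force norm. Assuming final_fmax follows that convention (the entry above does not say), the quantity can be computed as in this dependency-free sketch; iterating over an (N, 3) NumPy array row by row would work the same way. The name max_force_norm is hypothetical.

```python
import math
from typing import Sequence


def max_force_norm(forces: Sequence[Sequence[float]]) -> float:
    """Largest Euclidean norm among per-atom force vectors (0.0 for no atoms)."""
    return max((math.sqrt(sum(c * c for c in f)) for f in forces), default=0.0)


print(max_force_norm([[3.0, 4.0, 0.0], [0.0, 0.0, 1.0]]))  # 5.0
```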

dopingflow.ml_relaxation.structure_energy_with_calculator(struct, calculator)
Parameters:

struct (Structure)

Return type:

float

dopingflow.ml_relaxation.relax_structure_with_calculator(struct, *, calculator, optimizer_name, fmax, max_steps)
Parameters:
  • struct (Structure)

  • optimizer_name (str)

  • fmax (float)

  • max_steps (int)

Return type:

Tuple[Structure, float, int, float, bool]

dopingflow.refs module

class dopingflow.refs.RefConfig(reference_mode: 'str', skip_if_done: 'bool', fmax: 'float', max_steps: 'int', tf_threads: 'int', omp_threads: 'int', device: 'str', gpu_id: 'int', backend: 'str', model: 'str', task: 'str', optimizer: 'str', host: 'str', host_dir: 'Path', supercell: 'tuple[int, int, int]', metal_ref: 'List[str]', metals_dir: 'Path', oxides_ref: 'List[str]', oxides_dir: 'Path', gas_ref: 'str', gas_dir: 'Path', oxygen_mode: 'str', muO_shift_ev: 'float')

Bases: object

Parameters:
  • reference_mode (str)

  • skip_if_done (bool)

  • fmax (float)

  • max_steps (int)

  • tf_threads (int)

  • omp_threads (int)

  • device (str)

  • gpu_id (int)

  • backend (str)

  • model (str)

  • task (str)

  • optimizer (str)

  • host (str)

  • host_dir (Path)

  • supercell (tuple[int, int, int])

  • metal_ref (List[str])

  • metals_dir (Path)

  • oxides_ref (List[str])

  • oxides_dir (Path)

  • gas_ref (str)

  • gas_dir (Path)

  • oxygen_mode (str)

  • muO_shift_ev (float)

reference_mode: str
skip_if_done: bool
fmax: float
max_steps: int
tf_threads: int
omp_threads: int
device: str
gpu_id: int
backend: str
model: str
task: str
optimizer: str
host: str
host_dir: Path
supercell: tuple[int, int, int]
metal_ref: List[str]
metals_dir: Path
oxides_ref: List[str]
oxides_dir: Path
gas_ref: str
gas_dir: Path
oxygen_mode: str
muO_shift_ev: float
dopingflow.refs.run_refs_build(raw_cfg, root, *, config_path=None)

Build/cache relaxed reference energies needed for formation energy calculations.

Outputs:
  • reference_structures/reference_energies.json

  • reference_structures/relaxed/host_unit_relaxed.POSCAR

  • reference_structures/relaxed/host_supercell_<a>x<b>x<c>_relaxed.POSCAR

  • reference_structures/relaxed/refs/<name>_relaxed.POSCAR

Parameters:
  • raw_cfg (dict[str, Any])

  • root (Path)

  • config_path (Path | None)

Return type:

Path

dopingflow.refs.run_refs_build_from_toml(config_path)
Parameters:

config_path (Path)

Return type:

Path

dopingflow.relax module

class dopingflow.relax.RelaxConfig(fmax: 'float', max_steps: 'int', n_workers: 'int', tf_threads: 'int', omp_threads: 'int', order: 'List[str]', outdir: 'Path', skip_if_done: 'bool', skip_candidate_if_done: 'bool', device: 'str', gpu_id: 'int', backend: 'str', model: 'str', task: 'str', optimizer: 'str')

Bases: object

Parameters:
  • fmax (float)

  • max_steps (int)

  • n_workers (int)

  • tf_threads (int)

  • omp_threads (int)

  • order (List[str])

  • outdir (Path)

  • skip_if_done (bool)

  • skip_candidate_if_done (bool)

  • device (str)

  • gpu_id (int)

  • backend (str)

  • model (str)

  • task (str)

  • optimizer (str)

fmax: float
max_steps: int
n_workers: int
tf_threads: int
omp_threads: int
order: List[str]
outdir: Path
skip_if_done: bool
skip_candidate_if_done: bool
device: str
gpu_id: int
backend: str
model: str
task: str
optimizer: str
dopingflow.relax.run_relax(raw_cfg, root, *, config_path=None)

Step 03: Relax candidates produced by Step 02 using a unified ASE backend layer.

Parameters:
  • raw_cfg (dict[str, Any])

  • root (Path)

  • config_path (Path | None)

Return type:

None

dopingflow.relax.run_relax_from_toml(config_path)
Parameters:

config_path (Path)

Return type:

None

dopingflow.scan module

class dopingflow.scan.ScanConfig(poscar_in: 'str', topk: 'int', symprec: 'float', max_enum: 'int', n_workers: 'int', chunksize: 'int', order: 'List[str]', anion_species: 'List[str]', host_species: 'str', max_unique: 'int', skip_if_done: 'bool', device: 'str', gpu_id: 'int', backend: 'str', model: 'str', task: 'str', mode: 'str', sample_budget: 'int', sample_batch_size: 'int', sample_patience: 'int', sample_seed: 'int', sample_max_saved: 'int')

Bases: object

Parameters:
  • poscar_in (str)

  • topk (int)

  • symprec (float)

  • max_enum (int)

  • n_workers (int)

  • chunksize (int)

  • order (List[str])

  • anion_species (List[str])

  • host_species (str)

  • max_unique (int)

  • skip_if_done (bool)

  • device (str)

  • gpu_id (int)

  • backend (str)

  • model (str)

  • task (str)

  • mode (str)

  • sample_budget (int)

  • sample_batch_size (int)

  • sample_patience (int)

  • sample_seed (int)

  • sample_max_saved (int)

poscar_in: str
topk: int
symprec: float
max_enum: int
n_workers: int
chunksize: int
order: List[str]
anion_species: List[str]
host_species: str
max_unique: int
skip_if_done: bool
device: str
gpu_id: int
backend: str
model: str
task: str
mode: str
sample_budget: int
sample_batch_size: int
sample_patience: int
sample_seed: int
sample_max_saved: int
dopingflow.scan.run_scan(raw_cfg, root, *, config_path=None)
Step 02:
  • enumerate / sample symmetry-unique dopant arrangements

  • evaluate single-point energies using selected ML backend via ASE calculator

  • keep top-k lowest energies

Parameters:
  • raw_cfg (dict[str, Any])

  • root (Path)

  • config_path (Path | None)

Return type:

None
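
The final "keep top-k lowest energies" step can be sketched with the standard library. The function name and the (label, energy) pair layout are assumptions for illustration; enumeration and the single-point energy evaluation are outside the scope of this example.

```python
import heapq
from typing import Iterable, List, Tuple


def keep_topk_lowest(scored: Iterable[Tuple[str, float]], topk: int) -> List[Tuple[str, float]]:
    """Return the topk (label, energy) pairs with the lowest energy, lowest first."""
    return heapq.nsmallest(topk, scored, key=lambda item: item[1])


scored = [("arr_a", -1.0), ("arr_b", -3.0), ("arr_c", -2.0)]
print(keep_topk_lowest(scored, 2))  # [('arr_b', -3.0), ('arr_c', -2.0)]
```

heapq.nsmallest accepts any iterable, so energies can be streamed from a generator without holding every evaluated arrangement in memory.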

dopingflow.scan.run_scan_from_toml(config_path)
Parameters:

config_path (Path)

Return type:

None

dopingflow.surface module

dopingflow.utils.io module

dopingflow.utils.parallel module

dopingflow.utils.pymatgen_helpers module

Module contents