jqmc_workflow package

Subpackages#

Submodules#

jqmc_workflow.launcher module#

Launcher: DAG-based parallel workflow executor for jqmc-workflow.

True DAG execution: as soon as ALL predecessors of a node complete, that node starts immediately — no waiting for the entire “layer”. Supports FileFrom / ValueFrom dependencies.

class jqmc_workflow.launcher.Launcher(workflows=None, log_level='INFO', log_name='jqmc_workflow.log', draw_graph=False)#

Bases: object

DAG-based parallel workflow executor.

Accepts a list of Container objects, automatically infers the dependency graph from FileFrom / ValueFrom references, and executes workflows with true DAG parallelism: as soon as all predecessors of a node complete, that node starts immediately — there is no layer-based grouping.

Parameters:
  • workflows (list[Container]) – Workflows to execute. Labels must be unique.

  • log_level (str) – Logging level ("DEBUG" or "INFO").

  • log_name (str) – Log file name (appended, not overwritten).

  • draw_graph (bool) – If True, render the dependency graph to dependency_graph.png (requires the graphviz Python package).

Raises:

ValueError – If workflow labels are duplicated or a dependency references an undefined workflow label.
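The graph inference and the two ValueError conditions above can be sketched as follows. This is a minimal illustration, not the package's implementation: `FileRef`, `Node`, and `infer_dependencies` are hypothetical stand-ins for FileFrom, Container, and the Launcher's internal logic.

```python
from dataclasses import dataclass, field


@dataclass
class FileRef:
    """Hypothetical stand-in for FileFrom: a file produced by workflow `label`."""
    label: str
    filename: str


@dataclass
class Node:
    """Hypothetical stand-in for Container: only the fields the graph needs."""
    label: str
    input_files: list = field(default_factory=list)


def infer_dependencies(nodes):
    """Return {label: set of upstream labels}, validating labels on the way."""
    labels = [n.label for n in nodes]
    duplicates = {name for name in labels if labels.count(name) > 1}
    if duplicates:
        raise ValueError(f"duplicate workflow labels: {sorted(duplicates)}")
    known = set(labels)
    graph = {}
    for node in nodes:
        # Plain string paths carry no dependency; only references do.
        deps = {f.label for f in node.input_files if isinstance(f, FileRef)}
        missing = deps - known
        if missing:
            raise ValueError(
                f"{node.label!r} depends on undefined labels: {sorted(missing)}"
            )
        graph[node.label] = deps
    return graph


pipeline = [
    Node("wf", ["trexio.h5"]),
    Node("vmc-opt", [FileRef("wf", "hamiltonian_data.h5")]),
    Node("mcmc-run", [FileRef("vmc-opt", "hamiltonian_data_opt_step_9.h5")]),
]
graph = infer_dependencies(pipeline)
```

Here `graph` maps each label to the labels it waits for, which is all a scheduler needs to decide when a node may start.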

Examples

Typical three-stage QMC pipeline:

from jqmc_workflow import (
    Launcher, Container, FileFrom,
    WF_Workflow, VMC_Workflow, MCMC_Workflow,
)

wf = Container(
    label="wf",
    dirname="00_wf",
    input_files=["trexio.h5"],
    workflow=WF_Workflow(trexio_file="trexio.h5"),
)

vmc = Container(
    label="vmc-opt",
    dirname="01_vmc",
    input_files=[FileFrom("wf", "hamiltonian_data.h5")],
    workflow=VMC_Workflow(
        server_machine_name="cluster",
        num_opt_steps=10,
        target_error=0.001,
    ),
)

mcmc = Container(
    label="mcmc-run",
    dirname="02_mcmc",
    input_files=[
        FileFrom("vmc-opt", "hamiltonian_data_opt_step_9.h5")
    ],
    rename_input_files=["hamiltonian_data.h5"],
    workflow=MCMC_Workflow(
        server_machine_name="cluster",
        target_error=0.001,
    ),
)

launcher = Launcher(
    workflows=[wf, vmc, mcmc],
    draw_graph=True,
)
launcher.launch()

Notes

  • The launcher changes the working directory during execution and restores it afterwards.

  • If a workflow fails, all downstream dependents are automatically skipped.

See also

Container

Wraps a workflow in a project directory.

FileFrom

File dependency placeholder.

ValueFrom

Value dependency placeholder.

async async_launch()#

Execute all workflows respecting DAG dependencies.

As soon as ALL predecessors of a node complete, that node starts immediately — no layer-based grouping.
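The difference from layer-based scheduling can be illustrated with a small asyncio sketch. It assumes nothing about the Launcher's internals: `run_dag`, the toy graph, and the sleep durations are all hypothetical, and `asyncio.sleep` stands in for real work.

```python
import asyncio


async def run_dag(graph, durations):
    """Run each node as soon as all of its own predecessors have finished."""
    finished = {label: asyncio.Event() for label in graph}
    events = []  # chronological log of ("start" | "end", label)

    async def run_node(label):
        # Wait only for this node's predecessors, never for a whole layer.
        for dep in graph[label]:
            await finished[dep].wait()
        events.append(("start", label))
        await asyncio.sleep(durations[label])  # placeholder for the real work
        events.append(("end", label))
        finished[label].set()

    await asyncio.gather(*(run_node(label) for label in graph))
    return events


# "c" depends only on the fast node "a"; "d" depends on the slow node "b".
graph = {"a": set(), "b": set(), "c": {"a"}, "d": {"b"}}
durations = {"a": 0.01, "b": 0.2, "c": 0.01, "d": 0.01}
events = asyncio.run(run_dag(graph, durations))
```

With layer-based grouping, "c" would wait for both "a" and "b". In this sketch "c" starts while "b" is still running, because its only predecessor has already finished.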

launch()#

jqmc_workflow.lrdmc_ext_workflow module#

LRDMC_Ext_Workflow — LRDMC extrapolation to the a²→0 limit.

Orchestrates multiple LRDMC_Workflow runs at different lattice spacings (alat values), then post-processes with jqmc-tool lrdmc extrapolate-energy to obtain the continuum-limit energy.

This is a composite workflow: it spawns one Container per alat value, runs them (potentially in parallel via the Launcher), and finally performs the extrapolation.

class jqmc_workflow.lrdmc_ext_workflow.LRDMC_Ext_Workflow(server_machine_name='localhost', alat_list=None, hamiltonian_file='hamiltonian_data.h5', queue_label='default', pilot_queue_label=None, jobname_prefix='jqmc-lrdmc', number_of_walkers=4, max_time=86400, polynomial_order=2, num_gfmc_bin_blocks=5, num_gfmc_warmup_steps=0, num_gfmc_collect_steps=5, time_projection_tau=0.1, target_survived_walkers_ratio=None, num_mcmc_per_measurement=None, non_local_move=None, E_scf=None, atomic_force=None, epsilon_PW=None, mcmc_seed=None, verbosity=None, poll_interval=60, target_error=0.001, pilot_steps=100, num_gfmc_projections=None, max_continuation=5)#

Bases: Workflow

LRDMC a²→0 continuum-limit extrapolation workflow.

Orchestrates multiple LRDMC_Workflow runs at different lattice spacings (alat values), then post-processes with jqmc-tool lrdmc extrapolate-energy to obtain the continuum-limit energy.

Each alat run is wrapped in its own Container and all alat values are executed in parallel. Every alat independently calibrates its own num_mcmc_per_measurement (when target_survived_walkers_ratio is set in GFMC_n mode), runs an error-bar pilot, and then runs production.

Mode selection follows the same rules as LRDMC_Workflow:

  • GFMC_t (default) — set time_projection_tau (default 0.10).

  • GFMC_n — set target_survived_walkers_ratio or num_mcmc_per_measurement.

Parameters:
  • server_machine_name (str) – Target machine name (shared by all sub-runs).

  • alat_list (list[float]) – List of lattice discretization values, e.g. [0.5, 0.4, 0.3].

  • hamiltonian_file (str) – Input hamiltonian_data.h5 (must exist in the parent directory or be resolved by FileFrom).

  • queue_label (str) – Queue/partition label for production runs.

  • pilot_queue_label (str, optional) – Queue/partition label for pilot runs. Defaults to queue_label when None. A shorter queue is often sufficient for the pilot.

  • jobname_prefix (str) – Prefix for each sub-run job name.

  • number_of_walkers (int) – Walkers per MPI process.

  • max_time (int) – Wall-time limit per sub-run (seconds).

  • polynomial_order (int) – Polynomial order for the a²→0 extrapolation (default: 2).

  • num_gfmc_bin_blocks (int) – Binning blocks for post-processing.

  • num_gfmc_warmup_steps (int) – Warmup steps to discard.

  • num_gfmc_collect_steps (int) – Weight-collection steps.

  • time_projection_tau (float, optional) – Imaginary time step for GFMC_t mode (default 0.10). Ignored when target_survived_walkers_ratio or num_mcmc_per_measurement is set.

  • target_survived_walkers_ratio (float, optional) – Target survived-walkers ratio (default None). Setting it activates GFMC_n mode: each alat independently runs a calibration pilot (_pilot_a) to find its own optimal num_mcmc_per_measurement. Leave it as None to disable auto-calibration; GFMC_n mode then requires an explicit num_mcmc_per_measurement.

  • num_mcmc_per_measurement (int, optional) – GFMC projections per measurement. When given explicitly, automatic calibration is disabled and this value is used for every alat. Activates GFMC_n mode.

  • non_local_move (str, optional) – Non-local move treatment. Default from jqmc_miscs.

  • E_scf (float, optional) – Initial energy guess for the GFMC shift (GFMC_n only). Default from jqmc_miscs.

  • atomic_force (bool, optional) – Compute atomic forces. Default from jqmc_miscs.

  • epsilon_PW (float, optional) – Pathak–Wagner regularization parameter (Bohr). When > 0, the force estimator is regularized near the nodal surface. Default from jqmc_miscs.

  • mcmc_seed (int, optional) – Random seed for MCMC. Default from jqmc_miscs.

  • verbosity (str, optional) – Verbosity level. Default from jqmc_miscs.

  • poll_interval (int) – Seconds between job-status polls.

  • target_error (float) – Target statistical error (Ha) for each sub-LRDMC run. Passed through to each LRDMC_Workflow.

  • pilot_steps (int) – Pilot measurement steps for target-error estimation.

  • num_gfmc_projections (int, optional) – Fixed number of measurement steps per production run. When set, the error-bar pilot is skipped for each sub-LRDMC and all max_continuation runs are executed unconditionally. Passed through to each LRDMC_Workflow. Default None (automatic mode).

  • max_continuation (int) – Maximum number of production runs per sub-LRDMC.

Examples

GFMC_t mode (default):

wf = LRDMC_Ext_Workflow(
    server_machine_name="cluster",
    alat_list=[0.5, 0.4, 0.3],
    target_error=0.001,
    number_of_walkers=8,
)
status, files, values = wf.launch()
print(values["extrapolated_energy"],
      values["extrapolated_energy_error"])

GFMC_n mode with calibration:

wf = LRDMC_Ext_Workflow(
    server_machine_name="cluster",
    alat_list=[0.5, 0.4, 0.3],
    target_survived_walkers_ratio=0.97,
    target_error=0.001,
    number_of_walkers=8,
)

As part of a Launcher pipeline:

enc = Container(
    label="lrdmc-ext",
    dirname="03_lrdmc",
    input_files=[FileFrom("mcmc-run", "hamiltonian_data.h5")],
    workflow=LRDMC_Ext_Workflow(
        server_machine_name="cluster",
        alat_list=[0.5, 0.4, 0.3],
        target_error=0.001,
    ),
)

Notes

  • At least two alat values are required for extrapolation. With a single value, per-alat results are returned but no extrapolation is performed.

  • Each sub-run directory is named lrdmc_alat_<value>/.
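To make the idea of the continuum-limit extrapolation concrete, here is a minimal sketch that fits a straight line in a² and reads off the intercept. It is only an illustration under stated assumptions: the actual fit is performed by jqmc-tool lrdmc extrapolate-energy with the configured polynomial_order, the helper `extrapolate_to_zero` is hypothetical, and the energies are synthetic.

```python
def extrapolate_to_zero(alats, energies):
    """Least-squares line E = e0 + c * a**2; return the intercept e0."""
    xs = [a * a for a in alats]  # the fit variable is a**2, not a
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(energies) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, energies))
    slope = sxy / sxx
    return mean_y - slope * mean_x


alat_list = [0.5, 0.4, 0.3]
# Synthetic energies generated from E(a) = -1.0 + 0.2 * a**2 (made-up numbers).
energies = [-1.0 + 0.2 * a * a for a in alat_list]
e0 = extrapolate_to_zero(alat_list, energies)
```

A single alat value gives `sxx == 0`, so no line can be fitted, which matches the requirement of at least two values.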

See also

LRDMC_Workflow

Single-alat LRDMC run.

async async_launch()#

Run LRDMC at each alat, then extrapolate to a²→0.

Every alat value is launched in parallel. Each child LRDMC_Workflow independently handles its own calibration (_pilot_a), error-bar pilot (_pilot_b), and production phase.

Returns:

(status, output_files, output_values)

Return type:

tuple

jqmc_workflow.lrdmc_workflow module#

LRDMC_Workflow — Lattice-Regularized Diffusion Monte Carlo run.

Generates an LRDMC input TOML, submits jqmc (job_type=lrdmc-bra or job_type=lrdmc-tau) on a remote/local machine, monitors until completion, fetches the checkpoint, and post-processes with jqmc-tool lrdmc compute-energy.

Two operating modes are available:

  • GFMC_n mode (job_type=lrdmc-bra) — activated when target_survived_walkers_ratio or num_mcmc_per_measurement is set. Uses discrete projections per measurement.

  • GFMC_t mode (job_type=lrdmc-tau) — activated when time_projection_tau is used (default). Uses a continuous imaginary time step between projections. No calibration pilot is needed.

class jqmc_workflow.lrdmc_workflow.LRDMC_Workflow(server_machine_name='localhost', alat=0.3, hamiltonian_file='hamiltonian_data.h5', input_file='input.toml', output_file='out.o', queue_label='default', jobname='jqmc-lrdmc', number_of_walkers=4, max_time=86400, num_gfmc_bin_blocks=5, num_gfmc_warmup_steps=0, num_gfmc_collect_steps=5, time_projection_tau=0.1, target_survived_walkers_ratio=None, num_mcmc_per_measurement=None, non_local_move=None, E_scf=None, atomic_force=None, epsilon_PW=None, mcmc_seed=None, verbosity=None, poll_interval=60, target_error=0.001, pilot_steps=100, num_gfmc_projections=None, pilot_queue_label=None, max_continuation=1)#

Bases: Workflow

Single LRDMC (Lattice-Regularized Diffusion Monte Carlo) run.

Generates a job_type=lrdmc-bra (GFMC_n) or job_type=lrdmc-tau (GFMC_t) input TOML at a fixed lattice spacing alat, submits jqmc, monitors until completion, fetches the checkpoint, and post-processes with jqmc-tool lrdmc compute-energy to extract the DMC energy ± error.

Mode selection (mutually exclusive):

  • GFMC_t (default) — set time_projection_tau (default 0.10). Uses continuous imaginary-time projection. Only the error-bar pilot is run (no calibration phase).

  • GFMC_n — set target_survived_walkers_ratio or num_mcmc_per_measurement. Uses discrete GFMC projections. When target_survived_walkers_ratio is set (and num_mcmc_per_measurement is None), an automatic calibration pilot determines the optimal num_mcmc_per_measurement.

Independently of the GFMC mode above, the workflow supports two step-control modes:

Automatic mode (default, num_gfmc_projections=None):

  1. Pilot run (_0) — A short run with pilot_steps measurement steps. The resulting error is used to estimate the steps required for target_error via $\sigma \propto 1/\sqrt{N}$. In GFMC_n mode with calibration, three additional short runs precede this to determine num_mcmc_per_measurement.

  2. Production runs (_1, _2, …) — Continuation runs with the estimated step count. The loop terminates when the error is ≤ target_error or max_continuation is reached.

Fixed-step mode (num_gfmc_projections is set):

The error-bar pilot (_pilot_b) is skipped and target_error is ignored. If calibration is needed (GFMC_n mode with target_survived_walkers_ratio), _pilot_a still runs. Each production run uses exactly num_gfmc_projections measurement steps, and max_continuation runs are executed unconditionally.

Parameters:
  • server_machine_name (str) – Target machine name.

  • alat (float) – Lattice discretization parameter (bohr).

  • hamiltonian_file (str) – Input hamiltonian_data.h5.

  • input_file (str) – Generated TOML input filename.

  • output_file (str) – Stdout capture filename.

  • queue_label (str) – Queue/partition label.

  • jobname (str) – Scheduler job name.

  • number_of_walkers (int) – Walkers per MPI process.

  • max_time (int) – Wall-time limit (seconds).

  • num_gfmc_bin_blocks (int) – Binning blocks for post-processing.

  • num_gfmc_warmup_steps (int) – Warmup steps to discard in post-processing.

  • num_gfmc_collect_steps (int) – Weight-collection steps for energy post-processing.

  • time_projection_tau (float, optional) – Imaginary time step between projections (bohr) for GFMC_t mode. Default 0.10. Ignored when target_survived_walkers_ratio or num_mcmc_per_measurement is set.

  • target_survived_walkers_ratio (float, optional) – Target survived-walkers ratio for automatic num_mcmc_per_measurement calibration. Setting this activates GFMC_n mode. The pilot phase runs three short calculations at Ne*k*(0.3/alat)² projections (k=2,4,6), fits a linear model to the observed survived-walkers ratio, and picks the value that achieves this target.

  • num_mcmc_per_measurement (int, optional) – GFMC projections per measurement (GFMC_n mode). When given explicitly, the automatic calibration is skipped.

  • non_local_move (str, optional) – Non-local move treatment ("tmove" or "dltmove"). Default from jqmc_miscs.

  • E_scf (float, optional) – Initial energy guess for the GFMC shift (GFMC_n only). Default from jqmc_miscs.

  • atomic_force (bool, optional) – Compute atomic forces. Default from jqmc_miscs.

  • epsilon_PW (float, optional) – Pathak–Wagner regularization parameter (Bohr). When > 0, the force estimator is regularized near the nodal surface. Default from jqmc_miscs.

  • mcmc_seed (int, optional) – Random seed for MCMC. Default from jqmc_miscs.

  • verbosity (str, optional) – Verbosity level. Default from jqmc_miscs.

  • poll_interval (int) – Seconds between job-status polls.

  • target_error (float) – Target statistical error (Ha).

  • pilot_steps (int) – Measurement steps for the pilot estimation run.

  • num_gfmc_projections (int, optional) – Fixed number of measurement steps per production run. When set, the error-bar pilot (_pilot_b) is skipped, target_error is ignored, and all max_continuation production runs are executed unconditionally. Calibration (_pilot_a) still runs when needed (GFMC_n mode with target_survived_walkers_ratio). Default None (automatic mode).

  • pilot_queue_label (str, optional) – Queue label for the pilot run. Defaults to queue_label. Use a shorter/smaller queue for the pilot to save resources.

  • max_continuation (int) – Maximum number of production runs after the pilot.
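The calibration described for target_survived_walkers_ratio can be sketched as below. This is a hedged illustration: the function names, the rounding to integers, and the survived-walkers ratios are assumptions, and only the Ne*k*(0.3/alat)² trial points and the linear fit come from the description above.

```python
def trial_projections(num_electrons, alat, ks=(2, 4, 6)):
    """Projection counts for the three calibration runs: Ne * k * (0.3 / alat)**2."""
    # Rounding to an integer is an assumption of this sketch.
    return [max(1, round(num_electrons * k * (0.3 / alat) ** 2)) for k in ks]


def calibrate(projections, ratios, target_ratio):
    """Fit ratio = intercept + slope * n and solve for n at the target ratio."""
    n = len(projections)
    mean_x = sum(projections) / n
    mean_y = sum(ratios) / n
    sxx = sum((x - mean_x) ** 2 for x in projections)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(projections, ratios))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return max(1, round((target_ratio - intercept) / slope))


trials = trial_projections(num_electrons=10, alat=0.3)
# Made-up survived-walkers ratios observed at the three trial points.
observed = [0.99, 0.98, 0.97]
num_mcmc_per_measurement = calibrate(trials, observed, target_ratio=0.975)
```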

Examples

GFMC_t mode (default):

wf = LRDMC_Workflow(
    server_machine_name="cluster",
    alat=0.3,
    target_error=0.0005,
    number_of_walkers=8,
)
status, files, values = wf.launch()
print(values["energy"], values["energy_error"])

GFMC_n mode with calibration:

wf = LRDMC_Workflow(
    server_machine_name="cluster",
    alat=0.3,
    target_error=0.0005,
    target_survived_walkers_ratio=0.97,
    number_of_walkers=8,
)

Fixed-step mode (skip error-bar pilot):

wf = LRDMC_Workflow(
    server_machine_name="cluster",
    alat=0.3,
    num_gfmc_projections=500,
    max_continuation=3,
    number_of_walkers=8,
)

As part of a Launcher pipeline:

enc = Container(
    label="lrdmc-a0.30",
    dirname="03_lrdmc",
    input_files=[FileFrom("mcmc-run", "hamiltonian_data.h5")],
    workflow=LRDMC_Workflow(
        server_machine_name="cluster",
        alat=0.3,
        target_error=0.001,
    ),
)

Notes

  • For a²→0 continuum-limit extrapolation, use LRDMC_Ext_Workflow instead.

  • The pilot is skipped on re-entrance if an estimation already exists in workflow_state.toml.

See also

LRDMC_Ext_Workflow

Multi-alat extrapolation wrapper.

MCMC_Workflow

VMC production sampling (job_type=mcmc).

VMC_Workflow

Wavefunction optimisation (job_type=vmc).

async async_launch()#

Run the LRDMC workflow.

Fixed-step mode (num_gfmc_projections is set): The error-bar pilot (_pilot_b) is skipped. Calibration (_pilot_a) still runs if needed. Each production run uses exactly num_gfmc_projections steps and all max_continuation runs are executed unconditionally.

Automatic mode (num_gfmc_projections is None, default):

  1. Calibration pilot (_pilot_a, GFMC_n only) — Three short LRDMC runs to determine num_mcmc_per_measurement.

  2. Error-bar pilot (_pilot_b) — estimates production steps.

  3. Production runs (_1, _2, …) — accumulate statistics until target_error is achieved or max_continuation is reached.

property job_type: str#

Return the jqmc job type string for TOML generation.
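The mode-selection rule documented for this class can be written as a small standalone function. This is a sketch of the documented behaviour, not the property's actual source; `select_job_type` is a hypothetical name.

```python
def select_job_type(time_projection_tau=0.1,
                    target_survived_walkers_ratio=None,
                    num_mcmc_per_measurement=None):
    """Return "lrdmc-bra" (GFMC_n) or "lrdmc-tau" (GFMC_t)."""
    # Either GFMC_n parameter takes precedence; time_projection_tau is then ignored.
    if (target_survived_walkers_ratio is not None
            or num_mcmc_per_measurement is not None):
        return "lrdmc-bra"
    return "lrdmc-tau"


default_mode = select_job_type()
calibrated_mode = select_job_type(target_survived_walkers_ratio=0.97)
explicit_mode = select_job_type(num_mcmc_per_measurement=40)
```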

jqmc_workflow.mcmc_workflow module#

MCMC_Workflow — MCMC production run (sampling) via jqmc (job_type=mcmc).

Generates an MCMC input TOML, submits jqmc on a remote/local machine, monitors until completion, fetches results, and post-processes the checkpoint with jqmc-tool mcmc compute-energy to extract the VMC energy ± error.

class jqmc_workflow.mcmc_workflow.MCMC_Workflow(server_machine_name='localhost', hamiltonian_file='hamiltonian_data.h5', input_file='input.toml', output_file='out.o', queue_label='default', jobname='jqmc-mcmc', number_of_walkers=4, max_time=86400, num_mcmc_bin_blocks=1, num_mcmc_warmup_steps=0, Dt=None, epsilon_AS=None, num_mcmc_per_measurement=None, atomic_force=None, parameter_derivatives=None, mcmc_seed=None, verbosity=None, poll_interval=60, target_error=0.001, num_mcmc_steps=None, pilot_steps=100, pilot_queue_label=None, max_continuation=1)#

Bases: Workflow

MCMC (VMC production-run / sampling) workflow.

Generates a job_type=mcmc input TOML, submits jqmc on a remote or local machine, monitors until completion, fetches results, and post-processes the checkpoint with jqmc-tool mcmc compute-energy to obtain the VMC energy ± error.

The workflow supports two modes:

Automatic mode (default, num_mcmc_steps=None):

  1. Pilot run (_0) — A short MCMC run with pilot_steps measurement steps. The resulting statistical error is used to estimate the total steps required for target_error via the CLT scaling $\sigma \propto 1/\sqrt{N}$.

  2. Production runs (_1, _2, …) — Continuation runs with the estimated step count. After each run, the checkpoint is post-processed; if the error is at or below target_error the loop terminates. At most max_continuation production runs are attempted.
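The step estimate in the pilot phase follows directly from the 1/√N scaling: if the pilot reaches error σ_pilot with N_pilot steps, the target needs roughly N_pilot · (σ_pilot / σ_target)² steps. A minimal sketch, where `estimate_required_steps` is a hypothetical helper and rounding up is an assumption:

```python
import math


def estimate_required_steps(pilot_steps, pilot_error, target_error):
    """Steps needed for target_error, assuming error scales as 1 / sqrt(N)."""
    return math.ceil(pilot_steps * (pilot_error / target_error) ** 2)


# A pilot of 100 steps with a 4 mHa error, aiming for 1 mHa (made-up numbers).
required = estimate_required_steps(
    pilot_steps=100, pilot_error=0.004, target_error=0.001
)
```

Reducing the error by a factor of four costs sixteen times as many steps, which is why a short pilot on a smaller queue is usually enough to size the production run.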

Fixed-step mode (num_mcmc_steps is set):

The pilot run is skipped entirely and target_error is ignored. Each production run uses exactly num_mcmc_steps measurement steps, and max_continuation runs are executed unconditionally.

Parameters:
  • server_machine_name (str) – Name of the target machine (configured in ~/.jqmc_setting/).

  • hamiltonian_file (str) – Input hamiltonian_data.h5.

  • input_file (str) – Generated TOML input filename.

  • output_file (str) – Stdout capture filename.

  • queue_label (str) – Queue/partition label.

  • jobname (str) – Scheduler job name.

  • number_of_walkers (int) – Walkers per MPI process.

  • max_time (int) – Wall-time limit (seconds).

  • num_mcmc_bin_blocks (int) – Binning blocks for post-processing.

  • num_mcmc_warmup_steps (int) – Warmup steps to discard in post-processing.

  • Dt (float, optional) – MCMC step size (bohr). Default from jqmc_miscs.

  • epsilon_AS (float, optional) – Attaccalite-Sorella regularization parameter. Default from jqmc_miscs.

  • num_mcmc_per_measurement (int, optional) – MCMC updates per measurement. Default from jqmc_miscs.

  • atomic_force (bool, optional) – Compute atomic forces. Default from jqmc_miscs.

  • parameter_derivatives (bool, optional) – Compute parameter derivatives. Default from jqmc_miscs.

  • mcmc_seed (int, optional) – Random seed for MCMC. Default from jqmc_miscs.

  • verbosity (str, optional) – Verbosity level. Default from jqmc_miscs.

  • poll_interval (int) – Seconds between job-status polls.

  • target_error (float) – Target statistical error (Ha). Ignored when num_mcmc_steps is set.

  • num_mcmc_steps (int, optional) – Fixed number of measurement steps per production run. When set, the pilot run is skipped and target_error is ignored; each of the max_continuation production runs uses exactly this many steps.

  • pilot_steps (int) – Measurement steps for the pilot estimation run. Ignored when num_mcmc_steps is set.

  • pilot_queue_label (str, optional) – Queue label for the pilot run. Defaults to queue_label. Use a shorter/smaller queue for the pilot to save resources.

  • max_continuation (int) – Maximum number of production runs after the pilot.

Examples

Standalone launch (automatic mode):

wf = MCMC_Workflow(
    server_machine_name="cluster",
    target_error=0.0005,
    pilot_steps=200,
    number_of_walkers=8,
)
status, files, values = wf.launch()
print(values["energy"], values["energy_error"])

Fixed-step mode (no pilot, no target_error check):

wf = MCMC_Workflow(
    server_machine_name="cluster",
    num_mcmc_steps=5000,
    number_of_walkers=8,
    max_continuation=3,
)
status, files, values = wf.launch()

As part of a Launcher pipeline:

enc = Container(
    label="mcmc",
    dirname="02_mcmc",
    input_files=[FileFrom("vmc-opt", "hamiltonian_data_opt_step_9.h5")],
    rename_input_files=["hamiltonian_data.h5"],
    workflow=MCMC_Workflow(
        server_machine_name="cluster",
        target_error=0.001,
    ),
)

Notes

  • The pilot run is skipped on re-entrance if an estimation already exists in workflow_state.toml.

  • Continuation runs restart from the most recent .h5 checkpoint file.

See also

VMC_Workflow

Wavefunction optimisation (job_type=vmc).

LRDMC_Workflow

Diffusion Monte Carlo (job_type=lrdmc-bra / lrdmc-tau).

async async_launch()#

Run the MCMC workflow.

Fixed-step mode (num_mcmc_steps is set): The pilot run is skipped. Each production run uses exactly num_mcmc_steps steps and all max_continuation runs are executed unconditionally.

Automatic mode (num_mcmc_steps is None, default):

  1. Pilot run in _pilot/ subdirectory estimates required steps (skipped on continuation). May use a different queue from production (pilot_queue_label).

  2. Production runs (_1, _2, …) start from scratch and accumulate statistics until target_error is achieved or max_continuation is reached.
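The two termination rules can be compared with a simulated continuation loop. This sketch is an assumption-laden toy: `run_production` and `sigma_one_step` are hypothetical, and the error is modelled as exactly 1/√N instead of coming from a real checkpoint.

```python
import math


def run_production(steps_per_run, max_continuation,
                   target_error=None, sigma_one_step=0.04):
    """Return (runs executed, final error) for a simulated continuation loop."""
    total_steps = 0
    error = math.inf
    runs = 0
    for runs in range(1, max_continuation + 1):
        total_steps += steps_per_run  # each run adds to the accumulated statistics
        error = sigma_one_step / math.sqrt(total_steps)
        # Automatic mode stops early; fixed-step mode passes target_error=None.
        if target_error is not None and error <= target_error:
            break
    return runs, error


automatic_runs, automatic_error = run_production(400, 5, target_error=0.0012)
fixed_runs, fixed_error = run_production(400, 3)
```

In automatic mode the loop ends as soon as the accumulated error meets the target, while in fixed-step mode every one of the max_continuation runs is executed.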

jqmc_workflow.vmc_workflow module#

VMC_Workflow — Jastrow / orbital optimization via jqmc (job_type=vmc).

Generates a VMC input TOML, submits the jqmc binary on a remote (or local) machine, monitors until completion, and fetches the results. The optimized hamiltonian_data_opt_step_N.h5 files and checkpoint are collected as outputs.

class jqmc_workflow.vmc_workflow.VMC_Workflow(server_machine_name='localhost', num_opt_steps=20, hamiltonian_file='hamiltonian_data.h5', input_file='input.toml', output_file='out.o', queue_label='default', jobname='jqmc-vmc', number_of_walkers=4, max_time=86400, Dt=None, epsilon_AS=None, num_mcmc_per_measurement=None, num_mcmc_warmup_steps=None, num_mcmc_bin_blocks=None, wf_dump_freq=None, opt_J1_param=None, opt_J2_param=None, opt_J3_param=None, opt_JNN_param=None, opt_lambda_param=None, opt_with_projected_MOs=None, num_param_opt=None, optimizer_kwargs=None, mcmc_seed=None, verbosity=None, poll_interval=60, target_error=0.001, pilot_mcmc_steps=50, pilot_vmc_steps=5, pilot_queue_label=None, max_continuation=1, target_snr=4.5)#

Bases: Workflow

VMC (Variational Monte Carlo) Jastrow / orbital optimisation workflow.

Generates a job_type=vmc input TOML, submits jqmc, monitors until completion, and collects the optimised hamiltonian_data_opt_step_N.h5 files and checkpoint.

The workflow operates in two phases:

  1. Pilot VMC run (_0) — Runs a short optimisation with pilot_vmc_steps optimisation steps and pilot_mcmc_steps MCMC steps per step. The statistical error of the last optimisation step is used to estimate the MCMC steps per opt-step required to achieve target_error via $\sigma \propto 1/\sqrt{N}$.

  2. Production VMC runs (_1, _2, …) — Full optimisation with num_opt_steps and the estimated MCMC steps per step. If a run is interrupted by the wall-time limit, the next continuation restarts from the checkpoint. At most max_continuation runs are attempted.

Parameters:
  • server_machine_name (str) – Name of the target machine (must be configured in ~/.jqmc_setting/).

  • num_opt_steps (int) – Number of optimization iterations for production runs.

  • hamiltonian_file (str) – Input hamiltonian_data.h5.

  • input_file (str) – Name of the generated TOML input file.

  • output_file (str) – Name of the stdout capture file.

  • queue_label (str) – Queue/partition label from queue_data.toml.

  • jobname (str) – Job name for the scheduler.

  • number_of_walkers (int) – Walkers per MPI process.

  • max_time (int) – Wall-time limit in seconds.

  • Dt (float, optional) – MCMC step size (bohr). Default from jqmc_miscs.

  • epsilon_AS (float, optional) – Attaccalite-Sorella regularization parameter. Default from jqmc_miscs.

  • num_mcmc_per_measurement (int, optional) – MCMC updates per measurement. Default from jqmc_miscs.

  • num_mcmc_warmup_steps (int, optional) – Warmup measurement steps to discard. Default from jqmc_miscs.

  • num_mcmc_bin_blocks (int, optional) – Binning blocks. Default from jqmc_miscs.

  • wf_dump_freq (int, optional) – Wavefunction dump frequency. Default from jqmc_miscs.

  • opt_J1_param (bool, optional) – Optimize J1 Jastrow parameters. Default from jqmc_miscs.

  • opt_J2_param (bool, optional) – Optimize J2 Jastrow parameters. Default from jqmc_miscs.

  • opt_J3_param (bool, optional) – Optimize J3 Jastrow parameters. Default from jqmc_miscs.

  • opt_JNN_param (bool, optional) – Optimize neural-network Jastrow parameters. Default from jqmc_miscs.

  • opt_lambda_param (bool, optional) – Optimize lambda (geminal) parameters. Default from jqmc_miscs.

  • opt_with_projected_MOs (bool, optional) – Optimize in a restricted MO space. Default from jqmc_miscs.

  • num_param_opt (int, optional) – Number of parameters to optimize (0 = all). Default from jqmc_miscs.

  • optimizer_kwargs (dict, optional) – Optimizer configuration dict. Default from jqmc_miscs.

  • mcmc_seed (int, optional) – Random seed for MCMC. Default from jqmc_miscs.

  • verbosity (str, optional) – Verbosity level. Default from jqmc_miscs.

  • poll_interval (int) – Seconds between job-status polls.

  • target_error (float) – Target statistical error (Ha) per optimization step.

  • pilot_mcmc_steps (int) – MCMC steps per opt-step for the pilot run.

  • pilot_vmc_steps (int) – Number of optimization steps in the pilot run (small; just enough to estimate the error bar).

  • pilot_queue_label (str, optional) – Queue label for the pilot run. Defaults to queue_label. Use a shorter/smaller queue for the pilot to save resources.

  • max_continuation (int) – Maximum number of production runs after the pilot.

  • target_snr (float) – Target signal-to-noise ratio max(|f|/|std f|) for force convergence. The workflow continues until the last optimization step’s S/N drops to or below this threshold.
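The target_snr criterion can be sketched as a check on the largest signal-to-noise ratio among the force components. The helper names and the numbers are hypothetical; how the workflow extracts f and std f from the output is not shown here.

```python
def max_force_snr(forces, force_stds):
    """Largest |f| / |std f| over all force components."""
    return max(abs(f) / abs(s) for f, s in zip(forces, force_stds))


def is_converged(forces, force_stds, target_snr=4.5):
    """True once the largest signal-to-noise ratio is at or below target_snr."""
    return max_force_snr(forces, force_stds) <= target_snr


# Made-up force components and their standard deviations.
stds = [0.004, 0.003, 0.002]
early = is_converged([0.02, -0.009, 0.001], stds)   # largest ratio is about 5
late = is_converged([0.008, -0.009, 0.001], stds)   # largest ratio is about 3
```

A ratio at or below the threshold means every force component is statistically close to zero at the current noise level, so further optimisation steps would mostly follow noise.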

Examples

Standalone launch:

wf = VMC_Workflow(
    server_machine_name="cluster",
    num_opt_steps=20,
    target_error=0.001,
    pilot_mcmc_steps=50,
    pilot_vmc_steps=5,
    number_of_walkers=8,
)
status, files, values = wf.launch()
print(values["optimized_hamiltonian"])

As part of a Launcher pipeline:

enc = Container(
    label="vmc",
    dirname="01_vmc",
    input_files=[FileFrom("wf", "hamiltonian_data.h5")],
    workflow=VMC_Workflow(
        server_machine_name="cluster",
        num_opt_steps=20,
        target_error=0.001,
    ),
)

Notes

  • The pilot uses a small number of opt steps (pilot_vmc_steps) just to estimate the error. The real optimisation happens in production runs with the full num_opt_steps.

  • The estimation is stored in workflow_state.toml under [estimation]; on re-entrance the pilot is skipped.

See also

MCMC_Workflow

VMC production sampling (job_type=mcmc).

LRDMC_Workflow

Diffusion Monte Carlo (job_type=lrdmc-bra / lrdmc-tau).

WF_Workflow

TREXIO → hamiltonian_data conversion.

async async_launch()#

Run the VMC optimization workflow with automatic step estimation.

  1. Pilot VMC run in _pilot/ with pilot_vmc_steps opt steps and pilot_mcmc_steps MCMC steps to estimate the required MCMC steps per opt step (skipped on continuation). May use a different queue (pilot_queue_label).

  2. Production VMC runs (_1, _2, …) start from scratch with the full num_opt_steps and estimated MCMC steps until all optimization steps complete or max_continuation is reached.

jqmc_workflow.wf_workflow module#

WF_Workflow — TREXIO to hamiltonian_data.h5 conversion.

Wraps jqmc-tool trexio convert-to, which converts a TREXIO file (.h5) into the internal hamiltonian_data.h5 format, optionally attaching Jastrow one-body, two-body, three-body, and neural-network factors.

This workflow runs locally (no remote job submission).

class jqmc_workflow.wf_workflow.WF_Workflow(trexio_file='trexio.h5', hamiltonian_file='hamiltonian_data.h5', j1_parameter=None, j1_type=None, j2_parameter=None, j2_type=None, j3_basis_type=None, j_nn_type=None, j_nn_params=None, ao_conv_to=None)#

Bases: Workflow

Convert a TREXIO file to hamiltonian_data.h5.

Calls jqmc-tool trexio convert-to under the hood.

Parameters:
  • trexio_file (str) – Path to the input TREXIO .h5 file.

  • hamiltonian_file (str) – Output filename (default: "hamiltonian_data.h5").

  • j1_parameter (float, optional) – Jastrow one-body parameter (-j1).

  • j1_type (str, optional) – Jastrow one-body functional form (--jastrow-1b-type). "exp" (default) or "pade".

  • j2_parameter (float, optional) – Jastrow two-body parameter (-j2).

  • j2_type (str, optional) – Jastrow two-body functional form (--jastrow-2b-type). "pade" (default) or "exp".

  • j3_basis_type (str, optional) – Jastrow three-body basis-set type (-j3). One of "ao", "ao-full", "ao-small", "ao-medium", "ao-large", "mo", "none", or None (disabled).

  • j_nn_type (str, optional) – Neural-network Jastrow type (-j-nn-type), e.g. "schnet".

  • j_nn_params (list[str], optional) – Extra NN Jastrow parameters (-jp key=value).

  • ao_conv_to (str, optional) – Convert AOs after building the Hamiltonian (--ao-conv-to). "cart" → convert to Cartesian AOs, "sphe" → convert to spherical-harmonic AOs, None → keep the original representation.
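The mapping from constructor parameters to command-line options can be sketched as follows. Only the flags listed above are used. The helper `build_optional_args` is hypothetical, repeating -jp once per entry is an assumption, and the input and output file arguments are left out because their form is not shown here.

```python
def build_optional_args(j1_parameter=None, j1_type=None,
                        j2_parameter=None, j2_type=None,
                        j3_basis_type=None, j_nn_type=None,
                        j_nn_params=None, ao_conv_to=None):
    """Collect the optional flags for `jqmc-tool trexio convert-to`."""
    # (flag, value) pairs in a fixed order; None means "not requested".
    single_valued = [
        ("-j1", j1_parameter),
        ("--jastrow-1b-type", j1_type),
        ("-j2", j2_parameter),
        ("--jastrow-2b-type", j2_type),
        ("-j3", j3_basis_type),
        ("-j-nn-type", j_nn_type),
        ("--ao-conv-to", ao_conv_to),
    ]
    args = []
    for flag, value in single_valued:
        if value is not None:
            args += [flag, str(value)]
    # Assumption: each key=value entry is passed with its own -jp flag.
    for item in j_nn_params or []:
        args += ["-jp", item]
    return args


options = build_optional_args(
    j1_parameter=1.0, j1_type="pade",
    j2_parameter=0.5, j2_type="exp",
    j3_basis_type="ao-small",
)
```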

Example

>>> wf = WF_Workflow(
...     trexio_file="molecular.h5",
...     j1_parameter=1.0,
...     j1_type="pade",
...     j2_parameter=0.5,
...     j2_type="exp",
...     j3_basis_type="ao-small",
... )
>>> status, out_files, out_values = wf.launch()

Notes

This workflow runs locally — no remote job submission is involved. It calls jqmc-tool trexio convert-to via subprocess.run().

See also

VMC_Workflow

Optimise the wavefunction produced by this step.

async async_launch()#

Run the TREXIO→hamiltonian conversion (locally).

Returns:

(status, output_files, output_values)

Return type:

tuple

jqmc_workflow.workflow module#

Base Workflow and Container (encapsulated workflow) classes for jqmc-workflow.

Workflow state is tracked via workflow_state.toml (human- and machine-readable). Dependencies between workflows are declared with FileFrom / ValueFrom.

class jqmc_workflow.workflow.Container(label='workflow', dirname='workflow', input_files=None, rename_input_files=None, workflow=None)#

Bases: object

Run a Workflow inside a dedicated project directory.

Container is the standard wrapper used with the Launcher. It manages:

  • Directory creation — a self-contained project directory is created under the current working directory.

  • Input file copying — source files (or resolved FileFrom references) are copied into the project dir.

  • State tracking — a workflow_state.toml file records lifecycle status (pending → running → completed).

  • Re-entrance — if the directory already exists with status completed, the workflow is not re-run; outputs are read from the state file instead.

Parameters:
  • label (str) – Human-readable label; also used as the key for dependency resolution in the Launcher.

  • dirname (str) – Directory name to create (relative to CWD).

  • input_files (list[str | FileFrom]) – Files to copy into the project directory before launch. Items may be plain paths or FileFrom objects.

  • rename_input_files (list[str], optional) – If provided (same length as input_files), each copied file is renamed to the corresponding entry.

  • workflow (Workflow) – The inner Workflow instance to execute.

Variables:
  • output_files (list[str]) – Output filenames (populated after launch).

  • output_values (dict) – Scalar results from the inner workflow.

  • status (str) – Current status.

  • project_dir (str) – Absolute path to the project directory.

Examples

Wrap a VMC optimization in its own directory:

enc = Container(
    label="vmc-opt",
    dirname="01_vmc",
    input_files=["hamiltonian_data.h5"],
    workflow=VMC_Workflow(
        server_machine_name="cluster",
        num_opt_steps=10,
        target_error=0.001,
    ),
)
status, files, values = enc.launch()
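
The pairing between input_files and rename_input_files can be pictured with a small standalone sketch. It does not use Container itself: stage_inputs is a hypothetical helper, and the copy-then-rename behaviour is an assumption based only on the parameter descriptions above.

```python
import shutil
import tempfile
from pathlib import Path


def stage_inputs(input_files, project_dir, rename_input_files=None):
    """Copy each input file into project_dir, optionally under a new name.

    Hypothetical helper for illustration; not part of jqmc_workflow.
    """
    if rename_input_files is not None and len(rename_input_files) != len(input_files):
        raise ValueError("rename_input_files must have the same length as input_files")
    project_dir = Path(project_dir)
    project_dir.mkdir(parents=True, exist_ok=True)
    # Without a rename list, every file keeps its original basename.
    targets = rename_input_files or [Path(f).name for f in input_files]
    staged = []
    for src, name in zip(input_files, targets):
        shutil.copy(src, project_dir / name)
        staged.append(name)
    return staged


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "hamiltonian_data_opt_step_9.h5"
    src.write_bytes(b"placeholder")  # dummy content standing in for a real HDF5 file
    staged = stage_inputs([str(src)], Path(tmp) / "02_mcmc", ["hamiltonian_data.h5"])
    staged_exists = (Path(tmp) / "02_mcmc" / "hamiltonian_data.h5").exists()
```

Entries are matched by position, which is why the two lists must have the same length.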

See also

Launcher

Execute multiple Container objects as a DAG.

FileFrom

Reference an output file from another workflow.

async async_launch()#
launch()#
class jqmc_workflow.workflow.FileFrom(label, filename)#

Bases: object

Declare that an input file should come from another workflow’s output.

Used inside Container definitions to express inter-workflow file dependencies. The Launcher resolves these placeholders before a workflow is launched.

Parameters:
  • label (str) – Label of the upstream workflow that produces the file.

  • filename (str) – Filename (basename) to pull from the upstream workflow’s output directory.

Examples

Pass an optimised Hamiltonian from a VMC step to an MCMC step:

Container(
    label="mcmc-run",
    dirname="mcmc",
    input_files=[FileFrom("vmc-opt", "hamiltonian_data_opt_step_9.h5")],
    rename_input_files=["hamiltonian_data.h5"],
    workflow=MCMC_Workflow(...),
)

See also

ValueFrom

Declare a scalar-value dependency.

Launcher

Resolves FileFrom / ValueFrom at launch time.

class jqmc_workflow.workflow.ValueFrom(label, key)#

Bases: object

Declare that a parameter value should come from another workflow’s output.

Used when a downstream workflow needs a scalar result (energy, error, filename string, etc.) produced by an upstream workflow. The Launcher resolves these placeholders before launch.

Parameters:
  • label (str) – Label of the upstream workflow that produces the value.

  • key (str) – Key name in the upstream workflow’s output_values dict.

Examples

Feed the MCMC energy into an LRDMC workflow as trial_energy:

LRDMC_Workflow(
    trial_energy=ValueFrom("mcmc-run", "energy"),
    ...
)
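
How such a placeholder might be resolved can be sketched without the package. ValueRef and resolve_params below are hypothetical stand-ins (not the real ValueFrom or Launcher internals), and the upstream numbers are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValueRef:
    """Stand-in for ValueFrom: names an upstream label and an output key."""
    label: str
    key: str


def resolve_params(params, upstream_outputs):
    """Replace every ValueRef in params with the upstream scalar it names."""
    resolved = {}
    for name, value in params.items():
        if isinstance(value, ValueRef):
            try:
                value = upstream_outputs[value.label][value.key]
            except KeyError as exc:
                raise ValueError(f"cannot resolve {name}: missing {exc}") from exc
        resolved[name] = value
    return resolved


# Made-up output_values of an upstream workflow labelled "mcmc-run".
upstream_outputs = {"mcmc-run": {"energy": -1.23, "energy_error": 0.002}}
params = {"trial_energy": ValueRef("mcmc-run", "energy"), "alat": 0.3}
resolved = resolve_params(params, upstream_outputs)
```

Plain values pass through unchanged, so only parameters that are placeholders depend on the upstream workflow having finished.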

See also

FileFrom

Declare a file dependency.

Launcher

Resolves FileFrom / ValueFrom at launch time.

class jqmc_workflow.workflow.Workflow(project_dir=None)#

Bases: object

Abstract base class for all jQMC computation workflows.

Every concrete workflow (VMC, MCMC, LRDMC, WF, …) inherits from this class and overrides async_launch().

Parameters:

project_dir (str, optional) – Absolute path to the working directory for this workflow. When None (the default), project_dir is set to the process CWD at the time async_launch() is first called. Container sets this explicitly before launching the inner workflow.

Variables:
  • status (str) – Current lifecycle status ("init", "success", "failed").

  • output_files (list[str]) – Filenames produced by the workflow (populated after launch).

  • output_values (dict) – Scalar results (energy, error, …) produced by the workflow.

  • project_dir (str or None) – Working directory for file I/O. Resolved to an absolute path.

Notes

Subclass contract:

  • Override async_launch() and return (status, output_files, output_values).

  • Call super().__init__() in your constructor.

Examples

Minimal custom workflow:

class MyWorkflow(Workflow):
    async def async_launch(self):
        # ... do work ...
        self.status = "success"
        return self.status, ["result.h5"], {"energy": -1.23}
async async_collect()#

Collect and return results from completed jobs.

Override in subclass. The default raises NotImplementedError.

Returns:

Workflow-specific result mapping.

Return type:

dict

async async_launch()#

Override in subclass. Must return (status, output_files, output_values).

async async_poll()#

Check whether submitted jobs have completed.

Override in subclass. The default raises NotImplementedError.

Returns:

One of "running", "completed", "failed".

Return type:

str

async async_submit()#

Submit initial job(s) and return tracking information.

Override in subclass. The default raises NotImplementedError.

Returns:

At minimum {"status": "submitted"}.

Return type:

dict
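
One way the three hooks above could be chained inside async_launch() is sketched below. StubJobWorkflow is a self-contained stand-in rather than a subclass of the real Workflow, and the polling loop is an assumption about how a subclass might combine the hooks, not a description of the package's internals.

```python
import asyncio


class StubJobWorkflow:
    """Stand-in showing a submit, poll, collect sequence."""

    def __init__(self, polls_until_done=3, poll_interval=0.0):
        self._remaining = polls_until_done
        self.poll_interval = poll_interval
        self.status = "init"

    async def async_submit(self):
        return {"status": "submitted"}

    async def async_poll(self):
        # Pretend the job finishes after a fixed number of polls.
        self._remaining -= 1
        return "completed" if self._remaining <= 0 else "running"

    async def async_collect(self):
        return {"energy": -1.23}  # made-up result

    async def async_launch(self):
        await self.async_submit()
        while True:
            state = await self.async_poll()
            if state != "running":
                break
            await asyncio.sleep(self.poll_interval)
        if state == "failed":
            self.status = "failed"
            return self.status, [], {}
        values = await self.async_collect()
        self.status = "success"
        return self.status, ["result.h5"], values


status, files, values = asyncio.run(StubJobWorkflow().async_launch())
```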

launch()#

Module contents#

jqmc_workflow — Automated workflow manager for jQMC calculations.

Public API#

Workflow classes:

  • WF_Workflow – TREXIO → hamiltonian_data.h5 conversion.

  • VMC_Workflow – Jastrow / orbital optimisation (job_type=vmc).

  • MCMC_Workflow – VMC production sampling (job_type=mcmc).

  • LRDMC_Workflow – Lattice-Regularized DMC (job_type=lrdmc-bra / lrdmc-tau).

  • LRDMC_Ext_Workflow – Multi-alat LRDMC a²→0 extrapolation.

Composition helpers:

  • Workflow – Abstract base for custom workflows.

  • Container – Wraps a workflow in a project directory.

  • FileFrom – Declare a file dependency on another workflow.

  • ValueFrom – Declare a value dependency on another workflow.

  • Launcher – DAG-based parallel workflow executor.

class jqmc_workflow.Container(label='workflow', dirname='workflow', input_files=None, rename_input_files=None, workflow=None)#

Bases: object

Run a Workflow inside a dedicated project directory.

Container is the standard wrapper used with the Launcher. It manages:

  • Directory creation — a self-contained project directory is created under the current working directory.

  • Input file copying — source files (or resolved FileFrom references) are copied into the project dir.

  • State tracking — a workflow_state.toml file records lifecycle status (pending → running → completed).

  • Re-entrance — if the directory already exists with status completed, the workflow is not re-run; outputs are read from the state file instead.

Parameters:
  • label (str) – Human-readable label; also used as the key for dependency resolution in the Launcher.

  • dirname (str) – Directory name to create (relative to CWD).

  • input_files (list[str | FileFrom]) – Files to copy into the project directory before launch. Items may be plain paths or FileFrom objects.

  • rename_input_files (list[str], optional) – If provided (same length as input_files), each copied file is renamed to the corresponding entry.

  • workflow (Workflow) – The inner Workflow instance to execute.

Variables:
  • output_files (list[str]) – Output filenames (populated after launch).

  • output_values (dict) – Scalar results from the inner workflow.

  • status (str) – Current status.

  • project_dir (str) – Absolute path to the project directory.

Examples

Wrap a VMC optimization in its own directory:

enc = Container(
    label="vmc-opt",
    dirname="01_vmc",
    input_files=["hamiltonian_data.h5"],
    workflow=VMC_Workflow(
        server_machine_name="cluster",
        num_opt_steps=10,
        target_error=0.001,
    ),
)
status, files, values = enc.launch()

See also

Launcher

Execute multiple Container objects as a DAG.

FileFrom

Reference an output file from another workflow.

async async_launch()#
launch()#
class jqmc_workflow.FileFrom(label, filename)#

Bases: object

Declare that an input file should come from another workflow’s output.

Used inside Container definitions to express inter-workflow file dependencies. The Launcher resolves these placeholders before a workflow is launched.

Parameters:
  • label (str) – Label of the upstream workflow that produces the file.

  • filename (str) – Filename (basename) to pull from the upstream workflow’s output directory.

Examples

Pass an optimised Hamiltonian from a VMC step to an MCMC step:

Container(
    label="mcmc-run",
    dirname="mcmc",
    input_files=[FileFrom("vmc-opt", "hamiltonian_data_opt_step_9.h5")],
    rename_input_files=["hamiltonian_data.h5"],
    workflow=MCMC_Workflow(...),
)

See also

ValueFrom

Declare a scalar-value dependency.

Launcher

Resolves FileFrom / ValueFrom at launch time.

class jqmc_workflow.Input_Parameters(actual_opt_steps=None, per_input=<factory>)#

Bases: object

Key parameters extracted from a workflow directory.

Parameters are sourced from the TOML input files recorded in workflow_state.toml, with defaults filled from jqmc.jqmc_miscs.cli_parameters.

Each entry in per_input is a dict:

{
    "input_file": "input_1.toml",
    "output_file": "out_1.o",
    "job_type": "vmc",
    "control": { ... all [control] params with defaults ... },
    "<job_type>": { ... all job-type params with defaults ... },
}
Variables:
  • actual_opt_steps (int or None) – For VMC: last completed optimization step stored in restart.h5 (rank_0/driver_config attrs i_opt). None for non-VMC workflows.

  • per_input (list of dict) – Per-input-file parameters. One dict per [[jobs]] entry in workflow_state.toml.

Parameters:
  • actual_opt_steps (int | None)

  • per_input (list)

actual_opt_steps: int | None = None#
per_input: list#
class jqmc_workflow.LRDMC_Diagnostic_Data(survived_walkers_ratio=None, avg_num_projections=None, total_time_sec=None, precompilation_time_sec=None, net_time_sec=None, timing_breakdown=<factory>, energy=None, energy_error=None, atomic_forces=None, hamiltonian_data_file=None, restart_checkpoint=None, num_mpi_processes=None, num_walkers_per_process=None, jax_backend=None, jax_devices=None, stderr_tail='')#

Bases: object

Parse result for an LRDMC calculation.

Variables:
  • survived_walkers_ratio (float or None) – Survived walkers ratio = X % → X / 100.

  • avg_num_projections (float or None) – Average of the number of projections = X.

  • total_time_sec (float or None) – Total GFMC time for N branching steps = X sec.

  • precompilation_time_sec (float or None) – Pre-compilation time for GFMC = X sec.

  • net_time_sec (float or None) – Net GFMC time without pre-compilations = X sec.

  • timing_breakdown (dict) – Per-branching timing breakdown (msec). Keys vary by LRDMC variant, e.g. "projection", "observable", "mpi_barrier", "collection", "reconfiguration", "e_L", "de_L_dR_dr", "update_E_scf", "misc".

  • energy (float or None) – Energy from jqmc-tool post-processing.

  • energy_error (float or None) – Energy error from jqmc-tool post-processing.

  • atomic_forces (list of dict or None) – Per-atom forces from jqmc-tool lrdmc compute-force. Each dict: {label, Fx, Fx_err, Fy, Fy_err, Fz, Fz_err}.

  • hamiltonian_data_file (str or None) – [control] hamiltonian_h5 value from the input TOML.

  • restart_checkpoint (str or None) – Restart file name from Dump restart checkpoint file(s) to X.. None if the line was not found.

  • num_mpi_processes (int or None) – The number of MPI process = N. → N.

  • num_walkers_per_process (int or None) – The number of walkers assigned for each MPI process = N. → N.

  • jax_backend (str or None) – JAX backend = X. → X (e.g. "gpu", "cpu").

  • jax_devices (list or None) – Parsed list of global XLA device strings.

  • stderr_tail (str) – Last portion of stderr (up to 200 lines).
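
The mapping from log lines to fields can be illustrated with a few regular expressions. This is a sketch only: the patterns follow the line templates quoted in the descriptions above, the sample text is made up, and the parser actually used by the package may differ.

```python
import re

# (field name, pattern, converter); patterns mirror the quoted log templates.
RULES = [
    ("survived_walkers_ratio",
     re.compile(r"Survived walkers ratio = ([\d.]+) %"),
     lambda s: float(s) / 100),
    ("avg_num_projections",
     re.compile(r"Average of the number of projections = ([\d.]+)"),
     float),
    ("num_mpi_processes",
     re.compile(r"The number of MPI process = (\d+)"),
     int),
]


def parse_log(text):
    """Return a dict of parsed fields; a field is None when its line is absent."""
    parsed = {}
    for name, pattern, convert in RULES:
        match = pattern.search(text)
        parsed[name] = convert(match.group(1)) if match else None
    return parsed


# Made-up sample output, not taken from a real run.
sample = (
    "The number of MPI process = 4.\n"
    "Survived walkers ratio = 97.5 %\n"
)
parsed = parse_log(sample)
```

Missing lines map to None, matching the "float or None" style of the fields listed above.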

Parameters:
  • survived_walkers_ratio (float | None)

  • avg_num_projections (float | None)

  • total_time_sec (float | None)

  • precompilation_time_sec (float | None)

  • net_time_sec (float | None)

  • timing_breakdown (dict)

  • energy (float | None)

  • energy_error (float | None)

  • atomic_forces (list | None)

  • hamiltonian_data_file (str | None)

  • restart_checkpoint (str | None)

  • num_mpi_processes (int | None)

  • num_walkers_per_process (int | None)

  • jax_backend (str | None)

  • jax_devices (list | None)

  • stderr_tail (str)

atomic_forces: list | None = None#
avg_num_projections: float | None = None#
energy: float | None = None#
energy_error: float | None = None#
hamiltonian_data_file: str | None = None#
jax_backend: str | None = None#
jax_devices: list | None = None#
net_time_sec: float | None = None#
num_mpi_processes: int | None = None#
num_walkers_per_process: int | None = None#
precompilation_time_sec: float | None = None#
restart_checkpoint: str | None = None#
stderr_tail: str = ''#
survived_walkers_ratio: float | None = None#
timing_breakdown: dict#
total_time_sec: float | None = None#
class jqmc_workflow.LRDMC_Ext_Diagnostic_Data(extrapolated_energy=None, extrapolated_energy_error=None, per_alat_results=<factory>, stderr_tail='')#

Bases: object

Parse result for an LRDMC a²→0 extrapolation.

Variables:
  • extrapolated_energy (float or None) – For a -> 0 bohr: E = X +- Y Ha. → X.

  • extrapolated_energy_error (float or None) – Y from the above.

  • per_alat_results (list of dict) – Each dict has {"alat": float, "energy": float, "energy_error": float}.

  • stderr_tail (str) – Last portion of stderr (up to 200 lines).

Parameters:
  • extrapolated_energy (float | None)

  • extrapolated_energy_error (float | None)

  • per_alat_results (list)

  • stderr_tail (str)

extrapolated_energy: float | None = None#
extrapolated_energy_error: float | None = None#
per_alat_results: list#
stderr_tail: str = ''#
class jqmc_workflow.LRDMC_Ext_Workflow(server_machine_name='localhost', alat_list=None, hamiltonian_file='hamiltonian_data.h5', queue_label='default', pilot_queue_label=None, jobname_prefix='jqmc-lrdmc', number_of_walkers=4, max_time=86400, polynomial_order=2, num_gfmc_bin_blocks=5, num_gfmc_warmup_steps=0, num_gfmc_collect_steps=5, time_projection_tau=0.1, target_survived_walkers_ratio=None, num_mcmc_per_measurement=None, non_local_move=None, E_scf=None, atomic_force=None, epsilon_PW=None, mcmc_seed=None, verbosity=None, poll_interval=60, target_error=0.001, pilot_steps=100, num_gfmc_projections=None, max_continuation=5)#

Bases: Workflow

LRDMC a²→0 continuum-limit extrapolation workflow.

Orchestrates multiple LRDMC_Workflow runs at different lattice spacings (alat values), then post-processes with jqmc-tool lrdmc extrapolate-energy to obtain the continuum-limit energy.

Each alat run is wrapped in its own Container and all alat values are executed in parallel. Every alat independently calibrates its own num_mcmc_per_measurement (when target_survived_walkers_ratio is set in GFMC_n mode), runs an error-bar pilot, and then runs production.

Mode selection follows the same rules as LRDMC_Workflow:

  • GFMC_t (default) — set time_projection_tau (default 0.10).

  • GFMC_n — set target_survived_walkers_ratio or num_mcmc_per_measurement.

Parameters:
  • server_machine_name (str) – Target machine name (shared by all sub-runs).

  • alat_list (list[float]) – List of lattice discretization values, e.g. [0.5, 0.4, 0.3].

  • hamiltonian_file (str) – Input hamiltonian_data.h5 (must exist in the parent directory or be resolved by FileFrom).

  • queue_label (str) – Queue/partition label for production runs.

  • pilot_queue_label (str, optional) – Queue/partition label for pilot runs. Defaults to queue_label when None. A shorter queue is often sufficient for the pilot.

  • jobname_prefix (str) – Prefix for each sub-run job name.

  • number_of_walkers (int) – Walkers per MPI process.

  • max_time (int) – Wall-time limit per sub-run (seconds).

  • polynomial_order (int) – Polynomial order for the a²→0 extrapolation (default: 2).

  • num_gfmc_bin_blocks (int) – Binning blocks for post-processing.

  • num_gfmc_warmup_steps (int) – Warmup steps to discard.

  • num_gfmc_collect_steps (int) – Weight-collection steps.

  • time_projection_tau (float, optional) – Imaginary time step for GFMC_t mode (default 0.10). Ignored when target_survived_walkers_ratio or num_mcmc_per_measurement is set.

  • target_survived_walkers_ratio (float, optional) – Target survived-walkers ratio (default None). When set, GFMC_n mode is activated and each alat independently runs a calibration pilot (_pilot_a) to find its own optimal num_mcmc_per_measurement. Leave as None to disable auto-calibration; GFMC_n mode then requires an explicit num_mcmc_per_measurement.

  • num_mcmc_per_measurement (int, optional) – GFMC projections per measurement. When given explicitly, automatic calibration is disabled and this value is used for every alat. Activates GFMC_n mode.

  • non_local_move (str, optional) – Non-local move treatment. Default from jqmc_miscs.

  • E_scf (float, optional) – Initial energy guess for the GFMC shift (GFMC_n only). Default from jqmc_miscs.

  • atomic_force (bool, optional) – Compute atomic forces. Default from jqmc_miscs.

  • epsilon_PW (float, optional) – Pathak–Wagner regularization parameter (Bohr). When > 0, the force estimator is regularized near the nodal surface. Default from jqmc_miscs.

  • mcmc_seed (int, optional) – Random seed for MCMC. Default from jqmc_miscs.

  • verbosity (str, optional) – Verbosity level. Default from jqmc_miscs.

  • poll_interval (int) – Seconds between job-status polls.

  • target_error (float) – Target statistical error (Ha) for each sub-LRDMC run. Passed through to each LRDMC_Workflow.

  • pilot_steps (int) – Pilot measurement steps for target-error estimation.

  • num_gfmc_projections (int, optional) – Fixed number of measurement steps per production run. When set, the error-bar pilot is skipped for each sub-LRDMC and all max_continuation runs are executed unconditionally. Passed through to each LRDMC_Workflow. Default None (automatic mode).

  • max_continuation (int) – Maximum number of production runs per sub-LRDMC.

Examples

GFMC_t mode (default):

wf = LRDMC_Ext_Workflow(
    server_machine_name="cluster",
    alat_list=[0.5, 0.4, 0.3],
    target_error=0.001,
    number_of_walkers=8,
)
status, files, values = wf.launch()
print(values["extrapolated_energy"],
      values["extrapolated_energy_error"])

GFMC_n mode with calibration:

wf = LRDMC_Ext_Workflow(
    server_machine_name="cluster",
    alat_list=[0.5, 0.4, 0.3],
    target_survived_walkers_ratio=0.97,
    target_error=0.001,
    number_of_walkers=8,
)

As part of a Launcher pipeline:

enc = Container(
    label="lrdmc-ext",
    dirname="03_lrdmc",
    input_files=[FileFrom("mcmc-run", "hamiltonian_data.h5")],
    workflow=LRDMC_Ext_Workflow(
        server_machine_name="cluster",
        alat_list=[0.5, 0.4, 0.3],
        target_error=0.001,
    ),
)

Notes

  • At least two alat values are required for extrapolation. With a single value, per-alat results are returned but no extrapolation is performed.

  • Each sub-run directory is named lrdmc_alat_<value>/.
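
The idea of the a²→0 extrapolation can be shown with a deliberately simplified sketch: an unweighted straight-line fit of the energy against a², evaluated at a² = 0. This is not the procedure of jqmc-tool lrdmc extrapolate-energy (which honours polynomial_order and the error bars); the function name and the input energies are made up, and only the per-alat dict layout follows per_alat_results in LRDMC_Ext_Diagnostic_Data.

```python
def extrapolate_linear_in_a2(per_alat_results):
    """Fit E = e0 + c * alat**2 by least squares and return (e0, c)."""
    if len(per_alat_results) < 2:
        raise ValueError("at least two alat values are required")
    xs = [r["alat"] ** 2 for r in per_alat_results]
    ys = [r["energy"] for r in per_alat_results]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Made-up energies lying exactly on E = -1.0 + 0.1 * alat**2.
per_alat_results = [
    {"alat": 0.5, "energy": -0.975, "energy_error": 0.001},
    {"alat": 0.4, "energy": -0.984, "energy_error": 0.001},
    {"alat": 0.3, "energy": -0.991, "energy_error": 0.001},
]
e0, slope = extrapolate_linear_in_a2(per_alat_results)
```

With a single alat value the fit is undefined, which is why the sketch raises instead of extrapolating.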

See also

LRDMC_Workflow

Single-alat LRDMC run.

async async_launch()#

Run LRDMC at each alat, then extrapolate to a²→0.

Every alat value is launched in parallel. Each child LRDMC_Workflow independently handles its own calibration (_pilot_a), error-bar pilot (_pilot_b), and production phase.

Returns:

(status, output_files, output_values)

Return type:

tuple

class jqmc_workflow.LRDMC_Workflow(server_machine_name='localhost', alat=0.3, hamiltonian_file='hamiltonian_data.h5', input_file='input.toml', output_file='out.o', queue_label='default', jobname='jqmc-lrdmc', number_of_walkers=4, max_time=86400, num_gfmc_bin_blocks=5, num_gfmc_warmup_steps=0, num_gfmc_collect_steps=5, time_projection_tau=0.1, target_survived_walkers_ratio=None, num_mcmc_per_measurement=None, non_local_move=None, E_scf=None, atomic_force=None, epsilon_PW=None, mcmc_seed=None, verbosity=None, poll_interval=60, target_error=0.001, pilot_steps=100, num_gfmc_projections=None, pilot_queue_label=None, max_continuation=1)#

Bases: Workflow

Single LRDMC (Lattice-Regularized Diffusion Monte Carlo) run.

Generates a job_type=lrdmc-bra (GFMC_n) or job_type=lrdmc-tau (GFMC_t) input TOML at a fixed lattice spacing alat, submits jqmc, monitors until completion, fetches the checkpoint, and post-processes with jqmc-tool lrdmc compute-energy to extract the DMC energy ± error.

Mode selection (mutually exclusive):

  • GFMC_t (default) — set time_projection_tau (default 0.10). Uses continuous imaginary-time projection. Only the error-bar pilot is run (no calibration phase).

  • GFMC_n — set target_survived_walkers_ratio or num_mcmc_per_measurement. Uses discrete GFMC projections. When target_survived_walkers_ratio is set (and num_mcmc_per_measurement is None), an automatic calibration pilot determines the optimal num_mcmc_per_measurement.
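
The selection rules above can be summarised as a small decision function. select_lrdmc_mode is a hypothetical helper written for this illustration; the real constructor may validate its arguments differently.

```python
def select_lrdmc_mode(time_projection_tau=0.10,
                      target_survived_walkers_ratio=None,
                      num_mcmc_per_measurement=None):
    """Return (mode, job_type, needs_calibration) following the documented rules."""
    if target_survived_walkers_ratio is None and num_mcmc_per_measurement is None:
        # Neither GFMC_n parameter is set, so time_projection_tau applies.
        return "GFMC_t", "lrdmc-tau", False
    # Either GFMC_n parameter switches the mode; tau is then ignored.
    # Calibration is only needed when the projection count is not given.
    needs_calibration = num_mcmc_per_measurement is None
    return "GFMC_n", "lrdmc-bra", needs_calibration


default_mode = select_lrdmc_mode()
calibrated = select_lrdmc_mode(target_survived_walkers_ratio=0.97)
explicit = select_lrdmc_mode(num_mcmc_per_measurement=40)
```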

The workflow supports two operating modes:

Automatic mode (default, num_gfmc_projections=None):

  1. Pilot run (_0) — A short run with pilot_steps measurement steps. The resulting error estimates the steps required for target_error via $\sigma \propto 1/\sqrt{N}$. In GFMC_n mode with calibration, three additional short runs precede this to determine num_mcmc_per_measurement.

  2. Production runs (_1, _2, …) — Continuation runs with the estimated step count. The loop terminates when the error is ≤ target_error or max_continuation is reached.

Fixed-step mode (num_gfmc_projections is set):

The error-bar pilot (_pilot_b) is skipped and target_error is ignored. If calibration is needed (GFMC_n mode with target_survived_walkers_ratio), _pilot_a still runs. Each production run uses exactly num_gfmc_projections measurement steps, and max_continuation runs are executed unconditionally.
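
The difference between the two operating modes comes down to the stopping rule of the production loop, sketched below. run_production and the run_once callback are hypothetical, and the error model in fake_run is made up; real runs submit jobs and post-process checkpoints instead.

```python
def run_production(run_once, max_continuation, target_error=None):
    """Call run_once(index) up to max_continuation times.

    With target_error=None every run is executed (fixed-step behaviour).
    Otherwise the loop stops once the error is at or below target_error.
    """
    errors = []
    for index in range(1, max_continuation + 1):
        error = run_once(index)
        errors.append(error)
        if target_error is not None and error <= target_error:
            break
    return errors


def fake_run(index):
    # Made-up model: the accumulated error shrinks as 1/sqrt(number of runs).
    return 0.002 / index ** 0.5


automatic = run_production(fake_run, max_continuation=5, target_error=0.001)
fixed = run_production(fake_run, max_continuation=3)
```

In the automatic case the loop may end before max_continuation is reached; in the fixed-step case it never does.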

Parameters:
  • server_machine_name (str) – Target machine name.

  • alat (float) – Lattice discretization parameter (bohr).

  • hamiltonian_file (str) – Input hamiltonian_data.h5.

  • input_file (str) – Generated TOML input filename.

  • output_file (str) – Stdout capture filename.

  • queue_label (str) – Queue/partition label.

  • jobname (str) – Scheduler job name.

  • number_of_walkers (int) – Walkers per MPI process.

  • max_time (int) – Wall-time limit (seconds).

  • num_gfmc_bin_blocks (int) – Binning blocks for post-processing.

  • num_gfmc_warmup_steps (int) – Warmup steps to discard in post-processing.

  • num_gfmc_collect_steps (int) – Weight-collection steps for energy post-processing.

  • time_projection_tau (float, optional) – Imaginary time step between projections (bohr) for GFMC_t mode. Default 0.10. Ignored when target_survived_walkers_ratio or num_mcmc_per_measurement is set.

  • target_survived_walkers_ratio (float, optional) – Target survived-walkers ratio for automatic num_mcmc_per_measurement calibration. Setting this activates GFMC_n mode. The pilot phase runs three short calculations at Ne*k*(0.3/alat)² projections (k=2,4,6), fits a linear model to the observed survived-walkers ratio, and picks the value that achieves this target.

  • num_mcmc_per_measurement (int, optional) – GFMC projections per measurement (GFMC_n mode). When given explicitly, the automatic calibration is skipped.

  • non_local_move (str, optional) – Non-local move treatment ("tmove" or "dltmove"). Default from jqmc_miscs.

  • E_scf (float, optional) – Initial energy guess for the GFMC shift (GFMC_n only). Default from jqmc_miscs.

  • atomic_force (bool, optional) – Compute atomic forces. Default from jqmc_miscs.

  • epsilon_PW (float, optional) – Pathak–Wagner regularization parameter (Bohr). When > 0, the force estimator is regularized near the nodal surface. Default from jqmc_miscs.

  • mcmc_seed (int, optional) – Random seed for MCMC. Default from jqmc_miscs.

  • verbosity (str, optional) – Verbosity level. Default from jqmc_miscs.

  • poll_interval (int) – Seconds between job-status polls.

  • target_error (float) – Target statistical error (Ha).

  • pilot_steps (int) – Measurement steps for the pilot estimation run.

  • num_gfmc_projections (int, optional) – Fixed number of measurement steps per production run. When set, the error-bar pilot (_pilot_b) is skipped, target_error is ignored, and all max_continuation production runs are executed unconditionally. Calibration (_pilot_a) still runs when needed (GFMC_n mode with target_survived_walkers_ratio). Default None (automatic mode).

  • pilot_queue_label (str, optional) – Queue label for the pilot run. Defaults to queue_label. Use a shorter/smaller queue for the pilot to save resources.

  • max_continuation (int) – Maximum number of production runs after the pilot.

Examples

GFMC_t mode (default):

wf = LRDMC_Workflow(
    server_machine_name="cluster",
    alat=0.3,
    target_error=0.0005,
    number_of_walkers=8,
)
status, files, values = wf.launch()
print(values["energy"], values["energy_error"])

GFMC_n mode with calibration:

wf = LRDMC_Workflow(
    server_machine_name="cluster",
    alat=0.3,
    target_error=0.0005,
    target_survived_walkers_ratio=0.97,
    number_of_walkers=8,
)
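
The calibration described for target_survived_walkers_ratio (three trial projection counts, a linear fit, then solving for the target) can be sketched as follows. The helper names, the rounding, and the observed ratios are assumptions made for illustration; the package's own fit may differ in detail.

```python
def trial_projection_counts(num_electrons, alat, factors=(2, 4, 6)):
    """Trial values Ne * k * (0.3 / alat)**2 for each k in factors."""
    return [num_electrons * k * (0.3 / alat) ** 2 for k in factors]


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def calibrate(xs, ratios, target_ratio):
    """Solve the fitted line for the projection count that hits target_ratio."""
    slope, intercept = fit_line(xs, ratios)
    return max(1, round((target_ratio - intercept) / slope))


xs = trial_projection_counts(num_electrons=10, alat=0.3)
ratios = [0.99, 0.98, 0.97]  # made-up survived-walkers ratios from the three pilots
chosen = calibrate(xs, ratios, target_ratio=0.975)
```

The sketch assumes the observed ratios actually vary with the projection count; a flat response would make the slope zero and the solve undefined.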

Fixed-step mode (skip error-bar pilot):

wf = LRDMC_Workflow(
    server_machine_name="cluster",
    alat=0.3,
    num_gfmc_projections=500,
    max_continuation=3,
    number_of_walkers=8,
)

As part of a Launcher pipeline:

enc = Container(
    label="lrdmc-a0.30",
    dirname="03_lrdmc",
    input_files=[FileFrom("mcmc-run", "hamiltonian_data.h5")],
    workflow=LRDMC_Workflow(
        server_machine_name="cluster",
        alat=0.3,
        target_error=0.001,
    ),
)

Notes

  • For a²→0 continuum-limit extrapolation, use LRDMC_Ext_Workflow instead.

  • The pilot is skipped on re-entrance if an estimation already exists in workflow_state.toml.

See also

LRDMC_Ext_Workflow

Multi-alat extrapolation wrapper.

MCMC_Workflow

VMC production sampling (job_type=mcmc).

VMC_Workflow

Wavefunction optimisation (job_type=vmc).

async async_launch()#

Run the LRDMC workflow.

Fixed-step mode (num_gfmc_projections is set): The error-bar pilot (_pilot_b) is skipped. Calibration (_pilot_a) still runs if needed. Each production run uses exactly num_gfmc_projections steps and all max_continuation runs are executed unconditionally.

Automatic mode (num_gfmc_projections is None, default):

  1. Calibration pilot (_pilot_a, GFMC_n only) — Three short LRDMC runs to determine num_mcmc_per_measurement.

  2. Error-bar pilot (_pilot_b) — estimates production steps.

  3. Production runs (_1, _2, …) — accumulate statistics until target_error is achieved or max_continuation is reached.

property job_type: str#

Return the jqmc job type string for TOML generation.

class jqmc_workflow.Launcher(workflows=None, log_level='INFO', log_name='jqmc_workflow.log', draw_graph=False)#

Bases: object

DAG-based parallel workflow executor.

Accepts a list of Container objects, automatically infers the dependency graph from FileFrom / ValueFrom references, and executes workflows with true DAG parallelism: as soon as all predecessors of a node complete, that node starts immediately — there is no layer-based grouping.

Parameters:
  • workflows (list[Container]) – Workflows to execute. Labels must be unique.

  • log_level (str) – Logging level ("DEBUG" or "INFO").

  • log_name (str) – Log file name (appended, not overwritten).

  • draw_graph (bool) – If True, render the dependency graph to dependency_graph.png (requires the graphviz Python package).

Raises:

ValueError – If workflow labels are duplicated or a dependency references an undefined workflow label.

Examples

Typical three-stage QMC pipeline:

from jqmc_workflow import (
    Launcher, Container, FileFrom,
    WF_Workflow, VMC_Workflow, MCMC_Workflow,
)

wf = Container(
    label="wf",
    dirname="00_wf",
    input_files=["trexio.h5"],
    workflow=WF_Workflow(trexio_file="trexio.h5"),
)

vmc = Container(
    label="vmc-opt",
    dirname="01_vmc",
    input_files=[FileFrom("wf", "hamiltonian_data.h5")],
    workflow=VMC_Workflow(
        server_machine_name="cluster",
        num_opt_steps=10,
        target_error=0.001,
    ),
)

mcmc = Container(
    label="mcmc-run",
    dirname="02_mcmc",
    input_files=[
        FileFrom("vmc-opt", "hamiltonian_data_opt_step_9.h5")
    ],
    rename_input_files=["hamiltonian_data.h5"],
    workflow=MCMC_Workflow(
        server_machine_name="cluster",
        target_error=0.001,
    ),
)

launcher = Launcher(
    workflows=[wf, vmc, mcmc],
    draw_graph=True,
)
launcher.launch()

Notes

  • The launcher changes the working directory during execution and restores it afterwards.

  • If a workflow fails, all downstream dependents are automatically skipped.

See also

Container

Wraps a workflow in a project directory.

FileFrom

File dependency placeholder.

ValueFrom

Value dependency placeholder.

async async_launch()#

Execute all workflows respecting DAG dependencies.

As soon as ALL predecessors of a node complete, that node starts immediately — no layer-based grouping.
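
The scheduling rule can be illustrated with a compact asyncio sketch in which every node waits only on its own predecessors and dependents of a failed node are skipped. run_dag is a hypothetical stand-in, not the Launcher implementation, and it performs no cycle detection.

```python
import asyncio


async def run_dag(tasks, deps):
    """Run each task as soon as all of its predecessors have finished.

    tasks: {label: async callable returning True on success}
    deps:  {label: set of predecessor labels}
    """
    for label, parents in deps.items():
        unknown = set(parents) - set(tasks)
        if label not in tasks or unknown:
            raise ValueError(f"undefined workflow label in dependencies of {label!r}")
    done = {label: asyncio.Event() for label in tasks}
    results = {}

    async def run_node(label):
        parents = deps.get(label, ())
        for parent in parents:
            await done[parent].wait()
        if all(results[p] == "success" for p in parents):
            ok = await tasks[label]()
            results[label] = "success" if ok else "failed"
        else:
            results[label] = "skipped"  # an upstream node failed or was skipped
        done[label].set()

    await asyncio.gather(*(run_node(label) for label in tasks))
    return results


async def ok():
    await asyncio.sleep(0)
    return True


async def bad():
    return False


tasks = {"wf": ok, "vmc-opt": bad, "mcmc-run": ok, "side": ok}
deps = {"vmc-opt": {"wf"}, "mcmc-run": {"vmc-opt"}}
results = asyncio.run(run_dag(tasks, deps))
```

Because each node holds its own completion event, "side" never waits for the "wf" chain, which is the difference from layer-based grouping.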

launch()#
class jqmc_workflow.MCMC_Diagnostic_Data(acceptance_ratio=None, avg_walker_weight=None, total_time_sec=None, precompilation_time_sec=None, net_time_sec=None, timing_breakdown=<factory>, energy=None, energy_error=None, atomic_forces=None, hamiltonian_data_file=None, restart_checkpoint=None, num_mpi_processes=None, num_walkers_per_process=None, jax_backend=None, jax_devices=None, stderr_tail='')#

Bases: object

Parse result for an MCMC sampling run.

Variables:
  • acceptance_ratio (float or None) – Acceptance ratio is X % → X / 100.

  • avg_walker_weight (float or None) – Average of walker weights is X.

  • total_time_sec (float or None) – Total elapsed time for MCMC N steps. = X sec.

  • precompilation_time_sec (float or None) – Pre-compilation time for MCMC = X sec.

  • net_time_sec (float or None) – Net total time for MCMC = X sec.

  • timing_breakdown (dict) – Per-MCMC-step timing breakdown (msec). Keys: "mcmc_update", "e_L", "de_L_dR_dr", "dln_Psi_dR_dr", "dln_Psi_dc", "de_L_dc", "mpi_barrier", "misc".

  • energy (float or None) – Energy from jqmc-tool post-processing.

  • energy_error (float or None) – Energy error from jqmc-tool post-processing.

  • atomic_forces (list of dict or None) – Per-atom forces from jqmc-tool mcmc compute-force. Each dict: {label, Fx, Fx_err, Fy, Fy_err, Fz, Fz_err}.

  • hamiltonian_data_file (str or None) – [control] hamiltonian_h5 value from the input TOML.

  • restart_checkpoint (str or None) – Restart file name from Dump restart checkpoint file(s) to X.. None if the line was not found.

  • num_mpi_processes (int or None) – The number of MPI process = N. → N.

  • num_walkers_per_process (int or None) – The number of walkers assigned for each MPI process = N. → N.

  • jax_backend (str or None) – JAX backend = X. → X (e.g. "gpu", "cpu").

  • jax_devices (list or None) – Parsed list of global XLA device strings.

  • stderr_tail (str) – Last portion of stderr (up to 200 lines).

Parameters:
  • acceptance_ratio (float | None)

  • avg_walker_weight (float | None)

  • total_time_sec (float | None)

  • precompilation_time_sec (float | None)

  • net_time_sec (float | None)

  • timing_breakdown (dict)

  • energy (float | None)

  • energy_error (float | None)

  • atomic_forces (list | None)

  • hamiltonian_data_file (str | None)

  • restart_checkpoint (str | None)

  • num_mpi_processes (int | None)

  • num_walkers_per_process (int | None)

  • jax_backend (str | None)

  • jax_devices (list | None)

  • stderr_tail (str)

acceptance_ratio: float | None = None#
atomic_forces: list | None = None#
avg_walker_weight: float | None = None#
energy: float | None = None#
energy_error: float | None = None#
hamiltonian_data_file: str | None = None#
jax_backend: str | None = None#
jax_devices: list | None = None#
net_time_sec: float | None = None#
num_mpi_processes: int | None = None#
num_walkers_per_process: int | None = None#
precompilation_time_sec: float | None = None#
restart_checkpoint: str | None = None#
stderr_tail: str = ''#
timing_breakdown: dict#
total_time_sec: float | None = None#
class jqmc_workflow.MCMC_Workflow(server_machine_name='localhost', hamiltonian_file='hamiltonian_data.h5', input_file='input.toml', output_file='out.o', queue_label='default', jobname='jqmc-mcmc', number_of_walkers=4, max_time=86400, num_mcmc_bin_blocks=1, num_mcmc_warmup_steps=0, Dt=None, epsilon_AS=None, num_mcmc_per_measurement=None, atomic_force=None, parameter_derivatives=None, mcmc_seed=None, verbosity=None, poll_interval=60, target_error=0.001, num_mcmc_steps=None, pilot_steps=100, pilot_queue_label=None, max_continuation=1)#

Bases: Workflow

MCMC (VMC production-run / sampling) workflow.

Generates a job_type=mcmc input TOML, submits jqmc on a remote or local machine, monitors until completion, fetches results, and post-processes the checkpoint with jqmc-tool mcmc compute-energy to obtain the VMC energy ± error.

The workflow supports two modes:

Automatic mode (default, num_mcmc_steps=None):

  1. Pilot run (_0) — A short MCMC run with pilot_steps measurement steps. The resulting statistical error is used to estimate the total steps required for target_error via the CLT scaling $\sigma \propto 1/\sqrt{N}$.

  2. Production runs (_1, _2, …) — Continuation runs with the estimated step count. After each run, the checkpoint is post-processed; if the error is at or below target_error the loop terminates. At most max_continuation production runs are attempted.

Fixed-step mode (num_mcmc_steps is set):

The pilot run is skipped entirely and target_error is ignored. Each production run uses exactly num_mcmc_steps measurement steps, and max_continuation runs are executed unconditionally.
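The step estimate in automatic mode follows from the $\sigma \propto 1/\sqrt{N}$ scaling. The sketch below is only an illustration of that arithmetic: the function name estimate_required_steps is hypothetical, and the estimator inside MCMC_Workflow may treat warmup steps or rounding differently.

```python
import math


def estimate_required_steps(pilot_steps: int, pilot_error: float, target_error: float) -> int:
    """Scale the pilot step count to reach target_error.

    Assumes the statistical error behaves as sigma ~ 1/sqrt(N), so
    N_required = N_pilot * (sigma_pilot / sigma_target) ** 2.
    """
    if pilot_steps <= 0 or pilot_error <= 0 or target_error <= 0:
        raise ValueError("pilot_steps, pilot_error and target_error must be positive")
    ratio = pilot_error / target_error
    # Round up so the estimate never undershoots the target.
    return math.ceil(pilot_steps * ratio**2)


# A 100-step pilot with an error four times the target needs 16x the steps.
required = estimate_required_steps(pilot_steps=100, pilot_error=0.004, target_error=0.001)
print(required)
```

Halving the target error quadruples the required steps, which is why a cheap pilot on a small queue (pilot_queue_label) can be worthwhile before committing to a long production run.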

Parameters:
  • server_machine_name (str) – Name of the target machine (configured in ~/.jqmc_setting/).

  • hamiltonian_file (str) – Input hamiltonian_data.h5.

  • input_file (str) – Generated TOML input filename.

  • output_file (str) – Stdout capture filename.

  • queue_label (str) – Queue/partition label.

  • jobname (str) – Scheduler job name.

  • number_of_walkers (int) – Walkers per MPI process.

  • max_time (int) – Wall-time limit (seconds).

  • num_mcmc_bin_blocks (int) – Binning blocks for post-processing.

  • num_mcmc_warmup_steps (int) – Warmup steps to discard in post-processing.

  • Dt (float, optional) – MCMC step size (bohr). Default from jqmc_miscs.

  • epsilon_AS (float, optional) – Attacalite-Sorella regularization parameter. Default from jqmc_miscs.

  • num_mcmc_per_measurement (int, optional) – MCMC updates per measurement. Default from jqmc_miscs.

  • atomic_force (bool, optional) – Compute atomic forces. Default from jqmc_miscs.

  • parameter_derivatives (bool, optional) – Compute parameter derivatives. Default from jqmc_miscs.

  • mcmc_seed (int, optional) – Random seed for MCMC. Default from jqmc_miscs.

  • verbosity (str, optional) – Verbosity level. Default from jqmc_miscs.

  • poll_interval (int) – Seconds between job-status polls.

  • target_error (float) – Target statistical error (Ha). Ignored when num_mcmc_steps is set.

  • num_mcmc_steps (int, optional) – Fixed number of measurement steps per production run. When set, the pilot run is skipped and target_error is ignored; each of the max_continuation production runs uses exactly this many steps.

  • pilot_steps (int) – Measurement steps for the pilot estimation run. Ignored when num_mcmc_steps is set.

  • pilot_queue_label (str, optional) – Queue label for the pilot run. Defaults to queue_label. Use a shorter/smaller queue for the pilot to save resources.

  • max_continuation (int) – Maximum number of production runs after the pilot.

Examples

Standalone launch (automatic mode):

wf = MCMC_Workflow(
    server_machine_name="cluster",
    target_error=0.0005,
    pilot_steps=200,
    number_of_walkers=8,
)
status, files, values = wf.launch()
print(values["energy"], values["energy_error"])

Fixed-step mode (no pilot, no target_error check):

wf = MCMC_Workflow(
    server_machine_name="cluster",
    num_mcmc_steps=5000,
    number_of_walkers=8,
    max_continuation=3,
)
status, files, values = wf.launch()

As part of a Launcher pipeline:

enc = Container(
    label="mcmc",
    dirname="02_mcmc",
    input_files=[FileFrom("vmc-opt", "hamiltonian_data_opt_step_9.h5")],
    rename_input_files=["hamiltonian_data.h5"],
    workflow=MCMC_Workflow(
        server_machine_name="cluster",
        target_error=0.001,
    ),
)

Notes

  • The pilot run is skipped on re-entrance if an estimation already exists in workflow_state.toml.

  • Continuation runs restart from the most recent .h5 checkpoint file.

See also

VMC_Workflow

Wavefunction optimisation (job_type=vmc).

LRDMC_Workflow

Diffusion Monte Carlo (job_type=lrdmc-bra / lrdmc-tau).

async async_launch()#

Run the MCMC workflow.

Fixed-step mode (num_mcmc_steps is set): The pilot run is skipped. Each production run uses exactly num_mcmc_steps steps and all max_continuation runs are executed unconditionally.

Automatic mode (num_mcmc_steps is None, default):

  1. Pilot run in _pilot/ subdirectory estimates required steps (skipped on continuation). May use a different queue from production (pilot_queue_label).

  2. Production runs (_1, _2, …) start from scratch and accumulate statistics until target_error is achieved or max_continuation is reached.

class jqmc_workflow.VMC_Diagnostic_Data(steps=<factory>, total_opt_steps=None, total_opt_time_sec=None, opt_timing_breakdown=<factory>, optimized_hamiltonian=None, restart_checkpoint=None, num_mpi_processes=None, num_walkers_per_process=None, jax_backend=None, jax_devices=None, stderr_tail='')#

Bases: object

Aggregated parse result for an entire VMC optimization.

Variables:
  • steps (list of VMC_Step_Data) – Per-step data in chronological order.

  • total_opt_steps (int or None) – Total optimization steps (Optimization step = N/M → M).

  • total_opt_time_sec (float or None) – Total elapsed time for optimization N steps. = X sec.

  • opt_timing_breakdown (dict) – Per-optimization-step timing breakdown (sec). Keys: "mcmc_run", "get_E", "get_gF", "optimizer", "param_update", "mpi_barrier", "misc".

  • optimized_hamiltonian (str or None) – Path to the last hamiltonian_data_opt_step_*.h5 file found.

  • restart_checkpoint (str or None) – Restart file name parsed from the log line Dump restart checkpoint file(s) to X. It is None if the line was not found (indicates abnormal termination).

  • num_mpi_processes (int or None) – The number of MPI process = N. → N.

  • num_walkers_per_process (int or None) – The number of walkers assigned for each MPI process = N. → N.

  • jax_backend (str or None) – JAX backend = X. → X (e.g. "gpu", "cpu"). Set to "cpu" when the log says Running on CPUs or single GPU.

  • jax_devices (list or None) – Parsed list of global XLA device strings from the *** XLA Global devices recognized by JAX*** line, e.g. ["CudaDevice(id=0)", "CudaDevice(id=1)"].

  • stderr_tail (str) – Last portion of stderr (up to 200 lines).

Parameters:
  • steps (list)

  • total_opt_steps (int | None)

  • total_opt_time_sec (float | None)

  • opt_timing_breakdown (dict)

  • optimized_hamiltonian (str | None)

  • restart_checkpoint (str | None)

  • num_mpi_processes (int | None)

  • num_walkers_per_process (int | None)

  • jax_backend (str | None)

  • jax_devices (list | None)

  • stderr_tail (str)

jax_backend: str | None = None#
jax_devices: list | None = None#
num_mpi_processes: int | None = None#
num_walkers_per_process: int | None = None#
opt_timing_breakdown: dict#
optimized_hamiltonian: str | None = None#
restart_checkpoint: str | None = None#
stderr_tail: str = ''#
steps: list#
total_opt_steps: int | None = None#
total_opt_time_sec: float | None = None#
class jqmc_workflow.VMC_Step_Data(step, energy=None, energy_error=None, max_force=None, max_force_error=None, signal_to_noise_ratio=None, avg_walker_weight=None, acceptance_ratio=None, total_time_sec=None, precompilation_time_sec=None, net_time_sec=None, timing_breakdown=<factory>)#

Bases: object

Data for one VMC optimization step.

Variables:
  • step (int) – Optimization step number (Optimization step = N/M → N).

  • energy (float or None) – Total energy E = X +- Y → X (Ha).

  • energy_error (float or None) – Energy statistical error → Y (Ha).

  • max_force (float or None) – Maximum force Max f = X +- Y → X (Ha/a.u.).

  • max_force_error (float or None) – Force error → Y (Ha/a.u.).

  • signal_to_noise_ratio (float or None) – Max of signal-to-noise of f = max(|f|/|std f|) = X.

  • avg_walker_weight (float or None) – Average of walker weights is X.

  • acceptance_ratio (float or None) – Acceptance ratio is X % → X / 100.

  • total_time_sec (float or None) – Total elapsed time for MCMC N steps. = X sec.

  • precompilation_time_sec (float or None) – Pre-compilation time for MCMC = X sec.

  • net_time_sec (float or None) – Net total time for MCMC = X sec.

  • timing_breakdown (dict) – Per-MCMC-step timing breakdown (msec). Keys match the jQMC log lines, e.g. "mcmc_update", "e_L", etc.

Parameters:
  • step (int)

  • energy (float | None)

  • energy_error (float | None)

  • max_force (float | None)

  • max_force_error (float | None)

  • signal_to_noise_ratio (float | None)

  • avg_walker_weight (float | None)

  • acceptance_ratio (float | None)

  • total_time_sec (float | None)

  • precompilation_time_sec (float | None)

  • net_time_sec (float | None)

  • timing_breakdown (dict)

acceptance_ratio: float | None = None#
avg_walker_weight: float | None = None#
energy: float | None = None#
energy_error: float | None = None#
max_force: float | None = None#
max_force_error: float | None = None#
net_time_sec: float | None = None#
precompilation_time_sec: float | None = None#
signal_to_noise_ratio: float | None = None#
step: int#
timing_breakdown: dict#
total_time_sec: float | None = None#
class jqmc_workflow.VMC_Workflow(server_machine_name='localhost', num_opt_steps=20, hamiltonian_file='hamiltonian_data.h5', input_file='input.toml', output_file='out.o', queue_label='default', jobname='jqmc-vmc', number_of_walkers=4, max_time=86400, Dt=None, epsilon_AS=None, num_mcmc_per_measurement=None, num_mcmc_warmup_steps=None, num_mcmc_bin_blocks=None, wf_dump_freq=None, opt_J1_param=None, opt_J2_param=None, opt_J3_param=None, opt_JNN_param=None, opt_lambda_param=None, opt_with_projected_MOs=None, num_param_opt=None, optimizer_kwargs=None, mcmc_seed=None, verbosity=None, poll_interval=60, target_error=0.001, pilot_mcmc_steps=50, pilot_vmc_steps=5, pilot_queue_label=None, max_continuation=1, target_snr=4.5)#

Bases: Workflow

VMC (Variational Monte Carlo) Jastrow / orbital optimisation workflow.

Generates a job_type=vmc input TOML, submits jqmc, monitors until completion, and collects the optimised hamiltonian_data_opt_step_N.h5 files and checkpoint.

The workflow operates in two phases:

  1. Pilot VMC run (_0) — Runs a short optimisation with pilot_vmc_steps optimisation steps and pilot_mcmc_steps MCMC steps per step. The statistical error of the last optimisation step is used to estimate the MCMC steps per opt-step required to achieve target_error via $\sigma \propto 1/\sqrt{N}$.

  2. Production VMC runs (_1, _2, …) — Full optimisation with num_opt_steps and the estimated MCMC steps per step. If a run is interrupted by the wall-time limit, the next continuation restarts from the checkpoint. At most max_continuation runs are attempted.

Parameters:
  • server_machine_name (str) – Name of the target machine (must be configured in ~/.jqmc_setting/).

  • num_opt_steps (int) – Number of optimization iterations for production runs.

  • hamiltonian_file (str) – Input hamiltonian_data.h5.

  • input_file (str) – Name of the generated TOML input file.

  • output_file (str) – Name of the stdout capture file.

  • queue_label (str) – Queue/partition label from queue_data.toml.

  • jobname (str) – Job name for the scheduler.

  • number_of_walkers (int) – Walkers per MPI process.

  • max_time (int) – Wall-time limit in seconds.

  • Dt (float, optional) – MCMC step size (bohr). Default from jqmc_miscs.

  • epsilon_AS (float, optional) – Attacalite-Sorella regularization parameter. Default from jqmc_miscs.

  • num_mcmc_per_measurement (int, optional) – MCMC updates per measurement. Default from jqmc_miscs.

  • num_mcmc_warmup_steps (int, optional) – Warmup measurement steps to discard. Default from jqmc_miscs.

  • num_mcmc_bin_blocks (int, optional) – Binning blocks. Default from jqmc_miscs.

  • wf_dump_freq (int, optional) – Wavefunction dump frequency. Default from jqmc_miscs.

  • opt_J1_param (bool, optional) – Optimize J1 Jastrow parameters. Default from jqmc_miscs.

  • opt_J2_param (bool, optional) – Optimize J2 Jastrow parameters. Default from jqmc_miscs.

  • opt_J3_param (bool, optional) – Optimize J3 Jastrow parameters. Default from jqmc_miscs.

  • opt_JNN_param (bool, optional) – Optimize neural-network Jastrow parameters. Default from jqmc_miscs.

  • opt_lambda_param (bool, optional) – Optimize lambda (geminal) parameters. Default from jqmc_miscs.

  • opt_with_projected_MOs (bool, optional) – Optimize in a restricted MO space. Default from jqmc_miscs.

  • num_param_opt (int, optional) – Number of parameters to optimize (0 = all). Default from jqmc_miscs.

  • optimizer_kwargs (dict, optional) – Optimizer configuration dict. Default from jqmc_miscs.

  • mcmc_seed (int, optional) – Random seed for MCMC. Default from jqmc_miscs.

  • verbosity (str, optional) – Verbosity level. Default from jqmc_miscs.

  • poll_interval (int) – Seconds between job-status polls.

  • target_error (float) – Target statistical error (Ha) per optimization step.

  • pilot_mcmc_steps (int) – MCMC steps per opt-step for the pilot run.

  • pilot_vmc_steps (int) – Number of optimization steps in the pilot run (small; just enough to estimate the error bar).

  • pilot_queue_label (str, optional) – Queue label for the pilot run. Defaults to queue_label. Use a shorter/smaller queue for the pilot to save resources.

  • max_continuation (int) – Maximum number of production runs after the pilot.

  • target_snr (float) – Target signal-to-noise ratio max(|f|/|std f|) for force convergence. The workflow continues until the last optimization step’s S/N drops to or below this threshold.

Examples

Standalone launch:

wf = VMC_Workflow(
    server_machine_name="cluster",
    num_opt_steps=20,
    target_error=0.001,
    pilot_mcmc_steps=50,
    pilot_vmc_steps=5,
    number_of_walkers=8,
)
status, files, values = wf.launch()
print(values["optimized_hamiltonian"])

As part of a Launcher pipeline:

enc = Container(
    label="vmc",
    dirname="01_vmc",
    input_files=[FileFrom("wf", "hamiltonian_data.h5")],
    workflow=VMC_Workflow(
        server_machine_name="cluster",
        num_opt_steps=20,
        target_error=0.001,
    ),
)

Notes

  • The pilot uses a small number of opt steps (pilot_vmc_steps) just to estimate the error. The real optimisation happens in production runs with the full num_opt_steps.

  • The estimation is stored in workflow_state.toml under [estimation]; on re-entrance the pilot is skipped.

See also

MCMC_Workflow

VMC production sampling (job_type=mcmc).

LRDMC_Workflow

Diffusion Monte Carlo (job_type=lrdmc-bra / lrdmc-tau).

WF_Workflow

TREXIO → hamiltonian_data conversion.

async async_launch()#

Run the VMC optimization workflow with automatic step estimation.

  1. Pilot VMC run in _pilot/ with pilot_vmc_steps opt steps and pilot_mcmc_steps MCMC steps to estimate the required MCMC steps per opt step (skipped on continuation). May use a different queue (pilot_queue_label).

  2. Production VMC runs (_1, _2, …) start from scratch with the full num_opt_steps and estimated MCMC steps until all optimization steps complete or max_continuation is reached.

class jqmc_workflow.ValueFrom(label, key)#

Bases: object

Declare that a parameter value should come from another workflow’s output.

Used when a downstream workflow needs a scalar result (energy, error, filename string, etc.) produced by an upstream workflow. The Launcher resolves these placeholders before launch.

Parameters:
  • label (str) – Label of the upstream workflow that produces the value.

  • key (str) – Key name in the upstream workflow’s output_values dict.

Examples

Feed the MCMC energy into an LRDMC workflow as trial_energy:

LRDMC_Workflow(
    trial_energy=ValueFrom("mcmc-run", "energy"),
    ...
)

See also

FileFrom

Declare a file dependency.

Launcher

Resolves FileFrom / ValueFrom at launch time.
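The placeholder idea can be sketched without the package itself. The code below is a hypothetical illustration, not the Launcher's implementation: ValueRef is a stand-in for ValueFrom, and resolve_params, the parameter names, and the shape of finished_outputs are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValueRef:
    """Hypothetical stand-in for ValueFrom: (upstream label, output key)."""

    label: str
    key: str


def resolve_params(params: dict, finished_outputs: dict) -> dict:
    """Replace every ValueRef in params with the upstream value it names.

    finished_outputs maps a workflow label to that workflow's output values.
    """
    resolved = {}
    for name, value in params.items():
        if isinstance(value, ValueRef):
            try:
                resolved[name] = finished_outputs[value.label][value.key]
            except KeyError as exc:
                raise KeyError(f"unresolved reference {value.label!r}/{value.key!r}") from exc
        else:
            resolved[name] = value  # plain values pass through unchanged
    return resolved


upstream = {"mcmc-run": {"energy": -1.17, "energy_error": 0.001}}
params = {"trial_energy": ValueRef("mcmc-run", "energy"), "poll_interval": 60}
resolved = resolve_params(params, upstream)
print(resolved)
```

Resolving before launch means the downstream workflow only ever sees concrete values, so a missing upstream key fails early instead of midway through a job.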

class jqmc_workflow.WF_Workflow(trexio_file='trexio.h5', hamiltonian_file='hamiltonian_data.h5', j1_parameter=None, j1_type=None, j2_parameter=None, j2_type=None, j3_basis_type=None, j_nn_type=None, j_nn_params=None, ao_conv_to=None)#

Bases: Workflow

Convert a TREXIO file to hamiltonian_data.h5.

Calls jqmc-tool trexio convert-to under the hood.

Parameters:
  • trexio_file (str) – Path to the input TREXIO .h5 file.

  • hamiltonian_file (str) – Output filename (default: "hamiltonian_data.h5").

  • j1_parameter (float, optional) – Jastrow one-body parameter (-j1).

  • j1_type (str, optional) – Jastrow one-body functional form (--jastrow-1b-type). "exp" (default) or "pade".

  • j2_parameter (float, optional) – Jastrow two-body parameter (-j2).

  • j2_type (str, optional) – Jastrow two-body functional form (--jastrow-2b-type). "pade" (default) or "exp".

  • j3_basis_type (str, optional) – Jastrow three-body basis-set type (-j3). One of "ao", "ao-full", "ao-small", "ao-medium", "ao-large", "mo", "none", or None (disabled).

  • j_nn_type (str, optional) – Neural-network Jastrow type (-j-nn-type), e.g. "schnet".

  • j_nn_params (list[str], optional) – Extra NN Jastrow parameters (-jp key=value).

  • ao_conv_to (str, optional) – Convert AOs after building the Hamiltonian (--ao-conv-to). "cart" → convert to Cartesian AOs, "sphe" → convert to spherical-harmonic AOs, None → keep the original representation.

Example

>>> wf = WF_Workflow(
...     trexio_file="molecular.h5",
...     j1_parameter=1.0,
...     j1_type="pade",
...     j2_parameter=0.5,
...     j2_type="exp",
...     j3_basis_type="ao-small",
... )
>>> status, out_files, out_values = wf.launch()

Notes

This workflow runs locally — no remote job submission is involved. It calls jqmc-tool trexio convert-to via subprocess.run().

See also

VMC_Workflow

Optimise the wavefunction produced by this step.

async async_launch()#

Run the TREXIO→hamiltonian conversion (locally).

Returns:

(status, output_files, output_values)

Return type:

tuple

class jqmc_workflow.Workflow(project_dir=None)#

Bases: object

Abstract base class for all jQMC computation workflows.

Every concrete workflow (VMC, MCMC, LRDMC, WF, …) inherits from this class and overrides async_launch().

Parameters:

project_dir (str, optional) – Absolute path to the working directory for this workflow. When None (the default), project_dir is set to the process CWD at the time async_launch() is first called. Container sets this explicitly before launching the inner workflow.

Variables:
  • status (str) – Current lifecycle status ("init", "success", "failed").

  • output_files (list[str]) – Filenames produced by the workflow (populated after launch).

  • output_values (dict) – Scalar results (energy, error, …) produced by the workflow.

  • project_dir (str or None) – Working directory for file I/O. Resolved to an absolute path.

Notes

Subclass contract:

  • Override async_launch() and return (status, output_files, output_values).

  • Call super().__init__() in your constructor.

Examples

Minimal custom workflow:

class MyWorkflow(Workflow):
    async def async_launch(self):
        # ... do work ...
        self.status = "success"
        return self.status, ["result.h5"], {"energy": -1.23}
async async_collect()#

Collect and return results from completed jobs.

Override in subclass. The default raises NotImplementedError.

Returns:

Workflow-specific result mapping.

Return type:

dict

async async_launch()#

Override in subclass. Must return (status, output_files, output_values).

async async_poll()#

Check whether submitted jobs have completed.

Override in subclass. The default raises NotImplementedError.

Returns:

One of "running", "completed", "failed".

Return type:

str

async async_submit()#

Submit initial job(s) and return tracking information.

Override in subclass. The default raises NotImplementedError.

Returns:

At minimum {"status": "submitted"}.

Return type:

dict
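One way the three hooks async_submit(), async_poll() and async_collect() could be driven is sketched below. FakeJobWorkflow and drive are hypothetical names; the fake class only mimics the documented return values and does not inherit from Workflow, and the real package may sequence these calls differently.

```python
import asyncio


class FakeJobWorkflow:
    """Hypothetical stand-in that mimics the three documented hooks."""

    def __init__(self, polls_until_done: int = 2):
        self._remaining = polls_until_done

    async def async_submit(self) -> dict:
        return {"status": "submitted"}

    async def async_poll(self) -> str:
        # Report "running" a fixed number of times, then "completed".
        if self._remaining > 0:
            self._remaining -= 1
            return "running"
        return "completed"

    async def async_collect(self) -> dict:
        return {"energy": -1.23}


async def drive(workflow, poll_interval: float = 0.01) -> dict:
    """Submit, poll until the job leaves the running state, then collect."""
    await workflow.async_submit()
    while (state := await workflow.async_poll()) == "running":
        await asyncio.sleep(poll_interval)
    if state != "completed":
        raise RuntimeError(f"job ended in state {state!r}")
    return await workflow.async_collect()


result = asyncio.run(drive(FakeJobWorkflow()))
print(result)
```

Because the loop awaits asyncio.sleep() between polls, several workflows can wait on their jobs concurrently in one event loop.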

launch()#
jqmc_workflow.get_all_workflow_statuses(base_dir)#

Recursively find all workflow_state.toml files under base_dir.

Returns a list of dicts, each containing:

  • directory – absolute path to the workflow directory

  • label – workflow label (from [workflow])

  • type – workflow type (e.g. "vmc")

  • status – current workflow status

Directories without a workflow_state.toml are silently skipped.

Parameters:

base_dir (str)

Return type:

list
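A minimal sketch of such a scan, using only os.walk and a TOML parser, is shown below. It is not the package's implementation: collect_statuses is a hypothetical name, and reading type and status from the [workflow] table alongside label is an assumption. The directory names and TOML content in the demo are made up.

```python
import os
import tempfile

try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport with the same API


def collect_statuses(base_dir: str) -> list:
    """Walk base_dir and summarise every workflow_state.toml found."""
    results = []
    for root, _dirs, files in os.walk(base_dir):
        if "workflow_state.toml" not in files:
            continue  # directories without a state file are skipped
        with open(os.path.join(root, "workflow_state.toml"), "rb") as fh:
            state = tomllib.load(fh)
        wf = state.get("workflow", {})
        results.append(
            {
                "directory": os.path.abspath(root),
                "label": wf.get("label"),
                "type": wf.get("type"),
                "status": wf.get("status"),
            }
        )
    return results


# Demo on a throwaway tree: one workflow directory, one unrelated directory.
with tempfile.TemporaryDirectory() as base:
    run_dir = os.path.join(base, "01_vmc")
    os.makedirs(run_dir)
    os.makedirs(os.path.join(base, "scratch"))
    with open(os.path.join(run_dir, "workflow_state.toml"), "w") as fh:
        fh.write('[workflow]\nlabel = "vmc"\ntype = "vmc"\nstatus = "success"\n')
    statuses = collect_statuses(base)
    print(statuses)
```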

jqmc_workflow.get_workflow_summary(directory)#

Return a comprehensive summary of the workflow in directory.

The returned dict contains:

  • workflow – label, type, status, timestamps

  • result – any stored results (energy, etc.)

  • estimation – step-estimation data (if present)

  • jobs – list of job records with their statuses

  • num_jobs – total number of job records

Returns an empty dict if no workflow_state.toml is found.

Parameters:

directory (str)

Return type:

dict

jqmc_workflow.parse_input_params(work_dir)#

Extract key parameters from a workflow directory, per input file.

For each [[jobs]] entry in workflow_state.toml, the corresponding TOML input file is loaded and merged with the default values defined in jqmc.jqmc_miscs.cli_parameters. The result is a list of per-input dicts stored in Input_Parameters.per_input.

actual_opt_steps is read from restart.h5 when available (VMC only).

Parameters:

work_dir (str) – Path to the workflow working directory.

Returns:

Structured parameter data with per-input detail.

Return type:

Input_Parameters

jqmc_workflow.parse_lrdmc_ext_output(work_dir)#

Parse LRDMC extrapolation output from work_dir.

Looks for the line "For a -> 0 bohr: E = ..." in the stdout of the extrapolation step.

Parameters:

work_dir (str) – Path to the LRDMC extrapolation working directory.

Returns:

Structured parse result.

Return type:

LRDMC_Ext_Diagnostic_Data

jqmc_workflow.parse_lrdmc_output(work_dir)#

Parse LRDMC calculation output from work_dir.

Extracts the survived-walkers ratio, the average number of projections, and the net GFMC time from stdout. Energy and error come from the result section of workflow_state.toml.

Parameters:

work_dir (str) – Path to the LRDMC working directory.

Returns:

Structured parse result.

Return type:

LRDMC_Diagnostic_Data

jqmc_workflow.parse_mcmc_output(work_dir)#

Parse MCMC sampling output from work_dir.

Extracts acceptance ratio, walker weights, and net time from stdout. Energy/error are extracted from the workflow_state.toml result section (populated by jqmc-tool post-processing) or from stdout if jqmc-tool mcmc compute-energy output is present.

Parameters:

work_dir (str) – Path to the MCMC working directory.

Returns:

Structured parse result.

Return type:

MCMC_Diagnostic_Data

jqmc_workflow.parse_vmc_output(work_dir)#

Parse VMC optimization output from work_dir.

Discovers output files (out_vmc, out_vmc_0, etc.) in the directory, parses per-step data, and looks for hamiltonian_data_opt_step_*.h5.

Parameters:

work_dir (str) – Path to the VMC working directory.

Returns:

Structured parse result containing per-step data and metadata.

Return type:

VMC_Diagnostic_Data
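The per-step fields listed for VMC_Step_Data are tied to log patterns such as Optimization step = N/M, E = X +- Y and Acceptance ratio is X %. A rough sketch of pulling those three values out with regular expressions follows. The regexes, the function name parse_step_lines and the sample log lines are assumptions for illustration; the real jQMC log layout and the parser inside parse_vmc_output may differ.

```python
import re

NUMBER = r"[-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?"
STEP_RE = re.compile(r"Optimization step\s*=\s*(\d+)\s*/\s*(\d+)")
ENERGY_RE = re.compile(rf"\bE\s*=\s*({NUMBER})\s*\+-\s*({NUMBER})")
ACCEPT_RE = re.compile(rf"Acceptance ratio is\s*({NUMBER})\s*%")


def parse_step_lines(lines) -> dict:
    """Collect step number, energy, error and acceptance ratio from log lines."""
    record = {"step": None, "energy": None, "energy_error": None, "acceptance_ratio": None}
    for line in lines:
        if m := STEP_RE.search(line):
            record["step"] = int(m.group(1))  # N in "N/M"
        elif m := ENERGY_RE.search(line):
            record["energy"] = float(m.group(1))
            record["energy_error"] = float(m.group(2))
        elif m := ACCEPT_RE.search(line):
            record["acceptance_ratio"] = float(m.group(1)) / 100.0  # percent to fraction
    return record


# Made-up log excerpt in the shape suggested by the field descriptions.
sample = [
    "Optimization step = 3/20",
    "E = -1.1745 +- 0.0012 Ha",
    "Acceptance ratio is 51.25 %",
]
step_record = parse_step_lines(sample)
print(step_record)
```

Fields whose line never appears stay None, matching the "float or None" typing of the diagnostic dataclasses.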

jqmc_workflow.repair_forces_from_output(work_dir)#

Re-parse forces from out_*.o files and update workflow_state.toml.

This repairs corrupted force data caused by the pre-fix parse_ufloat_short that ignored scientific notation (e.g. +3(8)e-05 was parsed as 3.0 instead of 3e-05).

Returns True if the TOML was updated, False otherwise.

Parameters:

work_dir (str)

Return type:

bool
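The bug described above comes down to dropping the exponent when reading the short value(error) notation. The sketch below shows one way to parse that notation with the exponent applied to both value and error. It is an independent illustration, not the package's parse_ufloat_short; the name parse_short_uncertainty and the accepted grammar are assumptions.

```python
import re

SHORT_RE = re.compile(r"^([+-]?)(\d+)(?:\.(\d+))?\((\d+)\)(?:[eE]([+-]?\d+))?$")


def parse_short_uncertainty(text: str):
    """Parse strings like '-1.234(5)' or '+3(8)e-05' into (value, error)."""
    m = SHORT_RE.match(text.strip())
    if m is None:
        raise ValueError(f"not in value(error) notation: {text!r}")
    sign, int_part, frac_part, err_digits, exponent = m.groups()
    frac_part = frac_part or ""
    exp = int(exponent) if exponent else 0
    # Build decimal strings and let float() handle the rounding.
    value = float(f"{sign}{int_part}.{frac_part or '0'}e{exp}")
    # The error digits apply to the last written digits of the value,
    # so they are shifted by the number of fractional digits.
    error = float(f"{err_digits}e{exp - len(frac_part)}")
    return value, error


print(parse_short_uncertainty("+3(8)e-05"))
print(parse_short_uncertainty("-1.234(5)"))
```

With the exponent honoured, +3(8)e-05 yields a value of 3e-05 rather than 3.0, which is the correction the repair function is described as applying.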