prep.x

Description

prep is the TurboRVB program that performs density-functional theory (DFT) calculations. It shares the same input cards (standard-input namelists) as the QMC driver and uses molecular orbitals from fort.10 or fort.10_new as the basis to run self-consistent (SC) DFT or non-self-consistent (NonSC) calculations (band structure, DOS, etc.). The source program name is main; the executable is prep.x.

  • Input: Standard input with namelists common to QMC: &simulation (itestr4, iopt), &pseudo (e.g. npsamax), &vmc, &optimization (molopt, yeswrite10), &readio (writescratch; must be 0 for prep), &parameters (e.g. yes_kpoints, decoupled_run), &molecul (real-space mesh nx, ny, nz, nbufd), &dft (maxit, epsdft, typedft, mixing, optocc, bands, memlarge, etc.). When compute_bands is used, &band_structure (task, deltaE, emin, emax) is read. fort.10 (when iopt=1) or fort.10_new (when iopt≠1) as the wave function; pseudo.dat when pseudopotentials are used.

  • Output: fort.10_new (converged wave function; single k-point or first k-point). When manyfort10 and yeswrite10, fort.10_000xxx per k-point under scratch. occupationlevels.dat (eigenvalues, occupations, mesh and k-point info). total_densities.sav, wavefunction.sav (for continuation and NonSC). EDFT_vsk.dat (when print_energies=.true.; energy per k-point). TurboDFT.sav and tmp00xxxx under wherescratch (distributed scratch per MPI rank).

  • Flow: read_datasmin reads SIMULATIONS, PARAMETERS, PSEUDO, VMC, K-POINTS → setup_para initializes parallel (k-point pools, SCALAPACK) → Initializeall reads pseudos, fort.10/fort.10_new, read_datasmin_mol (MOLECULAR), &dft / &band_structure → QMC arrays are deallocated and DFT scratch is allocated → initialize_environment (mesh, occupationlevels.dat) → self_consistent_run or non_self_consistent_run (if compute_bands) → write_output_and_finalize (writes scratch, fort.10_new, occupationlevels.dat, etc.).

  • Typical use: Run LDA/LSDA or Hartree DFT to obtain self-consistent molecular orbitals and write fort.10_new for use as the initial wave function in VMC/DMC. Supports k-point sampling (kaverage) and band structure (compute_bands).

The program supports MPI (when built with PARALLEL) and OpenMP. Run with --help, -help, or help to show online help (help_online('prep')); in a PARALLEL build, --help is typically available only in the serial executable.

Input and output

Input

  • Standard input: Read by read_datasmin and then in Initializeall. Namelists &simulation, &pseudo, &vmc, &optimization, &readio, &parameters, &molecul, &dft; when compute_bands, &band_structure. In &readio, set writescratch=0 (required for prep). Use test/.../prep.input or similar as a template.

  • fort.10 (when iopt=1): Initial wave function (TurboRVB format). Must contain molecular orbitals (contraction > 0).

  • fort.10_new (when iopt≠1): Wave function for continuation. When kaverage and yesread10, per-k-point wave functions may be read from turborvb.scratch/fort.10_00xxxx.

  • pseudo.dat: Read by read_pseudo when pseudopotentials (ECP) are used.

  • occupationlevels.dat: When iopt≠1 or compute_bands. Contains bandso, nproco, meshproco, nko, etc. from a previous run.

  • Scratch under wherescratch: For continuation; distributed density and matrices (e.g. tmp00xxxx) are read when writescratch=0.

  • Command line: First argument --help, -help, or help prints online help and exits.

Output

  • fort.10_new: Converged wave function (updated molecular orbitals). Single k-point or, when not manyfort10, the first k-point. Written by write_fort10(ufort10); rank 0 only when parallel.

  • fort.10_000xxx (under scratch): When manyfort10 and yeswrite10; one wave function file per k-point (unit_scratch_fort10, e.g. turborvb.scratch/fort.10_*).

  • occupationlevels.dat: Eigenvalues, occupations, mesh and k-point information for continuation and NonSC runs.

  • total_densities.sav: Charge (and spin density for LSDA). Written by rank 0 (unit_scratch_densities).

  • wavefunction.sav: Eigenvectors, etc. Written by rank 0 (unit_scratch_eigenvects).

  • EDFT_vsk.dat: When print_energies=.true.; energy per k-point (Efree(k), Etot(k), etc.) on unit_print_energies.

  • TurboDFT.sav: Final density and matrices for continuation.

  • tmp00xxxx (under wherescratch): Per-MPI-rank distributed mesh density and overlap matrices; written and read when writescratch=0.

  • Standard output: Calculation type (print_header), mesh info, convergence history, warnings, and final log.

Input parameters

Variable are read from standard input. Read the variables of the same Input parameters as turborvb.x from standard input.

DFT section

Parameter name

Datatype

Default

Description

mixing

real(8)

  • 0.25d0 if typeopt == 2

  • 0.25d0 if typeopt == 0

  • 1.d0 if typeopt == 4

  • 0.01d0 otherwise

Choose a small value for better convergence. If even in this way it does not converge, switch on the smearing technique setting optocc=1 (suggested for open shell systems). Alternatively you can change iteration method with typeopt=3 (conjugate gradients) which will certainly converge for mixing small enough. In these cases mixing means just the maximum amplitude in the step.

typedft

integer

  • if (abs(typedft) == 4 or abs(typedft) == 5) and (symmagp is .false. or ipc == 2) + 1

  • otherwise

    • 1 if typedft == 4

    • -1 if typedft == -4

    • 3 if typedft == 5

    • -3 if typedft == -5

    • 2 otherwise

  • typedft = 0: DFT calculation with Hartree potential only.

  • typedft = 1: LDA (PZ 1981).

  • typedft = 2: LDA (OB 1994).

  • typedft = -1,-2: Same as the two above, but with the corresponding fit performed by imposing continuity in the correlation energy at \(rs = 1\).

  • typedft = 3: KZK finite volume DFT: should be more accurate for finite volume.

  • typedft = -3: Different fitting procedure, suitable for open systems. Could be used with periodic systems too, but it is less stable.

  • typedft = 4: Standard LSDA.

  • typedft = -4: Standard LSDA, but with the corresponding fit performed by imposing continuity in the correlation energy at \(rs = 1\).

  • typedft = 5: LSDA + KZK (not applied on spin) (typedft = -5 similar to typedft = -3).

maxit

integer

100

Maximum number of iterations in the self-consistent cycle.

epsdft

real(8)

1.d-7

Tolerance in the convergence of total energy.

optocc

integer

-1

  • optocc = 0: Use standard occupation of levels. It works well only for closed shell systems (insulators or special cases). Occupations are read from standard input (see below); the number of occupations read is chosen by the parameter nelocc (neloccdo for down spin electrons) which must be specified in input. In this case we have occupations(1:nelocc) = 2 for LDA; occupations(1:nelocc) = 1 && occupationdo(1:neloccdo) = 1 for LSDA.

  • optocc = 1: Use a smeared Fermi distribution with a spread given by the parameter epsshell (see below). In this case: \(occupations(i) = \exp{ \frac{eig(i)-ef}{epsshell}+1}\) where the Fermi energy \(ef\) is determined by the constraint, sum(occupations(1:bands)) = no. of electrons via bisection method . For LSDA (|typedft| = 4, 5) two Fermi distributions are introduced for up and down electrons. In the case of k-points sampling, the Fermi energy is determined by averaging over \(ef\) computed for each k-point.

typeopt

integer

  • 4 if no input

  • ignore input or above value

    • 0 if compute_bands is .true.

  • typeopt = 0: Use self consistency method with standard mixing.

  • typeopt = 2: Linear mixing scheme.

  • typeopt = 3: Conjugate gradients method with SR acceleration.

  • typeopt = 4: Anderson mixing scheme with Jacobian acceleration, no use of mixing is made; this method looks to be the faster and therefore the preferred among the available ones. For information on the algorithm see doc/tex/parbcs.tex, Ch. V

memlarge

logical

.true.

Optimize speed at the cost of much greater memory requirements. The whole basis set is saved on disk!

epsover

real(8)

1d-13

Minimum tolerance for the lowest eigenvalues of the overlap matrix. If epsover < 0 no orthogonalization is implemented (faster but less stable).

nelocc

integer

0

If nelocc > 0, it is the number of occupations that are read in the last record of the input file. Occupation values can only be (0,2] (paired orbitals), -1 (unpaired at the end) or 0 (unoccupied). If nxs \(\times\) nys \(\times\) nzs > 0 this record is just after the ones to read the input magnetization (see below).

neloccdo

integer

0

It is similar to nelocc but for the spin down electrons which are assumed with no unpaired orbitals. Another record with neloccdo integers be written below. Note that, in this case occupations for up spin can take values 1 (occupied paired orbital), -1 (unpaired), 0 (unoccupied orbital). Instead occupations for down spin electrons can take values 1 (occupied paired orbital) and 0 (unoccupied orbital).

bands

integer

  • 2*neldo + nelup - neldo if no input

  • with input value or above value

    • 2*neldo + nelup - neldo if bands < nelup

    • nelocc if nelocc /= 0 and bands < nelocc

    • nelorbh if bands > nelorbh and contracted_on is .false.

    • nelorb_at if bands > nelorb_at and contracted_on is .true.

The number of lowest eigenvalues of Khon-Sham equations to be evaluated. By default bands = nelup + 7 where nelup is the number of spin up electrons. This corresponds to assuming at most an 8-fold degenerancy in the last occupied shell.

mixingder

real(8)

  • if no input

    • 0.05d0 if typeopt is even

    • 100.0d0 if typeopt is odd

  • with input or above value

    • abs(mixingder) if mixingder < 0

  • Case 1 (typeopt = 3): Used to evaluate numerically the first and second derivatives.

  • Case 2 (typeopt = 4): Used to be closer to the linear regime for the evaluation of the Jacobian (mixingder << 1).

epssr

real(8)

  • 0 if typeopt is even

  • -1 if typeopt is odd and no input

WIP

maxcg

integer

40

With typeopt = 3 (conjugate gradient) each maxcg steps restart the conj. grad. procedure. If maxcg = 0 no restarting is performed (discouraged since numerically unstable).

weightcorr

real(8)

1.d0

Weight of the correlation energy, e.g. weightcorr \(= 0\) no correlation, weightcorr \(= 1\) standard LDA.

weightxc

real(8)

1.d0

Weight of the exchange energy. When weightxc \(= 0\) no exchange and when \(weightcorr = 1\) standard LDA is used.

maxold

integer

3

The number of previous iterations to be considered in the numerical evaluation of Jacobian with typeopt = 4.

epsshell

real(qp)

  • 0.005d0 if no input

  • ignore input or above value

    • 0.0d0 if optocc == 0

Spread of the Fermi distribution used when optocc = 1. The unit is Ha.

weightvh

real(8)

1.d0

Weight of the hartree potential. Setting weightvh \(\neq 1\) is used just for testing.

orthodiag

logical

.false.

.false. the Kohn-Sham eigenvectors are not orthogonalized after each Hamiltonian diagonalization.

mincond

integer

1

Disregard the first mincond-1 direction regardless of the condition number limited by epsover. This is useful to have better cancellation errors as a function e.g. of pressure.

randspin

real(8)

  • if no input

    • 0.d0 if iopt == 0

    • -1.d0 if iopt /= 0

Used for initializing magnetization. If randspin > 0, add random component to the orbitals. If randspin < 0 initialize with maximum possible spin given density and grid (see below). If zero no action.

nzs

integer

1

Dimension of the grid where the magnetization is defined along the \(x\), \(y\), \(z\) direction. The format is written below.

Additional information:

  • nxs nys nzs

    Dimension of the grid where the magnetization is defined along the \(x\), \(y\), \(z\) direction: The format is

    !  After "/" or "occupation list"
    ! # empty line
    ! s111 s211 s311 ... snxs11
    ! s121 s221 s321 ... snxs11
    ! s1nys1 ... ... ... snxsnys1
    ! # empty line
    ! s112 s212 s312 ... snxs12
    ! ...
    ! # empty line
    ! s11nzs s21nzs s31nzs ... snxs1nzs
    ! ...
    ! s1nysnzs ... ... ... snxsnysnzs
    

nys

integer

1

Dimension of the grid where the magnetization is defined along the \(x\), \(y\), \(z\) direction. The format is written below.

Additional information:

  • nxs nys nzs

    Dimension of the grid where the magnetization is defined along the \(x\), \(y\), \(z\) direction: The format is

    !  After "/" or "occupation list"
    ! # empty line
    ! s111 s211 s311 ... snxs11
    ! s121 s221 s321 ... snxs11
    ! s1nys1 ... ... ... snxsnys1
    ! # empty line
    ! s112 s212 s312 ... snxs12
    ! ...
    ! # empty line
    ! s11nzs s21nzs s31nzs ... snxs1nzs
    ! ...
    ! s1nysnzs ... ... ... snxsnysnzs
    

nxs

integer

1

Dimension of the grid where the magnetization is defined along the \(x\), \(y\), \(z\) direction. The format is written below.

Additional information:

  • nxs nys nzs

    Dimension of the grid where the magnetization is defined along the \(x\), \(y\), \(z\) direction: The format is

    !  After "/" or "occupation list"
    ! # empty line
    ! s111 s211 s311 ... snxs11
    ! s121 s221 s321 ... snxs11
    ! s1nys1 ... ... ... snxsnys1
    ! # empty line
    ! s112 s212 s312 ... snxs12
    ! ...
    ! # empty line
    ! s11nzs s21nzs s31nzs ... snxs1nzs
    ! ...
    ! s1nysnzs ... ... ... snxsnysnzs
    

jaccond

real(8)

0.d0

Minimum threshold for the condition matrix in the self-consistent approach typeopt = 4.

h_field

real(8)

0.d0

If h\_field > (<) 0 put a magnetic field increasing (decreasing) the magnetization with the staggering given by the table sxyz defined for randspin.

eps_mach

real(8)

  • DLAMCH('E') (defined in lapack) if no input

WIP

nc

integer

-1

WIP

rmax

real*8

0.d0

WIP

write_den

logical

.false.

If .true. write the overlap matrix elements for effective Hamiltonian calculations to the disk.

contracted_on

logical

.false.

If .true. it acts on the contracted basis (considerably faster).

task

integer

0

WIP

deltaE

real(8)

0.d0

WIP

emin

real(8)

0.d0

WIP

emax

real(8)

0.d0

WIP

optimize_overs

logical

  • if no input

    • .true. if double_overs is .false.

  • ignore input or above value

    • .false. if double_overs is .true.

If .true. optimize the overlap matrices calculation if the phase for spin down electrons is equal or opposite to the phase for up spin electrons. Otherwise it is automatically set to .false..

shift_origin

logical

  • if no input

    • .true. if skip_equivalence is .true. or yes_tilted is .true.

    • .false. otherwise

WIP

try_translation

logical

.true.

WIP

zero_jas

logical

.false.

If .true. set the one-body Jastrow to zero at the end of the DFT calculation.

write_matrix

logical

.false.

WIP

fix_density

logical

-.false. if no input - ignore input

  • .false. if decoupled_run is .false.

If the flag decoupled_run is set to .true. (&parameters card) as well as the yes_kpoints flag (&kpoints card), the k-points are evolved independently but using the averaged electronic density.

newj_onebody

real(8) :: newj_onebody(100)

-1.d0

WIP

setj_onebody

real(8) :: setj_onebody(100)

-1.d0

WIP

h_charge

real(8)

0.d0

WIP

shiftx

logical

.false.

WIP

shifty

logical

.false.

WIP

shiftz

logical

.false.

WIP

l0_at

real(8)

1.d0

WIP

scale_z

integer

  • if no input

    • 2 if double_mesh is .true.

    • 1 if double_mesh is .false.

WIP

linear

logical

.false.

WIP

corr_hartree

logical

  • .true.` if no input

  • ignore input or above value

    • .false. if do_hartree is .true. or double_mesh is .false.

WIP

rion_from

real(8) :: rion_from(3)

automatic

WIP

minz_at

real(8)

-1.d0

WIP

vmax0_in

real(8)

-100.d0

WIP

nx_at

integer

automatic

WIP

ny_at

integer

automatic

WIP

nz_at

integer

automatic

WIP

from_ions

logical

.true.

WIP

scale_hartree

real(8)

  • if no input

    • 1.d0 if double_mesh is .true.

    • -1.d0 if double_mesh is .false.

WIP

do_hartree

logical

  • .false. if no input

  • ignore input

    • .false. if double_mesh is .false.

WIP

nproc_fft

integer

maxdiv

WIP

tfcut (nonexistent)

Used only with typeopt = 0/2/4. It is used for preconditioning to improve convergence of small q charge fluctuations. Suggested value of tfcut \(= \frac{1}{{xi_{TF}}^2}\) where \(xi\) is the Thomas-Fermi length expressed in a.u.

band_structure section

Parameter name

Datatype

Default

Description

task

integer

0

Flag to specify which quantity to compute after a non self-consistent run. It is ignored if ``type_comp_dft = 0``.
  ``task = 0`` Do not compute anything.
  ``task = 1`` Band structure plot (use ``kp_type = 2/3`` in the k-points card to specify the path in the Brillouin zone).
  ``task = 2`` Density of States calculations using smearing parameter given by ``epsshell``. The integer ``optocc`` must be set to 1.

deltaE

real(8)

0.d0

Energy bin for computing the density of states (task = 2).

emin

real(8)

0.d0

min(eigenvalue): minimum value of the energy to be included in band structure or DOS plot.

emax

real(8)

0.d0

max(eigenvalue): maximum value of the energy to be included in band structure or DOS plot.

lastline of file section

Parameter name

Datatype

Default

Description

occupations

real(8), dimension(:), allocatable :: occupations

automatic

WIP

Notes

Molecular orbitals required

  • DFT runs only with molecular orbitals: contraction must be > 0. Otherwise the program stops with "DFT works only with molecular orbitals". Use makefort10 or convertfort10 to build a fort.10 that contains molecular orbitals.

writescratch and iopt

  • writescratch must be 0 in &readio. If not, the program stops with "Scratch files not written".

  • iopt=1: Start from scratch (no potential); read fort.10. iopt=0: Continuation; read fort.10_new and occupationlevels.dat (and scratch as needed).

k-points and continuation

  • When kaverage (k-point sampling from &kpoints): For continuation with yesread10, per-k-point wave functions are read from fort.10_00xxxx. When manyfort10 and yeswrite10, per-k-point fort.10_000xxx are written.

  • compute_bands: Band-structure and DOS runs assume a prior converged SC run; use the same mesh and (for NonSC) the same number of processors per k-point. Set decoupled_run etc. as needed for the NonSC step.

itestr4 and Z optimization

  • If itestr4=-8 (optimize Z), the program reports "Optimization of Z is not possible with DFT", sets yeszagp=.false. and itestr4=-4, then continues (Z optimization is not performed in DFT).

MPI and OpenMP

  • When built with PARALLEL, run with e.g. mpirun -np N ./prep.x < prep.input. Rank 0 typically reads standard input and opens fort.10 / fort.10_new and pseudo.dat; ensure these files are visible to rank 0 (e.g. shared working directory).

  • For k-point runs: nproc must be a multiple of the number of k-points (nk). Use the same nproc and nk when continuing a run or when running NonSC after SC.

  • With __SCALAPACK, the build must use -DPARALLEL; otherwise initialization may fail. Block size and process grid must be consistent with matrix dimensions.

Pseudopotentials

  • When ECPs are used, pseudo.dat is read in Initializeall (same convention as the main TurboRVB program).

Troubleshooting

Fatal errors (program stops)

  • Scratch files not writtenwritescratch is not 0. Set writescratch=0 in &readio.

  • DFT works only with molecular orbitalscontraction=0 (no molecular orbitals). Provide a fort.10 with molecular orbitals (e.g. from makefort10 or convertfort10).

  • Optimization of Z is not possible with DFT — Started with itestr4=-8. The program switches to itestr4=-4 and continues; Z optimization is skipped.

  • Double allocation for complex algorithm — Mismatch between real/complex build or input. Check build options and input flags.

  • Some input k-points are wrong, substituted with WF phase! — Continuation with yesread10; input k-points do not match the phases in fort.10_00xxxx. The program substitutes phases and continues; verify k-point input.

  • error in reading DFT card — Invalid or malformed &dft namelist. Check syntax and variable names against the template.

  • error in reading band_structure cardcompute_bands is set but &band_structure read failed. Provide &band_structure with task, deltaE, emin, emax, etc.

  • ERROR in TurboDFT setup — Input inconsistency detected in check_dft_input or checkiflagerr. Fix the input according to the message printed just before.

  • It is assumed that nelup > neldo — nelup ≤ neldo in input. Check electron numbers.

  • To use LSDA functional, start with a non-symmetric AGP... — symmagp with typedft=4/5 (LSDA). Use a non-symmetric AGP or set typedft to 1/-1 (LDA).

  • compile with PARALLEL option for more than one k-point — Serial build with more than one k-point. Build with PARALLEL or use a single k-point.

  • choose an higher number of processors or decrease the number of k-points! — nproc < nk. Increase MPI processes or decrease the number of k-points.

  • Choose a number of processors multiple of the number of k-points! — nproc is not a multiple of nk. Set mpirun -np to a multiple of the k-point count.

  • occupationlevels.dat not found! Run a self-consistent calculation before computing band structurecompute_bands is set but the file is missing. Run an SC calculation first to generate it.

  • Use the same mesh as SC calculation! — NonSC run with a different mesh or processor layout than the SC run. Use the same nx, ny, nz and (for continuation) the same nproc.

  • one or more k-points read from file do not coincide with input fort.10 ones — K-points in the file and in fort.10 differ. Align k-point input with the previous SC run.

  • Impossible to continue fast with different # processors — Continuation with a different number of MPI processes. Use the same nproc or force noread to recompute.

  • Occupations not defined — Diagonalization or occupation setup failed (e.g. info≠0). Check bands, electron count, and optocc.

  • check input nbufd and/or code — Buffer or mesh dimension mismatch (initialize_matrices, hamiltonian, elec_density). Increase nbufd or nx, ny, nz in &molecul.

  • No tasks required / Task type not recognizedtask (in &band_structure) is 0 or not supported. Set task=1 (bands), 2 (DOS), or 5 (both).

  • Number of processor per k-point different with respect to self-consistent run — NonSC run with different nproc/nk than SC. Use the same nproc and nk.

  • Use kp_type=2 or kp_type=3 for plotting the DFT band structure — Band plotting requires kp_type 2 or 3 in the k-point input.

  • Error reading spin up/down occupations / Provide spin down occupations for... / ERROR reading occupations! — occupationlevels.dat or occupation input invalid or incomplete. Check file format and record count.

  • Too few molecular orbitals < nelocc — Fewer orbitals than nelocc. Increase bands or decrease nelocc.

  • ERROR not found molecular orbital # — Orbital reordering (e.g. momentum conservation) could not find a matching orbital. Check symmetry and AGP setup.

  • Turbo-DFT terminates without convergence — SC did not converge within maxit. Try increasing maxit, adjusting mixing, typeopt, or optocc.

Warnings (execution continues)

  • Many Warning messages are informational (e.g. init. value of threads, Point group symmetries not implemented with yes_tilted, # bands restored to default, do_hartree/corr_hartree settings, epsover/mixingder/epssr, density rescaling or symmetrisation, orbital reordering, occupation defaults, buffer size, SVD directions neglected, diagonalization info≠0, charge/spin not conserved, DOS smearing defaults). They indicate automatic adjustments or conditions to watch; check the message and adjust input or mesh if results are unexpected.

  • Warning you should run with a smaller number of processors !!! — Reducing the number of MPI ranks may improve stability.

  • Warning not all orbitals are read from fort10 — Fort.10 did not contain as many orbitals as expected; verify the file.

Other notes

  • fort.10_new is overwritten if it exists. Back it up if needed.