prep.x

Description

prep is the TurboRVB program that performs density-functional theory (DFT) calculations. It shares the same input cards (standard-input namelists) as the QMC driver and uses molecular orbitals from fort.10 or fort.10_new as the basis to run self-consistent (SC) DFT or non-self-consistent (NonSC) calculations (band structure, DOS, etc.). The source program name is main; the executable is prep.x.

Input: Standard input with namelists common to QMC: &simulation (itestr4, iopt), &pseudo (e.g. npsamax), &vmc, &optimization (molopt, yeswrite10), &readio (writescratch; must be 0 for prep), &parameters (e.g. yes_kpoints, decoupled_run), &molecul (real-space mesh nx, ny, nz, nbufd), &dft (maxit, epsdft, typedft, mixing, optocc, bands, memlarge, etc.). When compute_bands is used, &band_structure (task, deltaE, emin, emax) is read. fort.10 (when iopt=1) or fort.10_new (when iopt≠1) as the wave function; pseudo.dat when pseudopotentials are used.
Output: fort.10_new (converged wave function; single k-point or first k-point). When manyfort10 and yeswrite10, fort.10_000xxx per k-point under scratch. occupationlevels.dat (eigenvalues, occupations, mesh and k-point info). total_densities.sav, wavefunction.sav (for continuation and NonSC). EDFT_vsk.dat (when print_energies=.true.; energy per k-point). TurboDFT.sav and tmp00xxxx under wherescratch (distributed scratch per MPI rank).
Flow: read_datasmin reads SIMULATIONS, PARAMETERS, PSEUDO, VMC, K-POINTS → setup_para initializes parallel (k-point pools, SCALAPACK) → Initializeall reads pseudos, fort.10/fort.10_new, read_datasmin_mol (MOLECULAR), &dft / &band_structure → QMC arrays are deallocated and DFT scratch is allocated → initialize_environment (mesh, occupationlevels.dat) → self_consistent_run or non_self_consistent_run (if compute_bands) → write_output_and_finalize (writes scratch, fort.10_new, occupationlevels.dat, etc.).
Typical use: Run LDA/LSDA or Hartree DFT to obtain self-consistent molecular orbitals and write fort.10_new for use as the initial wave function in VMC/DMC. Supports k-point sampling (kaverage) and band structure (compute_bands).

The program supports MPI (when built with PARALLEL) and OpenMP. Run with --help, -help, or help to show online help (help_online('prep')); in a PARALLEL build, --help is typically available only in the serial executable.

Input and output

Input

Standard input: Read by read_datasmin and then in Initializeall. Namelists &simulation, &pseudo, &vmc, &optimization, &readio, &parameters, &molecul, &dft; when compute_bands, &band_structure. In &readio, set writescratch=0 (required for prep). Use test/.../prep.input or similar as a template.
fort.10 (when iopt=1): Initial wave function (TurboRVB format). Must contain molecular orbitals (contraction > 0).
fort.10_new (when iopt≠1): Wave function for continuation. When kaverage and yesread10, per-k-point wave functions may be read from turborvb.scratch/fort.10_00xxxx.
pseudo.dat: Read by read_pseudo when pseudopotentials (ECP) are used.
occupationlevels.dat: When iopt≠1 or compute_bands. Contains bandso, nproco, meshproco, nko, etc. from a previous run.
Scratch under wherescratch: For continuation; distributed density and matrices (e.g. tmp00xxxx) are read when writescratch=0.
Command line: First argument --help, -help, or help prints online help and exits.

Output

fort.10_new: Converged wave function (updated molecular orbitals). Single k-point or, when not manyfort10, the first k-point. Written by write_fort10(ufort10); rank 0 only when parallel.
fort.10_000xxx (under scratch): When manyfort10 and yeswrite10; one wave function file per k-point (unit_scratch_fort10, e.g. turborvb.scratch/fort.10_*).
occupationlevels.dat: Eigenvalues, occupations, mesh and k-point information for continuation and NonSC runs.
total_densities.sav: Charge (and spin density for LSDA). Written by rank 0 (unit_scratch_densities).
wavefunction.sav: Eigenvectors, etc. Written by rank 0 (unit_scratch_eigenvects).
EDFT_vsk.dat: When print_energies=.true.; energy per k-point (Efree(k), Etot(k), etc.) on unit_print_energies.
TurboDFT.sav: Final density and matrices for continuation.
tmp00xxxx (under wherescratch): Per-MPI-rank distributed mesh density and overlap matrices; written and read when writescratch=0.
Standard output: Calculation type (print_header), mesh info, convergence history, warnings, and final log.

Input parameters

Variable are read from standard input. Read the variables of the same Input parameters as turborvb.x from standard input.

DFT section

Parameter name	Datatype	Default	Description
mixing	`real(8)`	`0.25d0` if typeopt == 2 `0.25d0` if typeopt == 0 `1.d0` if typeopt == 4 `0.01d0` otherwise	Choose a small value for better convergence. If even in this way it does not converge, switch on the smearing technique setting `optocc=1` (suggested for open shell systems). Alternatively you can change iteration method with `typeopt=3` (conjugate gradients) which will certainly converge for mixing small enough. In these cases mixing means just the maximum amplitude in the step.
typedft	`integer`	if (abs(typedft) == 4 or abs(typedft) == 5) and (symmagp is .false. or ipc == 2) + `1` otherwise `1` if typedft == 4 `-1` if typedft == -4 `3` if typedft == 5 `-3` if typedft == -5 `2` otherwise	`typedft = 0`: DFT calculation with Hartree potential only. `typedft = 1`: LDA (PZ 1981). `typedft = 2`: LDA (OB 1994). `typedft = -1,-2`: Same as the two above, but with the corresponding fit performed by imposing continuity in the correlation energy at \(rs = 1\). `typedft = 3`: KZK finite volume DFT: should be more accurate for finite volume. `typedft = -3`: Different fitting procedure, suitable for open systems. Could be used with periodic systems too, but it is less stable. `typedft = 4`: Standard LSDA. `typedft = -4`: Standard LSDA, but with the corresponding fit performed by imposing continuity in the correlation energy at \(rs = 1\). `typedft = 5`: LSDA + KZK (not applied on spin) (`typedft = -5` similar to `typedft = -3`).
maxit	`integer`	`100`	Maximum number of iterations in the self-consistent cycle.
epsdft	`real(8)`	`1.d-7`	Tolerance in the convergence of total energy.
optocc	`integer`	`-1`	`optocc = 0`: Use standard occupation of levels. It works well only for closed shell systems (insulators or special cases). Occupations are read from standard input (see below); the number of occupations read is chosen by the parameter `nelocc` (`neloccdo` for down spin electrons) which must be specified in input. In this case we have `occupations(1:nelocc) = 2` for LDA; `occupations(1:nelocc) = 1` && `occupationdo(1:neloccdo) = 1` for LSDA. `optocc = 1`: Use a smeared Fermi distribution with a spread given by the parameter `epsshell` (see below). In this case: \(occupations(i) = \exp{ \frac{eig(i)-ef}{epsshell}+1}\) where the Fermi energy \(ef\) is determined by the constraint, sum(occupations(1:bands)) = no. of electrons via bisection method . For LSDA (`\|typedft\| = 4, 5`) two Fermi distributions are introduced for up and down electrons. In the case of k-points sampling, the Fermi energy is determined by averaging over \(ef\) computed for each k-point.
typeopt	`integer`	`4` if no input ignore input or above value `0` if compute_bands is .true.	`typeopt = 0`: Use self consistency method with standard mixing. `typeopt = 2`: Linear mixing scheme. `typeopt = 3`: Conjugate gradients method with SR acceleration. `typeopt = 4`: Anderson mixing scheme with Jacobian acceleration, no use of mixing is made; this method looks to be the faster and therefore the preferred among the available ones. For information on the algorithm see doc/tex/parbcs.tex, Ch. V
memlarge	`logical`	`.true.`	Optimize speed at the cost of much greater memory requirements. The whole basis set is saved on disk!
epsover	`real(8)`	`1d-13`	Minimum tolerance for the lowest eigenvalues of the overlap matrix. If `epsover` < 0 no orthogonalization is implemented (faster but less stable).
nelocc	`integer`	`0`	If `nelocc` > 0, it is the number of occupations that are read in the last record of the input file. Occupation values can only be (0,2] (paired orbitals), -1 (unpaired at the end) or 0 (unoccupied). If `nxs` \(\times\) `nys` \(\times\) `nzs` > 0 this record is just after the ones to read the input magnetization (see below).
neloccdo	`integer`	`0`	It is similar to `nelocc` but for the spin down electrons which are assumed with no unpaired orbitals. Another record with `neloccdo` integers be written below. Note that, in this case occupations for up spin can take values 1 (occupied paired orbital), -1 (unpaired), 0 (unoccupied orbital). Instead occupations for down spin electrons can take values 1 (occupied paired orbital) and 0 (unoccupied orbital).
bands	`integer`	`2neldo + nelup - neldo` if no input with input value or above value `2neldo + nelup - neldo` if bands < nelup `nelocc` if nelocc /= 0 and bands < nelocc `nelorbh` if bands > nelorbh and contracted_on is .false. `nelorb_at` if bands > nelorb_at and contracted_on is .true.	The number of lowest eigenvalues of Khon-Sham equations to be evaluated. By default `bands = nelup + 7` where `nelup` is the number of spin up electrons. This corresponds to assuming at most an 8-fold degenerancy in the last occupied shell.
mixingder	`real(8)`	if no input `0.05d0` if typeopt is even `100.0d0` if typeopt is odd with input or above value `abs(mixingder)` if mixingder < 0	Case 1 (`typeopt = 3`): Used to evaluate numerically the first and second derivatives. Case 2 (`typeopt = 4`): Used to be closer to the linear regime for the evaluation of the Jacobian (`mixingder` << 1).
epssr	`real(8)`	`0` if typeopt is even `-1` if typeopt is odd and no input	WIP
maxcg	`integer`	`40`	With `typeopt = 3` (conjugate gradient) each `maxcg` steps restart the conj. grad. procedure. If `maxcg = 0` no restarting is performed (discouraged since numerically unstable).
weightcorr	`real(8)`	`1.d0`	Weight of the correlation energy, e.g. `weightcorr` \(= 0\) no correlation, `weightcorr` \(= 1\) standard LDA.
weightxc	`real(8)`	`1.d0`	Weight of the exchange energy. When `weightxc` \(= 0\) no exchange and when \(weightcorr = 1\) standard LDA is used.
maxold	`integer`	`3`	The number of previous iterations to be considered in the numerical evaluation of Jacobian with `typeopt = 4`.
epsshell	`real(qp)`	`0.005d0` if no input ignore input or above value `0.0d0` if optocc == 0	Spread of the Fermi distribution used when `optocc = 1`. The unit is Ha.
weightvh	`real(8)`	`1.d0`	Weight of the hartree potential. Setting `weightvh` \(\neq 1\) is used just for testing.
orthodiag	`logical`	`.false.`	`.false.` the Kohn-Sham eigenvectors are not orthogonalized after each Hamiltonian diagonalization.
mincond	`integer`	`1`	Disregard the first `mincond`-1 direction regardless of the condition number limited by `epsover`. This is useful to have better cancellation errors as a function e.g. of pressure.
randspin	`real(8)`	if no input `0.d0` if iopt == 0 `-1.d0` if iopt /= 0	Used for initializing magnetization. If `randspin` > 0, add random component to the orbitals. If `randspin` < 0 initialize with maximum possible spin given density and grid (see below). If zero no action.
nzs	`integer`	`1`	Dimension of the grid where the magnetization is defined along the \(x\), \(y\), \(z\) direction. The format is written below. Additional information: `nxs` `nys` `nzs` Dimension of the grid where the magnetization is defined along the \(x\), \(y\), \(z\) direction: The format is ! After "/" or "occupation list" ! # empty line ! s111 s211 s311 ... snxs11 ! s121 s221 s321 ... snxs11 ! s1nys1 ... ... ... snxsnys1 ! # empty line ! s112 s212 s312 ... snxs12 ! ... ! # empty line ! s11nzs s21nzs s31nzs ... snxs1nzs ! ... ! s1nysnzs ... ... ... snxsnysnzs
nys	`integer`	`1`	Dimension of the grid where the magnetization is defined along the \(x\), \(y\), \(z\) direction. The format is written below. Additional information: `nxs` `nys` `nzs` Dimension of the grid where the magnetization is defined along the \(x\), \(y\), \(z\) direction: The format is ! After "/" or "occupation list" ! # empty line ! s111 s211 s311 ... snxs11 ! s121 s221 s321 ... snxs11 ! s1nys1 ... ... ... snxsnys1 ! # empty line ! s112 s212 s312 ... snxs12 ! ... ! # empty line ! s11nzs s21nzs s31nzs ... snxs1nzs ! ... ! s1nysnzs ... ... ... snxsnysnzs
nxs	`integer`	`1`	Dimension of the grid where the magnetization is defined along the \(x\), \(y\), \(z\) direction. The format is written below. Additional information: `nxs` `nys` `nzs` Dimension of the grid where the magnetization is defined along the \(x\), \(y\), \(z\) direction: The format is ! After "/" or "occupation list" ! # empty line ! s111 s211 s311 ... snxs11 ! s121 s221 s321 ... snxs11 ! s1nys1 ... ... ... snxsnys1 ! # empty line ! s112 s212 s312 ... snxs12 ! ... ! # empty line ! s11nzs s21nzs s31nzs ... snxs1nzs ! ... ! s1nysnzs ... ... ... snxsnysnzs
jaccond	`real(8)`	`0.d0`	Minimum threshold for the condition matrix in the self-consistent approach `typeopt = 4`.
h_field	`real(8)`	`0.d0`	If `h\_field` > (<) 0 put a magnetic field increasing (decreasing) the magnetization with the staggering given by the table sxyz defined for `randspin`.
eps_mach	`real(8)`	`DLAMCH('E')` (defined in lapack) if no input	WIP
nc	`integer`	`-1`	WIP
rmax	`real*8`	`0.d0`	WIP
write_den	`logical`	`.false.`	If `.true.` write the overlap matrix elements for effective Hamiltonian calculations to the disk.
contracted_on	`logical`	`.false.`	If `.true.` it acts on the contracted basis (considerably faster).
task	`integer`	`0`	WIP
deltaE	`real(8)`	`0.d0`	WIP
emin	`real(8)`	`0.d0`	WIP
emax	`real(8)`	`0.d0`	WIP
optimize_overs	`logical`	if no input `.true.` if double_overs is .false. ignore input or above value `.false.` if double_overs is .true.	If `.true.` optimize the overlap matrices calculation if the phase for spin down electrons is equal or opposite to the phase for up spin electrons. Otherwise it is automatically set to `.false.`.
shift_origin	`logical`	if no input `.true.` if skip_equivalence is .true. or yes_tilted is .true. `.false.` otherwise	WIP
try_translation	`logical`	`.true.`	WIP
zero_jas	`logical`	`.false.`	If `.true.` set the one-body Jastrow to zero at the end of the DFT calculation.
write_matrix	`logical`	`.false.`	WIP
fix_density	`logical`	-`.false.` if no input - ignore input `.false.` if decoupled_run is .false.	If the flag `decoupled_run` is set to `.true.` (&parameters card) as well as the `yes_kpoints` flag (&kpoints card), the k-points are evolved independently but using the averaged electronic density.
newj_onebody	`real(8) :: newj_onebody(100)`	`-1.d0`	WIP
setj_onebody	`real(8) :: setj_onebody(100)`	`-1.d0`	WIP
h_charge	`real(8)`	`0.d0`	WIP
shiftx	`logical`	`.false.`	WIP
shifty	`logical`	`.false.`	WIP
shiftz	`logical`	`.false.`	WIP
l0_at	`real(8)`	`1.d0`	WIP
scale_z	`integer`	if no input `2` if double_mesh is .true. `1` if double_mesh is .false.	WIP
linear	`logical`	`.false.`	WIP
corr_hartree	`logical`	.true.` if no input ignore input or above value `.false.` if do_hartree is .true. or double_mesh is .false.	WIP
rion_from	`real(8) :: rion_from(3)`	automatic	WIP
minz_at	`real(8)`	`-1.d0`	WIP
vmax0_in	`real(8)`	`-100.d0`	WIP
nx_at	`integer`	automatic	WIP
ny_at	`integer`	automatic	WIP
nz_at	`integer`	automatic	WIP
from_ions	`logical`	`.true.`	WIP
scale_hartree	`real(8)`	if no input `1.d0` if double_mesh is .true. `-1.d0` if double_mesh is .false.	WIP
do_hartree	`logical`	`.false.` if no input ignore input `.false.` if double_mesh is .false.	WIP
nproc_fft	`integer`	`maxdiv`	WIP
tfcut (nonexistent)			Used only with `typeopt = 0/2/4`. It is used for preconditioning to improve convergence of small q charge fluctuations. Suggested value of `tfcut` \(= \frac{1}{{xi_{TF}}^2}\) where \(xi\) is the Thomas-Fermi length expressed in a.u.

band_structure section

Parameter name	Datatype	Default	Description
task	`integer`	`0`	Flag to specify which quantity to compute after a non self-consistent run. It is ignored if ``type_comp_dft = 0``. ``task = 0`` Do not compute anything. ``task = 1`` Band structure plot (use ``kp_type = 2/3`` in the k-points card to specify the path in the Brillouin zone). ``task = 2`` Density of States calculations using smearing parameter given by ``epsshell``. The integer ``optocc`` must be set to 1.
deltaE	`real(8)`	`0.d0`	Energy bin for computing the density of states (task = 2).
emin	`real(8)`	`0.d0`	min(eigenvalue): minimum value of the energy to be included in band structure or DOS plot.
emax	`real(8)`	`0.d0`	max(eigenvalue): maximum value of the energy to be included in band structure or DOS plot.

lastline of file section

Parameter name	Datatype	Default	Description
occupations	`real(8), dimension(:), allocatable` :: occupations	automatic	WIP

Notes

Molecular orbitals required

DFT runs only with molecular orbitals: contraction must be > 0. Otherwise the program stops with "DFT works only with molecular orbitals". Use makefort10 or convertfort10 to build a fort.10 that contains molecular orbitals.

writescratch and iopt

writescratch must be 0 in &readio. If not, the program stops with "Scratch files not written".
iopt=1: Start from scratch (no potential); read fort.10. iopt=0: Continuation; read fort.10_new and occupationlevels.dat (and scratch as needed).

k-points and continuation

When kaverage (k-point sampling from &kpoints): For continuation with yesread10, per-k-point wave functions are read from fort.10_00xxxx. When manyfort10 and yeswrite10, per-k-point fort.10_000xxx are written.
compute_bands: Band-structure and DOS runs assume a prior converged SC run; use the same mesh and (for NonSC) the same number of processors per k-point. Set decoupled_run etc. as needed for the NonSC step.

itestr4 and Z optimization

If itestr4=-8 (optimize Z), the program reports "Optimization of Z is not possible with DFT", sets yeszagp=.false. and itestr4=-4, then continues (Z optimization is not performed in DFT).

MPI and OpenMP

When built with PARALLEL, run with e.g. mpirun -np N ./prep.x < prep.input. Rank 0 typically reads standard input and opens fort.10 / fort.10_new and pseudo.dat; ensure these files are visible to rank 0 (e.g. shared working directory).
For k-point runs: nproc must be a multiple of the number of k-points (nk). Use the same nproc and nk when continuing a run or when running NonSC after SC.
With __SCALAPACK, the build must use -DPARALLEL; otherwise initialization may fail. Block size and process grid must be consistent with matrix dimensions.

Pseudopotentials

When ECPs are used, pseudo.dat is read in Initializeall (same convention as the main TurboRVB program).

Troubleshooting

Fatal errors (program stops)

Scratch files not written — writescratch is not 0. Set writescratch=0 in &readio.
DFT works only with molecular orbitals — contraction=0 (no molecular orbitals). Provide a fort.10 with molecular orbitals (e.g. from makefort10 or convertfort10).
Optimization of Z is not possible with DFT — Started with itestr4=-8. The program switches to itestr4=-4 and continues; Z optimization is skipped.
Double allocation for complex algorithm — Mismatch between real/complex build or input. Check build options and input flags.
Some input k-points are wrong, substituted with WF phase! — Continuation with yesread10; input k-points do not match the phases in fort.10_00xxxx. The program substitutes phases and continues; verify k-point input.
error in reading DFT card — Invalid or malformed &dft namelist. Check syntax and variable names against the template.
error in reading band_structure card — compute_bands is set but &band_structure read failed. Provide &band_structure with task, deltaE, emin, emax, etc.
ERROR in TurboDFT setup — Input inconsistency detected in check_dft_input or checkiflagerr. Fix the input according to the message printed just before.
It is assumed that nelup > neldo — nelup ≤ neldo in input. Check electron numbers.
To use LSDA functional, start with a non-symmetric AGP... — symmagp with typedft=4/5 (LSDA). Use a non-symmetric AGP or set typedft to 1/-1 (LDA).
compile with PARALLEL option for more than one k-point — Serial build with more than one k-point. Build with PARALLEL or use a single k-point.
choose an higher number of processors or decrease the number of k-points! — nproc < nk. Increase MPI processes or decrease the number of k-points.
Choose a number of processors multiple of the number of k-points! — nproc is not a multiple of nk. Set mpirun -np to a multiple of the k-point count.
occupationlevels.dat not found! Run a self-consistent calculation before computing band structure — compute_bands is set but the file is missing. Run an SC calculation first to generate it.
Use the same mesh as SC calculation! — NonSC run with a different mesh or processor layout than the SC run. Use the same nx, ny, nz and (for continuation) the same nproc.
one or more k-points read from file do not coincide with input fort.10 ones — K-points in the file and in fort.10 differ. Align k-point input with the previous SC run.
Impossible to continue fast with different # processors — Continuation with a different number of MPI processes. Use the same nproc or force noread to recompute.
Occupations not defined — Diagonalization or occupation setup failed (e.g. info≠0). Check bands, electron count, and optocc.
check input nbufd and/or code — Buffer or mesh dimension mismatch (initialize_matrices, hamiltonian, elec_density). Increase nbufd or nx, ny, nz in &molecul.
No tasks required / Task type not recognized — task (in &band_structure) is 0 or not supported. Set task=1 (bands), 2 (DOS), or 5 (both).
Number of processor per k-point different with respect to self-consistent run — NonSC run with different nproc/nk than SC. Use the same nproc and nk.
Use kp_type=2 or kp_type=3 for plotting the DFT band structure — Band plotting requires kp_type 2 or 3 in the k-point input.
Error reading spin up/down occupations / Provide spin down occupations for... / ERROR reading occupations! — occupationlevels.dat or occupation input invalid or incomplete. Check file format and record count.
Too few molecular orbitals < nelocc — Fewer orbitals than nelocc. Increase bands or decrease nelocc.
ERROR not found molecular orbital # — Orbital reordering (e.g. momentum conservation) could not find a matching orbital. Check symmetry and AGP setup.
Turbo-DFT terminates without convergence — SC did not converge within maxit. Try increasing maxit, adjusting mixing, typeopt, or optocc.

Warnings (execution continues)

Many Warning messages are informational (e.g. init. value of threads, Point group symmetries not implemented with yes_tilted, # bands restored to default, do_hartree/corr_hartree settings, epsover/mixingder/epssr, density rescaling or symmetrisation, orbital reordering, occupation defaults, buffer size, SVD directions neglected, diagonalization info≠0, charge/spin not conserved, DOS smearing defaults). They indicate automatic adjustments or conditions to watch; check the message and adjust input or mesh if results are unexpected.
Warning you should run with a smaller number of processors !!! — Reducing the number of MPI ranks may improve stability.
Warning not all orbitals are read from fort10 — Fort.10 did not contain as many orbitals as expected; verify the file.

Other notes

fort.10_new is overwritten if it exists. Back it up if needed.