Configuration Files
This document describes the detailed specifications of TurboWorkflows configuration files. For an overview and basic usage of configuration files, see Environment Configuration.
Configuration File Structure
Configuration files are managed in the following directory structure.
~/.turbofilemanager_config/
├── machine_data.yaml # Server machine settings
├── localhost/ # Settings for localhost
│ ├── package.yaml
│ ├── queue_data.toml
│ ├── submit_mpi.sh
│ └── submit_nompi.sh
├── remotesrv/ # Settings for remotesrv (example)
│ ├── package.yaml
│ ├── queue_data.toml
│ ├── submit_mpi.sh
│ └── submit_nompi.sh
└── ...
machine_data.yaml
A YAML file that describes settings for server machines.
File Format
Written in dictionary format with server names as keys. An example is shown below.
localhost:
  machine_type: local
  queuing: false
  computation: true
  file_manager_root: /mnt/data/workflow
  jobsubmit: bash
  jobcheck: ps
  jobnum_index: 1
remotesrv:
  machine_type: remote
  queuing: true
  computation: true
  file_manager_root: /work/flow/data
  jobsubmit: /opt/pbs/bin/qsub
  jobcheck: /opt/pbs/bin/qstat
  jobdel: /opt/pbs/bin/qdel
  jobnum_index: 0
Configuration Parameters
machine_type
Type: String (local or remote)
Required: Yes
Description: Specifies the type of machine.
local: Local machine (executed on the same host)
remote: Remote machine (connected via SSH)
queuing
Type: Boolean (true or false)
Required: Yes
Description: Specifies whether to use a job scheduler.
true: Use a job scheduler (executed as batch jobs)
false: Do not use a job scheduler (direct execution)
computation
Type: Boolean (true or false)
Required: Yes
Description: Specifies whether to execute computations on this machine.
true: Execute computations
false: Do not execute computations (file management only, etc.)
file_manager_root
Type: String (directory path)
Required: When using a remote server, or when file transfer is needed
Description: Specifies the root directory for file management. When transferring files, file paths are treated as relative paths from this directory.
Note: When using a remote server, you need to set file_manager_root on both localhost and the remote server.
jobsubmit
Type: String (command path)
Required: When executing computations
Description: Specifies the path to the command for submitting jobs.
Examples:
PBS/Torque: qsub
Slurm: sbatch
Local execution: bash
jobcheck
Type: String (command path)
Required: When executing computations
Description: Specifies the path to the command for checking job execution status.
Examples:
PBS/Torque: qstat, or qstat -u username, etc.
Slurm: squeue, or squeue --noheader, etc.
Local execution: ps
jobdel
Type: String (command path)
Required: When queuing: true
Description: Specifies the path to the command for deleting (canceling) jobs.
Examples:
PBS/Torque: qdel
Slurm: scancel
jobnum_index
Type: Integer
Required: When executing computations
Description: Specifies which field of the job submission command's output contains the job number. The output is split by whitespace, and the index is 0-based.
Examples:
For Slurm:
$ sbatch job.sh
Submitted batch job 42
Since the job number (42) is the fourth field, which is index 3 when counting from 0, specify jobnum_index: 3.
For PBS:
$ qsub job.sh
42.server-pbs
Since the job number (42.server-pbs) is the first field, specify jobnum_index: 0.
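The field selection described for jobnum_index can be illustrated with a few lines of Python. This is only a minimal sketch of the splitting rule stated above; the function name extract_job_id is hypothetical and is not taken from TurboWorkflows.

```python
def extract_job_id(submit_output: str, jobnum_index: int) -> str:
    """Return the whitespace-separated field at jobnum_index (0-based)."""
    fields = submit_output.split()
    if not 0 <= jobnum_index < len(fields):
        raise IndexError(
            f"jobnum_index={jobnum_index} is out of range for output {submit_output!r}"
        )
    return fields[jobnum_index]


# Slurm-style output: the job number is the fourth field (index 3).
print(extract_job_id("Submitted batch job 42", 3))
# PBS-style output: the job number is the first field (index 0).
print(extract_job_id("42.server-pbs", 0))
```

Running the submission command once by hand and counting the fields in its output is an easy way to pick the right index for a new system.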
ip
Type: String (IP address)
Required: No (no longer used)
Description: Specifies the IP address of the remote machine. Because HostName is usually specified in the SSH configuration (~/.ssh/config), this parameter is no longer used.
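The "Required" conditions listed above can be checked mechanically before running a workflow. The following is a rough sketch, assuming machine_data.yaml has already been parsed into a Python dictionary; the helper check_machine_entry is hypothetical and only encodes the rules as this document states them.

```python
def check_machine_entry(name: str, entry: dict) -> list[str]:
    """Return a list of problems found in one server entry (empty if none)."""
    problems = []
    # Keys that are always required.
    for key in ("machine_type", "queuing", "computation"):
        if key not in entry:
            problems.append(f"{name}: missing required key '{key}'")
    if "machine_type" in entry and entry["machine_type"] not in ("local", "remote"):
        problems.append(f"{name}: machine_type must be 'local' or 'remote'")
    # Keys required when computations are executed on this machine.
    if entry.get("computation"):
        for key in ("jobsubmit", "jobcheck", "jobnum_index"):
            if key not in entry:
                problems.append(f"{name}: '{key}' is required when computation is true")
    # jobdel is required when a job scheduler is used.
    if entry.get("queuing") and "jobdel" not in entry:
        problems.append(f"{name}: 'jobdel' is required when queuing is true")
    # file_manager_root is required for remote servers.
    if entry.get("machine_type") == "remote" and "file_manager_root" not in entry:
        problems.append(f"{name}: 'file_manager_root' is required for remote servers")
    return problems


# The remotesrv entry from the example above, written as a dictionary.
remotesrv = {
    "machine_type": "remote",
    "queuing": True,
    "computation": True,
    "file_manager_root": "/work/flow/data",
    "jobsubmit": "/opt/pbs/bin/qsub",
    "jobcheck": "/opt/pbs/bin/qstat",
    "jobdel": "/opt/pbs/bin/qdel",
    "jobnum_index": 0,
}
print(check_machine_entry("remotesrv", remotesrv))
```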
package.yaml
Configure program package settings for each server. Written in YAML format.
Packages are mainly specified within Workflow classes and are used to manage external programs used by Workflows. Currently, the keys turborvb and python are used.
File Format
Written in dictionary format with package names as keys. An example is shown below.
turborvb:
  name: turborvb
  binary_path:
    stable: /opt/turborvb/stable/bin
    latest: /opt/turborvb/latest/bin
  binary_list:
    - turborvb-serial.x
    - turborvb-mpi.x
    - prep-serial.x
    - prep-mpi.x
    - makefort10.x
    - convertfort10mol.x
    - convertfort10.x
    - readforward-serial.x
    - readforward-mpi.x
  job_template:
    mpi: submit_mpi.sh
    nompi: submit_nompi.sh
python:
  name: python
  binary_path:
    stable: /usr/bin
  binary_list:
    - python3
  job_template:
    mpi: submit_mpi.sh
    nompi: submit_nompi.sh
Configuration Parameters
- Package Entry (e.g., turborvb, python)
Type: Dictionary
Required: Yes
Description: Describes the package settings with the package name as the key.
name
Type: String
Required: Yes
Description: Specifies the package name. Usually, this is the same value as the key.
binary_path
Type: Dictionary
Required: Yes
Description: Specifies the binary path for each version with version names as keys. Multiple versions can be managed.
Example:
binary_path:
  stable: /opt/turborvb/stable/bin
  latest: /opt/turborvb/latest/bin
  v1.0: /opt/turborvb/v1.0/bin
Note: It is recommended to specify absolute paths. When using relative paths, be aware that they depend on the current directory at execution time.
Empty Value: When an empty string ("" or blank) is specified, binaries are searched from the PATH environment variable.
binary_list
Type: List (list of strings)
Required: Yes
Description: Specifies a list of binary file names used by this package.
Example:
binary_list:
  - turborvb-serial.x
  - turborvb-mpi.x
  - prep-serial.x
job_template
Type: Dictionary
Required: Yes
Description: Specifies the template file name for job scripts.
mpi
Type: String
Required: Yes
Description: Specifies the template file name for MPI parallel jobs (e.g., submit_mpi.sh).
nompi
Type: String
Required: Yes
Description: Specifies the template file name for serial jobs (e.g., submit_nompi.sh).
Version Management
binary_path can manage multiple versions. The version used in workflows is specified with the version parameter (e.g., version="stable").
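To make the version lookup concrete, the sketch below resolves a binary for a given version from a package entry that has already been parsed into a dictionary. The function resolve_binary is hypothetical; it follows the rules stated in this document (version name as key, empty value meaning a PATH search) and is not the TurboWorkflows implementation.

```python
import os
import shutil


def resolve_binary(package: dict, version: str, binary: str):
    """Return the full path of `binary` for `version`, or None if a PATH search fails."""
    if binary not in package["binary_list"]:
        raise ValueError(f"{binary} is not listed in binary_list")
    root = package["binary_path"][version]  # KeyError for an unknown version
    if not root:
        # Empty string or blank value: search the PATH environment variable.
        return shutil.which(binary)
    return os.path.join(root, binary)


# A shortened copy of the turborvb entry from the example above.
turborvb = {
    "name": "turborvb",
    "binary_path": {
        "stable": "/opt/turborvb/stable/bin",
        "latest": "/opt/turborvb/latest/bin",
    },
    "binary_list": ["turborvb-serial.x", "turborvb-mpi.x", "prep-mpi.x"],
}
print(resolve_binary(turborvb, "stable", "turborvb-mpi.x"))
```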
queue_data.toml
Configure batch queue settings for each server. Written in TOML format.
File Format
Written in dictionary format with queue labels as keys. An example is shown below.
[default]
mpi = false
max_job_submit = 1
num_cores = 1
omp_num_threads = 1
nodes = 1
cpns = 1
mpi_per_node = 1
[large]
mpi = true
max_job_submit = 10
num_cores = 48
omp_num_threads = 1
nodes = 2
cpns = 48
cores_per_node = 48
mpi_per_node = 24
max_time = "24:00:00"
queue = "large"
account = "myaccount"
partition = "normal"
Configuration Parameters
- Queue Label (e.g., [default], [large])
Type: TOML table
Required: Yes
Description: Describes the queue settings with the queue label as the key. The label to use is selected with the queue_label parameter of the Workflow class.
mpi
Type: Boolean (true or false)
Required: Yes
Description: Specifies whether to perform MPI parallelization.
true: Execute as an MPI parallel job. Uses mpi from job_template in package settings.
false: Execute as a serial job. Uses nompi from job_template in package settings.
max_job_submit
Type: Integer
Required: Yes
Description: Specifies the maximum number of jobs that can be submitted to the job scheduler. The limit varies by system.
- Custom Variables
Type: Any (string, integer, floating point number, boolean)
Required: No
Description: Any key-value pairs can be defined. These are used as parameters in job templates, and _KEY_ is replaced with the value corresponding to the key (case-insensitive).
Examples of commonly used variables:
queue: Queue name
nodes: Number of nodes to use
num_cores: Number of cores to use
omp_num_threads: Number of OpenMP threads
cores_per_node or cpns: Number of cores per node
mpi_per_node: Number of MPI processes per node
max_time: Maximum execution time (specified as a string, e.g., "24:00:00")
account: Account name
partition: Partition name (Slurm)
memory: Memory amount (e.g., "32GB")
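The way the mpi flag of a queue selects a job template from the package settings can be sketched as follows. Both dictionaries stand in for already-parsed configuration files, and select_job_template is a hypothetical helper written only to illustrate the rule described above.

```python
def select_job_template(package: dict, queue: dict) -> str:
    """Pick the template file name according to the queue's mpi flag."""
    kind = "mpi" if queue["mpi"] else "nompi"
    return package["job_template"][kind]


# Fragments of the package.yaml and queue_data.toml examples in this document.
package = {"job_template": {"mpi": "submit_mpi.sh", "nompi": "submit_nompi.sh"}}
queues = {
    "default": {"mpi": False, "max_job_submit": 1},
    "large": {"mpi": True, "max_job_submit": 10},
}
for label, queue in queues.items():
    print(label, "->", select_job_template(package, queue))
```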
TOML Format Notes
TOML describes data as key/value pairs. Values are typed and the syntax is strict, so take care when editing. Some points to note are listed below. For details, refer to the TOML specification.
Numbers: Both integers and floating point numbers can be used. Floating point numbers require digits both before and after the decimal point: write 1.0, not 1., and 0.1, not .1.
Strings: Must be enclosed in quotes (e.g., queue = "small").
Booleans: Write true or false in lowercase. yes, no, etc. are not accepted.
Time: When setting the maximum execution time, enclose the value in quotes so that it is treated as a string. TOML has its own type for times of day (local time), so an unquoted value is parsed as that type rather than as a string, which can cause errors.
max_time = "24:00:00" # Correct
max_time = 24:00:00 # Error (interpreted as time)
Job Script Templates (submit_mpi.sh, submit_nompi.sh)
Prepare job script templates. Templates can be separated by package and by the presence or absence of MPI parallelization.
File Format
Written in shell script format. Embedded parameters in templates are written in the format _KEY_.
Predefined Variables
The predefined variables that are automatically replaced by TurboWorkflows are as follows.
_INPUT_
Description: Path to the input file
_OUTPUT_
Description: Path to the output file
_PREOPTION_
Description: Options to be placed before _INPUT_
_POSTOPTION_
Description: Options to be placed after _INPUT_
Usage Example:
$BINARY $PREOPTION < $INPUT $POSTOPTION > $OUTPUT
If _INPUT_ is None, the < $INPUT part is removed.
If _PREOPTION_ or _POSTOPTION_ is None, the corresponding variable is replaced with an empty string.
_JOBNAME_
Description: Job name
Usage Example:
#SBATCH --job-name=_JOBNAME_ or #PBS -N _JOBNAME_
_BINARY_ROOT_
Description: Root directory path of the binary
_BINARY_
Description: Binary file name
Usage Example:
BINARY=_BINARY_ROOT_/_BINARY_
Using Custom Variables
Variables defined in queue_data.toml can be used in job script templates.
Keywords in templates are the variable names in uppercase enclosed by _ (e.g., num_cores → _NUM_CORES_).
Example:
When defined in queue_data.toml as follows:
[default]
num_cores = 48
omp_num_threads = 1
nodes = 2
max_time = "24:00:00"
They can be used in job script templates as follows:
export OMP_NUM_THREADS=_OMP_NUM_THREADS_
CORES=_NUM_CORES_
#SBATCH --time=_MAX_TIME_
For template examples, see Configuration File Examples.
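The substitution rule described in this section (a key such as num_cores fills the placeholder _NUM_CORES_, matched case-insensitively) can be sketched in Python as below. The function render_template is hypothetical and shows one possible way to apply the rule; it is not the TurboWorkflows implementation. Longer keys are tried first so that a placeholder such as _BINARY_ROOT_ is not mistaken for _BINARY_ followed by extra text.

```python
import re


def render_template(template: str, variables: dict) -> str:
    """Replace each _KEY_ placeholder with the value of `key` (case-insensitive)."""
    if not variables:
        return template
    lookup = {key.lower(): str(value) for key, value in variables.items()}
    # Try longer names first so that overlapping names resolve correctly.
    names = sorted(lookup, key=len, reverse=True)
    pattern = re.compile(
        "_(" + "|".join(re.escape(name) for name in names) + ")_",
        re.IGNORECASE,
    )
    return pattern.sub(lambda match: lookup[match.group(1).lower()], template)


# Values from the [default] table in the example above.
queue = {"num_cores": 48, "omp_num_threads": 1, "nodes": 2, "max_time": "24:00:00"}
template = (
    "#SBATCH --time=_MAX_TIME_\n"
    "export OMP_NUM_THREADS=_OMP_NUM_THREADS_\n"
    "CORES=_NUM_CORES_\n"
)
print(render_template(template, queue))
```

Values are converted with str(), so in this sketch a boolean would be rendered as True or False; string and integer values are the safer choice for template parameters.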
Notes
The format of job scripts varies by system. Refer to the system’s user manual.
Some systems may require additional information such as account group specifications.
Variable replacement is case-insensitive, but it is recommended to write in uppercase for readability.