Configuration files
Overview
At the core of every glurmo simulation there is a settings directory, .glurmo, which contains three configuration files: settings.json, script_template, and slurm_template. We will start by going through each of these files in turn.
settings.json
settings.json is the file in which you specify the general settings of your simulation as well as the specific parameters for the script and the slurm script. We will go over each of these in turn, but first, an important note: you cannot use comments in settings.json! Unfortunately this seems to break the JSON parser shipped with Go. I’m hoping to fix this when I have time, or simply switch to a different file format like .toml. Sorry about this!
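To see why comments are a problem, here is a minimal sketch using Go's standard encoding/json package. That glurmo reads settings.json with this exact package is an assumption, and the helper name parses is made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// parses reports whether Go's standard JSON decoder accepts the input.
// (Hypothetical helper; not part of glurmo.)
func parses(input string) bool {
	var v map[string]interface{}
	return json.Unmarshal([]byte(input), &v) == nil
}

func main() {
	plain := `{"general": {"n_sims": 10}}`
	commented := `{
  // number of simulations
  "general": {"n_sims": 10}
}`
	fmt.Println(parses(plain))     // true
	fmt.Println(parses(commented)) // false: the decoder rejects the comment
}
```

If you need to annotate a setting, one workaround is to keep the notes in a separate file next to settings.json.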
General settings
General settings are stored under the “general” sub-object of the overall settings object. Right now, there are only two things you need to specify here: the number of simulations (“n_sims”) and the simulation id (“id”). The “id” setting should be unique, since this is how glurmo identifies the simulations associated with this study. If you re-use ids, bad things could happen when you try to cancel jobs, because glurmo cannot tell the studies apart.
Script settings
Script settings are stored under the “templates” sub-object of the overall settings object. Entries here represent parameters that you may want to vary across your simulation scripts or slurm scripts. For example, this settings file specifies the number of data points to simulate (N) as well as the number of CPUs to use in the simulation study, among other things.
There are two settings that must be specified under script settings: “script_extension” and “result_extension.” The first tells glurmo what kind of scripts you’re running. In this case, we’re running our simulations with R, so we’re using the “.R” extension. The second tells glurmo the file extension for the results, i.e. how we’ll store the results of a simulation. In this case, we’re storing our results as .RData files, so the “result_extension” is “.RData”.
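As a rough sketch of the layout described so far, the snippet below embeds a possible settings.json and decodes it in Go. The Settings struct, the load helper, the values for “n_sims” and “id”, and the “n_cpus” key are all assumptions made for illustration. Only the key names “general”, “n_sims”, “id”, “templates”, “script_extension”, “result_extension”, and N come from the text above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Settings mirrors the two sub-objects described above.
// This struct is illustrative; it is not glurmo's own type.
type Settings struct {
	General struct {
		NSims int    `json:"n_sims"`
		ID    string `json:"id"`
	} `json:"general"`
	Templates map[string]interface{} `json:"templates"`
}

// example is a hypothetical settings.json (note: no comments inside the JSON).
const example = `{
  "general": {"n_sims": 10, "id": "example_study"},
  "templates": {
    "script_extension": ".R",
    "result_extension": ".RData",
    "N": 100,
    "n_cpus": 1
  }
}`

// load decodes a settings document into the illustrative struct.
func load(raw string) (Settings, error) {
	var s Settings
	err := json.Unmarshal([]byte(raw), &s)
	return s, err
}

func main() {
	s, err := load(example)
	if err != nil {
		panic(err)
	}
	fmt.Println(s.General.NSims, s.General.ID)
	fmt.Println(s.Templates["script_extension"], s.Templates["result_extension"])
}
```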
script_template
The script_template file serves as a template for, you guessed it, your scripts. A script template is a mixture of code in a certain language (in this case, R) and templating markup (denoted by pairs of curly brackets, i.e. {{.parameter}}). The key utility of this file is that for each simulation, the templating markup will be replaced by its corresponding parameter in the “templates” section of settings.json. For example, {{.N}} will be replaced with 100 in this particular simulation.
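The substitution can be sketched with Go's standard text/template package, whose syntax matches the {{.N}} notation. That glurmo renders templates in exactly this way is an assumption; the render helper and the two-line R snippet are made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// render fills a template string with the given parameters.
// (Hypothetical helper; not part of glurmo.)
func render(tmpl string, params map[string]interface{}) (string, error) {
	t, err := template.New("script").Parse(tmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, params); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	// A tiny stand-in for script_template: R code with one placeholder.
	script := "n <- {{.N}}\nx <- rnorm(n)\n"
	out, err := render(script, map[string]interface{}{"N": 100})
	if err != nil {
		panic(err)
	}
	fmt.Print(out) // first line becomes: n <- 100
}
```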
If you could only use a static set of variables, glurmo would not be very useful, since it would just create and run the same script a certain number of times. This might yield different results, but it wouldn’t be reproducible. That is why glurmo makes a couple of “script specific” variables available to you, even though they aren’t specified in settings.json. These variables are index (see line 1) and results_path (see line 33). index captures the number of the script you’re in, and is zero-indexed. So for the first script, it takes on the value of 0; for the second, 1; and so on. results_path is the path that you should use to save the result of the current simulation. What you save and how you save it is up to you, but you must create a file at the results path with the result extension from settings.json; otherwise, glurmo will have no way of knowing that this simulation has completed. Note that this can be as simple as creating an empty file with whatever extension you choose.
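The completion rule above can be illustrated with a small sketch: a simulation counts as finished once a file exists at its results path. The file naming in resultsPath is a guess (the real results_path format is not specified here), and the helpers completed and demo are hypothetical, not glurmo's implementation.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// resultsPath builds an assumed results path for the simulation at `index`.
// The "result_<index><ext>" naming is hypothetical.
func resultsPath(dir string, index int, ext string) string {
	return filepath.Join(dir, fmt.Sprintf("result_%d%s", index, ext))
}

// completed returns the zero-based indices whose results file exists.
func completed(dir string, nSims int, ext string) []int {
	var done []int
	for i := 0; i < nSims; i++ {
		if _, err := os.Stat(resultsPath(dir, i, ext)); err == nil {
			done = append(done, i)
		}
	}
	return done
}

// demo marks simulation 1 of 3 as finished by writing an empty file.
func demo() []int {
	dir, err := os.MkdirTemp("", "glurmo-sketch")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)
	// An empty file with the right extension is enough.
	if err := os.WriteFile(resultsPath(dir, 1, ".RData"), nil, 0644); err != nil {
		panic(err)
	}
	return completed(dir, 3, ".RData")
}

func main() {
	fmt.Println(demo()) // [1]
}
```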
slurm_template
slurm_template is analogous to script_template, but for templating your slurm submissions. Note again that we use templating markup to substitute in parameters from settings.json, and that once again glurmo makes certain script specific variables available to you. These are:
job_id: this is the id of the specific simulation. Note that you must set the job name to {{.job_id}} for glurmo to be able to properly manage your simulations.
error_path: this is the path to the error output file for this simulation, which will be /absolute/path/to/dir/slurm_errors/error___{index}
output_path: this is the path to the general output file for this simulation, which will be /absolute/path/to/dir/slurm_out/output___{index}
path_to_script: this is the absolute path to the script for this simulation, which will be /absolute/path/to/dir/scripts/script_{index}{extension}
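Putting these variables together, a slurm template might look like the one embedded in the sketch below, rendered here with Go's text/template. The template body, the job_id format, the n_cpus parameter, and the helpers slurmVars and renderSlurm are assumptions for illustration. Only the four variable names and the path layouts come from the list above.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
	"text/template"
)

// slurmTemplate is a hypothetical slurm_template. The job name must be
// {{.job_id}} so that glurmo can manage the job.
const slurmTemplate = `#!/bin/bash
#SBATCH --job-name={{.job_id}}
#SBATCH --output={{.output_path}}
#SBATCH --error={{.error_path}}
#SBATCH --cpus-per-task={{.n_cpus}}

Rscript {{.path_to_script}}
`

// slurmVars builds the per-simulation variables using the path layouts
// listed above. The job_id format and n_cpus value are made up.
func slurmVars(dir string, index int, ext string) map[string]interface{} {
	return map[string]interface{}{
		"job_id":         fmt.Sprintf("example_study_%d", index),
		"error_path":     filepath.Join(dir, "slurm_errors", fmt.Sprintf("error___%d", index)),
		"output_path":    filepath.Join(dir, "slurm_out", fmt.Sprintf("output___%d", index)),
		"path_to_script": filepath.Join(dir, "scripts", fmt.Sprintf("script_%d%s", index, ext)),
		"n_cpus":         1,
	}
}

// renderSlurm fills the template for one simulation.
func renderSlurm(dir string, index int, ext string) (string, error) {
	t, err := template.New("slurm").Parse(slurmTemplate)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, slurmVars(dir, index, ext)); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := renderSlurm("/absolute/path/to/dir", 0, ".R")
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```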
Conclusion
If this all sounds somewhat abstract right now, that’s fine; it will become much clearer in the next section when we set up and run this simulation. The key takeaway is that using just these three files, we can specify the script, slurm script, and parameter settings of a simulation. This might seem like overkill right now, because it is: this particular example could be run just as well using a job array. But there are still a few benefits to using glurmo even in this simple setting, as we’ll see in the next section.