`glurmo` by example

Version 0.1

A brief guide to using the `glurmo` command line utility

Setting up

Setting up the simulation

In order to run the simulations, we must set up the directory. This entails glurmo creating the actual scripts and slurm scripts, as well as setting up the general directory structure. To set up the directory, ensure your current directory is simple-lm, and then run:

$ glurmo -s .

Note that you can also run setup from a different directory, as the final argument is simply the path to whichever directory you’d like to run glurmo on. If you run ls, you should see the following new sub-directories: results, scripts, slurm, slurm_errors, and slurm_out. We’ll go over each directory in turn.
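If you'd rather not eyeball the ls output, a small shell function can confirm that all five sub-directories are there. This is only a sketch: check_layout is a hypothetical helper written for this guide, not something glurmo provides, and it assumes nothing beyond the directory names listed above.

```shell
#!/bin/sh
# check_layout is a hypothetical helper, not part of glurmo.
# It reports whether each sub-directory named in this guide exists
# under the directory given as the first argument.
check_layout() {
    dir="$1"
    status=0
    for sub in results scripts slurm slurm_errors slurm_out; do
        if [ -d "$dir/$sub" ]; then
            echo "ok       $sub"
        else
            echo "missing  $sub"
            status=1
        fi
    done
    # Non-zero exit status if anything was missing
    return "$status"
}
```

After running setup, calling check_layout . from inside simple-lm should report "ok" for every entry.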

results, slurm_errors, and slurm_out

These directories will store various outputs from your simulation studies. Right now, they should be empty. We’ll discuss these more in the next section, though you can probably guess what will be stored in each one.

scripts

The scripts directory is a bit more interesting. As of now, it should contain five scripts: script_0.R through script_4.R. Let’s take a look at script_2.R:

set.seed(24601 + 2)

#----------------------------------
# General settings
#----------------------------------
N = 100
beta_0 = 1
beta_1 = 2

#----------------------------------
# X
#----------------------------------
X_sigma = 2
X_mu = 0
X = rnorm(N, X_mu, X_sigma)

#----------------------------------
# y
#----------------------------------
epsilon_sigma = 1
epsilon = rnorm(N, 0, epsilon_sigma)
y = beta_0 + beta_1 * X + epsilon


#----------------------------------
# Model
#----------------------------------
sim_data = data.frame(y = y, X = X)
lm_fit = lm(y ~ X, data = sim_data)



saveRDS(lm_fit, "/home/oshern/Projects/glurmo-examples/simple-lm/results/results___2.RData")

As promised, the template markup has been replaced with the parameters. Further, note that the {{.index}} markup was replaced with “2”, since this is the script with index 2 (the scripts are numbered from 0, so it is the third of the five). Similarly, the {{.results_path}} markup has been replaced by an absolute path into the current directory that should end in “/results/results___2”, followed by the {{.result_extension}} “.RData”. We now have R scripts that are ready to be run, which we will do in the next section.
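To get a feel for what the substitution does, here is a rough imitation using sed. This is not how glurmo is implemented. It only reproduces the visible effect for a single placeholder. The template line set.seed(24601 + {{.index}}) is an assumption inferred from the rendered script above, and render_index is a hypothetical helper.

```shell
#!/bin/sh
# Illustrative only: mimic the effect of replacing {{.index}} in a
# template line. render_index is a hypothetical helper, and this is
# not glurmo's actual templating code.
render_index() {
    # $1: a line containing the {{.index}} placeholder
    # $2: the index to substitute
    printf '%s\n' "$1" | sed "s/{{\.index}}/$2/g"
}

# Render the (assumed) seed line for each of the five scripts.
for i in 0 1 2 3 4; do
    render_index 'set.seed(24601 + {{.index}})' "$i"
done
```

The line printed for index 2 matches the first line of script_2.R shown above. Each script gets a different seed because the index is added to the same base value.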

slurm

The slurm directory stores the slurm scripts for each simulation. This is quite analogous to the scripts directory. Let’s take a look at slurm_2:

#!/bin/sh

#SBATCH --job-name=simple_lm___2
#SBATCH --time=00:01:00
#SBATCH --mail-user=your.email@university.edu
#SBATCH --mail-type=END,FAIL
#SBATCH --mem=4g
#SBATCH --cpus-per-task=1
#SBATCH -e "/home/oshern/Projects/glurmo-examples/simple-lm/slurm_errors/error___2"
#SBATCH -o "/home/oshern/Projects/glurmo-examples/simple-lm/slurm_out/output___2"

R CMD BATCH --no-save --no-restore /home/oshern/Projects/glurmo-examples/simple-lm/scripts/script_2.R /home/oshern/Projects/glurmo-examples/simple-lm/slurm_out/output___2.Rout

Just as you might expect, the template markup has been replaced with the simulation settings or script-specific variables.
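Since each R script should have a slurm script with the same index, a quick consistency check is easy to write. The sketch below assumes only the naming seen in this example (scripts/script_N.R and slurm/slurm_N). check_pairs is a hypothetical helper, not a glurmo command.

```shell
#!/bin/sh
# check_pairs is a hypothetical helper, not part of glurmo.
# For every scripts/script_N.R under the given directory, it checks
# that a matching slurm/slurm_N file exists.
check_pairs() {
    dir="$1"
    status=0
    for script in "$dir"/scripts/script_*.R; do
        [ -e "$script" ] || continue        # glob matched nothing
        name=$(basename "$script" .R)       # e.g. script_2
        index=${name#script_}               # e.g. 2
        if [ ! -f "$dir/slurm/slurm_$index" ]; then
            echo "no slurm script for $name.R"
            status=1
        fi
    done
    # Non-zero exit status if any script lacks a partner
    return "$status"
}
```

Running check_pairs . from inside simple-lm should print nothing if setup produced a slurm script for every R script.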

Conclusions

We are now ready to run the simulations themselves, which we will do in the next section. But first I think it’s worth highlighting a strength of glurmo, even in a simple setting like this. The nice thing about glurmo is that it makes it extremely easy to produce simulation studies that are well-organized and have a common structure. This is particularly useful when organizing the outputs, as we’ll see in the next section. It’s also worth emphasizing how reusable much of this is: we could very easily spin up an entirely different simulation study with the same slurm_template, so long as the settings.json file still defines the variables the template uses. I do this all the time in my own work!

Last updated on 2 Sep 2024
Published on 2 Sep 2024