You can submit scripts to run on Roar Collab. Each submitted script is called a job, and you submit a job by providing a batch script to the SLURM scheduler. This script contains the resource specifications that your program needs in order to run.
Step 1: Create a job submit script.
Below is an example of such a file, called job_submit.sh.
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --cpus-per-task=4
#SBATCH --time=1:00:00
#SBATCH --mem=1GB
#SBATCH --job-name=job_test
#SBATCH --output=output_%j.log
#SBATCH --mail-user=<psu-id>@psu.edu
#SBATCH --mail-type=BEGIN,END,FAIL
# User specific aliases and functions
## modules
module load r/4.3.2
# by default, the job starts in the directory you ran sbatch from;
# change to the folder that holds the code to run, usually
# the one where you put your .sh file
cd "$HOME/work/code/job_example/"
# Body of code (include here scripts to run)
R --file="./script.R"
# Finish up
echo " "
echo "Job Ended at $(date)"
echo " "
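Inside a running job, Slurm exports environment variables that describe the allocation, and logging them next to the "Job Ended" message can make the output file easier to interpret later. The following is a minimal sketch, not part of the original script: describe_job is a hypothetical helper name, and the fallback values after ":-" are assumptions that only apply when the snippet is run outside of a job.

```shell
#!/bin/bash
# Print a one-line summary of the current allocation.
# SLURM_JOB_ID, SLURM_CPUS_PER_TASK and SLURM_SUBMIT_DIR are exported by
# Slurm inside a job; the ":-" fallbacks only apply outside of a job.
describe_job() {
    echo "job=${SLURM_JOB_ID:-none} cpus=${SLURM_CPUS_PER_TASK:-1} submitted_from=${SLURM_SUBMIT_DIR:-$PWD}"
}

describe_job
```

Calling describe_job near the top of job_submit.sh would record the job ID and CPU count in output_%j.log.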
Configuration Parameters
- #SBATCH --nodes=1: Specifies the number of compute nodes required for the job. In this case, it is set to 1.
- #SBATCH --cpus-per-task=4: Specifies the number of CPUs allocated per task. In this case, it is set to 4.
- #SBATCH --time=1:00:00: Specifies the maximum runtime for the job. In this case, it is set to 1 hour.
- #SBATCH --mem=1GB: Specifies the amount of memory allocated per node for the job. In this case, it is set to 1GB.
- #SBATCH --job-name=job_test: Specifies the name of the job. Replace job_test with the desired name.
- #SBATCH --output=output_%j.log: Specifies the output file for the job. It uses %j as a placeholder for the job ID.
- #SBATCH --mail-user=<psu-id>@psu.edu: Specifies the email address to receive notifications about the job. Replace <psu-id> with your own username.
- #SBATCH --mail-type=BEGIN,END,FAIL: Specifies the types of email notifications to receive. In this case, notifications are sent when the job begins, when it ends, and if it fails.
The module load line is necessary to load R into the Roar Collab session. I recommend always specifying a software version so that every run uses the same one. In this case, we are using R version 4.3.2. Use module spider "r" to get a list of all available versions.
R script example
Below is the R script, called script.R. It assumes the dplyr, ggplot2, and tidyr packages are installed.
# Example R script (script.R) using the mtcars dataset
# Load necessary libraries
library(dplyr)
library(ggplot2)
# Use the mtcars dataset
data <- mtcars
# Example analysis using dplyr across
data_summary <- data %>%
  summarise(across(where(is.numeric),
                   list(mean = mean, sd = sd, min = min,
                        Q1 = ~quantile(.x, 0.25),
                        median = median,
                        Q3 = ~quantile(.x, 0.75),
                        max = max),
                   .names = "{col}_{fn}")) %>%
  tidyr::pivot_longer(cols = everything(),
                      names_to = c("variable", ".value"),
                      names_sep = "_")
# Plot example: Create a histogram for the mpg variable
plot <- ggplot(data, aes(x = mpg)) +
  geom_histogram(binwidth = 2, fill = "skyblue", color = "black") +
  labs(title = "Histogram of MPG in mtcars", x = "Miles Per Gallon", y = "Frequency")
# Save the output
write.csv(data_summary, "summary_output.csv")
ggsave("histogram_output.png", plot)
Step 2: Submit your job
In your Roar Collab session, go to the folder containing the job_submit.sh file you just created, then type
sbatch job_submit.sh
Monitoring and managing your job
Checking the status of your job
To see the status of submitted jobs, type
squeue
To list all the jobs belonging to a particular user:
squeue -u <psu-id>
To get all the details about a particular job (full status):
scontrol show jobid -dd JOB_ID
Automated monitoring
If you want to keep a terminal window open and check your jobs’ status every x seconds, run
watch --interval=x squeue -u <psu-id>
To exit the monitoring screen and go back to the usual prompt, press CTRL+c.
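When many jobs are queued, a per-state count can be easier to read than the full squeue listing. The sketch below assumes squeue is called with -h (no header) and -o "%T" (print only the job state). count_states is a hypothetical helper name, and the demo input is made up so the function can be tried without a scheduler.

```shell
#!/bin/bash
# Read one job state per line on stdin and print "STATE=count" lines,
# sorted alphabetically by state.
count_states() {
    sort | uniq -c | awk '{print $2 "=" $1}'
}

# Demo with made-up states; on the cluster, pipe in real ones instead:
#   squeue -h -u <psu-id> -o "%T" | count_states
printf 'RUNNING\nPENDING\nPENDING\n' | count_states
```

The same pipeline can be wrapped in watch if a refreshing summary is preferred over a one-off count.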
Modifying jobs on the queue
It is possible to alter the requested resources of a job without losing your place in the queue. For example, to change the walltime to 12 hours, type
scontrol update jobid=JOB_ID TimeLimit=12:00:00
TimeLimit takes the general form DD-HH:MM:SS.
Remember, you can only modify jobs that are in the queue and have not yet started running.
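Writing the DD-HH:MM:SS value by hand is easy to get wrong for runtimes longer than a day. The following sketch converts a whole number of hours into that format. hours_to_timelimit is a hypothetical helper name, and the function assumes a non-negative integer argument.

```shell
#!/bin/bash
# Convert a whole number of hours into Slurm's DD-HH:MM:SS time format.
hours_to_timelimit() {
    local hours="$1"
    printf '%d-%02d:00:00\n' "$((hours / 24))" "$((hours % 24))"
}

hours_to_timelimit 12    # prints 0-12:00:00
hours_to_timelimit 36    # prints 1-12:00:00
```

The result could then be passed along as scontrol update jobid=JOB_ID TimeLimit="$(hours_to_timelimit 12)".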
Cancelling jobs
To cancel a job, type scancel JOB_ID. You can find the JOB_ID using squeue.
For serially running jobs, you can use job dependencies:
- Submit the first job
JOB_ID_1=$(sbatch job1.sh | awk '{print $4}')
- Submit the second job to start after the first job completes successfully
JOB_ID_2=$(sbatch --dependency=afterok:$JOB_ID_1 job2.sh | awk '{print $4}')
- Repeat as necessary.
Alternatively, if the jobs do not depend on one another, create an array of Slurm jobs and submit them all at once:
sbatch --array=1-100 job_array.sh
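The awk '{print $4}' step in the dependency commands works because sbatch confirms a submission with a line of the form "Submitted batch job <id>", so the job ID is the fourth word. A minimal sketch of that parsing step follows. extract_job_id is a hypothetical helper name and the job ID in the demo is made up.

```shell
#!/bin/bash
# Pull the job ID (fourth field) out of sbatch's confirmation line.
extract_job_id() {
    echo "$1" | awk '{print $4}'
}

# Made-up confirmation line, in the format sbatch prints on success.
extract_job_id "Submitted batch job 123456"    # prints 123456
```

sbatch also offers a --parsable option that prints only the job ID (followed by the cluster name on multi-cluster setups), which may remove the need for awk altogether.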
Here is a bash script you can use to submit a chain of jobs:
# File name: run_100jobs_for_me.sh
# the first job you submit
job=$(sbatch myrun1.sh | awk '{print $4}')
# submission of jobs 2 through 100, each waiting for the previous one
for i in {2..100}
do
    job_next=$(sbatch --dependency=afterok:$job myrun$i.sh | awk '{print $4}')
    job=$job_next
done
You can run the code above using the following command:
bash run_100jobs_for_me.sh
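Before queuing a long chain, it can help to check the chaining logic without touching the scheduler. The sketch below wraps the same loop in a function that takes the submit command as an argument, so a stand-in can be used for a dry run. fake_sbatch and submit_chain are hypothetical names, the job IDs the stand-in prints are made up, and the myrun<i>.sh naming is carried over from the script above.

```shell
#!/bin/bash
# Stand-in for sbatch, for dry runs only: logs its arguments to stderr and
# prints a made-up job ID (1000 + the number in the script name).
fake_sbatch() {
    echo "would run: sbatch $*" >&2
    local script n
    for script in "$@"; do :; done             # keep the last argument
    n=$(printf '%s' "$script" | tr -cd '0-9')  # myrun3.sh -> 3
    echo "Submitted batch job $((1000 + ${n:-0}))"
}

# Submit myrun1.sh .. myrun<count>.sh as an afterok chain using the given
# submit command, then print the ID of the last job in the chain.
submit_chain() {
    local submit="$1" count="$2" job i=2
    job=$("$submit" myrun1.sh | awk '{print $4}')
    while [ "$i" -le "$count" ]; do
        job=$("$submit" --dependency=afterok:"$job" "myrun$i.sh" | awk '{print $4}')
        i=$((i + 1))
    done
    echo "$job"
}

submit_chain fake_sbatch 3    # prints 1003; swap in sbatch on the cluster
```

The lines logged to stderr show which dependency each submission would carry, so a mistake in the chain is visible before any job is queued.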
Customizing running order with dependency conditions
- The job is scheduled only if job JOB_ID completes successfully (exits without errors).
afterok:JOB_ID
- The job is scheduled only if job JOB_ID exits with errors.
afternotok:JOB_ID
- The job is scheduled once job JOB_ID has ended, with or without errors.
afterany:JOB_ID
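Each of these conditions is passed to sbatch through the --dependency option, and several job IDs can be joined with colons after a single condition. The following sketch builds that option and rejects conditions other than the three listed above. dependency_flag is a hypothetical helper name and the job IDs in the demo are made up.

```shell
#!/bin/bash
# Build a --dependency flag from a condition and one or more job IDs.
dependency_flag() {
    local condition="$1"
    shift
    case "$condition" in
        afterok|afternotok|afterany) ;;
        *) echo "unknown condition: $condition" >&2; return 1 ;;
    esac
    if [ "$#" -lt 1 ]; then
        echo "at least one job ID is required" >&2
        return 1
    fi
    local ids
    ids=$(printf ':%s' "$@")      # e.g. ":101:102"
    echo "--dependency=${condition}${ids}"
}

dependency_flag afterany 101 102    # prints --dependency=afterany:101:102
```

On the cluster, the output could be used as sbatch $(dependency_flag afterany "$JOB_ID_1") job2.sh.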