Job Manager

When users log on to the system, they land on the frontend nodes, which give access to system resources. Users should not run any workloads on these machines: they are shared by all users, and processes running on them slow down everyone else's work.

All HPC jobs running on the system must be executed on the compute nodes by submitting a script to the SCAYLE job manager.

It is essential that users optimize the use of system resources to ensure efficient and equitable performance. In addition, proper use of the SCAYLE job manager not only facilitates task distribution, but also maximizes the utilization of available resources, minimizing wait times and avoiding overloading front-end nodes.

By following these guidelines, a collaborative and efficient work environment is ensured, benefiting all users of the system.

The job manager, or queue manager, is the system in charge of sending jobs to the compute nodes each user has access to. It controls their execution and prevents several jobs from sharing the same resources, which would increase their execution times. The manager used by SCAYLE is SLURM.

The most commonly used SLURM commands are:

salloc: Creates an interactive session on the compute nodes, so that code can be compiled and tested on nodes with the same characteristics as those on which it will later run.

For example, by executing the following command:

[user@frontend11 ~]$ salloc --ntasks=16 --time=60 --partition=genoa

The job manager will grant access to one of the compute nodes of the partition called “genoa” (--partition=genoa) for a maximum of 60 minutes (--time=60), and will let us use 16 cores (--ntasks=16) to compile or test our code.

sbatch: Sends the script of the job we want to run to the job manager.

For example:

[user@frontend11 ~]$ sbatch run_openfoam.sh

This sends the execution script for the OpenFOAM fluid dynamics software to the job manager.
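
On success, sbatch prints a line of the form “Submitted batch job <JOBID>”. When a script needs that number later, for example to pass it to scancel, it can be extracted from the message. The following is a minimal sketch; the function name extract_job_id is an illustrative assumption and not part of SLURM, and the sample line reuses the JOBID from this guide.

```shell
#!/bin/bash
# Read the text printed by sbatch on standard input and print only the
# numeric job ID. Lines that do not match the expected message are ignored.
extract_job_id() {
    sed -n 's/^Submitted batch job \([0-9][0-9]*\).*$/\1/p'
}

# Demonstration with a sample message instead of a real submission.
echo "Submitted batch job 94631" | extract_job_id
```

In a real session the function would be fed by the command itself, for instance jobid=$(sbatch run_openfoam.sh | extract_job_id).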

squeue: Displays information about the jobs that are queued or in progress, including their status.

[user@frontend11 ~]$ squeue 
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
             94631     genoa   JOB_OF     user  R    3:47:29      2 cn[7026-7027]

In the example above, the user has a single running job, with a “JOBID” of 94631. The columns have the following meaning:

JOBID: A unique number that the job manager assigns to each job it manages.
PARTITION: The group of servers on which the job is running.
NAME: The name of the job, as defined in the submission script.
USER: The owner of the job.
ST: The status of the job, in this case “R” (running). Another common status is “PD” (pending), which indicates that the job is waiting to be executed.
TIME: How long the job has been running.

Finally, the “NODES” and “NODELIST” columns give the number of servers used and their names.

For security and privacy reasons, users can only see information about their own jobs and cannot access information about other users' jobs.
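
When many jobs are queued, a count per status can be easier to read than the full listing. The sketch below assumes the default column layout shown above, in which ST is the fifth column, and job names without spaces. The function names and the second, pending job in the sample data are illustrative assumptions.

```shell
#!/bin/bash
# Count jobs per status from squeue-style text read on standard input.
# The header line is skipped; the status is taken from the 5th column.
count_states() {
    awk 'NR > 1 && NF >= 5 { n[$5]++ } END { for (s in n) print s, n[s] }' | sort
}

# Sample data in the format of the listing above
# (the pending job 94632 is hypothetical).
sample_output() {
    cat <<'EOF'
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
             94631     genoa   JOB_OF     user  R    3:47:29      2 cn[7026-7027]
             94632     genoa   JOB_OF     user PD       0:00      2 (Priority)
EOF
}

sample_output | count_states
```

On the frontends the same function can be fed by the real command: squeue | count_states.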

scancel: Cancels a job that is pending or in progress.

Following the example above, to cancel the job with “JOBID” 94631 we would run:

[user@frontend11 ~]$ scancel 94631

The first step in submitting a job to the job manager is to write a submission script containing two types of lines: directives for the job manager and Linux commands.

Linux commands are interpreted by the shell named on the first line of the script (#!/bin/bash).
The directives for the job manager are placed at the beginning of the script. In SLURM they are lines starting with the string “#SBATCH”, followed by one of the available options. The manager processes these directives when the script is submitted with the sbatch command. They tell the manager which resources the job needs, so that the compute nodes run it as the user intends.

For example, the following batch script:

#!/bin/bash 
#SBATCH --ntasks=32 
#SBATCH --job-name=hello_world 
#SBATCH --mail-user=email@scayle.es
#SBATCH --mail-type=ALL 
#SBATCH --output=hello_world_%A_%a.out 
#SBATCH --error=hello_world_%A_%a.err 
#SBATCH --partition=sapphire 
#SBATCH --qos=normal 
#SBATCH --time=0-00:05:00 

source /soft/sapphire/intel/oneapi/setvars.sh

srun -n $SLURM_NTASKS hello_world.sh

Line 1, as detailed above, specifies the shell that will execute the Linux commands in the script.

All lines beginning with #SBATCH are directives interpreted by the job manager. They are explained below, together with other commonly used options:

--ntasks --> Number of tasks for the job. Each task uses one core by default, so in this case 32 cores are requested.
--nodes --> Number of nodes to use, if the job needs more than one.
--gres=gpu:<number> --> Number of graphics cards to use per node.
--job-name --> Name assigned to the job.
--mail-user --> E-mail address to which notifications about the job will be sent.
--mail-type --> Defines when an e-mail is sent to the user. In this case, “ALL” sends one at the start of execution, at the end of execution and if the job is cancelled.
--output --> Standard output file. If no error file is defined, standard output and error output are written together to this file.
--error --> Error output file.
--partition --> Partition to which the job is sent.
--qos --> QOS with which the job is submitted. The available QOS are listed later in this guide.
--time --> Time limit for the job (D-HH:MM:SS).

IMPORTANT: For the job to work correctly, the #SBATCH --time=D-HH:MM:SS parameter must be added to the script, where D is days, HH hours, MM minutes and SS seconds.
The time set with “--time” never takes precedence over the maximum execution time associated with the QOS of the job.
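
To make the D-HH:MM:SS format concrete, the following sketch converts a time limit into seconds and rejects strings that do not follow it. The function name walltime_to_seconds is an illustrative assumption. SLURM itself also accepts other notations, such as the plain number of minutes used with salloc earlier, which this check deliberately does not cover.

```shell
#!/bin/bash
# Convert a time limit written as D-HH:MM:SS into seconds.
# Prints the number of seconds, or returns 1 if the format does not match.
walltime_to_seconds() {
    local spec=$1
    if [[ $spec =~ ^([0-9]+)-([0-9]{2}):([0-9]{2}):([0-9]{2})$ ]]; then
        local d=${BASH_REMATCH[1]} h=${BASH_REMATCH[2]}
        local m=${BASH_REMATCH[3]} s=${BASH_REMATCH[4]}
        # 10# forces base 10 so that values such as "08" are not read as octal.
        echo $(( 10#$d * 86400 + 10#$h * 3600 + 10#$m * 60 + 10#$s ))
    else
        echo "invalid time limit: $spec (expected D-HH:MM:SS)" >&2
        return 1
    fi
}

# The five-minute limit used in the example script above.
walltime_to_seconds 0-00:05:00
```

For the example script, 0-00:05:00 corresponds to 300 seconds.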

Each QOS (Quality of Service) customizes parameters such as the maximum time a job can run, the maximum number of cores a user can request, or which users are allowed to submit jobs with it.

By default, users are subject to the limits of the default QOS. To request access to a different QOS, the user should contact the support staff.

Below are the limits of the available QOS, where:

MaxWall: Maximum time that can be requested when submitting a job (days-hours:minutes:seconds).
MaxTRESPU: Maximum number of cores that a user can have allocated simultaneously.
MaxJobsPU: Maximum number of jobs that a user can have running concurrently.

Name	Priority	MaxWall	MaxTRESPU	MaxJobsPU
normal	100	5-00:00:00	cpu=512	50
long	100	15-00:00:00	cpu=256
xlong	100	30-00:00:00	cpu=128
xxlong	100	45-00:00:00	cpu=64

If no QOS is specified, jobs are submitted with the normal QOS.
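
The table trades run time against core count: the longer the MaxWall, the fewer cores a user can hold. As a rough sketch of how to choose, the function below returns the first QOS whose MaxWall covers a requested number of whole days, using the values copied from the table. The function name suggest_qos is an illustrative assumption, and access to a QOS other than normal still has to be requested from the support staff.

```shell
#!/bin/bash
# Suggest the first QOS whose MaxWall (in days, from the table above)
# covers the requested number of whole days.
suggest_qos() {
    local days=$1
    local entry name max
    for entry in normal:5 long:15 xlong:30 xxlong:45; do
        name=${entry%%:*}   # text before the colon
        max=${entry##*:}    # text after the colon
        if (( days <= max )); then
            echo "$name"
            return 0
        fi
    done
    echo "no QOS allows $days days" >&2
    return 1
}

# A 12-day job does not fit in "normal" (5 days) but fits in "long".
suggest_qos 12
```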

When the same job must be repeated several times, varying only the value of some parameter, the job manager can automate the task. Jobs of this type are called array jobs.

To submit an array job, use the --array option of the sbatch command. For example, from the command line:

[user@frontend11 ~]$ sbatch ... --array=1-20 ... test.sh

This would submit 20 executions of the test.sh program, which may run simultaneously.

To include the option in the script itself, add it alongside the other job manager options:

#SBATCH --output=hello_world_%A_%a.out
#SBATCH --error=hello_world_%A_%a.err
#SBATCH --partition=sapphire
#SBATCH --qos=normal
#SBATCH --array=1-20

Because of the limits of the queuing system, the number of array tasks that run at the same time can be capped by appending %<number> to the option, as in --array=1-20%<number>:

#SBATCH --array=1-20%4

Here <number> is the maximum number of tasks we want running simultaneously (4 in this example).

This does not guarantee that the tasks will start one right after another, since that depends on machine load and priorities.
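
Inside an array job, SLURM sets the variable SLURM_ARRAY_TASK_ID to the index of the current task, which is how each execution can pick its own parameter value. The sketch below shows one way to do this. The job name, the parameter values and the function param_for_task are illustrative assumptions; the partition and QOS are taken from the earlier example.

```shell
#!/bin/bash
#SBATCH --job-name=array_demo
#SBATCH --array=1-4%2
#SBATCH --partition=sapphire
#SBATCH --qos=normal
#SBATCH --time=0-00:05:00
#SBATCH --output=array_demo_%A_%a.out

# Hypothetical parameter values, one per array task.
params=(0.1 0.2 0.5 1.0)

# Return the parameter for a given 1-based task index
# (bash arrays are 0-based, hence the subtraction).
param_for_task() {
    local id=$1
    echo "${params[$((id - 1))]}"
}

# SLURM sets SLURM_ARRAY_TASK_ID for each task of the array; the
# default of 1 lets the script also be tried outside the job manager.
task_id=${SLURM_ARRAY_TASK_ID:-1}
echo "task $task_id runs with parameter $(param_for_task "$task_id")"
```

In a real job the final echo would be replaced by the program to run, with the selected value passed as an argument.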

Several variables are defined in the job environment when the script is run through the job manager. These variables can be used in the script.

Among the most interesting for regular use are the following:

$SLURM_JOB_ID: Job ID.
$SLURM_JOB_NAME: Job name.
$SLURM_SUBMIT_DIR: Directory from which the job was submitted.
$SLURM_JOB_NUM_NODES: Number of nodes assigned to the job.
$SLURM_CPUS_ON_NODE: Number of cores on the allocated node.
$SLURM_NTASKS: Total number of tasks in the job (one core per task by default).
$SLURM_NODEID: Index of the current node relative to the nodes assigned to the job.
$SLURM_PROCID: Index of the task relative to the job.
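
A common use of these variables is to write a short header at the top of the job output, which helps when reviewing results later. The following is a minimal sketch; the function name job_summary and the “unset” fallback text are illustrative assumptions, added so the script can also be tried outside the job manager.

```shell
#!/bin/bash
# Print a short summary of the job environment.
# Each variable falls back to "unset" when the script runs outside SLURM.
job_summary() {
    echo "job:   ${SLURM_JOB_ID:-unset} (${SLURM_JOB_NAME:-unset})"
    echo "dir:   ${SLURM_SUBMIT_DIR:-unset}"
    echo "nodes: ${SLURM_JOB_NUM_NODES:-unset}"
    echo "tasks: ${SLURM_NTASKS:-unset}"
}

job_summary
```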

Using features and constraints gives users the ability to precisely define their needs, whether it is the interconnect preference of their application, the amount of memory required, or the willingness to use low priority machines. Having a clear understanding of how these features operate and which ones are best suited for your application can maximize performance in the queuing environment.

To select a feature with #SBATCH, use the following parameter:

#SBATCH --constraint=<feature_name>

Where <feature_name> is one of the names listed in the table below.

The following table lists the features (Name column) that can be passed to the “--constraint=” parameter:

Name	Description
intel	Selects only nodes with Intel CPUs
amd	Selects only nodes with AMD CPUs
cascadelake	Selects only nodes with CPUs based on the Cascade Lake architecture
icelake	Selects only nodes with CPUs based on the Ice Lake architecture
sapphirerapids	Selects only nodes with CPUs based on the Sapphire Rapids architecture
broadwell	Selects only nodes with CPUs based on the Broadwell architecture
genoa	Selects only nodes with CPUs based on the Genoa architecture
gpu_v100	Selects only nodes with NVIDIA Tesla V100 GPUs
gpu_a100	Selects only nodes with NVIDIA Tesla A100 GPUs
gpu_h100	Selects only nodes with NVIDIA Tesla H100 GPUs
192GB	Selects only nodes that have max. 192GB of RAM available
256GB	Selects only nodes that have max. 256GB of RAM available
384GB	Selects only nodes that have max. 384GB of RAM available
1024GB	Selects only nodes that have max. 1024GB of RAM available
1536GB	Selects only nodes that have max. 1536GB of RAM available
2048GB	Selects only nodes that have max. 2048GB of RAM available
2434GB	Selects only nodes that have max. 2434GB of RAM available
xeon_6240	Selects only nodes with Intel Xeon 6240 CPUs
xeon_6252	Selects only nodes with Intel Xeon 6252 CPUs
xeon_8358	Selects only nodes with Intel Xeon 8358 CPUs
xeon_e5-2695v4	Selects only nodes with Intel Xeon E5-2695 v4 CPUs
xeon_8462y+	Selects only nodes with Intel Xeon 8462y+ CPUs
epyc_9374f	Selects only nodes with AMD EPYC 9374F CPUs
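
SLURM also allows several features to be combined in one constraint: “&” requires all of the listed features and “|” accepts any of them. The sketch below builds an AND constraint from names in the table. The function name join_features and the script name job.sh are illustrative assumptions, and whether nodes matching a given combination exist on the system should be checked before relying on it.

```shell
#!/bin/bash
# Join feature names with "&" so that a node must offer all of them.
join_features() {
    local IFS='&'
    echo "$*"
}

# Hypothetical combination: Intel nodes that also carry the 384GB feature.
constraint=$(join_features intel 384GB)

# Quote the value on the command line, otherwise the shell treats "&"
# as its background operator. The command is printed, not executed.
echo "sbatch --constraint=\"$constraint\" job.sh"
```

Printing the command first makes it possible to review the constraint before submitting anything.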

Last update: 26/03/2025 10:38