# Job Arrays
Job arrays provide a convenient way to submit and manage large numbers of similar and independent jobs. They are particularly useful when the same application must be executed multiple times with different input files or parameters.
Instead of submitting many individual jobs, a single batch script can be used to launch multiple tasks within one job array.
The following example demonstrates a simple array job:
```bash
#!/bin/bash
#SBATCH --job-name=array_test
#SBATCH --partition=short
#SBATCH --nodes=1
#SBATCH --ntasks=4
#SBATCH --cpus-per-task=2
#SBATCH --output=out_array_%A_%a.out
#SBATCH --error=err_array_%A_%a.err
#SBATCH --array=1-8

# Print the index of this array task
echo "My SLURM_ARRAY_TASK_ID: ${SLURM_ARRAY_TASK_ID}"

srun ./myapp --input input_data_${SLURM_ARRAY_TASK_ID}.inp
```
In this example, 8 array tasks are launched (`--array=1-8`). Each array task is scheduled as an independent job that runs 4 parallel tasks (`--ntasks=4`) with 2 CPUs per task (`--cpus-per-task=2`). The `%A` and `%a` placeholders in the output and error file names expand to the array job ID and the array task index, respectively, so each task writes to its own files.
The environment variable `SLURM_ARRAY_TASK_ID` uniquely identifies each task in the array. It can be used to select different input files or to pass parameters to the application.
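For example, the task index can be turned into a numeric parameter directly in the batch script. The sweep below is a hypothetical illustration (the temperature values and the `290 + 10 * ID` mapping are invented for this example):

```shell
# Hypothetical parameter sweep: map array task IDs 1-8 to
# temperatures 300, 310, ..., 370. SLURM sets SLURM_ARRAY_TASK_ID
# inside a real job; it is set manually here for illustration.
SLURM_ARRAY_TASK_ID=2
TEMP=$(( 290 + 10 * SLURM_ARRAY_TASK_ID ))
echo "Task ${SLURM_ARRAY_TASK_ID} uses temperature ${TEMP}"   # Task 2 uses temperature 310
```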
If you want to reuse the same batch script with different array ranges,
the --array option can be omitted from the script and specified when
submitting the job:
```bash
sbatch --array=1-8 array.job.sh
```
## Defining the Array Range
Job arrays can be defined in several ways:
- Range of indices
- Explicit list of indices
- Range with step size
Example definitions:
```bash
#SBATCH --array=0-15          # Tasks with indices 0-15
#SBATCH --array=1,2,9,22,31   # Tasks with indices 1, 2, 9, 22, 31
#SBATCH --array=1-7:2         # Tasks with indices 1, 3, 5, 7
#SBATCH --array=1-7:2,20      # Tasks with indices 1, 3, 5, 7, 20
```
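The step syntax selects every N-th index in the range. If in doubt, the resulting index set can be previewed with `seq`, which uses the same start/step/end semantics:

```shell
# Indices produced by --array=1-7:2 (start 1, step 2, end 7)
seq -s, 1 2 7    # prints 1,3,5,7
```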
## Limiting Array Concurrency
By default, SLURM may start many array tasks at the same time if resources are available. When running large arrays, it is often desirable to limit how many tasks run concurrently.
This can be done using the `%` modifier:

```bash
#SBATCH --array=1-100%10
```
In this example, the array contains 100 tasks, but at most 10 tasks will run at the same time. Once a task finishes, another task from the array is started.
This is useful when:
- running many small jobs
- avoiding excessive filesystem load
- controlling license usage or external resource limits
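The throttle does not have to live in the script. It can also be supplied at submission time, mirroring the `--array` override shown earlier, and SLURM's `scontrol` can adjust the limit of an already submitted array (here `123456` is a placeholder job ID):

```bash
# Set the throttle at submission time instead of in the script
sbatch --array=1-100%10 array.job.sh

# Raise the limit of an already submitted array to 20 concurrent tasks
scontrol update JobId=123456 ArrayTaskThrottle=20
```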
## Using File Lists Instead of Numeric Indices
Array indices do not need to correspond directly to numeric input files. A common approach is to map the array index to a list of input files.
Example input file list (`input_list.txt`):

```
input_A.dat
input_B.dat
input_C.dat
input_D.dat
```
Batch script:

```bash
#!/bin/bash
#SBATCH --array=1-4

# Select the input file on the line matching this task's index
INPUT=$(sed -n "${SLURM_ARRAY_TASK_ID}p" input_list.txt)

srun ./myapp --input "${INPUT}"
```
Here, each array task reads the line of `input_list.txt` that matches its index. This allows users to process arbitrary datasets without renaming files or enforcing numeric indexing.
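The lookup itself is ordinary shell and can be checked outside SLURM. The snippet below recreates the list from the example and extracts the line for task 3 (the task ID is set manually here; SLURM provides it inside a real job):

```shell
# Recreate the example input list
printf 'input_A.dat\ninput_B.dat\ninput_C.dat\ninput_D.dat\n' > input_list.txt

# Pretend to be array task 3 and pick the matching line
SLURM_ARRAY_TASK_ID=3
INPUT=$(sed -n "${SLURM_ARRAY_TASK_ID}p" input_list.txt)
echo "$INPUT"    # prints input_C.dat
```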
## Managing Array Jobs
The `squeue` command can be used to monitor array jobs. Pending tasks are often displayed as a single aggregated entry, while running tasks appear as separate entries with job IDs of the form `<jobid>_<arrayindex>`.
Example:

```
squeue -u username
JOBID         PARTITION  NAME     USER   ST  TIME  NODES  NODELIST(REASON)
123456_[4-8]  small      example  user1  PD  0:00  1      (Resources)
123456_1      small      example  user1  R   0:17   1      n024
123456_2      small      example  user1  R   0:23   1      n025
123456_3      small      example  user1  R   0:29   1      n025
```
To cancel selected tasks from a job array, use scancel with the appropriate index range:
```bash
scancel 123456_[1-3]
```
This cancels array tasks 1–3 from job array 123456.
To cancel the entire job array, specify only the job ID:
```bash
scancel 123456
```
## Environment Variables
In addition to `SLURM_ARRAY_TASK_ID`, SLURM provides several environment variables describing the job array:
| Variable | Description |
|---|---|
| `SLURM_ARRAY_TASK_ID` | Index of the current array task |
| `SLURM_ARRAY_TASK_COUNT` | Total number of tasks in the array |
| `SLURM_ARRAY_TASK_MIN` | Lowest array index value |
| `SLURM_ARRAY_TASK_MAX` | Highest array index value |
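These variables make it possible to split a fixed amount of work evenly across the array without hard-coding its size. A minimal sketch, assuming 100 work items and the values SLURM would set for task 2 of `--array=1-4` (the variables are assigned manually here for illustration):

```shell
# Values SLURM would set for task 2 of --array=1-4
SLURM_ARRAY_TASK_ID=2
SLURM_ARRAY_TASK_MIN=1
SLURM_ARRAY_TASK_COUNT=4

# Give each task an equal, contiguous slice of 100 work items
CHUNK=$(( 100 / SLURM_ARRAY_TASK_COUNT ))
START=$(( (SLURM_ARRAY_TASK_ID - SLURM_ARRAY_TASK_MIN) * CHUNK + 1 ))
END=$(( START + CHUNK - 1 ))
echo "Task ${SLURM_ARRAY_TASK_ID} processes items ${START}-${END}"   # items 26-50
```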