Job priorities

Demand for HPC resources typically exceeds supply, so the scheduler needs a method for deciding the order in which jobs run. By default, the scheduler allocates resources on a simple first-in, first-out (FIFO) basis. When the cluster is fully occupied, submitted jobs wait in the queue for execution. Waiting jobs are ordered according to their priority attribute (the higher the number, the sooner the job is launched). Rules and policies can change the priority of a job, which is expressed to the scheduler as a single number.

Priority factors

Job priorities are calculated as follows:


Priority = 1,000,000 * Fairshare + \
           1         * Job_age   + \
           345,600   * Partition + \
           1,000,000 * QoS_priority
  • Fairshare: The fairshare factor, ranging from 0 to 1, penalizes users according to the resources consumed by their project's jobs in the last 14 days. The more resources a project's jobs have used in that period, the lower the priority of new jobs for that project.
  • Job_age: The priority of a job increases the longer it has been in the queue. Job age equals the time in seconds the job has been waiting in the queue, up to a maximum of 86,400 seconds (24 hours).
  • Partition: A factor assigned to the job according to the selected partition (see the partitions list). The partition factor values are:
    • 1.0 for the short partition
    • 0.5 for the medium partition
    • 0.0 for the long partition
  • QoS_priority: The quality of service factor prioritizes jobs submitted from active projects, while still allowing projects that have exceeded their duration to use up their allocated billing hours. The QoS factor is 1.0 for active projects and 0.0 for projects that have exceeded their duration.
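
As a worked example, the formula above can be evaluated in a few lines of Python. This is only a sketch of the arithmetic, not the scheduler's actual implementation. The function name `job_priority`, its parameters, and the input values (including the fairshare factor 0.096126) are assumptions chosen for illustration.

```python
# Weights and partition factors copied from the formula and list above.
WEIGHT_FAIRSHARE = 1_000_000
WEIGHT_AGE = 1                # one priority point per second of waiting
WEIGHT_PARTITION = 345_600
WEIGHT_QOS = 1_000_000
MAX_AGE_SECONDS = 86_400      # Job_age stops growing after 24 hours

PARTITION_FACTOR = {"short": 1.0, "medium": 0.5, "long": 0.0}


def job_priority(fairshare, age_seconds, partition, project_active):
    """Return the priority of a waiting job (hypothetical helper)."""
    age = min(age_seconds, MAX_AGE_SECONDS)   # cap the age contribution
    qos = 1.0 if project_active else 0.0      # 0.0 once the project has expired
    total = (
        WEIGHT_FAIRSHARE * fairshare
        + WEIGHT_AGE * age
        + WEIGHT_PARTITION * PARTITION_FACTOR[partition]
        + WEIGHT_QOS * qos
    )
    return round(total)


# Job in the short partition, active project, waiting for 439 seconds,
# with an assumed fairshare factor of 0.096126.
print(job_priority(0.096126, 439, "short", True))
```

With these inputs the contributions are 96,126 (fairshare), 439 (age), 345,600 (partition) and 1,000,000 (QoS). The QoS and fairshare terms dominate, while the age term can add at most 86,400.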

Backfilling

Slurm on the Devana cluster uses the backfill scheduling plugin, which allows lower-priority jobs to run as long as they will finish before a higher-priority job needs the resources. It is therefore very important that users specify their CPU, memory and walltime requirements accurately, to make the best use of the backfilling system.

The following command shows the estimated start time of a job:

squeue --start -j <jobid>
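
The backfill rule described above can be illustrated with a deliberately simplified sketch. It considers only time, whereas the real scheduler also checks which nodes, CPUs and memory are free. The function name `can_backfill` and the example times are assumptions made for illustration.

```python
def can_backfill(now, walltime_limit, reserved_start):
    """Return True if a job started now would, even when it runs up to
    its full walltime limit, end no later than the start time reserved
    for the higher-priority job. All times are in seconds."""
    return now + walltime_limit <= reserved_start


# Assume the higher-priority job is expected to start in 4 hours.
reserved_start = 4 * 3600

print(can_backfill(0, 2 * 3600, reserved_start))    # 2 h request fits the gap
print(can_backfill(0, 12 * 3600, reserved_start))   # 12 h request does not
```

The check uses the requested walltime, not the actual runtime. A job that really needs 2 hours but requests 12 therefore cannot use the 4-hour gap, which is why accurate walltime requests matter.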

Managing priorities

Several commands allow users to view and manage the priorities of submitted jobs, chief among them sprio and sshare. The sprio command shows the priorities (and their components) of waiting jobs. The sshare command can be used to determine the fairshare factor that is used in the job priority calculation.

sprio - jobs scheduling priority information

The sprio command can be used to view the priorities (and their components) of waiting jobs.

Sorting all waiting jobs by their priority


sprio -S -y
   JOBID  PARTITION    PRIORITY      SITE        AGE  FAIRSHARE  PARTITION        QoS
   674582 short        1442165          0        439      96126     345600    1000000
   674520 medium       1427035          0       1724     252511     172800    1000000
   674521 medium       1427033          0       1722     252511     172800    1000000
   674522 medium       1427031          0       1720     252511     172800    1000000
   674502 long         1026833          0       2444      24390          0    1000000
   674528 long          512442          0       1682     510760          0          0
The zero QoS value for job 674528 indicates that it was submitted under a project that has exceeded its duration.
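
To show how this output can be read programmatically, the following sketch parses the sample listing above and picks out jobs whose QoS component is zero. The helper name `parse_sprio` and the dictionary keys are hypothetical. The parser assumes the eight-column layout shown here, which may differ for other sprio format options.

```python
SPRIO_OUTPUT = """\
   JOBID  PARTITION    PRIORITY      SITE        AGE  FAIRSHARE  PARTITION        QoS
   674582 short        1442165          0        439      96126     345600    1000000
   674520 medium       1427035          0       1724     252511     172800    1000000
   674521 medium       1427033          0       1722     252511     172800    1000000
   674522 medium       1427031          0       1720     252511     172800    1000000
   674502 long         1026833          0       2444      24390          0    1000000
   674528 long          512442          0       1682     510760          0          0
"""


def parse_sprio(text):
    """Parse the eight-column sprio listing into a list of dicts."""
    jobs = []
    for line in text.splitlines()[1:]:        # skip the header line
        fields = line.split()
        if len(fields) != 8:                  # ignore anything unexpected
            continue
        jobs.append({
            "jobid": fields[0],
            "partition": fields[1],
            "priority": int(fields[2]),
            "site": int(fields[3]),
            "age": int(fields[4]),
            "fairshare": int(fields[5]),
            "partition_weighted": int(fields[6]),
            "qos": int(fields[7]),
        })
    return jobs


jobs = parse_sprio(SPRIO_OUTPUT)

# Jobs submitted under projects that have exceeded their duration.
expired = [job["jobid"] for job in jobs if job["qos"] == 0]
print(expired)

# Compare the reported priority with the sum of its components.
for job in jobs:
    total = (job["site"] + job["age"] + job["fairshare"]
             + job["partition_weighted"] + job["qos"])
    print(job["jobid"], job["priority"], total)
```

In this sample the summed components match the reported priority for every job except 674502, where they differ by one, presumably because of rounding inside the scheduler.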

See the Slurm documentation page for more information, or type sprio --help.

sshare - list shares of associations

This command displays fairshare information based on the hierarchical account structure. Here we use it to determine the fairshare factor used in the job priority calculation. Since the fairshare factor also depends on the account (user project), the account has to be specified.

In this example, user1 has access to the project called p70-23-t, so the fairshare factor (shown in the last column) can be displayed as follows:

Viewing user's fairshare


sshare -A p70-23-t
    Account                    User  RawShares  NormShares    RawUsage  EffectvUsage  FairShare 
    -------------------- ---------- ---------- ----------- ----------- ------------- ---------- 
    p70-23-t                                 1    0.333333   122541631      0.364839
    p70-23-t               user1             1    0.111111     4798585      0.039159   0.263158
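
The following sketch links this output to the priority formula: it reads the FairShare value for one user from the sample listing above and multiplies it by the fairshare weight of 1,000,000. The helper name `user_fairshare` is hypothetical. The parser assumes the seven-column layout shown here, in which the account-level row has no User and FairShare entries.

```python
SSHARE_OUTPUT = """\
Account                    User  RawShares  NormShares    RawUsage  EffectvUsage  FairShare
-------------------- ---------- ---------- ----------- ----------- ------------- ----------
p70-23-t                                 1    0.333333   122541631      0.364839
p70-23-t               user1             1    0.111111     4798585      0.039159   0.263158
"""

WEIGHT_FAIRSHARE = 1_000_000   # fairshare weight from the priority formula


def user_fairshare(text, user):
    """Return the FairShare value for `user`, or None if not listed."""
    for line in text.splitlines()[2:]:        # skip header and separator
        fields = line.split()
        # Only user rows carry all seven columns.
        if len(fields) == 7 and fields[1] == user:
            return float(fields[6])
    return None


factor = user_fairshare(SSHARE_OUTPUT, "user1")
print(factor)                              # fairshare factor of user1
print(round(WEIGHT_FAIRSHARE * factor))    # contribution to job priority
```

For user1 in this sample, the factor 0.263158 contributes 263,158 points to the priority of a newly submitted job.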

You can display all project accounts available to you using the sprojects command.

See the Slurm documentation for more information, or type sshare --help.