# GROMACS
## Description

GROMACS is a versatile package to perform molecular dynamics for systems with hundreds to millions of particles. It is primarily designed for biochemical molecules like proteins, lipids and nucleic acids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions it can also be used for dynamics of non-biological systems, such as polymers and fluid dynamics.
## User Guide
- 📄 Official documentation: GROMACS website
- 📚 Beginner & advanced tutorials: www.mdtutorials.com
- 🔍 Full reference manual: manual.gromacs.org
## Available Versions
You can load any of the currently available GROMACS versions, including their runtime dependencies, as a single module using one of the following commands:
module load GROMACS/2021.5-foss-2021b
module load GROMACS/2021.5-foss-2021b-CUDA-11.4.1
module load GROMACS/2023.2-intelmkl-CUDA-12.0
Deprecated Builds
Both 2021.5-foss-2021b versions are deprecated builds without support for efficient thread-MPI parallelization; users are strongly encouraged to use newer builds with MPI support enabled.
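If you are not sure which builds are installed at any given time, you can query the module system before loading anything. A minimal sketch, assuming an Lmod-style module environment (the build names are the ones listed above):
module avail GROMACS                               # list every installed GROMACS build
module spider GROMACS/2023.2-intelmkl-CUDA-12.0    # Lmod only: show details and required dependencies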
## CPU Versions
### Example run script
You can copy this script to gromacs_run.sh, modify it as needed, and submit it using:
sbatch gromacs_run.sh
#!/bin/bash
#SBATCH --job-name= # Name of the job
#SBATCH --account= # Project account number
#SBATCH --partition= # Partition name (short, medium, long)
#SBATCH --nodes= # Number of nodes
#SBATCH --ntasks= # Total number of MPI ranks
#SBATCH --cpus-per-task= # Number of threads per MPI rank
#SBATCH --time=hh:mm:ss # Time limit (hh:mm:ss)
#SBATCH --output=stdout.%j.out # Standard output (%j = Job ID)
#SBATCH --error=stderr.%j.err # Standard error
#SBATCH --mail-type=END,FAIL # Notifications for job done or failed
#SBATCH --mail-user= # Email address for notifications
# === Metadata functions ===
log_job_start() {
echo "================== SLURM JOB METADATA =================="
printf " Job ID : %s\n" "$SLURM_JOB_ID"
printf " Job Name : %s\n" "$SLURM_JOB_NAME"
printf " Partition : %s\n" "$SLURM_JOB_PARTITION"
printf " Nodes : %s\n" "$SLURM_JOB_NUM_NODES"
printf " Tasks (MPI) : %s\n" "$SLURM_NTASKS"
printf " CPUs per Task : %s\n" "$SLURM_CPUS_PER_TASK"
printf " Account : %s\n" "$SLURM_JOB_ACCOUNT"
printf " Submit Dir : %s\n" "$SLURM_SUBMIT_DIR"
printf " Work Dir : %s\n" "$PWD"
printf " Start Time : %s\n" "$(date)"
echo "========================================================"
}
log_job_end() {
printf " End Time : %s\n" "$(date)"
echo "========================================================"
}
# === Load required module(s) ===
module purge
module load GROMACS/2023.2-intelmkl-CUDA-12.0
# === Set working directories ===
# Use shared filesystems for cross-node calculations
INIT_DIR="${SLURM_SUBMIT_DIR}"
WORK_DIR="/work/${SLURM_JOB_ACCOUNT}/${SLURM_JOB_ID}"
mkdir -p "$WORK_DIR"
# === Input/output file declarations ===
INPUT_FILES="" # Adjust as needed
OUTPUT_FILES="" # Adjust as needed
# === Copy input files to scratch ===
cp $INPUT_FILES "$WORK_DIR"
# === Change to working directory ===
cd "$WORK_DIR" || { echo "Failed to cd into $WORK_DIR"; exit 1; }
log_job_start >> "$INIT_DIR/jobinfo.$SLURM_JOB_ID.log"
# === Run GROMACS ===
mpiexec -np ${SLURM_NTASKS} gmx_mpi mdrun -ntomp ${SLURM_CPUS_PER_TASK} ...
# === Copy output files back ===
cp $OUTPUT_FILES "$INIT_DIR"
# === Optional: clean up scratch ===
# rm -rf "$WORK_DIR"
log_job_end >> "$INIT_DIR/jobinfo.$SLURM_JOB_ID.log"
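The mdrun line in the script is intentionally left open-ended, since the remaining arguments depend on your input files. Purely as an illustration, and assuming a run input file md.tpr generated beforehand with gmx grompp (all file names here are placeholders, not part of the script above):
# Build the portable run input file from hypothetical .mdp/.gro/.top inputs
gmx_mpi grompp -f md.mdp -c conf.gro -p topol.top -o md.tpr
# A fully specified variant of the mdrun line used in the script
mpiexec -np ${SLURM_NTASKS} gmx_mpi mdrun -ntomp ${SLURM_CPUS_PER_TASK} -s md.tpr -deffnm md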
Tip
If you find any issues with the instructions above, please report them to us via our Helpdesk portal.
### Benchmarks
To better understand how GROMACS utilises the available hardware on Devana and how to get good performance, we can examine how the choice of the number of MPI ranks per node and OpenMP threads per rank affects benchmark performance.
The following command was used to run the benchmarks:
mpiexec -np ${SLURM_NTASKS} gmx_mpi mdrun -v -s $trajectory.tpr -ntomp ${SLURM_CPUS_PER_TASK} -pin on -nsteps 20000 -deffnm $trajectory
For more information about these benchmark systems, see the following page.
Info
Single-node benchmarks have been run on the local /work/ storage native to each compute node, which is generally faster than the shared storage hosting the /home/ and /scratch/ directories.
Benchmarks were run on the following systems:
Molecular dynamics simulation of a protein in a membrane surrounded by water molecules (81,743 atoms, system size 10.8 x 10.2 x 9.6 nm³) with a 2 fs time step for a total of 40 ps. Downloadable here.
| Single-node Performance | Cross-node Performance |
|---|---|
| *(benchmark figure)* | *(benchmark figure)* |
Binding affinity study benchmark of a protein-ligand system surrounded by water molecules (ca. 107k atoms) with energy evaluations done every step (TI). Downloadable here.
| Single-node Performance | Cross-node Performance |
|---|---|
| *(benchmark figure)* | *(benchmark figure)* |
Binding affinity study benchmark of bromosporine bound to a bromodomain, surrounded by water molecules (43,952 atoms, system size 8.55 x 8.55 x 6.04 nm³) with a 2 fs time step for a total of 400 ps. Free energy is controlled with the init-lambda-state, coul-lambdas and vdw-lambdas vectors; all 20 lambda neighbors are calculated and energy evaluations are done every step. Downloadable here.
| Single-node Performance | Cross-node Performance |
|---|---|
| *(benchmark figure)* | *(benchmark figure)* |
Hybrid Decomposition and Node Utilization
The choice of MPI × OpenMP hybrid decomposition has a significant impact on performance. In the benchmark heatmaps, diagonals represent configurations with an equal total number of hardware threads (or Baseline Units, BUs); for example, the outermost diagonal corresponds to 64 BUs, the next to 32 BUs, and so on.
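As a concrete illustration, on a node exposing 64 hardware threads (an assumption here; adjust to the actual node size) the 64-BU diagonal can be reached with several decompositions, for example:
# 8 MPI ranks x 8 OpenMP threads = 64 BUs
#SBATCH --ntasks=8
#SBATCH --cpus-per-task=8
# ...or 16 MPI ranks x 4 OpenMP threads = 64 BUs (swap the pair above for the pair below)
##SBATCH --ntasks=16
##SBATCH --cpus-per-task=4
mpiexec -np ${SLURM_NTASKS} gmx_mpi mdrun -ntomp ${SLURM_CPUS_PER_TASK} -pin on ...
Which decomposition performs best is system dependent, which is exactly what the benchmark heatmaps are meant to show.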
## GPU Accelerated Versions
GPU support has been available in GROMACS since version 5.x, and both of the following builds have been compiled with CUDA/GPU support:
- GROMACS/2023.2-intelmkl-CUDA-12.0
- GROMACS/2021.5-foss-2021b-CUDA-11.4.1
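To double-check that a loaded build really was compiled with GPU acceleration, you can inspect its version banner. A small check, assuming the gmx_mpi wrapper provided by these modules:
module load GROMACS/2023.2-intelmkl-CUDA-12.0
gmx_mpi --version | grep -i "GPU support"          # a CUDA build reports: GPU support: CUDA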
### Example run script
You can copy this script to gromacs_run_gpu.sh, modify it as needed, and submit it using:
sbatch gromacs_run_gpu.sh
#!/bin/bash
#SBATCH --job-name= # Name of the job
#SBATCH --account= # Project account number
#SBATCH --partition=gpu # GPU-enabled partition
#SBATCH --nodes= # Number of nodes
#SBATCH --ntasks= # Total number of MPI ranks
#SBATCH --cpus-per-task= # Number of threads per MPI rank
#SBATCH --gres=gpu:1 # Request 1 GPU
#SBATCH --time=hh:mm:ss # Time limit (hh:mm:ss)
#SBATCH --output=stdout.%j.out # Standard output (%j = Job ID)
#SBATCH --error=stderr.%j.err # Standard error
#SBATCH --mail-type=END,FAIL # Notifications for job done or failed
#SBATCH --mail-user= # Email address for notifications
# === Metadata functions ===
log_job_start() {
echo "================== SLURM JOB METADATA =================="
printf " Job ID : %s\n" "$SLURM_JOB_ID"
printf " Job Name : %s\n" "$SLURM_JOB_NAME"
printf " Partition : %s\n" "$SLURM_JOB_PARTITION"
printf " Nodes : %s\n" "$SLURM_JOB_NUM_NODES"
printf " Tasks (MPI) : %s\n" "$SLURM_NTASKS"
printf " CPUs per Task : %s\n" "$SLURM_CPUS_PER_TASK"
printf " GPU Count : %s\n" "$SLURM_GPUS"
printf " Account : %s\n" "$SLURM_JOB_ACCOUNT"
printf " Submit Dir : %s\n" "$SLURM_SUBMIT_DIR"
printf " Work Dir : %s\n" "$PWD"
printf " Start Time : %s\n" "$(date)"
echo "========================================================"
}
log_job_end() {
printf " End Time : %s\n" "$(date)"
echo "========================================================"
}
# === Load required module(s) ===
module purge
module load GROMACS/2023.2-intelmkl-CUDA-12.0
# === Set working directories ===
# Use shared filesystems for cross-node calculations
INIT_DIR="${SLURM_SUBMIT_DIR}"
WORK_DIR="/work/${SLURM_JOB_ACCOUNT}/${SLURM_JOB_ID}"
mkdir -p "$WORK_DIR"
# === Input/output file declarations ===
INPUT_FILES="" # Adjust as needed
OUTPUT_FILES="" # Adjust as needed
# === Copy input files to scratch ===
cp $INPUT_FILES "$WORK_DIR"
# === Change to working directory ===
cd "$WORK_DIR" || { echo "Failed to cd into $WORK_DIR"; exit 1; }
log_job_start >> "$INIT_DIR/jobinfo.$SLURM_JOB_ID.log"
# === Run GROMACS ===
mpiexec -np ${SLURM_NTASKS} gmx_mpi mdrun -ntomp ${SLURM_CPUS_PER_TASK} -nb gpu ...
# === Copy output files back ===
cp $OUTPUT_FILES "$INIT_DIR"
# === Optional: clean up scratch ===
# rm -rf "$WORK_DIR"
log_job_end >> "$INIT_DIR/jobinfo.$SLURM_JOB_ID.log"
The script offloads the short-range nonbonded interactions to the GPU using -nb gpu, which provides most of the speedup compared to CPU-only runs.
You can also offload:
- PME calculations: -pme gpu
- Bonded interactions: -bonded gpu
See the GROMACS GPU performance guide for more details.
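As an illustration of combining these flags, here is a hedged single-node example; the file names are placeholders, and note that when PME is offloaded to a GPU with more than one MPI rank, GROMACS expects a single dedicated PME rank, requested below with -npme 1:
mpiexec -np ${SLURM_NTASKS} gmx_mpi mdrun -ntomp ${SLURM_CPUS_PER_TASK} \
        -nb gpu -pme gpu -npme 1 -bonded gpu \
        -s md.tpr -deffnm md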
GPU assignment
Manually assigning tasks to specific GPUs is not currently supported. The number of MPI ranks determines the number of GPU tasks spawned, which are evenly distributed across available GPUs.
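In practice this means you only choose how many GPUs and how many ranks to request, and GROMACS spreads the ranks over the GPUs for you. A sketch under that behaviour, assuming a node with two GPUs (the GPU count per node is an assumption):
#SBATCH --gres=gpu:2      # request two GPUs on the node
#SBATCH --ntasks=4        # 4 MPI ranks -> 2 ranks share each GPU
mpiexec -np ${SLURM_NTASKS} gmx_mpi mdrun -ntomp ${SLURM_CPUS_PER_TASK} -nb gpu ...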
Tip
If you find any issues with the instructions above, please report them to us via our Helpdesk portal.
### Benchmarks
Note
Section under construction.
For more information about the GPU benchmark systems, see the following page.





