Environment Modules

Introduction

The MPCDF uses environment modules to adapt the user environment for working with software installed at various locations in the file system and for switching between different software versions. Users no longer need to specify paths to particular executable versions explicitly or keep track of PATH, MANPATH and related environment variables. Instead, they simply ‘load’ and ‘unload’ modules to control their environment. System administrators provide modulefiles, typically named after the software package and an optional version number. All popular shells are supported, including bash, ksh, and tcsh.

Besides handling different software versions, the modules approach allows system administrators to install software in non-standard locations and to relocate software packages transparently for the user (by adapting the modulefile). Users are therefore strongly encouraged to use the variables provided by the modules in their makefiles, scripts, etc. instead of relying on absolute paths (see below for examples).

Note that since 2018, the HPC systems and the growing number of dedicated clusters all use hierarchical environment modules (see below for more details).

Basic interactive usage

Please find below a list of the most important commands (see https://modules.readthedocs.io/en/latest/module.html for a complete reference):

module help lists module subcommands and switches

module avail lists available software packages and versions which can be enabled (“loaded”) with the module command

module apropos <keyword> searches available modulefiles for the specified keyword string and lists all matching modules.

module help <package>/<version> provides brief documentation for the specified module.

module load <package>/<version> “loads” the module, i.e. modifies the user’s environment ($PATH, $MANPATH, etc.)

module unload <package>/<version> “unloads” the module

module list lists all modules which are currently loaded in the user’s environment

Usage in scripts

Instead of absolute paths to libraries, binaries etc. the environment variables set by the modulefile should be used in scripts, makefiles etc. By convention, an MPCDF modulefile sets an environment variable named <PKG>_HOME (where PKG is the name of the package, for example: MKL_HOME) which points to the root directory of the installation path (see below for example usage). Information about additional, package-specific environment variables can be obtained with the commands module help <package>/<version> and module show <package>/<version>.
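A script can check that such a variable is actually set before using it, so that a forgotten module load fails with a clear message instead of a confusing compiler or linker error. The following is a minimal sketch: the helper name require_module_var is hypothetical and not part of the module system, and MKL_HOME serves only as the example variable mentioned above.

```shell
# Hypothetical helper (not provided by the module system): print the value of
# an environment variable that a modulefile is expected to set, or fail with
# a message if the variable is unset or empty.
require_module_var() {
    var_name=$1
    # Indirect lookup of the variable whose name is stored in var_name
    # (POSIX sh has no ${!name}); var_name must be a plain identifier.
    eval "var_value=\${$var_name:-}"
    if [ -z "$var_value" ]; then
        echo "error: $var_name is not set; load the corresponding module first" >&2
        return 1
    fi
    printf '%s\n' "$var_value"
}

# Example use in a build script (commented out so the sketch has no side effects):
#   mkl_root=$(require_module_var MKL_HOME) || exit 1
```

The helper only reads the environment. It does not load anything itself, so the decision which module version to use stays with the calling script.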

Examples

1) Interactive session on the command line, using the Intel Fortran compiler (version 19.1.3) and Intel MKL (version 2020.4), with both versions specified explicitly:

module load intel/19.1.3
module load mkl/2020.4
ifort -I$MKL_HOME/include example.F -L$MKL_HOME/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core

2) Makefile (fragment):

FC=ifort

example: example.F
	$(FC) -I$(MKL_HOME)/include example.F -o example -L$(MKL_HOME)/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core

3) Handling long output from module avail which may not fit into a single terminal window:

Piping the output to less using a Bourne shell (e.g. bash):

`( module avail ) 2>&1 | less`

Piping the output to less using a C shell (e.g. tcsh):

`( module avail ) |& less`
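When only a few packages are of interest, filtering the listing can be more convenient than paging through it. The following is a minimal sketch for a Bourne shell, assuming (as the redirection above implies) that module avail writes its listing to standard error. The function name avail_grep is hypothetical.

```shell
# Hypothetical convenience function: show only those lines of the
# 'module avail' listing that contain the given string (case-insensitive).
avail_grep() {
    # Merge stderr into stdout so that grep sees the listing.
    ( module avail ) 2>&1 | grep -i -- "$1"
}

# Example: avail_grep fftw
```

Like module avail itself, this only sees the modules that are currently visible.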

Hierarchical environment modules

To manage the plethora of software packages resulting from all the relevant combinations of compilers and MPI libraries, the environment module system is organized hierarchically. Compilers (gcc, intel) are located on the uppermost level, libraries that depend on the compiler (e.g., MPI) on the second level, and libraries with further dependencies on a third level. This means that not all modules are visible initially: the modules that depend on a compiler become available only after that compiler module has been loaded. Similarly, loading an MPI module in addition makes the modules that depend on the MPI library available.

Note that on newer systems (starting with raven), no defaults are set for compilers and MPI libraries, and no modules are pre-loaded. On older systems, the Intel compiler, Intel MPI and Intel MKL modules are loaded by default after login. To start at the root of the environment modules hierarchy on those systems, issue ‘module purge’.

For example, the FFTW library compiled with a certain Intel compiler and a certain Intel MPI library can be loaded as follows: First, load the Intel compiler module using the command

module load intel/19.1.3

second, the Intel MPI module with

module load impi/2019.9

and, finally, the FFTW module that matches this compiler and MPI library via

module load fftw-mpi

You may check using the command

module available

that after the first and second steps the dependent environment modules become visible, in the present example impi/2019.9 and fftw-mpi. Note also that several environment modules can be loaded with a single module load statement as long as the order given by the hierarchy is respected, e.g., module load intel/19.1.3 impi/2019.9 fftw-mpi. Please always specify the exact version of the compiler and MPI library, and make sure that your Slurm job script loads the same compiler and MPI modules for running your code as were used for compiling it.
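One way to keep the build and run environments identical is to put the module load statement into a small file that both the build script and the Slurm job script source. The sketch below assumes a hypothetical file name toolchain.sh and a hypothetical function name load_toolchain. The module versions are those from the example above.

```shell
# toolchain.sh (hypothetical file name): single place that defines which
# compiler, MPI and library modules a project uses.
load_toolchain() {
    # Start from the root of the hierarchy, then load in hierarchy order.
    module purge
    module load intel/19.1.3 impi/2019.9 fftw-mpi
}
```

Both scripts would then run `. ./toolchain.sh` followed by `load_toolchain` before compiling or launching the program, so a version change has to be made in one place only.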

If you know the name of the module you wish to load but are not sure about the available versions or about the dependencies that need to be loaded first, you can use the ‘find-module’ command. This tool searches for the MODULENAME string in a list of all installed modules:

find-module MODULENAME

You can then choose the desired module version, use the output of the command to determine the correct order to load dependencies, and finally load the module itself, e.g.

$ find-module horovod
    horovod/cpu/0.13.11 (after loading anaconda/3/2019.03 tensorflow/cpu/1.14.0)
    horovod/cpu/0.15.2  (after loading anaconda/3/2019.03 tensorflow/cpu/1.14.0)
    horovod/gpu/0.13.11 (after loading anaconda/3/2019.03 tensorflow/gpu/1.14.0)
    horovod/gpu/0.15.2  (after loading anaconda/3/2019.03 tensorflow/gpu/1.14.0)
$ module load anaconda/3/2019.03 tensorflow/cpu/1.14.0 horovod/cpu/0.13.11

It is important to point out that a large fraction of the available software is not affected by the hierarchy: certain HPC applications, tools such as git or cmake, mathematical software (maple, matlab, mathematica), and visualization software (visit, paraview, idl) are visible at the uppermost level of the hierarchy. Note that a hierarchy also exists for Python modules, with the ‘anaconda’ modulefiles on the top level.

Note on module dependencies

Some projects require more than one compiler module as well as libraries that depend on the compiler. In other words, such projects load at least two top-level modules, and a second-level module they need may exist in both module hierarchies. In this case, the second-level module is taken from the hierarchy of the top-level module that was loaded last.

The same applies to dependencies between the second and third levels.

In the following example, one would like to use the Intel compiler for a C++ code that relies on FFTW. As explained in our compiler documentation, a GCC module will also need to be loaded in this case. Depending on loading order, different versions of the FFTW library are ultimately made available to the project:

:~$ module purge
:~$ module load gcc/11 intel/21.6
:~$ module load fftw-serial
:~$ printenv | grep FFTW_HOME
FFTW_HOME=/mpcdf/soft/SLE_12/packages/skylake/fftw/intel_21.6.0-2021.6.0/3.3.10
:~$ module purge
:~$ module load intel/21.6 gcc/11
:~$ module load fftw-serial
:~$ printenv | grep FFTW_HOME
FFTW_HOME=/mpcdf/soft/SLE_12/packages/skylake/fftw/gcc_11-11.2.0/3.3.10
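
Because the loading order silently decides which build of the library is used, a build script can guard against picking up the wrong one. The following sketch assumes that the installation path contains the compiler name followed by an underscore, as in the output above. This is an observation from this example rather than a documented guarantee, and the function name check_fftw_flavor is hypothetical.

```shell
# Hypothetical check: succeed only if FFTW_HOME points to a build for the
# expected compiler family ("intel" or "gcc"). Relies on the directory naming
# seen in the example output, which is an assumption.
check_fftw_flavor() {
    expected=$1
    case "${FFTW_HOME:-}" in
        */fftw/"$expected"_*)
            return 0 ;;
        *)
            echo "error: FFTW_HOME='${FFTW_HOME:-}' is not a $expected build" >&2
            return 1 ;;
    esac
}

# Example: check_fftw_flavor intel || exit 1
```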