Extraterrestrial Physics CCAS


Name of the cluster:

CCAS

Institution:

Max Planck Institute for Extraterrestrial Physics

Login nodes:

  • ccas01.opt.rzg.mpg.de

Hardware Configuration:

  • login node ccas01: 2 x Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60 GHz; hyper-threading enabled (2 threads per core); 40 CPU threads per node; 128 GB RAM

  • 24 execution nodes ccas[02-25] for parallel computing with a total of 480 CPU cores; 2 x Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60 GHz; hyper-threading enabled (2 threads per core); 128 GB RAM per node

  • 4 execution nodes ccas[26-29] for parallel computing with a total of 80 CPU cores; 2 x Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60 GHz; hyper-threading enabled (2 threads per core); 256 GB RAM per node

  • 2 execution nodes ccas[30-31] for parallel computing with a total of 40 CPU cores; 2 x Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60 GHz; hyper-threading enabled (2 threads per core); 512 GB RAM per node

  • the node interconnect is based on a Mellanox Technologies InfiniBand fabric (56 Gb/s)

Compilers and Libraries:

The “module” subsystem is implemented on the CCAS cluster. Please use ‘module available’ to see all available modules. A short compile example is given after the list below.

  • Intel compilers (-> ‘module load intel’): icc, icpc, ifort

  • GNU compilers (-> ‘module load gcc’): gcc, g++, gfortran

  • Intel MKL (‘module load mkl’): $MKL_HOME defined; libraries found in $MKL_HOME/lib/intel64

  • Intel MPI 2017.4 (‘module load impi’): mpicc, mpigcc, mpiicc, mpiifort, mpiexec, …
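
For illustration, an MPI Fortran code can be built against MKL roughly as follows. This is only a sketch: the source and program names are placeholders, and the sequential MKL link line shown is the generic one (the Intel MKL Link Line Advisor gives the exact line for other configurations).

    # load compiler, MPI and MKL modules, then compile with the Intel MPI Fortran wrapper
    module load intel impi mkl
    mpiifort -O2 -o my_prog my_prog.f90 \
        -L$MKL_HOME/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -ldl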

Alternative software stack

In addition to the modules visible by default, an alternative software stack is available that contains more up-to-date packages and resembles the module tree on the MPCDF HPC systems. You can switch to it with
source /mpcdf/soft/distribution/obs_modules.sh
(for csh/tcsh, please source the obs_modules.csh file instead).
Note that this unloads all previously loaded modules from your environment, and it is not possible to mix modules from the two trees. As on the HPC systems, this module tree is hierarchical.
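
As a brief sketch of how the hierarchical tree is used after switching (the module names are only examples; run ‘module available’ afterwards to see what is actually provided):

    source /mpcdf/soft/distribution/obs_modules.sh   # bash; use obs_modules.csh for csh/tcsh
    module load intel      # in a hierarchical tree, load a compiler first ...
    module load impi       # ... then an MPI library that matches it
    module available       # lists the additional packages unlocked by this combination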

Batch system based on Slurm:

  • sbatch, srun, squeue, sinfo, scancel, scontrol, s*

  • a brief overview of current batch usage and free resources is provided by the Slurm command sinfo

  • current max. turnaround time (wallclock): 1, 7 & 28 days

  • sample batch scripts can be found on the Cobra home page (they must be modified for CCAS); a minimal example is sketched below
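
A minimal sketch of an MPI batch script for CCAS (job name, node count, and executable are placeholders; 20 tasks per node corresponds to one MPI task per physical core):

    #!/bin/bash -l
    #SBATCH -J my_job                 # job name (placeholder)
    #SBATCH -o ./job.out.%j           # standard output file
    #SBATCH -e ./job.err.%j           # standard error file
    #SBATCH --nodes=2                 # number of nodes (placeholder)
    #SBATCH --ntasks-per-node=20      # one MPI task per physical core
    #SBATCH --time=24:00:00           # wallclock limit within the default 1-day limit

    module load intel impi
    srun ./my_mpi_program             # placeholder executable

Submit the script with ‘sbatch job.sh’ and monitor it with ‘squeue -u $USER’.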

Useful tips:

The default partition is ccas128 with 128 GB RAM per node and a wallclock limit of 24 h.

To use the partitions with 256 GB or 512 GB RAM, set the --partition option in your sbatch scripts (for instance #SBATCH --partition=ccas256).

To use the 7-day or 28-day time limits, set the --qos option in your sbatch scripts (for instance #SBATCH --qos=7d).

For parallel MPI codes we suggest using the --exclusive option to allocate nodes exclusively rather than sharing them with other jobs (#SBATCH --exclusive). To allow nodes to be shared only with other jobs of your own user, use #SBATCH --exclusive=user instead.

The default memory per job on the ccas512 partition is defined as NODE_RAM/NODE_CPUS (512 GB / 40). We suggest explicitly setting the memory you need with the --mem option (for instance #SBATCH --mem=64G).
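
Combining the tips above, the job script for a long-running, large-memory job could look as follows (a sketch only; run time, node count, memory value, and program name are placeholders):

    #!/bin/bash -l
    #SBATCH -J bigmem_job             # job name (placeholder)
    #SBATCH --partition=ccas512       # nodes with 512 GB RAM
    #SBATCH --qos=7d                  # allow up to 7 days of wallclock time
    #SBATCH --time=3-00:00:00         # requested run time (placeholder)
    #SBATCH --nodes=1                 # number of nodes (placeholder)
    #SBATCH --exclusive               # do not share the node with other jobs
    #SBATCH --mem=256G                # request the needed memory explicitly

    module load intel impi
    srun ./my_program                 # placeholder executable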

Support:

For support please create a trouble ticket at the MPCDF helpdesk