Quantum Optics


Name of the cluster:

TQO

Institution:

Max Planck Institute of Quantum Optics

Login nodes:

  • tqo401.opt.rzg.mpg.de

Hardware Configuration:

1 login node tqo401 :

2 x Intel(R) Xeon(R) Platinum 8268 CPU @ 2.90GHz (Cascade Lake); 48 cores per node; 385 GB RAM; no hyper-threading

59 execution nodes tqo[402-460] :

total amount of 2880 CPU cores; 2 x Intel(R) Xeon(R) Platinum 8268 CPU @ 2.90GHz (Cascade Lake); 385 GB RAM per node; no hyper-threading

66 execution nodes tqo[501-566] :

total amount of 4752 CPU cores; 2 x Intel(R) Xeon(R) Platinum 8360Y CPU @ 2.40GHz (Ice Lake); 512 GB RAM per node; no hyper-threading

1 execution node tqog02 for parallel GPU computing :

total amount of 16 CPU cores; 2 x Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz (Skylake); 94 GB RAM; no hyper-threading; 2 x Nvidia Tesla P100 GPUs per node

2 execution nodes tqog[03-04] for parallel GPU computing :

total amount of 32 CPU cores; 2 x Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz (Skylake); 94 GB RAM; no hyper-threading; 2 x Nvidia Tesla V100 GPUs per node

  • Node interconnect is based on 1 Gb/s Ethernet

Filesystems:

/u ($HOME)

shared home filesystem; GPFS-based; user quotas (currently 500 GB, 0.5M files) enforced; quota can be checked with ‘/usr/lpp/mmfs/bin/mmlsquota’.

/ptmp

shared scratch filesystem; GPFS-based; no quotas enforced. NO BACKUPS!

Compilers and Libraries:

The “module” subsystem is implemented on TQO. Please use ‘module avail’ to list all available modules.

  • Intel compilers (e.g. ‘module load intel/19.1.3’): icc, icpc, ifort

  • Intel MKL (‘module load mkl’): $MKL_HOME defined; libraries found in $MKL_HOME/lib/intel64

  • Intel MPI (e.g. ‘module load impi/2019.9’): mpicc, mpigcc, mpiicc, mpiifort, mpiexec, …
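A minimal sketch of building an MPI code against MKL with the modules above (the source file name and the sequential LP64 MKL link line are only an illustration; adjust them to your application):

    module purge
    module load intel/19.1.3 impi/2019.9 mkl

    # compile a Fortran MPI code and link MKL (sequential, LP64 interface)
    mpiifort -O2 -o my_prog my_prog.f90 \
        -L$MKL_HOME/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -ldl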

Batch system based on Slurm:

  • sbatch, srun, squeue, sinfo, scancel, scontrol, s*

  • current max. turnaround time (wallclock) for the partitions: 168 hours (partition s.168) and 672 hours (partition s.672)

  • s.gpu partition for GPU computing: 2 x P100 and 4 x V100 GPUs; max. turnaround time is 672 hours

  • nodes differ in CPU architecture and memory capacity; use the --constraint option of sbatch/srun to run MPI jobs in a homogeneous environment

  • sample batch scripts can be found on the Cobra home page (they must be adapted for TQO); a minimal sketch is given below
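A minimal sketch of an MPI batch script for the s.168 partition (the job name, node count, time limit, and the executable ./my_prog are placeholders; adapt them to your job):

    #!/bin/bash -l
    #SBATCH -J mpi_job                  # job name
    #SBATCH -o ./out.%j                 # standard output file
    #SBATCH -e ./err.%j                 # standard error file
    #SBATCH --partition=s.168           # max. 168 hours wallclock
    #SBATCH --constraint=cascadelake    # keep the job on one CPU architecture
    #SBATCH --nodes=2
    #SBATCH --ntasks-per-node=48        # 48 cores per Cascade Lake node
    #SBATCH --time=24:00:00

    module purge
    module load intel/19.1.3 impi/2019.9

    srun ./my_prog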

Useful tips:

Access to AFS is still possible, even from batch jobs, but it is not recommended. The old AFS $HOME is available at
/afs/ipp/u/$USERNAME
Running ‘save-password’ is required if you need AFS tokens within your batch job.

OpenMP codes require the environment variable OMP_NUM_THREADS to be set. Its value can be taken from the Slurm environment variable $SLURM_CPUS_PER_TASK, which is set when --cpus-per-task is specified in a sbatch script (an example is available on the help information page; a minimal sketch is also given below).
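A minimal sketch of an OpenMP batch script (the thread count and the executable ./my_omp_prog are placeholders):

    #!/bin/bash -l
    #SBATCH -J omp_job
    #SBATCH --partition=s.168
    #SBATCH --nodes=1
    #SBATCH --ntasks-per-node=1
    #SBATCH --cpus-per-task=8           # number of OpenMP threads
    #SBATCH --time=12:00:00

    # pass the Slurm setting on to the OpenMP runtime
    export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

    srun ./my_omp_prog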

To use GPUs, add the --gres option to your Slurm scripts and choose how many GPUs and/or which GPU model to use: #SBATCH --gres=gpu:p100:1 or #SBATCH --gres=gpu:v100:2
Valid gres options are: gpu[[:type]:count]
where
type is the type of GPU (p100 or v100)
count is the number of GPUs (1 or 2)
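A minimal sketch of a GPU batch script for the s.gpu partition (the executable ./my_gpu_prog is a placeholder; the cuda module is the one mentioned below):

    #!/bin/bash -l
    #SBATCH -J gpu_job
    #SBATCH --partition=s.gpu           # GPU partition, max. 672 hours
    #SBATCH --gres=gpu:v100:2           # request two V100 GPUs on one node
    #SBATCH --nodes=1
    #SBATCH --ntasks-per-node=1
    #SBATCH --time=24:00:00

    module load cuda

    srun ./my_gpu_prog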

GPU cards are in default compute mode.

To use a GPU interactively: log in to the tqog01 node and load the cuda module. The GPU cards on tqog01 are also in default compute mode.
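For example (assuming direct ssh access to tqog01 from the login node):

    ssh tqog01
    module load cuda
    nvidia-smi    # list the available GPUs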

The default memory for jobs is 1600 MB per core. Use the --mem option to set the required amount of memory (in Slurm, --mem requests memory per node). To grant the job access to all of the memory on each node, use the --mem=0 option of sbatch/srun.
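For example (the 16000 MB value is only an illustration):

    #SBATCH --mem=16000    # request 16000 MB of memory per node
    # or, to use all memory available on each node:
    #SBATCH --mem=0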

To run code on nodes with different memory capacities (94G; 192G; 384G; 512G), use the --constraint=<list> option in a sbatch script, for instance --constraint=94G or --constraint=192G.

To run code on nodes with a specific CPU architecture, use the --constraint=<list> option in a sbatch script: --constraint=cascadelake or --constraint=skylake
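Constraints can also be combined using the standard Slurm ‘&’ syntax; a sketch using the feature names listed above:

    # select Skylake nodes with 94 GB of memory
    #SBATCH --constraint="skylake&94G"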

To check node features, use ‘sinfo -O nodelist,features’.

Support:

For support, please create a trouble ticket at the MPCDF helpdesk.