Psychiatry PSYCL
- Name of the cluster: PSYCL
- Institution: Max Planck Institute of Psychiatry
- Login nodes: psycl01.bc.rzg.mpg.de
Hardware Configuration:
Login node psycl01:
CPU model: Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz
2 sockets
10 cores per socket
hyper-threading (2 threads per core)
128 GB RAM
2 x Tesla K20Xm
13 execution nodes psycl[02-14]:
CPU model: Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz
2 sockets
10 cores per socket
hyper-threading (2 threads per core)
128 GB RAM (psycl[02-05])
256 GB RAM (psycl[06-09])
512 GB RAM (psycl[10-13])
768 GB RAM (psycl14)
2 x Tesla K20Xm (psycl[02-04])
2 x GeForce GTX 980 (psycl[05-14])
The node interconnect is based on 10 Gb/s Ethernet.
Filesystems:
GPFS-based with a total size of 226 TB:
- /u
shared home filesystem with the user home directory in /u/<username>; GPFS-based; user quotas (currently 4 TB, 1M files) enforced; the quota can be checked with ‘/usr/lpp/mmfs/bin/mmlsquota’ (see the example below).
- /ptmp
shared scratch filesystem with the user directory in /ptmp/<username>; GPFS-based; no quotas enforced. NO BACKUPS!
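The commands below are a minimal sketch of how to check the quota and use the scratch space; they are run on the login node, and <username> and my_project are placeholders:

  # show the GPFS quotas of the current user
  /usr/lpp/mmfs/bin/mmlsquota

  # large temporary data without backup belongs in /ptmp
  mkdir -p /ptmp/<username>/my_project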
Compilers and Libraries:
The “module” subsystem is implemented on PSYCL. Please use ‘module available’ to see all available modules.
Python (-> ‘module load anaconda/3/5.1’): python, ipython
Intel compilers (-> ‘module load intel’): icc, icpc, ifort
GNU compilers (-> ‘module load gcc’): gcc, g++, gfortran
Intel MKL (-> ‘module load mkl’): $MKL_HOME defined; libraries found in $MKL_HOME/lib/intel64
Intel MPI 2018.3 (-> ‘module load impi’): mpicc, mpigcc, mpiicc, mpiifort, mpiexec, …
MATLAB (-> ‘module load matlab’): matlab, mcc, mex
Mathematica (-> ‘module load mathematica’)
CUDA (-> ‘module load cuda’)
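As an illustration, a compile-and-link sequence for an MPI program that uses MKL might look like the following sketch (hello.c is a placeholder source file; the MKL link line is the standard sequential one and may need to be adapted):

  # load the Intel compiler, Intel MPI and MKL environments
  module purge
  module load intel impi mkl

  # compile an MPI C program and link the sequential MKL libraries
  mpiicc -O2 -o hello hello.c \
      -L$MKL_HOME/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -ldl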
Batch system based on Slurm:
The batch system on PSYCL is the Slurm Workload Manager. A brief introduction to the basic commands (srun, sbatch, squeue, scancel, …) can be found on the Cobra home page. For more detailed information, see the Slurm handbook. See also the sample batch scripts, which must be adapted for the PSYCL cluster; a minimal example is sketched below.
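A minimal batch script for PSYCL could look like the following sketch (job name, file names, and resource values are examples only; ./my_program is a placeholder):

  #!/bin/bash -l
  #SBATCH -J mytest              # job name (example)
  #SBATCH -o ./job.out.%j        # standard output file
  #SBATCH -e ./job.err.%j        # standard error file
  #SBATCH --nodes=1              # number of nodes
  #SBATCH --ntasks-per-node=20   # one MPI task per physical core (20 cores per node)
  #SBATCH --time=24:00:00        # run time limit, here the default of 24 hours

  module load intel impi

  srun ./my_program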
Current Slurm configuration on PSYCL:
default turnaround time: 24 hours
current max. turnaround time (wallclock): 11 days
Useful tips:
The default run time limit for jobs that do not specify a value is 24 hours. Use the --time option of sbatch/srun to set a limit on the total run time of the job allocation, up to a maximum of 11 days (see the first sketch after this list).
The default memory per node is 60 GB. To grant the job access to all of the memory on each node, use the --mem=0 option of sbatch/srun.
OpenMP codes require the environment variable OMP_NUM_THREADS to be set. Its value can be obtained from the Slurm environment variable $SLURM_CPUS_PER_TASK, which is set when --cpus-per-task is specified in an sbatch script (see the OpenMP sketch after this list).
To run a job on nodes with a specific memory capacity (features mem128G, mem256G, mem512G, mem768G), use the --constraint option in an sbatch script, e.g. #SBATCH --constraint=mem256G or #SBATCH --constraint=mem768G (see the constraint example after this list).
GPU cards are in default compute mode.
To check the node features, generic resources, and scheduling weights of the nodes, use ‘sinfo -O nodelist,features:25,gres,weight’.
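The first two tips translate into sbatch header lines like the following sketch (the 5-day limit is only an example; any value up to the 11-day maximum is possible):

  #SBATCH --time=5-00:00:00   # total run time limit in the format days-hours:minutes:seconds
  #SBATCH --mem=0             # give the job all of the memory available on each node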
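For an OpenMP job, the thread count can be taken over from Slurm as sketched here (20 threads correspond to the 20 physical cores of one node; ./my_openmp_program is a placeholder):

  #SBATCH --nodes=1
  #SBATCH --ntasks-per-node=1
  #SBATCH --cpus-per-task=20   # number of OpenMP threads per task

  export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

  srun ./my_openmp_program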
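To restrict a job to, for example, the 256 GB nodes psycl[06-09], the constraint is simply added to the batch script; the other memory features work analogously:

  #SBATCH --constraint=mem256G   # run only on nodes with 256 GB RAM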
Support:
For support, please create a trouble ticket at the MPCDF helpdesk.