Containers
Apptainer
Apptainer (https://apptainer.org) is an open-source software project developed to add containers and reproducibility to scientific high-performance computing.
Just like its predecessor Singularity, Apptainer is being developed to provide container technologies on HPC systems. It gives users an easy way to access different OSs on HPC systems while still ensuring that containers run in an established user environment, without a pathway for privilege escalation on the host.
Apptainer was born in 2021, when the Singularity open-source project split into two separate projects: Apptainer and SingularityCE. The Apptainer branch joined the Linux Foundation, while Sylabs' fork of Singularity, dedicated to commercial use, was renamed SingularityCE. While there has been continual alignment between Sylabs' SingularityCE and Apptainer, at least initially, the paths of the two projects will likely diverge as they continue to mature.
As part of the transition, only open, community-standard interfaces will be supported in Apptainer. This includes removing the "Library" and "Remote Builder" support.
In the event these become open, community-maintained standards (and not corporate-controlled ones), these features may be left intact and/or re-added at a later date.
For this reason, users of the old Singularity software are encouraged to adjust their scripts accordingly.
On top of the apptainer command, Apptainer provides backwards compatibility by offering singularity as a command-line link. It is also committed to maintaining as much of the CLI and environment functionality of the old Singularity software as possible. From the user's perspective, very little, if anything, should change: the wrapper around the singularity command allows users to run commands like 'singularity pull', 'singularity run', etc. just as before.
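For example, assuming an image URI and resulting file name chosen purely for illustration, existing Singularity-style commands continue to work unchanged:
$ singularity pull docker://ubuntu:22.04    # equivalent to: apptainer pull docker://ubuntu:22.04
$ singularity run ubuntu_22.04.sif          # equivalent to: apptainer run ubuntu_22.04.sif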
Please visit https://apptainer.org for additional information on the Apptainer software and access to its documentation.
Examples of Apptainer commands
The following table summarizes some Apptainer commands (based on version 1.0.3). For more information see the Apptainer User Guide at https://www.apptainer.org/docs/.
General commands | |
help | Help about any command |
Usage commands | |
build | Build an Apptainer image |
cache | Manage the local cache |
capability | Manage Linux capabilities for users and groups |
exec | Run a command within a container |
inspect | Show metadata for an image |
instance | Manage containers running as services |
key | Manage OpenPGP keys |
oci | Manage OCI containers |
plugin | Manage apptainer plugins |
pull | Pull an image from a URI |
push | Upload image to the provided URI |
remote | Manage apptainer remote endpoints |
run | Run the user-defined default command within a container |
run-help | Show the user-defined help for an image |
search | Search a Container Library for images |
shell | Run a shell within a container |
sif | siftool is a program for Singularity Image Format (SIF) file manipulation |
sign | Attach a cryptographic signature to an image |
test | Run the user-defined tests within a container |
verify | Verify cryptographic signatures attached to an image |
version | Show the version for Apptainer |
Global options | |
-d --debug | print debugging information (highest verbosity) |
-h --help | help for apptainer |
-q --quiet | suppress normal output |
-s --silent | only print errors |
-v --verbose | print additional information |
The help command gives an overview of Apptainer options and subcommands.
For example:
$ apptainer help <command> [<subcommand>]
$ apptainer help build
$ apptainer help instance start
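As a brief illustration of the commands listed above, a minimal workflow for pulling and using an image could look as follows (the image URI and file names are placeholders):
$ apptainer pull ubuntu.sif docker://ubuntu:22.04    # download the image into a local SIF file
$ apptainer inspect ubuntu.sif                       # show the metadata of the image
$ apptainer exec ubuntu.sif cat /etc/os-release      # run a single command inside the container
$ apptainer shell ubuntu.sif                         # open an interactive shell in the container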
Apptainer on the MPCDF HPC systems
On the HPC clusters at MPCDF, an environment module is provided to load the Apptainer software. For backwards compatibility, a Singularity module (singularity/link2apptainer) is also provided, which prints a warning message and loads the default Apptainer module. Neither the old Singularity nor the new SingularityCE software will be supported on the HPC clusters.
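A minimal sketch of loading and checking the software on a cluster (assuming the environment module is simply named apptainer; the actual module name and default version may differ):
$ module load apptainer
$ apptainer version    # print the Apptainer version provided by the module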
Charliecloud
Introduction
Among the large number of container engines available nowadays, Charliecloud provides a user-defined software stack specifically developed for High Performance Computing (HPC) centers, with particular attention to security, minimal overhead and ease of use.
Charliecloud runs on Linux systems and isolates the image environment using Linux user namespaces. Unlike other container engines, it does not require privileged operations or daemons at runtime, it can easily access the hardware available on the node (e.g. GPUs), and it can be configured to run on multiple compute nodes (for example using the OpenMPI library). A Charliecloud container image can be created from any Docker image locally available on the user's workstation, producing a tarball that can then be transferred to the HPC cluster where it is intended to run. See the Charliecloud documentation for a list of commands. Note that root permissions are required on the local machine to create a Docker image.
Installation
Charliecloud has minimal software requirements:
Recent Linux kernel (recommended version 4.4 or higher)
Bash (version 4.1 or higher)
C compiler and standard library
Docker (version 17.03 or higher)
and can be conveniently installed from one of the software tarballs (available on the Charliecloud website) using the standard 'configure-make-make install' procedure. Alternatively, Charliecloud is also available from several package managers for local installation.
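A minimal sketch of such a source installation into the user's home directory (the version number and installation prefix are placeholders):
tar xf charliecloud-<version>.tar.gz
cd charliecloud-<version>
./configure --prefix=$HOME/charliecloud    # choose a user-writable installation prefix
make                                       # build the binaries
make install                               # install into the chosen prefix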
Once the software has been installed, any Docker image previously built on the local system can be converted into a Charliecloud image using:
ch-builder2tar <image_name>:<tag> .
This command creates a .tar.gz file in the local folder containing everything that is needed to run the container on the HPC host.
In order to facilitate access to Charliecloud for researchers of the Max Planck Society, Charliecloud has been deployed on the HPC clusters Raven, Cobra and Talos via the module system and can be provided on institute clusters on request. The software can be loaded on the host by simply using
module load charliecloud
Once the container image has been transferred to the HPC cluster (e.g. using scp), the .tar.gz file can be unpacked into an image directory with the command
ch-tar2dir <image_name>.tar.gz <target_dir>
and the container can then be executed on a compute node, for example with
srun ch-run <image_dir> -- echo "I'm in a container"
Multiple options are available, such as the possibility of bind-mounting local folders inside the container in order to access or store software and data available on the host system.
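For instance, a directory on the host can be made visible inside the container with the -b (--bind) option of ch-run (the host and container paths below are placeholders):
srun ch-run -b /ptmp/$USER:/data <image_dir> -- ls /data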
Performance
In order to assess the overhead introduced by the container engine and to test the performance of Charliecloud against software running on 'bare metal', we ran the TensorFlow 2 synthetic benchmark on 1 to 96 Tesla V100 GPUs of the Talos machine-learning cluster. We focused on the following combination of software and libraries:
GCC 8
OpenMPI 4.0.5
CUDA 10.1
cuDNN 7.6.5
NCCL 2.4.8
TensorFlow 2.2.0
Horovod 0.20.2
The resulting total number of images processed by the benchmark for the Charliecloud and the bare-metal runs is shown in the figure below, together with the ideal scaling normalized to a single GPU. The overhead introduced by the container engine is minimal and the performance is almost identical to the bare-metal run.
Figure: TensorFlow 2 synthetic benchmark comparison between bare-metal and Charliecloud runs. Performance is measured as the total number of images per second processed by TensorFlow for an increasing number of GPUs.
Conclusion
Charliecloud has been deployed at the MPCDF and is available via the module system, providing an additional platform to run containers at our HPC facility. The software is fully compatible with images created with Docker and grants effortless access to accelerated hardware and multi-node computation, while maintaining a strong focus on security and ease of use. Charliecloud containers can be efficiently integrated into any Slurm submission script for the HPC clusters available at the MPCDF, without requiring any special privileges on the machine.
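As an illustration, a minimal Slurm batch script could look as follows (partition, resources, paths and the executed command are placeholders and must be adapted to the respective cluster and application):
#!/bin/bash -l
#SBATCH --job-name=charliecloud_test
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --time=00:10:00

# load the Charliecloud module provided by MPCDF
module load charliecloud

# run a command inside the (already unpacked) container image
srun ch-run <image_dir> -- ./my_application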
On their local workstation, users are entrusted with the preparation of the Docker source image and the installation of software into the container.