Storage

The HPC Cloud offers both block storage, in the form of disk volumes that can be attached directly to a server, and file storage, in the form of shared filesystems that can be NFS-mounted by the operating system. In addition, there is an object store providing containers (also called buckets), which allow data to be accessed from multiple clients through standard REST APIs [1]. The block and object storage services are based on Ceph, while file storage offers a choice between CephFS and IBM Storage Scale (GPFS). The latter is also mounted on Raven (by default) and on Robin (on request) for efficient exchange of data between the various systems.

Block

Storage volumes are logically independent block devices of arbitrary size. They can be attached to a running server and later detached or even reattached to a different server [2]. Typically, volumes are used to store data that should persist beyond the lifetime of a single instance, although they are also useful as “scratch” space.

  1. Create an empty volume via the Create Volume button on Project / Volumes / Volumes. The maximum allowable size depends on your project’s quota. The default volume type, Ceph, stores data on an HDD-based pool. A second type, CephSSD, is available on request and targets an SSD-based pool, offering potentially higher performance and lower latency.

  2. Select a server from Project / Compute / Instances, perform the Attach Volume action, and pick the newly created volume.

  3. Device names such as /dev/vdb are not guaranteed to be the same each time a volume is attached to a server. To identify a volume reliably, look in /dev/disk/by-id. Volumes appear there as entries named virtio-ID, where ID is the first 20 characters of the OpenStack volume ID.
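As a minimal sketch of the naming scheme described above, the expected by-id path can be derived from a volume ID with a small shell helper. The function name `volume_device_path` and the volume ID shown are hypothetical placeholders, not part of the HPC Cloud tooling.

```shell
# Hypothetical helper: map an OpenStack volume ID to its expected by-id path.
# Assumes the virtio-ID naming described above (first 20 characters of the ID).
volume_device_path() {
    # cut -c1-20 keeps only the first 20 characters of the volume ID
    short_id=$(printf '%s' "$1" | cut -c1-20)
    printf '/dev/disk/by-id/virtio-%s\n' "$short_id"
}

# Placeholder volume ID, for illustration only
volume_device_path "0123abcd-4567-89ef-0123-456789abcdef"
```

On the server itself, passing the resulting path to `readlink -f` should resolve the symlink to whichever /dev/vdX name the volume currently has.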

CLI Example

openstack volume create VOLUME --size SIZE [--type TYPE]
openstack server add volume SERVER VOLUME

Tip

It is possible to migrate an existing volume to a new type online, even while it is still attached to a server. Simply click the “Change Volume Type” action, choose the target type, and set the migration policy to “On Demand”. The data will then be transparently copied to the new backend storage pool.

openstack volume set VOLUME --type TYPE --retype-policy on-demand

File

Projects can deploy a parallel filesystem within the HPC Cloud, upon request, via Nexus-Posix. Nexus-Posix is based on IBM Storage Scale (formerly Spectrum Scale), a high-performance parallel filesystem, and can be mounted on both HPC Cloud VMs and the Raven HPC system. This cross-mounting lets projects access the same data from both cloud and HPC systems, enabling hybrid HPC+Cloud solutions.

Highlights of Nexus-Posix include:

  • May be mounted on the Raven HPC system and on multiple HPC Cloud VMs

  • Data security: automatic backup included

  • High-performance data transfers and sharing possible via Globus (GO-Nexus)

  • Reservations can be increased as projects grow

Nexus-Posix reservations start at 5 TB and can range into the hundreds of TB. To request a reservation or for more information, please contact the MPCDF helpdesk.

Object

The HPC Cloud includes an object store for saving and retrieving data through a publicly accessible REST API. Both OpenStack Swift- and S3-style APIs are supported. Objects are stored in containers (or buckets), which in turn belong to the project (or tenant). Folders within buckets are supported but are typically handled as part of the object name.

Important

Please be aware that the contents of public buckets are not only visible but can be modified by anyone on the internet.

  1. Create a new bucket via the Container button on Project / Object Store / Containers. Note that buckets, both public and private, share a global namespace. If a bucket name is already taken by another project, you will receive an error message. Please follow the S3 bucket naming rules, which helps avoid problems when accessing objects via the S3 API.

  2. Select the bucket and click the Upload button to upload a file. The object name (confusingly labeled “File Name”) defaults to the filename, but may be an arbitrary string. You may create a folder for the object at the same time by prepending one or more names, separated by “/”.

  3. If you created a public bucket, the contents of the file would then be available at, e.g.: https://objectstore.hpccloud.mpcdf.mpg.de/swift/v1/demobucket/demofolder/demoobject or https://objectstore.hpccloud.mpcdf.mpg.de/demobucket/demofolder/demoobject
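The two URL forms in the last step follow a simple pattern, sketched below as a shell helper. The function name `object_urls` is hypothetical; the endpoint and both path layouts are taken from the example URLs above. The sketch does not URL-encode special characters in bucket or object names.

```shell
# Hypothetical helper: print both public URL forms for an object.
# $1 = bucket name, $2 = object name (may include folders separated by "/").
# Note: no URL-encoding is performed; names are assumed to be URL-safe.
object_urls() {
    endpoint="https://objectstore.hpccloud.mpcdf.mpg.de"
    # Swift-style URL
    printf '%s/swift/v1/%s/%s\n' "$endpoint" "$1" "$2"
    # Shorter form, directly under the endpoint
    printf '%s/%s/%s\n' "$endpoint" "$1" "$2"
}

object_urls demobucket demofolder/demoobject
```

For a public bucket, either printed URL could then be passed to a client such as `curl` to fetch the object.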

CLI Example

openstack container create BUCKET
openstack object create BUCKET FILE --name FOLDER/OBJECT
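Because bucket names should follow the S3 naming rules, it can help to check a name before creating the bucket. The following is a minimal sketch that covers only a subset of those rules (length of 3 to 63 characters; lowercase letters, digits, periods, and hyphens; a letter or digit at both ends; no adjacent periods). The function name `valid_bucket_name` is hypothetical, and the check does not replace the full rule set.

```shell
# Hypothetical helper: check a bucket name against a subset of the
# S3 bucket naming rules. Returns 0 if the name passes, 1 otherwise.
valid_bucket_name() {
    # 3-63 characters; lowercase letters, digits, "." and "-" only;
    # first and last character must be a letter or digit.
    # LC_ALL=C keeps the a-z range limited to ASCII lowercase letters.
    printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$' || return 1
    # Adjacent periods are not allowed
    case $1 in
        *..*) return 1 ;;
    esac
    return 0
}

if valid_bucket_name "demobucket"; then
    echo "demobucket: ok"
fi
```

A guarded call such as `valid_bucket_name "$BUCKET" && openstack container create "$BUCKET"` would then skip creation for names that fail the check.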

Object store quotas are separate from those of other storage types and are not displayed on the dashboard. For questions about your quota, please contact the helpdesk. Note that while the Ceph backend itself supports very large objects, uploads through the dashboard and Swift API are limited to 5 GB [3]. Use the S3 API to avoid this limitation.
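A script that uploads files of varying size could decide up front which API to use. The sketch below assumes the limit means 5 GiB (5 × 1024³ bytes); the exact byte value enforced by the service is not stated here, so treat the constant as an assumption. The names `SWIFT_LIMIT_BYTES` and `needs_s3_upload` are hypothetical.

```shell
# Assumption: "5 GB" is interpreted as 5 GiB = 5 * 1024^3 bytes.
SWIFT_LIMIT_BYTES=5368709120

# Hypothetical helper: succeeds (returns 0) if the given size in bytes
# exceeds the assumed dashboard/Swift upload limit.
needs_s3_upload() {
    [ "$1" -gt "$SWIFT_LIMIT_BYTES" ]
}

# Illustrative sizes: 1 MiB and 6 GiB
for size in 1048576 6442450944; do
    if needs_s3_upload "$size"; then
        echo "$size bytes: use the S3 API"
    else
        echo "$size bytes: dashboard or Swift API is sufficient"
    fi
done
```

In practice the size argument would come from the file to be uploaded rather than from a fixed list.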

Tip

The object store supports a subset of the Amazon S3 API. See CLI and Scripting to get started with s3cmd and/or Boto3.

[1] Note that the term container is unrelated to Docker containers. Where possible we use bucket to avoid confusion.

[2] Attaching a volume to multiple servers simultaneously is not supported.

[3] The openstack command also uses the Swift API and is therefore subject to the same limitation. Note that the --segment-size feature of the older swift command splits the file into a collection of smaller objects which cannot easily be downloaded by other clients.