MPCDF DataHub and Globus Online

To provide improved functionality for large-scale data transfer and sharing, the MPCDF has obtained a Globus Online subscription. The subscription makes advanced functionality available both on the Globus Connect Server (DataHub), which is deployed at MPCDF, and on Globus Connect Personal clients, which MPCDF users may deploy on laptops, desktops and the login nodes of their Linux clusters.

Globus Online is a third-party transfer service that enables fire-and-forget data transfer at TB or multi-TB scale. It is well established and widely used: many computing centres, research institutes and universities have Globus services installed for their users.

Here we will describe how you can gain access to Globus Online and make use of DataHub and Globus Connect Personal clients to transfer and share large data sets.

Creating a Globus Online Account

Data transfers and sharing are managed via the Globus Online Web App. The DataHub itself is available to all MPCDF users by default; however, to make use of Globus Online you will need to create an account within the Web App. Navigate to https://globus.org and click “Log in”.

Account-1

Many MPG institutes can use the existing “Organizational login” by selecting “Max-Planck-Gesellschaft”, as shown in the following screenshot.

Account-1a

This will forward you to the MPG SSO site.

Account-1b

If your institute is currently not supported, you will need to use a Google ID or an ORCID ID, or create a Globus ID. Note: This account is not linked to MPCDF, and a different password should be chosen.

Account-2

Ideally, we suggest that you use a Globus ID or an ORCID ID.

During the signup process you will be asked for an email address. Please use your MPCDF email address whenever possible. This will help us when accepting users into the Globus Connect Personal Plus group (detailed later in this document).

When selecting a Globus ID:

Account-3

Account-4

When selecting an ORCID ID:

Account-5

Account-6

Data Transfer and Sharing

Now that you have a Globus Online account, you can access the Globus Connect Server instance at MPCDF (DataHub) and make use of Globus Connect Personal clients.

The DataHub can be used to transfer or share large data volumes, complementing the data services at MPCDF, such as DataShare.

DataHub offers scratch-based storage with a quota of 50 TB per user. Data is removed after 30 days.
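
As a rough planning aid, the quota and retention figures above can be turned into a small check. The following is a minimal Python sketch, not an MPCDF tool. It assumes that 50 TB means decimal terabytes and that the 30 days are counted from the time the data was stored, neither of which is stated here. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Figures taken from the text above; their exact interpretation is an assumption.
QUOTA_BYTES = 50 * 10**12        # 50 TB, assuming decimal terabytes
RETENTION = timedelta(days=30)   # assumed to count from the time of storage


def fits_in_quota(used_bytes: int, new_bytes: int, quota_bytes: int = QUOTA_BYTES) -> bool:
    """Return True if adding new_bytes keeps the user within the quota."""
    return used_bytes + new_bytes <= quota_bytes


def expected_removal(stored_at: datetime) -> datetime:
    """Return the earliest time at which data stored at stored_at may be removed."""
    return stored_at + RETENTION


if __name__ == "__main__":
    stored = datetime(2024, 1, 1, tzinfo=timezone.utc)  # example date only
    print("Fits:", fits_in_quota(used_bytes=45 * 10**12, new_bytes=4 * 10**12))
    print("Copy the data elsewhere before:", expected_removal(stored).date())
```

Because DataHub is scratch storage, a check like this is mainly useful for deciding when staged data must be moved to a permanent location.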

Transfer

To start a transfer, log in to the Globus Online Web App and navigate to the File Manager section.

Transfer-1

The left and right panels allow you to access different storage resources (Data Collections). These may be either Globus Connect Servers or Globus Connect Personal clients.

The MPCDF DataHub service offers a generic Data Collection called “MPCDF DataHub Stage-and-Share Area”. This provides the same scratch-based /data area that was mounted on the previous DataHub Endpoint mpcdf#datahub. Note: mpcdf#datahub is no longer available since the DataHub upgrade (09.03.22).

The collections can be found by using the search function in the File Manager or Bookmarks section of the Globus Online Web App.

To access a Collection, follow the usual login steps: enter your MPCDF username and password when prompted on the login.datahub.mpcdf.mpg.de site, then link an identity from “MPCDF DataHub OIDC Server (login.datahub.mpcdf.mpg.de)”. Once this identity is linked, use it (username@login.datahub.mpcdf.mpg.de) to access the Collection.
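
The identity used to access the Collection follows a fixed pattern, so it can be derived from the MPCDF username. A minimal Python sketch, with a hypothetical helper name and a made-up example username:

```python
DATAHUB_LOGIN_DOMAIN = "login.datahub.mpcdf.mpg.de"  # domain quoted in the text above


def datahub_identity(mpcdf_username: str) -> str:
    """Build the linked identity string for a bare MPCDF username."""
    username = mpcdf_username.strip()
    if not username or "@" in username or any(c.isspace() for c in username):
        raise ValueError("expected a bare MPCDF username without '@' or spaces")
    return f"{username}@{DATAHUB_LOGIN_DOMAIN}"


if __name__ == "__main__":
    # 'jdoe' is a placeholder, not a real account.
    print(datahub_identity("jdoe"))
```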

Once you have accessed the collection, navigate to your home area, where you can store data. Note: You cannot copy data to the base directory /data of the collection.

Transfer-2

Globus Connect Personal clients do not require password activation. Once a client has been started, you can simply search for it and open it in one of the panels.

Now you are ready to Transfer or Share data. Note that clicking on the “Transfer and Timer Options” provides more options to configure transfers, such as adding encryption or tuning data synchronization options.

Transfer-3

To start a data transfer, select the files or folders you wish to transfer and click “Start”. You can view the details of the transfer by clicking on the link provided in the green box (top right) or by selecting the “Activity” section from the left-hand column.
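
A transfer of this kind can also be scripted. The sketch below is an illustration only and not part of the MPCDF instructions. It assumes the separately installed Globus command line interface (`globus`) and builds a `globus transfer` command without running it. The flag names are given as recalled from the Globus CLI and should be checked against `globus transfer --help`. The collection IDs and paths are placeholders.

```python
import shlex
from typing import List, Optional

SYNC_LEVELS = {"exists", "size", "mtime", "checksum"}  # assumed CLI choices


def build_transfer_command(
    src_collection: str,
    src_path: str,
    dst_collection: str,
    dst_path: str,
    *,
    recursive: bool = True,
    encrypt: bool = False,
    sync_level: Optional[str] = None,
    label: Optional[str] = None,
) -> List[str]:
    """Build (but do not run) an argument list for 'globus transfer'."""
    if sync_level is not None and sync_level not in SYNC_LEVELS:
        raise ValueError(f"unknown sync level: {sync_level!r}")
    cmd = ["globus", "transfer"]
    if recursive:
        cmd.append("--recursive")              # transfer a whole folder
    if encrypt:
        cmd.append("--encrypt-data")           # encryption option mentioned above
    if sync_level is not None:
        cmd += ["--sync-level", sync_level]    # synchronization option mentioned above
    if label is not None:
        cmd += ["--label", label]
    cmd += [f"{src_collection}:{src_path}", f"{dst_collection}:{dst_path}"]
    return cmd


if __name__ == "__main__":
    # Placeholder collection IDs and paths; replace with your own.
    command = build_transfer_command(
        "SOURCE-COLLECTION-ID", "/~/results/",
        "DEST-COLLECTION-ID", "/~/staging/results/",
        encrypt=True, sync_level="checksum", label="example transfer",
    )
    print(shlex.join(command))
```

Building the argument list separately from running it makes it easy to inspect the command before passing it to `subprocess.run`.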

Transfer-4

Note: the Activity Page can be used to view Transfer and Delete actions during the past 90 days.

Activity-1

Sharing

You can share a Folder by selecting the Folder and clicking the “Share” option in the central column.

Share-1

This will first ask you to consent to managing the shared collection. After this you will need to click “Add Guest Collection”.

Share-2

Share-3

When creating a new Guest Collection you will be able to select the directory (Folder) and add basic metadata including keyword tags (which will later allow users to search for your data). At the very least you must enter a “Display Name” which will be the Guest Collection Name.

Share-4

After the Guest Collection has been created you will be able to choose which users (or groups) you wish to share with. Click “Add Permissions - Share With”.

Share-5

Now you can choose whether to share with a Globus Online user, a group, all Globus users, or even make the data public.

Share-6

To share with a user, enter their Globus username in the search box. As you type, a real-time search is performed to help you find the user.

You can share with any Globus user; they do not need to have an account at the MPCDF.

Once the permission has been added, the user will receive an email. You can then manage their access to the shared collection: change the Read/Write options or even remove the user's access. Additionally, you can select further users with whom to share this Guest Collection.
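
For readers who script sharing, the choices above (a single user, a group, all Globus users, or public) correspond to the principal types of an access rule in the Globus Transfer API. The following Python sketch only builds such a rule as a dictionary and does not contact Globus. The field names are given as recalled from the Transfer API documentation and should be verified there. The helper name and the IDs are hypothetical.

```python
from typing import Dict

# Assumed Transfer API values; verify against the official documentation.
PRINCIPAL_TYPES = {"identity", "group", "all_authenticated_users", "anonymous"}
PERMISSIONS = {"r", "rw"}  # read-only or read/write


def make_access_rule(
    principal_type: str,
    path: str,
    permissions: str = "r",
    principal: str = "",
) -> Dict[str, str]:
    """Build an access rule document for a Guest Collection folder."""
    if principal_type not in PRINCIPAL_TYPES:
        raise ValueError(f"unknown principal type: {principal_type!r}")
    if permissions not in PERMISSIONS:
        raise ValueError(f"unknown permissions: {permissions!r}")
    needs_id = principal_type in {"identity", "group"}
    if needs_id and not principal:
        raise ValueError("a user or group ID is required for this principal type")
    if not (path.startswith("/") and path.endswith("/")):
        raise ValueError("path must be a folder such as '/' or '/results/'")
    return {
        "DATA_TYPE": "access",
        "principal_type": principal_type,
        # Only users and groups are identified by an ID.
        "principal": principal if needs_id else "",
        "path": path,
        "permissions": permissions,
    }


if __name__ == "__main__":
    # Placeholder identity ID; a real one is a UUID issued by Globus.
    print(make_access_rule("identity", "/results/", "rw", principal="USER-IDENTITY-ID"))
    print(make_access_rule("anonymous", "/public/"))  # public, read-only
```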

Share-7

Globus Connect Personal (Plus)

Installing Globus Connect Personal

A Globus Connect Personal client allows you to access data on a laptop, desktop or Linux cluster login node in the same way you would access data on a Globus Connect Server.

You can have any number of Globus Connect Personal deployments, allowing access to data on several systems.

To install a Globus Connect Personal client, select the “File Manager” section in the Globus Online Web App and click the “Get Globus Connect Personal” link at the bottom.

GCP-1

You will then be able to select your operating system to download the correct client.

GCP-3

By clicking on “Learn More about Globus Connect Personal” you can follow the links to more detailed documentation about the installation and configuration of Globus Connect Personal.

GCP-4

Once installed, you can activate a client via the command line or the GUI. The client will then be visible in your list of collections, just like any standard Globus Connect Server.

Enhanced Functionality with Globus Connect Personal “Plus”

The MPCDF Subscription allows users to gain access to enhanced Globus Connect Personal functionality, namely sharing and client-to-client transfers, by joining the “Max Planck Data Facility Globus Connect Plus” group.

To join the group, navigate to your “Groups” and type the group name “Max Planck Data Facility Globus Connect Plus” in the search box. Once you see the group, you will be able to click the option to join it.

Groups-1

At this point the MPCDF subscription managers will receive a notification of your request and will grant access. Note: If you used your MPCDF email address when registering your account, this process will be simpler for the subscription managers. Using a non-MPCDF email address may delay your admission to the group.

More Information:

More information on Globus Online and Globus Connect Personal can be found in the Globus documentation.

For specific questions about MPCDF support for Globus Online, please create a helpdesk ticket.