Sharing Large Files with DataShare
To enable large file transfers via DataShare, we advise using the rclone chunker backend. This recipe focuses on sharing data via a public link; rclone can also be configured to use a standard DataShare user account.
Set up share folder in DataShare
1. Create a new folder for the data in DataShare.
2. Via Sharing - Public Links, create a share with read/write permissions.
3. Copy the link to the clipboard and paste it into a text editor of your choice.
4. Extract the share token (the random-looking string at the end of the URL) and save it for the rclone configuration.
5. Optionally, repeat steps 2-4 to create another share with read-only permissions if the recipient should only be able to download files.
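The token extraction can also be done in the shell rather than by hand. The following is a minimal sketch, assuming the public link ends in the token as its last path component; the URL and the token `EXAMPLETOKEN123` are hypothetical placeholders, and the variable names are chosen only for illustration.

```shell
# Hypothetical public link; replace it with the link copied from DataShare.
share_url="https://datashare.mpcdf.mpg.de/s/EXAMPLETOKEN123"

# Drop a trailing slash if present, then keep everything after the last "/".
share_url="${share_url%/}"
share_token="${share_url##*/}"

echo "$share_token"
```

The resulting value is what goes in place of <sharetoken> in the rclone commands of this recipe.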
Sender: Upload files using rclone
Configure rclone remote and chunking overlay.
> rclone config create testproject webdav url https://datashare.mpcdf.mpg.de/public.php/webdav/ user <sharetoken> pass <sharepass>
> rclone config create testproject-overlay chunker remote testproject: chunk_size 2G hash_type none
The default chunk_size of 2 GB generally works fine. It can be increased up to 20 GB if fewer chunks are desired.
However, very large chunks might cause problems with slow clients or network connections (also relevant during download).
Checksums can be enabled if desired (e.g. hash_type md5), but they take additional time to calculate.
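To get a feeling for how chunk_size translates into the number of objects on the server, the arithmetic can be sketched as below. This is an illustration only: it assumes the 5G example file and the 2G chunk size used in this recipe, and relies on rclone interpreting the G suffix as GiB. The variable names are made up for this sketch.

```shell
# Sizes in GiB; both values come from the example in this recipe.
file_size_gib=5
chunk_size_gib=2

# Ceiling division: round up so that a partial final chunk is counted.
chunks=$(( (file_size_gib + chunk_size_gib - 1) / chunk_size_gib ))

# Size of the last chunk (equals chunk_size_gib if the file divides evenly).
last_chunk_gib=$(( file_size_gib - (chunks - 1) * chunk_size_gib ))

echo "chunks: $chunks, last chunk: ${last_chunk_gib} GiB"
```

For the example file this gives three chunks, two of 2 GiB and one of 1 GiB, in addition to the small metadata file that carries the original name.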
Upload individual files or a whole directory
> rclone copy 5g testproject-overlay: --progress --transfers 1
Transferred: 5G / 5 GBytes, 100%, 52.979 MBytes/s, ETA 0s
Checks: 3 / 3, 100%
Renamed: 3
Transferred: 1 / 1, 100%
Elapsed time: 1m41.6s
The --transfers 1 option ensures that only a single operation is running at a time.
Always use it when doing chunked uploads to DataShare: multiple concurrent transfers can slow things down due to synchronization overhead and generate unnecessary load on the server.
Files on the server
On the server, the folder contains the file with the original name together with its chunks (5g.rclone_chunk.001, 5g.rclone_chunk.002…).
The file with the original name (5g in this example) just contains some metadata (number of chunks, checksums if enabled). The data is split into chunks named <name>.rclone_chunk.<number>.
If desired, chunks can be downloaded via the web interface or curl and assembled manually, e.g. with cat <name>.rclone_chunk.??? > <name>.
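The manual reassembly can be tried out locally without any server access. The sketch below uses small made-up stand-in files in a temporary directory; the name `demo` and the file contents are hypothetical, and the chunk naming assumes rclone's default pattern as shown in the folder listing above.

```shell
# Local demonstration only; the chunk contents are made-up stand-ins
# for files downloaded via the web interface or curl.
workdir=$(mktemp -d)

printf 'part one\n'   > "$workdir/demo.rclone_chunk.001"
printf 'part two\n'   > "$workdir/demo.rclone_chunk.002"
printf 'part three\n' > "$workdir/demo.rclone_chunk.003"

# The shell expands the glob in sorted order, so the zero-padded
# chunk numbers are concatenated in sequence.
cat "$workdir"/demo.rclone_chunk.??? > "$workdir/demo.assembled"

cat "$workdir/demo.assembled"
```

Writing the result to a separate output name, as done here, avoids overwriting the small metadata file in case it was downloaded into the same directory.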
Recipient: Download files again using rclone
For larger data sets, setting up rclone on the recipient's side as well is recommended:
Configure rclone remote and chunking overlay
> rclone config create testproject-readonly webdav url https://datashare.mpcdf.mpg.de/public.php/webdav/ user <sharetoken> pass <sharepass>
> rclone config create testproject-readonly-overlay chunker remote testproject-readonly:
Download individual files or a whole directory
> rclone copy testproject-readonly-overlay:5g 5g-from-remote --progress
Transferred: 5G / 5 GBytes, 100%, 91.694 MBytes/s, ETA 0s
Transferred: 1 / 1, 100%
Elapsed time: 1m2.1s