Globus is a third-party service for transferring large amounts of data between Globus Data Transfer Nodes (DTNs). With Globus, very high data transfer rates are achievable, and data can be made accessible to anyone who has a Globus account.
To use Globus on the NeSI HPC cluster, you need:
- A Globus account (see Initial-Globus-Sign-Up-and-your-Globus-Identities)
- An active My NeSI account. If you don't have one, you can apply online at Creating-a-NeSI-Account
- Access to one or both of our HPCs: Mahuika or Māui.
- Access to the Globus endpoint system where files will be transferred to/from.
Note that having a NeSI account does not create a Globus account, nor can you, as the end user, link the two through a website. Both accounts (My NeSI and Globus) must exist before you try to use our DTN.
The NeSI Data Transfer Node
The NeSI Data Transfer Node (DTN) acts as an interface between our HPC facility storage and a worldwide network of Globus endpoints. This is achieved using Globus.org, a web-based service that solves many of the challenges encountered when moving large volumes of data between systems. While NeSI supports the use of other data transfer tools and protocols such as scp, Globus provides the most comprehensive, efficient, and easy-to-use service for NeSI users who need to move large data sets (more than a few gigabytes at a time).
Types of Globus endpoints or Data Transfer Nodes
Globus data transfers take place between endpoints. An endpoint is nothing more than a system (running Windows, Linux, etc.) that has the Globus endpoint software installed on it. Endpoints come in two kinds: personal and server.
The NeSI DTN is an example of a server endpoint. These types of endpoints are usually configured to access large-capacity, high-performance parallel filesystems. Endpoints can also be unmanaged or managed by a subscription. The NeSI DTN is a managed server endpoint (under the NeSI subscription), which entitles authorized users to use data transfer and data sharing services through their Globus accounts.
Your institution may have its own managed server endpoint, and if so we encourage you to use that endpoint for your data transfers between that institution and NeSI. You may need to apply to the person or group administering the managed server endpoint, most likely your IT team, to get access to it. Your institution may even have several endpoints, in which case we recommend that you consider which one is best suited to your data transfer requirements. If you need any help with this, get in touch with us via firstname.lastname@example.org.
If your institution doesn't have a managed server endpoint, you can set up a personal endpoint using software provided by Globus (see below). Please be aware that even if you set up a personal endpoint, you may still need to consult your IT team in order to make it usable, especially if your institution has an aggressive firewall.
Transferring data using a managed endpoint
As an example, to move files between the NeSI HPC Storage (accessible from Māui and Mahuika) and the Otago University high-capacity central file storage (another managed server endpoint):
Once logged into globus.org, you'll be taken to the Globus File Manager page, where you can search for DTNs in the Collection field.
NeSI endpoints start with "nesi#":
|Endpoint Name||Description||Recommended Use||Apply for Use||Contact|
||NeSI Globus Endpoint, located at NIWA Wellington (Greta Point)||Transferring files to/from maui/mahuika||See conditions as outlined email@example.com|
||Endpoint 02 for the High Capacity Research Storage Cluster, Dunedin Campus, University of Otago||Primary endpoint for Otago Dunedin; uses local service accounts or Globus sharing.||Complete form at https://firstname.lastname@example.org|
||Endpoint provides access to UoA research data.||Transferring files between UoA research drives and maui/mahuika||Apply by email to email@example.com|
||A Globus endpoint attached to AgResearch’s institutional Linux storage platform||Sharing large datasets with external collaborators and moving large datasets between NeSI’s facility and AgResearch’s internal storage||firstname.lastname@example.org|
On maui/mahuika, the NeSI DTN can see only your home directory and the project subdirectories of /nesi/nobackup (see Globus Paths, Permissions, Storage Allocation). Navigate to your project directory on the nobackup filesystem, /nesi/nobackup/<project_code>, and select the two-endpoint panel for the transfer.
Select the target endpoint and authenticate.
Select the files you wish to transfer and click the corresponding "Start" button. The full procedure is:
- Sign in to https://www.globus.org. You will be taken to the File Manager page at https://app.globus.org/file-manager
- Open the two-endpoint panel located on the top-right of the File Manager page.
- Select the endpoints you wish to move files between (start typing "nesi#" to see the list of NeSI DTNs to select from). Authenticate at both endpoints.
- On globus.org, the nesi#hpcf-dtn endpoint defaults to your home directory (represented by "/~/") on mahuika/maui. We do not recommend uploading data to your home directory, as home directories are very small. Instead, navigate to an appropriate project directory under /nesi/nobackup (see Globus Paths, Permissions, Storage Allocation).
- Transfer the files by clicking the appropriate button depending on the direction of the transfer.
- Check your email for the job completion report.
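The steps above use the web File Manager. For repeated transfers, the same operation can also be scripted with the separate Globus command-line tool, which this page does not otherwise cover. The Python sketch below only builds the command; it assumes the Globus CLI is installed and that you have already logged in with it. The endpoint identifiers, the project code `nesi12345`, the source path, and both helper function names are hypothetical placeholders chosen for illustration. The destination is built under /nesi/nobackup/<project_code> rather than the home directory, following the recommendation above.

```python
from pathlib import PurePosixPath
import shlex

# Root of the nobackup filesystem, as named in this document.
NOBACKUP_ROOT = PurePosixPath("/nesi/nobackup")


def nobackup_path(project_code, *parts):
    """Return a path under /nesi/nobackup/<project_code> (hypothetical helper).

    Raises ValueError if the result would fall outside the project directory.
    """
    if not project_code or "/" in project_code or project_code in (".", ".."):
        raise ValueError("invalid project code: %r" % (project_code,))
    base = NOBACKUP_ROOT / project_code
    path = base.joinpath(*parts)
    # Reject '..' components and absolute parts that would leave the project.
    if ".." in path.parts or (path != base and base not in path.parents):
        raise ValueError("path leaves the project directory: %s" % path)
    return str(path)


def build_transfer_command(src_endpoint, src_path, dst_endpoint, dst_path,
                           label, recursive=True):
    """Build the argument list for a `globus transfer` invocation.

    Endpoints are passed as ENDPOINT_ID:PATH pairs, source first.
    """
    cmd = ["globus", "transfer"]
    if recursive:
        cmd.append("--recursive")  # needed when transferring a directory
    cmd += ["--label", label,
            "%s:%s" % (src_endpoint, src_path),
            "%s:%s" % (dst_endpoint, dst_path)]
    return cmd


if __name__ == "__main__":
    # All values below are placeholders, not real endpoints or projects.
    dest = nobackup_path("nesi12345", "incoming", "run01")
    command = build_transfer_command(
        "SOURCE-ENDPOINT-UUID", "/data/run01",
        "NESI-DTN-UUID", dest,
        label="run01 to nobackup",
    )
    # Print the command for review instead of running it.
    print(shlex.join(command))
```

Printing the command rather than executing it lets you check the source and destination before anything is transferred. Once it looks right, you can paste it into a shell, or pass the list to `subprocess.run`.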
Transferring data using a personal endpoint
To transfer files into/out of your laptop, desktop computer or any other system you control, configure it as a Globus Personal Endpoint (see Personal Globus Endpoint Configuration for transfers between personal endpoints).
To share files with others outside your filesystem, see https://docs.globus.org/how-to/share-files/