This tutorial shows how to use Globus platform services such as Search and Flows to build data portals, science gateways and data commons that facilitate data discovery and collaboration. It was presented at the GlobusWorld 2021 conference in Chicago, IL by Vas Vasiliadis.
4. Canonical data access/distribution use case
• Portal/science gateway to distribute data
• Interface to search and discover data of interest
• Asynchronous transfer to user’s system or via HTTPS (e.g. for catalogs)
• Fine-grained authorization enforced on (meta)data
[Diagram: search and request data of interest → transfer data to destination]
5. Common solution components
• Guest collection for “staging” data
• Registered application (manages permissions)
• Search index with faceted queries
• Data transfer, to and from shared endpoint
6. Relevant Globus platform features
• Guest collection creation requires authentication
– Cannot be completely automated
– Must be a managed endpoint
• Roles for management of endpoint and tasks
– Access Manager role grants the right to manage permissions
– Granted to other users, groups or applications
7. Application registration
• Set desired scopes
• Set callback URL
• Get client ID and secret
• Consents implement the principle of least privilege
developers.globus.org
8. Data sharing permissions management
• Permissions are set per folder, on a guest collection
• Permissions management can be automated
• For a user
– Identity: user must log in with this
– Email: user gets a code via email; link to their Globus Account
• For a group
– Group UUID: search for group to get UUID
– Access governed by membership in the group
• For an application
– Application identity: appclientid@clients.auth.globus.org
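The permission options above can be sketched as access rule documents for the Transfer API. This is a minimal sketch, not the presenter's code: the helper names are hypothetical, the UUID values are placeholders, and the field names follow the Transfer API access document as I understand it. Treating an application as an "identity" principal is an assumption here.

```python
def make_access_rule(principal_type, principal, path, permissions="r"):
    """Build an access rule document for one folder on a guest collection.

    Hypothetical helper: it only assembles the JSON body, it does not call Globus.
    """
    if principal_type not in ("identity", "group"):
        raise ValueError("principal_type must be 'identity' or 'group'")
    if permissions not in ("r", "rw"):
        raise ValueError("permissions must be 'r' or 'rw'")
    # Permissions are set per folder, so require a folder-style path.
    if not (path.startswith("/") and path.endswith("/")):
        raise ValueError("path must be a folder path such as '/shared/'")
    return {
        "DATA_TYPE": "access",
        "principal_type": principal_type,
        "principal": principal,
        "path": path,
        "permissions": permissions,
    }


def app_identity_username(client_id):
    """Return the identity username of a registered application."""
    return f"{client_id}@clients.auth.globus.org"


# Placeholder UUIDs, for illustration only.
user_rule = make_access_rule("identity", "00000000-0000-0000-0000-000000000001", "/shared/")
group_rule = make_access_rule("group", "00000000-0000-0000-0000-000000000002", "/shared/", "rw")
```

Each document would be sent as the body of a request that creates an access rule on the guest collection; the email-based invitation path is not covered by this sketch.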
9. Application concepts
• Custom application that can automatically manage permissions
– Can use Globus CLI
• Confidential apps: use client id and secret
– Ensure application is on a secure device
– Set up policy for rotation of secret (limited life tokens)
10. Client credential grant
Parties:
• Application, Science Gateway, Data Portal (Client)
• Globus Auth (Authorization Server)
• Globus Transfer (Resource Server)
Steps:
1. Client authenticates to Globus Auth with app client id and secret
2. Globus Auth returns access tokens
3. Client authenticates as app to Globus Transfer with access tokens (to manage permissions)
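The client credential grant can be sketched with the standard library alone. This is a minimal sketch under stated assumptions: the token URL and the Transfer scope string are what I believe Globus Auth uses, the function names are hypothetical, and the request is only built, never sent.

```python
import base64
import urllib.parse
import urllib.request

# Assumed values, not taken from the slides.
TOKEN_URL = "https://auth.globus.org/v2/oauth2/token"
TRANSFER_SCOPE = "urn:globus:auth:scope:transfer.api.globus.org:all"


def build_token_request(client_id, client_secret, scope=TRANSFER_SCOPE):
    """Step 1: authenticate with the app client id and secret (HTTP Basic)."""
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": scope}
    ).encode("ascii")
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


def bearer_header(access_token):
    """Step 3: authenticate as the app by presenting the access token."""
    return {"Authorization": f"Bearer {access_token}"}


request = build_token_request("my-client-id", "my-client-secret")
```

Sending `request` with `urllib.request.urlopen` would return a JSON document containing the access token (step 2). Keep the secret out of source control and load it from a protected location on a secure host.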
11. Data description and discovery
• (Meta)data store with fine-grained visibility controls
• Schema agnostic → dynamic schemas
• Simple search using URL query parameters
• Complex search using search request document
• API documentation: docs.globus.org/api/search
• Demo: github.com/globus/searchable-files-demo
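The two search styles can be illustrated as follows. This is a rough sketch: the base URL with its `/v1` prefix, the `limit` parameter, and the facet field name `filetype` are assumptions, and the helper names are hypothetical. Nothing is sent over the network.

```python
import urllib.parse

# Assumed base URL for the Search API.
SEARCH_BASE = "https://search.api.globus.org/v1"


def simple_search_url(index_id, q, limit=10):
    """Simple search: everything is expressed as URL query parameters."""
    query = urllib.parse.urlencode({"q": q, "limit": limit})
    return f"{SEARCH_BASE}/index/{index_id}/search?{query}"


def faceted_search_document(q, facet_field, facet_name="by_field", limit=10):
    """Complex search: a search request document sent as a JSON body."""
    return {
        "q": q,
        "limit": limit,
        "facets": [
            {
                "name": facet_name,
                "type": "terms",  # bucket results by distinct values of the field
                "field_name": facet_field,
                "size": 10,
            }
        ],
    }


url = simple_search_url("my-index-id", "abc txt")
document = faceted_search_document("abc", "filetype")
```

The URL form suits quick lookups from a browser or script, while the document form is where facets and filters for a portal's search page would go.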
12. Data ingest with Globus Search
POST /index/{index_id}/ingest
{
  "ingest_type": "GMetaList",
  "ingest_data": {
    "gmeta": [
      {
        "id": "filetype",
        "subject": "https://search.api.globus.org/abc.txt",
        "visible_to": ["public"],
        "content": {
          "metadata-schema/file#type": "file"
        }
      },
      ...
    ]
  }
}
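An ingest document like the one above can be assembled programmatically. This is a minimal sketch with hypothetical helper names; it mirrors the structure shown on the slide and only builds the JSON body.

```python
import json


def gmeta_entry(subject, content, visible_to=("public",), entry_id=None):
    """Build one entry describing a subject (for example, a file URL)."""
    entry = {
        "subject": subject,
        "visible_to": list(visible_to),
        "content": dict(content),
    }
    if entry_id is not None:
        entry["id"] = entry_id
    return entry


def gmeta_list(entries):
    """Wrap entries in the GMetaList ingest envelope."""
    return {"ingest_type": "GMetaList", "ingest_data": {"gmeta": list(entries)}}


ingest_document = gmeta_list(
    [
        gmeta_entry(
            "https://search.api.globus.org/abc.txt",
            {"metadata-schema/file#type": "file"},
            entry_id="filetype",
        )
    ]
)
payload = json.dumps(ingest_document)  # body for the ingest request
```

Building the document from plain dictionaries keeps the schema flexible, which matches the schema-agnostic design of the index.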
13. Data ingest with Globus Search
POST /index/{index_id}/ingest
{
  "ingest_type": "GMetaList",
  "ingest_data": {
    "gmeta": [
      {
        "id": "size",
        "subject": "https://search.api.globus.org/abc.txt",
        "visible_to": ["urn:globus:auth:identity:46bd0f56-e24f-11e5-a510-131bef46955c"],
        "content": {
          "metadata-schema/file#size": "1000000",
          "metadata-schema/file#size_human": "1MB"
        }
      },
      ...
    ]
  }
}
Visibility limited to Globus Auth identity:
– Single user
– Globus Group
– Registered client application
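The `visible_to` values can be generated with small helpers. This is a sketch under assumptions: the identity URN form is taken from the example above, the group URN form is what I believe Globus uses, and representing a registered client application by its identity URN is an assumption. The function names are hypothetical.

```python
import uuid


def identity_urn(identity_id):
    """URN for a single Globus Auth identity (a user, or an app's identity)."""
    # uuid.UUID validates the value and normalizes it to lowercase.
    return f"urn:globus:auth:identity:{uuid.UUID(identity_id)}"


def group_urn(group_id):
    """URN for a Globus Group (assumed format)."""
    return f"urn:globus:groups:id:{uuid.UUID(group_id)}"


visible_to = [
    identity_urn("46bd0f56-e24f-11e5-a510-131bef46955c"),
    # Placeholder group UUID, for illustration only.
    group_urn("00000000-0000-0000-0000-000000000002"),
]
```

Validating the UUID before ingest catches malformed principals early, instead of silently indexing an entry that nobody can see.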
16. Data Access and Sharing
• Set guest collection access rule: POST /endpoint/{endpoint_id}/access (Transfer service)
• Check authenticated user’s Group membership: GET /groups/my_groups (Groups service)
• Submit Transfer task: POST /transfer (Transfer service)
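The membership check and the transfer submission can be sketched as plain data handling. This is a minimal sketch: it assumes the Groups response is a JSON list of objects with an `id` field, it uses the Transfer API document layout as I understand it, and all identifiers are placeholders. The function names are hypothetical.

```python
def is_member(my_groups, group_id):
    """Check the parsed GET /groups/my_groups response for a given group."""
    return any(group.get("id") == group_id for group in my_groups)


def transfer_document(source_endpoint, destination_endpoint,
                      source_path, destination_path,
                      submission_id, recursive=False):
    """Build the body for POST /transfer.

    submission_id is assumed to come from a prior request to the Transfer
    service; it makes resubmitting the same task safe.
    """
    return {
        "DATA_TYPE": "transfer",
        "submission_id": submission_id,
        "source_endpoint": source_endpoint,
        "destination_endpoint": destination_endpoint,
        "DATA": [
            {
                "DATA_TYPE": "transfer_item",
                "source_path": source_path,
                "destination_path": destination_path,
                "recursive": recursive,
            }
        ],
    }


# Placeholder response and identifiers, for illustration only.
groups_response = [{"id": "group-a", "name": "Project A"}]
task = None
if is_member(groups_response, "group-a"):
    task = transfer_document("src-ep", "dst-ep", "/data/", "/incoming/",
                             submission_id="sub-1", recursive=True)
```

Checking membership before submitting lets the portal refuse a request up front instead of creating a task the user cannot read.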
19. Multiple ways to “automate” data management
• Scripts using the CLI (+ cron?)
• Globus Timer service → scheduled/recurring transfers
• Your own code calling the Globus APIs (ugh!)
• Globus Flows service!
– Flows comprise Actions
– Actions execute against an Action Provider service endpoint
– Extend by using the Action Provider Toolkit
action-provider-tools.readthedocs.io/en/latest
20. Let’s deploy and run a simple flow…
Start
1. Initiate a Globus transfer task to move data to a guest collection
2. Add an access rule allowing a Group to access the data
End
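A two-step flow like this one could be described with a definition along the following lines. This is an illustrative sketch, not the flow shown in the tutorial: the action URLs, parameter names, and input paths are assumptions modelled on the general shape of Flows definitions, and should be checked against the Flows documentation before use.

```python
# Assumed Action Provider URLs.
TRANSFER_ACTION_URL = "https://actions.globus.org/transfer/transfer"
SET_PERMISSION_ACTION_URL = "https://actions.globus.org/transfer/set_permission"

FLOW_DEFINITION = {
    "Comment": "Transfer data to a guest collection, then share it with a group",
    "StartAt": "TransferFiles",
    "States": {
        "TransferFiles": {
            "Type": "Action",
            "ActionUrl": TRANSFER_ACTION_URL,
            # Keys ending in ".$" are read from the flow input at run time.
            "Parameters": {
                "source_endpoint_id.$": "$.input.source.id",
                "destination_endpoint_id.$": "$.input.destination.id",
                "transfer_items": [
                    {
                        "source_path.$": "$.input.source.path",
                        "destination_path.$": "$.input.destination.path",
                        "recursive.$": "$.input.recursive",
                    }
                ],
            },
            "ResultPath": "$.TransferFiles",
            "Next": "SetPermission",
        },
        "SetPermission": {
            "Type": "Action",
            "ActionUrl": SET_PERMISSION_ACTION_URL,
            "Parameters": {
                "endpoint_id.$": "$.input.destination.id",
                "path.$": "$.input.destination.path",
                "permissions": "r",
                "principal.$": "$.input.principal_identifier",
                "principal_type": "group",
                "operation": "CREATE",
            },
            "ResultPath": "$.SetPermission",
            "End": True,
        },
    },
}


def state_order(definition):
    """Walk a linear flow from StartAt, following each state's Next."""
    order, name = [], definition["StartAt"]
    while name is not None and name not in order:
        order.append(name)
        name = definition["States"][name].get("Next")
    return order


steps = state_order(FLOW_DEFINITION)
```

Expressing the flow as data means the same definition can be deployed once and run many times with different inputs, which is the main advantage over hand-written API calls.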
22. Resources
• Globus API documentation: docs.globus.org/api
• Helpdesk and issue escalation: support@globus.org
• Mailing list: discuss@globus.org
• Globus professional services team
– Assist with portal/gateway/app architecture and design
– Develop custom applications that leverage the Globus platform
– Advise on customized deployment and integration scenarios
23. Join the Globus community
• Access the service: app.globus.org
• Create a personal endpoint: app.globus.org/file-manager/gcp
• Documentation: docs.globus.org
• Engage: discuss@globus.org
• Subscribe: globus.org/subscriptions
• Need help? support@globus.org
• Follow us: @globus