Provided Hooks
Overview
On this page, we give more details on the ‘Provided Hooks’, i.e. those workflows and functions that have been pre-developed and pre-deployed by Destination Earth Data Lake.
The Hook service provides ready-to-use, high-level serverless workflows and functions preconfigured to efficiently access and manipulate Destination Earth Data Lake (DEDL) data. A growing number of workflows and functions provide on-demand capabilities for diverse satellite data analysis needs.
The full list of available Hooks is given in the Hook Descriptions section further below.
Note
The main processor of use to Destination Earth Data Lake users is the data-harvest
processor; with it you can download data of interest to your S3 Object Storage.
The collection of Jupyter Notebook examples showing how to use the DestinE Data Lake services can be found at Destination Earth on GitHub.
Getting Started
Accessing the Jupyter Notebook Hook Tutorial
The simplest way to run the Hook Tutorial is to use the Destination Earth Data Lake (DEDL) - JupyterHub - Stack Service, which already has the GitHub destination-earth repository cloned.
Alternatively, you can access the HOOK folder of the destination-earth repository on GitHub yourself, then click on the Hook Tutorial Notebook.
For more information on (DEDL) Stack Services, refer to Run a Notebook on JupyterHub.
The notebook is ready to use and by default creates a request to execute the data-harvest hook.
The notebook can be used with an optional .env_tutorial file that loads environment variables for use in the notebook (see the README.md).
The notebook walks through the following steps:
- Install Python package requirements.
- Import packages and load optional environment variables from file.
- Enter your DESP username and password (this will get a token allowing you to interact with the Hook API - OnDemand Processing API).
- Set up static variables (sets the root URL of the Hook API - OnDemand Processing API: https://odp.data.destination-earth.eu/odata/v1/).
- List available workflows (gives a list of Hooks - Names and Display Names - and also prints out the JSON response).
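The authentication and setup steps above can be sketched as follows. The token endpoint and payload shape are assumptions (the tutorial notebook contains the real values); only the root URL is the one quoted above.

```python
import json

# Root URL of the Hook API (OnDemand Processing API), as quoted in the tutorial.
ODP_BASE_URL = "https://odp.data.destination-earth.eu/odata/v1/"

# Hypothetical DESP token endpoint -- placeholder only; use the one from the notebook.
TOKEN_URL = "https://example.invalid/desp/token"

def build_token_request(username, password):
    """Form payload for an assumed password-grant token request."""
    return {
        "grant_type": "password",
        "username": username,
        "password": password,
    }

# The notebook would POST this payload and keep the returned token, e.g.:
#   token = requests.post(TOKEN_URL, data=build_token_request(user, pwd)).json()["access_token"]
print(json.dumps(build_token_request("my-desp-user", "my-desp-password")))
```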
Now we choose a workflow so that we can see the details of how to execute it (the default is ‘data-harvest’). The JSON response shows you the options available.
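As a sketch of what such a request URL can look like (the Workflows entity name and the $filter syntax are assumptions based on common OData conventions; only the root URL comes from the tutorial):

```python
ODP_BASE_URL = "https://odp.data.destination-earth.eu/odata/v1/"

def workflows_url(name=None):
    """Build a URL listing all workflows, or filtering to one by Name."""
    url = ODP_BASE_URL + "Workflows"
    if name is not None:
        # OData filter syntax; string literals are single-quoted.
        url += f"?$filter=Name eq '{name}'"
    return url

print(workflows_url())                # list every available Hook
print(workflows_url("data-harvest"))  # details for the data-harvest Hook
```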
Set the Name of the order (order_name). This allows us to easily identify the Order in following steps.
Next (optional) we set the PRIVATE bucket details and access_key, secret_key (only necessary if you want to use a private bucket for output). By default, this tutorial uses TEMPORARY storage.
Next we set some obligatory parameters for the Order (in particular the collection_id and data_id). Also, by default ‘TEMPORARY’ storage is set and the source_type is ‘DESP’, i.e. the simplified configuration using DESP credentials, which uses the DEDL HDA component in the background.
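Putting the parameters just described together, an order body might be assembled like this. The key names mirror the tutorial's wording (order_name, collection_id, data_id, storage, source_type) and are assumptions rather than the exact API schema:

```python
def build_data_harvest_order(order_name, collection_id, data_id,
                             storage="TEMPORARY", source_type="DESP"):
    """Assemble the obligatory data-harvest parameters described above."""
    return {
        "order_name": order_name,        # label used to find the Order later
        "workflow": "data-harvest",
        "collection_id": collection_id,
        "data_id": data_id,
        "storage": storage,              # tutorial default: TEMPORARY storage
        "source_type": source_type,      # DESP = simplified config via DEDL HDA
    }

order = build_data_harvest_order("my-first-order", "<collection-id>", "<data-id>")
print(order["storage"], order["source_type"])
```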
Next we execute the order. Note that hooks normally have a simplified configuration using the DESP source_type (which gets data from the DEDL HDA component and uses DESP credentials).
Now that the order has been made, we can check its status (queue, in_progress, completed).
Here we see that the order with Id 25759 is completed (and ready to download from TEMPORARY storage - the DownloadLink expires after 2 weeks).
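The status check can be mimicked with a small helper; the item shape (Id, Status, DownloadLink fields) is an assumption modelled on the tutorial's output, and the example items are fabricated:

```python
def completed_download_links(order_items):
    """Return the DownloadLink of every item whose Status is 'completed'."""
    return [item["DownloadLink"]
            for item in order_items
            if item.get("Status") == "completed"]

# Fabricated example items, echoing the statuses mentioned above.
items = [
    {"Id": 25759, "Status": "completed", "DownloadLink": "https://example.invalid/25759.zip"},
    {"Id": 25760, "Status": "in_progress"},
]
print(completed_download_links(items))  # only the completed order's link
```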
Following this, we have some code that lists the files in your PRIVATE storage (if this option was selected).
Finally, this section checks whether we are using TEMPORARY storage, gives the status with DownloadLink, and offers the option to download the file(s) programmatically.
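A minimal sketch of the programmatic download, assuming you already hold a DownloadLink (the URL below is a fabricated placeholder):

```python
from urllib.parse import urlparse
from pathlib import PurePosixPath

def filename_from_link(link):
    """Derive a local filename from a DownloadLink URL."""
    return PurePosixPath(urlparse(link).path).name

link = "https://example.invalid/orders/25759/output.zip"
# The actual fetch is a network call, so it is left commented out:
#   import urllib.request
#   urllib.request.urlretrieve(link, filename_from_link(link))
print(filename_from_link(link))  # output.zip
```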
Hook Descriptions
The table below lists the pre-developed and pre-deployed Hooks made available by Destination Earth Data Lake:
Data-harvest is a workflow that allows users to download data from external sources. It requires a URL to the external catalogue, credentials, and the data to download. The workflow is mainly used to download data from HDA (https://hda.data.destination-earth.eu/) using STAC.
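As a rough illustration of those three ingredients (the field names here are assumptions for illustration, not the exact data-harvest schema):

```json
{
  "workflow": "data-harvest",
  "catalogue_url": "https://hda.data.destination-earth.eu/",
  "credentials": {"username": "<desp-user>", "password": "<desp-password>"},
  "data_to_download": ["<stac-item-id>"]
}
```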