Tutorial scenario examples

This section explains how to access and use the HDA. It gives step-by-step instructions for calling specific endpoints and for discovering the available services and their descriptions. It also shows how to list and search the available STAC collections and how to retrieve further information about them.

A collection of example Jupyter notebooks showing how to use the DestinE Data Lake services is available at Destination Earth on GitHub.

You can also use the JupyterHub Git extension to load the examples into your Jupyter Notebook: select Clone a repository and specify Destination Earth on GitHub as the repository.

DestinE Data Lake - HDA Tutorial

This notebook demonstrates how to use the HDA (Harmonized Data Access) API by sending a few HTTP requests to the API, using Python code.

Please note that the default HDA quota provided to users includes:

a maximum of 4 requests per second,

bandwidth of up to 20 Mbps per connection, and

a monthly transfer volume of up to 6 TB.
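One way to stay within the request-rate quota is to space out calls on the client side. The following is a minimal sketch, not part of the HDA tooling: the class name `RequestThrottle` and its parameters are hypothetical, and it simply keeps consecutive calls at least 1/4 of a second apart.

```python
import time


class RequestThrottle:
    """Keeps consecutive calls at least 1 / max_per_second seconds apart."""

    def __init__(self, max_per_second=4, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self._clock = clock      # injectable so the behaviour can be checked without waiting
        self._sleep = sleep
        self._last_call = None   # time at which the previous call was allowed to start

    def wait(self):
        """Block until enough time has passed since the previous call."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now


throttle = RequestThrottle(max_per_second=4)
```

Calling `throttle.wait()` immediately before each `requests.get(...)` or `requests.post(...)` in a loop would then pace the requests.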

To run the examples in this article, you can use either the Insula service provided by the DestinE Platform or the STACK Service provided by DEDL. Insula is an integrated coding environment based on JupyterHub, available through the DestinE Platform. The ‘datalake-lab’ folder in Insula’s navigation panel offers useful HDA tutorials.

The basic HDA tutorial notebook and the Python script that manages authentication are also available here:

HDA tutorial

Authentication Import

The notebook requires the Python script to manage authentication.

Import the relevant modules

We start by importing the modules needed for HTTP requests and JSON handling, and by writing a small pretty-printing helper for viewing JSON responses in the notebook.

from typing import Union
import requests
import json
import urllib.parse
from requests.auth import HTTPBasicAuth

from IPython.display import JSON

# map
import folium
import folium.plugins
from branca.element import Figure
import shapely.geometry

def display_as_json(response: requests.Response) -> JSON:
    """Displays an HTTP response as an interactive JSON tree in JupyterHub.

    Args:
        response (requests.Response): HTTP request response
    Returns:
        JSON: IPython display object wrapping the decoded response body
    """
    if not isinstance(response, requests.Response):
        raise TypeError(f"display_as_json expects a requests.Response parameter, got {type(response)}.")
    return JSON(response.json())

Define some constants for the API URLs

In this section, we define the relevant constants, holding the URL strings for the different endpoints.

# IDS
SERVICE_ID = "dedl-hook"

# Use the Collection https://hda.data.destination-earth.eu/ui/dataset/EO.ESA.DAT.SENTINEL-2.MSI.L1C
COLLECTION_ID = "EO.ESA.DAT.SENTINEL-2.MSI.L1C"

ITEM_ID = "S2A_MSIL1C_20230910T050701_N0509_R019_T47VLH_20230910T074321.SAFE"

# Core API
HDA_API_URL = "https://hda.data.destination-earth.eu"
SERVICES_URL = f"{HDA_API_URL}/services"
SERVICE_BY_ID_URL = f"{SERVICES_URL}/{SERVICE_ID}"

# STAC API
## Core
STAC_API_URL = f"{HDA_API_URL}/stac"
CONFORMANCE_URL = f"{STAC_API_URL}/conformance"

## Item Search
SEARCH_URL = f"{STAC_API_URL}/search"
DOWNLOAD_URL = f"{STAC_API_URL}/download"

## Collections
COLLECTIONS_URL = f"{STAC_API_URL}/collections"
COLLECTION_BY_ID_URL = f"{COLLECTIONS_URL}/{COLLECTION_ID}"

## Items
COLLECTION_ITEMS_URL = f"{COLLECTIONS_URL}/{COLLECTION_ID}/items"
COLLECTION_ITEM_BY_ID_URL = f"{COLLECTIONS_URL}/{COLLECTION_ID}/items/{ITEM_ID}"

## HTTP Success
HTTP_SUCCESS_CODE = 200
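The constants above hard-code one collection and one item. As a small sketch of how the same URLs could be built for arbitrary identifiers, the helpers below (hypothetical names `collection_url` and `item_url`, not part of the tutorial notebook) percent-encode each identifier before placing it in the path.

```python
from urllib.parse import quote

HDA_API_URL = "https://hda.data.destination-earth.eu"
COLLECTIONS_URL = f"{HDA_API_URL}/stac/collections"


def collection_url(collection_id: str) -> str:
    """URL of a single collection; the ID is percent-encoded as one path segment."""
    return f"{COLLECTIONS_URL}/{quote(collection_id, safe='')}"


def item_url(collection_id: str, item_id: str) -> str:
    """URL of a single item inside a collection."""
    return f"{collection_url(collection_id)}/items/{quote(item_id, safe='')}"


print(item_url(
    "EO.ESA.DAT.SENTINEL-2.MSI.L1C",
    "S2A_MSIL1C_20230910T050701_N0509_R019_T47VLH_20230910T074321.SAFE",
))
```

Identifiers made only of letters, digits, dots, hyphens and underscores pass through unchanged, so the result matches COLLECTION_ITEM_BY_ID_URL for the IDs used in this tutorial.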

The following cell will prompt you to provide your own DESP username and password:

import json
import os
from getpass import getpass
import dedl_authentication as deauth

DESP_USERNAME = input("Please input your DESP username or email: ")
DESP_PASSWORD = getpass("Please input your DESP password: ")

auth = deauth.AuthHandler(DESP_USERNAME, DESP_PASSWORD)
access_token = auth.get_token()

auth_headers = {"Authorization": f"Bearer {access_token}"}

The output is not shown here for security reasons; when authentication succeeds, status code 200 is displayed.

Core API

We can start off by requesting the HDA landing page, which provides links to the API definition, the available services as well as the STAC API index.

print(HDA_API_URL)
display_as_json(requests.get(HDA_API_URL))

https://hda.data.destination-earth.eu/
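To read the links on the landing page without browsing the interactive JSON view, they can be flattened into a short list. This sketch assumes the landing page follows the usual STAC layout, with a top-level `links` array whose entries carry `rel` and `href`; the sample payload and its `rel` values are made up for illustration, and the function name `summarize_links` is hypothetical.

```python
def summarize_links(landing_page: dict) -> list:
    """Return (rel, href) pairs from a landing-page style JSON document."""
    return [
        (link.get("rel"), link.get("href"))
        for link in landing_page.get("links", [])
    ]


# Hypothetical payload for illustration only; the real landing page may differ.
sample_landing_page = {
    "links": [
        {"rel": "self", "href": "https://hda.data.destination-earth.eu"},
        {"rel": "services", "href": "https://hda.data.destination-earth.eu/services"},
        {"rel": "stac", "href": "https://hda.data.destination-earth.eu/stac"},
    ]
}

for rel, href in summarize_links(sample_landing_page):
    print(f"{rel}: {href}")
```

With a live response, the same function could be applied to `requests.get(HDA_API_URL).json()`.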

Services

Requesting the /services endpoint will return the list of services available for users of the platform. It fetches the services from the services catalog database.

print(SERVICES_URL)
display_as_json(requests.get(SERVICES_URL))

https://hda.data.destination-earth.eu/services

The API can also describe a specific service, identified by its serviceID (e.g. de-ecmwf-aviso-lumi).

print(SERVICE_BY_ID_URL)
display_as_json(requests.get(SERVICE_BY_ID_URL))

https://hda.data.destination-earth.eu/services/de-ecmwf-aviso-lumi

STAC API

The HDA is connected to a STAC API, which is a component of the EO Catalogue.

Core

The STAC API entry point is the /stac endpoint, which exposes the search capabilities of the DEDL STAC interface.

print(STAC_API_URL)
display_as_json(requests.get(STAC_API_URL))

https://hda.data.destination-earth.eu/stac

The /stac/conformance endpoint lists all the conformance classes that the server conforms to.

print(CONFORMANCE_URL)
display_as_json(requests.get(CONFORMANCE_URL))

https://hda.data.destination-earth.eu/stac/conformance
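The conformance response can be used to check whether a capability is advertised before relying on it. The sketch below assumes the response has the `conformsTo` array defined by the STAC API and OGC API specifications; the helper name `conforms_to` is hypothetical, and the sample list is illustrative rather than the actual HDA output.

```python
def conforms_to(conformance: dict, suffix: str) -> bool:
    """True if any advertised conformance class URI ends with `suffix`."""
    return any(
        uri.rstrip("/").endswith(suffix)
        for uri in conformance.get("conformsTo", [])
    )


# Illustrative sample; the classes actually returned by the server may differ.
sample_conformance = {
    "conformsTo": [
        "https://api.stacspec.org/v1.0.0/core",
        "https://api.stacspec.org/v1.0.0/item-search",
    ]
}

print(conforms_to(sample_conformance, "/item-search"))
```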

Collections

The /stac/collections endpoint returns a FeatureCollection object, listing all STAC collections available to the user.

print(COLLECTIONS_URL)
display_as_json(requests.get(COLLECTIONS_URL))

https://hda.data.destination-earth.eu/stac/collections
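For a quick overview, the collection identifiers can be pulled out of the response. This is a sketch under the assumption that the payload follows the STAC API layout, with a top-level `collections` array of objects that each have an `id`; if the actual HDA response uses a different key, the lookup needs adjusting. The function name `collection_ids` and the sample payload are hypothetical.

```python
def collection_ids(payload: dict) -> list:
    """Return the sorted IDs of all collections listed in the payload."""
    return sorted(
        collection["id"]
        for collection in payload.get("collections", [])
        if "id" in collection
    )


# Hypothetical payload containing the two collections used in this tutorial.
sample_collections = {
    "collections": [
        {"id": "EO.ESA.DAT.SENTINEL-2.MSI.L1C"},
        {"id": "EO.ESA.DAT.SENTINEL-1.L1_GRD"},
    ]
}

print(collection_ids(sample_collections))
```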

By providing a specific collectionID (e.g. EO.ESA.DAT.SENTINEL-2.MSI.L1C), the user can get the metadata for a specific Collection.

print(COLLECTION_BY_ID_URL)
display_as_json(requests.get(COLLECTION_BY_ID_URL))

https://hda.data.destination-earth.eu/stac/collections/EO.ESA.DAT.SENTINEL-2.MSI.L1C

Items

It is also possible to get the list of items available in a given Collection using Simple Search.

DATETIME = "?datetime=2023-09-09T00:00:00Z/2023-09-20T23:59:59Z"
print(COLLECTION_ITEMS_URL+DATETIME)
display_as_json(requests.get(COLLECTION_ITEMS_URL+DATETIME, headers={'Authorization': 'Bearer {}'.format(access_token)}))

https://hda.data.destination-earth.eu/stac/collections/EO.ESA.DAT.SENTINEL-2.MSI.L1C/items?datetime=2023-09-09T00:00:00Z/2023-09-20T23:59:59Z
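The datetime filter used above is a closed interval of two UTC timestamps separated by a slash. Rather than typing it by hand, it can be derived from Python datetime objects. The helper below is a sketch with a hypothetical name (`stac_interval`); it requires timezone-aware values so that the conversion to UTC is unambiguous.

```python
from datetime import datetime, timezone


def stac_interval(start: datetime, end: datetime) -> str:
    """Format two timezone-aware datetimes as 'start/end' in UTC."""
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("start and end must be timezone-aware")
    if start > end:
        raise ValueError("start must not be after end")
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    start_utc = start.astimezone(timezone.utc)
    end_utc = end.astimezone(timezone.utc)
    return f"{start_utc.strftime(fmt)}/{end_utc.strftime(fmt)}"


DATETIME = "?datetime=" + stac_interval(
    datetime(2023, 9, 9, tzinfo=timezone.utc),
    datetime(2023, 9, 20, 23, 59, 59, tzinfo=timezone.utc),
)
print(DATETIME)
```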

Item ID

To get the metadata of a given item (identified by its itemID within a collection), the user can request the /stac/collections/{collectionID}/items/{itemID} endpoint.

print(COLLECTION_ITEM_BY_ID_URL)
display_as_json(requests.get(COLLECTION_ITEM_BY_ID_URL, headers={'Authorization': 'Bearer {}'.format(access_token)}))

https://hda.data.destination-earth.eu/stac/collections/EO.ESA.DAT.SENTINEL-2.MSI.L1C/items/S2A_MSIL1C_20230910T050701_N0509_R019_T47VLH_20230910T074321.SAFE
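An item's metadata includes an `assets` mapping, where each asset has an `href`. The sketch below lists those links for one item. The helper name `asset_hrefs` is hypothetical, and the sample item is a trimmed, hand-written stand-in for a real response that keeps only the fields used here.

```python
def asset_hrefs(item: dict) -> dict:
    """Map each asset name of a STAC item to its href."""
    return {
        name: asset["href"]
        for name, asset in item.get("assets", {}).items()
        if "href" in asset
    }


# Trimmed, hand-written stand-in for an item returned by the endpoint above.
sample_item = {
    "id": "S2A_MSIL1C_20230910T050701_N0509_R019_T47VLH_20230910T074321.SAFE",
    "assets": {
        "downloadLink": {
            "href": "https://hda.data.destination-earth.eu/stac/collections/EO.ESA.DAT.SENTINEL-2.MSI.L1C/items/S2A_MSIL1C_20230910T050701_N0509_R019_T47VLH_20230910T074321/download?provider=dedl"
        }
    },
}

for name, href in asset_hrefs(sample_item).items():
    print(f"{name}: {href}")
```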

Item search

The STAC API also provides a /stac/search endpoint, intended as a shorthand for simple queries. It lets users search for items that match the specified filters.

By default, the /stac/search endpoint will return the first 20 items found in all the collections available at the /stac/collections endpoint. Filters can be added either via query parameters in a GET request or added to the JSON body of a POST request.

Each available filter is described in detail in the API documentation.

In a GET request, the parameters are appended to the URL as a query string: ?param1=val1&param2=val2&param3=val3

SEARCH_QUERY_STRING = "?collections=EO.ESA.DAT.SENTINEL-1.L1_GRD&datetime=2023-09-09T00:00:00Z/2023-09-20T00:00:00Z"
print(SEARCH_URL + SEARCH_QUERY_STRING)
display_as_json(requests.get(SEARCH_URL + SEARCH_QUERY_STRING, headers={'Authorization': 'Bearer {}'.format(access_token)}))

https://hda.data.destination-earth.eu/stac/search?collections=EO.ESA.DAT.SENTINEL-1.L1_GRD&datetime=2023-09-09T00:00:00Z/..
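Concatenating the query string by hand works for simple values, but values containing spaces or other reserved characters need encoding. A minimal sketch using the standard library follows; the helper name `search_query_string` is hypothetical, and the `limit` parameter is assumed to be accepted in GET requests as it is in the STAC API specification.

```python
from urllib.parse import urlencode


def search_query_string(collections, datetime_range, limit=None) -> str:
    """Build the '?key=value&...' part of a GET search request."""
    params = {
        "collections": ",".join(collections),  # several IDs are comma-separated
        "datetime": datetime_range,
    }
    if limit is not None:
        params["limit"] = limit
    # Keep ':', '/' and ',' readable; everything else unsafe is percent-encoded.
    return "?" + urlencode(params, safe=":/,")


SEARCH_QUERY_STRING = search_query_string(
    ["EO.ESA.DAT.SENTINEL-1.L1_GRD"],
    "2023-09-09T00:00:00Z/2023-09-20T00:00:00Z",
)
print(SEARCH_QUERY_STRING)
```

Alternatively, the same dictionary can be passed to `requests.get` through its `params` argument, which performs the encoding itself.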

Filters can equally be supplied in the JSON body of a POST request. The example below also adds a bounding box and a limit on the number of results.

BODY = {
    "collections": [
        "EO.ESA.DAT.SENTINEL-1.L1_GRD",
    ],
    "datetime" : "2023-09-09T00:00:00Z/2023-09-20T23:59:59Z",
    "bbox": [-11,35,
              50,72 ],
    "limit": 10,
}
response = requests.post(SEARCH_URL, json=BODY, headers={'Authorization': 'Bearer {}'.format(access_token)})
display_as_json(response)
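In STAC, a two-dimensional bbox is ordered [west, south, east, north], so the body above covers longitudes -11 to 50 and latitudes 35 to 72. Swapping values is an easy mistake, so a small builder with basic checks can help. This is a sketch with a hypothetical function name (`make_search_body`), not part of the HDA client code.

```python
def make_search_body(collections, datetime_range, bbox=None, limit=10) -> dict:
    """Assemble a POST search body, validating the bbox if one is given."""
    body = {
        "collections": list(collections),
        "datetime": datetime_range,
        "limit": limit,
    }
    if bbox is not None:
        west, south, east, north = bbox
        if not (-180 <= west <= 180 and -180 <= east <= 180):
            raise ValueError("longitudes must be within [-180, 180]")
        if not (-90 <= south <= north <= 90):
            raise ValueError("latitudes must satisfy -90 <= south <= north <= 90")
        # west may exceed east for boxes that cross the antimeridian
        body["bbox"] = [west, south, east, north]
    return body


BODY = make_search_body(
    ["EO.ESA.DAT.SENTINEL-1.L1_GRD"],
    "2023-09-09T00:00:00Z/2023-09-20T23:59:59Z",
    bbox=(-11, 35, 50, 72),
)
print(BODY)
```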

The metadata of an item also contains the download link, which can be used to download that item:

result = json.loads(response.text)
first_item = result['features'][0]
downloadUrl = first_item['assets']['downloadLink']['href']
print(downloadUrl)

# Setting 'Accept-Encoding' to None drops requests' default header, so no compressed transfer is requested
response = requests.get(downloadUrl, stream=True, headers={'Authorization': 'Bearer {}'.format(access_token), 'Accept-Encoding': None})

# If the request was successful, download the file
if response.status_code == HTTP_SUCCESS_CODE:
    # Name the file after the item rather than using a fixed placeholder
    product_id = first_item['id']
    print("Downloading {}...".format(product_id))
    filename = product_id + ".zip"
    with open(filename, 'wb') as f:
        for chunk in response.iter_content(chunk_size=1024):
            if chunk:
                f.write(chunk)
    print("The dataset has been downloaded to: {}".format(filename))
else:
    print("Request Unsuccessful! Error-Code: {}".format(response.status_code))

https://hda.data.destination-earth.eu/stac/collections/EO.ESA.DAT.SENTINEL-2.MSI.L1C/items/S2A_MSIL1C_20230910T050701_N0509_R019_T47VLH_20230910T074321/download?provider=dedl

Downloading S2A_MSIL1C_20230910T050701_N0509_R019_T47VLH_20230910T074321.SAFE...
The dataset has been downloaded to: S2A_MSIL1C_20230910T050701_N0509_R019_T47VLH_20230910T074321.SAFE.zip
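If a download is interrupted, the loop above can leave a truncated .zip file behind. One possible safeguard, sketched here with a hypothetical helper name (`save_stream`), is to write the chunks to a temporary file in the same folder and rename it only once the stream has been fully consumed.

```python
import os
import tempfile


def save_stream(chunks, destination: str) -> int:
    """Write an iterable of byte chunks to `destination` via a temporary file.

    Returns the number of bytes written. The destination only appears once
    all chunks have been written successfully.
    """
    directory = os.path.dirname(os.path.abspath(destination))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".part")
    written = 0
    try:
        with os.fdopen(fd, "wb") as handle:
            for chunk in chunks:
                if chunk:  # skip empty keep-alive chunks
                    handle.write(chunk)
                    written += len(chunk)
        os.replace(tmp_path, destination)  # rename over the final name
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return written
```

With the streamed response from the cell above, this could be called as `save_stream(response.iter_content(chunk_size=1024 * 1024), filename)`; the larger chunk size is a suggestion, not a value taken from the HDA documentation.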

Conclusion

This short tutorial presented the HDA API, illustrated with Python snippets that send HTTP requests to its endpoints and use some of its filtering capabilities.

Again, more detail on each endpoint can be found in the API documentation.