Tutorial scenario examples
This section explains how to access and use the HDA. It gives step-by-step instructions for calling specific endpoints and for discovering the available services and their descriptions. It also covers listing and searching the available STAC collections and retrieving further information about them.
The collection of example Jupyter notebooks showing how to use the DestinE Data Lake services can be found at Destination Earth on GitHub.
You can also use the JupyterHub Git extension to load the examples into your workspace: select Clone a repository and specify Destination Earth on GitHub as the repository.
DestinE Data Lake - HDA Tutorial
This notebook demonstrates how to use the HDA (Harmonized Data Access) API by sending a few HTTP requests to the API, using Python code.
Please note that the default HDA quota for users includes:
a maximum of 4 requests per second,
bandwidth of up to 20 Mbps per connection, and
data transfer of up to 6 TB per month.
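To stay under the requests-per-second limit when looping over many calls, a client can space its requests out. The class below is a minimal sketch, not part of the HDA tooling: the name RequestThrottle and its parameters are hypothetical, and it only enforces a minimum interval between calls on the client side.

```python
import time


class RequestThrottle:
    """Spaces out calls so that consecutive calls are at least 1/max_per_second apart."""

    def __init__(self, max_per_second=4, clock=time.monotonic, sleep=time.sleep):
        # The clock and sleep functions are injectable so the class can be tested.
        self.min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._last_call = None

    def wait(self):
        """Block until enough time has passed since the previous call."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now
```

Calling throttle.wait() immediately before each requests.get or requests.post keeps a sequential loop at or below the configured rate.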
To run the examples in this article, you can use either the Insula service provided by the DestinE Platform or the STACK Service provided by DEDL. Insula is an integrated coding environment based on JupyterHub, available through the DestinE Platform. The ‘datalake-lab’ folder in Insula’s navigation panel offers useful HDA tutorials.
The basic HDA tutorial notebook and the Python script that manages authentication are also available here:
The notebook requires the Python script to manage authentication.
Import the relevant modules
We start off by importing the relevant modules for HTTP requests and JSON handling, and by writing a small pretty-printing helper for viewing JSON responses in the notebook.
from typing import Union
import requests
import json
import urllib.parse
from requests.auth import HTTPBasicAuth
from IPython.display import JSON
# map
import folium
import folium.plugins
from branca.element import Figure
import shapely.geometry
def display_as_json(response: requests.Response) -> JSON:
    """Displays an HTTP request response as an interactive JSON in JupyterHub.

    Args:
        response (requests.Response): HTTP request response

    Returns:
        JSON: IPython display object wrapping the parsed response body
    """
    if not isinstance(response, requests.Response):
        raise TypeError(f"display_as_json expects a requests.Response parameter, got {type(response)}.")
    return JSON(json.loads(response.text))
Define some constants for the API URLs
In this section, we define the relevant constants, holding the URL strings for the different endpoints.
# IDS
SERVICE_ID = "dedl-hook"
# Use the Collection https://hda.data.destination-earth.eu/ui/dataset/EO.ESA.DAT.SENTINEL-2.MSI.L1C
COLLECTION_ID = "EO.ESA.DAT.SENTINEL-2.MSI.L1C"
ITEM_ID = "S2A_MSIL1C_20230910T050701_N0509_R019_T47VLH_20230910T074321.SAFE"
# Core API
HDA_API_URL = "https://hda.data.destination-earth.eu"
SERVICES_URL = f"{HDA_API_URL}/services"
SERVICE_BY_ID_URL = f"{SERVICES_URL}/{SERVICE_ID}"
# STAC API
## Core
STAC_API_URL = f"{HDA_API_URL}/stac"
CONFORMANCE_URL = f"{STAC_API_URL}/conformance"
## Item Search
SEARCH_URL = f"{STAC_API_URL}/search"
DOWNLOAD_URL = f"{STAC_API_URL}/download"
## Collections
COLLECTIONS_URL = f"{STAC_API_URL}/collections"
COLLECTION_BY_ID_URL = f"{COLLECTIONS_URL}/{COLLECTION_ID}"
## Items
COLLECTION_ITEMS_URL = f"{COLLECTIONS_URL}/{COLLECTION_ID}/items"
COLLECTION_ITEM_BY_ID_URL = f"{COLLECTIONS_URL}/{COLLECTION_ID}/items/{ITEM_ID}"
## HTTP Success
HTTP_SUCCESS_CODE = 200
The following cell will prompt you to provide your own DESP username and password:
import json
import os
from getpass import getpass
import dedl_authentication as deauth
DESP_USERNAME = input("Please input your DESP username or email: ")
DESP_PASSWORD = getpass("Please input your DESP password: ")
auth = deauth.AuthHandler(DESP_USERNAME, DESP_PASSWORD)
access_token = auth.get_token()
auth_headers = {"Authorization": f"Bearer {access_token}"}
The output is not shown here for security reasons; a status code of 200 is reported when authentication succeeds.
Core API
We can start off by requesting the HDA landing page, which provides links to the API definition, the available services as well as the STAC API index.
print(HDA_API_URL)
display_as_json(requests.get(HDA_API_URL))
Services
Requesting the /services endpoint will return the list of services available for users of the platform. It fetches the services from the services catalog database.
print(SERVICES_URL)
display_as_json(requests.get(SERVICES_URL))
https://hda.data.destination-earth.eu/services
The API can also describe a specific service, identified by its serviceID (e.g. de-ecmwf-aviso-lumi).
print(SERVICE_BY_ID_URL)
display_as_json(requests.get(SERVICE_BY_ID_URL))
https://hda.data.destination-earth.eu/services/dedl-hook
STAC API
The HDA is connected to a STAC API, a component of the EO Catalogue.
Core
The STAC API entry point is the /stac endpoint, which exposes the search capabilities of the DEDL STAC interface.
print(STAC_API_URL)
display_as_json(requests.get(STAC_API_URL))
https://hda.data.destination-earth.eu/stac
The user can also list all the conformance classes that the server conforms to by requesting the /stac/conformance endpoint.
print(CONFORMANCE_URL)
display_as_json(requests.get(CONFORMANCE_URL))
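The conformance list can also be checked in code before relying on an optional capability. The sketch below assumes the response follows the usual STAC API / OGC API layout, with the class URIs under a "conformsTo" key; the helper name supports and the sample payload are illustrative and are not taken from an actual HDA response.

```python
def supports(conformance: dict, fragment: str) -> bool:
    """Return True if any declared conformance class URI contains `fragment`."""
    return any(fragment in uri for uri in conformance.get("conformsTo", []))


# Illustrative payload only; a real one would come from
# requests.get(CONFORMANCE_URL).json()
sample_conformance = {
    "conformsTo": [
        "https://api.stacspec.org/v1.0.0/core",
        "https://api.stacspec.org/v1.0.0/item-search",
    ]
}

print(supports(sample_conformance, "item-search"))
```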
Collections
The /stac/collections endpoint returns a FeatureCollection object, listing all STAC collections available to the user.
print(COLLECTIONS_URL)
display_as_json(requests.get(COLLECTIONS_URL))
https://hda.data.destination-earth.eu/stac/collections
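When only the collection identifiers are needed, they can be pulled out of the parsed response. This is a hedged sketch: the STAC API specification places the entries under a "collections" key, while the FeatureCollection wording above suggests a "features" key, so the hypothetical helper below accepts either. Check the actual response before relying on it.

```python
def collection_ids(payload: dict) -> list:
    """Return the 'id' of every collection entry found in the payload."""
    # Assumption: entries live under "collections" (STAC API spec) or "features".
    entries = payload.get("collections") or payload.get("features") or []
    return [entry["id"] for entry in entries if "id" in entry]


# Illustrative payload only; a real one would come from
# requests.get(COLLECTIONS_URL).json()
sample_payload = {
    "collections": [
        {"id": "EO.ESA.DAT.SENTINEL-2.MSI.L1C"},
        {"id": "EO.ESA.DAT.SENTINEL-1.L1_GRD"},
    ]
}

print(collection_ids(sample_payload))
```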
By providing a specific collectionID (e.g. EO.ESA.DAT.SENTINEL-2.MSI.L1C), the user can get the metadata for a specific Collection.
print(COLLECTION_BY_ID_URL)
display_as_json(requests.get(COLLECTION_BY_ID_URL))
https://hda.data.destination-earth.eu/stac/collections/EO.ESA.DAT.SENTINEL-2.MSI.L1C
Items
It is also possible to get the list of items available in a given Collection using Simple Search.
DATETIME = "?datetime=2023-09-09T00:00:00Z/2023-09-20T23:59:59Z"
print(COLLECTION_ITEMS_URL+DATETIME)
display_as_json(requests.get(COLLECTION_ITEMS_URL+DATETIME, headers={'Authorization': 'Bearer {}'.format(access_token)}))
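Instead of typing the interval by hand, the datetime parameter can be generated from Python datetime objects. The helpers below are a small sketch with hypothetical names (datetime_interval, items_query); they format a closed start/end interval in UTC, in the same form as the DATETIME string above.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def datetime_interval(start: datetime, end: datetime) -> str:
    """Format two timezone-aware datetimes as 'start/end' in UTC."""
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("start and end must be timezone-aware")
    if start > end:
        raise ValueError("start must not be later than end")
    start_utc = start.astimezone(timezone.utc).strftime(_FORMAT)
    end_utc = end.astimezone(timezone.utc).strftime(_FORMAT)
    return f"{start_utc}/{end_utc}"


def items_query(start: datetime, end: datetime) -> str:
    """Build the '?datetime=...' query string, keeping ':' and '/' readable."""
    return "?" + urlencode({"datetime": datetime_interval(start, end)}, safe=":/")
```

The result of items_query can be appended to COLLECTION_ITEMS_URL in the same way as the DATETIME constant.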
Item ID
To get the metadata specific to a given item (identified by its itemID in a collection), the user can request the /stac/collections/{collectionID}/items/{itemID} endpoint.
print(COLLECTION_ITEM_BY_ID_URL)
display_as_json(requests.get(COLLECTION_ITEM_BY_ID_URL, headers={'Authorization': 'Bearer {}'.format(access_token)}))
Item search
The STAC API also provides a /stac/search endpoint for item search, intended as a shorthand API for simple queries. It returns the items that match the specified input filters.
By default, the /stac/search endpoint will return the first 20 items found in all the collections available at the /stac/collections endpoint. Filters can be added either via query parameters in a GET request or added to the JSON body of a POST request.
The full detail for each available filter is available in the API documentation.
The query parameters are added at the end of the URL as a query string: ?param1=val1&param2=val2&param3=val3
SEARCH_QUERY_STRING = "?collections=EO.ESA.DAT.SENTINEL-1.L1_GRD&datetime=2023-09-09T00:00:00Z/2023-09-20T00:00:00Z"
print(SEARCH_URL + SEARCH_QUERY_STRING)
display_as_json(requests.get(SEARCH_URL + SEARCH_QUERY_STRING, headers={'Authorization': 'Bearer {}'.format(access_token)}))
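The query string can also be assembled from Python values rather than written by hand. The following sketch uses a hypothetical helper, search_query_string, and assumes the common STAC GET conventions in which collections and bbox are passed as comma-separated lists; the API documentation remains the reference for the parameters HDA accepts.

```python
from urllib.parse import urlencode


def search_query_string(collections, datetime_range=None, bbox=None, limit=None) -> str:
    """Build a '?key=value&...' string for a GET request to the search endpoint."""
    params = {"collections": ",".join(collections)}
    if datetime_range:
        params["datetime"] = datetime_range
    if bbox:
        # bbox order: west, south, east, north
        params["bbox"] = ",".join(str(value) for value in bbox)
    if limit is not None:
        params["limit"] = str(limit)
    # Keep ':', '/' and ',' unescaped so the URL stays readable.
    return "?" + urlencode(params, safe=":/,")


print(search_query_string(
    ["EO.ESA.DAT.SENTINEL-1.L1_GRD"],
    datetime_range="2023-09-09T00:00:00Z/2023-09-20T00:00:00Z",
))
```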
The same filters can be added as the JSON body of a POST request.
BODY = {
    "collections": [
        "EO.ESA.DAT.SENTINEL-1.L1_GRD",
    ],
    "datetime": "2023-09-09T00:00:00Z/2023-09-20T23:59:59Z",
    "bbox": [-11, 35, 50, 72],
    "limit": 10,
}
response = requests.post(SEARCH_URL, json=BODY, headers={'Authorization': 'Bearer {}'.format(access_token)})
display_as_json(response)
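Because a search returns only a limited number of items per response, larger result sets have to be read page by page. The sketch below follows the paging convention of the STAC API specification, where a link with rel "next" may carry its own method, body and merge flag. Whether HDA emits such links is an assumption to verify against a real response, and next_page_request is a hypothetical helper name.

```python
def next_page_request(page: dict, previous_body: dict):
    """Return (method, href, body) for the next page, or None if there is none."""
    for link in page.get("links", []):
        if link.get("rel") == "next":
            body = link.get("body", {})
            if link.get("merge", False):
                # The link body only holds the changed fields; merge with the last request.
                body = {**previous_body, **body}
            return link.get("method", "GET"), link["href"], body
    return None


# Illustrative payload only, not an actual HDA response.
sample_page = {
    "links": [
        {"rel": "self", "href": "https://example.invalid/stac/search"},
        {
            "rel": "next",
            "href": "https://example.invalid/stac/search",
            "method": "POST",
            "body": {"page": 2},
            "merge": True,
        },
    ]
}

print(next_page_request(sample_page, {"limit": 10}))
```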
The metadata of a given item also contains the download link, which the user can use to download that item:
result = json.loads(response.text)
downloadUrl = result['features'][0]['assets']['downloadLink']['href']
print(downloadUrl)
response = requests.get(downloadUrl,stream=True,headers={'Authorization': 'Bearer {}'.format(access_token), 'Accept-Encoding': None})
# If the request was successful, download the file
if response.status_code == HTTP_SUCCESS_CODE:
    print("Downloading dataset...")
    product_id = 'Item'
    filename = product_id + ".zip"
    with open(filename, 'wb') as f:
        for chunk in response.iter_content(chunk_size=1024):
            if chunk:
                f.write(chunk)
                f.flush()
    print("The dataset has been downloaded to: {}".format(filename))
else:
    print("Request Unsuccessful! Error-Code: {}".format(response.status_code))
Downloading dataset...
The dataset has been downloaded to: Item.zip
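The download cell writes to a fixed file name built from product_id = 'Item'. To name the file after the product instead, the item identifier can be read from the same search result. This is a minimal sketch: download_target is a hypothetical helper, it relies on the 'downloadLink' asset key used above and on the standard STAC 'id' field, and it assumes the download is a zip archive.

```python
def download_target(feature: dict, asset_key: str = "downloadLink"):
    """Return (filename, href) for one item of a search response."""
    href = feature["assets"][asset_key]["href"]
    # Fall back to the fixed name used in the tutorial if the item has no id.
    filename = f"{feature.get('id', 'Item')}.zip"
    return filename, href


# Illustrative item only; a real one would be result['features'][0]
sample_feature = {
    "id": "S2A_MSIL1C_20230910T050701_N0509_R019_T47VLH_20230910T074321.SAFE",
    "assets": {"downloadLink": {"href": "https://example.invalid/download/item"}},
}

filename, href = download_target(sample_feature)
print(filename)
```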
Conclusion
This small tutorial presented the HDA API, illustrated with some pieces of Python code showing how to send HTTP requests to the different endpoints, as well as the use of a few filtering capabilities.
Again, more detail on each endpoint can be found in the API documentation.