![](https://github.com/destination-earth/DestinE-DataLake-Lab/blob/main/img/DestinE-banner.jpg?raw=true)



# DEDL - HDA Tutorial

<br> Author: EUMETSAT <br>

<div class="alert alert-block alert-success">
<h3>First steps using the Harmonised Data Access API</h3>
<li> Discover Data of DestinE Data Portfolio </li>
<li> Access Data of DestinE Data Portfolio </li>
</div>

This notebook demonstrates how to use the HDA (Harmonised Data Access) API by sending a few HTTP requests to it from Python.

The detailed API definition, covering each endpoint and its parameters, is available in the HDA Swagger UI at:

https://hda.data.destination-earth.eu/docs/

<div class="alert alert-block alert-warning">
<b> Prerequisites: </b>
<li> For Data discovery: none </li>
<li> For Data access : <a href="https://platform.destine.eu/"> DestinE user account</a> </li>
</div>

## Import the relevant modules
We start by importing the modules needed for HTTP requests and JSON handling, and by defining a small pretty-printing helper for viewing JSON responses in the notebook.

In [None]:
!pip install --quiet folium

In [None]:
import requests
import json

from IPython.display import JSON

# map
import folium
import folium.plugins
from branca.element import Figure
import shapely.geometry

def display_as_json(response: requests.Response) -> JSON:
    """Renders an HTTP request response as an interactive JSON object in Jupyter Hub.
    
    Args:
        response (requests.Response): HTTP request response
    Returns:
        IPython.display.JSON: displayable object, rendered when it is the last expression of a cell
    """
    if not isinstance(response, requests.Response):
        raise TypeError(f"display_as_json expects a requests.Response parameter, got {type(response)}.")
    # response.json() parses the body; it raises if the body is not valid JSON
    return JSON(response.json())


## Define some constants for the API URLs
In this section, we define the relevant constants, holding the URL strings for the different endpoints.

In [None]:
# IDS
SERVICE_ID = "dedl-hook"

# Use the Collection https://hda.data.destination-earth.eu/ui/dataset/EO.ESA.DAT.SENTINEL-2.MSI.L1C
COLLECTION_ID = "EO.ESA.DAT.SENTINEL-2.MSI.L1C"

ITEM_ID = "S2A_MSIL1C_20230910T050701_N0509_R019_T47VLH_20230910T074321.SAFE"

# Core API
HDA_API_URL = "https://hda.data.destination-earth.eu"
SERVICES_URL = f"{HDA_API_URL}/services"
SERVICE_BY_ID_URL = f"{SERVICES_URL}/{SERVICE_ID}"

# STAC API
## Core
STAC_API_URL = f"{HDA_API_URL}/stac"
CONFORMANCE_URL = f"{STAC_API_URL}/conformance"

## Item Search
SEARCH_URL = f"{STAC_API_URL}/search"
DOWNLOAD_URL = f"{STAC_API_URL}/download"

## Collections
COLLECTIONS_URL = f"{STAC_API_URL}/collections"
COLLECTION_BY_ID_URL = f"{COLLECTIONS_URL}/{COLLECTION_ID}"

## Items
COLLECTION_ITEMS_URL = f"{COLLECTIONS_URL}/{COLLECTION_ID}/items"
COLLECTION_ITEM_BY_ID_URL = f"{COLLECTIONS_URL}/{COLLECTION_ID}/items/{ITEM_ID}"

## HTTP Success
HTTP_SUCCESS_CODE = 200

The collection used for this tutorial is [Sentinel-2 MSI Level 1C](https://hda.data.destination-earth.eu/ui/dataset/EO.ESA.DAT.SENTINEL-2.MSI.L1C).

## Obtain Authentication Token

In [46]:
import json
import os
from getpass import getpass
import dedl_authentication as deauth

DESP_USERNAME = input("Please input your DESP username or email: ")
DESP_PASSWORD = getpass("Please input your DESP password: ")

auth = deauth.AuthHandler(DESP_USERNAME, DESP_PASSWORD)
access_token = auth.get_token()

auth_headers = {"Authorization": f"Bearer {access_token}"}

Please input your DESP username:  eum-dedl-user
Please input your DESP password:  ········


Response code: 200


## Core API

We can start off by requesting the HDA landing page, which provides links to the API definition (links `service-desc` and `service-doc`), the available services as well as the STAC API index.  

In [47]:
print(HDA_API_URL)
display_as_json(requests.get(HDA_API_URL))

https://hda.data.destination-earth.eu



### Services
Requesting the `/services` endpoint will return the list of services available for users of the platform.
It fetches the services from the **services catalog** database.

In [48]:
print(SERVICES_URL)
display_as_json(requests.get(SERVICES_URL))

https://hda.data.destination-earth.eu/services



The API can also describe a specific service, identified by its `serviceID` (e.g. **dedl-hook**).

In [None]:
print(SERVICE_BY_ID_URL)
display_as_json(requests.get(SERVICE_BY_ID_URL))

## STAC API
The HDA is connected to a STAC API.
### Core
The STAC API entry point is the `/stac` endpoint, which exposes the search capabilities of the DEDL STAC interface.

In [None]:
print(STAC_API_URL)
display_as_json(requests.get(STAC_API_URL))

The user can also retrieve the list of all conformance classes that the server conforms to by requesting the `/stac/conformance` endpoint.

In [None]:
print(CONFORMANCE_URL)
display_as_json(requests.get(CONFORMANCE_URL))

### Collections
The `/stac/collections` endpoint returns a JSON object listing all STAC collections available to the user.

In [49]:
print(COLLECTIONS_URL)
display_as_json(requests.get(COLLECTIONS_URL))

https://hda.data.destination-earth.eu/stac/collections



By providing a specific `collectionID` (e.g. **EO.ESA.DAT.SENTINEL-2.MSI.L1C**), the user can get the metadata for a specific `Collection`.

In [50]:
print(COLLECTION_BY_ID_URL)
display_as_json(requests.get(COLLECTION_BY_ID_URL))

https://hda.data.destination-earth.eu/stac/collections/EO.ESA.DAT.SENTINEL-2.MSI.L1C



### Items
It is also possible to get the list of items available in a given `Collection` using Simple Search.


In [None]:
DATETIME = "?datetime=2023-09-09T00:00:00Z/2023-09-20T23:59:59Z&limit=3"

print(COLLECTION_ITEMS_URL+DATETIME)
r=requests.get(COLLECTION_ITEMS_URL+DATETIME, headers=auth_headers)  

display_as_json(r)            

#### Sorting items

It is possible to sort the list of items available in a given `Collection` using the `sortby` parameter.
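
The STAC API sort extension writes the sort direction differently for **GET** and **POST** requests: a GET `sortby` value is a comma-separated list of field names with an optional `+` or `-` prefix, while a POST body takes a list of objects with `field` and `direction` keys. The sketch below converts the first form into the second. The helper name `sortby_to_post` is hypothetical, and whether HDA accepts the prefixed form for every collection is an assumption to check against the API documentation.

```python
def sortby_to_post(sortby: str) -> list:
    """Convert a GET-style sortby string into the POST-style list of sort objects."""
    rules = []
    for token in sortby.split(","):
        # A "+" prefix arrives as a space once the URL is decoded, so strip whitespace
        token = token.strip()
        if not token:
            continue
        direction = "desc" if token.startswith("-") else "asc"
        rules.append({"field": token.lstrip("+-"), "direction": direction})
    return rules


print(sortby_to_post("-datetime"))
```

A field without a prefix is treated as ascending, which matches the plain `sortby=datetime` form.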

In [45]:
SORTBYDATETIME = "&sortby=datetime"  # DATETIME already carries limit=3

print(COLLECTION_ITEMS_URL+DATETIME+SORTBYDATETIME)
r=requests.get(COLLECTION_ITEMS_URL+DATETIME+SORTBYDATETIME, headers=auth_headers)

display_as_json(r)

https://hda.data.destination-earth.eu/stac/collections/EO.ESA.DAT.SENTINEL-2.MSI.L1C/items?datetime=2023-09-09T00:00:00Z/2023-09-20T23:59:59Z&limit=3&sortby=datetime



### Item ID
To get the metadata specific to a given item (identified by its `itemID` in a collection), the user can request the `/stac/collections/{collectionID}/items/{itemID}` endpoint.

In [52]:
print(COLLECTION_ITEM_BY_ID_URL)
r=requests.get(COLLECTION_ITEM_BY_ID_URL, headers=auth_headers) 

display_as_json(r)            
    

https://hda.data.destination-earth.eu/stac/collections/EO.ESA.DAT.SENTINEL-2.MSI.L1C/items/S2A_MSIL1C_20230910T050701_N0509_R019_T47VLH_20230910T074321.SAFE



### Item Download
The metadata of a given item also contains the download link, which the user can use to download that item.
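
As a small sketch of how the download link can be looked up defensively, the helper below reads `assets.downloadLink.href` from an item dictionary and returns `None` when the asset is absent instead of raising a `KeyError`. The function name `get_download_href` and the sample item are hypothetical; only the `assets`, `downloadLink`, `href` path is taken from this tutorial.

```python
from typing import Optional


def get_download_href(item: dict, asset_key: str = "downloadLink") -> Optional[str]:
    """Return the href of the named asset, or None if the item lacks it."""
    asset = item.get("assets", {}).get(asset_key, {})
    return asset.get("href")


# Hypothetical item, trimmed to the fields the helper reads
sample_item = {"assets": {"downloadLink": {"href": "https://example.org/item/download"}}}
print(get_download_href(sample_item))
print(get_download_href({"assets": {}}))
```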

In [53]:
item = r.json()
download_url = item["assets"]["downloadLink"]["href"]
print(download_url)

# Stream the response so the archive is not loaded into memory at once
response = requests.get(download_url, stream=True, headers=auth_headers)

# If the request was successful, write the file to disk chunk by chunk
if response.status_code == HTTP_SUCCESS_CODE:
    print(f"Downloading {ITEM_ID}...")
    filename = f"{ITEM_ID}.zip"
    with open(filename, "wb") as f:
        for chunk in response.iter_content(chunk_size=1024 * 1024):
            if chunk:  # skip empty keep-alive chunks
                f.write(chunk)
    print(f"The dataset has been downloaded to: {filename}")
else:
    print(f"Request Unsuccessful! Error-Code: {response.status_code}")

https://hda.data.destination-earth.eu/stac/collections/EO.ESA.DAT.SENTINEL-2.MSI.L1C/items/S2A_MSIL1C_20230910T050701_N0509_R019_T47VLH_20230910T074321/download?provider=dedl
Downloading S2A_MSIL1C_20230910T050701_N0509_R019_T47VLH_20230910T074321.SAFE...
The dataset has been downloaded to: S2A_MSIL1C_20230910T050701_N0509_R019_T47VLH_20230910T074321.SAFE.zip


### Item search
The STAC API also provides an item search endpoint, `/stac/search`, which is intended as a shorthand API for simple queries.
This endpoint allows users to efficiently search for items that match the specified input filters.

By default, the `/stac/search` endpoint will return the first 100 items found in all the collections available at the `/stac/collections` endpoint.
Filters can be added either via query parameters in a **GET** request or added to the JSON body of a **POST** request.

The full details of each available filter are given in the [API documentation](https://hda.data.destination-earth.eu/docs/#/STAC%20API%20-%20Item%20Search/getItemSearch).

The query parameters are added at the end of the URL as a *query string*: `?param1=val1&param2=val2&param3=val3`
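
Concatenating query strings by hand becomes error-prone once values contain reserved characters. The following is a minimal sketch that assembles the query string from a dictionary with the standard library's `urllib.parse.urlencode`. The parameter values mirror the ones used in this tutorial, and `build_search_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://hda.data.destination-earth.eu/stac/search"


def build_search_url(base_url: str, params: dict) -> str:
    """Append URL-encoded query parameters to a base URL."""
    # safe=":/," keeps datetime intervals and comma-separated lists readable
    return f"{base_url}?{urlencode(params, safe=':/,')}"


params = {
    "collections": "EO.ESA.DAT.SENTINEL-2.MSI.L1C",
    "datetime": "2023-09-09T00:00:00Z/2023-09-20T00:00:00Z",
    "limit": 3,
}
print(build_search_url(SEARCH_URL, params))
```

With `requests`, the same dictionary can instead be passed as `params=` to `requests.get`, which performs the encoding itself.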

In [None]:
SEARCH_QUERY_STRING = "?collections="+COLLECTION_ID+"&datetime=2023-09-09T00:00:00Z/2023-09-20T00:00:00Z&limit=3"
print(SEARCH_URL + SEARCH_QUERY_STRING)
r=requests.get(SEARCH_URL + SEARCH_QUERY_STRING, headers=auth_headers)

display_as_json(r)    

The same filters can be added as the JSON body of a **POST** request.

In [54]:
BODY = {
    "collections": [COLLECTION_ID],
    "datetime": "2023-09-09T00:00:00Z/2023-09-20T23:59:59Z",
    "bbox": [-11, 35, 50, 72],  # [west, south, east, north]
    "sortby": [{"field": "datetime", "direction": "desc"}],
    "limit": 3,
}

r=requests.post(SEARCH_URL, json=BODY, headers=auth_headers)

display_as_json(r)    


#### Visualize search results

Search results can be visualized on a map.

In [55]:
map1 = folium.Map([62, -5],
                  zoom_start=4.5)

bbox=[-11,35,50,72]
fig = Figure(width="900px", height="500px")
fig.add_child(map1)

folium.GeoJson(
    shapely.geometry.box(*bbox),style_function=lambda feature: {
        "fillColor": "#ffffaa",
        "color": "black",
        "weight": 2,
        "dashArray": "5, 5",
    }
).add_to(map1)

results = folium.features.GeoJson( r.json(),style_function=lambda feature: {
        "fillColor": "#ff0000",
        "color": "black",
        "weight": 1
    })

map1.add_child(results)

display(fig)


# Conclusion
This short tutorial presented the HDA API through Python snippets that send HTTP requests to its different endpoints and use a few of its filtering capabilities.

Again, more detail on each endpoint can be found in the [API documentation](https://hda.data.destination-earth.eu/docs/).

# FAQ

### My request is failing due to unsupported provider
This is expected when the provider named in the request is not a valid host provider.

Valid host providers can be found in the collection metadata: they are the providers that list **host** among their roles.

For example:

```json
"roles": [
    "processing",
    "host"
]
```
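
A sketch of how host providers could be picked out of collection metadata, assuming the collection follows the STAC convention of a `providers` list whose entries carry `name` and `roles`. The helper name `host_providers` and the sample metadata, including the provider names, are hypothetical.

```python
def host_providers(collection: dict) -> list:
    """Return the names of providers that list "host" among their roles."""
    return [
        provider.get("name")
        for provider in collection.get("providers", [])
        if "host" in provider.get("roles", [])
    ]


# Hypothetical collection metadata, trimmed to the providers section
sample_collection = {
    "providers": [
        {"name": "provider-a", "roles": ["producer"]},
        {"name": "provider-b", "roles": ["processing", "host"]},
    ]
}
print(host_providers(sample_collection))
```

In the notebook, the dictionary returned by `requests.get(COLLECTION_BY_ID_URL).json()` could be passed in place of the sample.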

### My `/search` or `/items` request returns 0 items

Several reasons can lead to no items being returned in the response:

#### You are not authenticated

You did not provide an access token in the request. An access token is required to search and access items in datasets.

#### No parameters set in the request, or parameters too broad

The request parameters are not precise enough for the HDA API to give you a response. Narrow down your request by giving more specific parameters.

For example, you can use a short datetime range such as **datetime=2023-09-09T00:00:00Z/2023-09-21T00:00:00Z**

#### There is no item matching the request parameters

Modify the parameters to find items.


### Expected products within the specified datetime range from my `/search` or `/items` requests are missing from the results

At present, the datetimes shown in search results are rounded up to the nearest millisecond. Take this into account in your datetime queries.

For example, suppose you want to access an item whose metadata shows the datetime `2023-01-01T12:22:33.555Z`. The item's actual datetime may be `2023-01-01T12:22:33.55487655Z`, so your search filter should be `2023-01-01T12:22:33.554Z/2023-01-01T12:22:33.556Z`.
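
The padding described above can be sketched as a helper that widens a displayed timestamp by one millisecond on each side. `padded_interval` is a hypothetical name, and the `Z` suffix assumes the timestamp is expressed in UTC.

```python
from datetime import datetime, timedelta, timezone


def padded_interval(ts: datetime, pad_ms: int = 1) -> str:
    """Build a 'start/end' datetime filter padded by pad_ms milliseconds on each side."""
    pad = timedelta(milliseconds=pad_ms)

    def fmt(value: datetime) -> str:
        # Convert to UTC, then trim microseconds down to millisecond precision
        return value.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"

    return f"{fmt(ts - pad)}/{fmt(ts + pad)}"


# Datetime as shown in the item metadata
shown = datetime(2023, 1, 1, 12, 22, 33, 555000, tzinfo=timezone.utc)
print(padded_interval(shown))
```

The resulting string can be used as the value of the `datetime` parameter in a `/search` or `/items` request.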

