Using s3cmd to obtain EODATA on DEDL

EODATA Storage endpoint

S3 is an object storage service from which you can retrieve data over HTTPS using a REST API. The default S3 endpoint address for working with EO data on DEDL is:

https://eodata.data.destination-earth.eu

Prerequisites

No. 1 My DataLake Services fully operational

You need access to a fully operational My DataLake Services. See How to create profile on My DataLake Services.

No. 2 Credentials for S3 EODATA access from My DataLake Services

See article How to obtain eodata S3 keys through My DataLake Services

No. 3 s3cmd installed

s3cmd must be installed on your virtual machine or computer. Learn more here:

How to install s3cmd on Linux

No. 4 s3cfg installed

Configuration files for s3cmd command

Code in this article has been tested on Ubuntu 22.04; with slight modifications, it should work on other Linux flavors.

Preparation step 1 Get S3 EODATA credentials

S3 credentials for accessing EODATA from DEDL are created through My DataLake Services. If you have already created them that way, you only need to fetch them and use them in the next step.

If you do not yet have an S3 key pair for EODATA, follow Prerequisite No. 2 to obtain one. Either way, you end up with two values:

  • <access_key>

  • <secret_key>

In the next step, these credentials will become entries in an S3 config file.

Preparation step 2 Create the appropriate s3cfg file

Prerequisite No. 4 explains how to create an S3 configuration file interactively and where to put that file once it is created. Here we create the config file with a text editor and enter only the minimum data needed for EODATA access to work.

The text editor is usually vi, vim or nano, and you can create the s3cfg file with one of these commands:

  • vi s3cfg

  • vim s3cfg

  • nano s3cfg

We are using the simplest location for the s3cfg file, that is, the current directory.

Copy the following content into the configuration file, replacing <access_key> and <secret_key> with your own keys:

[default]
access_key = <access_key>
host_base = eodata.data.destination-earth.eu
host_bucket = eodata.data.destination-earth.eu
human_readable_sizes = False
secret_key = <secret_key>
use_https = true
check_ssl_certificate = true
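If you prefer not to type the keys into an editor, the same file can be generated from a script. The following is a minimal sketch, not an official procedure: the function name write_s3cfg and the environment variable names EODATA_ACCESS_KEY and EODATA_SECRET_KEY are assumptions made up for this example. The file content is the same as shown above.

```shell
# Sketch: write an s3cfg file from two environment variables.
# EODATA_ACCESS_KEY and EODATA_SECRET_KEY are hypothetical names chosen
# for this example; they are not defined by s3cmd or by DEDL.
write_s3cfg() {
    # $1: path of the config file to create (defaults to ./s3cfg)
    s3cfg_target=${1:-s3cfg}
    if [ -z "${EODATA_ACCESS_KEY:-}" ] || [ -z "${EODATA_SECRET_KEY:-}" ]; then
        echo "Set EODATA_ACCESS_KEY and EODATA_SECRET_KEY first" >&2
        return 1
    fi
    cat > "$s3cfg_target" <<EOF
[default]
access_key = ${EODATA_ACCESS_KEY}
host_base = eodata.data.destination-earth.eu
host_bucket = eodata.data.destination-earth.eu
human_readable_sizes = False
secret_key = ${EODATA_SECRET_KEY}
use_https = true
check_ssl_certificate = true
EOF
    # The file holds a secret, so restrict it to the owner.
    chmod 600 "$s3cfg_target"
}
```

After exporting the two variables with the values from Preparation step 1, calling write_s3cfg without arguments creates s3cfg in the current directory, readable by the owner only.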

s3cmd in operation

s3cmd is now functional and you can use it to browse the available folders with satellite data, download one particular image, download archives and so on.

Use s3cmd to list files in the EODATA repository

Run the s3cmd command, pointing it to the previously created configuration file with the -c option:

$ s3cmd -c s3cfg ls s3://eodata/

../../../../_images/s3cmd-download-45.png
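The same ls command works on deeper prefixes. Both product paths used in this article follow the pattern Sentinel-1/SAR/SLC/<year>/<month>/<day>/, so a day-level prefix can be built from a date. The sketch below assumes that this layout holds for other dates as well, which the article does not state; the function name slc_prefix is hypothetical.

```shell
# Sketch: build the day-level prefix for Sentinel-1 SLC products.
# Assumes the year/month/day layout seen in this article's example paths.
slc_prefix() {
    # $1: date in the form YYYY-MM-DD
    slc_year=${1%%-*}          # text before the first dash
    slc_rest=${1#*-}           # text after the first dash (MM-DD)
    slc_month=${slc_rest%%-*}
    slc_day=${slc_rest#*-}
    printf 's3://eodata/Sentinel-1/SAR/SLC/%s/%s/%s/\n' \
        "$slc_year" "$slc_month" "$slc_day"
}

# Usage (needs the s3cfg file and network access):
#   s3cmd -c s3cfg ls "$(slc_prefix 2016-12-28)"
```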

Downloading one file with s3cmd

Below is an example of downloading a product from the EO data repository using s3cmd:

$ s3cmd -c s3cfg get s3://eodata/Sentinel-1/SAR/SLC/2016/12/28/S1A_IW_SLC__1SDV_20161228T044442_20161228T044509_014575_017AE8_4C26.SAFE/measurement/s1a-iw2-slc-vv-20161228t044442-20161228t044508-014575-017ae8-005.tiff

Here is what it looks like in a terminal window:

../../../../_images/s3cmd-download-44.png
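By default, get saves the object in the current directory under its original file name. s3cmd also accepts a local file path as a second argument to get, and its --continue option resumes a partially downloaded file. The wrapper below is a sketch built on those two features; the function name fetch_object is hypothetical, and it assumes s3cfg is in the current directory as in the rest of this article.

```shell
# Sketch: download one object into a chosen directory, resuming if interrupted.
fetch_object() {
    # $1: full s3:// URI of one object
    # $2: local directory to store it in (created if missing, defaults to .)
    fetch_dest=${2:-.}
    mkdir -p "$fetch_dest" || return 1
    # ${1##*/} strips everything up to the last slash, leaving the file name.
    s3cmd -c s3cfg --continue get "$1" "$fetch_dest/${1##*/}"
}

# Usage (needs the s3cfg file and network access):
#   fetch_object s3://eodata/Sentinel-1/SAR/SLC/2016/12/28/S1A_IW_SLC__1SDV_20161228T044442_20161228T044509_014575_017AE8_4C26.SAFE/measurement/s1a-iw2-slc-vv-20161228t044442-20161228t044508-014575-017ae8-005.tiff downloads
```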

Downloading archives with the recursive option

If the object in the repository is an archive, such as S1B_IW_SLC__1SDV_20191013T155948_20191013T160015_018459_022C6B_13A2.SAFE, use the recursive option, --recursive or -r, to download the whole product.

Example of a recursive download with the -r option:

$ s3cmd -c s3cfg -r get s3://eodata/Sentinel-1/SAR/SLC/2019/10/13/S1B_IW_SLC__1SDV_20191013T155948_20191013T160015_018459_022C6B_13A2.SAFE/

The above request can be broken down as follows:

Single Request to Initiate

The initial command with -r is a single request to initiate the process of downloading files recursively.

Multiple Requests for Each File

Although the initial command is a single request that starts the recursive download, S3 treats each file download as a separate request. Therefore, if the downloaded product has 100 files in the package, S3 will count this as 100 requests (one for each file downloaded).

So, while s3cmd is executed as a single command, the --recursive option causes each downloaded file to be counted as an individual request against your quota.

This is what it will look like in a terminal window, in the middle of the downloading process:

../../../../_images/s3cmd-download-2.png

You can see that this was request No. 25 and that there will be a total of 45 requests in this case.
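To estimate in advance how many file requests a recursive download will generate, you can count the objects in a recursive listing. The sketch below is an illustration only: the function name count_objects is hypothetical, and it assumes that s3cmd ls prints one line per object containing its s3:// URI and marks folder entries with the word DIR. The listing itself also talks to the service, so treat the number as an estimate of the download requests, not of total quota use.

```shell
# Sketch: count object lines in `s3cmd ls --recursive` output.
count_objects() {
    # Reads the listing on stdin.
    # Keeps lines that carry an s3:// URI, drops "DIR" entries, counts the rest.
    grep 's3://' | grep -vc ' DIR '
}

# Usage (needs the s3cfg file and network access):
#   s3cmd -c s3cfg ls --recursive s3://eodata/Sentinel-1/SAR/SLC/2019/10/13/S1B_IW_SLC__1SDV_20191013T155948_20191013T160015_018459_022C6B_13A2.SAFE/ | count_objects
```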

At the end, a new PDF file will appear:

../../../../_images/s3cmd-download-3.png

Here is what the beginning of that PDF file looks like:

../../../../_images/s3cmd-download-4.png
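Because the recursive download places the product's files relative to the directory you run it from, it can be convenient to give each product its own directory. The following sketch creates a directory named after the product and runs the same -r get command from inside it. The function name get_product is hypothetical, and the sketch assumes s3cfg is in the directory from which you call the function.

```shell
# Sketch: download a whole product into a directory named after it.
get_product() {
    # $1: s3:// URI of a product directory, with or without a trailing slash
    product_uri=${1%/}
    product_name=${product_uri##*/}
    # Absolute path, because the download runs from inside the new directory.
    product_cfg=$(pwd)/s3cfg
    mkdir -p "$product_name" || return 1
    # Subshell: the caller's working directory is left unchanged.
    ( cd "$product_name" && s3cmd -c "$product_cfg" -r get "$product_uri/" )
}

# Usage (needs the s3cfg file and network access):
#   get_product s3://eodata/Sentinel-1/SAR/SLC/2019/10/13/S1B_IW_SLC__1SDV_20191013T155948_20191013T160015_018459_022C6B_13A2.SAFE/
```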