Stack Service

A Stack Service is a hosted application in the cloud.

It is a pre-deployed environment that lets you start working immediately, without installing or maintaining the underlying software stack yourself. On the Destination Earth Data Lake, the available Stack Services are JupyterHub and Dask.

Jupyter notebooks are mainly used for interactive work: exploring data, testing ideas, prototyping workflows, and producing results for reporting.

Dask is for distributed computing when a task is too large or too slow to run in a single notebook kernel.

How Stack Services fit together

In most workflows, you use the services in this order:

  1. Start in JupyterHub to explore data, test code, and validate the logic on a small sample.

  2. If the same workflow becomes slow or too large, scale out by connecting the notebook to a Dask cluster.

  3. Run the heavy processing in parallel, then collect and save results back to your project storage.
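The scale-out step above can be sketched with Dask's delayed API. This is a minimal local sketch: the per-tile function is a placeholder, and it runs on the threaded scheduler so it works anywhere; on the Data Lake you would instead attach a dask.distributed Client to your provisioned Dask cluster before calling compute.

```python
import dask

# Hypothetical per-tile processing step, first validated on a small sample.
@dask.delayed
def process_tile(tile_id):
    # Placeholder for real I/O and analysis on one tile.
    return tile_id * tile_id

# Build the task graph lazily: nothing runs yet.
results = [process_tile(i) for i in range(8)]
total = dask.delayed(sum)(results)

# Run locally with threads while prototyping; with a Dask cluster attached
# via a distributed Client, the same .compute() call uses remote workers.
print(total.compute(scheduler="threads"))
```

The key point is that the notebook code does not change when you scale out: the same task graph is simply executed by a cluster instead of the local machine.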

What is Dask and when should you use it?

Dask is a parallel computing framework for Python that helps you run the same type of analysis you would do in a notebook, but faster and on larger datasets. Instead of processing everything in a single Python process, Dask can distribute the work across multiple workers (CPU cores or multiple nodes).
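For example, a chunked array computation splits one logical array into pieces that can be processed in parallel (a sketch; the array size and chunk shape here are arbitrary):

```python
import dask.array as da

# One logical 4000x4000 array, stored as 16 independent 1000x1000 chunks.
x = da.ones((4000, 4000), chunks=(1000, 1000))

# Operations build a task graph over the chunks; each chunk's partial
# result can be computed by a different core or worker.
mean = x.mean()

print(mean.compute())  # computed chunk by chunk, then combined
```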

Use Dask when:

  • Your data does not fit comfortably in memory on a single machine.

  • A notebook computation takes too long and you want to speed it up by parallelizing it.

  • You need to run the same workflow on many files, time steps, or spatial tiles.

  • You want a scalable workflow that starts small (interactive) and can grow to a cluster when needed.

Typical examples where Dask helps include batch processing over many scenes, time ranges, or tiles, where the same computation runs repeatedly over a large set of inputs.
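A batch pattern like this can be sketched with dask.bag. The input list and the per-item function below are placeholders for real scene or tile identifiers and real processing code:

```python
import dask.bag as db

# Stand-ins for the identifiers of many scenes, time steps, or tiles.
inputs = [f"scene_{i:03d}" for i in range(12)]

def process_scene(name):
    # Placeholder for reading one input and computing a statistic from it.
    return len(name)

# Partition the inputs, then map the same function over every item.
bag = db.from_sequence(inputs, npartitions=4)
totals = bag.map(process_scene).sum()

print(totals.compute(scheduler="threads"))
```

Each partition is an independent unit of work, so adding workers shortens the batch roughly in proportion to the cluster size.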

Example notebooks

Example Jupyter notebooks for Stack Services on DestinE Data Lake are available here:

Stack Services examples for Destination Earth on GitHub

Prerequisites

No. 1 Access to My DataLake Services

You must have access to My DataLake Services (for example: you can log in, have a profile, and can create or access a project).

List of articles about My DataLake Services

No. 2 Access to Stack Services

To use Stack Services, your account must have the required roles for the specific service you want to use (for example: stack-dask or stack-jupyter). If you cannot access a service, request the appropriate roles first.

How to request roles for Stack Dask on My DataLake Services

No. 3 Use JupyterHub to run the examples

If you want to go through the example notebooks and learn the workflows step by step, JupyterHub is the main entry point.

Run a notebook on JupyterHub

How to request JupyterHub roles on My DataLake Services

What’s next

Use the guides below to start with notebooks, connect via the LUMI bridge, create a Dask cluster, and access services.