A Scheduling Model for the Square Kilometre Array Science Data Processor

Monday, June 28, 2021 3:00 PM to 4:00 PM · 1 hr. (Africa/Abidjan)
Exascale Systems · HPC Workflows

Information

This PhD investigates the current design of the Square Kilometre Array (SKA) Science Data Processor (SDP). The data scale of the SKA is such that raw observation data cannot be kept on-premise for more than a few weeks. Data reduction and analysis pipelines, normally run by astronomers locally (either on a PC or at a local HPC centre), must instead be completed within this deadline to ensure the SKA's intermediary data storage - the SDP 'buffer' - does not fill. It is essential that observation data is scheduled and processed efficiently, to guarantee the delivery of High Priority Science Observations (HPSOs) by ensuring their post-processing pipelines are completed. HPSO post-processing pipelines are represented as Directed Acyclic Graphs (DAGs), commonly referred to as science workflows. There is a large body of literature on workflow scheduling, in both grid and cloud computing environments. However, the proposed system for the SKA uses a batch-processing model rather than applying one of the various workflow scheduling heuristics from the literature; an analytical argument has been made that this is unnecessary given the presence of an intermediary buffer.
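To make the batch-processing baseline concrete, the sketch below represents a post-processing pipeline as a task DAG and runs it in plain topological order on a single resource. The task names, costs, and library choice (Python with networkx) are illustrative assumptions on our part, not the actual SDP pipeline or scheduler.

```python
# A minimal sketch of an HPSO post-processing pipeline as a DAG.
# Task names and costs are illustrative, not the real SDP pipeline.
import networkx as nx

workflow = nx.DiGraph()

# Each task carries an estimated compute cost (arbitrary units).
for name, cost in [("ingest", 10), ("calibrate", 25),
                   ("grid", 40), ("image", 30), ("clean", 20)]:
    workflow.add_node(name, cost=cost)

# Edges encode data dependencies between pipeline stages.
workflow.add_edges_from([
    ("ingest", "calibrate"),
    ("calibrate", "grid"),
    ("grid", "image"),
    ("image", "clean"),
])
assert nx.is_directed_acyclic_graph(workflow)

# A naive batch schedule: run tasks in topological order on one
# resource, with no heuristic ranking or look-ahead.
makespan = 0
for task in nx.topological_sort(workflow):
    makespan += workflow.nodes[task]["cost"]
    print(f"t={makespan:4d}  finished {task}")
```

A list-scheduling heuristic such as HEFT would instead rank tasks and map them onto heterogeneous resources; the gap between that style of scheduler and the batch model is what the comparison below sets out to measure.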

This thesis intends to determine whether the SKA-SDP buffer-and-batch-processing model is sufficient for the data and compute demands of an observation schedule, especially when faced with system delays on the SDP compute infrastructure. Additionally, we are interested in determining how much more effective - if at all - existing heuristic scheduling techniques currently implemented in science workflow managers (for example, Pegasus) are at improving the quality of batch-processed schedules. Finally, given that SKA observations are so dependent on the processing of these workflows, there is an opportunity to apply workflow scheduling to a global task DAG constructed from all workflows in an observation schedule. This PhD plans to investigate a decision support model that incorporates this global DAG.
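One way to picture the global task DAG is sketched below: each observation contributes its own workflow DAG, the graphs are composed into a single global graph, and extra edges express that the telescope ingests one observation at a time. The observation names, per-observation pipeline, and ordering constraint are assumptions for illustration, not the thesis's actual decision support model.

```python
# Sketch: composing per-observation workflow DAGs into one global DAG.
# Observation names and the telescope-ordering edges are assumptions.
import networkx as nx

def make_workflow(obs_id: str) -> nx.DiGraph:
    """Build a small per-observation pipeline, namespaced by obs_id."""
    g = nx.DiGraph()
    g.add_edges_from([
        (f"{obs_id}/ingest", f"{obs_id}/calibrate"),
        (f"{obs_id}/calibrate", f"{obs_id}/image"),
    ])
    return g

observations = ["hpso01", "hpso02", "hpso04"]
global_dag = nx.DiGraph()

previous_ingest = None
for obs in observations:
    global_dag = nx.compose(global_dag, make_workflow(obs))
    # The telescope is a shared resource: the next observation's
    # ingest cannot start before the previous observation's ingest.
    if previous_ingest is not None:
        global_dag.add_edge(previous_ingest, f"{obs}/ingest")
    previous_ingest = f"{obs}/ingest"

assert nx.is_directed_acyclic_graph(global_dag)
print(list(nx.topological_sort(global_dag)))
```

Scheduling over this composed graph, rather than over each workflow in isolation, is what would let a decision algorithm trade off telescope time against buffer pressure across the whole observation schedule.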

To test these ideas, we need to model the SKA instrument, storage buffer, and computing facilities. To this end, we have developed TopSim - the Telescope Operations Simulator - a generalisable instrument-storage-compute discrete-event simulator that, in addition to simulating the data life-cycle and scheduling of the SKA, is able to simulate other global scheduling applications such as the Internet of Things (e.g. edge and fog computing) or remote sensing (e.g. geoscience or satellites).
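For a flavour of what such an instrument-storage-compute discrete-event model looks like, here is a generic sketch using the simpy library. The rates, capacity, and structure are invented for illustration; TopSim's actual model and API differ.

```python
# Generic instrument -> buffer -> compute sketch in simpy.
# All rates and capacities are invented; this is not TopSim's API.
import simpy

BUFFER_CAPACITY = 100.0   # how much data the buffer can hold
INGEST_RATE = 5.0         # data produced per timestep while observing
PROCESS_RATE = 4.0        # data drained per timestep by pipelines

def instrument(env, buffer):
    """Observe continuously, depositing data into the buffer."""
    while True:
        yield env.timeout(1)
        yield buffer.put(INGEST_RATE)   # blocks if the buffer is full

def compute(env, buffer):
    """Drain the buffer by running post-processing pipelines."""
    while True:
        yield env.timeout(1)
        yield buffer.get(PROCESS_RATE)  # blocks until data is ready

def monitor(env, buffer):
    """Report buffer occupancy so we can watch pressure build."""
    while True:
        yield env.timeout(5)
        print(f"t={env.now:3.0f}  buffer level={buffer.level:5.1f}")

env = simpy.Environment()
buffer = simpy.Container(env, capacity=BUFFER_CAPACITY, init=0.0)
env.process(instrument(env, buffer))
env.process(compute(env, buffer))
env.process(monitor(env, buffer))
env.run(until=50)
```

Because ingest here outpaces processing, the buffer level climbs steadily; in a long enough run the put() call eventually blocks, which is precisely the failure mode - a full buffer stalling the instrument - that the scheduling model must avoid.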

I will present an overview of the observation and scheduling model we have developed for TopSim, an example decision algorithm for the telescope that uses the global task-graph, and preliminary results from simulations comparing scheduling heuristics.