PhD defense by Rasmus Munk

Grid of Clouds: A model for how resources can be shared amongst organisations

When seeking new scientific insights, the acquisition, analysis, and storage of scientific evidence are critical aspects of any inquiry. In today’s environment, with the vast growth in data generation, computational platforms have become a key pillar in making such insights feasible. Additionally, because the underlying increase in compute power no longer comes from faster processors but from a growing number of cores within a particular compute platform, existing applications can no longer be sped up simply by waiting for the next processor. As a complement to this, additional compute capacity is also emerging in specialized hardware platforms that likewise increase the number of computational cores within a single device. Nevertheless, this additional capacity on its own does not make up for the projected amount of data that has to be processed in the future. Because of this, how existing resources are organized and employed, and how and where data is stored, become ever more important.

 

Historically, the organisation of computational resources has changed with the technological developments of the time. Both the Grid and Cloud notions seek to organize resources such that they can be used effectively by multiple users.

 

This thesis explores how modern computational infrastructures can be used and deployed to provide scientists with the tools they require to achieve new scientific insights at the different stages of inquiry, be they at the acquisition, analysis, or storage stage. Specifically, it presents how data can be ubiquitously stored and retrieved, either as part of data acquisition or within a particular analysis, with the introduced MiG Utils library. Furthermore, it presents a design solution for how high-throughput data generation from large scientific instruments could be optimized by applying in-situ computational kernels to the data stream via the High Throughput Storage System.

Link to zoom: https://ucph-ku.zoom.us/j/64661382480

Link to thesis: https://erda.ku.dk/archives/81fbf75023a296eaffb4d249cfc2f840/Grid_enabled_Clouds.pdf