Automatic Parallelization of Scientific Application

Research output: Book/Report › Ph.D. thesis › Research

Standard

Automatic Parallelization of Scientific Application. / Blum, Troels.

The Niels Bohr Institute, Faculty of Science, University of Copenhagen, 2015. 130 p.


Harvard

Blum, T 2015, Automatic Parallelization of Scientific Application. The Niels Bohr Institute, Faculty of Science, University of Copenhagen. <https://soeg.kb.dk/permalink/45KBDK_KGL/fbp0ps/alma99122859001105763>

APA

Blum, T. (2015). Automatic Parallelization of Scientific Application. The Niels Bohr Institute, Faculty of Science, University of Copenhagen. https://soeg.kb.dk/permalink/45KBDK_KGL/fbp0ps/alma99122859001105763

Vancouver

Blum T. Automatic Parallelization of Scientific Application. The Niels Bohr Institute, Faculty of Science, University of Copenhagen, 2015. 130 p.

Author

Blum, Troels. / Automatic Parallelization of Scientific Application. The Niels Bohr Institute, Faculty of Science, University of Copenhagen, 2015. 130 p.

Bibtex

@phdthesis{fb75d63a82914157bb736add433a178e,
title = "Automatic Parallelization of Scientific Application",
abstract = "In my PhD work I show that it is possible to run unmodified Python/NumPy code on modern GPUs. This is done by using the Bohrium runtime system to translate the NumPy array operations into an array-based bytecode sequence. Executing these bytecodes on two GPUs from different vendors shows great performance gains. Scientists working with computer simulations should be allowed to focus on their field of research and not spend excessive amounts of time learning exotic programming models and languages. With Bohrium we have achieved very promising results by starting out with a relatively simple approach. This has led to more specialized methods, as I have shown with the work done on both specialized and parameterized kernels. Both have their benefits and recognizable use cases. We achieved clear performance benefits without any significant negative impact on overall application performance. Even in the cases where we were not able to gain any performance boost by specialization, the added cost of kernel generation and extra bookkeeping is minimal. Many of the lessons learned while developing and optimizing the Bohrium GPU vector engine have proven valuable in a broader perspective, which has made it possible to generalize the developments so that they benefit the complete Bohrium project.",
author = "Troels Blum",
year = "2015",
language = "English",
publisher = "The Niels Bohr Institute, Faculty of Science, University of Copenhagen",

}

RIS

TY - BOOK

T1 - Automatic Parallelization of Scientific Application

AU - Blum, Troels

PY - 2015

Y1 - 2015

N2 - In my PhD work I show that it is possible to run unmodified Python/NumPy code on modern GPUs. This is done by using the Bohrium runtime system to translate the NumPy array operations into an array-based bytecode sequence. Executing these bytecodes on two GPUs from different vendors shows great performance gains. Scientists working with computer simulations should be allowed to focus on their field of research and not spend excessive amounts of time learning exotic programming models and languages. With Bohrium we have achieved very promising results by starting out with a relatively simple approach. This has led to more specialized methods, as I have shown with the work done on both specialized and parameterized kernels. Both have their benefits and recognizable use cases. We achieved clear performance benefits without any significant negative impact on overall application performance. Even in the cases where we were not able to gain any performance boost by specialization, the added cost of kernel generation and extra bookkeeping is minimal. Many of the lessons learned while developing and optimizing the Bohrium GPU vector engine have proven valuable in a broader perspective, which has made it possible to generalize the developments so that they benefit the complete Bohrium project.

AB - In my PhD work I show that it is possible to run unmodified Python/NumPy code on modern GPUs. This is done by using the Bohrium runtime system to translate the NumPy array operations into an array-based bytecode sequence. Executing these bytecodes on two GPUs from different vendors shows great performance gains. Scientists working with computer simulations should be allowed to focus on their field of research and not spend excessive amounts of time learning exotic programming models and languages. With Bohrium we have achieved very promising results by starting out with a relatively simple approach. This has led to more specialized methods, as I have shown with the work done on both specialized and parameterized kernels. Both have their benefits and recognizable use cases. We achieved clear performance benefits without any significant negative impact on overall application performance. Even in the cases where we were not able to gain any performance boost by specialization, the added cost of kernel generation and extra bookkeeping is minimal. Many of the lessons learned while developing and optimizing the Bohrium GPU vector engine have proven valuable in a broader perspective, which has made it possible to generalize the developments so that they benefit the complete Bohrium project.

UR - https://soeg.kb.dk/permalink/45KBDK_KGL/fbp0ps/alma99122859001105763

M3 - Ph.D. thesis

BT - Automatic Parallelization of Scientific Application

PB - The Niels Bohr Institute, Faculty of Science, University of Copenhagen

ER -

ID: 153607512