1 research outputs found

    Runtime Code Generation and Data Management for Heterogeneous Computing in Java

    Get PDF
    GPUs (Graphics Processing Unit) and other accelerators are nowadays commonly found in desktop machines, mobile devices and even data centres. While these highly parallel processors offer high raw performance, they also dramatically increase program complexity, requiring extra effort from programmers. This results in difficult-to-maintain and non-portable code due to the low-level nature of the languages used to program these devices. This paper presents a high-level parallel programming approach for the popular Java programming language. Our goal is to revitalise the old Java slogan – Write once, run anywhere — in the context of modern heterogeneous systems. To enable the use of parallel accelerators from Java we introduce a new API for heterogeneous programming based on array and functional programming. Applications written with our API can then be transparently accelerated on a device such as a GPU using our runtime OpenCL code generator. In order to ensure the highest level of performance, we present data management optimizations. Usually, data has to be translated (marshalled) between the Java representation and the representation accelerators use. This paper shows how marshal affects runtime and present a novel technique in Java to avoid this cost by implementing our own customised array data structure. Our design hides low level data management from the user making our approach applicable even for inexperienced Java programmers. We evaluated our technique using a set of applications from different domains, including mathematical finance and machine learning. We achieve speedups of up to 500x over sequential and multi-threaded Java code when using an external GPU
    corecore