Automatic parallelization of array-oriented programs for a multi-core machine
We present work on the automatic parallelization of array-oriented programs for multi-core machines. Source programs written in standard APL are translated by a parallelizing APL-to-C compiler into parallelized C code, i.e. C mixed with OpenMP directives. We describe techniques, such as virtual operations and data partitioning, used to effectively exploit parallelism structured around array primitives. We present runtime performance data for several examples, showing the speedup of the resulting parallelized code with different numbers of threads and different problem sizes on a 4-core machine.
HDArray: Parallel Array Interface for Distributed Heterogeneous Devices
Heterogeneous clusters with nodes containing one or more accelerators, such as GPUs, have become common. MPI provides mechanisms for managing communication between address spaces, and OpenCL provides a way to manage computation and communication within a process that has access to heterogeneous computational resources, but programmers are forced to write hybrid programs that manage the interaction of both systems. This paper describes an array programming interface that provides users with automatic or manual distributions of data and work. Using the distribution and information about what data is used and defined by kernels, communication among processes and among devices in a process is performed automatically. The interface provides a unified programming model to the user, thus simplifying program development.
Maximizing Communication Overlap with Dynamic Program Analysis
We present a dynamic program analysis approach to optimize communication overlap in scientific applications. Our tool instruments the code to generate a trace of the application's memory and synchronization behavior. An offline analysis determines the program's optimal points for maximal overlap when considering several programming constructs: non-blocking one-sided communication operations, non-blocking collectives, and bespoke synchronization patterns and operations. Feedback about possible transformations is presented to the user, and the tool can perform the directed transformations, which are supported by a lightweight runtime. The value of our approach comes from: 1) the ability to optimize across boundaries of software modules or libraries, while specializing for the intrinsics of the underlying communication runtime; and 2) providing upper bounds on the expected performance improvements after communication optimizations. We have reduced the time spent in communication by as much as 64% for several applications that were already aggressively optimized for overlap; this indicates that manual optimizations leave untapped performance. Although demonstrated mainly for the UPC programming language, the methodology can be easily adapted to any other communication and synchronization API.
Development of biomedical devices for the extracorporeal real-time monitoring and perfusion of transplant organs
The goal of this Thesis is to develop a range of technologies that could enable a paradigm shift in organ preservation for renal transplantation, transitioning from static cold storage to warm normothermic blood perfusion. This transition could enable the development of novel pre-implantation therapies, and even serve as the foundation for a global donor pool.
A low-hæmolysis pump was developed, based on a design first proposed by Nikola Tesla in 1913. Simulations demonstrated the theoretical superiority of this design over existing centrifugal pumps for blood recirculation, and provided insights for future avenues of research into this technology.
A miniature, battery-powered, multimodal sensor suite for the in-line monitoring of a blood perfusion circuit was designed and implemented. This was named the ‘SmartPipe’, and proved capable of simultaneously monitoring temperature, pressure and blood oxygen saturations over the biologically-relevant ranges of each modality.
Finally, the Thesis details the successful implementation and optimisation of a combined microfluidic and microdialysis system for the real-time quantitation of creatinine in blood or urine through amperometric sensing, to act as a live renal function monitor. The range of detection was 4.3 μM – 500 μM, with the possibility of extending this in both directions. This work also details and explores a novel methodology for functional monitoring in closed-loop systems which avoids the need for sensor calibration, and potentially overcomes the problems of sensor drift and desensitisation.