
    Introducing Molly: Distributed Memory Parallelization with LLVM

    Programming for distributed memory machines has always been a tedious task, but a necessary one because compilers have not been sufficiently able to optimize for such machines themselves. Molly is an extension to the LLVM compiler toolchain that is able to distribute and reorganize workload and data if the program is organized in statically determined loop control flows. These are represented as polyhedral integer-point sets, which allow program transformations to be applied to them. Memory distribution and layout can be declared by the programmer as needed, and the necessary asynchronous MPI communication is generated automatically. The primary motivation is to run Lattice QCD simulations on IBM Blue Gene/Q supercomputers, but since the implementation is not yet complete, this paper shows the capabilities on Conway's Game of Life.
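    As a rough illustration of what a "statically determined loop control flow" looks like for the Game of Life demonstration mentioned above, the sketch below writes one generation as a plain loop nest whose bounds depend only on the grid size. It is plain Python, not Molly or LLVM code; the function name `life_step` and the dead-border convention are assumptions made for this example only.

```python
def life_step(grid):
    """Compute one Game of Life generation on a 2D list of 0/1 cells.

    Assumption for this sketch: cells outside the grid count as dead.
    """
    rows, cols = len(grid), len(grid[0])
    new = [[0] * cols for _ in range(rows)]
    # The loop bounds depend only on the grid dimensions, never on cell
    # values, so the iteration space is a fixed rectangle of integer points.
    for i in range(rows):
        for j in range(cols):
            neighbors = 0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    if (di, dj) == (0, 0):
                        continue  # skip the cell itself
                    ni, nj = i + di, j + dj
                    if 0 <= ni < rows and 0 <= nj < cols:
                        neighbors += grid[ni][nj]
            # Standard rules: birth on 3 neighbors, survival on 2 or 3.
            if neighbors == 3 or (grid[i][j] == 1 and neighbors == 2):
                new[i][j] = 1
    return new
```

    Each cell reads only its eight neighbors, so a grid split across nodes would need to exchange only the boundary rows and columns between generations; that exchange is the kind of communication the abstract says is generated automatically.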

    Locality-Aware Automatic Parallelization for GPGPU with OpenHMPP Directives

    This is a post-peer-review, pre-copyedit version of an article published in International Journal of Parallel Programming. The final authenticated version is available online at: https://doi.org/10.1007/s10766-015-0362-9
    The use of GPUs for general-purpose computation has increased dramatically in recent years due to the rising demand for computing power and their tremendous computing capacity at low cost. Hence, new programming models have been developed to integrate these accelerators with high-level programming languages, giving rise to heterogeneous computing systems. Unfortunately, this heterogeneity is also exposed to the programmer, complicating its exploitation. This paper presents a new technique to automatically rewrite sequential programs into a parallel counterpart targeting GPU-based heterogeneous systems. The original source code is analyzed through domain-independent computational kernels, which hide the complexity of the implementation details by presenting a non-statement-based, high-level, hierarchical representation of the application. Next, a locality-aware technique based on standard compiler transformations is applied to the original code through OpenHMPP directives. Two representative case studies from scientific applications have been selected: the three-dimensional discrete convolution and the single-precision general matrix multiplication. The effectiveness of our technique is corroborated by a performance evaluation on NVIDIA GPUs.
    Funding: Ministerio de Economía y Competitividad (TIN2010-16735, TIN2013-42148-P); Galicia, Consellería de Cultura, Educación e Ordenación Universitaria (GRC2013-055); Ministerio de Educación (AP2008-0101).
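    The abstract does not say which standard compiler transformations the technique applies. As one hedged example of a locality-oriented transformation on the matrix multiplication case study, the sketch below shows loop tiling in plain Python. It does not use OpenHMPP directives or GPU code, and the function name `matmul_tiled` and the `tile` parameter are hypothetical.

```python
def matmul_tiled(a, b, tile=2):
    """Multiply matrices a (n x k) and b (k x m) using loop tiling.

    Illustrative only: tiling is one standard locality transformation,
    assumed here for the example rather than taken from the paper.
    """
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    # Outer loops walk over tiles; inner loops stay inside one tile, so the
    # same small blocks of a, b, and c are reused while they are "hot".
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            for k0 in range(0, k, tile):
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        acc = c[i][j]
                        for kk in range(k0, min(k0 + tile, k)):
                            acc += a[i][kk] * b[kk][j]
                        c[i][j] = acc
    return c
```

    The result is the same as the untiled triple loop; only the order in which elements are visited changes, which is what makes the transformation safe to apply automatically.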

    CoBe -- Coded Beacons for Localization, Object Tracking, and SLAM Augmentation

    This paper presents a novel beacon light coding protocol, which enables fast and accurate identification of the beacons in an image. The protocol is provably robust to a predefined set of detection and decoding errors, and does not require any synchronization between the beacons themselves and the optical sensor. A detailed guide is then given for developing an optical tracking and localization system, which is based on the suggested protocol and readily available hardware. Such a system operates either as a standalone system for recovering the six degrees of freedom of fast-moving objects, or integrated with existing SLAM pipelines, providing them with error-free and easily identifiable landmarks. Based on this guide, we implemented a low-cost positional tracking system which can run in real time on an IoT board. We evaluate our system's accuracy and compare it to other popular methods which utilize the same optical hardware, in experiments where the ground truth is known. A companion video containing multiple real-world experiments demonstrates the accuracy, speed, and applicability of the proposed system in a wide range of environments and real-world tasks. Open source code is provided to encourage further development of low-cost localization systems integrating the suggested technology at its navigation core.

    Improved Distributed Estimation Method for Environmental Time-Variant Physical Variables in Static Sensor Networks

    In this paper, an improved distributed estimation scheme for static sensor networks is developed. The scheme is developed for environmental time-variant physical variables. The main contribution of this work is that the algorithm in [1]-[3] has been extended, and a filter has been designed with weights such that the variance of the estimation errors is minimized, thereby improving the filter design considerably, characterizing the performance limit of the filter, and tracking a time-varying signal. Moreover, certain parameter optimization is alleviated with the application of a particular finite impulse response (FIR) filter. Simulation results show the effectiveness of the developed estimation algorithm.
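    The abstract does not give the filter equations, so the following is only a textbook illustration of choosing weights that minimize the variance of an estimation error, not the paper's algorithm. It assumes independent, unbiased sensor readings with known noise variances, for which inverse-variance weights are optimal; the names `min_variance_weights` and `fuse` are hypothetical.

```python
def min_variance_weights(variances):
    """Return weights proportional to 1/variance, normalized to sum to 1.

    Assumption: readings are independent and unbiased, so these weights
    minimize the variance of the weighted average.
    """
    inverse = [1.0 / v for v in variances]
    total = sum(inverse)
    return [x / total for x in inverse]


def fuse(measurements, variances):
    """Combine sensor readings into one estimate and its error variance."""
    weights = min_variance_weights(variances)
    estimate = sum(w * z for w, z in zip(weights, measurements))
    # Variance of the fused estimate under the independence assumption.
    fused_variance = 1.0 / sum(1.0 / v for v in variances)
    return estimate, fused_variance
```

    The fused variance is never larger than the smallest individual variance, which is why weighting by reliability improves on a plain average when sensors differ in noise level.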