237 research outputs found

    OpenMP for Networks of SMPs

    Get PDF
    In this paper, we present the first system that implement OpenMP on network of shared-memory multiprocessors. This system enables the program to rely on a single, standard, shared-memory API for parallelization within a multiprocessor and between multiprocessors. It is implemented via a translator that convert OpenMP directives to appropriate calls to a modified version of the TreadMarks software distributed shared memory(SDSM) system. In contrast to previous SDSM systems for SMPs, the modified TreadMarks system uses POSIX threads for parallelism within an SMP mode. This approach greatly simplifies the changes required to the SDSM in order to exploit the intra-mode hardware shared memory. We present performance results for seven applications (Barnes-Hut, CLU and Water from SPLASH-2, 3D-FFT from NAS, Red-Black SOR, TSP and MGS) running on an SP2 with four four-processor SMP nodes. A comparison between the thread implementation and the original implementation of TreadMarks shows that using the hardware shared memory within an SMP node significantly achieves speedups that are up to 30% better than the original versions. We also compare SDSM against message passing. Overall, the speedups of multithreaded RreadMarks programs are within 7-30% of the MPI versions

    Compiler Optimization Techniques for OpenMP Programs

    Get PDF

    The AXIOM software layers

    Get PDF
    AXIOM project aims at developing a heterogeneous computing board (SMP-FPGA).The Software Layers developed at the AXIOM project are explained.OmpSs provides an easy way to execute heterogeneous codes in multiple cores. People and objects will soon share the same digital network for information exchange in a world named as the age of the cyber-physical systems. The general expectation is that people and systems will interact in real-time. This poses pressure onto systems design to support increasing demands on computational power, while keeping a low power envelop. Additionally, modular scaling and easy programmability are also important to ensure these systems to become widespread. The whole set of expectations impose scientific and technological challenges that need to be properly addressed.The AXIOM project (Agile, eXtensible, fast I/O Module) will research new hardware/software architectures for cyber-physical systems to meet such expectations. The technical approach aims at solving fundamental problems to enable easy programmability of heterogeneous multi-core multi-board systems. AXIOM proposes the use of the task-based OmpSs programming model, leveraging low-level communication interfaces provided by the hardware. Modular scalability will be possible thanks to a fast interconnect embedded into each module. To this aim, an innovative ARM and FPGA-based board will be designed, with enhanced capabilities for interfacing with the physical world. Its effectiveness will be demonstrated with key scenarios such as Smart Video-Surveillance and Smart Living/Home (domotics).Peer ReviewedPostprint (author's final draft

    Scalable RDMA performance in PGAS languages

    Get PDF
    Partitioned global address space (PGAS) languages provide a unique programming model that can span shared-memory multiprocessor (SMP) architectures, distributed memory machines, or cluster ofSMPs. Users can program large scale machines with easy-to-use, shared memory paradigms. In order to exploit large scale machines efficiently, PGAS language implementations and their runtime system must be designed for scalability and performance. The IBM XLUPC compiler and runtime system provide a scalable design through the use of the shared variable directory (SVD). The SVD stores meta-information needed to access shared data. It is dereferenced, in the worst case, for every shared memory access, thus exposing a potential performance problem. In this paper we present a cache of remote addresses as an optimization that will reduce the SVD access overhead and allow the exploitation of native (remote) direct memory accesses. It results in a significant performance improvement while maintaining the run-time portability and scalability.Postprint (published version

    MAGDA: A Mobile Agent based Grid Architecture

    Get PDF
    Mobile agents mean both a technology and a programming paradigm. They allow for a flexible approach which can alleviate a number of issues present in distributed and Grid-based systems, by means of features such as migration, cloning, messaging and other provided mechanisms. In this paper we describe an architecture (MAGDA – Mobile Agent based Grid Architecture) we have designed and we are currently developing to support programming and execution of mobile agent based application upon Grid systems
    corecore