50,827 research outputs found

    dOpenCL: Towards a Uniform Programming Approach for Distributed Heterogeneous Multi-/Many-Core Systems

    Get PDF
    Modern computer systems are becoming increasingly heterogeneous by comprising multi-core CPUs, GPUs, and other accelerators. Current programming approaches for such systems usually require the application developer to use a combination of several programming models (e. g., MPI with OpenCL or CUDA) in order to exploit the full compute capability of a system. In this paper, we present dOpenCL (Distributed OpenCL) – a uniform approach to programming distributed heterogeneous systems with accelerators. dOpenCL extends the OpenCL standard, such that arbitrary computing devices installed on any node of a distributed system can be used together within a single application. dOpenCL allows moving data and program code to these devices in a transparent, portable manner. Since dOpenCL is designed as a fully-fledged implementation of the OpenCL API, it allows running existing OpenCL applications in a heterogeneous distributed environment without any modifications. We describe in detail the mechanisms that are required to implement OpenCL for distributed systems, including a device management mechanism for running multiple applications concurrently. Using three application studies, we compare the performance of dOpenCL with MPI+OpenCL and a standard OpenCL implementation

    RELEASE: A High-level Paradigm for Reliable Large-scale Server Software

    Get PDF
    Erlang is a functional language with a much-emulated model for building reliable distributed systems. This paper outlines the RELEASE project, and describes the progress in the rst six months. The project aim is to scale the Erlang's radical concurrency-oriented programming paradigm to build reliable general-purpose software, such as server-based systems, on massively parallel machines. Currently Erlang has inherently scalable computation and reliability models, but in practice scalability is constrained by aspects of the language and virtual machine. We are working at three levels to address these challenges: evolving the Erlang virtual machine so that it can work effectively on large scale multicore systems; evolving the language to Scalable Distributed (SD) Erlang; developing a scalable Erlang infrastructure to integrate multiple, heterogeneous clusters. We are also developing state of the art tools that allow programmers to understand the behaviour of massively parallel SD Erlang programs. We will demonstrate the e ectiveness of the RELEASE approach using demonstrators and two large case studies on a Blue Gene

    A Tale of Two Data-Intensive Paradigms: Applications, Abstractions, and Architectures

    Full text link
    Scientific problems that depend on processing large amounts of data require overcoming challenges in multiple areas: managing large-scale data distribution, co-placement and scheduling of data with compute resources, and storing and transferring large volumes of data. We analyze the ecosystems of the two prominent paradigms for data-intensive applications, hereafter referred to as the high-performance computing and the Apache-Hadoop paradigm. We propose a basis, common terminology and functional factors upon which to analyze the two approaches of both paradigms. We discuss the concept of "Big Data Ogres" and their facets as means of understanding and characterizing the most common application workloads found across the two paradigms. We then discuss the salient features of the two paradigms, and compare and contrast the two approaches. Specifically, we examine common implementation/approaches of these paradigms, shed light upon the reasons for their current "architecture" and discuss some typical workloads that utilize them. In spite of the significant software distinctions, we believe there is architectural similarity. We discuss the potential integration of different implementations, across the different levels and components. Our comparison progresses from a fully qualitative examination of the two paradigms, to a semi-quantitative methodology. We use a simple and broadly used Ogre (K-means clustering), characterize its performance on a range of representative platforms, covering several implementations from both paradigms. Our experiments provide an insight into the relative strengths of the two paradigms. We propose that the set of Ogres will serve as a benchmark to evaluate the two paradigms along different dimensions.Comment: 8 pages, 2 figure

    D-SPACE4Cloud: A Design Tool for Big Data Applications

    Get PDF
    The last years have seen a steep rise in data generation worldwide, with the development and widespread adoption of several software projects targeting the Big Data paradigm. Many companies currently engage in Big Data analytics as part of their core business activities, nonetheless there are no tools and techniques to support the design of the underlying hardware configuration backing such systems. In particular, the focus in this report is set on Cloud deployed clusters, which represent a cost-effective alternative to on premises installations. We propose a novel tool implementing a battery of optimization and prediction techniques integrated so as to efficiently assess several alternative resource configurations, in order to determine the minimum cost cluster deployment satisfying QoS constraints. Further, the experimental campaign conducted on real systems shows the validity and relevance of the proposed method

    GHOST: Building blocks for high performance sparse linear algebra on heterogeneous systems

    Get PDF
    While many of the architectural details of future exascale-class high performance computer systems are still a matter of intense research, there appears to be a general consensus that they will be strongly heterogeneous, featuring "standard" as well as "accelerated" resources. Today, such resources are available as multicore processors, graphics processing units (GPUs), and other accelerators such as the Intel Xeon Phi. Any software infrastructure that claims usefulness for such environments must be able to meet their inherent challenges: massive multi-level parallelism, topology, asynchronicity, and abstraction. The "General, Hybrid, and Optimized Sparse Toolkit" (GHOST) is a collection of building blocks that targets algorithms dealing with sparse matrix representations on current and future large-scale systems. It implements the "MPI+X" paradigm, has a pure C interface, and provides hybrid-parallel numerical kernels, intelligent resource management, and truly heterogeneous parallelism for multicore CPUs, Nvidia GPUs, and the Intel Xeon Phi. We describe the details of its design with respect to the challenges posed by modern heterogeneous supercomputers and recent algorithmic developments. Implementation details which are indispensable for achieving high efficiency are pointed out and their necessity is justified by performance measurements or predictions based on performance models. The library code and several applications are available as open source. We also provide instructions on how to make use of GHOST in existing software packages, together with a case study which demonstrates the applicability and performance of GHOST as a component within a larger software stack.Comment: 32 pages, 11 figure
    • …
    corecore