13,469 research outputs found

    Developing performance-portable molecular dynamics kernels in Open CL

    Get PDF
    This paper investigates the development of a molecular dynamics code that is highly portable between architectures. Using OpenCL, we develop an implementation of Sandia’s miniMD benchmark that achieves good levels of performance across a wide range of hardware: CPUs, discrete GPUs and integrated GPUs. We demonstrate that the performance bottlenecks of miniMD’s short-range force calculation kernel are the same across these architectures, and detail a number of platform- agnostic optimisations that improve its performance by at least 2x on all hardware considered. Our complete code is shown to be 1.7x faster than the original miniMD, and at most 2x slower than implementations individually hand-tuned for a specific architecture

    AutoAccel: Automated Accelerator Generation and Optimization with Composable, Parallel and Pipeline Architecture

    Full text link
    CPU-FPGA heterogeneous architectures are attracting ever-increasing attention in an attempt to advance computational capabilities and energy efficiency in today's datacenters. These architectures provide programmers with the ability to reprogram the FPGAs for flexible acceleration of many workloads. Nonetheless, this advantage is often overshadowed by the poor programmability of FPGAs whose programming is conventionally a RTL design practice. Although recent advances in high-level synthesis (HLS) significantly improve the FPGA programmability, it still leaves programmers facing the challenge of identifying the optimal design configuration in a tremendous design space. This paper aims to address this challenge and pave the path from software programs towards high-quality FPGA accelerators. Specifically, we first propose the composable, parallel and pipeline (CPP) microarchitecture as a template of accelerator designs. Such a well-defined template is able to support efficient accelerator designs for a broad class of computation kernels, and more importantly, drastically reduce the design space. Also, we introduce an analytical model to capture the performance and resource trade-offs among different design configurations of the CPP microarchitecture, which lays the foundation for fast design space exploration. On top of the CPP microarchitecture and its analytical model, we develop the AutoAccel framework to make the entire accelerator generation automated. AutoAccel accepts a software program as an input and performs a series of code transformations based on the result of the analytical-model-based design space exploration to construct the desired CPP microarchitecture. Our experiments show that the AutoAccel-generated accelerators outperform their corresponding software implementations by an average of 72x for a broad class of computation kernels

    Design of multimedia processor based on metric computation

    Get PDF
    Media-processing applications, such as signal processing, 2D and 3D graphics rendering, and image compression, are the dominant workloads in many embedded systems today. The real-time constraints of those media applications have taxing demands on today's processor performances with low cost, low power and reduced design delay. To satisfy those challenges, a fast and efficient strategy consists in upgrading a low cost general purpose processor core. This approach is based on the personalization of a general RISC processor core according the target multimedia application requirements. Thus, if the extra cost is justified, the general purpose processor GPP core can be enforced with instruction level coprocessors, coarse grain dedicated hardware, ad hoc memories or new GPP cores. In this way the final design solution is tailored to the application requirements. The proposed approach is based on three main steps: the first one is the analysis of the targeted application using efficient metrics. The second step is the selection of the appropriate architecture template according to the first step results and recommendations. The third step is the architecture generation. This approach is experimented using various image and video algorithms showing its feasibility

    gCSP: A Graphical Tool for Designing CSP systems

    Get PDF
    For broad acceptance of an engineering paradigm, a graphical notation and a supporting design tool seem necessary. This paper discusses certain issues of developing a design environment for building systems based on CSP. Some of the issues discussed depend specifically on the underlying theory of CSP, while a number of them are common for any graphical notation and supporting tools, such as provisions for complexity management and design overview

    Future internet enablers for VGI applications

    No full text
    This paper presents the authors experiences with the development of mobile Volunteered Geographic Information (VGI) applications in the context of the ENVIROFI project and Future Internet Public Private Partnership (FI-PPP) FP7 research programme.FI-PPP has an ambitious goal of developing a set of Generic FI Enablers (GEs) - software and hardware tools that will simplify development of thematic future internet applications. Our role in the programme was to provide requirements and assess the usability of the GEs from the point of view of the environmental usage area, In addition, we specified and developed three proof of concept implementations of environmental FI applications, and a set of specific environmental enablers (SEs) complementing the functionality offered by GEs. Rather than trying to rebuild the whole infrastructure of the Environmental Information Space (EIS), we concentrated on two aspects: (1) how to assure the existing and future EIS services and applications can be integrated and reused in FI context; and (2) how to profit from the GEs in future environmental applications.This paper concentrates on the GEs and SEs which were used in two of the ENVIROFI pilots which are representative for the emerging class of Volunteered Geographic Information (VGI) use-cases: one of them is pertinent to biodiversity and another to influence of weather and airborne pollution on users’ wellbeing. In VGI applications, the EIS and SensorWeb overlap with the Social web and potentially huge amounts of information from mobile citizens needs to be assessed and fused with the observations from official sources. On the whole, the authors are confident that the FI-PPP programme will greatly influence the EIS, but the paper also warns of the shortcomings in the current GE implementations and provides recommendations for further developments
    • 

    corecore