1,994 research outputs found

    A methodology for analyzing commercial processor performance numbers

    Get PDF
    The wealth of performance numbers provided by benchmarking corporations makes it difficult to detect trends across commercial machines. A proposed methodology, based on statistical data analysis, simplifies exploration of these machines' large datasets

    Hardware Accelerator of Cartesian Genetic Programming with Multiple Fitness Units

    Get PDF
    A new accelerator of Cartesian genetic programming is presented in this paper. The accelerator is completely implemented in a single FPGA. The proposed architecture contains multiple instances of virtual reconfigurable circuit to evaluate several candidate solutions in parallel. An advanced memory organization was developed to achieve the maximum throughput of processing. The search algorithm is implemented using the on-chip PowerPC processor. In the benchmark problem (image filter evolution) the proposed platform provides a significant speedup (170) in comparison with a highly optimized software implementation. Moreover, the accelerator is 8 times faster than previous FPGA accelerators of image filter evolution

    Description and Optimization of Abstract Machines in a Dialect of Prolog

    Full text link
    In order to achieve competitive performance, abstract machines for Prolog and related languages end up being large and intricate, and incorporate sophisticated optimizations, both at the design and at the implementation levels. At the same time, efficiency considerations make it necessary to use low-level languages in their implementation. This makes them laborious to code, optimize, and, especially, maintain and extend. Writing the abstract machine (and ancillary code) in a higher-level language can help tame this inherent complexity. We show how the semantics of most basic components of an efficient virtual machine for Prolog can be described using (a variant of) Prolog. These descriptions are then compiled to C and assembled to build a complete bytecode emulator. Thanks to the high level of the language used and its closeness to Prolog, the abstract machine description can be manipulated using standard Prolog compilation and optimization techniques with relative ease. We also show how, by applying program transformations selectively, we obtain abstract machine implementations whose performance can match and even exceed that of state-of-the-art, highly-tuned, hand-crafted emulators.Comment: 56 pages, 46 figures, 5 tables, To appear in Theory and Practice of Logic Programming (TPLP

    Relaxed Operational Semantics of Concurrent Programming Languages

    Full text link
    We propose a novel, operational framework to formally describe the semantics of concurrent programs running within the context of a relaxed memory model. Our framework features a "temporary store" where the memory operations issued by the threads are recorded, in program order. A memory model then specifies the conditions under which a pending operation from this sequence is allowed to be globally performed, possibly out of order. The memory model also involves a "write grain," accounting for architectures where a thread may read a write that is not yet globally visible. Our formal model is supported by a software simulator, allowing us to run litmus tests in our semantics.Comment: In Proceedings EXPRESS/SOS 2012, arXiv:1208.244

    Accelerating FPGA-based evolution of wavelet transform filters by optimized task scheduling

    Get PDF
    Adaptive embedded systems are required in various applications. This work addresses these needs in the area of adaptive image compression in FPGA devices. A simplified version of an evolution strategy is utilized to optimize wavelet filters of a Discrete Wavelet Transform algorithm. We propose an adaptive image compression system in FPGA where optimized memory architecture, parallel processing and optimized task scheduling allow reducing the time of evolution. The proposed solution has been extensively evaluated in terms of the quality of compression as well as the processing time. The proposed architecture reduces the time of evolution by 44% compared to our previous reports while maintaining the quality of compression unchanged with respect to existing implementations. The system is able to find an optimized set of wavelet filters in less than 2 min whenever the input type of data changes

    Re-Engineering of the CERN Accelerators and Services Control System based on VMEbus and PowerPC Technology

    Get PDF
    A new generation of PowerPC VMEbus front-end computers is being introduced in the CERN accelerators and services control system infrastructure. This new technology is aimed at replacing existing PC based front-end computers and at offering a high performance microprocessor platform for present and future engineering developments for the LHC era. This paper describes the re-engineering strategy and the core architecture of the new systems. Special performance issues are also addressed in this paper

    PRISE: An Integrated Platform for Research and Teaching of Critical Embedded Systems

    Get PDF
    In this paper, we present PRISE, an integrated workbench for Research and Teaching of critical embedded systems at ISAE, the French Institute for Space and Aeronautics Engineering. PRISE is built around state-of-the-art technologies for the engineering of space and avionics systems used in Space and Avionics domain. It aims at demonstrating key aspects of critical, real-time, embedded systems used in the transport industry, but also validating new scientific contributions for the engineering of software functions. PRISE combines embedded and simulation platforms, and modeling tools. This platform is available for both research and teaching. Being built around widely used commercial and open source software; PRISE aims at being a reference platform for our teaching and research activities at ISAE
    • …
    corecore