5,558 research outputs found

    Coarse-Grain optimization and code generation for embedded multicore systems

    DOI: 10.1109/DSD.2013.48
    As processors and systems-on-chip increasingly become multicore, parallel programming remains a difficult, time-consuming and complicated task. End users who are not parallel programming experts need to exploit such processors and architectures using state-of-the-art fourth-generation high-level programming languages such as Scilab or MATLAB. The ALMA toolset addresses this problem by taking Scilab code as input and producing parallel code for embedded multiprocessor systems-on-chip, using platform quasi-agnostic optimisations. This paper discusses coarse-grain parallelism extraction and optimization issues, as well as parallel code generation, for the ALMA toolset.

    A case study for NoC based homogeneous MPSoC architectures

    The many-core design paradigm requires flexible and modular hardware and software components to provide the scalability needed by next-generation on-chip multiprocessor architectures. A multidisciplinary approach is necessary to consider all the interactions between the different components of the design. In this paper, a complete design methodology that simultaneously addresses system-level modeling, hardware architecture, and the programming model has been successfully used to implement a multiprocessor network-on-chip (NoC)-based system, the NoCRay graphic accelerator. The design, based on 16 processors, was prototyped on a field-programmable gate array (FPGA) and then laid out in 90-nm technology. Post-layout results show very low power and area, as well as a clock frequency of 500 MHz. Results show that an array of small and simple processors outperforms a single high-end general-purpose processor.

    A Survey of Techniques For Improving Energy Efficiency in Embedded Computing Systems

    Recent technological advances have greatly improved the performance and features of embedded systems. With the number of mobile devices alone now nearly equal to the population of Earth, embedded systems have truly become ubiquitous. These trends, however, have also made the task of managing their power consumption extremely challenging. In recent years, several techniques have been proposed to address this issue. In this paper, we survey the techniques for managing the power consumption of embedded systems. We discuss the need for power management and classify the techniques along several important parameters to highlight their similarities and differences. This paper is intended to help researchers and application developers gain insight into how power management techniques work and design even more efficient high-performance embedded systems in the future.

    Profile-Guided compilation of Scilab algorithms for multiprocessor systems

    DOI: 10.1007/978-3-319-05960-0_37
    Expressing parallelism in commonly used programming languages is still a major problem when mapping high-performance embedded applications to multiprocessor system-on-chip devices. The Architecture oriented paraLlelization for high performance embedded Multicore systems using scilAb (ALMA) European project aims to overcome these hurdles by introducing and exploiting a Scilab-based toolchain that enables the efficient mapping of applications onto multiprocessor platforms from a high level of abstraction. To achieve maximum performance, the toolchain supports iterative application parallelization using profile-guided application compilation. In this way, the toolchain increases the quality and performance of a parallelized application from iteration to iteration. This holistic solution hides the complexity of both the application and the architecture, which leads to better acceptance, reduced development cost, and shorter time-to-market.

    Evaluating Cache Coherent Shared Virtual Memory for Heterogeneous Multicore Chips

    The trend in industry is towards heterogeneous multicore processors (HMCs), including chips with CPUs and massively-threaded throughput-oriented processors (MTTOPs) such as GPUs. Although current homogeneous chips tightly couple the cores with cache-coherent shared virtual memory (CCSVM), this is not the communication paradigm used by any current HMC. In this paper, we present a CCSVM design for a CPU/MTTOP chip, as well as an extension of the pthreads programming model, called xthreads, for programming this HMC. Our goal is to evaluate the potential performance benefits of tightly coupling heterogeneous cores with CCSVM.