1,611 research outputs found

    A Survey of Techniques For Improving Energy Efficiency in Embedded Computing Systems

    Full text link
    Recent technological advances have greatly improved the performance and features of embedded systems. With the number of just mobile devices now reaching nearly equal to the population of earth, embedded systems have truly become ubiquitous. These trends, however, have also made the task of managing their power consumption extremely challenging. In recent years, several techniques have been proposed to address this issue. In this paper, we survey the techniques for managing power consumption of embedded systems. We discuss the need of power management and provide a classification of the techniques on several important parameters to highlight their similarities and differences. This paper is intended to help the researchers and application-developers in gaining insights into the working of power management techniques and designing even more efficient high-performance embedded systems of tomorrow

    Multi-Softcore Architecture on FPGA

    Get PDF
    To meet the high performance demands of embedded multimedia applications, embedded systems are integrating multiple processing units. However, they are mostly based on custom-logic design methodology. Designing parallel multicore systems using available standards intellectual properties yet maintaining high performance is also a challenging issue. Softcore processors and field programmable gate arrays (FPGAs) are a cheap and fast option to develop and test such systems. This paper describes a FPGA-based design methodology to implement a rapid prototype of parametric multicore systems. A study of the viability of making the SoC using the NIOS II soft-processor core from Altera is also presented. The NIOS II features a general-purpose RISC CPU architecture designed to address a wide range of applications. The performance of the implemented architecture is discussed, and also some parallel applications are used for testing speedup and efficiency of the system. Experimental results demonstrate the performance of the proposed multicore system, which achieves better speedup than the GPU (29.5% faster for the FIR filter and 23.6% faster for the matrix-matrix multiplication)

    DeSyRe: on-Demand System Reliability

    No full text
    The DeSyRe project builds on-demand adaptive and reliable Systems-on-Chips (SoCs). As fabrication technology scales down, chips are becoming less reliable, thereby incurring increased power and performance costs for fault tolerance. To make matters worse, power density is becoming a significant limiting factor in SoC design, in general. In the face of such changes in the technological landscape, current solutions for fault tolerance are expected to introduce excessive overheads in future systems. Moreover, attempting to design and manufacture a totally defect and fault-free system, would impact heavily, even prohibitively, the design, manufacturing, and testing costs, as well as the system performance and power consumption. In this context, DeSyRe delivers a new generation of systems that are reliable by design at well-balanced power, performance, and design costs. In our attempt to reduce the overheads of fault-tolerance, only a small fraction of the chip is built to be fault-free. This fault-free part is then employed to manage the remaining fault-prone resources of the SoC. The DeSyRe framework is applied to two medical systems with high safety requirements (measured using the IEC 61508 functional safety standard) and tight power and performance constraints

    MURAC: A unified machine model for heterogeneous computers

    Get PDF
    Includes bibliographical referencesHeterogeneous computing enables the performance and energy advantages of multiple distinct processing architectures to be efficiently exploited within a single machine. These systems are capable of delivering large performance increases by matching the applications to architectures that are most suited to them. The Multiple Runtime-reconfigurable Architecture Computer (MURAC) model has been proposed to tackle the problems commonly found in the design and usage of these machines. This model presents a system-level approach that creates a clear separation of concerns between the system implementer and the application developer. The three key concepts that make up the MURAC model are a unified machine model, a unified instruction stream and a unified memory space. A simple programming model built upon these abstractions provides a consistent interface for interacting with the underlying machine to the user application. This programming model simplifies application partitioning between hardware and software and allows the easy integration of different execution models within the single control ow of a mixed-architecture application. The theoretical and practical trade-offs of the proposed model have been explored through the design of several systems. An instruction-accurate system simulator has been developed that supports the simulated execution of mixed-architecture applications. An embedded System-on-Chip implementation has been used to measure the overhead in hardware resources required to support the model, which was found to be minimal. An implementation of the model within an operating system on a tightly-coupled reconfigurable processor platform has been created. This implementation is used to extend the software scheduler to allow for the full support of mixed-architecture applications in a multitasking environment. Different scheduling strategies have been tested using this scheduler for mixed-architecture applications. The design and implementation of these systems has shown that a unified abstraction model for heterogeneous computers provides important usability benefits to system and application designers. These benefits are achieved through a consistent view of the multiple different architectures to the operating system and user applications. This allows them to focus on achieving their performance and efficiency goals by gaining the benefits of different execution models during runtime without the complex implementation details of the system-level synchronisation and coordination

    A Fuzzy Logic Reconfiguration Engine for Symmetric Chip Multiprocessors

    Get PDF
    Recent developments in reconfigurable multiprocessor system on chip (MPSoC) have offered system designers a great amount of flexibility to exploit task concurrency with higher throughput and less energy consumption. This paper presents a novel fuzzy logic reconfiguration engine (FLRE) for coarse grain MPSoC reconfiguration that facilitates to identify an optimum balance between power and performance of the system. The FLRE is composed on two levels of abstraction layers. The system selects an optimal configuration of Level 1 / Level 2 cache size and Associativity, processor operating frequency and voltage, the number of cores based on miss rate, and energy and throughput information of the system both at core and SoC level. An 8-core symmetric chip multiprocessor has been used to evaluate the proposed scheme. The results show an overall decrease of energy consumption with not more than 30% decrease in the throughput

    Smart technologies for effective reconfiguration: the FASTER approach

    Get PDF
    Current and future computing systems increasingly require that their functionality stays flexible after the system is operational, in order to cope with changing user requirements and improvements in system features, i.e. changing protocols and data-coding standards, evolving demands for support of different user applications, and newly emerging applications in communication, computing and consumer electronics. Therefore, extending the functionality and the lifetime of products requires the addition of new functionality to track and satisfy the customers needs and market and technology trends. Many contemporary products along with the software part incorporate hardware accelerators for reasons of performance and power efficiency. While adaptivity of software is straightforward, adaptation of the hardware to changing requirements constitutes a challenging problem requiring delicate solutions. The FASTER (Facilitating Analysis and Synthesis Technologies for Effective Reconfiguration) project aims at introducing a complete methodology to allow designers to easily implement a system specification on a platform which includes a general purpose processor combined with multiple accelerators running on an FPGA, taking as input a high-level description and fully exploiting, both at design time and at run time, the capabilities of partial dynamic reconfiguration. The goal is that for selected application domains, the FASTER toolchain will be able to reduce the design and verification time of complex reconfigurable systems providing additional novel verification features that are not available in existing tool flows

    Modeling and Mapping of Optimized Schedules for Embedded Signal Processing Systems

    Get PDF
    The demand for Digital Signal Processing (DSP) in embedded systems has been increasing rapidly due to the proliferation of multimedia- and communication-intensive devices such as pervasive tablets and smart phones. Efficient implementation of embedded DSP systems requires integration of diverse hardware and software components, as well as dynamic workload distribution across heterogeneous computational resources. The former implies increased complexity of application modeling and analysis, but also brings enhanced potential for achieving improved energy consumption, cost or performance. The latter results from the increased use of dynamic behavior in embedded DSP applications. Furthermore, parallel programming is highly relevant in many embedded DSP areas due to the development and use of Multiprocessor System-On-Chip (MPSoC) technology. The need for efficient cooperation among different devices supporting diverse parallel embedded computations motivates high-level modeling that expresses dynamic signal processing behaviors and supports efficient task scheduling and hardware mapping. Starting with dynamic modeling, this thesis develops a systematic design methodology that supports functional simulation and hardware mapping of dynamic reconfiguration based on Parameterized Synchronous Dataflow (PSDF) graphs. By building on the DIF (Dataflow Interchange Format), which is a design language and associated software package for developing and experimenting with dataflow-based design techniques for signal processing systems, we have developed a novel tool for functional simulation of PSDF specifications. This simulation tool allows designers to model applications in PSDF and simulate their functionality, including use of the dynamic parameter reconfiguration capabilities offered by PSDF. With the help of this simulation tool, our design methodology helps to map PSDF specifications into efficient implementations on field programmable gate arrays (FPGAs). Furthermore, valid schedules can be derived from the PSDF models at runtime to adapt hardware configurations based on changing data characteristics or operational requirements. Under certain conditions, efficient quasi-static schedules can be applied to reduce overhead and enhance predictability in the scheduling process. Motivated by the fact that scheduling is critical to performance and to efficient use of dynamic reconfiguration, we have focused on a methodology for schedule design, which complements the emphasis on automated schedule construction in the existing literature on dataflow-based design and implementation. In particular, we have proposed a dataflow-based schedule design framework called the dataflow schedule graph (DSG), which provides a graphical framework for schedule construction based on dataflow semantics, and can also be used as an intermediate representation target for automated schedule generation. Our approach to applying the DSG in this thesis emphasizes schedule construction as a design process rather than an outcome of the synthesis process. Our approach employs dataflow graphs for representing both application models and schedules that are derived from them. By providing a dataflow-integrated framework for unambiguously representing, analyzing, manipulating, and interchanging schedules, the DSG facilitates effective codesign of dataflow-based application models and schedules for execution of these models. As multicore processors are deployed in an increasing variety of embedded image processing systems, effective utilization of resources such as multiprocessor systemon-chip (MPSoC) devices, and effective handling of implementation concerns such as memory management and I/O become critical to developing efficient embedded implementations. However, the diversity and complexity of applications and architectures in embedded image processing systems make the mapping of applications onto MPSoCs difficult. We help to address this challenge through a structured design methodology that is built upon the DSG modeling framework. We refer to this methodology as the DEIPS methodology (DSG-based design and implementation of Embedded Image Processing Systems). The DEIPS methodology provides a unified framework for joint consideration of DSG structures and the application graphs from which they are derived, which allows designers to integrate considerations of parallelization and resource constraints together with the application modeling process. We demonstrate the DEIPS methodology through cases studies on practical embedded image processing systems
    • …
    corecore