6 research outputs found

    Instruction buffering exploration for low energy embedded processors.

    No full text
    For multimedia applications, loop buffering is an efficient mechanism to reduce the power in the instruction memory of embedded processors. Especially software controlled loop buffers are energy efficient. However current compilers do not fully take advantage of the possibilities of such loop buffers. This paper presents an algorithm to explore for an application or a set of applications what is the optimal loop buffer configuration and the optimal way to use this configuration. Results for the MediaBench application suite show an additional 35% reduction (on average) in energy in the instruction memory hierarchy as compared to traditional approaches to the loop buffer without any performance implications

    Improving Energy Efficiency of Application-Specific Instruction-Set Processors

    Get PDF
    Present-day consumer mobile devices seem to challenge the concept of embedded computing by bringing the equivalent of supercomputing power from two decades ago into hand-held devices. This challenge, however, is well met by pushing the boundaries of embedded computing further into areas previously monopolised by Application-Specific Integrated Circuits (ASICs). Furthermore, in areas traditionally associated with embedded computing, an increase in the complexity of algorithms and applications requires a continuous rise in availability of computing power and energy efficiency in order to fit within the same, or smaller, power budget. It is, ultimately, the amount of energy the application execution consumes that dictates the usefulness of a programmable embedded system, in comparison with implementation of an ASIC.This Thesis aimed to explore the energy efficiency overheads of Application-Specific InstructionSet Processors (ASIPs), a class of embedded processors aiming to compete with ASICs. While an ASIC can be designed to provide precise performance and energy efficiency required by a specific application without unnecessary overheads, the cost of design and verification, as well as the inability to upgrade or modify, favour more flexible programmable solutions. The ASIP designs can match the computing performance of the ASIC for specific applications. What is left, therefore, is achieving energy efficiency of a similar order of magnitude.In the past, one area of ASIP design that has been identified as a major consumer of energy is storage of temporal values produced during computation – the Register File (RF), with the associated interconnection network to transport those values between registers and computational Function Units (FUs). In this Thesis, the energy efficiency of RF and interconnection network is studied using the Transport Triggered Architectures (TTAs) template. Specifically, compiler optimisations aiming at reducing the traffic of temporal values between RF and FUs are presented in this Thesis. Bypassing of the temporal value, from the output of the FU which produces it directly in the input ports of the FUs that require it to continue with the computation, saves multiple RF reads. In addition, if all the uses of such a temporal value can be bypassed, the RF write can be eliminated as well. Such optimisations result in a simplification of the RF, via a reduction in the actual number of registers present or a reduction in the number of read and write ports in the RF and improved energy efficiency. In cases where the limited number of the simultaneous RF reads or writes cause a performance bottleneck, such optimisations result in performance improvements leading to faster execution times, therefore, allowing for execution at lower clock frequencies resulting in additional energy savings.Another area of the ASIP design consuming a significant amount of energy is the instruction memory subsystem, which is the artefact required for the programmability of the embedded processor. As this subsystem is not present in ASIC, the energy consumed for storing an application program and reading it from the instruction memories to control processor execution is an overhead that needs to be minimised. In this Thesis, one particular tool to improve the energy efficiency of the instruction memory subsystem – instruction buffer – is examined. While not trivially obvious, the presence of buffers for storing loop bodies, or parts of them, results in a reduced number of reads from the instruction memories. As a result, memories can be put to lower power state leading to lower overall energy consumption, pending energy-efficient buffer implementation. Specifically, an energy-efficient implementation of the instruction buffer is presented in this Thesis, together with analysis tools to identify candidate loops and assess their suitability for storing in the instruction buffer.The studies presented in this Thesis show that the energy overheads associated with the use of embedded processors, in comparison to ad-hoc ASIC solutions, are manageable when carefully considered during the design of an embedded system for a particular application, or application domain. Finally, the methods presented in this Thesis do not restrict the reprogrammability of the embedded system

    Instruction buffering exploration for low energy embedded processors

    No full text
    For multimedia applications, loop buffering is an efficient mechanism to reduce the power in the instruction memory of embedded processors. Especially software controlled loop buffers are energy efficient. However current compilers do not fully take advantage of the possibilities of such loop buffers. This paper presents an algorithm to explore for an application or a set of applications what is the optimal loop buffer configuration and the optimal way to use this configuration. Results for the MediaBench application suite show an additional 35% reduction (on average) in energy in the instruction memory hierarchy as compared to traditional approaches to the loop buffer without any performance implications
    corecore