96,760 research outputs found
Memory Hierarchy Hardware-Software Co-design in Embedded Systems
The memory hierarchy is the main bottleneck in modern computer systems as the gap between the speed of the processor and the memory continues to grow larger. The situation in embedded systems is even worse. The memory hierarchy consumes a large amount of chip area and energy, which are precious resources in embedded systems. Moreover, embedded systems have multiple design objectives such as performance, energy consumption, and area, etc.
Customizing the memory hierarchy for specific applications is a very important way to take full advantage of limited resources to maximize the performance. However, the traditional custom memory hierarchy design methodologies are phase-ordered. They separate the application optimization from the memory hierarchy architecture design, which tend to result in local-optimal solutions. In traditional Hardware-Software co-design methodologies, much of the work has focused on utilizing reconfigurable logic to partition the computation. However, utilizing reconfigurable logic to perform the memory hierarchy design is seldom addressed.
In this paper, we propose a new framework for designing memory hierarchy for embedded systems. The framework will take advantage of the flexible reconfigurable logic to customize the memory hierarchy for specific applications. It combines the application optimization and memory hierarchy design together to obtain a global-optimal solution. Using the framework, we performed a case study to design a new software-controlled instruction memory that showed promising potential.Singapore-MIT Alliance (SMA
Low Power Processor Architectures and Contemporary Techniques for Power Optimization ā A Review
The technological evolution has increased the number of transistors for a given die area significantly and increased the switching speed from few MHz to GHz range. Such inversely proportional decline in size and boost in performance consequently demands shrinking of supply voltage and effective power dissipation in chips with millions of transistors. This has triggered substantial amount of research in power reduction techniques into almost every aspect of the chip and particularly the processor cores contained in the chip. This paper presents an overview of techniques for achieving the power efficiency mainly at the processor core level but also visits related domains such as buses and memories. There are various processor parameters and features such as supply voltage, clock frequency, cache and pipelining which can be optimized to reduce the power consumption of the processor. This paper discusses various ways in which these parameters can be optimized. Also, emerging power efficient processor architectures are overviewed and research activities are discussed which should help reader identify how these factors in a processor contribute to power consumption. Some of these concepts have been already established whereas others are still active research areas. Ā© 2009 ACADEMY PUBLISHER
Recommended from our members
Integrated Dynamic Facade Control with an Agent-based Architecture for Commercial Buildings
Dynamic faƧades have significant technical potential to minimize heating, cooling, and lighting energy use and peak electric demand in the perimeter zone of commercial buildings, but the performance of these systems is reliant on being able to balance complex trade-offs between solar control, daylight admission, comfort, and view over the life of the installation. As the context for controllable energy-efficiency technologies grows more complex with the increased use of intermittent renewable energy resources on the grid, it has become increasingly important to look ahead towards more advanced approaches to integrated systems control in order to achieve optimum life-cycle performance at a lower cost. This study examines the feasibility of a model predictive control system for low-cost autonomous dynamic faƧades. A system architecture designed around lightweight, simple agents is proposed. The architecture accommodates whole building and grid level demands through its modular, hierarchical approach. Automatically-generated models for computing window heat gains, daylight illuminance, and discomfort glare are described. The open source Modelica and JModelica software tools were used to determine the optimum state of control given inputs of window heat gains and lighting loads for a 24-hour optimization horizon. Penalty functions for glare and view/ daylight quality were implemented as constraints. The control system was tested on a low-power controller (1.4 GHz single core with 2 GB of RAM) to evaluate feasibility. The target platform is a low-cost ($35/unit) embedded controller with 1.2 GHz dual-core cpu and 1 GB of RAM. Configuration and commissioning of the curtainwall unit was designed to be largely plug and play with minimal inputs required by the manufacturer through a web-based user interface. An example application was used to demonstrate optimal control of a three-zone electrochromic window for a south-facing zone. The overall approach was deemed to be promising. Further engineering is required to enable scalable, turnkey solutions
Definition, implementation and validation of energy code smells: an exploratory study on an embedded system
Optimizing software in terms of energy efficiency is one of the challenges that both research and industry will have to face in the next few years.We consider energy efficiency as a software product quality characteristic, to be improved through the refactoring of appropriate code pattern: the aim of this work is identifying those code patterns, hereby defined as Energy Code Smells, that might increase the impact of software over power consumption. For our purposes, we perform an experiment consisting in the execution of several code patterns on an embedded system. These code patterns are executed in two versions: the first one contains a code issue that could negatively impact power consumption, the other one is refactored removing the issue. We measure the power consumption of the embedded device during the execution of each code pattern. We also track the execution time to investigate whether Energy Code Smells are also Performance Smells. Our results show that some Energy Code Smells actually have an impact over power consumption in the magnitude order of micro Watts. Moreover, those Smells did not introduce a performance decreas
A Survey of Techniques For Improving Energy Efficiency in Embedded Computing Systems
Recent technological advances have greatly improved the performance and
features of embedded systems. With the number of just mobile devices now
reaching nearly equal to the population of earth, embedded systems have truly
become ubiquitous. These trends, however, have also made the task of managing
their power consumption extremely challenging. In recent years, several
techniques have been proposed to address this issue. In this paper, we survey
the techniques for managing power consumption of embedded systems. We discuss
the need of power management and provide a classification of the techniques on
several important parameters to highlight their similarities and differences.
This paper is intended to help the researchers and application-developers in
gaining insights into the working of power management techniques and designing
even more efficient high-performance embedded systems of tomorrow
A Survey on Compiler Autotuning using Machine Learning
Since the mid-1990s, researchers have been trying to use machine-learning
based approaches to solve a number of different compiler optimization problems.
These techniques primarily enhance the quality of the obtained results and,
more importantly, make it feasible to tackle two main compiler optimization
problems: optimization selection (choosing which optimizations to apply) and
phase-ordering (choosing the order of applying optimizations). The compiler
optimization space continues to grow due to the advancement of applications,
increasing number of compiler optimizations, and new target architectures.
Generic optimization passes in compilers cannot fully leverage newly introduced
optimizations and, therefore, cannot keep up with the pace of increasing
options. This survey summarizes and classifies the recent advances in using
machine learning for the compiler optimization field, particularly on the two
major problems of (1) selecting the best optimizations and (2) the
phase-ordering of optimizations. The survey highlights the approaches taken so
far, the obtained results, the fine-grain classification among different
approaches and finally, the influential papers of the field.Comment: version 5.0 (updated on September 2018)- Preprint Version For our
Accepted Journal @ ACM CSUR 2018 (42 pages) - This survey will be updated
quarterly here (Send me your new published papers to be added in the
subsequent version) History: Received November 2016; Revised August 2017;
Revised February 2018; Accepted March 2018
Lost in translation: Exposing hidden compiler optimization opportunities
Existing iterative compilation and machine-learning-based optimization
techniques have been proven very successful in achieving better optimizations
than the standard optimization levels of a compiler. However, they were not
engineered to support the tuning of a compiler's optimizer as part of the
compiler's daily development cycle. In this paper, we first establish the
required properties which a technique must exhibit to enable such tuning. We
then introduce an enhancement to the classic nightly routine testing of
compilers which exhibits all the required properties, and thus, is capable of
driving the improvement and tuning of the compiler's common optimizer. This is
achieved by leveraging resource usage and compilation information collected
while systematically exploiting prefixes of the transformations applied at
standard optimization levels. Experimental evaluation using the LLVM v6.0.1
compiler demonstrated that the new approach was able to reveal hidden
cross-architecture and architecture-dependent potential optimizations on two
popular processors: the Intel i5-6300U and the Arm Cortex-A53-based Broadcom
BCM2837 used in the Raspberry Pi 3B+. As a case study, we demonstrate how the
insights from our approach enabled us to identify and remove a significant
shortcoming of the CFG simplification pass of the LLVM v6.0.1 compiler.Comment: 31 pages, 7 figures, 2 table. arXiv admin note: text overlap with
arXiv:1802.0984
- ā¦