23,370 research outputs found
Low Power Processor Architectures and Contemporary Techniques for Power Optimization – A Review
The technological evolution has increased the number of transistors for a given die area significantly and increased the switching speed from few MHz to GHz range. Such inversely proportional decline in size and boost in performance consequently demands shrinking of supply voltage and effective power dissipation in chips with millions of transistors. This has triggered substantial amount of research in power reduction techniques into almost every aspect of the chip and particularly the processor cores contained in the chip. This paper presents an overview of techniques for achieving the power efficiency mainly at the processor core level but also visits related domains such as buses and memories. There are various processor parameters and features such as supply voltage, clock frequency, cache and pipelining which can be optimized to reduce the power consumption of the processor. This paper discusses various ways in which these parameters can be optimized. Also, emerging power efficient processor architectures are overviewed and research activities are discussed which should help reader identify how these factors in a processor contribute to power consumption. Some of these concepts have been already established whereas others are still active research areas. © 2009 ACADEMY PUBLISHER
Optimizing the flash-RAM energy trade-off in deeply embedded systems
Deeply embedded systems often have the tightest constraints on energy
consumption, requiring that they consume tiny amounts of current and run on
batteries for years. However, they typically execute code directly from flash,
instead of the more energy efficient RAM. We implement a novel compiler
optimization that exploits the relative efficiency of RAM by statically moving
carefully selected basic blocks from flash to RAM. Our technique uses integer
linear programming, with an energy cost model to select a good set of basic
blocks to place into RAM, without impacting stack or data storage.
We evaluate our optimization on a common ARM microcontroller and succeed in
reducing the average power consumption by up to 41% and reducing energy
consumption by up to 22%, while increasing execution time. A case study is
presented, where an application executes code then sleeps for a period of time.
For this example we show that our optimization could allow the application to
run on battery for up to 32% longer. We also show that for this scenario the
total application energy can be reduced, even if the optimization increases the
execution time of the code
A 64mW DNN-based Visual Navigation Engine for Autonomous Nano-Drones
Fully-autonomous miniaturized robots (e.g., drones), with artificial
intelligence (AI) based visual navigation capabilities are extremely
challenging drivers of Internet-of-Things edge intelligence capabilities.
Visual navigation based on AI approaches, such as deep neural networks (DNNs)
are becoming pervasive for standard-size drones, but are considered out of
reach for nanodrones with size of a few cm. In this work, we
present the first (to the best of our knowledge) demonstration of a navigation
engine for autonomous nano-drones capable of closed-loop end-to-end DNN-based
visual navigation. To achieve this goal we developed a complete methodology for
parallel execution of complex DNNs directly on-bard of resource-constrained
milliwatt-scale nodes. Our system is based on GAP8, a novel parallel
ultra-low-power computing platform, and a 27 g commercial, open-source
CrazyFlie 2.0 nano-quadrotor. As part of our general methodology we discuss the
software mapping techniques that enable the state-of-the-art deep convolutional
neural network presented in [1] to be fully executed on-board within a strict 6
fps real-time constraint with no compromise in terms of flight results, while
all processing is done with only 64 mW on average. Our navigation engine is
flexible and can be used to span a wide performance range: at its peak
performance corner it achieves 18 fps while still consuming on average just
3.5% of the power envelope of the deployed nano-aircraft.Comment: 15 pages, 13 figures, 5 tables, 2 listings, accepted for publication
in the IEEE Internet of Things Journal (IEEE IOTJ
Design of multimedia processor based on metric computation
Media-processing applications, such as signal processing, 2D and 3D graphics
rendering, and image compression, are the dominant workloads in many embedded
systems today. The real-time constraints of those media applications have
taxing demands on today's processor performances with low cost, low power and
reduced design delay. To satisfy those challenges, a fast and efficient
strategy consists in upgrading a low cost general purpose processor core. This
approach is based on the personalization of a general RISC processor core
according the target multimedia application requirements. Thus, if the extra
cost is justified, the general purpose processor GPP core can be enforced with
instruction level coprocessors, coarse grain dedicated hardware, ad hoc
memories or new GPP cores. In this way the final design solution is tailored to
the application requirements. The proposed approach is based on three main
steps: the first one is the analysis of the targeted application using
efficient metrics. The second step is the selection of the appropriate
architecture template according to the first step results and recommendations.
The third step is the architecture generation. This approach is experimented
using various image and video algorithms showing its feasibility
Recommended from our members
Design Space Exploration in Cyber-Physical Systems
Cyber physical systems (CPS) integrate a variety of engineering areas such as control, mechanical and computer engineering in a holistic design effort. While interdependencies between the different disciplines are key attributes of CPS design science, little is known about the impact of design decisions of the cyber part on the overall system qualities. To investigate these interdependencies, this paper proposes a simulation-based Design Space Exploration (DSE) framework that considers detailed cyber system parameters such as cache size, bus width, and voltage levels in addition to physical and control parameters of the CPS. We propose an exploration algorithm that surfs the parameter configurations in the cyber physical sub-systems, in order to approximate the Pareto-optimal design points with regards to the trade-os among the design objectives, such as energy consumption and control stability. We apply the proposed framework to a network control system for an inverted-pendulum application. The presented holistic evaluation of the identified Pareto-points reveals the presence of non-trivial trade-os, which are imposed by the control, physical, and detailed cyber parameters. For instance the identified energy and control optimal design points comprise configurations with a wide range of CPU speeds, sample times and cache configuration following non-trivial zig-zag patterns. The proposed framework could identify and manage those trade-os and, as a result, is an imperative rst step to automate the search for superior CSP configurations
- …