Hardware and software for a power-aware wireless microsensor node
Thesis (M.Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2002. Includes bibliographical references (p. 143-144). This thesis examines important issues in the design of hardware and software for microsensor networks, with particular attention paid to mechanisms for providing power awareness. The [mu]AMPS Revision 1 microsensor node is used as an example. The design of this node implementation is described in detail, including, in particular, the design of the [mu]AMPS processor board and its power-scalable architecture. The operating system and application programming interface for the node are described. Finally, an analysis is made of the power consumed by each of the node's subsystems, and these results are used to assess the degree of power awareness provided by the [mu]AMPS Revision 1 node. by Nathan J. Ickes. M.Eng.
Energy-aware embedded media processing: customizable memory subsystems and energy management policies
The design of energy-efficient data memory architectures for embedded system platforms has received considerable attention in recent years. In this dissertation we propose a special-purpose data memory subsystem, called Xtream-Fit, targeted at streaming media applications executing on both generic uniprocessor embedded platforms and powerful SMT-based multi-threading platforms. We empirically demonstrate that Xtream-Fit achieves high energy-delay efficiency across a wide range of media devices, from systems running a single media application to systems concurrently executing multiple media applications under synchronization constraints. Xtream-Fit's energy efficiency is predicated on a novel task-based execution model that exposes/enhances opportunities for efficient prefetching, and on aggressive dynamic energy conservation techniques targeting on-chip and off-chip memory components. A key novelty of Xtream-Fit is that it exposes a single customization parameter, thus enabling a very simple and yet effective design space exploration methodology to find the best memory configuration for the target application(s). Extensive experimental results show that Xtream-Fit reduces the energy-delay product substantially (by 32% to 69%) as compared to "standard" general-purpose memory subsystems enhanced with state-of-the-art cache decay and SDRAM power mode control policies. Electrical and Computer Engineering
An ultra-low voltage FFT processor using energy-aware techniques
Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, February 2004. Page 170 blank. Includes bibliographical references (p. 165-169). In a number of emerging applications such as wireless sensor networks, system lifetime depends on the energy efficiency of computation and communication. The key metric in such applications is the energy dissipated per function rather than traditional ones such as clock speed or silicon area. Hardware designs are shifting focus toward enabling energy awareness, allowing the processor to be energy-efficient for a variety of operating scenarios. This is in contrast to conventional low-power design, which optimizes for the worst-case scenario. Here, three energy-quality scalable hooks are designed into a real-valued FFT processor: variable FFT length (N=128 to 1024 points), variable bit precision (8 or 16 bits), and variable voltage supply with variable clock frequency (VDD=180mV to 0.9V, and f=164Hz to 6MHz). A variable-bit-precision, variable-FFT-length scalable FFT ASIC using an off-the-shelf standard-cell logic library and memory only scales down to 1V operation. Further energy savings are achieved through ultra-low voltage-supply operation. As performance requirements are relaxed, the operating voltage supply is scaled down, possibly even below the threshold voltage into the subthreshold region. When lower frequencies cause leakage energy dissipation to exceed the active energy dissipation, there is an optimal operating point for minimizing energy consumption. Logic and memory design techniques allowing ultra-low voltage operation are employed to study the optimal frequency/voltage operating point for the FFT. A full-custom implementation with circuit techniques optimized for deep voltage scaling into the subthreshold regime is fabricated using a standard CMOS 0.18[mu]m logic process and functions down to 180mV.
At the optimal operating point, where the voltage supply is 350mV, the FFT processor dissipates 155nJ/FFT. The custom FFT is 8x more energy-efficient than the ASIC implementation and 350x more energy-efficient than a low-power microprocessor implementation. by Alice Wang. Ph.D.
A survey of design techniques for system-level dynamic power management
Dynamic power management (DPM) is a design methodology for dynamically reconfiguring systems to provide the requested services and performance levels with a minimum number of active components or a minimum load on such components. DPM encompasses a set of techniques that achieves energy-efficient computation by selectively turning off (or reducing the performance of) system components when they are idle (or partially unexploited). In this paper, we survey several approaches to system-level dynamic power management. We first describe how systems employ power-manageable components and how the use of dynamic reconfiguration can impact the overall power consumption. We then analyze DPM implementation issues in electronic systems, and we survey recent initiatives in standardizing the hardware/software interface to enable software-controlled power management of hardware components.
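The simplest policy family in this design space is timeout-based shutdown: a component is switched off once it has been idle longer than a fixed threshold. A minimal sketch, with a hypothetical component interface (the survey itself covers far richer predictive and stochastic policies):

```c
/* Sketch of a fixed-timeout DPM policy for one power-manageable
 * component. State machine and field names are illustrative. */
typedef enum { PM_ACTIVE, PM_SLEEP } pm_state;

typedef struct {
    pm_state state;
    double idle_since; /* timestamp of the last request (s) */
    double timeout;    /* idle time allowed before shutdown (s) */
} pm_component;

/* Called periodically: go to sleep once the idle timeout expires. */
void pm_tick(pm_component *c, double now) {
    if (c->state == PM_ACTIVE && now - c->idle_since >= c->timeout)
        c->state = PM_SLEEP;
}

/* Called on a new request: wake up (a real system would also pay a
 * wake-up energy and latency cost here) and reset the idle timer. */
void pm_request(pm_component *c, double now) {
    c->state = PM_ACTIVE;
    c->idle_since = now;
}
```

The choice of timeout embodies the break-even reasoning discussed in the survey: sleeping only pays off when the expected idle period exceeds the cost of the shutdown/wake-up transition.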
Energy efficient hardware acceleration of multimedia processing tools
The world of mobile devices is experiencing an ongoing trend of feature enhancement and general-purpose multimedia platform convergence. This trend poses many grand challenges, the most pressing being their limited battery life as a consequence of delivering computationally demanding features. The envisaged mobile application features can be considered to be accelerated by a set of underpinning hardware blocks. Based on the survey that this thesis presents on modern video compression standards and their associated enabling technologies, it is concluded that tight energy and throughput constraints can still be effectively tackled at the algorithmic level in order to design re-usable optimised hardware acceleration cores.
To prove these conclusions, the work in this thesis is focused on two of the basic enabling technologies that support mobile video applications, namely the Shape Adaptive Discrete Cosine Transform (SA-DCT) and its inverse, the SA-IDCT. The hardware architectures presented in this work have been designed with energy efficiency in mind. This goal is achieved by employing high-level techniques such as redundant computation elimination, parallelism, and low-switching computation structures. Both architectures compare favourably against the relevant prior art in the literature.
The SA-DCT/IDCT technologies are instances of a more general computation: both are Constant Matrix Multiplication (CMM) operations. Thus, this thesis also proposes an algorithm for the efficient hardware design of any general CMM-based enabling technology. The proposed algorithm leverages the effective solution search capability of genetic programming. A bonus feature of the proposed modelling approach is that it is further amenable to hardware acceleration. Another bonus feature is an early-exit mechanism that achieves large search-space reductions. Results show an improvement on state-of-the-art algorithms, with future potential for even greater savings.
Composable system resources as an architecture for networked systems
Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2001. Includes bibliographical references (p. 169-173). Network devices promise to provide a variety of user interfaces through which users can interact with network applications. The design of these devices stands in stark contrast to the design of personal computers, in which new software content is accommodated by increased processor performance. Network device design, on the other hand, must take into consideration a variety of metrics including interactive performance, power consumption, battery life, transaction security, physical size and weight, and cost. Designing a general-purpose platform that caters to all of these metrics for all applications and devices is impractical. For an application mix, a processor architecture and platform can be designed that is optimized for a selected set of metrics, such as power consumption and battery life. Each of these optimized processor architectures and platforms will no doubt be applicable to a variety of devices. This suggests a modular system architecture for network devices that segments the computational resources from the device UI. Computational resources can be selected for a device UI that are optimized with respect to application mixes as well as to user preferences and metrics. Segmenting out the device UI reduces the complexity of device UIs, simplifying development and lowering costs. At the same time, with little electrical circuitry resident on device UIs, the selected platform can more fully optimize the entire device. In this thesis, I describe an architecture for network devices that is based on using pluggable system resource modules that can be composed together to create a close-to-optimal platform for a particular application mix and device. Frequently used applications execute efficiently, while infrequently used applications execute less efficiently. Metrics for calculating efficiencies and selected application domains and mixes are specified by individuals, as opposed to one-size-fits-all metrics specified by manufacturers. I show that such a composable system architecture is effective in optimizing system performance with respect to user preferences and application requirements, while the modularity of the architecture introduces little overhead. I also explore opportunities that arise from segmenting devices into UI and computational resource components, and show that an automated design environment can be created that greatly simplifies custom device design, reducing time-to-market and lowering costs. by Sandeep Chatterjee. Ph.D.
Compiler-driven data layout transformations for network applications
This work approaches the little-studied topic of compiler optimisations directed at
network applications.
It starts by investigating if there exist any fundamental differences between application
domains that justify the development and tuning of domain-specific compiler optimisations.
It shows an automated approach that is capable of identifying domain-specific
workload characterisations and presenting them in a readily interpretable format based
on decision trees. The generated workload profiles summarise key resource utilisation
issues and enable compiler engineers to address the highlighted bottlenecks.
By applying this methodology to data-intensive network infrastructure applications, it
shows that data organisation is the key obstacle to overcome in order to achieve high
performance.
It therefore proposes and evaluates three specialised data transformations (structure
splitting, array regrouping, and software caching) against the industrial EEMBC
networking benchmarks and real-world data sets. It demonstrates, on the one hand,
that speedups of up to 2.62x can be achieved, but, on the other, that no single solution
performs equally well across different network traffic scenarios.
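Structure splitting, the first of these transformations, separates the frequently accessed ("hot") fields of a record from the rarely accessed ("cold") ones, so cache lines fetched in the hot loop carry only useful bytes. A minimal sketch; the record and field names are invented for illustration and are not taken from the EEMBC benchmarks:

```c
/* Original layout: each traversal drags ~64 cold bytes into cache
 * alongside the 8 hot bytes it actually needs. */
struct route_entry {
    unsigned prefix, next_hop;   /* hot: read on every lookup */
    char description[48];        /* cold: read only by admin tools */
    unsigned long stats_packets; /* cold: updated off the fast path */
};

/* Split layout: two parallel arrays sharing a common index. */
struct route_hot  { unsigned prefix, next_hop; };
struct route_cold { char description[48]; unsigned long stats_packets; };

/* The hot loop now touches only sizeof(struct route_hot) bytes per
 * entry, so far more entries fit in each cache line. */
unsigned lookup(const struct route_hot *hot, int n, unsigned addr) {
    for (int i = 0; i < n; i++)
        if (hot[i].prefix == addr)
            return hot[i].next_hop;
    return 0; /* no match */
}
```

Array regrouping is the dual transformation: fields that are always accessed together are merged back into one record to avoid touching multiple arrays per iteration.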
Hence, to address this issue, an adaptive software caching scheme for high-frequency
route lookup operations is introduced, and its effectiveness is evaluated once more
against the EEMBC networking benchmarks and real-world data sets, achieving
speedups of up to 3.30x and 2.27x. The results clearly demonstrate that adaptive data
organisation schemes are necessary to ensure optimal performance under varying
network loads.
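A software cache of this kind can be sketched as a small direct-mapped table in front of the full route lookup; the hit/miss counters are the feedback an adaptive scheme could act on (e.g. resizing or bypassing the cache under traffic with little locality). Everything here, including the stub full lookup, is illustrative rather than the thesis's implementation:

```c
#define CACHE_SLOTS 256

typedef struct {
    unsigned addr[CACHE_SLOTS];      /* cached destination addresses */
    unsigned hop[CACHE_SLOTS];       /* cached lookup results */
    unsigned char valid[CACHE_SLOTS];
    unsigned long hits, misses;      /* feedback for adaptation */
} route_cache;

/* Stub for the full lookup (a trie or table walk in a real stack). */
static unsigned full_lookup(unsigned addr) { return addr & 0xff; }

/* Check the direct-mapped slot first; fall back to the full lookup
 * on a miss and install the result. */
unsigned cached_lookup(route_cache *c, unsigned addr) {
    unsigned slot = addr % CACHE_SLOTS;
    if (c->valid[slot] && c->addr[slot] == addr) {
        c->hits++;
        return c->hop[slot];
    }
    c->misses++;
    unsigned hop = full_lookup(addr);
    c->addr[slot] = addr;
    c->hop[slot] = hop;
    c->valid[slot] = 1;
    return hop;
}
```

The effectiveness of such a cache, and hence the observed speedup, depends directly on the temporal locality of the traffic, which is why no fixed configuration wins across all network scenarios.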
Finally, this research addresses another issue introduced by data transformations such
as array regrouping and software caching, namely the need for static analysis to allow
efficient resource allocation. This thesis proposes a static code analyser that allows the
automatic resource analysis of source code containing list and tree structures. The tool
applies a combination of amortised analysis and separation logic methodology to real
code and is able to evaluate the type and resource usage of existing data structures,
which can be used to compute global resource consumption values for full
data-intensive network applications.
System-level power optimization: techniques and tools
This tutorial surveys design methods for energy-efficient system-level design. We consider electronic systems consisting of a hardware platform and software layers. We consider the three major constituents of hardware that consume energy, namely computation, communication, and storage units, and we review methods of reducing their energy consumption. We also study models for analyzing the energy cost of software, and methods for energy-efficient software design and compilation. This survey is organized around three main phases of a system design: conceptualization and modeling, design and implementation, and runtime management. For each phase, we review recent techniques for energy-efficient design of both hardware and software.