25,834 research outputs found
FastDeepIoT: Towards Understanding and Optimizing Neural Network Execution Time on Mobile and Embedded Devices
Deep neural networks show great potential as solutions to many sensing
application problems, but their excessive resource demand slows down execution
time, pausing a serious impediment to deployment on low-end devices. To address
this challenge, recent literature focused on compressing neural network size to
improve performance. We show that changing neural network size does not
proportionally affect performance attributes of interest, such as execution
time. Rather, extreme run-time nonlinearities exist over the network
configuration space. Hence, we propose a novel framework, called FastDeepIoT,
that uncovers the non-linear relation between neural network structure and
execution time, then exploits that understanding to find network configurations
that significantly improve the trade-off between execution time and accuracy on
mobile and embedded devices. FastDeepIoT makes two key contributions. First,
FastDeepIoT automatically learns an accurate and highly interpretable execution
time model for deep neural networks on the target device. This is done without
prior knowledge of either the hardware specifications or the detailed
implementation of the used deep learning library. Second, FastDeepIoT informs a
compression algorithm how to minimize execution time on the profiled device
without impacting accuracy. We evaluate FastDeepIoT using three different
sensing-related tasks on two mobile devices: Nexus 5 and Galaxy Nexus.
FastDeepIoT further reduces the neural network execution time by to
and energy consumption by to compared with the
state-of-the-art compression algorithms.Comment: Accepted by SenSys '1
A Survey of Techniques For Improving Energy Efficiency in Embedded Computing Systems
Recent technological advances have greatly improved the performance and
features of embedded systems. With the number of just mobile devices now
reaching nearly equal to the population of earth, embedded systems have truly
become ubiquitous. These trends, however, have also made the task of managing
their power consumption extremely challenging. In recent years, several
techniques have been proposed to address this issue. In this paper, we survey
the techniques for managing power consumption of embedded systems. We discuss
the need of power management and provide a classification of the techniques on
several important parameters to highlight their similarities and differences.
This paper is intended to help the researchers and application-developers in
gaining insights into the working of power management techniques and designing
even more efficient high-performance embedded systems of tomorrow
TANGO: Transparent heterogeneous hardware Architecture deployment for eNergy Gain in Operation
The paper is concerned with the issue of how software systems actually use
Heterogeneous Parallel Architectures (HPAs), with the goal of optimizing power
consumption on these resources. It argues the need for novel methods and tools
to support software developers aiming to optimise power consumption resulting
from designing, developing, deploying and running software on HPAs, while
maintaining other quality aspects of software to adequate and agreed levels. To
do so, a reference architecture to support energy efficiency at application
construction, deployment, and operation is discussed, as well as its
implementation and evaluation plans.Comment: Part of the Program Transformation for Programmability in
Heterogeneous Architectures (PROHA) workshop, Barcelona, Spain, 12th March
2016, 7 pages, LaTeX, 3 PNG figure
Powertrace: Network-level Power Profiling for Low-power Wireless Networks
Low-power wireless networks are quickly becoming a critical part of our everyday infrastructure. Power consumption is a critical concern, but power measurement and estimation is a challenge. We present Powertrace,
which to the best of our knowledge is the first system for network-level power profiling of low-power wireless systems. Powertrace uses power state tracking to estimate system power consumption and a structure called energy capsules to attribute energy consumption to activities such as packet transmissions and receptions. With Powertrace, the power consumption of a system can be broken down into individual activities which allows us to answer questions such as “How much energy is spent forwarding packets for node X?”, “How much energy
is spent on control traffic and how much on critical data?”, and “How much energy does application X account for?”. Experiments show that Powertrace is accurate to 94% of the energy consumption of a device. To
demonstrate the usefulness of Powertrace, we use it to experimentally analyze the power behavior of the proposed IETF standard IPv6 RPL routing protocol and a sensor network data collection protocol. Through using Powertrace, we find the highest power consumers and are
able to reduce the power consumption of data collection with 24%. It is our hope that Powertrace will help the community to make empirical energy evaluation a widely used tool in the low-power wireless research community toolbox
An initial performance review of software components for a heterogeneous computing platform
The design of embedded systems is a complex activity that involves a lot of
decisions. With high performance demands of present day usage scenarios and
software, they often involve energy hungry state-of-the-art computing units.
While focusing on power consumption of computing units, the physical properties
of software are often ignored. Recently, there has been a growing interest to
quantify and model the physical footprint of software (e.g. consumed power,
generated heat, execution time, etc.), and a component based approach
facilitates methods for describing such properties. Based on these, software
architects can make energy-efficient software design solutions. This paper
presents power consumption and execution time profiling of a component software
that can be allocated on heterogeneous computing units (CPU, GPU, FPGA) of a
tracked robot
Performance and Power Analysis of HPC Workloads on Heterogenous Multi-Node Clusters
Performance analysis tools allow application developers to identify and characterize the inefficiencies that cause performance degradation in their codes, allowing for application optimizations. Due to the increasing interest in the High Performance Computing (HPC) community towards energy-efficiency issues, it is of paramount importance to be able to correlate performance and power figures within the same profiling and analysis tools. For this reason, we present a performance and energy-efficiency study aimed at demonstrating how a single tool can be used to collect most of the relevant metrics. In particular, we show how the same analysis techniques can be applicable on different architectures, analyzing the same HPC application on a high-end and a low-power cluster. The former cluster embeds Intel Haswell CPUs and NVIDIA K80 GPUs, while the latter is made up of NVIDIA Jetson TX1 boards, each hosting an Arm Cortex-A57 CPU and an NVIDIA Tegra X1 Maxwell GPU.The research leading to these results has received funding from the European Community’s Seventh Framework Programme [FP7/2007-2013] and Horizon 2020 under the Mont-Blanc projects [17], grant agreements n. 288777, 610402 and 671697. E.C. was partially founded by “Contributo 5 per mille assegnato all’Università degli Studi di Ferrara-dichiarazione dei redditi dell’anno 2014”. We thank the University of Ferrara and INFN Ferrara for the access to the COKA Cluster. We warmly thank the BSC tools group, supporting us for the smooth integration and test of our setup within Extrae and Paraver.Peer ReviewedPostprint (published version
- …