25,834 research outputs found

    FastDeepIoT: Towards Understanding and Optimizing Neural Network Execution Time on Mobile and Embedded Devices

    Full text link
    Deep neural networks show great potential as solutions to many sensing application problems, but their excessive resource demand slows down execution time, pausing a serious impediment to deployment on low-end devices. To address this challenge, recent literature focused on compressing neural network size to improve performance. We show that changing neural network size does not proportionally affect performance attributes of interest, such as execution time. Rather, extreme run-time nonlinearities exist over the network configuration space. Hence, we propose a novel framework, called FastDeepIoT, that uncovers the non-linear relation between neural network structure and execution time, then exploits that understanding to find network configurations that significantly improve the trade-off between execution time and accuracy on mobile and embedded devices. FastDeepIoT makes two key contributions. First, FastDeepIoT automatically learns an accurate and highly interpretable execution time model for deep neural networks on the target device. This is done without prior knowledge of either the hardware specifications or the detailed implementation of the used deep learning library. Second, FastDeepIoT informs a compression algorithm how to minimize execution time on the profiled device without impacting accuracy. We evaluate FastDeepIoT using three different sensing-related tasks on two mobile devices: Nexus 5 and Galaxy Nexus. FastDeepIoT further reduces the neural network execution time by 48%48\% to 78%78\% and energy consumption by 37%37\% to 69%69\% compared with the state-of-the-art compression algorithms.Comment: Accepted by SenSys '1

    A Survey of Techniques For Improving Energy Efficiency in Embedded Computing Systems

    Full text link
    Recent technological advances have greatly improved the performance and features of embedded systems. With the number of just mobile devices now reaching nearly equal to the population of earth, embedded systems have truly become ubiquitous. These trends, however, have also made the task of managing their power consumption extremely challenging. In recent years, several techniques have been proposed to address this issue. In this paper, we survey the techniques for managing power consumption of embedded systems. We discuss the need of power management and provide a classification of the techniques on several important parameters to highlight their similarities and differences. This paper is intended to help the researchers and application-developers in gaining insights into the working of power management techniques and designing even more efficient high-performance embedded systems of tomorrow

    TANGO: Transparent heterogeneous hardware Architecture deployment for eNergy Gain in Operation

    Get PDF
    The paper is concerned with the issue of how software systems actually use Heterogeneous Parallel Architectures (HPAs), with the goal of optimizing power consumption on these resources. It argues the need for novel methods and tools to support software developers aiming to optimise power consumption resulting from designing, developing, deploying and running software on HPAs, while maintaining other quality aspects of software to adequate and agreed levels. To do so, a reference architecture to support energy efficiency at application construction, deployment, and operation is discussed, as well as its implementation and evaluation plans.Comment: Part of the Program Transformation for Programmability in Heterogeneous Architectures (PROHA) workshop, Barcelona, Spain, 12th March 2016, 7 pages, LaTeX, 3 PNG figure

    Powertrace: Network-level Power Profiling for Low-power Wireless Networks

    Get PDF
    Low-power wireless networks are quickly becoming a critical part of our everyday infrastructure. Power consumption is a critical concern, but power measurement and estimation is a challenge. We present Powertrace, which to the best of our knowledge is the first system for network-level power profiling of low-power wireless systems. Powertrace uses power state tracking to estimate system power consumption and a structure called energy capsules to attribute energy consumption to activities such as packet transmissions and receptions. With Powertrace, the power consumption of a system can be broken down into individual activities which allows us to answer questions such as “How much energy is spent forwarding packets for node X?”, “How much energy is spent on control traffic and how much on critical data?”, and “How much energy does application X account for?”. Experiments show that Powertrace is accurate to 94% of the energy consumption of a device. To demonstrate the usefulness of Powertrace, we use it to experimentally analyze the power behavior of the proposed IETF standard IPv6 RPL routing protocol and a sensor network data collection protocol. Through using Powertrace, we find the highest power consumers and are able to reduce the power consumption of data collection with 24%. It is our hope that Powertrace will help the community to make empirical energy evaluation a widely used tool in the low-power wireless research community toolbox

    An initial performance review of software components for a heterogeneous computing platform

    Full text link
    The design of embedded systems is a complex activity that involves a lot of decisions. With high performance demands of present day usage scenarios and software, they often involve energy hungry state-of-the-art computing units. While focusing on power consumption of computing units, the physical properties of software are often ignored. Recently, there has been a growing interest to quantify and model the physical footprint of software (e.g. consumed power, generated heat, execution time, etc.), and a component based approach facilitates methods for describing such properties. Based on these, software architects can make energy-efficient software design solutions. This paper presents power consumption and execution time profiling of a component software that can be allocated on heterogeneous computing units (CPU, GPU, FPGA) of a tracked robot

    Performance and Power Analysis of HPC Workloads on Heterogenous Multi-Node Clusters

    Get PDF
    Performance analysis tools allow application developers to identify and characterize the inefficiencies that cause performance degradation in their codes, allowing for application optimizations. Due to the increasing interest in the High Performance Computing (HPC) community towards energy-efficiency issues, it is of paramount importance to be able to correlate performance and power figures within the same profiling and analysis tools. For this reason, we present a performance and energy-efficiency study aimed at demonstrating how a single tool can be used to collect most of the relevant metrics. In particular, we show how the same analysis techniques can be applicable on different architectures, analyzing the same HPC application on a high-end and a low-power cluster. The former cluster embeds Intel Haswell CPUs and NVIDIA K80 GPUs, while the latter is made up of NVIDIA Jetson TX1 boards, each hosting an Arm Cortex-A57 CPU and an NVIDIA Tegra X1 Maxwell GPU.The research leading to these results has received funding from the European Community’s Seventh Framework Programme [FP7/2007-2013] and Horizon 2020 under the Mont-Blanc projects [17], grant agreements n. 288777, 610402 and 671697. E.C. was partially founded by “Contributo 5 per mille assegnato all’Università degli Studi di Ferrara-dichiarazione dei redditi dell’anno 2014”. We thank the University of Ferrara and INFN Ferrara for the access to the COKA Cluster. We warmly thank the BSC tools group, supporting us for the smooth integration and test of our setup within Extrae and Paraver.Peer ReviewedPostprint (published version
    • …
    corecore