75 research outputs found

    FastDeepIoT: Towards Understanding and Optimizing Neural Network Execution Time on Mobile and Embedded Devices

    Full text link
    Deep neural networks show great potential as solutions to many sensing application problems, but their excessive resource demand slows down execution time, pausing a serious impediment to deployment on low-end devices. To address this challenge, recent literature focused on compressing neural network size to improve performance. We show that changing neural network size does not proportionally affect performance attributes of interest, such as execution time. Rather, extreme run-time nonlinearities exist over the network configuration space. Hence, we propose a novel framework, called FastDeepIoT, that uncovers the non-linear relation between neural network structure and execution time, then exploits that understanding to find network configurations that significantly improve the trade-off between execution time and accuracy on mobile and embedded devices. FastDeepIoT makes two key contributions. First, FastDeepIoT automatically learns an accurate and highly interpretable execution time model for deep neural networks on the target device. This is done without prior knowledge of either the hardware specifications or the detailed implementation of the used deep learning library. Second, FastDeepIoT informs a compression algorithm how to minimize execution time on the profiled device without impacting accuracy. We evaluate FastDeepIoT using three different sensing-related tasks on two mobile devices: Nexus 5 and Galaxy Nexus. FastDeepIoT further reduces the neural network execution time by 48%48\% to 78%78\% and energy consumption by 37%37\% to 69%69\% compared with the state-of-the-art compression algorithms.Comment: Accepted by SenSys '1

    A Generalized Packing Server for Scheduling Task Graphs on Multiple Resources

    Get PDF
    This paper presents the generalized packing server. It reduces the problem of scheduling tasks with precedence constraints on multiple processing units to the problem of scheduling independent tasks. The work generalizes our previous contribution made in the specific context of scheduling Map/Reduce workflows. The results apply to the generalized parallel task model, introduced in recent literature to denote tasks described by workflow graphs, where some subtasks may be executed in parallel subject to precedence constraints. Recent literature developed schedulability bounds for the generalized parallel tasks on multiprocessors. The generalized packing server, described in this paper, is a run-time mechanism that packs tasks into server budgets (in a manner that respects precedence constraints) allowing the budgets to be viewed as independent tasks by the underlying scheduler. Consequently, any schedulability results derived for the independent task model on multiprocessors become applicable to generalized parallel tasks. The catch is that the sum of capacities of server budgets exceeds by a certain ratio the sum of execution times of the original generalized parallel tasks. Hence, a scaling factor is derived that converts bounds for independent tasks into corresponding bounds for generalized parallel tasks. The factor applies to any work-conserving scheduling policy in both the global and partitioned multiprocessor scheduling models. We show that the new schedulability bounds obtained for the generalized parallel task model, using the aforementioned conversion, improve in several cases upon the best known bounds in current literature. Hence, the packing server is shown to improve the schedulability of generalized parallel tasks. Evaluation results confirm this observation.Ope

    Unified Off-Policy Learning to Rank: a Reinforcement Learning Perspective

    Full text link
    Off-policy Learning to Rank (LTR) aims to optimize a ranker from data collected by a deployed logging policy. However, existing off-policy learning to rank methods often make strong assumptions about how users generate the click data, i.e., the click model, and hence need to tailor their methods specifically under different click models. In this paper, we unified the ranking process under general stochastic click models as a Markov Decision Process (MDP), and the optimal ranking could be learned with offline reinforcement learning (RL) directly. Building upon this, we leverage offline RL techniques for off-policy LTR and propose the Click Model-Agnostic Unified Off-policy Learning to Rank (CUOLR) method, which could be easily applied to a wide range of click models. Through a dedicated formulation of the MDP, we show that offline RL algorithms can adapt to various click models without complex debiasing techniques and prior knowledge of the model. Results on various large-scale datasets demonstrate that CUOLR consistently outperforms the state-of-the-art off-policy learning to rank algorithms while maintaining consistency and robustness under different click models

    Atmospheric Deposition of Inorganic Elements and Organic Compounds at the Inlets of the Venice Lagoon

    Get PDF
    The Venice Lagoon is subjected to long-range transport of contaminants via aerosol from the near Po Valley. Moreover, it is an area with significant local anthropogenic emissions due to the industrial area of Porto Marghera, the urban centres, and the glass factories and with emissions by ships traffic within the Lagoon. Furthermore, since 2005, the Lagoon has also been affected by the construction of the MOSE (Modulo Sperimentale Elettromeccanico—Electromechanical Experimental Module) mobile dams, as a barrier against the high tide. This work presents and discusses the results from chemical analyses of bulk depositions, carried out in different sites of the Venice Lagoon. Fluxes of pollutants were also statistically analysed on PCA with the aim of investigating the spatial variability of depositions and their correlation with precipitations. Fluxes of inorganic pollutants depend differently on precipitations, while organic compounds show a more seasonal trend. The statistical analysis showed that the site in the northern Lagoon has lower and almost homogeneous fluxes of pollutants, while the other sites registered more variable concentrations. The study also provided important information about the annual trend of pollutants and their evolution over a period of about five years, from 2005 to 2010

    STFNets: Learning Sensing Signals from the Time-Frequency Perspective with Short-Time Fourier Neural Networks

    Full text link
    Recent advances in deep learning motivate the use of deep neural networks in Internet-of-Things (IoT) applications. These networks are modelled after signal processing in the human brain, thereby leading to significant advantages at perceptual tasks such as vision and speech recognition. IoT applications, however, often measure physical phenomena, where the underlying physics (such as inertia, wireless signal propagation, or the natural frequency of oscillation) are fundamentally a function of signal frequencies, offering better features in the frequency domain. This observation leads to a fundamental question: For IoT applications, can one develop a new brand of neural network structures that synthesize features inspired not only by the biology of human perception but also by the fundamental nature of physics? Hence, in this paper, instead of using conventional building blocks (e.g., convolutional and recurrent layers), we propose a new foundational neural network building block, the Short-Time Fourier Neural Network (STFNet). It integrates a widely-used time-frequency analysis method, the Short-Time Fourier Transform, into data processing to learn features directly in the frequency domain, where the physics of underlying phenomena leave better foot-prints. STFNets bring additional flexibility to time-frequency analysis by offering novel nonlinear learnable operations that are spectral-compatible. Moreover, STFNets show that transforming signals to a domain that is more connected to the underlying physics greatly simplifies the learning process. We demonstrate the effectiveness of STFNets with extensive experiments. STFNets significantly outperform the state-of-the-art deep learning models in all experiments. A STFNet, therefore, demonstrates superior capability as the fundamental building block of deep neural networks for IoT applications for various sensor inputs
    corecore