    A pixel-based complexity model to estimate energy consumption in video decoders

    The increasing use of HEVC video streams in diverse multimedia applications is driving the need for greater user control and management of energy consumption in battery-powered devices. This paper addresses the lack of adequate solutions by proposing a pixel-based complexity model that is capable of estimating the energy consumption of an arbitrary software-based HEVC decoder running on different hardware platforms and devices. In the proposed model, the computational complexity is defined as a linear function of the number of pixels processed by the main decoding functions, using weighting coefficients that represent the average computational effort each decoding function requires per pixel. The results show that the cross-correlation of frame-based complexity estimation with energy consumption is greater than 0.86. The energy consumption of video decoding is estimated with the proposed model within an average deviation of about 6.9% for different test sequences.
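
    The linear model described in this abstract can be sketched as follows. This is a minimal illustration only: the decoding-function names, the weighting coefficients, and the pixel counts are hypothetical placeholders, not values from the paper, and the mapping from complexity to energy is not shown.

```python
from typing import Mapping


def estimate_complexity(pixels: Mapping[str, int],
                        weights: Mapping[str, float]) -> float:
    """Return the sum over decoding functions of weight * pixels processed."""
    return sum(weights[name] * count for name, count in pixels.items())


# Hypothetical per-pixel weights (average effort per pixel for each function).
WEIGHTS = {
    "intra_prediction": 2.0,
    "inter_prediction": 3.0,
    "inverse_transform": 1.5,
    "deblocking": 0.5,
}

# Hypothetical pixel counts processed by each function for one frame.
frame_pixels = {
    "intra_prediction": 1000,
    "inter_prediction": 4000,
    "inverse_transform": 5000,
    "deblocking": 5000,
}

frame_complexity = estimate_complexity(frame_pixels, WEIGHTS)
print(frame_complexity)
```

    In practice the weights would have to be calibrated per decoder and platform; the sketch only shows the shape of the weighted sum.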

    FastDepth: Fast Monocular Depth Estimation on Embedded Systems

    Depth sensing is a critical function for robotic tasks such as localization, mapping and obstacle detection. There has been a significant and growing interest in depth estimation from a single RGB image, due to the relatively low cost and size of monocular cameras. However, state-of-the-art single-view depth estimation algorithms are based on fairly complex deep neural networks that are too slow for real-time inference on an embedded platform, for instance, mounted on a micro aerial vehicle. In this paper, we address the problem of fast depth estimation on embedded systems. We propose an efficient and lightweight encoder-decoder network architecture and apply network pruning to further reduce computational complexity and latency. In particular, we focus on the design of a low-latency decoder. Our methodology demonstrates that it is possible to achieve accuracy similar to prior work on depth estimation, but at inference speeds that are an order of magnitude faster. Our proposed network, FastDepth, runs at 178 fps on an NVIDIA Jetson TX2 GPU and at 27 fps when using only the TX2 CPU, with active power consumption under 10 W. FastDepth achieves close to state-of-the-art accuracy on the NYU Depth v2 dataset. To the best of the authors' knowledge, this paper demonstrates real-time monocular depth estimation using a deep neural network with the lowest latency and highest throughput on an embedded platform that can be carried by a micro aerial vehicle. Comment: Accepted for presentation at ICRA 2019. 8 pages, 6 figures, 7 tables.

    Code improvements towards implementing HEVC decoder

    Compressed Sensing based Low-Power Multi-View Video Coding and Transmission in Wireless Multi-Path Multi-Hop Networks

    Wireless Multimedia Sensor Networks (WMSNs) are increasingly being deployed for surveillance, monitoring and Internet-of-Things (IoT) sensing applications where a set of cameras capture and compress local images and then transmit the data to a remote controller. Such captured local images may also be compressed in a multi-view fashion to reduce the redundancy among overlapping views. In this paper, we present a novel paradigm for compressed-sensing-enabled multi-view coding and streaming in WMSNs. We first propose a new encoding and decoding architecture for multi-view video systems based on Compressed Sensing (CS) principles, composed of cooperative sparsity-aware block-level rate-adaptive encoders, feedback channels and independent decoders. The proposed architecture leverages the properties of CS to overcome many limitations of traditional encoding techniques, specifically massive storage requirements and high computational complexity. Then, we present a modeling framework that exploits the aforementioned coding architecture. The proposed mathematical problem minimizes the power consumption by jointly determining the encoding rate and multi-path rate allocation subject to distortion and energy constraints. Extensive performance evaluation results show that the proposed framework is able to transmit multi-view streams with guaranteed video quality at lower power consumption.

    Distributed Video Coding for Multiview and Video-plus-depth Coding

    Event transformer FlowNet for optical flow estimation

    Event cameras are bioinspired sensors that produce asynchronous and sparse streams of events at image locations where intensity change is detected. They can detect fast motion with low latency, high dynamic range, and low power consumption. Over the past decade, efforts have been made to develop solutions with event cameras for robotics applications. In this work, we address their use for fast and robust computation of optical flow. We present ET-FlowNet, a hybrid RNN-ViT architecture for optical flow estimation. Visual transformers (ViTs) are ideal candidates for the learning of global context in visual tasks, and we argue that rigid body motion is a prime case for the use of ViTs since long-range dependencies in the image hold during rigid body motion. We perform end-to-end training with a self-supervised learning method. Our results show performance comparable to, and in some cases exceeding, state-of-the-art coarse-to-fine event-based optical flow estimation. This work was supported by projects EBSLAM DPI2017-89564-P and EBCON PID2020-119244GB-I00 funded by CIN/AEI/10.13039/501100011033 and by an FI AGAUR PhD grant to Yi Tian.

    A Decoding-Complexity and Rate-Controlled Video-Coding Algorithm for HEVC

    Video playback on mobile consumer electronic (CE) devices is plagued by fluctuations in the network bandwidth and by limitations in processing and energy availability at the individual devices. Seen as a potential solution, the state-of-the-art adaptive streaming mechanisms address the first aspect, yet the efficient control of the decoding-complexity and the energy use when decoding the video remains unaddressed. The end-users’ quality of experience (QoE), however, depends on the capability to adapt the bit streams to both these constraints (i.e., network bandwidth and device’s energy availability). As a solution, this paper proposes an encoding framework that is capable of generating video bit streams with arbitrary bit rates and decoding-complexity levels using a decoding-complexity–rate–distortion model. The proposed algorithm allocates rate and decoding-complexity levels across frames and coding tree units (CTUs) and adaptively derives the CTU-level coding parameters to achieve their imposed targets with minimal distortion. The experimental results reveal that the proposed algorithm can achieve the target bit rate and the decoding-complexity with 0.4% and 1.78% average errors, respectively, for multiple bit rate and decoding-complexity levels. The proposed algorithm also demonstrates a stable frame-wise rate and decoding-complexity control capability when achieving a decoding-complexity reduction of 10.11 (%/dB). The resultant decoding-complexity reduction translates into an overall energy-consumption reduction of up to 10.52 (%/dB) for a 1 dB peak signal-to-noise ratio (PSNR) quality loss compared to the HM 16.0 encoded bit streams.

    Green compressive sampling reconstruction in IoT networks

    In this paper, we address the problem of green Compressed Sensing (CS) reconstruction within Internet of Things (IoT) networks, both in terms of computing architecture and reconstruction algorithms. The approach is novel since, unlike most of the literature dealing with energy-efficient gathering of the CS measurements, we focus on the energy efficiency of the signal reconstruction stage given the CS measurements. As a first novel contribution, we present an analysis of the energy consumption within the IoT network under two computing architectures. In the first one, reconstruction takes place within the IoT network and the reconstructed data are encoded and transmitted out of the IoT network; in the second one, all the CS measurements are forwarded to off-network devices for reconstruction and storage, i.e., reconstruction is off-loaded. Our analysis shows that the two architectures differ significantly in terms of consumed energy, and it outlines a theoretically motivated criterion to select a green CS reconstruction computing architecture. Specifically, we present a suitable decision function to determine which architecture outperforms the other in terms of energy efficiency. The presented decision function depends on a few IoT network features, such as the network size, the sink connectivity, and other system parameters. As a second novel contribution, we show how to go beyond the classical performance comparison of different CS reconstruction algorithms, which is usually carried out w.r.t. the achieved accuracy. Specifically, we consider the consumed energy and analyze the energy vs. accuracy trade-off. The approach presented herein, jointly considering signal processing and IoT network issues, is a relevant contribution for designing green compressive sampling architectures in IoT networks.