
    Video Classification With CNNs: Using The Codec As A Spatio-Temporal Activity Sensor

    We investigate video classification via a two-stream convolutional neural network (CNN) design that directly ingests information extracted from compressed video bitstreams. Our approach begins with the observation that all modern video codecs divide the input frames into macroblocks (MBs). We demonstrate that selective access to MB motion vector (MV) information within compressed video bitstreams can also provide for selective, motion-adaptive, MB pixel decoding (a.k.a. MB texture decoding). This in turn allows for the derivation of spatio-temporal video activity regions at extremely high speed in comparison to conventional full-frame decoding followed by optical flow estimation. In order to evaluate the accuracy of a video classification framework based on such activity data, we independently train two CNN architectures on MB texture and MV correspondences and then fuse their scores to derive the final classification of each test video. Evaluation on two standard datasets shows that the proposed approach is competitive with the best two-stream video classification approaches found in the literature. At the same time: (i) a CPU-based realization of our MV extraction is over 977 times faster than GPU-based optical flow methods; (ii) selective decoding is up to 12 times faster than full-frame decoding; (iii) our proposed spatial and temporal CNNs perform inference at 5 to 49 times lower cloud computing cost than the fastest methods from the literature. Comment: Accepted in IEEE Transactions on Circuits and Systems for Video Technology. Extension of ICIP 2017 conference paper.

    Distributed video coding for wireless video sensor networks: a review of the state-of-the-art architectures

    Distributed video coding (DVC) is a relatively new video coding architecture originating from two fundamental theorems, namely Slepian–Wolf and Wyner–Ziv. Recent research developments have made DVC attractive for applications in the emerging domain of wireless video sensor networks (WVSNs). This paper reviews the state-of-the-art DVC architectures with a focus on understanding their opportunities and gaps in addressing the operational requirements and application needs of WVSNs.

    Source-Channel Diversity for Parallel Channels

    We consider transmitting a source across a pair of independent, non-ergodic channels with random states (e.g., slow fading channels) so as to minimize the average distortion. The general problem is unsolved. Hence, we focus on comparing two commonly used source and channel encoding systems which correspond to exploiting diversity either at the physical layer through parallel channel coding or at the application layer through multiple description source coding. For on-off channel models, source coding diversity offers better performance. For channels with a continuous range of reception quality, we show the reverse is true. Specifically, we introduce a new figure of merit called the distortion exponent which measures how fast the average distortion decays with SNR. For continuous-state models such as additive white Gaussian noise channels with multiplicative Rayleigh fading, optimal channel coding diversity at the physical layer is more efficient than source coding diversity at the application layer in that the former achieves a better distortion exponent. Finally, we consider a third decoding architecture: multiple description encoding with joint source-channel decoding. We show that this architecture achieves the same distortion exponent as systems with optimal channel coding diversity for continuous-state channels, and maintains the advantages of multiple description systems for on-off channels. Thus, the multiple description system with joint decoding achieves the best performance, from among the three architectures considered, on both continuous-state and on-off channels. Comment: 48 pages, 14 figures.
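    As a reading aid, the distortion exponent mentioned in this abstract is commonly formalized as the high-SNR slope of the expected distortion on a log-log scale. The notation below ($\Delta$ for the exponent, $\mathbb{E}[D]$ for the average distortion) is an assumption for illustration and is not taken from the abstract itself:

```latex
% Distortion exponent: rate of decay of the average distortion with SNR
\Delta \;=\; -\lim_{\mathrm{SNR}\to\infty}
      \frac{\log \mathbb{E}[D]}{\log \mathrm{SNR}}
% Equivalently, at high SNR the average distortion behaves like
% \mathbb{E}[D] \approx \mathrm{SNR}^{-\Delta}, so a larger \Delta means faster decay.
```

    Under this reading, "a better distortion exponent" means a larger $\Delta$, i.e. average distortion that falls off more steeply as SNR grows.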

    Random Linear Network Coding for 5G Mobile Video Delivery

    An exponential increase in mobile video delivery will continue with the demand for higher resolution, multi-view and large-scale multicast video services. The novel fifth generation (5G) 3GPP New Radio (NR) standard will bring a number of new opportunities for optimizing video delivery across both 5G core and radio access networks. One of the promising approaches for video quality adaptation, throughput enhancement and erasure protection is the use of packet-level random linear network coding (RLNC). In this review paper, we discuss the integration of RLNC into the 5G NR standard, building upon the ideas and opportunities identified in 4G LTE. We explicitly identify and discuss in detail novel 5G NR features that provide support for RLNC-based video delivery in 5G, thus pointing to promising avenues for future research. Comment: Invited paper for Special Issue "Network and Rateless Coding for Video Streaming" - MDPI Information.

    Reliable Video Streaming over mmWave with Multi Connectivity and Network Coding

    The next generation of multimedia applications will require the telecommunication networks to support a higher bitrate than today, in order to deliver virtual reality and ultra-high quality video content to the users. Most of the video content will be accessed from mobile devices, prompting the provision of very high data rates by next generation (5G) cellular networks. A possible enabler in this regard is communication at mmWave frequencies, given the vast amount of available spectrum that can be allocated to mobile users; however, the harsh propagation environment at such high frequencies makes it hard to provide a reliable service. This paper presents a reliable video streaming architecture for mmWave networks, based on multi connectivity and network coding, and evaluates its performance using a novel combination of the ns-3 mmWave module, real video traces and the network coding library Kodo. The results show that it is indeed possible to reliably stream video over cellular mmWave links, while the combination of multi connectivity and network coding can support high video quality with low latency. Comment: To be presented at the 2018 IEEE International Conference on Computing, Networking and Communications (ICNC), March 2018, Maui, Hawaii, USA (invited paper). 6 pages, 4 figures.

    Transport of video over partial order connections

    A Partial Order and partial reliable Connection (POC) is an end-to-end transport connection authorized to deliver objects in an order that can differ from the transmitted one. Such a connection is also authorized to lose some objects. The POC concept is motivated by the fact that heterogeneous best-effort networks such as the Internet are plagued by unordered delivery of packets and losses, which tax the performance of current applications and protocols. It has been shown, in several research works, that out-of-order delivery is able to alleviate (with respect to CO service) the use of end systems’ communication resources. In this paper, the efficiency of out-of-sequence delivery for MPEG video stream processing is studied. Firstly, the transport constraints (in terms of order and reliability) that can be relaxed by MPEG video decoders, for improving video transport, are detailed. Then, we analyze the performance gain induced by this approach in terms of blocking times and recovered errors. We demonstrate that POC connections not only fill the conceptual gap between TCP and UDP but also provide real performance improvements for the transport of multimedia streams such as MPEG video.