Video Classification With CNNs: Using The Codec As A Spatio-Temporal Activity Sensor
We investigate video classification via a two-stream convolutional neural
network (CNN) design that directly ingests information extracted from
compressed video bitstreams. Our approach begins with the observation that all
modern video codecs divide the input frames into macroblocks (MBs). We
demonstrate that selective access to MB motion vector (MV) information within
compressed video bitstreams can also provide for selective, motion-adaptive, MB
pixel decoding (a.k.a., MB texture decoding). This in turn allows for the
derivation of spatio-temporal video activity regions at extremely high speed in
comparison to conventional full-frame decoding followed by optical flow
estimation. In order to evaluate the accuracy of a video classification
framework based on such activity data, we independently train two CNN
architectures on MB texture and MV correspondences and then fuse their scores
to derive the final classification of each test video. Evaluation on two
standard datasets shows that the proposed approach is competitive with the best
two-stream video classification approaches found in the literature. At the same
time: (i) a CPU-based realization of our MV extraction is over 977 times faster
than GPU-based optical flow methods; (ii) selective decoding is up to 12 times
faster than full-frame decoding; (iii) our proposed spatial and temporal CNNs
perform inference at 5 to 49 times lower cloud computing cost than the fastest
methods from the literature.
Comment: Accepted in IEEE Transactions on Circuits and Systems for Video
Technology. Extension of ICIP 2017 conference paper
Distributed video coding for wireless video sensor networks: a review of the state-of-the-art architectures
Distributed video coding (DVC) is a relatively new video coding architecture that originates from two fundamental theorems, namely Slepian–Wolf and Wyner–Ziv. Recent research developments have made DVC attractive for applications in the emerging domain of wireless video sensor networks (WVSNs). This paper reviews the state-of-the-art DVC architectures with a focus on understanding their opportunities and gaps in addressing the operational requirements and application needs of WVSNs.
Source-Channel Diversity for Parallel Channels
We consider transmitting a source across a pair of independent, non-ergodic
channels with random states (e.g., slow fading channels) so as to minimize the
average distortion. The general problem is unsolved. Hence, we focus on
comparing two commonly used source and channel encoding systems which
correspond to exploiting diversity either at the physical layer through
parallel channel coding or at the application layer through multiple
description source coding.
For on-off channel models, source coding diversity offers better performance.
For channels with a continuous range of reception quality, we show the reverse
is true. Specifically, we introduce a new figure of merit called the distortion
exponent which measures how fast the average distortion decays with SNR. For
continuous-state models such as additive white Gaussian noise channels with
multiplicative Rayleigh fading, optimal channel coding diversity at the
physical layer is more efficient than source coding diversity at the
application layer in that the former achieves a better distortion exponent.
Finally, we consider a third decoding architecture: multiple description
encoding with a joint source-channel decoding. We show that this architecture
achieves the same distortion exponent as systems with optimal channel coding
diversity for continuous-state channels, and maintains the advantages of
multiple description systems for on-off channels. Thus, the multiple
description system with joint decoding achieves the best performance, from
among the three architectures considered, on both continuous-state and on-off
channels.
Comment: 48 pages, 14 figures
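As a reading aid for the abstract above, the distortion exponent it describes (the rate at which average distortion decays with SNR) is commonly written as the limit below. The symbols are assumptions introduced here for illustration and are not taken from the listing: \Delta denotes the exponent, D the end-to-end distortion, and the expectation is over the random channel states.

```latex
\Delta \;=\; -\lim_{\mathrm{SNR}\to\infty}
  \frac{\log \mathbb{E}[D]}{\log \mathrm{SNR}}
```

Under this form, a larger \Delta means the average distortion falls off faster as SNR grows, which is the sense in which one architecture "achieves a better distortion exponent" than another.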
Random Linear Network Coding for 5G Mobile Video Delivery
An exponential increase in mobile video delivery will continue with the
demand for higher resolution, multi-view and large-scale multicast video
services. The novel fifth generation (5G) 3GPP New Radio (NR) standard will bring a
number of new opportunities for optimizing video delivery across both 5G core
and radio access networks. One of the promising approaches for video quality
adaptation, throughput enhancement and erasure protection is the use of
packet-level random linear network coding (RLNC). In this review paper, we
discuss the integration of RLNC into the 5G NR standard, building upon the
ideas and opportunities identified in 4G LTE. We explicitly identify and
discuss in detail novel 5G NR features that provide support for RLNC-based
video delivery in 5G, thus pointing to promising avenues for future
research.
Comment: Invited paper for Special Issue "Network and Rateless Coding for
Video Streaming" - MDPI Information
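To make the packet-level RLNC idea in the abstract above concrete, here is a minimal sketch, assuming the smallest possible field GF(2), where combining packets reduces to XOR. It is not the scheme from the paper or from any 5G specification; practical systems typically use larger fields such as GF(2^8). The function names `rlnc_encode` and `rlnc_decode` are hypothetical, and all source packets are assumed to have equal length.

```python
import random


def rlnc_encode(packets, n_coded, rng):
    """Produce n_coded random GF(2) combinations of equal-length source packets.

    Each coded packet is a (coeffs, payload) pair, where coeffs[i] == 1 means
    source packet i was XORed into the payload.
    """
    k = len(packets)
    size = len(packets[0])
    coded = []
    for _ in range(n_coded):
        coeffs = [rng.randint(0, 1) for _ in range(k)]
        payload = bytearray(size)
        for c, p in zip(coeffs, packets):
            if c:
                for j in range(size):
                    payload[j] ^= p[j]
        coded.append((coeffs, bytes(payload)))
    return coded


def rlnc_decode(coded, k):
    """Recover the k source packets by Gauss-Jordan elimination over GF(2).

    Returns None when the received combinations do not have full rank.
    """
    rows = [(list(c), bytearray(p)) for c, p in coded]
    for col in range(k):
        # Find a row at or below position col with a 1 in this column.
        pivot = next((r for r in range(col, len(rows)) if rows[r][0][col]), None)
        if pivot is None:
            return None  # rank deficient: more coded packets are needed
        rows[col], rows[pivot] = rows[pivot], rows[col]
        p_coeffs, p_payload = rows[col]
        # Clear this column in every other row.
        for r in range(len(rows)):
            if r != col and rows[r][0][col]:
                r_coeffs, r_payload = rows[r]
                for j in range(k):
                    r_coeffs[j] ^= p_coeffs[j]
                for j in range(len(r_payload)):
                    r_payload[j] ^= p_payload[j]
    return [bytes(rows[i][1]) for i in range(k)]
```

A receiver can call `rlnc_decode` on whatever coded packets survive the channel. If it returns None, the receiver waits for further coded packets, which is what gives the approach its erasure protection: any set of received combinations with full rank is sufficient, regardless of which particular packets were lost.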
Reliable Video Streaming over mmWave with Multi Connectivity and Network Coding
The next generation of multimedia applications will require the
telecommunication networks to support a higher bitrate than today, in order to
deliver virtual reality and ultra-high quality video content to the users. Most
of the video content will be accessed from mobile devices, prompting the
provision of very high data rates by next generation (5G) cellular networks. A
possible enabler in this regard is communication at mmWave frequencies, given
the vast amount of available spectrum that can be allocated to mobile users;
however, the harsh propagation environment at such high frequencies makes it
hard to provide a reliable service. This paper presents a reliable video
streaming architecture for mmWave networks, based on multi connectivity and
network coding, and evaluates its performance using a novel combination of the
ns-3 mmWave module, real video traces and the network coding library Kodo. The
results show that it is indeed possible to reliably stream video over cellular
mmWave links, while the combination of multi connectivity and network coding
can support high video quality with low latency.
Comment: To be presented at the 2018 IEEE International Conference on
Computing, Networking and Communications (ICNC), March 2018, Maui, Hawaii,
USA (invited paper). 6 pages, 4 figures
Transport of video over partial order connections
A Partial Order and partial reliable Connection (POC) is an end-to-end transport connection authorized to deliver objects in an order that can differ from the transmitted one. Such a connection is also authorized to lose some objects. The POC concept is motivated by the fact that heterogeneous best-effort networks such as the Internet are plagued by unordered delivery of packets and by losses, which tax the performance of current applications and protocols. Several research works have shown that out-of-order delivery can reduce (with respect to CO service) the use of end systems’ communication resources. In this paper, the efficiency of out-of-sequence delivery for the processing of MPEG video streams is studied. First, the transport constraints (in terms of order and reliability) that MPEG video decoders can relax to improve video transport are detailed. Then, we analyze the performance gain induced by this approach in terms of blocking times and recovered errors. We demonstrate that POC connections not only fill the conceptual gap between TCP and UDP but also provide real performance improvements for the transport of multimedia streams such as MPEG video.