351 research outputs found
Video Traffic Characteristics of Modern Encoding Standards: H.264/AVC with SVC and MVC Extensions and H.265/HEVC
abstract: Video encoding for multimedia services over communication networks has significantly advanced in recent years with the development of the highly efficient and flexible H.264/AVC video coding standard and its SVC extension. The emerging H.265/HEVC video coding standard as well as 3D video coding further advance video coding for multimedia communications. This paper first gives an overview of these new video coding standards and then examines their implications for multimedia communications by studying the traffic characteristics of long videos encoded with the new coding standards. We review video coding advances from MPEG-2 and MPEG-4 Part 2 to H.264/AVC and its SVC and MVC extensions as well as H.265/HEVC. For single-layer (nonscalable) video, we compare H.265/HEVC and H.264/AVC in terms of video traffic and statistical multiplexing characteristics. Our study is the first to examine the H.265/HEVC traffic variability for long videos. We also illustrate the video traffic characteristics and statistical multiplexing of scalable video encoded with the SVC extension of H.264/AVC as well as 3D video encoded with the MVC extension of H.264/AVC.View the article as published at https://www.hindawi.com/journals/tswj/2014/189481
Video Traffic Characteristics of Modern Encoding Standards: H.264/AVC with SVC and MVC Extensions and H.265/HEVC
Video encoding for multimedia services over communication networks has significantly advanced in recent years with the development of the highly efficient and flexible H.264/AVC video coding standard and its SVC extension. The emerging H.265/HEVC video coding standard as well as 3D video coding further advance video coding for multimedia communications. This paper first gives an overview of these new video coding standards and then examines their implications for multimedia communications by studying the traffic characteristics of long videos encoded with the new coding standards. We review video coding advances from MPEG-2 and MPEG-4 Part 2 to H.264/AVC and its SVC and MVC extensions as well as H.265/HEVC. For single-layer (nonscalable) video, we compare H.265/HEVC and H.264/AVC in terms of video traffic and statistical multiplexing characteristics. Our study is the first to examine the H.265/HEVC traffic variability for long videos. We also illustrate the video traffic characteristics and statistical multiplexing of scalable video encoded with the SVC extension of H.264/AVC as well as 3D video encoded with the MVC extension of H.264/AVC
Study of Compression Statistics and Prediction of Rate-Distortion Curves for Video Texture
Encoding textural content remains a challenge for current standardised video
codecs. It is therefore beneficial to understand video textures in terms of
both their spatio-temporal characteristics and their encoding statistics in
order to optimize encoding performance. In this paper, we analyse the
spatio-temporal features and statistics of video textures, explore the
rate-quality performance of different texture types and investigate models to
mathematically describe them. For all considered theoretical models, we employ
machine-learning regression to predict the rate-quality curves based solely on
selected spatio-temporal features extracted from uncompressed content. All
experiments were performed on homogeneous video textures to ensure validity of
the observations. The results of the regression indicate that using an
exponential model we can more accurately predict the expected rate-quality
curve (with a mean Bj{\o}ntegaard Delta rate of 0.46% over the considered
dataset) while maintaining a low relative complexity. This is expected to be
adopted by in the loop processes for faster encoding decisions such as
rate-distortion optimisation, adaptive quantization, partitioning, etc.Comment: 17 page
Adaptive delivery of immersive 3D multi-view video over the Internet
The increase in Internet bandwidth and the developments in 3D video technology have paved the way for the delivery of 3D Multi-View Video (MVV) over the Internet. However, large amounts of data and dynamic network conditions result in frequent network congestion, which may prevent video packets from being delivered on time. As a consequence, the 3D video experience may well be degraded unless content-aware precautionary mechanisms and adaptation methods are deployed. In this work, a novel adaptive MVV streaming method is introduced which addresses the future generation 3D immersive MVV experiences with multi-view displays. When the user experiences network congestion, making it necessary to perform adaptation, the rate-distortion optimum set of views that are pre-determined by the server, are truncated from the delivered MVV streams. In order to maintain high Quality of Experience (QoE) service during the frequent network congestion, the proposed method involves the calculation of low-overhead additional metadata that is delivered to the client. The proposed adaptive 3D MVV streaming solution is tested using the MPEG Dynamic Adaptive Streaming over HTTP (MPEG-DASH) standard. Both extensive objective and subjective evaluations are presented, showing that the proposed method provides significant quality enhancement under the adverse network conditions
VR-Caps: A Virtual Environment for Capsule Endoscopy
Current capsule endoscopes and next-generation robotic capsules for diagnosis
and treatment of gastrointestinal diseases are complex cyber-physical platforms
that must orchestrate complex software and hardware functions. The desired
tasks for these systems include visual localization, depth estimation, 3D
mapping, disease detection and segmentation, automated navigation, active
control, path realization and optional therapeutic modules such as targeted
drug delivery and biopsy sampling. Data-driven algorithms promise to enable
many advanced functionalities for capsule endoscopes, but real-world data is
challenging to obtain. Physically-realistic simulations providing synthetic
data have emerged as a solution to the development of data-driven algorithms.
In this work, we present a comprehensive simulation platform for capsule
endoscopy operations and introduce VR-Caps, a virtual active capsule
environment that simulates a range of normal and abnormal tissue conditions
(e.g., inflated, dry, wet etc.) and varied organ types, capsule endoscope
designs (e.g., mono, stereo, dual and 360{\deg}camera), and the type, number,
strength, and placement of internal and external magnetic sources that enable
active locomotion. VR-Caps makes it possible to both independently or jointly
develop, optimize, and test medical imaging and analysis software for the
current and next-generation endoscopic capsule systems. To validate this
approach, we train state-of-the-art deep neural networks to accomplish various
medical image analysis tasks using simulated data from VR-Caps and evaluate the
performance of these models on real medical data. Results demonstrate the
usefulness and effectiveness of the proposed virtual platform in developing
algorithms that quantify fractional coverage, camera trajectory, 3D map
reconstruction, and disease classification.Comment: 18 pages, 14 figure
Visibility computation through image generalization
This dissertation introduces the image generalization paradigm for computing visibility. The paradigm is based on the observation that an image is a powerful tool for computing visibility. An image can be rendered efficiently with the support of graphics hardware and each of the millions of pixels in the image reports a visible geometric primitive. However, the visibility solution computed by a conventional image is far from complete. A conventional image has a uniform sampling rate which can miss visible geometric primitives with a small screen footprint. A conventional image can only find geometric primitives to which there is direct line of sight from the center of projection (i.e. the eye) of the image; therefore, a conventional image cannot compute the set of geometric primitives that become visible as the viewpoint translates, or as time changes in a dynamic dataset. Finally, like any sample-based representation, a conventional image can only confirm that a geometric primitive is visible, but it cannot confirm that a geometric primitive is hidden, as that would require an infinite number of samples to confirm that the primitive is hidden at all of its points. ^ The image generalization paradigm overcomes the visibility computation limitations of conventional images. The paradigm has three elements. (1) Sampling pattern generalization entails adding sampling locations to the image plane where needed to find visible geometric primitives with a small footprint. (2) Visibility sample generalization entails replacing the conventional scalar visibility sample with a higher dimensional sample that records all geometric primitives visible at a sampling location as the viewpoint translates or as time changes in a dynamic dataset; the higher-dimensional visibility sample is computed exactly, by solving visibility event equations, and not through sampling. Another form of visibility sample generalization is to enhance a sample with its trajectory as the geometric primitive it samples moves in a dynamic dataset. (3) Ray geometry generalization redefines a camera ray as the set of 3D points that project at a given image location; this generalization supports rays that are not straight lines, and enables designing cameras with non-linear rays that circumvent occluders to gather samples not visible from a reference viewpoint. ^ The image generalization paradigm has been used to develop visibility algorithms for a variety of datasets, of visibility parameter domains, and of performance-accuracy tradeoff requirements. These include an aggressive from-point visibility algorithm that guarantees finding all geometric primitives with a visible fragment, no matter how small primitive\u27s image footprint, an efficient and robust exact from-point visibility algorithm that iterates between a sample-based and a continuous visibility analysis of the image plane to quickly converge to the exact solution, a from-rectangle visibility algorithm that uses 2D visibility samples to compute a visible set that is exact under viewpoint translation, a flexible pinhole camera that enables local modulations of the sampling rate over the image plane according to an input importance map, an animated depth image that not only stores color and depth per pixel but also a compact representation of pixel sample trajectories, and a curved ray camera that integrates seamlessly multiple viewpoints into a multiperspective image without the viewpoint transition distortion artifacts of prior art methods
- …