662 research outputs found
Convex Optimization Based Bit Allocation for Light Field Compression under Weighting and Consistency Constraints
Compared with conventional image and video, light field images introduce the
weight channel, as well as the visual consistency of rendered view, information
that has to be taken into account when compressing the pseudo-temporal-sequence
(PTS) created from light field images. In this paper, we propose a novel frame
level bit allocation framework for PTS coding. A joint model that measures
weighted distortion and visual consistency, combined with an iterative encoding
system, yields the optimal bit allocation for each frame by solving a convex
optimization problem. Experimental results show that the proposed framework is
effective in producing desired distortion distribution based on weights, and
achieves up to 24.7% BD-rate reduction comparing to the default rate control
algorithm.Comment: published in IEEE Data Compression Conference, 201
Rate-Accuracy Trade-Off In Video Classification With Deep Convolutional Neural Networks
Advanced video classification systems decode video frames to derive the
necessary texture and motion representations for ingestion and analysis by
spatio-temporal deep convolutional neural networks (CNNs). However, when
considering visual Internet-of-Things applications, surveillance systems and
semantic crawlers of large video repositories, the video capture and the
CNN-based semantic analysis parts do not tend to be co-located. This
necessitates the transport of compressed video over networks and incurs
significant overhead in bandwidth and energy consumption, thereby significantly
undermining the deployment potential of such systems. In this paper, we
investigate the trade-off between the encoding bitrate and the achievable
accuracy of CNN-based video classification models that directly ingest
AVC/H.264 and HEVC encoded videos. Instead of retaining entire compressed video
bitstreams and applying complex optical flow calculations prior to CNN
processing, we only retain motion vector and select texture information at
significantly-reduced bitrates and apply no additional processing prior to CNN
ingestion. Based on three CNN architectures and two action recognition
datasets, we achieve 11%-94% saving in bitrate with marginal effect on
classification accuracy. A model-based selection between multiple CNNs
increases these savings further, to the point where, if up to 7% loss of
accuracy can be tolerated, video classification can take place with as little
as 3 kbps for the transport of the required compressed video information to the
system implementing the CNN models
Dynamically Reconfigurable Architectures and Systems for Time-varying Image Constraints (DRASTIC) for Image and Video Compression
In the current information booming era, image and video consumption is ubiquitous. The associated image and video coding operations require significant computing resources for both small-scale computing systems as well as over larger network systems. For different scenarios, power, bitrate and image quality can impose significant time-varying constraints. For example, mobile devices (e.g., phones, tablets, laptops, UAVs) come with significant constraints on energy and power. Similarly, computer networks provide time-varying bandwidth that can depend on signal strength (e.g., wireless networks) or network traffic conditions. Alternatively, the users can impose different constraints on image quality based on their interests. Traditional image and video coding systems have focused on rate-distortion optimization. More recently, distortion measures (e.g., PSNR) are being replaced by more sophisticated image quality metrics. However, these systems are based on fixed hardware configurations that provide limited options over power consumption. The use of dynamic partial reconfiguration with Field Programmable Gate Arrays (FPGAs) provides an opportunity to effectively control dynamic power consumption by jointly considering software-hardware configurations. This dissertation extends traditional rate-distortion optimization to rate-quality-power/energy optimization and demonstrates a wide variety of applications in both image and video compression. In each application, a family of Pareto-optimal configurations are developed that allow fine control in the rate-quality-power/energy optimization space. The term Dynamically Reconfiguration Architecture Systems for Time-varying Image Constraints (DRASTIC) is used to describe the derived systems. DRASTIC covers both software-only as well as software-hardware configurations to achieve fine optimization over a set of general modes that include: (i) maximum image quality, (ii) minimum dynamic power/energy, (iii) minimum bitrate, and (iv) typical mode over a set of opposing constraints to guarantee satisfactory performance. In joint software-hardware configurations, DRASTIC provides an effective approach for dynamic power optimization. For software configurations, DRASTIC provides an effective method for energy consumption optimization by controlling processing times. The dissertation provides several applications. First, stochastic methods are given for computing quantization tables that are optimal in the rate-quality space and demonstrated on standard JPEG compression. Second, a DRASTIC implementation of the DCT is used to demonstrate the effectiveness of the approach on motion JPEG. Third, a reconfigurable deblocking filter system is investigated for use in the current H.264/AVC systems. Fourth, the dissertation develops DRASTIC for all 35 intra-prediction modes as well as intra-encoding for the emerging High Efficiency Video Coding standard (HEVC)
Optimized Data Representation for Interactive Multiview Navigation
In contrary to traditional media streaming services where a unique media
content is delivered to different users, interactive multiview navigation
applications enable users to choose their own viewpoints and freely navigate in
a 3-D scene. The interactivity brings new challenges in addition to the
classical rate-distortion trade-off, which considers only the compression
performance and viewing quality. On the one hand, interactivity necessitates
sufficient viewpoints for richer navigation; on the other hand, it requires to
provide low bandwidth and delay costs for smooth navigation during view
transitions. In this paper, we formally describe the novel trade-offs posed by
the navigation interactivity and classical rate-distortion criterion. Based on
an original formulation, we look for the optimal design of the data
representation by introducing novel rate and distortion models and practical
solving algorithms. Experiments show that the proposed data representation
method outperforms the baseline solution by providing lower resource
consumptions and higher visual quality in all navigation configurations, which
certainly confirms the potential of the proposed data representation in
practical interactive navigation systems
Decoding-complexity-aware HEVC encoding using a complexity–rate–distortion model
The energy consumption of Consumer Electronic (CE) devices during media playback is inexorably linked to the computational complexity of decoding compressed video. Reducing a CE device's the energy consumption is therefore becoming ever more challenging with the increasing video resolutions and the complexity of the video coding algorithms. To this end, this paper proposes a framework that alters the video bit stream to reduce the decoding complexity and simultaneously limits the impact on the coding efficiency. In this context, this paper (i) first performs an analysis to determine the trade-off between the decoding complexity, video quality and bit rate with respect to a reference decoder implementation on a General Purpose Processor (GPP) architecture. Thereafter, (ii) a novel generic decoding complexity-aware video coding algorithm is proposed to generate decoding complexity-rate-distortion optimized High Efficiency Video Coding (HEVC) bit streams.
The experimental results reveal that the bit streams generated by the proposed algorithm achieve 29.43% and 13.22% decoding complexity reductions for a similar video quality with minimal coding efficiency impact compared to the state-of-the-art approaches when applied to the HM16.0 and openHEVC decoder implementations, respectively. In addition, analysis of the energy consumption behavior for the same scenarios reveal up to 20% energy consumption reductions while achieving a similar video quality to that of HM 16.0 encoded HEVC bit streams
- …