Rate control for HEVC intra-coding based on piecewise linear approximations
This paper proposes a rate control (RC) algorithm for intra-coded sequences (I-frames) within the context of block-based predictive transform coding (PTC) that employs piecewise linear approximations of the rate-distortion (RD) curve of each frame. Specifically, it uses information about the rate (R) and distortion (D) of already compressed blocks within the current frame to linearly approximate the slope of the corresponding RD curve. The proposed algorithm is implemented in the High Efficiency Video Coding (HEVC) standard and compared with the current HEVC RC algorithm, which is based on a trained rate-lambda (R-λ) model. Evaluations on a variety of intra-coded sequences show that the proposed RC algorithm not only attains the overall target bit rate more accurately than the current RC algorithm but is also capable of encoding each I-frame at a more constant bit rate according to the overall bit budget, thus avoiding high bit-rate fluctuations across the sequence.
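The slope-estimation idea in this abstract can be sketched as follows. The function names, the two-point slope estimate, and the allocation heuristic are illustrative assumptions, not the authors' implementation:

```python
# Sketch: estimate the RD-curve slope for the current frame from
# (rate, distortion) samples of already-coded blocks, then use it to
# set the bit target of the next block. All names and constants are
# illustrative assumptions.

def estimate_rd_slope(samples):
    """Linearly approximate the local RD-curve slope from the two most
    recent (rate, distortion) samples of compressed blocks."""
    (r1, d1), (r2, d2) = samples[-2], samples[-1]
    if r2 == r1:
        return 0.0
    return (d2 - d1) / (r2 - r1)  # negative: distortion falls as rate grows

def bits_for_next_block(samples, remaining_bits, remaining_blocks):
    """Allocate bits to the next block from the remaining budget, nudged
    by the locally estimated RD slope to keep the rate near-constant."""
    base = remaining_bits / max(remaining_blocks, 1)
    slope = estimate_rd_slope(samples) if len(samples) >= 2 else 0.0
    # Steeper slope (more distortion saved per bit) -> spend slightly more.
    return base * (1.0 + min(abs(slope) * 0.01, 0.2))

samples = [(1000, 40.0), (1200, 36.0)]
target = bits_for_next_block(samples, 50_000, 40)
```

The point of the sketch is only the mechanism: in-frame RD samples feed a running linear slope estimate, which modulates an otherwise uniform per-block budget.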
Model based optimal bit allocation
Modeling the operational rate-distortion characteristics of a signal can significantly reduce the computational complexity of an optimal bit allocation algorithm. In this report, such models are studied.
Dynamic Bezier curves for variable rate-distortion
Bezier curves (BC) are important tools in a wide range of diverse and challenging applications, from computer-aided design to generic object shape descriptors. A major constraint of the classical BC is that only global information concerning control points (CP) is considered, so there may be a sizeable gap between the BC and its control polygon (CtrlPoly), leading to a large distortion in shape representation. While BC variants like degree elevation, composite BC, and refinement and subdivision narrow this gap, they increase the number of CP and thereby both the required bit rate and the computational complexity. In addition, while quasi-Bezier curves (QBC) close the gap without increasing the number of CP, they reduce the underlying distortion by only a fixed amount. This paper presents a novel contribution to BC theory with the introduction of a dynamic Bezier curve (DBC) model, which embeds variable localised CP information into the inherently global Bezier framework by strategically moving BC points towards the CtrlPoly. A shifting parameter (SP) is defined that enables curves lying within the region between the BC and CtrlPoly to be generated, with no commensurate increase in CP. DBC provides a flexible rate-distortion (RD) criterion for shape coding applications, with a theoretical model for determining the optimal SP value for any admissible distortion being formulated. Crucially, DBC retains core properties of the classical BC, including the convex hull and affine invariance, and can be seamlessly integrated into both the vertex-based shape coding and shape descriptor frameworks to improve their RD performance. DBC has been empirically tested on a number of natural and synthetically shaped objects, with qualitative and quantitative results confirming its consistently superior shape approximation performance compared with the classical BC, QBC and other established BC-based shape descriptor techniques.
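The shifting-parameter idea can be illustrated with a small sketch: a curve point is blended between the classical BC point and a point on the control polygon. The uniform parameterisation of the control polygon and all names below are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def bezier_point(ctrl, t):
    """Classical Bezier evaluation via de Casteljau's algorithm."""
    pts = np.asarray(ctrl, dtype=float)
    while len(pts) > 1:
        pts = (1 - t) * pts[:-1] + t * pts[1:]
    return pts[0]

def control_polygon_point(ctrl, t):
    """Point at parameter t along the piecewise-linear control polygon,
    with t mapped uniformly over its segments (an assumption here)."""
    ctrl = np.asarray(ctrl, dtype=float)
    n = len(ctrl) - 1
    u = t * n
    i = min(int(u), n - 1)
    return (1 - (u - i)) * ctrl[i] + (u - i) * ctrl[i + 1]

def dbc_point(ctrl, t, s):
    """Dynamic-Bezier-style sketch: blend the classical BC point toward
    the control polygon by a shifting parameter s in [0, 1]."""
    return (1 - s) * bezier_point(ctrl, t) + s * control_polygon_point(ctrl, t)
```

With s = 0 the classical BC is recovered and with s = 1 the curve lies on the control polygon, so varying s sweeps out curves in the region between the two, without adding control points.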
Geometric distortion measurement for shape coding: a contemporary review
Geometric distortion measurement and the associated metrics involved are integral to the rate-distortion (RD) shape coding framework, with the efficacy of the metrics being strongly influenced by the underlying measurement strategy. This has been the catalyst for many different techniques, and this paper presents a comprehensive review of geometric distortion measurement, the diverse metrics applied and their impact on shape coding. The respective performance of these measurement strategies is analysed from both an RD and a complexity perspective, with a recent distortion measurement technique based on arc-length parameterisation being comparatively evaluated. Some contemporary research challenges are also investigated, including schemes to effectively quantify shape deformation.
Depth map compression via 3D region-based representation
In 3D video, view synthesis is used to create new virtual views between encoded camera views. Errors in the coding of the depth maps introduce geometry inconsistencies in synthesized views. In this paper, a new 3D plane representation of the scene is presented which improves the performance of current standard video codecs in the view synthesis domain. Two image segmentation algorithms are proposed for generating a color and depth segmentation. Using both partitions, depth maps are segmented into regions without sharp discontinuities, without having to explicitly signal all depth edges. The resulting regions are represented using a planar model in the 3D world scene. This 3D representation allows an efficient encoding while preserving the 3D characteristics of the scene. The 3D planes open up the possibility to code multiview images with a unique representation.
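The planar region model can be illustrated by fitting a plane to a region's depth samples in the least-squares sense; this is an illustrative stand-in, not the paper's encoder, and all names are assumptions:

```python
import numpy as np

def fit_depth_plane(xs, ys, zs):
    """Least-squares fit of a plane z = a*x + b*y + c to the depth
    samples of one segmented region (illustrative stand-in for a
    3D planar region model)."""
    A = np.column_stack([xs, ys, np.ones(len(xs))])
    coeffs, *_ = np.linalg.lstsq(A, np.asarray(zs, dtype=float), rcond=None)
    return coeffs  # (a, b, c)

# A region whose depth truly lies on z = 0.5x - 0.2y + 10 is recovered
# exactly, so the whole region is described by just three parameters.
xs = np.array([0, 1, 2, 0, 1, 2], dtype=float)
ys = np.array([0, 0, 0, 1, 1, 1], dtype=float)
zs = 0.5 * xs - 0.2 * ys + 10
a, b, c = fit_depth_plane(xs, ys, zs)
```

This captures why the representation compresses well: a smooth depth region collapses to three plane coefficients instead of a per-pixel depth value.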
Concepts for on-board satellite image registration, volume 1
The NASA-NEEDS program goals present a requirement for on-board signal processing to achieve user-compatible, information-adaptive data acquisition. One very specific area of interest is the preprocessing required to register imaging sensor data which have been distorted by anomalies in subsatellite-point position and/or attitude control. The concepts and considerations involved in using state-of-the-art positioning systems such as the Global Positioning System (GPS) in concert with state-of-the-art attitude stabilization and/or determination systems to provide the required registration accuracy are discussed, with emphasis on assessing the accuracy to which a given image picture element can be located and identified, determining the algorithms required to augment the registration procedure, and evaluating the technology impact of performing these procedures on board the satellite.
Low-latency compression of mocap data using learned spatial decorrelation transform
Due to the growing needs of human motion capture (mocap) in movies, video games, sports, etc., it is highly desirable to compress mocap data for efficient storage and transmission. This paper presents two efficient frameworks for compressing human mocap data with low latency. The first framework processes the data in a frame-by-frame manner, making it ideal for mocap data streaming and time-critical applications. The second is clip-based and provides a flexible trade-off between latency and compression performance. Since mocap data exhibit some unique spatial characteristics, we propose a very effective transform, namely the learned orthogonal transform (LOT), for reducing spatial redundancy. The LOT problem is formulated as minimizing a squared error regularized by orthogonality and sparsity, and is solved via alternating iteration. We also adopt predictive coding and a temporal DCT for temporal decorrelation in the frame- and clip-based frameworks, respectively. Experimental results show that the proposed frameworks achieve higher compression performance at lower computational cost and latency than state-of-the-art methods.
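The alternating iteration described for the LOT can be sketched as follows. The exact objective, the soft-thresholding rule, and all parameter names are assumptions made in the spirit of the abstract, not the authors' algorithm:

```python
import numpy as np

def learn_orthogonal_transform(X, n_iter=20, lam=0.1):
    """Sketch of a learned orthogonal transform: alternately update a
    sparse code Y (soft-thresholding) and an orthogonal T (Procrustes
    via SVD), reducing ||X - T @ Y||_F^2 with a sparsity penalty on Y.
    X has one mocap frame per column; T is square and orthogonal."""
    d, n = X.shape
    T = np.eye(d)
    Y = X.copy()
    for _ in range(n_iter):
        # Sparse-coding step: with T orthogonal, the unregularized
        # optimum is T.T @ X; soft-threshold it to promote sparsity.
        Z = T.T @ X
        Y = np.sign(Z) * np.maximum(np.abs(Z) - lam, 0.0)
        # Transform step: orthogonal Procrustes solution,
        # T = U @ Vt from the SVD of X @ Y.T.
        U, _, Vt = np.linalg.svd(X @ Y.T)
        T = U @ Vt
    return T, Y
```

Each half-step has a closed form, which is what makes the alternating scheme cheap: the sparse code is an elementwise shrinkage and the orthogonality constraint is handled exactly by one small SVD per iteration.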