Compensating for motion estimation inaccuracies in DVC
Distributed video coding is a relatively new video coding approach in which compression is achieved by performing motion estimation at the decoder. Current techniques for decoder-side motion estimation rely on assumptions such as linear motion between the reference frames, and some of the resulting errors are corrected only after the frame has been partially decoded. In this paper, we propose a new approach with multiple predictors that accounts for inaccuracies in the decoder-side motion estimation process during decoding. Each predictor is assigned a weight, and the correlation between the original frame at the encoder and the set of predictors is modeled at the decoder. This correlation information is then used during the decoding process. Results indicate average quality gains of up to 0.4 dB.
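The idea of weighting several predictors can be illustrated with a minimal sketch. It assumes, purely for illustration, that the correlation between an original pixel value and each predictor is Laplacian and that the weighted predictors form a mixture; the function names, the parameter `alpha`, and the bin bounds `lo`/`hi` are hypothetical and are not taken from the paper.

```python
import math

def laplacian_pdf(x, mu, alpha):
    """Laplacian density centred on mu with scale parameter alpha."""
    return 0.5 * alpha * math.exp(-alpha * abs(x - mu))

def mixture_likelihood(x, predictors, weights, alpha=0.5):
    """Likelihood of value x under a weighted mixture of Laplacians,
    one component per predictor (weights are normalised here)."""
    total = sum(weights)
    return sum(
        (w / total) * laplacian_pdf(x, p, alpha)
        for p, w in zip(predictors, weights)
    )

def most_likely_value(predictors, weights, lo, hi, alpha=0.5):
    """Pick the integer in [lo, hi] that maximises the mixture likelihood.
    [lo, hi] stands for the range left open by the bits decoded so far."""
    return max(
        range(lo, hi + 1),
        key=lambda x: mixture_likelihood(x, predictors, weights, alpha),
    )

# Two predictors disagree; the more heavily weighted one dominates.
estimate = most_likely_value([100, 120], [3, 1], lo=96, hi=127)
```

With a single predictor the mixture reduces to the usual one-predictor Laplacian model, so the sketch only changes behaviour when the predictors disagree.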
On Benefits and Challenges of Conditional Interframe Video Coding in Light of Information Theory
The rise of variational autoencoders for image and video compression has
opened the door to many elaborate coding techniques. One example here is the
possibility of conditional interframe coding. Here, instead of transmitting the
residual between the original frame and the predicted frame (often obtained by
motion compensation), the current frame is transmitted under the condition of
knowing the prediction signal. In practice, conditional coding can be
straightforwardly implemented using a conditional autoencoder, which has also
shown good results in recent works. In this paper, we provide an information
theoretical analysis of conditional coding for inter frames and show in which
cases gains compared to traditional residual coding can be expected. We also
show the effect of information bottlenecks which can occur in practical video
coders in the prediction signal path due to the network structure, as a
consequence of the data-processing theorem or due to quantization. We
demonstrate that conditional coding has theoretical benefits over residual
coding but that there are cases in which the benefits are quickly canceled by
small information bottlenecks of the prediction signal.
Comment: 5 pages, 4 figures, accepted to be presented at PCS 2022. arXiv admin
note: text overlap with arXiv:2112.08011. Update note: fixed notation in Eq.
10, no changes otherwise.
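The comparison between conditional and residual coding can be summarised with standard entropy identities. The notation below is an assumption made for this note and is not necessarily the paper's: $X$ is the current frame, $P$ the prediction signal, $f(P)$ a processed (bottlenecked) version of the prediction, and all entropies are discrete.

```latex
\begin{align}
H(X \mid P) &= H(X - P \mid P) \;\le\; H(X - P), \\
H(X - P) - H(X \mid P) &= I(X - P;\, P) \;\ge\; 0, \\
H(X \mid f(P)) &\ge H(X \mid P)
  \quad \text{since } I\bigl(X; f(P)\bigr) \le I(X; P).
\end{align}
```

The first two lines say that ideal conditional coding never needs more rate than ideal residual coding, and that the saving equals the mutual information between the residual and the prediction. The third line is the data-processing inequality: if the coder only sees $f(P)$, the conditional rate rises, and the advantage disappears once $H(X \mid f(P))$ exceeds $H(X - P)$.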
Flexible distribution of complexity by hybrid predictive-distributed video coding
There is currently limited flexibility for distributing complexity in a video coding system. While rate-distortion-complexity (RDC) optimization techniques have been proposed for conventional predictive video coding with encoder-side motion estimation, they fail to offer truly flexible distribution of complexity between encoder and decoder, since the encoder is always assumed to have more computational resources available than the decoder. On the other hand, distributed video coding solutions with decoder-side motion estimation have been proposed, but hardly any RDC-optimized systems have been developed.
To offer more flexibility for video applications involving multi-tasking or battery-constrained devices, in this paper we propose a codec that combines predictive video coding concepts with techniques from distributed video coding, and we show the flexibility of this method in distributing complexity. We propose several modes for coding frames and provide a complexity analysis illustrating encoder and decoder computational complexity for each mode. Rate-distortion results for each mode indicate that the coding efficiency is similar. We describe a method for choosing which mode to use for coding each inter frame, taking into account encoder and decoder complexity constraints, and illustrate how complexity is distributed more flexibly.
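A per-frame mode decision under complexity constraints could look roughly like the sketch below. Everything in it is hypothetical: the mode names, the cost figures, and the selection rule (pick the feasible mode that leaves the most headroom) are illustrative assumptions, not the method described in the paper. The rule relies only on the abstract's statement that the modes have similar coding efficiency, so rate-distortion cost is left out.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Mode:
    name: str
    encoder_cost: float  # hypothetical complexity units per frame
    decoder_cost: float

# Hypothetical modes: motion estimation at the encoder, shared, or at the decoder.
MODES = [
    Mode("encoder_me", encoder_cost=10.0, decoder_cost=1.0),
    Mode("shared_me", encoder_cost=5.0, decoder_cost=5.0),
    Mode("decoder_me", encoder_cost=1.0, decoder_cost=10.0),
]

def choose_mode(modes, encoder_budget, decoder_budget):
    """Return the feasible mode with the lowest peak budget utilisation,
    or None when no mode fits both budgets."""
    feasible = [
        m for m in modes
        if m.encoder_cost <= encoder_budget and m.decoder_cost <= decoder_budget
    ]
    if not feasible:
        return None
    return min(
        feasible,
        key=lambda m: max(m.encoder_cost / encoder_budget,
                          m.decoder_cost / decoder_budget),
    )

# A battery-constrained encoder pushes motion estimation to the decoder.
picked = choose_mode(MODES, encoder_budget=2.0, decoder_budget=12.0)
```

Calling `choose_mode` once per inter frame with the current budgets lets the split of work follow changing device load.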
REGION-BASED ADAPTIVE DISTRIBUTED VIDEO CODING CODEC
The recently developed Distributed Video Coding (DVC) is suited to
applications where conventional video coding is not feasible because of its
inherently high-complexity encoding. Examples include video surveillance using
wireless or wired video sensor networks and applications using mobile cameras. With
DVC, the complexity is shifted from the encoder to the decoder.
The practical application of DVC is referred to as Wyner-Ziv (WZ) video coding,
where an estimate of the original frame, called "side information", is generated using
motion compensation at the decoder. Compression is achieved by sending only
the extra information needed to correct this estimate. An error-correcting
code is used under the assumption that the estimate is a noisy version of the original
frame, and the rate needed is a certain number of parity bits. The side information is
assumed to have become available at the decoder through a virtual channel. Due to
the limitations of the compensation method, the quality of the predicted frame, or side
information, is expected to vary. These limitations stem from location-specific,
non-stationary estimation noise. To mitigate them, conventional
video coders, like MPEG, use frame partitioning to allocate the optimum coder
to each partition and hence achieve better rate-distortion performance. The same
has not been done in DVC, however, as it increases the encoder complexity.
This work proposes partitioning the considered frame into many coding units
(regions), where each unit is encoded differently. This partitioning is, however, done at
the decoder while generating the side information, and the region map is sent to
the encoder at very little rate penalty. The partitioning allows allocation of appropriate
DVC coding parameters (virtual channel, rate, and quantizer) to each region. The
resulting region map is compressed using a quadtree algorithm and
communicated to the encoder via the feedback channel. Rate control in DVC is
performed by channel coding techniques (turbo codes, LDPC, etc.). The performance
of the channel code depends heavily on the accuracy of the virtual channel model, which
models the estimation error for each region. In this work, a turbo code is used and
an adaptive WZ DVC is designed in both the transform domain and the pixel domain.
Transform-domain WZ video coding (TDWZ) performs distinctly better than
ordinary pixel-domain Wyner-Ziv coding (PDWZ), since it exploits
spatial redundancy during encoding. The performance evaluations show that the
proposed system is superior to the existing distributed video coding solutions.
Although the proposed system requires extra bits representing the region map to be
transmitted, the rate gain is still noticeable, and it outperforms the state-of-the-art
frame-based DVC by 0.6-1.9 dB.
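Quadtree compression of a region map can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's actual scheme: the map is taken to be square with a power-of-two side, the region labels are binary (0 or 1), and the bit layout ("1" plus the label for a uniform block, "0" followed by four sub-blocks otherwise) is chosen here for clarity.

```python
def quadtree_encode(region_map, x=0, y=0, size=None):
    """Encode a square binary map as a bit string.
    Uniform block: '1' + label. Mixed block: '0' + four encoded quadrants
    in the order top-left, top-right, bottom-left, bottom-right."""
    if size is None:
        size = len(region_map)
    first = region_map[y][x]
    uniform = all(
        region_map[y + dy][x + dx] == first
        for dy in range(size) for dx in range(size)
    )
    if uniform:
        return "1" + str(first)
    half = size // 2
    return "0" + "".join(
        quadtree_encode(region_map, x + ox, y + oy, half)
        for oy in (0, half) for ox in (0, half)
    )

def quadtree_decode(bits, size):
    """Rebuild the size x size map from the bit string produced above."""
    grid = [[0] * size for _ in range(size)]

    def fill(pos, x, y, s):
        if bits[pos] == "1":          # leaf: paint the whole block
            label = int(bits[pos + 1])
            for dy in range(s):
                for dx in range(s):
                    grid[y + dy][x + dx] = label
            return pos + 2
        pos += 1                      # internal node: visit four quadrants
        half = s // 2
        for oy in (0, half):
            for ox in (0, half):
                pos = fill(pos, x + ox, y + oy, half)
        return pos

    fill(0, 0, 0, size)
    return grid

region_map = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
code = quadtree_encode(region_map)  # 9 bits instead of 16 raw bits
```

Maps with large uniform regions compress well under this layout, which is consistent with the small rate penalty the abstract reports for sending the map over the feedback channel.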
The role of the feedback channel (FC) is to adapt the bit rate to the changing
statistics between the side information and the frame to be encoded. In the
unidirectional scenario, the encoder must perform the rate control. To
estimate the rate correctly, the encoder must calculate typical side information. However, the
rate cannot be calculated exactly at the encoder; it can only be estimated. This
work also proposes a feedback-free, region-based adaptive DVC solution in the pixel
domain, based on a machine learning approach to estimating the side information.
Although the performance evaluations show a rate penalty, it is acceptable
considering the simplicity of the proposed algorithm.
In-vivo digital volume correlation via magnetic resonance imaging: Application to positional brain shift and deep tissue injury
This thesis aims to investigate the complexity of the physiological mechanical response
of soft tissues, providing rich datasets for the verification of clinical systems limiting or
preventing tissue injury. A thorough understanding of the sagging of the brain tissue
under the effect of gravity (positional brain shift, PBS) is paramount for the design of
an effective intra-operative correction of surgical trajectories; rich measurements of the
response of the buttock to sitting loads can help the verification of computational models
to couple with clinical measures for the prevention and control of pressure ulcers.
Digital volume correlation (DVC) consists in measuring the local differences between
scans depicting the deformed and undeformed stages of a sample under load, facilitating
the characterisation of the mechanical response of the sample. The use of DVC in-vivo
is limited, due to the limited quality of the scans constrained by the acquisition setting.
Accuracy of three deformable registration methods was first assessed after optimisation
against biomechanically plausible ground truths generated via finite element simulations.
Against the simulation of PBS, the best accuracy achieved was one order of magnitude
smaller than the resolution of the images. For the simulation of deformations of the
buttock due to sitting, optimal accuracy was around 10% of the average deformation
fields applied.
The best performing methods alongside their optimal parameter sets were then used
to perform in-vivo measurements on real magnetic resonance scans of two separate
datasets of healthy subjects. For PBS, the study revealed the need for intervention- and
patient-specific correction of surgical trajectories given the effect of head geometry and
orientation on the shift. For the deformation of the buttock due to sitting, the measurements gave a three-dimensional depiction of the local and global pattern of deformation,
whereas such results were previously limited to thickness or surface measurements.