6 research outputs found
Edge-guided image gap interpolation using multi-scale transformation
This paper presents improvements in image gap restoration through the incorporation of edge-based directional interpolation within multi-scale pyramid transforms. Two types of image edges are reconstructed: 1) the local edges or textures, inferred from the gradients of the neighboring pixels and 2) the global edges between image objects or segments, inferred using a Canny detector. Through a process of pyramid transformation and downsampling, the image is progressively transformed into a series of reduced size layers until at the pyramid apex the gap size is one sample. At each layer, an edge skeleton image is extracted for edge-guided interpolation. The process is then reversed; from the apex, at each layer, the missing samples are estimated (an iterative method is used in the last stage of upsampling), up-sampled, and combined with the available samples of the next layer. Discrete cosine transform and a family of discrete wavelet transforms are utilized as alternatives for pyramid construction. Evaluations over a range of images, in regular and random loss pattern, at loss rates of up to 40%, demonstrate that the proposed method improves peak-signal-to-noise-ratio by 1–5 dB compared with a range of best-published works
Enhanced quality reconstruction of erroneous video streams using packet filtering based on non-desynchronizing bits and UDP checksum-filtered list decoding
The latest video coding standards, such as H.264 and H.265, are extremely vulnerable in error-prone networks. Due to their sophisticated spatial and temporal prediction tools, the effect of an error is not limited to the erroneous area but it can easily propagate spatially to the neighboring blocks and temporally to the following frames. Thus, reconstructed video packets at the decoder side may exhibit significant visual quality degradation. Error concealment and error corrections are two mechanisms that have been developed to improve the quality of reconstructed frames in the presence of errors.
In most existing error concealment approaches, the corrupted packets are ignored and only the correctly received information of the surrounding areas (spatially and/or temporally) is used to recover the erroneous area. This is due to the fact that there is no perfect error detection mechanism to identify correctly received blocks within a corrupted packet, and moreover because of the desynchronization problem caused by the transmission errors on the variable-length code (VLC). But, as many studies have shown, the corrupted packets may contain valuable information that can be used to reconstruct adequately of the lost area (e.g. when the error is located at the end of a slice).
On the other hand, error correction approaches, such as list decoding, exploit the corrupted packets to generate several candidate transmitted packets from the corrupted received packet. They then select, among these candidates, the one with the highest likelihood of being the transmitted packet based on the available soft information (e.g. log-likelihood ratio (LLR) of each bit). However, list decoding approaches suffer from a large solution space of candidate transmitted packets. This is worsened when the soft information is not available at the application layer; a more realistic scenario in practice. Indeed, since it is unknown which bits have higher probabilities of having been modified during transmission, the candidate received packets cannot be ranked by likelihood.
In this thesis, we propose various strategies to improve the quality of reconstructed packets which have been lightly damaged during transmission (e.g. at most a single error per packet). We first propose a simple but efficient mechanism to filter damaged packets in order to retain those likely to lead to a very good reconstruction and discard the others. This method can be used as a complement to most existing concealment approaches to enhance their performance. The method is based on the novel concept of non-desynchronizing bits (NDBs) defined, in the context of an H.264 context-adaptive variable-length coding (CAVLC) coded sequence, as a bit whose inversion does not cause desynchronization at the bitstream level nor changes the number of decoded macroblocks. We establish that, on typical coded bitstreams, the NDBs constitute about a one-third (about 30%) of a bitstream, and that the effect on visual quality of flipping one of them in a packet is mostly insignificant. In most cases (90%), the quality of the reconstructed packet when modifying an individual NDB is almost the same as the intact one. We thus demonstrate that keeping, under certain conditions, a corrupted packet as a candidate for the lost area can provide better visual quality compared to the concealment approaches. We finally propose a non-desync-based decoding framework, which retains a corrupted packet, under the condition of not causing desynchronization and not altering the number of expected macroblocks. The framework can be combined with most current concealment approaches. The proposed approach is compared to the frame copy (FC) concealment of Joint Model (JM) software (JM-FC) and a state-of-the-art concealment approach using the spatiotemporal boundary matching algorithm (STBMA) mechanism, in the case of one bit in error, and on average, respectively, provides 3.5 dB and 1.42 dB gain over them.
We then propose a novel list decoding approach called checksum-filtered list decoding (CFLD) which can correct a packet at the bit stream level by exploiting the receiver side user datagram protocol (UDP) checksum value. The proposed approach is able to identify the possible locations of errors by analyzing the pattern of the calculated UDP checksum on the corrupted packet. This makes it possible to considerably reduce the number of candidate transmitted packets in comparison to conventional list decoding approaches, especially when no soft information is available. When a packet composed of N bits contains a single bit in error, instead of considering N candidate packets, as is the case in conventional list decoding approaches, the proposed approach considers approximately N/32 candidate packets, leading to a 97% reduction in the number of candidates. This reduction can increase to 99.6% in the case of a two-bit error. The method’s performance is evaluated using H.264 and high efficiency video coding (HEVC) test model software. We show that, in the case H.264 coded sequence, on average, the CFLD approach is able to correct the packet 66% of the time. It also offers a 2.74 dB gain over JM-FC and 1.14 dB and 1.42 dB gains over STBMA and hard output maximum likelihood decoding (HO-MLD), respectively. Additionally, in the case of HEVC, the CFLD approach corrects the corrupted packet 91% of the time, and offers 2.35 dB and 4.97 dB gains over our implementation of FC concealment in HEVC test model software (HM-FC) in class B (1920×1080) and C (832×480) sequences, respectively
Recommended from our members
Multi-scale edge-guided image gap restoration
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University London.The focus of this research work is the estimation of gaps (missing blocks) in digital images. To progress the research two main issues were identified as (1) the appropriate domains for image gap restoration and (2) the methodologies for gap interpolation. Multi-scale transforms provide an appropriate framework for gap restoration. The main advantages are transformations into a set of frequency and scales and the ability to progressively reduce the size of the gap to one sample wide at the transform apex. Two types of multi-scale transform were considered for comparative evaluation; 2-dimensional (2D) discrete cosines (DCT) pyramid and 2D discrete wavelets (DWT). For image gap estimation, a family of conventional weighted interpolators and directional edge-guided interpolators are developed and evaluated. Two types of edges were considered; ‘local’ edges or textures and ‘global’ edges such as the boundaries between objects or within/across patterns in the image. For local edge, or texture, modelling a number of methods were explored which aim to reconstruct a set of gradients across the restored gap as those computed from the known neighbourhood. These differential gradients are estimated along the geometrical vertical, horizontal and cross directions for each pixel of the gap. The edge-guided interpolators aim to operate on distinct regions confined within edge lines. For global edge-guided interpolation, two main methods explored are Sobel and Canny detectors. The latter provides improved edge detection. The combination and integration of different multi-scale domains, local edge interpolators, global edge-guided interpolators and iterative estimation of edges provided a variety of configurations that were comparatively explored and evaluated. For evaluation a set of images commonly used in the literature work were employed together with simulated regular and random image gaps at a variety of loss rate. The performance measures used are the peak signal to noise ratio (PSNR) and structure similarity index (SSIM). The results obtained are better than the state of the art reported in the literature