85 research outputs found

    Optimising Spatial and Tonal Data for PDE-based Inpainting

    Some recent methods for lossy signal and image compression store only a few selected pixels and fill in the missing structures by inpainting with a partial differential equation (PDE). Suitable operators include the Laplacian, the biharmonic operator, and edge-enhancing anisotropic diffusion (EED). The quality of such approaches depends substantially on the selection of the data that is kept. Optimising this data in the domain and codomain gives rise to challenging mathematical problems that we address in this work. In the 1D case, we prove results that provide insights into the difficulty of this problem, and we give evidence that a splitting into spatial and tonal (i.e. function value) optimisation hardly deteriorates the results. In the 2D setting, we present generic algorithms that achieve a high reconstruction quality even if the specified data is very sparse. To optimise the spatial data, we use a probabilistic sparsification, followed by a nonlocal pixel exchange that avoids getting trapped in bad local optima. After this spatial optimisation we perform a tonal optimisation that modifies the function values in order to reduce the global reconstruction error. For homogeneous diffusion inpainting, this comes down to a least squares problem for which we prove that it has a unique solution. We demonstrate that it can be found efficiently with a gradient descent approach that is accelerated with fast explicit diffusion (FED) cycles. Our framework allows the desired density of the inpainting mask to be specified a priori. Moreover, it is more generic than other data optimisation approaches for the sparse inpainting problem, since it can also be extended to nonlinear inpainting operators such as EED. This is exploited to achieve reconstructions with state-of-the-art quality. We also give an extensive literature survey on PDE-based image compression methods.
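
    The tonal optimisation step described in this abstract can be illustrated with a minimal 1D sketch. It rests on several assumptions that are not taken from the paper: all function names are hypothetical, the signal is a plain Python list, the mask positions are given in increasing order, and plain gradient descent with a fixed step size stands in for the FED-accelerated scheme. In 1D, homogeneous diffusion inpainting with reflecting boundaries reduces to linear interpolation between the stored points, which is what `inpaint_1d` computes.

```python
def inpaint_1d(n, positions, values):
    """Reconstruct n samples from values stored at increasing mask positions.

    Linear interpolation between mask points, constant extension outside them
    (the 1D solution of homogeneous diffusion with reflecting boundaries).
    """
    u = [0.0] * n
    for i in range(positions[0] + 1):
        u[i] = values[0]
    for i in range(positions[-1], n):
        u[i] = values[-1]
    for k in range(len(positions) - 1):
        a, b = positions[k], positions[k + 1]
        for i in range(a, b + 1):
            t = (i - a) / (b - a)
            u[i] = (1.0 - t) * values[k] + t * values[k + 1]
    return u


def mse(u, f):
    """Mean squared error between reconstruction u and original f."""
    return sum((a - b) ** 2 for a, b in zip(u, f)) / len(f)


def tonal_optimise(f, positions, steps=2000):
    """Adjust the stored values to reduce the global reconstruction error.

    Minimises 0.5 * ||B g - f||^2 by plain gradient descent, where column k
    of B is the reconstruction from a unit value at mask point k.
    """
    n, m = len(f), len(positions)
    basis = [
        inpaint_1d(n, positions, [1.0 if j == k else 0.0 for j in range(m)])
        for k in range(m)
    ]
    # Rows of B sum to 1, so the largest column sum bounds the Lipschitz
    # constant of the gradient; this step size cannot increase the error.
    step = 1.0 / max(sum(col) for col in basis)
    g = [float(f[p]) for p in positions]  # start from the original values
    for _ in range(steps):
        u = inpaint_1d(n, positions, g)
        residual = [ui - fi for ui, fi in zip(u, f)]
        grad = [sum(b * r for b, r in zip(col, residual)) for col in basis]
        g = [gk - step * dk for gk, dk in zip(g, grad)]
    return g
```

    A typical use is to compare `mse(inpaint_1d(n, positions, [f[p] for p in positions]), f)` with the error obtained from the values returned by `tonal_optimise(f, positions)`; the spatial optimisation of `positions` is not covered by this sketch.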

    Investigating Polynomial Fitting Schemes for Image Compression

    Image compression is a means to perform transmission or storage of visual data in the most economical way. Though many algorithms have been reported, research is still needed to cope with the continuous demand for more efficient transmission or storage. This research work explores and implements polynomial fitting techniques as a means to perform block-based lossy image compression. In an attempt to investigate nonpolynomial models, a region-based scheme is implemented to fit the whole image using bell-shaped functions. The idea is simply to view an image as a 3D geographical map consisting of hills and valleys. However, the scheme suffers from high computational demands and is inferior to many available image compression schemes. Hence, only polynomial models receive further consideration. A first order polynomial (plane) model is designed to work in a multiplication- and division-free (MDF) environment. The intensity values of each image block are fitted to a plane and the parameters are then quantized and coded. Blocking artefacts, a common drawback of block-based image compression techniques, are reduced using an MDF line-fitting scheme at block boundaries. It is shown that a compression ratio of 62:1 at 28.8 dB is attainable for the standard image PEPPER, outperforming JPEG both objectively and subjectively for this part of the rate-distortion characteristics. Inter-block prediction can substantially improve the compression performance of the plane model to reach a compression ratio of 112:1 at 27.9 dB. This improvement, however, slightly increases computational complexity and reduces pipelining capability. Although JPEG 2000 is not a block-based scheme, it is encouraging that the proposed prediction scheme performs better in comparison to JPEG 2000, computationally and qualitatively. However, more experiments are needed to have a more concrete comparison. To reduce blocking artefacts, a new postprocessing scheme, based on Weber's law, is employed. It is reported that images postprocessed using this scheme are subjectively more pleasing with a marginal increase in PSNR (<0.3 dB). Weber's law is also modified to perform edge detection and quality assessment tasks. These results motivate the exploration of higher order polynomials, using three parameters to maintain comparable compression performance. To investigate the impact of higher order polynomials, through an approximate asymptotic behaviour, a novel linear mapping scheme is designed. Though computationally demanding, the performance of higher order polynomial approximation schemes is comparable to that of the plane model. This clearly demonstrates the powerful approximation capability of the plane model. As such, the proposed linear mapping scheme constitutes a new approach in image modeling, and is hence worth future consideration.
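
    As a rough illustration of the plane model mentioned above, the following sketch fits the intensities of one block to a first order polynomial with three parameters. It is an assumption-laden stand-in, not the thesis's method: it uses ordinary floating-point least squares rather than the multiplication- and division-free (MDF) design, it omits quantization and coding, and the names `fit_plane` and `reconstruct_plane` are hypothetical. Because the coordinates are centred on a full rectangular grid, the least squares solution separates into a mean and two independent slopes.

```python
def fit_plane(block):
    """Least squares fit of z = mean + bx * (x - cx) + by * (y - cy).

    block is a list of equally long rows of intensity values.
    Returns the three plane parameters (mean, bx, by).
    """
    h, w = len(block), len(block[0])
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    mean = sum(sum(row) for row in block) / (h * w)
    # Sums of squared centred coordinates over the whole grid.
    sxx = h * sum((x - cx) ** 2 for x in range(w))
    syy = w * sum((y - cy) ** 2 for y in range(h))
    sxz = sum((x - cx) * block[y][x] for y in range(h) for x in range(w))
    syz = sum((y - cy) * block[y][x] for y in range(h) for x in range(w))
    bx = sxz / sxx if sxx else 0.0  # a one-column block has no x slope
    by = syz / syy if syy else 0.0  # a one-row block has no y slope
    return mean, bx, by


def reconstruct_plane(params, h, w):
    """Rebuild an h x w block from the three plane parameters."""
    mean, bx, by = params
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    return [[mean + bx * (x - cx) + by * (y - cy) for x in range(w)]
            for y in range(h)]
```

    Storing three numbers per block in place of all of its pixels is the source of the compression; the achievable ratio then depends on the block size and on how the parameters are quantized and coded, which this sketch leaves out.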

    Video coding for compression and content-based functionality

    The lifetime of this research project has seen two dramatic developments in the area of digital video coding. The first has been the progress of compression research leading to a factor of two improvement over existing standards, much wider deployment possibilities and the development of the new international ITU-T Recommendation H.263. The second has been a radical change in the approach to video content production with the introduction of the content-based coding concept and the addition of scene composition information to the encoded bit-stream. Content-based coding is central to the latest international standards efforts from the ISO/IEC MPEG working group. This thesis reports on extensions to existing compression techniques exploiting a priori knowledge about scene content. Existing, standardised, block-based compression coding techniques were extended with work on arithmetic entropy coding and intra-block prediction. These form part of the H.263 and MPEG-4 specifications, respectively. Object-based coding techniques were developed within a collaborative simulation model, known as SIMOC, then extended with ideas on grid motion vector modelling and vector accuracy confidence estimation. An improved confidence measure for encouraging motion smoothness is proposed. Object-based coding ideas, with those from other model- and layer-based coding approaches, influenced the development of content-based coding within MPEG-4. This standard made considerable progress in this newly adopted content-based video coding field, defining normative techniques for arbitrary shape and texture coding. The means to generate this information for the content to be coded, the analysis problem, was intentionally not specified. Further research work in this area concentrated on video segmentation and analysis techniques to exploit the benefits of content-based coding for generic frame-based video. The work reported here introduces the use of a clustering algorithm on raw data features for providing initial segmentation of video data and subsequent tracking of those image regions through video sequences. Collaborative video analysis frameworks from COST 211quat and MPEG-4, combining results from many other segmentation schemes, are also introduced.

    Scanline calculation of radial influence for image processing

    Efficient methods for the calculation of radial influence are described and applied to two image processing problems, digital halftoning and mixed content image compression. The methods operate recursively on scanlines of image values, spreading intensity from scanline to scanline in proportions approximating a Cauchy distribution. For error diffusion halftoning, experiments show that this recursive scanline spreading provides an ideal pattern of distribution of error. Error diffusion using masks generated to provide this distribution of error alleviates error diffusion "worm" artifacts. The recursive scanline-by-scanline application of a spreading filter and a complementary filter can be used to reconstruct an image from its horizontal and vertical pixel difference values. When combined with the use of a downsampled image, the reconstruction is robust to incomplete and quantized pixel difference data. Such gradient field integration methods are described in detail, proceeding from representation of images by gradient values along contours through to a variety of efficient algorithms. Comparisons show that this form of gradient field integration by convolution provides reduced distortion compared to other high-speed gradient integration methods. The reduced distortion can be attributed to success in approximating a radial pattern of influence. An approach to edge-based image compression is proposed using integration of gradient data along edge contours and regularly sampled low resolution image data. This edge-based image compression model is similar to previous sketch-based image coding methods but allows a simple and efficient calculation of an edge-based approximation image. A low complexity implementation of this approach to compression is described. The implementation extracts and represents gradient data along edge contours as pixel differences and calculates an approximate image by performing integration of pixel difference data by scanline convolution. The implementation was developed as a prototype for compression of mixed content image data in printing systems. Compression results are reported and strengths and weaknesses of the implementation are identified.

    An investigation into the requirements for an efficient image transmission system over an ATM network

    This thesis looks into the problems arising in an image transmission system when transmitting over an ATM network. Two main areas were investigated: (i) an alternative coding technique to reduce the bit rate required; and (ii) concealment of errors due to cell loss, with emphasis on processing in the transform domain of DCT-based images. [Continues.]

    Understanding and advancing PDE-based image compression

    This thesis is dedicated to image compression with partial differential equations (PDEs). PDE-based codecs store only a small number of image points and propagate their information into the unknown image areas during the decompression step. For certain classes of images, PDE-based compression can already outperform the current quasi-standard, JPEG2000. However, the reasons for this success are not yet fully understood, and PDE-based compression is still in a proof-of-concept stage. With a probabilistic justification for anisotropic diffusion, we contribute to a deeper insight into design principles for PDE-based codecs. Moreover, by analysing the interaction between efficient storage methods and image reconstruction with diffusion, we can rank PDEs according to their practical value in compression. Based on these observations, we advance PDE-based compression towards practical viability: First, we present a new hybrid codec that combines PDE- and patch-based interpolation to deal with highly textured images. Furthermore, a new video player demonstrates the real-time capabilities of PDE-based image interpolation, and a new region-of-interest coding algorithm represents important image areas with high accuracy. Finally, we propose a new framework for diffusion-based image colourisation that we use to build an efficient codec for colour images. Experiments on real-world image databases show that our new method is qualitatively competitive with current state-of-the-art codecs.

    Metrics for Stereoscopic Image Compression

    This research investigates metrics for automatically predicting compression settings for stereoscopic images that minimize file size while still maintaining an acceptable level of image quality. It evaluates whether symmetric or asymmetric compression produces a better quality of stereoscopic image. Initially, how Peak Signal to Noise Ratio (PSNR) measures the quality of varyingly compressed stereoscopic image pairs was investigated. Two trials with human subjects, following the ITU-R BT.500-11 Double Stimulus Continuous Quality Scale (DSCQS), were undertaken to measure the quality of symmetric and asymmetric stereoscopic image compression. Computational models of the Human Visual System (HVS) were then investigated and a new stereoscopic image quality metric designed and implemented. The metric point-matches regions of high spatial frequency between the left and right views of the stereo pair and accounts for HVS sensitivity to contrast and luminance changes in these regions. The PSNR results show that symmetric, as opposed to asymmetric, stereo image compression produces significantly better results. The human factors trial suggested that, in general, symmetric compression of stereoscopic images should be used. The new metric, Stereo Band Limited Contrast, has been demonstrated as a better predictor of human image quality preference than PSNR and can be used to predict a perceptual threshold level for stereoscopic image compression. The threshold is the maximum compression that can be applied without the perceived image quality being altered. Overall, it is concluded that symmetric, as opposed to asymmetric, stereo image encoding should be used for stereoscopic image compression. As PSNR measures of image quality are correctly criticized for correlating poorly with perceived visual quality, the new HVS-based metric was developed. This metric produces a useful threshold to provide a practical starting point to decide the level of compression to use.
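
    For reference, the PSNR baseline against which the new metric is compared can be computed as in the sketch below. This is the standard single-image definition for greyscale images stored as lists of rows; the function name `psnr` and the default peak value of 255 (8-bit data) are assumptions for illustration, and how the thesis combines the left and right views of a stereo pair is not stated in the abstract, so it is not modelled here.

```python
import math


def psnr(reference, test, peak=255.0):
    """Peak Signal to Noise Ratio in dB between two greyscale images.

    Both images are lists of rows with identical dimensions.
    Returns math.inf when the images are identical.
    """
    if len(reference) != len(test) or any(
        len(r) != len(t) for r, t in zip(reference, test)
    ):
        raise ValueError("images must have the same dimensions")
    count = sum(len(r) for r in reference)
    squared_error = sum(
        (a - b) ** 2 for r, t in zip(reference, test) for a, b in zip(r, t)
    )
    mean_squared_error = squared_error / count
    if mean_squared_error == 0:
        return math.inf
    return 10.0 * math.log10(peak * peak / mean_squared_error)
```

    PSNR depends only on pixel-wise squared differences, which is one reason it can disagree with perceived quality.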

    A motion-based approach for audio-visual automatic speech recognition

    The research work presented in this thesis introduces novel approaches for both visual region of interest extraction and visual feature extraction for use in audio-visual automatic speech recognition. In particular, the speaker's movement that occurs during speech is used to isolate the mouth region in video sequences, and motion-based features obtained from this region are used to provide new visual features for audio-visual automatic speech recognition. The mouth region extraction approach proposed in this work is shown to give superior performance compared with existing colour-based lip segmentation methods. The new features are obtained from three separate representations of motion in the region of interest, namely the difference in luminance between successive images, block matching based motion vectors and optical flow. The new visual features are found to improve visual-only and audio-visual speech recognition performance when compared with the commonly used appearance feature-based methods. In addition, a novel approach is proposed for visual feature extraction from either the discrete cosine transform or discrete wavelet transform representations of the mouth region of the speaker. In this work, the image transform is explored from a new viewpoint of data discrimination, in contrast to the more conventional data preservation viewpoint. The main finding of this work is that audio-visual automatic speech recognition systems using the new features extracted from the frequency bands selected according to their discriminatory abilities generally outperform those using features designed for data preservation. To establish the noise robustness of the new features proposed in this work, their performance has been studied in the presence of a range of different types of noise and at various signal-to-noise ratios. In these experiments, the audio-visual automatic speech recognition systems based on the new approaches were found to give superior performance both to audio-visual systems using appearance-based features and to audio-only speech recognition systems.