22 research outputs found

    Removal Of Blocking Artifacts From JPEG-Compressed Images Using Neural Network

    Get PDF
    The goal of this research was to develop a neural network that will produce considerable improvement in the quality of JPEG compressed images, irrespective of compression level present in the images. In order to develop a computationally efficient algorithm for reducing blocky and Gibbs oscillation artifacts from JPEG compressed images, we integrated artificial intelligence to remove blocky and Gibbs oscillation artifacts. In this approach, alpha blend filter [7] was used to post process JPEG compressed images to reduce noise and artifacts without losing image details. Here alpha blending was controlled by a limit factor that considers the amount of compression present, and any local information derived from Prewitt filter application in the input JPEG image. The outcome of modified alpha blend was improved by a trained neural network and compared with various other published works [7][9][11][14][20][23][30][32][33][35][37] where authors used post compression filtering methods

    Complex Neural Networks for Audio

    Get PDF
    Audio is represented in two mathematically equivalent ways: the real-valued time domain (i.e., waveform) and the complex-valued frequency domain (i.e., spectrum). There are advantages to the frequency-domain representation, e.g., the human auditory system is known to process sound in the frequency-domain. Furthermore, linear time-invariant systems are convolved with sources in the time-domain, whereas they may be factorized in the frequency-domain. Neural networks have become rather useful when applied to audio tasks such as machine listening and audio synthesis, which are related by their dependencies on high quality acoustic models. They ideally encapsulate fine-scale temporal structure, such as that encoded in the phase of frequency-domain audio, yet there are no authoritative deep learning methods for complex audio. This manuscript is dedicated to addressing the shortcoming. Chapter 2 motivates complex networks by their affinity with complex-domain audio, while Chapter 3 contributes methods for building and optimizing complex networks. We show that the naive implementation of Adam optimization is incorrect for complex random variables and show that selection of input and output representation has a significant impact on the performance of a complex network. Experimental results with novel complex neural architectures are provided in the second half of this manuscript. Chapter 4 introduces a complex model for binaural audio source localization. We show that, like humans, the complex model can generalize to different anatomical filters, which is important in the context of machine listening. The complex model\u27s performance is better than that of the real-valued models, as well as real- and complex-valued baselines. Chapter 5 proposes a two-stage method for speech enhancement. In the first stage, a complex-valued stochastic autoencoder projects complex vectors to a discrete space. In the second stage, long-term temporal dependencies are modeled in the discrete space. The autoencoder raises the performance ceiling for state of the art speech enhancement, but the dynamic enhancement model does not outperform other baselines. We discuss areas for improvement and note that the complex Adam optimizer improves training convergence over the naive implementation

    Visual Recognition and Categorization on the Basis of Similarities to Multiple Class Prototypes

    Get PDF
    To recognize a previously seen object, the visual system must overcome the variability in the object's appearance caused by factors such as illumination and pose. Developments in computer vision suggest that it may be possible to counter the influence of these factors, by learning to interpolate between stored views of the target object, taken under representative combinations of viewing conditions. Daily life situations, however, typically require categorization, rather than recognition, of objects. Due to the open-ended character both of natural kinds and of artificial categories, categorization cannot rely on interpolation between stored examples. Nonetheless, knowledge of several representative members, or prototypes, of each of the categories of interest can still provide the necessary computational substrate for the categorization of new instances. The resulting representational scheme based on similarities to prototypes appears to be computationally viable, and is readily mapped onto the mechanisms of biological vision revealed by recent psychophysical and physiological studies

    Removal Of Blocking Artifacts From JPEG-Compressed Images Using An Adaptive Filtering Algorithm

    Get PDF
    The aim of this research was to develop an algorithm that will produce a considerable improvement in the quality of JPEG images, by removing blocking and ringing artifacts, irrespective of the level of compression present in the image. We review multiple published related works, and finally present a computationally efficient algorithm for reducing the blocky and Gibbs oscillation artifacts commonly present in JPEG compressed images. The algorithm alpha-blends a smoothed version of the image with the original image; however, the blending is controlled by a limit factor that considers the amount of compression present and any local edge information derived from the application of a Prewitt filter. In addition, the actual value of the blending coefficient (α) is derived from the local Mean Structural Similarity Index Measure (MSSIM) which is also adjusted by a factor that also considers the amount of compression present. We also present our results as well as the results for a variety of other papers whose authors used other post compression filtering methods

    Distortion-constraint compression of three-dimensional CLSM images using image pyramid and vector quantization

    Get PDF
    The confocal microscopy imaging techniques, which allow optical sectioning, have been successfully exploited in biomedical studies. Biomedical scientists can benefit from more realistic visualization and much more accurate diagnosis by processing and analysing on a three-dimensional image data. The lack of efficient image compression standards makes such large volumetric image data slow to transfer over limited bandwidth networks. It also imposes large storage space requirements and high cost in archiving and maintenance. Conventional two-dimensional image coders do not take into account inter-frame correlations in three-dimensional image data. The standard multi-frame coders, like video coders, although they have good performance in capturing motion information, are not efficiently designed for coding multiple frames representing a stack of optical planes of a real object. Therefore a real three-dimensional image compression approach should be investigated. Moreover the reconstructed image quality is a very important concern in compressing medical images, because it could be directly related to the diagnosis accuracy. Most of the state-of-the-arts methods are based on transform coding, for instance JPEG is based on discrete-cosine-transform CDCT) and JPEG2000 is based on discrete- wavelet-transform (DWT). However in DCT and DWT methods, the control of the reconstructed image quality is inconvenient, involving considerable costs in computation, since they are fundamentally rate-parameterized methods rather than distortion-parameterized methods. Therefore it is very desirable to develop a transform-based distortion-parameterized compression method, which is expected to have high coding performance and also able to conveniently and accurately control the final distortion according to the user specified quality requirement. This thesis describes our work in developing a distortion-constraint three-dimensional image compression approach, using vector quantization techniques combined with image pyramid structures. We are expecting our method to have: 1. High coding performance in compressing three-dimensional microscopic image data, compared to the state-of-the-art three-dimensional image coders and other standardized two-dimensional image coders and video coders. 2. Distortion-control capability, which is a very desirable feature in medical 2. Distortion-control capability, which is a very desirable feature in medical image compression applications, is superior to the rate-parameterized methods in achieving a user specified quality requirement. The result is a three-dimensional image compression method, which has outstanding compression performance, measured objectively, for volumetric microscopic images. The distortion-constraint feature, by which users can expect to achieve a target image quality rather than the compressed file size, offers more flexible control of the reconstructed image quality than its rate-constraint counterparts in medical image applications. Additionally, it effectively reduces the artifacts presented in other approaches at low bit rates and also attenuates noise in the pre-compressed images. Furthermore, its advantages in progressive transmission and fast decoding make it suitable for bandwidth limited tele-communications and web-based image browsing applications

    Multiscale Markov Decision Problems: Compression, Solution, and Transfer Learning

    Full text link
    Many problems in sequential decision making and stochastic control often have natural multiscale structure: sub-tasks are assembled together to accomplish complex goals. Systematically inferring and leveraging hierarchical structure, particularly beyond a single level of abstraction, has remained a longstanding challenge. We describe a fast multiscale procedure for repeatedly compressing, or homogenizing, Markov decision processes (MDPs), wherein a hierarchy of sub-problems at different scales is automatically determined. Coarsened MDPs are themselves independent, deterministic MDPs, and may be solved using existing algorithms. The multiscale representation delivered by this procedure decouples sub-tasks from each other and can lead to substantial improvements in convergence rates both locally within sub-problems and globally across sub-problems, yielding significant computational savings. A second fundamental aspect of this work is that these multiscale decompositions yield new transfer opportunities across different problems, where solutions of sub-tasks at different levels of the hierarchy may be amenable to transfer to new problems. Localized transfer of policies and potential operators at arbitrary scales is emphasized. Finally, we demonstrate compression and transfer in a collection of illustrative domains, including examples involving discrete and continuous statespaces.Comment: 86 pages, 15 figure

    Optimal control and approximations

    Get PDF
    corecore