38 research outputs found

    Weakly monotonic averaging with application to image processing

    Nonlocal smoothing and adaptive morphology for scalar- and matrix-valued images

    In this work we deal with two classic degradation processes in image analysis, namely noise contamination and incomplete data. Standard greyscale and colour photographs as well as matrix-valued images, e.g. diffusion-tensor magnetic resonance imaging, may be corrupted by Gaussian or impulse noise, and may suffer from missing data. In this thesis we develop novel reconstruction approaches to image smoothing and image completion that are applicable to both scalar- and matrix-valued images. For the image smoothing problem, we propose discrete variational methods consisting of nonlocal data and smoothness constraints that penalise general dissimilarity measures defined on image patches. We obtain edge-preserving filters by jointly using such measures in highly textured regions together with robust non-convex penalisers. For the image completion problem, we introduce adaptive, anisotropic morphological partial differential equations modelling the dilation and erosion processes. They adjust themselves to the local geometry to adaptively fill in missing data, complete broken directional structures and even enhance flow-like patterns in an anisotropic manner. The excellent reconstruction capabilities of the proposed techniques are tested on various synthetic and real-world data sets.

    Connecting mathematical models for image processing and neural networks

    This thesis deals with the connections between mathematical models for image processing and deep learning. While data-driven deep learning models such as neural networks are flexible and perform well, they are often used as a black box. This makes it hard to provide theoretical model guarantees and scientific insights. On the other hand, more traditional, model-driven approaches such as diffusion, wavelet shrinkage, and variational models offer a rich set of mathematical foundations. Our goal is to transfer these foundations to neural networks. To this end, we pursue three strategies. First, we design trainable variants of traditional models and reduce their parameter set after training to obtain transparent and adaptive models. Moreover, we investigate the architectural design of numerical solvers for partial differential equations and translate them into building blocks of popular neural network architectures. This yields criteria for stable networks and inspires novel design concepts. Lastly, we present novel hybrid models for inpainting that rely on our theoretical findings. These strategies provide three ways of combining the best of the two worlds of model- and data-driven approaches. Our work contributes to the overarching goal of closing the gap between these worlds that still exists in performance and understanding.
    Funding: ERC Advanced Grant INCOVI
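
    The idea of reading a numerical PDE solver as a network building block can be illustrated with a minimal sketch. It is not taken from the thesis: it assumes plain 1D linear diffusion on a unit grid with reflecting boundaries, and the names `diffusion_step`, `diffuse` and `tau` are hypothetical. One explicit step has the form "input plus a small update", the same shape as a residual block, and for this scheme a step size `tau <= 0.5` keeps every output a convex combination of its neighbours.

```python
# Hypothetical illustration, not the thesis's model: explicit 1D linear
# diffusion written in residual form, u_next = u + tau * update(u).

def diffusion_step(u, tau=0.25):
    """One explicit step on a unit grid with reflecting boundaries."""
    n = len(u)
    out = []
    for i in range(n):
        left = u[i - 1] if i > 0 else u[i]        # mirror at the left border
        right = u[i + 1] if i < n - 1 else u[i]   # mirror at the right border
        # Skip connection (u[i]) plus scaled discrete second derivative.
        out.append(u[i] + tau * (left - 2.0 * u[i] + right))
    return out


def diffuse(u, steps, tau=0.25):
    """Stack several steps, analogous to stacking residual blocks."""
    for _ in range(steps):
        u = diffusion_step(u, tau)
    return u


signal = [0.0, 0.0, 1.0, 0.0, 0.0]
smoothed = diffuse(signal, steps=10)
```

    With `tau <= 0.5` the result stays within the range of the input and the sum of the signal is preserved, which is the kind of stability property that can be carried over to a network layer.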

    Structure-aware image denoising, super-resolution, and enhancement methods

    Denoising, super-resolution and structure enhancement are classical image processing applications. The motive behind their existence is to aid our visual analysis of raw digital images. Despite tremendous progress in these fields, certain difficult problems are still open to research. For example, denoising and super-resolution techniques that possess all of the following properties are very scarce: they must preserve critical structures like corners, be robust to the type of noise distribution, avoid undesirable artefacts, and also be fast. The area of structure enhancement also has an unresolved issue: very little effort has been put into designing models that can tackle anisotropic deformations in the image acquisition process. In this thesis, we design novel methods in the form of partial differential equations, patch-based approaches and variational models to overcome the aforementioned obstacles. In most cases, our methods outperform the existing approaches in both quality and speed, despite being applicable to a broader range of practical situations.

    Automatic Detectors for Underwater Soundscape Measurements

    Environmental impact regulations require that marine industrial operators quantify their contribution to underwater noise scenes. Automation of such assessments becomes feasible with the successful categorisation of sounds into broader classes based on source types: biological, anthropogenic and physical. Previous approaches to passive acoustic monitoring have mostly been limited to a few specific sources of interest. In this study, source-independent signal detectors are developed and a framework is presented for the automatic categorisation of underwater sounds into the aforementioned classes.

    Camera Spatial Frequency Response Derived from Pictorial Natural Scenes

    Camera system performance is a prominent part of many aspects of imaging science and computer vision. Many aspects of camera performance determine how accurately the image represents the scene, including measurements of colour accuracy, tone reproduction, geometric distortions, and image noise evaluation. The research conducted in this thesis focuses on the Modulation Transfer Function (MTF), a widely used camera performance measurement employed to describe resolution and sharpness. Traditionally measured under controlled conditions with characterised test charts, the MTF is a measurement restricted to laboratory settings. The MTF is based on linear system theory, meaning the input to output must follow a straightforward correlation. Established methods for measuring the camera system MTF include the ISO12233:2017 for measuring the edge-based Spatial Frequency Response (e-SFR), a sister measure of the MTF designed for measuring discrete systems. Many modern camera systems incorporate non-linear, highly adaptive image signal processing (ISP) to improve image quality. As a result, system performance becomes scene- and processing-dependent, adapting to the scene contents captured by the camera. Established test-chart-based MTF/SFR methods do not describe this adaptive nature; they only provide the response of the camera to a test chart signal. Further, with the increased use of Deep Neural Networks (DNN) for image recognition tasks and autonomous vision systems, there is an increased need for monitoring system performance outside laboratory conditions in real time, i.e. live-MTF. Such measurements would assist in monitoring the camera systems to ensure they are fully operational for decision-critical tasks. This thesis presents research conducted to develop a novel automated methodology that estimates the standard e-SFR directly from pictorial natural scenes.
This methodology has the potential to produce scene-dependent and real-time camera system performance measurements, opening new possibilities in imaging science and allowing live monitoring/calibration of systems for autonomous computer vision applications. The proposed methodology incorporates many well-established image processes, as well as others developed for specific purposes. It is presented in two parts. Firstly, the Natural Scene derived SFRs (NS-SFRs) are obtained from isolated captured scene step-edges, after verifying that these edges have the correct profile for use in the slanted-edge algorithm. The resulting NS-SFRs are shown to be a function of both camera system performance and scene contents. The second part of the methodology uses a series of derived NS-SFRs to estimate the system e-SFR, as per the ISO12233 standard. This is achieved by applying a sequence of thresholds to segment the most likely data corresponding to the system performance. These thresholds a) group the expected optical performance variation across the imaging circle within radial distance segments, b) obtain the highest performance NS-SFRs per segment and c) select the NS-SFRs with input edge and region of interest (ROI) parameter ranges shown to introduce minimal e-SFR variation. The selected NS-SFRs are averaged per radial segment to estimate system e-SFRs across the field of view. A weighted average of these estimates provides an overall system performance estimation. This methodology is implemented for e-SFR estimation of three characterised camera systems, two near-linear and one highly non-linear. Investigations are conducted using large, diverse image datasets as well as restricting scene content and the number of images used for the estimation. The resulting estimates are comparable to ISO12233 e-SFRs derived from test chart inputs for the near-linear systems. The overall estimate stays within one standard deviation of the equivalent test chart measurement.
Results from the highly non-linear system indicate scene and processing dependency, potentially leading to a more representative SFR measure than the current chart-based approaches for such systems. These results suggest that the proposed method is a viable alternative to the ISO technique.
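
    The second part of the methodology (group by radial segment, keep the best-performing NS-SFRs, average per segment, then take a weighted average) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the function name `estimate_system_sfr`, the `keep_fraction` parameter, ranking curves by the sum of their values, and the segment weights are all hypothetical, and the edge/ROI parameter selection of step c) is omitted.

```python
# Hypothetical sketch of the aggregation step; thresholds, ranking rule and
# weights are assumptions, and the edge/ROI parameter filter is left out.

def estimate_system_sfr(ns_sfrs, segment_edges, weights, keep_fraction=0.1):
    """ns_sfrs: list of (radial_distance, curve); each curve is a list of
    SFR values sampled at the same spatial frequencies."""
    per_segment = []
    for lo, hi in zip(segment_edges[:-1], segment_edges[1:]):
        # a) collect the curves whose edge lies in this radial segment
        curves = [c for r, c in ns_sfrs if lo <= r < hi]
        if not curves:
            per_segment.append(None)
            continue
        # b) keep the highest-performing curves (ranked here by their sum)
        curves.sort(key=sum, reverse=True)
        top = curves[:max(1, int(len(curves) * keep_fraction))]
        # average the kept curves frequency by frequency
        per_segment.append([sum(vals) / len(top) for vals in zip(*top)])

    used = [(w, c) for w, c in zip(weights, per_segment) if c is not None]
    if not used:
        raise ValueError("no NS-SFR falls inside any radial segment")
    total_w = sum(w for w, _ in used)
    n_freq = len(used[0][1])
    # weighted average of the per-segment estimates
    overall = [sum(w * c[k] for w, c in used) / total_w for k in range(n_freq)]
    return per_segment, overall


example = [(0.1, [1.0, 0.8]), (0.2, [1.0, 0.4]), (0.7, [1.0, 0.6])]
per_segment, overall = estimate_system_sfr(
    example, segment_edges=[0.0, 0.5, 1.0], weights=[1.0, 3.0], keep_fraction=0.5
)
```

    Segments are half-open intervals here, so a radial distance equal to the last edge would be excluded; a real implementation would need to decide how to treat that boundary.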

    Towards music perception by redundancy reduction and unsupervised learning in probabilistic models

    PhD thesis. The study of music perception lies at the intersection of several disciplines: perceptual psychology and cognitive science, musicology, psychoacoustics, and acoustical signal processing amongst others. Developments in perceptual theory over the last fifty years have emphasised an approach based on Shannon’s information theory and its basis in probabilistic systems, and in particular, the idea that perceptual systems in animals develop through a process of unsupervised learning in response to natural sensory stimulation, whereby the emerging computational structures are well adapted to the statistical structure of natural scenes. In turn, these ideas are being applied to problems in music perception. This thesis is an investigation of the principle of redundancy reduction through unsupervised learning, as applied to representations of sound and music. In the first part, previous work is reviewed, drawing on literature from some of the fields mentioned above, and an argument is presented in support of the idea that perception in general and music perception in particular can indeed be accommodated within a framework of unsupervised learning in probabilistic models. In the second part, two related methods are applied to two different low-level representations. Firstly, linear redundancy reduction (Independent Component Analysis) is applied to acoustic waveforms of speech and music. Secondly, the related method of sparse coding is applied to a spectral representation of polyphonic music, which proves to be enough both to recognise that the individual notes are the important structural elements, and to recover a rough transcription of the music. Finally, the concepts of distance and similarity are considered, drawing in ideas about noise, phase invariance, and topological maps.
Some ecologically and information theoretically motivated distance measures are suggested, and put into practice in a novel method, using multidimensional scaling (MDS), for visualising geometrically the dependency structure in a distributed representation.
    Engineering and Physical Science Research Counci

    Statistical models for natural sounds

    It is important to understand the rich structure of natural sounds in order to solve important tasks, like automatic speech recognition, and to understand auditory processing in the brain. This thesis takes a step in this direction by characterising the statistics of simple natural sounds. We focus on the statistics because perception often appears to depend on them, rather than on the raw waveform. For example, the perception of auditory textures, like running water, wind, fire and rain, depends on summary statistics, like the rate of falling rain droplets, rather than on the exact details of the physical source. In order to analyse the statistics of sounds accurately it is necessary to improve a number of traditional signal processing methods, including those for amplitude demodulation, time-frequency analysis, and sub-band demodulation. These estimation tasks are ill-posed and therefore it is natural to treat them as Bayesian inference problems. The new probabilistic versions of these methods have several advantages. For example, they perform more accurately on natural signals and are more robust to noise; they can also fill in missing sections of data and provide error bars. Furthermore, free parameters can be learned from the signal. Using these new algorithms we demonstrate that the energy, sparsity, modulation depth and modulation time-scale in each sub-band of a signal are critical statistics, together with the dependencies between the sub-band modulators. In order to validate this claim, a model containing co-modulated coloured noise carriers is shown to be capable of generating a range of realistic sounding auditory textures. Finally, we explore the connection between the statistics of natural sounds and perception. We demonstrate that inference in the model for auditory textures qualitatively replicates the primitive grouping rules that listeners use to understand simple acoustic scenes.
This suggests that the auditory system is optimised for the statistics of natural sounds.

    Sparse machine learning methods with applications in multivariate signal processing

    This thesis details theoretical and empirical work that draws from two main subject areas: Machine Learning (ML) and Digital Signal Processing (DSP). A unified general framework is given for the application of sparse machine learning methods to multivariate signal processing. In particular, methods that enforce sparsity will be employed for reasons of computational efficiency, regularisation, and compressibility. The methods presented can be seen as modular building blocks that can be applied to a variety of applications. Application-specific prior knowledge can be used in various ways, resulting in a flexible and powerful set of tools. The motivation for the methods is to be able to learn and generalise from a set of multivariate signals. In addition to testing on benchmark datasets, a series of empirical evaluations on real-world datasets were carried out. These included: the classification of musical genre from polyphonic audio files; a study of how the sampling rate in a digital radar can be reduced through the use of Compressed Sensing (CS); analysis of human perception of different modulations of musical key from Electroencephalography (EEG) recordings; and classification of genre of musical pieces to which a listener is attending from Magnetoencephalography (MEG) brain recordings. These applications demonstrate the efficacy of the framework and highlight interesting directions of future research.

    Beyond the pixels: learning and utilising video compression features for localisation of digital tampering.

    Video compression is pervasive in digital society. With rising usage of deep convolutional neural networks (CNNs) in the fields of computer vision, video analysis and video tampering detection, it is important to investigate how patterns invisible to human eyes may be influencing modern computer vision techniques and how they can be used advantageously. This work thoroughly explores how video compression influences the accuracy of CNNs and shows how optimal performance is achieved when compression levels in the training set closely match those of the test set. A novel method is then developed, using CNNs, to derive compression features directly from the pixels of video frames. It is then shown that these features can be readily used to detect inauthentic video content with good accuracy across multiple different video tampering techniques. Moreover, the ability to explain these features allows predictions to be made about their effectiveness against future tampering methods. The problem is motivated with a novel investigation into recent video manipulation methods, which shows that there is a consistent drive to produce convincing, photorealistic, manipulated or synthetic video. Humans, blind to the presence of video tampering, are also blind to the type of tampering. New detection techniques are required and, in order to compensate for human limitations, they should be broadly applicable to multiple tampering types. This thesis details the steps necessary to develop and evaluate such techniques.