Pedestrian Detection Algorithms using Shearlets
In this thesis, we investigate the applicability of the shearlet transform to the task of pedestrian detection. Due to its use in several emerging technologies, such as automated or autonomous vehicles, pedestrian detection has become a key research topic over the last decade. In this period, a wealth of different algorithms has been developed. According to the current results on the Caltech Pedestrian Detection Benchmark, these algorithms can be divided into two categories: first, methods that apply hand-crafted image features together with a classifier trained on those features; second, methods using Convolutional Neural Networks, in which features are learned during the training phase. We study how both types of approaches can be further improved by incorporating shearlets, a framework for image analysis with a comprehensive theoretical basis.
Sparse and Redundant Representations for Inverse Problems and Recognition
Sparse and redundant representation of data enables the
description of signals as linear combinations of a few atoms from
a dictionary. In this dissertation, we study applications of
sparse and redundant representations in inverse problems and
object recognition. Furthermore, we propose two novel imaging
modalities based on the recently introduced theory of Compressed
Sensing (CS).
This dissertation consists of four major parts. In the first part
of the dissertation, we study a new type of deconvolution
algorithm that is based on estimating the image from a shearlet
decomposition. Shearlets provide a multi-directional and
multi-scale decomposition that has been mathematically shown to
represent distributed discontinuities such as edges better than
traditional wavelets. We develop a deconvolution algorithm that
allows the approximate inverse operator to be controlled on a
multi-scale and multi-directional basis. Furthermore, we develop
a method that automatically determines the threshold values for
noise shrinkage at each scale and direction, without explicit
knowledge of the noise variance, using generalized cross
validation.
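The threshold-selection idea can be illustrated in a few lines. The following is a minimal sketch (plain soft thresholding on a generic coefficient vector, not the thesis's shearlet-domain implementation) in which the GCV score is minimized over a grid of candidate thresholds without any knowledge of the noise variance:

```python
import numpy as np

def soft_threshold(c, t):
    """Soft-shrink coefficients c by threshold t."""
    return np.sign(c) * np.maximum(np.abs(c) - t, 0.0)

def gcv_score(c, t):
    """GCV score for threshold t: normalised residual energy divided by
    the squared fraction of coefficients that the threshold zeroes out."""
    n = c.size
    shrunk = soft_threshold(c, t)
    n_zero = np.count_nonzero(shrunk == 0.0)
    if n_zero == 0:
        return np.inf
    return (np.sum((c - shrunk) ** 2) / n) / (n_zero / n) ** 2

def pick_threshold(c, candidates):
    """Choose the candidate threshold minimising the GCV score --
    no explicit knowledge of the noise variance is required."""
    return candidates[int(np.argmin([gcv_score(c, t) for t in candidates]))]

rng = np.random.default_rng(0)
coeffs = np.zeros(1000)
coeffs[:20] = 5.0                                   # a few large "signal" coefficients
noisy = coeffs + rng.normal(0.0, 0.5, size=1000)    # noise of unknown variance
t_star = pick_threshold(noisy, np.linspace(0.1, 3.0, 30))
```

In the thesis's setting, this selection would run independently on the coefficients of each scale and direction of the shearlet decomposition.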
In the second part of the dissertation, we study a reconstruction
method that recovers highly undersampled images assumed to have a
sparse representation in a gradient domain by using partial
measurement samples that are collected in the Fourier domain. Our
method makes use of a robust generalized Poisson solver that
greatly aids in achieving a significantly improved performance
over similar proposed methods. We demonstrate experimentally
that this new technique handles both random and restricted
sampling scenarios more flexibly than its competitors.
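As a hedged sketch of the integration step (a basic FFT Poisson solver under periodic boundary conditions, not the thesis's robust generalized solver), an image can be recovered from its gradient field up to an additive constant:

```python
import numpy as np

def poisson_integrate(gx, gy):
    """Recover an image (up to a constant) from its x/y gradient fields by
    solving the Poisson equation  laplacian(u) = div(g)  with an FFT-based
    solver under periodic boundary conditions."""
    h, w = gx.shape
    # Divergence via backward differences (periodic), matching forward-
    # difference gradients so that div(grad(u)) is the discrete Laplacian.
    div = (gx - np.roll(gx, 1, axis=1)) + (gy - np.roll(gy, 1, axis=0))
    # Eigenvalues of the periodic discrete Laplacian.
    fx = np.fft.fftfreq(w)
    fy = np.fft.fftfreq(h)
    denom = ((2 * np.cos(2 * np.pi * fx) - 2)[None, :]
             + (2 * np.cos(2 * np.pi * fy) - 2)[:, None])
    denom[0, 0] = 1.0          # avoid division by zero; DC term is free
    u_hat = np.fft.fft2(div) / denom
    u_hat[0, 0] = 0.0          # fix the free constant to zero mean
    return np.real(np.fft.ifft2(u_hat))

# Round-trip check on a smooth, periodic test image.
h, w = 64, 64
yy, xx = np.mgrid[0:h, 0:w]
img = np.sin(2 * np.pi * xx / w) * np.cos(2 * np.pi * yy / h)
gx = np.roll(img, -1, axis=1) - img   # forward differences, periodic
gy = np.roll(img, -1, axis=0) - img
rec = poisson_integrate(gx, gy)
```

For consistent (integrable) gradient fields this inversion is exact up to floating-point error; in the compressed sensing setting the gradients themselves come from the sparse reconstruction.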
In the third part of the dissertation, we introduce a novel
Synthetic Aperture Radar (SAR) imaging modality which can provide
a high-resolution map of the spatial distribution of targets and
terrain using significantly fewer transmitted and/or received
electromagnetic waveforms. We demonstrate that this new imaging
scheme requires no new hardware components and allows the
aperture to be compressed. It also offers many new applications
and advantages, including strong resistance to countermeasures
and interception, imaging of much wider swaths, and reduced
on-board storage requirements.
The last part of the dissertation deals with object recognition
based on learning dictionaries for simultaneous sparse signal
approximations and feature extraction. A dictionary is learned
for each object class from the given training examples by
minimizing the representation error under a sparseness constraint. A
novel test image is then projected onto the span of the atoms in
each learned dictionary. The residual vectors along with the
coefficients are then used for recognition. Applications to
illumination robust face recognition and automatic target
recognition are presented.
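The recognition rule can be sketched as follows; here, ordinary least-squares projection stands in for the sparse coding step, and the toy dictionaries are random rather than learned:

```python
import numpy as np

def class_residual(x, D):
    """Residual norm after projecting x onto the span of dictionary D
    (columns are atoms); least squares stands in for sparse coding."""
    coef, *_ = np.linalg.lstsq(D, x, rcond=None)
    return np.linalg.norm(x - D @ coef)

def classify(x, dictionaries):
    """Assign x to the class whose dictionary leaves the smallest residual."""
    return int(np.argmin([class_residual(x, D) for D in dictionaries]))

rng = np.random.default_rng(1)
# Two toy "class dictionaries" spanning different random 5-D subspaces.
D0 = rng.normal(size=(50, 5))
D1 = rng.normal(size=(50, 5))
x = D0 @ rng.normal(size=5) + 0.01 * rng.normal(size=50)  # near span(D0)
label = classify(x, [D0, D1])
```

In the dissertation the coefficients are additionally sparse and are used alongside the residuals for recognition; the residual comparison above captures only the core decision rule.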
Applied microlocal analysis of deep neural networks for inverse problems
Deep neural networks have recently shown state-of-the-art performance in various imaging tasks. For example, EfficientNet is currently the best image classifier on the ImageNet challenge. They are also very powerful for image reconstruction; for instance, deep learning currently yields the best methods for CT reconstruction. Most imaging problems, such as CT reconstruction, are ill-posed inverse problems and hence require regularization techniques, typically based on a-priori information. Moreover, owing to the human visual system, singularities such as edge-like features are the governing structures of images. This leads to the question of how to incorporate such information into a solver of an inverse problem in imaging, and how deep neural networks act on singularities. The main research theme of this thesis is to introduce theoretically founded approaches for using deep neural networks in combination with model-based methods to solve inverse problems from imaging science. We do this by thoroughly exploiting the singularity structure of images as a-priori information. We then develop a comprehensive analysis of how neural networks act on singularities, using predominantly methods from microlocal analysis.
To analyze the interaction of deep neural networks with singularities, we introduce a novel technique to compute the propagation of wavefront sets through convolutional residual neural networks (conv-ResNets). This is achieved in a two-fold manner: We first study the continuous case, where the neural network is defined on an infinite-dimensional continuous space. This problem is tackled by using the structure of these networks as a sequential application of continuous convolutional operators and ReLU non-linearities, and by applying microlocal analysis techniques to track the propagation of the wavefront set through the layers. This leads to the so-called \emph{microcanonical relation} that describes the propagation of the wavefront set under the action of such a neural network. Secondly, to study real-world discrete problems, we digitize the necessary microlocal analysis methods via the digital shearlet transform. The key idea is that the shearlet transform optimally represents Fourier integral operators, so that their discretization decays rapidly and admits a finite approximation. Fourier integral operators play an important role in microlocal analysis, since it is well known that they preserve the singularities of functions and, in addition, have a closed-form microcanonical relation. Based on the newly developed theoretical analysis, we also introduce a method that uses digital shearlet coefficients to compute the digital wavefront set of images with a convolutional neural network.
Our approach is then used for a similar analysis of the microlocal behavior of the learned primal-dual architecture, which is formed by a sequence of conv-ResNet blocks. This architecture has shown state-of-the-art performance in the regularization of inverse problems, in particular in computed tomography reconstruction related to the Radon transform. Since the Radon operator is a Fourier integral operator, our microlocal techniques can be applied, allowing us to study the propagation of singularities through this architecture with high precision.
Aiming to empirically analyze our theoretical approach, we focus on the reconstruction of X-ray tomographic data. We approach this problem using a task-adapted reconstruction framework, in which we combine the task of reconstruction with the task of computing the wavefront set of the original image as a-priori information. Our numerical results show superior performance with respect to current state-of-the-art tomographic reconstruction methods; hence we anticipate our work to also be a significant contribution to the biomedical imaging community.
The Incremental Multiresolution Matrix Factorization Algorithm
Multiresolution analysis and matrix factorization are foundational tools in
computer vision. In this work, we study the interface between these two
distinct topics and obtain techniques to uncover hierarchical block structure
in symmetric matrices -- an important aspect in the success of many vision
problems. Our new algorithm, the incremental multiresolution matrix
factorization, uncovers such structure one feature at a time, and hence scales
well to large matrices. We describe how this multiscale analysis goes much
farther than what a direct global factorization of the data can identify. We
evaluate the efficacy of the resulting factorizations for relative leveraging
within regression tasks using medical imaging data. We also use the
factorization on representations learned by popular deep networks, providing
evidence of their ability to infer semantic relationships even when they are
not explicitly trained to do so. We show that this algorithm can be used as an
exploratory tool to improve the network architecture, and within numerous other
settings in vision.
Comment: Computer Vision and Pattern Recognition (CVPR) 2017, 10 pages
Human Retina Based Identification System Using Gabor Filters and GDA Technique
A biometric authentication system provides automatic person
authentication based on characteristic features possessed by the
individual. Among all biometrics, the human retina is a secure
and reliable source of person recognition: it is unique,
universal, and, lying at the back of the eyeball, practically
unforgeable. The authentication process mainly includes
pre-processing, feature extraction, and then feature matching
and classification. Authentication systems are typically
operated in verification or identification mode, depending on
the specific application. In this paper, the pre-processing and
image enhancement stages involve several steps to highlight
interesting features in retinal images. The feature extraction
stage is accomplished using a bank of Gabor filters with a
number of orientations and scales. The Generalized Discriminant
Analysis (GDA) technique is used to reduce the size of the
feature vectors and to enhance the performance of the proposed
algorithm. Finally, classification is accomplished using a
k-nearest neighbor (KNN) classifier, which identifies the
genuine user or rejects a forged one, as the proposed method
operates in identification mode. The main contribution of this
paper is the use of GDA to address the "curse of dimensionality"
problem; GDA is a novel method in the area of retina recognition.
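A minimal sketch of the feature extraction and matching stages, under simplifying assumptions: hand-rolled Gabor kernels, plain mean/std response statistics, and no pre-processing or GDA dimensionality reduction step:

```python
import numpy as np

def gabor_kernel(theta, lam, sigma=3.0, size=15):
    """Real-valued Gabor kernel at orientation theta and wavelength lam."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    return np.exp(-(x**2 + y**2) / (2 * sigma**2)) * np.cos(2 * np.pi * xr / lam)

def gabor_features(img, thetas=(0.0, np.pi / 4, np.pi / 2, 3 * np.pi / 4),
                   lams=(4.0, 8.0)):
    """Mean/std of the responses of a small orientation-scale filter bank."""
    feats = []
    for theta in thetas:
        for lam in lams:
            k = gabor_kernel(theta, lam)
            # Circular convolution via FFT keeps the sketch dependency-free.
            resp = np.real(np.fft.ifft2(np.fft.fft2(img)
                                        * np.fft.fft2(k, s=img.shape)))
            feats += [resp.mean(), resp.std()]
    return np.array(feats)

def knn_predict(x, X_train, y_train, k=1):
    """Plain k-nearest-neighbour vote in feature space."""
    dist = np.linalg.norm(X_train - x, axis=1)
    votes = y_train[np.argsort(dist)[:k]]
    return int(np.bincount(votes).argmax())

# Toy "enrolment": two classes with different dominant orientations.
rng = np.random.default_rng(2)
def stripes(vertical):
    yy, xx = np.mgrid[0:32, 0:32]
    base = np.sin(2 * np.pi * (xx if vertical else yy) / 8.0)
    return base + 0.1 * rng.normal(size=(32, 32))

X_train = np.array([gabor_features(stripes(v))
                    for v in (True, True, False, False)])
y_train = np.array([0, 0, 1, 1])
pred = knn_predict(gabor_features(stripes(True)), X_train, y_train, k=1)
```

Real retinal images would of course replace the synthetic stripe patterns, with the GDA projection inserted between feature extraction and the KNN step.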
Box Spline Wavelet Frames for Image Edge Analysis
We present a new box spline wavelet frame and apply it to image edge analysis. The wavelet frame is constructed from a box spline with eight directions; it is tight and has seldom been used in applications. Owing to the eight different directions, it can locate edges of various types in considerable detail. In addition to step edges (local discontinuities in intensity), it is able to locate Dirac edges (momentary changes of intensity) and hidden edges (local discontinuities in intensity derivatives). The method is simple and robust to noise. Many numerical examples are presented to demonstrate its effectiveness, and quantitative and qualitative comparisons with other edge detection techniques show the advantages of this wavelet frame. Our test images include synthetic images with known ground truth as well as natural and medical images with rich geometric information.
Advanced Feature Learning and Representation in Image Processing for Anomaly Detection
Techniques for improving the information quality present in imagery for feature extraction are proposed in this thesis. Specifically, two methods are presented: soft feature extraction and improved Evolution-COnstructed (iECO) features. Soft features extract image-space knowledge by performing a per-pixel weighting based on an importance map; through soft features, one is able to extract features relevant to identifying a given object versus its background. Next, the iECO features framework is presented. It uses evolutionary computation algorithms to learn an optimal series of image transforms, specific to a given feature descriptor, that best extracts discriminative information. That is, a composition of image transforms is learned from training data to give a feature descriptor the best opportunity to extract its information for the application at hand. The proposed techniques are applied to an automatic explosive hazard detection application, where significant results are achieved.
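The iECO idea of learning a descriptor-specific composition of transforms can be sketched roughly as follows, with plain random search standing in for the evolutionary loop, a tiny hypothetical transform pool, and a simple class-separation fitness in place of the thesis's descriptor-specific objective:

```python
import numpy as np

# A tiny hypothetical pool of image transforms for the sketch.
TRANSFORMS = {
    "blur":   lambda im: (im + np.roll(im, 1, 0) + np.roll(im, -1, 0)
                          + np.roll(im, 1, 1) + np.roll(im, -1, 1)) / 5.0,
    "grad_x": lambda im: np.roll(im, -1, 1) - im,
    "square": lambda im: im ** 2,
    "log":    lambda im: np.log1p(np.abs(im)),
}

def apply_chain(chain, im):
    """Apply a composition of named transforms to an image."""
    for name in chain:
        im = TRANSFORMS[name](im)
    return im

def descriptor(im):
    """Stand-in feature descriptor: simple intensity statistics."""
    return np.array([im.mean(), im.std()])

def fitness(chain, images, labels):
    """Class-separation score: distance between the class means of the
    descriptor divided by the mean per-feature spread."""
    feats = np.array([descriptor(apply_chain(chain, im)) for im in images])
    mu0 = feats[labels == 0].mean(axis=0)
    mu1 = feats[labels == 1].mean(axis=0)
    spread = feats.std(axis=0).mean() + 1e-9
    return np.linalg.norm(mu0 - mu1) / spread

def random_search(images, labels, iters=50, seed=0):
    """Random search over short transform chains; the thesis uses an
    evolutionary algorithm for this step instead."""
    rng = np.random.default_rng(seed)
    names = list(TRANSFORMS)
    best_chain, best_fit = None, -np.inf
    for _ in range(iters):
        chain = list(rng.choice(names, size=rng.integers(1, 4)))
        f = fitness(chain, images, labels)
        if f > best_fit:
            best_chain, best_fit = chain, f
    return best_chain, best_fit

# Toy data: two classes differing in mean intensity.
rng = np.random.default_rng(3)
images = ([rng.normal(size=(16, 16)) for _ in range(4)]
          + [rng.normal(size=(16, 16)) + 2.0 for _ in range(4)])
labels = np.array([0] * 4 + [1] * 4)
best_chain, best_fit = random_search(images, labels)
```

The search keeps whichever composition best separates the classes for the chosen descriptor, mirroring the idea of tailoring the transform series to the feature extractor at hand.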