72 research outputs found
Backprojection for Training Feedforward Neural Networks in the Input and Feature Spaces
After the tremendous development of neural networks trained by
backpropagation, it is worthwhile to develop other algorithms for training
neural networks in order to gain more insight into networks. In this paper, we
propose a new algorithm for training feedforward neural networks which is
faster than backpropagation. This method is based on projection and
reconstruction: at every layer, the projected data and reconstructed labels
are forced to be similar, and the weights are tuned accordingly, layer by
layer. The proposed algorithm can be used in both the input and feature
spaces, named backprojection and kernel backprojection, respectively. This
algorithm offers insight into networks from a projection-based perspective.
Experiments on synthetic datasets show the effectiveness of the proposed
method.

Comment: Accepted (to appear) in International Conference on Image Analysis
and Recognition (ICIAR) 2020, Springer
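The layer-by-layer projection/reconstruction idea above can be sketched in code. This is an illustrative least-squares variant under assumed details (a hypothetical random map for back-projecting labels into each layer's space, and ReLU activations), not the paper's exact update rule:

```python
import numpy as np

def backprojection_sketch(X, Y, layer_dims, seed=0):
    """Layer-wise training sketch: at each layer, fit the projected data
    to the labels back-projected into that layer's output space, solving
    each weight matrix by least squares (illustrative assumption; the
    paper's actual backprojection updates may differ)."""
    rng = np.random.default_rng(seed)
    weights, H = [], X
    for d in layer_dims:
        # back-project labels with a hypothetical random linear map
        B = rng.standard_normal((Y.shape[1], d))
        target = Y @ B
        # tune this layer's weights so projected data matches the target
        W, *_ = np.linalg.lstsq(H, target, rcond=None)
        weights.append(W)
        H = np.maximum(H @ W, 0.0)  # ReLU activation, then next layer
    return weights, H
```

Because each layer is solved in closed form rather than by iterative gradient descent, a pass of this kind can be cheaper than backpropagation, which is the speed argument the abstract makes.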
Visual Scene Understanding by Deep Fisher Discriminant Learning
Modern deep learning has recently revolutionized
several fields of classic machine learning and computer vision,
such as scene understanding, natural language processing, and
machine translation. The substitution of hand-crafted features
with automatic feature learning provides an excellent
opportunity for gaining an in-depth understanding of large-scale
data statistics. Deep neural networks generally train models with
huge numbers of parameters, facilitating an efficient search of
highly non-convex objective functions for optimal and sub-optimal
solutions. On the other hand, Fisher discriminant analysis has
been widely employed to impose class discrepancy for the sake of
segmentation, classification, and recognition tasks. This thesis
bridges contemporary deep learning and classic
discriminant analysis to address some important challenges
in visual scene understanding, i.e., semantic segmentation,
texture classification, and object recognition. The aim is to
accomplish specific tasks in new high-dimensional spaces
covered by the statistical information of the datasets under
study. Inspired by a new formulation of Fisher discriminant
analysis, this thesis introduces novel arrangements of
well-known deep learning architectures to achieve better
performance on the targeted tasks. The theoretical
justifications are supported by a large body of experimental work
and consolidate the contribution of the proposed idea, Deep
Fisher Discriminant Learning, to several challenges in visual
scene understanding.
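The class discrepancy that Fisher discriminant analysis imposes can be made concrete with the classic Fisher criterion, the ratio of between-class to within-class scatter along a projection direction. A minimal sketch (the thesis's deep formulation builds on this ratio; function and variable names here are illustrative):

```python
import numpy as np

def fisher_criterion(X, y, w):
    """Fisher discriminant ratio J(w) = (w^T S_B w) / (w^T S_W w):
    large when classes are far apart (between-class scatter S_B) and
    compact (within-class scatter S_W) along direction w."""
    mu = X.mean(axis=0)
    S_B = np.zeros((X.shape[1], X.shape[1]))
    S_W = np.zeros_like(S_B)
    for c in np.unique(y):
        Xc = X[y == c]
        d = (Xc.mean(axis=0) - mu)[:, None]
        S_B += len(Xc) * (d @ d.T)                   # between-class scatter
        Dc = Xc - Xc.mean(axis=0)
        S_W += Dc.T @ Dc                             # within-class scatter
    return float(w @ S_B @ w) / float(w @ S_W @ w)
```

A direction that separates the classes well yields a much larger ratio than one that mixes them, which is the objective deep Fisher discriminant learning carries into neural network training.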
Deep Learning Meets Sparse Regularization: A Signal Processing Perspective
Deep learning has been wildly successful in practice and most
state-of-the-art machine learning methods are based on neural networks.
Lacking, however, is a rigorous mathematical theory that adequately explains
the amazing performance of deep neural networks. In this article, we present a
relatively new mathematical framework that provides the beginning of a deeper
understanding of deep learning. This framework precisely characterizes the
functional properties of neural networks that are trained to fit data. The
key mathematical tools which support this framework include transform-domain
sparse regularization, the Radon transform of computed tomography, and
approximation theory, which are all techniques deeply rooted in signal
processing. This framework explains the effect of weight decay regularization
in neural network training, the use of skip connections and low-rank weight
matrices in network architectures, the role of sparsity in neural networks, and
why neural networks can perform well in high-dimensional problems.
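One ingredient of this story can be illustrated with a toy numerical check. For a single ReLU unit x -> a*relu(w*x), the rescaling (a, w) -> (a/t, t*w) with t > 0 leaves the function unchanged, and minimizing the weight-decay penalty over this rescaling yields |a*w|, a sparsity-promoting "path norm". This sketch shows only that elementary connection, not the article's full Radon-transform machinery:

```python
import numpy as np

def min_rescaled_penalty(a, w, ts=np.linspace(0.1, 10.0, 100001)):
    """Minimum of the weight-decay penalty (a^2 + w^2) / 2 over the
    function-preserving rescaling (a, w) -> (a / t, t * w), found by a
    grid search over t.  By the AM-GM inequality this equals |a * w|."""
    return (((a / ts) ** 2 + (ts * w) ** 2) / 2).min()

# weight decay on a rescaling-invariant ReLU unit behaves like |a * w|
print(np.isclose(min_rescaled_penalty(3.0, 0.5), abs(3.0 * 0.5), atol=1e-4))
```

This is one reason weight decay, despite looking like plain L2 regularization, induces sparse solutions in ReLU networks.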
Data Reduction Algorithms in Machine Learning and Data Science
Raw data are usually required to be pre-processed for better representation or discrimination of classes. This pre-processing can be done by data reduction, i.e., either reduction in dimensionality or numerosity (cardinality). Dimensionality reduction can be used for feature extraction or data visualization. Numerosity reduction is useful for ranking data points or finding the most and least important data points. This thesis proposes several algorithms for data reduction, spanning both dimensionality and numerosity reduction, in machine learning and data science. Dimensionality reduction covers feature extraction and feature selection methods, while numerosity reduction includes prototype selection and prototype generation approaches. This thesis focuses on feature extraction and prototype selection for data reduction. Dimensionality reduction methods can be divided into three categories, i.e., spectral, probabilistic, and neural network-based methods. The spectral methods have a geometrical point of view and are mostly reduced to the generalized eigenvalue problem. Probabilistic and network-based methods have stochastic and information-theoretic foundations, respectively. Numerosity reduction methods can be divided into methods based on variance, geometry, and isolation.
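The claim that spectral methods mostly reduce to the generalized eigenvalue problem can be sketched concretely: given two scatter matrices A and B, the projection directions are the top eigenvectors of B^{-1}A. A minimal sketch (assuming B is invertible; the thesis's individual methods differ in how A and B are constructed):

```python
import numpy as np

def spectral_embedding_directions(A, B, k):
    """Solve the generalized eigenvalue problem A v = lambda B v via
    the eigendecomposition of B^{-1} A, returning the k eigenvectors
    with the largest eigenvalues as projection directions (a sketch;
    assumes B is invertible)."""
    vals, vecs = np.linalg.eig(np.linalg.solve(B, A))
    order = np.argsort(vals.real)[::-1]
    return vecs[:, order[:k]].real
```

For instance, with A the between-class scatter and B the within-class scatter, this recovers the directions of Fisher discriminant analysis, the starting point of several of the spectral methods proposed here.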
For dimensionality reduction, under the spectral category, I propose weighted Fisher discriminant analysis, Roweis discriminant analysis, and image quality aware embedding. I also propose quantile-quantile embedding as a probabilistic method where the distribution of the embedding is chosen by the user. Backprojection, Fisher losses, and dynamic triplet sampling using Bayesian updating are other proposed methods in the neural network-based category. Backprojection is for training shallow networks with a projection-based perspective in manifold learning. Two Fisher losses are proposed for training Siamese triplet networks for increasing and decreasing the inter- and intra-class variances, respectively. Two dynamic triplet mining methods, which are based on Bayesian updating to draw triplet samples stochastically, are proposed. For numerosity reduction, principal sample analysis and instance ranking by matrix decomposition are the proposed variance-based methods; these methods rank instances using inter-/intra-class variances and matrix factorization, respectively. Curvature anomaly detection, in which the points are assumed to be the vertices of a polyhedron, and isolation Mondrian forest are the proposed methods based on geometry and isolation, respectively.
To assess the proposed tools developed for data reduction, I apply them to some applications in medical image analysis, image processing, and computer vision. Data reduction, used as a pre-processing tool, has different applications because it provides various ways of feature extraction and prototype selection for applying to different types of data. Dimensionality reduction extracts informative features and prototype selection selects the most informative data instances. For example, for medical image analysis, I use Fisher losses and dynamic triplet sampling for embedding histopathology image patches and demonstrating how different the tumorous cancer tissue types are from the normal ones. I also propose offline/online triplet mining using extreme distances for this embedding. In image processing and computer vision applications, I propose Roweisfaces and Roweisposes for face recognition and 3D action recognition, respectively, using my proposed Roweis discriminant analysis method. I also introduce the concepts of anomaly landscape and anomaly path using the proposed curvature anomaly detection and use them to denoise images and video frames. I report extensive experiments, on different datasets, to show the effectiveness of the proposed algorithms. Through these experiments, I demonstrate that the proposed methods are useful for extracting informative features and instances for better accuracy, representation, prediction, class separation, data reduction, and embedding. I show that the proposed dimensionality reduction methods can extract informative features for better separation of classes. An example is obtaining an embedding space for separating cancer histopathology patches from normal patches, which helps hospitals diagnose cancers automatically and more easily.
I also show that the proposed numerosity reduction methods are useful for ranking data instances based on their importance and reducing data volumes without a significant drop in the performance of machine learning and data science algorithms.
A space-variant visual pathway model for data efficient deep learning
We present an investigation into adopting a model of the retino-cortical mapping, found in biological visual systems, to improve the efficiency of image analysis using Deep Convolutional Neural Nets (DCNNs) in the context of robot vision and egocentric perception systems. This work has now enabled DCNNs to process input images approaching one million pixels in size, in real time, using only consumer-grade graphics processor (GPU) hardware in a single pass of the DCNN.
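A common model of the retino-cortical mapping is log-polar sampling, which is dense near the fixation point and coarse in the periphery; the paper's exact transform may differ. A minimal sketch of such a sampling grid, showing how a large image reduces to a small, space-variant set of samples for the DCNN:

```python
import numpy as np

def logpolar_sample_grid(h, w, n_rings, n_wedges, r_min=1.0):
    """Log-polar sampling grid centred on the image (illustrative model
    of a retino-cortical mapping).  Ring radii grow geometrically from
    r_min to the image border, so an h x w image is reduced to only
    n_rings * n_wedges sample coordinates."""
    cy, cx = h / 2.0, w / 2.0
    r_max = min(cy, cx)
    rings = r_min * (r_max / r_min) ** (np.arange(n_rings) / (n_rings - 1))
    thetas = 2 * np.pi * np.arange(n_wedges) / n_wedges
    ys = cy + rings[:, None] * np.sin(thetas)[None, :]
    xs = cx + rings[:, None] * np.cos(thetas)[None, :]
    return ys, xs  # each of shape (n_rings, n_wedges)
```

For example, a 1000 x 1000 image (~1 megapixel) sampled with 32 rings and 64 wedges yields only 2048 samples, illustrating the data efficiency argument.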
Artificial Intelligence and Deep Learning for Advancing PET Image Reconstruction: State-of-the-Art and Future Directions
Positron emission tomography (PET) is vital for diagnosing diseases and monitoring treatments. Conventional image reconstruction (IR) techniques like filtered backprojection and iterative algorithms are powerful but face limitations. PET IR can be seen as an image-to-image translation. Artificial intelligence (AI) and deep learning (DL) using multilayer neural networks enable a new approach to this computer vision task. This review aims to provide mutual understanding for nuclear medicine professionals and AI researchers. We outline the fundamentals of PET imaging as well as the state of the art in AI-based PET IR with its typical algorithms and DL architectures. Advances improve resolution and contrast recovery, reduce noise, and remove artifacts via inferred attenuation and scatter correction, sinogram inpainting, denoising, and super-resolution refinement. Kernel priors support list-mode reconstruction, motion correction, and parametric imaging. Hybrid approaches combine AI with conventional IR. Challenges of AI-assisted PET IR include the availability of training data, cross-scanner compatibility, and the risk of hallucinated lesions. The need for rigorous evaluations, including quantitative phantom validation and visual comparison of diagnostic accuracy against conventional IR, is highlighted along with regulatory issues. The first approved AI-based applications are clinically available, and their impact is foreseeable. Emerging trends, such as the integration of multimodal imaging and the use of data from previous imaging visits, highlight future potential. Continued collaborative research promises significant improvements in image quality, quantitative accuracy, and diagnostic performance, ultimately leading to the integration of AI-based IR into routine PET imaging protocols.
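The conventional baseline the review contrasts with AI methods, filtered backprojection, can be sketched in a few lines: ramp-filter each projection in Fourier space, then smear it back across the image along its acquisition angle. A textbook-style sketch with nearest-neighbour backprojection (not any clinical implementation):

```python
import numpy as np

def filtered_backprojection(sinogram, angles_deg):
    """Minimal filtered backprojection: sinogram has shape
    (n_detectors, n_angles).  Each projection is multiplied by the
    ramp filter |f| in the Fourier domain, then backprojected onto an
    n_detectors x n_detectors grid (illustrative sketch only)."""
    n_det, _ = sinogram.shape
    ramp = np.abs(np.fft.fftfreq(n_det))              # ramp filter |f|
    filtered = np.real(np.fft.ifft(
        np.fft.fft(sinogram, axis=0) * ramp[:, None], axis=0))
    xs = np.arange(n_det) - n_det / 2.0
    X, Y = np.meshgrid(xs, xs)
    recon = np.zeros((n_det, n_det))
    for j, ang in enumerate(np.deg2rad(angles_deg)):
        # detector coordinate of each pixel at this angle
        t = X * np.cos(ang) + Y * np.sin(ang) + n_det / 2.0
        idx = np.clip(np.round(t).astype(int), 0, n_det - 1)
        recon += filtered[idx, j]
    return recon * np.pi / len(angles_deg)
```

The "limitations" the abstract mentions are visible even here: noise in the sinogram is amplified by the ramp filter, which is part of what motivates iterative and AI-based reconstruction.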
Applying neural networks for improving the MEG inverse solution
Magnetoencephalography (MEG) and electroencephalography (EEG) are appealing non-invasive methods for recording brain activity with high temporal resolution. However, locating the brain source currents from recordings picked up by the sensors on the scalp introduces an ill-posed inverse problem. The MEG inverse problem is one of the most difficult inverse problems in medical imaging. The current standard in approximating the MEG inverse problem is to use multiple distributed inverse solutions – namely dSPM, sLORETA, and L2 MNE – to estimate the source current distribution in the brain. This thesis investigates whether these inverse solutions can be "post-processed" by a neural network to provide improved accuracy on source locations.
Recently, deep neural networks have been used to approximate other ill-posed inverse medical imaging problems with accuracy comparable to current state-of-the-art inverse reconstruction algorithms. Neural networks are powerful tools for approximating problems with limited prior knowledge or problems that require high levels of abstraction. In this thesis, a special case of a deep convolutional network, the U-Net, is applied to approximate the MEG inverse problem using the standard inverse solutions (dSPM, sLORETA, and L2 MNE) as inputs.
The U-Net is capable of learning non-linear relationships between the inputs and producing predictions about the site of single-dipole activation with higher accuracy than the L2 minimum-norm based inverse solutions, as measured by the following resolution metrics: dipole localization error (DLE), spatial dispersion (SD), and overall amplitude (OA). The U-Net model is stable and performs better on the aforementioned resolution metrics than the inverse solutions on multi-dipole data previously unseen by the U-Net.
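Of the resolution metrics above, DLE is the most direct: the distance between the true dipole location and the source position with peak estimated activity. A minimal sketch under that standard definition (the thesis may compute it with additional conventions):

```python
import numpy as np

def dipole_localization_error(true_pos, est_activity, src_positions):
    """Dipole localization error (DLE): Euclidean distance between the
    true dipole position and the source-space position whose estimated
    activity has the largest magnitude."""
    peak = src_positions[np.argmax(np.abs(est_activity))]
    return float(np.linalg.norm(peak - true_pos))
```

Comparing this quantity for the U-Net output versus the dSPM/sLORETA/L2 MNE inputs is the kind of evaluation the thesis reports.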