72 research outputs found

    Backprojection for Training Feedforward Neural Networks in the Input and Feature Spaces

    After the tremendous development of neural networks trained by backpropagation, it is a good time to develop other algorithms for training neural networks to gain more insight into them. In this paper, we propose a new algorithm for training feedforward neural networks which is fairly faster than backpropagation. This method is based on projection and reconstruction where, at every layer, the projected data and reconstructed labels are forced to be similar and the weights are tuned accordingly, layer by layer. The proposed algorithm can be used for both input and feature spaces, named backprojection and kernel backprojection, respectively. This algorithm gives insight into networks from a projection-based perspective. The experiments on synthetic datasets show the effectiveness of the proposed method. Comment: Accepted (to appear) in International Conference on Image Analysis and Recognition (ICIAR) 2020, Springe

    Visual Scene Understanding by Deep Fisher Discriminant Learning

    Modern deep learning has recently revolutionized several fields of classic machine learning and computer vision, such as scene understanding, natural language processing and machine translation. The substitution of feature hand-crafting with automatic feature learning provides an excellent opportunity for gaining an in-depth understanding of large-scale data statistics. Deep neural networks generally train models with huge numbers of parameters, facilitating efficient search for optimal and sub-optimal spaces of highly non-convex objective functions. On the other hand, Fisher discriminant analysis has been widely employed to impose class discrepancy, for the sake of segmentation, classification, and recognition tasks. This thesis bridges contemporary deep learning and classic discriminant analysis to address some important challenges in visual scene understanding, i.e. semantic segmentation, texture classification, and object recognition. The aim is to accomplish specific tasks in some new high-dimensional spaces, covered by the statistical information of the datasets under study. Inspired by a new formulation of Fisher discriminant analysis, this thesis introduces some novel arrangements of well-known deep learning architectures to achieve better performance on the targeted tasks. The theoretical justifications are based upon a large body of experimental work, and consolidate the contribution of the proposed idea, Deep Fisher Discriminant Learning, to several challenges in visual scene understanding

    Deep Learning Meets Sparse Regularization: A Signal Processing Perspective

    Deep learning has been wildly successful in practice and most state-of-the-art machine learning methods are based on neural networks. Lacking, however, is a rigorous mathematical theory that adequately explains the amazing performance of deep neural networks. In this article, we present a relatively new mathematical framework that provides the beginning of a deeper understanding of deep learning. This framework precisely characterizes the functional properties of neural networks that are trained to fit data. The key mathematical tools which support this framework include transform-domain sparse regularization, the Radon transform of computed tomography, and approximation theory, which are all techniques deeply rooted in signal processing. This framework explains the effect of weight decay regularization in neural network training, the use of skip connections and low-rank weight matrices in network architectures, the role of sparsity in neural networks, and why neural networks can perform well in high-dimensional problems

    Data Reduction Algorithms in Machine Learning and Data Science

    Raw data are usually required to be pre-processed for better representation or discrimination of classes. This pre-processing can be done by data reduction, i.e., either reduction in dimensionality or numerosity (cardinality). Dimensionality reduction can be used for feature extraction or data visualization. Numerosity reduction is useful for ranking data points or finding the most and least important data points. This thesis proposes several algorithms for data reduction, known as dimensionality and numerosity reduction, in machine learning and data science. Dimensionality reduction tackles feature extraction and feature selection methods while numerosity reduction includes prototype selection and prototype generation approaches. This thesis focuses on feature extraction and prototype selection for data reduction. Dimensionality reduction methods can be divided into three categories, i.e., spectral, probabilistic, and neural network-based methods. The spectral methods have a geometrical point of view and are mostly reduced to the generalized eigenvalue problem. Probabilistic and network-based methods have stochastic and information theoretic foundations, respectively. Numerosity reduction methods can be divided into methods based on variance, geometry, and isolation. For dimensionality reduction, under the spectral category, I propose weighted Fisher discriminant analysis, Roweis discriminant analysis, and image quality aware embedding. I also propose quantile-quantile embedding as a probabilistic method where the distribution of embedding is chosen by the user. Backprojection, Fisher losses, and dynamic triplet sampling using Bayesian updating are other proposed methods in the neural network-based category. Backprojection is for training shallow networks with a projection-based perspective in manifold learning. Two Fisher losses are proposed for training Siamese triplet networks for increasing and decreasing the inter- and intra-class variances, respectively. 
Two dynamic triplet mining methods, which are based on Bayesian updating to draw triplet samples stochastically, are proposed. For numerosity reduction, principal sample analysis and instance ranking by matrix decomposition are the proposed variance-based methods; these methods rank instances using inter-/intra-class variances and matrix factorization, respectively. Curvature anomaly detection, in which the points are assumed to be the vertices of a polyhedron, and isolation Mondrian forest are the proposed methods based on geometry and isolation, respectively. To assess the proposed tools developed for data reduction, I apply them to some applications in medical image analysis, image processing, and computer vision. Data reduction, used as a pre-processing tool, has different applications because it provides various ways of feature extraction and prototype selection for applying to different types of data. Dimensionality reduction extracts informative features and prototype selection selects the most informative data instances. For example, for medical image analysis, I use Fisher losses and dynamic triplet sampling for embedding histopathology image patches and demonstrating how different the tumorous cancer tissue types are from the normal ones. I also propose offline/online triplet mining using extreme distances for this embedding. In image processing and computer vision applications, I propose Roweisfaces and Roweisposes for face recognition and 3D action recognition, respectively, using my proposed Roweis discriminant analysis method. I also introduce the concepts of anomaly landscape and anomaly path using the proposed curvature anomaly detection and use them to denoise images and video frames. I report extensive experiments, on different datasets, to show the effectiveness of the proposed algorithms. 
By experiments, I demonstrate that the proposed methods are useful for extracting informative features and instances for better accuracy, representation, prediction, class separation, data reduction, and embedding. I show that the proposed dimensionality reduction methods can extract informative features for better separation of classes. An example is obtaining an embedding space for separating cancer histopathology patches from the normal patches which helps hospitals diagnose cancers more easily in an automatic way. I also show that the proposed numerosity reduction methods are useful for ranking data instances based on their importance and reducing data volumes without a significant drop in performance of machine learning and data science algorithms

    A space-variant visual pathway model for data efficient deep learning

    We present an investigation into adopting a model of the retino-cortical mapping, found in biological visual systems, to improve the efficiency of image analysis using Deep Convolutional Neural Nets (DCNNs) in the context of robot vision and egocentric perception systems. This work has now enabled DCNNs to process input images approaching one million pixels in size, in real time, using only consumer grade graphics processor (GPU) hardware in a single pass of the DCNN

    Artificial Intelligence and Deep Learning for Advancing PET Image Reconstruction: State-of-the-Art and Future Directions

    Positron emission tomography (PET) is vital for diagnosing diseases and monitoring treatments. Conventional image reconstruction (IR) techniques like filtered backprojection and iterative algorithms are powerful but face limitations. PET IR can be seen as an image-to-image translation. Artificial intelligence (AI) and deep learning (DL) using multilayer neural networks enable a new approach to this computer vision task. This review aims to provide mutual understanding for nuclear medicine professionals and AI researchers. We outline fundamentals of PET imaging as well as the state of the art in AI-based PET IR with its typical algorithms and DL architectures. Advances improve resolution and contrast recovery, reduce noise, and remove artifacts via inferred attenuation and scatter correction, sinogram inpainting, denoising, and super-resolution refinement. Kernel-priors support list-mode reconstruction, motion correction, and parametric imaging. Hybrid approaches combine AI with conventional IR. Challenges of AI-assisted PET IR include availability of training data, cross-scanner compatibility, and the risk of hallucinated lesions. The need for rigorous evaluations, including quantitative phantom validation and visual comparison of diagnostic accuracy against conventional IR, is highlighted along with regulatory issues. First approved AI-based applications are clinically available, and their impact is foreseeable. Emerging trends, such as the integration of multimodal imaging and the use of data from previous imaging visits, highlight future potential. Continued collaborative research promises significant improvements in image quality, quantitative accuracy, and diagnostic performance, ultimately leading to the integration of AI-based IR into routine PET imaging protocols

    Applying neural networks for improving the MEG inverse solution

    Magnetoencephalography (MEG) and electroencephalography (EEG) are appealing non-invasive methods for recording brain activity with high temporal resolution. However, locating the brain source currents from recordings picked up by the sensors on the scalp introduces an ill-posed inverse problem. The MEG inverse problem is one of the most difficult inverse problems in medical imaging. The current standard in approximating the MEG inverse problem is to use multiple distributed inverse solutions – namely dSPM, sLORETA and L2 MNE – to estimate the source current distribution in the brain. This thesis investigates whether these inverse solutions can be "post-processed" by a neural network to provide improved accuracy on source locations. Recently, deep neural networks have been used to approximate other ill-posed inverse medical imaging problems with accuracy comparable to current state-of-the-art inverse reconstruction algorithms. Neural networks are powerful tools for approximating problems with limited prior knowledge or problems that require high levels of abstraction. In this thesis a special case of a deep convolutional network, the U-Net, is applied to approximate the MEG inverse problem using the standard inverse solutions (dSPM, sLORETA and L2 MNE) as inputs. The U-Net is capable of learning non-linear relationships between the inputs and producing predictions about the site of single-dipole activation with higher accuracy than the L2 minimum-norm based inverse solutions with the following resolution metrics: dipole localization error (DLE), spatial dispersion (SD) and overall amplitude (OA). The U-Net model is stable and performs better in the aforesaid resolution metrics than the inverse solutions with multi-dipole data previously unseen by the U-Net