
    Deep neural networks on graph signals for brain imaging analysis

    Brain imaging data such as EEG or MEG are high-dimensional spatiotemporal data often degraded by complex, non-Gaussian noise. For reliable analysis of brain imaging data, it is important to extract discriminative, low-dimensional intrinsic representations of the recorded data. This work proposes a new method to learn such low-dimensional representations from noise-degraded measurements. In particular, we propose a new deep neural network design that integrates graph information, such as brain connectivity, with fully-connected layers. Our work leverages efficient graph filter design using Chebyshev polynomials and recent work on convolutional networks for graph-structured data. The approach exploits the graph structure as prior side information, localized graph filters for feature extraction, and neural networks for high-capacity learning. Experiments on real MEG datasets show that our approach extracts more discriminative representations, leading to improved accuracy in a supervised classification task. Comment: Accepted by ICIP 2017.
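    A minimal sketch of the kind of Chebyshev-polynomial graph filtering layer described above, combined with fully-connected layers. This is not the authors' code: the class names, the rescaled Laplacian argument, and the assumption that the filter order K is at least 2 are illustrative.

```python
# Sketch: localized graph filtering via the Chebyshev recurrence, followed by an MLP.
import torch
import torch.nn as nn


class ChebGraphFilter(nn.Module):
    """Localized graph filter y = sum_k theta_k T_k(L_hat) x, with K >= 2."""

    def __init__(self, order_k, laplacian):
        super().__init__()
        # laplacian: rescaled graph Laplacian L_hat = 2L/lambda_max - I, built
        # from prior side information such as brain connectivity (shape N x N).
        self.register_buffer("L_hat", laplacian)
        self.theta = nn.Parameter(0.01 * torch.randn(order_k))

    def forward(self, x):                                # x: (batch, N)
        t_prev, t_curr = x, x @ self.L_hat.T             # T_0(L)x, T_1(L)x
        out = self.theta[0] * t_prev + self.theta[1] * t_curr
        for k in range(2, self.theta.numel()):           # Chebyshev recurrence
            t_next = 2 * (t_curr @ self.L_hat.T) - t_prev
            out = out + self.theta[k] * t_next
            t_prev, t_curr = t_curr, t_next
        return out


class GraphSignalClassifier(nn.Module):
    def __init__(self, num_nodes, order_k, laplacian, hidden, num_classes):
        super().__init__()
        self.filter = ChebGraphFilter(order_k, laplacian)
        self.mlp = nn.Sequential(
            nn.Linear(num_nodes, hidden), nn.ReLU(),
            nn.Linear(hidden, num_classes),
        )

    def forward(self, x):
        return self.mlp(torch.relu(self.filter(x)))
```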

    Experiments in Neural-Network Control of a Free-Flying Space Robot

    Four important generic issues are identified and addressed in some depth in this thesis as part of the development of an adaptive, neural-network-based control system for an experimental free-flying space robot prototype. The first issue concerns the importance of true system-level design of the control system; a new hybrid strategy is developed here, in depth, for the beneficial integration of neural networks into the total control system. A second important issue in neural network control concerns incorporating a priori knowledge into the neural network. In many applications it is possible to obtain a reasonably accurate controller using conventional means; if this prior information is used purposefully to provide a starting point for the optimizing capabilities of the neural network, it can yield much faster initial learning. As a step towards addressing this issue, a new generic Fully Connected Architecture (FCA) is developed for use with backpropagation. A third issue is that neural networks are commonly trained using a gradient-based optimization method such as backpropagation, but many real-world systems contain Discrete Valued Functions (DVFs) that do not permit gradient-based optimization; one example is the on-off thrusters that are common on spacecraft. A new technique is developed here that extends backpropagation learning for use with DVFs. The fourth issue is that the speed of adaptation is often a limiting factor in the implementation of a neural network control system; this issue is largely resolved in this research by drawing on the above contributions.

    Combining Deep Learning and Model-Based Methods for Robust Real-Time Semantic Landmark Detection

    Compared to abstract features, significant objects, so-called landmarks, are a more natural means for vehicle localization and navigation, especially in challenging unstructured environments. The major challenge is to recognize landmarks under varying lighting conditions and in changing environments (e.g., growing vegetation) while only having few training samples available. We propose a new method which leverages Deep Learning as well as model-based methods to overcome the need for a large data set. Using RGB images and light detection and ranging (LiDAR) point clouds, our approach combines the state-of-the-art classification performance of Convolutional Neural Networks (CNNs) with robust model-based methods by taking prior knowledge from previous time steps into account. Evaluations on a challenging real-world scenario, with trees and bushes as landmarks, show promising results over pure learning-based state-of-the-art 3D detectors, while being significantly faster. Comment: In 22nd International Conference on Information Fusion (FUSION), 2019.

    Deep Sinogram Completion with Image Prior for Metal Artifact Reduction in CT Images

    Computed tomography (CT) has been widely used for medical diagnosis, assessment, and therapy planning and guidance. In practice, however, CT images can be adversely affected by the presence of metallic objects, which lead to severe metal artifacts and can influence clinical diagnosis or dose calculation in radiation therapy. In this paper, we propose a generalizable framework for metal artifact reduction (MAR) that simultaneously leverages the advantages of image-domain and sinogram-domain MAR techniques. We formulate our framework as a sinogram completion problem and train a neural network (SinoNet) to restore the metal-affected projections. To improve the continuity of the completed projections at the boundary of the metal trace, and thus alleviate new artifacts in the reconstructed CT images, we train another neural network (PriorNet) to generate a good prior image to guide sinogram learning, and further design a novel residual sinogram learning strategy to effectively utilize the prior image information for better sinogram completion. The two networks are jointly trained in an end-to-end fashion with a differentiable forward projection (FP) operation, so that the prior image generation and deep sinogram completion procedures can benefit from each other. Finally, the artifact-reduced CT images are reconstructed from the completed sinogram using filtered back-projection (FBP). Extensive experiments on simulated and real artifact data demonstrate that our method produces superior artifact-reduced results while preserving anatomical structures, and outperforms other MAR methods. Comment: Accepted by IEEE Transactions on Medical Imaging.
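    A heavily simplified sketch of what such a joint training step could look like, assuming a differentiable forward_project operator and pre-defined prior_net and sino_net networks; the residual-in-trace combination and the L1 losses are assumptions for illustration, not the paper's exact scheme.

```python
# Sketch: prior-image generation and residual sinogram completion trained jointly
# through an assumed differentiable forward projector.
import torch
import torch.nn.functional as F


def mar_training_step(prior_net, sino_net, forward_project, opt,
                      corrupted_image, corrupted_sino, trace_mask,
                      gt_image, gt_sino):
    # 1) image-domain prior: predict an artifact-reduced image
    prior_image = prior_net(corrupted_image)
    prior_sino = forward_project(prior_image)           # differentiable FP

    # 2) residual sinogram completion inside the metal trace
    residual = sino_net(torch.cat([corrupted_sino, prior_sino, trace_mask], dim=1))
    completed = (corrupted_sino * (1 - trace_mask)
                 + (prior_sino + residual) * trace_mask)

    # 3) joint end-to-end loss on both domains (opt covers both networks)
    loss = (F.l1_loss(prior_image, gt_image)
            + F.l1_loss(completed * trace_mask, gt_sino * trace_mask))
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss
```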

    Online Structured Laplace Approximations For Overcoming Catastrophic Forgetting

    We introduce the Kronecker factored online Laplace approximation for overcoming catastrophic forgetting in neural networks. The method is grounded in a Bayesian online learning framework, where we recursively approximate the posterior after every task with a Gaussian, leading to a quadratic penalty on changes to the weights. The Laplace approximation requires calculating the Hessian around a mode, which is typically intractable for modern architectures. In order to make our method scalable, we leverage recent block-diagonal Kronecker factored approximations to the curvature. Our algorithm achieves over 90% test accuracy across a sequence of 50 instantiations of the permuted MNIST dataset, substantially outperforming related methods for overcoming catastrophic forgetting. Comment: 13 pages, 6 figures.
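    A minimal sketch of the underlying idea, using a diagonal squared-gradient curvature estimate in place of the paper's block-diagonal Kronecker factors purely for brevity; function and variable names are illustrative.

```python
# Sketch: recursive Laplace-style quadratic penalty on weight changes between tasks.
import torch


def laplace_penalty(model, prev_params, curvature, strength=1.0):
    """Quadratic penalty 0.5 * strength * sum_i c_i (w_i - w_i*)^2 around the previous mode."""
    penalty = 0.0
    for name, p in model.named_parameters():
        penalty = penalty + (curvature[name] * (p - prev_params[name]) ** 2).sum()
    return 0.5 * strength * penalty


def update_curvature(model, data_loader, loss_fn, curvature):
    """Accumulate a squared-gradient (Fisher-like) curvature estimate after a task."""
    for x, y in data_loader:
        model.zero_grad()
        loss_fn(model(x), y).backward()
        for name, p in model.named_parameters():
            if p.grad is not None:
                curvature[name] = curvature[name] + p.grad.detach() ** 2
    return curvature


# Training on each new task then adds laplace_penalty(...) to that task's loss,
# and the curvature is updated recursively after each task, so the Gaussian
# posterior from earlier tasks constrains later weight changes.
```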

    Meta-Learning by Adjusting Priors Based on Extended PAC-Bayes Theory

    In meta-learning, an agent extracts knowledge from observed tasks with the aim of facilitating learning of novel future tasks. Under the assumption that future tasks are 'related' to previous tasks, the accumulated knowledge should be learned in a way which captures the common structure across learned tasks, while allowing the learner sufficient flexibility to adapt to novel aspects of new tasks. We present a framework for meta-learning that is based on generalization error bounds, allowing us to extend various PAC-Bayes bounds to meta-learning. Learning takes place through the construction of a distribution over hypotheses based on the observed tasks, and its utilization for learning a new task. Thus, prior knowledge is incorporated by setting an experience-dependent prior for novel tasks. We develop a gradient-based algorithm which minimizes an objective function derived from the bounds and demonstrate its effectiveness numerically with deep neural networks. In addition to establishing the improved performance available through meta-learning, we demonstrate the intuitive way in which prior information is manifested at different levels of the network. Comment: Accepted to ICML 2018.
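    A minimal sketch of the flavor of objective such an approach minimizes, with factorized Gaussian posteriors per task and a learned, experience-dependent Gaussian prior shared across tasks; the weighting coefficients beta and gamma and the exact bound form are illustrative assumptions rather than the paper's bound.

```python
# Sketch: PAC-Bayes-style meta-objective over observed tasks.
import torch
from torch.distributions import Normal, kl_divergence


def meta_objective(task_losses, task_posteriors, shared_prior, hyper_prior,
                   beta=0.01, gamma=0.01):
    """task_losses: list of expected empirical losses, one per observed task.
    task_posteriors / shared_prior / hyper_prior: Normal distributions over
    the flattened network weights."""
    obj = 0.0
    for loss, q in zip(task_losses, task_posteriors):
        # per-task bound: empirical loss plus complexity relative to the learned prior
        obj = obj + loss + beta * kl_divergence(q, shared_prior).sum()
    # meta-level complexity: keep the learned prior close to the hyper-prior
    obj = obj + gamma * kl_divergence(shared_prior, hyper_prior).sum()
    return obj / len(task_losses)
```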

    A Dual-Dimer Method for Training Physics-Constrained Neural Networks with Minimax Architecture

    Data sparsity is a common issue in training machine learning tools such as neural networks for engineering and scientific applications, where experiments and simulations are expensive. Recently, physics-constrained neural networks (PCNNs) were developed to reduce the required amount of training data. However, in PCNNs the weights of the different losses from data and physical constraints are adjusted empirically. In this paper, a new physics-constrained neural network with the minimax architecture (PCNN-MM) is proposed so that the weights of the different losses can be adjusted systematically. Training the PCNN-MM amounts to searching for high-order saddle points of the objective function. A novel saddle point search algorithm called the Dual-Dimer method is developed. It is demonstrated that the Dual-Dimer method is computationally more efficient than the gradient descent ascent method for nonconvex-nonconcave functions and provides additional eigenvalue information to verify search results. A heat transfer example also shows that the convergence of PCNN-MMs is faster than that of traditional PCNNs. Comment: 34 pages, 5 figures, accepted by Neural Networks.
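    A minimal sketch of the minimax idea (minimize over network weights, maximize over the loss weights), using plain gradient descent-ascent instead of the paper's Dual-Dimer saddle-point search; the function signatures, softmax parameterization of the weights, and step sizes are illustrative assumptions.

```python
# Sketch: adversarial weighting of data and physics-constraint losses.
import torch


def train_pcnn_minimax(model, data_loss_fn, physics_loss_fn, steps=1000,
                       lr_theta=1e-3, lr_lambda=1e-2):
    # ascent variables: weights of the two loss terms, kept on the simplex
    logits = torch.zeros(2, requires_grad=True)
    opt_theta = torch.optim.Adam(model.parameters(), lr=lr_theta)

    for _ in range(steps):
        # descent step on the network parameters under the current weights
        lam = torch.softmax(logits, dim=0).detach()
        loss = lam[0] * data_loss_fn(model) + lam[1] * physics_loss_fn(model)
        opt_theta.zero_grad()
        loss.backward()
        opt_theta.step()

        # ascent step on the weights: shift weight toward the larger loss term
        lam = torch.softmax(logits, dim=0)
        weighted = (lam[0] * data_loss_fn(model).detach()
                    + lam[1] * physics_loss_fn(model).detach())
        (grad_logits,) = torch.autograd.grad(weighted, logits)
        with torch.no_grad():
            logits += lr_lambda * grad_logits
    return model
```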

    Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks

    We propose an algorithm for meta-learning that is model-agnostic, in the sense that it is compatible with any model trained with gradient descent and applicable to a variety of different learning problems, including classification, regression, and reinforcement learning. The goal of meta-learning is to train a model on a variety of learning tasks, such that it can solve new learning tasks using only a small number of training samples. In our approach, the parameters of the model are explicitly trained such that a small number of gradient steps with a small amount of training data from a new task will produce good generalization performance on that task. In effect, our method trains the model to be easy to fine-tune. We demonstrate that this approach leads to state-of-the-art performance on two few-shot image classification benchmarks, produces good results on few-shot regression, and accelerates fine-tuning for policy gradient reinforcement learning with neural network policies. Comment: ICML 2017. Code at https://github.com/cbfinn/maml, Videos of RL results at https://sites.google.com/view/maml, Blog post at http://bair.berkeley.edu/blog/2017/07/18/learning-to-learn
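    A minimal sketch of this style of meta-update, simplified to a single inner gradient step with fixed step sizes; the task format and model are assumptions (and it relies on torch.func from PyTorch 2.x), so see the linked repository for the authors' implementation.

```python
# Sketch: one meta-training step with inner adaptation and an outer update.
import torch


def maml_step(model, tasks, loss_fn, meta_opt, inner_lr=0.01):
    """tasks: list of (support_x, support_y, query_x, query_y) tuples."""
    meta_opt.zero_grad()
    meta_loss = 0.0
    for sx, sy, qx, qy in tasks:
        params = dict(model.named_parameters())
        # inner step: adapt the parameters on the task's support set
        support_loss = loss_fn(torch.func.functional_call(model, params, (sx,)), sy)
        grads = torch.autograd.grad(support_loss, tuple(params.values()),
                                    create_graph=True)   # keep graph for 2nd order
        adapted = {name: p - inner_lr * g
                   for (name, p), g in zip(params.items(), grads)}
        # outer loss: evaluate the adapted parameters on the query set
        meta_loss = meta_loss + loss_fn(
            torch.func.functional_call(model, adapted, (qx,)), qy)
    (meta_loss / len(tasks)).backward()
    meta_opt.step()
```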

    Continual Learning in Deep Neural Network by Using a Kalman Optimiser

    Learning new tasks sequentially, or adapting to new distributions, without forgetting previously learned knowledge is a key challenge for continual learning models. Most conventional deep learning models are not capable of learning new tasks sequentially within a single model without forgetting the previously learned ones. We address this issue by using a Kalman Optimiser. The Kalman Optimiser divides the neural network into two parts: a long-term and a short-term memory unit. The long-term memory unit is used to remember the learned tasks and the short-term memory unit is used to adapt to the new task. We evaluate our method on the MNIST, CIFAR10, and CIFAR100 datasets and compare our results with state-of-the-art baseline models. The results show that our approach enables the model to continually learn and adapt to new changes without forgetting the previously learned tasks. Comment: accepted by an ICML workshop.

    Noise Contrastive Priors for Functional Uncertainty

    Obtaining reliable uncertainty estimates of neural network predictions is a long-standing challenge. Bayesian neural networks have been proposed as a solution, but it remains an open question how to specify their prior. In particular, the common practice of an independent normal prior in weight space imposes relatively weak constraints on the function posterior, allowing it to generalize in unforeseen ways on inputs outside of the training distribution. We propose noise contrastive priors (NCPs) to obtain reliable uncertainty estimates. The key idea is to train the model to output high uncertainty for data points outside of the training distribution. NCPs do so using an input prior, which adds noise to the inputs of the current mini-batch, and an output prior, which is a wide distribution given these inputs. NCPs are compatible with any model that can output uncertainty estimates, are easy to scale, and yield reliable uncertainty estimates throughout training. Empirically, we show that NCPs prevent overfitting outside of the training distribution and result in uncertainty estimates that are useful for active learning. We demonstrate the scalability of our method on the flight delays data set, where we significantly improve upon previously published results. Comment: 12 pages, 6 figures.
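    A minimal sketch (not the authors' code) of a noise contrastive prior term for a network whose forward pass returns a Gaussian predictive mean and standard deviation; the noise scale, prior width, and weighting factor are illustrative assumptions.

```python
# Sketch: data likelihood term plus an uncertainty-encouraging term on noised inputs.
import torch
from torch.distributions import Normal, kl_divergence


def ncp_loss(model, x, y, input_noise=0.5, prior_std=1.0, ncp_weight=0.1):
    # data term: negative log-likelihood of the labels under the prediction
    mean, std = model(x)
    nll = -Normal(mean, std).log_prob(y).mean()

    # input prior: perturb the current mini-batch to obtain points that lie
    # slightly outside the training distribution
    x_ood = x + input_noise * torch.randn_like(x)

    # output prior: on the perturbed inputs, pull the predictive distribution
    # toward a wide distribution, i.e. encourage high uncertainty off-data
    mean_ood, std_ood = model(x_ood)
    wide_prior = Normal(y, prior_std * torch.ones_like(std_ood))
    ncp = kl_divergence(wide_prior, Normal(mean_ood, std_ood)).mean()

    return nll + ncp_weight * ncp
```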