Multiple Kernel Learning and Automatic Subspace Relevance Determination for High-dimensional Neuroimaging Data
Alzheimer's disease is a major cause of dementia. Its diagnosis requires
accurate biomarkers that are sensitive to disease stages. In this respect, we
regard probabilistic classification as a method of designing a probabilistic
biomarker for disease staging. Probabilistic biomarkers naturally support the
interpretation of decisions and evaluation of uncertainty associated with them.
In this paper, we obtain probabilistic biomarkers via Gaussian Processes.
Gaussian Processes enable probabilistic kernel machines that offer flexible
means to accomplish Multiple Kernel Learning. Exploiting this flexibility, we
propose a new variation of Automatic Relevance Determination and tackle the
challenges of high dimensionality through multiple kernels. Our research
results demonstrate that the Gaussian Process models are competitive with or
better than the well-known Support Vector Machine in terms of classification
performance even in the cases of single kernel learning. Extending the basic
scheme towards the Multiple Kernel Learning, we improve the efficacy of the
Gaussian Process models and their interpretability in terms of the known
anatomical correlates of the disease. For instance, the disease pathology starts in and around the hippocampus and entorhinal cortex. Through the use of Gaussian Processes and Multiple Kernel Learning, we automatically and efficiently identify the corresponding regions of the neuroimaging data. In addition to
their interpretability, our Gaussian Process models are competitive with recent
deep learning solutions under similar settings.
Comment: The material presented here is to promote the dissemination of scholarly and technical work in a timely fashion. Data in this article are from ADNI (adni.loni.usc.edu). As such, ADNI provided data but did not participate in writing of this report.
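As a hedged illustration of the general recipe (not the authors' exact model, which combines many region-specific kernels), the sketch below shows ARD-style relevance determination with a Gaussian process classifier in scikit-learn: per-feature length scales are learned by maximizing the marginal likelihood, and their inverses act as relevance scores. The data are synthetic stand-ins for voxel features.

```python
# Minimal ARD sketch: anisotropic RBF = one length scale per feature.
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))                    # stand-in for voxel features
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)    # only 2 features matter

kernel = ConstantKernel(1.0) * RBF(length_scale=np.ones(X.shape[1]))
gpc = GaussianProcessClassifier(kernel=kernel).fit(X, y)

# Inverse learned length scales act as relevance scores: large learned
# length scales mean the feature barely moves the kernel (irrelevant).
relevance = 1.0 / gpc.kernel_.k2.length_scale
print(relevance)                   # larger = more relevant
print(gpc.predict_proba(X[:3]))   # probabilistic "biomarker" output
```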
Liver segmentation in CT images using three dimensional to two dimensional fully convolutional network
The need for CT scan analysis is growing for pre-diagnosis and therapy of
abdominal organs. Automatic organ segmentation of abdominal CT scan can help
radiologists analyze the scans faster and segment organ images with fewer
errors. However, existing methods are not efficient enough to perform the segmentation process for victims of accidents and emergency situations. In this paper, we propose an efficient liver segmentation method based on our 3D-to-2D fully convolutional network (3D-2D-FCN). The segmented mask is enhanced by means of a conditional random field on the organ's border. Consequently, we segment a target liver in less than a minute with a Dice score of 93.52.
Comment: 5 pages, 2 figures
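The abstract does not spell out the architecture, so the following is only a hypothetical sketch of the 3D-to-2D idea: 3D convolutions aggregate context across neighboring CT slices, then the slice axis is collapsed so a 2D head predicts a per-pixel liver mask. All layer sizes here are invented.

```python
# Toy 3D-to-2D segmentation net (illustrative only, PyTorch).
import torch
import torch.nn as nn

class ToyFCN3Dto2D(nn.Module):
    def __init__(self, depth=5):
        super().__init__()
        self.enc3d = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv3d(16, 16, kernel_size=3, padding=1), nn.ReLU(),
        )
        # Collapse the slice axis: (depth, H, W) -> (H, W) feature map.
        self.squeeze = nn.Conv3d(16, 16, kernel_size=(depth, 1, 1))
        self.head2d = nn.Conv2d(16, 1, kernel_size=1)   # liver logits

    def forward(self, x):                 # x: (B, 1, depth, H, W)
        f = self.enc3d(x)
        f = self.squeeze(f).squeeze(2)    # (B, 16, H, W)
        return self.head2d(f)             # (B, 1, H, W)

logits = ToyFCN3Dto2D()(torch.randn(2, 1, 5, 64, 64))
print(logits.shape)  # torch.Size([2, 1, 64, 64])
```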
MMD GAN: Towards Deeper Understanding of Moment Matching Network
Generative moment matching network (GMMN) is a deep generative model that
differs from Generative Adversarial Network (GAN) by replacing the
discriminator in GAN with a two-sample test based on kernel maximum mean
discrepancy (MMD). Although some theoretical guarantees of MMD have been
studied, the empirical performance of GMMN is still not as competitive as that
of GAN on challenging and large benchmark datasets. The computational
efficiency of GMMN is also less desirable in comparison with GAN, partially due
to its requirement for a rather large batch size during training. In this
paper, we propose to improve both the model expressiveness of GMMN and its
computational efficiency by introducing adversarial kernel learning techniques,
as the replacement of a fixed Gaussian kernel in the original GMMN. The new
approach combines the key ideas in both GMMN and GAN, hence we name it MMD GAN.
The new distance measure in MMD GAN is a meaningful loss that enjoys the
advantage of weak topology and can be optimized via gradient descent with
relatively small batch sizes. In our evaluation on multiple benchmark datasets,
including MNIST, CIFAR-10, CelebA and LSUN, MMD GAN significantly outperforms GMMN and is competitive with other representative GAN works.
Comment: In the Proceedings of the Thirty-first Annual Conference on Neural Information Processing Systems (NIPS 2017)
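For reference, a minimal sketch of the kernel MMD statistic that GMMN and MMD GAN build on, with a fixed Gaussian kernel (MMD GAN's contribution is to learn the kernel adversarially; the bandwidth and data below are illustrative):

```python
# Biased (V-statistic) estimate of MMD^2 between two samples.
import torch

def gaussian_kernel(a, b, sigma=1.0):
    d2 = torch.cdist(a, b).pow(2)             # pairwise squared distances
    return torch.exp(-d2 / (2 * sigma ** 2))

def mmd2_biased(x, y, sigma=1.0):
    kxx = gaussian_kernel(x, x, sigma).mean()
    kyy = gaussian_kernel(y, y, sigma).mean()
    kxy = gaussian_kernel(x, y, sigma).mean()
    return kxx + kyy - 2 * kxy

x = torch.randn(128, 2)           # "real" samples
y = torch.randn(128, 2) + 1.0     # "generated" samples, shifted
print(mmd2_biased(x, y).item())   # clearly > 0: distributions differ
```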
Mapping Auto-context Decision Forests to Deep ConvNets for Semantic Segmentation
We consider the task of pixel-wise semantic segmentation given a small set of
labeled training images. Two of the most popular techniques to address this task are Decision Forests (DF) and Neural Networks (NN). In this work, we
explore the relationship between two special forms of these techniques: stacked
DFs (namely Auto-context) and deep Convolutional Neural Networks (ConvNet). Our
main contribution is to show that Auto-context can be mapped to a deep ConvNet
with novel architecture, and thereby trained end-to-end. This mapping can be
used as an initialization of a deep ConvNet, enabling training even in the face
of very limited amounts of training data. We also demonstrate an approximate
mapping back from the refined ConvNet to a second stacked DF, with improved
performance over the original. We experimentally verify that these mappings
outperform stacked DFs for two different applications in computer vision and
biology: Kinect-based body part labeling from depth images, and somite
segmentation in microscopy images of developing zebrafish. Finally, we revisit
the core mapping from a Decision Tree (DT) to a NN, and show that it is also
possible to map a fuzzy DT, with sigmoidal split decisions, to a NN. This
addresses multiple limitations of the previous mapping, and yields new insights into the popular Rectified Linear Unit (ReLU) and the more recently proposed concatenated ReLU (CReLU) activation functions.
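A minimal sketch of the fuzzy decision-tree-to-network mapping for a single split ("stump"), assuming the sigmoidal split form described in the abstract: the hard test x[j] < t becomes a sigmoid gate, making the tree differentiable, and a large sharpness s recovers the hard tree. All values below are illustrative.

```python
# One soft split node: output is a sigmoid-gated mix of the two leaves.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fuzzy_stump(x, j, t, leaf_left, leaf_right, s=10.0):
    p_right = sigmoid(s * (x[j] - t))   # soft version of "x[j] >= t"
    return (1 - p_right) * leaf_left + p_right * leaf_right

x = np.array([0.3, 1.7])
print(fuzzy_stump(x, j=0, t=0.5, leaf_left=-1.0, leaf_right=+1.0))           # soft
print(fuzzy_stump(x, j=0, t=0.5, leaf_left=-1.0, leaf_right=+1.0, s=1000.0)) # ~hard
```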
TensorFlow Distributions
The TensorFlow Distributions library implements a vision of probability
theory adapted to the modern deep-learning paradigm of end-to-end
differentiable computation. Building on two basic abstractions, it offers
flexible building blocks for probabilistic computation. Distributions provide
fast, numerically stable methods for generating samples and computing
statistics, e.g., log density. Bijectors provide composable volume-tracking
transformations with automatic caching. Together these enable modular
construction of high dimensional distributions and transformations not possible
with previous libraries (e.g., pixelCNNs, autoregressive flows, and reversible
residual networks). They are the workhorse behind deep probabilistic
programming systems like Edward and empower fast black-box inference in
probabilistic models built on deep-network components. TensorFlow Distributions
has proven an important part of the TensorFlow toolkit within Google and in the
broader deep learning community.
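The two abstractions compose as follows (a small example using the public tfp.distributions / tfp.bijectors API): a base Distribution pushed through a Bijector yields a new distribution whose samples and log densities account for the change of volume.

```python
# LogNormal built compositionally: push a Normal through Exp.
import tensorflow_probability as tfp

tfd = tfp.distributions
tfb = tfp.bijectors

log_normal = tfd.TransformedDistribution(
    distribution=tfd.Normal(loc=0.0, scale=1.0),
    bijector=tfb.Exp(),
)

samples = log_normal.sample(5)        # fast, numerically stable sampling
print(log_normal.log_prob(samples))   # density via inverse + log-det-Jacobian
```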
Fully Connected Deep Structured Networks
Convolutional neural networks with many layers have recently been shown to
achieve excellent results on many high-level tasks such as image
classification, object detection and more recently also semantic segmentation.
Particularly for semantic segmentation, a two-stage procedure is often
employed. Hereby, convolutional networks are trained to provide good local
pixel-wise features for the second step being traditionally a more global
graphical model. In this work we unify this two-stage process into a single
joint training algorithm. We demonstrate our method on the semantic image
segmentation task and show encouraging results on the challenging PASCAL VOC
2012 dataset.
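A heavily hedged sketch of the joint idea: a differentiable mean-field-style refinement appended to the CNN's unary predictions, so both stages train end to end. The average-pooling message and 1x1 compatibility transform below are simplistic stand-ins; the paper's pairwise terms are richer.

```python
# Toy joint CNN + mean-field refinement (illustrative only, PyTorch).
import torch
import torch.nn as nn
import torch.nn.functional as F

class UnaryPlusMeanField(nn.Module):
    def __init__(self, n_classes=21, iters=3):
        super().__init__()
        self.cnn = nn.Conv2d(3, n_classes, 3, padding=1)   # toy unary net
        self.compat = nn.Conv2d(n_classes, n_classes, 1)   # label compatibility
        self.iters = iters

    def forward(self, img):
        unary = self.cnn(img)
        q = unary
        for _ in range(self.iters):
            # Message passing: spatially smoothed class beliefs.
            msg = F.avg_pool2d(q.softmax(1), 3, stride=1, padding=1)
            q = unary - self.compat(msg)   # mean-field-style update
        return q

logits = UnaryPlusMeanField()(torch.randn(1, 3, 32, 32))
print(logits.shape)  # torch.Size([1, 21, 32, 32]) -- gradients flow end to end
```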
Probabilistic Programming with Gaussian Process Memoization
Gaussian Processes (GPs) are widely used tools in statistics, machine
learning, robotics, computer vision, and scientific computation. However,
despite their popularity, they can be difficult to apply; all but the simplest
classification or regression applications require specification and inference
over complex covariance functions that do not admit simple analytical
posteriors. This paper shows how to embed Gaussian processes in any
higher-order probabilistic programming language, using an idiom based on
memoization, and demonstrates its utility by implementing and extending classic
and state-of-the-art GP applications. The interface to Gaussian processes,
called gpmem, takes an arbitrary real-valued computational process as input and
returns a statistical emulator that automatically improves as the original
process is invoked and its input-output behavior is recorded. The flexibility
of gpmem is illustrated via three applications: (i) robust GP regression with
hierarchical hyper-parameter learning, (ii) discovering symbolic expressions
from time-series data by fully Bayesian structure learning over kernels
generated by a stochastic grammar, and (iii) a bandit formulation of Bayesian
optimization with automatic inference and action selection. All applications
share a single 50-line Python library and require fewer than 20 lines of
probabilistic code each.
Comment: 36 pages, 9 figures
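gpmem itself lives inside a higher-order probabilistic programming language, so the following is only a loose Python analogue of the memoization pattern, with an invented GPMemo class: each call to the wrapped process is recorded, and a GP emulator fit to the record predicts elsewhere.

```python
# Memoizing GP emulator sketch (hypothetical analogue of gpmem).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

class GPMemo:
    def __init__(self, f):
        self.f, self.X, self.y = f, [], []
        self.gp = GaussianProcessRegressor()

    def __call__(self, x):
        # Probe: run the real process, record the pair, refit the emulator.
        y = self.f(x)
        self.X.append([x]); self.y.append(y)
        self.gp.fit(self.X, self.y)
        return y

    def emulate(self, x):
        # Emulator: GP posterior mean and uncertainty at an unseen input.
        return self.gp.predict(np.array([[x]]), return_std=True)

memo = GPMemo(lambda x: np.sin(x))
for x in [0.0, 1.5, 3.0]:
    memo(x)                          # each call enriches the emulator
mean, std = memo.emulate(0.7)
print(mean, std)
```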
Kernel Mean Embedding of Distributions: A Review and Beyond
A Hilbert space embedding of a distribution---in short, a kernel mean
embedding---has recently emerged as a powerful tool for machine learning and
inference. The basic idea behind this framework is to map distributions into a
reproducing kernel Hilbert space (RKHS) in which the whole arsenal of kernel
methods can be extended to probability measures. It can be viewed as a
generalization of the original "feature map" common to support vector machines
(SVMs) and other kernel methods. While initially closely associated with the
latter, it has meanwhile found application in fields ranging from kernel
machines and probabilistic modeling to statistical inference, causal discovery,
and deep learning. The goal of this survey is to give a comprehensive review of
existing work and recent advances in this research area, and to discuss the
most challenging issues and open problems that could lead to new research
directions. The survey begins with a brief introduction to the RKHS and positive definite kernels, which form the backbone of this survey, followed by
a thorough discussion of the Hilbert space embedding of marginal distributions,
theoretical guarantees, and a review of its applications. The embedding of
distributions enables us to apply RKHS methods to probability measures which
prompts a wide range of applications such as kernel two-sample testing, independence testing, and learning on distributional data. Next, we discuss the
Hilbert space embedding for conditional distributions, give theoretical
insights, and review some applications. The conditional mean embedding enables
us to perform sum, product, and Bayes' rules---which are ubiquitous in
graphical models, probabilistic inference, and reinforcement learning---in a
non-parametric way. We then discuss relationships between this framework and
other related areas. Lastly, we give some suggestions on future research
directions.
Comment: 147 pages; this is a version of the manuscript after the review process
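For reference, the central definitions (standard, as surveyed): the kernel mean embedding of a distribution P, its empirical estimate from a sample x_1, ..., x_n ~ P, and the MMD it induces, for a kernel k with RKHS H:

```latex
\mu_P \;=\; \mathbb{E}_{x \sim P}\big[k(\cdot, x)\big] \in \mathcal{H},
\qquad
\hat{\mu}_P \;=\; \frac{1}{n} \sum_{i=1}^{n} k(\cdot, x_i),
\qquad
\mathrm{MMD}(P, Q) \;=\; \lVert \mu_P - \mu_Q \rVert_{\mathcal{H}}.
```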
Bayesian Convolutional Neural Networks with Bernoulli Approximate Variational Inference
Convolutional neural networks (CNNs) work well on large datasets. But
labelled data is hard to collect, and in some applications larger amounts of
data are not available. The problem then is how to use CNNs with small data --
as CNNs overfit quickly. We present an efficient Bayesian CNN, offering better robustness to over-fitting on small data than traditional approaches. We achieve this by placing a probability distribution over the CNN's kernels. We approximate
our model's intractable posterior with Bernoulli variational distributions,
requiring no additional model parameters.
On the theoretical side, we cast dropout network training as approximate
inference in Bayesian neural networks. This allows us to implement our model
using existing tools in deep learning with no increase in time complexity,
while highlighting a negative result in the field. We show a considerable
improvement in classification accuracy compared to standard techniques and
improve on published state-of-the-art results for CIFAR-10.
Comment: 12 pages, 3 figures, ICLR format, updated with reviewer comments
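A minimal sketch of the dropout-as-approximate-inference recipe the paper builds on (shown on a toy fully connected classifier; the paper applies Bernoulli dropout to the CNN's kernels): keep dropout stochastic at test time and average many forward passes.

```python
# Monte Carlo dropout: stochastic forward passes give a predictive
# distribution instead of a point estimate.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(10, 64), nn.ReLU(),
                    nn.Dropout(p=0.5), nn.Linear(64, 3))

x = torch.randn(1, 10)
net.train()                      # keep dropout active at test time
with torch.no_grad():
    probs = torch.stack([net(x).softmax(-1) for _ in range(100)])
print(probs.mean(0))             # predictive mean
print(probs.std(0))              # per-class uncertainty
```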
Adaptive Sampled Softmax with Kernel Based Sampling
Softmax is the most commonly used output function for multiclass problems and
is widely used in areas such as vision, natural language processing, and
recommendation. A softmax model has linear costs in the number of classes, which makes it too expensive for many real-world problems. A common approach to speed
up training involves sampling only some of the classes at each training step.
It is known that this method is biased and that the bias increases the more the
sampling distribution deviates from the output distribution. Nevertheless, almost all recent work uses simple sampling distributions that require a large sample size to mitigate the bias. In this work, we propose a new class of
kernel based sampling methods and develop an efficient sampling algorithm.
Kernel based sampling adapts to the model as it is trained, thus resulting in
low bias. Kernel based sampling can be easily applied to many models because it
relies only on the model's last hidden layer. We empirically study the
trade-off of bias, sampling distribution and sample size and show that kernel
based sampling results in low bias with few samples.
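A hedged sketch of the mechanism on a toy softmax layer, assuming a quadratic kernel of the last hidden layer as the proposal (the constants and shapes below are invented): classes are sampled proportionally to the kernel score, and sampled logits are corrected by log q as in standard sampled softmax.

```python
# Kernel-based sampled softmax sketch (illustrative constants).
import numpy as np
from scipy.special import logsumexp

rng = np.random.default_rng(0)
n_classes, dim, n_sampled = 1000, 32, 50
W = rng.normal(size=(n_classes, dim))   # output class embeddings
h = rng.normal(size=dim)                # last hidden layer
target = 7

logits = W @ h
q = logits ** 2 + 1.0                   # quadratic kernel score k(h, w_c)
q /= q.sum()                            # proposal distribution over classes

# Sample negatives from q (a sketch: a duplicate of the target among the
# negatives is possible and simply ignored here).
neg = rng.choice(n_classes, size=n_sampled, replace=False, p=q)
cand = np.concatenate(([target], neg))
corrected = logits[cand] - np.log(q[cand])   # standard logit correction
loss = logsumexp(corrected) - corrected[0]   # estimate of full softmax loss
print(loss)
```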