
    Wasserstein Barycenter Model Ensembling

    In this paper we propose to perform model ensembling in a multiclass or a multilabel learning setting using Wasserstein (W.) barycenters. Optimal transport metrics, such as the Wasserstein distance, allow incorporating semantic side information such as word embeddings. Using W. barycenters to find the consensus between models allows us to balance confidence and semantics in finding the agreement between models. We show applications of Wasserstein ensembling in attribute-based classification, multilabel learning, and image caption generation. These results show that Wasserstein ensembling is a viable alternative to basic geometric or arithmetic mean ensembling. Comment: ICLR 2019.
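    The consensus step reduces to computing a barycenter of the models' output histograms under a ground cost between classes. A minimal sketch of that computation with the POT library's entropic barycenter follows; the class embeddings, regularisation strength, and toy probabilities are illustrative assumptions, not the authors' setup.

```python
# Sketch, not the authors' code: Wasserstein-barycenter ensembling of
# class-probability vectors with POT (pip install POT).
import numpy as np
import ot

rng = np.random.default_rng(0)
n_classes, n_models = 5, 3

# Assumed stand-in for semantic side information (e.g. word embeddings);
# the ground cost between classes is derived from it.
class_embeddings = rng.normal(size=(n_classes, 8))
M = ot.dist(class_embeddings, class_embeddings)  # squared Euclidean costs
M /= M.max()                                     # normalise for stability

# Softmax outputs of the individual models, one histogram per column.
probs = rng.dirichlet(np.ones(n_classes), size=n_models).T  # (n_classes, n_models)

# Entropic-regularised Wasserstein barycenter as the ensemble consensus.
consensus = ot.bregman.barycenter(probs, M, reg=1e-1)
print(consensus, consensus.sum())  # a probability vector over the classes
```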

    Effective learning is accompanied by high dimensional and efficient representations of neural activity

    A fundamental cognitive process is the ability to map value and identity onto objects as we learn about them. Exactly how such mental constructs emerge, and what kind of space best embeds this mapping, remains incompletely understood. Here we develop tools to quantify the space and organization of such a mapping, thereby providing a framework for studying the geometric representations of neural responses as reflected in functional MRI. Considering how human subjects learn the values of novel objects, we show that quick learners have a higher dimensional geometric representation than slow learners, and hence more easily distinguishable whole-brain responses to objects of different value. Furthermore, we find that quick learners display a more compact embedding of their neural responses and hence have a higher ratio of their task-based dimension to their embedding dimension -- consistent with a greater efficiency of cognitive coding. Lastly, we investigate the neurophysiological drivers of high dimensional patterns at both regional and voxel levels, and we complete our study with a complementary test of the distinguishability of associated whole-brain responses. Our results demonstrate a spatial organization of neural responses characteristic of learning, and offer a suite of geometric measures applicable to the study of efficient coding in higher-order cognitive processes more broadly.
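    As a concrete illustration of what "dimensionality of responses" can mean, the sketch below computes the participation ratio of a trial-by-voxel response matrix, one standard dimensionality estimate; the paper's actual geometric measures differ, and the data here are synthetic.

```python
# Sketch: participation ratio of a response matrix, a common dimensionality
# estimate (not the paper's exact measure). Synthetic stand-in data.
import numpy as np

def participation_ratio(X):
    """X: (n_trials, n_voxels). Returns (sum eig)^2 / sum(eig^2)."""
    eig = np.clip(np.linalg.eigvalsh(np.cov(X, rowvar=False)), 0, None)
    return eig.sum() ** 2 / (eig ** 2).sum()

rng = np.random.default_rng(1)
quick = rng.normal(size=(60, 40))                            # ~full-rank responses
slow = rng.normal(size=(60, 2)) @ rng.normal(size=(2, 40))   # rank-2 responses
print(participation_ratio(quick), participation_ratio(slow)) # high vs. low dimension
```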

    Differential geometric regularization for supervised learning of classifiers

    We study the problem of supervised learning for both binary and multiclass classification from a unified geometric perspective. In particular, we propose a geometric regularization technique to find the submanifold corresponding to an estimator of the class probability P(y|\vec x). The regularization term measures the volume of this submanifold, based on the intuition that overfitting produces rapid local oscillations and hence large volume of the estimator. This technique can be applied to regularize any classification function that satisfies two requirements: firstly, an estimator of the class probability can be obtained; secondly, first and second derivatives of the class probability estimator can be calculated. In experiments, we apply our regularization technique to standard loss functions for classification; our RBF-based implementation compares favorably to widely used regularization methods for both binary and multiclass classification. Published version: http://proceedings.mlr.press/v48/baia16.pdf
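    The volume intuition can be made concrete: for an estimator whose graph is a submanifold, the local volume element is sqrt(det(I + J^T J)), with J the Jacobian of the class-probability map. A toy PyTorch penalty along those lines is sketched below; the network, weighting, and data are assumptions, not the paper's implementation.

```python
# Sketch of the graph-volume intuition in PyTorch: penalise the volume
# element sqrt(det(I + J^T J)) of the class-probability estimator's graph.
import torch

net = torch.nn.Sequential(torch.nn.Linear(2, 16), torch.nn.Tanh(),
                          torch.nn.Linear(16, 3), torch.nn.Softmax(dim=-1))

def volume_element(x):
    # J: Jacobian of the probability map at one point, shape (n_classes, d).
    J = torch.autograd.functional.jacobian(net, x, create_graph=True)
    G = torch.eye(x.shape[0]) + J.T @ J  # induced metric on the graph
    return torch.sqrt(torch.det(G))     # local volume of the submanifold

x = torch.randn(32, 2)
y = torch.randint(0, 3, (32,))
penalty = torch.stack([volume_element(xi) for xi in x]).mean()
loss = torch.nn.functional.nll_loss(net(x).log(), y) + 1e-2 * penalty
loss.backward()  # the volume term regularises the fit
```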

    Curvature-based Comparison of Two Neural Networks

    In this paper we show the similarities and differences of two deep neural networks by comparing the manifolds formed by the activation vectors in each of their fully connected layers. The main contributions of this paper include 1) a new data-generating algorithm, which is crucial for determining the dimension of the manifolds; and 2) a systematic strategy for comparing manifolds. In particular, we take Riemann curvature and sectional curvature as part of the criterion, as they reflect the intrinsic geometric properties of the manifolds. Some interesting results and phenomena are presented, which help in specifying the similarities and differences between the features extracted by the two networks and in demystifying the intrinsic mechanisms of deep neural networks.
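    A minimal sketch of the first ingredient of such a comparison -- estimating the local dimension of a layer's activation manifold from nearest-neighbour patches -- is given below. The curvature computations in the paper go further; the patch size, variance threshold, and random activations here are all illustrative assumptions.

```python
# Sketch: estimate local manifold dimension via PCA on k-nearest-neighbour
# patches of activation vectors. Illustrative first step only; random data.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import NearestNeighbors

def local_dims(acts, k=20, var=0.95):
    """acts: (n_samples, n_units) activation vectors from one layer."""
    _, idx = NearestNeighbors(n_neighbors=k).fit(acts).kneighbors(acts)
    cum = [np.cumsum(PCA().fit(acts[p]).explained_variance_ratio_) for p in idx]
    return np.array([np.searchsorted(c, var) + 1 for c in cum])

# Compare two networks by the distribution of local manifold dimensions.
acts_net1 = np.random.randn(500, 64)  # stand-ins for fully connected activations
acts_net2 = np.random.randn(500, 64)
print(local_dims(acts_net1).mean(), local_dims(acts_net2).mean())
```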

    Geometric Losses for Distributional Learning

    Building upon recent advances in entropy-regularized optimal transport, and upon Fenchel duality between measures and continuous functions, we propose a generalization of the logistic loss that incorporates a metric or cost between classes. Unlike previous attempts to use optimal transport distances for learning, our loss results in unconstrained convex objective functions, supports infinite (or very large) class spaces, and naturally defines a geometric generalization of the softmax operator. The geometric properties of this loss make it suitable for predicting sparse and singular distributions, for instance those supported on curves or hyper-surfaces. We study the theoretical properties of our loss and showcase its effectiveness on two applications: ordinal regression and drawing generation.
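    To see why a metric between classes matters, the sketch below scores two predictions against the same target with an entropy-regularised OT cost via POT's sinkhorn2. This is the generic OT cost the paper departs from, not the proposed loss itself, and the ordinal class geometry is a made-up example.

```python
# Sketch: entropy-regularised OT cost between class distributions with POT.
# Generic OT cost for illustration; NOT the paper's proposed loss.
import numpy as np
import ot

n_classes = 4
# Assumed ordinal geometry: class i sits at position i on a line.
pos = np.arange(n_classes, dtype=float).reshape(-1, 1)
M = ot.dist(pos, pos)
M /= M.max()

target = np.array([0.01, 0.01, 0.97, 0.01])  # (smoothed) ground truth: class 2
near = np.array([0.01, 0.88, 0.10, 0.01])    # mass adjacent to the true class
far = np.array([0.88, 0.01, 0.10, 0.01])     # same mass on a distant class

# Cross-entropy treats `near` and `far` identically (same mass on class 2);
# the OT cost is smaller when mass sits on classes close to the target.
print(ot.sinkhorn2(near, target, M, reg=1e-1))
print(ot.sinkhorn2(far, target, M, reg=1e-1))
```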

    How Generative Adversarial Networks and Their Variants Work: An Overview

    Generative Adversarial Networks (GANs) have received wide attention in the machine learning field for their potential to learn high-dimensional, complex real-world data distributions. Specifically, they do not rely on any assumptions about the distribution and can generate realistic samples from latent space in a simple manner. This powerful property has led GANs to be applied to various applications such as image synthesis, image attribute editing, image translation, domain adaptation, and other fields. In this paper, we aim to discuss the details of GANs for readers who are familiar with them but do not comprehend them deeply, or who wish to view them from various perspectives. In addition, we explain how GANs operate and the fundamental meaning of the various objective functions that have been suggested recently. We then focus on how GANs can be combined with an autoencoder framework. Finally, we enumerate the GAN variants that are applied to various tasks and other fields, for those who are interested in exploiting GANs for their research. Comment: 41 pages, 16 figures; published in ACM Computing Surveys (CSUR).
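    For readers who want the mechanics in code, a minimal non-saturating GAN loop on a toy 1-D distribution is sketched below; the architectures, learning rates, and target distribution are arbitrary illustrative choices, not anything from the survey.

```python
# Sketch: minimal non-saturating GAN on a toy 1-D distribution (PyTorch).
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

for step in range(2000):
    real = torch.randn(64, 1) * 0.5 + 3.0  # "real" data ~ N(3, 0.5^2)
    fake = G(torch.randn(64, 8))           # samples from latent space

    # Discriminator step: separate real from generated samples.
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step (non-saturating loss): make D call fakes real.
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

print(G(torch.randn(512, 8)).mean())  # should drift toward 3.0
```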

    PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space

    Few prior works study deep learning on point sets. PointNet by Qi et al. is a pioneer in this direction. However, by design PointNet does not capture local structures induced by the metric space in which the points live, limiting its ability to recognize fine-grained patterns and to generalize to complex scenes. In this work, we introduce a hierarchical neural network that applies PointNet recursively on a nested partitioning of the input point set. By exploiting metric space distances, our network is able to learn local features at increasing contextual scales. Observing further that point sets are usually sampled with varying densities, which greatly degrades the performance of networks trained on uniform densities, we propose novel set learning layers that adaptively combine features from multiple scales. Experiments show that our network, called PointNet++, is able to learn deep point set features efficiently and robustly. In particular, results significantly better than the state of the art have been obtained on challenging benchmarks of 3D point clouds.
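    Two operations underpin the nested partitioning: farthest point sampling to pick centroids, and ball query to group neighbours within a metric-space radius. A simplified NumPy sketch of both follows; it omits the shared PointNet applied to each group and all efficiency concerns.

```python
# Sketch of PointNet++-style sampling/grouping stages in NumPy, heavily
# simplified: no shared PointNet per group, no feature propagation.
import numpy as np

def farthest_point_sampling(pts, n_centroids):
    """pts: (N, 3). Greedily pick points that are mutually far apart."""
    idx = [0]
    dist = np.linalg.norm(pts - pts[0], axis=1)
    for _ in range(n_centroids - 1):
        idx.append(int(dist.argmax()))
        dist = np.minimum(dist, np.linalg.norm(pts - pts[idx[-1]], axis=1))
    return np.array(idx)

def ball_query(pts, centroids, radius, k):
    """Group up to k points within `radius` of each centroid."""
    return [np.where(np.linalg.norm(pts - c, axis=1) < radius)[0][:k]
            for c in centroids]

pts = np.random.rand(1024, 3)
centers = pts[farthest_point_sampling(pts, 64)]
groups = ball_query(pts, centers, radius=0.2, k=32)
# Each group would be fed through a shared PointNet to yield one feature per
# centroid, and the procedure is applied recursively to build the hierarchy.
```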

    Recent Advances in Transfer Learning for Cross-Dataset Visual Recognition: A Problem-Oriented Perspective

    This paper takes a problem-oriented perspective and presents a comprehensive review of transfer learning methods, both shallow and deep, for cross-dataset visual recognition. Specifically, it categorises cross-dataset recognition into seventeen problems based on a set of carefully chosen data and label attributes. Such a problem-oriented taxonomy has allowed us to examine how different transfer learning approaches tackle each problem and how well each problem has been researched to date. This comprehensive problem-oriented review of advances in transfer learning has not only revealed the challenges in transfer learning for visual recognition, but also the problems (eight of the seventeen) that have scarcely been studied. The survey thus offers not only an up-to-date technical review for researchers, but also a systematic approach and a reference for machine learning practitioners to categorise a real problem and look up a possible solution accordingly.
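    As a point of reference for the deep end of this taxonomy, the sketch below shows the most common transfer recipe such surveys cover: freezing a pretrained backbone and fine-tuning a new classification head. The model choice, head size, and data are placeholders, not from the survey.

```python
# Sketch: standard deep transfer recipe (feature extraction + new head).
import torch
import torchvision

model = torchvision.models.resnet18(weights="IMAGENET1K_V1")
for p in model.parameters():      # freeze the pretrained feature extractor
    p.requires_grad = False
model.fc = torch.nn.Linear(model.fc.in_features, 10)  # new 10-class head

opt = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
x = torch.randn(4, 3, 224, 224)               # placeholder image batch
y = torch.randint(0, 10, (4,))                # placeholder labels
loss = torch.nn.functional.cross_entropy(model(x), y)
opt.zero_grad(); loss.backward(); opt.step()  # only the head is updated
```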

    Hyperspectral Image Classification and Clutter Detection via Multiple Structural Embeddings and Dimension Reductions

    We present a new and effective approach for Hyperspectral Image (HSI) classification and clutter detection, overcoming a few long-standing challenges presented by HSI data characteristics. Residing in a high-dimensional spectral attribute space, HSI data samples are known to be strongly correlated in their spectral signatures, to exhibit nonlinear structure due to several physical laws, and to contain uncertainty and noise from multiple sources. In the presented approach, we generate an adaptive, structurally enriched representation environment and employ locally linear embedding (LLE) within it. There are two structure layers external to LLE. One is feature space embedding: the HSI data attributes are embedded into a discriminatory feature space where spatio-spectral coherence and distinctive structures are distilled and exploited to mitigate various difficulties encountered in the native hyperspectral attribute space. The other structure layer encloses the ranges of the algorithmic parameters for LLE and feature embedding, and supports a multiplexing and integration scheme for contending with multi-source uncertainty. Experiments on two commonly used HSI datasets with a small number of learning samples have yielded remarkably high-accuracy classification results, as well as distinctive maps of detected clutter regions. Comment: 13 pages, 6 figures (30 images); submitted to the International Conference on Computer Vision (ICCV) 201
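    A stripped-down version of the LLE ingredient -- embedding per-pixel spectra and classifying in the embedded space -- is sketched below with scikit-learn. The paper's surrounding structure layers (feature-space embedding, parameter multiplexing) are not reproduced, and the spectra and labels are synthetic stand-ins.

```python
# Sketch: LLE on per-pixel spectra plus a simple classifier (scikit-learn).
import numpy as np
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
spectra = rng.random((500, 200))        # 500 pixels x 200 spectral bands
labels = rng.integers(0, 5, size=500)   # placeholder class labels

# Low-dimensional manifold coordinates of the spectra.
embedded = LocallyLinearEmbedding(n_neighbors=12,
                                  n_components=10).fit_transform(spectra)

# Small-sample regime: train on 100 labelled pixels, test on the rest.
clf = KNeighborsClassifier(n_neighbors=5).fit(embedded[:100], labels[:100])
print(clf.score(embedded[100:], labels[100:]))
```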

    Learning Rank Functionals: An Empirical Study

    Ranking is a key aspect of many applications, such as information retrieval, question answering, ad placement, and recommender systems. Learning to rank has the goal of estimating a ranking model automatically from training data. In practical settings, the task often reduces to estimating a rank functional of an object with respect to a query. In this paper, we investigate key issues in designing an effective learning-to-rank algorithm. These include data representation, the choice of rank functionals, and the design of a loss function that correlates with the rank metrics used in evaluation. For the loss function, we study three techniques: approximating the rank metric by a smooth function, decomposing the loss into a weighted sum of element-wise losses, and decomposing it into a weighted sum of pairwise losses. We then present derivations of piecewise losses using the theory of high-order Markov chains and Markov random fields. In experiments, we evaluate these design aspects on two tasks: answer ranking on a social question-answering site, and web information retrieval.
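    The pairwise decomposition mentioned above is easy to make concrete: a RankNet-style logistic loss over all pairs where one item should outrank another. The PyTorch sketch below uses a linear rank functional and random features purely for illustration.

```python
# Sketch: RankNet-style pairwise logistic loss with a linear rank functional.
import torch

scorer = torch.nn.Linear(20, 1)            # rank functional f(x)
feats = torch.randn(8, 20)                 # documents for one query
rel = torch.randint(0, 3, (8,)).float()    # graded relevance labels

scores = scorer(feats).squeeze(-1)
i, j = torch.meshgrid(torch.arange(8), torch.arange(8), indexing="ij")
mask = rel[i] > rel[j]                     # pairs that must be ordered i above j
diff = (scores[i] - scores[j])[mask]
loss = torch.nn.functional.softplus(-diff).mean()  # log(1 + exp(-(f_i - f_j)))
loss.backward()                            # a gradient step would follow
```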