
    voxel2vec: A Natural Language Processing Approach to Learning Distributed Representations for Scientific Data

    Relationships in scientific data, such as the numerical and spatial distribution of features in univariate data, the relations among scalar-value combinations in multivariate data, and the associations between volumes in time-varying and ensemble data, are intricate and complex. This paper presents voxel2vec, a novel unsupervised representation learning model that learns distributed representations of scalar values and scalar-value combinations in a low-dimensional vector space. Its basic assumption is that if two scalar values or scalar-value combinations have similar contexts, they usually have high feature similarity. By representing scalar values and scalar-value combinations as symbols, voxel2vec learns the similarity between them from their spatial-distribution contexts and then allows us to explore the overall association between volumes by transfer prediction. We demonstrate the usefulness and effectiveness of voxel2vec by comparing it with the isosurface similarity map of univariate data and by applying the learned distributed representations to feature classification for multivariate data and to association analysis for time-varying and ensemble data. Comment: Accepted by IEEE Transactions on Visualization and Computer Graphics (TVCG).
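
    The core idea resembles skip-gram training over quantized scalar values, with a voxel's spatial neighbourhood playing the role of the word context. The sketch below only illustrates that idea on a synthetic volume; the bin count, neighbourhood size, embedding dimension, and the use of an off-the-shelf skip-gram trainer are assumptions, not the paper's actual model or training objective.

        # Illustrative sketch only: treat each voxel's quantized scalar value as a "word"
        # and its 6-neighbourhood as the context, then train a generic skip-gram model on
        # those short "sentences". Requires numpy and gensim.
        import numpy as np
        from gensim.models import Word2Vec

        volume = np.random.rand(32, 32, 32)                    # stand-in for a scalar field
        symbols = np.digitize(volume, np.linspace(0, 1, 64))   # quantize scalar values into symbols

        sentences = []
        for x in range(1, 31):
            for y in range(1, 31):
                for z in range(1, 31):
                    # centre symbol followed by its six axis-aligned neighbours
                    neigh = [symbols[x, y, z],
                             symbols[x - 1, y, z], symbols[x + 1, y, z],
                             symbols[x, y - 1, z], symbols[x, y + 1, z],
                             symbols[x, y, z - 1], symbols[x, y, z + 1]]
                    sentences.append([str(s) for s in neigh])

        model = Word2Vec(sentences, vector_size=16, window=6, min_count=1, sg=1, negative=5)

        # Cosine similarity between two scalar-value symbols now reflects how similar
        # their spatial contexts are, which mirrors the paper's core assumption.
        a, b = model.wv.index_to_key[:2]
        print(a, b, model.wv.similarity(a, b))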

    Too much information is no information: how machine learning and feature selection could help in understanding the motor control of pointing

    © 2023 The Author(s). This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY), https://creativecommons.org/licenses/by/4.0/. The aim of this study was to develop the use of machine learning techniques as a means of multivariate analysis in studies of motor control. These studies generate a huge amount of data, the analysis of which continues to be largely univariate. We propose the use of machine learning classification and feature selection as a means of uncovering feature combinations that are altered between conditions. High-dimensional electromyogram (EMG) vectors were generated as several arm and trunk muscles were recorded while subjects pointed at various angles above and below the gravity-neutral horizontal plane. We used Linear Discriminant Analysis (LDA) to carry out binary classifications between the EMG vectors for pointing at a particular angle and those for pointing in the gravity-neutral direction. Classification success provided a composite index of muscular adjustments for various task constraints, in this case pointing angles. To find the combination of features that were significantly altered between task conditions, we conducted post-classification feature selection, i.e., we investigated which combination of features had allowed for the classification. Feature selection was done by comparing the representations of each category created by LDA for the classification, in other words by computing the difference between the representations of each class. We propose that this approach will help with comparing high-dimensional EMG patterns in two ways: (i) quantifying the effects of the entire pattern rather than using single, arbitrarily defined variables, and (ii) identifying the parts of the patterns that convey the most information regarding the investigated effects. Peer reviewed.
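
    A minimal sketch of that analysis idea on synthetic data: binary LDA between EMG feature vectors for one pointing angle and for the gravity-neutral direction, followed by inspection of which features drive the discrimination. The data, dimensions, and feature indices below are placeholders, not the study's actual recordings or pipeline.

        # Sketch assuming synthetic EMG-like feature vectors; requires numpy and scikit-learn.
        import numpy as np
        from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(0)
        n_trials, n_features = 40, 24                  # e.g. several muscles x time windows
        neutral = rng.normal(0.0, 1.0, (n_trials, n_features))
        angled = rng.normal(0.0, 1.0, (n_trials, n_features))
        angled[:, [3, 7]] += 1.5                       # pretend two features shift with pointing angle

        X = np.vstack([neutral, angled])
        y = np.array([0] * n_trials + [1] * n_trials)

        lda = LinearDiscriminantAnalysis()
        print("classification accuracy:", cross_val_score(lda, X, y, cv=5).mean())

        # Post-classification feature selection: the LDA weight vector (equivalently, the
        # difference between the class representations along the discriminant) ranks
        # features by how much they contribute to separating the two conditions.
        lda.fit(X, y)
        ranking = np.argsort(-np.abs(lda.coef_[0]))
        print("most discriminative features:", ranking[:5])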

    A Class of Conjugate Priors for Multinomial Probit Models which Includes the Multivariate Normal One

    Multinomial probit models are widely implemented representations which allow both classification and inference by learning changes in vectors of class probabilities with a set of p observed predictors. Although various frequentist methods have been developed for estimation, inference and classification within such a class of models, Bayesian inference is still lagging behind. This is due to the apparent absence of a tractable class of conjugate priors that may facilitate posterior inference on the multinomial probit coefficients. Such an issue has motivated increasing efforts toward the development of effective Markov chain Monte Carlo methods, but state-of-the-art solutions still face severe computational bottlenecks, especially in large p settings. In this article, we prove that the entire class of unified skew-normal (SUN) distributions is conjugate to a wide variety of multinomial probit models, and we exploit the SUN properties to improve upon state-of-the-art solutions for posterior inference and classification, both through closed-form results for key functionals of interest and by developing novel computational methods relying either on independent and identically distributed samples from the exact posterior or on scalable and accurate variational approximations based on blocked partially-factorized representations. As illustrated in a gastrointestinal lesions application, the magnitude of the improvements relative to current methods is particularly evident, in practice, when the focus is on large p applications.
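
    For orientation, the sketch below shows the latent-utility formulation commonly used for multinomial probit models: Gaussian utilities that are linear in the p predictors, with the observed label being the class of largest utility. It only illustrates the likelihood (with an assumed identity error covariance); the paper's contribution, SUN conjugate priors and the resulting posterior computations, is not reproduced here.

        # Sketch of a latent-utility multinomial probit data-generating process; requires numpy.
        import numpy as np

        rng = np.random.default_rng(1)
        n, p, n_classes = 200, 5, 3
        X = rng.normal(size=(n, p))
        B = rng.normal(size=(p, n_classes))                   # one coefficient vector per class

        utilities = X @ B + rng.normal(size=(n, n_classes))   # Gaussian latent utilities
        y = utilities.argmax(axis=1)                          # observed class labels

        # Monte Carlo estimate of the class probabilities for one new observation x0.
        x0 = rng.normal(size=p)
        draws = x0 @ B + rng.normal(size=(10000, n_classes))
        probs = np.bincount(draws.argmax(axis=1), minlength=n_classes) / 10000
        print("estimated class probabilities:", probs)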

    VaB-AL: Incorporating Class Imbalance and Difficulty with Variational Bayes for Active Learning

    Active Learning for discriminative models has largely been studied with the focus on individual samples, with less emphasis on how classes are distributed or which classes are hard to deal with. In this work, we show that this is harmful. We propose a method based on Bayes' rule that can naturally incorporate class imbalance into the Active Learning framework. We derive that three terms should be considered together when estimating the probability of a classifier making a mistake on a given sample: (i) the probability of mislabelling a class, (ii) the likelihood of the data given a predicted class, and (iii) the prior probability on the abundance of a predicted class. Implementing these terms requires a generative model and an intractable likelihood estimation; therefore, we train a Variational Auto-Encoder (VAE) for this purpose. To further tie the VAE to the classifier and facilitate VAE training, we use the classifier's deep feature representations as input to the VAE. By considering all three probabilities, especially the data imbalance, we can substantially improve upon existing methods under a limited data budget. We show that our method can be applied to classification tasks on multiple different datasets, including a real-world dataset with heavy data imbalance, significantly outperforming the state of the art.
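
    A schematic sketch of how the three terms could be combined into a sample score: a per-class mislabelling rate, a per-class likelihood of the sample (in the paper estimated with a VAE over the classifier's features; a placeholder below), and the class prior. This is one plausible Bayes-rule combination written for illustration, not the paper's exact formula; every quantity below is made up.

        # Hypothetical acquisition score combining mislabelling rate, class likelihood and prior.
        import numpy as np

        def mistake_score(feature, mislabel_rate, class_log_likelihood, class_prior):
            """Higher score -> the classifier is more likely to be wrong on this sample."""
            # p(x | c) from a generative model of the classifier's deep features (placeholder)
            log_lik = np.array([class_log_likelihood(feature, c) for c in range(len(class_prior))])
            lik = np.exp(log_lik - log_lik.max())             # numerically stabilised likelihoods
            posterior = lik * class_prior
            posterior /= posterior.sum()                      # Bayes' rule over classes
            # expected mistake probability, weighting each class by how often it is mislabelled
            return float(np.dot(posterior, mislabel_rate))

        # Toy usage with made-up quantities.
        rng = np.random.default_rng(0)
        prior = np.array([0.7, 0.2, 0.1])                     # imbalanced class abundance
        mislabel = np.array([0.05, 0.30, 0.50])               # rare classes mislabelled more often
        fake_loglik = lambda x, c: -np.sum((x - c) ** 2)      # stand-in for a VAE density estimate
        print(mistake_score(rng.normal(size=8), mislabel, fake_loglik, prior))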

    Learning Bilingual Word Representations by Marginalizing Alignments

    We present a probabilistic model that simultaneously learns alignments and distributed representations for bilingual data. By marginalizing over word alignments, the model captures a larger semantic context than prior work relying on hard alignments. The advantage of this approach is demonstrated in a cross-lingual classification task, where we outperform the prior published state of the art. Comment: Proceedings of ACL 2014 (Short Papers).
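
    The marginalization idea can be sketched as follows: instead of committing to one hard word alignment, score a target word against every source position and sum the alignment probabilities out. The embeddings, the uniform alignment prior, and the softmax scoring below are illustrative choices, not the paper's exact parameterization.

        # Sketch of p(target word | source sentence) with alignments marginalized out; requires numpy.
        import numpy as np

        rng = np.random.default_rng(0)
        vocab_tgt, dim = 1000, 32
        E_src = rng.normal(scale=0.1, size=(5, dim))           # embeddings of a 5-word source sentence
        E_tgt = rng.normal(scale=0.1, size=(vocab_tgt, dim))   # target-vocabulary embeddings

        def p_target_word(t_idx, source_embeddings, target_embeddings):
            """p(t | source) = sum_i p(a = i) * p(t | s_i), with a uniform alignment prior."""
            scores = source_embeddings @ target_embeddings.T              # (len_src, vocab_tgt)
            probs = np.exp(scores - scores.max(axis=1, keepdims=True))
            probs /= probs.sum(axis=1, keepdims=True)                     # softmax per source word
            return probs[:, t_idx].mean()                                 # marginalize over alignments

        print(p_target_word(42, E_src, E_tgt))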

    Cognition-Based Networks: A New Perspective on Network Optimization Using Learning and Distributed Intelligence

    IEEE Access, vol. 3, 2015, article no. 7217798, pp. 1512-1530 (open access). Zorzi, M., Zanella, A., Testolin, A., De Filippo De Grazia, M., and Zorzi, M.; Department of Information Engineering and Department of General Psychology, University of Padua, Padua, Italy, and IRCCS San Camillo Foundation, Venice-Lido, Italy. In response to the new challenges in the design and operation of communication networks, and taking inspiration from how living beings deal with complexity and scalability, in this paper we introduce an innovative system concept called COgnition-BAsed NETworkS (COBANETS). The proposed approach develops around the systematic application of advanced machine learning techniques and, in particular, unsupervised deep learning and probabilistic generative models for system-wide learning, modeling, optimization, and data representation. Moreover, in COBANETS, we propose to combine this learning architecture with the emerging network virtualization paradigms, which make it possible to actuate automatic optimization and reconfiguration strategies at the system level, thus fully unleashing the potential of the learning approach. Compared with past and current research efforts in this area, the technical approach outlined in this paper is deeply interdisciplinary and more comprehensive, calling for the synergic combination of expertise of computer scientists, communications and networking engineers, and cognitive scientists, with the ultimate aim of breaking new ground through a profound rethinking of how the modern understanding of cognition can be used in the management and optimization of telecommunication networks.