16,776 research outputs found

    Score Function Features for Discriminative Learning: Matrix and Tensor Framework

    Get PDF
    Feature learning forms the cornerstone for tackling challenging learning problems in domains such as speech, computer vision and natural language processing. In this paper, we consider a novel class of matrix and tensor-valued features, which can be pre-trained using unlabeled samples. We present efficient algorithms for extracting discriminative information, given these pre-trained features and labeled samples for any related task. Our class of features are based on higher-order score functions, which capture local variations in the probability density function of the input. We establish a theoretical framework to characterize the nature of discriminative information that can be extracted from score-function features, when used in conjunction with labeled samples. We employ efficient spectral decomposition algorithms (on matrices and tensors) for extracting discriminative components. The advantage of employing tensor-valued features is that we can extract richer discriminative information in the form of an overcomplete representations. Thus, we present a novel framework for employing generative models of the input for discriminative learning.Comment: 29 page

    Multi-Target Prediction: A Unifying View on Problems and Methods

    Full text link
    Multi-target prediction (MTP) is concerned with the simultaneous prediction of multiple target variables of diverse type. Due to its enormous application potential, it has developed into an active and rapidly expanding research field that combines several subfields of machine learning, including multivariate regression, multi-label classification, multi-task learning, dyadic prediction, zero-shot learning, network inference, and matrix completion. In this paper, we present a unifying view on MTP problems and methods. First, we formally discuss commonalities and differences between existing MTP problems. To this end, we introduce a general framework that covers the above subfields as special cases. As a second contribution, we provide a structured overview of MTP methods. This is accomplished by identifying a number of key properties, which distinguish such methods and determine their suitability for different types of problems. Finally, we also discuss a few challenges for future research

    Representation Learning: A Review and New Perspectives

    Full text link
    The success of machine learning algorithms generally depends on data representation, and we hypothesize that this is because different representations can entangle and hide more or less the different explanatory factors of variation behind the data. Although specific domain knowledge can be used to help design representations, learning with generic priors can also be used, and the quest for AI is motivating the design of more powerful representation-learning algorithms implementing such priors. This paper reviews recent work in the area of unsupervised feature learning and deep learning, covering advances in probabilistic models, auto-encoders, manifold learning, and deep networks. This motivates longer-term unanswered questions about the appropriate objectives for learning good representations, for computing representations (i.e., inference), and the geometrical connections between representation learning, density estimation and manifold learning

    Integrating dynamic stopping, transfer learning and language models in an adaptive zero-training ERP speller

    Get PDF
    Objective. Most BCIs have to undergo a calibration session in which data is recorded to train decoders with machine learning. Only recently zero-training methods have become a subject of study. This work proposes a probabilistic framework for BCI applications which exploit event-related potentials (ERPs). For the example of a visual P300 speller we show how the framework harvests the structure suitable to solve the decoding task by (a) transfer learning, (b) unsupervised adaptation, (c) language model and (d) dynamic stopping. Approach. A simulation study compares the proposed probabilistic zero framework (using transfer learning and task structure) to a state-of-the-art supervised model on n = 22 subjects. The individual influence of the involved components (a)–(d) are investigated. Main results. Without any need for a calibration session, the probabilistic zero-training framework with inter-subject transfer learning shows excellent performance—competitive to a state-of-the-art supervised method using calibration. Its decoding quality is carried mainly by the effect of transfer learning in combination with continuous unsupervised adaptation. Significance. A high-performing zero-training BCI is within reach for one of the most popular BCI paradigms: ERP spelling. Recording calibration data for a supervised BCI would require valuable time which is lost for spelling. The time spent on calibration would allow a novel user to spell 29 symbols with our unsupervised approach. It could be of use for various clinical and non-clinical ERP-applications of BCI

    Multi-view Learning as a Nonparametric Nonlinear Inter-Battery Factor Analysis

    Get PDF
    Factor analysis aims to determine latent factors, or traits, which summarize a given data set. Inter-battery factor analysis extends this notion to multiple views of the data. In this paper we show how a nonlinear, nonparametric version of these models can be recovered through the Gaussian process latent variable model. This gives us a flexible formalism for multi-view learning where the latent variables can be used both for exploratory purposes and for learning representations that enable efficient inference for ambiguous estimation tasks. Learning is performed in a Bayesian manner through the formulation of a variational compression scheme which gives a rigorous lower bound on the log likelihood. Our Bayesian framework provides strong regularization during training, allowing the structure of the latent space to be determined efficiently and automatically. We demonstrate this by producing the first (to our knowledge) published results of learning from dozens of views, even when data is scarce. We further show experimental results on several different types of multi-view data sets and for different kinds of tasks, including exploratory data analysis, generation, ambiguity modelling through latent priors and classification.Comment: 49 pages including appendi
    corecore