
    Online Bayesian Multiple Kernel Bipartite Ranking

    Bipartite ranking aims to maximize the area under the ROC curve (AUC) of a decision function. To tackle this problem when the data arrive sequentially, existing online AUC maximization methods seek a point estimate of the decision function in a linear or predefined single-kernel space, and cannot learn effective kernels automatically from the streaming data. In this paper, we first develop a Bayesian multiple kernel bipartite ranking model, which circumvents the kernel selection problem by estimating a posterior distribution over the model weights. To make our model applicable to streaming data, we then present a kernelized online Bayesian passive-aggressive learning framework that maintains a variational approximation to the posterior based on data augmentation. Furthermore, to deal efficiently with large-scale data, we design a fixed-budget strategy that effectively controls online model complexity. Extensive experimental studies confirm the superiority of our Bayesian multi-kernel approach.
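
    To make the setup concrete, here is a minimal sketch of budgeted online bipartite ranking with a fixed convex combination of kernels. It is a simplified point-estimate analogue, not the paper's Bayesian passive-aggressive method: the kernel weights `mu`, the pairwise hinge update, and the FIFO eviction rule are all illustrative assumptions.

```python
import numpy as np

def rbf(x, Z, gamma):
    """RBF kernel between one point and a set of stored points."""
    return np.exp(-gamma * np.sum((Z - x) ** 2, axis=1))

class BudgetedMKRanker:
    """Toy online bipartite ranker: fixed budget of support vectors and a
    fixed convex combination of RBF kernels (the paper instead maintains a
    variational posterior over the model weights)."""

    def __init__(self, gammas=(0.1, 1.0, 10.0), budget=50, lr=0.1):
        self.gammas = gammas
        self.budget = budget
        self.lr = lr
        self.sv = []       # stored support vectors
        self.alpha = []    # per-SV coefficients, one per kernel
        self.mu = np.ones(len(gammas)) / len(gammas)  # kernel weights

    def score(self, x):
        if not self.sv:
            return 0.0
        Z = np.asarray(self.sv)
        A = np.asarray(self.alpha)                       # (n_sv, n_kernels)
        K = np.stack([rbf(x, Z, g) for g in self.gammas], axis=1)
        return float(np.sum(self.mu * (A * K)))

    def update(self, x_pos, x_neg):
        """Pairwise hinge update: push f(x_pos) above f(x_neg) by a margin."""
        if self.score(x_pos) - self.score(x_neg) < 1.0:  # hinge violated
            for x, sign in ((x_pos, +1.0), (x_neg, -1.0)):
                self.sv.append(np.asarray(x, dtype=float))
                self.alpha.append(sign * self.lr * np.ones(len(self.gammas)))
            while len(self.sv) > self.budget:            # fixed-budget eviction
                self.sv.pop(0)
                self.alpha.pop(0)

# Usage on a toy stream of (positive, negative) pairs.
rng = np.random.default_rng(0)
model = BudgetedMKRanker()
for _ in range(200):
    model.update(rng.normal(+1.0, 1.0, size=3), rng.normal(-1.0, 1.0, size=3))
```

    The fixed budget caps the number of stored support vectors, which is what keeps the per-example cost constant on streaming data.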

    Kernel learning approaches for image classification

    This thesis extends the use of kernel learning techniques to specific problems of image classification. Kernel learning is a paradigm in the field of machine learning that generalizes the use of inner products to compute similarities between arbitrary objects. In image classification one aims to separate images based on their visual content. We address two important problems that arise in this context: learning with weak label information and the combination of heterogeneous data sources. The contributions we report are not unique to image classification and apply to a more general class of problems. We study the problem of learning with label ambiguity in the multiple instance learning framework. We discuss several image classification scenarios that arise in this context and argue that standard multiple instance learning requires a more detailed disambiguation. Finally we review kernel learning approaches proposed for this problem and derive a more efficient algorithm to solve them. The multiple kernel learning framework is an approach to automatically select kernel parameters. We extend it to its infinite limit and present an algorithm to solve the resulting problem. This result is then applied in two directions. We show how to learn kernels that adapt to the special structure of images. Finally we compare different ways of combining image features for object classification and present significant improvements over previous methods.
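
    As a flavour of multiple kernel learning, the sketch below weights a family of RBF kernels by centered kernel-target alignment and combines them. This simple heuristic only stands in for the thesis's algorithms (which extend MKL to continuously parameterized, i.e. infinite, kernel families); the bandwidths and data are made up.

```python
import numpy as np

def rbf_gram(X, gamma):
    """Full RBF Gram matrix for one bandwidth."""
    sq = np.sum(X ** 2, axis=1)
    d = sq[:, None] + sq[None, :] - 2 * X @ X.T
    return np.exp(-gamma * d)

def alignment_weights(X, y, gammas):
    """Weight each base kernel by its centered alignment with the labels,
    one common heuristic for choosing a convex kernel combination."""
    yy = np.outer(y, y)
    w = []
    for g in gammas:
        K = rbf_gram(X, g)
        K = K - K.mean(0) - K.mean(1)[:, None] + K.mean()  # center the Gram
        w.append(np.sum(K * yy) / (np.linalg.norm(K) * np.linalg.norm(yy)))
    w = np.maximum(np.asarray(w), 0.0)
    return w / (w.sum() + 1e-12)

# Usage: combine the kernels, then train any kernel machine on K_combined.
rng = np.random.default_rng(0)
X = rng.standard_normal((40, 5))
y = np.sign(X[:, 0])                       # toy labels
gammas = (0.1, 1.0, 10.0)
mu = alignment_weights(X, y, gammas)
K_combined = sum(m * rbf_gram(X, g) for m, g in zip(mu, gammas))
```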

    Input variable selection methods for construction of interpretable regression models

    Large data sets are collected and analyzed in a variety of research problems. Modern computers make it possible to measure ever increasing numbers of samples and variables. Automated methods are required for the analysis, since traditional manual approaches are impractical for the growing amount of data. In the present thesis, numerous computational methods, based on observed data and subject to modelling assumptions, are presented for producing useful knowledge from the data-generating system. Input variable selection methods for both linear and nonlinear function approximation problems are proposed. Variable selection has gained increasing attention in many applications, because it assists in the interpretation of the underlying phenomenon. The selected variables highlight the most relevant characteristics of the problem. In addition, the rejection of irrelevant inputs may reduce the training time and improve the prediction accuracy of the model. Linear models play an important role in data analysis, since they are computationally efficient and form the basis for many more complicated models. In this work, the estimation of several response variables simultaneously using linear combinations of the same subset of inputs is especially considered. Input selection methods originally designed for a single response variable are extended to the case of multiple responses. The assumption of linearity is not, however, adequate in all problems. Hence, artificial neural networks are applied in the modelling of unknown nonlinear dependencies between the inputs and the response. The first set of methods includes efficient stepwise selection strategies that assess the usefulness of the inputs in the model. Alternatively, the problem of input selection is formulated as an optimization problem: an objective function is minimized with respect to sparsity constraints that encourage selection of the inputs. The trade-off between prediction accuracy and the number of input variables is adjusted by continuous-valued sparsity parameters. Results from extensive experiments on both simulated functions and real benchmark data sets are reported. In comparisons with existing variable selection strategies, the proposed methods typically improve the results, either by reducing the prediction error, by decreasing the number of selected inputs, or both. The constructed sparse models are also found to produce more accurate predictions than models including all the input variables.
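
    The multi-response setting described above, where all responses share one input subset, corresponds to an l1/l2 row-sparsity penalty. A minimal sketch with scikit-learn's MultiTaskLasso on synthetic data follows; the data and the regularization strength are illustrative, not the thesis's own algorithms or benchmarks.

```python
import numpy as np
from sklearn.linear_model import MultiTaskLasso

# Toy data: 3 response variables generated from the same 2 relevant inputs
# out of 20 candidates.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 20))
W_true = np.zeros((20, 3))
W_true[[2, 7], :] = rng.standard_normal((2, 3))
Y = X @ W_true + 0.1 * rng.standard_normal((200, 3))

# The l1/l2 penalty zeroes out whole rows of the coefficient matrix, so the
# same input subset is selected for all responses.
model = MultiTaskLasso(alpha=0.1).fit(X, Y)
selected = np.flatnonzero(np.any(model.coef_.T != 0, axis=1))
print("selected inputs:", selected)   # ideally recovers {2, 7}
```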

    Sparse, hierarchical and shared-factors priors for representation learning

    Feature representation is a central concern of today's machine learning systems. A proper representation can facilitate a complex learning task, for instance when the representation has low dimensionality and consists of high-level characteristics. But how can we determine whether a representation is adequate for a learning task? Recent work suggests that it is better to see the choice of representation as a learning problem in itself: this is called representation learning. This thesis presents a series of contributions aimed at improving the quality of the learned representations. The first contribution elaborates a comparative study of Sparse Dictionary Learning (SDL) approaches on the problem of grasp detection (for robotic grasping) and provides an empirical analysis of their advantages and disadvantages. The second contribution proposes a Convolutional Neural Network (CNN) architecture for grasp detection and compares it to SDL. Then, the third contribution elaborates a new parametric activation function and validates it experimentally. Finally, the fourth contribution details a new soft parameter sharing mechanism for multitask learning.
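
    The abstract does not spell out the proposed activation function, so as a stand-in, here is the classic parametric ReLU: an activation with one learnable parameter, together with the gradients needed to train that parameter. The thesis's own function may differ.

```python
import numpy as np

def prelu(x, a):
    """Parametric ReLU: identity for x > 0, learnable slope `a` otherwise."""
    return np.where(x > 0, x, a * x)

def prelu_grads(x, a, upstream):
    """Backward pass: gradients w.r.t. the input and the shared parameter."""
    dx = np.where(x > 0, 1.0, a) * upstream
    da = np.sum(np.where(x > 0, 0.0, x) * upstream)
    return dx, da

x = np.array([-2.0, -0.5, 0.0, 1.5])
y = prelu(x, a=0.25)                        # -> [-0.5, -0.125, 0.0, 1.5]
dx, da = prelu_grads(x, 0.25, np.ones_like(x))
```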

    Transfer learning algorithms for image classification

    Thesis (Ph.D.) -- Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. Cataloged from the PDF version of the thesis. Includes bibliographical references (p. 124-128). An ideal image classifier should be able to exploit complex high-dimensional feature representations even when only a few labeled examples are available for training. To achieve this goal we develop transfer learning algorithms that: 1) leverage unlabeled data annotated with meta-data and 2) exploit labeled data from related categories. In the first part of this thesis we show how to use the structure learning framework (Ando and Zhang, 2005) to learn efficient image representations from unlabeled images annotated with meta-data. In the second part we present a joint sparsity transfer algorithm for image classification. Our algorithm is based on the observation that related categories might be learnable using only a small subset of shared relevant features. To find these features we propose to train classifiers jointly with a shared regularization penalty that minimizes the total number of features involved in the approximation. To solve the joint sparse approximation problem we develop an optimization algorithm whose time and memory complexity is O(n log n), with n being the number of parameters of the joint model. We conduct experiments on news-topic and keyword prediction image classification tasks. We test our method in two settings, transfer learning and multitask learning, and show that in both cases leveraging knowledge from related categories can improve performance when training data per category is scarce. Furthermore, our results demonstrate that our model can successfully recover jointly sparse solutions. By Ariadna Quattoni, Ph.D.
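
    The shared penalty described above makes whole features active or inactive across all tasks at once. A common way to realize this is proximal gradient descent with a row-wise group (l1/l2) shrinkage step; the sketch below uses that surrogate on toy data, and the thesis's exact penalty and O(n log n) solver may differ.

```python
import numpy as np

def prox_joint_sparsity(W, t):
    """Shrink each feature's row of cross-task weights toward zero as a
    group, so a feature is either shared by all tasks or dropped entirely."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - t / np.maximum(norms, 1e-12))
    return scale * W

def joint_sparse_fit(Xs, ys, lam=0.1, lr=0.1, iters=500):
    """Proximal gradient on squared loss over several related tasks that
    share one feature space; W holds one column of weights per task."""
    W = np.zeros((Xs[0].shape[1], len(Xs)))
    for _ in range(iters):
        G = np.zeros_like(W)
        for k, (X, y) in enumerate(zip(Xs, ys)):
            G[:, k] = X.T @ (X @ W[:, k] - y) / len(y)   # per-task gradient
        W = prox_joint_sparsity(W - lr * G, lr * lam)    # joint shrinkage
    return W

# Toy tasks sharing the same 3 relevant features out of 30.
rng = np.random.default_rng(1)
Xs = [rng.standard_normal((100, 30)) for _ in range(3)]
w_true = np.zeros(30)
w_true[[1, 4, 9]] = 1.0
ys = [X @ w_true + 0.05 * rng.standard_normal(100) for X in Xs]
W = joint_sparse_fit(Xs, ys)
print(np.flatnonzero(np.linalg.norm(W, axis=1) > 0.1))   # ideally [1 4 9]
```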

    MultiGrain: a unified image embedding for classes and instances

    MultiGrain is a network architecture producing compact vector representations that are suited both for image classification and particular object retrieval. It builds on a standard classification trunk. The top of the network produces an embedding containing coarse and fine-grained information, so that images can be recognized based on the object class, the particular object, or whether they are distorted copies. Our joint training is simple: we minimize a cross-entropy loss for classification and a ranking loss that determines if two images are identical up to data augmentation, with no need for additional labels. A key component of MultiGrain is a pooling layer that takes advantage of high-resolution images with a network trained at a lower resolution. When fed to a linear classifier, the learned embeddings provide state-of-the-art classification accuracy. For instance, we obtain 79.4% top-1 accuracy with a ResNet-50 learned on ImageNet, which is a +1.8% absolute improvement over the AutoAugment method. When compared with the cosine similarity, the same embeddings perform on par with the state of the art for image retrieval at moderate resolutions.
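
    The resolution-robust pooling layer can be illustrated with generalized-mean (GeM) pooling, a standard choice in instance retrieval: one exponent interpolates between average and max pooling, so a descriptor trained at low resolution remains comparable on larger feature maps. The exponent p=3 and the shapes below are assumptions, not values from the paper.

```python
import numpy as np

def gem_pool(feature_map, p=3.0, eps=1e-6):
    """Generalized-mean (GeM) pooling over spatial positions.
    p=1 gives average pooling; large p approaches max pooling."""
    # feature_map: (channels, height, width) activations, assumed >= 0
    x = np.clip(feature_map, eps, None)
    return np.mean(x ** p, axis=(1, 2)) ** (1.0 / p)

fmap = np.random.rand(512, 14, 14)        # e.g. a final conv feature map
embedding = gem_pool(fmap, p=3.0)         # (512,) descriptor
embedding /= np.linalg.norm(embedding)    # l2-normalize before retrieval
```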

    Learning Dynamic Systems for Intention Recognition in Human-Robot-Cooperation

    This thesis is concerned with intention recognition for a humanoid robot and investigates how the challenges of uncertain and incomplete observations, highly detailed models, and real-time inference may be addressed by modeling the human rationale as hybrid, dynamic Bayesian networks and performing inference with these models. The key focus lies on the automatic identification of the employed nonlinear stochastic dependencies and on situation-specific inference.
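
    Inference in such hybrid (discrete-plus-continuous) dynamic Bayesian networks is often approximated with particle filters. The toy step below tracks a binary intention and a continuous position; the dynamics, switching rate, and noise levels are invented for illustration and stand in for the thesis's learned nonlinear dependencies.

```python
import numpy as np

def particle_filter_step(particles, weights, observation, rng,
                         trans_std=0.1, obs_std=0.5):
    """One step of inference in a toy hybrid model: each particle carries a
    discrete intention i in {0, 1} and a continuous position x."""
    n = len(particles)
    # intention-dependent motion model: intention 1 drifts forward
    drift = np.where(particles[:, 0] == 0, 0.0, 1.0)
    particles[:, 1] += drift + trans_std * rng.standard_normal(n)
    # occasionally switch the discrete intention
    flip = rng.random(n) < 0.05
    particles[flip, 0] = 1 - particles[flip, 0]
    # reweight by a Gaussian observation likelihood, then resample
    w = weights * np.exp(-0.5 * ((observation - particles[:, 1]) / obs_std) ** 2)
    w /= w.sum()
    idx = rng.choice(n, size=n, p=w)
    return particles[idx], np.full(n, 1.0 / n)

rng = np.random.default_rng(0)
particles = np.zeros((500, 2))            # columns: [intention, position]
weights = np.full(500, 1.0 / 500)
for z in [0.1, 0.9, 2.1, 3.2]:            # observed positions trending up
    particles, weights = particle_filter_step(particles, weights, z, rng)
print("P(intention=1):", particles[:, 0].mean())
```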

    Proceedings of the second "international Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST'14)

    The implicit objective of the biennial "international Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST) is to foster collaboration between international scientific teams by disseminating ideas through both specific oral/poster presentations and free discussions. For its second edition, the iTWIST workshop took place in the medieval and picturesque town of Namur in Belgium, from Wednesday August 27th to Friday August 29th, 2014. The workshop was conveniently located in "The Arsenal" building, within walking distance of both hotels and the town center. iTWIST'14 gathered about 70 international participants and featured 9 invited talks, 10 oral presentations, and 14 posters on the following themes, all related to the theory, application, and generalization of the "sparsity paradigm": Sparsity-driven data sensing and processing; Union of low-dimensional subspaces; Beyond linear and convex inverse problems; Matrix/manifold/graph sensing/processing; Blind inverse problems and dictionary learning; Sparsity and computational neuroscience; Information theory, geometry and randomness; Complexity/accuracy tradeoffs in numerical methods; Sparsity? What's next?; Sparse machine learning and inference. Comment: 69 pages, 24 extended abstracts, iTWIST'14 website: http://sites.google.com/site/itwist1