21 research outputs found

    Generalization Guarantees for a Binary Classification Framework for Two-Stage Multiple Kernel Learning

    Full text link
    We present generalization bounds for TS-MKL, a framework for two-stage multiple kernel learning. We also present bounds for sparse kernel learning formulations within the TS-MKL framework.
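
    To make the two-stage setup concrete, below is a minimal Python sketch of the generic two-stage MKL recipe: stage one weights each base kernel by its alignment with the labels, and stage two trains a standard kernel classifier on the combined kernel. The centered-alignment weighting and the SVC classifier are illustrative assumptions, not the specific TS-MKL formulation analyzed in the paper.

        import numpy as np
        from sklearn.svm import SVC

        def centered_alignment(K, y):
            # Centered kernel-target alignment between K and the label kernel y y^T.
            n = K.shape[0]
            H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
            Kc = H @ K @ H
            Y = np.outer(y, y)
            return (Kc * Y).sum() / (np.linalg.norm(Kc) * np.linalg.norm(Y) + 1e-12)

        def two_stage_mkl(kernels, y):
            # Stage 1: weight each base kernel by its (clipped) alignment with the labels.
            mu = np.array([max(centered_alignment(K, y), 0.0) for K in kernels])
            mu /= mu.sum() + 1e-12
            K = sum(m * Kk for m, Kk in zip(mu, kernels))
            # Stage 2: train an off-the-shelf kernel classifier on the combined kernel.
            clf = SVC(kernel="precomputed").fit(K, y)
            return mu, clf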

    Supervised Learning with Similarity Functions

    Full text link
    We address the problem of general supervised learning when data can only be accessed through an (indefinite) similarity function between data points. Existing work on learning with indefinite kernels has concentrated solely on binary/multi-class classification problems. We propose a model that is generic enough to handle any supervised learning task and also subsumes the model previously proposed for classification. We give a "goodness" criterion for similarity functions w.r.t. a given supervised learning task and then adapt a well-known landmarking technique to provide efficient algorithms for supervised learning using "good" similarity functions. We demonstrate the effectiveness of our model on three important supervised learning problems: a) real-valued regression, b) ordinal regression and c) ranking, where we show that our method guarantees bounded generalization error. Furthermore, for the case of real-valued regression, we give a natural goodness definition that, when used in conjunction with a recent result in sparse vector recovery, guarantees a sparse predictor with bounded generalization error. Finally, we report results of our learning algorithms on regression and ordinal regression tasks using non-PSD similarity functions and demonstrate the effectiveness of our algorithms, especially that of the sparse landmark selection algorithm, which achieves significantly higher accuracies than the baseline methods while offering reduced computational costs. (Comment: To appear in the proceedings of NIPS 2012; 30 pages.)
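
    As an illustration of the landmarking recipe described above, the Python sketch below embeds each point as its vector of similarities to a few landmarks and fits an L1-regularized linear regressor, so the sparsity of the learned weights performs implicit landmark selection. The tanh similarity (non-PSD) and the toy data are placeholder assumptions, not the paper's experimental setup.

        import numpy as np
        from sklearn.linear_model import Lasso

        def landmark_features(X, landmarks, sim):
            # Embed each point as its vector of similarities to the landmark points.
            return np.array([[sim(x, l) for l in landmarks] for x in X])

        # Hypothetical indefinite (non-PSD) similarity: a tanh of the inner product.
        sim = lambda x, z: np.tanh(x @ z)

        rng = np.random.default_rng(0)
        X = rng.normal(size=(200, 5))
        y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)           # toy regression target

        landmarks = X[rng.choice(len(X), size=30, replace=False)]  # random landmarks
        Phi = landmark_features(X, landmarks, sim)

        # L1 regularization yields a sparse predictor, i.e. implicit landmark selection.
        model = Lasso(alpha=0.01).fit(Phi, y)
        print("active landmarks:", np.count_nonzero(model.coef_))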

    Learning Good Similarities for Sparse Linear Classification (Apprentissage de bonnes similarités pour la classification linéaire parcimonieuse)

    No full text
    http://cap2012.loria.fr/pub/Papers/28.pdf
    National audience. The crucial role played by metrics in machine learning has led in recent years to growing interest in optimizing distance or similarity functions. Most state-of-the-art approaches aim to learn a Mahalanobis distance, which must satisfy the positive semi-definiteness (PSD) constraint and is ultimately exploited in a local nearest-neighbor algorithm. However, no theoretical result establishes a link between the learned metrics and their behavior in classification. In this paper, we exploit the formal framework of good similarities to propose a linear similarity learning algorithm, optimized in a kernelized space. We show that the learned similarity, which is not required to be PSD, has stability properties that allow a generalization bound to be established. Experiments on several datasets confirm its effectiveness compared with the state of the art.
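
    A rough Python sketch of the core idea, assuming a hinge surrogate for the goodness criterion and plain subgradient descent; the paper itself optimizes the similarity in a kernelized space, and the landmark set R, margin gamma and step sizes here are illustrative. The matrix A is left unconstrained, so the learned similarity K_A(x, x') = x^T A x' need not be PSD.

        import numpy as np

        def learn_bilinear_similarity(X, y, R, gamma=0.5, lam=0.1, lr=0.01, epochs=200):
            # Learn K_A(x, x') = x^T A x' with no PSD constraint on A. Each point
            # is pushed to be at least gamma-similar (in the signed sense
            # y_i * y_j * K_A) to the landmark points indexed by R.
            n, d = X.shape
            A = np.eye(d)
            for _ in range(epochs):
                G = np.zeros((d, d))
                for i in range(n):
                    for j in R:
                        if y[i] * y[j] * (X[i] @ A @ X[j]) < gamma:  # hinge active
                            G -= y[i] * y[j] * np.outer(X[i], X[j])
                A -= lr * (G / (n * len(R)) + 2 * lam * A)  # Frobenius regularizer
            return A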

    Similarity-based Learning via Data Driven Embeddings

    Full text link
    We consider the problem of classification using similarity/distance functions over data. Specifically, we propose a framework for defining the goodness of a (dis)similarity function with respect to a given learning task and propose algorithms that have guaranteed generalization properties when working with such good functions. Our framework unifies and generalizes the frameworks proposed by [Balcan-Blum ICML 2006] and [Wang et al ICML 2007]. An attractive feature of our framework is its adaptability to data: we do not promote a fixed notion of goodness but rather let the data dictate it. We show, by giving theoretical guarantees, that the goodness criterion best suited to a problem can itself be learned, which makes our approach applicable to a variety of domains and problems. We propose a landmarking-based approach to obtaining a classifier from such learned goodness criteria. We then provide a novel diversity-based heuristic to perform task-driven selection of landmark points instead of random selection. We demonstrate the effectiveness of our goodness criteria learning method as well as the landmark selection heuristic on a variety of similarity-based learning datasets and benchmark UCI datasets, on which our method consistently outperforms existing approaches by a significant margin. (Comment: To appear in the proceedings of NIPS 2011; 14 pages.)
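
    The paper's landmark selection heuristic is task-driven; the Python sketch below shows only the generic diversity flavor of such selection, greedily adding the point least similar to the landmarks chosen so far. The similarity function and data are assumed inputs, not the paper's exact criterion.

        import numpy as np

        def diverse_landmarks(X, k, sim, seed=0):
            # Greedy diversity heuristic: start from a random point, then repeatedly
            # add the point whose maximum similarity to the current landmark set is
            # smallest, i.e. the point the chosen landmarks cover worst.
            rng = np.random.default_rng(seed)
            chosen = [int(rng.integers(len(X)))]
            while len(chosen) < k:
                max_sim = np.array([max(sim(x, X[c]) for c in chosen) for x in X])
                max_sim[chosen] = np.inf  # never re-pick an existing landmark
                chosen.append(int(np.argmin(max_sim)))
            return chosen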

    Good edit similarity learning by loss minimization

    No full text
    International audience. Similarity functions are a fundamental component of many learning algorithms. When dealing with string or tree-structured data, edit distance-based measures are widely used, and there exist a few methods for learning them from data. However, these methods offer no theoretical guarantee as to the generalization ability and discriminative power of the learned similarities. In this paper, we propose a loss minimization-based edit similarity learning approach called GESL. It is driven by the notion of (ε, γ, τ)-goodness, a theory that bridges the gap between the properties of a similarity function and its performance in classification. We show that our learning framework is a suitable way to deal not only with strings but also with tree-structured data. Using the notion of uniform stability, we derive generalization guarantees for a large class of loss functions. We also provide experimental results on two real-world datasets which show that edit similarities learned with GESL induce more accurate and sparser classifiers than other (standard or learned) edit similarities.
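
    A compact Python sketch of the loss-minimization idea, under the simplifying assumption that each string pair comes with a fixed vector n counting how often every edit operation appears in one chosen edit script, which makes the parameterized edit cost e_C = C . n linear in the costs C. The hinge margins eta and B and the projected-gradient solver are illustrative assumptions, not the paper's exact GESL formulation.

        import numpy as np

        def learn_edit_costs(pairs, eta=1.0, B=2.0, lam=0.1, lr=0.05, epochs=500):
            # pairs: list of (n, same_label) where n[i] counts edit operation i in
            # a fixed script between the two strings, so e_C = C @ n is linear in C.
            d = len(pairs[0][0])
            C = np.ones(d)
            for _ in range(epochs):
                g = np.zeros(d)
                for n, same in pairs:
                    e = C @ n
                    if same and e > eta:      # same class: hinge [e - eta]_+ active
                        g += n
                    elif not same and e < B:  # different class: hinge [B - e]_+ active
                        g -= n
                # Gradient step with Frobenius regularization, keeping costs >= 0.
                C = np.maximum(C - lr * (g / len(pairs) + 2 * lam * C), 0.0)
            return C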