56 research outputs found

    Learning Stochastic Tree Edit Distance

    Pages 42-53. Trees provide a suitable structural representation for complex tasks such as web information extraction, RNA secondary structure prediction, or conversion of tree-structured documents. In this context, many applications require computing similarities between pairs of trees. The most studied distance is probably the tree edit distance, whose computational complexity has been improved over the last decade. However, this classic edit distance usually relies on edit costs fixed a priori, which are often difficult to tune and leave little room for tackling complex problems. In this paper, we focus on learning a stochastic tree edit distance. We use an adaptation of the expectation-maximization algorithm to learn the primitive edit costs. We carried out several series of experiments that confirm the benefit of learning a tree edit distance rather than imposing edit costs a priori.

    Melody recognition with learned edit distances

    In a music recognition task, a new melody is often classified by looking for the closest piece in a set of already known prototypes. The definition of a relevant similarity measure then becomes a crucial point. So far, the edit distance with operation costs fixed a priori has been one of the most widely used approaches to this task. In this paper, the application of a probabilistic learning model to both string and tree edit distances is proposed and compared to a genetic algorithm cost-fitting approach. The results show that both learning models outperform fixed-cost systems, and that the probabilistic approach is able to describe the underlying melodic similarity model consistently. This work was funded by the French ANR Marmota project, the Spanish PROSEMUS project (TIN2006-14932-C02), the research programme Consolider Ingenio 2010 (MIPRCV, CSD2007-00018), and the Pascal Network of Excellence.
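
    The fixed-cost baseline mentioned in this abstract can be illustrated with a minimal sketch of the standard dynamic-programming string edit distance, in which the operation costs are passed in as functions. This is not the paper's code: the function names (`edit_distance`, `unit_sub`, `unit_indel`) and the unit costs are assumptions chosen for illustration. A learned model would supply its own cost functions in place of the fixed ones.

```python
from typing import Callable, Sequence


def edit_distance(
    a: Sequence[str],
    b: Sequence[str],
    sub_cost: Callable[[str, str], float],
    ins_cost: Callable[[str], float],
    del_cost: Callable[[str], float],
) -> float:
    """Dynamic-programming edit distance with pluggable operation costs."""
    n, m = len(a), len(b)
    # d[i][j] holds the cheapest cost of turning a[:i] into b[:j].
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = d[i - 1][0] + del_cost(a[i - 1])
    for j in range(1, m + 1):
        d[0][j] = d[0][j - 1] + ins_cost(b[j - 1])
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j] + del_cost(a[i - 1]),                # delete a[i-1]
                d[i][j - 1] + ins_cost(b[j - 1]),                # insert b[j-1]
                d[i - 1][j - 1] + sub_cost(a[i - 1], b[j - 1]),  # substitute or match
            )
    return d[n][m]


# Hypothetical fixed costs: every operation costs 1, a match costs 0.
def unit_sub(x: str, y: str) -> float:
    return 0.0 if x == y else 1.0


def unit_indel(x: str) -> float:
    return 1.0
```

    Changing the cost functions changes which alignments are cheapest; for example, making substitutions cost more than a deletion plus an insertion causes the recurrence to prefer the latter pair.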

    On the Usefulness of Similarity Based Projection Spaces for Transfer Learning

    Talk: http://videolectures.net/simbad2011_morvant_transfer/, 16 pages. Similarity functions are widely used in many machine learning and pattern recognition tasks. We consider here a recent framework for binary classification, proposed by Balcan et al., that allows learning in a potentially non-geometrical space based on good similarity functions. This framework generalizes the notion of kernels used in support vector machines in the sense that it allows one to use similarity functions that need not be positive semi-definite or symmetric. The similarities are then used to define an explicit projection space where a linear classifier with good generalization properties can be learned. In this paper, we propose to study experimentally the usefulness of similarity-based projection spaces for transfer learning. More precisely, we consider the problem of domain adaptation, where the distributions generating the learning data and the test data are somewhat different. We consider the case where no information on the test labels is available. We show that a simple renormalization of a good similarity function that takes the test data into account allows us to learn classifiers that perform better on the target distribution for difficult adaptation problems. Moreover, this normalization always helps to improve the model when we try to regularize the similarity-based projection space in order to bring the two distributions closer. We provide experiments on a toy problem and on a real image annotation task.
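
    The explicit projection space described in this abstract can be sketched as follows: each example is mapped to the vector of its similarities to a set of landmark points, and a linear classifier is trained on those vectors. This is only an illustration under stated assumptions, not the authors' method: the similarity `toy_sim`, the landmarks, and the use of a perceptron as the linear learner are all hypothetical choices, and the paper's renormalization and regularization steps are not shown.

```python
from typing import Callable, List, Sequence, Tuple


def project(x: float, landmarks: Sequence[float],
            sim: Callable[[float, float], float]) -> List[float]:
    """Map x to its vector of similarities to the landmark points."""
    return [sim(x, l) for l in landmarks]


def train_perceptron(points: Sequence[Sequence[float]], labels: Sequence[int],
                     epochs: int = 50) -> Tuple[List[float], float]:
    """Plain perceptron used here as a stand-in linear classifier (labels in {-1, +1})."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for p, y in zip(points, labels):
            score = sum(wi * pi for wi, pi in zip(w, p)) + b
            if y * score <= 0:  # misclassified: move the hyperplane towards p
                w = [wi + y * pi for wi, pi in zip(w, p)]
                b += y
    return w, b


def predict(w: Sequence[float], b: float, p: Sequence[float]) -> int:
    return 1 if sum(wi * pi for wi, pi in zip(w, p)) + b > 0 else -1


# Hypothetical similarity on real numbers; any similarity function could be plugged in.
def toy_sim(x: float, l: float) -> float:
    return 1.0 - abs(x - l)


landmarks = [0.0, 1.0]
xs = [0.0, 0.2, 0.9, 1.1]
ys = [-1, -1, 1, 1]
projected = [project(x, landmarks, toy_sim) for x in xs]
w, b = train_perceptron(projected, ys)
```

    Because the classifier only sees the similarity vectors, the similarity function itself is never required to be symmetric or positive semi-definite in this construction.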

    SEDiL: Software for Edit Distance Learning

    In this paper, we present SEDiL, a Software for Edit Distance Learning. SEDiL is an innovative prototype implementation that groups together most of the state-of-the-art methods for automatically learning the parameters of string and tree edit distances. This work was funded by the French ANR Marmota project, the Pascal Network of Excellence, and the Spanish research programme Consolider Ingenio-2010 (CSD2007-00018).

    Learning Good Edit Similarities with Generalization Guarantees

    Similarity and distance functions are essential to many learning algorithms, so learning them has attracted a lot of interest. When it comes to dealing with structured data (e.g., strings or trees), edit similarities are widely used, and there exist a few methods for learning them. However, these methods offer no theoretical guarantee as to the generalization performance and discriminative power of the resulting similarities. Recently, a theory of learning with good similarity functions was proposed. This new theory bridges the gap between the properties of a similarity function and its performance in classification. In this paper, we propose a novel edit similarity learning approach (GESL) driven by the idea of goodness, which allows us to derive generalization guarantees using the notion of uniform stability. We experimentally show that edit similarities learned with our method induce classification models that are both more accurate and sparser than those induced by the edit distance or by edit similarities learned with a state-of-the-art method.