56 research outputs found

Learning Stochastic Tree Edit Distance

Author: A. Dempster
G. Bouchard
M. Neuhaus
P. Bille
P. Klein
R. Durbin
S. Ristad
S. Selkow
Publication venue: HAL CCSD
Publication date: 01/01/2006

Pages 42-53. Trees provide a suitable structural representation for complex tasks such as web information extraction, RNA secondary structure prediction, or conversion of tree-structured documents. In this context, many applications require computing similarities between pairs of trees. The most studied distance is likely the tree edit distance, for which improvements in terms of complexity have been achieved during the last decade. However, this classic edit distance usually relies on edit costs fixed a priori, which are often difficult to tune and leave little room for tackling complex problems. In this paper, we focus on learning a stochastic tree edit distance. We use an adaptation of the expectation-maximization algorithm to learn the primitive edit costs. We carried out several series of experiments that confirm the benefit of learning a tree edit distance rather than imposing edit costs a priori.

Melody recognition with learned edit distances

Author: A. Dempster
D. Rizo
G. Aloupis
G. Bouchard
J. Oncina
L. Boyer
M. Bernard
M. Bernard
M. Mongeau
R. Durbin
S. Doraisamy
S.M. Selkow
S.V. Ristad
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2008

In a music recognition task, a new melody is often classified by looking for the closest piece in a set of already known prototypes. The definition of a relevant similarity measure then becomes a crucial point. So far, the edit distance approach with operation costs fixed a priori has been one of the most widely used for this task. In this paper, the application of a probabilistic learning model to both string and tree edit distances is proposed and compared to a genetic algorithm cost-fitting approach. The results show that both learning models outperform fixed-cost systems, and that the probabilistic approach is able to describe the underlying melodic similarity model consistently. This work was funded by the French ANR Marmota project, the Spanish PROSEMUS project (TIN2006-14932-C02), the research programme Consolider Ingenio 2010 (MIPRCV, CSD2007-00018), and the Pascal Network of Excellence.
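To make the role of operation costs concrete, here is a minimal sketch of a string edit distance whose insertion, deletion, and substitution costs are passed in as functions. It is not the authors' learning method: the function name `weighted_edit_distance` and the example cost functions are hypothetical, and the costs are set by hand rather than learned.

```python
from typing import Callable, Sequence


def weighted_edit_distance(
    a: Sequence[str],
    b: Sequence[str],
    sub_cost: Callable[[str, str], float],
    ins_cost: Callable[[str], float],
    del_cost: Callable[[str], float],
) -> float:
    """Minimum total cost of turning a into b (dynamic programming)."""
    n, m = len(a), len(b)
    # d[i][j] = cheapest way to turn a[:i] into b[:j]
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = d[i - 1][0] + del_cost(a[i - 1])
    for j in range(1, m + 1):
        d[0][j] = d[0][j - 1] + ins_cost(b[j - 1])
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j] + del_cost(a[i - 1]),                # delete a[i-1]
                d[i][j - 1] + ins_cost(b[j - 1]),                # insert b[j-1]
                d[i - 1][j - 1] + sub_cost(a[i - 1], b[j - 1]),  # substitute or match
            )
    return d[n][m]


# Hypothetical hand-set costs (assumptions for illustration only).
def unit_sub(x: str, y: str) -> float:
    return 0.0 if x == y else 1.0


def cheap_sub(x: str, y: str) -> float:
    return 0.0 if x == y else 0.5


def unit_indel(x: str) -> float:
    return 1.0


classic = weighted_edit_distance("kitten", "sitting", unit_sub, unit_indel, unit_indel)
tuned = weighted_edit_distance("kitten", "sitting", cheap_sub, unit_indel, unit_indel)
```

Changing only the substitution cost changes the resulting distance between the same two strings, which is why the choice of costs matters for nearest-prototype classification.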

Repositorio Institucional de la Universidad de Alicante

HAL-UJM

Crossref

HAL AMU

On the Usefulness of Similarity Based Projection Spaces for Transfer Learning

Author: A. Smeaton
B. Haasdonk
E. Ristad
E. Zhong
J. Quiñonero-Candela
K. Weinberger
L. Bruzzone
M. Bernard
M.F. Balcan
S. Ben-David
S. Ben-David
S. Pan
X. Gao
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2011

Talk: http://videolectures.net/simbad2011_morvant_transfer/, 16 pages. Similarity functions are widely used in many machine learning and pattern recognition tasks. We consider here a recent framework for binary classification, proposed by Balcan et al., that allows learning in a potentially non-geometrical space based on good similarity functions. This framework generalizes the notion of kernels used in support vector machines, in the sense that it allows one to use similarity functions that need not be positive semi-definite or symmetric. The similarities are then used to define an explicit projection space in which a linear classifier with good generalization properties can be learned. In this paper, we study experimentally the usefulness of similarity-based projection spaces for transfer learning. More precisely, we consider the problem of domain adaptation, where the distributions generating the learning data and the test data are somewhat different, in the setting where no information on the test labels is available. We show that a simple renormalization of a good similarity function that takes the test data into account allows us to learn classifiers that perform better on the target distribution for difficult adaptation problems. Moreover, this normalization always helps to improve the model when we regularize the similarity-based projection space in order to bring the two distributions closer. We provide experiments on a toy problem and on a real image annotation task.
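The idea of an explicit projection space can be sketched as follows: each example is mapped to the vector of its similarities to a set of reference points, and a linear classifier is applied to that vector. This is only an illustration under stated assumptions. The names `project`, `linear_predict`, and `toy_similarity` are hypothetical, the weights are set by hand rather than learned, and the toy similarity is merely asymmetric; it is not claimed to satisfy the goodness definition of Balcan et al.

```python
from typing import Callable, List, Sequence


def project(
    x: float,
    landmarks: Sequence[float],
    sim: Callable[[float, float], float],
) -> List[float]:
    """Map x to its similarities to each landmark (the projection space)."""
    return [sim(x, lm) for lm in landmarks]


def linear_predict(phi: Sequence[float], weights: Sequence[float], bias: float = 0.0) -> int:
    """Sign of a linear score computed in the projection space."""
    score = sum(w * v for w, v in zip(weights, phi)) + bias
    return 1 if score >= 0 else -1


def toy_similarity(x: float, y: float) -> float:
    # Hypothetical similarity, deliberately asymmetric:
    # toy_similarity(x, y) != toy_similarity(y, x) in general.
    return 1.0 - abs(x - y) / (1.0 + abs(x))


landmarks = [0.0, 1.0]   # assumed reference points
weights = [1.0, -1.0]    # hand-set weights, not learned

phi_a = project(0.0, landmarks, toy_similarity)
phi_b = project(1.0, landmarks, toy_similarity)
label_a = linear_predict(phi_a, weights)
label_b = linear_predict(phi_b, weights)
```

Because the classifier only sees the similarity vector, the similarity function itself does not need to be a valid kernel for the construction to be well defined.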

PTIGS-IdIt, a system for species identification by DNA sequences of the psbA-trnH intergenic spacer region

Author: A Krogh
BA Whitlock
C Sass
Chang Liu
Dong Liang
DP Little
ES Ristad
F Austerlitz
H Yao
H Yao
H Štorchová
Huan Li
Hui Yao
J Aldrich
J Shaw
Jianping Han
Jingyuan Song
KH Chu
Kun Jiang
M Brudno
MB Hamilton
MW Chase
PM Hollingsworth
R Lahaye
S Batzoglou
S Chen
S Uliel
SG Newmaster
Shilin Chen
SR Eddy
SR Eddy
Ting Gao
WJ Kress
X Huang
Xiaohui Pang
Xiaojun Guan
Zhihua Liu
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Computational complexity of polyadic lifts of generalized quantifiers in natural language

Author: A. Blass
A. Mostowski
A. Turing
C. Cherniak
C.H. Papadimitriou
C.T. McMillan
E. Grädel
E. Keenan
E. Keenan
E. Keenan
E.S. Ristad
G. Ben-Avi
G. Frege
G. Sher
H.J. Levesque
I. Rooij van
J. Barwise
J. Benthem van
J. Benthem van
J. Hintikka
J. Hintikka
J. Hintikka
J. Szymanik
J. Väänänen
Jakub Szymanik
K. Bach
K. Jaszczolt
L. Hella
L. Robaldo
M. Dalrymple
M. Frixione
M. Hackl
M. Krynicki
M. Mostowski
M. Mostowski
M. Mostowski
M.R. Garey
N. Gierasimczuk
N. Immerman
N. Tennant
P. Lindström
P. Pietroski
R. May
R. Montague
R.M. Karp
R.M. Kempson
R.M. Kempson
R.M. Kempson
S. Beck
S. Peters
Y. Moschovakis
Publication venue: 'Springer Science and Business Media LLC'

Crossref

SEDiL: Software for Edit Distance Learning

Author: J. Oncina
L. Boyer
M. Bernard
M. Bernard
P. Bille
R. Durbin
R. Wagner
S. Ristad
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2008

In this paper, we present SEDiL, a Software for Edit Distance Learning. SEDiL is a prototype implementation that brings together most of the state-of-the-art methods for automatically learning the parameters of string and tree edit distances. This work was funded by the French ANR Marmota project, the Pascal Network of Excellence, and the Spanish research programme Consolider Ingenio-2010 (CSD2007-00018).

Repositorio Institucional de la Universidad de Alicante

HAL-UJM

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

HAL AMU

Characterization of resistance genes against scald (Rhynchosporium secalis (Oudem.) J.J. Davis) in barley (Hordeum vulgare L.) lines from central Norway, by means of genetic markers and pathotype tests

Author: Bjørnstad Å
Grønnerød S.
Reitan L.
Ristad T. P.
Salamati S.
Skinnes H.
Waugh R.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 04/04/2002

University of Dundee Online Publications

The Generalized MDL Approach for Summarization

Author: Agrawal
Berchtold
Fang
Garcia
Gu
Guttman
Kilpeläinen
Mehta
Ng
Reckhow
Ristad
Ross Quinlan
S
Publication venue: 'Elsevier BV'
Publication date: 01/01/2002

Crossref

Learning Good Edit Similarities with Generalization Guarantees

Author: C. McDiarmid
E.S. Ristad
H. Saigo
J. Oncina
K.Q. Weinberger
M. Bernard
O. Bousquet
S. Henikoff
Publication venue: HAL CCSD
Publication date: 01/01/2011

Similarity and distance functions are essential to many learning algorithms, so training them has attracted a lot of interest. For structured data (e.g., strings or trees), edit similarities are widely used, and a few methods exist for learning them. However, these methods offer no theoretical guarantee as to the generalization performance and discriminative power of the resulting similarities. Recently, a theory of learning with good similarity functions was proposed. This new theory bridges the gap between the properties of a similarity function and its performance in classification. In this paper, we propose a novel edit similarity learning approach (GESL) driven by the idea of goodness, which allows us to derive generalization guarantees using the notion of uniform stability. We show experimentally that edit similarities learned with our method induce classification models that are both more accurate and sparser than those induced by the edit distance or by edit similarities learned with a state-of-the-art method.

HAL-UJM

Crossref

HAL AMU

Partitional vs Hierarchical Clustering Using a Minimum Grammar Complexity Approach

Author: A. Marzal
E. S. Ristad
G. Cortelazzo
K. S. Fu
K. S. Fu
R. J. Solomonoff
S. Y. Lu
Publication venue: 'Springer Science and Business Media LLC'

Crossref