
    Optimal Decomposition Strategy For Tree Edit Distance

    An ordered labeled tree is a tree in which the left-to-right order among siblings is significant. Given two ordered labeled trees, the edit distance between them is the minimum cost of a sequence of edit operations that converts one tree into the other. In this thesis, we present an algorithm for the tree edit distance problem that uses an optimal tree decomposition strategy. By combining vertical compression of trees with optimal decomposition, we can significantly reduce the running time of the algorithm. We compare our method with other methods both theoretically and experimentally. The test results show that our strategy on compressed trees is by far the best decomposition strategy, creating the smallest number of relevant subproblems.
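
    One plausible reading of the vertical compression mentioned above is collapsing chains of nodes that have exactly one child into a single node; the sketch below illustrates only that reading, and the `Node` class and `compress` function are illustrative names, not the thesis's exact construction.

```python
# A minimal sketch of one plausible reading of "vertical compression":
# chains of nodes with exactly one child are merged into a single node
# whose label concatenates the chain's labels.  This is an illustrative
# assumption, not the thesis's construction.

class Node:
    def __init__(self, label, children=None):
        self.label = label
        self.children = children or []   # left-to-right order is significant

def compress(node):
    """Return a vertically compressed copy of the ordered labeled tree."""
    labels = [node.label]
    # Follow the chain while the current node has exactly one child.
    while len(node.children) == 1:
        node = node.children[0]
        labels.append(node.label)
    # Recurse into the (zero or several) children below the chain.
    return Node("/".join(labels), [compress(c) for c in node.children])

# Example: a -> b -> c, where c has two leaves, collapses to "a/b/c".
t = Node("a", [Node("b", [Node("c", [Node("d"), Node("e")])])])
print(compress(t).label)   # a/b/c
```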

    Decomposition algorithms for the tree edit distance problem

    We study the behavior of dynamic programming methods for the tree edit distance problem, such as [P. Klein, Computing the edit-distance between unrooted ordered trees, in: Proceedings of the 6th European Symposium on Algorithms, 1998, pp. 91–102; K. Zhang, D. Shasha, SIAM J. Comput. 18 (6) (1989) 1245–1262]. We show that those two algorithms may be described as decomposition strategies. We introduce the general framework of cover strategies, and we provide an exact characterization of the complexity of cover strategies. This analysis allows us to define a new tree edit distance algorithm that is optimal for cover strategies.
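
    As a concrete reference point for the dynamic programming methods discussed above, here is a compact sketch of the Zhang-Shasha recurrence with unit costs. The `Node` class and function names are illustrative; real implementations add general cost functions and further optimizations.

```python
# A sketch of the Zhang-Shasha dynamic program for ordered tree edit
# distance with unit insert/delete/relabel costs.  Trees are postorder
# numbered; keyroots index the "relevant subproblems" of the strategy.

class Node:
    def __init__(self, label, children=None):
        self.label = label
        self.children = children or []   # ordered left to right

def _index(root):
    """Postorder labels and, for each node, its leftmost-leaf index."""
    labels, lml = [], []
    def walk(node):
        first = None
        for child in node.children:
            f = walk(child)
            if first is None:
                first = f
        labels.append(node.label)
        idx = len(labels) - 1
        lml.append(first if first is not None else idx)
        return lml[idx]
    walk(root)
    return labels, lml

def tree_edit_distance(t1, t2):
    la, l1 = _index(t1)
    lb, l2 = _index(t2)
    # Keyroots: for each leftmost-leaf value, the highest postorder node.
    kr1 = sorted({l1[i]: i for i in range(len(la))}.values())
    kr2 = sorted({l2[j]: j for j in range(len(lb))}.values())
    td = [[0] * len(lb) for _ in range(len(la))]

    for i in kr1:
        for j in kr2:
            m, n = i - l1[i] + 2, j - l2[j] + 2
            fd = [[0] * n for _ in range(m)]      # forest distances
            for x in range(1, m):
                fd[x][0] = fd[x - 1][0] + 1       # delete
            for y in range(1, n):
                fd[0][y] = fd[0][y - 1] + 1       # insert
            for x in range(1, m):
                for y in range(1, n):
                    ni, nj = x + l1[i] - 1, y + l2[j] - 1
                    if l1[ni] == l1[i] and l2[nj] == l2[j]:
                        rel = 0 if la[ni] == lb[nj] else 1
                        fd[x][y] = min(fd[x - 1][y] + 1,
                                       fd[x][y - 1] + 1,
                                       fd[x - 1][y - 1] + rel)
                        td[ni][nj] = fd[x][y]
                    else:
                        p, q = l1[ni] - l1[i], l2[nj] - l2[j]
                        fd[x][y] = min(fd[x - 1][y] + 1,
                                       fd[x][y - 1] + 1,
                                       fd[p][q] + td[ni][nj])
    return td[len(la) - 1][len(lb) - 1]
```

    Each keyroot pair spawns one forest-distance table, which is roughly the kind of relevant subproblem that strategy analyses count.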

    An O(n^3)-Time Algorithm for Tree Edit Distance

    The {\em edit distance} between two ordered trees with vertex labels is the minimum cost of transforming one tree into the other by a sequence of elementary operations consisting of deleting and relabeling existing nodes, as well as inserting new nodes. In this paper, we present a worst-case $O(n^3)$-time algorithm for this problem, improving the previous best $O(n^3 \log n)$-time algorithm~\cite{Klein}. Our result requires a novel adaptive strategy for deciding how a dynamic program divides into subproblems (which is interesting in its own right), together with a deeper understanding of the previous algorithms for the problem. We also prove the optimality of our algorithm among the family of \emph{decomposition strategy} algorithms--which also includes the previous fastest algorithms--by tightening the known lower bound of $\Omega(n^2 \log^2 n)$~\cite{Touzet} to $\Omega(n^3)$, matching our algorithm's running time. Furthermore, we obtain matching upper and lower bounds of $\Theta(n m^2 (1 + \log \frac{n}{m}))$ when the two trees have different sizes $m$ and~$n$, where $m < n$. Comment: 10 pages, 5 figures, 5 .tex files where TED.tex is the main one
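
    The decomposition strategies referred to above differ mainly in which subforests the recursion keeps intact; a common ingredient, going back to Klein's algorithm, is to follow heavy children (the children with the largest subtrees). The sketch below shows only that heavy-child selection, assuming trees are plain (label, children) tuples; it is not the paper's adaptive strategy.

```python
# Heavy-child selection: at each node, keep the child with the largest
# subtree on the main path; the remaining (light) subtrees become the
# subproblems of a decomposition strategy.  Trees are (label, children)
# tuples; all names here are illustrative.

def subtree_size(t):
    _, children = t
    return 1 + sum(subtree_size(c) for c in children)

def heavy_child(t):
    """Child rooted at the largest subtree, or None at a leaf."""
    _, children = t
    return max(children, key=subtree_size, default=None)

def heavy_path(t):
    """Follow heavy children down from the root."""
    path = [t]
    while heavy_child(path[-1]) is not None:
        path.append(heavy_child(path[-1]))
    return [label for label, _ in path]

# Example: the heavy path hugs the long spine of the tree.
t = ("a", [("b", []), ("c", [("d", [("e", [])]), ("f", [])])])
print(heavy_path(t))   # ['a', 'c', 'd', 'e']
```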

    Approximating Edit Distance Within Constant Factor in Truly Sub-Quadratic Time

    Edit distance is a measure of similarity of two strings based on the minimum number of character insertions, deletions, and substitutions required to transform one string into the other. The edit distance can be computed exactly using a dynamic programming algorithm that runs in quadratic time. Andoni, Krauthgamer and Onak (2010) gave a nearly linear time algorithm that approximates edit distance within approximation factor $\text{poly}(\log n)$. In this paper, we provide an algorithm with running time $\tilde{O}(n^{2-2/7})$ that approximates the edit distance within a constant factor.
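
    For contrast with the sub-quadratic approximation, the exact quadratic-time dynamic program mentioned above is short enough to state in full; the function name below is illustrative.

```python
# The standard quadratic-time dynamic program for exact string edit
# distance with unit-cost insertions, deletions and substitutions.

def edit_distance(a: str, b: str) -> int:
    m, n = len(a), len(b)
    # dp[i][j] = edit distance between a[:i] and b[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i                      # delete all of a[:i]
    for j in range(n + 1):
        dp[0][j] = j                      # insert all of b[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # delete a[i-1]
                           dp[i][j - 1] + 1,        # insert b[j-1]
                           dp[i - 1][j - 1] + sub)  # match / substitute
    return dp[m][n]

print(edit_distance("kitten", "sitting"))   # 3
```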

    Interactive visualisation and exploration of biological data

    International audience. No abstract available.

    Structural Rounding: Approximation Algorithms for Graphs Near an Algorithmically Tractable Class

    We develop a framework for generalizing approximation algorithms from the structural graph algorithm literature so that they apply to graphs somewhat close to that class (a scenario we expect is common when working with real-world networks) while still guaranteeing approximation ratios. The idea is to edit a given graph via vertex- or edge-deletions to put the graph into an algorithmically tractable class, apply known approximation algorithms for that class, and then lift the solution to apply to the original graph. We give a general characterization of when an optimization problem is amenable to this approach, and show that it includes many well-studied graph problems, such as Independent Set, Vertex Cover, Feedback Vertex Set, Minimum Maximal Matching, Chromatic Number, (l-)Dominating Set, Edge (l-)Dominating Set, and Connected Dominating Set. To enable this framework, we develop new editing algorithms that find the approximately-fewest edits required to bring a given graph into one of a few important graph classes (in some cases these are bicriteria algorithms which simultaneously approximate both the number of editing operations and the target parameter of the family). For bounded degeneracy, we obtain an O(r log n)-approximation and a bicriteria (4,4)-approximation which also extends to a smoother bicriteria trade-off. For bounded treewidth, we obtain a bicriteria (O(log^{1.5} n), O(sqrt{log w}))-approximation, and for bounded pathwidth, we obtain a bicriteria (O(log^{1.5} n), O(sqrt{log w} * log n))-approximation. For treedepth 2 (related to bounded expansion), we obtain a 4-approximation. We also prove complementary hardness-of-approximation results assuming P != NP: in particular, these problems are all log-factor inapproximable, except the last, which is not approximable below some constant factor (2 assuming UGC).
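
    As a toy illustration of the edit / solve / lift pattern (not the paper's editing algorithms), the sketch below deletes high-degree vertices until the maximum degree is bounded, runs a greedy Independent Set heuristic on the edited graph, and lifts the result unchanged, which is valid because an independent set in an induced subgraph remains independent in the original graph. The function names and the degree-based editing rule are assumptions made for the example.

```python
# Edit / solve / lift on a plain adjacency dict (vertex -> set of
# neighbors).  The editing rule here is a crude stand-in, not one of the
# paper's approximation algorithms.

def edit_to_bounded_degree(adj, d):
    """Edit: delete maximum-degree vertices until every degree is <= d."""
    alive = set(adj)
    deg = {v: len(adj[v] & alive) for v in alive}
    while alive and max(deg[v] for v in alive) > d:
        v = max(alive, key=lambda u: deg[u])
        alive.discard(v)
        for u in adj[v] & alive:
            deg[u] -= 1
    return alive

def greedy_independent_set(adj, alive):
    """Solve: min-degree greedy Independent Set on the edited graph."""
    alive = set(alive)
    chosen = set()
    while alive:
        v = min(alive, key=lambda u: len(adj[u] & alive))
        chosen.add(v)
        alive -= adj[v] | {v}
    return chosen   # lift: still independent in the original graph

# Example: a 5-cycle plus a hub adjacent to every cycle vertex.
adj = {0: {1, 4, 5}, 1: {0, 2, 5}, 2: {1, 3, 5},
       3: {2, 4, 5}, 4: {3, 0, 5}, 5: {0, 1, 2, 3, 4}}
alive = edit_to_bounded_degree(adj, d=2)      # deletes the hub
print(greedy_independent_set(adj, alive))     # e.g. {0, 2}
```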

    A Survey on Metric Learning for Feature Vectors and Structured Data

    The need for appropriate ways to measure the distance or similarity between data is ubiquitous in machine learning, pattern recognition and data mining, but handcrafting such good metrics for specific problems is generally difficult. This has led to the emergence of metric learning, which aims at automatically learning a metric from data and has attracted a lot of interest in machine learning and related fields for the past ten years. This survey paper proposes a systematic review of the metric learning literature, highlighting the pros and cons of each approach. We pay particular attention to Mahalanobis distance metric learning, a well-studied and successful framework, but additionally present a wide range of methods that have recently emerged as powerful alternatives, including nonlinear metric learning, similarity learning and local metric learning. Recent trends and extensions, such as semi-supervised metric learning, metric learning for histogram data and the derivation of generalization guarantees, are also covered. Finally, this survey addresses metric learning for structured data, in particular edit distance learning, and attempts to give an overview of the remaining challenges in metric learning for the years to come. Comment: Technical report, 59 pages. Changes in v2: fixed typos and improved presentation. Changes in v3: fixed typos. Changes in v4: fixed typos and new method
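
    To make the Mahalanobis family mentioned above concrete: the learned metric is d_M(x, y) = sqrt((x - y)^T M (x - y)) with M positive semi-definite, and factoring M = L^T L shows it equals Euclidean distance after the linear map L. The numpy sketch below checks that identity with a random stand-in for L (not a learned matrix).

```python
# Mahalanobis-style metric: d_M(x, y) = sqrt((x - y)^T M (x - y)).
# With M = L^T L this is Euclidean distance after the linear map L,
# which is the quantity most Mahalanobis metric learning methods tune.

import numpy as np

def mahalanobis(x, y, M):
    d = x - y
    return float(np.sqrt(d @ M @ d))

rng = np.random.default_rng(0)
L = rng.normal(size=(2, 3))     # stand-in for a learned linear map
M = L.T @ L                     # positive semi-definite by construction

x, y = rng.normal(size=3), rng.normal(size=3)
print(mahalanobis(x, y, M))                 # metric form
print(np.linalg.norm(L @ x - L @ y))        # identical: Euclidean after L
```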