LIPIcs, Volume 251, ITCS 2023, Complete Volume
Mining Butterflies in Streaming Graphs
This thesis introduces two main-memory systems, sGrapp and sGradd, for performing the fundamental analytic tasks of biclique counting and concept drift detection over a streaming graph. The systems are architected using a data-driven heuristic. To this end, the growth patterns of bipartite streaming graphs are first mined and the emergence principles of streaming motifs are discovered. Next, the discovered principles are (a) explained by a graph generator called sGrow; and (b) used to establish the requirements for efficient, effective, explainable, and interpretable management and processing of streams. sGrow is used to benchmark stream analytics, particularly concept drift detection.
sGrow robustly realizes streaming growth patterns independently of initial conditions, scale and temporal characteristics, and model configurations. Extensive evaluations confirm that sGrapp and sGradd are simultaneously effective and efficient. sGrapp achieves a mean absolute percentage error of at most 0.05/0.14 for the cumulative butterfly count in streaming graphs with uniform/non-uniform temporal distribution, and a processing throughput of 1.5 million data records per second. sGrapp's throughput is 160x higher than, and its estimation error 0.02x that of, the baselines. sGradd demonstrates improving performance over time, achieves a zero false detection rate both when no drift is present and when a drift has already been detected, and detects sequential drifts within zero to a few seconds of their occurrence, regardless of drift intervals.
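The butterfly, a (2,2)-biclique, is the motif whose cumulative count sGrapp estimates. The streaming estimator itself is not reproduced in this abstract; as an illustration of the quantity being approximated, a minimal exact batch counter in Python (the graph representation and function name are my own) might look like:

```python
from collections import defaultdict
from itertools import combinations

def count_butterflies(edges):
    """Count butterflies ((2,2)-bicliques) in a bipartite graph.

    A butterfly is two left vertices plus two right vertices with all
    four edges present. For every right vertex we enumerate pairs of
    its left neighbours; a left pair sharing c right neighbours
    contributes C(c, 2) butterflies.
    """
    adj = defaultdict(set)          # right vertex -> set of left neighbours
    for u, v in edges:              # u: left side, v: right side
        adj[v].add(u)
    pair_count = defaultdict(int)   # (u1, u2) -> number of shared right neighbours
    for neigh in adj.values():
        for u1, u2 in combinations(sorted(neigh), 2):
            pair_count[(u1, u2)] += 1
    return sum(c * (c - 1) // 2 for c in pair_count.values())

# A complete 2x2 bipartite graph contains exactly one butterfly.
edges = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")]
print(count_butterflies(edges))  # -> 1
```

An exact counter like this is quadratic in vertex degrees, which is precisely why streaming approximations such as sGrapp trade exactness for throughput.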
Coeditor: Leveraging Contextual Changes for Multi-round Code Auto-editing
Developers often dedicate significant time to maintaining and refactoring
existing code. However, most prior work on generative models for code focuses
solely on creating new code, neglecting the unique requirements of editing
existing code. In this work, we explore a multi-round code auto-editing
setting, aiming to predict edits to a code region based on recent changes
within the same codebase. Our model, Coeditor, is a fine-tuned CodeT5 model
with enhancements specifically designed for code editing tasks. We encode code
changes using a line diff format and employ static analysis to form large
customized model contexts, ensuring appropriate information for prediction. We
collect a code editing dataset from the commit histories of 1650 open-source
Python projects for training and evaluation. In a simplified single-round,
single-edit task, Coeditor significantly outperforms the best code completion
approach -- nearly doubling its exact-match accuracy, despite using a much
smaller model -- demonstrating the benefits of incorporating editing history
for code completion. In a multi-round, multi-edit setting, we observe
substantial gains by iteratively prompting the model with additional user
edits. We open-source our code, data, and model weights to encourage future
research and release a VSCode extension powered by our model for interactive
usage.
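The paper's exact line-diff encoding (its special tokens and their integration with the CodeT5 tokenizer) is not spelled out in this abstract; a rough Python sketch of the general idea, marking each line as kept, deleted, or added, could be built on the standard difflib module:

```python
import difflib

def line_diff(before, after):
    """Encode a code change as a line diff: unchanged lines are
    prefixed with ' ', deleted lines with '-', added lines with '+'.
    (Illustrative only; the paper's own token scheme may differ.)"""
    out = []
    sm = difflib.SequenceMatcher(a=before, b=after)
    for tag, i1, i2, j1, j2 in sm.get_opcodes():
        if tag in ("delete", "replace"):
            out += ["-" + line for line in before[i1:i2]]
        if tag in ("insert", "replace"):
            out += ["+" + line for line in after[j1:j2]]
        if tag == "equal":
            out += [" " + line for line in before[i1:i2]]
    return out

before = ["def add(a, b):", "    return a + b"]
after = ["def add(a, b):", "    # sum two numbers", "    return a + b"]
print("\n".join(line_diff(before, after)))
```

Feeding such a compact change representation to the model, rather than two full file versions, is what lets large surrounding contexts fit into the input window.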
Planar Disjoint Paths, Treewidth, and Kernels
In the Planar Disjoint Paths problem, one is given an undirected planar graph
together with k vertex pairs (s_1, t_1), ..., (s_k, t_k), and the task is to find k pairwise
vertex-disjoint paths such that the i-th path connects s_i to t_i. We
study the problem through the lens of kernelization, aiming at efficiently
reducing the input size in terms of a parameter. We show that Planar Disjoint
Paths does not admit a polynomial kernel when parameterized by k unless coNP ⊆
NP/poly, resolving an open problem by [Bodlaender, Thomassé,
Yeo, ESA'09]. Moreover, we rule out the existence of a polynomial Turing kernel
unless the WK-hierarchy collapses. Our reduction carries over to the setting of
edge-disjoint paths, where the kernelization status remained open even in
general graphs.
On the positive side, we present a polynomial kernel for Planar Disjoint
Paths parameterized by k + tw, where tw denotes the treewidth of the input
graph. As a consequence of both our results, we rule out the possibility of a
polynomial-time (Turing) treewidth reduction to tw = poly(k) under the same
assumptions. To the best of our knowledge, this is the first hardness result of
this kind. Finally, combining our kernel with the known techniques [Adler,
Kolliopoulos, Krause, Lokshtanov, Saurabh, Thilikos, JCTB'17; Schrijver,
SICOMP'94] yields an alternative (and arguably simpler) proof that Planar
Disjoint Paths can be solved in time 2^{O(k^2)} n^{O(1)}, matching the
result of [Lokshtanov, Misra, Pilipczuk, Saurabh, Zehavi, STOC'20].
Comment: To appear at FOCS'23, 82 pages, 30 figures
Private set intersection: A systematic literature review
Secure Multi-party Computation (SMPC) is a family of protocols which allow some parties to compute a function on their private inputs, obtaining the output at the end and nothing more. In this work, we focus on a particular SMPC problem named Private Set Intersection (PSI). The challenge in PSI is how two or more parties can compute the intersection of their private input sets, while the elements that are not in the intersection remain private. This problem has attracted the attention of many researchers because of its wide variety of applications, contributing to the proliferation of many different approaches. Despite that, current PSI protocols still require heavy cryptographic assumptions that may be unrealistic in some scenarios. In this paper, we perform a Systematic Literature Review of PSI solutions, with the objective of analyzing the main scenarios where PSI has been studied and giving the reader a general taxonomy of the problem together with a general understanding of the most common tools used to solve it. We also analyze the performance using different metrics, trying to determine if PSI is mature enough to be used in realistic scenarios, identifying the pros and cons of each protocol and the remaining open problems.
This work has been partially supported by the projects: BIGPrivDATA (UMA20-FEDERJA-082) from the FEDER Andalucía 2014–2020 Program and SecTwin 5.0 funded by the Ministry of Science and Innovation, Spain, and the European Union (Next Generation EU) (TED2021-129830B-I00). The first author has been funded by the Spanish Ministry of Education under the National F.P.U. Program (FPU19/01118). Funding for open access charge: Universidad de Málaga/CBU
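To make the PSI problem statement concrete, here is a minimal sketch of one classic approach, Diffie-Hellman-style commutative blinding, with both parties simulated in a single process. The prime, the hash-to-group mapping, and the key handling are deliberately simplified for readability and are not production-secure:

```python
import hashlib
from secrets import randbelow

P = 2**521 - 1  # a Mersenne prime; a real deployment would use a vetted group

def h(x):
    """Hash an element into the multiplicative group mod P."""
    d = hashlib.sha256(x.encode()).digest()
    return int.from_bytes(d, "big") % (P - 2) + 2

def psi(set_a, set_b):
    """Semi-honest two-party PSI via commutative (DH-style) blinding.
    Since h(x)^(ka*kb) = h(x)^(kb*ka), doubly-blinded values coincide
    exactly on the intersection, while singly-blinded values reveal
    nothing about non-shared elements."""
    ka = randbelow(P - 3) + 1   # Alice's secret exponent
    kb = randbelow(P - 3) + 1   # Bob's secret exponent
    a_blind = [pow(h(x), ka, P) for x in set_a]                 # Alice -> Bob
    b_blind = {x: pow(h(x), kb, P) for x in set_b}              # Bob -> Alice
    a_double = {pow(v, kb, P) for v in a_blind}                 # Bob re-blinds
    b_double = {x: pow(v, ka, P) for x, v in b_blind.items()}   # Alice re-blinds
    return {x for x, v in b_double.items() if v in a_double}

print(psi({"alice@x.com", "bob@x.com"}, {"bob@x.com", "carol@x.com"}))
# -> {'bob@x.com'}
```

Even this toy version hints at the costs the review measures: one modular exponentiation per element per pass, which is exactly the kind of overhead that motivates the more efficient OT- and hashing-based protocols surveyed.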
Reasoning about quantities and concepts: studies in social learning
We live and learn in a ‘society of mind’. This means that we form beliefs not
just based on our own observations and prior expectations but also based on the
communications from other people, such as our social network peers. Across seven
experiments, I study how people combine their own private observations with other
people’s communications to form and update beliefs about the environment. I will
follow the tradition of rational analysis and benchmark human learning against optimal Bayesian inference at Marr’s computational level. To accommodate human
resource constraints and cognitive biases, I will further contrast human learning
with a variety of process level accounts. In Chapters 2–4, I examine how people
reason about simple environmental quantities. I will focus on the effect of dependent information sources on the success of group and individual learning across a
series of single-player and multi-player judgement tasks. Overall, the results from
Chapters 2–4 highlight the nuances of real social network dynamics and provide
insights into the conditions under which we can expect collective success versus
failures such as the formation of inaccurate worldviews. In Chapter 5, I develop a
more complex social learning task which goes beyond estimation of environmental
quantities and focuses on inductive inference with symbolic concepts. Here, I investigate how people search compositional theory spaces to form and adapt their
beliefs, and how symbolic belief adaptation interfaces with individual and social
learning in a challenging active learning task. Results from Chapter 5 suggest that
people might explore compositional theory spaces using local incremental search;
and that it is difficult for people to use another person’s learning data to improve
upon their hypothesis.
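The optimal Bayesian benchmark invoked above can be made concrete with a toy beta-binomial example. The thesis's tasks are far richer; the function below and its treatment of peer reports are illustrative assumptions only:

```python
def posterior(own_heads, own_tails, peer_reports, a=1, b=1):
    """Posterior-mean belief about a coin's bias from private flips plus
    peers' communicated flip counts, starting from a Beta(a, b) prior.

    Pooling peer reports as if they were one's own observations is the
    'optimal' benchmark under independence; when peers share sources,
    this double-counts evidence, which is one way dependent information
    can mislead a group.
    """
    heads = own_heads + sum(h for h, _ in peer_reports)
    tails = own_tails + sum(t for _, t in peer_reports)
    a, b = a + heads, b + tails
    return a / (a + b)  # mean of the Beta(a, b) posterior

# 3 private heads out of 4 flips, plus two peers' reported (heads, tails)
print(posterior(3, 1, [(2, 2), (1, 3)]))  # -> 0.5
```

Comparing human estimates against such a pooled-evidence ideal, and against discounted variants, is the general shape of the rational-analysis benchmarking described in the abstract.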
Behavior quantification as the missing link between fields: Tools for digital psychiatry and their role in the future of neurobiology
The great behavioral heterogeneity observed between individuals with the same
psychiatric disorder and even within one individual over time complicates both
clinical practice and biomedical research. However, modern technologies are an
exciting opportunity to improve behavioral characterization. Existing
psychiatry methods that are qualitative or unscalable, such as patient surveys
or clinical interviews, can now be collected at a greater capacity and analyzed
to produce new quantitative measures. Furthermore, recent capabilities for
continuous collection of passive sensor streams, such as phone GPS or
smartwatch accelerometer, open avenues of novel questioning that were
previously entirely unrealistic. Their temporally dense nature enables a
cohesive study of real-time neural and behavioral signals.
To develop comprehensive neurobiological models of psychiatric disease, it
will be critical to first develop strong methods for behavioral quantification.
There is huge potential in what can theoretically be captured by current
technologies, but this in itself presents a large computational challenge --
one that will necessitate new data processing tools, new machine learning
techniques, and ultimately a shift in how interdisciplinary work is conducted.
In my thesis, I detail research projects that take different perspectives on
digital psychiatry, subsequently tying ideas together with a concluding
discussion on the future of the field. I also provide software infrastructure
where relevant, with extensive documentation.
Major contributions include scientific arguments and proof of concept results
for daily free-form audio journals as an underappreciated psychiatry research
datatype, as well as novel stability theorems and pilot empirical success for a
proposed multi-area recurrent neural network architecture.
Comment: PhD thesis
Contributions to time series analysis, modelling and forecasting to increase reliability in industrial environments.
356 p. The integration of the Internet of Things into the industrial sector is key to achieving business intelligence. This study focuses on improving or proposing new approaches to increase the reliability of AI solutions based on time-series data in industry. Three phases are addressed: improving data quality, models, and errors. A standard definition of quality metrics is proposed and included in the dqts R package. The steps of time-series modelling are explored, from feature extraction to the choice and application of the most efficient forecasting model. The KNPTS method, based on searching for patterns in the historical data, is presented as an R package for estimating future data. In addition, the use of elastic similarity measures to evaluate regression models is suggested, as is the importance of appropriate metrics in problems with imbalanced classes. The contributions were validated in industrial use cases from different fields: product quality, electricity consumption forecasting, porosity detection, and machine diagnostics.
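KNPTS itself is an R package whose exact algorithm is not described in this abstract; the general idea of forecasting by searching for similar patterns in the historical data can be sketched in Python roughly as follows (function name and parameters are my own):

```python
def knn_forecast(series, window, horizon, k=3):
    """Forecast by pattern matching, in the spirit of k-nearest-neighbour
    time-series methods: find the k historic windows closest (in
    Euclidean distance) to the most recent one, and average the values
    that followed each of them."""
    query = series[-window:]
    candidates = []
    for i in range(len(series) - window - horizon + 1):
        w = series[i:i + window]
        dist = sum((a - b) ** 2 for a, b in zip(w, query)) ** 0.5
        candidates.append((dist, series[i + window:i + window + horizon]))
    candidates.sort(key=lambda c: c[0])          # nearest patterns first
    best = [c[1] for c in candidates[:k]]
    return [sum(vals) / len(vals) for vals in zip(*best)]

# a noiseless periodic series: the forecast continues the pattern
series = [0, 1, 2, 3] * 6
print(knn_forecast(series, window=4, horizon=2, k=1))  # -> [0.0, 1.0]
```

The choice of distance matters here, which is where the elastic similarity measures mentioned above (e.g. alignment-based distances rather than plain Euclidean) come into play.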
2015 GREAT Day Program
SUNY Geneseo’s Ninth Annual GREAT Day.
https://knightscholar.geneseo.edu/program-2007/1009/thumbnail.jp
Algorithms for sparse convolution and sublinear edit distance
In this PhD thesis on fine-grained algorithm design and complexity, we investigate output-sensitive and sublinear-time algorithms for two important problems. (1) Sparse Convolution: Computing the convolution of two vectors is a basic algorithmic primitive with applications across all of Computer Science and Engineering. In the sparse convolution problem we assume that the input and output vectors have at most t nonzero entries, and the goal is to design algorithms with running times dependent on t. For the special case where all entries are nonnegative, which is particularly important for algorithm design, it has been known for twenty years that sparse convolutions can be computed in near-linear randomized time O(t log^2 n). In this thesis we develop a randomized algorithm with running time O(t log t), which is optimal under some mild assumptions, as well as the first near-linear deterministic algorithm for sparse nonnegative convolution. We also present an application of these results, leading to seemingly unrelated fine-grained lower bounds against distance oracles in graphs. (2) Sublinear Edit Distance: The edit distance of two strings is a well-studied similarity measure with numerous applications in computational biology. While computing the edit distance exactly provably requires quadratic time, a long line of research has led to a constant-factor approximation algorithm in almost-linear time. Perhaps surprisingly, it is also possible to approximate the edit distance k within a large factor O(k) in sublinear time O~(n/k + poly(k)). We drastically improve the approximation factor of the known sublinear algorithms from O(k) to k^{o(1)} while preserving the O~(n/k + poly(k)) running time.
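The thesis's O(t log t) algorithm is well beyond a sketch, but the problem itself, together with the simplest output-sensitive baseline, whose O(t_a * t_b) running time is independent of the vector length n, is easy to state in Python:

```python
from collections import defaultdict

def sparse_convolution(a, b):
    """Convolve two sparse vectors given as {index: value} dicts.

    Runs in O(t_a * t_b) time where t_a, t_b are the numbers of
    nonzero entries -- the naive output-sensitive baseline that the
    near-linear O(t log t) algorithms improve upon.
    """
    out = defaultdict(int)
    for i, x in a.items():
        for j, y in b.items():
            out[i + j] += x * y              # coefficient of index i + j
    return {k: v for k, v in out.items() if v != 0}

# (1 + x^1000) * (2 + 3*x^1000) = 2 + 5*x^1000 + 3*x^2000
print(sparse_convolution({0: 1, 1000: 1}, {0: 2, 1000: 3}))
# -> {0: 2, 1000: 5, 2000: 3}
```

Note the dense FFT approach would spend O(n log n) time on vectors of length 2001 here, while the sparse formulation touches only the handful of nonzero entries.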