Taming Horizontal Instability in Merge Trees: On the Computation of a Comprehensive Deformation-based Edit Distance
Comparative analysis of scalar fields in scientific visualization often
involves distance functions on topological abstractions. This paper focuses on
the merge tree abstraction (representing the nesting of sub- or superlevel
sets) and proposes the application of the unconstrained deformation-based edit
distance. Previous approaches to merge trees often suffer from instability:
small perturbations in the data can lead to large distances between the
abstractions. While some existing methods can handle so-called vertical
instability, the unconstrained deformation-based edit distance addresses both
vertical and horizontal instabilities, also called saddle swaps. We establish
the computational complexity as NP-complete, and provide an integer linear
program formulation for computation. Experimental results on the TOSCA shape
matching ensemble provide evidence for the stability of the proposed distance.
We thereby showcase the potential of handling saddle swaps for comparison of
scalar fields through merge trees.
An approximate search engine for structure
As the size of structural databases grows, the need for efficiently searching these databases arises. Thanks to previous and ongoing research, searching by attribute-value and by text has become commonplace in these databases. However, searching by topological or physical structure, especially for large databases and especially for approximate matches, is still an art.
In this dissertation, efficient search techniques are presented for retrieving trees from a database that are similar to a given query tree. Rooted ordered labeled trees, rooted unordered labeled trees and free trees are considered. Ordered labeled trees are trees in which each node has a label and the left-to-right order among siblings matters. Unordered labeled trees are trees in which the parent-child relationship is significant, but the order among siblings is unimportant. Free trees (unrooted unordered trees) are connected acyclic graphs. These trees find many applications in bioinformatics, Web log analysis, phyloinformatics, XML processing, etc.
Two types of similarity measures are investigated: (i) counting the mismatching paths in the query tree and a data tree, and (ii) measuring the topological relationship between the trees. The proposed approaches include storing the paths of trees in a suffix array, employing hashing techniques to speed up retrieval, and counting the number of up-down operations to move a token from one node to another node in a tree. Various filters for accelerating a search, different strategies for parallelizing these search algorithms and applications of these algorithms to XML and phylogenetic data management are discussed.
The proposed techniques have been implemented in a phylogenetic search engine which is fully operational and is available on the World Wide Web. Experimental results on comparing the similarity measures with existing tree metrics and on evaluating the efficiency of the search techniques demonstrate the effectiveness of the search engine. Future work includes extending the techniques to other structural data, as well as developing new filters and algorithms for speeding up searching and mining in complex structures.
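The abstract above mentions counting the up-down operations needed to move a token from one node to another in a tree, but does not define the measure. The following is a minimal sketch under one plausible reading: the token climbs from the start node to the lowest common ancestor and then descends to the target. The function name `up_down` and the `parent` dictionary layout are hypothetical and are not taken from the dissertation.

```python
def up_down(parent, u, v):
    """Return (ups, downs) for moving a token from u to v in a rooted tree.

    `parent` maps each node to its parent; the root maps to None.
    This layout is an assumption made for illustration.
    """
    # Record how many upward steps it takes to reach each ancestor of u
    # (u itself is reached in 0 steps).
    up_steps = {}
    node, steps = u, 0
    while node is not None:
        up_steps[node] = steps
        node = parent.get(node)
        steps += 1

    # Climb from v until an ancestor of u is met; that node is the lowest
    # common ancestor, and the climb length equals the downward steps.
    node, downs = v, 0
    while node not in up_steps:
        node = parent[node]
        downs += 1
    return up_steps[node], downs


# A small example tree: root -> x, y; x -> a, b; y -> c
example_parent = {
    "root": None,
    "x": "root", "y": "root",
    "a": "x", "b": "x",
    "c": "y",
}
```

With this tree, `up_down(example_parent, "a", "b")` counts one step up to `x` and one step down to `b`. How such counts are aggregated into a similarity measure between two trees is not specified in the abstract and is left open here.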
Metaheuristic approaches for the quartet method of hierarchical clustering
Given a set of objects and their pairwise distances, we wish to determine a visual representation of the data. We use the quartet paradigm to compute a hierarchy of clusters of the objects. The method is based on an NP-hard graph optimization problem called the Minimum Quartet Tree Cost problem. This paper presents and compares several metaheuristic approaches to approximate the optimal hierarchy. The performance of the algorithms is tested through extensive computational experiments, and it is shown that the Reduced Variable Neighbourhood Search metaheuristic is the most effective approach to the problem, obtaining high-quality solutions in short running times.
Fixed-Parameter Algorithms for Computing Kemeny Scores - Theory and Practice
The central problem in this work is to compute a ranking of a set of elements
which is "closest to" a given set of input rankings of the elements. We define
"closest to" in an established way as having the minimum sum of Kendall-Tau
distances to each input ranking. Unfortunately, the resulting problem Kemeny
consensus is NP-hard for instances with n input rankings, n being an even
integer greater than three. Nevertheless this problem plays a central role in
many rank aggregation problems. It was shown that one can compute the
corresponding Kemeny consensus list in f(k) + poly(n) time, where f(k) is a
computable function in one of the parameters "score of the consensus", "maximum
distance between two input rankings", "number of candidates" and "average
pairwise Kendall-Tau distance", and poly(n) is a polynomial in the input size. This
work will demonstrate the practical usefulness of the corresponding algorithms
by applying them to randomly generated data and several real-world data sets. Thus, we
show that these fixed-parameter algorithms are not only of theoretical
interest. In a more theoretical part of this work we will develop an improved
fixed-parameter algorithm for the parameter "score of the consensus" having a
better upper bound for the running time than previous algorithms. Comment: Studienarbei
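The definition of "closest to" in the abstract above (minimum sum of Kendall-Tau distances to the input rankings) can be illustrated with a short sketch. It is a brute-force search over all permutations, usable only for a handful of candidates, and is not one of the fixed-parameter algorithms the work describes. The function names are hypothetical, and rankings are assumed to be tuples listing the same candidates from most to least preferred.

```python
from itertools import combinations, permutations


def kendall_tau(r1, r2):
    """Count the candidate pairs that r1 and r2 place in opposite order."""
    pos1 = {c: i for i, c in enumerate(r1)}
    pos2 = {c: i for i, c in enumerate(r2)}
    return sum(
        1
        for a, b in combinations(r1, 2)
        if (pos1[a] - pos1[b]) * (pos2[a] - pos2[b]) < 0
    )


def kemeny_score(ranking, votes):
    """Sum of Kendall-Tau distances from `ranking` to every input ranking."""
    return sum(kendall_tau(ranking, v) for v in votes)


def kemeny_consensus_bruteforce(votes):
    """Try every permutation of the candidates; exponential in their number."""
    candidates = sorted(votes[0])
    return min(permutations(candidates), key=lambda r: kemeny_score(r, votes))


example_votes = [("a", "b", "c"), ("a", "c", "b"), ("b", "a", "c")]
consensus = kemeny_consensus_bruteforce(example_votes)
```

For `example_votes`, the ranking `("a", "b", "c")` disagrees with the second and third votes on one pair each, giving a Kemeny score of 2, which no other permutation improves on.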