5 research outputs found

An approximate search engine for structure

Author: Shan Huiyuan
Publication venue: Digital Commons @ NJIT
Publication date: 31/05/2004

As structural databases grow, so does the need to search them efficiently. Thanks to previous and ongoing research, searching by attribute-value and by text has become commonplace in these databases. However, searching by topological or physical structure, especially in large databases and especially for approximate matches, is still an art. In this dissertation, efficient search techniques are presented for retrieving trees from a database that are similar to a given query tree. Rooted ordered labeled trees, rooted unordered labeled trees and free trees are considered. Ordered labeled trees are trees in which each node has a label and the left-to-right order among siblings matters. Unordered labeled trees are trees in which the parent-child relationship is significant, but the order among siblings is unimportant. Free trees (unrooted unordered trees) are acyclic graphs. These trees find many applications in bioinformatics, Web log analysis, phyloinformatics, XML processing, etc. Two types of similarity measures are investigated: (i) counting the mismatching paths in the query tree and a data tree, and (ii) measuring the topological relationship between the trees. The proposed approaches include storing the paths of trees in a suffix array, employing hashing techniques to speed up retrieval, and counting the number of up-down operations needed to move a token from one node to another node in a tree. Various filters for accelerating a search, different strategies for parallelizing these search algorithms, and applications of these algorithms to XML and phylogenetic data management are discussed. The proposed techniques have been implemented in a phylogenetic search engine which is fully operational and is available on the World Wide Web. Experimental results comparing the similarity measures with existing tree metrics and evaluating the efficiency of the search techniques demonstrate the effectiveness of the search engine.
Future work includes extending the techniques to other structural data, as well as developing new filters and algorithms for speeding up searching and mining in complex structures.
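
The abstract mentions counting the up-down operations needed to move a token from one node to another in a tree. A minimal sketch of one plausible reading follows: the token moves up from the source to the lowest common ancestor, then down to the target. The parent-map representation and the names `path_to_root` and `up_down` are assumptions made for illustration, not the dissertation's implementation.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical representation: each node label maps to its parent's label;
# the root maps to None.
Parent = Dict[str, Optional[str]]


def path_to_root(parent: Parent, node: str) -> List[str]:
    """Return the nodes visited when walking from `node` up to the root."""
    path: List[str] = []
    current: Optional[str] = node
    while current is not None:
        path.append(current)
        current = parent[current]
    return path


def up_down(parent: Parent, src: str, dst: str) -> Tuple[int, int]:
    """Count (up moves, down moves) to carry a token from `src` to `dst`.

    The token climbs from `src` until it reaches the first node that is
    also an ancestor of `dst` (or `dst` itself), then descends to `dst`.
    """
    # Map every ancestor of dst (including dst) to its distance from dst.
    dst_depth = {n: i for i, n in enumerate(path_to_root(parent, dst))}
    for ups, node in enumerate(path_to_root(parent, src)):
        if node in dst_depth:
            return ups, dst_depth[node]
    raise ValueError("nodes do not belong to the same rooted tree")


# Example tree (assumed for illustration): r has children a and b;
# a has children c and d; b has child e.
example: Parent = {"r": None, "a": "r", "b": "r", "c": "a", "d": "a", "e": "b"}
print(up_down(example, "c", "d"))  # siblings
print(up_down(example, "c", "e"))  # cousins, meeting at the root
```

Comparing such (up, down) pairs for the same pair of labels in a query tree and a data tree is one way a topological-relationship measure could be built; the abstract does not spell out the exact scoring, so this sketch covers only the counting step.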

Digital Commons @ New Jersey Institute of Technology (NJIT)

PhyloFinder: An intelligent search engine for phylogenetic tree databases

Author: Bansal Mukul S
Burleigh J Gordon
Chen Duhong
Fernández-Baca David
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Bioinformatic tools are needed to store and access the rapidly growing phylogenetic data. These tools should enable users to identify existing phylogenetic trees containing a specified taxon or set of taxa and to compare a specified phylogenetic hypothesis to existing phylogenetic trees. Results: PhyloFinder is an intelligent search engine for phylogenetic databases that we have implemented using trees from TreeBASE. It enables taxonomic queries, in which it identifies trees in the database containing the exact name of the query taxon and/or any synonymous taxon names, and it provides spelling suggestions for the query when there is no match. Additionally, PhyloFinder can identify trees containing descendants or direct ancestors of the query taxon. PhyloFinder also performs phylogenetic queries, in which it identifies trees that contain the query tree or topologies that are similar to the query tree. Conclusion: PhyloFinder can enhance the utility of any tree database by providing tools for both taxonomic and phylogenetic queries as well as visualization tools that highlight the query results and provide links to NCBI and TBMap. An implementation of PhyloFinder using trees from TreeBASE is available from the web client application found in the availability and requirements section.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Insights into the evolution of enzyme substrate promiscuity after the discovery of (βα)8 isomerase evolutionary intermediates from a diverse metagenome

Crossref

A Structure-Based Search Engine for Phylogenetic Databases

Author: Dennis Shasha
Huiyuan Shan
Jason T. L. Wang
Katherine G. Herbert
William H. Piel
Publication venue: Society Press
Publication date: 01/01/2002

Phylogenetic trees are essential for understanding the relationships among organisms or taxa. Many of the current techniques for searching phylogenetic repositories allow the user to perform a keyword-type search or an aligned sequence data search, or to browse a hierarchical list of taxa. Here we describe a new search engine that allows the user to present an example phylogeny, or a query tree, and then searches a phylogenetic database for trees that (approximately) contain the query structure.

CiteSeerX

Montclair State University Digital Commons

A structure-based search engine for phylogenetic databases

Author: Huiyuan Shan
Katherine G. Herbert
Publication venue: Society Press
Publication date

CiteSeerX