8,531 research outputs found

    An approximate search engine for structure

    As the size of structural databases grows, the need for efficiently searching these databases arises. Thanks to previous and ongoing research, searching by attribute-value and by text has become commonplace in these databases. However, searching by topological or physical structure, especially for large databases and especially for approximate matches, is still an art. In this dissertation, efficient search techniques are presented for retrieving trees from a database that are similar to a given query tree. Rooted ordered labeled trees, rooted unordered labeled trees and free trees are considered. Ordered labeled trees are trees in which each node has a label and the left-to-right order among siblings matters. Unordered labeled trees are trees in which the parent-child relationship is significant, but the order among siblings is unimportant. Free trees (unrooted unordered trees) are acyclic graphs. These trees find many applications in bioinformatics, Web log analysis, phyloinformatics, XML processing, etc. Two types of similarity measures are investigated: (i) counting the mismatching paths in the query tree and a data tree, and (ii) measuring the topological relationship between the trees. The proposed approaches include storing the paths of trees in a suffix array, employing hashing techniques to speed up retrieval, and counting the number of up-down operations to move a token from one node to another node in a tree. Various filters for accelerating a search, different strategies for parallelizing these search algorithms and applications of these algorithms to XML and phylogenetic data management are discussed. The proposed techniques have been implemented into a phylogenetic search engine which is fully operational and is available on the World Wide Web. Experimental results on comparing the similarity measures with existing tree metrics and on evaluating the efficiency of the search techniques demonstrate the effectiveness of the search engine. 
Future work includes extending the techniques to other structural data, as well as developing new filters and algorithms for speeding up searching and mining in complex structures.
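The abstract mentions counting the up-down operations needed to move a token from one node to another. A minimal sketch of one reading of that idea follows. It assumes a token climbs from the first node to the lowest common ancestor and then descends to the second node; the `parent` dictionary representation and the function name `up_down` are hypothetical and not taken from the dissertation.

```python
def up_down(parent, u, v):
    """Return (ups, downs) needed to move a token from node u to node v.

    `parent` maps each non-root node to its parent (assumed representation).
    The token climbs `ups` edges to the lowest common ancestor of u and v,
    then descends `downs` edges to reach v.
    """
    # Record how many upward steps reach each ancestor of u (u itself is 0).
    steps_up = {}
    node, steps = u, 0
    while node is not None:
        steps_up[node] = steps
        node = parent.get(node)
        steps += 1

    # Climb from v until an ancestor of u is found; that climb, reversed,
    # is the downward part of the token's path.
    node, downs = v, 0
    while node not in steps_up:
        node = parent.get(node)
        if node is None:
            raise ValueError("nodes are not in the same tree")
        downs += 1
    return steps_up[node], downs


# Example tree: a is the root, b and c are its children, d and e are under b.
example_parent = {"b": "a", "c": "a", "d": "b", "e": "b"}
print(up_down(example_parent, "d", "c"))  # (2, 1)
```

Under this reading, two trees could be compared by tabulating the pair of counts for every pair of labeled nodes, though how the dissertation aggregates them is not stated in the abstract.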

    Boyer-Moore strategy to efficient approximate string matching

    We propose a simple but efficient algorithm for searching all occurrences of a pattern or a class of patterns (length m) in a text (length n) with at most k mismatches. This algorithm relies on the Shift-Add algorithm of Baeza-Yates and Gonnet [6], which represents the current state of the search as a bit number and uses the ability of programming languages to handle bit words. The state representation should therefore not exceed the word size w, that is, m(⌈log2(k+1)⌉+1) ≤ w. The algorithm consists of a preprocessing step and a searching step. It is linear and performs 3n operations during the searching step. Notions of shift and character skip found in the Boyer-Moore (BM) [9] approach are introduced in this algorithm. Provided that the considered alphabet is large enough (compared to the pattern length), the average number of operations performed by our algorithm during the searching step becomes n(2+(k+4)/(m-k)).
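To make the state representation concrete, here is a minimal sketch of the basic Shift-Add search that the abstract builds on. It does not include the Boyer-Moore style shifts the paper adds. Each pattern position gets a field of b = ⌈log2(k+1)⌉+1 bits holding a mismatch count, with the top bit of each field used as an overflow flag. The function name and variable names are illustrative. Python integers are unbounded, so the constraint m·b ≤ w is not enforced here; on a 64-bit word with k = 2, b = 3 and the pattern could have at most 21 characters.

```python
def shift_add_search(text, pattern, k):
    """Return start indices where pattern matches text with at most k mismatches."""
    m = len(pattern)
    if m == 0 or k < 0:
        raise ValueError("pattern must be non-empty and k non-negative")

    # k.bit_length() equals ceil(log2(k + 1)); one extra bit flags overflow.
    b = k.bit_length() + 1
    ones = sum(1 << (b * j) for j in range(m))   # value 1 in every field
    overflow_mask = ones << (b - 1)              # top bit of every field
    full_mask = (1 << (b * m)) - 1               # keeps exactly m fields
    limit = (k + 1) << (b * (m - 1))             # last field must stay <= k

    # Preprocessing: table[c] has a 1 in field j when pattern[j] != c.
    table = {
        c: sum(1 << (b * j) for j in range(m) if pattern[j] != c)
        for c in set(pattern)
    }

    state = overflow = 0
    hits = []
    for i, c in enumerate(text):
        # Shift every count to the next pattern position, then add mismatches.
        state = ((state << b) + table.get(c, ones)) & full_mask
        # Remember fields that overflowed, then clear the flags in the state
        # so the next addition cannot carry into a neighbouring field.
        overflow = ((overflow << b) | (state & overflow_mask)) & full_mask
        state &= ~overflow_mask
        if i >= m - 1 and state + overflow < limit:
            hits.append(i - m + 1)
    return hits


print(shift_add_search("abcdabxd", "abcd", 1))  # [0, 4]
```

The loop body uses a fixed handful of word operations per text character, which is the property that makes the searching step linear in n.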

    Custom Integrated Circuits

    Contains reports on twelve research projects. Sponsors: Analog Devices, Inc.; International Business Machines, Inc.; Joint Services Electronics Program (Contract DAAL03-86-K-0002); Joint Services Electronics Program (Contract DAAL03-89-C-0001); U.S. Air Force - Office of Scientific Research (Grant AFOSR 86-0164); Rockwell International Corporation; OKI Semiconductor, Inc.; U.S. Navy - Office of Naval Research (Contract N00014-81-K-0742); Charles Stark Draper Laboratory; National Science Foundation (Grant MIP 84-07285); National Science Foundation (Grant MIP 87-14969); Battelle Laboratories; National Science Foundation (Grant MIP 88-14612); DuPont Corporation; Defense Advanced Research Projects Agency/U.S. Navy - Office of Naval Research (Contract N00014-87-K-0825); American Telephone and Telegraph; Digital Equipment Corporation; National Science Foundation (Grant MIP-88-58764)

    Pattern discovery in trees : algorithms and applications to document and scientific data management

    Ordered, labeled trees are trees in which each node has a label and the left-to-right order of its children (if it has any) is fixed. Such trees have many applications in vision, pattern recognition, molecular biology and natural language processing. In this dissertation we present algorithms for finding patterns in the ordered labeled trees. Specifically we study the largest approximately common substructure (LACS) problem for such trees. We consider a substructure of a tree T to be a connected subgraph of T. Given two trees T1, T2 and an integer d, the LACS problem is to find a substructure U1 of T1 and a substructure U2 of T2 such that U1 is within distance d of U2 and where there does not exist any other substructure V1 of T1 and V2 of T2 such that V1 and V2 satisfy the distance constraint and the sum of the sizes of V1 and V2 is greater than the sum of the sizes of U1 and U2. The LACS problem is motivated by the studies of document and RNA comparison. We consider two types of distance measures: the general edit distance and a restricted edit distance originated from Selkow. We present dynamic programming algorithms to solve the LACS problem based on the two distance measures. The algorithms run as fast as the best known algorithms for computing the distance of two trees when the distance allowed in the common substructures is a constant independent of the input trees. To demonstrate the utility of our algorithms, we discuss their applications to discovering motifs in multiple RNA secondary structures. Such an application shows an example of scientific data mining. We represent an RNA secondary structure by an ordered labeled tree based on a previously proposed scheme. The patterns in the trees are substructures that can differ in both substitutions and deletions/insertions of nodes of the trees. Our techniques incorporate approximate tree matching algorithms and novel heuristics for discovery and optimization. 
Experimental results obtained by running these algorithms on both generated data and RNA secondary structures show the good performance of the algorithms. It is shown that the optimization heuristics speed up the discovery algorithm by a factor of 10. Moreover, our optimized approach is 100,000 times faster than the brute force method. Finally, we implement our techniques in a graphical toolbox that enables users to find repeated substructures in an RNA secondary structure as well as frequently occurring patterns in multiple RNA secondary structures pertaining to rhinovirus obtained from the National Cancer Institute. The system is implemented in the C programming language and X Windows and is fully operational on SUN workstations.
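The abstract names a restricted edit distance originating from Selkow. As an illustration only, the sketch below follows the common textbook formulation of that distance: relabeling a node costs 1, and nodes may be inserted or deleted only as whole subtrees, at a cost equal to the subtree size. The tuple representation `(label, [children])`, the unit costs, and the function names are assumptions; the dissertation's own cost model and its LACS algorithms are not reproduced here.

```python
def size(tree):
    """Number of nodes in a tree given as (label, [children])."""
    _, children = tree
    return 1 + sum(size(child) for child in children)


def selkow_distance(t1, t2):
    """Top-down edit distance between two ordered labeled trees.

    Roots are always mapped to each other; the child sequences are aligned
    with a string-edit style dynamic program in which deleting or inserting
    a child costs the size of its whole subtree.
    """
    (label1, kids1), (label2, kids2) = t1, t2
    n, m = len(kids1), len(kids2)

    # dist[i][j]: cost of turning the first i children of t1
    # into the first j children of t2.
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = dist[i - 1][0] + size(kids1[i - 1])
    for j in range(1, m + 1):
        dist[0][j] = dist[0][j - 1] + size(kids2[j - 1])
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dist[i][j] = min(
                dist[i - 1][j] + size(kids1[i - 1]),        # delete subtree
                dist[i][j - 1] + size(kids2[j - 1]),        # insert subtree
                dist[i - 1][j - 1]
                + selkow_distance(kids1[i - 1], kids2[j - 1]),  # match children
            )
    return (label1 != label2) + dist[n][m]


left = ("a", [("b", [("d", [])])])
right = ("a", [("c", [])])
print(selkow_distance(left, right))  # 2: relabel b to c, delete leaf d
```

This plain recursion recomputes subproblems and is meant to show the recurrence, not to match the running times claimed in the abstract.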

    Plagiarism Detection Using the Rabin-Karp Algorithm with the Rolling Hash Method (Pendeteksian Plagiarisme Menggunakan Algoritma Rabin-Karp dengan Metode Rolling Hash)

    Plagiarism is the act of misusing, stealing or appropriating, publishing, or claiming as one's own a thought, idea, piece of writing, or creation that actually belongs to someone else. Plagiarism, commonly called copying, is a significant problem among academics in higher education. Plagiarism of digital content is usually carried out by copy-pasting, quoting, and revising the original document. To counter it, a method is needed that can analyze the plagiarism techniques used. Several approaches are available, one of which is the Rabin-Karp algorithm with the rolling hash method. Plagiarism detection using the Rabin-Karp algorithm with the rolling hash method is implemented in a program or application that reports an accuracy level as a percentage.
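A minimal sketch of how a Rabin-Karp rolling hash can yield a percentage score for two texts is given below. The k-gram length, the base, the modulus, and the use of the Dice coefficient over the two hash sets are all assumptions chosen for illustration; the abstract does not say which parameters or similarity formula the authors use.

```python
def kgram_hashes(text, k, base=256, mod=1_000_003):
    """Return the set of rolling-hash values of every k-character window."""
    if k <= 0 or len(text) < k:
        return set()

    # Weight of the leftmost character in a window: base^(k-1) mod mod.
    high = pow(base, k - 1, mod)

    # Hash the first window directly.
    h = 0
    for ch in text[:k]:
        h = (h * base + ord(ch)) % mod
    hashes = {h}

    # Roll the window: drop the leftmost character, append the next one.
    for i in range(k, len(text)):
        h = ((h - ord(text[i - k]) * high) * base + ord(text[i])) % mod
        hashes.add(h)
    return hashes


def similarity_percent(doc_a, doc_b, k=5):
    """Dice coefficient of the two k-gram hash sets, as a percentage."""
    hashes_a = kgram_hashes(doc_a, k)
    hashes_b = kgram_hashes(doc_b, k)
    if not hashes_a or not hashes_b:
        return 0.0
    shared = len(hashes_a & hashes_b)
    return 200.0 * shared / (len(hashes_a) + len(hashes_b))


print(similarity_percent("the quick brown fox", "the quick brown fox"))  # 100.0
```

Because different windows can collide on the same hash value, a production detector would normally confirm shared hashes by comparing the underlying substrings; that check is omitted here for brevity.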

    Efficient alternative wiring techniques and applications.

    Sze, Chin Ngai. Thesis (M.Phil.)--Chinese University of Hong Kong, 2001. Includes bibliographical references (leaves 80-84) and index. Abstracts in English and Chinese.

    Abstract
    Acknowledgments
    Curriculum Vitae
    List of Figures
    List of Tables
    Chapter 1. Introduction
        1.1 Motivation and Aims
        1.2 Contribution
        1.3 Organization of Dissertation
    Chapter 2. Definitions and Notations
    Chapter 3. Literature Review
        3.1 Logic Reconstruction
        3.1.1 SIS: A System for Sequential and Combinational Logic Synthesis
        3.2 ATPG-based Alternative Wiring
        3.2.1 Redundancy Addition and Removal for Logic Optimization
        3.2.2 Perturb and Simplify Logic Optimization
        3.2.3 REWIRE
        3.2.4 Implication-tree Based Alternative Wiring Logic Transformation
        3.3 Graph-based Alternative Wiring
    Chapter 4. Implication Based Alternative Wiring Logic Transformation
        4.1 Source Node Implication
        4.1.1 Introduction
        4.1.2 Implication Relationship and Implication-tree
        4.1.3 Selection of Alternative Wire Based on Implication-tree
        4.1.4 Implication-tree Based Logic Transformation
        4.2 Destination Node Implication
        4.2.1 Introduction
        4.2.2 Destination Node Relationship
        4.2.3 Destination Node Implication-tree
        4.2.4 Selection of Alternative Wire
        4.3 The Algorithm
        4.3.1 IBAW Implementation
        4.3.2 Experimental Results
        4.4 Conclusion
    Chapter 5. Graph Based Alternative Wiring Logic Transformation
        5.1 Introduction
        5.2 Notations and Definitions
        5.3 Alternative Wire Patterns
        5.4 Construction of Minimal Patterns
        5.4.1 Minimality of Patterns
        5.4.2 Minimal Pattern Formation
        5.4.3 Pattern Extraction
        5.5 Experimental Results
        5.6 Conclusion
    Chapter 6. Logic Optimization by GBAW
        6.1 Introduction
        6.2 Logic Simplification
        6.2.1 Single-Addition-Multiple-Removal by Pattern Feature
        6.2.2 Single-Addition-Multiple-Removal by Combination of Patterns
        6.2.3 Single-Addition-Single-Removal
        6.3 Incremental Perturbation Heuristic
        6.4 GBAW Optimization Algorithm
        6.5 Experimental Results
        6.6 Conclusion
    Chapter 7. Conclusion
    Bibliography
    Chapter A. VLSI Design Cycle
    Chapter B. Alternative Wire Patterns in [WLFOO]
        B.1 0-local Pattern
        B.2 1-local Pattern
        B.3 2-local Pattern
        B.4 Fanout-reconvergent Pattern
    Chapter C. New Alternative Wire Patterns
        C.1 Pattern Cluster C1
        C.1.1 NAND-NAND-AND/NAND;AND/NAND
        C.1.2 NOR-NOR-OR/NOR;AND/NAND
        C.1.3 AND-NOR-OR/NOR;OR/NOR
        C.1.4 OR-NAND-AND/NAND;AND/NAND
        C.2 Pattern Cluster C2
        C.3 Pattern Cluster C3
        C.4 Pattern Cluster C4
        C.5 Pattern Cluster C5
    Glossary
    Index

    Dynamic alignment for real-time CAD-driven PCB inspection
