Search CORE

8,531 research outputs found

An approximate search engine for structure

Author: Shan Huiyuan
Publication venue: Digital Commons @ NJIT
Publication date: 31/05/2004
Field of study

As the size of structural databases grows, the need for efficiently searching these databases arises. Thanks to previous and ongoing research, searching by attribute-value and by text has become commonplace in these databases. However, searching by topological or physical structure, especially for large databases and especially for approximate matches, is still an art. In this dissertation, efficient search techniques are presented for retrieving trees from a database that are similar to a given query tree. Rooted ordered labeled trees, rooted unordered labeled trees and free trees are considered. Ordered labeled trees are trees in which each node has a label and the left-to-right order among siblings matters. Unordered labeled trees are trees in which the parent-child relationship is significant, but the order among siblings is unimportant. Free trees (unrooted unordered trees) are acyclic graphs. These trees find many applications in bioinformatics, Web log analysis, phyloinformatics, XML processing, etc. Two types of similarity measures are investigated: (i) counting the mismatching paths in the query tree and a data tree, and (ii) measuring the topological relationship between the trees. The proposed approaches include storing the paths of trees in a suffix array, employing hashing techniques to speed up retrieval, and counting the number of up-down operations to move a token from one node to another node in a tree. Various filters for accelerating a search, different strategies for parallelizing these search algorithms and applications of these algorithms to XML and phylogenetic data management are discussed. The proposed techniques have been implemented into a phylogenetic search engine which is fully operational and is available on the World Wide Web. Experimental results on comparing the similarity measures with existing tree metrics and on evaluating the efficiency of the search techniques demonstrate the effectiveness of the search engine. Future work includes extending the techniques to other structural data, as well as developing new filters and algorithms for speeding up searching and mining in complex structures

Digital Commons @ New Jersey Institute of Technology (NJIT)

Boyer-Moore strategy to efficient approximate string matching

Author: Crochemore Maxime
El Mabrouk Nadia
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/1996
Field of study

International audienceWe propose a simple but e cient algorithm for searching all occurrences of a pattern or a class of patterns (length m) in a text (length n) with at most k mismatches. This algorithm relies on the Shift-Add algorithm of Baeza-Yates and Gonnet [6], which involves representing by a bit number the current state of the search and uses the ability of programming languages to handle bit words. State representation should not, therefore, exceeds the word size w, that is, m(⌈log2(k+1)⌉+1 )≤w. This algorithm consists in a preprocessing step and a searching step. It is linear and performs 3n operations during the searching step. Notions of shift and character skip found in the Boyer-Moore (BM) [9] approach, are introduced in this algorithm. Provided that the considered alphabet is large enough (compared to the Pattern length), the average number of operations performed by our algorithm during the searching step becomes n(2+(k+4)/(m-k))

CiteSeerX

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

Custom Integrated Circuits

Author: Allen Jonathan
Antoniadis Dimitri A.
Armstrong Robert C.
Baltus Donald G.
Bamji Cyrus S.
Devadas Srinivas
Elfadel Ibrahim M.
Feder Meir
Fogg Dennis C. Y.
Hakkarainen Mikko
Horn Berthold K. P.
Keast Craig L.
Lee Hae-Seung
Lloyd Jennifer A.
Lumsdaine Andrew
McCormick Lynne M.
McCormick Steven P.
McQuirk Ignacio S.
Miyanaga Hiroshi
Musicus Bruce R.
Nabors Keith S.
Olsen James A.
Peterson Kevin
Poggio Tomaso
Prasanna G. N. Srinivasa
Reichelt Mark W.
Sikes Bennet
Silviera Luis M.
Sodini Charles G.
Song William S.
Standley David L.
Telichevsky Ricardo
Umminger Christopher B.
Van Aelten Filip J.
Weinstein Ehud
White Jacob K.
Wyatt John L., Jr.
Young Woodward
Publication venue: Research Laboratory of Electronics (RLE) at the Massachusetts Institute of Technology (MIT)
Publication date
Field of study

Contains reports on twelve research projects.Analog Devices, Inc.International Business Machines, Inc.Joint Services Electronics Program (Contract DAAL03-86-K-0002)Joint Services Electronics Program (Contract DAAL03-89-C-0001)U.S. Air Force - Office of Scientific Research (Grant AFOSR 86-0164)Rockwell International CorporationOKI Semiconductor, Inc.U.S. Navy - Office of Naval Research (Contract N00014-81-K-0742)Charles Stark Draper LaboratoryNational Science Foundation (Grant MIP 84-07285)National Science Foundation (Grant MIP 87-14969)Battelle LaboratoriesNational Science Foundation (Grant MIP 88-14612)DuPont CorporationDefense Advanced Research Projects Agency/U.S. Navy - Office of Naval Research (Contract N00014-87-K-0825)American Telephone and TelegraphDigital Equipment CorporationNational Science Foundation (Grant MIP-88-58764

DSpace@MIT

Pattern discovery in trees : algorithms and applications to document and scientific data management

Author: Chang Chia-Yo
Publication venue: Digital Commons @ NJIT
Publication date: 31/05/1999
Field of study

Ordered, labeled trees are trees in which each node has a label and the left-to-right order of its children (if it has any) is fixed. Such trees have many applications in vision, pattern recognition, molecular biology and natural language processing. In this dissertation we present algorithms for finding patterns in the ordered labeled trees. Specifically we study the largest approximately common substructure (LACS) problem for such trees. We consider a substructure of a tree T to be a connected subgraph of T. Given two trees T1, T2 and an integer d, the LACS problem is to find a substructure U1 of T1 and a substructure U2 of T2 such that U1 is within distance d of U2 and where there does not exist any other substructure V1 of T1 and V2 of T2 such that V1 and V2 satisfy the distance constraint and the sum of the sizes of V1 and V2 is greater than the sum of the sizes of U1 and U2. The LACS problem is motivated by the studies of document and RNA comparison. We consider two types of distance measures: the general edit distance and a restricted edit distance originated from Selkow. We present dynamic programming algorithms to solve the LACS problem based on the two distance measures. The algorithms run as fast as the best known algorithms for computing the distance of two trees when the distance allowed in the common substructures is a constant independent of the input trees. To demonstrate the utility of our algorithms, we discuss their applications to discovering motifs in multiple RNA secondary structures. Such an application shows an example of scientific data mining. We represent an RNA secondary structure by an ordered labeled tree based on a previously proposed scheme. The patterns in the trees are substructures that can differ in both substitutions and deletions/insertions of nodes of the trees. Our techniques incorporate approximate tree matching algorithms and novel heuristics for discovery and optimization. Experimental results obtained by running these algorithms on both generated data and RNA secondary structures show the good performance of the algorithms. It is shown that the optimization heuristics speed up the discovery algorithm by a factor of 10. Moreover, our optimized approach is 100,000 times faster than the brute force method. Finally we implement our techniques into a graphic toolbox that enables users to find repeated substructures in an RNA secondary structure as well as frequently occurring patterns in multiple RNA secondary structures pertaining to rhinovirus obtained from the National Cancer Institute. The system is implemented in C programming language and X windows and is fully operational on SUN workstations

Digital Commons @ New Jersey Institute of Technology (NJIT)

Recommended from our members

System and method for searching a data base using a content-searchable memory

Author: G. Jack Lipovski
Publication venue: United States Patent and Trademark Office
Publication date: 07/12/1992
Field of study

" A dynamic storage device requires periodic refresh and includes logical operation circuitry within the refresh circuitry. The individual storage positions of the storage device are periodically read by a refresh amplifier, and then a logical operation is performed on the refresh data before the data re applied to the write amplifier. That operation allows implementation of associative database searching by cyclically executing ""data compare"" and other logical operations within the refresh circuitry. A system of content searching may be implemented in any storage device, dynamic or not, in which a comparand may be matched with any of a plurality of subunits of a word, and a storage bit is used to identify any words in which a mismatch occurs. Upon recognizing a match, the device can be commanded (a) to output the word or a selected portion (which may be different than the matched portion), (b) to move a selected portion of the word to a different location in the word, or (c) to alter the bits of the word or a selected portion. Arithmetical operations may be implemented through such alterations after matching. Off-chip storage systems of use with such devices are also disclosed. "Board of Regents, University of Texas Syste

Texas ScholarWorks

Pendeteksian Plagiarisme Menggunakan Algoritma Rabin-Karp dengan Metode Rolling Hash

Author: Priambodo J. (Joko)
Publication venue: 'Universitas Pamulang'
Publication date: 01/01/2018
Field of study

Plagiarisme adalah tindakan penyalahgunaan, pencurian/perampasan, penerbitan, pernyataan, atau menyatakan sebagai milik sendiri sebuah pikiran, ide, tulisan, atau ciptaan yang sebenarnya milik orang lain. Plagiat atau biasa disebut penjiplakan adalah sebuah masalah yang cukup signifikan pada akademisi di perguruan tinggi. Hal plagiat yang biasanya dilakukan terhadap konten digital adalah melakukan copy-paste, quote, dan revisi terhadap dokumen asli. Untuk mengantisipasinya, dibutuhkan suatu cara yang dapat menganalisis teknik-teknik plagiat yang dilakukan. Ada beberapa pendekatan yang bisa diambil, salah satunya dengan menggunakan algoritma Rabin-Karp dengan metode Rolling Hash. Pendeteksian plagiarisme menggunakan algoritma Rabin-Karp dengan metode rolling hash ini diimplementasikan ke dalam program atau aplikasi untuk menentukan nilai tingkat akurasi dengan nilai presentase

Neliti

Online Journal Systems UNPAM (Universitas Pamulang)

Efficient alternative wiring techniques and applications.

Author
Publication venue
Publication date: 01/01/2001
Field of study

Sze, Chin Ngai.Thesis (M.Phil.)--Chinese University of Hong Kong, 2001.Includes bibliographical references (leaves 80-84) and index.Abstracts in English and Chinese.Abstract --- p.iAcknowledgments --- p.iiiCurriculum Vitae --- p.ivList of Figures --- p.ixList of Tables --- p.xiiChapter 1 --- Introduction --- p.1Chapter 1.1 --- Motivation and Aims --- p.1Chapter 1.2 --- Contribution --- p.8Chapter 1.3 --- Organization of Dissertation --- p.10Chapter 2 --- Definitions and Notations --- p.11Chapter 3 --- Literature Review --- p.15Chapter 3.1 --- Logic Reconstruction --- p.15Chapter 3.1.1 --- SIS: A System for Sequential and Combinational Logic Synthesis --- p.16Chapter 3.2 --- ATPG-based Alternative Wiring --- p.17Chapter 3.2.1 --- Redundancy Addition and Removal for Logic Optimization --- p.18Chapter 3.2.2 --- Perturb and Simplify Logic Optimization --- p.18Chapter 3.2.3 --- REWIRE --- p.21Chapter 3.2.4 --- Implication-tree Based Alternative Wiring Logic Trans- formation --- p.22Chapter 3.3 --- Graph-based Alternative Wiring --- p.24Chapter 4 --- Implication Based Alternative Wiring Logic Transformation --- p.25Chapter 4.1 --- Source Node Implication --- p.25Chapter 4.1.1 --- Introduction --- p.25Chapter 4.1.2 --- Implication Relationship and Implication-tree --- p.25Chapter 4.1.3 --- Selection of Alternative Wire Based on Implication-tree --- p.29Chapter 4.1.4 --- Implication-tree Based Logic Transformation --- p.32Chapter 4.2 --- Destination Node Implication --- p.35Chapter 4.2.1 --- Introduction --- p.35Chapter 4.2.2 --- Destination Node Relationship --- p.35Chapter 4.2.3 --- Destination Node Implication-tree --- p.39Chapter 4.2.4 --- Selection of Alternative Wire --- p.41Chapter 4.3 --- The Algorithm --- p.43Chapter 4.3.1 --- IB AW Implementation --- p.43Chapter 4.3.2 --- Experimental Results --- p.43Chapter 4.4 --- Conclusion --- p.45Chapter 5 --- Graph Based Alternative Wiring Logic Transformation --- p.47Chapter 5.1 --- Introduction --- p.47Chapter 5.2 --- Notations and Definitions --- p.48Chapter 5.3 --- Alternative Wire Patterns --- p.50Chapter 5.4 --- Construction of Minimal Patterns --- p.54Chapter 5.4.1 --- Minimality of Patterns --- p.54Chapter 5.4.2 --- Minimal Pattern Formation --- p.56Chapter 5.4.3 --- Pattern Extraction --- p.61Chapter 5.5 --- Experimental Results --- p.63Chapter 5.6 --- Conclusion --- p.63Chapter 6 --- Logic Optimization by GBAW --- p.66Chapter 6.1 --- Introduction --- p.66Chapter 6.2 --- Logic Simplification --- p.67Chapter 6.2.1 --- Single-Addition-Multiple-Removal by Pattern Feature . . --- p.67Chapter 6.2.2 --- Single-Addition-Multiple-Removal by Combination of Pat- terns --- p.68Chapter 6.2.3 --- Single-Addition-Single-Removal --- p.70Chapter 6.3 --- Incremental Perturbation Heuristic --- p.71Chapter 6.4 --- GBAW Optimization Algorithm --- p.73Chapter 6.5 --- Experimental Results --- p.73Chapter 6.6 --- Conclusion --- p.76Chapter 7 --- Conclusion --- p.78Bibliography --- p.80Chapter A --- VLSI Design Cycle --- p.85Chapter B --- Alternative Wire Patterns in [WLFOO] --- p.87Chapter B.1 --- 0-local Pattern --- p.87Chapter B.2 --- 1-local Pattern --- p.88Chapter B.3 --- 2-local Pattern --- p.89Chapter B.4 --- Fanout-reconvergent Pattern --- p.90Chapter C --- New Alternative Wire Patterns --- p.91Chapter C.1 --- Pattern Cluster C1 --- p.91Chapter C.1.1 --- NAND-NAND-AND/NAND;AND/NAND --- p.91Chapter C.1.2 --- NOR-NOR-OR/NOR;AND/NAND --- p.92Chapter C.1.3 --- AND-NOR-OR/NOR;OR/NOR --- p.95Chapter C.1.4 --- OR-NAND-AND/NAND;AND/NAND --- p.95Chapter C.2 --- Pattern Cluster C2 --- p.98Chapter C.3 --- Pattern Cluster C3 --- p.99Chapter C.4 --- Pattern Cluster C4 --- p.104Chapter C.5 --- Pattern Cluster C5 --- p.105Glossary --- p.106Index --- p.10

CUHK Digital Repository

Dynamic alignment for real-time CAD-driven PCB inspection

Author: Ong S.-A.
Publication venue
Publication date: 31/05/1995
Field of study

Pure OAI Repository

A program generator for recognition, parsing and transduction with syntactic patterns

Author: Steen G.J. van der
Publication venue: CWI
Publication date: 01/01/1988
Field of study

CWI's Institutional Repository