Search CORE

190 research outputs found

Recommended from our members

Pattern matching : a sheaf-theoretic approach

Author: Srinivas Yellamraju V.
Publication venue: eScholarship, University of California
Publication date: 01/01/1991
Field of study

A general theory of pattern matching is presented by adopting an extensional, geometric view of patterns. The extension of the matching relation consists of the occurrences of all possible patterns in a particular target. The geometry of the pattern describes the structure of the pattern and the spatial relationships among parts of the pattern. The extension and the geometry, when combined, produce a structure called a sheaf. Sheaf theory is a well developed branch of mathematics which studies the global consequences of locally defined properties. For pattern matching, an occurrence of a pattern, a global property of the pattern, is obtained by gluing together occurrences of parts of the pattern, which are locally defined properties.A sheaf-theoretic view of pattern rnatching provides a uniforrn treatrnent of pattern matching on any kind of data structure-strings, trees, graphs, hypergraphs, and so on. Such a parametric description is achieved by using the language of category theory, a highly abstract description of commonly occurring structures and relationships in mathematics.A generalized version of the Knuth-Morris-Pratt pattern matching algorithm is derived by gradually converting the extensional description of pattern rnatching as a sheaf into an intensional description. The algorithm results from a synergy of four very general program synthesis/transformation techniques: (1) Divide and conquer: exploit the sheaf condition; assemble a full match by gluing together partial matches; (2) Finite differencing: collect and update partial matches incrementally while traversing the target; (3) Backtracking: instead of saving all partial matches, save just one; when this partial match cannot be extended, fail back to another; (4) Partial evaluation: precompute pattern-based (and therefore constant) computations.The derivation is carried out in a general frarnework using Grothendieck topologies. By appropriately instantiating the underlying data structures and topologies, the sarne scheme results in matching algorithms for patterns with variables and with multiple patterns. Slight variations of the derivation result in Earley's algorithm for context-free parsing, and Waltz filtering, a relaxation algorithm for providing 3-D interpretations to 2-D irnages.Other applications of a geometric view of patterns are briefly considered: rewrites, parallel algorithms, induction and computability

eScholarship - University of California

String Matching with Variable Length Gaps

Author: Aho
Crochemore
David Kofoed Wind
Fredriksson
Hjalte Wedel Vildhøj
Hofmann
Inge Li Gørtz
Knuth
Morgante
Myers
Myers
Myers
Navarro
Navarro
Philip Bille
Thompson
Publication venue
Publication date: 01/01/2010
Field of study

We consider string matching with variable length gaps. Given a string

T

and a pattern

P

consisting of strings separated by variable length gaps (arbitrary strings of length in a specified range), the problem is to find all ending positions of substrings in

T

that match

P

. This problem is a basic primitive in computational biology applications. Let

m

and

n

be the lengths of

P

and

T

, respectively, and let

k

be the number of strings in

P

. We present a new algorithm achieving time

O(n\log k + m +\alpha)

and space

O(m + A)

, where

A

is the sum of the lower bounds of the lengths of the gaps in

P

and

\alpha

is the total number of occurrences of the strings in

P

within

T

. Compared to the previous results this bound essentially achieves the best known time and space complexities simultaneously. Consequently, our algorithm obtains the best known bounds for almost all combinations of

m

n

k

A

, and

\alpha

. Our algorithm is surprisingly simple and straightforward to implement. We also present algorithms for finding and encoding the positions of all strings in

P

for every match of the pattern.Comment: draft of full version, extended abstract at SPIRE 201

arXiv.org e-Print Archive

CiteSeerX

Elsevier - Publisher Connector

Crossref

Online Research Database In Technology

A sheaf-theoretic approach to pattern matching and related problems

Author: Srinivas Yellamraju V.
Publication venue: 'Elsevier BV'
Publication date: 26/04/1993
Field of study

AbstractWe present a general theory of pattern matching by adopting an extensional, geometric view of patterns. Representing the geometry of the pattern via a Grothendieck topology, the extension of the matching relation for a constant target and varying pattern forms a sheaf. We derive a generalized version of the Knuth-Morris-Pratt string-matching algorithm by gradually converting this extensional description into an intensional description, i.e., an algorithm. The generality of this approach is illustrated by briefly considering other applications: Earley's algorithm for parsing, Waltz filtering for scene analysis, matching modulo commutativity, and the n-queens problem

Elsevier - Publisher Connector

Cyclic rewriting and conjugacy problems

Author: Diekert Volker
Duncan Andrew
Myasnikov Alexei
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 25/07/2012
Field of study

Cyclic words are equivalence classes of cyclic permutations of ordinary words. When a group is given by a rewriting relation, a rewriting system on cyclic words is induced, which is used to construct algorithms to find minimal length elements of conjugacy classes in the group. These techniques are applied to the universal groups of Stallings pregroups and in particular to free products with amalgamation, HNN-extensions and virtually free groups, to yield simple and intuitive algorithms and proofs of conjugacy criteria.Comment: 37 pages, 1 figure, submitted. Changes to introductio

arXiv.org e-Print Archive

Crossref

A Generalist Neural Algorithmic Learner

Author: Bennani Mehdi
Bevilacqua Beatrice
Blundell Charles
Bošnjak Matko
Csordás Róbert
Deac Andreea
Dudzik Andrew
Ganin Yaroslav
Ibarz Borja
Kurin Vitaly
Nikiforou Kyriacos
Papamakarios George
Rubanova Yulia
Veličković Petar
Vitvitskyi Alex
Publication venue
Publication date: 22/09/2022
Field of study

The cornerstone of neural algorithmic reasoning is the ability to solve algorithmic tasks, especially in a way that generalises out of distribution. While recent years have seen a surge in methodological improvements in this area, they mostly focused on building specialist models. Specialist models are capable of learning to neurally execute either only one algorithm or a collection of algorithms with identical control-flow backbone. Here, instead, we focus on constructing a generalist neural algorithmic learner -- a single graph neural network processor capable of learning to execute a wide range of algorithms, such as sorting, searching, dynamic programming, path-finding and geometry. We leverage the CLRS benchmark to empirically show that, much like recent successes in the domain of perception, generalist algorithmic learners can be built by "incorporating" knowledge. That is, it is possible to effectively learn algorithms in a multi-task manner, so long as we can learn to execute them well in a single-task regime. Motivated by this, we present a series of improvements to the input representation, training regime and processor architecture over CLRS, improving average single-task performance by over 20% from prior art. We then conduct a thorough ablation of multi-task learners leveraging these improvements. Our results demonstrate a generalist learner that effectively incorporates knowledge captured by specialist models.Comment: 20 pages, 10 figure

arXiv.org e-Print Archive

A Survey of String Matching Algorithms

Author: Koloud Al-Khamaiseh
Shadi Alshagarin
Publication venue
Publication date: 03/04/2020
Field of study

ABSTRACT The concept of string matching algorithms are playing an important role of string algorithms in finding a place where one or several strings (patterns) are found in a large body of text (e.g., data streaming, a sentence, a paragraph, a book, etc.). Its application covers a wide range, including intrusion detection Systems (IDS) in computer networks, applications in bioinformatics, detecting plagiarism, information security, pattern recognition, document matching and text mining. In this paper we present a short survey for well-known and recent updated and hybrid string matching algorithms. These algorithms can be divided into two major categories, known as exact string matching and approximate string matching. The string matching classification criteria was selected to highlight important features of matching strategies, in order to identify challenges and vulnerabilities

CiteSeerX

Turning function and shape recognition

Author: Shankar Swetha
Publication venue: Digital Scholarship@UNLV
Publication date: 01/01/2008
Field of study

The technique of turning function is a powerful method for measuring similarity between two dimensional shapes. The method works well when the boundary of the shape does not contain noise edges. We propose an algorithm for smoothing noise edges by decomposing the boundary into monotone components and smoothing the noise edges in each component. We also present an implementation of the proposed smoothing algorithm

University of Nevada, Las Vegas Repository

Linear pattern matching on sparse suffix trees

Author: Kolpakov Roman
Kucherov Gregory
Starikovskaya Tatiana
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 14/03/2011
Field of study

Packing several characters into one computer word is a simple and natural way to compress the representation of a string and to speed up its processing. Exploiting this idea, we propose an index for a packed string, based on a {\em sparse suffix tree} \cite{KU-96} with appropriately defined suffix links. Assuming, under the standard unit-cost RAM model, that a word can store up to

\log_{\sigma}n

characters (

\sigma

the alphabet size), our index takes

O(n/\log_{\sigma}n)

space, i.e. the same space as the packed string itself. The resulting pattern matching algorithm runs in time

O(m+r^2+r\cdot occ)

, where

m

is the length of the pattern,

r

is the actual number of characters stored in a word and

occ

is the number of pattern occurrences

arXiv.org e-Print Archive

CiteSeerX

Crossref

Hal-Diderot

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

The Complexity of the Approximate Multiple Pattern Matching Problem for Random Strings

Author: Rakotoarimalala Tsinjo
Sportiello Andrea
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 31st International Conference on Probabilistic, Combinatorial and Asymptotic Methods for the Analysis of Algorithms (AofA 2020)
Publication date: 01/01/2020
Field of study

We describe a multiple string pattern matching algorithm which is well-suited for approximate search and dictionaries composed of words of different lengths. We prove that this algorithm has optimal complexity rate up to a multiplicative constant, for arbitrary dictionaries. This extends to arbitrary dictionaries the classical results of Yao [SIAM J. Comput. 8, 1979], and Chang and Marr [Proc. CPM94, 1994]

Dagstuhl Research Online Publication Server

HAL-Paris 13

Oskar Bordeaux