22 research outputs found

Bit-parallel search algorithms for long patterns

Author: A. Hume
A.C.-C. Yao
G. Navarro
G. Zhang
H. Peltola
J. Tarhio
K. Fredriksson
L. He
M. Crochemore
M.O. Külekci
R.N. Horspool
T. Lecroq
Publication date: 01/01/2010

Peer reviewed

Crossref

Helsingin yliopiston digitaalinen arkisto

Lazy and incremental program generation

Author: J. Heering
J. Rekers
KOSKIMIES K.
P. Klint
SESTOFT P.
BROWN P.
FRITZSON P.
HORSPOOL R.N.
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

Efficient exact pattern-matching in proteomic sequences

Author: B. Smyth
D.E. Knuth
D.M. Sunday
F. Franek
G. Navarro
H. Peltola
M. Crochemore
P.D. Michailidis
R.A. Baeza-Yates
R.M. Karp
R.N. Horspool
R.S. Boyer
T. Lecroq
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009

This paper proposes a novel algorithm for complete exact pattern-matching that focuses on the specificities of protein sequences (an alphabet of 20 symbols) but is also highly efficient for larger alphabets. The searching strategy uses large search windows, allowing multiple alignments per iteration. A new filtering heuristic, named the compatibility rule, contributed decisively to the efficiency improvement. On average, the new algorithm outperforms its best-rated competitors.

CiteSeerX

Universidade do Minho: RepositoriUM

Crossref

Biblioteca Digital do IPB

A Fast Algorithm for Approximate String Matching on Gene Sequences

Author: A. Cornish-Bowden
G. Navarro
J. Tarhio
L. Valinsky
N. El-Mabrouk
R.A. Baeza-Yates
R.N. Horspool
R.S. Boyer
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2005

Crossref

Approximate string matching with reduced alphabet

Author: B. Ďurian
E. Ukkonen
J. Kärkkäinen
J. Tarhio
K. Fredriksson
L. Salmela
M. Fontaine
M.R. Garey
P. Jokinen
R. Baeza-Yates
R. Muth
R. Zhu
R.M. Karp
R.N. Horspool
R.S. Boyer
T. Berry
T. Lecroq
V. Mäkinen
V.L. Arlazarov
W.J. Masek
Z. Liu
Publication venue: Heidelberg, Berlin, Springer Verlag
Publication date: 01/01/2010

Peer reviewed

Crossref

Helsingin yliopiston digitaalinen arkisto

A Space Optimization Using Inexact Instruction Matches

Author: M. J. Zastre
R.N. Horspool

In this paper we examine parameterized procedural abstraction. This is an extension of an optimization whose sole purpose is to reduce code size. Previously published implementations of procedural abstraction have produced space savings if the instruction sequences are exact matches. We show that permanent space savings (compaction) are possible when (1) covering all inexact matches by several procedures and (2) carefully choosing the inexact match instances covered by each procedure. Our algorithms yield substantially better space savings in comparison to approaches constrained to use unparameterized procedures.

1 Introduction

Powerful applications that are small and fast have always been desirable. Falling memory prices and higher chip densities mean that internal storage constraints should (theoretically) recede into the background. However, many computer users are unsatisfied as they often find that there is never enough memory for their programs. Compiler optimizations are usual..

CiteSeerX

Disambiguation filters for scannerless generalized LR parsers

Author: Brand van den, M.G.J.
Horspool R.N.
Scheerder J. (Jeroen)
Vinju J.J.
Visser E.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2002

In this paper we present the fusion of generalized LR parsing and scannerless parsing. This combination supports syntax definitions in which all aspects (lexical and context-free) of the syntax of a language are defined explicitly in one formalism. Furthermore, there are no restrictions on the class of grammars, thus allowing a natural syntax tree structure. Ambiguities that arise through the use of unrestricted grammars are handled by explicit disambiguation constructs, instead of the implicit defaults taken by traditional scanner and parser generators. Hence, a syntax definition becomes a full declarative description of a language. Scannerless generalized LR parsing is a viable technique that has been applied in various industrial and academic projects.

Faster Generalized LR Parsing

Author: A.V. Aho
D.E. Knuth
F.E.J. Kruseman Aretz
G.H. Roberts
H.R. Lewis
M. Tomita
M.R. Garey
P. Eades
P. Pfahler
R. Leermakers
R.N. Horspool
Publication venue: Springer-Verlag
Publication date: 01/01/1999

Tomita devised a method of generalized LR (GLR) parsing to parse ambiguous grammars efficiently. A GLR parser uses linear-time LR parsing techniques as long as possible, falling back on more expensive general techniques when necessary. Much research has addressed speeding up LR parsers. However, we argue that this previous work is not transferable to GLR parsers. Instead, we speed up LR parsers by building larger pushdown automata, trading space for time. A variant of the GLR algorithm then incorporates our faster LR parsers. Our timings show that our new method for GLR parsing can parse highly ambiguous grammars significantly faster than a standard GLR parser.

CiteSeerX

Crossref

A two-level structure for compressing aligned bitexts

Author: C.G. Nevill-Manning
D.E. Knuth
E.S. Conley
F.J. Och
G. Navarro
H.S. Heaps
I.D. Melamed
J.G. Cleary
N.R. Brisaboa
R. Mihalcea
R.N. Horspool
R.S. Boyer
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009

A bitext, or bilingual parallel corpus, consists of two texts, each one in a different language, that are mutual translations. Bitexts are very useful in linguistic engineering because they are used as a source of knowledge for different purposes. In this paper we propose a strategy to efficiently compress and use bitexts, saving not only space but also processing time when exploiting them. Our strategy is based on a two-level structure for the vocabularies, and on the use of biwords, a pair of associated words, one from each language, as basic symbols to be encoded with an ETDC compressor. The resulting compressed bitext needs around 20% of the space and allows more efficient implementations of the different types of searches and operations that linguistic engineering needs to perform on them. In this paper we discuss and provide results for compression, decompression, different types of searches, and bilingual snippet extraction. Spanish projects TIN2006-15071-C03-01, TIN2006-15071-C03-02 and TIN2006-15071-C03-03. Regional Government of Castilla y León and the European Social Fund.

Repositorio Institucional de la Universidad de Alicante

Crossref