
2,695 research outputs found

Succinct Dictionary Matching With No Slowdown

Author: A.V. Aho
J.I. Munro
K. Sadakane
P. Elias
R.M. Fano
S. Dori
W.-K. Hon
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

The problem of dictionary matching is a classical problem in string matching: given a set S of d strings of total length n characters over a (not necessarily constant) alphabet of size sigma, build a data structure so that we can match in any text T all occurrences of strings belonging to S. The classical solution for this problem is the Aho-Corasick automaton, which finds all occ occurrences in a text T in time O(|T| + occ) using a data structure that occupies O(m log m) bits of space, where m <= n + 1 is the number of states in the automaton. In this paper we show that the Aho-Corasick automaton can be represented in just m(log sigma + O(1)) + O(d log(n/d)) bits of space while still maintaining the ability to answer queries in O(|T| + occ) time. To the best of our knowledge, the currently fastest succinct data structure for the dictionary matching problem uses space O(n log sigma) while answering queries in O(|T| log log n + occ) time. In this paper we also show how the space occupancy can be reduced to m(H0 + O(1)) + O(d log(n/d)), where H0 is the empirical entropy of the characters appearing in the trie representation of the set S, provided that sigma < m^epsilon for any constant 0 < epsilon < 1. The query time remains unchanged. Comment: Corrected typos and other minor errors

arXiv.org e-Print Archive

CiteSeerX

Crossref

The Choice and Siting of Natural Vegetation in Reclamation of Disturbed Land

Author: Aho K.
Regele S.
Weaver T.
Publication venue: UKnowledge
Publication date: 27/10/2020

University of Kentucky

A Computational Interpretation of Context-Free Expressions

Author: A Frisch
AV Aho
C Grabmayer
J Winter
JA Brzozowski
K Marriott
KZM Lu
M Brandt
M Sulzmann
Publication date: 24/08/2017

We phrase parsing with context-free expressions as a type inhabitation problem where values are parse trees and types are context-free expressions. We first show how containment among context-free and regular expressions can be reduced to a reachability problem by using a canonical representation of states. The proofs-as-programs principle yields a computational interpretation of the reachability problem in terms of a coercion that transforms the parse tree for a context-free expression into a parse tree for a regular expression. It also yields a partial coercion from regular parse trees to context-free ones. The partial coercion from the trivial language of all words to a context-free expression corresponds to a predictive parser for the expression.

arXiv.org e-Print Archive

Crossref

Correction: Relative abundance of and composition within fungal orders differ between cheatgrass (Bromus tectorum) and sagebrush (Artemisia tridentata)-associated soils

Author: Aho K.
King G. M.
Weber C. F.
Publication venue: LSU Digital Commons
Publication date: 01/01/2015

Nonnative Bromus tectorum (cheatgrass) is decimating sagebrush steppe, one of the largest ecosystems in the Western United States, and is causing regional-scale shifts in the predominant plant-fungal interactions. Sagebrush, a native perennial, hosts arbuscular mycorrhizal fungi (AMF), whereas cheatgrass, a winter annual, is a relatively poor host of AMF. This shift is likely intertwined with decreased carbon (C)-sequestration in cheatgrass-invaded soils and alterations in overall soil fungal community composition and structure, but the latter remain unresolved. We examined soil fungal communities using high-throughput amplicon sequencing (ribosomal large subunit gene) in the 0-4 cm and 4-8 cm depth intervals of six cores from cheatgrass- and six cores from sagebrush-dominated soils. Sagebrush core surfaces (0-4 cm) contained higher nitrogen and total C than cheatgrass core surfaces; these differences mirrored the presence of glomalin-related soil proteins (GRSP), which have been associated with AMF activity and increased C-sequestration. Fungal richness was not significantly affected by vegetation type, depth or an interaction of the two factors. However, the relative abundance of seven taxonomic orders was significantly affected by vegetation type or the interaction between vegetation type and depth. Teloschistales, Spizellomycetales, Pezizales and Cantharellales were more abundant in sagebrush libraries and contain mycorrhizal, lichenized and basal lineages of fungi. Only two orders (Coniochaetales and Sordariales), which contain numerous economically important pathogens and opportunistic saprotrophs, were more abundant in cheatgrass libraries. Pleosporales, Agaricales, Helotiales and Hypocreales were most abundant across all libraries, but the number of genera detected within these orders was as much as 29 times lower in cheatgrass relative to sagebrush libraries.
These compositional differences between fungal communities associated with cheatgrass- and sagebrush-dominated soils warrant future research to examine soil fungal community composition across more sites and time points as well as in association with native grass species that also occupy cheatgrass-invaded ecosystems.

Directory of Open Access Journals

PubMed Central

Louisiana State University

Linear Parsing Expression Grammars

Author: A Birman
A Fellah
A Morihata
AK Chandra
AV Aho
J Brzozowski
JE Hopcroft
K Thompson
P Linz
S Medeiros
T Parr
Publication venue
Publication date: 09/09/2017

PEGs were formalized by Ford in 2004, and have several pragmatic operators (such as ordered choice and unlimited lookahead) for better expressing modern programming language syntax. Since these operators are not explicitly defined in classic formal language theory, it is significant and still challenging to argue PEGs' expressiveness in the context of formal language theory. Since PEGs are relatively new, there are several unsolved problems. One of the problems is revealing a subclass of PEGs that is equivalent to DFAs. This allows application of some techniques from the theory of regular grammars to PEGs. In this paper, we define Linear PEGs (LPEGs), a subclass of PEGs that is equivalent to DFAs. Surprisingly, LPEGs are formalized by only excluding some patterns of recursive nonterminals in PEGs, and include the full set of ordered choice, unlimited lookahead, and greedy repetition, which are characteristic of PEGs. Although the conversion judgement of parsing expressions into DFAs is undecidable in general, the formalism of LPEGs allows for a syntactical judgement of parsing expressions. Comment: Parsing expression grammars, Boolean finite automata, Packrat parsing

arXiv.org e-Print Archive

Crossref

Constructing multiple unique input/output sequences using metaheuristic optimisation techniques

Author: Aho
Atkinson
Goldberg
Guo
Hierons
Jones
K. Derderian
Kirkpatrick
M. Harman
McGookin
Metropolis
Miller
Naik
Pomeranz
Q. Guo
R.M. Hierons
Tracey
Wegener
Yang
Publication venue: 'Institution of Engineering and Technology (IET)'
Publication date: 01/01/2005

Multiple unique input/output sequences (UIOs) are often used to generate robust and compact test sequences in finite state machine (FSM) based testing. However, computing UIOs is NP-hard. Metaheuristic optimisation techniques (MOTs) such as genetic algorithms (GAs) and simulated annealing (SA) are effective in providing good solutions for some NP-hard problems. In the paper, the authors investigate the construction of UIOs by using MOTs. They define a fitness function to guide the search for potential UIOs and use sharing techniques to encourage MOTs to locate UIOs that are calculated as local optima in a search domain. They also compare the performance of GA and SA for UIO construction. Experimental results suggest that, after using a sharing technique, both GA and SA can find a majority of UIOs from the models under test.
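To make the notion of a UIO concrete, here is a small Python sketch that checks whether an input sequence is a UIO for a given state and finds one by exhaustive enumeration. This is not the GA or SA search from the paper, and it does not reproduce the authors' fitness function; brute force is swapped in purely for illustration. The FSM encoding (a dict keyed by `(state, input)`), the example machine, and the names `run`, `is_uio`, and `find_shortest_uio` are all assumptions made for this example.

```python
from itertools import product


def run(fsm, state, inputs):
    """Apply an input sequence from `state` and return the output sequence."""
    outputs = []
    for symbol in inputs:
        state, out = fsm[(state, symbol)]
        outputs.append(out)
    return tuple(outputs)


def is_uio(fsm, states, state, inputs):
    """True if no other state produces the same outputs on `inputs`."""
    expected = run(fsm, state, inputs)
    return all(run(fsm, other, inputs) != expected
               for other in states if other != state)


def find_shortest_uio(fsm, states, alphabet, state, max_len):
    """Exhaustively try sequences of increasing length (exponential cost)."""
    for length in range(1, max_len + 1):
        for candidate in product(alphabet, repeat=length):
            if is_uio(fsm, states, state, candidate):
                return candidate
    return None  # no UIO of length <= max_len


# Hypothetical three-state Mealy machine: (state, input) -> (next_state, output)
EXAMPLE_FSM = {
    ("A", "a"): ("B", 0), ("A", "b"): ("A", 1),
    ("B", "a"): ("C", 0), ("B", "b"): ("A", 0),
    ("C", "a"): ("A", 1), ("C", "b"): ("C", 1),
}
EXAMPLE_STATES = ["A", "B", "C"]
EXAMPLE_ALPHABET = ["a", "b"]

if __name__ == "__main__":
    for s in EXAMPLE_STATES:
        print(s, find_shortest_uio(EXAMPLE_FSM, EXAMPLE_STATES,
                                   EXAMPLE_ALPHABET, s, max_len=3))
```

The enumeration grows exponentially with sequence length, which is consistent with the abstract's motivation for using metaheuristic search on larger machines.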

Crossref

UCL Discovery

King's Research Portal

Brunel University Research Archive