From Regular Expression Matching to Parsing
Given a regular expression and a string, the regular expression
parsing problem is to determine whether the string matches the expression and,
if so, how it matches, e.g., by a mapping of the characters of the string to
the characters of the expression.
Regular expression parsing makes finding matches of a regular expression even
more useful by allowing us to directly extract subpatterns of the match, e.g.,
for extracting IP-addresses from internet traffic analysis or extracting
subparts of genomes from genetic databases. We present a new general
technique for efficiently converting a large class of algorithms that
determine if a string matches a regular expression into algorithms that
can construct a corresponding mapping. As a consequence, we obtain the first
efficient linear space solutions for regular expression parsing.
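To illustrate the difference between matching and parsing described in this abstract, here is a minimal sketch using Python's standard `re` module. The pattern `IPV4`, the group names, and the helper `parse_ipv4` are hypothetical names chosen for this example. Python's `re` is a backtracking matcher, so this shows only what a parse returns, not the linear space technique of the paper.

```python
import re

# Hypothetical pattern: four dot-separated decimal fields, each a named group.
# It does not check that each field is at most 255.
IPV4 = re.compile(
    r"(?P<a>\d{1,3})\.(?P<b>\d{1,3})\.(?P<c>\d{1,3})\.(?P<d>\d{1,3})"
)


def parse_ipv4(text):
    """Return None if text does not match the pattern (the matching answer).

    Otherwise return, for each named group, the span of text it matched
    as (start, end, substring), which is the parsing answer.
    """
    m = IPV4.fullmatch(text)
    if m is None:
        return None
    return {
        name: (m.start(name), m.end(name), m.group(name))
        for name in IPV4.groupindex
    }
```

Matching alone answers yes or no; the dictionary returned on success records which characters of the string belong to which subexpression, which is what makes subpattern extraction possible.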
String Matching with Variable Length Gaps
We consider string matching with variable length gaps. Given a text and
a pattern consisting of strings separated by variable length gaps
(arbitrary strings of length in a specified range), the problem is to find all
ending positions of substrings of the text that match the pattern. This problem
is a basic primitive in computational biology applications. We present a new
algorithm whose time and space bounds are stated in terms of the lengths of the
text and the pattern, the number of strings in the pattern, the sum of the
lower bounds of the lengths of the gaps in the pattern, and the total number of
occurrences of the strings of the pattern within the text. Compared to the
previous results, these bounds essentially achieve the best known time and
space complexities simultaneously. Consequently, our algorithm obtains the best
known bounds for almost all combinations of these parameters. Our algorithm is
surprisingly simple and straightforward to implement. We also present
algorithms for finding and encoding the positions of all strings of the pattern
for every match of the pattern.
Comment: draft of full version, extended abstract at SPIRE 201
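The problem statement in this abstract can be made concrete with a short sketch. The code below is a naive baseline written for illustration, not the algorithm of the paper, and it makes no claim about that algorithm's bounds. The function names and the representation of gaps as `(lo, hi)` pairs are assumptions of this example, and the strings of the pattern are assumed to be non-empty.

```python
def find_occurrences(text, piece):
    """Start indices of all (possibly overlapping) occurrences of piece."""
    starts = []
    i = text.find(piece)
    while i != -1:
        starts.append(i)
        i = text.find(piece, i + 1)
    return starts


def vlg_match_ends(text, strings, gaps):
    """Ending positions (exclusive) of substrings of text that match

        strings[0] gap strings[1] gap ... strings[-1]

    where gaps[i] = (lo, hi) bounds the length of the gap that follows
    strings[i]. Naive baseline for illustration only.
    """
    assert len(gaps) == len(strings) - 1
    # Ends of partial matches covering only the first string.
    ends = {s + len(strings[0]) for s in find_occurrences(text, strings[0])}
    for piece, (lo, hi) in zip(strings[1:], gaps):
        next_ends = set()
        for start in find_occurrences(text, piece):
            # A partial match must end g characters before this occurrence
            # starts, for some allowed gap length g with lo <= g <= hi.
            if any(start - g in ends for g in range(lo, hi + 1)):
                next_ends.add(start + len(piece))
        ends = next_ends
    return sorted(ends)
```

For example, with the text `"abxxcdxcd"` and the pattern `ab`, a gap of length 1 to 2, then `cd`, only the occurrence of `cd` ending at position 6 qualifies; widening the gap to length 1 to 5 also admits the one ending at position 9.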
Regular Expression Search on Compressed Text
We present an algorithm for searching regular expression matches in
compressed text. The algorithm reports the number of matching lines in the
uncompressed text in time linear in the size of its compressed version. We
define efficient data structures that yield nearly optimal complexity bounds
and provide a sequential implementation --zearch-- that requires up to 25% less
time than the state of the art.
Comment: 10 pages, published in Data Compression Conference (DCC'19
Intermittent search strategies
This review examines intermittent target search strategies, which combine
phases of slow motion, allowing the searcher to detect the target, and phases
of fast motion during which targets cannot be detected. We first show that
intermittent search strategies are widely observed at various scales.
At the macroscopic scale, this is for example the case of animals looking for
food; at the microscopic scale, intermittent transport patterns are involved
in the reaction pathways of DNA-binding proteins as well as in intracellular
transport. Second, we introduce generic stochastic models, which show that
intermittent strategies are efficient and make it possible to minimize the
search time. This suggests that the intrinsic efficiency of intermittent search
strategies could justify their frequent observation in nature. Last, beyond
these modeling aspects, we propose that intermittent strategies could be used
also in a broader context to design and accelerate search processes.
Comment: 72 pages, review articl