6 research outputs found
Fast Searching in Packed Strings
Given strings P and Q the (exact) string matching problem is to find all
positions of substrings in Q matching P. The classical Knuth-Morris-Pratt
algorithm [SIAM J. Comput., 1977] solves the string matching problem in linear
time which is optimal if we can only read one character at the time. However,
most strings are stored in a computer in a packed representation with several
characters in a single word, giving us the opportunity to read multiple
characters simultaneously. In this paper we study the worst-case complexity of
string matching on strings given in packed representation. Let m \leq n be
the lengths of P and Q, respectively, and let \sigma denote the size of the
alphabet. On a standard unit-cost word-RAM with logarithmic word size we
present an algorithm using time O\left(\frac{n}{\log_\sigma n} + m +
\occ\right). Here \occ is the number of occurrences of P in Q. For m = o(n) this improves the O(n + m) bound of the Knuth-Morris-Pratt algorithm.
Furthermore, if m = O\left(\frac{n}{\log_\sigma n}\right) our algorithm is optimal since any
algorithm must spend at least \Omega(\frac{(n+m)\log
\sigma}{\log n} + \occ) = \Omega(\frac{n}{\log_\sigma n} + \occ) time to
read the input and report all occurrences. The result is obtained by a novel
automaton construction based on the Knuth-Morris-Pratt algorithm combined with
a new compact representation of subautomata allowing an optimal
tabulation-based simulation.
Comment: To appear in Journal of Discrete Algorithms, Special Issue on CPM 2009.
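For reference, the linear-time Knuth-Morris-Pratt baseline that the packed algorithm improves on can be sketched as follows. This is a minimal illustration of the classical algorithm only, not the paper's packed automaton construction; all names are illustrative.

```python
# Classical Knuth-Morris-Pratt string matching: O(n + m) time,
# reading one character at a time (the bound the paper improves on
# for packed strings).

def kmp_search(pattern: str, text: str) -> list:
    """Return all starting positions of `pattern` in `text`."""
    m, n = len(pattern), len(text)
    if m == 0:
        return list(range(n + 1))
    # failure[i] = length of the longest proper border of pattern[:i+1]
    failure = [0] * m
    k = 0
    for i in range(1, m):
        while k > 0 and pattern[i] != pattern[k]:
            k = failure[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        failure[i] = k
    occurrences = []
    k = 0
    for i, c in enumerate(text):
        while k > 0 and c != pattern[k]:
            k = failure[k - 1]
        if c == pattern[k]:
            k += 1
        if k == m:  # full match ending at text position i
            occurrences.append(i - m + 1)
            k = failure[k - 1]
    return occurrences

print(kmp_search("aba", "ababa"))  # → [0, 2]
```

The packed algorithm of the paper replaces this character-at-a-time automaton simulation with tabulated transitions over word-sized blocks of characters, which is where the \log_\sigma n speed-up comes from.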
Fast and Compact Regular Expression Matching
We study four problems in string matching, namely regular expression matching,
approximate regular expression matching, string edit distance, and subsequence
indexing, on a standard word RAM model of computation that allows
logarithmic-sized words to be manipulated in constant time. We show how to
improve the space and/or remove a dependency on the alphabet size for each
problem using either an improved tabulation technique of an existing algorithm
or by combining known algorithms in a new way.
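To give a flavour of one of the four problems, subsequence indexing asks to preprocess a text T so that queries "is P a subsequence of T?" can be answered quickly. The sketch below uses the standard next-occurrence table, answering each query in O(|P|) dictionary lookups; it is an illustrative classical solution, not the paper's space-improved variant.

```python
# Subsequence indexing via a next-occurrence table:
# nxt[i][c] = smallest position j >= i with text[j] == c (absent if none).

def build_subsequence_index(text: str) -> list:
    n = len(text)
    nxt = [dict() for _ in range(n + 1)]  # nxt[n] is empty: no characters remain
    for i in range(n - 1, -1, -1):        # fill backwards from the end
        nxt[i] = dict(nxt[i + 1])
        nxt[i][text[i]] = i
    return nxt

def is_subsequence(nxt: list, pattern: str) -> bool:
    """Greedily match each pattern character at its earliest possible position."""
    i = 0
    for c in pattern:
        j = nxt[i].get(c)
        if j is None:
            return False
        i = j + 1
    return True

idx = build_subsequence_index("banana")
print(is_subsequence(idx, "bnn"))  # → True
print(is_subsequence(idx, "ab"))   # → False ('b' occurs only before the first 'a')
```

Storing a full dictionary per position costs O(n \sigma) space in the worst case; reducing such space and alphabet dependencies is exactly the kind of improvement the paper studies.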
Fine-grained Complexity Meets IP = PSPACE
In this paper we study the fine-grained complexity of finding exact and
approximate solutions to problems in P. Our main contribution is showing
reductions from exact to approximate solution for a host of such problems.
As one (notable) example, we show that the Closest-LCS-Pair problem (given
two sets of strings A and B, compute exactly the maximum LCS(a, b) with (a, b) \in A \times B) is equivalent to its approximation version
(under near-linear time reductions, and with a constant approximation factor).
More generally, we identify a class of problems, which we call BP-Pair-Class,
comprising both exact and approximate solutions, and show that they are all
equivalent under near-linear time reductions.
Exploring this class and its properties, we also show:
Under the NC-SETH assumption (a significantly more relaxed
assumption than SETH), solving any of the problems in this class requires
essentially quadratic time.
Modest improvements on the running time of known algorithms
(shaving log factors) would imply that NEXP is not in non-uniform NC^1.
Finally, we leverage our techniques to show new barriers for
deterministic approximation algorithms for LCS.
At the heart of these new results is a deep connection between interactive
proof systems for bounded-space computations and the fine-grained complexity of
exact and approximate solutions to problems in P. In particular, our results
build on the proof techniques from the classical IP = PSPACE result.
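To make the Closest-LCS-Pair problem statement concrete, here is the naive exact algorithm: one classical quadratic LCS dynamic program per pair in A \times B. This sketch only illustrates the problem definition and the quadratic-per-pair cost that the paper's hardness results concern; it is not an algorithm from the paper.

```python
# Naive exact Closest-LCS-Pair: max over (a, b) in A x B of LCS(a, b).

def lcs(a: str, b: str) -> int:
    """Classical O(|a| * |b|) dynamic program for longest common subsequence,
    kept to two rows of the DP table."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if ca == cb else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

def closest_lcs_pair(A, B) -> int:
    """Exhaustively try every pair; |A| * |B| quadratic LCS computations."""
    return max(lcs(a, b) for a in A for b in B)

print(closest_lcs_pair({"abcd", "xyz"}, {"abd", "zzy"}))  # → 3
```

Under the NC-SETH assumption discussed above, essentially quadratic time is required even to approximate this quantity to within a constant factor, which is what makes the exact-to-approximate equivalence notable.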
A framework for correlation and aggregation of security alerts in communication networks. A reasoning correlation and aggregation approach to detect multi-stage attack scenarios using elementary alerts generated by Network Intrusion Detection Systems (NIDS) for a global security perspective.
The tremendous increase in the usage and complexity of modern communication and network systems connected to the Internet places demands upon security management to protect organisations' sensitive data and resources from malicious intrusion. Malicious attacks by intruders and hackers exploit flaws and weaknesses in deployed systems through several sophisticated techniques that cannot be prevented by traditional measures, such as user authentication, access controls and firewalls. Consequently, automated detection and timely response systems are urgently needed to detect abnormal activities by monitoring network traffic and system events. Network Intrusion Detection Systems (NIDS) and Network Intrusion Prevention Systems (NIPS) are technologies that inspect traffic and diagnose system behaviour to provide improved attack protection.
Current implementations of intrusion detection systems (commercial and open-source) lack the scalability to support the massive increase in network speed and the emergence of new protocols and services. Multi-gigabit networks have become a standard installation, leaving the NIDS susceptible to resource-exhaustion attacks. The research focuses on two distinct problems for the NIDS: missed alerts due to packet loss caused by NIDS performance limitations; and the huge volume of alerts generated by the NIDS, which overwhelms the security analyst and makes event observation tedious.
A methodology for analysing alerts using a proposed framework for alert correlation has been presented to provide the security operator with a global view of the security perspective. Missed alerts are recovered implicitly using a contextual technique to detect multi-stage attack scenarios. This is based on the assumption that the most serious intrusions consist of relevant steps that are temporally ordered. The pre- and post-condition approach is used to identify the logical relations among low-level alerts. The alerts are aggregated, verified using vulnerability modelling, and correlated to construct multi-stage attacks. A number of algorithms have been proposed in this research to support the functionality of our framework, including alert correlation, alert aggregation and graph reduction. These algorithms have been implemented in a tool called the Multi-stage Attack Recognition System (MARS), consisting of a collection of integrated components. The system has been evaluated through a series of experiments using different data sets, i.e. publicly available datasets and data sets collected in real-life experiments. The results show that our approach can effectively detect multi-stage attacks. The false-positive rates are reduced due to the use of vulnerability and target-host information.
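The pre- and post-condition correlation idea described above can be sketched in a few lines: an earlier alert is linked to a later one whenever the conditions it establishes satisfy conditions the later step requires. This is a minimal hypothetical illustration of the general technique, not the MARS implementation; all alert names and conditions are invented for the example.

```python
# Pre-/post-condition alert correlation sketch: chain low-level alerts
# into multi-stage attack scenarios by matching what one step establishes
# (post-conditions) against what a later step requires (pre-conditions).
from dataclasses import dataclass, field

@dataclass
class Alert:
    name: str
    time: int
    pre: set = field(default_factory=set)   # conditions required before this step
    post: set = field(default_factory=set)  # conditions established by this step

def correlate(alerts):
    """Return edges (earlier -> later) where an earlier alert's post-conditions
    intersect a later alert's pre-conditions, respecting temporal order."""
    alerts = sorted(alerts, key=lambda a: a.time)
    edges = []
    for i, early in enumerate(alerts):
        for late in alerts[i + 1:]:
            if early.post & late.pre:  # logical relation between the two steps
                edges.append((early.name, late.name))
    return edges

# Hypothetical three-stage scenario: reconnaissance, exploitation, exfiltration.
scan = Alert("port_scan", 1, set(), {"host_known"})
expl = Alert("exploit", 2, {"host_known"}, {"shell_access"})
exfil = Alert("exfiltration", 3, {"shell_access"}, set())
print(correlate([scan, expl, exfil]))
# → [('port_scan', 'exploit'), ('exploit', 'exfiltration')]
```

In the thesis's framework the same linking step is combined with aggregation, vulnerability-based verification and graph reduction; the sketch shows only the core correlation relation.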