27,691 research outputs found

    A new problem in string searching

    Full text link
    We describe a substring search problem that arises in group presentation simplification processes. We suggest a two-level searching model: skip and match levels. We give two timestamp algorithms which skip searching parts of the text where there are no matches at all and prove their correctness. At the match level, we consider Harrison signature, Karp-Rabin fingerprint, Bloom filter and automata based matching algorithms and present experimental performance figures.Comment: To appear in Proceedings Fifth Annual International Symposium on Algorithms and Computation (ISAAC'94), Lecture Notes in Computer Scienc

    Fast Searching in Packed Strings

    Get PDF
    Given strings PP and QQ the (exact) string matching problem is to find all positions of substrings in QQ matching PP. The classical Knuth-Morris-Pratt algorithm [SIAM J. Comput., 1977] solves the string matching problem in linear time which is optimal if we can only read one character at the time. However, most strings are stored in a computer in a packed representation with several characters in a single word, giving us the opportunity to read multiple characters simultaneously. In this paper we study the worst-case complexity of string matching on strings given in packed representation. Let mnm \leq n be the lengths PP and QQ, respectively, and let σ\sigma denote the size of the alphabet. On a standard unit-cost word-RAM with logarithmic word size we present an algorithm using time O\left(\frac{n}{\log_\sigma n} + m + \occ\right). Here \occ is the number of occurrences of PP in QQ. For m=o(n)m = o(n) this improves the O(n)O(n) bound of the Knuth-Morris-Pratt algorithm. Furthermore, if m=O(n/logσn)m = O(n/\log_\sigma n) our algorithm is optimal since any algorithm must spend at least \Omega(\frac{(n+m)\log \sigma}{\log n} + \occ) = \Omega(\frac{n}{\log_\sigma n} + \occ) time to read the input and report all occurrences. The result is obtained by a novel automaton construction based on the Knuth-Morris-Pratt algorithm combined with a new compact representation of subautomata allowing an optimal tabulation-based simulation.Comment: To appear in Journal of Discrete Algorithms. Special Issue on CPM 200

    Forward Private Searchable Symmetric Encryption with Optimized I/O Efficiency

    Get PDF
    Recently, several practical attacks raised serious concerns over the security of searchable encryption. The attacks have brought emphasis on forward privacy, which is the key concept behind solutions to the adaptive leakage-exploiting attacks, and will very likely to become mandatory in the design of new searchable encryption schemes. For a long time, forward privacy implies inefficiency and thus most existing searchable encryption schemes do not support it. Very recently, Bost (CCS 2016) showed that forward privacy can be obtained without inducing a large communication overhead. However, Bost's scheme is constructed with a relatively inefficient public key cryptographic primitive, and has a poor I/O performance. Both of the deficiencies significantly hinder the practical efficiency of the scheme, and prevent it from scaling to large data settings. To address the problems, we first present FAST, which achieves forward privacy and the same communication efficiency as Bost's scheme, but uses only symmetric cryptographic primitives. We then present FASTIO, which retains all good properties of FAST, and further improves I/O efficiency. We implemented the two schemes and compared their performance with Bost's scheme. The experiment results show that both our schemes are highly efficient, and FASTIO achieves a much better scalability due to its optimized I/O
    corecore