2,137 research outputs found

    Fingerprint databases for theorems

    Full text link
    We discuss the advantages of searchable, collaborative, language-independent databases of mathematical results, indexed by "fingerprints" of small and canonical data. Our motivating example is Neil Sloane's massively influential On-Line Encyclopedia of Integer Sequences. We hope to encourage the greater mathematical community to search for the appropriate fingerprints within each discipline, and to compile fingerprint databases of results wherever possible. The benefits of these databases are broad - advancing the state of knowledge, enhancing experimental mathematics, enabling researchers to discover unexpected connections between areas, and even improving the refereeing process for journal publication.Comment: to appear in Notices of the AM

    A new problem in string searching

    Full text link
    We describe a substring search problem that arises in group presentation simplification processes. We suggest a two-level searching model: skip and match levels. We give two timestamp algorithms which skip searching parts of the text where there are no matches at all and prove their correctness. At the match level, we consider Harrison signature, Karp-Rabin fingerprint, Bloom filter and automata based matching algorithms and present experimental performance figures.Comment: To appear in Proceedings Fifth Annual International Symposium on Algorithms and Computation (ISAAC'94), Lecture Notes in Computer Scienc

    Optimal Substring-Equality Queries with Applications to Sparse Text Indexing

    Full text link
    We consider the problem of encoding a string of length nn from an integer alphabet of size σ\sigma so that access and substring equality queries (that is, determining the equality of any two substrings) can be answered efficiently. Any uniquely-decodable encoding supporting access must take nlogσ+Θ(log(nlogσ))n\log\sigma + \Theta(\log (n\log\sigma)) bits. We describe a new data structure matching this lower bound when σnO(1)\sigma\leq n^{O(1)} while supporting both queries in optimal O(1)O(1) time. Furthermore, we show that the string can be overwritten in-place with this structure. The redundancy of Θ(logn)\Theta(\log n) bits and the constant query time break exponentially a lower bound that is known to hold in the read-only model. Using our new string representation, we obtain the first in-place subquadratic (indeed, even sublinear in some cases) algorithms for several string-processing problems in the restore model: the input string is rewritable and must be restored before the computation terminates. In particular, we describe the first in-place subquadratic Monte Carlo solutions to the sparse suffix sorting, sparse LCP array construction, and suffix selection problems. With the sole exception of suffix selection, our algorithms are also the first running in sublinear time for small enough sets of input suffixes. Combining these solutions, we obtain the first sublinear-time Monte Carlo algorithm for building the sparse suffix tree in compact space. We also show how to derandomize our algorithms using small space. This leads to the first Las Vegas in-place algorithm computing the full LCP array in O(nlogn)O(n\log n) time and to the first Las Vegas in-place algorithms solving the sparse suffix sorting and sparse LCP array construction problems in O(n1.5logσ)O(n^{1.5}\sqrt{\log \sigma}) time. Running times of these Las Vegas algorithms hold in the worst case with high probability.Comment: Refactored according to TALG's reviews. New w.h.p. bounds and Las Vegas algorithm

    Existentially Restricted Quantified Constraint Satisfaction

    Full text link
    The quantified constraint satisfaction problem (QCSP) is a powerful framework for modelling computational problems. The general intractability of the QCSP has motivated the pursuit of restricted cases that avoid its maximal complexity. In this paper, we introduce and study a new model for investigating QCSP complexity in which the types of constraints given by the existentially quantified variables, is restricted. Our primary technical contribution is the development and application of a general technology for proving positive results on parameterizations of the model, of inclusion in the complexity class coNP

    A Tight Lower Bound for Counting Hamiltonian Cycles via Matrix Rank

    Get PDF
    For even kk, the matchings connectivity matrix Mk\mathbf{M}_k encodes which pairs of perfect matchings on kk vertices form a single cycle. Cygan et al. (STOC 2013) showed that the rank of Mk\mathbf{M}_k over Z2\mathbb{Z}_2 is Θ(2k)\Theta(\sqrt 2^k) and used this to give an O((2+2)pw)O^*((2+\sqrt{2})^{\mathsf{pw}}) time algorithm for counting Hamiltonian cycles modulo 22 on graphs of pathwidth pw\mathsf{pw}. The same authors complemented their algorithm by an essentially tight lower bound under the Strong Exponential Time Hypothesis (SETH). This bound crucially relied on a large permutation submatrix within Mk\mathbf{M}_k, which enabled a "pattern propagation" commonly used in previous related lower bounds, as initiated by Lokshtanov et al. (SODA 2011). We present a new technique for a similar pattern propagation when only a black-box lower bound on the asymptotic rank of Mk\mathbf{M}_k is given; no stronger structural insights such as the existence of large permutation submatrices in Mk\mathbf{M}_k are needed. Given appropriate rank bounds, our technique yields lower bounds for counting Hamiltonian cycles (also modulo fixed primes pp) parameterized by pathwidth. To apply this technique, we prove that the rank of Mk\mathbf{M}_k over the rationals is 4k/poly(k)4^k / \mathrm{poly}(k). We also show that the rank of Mk\mathbf{M}_k over Zp\mathbb{Z}_p is Ω(1.97k)\Omega(1.97^k) for any prime p2p\neq 2 and even Ω(2.15k)\Omega(2.15^k) for some primes. As a consequence, we obtain that Hamiltonian cycles cannot be counted in time O((6ϵ)pw)O^*((6-\epsilon)^{\mathsf{pw}}) for any ϵ>0\epsilon>0 unless SETH fails. This bound is tight due to a O(6pw)O^*(6^{\mathsf{pw}}) time algorithm by Bodlaender et al. (ICALP 2013). Under SETH, we also obtain that Hamiltonian cycles cannot be counted modulo primes p2p\neq 2 in time O(3.97pw)O^*(3.97^\mathsf{pw}), indicating that the modulus can affect the complexity in intricate ways.Comment: improved lower bounds modulo primes, improved figures, to appear in SODA 201

    Optimal Active Social Network De-anonymization Using Information Thresholds

    Full text link
    In this paper, de-anonymizing internet users by actively querying their group memberships in social networks is considered. In this problem, an anonymous victim visits the attacker's website, and the attacker uses the victim's browser history to query her social media activity for the purpose of de-anonymization using the minimum number of queries. A stochastic model of the problem is considered where the attacker has partial prior knowledge of the group membership graph and receives noisy responses to its real-time queries. The victim's identity is assumed to be chosen randomly based on a given distribution which models the users' risk of visiting the malicious website. A de-anonymization algorithm is proposed which operates based on information thresholds and its performance both in the finite and asymptotically large social network regimes is analyzed. Furthermore, a converse result is provided which proves the optimality of the proposed attack strategy

    Towards Provably Invisible Network Flow Fingerprints

    Full text link
    Network traffic analysis reveals important information even when messages are encrypted. We consider active traffic analysis via flow fingerprinting by invisibly embedding information into packet timings of flows. In particular, assume Alice wishes to embed fingerprints into flows of a set of network input links, whose packet timings are modeled by Poisson processes, without being detected by a watchful adversary Willie. Bob, who receives the set of fingerprinted flows after they pass through the network modeled as a collection of independent and parallel M/M/1M/M/1 queues, wishes to extract Alice's embedded fingerprints to infer the connection between input and output links of the network. We consider two scenarios: 1) Alice embeds fingerprints in all of the flows; 2) Alice embeds fingerprints in each flow independently with probability pp. Assuming that the flow rates are equal, we calculate the maximum number of flows in which Alice can invisibly embed fingerprints while having those fingerprints successfully decoded by Bob. Then, we extend the construction and analysis to the case where flow rates are distinct, and discuss the extension of the network model
    corecore