2,815 research outputs found

    Simple, compact and robust approximate string dictionary

    Full text link
    This paper is concerned with practical implementations of approximate string dictionaries that allow edit errors. In this problem, we have as input a dictionary DD of dd strings of total length nn over an alphabet of size σ\sigma. Given a bound kk and a pattern xx of length mm, a query has to return all the strings of the dictionary which are at edit distance at most kk from xx, where the edit distance between two strings xx and yy is defined as the minimum-cost sequence of edit operations that transform xx into yy. The cost of a sequence of operations is defined as the sum of the costs of the operations involved in the sequence. In this paper, we assume that each of these operations has unit cost and consider only three operations: deletion of one character, insertion of one character and substitution of a character by another. We present a practical implementation of the data structure we recently proposed and which works only for one error. We extend the scheme to 2k<m2\leq k<m. Our implementation has many desirable properties: it has a very fast and space-efficient building algorithm. The dictionary data structure is compact and has fast and robust query time. Finally our data structure is simple to implement as it only uses basic techniques from the literature, mainly hashing (linear probing and hash signatures) and succinct data structures (bitvectors supporting rank queries).Comment: Accepted to a journal (19 pages, 2 figures

    Accelerating the CM method

    Full text link
    Given a prime q and a negative discriminant D, the CM method constructs an elliptic curve E/\Fq by obtaining a root of the Hilbert class polynomial H_D(X) modulo q. We consider an approach based on a decomposition of the ring class field defined by H_D, which we adapt to a CRT setting. This yields two algorithms, each of which obtains a root of H_D mod q without necessarily computing any of its coefficients. Heuristically, our approach uses asymptotically less time and space than the standard CM method for almost all D. Under the GRH, and reasonable assumptions about the size of log q relative to |D|, we achieve a space complexity of O((m+n)log q) bits, where mn=h(D), which may be as small as O(|D|^(1/4)log q). The practical efficiency of the algorithms is demonstrated using |D| > 10^16 and q ~ 2^256, and also |D| > 10^15 and q ~ 2^33220. These examples are both an order of magnitude larger than the best previous results obtained with the CM method.Comment: 36 pages, minor edits, to appear in the LMS Journal of Computation and Mathematic

    Reconstructing Rational Functions with FireFly\texttt{FireFly}

    Full text link
    We present the open-source C++\texttt{C++} library FireFly\texttt{FireFly} for the reconstruction of multivariate rational functions over finite fields. We discuss the involved algorithms and their implementation. As an application, we use FireFly\texttt{FireFly} in the context of integration-by-parts reductions and compare runtime and memory consumption to a fully algebraic approach with the program Kira\texttt{Kira}.Comment: 46 pages, 3 figures, 6 tables; v2: matches published versio

    Generalised Mersenne Numbers Revisited

    Get PDF
    Generalised Mersenne Numbers (GMNs) were defined by Solinas in 1999 and feature in the NIST (FIPS 186-2) and SECG standards for use in elliptic curve cryptography. Their form is such that modular reduction is extremely efficient, thus making them an attractive choice for modular multiplication implementation. However, the issue of residue multiplication efficiency seems to have been overlooked. Asymptotically, using a cyclic rather than a linear convolution, residue multiplication modulo a Mersenne number is twice as fast as integer multiplication; this property does not hold for prime GMNs, unless they are of Mersenne's form. In this work we exploit an alternative generalisation of Mersenne numbers for which an analogue of the above property --- and hence the same efficiency ratio --- holds, even at bitlengths for which schoolbook multiplication is optimal, while also maintaining very efficient reduction. Moreover, our proposed primes are abundant at any bitlength, whereas GMNs are extremely rare. Our multiplication and reduction algorithms can also be easily parallelised, making our arithmetic particularly suitable for hardware implementation. Furthermore, the field representation we propose also naturally protects against side-channel attacks, including timing attacks, simple power analysis and differential power analysis, which is essential in many cryptographic scenarios, in constrast to GMNs.Comment: 32 pages. Accepted to Mathematics of Computatio

    The streaming kk-mismatch problem

    Get PDF
    We consider the streaming complexity of a fundamental task in approximate pattern matching: the kk-mismatch problem. It asks to compute Hamming distances between a pattern of length nn and all length-nn substrings of a text for which the Hamming distance does not exceed a given threshold kk. In our problem formulation, we report not only the Hamming distance but also, on demand, the full \emph{mismatch information}, that is the list of mismatched pairs of symbols and their indices. The twin challenges of streaming pattern matching derive from the need both to achieve small working space and also to guarantee that every arriving input symbol is processed quickly. We present a streaming algorithm for the kk-mismatch problem which uses O(klognlognk)O(k\log{n}\log\frac{n}{k}) bits of space and spends \ourcomplexity time on each symbol of the input stream, which consists of the pattern followed by the text. The running time almost matches the classic offline solution and the space usage is within a logarithmic factor of optimal. Our new algorithm therefore effectively resolves and also extends an open problem first posed in FOCS'09. En route to this solution, we also give a deterministic O(k(lognk+logΣ))O( k (\log \frac{n}{k} + \log |\Sigma|) )-bit encoding of all the alignments with Hamming distance at most kk of a length-nn pattern within a text of length O(n)O(n). This secondary result provides an optimal solution to a natural communication complexity problem which may be of independent interest.Comment: 27 page
    corecore