69,584 research outputs found

    Improved Approximate String Matching and Regular Expression Matching on Ziv-Lempel Compressed Texts

    Full text link
    We study the approximate string matching and regular expression matching problem for the case when the text to be searched is compressed with the Ziv-Lempel adaptive dictionary compression schemes. We present a time-space trade-off that leads to algorithms improving the previously known complexities for both problems. In particular, we significantly improve the space bounds, which in practical applications are likely to be a bottleneck

    Random Access to Grammar Compressed Strings

    Full text link
    Grammar based compression, where one replaces a long string by a small context-free grammar that generates the string, is a simple and powerful paradigm that captures many popular compression schemes. In this paper, we present a novel grammar representation that allows efficient random access to any character or substring without decompressing the string. Let SS be a string of length NN compressed into a context-free grammar S\mathcal{S} of size nn. We present two representations of S\mathcal{S} achieving O(logN)O(\log N) random access time, and either O(nαk(n))O(n\cdot \alpha_k(n)) construction time and space on the pointer machine model, or O(n)O(n) construction time and space on the RAM. Here, αk(n)\alpha_k(n) is the inverse of the kthk^{th} row of Ackermann's function. Our representations also efficiently support decompression of any substring in SS: we can decompress any substring of length mm in the same complexity as a single random access query and additional O(m)O(m) time. Combining these results with fast algorithms for uncompressed approximate string matching leads to several efficient algorithms for approximate string matching on grammar-compressed strings without decompression. For instance, we can find all approximate occurrences of a pattern PP with at most kk errors in time O(n(min{Pk,k4+P}+logN)+occ)O(n(\min\{|P|k, k^4 + |P|\} + \log N) + occ), where occocc is the number of occurrences of PP in SS. Finally, we generalize our results to navigation and other operations on grammar-compressed ordered trees. All of the above bounds significantly improve the currently best known results. To achieve these bounds, we introduce several new techniques and data structures of independent interest, including a predecessor data structure, two "biased" weighted ancestor data structures, and a compact representation of heavy paths in grammars.Comment: Preliminary version in SODA 201

    A practical index for approximate dictionary matching with few mismatches

    Get PDF
    Approximate dictionary matching is a classic string matching problem (checking if a query string occurs in a collection of strings) with applications in, e.g., spellchecking, online catalogs, geolocation, and web searchers. We present a surprisingly simple solution called a split index, which is based on the Dirichlet principle, for matching a keyword with few mismatches, and experimentally show that it offers competitive space-time tradeoffs. Our implementation in the C++ language is focused mostly on data compaction, which is beneficial for the search speed (e.g., by being cache friendly). We compare our solution with other algorithms and we show that it performs better for the Hamming distance. Query times in the order of 1 microsecond were reported for one mismatch for the dictionary size of a few megabytes on a medium-end PC. We also demonstrate that a basic compression technique consisting in qq-gram substitution can significantly reduce the index size (up to 50% of the input text size for the DNA), while still keeping the query time relatively low

    The Underestimation Of Egocentric Distance: Evidence From Frontal Matching Tasks

    Get PDF
    There is controversy over the existence, nature, and cause of error in egocentric distance judgments. One proposal is that the systematic biases often found in explicit judgments of egocentric distance along the ground may be related to recently observed biases in the perceived declination of gaze (Durgin & Li, Attention, Perception, & Psychophysics, in press), To measure perceived egocentric distance nonverbally, observers in a field were asked to position themselves so that their distance from one of two experimenters was equal to the frontal distance between the experimenters. Observers placed themselves too far away, consistent with egocentric distance underestimation. A similar experiment was conducted with vertical frontal extents. Both experiments were replicated in panoramic virtual reality. Perceived egocentric distance was quantitatively consistent with angular bias in perceived gaze declination (1.5 gain). Finally, an exocentric distance-matching task was contrasted with a variant of the egocentric matching task. The egocentric matching data approximate a constant compression of perceived egocentric distance with a power function exponent of nearly 1; exocentric matches had an exponent of about 0.67. The divergent pattern between egocentric and exocentric matches suggests that they depend on different visual cues

    Novel approximate absolute difference hardware

    Get PDF
    Approximate hardware designs have higher performance, smaller area or lower power consumption than exact hardware designs at the expense of lower accuracy. Absolute difference (AD) operation is heavily used in many applications such as motion estimation (ME) for video compression, ME for frame rate conversion, stereo matching for depth estimation. Since most of the applications using AD operation are error tolerant by their nature, approximate hardware designs can be used in these applications. In this paper, novel approximate AD hardware designs are proposed. The proposed approximate AD hardware implementations have higher performance, smaller area and lower power consumption than exact AD hardware implementations at the expense of lower accuracy. They also have less error, smaller area and lower power consumption than the approximate AD hardware implementations which use approximate adders proposed in the literature

    A trade-off design of microstrip broadband power amplifier for UHF applications

    Get PDF
    In this paper, the design of a Broadband Power Amplifier for UHF applications is presented. The proposed BPA is based on ATF13876 Agilent active device. The biasing and matching networks both are implemented by using microstrip transmission lines. The input and output matching circuits are designed by combining two broadband matching techniques: a binomial multi-section quarter wave impedance transformer and an approximate transformation of previously designed lumped elements. The proposed BPA shows excellent performances in terms of impedance matching, power gain and unconditionally stability over the operating bandwidth ranging from 1.2 GHz to 3.3 GHz. At 2.2 GHz, the large signal simulation shows a saturated output power of 18.875 dBm with an output 1-dB compression point of 6.5 dBm of input level and a maximum PAE of 36.26%
    corecore