
    Random Access to Grammar Compressed Strings

    Grammar-based compression, where one replaces a long string by a small context-free grammar that generates the string, is a simple and powerful paradigm that captures many popular compression schemes. In this paper, we present a novel grammar representation that allows efficient random access to any character or substring without decompressing the string. Let $S$ be a string of length $N$ compressed into a context-free grammar $\mathcal{S}$ of size $n$. We present two representations of $\mathcal{S}$ achieving $O(\log N)$ random access time, and either $O(n \cdot \alpha_k(n))$ construction time and space on the pointer machine model, or $O(n)$ construction time and space on the RAM. Here, $\alpha_k(n)$ is the inverse of the $k^{th}$ row of Ackermann's function. Our representations also efficiently support decompression of any substring in $S$: we can decompress any substring of length $m$ in the same complexity as a single random access query and additional $O(m)$ time. Combining these results with fast algorithms for uncompressed approximate string matching leads to several efficient algorithms for approximate string matching on grammar-compressed strings without decompression. For instance, we can find all approximate occurrences of a pattern $P$ with at most $k$ errors in time $O(n(\min\{|P|k, k^4 + |P|\} + \log N) + occ)$, where $occ$ is the number of occurrences of $P$ in $S$. Finally, we generalize our results to navigation and other operations on grammar-compressed ordered trees. All of the above bounds significantly improve the currently best known results. To achieve these bounds, we introduce several new techniques and data structures of independent interest, including a predecessor data structure, two "biased" weighted ancestor data structures, and a compact representation of heavy paths in grammars. Comment: Preliminary version in SODA 201
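    To make the random access problem concrete, the following is a minimal sketch of the naive baseline, not the representation from the paper: it stores the expansion length of every nonterminal and walks down the grammar, so a query costs time proportional to the grammar's height rather than $O(\log N)$. The rule format (a terminal character or a pair of nonterminals), the function names, and the toy Fibonacci grammar are all assumptions made for illustration.

```python
def build_lengths(rules):
    """Compute the expansion length of every nonterminal (hypothetical helper)."""
    lengths = {}

    def length(x):
        if x not in lengths:
            rhs = rules[x]
            if isinstance(rhs, str):      # terminal rule: X -> 'c'
                lengths[x] = 1
            else:                         # binary rule: X -> Y Z
                lengths[x] = length(rhs[0]) + length(rhs[1])
        return lengths[x]

    for x in rules:
        length(x)
    return lengths


def access(rules, lengths, start, i):
    """Return character i (0-based) of the string generated by `start`.

    Naive descent: at each rule, go left or right by comparing i with the
    length of the left child. Cost is the height of the grammar.
    """
    if not 0 <= i < lengths[start]:
        raise IndexError(i)
    x = start
    while not isinstance(rules[x], str):
        left, right = rules[x]
        if i < lengths[left]:
            x = left
        else:
            i -= lengths[left]
            x = right
    return rules[x]


# Toy grammar (assumed for illustration): Fibonacci strings.
rules = {
    "F1": "b",
    "F2": "a",
    "F3": ("F2", "F1"),
    "F4": ("F3", "F2"),
    "F5": ("F4", "F3"),
    "F6": ("F5", "F4"),
}
lengths = build_lengths(rules)
text = "".join(access(rules, lengths, "F6", i) for i in range(lengths["F6"]))
```

    On an unbalanced grammar the height can be linear in $n$, which is the gap the $O(\log N)$ bound in the abstract closes.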

    Impact of loss on the wave dynamics in photonic waveguide lattices

    We analyze the impact of loss in lattices of coupled optical waveguides and find that in such a case, the hopping between adjacent waveguides is necessarily complex. This results not only in a transition of the light spreading from ballistic to diffusive, but also in a new kind of diffraction that is caused by loss dispersion. We prove our theoretical results with experimental observations. Comment: Accepted for publication in PRL, 5+8 pages (Paper + Supplemental material), 4 figure

    The effect of thermal annealing on the properties of Al-AlOx-Al single electron tunneling transistors

    The effect of thermal annealing on the properties of Al-AlOx-Al single electron tunneling transistors is reported. After treatment of the devices by annealing processes in forming gas atmosphere at different temperatures and for different times, distinct and reproducible changes of their resistance and capacitance values were found. The resistance changes depend on the temperature regime: the resistance tends to decrease after annealing at T = 200 °C, but to increase after annealing at T = 400 °C. We attribute this behavior to changes in the aluminum oxide barriers of the tunnel junctions. The good reproducibility of these effects allows a suitable annealing treatment to be used for post-process tuning of tunnel junction parameters. The influence of the annealing treatment on the low-frequency noise properties of the transistors was also investigated. In no case did the noise figures in the 1/f-regime show significant changes. Comment: 6 pages, 7 eps-figure

    An O(n^3)-Time Algorithm for Tree Edit Distance

    The edit distance between two ordered trees with vertex labels is the minimum cost of transforming one tree into the other by a sequence of elementary operations consisting of deleting and relabeling existing nodes, as well as inserting new nodes. In this paper, we present a worst-case $O(n^3)$-time algorithm for this problem, improving the previous best $O(n^3 \log n)$-time algorithm \cite{Klein}. Our result requires a novel adaptive strategy for deciding how a dynamic program divides into subproblems (which is interesting in its own right), together with a deeper understanding of the previous algorithms for the problem. We also prove the optimality of our algorithm among the family of decomposition strategy algorithms (which also includes the previous fastest algorithms) by tightening the known lower bound of $\Omega(n^2 \log^2 n)$ \cite{Touzet} to $\Omega(n^3)$, matching our algorithm's running time. Furthermore, we obtain matching upper and lower bounds of $\Theta(n m^2 (1 + \log \frac{n}{m}))$ when the two trees have different sizes $m$ and $n$, where $m < n$. Comment: 10 pages, 5 figures, 5 .tex files where TED.tex is the main on
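    As an illustration of the problem definition only, here is a small memoized sketch of the classical forest-distance recursion with unit costs. It always decomposes at the rightmost root, which is one fixed decomposition strategy; it is not the adaptive $O(n^3)$ algorithm described in the abstract. The tree encoding `(label, children)` and the function names are assumptions made for this example.

```python
from functools import lru_cache

# A tree is (label, children) with children a tuple of trees;
# a forest is a tuple of trees. Tuples are hashable, so lru_cache works.


@lru_cache(maxsize=None)
def forest_distance(f, g):
    """Unit-cost edit distance between ordered forests f and g."""
    if not f and not g:
        return 0
    if not f:
        # Only insertions remain: insert the rightmost root of g.
        _, w_children = g[-1]
        return forest_distance(f, g[:-1] + w_children) + 1
    if not g:
        # Only deletions remain: delete the rightmost root of f.
        _, v_children = f[-1]
        return forest_distance(f[:-1] + v_children, g) + 1

    (v_label, v_children), (w_label, w_children) = f[-1], g[-1]
    # Delete v: its children take its place in the forest.
    delete = forest_distance(f[:-1] + v_children, g) + 1
    # Insert w: symmetric to deletion.
    insert = forest_distance(f, g[:-1] + w_children) + 1
    # Match v with w (relabel if labels differ), then solve both parts.
    match = (
        forest_distance(v_children, w_children)
        + forest_distance(f[:-1], g[:-1])
        + (v_label != w_label)
    )
    return min(delete, insert, match)


def tree_edit_distance(t1, t2):
    return forest_distance((t1,), (t2,))


leaf = lambda label: (label, ())
t1 = ("a", (leaf("b"), leaf("c")))
t2 = ("a", (leaf("b"),))
t3 = ("a", (("b", (leaf("c"),)),))
t4 = ("a", (leaf("c"),))
```

    Which root to decompose at (leftmost or rightmost, and in which of the two trees) determines how many distinct subproblems arise; choosing this adaptively is the kind of strategy decision the abstract refers to.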

    Top Tree Compression of Tries

    We present a compressed representation of tries based on top tree compression [ICALP 2013] that works on a standard, comparison-based, pointer machine model of computation and supports efficient prefix search queries. Namely, we show how to preprocess a set of strings of total length n over an alphabet of size sigma into a compressed data structure of worst-case optimal size O(n/log_sigma n) that, given a pattern string P of length m, determines if P is a prefix of one of the strings in time O(min(m log sigma, m + log n)). We show that this query time is in fact optimal regardless of the size of the data structure. Existing solutions either use Omega(n) space or rely on word RAM techniques, such as tabulation, hashing, address arithmetic, or word-level parallelism, and hence do not work on a pointer machine. Our result is the first solution on a pointer machine that achieves worst-case o(n) space. Along the way, we develop several interesting data structures that work on a pointer machine and are of independent interest. These include an optimal data structure for random access to a grammar-compressed string and an optimal data structure for a variant of the level ancestor problem.
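    For reference, the prefix search query itself can be sketched on a plain, uncompressed trie. This is a baseline under stated assumptions, not the top tree compressed structure: it uses Omega(n) space, and Python dictionaries rely on hashing, so it is exactly the kind of solution the abstract contrasts itself with. The function names are hypothetical.

```python
def build_trie(strings):
    """Build an uncompressed trie as nested dictionaries (one dict per node)."""
    root = {}
    for s in strings:
        node = root
        for ch in s:
            # Create the child for ch if it does not exist yet.
            node = node.setdefault(ch, {})
    return root


def is_prefix(root, pattern):
    """Return True if pattern is a prefix of at least one stored string."""
    node = root
    for ch in pattern:
        if ch not in node:
            return False
        node = node[ch]
    return True


trie = build_trie(["banana", "band", "bat"])
```

    Replacing each dictionary with a balanced search tree over the children would give a comparison-based variant with O(m log sigma) query time, still in linear space.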

    The Nearest Colored Node in a Tree

    We start a systematic study of data structures for the nearest colored node problem on trees. Given a tree with colored nodes and weighted edges, we want to answer queries (v,c) asking for the nearest node to node v that has color c. This is a natural generalization of the well-known nearest marked ancestor problem. We give an O(n)-space O(log log n)-query solution and show that this is optimal. We also consider the dynamic case where updates can change a node's color and show that in O(n) space we can support both updates and queries in O(log n) time. We complement this by showing that O(polylog n) update time implies Omega(log n log log n) query time. Finally, we consider the case where updates can change the edges of the tree (link-cut operations). There is a known (top-tree based) solution that requires update time that is roughly linear in the number of colors. We show that this solution is probably optimal by showing that a strictly sublinear update time implies a strictly subcubic time algorithm for the classical all pairs shortest paths problem on a general graph. We also consider versions where the tree is rooted, and the query asks for the nearest ancestor/descendant of node v that has color c, and present efficient data structures for both variants in the static and the dynamic setting.
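    The query (v,c) can be pinned down with a brute-force sketch that searches outward from v in order of distance. This is only a specification-style baseline with no preprocessing, taking roughly O(n log n) time per query, and not the O(log log n)-query structure from the abstract. The adjacency-list format, the integer node ids, the function name, and the assumption of non-negative edge weights are all choices made for this example.

```python
import heapq


def nearest_colored(adj, color, v, c):
    """Return (node, distance) for the node of color c nearest to v, or None.

    adj[u] is a list of (neighbor, weight) pairs; weights are assumed
    non-negative. Nodes are explored in order of distance from v.
    """
    dist = {v: 0}
    heap = [(0, v)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        if color[u] == c:
            return u, d
        for w, weight in adj[u]:
            nd = d + weight
            if w not in dist or nd < dist[w]:
                dist[w] = nd
                heapq.heappush(heap, (nd, w))
    return None


# Toy tree (assumed for illustration): edges 0-1 (2), 1-2 (1), 1-3 (5), 0-4 (4).
adj = {
    0: [(1, 2), (4, 4)],
    1: [(0, 2), (2, 1), (3, 5)],
    2: [(1, 1)],
    3: [(1, 5)],
    4: [(0, 4)],
}
color = {0: "red", 1: "blue", 2: "green", 3: "red", 4: "green"}
```

    In this toy tree, the nearest green node to node 0 is node 2 at distance 3, even though node 4 is a direct neighbor, because the edge to node 4 has weight 4.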

    A stringent yeast two-hybrid matrix screening approach for protein-protein interaction discovery

    The yeast two-hybrid (Y2H) system is currently one of the most important techniques for protein-protein interaction (PPI) discovery. Here, we describe a stringent three-step Y2H matrix interaction approach that is suitable for systematic PPI screening on a proteome scale. We start with the identification and elimination of autoactivating strains that would lead to false-positive signals and prevent the identification of interactions. Nonautoactivating strains are used for the primary PPI screen, which is carried out in quadruplicate with arrayed preys. Interacting pairs of baits and preys are identified in a pairwise retest step. Only PPI pairs that pass the retest step are regarded as potentially biologically relevant interactions and are considered for further analysis.