
    Recursive Sketching For Frequency Moments

    In a ground-breaking paper, Indyk and Woodruff (STOC 05) showed how to compute $F_k$ (for $k>2$) in space $O(\mathrm{polylog}(n,m)\cdot n^{1-\frac{2}{k}})$, which is optimal up to (large) poly-logarithmic factors in $n$ and $m$, where $m$ is the length of the stream and $n$ is an upper bound on the number of distinct elements in the stream. The best known lower bound for large moments is $\Omega(\log(n)\cdot n^{1-\frac{2}{k}})$. A follow-up work of Bhuvanagiri, Ganguly, Kesh and Saha (SODA 2006) reduced the poly-logarithmic factors of Indyk and Woodruff to $O(\log^2(m)\cdot(\log n+\log m)\cdot n^{1-\frac{2}{k}})$. Further reduction of poly-log factors has been an elusive goal since 2006, when the Indyk and Woodruff method seemed to hit a natural "barrier." Using our simple recursive sketch, we provide a different yet simple approach that yields an $O(\log(m)\log(nm)\cdot(\log\log n)^4\cdot n^{1-\frac{2}{k}})$ algorithm for constant $\epsilon$. Our bound is, in fact, somewhat stronger: the $(\log\log n)$ term can be replaced by any constant number of iterated $\log$s instead of just two or three, thus approaching $\log^* n$. Our bound also works for non-constant $\epsilon$ (for details see the body of the paper). Further, our algorithm requires only 4-wise independence, in contrast to existing methods that use pseudo-random generators for computing large frequency moments.
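For reference, the $k$-th frequency moment of a stream with item frequencies $f_1,\dots,f_n$ is $F_k=\sum_i f_i^k$. The following minimal Python sketch computes $F_k$ exactly (it is not the paper's sublinear-space recursive sketch, just the quantity being estimated):

```python
from collections import Counter

def frequency_moment(stream, k):
    """Exact k-th frequency moment F_k = sum over items i of f_i^k.

    Streaming sketches such as Indyk-Woodruff approximate this value
    in sublinear space; this exact version only illustrates the
    definition being estimated.
    """
    freq = Counter(stream)
    return sum(f ** k for f in freq.values())

# Example: a stream over items a, b, c with frequencies 3, 2, 1.
stream = ["a", "a", "a", "b", "b", "c"]
f2 = frequency_moment(stream, 2)   # 3^2 + 2^2 + 1^2 = 14
f3 = frequency_moment(stream, 3)   # 3^3 + 2^3 + 1^3 = 36
```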

    Configurations with few crossings in topological graphs

    In this paper we study the problem of computing subgraphs of a certain configuration in a given topological graph G such that the number of crossings in the subgraph is minimum. The configurations that we consider are spanning trees, s-t paths, cycles, matchings, and $\kappa$-factors for $\kappa\in\{1,2\}$. We show that it is NP-hard to approximate the minimum number of crossings for these configurations within a factor of $k^{1-\varepsilon}$ for any $\varepsilon>0$, where $k$ is the number of crossings in G. We then give a simple fixed-parameter algorithm that tests in $O^\star(2^k)$ time whether G has a crossing-free configuration for any of the above, where the $O^\star$-notation neglects polynomial terms. For some configurations we have faster algorithms. The respective running times are $O^\star(1.9999992^k)$ for spanning trees and $O^\star((\sqrt{3})^k)$ for s-t paths and cycles. For spanning trees we also have an $O^\star(1.968^k)$-time Monte-Carlo algorithm. Each $O^\star(\beta^k)$-time decision algorithm can be turned into an $O^\star((\beta+1)^k)$-time optimization algorithm that computes a configuration with the minimum number of crossings.
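The generic $O^\star(2^k)$ bound follows a standard branching pattern: a crossing-free subgraph must exclude at least one edge of every crossing pair, so one can branch on each of the $k$ crossings. A hedged Python sketch of this idea for the spanning-tree configuration (all function and parameter names are illustrative, not from the paper):

```python
def crossing_free_spanning_tree(n, edges, crossings):
    """Branching sketch: a crossing-free subgraph must drop at least
    one edge of every crossing pair, giving O*(2^k) branches where
    k is the number of crossings.

    n: number of vertices, labelled 0..n-1
    edges: set of edges (u, v)
    crossings: list of pairs of edges that cross each other
    """
    active = [(e, f) for e, f in crossings if e in edges and f in edges]
    if not active:
        # No crossings remain: a crossing-free spanning tree exists
        # iff the surviving edges still connect the graph (union-find).
        parent = list(range(n))
        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x
        for u, v in edges:
            parent[find(u)] = find(v)
        return len({find(v) for v in range(n)}) == 1
    e, f = active[0]
    return (crossing_free_spanning_tree(n, edges - {e}, crossings)
            or crossing_free_spanning_tree(n, edges - {f}, crossings))

# A 4-cycle plus both diagonals: the diagonals cross, but the cycle
# edges alone already form a crossing-free spanning tree.
edges = {(0, 1), (1, 2), (2, 3), (3, 0), (0, 2), (1, 3)}
ok = crossing_free_spanning_tree(4, edges, [((0, 2), (1, 3))])  # True
```

The faster bases ($1.9999992^k$, $(\sqrt{3})^k$, $1.968^k$) come from more careful branching and randomization than this naive two-way split.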

    Deterministic and {Las Vegas} Algorithms for Sparse Nonnegative Convolution

    Computing the convolution $A\star B$ of two length-$n$ integer vectors $A,B$ is a core problem in several disciplines. It frequently comes up in algorithms for Knapsack, $k$-SUM, All-Pairs Shortest Paths, and string pattern matching problems. For these applications it typically suffices to compute convolutions of nonnegative vectors. This problem can be classically solved in time $O(n\log n)$ using the Fast Fourier Transform. However, often the involved vectors are sparse, and hence one could hope for output-sensitive algorithms to compute nonnegative convolutions. This question was raised by Muthukrishnan and solved by Cole and Hariharan (STOC '02) by a randomized algorithm running in near-linear time in the (unknown) output size $t$. Chan and Lewenstein (STOC '15) presented a deterministic algorithm with a $2^{O(\sqrt{\log t\cdot\log\log n})}$ overhead in running time and the additional assumption that a small superset of the output is given; this assumption was later removed by Bringmann and Nakos (ICALP '21). In this paper we present the first deterministic near-linear-time algorithm for computing sparse nonnegative convolutions. This immediately gives improved deterministic algorithms for output-sensitive Subset Sum, block-mass pattern matching, $N$-fold Boolean convolution, and others, matching up to log-factors the fastest known randomized algorithms for these problems. Our algorithm is a blend of algebraic and combinatorial ideas and techniques. Additionally, we provide two fast Las Vegas algorithms for computing sparse nonnegative convolutions. In particular, we present a simple $O(t\log^2 t)$ time algorithm, which is an accessible alternative to Cole and Hariharan's algorithm. We further refine this new algorithm to run in Las Vegas time $O(t\log t\cdot\log\log t)$, matching the running time of the dense case apart from the $\log\log t$ factor.
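To fix notation, $(A\star B)[k]=\sum_{i+j=k}A[i]\cdot B[j]$. A minimal brute-force baseline for sparse inputs, representing vectors as index-to-value dicts (this runs in $O(|A|\cdot|B|)$ time, far from the paper's near-linear-in-$t$ algorithms, and is shown only to pin down the problem):

```python
def sparse_convolution(A, B):
    """Convolution (A*B)[k] = sum over i+j=k of A[i]*B[j], for sparse
    nonnegative vectors given as {index: value} dicts.

    Brute-force O(|A|*|B|) baseline; the algorithms discussed in the
    abstract reach near-linear time in the output size t.
    """
    C = {}
    for i, a in A.items():
        for j, b in B.items():
            C[i + j] = C.get(i + j, 0) + a * b
    return C

# (1 + x^5) * (2 + 3x^5) = 2 + 5x^5 + 3x^10
C = sparse_convolution({0: 1, 5: 1}, {0: 2, 5: 3})
```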

    Exploiting subspace distance equalities in high-dimensional data for kNN queries

    Efficient k-nearest neighbor computation for high-dimensional data is an important, yet challenging task. The response times of state-of-the-art indexing approaches depend highly on factors like the distribution of the data. For clustered data, such approaches are several times faster than a sequential scan. However, if various dimensions contain uniform or Gaussian data, they tend to be clearly outperformed by a simple sequential scan. Hence, an approach is required that generally delivers good response times, independent of the data distribution. As a solution, we propose to exploit a novel concept to efficiently compute nearest neighbors. We name it sub-space distance equality, and it aims at reducing the number of distance computations independent of the data distribution. We integrate kNN computing algorithms into the Elf index structure, allowing us to study the sub-space distance equality concept in isolation and in combination with a main-memory-optimized storage layout. In a large comparative study with twelve data sets, our results indicate that indexes based on sub-space distance equalities compute the fewest distances. For clustered data, our Elf kNN algorithm delivers a performance increase of at least a factor of two and up to two orders of magnitude, without losing the performance gain over sequential scans for uniform or Gaussian data.
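As a point of comparison for "reducing the number of distance computations", here is a generic kNN baseline that abandons a squared-distance computation as soon as it exceeds the current k-th best candidate. This is standard early abandoning, not the paper's Elf/sub-space-distance-equality method; all names are illustrative:

```python
import heapq

def knn_early_abandon(query, data, k):
    """k-nearest-neighbour search over a list of points that stops a
    squared-distance computation early once it exceeds the current
    k-th best distance (a generic pruning baseline).
    Returns the sorted indices of the k nearest points.
    """
    heap = []  # max-heap via negation: (-dist2, index) of k best so far
    for idx, point in enumerate(data):
        bound = -heap[0][0] if len(heap) == k else float("inf")
        dist2 = 0.0
        for q, p in zip(query, point):
            dist2 += (q - p) ** 2
            if dist2 > bound:       # cannot beat the k-th best: abandon
                break
        else:
            if len(heap) < k:
                heapq.heappush(heap, (-dist2, idx))
            else:
                heapq.heappushpop(heap, (-dist2, idx))
    return sorted(idx for _, idx in heap)

neighbours = knn_early_abandon([0.0, 0.0],
                               [[0.1, 0.1], [5.0, 5.0], [0.2, 0.0]], 2)
```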

    Deterministic voting in distributed systems using error-correcting codes

    Distributed voting is an important problem in reliable computing. In an N Modular Redundant (NMR) system, the N computational modules execute identical tasks and need to periodically vote on their current states. In this paper, we propose a deterministic majority voting algorithm for NMR systems. Our voting algorithm uses error-correcting codes to drastically reduce the average-case communication complexity. In particular, we show that the efficiency of our voting algorithm can be improved by choosing the parameters of the error-correcting code to match the probability of the computational faults. For example, consider an NMR system with 31 modules, each with a state of m bits, where each module has an independent computational error probability of $10^{-3}$. In this NMR system, our algorithm can reduce the average-case communication complexity to approximately 1.0825m, compared with the communication complexity of 31m of the naive algorithm in which every module broadcasts its local result to all other modules. We have also implemented the voting algorithm over a network of workstations. The experimental performance results match the theoretical predictions well.
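To make the baseline concrete: in the naive scheme every one of the N modules broadcasts its full m-bit state, so Nm bits are sent in total, and the result is the state held by a strict majority. A minimal sketch of that naive vote (not the paper's error-correcting-code protocol; the function names are illustrative):

```python
from collections import Counter

def majority_state(states):
    """Deterministic majority vote over the modules' local states.
    Returns the state held by a strict majority of modules, or None
    if no strict majority exists."""
    (state, count), = Counter(states).most_common(1)
    return state if count > len(states) // 2 else None

def naive_broadcast_cost(n_modules, m_bits):
    """Total bits sent in the naive scheme where every module
    broadcasts its full m-bit state to all others: N*m."""
    return n_modules * m_bits

# 5 modules, one of which computed a faulty state.
winner = majority_state(["s1", "s1", "s1", "s1", "bad"])  # "s1"
```

The abstract's point is that, when faults are rare (error probability $10^{-3}$), coded exchanges let modules verify agreement with far fewer than the naive 31m bits on average.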

    Conjugacy problem for braid groups and Garside groups

    We present a new algorithm to solve the conjugacy problem in Artin braid groups, which is faster than the one presented by Birman, Ko and Lee. This algorithm can be applied not only to braid groups, but to all Garside groups (which include finite-type Artin groups and torus knot groups, among others). Comment: New version, with substantial modifications. 21 pages, 2 figures.

    Algorithms for Longest Common Abelian Factors

    In this paper we consider the problem of computing the longest common abelian factor (LCAF) between two given strings. We present a simple $O(\sigma n^2)$ time algorithm, where $n$ is the length of the strings and $\sigma$ is the alphabet size, and a sub-quadratic running time solution for the binary string case, both having linear space requirements. Furthermore, we present a modified algorithm applying some interesting tricks and experimentally show that the resulting algorithm runs faster. Comment: 13 pages, 4 figures.
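An abelian factor is a substring considered up to letter reordering, i.e. by its Parikh vector (letter-frequency vector). A simple sliding-window sketch in the spirit of the $O(\sigma n^2)$ algorithm: for each candidate length, collect the Parikh vectors of all windows of the first string in a set, then slide over the second string looking for a match (this is an illustrative reconstruction, not the paper's exact pseudocode):

```python
def lcaf(x, y, alphabet):
    """Longest common abelian factor: the greatest L such that some
    length-L substring of x and some length-L substring of y have
    equal Parikh vectors.  For each length, maintain a sliding-window
    count vector, so the total work is O(sigma * n^2)."""
    pos = {c: i for i, c in enumerate(alphabet)}
    for length in range(min(len(x), len(y)), 0, -1):
        seen = set()
        for s, store in ((x, True), (y, False)):
            counts = [0] * len(alphabet)
            for i, c in enumerate(s):
                counts[pos[c]] += 1
                if i >= length:                 # drop the char leaving
                    counts[pos[s[i - length]]] -= 1
                if i >= length - 1:             # window is full
                    key = tuple(counts)
                    if store:
                        seen.add(key)
                    elif key in seen:
                        return length
    return 0

# "ab" and "ab"/"ba" match abelianly: LCAF("aabb", "abca") = 2.
L = lcaf("aabb", "abca", "abc")
```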