    Suffix arrays: what are they good for?

    Recently the theoretical community has displayed a flurry of interest in suffix arrays and compressed suffix arrays. New, asymptotically optimal algorithms for the construction, search, and compression of suffix arrays have been proposed. In this talk we present our investigations into the practicalities of these latest developments. In particular, we investigate whether suffix arrays can indeed replace inverted files, as suggested in the recent literature on suffix arrays.
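    A minimal sketch (not the authors' implementation) of the data structure under discussion: a suffix array is the lexicographically sorted list of a text's suffix start positions, and pattern search reduces to binary search over it. The naive O(n^2 log n) construction below is for illustration only; the paper concerns the asymptotically optimal constructions.

```python
# Suffix array construction and pattern search -- an illustrative sketch.
from bisect import bisect_left, bisect_right

def suffix_array(text: str) -> list[int]:
    # Sort suffix start positions by the suffix they begin; naive O(n^2 log n).
    return sorted(range(len(text)), key=lambda i: text[i:])

def find_occurrences(text: str, sa: list[int], pattern: str) -> list[int]:
    # Binary search for the contiguous block of suffixes starting with `pattern`.
    # (Materializing the prefixes is wasteful; real code compares in place.)
    prefixes = [text[i:i + len(pattern)] for i in sa]
    lo = bisect_left(prefixes, pattern)
    hi = bisect_right(prefixes, pattern)
    return sorted(sa[lo:hi])

text = "banana"
sa = suffix_array(text)                    # [5, 3, 1, 0, 4, 2]
print(find_occurrences(text, sa, "ana"))   # [1, 3]
```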

    Random Graph Generator for Bipartite Networks Modeling

    The purpose of this article is to introduce a new iterative algorithm for generating graphs with properties resembling real-life bipartite graphs. The algorithm enables us to generate a wide range of random bigraphs whose features are determined by a set of parameters. We adapt the advances of the last decade in unipartite complex network modeling to the bigraph setting. This data structure can be observed in several situations; however, only a few datasets are freely available for testing the algorithms (e.g. community detection, influential node identification, information retrieval) that operate on such data. Therefore, artificial datasets are needed to support the development and testing of such algorithms. We are particularly interested in applying the generator to the analysis of recommender systems, and we therefore focus on two characteristics that, beyond simple statistics, are in our opinion responsible for the performance of neighborhood-based collaborative filtering algorithms: the node degree distribution and the local clustering coefficient.
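    A hedged sketch of one way such a generator can work; the attachment rule and the parameter names (`m`, `alpha`) are illustrative assumptions, not the authors' algorithm. Each step adds a new left-side node ("user") that connects to m right-side nodes ("items"), choosing uniformly with probability alpha and by preferential attachment otherwise, which shapes the item-side degree distribution.

```python
# Illustrative bipartite random-graph generator (not the paper's algorithm).
import random
from collections import Counter

def random_bigraph(n_users: int, n_items: int, m: int, alpha: float, seed: int = 0):
    rng = random.Random(seed)
    targets = []    # one entry per existing edge; drawing from this list
                    # is equivalent to degree-proportional item selection
    edges = set()
    for u in range(n_users):
        chosen = set()
        while len(chosen) < min(m, n_items):
            if rng.random() < alpha or not targets:
                chosen.add(rng.randrange(n_items))   # uniform attachment
            else:
                chosen.add(rng.choice(targets))      # preferential attachment
        for v in chosen:
            edges.add((u, v))
            targets.append(v)
    return edges

edges = random_bigraph(n_users=500, n_items=100, m=4, alpha=0.3)
item_degrees = Counter(v for _, v in edges)   # item-side degree distribution
print(item_degrees.most_common(5))
```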

    Improving and Extending the Testing of Distributions for Shape-Restricted Properties

    Distribution testing deals with what information can be deduced about an unknown distribution over {1,...,n} when the algorithm is only allowed to obtain a relatively small number of independent samples from the distribution. In the extended conditional sampling model, the algorithm is also allowed to obtain samples from the restriction of the original distribution to subsets of {1,...,n}. In 2015, Canonne, Diakonikolas, Gouleakis and Rubinfeld unified several previous results and showed that for any property of distributions satisfying a "decomposability" criterion, there exists an algorithm (in the basic model) that can distinguish with high probability distributions satisfying the property from distributions that are far from it in variation distance. We present here a more efficient yet simpler algorithm for the basic model, as well as very efficient algorithms for the conditional model, which until now was not investigated under the umbrella of decomposable properties. Additionally, we provide an algorithm for the conditional model that handles a much larger class of properties. Our core mechanism is a way of efficiently producing an interval partition of {1,...,n} that satisfies a "fine-grain" quality. We show that with such a partition at hand we can directly move forward with testing individual intervals, instead of first searching for the "correct" partition of {1,...,n}.
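    A hedged sketch of what a "fine-grain" interval partition could look like in code; the greedy threshold rule below is an illustrative stand-in, not the paper's construction, which comes with explicit sample-complexity guarantees. The idea: sweep {1,...,n} left to right and close an interval once its empirical probability mass reaches a cap, so no interval (except possibly a heavy singleton) carries too much mass.

```python
# Greedy interval partition by empirical mass -- illustrative only.
import random

def fine_partition(samples: list[int], n: int, cap: float) -> list[tuple[int, int]]:
    counts = [0] * (n + 1)
    for s in samples:
        counts[s] += 1
    total = len(samples)
    intervals, start, mass = [], 1, 0.0
    for i in range(1, n + 1):
        mass += counts[i] / total
        if mass >= cap:                   # close the interval; a heavy
            intervals.append((start, i))  # singleton may exceed the cap alone
            start, mass = i + 1, 0.0
    if start <= n:
        intervals.append((start, n))      # remainder interval
    return intervals

# Toy usage on a distribution skewed toward small values of {1,...,100}.
rng = random.Random(1)
data = [min(100, int(rng.expovariate(0.1)) + 1) for _ in range(5000)]
print(fine_partition(data, 100, cap=0.1))
```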

    Cryptography from Information Loss

    Reductions between problems, the mainstay of theoretical computer science, efficiently map an instance of one problem to an instance of another in such a way that solving the latter allows solving the former. The subject of this work is "lossy" reductions, where the reduction loses some information about the input instance. We show that such reductions, when they exist, have interesting and powerful consequences for lifting hardness into "useful" hardness, namely cryptography. Our first, conceptual, contribution is a definition of lossy reductions in the language of mutual information. Roughly speaking, our definition says that a reduction C is t-lossy if, for any distribution X over its inputs, the mutual information I(X; C(X)) ≤ t. Our treatment generalizes a variety of seemingly related but distinct notions such as worst-case to average-case reductions, randomized encodings (Ishai and Kushilevitz, FOCS 2000), homomorphic computations (Gentry, STOC 2009), and instance compression (Harnik and Naor, FOCS 2006). We then proceed to show several consequences of lossy reductions:

    1. We say that a language L has an f-reduction to a language L' for a Boolean function f if there is a (randomized) polynomial-time algorithm C that takes an m-tuple of strings X = (x_1, ..., x_m), with each x_i ∈ {0,1}^n, and outputs a string z such that, with high probability, L'(z) = f(L(x_1), L(x_2), ..., L(x_m)). Suppose a language L has an f-reduction C to L' that is t-lossy. Our first result is that one-way functions exist if L is worst-case hard and one of the following conditions holds: f is the OR function, t ≤ m/100, and L' is the same as L; f is the Majority function and t ≤ m/100; or f is the OR function, t ≤ O(m log n), and the reduction has no error. This improves on the implications that follow from combining (Drucker, FOCS 2012) with (Ostrovsky and Wigderson, ISTCS 1993), which result only in auxiliary-input one-way functions.

    2. Our second result is about the stronger notion of t-compressing f-reductions, i.e. reductions that only output t bits. We show that if there is an average-case hard language L that has a t-compressing Majority reduction to some language for t = m/100, then there exist collision-resistant hash functions. This improves on the result of (Harnik and Naor, FOCS 2006), whose starting point is a cryptographic primitive (namely, one-way functions) rather than average-case hardness, and whose assumption is a compressing OR-reduction of SAT (which is now known to be false unless the polynomial hierarchy collapses).

    Along the way, we define a non-standard one-sided notion of average-case hardness, which is the notion of hardness used in the second result above and which may be of independent interest.
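    A toy, hedged illustration of the t-lossiness definition above (not from the paper): for a deterministic C, I(X; C(X)) = H(C(X)), so for uniform X over {0,1}^n the "reduction" C(x) = OR(x) retains only a tiny fraction of a bit, while the identity map is lossless at n bits.

```python
# Computing I(X; C(X)) for uniform X and deterministic C, where it
# equals the output entropy H(C(X)).  Illustrative toy only.
from collections import Counter
from itertools import product
from math import log2

def mutual_information(inputs, C):
    out = Counter(C(x) for x in inputs)   # output distribution of C(X)
    total = len(inputs)
    return -sum((c / total) * log2(c / total) for c in out.values())

n = 10
inputs = list(product([0, 1], repeat=n))  # uniform X over {0,1}^n
print(mutual_information(inputs, lambda x: int(any(x))))  # OR: ~0.011 bits
print(mutual_information(inputs, lambda x: x))            # identity: 10.0 bits
```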

    Black Box White Arrow

    The present paper proposes a new and systematic approach to the so-called black box group methods in computational group theory. Instead of a single black box, we consider categories of black boxes and their morphisms. This makes new classes of black box problems accessible. For example, we can enrich black box groups by actions of outer automorphisms. As an example of an application of this technique, we construct Frobenius maps on black box groups of untwisted Lie type in odd characteristic (Section 6) and inverse-transpose automorphisms on black box groups encrypting (P)SL_n(F_q). One of the advantages of our approach is that it allows us to work in black box groups over finite fields of big characteristic. Another advantage is the explanatory power of our methods; as an example, we explain Kantor's and Kassabov's construction of an involution in black box groups encrypting SL_2(2^n). Due to the nature of our work we also have to discuss a few methodological issues of black box group theory. The paper is a further development of our text "Fifty shades of black" [arXiv:1308.2487], and repeats parts of it, but under weaker axioms for black box groups.
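    A minimal sketch of the black box abstraction itself (illustrative, not the paper's formalism): group elements are opaque strings, and an algorithm may only multiply, invert, and compare them, never inspect the encoding. Here the box secretly encrypts the multiplicative group Z_p^* just to make the demo self-contained; the class name and the encoding are assumptions for illustration.

```python
# An opaque "black box group" interface over a hidden copy of Z_p^*.
import random

class BlackBoxGroup:
    def __init__(self, p: int = 101, seed: int = 0):
        self._p = p
        rng = random.Random(seed)
        codes = list(range(1, p))
        rng.shuffle(codes)                 # hidden, randomized encoding
        self._enc = {g: f"{c:08x}" for g, c in zip(range(1, p), codes)}
        self._dec = {s: g for g, s in self._enc.items()}
        self.identity = self._enc[1]

    # The only operations exposed to a black box algorithm:
    def mul(self, x: str, y: str) -> str:
        return self._enc[self._dec[x] * self._dec[y] % self._p]

    def inv(self, x: str) -> str:
        return self._enc[pow(self._dec[x], -1, self._p)]

def order(box: BlackBoxGroup, x: str) -> int:
    # The kind of computation a black box algorithm may perform:
    # repeated multiplication until the identity string reappears.
    y, k = x, 1
    while y != box.identity:
        y, k = box.mul(y, x), k + 1
    return k

box = BlackBoxGroup()
x = box._enc[2]            # peeking inside only to pick a test element
print(order(box, x))       # the order of 2 in Z_101^* (divides 100)
```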

    Testing Shape Restrictions of Discrete Distributions

    We study the question of testing structured properties (classes) of discrete distributions. Specifically, given sample access to an arbitrary distribution D over [n] and a property P, the goal is to distinguish between D ∈ P and ℓ_1(D, P) > ε. We develop a general algorithm for this question, which applies to a large range of "shape-constrained" properties, including monotone, log-concave, t-modal, piecewise-polynomial, and Poisson Binomial distributions. Moreover, for all cases considered, our algorithm has near-optimal sample complexity with regard to the domain size and is computationally efficient. For most of these classes, we provide the first non-trivial tester in the literature. In addition, we also describe a generic method to prove lower bounds for this problem, and use it to show that our upper bounds are nearly tight. Finally, we extend some of our techniques to tolerant testing, deriving nearly-tight upper and lower bounds for the corresponding questions.
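    A hedged sketch of the flavor of such a test for one class, monotone (non-increasing) distributions; the projection-based heuristic below is illustrative and does not match the paper's sample-optimal algorithm. It estimates D from samples, projects the estimate onto non-increasing sequences with the pool-adjacent-violators algorithm, and accepts iff the ℓ_1 distance to the projection is small.

```python
# Heuristic monotonicity test via projection -- illustrative only.
import random
from collections import Counter

def pav_nonincreasing(q: list[float]) -> list[float]:
    # Pool adjacent violators: the closest non-increasing sequence in l2.
    blocks = []                               # pairs of [block mean, size]
    for v in q:
        blocks.append([v, 1])
        while len(blocks) > 1 and blocks[-2][0] < blocks[-1][0]:
            b2, b1 = blocks.pop(), blocks.pop()
            size = b1[1] + b2[1]
            blocks.append([(b1[0] * b1[1] + b2[0] * b2[1]) / size, size])
    return [mean for mean, size in blocks for _ in range(size)]

def looks_monotone(samples: list[int], n: int, eps: float) -> bool:
    total = len(samples)
    freq = Counter(samples)
    q = [freq.get(i, 0) / total for i in range(1, n + 1)]   # empirical D
    proj = pav_nonincreasing(q)
    return sum(abs(a - b) for a, b in zip(q, proj)) <= eps

# Toy usage: a geometric-like (non-increasing) distribution should pass.
rng = random.Random(0)
data = [min(50, int(rng.expovariate(0.2)) + 1) for _ in range(4000)]
print(looks_monotone(data, 50, eps=0.1))   # expected: True
```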

    On the Complexity of Decomposable Randomized Encodings, Or: How Friendly Can a Garbling-Friendly PRF Be?
