
    Efficient Seeds Computation Revisited

    The notion of a cover is a generalization of a period of a string, and there are linear-time algorithms for finding the shortest cover. The seed is a more complicated generalization of periodicity: it is a cover of a superstring of a given string, and the shortest-seed problem is of much higher algorithmic difficulty. The problem is not well understood; no linear-time algorithm is known. In this paper we give linear-time algorithms for some of its versions: computing the shortest left-seed array, the longest left-seed array, and checking for seeds of a given length. The algorithm for the last problem is used to compute the seed array of a string (i.e., the shortest seeds for all prefixes of the string) in $O(n^2)$ time. We also describe a simpler alternative algorithm that efficiently computes the shortest seeds. As a by-product we obtain an $O(n \log(n/m))$ time algorithm that checks whether the shortest seed has length at least $m$ and finds the corresponding seed. We also correct some important details missing in the previously known shortest-seed algorithm (Iliopoulos et al., 1996).

    Comment: 14 pages, accepted to CPM 201
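
    To make the definition concrete, the following is a minimal quadratic-time sketch (not the paper's algorithm) of what it means for a string w to be a seed of x: occurrences of w, possibly hanging over the two ends of x, must cover a superstring of x. The function name and the simplified handling of overhangs are illustrative assumptions.

```python
def is_seed(w: str, x: str) -> bool:
    """Naive quadratic check: does w cover some superstring u + x + v of x?
    (The common extra requirement that w be a factor of x is omitted here;
    this is an illustrative sketch, not the paper's linear-time machinery.)"""
    m, n = len(w), len(x)
    if m == 0:
        return n == 0
    if m >= n:
        # a single occurrence of w covers all of x inside the superstring w itself
        return x in w
    # left overhang: longest prefix of x that equals a suffix of w
    covered = max(k for k in range(m + 1) if w.endswith(x[:k]))
    # greedily extend the covered prefix with full occurrences of w inside x
    for i in (i for i in range(n - m + 1) if x.startswith(w, i)):
        if i > covered:
            break                          # an uncoverable gap remains
        covered = max(covered, i + m)
    if covered >= n:
        return True
    # right overhang: an occurrence starting inside the covered part
    # may stick out past the right end of x
    return any(w.startswith(x[p:]) for p in range(covered + 1))
```

    For example, is_seed("aba", "abaababa") returns True (the occurrences at positions 0, 3, and 5 already cover the whole string), while is_seed("ab", "abcab") returns False because the letter c can never be covered.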

    Interactive Channel Capacity Revisited

    We provide the first capacity-approaching coding schemes that robustly simulate any interactive protocol over an adversarial channel that corrupts any $\epsilon$ fraction of the transmitted symbols. Our coding schemes achieve a communication rate of $1 - O(\sqrt{\epsilon \log \log 1/\epsilon})$ over any adversarial channel. This can be improved to $1 - O(\sqrt{\epsilon})$ for random, oblivious, and computationally bounded channels, or if the parties have shared randomness unknown to the channel. Surprisingly, these rates exceed the $1 - \Omega(\sqrt{H(\epsilon)}) = 1 - \Omega(\sqrt{\epsilon \log 1/\epsilon})$ interactive channel capacity bound which [Kol and Raz; STOC'13] recently proved for random errors. We conjecture $1 - \Theta(\sqrt{\epsilon \log \log 1/\epsilon})$ and $1 - \Theta(\sqrt{\epsilon})$ to be the optimal rates for their respective settings and therefore to capture the interactive channel capacity for random and adversarial errors. In addition to being very communication efficient, our randomized coding schemes have multiple other advantages. They are computationally efficient, extremely natural, and significantly simpler than prior (non-capacity-approaching) schemes. In particular, our protocols do not employ any coding but allow the original protocol to be performed as-is, interspersed only by short exchanges of hash values. When hash values do not match, the parties backtrack. Our approach is, as we feel, by far the simplest and most natural explanation for why and how robust interactive communication in a noisy environment is possible.
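
    The hash-and-backtrack idea can be pictured with a small toy simulation. This is only an illustrative sketch under simplifying assumptions (independent bit flips rather than an adversary, noiseless hash exchanges, an arbitrary block size and hash); it is not the paper's coding scheme.

```python
import hashlib
import random

def digest(transcript) -> str:
    # short hash of the transcript so far (stand-in for the paper's hash exchanges)
    return hashlib.sha256("|".join(transcript).encode()).hexdigest()[:8]

def noisy(bit: str, eps: float) -> str:
    # the channel, modeled here as independent flips with probability eps
    return ("1" if bit == "0" else "0") if random.random() < eps else bit

def simulate(protocol_bits, eps=0.05, block=8):
    """Toy rendition of hash-and-backtrack: run the original protocol as-is in
    blocks, compare short hashes after each block, and redo any block on which
    the two transcripts disagree.  Hash exchanges are assumed noiseless here."""
    alice, bob, i = [], [], 0
    while i < len(protocol_bits):
        sent = protocol_bits[i:i + block]
        received = [noisy(b, eps) for b in sent]          # block crosses the channel
        if digest(alice + sent) == digest(bob + received):
            alice += sent                                  # hashes match: commit
            bob += received
            i += block
        # otherwise both sides discard the block and retry (backtrack)
    return bob

# e.g. simulate(["0", "1", "1", "0"] * 10) reproduces the 40 input bits w.h.p.
```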

    Low-shot learning with large-scale diffusion

    This paper considers the problem of inferring image labels from images when only a few annotated examples are available at training time. This setup is often referred to as low-shot learning, where a standard approach is to re-train the last few layers of a convolutional neural network learned on separate classes for which training examples are abundant. We consider a semi-supervised setting based on a large collection of images to support label propagation. This is possible by leveraging recent advances in large-scale similarity graph construction. We show that despite its conceptual simplicity, scaling label propagation up to hundreds of millions of images leads to state-of-the-art accuracy in the low-shot learning regime.
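
    For concreteness, here is a minimal sketch of graph-based label propagation of the kind the abstract scales up, written in the standard normalized-diffusion form (Zhou et al.); the similarity matrix W, the damping factor alpha, and the iteration count are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np
from scipy.sparse import csr_matrix

def label_propagation(W: csr_matrix, Y0: np.ndarray, labeled: np.ndarray,
                      alpha: float = 0.9, iters: int = 30) -> np.ndarray:
    """Normalized diffusion on a kNN similarity graph.
    W: symmetric sparse similarity matrix; Y0: one-hot labels for the few
    annotated images (zero rows elsewhere); labeled: boolean mask of seeds."""
    d = np.asarray(W.sum(axis=1)).ravel()
    d[d == 0] = 1.0                                   # guard isolated nodes
    dinv = 1.0 / np.sqrt(d)
    S = W.multiply(dinv[:, None]).multiply(dinv[None, :]).tocsr()  # D^-1/2 W D^-1/2
    F = Y0.astype(float)
    for _ in range(iters):
        F = alpha * (S @ F) + (1 - alpha) * Y0        # diffuse, then re-inject seeds
        F[labeled] = Y0[labeled]                      # clamp the known labels
    return F.argmax(axis=1)                           # predicted class per image
```

    The point of the paper is that this simple iteration, run over a similarity graph built from deep image descriptors at the scale of hundreds of millions of nodes, is competitive with re-training approaches.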

    Recursive Online Enumeration of All Minimal Unsatisfiable Subsets

    In various areas of computer science, we deal with a set of constraints to be satisfied. If the constraints cannot be satisfied simultaneously, it is desirable to identify the core problems among them. Such cores are called minimal unsatisfiable subsets (MUSes). The more MUSes are identified, the more information about the conflicts among the constraints is obtained. However, a full enumeration of all MUSes is in general intractable due to the large (even exponential) number of possible conflicts. Moreover, to identify MUSes, algorithms must test sets of constraints for their simultaneous satisfiability. The type of the test depends on the application domain. The complexity of tests can be extremely high, especially for domains like temporal logics, model checking, or SMT. In this paper, we propose a recursive algorithm that identifies MUSes in an online manner (i.e., one by one) and can be terminated at any time. The key feature of our algorithm is that it minimizes the number of satisfiability tests and thus speeds up the computation. The algorithm is applicable to an arbitrary constraint domain, and its effectiveness shows itself especially in domains with expensive satisfiability checks. We benchmark our algorithm against state-of-the-art algorithms on Boolean and SMT constraint domains and demonstrate that our algorithm requires fewer satisfiability tests and consequently finds more MUSes within given time limits.
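
    To make the role of the satisfiability oracle concrete, here is a minimal deletion-based extraction of a single MUS against an abstract is_sat test. It is not the paper's recursive online algorithm (which enumerates many MUSes and economizes on oracle calls); it is only the baseline idea of paying one satisfiability test per candidate removal.

```python
from typing import Callable, Iterable, List

def shrink_to_mus(constraints: List, is_sat: Callable[[Iterable], bool]) -> List:
    """Deletion-based extraction of a single MUS from an unsatisfiable set:
    try to drop each constraint and keep the removal whenever the remainder
    is still UNSAT.  Uses one is_sat call per constraint."""
    assert not is_sat(constraints), "the input set must be unsatisfiable"
    mus = list(constraints)
    i = 0
    while i < len(mus):
        without_i = mus[:i] + mus[i + 1:]
        if not is_sat(without_i):   # still UNSAT without mus[i]: it is not needed
            mus = without_i
        else:                       # dropping mus[i] makes it SAT: keep it
            i += 1
    return mus
```

    When each is_sat call is an expensive model-checking or SMT query, the cost of such calls dominates everything else, which is why the paper focuses on minimizing their number.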

    Reuse It Or Lose It: More Efficient Secure Computation Through Reuse of Encrypted Values

    Two-party secure function evaluation (SFE) has become significantly more feasible, even on resource-constrained devices, because of advances in server-aided computation systems. However, there are still bottlenecks, particularly in the input validation stage of a computation. Moreover, SFE research has not yet devoted sufficient attention to the important problem of retaining state after a computation has been performed so that expensive processing does not have to be repeated if a similar computation is done again. This paper presents PartialGC, an SFE system that allows the reuse of encrypted values generated during a garbled-circuit computation. We show that using PartialGC can reduce computation time by as much as 96% and bandwidth by as much as 98% in comparison with previous outsourcing schemes for secure computation. We demonstrate the feasibility of our approach with two sets of experiments, one in which the garbled circuit is evaluated on a mobile device and one in which it is evaluated on a server. We also use PartialGC to build a privacy-preserving "friend finder" application for Android. The reuse of previous inputs to allow stateful evaluation represents a new way of looking at SFE and further reduces computational barriers.

    Comment: 20 pages, shorter conference version published in Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, Pages 582-596, ACM New York, NY, US

    Pseudorandom number generators revisited

    Statistical Methods; mathematical statistics

    Sketch-based Influence Maximization and Computation: Scaling up with Guarantees

    Propagation of contagion through networks is a fundamental process. It is used to model the spread of information, influence, or a viral infection. Diffusion patterns can be specified by a probabilistic model, such as Independent Cascade (IC), or captured by a set of representative traces. Basic computational problems in the study of diffusion are influence queries (determining the potency of a specified seed set of nodes) and Influence Maximization (identifying the most influential seed set of a given size). Answering each influence query involves many edge traversals and does not scale when there are many queries on very large graphs. The gold standard for Influence Maximization is the greedy algorithm, which iteratively adds to the seed set a node maximizing the marginal gain in influence. Greedy has a guaranteed approximation ratio of at least (1-1/e) and actually produces a sequence of nodes, with each prefix having an approximation guarantee with respect to the same-size optimum. Since Greedy does not scale well beyond a few million edges, for larger inputs one must currently use either heuristics or alternative algorithms designed for a pre-specified small seed set size. We develop a novel sketch-based design for influence computation. Our greedy Sketch-based Influence Maximization (SKIM) algorithm scales to graphs with billions of edges, with one to two orders of magnitude speedup over the best greedy methods. It still has a guaranteed approximation ratio, and in practice its quality nearly matches that of exact greedy. We also present influence oracles, which use linear-time preprocessing to generate a small sketch for each node, allowing the influence of any seed set to be answered quickly from the sketches of its nodes.

    Comment: 10 pages, 5 figures. Appeared at the 23rd Conference on Information and Knowledge Management (CIKM 2014) in Shanghai, China
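
    The greedy baseline that the abstract measures SKIM against can be sketched in a few lines; the Monte-Carlo influence estimate below is exactly the expensive part that SKIM replaces with per-node reachability sketches. The graph encoding, edge probabilities, and simulation count are illustrative assumptions.

```python
import random
from typing import Dict, List, Set, Tuple

Graph = Dict[int, List[Tuple[int, float]]]   # node -> [(neighbor, edge probability)]

def simulate_ic(graph: Graph, seeds: Set[int]) -> int:
    """One Independent Cascade run: each newly active node u gets a single
    chance to activate every out-neighbor v with probability p(u, v)."""
    active, frontier = set(seeds), list(seeds)
    while frontier:
        nxt = []
        for u in frontier:
            for v, p in graph.get(u, []):
                if v not in active and random.random() < p:
                    active.add(v)
                    nxt.append(v)
        frontier = nxt
    return len(active)

def greedy_im(graph: Graph, k: int, sims: int = 200) -> List[int]:
    """Plain Monte-Carlo greedy, the (1-1/e) baseline the abstract refers to.
    SKIM replaces the costly influence estimates below with sketches."""
    seeds: Set[int] = set()
    for _ in range(k):
        best, best_score = None, -1.0
        for v in graph:
            if v in seeds:
                continue
            score = sum(simulate_ic(graph, seeds | {v}) for _ in range(sims)) / sims
            if score > best_score:
                best, best_score = v, score
        seeds.add(best)
    return sorted(seeds)
```

    Even on small graphs this is slow, since every candidate node triggers hundreds of cascade simulations per greedy round; roughly speaking, SKIM answers the same marginal-gain queries from precomputed sketches instead.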

    Optimizing spread dynamics on graphs by message passing

    Cascade processes are responsible for many important phenomena in the natural and social sciences. Simple models of irreversible dynamics on graphs, in which nodes activate depending on the state of their neighbors, have been successfully applied to describe cascades in a large variety of contexts. Over the last decades, many efforts have been devoted to understanding the typical behaviour of cascades arising from initial conditions drawn at random from some given ensemble. However, the problem of optimizing the trajectory of the system, i.e. of identifying appropriate initial conditions that maximize (or minimize) the final number of active nodes, is still considered practically intractable, with the only exception of models that satisfy a diminishing-returns property called submodularity. Submodular models can be approximately solved by means of greedy strategies, but by definition they lack the cooperative characteristics that are fundamental in many real systems. Here we introduce an efficient algorithm based on statistical physics for the optimization of trajectories in cascade processes on graphs. We show that for a wide class of irreversible dynamics, even in the absence of submodularity, the spread optimization problem can be solved efficiently on large networks. Analytic and algorithmic results on random graphs are complemented by the solution of the spread maximization problem on a real-world network (the Epinions consumer reviews network).

    Comment: Replacement for "The Spread Optimization Problem"
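
    As a concrete instance of the irreversible dynamics the abstract refers to, the sketch below simulates a deterministic threshold cascade and searches for the best seed set by brute force. The exhaustive search only makes the objective explicit; the paper's contribution is a message-passing algorithm that avoids exactly this exponential enumeration. Node encoding and thresholds are illustrative assumptions.

```python
import itertools
from typing import Dict, List, Set

def threshold_cascade(neigh: Dict[int, List[int]], theta: Dict[int, int],
                      seeds: Set[int]) -> Set[int]:
    """Irreversible threshold dynamics: a node activates, and stays active,
    once at least theta[v] of its neighbors are active."""
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for v in neigh:
            if v not in active and sum(u in active for u in neigh[v]) >= theta[v]:
                active.add(v)
                changed = True
    return active

def best_seed_set(neigh: Dict[int, List[int]], theta: Dict[int, int], k: int) -> Set[int]:
    """Brute-force spread maximization over all k-subsets.  Exponential in k;
    this exhaustive search is what the message-passing approach avoids."""
    return max((set(c) for c in itertools.combinations(neigh, k)),
               key=lambda s: len(threshold_cascade(neigh, theta, s)))
```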