
    Randomized protocols for asynchronous consensus

    The famous Fischer, Lynch, and Paterson impossibility proof shows that it is impossible to solve the consensus problem in a natural model of an asynchronous distributed system if even a single process can fail. Since its publication, two decades of work on fault-tolerant asynchronous consensus algorithms have evaded this impossibility result by using extended models that provide (a) randomization, (b) additional timing assumptions, (c) failure detectors, or (d) stronger synchronization mechanisms than are available in the basic model. Concentrating on the first of these approaches, we illustrate the history and structure of randomized asynchronous consensus protocols by giving detailed descriptions of several such protocols.
    Comment: 29 pages; survey paper written for the PODC 20th anniversary issue of Distributed Computing.
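
    As a concrete illustration of approach (a), the sketch below shows one phase of a Ben-Or-style randomized binary consensus protocol for n processes tolerating f crash failures. The `broadcast`/`collect` message helpers and the exact thresholds are illustrative assumptions, not notation from the survey.

```python
import random

# Sketch of one phase of a Ben-Or-style randomized binary consensus protocol
# (illustrative only). `broadcast` and `collect` stand in for an asynchronous
# message layer: collect(tag, k) waits for k messages with the given tag and
# returns their payloads.

def ben_or_phase(my_value, n, f, broadcast, collect):
    # Round 1: report the current preference and wait for n - f reports.
    broadcast(("report", my_value))
    reports = collect("report", n - f)

    # Propose a value only if a strict majority of all n processes could
    # agree on it; otherwise propose "no preference" (None).
    proposal = next((v for v in (0, 1) if reports.count(v) > n // 2), None)

    # Round 2: exchange proposals and wait for n - f of them.
    broadcast(("propose", proposal))
    proposals = [p for p in collect("propose", n - f) if p is not None]

    if len(proposals) > f:
        return proposals[0], True            # enough support: decide this value
    elif proposals:
        return proposals[0], False           # some support: adopt and continue
    else:
        return random.randint(0, 1), False   # no support: flip a fair coin
```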

    Subsampling Algorithms for Semidefinite Programming

    We derive a stochastic gradient algorithm for semidefinite optimization using randomization techniques. The algorithm uses subsampling to reduce the computational cost of each iteration, and the subsampling ratio explicitly controls the granularity, i.e. the tradeoff between cost per iteration and total number of iterations. Furthermore, the total computational cost is directly proportional to the complexity (i.e. rank) of the solution. We study numerical performance on some large-scale problems arising in statistical learning.
    Comment: Final version, to appear in Stochastic Systems.
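
    The sketch below illustrates the general idea in hedged form: a projected stochastic gradient step for a least-squares semidefinite objective, where each iteration estimates the gradient from a random subsample of the terms and the subsampling ratio `rho` governs the cost-per-iteration versus iteration-count tradeoff. The objective, step size, and projection are illustrative choices, not the paper's exact algorithm.

```python
import numpy as np

# Illustrative sketch (not the paper's exact method): stochastic projected
# gradient for min_X sum_i (<A_i, X> - b_i)^2 over PSD matrices X, where each
# iteration estimates the gradient from a random subsample of the m terms.

def project_psd(X):
    """Project a symmetric matrix onto the positive semidefinite cone."""
    w, V = np.linalg.eigh((X + X.T) / 2)
    return (V * np.maximum(w, 0)) @ V.T

def subsampled_sdp_sgd(A, b, rho=0.1, step=1e-2, iters=500, seed=0):
    rng = np.random.default_rng(seed)
    m, n, _ = A.shape                       # A holds m symmetric n x n matrices
    X = np.zeros((n, n))
    k = max(1, int(rho * m))                # subsample size per iteration
    for _ in range(iters):
        idx = rng.choice(m, size=k, replace=False)
        residuals = np.einsum('kij,ij->k', A[idx], X) - b[idx]
        # Unbiased gradient estimate, scaled back up by m / k.
        grad = (m / k) * np.einsum('k,kij->ij', 2 * residuals, A[idx])
        X = project_psd(X - step * grad)
    return X
```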

    Efficient Algorithms for CUR and Interpolative Matrix Decompositions

    The manuscript describes efficient algorithms for the computation of the CUR and ID decompositions. The methods used are based on simple modifications to the classical truncated pivoted QR decomposition, which means that highly optimized library codes can be utilized for implementation. For certain applications, further acceleration can be attained by incorporating techniques based on randomized projections. Numerical experiments demonstrate advantageous performance compared to existing techniques for computing CUR factorizations.
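
    A simplified sketch of the idea, assuming a rank-k interpolative decomposition built from a column-pivoted QR, with an optional Gaussian sketch standing in for the randomized-projection acceleration; this is illustrative rather than the manuscript's optimized routines.

```python
import numpy as np
from scipy.linalg import qr, solve_triangular

# Simplified rank-k interpolative decomposition (ID) via column-pivoted QR.
# Returns the indices of k "skeleton" columns and a coefficient matrix P
# such that A is approximately A[:, cols] @ P.

def interpolative_decomposition(A, k, randomized=False, seed=0):
    B = A
    if randomized:
        rng = np.random.default_rng(seed)
        # Compress the rows with a small Gaussian sketch before the pivoted QR,
        # mimicking the randomized-projection acceleration.
        B = rng.standard_normal((k + 10, A.shape[0])) @ A

    # Column-pivoted QR of the (possibly sketched) matrix.
    Q, R, piv = qr(B, mode='economic', pivoting=True)

    # Express the remaining columns in terms of the k selected skeleton columns.
    T = solve_triangular(R[:k, :k], R[:k, k:], lower=False)
    P = np.zeros((k, A.shape[1]))
    P[:, piv[:k]] = np.eye(k)
    P[:, piv[k:]] = T
    return piv[:k], P
```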

    Fast Low Fidelity Microsimulation of Vehicle Traffic on Supercomputers

    A set of very simple rules for driving behavior used to simulate roadway traffic gives realistic results. Because of its simplicity, it is easy to implement the model on supercomputers (vectorizing and parallel), where we have achieved real-time limits of more than 4 million kilometers (or more than 53 million vehicle-sec/sec). The model can be used for applications where both high simulation speed and individual vehicle resolution are needed. We use the model for extended statistical analysis to gain insight into traffic phenomena near capacity, and we discuss why this model is a good candidate for network routing applications. (Submitted to Transportation Research Board Meeting, Jan. 1994, Washington D.C.)
    Comment: 11 pages, LaTeX, figures available upon request, Cologne-WP 93.14.
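
    In the spirit of the "very simple rules" the abstract refers to, the sketch below implements one update step of a single-lane cellular-automaton traffic model (accelerate, brake to the gap ahead, randomly dawdle, move). The cell granularity, maximum velocity, and dawdling probability are illustrative choices rather than values from the paper.

```python
import numpy as np

# One update step of a simple single-lane cellular-automaton traffic model on
# a ring road of `road_length` cells; positions and velocities are integer
# arrays, one entry per vehicle.

def step(position, velocity, road_length, v_max=5, p_dawdle=0.3, rng=None):
    rng = rng or np.random.default_rng()
    order = np.argsort(position)                       # process cars in road order
    pos, vel = position[order], velocity[order]
    gap = (np.roll(pos, -1) - pos - 1) % road_length   # empty cells to car ahead
    vel = np.minimum(vel + 1, v_max)                   # 1. accelerate
    vel = np.minimum(vel, gap)                         # 2. brake to the gap
    dawdle = rng.random(len(vel)) < p_dawdle
    vel = np.where(dawdle, np.maximum(vel - 1, 0), vel)  # 3. random slowdown
    pos = (pos + vel) % road_length                    # 4. move on the ring road
    new_pos, new_vel = np.empty_like(position), np.empty_like(velocity)
    new_pos[order], new_vel[order] = pos, vel          # restore original car order
    return new_pos, new_vel
```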

    Optimization of Discrete-parameter Multiprocessor Systems using a Novel Ergodic Interpolation Technique

    Modern multi-core systems have a large number of design parameters, most of which are discrete-valued, and this number is likely to keep increasing as chip complexity rises. Further, accurately evaluating a potential design choice is computationally expensive because it requires detailed cycle-accurate system simulation. If the discrete parameter space can be embedded into a larger continuous parameter space, then continuous-space techniques can, in principle, be applied to the system optimization problem; such techniques often scale well with the number of parameters. We propose a novel technique for embedding the discrete parameter space into an extended continuous space so that continuous-space techniques can be applied to the embedded problem, using cycle-accurate simulation to evaluate the objective function. This embedding is implemented using simulation-based ergodic interpolation, which, unlike spatial interpolation, produces the interpolated value within a single simulation run irrespective of the number of parameters. We have implemented this interpolation scheme in a cycle-based system simulator. In a characterization study, we observe that the interpolated performance curves are continuous, piecewise smooth, and have low statistical error. We use the ergodic interpolation-based approach to solve a large multi-core design optimization problem with 31 design parameters. Our results indicate that continuous-space optimization using ergodic interpolation-based embedding can be a viable approach for large multi-core design optimization problems.
    Comment: A short version of this paper will be published in the proceedings of the IEEE MASCOTS 2015 conference.
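
    As a hypothetical illustration only: one plausible reading of simulation-based ergodic interpolation is to time-share a discrete parameter between its two neighboring settings inside a single run, so that its long-run occupancy matches the fractional part of the continuous value. The paper's actual mechanism may differ, and all names below are assumptions made for illustration.

```python
import random

# Hypothetical sketch: realize a continuous parameter value x inside a single
# simulation run by re-drawing, each cycle, one of the two neighboring
# discrete settings with probability given by the fractional part of x, so
# the time-average (ergodic) behavior corresponds to the continuous setting.

def interpolated_setting(x):
    """Sample a discrete setting whose long-run average over many cycles is x."""
    lo, frac = int(x), x - int(x)
    return lo + 1 if random.random() < frac else lo

def run_simulation(x, cycles, simulate_one_cycle, state):
    total = 0.0
    for _ in range(cycles):
        param = interpolated_setting(x)      # re-draw the setting each cycle
        total += simulate_one_cycle(state, param)
    return total / cycles                    # time-average estimate of the metric
```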

    Fast Parallel Randomized Algorithm for Nonnegative Matrix Factorization with KL Divergence for Large Sparse Datasets

    Nonnegative Matrix Factorization (NMF) with Kullback-Leibler divergence (NMF-KL) is one of the most significant NMF problems and is equivalent to Probabilistic Latent Semantic Indexing (PLSI), which has been successfully applied in many applications. For sparse count data, a Poisson distribution and KL divergence provide sparse models and sparse representations, which describe the random variation better than a normal distribution and the Frobenius norm. In particular, sparse models provide a more concise understanding of the appearance of attributes over latent components, while sparse representations provide concise interpretability of the contribution of latent components over instances. However, minimizing NMF with KL divergence is much more difficult than minimizing NMF with the Frobenius norm, and sparse models, sparse representations, and fast algorithms for large sparse datasets remain challenges for NMF with KL divergence. In this paper, we propose a fast parallel randomized coordinate descent algorithm with fast convergence for large sparse datasets that achieves sparse models and sparse representations. In our experiments, the proposed algorithm outperforms the algorithms of prior studies on this problem.
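
    For context only (this is not the proposed parallel randomized coordinate descent algorithm), the sketch below shows the classical multiplicative updates for NMF under KL divergence, the objective that the abstract contrasts with Frobenius-norm NMF.

```python
import numpy as np

# Baseline sketch, not the paper's algorithm: classical multiplicative updates
# for NMF under KL divergence, minimizing D_KL(V || W @ H) with nonnegative
# factors W (n x r) and H (r x m).

def nmf_kl(V, r, iters=200, eps=1e-10, seed=0):
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, r)) + eps
    H = rng.random((r, m)) + eps
    for _ in range(iters):
        WH = W @ H + eps
        # Update H: rescale by how much each component under-/over-explains V.
        H *= (W.T @ (V / WH)) / (W.sum(axis=0, keepdims=True).T + eps)
        WH = W @ H + eps
        # Update W symmetrically.
        W *= ((V / WH) @ H.T) / (H.sum(axis=1, keepdims=True).T + eps)
    return W, H
```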

    Top-Down Induction of Decision Trees: Rigorous Guarantees and Inherent Limitations

    Consider the following heuristic for building a decision tree for a function $f : \{0,1\}^n \to \{\pm 1\}$. Place the most influential variable $x_i$ of $f$ at the root, and recurse on the subfunctions $f_{x_i=0}$ and $f_{x_i=1}$ on the left and right subtrees respectively; terminate once the tree is an $\varepsilon$-approximation of $f$. We analyze the quality of this heuristic, obtaining near-matching upper and lower bounds:
    $\circ$ Upper bound: For every $f$ with decision tree size $s$ and every $\varepsilon \in (0,\frac{1}{2})$, this heuristic builds a decision tree of size at most $s^{O(\log(s/\varepsilon)\log(1/\varepsilon))}$.
    $\circ$ Lower bound: For every $\varepsilon \in (0,\frac{1}{2})$ and $s \le 2^{\tilde{O}(\sqrt{n})}$, there is an $f$ with decision tree size $s$ such that this heuristic builds a decision tree of size $s^{\tilde{\Omega}(\log s)}$.
    We also obtain upper and lower bounds for monotone functions: $s^{O(\sqrt{\log s}/\varepsilon)}$ and $s^{\tilde{\Omega}(\sqrt[4]{\log s})}$ respectively. The lower bound disproves conjectures of Fiat and Pechyony (2004) and Lee (2009). Our upper bounds yield new algorithms for properly learning decision trees under the uniform distribution. We show that these algorithms, which are motivated by widely employed and empirically successful top-down decision tree learning heuristics such as ID3, C4.5, and CART, achieve provable guarantees that compare favorably with those of the current fastest algorithm (Ehrenfeucht and Haussler, 1989). Our lower bounds shed new light on the limitations of these heuristics. Finally, we revisit the classic work of Ehrenfeucht and Haussler, and extend it to give the first uniform-distribution proper learning algorithm that achieves polynomial sample and memory complexity, while matching its state-of-the-art quasipolynomial runtime.
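
    A minimal sketch of the analyzed heuristic, assuming sampling-based influence estimates and a simple near-constant stopping rule in place of the exact epsilon-approximation criterion; variable names and sample sizes are illustrative.

```python
import numpy as np

# Minimal sketch of the top-down heuristic: estimate each free variable's
# influence on the restricted subfunction from uniform samples, split on the
# most influential one, and recurse until the subfunction is nearly constant.
# f maps a length-n 0/1 numpy array to +1 or -1.

def build_tree(f, n, fixed=None, eps=0.05, samples=2000, rng=None):
    fixed = fixed or {}                      # variables already set on this path
    rng = rng or np.random.default_rng(0)

    X = rng.integers(0, 2, size=(samples, n))
    for i, b in fixed.items():               # restrict samples to the current subcube
        X[:, i] = b
    vals = np.array([f(x) for x in X])
    majority = 1 if vals.mean() >= 0 else -1
    if len(fixed) == n or np.mean(vals != majority) <= eps:
        return majority                      # leaf: subfunction is nearly constant

    def influence(i):                        # Pr[f(x) != f(x with bit i flipped)]
        Xf = X.copy()
        Xf[:, i] ^= 1
        return np.mean([f(a) != f(b) for a, b in zip(X, Xf)])

    free = [i for i in range(n) if i not in fixed]
    i = max(free, key=influence)             # most influential free variable at the root
    return (i,
            build_tree(f, n, {**fixed, i: 0}, eps, samples, rng),
            build_tree(f, n, {**fixed, i: 1}, eps, samples, rng))
```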