
4 research outputs found

Parallel Computation of the Minimal Elements of a Poset

Authors: Leiserson Charles E.; Li Liyun; Maza Marc Moreno; Xie Yuzhen
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2010

Computing the minimal elements of a partially ordered finite set (poset) is a fundamental problem in combinatorics with numerous applications such as polynomial expression optimization, transversal hypergraph generation and redundant component removal, to name a few. We propose a divide-and-conquer algorithm which is not only cache-oblivious but can also be parallelized free of determinacy races. We have implemented it in Cilk++ targeting multicores. For our test problems of sufficiently large input size, our code demonstrates a linear speedup on 32 cores. National Science Foundation (U.S.) (Grant number CNS-0615215). National Science Foundation (U.S.) (Grant number CCF-0621511).
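
To make the problem statement in this abstract concrete, here is a minimal sequential sketch in Python of a divide-and-conquer computation of minimal elements. It is not the authors' Cilk++ algorithm and makes no attempt at cache-obliviousness or parallelism. The function name `minimal_elements` and the `leq` callback are hypothetical, and the elements are assumed to be hashable.

```python
def minimal_elements(items, leq):
    """Return the minimal elements of `items` under the partial order `leq`.

    `leq(a, b)` is assumed to be reflexive, antisymmetric and transitive.
    Illustrative sketch only; not the algorithm from the paper.
    """
    # Drop duplicates (keeping first occurrences) so that leq(y, x) with
    # y != x means y is strictly below x.
    items = list(dict.fromkeys(items))
    if len(items) <= 1:
        return items

    # Divide: solve each half independently.
    mid = len(items) // 2
    left = minimal_elements(items[:mid], leq)
    right = minimal_elements(items[mid:], leq)

    # Conquer: an element survives only if nothing minimal in the other
    # half lies below it. Checking against the other half's minimal
    # elements is enough because the order is transitive.
    left_kept = [x for x in left if not any(leq(y, x) for y in right)]
    right_kept = [y for y in right if not any(leq(x, y) for x in left)]
    return left_kept + right_kept


def divides(a, b):
    """Divisibility order on positive integers: a <= b iff a divides b."""
    return b % a == 0


print(sorted(minimal_elements([12, 4, 6, 3, 8, 2, 9], divides)))  # [2, 3]
```

In this sketch the two recursive calls touch disjoint slices of the input, so they do not depend on each other's results.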

DSpace@MIT

Crossref

On the Factor Refinement Principle and its Implementation on Multicore Architectures

Author: Ali Md. Mohsin
Publication venue: Scholarship@Western
Publication date: 01/01/2011

The factor refinement principle turns a partial factorization of integers (or polynomials) into a more complete factorization represented by basis elements and exponents, with basis elements that are pairwise coprime. This refinement technique has many applications, such as simplifying systems of polynomial inequations and, more generally, speeding up certain algebraic algorithms by eliminating redundant expressions that may occur during intermediate computations. Successive GCD computations and divisions are used to accomplish this task until all the basis elements are pairwise coprime. Moreover, square-free factorization (which is the first step of many factorization algorithms) is used to remove the repeated patterns from each input element. Differentiation, division and GCD calculation operations are required to complete this pre-processing step. Both factor refinement and square-free factorization often rely on plain (quadratic) algorithms for multiplication but can be substantially improved with asymptotically fast multiplication on sufficiently large input. In this work, we review the working principles and complexity estimates of factor refinement, in the case of plain arithmetic as well as asymptotically fast arithmetic. Following this review process, we design, analyze and implement parallel adaptations of these factor refinement algorithms. We consider several algorithm optimization techniques, such as data locality analysis and balancing subproblems, to fully exploit modern multicore architectures. The Cilk++ implementation of our parallel algorithm based on the augment refinement principle of Bach, Driscoll and Shallit achieves linear speedup for input data of sufficiently large size.
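
The "successive GCD computations and divisions" described in this abstract can be illustrated with a naive sequential Python sketch over integers. It is not the thesis's parallel algorithm and not necessarily the procedure of Bach, Driscoll and Shallit. The function name `factor_refinement` is hypothetical, and the inputs are assumed to be integers greater than 1.

```python
from math import gcd


def factor_refinement(factors):
    """Refine integers > 1 into pairwise-coprime (base, exponent) pairs.

    The product of base**exponent over the result equals the product of
    the inputs. Bases are coprime but not necessarily prime.
    Illustrative sketch only.
    """
    pairs = [(f, 1) for f in factors if f > 1]
    changed = True
    while changed:
        changed = False
        for i in range(len(pairs)):
            for j in range(i + 1, len(pairs)):
                (a, e), (b, f) = pairs[i], pairs[j]
                g = gcd(a, b)
                if g > 1:
                    # Identity used: a^e * b^f = (a/g)^e * g^(e+f) * (b/g)^f
                    new = [(a // g, e), (g, e + f), (b // g, f)]
                    rest = [p for k, p in enumerate(pairs) if k not in (i, j)]
                    # Bases equal to 1 contribute nothing and are dropped.
                    pairs = rest + [p for p in new if p[0] > 1]
                    changed = True
                    break
            if changed:
                break
    return sorted(pairs)


print(factor_refinement([12, 18]))  # [(2, 3), (3, 3)], since 12 * 18 = 2^3 * 3^3
print(factor_refinement([6, 35]))   # [(6, 1), (35, 1)], already coprime
```

Each refinement step strictly decreases the product of the bases, so the loop terminates. The second call shows that the output is a coprime basis, not a prime factorization.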

Scholarship@Western

Efficient Evaluation of Large Polynomials

Author: Li Liyun
Publication venue: Scholarship@Western
Publication date: 01/01/2010

In scientific computing, it is often required to evaluate a polynomial expression (or a matrix depending on some variables) at many points which are not known in advance or with coordinates containing “symbolic expressions”. In these circumstances, standard evaluation schemes, such as those based on Fast Fourier Transforms, do not apply. Given a polynomial f expressed as the sum of its terms, we propose an algorithm which generates a representation of f optimizing the process of evaluating f at some points. In addition, this evaluation of f can be done efficiently in terms of data locality and parallelism. We have implemented our algorithm in the Cilk++ concurrency platform and our implementation achieves nearly linear speedup on 16 cores with large enough input. For some large polynomials, the generated schedule can be evaluated at least 10 times faster than the schedules produced by other available software solutions. Moreover, our code can handle much larger input polynomials.

Scholarship@Western