BSF-skeleton: A Template for Parallelization of Iterative Numerical Algorithms on Cluster Computing Systems
This article describes a method for creating applications for cluster
computing systems using the parallel BSF skeleton based on the original BSF
(Bulk Synchronous Farm) model of parallel computations developed by the author
earlier. This model uses the master/slave paradigm. The main advantage of the
BSF model is that it makes it possible to estimate the scalability of a parallel
algorithm before its implementation. Another important feature of the BSF model is the
representation of problem data in the form of lists, which greatly simplifies the
logic of building applications. The BSF skeleton is designed for creating
parallel programs in C++ using the MPI library. The scope of the BSF skeleton
is iterative numerical algorithms of high computational complexity. The BSF
skeleton has the following distinctive features:
- The BSF skeleton completely encapsulates all aspects that are associated with parallelizing a program.
- The BSF skeleton allows error-free compilation at all stages of application development.
- The BSF skeleton supports the OpenMP programming model and workflows.
Comment: Submitted to Methods
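As a rough illustration of the master/worker iteration over a list that the abstract describes, the following Python sketch simulates one such loop sequentially. It is not the BSF skeleton's API (the skeleton itself targets C++ with MPI): the names `split`, `bsf_iterate`, `map_f`, `reduce_f` and `step` are hypothetical, and the "workers" are a plain loop rather than MPI processes.

```python
from functools import reduce


def split(items, k):
    """Split items into k contiguous sublists of near-equal length."""
    q, r = divmod(len(items), k)
    parts, start = [], 0
    for i in range(k):
        end = start + q + (1 if i < r else 0)
        parts.append(items[start:end])
        start = end
    return parts


def bsf_iterate(items, x0, map_f, reduce_f, step, num_workers=4, max_iter=100):
    """Iterate: workers map+reduce their sublist, the master combines and updates.

    Assumes items is non-empty. step(x, total) returns (new_x, done).
    """
    x = x0
    parts = split(items, num_workers)  # the list is partitioned once
    for _ in range(max_iter):
        # Each "worker" folds its own sublist into one partial result.
        partials = [
            reduce(reduce_f, (map_f(x, a) for a in part))
            for part in parts
            if part
        ]
        # The "master" folds the partial results and computes the next approximation.
        total = reduce(reduce_f, partials)
        x, done = step(x, total)
        if done:
            break
    return x
```

For example, with `map_f = lambda x, a: x - a`, addition as `reduce_f`, and a `step` that moves `x` by half the mean of the folded value, the loop converges to the mean of the list. Only `map_f`, `reduce_f` and `step` are problem-specific in this sketch; the partitioning and the iteration loop stay the same.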
Lower-bound Time-Complexity Analysis of Logic Programs
The paper proposes a technique for inferring conditions on goals that, when satisfied, ensure that a goal is sufficiently coarse-grained to warrant parallel evaluation. The method is powerful enough to reason about divide-and-conquer programs, and in the case of quicksort, for instance, can infer that a quicksort goal has a time complexity that exceeds 64 resolution steps (a threshold for spawning) if the input list is of length 10 or more. This gives a simple run-time tactic for controlling spawning. The method has been proved correct, can be implemented straightforwardly, has been demonstrated to be useful on a parallel machine, and, in contrast with much of the previous work on time-complexity analysis of logic programs, does not require any complicated machinery for solving difference equations.
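The run-time tactic mentioned in the abstract can be pictured with a small Python sketch. It is only an analogy: the paper concerns logic programs, whereas this version uses Python threads, and the function name `quicksort` and the constant `SPAWN_MIN_LENGTH` are hypothetical. The threshold of 10 is the list length quoted in the abstract.

```python
import threading

# From the abstract: a quicksort goal on a list of length >= 10 is inferred
# to exceed 64 resolution steps, the threshold for spawning.
SPAWN_MIN_LENGTH = 10


def quicksort(xs):
    """Return (sorted list, number of sub-goals that were spawned)."""
    if len(xs) <= 1:
        return list(xs), 0
    pivot, rest = xs[0], xs[1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    if len(smaller) >= SPAWN_MIN_LENGTH:
        # Coarse-grained enough: evaluate this sub-goal in its own thread.
        box = {}
        worker = threading.Thread(
            target=lambda: box.update(result=quicksort(smaller))
        )
        worker.start()
        right, n_right = quicksort(larger)
        worker.join()
        left, n_left = box["result"]
        spawned = 1 + n_left + n_right
    else:
        # Too fine-grained: evaluate sequentially to avoid spawning overhead.
        left, n_left = quicksort(smaller)
        right, n_right = quicksort(larger)
        spawned = n_left + n_right
    return left + [pivot] + right, spawned
```

The only run-time cost of the tactic in this sketch is one length comparison per recursive call; lists shorter than the threshold never trigger a spawn.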
Optimal Union-Find in Constraint Handling Rules
Constraint Handling Rules (CHR) is a committed-choice rule-based language
that was originally intended for writing constraint solvers. In this paper we
show that it is also possible to write the classic union-find algorithm and
variants in CHR. The programs compromise neither declarativeness nor
efficiency. We study the time complexity of our programs: they match the
almost-linear complexity of the best known imperative implementations. This
fact is illustrated with experimental results.
Comment: 12 pages, 3 figures, to appear in Theory and Practice of Logic
Programming (TPLP)
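For reference, the classic imperative algorithm that the CHR programs are compared against can be sketched in Python as follows. This is the textbook union-find with path compression and union by rank, not the CHR code from the paper; the class and method names are chosen here for illustration.

```python
class UnionFind:
    """Disjoint-set forest with path compression and union by rank."""

    def __init__(self):
        self.parent = {}
        self.rank = {}

    def make(self, x):
        """Create a singleton set for x if it is not known yet."""
        if x not in self.parent:
            self.parent[x] = x
            self.rank[x] = 0

    def find(self, x):
        """Return the representative of x's set, compressing the path."""
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: point every node on the path directly at the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, a, b):
        """Merge the sets of a and b; return False if already merged."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        # Union by rank: attach the shallower tree under the deeper one.
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        return True
```

With both heuristics in place, a sequence of operations runs in almost-linear time, which is the bound the abstract refers to.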
Quantifying Resource Use in Computations
It is currently not possible to quantify the resources needed to perform a
computation. As a consequence, it is not possible to reliably evaluate the
hardware resources needed for the application of algorithms or the running of
programs. This is apparent in both computer science, for instance, in
cryptanalysis, and in neuroscience, for instance, comparative neuro-anatomy. A
System versus Environment game formalism is proposed based on Computability
Logic that makes it possible to define a computational work function describing the
theoretical and physical resources needed to perform any purely algorithmic
computation. Within this formalism, the cost of a computation is defined as the
sum of information storage over the steps of the computation. The size of the
computational device, e.g., the action table of a Universal Turing Machine, the
number of transistors in silicon, or the number and complexity of synapses in a
neural net, is explicitly included in the computational cost. The proposed cost
function leads in a natural way to known computational trade-offs and can be
used to estimate the computational capacity of real silicon hardware and neural
nets. The theory is applied to a historical case of 56-bit DES key recovery, as
an example of application to cryptanalysis. Furthermore, the relative
computational capacities of human brain neurons and the C. elegans nervous
system are estimated as an example of application to neural nets.
Comment: 26 pages, no figure
Work Analysis with Resource-Aware Session Types
While there exist several successful techniques for supporting programmers in
deriving static resource bounds for sequential code, analyzing the resource
usage of message-passing concurrent processes poses additional challenges. To
meet these challenges, this article presents an analysis for statically
deriving worst-case bounds on the total work performed by message-passing
processes. To decompose interacting processes into components that can be
analyzed in isolation, the analysis is based on novel resource-aware session
types, which describe protocols and resource contracts for inter-process
communication. A key innovation is that both messages and processes carry
potential to share and amortize cost while communicating. To symbolically
express resource usage in a setting without static data structures and
intrinsic sizes, resource contracts describe bounds that are functions of
interactions between processes. Resource-aware session types combine standard
binary session types and type-based amortized resource analysis in a linear
type system. This type system is formulated for a core session-type calculus of
the language SILL and proved sound with respect to a multiset-based operational
cost semantics that tracks the total number of messages that are exchanged in a
system. The effectiveness of the analysis is demonstrated by analyzing standard
examples from amortized analysis and the literature on session types and by a
comparative performance analysis of different concurrent programs implementing
the same interface.
Comment: 25 pages, 2 pages of references, 11 pages of appendix, Accepted at
LICS 201
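To make the cost measure concrete (the total number of messages exchanged), here is a small Python sketch of a standard example from amortized analysis: a binary counter modelled as a chain of bit "processes" that forward carry messages. It is not SILL code and does not implement the type system; the function names are hypothetical, and the simulation simply counts messages to show the amortized bound of at most 2 messages per increment.

```python
def increment(bits, i=0):
    """Deliver one 'inc' message to bit process i; return messages sent.

    bits[i] is the state of the i-th process, least significant bit first.
    """
    if i == len(bits):
        bits.append(0)  # a fresh process at the end of the chain
    if bits[i] == 0:
        bits[i] = 1  # absorb the message, no carry
        return 1
    bits[i] = 0  # flip back and forward a carry message to the next process
    return 1 + increment(bits, i + 1)


def count_messages(n):
    """Run n increments from zero; return (final bits, total messages)."""
    bits, total = [], 0
    for _ in range(n):
        total += increment(bits)
    return bits, total
```

A single increment may send many messages when a long run of 1 bits is flipped, but over `n` increments starting from zero the total never exceeds `2 * n`, which is the kind of bound an amortized analysis is meant to certify.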
B-LOG: A branch and bound methodology for the parallel execution of logic programs
We propose a computational methodology, "B-LOG", which offers the potential for an effective implementation of Logic Programming on a parallel computer. We also propose a weighting scheme to guide the search process through the graph, and we apply the concepts of parallel "branch and bound" algorithms in order to perform a "best-first" search using an information-theoretic bound. The concept of "session" is used to speed up the search process in a succession of similar queries. Within a session, we strongly modify the bounds in a local database, while bounds kept in a global database are weakly modified to provide a better initial condition for other sessions. We also propose an implementation scheme based on a database machine using "semantic paging", and the "B-LOG processor" based on a scoreboard-driven controller.
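The general idea of a bound-guided best-first search can be sketched in Python as below. This is a generic, sequential sketch and not the B-LOG method: the edge weights are placeholder numbers rather than the information-theoretic bound of the paper, sessions and the local/global databases are not modelled, and the function name `best_first` is hypothetical.

```python
import heapq


def best_first(graph, start, is_goal):
    """Expand the node with the smallest accumulated bound first.

    graph maps a node to a list of (child, non-negative weight) pairs.
    Returns (bound, path) for the first goal reached, or None.
    """
    frontier = [(0.0, start, [start])]
    best_seen = {start: 0.0}
    while frontier:
        bound, node, path = heapq.heappop(frontier)
        if is_goal(node):
            return bound, path
        if bound > best_seen.get(node, float("inf")):
            continue  # a cheaper route to this node was already expanded
        for child, weight in graph.get(node, []):
            new_bound = bound + weight
            # Prune branches that cannot improve on the best known bound.
            if new_bound < best_seen.get(child, float("inf")):
                best_seen[child] = new_bound
                heapq.heappush(frontier, (new_bound, child, path + [child]))
    return None
```

Because the frontier is a priority queue ordered by bound, the most promising branch is always expanded next, and branches whose bound is no better than one already recorded are discarded.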