35,934 research outputs found
A simulation study of or- and independent and-parallelism
Although studies of a number of parallel implementations of logic programming languages are now available, the results are difficult to interpret due to the multiplicity of factors involved, the effect of each of which is difficult to sepárate. In this paper we present the results of a highlevel simulation study of or- and independent and-parallelism with a wide selection of Prolog programs that aims to facilĂtate this separation. We hope this study will be instrumental in better understanding and comparing results from actual implementations, as shown by an example
in the paper. In addition, the paper examines some of the issues and tradeoffs associated with the combination of and- and or-parallelism and proposes reasonable solutions based on the simulation data
High-level characteristics of or-and independent and-parallelism in prolog
Although studies of a number of parallel implementations of logic programming languages are now available, their results are difficult to interpret due to the multiplicity of factors involved, the effect of each of which is difficult to sepárate. In this paper we present the results of a high-level simulation study of or- and independent
and-parallelism with a wide selection of Prolog programs that aims to determine the intrinsic amount of parallelism,
independently of implementation factors, thus facilitating this separation. We expect this study will be instrumental in better understanding and comparing results from actual implementations, as shown by some examples provided in the paper. In addition, the paper examines some of the issues and tradeoffs associated with the combination of and- and or-parallelism and proposes reasonable solutions based on the simulation data obtained
On Designing Multicore-aware Simulators for Biological Systems
The stochastic simulation of biological systems is an increasingly popular
technique in bioinformatics. It often is an enlightening technique, which may
however result in being computational expensive. We discuss the main
opportunities to speed it up on multi-core platforms, which pose new challenges
for parallelisation techniques. These opportunities are developed in two
general families of solutions involving both the single simulation and a bulk
of independent simulations (either replicas of derived from parameter sweep).
Proposed solutions are tested on the parallelisation of the CWC simulator
(Calculus of Wrapped Compartments) that is carried out according to proposed
solutions by way of the FastFlow programming framework making possible fast
development and efficient execution on multi-cores.Comment: 19 pages + cover pag
Parallel implementation of stochastic simulation for large-scale cellular processes
Experimental and theoretical studies have shown the importance of stochastic processes in genetic regulatory networks and cellular processes. Cellular networks and genetic circuits often involve small numbers of key proteins such as transcriptional factors and signaling proteins. In recent years stochastic models have been used successfully for studying noise in biological pathways, and stochastic modelling of biological systems has become a very important research field in computational biology. One of the challenge problems in this field is the reduction of the huge computing time in stochastic simulations. Based on the system of the mitogen-activated protein kinase cascade that is activated by epidermal growth factor, this work give a parallel implementation by using OpenMP and parallelism across the simulation. Special attention is paid to the independence of the generated random numbers in parallel computing, that is a key criterion for the success of stochastic simulations. Numerical results indicate that parallel computers can be used as an efficient tool for simulating the dynamics of large-scale genetic regulatory networks and cellular processes
RPPM : Rapid Performance Prediction of Multithreaded workloads on multicore processors
Analytical performance modeling is a useful complement to detailed cycle-level simulation to quickly explore the design space in an early design stage. Mechanistic analytical modeling is particularly interesting as it provides deep insight and does not require expensive offline profiling as empirical modeling. Previous work in mechanistic analytical modeling, unfortunately, is limited to single-threaded applications running on single-core processors.
This work proposes RPPM, a mechanistic analytical performance model for multi-threaded applications on multicore hardware. RPPM collects microarchitecture-independent characteristics of a multi-threaded workload to predict performance on a previously unseen multicore architecture. The profile needs to be collected only once to predict a range of processor architectures. We evaluate RPPM's accuracy against simulation and report a performance prediction error of 11.2% on average (23% max). We demonstrate RPPM's usefulness for conducting design space exploration experiments as well as for analyzing parallel application performance
Multi-Architecture Monte-Carlo (MC) Simulation of Soft Coarse-Grained Polymeric Materials: SOft coarse grained Monte-carlo Acceleration (SOMA)
Multi-component polymer systems are important for the development of new
materials because of their ability to phase-separate or self-assemble into
nano-structures. The Single-Chain-in-Mean-Field (SCMF) algorithm in conjunction
with a soft, coarse-grained polymer model is an established technique to
investigate these soft-matter systems. Here we present an im- plementation of
this method: SOft coarse grained Monte-carlo Accelera- tion (SOMA). It is
suitable to simulate large system sizes with up to billions of particles, yet
versatile enough to study properties of different kinds of molecular
architectures and interactions. We achieve efficiency of the simulations
commissioning accelerators like GPUs on both workstations as well as
supercomputers. The implementa- tion remains flexible and maintainable because
of the implementation of the scientific programming language enhanced by
OpenACC pragmas for the accelerators. We present implementation details and
features of the program package, investigate the scalability of our
implementation SOMA, and discuss two applications, which cover system sizes
that are difficult to reach with other, common particle-based simulation
methods
Memory performance of and-parallel prolog on shared-memory architectures
The goal of the RAP-WAM AND-parallel Prolog abstract architecture is to provide inference speeds significantly
beyond those of sequential systems, while supporting Prolog semantics and preserving sequential performance and storage efficiency. This paper presents simulation results supporting these claims with special emphasis on memory performance on a two-level sharedmemory multiprocessor organization. Several solutions to the cache coherency problem are analyzed. It is shown that RAP-WAM offers good locality and storage efficiency and that it can effectively take advantage of broadcast caches. It is argued that speeds in excess of 2 ML IPS on real applications exhibiting medium parallelism can be attained with current technology
Janus II: a new generation application-driven computer for spin-system simulations
This paper describes the architecture, the development and the implementation
of Janus II, a new generation application-driven number cruncher optimized for
Monte Carlo simulations of spin systems (mainly spin glasses). This domain of
computational physics is a recognized grand challenge of high-performance
computing: the resources necessary to study in detail theoretical models that
can make contact with experimental data are by far beyond those available using
commodity computer systems. On the other hand, several specific features of the
associated algorithms suggest that unconventional computer architectures, which
can be implemented with available electronics technologies, may lead to order
of magnitude increases in performance, reducing to acceptable values on human
scales the time needed to carry out simulation campaigns that would take
centuries on commercially available machines. Janus II is one such machine,
recently developed and commissioned, that builds upon and improves on the
successful JANUS machine, which has been used for physics since 2008 and is
still in operation today. This paper describes in detail the motivations behind
the project, the computational requirements, the architecture and the
implementation of this new machine and compares its expected performances with
those of currently available commercial systems.Comment: 28 pages, 6 figure
- …