84 research outputs found
StochKit-FF: Efficient Systems Biology on Multicore Architectures
The stochastic modelling of biological systems is an informative and, in some
cases, highly appropriate technique, which may however be more expensive than
other modelling approaches, such as differential equations. We
present StochKit-FF, a parallel version of StochKit, a reference toolkit for
stochastic simulations. StochKit-FF is based on the FastFlow programming
toolkit for multicores and exploits the novel concept of selective memory. We
evaluate StochKit-FF on a model of HIV infection dynamics, with the aim of
extracting information from efficiently run experiments, here in terms of
average and variance and, in the longer term, of more structured data.
Comment: 14 pages + cover pag
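The abstract above reports results as the average and variance over many stochastic runs. As a minimal sketch of that idea, the following Python code runs Gillespie's direct method on a toy birth-death process and accumulates mean and variance online (Welford's update), so that no individual trajectory needs to be stored. The model, the rate constants, and the names `ssa_birth_death`, `RunningStats`, and `simulate` are all assumptions made for illustration; this is not StochKit-FF's API, and it is not claimed to be how its selective memory works.

```python
import random


def ssa_birth_death(birth, death, x0, t_end, rng):
    """Gillespie direct method for a toy model (an assumption, not the HIV model):
    0 -> X at rate `birth`, and X -> 0 at rate `death * x`."""
    t, x = 0.0, x0
    while True:
        a_birth = birth
        a_death = death * x
        a_total = a_birth + a_death
        if a_total == 0.0:
            return x  # no reaction can fire any more
        t += rng.expovariate(a_total)  # time to the next reaction
        if t > t_end:
            return x
        # Choose which reaction fires, proportionally to its propensity.
        if rng.random() * a_total < a_birth:
            x += 1
        else:
            x -= 1


class RunningStats:
    """Welford's online mean and sample variance; stores no samples."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0

    def add(self, value):
        self.n += 1
        delta = value - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (value - self.mean)

    @property
    def variance(self):
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0


def simulate(runs=200, seed=1):
    """Run independent trajectories and keep only the aggregate statistics."""
    rng = random.Random(seed)
    stats = RunningStats()
    for _ in range(runs):
        stats.add(ssa_birth_death(10.0, 1.0, 0, 5.0, rng))
    return stats


if __name__ == "__main__":
    result = simulate()
    print(result.n, result.mean, result.variance)
```

Because the trajectories are independent, the loop in `simulate` is the natural place to parallelise; the accumulator would then need to be merged across workers, which this sketch does not attempt.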
Parallel Discrete Event Simulation with Erlang
Discrete Event Simulation (DES) is a widely used technique in which the state
of the simulator is updated by events happening at discrete points in time
(hence the name). DES is used to model and analyze many kinds of systems,
including computer architectures, communication networks, street traffic, and
others. Parallel and Distributed Simulation (PADS) aims at improving the
efficiency of DES by partitioning the simulation model across multiple
processing elements, in order to enable larger and/or more detailed studies
to be carried out. Interest in PADS has been increasing with the widespread
availability of multicore processors and affordable high-performance computing
clusters. However, designing parallel simulation models requires considerable
expertise, the result being that PADS techniques are not as widespread as they
could be. In this paper we describe ErlangTW, a parallel simulation middleware
based on the Time Warp synchronization protocol. ErlangTW is entirely written
in Erlang, a concurrent, functional programming language specifically targeted
at building distributed systems. We argue that writing parallel simulation
models in Erlang is considerably easier than using conventional programming
languages. Moreover, ErlangTW allows simulation models to be executed on
single-core, multicore, and distributed computing architectures. We describe the
design and prototype implementation of ErlangTW, and report some preliminary
performance results on multicore and distributed architectures using the
well-known PHOLD benchmark.
Comment: Proceedings of ACM SIGPLAN Workshop on Functional High-Performance
Computing (FHPC 2012) in conjunction with ICFP 2012. ISBN: 978-1-4503-1577-
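To make the event-driven state update described above concrete, here is a minimal sequential sketch in Python, written in the spirit of the PHOLD workload: a fixed population of events circulates among logical processes, and each processed event schedules one new event for a randomly chosen process at a later timestamp. It is an illustration only. It is neither ErlangTW nor a Time Warp implementation, and the function name `run_phold`, its parameters, and the exponential delay distribution are assumptions.

```python
import heapq
import random


def run_phold(num_lps=4, initial_events=8, end_time=50.0, seed=0):
    """Sequential event loop with a PHOLD-like workload (illustrative only)."""
    rng = random.Random(seed)
    queue = []  # min-heap of (timestamp, sequence number, destination LP)
    seq = 0     # tie-breaker so equal timestamps are ordered deterministically
    for _ in range(initial_events):
        heapq.heappush(queue, (rng.expovariate(1.0), seq, rng.randrange(num_lps)))
        seq += 1

    processed = [0] * num_lps
    clock = 0.0
    while queue:
        timestamp, _, lp = heapq.heappop(queue)
        if timestamp > end_time:
            break
        clock = timestamp      # simulated time jumps to the next event
        processed[lp] += 1     # "handle" the event at its logical process
        # Each handled event generates one new event for a random LP.
        heapq.heappush(
            queue,
            (timestamp + rng.expovariate(1.0), seq, rng.randrange(num_lps)),
        )
        seq += 1
    return clock, processed


if __name__ == "__main__":
    final_clock, per_lp = run_phold()
    print(final_clock, per_lp)
```

A parallel simulator would partition the logical processes across processing elements and would then need a synchronization protocol, such as Time Warp, to keep event ordering consistent; the single global heap here sidesteps that problem entirely.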
Constraint programming on hierarchical multiprocessor systems
The work reported in this thesis is about constraint processing in the context of hierarchical
multiprocessor systems, including distributed systems. More specifically, it develops
techniques and a system to help bring the power available in today's multiprocessing
networked systems into the constraint processing field.
Solving constraint-specified problems is a process which lends itself naturally to
parallelisation, as it usually implies going through very large search spaces, looking for
a solution. Parallel constraint solving draws on the idea of dividing the search space
among several workers, so the search may proceed faster, and thanks to the declarative
nature of constraint programming, the parallelisation happens transparently as far as
the user is concerned. However, to fully take advantage of the parallel computing power
available, techniques must be developed to help ensure that the workers executing the
search are kept busy at all times, which is an issue tackled by this work.
SUMMARY: This thesis addresses constraint programming in the context of hierarchical
multiprocessor systems, including distributed systems. More specifically, the
work develops techniques for solving constraint satisfaction problems using
parallelism.
The topic is timely because multiprocessor systems are increasingly widespread
and, together with the ubiquity of computer networks, they put at our disposal
a computing capacity that needs to be put to use, which has been slow to happen.
This thesis develops a system that makes it possible to exploit these resources
through constraint processing.
Constraint programming is a declarative paradigm in which the user does not
have to worry about the control of the computation, so parallelism can be
introduced in this area transparently. Moreover, the process of searching for
solutions to problems specified by constraints lends itself particularly well
to parallelisation.
This thesis presents an approach to parallel constraint solving that combines
local parallelism, in the form of workers, with distributed parallelism, in which
the actors are teams. The system built, which targets large-scale distributed
systems and is described along with its results, includes work distribution
through work stealing. Work stealing operates locally without the cooperation
of the victim and remotely with cooperation, in an environment in which all
teams cooperate in the search for a solution
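The work-stealing idea in this abstract can be sketched as follows. This is a single-threaded Python simulation, not the thesis's system: workers take turns, each owning a pool of open search nodes, and an idle worker takes a node from the fullest pool without the victim doing anything. The N-Queens problem, the round-robin scheduling, the victim-selection rule, and the names `expand` and `solve_with_stealing` are all assumptions chosen for illustration. Teams and remote, cooperative stealing are not modelled.

```python
from collections import deque


def expand(node, n):
    """Children of a partial N-Queens placement (one column index per row)."""
    row = len(node)
    children = []
    for col in range(n):
        # A queen at (row, col) must not share a column or diagonal.
        if all(col != c and abs(col - c) != row - r for r, c in enumerate(node)):
            children.append(node + (col,))
    return children


def solve_with_stealing(n=6, num_workers=3):
    """Count N-Queens solutions with simulated workers that steal work."""
    pools = [deque() for _ in range(num_workers)]
    pools[0].append(())  # all work starts at the first worker
    solutions = 0
    steals = 0
    while any(pools):
        for pool in pools:
            if not pool:
                # Idle worker: steal the oldest (shallowest) node from the
                # fullest pool, without any action by the victim.
                victim = max(pools, key=len)
                if not victim:
                    continue  # nothing to steal anywhere right now
                pool.append(victim.popleft())
                steals += 1
            node = pool.pop()  # owner works depth-first on its newest node
            if len(node) == n:
                solutions += 1
            else:
                pool.extend(expand(node, n))
    return solutions, steals


if __name__ == "__main__":
    print(solve_with_stealing(6, 3))
```

Stealing from the opposite end of the pool to the one the owner works on tends to hand over larger subtrees, which keeps the number of steals low; that is a common design choice for work-stealing schedulers rather than something stated in the abstract.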
A bibliography on parallel and vector numerical algorithms
This is a bibliography of numerical methods. It also includes a number of other references on machine architecture, programming languages, and other topics of interest to scientific computing. Certain conference proceedings and anthologies that have been published in book form are also listed
Parallel algorithms for inductance extraction
In VLSI circuits, signal delays play an important role in design, timing verification and
signal integrity checks. These delays are attributed to the presence of parasitic resistance,
capacitance and inductance. With increasing clock speed and reducing feature sizes, these
delays will be dominated by parasitic inductance. In next-generation VLSI circuits, with
millions of components and interconnect segments, fast and accurate inductance
estimation becomes a crucial step.
A generalized approach for inductance extraction requires the solution of a large,
dense, complex linear system that models mutual inductive effects among circuit elements.
Iterative methods are used to solve the system without explicit computation of the system
matrix itself. Fast hierarchical techniques are used to compute approximate matrix-vector
products with the dense system matrix in a matrix-free way. Because the system matrix
is never formed explicitly, constructing a preconditioner to accelerate the convergence of the iterative method
becomes a challenging task.
This work presents a class of parallel algorithms for fast and accurate inductance extraction
of VLSI circuits. We use the solenoidal basis approach that converts the linear
system into a reduced system. The reduced system of equations is solved by a preconditioned
iterative solver that uses fast hierarchical methods to compute products with the
dense coefficient matrix. A Green's function based preconditioner is proposed that achieves
near-optimal convergence rates in several cases. By formulating the preconditioner as a
dense matrix similar to the coefficient matrix, we are able to use fast hierarchical methods
for the preconditioning step as well. Experiments on a number of benchmark problems
highlight the efficient preconditioning scheme and its advantages over FastHenry.
To further reduce the solution time of the software, we have developed a parallel implementation.
The parallel software package is capable of analyzing interconnect
configurations involving several conductors within a reasonable time. A two-tier parallelization
scheme enables mixed-mode parallelization, which uses both OpenMP and MPI.
The parallel performance of the software is demonstrated through experiments on the IBM
p690 and AMD Linux clusters. These experiments highlight the portability and efficiency
of the software on multiprocessors with shared, distributed, and distributed-shared memory
architectures
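The solver structure described in this abstract, an iterative method that only ever asks for matrix-vector products plus a preconditioner applied at each step, can be illustrated with a much simpler stand-in. The Python sketch below uses preconditioned conjugate gradients on a small real symmetric positive definite toy system, with an exact matrix-free product and a diagonal (Jacobi) preconditioner. These are deliberate substitutions: the paper's system is complex-valued, its products are approximated by fast hierarchical methods, and its preconditioner is based on a Green's function. The kernel, the problem size, and the names `pcg`, `kernel_matvec`, and `jacobi` are assumptions.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def pcg(matvec, b, precond, tol=1e-10, max_iter=200):
    """Preconditioned conjugate gradients using only matvec(x) and precond(r)."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x for the initial guess x = 0
    z = precond(r)
    p = list(z)
    rz = dot(r, z)
    for iteration in range(max_iter):
        if dot(r, r) ** 0.5 <= tol:
            return x, iteration
        ap = matvec(p)
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        z = precond(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


def kernel_matvec(x):
    """Toy dense operator applied without storing the matrix:
    A[i][j] = 1 / (1 + |i - j|), plus an extra (1 + i) on the diagonal."""
    n = len(x)
    return [
        (1.0 + i) * x[i] + sum(x[j] / (1.0 + abs(i - j)) for j in range(n))
        for i in range(n)
    ]


def jacobi(r):
    """Diagonal preconditioner: divide by A[i][i] = 2 + i."""
    return [ri / (2.0 + i) for i, ri in enumerate(r)]


if __name__ == "__main__":
    rhs = [1.0] * 20
    solution, iterations = pcg(kernel_matvec, rhs, jacobi)
    print(iterations)
```

The point of the interface is that `pcg` never sees the matrix, so `kernel_matvec` could be swapped for an approximate fast product and `jacobi` for a stronger preconditioner without touching the solver loop.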
A review of literature on parallel constraint solving
As multicore computing is now standard, it seems irresponsible for constraints researchers to ignore the implications of it. Researchers need to address a number of issues to exploit parallelism, such as: investigating which constraint algorithms are amenable to parallelisation; whether to use shared memory or distributed computation; whether to use static or dynamic decomposition; and how to best exploit portfolios and cooperating search. We review the literature, and see that we can sometimes do quite well, some of the time, on some instances, but we are far from a general solution. Yet there seems to be little overall guidance that can be given on how best to exploit multicore computers to speed up constraint solving. We hope at least that this survey will provide useful pointers to future researchers wishing to correct this situation
Parallelisation of algorithms
Most numerical software involves performing an extremely large volume of algebraic computations. This is both costly and time consuming in respect of computer resources and, for large problems, often super-computer power is required in order for results to be obtained in a reasonable amount of time. One method whereby both the cost and time can be reduced is to use the principle "Many hands make light work", or rather, allow several computers to operate simultaneously on the code, working towards a common goal, and hopefully obtaining the required results in a fraction of the time and cost normally used. This can be achieved through the modification of the costly, time consuming code, breaking it up into separate individual code segments which may be executed concurrently on different processors. This is termed parallelisation of code. This document describes communication between sequential processes, protocols, message routing and parallelisation of algorithms. In particular, it deals with these aspects with reference to the Transputer as developed by INMOS and includes two parallelisation examples, namely parallelisation of code to study airflow and of code to determine far field patterns of antennas. This document also reports on the practical experiences with programming in parallel