84 research outputs found

    StochKit-FF: Efficient Systems Biology on Multicore Architectures

    Full text link
    The stochastic modelling of biological systems is an informative, and in some cases, very adequate technique, which may however result in being more expensive than other modelling approaches, such as differential equations. We present StochKit-FF, a parallel version of StochKit, a reference toolkit for stochastic simulations. StochKit-FF is based on the FastFlow programming toolkit for multicores and exploits the novel concept of selective memory. We experiment StochKit-FF on a model of HIV infection dynamics, with the aim of extracting information from efficiently run experiments, here in terms of average and variance and, on a longer term, of more structured data.Comment: 14 pages + cover pag

    Parallel Discrete Event Simulation with Erlang

    Full text link
    Discrete Event Simulation (DES) is a widely used technique in which the state of the simulator is updated by events happening at discrete points in time (hence the name). DES is used to model and analyze many kinds of systems, including computer architectures, communication networks, street traffic, and others. Parallel and Distributed Simulation (PADS) aims at improving the efficiency of DES by partitioning the simulation model across multiple processing elements, in order to enabling larger and/or more detailed studies to be carried out. The interest on PADS is increasing since the widespread availability of multicore processors and affordable high performance computing clusters. However, designing parallel simulation models requires considerable expertise, the result being that PADS techniques are not as widespread as they could be. In this paper we describe ErlangTW, a parallel simulation middleware based on the Time Warp synchronization protocol. ErlangTW is entirely written in Erlang, a concurrent, functional programming language specifically targeted at building distributed systems. We argue that writing parallel simulation models in Erlang is considerably easier than using conventional programming languages. Moreover, ErlangTW allows simulation models to be executed either on single-core, multicore and distributed computing architectures. We describe the design and prototype implementation of ErlangTW, and report some preliminary performance results on multicore and distributed architectures using the well known PHOLD benchmark.Comment: Proceedings of ACM SIGPLAN Workshop on Functional High-Performance Computing (FHPC 2012) in conjunction with ICFP 2012. ISBN: 978-1-4503-1577-

    Constraint programming on hierarchical multiprocessor systems

    Get PDF
    The work reported in this thesis is about constraint processing in the context of hierarchical multiprocessor systems, including distributed systems. More speci cally, it develops techniques and a system to help bringing the power available in today's multiprocessing networked systems into the constraint processing eld. Solving constraint speci ed problems is a process which lends itself naturally to parallelisation, as it usually implies going through very large search spaces, looking for a solution. Parallel constraint solving draws on the idea of dividing the search space among several workers, so the search may proceed faster, and thanks to the declarative nature of constraint programming, the parallelisation happens transparently as far as the user is concerned. However, to fully take advantage of the parallel computing power available, techniques must be developed to help ensure that the workers executing the search are kept busy at all times, which is an issue tackled by this work; RESUMO: Esta tese debruça-se sobre a programação por restrições no contexto dos sistemas multiprocessador hierárquicos, incluindo os sistemas distribuídos. Mais especificamente, o trabalho elaborado desenvolve as técnicas de resolução de problemas de satisfação de restrições recorrendo ao paralelismo. A actualidade do tema prende-se com a cada vez maior divulgação de que são objecto os sistemas multiprocessador que, juntamente com a omnipresença das redes de computadores, põe à nossa disposição uma capacidade de cálculo que necessita de ser posta a uso, o que tarda em acontecer. Nesta tese desenvolve-se um sistema que permite tirar partido desses recursos através do processamento de restrições A programação por restrições é um paradigma declarativo, em que o utilizador não tem de se preocupar com o controlo da computação, e a introdução de paralelismo nesta área pode realizar-se transparentemente. Por outro lado, o processo de pesquisa de soluções para problemas especificados por restrições adapta-se particularmente bem a ser paralelizado. Este tese apresenta uma abordagem _à resolução paralela de restrições, que junta paralelismo local, sob a forma de trabalhadores, com paralelismo distribuído, em que os actores são as equipas. O sistema construído, destinado a sistemas distribuídos de larga escala, que _é descrito e os seus resultados apresentados, inclui distribuição de trabalho, através de roubo de trabalho. Este funciona, localmente, sem a colaboração do roubado e, remotamente, com colaboração, num ambiente em que todas as equipas cooperam na procura da solução

    A bibliography on parallel and vector numerical algorithms

    Get PDF
    This is a bibliography of numerical methods. It also includes a number of other references on machine architecture, programming language, and other topics of interest to scientific computing. Certain conference proceedings and anthologies which have been published in book form are listed also

    Parallel algorithms for inductance extraction

    Get PDF
    In VLSI circuits, signal delays play an important role in design, timing verification and signal integrity checks. These delays are attributed to the presence of parasitic resistance, capacitance and inductance. With increasing clock speed and reducing feature sizes, these delays will be dominated by parasitic inductance. In the next generation VLSI circuits, with more than millions of components and interconnect segments, fast and accurate inductance estimation becomes a crucial step. A generalized approach for inductance extraction requires the solution of a large, dense, complex linear system that models mutual inductive effects among circuit elements. Iterative methods are used to solve the system without explicit computation of the system matrix itself. Fast hierarchical techniques are used to compute approximate matrix-vector products with the dense system matrix in a matrix-free way. Due to unavailability of system matrix, constructing a preconditioner to accelerate the convergence of the iterative method becomes a challenging task. This work presents a class of parallel algorithms for fast and accurate inductance extraction of VLSI circuits. We use the solenoidal basis approach that converts the linear system into a reduced system. The reduced system of equations is solved by a preconditioned iterative solver that uses fast hierarchical methods to compute products with the dense coefficient matrix. A GreenâÃÂÃÂs function based preconditioner is proposed that achieves near-optimal convergence rates in several cases. By formulating the preconditioner as a dense matrix similar to the coefficient matrix, we are able to use fast hierarchical methods for the preconditioning step as well. Experiments on a number of benchmark problems highlight the efficient preconditioning scheme and its advantages over FastHenry. To further reduce the solution time of the software, we have developed a parallel implementation. The parallel software package is capable of analyzing interconnects con- figurations involving several conductors within reasonable time. A two-tier parallelization scheme enables mixed mode parallelization, which uses both OpenMP and MPI directives. The parallel performance of the software is demonstrated through experiments on the IBM p690 and AMD Linux clusters. These experiments highlight the portability and efficiency of the software on multiprocessors with shared, distributed, and distributed-shared memory architectures

    Quantitative Performance Analysis of the SPEC OMPM2001 Benchmarks

    Get PDF

    A review of literature on parallel constraint solving

    Get PDF
    As multicore computing is now standard, it seems irresponsible for constraints researchers to ignore the implications of it. Researchers need to address a number of issues to exploit parallelism, such as: investigating which constraint algorithms are amenable to parallelisation; whether to use shared memory or distributed computation; whether to use static or dynamic decomposition; and how to best exploit portfolios and cooperating search. We review the literature, and see that we can sometimes do quite well, some of the time, on some instances, but we are far from a general solution. Yet there seems to be little overall guidance that can be given on how best to exploit multicore computers to speed up constraint solving. We hope at least that this survey will provide useful pointers to future researchers wishing to correct this situation

    Parallelisation of algorithms

    Get PDF
    Most numerical software involves performing an extremely large volume of algebraic computations. This is both costly and time consuming in respect of computer resources and, for large problems, often super-computer power is required in order for results to be obtained in a reasonable amount of time. One method whereby both the cost and time can be reduced is to use the principle "Many hands make light work", or rather, allow several computers to operate simultaneously on the code, working towards a common goal, and hopefully obtaining the required results in a fraction of the time and cost normally used. This can be achieved through the modification of the costly, time consuming code, breaking it up into separate individual code segments which may be executed concurrently on different processors. This is termed parallelisation of code. This document describes communication between sequential processes, protocols, message routing and parallelisation of algorithms. In particular, it deals with these aspects with reference to the Transputer as developed by INMOS and includes two parallelisation examples, namely parallelisation of code to study airflow and of code to determine far field patterns of antennas. This document also reports on the practical experiences with programming in parallel
    corecore