329 research outputs found

PPF - A Parallel Particle Filtering Library

Author: Demirel Ömer
Meijering Erik
Niessen Wiro
Sbalzarini Ivo F.
Smal Ihor
Publication date: 01/01/2014

We present the parallel particle filtering (PPF) software library, which enables hybrid shared-memory/distributed-memory parallelization of particle filtering (PF) algorithms, combining the Message Passing Interface (MPI) with multithreading for multi-level parallelism. The library is implemented in Java and relies on OpenMPI's Java bindings for inter-process communication. It includes dynamic load balancing, multi-thread balancing, and several algorithmic improvements for PF, such as input-space domain decomposition. The PPF library hides the difficulties of efficient parallel programming of PF algorithms and provides application developers with the necessary tools for parallel implementation of PF methods. We demonstrate the capabilities of the PPF library using two distributed PF algorithms in two scenarios with different numbers of particles. The PPF library runs a 38 million particle problem, corresponding to more than 1.86 GB of particle data, on 192 cores with 67% parallel efficiency. To the best of our knowledge, the PPF library is the first open-source software that offers a parallel framework for PF applications. Comment: 8 pages, 8 figures; will appear in the proceedings of the IET Data Fusion & Target Tracking Conference 201
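As a rough reading of the efficiency figure quoted in the abstract: parallel efficiency is conventionally defined as E = S / p, where S is the speedup over a single core and p is the number of cores. The sketch below only inverts that definition. The function name is hypothetical and not part of the PPF library, and the abstract does not state which baseline the authors used.

```python
def speedup_from_efficiency(efficiency: float, cores: int) -> float:
    """Invert E = S / p to recover the implied speedup S = E * p."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency is expected to lie in (0, 1]")
    if cores < 1:
        raise ValueError("cores must be a positive integer")
    return efficiency * cores


# Figures taken from the abstract: 67% efficiency on 192 cores.
implied_speedup = speedup_from_efficiency(0.67, 192)
print(f"implied speedup: {implied_speedup:.1f}x")
```

Under that definition, 67% efficiency on 192 cores corresponds to a speedup of roughly 129 over the assumed single-core baseline.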

arXiv.org e-Print Archive

EUR Research Repository

Simulating spin systems on IANUS, an FPGA-based computer

Author: A. Cruz
A. Gordillo
A. Maiorano
A. Muñoz-Sudupe
A. Tarancón
Amit
Ballesteros
Belanger
Belletti
Condon
Cruz
D. Navarro
D. Sciretti
E. Marinari
Edwards
F. Belletti
F. Mantovani
Heuer
Hukushima
Imry
J.J. Ruiz-Lorenzo
J.L. Velasco
L.A. Fernández
Landau
M. Cotallo
Maiorano
Marinari
Ogielski
Parisi
R. Tripiccione
Rieger
S. Pérez-Gaviro
S.F. Schifano
Tesi
V. Martín-Mayor
Publication venue: Elsevier BV
Publication date: 26/04/2007

We describe the hardwired implementation of algorithms for Monte Carlo simulations of a large class of spin models. We have implemented these algorithms as VHDL codes and we have mapped them onto a dedicated processor based on a large FPGA device. The measured performance on one such processor is comparable to O(100) carefully programmed high-end PCs: it turns out to be even better for some selected spin models. We describe here codes that we are currently executing on the IANUS massively parallel FPGA-based system. Comment: 19 pages, 8 figures; submitted to Computer Physics Communication
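For readers unfamiliar with the kind of update such hardware accelerates, here is a minimal software sketch of a single-spin-flip Metropolis sweep. It assumes the 2D Ising model with coupling J = 1 and periodic boundaries, which is only one member of the class of spin models the abstract refers to. It is not the authors' VHDL implementation, and all names are illustrative.

```python
import math
import random


def metropolis_sweep(spins, size, beta, rng):
    """Attempt size*size single-spin flips on a periodic size x size lattice."""
    for _ in range(size * size):
        i = rng.randrange(size)
        j = rng.randrange(size)
        s = spins[i][j]
        # Sum of the four nearest neighbours with periodic wrap-around.
        neighbours = (spins[(i + 1) % size][j] + spins[(i - 1) % size][j]
                      + spins[i][(j + 1) % size] + spins[i][(j - 1) % size])
        delta_e = 2 * s * neighbours  # energy change if s is flipped (J = 1)
        # Metropolis rule: always accept downhill moves, else accept
        # with probability exp(-beta * delta_e).
        if delta_e <= 0 or rng.random() < math.exp(-beta * delta_e):
            spins[i][j] = -s


def energy(spins, size):
    """Total energy, counting each nearest-neighbour bond once."""
    total = 0
    for i in range(size):
        for j in range(size):
            total -= spins[i][j] * (spins[(i + 1) % size][j]
                                    + spins[i][(j + 1) % size])
    return total


size = 8
rng = random.Random(0)
lattice = [[1] * size for _ in range(size)]
for _ in range(10):
    metropolis_sweep(lattice, size, beta=0.3, rng=rng)
print(energy(lattice, size))
```

Each update reads only a spin and its four neighbours, which is the kind of locality that makes such algorithms candidates for massively parallel hardware.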

arXiv.org e-Print Archive

Archivio della Ricerca - Università degli Studi di Siena

Archivio della ricerca- Università di Roma La Sapienza

Lazy Sequentialization for TSO and PSO via Shared Memory Abstractions

Author: Fischer Bernd
Inverso Omar
La Torre Salvatore
Nguyen Lam Truc
Parlato Gennaro
Tomasco Ermenegildo
Publication venue: FMCAD Inc.
Publication date: 01/01/2016

Lazy sequentialization is one of the most effective approaches for the bounded verification of concurrent programs. Existing tools assume sequential consistency (SC); thus the feasibility of lazy sequentializations for weak memory models (WMMs) remains untested. Here, we describe the first lazy sequentialization approach for the total store order (TSO) and partial store order (PSO) memory models. We replace all shared memory accesses with operations on a shared memory abstraction (SMA), an abstract data type that encapsulates the semantics of the underlying WMM and implements it under the simpler SC model. We give efficient SMA implementations for TSO and PSO that are based on temporal circular doubly-linked lists, a new data structure that allows an efficient simulation of the store buffers. We show experimentally, both on the SV-COMP concurrency benchmarks and a real world instance, that this approach works well in combination with lazy sequentialization on top of bounded model checking.
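To illustrate what simulating store buffers under SC can mean, here is a minimal sketch of a TSO-style memory with one FIFO store buffer per thread. It uses a plain deque rather than the temporal circular doubly-linked lists of the paper, and the class and method names are hypothetical, not the authors' SMA interface.

```python
from collections import deque


class TSOMemory:
    """Toy TSO model: each thread buffers its stores in FIFO order."""

    def __init__(self, num_threads):
        self.memory = {}  # shared memory, unwritten addresses read as 0
        self.buffers = [deque() for _ in range(num_threads)]

    def store(self, tid, addr, value):
        # A store is not yet visible to other threads.
        self.buffers[tid].append((addr, value))

    def load(self, tid, addr):
        # A thread sees its own most recent buffered store first.
        for buffered_addr, value in reversed(self.buffers[tid]):
            if buffered_addr == addr:
                return value
        return self.memory.get(addr, 0)

    def flush_one(self, tid):
        # Commit the oldest buffered store of this thread to shared memory.
        if self.buffers[tid]:
            addr, value = self.buffers[tid].popleft()
            self.memory[addr] = value

    def fence(self, tid):
        # Drain the whole buffer, as a memory fence would.
        while self.buffers[tid]:
            self.flush_one(tid)


# Store-buffering litmus test: T0 does x=1; r0=y and T1 does y=1; r1=x.
mem = TSOMemory(2)
mem.store(0, "x", 1)
mem.store(1, "y", 1)
r0 = mem.load(0, "y")
r1 = mem.load(1, "x")
print(r0, r1)  # 0 0
```

The outcome r0 = r1 = 0 is impossible under SC but allowed under TSO, because both stores can still sit in their buffers when the loads execute. A PSO variant would, under the same assumptions, keep a separate buffer per thread and address.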

Southampton (e-Prints Soton)

Archivio della Ricerca - Università di Salerno

Connected component identification and cluster update on GPU

Author: D. A. Bader
D. B. Kirk
D. Stauffer
J. E. Gentle
K. Binder
Martin Weigel
R. J. Baxter
T. H. Cormen
Publication venue: American Physical Society (APS)
Publication date: 12/06/2011

Cluster identification tasks occur in a multitude of contexts in physics and engineering such as, for instance, cluster algorithms for simulating spin models, percolation simulations, segmentation problems in image processing, or network analysis. While it has been shown that graphics processing units (GPUs) can result in speedups of two to three orders of magnitude as compared to serial codes on CPUs for the case of local and thus naturally parallelized problems such as single-spin flip update simulations of spin models, the situation is considerably more complicated for the non-local problem of cluster or connected component identification. I discuss the suitability of different approaches of parallelization of cluster labeling and cluster update algorithms for calculations on GPU and compare to the performance of serial implementations. Comment: 15 pages, 14 figures, one table, submitted to PR
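As a point of reference for the serial side of that comparison, here is a minimal sketch of connected component labeling on a square lattice using union-find. It assumes nearest-neighbour connectivity and open boundaries. It is a generic textbook approach, not necessarily the serial code the author benchmarks, and the function names are illustrative.

```python
def find(parent, x):
    """Return the root of x, halving the path along the way."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def union(parent, a, b):
    """Merge the sets containing a and b; the smaller root index wins."""
    root_a, root_b = find(parent, a), find(parent, b)
    if root_a != root_b:
        parent[max(root_a, root_b)] = min(root_a, root_b)


def label_clusters(grid):
    """Label occupied sites by cluster root; empty sites get -1."""
    rows, cols = len(grid), len(grid[0])
    parent = list(range(rows * cols))
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c]:
                continue
            idx = r * cols + c
            # Only the right and lower neighbours are needed,
            # since every bond is then visited exactly once.
            if c + 1 < cols and grid[r][c + 1]:
                union(parent, idx, idx + 1)
            if r + 1 < rows and grid[r + 1][c]:
                union(parent, idx, idx + cols)
    return [[find(parent, r * cols + c) if grid[r][c] else -1
             for c in range(cols)] for r in range(rows)]


grid = [[1, 1, 0],
        [0, 1, 0],
        [1, 0, 1]]
labels = label_clusters(grid)
clusters = {label for row in labels for label in row if label >= 0}
print(len(clusters))  # 3
```

The difficulty the abstract points to is visible here: merging two clusters can relabel sites arbitrarily far apart, so the work does not decompose into independent local updates the way a single-spin flip does.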

arXiv.org e-Print Archive