Search CORE

13,565 research outputs found

Architecture independent parallel selection with applications to parallel priority queues

Author: Gerbessiotis Alexandros V.
Siniolakis Constantinos J.
Publication venue: Elsevier Science B.V.
Publication date: 14/05/2003
Field of study

AbstractWe present a randomized selection algorithm whose performance is analyzed in an architecture independent way on the bulk-synchronous parallel (BSP) model of computation along with an application of this algorithm to dynamic data structures, namely parallel priority queues. We show that our algorithms improve previous results upon both the communication requirements and the amount of parallel slack required to achieve optimal performance. We also establish that optimality to within small multiplicative constant factors can be achieved for a wide range of parallel machines. While these algorithms are fairly simple themselves, descriptions of their performance in terms of the BSP parameters is somewhat involved; the main reward of quantifying these complications is that it allows transportable software to be written for parallel machines that fit the model

Elsevier - Publisher Connector

A load-sharing architecture for high performance optimistic simulations on multi-core machines

Author: Pellegrini Alessandro
Quaglia Francesco
Vitali Roberto
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2012
Field of study

In Parallel Discrete Event Simulation (PDES), the simulation model is partitioned into a set of distinct Logical Processes (LPs) which are allowed to concurrently execute simulation events. In this work we present an innovative approach to load-sharing on multi-core/multiprocessor machines, targeted at the optimistic PDES paradigm, where LPs are speculatively allowed to process simulation events with no preventive verification of causal consistency, and actual consistency violations (if any) are recovered via rollback techniques. In our approach, each simulation kernel instance, in charge of hosting and executing a specific set of LPs, runs a set of worker threads, which can be dynamically activated/deactivated on the basis of a distributed algorithm. The latter relies in turn on an analytical model that provides indications on how to reassign processor/core usage across the kernels in order to handle the simulation workload as efficiently as possible. We also present a real implementation of our load-sharing architecture within the ROme OpTimistic Simulator (ROOT-Sim), namely an open-source C-based simulation platform implemented according to the PDES paradigm and the optimistic synchronization approach. Experimental results for an assessment of the validity of our proposal are presented as well

Crossref

ART

Archivio della ricerca- Università di Roma La Sapienza

Randomized priority queues for fast parallel access

Author: Sanders Peter
Publication venue
Publication date: 02/08/2007
Field of study

Applications like parallel search or discrete event simulation often assign priority or importance to pieces of work. An effective way to exploit this for parallelization is to use a priority queue data structure for scheduling the work; but a bottleneck free implementation of parallel priority queue access by many processors is required to make this approach scalable. We present simple and portable randomized algorithms for parallel priority queues on distributed memory machines with fully distributed storage. Accessing O(n) out of m elements on an n-processor network with diameter d requires amortized time O(d + log m/n) with high probability for many network types. On logarithmic diameter networks, the algorithms are as fast as the best previously known EREW-PRAM methods. Implementations demonstrate that the approach is already useful for medium scale parallelism

KITopen

A Bulk-Parallel Priority Queue in External Memory with STXXL

Author: GS Brodal
J Singler
JS Vitter
L Arge
MC Pinotti
N Deo
P Sanders
P Sanders
PJ Varman
R Dementiev
Publication venue
Publication date: 01/01/2015
Field of study

We propose the design and an implementation of a bulk-parallel external memory priority queue to take advantage of both shared-memory parallelism and high external memory transfer speeds to parallel disks. To achieve higher performance by decoupling item insertions and extractions, we offer two parallelization interfaces: one using "bulk" sequences, the other by defining "limit" items. In the design, we discuss how to parallelize insertions using multiple heaps, and how to calculate a dynamic prediction sequence to prefetch blocks and apply parallel multiway merge for extraction. Our experimental results show that in the selected benchmarks the priority queue reaches 75% of the full parallel I/O bandwidth of rotational disks and and 65% of SSDs, or the speed of sorting in external memory when bounded by computation.Comment: extended version of SEA'15 conference pape

arXiv.org e-Print Archive

Crossref

KITopen

A Power Cap Oriented Time Warp Architecture

Author: Ciciani Bruno
Cingolani Davide
Conoci Stefano
Di Sanzo Pierangelo
Pellegrini Alessandro
Quaglia Francesco
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2018
Field of study

Controlling power usage has become a core objective in modern computing platforms. In this article we present an innovative Time Warp architecture oriented to efficiently run parallel simulations under a power cap. Our architectural organization considers power usage as a foundational design principle, as opposed to classical power-unaware Time Warp design. We provide early experimental results showing the potential of our proposal

Crossref

Archivio della Ricerca - Università di Roma 3

ART

Archivio della ricerca- Università di Roma La Sapienza