21 research outputs found

Network-Oblivious Algorithms

Author: Andrea Pietracaprina
Bilardi Gianfranco
de la Torre Pilar
Francesco Silvestri
Geppino Pucci
Gerth
Gianfranco Bilardi
Herley Kieran T.
Michele Scquizzato
Rezaul
Scquizzato Michele
Silvestri Francesco
Tang Yuan
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2016
Field of study

A framework is proposed for the design and analysis of network-oblivious algorithms, namely algorithms that can run unchanged, yet efficiently, on a variety of machines characterized by different degrees of parallelism and communication capabilities. The framework prescribes that a network-oblivious algorithm be specified on a parallel model of computation where the only parameter is the problem's input size, and then evaluated on a model with two parameters, capturing parallelism granularity and communication latency. It is shown that for a wide class of network-oblivious algorithms, optimality in the latter model implies optimality in the decomposable bulk synchronous parallel model, which is known to effectively describe a wide and significant class of parallel platforms. The proposed framework can be regarded as an attempt to port the notion of obliviousness, well established in the context of cache hierarchies, to the realm of parallel computation. Its effectiveness is illustrated by providing optimal network-oblivious algorithms for a number of key problems. Some limitations of the oblivious approach are also discussed.

Crossref

The IT University of Copenhagen's Repository

Archivio istituzionale della ricerca - Università di Padova

Deterministic Simulations of Shared Memory on Bounded Degree Networks

Author: Herley Kieran T.
Publication venue: 'SAGE Publications'
Publication date: 01/02/1990
Field of study

The Parallel Random Access Machine (PRAM) is an abstract parallel machine consisting of a synchronous collection of $n$ processors connected to a shared memory of $m$ cells. The essential feature of the PRAM is that the processors can access any $n$-tuple of distinct cells in a single machine cycle. While the PRAM is an attractive and widely used framework for the design and analysis of parallel algorithms, it does not reflect the constraints of realistic multiprocessors. This thesis explores the problem of efficient deterministic simulations of PRAM computations on bounded degree networks of processors, a model of parallel machines closer to what can be built in practice. It is shown that an arbitrary step of a PRAM with $n$ processors and $m \geq n$ cells of shared memory can be simulated in $O(\log (m/n) \log n/\log \log n + \log n \log \log n (\log \log (m/n) - \log \log \log n))$ time in the worst case on an $n$-node bounded degree network with a particular expander-based structure. This simulation is more efficient than all previously known deterministic simulations with respect to both time and space. In the case where $m/n$ is polylogarithmic in $n$, the worst-case time to simulate a single PRAM step is at most $O(\log n \log \log n)$, which is within a factor of $O(\log \log n)$ of the diameter of the network. The space requirements for our algorithm are at most $O(\ldots (\log (m/n))^{3})$ overall. The simulation may also be adapted to run on an $n$-processor augmented mesh-of-trees architecture with a running time of $O(\log n \log \log n (\log \log (m/n) - \log \log \log n) + \log (m/n))$. Overall, these results suggest that, in principle at least, it is feasible to provide the abstraction of a shared memory on distributed models of parallel computation with only modest degradation in performance in the worst case.
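
As a quick check of the polylogarithmic case, the general bound can be specialized by hand. The sketch below reads the first term of the bound as $\log(m/n) \log n/\log \log n$, which is an interpretation of the abstract's formula, and assumes for illustration that $m/n = (\log n)^{k}$ for a constant $k \geq 2$ (the symbol $k$ does not appear in the abstract).

```latex
\begin{align*}
\log(m/n) &= k \log\log n, \\
\log\log(m/n) &= \log\log\log n + \log k, \\
\frac{\log(m/n)\,\log n}{\log\log n} &= k \log n, \\
\log n\,\log\log n\,\bigl(\log\log(m/n) - \log\log\log n\bigr) &= \log k \cdot \log n\,\log\log n, \\
\text{total} &= O\bigl(k \log n + \log k \cdot \log n\,\log\log n\bigr) = O(\log n\,\log\log n).
\end{align*}
```

Since any $n$-node bounded degree network has diameter $\Omega(\log n)$, this total is within an $O(\log \log n)$ factor of the diameter, in agreement with the claim in the abstract.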

eCommons (Cornell Univ.)

Improved Bounds for the Token Distribution Problem

Author: Herley Kieran T.
Publication venue: 'SAGE Publications'
Publication date: 01/10/1989
Field of study

The problem of packet routing on bounded degree networks is considered. An algorithm is presented that can route $n$ packets in $O(\log n + K)$ time on a particular $n$-node expander-based network provided that no more than $K$ packets share the same source or destination.
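
The parameter $K$ in the bound above can be read off a routing instance directly. The following is a minimal sketch, assuming packets are given as (source, destination) pairs; the function name `congestion_parameter` and the sample instance are hypothetical and are not taken from the paper.

```python
from collections import Counter


def congestion_parameter(packets):
    """Return K: the largest number of packets that share one source
    or one destination. `packets` is a list of (source, destination) pairs."""
    if not packets:
        return 0
    by_source = Counter(src for src, _ in packets)
    by_destination = Counter(dst for _, dst in packets)
    return max(max(by_source.values()), max(by_destination.values()))


# Hypothetical instance: three packets converge on node 3,
# and node 3 itself sends two packets.
sample = [(0, 3), (1, 3), (2, 3), (3, 0), (3, 1)]
K = congestion_parameter(sample)
```

For a permutation routing instance, where every node sends and receives exactly one packet, this gives K = 1 and the stated bound becomes $O(\log n)$.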

eCommons (Cornell Univ.)

… independently does not generally yield optimal performance … appropriate number of point-to-point messages and routing … time is required to route … messages, where … messages …

Author: Kieran T. Herley
Publication venue
Publication date
Field of study

… must be delivered to any individual node. The lower bound … The randomized algorithm attains optimal performance, while the deterministic algorithm is slower by a factor of O log n … algorithms for one-to-many routing, which use constant-size buffers at … We also describe an optimal deterministic algorithm that, however, requires large buffers of size … 1. Routing primitives fo…

CiteSeerX

Deterministic Simulations of PRAMs on Bounded Degree Networks

Author: Bilardi Gianfranco
Herley Kieran T.
Publication venue: 'SAGE Publications'
Publication date: 01/11/1988
Field of study

The problem of simulating a PRAM with $n$ processors and memory size $m \geq n$ on an $n$-node bounded degree network is considered. A scheme is presented which simulates an arbitrary PRAM step in $O((\log n \log m)/\log \log n)$ time in the worst case on an expander-based network. By extending a previously established lower bound, it is shown that the proposed simulation is optimal whenever $\Omega(n^{1 + \epsilon}) \leq m \leq O(2^{(\log n)^{\alpha}})$ for some $\epsilon > 0$ and some $\alpha > 0$.

eCommons (Cornell Univ.)

Mesh ∗

Author: Andrea Pietracaprina
Kieran T. Herley
Publication venue
Publication date
Field of study

We study the complexity of routing a set of messages with multiple destinations (multicast routing) on an n-node square mesh under the store-and-forward model. A standard argument proves that Ω(√(cn)) time is required to route n messages, where each message is generated by a distinct node and at most c messages are to be delivered to any individual node. The obvious approach of simply replicating each message into the appropriate number of unicast (single-destination) messages and routing these independently does not yield an optimal algorithm. We provide both randomized and deterministic algorithms for multicast routing, which use constant-size buffers at each node. The randomized algorithm attains optimal performance, while the deterministic algorithm is slower by a factor of O(log^2 n). We also describe an optimal deterministic algorithm that, however, requires large buffers of size O(c).

CiteSeerX

Deterministic Branch-And-Bound On Distributed Memory Machines

Author: Andrea Pietracaprina
Geppino Pucci
Kieran T. Herley
Publication venue
Publication date: 01/01/1999
Field of study

The branch-and-bound problem involves determining the leaf of minimum cost in a cost-labelled, heap-ordered tree, subject to the constraint that only the root is known initially and that the children of a node are revealed only by visiting their parent. We present the first efficient deterministic algorithm to solve the branch-and-bound problem for a tree T of constant degree on a p-processor distributed-memory Optically Connected Parallel Computer (OCPC). Let c* be the cost of the minimum-cost leaf in T, and let n and h be the number of nodes and the height, respectively, of the subtree T* ⊆ T of nodes whose cost is at most c*. When accounting for both computation and communication costs, our algorithm runs in time O(n/p + h (max{p, log n log p})^2) for general values of n, and can be made to run in time O((n/p + h log^4 p) log log p) for n polynomial in p. For large ranges of the relevant parameters, our algorithm is provab..
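
To make the problem statement concrete, here is a minimal sequential sketch of best-first branch-and-bound on a heap-ordered tree. It is not the paper's parallel OCPC algorithm; it only illustrates the problem and the role of the subtree of nodes whose cost is at most that of the minimum-cost leaf. The function `min_cost_leaf`, the callbacks `cost` and `children`, and the sample tree are all hypothetical.

```python
import heapq
from itertools import count


def min_cost_leaf(root, cost, children):
    """Sequential best-first search on a heap-ordered tree.

    `children(node)` reveals a node's children only when the node is visited;
    `cost(node)` must satisfy cost(child) >= cost(parent) (heap order).
    Returns (minimum-cost leaf, number of nodes visited)."""
    tie_breaker = count()  # avoids comparing nodes when costs are equal
    frontier = [(cost(root), next(tie_breaker), root)]
    visited = 0
    while frontier:
        _, _, node = heapq.heappop(frontier)
        visited += 1
        kids = children(node)
        if not kids:
            # Heap order guarantees every unexplored leaf costs at least this much.
            return node, visited
        for kid in kids:
            heapq.heappush(frontier, (cost(kid), next(tie_breaker), kid))
    return None, visited


# Hypothetical heap-ordered tree: leaves are c (7), d (4), e (5).
TREE = {"r": ["a", "b"], "a": ["c", "d"], "b": ["e"], "c": [], "d": [], "e": []}
COST = {"r": 1, "a": 2, "b": 3, "c": 7, "d": 4, "e": 5}

leaf, visited = min_cost_leaf("r", COST.__getitem__, TREE.__getitem__)
```

On this sample the search returns leaf `d` after visiting 4 nodes (r, a, b, d), which are exactly the nodes of cost at most 4; the size and height of that subtree are the quantities n and h in the running-time bounds above.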

CiteSeerX

Archivio istituzionale della ricerca - Università di Padova

Implementing Shared Memory on Mesh-Connected Computers and on the Fat-Tree

Author: Andrea Pietracaprina
Geppino Pucci
Kieran T. Herley
Publication venue
Publication date: 01/01/2001
Field of study

We present deterministic upper and lower bounds on the slowdown required to simulate an (n, m)-PRAM on a variety of networks. The upper bounds are based on a novel scheme that exploits the splitting and combining of messages. This scheme can be implemented on an n-node d-dimensional mesh (for constant d) and on an n-leaf pruned butterfly and attains the smallest worst-case slowdown to date for such interconnections, namely, O(n^{1/d} (log(m/n))^{1-1/d}) for the d-dimensional mesh (with constant d) and O(√n log(m/n)) for the pruned butterfly. In fact, the simulation on the pruned butterfly is the first PRAM simulation scheme on an area-universal network. Finally, we prove restricted and unrestricted lower bounds on the slowdown of any deterministic PRAM simulation on an arbitrary network, formulated in terms of the bandwidth properties of the interconnection as expressed by its decomposition tree.

CiteSeerX

Elsevier - Publisher Connector

Archivio istituzionale della ricerca - Università di Padova

Deterministic Parallel Backtrack Search

Author: Andrea Pietracaprina
Geppino Pucci
Kieran T. Herley
Publication venue
Publication date: 01/01/2002
Field of study

The backtrack search problem involves visiting all the nodes of an arbitrary binary tree given a pointer to its root, subject to the constraint that the children of a node are revealed only after their parent is visited. We present a fast, deterministic backtrack search algorithm for a p-processor COMMON CRCW-PRAM, which visits any n-node tree of height h in time O((n/p + h)(log log log p)^2). This upper bound compares favourably with a natural Ω(n/p + h) lower bound for this problem. Our approach embodies novel, efficient techniques for dynamically assigning tree-nodes to processors to ensure that the work is shared equitably among them. Key words: Backtrack search. Load balancing. PRAM model. Parallel algorithms. 1 Introduction Several algorithmic techniques, such as those employed for solving many optimization problems, are based on the systematic exploration of a tree, whose internal nodes correspond to partial solutions (growing progressively more refined with increasing depth)..

CiteSeerX

Elsevier - Publisher Connector

Crossref

Archivio istituzionale della ricerca - Università di Padova