262 research outputs found

    Formal Semantics of a Subset of the Paderborn's BSPlib

    PUB (Paderborn University BSPLib) is a C library supporting the development of Bulk-Synchronous Parallel (BSP) algorithms. The BSP model allows estimation of execution time and avoids deadlocks and nondeterminism. This paper presents a formal operational semantics for a C+PUB subset language using the Coq proof assistant, together with a certified N-body computation as an example of using this formal semantics.
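
    The execution-time estimates that BSP enables rest on the standard superstep cost formula. The sketch below is a minimal illustration of that formula, not code from the paper; the parameter values are invented for the example.

```python
# Sketch of the standard BSP cost model: a superstep with local work w and
# h-relation h costs w + h*g + L, where g is the per-word communication cost
# and L the barrier synchronization latency. All numbers are illustrative.

def bsp_cost(supersteps, g, L):
    """Total predicted time for a list of (w, h) supersteps."""
    return sum(w + h * g + L for w, h in supersteps)

# Example: a toy computation with two supersteps.
print(bsp_cost([(1000, 50), (400, 10)], g=4.0, L=100.0))  # → 1840.0
```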

    Composition of Efficient Nested BSP Algorithms: Minimum Spanning Tree Computation as an Instructive Example

    We report on the results of an automatic configuration approach for implementing complex parallel BSP algorithms. In this approach, a parallel algorithm is described by a sequence of instructions and of subproblems that are solved by other parallel algorithms called as subroutines, together with a mathematical description of its own running time. There may also be free algorithmic parameters, such as the degree of the trees in the data structures used, that affect the running time. Because the running time of an algorithm depends on several machine parameters, on the choice of the free algorithmic parameters, and on the choice of the parallel subroutines (to which the same considerations apply in turn), composing the parallel program for a concrete parallel machine from all these ingredients is a difficult task. We have implemented such a configuration system using the Paderborn University BSP library, and we present the theoretical and experimental results of implementations of sophisticated minimum spanning tree algorithms as an instructive example.
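
    The core idea, selecting the implementation and parameter values that minimize a predicted running time, can be sketched as follows. This is a hedged illustration of the general autotuning-by-cost-model technique, not the paper's actual system; the candidate cost functions and machine parameters are invented.

```python
# Hypothetical configurator: each candidate exposes a predicted running time
# as a function of machine parameters and one free algorithmic parameter;
# the configurator picks the cheapest (implementation, parameter) pair.

def configure(candidates, machine):
    """candidates: list of (name, param_range, predict), where
    predict(machine, k) returns an estimated running time."""
    return min(
        ((pred(machine, k), name, k)
         for name, params, pred in candidates
         for k in params),
        key=lambda t: t[0],
    )  # returns (predicted_time, algorithm_name, chosen_parameter)

# Illustrative candidates: a tree-based algorithm with free degree k,
# versus a flat all-to-all variant with no free parameter.
machine = {"p": 64, "g": 4.0, "L": 200.0}
candidates = [
    ("tree", range(2, 9),
     lambda m, k: k * m["g"] * 100 + (m["L"] * 10) / k),
    ("flat", [0],
     lambda m, k: m["p"] * m["g"] * 50),
]
print(configure(candidates, machine))  # → (1800.0, 'tree', 2)
```

    In a real system the prediction formulas would themselves be parameterized by recursively configured subroutines, as the abstract describes.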

    Analysis and tools for performance prediction

    We present an analytical model that extends BSP to cover both oblivious synchronization and group partitioning. A few oversimplifications in BSP make it difficult to obtain accurate predictions. Even if the numbers of individual communication or computation operations in two stages are the same, the actual times of the two stages may differ. These differences are due to the separate nature of the operations or to the particular pattern followed by the messages. Even worse, the assumption that a constant number of machine instructions takes constant time is far from the truth: with current memory hierarchies, memory access times vary from a few cycles to several thousand. A natural proposal is to associate a different proportionality constant with each basic block and, analogously, different latencies and bandwidths with each “communication block”. Unfortunately, this approach implies that the evaluation parameters depend not only on the given architecture but also on algorithm characteristics, so the parameter evaluation must be repeated for every algorithm. This is a heavy task, involving experiment design, timing, statistics, pattern recognition, and multi-parameter fitting algorithms, so software support is required. We have developed a compiler that takes as source a C program annotated with complexity formulas and produces as output an instrumented code. The trace files obtained from executing the resulting code are analyzed with an interactive interpreter, which gives us, among other information, the values of those parameters.
    Track: Concurrent Programming. Red de Universidades con Carreras en Informática (RedUNCI).
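
    The fitting step behind "a different proportionality constant with each basic block" can be sketched as a one-dimensional least-squares fit per block. This is an illustrative reconstruction under the assumption that each block's time is proportional to its execution count; the measurements below are synthetic, not data from the paper.

```python
# Hypothetical per-block fit: block i is assumed to cost c_i * n_i, where
# n_i is its execution count; c_i is recovered from timed runs by
# least squares, minimizing sum_j (t_j - c * n_j)^2.

def fit_block_constant(counts, times):
    """Least-squares fit of t ≈ c * n over paired observations."""
    num = sum(n * t for n, t in zip(counts, times))
    den = sum(n * n for n in counts)
    return num / den

# Synthetic trace data for one basic block (count, measured time).
counts = [100, 200, 400]
times = [205.0, 398.0, 810.0]
print(round(fit_block_constant(counts, times), 3))  # → 2.02
```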

    Optimal Reconfiguration of Formation Flying Spacecraft--a Decentralized Approach

    This paper introduces a hierarchical, decentralized, and parallelizable method for dealing with optimization problems involving many agents. It is theoretically based on a hierarchical optimization theorem that establishes the equivalence of two forms of the problem, and this idea is implemented using DMOC (Discrete Mechanics and Optimal Control). The result is a method that is scalable to certain optimization problems for large numbers of agents, whereas the usual “monolithic” approach can only deal with systems with a rather small number of degrees of freedom. The method is illustrated with the example of deployment of spacecraft, motivated by the Darwin (ESA) and Terrestrial Planet Finder (NASA) missions.

    Optimal broadcast on parallel locality models

    Abstract: In this paper, matching upper and lower bounds for broadcast on general-purpose parallel computation models that exploit network locality are proven. These models try to capture the general-purpose properties of models like the PRAM or BSP on the one hand, and to exploit the network locality of special-purpose models like meshes, hypercubes, etc., on the other. They do so by charging a cost l(|i − j|) for a communication between processors i and j, where l is a suitably chosen latency function. An upper bound T(p) = ∑_{i=0}^{log log p} 2^i · l(p^{1/2^i}) on the runtime of a broadcast on a p-processor H-PRAM is given, for an arbitrary latency function l(k). The main contribution of the paper is a matching lower bound, holding for all latency functions in the range from l(k) = Ω(log k / log log k) to l(k) = O(log² k). This is not a severe restriction, since for latency functions l(k) = O(log k / log^{1+ε} log k) with arbitrary ε > 0, the runtime of the algorithm matches the trivial lower bound Ω(log p), and for l(k) = Θ(log^{1+ε} k) or l(k) = Θ(k^ε), the runtime matches the other trivial lower bound Ω(l(p)). Both upper and lower bounds also apply to other parallel locality models like Y-PRAM, D-BSP, and E-BSP.
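
    The upper bound T(p) = ∑_{i=0}^{log log p} 2^i · l(p^{1/2^i}) is straightforward to evaluate numerically for a concrete latency function. The sketch below simply tabulates the stated formula; the choice of l and of p is illustrative, not from the paper.

```python
# Evaluate the stated broadcast upper bound
#   T(p) = sum_{i=0}^{log log p} 2^i * l(p^(1/2^i))
# for a user-supplied latency function l.
import math

def broadcast_bound(p, latency):
    total = 0.0
    top = int(math.log2(math.log2(p)))  # summation runs up to log log p
    for i in range(top + 1):
        total += (2 ** i) * latency(p ** (1.0 / 2 ** i))
    return total

# With unit latency l(k) = 1 and p = 2^16, the sum is 1+2+4+8+16.
print(broadcast_bound(2 ** 16, lambda k: 1.0))  # → 31.0
```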

    Implementation of a Parallel Digital Digest

    The growth in the amount of information made available on the Internet through the Web poses the challenge of serving, in the shortest possible time, the clients who search over that information, while also making more efficient use of resources. Parallel computation models make it possible to approach this goal. This work presents an efficient, low-cost solution, based on the Bulk Synchronous Parallel computation model, for implementing a Digital Digest based on a parallel search engine that uses relational databases in a Web-access environment.
    Track: Distributed and Parallel Processing. Red de Universidades con Carreras en Informática (RedUNCI).

    Flexible Management on BSP Process Rescheduling: Offering Migration at Middleware and Application Levels

    This article describes the rationale for developing jMigBSP, a Java programming library that offers object rescheduling. It was designed to work on grid computing environments and offers an interface that follows the BSP (Bulk Synchronous Parallel) style. jMigBSP's main contribution is its rescheduling facility, offered in two different ways: (i) through migration directives used directly in the application code, and (ii) through automatic load balancing at the middleware level. This second idea is feasible thanks to Java's inheritance feature, which transforms a simple jMigBSP application into a migratable one by changing a single line of code. In addition, the presented library makes object interaction easier by providing one-sided message-passing directives, and it hides network latency through asynchronous communication. Finally, we developed three BSP applications: (i) Prefix Sum; (ii) Fractal Image Compression (FIC); and (iii) Fast Fourier Transform (FFT). They show that our library is a viable solution for offering load balancing in BSP applications. In particular, the FIC results show gains of up to 37% when migration directives are applied inside the code, and the FFT tests emphasize the strength of jMigBSP: in this situation, it outperforms a native library, BSPlib, when migration facilities take place.
    Keywords: Bulk Synchronous Parallel, rescheduling, Java, adaptation, object migration, grid computing

    On Dynamic Graph Partitioning and Graph Clustering using Diffusion


    A calculus of functional BSP programs with projection
