Search CORE

127,967 research outputs found

Transforming Comparison Model Lower Bounds to the PRAM

Author: Breslauer Dany
Dubhashi Devdatt P.
Publication venue: 'Aarhus University Library'
Publication date: 01/02/1995

This note provides general transformations of lower bounds in Valiant's parallel comparison decision tree model to lower bounds in the priority concurrent-read concurrent-write parallel-random-access-machine model. The proofs rely on standard Ramsey-theoretic arguments that simplify the structure of the computation by restricting the input domain. The transformation of comparison model lower bounds, which are usually easier to obtain, to the parallel-random-access-machine, unifies some known lower bounds and gives new lower bounds for several problems.

Tidsskrift.dk (Det Kongelige Bibliotek)

Equivalence Classes and Conditional Hardness in Massively Parallel Computations

Author: Nanongkai Danupon
Scquizzato Michele
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 23rd International Conference on Principles of Distributed Systems (OPODIS 2019)
Publication date: 01/01/2020

The Massively Parallel Computation (MPC) model serves as a common abstraction of many modern large-scale data processing frameworks, and has been receiving increasingly more attention over the past few years, especially in the context of classical graph problems. So far, the only way to argue lower bounds for this model is to condition on conjectures about the hardness of some specific problems, such as graph connectivity on promise graphs that are either one cycle or two cycles, usually called the one cycle vs. two cycles problem. This is unlike the traditional arguments based on conjectures about complexity classes (e.g., P ≠ NP), which are often more robust in the sense that refuting them would lead to groundbreaking algorithms for a whole bunch of problems. In this paper we present connections between problems and classes of problems that allow the latter type of arguments. These connections concern the class of problems solvable in a sublogarithmic amount of rounds in the MPC model, denoted by MPC(o(log N)), and some standard classes concerning space complexity, namely L and NL, and suggest conjectures that are robust in the sense that refuting them would lead to many surprisingly fast new algorithms in the MPC model. We also obtain new conditional lower bounds, and prove new reductions and equivalences between problems in the MPC model.

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Archivio istituzionale della ricerca - Università di Padova

Communication Lower Bounds for Distributed-Memory Computations

Author: Scquizzato Michele
Silvestri Francesco
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 31st International Symposium on Theoretical Aspects of Computer Science (STACS 2014)
Publication date: 20/09/2013

In this paper we propose a new approach to the study of the communication requirements of distributed computations, which advocates for the removal of the restrictive assumptions under which earlier results were derived. We illustrate our approach by giving tight lower bounds on the communication complexity required to solve several computational problems in a distributed-memory parallel machine, namely standard matrix multiplication, stencil computations, comparison sorting, and the Fast Fourier Transform. Our bounds rely only on a mild assumption on work distribution, and significantly strengthen previous results which require either the computation to be balanced among the processors, or specific initial distributions of the input data, or an upper bound on the size of processors' local memories.

arXiv.org e-Print Archive

CiteSeerX

Dagstuhl Research Online Publication Server

Archivio istituzionale della ricerca - Università di Padova

Processor-Oblivious Parallel Stream Computations

Author: Daouda Traore
Jean-Louis Roch
Julien Bernard
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2008

We study the problem of parallel stream computations on a multiprocessor architecture. Modelling the problem, we exhibit that any parallelisation introduces an arithmetic overhead related to intermediate copy operations. We provide lower bounds for the parallel stream computation on p processors of different speeds with two models, a strict model and a buffered model; to our knowledge, these are new results. We introduce a new parallel algorithm called processor-oblivious: it is based on the coupling of a fast sequential algorithm with a fine-grain parallel one that is scheduled by work-stealing. This algorithm is proved asymptotically optimal. We show that our algorithm has a good experimental behaviour.

CiteSeerX

Crossref

A branch and bound approach for large pre-marshalling problems

Author: Alvarez-Valdes Ramón
Parreño-Torres Consuelo
Ruiz García Rubén
Tanaka Shunji
Tierney Kevin
Publication venue: 'Elsevier BV'
Publication date: 01/01/2019

The container pre-marshalling problem involves the sorting of containers in stacks so that there are no blocking containers and retrieval is carried out without additional movements. This sorting process should be carried out in as few container moves as possible. Despite recent advancements in solving real world sized problems to optimality, several classes of pre-marshalling problems remain difficult for exact approaches. We propose a branch and bound algorithm with new components for solving such difficult instances. We strengthen existing lower bounds and introduce two new lower bounds that use a relaxation of the pre-marshalling problem to provide tight bounds in specific situations. We introduce generalized dominance rules that help reduce the search space, and a memoization heuristic that finds feasible solutions quickly. We evaluate our approach on standard benchmarks of pre-marshalling instances, as well as on a new dataset to avoid overfitting to the available data. Overall, our approach optimally solves many more instances than previous work, and finds feasible solutions on nearly every problem it encounters in limited CPU times.

The authors thank the Paderborn Center for Parallel Computation (PC2) for the use of the Arminius cluster for the computational study in this work. This work has been partially supported by the Spanish Ministry of Science, Innovation, and Universities FPU Grant A-2015-12849 and by the Spanish Ministry of Economy and Competitiveness, under projects DPI2014-53665-P and DPI2015-65895-R, partially financed with FEDER funds.

Tanaka, S.; Tierney, K.; Parreño-Torres, C.; Alvarez-Valdes, R.; Ruiz García, R. (2019). A branch and bound approach for large pre-marshalling problems. European Journal of Operational Research. 278(1):211-225. https://doi.org/10.1016/j.ejor.2019.04.005

RiuNet

Publications at Bielefeld University

On Characterizing the Data Movement Complexity of Computational DAGs for Parallel Execution

Author: Elango Venmugil
Pouchet Louis-Noël
Ramanujam J.
Rastello Fabrice
Sadayappan P.
Publication venue
Publication date: 01/01/2014

Technology trends are making the cost of data movement increasingly dominant, both in terms of energy and time, over the cost of performing arithmetic operations in computer systems. The fundamental ratio of aggregate data movement bandwidth to the total computational power (also referred to as the machine balance parameter) in parallel computer systems is decreasing. It is therefore of considerable importance to characterize the inherent data movement requirements of parallel algorithms, so that the minimal architectural balance parameters required to support them on future systems can be well understood. In this paper, we develop an extension of the well-known red-blue pebble game to develop lower bounds on the data movement complexity for the parallel execution of computational directed acyclic graphs (CDAGs) on parallel systems. We model multi-node multi-core parallel systems, with the total physical memory distributed across the nodes (that are connected through some interconnection network) and in a multi-level shared cache hierarchy for processors within a node. We also develop new techniques for lower bound characterization of non-homogeneous CDAGs. We demonstrate the use of the methodology by analyzing the CDAGs of several numerical algorithms, to develop lower bounds on data movement for their parallel execution.

arXiv.org e-Print Archive

CiteSeerX

Crossref

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server