127,967 research outputs found

    Transforming Comparison Model Lower Bounds to the PRAM

    Get PDF
    This note provides general transformations of lower bounds in Valiant'sparallel comparison decision tree model to lower bounds in the priorityconcurrent-read concurrent-write parallel-random-access-machine model.The proofs rely on standard Ramsey-theoretic arguments that simplifythe structure of the computation by restricting the input domain. Thetransformation of comparison model lower bounds, which are usually easierto obtain, to the parallel-random-access-machine, unifies some knownlower bounds and gives new lower bounds for several problems

    Equivalence Classes and Conditional Hardness in Massively Parallel Computations

    Get PDF
    The Massively Parallel Computation (MPC) model serves as a common abstraction of many modern large-scale data processing frameworks, and has been receiving increasingly more attention over the past few years, especially in the context of classical graph problems. So far, the only way to argue lower bounds for this model is to condition on conjectures about the hardness of some specific problems, such as graph connectivity on promise graphs that are either one cycle or two cycles, usually called the one cycle vs. two cycles problem. This is unlike the traditional arguments based on conjectures about complexity classes (e.g., P ? NP), which are often more robust in the sense that refuting them would lead to groundbreaking algorithms for a whole bunch of problems. In this paper we present connections between problems and classes of problems that allow the latter type of arguments. These connections concern the class of problems solvable in a sublogarithmic amount of rounds in the MPC model, denoted by MPC(o(log N)), and some standard classes concerning space complexity, namely L and NL, and suggest conjectures that are robust in the sense that refuting them would lead to many surprisingly fast new algorithms in the MPC model. We also obtain new conditional lower bounds, and prove new reductions and equivalences between problems in the MPC model

    Communication Lower Bounds for Distributed-Memory Computations

    Get PDF
    In this paper we propose a new approach to the study of the communication requirements of distributed computations, which advocates for the removal of the restrictive assumptions under which earlier results were derived. We illustrate our approach by giving tight lower bounds on the communication complexity required to solve several computational problems in a distributed-memory parallel machine, namely standard matrix multiplication, stencil computations, comparison sorting, and the Fast Fourier Transform. Our bounds rely only on a mild assumption on work distribution, and significantly strengthen previous results which require either the computation to be balanced among the processors, or specific initial distributions of the input data, or an upper bound on the size of processors\u27 local memories

    Processor-Oblivious Parallel Stream Computations

    Full text link
    We study the problem of parallel stream computations on a multiprocessor architecture. Modelling the problem, we exhibit that any parallelisation introduces an arithmetic overhead related to intermediate copy operations. We pro-vide lower bounds for the parallel stream computation on p processors of different speeds with two models, a strict model and a buffered model; to our knowledge, these are new results. We introduce a new parallel algorithm called processor-oblivious: it is based on the coupling of a fast sequential algorithm with a fine-grain parallel one that is scheduled by work-stealing. This algorithm is proved asymptotically optimal. We show that our algorithm has a good experimental behaviour. 1

    A branch and bound approach for large pre-marshalling problems

    Full text link
    [EN] The container pre-marshalling problem involves the sorting of containers in stacks so that there are no blocking containers and retrieval is carried out without additional movements. This sorting process should be carried out in as few container moves as possible. Despite recent advancements in solving real world sized problems to optimality, several classes of pre-marshalling problems remain difficult for exact approaches. We propose a branch and bound algorithm with new components for solving such difficult instances. We strengthen existing lower bounds and introduce two new lower bounds that use a relaxation of the pre-marshalling problem to provide tight bounds in specific situations. We introduce generalized dominance rules that help reduce the search space, and a memoization heuristic that finds feasible solutions quickly. We evaluate our approach on standard benchmarks of pre-marshalling instances, as well as on a new dataset to avoid overfitting to the available data. Overall, our approach optimally solves many more instances than previous work, and finds feasible solutions on nearly every problem it encounters in limited CPU times.The authors thank the Paderborn Center for Parallel Computation (PC2) for the use of the Arminius cluster for the computational study in this work. This work has been partially supported by the Spanish Ministry of Science, Innovation, and Universities FPU Grant A-2015-12849 and by the Spanish Ministry of Economy and Competitiveness, under projects DPI2014-53665-P and DPI2015-65895-R, partially financed with FEDER funds.Tanaka, S.; Tierney, K.; Parreño-Torres, C.; Alvarez-Valdes, R.; Ruiz García, R. (2019). A branch and bound approach for large pre-marshalling problems. European Journal of Operational Research. 278(1):211-225. https://doi.org/10.1016/j.ejor.2019.04.005S211225278

    On Characterizing the Data Movement Complexity of Computational DAGs for Parallel Execution

    Get PDF
    Technology trends are making the cost of data movement increasingly dominant, both in terms of energy and time, over the cost of performing arithmetic operations in computer systems. The fundamental ratio of aggregate data movement bandwidth to the total computational power (also referred to the machine balance parameter) in parallel computer systems is decreasing. It is there- fore of considerable importance to characterize the inherent data movement requirements of parallel algorithms, so that the minimal architectural balance parameters required to support it on future systems can be well understood. In this paper, we develop an extension of the well-known red-blue pebble game to develop lower bounds on the data movement complexity for the parallel execution of computational directed acyclic graphs (CDAGs) on parallel systems. We model multi-node multi-core parallel systems, with the total physical memory distributed across the nodes (that are connected through some interconnection network) and in a multi-level shared cache hierarchy for processors within a node. We also develop new techniques for lower bound characterization of non-homogeneous CDAGs. We demonstrate the use of the methodology by analyzing the CDAGs of several numerical algorithms, to develop lower bounds on data movement for their parallel execution
    • …
    corecore