Search CORE

5,602 research outputs found

Revisiting Matrix Product on Master-Worker Platforms

Author: Dongarra Jack
Laboratoire de l'informatique du parallélisme
Pineau Jean-François
Robert Yves
Shi Zhiao
Vivien Frédéric
Publication venue
Publication date: 01/01/2006
Field of study

This paper is aimed at designing efficient parallel matrix-product algorithms for heterogeneous master-worker platforms. While matrix-product is well-understood for homogeneous 2D-arrays of processors (e.g., Cannon algorithm and ScaLAPACK outer product algorithm), there are three key hypotheses that render our work original and innovative: - Centralized data. We assume that all matrix files originate from, and must be returned to, the master. - Heterogeneous star-shaped platforms. We target fully heterogeneous platforms, where computational resources have different computing powers. - Limited memory. Because we investigate the parallelization of large problems, we cannot assume that full matrix panels can be stored in the worker memories and re-used for subsequent updates (as in ScaLAPACK). We have devised efficient algorithms for resource selection (deciding which workers to enroll) and communication ordering (both for input and result messages), and we report a set of numerical experiments on various platforms at Ecole Normale Superieure de Lyon and the University of Tennessee. However, we point out that in this first version of the report, experiments are limited to homogeneous platforms

arXiv.org e-Print Archive

HAL-ENS-LYON

CiteSeerX

Crossref

INRIA a CCSD electronic archive server

Libre Acces aux Rapports Scientifiques et Techniques

The University of Manchester - Institutional Repository

Hal-Diderot

HAL-Rennes 1

Performance Reproduction and Prediction of Selected Dynamic Loop Scheduling Experiments

Author: Ciorba Florina M.
Eleliemy Ahmed
Mohammed Ali
Publication venue
Publication date: 01/01/2018
Field of study

Scientific applications are complex, large, and often exhibit irregular and stochastic behavior. The use of efficient loop scheduling techniques in computationally-intensive applications is crucial for improving their performance on high-performance computing (HPC) platforms. A number of dynamic loop scheduling (DLS) techniques have been proposed between the late 1980s and early 2000s, and efficiently used in scientific applications. In most cases, the computing systems on which they have been tested and validated are no longer available. This work is concerned with the minimization of the sources of uncertainty in the implementation of DLS techniques to avoid unnecessary influences on the performance of scientific applications. Therefore, it is important to ensure that the DLS techniques employed in scientific applications today adhere to their original design goals and specifications. The goal of this work is to attain and increase the trust in the implementation of DLS techniques in present studies. To achieve this goal, the performance of a selection of scheduling experiments from the 1992 original work that introduced factoring is reproduced and predicted via both, simulative and native experimentation. The experiments show that the simulation reproduces the performance achieved on the past computing platform and accurately predicts the performance achieved on the present computing platform. The performance reproduction and prediction confirm that the present implementation of the DLS techniques considered both, in simulation and natively, adheres to their original description. The results confirm the hypothesis that reproducing experiments of identical scheduling scenarios on past and modern hardware leads to an entirely different behavior from expected

arXiv.org e-Print Archive

Crossref

edoc

On data skewness, stragglers, and MapReduce progress indicators

Author: Chambers J. M.
Dai J.
Gufler B.
Herodotou H.
Herodotou H.
Li J.
Ousterhout K.
Zaharia M.
Publication venue
Publication date: 01/01/2015
Field of study

We tackle the problem of predicting the performance of MapReduce applications, designing accurate progress indicators that keep programmers informed on the percentage of completed computation time during the execution of a job. Through extensive experiments, we show that state-of-the-art progress indicators (including the one provided by Hadoop) can be seriously harmed by data skewness, load unbalancing, and straggling tasks. This is mainly due to their implicit assumption that the running time depends linearly on the input size. We thus design a novel profile-guided progress indicator, called NearestFit, that operates without the linear hypothesis assumption and exploits a careful combination of nearest neighbor regression and statistical curve fitting techniques. Our theoretical progress model requires fine-grained profile data, that can be very difficult to manage in practice. To overcome this issue, we resort to computing accurate approximations for some of the quantities used in our model through space- and time-efficient data streaming algorithms. We implemented NearestFit on top of Hadoop 2.6.0. An extensive empirical assessment over the Amazon EC2 platform on a variety of real-world benchmarks shows that NearestFit is practical w.r.t. space and time overheads and that its accuracy is generally very good, even in scenarios where competitors incur non-negligible errors and wide prediction fluctuations. Overall, NearestFit significantly improves the current state-of-art on progress analysis for MapReduce

arXiv.org e-Print Archive

Crossref

Archivio della ricerca- LUISS Libera Università Internazionale degli Studi Sociali Guido Carli di Roma

Archivio della ricerca- Università di Roma La Sapienza

Surveillance and alienation in the online economy

Author: Andrejevic Mark B.
Publication venue: Surveillance Studies Network
Publication date: 01/01/2011
Field of study

The critical literature on commercial monitoring and so-called ‘free labour’ (Terranova 2000) locates exploitation in realms beyond the workplace proper, noting the productivity of networked activity including the creation of user-generated-content and the profitability of commercial sites for social networking and communication. The changing context of productivity in these realms, however, requires further development of a critical concept of exploitation. This article defines exploitation as the extraction of unpaid, coerced, and alienated labour. It considers how such a definition might apply to various forms of unpaid but profit-generating online activity, arguing that commercial monitoring redoubles the conscious, intentional activity of users in ways that render it amenable to a critique of exploitation. Given the role of commercial monitoring in the emerging online economy, the paper emphasizes the importance of supplementing privacy critiques with approaches that identify the ways in which new forms of surveillance represent a form of power that seeks to manage and control consumer behaviour

University of Queensland eSpace

Recommended from our members

A grid computing framework for commercial simulation packages

Author: Mustafee Navonil
Publication venue: Brunel University, School of Information Systems, Computing and Mathematics
Publication date: 01/01/2007
Field of study

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.An increased need for collaborative research among different organizations, together with continuing advances in communication technology and computer hardware, has facilitated the development of distributed systems that can provide users non-trivial access to geographically dispersed computing resources (processors, storage, applications, data, instruments, etc.) that are administered in multiple computer domains. The term grid computing or grids is popularly used to refer to such distributed systems. A broader definition of grid computing includes the use of computing resources within an organization for running organization-specific applications. This research is in the context of using grid computing within an enterprise to maximize the use of available hardware and software resources for processing enterprise applications. Large scale scientific simulations have traditionally been the primary benefactor of grid computing. The application of this technology to simulation in industry has, however, been negligible. This research investigates how grid technology can be effectively exploited by simulation practitioners using Windows-based commercially available simulation packages to model simulations in industry. These packages are commonly referred to as Commercial Off-The-Shelf (COTS) Simulation Packages (CSPs). The study identifies several higher level grid services that could be potentially used to support the practise of simulation in industry. It proposes a grid computing framework to investigate these services in the context of CSP-based simulations. This framework is called the CSP-Grid Computing (CSP-GC) Framework. Each identified higher level grid service in this framework is referred to as a CSP-specific service. A total of six case studies are presented to experimentally evaluate how grid computing technologies can be used together with unmodified simulation packages to support some of the CSP-specific services. The contribution of this thesis is the CSP-GC framework that identifies how simulation practise in industry may benefit from the use of grid technology. A further contribution is the recognition of specific grid computing software (grid middleware) that can possibly be used together with existing CSPs to provide grid support. With its focus on end-users and end-user tools, it is intended that this research will encourage wider adoption of grid computing in the workplace and that simulation users will derive benefit from using this technology

Brunel University Research Archive