55 research outputs found

    Survey and Analysis of Production Distributed Computing Infrastructures

    This report has two objectives. First, we describe a set of the production distributed infrastructures currently available, so that the reader has a basic understanding of them. This includes explaining why each infrastructure was created and made available, and how it has succeeded and failed. The set is not complete, but we believe it is representative. Second, we describe the infrastructures in terms of their use, which is a combination of how they were designed to be used and how users have found ways to use them. Applications are often designed and created with specific infrastructures in mind, with both an appreciation of the existing capabilities provided by those infrastructures and an anticipation of their future capabilities. Here, the infrastructures we discuss were often designed and created with specific applications in mind, or at least specific types of applications. The reader should understand how the interplay between the infrastructure providers and the users leads to such usages, which we call usage modalities. These usage modalities are really abstractions that exist between the infrastructures and the applications; they influence the infrastructures by representing the applications, and they influence the applications by representing the infrastructures.

    "Virtual malleability" applied to MPI jobs to improve their execution in a multiprogrammed environment"

    This work focuses on the scheduling of MPI jobs executing on shared-memory multiprocessors (SMPs). The objective was to obtain the best response-time performance in multiprogrammed multiprocessor systems using batch systems, assuming all jobs have the same priority. To that end, the benefits of supporting malleability in MPI jobs, in order to reduce fragmentation and consequently improve system performance, were studied. The contributions of this work can be summarized as follows:
    · Virtual malleability: a mechanism whereby a job is assigned a dynamic processor partition in which the number of processes is greater than the number of processors. The partition size is modified at runtime, according to external requirements such as system load, by varying the multiprogramming level, making the job contend for resources with itself. In addition, a mechanism that decides at runtime whether to apply local or global process queues to an application, depending on the load balance among its processes.
    · A job scheduling policy that makes decisions such as how many processes to start with and the maximum multiprogramming degree, based on the type and number of applications running and queued. Moreover, as soon as a job finishes execution and there are queued jobs, the algorithm analyzes whether it is better to start executing another job immediately or to wait until more resources become available.
    · A new alternative to backfilling strategies for the problem of the execution window expiring. Virtual malleability is applied to the backfilled job, reducing its partition size without aborting or suspending it as in traditional backfilling.
    The evaluation of this thesis followed a practical approach. All the proposals were implemented, modifying the three scheduling levels: queuing system, processor scheduler, and runtime library. The impact of the contributions was studied under several types of workloads, varying machine utilization, communication and balance degree of the applications, multiprogramming level, and job size. Results showed that it is possible to offer malleability over MPI jobs. An application obtained better performance when contending for resources with itself than with other applications, especially in workloads with high machine utilization. Load imbalance was taken into account, and better performance was obtained by applying the right queue type to each application independently. The proposed job scheduling policy exploited virtual malleability by choosing, at the beginning of execution, parameters such as the number of processes and the maximum multiprogramming level. It performed well under bursty workloads with low to medium machine utilization. However, as the load increased, virtual malleability was no longer enough: when the machine is heavily loaded, jobs, once shrunk, are unable to expand, so they must execute the whole time with a partition smaller than the job size, degrading performance. At that point the job scheduling policy concentrated on moldability instead. Fragmentation was also alleviated by applying backfilling techniques to the job scheduling algorithm. Virtual malleability proved to be an interesting improvement for the window-expiration problem: backfilled jobs, even on a smaller partition, can continue execution, reducing the memory swapping generated by aborts and suspensions, and the queueing system is prevented from reinserting the backfilled job in the queue and re-executing it later.
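    The core mechanism above lends itself to a short sketch. The Python below is a minimal, hypothetical illustration of virtual malleability as the abstract describes it, not the thesis's actual implementation: the job's process count stays fixed while its processor partition shrinks or grows with system load, so it is the multiprogramming level (processes per processor), not the job size, that absorbs load changes. The names (Job, resize_partition) and the load thresholds are invented for illustration.

        # Minimal sketch of "virtual malleability" (hypothetical names and thresholds).
        from dataclasses import dataclass

        @dataclass
        class Job:
            """An MPI job with a fixed process count but a malleable partition."""
            name: str
            num_processes: int   # fixed at launch (the "job size")
            partition_size: int  # processors currently assigned; may shrink or grow

            @property
            def multiprogramming_level(self) -> int:
                # Processes per processor; > 1 means the job contends with itself.
                return -(-self.num_processes // self.partition_size)  # ceiling division

        def resize_partition(job: Job, free_processors: int, system_load: float) -> None:
            """Shrink or grow the partition in response to external load.

            The process count never changes; only the multiprogramming level
            does, which is what makes the malleability "virtual".
            """
            if system_load > 0.8 and job.partition_size > 1:
                job.partition_size -= 1  # shrink: raise the job's self-contention
            elif system_load < 0.5 and free_processors > 0:
                # Expand, but a partition larger than the job size is pointless.
                job.partition_size = min(job.partition_size + 1, job.num_processes)

        job = Job("cg.B.16", num_processes=16, partition_size=16)
        resize_partition(job, free_processors=0, system_load=0.9)
        print(job.partition_size, job.multiprogramming_level)  # 15 2

    Under this scheme a heavily loaded machine pushes jobs toward higher multiprogramming levels rather than suspending them, which is also what makes the backfilling variant above possible: a backfilled job whose window expires is shrunk instead of aborted.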

    Mitigating Interference between Scientific Applications in OS-Level Virtualized Environments


    Scalable Resource Management in High Performance Computers

    Clusters of workstations have emerged as an important…

    Distributed computing practice for large-scale science and engineering applications

    It is generally accepted that the ability to develop large-scale distributed applications has lagged seriously behind other developments in cyberinfrastructure. In this paper, we provide insight into how such applications have been developed and an understanding of why developing applications for distributed infrastructure is hard. Our approach is unique in that it is centered around half a dozen existing scientific applications; we posit that these scientific applications are representative of the characteristics, requirements, and challenges of the bulk of current distributed applications on production cyberinfrastructure (such as the US TeraGrid). We provide a novel and comprehensive analysis of such distributed scientific applications. Specifically, we survey existing models and methods for large-scale distributed applications and identify commonalities, recurring structures, patterns, and abstractions. We find that many ad hoc solutions are employed to develop and execute distributed applications, which results in a lack of generality and leaves distributed applications neither extensible nor independent of infrastructure details. In our analysis, we introduce the notion of application vectors: a novel way of understanding the structure of distributed applications. Important contributions of this paper include identifying patterns derived from a wide range of real distributed applications, as well as an integrated approach to analyzing applications, programming systems, and patterns, resulting in the ability to provide a critical assessment of the current practice of developing, deploying, and executing distributed applications. Gaps and omissions in the state of the art are identified, and directions for future research are outlined.
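    The notion of application vectors invites a small illustration. The sketch below is a loose, hypothetical rendering of the idea, characterizing a distributed application by its coordinates along a few descriptive axes; the axes and the example values are invented here, not taken from the paper.

        # Hypothetical sketch of an "application vector": a distributed
        # application described by coordinates along descriptive axes.
        from dataclasses import dataclass

        @dataclass(frozen=True)
        class ApplicationVector:
            execution_unit: str  # e.g. "single task", "ensemble", "workflow"
            coordination: str    # e.g. "loosely coupled", "tightly coupled"
            data_pattern: str    # e.g. "file staging", "streaming"
            scale: str           # e.g. "O(10) cores", "O(10k) cores"

        # An ensemble-style science application, expressed as one such vector.
        replica_exchange = ApplicationVector(
            execution_unit="ensemble",
            coordination="loosely coupled",
            data_pattern="file staging",
            scale="O(1k) cores",
        )
        print(replica_exchange)

    Comparing applications vector by vector is what makes it possible to spot the recurring structures and patterns the paper surveys, independent of any one infrastructure.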

    An efficient virtual network interface in the FUGU scalable workstation, by Kenneth Martin Mackenzie.

    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1998. Includes bibliographical references (p. 123-129).

    "Virtual malleability" applied to MPI jobs to improve their execution in a multiprogrammed environment"

    Get PDF
    This work focuses on scheduling of MPI jobs when executing in shared-memory multiprocessors (SMPs). The objective was to obtain the best performance in response time in multiprogrammed multiprocessors systems using batch systems, assuming all the jobs have the same priority. To achieve that purpose, the benefits of supporting malleability on MPI jobs to reduce fragmentation and consequently improve the performance of the system were studied. The contributions made in this work can be summarized as follows:· Virtual malleability: A mechanism where a job is assigned a dynamic processor partition, where the number of processes is greater than the number of processors. The partition size is modified at runtime, according to external requirements such as the load of the system, by varying the multiprogramming level, making the job contend for resources with itself. In addition to this, a mechanism which decides at runtime if applying local or global process queues to an application depending on the load balancing between processes of it. · A job scheduling policy, that takes decisions such as how many processes to start with and the maximum multiprogramming degree based on the type and number of applications running and queued. Moreover, as soon as a job finishes execution and where there are queued jobs, this algorithm analyzes whether it is better to start execution of another job immediately or just wait until there are more resources available. · A new alternative to backfilling strategies for the problema of window execution time expiring. Virtual malleability is applied to the backfilled job, reducing its partition size but without aborting or suspending it as in traditional backfilling. The evaluation of this thesis has been done using a practical approach. All the proposals were implemented, modifying the three scheduling levels: queuing system, processor scheduler and runtime library. The impact of the contributions were studied under several types of workloads, varying machine utilization, communication and, balance degree of the applications, multiprogramming level, and job size. Results showed that it is possible to offer malleability over MPI jobs. An application obtained better performance when contending for the resources with itself than with other applications, especially in workloads with high machine utilization. Load imbalance was taken into account obtaining better performance if applying the right queue type to each application independently.The job scheduling policy proposed exploited virtual malleability by choosing at the beginning of execution some parameters like the number of processes and maximum multiprogramming level. It performed well under bursty workloads with low to medium machine utilizations. However as the load increases, virtual malleability was not enough. That is because, when the machine is heavily loaded, the jobs, once shrunk are not able to expand, so they must be executed all the time with a partition smaller than the job size, thus degrading performance. Thus, at this point the job scheduling policy concentrated just in moldability.Fragmentation was alleviated also by applying backfilling techniques to the job scheduling algorithm. Virtual malleability showed to be an interesting improvement in the window expiring problem. 
Backfilled jobs even on a smaller partition, can continue execution reducing memory swapping generated by aborts/suspensions In this way the queueing system is prevented from reinserting the backfilled job in the queue and re-executing it in the future

    Distributed and Multiprocessor Scheduling

    This chapter discusses CPU scheduling in parallel and distributed systems. CPU scheduling is part of a broader class of resource allocation problems, and is probably the most carefully studied such problem. The main motivation for multiprocessor scheduling is the desire for increased speed in the execution of a workload. Parts of the workload, called tasks, can be spread across several processors and thus be executed more quickly than on a single processor. In this chapter, we examine techniques for providing this facility. The scheduling problem for multiprocessor systems can be stated generally as: "How can we execute a set of tasks T on a set of processors P subject to some set of optimizing criteria C?" The most common goal of scheduling is to minimize the expected runtime of a task set. Other scheduling criteria include minimizing cost, minimizing communication delay, giving priority to certain users' processes, or meeting the needs of specialized hardware devices. The scheduling policy for a multiprocessor system usually embodies a mixture of several of these criteria. Section 2 outlines general issues in multiprocessor scheduling and gives background material, including issues specific to either parallel or distributed scheduling. Section 3 describes best practices from prior work in the area, including a broad survey of existing scheduling algorithms and mechanisms. Section 4 outlines research issues and gives a summary. Section 5 lists the terms defined in this chapter, while Sections 6 and 7 give references to important research publications in the area.
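    As one concrete instance of the problem statement above ("execute a set of tasks T on a set of processors P subject to criteria C"), the sketch below implements a standard greedy list-scheduling heuristic, longest-processing-time-first, targeting the most common criterion named in the chapter: minimizing the runtime (makespan) of the task set. It illustrates the problem; it is not an algorithm the chapter singles out.

        # Greedy list scheduling (longest-processing-time-first): assign each
        # task to the currently least-loaded processor, longest tasks first.
        import heapq

        def lpt_schedule(task_runtimes: list[float], num_processors: int) -> float:
            """Assign tasks T to processors P; return the resulting makespan."""
            finish_times = [0.0] * num_processors  # min-heap of processor loads
            heapq.heapify(finish_times)
            for runtime in sorted(task_runtimes, reverse=True):
                earliest = heapq.heappop(finish_times)  # least-loaded processor
                heapq.heappush(finish_times, earliest + runtime)
            return max(finish_times)

        print(lpt_schedule([7, 5, 4, 3, 3, 2], num_processors=3))  # 9.0 (optimal here)

    Swapping in a different criterion C, say communication delay, changes the cost each assignment incurs but not the overall structure of the scheduling loop.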