27 research outputs found

rDLB: A Novel Approach for Robust Dynamic Load Balancing of Scientific Applications with Parallel Independent Tasks

Author: Cavelan Aurelien
Ciorba Florina M.
Mohammed Ali
Publication date: 01/01/2019

Scientific applications often contain large and computationally intensive parallel loops. Dynamic loop self-scheduling (DLS) is used to achieve a balanced load execution of such applications on high performance computing (HPC) systems. Large HPC systems are vulnerable to processor or node failures and to perturbations in the availability of resources. Most self-scheduling approaches either do not consider fault-tolerant scheduling or depend on failure or perturbation detection and react by rescheduling failed tasks. In this work, a robust dynamic load balancing (rDLB) approach is proposed for the robust self-scheduling of independent tasks. The proposed approach is proactive and does not depend on failure or perturbation detection. The theoretical analysis of the proposed approach shows that it is linearly scalable and that its cost decreases quadratically as the system size increases. rDLB is integrated into an MPI DLS library to evaluate its performance experimentally with two computationally intensive scientific applications. Results show that rDLB enables the tolerance of up to P - 1 processor failures, where P is the number of processors executing an application. In the presence of perturbations, rDLB boosted the robustness of DLS techniques up to 30 times and decreased application execution time up to 7 times compared to their counterparts without rDLB.
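As a rough illustration of the dynamic loop self-scheduling this abstract refers to, the sketch below computes the chunk sizes handed out by guided self-scheduling (GSS), one classic DLS technique in which each requesting processor receives ceil(remaining / P) loop iterations. It does not implement rDLB itself, and the function name `gss_chunks` and its parameters are hypothetical, chosen only for this example.

```python
import math


def gss_chunks(n_tasks, n_procs):
    """Return the successive chunk sizes assigned by guided self-scheduling.

    Hypothetical helper for illustration: each scheduling request receives
    ceil(remaining / n_procs) iterations until no iterations remain.
    """
    if n_tasks < 0 or n_procs < 1:
        raise ValueError("n_tasks must be >= 0 and n_procs must be >= 1")
    chunks = []
    remaining = n_tasks
    while remaining > 0:
        chunk = math.ceil(remaining / n_procs)  # shrinks as work runs out
        chunks.append(chunk)
        remaining -= chunk
    return chunks


if __name__ == "__main__":
    # 100 loop iterations shared by 4 processors.
    print(gss_chunks(100, 4))
```

Chunks start large to keep scheduling overhead low and shrink toward the end so that processors finish at roughly the same time.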

arXiv.org e-Print Archive

edoc

A Measure of Robustness Against Multiple Kinds of Perturbations

Author: Ali Shoukat
Eslamnour Behdis
Publication venue: Scholars' Mine
Publication date: 01/01/2005

Parallel and distributed heterogeneous computing systems may operate in an environment that undergoes unpredictable changes, causing certain system performance features to degrade. Such systems need robustness to guarantee limited degradation despite fluctuations in the behavior of their component parts or environment. Our previous work in this area presented a method for generating a measure of robustness for a given system. However, that approach focused on a scenario where all perturbations were of the same kind, e.g., all perturbations were in message sizes or all in computation times, but not in both. This paper gives an extended discussion of the case where perturbations could be of different kinds, and presents some new insights.

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine

10- #1123 ROBUST DESIGN OF THE MILK COLLECTION AND REFRIGERATION LOGISTICS SYSTEM THROUGH ANALYSIS OF THE TRADE-OFFS BETWEEN CO2 EMISSIONS AND NET PRESENT VALUE

Author: Muñoz Pinzón Dairo Steven
Polo Roa Andrés
Publication venue: Universidad Industrial de Santander
Publication date: 14/03/2022

The logistics system design problem is a strategic-level problem that involves selecting one or more depots from a set of candidate locations. In recent years, many logistics and operations research problems have been extended to include greenhouse gas concerns and financial aspects related to the environmental impact of transport activities. This work presents a robust design of the milk collection and refrigeration logistics system of a cooperative (Tordecilla-Madera, Polo, Muñoz, González-Rodríguez, 2017). The design consists of locating refrigeration tanks, each of which collects the milk of several producers. The proposed model is formulated as a bi-objective problem that minimizes the greenhouse gas emissions produced by transporting milk cans by motorcycle and maximizes the net present value (NPV) of the system configuration. Characterizing the robustness-NPV and robustness-CO2 relationships made it possible to determine which configuration is the most robust and how that robustness arises. The proposed mathematical model is solved with the classical epsilon-constraint technique, and robustness is determined with the FePia methodology (Ali, Maciejewski, Siegel, 2004). It was then determined that the cooperative should set up its collection and refrigeration logistics system according to the chosen configuration, for which a tactical plan was designed that optimizes the use of the installed refrigeration tanks.
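The epsilon-constraint technique named in this abstract can be sketched on a toy problem: keep one objective (NPV) as the quantity to maximize, turn the other (CO2 emissions) into a constraint with an upper bound epsilon, and sweep epsilon to trace the trade-off. The sketch below enumerates a small set of candidate configurations instead of solving the authors' mathematical model; the function name `epsilon_constraint` and all numeric values are hypothetical and serve only as an illustration.

```python
def epsilon_constraint(configs, eps_values):
    """Trace a trade-off front by sweeping an emissions cap.

    configs: list of (name, npv, co2) tuples (hypothetical data).
    eps_values: increasing CO2 caps to try.
    For each cap, pick the feasible configuration with the highest NPV,
    breaking ties in favor of lower emissions.
    """
    front = []
    for eps in eps_values:
        feasible = [c for c in configs if c[2] <= eps]
        if not feasible:
            continue  # no configuration meets this emissions cap
        best = max(feasible, key=lambda c: (c[1], -c[2]))
        if best not in front:
            front.append(best)
    return front


if __name__ == "__main__":
    # Illustrative values only: (name, NPV, CO2 emissions).
    configs = [
        ("A", 100.0, 50.0),
        ("B", 120.0, 70.0),
        ("C", 90.0, 40.0),
        ("D", 110.0, 80.0),
    ]
    for name, npv, co2 in epsilon_constraint(configs, [40, 50, 60, 70, 80]):
        print(name, npv, co2)
```

In this toy data, configuration D never appears on the front because B offers a higher NPV with lower emissions.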

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Robust resource allocation in weather data processing systems

Author: Brateman Jeff
Knapp Keith
Maciejewski Anthony A.
Martin Jon
Oltikar Mohana
Siegel H. J.
White Joe
Publication venue: IEEE Computer Society
Publication date: 01/01/2006

Includes bibliographical references (pages [9-10]). Reliability of weather data processing systems is of prime importance to ensure the efficient operation of space-based weather monitoring systems. This work defines a heterogeneous weather data processing system that is susceptible to uncertainties in data set arrival times. The resource allocation must be robust with respect to these uncertainties. The tasks to be executed by the data processing system are classified into three broad categories: telemetry, tracking and control (high priority); data processing (medium priority); and data research (low priority). The high priority tasks must be completed before considering medium and low priority tasks. The goal of this research is to find a resource allocation that minimizes makespan of the high priority tasks, and to find a mapping that maximizes a function of the completion time and priority of the medium and low priority tasks. Different heuristic techniques to find near optimal solutions are studied, and their performance is evaluated.

Mountain Scholar (Digital Collections of Colorado and Wyoming)

Robust processor allocation for independent tasks when dollar cost for processors is a constraint

Author: Al-Otaibi Mohammad
Ali Syed
Aydin Mahir
Guru Kumara
Horiuchi Aaron
Krishnamurthy Yogish
Lee Panho
Maciejewski Anthony A.
Mehta Ashish
Oltikar Mohana
Pichel Ron
Pippin Alan
Raskey Michael
Shestak Vladimir
Siegel H. J.
Sugavanam Prasanna
Zhang Junxing
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2006

Includes bibliographical references (pages 9-10). In a distributed heterogeneous computing system, the resources have different capabilities and tasks have different requirements. Different classes of machines used in such systems typically vary in dollar cost based on their computing efficiencies. Makespan (defined as the completion time for an entire set of tasks) is often the performance feature that is optimized. Resource allocation is often done based on estimates of the computation time of each task on each class of machines. Hence, it is important that makespan be robust against errors in computation time estimates. The dollar cost to purchase the machines for use can be a constraint such that only a subset of the machines available can be purchased. The goal of this study is to: (1) select a subset of all the machines available so that the cost constraint for the machines is satisfied, and (2) find a static mapping of tasks so that the robustness of the desired system feature, makespan, is maximized against the errors in task execution time estimates. Six heuristic techniques for this problem are presented and evaluated.

Mountain Scholar (Digital Collections of Colorado and Wyoming)

The robustness of resource allocation in parallel and distributed computing systems

Author: Ali Shoukat
Maciejewski Anthony A.
Siegel Howard Jay
Publication venue: IEEE Computer Society
Publication date: 01/01/2004

Includes bibliographical references (page [9]). This paper gives an overview of the material to be discussed in the invited keynote presentation by H. J. Siegel; it summarizes our research in [1]. Performing computing and communication tasks on parallel and distributed systems involves the coordinated use of different types of machines, networks, interfaces, and other resources. Decisions about how best to allocate resources are often based on estimated values of task and system parameters, due to uncertainties in the system environment. An important research problem is the development of resource management strategies that can guarantee a particular system performance given such uncertainties. We have designed a methodology for deriving the degree of robustness of a resource allocation - the maximum amount of collective uncertainty in system parameters within which a user-specified level of system performance (QoS) can be guaranteed. Our four-step procedure for deriving a robustness metric for an arbitrary system will be presented. We will illustrate this procedure and its usefulness by deriving robustness metrics for some example distributed systems.
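One way to make this definition of robustness concrete is a makespan example. The setup here is an assumption for illustration and is not stated in the abstract: each machine's finishing time is the sum of the estimated execution times of the tasks mapped to it, perturbations are errors in those estimates measured by their Euclidean norm, and the performance requirement is that no machine finishes later than a bound tau. Under those assumptions, the smallest perturbation that pushes machine j to the bound is the distance from the estimate vector to the hyperplane where the times sum to tau, namely (tau - F_j) / sqrt(n_j), and the robustness of the allocation is the minimum of these radii over all machines. The names `robustness_radius`, `mapping`, and `tau` are hypothetical.

```python
import math


def robustness_radius(mapping, tau):
    """Return (overall_radius, per_machine_radii) for a task mapping.

    mapping: dict of machine name -> list of estimated task times.
    tau: maximum tolerable finishing time for any machine.
    Assumes finishing time is the plain sum of task times (illustrative).
    """
    radii = {}
    for machine, times in mapping.items():
        if not times:
            continue  # an idle machine cannot violate the bound
        finish = sum(times)
        # Euclidean distance from the estimates to the plane sum(times) == tau.
        radii[machine] = (tau - finish) / math.sqrt(len(times))
    overall = min(radii.values()) if radii else math.inf
    return overall, radii


if __name__ == "__main__":
    # Illustrative estimates only.
    mapping = {"m1": [3.0, 4.0, 5.0, 4.0], "m2": [9.0, 9.0]}
    overall, radii = robustness_radius(mapping, tau=20.0)
    print(overall, radii)
```

In this toy mapping, m2 limits the robustness even though it runs fewer tasks, because it has less slack before reaching tau.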

Mountain Scholar (Digital Collections of Colorado and Wyoming)

The Robustness of Resource Allocation in Parallel and Distributed Computing Systems

Author: Ali Shoukat
Maciejewski A. A.
Siegel Howard Jay
Publication venue: Scholars' Mine
Publication date: 01/01/2004

This paper gives an overview of the material to be discussed in the invited keynote presentation by H. J. Siegel. Performing computing and communication tasks on parallel and distributed systems involves the coordinated use of different types of machines, networks, interfaces, and other resources. Decisions about how best to allocate resources are often based on estimated values of task and system parameters, due to uncertainties in the system environment. An important research problem is the development of resource management strategies that can guarantee a particular system performance given such uncertainties. We have designed a methodology for deriving the degree of robustness of a resource allocation - the maximum amount of collective uncertainty in system parameters within which a user-specified level of system performance (QoS) can be guaranteed. Our four-step procedure for deriving a robustness metric for an arbitrary system will be presented. We will illustrate this procedure and its usefulness by deriving robustness metrics for some example distributed systems.

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine