Resource management in heterogeneous computing systems with tasks of varying importance
2014 Summer. Efficiently assigning tasks to machines in heterogeneous computing environments, where different tasks can have different levels of importance (or value) to the computing system, is a challenging problem. The goal of this work is to study this problem in a variety of environments. One part of the study considers a computing system and its corresponding workload based on the expectations for future environments of Department of Energy and Department of Defense interest. We design heuristics to maximize a performance metric created using utility functions. We also create a framework to analyze the trade-offs between performance and energy consumption. We design techniques to maximize performance in a dynamic environment that has a constraint on energy consumption. Another part of the study explores environments that have uncertainty in the availability of the compute resources. For this part, we design heuristics and compare their performance in different types of environments.
Resource management for heterogeneous computing systems: utility maximization, energy-aware scheduling, and multi-objective optimization
Includes bibliographical references. 2015 Summer. As high performance heterogeneous computing systems continually become faster, the operating cost to run these systems has increased. A significant portion of the operating costs can be attributed to the amount of energy required for these systems to operate. To reduce these costs, it is important for system administrators to operate these systems in an energy efficient manner. Additionally, it is important to be able to measure the performance of a given system so that the impacts of operating at different levels of energy efficiency can be analyzed. The goal of this research is to examine how energy and system performance interact with each other for a variety of environments. One part of this study considers a computing system and its corresponding workload based on the expectations for future environments of Department of Energy and Department of Defense interest. Numerous heuristics are presented that maximize a performance metric created using utility functions. Additional heuristics and energy filtering techniques have been designed for a computing system whose goal is to maximize the total utility earned while being subject to an energy constraint. A framework has been established to analyze the trade-offs between performance (utility earned) and energy consumption. Stochastic models are used to create "fuzzy" Pareto fronts to analyze the variability of solutions along the Pareto front when uncertainties in execution time and power consumption are present within a system. In addition to using utility earned as a measure of system performance, system makespan has also been studied. Finally, a framework has been developed that enables the investigation of the effects of P-states and memory interference on energy consumption and system performance.
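As a rough illustration of the utility-maximizing scheduling idea described in the two abstracts above (not the dissertations' actual heuristics or utility model), the sketch below greedily assigns a task to the machine yielding the highest utility earned per second of machine time. The linear time-decaying utility function, machine names, and execution-time estimates are all invented placeholders.

```python
# Hypothetical sketch of a greedy utility-maximizing scheduler for a
# heterogeneous system. Utility decays linearly with completion time
# (a placeholder model; real utility functions can be richer).

def utility(max_value, decay_rate, completion_time):
    """Utility earned if the task completes at completion_time."""
    return max(0.0, max_value - decay_rate * completion_time)

def assign_task(task, machines):
    """Pick the machine maximizing utility earned per second of use.

    task: dict with 'max_value', 'decay_rate', and per-machine estimated
          execution times under 'etc' (estimated time to compute).
    machines: dict mapping machine name -> time at which it becomes free.
    """
    best_machine, best_rate = None, float("-inf")
    for name, ready_time in machines.items():
        exec_time = task["etc"][name]          # heterogeneous: differs per machine
        finish = ready_time + exec_time
        rate = utility(task["max_value"], task["decay_rate"], finish) / exec_time
        if rate > best_rate:
            best_machine, best_rate = name, rate
    machines[best_machine] += task["etc"][best_machine]  # reserve the machine
    return best_machine

machines = {"cpu": 0.0, "gpu": 0.0}
task = {"max_value": 10.0, "decay_rate": 0.5, "etc": {"cpu": 8.0, "gpu": 2.0}}
print(assign_task(task, machines))  # the faster machine earns more utility/sec
```

Dividing utility by execution time (rather than maximizing raw utility) is one common way such heuristics discourage tying up a machine for long low-value work.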
Resource management for extreme scale high performance computing systems in the presence of failures
2018 Summer. Includes bibliographical references. High performance computing (HPC) systems, such as data centers and supercomputers, coordinate the execution of large-scale computation of applications over tens or hundreds of thousands of multicore processors. Unfortunately, as the size of HPC systems continues to grow towards exascale complexities, these systems experience an exponential growth in the number of failures occurring in the system. These failures reduce performance and increase energy use, reducing the efficiency and effectiveness of emerging extreme-scale HPC systems. Applications executing in parallel on individual multicore processors also suffer from decreased performance and increased energy use as a result of being forced to share resources; in particular, contention among multiple application threads sharing the last-level cache degrades performance. These challenges make it increasingly important to characterize and optimize the performance and behavior of applications that execute in these systems. To address these challenges, in this dissertation we propose a framework for intelligently characterizing and managing extreme-scale HPC system resources. We devise various techniques to mitigate the negative effects of failures and resource contention in HPC systems. In particular, we develop new HPC resource management techniques for intelligently utilizing system resources through (a) the optimal scheduling of applications to HPC nodes and (b) the optimal configuration of fault resilience protocols. These resource management techniques employ information obtained from historical analysis as well as theoretical and machine learning methods for predictions. We use these data to characterize system performance, energy use, and application behavior when operating under the uncertainty of performance degradation from both system failures and resource contention.
We investigate how to better characterize and model the negative effects of system failures and application co-location on large-scale HPC systems. Our analysis of application and system behavior also investigates: the interrelated effects of application network usage and fault resilience protocols; checkpoint interval selection and its sensitivity to system parameters for various checkpoint-based fault resilience protocols; and performance comparisons of various promising strategies for fault resilience in exascale-sized systems.
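Checkpoint-interval selection of the kind mentioned above is often approached with first-order models such as Young's approximation and Daly's refinement; the sketch below computes both under assumed example values for checkpoint cost and mean time between failures (placeholders, not figures from the dissertation).

```python
import math

def young_interval(checkpoint_cost_s, mtbf_s):
    """Young's first-order optimal checkpoint interval: sqrt(2 * C * MTBF)."""
    return math.sqrt(2.0 * checkpoint_cost_s * mtbf_s)

def daly_interval(checkpoint_cost_s, mtbf_s):
    """Daly's first-order refinement, subtracting the checkpoint cost itself."""
    return math.sqrt(2.0 * checkpoint_cost_s * mtbf_s) - checkpoint_cost_s

# Assumed example values: 60 s to write a checkpoint, 24 h mean time between failures.
C, M = 60.0, 24 * 3600.0
print(f"Young: {young_interval(C, M):.0f} s, Daly: {daly_interval(C, M):.0f} s")
```

The sensitivity to system parameters is visible in the formula itself: the interval grows only with the square root of the MTBF, so a 4x drop in MTBF (e.g., a 4x larger system with independent node failures) halves the optimal interval.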
CEDR -- A Compiler-integrated, Extensible DSSoC Runtime
In this work, we present CEDR, a Compiler-integrated, Extensible Domain Specific System on Chip Runtime ecosystem that facilitates research toward addressing the challenges of architecture, system software, and application development, with distinct plug-and-play integration points in a unified compile-time and run-time workflow. We demonstrate the utility of CEDR on the Xilinx Zynq MPSoC-ZCU102 for evaluating the performance of pre-silicon hardware in the trade space of SoC configuration, scheduling policy, and workload complexity, based on dynamically arriving workload scenarios composed of real-life signal processing applications scaling to thousands of application instances with FFT and matrix multiply accelerators. We provide insights into the trade-offs present in this design space through a number of distinct case studies. CEDR is portable and has been deployed and validated on Odroid-XU3, X86, and Nvidia Jetson Xavier based SoC platforms. Taken together, CEDR is a capable environment for enabling research in exploring the boundaries of productive application development, resource management heuristic development, and hardware configuration analysis for heterogeneous architectures.
Comment: 35 pages single column, 16 figures, 7 tables. Accepted for publication in ACM Transactions on Embedded Computing Systems.
Resource Allocation of Industry 4.0 Micro-Service Applications across Serverless Fog Federation
The Industry 4.0 revolution has been made possible via AI-based applications (e.g., for automation and maintenance) deployed on serverless edge (aka fog) computing platforms at the industrial sites where the data is generated. Nevertheless, fulfilling the fault-intolerant and real-time constraints of Industry 4.0 applications on resource-limited fog systems in remote industrial sites (e.g., offshore oil fields) that are uncertain, disaster-prone, and have no cloud access is challenging. This is the challenge our research aims to address. We consider the inelastic nature of fog systems, the software architecture of industrial applications (micro-service-based versus monolithic), and the scarcity of human experts in remote sites. To enable cloud-like elasticity, our approach is to dynamically and seamlessly (i.e., without human intervention) federate nearby fog systems. We then develop serverless resource allocation solutions that are cognizant of the applications' software architecture, their latency requirements, and the distributed nature of the underlying infrastructure. We propose methods to seamlessly and optimally partition micro-service-based applications across the federated fog. Our experimental evaluation shows that not only is the fog systems' inelasticity overcome in a serverless manner, but our application partitioning method also serves around 20% more tasks on time than existing methods in the literature.
Comment: Accepted in the Future Generation Computer Systems (FGCS) Journal.
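As a loose illustration of latency-aware micro-service placement of the kind described above (the paper's actual partitioning method is not reproduced here), this sketch places each micro-service on the federated fog node with the earliest estimated finish time, admitting a service only if it can meet its deadline. All node latencies, execution times, and deadlines are invented placeholders.

```python
# Hypothetical greedy placement of micro-services onto federated fog nodes.
# Each service goes to the node with the earliest estimated finish time:
# node ready time + network latency to reach that node + execution time.

def place_services(services, nodes):
    """services: list of dicts with 'name', 'exec_s', 'deadline_s'.
    nodes: dict name -> {'ready_s': float, 'latency_s': float}.
    Returns {service name: node name}; services that would miss their
    deadline everywhere are dropped rather than placed."""
    placement = {}
    for svc in sorted(services, key=lambda s: s["deadline_s"]):  # earliest deadline first
        best, best_finish = None, float("inf")
        for node_name, node in nodes.items():
            finish = node["ready_s"] + node["latency_s"] + svc["exec_s"]
            if finish < best_finish:
                best, best_finish = node_name, finish
        if best_finish <= svc["deadline_s"]:        # admit only if on time
            nodes[best]["ready_s"] = best_finish    # node is busy until then
            placement[svc["name"]] = best
    return placement

nodes = {"local": {"ready_s": 0.0, "latency_s": 0.0},
         "neighbor": {"ready_s": 0.0, "latency_s": 0.5}}
services = [{"name": "detect", "exec_s": 2.0, "deadline_s": 3.0},
            {"name": "log", "exec_s": 1.0, "deadline_s": 5.0}]
print(place_services(services, nodes))
```

In the toy run, the second service spills to the federated neighbor node because the local node is still busy; this is the federation-provided "elasticity" the abstract refers to, obtained without any cloud tier.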
Task Packing: Efficient task scheduling in unbalanced parallel programs to maximize CPU utilization
Load imbalance in parallel systems can be generated by factors external to the currently running applications, such as operating system noise, or by the underlying hardware, such as a heterogeneous cluster. HPC applications working on irregular data structures can also have difficulty balancing their computations across parallel tasks. In this article we extend, improve, and evaluate more deeply the Task Packing mechanism proposed in a previous work. The main idea of the mechanism is to concentrate the idle cycles of unbalanced applications in such a way that one or more CPUs are freed from execution. To achieve this, CPUs are kept busy with only the useful work of the parallel application's tasks, provided performance is not degraded. The packing is solved by an algorithm based on the Knapsack problem, using a minimum number of CPUs and oversubscription. We design and implement a more efficient version of this mechanism: we perform the Task Packing "in place", taking advantage of idle cycles generated at synchronization points of unbalanced applications. Evaluations are carried out on a heterogeneous platform using the FT and miniFE benchmarks. Results show that our proposal generates low overhead. In addition, the number of freed CPUs is related to a load imbalance metric, which can be used to predict it.
Peer Reviewed. Postprint (author's final draft).
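As a toy illustration of the packing idea (first-fit decreasing bin packing stands in here for the authors' Knapsack-based algorithm), the sketch below packs per-task busy fractions onto as few CPUs as possible, oversubscribing a CPU only while its accumulated load stays within capacity. The task loads are invented values.

```python
# Toy sketch of packing per-task busy fractions onto a minimum number of CPUs.
# A task's "load" is the fraction of time it keeps a CPU busy; packing several
# tasks onto one CPU (oversubscription) concentrates idle cycles so that
# other CPUs are freed entirely.

def pack_tasks(loads, capacity=1.0):
    """Assign each load to the first CPU with room, opening CPUs as needed.
    Returns a list of CPUs, each a list of the loads packed onto it."""
    cpus = []
    for load in sorted(loads, reverse=True):   # biggest loads first
        for cpu in cpus:
            if sum(cpu) + load <= capacity:    # oversubscribe within capacity
                cpu.append(load)
                break
        else:
            cpus.append([load])                # no room anywhere: open a new CPU
    return cpus

# Four unbalanced tasks that naively occupy four CPUs fit onto two,
# freeing the other two CPUs.
loads = [0.7, 0.6, 0.3, 0.4]
print(pack_tasks(loads))
```

The connection to the abstract's final claim is direct: the more imbalanced the application (the smaller the loads relative to capacity), the more tasks fit per CPU, so a load imbalance metric predicts how many CPUs packing can free.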