Search CORE

1,234 research outputs found

Collocation Games and Their Application to Distributed Resource Management

Author: Bestavros Azer
Londoño Jorge
Teng Shang-Hua
Publication venue: Boston University Computer Science Department
Publication date: 07/02/2009
Field of study

We introduce Collocation Games as the basis of a general framework for modeling, analyzing, and facilitating the interactions between the various stakeholders in distributed systems in general, and in cloud computing environments in particular. Cloud computing enables fixed-capacity (processing, communication, and storage) resources to be offered by infrastructure providers as commodities for sale at a fixed cost in an open marketplace to independent, rational parties (players) interested in setting up their own applications over the Internet. Virtualization technologies enable the partitioning of such fixed-capacity resources so as to allow each player to dynamically acquire appropriate fractions of the resources for unencumbered use. In such a paradigm, the resource management problem reduces to that of partitioning the entire set of applications (players) into subsets, each of which is assigned to fixed-capacity cloud resources. If the infrastructure and the various applications are under a single administrative domain, this partitioning reduces to an optimization problem whose objective is to minimize the overall deployment cost. In a marketplace, in which the infrastructure provider is interested in maximizing its own profit, and in which each player is interested in minimizing its own cost, it should be evident that a global optimization is precisely the wrong framework. Rather, in this paper we use a game-theoretic framework in which the assignment of players to fixed-capacity resources is the outcome of a strategic "Collocation Game". Although we show that determining the existence of an equilibrium for collocation games in general is NP-hard, we present a number of simplified, practically-motivated variants of the collocation game for which we establish convergence to a Nash Equilibrium, and for which we derive convergence and price of anarchy bounds. In addition to these analytical results, we present an experimental evaluation of implementations of some of these variants for cloud infrastructures consisting of a collection of multidimensional resources of homogeneous or heterogeneous capacities. Experimental results using trace-driven simulations and synthetically generated datasets corroborate our analytical results and also illustrate how collocation games offer a feasible distributed resource management alternative for autonomic/self-organizing systems, in which the adoption of a global optimization approach (centralized or distributed) would be neither practical nor justifiable.NSF (CCF-0820138, CSR-0720604, EFRI-0735974, CNS-0524477, CNS-052016, CCR-0635102); Universidad Pontificia Bolivariana; COLCIENCIAS–Instituto Colombiano para el Desarrollo de la Ciencia y la Tecnología "Francisco José de Caldas

Boston University Institutional Repository (OpenBU)

Bandwidth-Aware On-Line Scheduling in SMT Multicores

Author: Duato Marín José Francisco
Feliu-Pérez Josué
Petit Martí Salvador Vicente
Sahuquillo Borrás Julio
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/02/2016
Field of study

© 2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.The memory hierarchy plays a critical role on the performance of current chip multiprocessors. Main memory is shared by all the running processes, which can cause important bandwidth contention. In addition, when the processor implements SMT cores, the L1 bandwidth becomes shared among the threads running on each core. In such a case, bandwidth-aware schedulers emerge as an interesting approach to mitigate the contention. This work investigates the performance degradation that the processes suffer due to memory bandwidth constraints. Experiments show that main memory and L1 bandwidth contention negatively impact the process performance; in both cases, performance degradation can grow up to 40 percent for some of applications. To deal with contention, we devise a scheduling algorithm that consists of two policies guided by the bandwidth consumption gathered at runtime. The process selection policy balances the number of memory requests over the execution time to address main memory bandwidth contention. The process allocation policy tackles L1 bandwidth contention by balancing the L1 accesses among the L1 caches. The proposal is evaluated on a Xeon E5645 platform using a wide set of multiprogrammed workloads, achieving performance benefits up to 6.7 percent with respect to the Linux scheduler.This work was supported by the Spanish Ministerio de Economia y Competitividad (MINECO) and by FEDER funds under Grant TIN2012-38341-C04-01, and by the Intel Early Career Faculty Honor Program Award.Feliu-Pérez, J.; Sahuquillo Borrás, J.; Petit Martí, SV.; Duato Marín, JF. (2016). Bandwidth-Aware On-Line Scheduling in SMT Multicores. IEEE Transactions on Computers. 65(2):422-434. https://doi.org/10.1109/TC.2015.2428694S42243465

Crossref

RiuNet

Task Activity Vectors: A Novel Metric for Temperature-Aware and Energy-Efficient Scheduling

Author: Merkel Andreas
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2010
Field of study

This thesis introduces the abstraction of the task activity vector to characterize applications by the processor resources they utilize. Based on activity vectors, the thesis introduces scheduling policies for improving the temperature distribution on the processor chip and for increasing energy efficiency by reducing the contention for shared resources of multicore and multithreaded processors

KITopen

Recommended from our members

Final Report: Performance Modeling Activities in PERC2

Author: Snavely Allan
Publication venue: 'Office of Scientific and Technical Information (OSTI)'
Publication date: 25/02/2007
Field of study

Progress in Performance Modeling for PERC2 resulted in: • Automated modeling tools that are robust, able to characterize large applications running at scale while simultaneously simulating the memory hierarchies of mul-tiple machines in parallel. • Porting of the requisite tracer tools to multiple platforms. • Improved performance models by using higher resolution memory models that ever before. • Adding control-flow and data dependency analysis to the tracers used in perform-ance tools. • Exploring and developing several new modeling methodologies. • Using modeling tools to develop performance models for strategic codes. • Application of modeling methodology to make a large number of “blind” per-formance predictions on certain mission partner applications, targeting most cur-rently available system architectures. • Error analysis to correct some systematic biases encountered as part of the large-scale blind prediction exercises. • Addition of instrumentation capabilities for communication libraries other than MPI. • Dissemination the tools and modeling methods to several mission partners, in-cluding DoD HPCMO and two DARPA HPCS vendors (Cray and IBM), as well as to the wider HPC community via a series of tutorials

UNT Digital Library

Blox: A Modular Toolkit for Deep Learning Schedulers

Author: Agarwal Saurabh
Phanishayee Amar
Venkataraman Shivaram
Publication venue
Publication date: 19/12/2023
Field of study

Deep Learning (DL) workloads have rapidly increased in popularity in enterprise clusters and several new cluster schedulers have been proposed in recent years to support these workloads. With rapidly evolving DL workloads, it is challenging to quickly prototype and compare scheduling policies across workloads. Further, as prior systems target different aspects of scheduling (resource allocation, placement, elasticity etc.), it is also challenging to combine these techniques and understand the overall benefits. To address these challenges we propose Blox, a modular toolkit which allows developers to compose individual components and realize diverse scheduling frameworks. We identify a set of core abstractions for DL scheduling, implement several existing schedulers using these abstractions, and verify the fidelity of these implementations by reproducing results from prior research. We also highlight how we can evaluate and compare existing schedulers in new settings: different workload traces, higher cluster load, change in DNN workloads and deployment characteristics. Finally, we showcase Blox's extensibility by composing policies from different schedulers, and implementing novel policies with minimal code changes. Blox is available at \url{https://github.com/msr-fiddle/blox}.Comment: To be presented at Eurosys'2

arXiv.org e-Print Archive

Hybrid scheduling algorithms in cloud computing: a review

Author: Arora Neeraj
Banyal Rohitash Kumar
Publication venue: 'Institute of Advanced Engineering and Science'
Publication date: 01/02/2022
Field of study

Cloud computing is one of the emerging fields in computer science due to its several advancements like on-demand processing, resource sharing, and pay per use. There are several cloud computing issues like security, quality of service (QoS) management, data center energy consumption, and scaling. Scheduling is one of the several challenging problems in cloud computing, where several tasks need to be assigned to resources to optimize the quality of service parameters. Scheduling is a well-known NP-hard problem in cloud computing. This will require a suitable scheduling algorithm. Several heuristics and meta-heuristics algorithms were proposed for scheduling the user's task to the resources available in cloud computing in an optimal way. Hybrid scheduling algorithms have become popular in cloud computing. In this paper, we reviewed the hybrid algorithms, which are the combinations of two or more algorithms, used for scheduling in cloud computing. The basic idea behind the hybridization of the algorithm is to take useful features of the used algorithms. This article also classifies the hybrid algorithms and analyzes their objectives, quality of service (QoS) parameters, and future directions for hybrid scheduling algorithms

ZENODO

Institute of Advanced Engineering and Science

L1-Bandwidth Aware Thread Allocation in Multicore SMT Processors

Author: Duato Marín José Francisco
Feliu Pérez Josué
Petit Martí Salvador Vicente
Sahuquillo Borrás Julio
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2013
Field of study

© 2013 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.Improving the utilization of shared resources is a key issue to increase performance in SMT processors. Recent work has focused on resource sharing policies to enhance the processor performance, but their proposals mainly concentrate on novel hardware mechanisms that adapt to the dynamic resource requirements of the running threads. This work addresses the L1 cache bandwidth problem in SMT processors experimentally on real hardware. Unlike previous work, this paper concentrates on thread allocation, by selecting the proper pair of co-runners to be launched to the same core. The relation between L1 bandwidth requirements of each benchmark and its performance (IPC) is analyzed. We found that for individual benchmarks, performance is strongly connected to L1 bandwidth consumption, and this observation remains valid when several co-runners are launched to the same SMT core. Based on these findings we propose two L1 bandwidth aware thread to core (t2c) allocation policies, namely Static and Dynamic t2c allocation, respectively. The aim of these policies is to properly balance L1 bandwidth requirements of the running threads among the processor cores. Experiments on a Xeon E5645 processor show that the proposed policies significantly improve the performance of the Linux OS kernel regardless the number of cores considered.This work was supported by the Spanish Ministerio de Econom´ıa y Competitividad (MINECO) and by FEDER funds under Grant TIN2012-38341-C04-01; and by Programa de Apoyo a la Investigacion y Desarrollo (PAID-05-12) of the ´ Universitat Politecnica de Val ` encia under Grant SP20120748Feliu Pérez, J.; Sahuquillo Borrás, J.; Petit Martí, SV.; Duato Marín, JF. (2013). L1-Bandwidth Aware Thread Allocation in Multicore SMT Processors. IEEE. https://doi.org/10.1109/PACT.2013.6618810

Crossref

RiuNet