Search CORE

14,997 research outputs found

Metascheduling of HPC Jobs in Day-Ahead Electricity Markets

Author: Murali Prakash
Vadhiyar Sathish
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 29/12/2017
Field of study

High performance grid computing is a key enabler of large scale collaborative computational science. With the promise of exascale computing, high performance grid systems are expected to incur electricity bills that grow super-linearly over time. In order to achieve cost effectiveness in these systems, it is essential for the scheduling algorithms to exploit electricity price variations, both in space and time, that are prevalent in the dynamic electricity price markets. In this paper, we present a metascheduling algorithm to optimize the placement of jobs in a compute grid which consumes electricity from the day-ahead wholesale market. We formulate the scheduling problem as a Minimum Cost Maximum Flow problem and leverage queue waiting time and electricity price predictions to accurately estimate the cost of job execution at a system. Using trace based simulation with real and synthetic workload traces, and real electricity price data sets, we demonstrate our approach on two currently operational grids, XSEDE and NorduGrid. Our experimental setup collectively constitute more than 433K processors spread across 58 compute systems in 17 geographically distributed locations. Experiments show that our approach simultaneously optimizes the total electricity cost and the average response time of the grid, without being unfair to users of the local batch systems.Comment: Appears in IEEE Transactions on Parallel and Distributed System

arXiv.org e-Print Archive

The Ultralight project: the network as an integrated and managed resource for data-intensive science

Author: Bunn Julian
Cavanaugh Richard
Legrand Iosif
Low Steven
McKee Shawn
Nae Dan
Newman Harvey
Ravot Sylvain
Steenberg Conrad
Su Xun
Thomas Michael
van Lingen Frank
Xia Yang
Publication venue
Publication date: 01/11/2005
Field of study

Looks at the UltraLight project which treats the network interconnecting globally distributed data sets as a dynamic, configurable, and closely monitored resource to construct a next-generation system that can meet the high-energy physics community's data-processing, distribution, access, and analysis needs

Caltech Authors

Datacenter Traffic Control: Understanding Techniques and Trade-offs

Author: Noormohammadpour Mohammad
Raghavendra Cauligi S.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 10/12/2017
Field of study

Datacenters provide cost-effective and flexible access to scalable compute and storage resources necessary for today's cloud computing needs. A typical datacenter is made up of thousands of servers connected with a large network and usually managed by one operator. To provide quality access to the variety of applications and services hosted on datacenters and maximize performance, it deems necessary to use datacenter networks effectively and efficiently. Datacenter traffic is often a mix of several classes with different priorities and requirements. This includes user-generated interactive traffic, traffic with deadlines, and long-running traffic. To this end, custom transport protocols and traffic management techniques have been developed to improve datacenter network performance. In this tutorial paper, we review the general architecture of datacenter networks, various topologies proposed for them, their traffic properties, general traffic control challenges in datacenters and general traffic control objectives. The purpose of this paper is to bring out the important characteristics of traffic control in datacenters and not to survey all existing solutions (as it is virtually impossible due to massive body of existing research). We hope to provide readers with a wide range of options and factors while considering a variety of traffic control mechanisms. We discuss various characteristics of datacenter traffic control including management schemes, transmission control, traffic shaping, prioritization, load balancing, multipathing, and traffic scheduling. Next, we point to several open challenges as well as new and interesting networking paradigms. At the end of this paper, we briefly review inter-datacenter networks that connect geographically dispersed datacenters which have been receiving increasing attention recently and pose interesting and novel research problems.Comment: Accepted for Publication in IEEE Communications Surveys and Tutorial

arXiv.org e-Print Archive

ZENODO

FigShare

Exploring the Fairness and Resource Distribution in an Apache Mesos Environment

Author: Beltre Angel
Govindaraju Madhusudhan
Saha Pankaj
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 20/05/2019
Field of study

Apache Mesos, a cluster-wide resource manager, is widely deployed in massive scale at several Clouds and Data Centers. Mesos aims to provide high cluster utilization via fine grained resource co-scheduling and resource fairness among multiple users through Dominant Resource Fairness (DRF) based allocation. DRF takes into account different resource types (CPU, Memory, Disk I/O) requested by each application and determines the share of each cluster resource that could be allocated to the applications. Mesos has adopted a two-level scheduling policy: (1) DRF to allocate resources to competing frameworks and (2) task level scheduling by each framework for the resources allocated during the previous step. We have conducted experiments in a local Mesos cluster when used with frameworks such as Apache Aurora, Marathon, and our own framework Scylla, to study resource fairness and cluster utilization. Experimental results show how informed decision regarding second level scheduling policy of frameworks and attributes like offer holding period, offer refusal cycle and task arrival rate can reduce unfair resource distribution. Bin-Packing scheduling policy on Scylla with Marathon can reduce unfair allocation from 38\% to 3\%. By reducing unused free resources in offers we bring down the unfairness from to 90\% to 28\%. We also show the effect of task arrival rate to reduce the unfairness from 23\% to 7\%

arXiv.org e-Print Archive

Crossref