HTC Scientific Computing in a Distributed Cloud Environment
This paper describes the use of a distributed cloud computing system for
high-throughput computing (HTC) scientific applications. The distributed cloud
computing system is composed of a number of separate
Infrastructure-as-a-Service (IaaS) clouds that are utilized in a unified
infrastructure. The distributed cloud has been in production-quality operation
for two years with approximately 500,000 completed jobs where a typical
workload has 500 simultaneous embarrassingly-parallel jobs that run for
approximately 12 hours. We review the design and implementation of the system,
which is based on pre-existing components and a number of custom components. We
discuss the operation of the system and describe our plans for expansion
to more sites and increased computing capacity.
Metascheduling of HPC Jobs in Day-Ahead Electricity Markets
High performance grid computing is a key enabler of large scale collaborative
computational science. With the promise of exascale computing, high performance
grid systems are expected to incur electricity bills that grow super-linearly
over time. In order to achieve cost effectiveness in these systems, it is
essential for the scheduling algorithms to exploit electricity price
variations, both in space and time, that are prevalent in the dynamic
electricity price markets. In this paper, we present a metascheduling algorithm
to optimize the placement of jobs in a compute grid which consumes electricity
from the day-ahead wholesale market. We formulate the scheduling problem as a
Minimum Cost Maximum Flow problem and leverage queue waiting time and
electricity price predictions to accurately estimate the cost of job execution
at a system. Using trace-based simulation with real and synthetic workload
traces, and real electricity price data sets, we demonstrate our approach on
two currently operational grids, XSEDE and NorduGrid. Our experimental setup
collectively comprises more than 433K processors spread across 58 compute
systems in 17 geographically distributed locations. Experiments show that our
approach simultaneously optimizes the total electricity cost and the average
response time of the grid, without being unfair to users of the local batch
systems.
Comment: Appears in IEEE Transactions on Parallel and Distributed Systems
A Taxonomy of Workflow Management Systems for Grid Computing
With the advent of Grid and application technologies, scientists and
engineers are building more and more complex applications to manage and process
large data sets, and execute scientific experiments on distributed resources.
Such application scenarios require means for composing and executing complex
workflows. Therefore, many efforts have been made towards the development of
workflow management systems for Grid computing. In this paper, we propose a
taxonomy that characterizes and classifies various approaches for building and
executing workflows on Grids. We also survey several representative Grid
workflow systems developed by various projects world-wide to demonstrate the
comprehensiveness of the taxonomy. The taxonomy not only highlights the design
and engineering similarities and differences of state-of-the-art Grid
workflow systems, but also identifies the areas that need further research.
Comment: 29 pages, 15 figures
A theoretical and computational basis for CATNETS
The main content of this report is the identification and definition of market mechanisms for Application Layer Networks (ALNs). Following the structured Market Engineering process, the work identifies the requirements that adequate market mechanisms for ALNs have to fulfill. Subsequently, two mechanisms each for the centralized and the decentralized case are described in this document. These form the theoretical foundation for the work within the following two years of the CATNETS project. --Grid Computing
Dependable Distributed Computing for the International Telecommunication Union Regional Radio Conference RRC06
The International Telecommunication Union (ITU) Regional Radio Conference
(RRC06) established in 2006 a new frequency plan for the introduction of
digital broadcasting in European, African, Arab, and CIS countries and Iran. The
preparation of the plan involved complex calculations under a short deadline and
required dependable and efficient computing capability. The ITU designed and
deployed in-situ a dedicated PC farm, in parallel to the European Organization
for Nuclear Research (CERN) which provided and supported a system based on the
EGEE Grid. The planning cycle at the RRC06 required the periodic execution of
roughly 200,000 short jobs, using several hundred CPU hours, in a period
of less than 12 hours. The nature of the problem required dynamic
workload-balancing and low-latency access to the computing resources. We
present the strategy and key technical choices that delivered a reliable
service to the RRC06.
A comparison of resource allocation process in grid and cloud technologies
Grid Computing and Cloud Computing are two different technologies that have emerged to realize the long-held dream of computing as a utility, which has led to an important revolution in the IT industry. These technologies came with several challenges in terms of middleware, programming models, resource management, and business models. These challenges are taken seriously by distributed systems research. Resource allocation is a key challenge in both technologies, since poor allocation can cause resource wastage and service degradation. This paper presents a comprehensive study of the resource allocation processes in both technologies. It provides researchers with an in-depth understanding of all aspects of resource allocation and the associated challenges, including load balancing, performance, energy consumption, scheduling algorithms, resource consolidation, and migration. The comparison also contributes an informal definition of the Cloud resource allocation process. Resources in the Cloud are shared by all users in a time- and space-sharing manner, in contrast to the dedicated resources that are governed by a queuing system in Grid resource management. Cloud resource allocation faces additional challenges, chiefly achieving good load balancing and making the right consolidation decisions.
Environmental analysis for application layer networks
The increasing interconnection of computers over the Internet gave rise to the vision of Application Layer Networks. They comprise overlay networks, such as peer-to-peer networks and Grid infrastructures, that use the TCP/IP protocol. Their common characteristic is the redundant, distributed provision of and access to data, computing, and application services, while hiding the heterogeneity of the infrastructure from the user. This work examines the requirements that these networks place on economic allocation mechanisms. The analysis follows a market analysis process for a centralized auction mechanism and a catallactic market. --Grid Computing