27,703 research outputs found
Towards Formal Interaction-Based Models of Grid Computing Infrastructures
Grid computing (GC) systems are large-scale virtual machines, built upon a
massive pool of resources (processing time, storage, software) that often span
multiple distributed domains. Concurrent users interact with the grid by adding
new tasks; the grid is expected to assign resources to tasks in a fair,
trustworthy way. These distinctive features of GC systems make their
specification and verification a challenging issue. Although prior works have
proposed formal approaches to the specification of GC systems, a precise
account of the interaction model which underlies resource sharing has not been
yet proposed. In this paper, we describe ongoing work aimed at filling in this
gap. Our approach relies on (higher-order) process calculi: these core
languages for concurrency offer a compositional framework in which GC systems
can be precisely described and potentially reasoned about.Comment: In Proceedings DCM 2013, arXiv:1403.768
A Taxonomy of Workflow Management Systems for Grid Computing
With the advent of Grid and application technologies, scientists and
engineers are building more and more complex applications to manage and process
large data sets, and execute scientific experiments on distributed resources.
Such application scenarios require means for composing and executing complex
workflows. Therefore, many efforts have been made towards the development of
workflow management systems for Grid computing. In this paper, we propose a
taxonomy that characterizes and classifies various approaches for building and
executing workflows on Grids. We also survey several representative Grid
workflow systems developed by various projects world-wide to demonstrate the
comprehensiveness of the taxonomy. The taxonomy not only highlights the design
and engineering similarities and differences of state-of-the-art in Grid
workflow systems, but also identifies the areas that need further research.Comment: 29 pages, 15 figure
A Mediated Definite Delegation Model allowing for Certified Grid Job Submission
Grid computing infrastructures need to provide traceability and accounting of
their users" activity and protection against misuse and privilege escalation. A
central aspect of multi-user Grid job environments is the necessary delegation
of privileges in the course of a job submission. With respect to these generic
requirements this document describes an improved handling of multi-user Grid
jobs in the ALICE ("A Large Ion Collider Experiment") Grid Services. A security
analysis of the ALICE Grid job model is presented with derived security
objectives, followed by a discussion of existing approaches of unrestricted
delegation based on X.509 proxy certificates and the Grid middleware gLExec.
Unrestricted delegation has severe security consequences and limitations, most
importantly allowing for identity theft and forgery of delegated assignments.
These limitations are discussed and formulated, both in general and with
respect to an adoption in line with multi-user Grid jobs. Based on the
architecture of the ALICE Grid Services, a new general model of mediated
definite delegation is developed and formulated, allowing a broker to assign
context-sensitive user privileges to agents. The model provides strong
accountability and long- term traceability. A prototype implementation allowing
for certified Grid jobs is presented including a potential interaction with
gLExec. The achieved improvements regarding system security, malicious job
exploitation, identity protection, and accountability are emphasized, followed
by a discussion of non- repudiation in the face of malicious Grid jobs
Energy-Efficient Scheduling for Homogeneous Multiprocessor Systems
We present a number of novel algorithms, based on mathematical optimization
formulations, in order to solve a homogeneous multiprocessor scheduling
problem, while minimizing the total energy consumption. In particular, for a
system with a discrete speed set, we propose solving a tractable linear
program. Our formulations are based on a fluid model and a global scheduling
scheme, i.e. tasks are allowed to migrate between processors. The new methods
are compared with three global energy/feasibility optimal workload allocation
formulations. Simulation results illustrate that our methods achieve both
feasibility and energy optimality and outperform existing methods for
constrained deadline tasksets. Specifically, the results provided by our
algorithm can achieve up to an 80% saving compared to an algorithm without a
frequency scaling scheme and up to 70% saving compared to a constant frequency
scaling scheme for some simulated tasksets. Another benefit is that our
algorithms can solve the scheduling problem in one step instead of using a
recursive scheme. Moreover, our formulations can solve a more general class of
scheduling problems, i.e. any periodic real-time taskset with arbitrary
deadline. Lastly, our algorithms can be applied to both online and offline
scheduling schemes.Comment: Corrected typos: definition of J_i in Section 2.1; (3b)-(3c);
definition of \Phi_A and \Phi_D in paragraph after (6b). Previous equations
were correct only for special case of p_i=d_
Resilient network dimensioning for optical grid/clouds using relocation
In this paper we address the problem of dimensioning infrastructure, comprising both network and server resources, for large-scale decentralized distributed systems such as grids or clouds. We will provide an overview of our work in this area, and in particular focus on how to design the resulting grid/cloud to be resilient against network link and/or server site failures. To this end, we will exploit relocation: under failure conditions, a request may be sent to an alternate destination than the one under failure-free conditions. We will provide a comprehensive overview of related work in this area, and focus in some detail on our own most recent work. The latter comprises a case study where traffic has a known origin, but we assume a degree of freedom as to where its end up being processed, which is typically the case for e. g., grid applications of the bag-of-tasks (BoT) type or for providing cloud services. In particular, we will provide in this paper a new integer linear programming (ILP) formulation to solve the resilient grid/cloud dimensioning problem using failure-dependent backup routes. Our algorithm will simultaneously decide on server and network capacity. We find that in the anycast routing problem we address, the benefit of using failure-dependent (FD) rerouting is limited compared to failure-independent (FID) backup routing. We confirm our earlier findings in terms of network capacity savings achieved by relocation compared to not exploiting relocation (order of 6-10% in the current case studies)
- …