A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing
Data Grids have been adopted as the platform for scientific communities that
need to share, access, transport, process and manage large data collections
distributed worldwide. They combine high-end computing technologies with
high-performance networking and wide-area storage management techniques. In
this paper, we discuss the key concepts behind Data Grids and compare them with
other data sharing and distribution paradigms such as content delivery
networks, peer-to-peer networks and distributed databases. We then provide
comprehensive taxonomies that cover various aspects of architecture, data
transportation, data replication and resource allocation and scheduling.
Finally, we map the proposed taxonomy to various Data Grid systems not only to
validate the taxonomy but also to identify areas for future exploration.
Through this taxonomy, we aim to categorise existing systems to better
understand their goals and their methodology. This would help evaluate their
applicability for solving similar problems. This taxonomy also provides a "gap
analysis" of this area through which researchers can potentially identify new
issues for investigation. Finally, we hope that the proposed taxonomy and
mapping also help to provide an easy way for new practitioners to understand
this complex area of research.
Comment: 46 pages, 16 figures, Technical Report
A Taxonomy of Workflow Management Systems for Grid Computing
With the advent of Grid and application technologies, scientists and
engineers are building more and more complex applications to manage and process
large data sets, and execute scientific experiments on distributed resources.
Such application scenarios require means for composing and executing complex
workflows. Therefore, many efforts have been made towards the development of
workflow management systems for Grid computing. In this paper, we propose a
taxonomy that characterizes and classifies various approaches for building and
executing workflows on Grids. We also survey several representative Grid
workflow systems developed by various projects world-wide to demonstrate the
comprehensiveness of the taxonomy. The taxonomy not only highlights the design
and engineering similarities and differences of state-of-the-art Grid
workflow systems, but also identifies the areas that need further research.
Comment: 29 pages, 15 figures
Querying Large Physics Data Sets Over an Information Grid
Optimising use of the Web (WWW) for LHC data analysis is a complex problem
and illustrates the challenges arising from the integration of and computation
across massive amounts of information distributed worldwide. Finding the right
piece of information can, at times, be extremely time-consuming, if not
impossible. So-called Grids have been proposed to facilitate LHC computing and
many groups have embarked on studies of data replication, data migration and
networking philosophies. Other aspects such as the role of 'middleware' for
Grids are emerging as requiring research. This paper positions the need for
appropriate middleware that enables users to resolve physics queries across
massive data sets. It identifies the role of meta-data for query resolution and
the importance of Information Grids for high-energy physics analysis rather
than just Computational or Data Grids. This paper identifies software that is
being implemented at CERN to enable the querying of very large collaborating
HEP data-sets, initially being employed for the construction of CMS detectors.
Comment: 4 pages, 3 figures
Toward a Formal Semantics for Autonomic Components
Autonomic management can improve the QoS provided by parallel/distributed
applications. Within the CoreGRID Component Model, the autonomic management is
tailored to the automatic - monitoring-driven - alteration of the component
assembly and, therefore, is defined as the effect of (distributed) management
code. This work yields a semantics based on hypergraph rewriting suitable to
model the dynamic evolution and non-functional aspects of Service Oriented
Architectures and component-based autonomic applications. In this regard, our
main goal is to provide a formal description of adaptation operations that are
typically only informally specified. We contend that our approach makes it easier
to raise the level of abstraction of management code in autonomic and adaptive
applications.
Comment: 11 pages + cover page
Providing Transaction Class-Based QoS in In-Memory Data Grids via Machine Learning
Elastic architectures and the "pay-as-you-go" resource pricing model offered by many cloud infrastructure providers may seem the right choice for companies dealing with data-centric applications characterized by highly variable workloads. In such a context, in-memory transactional data grids have proven particularly well suited to exploiting the advantages of elastic computing platforms, mainly thanks to their ability to be dynamically (re-)sized and tuned. However, when specific QoS requirements have to be met, such architectures have proven complex for humans to manage. In particular, their management is a very complex task without the support of mechanisms for run-time automatic sizing/tuning of the data platform and the underlying (virtual) hardware resources provided by the cloud. In this paper, we present a neural network-based architecture where the system is constantly and automatically re-configured, particularly in terms of computing resources
- …