7,518 research outputs found
A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing
Data Grids have been adopted as the platform for scientific communities that
need to share, access, transport, process and manage large data collections
distributed worldwide. They combine high-end computing technologies with
high-performance networking and wide-area storage management techniques. In
this paper, we discuss the key concepts behind Data Grids and compare them with
other data sharing and distribution paradigms such as content delivery
networks, peer-to-peer networks and distributed databases. We then provide
comprehensive taxonomies that cover various aspects of architecture, data
transportation, data replication and resource allocation and scheduling.
Finally, we map the proposed taxonomy to various Data Grid systems not only to
validate the taxonomy but also to identify areas for future exploration.
Through this taxonomy, we aim to categorise existing systems to better
understand their goals and their methodology. This would help evaluate their
applicability for solving similar problems. This taxonomy also provides a "gap
analysis" of this area through which researchers can potentially identify new
issues for investigation. Finally, we hope that the proposed taxonomy and
mapping also helps to provide an easy way for new practitioners to understand
this complex area of research.Comment: 46 pages, 16 figures, Technical Repor
Towards a service-oriented e-infrastructure for multidisciplinary environmental research
Research e-infrastructures are considered to have generic and thematic parts. The generic part provids high-speed networks, grid (large-scale distributed computing) and database systems (digital repositories and data transfer systems) applicable to all research commnities irrespective of discipline. Thematic parts are specific deployments of e-infrastructures to support diverse virtual research communities. The needs of a virtual community of multidisciplinary envronmental researchers are yet to be investigated. We envisage and argue for an e-infrastructure that will enable environmental researchers to develop environmental models and software entirely out of existing components through loose coupling of diverse digital resources based on the service-oriented achitecture. We discuss four specific aspects for consideration for a future e-infrastructure: 1) provision of digital resources (data, models & tools) as web services, 2) dealing with stateless and non-transactional nature of web services using workflow management systems, 3) enabling web servce discovery, composition and orchestration through semantic registries, and 4) creating synergy with existing grid infrastructures
Recommended from our members
GRIDCC: Real-time workflow system
The Grid is a concept which allows the sharing of resources between distributed communities, allowing each to progress towards potentially different goals. As adoption of the Grid increases so are the activities that people wish to conduct through it. The GRIDCC project is a European Union funded project addressing the issues of integrating instruments into the Grid. This increases the requirement of workflows and Quality of Service upon these workflows as many of these instruments have real-time requirements. In this paper we present the workflow management service within the GRIDCC project which is tasked with optimising the workflows and ensuring that they meet the pre-defined QoS requirements specified upon them
Querying Large Physics Data Sets Over an Information Grid
Optimising use of the Web (WWW) for LHC data analysis is a complex problem
and illustrates the challenges arising from the integration of and computation
across massive amounts of information distributed worldwide. Finding the right
piece of information can, at times, be extremely time-consuming, if not
impossible. So-called Grids have been proposed to facilitate LHC computing and
many groups have embarked on studies of data replication, data migration and
networking philosophies. Other aspects such as the role of 'middleware' for
Grids are emerging as requiring research. This paper positions the need for
appropriate middleware that enables users to resolve physics queries across
massive data sets. It identifies the role of meta-data for query resolution and
the importance of Information Grids for high-energy physics analysis rather
than just Computational or Data Grids. This paper identifies software that is
being implemented at CERN to enable the querying of very large collaborating
HEP data-sets, initially being employed for the construction of CMS detectors.Comment: 4 pages, 3 figure
- …