A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing
Data Grids have been adopted as the platform for scientific communities that
need to share, access, transport, process and manage large data collections
distributed worldwide. They combine high-end computing technologies with
high-performance networking and wide-area storage management techniques. In
this paper, we discuss the key concepts behind Data Grids and compare them with
other data sharing and distribution paradigms such as content delivery
networks, peer-to-peer networks and distributed databases. We then provide
comprehensive taxonomies that cover various aspects of architecture, data
transportation, data replication and resource allocation and scheduling.
Finally, we map the proposed taxonomy to various Data Grid systems not only to
validate the taxonomy but also to identify areas for future exploration.
Through this taxonomy, we aim to categorise existing systems to better
understand their goals and their methodology. This would help evaluate their
applicability for solving similar problems. This taxonomy also provides a "gap
analysis" of this area through which researchers can potentially identify new
issues for investigation. Finally, we hope that the proposed taxonomy and
mapping also help to provide an easy way for new practitioners to understand
this complex area of research.
Comment: 46 pages, 16 figures, Technical Repor
The CDF Data Handling System
The Collider Detector at Fermilab (CDF) records proton-antiproton collisions
at a center-of-mass energy of 2.0 TeV at the Tevatron collider. A new collider
run, Run II, of the Tevatron started in April 2001. Increased luminosity will
result in about 1 PB of data recorded on tapes in the next two years. Currently
the CDF experiment has about 260 TB of data stored on tapes. This amount
includes raw and reconstructed data and their derivatives.
The data storage and retrieval are managed by the CDF Data Handling (DH)
system. This system has been designed to accommodate the increased demands of
the Run II environment and has proven robust and reliable in delivering a
flow of data from the detector to the end user. This paper gives an overview of
the CDF Run II Data Handling system, which has evolved significantly over the
course of this year. An outline of the future direction of the system is given.
Comment: Talk from the 2003 Computing in High Energy and Nuclear Physics
(CHEP03), La Jolla, Ca, USA, March 2003, 7 pages, LaTeX, 4 EPS figures, PSN
THKT00
A Factor Framework for Experimental Design for Performance Evaluation of Commercial Cloud Services
Given the diversity of commercial Cloud services, performance evaluations of
candidate services would be crucial and beneficial for both service customers
(e.g. cost-benefit analysis) and providers (e.g. direction of service
improvement). Before an evaluation implementation, the selection of suitable
factors (also called parameters or variables) plays a prerequisite role in
designing evaluation experiments. However, there seems to be a lack of systematic
approaches to factor selection for Cloud services performance evaluation. In
other words, evaluators have chosen experimental factors randomly and intuitively
in most of the existing evaluation studies. Based on our previous taxonomy and
modeling work, this paper proposes a factor framework for experimental design
for performance evaluation of commercial Cloud services. This framework
encapsulates the state of the practice of performance evaluation factors that
people currently take into account in the Cloud Computing domain, and in turn
can help facilitate designing new experiments for evaluating Cloud services.
Comment: 8 pages, Proceedings of the 4th International Conference on Cloud
Computing Technology and Science (CloudCom 2012), pp. 169-176, Taipei,
Taiwan, December 03-06, 201