383,879 research outputs found
The CDF Data Handling System
The Collider Detector at Fermilab (CDF) records proton-antiproton collisions
at center of mass energy of 2.0 TeV at the Tevatron collider. A new collider
run, Run II, of the Tevatron started in April 2001. Increased luminosity will
result in about 1~PB of data recorded on tapes in the next two years. Currently
the CDF experiment has about 260 TB of data stored on tapes. This amount
includes raw and reconstructed data and their derivatives.
The data storage and retrieval are managed by the CDF Data Handling (DH)
system. This system has been designed to accommodate the increased demands of
the Run II environment and has proven robust and reliable in providing reliable
flow of data from the detector to the end user. This paper gives an overview of
the CDF Run II Data Handling system which has evolved significantly over the
course of this year. An outline of the future direction of the system is given.Comment: Talk from the 2003 Computing in High Energy and Nuclear Physics
(CHEP03), La Jolla, Ca, USA, March 2003, 7 pages, LaTeX, 4 EPS figures, PSN
THKT00
Reliable Communication in a Dynamic Network in the Presence of Byzantine Faults
We consider the following problem: two nodes want to reliably communicate in
a dynamic multihop network where some nodes have been compromised, and may have
a totally arbitrary and unpredictable behavior. These nodes are called
Byzantine. We consider the two cases where cryptography is available and not
available. We prove the necessary and sufficient condition (that is, the
weakest possible condition) to ensure reliable communication in this context.
Our proof is constructive, as we provide Byzantine-resilient algorithms for
reliable communication that are optimal with respect to our impossibility
results. In a second part, we investigate the impact of our conditions in three
case studies: participants interacting in a conference, robots moving on a grid
and agents in the subway. Our simulations indicate a clear benefit of using our
algorithms for reliable communication in those contexts
Asymmetric Distributed Trust
Quorum systems are a key abstraction in distributed fault-tolerant computing for capturing trust assumptions. They can be found at the core of many algorithms for implementing reliable broadcasts, shared memory, consensus and other problems. This paper introduces asymmetric Byzantine quorum systems that model subjective trust. Every process is free to choose which combinations of other processes it trusts and which ones it considers faulty. Asymmetric quorum systems strictly generalize standard Byzantine quorum systems, which have only one global trust assumption for all processes. This work also presents protocols that implement abstractions of shared memory and broadcast primitives with processes prone to Byzantine faults and asymmetric trust. The model and protocols pave the way for realizing more elaborate algorithms with asymmetric trust
Solving Lattice QCD systems of equations using mixed precision solvers on GPUs
Modern graphics hardware is designed for highly parallel numerical tasks and
promises significant cost and performance benefits for many scientific
applications. One such application is lattice quantum chromodyamics (lattice
QCD), where the main computational challenge is to efficiently solve the
discretized Dirac equation in the presence of an SU(3) gauge field. Using
NVIDIA's CUDA platform we have implemented a Wilson-Dirac sparse matrix-vector
product that performs at up to 40 Gflops, 135 Gflops and 212 Gflops for double,
single and half precision respectively on NVIDIA's GeForce GTX 280 GPU. We have
developed a new mixed precision approach for Krylov solvers using reliable
updates which allows for full double precision accuracy while using only single
or half precision arithmetic for the bulk of the computation. The resulting
BiCGstab and CG solvers run in excess of 100 Gflops and, in terms of iterations
until convergence, perform better than the usual defect-correction approach for
mixed precision.Comment: 30 pages, 7 figure
Cloud Storage and Bioinformatics in a private cloud deployment: Lessons for Data Intensive research
This paper describes service portability for a private cloud deployment, including a detailed case study about Cloud Storage and bioinformatics services developed as part of the Cloud Computing Adoption Framework (CCAF). Our Cloud Storage design and deployment is based on Storage Area Network (SAN) technologies, details of which include functionalities, technical implementation, architecture and user support. Experiments for data services (backup automation, data recovery and data migration) are performed and results confirm backup automation is completed swiftly and is reliable for data-intensive research. The data recovery result confirms that execution time is in proportion to quantity of recovered data, but the failure rate increases in an exponential manner. The data migration result confirms execution time is in proportion to disk volume of migrated data, but again the failure rate increases in an exponential manner. In addition, benefits of CCAF are illustrated using several bioinformatics examples such as tumour modelling, brain imaging, insulin molecules and simulations for medical training. Our Cloud Storage solution described here offers cost reduction, time-saving and user friendliness
Self-Healing Computation
In the problem of reliable multiparty computation (RC), there are
parties, each with an individual input, and the parties want to jointly compute
a function over inputs. The problem is complicated by the fact that an
omniscient adversary controls a hidden fraction of the parties.
We describe a self-healing algorithm for this problem. In particular, for a
fixed function , with parties and gates, we describe how to perform
RC repeatedly as the inputs to change. Our algorithm maintains the
following properties, even when an adversary controls up to parties, for any constant . First, our
algorithm performs each reliable computation with the following amortized
resource costs: messages, computational
operations, and latency, where is the depth of the circuit
that computes . Second, the expected total number of corruptions is , after which the adversarially controlled parties are
effectively quarantined so that they cause no more corruptions.Comment: 17 pages and 1 figure. It is submitted to SSS'1
- …