Multi-Terabyte EIDE Disk Arrays running Linux RAID5
High-energy physics experiments are currently recording large amounts of data
and in a few years will be recording prodigious quantities of data. New methods
must be developed to handle this data and make analysis at universities
possible. Grid Computing is one method; however, the data must be cached at the
various Grid nodes. We examine some storage techniques that exploit recent
developments in commodity hardware. Disk arrays using RAID level 5 (RAID-5)
include both parity and striping. The striping improves access speed. The
parity protects data in the event of a single disk failure, but not in the case
of multiple disk failures.
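The parity mechanism described above can be sketched in a few lines of Python (a hedged illustration, not the authors' code): RAID-5 parity is the bitwise XOR of the data blocks in a stripe, so any single lost block can be rebuilt by XOR-ing the surviving blocks with the parity.

```python
from functools import reduce

def parity(blocks):
    # RAID-5 parity: bitwise XOR across the corresponding bytes of
    # every data block in a stripe
    return bytes(reduce(lambda a, b: a ^ b, t) for t in zip(*blocks))

# one stripe spread over three data disks plus one parity disk
data = [b"ABCD", b"EFGH", b"IJKL"]
p = parity(data)

# if disk 1 fails, XOR-ing the surviving blocks with the parity
# reconstructs the lost block exactly
rebuilt = parity([data[0], data[2], p])
assert rebuilt == data[1]
```

This also makes the failure model concrete: one equation (the parity) can recover one unknown (a single failed disk), but not two.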
We report on tests of dual-processor Linux Software RAID-5 arrays and
Hardware RAID-5 arrays using a 12-disk 3ware controller, in conjunction with
250 and 300 GB disks, for use in offline high-energy physics data analysis. The
price of IDE disks is now less than $1/GB. These RAID-5 disk arrays can be
scaled to sizes affordable to small institutions and used when fast random
access at low cost is important.
Comment: Talk from the 2004 Computing in High Energy and Nuclear Physics (CHEP04), Interlaken, Switzerland, 27th September - 1st October 2004, 4 pages, LaTeX, uses CHEP2004.cls. ID 47, Poster Session 2, Track
A Case for Redundant Arrays of Hybrid Disks (RAHD)
The Hybrid Hard Disk Drive, originally conceived by Samsung, incorporates a Flash memory in a magnetic disk. The combined ultra-high-density benefits of magnetic storage and the low-power, fast read access of NAND technology inspire us to construct Redundant Arrays of Hybrid Disks (RAHD) as a possible alternative to today's Redundant Arrays of Independent Disks (RAIDs) and/or Massive Arrays of Idle Disks (MAIDs). We first design an internal management system (including Energy-Efficient Control) for hybrid disks. Three traces collected from real systems, as well as a synthetic trace, are then used to evaluate the RAHD arrays. The trace-driven experimental results show that in the high-speed mode, a RAHD outperforms purely-magnetic-disk-based RAIDs by a factor of 2.4–4; in the energy-efficient mode, a RAHD4/5 can save up to 89% of energy with little performance degradation.
Redundant Arrays of IDE Drives
The next generation of high-energy physics experiments is expected to gather
prodigious amounts of data. New methods must be developed to handle this data
and make analysis at universities possible. We examine some techniques that use
recent developments in commodity hardware. We test redundant arrays of
integrated drive electronics (IDE) disk drives for use in offline high-energy
physics data analysis. IDE redundant array of inexpensive disks (RAID) prices
now equal the cost per terabyte of million-dollar tape robots! The arrays can
be scaled to sizes affordable to institutions without robots and used when fast
random access at low cost is important. We also explore three methods of moving
data between sites: internet transfers, hot-pluggable IDE disks in FireWire
cases, and writable digital video disks (DVD-R).
Comment: Submitted to IEEE Transactions On Nuclear Science, for the 2001 IEEE Nuclear Science Symposium and Medical Imaging Conference, 8 pages, 1 figure, uses IEEEtran.cls. Revised March 19, 2002 and published August 200
Does science need computer science?
IBM Hursley Talks Series 3
An afternoon of talks, to be held on Wednesday March 10 from 2:30pm in Bldg 35 Lecture Room A, arranged by the School of Chemistry in conjunction with IBM Hursley and the Combechem e-Science Project. The talks are aimed at science students (undergraduate and post-graduate) from across the faculty. This is the third series of talks we have organized, but the first time we have put them together in an afternoon. The talks are general in nature and knowledge of computer science is certainly not necessary. After the talks there will be an opportunity for a discussion with the lecturers from IBM.
Does Science Need Computer Science? Chair and Moderator - Jeremy Frey, School of Chemistry.
- 14:00 "Computer games for fun and profit" (*) - Andrew Reynolds
- 14:45 "Anyone for tennis? The science behind WIBMledon" (*) - Matt Roberts
- 15:30 Tea (Chemistry Foyer, Bldg 29, opposite Bldg 35)
- 15:45 "Disk Drive physics from grandmothers to gigabytes" (*) - Steve Legg
- 16:35 "What could happen to your data?" (*) - Nick Jones
- 17:20 Panel Session, comprising the four IBM speakers and May Glover-Gunn (IBM)
- 18:00 Reception
Redundancy and Aging of Efficient Multidimensional MDS-Parity Protected Distributed Storage Systems
The effect of redundancy on the aging of an efficient Maximum Distance
Separable (MDS) parity-protected distributed storage system that consists of
multidimensional arrays of storage units is explored. In light of the
experimental evidence and survey data, this paper develops generalized
expressions for the reliability of array storage systems based on more
realistic time to failure distributions such as Weibull. For instance, a
distributed disk array system is considered in which the array components are
disseminated across the network and are subject to independent failure rates.
Building on these, generalized closed-form hazard-rate expressions are derived.
These expressions are extended to estimate the asymptotic reliability
behavior of large scale storage networks equipped with MDS parity-based
protection. Unlike previous studies, a generic hazard rate function is assumed,
a generic MDS code for parity generation is used, and an evaluation of the
implications of adjustable redundancy level for an efficient distributed
storage system is presented. Results of this study are applicable to any
erasure correction code as long as it is accompanied by a suitable structure
and an appropriate encoding/decoding algorithm such that the MDS property is maintained.
Comment: 11 pages, 6 figures, Accepted for publication in IEEE Transactions on Device and Materials Reliability (TDMR), Nov. 201
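The kind of reliability expression the abstract refers to can be illustrated with a standard k-out-of-n survival formula under Weibull lifetimes. This is a simplified sketch assuming independent, identically distributed, non-repairable units (the paper's generalized expressions are richer); the function names are our own.

```python
import math
from math import comb

def weibull_R(t, beta, eta):
    # Weibull survival (reliability) function: R(t) = exp(-(t/eta)^beta)
    # beta > 1 models wear-out; beta = 1 reduces to the exponential case
    return math.exp(-((t / eta) ** beta))

def mds_array_R(t, n, k, beta, eta):
    # An (n, k) MDS-protected array survives while at most n-k units
    # have failed; sum the binomial survival terms over 0..n-k failures
    R = weibull_R(t, beta, eta)
    F = 1.0 - R
    return sum(comb(n, i) * F**i * R**(n - i) for i in range(n - k + 1))

# e.g. 12 disks tolerating 2 failures (k = 10), characteristic life eta = 10
print(mds_array_R(t=2.0, n=12, k=10, beta=1.5, eta=10.0))
```

Because the hazard rate of a Weibull unit with beta > 1 grows with time, the array's survival probability decays faster than the memoryless (exponential) model would predict, which is the aging effect the paper quantifies.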
Alpha Entanglement Codes: Practical Erasure Codes to Archive Data in Unreliable Environments
Data centres that use consumer-grade disk drives and distributed
peer-to-peer systems are unreliable environments to archive data without enough
redundancy. Most redundancy schemes are not completely effective for providing
high availability, durability and integrity in the long-term. We propose alpha
entanglement codes, a mechanism that creates a virtual layer of highly
interconnected storage devices to propagate redundant information across a
large scale storage system. Our motivation is to design flexible and practical
erasure codes with high fault-tolerance to improve data durability and
availability even in catastrophic scenarios. By flexible and practical, we mean
code settings that can be adapted to future requirements and practical
implementations with reasonable trade-offs between security, resource usage and
performance. The codes have three parameters. Alpha increases storage overhead
linearly but increases the possible paths to recover data exponentially. Two
other parameters increase fault-tolerance even further without the need for
additional storage. As a result, an entangled storage system can provide high
availability, durability and offer additional integrity: it is more difficult
to modify data undetectably. We evaluate how several redundancy schemes perform
in unreliable environments and show that alpha entanglement codes are flexible
and practical codes. Remarkably, they excel at code locality, hence, they
reduce repair costs and become less dependent on storage locations with poor
availability. Our solution outperforms Reed-Solomon codes in many disaster
recovery scenarios.
Comment: The publication has 12 pages and 13 figures. This work was partially supported by Swiss National Science Foundation SNSF Doc.Mobility 162014. 2018 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN)
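The chaining idea behind entanglement codes can be sketched with a minimal single-chain variant (a hedged illustration under our own simplifications; alpha entanglement codes generalize this to multiple interleaved chains, which is where the exponential growth in recovery paths comes from). Each parity is the XOR of the current data block with the previous parity, so every block is transitively linked to all earlier ones.

```python
def entangle(blocks):
    # simple single-chain entanglement: p[i] = d[i] XOR p[i-1],
    # starting from an all-zero parity
    parities = []
    prev = bytes(len(blocks[0]))
    for d in blocks:
        prev = bytes(a ^ b for a, b in zip(d, prev))
        parities.append(prev)
    return parities

blocks = [b"AB", b"CD", b"EF"]
ps = entangle(blocks)

# any data block is recoverable from its two neighbouring parities:
# d[i] = p[i] XOR p[i-1]
recovered = bytes(a ^ b for a, b in zip(ps[1], ps[0]))
assert recovered == b"CD"
```

With several such chains woven through the same blocks (the "alpha" parameter), a lost block has multiple independent recovery paths, which is what gives the codes their locality and fault-tolerance.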
Self-Repairing Disk Arrays
As the prices of magnetic storage continue to decrease, the cost of replacing
failed disks becomes increasingly dominated by the cost of the service call
itself. We propose to eliminate these calls by building disk arrays that
contain enough spare disks to operate without any human intervention during
their whole lifetime. To evaluate the feasibility of this approach, we have
simulated the behavior of two-dimensional disk arrays with n parity disks and
n(n-1)/2 data disks under realistic failure and repair assumptions. Our
conclusion is that having n(n+1)/2 spare disks is more than enough to achieve a
99.999 percent probability of not losing data over four years. We observe that
the same objectives cannot be reached with RAID level 6 organizations and would
require RAID stripes that could tolerate triple disk failures.
Comment: Part of ADAPT Workshop proceedings, 2015 (arXiv:1412.2347)
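The disk counts the abstract quotes can be made concrete with a small helper (`array_layout` is our own illustrative function, not from the paper):

```python
def array_layout(n):
    # Two-dimensional array from the paper: n parity disks and
    # n(n-1)/2 data disks; the proposed spare pool is n(n+1)/2 disks
    data = n * (n - 1) // 2
    parity = n
    spares = n * (n + 1) // 2
    return {"data": data, "parity": parity, "spares": spares,
            "total": data + parity + spares}

print(array_layout(8))
# → {'data': 28, 'parity': 8, 'spares': 36, 'total': 72}
```

Note that the spare pool, n(n+1)/2, is slightly larger than the data capacity it protects, n(n-1)/2, which is the price of running the whole array lifetime without a service call.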