Multi-Terabyte EIDE Disk Arrays running Linux RAID5
High-energy physics experiments are currently recording large amounts of data
and in a few years will be recording prodigious quantities of data. New methods
must be developed to handle this data and make analysis at universities
possible. Grid Computing is one method; however, the data must be cached at the
various Grid nodes. We examine some storage techniques that exploit recent
developments in commodity hardware. Disk arrays using RAID level 5 (RAID-5)
include both parity and striping. The striping improves access speed. The
parity protects data in the event of a single disk failure, but not in the case
of multiple disk failures.
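The parity arithmetic behind this can be sketched in a few lines of Python. This is a toy model, not the Linux md driver; it only illustrates why one failed disk in a RAID-5 stripe is recoverable while two are not.

```python
# RAID-5 in miniature: data blocks are striped across disks and a parity
# block (the XOR of the data blocks) protects against one disk failure.

def parity(blocks):
    """XOR all blocks together to form the parity block."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            result[i] ^= b
    return bytes(result)

def rebuild(surviving_blocks, parity_block):
    """Recover one lost block by XOR-ing parity with the survivors."""
    return parity(list(surviving_blocks) + [parity_block])

# Stripe of three data blocks on three disks, parity on a fourth.
d0, d1, d2 = b"aaaa", b"bbbb", b"cccc"
p = parity([d0, d1, d2])

# Simulate losing the disk holding d1 and rebuilding it from the rest.
recovered = rebuild([d0, d2], p)
assert recovered == d1

# With two disks lost (say d1 and d2), XOR of the remaining blocks no
# longer isolates either missing block, so the stripe is unrecoverable.
```

The same XOR identity is what lets a degraded array keep serving reads while a replacement disk is rebuilt.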
We report on tests of dual-processor Linux Software RAID-5 arrays and
Hardware RAID-5 arrays using a 12-disk 3ware controller, in conjunction with
250 and 300 GB disks, for use in offline high-energy physics data analysis. The
price of IDE disks is now less than $1/GB. These RAID-5 disk arrays can be
scaled to sizes affordable to small institutions and used when fast random
access at low cost is important.
Comment: Talk from the 2004 Computing in High Energy and Nuclear Physics (CHEP04), Interlaken, Switzerland, 27th September - 1st October 2004, 4 pages, LaTeX, uses CHEP2004.cls. ID 47, Poster Session 2, Track
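The usable capacity of the arrays described above follows from the stated configuration: in an n-disk RAID-5 set, one disk's worth of space goes to parity. A quick sketch (the `raid5_usable_gb` helper is invented here for illustration):

```python
# Usable capacity of an n-disk RAID-5 array: (n - 1) * disk_size,
# since one disk's worth of capacity is consumed by parity.

def raid5_usable_gb(n_disks, disk_gb):
    return (n_disks - 1) * disk_gb

# The 12-disk 3ware configuration with the 250 and 300 GB disks
# mentioned in the abstract.
for disk_gb in (250, 300):
    usable = raid5_usable_gb(12, disk_gb)
    print(f"12 x {disk_gb} GB -> {usable} GB usable")
# 12 x 250 GB -> 2750 GB usable
# 12 x 300 GB -> 3300 GB usable
```

At the quoted price of under $1/GB, such a multi-terabyte array costs on the order of a few thousand dollars in raw disks.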
Redundant Arrays of IDE Drives
The next generation of high-energy physics experiments is expected to gather
prodigious amounts of data. New methods must be developed to handle this data
and make analysis at universities possible. We examine some techniques that use
recent developments in commodity hardware. We test redundant arrays of
integrated drive electronics (IDE) disk drives for use in offline high-energy
physics data analysis. IDE redundant array of inexpensive disks (RAID) prices
now equal the cost per terabyte of million-dollar tape robots! The arrays can
be scaled to sizes affordable to institutions without robots and used when fast
random access at low cost is important. We also explore three methods of moving
data between sites: internet transfers, hot-pluggable IDE disks in FireWire
cases, and writable digital video disks (DVD-R).
Comment: Submitted to IEEE Transactions on Nuclear Science, for the 2001 IEEE Nuclear Science Symposium and Medical Imaging Conference, 8 pages, 1 figure, uses IEEEtran.cls. Revised March 19, 2002 and published August 200
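Rough arithmetic makes the cost claim concrete. The disk price is the roughly $1/GB figure quoted in the companion CHEP04 abstract; the robot capacity is an illustrative assumption, not a number from the paper.

```python
# Back-of-the-envelope comparison behind "IDE RAID prices now equal
# the cost per terabyte of million-dollar tape robots".

disk_cost_per_gb = 1.0                      # dollars, commodity IDE (2004 figure)
disk_cost_per_tb = disk_cost_per_gb * 1000  # $1000 per raw TB

robot_cost = 1_000_000                      # "million-dollar tape robot"
robot_capacity_tb = 1000                    # assumed ~1 PB robot (illustrative)
robot_cost_per_tb = robot_cost / robot_capacity_tb

print(disk_cost_per_tb, robot_cost_per_tb)  # comparable $/TB
```

The disks, unlike tape, also provide the fast random access the abstract emphasizes.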
Working with Arrays of Inexpensive EIDE Disk Drives
In today's marketplace, the cost per Terabyte of disks with EIDE interfaces is about a third that of disks with SCSI. Hence, three times as many particle physics events could be put online with EIDE. The modern EIDE interface includes many of the performance features that appeared earlier in SCSI. EIDE bus speeds approach 33 Megabytes/s and need only be shared between two disks rather than seven. The internal I/O rate of very fast (and expensive) SCSI disks is only 50 per cent greater than that of EIDE disks. Hence, two EIDE disks whose combined cost is much less than one very fast SCSI disk can actually give more data throughput, due to the advantage of multiple spindles and head actuators. We explore the use of 12 and 16 Gigabyte EIDE disks with motherboard and PCI bus card interfaces on a number of operating systems and CPUs. These include Red Hat Linux and Windows 95/98 on a Pentium, MacOS and Apple's Rhapsody/NeXT/UNIX on a PowerPC, and Sun Solaris on an UltraSparc 10 workstation.
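The multiple-spindle argument reduces to simple arithmetic. The 50 per cent figure comes from the abstract; the absolute EIDE rate is an assumption chosen purely for illustration.

```python
# Why two cheap EIDE disks can out-perform one fast SCSI disk.

eide_rate = 10.0             # MB/s per EIDE disk (illustrative assumption)
scsi_rate = eide_rate * 1.5  # "only 50 per cent greater", per the abstract

two_eide = 2 * eide_rate     # two spindles/actuators reading in parallel
assert two_eide > scsi_rate  # 20 MB/s vs 15 MB/s, at lower combined cost
```

The advantage holds for any base rate, since 2x is always greater than 1.5x; the real-world caveat is that both EIDE disks must avoid sharing one saturated bus.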
Adapting SAM for CDF
The CDF and D0 experiments probe the high-energy frontier and as they do so
have accumulated hundreds of Terabytes of data on the way to petabytes of data
over the next two years. The experiments have made a commitment to use the
developing Grid based on the SAM system to handle these data. The D0 SAM has
been extended for use in CDF as common patterns of design emerged to meet the
similar requirements of these experiments. The process by which the merger was
achieved is explained, with particular emphasis on lessons learned concerning
database design patterns and the realization of the use cases.
Comment: Talk from the 2003 Computing in High Energy and Nuclear Physics (CHEP03), La Jolla, CA, USA, March 2003, 4 pages, PDF format, TUAT00
Remote procedure execution software for distributed systems
Remote Procedure Execution facilitates the construction of distributed software systems spanning computers of various types. Programmers who use the RPX package specify subroutine calls which are to be executed on a remote computer. RPX is used to generate code for dummy routines which transmit input parameters and receive output parameters, as well as a main program which receives procedure-call requests, calls the requested procedure, and returns the result. The package automatically performs data-type conversions and uses an appropriate connection-oriented protocol. Supported operating systems/processors are VMS (VAX), UNIX (MIPS R2000, R3000), and Software Components Group's pSOS (680x0). Connection-oriented protocols are supported over Ethernet (TCP/IP) and RS232 (a package of our own design). 2 refs., 2 figs.
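The stub/dispatcher pattern that RPX generates can be sketched in modern Python rather than the original VMS/UNIX/pSOS code. The names here (`call_remote`, `serve_one`, the JSON wire format) are invented for illustration, not part of the RPX package: the client-side "dummy routine" marshals input parameters and ships them over a connection-oriented channel, and the server main loop dispatches to the requested procedure and returns the result.

```python
import json
import socket
import threading

PROCEDURES = {"add": lambda a, b: a + b}   # procedures exported by the server

def serve_one(conn):
    """Server main loop body: receive one request, dispatch, reply."""
    request = json.loads(conn.recv(4096).decode())
    result = PROCEDURES[request["proc"]](*request["args"])
    conn.sendall(json.dumps({"result": result}).encode())

def call_remote(conn, proc, *args):
    """Client-side dummy routine: marshal inputs, return the output."""
    conn.sendall(json.dumps({"proc": proc, "args": args}).encode())
    return json.loads(conn.recv(4096).decode())["result"]

# Wire the two halves together over an in-process socket pair; in a real
# deployment this would be a TCP connection between machines.
client_sock, server_sock = socket.socketpair()
server = threading.Thread(target=serve_one, args=(server_sock,))
server.start()
print(call_remote(client_sock, "add", 2, 3))   # -> 5
server.join()
```

JSON here stands in for the automatic data-type conversion the abstract describes: both sides agree on a machine-independent encoding, so the caller never sees the marshalling.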