118 research outputs found

    Multi-Terabyte EIDE Disk Arrays running Linux RAID5

    Full text link
    High-energy physics experiments are currently recording large amounts of data and in a few years will be recording prodigious quantities of data. New methods must be developed to handle this data and make analysis at universities possible. Grid Computing is one method; however, the data must be cached at the various Grid nodes. We examine some storage techniques that exploit recent developments in commodity hardware. Disk arrays using RAID level 5 (RAID-5) include both parity and striping. The striping improves access speed. The parity protects data in the event of a single disk failure, but not in the case of multiple disk failures. We report on tests of dual-processor Linux Software RAID-5 arrays and Hardware RAID-5 arrays using a 12-disk 3ware controller, in conjunction with 250 and 300 GB disks, for use in offline high-energy physics data analysis. The price of IDE disks is now less than $1/GB. These RAID-5 disk arrays can be scaled to sizes affordable to small institutions and used when fast random access at low cost is important.Comment: Talk from the 2004 Computing in High Energy and Nuclear Physics (CHEP04), Interlaken, Switzerland, 27th September - 1st October 2004, 4 pages, LaTeX, uses CHEP2004.cls. ID 47, Poster Session 2, Track

    Redundant Arrays of IDE Drives

    Get PDF
    The next generation of high-energy physics experiments is expected to gather prodigious amounts of data. New methods must be developed to handle this data and make analysis at universities possible. We examine some techniques that use recent developments in commodity hardware. We test redundant arrays of integrated drive electronics (IDE) disk drives for use in offline high-energy physics data analysis. IDE redundant array of inexpensive disks (RAID) prices now equal the cost per terabyte of million-dollar tape robots! The arrays can be scaled to sizes affordable to institutions without robots and used when fast random access at low cost is important. We also explore three methods of moving data between sites; internet transfers, hot pluggable IDE disks in FireWire cases, and writable digital video disks (DVD-R).Comment: Submitted to IEEE Transactions On Nuclear Science, for the 2001 IEEE Nuclear Science Symposium and Medical Imaging Conference, 8 pages, 1 figure, uses IEEEtran.cls. Revised March 19, 2002 and published August 200

    Adapting SAM for CDF

    Full text link
    The CDF and D0 experiments probe the high-energy frontier and as they do so have accumulated hundreds of Terabytes of data on the way to petabytes of data over the next two years. The experiments have made a commitment to use the developing Grid based on the SAM system to handle these data. The D0 SAM has been extended for use in CDF as common patterns of design emerged to meet the similar requirements of these experiments. The process by which the merger was achieved is explained with particular emphasis on lessons learned concerning the database design patterns plus realization of the use cases.Comment: Talk from the 2003 Computing in High Energy and Nuclear Physics (CHEP03), La Jolla, Ca, USA, March 2003, 4 pages, pdf format, TUAT00

    ASCR/HEP Exascale Requirements Review Report

    Full text link
    This draft report summarizes and details the findings, results, and recommendations derived from the ASCR/HEP Exascale Requirements Review meeting held in June, 2015. The main conclusions are as follows. 1) Larger, more capable computing and data facilities are needed to support HEP science goals in all three frontiers: Energy, Intensity, and Cosmic. The expected scale of the demand at the 2025 timescale is at least two orders of magnitude -- and in some cases greater -- than that available currently. 2) The growth rate of data produced by simulations is overwhelming the current ability, of both facilities and researchers, to store and analyze it. Additional resources and new techniques for data analysis are urgently needed. 3) Data rates and volumes from HEP experimental facilities are also straining the ability to store and analyze large and complex data volumes. Appropriately configured leadership-class facilities can play a transformational role in enabling scientific discovery from these datasets. 4) A close integration of HPC simulation and data analysis will aid greatly in interpreting results from HEP experiments. Such an integration will minimize data movement and facilitate interdependent workflows. 5) Long-range planning between HEP and ASCR will be required to meet HEP's research needs. To best use ASCR HPC resources the experimental HEP program needs a) an established long-term plan for access to ASCR computational and data resources, b) an ability to map workflows onto HPC resources, c) the ability for ASCR facilities to accommodate workflows run by collaborations that can have thousands of individual members, d) to transition codes to the next-generation HPC platforms that will be available at ASCR facilities, e) to build up and train a workforce capable of developing and using simulations and analysis to support HEP scientific research on next-generation systems.Comment: 77 pages, 13 Figures; draft report, subject to further revisio

    LSST: from Science Drivers to Reference Design and Anticipated Data Products

    Get PDF
    (Abridged) We describe here the most ambitious survey currently planned in the optical, the Large Synoptic Survey Telescope (LSST). A vast array of science will be enabled by a single wide-deep-fast sky survey, and LSST will have unique survey capability in the faint time domain. The LSST design is driven by four main science themes: probing dark energy and dark matter, taking an inventory of the Solar System, exploring the transient optical sky, and mapping the Milky Way. LSST will be a wide-field ground-based system sited at Cerro Pach\'{o}n in northern Chile. The telescope will have an 8.4 m (6.5 m effective) primary mirror, a 9.6 deg2^2 field of view, and a 3.2 Gigapixel camera. The standard observing sequence will consist of pairs of 15-second exposures in a given field, with two such visits in each pointing in a given night. With these repeats, the LSST system is capable of imaging about 10,000 square degrees of sky in a single filter in three nights. The typical 5σ\sigma point-source depth in a single visit in rr will be ∌24.5\sim 24.5 (AB). The project is in the construction phase and will begin regular survey operations by 2022. The survey area will be contained within 30,000 deg2^2 with ÎŽ<+34.5∘\delta<+34.5^\circ, and will be imaged multiple times in six bands, ugrizyugrizy, covering the wavelength range 320--1050 nm. About 90\% of the observing time will be devoted to a deep-wide-fast survey mode which will uniformly observe a 18,000 deg2^2 region about 800 times (summed over all six bands) during the anticipated 10 years of operations, and yield a coadded map to r∌27.5r\sim27.5. The remaining 10\% of the observing time will be allocated to projects such as a Very Deep and Fast time domain survey. The goal is to make LSST data products, including a relational database of about 32 trillion observations of 40 billion objects, available to the public and scientists around the world.Comment: 57 pages, 32 color figures, version with high-resolution figures available from https://www.lsst.org/overvie
    • 

    corecore