Multi-Terabyte EIDE Disk Arrays running Linux RAID5

Author: Cremaldi L. M.
Eschenburg V.
Godang R.
Joy M. D.
Petravick D. L.
Sanders D. A.
Summers D. J.
Publication date: 19/11/2004

High-energy physics experiments are currently recording large amounts of data and in a few years will be recording prodigious quantities of data. New methods must be developed to handle this data and make analysis at universities possible. Grid Computing is one method; however, the data must be cached at the various Grid nodes. We examine some storage techniques that exploit recent developments in commodity hardware. Disk arrays using RAID level 5 (RAID-5) include both parity and striping. The striping improves access speed. The parity protects data in the event of a single disk failure, but not in the case of multiple disk failures. We report on tests of dual-processor Linux Software RAID-5 arrays and Hardware RAID-5 arrays using a 12-disk 3ware controller, in conjunction with 250 and 300 GB disks, for use in offline high-energy physics data analysis. The price of IDE disks is now less than $1/GB. These RAID-5 disk arrays can be scaled to sizes affordable to small institutions and used when fast random access at low cost is important. Comment: Talk from the 2004 Computing in High Energy and Nuclear Physics (CHEP04), Interlaken, Switzerland, 27th September - 1st October 2004, 4 pages, LaTeX, uses CHEP2004.cls. ID 47, Poster Session 2, Track
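The parity property described in the abstract above can be illustrated with a minimal sketch. It assumes byte-wise XOR parity over one stripe and, for simplicity, a fixed parity position (real RAID-5 rotates parity across the disks). The function names `xor_blocks`, `make_stripe`, and `rebuild` are hypothetical and are not taken from the paper or from the Linux RAID code.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte column at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Return the data blocks followed by one parity block (hypothetical layout)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def rebuild(stripe, lost_index):
    """Reconstruct the single missing block by XOR-ing all surviving blocks."""
    survivors = [block for i, block in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


# Three data "disks" plus one parity "disk" for a single stripe.
data = [b"ABCD", b"EFGH", b"IJKL"]
stripe = make_stripe(data)

# Losing any one block (data or parity) is recoverable from the other three.
recovered = rebuild(stripe, 1)
```

With two blocks missing, the single XOR equation has two unknowns, which is why this scheme does not survive multiple simultaneous disk failures.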

arXiv.org e-Print Archive

CERN Document Server

Fine-Grain Checkpointing with In-Cache-Line Logging

Author: Aksun David T.
Avni Hillel
Cohen Nachshon
Larus James R.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 02/02/2019

Non-Volatile Memory offers the possibility of implementing high-performance, durable data structures. However, achieving performance comparable to well-designed data structures in non-persistent (transient) memory is difficult, primarily because of the cost of ensuring the order in which memory writes reach NVM. Often, this requires flushing data to NVM and waiting a full memory round-trip time. In this paper, we introduce two new techniques: Fine-Grained Checkpointing, which ensures a consistent, quickly recoverable data structure in NVM after a system failure, and In-Cache-Line Logging, an undo-logging technique that enables recovery of earlier state without requiring cache-line flushes in the normal case. We implemented these techniques in the Masstree data structure, making it persistent and demonstrating the ease of applying them to a highly optimized system and their low (5.9-15.4%) runtime overhead cost. Comment: In 2019 Architectural Support for Programming Languages and Operating Systems (ASPLOS 19), April 13, 2019, Providence, RI, US

arXiv.org e-Print Archive

Crossref

SSP: Eliminating Redundant Writes in Failure-Atomic NVRAMs via Shadow Sub-Paging

Author: Bittman Daniel
Coburn Joel
Hitz Dave
Kolli Aasheesh
Kwon Youngjin
Lee Changman
Lee Se Kwon
Minh Chi Cao
Ni Yuanjiang
Pelley Steven
Talluri Madhusudhan
Venkataraman Shivaram
Volos Haris
Xu Jian
Yang Jun
Zhao Jishen
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

Crossref

eScholarship - University of California

Data processing model for the CDF experiment

Author: Antos J.
Babik M.
Benjamin D.
Cabrera S.
Chan A. W.
Chen Y. C.
Coca M.
Cooper B.
Farrington S.
Genser K.
Hatakeyama K.
Hou S.
Hsieh T. L.
Jayatilaka B.
Jun S. Y.
Kotwal A. V.
Kraan A. C.
Lysak R.
Mandrichenko I. V.
Murat P.
Robson A.
Savard P.
Siket M.
Stelzer B.
Syu J.
Teng P. K.
Timm S. C.
Tomura T.
Vataga E.
Wolbers S. A.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 05/06/2006

The data processing model for the CDF experiment is described. Data processing reconstructs events from parallel data streams taken with different combinations of physics event triggers and further splits the events into specialized physics datasets. The design of the processing control system faces strict requirements on bookkeeping records, which trace the status of data files and event contents during processing and storage. The computing architecture was updated to meet the mass data flow of the Run II data collection, recently upgraded to a maximum rate of 40 MByte/sec. The data processing facility consists of a large cluster of Linux computers with data movement managed by the CDF data handling system to a multi-petaByte Enstore tape library. The latest processing cycle has achieved a stable speed of 35 MByte/sec (3 TByte/day). It can be readily scaled by increasing CPU and data-handling capacity as required. Comment: 12 pages, 10 figures, submitted to IEEE-TN
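As a quick check of the throughput figures quoted in the abstract above, the conversion from MByte/sec to TByte/day can be sketched as follows. The sketch assumes decimal units (1 TByte = 1,000,000 MByte) and a sustained rate over a full 24-hour day. The function name is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def mbyte_per_sec_to_tbyte_per_day(rate_mbyte_per_sec):
    """Convert a sustained rate in MByte/sec to TByte/day (decimal units assumed)."""
    return rate_mbyte_per_sec * SECONDS_PER_DAY / 1_000_000


# Stable processing speed quoted in the abstract.
processing_tb_per_day = mbyte_per_sec_to_tbyte_per_day(35)

# Maximum data collection rate quoted in the abstract.
collection_tb_per_day = mbyte_per_sec_to_tbyte_per_day(40)
```

Under these assumptions, 35 MByte/sec works out to about 3.0 TByte/day, consistent with the figure in the abstract, and the 40 MByte/sec maximum collection rate corresponds to roughly 3.5 TByte/day.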

arXiv.org e-Print Archive

Crossref

UCL Discovery

CERN Document Server

Computing infrastructure issues in distributed communications systems : a survey of operating system transport system architectures

Author: Schmidt Douglas C.
Suda Tatsuya
Publication venue: eScholarship, University of California
Publication date: 01/01/1992

The performance of distributed applications (such as file transfer, remote login, tele-conferencing, full-motion video, and scientific visualization) is influenced by several factors that interact in complex ways. In particular, application performance is significantly affected both by communication infrastructure factors and computing infrastructure factors. Communication infrastructure factors include channel speed, bit-error rate, and congestion at intermediate switching nodes. Computing infrastructure factors include (among other things) both protocol processing activities (such as connection management, flow control, error detection, and retransmission) and general operating system factors (such as memory latency, CPU speed, interrupt and context switching overhead, process architecture, and message buffering). Due to an increase of several orders of magnitude in network channel speed and an increase in application diversity, performance bottlenecks are shifting from the network factors to the transport system factors. This paper defines an abstraction called an "Operating System Transport System Architecture" (OSTSA) that is used to classify the major components and services in the computing infrastructure. End-to-end network protocols such as TCP, TP4, VMTP, XTP, and Delta-t typically run on general-purpose computers, where they utilize various operating system resources such as processors, virtual memory, and network controllers. The OSTSA provides services that integrate these resources to support distributed applications running on local and wide area networks. A taxonomy is presented to evaluate OSTSAs in terms of their support for protocol processing activities. We use this taxonomy to compare and contrast five general-purpose commercial and experimental operating systems including System V UNIX, BSD UNIX, the x-kernel, Choices, and Xinu.

eScholarship - University of California

Data production models for the CDF experiment

Author: Antos J.
Babik M.
Benjamin D.
Cabrera S.
Chan A. W.
Chen Y. C.
Coca M.
Cooper B.
Genser K.
Hatakeyama K.
Hou S.
Hsieh T. L.
Jayatilaka B.
Kraan A. C.
Lysak R.
Mandrichenko I. V.
Robson A.
Siket M.
Stelzer B.
Syu J.
Teng P. K.
Timm S. C.
Tomura T.
Vataga E.
Wolbers S. A.
Yeh P.
Publication date: 01/06/2006

The data production for the CDF experiment is conducted on a large Linux PC farm designed to meet the needs of data collection at a maximum rate of 40 MByte/sec. We present two data production models that exploit advances in computing and communication technology. The first production farm is a centralized system that has achieved a stable data processing rate of approximately 2 TByte per day. The recently upgraded farm has been migrated to the SAM (Sequential Access to data via Metadata) data handling system. The software and hardware of the CDF production farms have been successful in providing large computing and data throughput capacity to the experiment. Comment: 8 pages, 9 figures; presented at HPC Asia2005, Beijing, China, Nov 30 - Dec 3, 200

arXiv.org e-Print Archive

UNT Digital Library

CERN Document Server