DataMover: robust terabyte-scale multi-file replication over wide-area networks

By Alex Sim, Junmin Gu, Arie Shoshani and Vijaya Natarajan


Large scientific datasets (on the order of terabytes) are typically generated at large computational centers and stored on mass storage systems. However, large subsets of the data need to be moved to facilities available to application scientists for analysis. Replicating thousands of files is a tedious and error-prone, but extremely important, task in scientific applications. Automating it requires automatic space acquisition and reuse; monitoring the progress of staging thousands of files from the source mass storage system, transferring them over the network, and archiving them at the target mass storage system or disk systems; and recovering from transient system failures. We have developed a robust replication system, called DataMover, which is now in regular use in high-energy physics and climate modeling experiments. A single command is enough to request multi-file replication or the replication of an entire directory. A web-based tool was developed to dynamically monitor the progress of the multi-file replication process.
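
The stage, transfer, archive sequence with recovery from transient failures can be illustrated with a small sketch. This is not DataMover's code or interface: `TransientError`, `FileStatus`, `with_retry`, and `replicate` are hypothetical names introduced here, the three step functions are supplied by the caller, and space acquisition and reuse are not modeled.

```python
import time
from dataclasses import dataclass


class TransientError(Exception):
    """Hypothetical marker for a failure worth retrying (e.g. a dropped connection)."""


@dataclass
class FileStatus:
    """Per-file progress record, the kind of state a monitoring tool could display."""
    name: str
    stage: str = "pending"  # pending -> staged -> transferred -> archived
    attempts: int = 0       # total step invocations, including retries


def with_retry(step, name, status, max_attempts=3, delay=0.0):
    """Run one step for one file, retrying only on TransientError."""
    for attempt in range(1, max_attempts + 1):
        status.attempts += 1
        try:
            step(name)
            return True
        except TransientError:
            if attempt < max_attempts:
                # Linear back-off before the next attempt (an assumed policy).
                time.sleep(delay * attempt)
    return False


def replicate(names, stage, transfer, archive, max_attempts=3, delay=0.0):
    """Push each file through stage -> transfer -> archive, recording progress."""
    steps = [
        ("stage", "staged", stage),
        ("transfer", "transferred", transfer),
        ("archive", "archived", archive),
    ]
    statuses = {name: FileStatus(name) for name in names}
    for name, status in statuses.items():
        for step_name, done_label, step in steps:
            if not with_retry(step, name, status, max_attempts, delay):
                # Retries exhausted: record where the file stopped and move on.
                status.stage = f"failed:{step_name}"
                break
            status.stage = done_label
    return statuses
```

Calling `replicate(["a.dat", "b.dat"], stage_fn, transfer_fn, archive_fn)` with caller-supplied step functions returns one `FileStatus` per file. A file whose step keeps failing is marked `failed:<step>` and does not block the remaining files, which is the behavior one would want when a request covers thousands of files.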

Topics: Management, Storage, 99 General And Miscellaneous//Mathematics, Computing, And Information Science, Monitoring, Data Replication Datamover, Automation, Targets, Climates, 71 Classical And Quantum Mechanics, General Physics, High Energy Physics, Transients, Simulation, Monitors
Publisher: Lawrence Berkeley National Laboratory
Year: 2004
Provided by: UNT Digital Library