
    Compression of interferometric radio-astronomical data

    The volume of radio-astronomical data is a considerable burden in the processing and storing of radio observations with high time and frequency resolutions and large bandwidths. Lossy compression of interferometric radio-astronomical data is considered to reduce the volume of visibility data and to speed up processing. A new compression technique named "Dysco" is introduced that consists of two steps: a normalization step, in which grouped visibilities are normalized to have a similar distribution; and a quantization and encoding step, which rounds values to a given quantization scheme using a dithering scheme. Several non-linear quantization schemes are tested and combined with different methods for normalizing the data. Four data sets with observations from the LOFAR and MWA telescopes are processed with different processing strategies and different combinations of normalization and quantization. The effects of compression are measured in the image plane. The noise added by the lossy compression technique acts like normal system noise. The accuracy of Dysco depends on the signal-to-noise ratio of the data: noisy data can be compressed with a smaller loss of image quality. Data with typical correlator time and frequency resolutions can be compressed by a factor of 6.4 for LOFAR and 5.3 for MWA observations with less than 1% added system noise. An implementation of the compression technique is released that provides a Casacore storage manager and allows transparent encoding and decoding. Encoding and decoding are faster than the read/write speed of typical disks. The technique can be used for LOFAR and MWA to reduce the archival space requirements for storing observed data. Data from SKA-low will likely be compressible by the same amount as LOFAR. The same technique can be used to compress data from other telescopes, but a different bit rate might be required.
    Comment: Accepted for publication in A&A. 13 pages, 8 figures. Abstract was abridged.
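
    The two steps described in the abstract (per-group normalization, then dithered quantization) can be sketched as follows. This is a minimal illustration under assumed choices (RMS normalization per group, a uniform grid over roughly plus/minus three normalized units, subtractive dithering); it is not Dysco's actual normalization or one of its non-linear quantization schemes, and the function and parameter names are hypothetical.

        import numpy as np

        def compress_group(vis, nbits=4, rng=None):
            # Step 1: normalize the group of complex visibilities to a common scale.
            rng = rng if rng is not None else np.random.default_rng(0)
            scale = np.sqrt(np.mean(np.abs(vis) ** 2))   # per-group RMS (assumed normalization)
            norm = np.asarray(vis) / scale

            # Step 2: dithered quantization of the real and imaginary parts onto
            # a uniform grid (an assumption; Dysco tests non-linear schemes).
            levels = 2 ** nbits
            step = 6.0 / levels                          # grid spans about +/- 3 normalized units

            def quantize(x):
                dither = rng.uniform(-0.5, 0.5, size=x.shape)      # decoder regenerates this from a shared seed
                codes = np.clip(np.round(x / step - dither),
                                -levels // 2, levels // 2 - 1)     # the nbits-wide integers actually stored
                return (codes + dither) * step                     # value the decoder reconstructs

            decoded = quantize(norm.real) + 1j * quantize(norm.imag)
            return decoded * scale, scale                          # reconstruction plus the stored scale factor

    The subtractive dither spreads the rounding error into noise-like error, which is consistent with the abstract's observation that the added compression error behaves like ordinary system noise.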

    Computing in the RAIN: a reliable array of independent nodes

    The RAIN project is a research collaboration between Caltech and NASA-JPL on distributed computing and data-storage systems for future spaceborne missions. The goal of the project is to identify and develop key building blocks for reliable distributed systems built with inexpensive off-the-shelf components. The RAIN platform consists of a heterogeneous cluster of computing and/or storage nodes connected via multiple interfaces to networks configured in fault-tolerant topologies. The RAIN software components run in conjunction with operating system services and standard network protocols. Through software-implemented fault tolerance, the system tolerates multiple node, link, and switch failures, with no single point of failure. The RAIN technology has been transferred to Rainfinity, a start-up company focusing on creating clustered solutions for improving the performance and availability of Internet data centers. In this paper, we describe the following contributions: 1) fault-tolerant interconnect topologies and communication protocols providing consistent error reporting of link failures, 2) fault management techniques based on group membership, and 3) data storage schemes based on computationally efficient error-control codes. We present several proof-of-concept applications: a highly-available video server, a highly-available Web server, and a distributed checkpointing system. Also, we describe a commercial product, Rainwall, built with the RAIN technology.
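
    The third contribution, data storage based on error-control codes, can be illustrated in miniature with a single XOR parity block striped across nodes, so that any one failed node can be rebuilt from the survivors. This is a generic single-parity sketch, not RAIN's actual storage scheme or API; the function names are hypothetical.

        from typing import List, Optional

        def encode_parity(blocks: List[bytes]) -> bytes:
            # XOR parity over equal-length data blocks stored on separate nodes.
            parity = bytearray(len(blocks[0]))
            for block in blocks:
                for i, b in enumerate(block):
                    parity[i] ^= b
            return bytes(parity)

        def recover(blocks: List[Optional[bytes]], parity: bytes) -> List[bytes]:
            # Rebuild at most one lost block (marked None) from the survivors and the parity.
            lost = [i for i, blk in enumerate(blocks) if blk is None]
            if not lost:
                return [bytes(blk) for blk in blocks]
            if len(lost) > 1:
                raise ValueError("single-parity scheme tolerates only one lost block")
            rebuilt = bytearray(parity)
            for blk in blocks:
                if blk is not None:
                    for i, b in enumerate(blk):
                        rebuilt[i] ^= b
            restored = list(blocks)
            restored[lost[0]] = bytes(rebuilt)
            return restored

    Codes that tolerate more simultaneous failures (as a cluster with multiple node, link, and switch failures requires) replace the single XOR parity with several parity blocks, at higher computational cost.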

    Reliability of Erasure Coded Storage Systems: A Geometric Approach

    We consider the probability of data loss, or equivalently, the reliability function for an erasure coded distributed data storage system under worst case conditions. Data loss in an erasure coded system depends on probability distributions for the disk repair duration and the disk failure duration. In previous works, the data loss probability of such systems has been studied under the assumption of exponentially distributed disk failure and disk repair durations, using well-known analytic methods from the theory of Markov processes. These methods lead to an estimate of the integral of the reliability function. Here, we address the problem of directly calculating the data loss probability for general repair and failure duration distributions. A closed limiting form is developed for the probability of data loss, and it is shown that the probability of the event that a repair duration exceeds a failure duration is sufficient for characterizing the data loss probability. For the case of constant repair duration, we develop an expression for the conditional data loss probability given the number of failures experienced by each node in a given time window. We do so by developing a geometric approach that relies on the computation of volumes of a family of polytopes that are related to the code. An exact calculation is provided and an upper bound on the data loss probability is obtained by posing the problem as a set avoidance problem. Theoretical calculations are compared to simulation results.
    Comment: 28 pages. 8 figures. Presented in part at IEEE International Conference on BigData 2013, Santa Clara, CA, Oct. 2013 and to be presented in part at 2014 IEEE Information Theory Workshop, Tasmania, Australia, Nov. 2014. New analysis added May 2015. Further update Aug. 201
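
    The central quantity in the abstract, the probability that a repair duration exceeds a failure duration, is easy to estimate by Monte Carlo for the two repair models mentioned (exponential and constant). The sketch below is an illustrative assumption-laden aid, not the paper's method; the function name, rates, and sample count are made up for the example.

        import numpy as np

        def p_repair_exceeds_failure(mean_failure, mean_repair,
                                     constant_repair=False, n_samples=1_000_000, seed=0):
            rng = np.random.default_rng(seed)
            failures = rng.exponential(mean_failure, n_samples)      # time until the next failure
            if constant_repair:
                repairs = np.full(n_samples, mean_repair)            # deterministic repair duration
            else:
                repairs = rng.exponential(mean_repair, n_samples)    # exponentially distributed repair
            return np.mean(repairs > failures)

        # With mean time to failure 1000 h and mean repair time 10 h, the estimates
        # approach 10 / (1000 + 10) for exponential repairs and 1 - exp(-10/1000)
        # for constant repairs.
        print(p_repair_exceeds_failure(1000.0, 10.0))
        print(p_repair_exceeds_failure(1000.0, 10.0, constant_repair=True))
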