Optimizing Lossy Compression Rate-Distortion from Automatic Online Selection between SZ and ZFP
With ever-increasing volumes of scientific data produced by HPC applications,
significantly reducing data size is critical because of limited capacity of
storage space and potential bottlenecks on I/O or networks in writing/reading
or transferring data. SZ and ZFP are the two leading lossy compressors
available to compress scientific data sets. However, their performance is not
consistent across different data sets and across different fields of some data
sets: for some fields SZ provides better compression performance, while other
fields are better compressed with ZFP. This situation raises the need for an
automatic online (during compression) selection between SZ and ZFP, with a
minimal overhead. In this paper, the automatic selection optimizes the
rate-distortion, an important statistical quality metric based on the
signal-to-noise ratio. To optimize for rate-distortion, we investigate the
principles of SZ and ZFP. We then propose an efficient online, low-overhead
selection algorithm that accurately predicts the compression quality of the two
compressors in early processing stages and selects the best-fit compressor for
each data field. We implement the selection algorithm in an open-source
library, and we evaluate the effectiveness of our proposed solution against
plain SZ and ZFP in a parallel environment with 1,024 cores. Evaluation results
on three data sets representing about 100 fields show that our selection
algorithm improves the compression ratio by up to 70% at the same level of data
distortion because of very accurate selection (around 99%) of the best-fit
compressor, with little overhead (less than 7% in the experiments).
Comment: 14 pages, 9 figures, first revision
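As a rough illustration of the rate-distortion comparison described in this abstract, the sketch below computes a PSNR value and a bit rate and then picks whichever candidate reaches a target PSNR at the lowest bit rate. It is a minimal sketch under stated assumptions: the function names (`psnr`, `bit_rate`, `select_compressor`) and the `results` layout are hypothetical, no SZ or ZFP API is called, and the quality here is measured after the fact rather than predicted in early processing stages as the paper proposes.

```python
import math


def psnr(original, reconstructed):
    """Peak signal-to-noise ratio in dB, based on the value range of the original data."""
    if not original or len(original) != len(reconstructed):
        raise ValueError("inputs must be non-empty and of equal length")
    value_range = max(original) - min(original)
    if value_range == 0:
        raise ValueError("PSNR is undefined for a constant field")
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # lossless reconstruction
    return 20 * math.log10(value_range) - 10 * math.log10(mse)


def bit_rate(compressed_bytes, n_values):
    """Average number of bits used per data value after compression."""
    return 8 * compressed_bytes / n_values


def select_compressor(results, min_psnr):
    """Pick the candidate with the lowest bit rate among those meeting min_psnr.

    `results` maps a (hypothetical) compressor name to a (psnr_db, bits_per_value) pair.
    Returns None when no candidate reaches the target quality.
    """
    eligible = {name: rate for name, (quality, rate) in results.items()
                if quality >= min_psnr}
    if not eligible:
        return None
    return min(eligible, key=eligible.get)
```

For example, `select_compressor({"A": (60.0, 2.0), "B": (62.0, 3.0)}, min_psnr=55.0)` picks `"A"`, since both candidates meet the quality target and `"A"` needs fewer bits per value.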
Distributed Estimation of a Parametric Field Using Sparse Noisy Data
The problem of distributed estimation of a parametric physical field is
stated as a maximum likelihood estimation problem. Sensor observations are
distorted by additive white Gaussian noise. Prior to data transmission, each
sensor quantizes its observation to levels. The quantized data are then
communicated over parallel additive white Gaussian channels to a fusion center
for a joint estimation. An iterative expectation-maximization (EM) algorithm to
estimate the unknown parameter is formulated, and its linearized version is
adopted for numerical analysis. Numerical examples are provided for the
case of the field modeled as a Gaussian bell. The dependence of the integrated
mean-square error on the number of quantization levels, the number of sensors
in the network and the SNR in observation and transmission channels is
analyzed.
Comment: to appear at Milcom-201
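The observation model in this abstract (a Gaussian-bell field, additive Gaussian observation noise, then quantization at each sensor) can be sketched as follows. This is an illustrative sketch only: the abstract does not specify the quantizer, so a uniform quantizer is assumed, and every name and parameter (`gaussian_bell`, `quantize`, `observe`, `low`, `high`, `levels`) is hypothetical. The noisy transmission channel and the EM estimator are not modeled.

```python
import math
import random


def gaussian_bell(x, y, amplitude, cx, cy, width):
    """Field value at (x, y) for a bell centered at (cx, cy)."""
    dist_sq = (x - cx) ** 2 + (y - cy) ** 2
    return amplitude * math.exp(-dist_sq / (2 * width ** 2))


def quantize(value, low, high, levels):
    """Assumed uniform quantizer: clip to [low, high], return an index in 0..levels-1."""
    clipped = min(max(value, low), high)
    step = (high - low) / levels
    return min(int((clipped - low) / step), levels - 1)


def observe(x, y, noise_std, rng, amplitude=1.0, cx=0.0, cy=0.0, width=1.0,
            low=0.0, high=1.0, levels=4):
    """One sensor reading: field value plus Gaussian noise, then quantized."""
    noisy = gaussian_bell(x, y, amplitude, cx, cy, width) + rng.gauss(0.0, noise_std)
    return quantize(noisy, low, high, levels)


if __name__ == "__main__":
    rng = random.Random(0)
    # Quantized readings from a few sensors placed along the x-axis.
    print([observe(x, 0.0, noise_std=0.1, rng=rng) for x in (-2.0, -1.0, 0.0, 1.0, 2.0)])
```

Increasing `levels` reduces the quantization error at each sensor, which is one of the dependencies the abstract says is analyzed.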
Magnification Control in Self-Organizing Maps and Neural Gas
We consider different ways to control the magnification in self-organizing
maps (SOM) and neural gas (NG). Starting from early approaches of magnification
control in vector quantization, we then concentrate on different approaches for
SOM and NG. We show that three structurally similar approaches can be applied
to both algorithms: localized learning, concave-convex learning, and winner
relaxing learning. Thereby, the approach of concave-convex learning in SOM is
extended to a more general description, whereas the concave-convex learning for
NG is new. In general, the control mechanisms generate only slightly different
behavior when comparing the two neural algorithms. However, we emphasize that the NG
results are valid for any data dimension, whereas in the SOM case the results
hold only for the one-dimensional case.
Comment: 24 pages, 4 figures
Conditional hitting time estimation in a nonlinear filtering model by the Brownian bridge method
The model consists of a signal process which is a general Brownian
diffusion process and an observation process, also a diffusion process,
which is supposed to be correlated to the signal process. We suppose that the
process is observed from time 0 to at discrete times and aim to
estimate, conditionally on these observations, the probability that the
non-observed process crosses a fixed barrier after a given time. We
formulate this problem as a usual nonlinear filtering problem and use optimal
quantization and Monte Carlo simulation techniques to estimate the involved
quantities
- …