Optimizing Lossy Compression Rate-Distortion from Automatic Online Selection between SZ and ZFP
With ever-increasing volumes of scientific data produced by HPC applications,
significantly reducing data size is critical because of the limited capacity of
storage space and potential I/O or network bottlenecks when writing/reading
or transferring data. SZ and ZFP are the two leading lossy compressors
available to compress scientific data sets. However, their performance is not
consistent across different data sets and across different fields of some data
sets: for some fields SZ provides better compression performance, while other
fields are better compressed with ZFP. This situation raises the need for an
automatic online (during compression) selection between SZ and ZFP, with
minimal overhead. In this paper, the automatic selection optimizes the
rate-distortion, an important statistical quality metric based on the
signal-to-noise ratio. To optimize for rate-distortion, we investigate the
principles of SZ and ZFP. We then propose an efficient online, low-overhead
selection algorithm that accurately predicts the compression quality of the two
compressors in the early processing stages and selects the best-fit compressor for
each data field. We implement the selection algorithm into an open-source
library, and we evaluate the effectiveness of our proposed solution against
plain SZ and ZFP in a parallel environment with 1,024 cores. Evaluation results
on three data sets representing about 100 fields show that our selection
algorithm improves the compression ratio by up to 70% at the same level of data
distortion, thanks to highly accurate selection (around 99%) of the best-fit
compressor, with little overhead (less than 7% in the experiments).
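To make the per-field selection concrete, here is a minimal sketch of choosing a compressor from a rate-distortion proxy. The `estimators` callables are hypothetical stand-ins for the paper's early-stage quality predictors, and the PSNR-per-bit score is a simplified proxy for the paper's rate-distortion comparison, not its actual criterion.

```python
import numpy as np

def psnr(original, decompressed):
    """Peak signal-to-noise ratio in dB, the distortion metric used here."""
    mse = np.mean((original - decompressed) ** 2)
    if mse == 0:
        return float("inf")
    value_range = original.max() - original.min()
    return 20 * np.log10(value_range) - 10 * np.log10(mse)

def select_compressor(field, estimators):
    """Return the name of the compressor whose predicted quality is best.

    `estimators` maps a name (e.g. "sz", "zfp") to a cheap callable that
    returns (predicted_bit_rate, predicted_psnr) for this field.
    """
    best_name, best_score = None, -np.inf
    for name, estimate in estimators.items():
        bit_rate, predicted_psnr = estimate(field)
        score = predicted_psnr / bit_rate  # more quality per bit wins
        if score > best_score:
            best_name, best_score = name, score
    return best_name
```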
C-Coll: Introducing Error-bounded Lossy Compression into MPI Collectives
With the ever-increasing computing power of supercomputers and the growing
scale of scientific applications, the efficiency of MPI collective
communications has become a critical bottleneck in large-scale distributed
and parallel processing. Large message size in MPI collectives is a
particularly big concern because it can significantly degrade the overall
parallel performance. To address this issue, prior research simply applies
off-the-shelf fixed-rate lossy compressors in the MPI collectives, leading to
suboptimal performance, limited generalizability, and unbounded errors. In this
paper, we propose a novel solution, called C-Coll, which leverages
error-bounded lossy compression to significantly reduce the message size,
resulting in a substantial reduction in communication cost. The key
contributions are three-fold. (1) We develop two general, optimized
lossy-compression-based frameworks for both types of MPI collectives
(collective data movement as well as collective computation), based on their
particular characteristics. Our framework not only reduces communication cost
but also preserves data accuracy. (2) We customize an optimized version based
on SZx, an ultra-fast error-bounded lossy compressor, which can meet the
specific needs of collective communication. (3) We integrate C-Coll into
multiple collectives, such as MPI_Allreduce, MPI_Scatter, and MPI_Bcast, and
perform a comprehensive evaluation based on real-world scientific datasets.
Experiments show that our solution outperforms the original MPI collectives as
well as multiple baselines and related efforts by 3.5-9.7X.
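To illustrate the general shape of a compression-enabled collective, the sketch below wraps a broadcast so that only compressed bytes travel over the network. zlib (lossless) is used only to keep the example self-contained; C-Coll itself integrates SZx, an error-bounded lossy compressor, whose API is not reproduced here.

```python
import zlib
import numpy as np
from mpi4py import MPI  # requires an MPI installation

def compressed_bcast(array, comm, root=0):
    """Broadcast a numpy array as a compressed byte stream.

    Non-root ranks may pass None for `array`.
    """
    if comm.Get_rank() == root:
        payload = zlib.compress(array.tobytes())
        meta = (array.dtype.str, array.shape)
    else:
        payload, meta = None, None
    meta = comm.bcast(meta, root=root)        # tiny metadata message
    payload = comm.bcast(payload, root=root)  # compressed bulk message
    dtype, shape = meta
    return np.frombuffer(zlib.decompress(payload), dtype=dtype).reshape(shape)

if __name__ == "__main__":
    comm = MPI.COMM_WORLD
    data = np.linspace(0, 1, 1 << 20) if comm.Get_rank() == 0 else None
    result = compressed_bcast(data, comm)
    print(comm.Get_rank(), result[:3])
```

Run with, e.g., `mpiexec -n 4 python compressed_bcast.py`. Collective computation (e.g., MPI_Allreduce) also has to apply the reduction operator between decompression and recompression, which is why the paper treats the two collective types with separate frameworks.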
SRN-SZ: Deep Learning-Based Scientific Error-bounded Lossy Compression with Super-resolution Neural Networks
The fast growth in the computational power and scale of modern supercomputing
systems has raised great challenges for the management of exascale scientific
data. To maintain the usability of scientific data, error-bounded lossy
compression has been proposed and developed as an essential technique for
reducing the size of scientific data under constrained data distortion. Among the
diverse datasets generated by various scientific simulations, certain datasets
cannot be effectively compressed by existing error-bounded lossy compressors
with traditional techniques. The recent success of Artificial Intelligence has
inspired several researchers to integrate neural networks into error-bounded
lossy compressors. However, those works still suffer from limited compression
ratios and/or extremely low efficiencies. To address those issues and improve
the compression on the hard-to-compress datasets, in this paper, we propose
SRN-SZ, which is a deep learning-based scientific error-bounded lossy
compressor leveraging the hierarchical data grid expansion paradigm implemented
by super-resolution neural networks. SRN-SZ applies HAT, a state-of-the-art
super-resolution network, for its compression, which requires no time-consuming
per-dataset training. In experiments compared with various state-of-the-art
compressors, SRN-SZ achieves up to 75% compression ratio improvements under the
same error bound and up to 80% compression ratio improvements under the same
PSNR, compared with the second-best compressor.
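The grid-expansion paradigm can be sketched as: store a coarse grid, predict the full grid with a super-resolution model, and quantize the residuals under the error bound. Below, `upsample` is a stand-in for the HAT network, and the single-level scheme with linear-scaling quantization is a simplification of SRN-SZ's hierarchical design.

```python
import numpy as np

def sr_compress(data, factor, upsample, abs_err):
    """One expansion level: coarse grid + quantized prediction residuals.

    Assumes 2D `data` whose sides are divisible by `factor`, and that
    `upsample(coarse)` returns an array with the shape of `data`.
    """
    coarse = data[::factor, ::factor]
    residual = data - upsample(coarse)
    # Linear-scaling quantization: each reconstructed value is
    # guaranteed to lie within abs_err of the original.
    codes = np.round(residual / (2 * abs_err)).astype(np.int64)
    return coarse, codes

def sr_decompress(coarse, codes, upsample, abs_err):
    return upsample(coarse) + codes * (2 * abs_err)
```

In the real compressor the integer codes would then be entropy-coded (e.g., Huffman plus a lossless pass), which is where the compression ratio is actually realized.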
MDZ: An Efficient Error-Bounded Lossy Compressor for Molecular Dynamics
Molecular dynamics (MD) is widely used in today's scientific research across multiple domains, including materials science, biochemistry, biophysics, and structural biology. MD simulations can produce extremely large amounts of data because each simulation may involve a large number of atoms (up to trillions) over a large number of timesteps (up to hundreds of millions). In this paper, we perform an in-depth analysis of a number of MD simulation datasets and then develop an efficient error-bounded lossy compressor that can significantly improve the compression ratios. The contributions are fourfold. (1) We characterize a number of MD datasets and summarize two commonly used execution models. (2) We develop an adaptive error-bounded lossy compression framework (called MDZ), which optimizes the compression for both execution models adaptively by taking advantage of their specific characteristics. (3) We compare our solution with six other state-of-the-art related works using three MD simulation packages, each with multiple configurations. Experiments show that our solution achieves up to 233% higher compression ratios than the second-best lossy compressor in most cases. (4) We demonstrate that MDZ is fully capable of handling particle data beyond MD simulations.
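As a rough illustration of adapting to two execution models, the sketch below samples whether a temporal (previous-timestep) or spatial (neighbor-atom) predictor leaves smaller residuals and picks one per data window. The heuristic and the two predictors are assumptions for illustration, not MDZ's actual algorithm.

```python
import numpy as np

def pick_predictor(positions, abs_err):
    """Choose a predictor for one window of trajectory data.

    `positions` has shape (timesteps, atoms) for a single coordinate.
    A residual already within the error bound quantizes to code 0, so
    the fraction of such residuals is a cheap proxy for compressibility.
    """
    temporal = np.abs(np.diff(positions, axis=0))  # vs. previous timestep
    spatial = np.abs(np.diff(positions, axis=1))   # vs. neighboring atom
    t_score = np.mean(temporal < abs_err)
    s_score = np.mean(spatial < abs_err)
    return "temporal" if t_score >= s_score else "spatial"
```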
Toward Resilience and Data Reduction in Exascale Scientific Computing
Because of the ever-increasing execution scale, reliability and data management are becoming more and more important for scientific applications. On the one hand, exascale systems are anticipated to be more susceptible to soft errors (e.g., silent data corruptions) due to shrinking transistor sizes and the growing number of components. These errors lead to corrupted results without warning, making the output of the computation untrustworthy. On the other hand, large volumes of highly variable data are produced at high velocity by scientific computing on exascale systems or advanced instruments, and the I/O time for storing these data is prohibitive because of the I/O bottleneck in parallel file systems. In this work, we leverage algorithm-based fault tolerance (ABFT) and error-bounded lossy compression to tackle these two problems, in order to support efficient scientific computing on exascale systems.
We propose an efficient fault-tolerant scheme to tolerate soft errors in the Fast Fourier Transform (FFT), one of the most important computation kernels widely used in scientific computing. Traditional redundancy approaches at least double the execution time or resources, limiting their use in practice because of the large overhead. Previous works on offline ABFT algorithms for FFT mitigate this problem by providing resilient FFT with lower overhead, but these algorithms fail to make progress in vulnerable environments with high error rates because they can only detect and correct errors after the whole computation finishes. We propose an online ABFT scheme for large-scale FFT inspired by the divide-and-conquer nature of the FFT computation. We devise fault-tolerant schemes for both computational and memory errors in FFT, with both serial and parallel optimizations. Experimental results demonstrate that the proposed approach provides more timely error detection and recovery as well as better fault coverage with less overhead, compared to the offline ABFT algorithm.
To alleviate the I/O bottleneck in parallel file systems, we develop a prediction-based error-bounded lossy compressor that significantly reduces the size of scientific datasets while retaining the accuracy of the decompressed data, with adaptive prediction algorithms and compression models. We first propose a regression-based predictor with better prediction accuracy than traditional approaches under large error bounds, followed by an adaptive algorithm that dynamically selects between the traditional Lorenzo predictor and the proposed regression-based predictor, leading to very high compression ratios with little visual distortion. We further unify the prediction-based and transform-based models by using transform-based compressors as predictors, with novel optimizations toward efficient coefficient encoding for both models. The proposed adaptive multi-algorithm design provides better compression ratios at the same distortion, significantly reducing storage requirements and I/O time.
We further adapt the compression algorithms and compressors to different requirements and/or objectives in realistic scenarios. We leverage a logarithmic transform to precondition the data, which turns a relative-error-bound compression problem into an absolute-error-bound compression problem. This transform aligns the two different error requirements while improving compression quality, efficiently reducing the workload of compressor design. We also correlate the compression algorithm with system information to achieve better I/O performance than traditional single-compressor deployment. These studies further improve the efficiency of lossy compression from the perspective of efficient I/O in the context of scientific simulation, making scientific applications running on exascale systems more efficient.
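The logarithmic preconditioning admits a compact worked example: compressing ln(x) with absolute error bound ln(1 + ε) guarantees a pointwise relative error of at most ε after exponentiation, because the reconstructed value stays within a factor of 1 + ε of the original. A minimal sketch, assuming strictly positive data (zeros and signs need extra handling, as the thesis notes):

```python
import numpy as np

def precondition(data, rel_err):
    """Map a relative error bound to an absolute one via log transform."""
    abs_err = np.log1p(rel_err)   # ln(1 + rel_err)
    return np.log(data), abs_err  # compress the log values with abs_err

def postcondition(decompressed_log):
    """Invert the transform after decompression."""
    return np.exp(decompressed_log)

# Check: if |y' - y| <= ln(1+eps), then x'/x = exp(y'-y) lies in
# [1/(1+eps), 1+eps], so |x' - x| <= eps * x.
```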
MGARD+: Optimizing Multilevel Methods for Error-Bounded Scientific Data Reduction
Nowadays, data reduction is becoming increasingly important for dealing with the large amounts of scientific data. Existing multilevel compression algorithms offer a promising way to manage scientific data at scale but may suffer from relatively low performance and reduction quality. In this paper, we propose MGARD+, a multilevel data reduction and refactoring framework drawing on previous multilevel methods, to achieve high-performance data decomposition and high-quality error-bounded lossy compression. Our contributions are fourfold: 1) We propose to leverage a level-wise coefficient quantization method, which uses different error tolerances to quantize the multilevel coefficients. 2) We propose an adaptive decomposition method that treats the multilevel decomposition as a preconditioner and terminates the decomposition process at an appropriate level. 3) We leverage a set of algorithmic optimization strategies to significantly improve the performance of multilevel decomposition/recomposition. 4) We evaluate our proposed method using four real-world scientific datasets and compare it with several state-of-the-art lossy compressors. Experiments demonstrate that our optimizations improve the decomposition/recomposition performance of the existing multilevel method by up to 70x, and the proposed compression method can improve the compression ratio by up to 2x compared with other state-of-the-art error-bounded lossy compressors under the same level of data distortion.
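Contribution 1) can be sketched as below. The even split of the error budget across levels is an assumed, simplest allocation; MGARD+ derives its per-level tolerances from the error behavior of the multilevel decomposition itself.

```python
import numpy as np

def levelwise_quantize(coeffs_by_level, total_err):
    """Quantize each level's coefficients with its own tolerance.

    If the recomposition sums the level contributions, giving each of
    the L levels a tolerance of total_err / L keeps the overall
    reconstruction error within total_err (a conservative allocation).
    """
    per_level_err = total_err / len(coeffs_by_level)
    codes = [
        np.round(coeffs / (2 * per_level_err)).astype(np.int64)
        for coeffs in coeffs_by_level
    ]
    return codes, per_level_err
```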
Full-State Quantum Circuit Simulation by Using Data Compression
Quantum circuit simulations are critical for evaluating quantum algorithms
and machines. However, the number of state amplitudes required for full
simulation increases exponentially with the number of qubits. In this study, we
leverage data compression to reduce memory requirements, trading computation
time and fidelity for memory space. Specifically, we develop a hybrid solution
by combining the lossless compression and our tailored lossy compression method
with adaptive error bounds at each timestep of the simulation. Our approach
optimizes for compression speed and ensures that errors due to lossy
compression are uncorrelated, an important property for comparing simulation
output with physical machines. Experiments show that our approach reduces the
memory requirement of simulating the 61-qubit Grover's search algorithm from 32
exabytes to 768 terabytes of memory on Argonne's Theta supercomputer using
4,096 nodes. The results suggest that our techniques can increase the
simulation size by 2 to 16 qubits for general quantum circuits.
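As a sanity check on the reported figures (binary units assumed), the uncompressed footprint follows directly from the state-vector size:

```python
# A full 61-qubit state vector holds 2**61 complex amplitudes,
# each a pair of 8-byte doubles (16 bytes).
uncompressed = 2 ** 61 * 16       # = 2**65 bytes
print(uncompressed / 2 ** 60)     # 32.0 exabytes
compressed = 768 * 2 ** 40        # the reported 768 terabytes
print(uncompressed / compressed)  # ~43691x overall reduction
```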