206 research outputs found

    JP3D compression of solar data-cubes: photospheric imaging and spectropolarimetry

    Full text link
    Hyperspectral imaging is an ubiquitous technique in solar physics observations and the recent advances in solar instrumentation enabled us to acquire and record data at an unprecedented rate. The huge amount of data which will be archived in the upcoming solar observatories press us to compress the data in order to reduce the storage space and transfer times. The correlation present over all dimensions, spatial, temporal and spectral, of solar data-sets suggests the use of a 3D base wavelet decomposition, to achieve higher compression rates. In this work, we evaluate the performance of the recent JPEG2000 Part 10 standard, known as JP3D, for the lossless compression of several types of solar data-cubes. We explore the differences in: a) The compressibility of broad-band or narrow-band time-sequence; I or V stokes profiles in spectropolarimetric data-sets; b) Compressing data in [x,y,λ\lambda] packages at different times or data in [x,y,t] packages of different wavelength; c) Compressing a single large data-cube or several smaller data-cubes; d) Compressing data which is under-sampled or super-sampled with respect to the diffraction cut-off

    Data Compression in the Petascale Astronomy Era: a GERLUMPH case study

    Full text link
    As the volume of data grows, astronomers are increasingly faced with choices on what data to keep -- and what to throw away. Recent work evaluating the JPEG2000 (ISO/IEC 15444) standards as a future data format standard in astronomy has shown promising results on observational data. However, there is still a need to evaluate its potential on other type of astronomical data, such as from numerical simulations. GERLUMPH (the GPU-Enabled High Resolution cosmological MicroLensing parameter survey) represents an example of a data intensive project in theoretical astrophysics. In the next phase of processing, the ~27 terabyte GERLUMPH dataset is set to grow by a factor of 100 -- well beyond the current storage capabilities of the supercomputing facility on which it resides. In order to minimise bandwidth usage, file transfer time, and storage space, this work evaluates several data compression techniques. Specifically, we investigate off-the-shelf and custom lossless compression algorithms as well as the lossy JPEG2000 compression format. Results of lossless compression algorithms on GERLUMPH data products show small compression ratios (1.35:1 to 4.69:1 of input file size) varying with the nature of the input data. Our results suggest that JPEG2000 could be suitable for other numerical datasets stored as gridded data or volumetric data. When approaching lossy data compression, one should keep in mind the intended purposes of the data to be compressed, and evaluate the effect of the loss on future analysis. In our case study, lossy compression and a high compression ratio do not significantly compromise the intended use of the data for constraining quasar source profiles from cosmological microlensing.Comment: 15 pages, 9 figures, 5 tables. Published in the Special Issue of Astronomy & Computing on The future of astronomical data format

    Web-Based Visualization of Very Large Scientific Astronomy Imagery

    Full text link
    Visualizing and navigating through large astronomy images from a remote location with current astronomy display tools can be a frustrating experience in terms of speed and ergonomics, especially on mobile devices. In this paper, we present a high performance, versatile and robust client-server system for remote visualization and analysis of extremely large scientific images. Applications of this work include survey image quality control, interactive data query and exploration, citizen science, as well as public outreach. The proposed software is entirely open source and is designed to be generic and applicable to a variety of datasets. It provides access to floating point data at terabyte scales, with the ability to precisely adjust image settings in real-time. The proposed clients are light-weight, platform-independent web applications built on standard HTML5 web technologies and compatible with both touch and mouse-based devices. We put the system to the test and assess the performance of the system and show that a single server can comfortably handle more than a hundred simultaneous users accessing full precision 32 bit astronomy data.Comment: Published in Astronomy & Computing. IIPImage server available from http://iipimage.sourceforge.net . Visiomatic code and demos available from http://www.visiomatic.org

    Streaming Lossless Volumetric Compression of Medical Images Using Gated Recurrent Convolutional Neural Network

    Full text link
    Deep learning-based lossless compression methods offer substantial advantages in compressing medical volumetric images. Nevertheless, many learning-based algorithms encounter a trade-off between practicality and compression performance. This paper introduces a hardware-friendly streaming lossless volumetric compression framework, utilizing merely one-thousandth of the model weights compared to other learning-based compression frameworks. We propose a gated recurrent convolutional neural network that combines diverse convolutional structures and fusion gate mechanisms to capture the inter-slice dependencies in volumetric images. Based on such contextual information, we can predict the pixel-by-pixel distribution for entropy coding. Guided by hardware/software co-design principles, we implement the proposed framework on Field Programmable Gate Array to achieve enhanced real-time performance. Extensive experimental results indicate that our method outperforms traditional lossless volumetric compressors and state-of-the-art learning-based lossless compression methods across various medical image benchmarks. Additionally, our method exhibits robust generalization ability and competitive compression speedComment: 18 pages, 8 figure

    3D Medical Image Lossless Compressor Using Deep Learning Approaches

    Get PDF
    The ever-increasing importance of accelerated information processing, communica-tion, and storing are major requirements within the big-data era revolution. With the extensive rise in data availability, handy information acquisition, and growing data rate, a critical challenge emerges in efficient handling. Even with advanced technical hardware developments and multiple Graphics Processing Units (GPUs) availability, this demand is still highly promoted to utilise these technologies effectively. Health-care systems are one of the domains yielding explosive data growth. Especially when considering their modern scanners abilities, which annually produce higher-resolution and more densely sampled medical images, with increasing requirements for massive storage capacity. The bottleneck in data transmission and storage would essentially be handled with an effective compression method. Since medical information is critical and imposes an influential role in diagnosis accuracy, it is strongly encouraged to guarantee exact reconstruction with no loss in quality, which is the main objective of any lossless compression algorithm. Given the revolutionary impact of Deep Learning (DL) methods in solving many tasks while achieving the state of the art results, includ-ing data compression, this opens tremendous opportunities for contributions. While considerable efforts have been made to address lossy performance using learning-based approaches, less attention was paid to address lossless compression. This PhD thesis investigates and proposes novel learning-based approaches for compressing 3D medical images losslessly.Firstly, we formulate the lossless compression task as a supervised sequential prediction problem, whereby a model learns a projection function to predict a target voxel given sequence of samples from its spatially surrounding voxels. Using such 3D local sampling information efficiently exploits spatial similarities and redundancies in a volumetric medical context by utilising such a prediction paradigm. The proposed NN-based data predictor is trained to minimise the differences with the original data values while the residual errors are encoded using arithmetic coding to allow lossless reconstruction.Following this, we explore the effectiveness of Recurrent Neural Networks (RNNs) as a 3D predictor for learning the mapping function from the spatial medical domain (16 bit-depths). We analyse Long Short-Term Memory (LSTM) models’ generalisabil-ity and robustness in capturing the 3D spatial dependencies of a voxel’s neighbourhood while utilising samples taken from various scanning settings. We evaluate our proposed MedZip models in compressing unseen Computerized Tomography (CT) and Magnetic Resonance Imaging (MRI) modalities losslessly, compared to other state-of-the-art lossless compression standards.This work investigates input configurations and sampling schemes for a many-to-one sequence prediction model, specifically for compressing 3D medical images (16 bit-depths) losslessly. The main objective is to determine the optimal practice for enabling the proposed LSTM model to achieve a high compression ratio and fast encoding-decoding performance. A solution for a non-deterministic environments problem was also proposed, allowing models to run in parallel form without much compression performance drop. Compared to well-known lossless codecs, experimental evaluations were carried out on datasets acquired by different hospitals, representing different body segments, and have distinct scanning modalities (i.e. CT and MRI).To conclude, we present a novel data-driven sampling scheme utilising weighted gradient scores for training LSTM prediction-based models. The objective is to determine whether some training samples are significantly more informative than others, specifically in medical domains where samples are available on a scale of billions. The effectiveness of models trained on the presented importance sampling scheme was evaluated compared to alternative strategies such as uniform, Gaussian, and sliced-based sampling
    corecore