Multiplicative Multiresolution Decomposition for Lossless Volumetric Medical Images Compression
With the emergence of medical imaging, the compression of volumetric medical images has become essential. For this purpose, we propose a novel Multiplicative Multiresolution Decomposition (MMD) wavelet coding scheme for lossless compression of volumetric medical images. The MMD is normally used for speckle reduction, but it offers some properties that can be exploited in compression. Like the wavelet transform, the MMD provides a hierarchical representation and makes lossless compression possible. We integrate into the proposed scheme an inter-slice filter based on the wavelet transform and motion compensation to reduce data energy efficiently. The scheme incorporates the MMD into lossless compression by applying an MMD/wavelet or MMD transform to each slice; an inter-slice filter is then applied, and the resulting sub-bands are coded by the 3D zero-tree algorithm SPIHT. We compare the lossless results of classical wavelet coders such as 3D SPIHT and JP3D with those of the proposed scheme. Lossless experimental results show that the proposed scheme with the MMD achieves lower bit rates than 3D SPIHT and JP3D.
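The abstract does not say which wavelet the inter-slice filter uses, so the following is only a sketch of the general idea: a reversible integer Haar step (the S-transform) applied to a pair of adjacent slices, which shows why a wavelet-style inter-slice filter can remain lossless. The function names `forward_pair` and `inverse_pair` are hypothetical, slices are flattened to lists of integers, and motion compensation and the MMD itself are left out.

```python
def forward_pair(a, b):
    """Reversible integer Haar (S-transform) across two adjacent slices.

    Returns (s, d): s is the floored average of the two slices and d their
    difference. Illustrative only; not the filter used in the paper.
    """
    d = [x - y for x, y in zip(a, b)]
    # y + floor(d / 2) equals floor((x + y) / 2); >> floors for negative ints too
    s = [y + (dv >> 1) for y, dv in zip(b, d)]
    return s, d


def inverse_pair(s, d):
    """Recover both slices exactly from the average and difference bands."""
    b = [sv - (dv >> 1) for sv, dv in zip(s, d)]
    a = [dv + bv for dv, bv in zip(d, b)]
    return a, b


slice_a = [10, 200, 35, 0]
slice_b = [12, 198, 35, 255]
low, high = forward_pair(slice_a, slice_b)
restored = inverse_pair(low, high)
print(restored == (slice_a, slice_b))
```

When neighbouring slices are similar, the difference band holds mostly small values, which is what makes the subsequent entropy or zero-tree coding cheaper.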
Visual analysis of research paper collections using normalized relative compression
The analysis of research paper collections is an interesting topic that can give insights into whether a research area is stalled on the same problems or produces a great amount of novelty every year. Previous research has addressed similar tasks by analyzing keywords or reference lists, with different degrees of human intervention. In this paper, we demonstrate how, using Normalized Relative Compression together with a set of automated data-processing tasks, we can successfully compare research articles and document collections visually. We also achieve very similar results with Normalized Conditional Compression, which can be applied with a regular compressor. With our approach, we can group papers from different disciplines, analyze how a conference evolves across its editions, or track how the profile of a researcher changes over time. We provide a set of tests that validate our technique and show that it performs better on these tasks than other previously proposed techniques. Peer Reviewed. Postprint (published version)
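As a rough illustration of how a conditional compression measure can be computed with a regular compressor, the sketch below uses `zlib` from the Python standard library. It rests on a loud assumption: the cost of describing x given y is estimated as C(y + x) - C(y) and then normalized by C(x). This is a common approximation for off-the-shelf compressors and is not necessarily the definition used in the paper; the name `conditional_ratio` and the sample texts are hypothetical.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of `data` after zlib compression at the highest level."""
    return len(zlib.compress(data, 9))


def conditional_ratio(x: bytes, y: bytes) -> float:
    """Cost of describing x once y is known, relative to compressing x alone.

    Assumption: C(x|y) is estimated as C(y + x) - C(y); this approximation
    is not taken from the paper.
    """
    return (compressed_size(y + x) - compressed_size(y)) / compressed_size(x)


logs = b"Logs record runtime information and help engineers detect anomalies in large distributed systems."
imaging = b"Wavelet coders compress volumetric medical images slice by slice to reach low bit rates."

print(conditional_ratio(logs, logs))     # low: the reference already contains the text
print(conditional_ratio(logs, imaging))  # higher: the reference shares little with the text
```

Lower values suggest that the reference document explains more of the target, so pairwise values over a collection could feed a clustering or visualization step.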
Loghub: A Large Collection of System Log Datasets towards Automated Log Analytics
Logs have been widely adopted in software system development and maintenance
because of the rich system runtime information they contain. In recent years,
the increase in software size and complexity has led to rapid growth in the
volume of logs. To handle these large volumes of logs efficiently and
effectively, a line of research focuses on intelligent log analytics powered by
AI (artificial intelligence) techniques. However, only a small fraction of
these techniques have reached successful deployment in industry because of the
lack of public log datasets and of the necessary benchmarking on them. To fill this
significant gap between academia and industry and also facilitate more research
on AI-powered log analytics, we have collected and organized loghub, a large
collection of log datasets. In particular, loghub provides 17 real-world log
datasets collected from a wide range of systems, including distributed systems,
supercomputers, operating systems, mobile systems, server applications, and
standalone software. In this paper, we summarize the statistics of these
datasets, introduce some practical log usage scenarios, and present a case
study on anomaly detection to demonstrate how loghub facilitates the research
and practice in this field. At the time of writing, loghub
datasets have been downloaded over 15,000 times by more than 380 organizations
from both industry and academia. Comment: Dataset available at https://zenodo.org/record/322717