
    A Reference-Free Lossless Compression Algorithm for DNA Sequences Using a Competitive Prediction of Two Classes of Weighted Models

    The development of efficient data compressors for DNA sequences is crucial not only for reducing storage and transmission bandwidth, but also for analysis purposes. In particular, improved compression models directly influence the outcome of anthropological and biomedical compression-based methods. In this paper, we describe a new lossless compressor with improved compression capabilities for DNA sequences representing different domains and kingdoms. The reference-free method uses a competitive prediction model to estimate, for each symbol, the best class of models to use before applying arithmetic encoding. There are two classes of models: weighted context models (including substitutional tolerant context models) and weighted stochastic repeat models. Both classes use dedicated sub-programs to handle inverted repeats efficiently. The results show that the proposed method attains a higher compression ratio than state-of-the-art approaches on a balanced and diverse benchmark, using a competitive level of computational resources. An efficient implementation of the method is publicly available under the GPLv3 license. Peer reviewed.
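    As a rough illustration of one ingredient of such compressors, the sketch below implements a single adaptive order-k context model over the DNA alphabet with Laplace smoothing and accumulates the ideal arithmetic-coding cost (-log2 of each predicted probability). It is a minimal stand-in, not the paper's implementation: the actual method runs many such models, plus repeat models, and selects among them competitively per symbol.

    ```python
    # Minimal sketch (not the paper's implementation): one adaptive order-k
    # context model over {A,C,G,T} with Laplace smoothing. An arithmetic coder
    # spends about -log2 p(symbol) bits per symbol; we accumulate that cost.
    import math
    from collections import defaultdict

    ALPHABET = "ACGT"

    def context_model_cost(seq, k=2, alpha=1.0):
        counts = defaultdict(lambda: defaultdict(int))  # context -> symbol counts
        bits = 0.0
        for i, s in enumerate(seq):
            ctx = seq[max(0, i - k):i]
            total = sum(counts[ctx].values())
            p = (counts[ctx][s] + alpha) / (total + alpha * len(ALPHABET))
            bits += -math.log2(p)      # ideal arithmetic-coding cost for s
            counts[ctx][s] += 1        # adaptive update after "coding" s
        return bits

    seq = "ACGTACGTACGGACGTACGT"
    print(f"{context_model_cost(seq):.2f} bits for {len(seq)} symbols")
    ```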

    Distributed Joint Source-Channel Coding in Wireless Sensor Networks

    Because sensors in wireless sensor networks are energy-limited and must cope with harsh wireless channel conditions, there is a pressing need for a low-complexity coding method that combines a high compression ratio with resistance to channel noise. This paper reviews the progress made in distributed joint source-channel coding, which can address this issue. The main existing schemes, from theory to practice, for distributed joint source-channel coding over independent channels, multiple-access channels, and broadcast channels are introduced in turn. We also present a practical scheme for compressing multiple correlated sources over independent channels. The simulation results demonstrate its efficiency.
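    The core idea behind distributed coding of correlated sources can be made concrete with a toy syndrome-coding example (an illustration only, not the paper's scheme): the encoder for X transmits just a short syndrome, and the decoder recovers X from its correlated side information Y plus that syndrome.

    ```python
    # Toy Slepian-Wolf-style sketch (illustrative only): encode a 12-bit source
    # x with a 4-bit syndrome H @ x (mod 2); the decoder holds y = x xor e with
    # e sparse, and recovers x by coset (minimum-weight) decoding.
    import itertools
    import numpy as np

    n, m = 12, 4
    # Hamming-style parity-check matrix: column j is the binary expansion of
    # j+1, so every single-bit error pattern has a distinct nonzero syndrome.
    H = np.array([[(j >> b) & 1 for j in range(1, n + 1)] for b in range(m)])

    rng = np.random.default_rng(0)
    x = rng.integers(0, 2, size=n)              # source at the encoder
    e = np.zeros(n, dtype=int); e[5] = 1        # sparse source/side-info noise
    y = (x + e) % 2                             # side information at the decoder

    syndrome = (H @ x) % 2                      # all that is transmitted: m bits

    def decode(y, syndrome, max_weight=1):
        target = (syndrome + H @ y) % 2         # equals H @ e (mod 2)
        for w in range(max_weight + 1):         # lowest-weight pattern first
            for pos in itertools.combinations(range(n), w):
                cand = np.zeros(n, dtype=int); cand[list(pos)] = 1
                if np.array_equal((H @ cand) % 2, target):
                    return (y + cand) % 2       # x_hat = y xor e_hat
        return None

    print("recovered x:", np.array_equal(decode(y, syndrome), x))
    ```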

    Multiplicative Multiresolution Decomposition for Lossless Volumetric Medical Images Compression

    With the growth of medical imaging, the compression of volumetric medical images is essential. For this purpose, we propose a novel Multiplicative Multiresolution Decomposition (MMD) wavelet coding scheme for lossless compression of volumetric medical images. The MMD is used in speckle-reduction techniques but offers some properties that can be exploited in compression. Like the wavelet transform, the MMD provides a hierarchical representation and makes lossless compression possible. We integrate into the proposed scheme an inter-slice filter based on the wavelet transform and motion compensation to reduce the data energy efficiently. We compare the lossless results of classical wavelet coders, such as 3D SPIHT and JP3D, with the proposed scheme. The scheme incorporates the MMD into a lossless compression pipeline by applying the MMD/wavelet or MMD transform to each slice after the inter-slice filter is employed; the resulting sub-bands are coded with the 3D zero-tree algorithm SPIHT. Lossless experimental results show that the proposed scheme with the MMD achieves lower bit rates than 3D SPIHT and JP3D.
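    What makes wavelet-style hierarchies usable for lossless coding is integer lifting, which is exactly invertible. The sketch below is an assumed stand-in for illustration (an integer Haar lifting step, not the MMD itself), applied row by row to a toy volume with a check of perfect reconstruction.

    ```python
    # Integer Haar (S-transform) lifting: predict odd samples from even ones,
    # then update the even samples. All steps are integer and exactly invertible.
    import numpy as np

    def haar_forward(row):                  # row length must be even
        even, odd = row[0::2], row[1::2]
        detail = odd - even                 # prediction step
        approx = even + (detail >> 1)       # update step (integer floor)
        return approx, detail

    def haar_inverse(approx, detail):
        even = approx - (detail >> 1)       # undo update
        odd = detail + even                 # undo prediction
        out = np.empty(even.size * 2, dtype=even.dtype)
        out[0::2], out[1::2] = even, odd
        return out

    rng = np.random.default_rng(1)
    volume = rng.integers(0, 4096, size=(2, 4, 8), dtype=np.int64)  # toy volume
    for sl in volume:                       # per-slice, per-row transform
        for row in sl:
            approx, detail = haar_forward(row)
            assert np.array_equal(haar_inverse(approx, detail), row)
    print("integer lifting reconstructs every slice exactly")
    ```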

    Fixed Block Compression Boosting in FM-Indexes : Theory and Practice

    The FM-index (Ferragina and Manzini in J ACM 52(4):552-581, 2005) is a widely used compressed data structure that stores a string T in compressed form while also supporting fast pattern-matching queries. In this paper, we describe new FM-index variants that combine good theoretical properties, simple implementation, and improved practical performance. Our main theoretical result is a new technique called fixed block compression boosting, which is a simpler and faster alternative to the optimal compression boosting and implicit compression boosting used in previous FM-indexes. We also describe several new techniques for implementing fixed-block boosting efficiently, including a new, fast, and space-efficient implementation of wavelet trees. Our extensive experiments show the new indexes to be consistently fast and small relative to the state-of-the-art, and thus they make a good off-the-shelf choice for many applications. Peer reviewed.
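    The rank queries that wavelet trees provide are the workhorse of FM-index pattern matching. Below is a minimal pointer-based wavelet tree (nothing like the paper's space-efficient implementation) answering rank(c, i), the number of occurrences of c in T[:i]; an FM-index issues exactly such queries over the Burrows-Wheeler transform of the text.

    ```python
    # Pointer-based wavelet tree: each node partitions its alphabet in half and
    # stores one bit per character; rank queries descend one level per step.
    class WaveletTree:
        def __init__(self, text, alphabet=None):
            self.alphabet = sorted(set(text)) if alphabet is None else alphabet
            if len(self.alphabet) <= 1:         # leaf: a single symbol
                self.bits = None
                return
            mid = len(self.alphabet) // 2
            left_set = set(self.alphabet[:mid])
            self.bits = [0 if c in left_set else 1 for c in text]
            self.left = WaveletTree([c for c in text if c in left_set],
                                    self.alphabet[:mid])
            self.right = WaveletTree([c for c in text if c not in left_set],
                                     self.alphabet[mid:])

        def rank(self, c, i):
            """Occurrences of c in text[:i]."""
            if self.bits is None:               # leaf
                return i if self.alphabet and self.alphabet[0] == c else 0
            mid = len(self.alphabet) // 2
            if c in self.alphabet[:mid]:
                return self.left.rank(c, self.bits[:i].count(0))
            return self.right.rank(c, self.bits[:i].count(1))

    wt = WaveletTree("abracadabra")
    print(wt.rank("a", 8))   # -> 4 ("abracada" contains four a's)
    ```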

    Optimal LZ-End Parsing Is Hard

    LZ-End is a variant of the well-known Lempel-Ziv parsing family in which each phrase of the parsing has a previous occurrence, with the additional constraint that the previous occurrence must end at the end of an earlier phrase. LZ-End was initially proposed as a greedy parsing, where each phrase is determined greedily from left to right as the longest factor that satisfies the above constraint [Kreft & Navarro, 2010]. In this work, we consider an optimal LZ-End parsing, one with the minimum number of phrases among such parsings. We show that a decision version of computing the optimal LZ-End parsing is NP-complete via a reduction from the vertex cover problem. Moreover, we give a MAX-SAT formulation for the optimal LZ-End parsing, adapting an approach for computing various NP-hard repetitiveness measures recently presented by [Bannai et al., 2022]. We also consider the approximation ratio of the size of the greedy LZ-End parsing to the size of the optimal LZ-End parsing, and give a lower bound on the ratio that asymptotically approaches 2.
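    To make the definition concrete, here is a naive sketch of the greedy LZ-End parsing (the polynomial-time variant; the hardness result above concerns minimizing the phrase count). Each phrase is the longest copyable prefix whose source ends exactly at a previous phrase boundary, followed by one explicit character.

    ```python
    # Naive greedy LZ-End parser, O(n^3)-ish; only meant to illustrate the
    # constraint that every copy source must end at a previous phrase boundary.
    def lz_end_greedy(text):
        n = len(text)
        phrase_ends = []                      # positions just past each phrase
        phrases, i = [], 0
        while i < n:
            best = 0
            for end in phrase_ends:           # candidate source endpoints
                max_l = min(end, n - 1 - i)   # keep one char for the literal
                for length in range(max_l, best, -1):
                    if text[end - length:end] == text[i:i + length]:
                        best = length
                        break
            phrases.append(text[i:i + best + 1])   # copy part + explicit char
            i += best + 1
            phrase_ends.append(i)
        return phrases

    print(lz_end_greedy("abababab"))   # -> ['a', 'b', 'aba', 'bab']
    ```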

    Prediction and evaluation of zero order entropy changes in grammar-based codes

    The change of zero-order entropy is studied over different strategies of grammar production-rule selection. Two major kinds of rules are distinguished: transformations that leave the message size intact and substitution functions that change the message size. Relations for the zero-order entropy change are derived for both cases, and the conditions under which the entropy decreases are described. In this article, several greedy strategies that reduce the zero-order entropy as well as the message size are summarized, and a new strategy, MinEnt, is proposed. The resulting evolution of the zero-order entropy is compared with the strategy of selecting the most frequent digram used in the Re-Pair algorithm.
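    The following sketch shows the bookkeeping involved: one Re-Pair-style step replaces the most frequent digram with a fresh symbol, and we track how the zero-order entropy H0 and the total cost |message| * H0 move. (A MinEnt-style strategy would instead pick the rule that best reduces the entropy; this example only implements the most-frequent-digram baseline.)

    ```python
    # One Re-Pair-style grammar step and its effect on zero-order entropy.
    import math
    from collections import Counter

    def h0(msg):                              # zero-order entropy, bits/symbol
        counts, n = Counter(msg), len(msg)
        return -sum(c / n * math.log2(c / n) for c in counts.values())

    def repair_step(msg, fresh):
        digrams = Counter(zip(msg, msg[1:]))
        (a, b), _ = digrams.most_common(1)[0]     # most frequent digram
        out, i = [], 0
        while i < len(msg):                       # left-to-right replacement
            if i + 1 < len(msg) and (msg[i], msg[i + 1]) == (a, b):
                out.append(fresh); i += 2
            else:
                out.append(msg[i]); i += 1
        return out, (a, b)

    msg = list("abracadabraabracadabra")
    print(f"before: n={len(msg)}, H0={h0(msg):.3f}, total={len(msg) * h0(msg):.1f} bits")
    msg2, rule = repair_step(msg, "R")
    print(f"rule: R -> {''.join(rule)}")
    print(f"after:  n={len(msg2)}, H0={h0(msg2):.3f}, total={len(msg2) * h0(msg2):.1f} bits")
    ```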

    On the Information Rates of the Plenoptic Function

    The plenoptic function (Adelson and Bergen, 1991) describes the visual information available to an observer at any point in space and time. Samples of the plenoptic function (POF) are seen in video and in visual content in general, and represent large amounts of information. In this paper we propose a stochastic model to study the compression limits of the plenoptic function. In the proposed framework, we isolate the two fundamental sources of information in the POF: one representing the camera motion and the other representing the information complexity of the "reality" being acquired and transmitted. The sources of information are combined, generating a stochastic process that we study in detail. We first propose a model for ensembles of realities that do not change over time. The proposed model is simple in that it enables us to derive precise coding bounds in the information-theoretic sense that are sharp in a number of cases of practical interest. For this simple case of static realities and camera motion, our results indicate that coding practice is in accordance with optimal coding from an information-theoretic standpoint. The model is then extended to account for visual realities that change over time. We derive bounds on the lossless and lossy information rates for this dynamic reality model, stating conditions under which the bounds are tight. Examples with synthetic sources suggest that in the presence of scene dynamics, simple hybrid coding using motion/displacement estimation with DPCM performs considerably suboptimally relative to the true rate-distortion bound. Comment: submitted to IEEE Transactions on Information Theory.
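    As a minimal illustration of the predictive loop that such hybrid coders build on (an illustration only, unrelated to the paper's bounds), the sketch below applies first-order DPCM to a synthetic correlated signal: predicting each sample from its predecessor concentrates the residuals near zero, lowering the zero-order entropy an entropy coder would pay, while remaining exactly invertible.

    ```python
    # First-order DPCM on a synthetic random-walk signal: transmit the
    # differences instead of the samples; the decoder integrates them back.
    import math
    from collections import Counter

    import numpy as np

    def entropy_bits(values):                 # empirical zero-order entropy
        counts, n = Counter(values.tolist()), len(values)
        return -sum(c / n * math.log2(c / n) for c in counts.values())

    rng = np.random.default_rng(0)
    signal = np.cumsum(rng.integers(-2, 3, size=10_000))   # correlated source
    residuals = np.diff(signal, prepend=0)                 # DPCM: x[t] - x[t-1]

    assert np.array_equal(np.cumsum(residuals), signal)    # decoder is lossless
    print(f"raw:      {entropy_bits(signal):.2f} bits/sample")
    print(f"residual: {entropy_bits(residuals):.2f} bits/sample")
    ```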