
Rate-distortion Balanced Data Compression for Wireless Sensor Networks

Author: Alsheikh Mohammad Abu
Lin Shaowei
Niyato Dusit
Tan Hwee-Pink
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/04/2016

This paper presents a data compression algorithm with error bound guarantee for wireless sensor networks (WSNs) using compressing neural networks. The proposed algorithm minimizes data congestion and reduces energy consumption by exploring spatio-temporal correlations among data samples. The adaptive rate-distortion feature balances the compressed data size (data rate) with the required error bound guarantee (distortion level). This compression relieves the strain on energy and bandwidth resources while collecting WSN data within tolerable error margins, thereby increasing the scale of WSNs. The algorithm is evaluated using real-world datasets and compared with conventional methods for temporal and spatial data compression. The experimental validation reveals that the proposed algorithm outperforms several existing WSN data compression methods in terms of compression efficiency and signal reconstruction. Moreover, an energy analysis shows that compressing the data can reduce the energy expenditure, and hence extend the service lifespan severalfold. Comment: arXiv admin note: text overlap with arXiv:1408.294

arXiv.org e-Print Archive

Institutional Knowledge at Singapore Management University

University of Canberra Research Repository

Space-Efficient Re-Pair Compression

Author: Bille Philip
Gørtz Inge Li
Prezza Nicola
Publication date: 04/11/2016

Re-Pair is an effective grammar-based compression scheme achieving strong compression rates in practice. Let $n$, $\sigma$, and $d$ be the text length, alphabet size, and dictionary size of the final grammar, respectively. In their original paper, the authors show how to compute the Re-Pair grammar in expected linear time and $5n + 4\sigma^2 + 4d + \sqrt{n}$ words of working space on top of the text. In this work, we propose two algorithms improving on the space of their original solution. Our model assumes a memory word of $\lceil\log_2 n\rceil$ bits and a re-writable input text composed of $n$ such words. Our first algorithm runs in expected $\mathcal{O}(n/\epsilon)$ time and uses $(1+\epsilon)n + \sqrt{n}$ words of space on top of the text for any parameter $0 < \epsilon \leq 1$ chosen in advance. Our second algorithm runs in expected $\mathcal{O}(n\log n)$ time and improves the space to $n + \sqrt{n}$ words.
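
For orientation, the basic Re-Pair idea (repeatedly replace the most frequent adjacent pair with a fresh nonterminal) can be sketched as below. This is a naive, quadratic-time illustration only, not the space-efficient algorithms described in the abstract; the function names `repair` and `expand` and the `("R", k)` nonterminal encoding are assumptions made for this example.

```python
from collections import Counter


def repair(text: str):
    """Naive Re-Pair: replace the most frequent adjacent pair until none repeats."""
    seq = list(text)
    rules = {}                       # nonterminal -> (left symbol, right symbol)
    next_id = 0
    while True:
        # Overlapping occurrences (e.g. in "aaa") are counted naively here.
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break
        name = ("R", next_id)        # tuples cannot collide with text characters
        next_id += 1
        rules[name] = pair
        out, i = [], 0
        while i < len(seq):          # left-to-right, non-overlapping replacement
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(name)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules


def expand(seq, rules) -> str:
    """Rebuild the original text from the final sequence and the rules."""
    out, stack = [], list(reversed(seq))
    while stack:
        sym = stack.pop()
        if sym in rules:
            left, right = rules[sym]
            stack.append(right)      # pushed first so that left is expanded first
            stack.append(left)
        else:
            out.append(sym)
    return "".join(out)
```

Calling `repair("abracadabra abracadabra")` returns a shorter symbol sequence plus a rule dictionary, and `expand` applied to that pair restores the input.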

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari

Archivio della ricerca- LUISS Libera Università Internazionale degli Studi Sociali Guido Carli di Roma

Online Research Database In Technology

Regular Expression Search on Compressed Text

Author: Ganty Pierre
Valero Pedro
Publication date: 01/01/2019

We present an algorithm for searching regular expression matches in compressed text. The algorithm reports the number of matching lines in the uncompressed text in time linear in the size of its compressed version. We define efficient data structures that yield nearly optimal complexity bounds and provide a sequential implementation, zearch, that requires up to 25% less time than the state of the art. Comment: 10 pages, published in Data Compression Conference (DCC'19

arXiv.org e-Print Archive

Crossref

Archivo Digital UPM

A Universal Scheme for Wyner–Ziv Coding of Discrete Sources

Author: Jalali Shirin
Verdú Sergio
Weissman Tsachy
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2010

We consider the Wyner–Ziv (WZ) problem of lossy compression where the decompressor observes a noisy version of the source, whose statistics are unknown. A new family of WZ coding algorithms is proposed and their universal optimality is proven. Compression consists of sliding-window processing followed by Lempel–Ziv (LZ) compression, while the decompressor is based on a modification of the discrete universal denoiser (DUDE) algorithm to take advantage of side information. The new algorithms not only universally attain the fundamental limits, but also suggest a paradigm for practical WZ coding. The effectiveness of our approach is illustrated with experiments on binary images and English text using a low-complexity algorithm motivated by our class of universally optimal WZ codes.

CiteSeerX

Caltech Authors

Implementasi Algoritma Elias Gamma Kompresi Pada File Teks (Implementation of the Elias Gamma Compression Algorithm on Text Files)

Author: Cahayati Dina
Khair Husnul
Pardede Akim M.H.
Publication venue: 'UIN Sumatera Utara Medan'
Publication date: 30/04/2022

Large data sizes waste memory and slow down data transfer. Compression aims to make the data as small as possible. This study uses the Elias Gamma algorithm, a lossless compression method, and measures its performance on text files by Ratio of Compression (RC), Compression Ratio (CR), Redundancy (Rd), compression time (seconds), and decompression time (seconds). A text file is compressed by reading its string and encoding it with Elias Gamma. The result is a file with the *.eg extension that contains character information and a compressed bit string that can be decompressed. The performance of the Elias Gamma algorithm is influenced by the number of character variations. For the strings compressed with Elias Gamma, the average compression ratio is 2.192%. Keywords: Decompression, Elias Gamma, Text Files, Compression
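
As a brief illustration of the coding step, the Elias gamma code of a positive integer is its binary form preceded by one zero per bit after the leading one. The sketch below applies it to text by gamma-coding each character's frequency rank. That ranking scheme is an assumption made for this example (the abstract does not specify how characters are mapped to integers), and the function names are hypothetical.

```python
from collections import Counter


def elias_gamma_encode(n: int) -> str:
    """Return the Elias gamma code of a positive integer as a bit string."""
    if n < 1:
        raise ValueError("Elias gamma codes are defined for integers >= 1")
    binary = bin(n)[2:]                      # binary form, always starts with '1'
    return "0" * (len(binary) - 1) + binary  # unary length prefix, then the value


def elias_gamma_decode(bits: str) -> list[int]:
    """Decode a concatenation of Elias gamma codes back into integers."""
    values, i = [], 0
    while i < len(bits):
        zeros = 0
        while bits[i] == "0":                # count the zeros of the length prefix
            zeros += 1
            i += 1
        values.append(int(bits[i:i + zeros + 1], 2))
        i += zeros + 1
    return values


def compress_text(text: str) -> tuple[list[str], str]:
    """Assumed scheme: rank characters by frequency, then gamma-code the ranks."""
    ranking = [ch for ch, _ in Counter(text).most_common()]
    rank = {ch: k + 1 for k, ch in enumerate(ranking)}   # ranks start at 1
    return ranking, "".join(elias_gamma_encode(rank[ch]) for ch in text)


def decompress_text(ranking: list[str], bits: str) -> str:
    """Invert compress_text using the stored character ranking."""
    return "".join(ranking[r - 1] for r in elias_gamma_decode(bits))
```

Because frequent characters receive small ranks and small integers have short gamma codes, the output length depends on how many distinct characters the text contains, which matches the abstract's remark about character variations.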

E-Journal Universitas Islam Negeri Sumatera Utara

Lempel-Ziv Parsing in External Memory

Author: Kempa Dominik
Kärkkäinen Juha
Puglisi Simon J.
Publication date: 04/07/2013

For decades, computing the LZ factorization (or LZ77 parsing) of a string has been a requisite and computationally intensive step in many diverse applications, including text indexing and data compression. Many algorithms for LZ77 parsing have been discovered over the years; however, despite the increasing need to apply LZ77 to massive data sets, no algorithm to date scales to inputs that exceed the size of internal memory. In this paper we describe the first algorithm for computing the LZ77 parsing in external memory. Our algorithm is fast in practice and will allow the next generation of text indexes to be realised for massive strings and string collections. Comment: 10 page
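
To make the object being computed concrete, the LZ77 parsing splits a string greedily into phrases, each either a new character or the longest prefix of the remaining text that also starts at an earlier position. The sketch below is a quadratic-time, in-memory illustration only and has nothing to do with the external-memory algorithm of the paper. The function names and the phrase representation (a literal character or a `(position, length)` pair) are assumptions for this example.

```python
def lz77_factorize(s: str) -> list:
    """Greedy LZ77 parsing; a phrase may overlap the text it copies from."""
    factors, i, n = [], 0, len(s)
    while i < n:
        best_len, best_pos = 0, -1
        for j in range(i):                   # every earlier starting position
            k = 0
            while i + k < n and s[j + k] == s[i + k]:
                k += 1
            if k > best_len:
                best_len, best_pos = k, j
        if best_len == 0:
            factors.append(s[i])             # first occurrence of this character
            i += 1
        else:
            factors.append((best_pos, best_len))
            i += best_len
    return factors


def lz77_decode(factors: list) -> str:
    """Rebuild the string; copying one character at a time handles overlaps."""
    out = []
    for f in factors:
        if isinstance(f, str):
            out.append(f)
        else:
            pos, length = f
            for k in range(length):
                out.append(out[pos + k])
    return "".join(out)
```

For the input "abababa" this yields two literals followed by a single self-overlapping copy of length five, and `lz77_decode` restores the original string from those three phrases.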

arXiv.org e-Print Archive

Crossref

Screen Content Image Segmentation Using Sparse-Smooth Decomposition

Author: Abdolrashidi Amirali
Minaee Shervin
Wang Yao
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 21/11/2015

Sparse decomposition has been extensively used for different applications including signal compression, denoising, and document analysis. In this paper, sparse decomposition is used for image segmentation. The proposed algorithm separates the background and foreground using a sparse-smooth decomposition technique such that the smooth and sparse components correspond to the background and foreground respectively. This algorithm is tested on several test images from HEVC test sequences and is shown to have superior performance over other methods, such as the hierarchical k-means clustering in DjVu. This segmentation algorithm can also be used for text extraction, video compression and medical image segmentation. Comment: Asilomar Conference on Signals, Systems and Computers, IEEE, 2015, (to Appear

arXiv.org e-Print Archive

Crossref

A Universal Parallel Two-Pass MDL Context Tree Compression Algorithm

Author: Baron Dror
Krishnan Nikhil
Publication date: 21/03/2015

Computing problems that handle large amounts of data necessitate the use of lossless data compression for efficient storage and transmission. We present a novel lossless universal data compression algorithm that uses parallel computational units to increase the throughput. The length-$N$ input sequence is partitioned into $B$ blocks. Processing each block independently of the other blocks can accelerate the computation by a factor of $B$, but degrades the compression quality. Instead, our approach is to first estimate the minimum description length (MDL) context tree source underlying the entire input, and then encode each of the $B$ blocks in parallel based on the MDL source. With this two-pass approach, the compression loss incurred by using more parallel units is insignificant. Our algorithm is work-efficient, i.e., its computational complexity is $O(N/B)$. Its redundancy is approximately $B\log(N/B)$ bits above Rissanen's lower bound on universal compression performance, with respect to any context tree source whose maximal depth is at most $\log(N/B)$. We improve the compression by using different quantizers for states of the context tree based on the number of symbols corresponding to those states. Numerical results from a prototype implementation suggest that our algorithm offers a better trade-off between compression and throughput than competing universal data compression algorithms. Comment: Accepted to Journal of Selected Topics in Signal Processing special issue on Signal Processing for Big Data (expected publication date June 2015). 10 pages double column, 6 figures, and 2 tables. arXiv admin note: substantial text overlap with arXiv:1405.6322. Version: Mar 2015: Corrected a typ

arXiv.org e-Print Archive