88,783 research outputs found

    Rate-distortion Balanced Data Compression for Wireless Sensor Networks

    Get PDF
    This paper presents a data compression algorithm with error bound guarantee for wireless sensor networks (WSNs) using compressing neural networks. The proposed algorithm minimizes data congestion and reduces energy consumption by exploring spatio-temporal correlations among data samples. The adaptive rate-distortion feature balances the compressed data size (data rate) with the required error bound guarantee (distortion level). This compression relieves the strain on energy and bandwidth resources while collecting WSN data within tolerable error margins, thereby increasing the scale of WSNs. The algorithm is evaluated using real-world datasets and compared with conventional methods for temporal and spatial data compression. The experimental validation reveals that the proposed algorithm outperforms several existing WSN data compression methods in terms of compression efficiency and signal reconstruction. Moreover, an energy analysis shows that compressing the data can reduce the energy expenditure, and hence expand the service lifespan by several folds.Comment: arXiv admin note: text overlap with arXiv:1408.294

    Space-Efficient Re-Pair Compression

    Get PDF
    Re-Pair is an effective grammar-based compression scheme achieving strong compression rates in practice. Let nn, σ\sigma, and dd be the text length, alphabet size, and dictionary size of the final grammar, respectively. In their original paper, the authors show how to compute the Re-Pair grammar in expected linear time and 5n+4σ2+4d+n5n + 4\sigma^2 + 4d + \sqrt{n} words of working space on top of the text. In this work, we propose two algorithms improving on the space of their original solution. Our model assumes a memory word of ⌈log⁥2n⌉\lceil\log_2 n\rceil bits and a re-writable input text composed by nn such words. Our first algorithm runs in expected O(n/Ï”)\mathcal O(n/\epsilon) time and uses (1+Ï”)n+n(1+\epsilon)n +\sqrt n words of space on top of the text for any parameter 0<ϔ≀10<\epsilon \leq 1 chosen in advance. Our second algorithm runs in expected O(nlog⁥n)\mathcal O(n\log n) time and improves the space to n+nn +\sqrt n words

    Regular Expression Search on Compressed Text

    Full text link
    We present an algorithm for searching regular expression matches in compressed text. The algorithm reports the number of matching lines in the uncompressed text in time linear in the size of its compressed version. We define efficient data structures that yield nearly optimal complexity bounds and provide a sequential implementation --zearch-- that requires up to 25% less time than the state of the art.Comment: 10 pages, published in Data Compression Conference (DCC'19

    A Universal Scheme for Wyner–Ziv Coding of Discrete Sources

    Get PDF
    We consider the Wyner–Ziv (WZ) problem of lossy compression where the decompressor observes a noisy version of the source, whose statistics are unknown. A new family of WZ coding algorithms is proposed and their universal optimality is proven. Compression consists of sliding-window processing followed by Lempel–Ziv (LZ) compression, while the decompressor is based on a modification of the discrete universal denoiser (DUDE) algorithm to take advantage of side information. The new algorithms not only universally attain the fundamental limits, but also suggest a paradigm for practical WZ coding. The effectiveness of our approach is illustrated with experiments on binary images, and English text using a low complexity algorithm motivated by our class of universally optimal WZ codes

    Implementasi Algoritma Elias Gamma Kompresi Pada File Teks

    Get PDF
    Large data sizes result in wasted memory and slow data transfer processes. Compression aims to reduce the size of the data to be as small as possible. Elias Gamma algorithm is a type of lossless compression used in this study, whose performance will be measured by Ratio of Compression (RC), Compression Ratio (CR), Redundancy (Rd), compression time ( seconds) and decompression time (seconds) on the text file. Text file compression is done by reading the string in the text file and encoding the string using Elias Gamma, then performing the compression process. The final result of the compression is a file with *.eg extension which contains character information and a compressed bit string that can be decompressed. Elias Gamma's algorithm is influenced by the number of character variations. In the compression process on Elias Gamma's strings the average compression ratio is 2.192%. Keywords: Decompression, Elias Gamma, Text Files, Compression

    Lempel-Ziv Parsing in External Memory

    Full text link
    For decades, computing the LZ factorization (or LZ77 parsing) of a string has been a requisite and computationally intensive step in many diverse applications, including text indexing and data compression. Many algorithms for LZ77 parsing have been discovered over the years; however, despite the increasing need to apply LZ77 to massive data sets, no algorithm to date scales to inputs that exceed the size of internal memory. In this paper we describe the first algorithm for computing the LZ77 parsing in external memory. Our algorithm is fast in practice and will allow the next generation of text indexes to be realised for massive strings and string collections.Comment: 10 page

    Screen Content Image Segmentation Using Sparse-Smooth Decomposition

    Full text link
    Sparse decomposition has been extensively used for different applications including signal compression and denoising and document analysis. In this paper, sparse decomposition is used for image segmentation. The proposed algorithm separates the background and foreground using a sparse-smooth decomposition technique such that the smooth and sparse components correspond to the background and foreground respectively. This algorithm is tested on several test images from HEVC test sequences and is shown to have superior performance over other methods, such as the hierarchical k-means clustering in DjVu. This segmentation algorithm can also be used for text extraction, video compression and medical image segmentation.Comment: Asilomar Conference on Signals, Systems and Computers, IEEE, 2015, (to Appear

    A Universal Parallel Two-Pass MDL Context Tree Compression Algorithm

    Full text link
    Computing problems that handle large amounts of data necessitate the use of lossless data compression for efficient storage and transmission. We present a novel lossless universal data compression algorithm that uses parallel computational units to increase the throughput. The length-NN input sequence is partitioned into BB blocks. Processing each block independently of the other blocks can accelerate the computation by a factor of BB, but degrades the compression quality. Instead, our approach is to first estimate the minimum description length (MDL) context tree source underlying the entire input, and then encode each of the BB blocks in parallel based on the MDL source. With this two-pass approach, the compression loss incurred by using more parallel units is insignificant. Our algorithm is work-efficient, i.e., its computational complexity is O(N/B)O(N/B). Its redundancy is approximately Blog⁥(N/B)B\log(N/B) bits above Rissanen's lower bound on universal compression performance, with respect to any context tree source whose maximal depth is at most log⁥(N/B)\log(N/B). We improve the compression by using different quantizers for states of the context tree based on the number of symbols corresponding to those states. Numerical results from a prototype implementation suggest that our algorithm offers a better trade-off between compression and throughput than competing universal data compression algorithms.Comment: Accepted to Journal of Selected Topics in Signal Processing special issue on Signal Processing for Big Data (expected publication date June 2015). 10 pages double column, 6 figures, and 2 tables. arXiv admin note: substantial text overlap with arXiv:1405.6322. Version: Mar 2015: Corrected a typ
    • 

    corecore