Towards optimal symbolization for time series comparisons
The abundance and value of mining large time series data sets has long been acknowledged. Ubiquitous in fields ranging from astronomy and biology to web science, these datasets continue to grow in size and number, a situation exacerbated by the exponential growth of our digital footprints. The prevalence and potential utility of this data has led to a vast number of time-series data mining techniques, many of which require symbolization of the raw time series as a pre-processing step, for which a number of well-used, pre-existing approaches from the literature are typically employed. In this work we note that these standard approaches are sub-optimal in (at least) the broad application area of time series comparison, leading to unnecessary data corruption and potential performance loss before any real data mining takes place. Addressing this, we present a novel quantizer based upon optimization of comparison fidelity, together with a computationally tractable algorithm for its implementation on big datasets. We demonstrate empirically that our new approach provides a statistically significant reduction in the amount of error introduced by the symbolization process compared to the current state of the art. The approach therefore provides a more accurate input for the vast number of data mining techniques in the literature, offering the potential of increased real-world performance across a wide range of existing data mining algorithms and applications.
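To illustrate the kind of symbolization step this abstract refers to, here is a minimal sketch of a conventional segment-and-quantize scheme (piecewise aggregation followed by breakpoint quantization, in the style of SAX); it is an illustrative baseline, not the paper's optimized quantizer, and the breakpoints are supplied by the caller:

```python
import numpy as np

def symbolize(series, n_segments, breakpoints):
    # Piecewise-aggregate the series into n_segments segment means,
    # then map each mean to a discrete symbol via the given quantizer
    # breakpoints (a hypothetical simple scheme for illustration).
    segments = np.array_split(np.asarray(series, dtype=float), n_segments)
    means = np.array([s.mean() for s in segments])
    return np.digitize(means, breakpoints)

# Example: three segments, two breakpoints -> a 3-symbol alphabet.
symbols = symbolize([0, 0, 1, 1, 4, 4], n_segments=3, breakpoints=[0.5, 2.0])
```

Any error introduced here propagates into every downstream comparison, which is the loss the paper's fidelity-optimized quantizer aims to reduce.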
Quantization using permutation codes with a uniform source
Permutation coding is a block coding/quantization scheme where the codebook is comprised entirely of permutations of a single starting vector. Permutation codes for the uniform source are developed using a simple algorithm. The performance of these codes is compared against scalar codes and permutation codes developed by different methodologies. It is shown that the algorithm produces codes as good as other, more complex methods. Theoretical predictions of code design parameters and code performance are verified by numerical simulations.
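The encoding step shared by all permutation codes can be sketched in a few lines: under squared error, the nearest codeword to an input x is the permutation of the starting vector whose components are ordered the same way as x, so encoding reduces to sorting. A minimal sketch (the starting vector mu is an arbitrary example, not a code from the paper):

```python
import numpy as np

def permutation_encode(x, mu):
    # Nearest codeword under MSE: permute the sorted starting vector mu
    # so its components follow the same ordering as the input x.
    ranks = np.argsort(np.argsort(x))  # rank of each component of x
    return np.sort(mu)[ranks]

# Example: the largest component of x receives the largest entry of mu.
codeword = permutation_encode(np.array([0.9, 0.1, 0.5]),
                              np.array([-1.0, 0.0, 1.0]))
```

The transmitted index is just the ordering, which is why the encoding cost is dominated by a sort rather than a full codebook search.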
Concentric Permutation Source Codes
Permutation codes are a class of structured vector quantizers with a computationally simple encoding procedure based on sorting the scalar components. Using a codebook comprising several permutation codes as subcodes preserves the simplicity of encoding while increasing the number of rate-distortion operating points, improving the convex hull of operating points, and increasing design complexity. We show that when the subcodes are designed with the same composition, optimization of the codebook reduces to a lower-dimensional vector quantizer design within a single cone. Heuristics for reducing design complexity are presented, including an optimization of the rate allocation in a shape-gain vector quantizer with a gain-dependent wrapped spherical shape codebook.
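The multi-subcode idea can be sketched as follows: encode the input against each permutation subcode separately (each encode is just a sort, as above) and keep the subcode whose codeword has the smallest squared error. The starting vectors below are illustrative placeholders, not designed codes:

```python
import numpy as np

def permutation_encode(x, mu):
    # Nearest codeword in a single permutation code: order the sorted
    # starting vector mu the same way as x.
    ranks = np.argsort(np.argsort(x))
    return np.sort(mu)[ranks]

def concentric_encode(x, subcode_mus):
    # Try every permutation subcode and keep the minimum-MSE codeword;
    # encoding stays sort-based, with one pass per subcode.
    return min((permutation_encode(x, mu) for mu in subcode_mus),
               key=lambda c: float(np.sum((x - c) ** 2)))

x = np.array([0.9, 0.1, 0.5])
best = concentric_encode(x, [np.array([-1.0, 0.0, 1.0]),
                             np.array([0.1, 0.5, 0.9])])
```

Each subcode contributes its own rate-distortion operating point, which is how the union improves the convex hull relative to a single permutation code.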
Frame Permutation Quantization
Frame permutation quantization (FPQ) is a new vector quantization technique using finite frames. In FPQ, a vector is encoded by using a permutation source code to quantize its frame expansion; the encoding is thus a partial ordering of the frame expansion coefficients. Compared to ordinary permutation source coding, FPQ produces a greater number of possible quantization rates and a higher maximum rate. Various representations for the partitions induced by FPQ are presented, and reconstruction algorithms based on linear programming, quadratic programming, and recursive orthogonal projection are derived. Implementations of the linear and quadratic programming algorithms for uniform and Gaussian sources show performance improvements over entropy-constrained scalar quantization for certain combinations of vector dimension and coding rate. Monte Carlo evaluation of the recursive algorithm shows that mean-squared error (MSE) decays as 1/M^4 for an M-element frame, which is consistent with previous results on optimal decay of MSE. Reconstruction using the canonical dual frame is also studied, and several results relate properties of the analysis frame to whether linear reconstruction techniques provide consistent reconstructions.
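The FPQ encoder can be sketched in a few lines: expand the N-dimensional input with an M x N analysis frame (M > N), then apply a permutation source code to the M coefficients, so the transmitted index is their ordering. The frame F and starting vector mu below are toy examples, and the decoder shown is only the simple linear (canonical dual frame) reconstruction, not the LP/QP algorithms from the paper:

```python
import numpy as np

def fpq_encode(x, F, mu):
    # Frame-expand x with analysis frame F (M x N, M > N), then encode
    # the coefficient vector with a permutation source code: the index
    # is the ranks (ordering) of the frame coefficients.
    y = F @ x
    ranks = np.argsort(np.argsort(y))
    return ranks, np.sort(mu)[ranks]

def fpq_linear_decode(coeff_estimate, F):
    # Simple linear reconstruction via the canonical dual frame,
    # i.e. the pseudoinverse of the analysis frame.
    return np.linalg.pinv(F) @ coeff_estimate

F = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # toy 3x2 frame
ranks, q = fpq_encode(np.array([1.0, 2.0]), F, np.array([-1.0, 0.0, 1.0]))
```

The redundancy M/N of the frame is what yields extra rate points beyond ordinary permutation coding, and the choice of reconstruction algorithm governs whether the decoded point is consistent with the transmitted ordering.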
Neural Distributed Compressor Discovers Binning
We consider lossy compression of an information source when the decoder has lossless access to a correlated one. This setup, also known as the Wyner-Ziv problem, is a special case of distributed source coding. To this day, practical approaches for the Wyner-Ziv problem have been neither fully developed nor heavily investigated. We propose a data-driven method based on machine learning that leverages the universal function approximation capability of artificial neural networks. We find that our neural-network-based compression scheme, based on variational vector quantization, recovers some principles of the optimal theoretical solution of the Wyner-Ziv setup for exemplary sources, such as binning in the source space as well as optimal combination of the quantization index and side information. These behaviors emerge although no structure exploiting knowledge of the source distributions was imposed. Binning is a widely used tool in information-theoretic proofs and methods, and to our knowledge this is the first time it has been explicitly observed to emerge from data-driven learning.
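To make the binning idea concrete, here is a minimal hand-crafted scalar sketch (not the paper's learned scheme): the encoder sends only its quantizer index modulo a small number of bins, and the decoder resolves the resulting ambiguity using its side information, which is assumed close to the source:

```python
def binned_encode(x, step, n_bins):
    # Scalar-quantize x, then transmit only the index modulo n_bins --
    # the "binning" that the learned compressor rediscovers.
    q = int(round(x / step))
    return q % n_bins

def binned_decode(bin_idx, side_info, step, n_bins):
    # The decoder picks, among indices in the received bin, the one
    # whose reconstruction lies closest to the side information.
    k0 = int(round(side_info / step))
    candidates = [k for k in range(k0 - n_bins, k0 + n_bins + 1)
                  if k % n_bins == bin_idx]
    k = min(candidates, key=lambda k: abs(k * step - side_info))
    return k * step

# Example: x = 1.0 is recovered from its bin plus side info 0.9.
b = binned_encode(1.0, step=0.25, n_bins=4)
x_hat = binned_decode(b, side_info=0.9, step=0.25, n_bins=4)
```

Transmitting log2(n_bins) bits instead of the full quantizer index is the rate saving that binning buys when the side information is strong enough to disambiguate the bin.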