Generalized residual vector quantization for large scale data

Authors: Liu Shicong; Lu Hongtao; Shao Junru
Publication date: 17/09/2016

Vector quantization is an essential tool for tasks involving large scale data, for example, large scale similarity search, which is crucial for content-based information retrieval and analysis. In this paper, we propose a novel vector quantization framework that iteratively minimizes quantization error. First, we provide a detailed review of a relevant vector quantization method named residual vector quantization (RVQ). Next, we propose generalized residual vector quantization (GRVQ) to further improve over RVQ. Many vector quantization methods can be viewed as special cases of our proposed framework. We evaluate GRVQ on several large scale benchmark datasets for large scale search, classification and object retrieval, and compare it with existing methods in detail. Extensive experiments demonstrate that our GRVQ framework substantially outperforms existing methods in terms of quantization accuracy and computation efficiency. Comment: published at International Conference on Multimedia and Expo 201
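To make the residual idea in this abstract concrete, here is a minimal sketch of plain RVQ encoding and decoding in pure Python. It is not the paper's GRVQ method: the codebooks are hand-written toy values rather than learned ones, and the names `rvq_encode`, `rvq_decode` and `toy_codebooks` are hypothetical, introduced only for illustration.

```python
# Plain residual vector quantization (RVQ), sketched with toy codebooks.
# Assumption: codebooks are given; a real system would learn them (e.g. with k-means).

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def rvq_encode(vector, codebooks):
    """Return one codeword index per stage plus the final residual."""
    residual = list(vector)
    codes = []
    for codebook in codebooks:
        # Pick the codeword closest to the current residual.
        idx = min(range(len(codebook)), key=lambda i: sq_dist(residual, codebook[i]))
        codes.append(idx)
        # Subtract it; the next stage quantizes what is left over.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return codes, residual

def rvq_decode(codes, codebooks):
    """Reconstruct by summing the selected codeword of every stage."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(codes, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out

toy_codebooks = [
    [[0.0, 0.0], [4.0, 4.0], [-4.0, 4.0]],  # coarse stage
    [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]],   # fine stage
]
codes, residual = rvq_encode([4.9, 4.2], toy_codebooks)
approx = rvq_decode(codes, toy_codebooks)
```

Each stage only has to describe the error left by the previous stages, which is why a few small codebooks can approximate a vector that would otherwise need one very large codebook.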

arXiv.org e-Print Archive

Crossref

Recovery from Linear Measurements with Complexity-Matching Universal Signal Estimation

Authors: Baron Dror; Duarte Marco F.; Zhu Junan
Publication date: 21/12/2014

We study the compressed sensing (CS) signal estimation problem where an input signal is measured via a linear matrix multiplication under additive noise. While this setup usually assumes sparsity or compressibility in the input signal during recovery, the signal structure that can be leveraged is often not known a priori. In this paper, we consider universal CS recovery, where the statistics of a stationary ergodic signal source are estimated simultaneously with the signal itself. Inspired by Kolmogorov complexity and minimum description length, we focus on a maximum a posteriori (MAP) estimation framework that leverages universal priors to match the complexity of the source. Our framework can also be applied to general linear inverse problems where more measurements than in CS might be needed. We provide theoretical results that support the algorithmic feasibility of universal MAP estimation using a Markov chain Monte Carlo implementation, which is computationally challenging. We incorporate some techniques to accelerate the algorithm while providing comparable and in many cases better reconstruction quality than existing algorithms. Experimental results show the promise of universality in CS, particularly for low-complexity sources that do not exhibit standard sparsity or compressibility. Comment: 29 pages, 8 figures

arXiv.org e-Print Archive

ScholarWorks@UMass Amherst

Adaptively Lossy Image Compression for Onboard Processing

Author: Goodwill Justin
Publication date: 29/07/2020

More efficient image-compression codecs are an emerging requirement for spacecraft because increasingly complex, onboard image sensors can rapidly saturate downlink bandwidth of communication transceivers. While these codecs reduce transmitted data volume, many are compute-intensive and require rapid processing to sustain sensor data rates. Emerging next-generation small satellite (SmallSat) computers provide compelling computational capability to enable more onboard processing and compression than previously considered. For this research, we apply two compression algorithms for deployment on modern flight hardware: (1) end-to-end, neural-network-based, image compression (CNN-JPEG); and (2) adaptive image compression through feature-point detection (FPD-JPEG). These algorithms rely on intelligent data-processing pipelines that adapt to sensor data to compress it more effectively, ensuring efficient use of limited downlink bandwidths. The first algorithm, CNN-JPEG, employs a hybrid approach adapted from literature combining convolutional neural networks (CNNs) and JPEG; however, we modify and tune the training scheme for satellite imagery to account for observed training instabilities. This hybrid CNN-JPEG approach shows 23.5% better average peak signal-to-noise ratio (PSNR) and 33.5% better average structural similarity index (SSIM) versus standard JPEG on a dataset collected on the Space Test Program – Houston 5 (STP-H5-CSP) mission onboard the International Space Station (ISS). For our second algorithm, we developed a novel adaptive image-compression pipeline based upon JPEG that leverages the Oriented FAST and Rotated BRIEF (ORB) feature-point detection algorithm to adaptively tune the compression ratio to allow for a tradeoff between PSNR/SSIM and combined file size over a batch of STP-H5-CSP images. We achieve a less than 1% drop in average PSNR and SSIM while reducing the combined file size by 29.6% compared to JPEG using a static quality factor (QF) of 90
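The abstract above reports its gains in peak signal-to-noise ratio (PSNR). As a reminder of what that metric measures, the following is a small sketch of the standard PSNR formula for 8-bit pixel data in pure Python. The function name `psnr` and the flat-list image representation are assumptions made for illustration and are not taken from the thesis.

```python
import math

def psnr(original, compressed, max_value=255.0):
    """PSNR in dB between two equal-length sequences of pixel values."""
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(original, compressed)) / len(original)
    if mse == 0:
        return math.inf  # identical images: no noise at all
    return 10.0 * math.log10(max_value ** 2 / mse)

# Hypothetical two-pixel "images" that differ by one grey level per pixel.
example_db = psnr([100, 100], [101, 99])
```

Higher values mean the decoded image is closer to the original, so a codec that keeps PSNR nearly constant while shrinking the file is spending its bits more effectively.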

D-Scholarship@Pitt

Gossip Algorithms for Distributed Signal Processing

Authors: Dimakis Alexandros G.; Kar Soummya; Moura Jose M. F.; Rabbat Michael G.; Scaglione Anna
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2010

Gossip algorithms are attractive for in-network processing in sensor networks because they do not require any specialized routing, there is no bottleneck or single point of failure, and they are robust to unreliable wireless network conditions. Recently, there has been a surge of activity in the computer science, control, signal processing, and information theory communities, developing faster and more robust gossip algorithms and deriving theoretical performance guarantees. This article presents an overview of recent work in the area. We describe convergence rate results, which are related to the number of transmitted messages and thus the amount of energy consumed in the network for gossiping. We discuss issues related to gossiping over wireless links, including the effects of quantization and noise, and we illustrate the use of gossip algorithms for canonical signal processing tasks including distributed estimation, source localization, and compression. Comment: Submitted to Proceedings of the IEEE, 29 pages
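As an illustration of the kind of algorithm this overview covers, here is a minimal sketch of randomized pairwise gossip averaging in pure Python. It assumes an ideal setting (no quantization, no link noise, one edge active per round); the four-node ring, the sensor readings and the function name `randomized_gossip` are hypothetical.

```python
import random

def randomized_gossip(values, edges, rounds, seed=0):
    """Repeatedly pick a random edge and let its two endpoints average their values."""
    rng = random.Random(seed)
    x = list(values)
    for _ in range(rounds):
        i, j = rng.choice(edges)
        avg = (x[i] + x[j]) / 2.0
        # Pairwise averaging preserves the network-wide sum.
        x[i] = x[j] = avg
    return x

ring_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]  # hypothetical 4-node ring
readings = [10.0, 20.0, 30.0, 40.0]            # hypothetical sensor readings
estimates = randomized_gossip(readings, ring_edges, rounds=200)
```

Every node ends up close to the global mean of 25 without any routing or central coordinator. Each round costs one message exchange, which is why convergence rate translates directly into energy consumed.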

arXiv.org e-Print Archive

CiteSeerX

Crossref

Dynamic Reduction of Scientific Data Through Spatiotemporal Properties

Author: Hickman Fulp Megan Louise
Publication venue: Clemson University Libraries
Publication date: 01/12/2021

Improvements in High-Performance Computing (HPC) have enabled researchers to develop more sophisticated simulations and applications that solve previously intractable problems. While these applications are critical to scientific innovation, they continue to generate ever larger quantities of data, which only worsens the existing I/O bottleneck. To resolve this issue, researchers use various forms of data reduction, including data compression, time-step selection, and data sampling. While each of these is effective, data compression algorithms and data sampling methods do not leverage the temporal aspect of the data, and time-step selection is prone to missing critical abrupt changes. With this in mind, in this thesis we develop a spatiotemporal data sampling method that leverages both the spatial and temporal properties of simulation data. Specifically, our method compares corresponding regions of the current time-step with those of the previous time-step to determine whether data from the previous time-step is similar enough to reuse. Additionally, this method biases the sampling process toward rarer data values to ensure regions of interest are kept with higher fidelity. By operating in this manner, our method improves sample budget utilization and, as a result, post-reconstruction data quality. As the effectiveness of our method relies heavily on user input parameters, we also provide a set of pre-processing steps to alleviate the burden on the user of setting appropriate ones. Specifically, these pre-processing steps assist users in determining an optimal value for the number of bins, error threshold, and the number of regions. Finally, we demonstrate the modularity of our sampling process by showing how it works with different internal core sampling algorithms.
Upon evaluating our spatiotemporal sampling algorithm, we find it is capable of achieving higher post-reconstruction quality than Biswas et al.’s non-reuse importance-based sampling method. Specifically, we find our method achieves a 31.3% higher post-reconstruction quality while only introducing a 37% degradation in throughput, on average. When assessing our pre-processing steps, we find they are efficient at assisting users in determining an optimal value for the number of bins, error threshold, and the number of regions. Finally, we illustrate the modularity of our sampling method by showing how one would swap the core sampling algorithm. From our evaluation, we find our spatiotemporal sampling method is an effective choice for sampling simulation data
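The region-reuse decision described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the similarity measure (mean absolute difference), the one-dimensional equal-size regions, and the names `plan_regions` and `mean_abs_diff` are hypothetical and are not taken from the thesis, which may compare regions differently.

```python
def mean_abs_diff(a, b):
    """Average absolute difference between two equal-length value lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def plan_regions(prev_step, curr_step, num_regions, error_threshold):
    """Label each region 'reuse' (previous data is close enough) or 'resample'."""
    size = len(curr_step) // num_regions
    plan = []
    for r in range(num_regions):
        lo, hi = r * size, (r + 1) * size
        diff = mean_abs_diff(prev_step[lo:hi], curr_step[lo:hi])
        # Assumed rule: reuse when the region changed by no more than the threshold.
        plan.append("reuse" if diff <= error_threshold else "resample")
    return plan

# Hypothetical 1-D field at two consecutive time-steps, split into two regions.
prev_step = [1.0, 1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0]
curr_step = [1.0, 1.1, 1.0, 0.9, 5.0, 9.0, 5.0, 5.0]
plan = plan_regions(prev_step, curr_step, num_regions=2, error_threshold=0.1)
```

Regions marked for reuse cost no new samples, so the sample budget can be spent on the regions that actually changed.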

Clemson University: TigerPrints

Advances in Compression using Probabilistic Models

Author: Havasi Marton
Publication venue: University of Cambridge
Publication date: 16/12/2021

The increasing demand for data transmission and storage necessitates the use of efficient compression methods. Compression algorithms work by mapping data to a more compact representation from which the original data can be recovered. To operate efficiently, they need to capture the characteristics of the data distribution, which can be difficult, especially for high-dimensional data. One emerging solution lies in applying probabilistic machine learning to capture the data distribution in an unsupervised manner. Once a probabilistic model for the data is defined, variational inference can be used to infer its parameters from data. Variational inference is closely related to the optimal compression size, as stated by Hinton's bits-back argument: the evidence lower bound, the objective optimized by variational inference, corresponds to a lower bound on the optimal compression size of the average datapoint. However, current compression methods rely on variational inference merely as a heuristic, and they do not approach its postulated efficiency. In this thesis, we present principled and practical algorithms that get closer to this limit. After discussing our approach, we demonstrate its efficacy in image compression and model compression. First, we focus on image compression, where we use a variational autoencoder to learn a mapping between the images and their unobserved, latent representations. We propose a stochastic coding scheme to encode the latent representation, from which the original image can be approximately reconstructed. Next, we look at the compression of deep learning models. We use variational inference to approximate the posterior distribution of the weights in a neural network, and apply our stochastic coding scheme to encode a weight configuration. Finally, we investigate a connection between variational inference and our compression algorithm.
We show that a technique we used for compression can improve variational inference by generating samples from a highly flexible posterior approximation, without significantly increasing the computational costs
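For readers unfamiliar with the link between the evidence lower bound (ELBO) and code length mentioned in this abstract, the standard relation can be written out as below. The notation (data $x$, latent $z$, approximate posterior $q(z \mid x)$, prior $p(z)$) is the usual textbook convention and is assumed here, not taken from the thesis.

```latex
% Evidence lower bound for data x with latent variable z
\log p(x) \;\ge\; \mathrm{ELBO}(x)
  = \mathbb{E}_{q(z \mid x)}\!\left[\log p(x \mid z)\right]
  - \mathrm{KL}\!\left(q(z \mid x) \,\|\, p(z)\right)

% Bits-back view: expected net code length (in nats) for x
L(x) = -\,\mathrm{ELBO}(x)
  = \underbrace{\mathrm{KL}\!\left(q(z \mid x) \,\|\, p(z)\right)}_{\text{cost of the latent}}
  + \underbrace{\mathbb{E}_{q(z \mid x)}\!\left[-\log p(x \mid z)\right]}_{\text{cost of the data given the latent}}
```

Maximizing the ELBO therefore minimizes this idealized code length, which is the sense in which variational inference and compression optimize the same quantity.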

Apollo (Cambridge)