
    Simulations for Validation of Vision Systems

    Full text link
    As computer vision matures into a systems science and engineering discipline, there is a trend toward leveraging the latest advances in computer graphics simulations for performance evaluation, learning, and inference. However, the utility of graphics simulations for vision remains an open question, with apparently contradictory views in the literature. In this paper, we place results from the recent literature in the context of the performance characterization methodology outlined in the 1990s and note that insights derived from simulations can be qualitative or quantitative, depending on the degree of fidelity of the models used in simulation and the nature of the question posed by the experimenter. We describe a simulation platform that incorporates the latest graphics advances and use it for systematic performance characterization and trade-off analysis in vision system design. We verify the utility of the platform in a case study validating a generative-model-inspired vision hypothesis, the Rank-Order consistency model, in the contexts of global and local illumination changes, bad weather, and high-frequency noise. Our approach establishes the link between alternative viewpoints, involving models with physics-based semantics and with signal-and-perturbation semantics, and confirms insights in the literature on robust change detection.

    Image Restoration Using Joint Statistical Modeling in Space-Transform Domain

    Full text link
    This paper presents a novel strategy for high-fidelity image restoration by characterizing both local smoothness and nonlocal self-similarity of natural images in a unified statistical manner. The main contributions are threefold. First, from the perspective of image statistics, a joint statistical modeling (JSM) in an adaptive hybrid space-transform domain is established, which offers a powerful mechanism for combining local smoothness and nonlocal self-similarity simultaneously to ensure a more reliable and robust estimation. Second, a new form of minimization functional for solving image inverse problems is formulated using JSM under a regularization-based framework. Finally, in order to make JSM tractable and robust, a new Split-Bregman-based algorithm is developed to efficiently solve the above severely underdetermined inverse problem, with a theoretical proof of convergence. Extensive experiments on image inpainting, image deblurring, and mixed Gaussian plus salt-and-pepper noise removal applications verify the effectiveness of the proposed algorithm.
    Comment: 14 pages, 18 figures, 7 tables; to be published in IEEE Transactions on Circuits and Systems for Video Technology (TCSVT). A high-resolution PDF version and code can be found at: http://idm.pku.edu.cn/staff/zhangjian/IRJSM

    A sparsity-driven approach to multi-camera tracking in visual sensor networks

    Get PDF
    In this paper, a sparsity-driven approach is presented for multi-camera tracking in visual sensor networks (VSNs). VSNs consist of image sensors, embedded processors, and wireless transceivers, all powered by batteries. Since energy and bandwidth resources are limited, setting up a tracking system in VSNs is a challenging problem. Motivated by the goal of tracking in a bandwidth-constrained environment, we present a sparsity-driven method to compress the features extracted by the camera nodes, which are then transmitted across the network for distributed inference. We have designed special overcomplete dictionaries that match the structure of the features, leading to very parsimonious yet accurate representations. We have tested our method in indoor and outdoor people-tracking scenarios. Our experimental results demonstrate how our approach leads to communication savings without significant loss in tracking performance.
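    The abstract does not specify the sparse coding algorithm used over its overcomplete dictionaries, but the general idea of obtaining a parsimonious representation of a feature vector can be sketched with greedy matching pursuit; the function and parameter names below are illustrative, not from the paper:

    ```python
    import numpy as np

    def matching_pursuit(f, D, n_atoms=5):
        """Greedily encode feature vector f over a dictionary D
        (columns = unit-norm atoms), keeping only a few coefficients."""
        r = f.astype(float).copy()          # residual to be explained
        coeffs = np.zeros(D.shape[1])
        for _ in range(n_atoms):
            corr = D.T @ r                  # correlation with every atom
            k = int(np.argmax(np.abs(corr)))  # best-matching atom
            coeffs[k] += corr[k]
            r -= corr[k] * D[:, k]          # remove its contribution
        return coeffs                       # sparse code to transmit
    ```

    Only the few nonzero entries of `coeffs` (index plus value) would need to cross the network, which is the source of the communication savings the abstract describes.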

    Infinite Sparse Structured Factor Analysis

    Full text link
    Matrix factorisation methods decompose multivariate observations as linear combinations of latent feature vectors. The Indian Buffet Process (IBP) provides a way to model the number of latent features required for a good approximation in terms of regularised reconstruction error. Previous work has focussed on latent feature vectors with independent entries. We extend the model to include nondiagonal latent covariance structures representing characteristics such as smoothness. Using simulations, we demonstrate that under appropriate conditions a smoothness prior helps to recover the true latent features while denoising more accurately. We demonstrate our method on a real neuroimaging dataset, where computational tractability is a sufficient challenge that the efficient strategy presented here is essential.

    Optimization Methods for Convolutional Sparse Coding

    Full text link
    Sparse and convolutional constraints form a natural prior for many optimization problems that arise from physical processes. Detecting motifs in speech and musical passages, super-resolving images, compressing videos, and reconstructing harmonic motions can all leverage redundancies introduced by convolution. Solving problems involving sparse and convolutional constraints remains computationally difficult, however. In this paper we present an overview of convolutional sparse coding in a consistent framework. The objective involves iteratively optimizing a convolutional least-squares term for the basis functions, followed by an L1-regularized least-squares term for the sparse coefficients. We discuss a range of optimization methods for solving the convolutional sparse coding objective, and the properties that make each method suitable for different applications. In particular, we concentrate on computational complexity, speed to ε-convergence, memory usage, and the effect of implied boundary conditions. We present a broad suite of examples covering different signal and application domains to illustrate the general applicability of convolutional sparse coding, and the efficacy of the available optimization methods.
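    The coefficient-update half of the objective described above, an L1-regularized least-squares problem with fixed filters, can be sketched as iterative soft-thresholding (ISTA) with the convolutions computed in the Fourier domain; this is a minimal 1-D sketch under circular boundary conditions, not any specific method surveyed in the paper:

    ```python
    import numpy as np

    def soft_threshold(v, t):
        return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

    def csc_ista(x, filters, lam=0.1, n_iter=200):
        """Solve min_z 0.5*||x - sum_k d_k * z_k||^2 + lam*sum_k||z_k||_1
        for the sparse maps z_k, with circular convolution done via FFT."""
        N = len(x)
        D = np.fft.rfft(np.array([np.pad(d, (0, N - len(d))) for d in filters]),
                        axis=1)                       # filter spectra
        X = np.fft.rfft(x)
        # step size from the Lipschitz constant of the gradient:
        # the squared operator norm is max over frequencies of sum_k |D_k|^2
        step = 1.0 / np.max(np.sum(np.abs(D) ** 2, axis=0))
        Z = np.zeros((len(filters), N))
        for _ in range(n_iter):
            Zf = np.fft.rfft(Z, axis=1)
            R = np.sum(D * Zf, axis=0) - X            # residual spectrum
            grad = np.fft.irfft(np.conj(D) * R, n=N, axis=1)
            Z = soft_threshold(Z - step * grad, step * lam)
        return Z
    ```

    Working in the frequency domain turns each convolution into an elementwise product, which is what makes the per-iteration cost manageable; the implied circular boundary condition is exactly the kind of modeling choice the paper's discussion of boundary effects concerns.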

    VLSI Friendly Framework for Scalable Video Coding based on Compressed Sensing

    Full text link
    This paper presents a new VLSI-friendly framework for scalable video coding based on Compressed Sensing (CS). It achieves scalability through the 3-Dimensional Discrete Wavelet Transform (3-D DWT) and a better compression ratio by exploiting the inherent sparsity of the high-frequency wavelet sub-bands through CS. By using the 3-D DWT and a proposed adaptive measurement scheme, called AMS, at the encoder, one can improve the compression ratio and reduce the complexity of the decoder. The proposed video codec uses only 7% of the total number of multipliers needed in a conventional CS-based video coding system. A codebook of Bernoulli matrices with different sizes, corresponding to predefined sparsity levels, is maintained at both the encoder and the decoder. Based on the calculated l0-norm of the input vector, one of the sixteen possible Bernoulli matrices is selected for taking the CS measurements, and its index is transmitted along with the measurements. Based on this index, the corresponding Bernoulli matrix is used in the CS reconstruction algorithm to recover the high-frequency wavelet sub-bands at the decoder. At the decoder, a new Enhanced Approximate Message Passing (EAMP) algorithm is proposed to reconstruct the wavelet coefficients and apply the inverse wavelet transform to restore the video frames. Simulation results establish the superiority of the proposed framework over existing schemes and its suitability for VLSI implementation. Moreover, the coded video is found to be scalable with an increase in the number of levels of wavelet decomposition.
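    The encoder-side selection logic described above, pick a Bernoulli matrix from a shared codebook based on the l0-norm of the sub-band vector, can be sketched as follows; the codebook sizes and the measurement-count rule (roughly four measurements per nonzero) are illustrative assumptions, since the abstract does not give the paper's actual sizing:

    ```python
    import numpy as np

    def build_codebook(n, sparsity_levels, seed=0):
        """One ±1 Bernoulli measurement matrix per predefined sparsity level.
        The row count min(4*s, n) is an illustrative choice, not the paper's."""
        rng = np.random.default_rng(seed)
        return [rng.choice([-1.0, 1.0], size=(min(4 * s, n), n)) / np.sqrt(min(4 * s, n))
                for s in sparsity_levels]

    def adaptive_measure(x, codebook, sparsity_levels):
        """Select the smallest matrix whose sparsity level covers ||x||_0,
        take the CS measurements, and return (index, measurements)."""
        s = int(np.count_nonzero(x))              # l0-norm of the input vector
        idx = next((i for i, lvl in enumerate(sparsity_levels) if s <= lvl),
                   len(sparsity_levels) - 1)      # fall back to the densest matrix
        y = codebook[idx] @ x                     # CS measurements to transmit
        return idx, y
    ```

    Because both sides hold the same codebook, only the small index and the measurement vector need to be sent; the decoder looks up the matching Bernoulli matrix for reconstruction, as the abstract describes.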

    Multivariate Cryptosystems for Secure Processing of Multidimensional Signals

    Full text link
    Multidimensional signals like 2-D and 3-D images or videos are inherently sensitive signals which require privacy-preserving solutions when processed in untrustworthy environments, but their efficient encrypted processing is particularly challenging due to their structure, dimensionality, and size. This work introduces a new cryptographic hard problem, denoted m-RLWE (multivariate Ring Learning with Errors), which generalizes RLWE, and proposes several relinearization-based techniques to efficiently convert signals with different structures and dimensionalities. The proposed hard problem and the developed techniques support lattice cryptosystems that enable encrypted processing of multidimensional signals and efficient conversion between different structures. We show an example cryptosystem and prove that it outperforms its RLWE counterpart in terms of security against basis-reduction attacks, efficiency, and cipher expansion for encrypted image processing, and we exemplify some of the proposed transformation techniques in critical and ubiquitous block-based processing applications.

    Merge Frame Design for Video Stream Switching using Piecewise Constant Functions

    Full text link
    The ability to efficiently switch from one pre-encoded video stream to another (e.g., for bitrate adaptation or view switching) is important for many interactive streaming applications. Recently, stream-switching mechanisms based on distributed source coding (DSC) have been proposed. In order to reduce the overall transmission rate, these approaches provide a "merge" mechanism, where information is sent to the decoder such that the exact same frame can be reconstructed given that any one of a known set of side information (SI) frames is available at the decoder (e.g., each SI frame may correspond to a different stream from which we are switching). However, the use of bit-plane coding and channel coding in many DSC approaches leads to complex coding and decoding. In this paper, we propose an alternative approach for merging multiple SI frames, using a piecewise constant (PWC) function as the merge operator. In our approach, for each block to be reconstructed, a series of parameters of these PWC merge functions are transmitted in order to guarantee identical reconstruction given the known side information blocks. We consider two different scenarios. In the first, a target frame is given, and merge parameters are chosen so that this frame can be reconstructed exactly at the decoder. In the second, the reconstructed frame and merge parameters are jointly optimized to meet a rate-distortion criterion. Experiments show that for both scenarios, our proposed merge techniques can outperform both a recent approach based on DSC and the SP-frame approach in H.264, in terms of compression efficiency and decoder complexity.
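    The core idea of a PWC merge operator is that a floor-based staircase function can map any of several SI values into the same bin, so every decoder reconstructs the identical value regardless of which SI frame it holds. The toy sketch below only illustrates that identical-reconstruction property for one coefficient; the paper's actual parameter selection is rate-distortion optimized, and the step-size rule and names here are purely illustrative:

    ```python
    import math

    def pwc_merge_params(si_values):
        """Choose step W and shift c of a floor-based PWC function so that
        every side-information value lands in the same bin."""
        lo, hi = min(si_values), max(si_values)
        W = (hi - lo) + 1   # step wider than the spread of SI values
        c = lo              # shift so all SI values share one bin
        return W, c

    def pwc_apply(x, W, c):
        """Staircase merge function: map x to the center of its bin."""
        return math.floor((x - c) / W) * W + c + W / 2.0
    ```

    Transmitting (W, c) per block is the "series of parameters" the abstract refers to: any SI value in [c, c + W) maps to the single reconstruction c + W/2, so all decoders agree.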

    A new adaptive interframe transform coding using directional classification

    Get PDF

    Wavelet Video Coding Algorithm Based on Energy Weighted Significance Probability Balancing Tree

    Full text link
    This work presents a 3-D wavelet video coding algorithm. By analyzing the contribution of each biorthogonal wavelet basis to the reconstructed signal's energy, we weight each wavelet subband according to its basis energy. Based on the distribution of weighted coefficients, we further discuss a 3-D wavelet tree structure, named the significance probability balancing tree, which places coefficients with similar probabilities of being significant on the same layer. It is implemented using a hybrid of a spatial orientation tree and a temporal-domain block tree. Subsequently, a novel 3-D wavelet video coding algorithm is proposed based on the energy-weighted significance probability balancing tree. Experimental results illustrate that our algorithm consistently achieves good reconstruction quality for different classes of video sequences. Compared with the asymmetric 3-D orientation tree, the average peak signal-to-noise ratio (PSNR) gains of our algorithm are 1.24 dB, 2.54 dB, and 2.57 dB for the luminance (Y) and chrominance (U, V) components, respectively. Compared with the temporal-spatial orientation tree algorithm, our algorithm gains 0.38 dB, 2.92 dB, and 2.39 dB higher PSNR for the Y, U, and V components, respectively. In addition, the proposed algorithm requires lower computation cost than the above two algorithms.
    Comment: 17 pages, 2 figures; submitted to Multimedia Tools and Applications