Search CORE

2,771 research outputs found

An Improvement of Just Noticeable Color Difference Estimation

Author: Kuo-Cheng Liu
Publication venue: 'Polish Information Processing Society PTI'
Publication date: 01/10/2016
Field of study

Crossref

Directory of Open Access Journals

The curvelet transform for image denoising

Author: Candès Emmanuel J.
Donoho David L.
Starck Jean-Luc
Publication venue
Publication date: 01/01/2002
Field of study

We describe approximate digital implementations of two new mathematical transforms, namely, the ridgelet transform and the curvelet transform. Our implementations offer exact reconstruction, stability against perturbations, ease of implementation, and low computational complexity. A central tool is Fourier-domain computation of an approximate digital Radon transform. We introduce a very simple interpolation in the Fourier space which takes Cartesian samples and yields samples on a rectopolar grid, which is a pseudo-polar sampling set based on a concentric squares geometry. Despite the crudeness of our interpolation, the visual performance is surprisingly good. Our ridgelet transform applies to the Radon transform a special overcomplete wavelet pyramid whose wavelets have compact support in the frequency domain. Our curvelet transform uses our ridgelet transform as a component step, and implements curvelet subbands using a filter bank of a&grave; trous wavelet filters. Our philosophy throughout is that transforms should be overcomplete, rather than critically sampled. We apply these digital transforms to the denoising of some standard images embedded in white noise. In the tests reported here, simple thresholding of the curvelet coefficients is very competitive with "state of the art" techniques based on wavelets, including thresholding of decimated or undecimated wavelet transforms and also including tree-based Bayesian posterior mean methods. Moreover, the curvelet reconstructions exhibit higher perceptual quality than wavelet-based reconstructions, offering visually sharper images and, in particular, higher quality recovery of edges and of faint linear and curvilinear features. Existing theory for curvelet and ridgelet transforms suggests that these new approaches can outperform wavelet methods in certain image reconstruction problems. The empirical results reported here are in encouraging agreement

CiteSeerX

Caltech Authors

Steered mixture-of-experts for light field images and video : representation and coding

Author: Lambert Peter
Sikora Thomas
Van Wallendael Glenn
Verhack Ruben
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2020
Field of study

Research in light field (LF) processing has heavily increased over the last decade. This is largely driven by the desire to achieve the same level of immersion and navigational freedom for camera-captured scenes as it is currently available for CGI content. Standardization organizations such as MPEG and JPEG continue to follow conventional coding paradigms in which viewpoints are discretely represented on 2-D regular grids. These grids are then further decorrelated through hybrid DPCM/transform techniques. However, these 2-D regular grids are less suited for high-dimensional data, such as LFs. We propose a novel coding framework for higher-dimensional image modalities, called Steered Mixture-of-Experts (SMoE). Coherent areas in the higher-dimensional space are represented by single higher-dimensional entities, called kernels. These kernels hold spatially localized information about light rays at any angle arriving at a certain region. The global model consists thus of a set of kernels which define a continuous approximation of the underlying plenoptic function. We introduce the theory of SMoE and illustrate its application for 2-D images, 4-D LF images, and 5-D LF video. We also propose an efficient coding strategy to convert the model parameters into a bitstream. Even without provisions for high-frequency information, the proposed method performs comparable to the state of the art for low-to-mid range bitrates with respect to subjective visual quality of 4-D LF images. In case of 5-D LF video, we observe superior decorrelation and coding performance with coding gains of a factor of 4x in bitrate for the same quality. At least equally important is the fact that our method inherently has desired functionality for LF rendering which is lacking in other state-of-the-art techniques: (1) full zero-delay random access, (2) light-weight pixel-parallel view reconstruction, and (3) intrinsic view interpolation and super-resolution

Ghent University Academic Bibliography

Computing motion in the primate's visual system

Author: Koch Christof
Mathur Bimal
Wang H. Taichi
Publication venue: 'The Company of Biologists'
Publication date: 01/09/1989
Field of study

Computing motion on the basis of the time-varying image intensity is a difficult problem for both artificial and biological vision systems. We will show how one well-known gradient-based computer algorithm for estimating visual motion can be implemented within the primate's visual system. This relaxation algorithm computes the optical flow field by minimizing a variational functional of a form commonly encountered in early vision, and is performed in two steps. In the first stage, local motion is computed, while in the second stage spatial integration occurs. Neurons in the second stage represent the optical flow field via a population-coding scheme, such that the vector sum of all neurons at each location codes for the direction and magnitude of the velocity at that location. The resulting network maps onto the magnocellular pathway of the primate visual system, in particular onto cells in the primary visual cortex (V1) as well as onto cells in the middle temporal area (MT). Our algorithm mimics a number of psychophysical phenomena and illusions (perception of coherent plaids, motion capture, motion coherence) as well as electrophysiological recordings. Thus, a single unifying principle ‘the final optical flow should be as smooth as possible’ (except at isolated motion discontinuities) explains a large number of phenomena and links single-cell behavior with perception and computational theory

Caltech Authors

A Fully Progressive Approach to Single-Image Super-Resolution

Author: McWilliams Brian
Perazzi Federico
Schroers Christopher
Sorkine-Hornung Alexander
Sorkine-Hornung Olga
Wang Yifan
Publication venue
Publication date: 10/04/2018
Field of study

Recent deep learning approaches to single image super-resolution have achieved impressive results in terms of traditional error measures and perceptual quality. However, in each case it remains challenging to achieve high quality results for large upsampling factors. To this end, we propose a method (ProSR) that is progressive both in architecture and training: the network upsamples an image in intermediate steps, while the learning process is organized from easy to hard, as is done in curriculum learning. To obtain more photorealistic results, we design a generative adversarial network (GAN), named ProGanSR, that follows the same progressive multi-scale design principle. This not only allows to scale well to high upsampling factors (e.g., 8x) but constitutes a principled multi-scale approach that increases the reconstruction quality for all upsampling factors simultaneously. In particular ProSR ranks 2nd in terms of SSIM and 4th in terms of PSNR in the NTIRE2018 SISR challenge [34]. Compared to the top-ranking team, our model is marginally lower, but runs 5 times faster

arXiv.org e-Print Archive

Repository for Publications and Research Data

Crossref

Model-Based Environmental Visual Perception for Humanoid Robots

Author: Gonzalez Aguirre David Israel
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2013
Field of study

The visual perception of a robot should answer two fundamental questions: What? and Where? In order to properly and efficiently reply to these questions, it is essential to establish a bidirectional coupling between the external stimuli and the internal representations. This coupling links the physical world with the inner abstraction models by sensor transformation, recognition, matching and optimization algorithms. The objective of this PhD is to establish this sensor-model coupling

KITopen

Efficient Coding and Statistically Optimal Weighting of Covariance among Acoustic Attributes in Novel Sounds

Author: A Caclin
A Cristia
A Kohn
AA Faisal
AA Stocker
AJ Bell
AL Fairhall
AR Girshick
B McMurray
BA Olshausen
BA Olshausen
BA Olshausen
BH Repp
BR Glasberg
C McCollough
CE Stilp
CE Stilp
CE Stilp
Christian E. Stilp
CWG Clifford
CWG Clifford
D Alais
D Bendor
D Bendor
D Kersten
D Kersten
David S. Vicario
DC Knill
DL Barbour
DO Hebb
E Oja
EJA Turnham
EP Simoncelli
F Attneave
F Opolko
FH Durgin
G Chechik
G Chechik
HB Barlow
HB Barlow
HB Barlow
HB Helbig
HM Sussman
HM Sussman
I Nelken
J Maye
J Maye
J-M Xu
JA Movshon
JC Toscano
JK Chapin
JL Anderson
JM Hillis
JM Hillis
Keith R. Kluender
KP Körding
KP Körding
KR Kluender
KR Kluender
KR Kluender
KR Kluender
KR Kluender
L Lisker
LL Holt
MO Ernst
MO Ernst
O Schwartz
PC Delattre
PK Stanton
RD Patterson
SC Sullivan
T Lu
T Lu
TD Sanger
TJ Sejnowski
TJ Sejnowski
WE Vinje
WE Vinje
WS Geisler
X Wang
Publication venue: Public Library of Science
Publication date: 01/01/2011
Field of study

To the extent that sensorineural systems are efficient, redundancy should be extracted to optimize transmission of information, but perceptual evidence for this has been limited. Stilp and colleagues recently reported efficient coding of robust correlation (r = .97) among complex acoustic attributes (attack/decay, spectral shape) in novel sounds. Discrimination of sounds orthogonal to the correlation was initially inferior but later comparable to that of sounds obeying the correlation. These effects were attenuated for less-correlated stimuli (r = .54) for reasons that are unclear. Here, statistical properties of correlation among acoustic attributes essential for perceptual organization are investigated. Overall, simple strength of the principal correlation is inadequate to predict listener performance. Initial superiority of discrimination for statistically consistent sound pairs was relatively insensitive to decreased physical acoustic/psychoacoustic range of evidence supporting the correlation, and to more frequent presentations of the same orthogonal test pairs. However, increased range supporting an orthogonal dimension has substantial effects upon perceptual organization. Connectionist simulations and Eigenvalues from closed-form calculations of principal components analysis (PCA) reveal that perceptual organization is near-optimally weighted to shared versus unshared covariance in experienced sound distributions. Implications of reduced perceptual dimensionality for speech perception and plausible neural substrates are discussed

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Purdue E-Pubs