A survey of visual preprocessing and shape representation techniques
Many recent theories and methods proposed for visual preprocessing and shape representation are summarized. The survey brings together research from the fields of biology, psychology, computer science, electrical engineering, and, most recently, neural networks. It was motivated by the need to preprocess images for a sparse distributed memory (SDM), but the techniques presented may also prove useful for applying other associative memories to visual pattern recognition. The material of this survey is divided into three sections: an overview of biological visual processing; methods of preprocessing (extracting parts of shape, texture, motion, and depth); and shape representation and recognition (form invariance, primitives and structural descriptions, and theories of attention).
Hierarchical inference of disparity
Disparity-selective cells in V1 respond to the correlated receptive fields of the left and right retinae, which do not necessarily correspond to the same object in the 3D scene, i.e., these cells respond equally to both false and correct stereo matches. On the other hand, neurons in the extrastriate visual area V2 show much stronger responses to correct visual matches [Bakin et al., 2000]. This indicates that a part of the stereo correspondence problem is solved during disparity processing in these two areas. However, the mechanisms employed by the brain to accomplish this task are not yet understood. Existing computational models are mostly based on cooperative computations in V1 [Marr and Poggio 1976, Read and Cumming 2007], without exploiting the potential benefits of the hierarchical structure between V1 and V2. Here we propose a two-layer graphical model for disparity estimation from stereo. The lower layer matches the linear responses of neurons with Gabor receptive fields across images. Nodes in the upper layer infer a sparse code of the disparity map and act as priors that help disambiguate false from correct matches. When learned on natural disparity maps, the receptive fields of the sparse code converge to oriented depth edges, which is consistent with electrophysiological studies in macaque [von der Heydt et al., 2000]. Moreover, when such a code is used for depth inference in our two-layer model, the resulting disparity map for the Tsukuba stereo pair [Middlebury database] has 40% fewer false matches than the solution given by the first layer. Our model offers a demonstration of hierarchical disparity computation, leading to testable predictions about V1-V2 interactions.
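The lower-layer idea of matching Gabor responses across the two images can be illustrated with a minimal one-dimensional sketch. This is not the model from the abstract: the function names, the filter parameters, the squared-difference cost and the winner-take-all readout are all assumptions chosen for illustration, and the sparse-code upper layer is omitted entirely.

```python
import numpy as np

def gabor_pair(sigma=3.0, omega=0.6, half_width=8):
    """Even (cosine) and odd (sine) 1-D Gabor filters; parameters are illustrative."""
    x = np.arange(-half_width, half_width + 1)
    envelope = np.exp(-x**2 / (2.0 * sigma**2))
    return envelope * np.cos(omega * x), envelope * np.sin(omega * x)

def gabor_responses(scanline, even, odd):
    """Linear filter responses at every position of a scanline."""
    return (np.convolve(scanline, even, mode="same"),
            np.convolve(scanline, odd, mode="same"))

def match_disparity(left, right, max_d=6):
    """Per-pixel winner-take-all over candidate disparities 0..max_d."""
    even, odd = gabor_pair()
    l_even, l_odd = gabor_responses(left, even, odd)
    r_even, r_odd = gabor_responses(right, even, odd)
    costs = []
    for d in range(max_d + 1):
        # np.roll(r, d)[x] == r[x - d]; the shift is circular, so values
        # near the two ends of the scanline are unreliable.
        costs.append((l_even - np.roll(r_even, d)) ** 2 +
                     (l_odd - np.roll(r_odd, d)) ** 2)
    return np.argmin(np.stack(costs), axis=0)

# Synthetic scanline pair with a known, constant disparity.
rng = np.random.default_rng(0)
left = rng.standard_normal(200)
true_d = 3
right = np.roll(left, -true_d)      # so that right[x - true_d] == left[x]
est = match_disparity(left, right)
margin = 8 + 6                      # filter half-width + max_d
interior = est[margin:-margin]      # ignore boundary effects
```

On noise-free synthetic data like this, the local match is unambiguous. With real images, several disparities can give similar local costs, which is the ambiguity the abstract's upper layer is meant to resolve.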
Learning Linear, Sparse, Factorial Codes
In previous work (Olshausen & Field 1996), an algorithm was described for learning linear sparse codes which, when trained on natural images, produces a set of basis functions that are spatially localized, oriented, and bandpass (i.e., wavelet-like). This note shows how the algorithm may be interpreted within a maximum-likelihood framework. Several useful insights emerge from this connection: it makes explicit the relation to statistical independence (i.e., factorial coding), it shows a formal relationship to the algorithm of Bell and Sejnowski (1995), and it suggests how to adapt parameters that were previously fixed.
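As a rough illustration of this kind of sparse coding, the sketch below alternates between inferring coefficients for a fixed basis and taking a gradient step on the basis. It makes several assumptions that the abstract does not state: Gaussian reconstruction noise, an L1 (Laplacian-prior) sparseness cost, ISTA for inference, and unit-norm renormalization of the basis functions. The names `infer_codes`, `update_basis` and `energy` are hypothetical.

```python
import numpy as np

def soft_threshold(u, t):
    """Proximal operator of the L1 penalty."""
    return np.sign(u) * np.maximum(np.abs(u) - t, 0.0)

def energy(Phi, X, A, lam):
    """Negative log-posterior up to constants: squared error plus L1 cost."""
    resid = X - Phi @ A
    return 0.5 * np.sum(resid ** 2) + lam * np.sum(np.abs(A))

def infer_codes(Phi, X, lam=0.1, n_iter=200):
    """MAP coefficients for each column of X via ISTA (assumed inference scheme)."""
    L = np.linalg.norm(Phi, 2) ** 2          # Lipschitz constant of the gradient
    A = np.zeros((Phi.shape[1], X.shape[1]))
    for _ in range(n_iter):
        grad = Phi.T @ (Phi @ A - X)
        A = soft_threshold(A - grad / L, lam / L)
    return A

def update_basis(Phi, X, A, lr=0.01):
    """One gradient step on the basis, then renormalize columns to unit length."""
    resid = X - Phi @ A
    Phi = Phi + lr * (resid @ A.T) / X.shape[1]
    return Phi / np.linalg.norm(Phi, axis=0, keepdims=True)

# Toy data: 16-dimensional signals generated from a sparse code (illustrative only).
rng = np.random.default_rng(0)
Phi = rng.standard_normal((16, 32))
Phi /= np.linalg.norm(Phi, axis=0, keepdims=True)
A_true = rng.standard_normal((32, 50)) * (rng.random((32, 50)) < 0.1)
X = Phi @ A_true

A = infer_codes(Phi, X)
Phi_new = update_basis(Phi, X, A)
```

Under these assumptions, the inference step minimizes the energy for a fixed basis and the basis update follows the averaged outer product of residual and coefficients, which is the usual reading of the maximum-likelihood view of sparse coding.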
Learning sparse representations of depth
This paper introduces a new method for learning and inferring sparse
representations of depth (disparity) maps. The proposed algorithm relaxes the
usual assumption of the stationary noise model in sparse coding. This enables
learning from data corrupted with spatially varying noise or uncertainty,
typically obtained by laser range scanners or structured light depth cameras.
Sparse representations are learned from the Middlebury database disparity maps
and then exploited in a two-layer graphical model for inferring depth from
stereo, by including a sparsity prior on the learned features. Since they
capture higher-order dependencies in the depth structure, these priors can
complement smoothness priors commonly used in depth inference based on Markov
Random Field (MRF) models. Inference on the proposed graph is achieved using an
alternating iterative optimization technique, where the first layer is solved
using an existing MRF-based stereo matching algorithm, then held fixed as the
second layer is solved using the proposed non-stationary sparse coding
algorithm. This leads to a general method for improving the solutions of
state-of-the-art MRF-based depth estimation algorithms. Our experimental results
first show that depth inference using learned representations leads to
state-of-the-art denoising of depth maps obtained from laser range scanners and a
time-of-flight camera. Furthermore, we show that adding sparse priors improves the
results of two depth estimation methods: the classical graph cut algorithm by
Boykov et al. and the more recent algorithm of Woodford et al.
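One simple way to relax a stationary noise model is to weight each pixel's reconstruction error by its own precision, so that unreliable or missing depth samples contribute little or nothing to the fit. The sketch below shows that idea with ISTA; it is an assumed reading for illustration, not the algorithm from the paper, and the function name, the L1 penalty and the weights `w` (taken as inverse noise variances) are hypothetical choices.

```python
import numpy as np

def weighted_sparse_code(Phi, x, w, lam=0.1, n_iter=300):
    """Minimize 0.5 * sum_i w_i * (x_i - (Phi a)_i)**2 + lam * ||a||_1 via ISTA.

    w holds per-pixel precisions (assumed 1 / sigma_i**2); w_i = 0 marks a
    pixel with no usable measurement.
    """
    L = np.linalg.norm(np.sqrt(w)[:, None] * Phi, 2) ** 2
    L = max(L, 1e-12)                       # guard against all-zero weights
    a = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        grad = Phi.T @ (w * (Phi @ a - x))  # residual weighted per pixel
        u = a - grad / L
        a = np.sign(u) * np.maximum(np.abs(u) - lam / L, 0.0)
    return a

# Toy depth patch: 32 pixels, 64 basis functions, 8 pixels flagged as missing.
rng = np.random.default_rng(1)
Phi = rng.standard_normal((32, 64))
Phi /= np.linalg.norm(Phi, axis=0, keepdims=True)
a_true = np.zeros(64)
a_true[[3, 17, 40]] = [1.5, -2.0, 1.0]
x = Phi @ a_true + 0.01 * rng.standard_normal(32)

w = np.ones(32)
missing = np.arange(0, 32, 4)
w[missing] = 0.0

x_corrupt = x.copy()
x_corrupt[missing] = 1e3                    # garbage values at missing pixels

a_clean = weighted_sparse_code(Phi, x, w)
a_corrupt = weighted_sparse_code(Phi, x_corrupt, w)
```

Because zero-weight pixels drop out of the gradient, the two inferred codes agree regardless of what values sit at the missing pixels. With a stationary noise model, the corrupted samples would instead pull the reconstruction toward them.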
Neurobiology, Psychophysics, and Computational Models of Visual Attention
The purpose of this workshop was to discuss both recent experimental findings and
computational models of the neurobiological implementation of selective attention.
Recent experimental results were presented in two of the four presentations given
(C.E. Connor, Washington University, and B.C. Motter, SUNY and V.A. Medical
Center, Syracuse), while the other two talks were devoted to computational models
(E. Niebur, Caltech, and B. Olshausen, Washington University).