The computational magic of the ventral stream: sketch of a theory (and why some deep architectures work).

Author: Leibo Joel
Mutch Jim
Poggio Tomaso
Rosasco Lorenzo
Tacchetti Andrea
Publication date: 01/01/2012

This paper explores the theoretical consequences of a simple assumption: the computational goal of the feedforward path in the ventral stream, from V1, V2, and V4 to IT, is to discount image transformations after learning them during development.

CiteSeerX

DSpace@MIT

Sparse Modeling for Image and Vision Processing

Author: Francis Bach
Jean Ponce
Julien Mairal
Publication date: 01/01/2014

In recent years, a large amount of multi-disciplinary research has been conducted on sparse models and their applications. In statistics and machine learning, the sparsity principle is used to perform model selection, that is, automatically selecting a simple model among a large collection of them. In signal processing, sparse coding consists of representing data with linear combinations of a few dictionary elements. Subsequently, the corresponding tools have been widely adopted by several scientific communities such as neuroscience, bioinformatics, or computer vision. The goal of this monograph is to offer a self-contained view of sparse modeling for visual recognition and image processing. More specifically, we focus on applications where the dictionary is learned and adapted to data, yielding a compact representation that has been successful in various contexts. Comment: 205 pages, to appear in Foundations and Trends in Computer Graphics and Vision.
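The sparse coding idea in this abstract (representing a signal as a linear combination of a few dictionary elements) can be illustrated with a minimal sketch. It is not the monograph's method: it assumes a fixed random dictionary `D`, an L1-penalized least-squares objective, and the ISTA solver, and all names and parameter values (`sparse_code_ista`, `lam`, `n_iter`) are hypothetical choices made for illustration.

```python
import numpy as np

def soft_threshold(x, t):
    """Shrink each entry of x toward zero by t (proximal operator of t*||.||_1)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def objective(D, y, a, lam):
    """L1-penalized least squares: 0.5*||y - D a||^2 + lam*||a||_1."""
    return 0.5 * np.sum((y - D @ a) ** 2) + lam * np.sum(np.abs(a))

def sparse_code_ista(D, y, lam=0.1, n_iter=200):
    """Approximately minimize the objective above with ISTA (assumed solver)."""
    # Step size 1/L, where L is the largest eigenvalue of D^T D.
    L = np.linalg.norm(D, 2) ** 2
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - y)            # gradient of the quadratic term
        a = soft_threshold(a - grad / L, lam / L)
    return a

# Toy data (assumption): a random dictionary with unit-norm columns and a
# signal built from three of its atoms.
rng = np.random.default_rng(0)
D = rng.standard_normal((20, 50))
D /= np.linalg.norm(D, axis=0)
a_true = np.zeros(50)
a_true[[3, 17, 41]] = [1.5, -2.0, 1.0]
y = D @ a_true

a_hat = sparse_code_ista(D, y, lam=0.05, n_iter=2000)
print("non-zero coefficients:", int(np.count_nonzero(a_hat)))
print("objective:", objective(D, y, a_hat, 0.05))
```

In the setting the abstract describes, the dictionary would be learned from data rather than drawn at random; a typical scheme alternates a coding step like this one with a dictionary update.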

arXiv.org e-Print Archive

CiteSeerX

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

HAL-Rennes 1

Optimal measurement of visual motion across spatial and temporal scales

Author: A Gorban
A Jones
AB Watson
AJ Doorn van
B Krekelberg
B Sakitt
CW Gardiner
CWG Clifford
D Gabor
D Marr
DH Kelly
DJ Field
DM MacKay
DO Hebb
EL Bienenstock
EP Simoncelli
ET Jaynes
G Bi
HL Resnikoff
JG Daugman
K Nakayama
LA Lesmes
M Kubovy
M Vergassola
M Wertheimer
MJ Wainwright
MS Landy
O Paulsen
P Burt
P Jurica
RD Luce
S Gepshtein
S Marcelja
S Saremi
SB Laughlin
TM Cover
VD Glezer
Y Weiss
Y Yeshurun
Publication date: 01/01/2014

Sensory systems use limited resources to mediate the perception of a great variety of objects and events. Here a normative framework is presented for exploring how the problem of efficient allocation of resources can be solved in visual perception. Starting with a basic property of every measurement, captured by Gabor's uncertainty relation about the location and frequency content of signals, prescriptions are developed for optimal allocation of sensors for reliable perception of visual motion. This study reveals that a large-scale characteristic of human vision (the spatiotemporal contrast sensitivity function) is similar to the optimal prescription, and it suggests that some previously puzzling phenomena of visual sensitivity, adaptation, and perceptual organization have simple principled explanations. Comment: 28 pages, 10 figures, 2 appendices; in press in Favorskaya MN and Jain LC (Eds), Computer Vision in Advanced Control Systems using Conventional and Intelligent Paradigms, Intelligent Systems Reference Library, Springer-Verlag, Berlin.

arXiv.org e-Print Archive

CiteSeerX

Crossref

King's Research Portal

Probabilistic models of early vision

Author: Hoyer Patrik O.
Publication venue: Teknillinen korkeakoulu
Publication date: 15/11/2002

How do our brains transform patterns of light striking the retina into useful knowledge about objects and events of the external world? Thanks to intense research into the mechanisms of vision, much is now known about this process. However, we do not yet have anything close to a complete picture, and many questions remain unanswered. In addition to its clinical relevance and purely academic significance, research on vision is important because a thorough understanding of biological vision would probably help solve many major problems in computer vision. A major framework for investigating the computational basis of vision is what might be called the probabilistic view of vision. This approach emphasizes the general importance of uncertainty and probabilities in perception and, in particular, suggests that perception is tightly linked to the statistical structure of the natural environment. This thesis investigates this link by building statistical models of natural images, and relating these to what is known of the information processing performed by the early stages of the primate visual system. Recently, it was suggested that the response properties of simple cells in the primary visual cortex could be interpreted as the result of the cells performing an independent component analysis of the natural visual sensory input. This thesis provides some further support for that proposal, and, more importantly, extends the theory to also account for complex cell properties and the columnar organization of the primary visual cortex. Finally, the application of these methods to predicting neural response properties further along the visual pathway is considered. Although the models considered account for only a relatively small part of known facts concerning early visual information processing, it is nonetheless a rather impressive amount considering the simplicity of the models. 
This is encouraging, and suggests that many of the intricacies of visual information processing might be understood using fairly simple probabilistic models of natural sensory input.

CiteSeerX

Aaltodoc Publication Archive

Human Face Recognition

Author: Nabatchian Amirhosein
Publication venue: University of Windsor Leddy Library
Publication date: 01/01/2011

Face recognition, as the main biometric used by human beings, has grown in popularity over the last twenty years. Automatic recognition of human faces has many commercial and security applications in identity validation and recognition and has been one of the hottest topics in image processing and pattern recognition since 1990. The availability of feasible technologies, as well as the increasing demand for reliable security systems in today’s world, has motivated many researchers to develop new methods for face recognition. In automatic face recognition we aim to either identify or verify one or more persons in still or video images of a scene by means of a stored database of faces. One important feature of face recognition is its non-intrusive and non-contact nature, which distinguishes it from other biometrics such as iris or fingerprint recognition that require the subject's participation. During the last two decades several face recognition algorithms and systems have been proposed and some major advances have been achieved. As a result, the performance of face recognition systems under controlled conditions has now reached a satisfactory level. These systems, however, face challenges in environments with variations in illumination, pose, expression, etc. The objective of this research is to design a reliable automated face recognition system that is robust under varying conditions of noise level, illumination and occlusion. A new method for illumination-invariant feature extraction based on the illumination-reflectance model is proposed, which is computationally efficient and does not require any prior information about the face model or illumination. A weighted voting scheme is also proposed to enhance the performance under illumination variations and also cancel occlusions.
The proposed method uses mutual information and entropy of the images to generate different weights for a group of ensemble classifiers based on the input image quality. The method yields outstanding results by reducing the effect of both illumination and occlusion variations in the input face images.

Scholarship at UWindsor

A System for Analysing Axon Activity Using Multi-Angle and Scale Convolutional Networks

Author: Liu Zewen
Publication date: 31/12/2022

The University of Manchester - Institutional Repository

Homogeneous and Heterogeneous Face Recognition: Enhancing, Encoding and Matching for Practical Applications

Author: Nicolo Francesco
Publication venue: The Research Repository @ WVU
Publication date: 01/05/2012

Face recognition is the automatic processing of face images for the purpose of recognizing individuals. The recognition task becomes especially challenging in surveillance applications, where images are acquired from a long range in difficult environments. Short Wave Infrared (SWIR) is an emerging imaging modality that is able to produce clear long-range images in difficult environments or at night. Despite the benefits of SWIR technology, matching SWIR images against a gallery of visible images presents a challenge, since the photometric properties of the images in the two spectral bands are highly distinct. In this dissertation, we describe a cross-spectral matching method that encodes the magnitude and phase of multi-spectral face images filtered with a bank of Gabor filters. The magnitude of the filtered images is encoded with the Simplified Weber Local Descriptor (SWLD) and Local Binary Pattern (LBP) operators. The phase is encoded with the Generalized Local Binary Pattern (GLBP) operator. Encoded multi-spectral images are mapped into a histogram representation and cross-matched by applying the symmetric Kullback-Leibler distance. Performance of the developed algorithm is demonstrated on the TINDERS database, which contains long-range SWIR and color images acquired at distances of 2, 50, and 106 meters. Apart from long acquisition range, other variations and distortions such as pose variation, motion and out-of-focus blur, and uneven illumination may be observed in multispectral face images. The recognition performance of the face matcher can be greatly affected by these distortions. It is important, therefore, to ensure that matching is performed on high-quality images. Poor-quality images have to be either enhanced or discarded.
This dissertation addresses the problem of selecting good-quality samples. The last chapters of the dissertation suggest a number of modifications applied to the cross-spectral matching algorithm for matching low-resolution color images in near-real time. We show that the method that encodes the magnitude of Gabor-filtered images with the SWLD operator guarantees high recognition rates. The modified method (Gabor-SWLD) is adopted in a camera network setup where cameras acquire several views of the same individual. The designed algorithm and software are fully automated and optimized to perform recognition in near-real time. We evaluate the recognition performance and the processing time of the method on a small dataset collected at WVU.
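The final matching step named in this abstract, comparing histogram representations with a symmetric Kullback-Leibler distance, can be sketched as follows. This is only an illustration under stated assumptions, not the dissertation's implementation: the function names (`symmetric_kl`, `best_match`), the additive smoothing constant `eps`, and the toy histograms are all hypothetical, and the Gabor, SWLD, LBP, and GLBP encoding stages are not reproduced.

```python
import numpy as np

def symmetric_kl(p, q, eps=1e-10):
    """Symmetric KL distance KL(p||q) + KL(q||p) between two histograms.

    eps is an assumed smoothing constant that keeps empty bins from
    producing log(0) or division by zero.
    """
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p = p / p.sum()                      # normalize to probability vectors
    q = q / q.sum()
    # KL(p||q) + KL(q||p) simplifies to sum((p - q) * log(p / q)).
    return float(np.sum((p - q) * np.log(p / q)))

def best_match(probe_hist, gallery):
    """Return the gallery identity whose histogram is closest to the probe."""
    return min(gallery, key=lambda name: symmetric_kl(probe_hist, gallery[name]))

# Toy histograms (assumption): in practice these would be built from the
# encoded filter responses of each face image.
gallery = {
    "subject_a": [4, 1, 1, 2],
    "subject_b": [0, 5, 3, 0],
}
probe = [4, 1, 0, 3]
print(best_match(probe, gallery))
```

The plain KL divergence is asymmetric, so summing both directions makes the score independent of which image is treated as the probe and which as the gallery entry.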

The Research Repository @ WVU (West Virginia University)

A survey of visual preprocessing and shape representation techniques

Author: Olshausen Bruno A.

Many recent theories and methods proposed for visual preprocessing and shape representation are summarized. The survey brings together research from the fields of biology, psychology, computer science, electrical engineering, and most recently, neural networks. It was motivated by the need to preprocess images for a sparse distributed memory (SDM), but the techniques presented may also prove useful for applying other associative memories to visual pattern recognition. The material of this survey is divided into three sections: an overview of biological visual processing; methods of preprocessing (extracting parts of shape, texture, motion, and depth); and shape representation and recognition (form invariance, primitives and structural descriptions, and theories of attention).

NASA Technical Reports Server