
23,624 research outputs found

Unsupervised learning of clutter-resistant visual representations from natural videos

Author: Joel Z. Leibo, Qianli Liao, Tomaso Poggio
Publication date: 23/04/2015

Populations of neurons in inferotemporal cortex (IT) maintain an explicit code for object identity that also tolerates transformations of object appearance, e.g., position, scale, and viewing angle [1, 2, 3]. Though the learning rules are not known, recent results [4, 5, 6] suggest the operation of an unsupervised temporal-association-based method, e.g., Földiák's trace rule [7]. Such methods exploit the temporal continuity of the visual world by assuming that visual experience over short timescales will tend to have invariant identity content. Thus, by associating representations of frames from nearby times, a representation that tolerates whatever transformations occurred in the video may be achieved. Many previous studies verified that such rules can work in simple situations without background clutter, but the presence of visual clutter has remained problematic for this approach. Here we show that temporal association based on large class-specific filters (templates) avoids the problem of clutter. Our system learns in an unsupervised way from natural videos gathered from the internet, and is able to perform a difficult unconstrained face recognition task on natural images: Labeled Faces in the Wild [8].
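
The temporal-association idea in this abstract can be illustrated with a minimal sketch of a Földiák-style trace rule. This is not the authors' system, which relies on large class-specific templates; the function name, the parameters `alpha` and `delta`, and the single-unit setup are assumptions made purely for illustration.

```python
def trace_rule_update(w, x, trace, alpha=0.05, delta=0.2):
    """One step of a trace rule for a single unit (illustrative sketch).

    w     : current weight vector (list of floats)
    x     : current input frame (list of floats)
    trace : running average of the unit's past activity
    alpha : learning rate (assumed value)
    delta : trace update rate (assumed value)
    """
    # Instantaneous response of the unit to the current frame.
    y = sum(wi * xi for wi, xi in zip(w, x))
    # Temporally smoothed activity: recent frames still contribute.
    trace = (1.0 - delta) * trace + delta * y
    # Move the weights toward the current frame, gated by the trace
    # rather than by the instantaneous response.
    w = [wi + alpha * trace * (xi - wi) for wi, xi in zip(w, x)]
    return w, trace


# A unit tuned to the previous frame (trace = 1.0) sees a new frame that
# it does not respond to at all (w . x = 0).
w, trace = trace_rule_update([1.0, 0.0], [0.0, 1.0], trace=1.0)
```

Because the update is gated by the trace, the weights still shift toward the new frame even though the instantaneous response is zero, which is how frames that are close in time become associated.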

arXiv.org e-Print Archive

DSpace@MIT

Learning viewpoint invariant perceptual representations from cluttered images

Author: Michael Spratling
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2005

In order to perform object recognition, it is necessary to form perceptual representations that are sufficiently specific to distinguish between objects, but that are also sufficiently flexible to generalize across changes in location, rotation, and scale. A standard method for learning perceptual representations that are invariant to viewpoint is to form temporal associations across image sequences showing object transformations. However, this method requires that individual stimuli be presented in isolation and is therefore unlikely to succeed in real-world applications where multiple objects can co-occur in the visual input. This paper proposes a simple modification to the learning method that can overcome this limitation and results in more robust learning of invariant representations.

Crossref

Birkbeck Institutional Research Online

CogPrints Cognitive Sciences Eprint Archive

King's Research Portal

View-tolerant face recognition and Hebbian learning imply mirror-symmetric neural tuning to head orientation

Author: Fabio Anselmi, Joel Z. Leibo, Qianli Liao, Tomaso Poggio, Winrich A. Freiwald
Publication date: 03/06/2016

The primate brain contains a hierarchy of visual areas, dubbed the ventral stream, which rapidly computes object representations that are both specific for object identity and relatively robust against identity-preserving transformations like depth-rotations. Current computational models of object recognition, including recent deep learning networks, generate these properties through a hierarchy of alternating selectivity-increasing filtering and tolerance-increasing pooling operations, similar to simple-complex cell operations. While simulations of these models recapitulate the ventral stream's progression from early view-specific to late view-tolerant representations, they fail to generate the most salient property of the intermediate representation for faces found in the brain: mirror-symmetric tuning of the neural population to head orientation. Here we prove that a class of hierarchical architectures and a broad set of biologically plausible learning rules can provide approximate invariance at the top level of the network. While most of the learning rules do not yield mirror-symmetry in the mid-level representations, we characterize a specific biologically-plausible Hebb-type learning rule that is guaranteed to generate mirror-symmetric tuning to faces at intermediate levels of the architecture.
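
The link between a Hebb-type rule and mirror symmetry can be illustrated with a toy sketch. The abstract does not name the rule, so Oja's rule is used here as an assumed stand-in, and the two-pixel "views", the learning rate, and the function name are all hypothetical. The sketch also assumes that every training view appears together with its left-right reflection.

```python
def oja_train(samples, w, eta=0.01, epochs=2000):
    """Apply Oja's rule, w += eta * y * (x - y * w), cycling through samples."""
    for _ in range(epochs):
        for x in samples:
            y = sum(wi * xi for wi, xi in zip(w, x))
            w = [wi + eta * y * (xi - y * wi) for wi, xi in zip(w, x)]
    return w


view = [2.0, 1.0]      # hypothetical two-pixel "view" of a face
mirror = view[::-1]    # its left-right reflection

# Train on a set that is closed under reflection (assumption of this sketch).
w = oja_train([view, mirror], w=[0.3, 0.1])

# Response magnitudes of the learned unit to the view and to its mirror image.
resp_view = abs(sum(wi * xi for wi, xi in zip(w, view)))
resp_mirror = abs(sum(wi * xi for wi, xi in zip(w, mirror)))
```

In this toy setting the learned weights approach a vector that is symmetric under reflection, so `resp_view` and `resp_mirror` end up nearly equal. This shows only the general mechanism, not the paper's proof or architecture.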

arXiv.org e-Print Archive

DSpace@MIT

Crossref

PubMed Central

Fast, invariant representation for human action in the visual system

Author: Leyla Isik, Tomaso Poggio, Andrea Tacchetti
Publication date: 06/01/2016

Humans can effortlessly recognize others' actions in the presence of complex transformations, such as changes in viewpoint. Several studies have located the regions in the brain involved in invariant action recognition; however, the underlying neural computations remain poorly understood. We use magnetoencephalography (MEG) decoding and a dataset of well-controlled, naturalistic videos of five actions (run, walk, jump, eat, drink) performed by different actors at different viewpoints to study the computational steps used to recognize actions across complex transformations. In particular, we ask when the brain discounts changes in 3D viewpoint relative to when it initially discriminates between actions. We measure the latency difference between invariant and non-invariant action decoding when subjects view full videos as well as form-depleted and motion-depleted stimuli. Our results show no difference in decoding latency or temporal profile between invariant and non-invariant action recognition in full videos. However, when either form or motion information is removed from the stimulus set, we observe a decrease and delay in invariant action decoding. Our results suggest that the brain recognizes actions and builds invariance to complex transformations at the same time, and that both form and motion information are crucial for fast, invariant action recognition.

arXiv.org e-Print Archive

DSpace@MIT