No Spare Parts: Sharing Part Detectors for Image Categorization
This work aims for image categorization using a representation of distinctive
parts. Different from existing part-based work, we argue that parts are
naturally shared between image categories and should be modeled as such. We
motivate our approach with a quantitative and qualitative analysis by
backtracking where selected parts come from. Our analysis shows that in
addition to the category parts defining the class, the parts coming from the
background context and parts from other image categories improve categorization
performance. Part selection should not be done separately for each category,
but instead be shared and optimized over all categories. To incorporate part
sharing between categories, we present an algorithm based on AdaBoost to
jointly optimize part sharing and selection, as well as fusion with the global
image representation. We achieve results competitive with the state of the art on object, scene, and action categories, further improving over deep convolutional neural networks.
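The abstract names an AdaBoost-based algorithm for jointly optimizing part sharing and selection, but gives no details. A minimal sketch of the shared-selection idea, assuming a precomputed part-detector response matrix R, one-vs-rest labels Y, and median-threshold decision stumps as weak learners (all illustrative choices, not the paper's actual components):

```python
import numpy as np

def shared_part_boosting(R, Y, n_rounds=50, eps=1e-10):
    """Each round picks the single part detector whose stump has the
    lowest weighted error summed over *all* categories, so parts are
    shared across categories instead of selected per category.

    R : (n_images, n_parts) real-valued part-detector responses
    Y : (n_images, n_categories) one-vs-rest labels in {-1, +1}
    """
    n, _ = R.shape
    k = Y.shape[1]
    W = np.full((n, k), 1.0 / n)                       # one weight vector per category
    H = np.where(R > np.median(R, axis=0), 1.0, -1.0)  # fixed median-threshold stumps
    miss = H[:, :, None] != Y[:, None, :]              # (n, parts, k) mistake mask
    picks, alphas = [], []
    for _ in range(n_rounds):
        err = np.einsum('nk,npk->pk', W, miss)         # weighted error per part/category
        j = int(err.sum(axis=1).argmin())              # *shared* choice across categories
        e = np.clip(err[j], eps, 1 - eps)
        a = 0.5 * np.log((1 - e) / e)                  # per-category vote weights
        W *= np.exp(-a * Y * H[:, [j]])                # AdaBoost reweighting, per category
        W /= W.sum(axis=0, keepdims=True)
        picks.append(j)
        alphas.append(a)
    return picks, np.array(alphas)
```

Summing the weighted error over categories before the argmin is what couples the categories: a part that helps several classes at once beats a part that is optimal for only one.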
Web-Scale Training for Face Identification
Scaling machine learning methods to very large datasets has attracted
considerable attention in recent years, thanks to easy access to ubiquitous
sensing and data from the web. We study face recognition and show that three
distinct properties have surprising effects on the transferability of deep
convolutional networks (CNNs): (1) the bottleneck of the network serves as an important transfer-learning regularizer; and (2) in contrast to common wisdom, performance saturation may occur in CNNs as the number of training samples grows; we propose to alleviate this by replacing the naive random subsampling of the training set with a bootstrapping process.
Moreover, (3) we find a link between the representation norm and the ability to
discriminate in a target domain, which sheds light on how such networks
represent faces. Based on these discoveries, we are able to improve face
recognition accuracy on the widely used LFW benchmark, both in the verification
(1:1) and identification (1:N) protocols, and directly compare, for the first
time, with a state-of-the-art Commercial Off-The-Shelf (COTS) system, and show a sizable leap in performance.
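The abstract does not spell out the bootstrapping procedure; one common form it could take is hard-example mining, sketched below (per_example_loss, hard_frac, and the hard/random split are assumptions for illustration, not the paper's method):

```python
import numpy as np

def bootstrap_subsample(per_example_loss, n_keep, hard_frac=0.9, rng=None):
    """Replace uniform random subsampling of a huge training set with a
    bootstrapping pass: score every example with the current model's
    loss and keep mostly the hardest ones, plus a small random slice
    of the remainder for coverage.  Assumes len(loss) >> n_keep.
    """
    rng = np.random.default_rng(rng)
    order = np.argsort(per_example_loss)[::-1]   # hardest (highest-loss) first
    n_hard = int(n_keep * hard_frac)
    easy = rng.choice(order[n_hard:], size=n_keep - n_hard, replace=False)
    return np.concatenate([order[:n_hard], easy])
```

The intuition is that once performance saturates, uniformly sampled extra data is mostly redundant, while examples the current network still gets wrong carry the remaining training signal.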
Dynamics of trimming the content of face representations for categorization in the brain
To understand visual cognition, it is imperative to determine when, how, and with what information the human brain categorizes the visual input. Visual categorization consistently involves at least an early and a late stage: the occipito-temporal N170 event-related potential, related to stimulus encoding, and the parietal P300, involved in perceptual decisions. Here we sought to understand how the brain globally transforms its representations of face categories from their early encoding to the later decision stage over the 400 ms time window encompassing the N170 and P300 brain events. We applied classification image techniques to the behavioral and electroencephalographic data of three observers who categorized seven facial expressions of emotion, and we report two main findings: (1) over the 400 ms time course, processing of facial features initially spreads bilaterally across the left and right occipito-temporal regions and then dynamically converges onto the centro-parietal region; (2) concurrently, information processing gradually shifts from encoding common face features across all spatial scales (e.g. the eyes) to representing only the finer scales of the diagnostic features that are richer in useful information for behavior (e.g. the wide-open eyes in 'fear'; the detailed mouth in 'happy'). Our findings suggest that the brain refines its diagnostic representations of visual categories over the first 400 ms of processing by trimming a thorough encoding of features over the N170 down to only the detailed information important for perceptual decisions over the P300.
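The paper applies classification image techniques; the basic reverse-correlation computation behind such techniques can be sketched as follows (a generic illustration, not the authors' exact EEG pipeline; noise_fields and responses are hypothetical inputs):

```python
import numpy as np

def classification_image(noise_fields, responses):
    """Reverse correlation: the difference between the average noise
    field on trials where the observer reported the category and the
    average on trials where they did not highlights the pixels
    (features) that drive the categorization.

    noise_fields : (n_trials, h, w) noise added to the base stimulus
    responses    : (n_trials,) bool, True when the category was reported
    """
    responses = np.asarray(responses, dtype=bool)
    return noise_fields[responses].mean(axis=0) - noise_fields[~responses].mean(axis=0)
```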