    Class-Based Feature Matching Across Unrestricted Transformations

    We develop a novel method for class-based feature matching across large changes in viewing conditions. The method is based on the property that when objects share a similar part, the similarity is preserved across viewing conditions. Given a feature and a training set of object images, we first identify the subset of objects that share this feature. The transformation of the feature's appearance across viewing conditions is determined mainly by properties of the feature, rather than of the object in which it is embedded. Therefore, the transformed feature will be shared by approximately the same set of objects. Based on this consistency requirement, corresponding features can be reliably identified from a set of candidate matches. Unlike previous approaches, the proposed scheme compares feature appearances only within similar viewing conditions, rather than across different ones. As a result, the scheme is not restricted to locally planar objects or affine transformations, and it does not require examples of correct matches. We show that the proposed method yields a dense set of accurate correspondences. Experimental comparisons demonstrate that matching accuracy is significantly improved over previous schemes. Finally, we show that the scheme can be successfully used for invariant object recognition.
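
    The consistency requirement described above lends itself to a simple set-overlap test. The sketch below is our illustration, not the authors' code: `feature_bank_a` and `feature_bank_b` are hypothetical lookups mapping each feature to the set of training-object IDs whose image contains it in viewing conditions A and B, respectively, and the 0.5 threshold is arbitrary.

```python
def jaccard(s, t):
    """Overlap between two sets of object IDs."""
    return len(s & t) / len(s | t) if s | t else 0.0

def best_match(feature, candidates, feature_bank_a, feature_bank_b):
    """Pick the candidate feature in view B whose supporting object set
    is most consistent with the set supporting `feature` in view A."""
    support_a = feature_bank_a[feature]
    scored = [(jaccard(support_a, feature_bank_b[c]), c) for c in candidates]
    score, match = max(scored, key=lambda sc: sc[0])
    return match if score > 0.5 else None  # illustrative threshold
```

    Note that appearances are only ever compared within the same viewing condition when building each bank, which is exactly what frees the scheme from planarity and affine assumptions.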

    View-tolerant face recognition and Hebbian learning imply mirror-symmetric neural tuning to head orientation

    The primate brain contains a hierarchy of visual areas, dubbed the ventral stream, which rapidly computes object representations that are both specific for object identity and relatively robust against identity-preserving transformations like depth rotations. Current computational models of object recognition, including recent deep learning networks, generate these properties through a hierarchy of alternating selectivity-increasing filtering and tolerance-increasing pooling operations, similar to simple- and complex-cell operations. While simulations of these models recapitulate the ventral stream's progression from early view-specific to late view-tolerant representations, they fail to generate the most salient property of the intermediate representation for faces found in the brain: mirror-symmetric tuning of the neural population to head orientation. Here we prove that a class of hierarchical architectures and a broad set of biologically plausible learning rules can provide approximate invariance at the top level of the network. While most of the learning rules do not yield mirror symmetry in the mid-level representations, we characterize a specific biologically plausible Hebb-type learning rule that is guaranteed to generate mirror-symmetric tuning to head orientation at intermediate levels of the architecture.
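
    The intuition behind the Hebb-type result can be illustrated with Oja's rule, a standard Hebbian update that converges to the top principal component of its inputs. The toy data below is our construction, not the paper's: it is made statistically mirror-symmetric so that the principal components are even or odd under reflection, and hence the trained unit's response magnitude is the same for an input and its mirror image (for a flattened 2D image the reflection would permute pixels within rows; a 1-D reversal stands in here).

```python
import numpy as np

def oja_update(w, x, lr=0.05):
    """One step of Oja's rule: Hebbian growth with implicit weight decay."""
    y = w @ x
    return w + lr * y * (x - y * w)

rng = np.random.default_rng(0)
v = rng.standard_normal(128)
even, odd = v + v[::-1], v - v[::-1]          # reflection here = reversal
coeffs = rng.standard_normal((400, 2)) * [3.0, 1.0]
views = coeffs @ np.vstack([even, odd])       # even component dominates
views += 0.1 * rng.standard_normal(views.shape)
views /= np.linalg.norm(views, axis=1, keepdims=True)

w = rng.standard_normal(128)
w /= np.linalg.norm(w)
for _ in range(20):                           # a few passes suffice here
    for x in rng.permutation(views):
        w = oja_update(w, x)
w /= np.linalg.norm(w)

x = views[0]
print(abs(w @ x), abs(w @ x[::-1]))           # approximately equal
```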

    Why the brain separates face recognition from object recognition

    Many studies have uncovered evidence that visual cortex contains specialized regions involved in processing faces but not other object classes. Recent electrophysiology studies of cells in several of these specialized regions revealed that at least some of them are organized in a hierarchical manner, with viewpoint-specific cells projecting to downstream viewpoint-invariant identity-specific cells (Freiwald and Tsao 2010). A separate computational line of reasoning leads to the claim that some transformations of visual inputs that preserve viewed object identity are class-specific. In particular, the 2D images evoked by a face undergoing a 3D rotation are not produced by the same 2D image transformation that would produce the images evoked by an object of another class undergoing the same 3D rotation. Within the class of faces, however, knowledge of the image transformation evoked by 3D rotation can be reliably transferred from previously viewed faces to help identify a novel face at a new viewpoint. We show, through computational simulations, that an architecture which applies this method of gaining invariance to class-specific transformations is effective when restricted to faces and fails spectacularly when applied across object classes. We argue that, in order to accomplish viewpoint-invariant face identification from a single example view, visual cortex must separate the circuitry involved in discounting 3D rotations of faces from the generic circuitry involved in processing other objects. The resulting model of the ventral stream of visual cortex is consistent with the recent physiology results showing the hierarchical organization of the face processing network.
    Funding: United States Defense Advanced Research Projects Agency (Information Processing Techniques Office; System Science Division, Defense Sciences Office); National Science Foundation (Grants NSF-0640097 and NSF-0827427); United States Air Force Office of Scientific Research (THRL Grant FA8650-05-C-7262); Adobe Systems; Honda Research Institute USA, Inc.; King Abdullah University of Science and Technology; NEC Corporation; Sony Corporation; Eugene McDermott Foundation.
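
    A minimal sketch of the class-specific pooling idea, under our own naming (not the paper's code): each stored template face contributes one signature dimension, computed by max-pooling the input's similarity to that template over the template's stored viewpoints. The same machinery applied with non-face templates would not transfer the rotation-induced image transformation, which is the failure mode the simulations expose.

```python
import numpy as np

def signature(image, templates):
    """Encode `image` by its max similarity to each template face,
    pooled over that template's stored viewpoints.

    templates: list of arrays, one per template face, each of shape
    (n_viewpoints, n_pixels); `image` is a flattened pixel vector.
    """
    x = image / np.linalg.norm(image)
    sig = []
    for views in templates:
        v = views / np.linalg.norm(views, axis=1, keepdims=True)
        sig.append(np.max(v @ x))   # pool over viewpoints
    return np.array(sig)

# Two views of the same novel face map to similar signatures, so
# identity can be read out from a single example view -- provided the
# templates undergo the same 3D-rotation-induced image transformation.
```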

    Learning invariant representations and applications to face verification

    One approach to computer object recognition and to modeling the brain's ventral stream involves unsupervised learning of representations that are invariant to common transformations. However, applications of these ideas have usually been limited to 2D affine transformations, e.g., translation and scaling, since they are easiest to solve via convolution. In accord with a recent theory of transformation invariance, we propose a model that, while capturing other common convolutional networks as special cases, can also be used with arbitrary identity-preserving transformations. The model's wiring can be learned from videos of transforming objects, or from any other grouping of images into sets by their depicted object. Through a series of successively more complex empirical tests, we study the invariance/discriminability properties of this model with respect to different transformations. First, we empirically confirm theoretical predictions for the case of 2D affine transformations. Next, we apply the model to non-affine transformations: as expected, it performs well on face verification tasks requiring invariance to the relatively smooth transformations of 3D rotation-in-depth and changes in illumination direction. Surprisingly, it can also tolerate "clutter transformations," which map an image of a face on one background to an image of the same face on a different background. Motivated by these empirical findings, we tested the same model on face verification benchmark tasks from the computer vision literature: Labeled Faces in the Wild, PubFig, and a new dataset we gathered, achieving strong performance in these highly unconstrained cases as well.
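
    The following sketch (our rendering, with illustrative names) shows the general recipe the abstract describes: the learned "wiring" is just a stored set of transformed template images, e.g., the frames of a video of one transforming object, and an input is encoded by pooling its projections onto each stored orbit. Because nothing in the pooling step assumes the transformation is affine, the same code covers translation, rotation-in-depth, illumination, or background changes, given suitable orbits.

```python
import numpy as np

def orbit_signature(x, orbits, n_bins=10):
    """Concatenate per-orbit histograms of normalized projections.

    orbits: list of arrays, each of shape (n_frames, dim) -- e.g. video
    frames of one object undergoing an identity-preserving transformation.
    """
    x = x / np.linalg.norm(x)
    feats = []
    for orbit in orbits:
        o = orbit / np.linalg.norm(orbit, axis=1, keepdims=True)
        proj = o @ x                       # <x, g.t> for each frame g.t
        hist, _ = np.histogram(proj, bins=n_bins, range=(-1.0, 1.0))
        feats.append(hist / len(proj))     # histogram = invariant pooling
    return np.concatenate(feats)
```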

    Investigating Representations of Facial Identity in Human Ventral Visual Cortex with Transcranial Magnetic Stimulation

    The occipital face area (OFA) is face-selective. This enhanced activation to faces could reflect either generic face- and shape-related processing or high-level conceptual processing of identity. Here we examined these two possibilities using a state-dependent transcranial magnetic stimulation (TMS) paradigm. The lateral occipital (LO) cortex, which is activated non-selectively by various types of objects, served as a control site. We localized OFA and LO on a per-participant basis using functional MRI. We then examined whether TMS applied to either of these regions affected the ability of participants to decide whether two successively presented and physically different face images showed the same famous person or different famous people. TMS was applied during the delay between the first and second face presentations to investigate whether neuronal populations in these regions play a causal role in mediating the behavioral effects of identity repetition. Behaviorally, we found a robust identity repetition effect, with shorter reaction times (RTs) when identity was repeated, even though the pictures were physically different. Surprisingly, TMS applied over LO (but not OFA) modulated overall RTs compared to the no-TMS condition. Critically, however, we found no effects of TMS to either area that were modulated by identity repetition. Thus, we found no evidence to suggest that OFA or LO contains neuronal representations selective for the identity of famous faces that play a causal role in identity processing. Instead, these brain regions may be involved in the processing of more generic features of their preferred stimulus categories.

    Fragment-Based Learning of Visual Object Categories in Non-Human Primates

    When we perceive a visual object, we implicitly or explicitly associate it with an object category we know. Recent research has shown that the visual system can use local, informative image fragments of a given object, rather than the whole object, to classify it into a familiar category. We have previously reported, using human psychophysical studies, that when subjects learn new object categories using whole objects, they incidentally learn informative fragments, even when not required to do so. However, the neuronal mechanisms by which we acquire and use informative fragments, as well as category knowledge itself, have remained unclear. Here we describe how we adapted the relevant human psychophysical methods to awake, behaving monkeys and replicated key previous results. This establishes awake, behaving monkeys as a useful system for future neurophysiological studies, not only of informative fragments in particular, but also of object categorization and category learning in general.
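
    For context, the standard criterion for an "informative" fragment in this literature scores a candidate by the mutual information between its detection in an image and the image's category label. The sketch below is our illustration of that criterion with hypothetical inputs, not the study's analysis code.

```python
import numpy as np

def mutual_information(present, labels):
    """MI (in bits) between two binary arrays of equal length."""
    mi = 0.0
    for p in (0, 1):
        for c in (0, 1):
            joint = np.mean((present == p) & (labels == c))
            if joint > 0:
                mi += joint * np.log2(
                    joint / (np.mean(present == p) * np.mean(labels == c)))
    return mi

def fragment_score(similarities, labels, threshold):
    """similarities[i]: best match of the fragment anywhere in image i;
    labels[i]: 1 if image i belongs to the target category, else 0."""
    return mutual_information(similarities > threshold, labels)
```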