708 research outputs found

    How can cells in the anterior medial face patch be viewpoint invariant?

    In a recent paper, Freiwald and Tsao (2010) found evidence that the responses of cells in the macaque anterior medial (AM) face patch are invariant to significant changes in viewpoint. The monkey subjects had no prior experience with the individuals depicted in the stimuli and were never given an opportunity to view the same individual from different viewpoints sequentially. These results cannot be explained by a mechanism based on temporal association of experienced views. Employing a biologically plausible model of object recognition (software available at cbcl.mit.edu), we show two mechanisms which could account for these results. First, we show that hair style and skin color provide sufficient information to enable viewpoint recognition without resorting to any mechanism that associates images across views. It is likely that a large part of the effect described in patch AM is attributable to these cues. Separately, we show that it is possible to further improve view-invariance using class-specific features (see Vetter 1997). Faces, as a class, transform under 3D rotation in similar enough ways that it is possible to use previously viewed example faces to learn a general model of how all faces rotate. Novel faces can be encoded relative to these previously encountered “template” faces and thus recognized with some degree of invariance to 3D rotation. Since each object class transforms differently under 3D rotation, it follows that invariant recognition from a single view requires a recognition architecture with a detection step determining the class of an object (e.g. face or non-face) prior to a subsequent identification stage utilizing the appropriate class-specific features.

    Unsupervised learning of clutter-resistant visual representations from natural videos

    Populations of neurons in inferotemporal cortex (IT) maintain an explicit code for object identity that also tolerates transformations of object appearance, e.g., position, scale, viewing angle [1, 2, 3]. Though the learning rules are not known, recent results [4, 5, 6] suggest the operation of an unsupervised temporal-association-based method, e.g., Foldiak's trace rule [7]. Such methods exploit the temporal continuity of the visual world by assuming that visual experience over short timescales will tend to have invariant identity content. Thus, by associating representations of frames from nearby times, a representation that tolerates whatever transformations occurred in the video may be achieved. Many previous studies verified that such rules can work in simple situations without background clutter, but the presence of visual clutter has remained problematic for this approach. Here we show that temporal association based on large class-specific filters (templates) avoids the problem of clutter. Our system learns in an unsupervised way from natural videos gathered from the internet, and is able to perform a difficult unconstrained face recognition task on natural images: Labeled Faces in the Wild [8].
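
The temporal-association idea in this abstract can be illustrated with a minimal sketch of a trace-rule update for a single linear unit. This is one common textbook form of Foldiak's rule, not the system described in the paper; the function name `trace_rule_update` and the parameters `alpha` (learning rate) and `delta` (trace decay) are assumptions made for illustration.

```python
def trace_rule_update(weights, frames, alpha=0.1, delta=0.5):
    """Run a trace-rule update over a temporal sequence of input vectors.

    Illustrative sketch only: one linear unit, hypothetical parameter names.
    """
    w = list(weights)
    trace = 0.0  # temporally smoothed activity of the unit
    for x in frames:
        # Instantaneous response of the unit to the current frame.
        y = sum(wi * xi for wi, xi in zip(w, x))
        # The trace mixes the current response with recent past responses.
        trace = (1.0 - delta) * y + delta * trace
        # Move weights toward the current input, gated by the trace, so
        # inputs that follow an effective stimulus in time also get associated.
        w = [wi + alpha * trace * (xi - wi) for wi, xi in zip(w, x)]
    return w


if __name__ == "__main__":
    # Two consecutive "frames"; the unit initially responds only to the first.
    print(trace_rule_update([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], alpha=0.5))
```

With `delta=0` the trace equals the instantaneous response and the update reduces to a plain Hebbian-style rule, so the second frame in the example leaves the weights unchanged. With `delta>0` the lingering trace lets the unit acquire weight on the second frame, which is the sense in which frames from nearby times become associated.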

    Neurons That Confuse Mirror-Symmetric Object Views

    Neurons in inferotemporal cortex that respond similarly to many pairs of mirror-symmetric images (for example, 45 degree and -45 degree views of the same face) have often been reported. The phenomenon seemed to be an interesting oddity. However, the same phenomenon has also emerged in simple hierarchical models of the ventral stream. Here we state a theorem characterizing sufficient conditions for this curious invariance to occur in a rather large class of hierarchical networks and demonstrate it with simulations.

    How Important is Weight Symmetry in Backpropagation?

    Gradient backpropagation (BP) requires symmetric feedforward and feedback connections: the same weights must be used for forward and backward passes. This “weight transport problem” [1] is thought to be one of the main reasons for BP’s biological implausibility. Using 15 different classification datasets, we systematically study to what extent BP really depends on weight symmetry. In a study that turned out to be surprisingly similar in spirit to Lillicrap et al.’s demonstration [2] but orthogonal in its results, our experiments indicate that: (1) the magnitudes of feedback weights do not matter to performance; (2) the signs of feedback weights do matter: the more concordant the signs of feedforward connections and their corresponding feedback connections, the better; (3) with feedback weights having random magnitudes and 100% concordant signs, we were able to achieve the same or even better performance than SGD; (4) some normalizations/stabilizations are indispensable for such asymmetric BP to work, namely Batch Normalization (BN) [3] and/or a “Batch Manhattan” (BM) update rule. This work was supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF - 1231216.
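
Finding (3) in this abstract, feedback weights with random magnitudes but signs concordant with the feedforward weights, can be sketched as follows. This is a minimal illustration of the idea only, not the authors' code: the helper names `sign_concordant_feedback` and `backward_error`, the magnitude range, and the single-layer setup are all assumptions, and the Batch Normalization and Batch Manhattan components that the abstract calls indispensable are omitted.

```python
import random


def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def sign_concordant_feedback(W, rng):
    """Build a feedback matrix B with the shape of W transposed.

    Each entry keeps the sign of the matching feedforward weight but takes
    a random magnitude (the range 0.1 to 1.0 is an arbitrary assumption).
    """
    rows, cols = len(W), len(W[0])
    return [[sign(W[i][j]) * rng.uniform(0.1, 1.0) for i in range(rows)]
            for j in range(cols)]


def backward_error(B, delta_out):
    """Send the output error back through B instead of W transposed."""
    return [sum(b * d for b, d in zip(row, delta_out)) for row in B]


if __name__ == "__main__":
    W = [[0.5, -2.0, 0.0], [-1.0, 3.0, 4.0]]  # 2 outputs, 3 hidden units
    B = sign_concordant_feedback(W, random.Random(0))
    print(backward_error(B, [1.0, 0.0]))
```

In standard BP the backward pass would use the transpose of `W` itself; here only the signs are shared, so the backward pathway never needs to read the exact forward weights.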

    Throwing Down the Visual Intelligence Gauntlet

    In recent years, scientific and technological advances have produced artificial systems that have matched or surpassed human capabilities in narrow domains such as face detection and optical character recognition. However, the problem of producing truly intelligent machines remains far from solved. In this chapter, we first describe some of these recent advances, and then review one approach to moving beyond these limited successes: the neuromorphic approach of studying and reverse-engineering the networks of neurons in the human brain (specifically, the visual system). Finally, we discuss several possible future directions in the quest for visual intelligence. This research was sponsored by grants from DARPA (IPTO and DSO), National Science Foundation (NSF-0640097, NSF-0827427), AFSOR-THRL (FA8650-05-C-7262). Additional support was provided by: Adobe, Honda Research Institute USA, King Abdullah University of Science and Technology grant to B. DeVore, NEC, Sony and especially by the Eugene McDermott Foundation.

    Learning and disrupting invariance in visual recognition

    Learning by temporal association rules such as Foldiak's trace rule is an attractive hypothesis that explains the development of invariance in visual recognition. Consistent with these rules, several recent experiments have shown that invariance can be broken by appropriately altering the visual environment, but found puzzling differences in the effects at the psychophysical versus single-cell level. We show (a) that associative learning provides appropriate invariance in models of object recognition inspired by Hubel and Wiesel, (b) that we can replicate the "invariance disruption" experiments using these models with a temporal association learning rule to develop and maintain invariance, and (c) that we can thereby explain the apparent discrepancies between psychophysical and single-cell effects. We argue that these models account for the stability of perceptual invariance despite the underlying plasticity of the system, the variability of the visual world and expected noise in the biological mechanisms.

    A new wireless sensor platform with camera

    There are several wireless sensor network platforms, such as MICAz and MICA2, each with its own specific characteristics. However, the complexity of novel applications demands new characteristics, so more and more new sensor network designs are needed. In this paper, the design of a sensor named Lacuna is proposed: a new sensor network platform that implements reliable detection by taking real-time pictures. The paper presents a simplified model of a wireless sensor network (WSN) composed of Lacuna sensors using IEEE 802.15.4 wireless technology. This model has been tested many times, and the experimental results show that the system runs stably, reliably and efficiently. Stability, reliability, and efficiency are important because they make operation robust to temporary disconnections and high packet loss. Owing to these properties, the WSN can transmit large amounts of continuous, stable picture data to a notebook computer once one of the nodes finishes taking a picture.