1,522 research outputs found

    The contribution of fMRI in the study of visual categorization and expertise

    Get PDF
    No description supplie

    The Neural Representation Benchmark and its Evaluation on Brain and Machine

    Get PDF
    A key requirement for the development of effective learning representations is their evaluation and comparison to representations we know to be effective. In natural sensory domains, the community has viewed the brain as a source of inspiration and as an implicit benchmark for success. However, it has not been possible to directly test representational learning algorithms directly against the representations contained in neural systems. Here, we propose a new benchmark for visual representations on which we have directly tested the neural representation in multiple visual cortical areas in macaque (utilizing data from [Majaj et al., 2012]), and on which any computer vision algorithm that produces a feature space can be tested. The benchmark measures the effectiveness of the neural or machine representation by computing the classification loss on the ordered eigendecomposition of a kernel matrix [Montavon et al., 2011]. In our analysis we find that the neural representation in visual area IT is superior to visual area V4. In our analysis of representational learning algorithms, we find that three-layer models approach the representational performance of V4 and the algorithm in [Le et al., 2012] surpasses the performance of V4. Impressively, we find that a recent supervised algorithm [Krizhevsky et al., 2012] achieves performance comparable to that of IT for an intermediate level of image variation difficulty, and surpasses IT at a higher difficulty level. We believe this result represents a major milestone: it is the first learning algorithm we have found that exceeds our current estimate of IT representation performance. We hope that this benchmark will assist the community in matching the representational performance of visual cortex and will serve as an initial rallying point for further correspondence between representations derived in brains and machines.Comment: The v1 version contained incorrectly computed kernel analysis curves and KA-AUC values for V4, IT, and the HT-L3 models. They have been corrected in this versio

    What does semantic tiling of the cortex tell us about semantics?

    Get PDF
    Recent use of voxel-wise modeling in cognitive neuroscience suggests that semantic maps tile the cortex. Although this impressive research establishes distributed cortical areas active during the conceptual processing that underlies semantics, it tells us little about the nature of this processing. While mapping concepts between Marr's computational and implementation levels to support neural encoding and decoding, this approach ignores Marr's algorithmic level, central for understanding the mechanisms that implement cognition, in general, and conceptual processing, in particular. Following decades of research in cognitive science and neuroscience, what do we know so far about the representation and processing mechanisms that implement conceptual abilities? Most basically, much is known about the mechanisms associated with: (1) features and frame representations, (2) grounded, abstract, and linguistic representations, (3) knowledge-based inference, (4) concept composition, and (5) conceptual flexibility. Rather than explaining these fundamental representation and processing mechanisms, semantic tiles simply provide a trace of their activity over a relatively short time period within a specific learning context. Establishing the mechanisms that implement conceptual processing in the brain will require more than mapping it to cortical (and sub-cortical) activity, with process models from cognitive science likely to play central roles in specifying the intervening mechanisms. More generally, neuroscience will not achieve its basic goals until it establishes algorithmic-level mechanisms that contribute essential explanations to how the brain works, going beyond simply establishing the brain areas that respond to various task conditions

    Birds of a Feather Flock Together: Experience-Driven Formation of Visual Object Categories in Human Ventral Temporal Cortex

    Get PDF
    The present functional magnetic resonance imaging study provides direct evidence on visual object-category formation in the human brain. Although brain imaging has demonstrated object-category specific representations in the occipitotemporal cortex, the crucial question of how the brain acquires this knowledge has remained unresolved. We designed a stimulus set consisting of six highly similar bird types that can hardly be distinguished without training. All bird types were morphed with one another to create different exemplars of each category. After visual training, fMRI showed that responses in the right fusiform gyrus were larger for bird types for which a discrete category-boundary was established as compared with not-trained bird types. Importantly, compared with not-trained bird types, right fusiform responses were smaller for visually similar birds to which subjects were exposed during training but for which no category-boundary was learned. These data provide evidence for experience-induced shaping of occipitotemporal responses that are involved in category learning in the human brain

    The Effect of Learning on the Function of Monkey Extrastriate Visual Cortex

    Get PDF
    One of the most remarkable capabilities of the adult brain is its ability to learn and continuously adapt to an ever-changing environment. While many studies have documented how learning improves the perception and identification of visual stimuli, relatively little is known about how it modifies the underlying neural mechanisms. We trained monkeys to identify natural images that were degraded by interpolation with visual noise. We found that learning led to an improvement in monkeys' ability to identify these indeterminate visual stimuli. We link this behavioral improvement to a learning-dependent increase in the amount of information communicated by V4 neurons. This increase was mediated by a specific enhancement in neural activity. Our results reveal a mechanism by which learning increases the amount of information that V4 neurons are able to extract from the visual environment. This suggests that V4 plays a key role in resolving indeterminate visual inputs by coordinated interaction between bottom-up and top-down processing streams

    How may the basal ganglia contribute to auditory categorization and speech perception?

    Get PDF
    Listeners must accomplish two complementary perceptual feats in extracting a message from speech. They must discriminate linguistically-relevant acoustic variability and generalize across irrelevant variability. Said another way, they must categorize speech. Since the mapping of acoustic variability is language-specific, these categories must be learned from experience. Thus, understanding how, in general, the auditory system acquires and represents categories can inform us about the toolbox of mechanisms available to speech perception. This perspective invites consideration of findings from cognitive neuroscience literatures outside of the speech domain as a means of constraining models of speech perception. Although neurobiological models of speech perception have mainly focused on cerebral cortex, research outside the speech domain is consistent with the possibility of significant subcortical contributions in category learning. Here, we review the functional role of one such structure, the basal ganglia. We examine research from animal electrophysiology, human neuroimaging, and behavior to consider characteristics of basal ganglia processing that may be advantageous for speech category learning. We also present emerging evidence for a direct role for basal ganglia in learning auditory categories in a complex, naturalistic task intended to model the incidental manner in which speech categories are acquired. To conclude, we highlight new research questions that arise in incorporating the broader neuroscience research literature in modeling speech perception, and suggest how understanding contributions of the basal ganglia can inform attempts to optimize training protocols for learning non-native speech categories in adulthood

    Deep Neural Networks Rival the Representation of Primate IT Cortex for Core Visual Object Recognition

    Get PDF
    The primate visual system achieves remarkable visual object recognition performance even in brief presentations and under changes to object exemplar, geometric transformations, and background variation (a.k.a. core visual object recognition). This remarkable performance is mediated by the representation formed in inferior temporal (IT) cortex. In parallel, recent advances in machine learning have led to ever higher performing models of object recognition using artificial deep neural networks (DNNs). It remains unclear, however, whether the representational performance of DNNs rivals that of the brain. To accurately produce such a comparison, a major difficulty has been a unifying metric that accounts for experimental limitations such as the amount of noise, the number of neural recording sites, and the number trials, and computational limitations such as the complexity of the decoding classifier and the number of classifier training examples. In this work we perform a direct comparison that corrects for these experimental limitations and computational considerations. As part of our methodology, we propose an extension of "kernel analysis" that measures the generalization accuracy as a function of representational complexity. Our evaluations show that, unlike previous bio-inspired models, the latest DNNs rival the representational performance of IT cortex on this visual object recognition task. Furthermore, we show that models that perform well on measures of representational performance also perform well on measures of representational similarity to IT and on measures of predicting individual IT multi-unit responses. Whether these DNNs rely on computational mechanisms similar to the primate visual system is yet to be determined, but, unlike all previous bio-inspired models, that possibility cannot be ruled out merely on representational performance grounds.Comment: 35 pages, 12 figures, extends and expands upon arXiv:1301.353

    Making sense of real-world scenes

    Get PDF
    To interact with the world, we have to make sense of the continuous sensory input conveying information about our environment. A recent surge of studies has investigated the processes enabling scene understanding, using increasingly complex stimuli and sophisticated analyses to highlight the visual features and brain regions involved. However, there are two major challenges to producing a comprehensive framework for scene understanding. First, scene perception is highly dynamic, subserving multiple behavioral goals. Second, a multitude of different visual properties co-occur across scenes and may be correlated or independent. We synthesize the recent literature and argue that for a complete view of scene understanding, it is necessary to account for both differing observer goals and the contribution of diverse scene properties

    Perceptual Learning Of Object Shape

    Get PDF
    Recognition of objects is accomplished through the use of cues that depend on internal representations of familiar shapes. We used a paradigm of perceptual learning during visual search to explore what features human observers use to identify objects. Human subjects were trained to search for a target object embedded in an array of distractors, until their performance improved from chance levels to over 80% of trials in an object specific manner. We determined the role of specific object components in the recognition of the object as a whole by measuring the transfer of learning from the trained object to other objects sharing components with it. Depending on the geometric relationship of the trained object with untrained objects, transfer to untrained objects was observed. Novel objects that shared a component with the trained object were identified at much higher levels than those that did not, and this could be used as an indicator of which features of the object were important for recognition. Training on an object transferred to the less complex components of the object when these components were embedded in an array of distractors of similar complexity. There was transfer to the different components of object, regardless of how well they distinguish the object from distractors. These results suggest that objects are not represented in a holistic manner during learning, but that their individual components are encoded. Transfer between objects was not complete, and occurred for more than one component, suggesting that a joint involvement of multiple components was necessary for full performance. The sequence of this learning indicated a possible underlying mechanism of the learning. Subjects improved first in a single quadrant of the visual field, and the improvement then spread out sequentially to the other quadrants. This location specificity of the improvement suggests that, with training, encoding information about object shape occurs in early, retinotopically mapped cortical areas. fMRI work suggests that the learning of novel objects in this manner involves a reciprocal switch between two cortical networks, one that involves the normally object-sensitive regions of LOC, and one that involves the temporal and parietal cortices

    Diagnostic information use to understand brain mechanisms of facial expression categorization

    Get PDF
    Proficient categorization of facial expressions is crucial for normal social interaction. Neurophysiological, behavioural, event-related potential, lesion and functional neuroimaging techniques can be used to investigate the underlying brain mechanisms supporting this seemingly effortless process, and the associated arrangement of bilateral networks. These brain areas exhibit consistent and replicable activation patterns, and can be broadly defined to include visual (occipital and temporal), limbic (amygdala) and prefrontal (orbitofrontal) regions. Together, these areas support early perceptual processing, the formation of detailed representations and subsequent recognition of expressive faces. Despite the critical role of facial expressions in social communication and extensive work in this area, it is still not known how the brain decodes nonverbal signals in terms of expression-specific features. For these reasons, this thesis investigates the role of these so-called diagnostic facial features at three significant stages in expression recognition; the spatiotemporal inputs to the visual system, the dynamic integration of features in higher visual (occipitotemporal) areas, and early sensitivity to features in V1. In Chapter 1, the basic emotion categories are presented, along with the brain regions that are activated by these expressions. In line with this, the current cognitive theory of face processing reviews functional and anatomical dissociations within the distributed neural “face network”. Chapter 1 also introduces the way in which we measure and use diagnostic information to derive brain sensitivity to specific facial features, and how this is a useful tool by which to understand spatial and temporal organisation of expression recognition in the brain. In relation to this, hierarchical, bottom-up neural processing is discussed along with high-level, top-down facilitatory mechanisms. Chapter 2 describes an eye-movement study that reveals inputs to the visual system via fixations reflect diagnostic information use. Inputs to the visual system dictate the information distributed to cognitive systems during the seamless and rapid categorization of expressive faces. How we perform eye-movements during this task informs how task-driven and stimulus-driven mechanisms interact to guide the extraction of information supporting recognition. We recorded eye movements of observers who categorized the six basic categories of facial expressions. We use a measure of task-relevant information (diagnosticity) to discuss oculomotor behaviour, with focus on two findings. Firstly, fixated regions reveal expression differences. Secondly, by examining fixation sequences, the intersection of fixations with diagnostic information increases in a sequence of fixations. This suggests a top-down drive to acquire task-relevant information, with different functional roles for first and final fixations. A combination of psychophysical studies of visual recognition together with the EEG (electroencephalogram) signal is used to infer the dynamics of feature extraction and use during the recognition of facial expressions in Chapter 3. The results reveal a process that integrates visual information over about 50 milliseconds prior to the face-sensitive N170 event-related potential, starting at the eye region, and proceeding gradually towards lower regions. The finding that informative features for recognition are not processed simultaneously but in an orderly progression over a short time period is instructive for understanding the processes involved in visual recognition, and in particular the integration of bottom-up and top-down processes. In Chapter 4 we use fMRI to investigate the task-dependent activation to diagnostic features in early visual areas, suggesting top-down mechanisms as V1 traditionally exhibits only simple response properties. Chapter 3 revealed that diagnostic features modulate the temporal dynamics of brain signals in higher visual areas. Within the hierarchical visual system however, it is not known if an early (V1/V2/V3) sensitivity to diagnostic information contributes to categorical facial judgements, conceivably driven by top-down signals triggered in visual processing. Using retinotopic mapping, we reveal task-dependent information extraction within the earliest cortical representation (V1) of two features known to be differentially necessary for face recognition tasks (eyes and mouth). This strategic encoding of face images is beyond typical V1 properties and suggests a top-down influence of task extending down to the earliest retinotopic stages of visual processing. The significance of these data is discussed in the context of the cortical face network and bidirectional processing in the visual system. The visual cognition of facial expression processing is concerned with the interactive processing of bottom-up sensory-driven information and top-down mechanisms to relate visual input to categorical judgements. The three experiments presented in this thesis are summarized in Chapter 5 in relation to how diagnostic features can be used to explore such processing in the human brain leading to proficient facial expression categorization
    corecore