36,872 research outputs found

    Visual prototypes in the ventral stream are attuned to complexity and gaze behavior

    Get PDF
    Early theories of efficient coding suggested the visual system could compress the world by learning to represent features where information was concentrated, such as contours. This view was validated by the discovery that neurons in posterior visual cortex respond to edges and curvature. Still, it remains unclear what other information-rich features are encoded by neurons in more anterior cortical regions (e.g., inferotemporal cortex). Here, we use a generative deep neural network to synthesize images guided by neuronal responses from across the visuocortical hierarchy, using floating microelectrode arrays in areas V1, V4 and inferotemporal cortex of two macaque monkeys. We hypothesize these images ( prototypes ) represent such predicted information-rich features. Prototypes vary across areas, show moderate complexity, and resemble salient visual attributes and semantic content of natural images, as indicated by the animals\u27 gaze behavior. This suggests the code for object recognition represents compressed features of behavioral relevance, an underexplored aspect of efficient coding

    Shape similarity, better than semantic membership, accounts for the structure of visual object representations in a population of monkey inferotemporal neurons

    Get PDF
    The anterior inferotemporal cortex (IT) is the highest stage along the hierarchy of visual areas that, in primates, processes visual objects. Although several lines of evidence suggest that IT primarily represents visual shape information, some recent studies have argued that neuronal ensembles in IT code the semantic membership of visual objects (i.e., represent conceptual classes such as animate and inanimate objects). In this study, we investigated to what extent semantic, rather than purely visual information, is represented in IT by performing a multivariate analysis of IT responses to a set of visual objects. By relying on a variety of machine-learning approaches (including a cutting-edge clustering algorithm that has been recently developed in the domain of statistical physics), we found that, in most instances, IT representation of visual objects is accounted for by their similarity at the level of shape or, more surprisingly, low-level visual properties. Only in a few cases we observed IT representations of semantic classes that were not explainable by the visual similarity of their members. Overall, these findings reassert the primary function of IT as a conveyor of explicit visual shape information, and reveal that low-level visual properties are represented in IT to a greater extent than previously appreciated. In addition, our work demonstrates how combining a variety of state-of-the-art multivariate approaches, and carefully estimating the contribution of shape similarity to the representation of object categories, can substantially advance our understanding of neuronal coding of visual objects in cortex

    I'm sorry to say, but your understanding of image processing fundamentals is absolutely wrong

    Full text link
    The ongoing discussion whether modern vision systems have to be viewed as visually-enabled cognitive systems or cognitively-enabled vision systems is groundless, because perceptual and cognitive faculties of vision are separate components of human (and consequently, artificial) information processing system modeling.Comment: To be published as chapter 5 in "Frontiers in Brain, Vision and AI", I-TECH Publisher, Viena, 200

    Information extraction from multimedia web documents: an open-source platform and testbed

    No full text
    The LivingKnowledge project aimed to enhance the current state of the art in search, retrieval and knowledge management on the web by advancing the use of sentiment and opinion analysis within multimedia applications. To achieve this aim, a diverse set of novel and complementary analysis techniques have been integrated into a single, but extensible software platform on which such applications can be built. The platform combines state-of-the-art techniques for extracting facts, opinions and sentiment from multimedia documents, and unlike earlier platforms, it exploits both visual and textual techniques to support multimedia information retrieval. Foreseeing the usefulness of this software in the wider community, the platform has been made generally available as an open-source project. This paper describes the platform design, gives an overview of the analysis algorithms integrated into the system and describes two applications that utilise the system for multimedia information retrieval

    PredNet and Predictive Coding: A Critical Review

    Full text link
    PredNet, a deep predictive coding network developed by Lotter et al., combines a biologically inspired architecture based on the propagation of prediction error with self-supervised representation learning in video. While the architecture has drawn a lot of attention and various extensions of the model exist, there is a lack of a critical analysis. We fill in the gap by evaluating PredNet both as an implementation of the predictive coding theory and as a self-supervised video prediction model using a challenging video action classification dataset. We design an extended model to test if conditioning future frame predictions on the action class of the video improves the model performance. We show that PredNet does not yet completely follow the principles of predictive coding. The proposed top-down conditioning leads to a performance gain on synthetic data, but does not scale up to the more complex real-world action classification dataset. Our analysis is aimed at guiding future research on similar architectures based on the predictive coding theory

    Cyber situational awareness: from geographical alerts to high-level management

    Get PDF
    This paper focuses on cyber situational awareness and describes a visual analytics solution for monitoring and putting in tight relation data from network level with the organization business. The goal of the proposed solution is to make different security profiles (network security officer, network security manager, and financial security manager) aware of the actual network state (e.g., risk and attack progress) and the impact it actually has on the business tasks, making clear the relationships that exist between the network level and the business level. The proposed solution is instantiated on the ACEA infrastructure, the Italian company that provides power and water purification services to cities in central Italy (millions of end users
    corecore