    Integrating a Non-Uniformly Sampled Software Retina with a Deep CNN Model

    We present a biologically inspired method for pre-processing images applied to CNNs that reduces their memory requirements while increasing their invariance to scale and rotation changes. Our method is based on the mammalian retino-cortical transform: a mapping between a pseudo-randomly tessellated retina model (used to sample an input image) and a CNN. The aim of this first pilot study is to demonstrate a functional retina-integrated CNN implementation. It produced the following results: a network using the full retino-cortical transform yielded an F1 score of 0.80 on a test set during a 4-way classification task, while an identical network not using the proposed method yielded an F1 score of 0.86 on the same task. The method reduced the visual data by approximately ×7, the input data to the CNN by 40% and the number of CNN training epochs by 64%. These results demonstrate the viability of our method and hint at the potential of exploiting functional traits of natural vision systems in CNNs.
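The F1 scores quoted above summarise a 4-way classification task. As a rough illustration only, the sketch below computes a macro-averaged F1 score from true and predicted labels. The abstract does not say which averaging scheme the authors used, so macro-averaging is an assumption here, and the function name `macro_f1` and the toy labels are hypothetical.

```python
from collections import Counter


def macro_f1(y_true, y_pred, num_classes=4):
    """Unweighted mean of per-class F1 scores (macro-averaging is assumed,
    not taken from the abstract)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1      # correct prediction for class t
        else:
            fp[p] += 1      # class p was predicted but is wrong
            fn[t] += 1      # class t was missed
    scores = []
    for c in range(num_classes):
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / num_classes


# Toy labels for illustration; these are not data from the paper.
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 1, 1, 1, 2, 2, 3, 0]
print(round(macro_f1(y_true, y_pred), 3))
```

Macro-averaging weights every class equally, which is one common choice when classes are roughly balanced; a micro-averaged or weighted score would give different numbers on imbalanced data.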

    Brain asymmetry and visual word recognition: do we have a split fovea?

    In this chapter we discuss how the anatomical divide between the left and the right brain half has implications for visual word recognition. In particular, it introduces the need for massive interhemispheric communication. Contrary to the traditional view, it looks increasingly likely that interhemispheric integration is already needed from the very first stages of word processing, when the letter information is combined to activate stored word representations. Taking these insights into account not only improves our understanding of the neurophysiological and cognitive mechanisms of reading, it also gives us new ways to look at individual differences in reading.

    Egocentric Perception using a Biologically Inspired Software Retina Integrated with a Deep CNN

    We presented the concept of a software retina, capable of significant visual data reduction in combination with scale and rotation invariance, for applications in egocentric and robot vision at the first EPIC workshop in Amsterdam [9]. Our method is based on the mammalian retino-cortical transform: a mapping between a pseudo-randomly tessellated retina model (used to sample an input image) and a CNN. The aim of this first pilot study is to demonstrate a functional retina-integrated CNN implementation. It produced the following results: a network using the full retino-cortical transform yielded an F1 score of 0.80 on a test set during a 4-way classification task, while an identical network not using the proposed method yielded an F1 score of 0.86 on the same task. On a 40K node retina the method reduced the visual data by approximately ×7, the input data to the CNN by 40% and the number of CNN training epochs by 36%. These results demonstrate the viability of our method and hint at the potential of exploiting functional traits of natural vision systems in CNNs. In addition to the above study, we present further recent developments in porting the retina to an Apple iPhone, an implementation in CUDA C for NVIDIA GPU platforms and extensions of the retina model we have adopted.

    A computer vision model for visual-object-based attention and eye movements

    This is the post-print version of the final paper published in Computer Vision and Image Understanding. Copyright © 2008 Elsevier B.V. This paper presents a new computational framework for modelling visual-object-based attention and attention-driven eye movements within an integrated system in a biologically inspired approach. Attention operates at multiple levels of visual selection by space, feature, object and group, depending on the nature of targets and visual tasks. Attentional shifts and gaze shifts are built on common processing circuits and control mechanisms but are separated by their different functional roles, working together to fulfil flexible visual selection tasks in complicated visual environments. The framework integrates the important aspects of human visual attention and eye movements, resulting in sophisticated performance in complicated natural scenes. The proposed approach aims at exploring a useful visual selection system for computer vision, especially for use in cluttered natural visual environments. National Natural Science of Foundation of Chin

    Temporal Dynamics of Decision-Making during Motion Perception in the Visual Cortex

    How does the brain make decisions? Speed and accuracy of perceptual decisions covary with certainty in the input, and correlate with the rate of evidence accumulation in parietal and frontal cortical "decision neurons." A biophysically realistic model of interactions within and between Retina/LGN and cortical areas V1, MT, MST, and LIP, gated by basal ganglia, simulates dynamic properties of decision-making in response to ambiguous visual motion stimuli used by Newsome, Shadlen, and colleagues in their neurophysiological experiments. The model clarifies how brain circuits that solve the aperture problem interact with a recurrent competitive network with self-normalizing choice properties to carry out probabilistic decisions in real time. Some scientists claim that perception and decision-making can be described using Bayesian inference or related general statistical ideas that estimate the optimal interpretation of the stimulus given priors and likelihoods. However, such concepts do not propose the neocortical mechanisms that enable perception and make decisions. The present model explains behavioral and neurophysiological decision-making data without an appeal to Bayesian concepts and, unlike other existing models of these data, generates perceptual representations and choice dynamics in response to the experimental visual stimuli. Quantitative model simulations include the time course of LIP neuronal dynamics, as well as behavioral accuracy and reaction time properties, during both correct and error trials at different levels of input ambiguity in both fixed duration and reaction time tasks. Model MT/MST interactions compute the global direction of random dot motion stimuli, while model LIP computes the stochastic perceptual decision that leads to a saccadic eye movement. National Science Foundation (SBE-0354378, IIS-02-05271); Office of Naval Research (N00014-01-1-0624); National Institutes of Health (R01-DC-02852

    A Psychogenetic Algorithm for Behavioral Sequence Learning

    This work presents an original algorithmic model of some essential features of psychogenetic theory, as proposed by J. Piaget. Specifically, we modeled some elements of cognitive structure learning in children from 0 to 4 months of life. We are convinced that the study of well-established cognitive models of human learning can suggest new, interesting approaches to problems so far not satisfactorily solved in the field of machine learning. Further, we discuss the possible parallels between our model and subsymbolic machine learning and neuroscience. The model was implemented and tested in some simple experimental settings, with reference to the task of learning sensorimotor sequences.

    View-Invariant Object Category Learning, Recognition, and Search: How Spatial and Object Attention Are Coordinated Using Surface-Based Attentional Shrouds

    Air Force Office of Scientific Research (F49620-01-1-0397); National Science Foundation (SBE-0354378); Office of Naval Research (N00014-01-1-0624