9 research outputs found

    Visual sense of number vs. sense of magnitude in humans and machines

    Numerosity perception is thought to be foundational to mathematical learning, but its computational bases are strongly debated. Some investigators argue that humans are endowed with a specialized system supporting numerical representations; others argue that visual numerosity is estimated using continuous magnitudes, such as density or area, which usually co-vary with number. Here we reconcile these contrasting perspectives by testing deep neural networks on the same numerosity comparison task that was administered to human participants, using a stimulus space that allows the precise measurement of the contribution of non-numerical features. Our model accurately simulates the psychophysics of numerosity perception and the associated developmental changes: discrimination is driven by numerosity, but non-numerical features also have a significant impact, especially early during development. Representational similarity analysis further highlights that both numerosity and continuous magnitudes are spontaneously encoded in deep networks even when no task has to be carried out, suggesting that numerosity is a major, salient property of our visual environment.

    The role of architectural and learning constraints in neural network models: A case study on visual space coding

    The recent “deep learning revolution” in artificial neural networks has had a strong impact and seen widespread deployment in engineering applications, but the use of deep learning for neurocomputational modeling has so far been limited. In this article we argue that unsupervised deep learning represents an important step forward for improving neurocomputational models of perception and cognition, because it emphasizes the role of generative learning as opposed to discriminative (supervised) learning. As a case study, we present a series of simulations investigating the emergence of neural coding of visual space for sensorimotor transformations. We compare different network architectures commonly used as building blocks for unsupervised deep learning by systematically testing the type of receptive fields and gain modulation developed by the hidden neurons. In particular, we compare Restricted Boltzmann Machines (RBMs), which are stochastic, generative networks with bidirectional connections trained using contrastive divergence, with autoencoders, which are deterministic networks trained using error backpropagation. For both learning architectures we also explore the role of sparse coding, which has been identified as a fundamental principle of neural computation. The unsupervised models are then compared with supervised, feed-forward networks that learn an explicit mapping between different spatial reference frames. Our simulations show that both architectural and learning constraints strongly influenced the emergent coding of visual space in terms of distribution of tuning functions at the level of single neurons. Unsupervised models, and particularly RBMs, were found to more closely adhere to neurophysiological data from single-cell recordings in the primate parietal cortex. These results provide new insights into how basic properties of artificial neural networks might be relevant for modeling neural information processing in biological systems.

    Emergence of network motifs in deep neural networks

    Network science can offer fundamental insights into the structural and functional properties of complex systems. For example, it is widely known that neuronal circuits tend to organize into basic functional topological modules, called network motifs. In this article, we show that network science tools can also be successfully applied to the study of artificial neural networks operating according to self-organizing (learning) principles. In particular, we study the emergence of network motifs in multi-layer perceptrons, whose initial connectivity is defined as a stack of fully-connected, bipartite graphs. Simulations show that the final network topology is shaped by learning dynamics, but can be strongly biased by choosing appropriate weight initialization schemes. Overall, our results suggest that non-trivial initialization strategies can make learning more effective by promoting the development of useful network motifs, which are often surprisingly consistent with those observed in general transduction networks.

    Tracing the Algorithm of Bilingual Language Learning

    206 pp. Learning a new language is an arduous but highly rewarding task. Learners must acquire an extensive vocabulary, as well as a set of rules for varying and combining that vocabulary to produce meaningful sentences. However, learning new languages may become easier once we already know at least two. Building on this idea, in this thesis I explore whether people who know only one language (monolinguals) and those who speak two languages (bilinguals) differ when learning a new language. To this end, I conducted six behavioral experiments with participants of different linguistic profiles: a group of monolingual Spanish speakers, a Spanish-English bilingual group, and a Spanish-Basque bilingual group. Taken together, these experiments covered the implicit and explicit learning of new languages using artificial linguistic stimuli. Overall, the results of all experiments indicated that both bilingual groups outperformed the monolingual group when learning vocabulary implicitly and explicitly, but not in other domains (phonology, orthography, morphology). To explain how these differences in vocabulary learning arise, I developed a computational model capable of learning written words using the orthographic patterns of words in one or two languages. This model indicated that, when learning words in two languages, it is easier to recognize and produce new words than when learning vocabulary from a single language. Taken as a whole, these results led me to conclude that monolinguals and bilinguals differ fundamentally in vocabulary learning, because exposure to different within-word patterns in two languages makes bilinguals more flexible when integrating the orthographic (and possibly phonological) information of new words.

    Modeling cognition with generative neural networks: The case of orthographic processing

    This thesis investigates the potential of generative neural networks to model cognitive processes. In contrast to many popular connectionist models, the computational framework adopted in this research work emphasizes the generative nature of cognition, suggesting that one of the primary goals of cognitive systems is to learn an internal model of the surrounding environment that can be used to infer causes and make predictions about the upcoming sensory information. In particular, we consider a powerful class of recurrent neural networks that learn probabilistic generative models from experience in a completely unsupervised way, by extracting high-order statistical structure from a set of observed variables. Notably, this type of network can be conveniently formalized within the more general framework of probabilistic graphical models, which provides a unified language to describe both neural networks and structured Bayesian models. Moreover, recent advances make it possible to extend basic network architectures into more powerful systems, which exploit multiple processing stages to perform learning and inference over hierarchical models, or which exploit delayed recurrent connections to process sequential information. We argue that these advanced network architectures constitute a promising alternative to the more traditional, feed-forward, supervised neural networks, because they more neatly capture the functional and structural organization of cortical circuits, providing a principled way to combine top-down, high-level contextual information with bottom-up, sensory evidence. We provide empirical support justifying the use of these models by studying how efficient implementations of hierarchical and temporal generative networks can extract information from large datasets containing thousands of patterns. In particular, we perform computational simulations of recognition of handwritten and printed characters belonging to different writing scripts, which are subsequently combined spatially or temporally in order to build more complex orthographic units such as those constituting English words.