9 research outputs found
Visual sense of number vs. sense of magnitude in humans and machines
Numerosity perception is thought to be foundational to mathematical learning, but its computational bases are strongly debated. Some investigators argue that humans are endowed with a specialized system supporting numerical representations; others argue that visual numerosity is estimated using continuous magnitudes, such as density or area, which usually co-vary with number. Here we reconcile these contrasting perspectives by testing deep neural networks on the same numerosity comparison task that was administered to human participants, using a stimulus space that allows the precise measurement of the contribution of non-numerical features. Our model accurately simulates the psychophysics of numerosity perception and the associated developmental changes: discrimination is driven by numerosity, but non-numerical features also have a significant impact, especially early during development. Representational similarity analysis further highlights that both numerosity and continuous magnitudes are spontaneously encoded in deep networks even when no task has to be carried out, suggesting that numerosity is a major, salient property of our visual environment.
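The stimulus-space logic described above can be illustrated with a toy generator (a hypothetical sketch, not the authors' actual stimulus set): by fixing the cumulative dot area while varying the number of dots, numerosity is decoupled from one of the continuous magnitudes that normally co-varies with it.

```python
import numpy as np

def make_dot_stimulus(n_dots, total_area, field_size=100, rng=None):
    """Place n_dots non-overlapping circles whose areas sum to total_area.

    Keeping total_area fixed while varying n_dots decouples numerosity
    from cumulative surface area. Parameters are illustrative.
    """
    rng = np.random.default_rng(rng)
    radius = np.sqrt((total_area / n_dots) / np.pi)  # equal-size dots
    centers = []
    while len(centers) < n_dots:
        c = rng.uniform(radius, field_size - radius, size=2)
        # rejection sampling: keep only non-overlapping placements
        if all(np.linalg.norm(c - p) > 2 * radius for p in centers):
            centers.append(c)
    return np.array(centers), radius

# Two stimuli with different numerosity but identical cumulative area:
few, r_few = make_dot_stimulus(8, total_area=400, rng=0)
many, r_many = make_dot_stimulus(16, total_area=400, rng=0)
```

Pairs generated this way let one ask whether a comparison judgment tracks the number of items or the matched continuous magnitude.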
The role of architectural and learning constraints in neural network models: A case study on visual space coding
The recent “deep learning revolution” in artificial neural networks has had a strong impact and widespread deployment in engineering applications, but the use of deep learning for neurocomputational modeling has so far been limited. In this article we argue that unsupervised deep learning represents an important step forward for improving neurocomputational models of perception and cognition, because it emphasizes the role of generative learning as opposed to discriminative (supervised) learning. As a case study, we present a series of simulations investigating the emergence of neural coding of visual space for sensorimotor transformations. We compare different network architectures commonly used as building blocks for unsupervised deep learning by systematically testing the type of receptive fields and gain modulation developed by the hidden neurons. In particular, we compare Restricted Boltzmann Machines (RBMs), which are stochastic, generative networks with bidirectional connections trained using contrastive divergence, with autoencoders, which are deterministic networks trained using error backpropagation. For both learning architectures we also explore the role of sparse coding, which has been identified as a fundamental principle of neural computation. The unsupervised models are then compared with supervised, feed-forward networks that learn an explicit mapping between different spatial reference frames. Our simulations show that both architectural and learning constraints strongly influenced the emergent coding of visual space in terms of the distribution of tuning functions at the level of single neurons. Unsupervised models, and particularly RBMs, were found to more closely adhere to neurophysiological data from single-cell recordings in the primate parietal cortex. These results provide new insights into how basic properties of artificial neural networks might be relevant for modeling neural information processing in biological systems.
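For readers unfamiliar with the contrastive-divergence training mentioned above, here is a minimal CD-1 update for a binary RBM in NumPy. This is a generic textbook sketch, not the code used in the study; the variable names and learning rate are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(W, a, b, v0, lr=0.1, rng=None):
    """One contrastive-divergence (CD-1) update for a binary RBM.

    W: (n_visible, n_hidden) weights; a, b: visible/hidden biases;
    v0: batch of binary data vectors, shape (batch, n_visible).
    """
    rng = np.random.default_rng(rng)
    # Positive phase: sample hidden units given the data.
    ph0 = sigmoid(v0 @ W + b)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: one step of Gibbs sampling (reconstruction).
    pv1 = sigmoid(h0 @ W.T + a)
    ph1 = sigmoid(pv1 @ W + b)
    # Gradient estimate: data correlations minus model correlations.
    W += lr * (v0.T @ ph0 - pv1.T @ ph1) / v0.shape[0]
    a += lr * (v0 - pv1).mean(axis=0)
    b += lr * (ph0 - ph1).mean(axis=0)
    return W, a, b
```

The bidirectional use of the same weight matrix `W` (bottom-up in the positive phase, top-down in the reconstruction) is what distinguishes this generative scheme from a feed-forward autoencoder trained with backpropagation.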
Emergence of network motifs in deep neural networks
Network science can offer fundamental insights into the structural and functional properties of complex systems. For example, it is widely known that neuronal circuits tend to organize into basic functional topological modules, called network motifs. In this article, we show that network science tools can also be successfully applied to the study of artificial neural networks operating according to self-organizing (learning) principles. In particular, we study the emergence of network motifs in multi-layer perceptrons, whose initial connectivity is defined as a stack of fully-connected, bipartite graphs. Simulations show that the final network topology is shaped by learning dynamics, but can be strongly biased by choosing appropriate weight initialization schemes. Overall, our results suggest that non-trivial initialization strategies can make learning more effective by promoting the development of useful network motifs, which are often surprisingly consistent with those observed in general transduction networks.
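One way to make "motifs in a trained perceptron" concrete is to binarize a learned weight matrix and count simple bipartite motifs such as bi-fans (two inputs that jointly drive the same two outputs). The threshold and the choice of motif below are illustrative assumptions, not the paper's exact analysis:

```python
import numpy as np
from itertools import combinations

def count_bifans(W, threshold):
    """Count bi-fan motifs (two source units jointly driving the same
    two target units) in one layer, after pruning weak weights.

    W: (n_inputs, n_outputs) weight matrix of a fully-connected layer.
    """
    A = np.abs(W) > threshold  # binarized connectivity
    count = 0
    for i, j in combinations(range(A.shape[0]), 2):
        shared = np.logical_and(A[i], A[j]).sum()  # common targets
        count += shared * (shared - 1) // 2        # pairs of common targets
    return count
```

Comparing such motif counts before and after training (or across initialization schemes) is the kind of analysis the abstract describes.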
Tracing the Algorithm of Bilingual Language Learning
206 p. Learning a new language is an arduous but highly rewarding task. Learners must acquire an extensive vocabulary, as well as a set of rules about how to vary and combine this vocabulary to produce meaningful sentences. However, learning new languages may become easier once we already know at least two. Building on this idea, in this thesis I explore whether there are differences between people who know only one language (monolinguals) and those who speak two languages (bilinguals) when learning a new language. To this end, I carried out six behavioral experiments with participants of different linguistic profiles: a group of monolingual Spanish speakers, a Spanish-English bilingual group, and a Spanish-Basque bilingual group. Together, these experiments covered the implicit and explicit learning of new languages using artificial linguistic stimuli. Overall, the results of all experiments indicated that both bilingual groups outperformed the monolingual group when learning vocabulary implicitly and explicitly, but not in other domains (phonology, orthography, morphology). To explain how these differences in vocabulary learning arise, I developed a computational model capable of learning written words from the orthographic patterns of words in one or two languages. This model showed that, when learning words in two languages, it is easier to recognize and produce new words than when learning the vocabulary of a single language. Taken together, these results led me to conclude that monolinguals and bilinguals differ fundamentally in vocabulary learning, because exposure to different within-word patterns in two languages makes bilinguals more flexible when integrating the orthographic (and possibly phonological) information of new words.
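The thesis's actual model is not reproduced here; as a toy illustration of the underlying idea, that exposure to two orthographies broadens the letter-pattern statistics a learner can exploit, one can compare simple letter-bigram models trained on one versus two (tiny, hypothetical) lexicons:

```python
from collections import Counter

def bigram_model(words):
    """Estimate letter-bigram probabilities from a word list.

    Word boundaries are marked with '#'. This is a deliberately
    simplistic stand-in for a trained orthographic model.
    """
    counts = Counter()
    for w in words:
        padded = f"#{w}#"
        counts.update(zip(padded, padded[1:]))
    total = sum(counts.values())
    return {bg: c / total for bg, c in counts.items()}

def familiarity(model, word):
    """Average probability of a word's bigrams under the model."""
    padded = f"#{word}#"
    bigrams = list(zip(padded, padded[1:]))
    return sum(model.get(bg, 0.0) for bg in bigrams) / len(bigrams)

# Hypothetical lexicons: one language vs. the union of two.
mono = bigram_model(["casa", "cosa"])
bi = bigram_model(["casa", "cosa", "house", "mouse"])
```

Under this toy setup, a novel word drawing on patterns from the second language looks more familiar to the "bilingual" model, mirroring the flexibility advantage the abstract describes.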
Modeling cognition with generative neural networks: The case of orthographic processing
This thesis investigates the potential of generative neural networks to model cognitive processes. In contrast to many popular connectionist models, the computational framework adopted in this research work emphasizes the generative nature of cognition, suggesting that one of the primary goals of cognitive systems is to learn an internal model of the surrounding environment that can be used to infer causes and make predictions about upcoming sensory information. In particular, we consider a powerful class of recurrent neural networks that learn probabilistic generative models from experience in a completely unsupervised way, by extracting high-order statistical structure from a set of observed variables. Notably, this type of network can be conveniently formalized within the more general framework of probabilistic graphical models, which provides a unified language to describe both neural networks and structured Bayesian models. Moreover, recent advances make it possible to extend basic network architectures to build more powerful systems, which exploit multiple processing stages to perform learning and inference over hierarchical models, or which exploit delayed recurrent connections to process sequential information. We argue that these advanced network architectures constitute a promising alternative to the more traditional, feed-forward, supervised neural networks, because they more neatly capture the functional and structural organization of cortical circuits, providing a principled way to combine top-down, high-level contextual information with bottom-up sensory evidence. We provide empirical support justifying the use of these models by studying how efficient implementations of hierarchical and temporal generative networks can extract information from large datasets containing thousands of patterns.
In particular, we perform computational simulations of the recognition of handwritten and printed characters belonging to different writing scripts, which are successively combined spatially or temporally in order to build more complex orthographic units, such as those constituting English words.
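The distinction between spatial and temporal combination of characters can be sketched with two toy input encodings (hypothetical; the thesis's exact coding scheme is not specified here):

```python
import string
import numpy as np

ALPHABET = string.ascii_lowercase

def one_hot(ch):
    """One-hot vector for a lowercase letter."""
    v = np.zeros(len(ALPHABET))
    v[ALPHABET.index(ch)] = 1.0
    return v

def spatial_code(word, n_slots=4):
    """Spatial combination: all letters presented at once,
    concatenated into one input vector with one slot per position."""
    vecs = [one_hot(c) for c in word]
    vecs += [np.zeros(len(ALPHABET))] * (n_slots - len(word))
    return np.concatenate(vecs)

def temporal_code(word):
    """Temporal combination: letters presented one at a time,
    as a sequence of one-hot frames for a recurrent network."""
    return [one_hot(c) for c in word]
```

A hierarchical generative network would receive something like `spatial_code`, whereas a network with delayed recurrent connections would consume the frame sequence from `temporal_code`.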
Learning Orthographic Structure With Sequential Generative Neural Networks
Testolin, Alberto; Stoianov, Ivilin; Sperduti, Alessandro; Zorzi, Marco