Adaptation to criticality through organizational invariance in embodied agents
Many biological and cognitive systems do not operate deep within one regime of
activity or another. Instead, they are poised at critical points located at
phase transitions in their parameter space.
phase transitions in their parameter space. The pervasiveness of criticality
suggests that there may be general principles inducing this behaviour, yet
there is no well-founded theory for understanding how criticality is generated
across such a wide span of levels and contexts. To explore how criticality
might emerge from general adaptive mechanisms, we propose a simple learning
rule that maintains an internal organizational structure characteristic of a
specific family of systems at criticality. We implement the mechanism in artificial embodied
agents controlled by a neural network maintaining a correlation structure
randomly sampled from an Ising model at critical temperature. Agents are
evaluated in two classical reinforcement learning scenarios: the Mountain Car
and the Acrobot double pendulum. In both cases the neural controller appears to
reach a point of criticality, which coincides with a transition point between
two regimes of the agent's behaviour. These results suggest that adaptation to
criticality could be used as a general adaptive mechanism in some
circumstances, providing an alternative explanation for the pervasive presence
of criticality in biological and cognitive systems.
Comment: arXiv admin note: substantial text overlap with arXiv:1704.0525
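To make the adaptation mechanism above concrete, here is a minimal sketch (not the paper's code) under stated assumptions: pairwise correlations of a small Ising-like controller are estimated by Metropolis sampling, and a Boltzmann-machine-style rule nudges the couplings toward a reference correlation structure sampled at an assumed critical temperature. The network size, temperature, and reference couplings are illustrative placeholders.

```python
# Minimal sketch: adapt couplings of a small Ising-like controller so its
# pairwise correlations match a reference structure sampled at an assumed
# critical temperature. Not the paper's implementation.
import numpy as np

rng = np.random.default_rng(0)

def metropolis_correlations(J, h, beta, n_units, n_steps=5000):
    """Estimate mean pairwise correlations <s_i s_j> by Metropolis sampling."""
    s = rng.choice([-1, 1], size=n_units)
    corr = np.zeros((n_units, n_units))
    for _ in range(n_steps):
        i = rng.integers(n_units)
        dE = 2 * s[i] * (h[i] + J[i] @ s)          # energy change of a flip
        if dE < 0 or rng.random() < np.exp(-beta * dE):
            s[i] = -s[i]
        corr += np.outer(s, s)
    return corr / n_steps

n = 8
beta_c = 1.0                       # stand-in for the critical inverse temperature
J_ref = rng.normal(0, 1 / np.sqrt(n), (n, n))
J_ref = (J_ref + J_ref.T) / 2      # symmetric reference couplings
np.fill_diagonal(J_ref, 0)
h = np.zeros(n)
c_ref = metropolis_correlations(J_ref, h, beta_c, n)   # reference structure

J = np.zeros((n, n))               # controller couplings, learned from scratch
eta = 0.1
for epoch in range(50):            # Boltzmann-style correlation matching
    c_model = metropolis_correlations(J, h, beta_c, n)
    J += eta * (c_ref - c_model)   # move model correlations toward reference
    np.fill_diagonal(J, 0)
```

The update `J += eta * (c_ref - c_model)` is the standard correlation-matching rule; whether it actually poises the controller at a critical point is the empirical question the abstract addresses.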
Structured chaos shapes spike-response noise entropy in balanced neural networks
Large networks of sparsely coupled excitatory and inhibitory cells occur
throughout the brain. A striking feature of these networks is that they are
chaotic. How does this chaos manifest in the neural code? Specifically, how
variable are the spike patterns that such a network produces in response to an
input signal? To answer this, we derive a bound for the entropy of multi-cell
spike pattern distributions in large recurrent networks of spiking neurons
responding to fluctuating inputs. The analysis is based on results from random
dynamical systems theory and is complemented by detailed numerical simulations.
We find that the spike pattern entropy is an order of magnitude lower than what
would be extrapolated from single cells. This holds despite the fact that
network coupling becomes vanishingly sparse as network size grows -- a
phenomenon that depends on "extensive chaos," as previously discovered for
balanced networks without stimulus drive. Moreover, we show how spike pattern
entropy is controlled by temporal features of the inputs. Our findings provide
insight into how neural networks may encode stimuli in the presence of
inherently chaotic dynamics.
Comment: 9 pages, 5 figures
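A toy, self-contained illustration (not the paper's derivation) of why the single-cell extrapolation overshoots: when a shared fluctuating input correlates the cells, the plug-in entropy of joint spike words falls well below the sum of single-cell entropies. Cell count, drive strength, and threshold are made up.

```python
# Illustration: summed single-cell entropies overestimate the entropy of
# joint spike patterns once responses are correlated by common input.
import numpy as np
from collections import Counter

rng = np.random.default_rng(1)
n_cells, n_trials = 5, 100_000
shared = rng.normal(size=n_trials)                     # common fluctuating input
private = rng.normal(size=(n_cells, n_trials))         # per-cell noise
spikes = ((0.8 * shared + private) > 1.0).astype(int)  # correlated 0/1 spikes

def entropy_bits(p):
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# Naive extrapolation: sum of single-cell entropies.
h_single = sum(
    entropy_bits(np.bincount(spikes[i], minlength=2) / n_trials)
    for i in range(n_cells)
)

# Plug-in entropy of the joint spike words.
words = Counter(map(tuple, spikes.T))
p_joint = np.array(list(words.values())) / n_trials
h_joint = entropy_bits(p_joint)

print(f"sum of single-cell entropies: {h_single:.3f} bits")
print(f"joint word entropy:           {h_joint:.3f} bits")  # strictly smaller
```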
A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues
Sequential data often possesses a hierarchical structure with complex
dependencies between subsequences, such as those found between the utterances in a
dialogue. In an effort to model this kind of generative process, we propose a
neural network-based generative architecture, with latent stochastic variables
that span a variable number of time steps. We apply the proposed model to the
task of dialogue response generation and compare it with recent neural network
architectures. We evaluate the model performance through automatic evaluation
metrics and by carrying out a human evaluation. The experiments demonstrate
that our model improves upon recently proposed models and that the latent
variables facilitate the generation of long outputs and maintain the context.
Comment: 15 pages, 5 tables, 4 figures
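A minimal PyTorch sketch of the kind of architecture the abstract describes: a word-level utterance encoder, an utterance-level context encoder, and a latent Gaussian variable that spans the entire response. Layer sizes and names are assumptions, not the authors' implementation.

```python
# Hierarchical latent-variable encoder-decoder sketch (VHRED-style);
# shapes and names are illustrative only.
import torch
import torch.nn as nn

class HierarchicalLatentSeq2Seq(nn.Module):
    def __init__(self, vocab=10000, emb=128, hid=256, z_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.utt_enc = nn.GRU(emb, hid, batch_first=True)   # word level
        self.ctx_enc = nn.GRU(hid, hid, batch_first=True)   # utterance level
        self.to_mu = nn.Linear(hid, z_dim)
        self.to_logvar = nn.Linear(hid, z_dim)
        self.dec = nn.GRU(emb + hid + z_dim, hid, batch_first=True)
        self.out = nn.Linear(hid, vocab)

    def forward(self, dialogue, response):
        # dialogue: (batch, n_utts, utt_len) ids; response: (batch, resp_len)
        b, n_utts, utt_len = dialogue.shape
        words = self.embed(dialogue.reshape(b * n_utts, utt_len))
        _, u = self.utt_enc(words)                           # (1, b*n, hid)
        _, c = self.ctx_enc(u.squeeze(0).reshape(b, n_utts, -1))
        ctx = c.squeeze(0)                                   # (b, hid)
        # Latent over context only here; the full model also conditions the
        # approximate posterior on the response.
        mu, logvar = self.to_mu(ctx), self.to_logvar(ctx)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparam.
        cond = torch.cat([ctx, z], dim=-1)                   # spans all steps
        cond = cond.unsqueeze(1).expand(-1, response.size(1), -1)
        dec_in = torch.cat([self.embed(response), cond], dim=-1)
        h, _ = self.dec(dec_in)
        logits = self.out(h)
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
        return logits, kl                                    # train on NLL + KL
```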
BowTie - A deep learning feedforward neural network for sentiment analysis
How to model and encode the semantics of human-written text, and which type of
neural network should process it, are not settled issues in sentiment
analysis. Accuracy and transferability are critical issues in machine learning
in general. These properties are closely related to the loss estimates for the
trained model. I present a computationally efficient and accurate feedforward
neural network for sentiment prediction capable of maintaining low losses. When
coupled with an effective semantics model of the text, it yields highly
accurate models. Experimental results on representative
benchmark datasets and comparisons to other methods show the advantages of the
new approach.
Comment: 12 pages, 7 figures, 4 tables
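The abstract does not spell out BowTie's architecture, so the following is only a generic sketch of the "semantics model plus feedforward network" pairing it alludes to: mean-pooled word embeddings feeding a small dropout-regularized classifier. All sizes are assumptions.

```python
# Generic feedforward sentiment classifier sketch (not BowTie's actual
# architecture, which the abstract does not specify).
import torch
import torch.nn as nn

class FeedforwardSentiment(nn.Module):
    def __init__(self, vocab=20000, emb=100, hid=64, n_classes=2):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab, emb, mode="mean")  # pooled text
        self.net = nn.Sequential(
            nn.Linear(emb, hid), nn.ReLU(),
            nn.Dropout(0.5),                  # regularization to keep loss low
            nn.Linear(hid, n_classes),
        )

    def forward(self, token_ids, offsets):
        return self.net(self.embed(token_ids, offsets))
```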
Cultural Neuroeconomics of Intertemporal Choice
According to theories of cultural neuroscience, Westerners and Easterners may have distinct styles of cognition (e.g., different allocation of attention). Previous research has shown that Westerners and Easterners tend to utilize analytical and holistic cognitive styles, respectively. On the other hand, little is known regarding cultural differences in neuroeconomic behavior. For instance, economic decisions may be affected by cultural differences in the neurocomputational processing underlying attention; however, this area of neuroeconomics has been largely understudied. In the present paper, we attempt to bridge this gap by considering the links between the theory of cultural neuroscience and the neuroeconomic theory of the role of attention in intertemporal choice. We predict that (i) Westerners are more impulsive and inconsistent in intertemporal choice than Easterners, and (ii) Westerners discount delayed monetary losses more steeply than Easterners. We examine these predictions by utilizing a novel temporal discounting model based on Tsallis' statistics (i.e., a q-exponential model). Our preliminary analysis of temporal discounting of gains and losses by Americans and Japanese confirmed the predictions of the cultural neuroeconomic theory. Future study directions, employing computational modeling via neural networks, are briefly outlined and discussed.
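For reference, the q-exponential discount function from Tsallis statistics mentioned above takes the form F(D) = 1 / [1 + (1 - q) k D]^(1/(1-q)), which recovers exponential discounting exp(-kD) as q → 1 and classic hyperbolic discounting 1/(1 + kD) at q = 0. A fitting sketch with made-up choice data (the delays, values, bounds, and starting point are all assumptions):

```python
# Fit the q-exponential discount model F(D) = 1 / [1 + (1-q)kD]^(1/(1-q))
# to hypothetical indifference-point data.
import numpy as np
from scipy.optimize import curve_fit

def q_discount(delay, k, q):
    if abs(q - 1.0) < 1e-9:                       # q -> 1: exponential limit
        return np.exp(-k * delay)
    return (1.0 + (1.0 - q) * k * delay) ** (-1.0 / (1.0 - q))

delays = np.array([0, 7, 30, 90, 180, 365], dtype=float)   # days
values = np.array([1.0, 0.9, 0.75, 0.6, 0.5, 0.4])         # hypothetical data
(k_hat, q_hat), _ = curve_fit(q_discount, delays, values,
                              p0=[0.01, 0.5], bounds=(0, [1.0, 0.99]))
print(f"k = {k_hat:.4f}, q = {q_hat:.3f}")
# q far from 1 indicates dynamically inconsistent (hyperbolic-like) discounting.
```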
Understanding Model-Based Reinforcement Learning and its Application in Safe Reinforcement Learning
Model-based reinforcement learning algorithms have been shown to achieve successful results on various continuous control benchmarks, but the understanding of model-based methods remains limited. We try to interpret how model-based methods work through novel experiments on state-of-the-art algorithms, with an emphasis on the model-learning component. We evaluate the role of model learning in policy optimization and propose methods for learning a more accurate model. With a better understanding of model-based reinforcement learning, we then apply model-based methods to solve safe reinforcement learning (RL) problems with near-zero violation of hard constraints throughout training. Drawing an analogy with how humans and animals learn to perform safe actions, we break the safe RL problem down into three stages. First, we train agents in a constraint-free environment to learn a performant policy that reaches high rewards, and simultaneously learn a model of the dynamics. Second, we use model-based methods to plan safe actions and train a safeguarding policy from these actions through imitation. Finally, we propose a factored framework to train an overall policy that mixes the performant policy and the safeguarding policy. This three-step curriculum ensures near-zero violation of safety constraints at all times. As an advantage of the model-based approach, the sample complexity required at the second and third steps is significantly lower than that of model-free methods, which can enable online safe learning. We demonstrate the effectiveness of our methods on various continuous control problems and analyze their advantages over state-of-the-art approaches.
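A toy, runnable sketch of the stage-three mixing idea under heavy simplifying assumptions (1-D state, the environment equals the learned model, hand-written stand-ins for both policies): the performant policy acts unless the dynamics model predicts a constraint violation, in which case the safeguarding policy takes over.

```python
# Toy illustration of factored policy mixing for safe RL; every component
# here is a stand-in, not the paper's method.
import numpy as np

LIMIT = 1.0                                  # hard constraint: |x| <= LIMIT

def dynamics_model(x, a):                    # learned dynamics model stand-in
    return x + 0.1 * a

def performant_policy(x):                    # reward-seeking: always push right
    return 1.0

def safeguard_policy(x):                     # imitates planned safe actions
    return -np.sign(x)

def mixed_policy(x):
    a = performant_policy(x)
    if abs(dynamics_model(x, a)) > LIMIT:    # predicted violation -> switch
        a = safeguard_policy(x)
    return a

x = 0.0
for step in range(50):
    x = dynamics_model(x, mixed_policy(x))   # toy: environment == model
assert abs(x) <= LIMIT                       # near-zero violations by design
```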
Linear combination of one-step predictive information with an external reward in an episodic policy gradient setting: a critical analysis
One of the main challenges in the field of embodied artificial intelligence
is the open-ended autonomous learning of complex behaviours. Our approach is to
use task-independent, information-driven intrinsic motivation(s) to support
task-dependent learning. The work presented here is a preliminary step in which
we investigate the predictive information (the mutual information of the past
and future of the sensor stream) as an intrinsic drive, ideally supporting any
kind of task acquisition. Previous experiments have shown that the predictive
information (PI) is a good candidate to support autonomous, open-ended learning
of complex behaviours, because a maximisation of the PI corresponds to an
exploration of morphology- and environment-dependent behavioural regularities.
The idea is that these regularities can then be exploited in order to solve any
given task. Three different experiments are presented and their results lead to
the conclusion that the linear combination of the one-step PI with an external
reward function is not generally recommended in an episodic policy gradient
setting. Only for hard tasks can a great speed-up be achieved, at the cost of a
loss in asymptotic performance.
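For concreteness, a small sketch of the combination being analyzed: the one-step predictive information of a discretized sensor stream, I(s_t; s_{t+1}), is estimated by a plug-in method and added to the external return with a weighting coefficient. The binning, the weight beta, and the per-episode combination are assumptions, not the paper's exact setup.

```python
# Sketch: linear combination of an external return with the one-step
# predictive information PI = I(s_t ; s_{t+1}) of the sensor stream.
import numpy as np

def plugin_mutual_information(x, y, bins=8):
    """Plug-in estimate of I(X;Y) in nats from two 1-D sample arrays."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    p_xy = joint / joint.sum()
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    mask = p_xy > 0
    return float((p_xy[mask] * np.log(p_xy[mask] / (p_x @ p_y)[mask])).sum())

def combined_return(sensor_stream, external_return, beta=0.1):
    pi = plugin_mutual_information(sensor_stream[:-1], sensor_stream[1:])
    return external_return + beta * pi       # the linear combination

rng = np.random.default_rng(2)
stream = np.cumsum(rng.normal(size=1000))    # toy one-episode sensor series
print(combined_return(stream, external_return=5.0))
```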