Cooperation in the iterated prisoner's dilemma is learned by operant conditioning mechanisms
The prisoner's dilemma (PD) is the leading metaphor for the evolution of cooperative behavior in populations of selfish agents. Although cooperation in the iterated prisoner's dilemma (IPD) has been studied for over twenty years, most of this research has focused on strategies that involve nonlearned behavior. Another approach is to suppose that players' selection of the preferred reply might be reinforced in the same way that foraging animals track the best way to feed in changing, nonstationary environments. Learning mechanisms such as operant conditioning enable animals to acquire relevant characteristics of their environment in order to obtain reinforcements and to avoid punishments. In this study, the role of operant conditioning in the learning of cooperation was evaluated in the PD. We found that operant mechanisms allow agents to learn IPD play against other strategies. When random moves are allowed in the game, the operant learning model showed low sensitivity to such noise. On the basis of this evidence, it is suggested that operant learning might be involved in reciprocal altruism.
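The abstract does not spell out the learning rule, but the general mechanism can be illustrated with a minimal Python sketch (not the authors' actual model): a law-of-effect agent whose emitted response is strengthened in proportion to the payoff it earns. The class name, payoff values, and learning rate below are illustrative assumptions.

import random

# Row player's payoffs, assuming the standard PD ordering T > R > P > S.
PAYOFF = {('C', 'C'): 3, ('C', 'D'): 0, ('D', 'C'): 5, ('D', 'D'): 1}

class OperantAgent:
    # Emits C or D with probability proportional to its response strength.
    def __init__(self, lr=0.1):
        self.strength = {'C': 1.0, 'D': 1.0}
        self.lr = lr

    def act(self):
        total = self.strength['C'] + self.strength['D']
        return 'C' if random.random() < self.strength['C'] / total else 'D'

    def reinforce(self, move, reward):
        # Law of effect: the obtained payoff strengthens the emitted response.
        self.strength[move] += self.lr * reward

# One round against tit-for-tat, which replays our previous move:
agent, last_own_move = OperantAgent(), 'C'
move = agent.act()
agent.reinforce(move, PAYOFF[(move, last_own_move)])
last_own_move = move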
A step towards a reinforcement learning de novo genome assembler
Reinforcement learning has proven very promising for solving complex tasks
without human supervision during the learning process. However, its successful
applications have been predominantly focused on fictional and entertainment
problems, such as games. Motivated by this, the present work aims to shed light
on the application of reinforcement learning to a relevant real-world problem:
genome assembly. By expanding the only approach found in the literature that
addresses this problem, we carefully explored agent learning with the
Q-learning algorithm to understand its suitability for scenarios whose
characteristics are closer to those faced by real genome projects. The
improvements proposed here include changing the previously proposed reward
system and adding state-space exploration optimization strategies based on
dynamic pruning and mutual collaboration with evolutionary computing. These
investigations were evaluated on 23 new environments with larger inputs than
those used previously. All of these environments are freely available online so
the scientific community can build on this research. The results suggest
consistent performance gains from the proposed improvements; however, they also
demonstrate their limitations, especially those related to the high
dimensionality of the state and action spaces. Finally, we outline paths that
can be taken to tackle genome assembly efficiently in real scenarios, drawing
on recent successful reinforcement learning applications, including deep
reinforcement learning, from other domains that deal with high-dimensional
inputs.
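As a reference point for how the core update works, here is a minimal one-step Q-learning sketch in Python. Framing states as partial read orderings and rewards as overlap quality is a plausible assumption, not the paper's exact environment; the hyperparameter values are likewise illustrative.

import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2  # assumed hyperparameters

Q = defaultdict(float)  # Q[(state, action)] -> estimated return

def choose_action(state, actions):
    # Epsilon-greedy choice among the reads not yet placed.
    if random.random() < EPSILON:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state, next_actions):
    # One-step Q-learning backup; reward could score read-overlap quality.
    best_next = max((Q[(next_state, a)] for a in next_actions), default=0.0)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

The dynamic pruning mentioned above would shrink the action set passed to choose_action, one way to fight the state/action-space explosion the results point to.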
Social learning strategies modify the effect of network structure on group performance
The structure of communication networks is an important determinant of the
capacity of teams, organizations and societies to solve policy, business and
science problems. Yet, previous studies reached contradictory results about the
relationship between network structure and performance, finding support for the
superiority of both well-connected efficient and poorly connected inefficient
network structures. Here we argue that understanding how communication networks
affect group performance requires taking into consideration the social learning
strategies of individual team members. We show that efficient networks
outperform inefficient networks when individuals rely on conformity by copying
the most frequent solution among their contacts. However, inefficient networks
are superior when individuals follow the best member by copying the group
member with the highest payoff. In addition, groups relying on conformity based
on a small sample of others excel at complex tasks, while groups following the
best member achieve the greatest performance on simple tasks. Our findings
reconcile contradictory results in the literature and have broad implications
for the study of social learning across disciplines.
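The two copying rules can be stated compactly; the following Python sketch illustrates the strategies as described in the abstract, with hypothetical function names and a hypothetical sample size.

import random
from collections import Counter

def conformity_copy(neighbor_solutions, sample_size=3):
    # Conformity: copy the most frequent solution in a small sample of contacts.
    sample = random.sample(neighbor_solutions, min(sample_size, len(neighbor_solutions)))
    return Counter(sample).most_common(1)[0][0]

def best_member_copy(neighbor_solutions, neighbor_payoffs):
    # Best-member: copy the contact with the highest observed payoff.
    best = max(range(len(neighbor_solutions)), key=lambda i: neighbor_payoffs[i])
    return neighbor_solutions[best]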
Adaptive Investment Strategies For Periodic Environments
In this paper, we present an adaptive investment strategy for environments
with periodic returns on investment. In our approach, we consider an investment
model where the agent decides at every time step the proportion of wealth to
invest in a risky asset, keeping the rest of the budget in a risk-free asset.
Every investment is evaluated in the market via a stylized return on investment
function (RoI), which is modeled by a stochastic process with unknown
periodicities and levels of noise. For comparison, we present two reference
strategies representing agents with zero knowledge and with complete knowledge
of the dynamics of the returns. We also consider an investment strategy based
on technical analysis, which forecasts the next return by fitting a trend line
to previously received returns. To assess the performance of the different
strategies, we perform computer experiments that calculate the average budget
each strategy obtains over a given number of time steps. To ensure fair
comparisons, we first tune the parameters of each strategy; afterwards, we
compare the strategies' performance for RoIs with different periodicities and
levels of noise.
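The investment model itself admits a compact sketch. The Python fragment below assumes a sinusoidal RoI with additive Gaussian noise and a fixed invested fraction; the functional form, parameter values, and names are illustrative, not the paper's exact specification.

import math, random

def step_wealth(wealth, fraction, roi, r_free=0.0):
    # Put `fraction` of the budget in the risky asset, the rest risk-free.
    return wealth * ((1 - fraction) * (1 + r_free) + fraction * (1 + roi))

def noisy_periodic_roi(t, period=50, amplitude=0.05, noise=0.02):
    # Stylized RoI: a sinusoid of unknown period plus Gaussian noise.
    return amplitude * math.sin(2 * math.pi * t / period) + random.gauss(0, noise)

wealth = 1.0
for t in range(1000):
    wealth = step_wealth(wealth, fraction=0.5, roi=noisy_periodic_roi(t))

An adaptive strategy would replace the fixed fraction with one chosen at each step from past returns, which is the role the tuned strategies above play.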
Evolving Inborn Knowledge For Fast Adaptation in Dynamic POMDP Problems
Rapid online adaptation to changing tasks is an important problem in machine
learning and, recently, a focus of meta-reinforcement learning. However,
reinforcement learning (RL) algorithms struggle in POMDP environments because
the state of the system, essential in an RL framework, is not always visible.
Additionally, hand-designed meta-RL architectures may not include suitable
computational structures for specific learning problems. Evolving online
learning mechanisms, by contrast, can incorporate learning strategies into an
agent that can (i) evolve memory when required and (ii) optimize adaptation
speed for specific online learning problems. In this
paper, we exploit the highly adaptive nature of neuromodulated neural networks
to evolve a controller that uses the latent space of an autoencoder in a POMDP.
The analysis of the evolved networks reveals the ability of the proposed
algorithm to acquire inborn knowledge in a variety of aspects such as the
detection of cues that reveal implicit rewards, and the ability to evolve
location neurons that help with navigation. The integration of inborn knowledge
and online plasticity enabled fast adaptation and better performance in
comparison to some non-evolutionary meta-reinforcement learning algorithms. The
algorithm also proved successful in the 3D gaming environment Malmo Minecraft.
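Neuromodulated plasticity, the mechanism at the heart of this approach, is commonly written as a Hebbian update gated by a modulatory signal. The Python sketch below shows that generic form; it is not necessarily the exact rule evolved in the paper, and the learning rate is an assumed value.

import numpy as np

def neuromodulated_hebbian_step(w, pre, post, m, eta=0.01):
    # w: weights (n_post, n_pre); pre/post: activation vectors; m: scalar
    # modulatory signal. With m == 0 the weights freeze; m < 0 reverses
    # learning, letting evolution decide when and how plasticity acts.
    return w + eta * m * np.outer(post, pre)

Because the modulatory signal m is itself produced by evolved neurons, the network can, for instance, switch plasticity on only after detecting a cue that reveals an implicit reward.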
Final report of work-with-IT: the JISC study into evolution of working practices
Technology is increasingly being used to underpin business processes across teaching and learning, research, knowledge exchange and business support activities in both HE and FE. The introduction of technology has a significant impact on the working practices of staff, often requiring them to work in a radically different way. Change in any situation can be unsettling and problematic and, where not effectively managed, can lead to poor service or functionality and disenfranchised staff. These issues can have a direct impact on institutional effectiveness, reputation and the resulting student experience. The Work-with-IT project, based at the University of Strathclyde, sought to examine changes to working practices across HE and FE, the impact on staff roles and relationships, and the new skill sets required to meet these changes.
Hyper-learning for population-based incremental learning in dynamic environments
The population-based incremental learning (PBIL) algorithm is a combination of evolutionary optimization and competitive learning. Recently, the PBIL algorithm has been applied to dynamic optimization problems. This paper investigates the effect of the learning rate, a key parameter of PBIL, on the algorithm's performance in dynamic environments. A hyper-learning scheme is proposed for PBIL, in which the learning rate is temporarily raised whenever the environment changes. The hyper-learning scheme can be combined with other approaches, e.g., the restart and hypermutation schemes, for PBIL in dynamic environments. Based on a series of dynamic test problems, experiments are carried out to investigate the effect of different learning rates and of the proposed hyper-learning scheme, in combination with restart and hypermutation schemes, on the performance of PBIL. The experimental results show that the learning rate has a significant impact on the performance of the PBIL algorithm in dynamic environments, and that the effect of the proposed hyper-learning scheme depends on the environmental dynamics and on the other schemes combined in the PBIL algorithm. The work by Shengxiang Yang was supported by the Engineering and Physical Sciences Research Council (EPSRC) of the United Kingdom under Grant EP/E060722/1.
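A minimal Python sketch of PBIL with the hyper-learning scheme may help fix ideas: the probability vector is pulled toward the best sample of each generation, and the learning rate is raised for generations right after a detected change. The population size, rates, and names are illustrative assumptions.

import random

def sample(p):
    return [1 if random.random() < pi else 0 for pi in p]

def pbil_step(p, fitness, pop_size=50, lr=0.05, hyper_lr=0.5, changed=False):
    # Hyper-learning: temporarily raise the learning rate after a change.
    rate = hyper_lr if changed else lr
    best = max((sample(p) for _ in range(pop_size)), key=fitness)
    # Shift the probability vector toward this generation's best sample.
    return [(1 - rate) * pi + rate * b for pi, b in zip(p, best)]

# Usage: start from p = [0.5] * n_bits and pass changed=True for a few
# generations whenever an environment change is detected.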