Cooperation in the iterated prisoner's dilemma is learned by operant conditioning mechanisms
The prisoner's dilemma (PD) is the leading metaphor for the evolution of cooperative behavior in populations of selfish agents. Although cooperation in the iterated prisoner's dilemma (IPD) has been studied for over twenty years, most of this research has focused on strategies that involve nonlearned behavior. Another approach is to suppose that players' selection of the preferred reply might be reinforced in the same way that foraging animals track the best way to feed in changing, nonstationary environments. Learning mechanisms such as operant conditioning enable animals to acquire relevant characteristics of their environment in order to obtain reinforcements and to avoid punishments. In this study, the role of operant conditioning in the learning of cooperation was evaluated in the PD. We found that operant mechanisms allow agents to learn IPD play against other strategies. When random moves are allowed in the game, the operant learning model showed low sensitivity to such noise. On the basis of this evidence, it is suggested that operant learning might be involved in reciprocal altruism.
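The abstract does not spell out the learning rule, but the general mechanism can be illustrated with a minimal Python sketch (not the authors' actual model): a law-of-effect agent whose emitted response is strengthened in proportion to the payoff it earns. The class name, payoff values, and learning rate below are illustrative assumptions.

import random

# Row player's payoffs, assuming the standard PD ordering T > R > P > S.
PAYOFF = {('C', 'C'): 3, ('C', 'D'): 0, ('D', 'C'): 5, ('D', 'D'): 1}

class OperantAgent:
    # Emits C or D with probability proportional to its response strength.
    def __init__(self, lr=0.1):
        self.strength = {'C': 1.0, 'D': 1.0}
        self.lr = lr

    def act(self):
        total = self.strength['C'] + self.strength['D']
        return 'C' if random.random() < self.strength['C'] / total else 'D'

    def reinforce(self, move, reward):
        # Law of effect: the obtained payoff strengthens the emitted response.
        self.strength[move] += self.lr * reward

# One round against tit-for-tat, which replays our previous move:
agent, last_own_move = OperantAgent(), 'C'
move = agent.act()
agent.reinforce(move, PAYOFF[(move, last_own_move)])
last_own_move = move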
A step towards a reinforcement learning de novo genome assembler
Reinforcement learning has proven very promising for solving complex tasks
without human supervision during the learning process. However, its successful
applications have been predominantly focused on fictional and entertainment
problems, such as games. Motivated by this, the present work aims to shed light
on the application of reinforcement learning to a relevant real-world problem:
genome assembly. By expanding the only approach found in the literature that
addresses this problem, we carefully explored agent learning with the
Q-learning algorithm to understand its suitability for scenarios whose
characteristics are closer to those faced by real genome projects. The
improvements proposed here include changing the previously proposed reward
system and adding state-space exploration optimization strategies based on
dynamic pruning and mutual collaboration with evolutionary computing. These
investigations were evaluated on 23 new environments with larger inputs than
those used previously. All of these environments are freely available online so
the scientific community can build on this research. The results suggest
consistent performance gains from the proposed improvements; however, they also
demonstrate their limitations, especially those related to the high
dimensionality of the state and action spaces. Finally, we outline paths that
can be taken to tackle genome assembly efficiently in real scenarios, drawing
on recent successful reinforcement learning applications, including deep
reinforcement learning, from other domains that deal with high-dimensional
inputs.
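As a reference point for how the core update works, here is a minimal one-step Q-learning sketch in Python. Framing states as partial read orderings and rewards as overlap quality is a plausible assumption, not the paper's exact environment; the hyperparameter values are likewise illustrative.

import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2  # assumed hyperparameters

Q = defaultdict(float)  # Q[(state, action)] -> estimated return

def choose_action(state, actions):
    # Epsilon-greedy choice among the reads not yet placed.
    if random.random() < EPSILON:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state, next_actions):
    # One-step Q-learning backup; reward could score read-overlap quality.
    best_next = max((Q[(next_state, a)] for a in next_actions), default=0.0)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

The dynamic pruning mentioned above would shrink the action set passed to choose_action, one way to fight the state/action-space explosion the results point to.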
Social learning strategies modify the effect of network structure on group performance
The structure of communication networks is an important determinant of the
capacity of teams, organizations and societies to solve policy, business and
science problems. Yet, previous studies reached contradictory results about the
relationship between network structure and performance, finding support for the
superiority of both well-connected efficient and poorly connected inefficient
network structures. Here we argue that understanding how communication networks
affect group performance requires taking into consideration the social learning
strategies of individual team members. We show that efficient networks
outperform inefficient networks when individuals rely on conformity by copying
the most frequent solution among their contacts. However, inefficient networks
are superior when individuals follow the best member by copying the group
member with the highest payoff. In addition, groups relying on conformity based
on a small sample of others excel at complex tasks, while groups following the
best member achieve the greatest performance on simple tasks. Our findings
reconcile contradictory results in the literature and have broad implications
for the study of social learning across disciplines.
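The two copying rules can be stated compactly; the following Python sketch illustrates the strategies as described in the abstract, with hypothetical function names and a hypothetical sample size.

import random
from collections import Counter

def conformity_copy(neighbor_solutions, sample_size=3):
    # Conformity: copy the most frequent solution in a small sample of contacts.
    sample = random.sample(neighbor_solutions, min(sample_size, len(neighbor_solutions)))
    return Counter(sample).most_common(1)[0][0]

def best_member_copy(neighbor_solutions, neighbor_payoffs):
    # Best-member: copy the contact with the highest observed payoff.
    best = max(range(len(neighbor_solutions)), key=lambda i: neighbor_payoffs[i])
    return neighbor_solutions[best]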
Adaptive Investment Strategies For Periodic Environments
In this paper, we present an adaptive investment strategy for environments
with periodic returns on investment. In our approach, we consider an investment
model where the agent decides at every time step the proportion of wealth to
invest in a risky asset, keeping the rest of the budget in a risk-free asset.
Every investment is evaluated in the market via a stylized return on investment
function (RoI), which is modeled by a stochastic process with unknown
periodicities and levels of noise. For comparison, we present two reference
strategies representing agents with zero knowledge and with complete knowledge
of the dynamics of the returns. We also consider an investment strategy based
on technical analysis, which forecasts the next return by fitting a trend line
to previously received returns. To assess the performance of the different
strategies, we perform computer experiments that calculate the average budget
each strategy obtains over a given number of time steps. To ensure fair
comparisons, we first tune the parameters of each strategy; afterwards, we
compare the strategies' performance for RoIs with different periodicities and
levels of noise.
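The investment model itself admits a compact sketch. The Python fragment below assumes a sinusoidal RoI with additive Gaussian noise and a fixed invested fraction; the functional form, parameter values, and names are illustrative, not the paper's exact specification.

import math, random

def step_wealth(wealth, fraction, roi, r_free=0.0):
    # Put `fraction` of the budget in the risky asset, the rest risk-free.
    return wealth * ((1 - fraction) * (1 + r_free) + fraction * (1 + roi))

def noisy_periodic_roi(t, period=50, amplitude=0.05, noise=0.02):
    # Stylized RoI: a sinusoid of unknown period plus Gaussian noise.
    return amplitude * math.sin(2 * math.pi * t / period) + random.gauss(0, noise)

wealth = 1.0
for t in range(1000):
    wealth = step_wealth(wealth, fraction=0.5, roi=noisy_periodic_roi(t))

An adaptive strategy would replace the fixed fraction with one chosen at each step from past returns, which is the role the tuned strategies above play.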
Evolving Inborn Knowledge For Fast Adaptation in Dynamic POMDP Problems
Rapid online adaptation to changing tasks is an important problem in machine
learning and, recently, a focus of meta-reinforcement learning. However,
reinforcement learning (RL) algorithms struggle in POMDP environments because
the state of the system, essential in an RL framework, is not always visible.
Additionally, hand-designed meta-RL architectures may not include suitable
computational structures for specific learning problems. Evolving online
learning mechanisms, by contrast, can incorporate learning strategies into an
agent that can (i) evolve memory when required and (ii) optimize adaptation
speed for specific online learning problems. In this
paper, we exploit the highly adaptive nature of neuromodulated neural networks
to evolve a controller that uses the latent space of an autoencoder in a POMDP.
The analysis of the evolved networks reveals the ability of the proposed
algorithm to acquire inborn knowledge in a variety of aspects such as the
detection of cues that reveal implicit rewards, and the ability to evolve
location neurons that help with navigation. The integration of inborn knowledge
and online plasticity enabled fast adaptation and better performance in
comparison to some non-evolutionary meta-reinforcement learning algorithms. The
algorithm also proved successful in the 3D gaming environment Malmo Minecraft.
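Neuromodulated plasticity, the mechanism at the heart of this approach, is commonly written as a Hebbian update gated by a modulatory signal. The Python sketch below shows that generic form; it is not necessarily the exact rule evolved in the paper, and the learning rate is an assumed value.

import numpy as np

def neuromodulated_hebbian_step(w, pre, post, m, eta=0.01):
    # w: weights (n_post, n_pre); pre/post: activation vectors; m: scalar
    # modulatory signal. With m == 0 the weights freeze; m < 0 reverses
    # learning, letting evolution decide when and how plasticity acts.
    return w + eta * m * np.outer(post, pre)

Because the modulatory signal m is itself produced by evolved neurons, the network can, for instance, switch plasticity on only after detecting a cue that reveals an implicit reward.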
Final report of work-with-IT: the JISC study into evolution of working practices
Technology is increasingly being used to underpin business processes across teaching and learning, research, knowledge exchange and business support activities in both HE and FE. The introduction of technology has a significant impact on the working practices of staff, often requiring them to work in a radically different way. Change in any situation can be unsettling and problematic and, where not effectively managed, can lead to poor service or functionality and disenfranchised staff. These issues can have a direct impact on institutional effectiveness, reputation and the resulting student experience. The Work-with-IT project, based at the University of Strathclyde, sought to examine changes to working practices across HE and FE, the impact on staff roles and relationships, and the new skill sets required to meet these changes.
Hyper-learning for population-based incremental learning in dynamic environments
The population-based incremental learning (PBIL) algorithm is a combination of evolutionary optimization and competitive learning. Recently, the PBIL algorithm has been applied to dynamic optimization problems. This paper investigates the effect of the learning rate, a key parameter of PBIL, on the algorithm's performance in dynamic environments. A hyper-learning scheme is proposed for PBIL, in which the learning rate is temporarily raised whenever the environment changes. The hyper-learning scheme can be combined with other approaches, e.g., the restart and hypermutation schemes, for PBIL in dynamic environments. Based on a series of dynamic test problems, experiments are carried out to investigate the effect of different learning rates and of the proposed hyper-learning scheme, in combination with restart and hypermutation schemes, on the performance of PBIL. The experimental results show that the learning rate has a significant impact on the performance of the PBIL algorithm in dynamic environments, and that the effect of the proposed hyper-learning scheme depends on the environmental dynamics and on the other schemes combined in the PBIL algorithm. The work by Shengxiang Yang was supported by the Engineering and Physical Sciences Research Council (EPSRC) of the United Kingdom under Grant EP/E060722/1.
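A minimal Python sketch of PBIL with the hyper-learning scheme may help fix ideas: the probability vector is pulled toward the best sample of each generation, and the learning rate is raised for generations right after a detected change. The population size, rates, and names are illustrative assumptions.

import random

def sample(p):
    return [1 if random.random() < pi else 0 for pi in p]

def pbil_step(p, fitness, pop_size=50, lr=0.05, hyper_lr=0.5, changed=False):
    # Hyper-learning: temporarily raise the learning rate after a change.
    rate = hyper_lr if changed else lr
    best = max((sample(p) for _ in range(pop_size)), key=fitness)
    # Shift the probability vector toward this generation's best sample.
    return [(1 - rate) * pi + rate * b for pi, b in zip(p, best)]

# Usage: start from p = [0.5] * n_bits and pass changed=True for a few
# generations whenever an environment change is detected.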