Intrinsically Motivated Reinforcement Learning based Recommendation with Counterfactual Data Augmentation
Deep reinforcement learning (DRL) has proven effective at capturing
users' dynamic interests in recent literature. However, training a DRL agent is
challenging: because of the sparse environment in recommender systems (RS), DRL
agents must spend time either exploring informative user-item interaction
trajectories or exploiting existing trajectories for policy learning. This
exploration-exploitation trade-off significantly affects recommendation
performance when the environment is sparse, and balancing the two is even
harder in DRL-based RS, where the agent needs to explore informative
trajectories deeply and exploit them efficiently in the context of recommender
systems. As a step towards addressing this issue, we design a novel
intrinsically motivated reinforcement learning method to increase the
capability of exploring informative interaction trajectories in the sparse
environment; these trajectories are further enriched via a counterfactual
augmentation strategy for more efficient exploitation. Extensive experiments on
six offline datasets and three online simulation platforms demonstrate the
superiority of our model over a set of existing state-of-the-art methods.
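The core idea of an intrinsic exploration bonus can be sketched as follows. This is a minimal illustration, assuming a count-based novelty bonus; the paper's actual intrinsic reward, the weight `beta`, and the state representation are not specified here and are illustrative choices.

```python
from collections import defaultdict

class IntrinsicRewardShaper:
    """Adds a novelty bonus to the sparse extrinsic reward (illustrative)."""

    def __init__(self, beta=0.1):
        self.beta = beta                      # weight of the intrinsic term (assumed value)
        self.visit_counts = defaultdict(int)  # state -> visit count

    def reward(self, state, extrinsic_reward):
        # Novelty bonus decays as 1/sqrt(count), so rarely seen user-item
        # interaction states yield a larger combined reward, encouraging
        # exploration of informative trajectories.
        self.visit_counts[state] += 1
        intrinsic = 1.0 / self.visit_counts[state] ** 0.5
        return extrinsic_reward + self.beta * intrinsic

shaper = IntrinsicRewardShaper(beta=0.1)
r1 = shaper.reward("state_a", 0.0)  # first visit: full novelty bonus
r2 = shaper.reward("state_a", 0.0)  # repeat visit: smaller bonus
```

The bonus vanishes as states become familiar, so the agent gradually shifts from exploration to exploiting the collected (and, in the paper, counterfactually augmented) trajectories.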
Crawling in Rogue's dungeons with (partitioned) A3C
Rogue is a famous dungeon-crawling video game of the 1980s, the ancestor of
its genre. Rogue-like games are known for requiring the exploration of
partially observable, randomly generated labyrinths that differ on every run,
preventing any form of level replay. As such, they serve as a very natural and
challenging task for reinforcement learning, requiring the acquisition of
complex, non-reactive behaviors involving memory and planning. In this article
we show how, exploiting a version of A3C partitioned over different
situations, the agent is able to reach the stairs and descend to the next
level in 98% of cases.

Comment: Accepted at the Fourth International Conference on Machine Learning,
Optimization, and Data Science (LOD 2018)
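The "partitioned" idea, a dispatcher that routes each game situation to a dedicated sub-policy, can be sketched as below. The situation labels, the classifier, and the policies are hypothetical stand-ins, not the paper's actual partition or trained A3C networks.

```python
def classify_situation(state):
    # Toy classifier: in the real setting this would inspect the dungeon map,
    # e.g. whether the stairs are currently visible or a corridor remains
    # to be explored. The dict-based state is an assumption for illustration.
    if state.get("stairs_visible"):
        return "descend"
    return "explore"

class PartitionedAgent:
    """Routes each state to the sub-policy trained for its situation."""

    def __init__(self, policies):
        self.policies = policies  # situation label -> policy function

    def act(self, state):
        situation = classify_situation(state)
        return self.policies[situation](state)

# Each sub-policy would be a separately trained network in the A3C setting;
# lambdas stand in for them here.
agent = PartitionedAgent({
    "descend": lambda s: "move_to_stairs",
    "explore": lambda s: "explore_corridor",
})
action = agent.act({"stairs_visible": True})  # -> "move_to_stairs"
```

Splitting the policy this way lets each sub-network specialize on one situation instead of forcing a single network to cover all of them.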
Bio-Inspired Virtual Populations: Adaptive Behavior with Affective Feedback
In this paper, we describe an agency model for generative populations of humanoid characters, based upon temporal variation of affective states. We have built on an existing agent framework from Sequeira et al. [17], and adapted it to be susceptible to temperamental and emotive states in the context of cooperative and non-cooperative interactions based on trading activity. More specifically, this model operates within two existing frameworks: a) intrinsically motivated reinforcement learning, structured upon affective appraisals in the relationship of the agents with their environment [19,17]; b) a multi-temporal representation of individual psychology, common in the field of affective computing, structuring individual psychology as a tripartite relationship: emotions-moods-personality [7,15]. Results show a population of agents that express their individuality and autonomy through a high degree of heterogeneous and spontaneous behavior, while simultaneously adapting to and overcoming their perceptual limitations.
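The tripartite emotions-moods-personality relationship can be illustrated with a multi-timescale update: fast emotions react to appraisals and decay quickly, mood tracks emotion slowly, and both relax toward a fixed personality baseline. The scalar state and all decay rates below are assumed values for illustration, not the paper's model.

```python
class AffectiveState:
    """Three nested timescales of affect (illustrative sketch)."""

    def __init__(self, personality=0.0):
        self.personality = personality  # fixed long-term disposition
        self.mood = personality         # slow-moving state, starts at baseline
        self.emotion = 0.0              # fast transient state

    def step(self, appraisal):
        # The appraisal of the latest interaction (e.g. a trade) drives
        # emotion directly.
        self.emotion += appraisal
        # Mood tracks emotion slowly, and also relaxes toward the
        # personality baseline; the rates 0.05 and 0.01 are assumptions.
        self.mood += 0.05 * (self.emotion - self.mood)
        self.mood += 0.01 * (self.personality - self.mood)
        # Emotion decays quickly between interactions.
        self.emotion *= 0.5

agent = AffectiveState(personality=0.2)
agent.step(appraisal=1.0)  # a positive trading interaction
```

Because the three quantities change at different rates, two agents with the same recent appraisals but different personalities will diverge in behavior over time, which is one way such populations can exhibit heterogeneous conduct.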