Search CORE

306 research outputs found

Whole-Chain Recommendations

Author: Liu Hui
Tang Jiliang
Xia Long
Yin Dawei
Zhao Xiangyu
Zou Linxin
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 15/08/2020
Field of study

With the recent prevalence of Reinforcement Learning (RL), there have been tremendous interests in developing RL-based recommender systems. In practical recommendation sessions, users will sequentially access multiple scenarios, such as the entrance pages and the item detail pages, and each scenario has its specific characteristics. However, the majority of existing RL-based recommender systems focus on optimizing one strategy for all scenarios or separately optimizing each strategy, which could lead to sub-optimal overall performance. In this paper, we study the recommendation problem with multiple (consecutive) scenarios, i.e., whole-chain recommendations. We propose a multi-agent RL-based approach (DeepChain), which can capture the sequential correlation among different scenarios and jointly optimize multiple recommendation strategies. To be specific, all recommender agents (RAs) share the same memory of users' historical behaviors, and they work collaboratively to maximize the overall reward of a session. Note that optimizing multiple recommendation strategies jointly faces two challenges in the existing model-free RL model - (i) it requires huge amounts of user behavior data, and (ii) the distribution of reward (users' feedback) are extremely unbalanced. In this paper, we introduce model-based RL techniques to reduce the training data requirement and execute more accurate strategy updates. The experimental results based on a real e-commerce platform demonstrate the effectiveness of the proposed framework.Comment: 29th ACM International Conference on Information and Knowledge Managemen

arXiv.org e-Print Archive

Crossref

TempLe: Learning Template of Transitions for Sample Efficient Multi-task RL

Author: Huang Furong
Sun Yanchao
Yin Xiangyu
Publication venue
Publication date: 08/03/2021
Field of study

Transferring knowledge among various environments is important to efficiently learn multiple tasks online. Most existing methods directly use the previously learned models or previously learned optimal policies to learn new tasks. However, these methods may be inefficient when the underlying models or optimal policies are substantially different across tasks. In this paper, we propose Template Learning (TempLe), the first PAC-MDP method for multi-task reinforcement learning that could be applied to tasks with varying state/action space. TempLe generates transition dynamics templates, abstractions of the transition dynamics across tasks, to gain sample efficiency by extracting similarities between tasks even when their underlying models or optimal policies have limited commonalities. We present two algorithms for an "online" and a "finite-model" setting respectively. We prove that our proposed TempLe algorithms achieve much lower sample complexity than single-task learners or state-of-the-art multi-task methods. We show via systematically designed experiments that our TempLe method universally outperforms the state-of-the-art multi-task methods (PAC-MDP or not) in various settings and regimes

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Small-Signal Stability Analysis and Optimal Parameters Design of Microgrid Clusters

Author: Guerrero Josep M.
He Jinghan
Wu Xiangyu
Wu Xiaoyu
Xu Yin
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2019
Field of study

VBN