160 research outputs found

From Manifest V2 to V3: A Study on the Discoverability of Chrome Extensions

Authors: Bucci Valerio; Li Wanpeng
Publication venue: Springer
Publication date: 01/12/2023

Peer reviewed, Postprint

Aberdeen University Research

Does the IdP Mix-Up attack really work?

Authors: Li Wanpeng; Mitchell Christopher J
Publication date: 03/06/2016

Royal Holloway - Pure

User Access Privacy in OAuth 2.0 and OpenID Connect

Authors: Li Wanpeng; Mitchell Chris J
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 22/10/2020

Crossref

Royal Holloway - Pure

Improving the Security of Real World Identity Management Systems

Author: Li Wanpeng
Publication date: 01/01/2017

Royal Holloway - Pure

Industry Herding in Crypto Assets

Authors: Li Wanpeng; Liu Nan; Zhao Yuan
Publication date: 02/02/2024

Peer reviewed, Postprint

Aberdeen University Research

Tackling Non-Stationarity in Reinforcement Learning via Causal-Origin Representation

Authors: Li Yilin; Lu Zongqing; Yang Boyu; Zhang Wanpeng
Publication date: 29/09/2023

In real-world scenarios, the application of reinforcement learning is significantly challenged by complex non-stationarity. Most existing methods attempt to model changes in the environment explicitly, often requiring impractical prior knowledge. In this paper, we propose a new perspective, positing that non-stationarity can propagate and accumulate through complex causal relationships during state transitions, thereby compounding its sophistication and affecting policy learning. We believe that this challenge can be more effectively addressed by tracing the causal origin of non-stationarity. To this end, we introduce the Causal-Origin REPresentation (COREP) algorithm. COREP primarily employs a guided updating mechanism to learn a stable graph representation for states, termed the causal-origin representation. By leveraging this representation, the learned policy exhibits impressive resilience to non-stationarity. We supplement our approach with a theoretical analysis grounded in the causal interpretation for non-stationary reinforcement learning, advocating for the validity of the causal-origin representation. Experimental results further demonstrate the superior performance of COREP over existing methods in tackling non-stationarity.

arXiv.org e-Print Archive