
    Provably Improved Context-Based Offline Meta-RL with Attention and Contrastive Learning

    Meta-learning for offline reinforcement learning (OMRL) is an understudied problem with tremendous potential impact, since it could enable RL algorithms in many real-world applications. A popular solution is to infer the task identity as an augmented state using a context-based encoder, for which efficient learning of robust task representations remains an open challenge. In this work, we provably improve upon one of the state-of-the-art OMRL algorithms, FOCAL, by incorporating an intra-task attention mechanism and inter-task contrastive learning objectives to make task representation learning robust against sparse rewards and distribution shift. Theoretical analysis and experiments demonstrate the superior performance and robustness of our end-to-end, model-free framework compared to prior algorithms across multiple meta-RL benchmarks. (Comment: 21 pages, 7 figures)
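
    Neither the attention architecture nor the exact contrastive objective is spelled out in the abstract, so the sketch below only illustrates the general idea: an attention-based context encoder that pools a task's transitions into an embedding, trained with an InfoNCE-style contrastive loss that pulls embeddings of the same task together and pushes different tasks apart. All names (ContextEncoder, contrastive_task_loss) and design choices are assumptions, not the authors' implementation.

```python
# Minimal sketch (assumptions: transitions are flat vectors, embeddings are
# L2-normalised, and an InfoNCE-style loss stands in for the paper's objective).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContextEncoder(nn.Module):
    """Encode a set of transitions from one task into a single task embedding,
    using self-attention so informative transitions can receive more weight."""
    def __init__(self, transition_dim, embed_dim=64, num_heads=4):
        super().__init__()
        self.proj = nn.Linear(transition_dim, embed_dim)
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.head = nn.Linear(embed_dim, embed_dim)

    def forward(self, transitions):            # (batch, context_len, transition_dim)
        h = self.proj(transitions)
        h, _ = self.attn(h, h, h)              # intra-task attention over the context
        z = self.head(h.mean(dim=1))           # pool to one embedding per context set
        return F.normalize(z, dim=-1)

def contrastive_task_loss(z, task_ids, temperature=0.1):
    """InfoNCE-style objective: embeddings of the same task are positives,
    embeddings of different tasks are negatives."""
    sim = z @ z.t() / temperature                       # (B, B) cosine similarities
    same_task = task_ids.unsqueeze(0) == task_ids.unsqueeze(1)
    self_mask = torch.eye(len(z), dtype=torch.bool)
    sim = sim.masked_fill(self_mask, float('-inf'))     # ignore self-similarity
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    positives = same_task & ~self_mask
    return -log_prob[positives].mean()

# Toy usage: 4 context sets drawn from 2 tasks, 16 transitions each of dimension 10.
enc = ContextEncoder(transition_dim=10)
ctx = torch.randn(4, 16, 10)
ids = torch.tensor([0, 0, 1, 1])
loss = contrastive_task_loss(enc(ctx), ids)
loss.backward()
```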

    Bias-reduced Multi-step Hindsight Experience Replay for Efficient Multi-goal Reinforcement Learning

    Multi-goal reinforcement learning is widely applied in planning and robot manipulation. Its two main challenges are sparse rewards and sample inefficiency. Hindsight Experience Replay (HER) tackles both via goal relabeling; however, HER-based methods still need millions of samples and heavy computation. In this paper, we propose Multi-step Hindsight Experience Replay (MHER), which incorporates multi-step relabeled returns based on n-step relabeling to improve sample efficiency. Despite the advantages of n-step relabeling, we prove theoretically and show experimentally that the off-policy n-step bias it introduces may lead to poor performance in many environments. To address this issue, we present two bias-reduced MHER algorithms, MHER(λ) and Model-based MHER (MMHER): MHER(λ) exploits the λ-return, while MMHER benefits from model-based value expansions. Experimental results on numerous multi-goal robotic tasks show that our solutions successfully alleviate off-policy n-step bias and achieve significantly higher sample efficiency than HER and Curriculum-guided HER, with little additional computation beyond HER. (Comment: 20 pages, 8 figures)
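
    As a rough illustration of the quantities involved, the sketch below computes sparse relabeled rewards, n-step returns bootstrapped with a value estimate, and a truncated λ-weighted mixture of those returns in the spirit of MHER(λ). The reward convention, the placeholder critic values, and all function names are assumptions, and the model-based variant (MMHER) is not shown.

```python
# Minimal numpy sketch of goal relabeling, n-step returns, and a lambda-return blend.
import numpy as np

def relabeled_rewards(achieved_goals, new_goal, eps=0.05):
    """Sparse goal-conditioned reward after relabeling: 0 if the achieved goal is
    within eps of the relabeled goal, else -1 (the usual HER convention)."""
    d = np.linalg.norm(achieved_goals - new_goal, axis=-1)
    return -(d > eps).astype(np.float64)

def n_step_return(rewards, values, gamma, n):
    """n-step return from t=0: n discounted relabeled rewards plus a bootstrapped
    value, where values[i] is an estimate of V(s_{i+1}, new_goal)."""
    n = min(n, len(rewards))
    g = sum(gamma**i * rewards[i] for i in range(n))
    return g + gamma**n * values[n - 1]

def lambda_return(rewards, values, gamma, lam, n_max):
    """Exponentially weighted mixture of 1..n_max step returns (MHER(lambda)-style),
    which damps the off-policy bias of large n."""
    returns = [n_step_return(rewards, values, gamma, n) for n in range(1, n_max + 1)]
    weights = np.array([(1 - lam) * lam**(n - 1) for n in range(1, n_max + 1)])
    weights[-1] = lam**(n_max - 1)           # tail weight goes to the largest n
    return float(np.dot(weights, returns))

# Toy usage: 5-step segment, goal relabeled to the final achieved goal.
achieved = np.random.randn(5, 3)
new_goal = achieved[-1]
r = relabeled_rewards(achieved, new_goal)
v = np.zeros(5)                              # placeholder critic values V(s_{t+1}, g')
print(lambda_return(r, v, gamma=0.98, lam=0.7, n_max=3))
```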

    New Probabilistic Multi-Graph Decomposition Model to Identify Consistent Human Brain Network Modules

    Many recent scientific efforts have been devoted to constructing the human connectome from Diffusion Tensor Imaging (DTI) data in order to understand the large-scale brain networks that underlie higher-level cognition in humans. However, suitable computational tools for network analysis are still lacking in human brain connectivity research. To address this problem, we propose a novel probabilistic multi-graph decomposition model that identifies consistent network modules from the brain connectivity networks of the studied subjects. First, we propose a new probabilistic graph decomposition model that addresses the high computational complexity of existing stochastic block models. We then extend this model to multiple networks/graphs, identifying the modules shared across multiple brain networks by incorporating all networks simultaneously and predicting the hidden block-state variables. We also derive an efficient optimization algorithm to solve the proposed objective and estimate the model parameters. We validate our method on both weighted fiber connectivity networks constructed from DTI images and standard human face image clustering benchmark data sets. The promising empirical results demonstrate the superior performance of our proposed method.
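
    The abstract does not give the model's likelihood or update equations, so the following is only a structural stand-in: a joint symmetric nonnegative factorization in which several graphs share one module-membership matrix H while each graph keeps its own module-interaction matrix, fitted by projected gradient descent. This is not the paper's probabilistic block model; every name and hyperparameter here is illustrative.

```python
# Hedged sketch of "shared modules across multiple graphs" via joint factorization.
import numpy as np

def fit_shared_modules(adjs, k, iters=500, lr=1e-3, seed=0):
    """Minimise sum_g ||A_g - H S_g H^T||_F^2 over nonnegative H (shared across graphs)
    and S_g (per graph) by projected gradient descent; rows of H indicate module
    membership, so a hard assignment is H.argmax(axis=1)."""
    rng = np.random.default_rng(seed)
    n = adjs[0].shape[0]
    H = rng.random((n, k))
    S = [rng.random((k, k)) for _ in adjs]
    for _ in range(iters):
        grad_H = np.zeros_like(H)
        for g, A in enumerate(adjs):
            R = H @ S[g] @ H.T - A                     # residual for graph g
            grad_H += 2 * (R @ H @ S[g].T + R.T @ H @ S[g])
            grad_S = 2 * (H.T @ R @ H)
            S[g] = np.maximum(S[g] - lr * grad_S, 0)   # projected step, keep nonnegative
        H = np.maximum(H - lr * grad_H, 0)
    return H, S, H.argmax(axis=1)

# Toy usage: two noisy graphs generated from the same 2-module structure.
rng = np.random.default_rng(1)
truth = np.repeat([0, 1], 10)
base = (truth[:, None] == truth[None, :]).astype(float)
adjs = [np.clip(base + 0.1 * rng.standard_normal((20, 20)), 0, 1) for _ in range(2)]
adjs = [(A + A.T) / 2 for A in adjs]                   # symmetrize
H, S, modules = fit_shared_modules(adjs, k=2)
print(modules)
```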

    Multi-Level Cluster Indicator Decompositions of Matrices and Tensors

    A central challenge for many machine learning and data mining applications is that the amounts of data and features are very large, so low-rank approximations of the original data are often required for efficient computation. We propose new multi-level, clustering-based low-rank matrix approximations that are comparable to, and even more compact than, the Singular Value Decomposition (SVD). We use the cluster indicators from data clustering results to form the subspaces, which makes our decomposition results more interpretable. We further generalize our clustering-based matrix decompositions to tensor decompositions, which are useful in high-order data analysis, and we provide an upper bound on the approximation error of our tensor decomposition algorithm. In all experiments, our methods significantly outperform traditional decomposition methods such as SVD and high-order SVD.
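
    To make the idea concrete, here is a minimal single-level sketch of a cluster-indicator approximation: k-means labels define an orthonormal indicator matrix F, and X is approximated by F(FᵀX), i.e. each row is replaced by its cluster mean, which can then be compared against a rank-k SVD. The paper's multi-level construction and tensor extension are not reproduced, and scikit-learn's KMeans is just a convenient stand-in for the clustering step.

```python
# Single-level cluster-indicator low-rank approximation vs. truncated SVD (illustrative).
import numpy as np
from sklearn.cluster import KMeans

def cluster_indicator_approx(X, k, seed=0):
    """Approximate X by projecting its rows onto the subspace of normalised cluster
    indicators: X ~ F (F^T X), where F[i, j] = 1/sqrt(|cluster j|) if row i belongs to
    cluster j. F^T X holds k scaled cluster-mean rows, so the factorisation is
    interpretable: F @ (F^T X) replaces each row of X by its cluster mean."""
    labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)
    F = np.zeros((X.shape[0], k))
    for j in range(k):
        idx = labels == j
        F[idx, j] = 1.0 / np.sqrt(idx.sum())
    return F, F.T @ X                        # sparse indicator factor and k x d block

def svd_approx(X, k):
    """Optimal rank-k approximation for comparison."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k]

# Toy comparison on data with clear cluster structure.
rng = np.random.default_rng(0)
centers = rng.standard_normal((4, 30)) * 5
X = np.vstack([c + rng.standard_normal((50, 30)) for c in centers])
F, B = cluster_indicator_approx(X, k=4)
err_ci = np.linalg.norm(X - F @ B) / np.linalg.norm(X)
err_svd = np.linalg.norm(X - svd_approx(X, 4)) / np.linalg.norm(X)
print(f"cluster-indicator error {err_ci:.3f} vs rank-4 SVD error {err_svd:.3f}")
```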