
    Provably Improved Context-Based Offline Meta-RL with Attention and Contrastive Learning

    Meta-learning for offline reinforcement learning (OMRL) is an understudied problem with tremendous potential impact, since it could enable RL algorithms in many real-world applications. A popular solution is to infer the task identity as an augmented state using a context-based encoder, for which efficient learning of robust task representations remains an open challenge. In this work, we provably improve upon one of the state-of-the-art OMRL algorithms, FOCAL, by incorporating an intra-task attention mechanism and inter-task contrastive learning objectives to make task representation learning robust against sparse rewards and distribution shift. Theoretical analysis and experiments demonstrate the superior performance and robustness of our end-to-end, model-free framework compared to prior algorithms across multiple meta-RL benchmarks. (Comment: 21 pages, 7 figures)
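
    Neither the attention architecture nor the exact contrastive objective is spelled out in the abstract, so the sketch below only illustrates the general idea: an attention-based context encoder that pools a task's transitions into an embedding, trained with an InfoNCE-style contrastive loss that pulls embeddings of the same task together and pushes different tasks apart. All names (ContextEncoder, contrastive_task_loss) and design choices are assumptions, not the authors' implementation.

```python
# Minimal sketch (assumptions: transitions are flat vectors, embeddings are
# L2-normalised, and an InfoNCE-style loss stands in for the paper's objective).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContextEncoder(nn.Module):
    """Encode a set of transitions from one task into a single task embedding,
    using self-attention so informative transitions can receive more weight."""
    def __init__(self, transition_dim, embed_dim=64, num_heads=4):
        super().__init__()
        self.proj = nn.Linear(transition_dim, embed_dim)
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.head = nn.Linear(embed_dim, embed_dim)

    def forward(self, transitions):            # (batch, context_len, transition_dim)
        h = self.proj(transitions)
        h, _ = self.attn(h, h, h)              # intra-task attention over the context
        z = self.head(h.mean(dim=1))           # pool to one embedding per context set
        return F.normalize(z, dim=-1)

def contrastive_task_loss(z, task_ids, temperature=0.1):
    """InfoNCE-style objective: embeddings of the same task are positives,
    embeddings of different tasks are negatives."""
    sim = z @ z.t() / temperature                       # (B, B) cosine similarities
    same_task = task_ids.unsqueeze(0) == task_ids.unsqueeze(1)
    self_mask = torch.eye(len(z), dtype=torch.bool)
    sim = sim.masked_fill(self_mask, float('-inf'))     # ignore self-similarity
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    positives = same_task & ~self_mask
    return -log_prob[positives].mean()

# Toy usage: 4 context sets drawn from 2 tasks, 16 transitions each of dimension 10.
enc = ContextEncoder(transition_dim=10)
ctx = torch.randn(4, 16, 10)
ids = torch.tensor([0, 0, 1, 1])
loss = contrastive_task_loss(enc(ctx), ids)
loss.backward()
```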

    Bias-reduced Multi-step Hindsight Experience Replay for Efficient Multi-goal Reinforcement Learning

    Multi-goal reinforcement learning is widely applied in planning and robot manipulation. Its two main challenges are sparse rewards and sample inefficiency. Hindsight Experience Replay (HER) tackles both via goal relabeling; however, HER-based methods still need millions of samples and heavy computation. In this paper, we propose Multi-step Hindsight Experience Replay (MHER), which incorporates multi-step relabeled returns based on n-step relabeling to improve sample efficiency. Despite the advantages of n-step relabeling, we prove theoretically and show experimentally that the off-policy n-step bias it introduces may lead to poor performance in many environments. To address this issue, we present two bias-reduced MHER algorithms, MHER(λ) and Model-based MHER (MMHER): MHER(λ) exploits the λ-return, while MMHER benefits from model-based value expansions. Experimental results on numerous multi-goal robotic tasks show that our solutions successfully alleviate off-policy n-step bias and achieve significantly higher sample efficiency than HER and Curriculum-guided HER, with little additional computation beyond HER. (Comment: 20 pages, 8 figures)
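
    As a rough illustration of the quantities involved, the sketch below computes sparse relabeled rewards, n-step returns bootstrapped with a value estimate, and a truncated λ-weighted mixture of those returns in the spirit of MHER(λ). The reward convention, the placeholder critic values, and all function names are assumptions, and the model-based variant (MMHER) is not shown.

```python
# Minimal numpy sketch of goal relabeling, n-step returns, and a lambda-return blend.
import numpy as np

def relabeled_rewards(achieved_goals, new_goal, eps=0.05):
    """Sparse goal-conditioned reward after relabeling: 0 if the achieved goal is
    within eps of the relabeled goal, else -1 (the usual HER convention)."""
    d = np.linalg.norm(achieved_goals - new_goal, axis=-1)
    return -(d > eps).astype(np.float64)

def n_step_return(rewards, values, gamma, n):
    """n-step return from t=0: n discounted relabeled rewards plus a bootstrapped
    value, where values[i] is an estimate of V(s_{i+1}, new_goal)."""
    n = min(n, len(rewards))
    g = sum(gamma**i * rewards[i] for i in range(n))
    return g + gamma**n * values[n - 1]

def lambda_return(rewards, values, gamma, lam, n_max):
    """Exponentially weighted mixture of 1..n_max step returns (MHER(lambda)-style),
    which damps the off-policy bias of large n."""
    returns = [n_step_return(rewards, values, gamma, n) for n in range(1, n_max + 1)]
    weights = np.array([(1 - lam) * lam**(n - 1) for n in range(1, n_max + 1)])
    weights[-1] = lam**(n_max - 1)           # tail weight goes to the largest n
    return float(np.dot(weights, returns))

# Toy usage: 5-step segment, goal relabeled to the final achieved goal.
achieved = np.random.randn(5, 3)
new_goal = achieved[-1]
r = relabeled_rewards(achieved, new_goal)
v = np.zeros(5)                              # placeholder critic values V(s_{t+1}, g')
print(lambda_return(r, v, gamma=0.98, lam=0.7, n_max=3))
```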

    New Probabilistic Multi-Graph Decomposition Model to Identify Consistent Human Brain Network Modules

    Many recent scientific efforts have been devoted to constructing the human connectome from Diffusion Tensor Imaging (DTI) data in order to understand the large-scale brain networks that underlie higher-level cognition in humans. However, suitable computational tools for network analysis are still lacking in human brain connectivity research. To address this problem, we propose a novel probabilistic multi-graph decomposition model that identifies consistent network modules from the brain connectivity networks of the studied subjects. First, we propose a new probabilistic graph decomposition model that addresses the high computational complexity of existing stochastic block models. We then extend this model to multiple networks/graphs, identifying the modules shared across multiple brain networks by incorporating all networks simultaneously and predicting the hidden block-state variables. We also derive an efficient optimization algorithm to solve the proposed objective and estimate the model parameters. We validate our method on both weighted fiber connectivity networks constructed from DTI images and standard human face image clustering benchmark data sets. The promising empirical results demonstrate the superior performance of our proposed method.
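
    The abstract does not give the model's likelihood or update equations, so the following is only a structural stand-in: a joint symmetric nonnegative factorization in which several graphs share one module-membership matrix H while each graph keeps its own module-interaction matrix, fitted by projected gradient descent. This is not the paper's probabilistic block model; every name and hyperparameter here is illustrative.

```python
# Hedged sketch of "shared modules across multiple graphs" via joint factorization.
import numpy as np

def fit_shared_modules(adjs, k, iters=500, lr=1e-3, seed=0):
    """Minimise sum_g ||A_g - H S_g H^T||_F^2 over nonnegative H (shared across graphs)
    and S_g (per graph) by projected gradient descent; rows of H indicate module
    membership, so a hard assignment is H.argmax(axis=1)."""
    rng = np.random.default_rng(seed)
    n = adjs[0].shape[0]
    H = rng.random((n, k))
    S = [rng.random((k, k)) for _ in adjs]
    for _ in range(iters):
        grad_H = np.zeros_like(H)
        for g, A in enumerate(adjs):
            R = H @ S[g] @ H.T - A                     # residual for graph g
            grad_H += 2 * (R @ H @ S[g].T + R.T @ H @ S[g])
            grad_S = 2 * (H.T @ R @ H)
            S[g] = np.maximum(S[g] - lr * grad_S, 0)   # projected step, keep nonnegative
        H = np.maximum(H - lr * grad_H, 0)
    return H, S, H.argmax(axis=1)

# Toy usage: two noisy graphs generated from the same 2-module structure.
rng = np.random.default_rng(1)
truth = np.repeat([0, 1], 10)
base = (truth[:, None] == truth[None, :]).astype(float)
adjs = [np.clip(base + 0.1 * rng.standard_normal((20, 20)), 0, 1) for _ in range(2)]
adjs = [(A + A.T) / 2 for A in adjs]                   # symmetrize
H, S, modules = fit_shared_modules(adjs, k=2)
print(modules)
```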

    Multi-Level Cluster Indicator Decompositions of Matrices and Tensors

    A central challenge for many machine learning and data mining applications is that the amounts of data and features are very large, so low-rank approximations of the original data are often required for efficient computation. We propose new multi-level, clustering-based low-rank matrix approximations that are comparable to, and even more compact than, the Singular Value Decomposition (SVD). We use the cluster indicators from data clustering results to form the subspaces, which makes our decomposition results more interpretable. We further generalize our clustering-based matrix decompositions to tensor decompositions, which are useful in high-order data analysis, and we provide an upper bound on the approximation error of our tensor decomposition algorithm. In all experiments, our methods significantly outperform traditional decomposition methods such as SVD and high-order SVD.
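
    To make the idea concrete, here is a minimal single-level sketch of a cluster-indicator approximation: k-means labels define an orthonormal indicator matrix F, and X is approximated by F(FᵀX), i.e. each row is replaced by its cluster mean, which can then be compared against a rank-k SVD. The paper's multi-level construction and tensor extension are not reproduced, and scikit-learn's KMeans is just a convenient stand-in for the clustering step.

```python
# Single-level cluster-indicator low-rank approximation vs. truncated SVD (illustrative).
import numpy as np
from sklearn.cluster import KMeans

def cluster_indicator_approx(X, k, seed=0):
    """Approximate X by projecting its rows onto the subspace of normalised cluster
    indicators: X ~ F (F^T X), where F[i, j] = 1/sqrt(|cluster j|) if row i belongs to
    cluster j. F^T X holds k scaled cluster-mean rows, so the factorisation is
    interpretable: F @ (F^T X) replaces each row of X by its cluster mean."""
    labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)
    F = np.zeros((X.shape[0], k))
    for j in range(k):
        idx = labels == j
        F[idx, j] = 1.0 / np.sqrt(idx.sum())
    return F, F.T @ X                        # sparse indicator factor and k x d block

def svd_approx(X, k):
    """Optimal rank-k approximation for comparison."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k]

# Toy comparison on data with clear cluster structure.
rng = np.random.default_rng(0)
centers = rng.standard_normal((4, 30)) * 5
X = np.vstack([c + rng.standard_normal((50, 30)) for c in centers])
F, B = cluster_indicator_approx(X, k=4)
err_ci = np.linalg.norm(X - F @ B) / np.linalg.norm(X)
err_svd = np.linalg.norm(X - svd_approx(X, 4)) / np.linalg.norm(X)
print(f"cluster-indicator error {err_ci:.3f} vs rank-4 SVD error {err_svd:.3f}")
```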