81 research outputs found

    Explaining the Decisions of Deep Policy Networks for Robotic Manipulations

    Deep policy networks enable robots to learn behaviors to solve various real-world complex tasks in an end-to-end fashion. However, they lack the transparency needed to explain the reasons for their actions. Thus, such a black-box model often results in low reliability and disruptive actions during the deployment of the robot in practice. To enhance its transparency, it is important to explain robot behaviors by considering the extent to which each input feature contributes to determining a given action. In this paper, we present an explicit analysis of deep policy models through input attribution methods to explain how and to what extent each input feature affects the decisions of the robot policy models. To this end, we present two methods for applying input attribution methods to robot policy networks: (1) we measure the importance factor of each joint torque to reflect the influence of the motor torque on the end-effector movement, and (2) we modify a relevance propagation method to handle negative inputs and outputs in deep policy networks properly. To the best of our knowledge, this is the first report to identify the dynamic changes of input attributions of multi-modal sensor inputs in deep policy networks online for robotic manipulation. Comment: 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2021)

    Variational Curriculum Reinforcement Learning for Unsupervised Discovery of Skills

    Mutual information-based reinforcement learning (RL) has been proposed as a promising framework for retrieving complex skills autonomously without a task-oriented reward function through mutual information (MI) maximization or variational empowerment. However, learning complex skills is still challenging because the order of training skills can largely affect sample efficiency. Inspired by this, we recast variational empowerment as curriculum learning in goal-conditioned RL with an intrinsic reward function, which we name Variational Curriculum RL (VCRL). From this perspective, we propose a novel approach to unsupervised skill discovery based on information theory, called Value Uncertainty Variational Curriculum (VUVC). We prove that, under regularity conditions, VUVC accelerates the increase of entropy in the visited states compared to the uniform curriculum. We validate the effectiveness of our approach on complex navigation and robotic manipulation tasks in terms of sample efficiency and state coverage speed. We also demonstrate that the skills discovered by our method successfully complete a real-world robot navigation task in a zero-shot setup and that incorporating these skills with a global planner further increases the performance. Comment: ICML 2023. First two authors contributed equally. Code at https://github.com/seongun-kim/vcr

    Adaptive and Explainable Deployment of Navigation Skills via Hierarchical Deep Reinforcement Learning

    For robotic vehicles to navigate robustly and safely in unseen environments, it is crucial to decide the most suitable navigation policy. However, most existing deep reinforcement learning based navigation policies are trained with a hand-engineered curriculum and reward function, which makes them difficult to deploy in a wide range of real-world scenarios. In this paper, we propose a framework to learn a family of low-level navigation policies and a high-level policy for deploying them. The main idea is that, instead of learning a single navigation policy with a fixed reward function, we simultaneously learn a family of policies that exhibit different behaviors with a wide range of reward functions. We then train the high-level policy which adaptively deploys the most suitable navigation skill. We evaluate our approach in simulation and the real world and demonstrate that our method can learn diverse navigation skills and adaptively deploy them. We also illustrate that our proposed hierarchical learning framework offers explainability by providing semantics for the behavior of an autonomous agent. Comment: ICRA 2023. First two authors contributed equally. Code at https://github.com/leekwoon/hrl-na

    Refining Diffusion Planner for Reliable Behavior Synthesis by Automatic Detection of Infeasible Plans

    Diffusion-based planning has shown promising results in long-horizon, sparse-reward tasks by training trajectory diffusion models and conditioning the sampled trajectories using auxiliary guidance functions. However, due to their nature as generative models, diffusion models are not guaranteed to generate feasible plans, resulting in failed execution and precluding planners from being useful in safety-critical applications. In this work, we propose a novel approach to refine unreliable plans generated by diffusion models by providing refining guidance to error-prone plans. To this end, we suggest a new metric named restoration gap for evaluating the quality of individual plans generated by the diffusion model. A restoration gap is estimated by a gap predictor which produces restoration gap guidance to refine a diffusion planner. We additionally present an attribution map regularizer to prevent adversarial refining guidance that could be generated from the sub-optimal gap predictor, which enables further refinement of infeasible plans. We demonstrate the effectiveness of our approach on three different benchmarks in offline control settings that require long-horizon planning. We also illustrate that our approach offers explainability by presenting the attribution maps of the gap predictor and highlighting error-prone transitions, allowing for a deeper understanding of the generated plans. Comment: NeurIPS 2023. First two authors contributed equally. Code at http://github.com/leekwoon/rg

    Extending CLIP's Image-Text Alignment to Referring Image Segmentation

    Referring Image Segmentation (RIS) is a cross-modal task that aims to segment an instance described by a natural language expression. Recent methods leverage large-scale pretrained unimodal models as backbones along with fusion techniques for joint reasoning across modalities. However, the inherent cross-modal nature of RIS raises questions about the effectiveness of unimodal backbones. We propose RISCLIP, a novel framework that effectively leverages the cross-modal nature of CLIP for RIS. Observing CLIP's inherent alignment between image and text features, we capitalize on this starting point and introduce simple but strong modules that enhance unimodal feature extraction and leverage rich alignment knowledge in CLIP's image-text shared-embedding space. RISCLIP exhibits outstanding results on all three major RIS benchmarks and also outperforms previous CLIP-based methods, demonstrating the efficacy of our strategy in extending CLIP's image-text alignment to RIS. Comment: NAACL 202

    An empirical Bayes model using a competition score for metabolite identification in gas chromatography mass spectrometry

    Background: Mass spectrometry (MS) based metabolite profiling has become increasingly popular for scientific and biomedical studies, primarily due to recent technological developments such as comprehensive two-dimensional gas chromatography time-of-flight mass spectrometry (GCxGC/TOF-MS). Nevertheless, the identification of metabolites from complex samples is subject to error. Statistical and computational approaches that improve identification accuracy and false-positive estimation are greatly needed. We propose an empirical Bayes model which accounts for a competition score in addition to the similarity score to tackle this problem. The competition score characterizes the propensity of a candidate metabolite to be matched to some spectrum based on the metabolite's similarity score with other spectra in the library searched against. The competition score allows the model to properly assess the evidence on the presence/absence status of a metabolite based on whether or not the metabolite is matched to some sample spectrum.
    Results: With a mixture of metabolite standards, we demonstrated that our method has better identification accuracy than four other existing methods. Moreover, our method yields reliable false discovery rate estimates. We also applied our method to data collected from the plasma of a rat and identified some metabolites from the plasma while controlling the false discovery rate.
    Conclusions: We developed an empirical Bayes model for metabolite identification and validated the method through a mixture of metabolite standards and rat plasma. The results show that our hierarchical model improves identification accuracy as compared with methods that do not structurally model the involved variables. The improvement in identification accuracy is likely to facilitate downstream analysis such as peak alignment and biomarker identification. Raw data and result matrices can be found at http://www.biostat.iupui.edu/~ChangyuShen/index.htm
    Trial Registration: 2123938128573429

    Model-based peak alignment of metabolomic profiling from comprehensive two-dimensional gas chromatography mass spectrometry

    Background: Comprehensive two-dimensional gas chromatography time-of-flight mass spectrometry (GCxGC/TOF-MS) has been used for metabolite profiling in metabolomics. However, there is still much experimental variation to be controlled, including both within-experiment and between-experiment variation. For efficient analysis, a peak alignment method that can deal with such variation is greatly needed.
    Results: Using experimental data of a mixture of metabolite standards, we demonstrated that our method performs better than an existing method that is not model-based. We then applied our method to data generated from the plasma of a rat, where our model also performed well.
    Conclusions: We developed a model-based peak alignment method to process both homogeneous and heterogeneous experimental data. Our method is unique in being the only model-based peak alignment method coupled with metabolite identification in a unified framework. Through comparison with an existing method, we demonstrated that our method has better performance. Data are available at http://stage.louisville.edu/faculty/x0zhan17/software/software-development/mspa. The R source codes are available at http://www.biostat.iupui.edu/~ChangyuShen/CodesPeakAlignment.zip.
    Trial Registration: 2136949528613691

    Learning Debiased Classifier with Biased Committee

    Neural networks are prone to being biased towards spurious correlations between classes and latent attributes exhibited in a major portion of training data, which ruins their generalization capability. We propose a new method for training debiased classifiers without spurious attribute labels. The key idea is to employ a committee of classifiers as an auxiliary module that identifies bias-conflicting data, i.e., data without spurious correlation, and assigns large weights to them when training the main classifier. The committee is learned as a bootstrapped ensemble so that a majority of its classifiers are biased as well as being diverse, and intentionally fail to predict classes of bias-conflicting data accordingly. The consensus within the committee on prediction difficulty thus provides a reliable cue for identifying and weighting bias-conflicting data. Moreover, the committee is also trained with knowledge transferred from the main classifier so that it gradually becomes debiased along with the main classifier and emphasizes more difficult data as training progresses. On five real-world datasets, our method outperforms prior methods that, like ours, use no spurious attribute labels, and occasionally even surpasses those relying on bias labels. Comment: Conference on Neural Information Processing Systems (NeurIPS), New Orleans, 202