Explaining the Decisions of Deep Policy Networks for Robotic Manipulations
Deep policy networks enable robots to learn behaviors to solve various
complex real-world tasks in an end-to-end fashion. However, they lack the
transparency needed to explain the reasons for their actions. Such black-box
models therefore often lead to low reliability and disruptive actions when the
robot is deployed in practice. To enhance transparency, it is important to
explain robot behaviors by considering the extent to which each input feature
contributes to determining a given action. In this paper, we present an
explicit analysis of deep policy models through input attribution methods to
explain how and to what extent each input feature affects the decisions of the
robot policy models. To this end, we present two methods for applying input
attribution methods to robot policy networks: (1) we measure the importance
factor of each joint torque to reflect the influence of the motor torque on the
end-effector movement, and (2) we modify a relevance propagation method to
handle negative inputs and outputs in deep policy networks properly. To the
best of our knowledge, this is the first report to identify the dynamic changes
of input attributions of multi-modal sensor inputs in deep policy networks
online for robotic manipulation.
Comment: 2021 IEEE/RSJ International Conference on Intelligent Robots and
Systems (IROS 2021
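
As a rough illustration of what "input attribution" means in the abstract above, the sketch below applies gradient-times-input to a toy linear policy. This is a swapped-in, simpler technique and not the modified relevance propagation method the paper describes; the names `linear_policy` and `gradient_times_input` and the example weights are hypothetical. It shows how signed attributions can be assigned when inputs and outputs are negative.

```python
def linear_policy(weights, x):
    """Toy policy: each action is a weighted sum of the input features."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def gradient_times_input(weights, x):
    """Attribution of input j to action i for a linear policy.

    For a linear map, d(action_i)/d(x_j) equals weights[i][j], so the
    gradient-times-input attribution is weights[i][j] * x[j]. The sign is
    kept, so negative inputs and negative outputs are handled naturally.
    """
    return [[w * xi for w, xi in zip(row, x)] for row in weights]


# Hypothetical example: 2 actions (e.g. joint torques), 2 sensor inputs.
weights = [[1.0, -2.0], [0.5, 0.0]]
x = [2.0, -1.0]
actions = linear_policy(weights, x)
attributions = gradient_times_input(weights, x)
```

For a linear policy the attributions in each row sum exactly to the corresponding action, which is a convenient sanity check; deep policy networks need a method such as relevance propagation to obtain a comparable decomposition.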
Variational Curriculum Reinforcement Learning for Unsupervised Discovery of Skills
Mutual information-based reinforcement learning (RL) has been proposed as a
promising framework for retrieving complex skills autonomously without a
task-oriented reward function through mutual information (MI) maximization or
variational empowerment. However, learning complex skills remains challenging
because the order in which skills are trained can strongly affect sample
efficiency. Inspired by this, we recast variational empowerment as curriculum
learning in goal-conditioned RL with an intrinsic reward function, which we
name Variational Curriculum RL (VCRL). From this perspective, we propose a
novel approach to unsupervised skill discovery based on information theory,
called Value Uncertainty Variational Curriculum (VUVC). We prove that, under
regularity conditions, VUVC accelerates the increase of entropy in the visited
states compared to the uniform curriculum. We validate the effectiveness of our
approach on complex navigation and robotic manipulation tasks in terms of
sample efficiency and state coverage speed. We also demonstrate that the skills
discovered by our method successfully complete a real-world robot navigation
task in a zero-shot setup and that incorporating these skills with a global
planner further increases the performance.
Comment: ICML 2023. First two authors contributed equally. Code at
https://github.com/seongun-kim/vcr
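
The abstract above measures progress by the entropy of visited states. As a minimal sketch of that quantity only (not of VUVC itself), the function below computes the empirical Shannon entropy of a list of discrete visited states; the function name and the grid-cell state representation are assumptions made for illustration.

```python
import math
from collections import Counter


def visited_state_entropy(states):
    """Empirical Shannon entropy (in nats) of a sequence of hashable states."""
    counts = Counter(states)
    total = len(states)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Hypothetical grid-cell states: uniform coverage of four cells versus
# an agent that never leaves one cell.
uniform_visits = [(0, 0), (0, 1), (1, 0), (1, 1)]
stuck_visits = [(0, 0)] * 4
h_uniform = visited_state_entropy(uniform_visits)
h_stuck = visited_state_entropy(stuck_visits)
```

Uniform coverage of four cells gives an entropy of log 4, while an agent that stays in one cell gives zero, so a curriculum that raises this value faster is covering the state space faster.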
Adaptive and Explainable Deployment of Navigation Skills via Hierarchical Deep Reinforcement Learning
For robotic vehicles to navigate robustly and safely in unseen environments,
it is crucial to select the most suitable navigation policy. However, most
existing navigation policies based on deep reinforcement learning are trained with
a hand-engineered curriculum and reward function, which makes them difficult to
deploy in a wide range of real-world scenarios. In this paper, we propose a
framework to learn a family of low-level navigation policies and a high-level
policy for deploying them. The main idea is that, instead of learning a single
navigation policy with a fixed reward function, we simultaneously learn a
family of policies that exhibit different behaviors with a wide range of reward
functions. We then train the high-level policy which adaptively deploys the
most suitable navigation skill. We evaluate our approach in simulation and the
real world and demonstrate that our method can learn diverse navigation skills
and adaptively deploy them. We also illustrate that our proposed hierarchical
learning framework offers explainability by providing semantics for the
behavior of an autonomous agent.
Comment: ICRA 2023. First two authors contributed equally. Code at
https://github.com/leekwoon/hrl-na
Refining Diffusion Planner for Reliable Behavior Synthesis by Automatic Detection of Infeasible Plans
Diffusion-based planning has shown promising results in long-horizon,
sparse-reward tasks by training trajectory diffusion models and conditioning
the sampled trajectories using auxiliary guidance functions. However, due to
their nature as generative models, diffusion models are not guaranteed to
generate feasible plans, resulting in failed execution and precluding planners
from being useful in safety-critical applications. In this work, we propose a
novel approach to refine unreliable plans generated by diffusion models by
providing refining guidance to error-prone plans. To this end, we suggest a new
metric named restoration gap for evaluating the quality of individual plans
generated by the diffusion model. A restoration gap is estimated by a gap
predictor which produces restoration gap guidance to refine a diffusion
planner. We additionally present an attribution map regularizer to prevent
adversarial refining guidance that could be generated from the sub-optimal gap
predictor, which enables further refinement of infeasible plans. We demonstrate
the effectiveness of our approach on three different benchmarks in offline
control settings that require long-horizon planning. We also illustrate that
our approach offers explainability by presenting the attribution maps of the
gap predictor and highlighting error-prone transitions, allowing for a deeper
understanding of the generated plans.
Comment: NeurIPS 2023. First two authors contributed equally. Code at
http://github.com/leekwoon/rg
Extending CLIP's Image-Text Alignment to Referring Image Segmentation
Referring Image Segmentation (RIS) is a cross-modal task that aims to segment
an instance described by a natural language expression. Recent methods leverage
large-scale pretrained unimodal models as backbones along with fusion
techniques for joint reasoning across modalities. However, the inherent
cross-modal nature of RIS raises questions about the effectiveness of unimodal
backbones. We propose RISCLIP, a novel framework that effectively leverages the
cross-modal nature of CLIP for RIS. Observing CLIP's inherent alignment between
image and text features, we capitalize on this starting point and introduce
simple but strong modules that enhance unimodal feature extraction and leverage
rich alignment knowledge in CLIP's image-text shared-embedding space. RISCLIP
exhibits outstanding results on all three major RIS benchmarks and also
outperforms previous CLIP-based methods, demonstrating the efficacy of our
strategy in extending CLIP's image-text alignment to RIS.
Comment: NAACL 202
An empirical Bayes model using a competition score for metabolite identification in gas chromatography mass spectrometry
Background: Mass spectrometry (MS) based metabolite profiling has become increasingly popular in scientific and biomedical studies, primarily due to recent technological developments such as comprehensive two-dimensional gas chromatography time-of-flight mass spectrometry (GCxGC/TOF-MS). Nevertheless, the identification of metabolites from complex samples is subject to errors. Statistical and computational approaches that improve identification accuracy and false positive estimation are greatly needed. We propose an empirical Bayes model that accounts for a competition score in addition to the similarity score to tackle this problem. The competition score characterizes the propensity of a candidate metabolite to be matched to some spectrum, based on the metabolite's similarity scores with other spectra in the library searched against. The competition score allows the model to properly assess the evidence for the presence or absence of a metabolite based on whether or not the metabolite is matched to some sample spectrum.
Results: With a mixture of metabolite standards, we demonstrated that our method has better identification accuracy than four other existing methods. Moreover, our method provides a reliable false discovery rate estimate. We also applied our method to data collected from rat plasma and identified some metabolites in the plasma under false discovery rate control.
Conclusions: We developed an empirical Bayes model for metabolite identification and validated the method on a mixture of metabolite standards and on rat plasma. The results show that our hierarchical model improves identification accuracy compared with methods that do not structurally model the involved variables. The improvement in identification accuracy is likely to facilitate downstream analysis such as peak alignment and biomarker identification. Raw data and result matrices can be found at http://www.biostat.iupui.edu/~ChangyuShen/index.htm
Trial Registration: 2123938128573429
Model-based peak alignment of metabolomic profiling from comprehensive two-dimensional gas chromatography mass spectrometry
Background: Comprehensive two-dimensional gas chromatography time-of-flight mass spectrometry (GCxGC/TOF-MS) has been used for metabolite profiling in metabolomics. However, much experimental variation remains to be controlled, including both within-experiment and between-experiment variation. For efficient analysis, a peak alignment method that can handle such variation is greatly needed.
Results: Using experimental data from a mixture of metabolite standards, we demonstrated that our method performs better than an existing method that is not model-based. We then applied our method to data generated from rat plasma, where our model also performed well.
Conclusions: We developed a model-based peak alignment method to process both homogeneous and heterogeneous experimental data. The unique feature of our method is that it is the only model-based peak alignment method coupled with metabolite identification in a unified framework. Through comparison with an existing method, we demonstrated that our method has better performance. Data are available at http://stage.louisville.edu/faculty/x0zhan17/software/software-development/mspa. The R source codes are available at http://www.biostat.iupui.edu/~ChangyuShen/CodesPeakAlignment.zip.
Trial Registration: 2136949528613691
Learning Debiased Classifier with Biased Committee
Neural networks are prone to be biased towards spurious correlations between
classes and latent attributes exhibited in a major portion of training data,
which ruins their generalization capability. We propose a new method for
training debiased classifiers with no spurious attribute label. The key idea is
to employ a committee of classifiers as an auxiliary module that identifies
bias-conflicting data, i.e., data without spurious correlation, and assigns
large weights to them when training the main classifier. The committee is
learned as a bootstrapped ensemble so that a majority of its classifiers are
biased as well as being diverse, and intentionally fail to predict classes of
bias-conflicting data accordingly. The consensus within the committee on
prediction difficulty thus provides a reliable cue for identifying and
weighting bias-conflicting data. Moreover, the committee is also trained with
knowledge transferred from the main classifier so that it gradually becomes
debiased along with the main classifier and emphasizes more difficult data as
training progresses. On five real-world datasets, our method outperforms prior
methods that, like ours, use no spurious attribute labels, and occasionally even
surpasses those relying on bias labels.
Comment: Conference on Neural Information Processing Systems (NeurIPS), New
Orleans, 202
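
The committee idea in the abstract above can be sketched as follows. The abstract does not give the weighting function, so this minimal example assumes that a sample's weight is the fraction of committee members that misclassify it, with a small floor so that easy samples still contribute; `committee_weights` and `floor` are hypothetical names, not the paper's.

```python
def committee_weights(committee_preds, labels, floor=0.1):
    """Weight each sample by how many committee members get it wrong.

    committee_preds: one list of predicted labels per committee member.
    labels: ground-truth labels, one per sample.
    floor: assumed minimum weight so bias-aligned samples are not ignored.
    """
    n_members = len(committee_preds)
    weights = []
    for i, y in enumerate(labels):
        wrong = sum(1 for preds in committee_preds if preds[i] != y)
        weights.append(max(floor, wrong / n_members))
    return weights


# Hypothetical committee of three biased classifiers on three samples.
committee_preds = [[0, 1, 1], [0, 0, 1], [0, 0, 0]]
labels = [0, 1, 1]
weights = committee_weights(committee_preds, labels)
```

Samples that most of the biased committee fails on, which are likely bias-conflicting, receive the largest weights when training the main classifier.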