Learning Dynamic Robot-to-Human Object Handover from Human Feedback
Object handover is a basic but essential capability for robots interacting
with humans in many applications, e.g., caring for the elderly and assisting
workers in manufacturing workshops. It appears deceptively simple, as humans
perform object handover almost flawlessly. The success of humans, however,
belies the complexity of object handover as collaborative physical interaction
between two agents with limited communication. This paper presents a learning
algorithm for dynamic object handover, for example, when a robot hands over
water bottles to marathon runners passing by the water station. We formulate
the problem as contextual policy search, in which the robot learns object
handover by interacting with the human. A key challenge here is to learn the
latent reward of the handover task under noisy human feedback. Preliminary
experiments show that the robot learns to hand over a water bottle naturally
and that it adapts to the dynamics of human motion. One challenge for the
future is to combine the model-free learning algorithm with a model-based
planning approach and enable the robot to adapt to human preferences and
object characteristics, such as shape, weight, and surface texture.

Comment: Appears in the Proceedings of the International Symposium on Robotics Research (ISRR) 201
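The learning loop described above can be sketched generically. The following is a minimal contextual policy-search sketch, not the paper's actual algorithm: the one-dimensional setup (runner speed as context, handover timing as action), the quadratic latent reward, and the elite-refit update are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_feedback(context, action):
    # Hypothetical stand-in for human feedback: the latent reward peaks when
    # the action matches 0.5 * context, and is only observed with noise.
    return -(action - 0.5 * context) ** 2 + rng.normal(0.0, 0.02)

# Contextual Gaussian policy: the mean action is linear in the context.
w = np.zeros(2)          # [bias, slope] of the policy mean
sigma = 0.4              # exploration noise, annealed below

for _ in range(40):
    C = rng.uniform(0.5, 2.0, size=200)                  # contexts (runner speeds)
    A = w[0] + w[1] * C + rng.normal(0.0, sigma, 200)    # explored actions
    R = np.array([noisy_feedback(c, a) for c, a in zip(C, A)])
    elite = np.argsort(R)[-40:]        # keep the 20% highest-feedback rollouts
    Phi = np.stack([np.ones(elite.size), C[elite]], axis=1)
    w, *_ = np.linalg.lstsq(Phi, A[elite], rcond=None)   # refit the policy mean
    sigma = max(0.1, sigma * 0.9)

# The learned mean action now tracks the latent optimum 0.5 * context.
```

The key point the sketch shares with the abstract is that the policy is conditioned on the context and is improved only from noisy scalar feedback, with no access to the latent reward itself.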
Robustness of Bayesian Pool-based Active Learning Against Prior Misspecification
We study the robustness of active learning (AL) algorithms against prior
misspecification: whether an algorithm achieves similar performance using a
perturbed prior as compared to using the true prior. In both the average and
worst cases of the maximum coverage setting, we prove that all
α-approximate algorithms are robust (i.e., near-α-approximate) if
the utility is Lipschitz continuous in the prior. We further show that
robustness may not be achieved if the utility is non-Lipschitz. This suggests
we should use a Lipschitz utility for AL if robustness is required. For the
minimum cost setting, we can also obtain a robustness result for approximate AL
algorithms. Our results imply that many commonly used AL algorithms are robust
against perturbed priors. We then propose the use of a mixture prior to
alleviate the problem of prior misspecification. We analyze the robustness of
the uniform mixture prior and show experimentally that it performs reasonably
well in practice.

Comment: This paper is published at the AAAI Conference on Artificial Intelligence (AAAI 2016).
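The Lipschitz condition can be illustrated numerically. In the toy maximum-coverage setting below (all numbers hypothetical), the expected-coverage utility is Lipschitz in the prior with constant 1 under the L1 norm, so perturbing the prior changes the utility of every query set by at most the L1 size of the perturbation:

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)

# Toy pool-based setting: 6 hypotheses, 8 candidate queries.
# f[i, h] in [0, 1] is the coverage of hypothesis h contributed by query i.
n_queries, n_hyp = 8, 6
f = rng.uniform(0.0, 1.0, size=(n_queries, n_hyp))

def utility(S, prior):
    # Expected coverage of query set S under a prior over hypotheses.
    cover = 1.0 - np.prod(1.0 - f[list(S)], axis=0)   # per-hypothesis, in [0, 1]
    return float(prior @ cover)

true_prior = rng.dirichlet(np.ones(n_hyp))
perturbed_prior = rng.dirichlet(np.ones(n_hyp))
l1 = np.abs(true_prior - perturbed_prior).sum()

# Because per-hypothesis coverage lies in [0, 1], the utility is Lipschitz in
# the prior (constant 1, L1 norm): |U(S, p) - U(S, q)| <= ||p - q||_1.
max_gap = max(abs(utility(S, true_prior) - utility(S, perturbed_prior))
              for S in itertools.combinations(range(n_queries), 3))
assert max_gap <= l1
```

This is the mechanism behind the robustness claim: when the utility cannot move faster than the prior perturbation, a selection that is approximately optimal under the perturbed prior remains approximately optimal under the true prior.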
Adaptive Semi-supervised Learning for Cross-domain Sentiment Classification
We consider the cross-domain sentiment classification problem, where a
sentiment classifier is to be learned from a source domain and to be
generalized to a target domain. Our approach explicitly minimizes the distance
between the source and the target instances in an embedded feature space. With
the difference between source and target minimized, we then exploit additional
information from the target domain by adopting the idea of semi-supervised
learning, for which we jointly employ two regularizations -- entropy
minimization and self-ensemble bootstrapping -- to incorporate the unlabeled
target data for classifier refinement. Our experimental results demonstrate
that the proposed approach can better leverage unlabeled data from the target
domain and achieve substantial improvements over baseline methods in various
experimental settings.

Comment: Accepted to EMNLP201
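The two regularizers named above can be sketched numerically. This is a minimal NumPy illustration with hypothetical logits, standing in for a deep-learning framework; the "teacher" here is a noisy stand-in for the self-ensembled predictions:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Hypothetical logits for 4 unlabeled target examples over 3 classes.
student_logits = np.array([[2.0, 0.1, -1.0],
                           [0.3, 0.2, 0.1],
                           [-1.0, 2.5, 0.0],
                           [0.0, 0.0, 3.0]])
p = softmax(student_logits)

# Stand-in for the self-ensembled (e.g., exponentially averaged) teacher.
teacher_probs = softmax(student_logits + rng.normal(0.0, 0.3, student_logits.shape))

# Regularizer 1: entropy minimization -- push target predictions to be confident.
entropy_loss = -np.mean(np.sum(p * np.log(p + 1e-12), axis=1))

# Regularizer 2: self-ensemble bootstrapping -- consistency between the
# student's predictions and the teacher's (squared error here).
consistency_loss = np.mean(np.sum((p - teacher_probs) ** 2, axis=1))

unlabeled_loss = entropy_loss + consistency_loss
```

Both terms require no target labels: the entropy term sharpens predictions on unlabeled target data, while the consistency term bootstraps them toward the more stable ensembled predictions.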
Monte Carlo Bayesian Reinforcement Learning
Bayesian reinforcement learning (BRL) encodes prior knowledge of the world in
a model and represents uncertainty in model parameters by maintaining a
probability distribution over them. This paper presents Monte Carlo BRL
(MC-BRL), a simple and general approach to BRL. MC-BRL samples a priori a
finite set of hypotheses for the model parameter values and forms a discrete
partially observable Markov decision process (POMDP) whose state space is a
cross product of the state space for the reinforcement learning task and the
sampled model parameter space. The POMDP does not require conjugate
distributions for belief representation, as earlier works do, and can be solved
relatively easily with point-based approximation algorithms. MC-BRL naturally
handles both fully and partially observable worlds. Theoretical and
experimental results show that the discrete POMDP approximates the underlying
BRL task well with guaranteed performance.

Comment: Appears in the Proceedings of the 29th International Conference on Machine Learning (ICML 2012).
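The sampling idea above can be sketched on a toy problem. All numbers here are hypothetical, and the point-based POMDP solver is omitted; what remains is the discrete belief over a finite set of sampled model parameters, updated by Bayes' rule without any conjugate (e.g., Beta) posterior:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy task: a transition succeeds with an unknown probability p_true.
# MC-BRL samples a finite set of hypotheses for p instead of maintaining
# a conjugate posterior over a continuous parameter.
p_true = 0.7
hypotheses = np.linspace(0.05, 0.95, 10)        # sampled model parameters
belief = np.full(len(hypotheses), 1.0 / len(hypotheses))

for _ in range(500):
    success = rng.random() < p_true              # observed environment transition
    # Discrete Bayes update over the sampled hypotheses -- this belief vector
    # plays the role of the belief state in the augmented POMDP.
    likelihood = np.where(success, hypotheses, 1.0 - hypotheses)
    belief = belief * likelihood
    belief /= belief.sum()

estimate = float(hypotheses @ belief)    # posterior mean over sampled models
```

In the full method, this belief over hypotheses is crossed with the task's state space to form the discrete POMDP that a point-based solver can then handle.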