End-to-end Driving via Conditional Imitation Learning
Deep networks trained on demonstrations of human driving have learned to
follow roads and avoid obstacles. However, driving policies trained via
imitation learning cannot be controlled at test time. A vehicle trained
end-to-end to imitate an expert cannot be guided to take a specific turn at an
upcoming intersection. This limits the utility of such systems. We propose to
condition imitation learning on high-level command input. At test time, the
learned driving policy functions as a chauffeur that handles sensorimotor
coordination but continues to respond to navigational commands. We evaluate
different architectures for conditional imitation learning in vision-based
driving. We conduct experiments in realistic three-dimensional simulations of
urban driving and on a 1/5 scale robotic truck that is trained to drive in a
residential area. Both systems drive based on visual input yet remain
responsive to high-level navigational commands. The supplementary video can be
viewed at https://youtu.be/cFtnflNe5fM

Comment: Published at the International Conference on Robotics and Automation (ICRA), 2018
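The key architectural idea is that the command acts as a switch over policy branches sharing a perception backbone. Below is a minimal, hypothetical PyTorch sketch of such branched conditioning; the layer sizes, module names, and four-command vocabulary are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class BranchedConditionalPolicy(nn.Module):
    """Hypothetical sketch: a shared image encoder feeds one action head
    per high-level command (e.g. follow-lane / left / straight / right)."""

    def __init__(self, num_commands: int = 4, action_dim: int = 2):
        super().__init__()
        # Shared perception backbone (illustrative sizes).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # One small MLP branch per command.
        self.branches = nn.ModuleList(
            nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, action_dim))
            for _ in range(num_commands)
        )

    def forward(self, image: torch.Tensor, command: torch.Tensor) -> torch.Tensor:
        # command: (B,) long tensor of command indices.
        features = self.encoder(image)  # (B, 64)
        outputs = torch.stack([b(features) for b in self.branches], dim=1)  # (B, C, A)
        # The command indexes the branch: gather each sample's selected head.
        idx = command.view(-1, 1, 1).expand(-1, 1, outputs.size(-1))
        return outputs.gather(1, idx).squeeze(1)  # (B, A)
```

During training, each sample's loss would flow only through the branch matching its recorded command; at test time a navigation module supplies the command.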
Predicting when to laugh with structured classification
Today, Embodied Conversational Agents (ECAs) are emerging as natural media for interacting with machines. Applications are numerous, and ECAs can reduce the technological gap between people by providing user-friendly interfaces. Yet ECAs are still unable to produce social signals appropriately during their interactions with humans, which tends to make the interaction feel less natural. In particular, very little attention has been paid to the use of laughter in human-avatar interactions, despite the crucial role laughter plays in human-human interaction. In this paper, a method for predicting the most appropriate moment for an ECA to laugh is proposed. Imitation learning via a structured classification algorithm is used for this purpose and is shown to produce behavior similar to humans' in a practical application: the yes/no game.
Difference of Convex Functions Programming Applied to Control with Expert Data
This paper reports applications of Difference of Convex functions (DC)
programming to Learning from Demonstrations (LfD) and Reinforcement Learning
(RL) with expert data. This is made possible because the norm of the Optimal
Bellman Residual (OBR), which is at the heart of many RL and LfD algorithms, is
DC. Improvement in performance is demonstrated on two specific algorithms,
namely Reward-regularized Classification for Apprenticeship Learning (RCAL) and
Reinforcement Learning with Expert Demonstrations (RLED), through experiments
on generic Markov Decision Processes (MDPs), called Garnets.
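A short derivation makes the DC claim concrete. The sketch below, in the finite state-action setting with the l1 norm (chosen here for illustration; the paper treats more general norms), shows one standard decomposition:

```latex
\[
(T^{*}Q)(s,a) \;=\; r(s,a) + \gamma \sum_{s'} P(s' \mid s,a)\,\max_{a'} Q(s',a')
\]
% For each (s,a), the map Q -> (T^*Q)(s,a) is convex in Q: it is an
% expectation of maxima of linear functions of Q. Hence
% f_{s,a}(Q) := (T^*Q)(s,a) - Q(s,a) is convex minus linear, i.e. convex.
% Using |f| = 2 max(f, 0) - f:
\[
\lVert T^{*}Q - Q \rVert_{1}
  \;=\; \sum_{s,a} \Bigl( \underbrace{2\max\bigl(f_{s,a}(Q),\,0\bigr)}_{\text{convex}}
  \;-\; \underbrace{f_{s,a}(Q)}_{\text{convex}} \Bigr),
\]
% a difference of convex functions, which is exactly the structure
% that DC programming algorithms exploit.
```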
Time-Contrastive Networks: Self-Supervised Learning from Video
We propose a self-supervised approach for learning representations and
robotic behaviors entirely from unlabeled videos recorded from multiple
viewpoints, and study how this representation can be used in two robotic
imitation settings: imitating object interactions from videos of humans, and
imitating human poses. Imitation of human behavior requires a
viewpoint-invariant representation that captures the relationships between
end-effectors (hands or robot grippers) and the environment, object attributes,
and body pose. We train our representations using a metric learning loss, where
multiple simultaneous viewpoints of the same observation are attracted in the
embedding space, while being repelled from temporal neighbors which are often
visually similar but functionally different. In other words, the model
simultaneously learns to recognize what is common between different-looking
images, and what is different between similar-looking images. This signal
causes our model to discover attributes that do not change across viewpoint,
but do change across time, while ignoring nuisance variables such as
occlusions, motion blur, lighting and background. We demonstrate that this
representation can be used by a robot to directly mimic human poses without an
explicit correspondence, and that it can be used as a reward function within a
reinforcement learning algorithm. While representations are learned from an
unlabeled collection of task-related videos, robot behaviors such as pouring
are learned by watching a single 3rd-person demonstration by a human. Reward
functions obtained by following the human demonstrations under the learned
representation enable efficient reinforcement learning that is practical for
real-world robotic systems. Video results, open-source code and dataset are
available at https://sermanet.github.io/imitate.
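The training signal reduces to a multi-view triplet objective: embeddings of simultaneous frames from two viewpoints are pulled together, while a temporally distant frame from the same viewpoint serves as the negative. A minimal, hypothetical PyTorch sketch follows; the encoder, margin value, and batch layout are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def time_contrastive_loss(
    encoder: nn.Module,
    anchor_frames: torch.Tensor,    # view 1 at time t,       shape (B, 3, H, W)
    positive_frames: torch.Tensor,  # view 2 at the same t,    shape (B, 3, H, W)
    negative_frames: torch.Tensor,  # view 1 at a distant t',  shape (B, 3, H, W)
    margin: float = 0.2,
) -> torch.Tensor:
    """Hypothetical multi-view triplet loss in the spirit of TCN:
    the same moment seen from two cameras is pulled close in embedding
    space, while a visually similar temporal neighbor is pushed away."""
    # L2-normalized embeddings so squared distances are comparable.
    za = F.normalize(encoder(anchor_frames), dim=1)
    zp = F.normalize(encoder(positive_frames), dim=1)
    zn = F.normalize(encoder(negative_frames), dim=1)
    d_pos = (za - zp).pow(2).sum(dim=1)  # distance to the other viewpoint
    d_neg = (za - zn).pow(2).sum(dim=1)  # distance to the temporal neighbor
    # Standard triplet hinge: positive closer than negative by a margin.
    return F.relu(d_pos - d_neg + margin).mean()
```

Because the negatives are drawn from nearby times rather than other scenes, the embedding is forced to encode what changes over time (task progress) rather than what changes across viewpoints (appearance).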
RH20T: A Comprehensive Robotic Dataset for Learning Diverse Skills in One-Shot
A key challenge in robotic manipulation in open domains is how to acquire
diverse and generalizable skills for robots. Recent research in one-shot
imitation learning has shown promise in transferring trained policies to new
tasks based on demonstrations. This feature is attractive for enabling robots
to acquire new skills and improving task and motion planning. However, due to
limitations in the training dataset, the current focus of the community has
mainly been on simple cases, such as push or pick-place tasks, relying solely
on visual guidance. In reality, there are many complex skills, some of which
may even require both visual and tactile perception to solve. This paper aims
to unlock the potential for an agent to generalize to hundreds of real-world
skills with multi-modal perception. To achieve this, we have collected a
dataset comprising over 110,000 contact-rich robot manipulation sequences
across diverse skills, contexts, robots, and camera viewpoints, all collected
in the real world. Each sequence in the dataset includes visual, force, audio,
and action information. Moreover, we also provide a corresponding human
demonstration video and a language description for each robot sequence. We have
invested significant efforts in calibrating all the sensors and ensuring a
high-quality dataset. The dataset is made publicly available at rh20t.github.io.

Comment: RSS 2023 workshop on LTAMP. The project page is at rh20t.github.io.
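The abstract's description of a sequence suggests a simple multi-modal record. A hypothetical sketch of what one such sequence carries (field names and shapes are illustrative assumptions, not the dataset's actual schema):

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class RobotSequence:
    """Hypothetical container for one RH20T-style sequence."""
    rgb: np.ndarray           # (T, H, W, 3) frames from one camera viewpoint
    force_torque: np.ndarray  # (T, 6) wrist force/torque readings
    audio: np.ndarray         # (S,) waveform recorded during contact
    actions: np.ndarray       # (T, 7) end-effector pose + gripper commands
    human_demo_video: str     # path to the paired human demonstration
    language: str             # natural-language description of the task
```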
Optimization And Learning For Rough Terrain Legged Locomotion
We present a novel approach to legged locomotion over rough terrain that is thoroughly rooted in optimization. This approach relies on a hierarchy of fast, anytime algorithms to plan a set of footholds, along with the dynamic body motions required to execute them. Components within the planning framework coordinate to exchange plans, cost-to-go estimates, and 'certificates' that ensure the output of an abstract high-level planner can be realized by lower layers of the hierarchy. The burden of careful engineering of cost functions to achieve desired performance is substantially mitigated by a simple inverse optimal control technique. Robustness is achieved by real-time re-planning of the full trajectory, augmented by reflexes and feedback control. We demonstrate the successful application of our approach in guiding the LittleDog quadruped robot over a variety of types of rough terrain. Other novel aspects of our past research efforts include a variety of pioneering inverse optimal control techniques as well as a system for planning using arbitrary pre-recorded robot behavior.
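The layered contract the abstract describes can be illustrated with a small interface sketch: the high-level planner hands down footholds together with a certificate and a cost-to-go estimate, and the lower layer accepts the plan only if the certificate is strong enough. This is a hypothetical Python sketch of that contract; all names, types, and the clearance criterion are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List, Tuple

Foothold = Tuple[float, float, float]  # (x, y, z) contact point

@dataclass
class Certificate:
    """Hypothetical promise from the high-level planner that every
    foothold is reachable with at least this much kinematic clearance."""
    min_clearance: float

@dataclass
class FootholdPlan:
    footholds: List[Foothold]
    certificate: Certificate
    cost_to_go: float  # estimate exchanged between layers for coordination

def plan_body_motion(plan: FootholdPlan, clearance_needed: float) -> List[Foothold]:
    """Lower layer: accept the plan only if the certificate guarantees
    enough clearance; otherwise request replanning (here, raise)."""
    if plan.certificate.min_clearance < clearance_needed:
        raise ValueError("certificate too weak: replan footholds")
    # A real system would solve for dynamic body trajectories here;
    # the sketch simply passes the certified footholds through.
    return plan.footholds
```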