Task-Oriented Manipulation Planning: Teaching Robot Manipulators to Learn Trajectory Tasks
As robot manipulator applications move into more complex tasks and unstructured environments, traditional manual programming cannot keep up with the growing requirements. Human experts usually know how to operate robot manipulators to complete tasks, but not how to program the robot to execute those tasks automatically. From a general point of view, a robot manipulation task is composed of a series of consecutive robot actions and environment states, which we call a trajectory task. Imitation learning, an emerging and popular technique for programming robot behavior, is a good way to tackle this line of work, but it still requires robotics and machine learning skills. Moreover, state-of-the-art methods, such as inverse reinforcement learning, curriculum learning, and behavior cloning, suffer from expensive data collection, intensive data labeling, target goal recognition, or combinations of these challenges. To address them, we propose a method that can teach robot manipulators a variety of tasks without requiring extensive robotics or machine learning expertise. Beyond the drawbacks of the state-of-the-art methods, new challenges arise: noisy demonstration data, a limited number of demonstration episodes, and a low random-exploration success rate. We decompose the problem into three parts: demonstration episode evaluation, demonstration-guided trajectory generation, and the use of vision sensors for trajectory generation. These parts correspond to chapters 2, 3, and 4, which detail each challenge. Our results show that the proposed method outperforms the state-of-the-art methods and can be applied to different tasks.
Speech Emotion Recognition System using Librosa for Better Customer Experience
Call center employees usually depend on instinct to judge a potential customer and decide how to pitch to them. In this paper, we propose a more effective way for call center employees to generate more leads and engagement, and thus higher revenue, by analyzing the speech of the target customer with machine learning and making data-driven decisions rather than relying on intuition. Speech Emotion Recognition (SER) is the task of recognizing human emotion and behavioral state from speech; a voice naturally reflects underlying feeling through its tone and pitch, and many creatures besides humans also express emotion vocally. In this paper, we use a Python library named Librosa for analyzing music, sounds, and speech. Together with several supporting libraries, we build a detection model using an MLP (Multilayer Perceptron) classifier. The classifier is trained to recognize emotion from multiple audio recordings. The implementation is based on an existing Kaggle dataset for speech recognition: the training set is used to train the perceptron, while the test set demonstrates the accuracy of the model.
Discovery and recognition of motion primitives in human activities
We present a novel framework for the automatic discovery and recognition of
motion primitives in videos of human activities. Given the 3D pose of a human
in a video, human motion primitives are discovered by optimizing the `motion
flux', a quantity which captures the motion variation of a group of skeletal
joints. A normalization of the primitives is proposed in order to make them
invariant with respect to a subject's anatomical variations and the data
sampling rate. The discovered primitives are unknown and unlabeled; they are
grouped without supervision into classes via a hierarchical non-parametric
Bayesian mixture model. Once classes are determined and labeled, they are further
analyzed for establishing models for recognizing discovered primitives. Each
primitive model is defined by a set of learned parameters.
Given new video data and the estimated pose of the subject in the video, the
motion is segmented into primitives, which are recognized with a probability
given by the parameters of the learned models.
Using our framework we build a publicly available dataset of human motion
primitives, using sequences taken from well-known motion capture datasets. We
expect that our framework, by providing an objective way for discovering and
categorizing human motion, will be a useful tool in numerous research fields
including video analysis, human inspired motion generation, learning by
demonstration, intuitive human-robot interaction, and human behavior analysis.
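The discovery step hinges on detecting where skeletal joints are actually in motion. As a loose illustration only (the paper's "motion flux" is a specific quantity not reproduced here), one can segment a pose sequence into candidate primitives by thresholding mean joint speed; the function name and threshold below are hypothetical choices:

```python
import numpy as np

def segment_motion(joints, speed_thresh=0.05):
    """Split a pose sequence into high-motion segments.
    joints: (T, J, 3) array of 3D joint positions per frame.
    Returns a list of (start, end) frame-index pairs (end exclusive)."""
    vel = np.diff(joints, axis=0)                      # (T-1, J, 3)
    speed = np.linalg.norm(vel, axis=2).mean(axis=1)   # mean joint speed
    moving = speed > speed_thresh
    segments, start = [], None
    for t, m in enumerate(moving):
        if m and start is None:
            start = t
        elif not m and start is not None:
            segments.append((start, t))
            start = None
    if start is not None:
        segments.append((start, len(moving)))
    return segments

# Toy sequence: still, then a reach over frames 10..19, then still again.
T, J = 30, 5
joints = np.zeros((T, J, 3))
joints[10:20] += np.linspace(0, 1, 10)[:, None, None]
joints[20:] += 1.0
print(segment_motion(joints))
```

Normalizing for limb lengths and sampling rate, as the abstract describes, would make such segments comparable across subjects and recordings.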
CompILE: Compositional Imitation Learning and Execution
We introduce Compositional Imitation Learning and Execution (CompILE): a
framework for learning reusable, variable-length segments of
hierarchically-structured behavior from demonstration data. CompILE uses a
novel unsupervised, fully-differentiable sequence segmentation module to learn
latent encodings of sequential data that can be re-composed and executed to
perform new tasks. Once trained, our model generalizes to sequences of longer
length and from environment instances not seen during training. We evaluate
CompILE in a challenging 2D multi-task environment and a continuous control
task, and show that it can find correct task boundaries and event encodings in
an unsupervised manner. Latent codes and associated behavior policies
discovered by CompILE can be used by a hierarchical agent, where the high-level
policy selects actions in the latent code space, and the low-level,
task-specific policies are simply the learned decoders. We found that our
CompILE-based agent could learn given only sparse rewards, where agents without
task-specific policies struggle.

Comment: ICML 2019
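The hierarchical-agent arrangement described above, a high-level policy choosing in latent-code space and low-level decoders acting, can be sketched with hand-written stand-ins (everything here is hypothetical; in CompILE both levels and the code space are learned end-to-end):

```python
# Toy sketch of the hierarchical control pattern: a high-level policy emits
# a discrete latent code; each code indexes a low-level "decoder" policy.

def decoder_go_left(state):   # hypothetical learned sub-policy
    return -1

def decoder_go_right(state):  # hypothetical learned sub-policy
    return +1

SKILLS = {0: decoder_go_left, 1: decoder_go_right}

def high_level_policy(state, goal):
    """Pick a latent skill code based on where the goal lies."""
    return 1 if goal > state else 0

def act(state, goal):
    code = high_level_policy(state, goal)  # high-level: act in code space
    return SKILLS[code](state)             # low-level: decoder produces action

state, goal = 0, 3
for _ in range(5):
    state += act(state, goal)
print(state)
```

The point of the pattern is that sparse reward only has to drive the high-level choice among reusable codes, rather than raw low-level exploration.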
Goal Set Inverse Optimal Control and Iterative Re-planning for Predicting Human Reaching Motions in Shared Workspaces
To enable safe and efficient human-robot collaboration in shared workspaces
it is important for the robot to predict how a human will move when performing
a task. While predicting human motion for tasks not known a priori is very
challenging, we argue that single-arm reaching motions for known tasks in
collaborative settings (which are especially relevant for manufacturing) are
indeed predictable. Two hypotheses underlie our approach for predicting such
motions: First, that the trajectory the human performs is optimal with respect
to an unknown cost function, and second, that human adaptation to their
partner's motion can be captured well through iterative re-planning with the
above cost function. The key to our approach is thus to learn a cost function
which "explains" the motion of the human. To do this, we gather example
trajectories from pairs of participants performing a collaborative assembly
task using motion capture. We then use Inverse Optimal Control to learn a cost
function from these trajectories. Finally, we predict reaching motions from the
human's current configuration to a task-space goal region by iteratively
re-planning a trajectory using the learned cost function. Our planning
algorithm is based on the trajectory optimizer STOMP; it plans for a 23-DoF
human kinematic model and accounts for the presence of a moving collaborator
and obstacles in the environment. Our results suggest that in most cases, our
method outperforms baseline methods when predicting motions. We also show that
our method outperforms baselines for predicting human motion when a human and a
robot share the workspace.

Comment: 12 pages, accepted for publication in IEEE Transactions on Robotics 201
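The predict-by-replanning loop can be illustrated with a toy stand-in (this is not STOMP, and the cost weights below are invented placeholders; in the paper they are learned by Inverse Optimal Control): a cost function trades progress toward the goal against proximity to the partner, and the trajectory is re-planned from the current position at every step.

```python
import numpy as np

# Hypothetical "learned" cost weights (IOC would fit these from data).
W_GOAL, W_PARTNER = 1.0, 0.5

def cost(pos, goal, partner):
    """Learned-style cost: goal progress plus a penalty near the partner."""
    return (W_GOAL * np.linalg.norm(goal - pos)
            + W_PARTNER * np.exp(-np.linalg.norm(partner - pos)))

def replan_step(pos, goal, partner, step=0.25):
    """Greedy one-step re-plan: try candidate moves, keep the cheapest."""
    candidates = [pos + step * np.array([dx, dy])
                  for dx in (-1, 0, 1) for dy in (-1, 0, 1)]
    return min(candidates, key=lambda p: cost(p, goal, partner))

pos = np.array([0.0, 0.0])
goal = np.array([2.0, 0.0])
partner = np.array([1.0, 0.5])   # collaborator (held fixed in this toy)
for _ in range(12):
    pos = replan_step(pos, goal, partner)
print(np.round(pos, 2))
```

Even this greedy toy shows the qualitative behavior the paper relies on: the predicted path detours around the partner before settling at the goal, and moving the partner between steps changes the re-planned path.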
Prediction of intent in robotics and multi-agent systems.
Moving beyond the stimulus contained in observable agent behaviour to understand the underlying intent of the observed agent is of immense interest in a variety of domains involving collaborative and competitive scenarios, for example assistive robotics, computer games, robot-human interaction, decision support, and intelligent tutoring. This review paper examines approaches to action recognition and prediction of intent from a multi-disciplinary perspective, in both single-robot and multi-agent scenarios, and analyses the underlying challenges, focusing mainly on generative approaches.