Vision-based deep execution monitoring
Execution monitoring of high-level robot actions can be effectively improved by
visually monitoring the state of the world in terms of the preconditions and
postconditions that hold before and after an action's execution.
Furthermore, a policy for deciding where to look, either to verify the
relations that specify the pre- and postconditions or to refocus after a
failure, can greatly improve robot execution in an uncharted
environment. Thanks to the remarkable results of deep learning, it is now
possible to rely strongly on visual perception in order to make the assumption
that the environment is observable. In this work we present visual execution
monitoring for a robot executing tasks in an uncharted lab environment. The
execution monitor interacts with the environment via a visual stream that uses
two DCNNs to recognize the objects the robot has to deal with and manipulate,
and a non-parametric Bayes estimator to discover relations from the DCNN
features. To recover from lack of focus and from failures due to missed
objects, we resort to visual search policies learned via deep reinforcement
learning.
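The precondition/postcondition checking described above can be sketched as a simple monitor loop. This is a minimal illustration, not the paper's implementation: the `observe` callback standing in for the visual stream and the action dictionary fields (`pre`, `post`, `execute`) are hypothetical names.

```python
# Minimal sketch of a pre/postcondition execution monitor.
# `observe()` stands in for the visual pipeline and returns a set of
# symbolic relations, e.g. ("on", "cup", "table"). All names are illustrative.

def check_conditions(required, observed):
    """Return the required relations not found in the observed scene."""
    return set(required) - set(observed)

def monitor_action(action, observe):
    """Verify preconditions, execute the action, then verify postconditions."""
    missing = check_conditions(action["pre"], observe())
    if missing:
        return ("precondition_failure", missing)   # trigger refocus / search
    action["execute"]()
    missing = check_conditions(action["post"], observe())
    if missing:
        return ("postcondition_failure", missing)  # action did not succeed
    return ("success", set())
```

On failure, the returned missing relations would drive the visual search policy that decides where to look next.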
Assembly via disassembly: A case in machine perceptual development
First results of an effort to learn representations of objects are presented. The questions we attempt to answer are: what is innate, and what must be derived from the environment? The problem is cast in the framework of disassembling an object into two parts.
How do robots take two parts apart?
This research is a natural progression of efforts that began with the introduction of a new research paradigm in machine perception, called Active Perception. There it was stated that Active Perception is a problem of intelligent control strategies applied to data-acquisition processes, where the strategies depend on the current state of data interpretation, including recognition. The disassembly/assembly problem is treated as an Active Perception problem, and a method for autonomous disassembly based on this framework is presented.
Deep Reinforcement Learning for 3D-based Object Grasping
Nowadays, collaborative robots based on Artificial Intelligence algorithms are
common in workstations and laboratories, where they are expected to help their
human colleagues in their everyday work. However, this type of robot can also
assist in a domestic home, in tasks such as separating and organizing cutlery,
but for that it needs an algorithm to tell it which object to grasp and where
to place it.
The main focus of this thesis is to create or improve an algorithm based on Deep Reinforcement Learning for 3D-based object grasping, aiming to help collaborative robots with such tasks.
This work therefore presents the state of the art and the study carried out,
which enables the implementation of the proposed model to help such robots
detect, grasp, and separate each type of cutlery, along with the subsequent
experiments and results and a retrospective of all the work done.
These days one hears far more about robots and the growth of robotics than one
would have two decades ago. The robotics industry has evolved enormously, as
evidenced by the presence of robots in workstations and laboratories whose
purpose is to collaborate in the tasks of their human co-workers. This type of
robot is called a cobot, or collaborative robot.
These robots are supported by Artificial Intelligence algorithms that help them
make the right decisions in the tasks they must perform. Moreover, this type of
robot is beginning to be adopted for domestic tasks.
The topic of this dissertation spans three major areas: Artificial Intelligence,
Computer Vision, and Robotics. Its main objective is the development of a
Reinforcement Learning algorithm that supports a Universal Robots UR3 arm in
deciding how to pick up and separate kitchen objects by type.
We therefore chose to use an existing algorithm, called Visual Pushing-for-Grasping,
which simulates collaborative robots pushing and grasping objects. However, the
objects used by this algorithm in simulation were not kitchen objects, and the
algorithm only grasps objects without separating them.
We thus propose a new approach based on the aforementioned algorithm that uses
3D models of kitchen objects, detects the type of object in the scene using an
object-detection model external to the base algorithm, and separates the
objects by type.
The experimental results show that this new approach still needs improvement;
however, since it is a new approach in both Robotics and Artificial
Intelligence for use with the UR3 robot, the results are better than expected,
and we hope that one day it can be applied to a physical robot in a real-world
setting.
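Affordance-style approaches such as Visual Pushing-for-Grasping typically select an action by taking the argmax over dense, per-primitive Q-value maps predicted from a top-down view. The sketch below illustrates only that selection step under this assumption; the map names and shapes are illustrative, not the dissertation's code.

```python
import numpy as np

def select_action(q_push, q_grasp):
    """Pick the motion primitive and pixel with the highest predicted Q-value.

    q_push, q_grasp: 2D arrays of per-pixel values for each primitive,
    as produced by an affordance network (illustrative inputs).
    """
    maps = {"push": q_push, "grasp": q_grasp}
    # Primitive whose best pixel has the highest value overall.
    best = max(maps, key=lambda name: maps[name].max())
    # Convert the flat argmax back into (row, col) pixel coordinates.
    y, x = np.unravel_index(np.argmax(maps[best]), maps[best].shape)
    return best, (int(y), int(x))
```

The chosen pixel would then be mapped back to a 3D workspace position for the robot to push or grasp at.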
On CAD Informed Adaptive Robotic Assembly
We introduce a robotic assembly system that streamlines the design-to-make
workflow for going from a CAD model of a product assembly to a fully programmed
and adaptive assembly process. Our system captures (in the CAD tool) the intent
of the assembly process for a specific robotic workcell and generates a recipe
of task-level instructions. By integrating visual sensing with deep-learned
perception models, the robots infer the necessary actions to assemble the
design from the generated recipe. The perception models are trained directly
from simulation, allowing the system to identify various parts based on CAD
information. We demonstrate the system with a workcell of two robots to
assemble interlocking 3D part designs. We first build and tune the assembly
process in simulation, verifying the generated recipe. Finally, the real
robotic workcell assembles the design using the same behavior.
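A task-level recipe of the kind the system generates from CAD might be represented as an ordered list of instructions that the workcell executes step by step. The structure and field names below are hypothetical, chosen only to illustrate the idea; the paper does not specify its recipe format.

```python
# Illustrative task-level assembly recipe (hypothetical schema).
recipe = [
    {"step": 1, "action": "pick",   "part": "bracket_A", "robot": "arm_1"},
    {"step": 2, "action": "place",  "part": "bracket_A", "pose": [0.4, 0.1, 0.02]},
    {"step": 3, "action": "insert", "part": "pin_B", "into": "bracket_A", "robot": "arm_2"},
]

def next_step(recipe, completed):
    """Return the first instruction whose step id is not yet completed."""
    for step in recipe:
        if step["step"] not in completed:
            return step
    return None  # assembly finished
```

In such a design, the deep-learned perception models supply the part poses at execution time, so the same recipe stays valid when parts are placed differently.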
The automaticity of visual perspective-taking in autism spectrum conditions
This thesis investigated visual perspective-taking differences between adults on the autism spectrum and a neurotypical control group. In Experiment 1, participants were required to explicitly make a left/right judgement about the spatial location of a target object from two different perspectives, one's own perspective (self) and the actor's perspective (other); the two perspectives were interleaved within a block of trials. The reaction-time findings revealed that the ASC group was slower overall compared to the matched control group. In Experiment 2, participants explicitly judged the spatial location of the target object from only the other perspective. The reaction-time findings showed no difference between the ASC group and the matched control group when making a judgement from the other perspective. Experiment 3 was conducted online to measure the proportion of spontaneous self or other responses to three pictures, each with a corresponding question. The findings suggest that there was no difference in the proportion of self or other responses between the ASC group and the control group. No evidence was found for impaired explicit or spontaneous perspective-taking in ASC. However, the findings demonstrate that when ASC participants have to devote more cognitive resources to shifting between the two perspectives, their reaction times suffer. This suggests that level-2 visual perspective-taking appears to be intact, although poorer executive functioning in ASC could partially contribute to worse performance on tasks that are more cognitively demanding.
HEAP: A Sensory Driven Distributed Manipulation System
We address the problems of locating, grasping, and removing one or more unknown objects from a given area. To accomplish the task we use HEAP, a system that coordinates the motions of the hand and arm. HEAP also includes a laser range finder, mounted at the end of a PUMA 560, allowing the system to obtain multiple views of the workspace. We obtain volumetric information about the objects we locate by fitting superquadric surfaces to the raw range data. The volumetric information is used to ascertain the best hand configuration to enclose and constrain the object stably. The Penn Hand, used to grasp the object, is fitted with 14 tactile sensors to determine the contact area and the normal components of the grasping forces. In addition, the hand is used as a sensor to avoid any undesired collisions. The objective in grasping the objects is not to impart arbitrary forces on the object, but to grasp a variety of objects using a simple grasping scheme assisted by a volumetric description and force and touch sensing.
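Superquadric fitting typically relies on the standard inside-outside function, which evaluates to 1 on the surface, below 1 inside, and above 1 outside; fitting minimizes the deviation of this function from 1 over the range points. The sketch below uses the common parameterization (semi-axes a1, a2, a3 and shape exponents e1, e2); the paper's exact formulation may differ.

```python
import numpy as np

def superquadric_F(p, a1, a2, a3, e1, e2):
    """Standard superquadric inside-outside function.

    Returns F(p): F < 1 inside, F = 1 on the surface, F > 1 outside.
    p is a 3-vector in the superquadric's own frame; a1..a3 are the
    semi-axis lengths and e1, e2 the shape exponents.
    """
    x, y, z = np.abs(np.asarray(p, dtype=float))  # symmetric in all octants
    return ((x / a1) ** (2 / e2) + (y / a2) ** (2 / e2)) ** (e2 / e1) \
        + (z / a3) ** (2 / e1)
```

With e1 = e2 = 1 this reduces to an ellipsoid; smaller exponents produce increasingly box-like shapes, which is what makes superquadrics a compact volumetric model for grasp planning.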
ARNOLD: A Benchmark for Language-Grounded Task Learning With Continuous States in Realistic 3D Scenes
Understanding the continuous states of objects is essential for task learning
and planning in the real world. However, most existing task learning benchmarks
assume discrete (e.g., binary) object goal states, which poses challenges for
the learning of complex tasks and transferring learned policy from simulated
environments to the real world. Furthermore, state discretization limits a
robot's ability to follow human instructions based on the grounding of actions
and states. To tackle these challenges, we present ARNOLD, a benchmark that
evaluates language-grounded task learning with continuous states in realistic
3D scenes. ARNOLD is comprised of 8 language-conditioned tasks that involve
understanding object states and learning policies for continuous goals. To
promote language-instructed learning, we provide expert demonstrations with
template-generated language descriptions. We assess task performance by
utilizing the latest language-conditioned policy learning models. Our results
indicate that current models for language-conditioned manipulations continue to
experience significant challenges in novel goal-state generalizations, scene
generalizations, and object generalizations. These findings highlight the need
to develop new algorithms that address this gap and underscore the potential
for further research in this area. See our project page at:
https://arnold-benchmark.github.io. (The first two authors contributed
equally.)
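A continuous goal state such as "open the drawer halfway" might be checked with a tolerance test on the object's state variable rather than a binary predicate. The sketch below is only illustrative; ARNOLD's actual success criteria and tolerances may differ.

```python
def goal_satisfied(current, target, tolerance=0.05):
    """Success when a continuous object state (e.g. a drawer's normalized
    opening in [0, 1]) is within `tolerance` of the instructed goal value.
    The tolerance value here is an arbitrary illustration."""
    return abs(current - target) <= tolerance
```

Evaluating success this way is what distinguishes continuous-state benchmarks from ones with binary goal predicates: a policy must reach and hold a target value, not merely flip a state.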
Composing Diverse Policies for Temporally Extended Tasks
Robot control policies for temporally extended and sequenced tasks are often
characterized by discontinuous switches between different local dynamics. These
change-points are often exploited in hierarchical motion planning to build
approximate models and to facilitate the design of local, region-specific
controllers. However, it becomes combinatorially challenging to implement such
a pipeline for complex temporally extended tasks, especially when the
sub-controllers work on different information streams, time scales and action
spaces. In this paper, we introduce a method that can compose diverse policies
comprising motion planning trajectories, dynamic motion primitives and neural
network controllers. We introduce a global goal scoring estimator that uses
local, per-motion primitive dynamics models and corresponding activation
state-space sets to sequence diverse policies in a locally optimal fashion. We
use expert demonstrations to convert what is typically viewed as a
gradient-based learning process into a planning process without explicitly
specifying pre- and post-conditions. We first illustrate the proposed framework
using an MDP benchmark to showcase robustness to action and model dynamics
mismatch, and then with a particularly complex physical gear assembly task,
solved on a PR2 robot. We show that the proposed approach successfully
discovers the optimal sequence of controllers and solves both tasks
efficiently.
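The goal-scoring selection over policy activation sets could be sketched as follows: among the sub-policies whose activation set contains the current state, pick the one whose local dynamics model predicts the most goal-advancing successor state. The dictionary fields (`active`, `predict`) and the scoring function are hypothetical names, not the paper's API.

```python
def choose_policy(state, policies, score):
    """Select the sub-policy to run next.

    Each policy (illustrative schema) provides:
      - active(state):  membership test for its activation state-space set
      - predict(state): its local dynamics model's predicted successor state
    `score` estimates how close a state is to the global goal (higher = better).
    """
    candidates = [p for p in policies if p["active"](state)]
    if not candidates:
        return None  # no policy applicable: sequencing has failed here
    return max(candidates, key=lambda p: score(p["predict"](state)))
```

Chaining this selection step to step yields a locally optimal sequence of diverse controllers without hand-specifying pre- and post-conditions.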