1,524 research outputs found
Online Context-based Object Recognition for Mobile Robots
This work proposes a robotic object recognition
system that takes advantage of the contextual information latent
in human-like environments in an online fashion. To fully leverage
context, it is needed perceptual information from (at least) a
portion of the scene containing the objects of interest, which could
not be entirely covered by just an one-shot sensor observation.
Information from a larger portion of the scenario could still
be considered by progressively registering observations, but this
approach experiences difficulties under some circumstances, e.g.
limited and heavily demanded computational resources, dynamic
environments, etc. Instead of this, the proposed recognition
system relies on an anchoring process for the fast registration
and propagation of objectsâ features and locations beyond the
current sensor frustum. In this way, the system builds a graphbased
world model containing the objects in the scenario (both
in the current and previously perceived shots), which is exploited
by a Probabilistic Graphical Model (PGM) in order to leverage
contextual information during recognition. We also propose a
novel way to include the outcome of local object recognition
methods in the PGM, which results in a decrease in the usually
high CRF learning complexity. A demonstration of our proposal
has been conducted employing a dataset captured by a mobile
robot from restaurant-like settings, showing promising results.Universidad de MĂĄlaga. Campus de Excelencia Internacional AndalucĂa Tech
Who am I talking with? A face memory for social robots
In order to provide personalized services and to
develop human-like interaction capabilities robots need to rec-
ognize their human partner. Face recognition has been studied
in the past decade exhaustively in the context of security systems
and with significant progress on huge datasets. However, these
capabilities are not in focus when it comes to social interaction
situations. Humans are able to remember people seen for a
short moment in time and apply this knowledge directly in
their engagement in conversation. In order to equip a robot with
capabilities to recall human interlocutors and to provide user-
aware services, we adopt human-human interaction schemes to
propose a face memory on the basis of active appearance models
integrated with the active memory architecture. This paper
presents the concept of the interactive face memory, the applied
recognition algorithms, and their embedding into the robotâs
system architecture. Performance measures are discussed for
general face databases as well as scenario-specific datasets
Artificial Cognition for Social Human-Robot Interaction: An Implementation
© 2017 The Authors HumanâRobot Interaction challenges Artificial Intelligence in many regards: dynamic, partially unknown environments that were not originally designed for robots; a broad variety of situations with rich semantics to understand and interpret; physical interactions with humans that requires fine, low-latency yet socially acceptable control strategies; natural and multi-modal communication which mandates common-sense knowledge and the representation of possibly divergent mental models. This article is an attempt to characterise these challenges and to exhibit a set of key decisional issues that need to be addressed for a cognitive robot to successfully share space and tasks with a human. We identify first the needed individual and collaborative cognitive skills: geometric reasoning and situation assessment based on perspective-taking and affordance analysis; acquisition and representation of knowledge models for multiple agents (humans and robots, with their specificities); situated, natural and multi-modal dialogue; human-aware task planning; humanârobot joint task achievement. The article discusses each of these abilities, presents working implementations, and shows how they combine in a coherent and original deliberative architecture for humanârobot interaction. Supported by experimental results, we eventually show how explicit knowledge management, both symbolic and geometric, proves to be instrumental to richer and more natural humanârobot interactions by pushing for pervasive, human-level semantics within the robot's deliberative system
A cognitive ego-vision system for interactive assistance
With increasing computational power and decreasing size, computers nowadays are already wearable and mobile. They become attendant of peoples' everyday life. Personal digital assistants and mobile phones equipped with adequate software gain a lot of interest in public, although the functionality they provide in terms of assistance is little more than a mobile databases for appointments, addresses, to-do lists and photos. Compared to the assistance a human can provide, such systems are hardly to call real assistants. The motivation to construct more human-like assistance systems that develop a certain level of cognitive capabilities leads to the exploration of two central paradigms in this work. The first paradigm is termed cognitive vision systems. Such systems take human cognition as a design principle of underlying concepts and develop learning and adaptation capabilities to be more flexible in their application. They are embodied, active, and situated. Second, the ego-vision paradigm is introduced as a very tight interaction scheme between a user and a computer system that especially eases close collaboration and assistance between these two. Ego-vision systems (EVS) take a user's (visual) perspective and integrate the human in the system's processing loop by means of a shared perception and augmented reality. EVSs adopt techniques of cognitive vision to identify objects, interpret actions, and understand the user's visual perception. And they articulate their knowledge and interpretation by means of augmentations of the user's own view. These two paradigms are studied as rather general concepts, but always with the goal in mind to realize more flexible assistance systems that closely collaborate with its users. This work provides three major contributions. First, a definition and explanation of ego-vision as a novel paradigm is given. Benefits and challenges of this paradigm are discussed as well. Second, a configuration of different approaches that permit an ego-vision system to perceive its environment and its user is presented in terms of object and action recognition, head gesture recognition, and mosaicing. These account for the specific challenges identified for ego-vision systems, whose perception capabilities are based on wearable sensors only. Finally, a visual active memory (VAM) is introduced as a flexible conceptual architecture for cognitive vision systems in general, and for assistance systems in particular. It adopts principles of human cognition to develop a representation for information stored in this memory. So-called memory processes continuously analyze, modify, and extend the content of this VAM. The functionality of the integrated system emerges from their coordinated interplay of these memory processes. An integrated assistance system applying the approaches and concepts outlined before is implemented on the basis of the visual active memory. The system architecture is discussed and some exemplary processing paths in this system are presented and discussed. It assists users in object manipulation tasks and has reached a maturity level that allows to conduct user studies. Quantitative results of different integrated memory processes are as well presented as an assessment of the interactive system by means of these user studies
The Robot@Home2 dataset: A new release with improved usability tools
Released in 2017, Robot@Home is a dataset captured by a mobile robot during indoor navigation sessions in apartments. This paper presents Robot@Home2, an enhanced version of the Robot@Home dataset, aimed at improving usability and functionality for developing and testing mobile robotics and computer vision algorithms. Robot@Home2 consists of three main components. Firstly, a relational database that states the contextual information and data links, compatible with Standard Query Language. Secondly,a Python package for managing the database, including downloading, querying, and interfacing functions. Finally, learning resources in the form of Jupyter notebooks, runnable locally or on the Google Colab platform, enabling users to explore the dataset without local installations. These freely available tools are expected to enhance the ease of exploiting the Robot@Home dataset and accelerate research in computer vision and robotics.Partial funding for open access charges: Universidad de MĂĄlaga / CBU
Robust Scene Estimation for Goal-directed Robotic Manipulation in Unstructured Environments
To make autonomous robots "taskable" so that they function properly and interact fluently with human partners, they must be able to perceive and understand the semantic aspects of their environments. More specifically, they must know what objects exist and where they are in the unstructured human world. Progresses in robot perception, especially in deep learning, have greatly improved for detecting and localizing objects. However, it still remains a challenge for robots to perform a highly reliable scene estimation in unstructured environments that is determined by robustness, adaptability and scale. In this dissertation, we address the scene estimation problem under uncertainty, especially in unstructured environments. We enable robots to build a reliable object-oriented representation that describes objects present in the environment, as well as inter-object spatial relations. Specifically, we focus on addressing following challenges for reliable scene estimation: 1) robust perception under uncertainty results from noisy sensors, objects in clutter and perceptual aliasing, 2) adaptable perception in adverse conditions by combined deep learning and probabilistic generative methods, 3) scalable perception as the number of objects grows and the structure of objects becomes more complex (e.g. objects in dense clutter).
Towards realizing robust perception, our objective is to ground raw sensor observations into scene states while dealing with uncertainty from sensor measurements and actuator control . Scene states are represented as scene graphs, where scene graphs denote parameterized axiomatic statements that assert relationships between objects and their poses. To deal with the uncertainty, we present a pure generative approach, Axiomatic Scene Estimation (AxScEs). AxScEs estimates a probabilistic distribution across plausible scene graph hypotheses describing the configuration of objects. By maintaining a diverse set of possible states, the proposed approach demonstrates the robustness to the local minimum in the scene graph state space and effectiveness for manipulation-quality perception based on edit distance on scene graphs.
To scale up to more unstructured scenarios and be adaptable to adversarial scenarios, we present Sequential Scene Understanding and Manipulation (SUM), which estimates the scene as a collection of objects in cluttered environments. SUM is a two-stage method that leverages the accuracy and efficiency from convolutional neural networks (CNNs) with probabilistic inference methods. Despite the strength from CNNs, they are opaque in understanding how the decisions are made and fragile for generalizing beyond overfit training samples in adverse conditions (e.g., changes in illumination). The probabilistic generative method complements these weaknesses and provides an avenue for adaptable perception.
To scale up to densely cluttered environments where objects are physically touching with severe occlusions, we present GeoFusion, which fuses noisy observations from multiple frames by exploring geometric consistency at object level. Geometric consistency characterizes geometric compatibility between objects and geometric similarity between observations and objects. It reasons about geometry at the object-level, offering a fast and reliable way to be robust to semantic perceptual aliasing. The proposed approach demonstrates greater robustness and accuracy than the state-of-the-art pose estimation approach.PHDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/163060/1/zsui_1.pd
- âŠ